Preconditions — Part I | Andrzej's C++ blog

Preconditions — Part I

Posted on January 4, 2013 by Andrzej Krzemieński

In this post, I want to share my thoughts about the notion of a precondition. In the “Design by Contract” philosophy, preconditions are always mentioned alongside postconditions and invariants, and in the context of OO design. Here I focus only on preconditions, and they need not be related to any class. For instance, the following function specifies a precondition on its argument:

double sqrt(double x);
// precondition: x >= 0

Note that the function specifies the precondition even though there is no language feature for this purpose (at least in C++). A precondition is a “concept” or an “idea” rather than a language feature. This is the kind of preconditions that this post is about.

A motivating example

Consider the following piece of code for authenticating users. Its responsibility is to have the end-user enter his user-name and check if this name is already recorded in the internal data-base.

bool checkIfUserExists(std::string userName)
{
  std::string query = "select count(*) from USERS where NAME = \'" + userName + "\';";
  return DB::run_sql<int>(query) > 0;
}

bool authenticate()
{
  std::string userName = UI::readUserInput();
  return checkIfUserExists(userName); 
}

This code may look correct at first, especially because in most cases it works as expected. If the end-user is nice and enters names like “tom”, our program comes back with the right response. However, if one of the users is malicious, he may enter a “user name” like the following:

JOHN'; delete from USERS where 'a' = 'a

In this case the query in function checkIfUserExists becomes this:

select count(*) from USERS where NAME = 'JOHN'; delete from USERS where 'a' = 'a';

This is a simple example of a serious security issue in a program, but for our purposes it is sufficient to call this situation a bug. Assuming that functions checkIfUserExists and authenticate were written by two different people, which of them is responsible for the bug? The author of authenticate can say “I expected that checkIfUserExists only queries one table and does not issue arbitrary commands to DB.” The author of checkIfUserExists can say “I expected a user name in the function argument — not arbitrary SQL commands.” Neither of them would be wrong. They worked under different expectations, and their expectations were not explicitly stated. Without clearly stated expectations (or a contract), it is impossible to tell whose fault it is: it is a failure to communicate between two programmers. (Or even between one programmer.)

We have two problems then: (1) the program has a bug, (2) it is not clear whose responsibility it is. The latter problem would have been avoided if function checkIfUserExists made its assumptions (or the lack thereof) about its arguments explicit. Suppose we have a function that can tell valid user names from the invalid ones:

bool isValidName( std::string const& text )
{
  const static std::regex NAME{"\\w+"};
  std::smatch match;
  return std::regex_match(text, match, NAME);
}
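As a quick check of how this predicate separates the two inputs from the motivating example, here is a minimal sketch. The helpers niceInput and maliciousInput are hypothetical names introduced only for illustration; they are not part of the original code.

```cpp
#include <regex>
#include <string>

// Same predicate as above: the whole string must consist of one or more
// "word" characters (letters, digits, underscore).
bool isValidName(std::string const& text)
{
  static const std::regex NAME{"\\w+"};
  return std::regex_match(text, NAME);
}

// Hypothetical helpers (not from the post): the two inputs discussed earlier.
std::string niceInput()
{
  return "tom";
}

std::string maliciousInput()
{
  return "JOHN'; delete from USERS where 'a' = 'a";
}
```

Because std::regex_match requires the entire string to match, the quote, semicolon and spaces in the malicious input make isValidName return false, while "tom" is accepted. An empty string is rejected as well, since \w+ needs at least one character.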

Now, when dividing the code into smaller pieces (in our case: into functions), before we write the implementation of the functions and just after we decide on a function’s interface (bool checkIfUserExists(std::string)), we should also decide on the allowed range of the argument values. We can allow any value (and then apply filtering ourselves) or require only valid user names. Either choice is fine, but we have to make it explicit. In the header file that contains our function’s declaration, we decorate the declaration with our assumption.

bool checkIfUserExists(std::string userName);
// precondition: isValidName(userName)

Or in case we accept any string:

bool checkIfUserExists(std::string userName);
// precondition: true

Whichever of the two we decide upon, the responsibility for checking the value of the string becomes clear. We said that the precondition needs to be put alongside the function’s declaration because it is part of the function’s interface, like the function’s name, return type, etc. The notable difference in the case of a precondition, though, is that (at least in C++) there is no language feature that helps us express it. We have to use comments.

Comments? Does it sound wrong? In the environment I work in, I have observed a tendency to avoid writing comments. True, there are good reasons to avoid comments in places where we have a better alternative. But this should not make comment avoidance a common practice. One could say that such preconditions in comments may be misleading because comments are not syntax- and type-checked. However, today’s IDEs are capable of recognizing certain patterns in comments and using them for generating documentation or tool-tip hints. Also, with a bit of inventive declarations and macros, you can make the compiler enforce the type-safety of the asserted predicates, at the expense of slightly polluting the function declaration syntax. For instance, consider this solution (it requires C++11):

template <typename T>
struct RETURN
{
  template <typename U>
  struct precondition_expression_type
  {
    static_assert(
      std::is_constructible<bool, U>::value, 
      "expression in precondition is not convertible to bool"
    );
    using type = T;
  }; 
  
  template <typename U>
  using PRECOND = typename precondition_expression_type<U>::type;
};

#define PRECONDITION(...) ::PRECOND<decltype(__VA_ARGS__)>

Using this template and the macro, we can declare our function as:

auto checkIfUserExists(std::string userName) -> RETURN<bool>
  PRECONDITION( isValidName(userName) );

The compiler will refuse to compile the declaration if the expression is invalid or if it is not contextually convertible to bool.

What does a precondition mean?

A precondition makes certain assumptions explicit. When a function specifies a precondition it is clear that the caller is supposed to guarantee that the precondition holds. This is the contract: one should not call the function if he cannot guarantee satisfying the function’s precondition. This does not necessarily mean that the caller needs to do any checks himself; there are other means of fulfilling the guarantee. To illustrate this consider the precondition of sqrt:

double sqrt(double x);
// precondition: x >= 0

The following three functions guarantee that sqrt’s precondition will hold, even though they do not check the value of the argument:

double fun1()
{
  const double x = 255.0; // using a literal
  return sqrt(x);
}

double fun2(double x, double y)
// precondition: x >= 0 && y >= 0
{
  return sqrt(x) + sqrt(y); // relying on another precondition
}

double fun3(double x)
// precondition: true
{
  return sqrt(std::abs(x)); // std::abs(x) is never negative
}                           // (relying on std::abs's postcondition)

Preconditions (along with other contract-programming concepts) bring order into the program. For instance, in our example with authentication, we may wonder if wiping out one DB table is a bug in the program or is it simply an unfortunate user input over which the program has no control. If function checkIfUserExists requires only valid user names in the precondition it helps set a certain ‘border’: it is acceptable for the user to enter any string, and this string can even enter the program, and this is fine; but the invalid string cannot cross the border; if it does, then (and only then) this is a bug. In other words, preconditions help us distinguish bugs from other unusual situations in the program.
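The ‘border’ described above can be sketched as follows. This is only an illustration under stated assumptions: the USERS table is replaced by a hypothetical in-memory set (knownUsers), and authenticate takes the raw input as a parameter instead of reading it from the UI.

```cpp
#include <regex>
#include <set>
#include <string>

bool isValidName(std::string const& text)
{
  static const std::regex NAME{"\\w+"};
  return std::regex_match(text, NAME);
}

// Assumption: stands in for the USERS table; the real code queries a DB.
std::set<std::string> const& knownUsers()
{
  static const std::set<std::string> users{"tom", "JOHN"};
  return users;
}

bool checkIfUserExists(std::string const& userName)
// precondition: isValidName(userName)
{
  return knownUsers().count(userName) > 0;
}

// The border: raw input may be any string, but only validated
// names are allowed to cross into checkIfUserExists.
bool authenticate(std::string const& rawInput)
// precondition: true
{
  if (!isValidName(rawInput)) {
    return false; // bad user input: an expected situation, not a bug
  }
  return checkIfUserExists(rawInput); // precondition guaranteed by the check above
}
```

Here an invalid string entering authenticate is ordinary input handling. Only a call to checkIfUserExists with an invalid name would count as a bug, and the single check at the border is what rules it out.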

Note also that we could have addressed the problem in a different way:

// NOT RECOMMENDED!
bool checkIfUserExists(std::string userName)
{
  if (!isValidName(userName)) SIGNAL(); 

  std::string query = "select count(*) from USERS where NAME = \'" + userName + "\';";
  return DB::run_sql<int>(query) > 0;
}

bool authenticate()
{
  std::string userName = UI::readUserInput();
  if (!isValidName(userName)) SIGNAL();

  return checkIfUserExists(userName); 
}

Ignore for the moment what SIGNAL() is (although it is a very difficult and important question). In this solution nobody trusts anybody, and everyone just checks for the dangerous condition wherever they can. The author of checkIfUserExists cannot be sure if his caller will validate the input, so he does it himself. Similarly, the author of authenticate cannot be sure if checkIfUserExists will be prepared for any input and has to validate it himself. This solution has certain drawbacks, though. First, performance: we now repeatedly check for the same condition (although in this particular case the cost is negligible compared to DB access). Second, the code becomes messy and the programmers lose control over it. If the author of authenticate at some point gets to see the implementation of checkIfUserExists, he may discover that since the latter already does the check, he can skip his own check for clarity or performance reasons. The author of checkIfUserExists may observe (and implement) the opposite. Moreover, if function checkIfUserExists throws an exception when passed an invalid name, someone may try to use it for validating strings:

bool checkIfNameIsValid(std::string text)
{
  try {
    checkIfUserExists(text);
    return true;
  }
  catch (BadUserName const&) {
    return false;
  }
}

The next problem here is what SIGNAL() should do inside checkIfUserExists. Throw an exception? But is the author of authenticate prepared to handle it? Return an error code? Use errno? Again, can we trust the author of authenticate to check it, given that we do not trust that he would validate the input? Whatever we choose, the program will grow in complexity; and complexity (especially a messy one like this) is likely to cause bugs.

What if a precondition is violated?

Violating a precondition is conceptually similar to dereferencing a null pointer. A function simply works under the assumption that certain conditions are met. If this is not the case, the function is likely to do something other than what its author and its specification intended. This is called undefined behavior. As an example, let’s consider an implementation of function sqrt that uses Newton’s iteration algorithm.

double sqrt(double x)
// precondition: x >= 0
{
    double y = 1.0;
    double prev_y;

    do {
        prev_y = y;
        y = (y + x / y) * 0.5; 
    }
    while (!closeEnough(y, prev_y));

    return y;
}

We keep finding a better and better approximation (variable y) until we find one that lies within an acceptable tolerance. Function closeEnough checks whether the current approximation no longer differs significantly from the previous one. It is not obvious, though, that this loop is guaranteed to ever stop. Our expectation that the algorithm will stop is based on the observation that the difference between y and prev_y decreases in each iteration, and that the final result (call it final_y) satisfies the condition:

closeEnough(final_y, (final_y + x / final_y) * 0.5)

The two are indeed the case, and y and prev_y do in fact converge quickly, but only provided that x is non-negative. The moment we pass a negative x, variables y and prev_y never converge, and our loop never stops.

In consequence, violating a precondition may cause a program to hang. If the end-user computes a simulation, and he knows the simulation will be running for a week, it is only after a week that he will learn that his simulation is not running and the program is simply hanging.
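The non-termination can be observed without actually hanging a program by capping the number of iterations. The sketch below makes two assumptions that are not in the original: closeEnough is defined as an absolute-difference test with a tolerance of 1e-9, and the loop is wrapped in a hypothetical function newtonSteps that gives up after maxSteps.

```cpp
#include <cmath>

// Assumed definition; the post does not show how closeEnough is implemented.
bool closeEnough(double a, double b)
{
  return std::fabs(a - b) < 1e-9;
}

// Same iteration as in sqrt above, but with a cap on the number of steps.
// Returns the number of steps taken, or -1 if the cap was reached.
int newtonSteps(double x, int maxSteps)
{
  double y = 1.0;
  double prev_y;

  for (int step = 1; step <= maxSteps; ++step) {
    prev_y = y;
    y = (y + x / y) * 0.5;
    if (closeEnough(y, prev_y)) {
      return step;
    }
  }
  return -1; // never got close enough
}
```

For x = 2.0 the function returns after a handful of steps. For a negative x, y and x / y always have opposite signs, so consecutive approximations differ by at least sqrt(|x|) and the cap is always reached: newtonSteps(-2.0, 1000) returns -1. Without the cap, this is the loop that never stops.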

Validating the precondition manually

Given that violating the precondition can have severe consequences (just consider providing an out-of-range index when accessing an element in an array), why not just validate it as the first thing inside the function?

As we already said, calling a function whose precondition is not satisfied is undefined behavior: our function can legally do just anything. Using an additional measure to validate the precondition does fit into “anything.” However, a couple of things should be kept in mind.

1. Checking for a precondition inside the function should not relieve us from still specifying the precondition. Even while the check is performed, our clients should not take it for granted. They should not be led to expect that we have committed to performing the check, and they should still comply with the contract. Our check is part of the function’s implementation: it is subject to change. The contract, on the other hand, is part of the interface: it is supposed to be something stable.

To illustrate this, consider the two ways of accessing elements in std::vector:

void test(std::vector<int> const& vec, size_t n)
{
  vec[n];     // (1)
  vec.at(n);  // (2)
}

The first one specifies the following contract:

precondition: n < vec.size();
returns: *(vec.begin() + n).

The second specifies a different contract:

precondition: true (no precondition);
effects: if n < vec.size() then returns *(vec.begin() + n); otherwise throws out_of_range.

This means that the following way of breaking the loop is perfectly valid:

void forEach(std::vector<int> const& v, std::function<void(int)> f)
// precondition: f != nullptr
{
  size_t i = 0;
  try {
    for(;;) {
      f(v.at(i++));
    }
  }
  catch (std::out_of_range const&) {
    // finished loop
  }
}

Using operator[] in place of at above would in turn cause UB (even if this UB happens to result in throwing out_of_range).
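The contract of at can be exercised directly, because throwing std::out_of_range is part of its specified effects. A minimal sketch follows; the function names atReportsOutOfRange and countVisited are hypothetical and only serve the illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Accessing one past the end through at() is well defined: it throws.
bool atReportsOutOfRange(std::vector<int> const& v)
{
  try {
    (void)v.at(v.size());
    return false;
  }
  catch (std::out_of_range const&) {
    return true;
  }
}

// Same loop-breaking idea as forEach above: rely on at() to end the loop.
std::size_t countVisited(std::vector<int> const& v)
{
  std::size_t i = 0;
  try {
    for (;;) {
      (void)v.at(i); // throws once i == v.size()
      ++i;
    }
  }
  catch (std::out_of_range const&) {
    // finished loop
  }
  return i;
}
```

No equivalent sketch can be written for operator[]: there the out-of-range access violates the precondition, so there is no behavior to rely on.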

2. While it may be possible to form an expression that tells valid arguments from invalid ones, executing the expression may be fatal to the program. For instance consider the following function:

template <typename IIT> // requires: InputIterator<IIT>
void displayFirstSecondNext(IIT beg, IIT end)
// precondition: std::distance(beg, end) >= 2
{
  std::cout << "first: " << *beg++ << std::endl;
  std::cout << "second: " << *beg++ << std::endl;
  std::cout << "next: ";

  while (beg != end) {
    std::cout << *beg++ << " ";
  }
}

This function requires that the input range contains two or more elements. We can easily check it using function std::distance, but this would require incrementing the iterator. In case we are given an InputIterator (e.g., the iterator interface for IO-streams), if the iterator is incremented while checking the precondition, we can never get back to the value it was referring to before, and we would not be able to display the first element in the range inside the function body.
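The effect on a single-pass range can be sketched with a string stream. The names CheckOutcome and measureThenRead are hypothetical; the sketch measures the range with std::distance, as a precondition check would, and then asks the underlying stream for one more element.

```cpp
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>

// Result of "validating" a single-pass range with std::distance.
struct CheckOutcome
{
  std::ptrdiff_t counted;   // what the precondition check measured
  bool streamStillReadable; // can the stream still deliver an element?
};

CheckOutcome measureThenRead(std::string const& text)
{
  std::istringstream in(text);
  std::istream_iterator<int> beg(in), end;

  CheckOutcome out;
  out.counted = std::distance(beg, end); // reads the stream to its end

  int leftover = 0;
  out.streamStillReadable = static_cast<bool>(in >> leftover);
  return out;
}
```

For the input "10 20 30" the check correctly counts three elements, but afterwards the stream has nothing left to give: the elements the function body was supposed to display have been consumed by the validation itself.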

Similarly, validating the precondition may change the complexity guarantees of the algorithm. Consider:

bool containsSorted(std::vector<int> const& v, int i)
// precondition: std::is_sorted(begin(v), end(v))
{
  return std::binary_search(begin(v), end(v), i);
}

This function expects a sorted vector and therefore it can offer a logarithmic complexity. However, in order to check if the vector is sorted we have to use an algorithm with a linear complexity. In the end, our function containsSorted requires a sorted vector and offers a linear complexity if it validates the precondition. This can make our program run much much slower. This may be unacceptable even for debug builds.
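The difference in cost can be made visible by counting comparisons. In the sketch below, Costs and measure are hypothetical names, and the counting comparator is an illustration device rather than anything the original code uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

struct Costs
{
  std::size_t check;  // comparisons spent validating the precondition
  std::size_t search; // comparisons spent on the actual binary search
};

Costs measure(std::size_t n)
{
  std::vector<int> v(n);
  std::iota(v.begin(), v.end(), 0); // 0, 1, ..., n-1: already sorted

  std::size_t count = 0;
  auto less = [&count](int a, int b) { ++count; return a < b; };

  bool sorted = std::is_sorted(v.begin(), v.end(), less); // the check
  std::size_t checkCost = count;

  count = 0;
  bool found = std::binary_search(v.begin(), v.end(), 0, less); // the work
  (void)sorted;
  (void)found;

  return Costs{checkCost, count};
}
```

For a vector of 1024 elements the check needs on the order of a thousand comparisons, while the search itself needs roughly a dozen. The validation, not the algorithm, dominates the running time.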

Also, in performance-critical applications, checking the same precondition multiple times unnecessarily slows the program down. Consider:

void forEach(std::vector<int> const& v, std::function<void(int)> f)
// precondition: f != nullptr
{
  for (size_t i = 0; i < v.size(); ++i) { // (1) precondition check
    f(v[i]);
  }
}

By specifying the loop’s terminating condition we already check the precondition of operator[]. If we checked it again inside operator[], we would be unnecessarily making the program run slower.

3. Evaluating the precondition is the easy part; the tough part is how to report the broken precondition. What should function sqrt do if it detects a precondition violation? Return some special value like NaN? But if the caller doesn’t bother to check the precondition, will he bother to check the special return value? Also see this post to see why it can cause more bugs. Putting it differently, since violating the precondition is the failure (bug) of the caller, giving the control back to him in order to fix the problem is unlikely to work. The caller’s job is to satisfy the precondition (it is entirely in his control) — not to handle the consequences of his own fault.

The same issue is applicable to other ways of signaling function failures. You could think of returning a combined value:

optional<double> sqrt(double x);
// precondition: x >= 0

This is problematic also. How do you specify the effects of the function? I.e., what does it do? Let’s try: it returns an ‘uninitialized’ optional in case x < 0. But this is counter to the concept of a precondition. If you know what the function should do for negative arguments, it does not have a precondition: it is well defined for any value of type double, and therefore it can be legally used in ways that we may not like:

optional<double> sqrt(double x);
// precondition: true

bool isNegative(double x)
// precondition: true
{
  return !sqrt(x);
}

A compound value may make sense in cases like converting a string to a number, where not being able to convert a string is a frequent and expected situation:

optional<int> toInt(std::string s);
// precondition: true

int getNumber()
{
  std::string s;
  
  for (;;) {
    std::cout << "enter a number: ";
    std::cin >> s; 
    if (auto ans = toInt(s)) {
      return *ans;
    }
    else {
      std::cout << "this was not a number,  ";
    }
  }
}

But in the case of sqrt this is different. The only case where we would be forced to return the special value to the caller is when we know there is a bug in the caller. We would probably need to specify the contract in the following way: “requires a non-negative double, never returns an uninitialized optional.” But this means that we never need to check for the uninitialized optional. So why return a compound value at all? A better choice would be to alter the type of the function’s argument rather than its return type, but we leave that option for Part II.

If we decided to throw an exception on precondition violation, at least we do not have to involve the return value. However, one problem remains: we are passing control to a caller that we are certain has a bug. Exceptions are used to signal a failure inside the function (failure to do what it is expected to). We are now abusing the exception mechanism a bit to signal a bug outside of the function. Also, it will not work for functions that want to provide a no-fail guarantee:

double sqrt(double x) noexcept;
// precondition: x >= 0

This problem (of non-throwing functions with a precondition) has also been discussed in depth in N3248.

Another option is to stop the program at this point by calling std::terminate. This appears very harsh. On the other hand, we are killing the program that is about to enter a UB, and already has a bug; and std::terminate may give us a chance to collect the information about the program state (create a memory dump or some such) and restart the program. This is in fact the default behavior in the proposal to add Contract Programming to C++: see N1962.
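How std::terminate can give the program a last chance to report its state may be sketched as follows. The handler reportAndAbort and the installer installContractHandler are hypothetical names, and what a real handler would collect (a memory dump, a log entry, a restart request) depends on the application.

```cpp
#include <cstdio>
#include <cstdlib>
#include <exception>

// Hypothetical terminate handler: leave a trace, then stop the process.
// A terminate handler must not return to its caller.
void reportAndAbort()
{
  std::fputs("fatal: contract violation detected, shutting down\n", stderr);
  std::abort();
}

// Registers the handler; whoever assembles the whole program calls this once.
void installContractHandler()
{
  std::set_terminate(reportAndAbort);
}
```

After installContractHandler() has run, any call to std::terminate, including one made on a detected precondition violation, goes through reportAndAbort before the process ends.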

I have to stop here: I do not want the post to be too long. In the next post we will explore how and when to best specify preconditions, what alternatives to preconditions we have, and how compilers and other tools can assist with enforcing the preconditions.


46 Responses to Preconditions — Part I

Bruno says:

January 4, 2013 at 11:20 am

At the moment, using cassert is probably the best solution for checking preconditions such as the one on sqrt?

- Andrzej Krzemieński says:
  
  January 4, 2013 at 11:42 am
  
  If your goal is only to validate the precondition at run-time then assert is one of the best choices (which does not mean it is a good choice). This is because they are used for the same “class” of problems: bugs. assert gives you an option to respond to the detected failure in two ways: either abort or ignore. Other assertions give you more options, like launching the debugger in the place where the assertion failed. I would also recommend trying BOOST_ASSERT. Apart from the abort and ignore options it offers a third: calling a callback function that will be defined by whoever compiles the entire program. This way you can separate the code that detects the bug from the decision about what to do.
  
  But assertions are very different from preconditions. Assertions are not part of function’s interface. They only detect bugs inside implementation. Preconditions detect bugs in the “interfacing” between components.
  
Cedric says:

January 4, 2013 at 2:42 pm

For when a precondition is violated, terminating the program is by far too brutal for most cases I’ve been working on.

Throwing an exception at least gives a chance of terminating the program properly, with a message to the end user, and if the software is well designed there is a chance that we can recover and that only a part of the software will not function. Take your sqrt example and say it is used in a spreadsheet application: when the user enters a wrong formula, you can design your software to issue an error message but not to close, so the user can continue working and find workarounds while the software developer fixes the issue.

- Andrzej Krzemieński says:
  
  January 4, 2013 at 8:57 pm
  
  This question whether to stop a program that has been observed to work incorrectly or to let it run and hope for the best has always been very controversial. For instance, my experience (with a life-critical application) is quite the opposite: when we detect that the program starts working incorrectly, we want to restart it. I guess that the controversy stems from the fact that neither approach (terminate or let it run) is good. The only proper way of dealing with bugs is not to have them or eliminate them. Responding to them at run-time is much too late.
  Speaking a bit idealistically, it is better to make sure preconditions are not violated than to think what to do when they are. There are some tools and techniques that help do that.
  
  - Cedric says:
    
    January 5, 2013 at 3:06 pm
    
    Andrzej,
    I totally agree that the most important thing is to avoid bugs, and clarifying the contract through preconditions contributes to that. Your paper is very clear about that.
    
    Though, we all know it is almost impossible in real life not to have bugs. So it is mandatory to think about what happens when the program fails. Let’s add that this question should be part of the initial thinking and design.
    
    The pros of throwing an “invalid_argument” exception are:
    * If the calling program does nothing in particular, it will terminate the program with an uncaught exception: that’s your recommended behavior
    * The calling program could catch all exceptions, at least to be able to clean up, close, and so on, so as to shut down properly. Only the calling program knows what should be done, so control has to be returned to the caller
    * For UI programs this clean-up can include a message for the end user that avoids the very frustrating brutal closing of the application
    There is one con:
    * If your function never throws an exception, you can use the noexcept keyword, which the compiler could use to perform optimizations
    
    To summarize, I would say that it is not the function itself that should decide what to do in case of a violated precondition (from an object-oriented design point of view, or even from a modular design point of view). Returning control to the calling function is useless as you perfectly explained, and exception throwing seems a good compromise.
    
    - Andrzej Krzemieński says:
      
      January 7, 2013 at 11:14 am
      
      @Cedric: You bring up good points, and I wish I could respond to them properly. However, I believe that analyzing the problem in detail would be too big a topic to fit into a comment here, so I will only stick to clarifying one point I have strong feelings about.
      
      I imagine that for the programs you happened to work with, signalling a broken precondition by throwing an exception is just the thing you need, and I wouldn’t try to convince you otherwise. However, I would like you to look just a bit more warmly at the alternative (suitable for others) of calling std::terminate. My impression (perhaps incorrect) is that people sometimes miss the chance that std::terminate offers. Note that it enables you to clean up critical resources and display an “apology” message. I tried to collect some use cases for the function here.
      
      I am not trying to convince you to use it. I just want to clarify what it is for.
    - sudarkoff says:
      
      January 8, 2013 at 6:20 pm
      
      There are four types of behavior: “expected desired” – that’s what your program does; “expected undesired” – exceptional situations you can foresee and therefore deal with (that’s what @Cedric is talking about, I believe); “unexpected desired” – that’s the infamous “feature, not a bug” case; and finally there’s the “unexpected undesired” – you weren’t expecting this to ever happen and you have no idea what to do about it – terminate is your only friend in this case. That’s what Andrzej’s approach is addressing.
    - Andrzej Krzemieński says:
      
      January 10, 2013 at 10:53 am
      
      @sudarkoff:
      “There are four types of behavior: «expected desired» […]”
      — I like this nomenclature. It gives a good insight into the nature of things we perceive as “undesired”.
      
      This is perhaps why I dislike term “error”. I believe it often groups all undesired things without making the important distinction into “expected” and “unexpected” ones.
- Róbert Dávid says:
  
  January 6, 2013 at 11:47 pm
  
  One size does not fit all.
  For nuclear reactor controller software, with multiple controllers doing Byzantine votes, I would rather have the software crash than do anything.
  For a computer game, I would even skip testing, and just hope it won’t happen, as I need every picosecond of CPU time for more FPS.
  For some cases, I would fire up a debugger. It is really a question of what the software is.
  
  - Cedric says:
    
    January 7, 2013 at 8:53 am
    
    Robert, I entirely agree, it depends on what the software is.
    And the function does not know, so it has to enable the calling software to decide. That was all my point…
    
    - Andrzej Krzemieński says:
      
      January 7, 2013 at 9:40 am
      
      …and the best way to have the calling program decide what to do that I have been able to find so far is what BOOST_ASSERT employs. It just calls a callback. This callback is not defined in the library: whoever assembles the entire program, registers the callback.
      
      Of course, the cases where you are not allowed to check for the precondition, need to be treated separately by some sort of a flag (either a compile-time flag, or an additional parameter to the function potentially evaluating the precondition).
Grout says:

January 4, 2013 at 9:01 pm

“Exception abuse” is I think too strong a term for a function throwing invalid_argument(). The error is thrown where it is first understood to be an error.

- Andrzej Krzemieński says:
  
  January 5, 2013 at 11:31 am
  @Grout: This is an interesting argument. Although, I am not sure I entirely agree.
  
  I definitely agree with a similar statement, that you cannot report a situation before you detect it. I would be willing to accept that “exception abuse” is too strong a term provided that we accept the following “best practice” for exception handling mechanism: “throw an exception whenever you feel the stack unwinding mechanism should be launched”.
  
  My concern with your statement is about the word “error,” which means different things for different people. What do you mean by “error”?
  1. A bug?
  2. An unusual situation in a correct program (like bad user input, or running out of system resources)?
  3. Both of these?
  In case of #2, you always detect the unusual situation right in time. If the OS cannot provide the resources you ask for, it gives you the negative reply correctly and immediately. In fact, there is nothing incorrect about being short of resources. Therefore, upon throwing exceptions (in case #2) you are sure you detected the situation the moment it occurred. The process of stack unwinding is run inside a correct program.
  
  In case of a bug, as you noted, you do not know when it occurred and how long it has been there, and how vast it is. You are launching stack unwinding in a program that is in an invalid state. Stack unwinding calls the functions you registered: destructors, copy constructors of exception objects, exception handlers. These functions likely also rely on valid program state, they may also come with preconditions. Even the process of stack unwinding may further corrupt your program’s state.
  
  This is my — a bit catastrophic — view of using exceptions for signalling bugs. I do not try to discourage you from using exceptions though, because I cannot offer a much better alternative. My goal is for the readers not to feel good about using exceptions for signalling bugs. If you signal bugs with exceptions, you do it because out of the bad alternatives you pick one that you believe is the least bad.
  
  Bugs — ideally — should be avoided: not detected and signaled.
  - Mike says:
    
    January 7, 2013 at 8:55 pm
    
    I think this is really the crux of it. There’s much in your reply I agree with.
    
    Over the years, I’ve found myself worrying less about “exceptions” as such and more about how to detect and report bugs in software. Out-of-memory conditions, etc. are certainly valid concerns but they are little more than annoying mosquitoes when compared to the inevitability of real bugs (I say this despite working on software that routinely pushes the physical resource limits of the machines it runs on). More important is to express (and verify) the invariants (and assumptions) your code operates under. This is what brings the latent bugs to light.
    
    I appreciate BOOST_ASSERT()-like macros for development purposes because they do provide options for how to deal with bugs during development/testing. I don’t regard them as much help once released into the wild other than for logging purposes. Out in the wild, there is a hard requirement that you do something sensible once an error is detected. For library-level code, “do something sensible” generally means “let somebody higher up in the callstack decide what to do”. You cast exceptions as a least-of-many-evils but I regard it as the only sensible thing to do. Even in my wildest dreams, I cannot conceive of a better general solution. Perhaps my imagination is too limiting…
    
    If unwinding the stack causes further state corruption, that’s probably an indication of a(-nother) bug in your code — that is, not implementing the basic guarantee correctly. Fix it.
    
    - Andrzej Krzemieński says:
      
      January 7, 2013 at 9:26 pm
      
      If unwinding the stack causes further state corruption, that’s probably an indication of a(-nother) bug in your code — that is, not implementing the basic guarantee correctly. Fix it.
      
      Such “further corruption” is not necessarily a second bug. You measure/define/provide the correctness of a software component in the form of a deal or contract: you only get the correct results if you ascertain the correct prerequisites. If you do not meet the necessary prerequisites you are not guaranteed the results, and this “conditional” or “transitive” guarantee is not a bug. In the post above, if function sqrt implemented with Newton’s iteration enters an infinite loop because someone provided a negative input, it is not a second bug. It is still the first bug and its consequences.
      
      Similarly, stack unwinding is supposed to work only given that certain preconditions are met: namely, that the program is in valid state. If it is not, stack unwinding may further “explore” the first bug. I will try to come up with an example that illustrates it in one of the next posts.
      
      Let me just share one interesting example with exceptions used for precondition failures. In this case it is an invariant rather than a precondition:
      
      struct Thing {
        // invariant: this->isValid()
        void fun();
        ~Thing();
      };

      void use(Thing t)
      {
        // ...
        // t becomes invalid
        t.fun(); // throws: invariant failure
      } // std::terminate: invariant in destructor failed
      
      If fun detects the invariant failure, it throws. During stack unwinding we call t’s destructor. The destructor also requires that the invariant holds, so the destructor throws also, during stack unwinding…
      
      Not that I am against calling std::terminate on contract failures, but it is still worth noting.
  - Mike says:
    
    January 9, 2013 at 10:51 pm
    
    [This is a follow-up to your reply to my earlier post, comment 889.]
    “Similarly, stack unwinding is supposed to work only given that certain preconditions are met: namely, that the program is in valid state.”
    
    Under what circumstances (short of memory corruption), in a fully basic exception safe program, could the program be in an invalid state? I’m struggling to see how that might occur… Perhaps I’m not using the same definition of “valid” that you are.
    
    “The destructor also requires that the invariant holds, so the destructor throws also, during stack unwinding…”
    
    If the destructor is noexcept (the usual case), then this is your second bug. If the destructor is designed to throw, then I guess the designer is sorta getting what they asked for: std::terminate().
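
    The “usual case” mentioned above can be checked at compile time. The following is a minimal sketch, assuming a C++11 compiler; the type names QuietThing and LoudThing are made up for illustration and do not come from the post:

```cpp
#include <cassert>
#include <type_traits>

// A destructor with no exception specification is implicitly noexcept
// (as long as no base or member has a potentially-throwing destructor).
struct QuietThing {
  ~QuietThing() {}
};

// A destructor that is allowed to throw has to say so explicitly.
struct LoudThing {
  ~LoudThing() noexcept(false) {}
};

static_assert(std::is_nothrow_destructible<QuietThing>::value,
              "implicit destructor exception specification is noexcept");
static_assert(!std::is_nothrow_destructible<LoudThing>::value,
              "noexcept(false) makes the destructor potentially throwing");
```

    An exception that escapes ~QuietThing() ends in std::terminate() whether or not stack unwinding is in progress; an exception that escapes ~LoudThing() does so only when it is thrown during stack unwinding.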
    
    In any case, I would think functional languages have many fewer problems of this kind. Side-effects seem to be the genesis of most of the error handling complexity. Eliminate side-effects and you free the programmer to make linear decisions about linear code. Easier said than done of course…
    
    Reply
    - Andrzej Krzemieński says:
      
      January 10, 2013 at 10:03 am
      
      “Under what circumstances (short of memory corruption), in a fully basic exception safe program, could the program be in an invalid state? I’m struggling to see how that might occur… Perhaps I’m not using the same definition of ‘valid’ that you are.”
      
      … or perhaps we mean something different by “basic exception safety guarantee.”
      
      The notion of “invalid state” is in general very fuzzy, but invariants you put into the code make it objective. If your class, for instance, specifies an invariant (as a predicate) and there exists an object of this class that happens not to satisfy the predicate, you have objectively detected an invalid state (I called it a “bug” in my post, but perhaps “invalid state” is a more accurate term).
      
      By “basic exception safety guarantee” I understand the statement: “if a function ends in throwing an exception, it leaves the objects that it tried to mutate in a valid (but perhaps unspecified) state”. This does not require anything of the function in cases where it does not throw an exception. Let me illustrate it with a code example. Pardon me, but for now I can only come up with a silly one.
      
      struct Thing {
        bool state = true;
        INVARIANT(state == true);

        void run() noexcept
        {
          state = false; // BUG
        }
      };

      Thing t;
      t.run(); // provides basic exception safety
      
      Function run provides basic exception safety because it never throws, so it is never the case that after it throws it leaves the object in an invalid state. Yet after the call the invariant is broken.
      
      —
      
      “If the destructor is noexcept (the usual case), then this is your second bug. If the destructor is designed to throw, then I guess the designer is sorta getting what they asked for: std::terminate().”
      
      I agree with: (simplifying a bit) if I throw from destructor, it is my fault that std::terminate is called.
      
      However, note that the guideline “throw if you detect an invariant failure” forces me to throw from destructors. I conclude that this guideline is itself dangerous.
      
      —
      
      “In any case, I would think functional languages have many fewer problems of this kind. Side-effects seem to be the genesis of most of the error handling complexity. Eliminate side-effects and you free the programmer to make linear decisions about linear code. Easier said than done of course…”
      
      Not sure if I fully understand what you are saying here, but I intuitively agree. Functional languages are safer in nature (at least in my limited understanding).
      
      However, there are reasons people would still want to write in imperative languages: (1) some things are expressed easier when you can mutate objects, (2) imperative languages have potential to generate faster code (at least today).
bitsmaker says:

January 7, 2013 at 3:38 am

Andrzej,

Great article as always. What do you think about Andrei Alexandrescu’s method of using his Expected template class as described here http://isocpp.org/blog/2012/12/systematic-error-handling-in-c-andrei-alexandrescu for the return type?

Obviously one would still be handling a precondition error in the function but the noexcept guarantee would be preserved. When the caller tries to use the value returned from sqrt the exception would be thrown.

Reply
- Andrzej Krzemieński says:
  
  January 7, 2013 at 4:57 pm
  
  @bitsmaker: I only had a look at the slides (didn’t watch the video) but it looks like Andrei is presenting a tool for run-time exceptions or exceptional results, not for signalling bugs. It is similar to the example of sqrt returning optional<double> (mentioned in this post), except that we have additional info in case the optional is null.
  
  Are you asking this in the context of detecting precondition failures? I do not think using it for signalling bugs would be intuitive. And forcing the programmers to ‘convert’ from Expected<T> to T everywhere only to check situations that must never occur (precondition violation) would IMHO cause more problems than it solves.
  
  Reply
  - bitsmaker says:
    
    January 7, 2013 at 5:42 pm
    
    Yes, I was asking in the context of detecting precondition failures. If you see slide 28 and 29 you will see that we do not have to convert. One calls Expected’s get method and the get method will throw in case of a bad value (that seems pretty intuitive to me but perhaps what you will include in the next article may be better). Otherwise one can continue processing. Of course one still has to call ‘get’.
    
    Reply
    - Andrzej Krzemieński says:
      
      January 7, 2013 at 9:02 pm
      
      I used the term ‘convert’ incorrectly. I meant that you cannot just call:
      
      double y = sqrt(x);
      
      And I have to add the extra get, which is not that bad alone, but consider a longer expression:
      
      double d = log(sqrt(x) * sqrt(y));
      
      Now it becomes:
      
      double d = log(sqrt(x).get() * sqrt(y).get()).get();
      
      Note that boost::optional offers a similar functionality:
      
      boost::optional<double> sqrt(double x);
      double y = *sqrt(x);
      
      Accessing the contained value in an uninitialized optional triggers BOOST_ASSERT, which can be configured to throw an exception. However, throwing an exception is at least controversial (as we can see in this discussion). For instance, consider a slightly modified example with function fun2 from this post, but using Expected:
      
      Expected<double> sqrt(double x) noexcept;
      // precondition: ?? none ?? or x >= 0 ??

      double fun2(double x, double y) noexcept
      // precondition: x >= 0 && y >= 0
      {
        return sqrt(x).get() + sqrt(y).get();
      }
      
      If the unchecked get throws, reaching the noexcept border triggers the call to std::terminate. (Well, didn’t I say I like std::terminate?) Or you can be more cautious:
      
      Expected<double> sqrt(double x) noexcept;
      // precondition: ?? none ?? or x >= 0 ??

      Expected<double> fun2(double x, double y) noexcept
      // precondition: none?
      {
        auto ax = sqrt(x);
        if (!ax) {
          return ax; // reuse the same exception
        }

        auto ay = sqrt(y);
        if (!ay) {
          return ay;
        }

        return ax.get() + ay.get();
      }
      
      Now we are back in the good old days of error code handling. The reason C++ (and then other languages) introduced exceptions was to avoid error handling like this. When people do it, the error handling code brings in so much complexity that the programmers can hardly focus on the proper logic (the “happy path”) and they are more likely to plant a bug.
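
      To make the cautious variant concrete, here is a minimal compilable sketch. The Expected template below is a hypothetical, stripped-down stand-in written for this illustration (it is not Andrei Alexandrescu’s implementation, and it assumes T is default-constructible); the name checkedSqrt is also made up, to avoid clashing with std::sqrt:

```cpp
#include <cassert>
#include <cmath>
#include <exception>
#include <stdexcept>
#include <utility>

// Hypothetical minimal Expected<T>: holds either a value or a stored exception.
// Assumption for brevity: T is default-constructible.
template <typename T>
class Expected
{
  T value_{};
  std::exception_ptr error_; // null when a value is present

public:
  Expected(T v) : value_(std::move(v)) {}
  Expected(std::exception_ptr e) : error_(std::move(e)) {}

  explicit operator bool() const noexcept { return !error_; }

  // Returns the value, or rethrows the stored exception.
  const T& get() const
  {
    if (error_) std::rethrow_exception(error_);
    return value_;
  }
};

// Reports a negative argument through the return value instead of throwing.
Expected<double> checkedSqrt(double x) noexcept
{
  if (x < 0)
    return std::make_exception_ptr(std::domain_error("sqrt: negative argument"));
  return std::sqrt(x);
}

// The "cautious" fun2: every intermediate result is inspected by hand.
Expected<double> fun2(double x, double y) noexcept
{
  auto ax = checkedSqrt(x);
  if (!ax) return ax; // reuse the same exception
  auto ay = checkedSqrt(y);
  if (!ay) return ay;
  return ax.get() + ay.get(); // cannot throw here: both were checked
}
```

      Calling fun2(9.0, 16.0).get() yields the sum of the two roots, while fun2(-1.0, 4.0) converts to false and its get() rethrows the stored std::domain_error.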
      
      On the other hand, I cannot offer a much better alternative. In my next post I will try to focus on how to avoid breaking the preconditions rather than responding to the detected breakage.
Cedric says:

January 7, 2013 at 11:21 pm

@Andrzej: I think we agree on most things: calling std::terminate is good, and it is what gets called when an exception is not caught. Calling std::terminate in a catch block is clear to me and makes explicit what you want to happen.

I never liked handlers: it is never clear who sets the termination routine, and you need some kind of “errno”-like static information if you want to carry a bit of information about the failure.

About the BOOST_ASSERT, it opens another discussion (perhaps a coming blog entry from you?): should checking occur in all builds or only in some (typically DEBUG) builds? Also, it is not convenient for cpp files, as the caller has no control over the macros that were set during compilation.

Once again, this is not the main part of your article. Thank you for taking the time.

Reply
Pingback: Design by Contract | In pausa
Ben Hanson says:

January 8, 2013 at 10:39 am

Ideally whole classes of bugs would be caught at compile time. static_assert is a start, but with constexpr we can go much further (as soon as there is support in VC++ I will add this support to our database library at work). As far as the SQL injection example goes, the solution is SQL parameters. Again I am waiting for VC++ to catch up, but when it does I will add a means to apply these automatically using a combined SQL formatter and executor that uses variadic templates.

Reply
innochenti says:

January 8, 2013 at 2:24 pm

What is the future of N1962 proposal?

Reply
- Andrzej Krzemieński says:
  
  January 8, 2013 at 3:11 pm
  
  It is very unlikely it will make it into planned C++17. See this statement from N1962’s author.
  
  Reply
Elron says:

January 10, 2013 at 2:14 pm

You could use more exact types, e.g., PositiveDouble sqrt(PositiveDouble)
The caller of sqrt is responsible for creating a PositiveDouble, and if it does so by converting from a double, it can decide how to deal with errors right there where it’s most relevant.
Likewise for ValidName, NonEmptyVector, VectorOfAtLeastSizeN, etc. It gets tedious, but it formalizes each party’s obligations.
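
A minimal sketch of this idea, assuming C++17 for std::optional. The names NonNegativeDouble, make and safeSqrt are illustrative only (the type is called non-negative rather than positive here so that zero is accepted):

```cpp
#include <cassert>
#include <cmath>
#include <optional>

// Hypothetical constrained type: every object of it holds a value >= 0.
class NonNegativeDouble
{
  double value_;
  explicit NonNegativeDouble(double v) : value_(v) {}

  friend NonNegativeDouble safeSqrt(NonNegativeDouble x);

public:
  // The only public way in: the caller decides right here what to do
  // when the conversion from a plain double fails (NaN is rejected too).
  static std::optional<NonNegativeDouble> make(double v)
  {
    if (v >= 0) return NonNegativeDouble(v);
    return std::nullopt;
  }

  double get() const { return value_; }
};

// No precondition left to state: the parameter type already carries it.
NonNegativeDouble safeSqrt(NonNegativeDouble x)
{
  return NonNegativeDouble(std::sqrt(x.get()));
}
```

The check happens once, in make, at the point where the caller still knows where the number came from; safeSqrt itself has nothing left to validate.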

Reply
- Andrzej Krzemieński says:
  
  January 10, 2013 at 2:26 pm
  
  Quite so! This is what I want to explore in my next post. And you spoiled the surprise 😦
  
  Reply
  - Elron says:
    
    January 10, 2013 at 2:48 pm
    
    My apologies, but I’m sure you’ll have much to teach us about how to make this convenient with what tools C++ gives us!
    
    Reply
mortoray says:

January 10, 2013 at 4:14 pm

I don’t see precondition violations as the same as dereferencing a null pointer. Indeed, dereferencing a null pointer would seem to result directly from failing to do a proper precondition check.

Newton’s method is a good example of where preconditions cannot properly check the input, which implies there must be runtime checks at the same level as preconditions. The sqrt function is simple, but if you took a periodic function, Newton’s method may diverge. Clearly this is because the input is invalid, but the only way to determine this is by doing the calculation.

I don’t argue against static_asserts. Anything which can be detected at compile time should prevent compilation; this is indeed the ideal situation. After compilation though I consider all errors to be runtime errors and the program must deal with them in a suitable runtime fashion — you may no longer treat this as a development build.

Reply
- Andrzej Krzemieński says:
  
  January 10, 2013 at 4:53 pm
  “I don’t see precondition violations as the same as dereferencing a null pointer. Indeed, dereferencing a null pointer would seem to result directly from failing to do a proper precondition check.”
  
  — the analogy I see between the two is the guarantee you get in case you dereference a null pointer and/or when you call a function and fail to satisfy its precondition: you have no guarantee what happens; you have a UB (although in the latter case you have a UB at higher level of abstraction)
  
  Similarly, you do not have to check the pointer before dereferencing it safely; you only need to make sure that it is not null:
```
Thing t;
Thing * pt = &t;
pt->work(); // safe dereference w/o explicit check
```
  —
  
  “Newton’s method is a good example of where preconditions cannot properly check the input, which implies there must be runtime checks at the same level as preconditions.”
  
  — this is quite the opposite to my understanding. A non-negative input to the particular case of Newton’s algorithm for solving only a square root is sufficient to guarantee that the algorithm will stop. This appears to have been proven. In this case the caller is able to verify the precondition.
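
  As an illustration only, a square root via Newton’s iteration might be sketched as follows. The function name newtonSqrt, the starting guess and the stopping rule are choices made for this sketch, not taken from the post:

```cpp
#include <cassert>
#include <cmath>

// precondition: x >= 0 (and x is finite)
double newtonSqrt(double x)
{
  assert(x >= 0); // debug-time check only; not part of the contract
  if (x == 0) return 0;

  // Starting at or above sqrt(x), the iterates decrease towards the root.
  double guess = x > 1 ? x : 1;
  for (;;) {
    double next = 0.5 * (guess + x / guess);
    if (!(next < guess)) return guess; // no further progress: stop
    guess = next;
  }
}
```

  For a non-negative finite argument the guesses form a strictly decreasing sequence of positive doubles, so the loop has to stop. For a negative argument the function promises nothing about its result, which is exactly what stating the precondition expresses.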
  
  In the case of functions where you cannot check whether the function will succeed by only inspecting its inputs, you cannot state the precondition. I agree. For instance, function open_file(std::string) cannot state a precondition that the file should exist, because the existence of the file is a “run-time” property that can change asynchronously relative to our program.
  
  —
  
  “After compilation though I consider all errors to be runtime errors and the program must deal with them in a suitable runtime fashion — you may no longer treat this as a development build.”
  
  — One can hardly argue with that. But let me recall the case of function displayFirstSecondNext above: not every precondition can be turned into a run-time check.
  Reply
  - mortoray says:
    
    January 10, 2013 at 6:14 pm
    
    My intention for preconditions and postconditions is that rather than have UB I would like to have defined and sane behaviour. I don’t consider it hard to achieve and I believe it leads to far more stable systems.
    
    For Newton’s method I meant the general case; sqrt is probably fine.
    
    For displayFirstSecondNext, yes, there is no way to check this precondition in the general case. Once you start outputting you’ve corrupted the system state and have a different class of error: there is no way to completely unwind at that point (restore to state prior to function call).
    
    Reply
    - Andrzej Krzemieński says:
      
      January 10, 2013 at 7:53 pm
      
      “My intention for preconditions and postconditions is that rather than have UB I would like to have defined and sane behavior. I don’t consider it hard to achieve and I believe it leads to far more stable systems.”
      
      — Your point of view has been applied in std::sqrt. The function accepts any input (including negative values) and either computes the square root or sets the value of errno to EDOM.
      
      Under the definition of a precondition that I tried to convey in my post, this means that function std::sqrt imposes no precondition on its input values. We may call it only a question of definition.
      
      This observation is analogous to the difference between member function at and operator[] for accessing elements of std::vector. The latter has good reasons for not validating its precondition itself.
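
      The difference can be sketched in a few lines. The helper names atRejects and uncheckedGet are made up for this illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// at() imposes no precondition on the index: a bad index is a documented,
// well-defined outcome (an exception of type std::out_of_range).
bool atRejects(const std::vector<int>& v, std::size_t i)
{
  try {
    (void)v.at(i);
    return false;
  }
  catch (const std::out_of_range&) {
    return true;
  }
}

// operator[] has a precondition: i < v.size().
// Violating it is undefined behaviour; the assert is only a debug-time aid.
int uncheckedGet(const std::vector<int>& v, std::size_t i)
{
  assert(i < v.size());
  return v[i];
}
```

      With a three-element vector, atRejects(v, 3) returns true, while calling uncheckedGet(v, 3) would simply be a bug in the caller.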
    - mortoray says:
      
      January 11, 2013 at 3:46 am
      
      Is it fair to call them checked and unchecked preconditions? Functions with checked preconditions are then safe to call with any input and those without result in undefined behaviour on invalid input.
      
      I definitely agree the need to have unchecked preconditions exists for performance reasons.
    - Andrzej Krzemieński says:
      
      January 11, 2013 at 8:25 am
      
      “Is it fair to call them checked and unchecked preconditions?”
      
      — I prefer to look at it as two different definitions of a “precondition”. Under my definition a function gives you guarantees as to its behavior only when you satisfy the precondition. It sets the limits within which a function can guarantee its behavior (throwing exceptions is part of these guarantees). That is, “precondition” is related to providing or not providing a guarantee of behavior.
      
      In the other point of view, you guarantee certain behavior even on precondition failure. This way you guarantee that a “precondition” violation detected at run-time does not pass unnoticed. However, it is now difficult to strictly define what a “precondition” is, because you can no longer tie it to giving or not giving guarantees. This causes practical problems. Consider std::sqrt:
      
      double AbsoluteValue(double x)
      {
        errno = 0;
        std::sqrt(x);
        if (errno == EDOM) {
          errno = 0;
          return -x;
        }
        else {
          return x;
        }
      }
      
      What can you say about this implementation of absolute value computation? Is it correct? Is it a bug? Should it be allowed? If you allow providing negative values to sqrt and you guarantee the results, you must accept that this implementation of AbsoluteValue is correct and well defined. But that feels so wrong.
      
      Nonetheless, I expect that a decent Contract framework (like Contract++), even if it takes the approach “precondition failure –> no guarantee”, will allow you to specify whether you want to enable or disable the check for a precondition. For instance:
      
      double sqrt(double x)
        precondition< build_mode == debug >(x >= 0);
      // (test only in debug mode)

      // or

      double sqrt(double x)
        precondition<true>(x >= 0);
      // (test always)

      constexpr bool doLin = evaluate_linear_preconds == true || build_mode == debug;

      bool containsSorted(std::vector<int> const& v, int i)
        precondition<doLin>( std::is_sorted(begin(v), end(v)) );
    - mortoray says:
      
      January 11, 2013 at 1:52 pm
      
      To me, the AbsoluteValue function would not be okay because I don’t believe errors should carry a lot of functionally useful information. That is, I believe the error should carry information that, if reported, would aid in debugging or operator correction, but not necessarily runtime aids. They can say what type of error it is (failed calculation, invalid state, etc.), but not specifics of the error.
      
      That is, though they should be expected to happen, they should not be relied upon to give meaning to the inputs of the function. So when “sqrt” sets a non-zero errno value you have to assume that the calculation did not work, but it reveals no information about the parameter you provided.
      
      This would be the same for the checkIfUserExists function. Should an error be returned it does not reveal anything about that user. You cannot assume it means the user does not exist, only a “false” non-error path return can reveal that.
      
      I also don’t think these runtime preconditions must necessarily be guaranteed, rather they should be “expected”. The “sqrt” function may or may not be doing the precondition check. That is, if a program is given 100% correct inputs its output should be correct even if completely absent of runtime error checks. Error checks should be used to discover and recover from erroneous input.
      
      This approach is not the classic one, I admit. Especially with errno, which has been abused to carry functionally relevant information from a great number of system calls. A lot of these I do not consider errors; they are rather extended reporting mechanisms of the function itself. Error paths should be for errors only.
Joel Lamotte says:

January 15, 2013 at 10:58 am

Hi, your blog is incredibly interesting, in particular on error handling, which is a big subject for me as I’m currently struggling with related problems in my current work (a complex game, whose error handling is also complex).
I can’t wait to see the rest of these articles (the one about terminate() helped me A LOT) even if we had an unfortunate prediction in the comments 😀

Is there a reason this blog doesn’t expose any RSS link?

Reply
- Andrzej Krzemieński says:
  
  January 15, 2013 at 12:19 pm
  
  Hi Joel, thank you for your kind words.
  
  I am not very familiar with the RSS mechanism, so I am not sure I understand what you are asking about. The following link looks like an RSS link:
  https://akrzemi1.wordpress.com/feed/
  
  Reply
  - Joel Lamotte says:
    
    January 16, 2013 at 9:24 am
    
    Yes that’s the link I was looking for (once registered I’ll be notified if you post new content). It’s not exposed on the main page.
    
    Reply
    - Andrzej Krzemieński says:
      
      January 16, 2013 at 9:39 am
      
      Thanks for the suggestion. I will add it when I have some time.
Markus Klein says:

January 23, 2013 at 6:16 pm

Hello Andrzej,
really nice article. Well written, and I think the code examples, especially the first one with the SQL injection, illustrate the problem very well. I blogged on preconditions in C++ the very same day, so I was very interested to read how you would tackle the problem. We share many views; however, I have to disagree on one point. There is a language feature which is quite adept at handling most preconditions: type safety.
Taking the ‘bool isValidName( std::string const& text )’ function and putting it in the constructor of a type called ‘user_name’ would allow me to express the precondition in the function’s signature:
checkIfUserExists(user_name const&);
Just wonder what you think of this. Keep up the good work!
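
A minimal sketch of what such a type could look like. The validity rule inside isValidName (non-empty, letters and digits only), the decision to throw from the constructor, and the helper buildQuery are all assumptions made for this illustration; other designs are possible:

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical rule: a valid name is non-empty and purely alphanumeric.
bool isValidName(std::string const& text)
{
  if (text.empty()) return false;
  for (unsigned char c : text)
    if (!std::isalnum(c)) return false;
  return true;
}

// Once constructed, a user_name always holds a valid name.
class user_name
{
  std::string value_;

public:
  // One possible choice: validate at run-time and throw on failure.
  explicit user_name(std::string text) : value_(std::move(text))
  {
    if (!isValidName(value_))
      throw std::invalid_argument("invalid user name");
  }

  std::string const& str() const { return value_; }
};

// The signature now says what the function expects.
std::string buildQuery(user_name const& name)
{
  return "select count(*) from USERS where NAME = '" + name.str() + "';";
}
```

With this in place, user_name("tom") can be passed on to build the query, whereas the malicious input from the post is rejected when the user_name object is constructed, before any SQL text is assembled.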

Reply
- Cedric says:
  
  January 24, 2013 at 8:46 am
  
  Hi Markus,
  that’s a very good and obvious piece of advice. And not only for preconditions: it is just the principle of encapsulation and OO design that makes changes easier.
  
  Reply
Andrzej Krzemieński says:

January 23, 2013 at 7:57 pm

Hi Markus,
I hope you do not mind if I paste the link to your post here? http://clean-cpp.org/expressing-preconditions-types/. It is a good observation. “Elron” also suggested it in one of the comments above.

I like this idea. I believe it is the right way to go, and this is what I want to explore in the second part. I also believe this approach has certain disadvantages. But allow me some time to come up with convincing examples and a well-thought-out explanation. Just to give you an idea of what I am thinking about, consider how the constructor of the proposed type user_name might be implemented. Does it need to specify a precondition?

BTW, perhaps you know it already, there is a library Constrained Value that helps build constrained types. It has been proposed and provisionally accepted to Boost.

Reply
- markus klein says:
  
  January 24, 2013 at 11:23 pm
  
  Hi Andrzej, of course I do not mind you posting links to my blog. It’s much more fun writing if someone is reading, isn’t it? I’m happy to read that you like the idea of expressing preconditions as types and I’m looking forward to your next post.
  
  Reply
- Sławomir says:
  
  February 6, 2013 at 4:11 pm
  
  Hi Andrzej, great article! I very much enjoyed reading it. The discussion in the comments is also very interesting. Like Markus, I’m waiting for your research on type-based constraints.
  Regarding the Boost library you mentioned: has its status changed since 2008? Perhaps I’m missing something, but this is the year I see on the library page.
  
  Reply
  - Andrzej Krzemieński says:
    
    February 6, 2013 at 4:39 pm
    
    Have a look at this Review Wizard Report for November 2012. It says that Constrained Value “has been accepted to Boost (September 2010), but has not yet been submitted to SVN”.
    
    Reply