Preconditions — Part IV

This is the last post about preconditions. We will try to address concerns about a potential UB connected with expressing preconditions. We will also try to explore how language support for preconditions could look like.

Preconditions and UB-safety

The definition of a precondition I have been trying to advertise somehow threatens us with lots of potential UB situations. You may feel as though I was encouraging you to plant more and more UB. So let me explain myself.

Let’s consider a project where you prefer that whatever wrong happens in the program ‘module’, you want this to be reported via an exception: never crash or stop or enter a UB. One example of such situation is where you have written a server, which is executing tasks: pieces of code written by different users that you have no control over. But you do not want the server to crash only because one of the users has a bug in his code, in his task/module. You want that module to throw exceptions upon any failure; even if that failure is caused by a bug. Another such example is where you do batch job which consists of doing thousands of similar small tasks. The entire process runs for a week. You do not want the failure in one task to stop the entire job (in case UB ends in a crash). You want to skip that task and to proceed to the next one. You are willing to take the risk of entering a UB and having the program do random, uncontrolled things. We will refer to such situation as “no-halt guarantee”.

Before we address the UB concern; if you require the no-halt guarantee, answer this question. In such projects, do you use std::vector’s operator [] or do you dereference raw pointers? Both these operations provide a precondition: vector’s index must be in range; dereferenced pointer must point to a valid object — at minimum it must not be null.

If your answer to this question is “I am not using these”, you probably do not require a language that works ‘close to metal’, and you could use a language based on some virtual machine, which leaves almost no room for a UB at the cost of reduced performance. Such languages do not require preconditions, because the behaviour is always defined. Somewhat simplifying things, C++ provides UB to allow maximum uncompromised run-time efficiency. The ‘UB way’ helps avoid situations where the same ‘precondition’ is checked multiple times for safety. Preconditions described in this series of posts are a generalization of this ‘UB way’ for user defined functions.

If your position is “I am forced to use vector and raw pointers, but for anything else I want a no-UB style”, then quite possibly the preconditions is not the feature for you. Going back to the example from Part III, for the division operator of type BigInt, what should happen if the divisor is 0? If you require a guarantee that the function throws on zero divisor, your division is well-defined for the zero divisor: it is well-defined what is going to happen. This means there is no precondition. Paper N3248 calls such situation a wide contract: any function input is valid. In the case of BigInt, it spoils the mathematical abstraction, but at least provides the no-halt guarantee.

So who needs these preconditions? First, above we only described one possible trade-off between factors like program correctness, efficiency and no-halt guarantee. There are other valid trade-offs. C++ is often used in applications where performance is the top priority. You want to avoid the situation of redundant safety checks: repeated checks for the same condition.

Also, the arguments in favour of using preconditions might become stronger if we had language support for preconditions and compiler which would understand them and could help us find bugs in the code. Below we try to show how such support could look like.

One thing that I want to point out is that just checking every possible “invalid function input” and throwing an exception on it does not necessarily make the program more correct or safe. It only prevents crashes, but not bugs. Bugs, in turn, are a sort of UB on the higher level of abstraction.

A potential language feature

The thing I describe in this section is more like a fantasy. I have never seen a feature like this, although it looks to me it is implementable.

So, how would an ideal support for preconditions look in C++? This would be a facility for helping the compiler in the static analysis of the program code. This would be somewhat similar to how the function argument type checking works. Declaration like:

void fun(string s);

guarantees two things. The compiler will enforce that the callers always pass the argument that is a string. Inside the function, we are guaranteed that the value of s will always contain a valid string. Am I saying too obvious things? Note that some scripting languages do not offer this compile-time guarantee.

How would that work for preconditions? The mechanism would also be twofold. The compiler would verify (as much as possible) if the caller satisfied the precondition. Inside function body, it could apply code optimizations based on the assumption that the precondition holds.

Let’s see how it works by example. We will use fairly well-familiar libraries: Boost.Optional and std::function. Consider the following function.

void apply(function<void(int)> f, optional<int> i)
{
  f(*i);
}

Consciously or not, we made certain assumptions about the states of i and f. You probably do not want to dereference i if it has not been set with an integral value. Similarly, you probably do not wont to invoke f if it has been default-initialized, without any function that could be called.

Accessing the value contained inside optional<int> has a precondition. If we had preconditions in C++ the operation could be declared as:

template <typename T>
T& optional<T>::operator*()
precondition{ bool(*this) };

Optional does not verify the precondition, because it assumes you already made sure it holds, and it does not want to duplicate the check. You should check the precondition not because of “safety issues” but because of the business logic of your program.

In contrast, note that invoking default-constructed std::functions is well defined. It is guaranteed to throw exception of type std::bad_function_call. There is no precondition here. if we wanted to make it very explicit, we could write:

template <typename R, typename... Args>
R function<R(Args...)>::operator()(Args... args)
precondition{ true };

In C++11 our function apply compiles fine, and works somehow. We have to be careful what we pass to it, but — what is important now — it compiles. With the hypothetical support for preconditions this function would not compile, or at minimum, the compiler would issue a warning message (which you can turn into an error with compiler switches). The message would indicate that we attempted to use a function (operator*) but it cannot be guaranteed that its precondition would always hold. This would be an indication to the programmer that he needs to change something about this code, so that a reasonable guarantee is given. He could do it in a couple of ways.

First, — and this is least preferred way because it usually hides a more serious logic error — we can just check for the precondition using an if-statement:

void apply1(function<void(int)> f, optional<int> i)
{
  if (i) {
     f(*i); // ok
  }
}

Static analyser can be sure that the precondition is satisfied. We are inside the branch that is guarded by expression that is exactly our precondition. Object i has not been modified between the check and the invocation of operator*. Note that expression bool(i) that we are implicitly invoking is also a function whose precondition, in general, should be checked. However, bool(i) has a wide contract (has no precondition).

Second, we can rely on other function’s postcondition. Here, we are changing the semantics of the function significantly, but the goal is to provide you with the overview of the feature.

void apply2(function<void(int)> f, optional<int> i)
{
  i = 0;
  f(*i); // ok
}

Optional’s assignment from T (int in our case) has a postcondition. The function would probably be defined as:

template <typename T>
optional& optional<T>::operator=(T&&)
postcondition{ bool(*this) };

If preconditions are supported, postconditions will likely be supported as well. In function apply2, the function that most recently modified i (the assignment) guarantees that it leaves the object in the state that meets predicate bool(i). This is the predicate that is exactly required by operator*. Analyser should be able to detect that. For a more practical example consider that algorithm std::sort propagates a property (its postcondition) std::is_sorted to functions like std::binary_search.

Let’s try to modify our function, so that it is closer to the original:

void apply2(function<void(int)> f, optional<int> i)
{
  if (!i) { i = 0; }
  f(*i); // ok
}

If the expression inside the if-statement were bool(i) the analyser could make some conclusions. But how is it supposed to know that expression !i is a compliment of bool(i)? We would need to have a yet another language feature to be able to express ‘relationships’ between predicates, or expressions in general. A language like this has already been considered for C++ in form of concept axioms. With axiom syntax (as described in “Design of Concept Libraries for C++”) we could express the relation as:

template <typename T>
axiom(optional<T> o)
{
  !o <=> !bool(o);
}

This reads that for each object o of type optional<T> expression !o is equivalent to expression !bool(o): one can be substituted for the other without affecting the program’s semantics.

Once the static analyser is taught this equivalence, it can transform the body of apply2 to:

  if (!bool(i)) { i = 0; }
  f(*i); // ok

and then to:

  if (bool(i)) {} else { i = 0; }
  f(*i); // ok

From the last one (if this is not already obvious from the previous) it can gather that either the precondition is already satisfied, or we call the assignment that guarantees our precondition as its postcondition.

Third option to silence the “potentially broken precondition” warning is to declare your own precondition:

void apply3(function<void(int)> f, optional<int> i)
precondition{ bool(i) }
{
  f(*i);
}

Now, the verification of the precondition is moved to the higher level. This would usually be the preferred way of handling the preconditions.

Last, although it is not very elegant, it will often be useful to locally disable the precondition check:

void apply4(function<void(int)> f, optional<int> i)
{
  [[satisfied(bool(i))]] f(*i);
}

Here satisfied is an attribute that tells the compiler/analyser that it should assume the precondition holds, even though it cannot verify it. It says “trust the programmer.” While not very safe, it is useful when migrating the legacy code, when we cannot afford to set all precondition checks right in one go.

This also shows a model usage of attributes. Attributes in C++ have a limited usage compared to other languages. Their addition or removal must not affect the semantics of the program. Their addition or removal must not render a valid program invalid. They can be used to give hints to the compiler. Turning warnings on and off is a good example of such hint.

Note that f does not get this additional compiler safety check, because it does not specify a precondition. The decision for std::function’s operator() was to avoid any UB, and be well defined when it is default-constructed.

Thus described functionality makes preconditions similar to concept axioms. Also, similarly to axioms, compilers and other tools can make use of the assumptions expressed in preconditions and optimize the code based on them. In places where static analysis cannot be performed, concepts would still render run-time checks. Such checks would still be superior to manually planted asserts inside function body for a couple of reasons:

  1. Preconditions would be evaluated in the caller rather than inside the function. This way core dumps or other diagnostic tools would correctly report that the error is in the caller.
  2. All preconditions can be globally enabled or disabled even in third party precompiled code, without the necessity to recompile each function with narrow contract (with preconditions). N1800 describes how it works.
  3. Compiler can easily detect and warn us if predicates in preconditions have side effects.
  4. constexpr functions evaluated at compile-time could make use of preconditions and report failures at compile-time also.

Such a support for preconditions would be a very helpful feature. But let’s not fantasize too much. For now the best thing we can do is to use assertions and comments — a very useful and often underestimated language feature.

This entry was posted in programming and tagged . Bookmark the permalink.

8 Responses to Preconditions — Part IV

  1. What about giving names to preconditions, like replacing

    template <typename T>
    T& optional<T>::operator*()
    precondition{ bool(*this) };
    

    with

    template <typename T>
    T& optional<T>::operator*()
    precondition{ optional<T>::defined };
    

    This should behaves like creating smaller sub type for existing type.
    We can always replace any preconditions by introducing new types. But this way is not reasonable because we quickly can get tons of types.
    Instead we can use type limitations using custom modifiers (like const and volatile does).

    For optional type we can introduce *defined* modifier and mark operator*() with it.
    Then we can mark assignment operator to cast *defined* attribute on local variable.

    Adding and verifying correct preconditions make code complicated. And it will be very easy to misuse this feature by asking much more then needed.
    Using limited type subsets solve similar task using simpler and natural approach.

    • So, if I understood you correctly, defined is a new category of things. it is not a variable or type or function: it is a “tag” or a “property” of an object, right? some functions (like the assignment) can make that property “effective”; some other operations (like resetting the optional) remove the property. Other operations, like operator* expect the property defined. Did I get it right? I like that, and I believe it is to great extent what I tried to describe. But I believe it might not be sufficient to express lots of typical preconditions. In the example of optional the property applies to *this; but consider this function, written with predicates (rather than properties):

      int funct(optional<int> i, optional<int> j)
      precondition{ bool(j) }; // i can be uninitialized
      

      How do you express that with ‘properties’?

      • I agree that adding tags (like defined for optional) can not cover all cases without extending this approach. We can use it everywhere only if it is turing-complete (but it will make syntax complicated).
        Lets take your sample:

        int funct(optional<int> i, optional<int> j)
        precondition{ bool(j) }; // i can be uninitialized

        Lets assume we already have some definition for “defined” associated with “optional” type.

        We can use syntax similar to “const” modifier (not perfect way but should be natural for C++)

        // assume defined == precondition{ bool(j) }
        int funct(optional<int> i, defined optional<int> j)
        // i can be uninitialized, so we have no modifiers for it

        With given definition compiler can assume some safety (safety == no undefined behavior, exceptions are OK) for i and j usage.
        For example we can use all methods of j defined with “defined” modifier while we need additional checks for i to cast missing modifier. Same apply while forwarding arguments to other functions: “defined” modifier make be silently cast out while adding it must be verbose. Note, that for “const” modifier we have opposite behavior. “const” can be silently added but can not be easily removed.
        Other similar language feature int, short int, long int, long long int.
        Here we have three modifiers: “short”, “long” and “long long” associated with type int.
        “short” reduce type precision so it can be silently cast it out (short int -> int is safe) but we need explicit check to add it. With “long” we can do opposite: adding it to the type are safe (int -> long int is safe).

        Internally compiler can use different types (probably we can use several modifiers with one variable, like “const volatile long long int”)

        Adding “defined” modifier usually will be added with an “if” statement like this:

        int funct(optional<int> i, defined optional<int> j)
        {
          if (i)
          {
              // now compiler can assume that i has "defined" mosidier, because operator bool () was marked as casting "defined" while returning true
              // but I would like to have some syntax to do it explicitly 
              return *i; // must be safe
          }
          else
          {
             return *j; // safe, check was done by caller
          }
        }
        
        • I guess your next question can be:

          int funct(optional<int> i, optional<int> j)
          precondition{ bool(i) || bool(j) }; // one must be initialized
          

          We could not simulate it with new type (at least without using single object with two optional members).
          I don’t know good solution for this case. Now I can think only about introducing new class that will hold all dependent members and represent own constraints on entire set.

        • Oleksandr, I believe that we are almost talking about the same thing.

          The syntax you propose will not handle situations where a precondition applies to more than one argument at a time. Suppose that for our function funct() above you need to say that one of them needs to be “defined”: not necessarily. Let’s modify the proposed syntax, so that it can also handle multi-parameter preconditions. It is now similar to the one investigated in ‘A Concept Design for the STL’ (see Appendix B).

          template <typename T>
          property either_one_defined(optional<T> const& i, optional<T> const& j);
          
          int funct(optional<int> i, optional<int> j)
          precondition{ either_one_defined(i, j) };
          

          Here property is a new keyword used to define properties that can be “parametrized” with arguments. Such properties are defined, transferred and used as you described.

          Do you like it? If so, it is only slight a difference when we allow defining the body of such property:

          template <typename T>
          property either_one_defined(optional<T> const& i, optional<T> const& j)
          {
            return bool(i) || bool(j);
          }
          

          But now, one can ask: why introduce a new entity, like “properties”, when one can just use normal functions like this:

          template <typename T>
          bool either_one_defined(optional<T> const& i, optional<T> const& j)
          {
            return bool(i) || bool(j);
          }
          
          int funct(optional<int> i, optional<int> j)
          precondition{ either_one_defined(i, j) };
          

          Note that here (and also in my post) the function is used only as a tag or as a “property” that the compiler or a static analyser matches (the way you described), and the function is not evaluated. Evaluating it can be an additional bonus in some situations, but when specifying a precondition, the function is used primarily as a tag.

        • Funny, you really guessed my thoughts.

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.