## Contracts, Preconditions & Invariants

In this post we will see what a contract is, how preconditions and invariants can be derived from the contract, and how this process can help detect bugs. Two points that I will be stressing in this post are: (1) preconditions and invariants are not “contracts” and (2) only a subset of contract-related bugs can be detected through preconditions and invariants.

Let’s start by comparing two function declarations that are equivalent form the type system perspective:

```bool f(int a, int b, int c);

bool is_in_range(int val, int lo, int hi)
// this checks if val is in closed range [lo, hi]
;
```

There is a lot more information associated with the second declaration. The function name as well as function parameter names give us a hint as to what behavior we can expect of the function. In case this wasn’t enough, we have a comment explaining this in detail. For instance, we know that although `lo` and `hi` are just two integers, they will be interpreted as a closed range. We have sufficient information to use function `is_in_range()` correctly. The same cannot be said about function `f()` though. We do not know what it does. We do not know when to call it, and if it would be a good idea. We do not know what values to pass to it, and how they would be interpreted.

The difference between the two functions is in the contract. In this context, a contract is all the information that the user (function user, here) has to know in order to use the component (function) correctly. This information can be communicated by any means: good choice of names, comments, HTML documentation or a paper one, implied, communicated over the phone or face to face, also in the type system. Even for function `f()` above part of the contract has been communicated: that it will take three arguments of type `int`. That is something, although not enough: its contract is not complete.

With the contract in place, the user has enough information to use the component correctly, but it is still possible that she will use it incorrectly. For instance, the previous library used for dealing with ranges of integers used to have a similar interface, but took the arguments in different order:

```bool OldLib::isInRange(int lo, int hi, int value)
// returns true if value is in closed range [lo, hi]
// otherwise returns false
;
```

The user may have the old interface imprinted in her mind so deeply that even after learning the contract of the new library, she may unconsciously pass the arguments to the new function in the wrong order:

```bool validate(int val)
{
int lower = config.get("LOWER_BOUND");
int upper = config.get("UPPER_BOUND");
return is_in_range(lower, upper, val); // Bad order of args!
}
```

This type of bugs is generally unavoidable. One way to reduce it is to do thorough code reviews. But humans are faulty, and in the same way as they make mistakes writing the code, they make mistakes reviewing the code. The best code review would be performed by a machine, but machines do not understand function contracts. The idea behind preconditions is to make as much information from the function contract as possible available to the machines. We cannot teach the machine the entire contract (at least not as of today), but we can insert “bug criteria” inferred from the contract. In our case, because `lo` and `hi` represent a closed range `[lo, hi]` it is never possible that a user willingly in a correct program passes value `lo` which is greater than `hi`. We could write it more explicitly:

```bool is_in_range(int val, int lo, int hi)
// this checks if val is in closed range [lo, hi]
[[contract::precondition(lo <= hi)]]
;
```

Interestingly, it is a valid C++17 program. But it may not be the best idea to write it this way: (1) compilers give a warning on this attribute declaration, (2) and they have reason: this declaration my give a false impression as to what it does: it does nothing: it is just a comment but with square brackets rather than slashes. However, if we were able to teach a tool to recognize these special comments, it could use them to rewrite our programs. For instance, an intelligent tool could rewrite the above function `validate()` to:

```bool validate(int val)
{
int lower = config.get("LOWER_BOUND");
int upper = config.get("UPPER_BOUND");

assert(upper <= val); // <- Generated from the precondition
//    of function is_in_range()
return is_in_range(lower, upper, val); // Note: bad order
}
```

So, by declaring a precondition in our function, we allow the tool to inject the corresponding code anywhere our function is invoked. In case the injected code is an `assert()`, it can perform a check at runtime and abort the program should the check fail. Note that expression `upper <= val` is not my mistake. The automated tool that generated it simply did not give in to the confusion caused by the arguments being passed in the wrong order to `is_in_range()`. Value `lower` may be (correctly) smaller than value `upper` but if it so happens that `val` is less than `lower` the bug (arguments passed in the wrong order) will be detected.

But what if `val` is greater than `upper`? We still have a bug, and the `assert()` generated from the precondition will not detect it. Yes; and this illustrates the important characteristics of preconditions (and postconditions and invariants and similar): they are not for detecting every bug in a program that uses annotated functions. But it makes it easy to automatically detect some bugs.

By now, you must have wondered, would it not be better to define a type `Range` rather than using preconditions? The answer is: yes. It is not generally possible in all cases to turn a precondition into a new type, but in our case it is a perfect solution: it is very likely that we will reuse the new type `Range` in many places. For instance, not only function `is_in_range()` can use it, but also the type representing `config`:

```struct Range
{
int hi;
int lo;
};

bool is_in_range(int val, Range r)
// this checks if val is in closed range r
;

bool validate(int val)
{
Range rng = config.get_range("LOWER_BOUND", "UPPER_BOUND");
return is_in_range(rng, val); // ERROR: type mismatch
}
```

Now, our bug (bad order of arguments) is detected by the type system! So it is clearly an improvement: bugs are detected earlier. But, we didn’t take care of all possible things that we could easily detect. For instance, the implementation of `is_in_range` is likely:

```bool is_in_range(int val, Range r)
// this checks if val is in closed range r
{
return r.lo <= val && val <= r.hi;
}
```

But in order for this implementation to make sense, we have to trust that `[r.lo, r.hi]` is a closed range. So we still have a precondition to declare: it is just spelled differently:

```bool is_in_range(int val, Range r)
// this checks if val is in closed range r
[[contract::precondition(r.lo <= r.hi)]]
;
```

Does this sound wrong? It does. Making sure that `r.lo <= r.hi` should not be our business: it should be the business of class `Range`. In fact it should have a member function that checks if a given value is in range. So, let’s try to define our type `Range` properly. First attempt:

```struct Range
{
int lo;
int hi;

bool is_inside(int val) const
{
return lo <= val && val <= hi;
}

int size() const
{
return hi - lo;
}
};
```

This looks better, but still has the same problem: in order for `is_inside` to make sense, it is required that `lo < hi`. But how can we be sure of that? The same requirement applies to function `size()`, and likely to any other function in the interface of `Range`. They have the same precondition: `lo <= hi`. How can we guarantee that it is satisfied?

First, we cannot afford to have `lo` and `hi` public: if anyone can change them arbitrarily at any time, there is no way that we can guarantee anything. So, the second attempt:

```class Range
{
int lo;
int hi;

public:
explicit Range(int l, int h)
[[contract::precondition(l <= h)]]
: lo(l), hi(h) {}

bool is_inside(int val) const
[[contract::precondition(lo <= hi)]]
{
return lo <= val && val <= hi;
}

int size() const
[[contract::precondition(lo <= hi)]]
{
return hi - lo;
}
};
```

Now, we can see that except for the constructor we have the same precondition for every public function: `lo <= hi`. This is hardly surprising. In fact, the whole point of having this class is to signal or guarantee that this condition holds. Otherwise we would have just used `std::pair<int, int>`. This condition that motivates the creation of a new class is called the class invariant. We can declare it once rather than repeat it upon every member function:

```class Range
{
int lo;
int hi;
[[contract::invariant(lo <= hi)]];

public:
explicit Range(int l, int h)
[[contract::precondition(l <= h)]]
: lo(l), hi(h) {}

bool is_inside(int val) const
{
return lo <= val && val <= hi;
}

int size() const
{
return hi - lo;
}
};
```

A class invariant is so tightly associated with the class that you can use it as a criterion for determining whether you need to define a new class or not. If you cannot think of a class invariant, then you probably do not need a new class either: a simple aggregate will do. (Although sometimes it is not possible or easy to express the invariant as a C++ expression: for instance an invariant can be that some resource is within our ownership throughout the lifetime of the object.)

You can notice one important thing, though: we didn’t get rid of the precondition in the constructor. We cannot fully guarantee that `lo` is never greater than `hi`. We depend on the callers of the constructor: that they pass an `l` that is no grater than `h`. A code that is using our `Range` can still have a bug:

```bool validate(int val)
{
int lower = config.get("LOWER_BOUND");
int upper = config.get("UPPER_BOUND");

Range r {upper, lower}; // Note: bad order
return r.is_inside(val);
}
```

Thus, class invariant (much like a postcondition) is something conditional: the class can guarantee that it holds, only if its preconditions have been met.

Alternatively, one could think of defining the constructor of `Range` differently:

```Range::Range(int l, int h)
// no precondition
: lo(l), hi(h)
{
if (l > h)
throw RangeError{};
}
```

This does guarantee that we literally never have `Range`s with the violated invariant. But other than this, this solution has a number of downsides. First, exceptions are just a control flow: an invisible branch. They do not help detect bugs. As long as we have a precondition in place, a clever tool can make use of it. One example we have seen earlier is to inject checking code in the callers:

```bool validate(int val)
{
int lower = config.get("LOWER_BOUND");
int upper = config.get("UPPER_BOUND");

assert(upper <= lower); // <- generated from the precondition
//    of Range's constructor
Range r {upper, lower}; // Note: bad order
return r.is_inside(val);
}
```

Macro `assert()` is a better tool for detecting bugs. But it is just one example of making use of preconditions. Other tools can use them for performing more informed static analysis. This opportunity goes away if we abandon preconditions and start throwing exceptions.

Second, there are more mundane issues with throwing exceptions. Not everybody wants or can afford to use them. You also cannot signal errors via exceptions from `noexcept` functions.

2. My guess is that it is the syntactic overhead that makes it not that attractive: https://www.boost.org/doc/libs/1_78_0/libs/contract/doc/html/boost_contract/tutorial.html#boost_contract.tutorial.non_member_functions. It is a lot of typing, not easy to understand if you see it the first time. And it still has the downside of being inside the implementation as opposed to function declarations, so they cannot be used to improve the static analysis of programs. Personally, I use plain old `assert()` in my programs, and I find the net effect better.