Semantic requirements in concepts

The word ‘concept’ in the context of C++ generic programming has two meanings. The first is more abstract: it is the notion from the domain of Generic Programming (GP) in general. GP is not tied to any specific language: it is an approach to writing programs, and concepts are part of this approach. In this sense concepts have been with us since the inception of the STL. The second meaning is the keyword concept in C++20 with its associated semantics: its goal is to approximate the more abstract notion of a concept from GP, and this works only to some extent. One notable difference is that concepts in GP specify semantic requirements on types they constrain, and C++ concepts cannot express them directly.

In this post we will see how semantic requirements in concepts can break your program if you don’t pay attention to them, and what can be done in C++20 concepts to account for semantic requirements.

We start with a motivating example, using concepts and algorithms defined in one of the previous posts. We have concept Addable and a generic function sum constrained with it. A number of readers remarked that rather than extending the concept, as I did in that post, it would be better to provide two overloads for function template sum: one that works for every Addable type, and the other, more efficient, if we know the type is additionally totally_ordered. Thus, we have the following set of declarations:

template <typename T>
concept Addable = 
  std::regular<T> &&
  requires(T x, T y) {
    { x += y } -> std::same_as<T&>;
    { x +  y } -> std::convertible_to<T>;
  }; 

template <Addable T>
  T sum(T a, T b);

template <Addable T>
  requires std::totally_ordered<T>
  T sum(T a, T b);

The implementations of the two algorithms:

// generic version 

template <Addable T>
T sum(T a, T b)
{
  return a + b;
}

// specialized version

template <typename T>
T sum_impl(T a, T b)
{
  assert (a >= b);
  a += b;          
  return a;
}

template <Addable T>
  requires std::totally_ordered<T>
T sum(T a, T b)
{
  if (b > a) 
    return sum_impl(b, a);
  else
    return sum_impl(a, b);
} 

Now, we have a new user of our library. She’s got an idea. She will use it to concatenate strings. It is allowed? Let’s check:

static_assert(Addable<std::string>);

Yes, it works. So, she starts using the library for strings:

std::string air = "air";
std::string bus = "bus";
std::cout << sum(air, bus);

This outputs:

busair

What happened? The specialized overload reordered the arguments, so that the “bigger” one is on the left-hand side. Was the library author allowed to reorder the arguments?

Note that there was no need to answer this question earlier. We tested int, boost::rational<int>, boost::multiprecision::cpp_int and std::complex<double>, and there was no such problem, because for all these types the addition is commutative. Or maybe we should say that for these types operator+ represents the addition operation, which is by definition commutative.

Arguably, it is an abuse to have operator+ in std::string represent string concatenation. Concatenation has some similarities with addition (like being associative), but does not have all its properties. We could blame std::string for this decision, but that is beside the point. The point is, the author of the library, of function fun, made an assumption: for types that model Addable, operator+ is a commutative operation: it should be possible to provide the arguments in reverse order and the result should not change. The problem is, this assumption has not been communicated to the user.

Associativity of operator+ is an example of a semantic requirement of concept. It is fine, and actually necessary, to make such semantic requirements. In fact, we have more of them in our concept. For instance, we assume that

(a + b) == (T{a} += b)

We also delegate to std::regular, which has its own set of semantic requirements; to name only one:

T{a} == a // copy is equal to the original

One overload mentions std::totally_ordered, which makes a number of assumptions about the semantic properties of relational operators.

Thus, a concept is a set of syntactic and also semantic requirements, even if there is no way to express those semantics requirements. What we do, is to declare these requirements that we can, and hope that the users will somehow figure out what the remaining (semantic) constraints are. One way to communicate the semantic requirements is to describe them in human language in the documentation. Another, somewhat similar, way is to put them as comments inside concept declarations:

template <typename T>
concept Addable = 
  std::regular<T> &&
  requires(T x, T y) {
    { x += y } -> std::same_as<T&>;
    { x +  y } -> std::convertible_to<T>;

    // semantics { 
    //  (x + y) == (y + x);
    //  (x + y) == (x += y); 
    // }
  }; 

Or apply some clever trick to make the compiler syntax-check these expressions:

template <typename T>
concept Addable = 
  std::regular<T> &&
  requires(T x, T y) {
    { x += y } -> std::same_as<T&>;
    { x +  y } -> std::convertible_to<T>;

    requires requires {
      (x + y) == (y + x);
      (x + y) == (x += y);
    };
  }; 

The nested requirements introduced by requires requires will obviously always succeed (they are a composition of things that we already require); but they gives us the guarantee that the program will fail to compile if we get the expression wrong. (I know that the second equation is a bit disturbing, and should probably be rewritten to:

(x + y) == [](T x, T y){ return x += y; }(x, y)

but you get the point.)

Interestingly, the previous design for concepts (for C++11) had at some point the feature for expressing such semantic constraints, as described in this post. But people did not appreciate it, and it could not express all possible semantic requirements, anyway. A requirement that operation .empty() on a container should have time complexity O(1) is also a semantic requirement, and it cannot be expressed with an expression in C++.

Let’s summarize what we have so far. A concept is a set of requirements. Users are required to satisfy all concept requirements in order to correctly use algorithms constrained by the concepts. A compiler can only check if a (syntactic) subset of concept constraints is satisfied by the user-provided type. The responsibility for making sure that all semantic concepts constraints are satisfied by the type when it is passed to a template constrained by the concept ultimately falls on the user.

The situation is even more complex when we introduce concept-based overloads, as in our case. We have two overloads, both templates, which differ only by constraints:

template <Addable T>
  T sum(T a, T b);

template <Addable T>
  requires std::totally_ordered<T>
  T sum(T a, T b);

Now, imagine that I have a yet another type, call it Injector that I know models concept Addable: it satisfies both its syntactic and semantic requirements. However, the type also defines operators <, >, <= and >=, but it uses these operators for different purpose than checking which number is greater than the other. For instance, both expressions a >= b and a > b “inject” the value of a into b and return true if the value after injection is non-zero. This is strange, but I am allowed to do it.

The first overload of algorithm sum would work for me. But because there is the other one, which I do not like or want, and Injector satisfies the syntactic (but not semantic) requirements of std::totally_ordered the second overload will be chosen. But because Injector doesn’t model std::totally_ordered (it doesn’t satisfy the concept’s semantic requirements), I will get undefined behavior, which in this case will manifest by an assertion failure.

One can imagine an even worse scenario. The algorithms library originally ships with only one overload:

template <Addable T>
  T sum(T a, T b);

Now, I am 100% sure that type Injector can be safely used with the library, so I use it. Then, at some point, in the newer release, the library author decides to add a convenience overload, thinking, “I know how to make the algorithm faster when user type additionally models std::totally_ordered ”. And now, after the upgrade my program starts to crash, or worse: starts giving random results. This is because the mechanism for concept-based overloading didn’t check if my type models the concept: it only checked if my type satisfies the syntactic part of the requirements. In this scenario, I — the library user — didn’t get any chance to check if my type satisfies the new concept, because the new concept sneaked in silently with the new overload. The same fatal effect would be observed if the library author changed the implementation of the algorithm from:

template <Addable T>
T sum(T a, T b)
{
  return a + b;
}

to

template <typename T>
T sum_impl(T a, T b)
{
  assert (a >= b);
  a += b;          
  return a;
}

template <Addable T>
T sum(T a, T b)
{
  if constexpr (std::totally_ordered<T>)
  {
    if (b > a) 
      return sum_impl(b, a);
    else
      return sum_impl(a, b);
  }
  else
  {
    return a + b;
  }
} 

For the same reason, the example that I have shown in this post in function clever_swap is dangerous and not recommended.

The conclusion from this illustration is: concept-based overloads (or specializations) are part of the generic library’s contract: users must be informed initially about all overloads and specializations based on concepts so that they can get prepared (for semantic requirements), and from now on concept-based overloads/specializations cannot be added, lest the new unsatisfied semantic constraints should destroy users’ programs.

At this point, you might be asking the question. Semantic requirements make the whole thing dangerous and fragile. But what about the good old run-time polymorphism in the object-oriented style? Do base classes with virtual functions, apart from syntactic requirements, not also require semantic requirements? Consider the following base class:

class IAddable
{
  virtual IAddable* add(IAddable& rhs) = 0;
  // ...
};

Are we also not making a semantic requirement that a.add(b) and b.add(a) render the same result? Why is nobody shouting about the dangers of not satisfying the semantic requirements of virtual functions?

The problem is not so serious in the OO-stlyle polymorphism because in this model any type that implements the interface is forced to explicitly declare this fact:

class BigInt : public IAddable 
{
  BigInt* add(IAddable& rhs) override;
  // ... 
};

Here, by declaring the inheritance, we also declare a conscious decision that we want our class to implement the interface. A std::string cannot accidentally implement IAddable. And this changes everything. In case of generic programming in C++ — templates and concepts — the satisfaction of (the checkable part of) the interface can be accidental.

There are ways to protect against this accidental concept satisfaction, though. We can do this by artificially encoding the semantic requirements as syntactic requirements, much like how inheritance in OO style works. Let’s rewrite our concept Addable:

template <typename T>
  constexpr bool enable_addable = false;

template <std::integral T>
  constexpr bool enable_addable<T> = true;

template <std::floating_point T>
  constexpr bool enable_addable<T> = true;

template <typename T>
concept Addable = 
  std::regular<T> &&
  requires(T x, T y) {
    { x += y } -> std::same_as<T&>;
    { x +  y } -> std::convertible_to<T>;

    requires enable_addable<T>;
  }; 

Here enable_addable is a variable template, which evaluates to false unless it is specialized otherwise. We also provide two specializations for built-in integral and floating-point types. (std::integral and std::floating_point are concepts.) Our concept will be modeled by int and double but will no longer be modeled by boost::rational<int>. In order to make the concept work with the user’s type, the user needs to provide a specialization for their type:

template <typename T>
  constexpr bool enable_addable<boost::rational<T>> = true;

This is now safe: no type can accidentally be identified as modelling the concept. The “models” relation has now to be explicitly declared. But this is also uncomfortable: new types don’t “just work” with the library we have to add and add specializations.

There is another option. Given that practically any type that satisfies the syntactic requirements of concept Addable also models the concept (satisfies also the remaining semantic requirements) with the exception of std::string which abused the operator overloading, we can invert the mechanism for enabling the concept satisfaction: a concept is considered modeled by the type, unless the type is explicitly disabled:

template <typename T>
  constexpr bool disable_addable = false;

template <typename C, typename T, typename A>
  constexpr bool disable_addable<
    std::basic_string<C, T, A>
  > = true;

template <typename T>
concept Addable = 
  std::regular<T> &&
  requires(T x, T y) {
    { x += y } -> std::same_as<T&>;
    { x +  y } -> std::convertible_to<T>;

    requires !disable_addable<T>;
  }; 

We disabled our concept for type std::basic_string and hope that there are no more types that overload operator += for non-commutative operations. This is less save, but requires less typing from the users.

This technique is used in C++20 ranges. We have concept std::ranges::view and std::ranges::range. Syntactically, a view is a range that is additionally std::semiregular. Semantically, a view is cheap to copy, assign and destroy (required complexity is O(1)). Containers are ranges that are semiregular but do not satisfy the semantic requirements of std::ranges::view, therefore the Ranges library uses std::ranges::enable_view.

Similarly, we have concept std::ranges::sized_range which requires a range and operation ranges::size(t) with some semantic properties, like amortized constant-time complexity. In order to avoid accidental constraint satisfaction, the Ranges library also provides std::ranges::disable_sized_range.

This entry was posted in programming and tagged , , , . Bookmark the permalink.

7 Responses to Semantic requirements in concepts

  1. Johel Ernesto Guerrero Peña says:

    Towards the end, ‘s/enable_range/enable_view’.

  2. Helmut says:

    Why did you call the semantic variable template “enable_addable” and not “addition_is_commutative” (or similar)? In the latter case it is clear what is semantically needed to enable the algorithm and “addition_is_commutative” potentially also makes sense in other algorithms.

    • Concept Addable has in fact many semantic requirements. In the post I have mentioned one more: that operators + and += are consistently defined. But there are more. For instance, for the second overload, I require that relational operations and addition work in tandem:

      a > 0 && b > 0 => a + b > a  
      

      If I added one variable per every single semantic requirement, users would have to take care of plenty of them. Instead I chose to mimick what OO-polymorphism does: there is one declaration of conformance which says, “I accept all semantic requirements”.

      • Helmut says:

        How does the user now what “all” semantic requirements are?

        • The user needs to rely on the library author, who defined the concept, that he communicated all the semantic requirements.

          This post is telling the story of the library author failing to do their part of the job. If the library author were aware of semantic requirements of the concept and communicated them clearly, then the user would have a clean situation: learn al declared semantic requirements, determine if my type is satisfying them, and sign off by specializing the “enable” trait.

          Now, you could ask how the library author knows that all the semantic requirements have been identified and listed for their concept. The strict answer to this question is, you can never be sure. The practical answer is, designing a good interface (concept in this case) is an iterative process. You declare what you think is right, then you try to use it, see why it doesn’t work, and refine it. After a number of iterations you get enough confidence that you did it right.

  3. Helmut says:

    “The strict answer to this question is, you can never be sure.”

    You can be sure when the library author provides a strict mathematical proof that his algorithm works under the given requirements. For example, the Euclidean algorithm works for all Euclidean domains (and Euclidean domain is essentially a semantic concept).

    For the library author, the iterative process does not end after a certain “number of iterations”; you have to iterate until you find a mathematical proof.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.