The double life of objects

Some common knowledge: the lifetime of an object starts when its initialization is complete. From this we can derive a further expectation: an object becomes const only after its initialization is complete. But this property of object lifetime becomes blurred when copy elision comes into play. When a copy is elided, two objects that would otherwise each be initialized separately are blended into one, whose lifetime spans the caller and the callee. This has a number of surprising effects, one of them being that the object receives two initializations.

In the following examples we will use class Rng, which is small but convincing: you define a class when you need to maintain an invariant. Our class represents a range of integers between a minimum and a maximum value.

#include <cassert>

class Rng
{
  int _min;
  int _max;
  // invariant: _min <= _max

public:
  Rng(int lo, int hi) 
    // precondition: lo <= hi
  : _min(lo), _max(hi)
  { assert(_min <= _max); }

  int min() const { return _min; }
  int max() const { return _max; }

  void set(int lo, int hi)
    // precondition: lo <= hi
  {
    _min = lo;
    _max = hi;
    assert(_min <= _max); 
  }
};

Now, let’s try to use it in a somewhat artificial program:

const Rng foo()
{
  const Rng x {1, 2};
  return x;
}

Rng bar()
{
  Rng y = foo();
  y.set(3, 4);
  return y;
}

int main()
{
  const Rng z = bar();
  return z.min();
}

C++ allows compilers to perform an optimization known as NRVO (named return value optimization). Even though it is not mandated, it is so popular that GCC enables it by default. We assume that it is present in our examples. The effect is that in the above snippet we have only one object of type Rng. Even though there are three names referring to it (x, y and z), and even though some of them are const and others are not, this is a single object! No copy or move constructor of Rng is invoked. You can verify it by displaying its address in the three functions: see example.

When you look at the body of main(), you might get the impression that this object, called z in this scope, was initialized right after function bar() finished. But this is not true. The object was initialized earlier, in function foo(), when the constructor of Rng finished constructing the object that is called x there.

But, curious as it is, does this matter in practice? If you think in terms of values rather than objects and their addresses, you should never tell the difference, right? Right. But you might wonder if this object is const or not, and where its const-ness starts, because you might have heard that modifying a const object is undefined behavior. In our example x is declared const, so you cannot change its value easily. You could do it with tricks like const_cast, and then it would be undefined behavior. But in function bar() this same object is declared non-const, and we explicitly change its value. This is ok for a non-const object, but it was const just a second ago, inside foo(). So, is this all fine, or did we get the undefined behavior?

Reading the C++ Standard doesn’t help much. The definition of copy elision ([class.copy.elision]) doesn’t specify in sufficient detail what happens when the lifetimes of const and non-const objects are fused during copy elision. But we can get some intuition from studying why modifying a const object is undefined behavior in the first place.

While const objects are technically allowed to be stored in read-only memory, I am not aware of any implementation actually doing it for automatic objects, especially when they are initialized with values known only at run-time. So, I do not think this is what motivates the need for undefined behavior. Instead, it is to enable certain optimizations: a compiler is allowed to assume that the value of a const object (that is not a volatile object at the same time) doesn’t change, and if it reads the value at any point, it doesn’t have to read it again later, whatever happens. This can sometimes cause somewhat counterintuitive results:

#include <iostream>

int main()
{
  const int i = 9;
  int& j = const_cast<int&>(i);
  j = 4;                       // undefined behavior: i is a const object
  std::cout << i << std::endl; // may print 9
  std::cout << j << std::endl; // may print 4
}

Here, in line 7 (the assignment j = 4) we modify the value stored in object i through reference j. But when we display this value in line 8 it still appears to hold the original value. It might look like the compiler did something wrong, but in fact it is allowed to do that: in line 7, by assigning to a const object, we triggered undefined behavior, so we cannot expect anything. You may not like it, but people who do not attempt to dodge the safety features of the type system benefit from this optimization. For a live example see this Compiler Explorer link.

But these optimizations are “local”: the compiler only needs to see the scope of a single function. They neither affect nor are affected by how a different function sees the same object. So, we can say that in our earlier example x, y and z are three “views” of the same object, and the property of being const applies to a view rather than to an object. The same object can be const when viewed from one function and non-const when viewed from another function.

And that’s it for today. I am grateful to Joshua Berne for explaining this aspect of object const-ness to me.


2 Responses to The double life of objects

  1. phamlh says:

    Interesting article! I never thought about how `const` and RVO play together until reading yours.

    I think I can understand it this way:
    1. The `const` is a feature mainly for syntax (And for some optimizations), it helps to validate the code.
    2. The RVO is an optimisation, applied much later, after the compilation/validation phase.

    This way, we can put `const` and RVO into different phase/layer of the compilation. And they are kind of independent.

    Again, thanks for the exciting post.

  2. malcolmparsons2015 says:

    Guaranteed copy elision requires RVO to be in the compilation/validation phase so that the type need not have an accessible copy/move constructor at all.

    https://en.cppreference.com/w/cpp/language/copy_elision
