Sessions and object lifetimes

In this post we will see how C++ object lifetime can be used to control the duration of sessions: the time spent owning and using a resource. The goal is to get a better understanding of what tools the language offers for using and sharing resources efficiently.

In this post I am using terms defined in another post, “«Resource» or «Session»?”:

Resource — something we may be short or out of and that we need to share with others (and do the sharing efficiently).

Session — the time period we spend with the resource as its exclusive owner.

Programs often need to use resources, and when performance is critical we need to keep track of how often and for how long we hold them; therefore the concept of a session is essential.

As discussed in another post, “C++’s best feature”, the language gives us a convenient tool for managing sessions: a predictable and well-defined object lifetime.

Guard objects

In the simplest case we can tie the session to an object’s lifetime directly: the session starts when the object’s lifetime starts and ends when the object’s lifetime ends.

We can illustrate it with an example. On Linux systems you can “start a session” with a socket by calling C function socket; you end the session by calling another C function: close. (For a nice tutorial on using sockets in Linux see here.)

If we want to tie these functions to the lifetime of an object, we define a class and put the two functions in the constructor and the destructor respectively:

#include <sys/socket.h> // Linux header
#include <unistd.h>     // Linux header
#include <stdexcept>

// translate_err is a helper assumed to be defined elsewhere;
// it turns an error code into a message.
class SocketGuard
{
  int socket_id;

public:
  explicit SocketGuard()
    : socket_id{ socket(AF_INET, SOCK_STREAM, 0) }
  {
    if (socket_id < 0)
      throw std::runtime_error{ translate_err(socket_id) };
  }

  ~SocketGuard() { close(socket_id); } // ignore error code

  int id() const { return socket_id; }
  // class invariant: id() >= 0

  SocketGuard(SocketGuard const&) = delete;
  SocketGuard(SocketGuard&&)      = delete;
};

A couple of things to note. The whole state of the socket is stored indirectly through one int. Although the system needs to hold a lot of data connected to an open socket, class SocketGuard has size and layout compatible with type int.

In constructor we check for the error condition, and if one occurs we throw an exception. If an exception is thrown from constructor, object’s lifetime does not start. If the object’s lifetime does not start, it cannot end, therefore destructor will not be called.
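The rule can be observed with a small experiment. This is only a sketch: FailingGuard, destructor_calls and destructor_calls_after are hypothetical names made up for the illustration, and no real socket is involved.

```cpp
#include <cassert>
#include <stdexcept>

int destructor_calls = 0; // counts how many FailingGuard destructors ran

// Hypothetical stand-in for a guard whose resource acquisition may fail.
struct FailingGuard
{
  explicit FailingGuard(bool fail)
  {
    if (fail)
      throw std::runtime_error{"acquisition failed"};
  }

  ~FailingGuard() { ++destructor_calls; }
};

// Attempts one construction and reports how many destructors ran.
int destructor_calls_after(bool fail)
{
  destructor_calls = 0;
  try {
    FailingGuard g{fail}; // lifetime starts only if the constructor returns
    assert(!fail);        // reached only when construction succeeded
  }
  catch (std::runtime_error const&) {
    // construction failed: there is no object, so nothing to destroy
  }
  return destructor_calls;
}
```

When the constructor throws, the function reports zero destructor calls; when it succeeds, it reports one.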

Conversely, if the constructor does not end in an exception, we guarantee that we store a file descriptor to a valid and open socket. There is no need to check in the destructor if socket_id is valid: it just is. There is no need to check it in id() either, or to think about what to do if it isn’t. This is reflected in the class invariant. It says that whenever we call id() it will always return a valid socket ID. Branch instructions may be expensive at run-time, and we are able to avoid a number of them.

Note that the call to function close might fail (and return an error code). We are ignoring this failure. We follow the reasoning outlined in this post. We use exceptions only to signal the failure that prevents us from doing our task. (The task in this case would be to communicate with some other program using the socket.) Failure to release the resource and share it with others is not a failure to do the task.

This particular example illustrates one other interesting thing. The fact that we do not pass any arguments in the constructor does not imply that we will be creating some partially-formed object. In our case the default constructor starts a session with a fully operational resource.

This is an ideal example of the design pattern known as RAII.

The guarantee that an object whose lifetime started (and not yet ended) represents a valid session with a resource is very important: it makes the code that works with such an object very simple and concise. Throwing an exception from constructor on resource acquisition failure is one way of fulfilling this guarantee. I know of at least one more: it is implemented in std::lock_guard.

Objects of type std::lock_guard represent sessions with a resource which in this case is a mutex. If a std::lock_guard object cannot acquire exclusive ownership of a mutex, its constructor blocks, suspending the current thread. Once the other session with the mutex has ended, the current thread is resumed, the constructor ends, the lock’s lifetime begins and the object represents a valid session.
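The way std::lock_guard ties the session to its own lifetime can be sketched without any threads. std::lock_guard accepts any type that provides lock() and unlock(), so the example below uses RecordingLock, a hypothetical type invented here that merely records whether it is held. It does not demonstrate the blocking behavior, only the session boundaries.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical lockable type: it only remembers whether it is held.
struct RecordingLock
{
  bool held = false;
  void lock()   { held = true; }
  void unlock() { held = false; }
};

// Returns true if the lock was held inside the scope and released after it.
bool session_matches_scope()
{
  RecordingLock m;
  bool held_inside = false;
  {
    std::lock_guard<RecordingLock> guard{m}; // session starts
    assert(m.held);
    held_inside = m.held;
  }                                          // session ends
  return held_inside && !m.held;
}
```

With a real std::mutex the constructor would additionally wait until no other thread holds the lock.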

The language gives us an opportunity to tie the session end to the end of object’s lifetime, but it does not force us to do it. Instead, we can implement a destructor like this:

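// Resource and release() are placeholder names
Resource::~Resource()
{
  if (condition)
    release(); // end the session only when the condition holds
}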

Releasing the resource only sometimes sounds like a bad idea. However, if the condition is selected carefully, this can make perfect sense and be very useful. In fact std::unique_ptr does exactly that. Its destructor is implemented more or less like this:

unique_ptr<T, D>::~unique_ptr()
{
  if (get() != nullptr)
    get_deleter()(get());
}

This looks quite uncontroversial: a unique_ptr can be null, as any raw pointer. If we consider a unique_ptr as an object representing a session with a resource (chunk of memory in this case), this means that the object may be alive (its lifetime has begun) and not represent a valid session. Why did std::unique_ptr depart from the seemingly golden rule “object’s lifetime = session”?
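The conditional release can be observed from the outside with a custom deleter. This is a minimal sketch; CountingDeleter and deleter_calls are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical deleter that counts how many times it is invoked.
struct CountingDeleter
{
  int* calls;
  void operator()(int* p) const { ++*calls; delete p; }
};

// Creates one unique_ptr (null or not), lets it die,
// and reports how many times the deleter was called.
int deleter_calls(bool start_null)
{
  int calls = 0;
  {
    int* raw = start_null ? nullptr : new int{42};
    std::unique_ptr<int, CountingDeleter> p{raw, CountingDeleter{&calls}};
    assert((p.get() == nullptr) == start_null);
  } // destructor runs here; the deleter is called only for a non-null pointer
  return calls;
}
```

A null unique_ptr is a live object, yet its destructor ends no session: the deleter is never called for it.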

Someone could say that this is because unique_ptr has to mimic raw pointers as closely as possible, and this would probably be correct; but there is another important reason: it is necessary to provide move semantics.

Movable objects

Move semantics is a powerful feature. It is really a game changer, and I can confidently say that it is the single most important addition to C++ after C++98. But — as defined in C++ — it comes with a certain cost.

Conceptually, from the point of view taken in this post, C++ move semantics is the situation where a session is tied to the union of the life-times of two (or more) objects.

Consider the following piece of a program:

#include <memory>

// Tool is a class assumed to be defined elsewhere

std::unique_ptr<Tool> make_tool()
// postcondition: return != nullptr
{
  std::unique_ptr<Tool> ans = std::make_unique<Tool>();
  return ans;
}

std::unique_ptr<Tool> calibrate(std::unique_ptr<Tool> t)
// precondition:  t != nullptr
// postcondition: return != nullptr
{
  return t;
}

int main()
{
  std::unique_ptr<Tool> tool = calibrate(make_tool());
}

It uses three names to refer to objects of type std::unique_ptr<Tool>: ans, t and tool. But when our compiler performs copy elision where allowed, physically we are dealing only with two objects.

One object is created during the call to make_tool and it is further observed and modified as function parameter during the call to calibrate. In other words, ans and t refer to the very same object.

Now, when function calibrate returns, another object has to be created. Copy elision is not allowed between function argument and its returned destination. The new object referred to as tool is created using the move constructor. Its life-time begins and continues to represent the session (with the chunk of memory). For a short while the lifetime of object referred to as t still continues, until its destructor starts, but it does not represent a session anymore. It is in a special state: a session-representing object without a session to represent. This is illustrated in the following diagram.


When the destructor of a unique_ptr is invoked, it has to be prepared for the situation where the object is in this special ‘zombie’ state, and has to know what to do in order not to corrupt the program state. So it keeps a ‘flag’ somewhere that indicates (in one way or another) that a given object is in a zombie state.

From the point of view of the session (if one can say so), there is always exactly one object representing it. The session is ‘protected’ from being leaked in the sense that there is always exactly one object in charge, ready to end the session in its destructor.
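The ‘exactly one object in charge’ property can be checked with a small experiment. In this sketch Tool is a hypothetical stand-in that counts its own destructions; it is not the Tool class from the example above.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical Tool that reports its destruction through a counter.
struct Tool
{
  int* destroyed;
  explicit Tool(int* d) : destroyed{d} {}
  ~Tool() { ++*destroyed; }
};

// Moves a session from one unique_ptr to another and
// reports how many Tool objects were destroyed in total.
int tools_destroyed_after_move()
{
  int destroyed = 0;
  {
    std::unique_ptr<Tool> t = std::make_unique<Tool>(&destroyed);
    std::unique_ptr<Tool> tool = std::move(t); // session jumps from t to tool
    assert(t == nullptr);    // t is alive but represents no session
    assert(tool != nullptr); // tool is now the one object in charge
  } // both destructors run; only tool's destructor releases the Tool
  return destroyed;
}
```

Two unique_ptr destructors run, but the Tool is destroyed exactly once.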

From the point of view of the programmer, we get an extra flexibility: a session can ‘jump’ from one object to another, crossing the scope boundaries that single objects are confined to. In consequence, we can, for instance, write factory functions that return objects representing sessions in progress.

The costs come in terms of performance and increased code complexity. First, as we could see in the definition of the destructor, it may do one of two different things selected based on a run-time condition. Executing a branch is expensive in itself.

But a move operation can also be called manually, and it may be other functions than destructor which will observe the zombie state; and for each member function we must now decide what we want to do if the object in question is in zombie state. Function definition grows and the choices are not obvious.

For instance, unique_ptr in some member functions (like in destructor) chooses to check for the zombie state (null pointer). For other functions (like operator->) it documents calling it when in zombie state as an undefined behavior, and puts the burden of making sure that it does not happen on the users. This is reflected in my example with unique_ptr<Tool> by the pre- and post-conditions in a number of places: the responsibility for guaranteeing no UB is pushed on others, higher up.

A class representing a session with a socket and supporting move semantics might look like this:

#include <sys/socket.h> // Linux header
#include <unistd.h>     // Linux header
#include <stdexcept>

// translate_err is a helper assumed to be defined elsewhere
class Socket
{
  int socket_id;

public:
  explicit Socket()
    : socket_id{ socket(AF_INET, SOCK_STREAM, 0) }
  {
    if (socket_id < 0)
      throw std::runtime_error{ translate_err(socket_id) };
  }

  Socket(Socket&& r) noexcept
    : socket_id{r.socket_id}
  {
    r.socket_id = -1; // source enters the zombie state
  }

  bool is_valid() const { return socket_id != -1; }
  // class invariant: !is_valid() || id() >= 0

  ~Socket()
  {
    if (is_valid())
      close(socket_id); // ignore error code
  }

  int id() const { return socket_id; }
  // precondition:  is_valid()
  // postcondition: return >= 0

  Socket(Socket const&) = delete;
};

A number of differences to observe:

1st: the move constructor. It does two things in one transaction: the new object’s state starts to represent the continued session, and the source object enters the zombie state. If possible (and it is possible in our case), we should declare the constructor as noexcept. The goal is not to make sure that the constructor does not throw (noexcept has almost nothing to do with that, as explained in this post), but to let others query at compile-time whether they can rely on the assumption that this constructor never fails. Interestingly, because we do not provide a copy constructor, the noexcept is not necessary for std::move_if_noexcept to work correctly: when a type is movable and non-copyable, std::move_if_noexcept always chooses to move. But other libraries might make use of this information and, for instance, select between doing small-buffer optimization or not.

2nd: now that we are indicating a ‘no-session’ state, we need to provide a way for checking for it. It will be used to guard calls to other member functions, and to express preconditions.

3rd: the invariant becomes weak. Expression !a || b is equivalent to the implication “a implies b”, if we had one in C++. Thus, the class invariant effectively says, “if the object is valid then the strong invariant should hold”; or (in the context of using sessions): “either I guarantee a session in progress or not”. This affects every other function: they now have to handle the “or not” case.

4th: destructor is slower: it has to make a branch.

5th: function id has a precondition. This means that the users need to keep track of whether the precondition is satisfied or not at the peril of causing an undefined behavior.
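The compile-time queries mentioned in the first point can be sketched with standard type traits. Session below is a hypothetical class that mirrors the move-related parts of Socket but holds a plain int instead of a real socket, so the snippet does not depend on Linux headers.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical movable, non-copyable session type (no real socket involved).
class Session
{
  int id_;

public:
  explicit Session(int id) : id_{id} {}
  Session(Session&& r) noexcept : id_{r.id_} { r.id_ = -1; }
  Session(Session const&) = delete;

  bool is_valid() const { return id_ != -1; }
};

// What library code can learn about the type at compile-time:
static_assert(std::is_nothrow_move_constructible<Session>::value,
              "the move constructor is declared noexcept");
static_assert(!std::is_copy_constructible<Session>::value,
              "copying is disabled");
// For a non-copyable type std::move_if_noexcept yields an rvalue reference.
static_assert(std::is_same<
                decltype(std::move_if_noexcept(std::declval<Session&>())),
                Session&&>::value,
              "move_if_noexcept chooses to move");

// Moves a session and reports whether the source became a zombie.
bool source_is_zombie_after_move()
{
  Session a{7};
  Session b{std::move(a)};
  assert(b.is_valid()); // b continues the session
  return !a.is_valid(); // a no longer represents one
}
```

The last function also shows why is_valid() is needed: after the move, a is still alive and is_valid() is the only function that can be called on it without checking a precondition first.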

For a completely move-enabled type, I should have also defined the move assignment operator. But it comes with its own set of problems, which would not fit into this post.

Whether this can still be called ‘RAII’, I am not sure.

Destructive move

Because of the above problems, the idea of a destructive move (or a pilfering constructor) is becoming more and more popular. While the semantics of a move constructor is “transactionally, start the lifetime of the new object using the old object’s state, and put the old object into a zombie state”, the semantics of a pilfering constructor or a destructive move is “transactionally, start the lifetime of the new object using the old object’s state, and end the lifetime of the old object”. Different proposals vary in how they want to accomplish this. For more information about the subject I recommend the following reading:

  1. Sean Parent, “About Move”.
  2. Jens Weller, “C++ and Zombies: a moving question”.
  3. Pablo Halpern, “Destructive Move”.
  4. Peter Dimov, “Valueless Variants Considered Harmful”.

Semi-manual life-time

Sometimes having an object in a zombie state is exactly what we want. As we said in the beginning, in order to efficiently share the resources with others we have to keep the sessions as short as possible: acquire resources as late as possible and release them as soon as possible. Sometimes tying these events to the start and end of an object’s lifetime may not be enough. One example of such a situation has been given in this post. If more fine-grained control is needed over the session span, or an object life-time, and we do not want to risk any session leakage, Boost.Optional is just the tool for the job. You can start the session some time after the optional object is created and end it some time before the object is destroyed. Yet, the library guarantees that the session is contained within the lifetime of the optional object. This can be illustrated with the following piece of code:

#include <boost/optional.hpp>

int main()
{
  boost::optional<SocketGuard> g;

  // ...

  g.emplace();     // session starts

  // ...

  g = boost::none; // session ends

  // ...
}

And the following diagram:


g starts in a zombie state. Note that boost::optional only needs a guard-like type SocketGuard. Only later, when we call g.emplace(), is a “contained object” created, to whose lifetime the session is bound. This contained object is called *g in the diagram. Later on, instruction g = boost::none ends the life-time of the contained object, and therewith the session with the socket, but object g remains alive in the zombie state.
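The same session boundaries can be checked in a self-contained way. Two things in this sketch are my substitutions and not part of the example above: std::optional (C++17) is used as a stand-in for boost::optional, and CountingGuard is a hypothetical non-movable guard that counts open sessions instead of opening a socket.

```cpp
#include <cassert>
#include <optional>

// Hypothetical guard: counts open sessions instead of opening a socket.
class CountingGuard
{
  int* open_;

public:
  explicit CountingGuard(int* open) : open_{open} { ++*open_; }
  ~CountingGuard() { --*open_; }

  CountingGuard(CountingGuard const&) = delete;
  CountingGuard(CountingGuard&&)      = delete;
};

// Returns true if the session exists only between emplace and the reset.
bool session_within_optional_lifetime()
{
  int open = 0;
  std::optional<CountingGuard> g; // g is alive, no session yet
  bool before = (open == 0);

  g.emplace(&open);               // session starts
  bool during = (open == 1);

  g = std::nullopt;               // session ends, g stays alive
  assert(!g.has_value());
  bool after = (open == 0);

  return before && during && after;
}
```

The guard type needs neither a copy nor a move constructor here: emplace constructs the contained object in place.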

And that’s it for today. I hope it gives you a different perspective on the things we deal with every day.


6 Responses to Sessions and object lifetimes

  1. Mike Tryhorn says:

    Good article. But, I do have one question: is there any particular reason why you have marked SocketGuard’s default, no-argument constructor as ‘explicit’? Is it just a matter of coding style?

  2. Kapten Mugg says:

    Since the SocketGuard constructor doesn’t take any arguments, do we really need to declare it “explicit”?

  3. Kapten Mugg says:

    ohh sorry missed that…
