Type inference for temporaries

You probably already know that in C++11 you can have the type of your automatic object deduced from its initializer:

void reset( std::vector<std::complex<double>> & vec )
{
  auto it = vec.begin();
  // ...
}

But did you know that you can also have the type of your temporary deduced from the surrounding context? This feature is somewhat less advertised and less known in C++11, although it offers benefits similar to those of auto for automatic objects. I mentioned it in my other post, but now I think it deserves a bit more than just a mention. So here we go.

In our examples we will be dealing with matrices. We will be using type std::pair<unsigned, unsigned> for matrix indices. For the first example, let’s write a function that asks the user for a matrix index:

std::pair<unsigned, unsigned> readIndex()
{
  int i, j;
  std::cin >> i >> j;
  return std::pair<unsigned, unsigned>(i, j);
}

This function returns a temporary. There is a redundancy in this code. Since our function’s return type is already declared as std::pair<unsigned, unsigned>, it is obvious that whenever we put a return-statement it will be a std::pair<unsigned, unsigned> that we are returning. There is no need to repeat this obvious thing. Yet, C++03 syntax forces us to type it again. On the other hand, it is possible for the compiler to check what type needs to be returned and use this type. One could expect the following syntax:

std::pair<unsigned, unsigned> readIndex()
{
  int i, j;
  std::cin >> i >> j;
  return auto(i, j); // ILLEGAL!!
}

Technically, it makes sense, but it is illegal in C++. Instead, C++11 offers an even shorter alternative. We just use “brace” syntax for initialization of a temporary; and since “brace” syntax unambiguously indicates the initialization of an object, we can skip the type, and let the compiler deduce it:

std::pair<unsigned, unsigned> readIndex()
{
  int i, j;
  std::cin >> i >> j;
  return {i, j};
}
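
To try this out without typing input, here is a minimal sketch of the same return-statement context. The names makeIndex and demoReturn are made up for this illustration; makeIndex takes the two coordinates as parameters instead of reading them from std::cin.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not from the example above): builds an index from
// two coordinates that have already been read.
std::pair<unsigned, unsigned> makeIndex(unsigned i, unsigned j)
{
  // No type name here: the braces initialize the returned object, and its
  // type is taken from the declared return type of the function.
  return {i, j};
}

// Usage sketch.
void demoReturn()
{
  auto idx = makeIndex(2, 5);
  assert(idx.first == 2u && idx.second == 5u);
}
```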

Do we really need this feature? We could have rewritten the return-statement to:

return std::make_pair(i, j);

Or we could introduce an alias for std::pair<unsigned, unsigned>:

using Index = std::pair<unsigned, unsigned>;

Well, yes; brace initialization itself is only syntactic sugar, and nearly everything it allows you to do can be achieved by writing more auxiliary code. But this is an essential sugar, much like lambdas, and it should not be underestimated. A programming language should allow you to clearly express your ideas, and not having to write a lot of boilerplate code adds to clarity, especially since initializing an object is a very common task. The syntax presented here might appear confusing at first, but once you realize that braces just denote the initialization of a new object (an automatic one, or a temporary), everything becomes clear. And brief.

Now, let’s write a function that computes a trace of a square matrix. That is, we will be summing up the elements on the matrix’s main diagonal.

template <typename T, unsigned N> 
// requires: T is Addable, Regular
T trace( Matrix<T, N, N> const& mat )
{
  T ans{}; // value-initialization
  for (unsigned i = 0; i != N; ++i) {
    ans = ans + mat[{i, i}];
  }
  return ans;
}

Although we did not provide the definition of Matrix, you can guess how it looks. For sure, it provides a subscript operator:

template <typename T, unsigned M, unsigned N>
T const& Matrix<T, M, N>::operator[]( std::pair<unsigned, unsigned> i ) const;
// requires: (i.first < M && i.second < N)

Note how we access the elements from the main diagonal:

mat[{i, i}]; // same as: mat[std::pair<unsigned, unsigned>(i, i)];

Here again, because the braced list is the argument of operator[], the compiler can check what type the operator requires and use it as the type of the initialized temporary.
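
Since Matrix is never defined in this post, here is one possible minimal sketch that makes the trace example compile. The row-major std::array storage and the member name data_ are assumptions made only for this illustration, not the actual design of any library.

```cpp
#include <array>
#include <cassert>
#include <utility>

using Index = std::pair<unsigned, unsigned>;

// Assumed, minimal Matrix: M rows by N columns, stored row by row.
template <typename T, unsigned M, unsigned N>
class Matrix
{
  std::array<T, M * N> data_{}; // elements start out value-initialized
public:
  T& operator[](Index i)
  {
    assert(i.first < M && i.second < N);
    return data_[i.first * N + i.second];
  }
  T const& operator[](Index i) const
  {
    assert(i.first < M && i.second < N);
    return data_[i.first * N + i.second];
  }
};

template <typename T, unsigned N>
T trace(Matrix<T, N, N> const& mat)
{
  T ans{}; // value-initialization
  for (unsigned i = 0; i != N; ++i)
    ans = ans + mat[{i, i}]; // {i, i} initializes the Index parameter
  return ans;
}
```

With this sketch, filling the diagonal of a Matrix<int, 3, 3> with 1, 2 and 3 should make trace return their sum, regardless of what the off-diagonal elements hold.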

Now, let’s suppose our matrix type also provides a way to extract a sub-matrix thereof:

using Index = std::pair<unsigned, unsigned>;

template <typename T, unsigned M, unsigned N>
auto Matrix<T, M, N>::sub( Index lo, Index hi ) -> SubMatrix<T>;
// requires: (lo.first <= hi.first  &&  lo.second <= hi.second)
//           (hi.first < M  &&  hi.second < N)

That is, we provide two indices as arguments; they define a rectangle. Function sub returns only the values that fall within the rectangle. What we return is not another matrix, but a view of the original matrix: a view that only allows us to access a sub-range of elements of the original matrix. So, how can we extract a sub-matrix from a matrix?

Matrix<double, 20, 20> mat = { /*...*/ };
auto small = mat.sub( Index{0, 0}, Index{3, 3} );

But here again, because the compiler knows what argument types function sub expects, it can deduce them, and does not require the programmer to type something that is already obvious:

Matrix<double, 20, 20> mat = { /*...*/ };
auto small = mat.sub( {0, 0}, {3, 3} );
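
The function-argument context can be tried in isolation with a small sketch. The function cellCount below is a hypothetical stand-in for sub: instead of returning a view, it only reports how many elements the inclusive rectangle covers. The wrappers spelledOut and deduced are likewise made up for the comparison.

```cpp
#include <cassert>
#include <utility>

using Index = std::pair<unsigned, unsigned>;

// Hypothetical stand-in for Matrix::sub. Both corners are inclusive.
unsigned cellCount(Index lo, Index hi)
{
  assert(lo.first <= hi.first && lo.second <= hi.second);
  return (hi.first - lo.first + 1) * (hi.second - lo.second + 1);
}

// Both functions pass the same two Index temporaries to cellCount.
unsigned spelledOut() { return cellCount(Index{0, 0}, Index{3, 3}); }
unsigned deduced()    { return cellCount({0, 0}, {3, 3}); }
```

One caveat worth keeping in mind: this relies on the call being unambiguous. If cellCount were overloaded for several class types that can all be initialized from two integers, the bare braces might no longer select a single overload, and the type would have to be spelled out again.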

The fourth context in which the type of a temporary can be deduced is assignment. We will use it when implementing a function for finding the index of the biggest element in the matrix.

template <typename T, unsigned M, unsigned N> 
// requires: T is LessThanComparable, (N > 0  &&  M > 0)
Index maxElemIndex( Matrix<T, M, N> const& mat )
{
  Index ans {0, 0};
  for (unsigned i = 0; i != M; ++i)
    for (unsigned j = 0; j != N; ++j)
      if (mat[ans] < mat[{i, j}])
        ans = {i, j};
  return ans;
}
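
For readers who want to run the assignment context right away, here is a sketch of the same function where a plain two-dimensional std::array stands in for Matrix. That substitution, and the name demoMaxElemIndex, are assumptions made for this illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

using Index = std::pair<unsigned, unsigned>;

// Assumption: std::array<std::array<T, N>, M> plays the role of Matrix.
template <typename T, std::size_t M, std::size_t N>
Index maxElemIndex(std::array<std::array<T, N>, M> const& mat)
{
  static_assert(M > 0 && N > 0, "matrix must not be empty");
  Index ans{0, 0};
  for (unsigned i = 0; i != M; ++i)
    for (unsigned j = 0; j != N; ++j)
      if (mat[ans.first][ans.second] < mat[i][j])
        ans = {i, j}; // the temporary's type comes from the left-hand side
  return ans;
}

// Usage sketch: the largest element, 12, sits in row 1, column 2.
void demoMaxElemIndex()
{
  std::array<std::array<int, 3>, 2> m = {{ {{1, 9, 3}}, {{4, 9, 12}} }};
  Index best = maxElemIndex(m);
  assert(best.first == 1u && best.second == 2u);
}
```

Because the comparison is a strict less-than, ties keep the first index found.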

Reset idiom

The above examples show all four contexts where the type of a temporary can be deduced. Now, we shall see how we can combine this type inference and value-initialization syntax to form an idiom for resetting objects. For that we need a type whose default constructor puts an object into a null-state. Let’s try unique_ptr.

std::unique_ptr<Index> ptr{ new Index{3, 3} };
use(*ptr);
ptr = {}; // reset

The last instruction creates a value-initialized temporary of type matching the left-hand side expression (std::unique_ptr<Index>) and move-assigns it to ptr. For unique_ptr, null-state means storing nullptr; and assignment means clearing the target first, and then assigning the new (null) value. Thus, ={} indicates a reset instruction for types that are MoveAssignable (or CopyAssignable) and DefaultConstructible.
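
The idiom can also be wrapped in a generic helper. The following is only a sketch: the function resetToDefault and the type Counter are invented for this illustration, and the helper assumes nothing beyond the requirements listed above (the type is DefaultConstructible and assignable).

```cpp
#include <cassert>
#include <memory>
#include <utility>

using Index = std::pair<unsigned, unsigned>;

// Hypothetical helper that spells out the ={} idiom for any suitable type.
template <typename T>
void resetToDefault(T& obj)
{
  obj = {};
}

// A made-up type whose default state is "no hits yet".
struct Counter
{
  int hits = 0;
};

// Usage sketch.
void demoReset()
{
  std::unique_ptr<Index> ptr{ new Index{3, 3} };
  ptr = {};               // the idiom written inline
  assert(ptr == nullptr);

  Counter c;
  c.hits = 7;
  resetToDefault(c);      // the same idiom through the helper
  assert(c.hits == 0);
}
```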

Explicit constructors

Note that neither the reset idiom nor, in fact, any of the temporary type inference syntax works if you try to call an explicit constructor. For instance, you cannot reset a unique_ptr like this:

ptr = { new Index{3, 3} }; // INVALID

Explicit constructors are simply not considered valid candidates in these cases. This is obvious to some and surprising to others, mostly because with the addition of the new initialization syntax it is not that clear anymore what ‘explicit’ means. Does it mean “you have to explicitly request the (any) initialization” or “you have to explicitly request the call to T’s constructor”? Both are valid answers, but for C++11 the latter one was chosen.
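
If you do want to give ptr a new object, you have to name the operation. The sketch below shows two ways that should work; the function name reassign is made up for this illustration. The static_assert only confirms that no implicit conversion from a raw pointer exists, which is why the braces alone are rejected.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>
#include <utility>

using Index = std::pair<unsigned, unsigned>;

// unique_ptr's constructor from a raw pointer is explicit, so Index* does
// not convert implicitly to std::unique_ptr<Index>.
static_assert(!std::is_convertible<Index*, std::unique_ptr<Index>>::value,
              "a raw pointer must not convert implicitly");

// Hypothetical helper showing two explicit ways to re-seat the pointer.
void reassign(std::unique_ptr<Index>& ptr)
{
  ptr.reset(new Index{3, 3});                      // member function
  ptr = std::unique_ptr<Index>{ new Index{4, 4} }; // type named explicitly
  assert(ptr != nullptr);
}
```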

And that’s it for today. You can play with the new initialization syntax if you use GCC 4.4 or higher. The only examples that I was not able to compile with GCC are those that use operator[]. But it looks more like a bug in GCC.


6 Responses to Type inference for temporaries

  1. Michal Mocny says:

    Excellent article.

    I do think that the syntax is handy, but I still wish it were easier to deduce the return type of a function. In your example maxElemIndex, you had to explicitly name the type of the variable “ans”. For the same reasons this article already outlines, I particularly want this when the return type is complex and/or deduced from argument types (using the new “auto fun(…) -> decltype(…)” syntax).

    • Hi Michal. Thank you for the kind words.
      Did you see Google’s Go language? It has a feature of named return values. This post shows some examples. This addresses a common design where you have a variable, like my ans that you keep preparing as the function executes and return it in the end. If we had it in C++, function maxElemIndex would read something like:

      template <typename T, unsigned M, unsigned N> 
      // requires: T is LessThanComparable, (N > 0  &&  M > 0)
      auto maxElemIndex( Matrix<T, M, N> const& mat ) -> Index ans{0, 0}
      {
        for (unsigned i = 0; i != M; ++i)
          for (unsigned j = 0; j != N; ++j)
            if (mat[ans] < mat[{i, j}])
              ans = {i, j};
        return ans;
        // or simply return;
      };
      
      • Michal Mocny says:

        Andrzej,
        Yes, I have studied Go quite a bit (though I do not use it regularly). Named+multiple return types are one very nice improvement Go brings to this family of languages. I’ve done my best to emulate these features in C++11, such as using std::tuple and std::tie, but with std::tie you cannot use return type inference which makes it much less usable. Perhaps I will go poking around to see if there are new clever techniques in this area.

  2. mortoray says:

    I see the advantages to the new bracket init syntax, though I’m still not totally convinced that it won’t make code simply more confusing.

  3. Sarfaraz says:

    Good post. Thanks for sharing this.

    However, why did you overload `operator[]` and use it as `mat[{i, i}];`? I think overloading `operator()` and using it as `mat(i,i)` is much better and cleaner.

    • Well, the goal was to show the four contexts where the type of a temporary can be deduced. operator[] is one such context. I could not think of a better example. I am not trying to say that providing operator[] is the right way of matrix design.
