Gotchas of type inference

Update. I was a bit imprecise when saying that type deduction using auto works exactly the same as template argument deduction. There is one noticeable difference. It is explained below in the updated post.

C++ comes with a number of tools for type inference. By “type inference” here I mean the ability of the compiler to figure out the type of any given expression or initializer, and thus not to force you to write the redundant type down. You have probably already heard of the new usage of the keyword auto and the addition of decltype. There are more type inference mechanisms that are not connected with any particular keyword: the selection of the right function or function template overload, deducing the return type of a lambda function, inferring the type of a temporary, and probably more. In all those cases the compiler is able to infer the type to be used. The question to be explored in this post is: does the compiler infer the same type that you have in mind?

Before we go further, let me just add one disclaimer. Herb Sutter mentioned in his keynote talk at GoingNative 2012 that C++11 is too fresh for anyone to be able to come up with comprehensive guidelines based on real-life experience. I agree with this statement. The things I describe below are just random observations that I made while playing with the new features and reading the C++ Standard.

Declaration type or expression type?

int i = 0;
decltype(i) j = 0;

What type is deduced by decltype? Or in other words, what is the type of j? You could say it is obvious: i is of type int, so decltype(i) also returns type int. And you would be right about the type, but not necessarily about this being obvious. When used this way, i.e. with the name of an object as the operand, decltype means “the type that this object was declared with.” But what about this:

int i = 0;
decltype(&i) p = 0;

Here, &i is not an object, it is an expression. This is not a problem: if the operand is an expression, decltype means “the type that would be returned if we evaluated the expression”; in our case: int*. So, decltype means the type of declaration for object names, and the result of evaluating the expression for expressions. But if we look again at the former example, i alone is also an expression. A simple one, but still an lvalue expression, and for lvalue expressions the “expression” rule renders a reference type: int&.

Thus, for some expressions decltype returns the type obtained by the evaluation of the expression, and for other expressions it returns a related, but still different, type. Do you mind this lack of uniformity? If so, use decltype with double parentheses:

int i = 0;
decltype((i)) r = i;

This time, r is declared as int&! This is because (i) is definitely an expression, not an object name, and for an lvalue expression of type T decltype returns T&.
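
The three cases discussed so far can be checked at compile time. The following is only a minimal sketch: it uses std::is_same from <type_traits> together with static_assert, and it compiles only if each stated type is the one decltype really yields.

```cpp
#include <type_traits>

int i = 0;

// Name of an object: the type the object was declared with.
static_assert(std::is_same<decltype(i), int>::value,
              "decltype(i) is the declared type");

// An expression that is not an lvalue: its type, unchanged.
static_assert(std::is_same<decltype(&i), int*>::value,
              "decltype(&i) is the type of the expression");

// A parenthesized name is an lvalue expression: a reference is added.
static_assert(std::is_same<decltype((i)), int&>::value,
              "decltype((i)) is an lvalue reference");
```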

How do you feel about it? It looks like decltype is a tool for two slightly different, not entirely separate, purposes: (1) detecting the type that some name was declared with and (2) determining the type of an expression. I tend to agree with Faisal Vali (see this post) that it would be easier for C++ programmers if these two queries were implemented with two different keywords, the other one being named, say, exprtype. Since this is not the case, if you are really always interested in expression types and never in types of name declarations, you could consider using the following macro, which always checks for expression types:

#define EXPRTYPE(...) decltype((__VA_ARGS__))
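
To see what such a macro reports, here is a small sketch. The variables i and k and the function g are hypothetical (g is only declared, because it is used solely inside decltype), and the checks again rely on std::is_same and static_assert.

```cpp
#include <type_traits>

#define EXPRTYPE(...) decltype((__VA_ARGS__))

int i = 0;
const int k = 0;
int g();  // hypothetical; never called at run time, so no definition is needed

// Names are always treated as lvalue expressions:
static_assert(std::is_same<EXPRTYPE(i), int&>::value, "lvalue of type int");
static_assert(std::is_same<EXPRTYPE(k), const int&>::value, "lvalue of type const int");

// Expressions that produce temporaries keep their non-reference type:
static_assert(std::is_same<EXPRTYPE(i + 1), int>::value, "prvalue of type int");
static_assert(std::is_same<EXPRTYPE(g()), int>::value, "prvalue of type int");
```

The macro is variadic only so that an expression containing top-level commas (for instance, template arguments) still reaches decltype in one piece.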

Deducing a function’s return type

You might think that the above is not a real-life issue. But it is. Let me give you one convincing example. Many C++ programmers find it surprising that for small one-liner functions they have to specify the return type even though it could be easily deduced from the returned expression. The following is impossible in C++11:

template <class T, class U>
auto min(T x, U y) 
{
  return y < x ? y : x;
}

The compiler should be able to deduce what we mean, but it does not. Note that specifying the return type is not that easy in this two-argument template. Of course, the following would definitely work:

template <class T, class U>
auto min(T x, U y) -> decltype(y < x ? y : x)
{
  return y < x ? y : x;
}

But having to spell the same expression twice is just… well, not the right way. In order not to repeat the expression for every function you add, you can devise a macro that will do it for you. This is based on the suggestion presented by Dave Abrahams here:

#define RETURNS(...) -> decltype(__VA_ARGS__) { return (__VA_ARGS__); }

Armed with this macro, you can define our function min as:

template <class T, class U>
auto min(T x, U y) RETURNS(y < x ? y : x);

This is pretty nice for one-liner functions, but it has one drawback: it is very likely to cause undefined behavior, and in fact it does for our function min whenever T and U are the same type. It is not a problem with the macro but with decltype, and the previous example (with explicit decltype) suffers from the same UB, as pointed out by Daveed Vandevoorde here. This is because x and y are then lvalues of the same type, so y < x ? y : x is an lvalue and decltype(y < x ? y : x) is an lvalue reference: we are returning an lvalue reference to a function parameter, an automatic object that is gone once our function returns.
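
The difference between the two situations can be observed without triggering any UB. In the sketch below the function is renamed to minimum (a hypothetical name, chosen only to stay clear of std::min) and is merely declared, because it is used exclusively inside decltype:

```cpp
#include <type_traits>

// Declaration only: we inspect the return type, we never call the function.
template <class T, class U>
auto minimum(T x, U y) -> decltype(y < x ? y : x);

// Same types: the conditional expression is an lvalue, so a reference
// to one of the by-value parameters would be returned.
static_assert(std::is_same<decltype(minimum(1, 2)), int&>::value,
              "int, int: returns int&");

// Different types: the operands are converted to a common type and the
// conditional expression is a temporary, so the function returns by value.
static_assert(std::is_same<decltype(minimum(1, 2L)), long>::value,
              "int, long: returns long");
```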

But hey, lambdas do deduce (in some cases) the type they return. Do they suffer from the above potential UB problem? No, because they deduce the type in a different way! They decay their return type; i.e., they convert reference type to non-cv object type, function to function pointer, and array to pointer. These are the same conversions that occur when you pass arguments to functions by value. Thus, a lambda with deduced return type:

[](T x, U y) { return y < x ? y : x; }

Is equivalent to:

[](T x, U y) -> typename std::decay<decltype(y < x ? y : x)>::type
{ return y < x ? y : x; }

But do you know what that means? Lambdas cannot deduce reference return types! If you need your lambda to return a reference, you have to say it explicitly. But you see, it is not bad that lambdas do not deduce references; otherwise they would risk causing UB. And in a language that encourages value semantics, returning by value is not a bad thing.
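
A short sketch of both variants follows. The global variable counter is hypothetical; it is a global only so that returning a reference to it is safe.

```cpp
#include <type_traits>

int counter = 0;  // hypothetical object that outlives both lambdas

// Deduced return type: the reference is dropped and a copy is returned,
// even though (counter) is an lvalue expression.
auto deduced = [] { return (counter); };

// To return a reference, the return type has to be spelled out.
auto spelled_out = []() -> int& { return counter; };

static_assert(std::is_same<decltype(deduced()), int>::value,
              "deduced return type is a value");
static_assert(std::is_same<decltype(spelled_out()), int&>::value,
              "explicit return type can be a reference");
```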

Similarly, if you do not mind returning by value in general, the above suggested macro could be replaced with:

#define RETURNS(...)                                   \
  -> typename std::decay<decltype(__VA_ARGS__)>::type  \
  { return (__VA_ARGS__); }
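
For illustration, this is how the decaying variant might be used. It is a sketch only: the function is again given the hypothetical name minimum to avoid any confusion with std::min, and std::decay requires <type_traits>.

```cpp
#include <type_traits>

#define RETURNS(...)                                   \
  -> typename std::decay<decltype(__VA_ARGS__)>::type  \
  { return (__VA_ARGS__); }

// The macro supplies both the trailing return type and the body.
template <class T, class U>
auto minimum(T x, U y) RETURNS(y < x ? y : x)

// Even when both arguments have the same type, the result is now a value.
static_assert(std::is_same<decltype(minimum(1, 2)), int>::value,
              "int, int: returns int");
static_assert(std::is_same<decltype(minimum(1, 2L)), long>::value,
              "int, long: returns long");
```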

`auto` prefers regular objects

If you use auto to create variables based on other variables, the types thus generated are object types (rather than reference types) without top-level cv-qualifiers:

int i = 0;
int& j = i;
const int k = 0;
int& f();
int g();

auto a = i;   // deduced: int
auto b = j;   // deduced: int
auto c = k;   // deduced: int
auto d = f(); // deduced: int
auto e = g(); // deduced: int

Note that if you used decltype to create these variables, the result would be different:

decltype(i)   a = i;   // int
decltype(j)   b = j;   // int&
decltype(k)   c = k;   // const int
decltype(f()) d = f(); // int&
decltype(g()) e = g(); // int

There are more differences between the two tools that we shall see later on. The question to answer right now is: how does type deduction with auto work? The answer is: it works almost exactly the same as the deduction of function argument types when instantiating a function template. Or, by way of example:

auto a = INITIALIZER;

template <class T> void deduce(T a);
deduce(INITIALIZER);

In both cases, the same type is deduced for a. The same holds for the following examples:

auto& a = INITIALIZER;

template <class T> void deduce(T& a);
deduce(INITIALIZER);

auto&& a = INITIALIZER;

template <class T> void deduce(T&& a); // perfect forwarding
deduce(INITIALIZER);
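
The correspondence can be verified with a few compile-time checks. In this sketch the three function templates are hypothetical probes that are only declared: each one repeats its parameter type as its return type, so that decltype of a call reveals what was deduced.

```cpp
#include <type_traits>

// Hypothetical probes, declared only: the return type mirrors the parameter type.
template <class T> T   by_value(T);
template <class T> T&  by_lvalue_ref(T&);
template <class T> T&& by_forwarding_ref(T&&);

int i = 0;
const int k = 0;
int g() { return 0; }

auto   a1 = k;    // like deduce(T a)
auto&  a2 = k;    // like deduce(T& a)
auto&& a3 = i;    // like deduce(T&& a) with an lvalue
auto&& a4 = g();  // like deduce(T&& a) with a temporary

static_assert(std::is_same<decltype(a1), decltype(by_value(k))>::value,
              "both are int");
static_assert(std::is_same<decltype(a2), decltype(by_lvalue_ref(k))>::value,
              "both are const int&");
static_assert(std::is_same<decltype(a3), decltype(by_forwarding_ref(i))>::value,
              "both are int&");
static_assert(std::is_same<decltype(a4), decltype(by_forwarding_ref(g()))>::value,
              "both are int&&");
```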

One essential difference, though, between auto and function template argument deduction is in deducing the type of an initializer list. Namely, template deduction will fail on such an argument:

auto a = {1, 2, 3};   // decltype(a) is std::initializer_list<int>

template <class T> void deduce(T&& a);
deduce({1, 2, 3});    // deduction fails (compile-time error)
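
Template deduction does work, however, when the parameter itself names std::initializer_list. A minimal sketch, where count_items is a hypothetical function introduced only for this illustration:

```cpp
#include <cstddef>
#include <initializer_list>
#include <type_traits>

auto a = {1, 2, 3};
static_assert(std::is_same<decltype(a), std::initializer_list<int>>::value,
              "auto deduces std::initializer_list<int>");

// With the parameter spelled as std::initializer_list<T>,
// T can be deduced from the elements of the braced list.
template <class T>
std::size_t count_items(std::initializer_list<T> items)
{
  return items.size();
}

// This compiles only because deduction succeeds for count_items({1, 2, 3}).
static_assert(std::is_same<decltype(count_items({1, 2, 3})), std::size_t>::value,
              "T is deduced as int");
```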

Perfect forwarding with `auto`

But wait. In C++ we now have perfect forwarding for rvalue reference arguments in function templates (see here, here and here). Given that auto works like deducing function template argument types, does it mean that it can also be used for perfect forwarding? Yes:

int i = 0;
int& j = i;
const int k = 0;
int& f();
int g();

auto&& a = i;   // deduced: int&
auto&& b = j;   // deduced: int&
auto&& c = k;   // deduced: const int&
auto&& d = f(); // deduced: int&
auto&& e = g(); // deduced: int&&
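
Deducing the reference is only half of forwarding; the variable still has to be passed on. The sketch below is an addition that goes slightly beyond the examples above: it assumes the usual idiom of calling std::forward with decltype of the variable, and checks what type the forwarded expression has.

```cpp
#include <type_traits>
#include <utility>

int i = 0;
int g() { return 0; }

auto&& a = i;    // int&
auto&& e = g();  // int&&; the lifetime of the temporary is extended

static_assert(std::is_same<decltype(a), int&>::value, "lvalue reference");
static_assert(std::is_same<decltype(e), int&&>::value, "rvalue reference");

// Used as an expression, a named reference is always an lvalue...
static_assert(std::is_same<decltype((e)), int&>::value,
              "e itself is an lvalue");

// ...so std::forward<decltype(x)>(x) is needed to pass it on unchanged.
static_assert(std::is_same<decltype(std::forward<decltype(a)>(a)), int&>::value,
              "forwarded as an lvalue");
static_assert(std::is_same<decltype(std::forward<decltype(e)>(e)), int&&>::value,
              "forwarded as an rvalue");
```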

Perfect forwarding always deduces a reference, but it can be either an lvalue reference or an rvalue reference; cv-qualifiers are “forwarded” as well. Is it useful? Yes, typically in templates:

template <class R> // requires Range<R>
void process(R&& range)
{
  for (auto&& elem : range) {
    // process elem
  }
}

By using perfect forwarding for declaring the reference elem we say “if range is constant, iterate using a reference to const; if not, iterate over references to mutable objects; if range gives access to elements by value, use rvalue references.”
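
These three outcomes can be illustrated with standard containers. The alias elem_type below is hypothetical: it computes the type that auto&& elem would get for a given range type. std::vector<bool> serves as an example of a range that hands out its elements by value, since dereferencing its iterator yields a temporary proxy object.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical alias: the type of `auto&& elem` when iterating over an R.
// decltype(*it) is X& for an lvalue and X for a temporary; appending &&
// collapses to X& or X&&, which mirrors the deduction done for auto&&.
template <class R>
using elem_type = decltype(*std::declval<R&>().begin()) &&;

// Mutable range: references to mutable elements.
static_assert(std::is_same<elem_type<std::vector<int>>, int&>::value,
              "mutable range");

// Constant range: references to const.
static_assert(std::is_same<elem_type<const std::vector<int>>, const int&>::value,
              "constant range");

// Elements returned by value: rvalue references to the temporaries.
static_assert(std::is_same<elem_type<std::vector<bool>>,
                           std::vector<bool>::reference&&>::value,
              "range of proxies");
```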

`auto` vs `decltype`

As promised, we will now identify the differences between the type inference mechanisms offered by auto and decltype. In fact we provided the answer already. decltype is sort of an operator: you have to give it an expression (even if it is a simple name of some object), and it gives you the expression’s or the object’s type. auto, on the other hand, mimics the behaviour of type deduction in function templates, and does not even necessarily require an expression:

auto a = {0, 1, 2};
std::vector<int> v{a};

Here, {0, 1, 2} is not an expression, but it is a good enough initializer.

And while we are at it, let’s see one tricky case regarding type deduction from an initializer list. Consider the following code:

auto a{2};
std::vector<int> v(a);

In C++11 (and in C++14) this had the semantics of creating variable a of type std::initializer_list<int>. It was later declared a bug in the C++14 specification and the semantics were changed, so that this now deduces the type of a as int. GCC since version 5 implements the new semantics even in mode -std=c++11.
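
The two forms can be compared side by side. This sketch assumes a compiler that implements the revised rule (such as GCC 5 or later, as noted above); with an older compiler the first assertion would fail. The names x and y are hypothetical.

```cpp
#include <initializer_list>
#include <type_traits>

auto x{2};     // braces without '=': a single element deduces its own type
auto y = {2};  // braces with '=': still an initializer list

static_assert(std::is_same<decltype(x), int>::value,
              "auto x{2} deduces int under the revised rule");
static_assert(std::is_same<decltype(y), std::initializer_list<int>>::value,
              "auto y = {2} deduces std::initializer_list<int>");
```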

6 Responses to Gotchas of type inference

Motti Lanzkron says:

April 5, 2012 at 11:31 am

Thanks for a very interesting post.

I found it since it refers to my “Inferring too much” post although I have to say that you don’t seem to be referring to anything I wrote there 🙂

Andrzej Krzemieński says:

April 5, 2012 at 12:01 pm

Thanks Motti. I changed “References” to “Further reading”. Hopefully this will be slightly less confusing.

Will says:

August 10, 2012 at 6:51 pm

Typo: “Of course, the following would definately work:”, “definately” -> “definitely”.
(And linking for fun and referencing: http://www.d-e-f-i-n-i-t-e-l-y.com ^^ although I know it’s really a typo here, I’ve seen that you correctly used “definitely” in all your other articles.)

David Chamont says:

November 2, 2018 at 4:41 pm

For the very last example, I suspect you should not use “auto a{2};”, which deduce “int” with all the g++ versions I tried, but “auto a = {2};”, which will actually deduce an “std::list_initializer”. Perhaps something which was not true in 2012 ?

- Andrzej Krzemieński says:
  
  November 2, 2018 at 5:34 pm
  
  Oh dear. Indeed this is a change since C++11. In GCC 4.9 and earlier, this is an initializer_list: https://wandbox.org/permlink/UvJJNkOpxkf0nZ7j. In GCC 5.1 and later this is an int: https://wandbox.org/permlink/qksvKsEQnVJF7Bw2 (even with -std=C++11, which I think is a bug).
  
  I will need to update the post. Thanks for reporting this!
  
  - Andrzej Krzemieński says:
    
    November 2, 2018 at 9:48 pm
    
    Ok, I have rewritten the last bit. Thanks again for reporting this.