The cost of std::initializer_list

C++ aims to be a language for people working close to metal. It offers abstractions that cost no (or minimum) run-time or memory overhead. If you require zero-overhead abstractions, std::initializer_list may not be a tool for you.

Consider the following use case of std::initializer_list:

vector<string> vec1 {"ant", "bat", "cat"};
vector<string> vec2 {vec1[2], vec1[0], vec1[1]};

The initialization of vec1 is pretty straightforward. In vec2 the situation is more interesting. We initialize it with a permutation of elements in vec1. It works, if you test it; and there seems nothing unusual about it; until you observe one detail. std::vector’s initializer-list constructor will make use of the initializer_list more-less like this:

template <typename T>
vector<T>::vector(initializer_list<T> l)
  const T*       it  = l.begin();  // raw pointer!
  const T* const end = l.end();    // raw pointer!

  for (; it != end; ++it) 

  // or, for sure, something faster

I know: this is not how you write a clean code, and it may be even incorrect; but my goal is to show one important aspect of initializer_list’s interface: its begin() and end() return raw pointers. An increment operation on an address just adds a constant offset, which requires that the elements are laid out contiguously somewhere, in the desired order. This means that somewhere there must reside a contiguous sequence of strings {"cat", "ant", "bat"}. Elements in vec1 are contiguous, but not in the right order; so what piece of memory will our iterator it be really iterating over?

And this question brings us to the core of the initializer-list constructor feature. Whenever a list-initialization is requested, and compiler determines that it will use the initializer-list constructor for initialization, a temporary array is created, wherein all the necessary elements are copy-constructed. This has a couple of implications:

  1. There is more copying involved than we can see.
  2. initializer_list will not work for move-only types (like unique_ptr or unique_lock).
  3. There is a temporary involved, which might get us into memory-safety issues.

Let’s examine in more detail what happens in the following line:

vector<string> vec2 {vec1[2], vec1[0], vec1[1]};

First, a temporary array of type const std::string is created. Its size and values is determined by what we see inside the braces:

const std::string _a [3] = {vec1[2], vec1[0], vec1[1]};

This performs N copies. Now, a temporary object of type std::initializer_list is created, pointing to the array, and passed along to the constructor call:

vector<string> vec2 (initializer_list<string>(_a, _a + 3));

Initializing elements of vec2 performs another N copies. The array is a temporary, so as soon as the initialization ends, it gets destroyed. Binding such temporary array to an instance of std::initializer_list works like binding a temporary object to a reference: in some contexts it extends the life-time of the temporary target. But if you try to store it as a class member, you will get into memory management issues.

As indicated above, initializer-list constructor will not work with non-copyable types. For copyable types, the question for you is: can you afford the additional copying?

For scalar types (like int or double) or for small trivially-copyable types it is not a big deal. Also it is not a big deal in non-critical parts of the code. In any other case, the convenience of syntactic shorthand may not be worth the run-time cost.

But perhaps an even more important question is, how often do you really want to initialize something like a vector with the number of elements known at compile time? In practice, about the only places I have found it useful are unit-tests or globals’ initialization.

Now, we need to make one important distinction. Initializing an object with braces does not mean “use an initializer-list constructor.” It really means (for classes with constructors), “select the best suitable constructor.” It may turn out to be the initializer-list constructor, but may as well not. In fact, we can still enable in our type a list-initialization with variable number of arguments, without paying the cost of initializer-list constructor. However, this comes at the expense of introducing variadic templates, SFINAE, and a number of meta-programming tricks. Suppose we are writing a wrapper for a std::vector with no-copy list initialization. We just need to create a variadic constructor template and decompose the parameter pack recursively:

// C++11 :

template <typename T>
class Vec
  std::vector<T> _vect;

  template <typename... UList>
  Vec(UList &&... vs)
    process(std::forward<UList>(vs)...); // decompose
  template <typename U, typename... UList>
  void process(U && v, UList &&... vs)
  void process() {} // end recursion  

This already “works”, but has a problem that our Vec is constructible from any number of any types whatsoever, so we have to constrain the constructor a bit:

// C++11 :

# define REQUIRES(...)                                      \
  typename std::enable_if<(__VA_ARGS__), bool>::type = true \
template <typename... UList,
          REQUIRES(nonarrow_convertible<T, UList...>::value)>
Vec(UList &&... vs);

We are emulating Concepts Lite here a bit. and we still need to define the type trait (an approximation of a concept) nonarrow_convertible which tests that each second and further parameter is convertible to the first one, excluding the narrowing conversions.

In C++11, using a hint from this Stack Overflow answer, I was able to come with the following implementation:

// C++11 :

// 1. Implementing a variadic logical AND
template <bool...> struct bool_sequence {};

template <bool... Bs>
using bool_and
 = std::is_same<bool_sequence<Bs...>,
                bool_sequence<(Bs || true)...>>;

// 2. Helper function to test implicit conversion 
template <typename T>
  std::true_type create(T v);

// 3a. Test for conversion and non-narrowing
template <typename T, typename U>
  decltype(create<U>({std::declval<T>()})) // <- braces

// 3b. Fallback function if sfinae fails on 3a
template <typename T, typename U>
  std::false_type test_nonnarow_conv(long);

// 3c. Single-argument conversion trait
template <typename T, typename U>
  using is_nonarrow_convertible
  = decltype(test_nonnarow_conv<T, U>(0));

// 4. Our multi-argument trait
template <typename T, typename... Ts>
  using nonarrow_convertible
  = bool_and<is_nonarrow_convertible<Ts, T>::value...>;

For a full working example, see here. I admit, it looks scary. But it should be noted that the effort (to understand what is going on in this implementation) is on the side of the implementer. In exchange, the user has a clean list-initialization syntax:

Vec<int> v {1, 2, 3, 4};
Vec<int> u {};  // this is also a default constructor

Vec<unique_ptr<int>> pv { make_unique<int>(1),
                          make_unique<int>(2) };

A variadic constructor template is not an ideal solution. Difficult notation aside, it has other problems compared to std::initializer_list:

  1. The interface for accessing the elements inside the constructor is less friendly (compared to STL-like interface of std::initializer_list).
  2. All the template instantiations that need to be performed cause longer compile-times, and some of them may not disappear from the resulting binary.

In fact, this is the area of active C++ development. Below you can see how the same variadic constructor template can be implemented without so many helper classes or recursion, in a C++ with Concepts Lite and Expression Folding.

template <typename T>
  std::true_type create(T v);
template <typename T, typename U>
  concept bool NarrowConvertible = requires()
    create<U>( {std::declval<T>()} ); 

template <typename T>
  struct Vec
    std::vector<T> _vect;
    template <typename... UList >
      requires (NarrowConvertible<T, UList> && ...) 
      Vec(UList&&... vs)
        (_vect.push_back(std::forward<UList>(vs)), ...);

Lines 16 and 20 show how you can use fold-expressions to eliminate the need for recursion in a variadic template. In line 16 we are able to intersect all the conditions in one expression. If UList is empty, the expression returns true. Line 20 is perhaps even more interesting. It combines the push_back expressions with operator comma, which effectively sequences a number of instructions one after another. If UList is empty, the resulting expression is void(). These not-yet-C++ features are available already in GCC 6.1, and you can test them online here.

For the end, I provide the relevant quotes from the C++ Standard, that describe the behavior of list-initialization via the initializer-list constructor.

§ 8.5.4 ¶ 5:

An object of type std::initializer_list<E> is constructed from an initializer list as if the implementation allocated a temporary array of N elements of type const E, where N is the number of elements in the initializer list. Each element of that array is copy-initialized with the corresponding element of the initializer list, and the std::initializer_list<E> object is constructed to refer to that array. [Note: A constructor or conversion function selected for the copy shall be accessible (Clause 11) in the context of the initializer list. —end note] If a narrowing conversion is required to initialize any of the elements, the program is ill-formed.

struct X {
  X(std::initializer_list<double> v);
X x{ 1,2,3 };

The initialization will be implemented in a way roughly equivalent to this:

const double __a[3] = {double{1}, double{2}, double{3}};
X x(std::initializer_list<double>(__a, __a+3));

assuming that the implementation can construct an initializer_list object with a pair of pointers. —end example]

§ 8.5.4 ¶ 6:

The array has the same lifetime as any other temporary object (12.2), except that initializing an initializer_list object from the array extends the lifetime of the array exactly like binding a reference to a temporary. [Example:

typedef std::complex<double> cmplx;
std::vector<cmplx> v1 = { 1, 2, 3 };

void f() {
  std::vector<cmplx> v2{ 1, 2, 3 };
  std::initializer_list<int> i3 = { 1, 2, 3 };

struct A {
  std::initializer_list i4;
  A() : i4{ 1, 2, 3 } {} // ill-formed, would create a dangling reference

For v1 and v2, the initializer_list object is a parameter in a function call, so the array created for { 1, 2, 3 } has full-expression lifetime. For i3, the initializer_list object is a variable, so the array persists for the lifetime of the variable. For i4, the initializer_list object is initialized in the constructor’s ctor-initializer as if by binding a temporary array to a reference member, so the program is ill-formed (12.6.2). —end example] [Note: The implementation is free to allocate the array in read-only memory if an explicit array with the same initializer could be so allocated. —end note]

This entry was posted in programming and tagged , . Bookmark the permalink.

18 Responses to The cost of std::initializer_list

  1. TONGARI J says:

    As an alternative:

    template<std::size_t N>
      : _vect(std::make_move_iterator(std::begin(a)), std::make_move_iterator(std::end(a)))

    Extra braces needed though, but somebody may find this more idiomatic:

    Vec<int> v {{1, 2}};
  2. Gonzalo BG says:

    How do you provide the other vector constructors alongside this one? (e.g. `vector v (N)`)

    • The answer is in the previous post. In short:

      struct with_size_t {};
      constexpr with_size_t with_size {}; // a 'tag'
      // ...
      // definitely explicit:
      explicit Vec(with_size_t, size_t s) : _vect(s) {}
      // ...
      int main()
        Vec<size_t> v1 {1u, 2u, 3u};    // {1, 2, 3}
        Vec<size_t> v2 {5u};            // {5}
        Vec<size_t> v3 {with_size, 5u}; // {0, 0, 0, 0, 0}
      • Gonzalo BG says:

        Now that C++17 guarantees copy-elision I think I prefer using a static member function rather than adding tags to the std namespace:

        auto v0 = vector{1, 2}; // initializer list
        auto v1 = vector::with_size(N, 2); // sized
        auto v3 = vector::with_capacity(N); // reserve

        • Now, that indeed solves about 70% of the problems. But copy elision is only guaranteed in some contexts, and does not work when the returned value is perfect-forwarded to the destination object:

          // using 'tags':
          std::optional<Vec> ov {std::in_place, with_size, 10}; // no copying or moving

          This cannot be done with static member functions. (I think.)

  3. Gonzalo BG says:

    Good example, that cannot be done with static member functions without involving at least a move. That settles it for me, tags are objectively better. Thanks!

  4. Krzysztof says:

    Isn’t the extra copy guaranteed to be elided in C++17? Does that C++17 feature applies here?

    • According to my reading of P0135R1, it does not. That is, a copy-initialization of an object of type const T using a prvalue of type T is not elided and requires the “materialization of a temporary object”.

    • Correction, as per discussion in this thread, in C++17, the extra copy into a temporary array will be elided. Apparently, somewhere between C++14 and C++17 the Standard has been changed to allow the elision in this case.

      Thanks for raising this.

    • Second correction. In initialization like the following:

      vector<string> vec2 {vec1[2], vec1[0], vec1[1]};

      double copying cannot be avoided. Expression vec1[2] is an lvalue, so when it is used to initialize a temporary there is nothing to elide. Similarly, when the element of the temporary array is referenced through expression *it, that expression is also an lvalue, so it cannot be “elided from”. We do get double copies, at least in the above examle.

  5. The variadic constructor template has another problem: it prevents you using braced initialization for the individual elements. This would be particularly inconvenient if you tried to use the same technique for something like std::map. Currently you can write:

    map m = { { 1, “one” }, { 2, “two” } };

    But this wouldn’t work if you used a variadic constructor template because it wouldn’t be able to deduce the types in the variadic argument list.

  6. dyp-cpp says:

    I was wondering what “nonarrow_convertible” was, but it seems it’s just a typo for “nonnarrow_convertible”; although there seems to be another typo in “nonnarow_convertible” in the definition of the type trait. To me, it might be a bit clearer as “non_narrow_convertible” or just as “brace_convertible”. But there might be corner-cases where initialization is rejected when using {} but not when using () that are rejected not because of narrowing, but e.g. because of ambiguity.

  7. Pingback: The Week in C++ 2016.27: July 3rd-9th | Utah C++  Users Group

  8. Pingback: Avoiding Copies And Moves With auto | Crascit

  9. Anatoliy says:

    `(_vect.push_back(std::forward(vs)), …);` for coming C++ may became unordered if `operator ,` is overloaded for `_vect`’s element type. `push_back` now return reference to newly constructed element. There is a need to use a cast to `void`.

  10. Smoke says:

    It’s a convenience function, syntactic sugar, whatever you want to call it, yes, it cost, but it’s a simple matter of time spent coding, versus runtime execution speed, ie, if you need speed, spend more time programming, otherwise, give it some sugar, and get it done.

    You’re stating the obvious, I wrote a fancy delegate wrapper, I know it’s slower, but, it’s more convenient than writing out all the code manually.

    You’re not some genius that unraveled a great mystery, you’re a DERP…

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s