In this post we will see a library for storing nullable/optional values that can in certain situations replace and outperform Boost.Optional.
What do you use Boost.Optional for? In my experience, the answer was typically one of the following:
- Because my type
T
has no null state (like -1) that would indicate that I have no proper value. - Because, I need to perform a two-phase initialization (I cannot initialize my
T
yet, but I already need it alive). - Because I need an interface that would indicate to the type system that my value may not be there and that its potential absence should be checked by the users.
In this post we will focus exclusively on the third motivation. We sometimes want to change the interface of T
, and not to treat all its values equally, in order to avoid bugs, as described in this post. Sometimes we want to change the type in order to affect the overload resolution, and select a desired overload. I have often seen it used in XML processing frameworks: if an XML attribute is represented as int
in the model, it is simply added when generating the XML attribute:
template <typename T> void output_attribute(output& out, const T& attr) { out.print_attribute(attr); }
But when the attribute is of type optional<int>
, we can add a clever overload that can skip the outputting of an optional attribute:
template <typename T> void output_attribute(output& out, const optional<T>& attr) { if (attr) output_attribute(*attr); }
However, this application of boost::optional
comes with a cost. We now need to store an additional bool
that indicates if we have a valid value. Conceptually, boost::optional<T>
can be represented as:
template <typename T> class optional { bool _initialized; std::aligned_storage_t<sizeof(t), alignof(T)> _storage; };
The consequences of such memory layout have been described in detail here. In short, for scalar types, wrapping them into boost::optional
doubles the memory usage.
Because of this problem, in performance-critical parts of the program, it may be more appropriate to resort to using “magic values”, like -1 in type int
to represent absent values. With this you will bring back the risk of bugs and misunderstandings into the project.
In your program, when you know that -1 can never be a valid value of int
, you can combine both type safety and run-time efficiency by defining a custom type:
class opt_int { int _value; public: constexpr opt_int() : _value{-1} {} constexpr opt_int(int i) : _value{i} {} constexpr bool has_value() const { return _value != -1; } constexpr int value() const { return _value; } // precondition: this->has_value() };
Of course, if you had to write such a class for every optional type T
you have, or for any magic value applicable in a given context, this burden might outweigh the gain in type safety. Therefore, our solution needs to be as easy to use as possible. We would need a class template, similar to boost::optional
:
compact_optional<int> oi;
However, it is not clear how we tell it that it should use -1 to represent a missing value?
One option would be to introduce a type trait that would associate a type with its “special” value. But it is not obvious which value should be selected, and in different places, even within the same program, we may want to use different values for an empty int
.
Provide this value as the second template parameter?
using opt_count = compact_optional<int, -1>; using opt_int = compact_optional<int, std::numeric_limits<int>::min()>;
This seems to work, and has a number of benefits. First, we do not have to store the value anywhere, it is encoded in the type. Second instantiations with different special values are different types and cannot inter-operate with one another: opt_count
and opt_int
represent something different, and whoever wanted to use one in place of the other, has likely made a mistake — now detected at compile-time.
But while it will work for scalar integer types, it will not work for user-defined types like std::string
or type code
(described in this post).
Instead, what we can do as a generic solution, as described by Vitali Lovich here, is to provide a policy parameter for constructing and detecting the special value:
using opt_count = compact_optional<int, my_int_policy>;
What we expect of a policy class is that it defines two static member functions:
struct my_int_policy { static int empty_value() { return -1; } static bool is_empty_value(const int& v) { return v == -1; } };
The first function creates the special value, while the second checks the current value for being special. While in our case the second function could be implemented in terms of the first, in general case we can provide a faster implementation for the second one. Consider storing an optional string, where empty string is a legitimate value, but we can use "\0\0"
for representing the special value (we do not expect null characters inside strings). The policy for handling this can look like this:
struct my_string_policy { static std::string empty_value() { return std::string("\0\0", 2); } // may allocate static bool is_empty_value(const std::string& v) { return v.compare(0, v.npos, "\0\0", 2) == 0; } // no alloc };
Class template compact_optional
And thus, we have outlined a design for a generic library capable of representing optional T
at the expense of sacrificing one value of T
, but with no spacial overhead. You can find the full implementation thereof here. It is supposed to be a replacement for “hidden optional values”, where one magic value is stored in int
; as well as for boost::optional<int>
, where space savings are desired.
Let me give you a short overview of this tool.
1. It has an interface different than that of boost::optional
. This is so for two reasons. First, its semantics differ from boost::optional
, so we want to avoid the confusion. Second, this library has strong focus on being unambiguous and explicit at the expense of scarifying some convenience: this results in different trade-offs in the interface. As in the examples above, you check if the value is present using member function has_value()
and read the value using const
member function value():
opt_int oi; // no value opt_int oj (2); // with initial value if (oi.has_value()) // test for having value int i = oi.value(); // reading value oi = opt_int(3); // set new value oj = opt_int(); // set to no-value
This, at the expense of certain flexibility, avoids two gotchas of boost::optional
. With the latter, many programmers incorrectly assume that the following test:
optional<bool> ob = true; if (ob) ...
checks if we have bool
with value true
. And also, because there is no implicit conversion from T
, we avoid this gotcha. Second, there is no way to get the write access to the contained value. This is important, because unlike in optional
, in our type compact_optional
the state of having no value is part of the contained value. boost::optional
guarantees the following:
if (oi) { // contains value *oi = some_value(); assert (oi); // still contains value }
compact_optional
could not guarantee that.
2. Because from the no-value policy argument we can deduce what T
is being stored, there is no need to provide the first template argument: you only need to provide the policy:
using opt_int = compact_optional<evp_int<int, -1>>;
Here, evp_int
is a policy template: we instantiate it by providing the T
and its value that would represent the no-value state. It will work for some integer types. For other types you will have to define your own policy.
3. In case you want to use boost::optional
or std::experimental::optional
(with their other useful features) but with the interface of compact_optional
a special policy is provided:
using opt_bool_policy = evp_optional<boost::optional<bool>>; using opt_bool = compact_optional<opt_bool_policy>
So, in the general case, the type stored internally, can be other than T
, provided that it can store no less states than T
. In a similar way, we provide a policy that can store a bool
in the capacity of the sizeof
of 1.
4. No relational ops are provided. If you really need to compare this types, you need to provide your own, and decide how you want to handle the no-value state.
5. Similarly to what we did for class template code
in the other post, we provide the second parameter that serves only to discriminate two otherwise identical types:
using opt_count = compact_optional<evp_int<int, -1>, class cnt_tag>; using opt_num = compact_optional<evp_int<int, -1>, class num_tag>;
Both, opt_count
and opt_num
are int
s where -1 indicates a no-value state, but they are different types, and using one in place of the other is a type-system error.
Comparison with Boost.Optional
So far, I have been advertizing compact_optional
, so one might get an impression that it is superior to boost::optional
(or std::experimental::optional
). But this is not the case. The former offers an advantage only in certain situations. Let’s go over their differences.
1. boost::optional<T>
is generic: it will work practically with any T
, to the extent that it is even possible to use type optional<optional<T>>
. With compact_optional
you have to make the choice case-by-case how you want to represent the no-value state. Having a nested compact_optional
is technically possible, but it requires sparing two values of T
.
2. boost::optional<T>
gives you a useful guarantee that if you initialize it to a no-value state, no object of type T
is created. This is useful for run-time performance reasons and allows a two phase initialization. In contrast, compact_optional
upon construction, immediately constructs a T
.
3. Some type T
may not have a ‘spare’ value to indicate the empty state. In such case, compact_optional
cannot help you. In contrast, boost::optional<T>
will work just fine: the additional empty state is stored separately. In a way, boost::optional<T>
can be thought as boost::variant<boost::none_t, T>
.
4. boost::optional<T>
is almost a container, capable of storing 0 or 1 elements. When it stores one element, you can alter its value. It is impossible in compact_optional
: its value and the “container’s size” are one thing. But in exchange, the latter, because it only provides a non-mutable access to the contained value, it can build the return value on the fly, similarly to the proxy reference in std::vector<bool>
, but because we only provide a non-mutable reference, it is much simpler and safer. This is why we can provide a compact_optional
for type bool
(which has no spare value) with the sizeof
of a single char
.
5. compact_optional
has a minimalistic interface: this is also to avoid any surprises. As a cost it is less convenient. Unlike boost::optional
, it does not have operator==
: you have to provide your own comparator and decide yourself how you want to compare the no-value states.
In general, it is expected that in the first pass of the implementation of your program, you will use boost::optional<T>
as a simple ready-to-use solution. Later, if you determine that boost::optional
kills your performance, you can think of replacing it with compact_optional
and how you want to represent the no-value state.
Acknowledgements
Vitali Lovich suggested to use a template parameter for representing the no-value policy (see here). This really improved the design of the library. Vitali has implemented a variation of std::experimental::optional
that implements a no-value policy design; it is here.
In the other direction, towards simplicity at the possible cost of efficiency, one can use a `std::vector` as value carrier for a very simple
Optional
class. That avoids dependency on Boost or alternatively a complicated home-brewed implementation, while still reaping the benefits of exception-free but still guaranteed correct normal case code (only exception when one forgets to check, as is proper). Oh, while I’m commenting, let me just give credit for the general concept of possible-value-class back to Barton and Nackman, namely theFallible
class in their book Scientific and Engineering C++.Can you get an answer to the question? http://stackoverflow.com/questions/34061095/trivially-default-constructible-stdoptional
Hi.
I was reading your implementation of std::optional. What is the reason for having two base classes: `optional_base` and `constexpr_optional_base` ?
Thanks for this question. This is to implement the requirement of
optional
’s destructor:Effects: If is_trivially_destructible_v<T> != true and *this contains a value, calls val->T::~T().
Remarks: If is_trivially_destructible_v<T> == true then this destructor shall be a trivial destructor.
Look at destructor in
constexpr_optional_base
, it is trivial.What I find scarier about the use of std::optional is this:
> struct opt_int {std::optional value;}; –> size==8
> struct opt_int2 {int v;bool v_defined;}; –> size==8
> struct opt_int3 {std::optional value; std::optional value2;}; –> size==16
> struct opt_int4 {int v;int v2;bool v_defined;bool v2_defined}; –> size==12
So the more std::optionals I use per struct, the less compact my struct will become. std::optional is unusable for me because of this.
The reason, opt_int2 has a size of 8 is because of the padding bytes, the compiler inserts at the end. When you change the members within opt_int4 to {int v; bool v_defined; int v2; bool v2_defined; }; then you end up with size==16, too.
Hi
Could you explain the reason for this : “But while it will work for scalar integer types, it will not work for user-defined types like std::string or type code (described in this post).”?
Sure. The quoted remark applies to the following example:
We pass
-1
as (non-type) template parameter. This is in the context of C++11. Passing values as template parameters is restricted to scalar types. You cannot pass a user-defined type like:C++20 allows values of some restricted user defined types as template arguments, but the above example with
std::string
would still not work.I see. Thanks for clarifying! And also for the great article.