Help the compiler warn you

Compiler warnings are a very useful tool for detecting bugs in your program. Because you can enable them selectively, and because you can choose to turn them into hard errors on your compiler, you can in fact build a dialect, which is a safer subset of C++. You can enable even more warning if you assist your compiler a bit and instruct it about your intentions. We will see how to do it in one case.

I have a function in my program for splitting strings. It has the following interface:

template <typename... T>  
bool split(string_view s, char sep, T&... tokens);

You can use it like this:

int a, b;
double c;
char d;
string s = "1|99|0.25|X";

if (split(s, '|', a, b, c, d)) {
  // use a, b, c, d
}  

That is, first the string to split, second the separator by which to split, and then a variable number of arguments of different types taken by reference. From the number and the types of the out parameters the function determines how many tokens are expected and in what format. While parsing the input string the function can determine if the split succeeded. In a way, it does two things.

  1. It checks if the number and the format of tokens in the string is the one expected.
  2. If so, it populates the variables from the scope with the corresponding values.

If the string is determined not to satisfy the expectations, it is not an “error”. Someone might be calling this function only to determine if the string is in the given format — nothing more. Thus, exceptions would not work here. Since the numeric values are communicated through out parameters, we can use the return value to indicate if the split was possible or not. However, there is a danger here: you cannot use values a, b, c or a unless function split returned true. But it is obvious that in a big program, sooner or later someone will forget about it, and he will just take for granted that the split must work, and will use the variables unguarded:

int a, b;
split(s, '|', a, b);
return b - a;

One might be tempted to just assign the initial values to a and b in order to “avoid” the “problem” later. But that would be very unfortunate. Using values a and b only makes sense when we know we have read them from the string. If not, whether they have some initial values taken out of the blue by the programmer or whether they have unspecified values is the same: in either case we have a logic error. However, if we go with unspecified values, we increase the chances that this bug will be found statically by the compiler or a tool like static analyzer.

Now, it might look like I am questioning the C++ Core Guideline ES.20 (Always initialize an object). But the real essence of the guideline is, “assign initial value upon declaration provided that you know this value”. If you do not have your value yet, and you already need the object (and its address), just initializing any value is not going to do any good.

So, we will likely get into the problem of using values we have not read. Exceptions would have prevented that, but using them is an overkill: there is nothing exceptional about a string (probably read from some config file, or user) not being in the expected format. Also note that sometimes we do have good defaults, and we only want to depart from them if the string matches the format:

int lo = 0, hi = 100;
split(s, '|', lo, hi); // ok to fail
return lo <= val && val <= hi;

Since C++17 you do not need exceptions to prevent unchecked failures from breaking into the program. We now have attribute [[nodiscard]]. We can use it to annotate our function:

template <typename... T>  
[[nodiscard]] bool split(string_view s, char sep, T&... tokens);

Not reading the value returned by this function results in a compiler warning. And if you turn warnings into errors: in an error. The user has to respond to this. For instance, with an if-statement:

int a, b;
double c;
char d;
string s = "1|99|0.25|X";

if (split(s, '|', a, b, c, d)) {
  // use a, b, c, d
}
else {
  throw bad_configuration{};
}  

Now the user is throwing an exception. This makes sense. It is the user who judges whether the situation is exceptional or not — not the implementer of function split. But the warning will now also appear in the cases where we consciously decided not to check the result:

int lo = 0, hi = 100;
split(s, '|', lo, hi); // ok to fail
return lo <= val && val <= hi;

Luckily, they usually go in pairs: a warning, and a construct to disable the warning. In our case, the idiomatic way to disable the warning is the cast to void:

int lo = 0, hi = 100;
(void)split(s, '|', lo, hi);
return lo <= val && val <= hi;

C++17 is quite new, but the attribute itself has been around for a while longer. In MSVC it is available since version 2017.3. In Clang it is available since version 3.9 (not restricted to C++17 mode); in GCC, since version 7 (not restricted to C++17 mode). In earlier versions of GCC (and therefore also in Clang) we have a compiler-specific alternative: attribute warn_unused_result:

template <typename... T>  
__attribute__((warn_unused_result))
bool split(string_view s, char sep, T&... tokens);

The only (sometimes annoying) difference is that you cannot silence this one with the cast to void, you need to read it. As per the explanation in bug report, attribute warn_unused_result is treated very strictly. No one should ignore the value, even consciously. You would use it for functions like malloc.

Advertisements
This entry was posted in programming and tagged , , , . Bookmark the permalink.

6 Responses to Help the compiler warn you

  1. Adam Badura says:

    In my experience possibility of ignoring the error / status result is always among the arguments given for use of exceptions. This would be a counterargument for it. Then again it is silenced way easier (including not noticing it while browsing through code) than you would need to silence an exception.

  2. richtw1 says:

    Over on the Reddit thread, someone suggested the following alternative, which I really like:


    template <typename... T>
    auto split(std::string_view s, char sep)
    -> std::optional<std::tuple<T...>>;

    if (auto tokens = split<int, int>(s, '|')) {
    auto &[a, b] = *tokens;
    // ...
    }

    • Thanks. This is a nice solution! (Although it does not illustrate my point well: that [[nodiscard]] is useful.)

      • robin says:

        I may be wrong, but when an attribute [[nodiscard]] is used, aren’t we always in this situation? I think the only place where we want to use [[nodiscard]] is where we return an error code. And in this case, I think that either exceptions or the technique of richtw1 is better.

        • The first STD function to use [[nodiscard]] is likely to be async, in order to avoid the problem described at the bottom of this post. And it has nothing to do with error codes. Similarly, function operator new would be a good candidate.

      • And it will not address my other use cases for function split():

        struct Range { int min, max; };
        
        void update (Range & r)
        {
          if (split(get_config(), r.min, r.max)) ...
        }
        

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s