Concealing bugs

Consider the following piece of program code.

void process_names()
{
  const char * fileName = "contents.txt";
  const char * foundName = 0; // will be assigned to later
  
  Names names = parse_file(foundName);
  foundName = find(names);
  // ...
}

This program is expected to open file contents.txt and read names from it, then find a desired name and do something with it.

But the program will not do it, because I have a bug in this code. I accidentally passed the wrong pointer to function parse_file: a null pointer. I confused the two variables because their names were so similar (I did not notice that my IDE suggested the wrong name in the hint); the types happened to match…

Now, would you like this sort of bug to be detected at compile time? The correct answer is “yes”, but is it at all possible?

In general, a compiler (or any static analyzer) cannot just warn you that you might have wanted to pass a different variable to a function than the one you did:

void fill (int h, int w)
{
  r.height(w); // did you want to pass h?
  r.width(h);  // did you want to pass w?
  // ...
}

In 99.99% of all function calls you pass the correct variables to functions. If the compiler started emitting warnings in all these cases, there would not be sufficient memory in the buffers to display them, and the usefulness of such information would be zero.

Therefore, to be sure that it gives you no (or almost no) false positives, the compiler has to see something really fishy in the program. Something that almost everyone would agree must be a bug.

One example of such a clear situation is where we define an automatic variable in function scope but never read its value. There must be something wrong with it, because why define a variable if you do not need it? You probably meant to use it but accidentally used something else instead, or forgot to use it at all.

Indeed, going back to our function process_names: if we intended to read from variable fileName only once, and forgot, it is now never read from, and a decent compiler like GCC or Clang (with option -Wall) will report a warning. If you take compiler warnings seriously, you will be alerted that something is wrong with the usage (or lack thereof) of variable fileName, and will have a chance to identify and fix the bug.

Note that in some sense, in this scenario, finding our bug is a side effect of looking for declared-but-unread variables. That is, the core of our bug is not that we never read from fileName: the real cause is that we misspelled the variable name. To understand this better, let’s look at a slightly modified example, where we make a second use of variable fileName:

void process_names()
{
  const char * fileName = "contents.txt";
  const char * foundName = 0; // will be assigned to later
  
  Names names = parse_file(foundName); // bug
  foundName = find(names);
  // ...
  std::clog << "done processing " << fileName << std::endl;
}

Now, the check for unread variables cannot detect anything wrong with our function; it will not be able to help us. But the bug is still there.

However, the check for unread variables is only one of many checks that a compiler can do. If we are lucky, there may be a different warning that checks for something else, which can similarly — as a side effect — detect our bug.

A good candidate for this would be a check for invoking a memory access violation. It is (almost) never the programmer’s intention to invoke one. So, if there is one in the program, we can be quite sure that the programmer intended to write something else, and we can safely warn them. It so happens that a null pointer is being passed to a function that apparently expects a file name. Thus, an invalid memory access might be in the air here, but we (the readers) do not know that, because we cannot see the declaration of function parse_file. In particular, we cannot see the type of the function’s argument.

Invoking memory access violation

For now, let’s use my own implementation of a string:

struct MyString
{
  char str [64];

  MyString(const char * s)
    : str() // zero-initialize the array
  {
    for (int i = 0 ; *s != 0 ; ++s)
      str[i++] = *s;
  }
};

For the record: I do not recommend creating string classes like that. It has a lot of problems. In particular, it makes a silent assumption that the passed null-terminated character sequences will be really short. My only goal is to illustrate my point: we have a string implementation that assumes it will never be given a null pointer, and, based on this assumption, it just dereferences the pointer without any defensive checks.

At least in theory, we could expect that a smart compiler could detect that a program can potentially invoke an illegal memory access. So, here is a full program. We can test it with GCC (-Wall -Wextra). For the results see here. There is no warning, and when the program is run, we get a “Segmentation fault”. The same with Clang (see here): no warning.

One may now wonder: should it not be quite easy these days to detect a memory access violation confined to a single file, just by inspecting the code? Well, conceptually these checks are doable, but they are somewhat outside the scope of the compiler. A compiler has to generate efficient code, but it also has to generate code efficiently. Compilers do some extra work to emit warnings, but the checks they perform are limited to things that fit into the normal compilation flow. So, let’s try a dedicated static analyzer: clang-tidy. (I am using LLVM version 3.4.) These are the results:

$ clang-tidy test.cpp --
test.cpp:26:3: warning: Single-argument constructors must be explicit
  MyString(const char * s)
  ^
test.cpp:29:22: warning: Dereference of null pointer (loaded from variable 's')
    for (int i = 0 ; *s != 0 ; ++s)
                     ^
test.cpp:63:3: note: Calling 'process_names'
  process_names();
  ^
test.cpp:53:3: note: 'foundName' initialized to a null pointer value
  const char * foundName = 0;
  ^
test.cpp:56:28: note: Passing null pointer value via 1st parameter 's'
  Names names = parse_file(foundName);
                           ^
test.cpp:56:28: note: Calling constructor for 'MyString'
  Names names = parse_file(foundName);
                           ^
test.cpp:29:22: note: Dereference of null pointer (loaded from variable 's')
    for (int i = 0 ; *s != 0 ; ++s)

It finds more than I wanted (the implicit conversion is there on purpose), but it finds our bug. Or, to be more precise, it finds a memory access violation, but by following the message chain we will reach the place where we misspelled the variable name.

A static analyzer can find our bug, owing to the fact that our implementation of a string-like type — MyString — can potentially dereference a null pointer. This is not a bug in the implementation of MyString’s constructor. But invoking the constructor with a null-pointer value is a bug.

Defensive checks

Now, instead of using a hand-crafted MyString, let’s try our buggy function with std::string. In the test program we only need to change one alias (and include the corresponding header):

typedef std::string String;

When I run clang-tidy on the test program using std::string, I get a number of warnings about the contents of header <string>, but the interesting warning about memory access violation is not there!

What if I compile the program with GCC? No warning or error. What if I run it? I get the following results:

terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_M_construct null not valid

Aborted

This means that std::string’s constructor threw an exception. Indeed, if we look at the implementation of the constructor in libstdc++ (see here), it calls an auxiliary function, _S_construct (or _M_construct in the trunk), and this function performs a defensive check for a null pointer and throws an exception.

Is the Standard Library implementation allowed to do that?

According to the C++ Standard, initializing an std::string with a null pointer is undefined behavior. This is not specified directly, but is implied by the specification:


basic_string(const charT* s, const Allocator& a = Allocator());

Requires: s points to an array of at least traits::length(s) + 1 elements of charT.

“Undefined behavior” means that the implementation is allowed to do anything. In particular, it is allowed to throw an exception. So, the authors of std::basic_string in libstdc++ decided to throw one, and for a good reason. This addresses (in some sense) the situation where the user forgets about the constructor’s requirement and, accidentally or ignorantly, passes a null pointer nonetheless. In this case an exception is thrown and the memory access violation is prevented.

But this defensive check has a negative side effect: it prevents the static analyzer from finding the bug. Now there is no potential memory access violation that the analyzer can see. Instead, the bug detection is postponed until run time.
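To make the mechanism concrete, here is a sketch of what such a defensive check looks like when retrofitted onto a MyString-like type. CheckedString is a hypothetical name of my own; libstdc++ performs its actual check inside _M_construct, not like this.

```cpp
#include <stdexcept>

// Hypothetical variant of MyString with a libstdc++-style defensive check.
// The Standard does not require this behavior; it is one implementation choice.
struct CheckedString
{
  char str [64];

  CheckedString(const char * s)
    : str() // zero-initialize the array
  {
    if (s == nullptr) // defensive check, in the spirit of libstdc++
      throw std::logic_error("CheckedString: null not valid");
    for (int i = 0 ; *s != 0 && i < 63 ; ++s)
      str[i++] = *s;
  }
};
```

Note that the analyzer now sees a branch that handles the null pointer, so the potential dereference it warned about is gone, and so is the warning.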

Interestingly, if there were no defensive checks (as in the original test example) and we got a segmentation fault, that would also be due to a run-time check, one performed by the operating system. So the only added value of throwing an exception is that the program can intercept the exception and respond at run time.

If the bug is found during static analysis, or at compile time, the natural response is for the programmer to go in and fix it. But if the bug is found at run time, there is no right response. Remember that a memory access violation is only a symptom of the bug, not the bug itself. The bug is somewhere else, perhaps in a remote place of the program. Owing to the static analyzer’s output, the programmer can trace back from the symptom to the root cause. At run time you only detect a symptom and nothing more. Even if you catch the exception at some point, the bug may still be outside the try-catch block, and the program may still be suffering from other symptoms, not detectable by defensive if-statements.

In fact, a program halt and a core dump might turn out to be a better solution, because the dump might contain sufficient information for the programmer to pin down the bug.

A comment inside std::string’s constructor in libstdc++, just above the defensive check, says:

// NB: Not required, but considered best practice.

I have doubts about such “best practice.” In a particular case, this may be the thing you need to resort to, but it comes with consequences. It is controversial.

One other interesting thing to be observed here is that there are different “kinds” of undefined behavior. UB at the C++ Standard level (like the one in std::string’s interface) is not necessarily UB at the compiler level. The compiler or the Standard Library implementer can legally choose to make a Standard-level UB a perfectly well-defined behavior on a given platform. In such a case, even a UB-based sanitizer cannot detect it. For instance, if I test the std::string version of my test program with GCC’s UB sanitizer (option -fsanitize=undefined), or Clang’s UB sanitizer (provided that I use libstdc++), I get no UB notifications. (See here for the GCC test.)

Concealing bugs

But a defensive check and an exception throw, while not ideal, are still a way of signalling that something has gone wrong. There exists another way of approaching the problem of potential memory access violation, which people sometimes choose, but which makes bug detection even harder. It can be summarized as: “if a given input would be dangerous, behave as though we were given another, safer, input.”

At some point this was considered as an addition to the to-be-standard string_view. The idea is the following. Because this works:

int n = 0;
std::string_view sv = {"file.txt", n}; // only first n chars
assert (sv.size() == 0);

And this works:

std::string_view sv = {nullptr, 0};  // an empty view
assert (sv.size() == 0);

The proposal was to make the following work as well:

std::string_view sv = nullptr;

With semantics identical to the previous example.

The idea is noble: rather than crashing or disturbing the callers with annoying exceptions, try to do something. Quietly. The effect is, however, that if I change my example to use std::string_view with this “improvement”, I get neither a static analyzer warning nor any notification at run time. (See here.)

The function behaves as though it were correct; it is doing something. But it cannot be doing the right thing, because we know it has a bug. The function now keeps this a secret and never notifies anyone.

Assisting the analyzer

In order to assist a static analyzer or a UB sanitizer, you might want to explicitly put some indications of illegal program states inside the code. One of the tools for that in GCC and Clang is __builtin_unreachable(), which says, more or less, “I claim that the control flow will never reach this statement.” Now the tools can make use of your claim:

  1. They can check whether you are correct (this is a no-trust mode), and/or
  2. They can use your claim for optimizations (this is a trust mode).
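For illustration, here is a sketch of how a function might state the no-null claim. The function length is a hypothetical helper of my own, and __builtin_unreachable() is a GCC/Clang builtin, not standard C++:

```cpp
// The claim: callers never pass a null pointer. In "trust mode" the
// compiler may optimize assuming the null branch never runs; in
// "no-trust mode" a sanitizer can verify the claim at run time.
inline unsigned length(const char * s)
{
  if (s == nullptr)
    __builtin_unreachable(); // claim: control flow never reaches here

  unsigned n = 0;
  while (*s != '\0')
  {
    ++s;
    ++n;
  }
  return n;
}
```

Compiled with -fsanitize=unreachable (part of UBSan), actually reaching the builtin is reported at run time; without the sanitizer, the compiler simply assumes s != nullptr.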

In order to give the programmer even more tools for giving hints to machines, the C++ Standard Committee is working on adding a new feature to the language, known as contracts. It allows you to give a clear hint to static analyzers, compilers and other programmers that certain values of objects are considered erroneous at certain points in program execution, e.g., at the point of passing arguments to functions. Class MyString could use a contract in the following way:

struct MyString
{
  char str [64];
 
  MyString(const char * s)
    [[expects: s != nullptr]];
};

The expects inside the attribute means that we are defining a precondition: it is a bug to call this constructor with a null pointer. A static analyzer can make use of this information. The precondition is tied to the function declaration, which means that a static analyzer does not even have to peek inside the function body (it might be contained in a different file) in order to detect invalid usage.

Contracts allow you to take the best of both worlds: you can at the same time instruct the static analyzer about potential invalid usages of the function and request that the compiler insert a run-time defensive check. This latter possibility is entirely optional: if you do not want even the slightest run-time overhead, you can tell the compiler not to inject any defensive checks. But contracts belong to the future.


21 Responses to Concealing bugs

  1. Zbyszek says:

    I wanted to write “Simply use std::path for file names,” and then I realized it’s not there yet (C++17).
    But really, to protect from this kind of problem, use proper abstractions (not sure if this is the correct word to use here) like boost::path or, if a raw pointer is required, gsl::not_null; or, as it is written in one of the Stack Overflow questions, “Pointers that cannot be null are called references.” (http://stackoverflow.com/questions/33306553/gslnot-nullt-vs-stdreference-wrappert-vs-t)

    • Basically, you are saying that many bugs are avoided if we use more dedicated types with a narrower interface: an interface that allows doing only the right things, and prevents doing fishy things — all at compile-time. Right?

      (Note: this is something I have covered in this talk.)

      I agree. And there are many pieces of good advice to be given that would help avoid the bug from my example. The point of this post was something different, however. Even if you fail to apply some good practices, good tools can find a bug in your code, but only if you do not interfere with them.

      As a side note, Boost.Filesystem and std::filesystem also suffer from some type safety problems:

      #include <boost/filesystem.hpp>
      
      void open_file(boost::filesystem::path) {}
      
      int main()
      {
          const char * foundName = nullptr;
          const char * fileName = "contents.txt";
          open_file(foundName);
      }
      

      This compiles and crashes.

      Live example here.

  3. bootrom says:

    I agree that not doing too many safety checks simplifies code and allows static analyzers like SAL to do their job. And they report problems before runtime, which is my personal goal, because my embedded code can’t be extensively tested for all code paths at runtime.

    But if you write some text parser, you are free to choose another approach, because in that case it is easy to create an automatic test suite with a huge number of tests and run it for each version.

    • I am not sure what you mean in the second paragraph. If you can afford unit-tests, you can still perform static analysis. And in that case, you had better be careful not to sabotage the analyzer by inserting defensive if’s which hide the useful information.

      Of course, you can do other run-time checks that still expose enough information to the analyzers. They just need to be clever.

      • bootrom says:

        In some cases I was not able to adapt the source code to have both: defensive “if’s” (producing errors during test runs) and static analysis (showing errors without running the code). In this situation one must decide on one approach, and the decision is based on the project’s nature and properties.

        • If you do not mind, could you give me an example of a situation where you need to make this kind of choice?

          To the extent that I can imagine the problem, any defensive if could be replaced with a macro:

          int fun(int * p, int i)
          {
            VERIFY(condition(p, i));
          }
          

          And this macro, depending on the build mode, would expand either to:

          if (!condition(p, i))
            runtime_action(); // maybe throw?
          

          or to:

          if (!condition(p, i))
            __builtin_unreachable(); // or sth similar
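          Put together, a complete sketch might look like this. VERIFY_TRUST is a hypothetical switch of my own (a real build might key it off NDEBUG), and the abstract condition(p, i) from the snippet above is replaced with a concrete null-pointer check for illustration:

```cpp
#include <stdexcept>

// Hypothetical two-mode check: in trust mode the condition becomes an
// optimizer/analyzer hint; otherwise it is a run-time defensive check.
#ifdef VERIFY_TRUST
  #define VERIFY(cond) \
    do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
  #define VERIFY(cond) \
    do { if (!(cond)) throw std::logic_error("VERIFY failed: " #cond); } while (0)
#endif

int fun(int * p, int i)
{
  VERIFY(p != nullptr && i >= 0); // a concrete stand-in for condition(p, i)
  return *p + i;
}
```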
          
  4. Very interesting analysis of what checks there are other than the compiler’s, which makes this post stand out, in my opinion.
    As you suggest, compilers’ checks remain the best option when applicable, though. I would tend to think, however, that the cases where compiler-checking the order of the arguments would be overkill are fewer than 99.99%. The example you’re pointing at, with width and height, is indeed easy to write correctly, because they are setters, and setters carry a name. But this is not the case for constructor arguments (like Rectangle(int w, int h)), for which I find that making parameters explicit and enforcing compiler checks really brings value. This could be implemented with strong types, for example (I can point you to dedicated blog posts, should you have an interest in this).

    The fact that the issue with MyString can be solved with the GSL’s not_null shows that you’re touching a hot topic. I realize that this goes beyond simply wrapping a type into a strong type to give it a specific name. It’s more about wrapping your type in one that provides specific semantics.

    • Jonathan, thanks for your comment. Indeed, more dedicated types remove some of the bugs once and for all. And I try to use them where I can. For instance, I am using my hand-crafted version of a strongly-typedef-ed bool, described here.

      Yet, I think some bugs cannot be solved with this technique. For instance, accidentally mistyping

      bool rslt = x < y;
      

      for:

      bool rslt = y < x;
      

      You will not be inventing two different types for the left-hand-side and the right-hand-side arguments of operator <. It is not that visible for operator <, as we have been taught to work with it since childhood, but consider this:

      std::is_base_of<X, Y>::value;
      

      Is X supposed to be a base or a derived class? I always have problems remembering these.
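      (For the record, the base goes first. A quick static_assert check, with hypothetical types B and D; this documents the convention, though it cannot prevent swapping the arguments by accident:)

```cpp
#include <type_traits>

struct B {};     // base
struct D : B {}; // derived

// std::is_base_of<Base, Derived>: the base class is the FIRST argument.
static_assert(std::is_base_of<B, D>::value, "B is a base of D");
static_assert(!std::is_base_of<D, B>::value, "D is not a base of B");
```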

      The only solution to these problems (unit-tests aside, I need them to be detected earlier) is to find them as a side effect of looking for something else, like unused variables, UB, etc. And for this, I need the programmers not to interfere with defensive if’s. Some of them spoil the static analysis.

      • Krzysiek says:

        Andrzej, on a side note:
        would it be possible to rewrite std::is_base_of to look like this:
        std::is::base_of::value;
        ?

        • Krzysiek says:
          std::is<X>::base_of<Y>::value;
        • That indeed looks less ambiguous. But knowing the rules for dependent names in templates, it would probably be:

          std::is<X>::template base_of<Y>::value
          

          But in general, I agree with your direction. This:

          return l <= x && x <= h; 
          

          is inferior to:

          return closed_interval{l, h}.contains(x); 
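          A minimal sketch of such a closed_interval type, assuming it is hand-written (hypothetical; it is not a standard component):

```cpp
// Hypothetical closed_interval: named semantics for l <= x && x <= h.
template <typename T>
struct closed_interval
{
  T lo, hi;

  bool contains (const T& x) const
  {
    return lo <= x && x <= hi;
  }
};
```

          With C++17 class template argument deduction, closed_interval{l, h} works as written above; on earlier compilers, spell out closed_interval<int>{l, h}.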
          
  5. Jesse Maurais says:

    void process_names()
    {
    const char * fileName = "contents.txt";
    Names names = parse_file(foundName); // error: undeclared identifier
    const char * foundName = find(names);
    // …
    }

    The rule of thumb in C++ since as far back as I can remember: Never declare variables until you need them. That’s it. You’re done.


  8. Aleksander Fular says:

    I have one question – somewhat related – regarding exceptions. This may be a bit of a wide topic. 🙂
    Do you think exceptions are a good thing, or is the approach taken by Rust – ban exceptions, return Result – better?
    There are advantages to both, but which do you think is better – or just which would you rather see? 🙂

    Thanks,

    • From what I know, Rust also offers a “panic” mode, which works much like exceptions, except that you cannot catch them.

      I have been using exceptions for so long, and have learned the ways of dealing with them, that I cannot honestly compare them to other methods, which I only know superficially.

      Interestingly, I have recently had a conversation with Niall Douglas, who is working on the Outcome library, which tries to implement Rust’s error handling in C++. The thread can be found here. He argues that there are sections in a program where local handling of errors and explicit control flow (as offered by Result) is more convenient, whereas in other sections you will find exceptions more convenient. You do not have to decide on one. With a tool like Outcome, he claims, it is easy to move from one technique to the other at any level of the application.

      I had a negative experience with return codes recently. I was writing a function that parsed a complicated XML file, reported erroneous structure or values, and built a data structure representing the XML. I decided that the different validation failures would be implemented as error codes, because there is nothing exceptional in discovering a malformed request. It is not quite the same as Result, but it shares the same characteristic: you have to explicitly react to every error condition. My code turned out to be so long and difficult to read that I was forced to use exceptions, even though I had no exceptional situation.

      Also, when I look at the Rust examples, like this one, I am not convinced. Note that they had to use a comment to illustrate what the function is doing, as though the authors themselves didn’t believe that one can easily grasp the meaning.

      The useful advice I can give on using exceptions is not to catch them. Or rather, catch them only in a small number of places, like when recovering from a failed transaction in a server.

      • Aleksander Fular says:

        Thanks for the prompt response,

        That was just a tutorial! They put comments in, as you do in your blog posts. 😉

        @Outcome: I feel that the main power of Rust’s Result is that there is no choosing between the exception and Result approaches. In Rust code you know no code can throw an exception; code simply follows a WYSIWYG approach, with no hidden returns from the middle of functions because something threw an exception. And Rust forces users to handle all branches in the code: when you get a Result from a function, you have to explicitly unpack it or do pattern matching, whereas in C++ you can simply ignore it and dereference it.

        • Interesting. So, you are saying your program will not compile unless you say what action you want to take upon Err?

        • Aleksander Fular says:

          I could not reply to your message directly, no reply button? 🙂

          Yep, the program will not compile.
          I think Rust would be a very interesting toy for you to play with, as the compiler there does a lot of checking for you. (Never invalid references!)

          I shortened the example slightly: https://is.gd/WYRCnq. It shows what happens when you don’t handle one case of the Result: even though the function always returns with some value, there will be a compiler error. That can be worked around by using the ‘_’ matcher, which is similar to a ‘default’.

          This mechanism forces you to always write code handling all possible branches. Of course it may lead to boilerplate code, especially when propagating errors; there is a built-in try! for that case.

          This is what makes Rust’s Result probably better than anything that C++ could offer in this manner.
