Bug of the week

Today we are going to see a case study illustrating a bug. Not a very spectacular one: a typical bug one would encounter in everyday work. We will start with the symptoms, identify the root cause, and suggest measures to prevent similar things from happening in the future.

The program in question, among other things, displays times of day with minute resolution. Because there is a lot of information to be displayed, and the output needs to be compact, the program does not separate hours from minutes, like "11:30", but glues them together, like "1130". The interpretation is obvious and this format is common in our industry. Even when we see a three-digit number like "130" it is unambiguously identified as 01:30.

At some point it was observed that the program sometimes outputs different times than would be expected. Sometimes we get the wrong minute: "100", expected "140". Sometimes we get the correct minute but wrong hour "310", expected "510". Sometimes the time is correct, and sometimes everything is wrong.

If we list the examples along we may start to observe the pattern.

actual output expected output
 045  045
 100  140
1010 1650
1225 2025

The actual output is always smaller or equal to the expected value. The difference grows as the actual time grows. The bug is reproducible and the difference appears to only depend on the value of the actual time of day. Therefore it should be easy to identify the place in the code responsible for the situation.

Indeed after a short while, we discover that someone decided to store the time of day as int denoting “minutes since midnight.” Not a bad idea: it optimizes the storage capacity and we can easily compute the difference between two times (on the same date). But the downside is that somewhere in the program we have the following piece of code:

struct flight_data
  // ...
  date arrival_date; // no problem here
  int  arrival_time; // minutes since midnight

void display(const flight_data& f)
  // ...
  cout << f.arrival_date;
  cout << f.arrival_time;

This explains the discrepancy. The expected 01:40 a.m. is 100 minutes after midnight. Therefore we see "100" instead of "140" .

At this point, someone might just change function display to:

void display(const flight_data& f)
  // ...
  cout << f.arrival_date;
  cout << minutes_as_HHMM(f.arrival_time);

and consider the matter closed. It could be ‘closed’ in the sense that when we test again, the numbers will be reflecting the expected values, bug fixed, bug statistics improved…

But we can clearly see that since it was easy to make a mistake, it is not unlikely that in some other place in the code someone else also fell into the same trap. Even if there is none now, it is likely that it will occur in the future, because it is so easy to get it wrong.

Just fixing the bugs “for now” is not satisfactory. I strongly believe that we should aim at something more: design the code in such a way that the omissions like the one above are impossible, or difficult to achieve. The developer’s goal, as I see it, should be to not have bugs in the first place rather than to plant bugs and then fix them.

One thing we can fix immediately is to rename the member variable:

struct flight_data
  // ...
  date arrival_date;
  int  arrival_time_as_minutes_since_midnight;

The variable name sends a clear message, plus, if we do it, the compiler will show us every place where it is referred to by the old name and we can decide if there are other places that need fixing.

However, function display in real life is not as short and simple as we have seen it above. It is more likely, or at least conceivable, that you would see something like this:

void display(const flight_data& flt,
             bool abbreviate, 
             const supplementary_data& suppl)
  bool do_vca = supl.do_any_tla;
  bool do_strict = suppl.force_strict || !suppl.aux_vec.empty();

  if (!abbreviate || suppl.traffic && suppl.traffic->do_tse)
    do_strict = true;
    do_vca |= is_interline(flt);
  cout << boost::format("%1%;%2% %3%") 
    % boost::str(boost::format("%1%%2%") 
      % (do_vca ? flt.status : extend_status(flt.status)
      % suppl.alteration_str)
    % boost::str(boost::format("from %1% %2% %3%")
      % flt.departure_apt % flt.departure_date
      % (do_strict ? flt.arrival_time_as_minutes_since_midnight
        : flt.estimated_arrival_time))
    % boost::str(boost::format("to %1% %2% %3%")
      % flt.arrival_apt % (flt.departure_date + flt.dep_offset)
      % (do_strict ? flt.departure_time
        : flt.estimated_departure_time));

If you ever need to read through what this function does, and working in a big project I really have to analyze such things, you will be distracted by a number of things. First, just the amount of symbols packed like this is distracting. Second, in line 10, the statement is indented as though someone wanted to execute it conditionally. But because there are no braces in this if-statement, it will not be the case. So, we have a bug either in the logic flow or in the indentation.

Next, you will notice I used boost::format. Do you know this library? If not, now you will be distracted. You will have to go out of your way in order to spend a couple of minutes checking what it does. If you do know it, you will wonder why it was used in this unconventional and suboptimal way. A bug, or was there some purpose?

In the end, you will be so distracted by these things that it may slip your attention that we are streaming an integer that we do not want displayed as an integer. You will have to rely on the assumption that if it compiles, hopefully it does the right thing.

Employ the type system

A simple way to avoid the above problem is to make use of the type system. Introduce a new type for storing information about minutes elapsed since midnight:

class Minutes_since_midnight
  int minutes;
  // invariant:  minutes >= 0  &&  minutes < 24 * 60

  explicit Minutes_since_midnight(int m) : minutes(m) {}
    // requires:  m >= 0  &&  m < 24 * 60

  int as_int() const { return minutes; }

  // and relational operators ...

This is something similar to an opaque typedef. It is more however: we can associate an invariant with it. Now the type, apart from being distinct, also conveys the information that we are only interested in the subset of values of int.

Now, we can use the type in our program:

struct flight_data
  // ...
  date arrival_date;
  Minutes_since_midnight arrival_time;

void display(const flight_data& f)
  // ...
  cout << f.arrival_date;
  cout << f.arrival_time; // compile-time error

An attempt to stream out Minutes_since_midnight will result in compiler error, which will offer us an opportunity to think how we want to proceed before the program is built. One tempting option is to define a dedicated streaming operator for our type, but personally I doubt there is one good default way of displaying such a type. I would rather rely on converting functions, like minutes_as_HHMM.

Is it worth the trouble?

Type Minutes_since_midnight does not add any run-time performance hit compared to an int, so one may think there is no practical trade-off involved here. But the truth is there is one trade-off at stake, and it looks like it is sometimes non-trivial. The safety measure we have seen above requires that we define a new class, whereas type int is available out of the box, for free. The cost of adding a new class is the developer’s additional time. We now have to consider all the guidelines related to defining classes: which members should be private; should we define a custom copy constructor? If the policy in your work requires adding a .h and .cpp file for each class, it means adding new files. Then, new includes, then documenting it, then new unit tests. Look, a class is additional code which could contain bugs…

So you might ask the question, is the potential gain in safety worth the investment?

Everyone will have their own answer. Mine is this. If you do not make mistakes in your programs, it is not worth the trouble. After all this technique is only about preventing mistakes (which are not guaranteed to be there). If you are writing a small toy or test program for yourself exclusively, you probably don’t mind the bugs. But if the program is for customers other than yourself, and if the consequence of the bug is potentially severe, ignoring this technique appears like a strange thing to do; unless you know a superior solution.

This entry was posted in programming and tagged , , . Bookmark the permalink.

11 Responses to Bug of the week

  1. I’m agree that adding new type is the right think to do in this case. In your article you have used entire paragraph to explain how and why you choose custom format for displaying daytime. This information must be documented in some form anyway. It can be just a comment you already have near arrival_time field. But comment is the weakest possibility. If possible it should be encoded with strong type and allows compiler to do his part of work.
    On the other hand, if we will use similar classes a lot we can get right decisions quickly. Strongly typed classes usually does not need custom behavior. For example they don’t need to support arithmetic operations. I think their responsibility are to only maximum possible restriction on a generic int type. We usually don’t need just another fully integrated c++ type but only int or string wrapper.
    Many data model classes contains a number of custom fields like: name, id, address, price etc. In most cases int and string can be used to represent corresponding values. But if we choose to use custom types for them we can eliminate entire class of “typo-like” errors.
    For example, we never need to look at a first_name property as a collection of characters. So most operations available for string are useless or logically invalid for first_name.
    Consider another not related side for this approach. If we use custom types for most fields in our data type classes and combine it with boost fusion library we will get reflection possibility for our c++ classes without any runtime overhead.

  2. I think the simplest solution for this problem is using some well known library like boost or Qt, they already provide us well designed, robust, easy to use api to tweak with the date/time

  3. Brent Friedman says:

    Consider this alternative implementation of class Minutes_since_midnight:

    enum class Minutes_since_midnight : int;

    Minutes_since_midnight mins{60};

    Incredibly succinct and easy to read. It’s also less flexible of course.

  4. kaist says:

    This will compile:

    auto mins = Minutes_since_midnight(60);

  5. Rob G says:

    Ah, so “Uniform Initialization” is now “Uniform Initialization, except for enum’s, and classes with constructors defined”. Sounds about right for C++ – no consistency anywhere.

  6. Vincent Jacquet says:

    I agree with what you say but there is one thing bugging me. Aren’t you breaking encapsulation by leaking the representation in the name of the class?

    • A very interesting question. It probably deserves a separate post. I do not feel I am leaking representation in this case. I also cannot tell with certainty that the implementation is not leaked, as I do not know any “objective” ways of determining it. If I need a type that will tell me how many minutes elapsed since midnight (and that would guarantee that the computation of the difference between two values will be fast), how else would I call it? Or does your question imply that a name HHMM or HoursAndMinutes would better reflect my idea?

      Anyway, I have a more general comment here. One can set forth a number of different goals that one wants to achieve when writing a program. But not all of them are really valuable or worth pursuing. Goal “reduce ambiguity” or “eliminate potential confusion” is something that I understand, and desire. Goal like “do not break encapsulation” sounds really abstract to me. I can see how the former helps avoid bugs. Regarding the latter, I fail to see the immediate connection. If by the introduction of type Minutes_since_midnight I reduce ambiguity and avoid potential bugs, and at the same time I break encapsulation, then I consider it the right trade-off.

      • Vincent Jacquet says:

        For me, encapsulation is more a tool than a goal. I’d say a goal would be “No raw primitive types”.

        As for the name, if you want to represent the number of minutes since midnight, then Minutes_since_midnight is fine.
        But if you want to represent the time of the day, then I would propose Time_of_day .

        To challenge my designs, I usually play “what if”.
        a) What if I need seconds?
        b) What if I measure that 80% of the time is spent on converting to and from string, and only 20% in computing the difference of two values?
        c) What if I need to make my type as small as possible on a memory constrained environment?

        With Minutes_since_midnight, you can do (c); you could do (b) but then the internal representation might not actually be the number of minutes since midnight, which would be unexpected; and, finally, I do not know how you could do (a).
        This is what I meant by breaking the encapsulation: you almost cannot change the implementation, as if you made the field “minutes” public.

        BTW, I could do (a), (b), (c) with Time_of_day but I could not do any of them with the raw primitive “int”.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s