Update. The advice I gave below using Boost.Optional was invalid and potentially bug-prone. It is now changed, to reflect my experience with using Boost.Optional.
In this post I wanted to show, by a not-so-short example, how the decision to signal the failure of the function to compute the result via magical return value could have caused a disaster. This example is not a fiction. I faced this situation while maintaining a life-critical application.
Programs — obviously — are different, subject to different constraints and different expectations. The program in this story is a tool (a specialized calculator) for checking if an aircraft loaded as specified (with passengers and cargo) is safe to fly and should be allowed to take off. The program is life-critical (its malfunction can cause death or injury to people or damage expensive equipment) but is not real-time: if it stops working, the user can perform the computation manually, on a sheet of paper. Therefore, it is acceptable to occasionally gracefully close the application with an apology message.
One of the safety checks the program makes is whether the total weight of things loaded, including the aircraft itself, does not exceed certain maximum value. Thus, somewhere in the code there is a function that computes “the gross aircraft weight”:
double FlightPlan::grossWeight();
The choice to represent weight as a raw (64-bit) floating-point variable is fishy already. There is an issue of comparing for equality ‘almost equal’ quantities, loosing precision as lots of divisions and multiplications come into play. But you might expect that whoever made the call to use double
s to represent weight in life-critical weight and balance calculations studied the problem sufficiently to make sure that these issues will not affect the expected correctness of the results. Right…
So, how is our function grossWeight
implemented? The thing is that the implementation is so ‘professional’ and so ‘modern’ that noöne knows exactly what the function is doing. In particular, occasionally when executing the function, it turns out that the complex data structure behind it is corrupt (incorrectly or partially initialized, partially destroyed, or who knows what) and it is impossible to compute and return the correct result. To make our example simple, we can represent the situation by the following code:
double FlightPlan::grossWeight() { if (Guts* guts = getGuts()) { return guts->computeGrossWeight(); } else { DO_WHAT; // ??? } }
Obtaining a null pointer guts
indicates all situations that we can detect that prevent us from returning the expected result. Worse still, since the responsibility for resource management is diluted among components (which we cannot see), it is not even known at this time whether it is a valid (though unusual) situation that guts == nullptr
, or if it is a programmer error (a bug).
So what shall we do if we cannot compute the requested value? Well, asking this question at the time of implementing or fixing any particular function is probably much too late. The error handling strategy should be decided upon designing the system architecture. Nonetheless, something needs to be done.
One of the developers of the system observes that negative values of type double
can never represent valid weight. So, he decides that ‘magical’ value -666.66
will be used to represent the fact that weight could not be computed. This solution causes no ambiguity: if you get the value -666.66
, you know you didn’t really get any value. He elegantly defines a magic constant:
const double BOGUS_WIEGHT = -666.66;
Now we know what to return:
double FlightPlan::grossWeight() { if (Guts* guts = getGuts()) { return guts->computeGrossWeight(); } else { return BOGUS_WIEGHT; } }
All this so far appears to make sense; but here comes the most interesting part. Another developer, or perhaps even the same one, writes the function that checks if an aircraft is too heavy. As you might expect, or not, the implementation reads:
bool tooHeavy(FlightPlan const& plan) { return plan.grossWeight() > plan.aircraft().maxGrossWeight(); }
Clear and elegant. An aircraft is too heavy to fly if its gross weight exceeds the allowed maximum. And this is what the function says…
And thus, in the world-class life-critical software we obtain behavior, “when you cannot check for safety, just report that it is safe.”
Whose fault is it?
You might say that the author of tooHeavy
lacked discipline: since it was decided that returned double
s representing weights may always carry the bogus value, he should have checked for its presence. But this reasoning is faulty. You cannot impose arbitrarily strict discipline on people, who do make mistakes, especially when they are new and inexperienced. Also the usage of magic value “has been decided on” only by the author of FlightPlan::grossWeight
and was not communicated to everyone. And even if it was, new developers arrive, they are not trained, and are not aware of all things that had been decided upon in the past.
Didn’t the author of tooHeavy
read the documentation? No. Perhaps he was taught to follow Agile methodology but the only thing he understod from it was that he does not have to care about documentation or having everything done. Besides, developers do not like reading documentation (nor unit tests, if you consider them documentation) and the function signature looked self-explanatory.
My position is that it is clearly the author of FlightPlan::grossWeight
who is guilty of the situation. This is because he wrote a function that can be misused so easily. The name of the function suggests that it returns weight and encourages such interpretation. It just encourages the immediate comparison. Even if you are aware that you should check for the condition, when developing a big part of the application, with lots of details in mind, and in stress of time running out, or even doing overtime, it is easy to forget; and if you forget, the code looks innocent — both to you and to the compiler.
How could that be avoided?
One question you could ask is why the magic value was some negative number rather than NaN. Floating-point types already provide a value that indicates a missing result. They are not a good solution for a high-level language. Note that as per IEEE 754 requirements, if function FlightPlan::grossWeight
returned NaN upon failure, tooHeavy
would still suffer from the very same problem; because any comparison on NaN is required to return false. We would only make the problem harder to find (because reasoning about NaN is difficult).
So how should an error like this be handled? The short answer is: nearly any way of signalling the failure is better than this one. The only elegant solution is to rearrange the entire program so that situations that the object is half-populated for an unknown reason do not arise; but you know, tell your manager that you need three years to refactor the program…
Th goal of this post is to highlight the problem rather than suggest improvements, so let’s just quickly have a glimpse at possible options.
First, we could call std::terminate
. Many find this option unacceptable, however note the assumptions that this program runs under: reporting invalid result can cause casualties, but stopping the program cannot. Given the likelihood that our ‘error condition’ may indicate a logic error, this option should be seriously considered.
Second, we could throw an exception. Although some reasonable objections could be risen also in this case: in programs that have not been designed with exception safety in mind, the program is likely to enter into a UB. Also, throwing an exception if we know we have a logic error is not much different that just letting the program run normally with logic error.
Third, we could use a compound return value, e.g. Boost.Optional, which would force the programmer that calls the function to pause for a moment and think:
boost::optional<double> FlightPlan::grossWeight(); // why is it not just double?
Now, the return type does send a message “there may be no double
returned”. But if someone is too fast to compare the values he will still make the same mistake:
bool tooHeavy(FlightPlan const& plan) { return plan.grossWeight() > plan.aircraft().maxGrossWeight(); // still compiles! }
Because of how implicit conversions from T
work in Boost.Optional, the mixed comparison between T
and optional<T>
compiles and works, and returns something else than one would expect: uninitialized optional<T>
is not greater than any T
, so we still get the same bug.
Thus, in order to be sure we are avoiding any inadvertent conversions or behavior, we would have to define our own type:
class OptionalQuantity { double val; public: explicit OptionalQuantity(double v) : val(v) {} // precondition: v >= 0.0 OptionalQuantity() : val(NaN) {}; bool has_value() const { return !isnan(val); } double quantity() const { return val; } // precondition: has_value() };
Now, the name of the class sends a clear message, and we also make sure that there are no surprises when using the class:
bool tooHeavy2(FlightPlan const& p) { OptionalQuantity optWgt = p.grossWeight(); if (optWgt.has_value()) { return optWgt.quantity() > p.aircraft().maxGrossWeight(); } else { // Now, do what? } }
The above solution, however, has certain problem. We only moved the problem one level up, and the caller will be facing the exactly same situation as the calleé the moment ago. This will either trigger a cascade of refactorings that will ultimately result in rewriting huge parts of program, or at some point some programmer will get too tired and will just type:
bool tooHeavy3(FlightPlan const& p) { return p.quantity().grossWeight() > p.aircraft().maxGrossWeight(); }
We would face a similar problem if we changed function FlightPlan::grossWeight
to return error code and pass the computed weight via an ‘output parameter,’ except that the solution with compound value is more in the spirit of Functional Programming style. Nonetheless, the return code solution is another option superior to magic values because the inconvenient interface at least signals to the programmer that he should check for the potential error.
At the very minimum, the author of FlightPlan::grossWeight
could have renamed the function to some ugly name, so that even typing it would suggest to the user that he should abandon the “mathematical thinking” and check for the magic value:
typedef double WeightOrNix; WeightOrNix FlightPlan::grossWeightOrBogusValue();
Conclusion
The conclusion I draw from the above story is that whatever discipline one may wish to have, developers will make inadvertent omissions, or make too many assumptions about the tools/components they use.
For one example, the mathematical-like notation (much like many other syntactic conveniences) makes the program code short, elegant and easy to grasp; but it also deceives us: we tend to forget that sub-programs are not mathematical functions and that they may simply fail or lack resources.
What you can do to avoid such omissions or minimize their consequences is to employ techniques that take into account the human factor, so that when someone forgets to check for contingency, we do not enter into a UB and/or cause a disaster. You have a handful of tools: compiler’s type system which can signal error at compile-time on omission, you can make some tiny inconveniences that disturb the programmers’ routine and force them to focus on what they are doing. Finally, you can perform some run-time checks and try to prevent program execution (or its parts) if you detect program states that programmers never thought of and prepared the program for.
Magic values, in contrast, encourage inadvertent omissions. One notable example of this problem from the C Standard Library is function atoi
: when it fails it returns 0, the same value that it returns when it successfully parses string "0"
. True, there is no ambiguity, because you can always check the value of global errno
to see if we had an error or not, but how many users of atoi
know that? And how many use this knowlege? One other frequent example of such situation is returning an empty string when you have no good string to return. While it may appear as a waste, one could consider type optional<string>
instead, which really draws the user’s attention.