Defensive programming

Have you ever asked yourself, or participated in a discussion on, whether “defensive programming” is a good or a bad thing? The issue is controversial, and recently, while watching the talk “Defensive Programming Done Right, Part I” by John Lakos, I realized (I think) why. The term “defensive programming” means different things to different people.

Rather than saying something myself, let me ask you a question. Given the following initial code:

bool is_monitored(const Monitor* m, const TaskId* t)
{
  auto it = std::find(m->monitored_tasks().begin(),
                      m->monitored_tasks().end(),
                      *t);
  return it != m->monitored_tasks().end();
} 

Assuming that we never expect m or t to be null, which of the following variants would you call a “defensive programming” technique?

Variant A:

bool is_monitored(const Monitor* m, const TaskId* t)
{
  if (m == nullptr) return false;
  if (t == nullptr) return false;

  auto it = std::find(m->monitored_tasks().begin(),
                      m->monitored_tasks().end(),
                      *t);
  return it != m->monitored_tasks().end();
} 

Variant B:

bool is_monitored(const Monitor* m, const TaskId* t)
{
  if (m == nullptr) throw std::logic_error("...");
  if (t == nullptr) throw std::logic_error("...");

  auto it = std::find(m->monitored_tasks().begin(),
                      m->monitored_tasks().end(),
                      *t);
  return it != m->monitored_tasks().end();
} 

Variant C:

bool is_monitored(const Monitor* m, const TaskId* t)
{
  assert (m);
  assert (t);

  auto it = std::find(m->monitored_tasks().begin(),
                      m->monitored_tasks().end(),
                      *t);
  return it != m->monitored_tasks().end();
} 

Variant D:

bool is_monitored(const Monitor* m, const TaskId* t)
{
  if (m == nullptr) log_error("..."); // and continue
  if (t == nullptr) log_error("..."); // and continue

  auto it = std::find(m->monitored_tasks().begin(),
                      m->monitored_tasks().end(),
                      *t);
  return it != m->monitored_tasks().end();
} 

Whatever your answer is, note that another developer’s choice may be different. If you are in disagreement with a colleague about whether “defensive programming” is a good or an evil thing, it is possible that you both entirely agree on what is good and what is evil, and the only difference is in how you define “defensive programming”.

I do not intend to give my choice here. My only goal with this short post is to note that the term is ambiguous.


29 Responses to Defensive programming

  1. HiroshimaCC says:

    I would suggest that it depends on the intent of your code.
    – Is Monitor and/or TaskId supposed to be null at some point? Then A is OK.
    – Is Monitor and/or TaskId never supposed to be null? Then C is OK.
    – Is is_monitored supposed to allow your program to recover in case of failure? Then B is probably OK (I’m not sure Monitor or TaskId being null can be considered a normal failure condition, but not being able to find t in m could be – though not in this precise example).

    The problem with D is that, depending on the implementation of log_error(), you cannot even be sure that the message will be correctly flushed in case of failure, so it is not very defensive.

  2. jrb (@jrxb7) says:

    If m and t are never supposed to be null then the best way is to write our function as

    bool is_monitored(const Monitor& m, const TaskId& t)

    This conveys better than any documentation the fact that m and t can never be null
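    A sketch of that signature change (Monitor and TaskId here are hypothetical stubs, just enough to make the example self-contained; they are not from the post):

    ```cpp
    #include <algorithm>
    #include <vector>

    // Hypothetical stand-ins for the post's Monitor and TaskId types,
    // just enough to make the sketch self-contained.
    using TaskId = int;

    struct Monitor {
        std::vector<TaskId> tasks;
        const std::vector<TaskId>& monitored_tasks() const { return tasks; }
    };

    // References express "never null" in the signature itself: a caller
    // cannot obtain a null reference without having already invoked UB.
    bool is_monitored(const Monitor& m, const TaskId& t)
    {
        auto it = std::find(m.monitored_tasks().begin(),
                            m.monitored_tasks().end(),
                            t);
        return it != m.monitored_tasks().end();
    }
    ```

    With this shape there is simply no null case left for the function body to defend against.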

  3. gwiazdorrr says:

    Not the best example. If you change the argument types to const references, the problem will be gone completely.

  4. assert for debug/dev builds. The rest depends on what you can actually handle and what the guide for the project is. If you can return and continue processing without messing up state then you should do so. I am personally against exceptions in critical code though.

  5. evpo says:

    Only B is defensive programming as long as the system can handle the exception and continue.

    A is close but there is an assumption that is_monitored() is false when nulls are received. When the task is actually monitored we can get race conditions or deadlocks. It’s not very defensive. In addition we swallowed the error and made it very hard to find. Unfortunately I see A in production code too often.

    C is OK in some scenarios especially if performance is crucial but it’s not defensive.

    Therefore I think A, C and D are not defensive programming because they push the application into dangerous territory. It will probably crash or produce incorrect results if nulls are received.

  6. Andy Prowl says:

    Very interesting poll. My instinctive answer would have been C.

    But I was not sure, so I did a bit of googling and found out that the definition of DP on Wikipedia mentions two potentially conflicting goals: (1) “Making the software behave in a predictable manner despite unexpected inputs or user actions” and (2) “Reducing the number of software bugs and problems”. I don’t know how authoritative this quote is (probably very little), but it seems to summarize the kind of ambiguity that you are hinting at.

    Variants A and B seem to serve goal (1) best; I can’t tell whether returning control to a buggy client is a wise design choice in general, but of the two variants I’d probably prefer B, because it gives a properly-designed client a chance to be aware of its own faultiness and take whatever emergency action is deemed to be most appropriate – thus also making it possible to fulfill goal (2). In both cases, however, the behavior is predictable, so from the viewpoint of goal (1) both variants A and B can be seen as DP techniques.

    From the viewpoint of goal (2), on the other hand, the best approach is variant C, because (at least in debug builds) it screams the violation out loud – although the behavior on release builds will still be undefined. This makes it impossible to ignore the bug and forces the programmer to take actions to fix it.

    D’s approach is to silently record the violation and move on, which never results in predictable behavior and does a poor job raising the awareness of the problem too: because of UB, in fact, variant D does not guarantee that the log will actually contain anything, and even if it did, logging the violation isn’t as effective as an assertion in reporting bugs.

    So in conclusion I’d rule variant D out as a DP technique, and that’s probably my only certainty 🙂

  7. FJW says:

    If the checks in D aren’t optimized away I would consider it a bug in the optimizer.

    B and C are both valid approaches, I might end up using either.

    A doesn’t follow the “Fail as loud (and early) as possible”-guideline which is why I would consider it the wrong approach.

    In real code the first thing I’d do would however be to replace the pointers with references and have a serious talk with whoever didn’t do so in the first place.

  8. hammonjj says:

    My preference would be Variant C, since you stated that neither pointer should ever be null. By using an assert, you are basically acknowledging that without impacting production code, since these kinds of errors generally happen in development environments where someone might mess with something they shouldn’t (hopefully caught in a later code review).

  9. Joe says:

    If m and t could really never be null, I’d change the interface to

    bool is_monitored(const Monitor& m, const TaskId& t)

    That would remove all ambiguity for anyone reading the code.

  10. Dmitri says:

    Why don’t we go with option E 🙂

    bool is_monitored(Monitor const& m, TaskId const& t)
    {
      auto it = std::find(m.monitored_tasks().begin(),
                          m.monitored_tasks().end(),
                          t);
      return it != m.monitored_tasks().end();
    }

  11. The main thing is to clearly define the contracts (primary contract, ordinary failure handling, contract violation handling which defaults to UB) of the function. Defensive programming is about detecting primary contract violations, and enforcing that contract. I think the intent of this blog article is to focus on ways to enforce the contract, but the variants here that always return normally have a different contract than those that may abort or throw exceptions.

    Maybe we’re then into defensive design, but I’d just like to point out that it’s no problem having both versions, i.e. both contracts. One isn’t forced to choose. One can say yes to both.

    So, let’s consider the basic variant, the one where a contract doesn’t include optional arguments:

    auto is_monitored(const Monitor& m, const TaskId& t)
        -> bool
    {
        auto it = std::find(m.monitored_tasks().begin(), m.monitored_tasks().end(), t);
        return (it != m.monitored_tasks().end());
    }
    

    Checking the addresses of the arguments here would be a kind of defensive programming, but a kind that indicates overall buggy code. It’s not a smart thing to do, because it needlessly imposes that checking overhead on the calling code. It’s needless because that’s the job of an overload or variant with pointer arguments, or of a dedicated utility for dereferencing pointers and checking whether they’re null.

    The same reasoning can be applied to the variant with pointer arguments: throwing exceptions would impose some needless overhead on the calling code. Overhead because the calling code would then have to be prepared to deal with the exceptions (this doesn’t usually involve a runtime execution overhead, but it does involve extra code and programmer’s time). Needless because to give the calling code control over throwing, throwing is better expressed by a dereferencing utility that can be invoked locally in the calling code. So:

    auto is_monitored_object(const Monitor* p_m, const TaskId* p_t)
        -> bool
    {
        assert( p_m != nullptr );  assert( p_t != nullptr );
        return is_monitored( *p_m, *p_t );
    }
    

    Here the asserts serve to define the primary contract.

    auto is_monitored(const Monitor* p_m, const TaskId* p_t)
        -> bool
    {
        return (p_m && p_t? is_monitored( *p_m, *p_t ): false);
    }
    

    And then we’re finally at the level of the calling code, which might do any of

        if( is_monitored( m, t ) ) ...
        if( is_monitored( p_m, p_t ) ) ...
        if( is_monitored_object( p_m, p_t ) ) ...
        if( is_monitored( *p_m, *p_t ) ) ...
        if( is_monitored( object_or_x( p_m ), object_or_x( p_t ) ) ) ...
    

    where the second to last invocation is a case of not trusting the compiler to optimize, and where object_or_x is some utility that throws if the pointer is null.
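    object_or_x itself is not shown in the comment; a minimal sketch of such a dereference-or-throw utility might be (the name comes from the comment, the exception type is my assumption):

    ```cpp
    #include <stdexcept>

    // Dereference a pointer, throwing instead of invoking UB on null.
    // This moves the failure policy into the calling code, which can
    // invoke it locally exactly where throwing is the desired behavior.
    template <class T>
    T& object_or_x(T* p)
    {
        if (p == nullptr) {
            throw std::invalid_argument("object_or_x: null pointer");
        }
        return *p;
    }
    ```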

    This is a lot of variability. One possible reaction: should the calling code really have that possible variability, and how would the not-so-bright client code programmer decide? But then: for what reason would it be a good thing to restrict all calling code to just one arbitrary of the above?

    The way I see it: give the calling code the choice. Because the calling code’s programmer knows best what’s best in that code. The function overloads and variants should (ideally) support that.

    In practice, for my personal programming I would just do the variant with reference arguments. If any of the other two turned out to be needed I’d add them. But I can imagine that in some cases changing a header can be a costly thing to do (merging work, re-running unit tests, whatever), so that adding the two non-basic variants up front would be preferable: it’s not so much work anyway, and it could maybe even be partially automated if the pattern occurs often enough.

    Cheers!

    – Alf

  12. Magnus says:

    ‘A’ is flat out wrong. The function’s contract says nothing about ‘m’ or ‘t’ being null; in fact the programmer (both the caller and the writer of is_monitored) has to assume they are never null. ‘A’ is not defensive programming, because if the programmer makes a mistake it is not detected (instead, the programme pretends ‘m’ is not monitoring ‘t’).

    The only way to make ‘A’ correct is to expand the terms of its contract to include the possibility of ‘m’ or ‘t’ being null. This is a mistake because it increases the combinations of the function’s behaviour, i.e. its contract. A caller which decides, “I don’t care if ‘m’ or ‘t’ is null, but if they are, then I want to behave as if ‘m’ is not monitoring ‘t'”, is now free to pass a null value and have it accepted by the function. The function cannot now be changed to something simpler (doesn’t handle nulls) without also needing to change the caller.

    ‘C’ is the best option. It minimises the complexity of the function’s obligations, and it defends against the caller passing a null argument (in debug builds). Good defensive programming.

    ‘B’ is not much better than ‘A’, because the behaviour of the programme is still well-defined (and increases the scope of the function’s contract). I would only use ‘B’ instead of ‘C’ if I didn’t trust my debug builds to catch ALL instances where ‘m’ or ‘t’ could be null, or to try catching a bug which is only happening in release builds (everyone’s favourite!).

    ‘D’ is even more wrong than ‘A’! The compiler can just remove both checks at the start of the function, because it will always execute ‘m->monitored_tasks’ and ‘*t’. It can assume they are never null, because of undefined behaviour. The programme will crash and not log anything (or make monkeys fly out of your nose).

    • codour says:

      +1 for the ‘D’ case analysis. It made my day 🙂

    • Stephan Bergmann says:

      Re D, the compiler can only remove the checks and log_error calls if it can prove that the dereference of m resp. t actually happens if the corresponding check is true, which it probably cannot assuming log_error can throw exceptions.

  13. krzaq says:

    I’d choose A/B or C, depending on how sure I am of the assertion that those pointers are never going to be null. If this function is an internal function of my framework, for example one extracted to avoid code duplication, it’s reasonable to assume that the preconditions have already been checked – although in that case I’d take the arguments by reference, which is much clearer.

    Otherwise I’d decide whether I want to swallow the error or not. By default I don’t, so my first choice would be B, although I’d use `std::invalid_argument` instead of `std::logic_error`.

  14. skebanga says:

    None. If m and t are never supposed to be null, enforce the contract by using references not pointers

  15. Just note that the question in this post is “which of these would you call defensive programming” rather than “which of these would you use”.

  16. red1939 says:

    I would say C, as it has the ability to use whatever implementation (exceptions or abort) to forcefully leave such a corrupted context; assert is probably too strong for production code (and tests!), so exceptions could be used here. ‘A’ is simply making a wide interface and has nothing to do with defensive programming, but rather with API requirements (it’s not bad per se – it depends on whether clients will want this “return false” behavior).

  17. Ophir says:

    I guess the use of pointer arguments is for the sake of keeping the example simple. As many people mentioned, using const references would solve the issue far better than any of the options.
    A – perhaps it’s defensive, but it’s incorrect, since returning false is ambiguous – does it mean not found, or error? A correct defensive alternative would be returning both an error code and true/false.
    B – is the best option in my opinion, provided the whole program uses exceptions consistently. You can get really close to the Wikipedia definition of defensive programming: “ensure the continuing function of a piece of software under unforeseen circumstances”.
    C – is partially defensive. It will prevent silent bugs caused by t==nullptr during testing. The case m==nullptr would have been noticed during debugging anyway. And those rare cases that did manage to escape testing will crash in production (perhaps causing your 2 billion dollar spacecraft to stop responding).
    D – certainly not defensive. If m==nullptr, why continue to a certain crash? The log is about debugging, not about defensive programming, in my opinion. This option can be great when combined with exception handling.

  18. Martin Ba says:

    Just to chime in with the choir, I’ll add a few “truths” to the symphony. (And I really should watch that video 😉)

    * Defensive programming is all about the code running in production.
    * Defensive programming is all about *not* crashing/terminating: which includes, but *is not limited to*, not invoking UB. (Of course, this coexists with the *alternative* truth that defensive programming is solely about not invoking UB, which would allow for termination to prevent UB.)
    * An assert can never be defensive, as it is no longer there in production.
    * A crash is never defensive, per the above.
    * An exception (Variant B) *can* be defensive, if the code base in general and (all the) call chains specifically won’t crash/abort/terminate because of a “regular” C++ exception, and the code is decently able to roll back the current operation when it hits an exception.
    * Ignoring a problem (Variant A) *can* be defensive, if the resulting UB/crash is deemed worse than strange misbehaviour of the application at runtime. (May well be the case.)

  19. Luc Hermitte says:

    I share Martin Ba’s view.

    Defensive programming is about having program execution continue even when the code is buggy. Detected errors can be hidden (variant A), or they can be reported (variant B). I would say it’s about widening the contracts and hoping we’ll be able to do something somewhere about the programming errors still present in production.

    Variant C is pure Design by Contract as far as I’m concerned. It’s not about having a program that resists programming errors; it’s about having a correct program. This can be helped by specifying the contract and catching the programming errors as soon as possible (thanks to static analysis), or thanks to tests run in Debug mode (in C and C++, where assert is available). BTW, assert() can be hacked to throw an exception, and as a consequence we can do defensive programming in this variant as well.
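    One way to implement that assert() hack is a throwing replacement macro; a minimal sketch (the name ASSERT_OR_THROW and the exception type are my assumptions, not from the comment):

    ```cpp
    #include <stdexcept>

    // A throwing stand-in for assert: instead of aborting, report the
    // failed condition as an exception, so the program can attempt
    // recovery in production while tests still catch the violation.
    #define ASSERT_OR_THROW(cond)                                      \
        do {                                                           \
            if (!(cond))                                               \
                throw std::logic_error("assertion failed: " #cond);    \
        } while (0)
    ```

    With this in place, Variant C keeps its Design-by-Contract shape while behaving like Variant B at runtime.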

    In my opinion, variant D is … bugged unless log_error() throws an exception — which it certainly does not. Incorrect calls are instrumented, but in the end we’ll crash. Assertions are a much better tool to investigate logic errors. Somehow this variant is a poor variant of variant C.

  20. Johan says:

    According to the book “C++ in Action” (see chapter “Fighting Defensive Programming”), variant A would be considered a case of defensive programming and should thus be avoided!

  21. Robert Ramey says:

    Personally, I would like to combine B and C. The assertions will trap any errors when running tests. But if the errors slip through to the release version, we still might be able to recover, or at least abort in a graceful way. Maybe we should also include D, in order to save some information in case the unthinkable happens.

    Robert Ramey
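    A sketch of such a B+C combination, factored into a small checking helper (the helper’s name and the use of std::logic_error are my choices, not Robert’s):

    ```cpp
    #include <cassert>
    #include <stdexcept>

    // Combine Variants B and C: in debug builds the assert traps the bug
    // immediately; in release builds (NDEBUG), where the assert compiles
    // away, the throw still keeps the null from reaching a dereference.
    template <class T>
    const T& checked(const T* p, const char* what)
    {
        assert(p != nullptr);
        if (p == nullptr) throw std::logic_error(what);
        return *p;
    }
    ```

    is_monitored could then pass its pointers through checked() before doing the lookup.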

  22. Vincent Jacquet says:

    “is_monitored” is telling us whether a given task is being monitored by a given monitor. Considering that a non-existent task cannot be monitored by any monitor, and a non-existent monitor cannot monitor anything, A is just fine and not ambiguous at all.
    B & C would require the caller to check the parameters beforehand. If this function is used to guard code on monitor or task, the verification would be done twice.
    D is just useless, as it does more work without any real benefit.

    Yet, the best answer is probably E, the one given by Alf P. Steinbach, as it let the caller choose the most fitted function.

    • I have probably selected a bad example, because everyone seems to focus on how to turn pointers into references, or saying what they would ideally do, and this distracts everyone from the main question of this post: which of the four variants reflect the concept of defensive programming (under your understanding of the term).

  23. Hyena says:

    If we ignore the fact that such a function should take references instead of pointers as parameters, you have one option missing from the list of choices 😀

    * In case of nullptr, log an error (code file and line) and try to overcome the obstacle as gracefully as possible. In our case, the graceful handling would be to return FALSE after reporting the bug. It’s the same philosophy as ALWAYS ADDING THE DEFAULT CASE to any switch statement. You want to avoid undefined behavior and crashes by overcoming them as well as possible.
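    A sketch of that missing option (Monitor, TaskId and log_error are hypothetical stubs here, just to keep the example self-contained):

    ```cpp
    #include <algorithm>
    #include <cstdio>
    #include <vector>

    // Hypothetical stubs, not from the post.
    using TaskId = int;
    struct Monitor {
        std::vector<TaskId> tasks;
        const std::vector<TaskId>& monitored_tasks() const { return tasks; }
    };

    // Stand-in for the post's log_error, reporting file and line.
    inline void log_error(const char* file, int line, const char* msg)
    {
        std::fprintf(stderr, "%s:%d: %s\n", file, line, msg);
    }

    // The proposed option: report the bug, then overcome the obstacle
    // gracefully by returning false instead of dereferencing null --
    // the same philosophy as always adding a default case to a switch.
    bool is_monitored(const Monitor* m, const TaskId* t)
    {
        if (m == nullptr || t == nullptr) {
            log_error(__FILE__, __LINE__, "is_monitored: null argument");
            return false;
        }
        auto it = std::find(m->monitored_tasks().begin(),
                            m->monitored_tasks().end(),
                            *t);
        return it != m->monitored_tasks().end();
    }
    ```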
