Toggles in functions

Have you ever seen a function call like this?

process(true, false);

We are processing something: this should be clear from the context. But what do these parameters mean? What is true and what is false? We will never figure it out from the function call alone. The code is certainly not self-explanatory.

We will have to stop and take a look at the declaration, which does give us a hint:

void process(bool withValidation,
             bool withNewEngine);

Apparently, the author of the function uses the two bools as toggles. The implementation will probably be something like this:

void process(bool withValidation,
             bool withNewEngine)
{
  if (withValidation)  // toggle #1 used
    validate();

  do_something_toggle_independent_1();

  if (withNewEngine)   // toggle #2 used
    do_something_new(); 
  else
    do_something_old();

  do_something_toggle_independent_2();
}

From the author’s point of view, or from the point of view of the function’s definition, there is no problem with readability, because each toggle has a name assigned to it. The problem occurs only on the caller’s side. And it is not only that the toggles cannot be identified at first glance. Even if you already know the toggles, you can easily confuse their order. In fact, my initial example should have read:

process(false, true);

But I confused the order of the arguments. Did you notice that?

Once this bug hits the programmer, and she has determined the cause, she will probably put a comment in the function call to make the intention clear:

process(/*withValidation=*/ false, /*withNewEngine=*/ true);

And this almost looks like a missing language feature: named function parameters. Theoretically, it could look something like this:

// NOT IN C++:
process(withValidation: false, withNewEngine: true);

But even if such a feature were added to C++, it is not likely that it would interoperate with forwarding functions. The following use cases would remain unaddressed:

std::function<void(bool, bool)> callback = &process;
callback(???, ???); // what names to use?

There is a potential here for an even more subtle bug, far more difficult to track down. Suppose function process is a virtual member function. In some other class you override it, and in the override you take the toggles in the wrong order:

struct Base
{
  virtual void process(bool withValidation,
                       bool withNewEngine);
};

struct Derived : Base
{
  void process(bool withNewEngine,
               bool withValidation) override;
};

The compiler will not detect the problem, because the semantics are only encoded in parameter names. And the types of the toggles are just two bools.
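To make the effect observable, here is a small sketch of the same hierarchy. It assumes, purely for illustration, that process returns a string describing what it was asked to do; the helper describe and the function call_through_base are hypothetical names, not part of the original example.

```cpp
#include <cassert>
#include <string>

struct Base
{
  virtual ~Base() = default;

  // Returns a description of the request so the outcome can be inspected.
  virtual std::string process(bool withValidation, bool withNewEngine)
  {
    return describe(withValidation, withNewEngine);
  }

protected:
  static std::string describe(bool withValidation, bool withNewEngine)
  {
    return std::string(withValidation ? "validate+" : "skip+")
         + (withNewEngine ? "new" : "old");
  }
};

struct Derived : Base
{
  // Parameter names swapped by mistake. The signature is still (bool, bool),
  // so `override` is satisfied and the compiler stays silent.
  std::string process(bool withNewEngine, bool withValidation) override
  {
    return describe(withValidation, withNewEngine);
  }
};

// The caller follows Base's order: no validation, new engine.
inline std::string call_through_base(Base& b)
{
  return b.process(/*withValidation=*/false, /*withNewEngine=*/true);
}
```

Called on a Base object, call_through_base yields "skip+new"; called on a Derived object, the very same call yields "validate+old", because the two values are silently reinterpreted.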

And this is not the end of the bugs caused by using bools in the interface. Because almost every built-in type converts to bool, the following compiles fine and does something other than expected:

std::vector<int> vec;
process(vec.data(), vec.size());
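The conversions involved can be seen in a short sketch. Here process is a stand-in that simply reports the two values it received, and call_with_vector is a hypothetical helper; a non-empty vector is used so that both results are well defined.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Stand-in for process(): reports the toggle values it actually received.
inline std::pair<bool, bool> process(bool withValidation, bool withNewEngine)
{
  return std::make_pair(withValidation, withNewEngine);
}

inline std::pair<bool, bool> call_with_vector()
{
  std::vector<int> vec{1, 2, 3};
  // int*        -> bool: a non-null pointer becomes true
  // std::size_t -> bool: a non-zero size becomes true
  return process(vec.data(), vec.size());
}
```

Both toggles end up switched on, although the caller never mentioned a toggle at all.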

This problem is more common when toggles are used in constructors. Suppose your class has two constructors:

class C
{
  explicit C(bool withDefaults, bool withChecks);
  explicit C(int* array, size_t size);
};

At some point you decide to remove the second constructor, and you might hope that the compiler will tell you about all the places that need to be fixed. But this will not happen. Due to implicit conversions to bool, the first constructor will take over all the calls.
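A sketch of the situation after the refactoring might look as follows. The accessors and the function make_from_array are assumptions added only so that the outcome can be checked.

```cpp
#include <cassert>
#include <cstddef>

class C
{
public:
  explicit C(bool withDefaults, bool withChecks)
    : withDefaults_(withDefaults), withChecks_(withChecks) {}

  // explicit C(int* array, std::size_t size);  // removed during refactoring

  bool withDefaults() const { return withDefaults_; }
  bool withChecks() const { return withChecks_; }

private:
  bool withDefaults_;
  bool withChecks_;
};

inline C make_from_array()
{
  static int data[4] = {1, 2, 3, 4};
  int* array = data;
  std::size_t size = 4;
  // Written for the (int*, size_t) constructor, yet it still compiles:
  // both arguments convert implicitly to bool.
  return C(array, size);
}
```

The old call site keeps compiling and quietly constructs an object with both toggles set to true.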

There is a reason why people often decide to use type bool to represent a toggle, though. It is the only built-in type, available out of the box, designed for storing exactly two states: the exact number required by a toggle.

Enumerations

In order to fix the above problems we have to provide a type different from bool that will:

  • encode different toggles as distinct types,
  • prevent any implicit conversions.

C++11 comes with strong enumerations, which address both of these issues. Also, we can use type bool as the enumeration’s underlying type; this way we make sure we only encode two states, and that we use up only the size of one bool. First we define the toggles:

enum class WithValidation : bool { False, True };
enum class WithNewEngine  : bool { False, True };

Now we can declare our function like this:

void process(WithValidation withValidation,
             WithNewEngine  withNewEngine);

There is some redundancy in such a declaration, but the usage is exactly what is needed:

process(WithValidation::False, WithNewEngine::True); // ok

And if I provide the toggles in the wrong order, I get a compiler error due to type mismatch:

process(WithNewEngine::True, WithValidation::False); // fails!

Each toggle is a different but regular type: they can be perfect-forwarded, and you cannot get their order wrong in function declarations or virtual member function overrides.
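These properties can be checked with a short sketch. It assumes, for illustration only, that process returns a descriptive string; forward_to_process and call_via_callback are hypothetical helpers.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <type_traits>
#include <utility>

enum class WithValidation : bool { False, True };
enum class WithNewEngine  : bool { False, True };

inline std::string process(WithValidation withValidation,
                           WithNewEngine  withNewEngine)
{
  return std::string(withValidation == WithValidation::True ? "validate+" : "skip+")
       + (withNewEngine == WithNewEngine::True ? "new" : "old");
}

// Distinct types, no implicit conversions from bool or from each other,
// and each toggle occupies exactly one bool.
static_assert(!std::is_convertible<bool, WithValidation>::value, "");
static_assert(!std::is_convertible<WithNewEngine, WithValidation>::value, "");
static_assert(sizeof(WithValidation) == sizeof(bool), "");

// Hypothetical forwarding wrapper: the toggle types travel through intact.
template <typename... Args>
std::string forward_to_process(Args&&... args)
{
  return process(std::forward<Args>(args)...);
}

inline std::string call_via_callback()
{
  // The callback's signature itself now documents which toggle goes where.
  std::function<std::string(WithValidation, WithNewEngine)> callback = &process;
  return callback(WithValidation::False, WithNewEngine::True);
}
```

Swapping the two arguments in either helper is rejected at compile time, for the same reason as the direct call.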

But using enumerations as toggles comes at a certain cost. To some extent toggles are similar to values of type bool, but enumerations, being so strongly typed, do not reflect this similarity. Implicit conversions from and to bool do not work (and this is desired), but contextual conversions to bool, such as the one performed in an if-condition, do not work either, and this is a problem. If we go back to the body of the function process, it no longer compiles:

void process(WithValidation withValidation,
             WithNewEngine  withNewEngine)
{
  if (withValidation)  // error: no contextual conversion to bool
    validate();

  // ...
}

I have to use an explicit cast:

if (bool(withValidation))  // ok
  validate();

And if I need a logical expression of two toggles, it becomes even stranger:

if (bool(withNewEngine) || bool(withValidation))
  validate();

Also, you cannot direct-initialize a scoped enumeration from a bool using parentheses (only C++17 added brace-initialization from the underlying type):

bool read_bool();

class X
{
  WithNewEngine _withNewEngine;

public:
  X() 
    : _withNewEngine(read_bool()) // fails
    {}
};

I need to do an explicit cast:

class X
{
  WithNewEngine _withNewEngine;

public:
  X() 
    : _withNewEngine(WithNewEngine(read_bool())) // ok
    {}
};

Maybe this could be considered a super-explicit safety feature, but it looks like a bit too much explicitness. Strong enumerations are more explicit than explicit constructors and conversions.

tagged_bool

Thus, in order to prevent bugs caused by naked bools and to avoid problems caused by strong enumerations, I had to come up with my own tool, which I called a tagged_bool. You can find the implementation here. It is quite short. You use it to create toggles like this:

using WithValidation = tagged_bool<class WithValidation_tag>;
using WithNewEngine  = tagged_bool<class WithNewEngine_tag>;

You need to forward-declare a distinct tag class, like WithValidation_tag. It does not need to be defined. It is used to render a unique instantiation of the class template tagged_bool. Instances of this template offer explicit conversion to and from bool, as well as from other instances of tagged_bool, because, as is often the case in practice, the same bool value passed to the lower levels of the application becomes a different toggle with a different name. You use the toggles created this way like this:

void process(WithValidation withValidation,
             WithNewEngine  withNewEngine)
{
  if (withNewEngine || withValidation)  // ok
    validate();

  // ...
}

process(WithValidation{true}, WithNewEngine{false}); // ok
process(WithNewEngine{true}, WithValidation{false}); // fails
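For readers curious how such a template might be put together, here is a minimal sketch built only from the properties described above. It is not the implementation from the library and may differ from it in important details; WithChecks, lower_level, pass_down and should_validate are hypothetical names introduced for the demonstration.

```cpp
#include <cassert>
#include <type_traits>

// Minimal sketch only; the real tagged_bool may differ in details.
template <typename Tag>
class tagged_bool
{
  bool value_;

public:
  constexpr explicit tagged_bool(bool v) : value_(v) {}

  // Explicit re-tagging from a toggle with a different tag.
  template <typename OtherTag>
  constexpr explicit tagged_bool(tagged_bool<OtherTag> other)
    : value_(static_cast<bool>(other)) {}

  // Explicit, so it is used only in casts and in contextual conversions,
  // such as if-conditions and the operands of || and &&.
  constexpr explicit operator bool() const { return value_; }
};

using WithValidation = tagged_bool<class WithValidation_tag>;
using WithNewEngine  = tagged_bool<class WithNewEngine_tag>;
using WithChecks     = tagged_bool<class WithChecks_tag>; // hypothetical lower-level toggle

static_assert(!std::is_convertible<bool, WithValidation>::value, "");
static_assert(!std::is_convertible<WithNewEngine, WithValidation>::value, "");
static_assert(!std::is_convertible<WithValidation, bool>::value, "");

inline bool should_validate(WithValidation withValidation,
                            WithNewEngine  withNewEngine)
{
  return withNewEngine || withValidation; // contextual conversion to bool
}

inline bool lower_level(WithChecks withChecks)
{
  return static_cast<bool>(withChecks);
}

// The same value travels down under a different name.
inline bool pass_down(WithValidation withValidation)
{
  return lower_level(WithChecks{withValidation});
}
```

Because both the constructors and the conversion operator are explicit, nothing converts silently, yet the toggles still read naturally in conditions.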

And that’s it. tagged_bool is part of the Explicit library, which provides a number of tools for stating your intentions more explicitly in interfaces.


23 Responses to Toggles in functions

  1. Elron says:

    “Also, we can use type bool as enumeration’s underlying type; this way we make sure we only encode two states, and that we use up only the size of one byte”

    sizeof(bool) might not be 1, so I think you meant “the size of one bool”

  2. Bartek F. says:

    Great post!
    But it’s quite sad that you have to roll your own solutions. There should be a direct help from the language.

  3. Qb says:

    I usually use a “flags” value class which is a bit mask of enum values.
    Something like this process(WITH_VALIDATION | WITH_NEWENGINE)
    Inside the function I can test single bit values with some inline member functions Is IsNot IsOneOf etc…

  4. Tomas says:

    Nice!
    This not only prevents the wrong order in derived overloads but also prevents the wrong order in definition -> declaration of methods.

    Thanks for sharing!

  5. kszatan says:

    My first thought would be to get rid of those switches and extract parts responsible for validations and new/old engine to separate classes and then pass them as arguments. The “process” function already does too much.

    • When your specific use case is sufficiently simple, then not using any ‘toggles’ may be indeed a better option. But in the cases where the decision to switch the toggle is made a number of layers up in the call stack such a rewrite may not be feasible/practical:

      int main()
      {
        Revert revert_to_old_functionality {read_from_config()};
        Layer1::fun(revert_to_old_functionality);
      }
      
      void Layer1::fun(Revert revert)
      {
        // something else...
        Layer2::fun(revert);
      };
      
      void Layer2::fun(Revert revert)
      {
        // something else...
        Layer3::fun(revert);
      };
      
      void Layer3::fun(Revert revert)
      {
        // something else...
        if (revert)
          do_poor_validation();
        else
          do_thorough_validation();
      };
      
  6. Pingback: More on Naked Primitives | Using C++

  7. micleowen says:

    class C
    {
    explicit C(bool withDefaults, bool withChecks);
    explicit C(int* array, size_t size);
    };

    explicit is used in constructors with one parameter. Refresh your knowledge on this topic 🙂

    • There is a good reason to declare nearly all constructors as explicit, not only those with one argument, especially since C++11. Sometimes even a default constructor had better be declared explicit. For justification, I would redirect anyone interested to this article.

  8. chris says:

    You’re going to hate me for this, but:

    struct S {
      void foo() {}
    };

    int main() {
      tagged_bool b{&S::foo}; // Compiles
    }

    http://melpon.org/wandbox/permlink/jgm8J5ju6SNNNmtS

  9. alfC says:

    Great posting. There is an obscure language called Asymptote that has C like syntax and has named parameters and uses type identity to resolve functions calls, including passing arguments in any order. In other words if the call can be deduced somehow it will do it for you.

  10. alfC says:

    tagged_bool can have a couple of static constants to re-enable the double-colon syntax: `WithNewEngine::False`.
    (Pity it needs to be capitalized or underscored)

    • Thanks for bringing this up. In the early stages, I tried to come up with double-colon syntax, but it always resulted in some surprises. In the end, I observed that WithNewEngine::False is not shorter or more intuitive than WithNewEngine{false}. Plus the latter allows you to pass run-time variables.

  11. ARNAUD says:

    I’m not sure I understand the reasoning behind the following for enum class types.

    Implicit conversions from and to bool do not work (and this is desired), but also explicit conversions do not work, and this is a problem.

    Why do you think writing the following is that bad?

    if(withValidation == WithValidation::True)

    On one hand you have something available in the language and code that is clearly legible and understood by all C++ developers. On the other hand, you need a specific templated class for automatic bool conversion? I’m not convinced.

    One more thing, imagine after a while your engine variable is not a bool anymore but can become no_engine, engine_v1, engine_v2 … the enum class allows this extension in a straightforward way, not your tagged_bool.

    • You bring up two points.

      Point 1.
      The choice between

      if (withValidation || withNewEngine)
      

      and

      if (withValidation == WithValidation::True 
          || withNewEngine == WithNewEngine::True)
      

      And actually, in a program that uses namespaces:

      if (withValidation == SomeNamespace::WithValidation::True 
          || withNewEngine == SomeNamespace::WithNewEngine::True)
      

      To me this is a trade-off between the desired level of type safety and convenience. My personal choice is something safer than naked bools, but less verbose than strong enums. It seems that your trade-off falls somewhere closer to strong enums.

      Point 2.
      The potential future addition of the third state.

      If you foresee that in the future you might need a third state, then indeed strong enums may be a better choice. But not necessarily: when you just add the third value to the declaration, all former if-statements still compile, even though you want to inspect every one of them to handle the third state.

      It is just that in my experience, these ‘toggles’ are used as temporary solutions, and their future development is not to add a third state, but to remove two states. E.g., I improve some part of the program, but for a couple of months I want to give the users the ability to switch back to the old functionality, in case I overlooked something, and the improvement only makes the things worse. If after a couple of months all customers are satisfied, I remove the support for the older functionality, and the toggle gets removed.

  12. Magnus Fromreide says:

    It might be just me, but I was missing plain old enums from the discussion of the proper type to use in the interface.
    Enums do solve enough of the problem to be a good candidate in this case.

    * Distinct types – Yes, enum values are of distinct types
    * Prevents conversion – Enough, there is an implicit conversion from enum-value to int but you can’t go the other way.

    This means process(false, true) is a compile error just like process(WithNewEngine{true}, WithValidation{false}) but process(WithValidation{false}, WithNewEngine{true}) is ok.

    Now, this fails in that you can assign out-of-range values to the new types, but I believe this is less of a problem than wrong order.

    One might also see the fact that the values ain’t scoped as a problem with enums.

    In any case I think enums deserve a mention in this post.

    • Thanks for bringing this up. Yes, if we consider naked bools on one end, and strong bool-based enums on the other end, old-style enums fall somewhere in between. The format of the posts in this blog does not allow me to examine every aspect of the problem, so I need to leave some out, but it is good that they come back in the comments.

      I didn’t mention it in the post, but one thing you do with these toggles occasionally is to store them in class members. If you use a strong enumeration, with bool as a base type:

      enum class WithNewEngine : bool { False, True };
      

      You get the guarantee that:

      sizeof(WithNewEngine) == sizeof(bool);
      

      You get the same guarantee for tagged_bool. But with old-style enums, you typically get the size of an int, as the underlying type is chosen by the implementation.

  13. michal s says:

    this type tagging technique is very nice, I think I read about similar idea to prevent converting integer-like types. http://www.ilikebigbits.com/blog/2014/5/6/type-safe-identifiers-in-c
