Toggles in functions

Have you ever seen a function call like this?

process(true, false);

We are processing something: this should be clear from the context. But what do these parameters mean? What is true and what is false? From the function call alone we will never figure it out. The code surely isn’t self-explanatory.

We will have to stop, and take a look at the declaration, and it does give us a hint:

void process(bool withValidation,
             bool withNewEngine);

Apparently, the author of the function uses the two bools as toggles. The implementation will probably be something like this:

void process(bool withValidation,
             bool withNewEngine)
{
  if (withValidation)  // toggle #1 used
    validate();

  do_something_toggle_independent_1();

  if (withNewEngine)   // toggle #2 used
    do_something_new(); 
  else
    do_something_old();

  do_something_toggle_independent_2();
}

From the author’s point of view, or the function’s definition point of view, there is no problem with readability, because each toggle has a name assigned to it. The problem occurs only on the side of the caller. And it is not only that the toggles cannot be identified at the call site. Even if you already know the toggles, you can easily confuse their order. In fact, my initial example should have read:

process(false, true);

But I confused the order of the arguments. Did you notice that?

Once this bug hits the programmer, and she has determined the cause, she will probably put a comment in the function call to make the intention clear:

process(/*withValidation=*/ false, /*withNewEngine=*/ true);

And this almost looks like a missing language feature: named function parameters. Theoretically, this could look something like this:

// NOT IN C++:
process(withValidation: false, withNewEngine: true);

But even if such a feature were added to C++, it is not likely that it would inter-operate with forwarding functions. The following use cases would remain unaddressed:

std::function<void(bool, bool)> callback = &process;
callback(???, ???); // what names to use?

There is a potential here for an even more contrived bug, far more difficult to track down. Suppose function process is a virtual member function. And in some other class you override it, and in the overrider you take the toggles in the wrong order:

struct Base
{
  virtual void process(bool withValidation,
                       bool withNewEngine);
};

struct Derived : Base
{
  void process(bool withNewEngine,
               bool withValidation) override;
};

The compiler will not detect the problem, because the semantics are only encoded in parameter names. And the types of the toggles are just two bools.
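To see why the compiler stays silent, note that parameter names are not part of a function's type. The sketch below uses two hypothetical free functions, processA and processB (they are not part of the example above), whose declarations differ only in the order of the parameter names:

```cpp
#include <type_traits>

// Hypothetical declarations: same parameter types, names swapped.
void processA(bool withValidation, bool withNewEngine);
void processB(bool withNewEngine, bool withValidation);

// Parameter names do not take part in the function type,
// so both declarations have the type void(bool, bool).
static_assert(std::is_same<decltype(processA), decltype(processB)>::value,
              "parameter names are invisible to the type system");
```

The same reasoning applies to an overrider: it only has to match the types, so swapped names go unnoticed.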

And this is not the end of the bugs caused by using bools in the interface. Because almost every built-in type converts to bool, the following compiles fine and does something other than what was intended:

std::vector<int> vec;
process(vec.data(), vec.size());
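The conversions involved can be checked at compile time. In the sketch below, received is a hypothetical stand-in for process that merely reports which toggles it was given; it is only there to make the effect observable:

```cpp
#include <cstddef>
#include <type_traits>

// Each of these argument types converts to bool implicitly,
// so a (bool, bool) signature accepts them without a diagnostic.
static_assert(std::is_convertible<int*, bool>::value, "pointer to bool");
static_assert(std::is_convertible<std::size_t, bool>::value, "integer to bool");
static_assert(std::is_convertible<double, bool>::value, "floating point to bool");

// Hypothetical stand-in for process(): encodes the toggles it received.
constexpr int received(bool withValidation, bool withNewEngine)
{
  return (withValidation ? 1 : 0) + (withNewEngine ? 2 : 0);
}
```

A null pointer becomes false and any non-zero size becomes true, so a call that passes a buffer and its length is quietly reinterpreted as a pair of toggles.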

This problem is more common when toggles are used in constructors. Suppose your class has two constructors:

class C
{
public:
  explicit C(bool withDefaults, bool withChecks);
  explicit C(int* array, size_t size);
};

At some point you decide to remove the second constructor, and you might hope that the compiler will tell you about all the places that need to be fixed. But this will not happen. Due to implicit conversions to bool, the first constructor will take over all the calls.
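A small compile-time check makes this visible. The sketch shows class C as it would look after the removal, with an empty constructor body added only so that the snippet is complete:

```cpp
#include <cstddef>
#include <type_traits>

// Class C after the (int*, size_t) constructor has been removed.
class C
{
public:
  explicit C(bool /*withDefaults*/, bool /*withChecks*/) {}
};

// Old call sites of the form C(ptr, size) remain well-formed:
// both arguments convert to bool and the remaining constructor is chosen.
static_assert(std::is_constructible<C, int*, std::size_t>::value,
              "C(int*, size_t) still compiles via conversions to bool");
```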

There is a reason why people often decide to use type bool to represent a toggle, though. It is the only built-in type, available out of the box, designed for storing two states: exactly the number required by a toggle.

Enumerations

In order to fix the above problems we have to provide a type different from bool that will:

  • encode different toggles as distinct types,
  • prevent any implicit conversions.

C++11 comes with strong enumerations, which address both of these issues. Also, we can use type bool as the enumeration’s underlying type; this way we make sure we only encode two states, and that we use up only the size of one bool. First we define the toggles:

enum class WithValidation : bool { False, True };
enum class WithNewEngine  : bool { False, True };

Now we can declare our function like this:

void process(WithValidation withValidation,
             WithNewEngine  withNewEngine);

There is some redundancy in such a declaration, but the usage is exactly what is needed:

process(WithValidation::False, WithNewEngine::True); // ok

And if I provide the toggles in the wrong order, I get a compiler error due to type mismatch:

process(WithNewEngine::True, WithValidation::False); // fails!

Each toggle is a different but regular type: they can be perfect-forwarded, and you cannot get their order wrong in function declarations or virtual member function overrides.
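These properties can be stated as compile-time checks. In the sketch below, processA and processB are hypothetical declarations used only to show that swapping the toggles now changes the function type:

```cpp
#include <type_traits>

enum class WithValidation : bool { False, True };
enum class WithNewEngine  : bool { False, True };

// The two toggles are distinct types...
static_assert(!std::is_same<WithValidation, WithNewEngine>::value, "");
// ...with no implicit conversion between them, or from or to bool.
static_assert(!std::is_convertible<WithNewEngine, WithValidation>::value, "");
static_assert(!std::is_convertible<bool, WithValidation>::value, "");
static_assert(!std::is_convertible<WithValidation, bool>::value, "");

// Swapping the toggles in a declaration now produces a different type,
// so a mismatched overrider would no longer override anything.
void processA(WithValidation, WithNewEngine);
void processB(WithNewEngine, WithValidation);
static_assert(!std::is_same<decltype(processA), decltype(processB)>::value, "");
```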

But using enumerations as toggles comes at a cost. To some extent toggles are similar to values of type bool, but enumerations, being so strongly typed, do not reflect this similarity. Implicit conversions from and to bool do not work (and this is desired), but contextual conversions to bool, as in the condition of an if-statement, do not work either, and this is a problem. If we go back to the body of the function process, it no longer compiles:

void process(WithValidation withValidation,
             WithNewEngine  withNewEngine)
{
  if (withValidation)  // error: no contextual conversion to bool
    validate();

  // ...
}

I have to use an explicit cast:

if (bool(withValidation))  // ok
  validate();

And if I need a logical expression of two toggles, it becomes even stranger:

if (bool(withNewEngine) || bool(withValidation))
  validate();

Also, you cannot direct-initialize a scoped enumeration from a bool:

bool read_bool();

class X
{
  WithNewEngine _withNewEngine;

public:
  X() 
    : _withNewEngine(read_bool()) // fails
    {}
};

I need to do an explicit cast:

class X
{
  WithNewEngine _withNewEngine;

public:
  X() 
    : _withNewEngine(WithNewEngine(read_bool())) // ok
    {}
};

Maybe this could be considered a super-explicit safety feature, but it looks like a bit too much explicitness. Strong enumerations are more explicit than explicit constructors and conversions.

tagged_bool

Thus, in order to prevent bugs caused by naked bools and to avoid problems caused by strong enumerations, I had to come up with my own tool, which I called a tagged_bool. You can find the implementation here. It is quite short. You use it to create toggles like this:

using WithValidation = tagged_bool<class WithValidation_tag>;
using WithNewEngine  = tagged_bool<class WithNewEngine_tag>;

You need to forward-declare a distinct tag class, like WithValidation_tag. It does not need to be defined. It is used to render a unique instantiation of class template tagged_bool. Instances of this template offer explicit conversion to and from bool as well as from other instances of tagged_bool, because, as is often the case in practice, the same bool value passed to the lower levels of the application becomes a different toggle with a different name. You use the toggles created this way like this:

void process(WithValidation withValidation,
             WithNewEngine  withNewEngine)
{
  if (withNewEngine || withValidation)  // ok
    validate();

  // ...
}

process(WithValidation{true}, WithNewEngine{false}); // ok
process(WithNewEngine{true}, WithValidation{false}); // fails
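For readers who want a feel for how such a template can work, here is a minimal sketch. It is not the library's code: the namespace sketch and the function shouldValidate are made up for this illustration, and the real implementation offers more (for example comparison operators).

```cpp
#include <type_traits>

namespace sketch // hypothetical namespace; not the library's own code
{
  template <typename Tag>
  class tagged_bool
  {
    bool value;

  public:
    // Explicit construction from bool only...
    constexpr explicit tagged_bool(bool v) : value{v} {}
    // ...so reject arguments that would otherwise convert to bool.
    constexpr explicit tagged_bool(int) = delete;
    constexpr explicit tagged_bool(double) = delete;
    constexpr explicit tagged_bool(void*) = delete;

    // Explicit re-tagging from a toggle with a different tag.
    template <typename OtherTag>
    constexpr explicit tagged_bool(tagged_bool<OtherTag> other)
      : value{static_cast<bool>(other)} {}

    // Explicit conversion: usable in if, !, && and ||, but never implicitly.
    constexpr explicit operator bool() const { return value; }
  };
}

using WithValidation = sketch::tagged_bool<class WithValidation_tag>;
using WithNewEngine  = sketch::tagged_bool<class WithNewEngine_tag>;

// Hypothetical helper mirroring the condition used in process().
constexpr bool shouldValidate(WithValidation withValidation,
                              WithNewEngine  withNewEngine)
{
  return withNewEngine || withValidation; // contextual conversions to bool
}
```

The explicit conversion operator is what makes the logical expression compile without casts, while the explicit constructors keep naked bools and mismatched toggles out of the interface.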

And that’s it. tagged_bool is part of the Explicit library, which provides a number of tools for stating your intentions more explicitly in interfaces.


40 Responses to Toggles in functions

  1. Elron says:

    ” Also, we can use type bool as enumeration’s underlying type; this way we make sure we only encode two states, and that we use up only the size of one byte”

    sizeof(bool) might not be 1, so I think you meant “the size of one bool”

  2. Bartek F. says:

    Great post!
    But it’s quite sad that you have to roll your own solutions. There should be direct help from the language.

  3. Qb says:

    I usually use a “flags” value class which is a bit mask of enum values.
    Something like this process(WITH_VALIDATION | WITH_NEWENGINE)
    Inside the function I can test single bit values with some inline member functions Is IsNot IsOneOf etc…

  4. Tomas says:

    Nice!
    This not only prevents the wrong order in derived overloads but also prevents the wrong order in definition -> declaration of methods.

    Thanks for sharing!

  5. kszatan says:

    My first thought would be to get rid of those switches and extract parts responsible for validations and new/old engine to separate classes and then pass them as arguments. The “process” function already does too much.

    • When your specific use case is sufficiently simple, then not using any ‘toggles’ may be indeed a better option. But in the cases where the decision to switch the toggle is made a number of layers up in the call stack such a rewrite may not be feasible/practical:

      int main()
      {
        Revert revert_to_old_functionality {read_from_config()};
        Layer1::fun(revert_to_old_functionality);
      }
      
      void Layer1::fun(Revert revert)
      {
        // something else...
        Layer2::fun(revert);
      }
      
      void Layer2::fun(Revert revert)
      {
        // something else...
        Layer3::fun(revert);
      }
      
      void Layer3::fun(Revert revert)
      {
        // something else...
        if (revert)
          do_poor_validation();
        else
          do_thorough_validation();
      }
      
    • Agreed. I couldn’t look at a function with bools in the parameters without trying really hard to refactor them away. It’s hard to take this problem at face value.

  6. Pingback: More on Naked Primitives | Using C++

  7. micleowen says:

    class C
    {
    explicit C(bool withDefaults, bool withChecks);
    explicit C(int* array, size_t size);
    };

    explicit is used in constructors with one parameter. Refresh your knowledge on this topic 🙂

    • There is a good reason to declare nearly all constructors as explicit, not only those with one argument, especially since C++11. Sometimes even a default constructor had better be declared as explicit. For justification, I would redirect anyone interested to this article.

  8. chris says:

    You’re going to hate me for this, but:

    struct S {
      void foo() {}
    };

    int main() {
      tagged_bool b{&S::foo}; // Compiles
    }

    http://melpon.org/wandbox/permlink/jgm8J5ju6SNNNmtS

  9. alfC says:

    Great posting. There is an obscure language called Asymptote that has C like syntax and has named parameters and uses type identity to resolve functions calls, including passing arguments in any order. In other words if the call can be deduced somehow it will do it for you.

  10. alfC says:

    tagged_bool can have a couple of static constants to reenable the double colon syntax: `WithNewEngine::False`.
    (Pity it needs to be capitalized or underscored)

    • Thanks for bringing this up. In the early stages, I tried to come up with double-colon syntax, but it always resulted in some surprises. In the end, I observed that WithNewEngine::False is not shorter or more intuitive than WithNewEngine{false}. Plus the latter allows you to pass run-time variables.

  11. ARNAUD says:

    I’m not sure I understand the reasoning behind the following for enum class types.

    Implicit conversions from and to bool do not work (and this is desired), but also explicit conversions do not work, and this is a problem.

    Why do you think writing the following is that bad?

    if(withValidation == WithValidation::True)

    On one hand you have something available in the language and code that is clearly legible and understood by all C++ developers. On the other hand, you need a specific templated class for automatic bool conversion? I’m not convinced.

    One more thing, imagine after a while your engine variable is not a bool anymore but can become no_engine, engine_v1, engine_v2 … the enum class allows this extension in a straightforward way, not your tagged_bool.

    • You bring up two points.

      Point 1.
      The choice between

      if (withValidation || withNewEngine)
      

      and

      if (withValidation == WithValidation::True 
          || withNewEngine == WithNewEngine::True)
      

      And actually, in a program that uses namespaces:

      if (withValidation == SomeNamespace::WithValidation::True 
          || withNewEngine == SomeNamespace::WithNewEngine::True)
      

      To me this is a trade-off between the level of desired type safety and convenience. My personal choice is: something safer than naked bools, but less verbose than strong enums. It seems that your trade-off falls somewhere closer to strong enums.

      Point 2.
      The potential future addition of the third state.

      If you foresee that in the future you might need a third state, then indeed strong enums may be a better choice. But not necessarily, because, you just add the third value to the declaration, and all former if-statements still compile, even though you want to inspect all of them, to add the third state.
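A short sketch of that risk, using a hypothetical Engine enumeration and a hypothetical check usesNewEngine that was written while only two states existed:

```cpp
// Hypothetical toggle that has grown a third state.
enum class Engine { Old, New, Experimental };

// Written when only Old and New existed; it still compiles unchanged.
constexpr bool usesNewEngine(Engine engine)
{
  return engine == Engine::New;
}

// The new state silently falls into the "not new" branch.
static_assert(!usesNewEngine(Engine::Experimental), "");
```

Nothing forces a review of the existing comparison; the compiler accepts it as before.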

      It is just that in my experience, these ‘toggles’ are used as temporary solutions, and their future development is not to add a third state, but to remove two states. E.g., I improve some part of the program, but for a couple of months I want to give the users the ability to switch back to the old functionality, in case I overlooked something, and the improvement only makes the things worse. If after a couple of months all customers are satisfied, I remove the support for the older functionality, and the toggle gets removed.

  12. Magnus Fromreide says:

    It might be just me but I was missing plain old enums from the discussion of the proper type to use in the interface.
    Enums do solve enough of the problem to be a good candidate in this case.

    * Distinct types – Yes, enum values are of distinct types
    * Prevents conversion – Enough, there is an implicit conversion from enum-value to int but you can’t go the other way.

    This means process(false, true) is a compile error just like process(WithNewEngine{true}, WithValidation{false}) but process(WithValidation{false}, WithNewEngine{true}) is ok.

    Now, this fails in that you can assign out of range values to the new types but I believe this is less of a problem than wrong order.

    One might also see the fact that the values ain’t scoped as a problem with enums.

    In any case I think enums deserve a mention in this post.

    • Thanks for bringing this up. Yes, if we consider naked bools on one end, and strong bool-based enums on the other end, old-style enums fall somewhere in between. The format of the posts in this blog does not allow me to examine every aspect of the problem, so I need to leave some out, but it is good that they come back in the comments.

      I didn’t mention it in the post, but one thing you do with these toggles occasionally is to store them in class members. If you use a strong enumeration, with bool as a base type:

      enum class WithNewEngine : bool { False, True };
      

      You get the guarantee that:

      sizeof(WithNewEngine) == sizeof(bool);
      

      You get the same guarantee for tagged_bool. But with old-style enums the underlying type is chosen by the implementation, and you typically get the size of an int.
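The size guarantee can be written down as a compile-time check. OldStyleWithNewEngine and its enumerators are hypothetical names added here for comparison:

```cpp
#include <type_traits>

enum class WithNewEngine : bool { False, True };     // fixed underlying type
enum OldStyleWithNewEngine { EngineOld, EngineNew }; // hypothetical old-style counterpart

// sizeof of an enumeration is the sizeof of its underlying type.
static_assert(sizeof(WithNewEngine) == sizeof(bool), "");
static_assert(std::is_same<std::underlying_type<WithNewEngine>::type, bool>::value, "");

// For the old-style enum the underlying type is picked by the implementation;
// all we can rely on here is that it is some integral type.
static_assert(std::is_integral<std::underlying_type<OldStyleWithNewEngine>::type>::value, "");
```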

  13. michal s says:

    this type tagging technique is very nice, I think I read about similar idea to prevent converting integer-like types. http://www.ilikebigbits.com/blog/2014/5/6/type-safe-identifiers-in-c

  14. mftdev00 says:

    i don’t like toggles in general. goes against single responsibility. do one thing when true, another when false…

  15. SebB says:

    Hi Andrzej, nice post.
    I have a question, instead of explicitly deleting constructors for every type, like :

    constexpr explicit tagged_bool (bool v) : value {v} {}
    
    constexpr explicit tagged_bool (int) = delete;
    constexpr explicit tagged_bool (double) = delete;
    constexpr explicit tagged_bool (void*) = delete;
    

    could it be possible to just “delete” it for every type other than “bool” at once:

    constexpr explicit tagged_bool (bool v) : value {v} {}
    
    template <typename SomethingOtherThanBool>
    constexpr explicit tagged_bool(SomethingOtherThanBool) = delete;
    

    It seems to work, but is there a catch ?

    • There was no catch at the time of designing this interface, I just didn’t consider this possibility. It might be a useful addition. But now, that you suggested it, I can see one case, where it would have a negative impact: somebody may be using their own — safe — Boolean type that has an implicit conversion to bool. Maybe we want to allow it to inter-operate with tagged_bool.
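The trade-off can be sketched as follows. The names tagged_bool_catch_all, SafeBool and Toggle are hypothetical and exist only for this illustration:

```cpp
#include <type_traits>

template <typename Tag>
class tagged_bool_catch_all // hypothetical name for this variant
{
  bool value;

public:
  constexpr explicit tagged_bool_catch_all(bool v) : value{v} {}

  // Any argument type other than bool selects this deleted overload.
  template <typename SomethingOtherThanBool>
  constexpr explicit tagged_bool_catch_all(SomethingOtherThanBool) = delete;

  constexpr explicit operator bool() const { return value; }
};

// Hypothetical user-defined "safe" Boolean with an implicit conversion to bool.
struct SafeBool
{
  bool value;
  constexpr operator bool() const { return value; }
};

using Toggle = tagged_bool_catch_all<class Toggle_tag>;

static_assert( std::is_constructible<Toggle, bool>::value, "bool is accepted");
static_assert(!std::is_constructible<Toggle, int>::value, "int is rejected");
static_assert(!std::is_constructible<Toggle, void*>::value, "pointers are rejected");
// The catch: SafeBool is rejected too, even though it converts to bool.
static_assert(!std::is_constructible<Toggle, SafeBool>::value, "");
```

With the deleted template, only an argument of type bool itself gets through; anything that merely converts to bool is turned away.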

  16. Pingback: C++ Annotated: Jan – Mar 2017 | CLion Blog

  17. Pingback: C++ Annotated: Jan – Mar 2017 - ReSharper C++ Blog

  18. Andres Gongora says:

    Excellent post. I took a look into your git repository and stumbled upon your “artificial namespace to prevent ADL into namespace xplicit”. I’m not sure why you do this. Could you please explain in what kind of situation your xplicit namespace needs protection against ADL?

    I have read your post about your customizable framework, and the one about overload resolution, so I’m starting to see the advantages of ADL. I also understand that when calling a function, ADL could accidentally use a different function from the namespace of the arguments if there happens to be one with the same name as the one we intend to call.

    Still, it’s the first time I have seen protection against ADL in code, and I wanted to ask you why it is needed and whether you recommend it as good practice.

    • It all becomes relevant when I provide a library with a number of classes. This is visible with namespaces like std or boost, which are full of types.

      Suppose in my huge library I have class X, in a separate header x.h:

      // header x.h:
      namespace lib
      {
        struct X
        {
          int i;
        };
          
        bool sponge(const X&) { return false; }
          
        // algorithms
        template <typename T>
        void alter(T& v) { if (sponge(v)) v.i = 0; }
      }
      

      I have made sponge part of X’s interface. Now the user of my library is using this class in main.cpp:

      #include "x.h"
      #include "a.h" // some other includes...
      
      int main()
      {
        lib::X x;
        lib::alter(x);
      }
      

      And it fails to compile. But everything looks fine. The reason for the file not compiling is that in the same library (in the same namespace), in file a.h my colleague has implemented other types and functions:

      // header a.h:
      namespace lib
      {
        struct A { bool sponge; };
        struct B { bool sponge; };
        
        template <typename T>
        bool sponge(T& a) { return a.sponge = true; }
      }
      

      And because his function template sponge takes its argument by non-const reference, it is a better match for a non-const X, and ADL finds it.

      I may not have known about my colleague’s header. My colleague might not have known about my header. But now the user cannot even use our library. It is still not that bad because the program fails to compile. The worst thing would be if it compiled and did something other than the user intended because of the inadvertent ADL. But if I use this ADL protection systematically, and declare my types as:

      // header x.h:
      namespace lib
      {
        namespace X_scope
        {
          struct X
          {
            int i;
          };
          
          bool sponge(const X&) { return false; }
        } 
        using  X_scope::X;
      
        // algorithms
        template <typename T>
        void alter(T& v) { if (sponge(v)) v.i = 0; }
      }
      

      This problem never occurs (even if my colleague forgets about this pattern).
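As a rough check, both headers can be merged into a single file. The sketch below assumes C++14 or later and deviates from the headers above in two labeled ways: sponge(const X&) and alter are marked constexpr so that the result can be verified at compile time, and alterAndRead is a hypothetical helper added for that purpose.

```cpp
// Contents of x.h, with the ADL protection in place.
namespace lib
{
  namespace X_scope
  {
    struct X
    {
      int i;
    };

    constexpr bool sponge(const X&) { return false; } // constexpr added for the check
  }
  using X_scope::X;

  // algorithms
  template <typename T>
  constexpr void alter(T& v) { if (sponge(v)) v.i = 0; } // constexpr added for the check
}

// Contents of a.h, included after x.h.
namespace lib
{
  struct A { bool sponge; };

  template <typename T>
  bool sponge(T& a) { return a.sponge = true; }
}

// Hypothetical helper: runs alter() and reports the resulting value of i.
constexpr int alterAndRead(int initial)
{
  lib::X x{initial};
  lib::alter(x); // ADL looks only in lib::X_scope, so sponge(const X&) is called
  return x.i;
}

static_assert(alterAndRead(42) == 42,
              "X_scope::sponge returned false, so i is left untouched");
```

Since X now lives in lib::X_scope, namespace lib is no longer associated with it, and the function template from a.h is not a candidate for this call.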

      Another reason is compile-time performance when instantiating templates: fewer namespaces need to be looked up.

      • Andres Gongora says:

        Thank you so much for your extremely quick answer and taking the time to write the above example code 🙂

        If I understand it right, by placing my class and all its accompanying functions in an auxiliary nested namespace, I’m protecting the parent namespace (lib in your example) against accidental ADL mismatches. Meaning that this pattern protects the developers of the library rather than its user.

        Yet, I’m also preventing the end user from using ADL to create a glue library to overload a third-party library and integrate it with my classes… Which leaves the user without any options (please correct me if I got this wrong).

        Do you have any recommendation of when this pattern should, or rather shouldn’t, be used?

        • Meaning that this pattern protects the developers of the library rather than its user.

          If I were to answer the question, “who is protected”, I would say it is the end user. Header a.h is fine in isolation and so is x.h. It is using them together that makes the problem appear.

          Yet, I’m also preventing the end user from using ADL to create a glue library to overload a third-party library and integrate it with my classes… Which leaves the user without any options (please correct me if I got this wrong).

          Sorry, I am not following which glue library is in question here.

          Do you have any recommendation of when this pattern should, or rather shouldn’t, be used?

          Unfortunately, it seems to me that it should be used for practically any class, to tell its non-member interface from non-interface functions. This is to work around the too dangerous ADL in C++. But it is too burdensome, so I am afraid to just recommend it to normal programmers. In the library that I am writing (this explicit), I will write it once and I expect it will be used many a time by many users; so the effort of adding that protection will pay off.

        • Andres Gongora says:

          Update to explanation of what I meant with “glue library”.

          I’m referring to a situation similar to the one explained in this blog’s overload [resolution example](https://akrzemi1.wordpress.com/2015/11/19/overload-resolution/), where the user has 2 completely independent libraries which are not trivial to integrate. In this scenario we need “some glue” between both libraries:

          #include
          #include
          #include "some_glue_between_1_and_2"

          The nicest way to accomplish this, without modifying any of the libraries is through ADL (I believe?). Yet, by preventing ADL I’m also preventing the user from creating overloaded functions that have my ADL-protected class as argument.

        • Ok, I see what you mean. I am getting more and more convinced that for this “glue” or “customization points” one should not use function overloads, but class template specializations. I gave you one example of a bug where you do not expect ADL but get it. There are other cases where you expect ADL, but do not get it. But this is a subject for another blog post.

        • Andres Gongora says:

          Oh! I completely forgot about class template specializations! That, and your previous answer just answered my question 🙂

          Thanks again for all and keep on posting, this blog is very insightful and inspiring.

  19. alfC says:

    Nice post. Why stop at `bool`? Any argument of any type could be tagged, for consistency or for convenience. And then once you tag them, why preserve the order? A template metaprogram can reorder them and forward them to the right function. There is a language called Asymptote http://asymptote.sourceforge.net/ (for drawing, C-like, interpreted) that uses a very powerful combination of ordered and named arguments that deduces what to do in most cases. I think it is an interesting case study.

  20. Pingback: Ways to rewrite boolean parameters in C++ - Prog.World
