More than you need

The classes you design can do more (in terms of allowed operations) than what you could figure out from just looking at their member function declarations. The C++ Standard defines a number of cases where certain expressions involving your type are valid, even though there are no corresponding member function declarations. Sometimes this is just what you need; but sometimes the additional operations you never asked for can have grave negative impact on your program correctness.

For the first case, let’s consider implicitly declared special member functions. I often ask this question in job interviews. In the below code, what is the bug and how to fix it?

class String
{
  char * _str;

public:
  String(const char * str)
    : _str( new char[strlen(str) + 1] )
  {
    strcpy(_str, str);
  }
  
  ~String()
  {
    delete [] _str;
  }

  bool equal(const String& rhs) const
  {
    return strcmp(_str, rhs._str) == 0;
  } 
};

String make_string()
{
  return String("cat");
}

In case, you didn’t see the problem immediately, here is the answer. You may have wanted it or not, but the compiler has implicitly declared (and perhaps defined later) the copy constructor for String. The compiler does not know that you are managing a resource and generates a shallow copy. After the copy, the two instances think they are managing the very same resource: just wait for the problem to happen.

I am far from saying that auto-generating copy operations is something unwanted. In fact, having to always provide your custom definition in every class would be a nightmare. It is only about this case.

There is more than one way of fixing the above example; but the question we should be really answering is what precautions to take in order never to see this bug in the code in the first place. We leave this question open for now, though.

For the second case, let’s consider another bug. We have a class FixedDecimal that represents decimal numbers with fixed-point arithmetic. Class FixedDecimal supports multiplication: by other FixedDecimal and by an int:

FixedDecimal<2> d ("1.20");
FixedDecimal<2> f ("0.50");

assert ((d * 2).to_string() == "2.40");
assert ((d * f).to_string() == "0.60");

Accidentally, one day I requested multiplication by a double:

FixedDecimal<2> d ("1.20");
assert ((d * 0.50).to_string() == "0.60");

The program compiles, and only later do I find that the result is not 0.60 as I expected, but is 0.00 instead.

Do you know why? It has been already described in Inadvertent conversions. In short, FixedDecimal never intended to inter-operate with doubles; but it inter-operates with ints, and because doubles are implicitly convertible to ints…

Again, sometimes a conversion form double to int may be exactly what you need. But sometimes, especially in types modeling mathematical numeric concepts, this implicit conversion may be disastrous.

An important special case of implicit conversions kicking in when you least expect them, is when you inadvertently provide your own. Unless you mark your constructors explicit, especially those that can be called with one argument, you define an implicit conversion: sometimes one that you never wanted. Imagine the following piece of code:

string xml = "<root><elem>1</elem></root>";
parse_xml(xml, dom);

This program looks as though it might be doing something reasonable. It compiles; it is supposed to populate object dom with the xml content, and you might be thinking that this is the case, but in fact it does something else. This is because function parse_xml is declared as:

bool parse_xml(File const& file, Dom& dom);

And it so happens that File has the following constructor:

class File
{
public:
  File(std::string const& name, std::string mode = "r");
  // ...
};

It is not even a single-argument constructor, but because it has default values on the other arguments, it still can be used for conversions. Perhaps, the constructor didn’t even have the default arguments in the past, but they were added later. Had the compiler not been so generous in using our constructors for conversions, we would have spotted the problem during the compilation.

The existence of these “invisible functions” leads to conclusion that visible function declarations are not the best way to describe the type’s interface. Instead it is better to describe the interface by listing what expressions are allowed. This way we take into account not only the explicitly declared functions, but also all special functions and function calls injected by the compiler. This is, for instance, what Concepts Lite do. Let’s look at a sample concept:

// NOT C++ YET
template <class T>
concept bool Scalable = requires(T v, double s)
{
  v *= s;
  v /= s;
  { v * s } -> T;
  { v / s } -> T;
};

The concept requires that multiplication and division by a double works, but does not care whether T declares them as member (or non-member) functions, or if they are achieved through implicit conversions from double to int.

What can we do?

So, that’s what we have to live with. Compilers need to accept what the C++ Standard tells them to accept; and sometimes what the C++ Standard requires, in connection with our oversight may cause a bug.

But there are precautions we can take ourselves to reduce the chances of hitting one of these weaknesses of the language. Regarding the implicitly declared (and defined) member functions, one way of playing it safer is to apply the Rule of Zero, which can be read as follows. One class either only manages a resource (and does nothing else), or offers some program logic (and manages no resource directly). If some class does both, split it into two. In the class that manages the resource — only in this case is there a chance to leak a resource — you have to be careful to declare all the special member functions yourself. One way to do it safely is to first declare all the special member functions as deleted, and only re-enable them as needed. This way we start with the safe default. Or to derive (privately) all resource-handling classes from something similar to boost::noncopyable, that has all the special members inaccessible, and therewith prevents the default generation of these members in our class.

Also, note that since C++11, this behavior, where you have a custom destructor and compiler defines a shallow copy constructor, while still normative, is deprecated. Compilers must still do it, but what they can also do is to issue a warning that they are doing a deprecated and potentially unsafe thing. This is year 2015, we have been aware of this problem for at least 17 years, so you might expect that every modern compiler does that.

I have checked four compilers: GCC, Visual C++, Clang and ICC, to see if they give me a warning when I try to compile the example with the String above.

Clang with flag -Wdeprecated gives me a warning: definition of implicit copy constructor for ‘String’ is deprecated because it has a user-declared destructor [-Wdeprecated].

ICC with flag -Weffc++ gives: warning #2021: Effective C++ Item 11 declare a copy constructor and assignment operator for String.

GCC with flag -Weffc++ gives: ‘class String’ has pointer data members but does not override ‘String(const String&)’ or ‘operator=(const String&)’.

I couldn’t make Visual C++ to give me a warning about the problem. It warns me about other things (it considers calling strcpy unsafe), but not about the real issue here.

Personally, I feel only Clang does it the right way. It is good that I get any warning In ICC and GCC, but the reasons for so doing (that I have a pointer member) seem wrong.

Regarding the inadvertent conversion from double, both GCC and Clang offer a warning that can detect it: -Wfloat-conversion. Also warning C4244 in Visual C++ can detect it, but the latter warning detects much more, perhaps too much.

If you want to disable the inadvertent conversion selectively in numeric types, what you can do is to declare additional overloads with deleted definitions:

template <int Dec>
class FixedDecimal
{
  // ...
  FixedDecimal& operator *= (int s) { /*implement*/ }
  FixedDecimal& operator *= (double s) = delete;

  FixedDecimal& operator /= (int s) { /*implement*/ }
  FixedDecimal& operator /= (double s) = delete;
};

This way we have a dedicated overload for type double, except that it will fail the compilation when selected. The only problem with it is that now, practically for every function taking type int you have to provide the ‘shadow’ function taking double. The interface doubles (pun intended), but the bigger problem is that you will likely forget to provide the shadow deleted function at some point.

As an alternative, instead of int you can use a replacement, say only_int which is implicitly convertible from int, but not double, and probably not unsigned int:

template <int Dec>
class FixedDecimal
{
  // ...
  FixedDecimal& operator *= (only_int s) { /*implement*/ }
  FixedDecimal& operator /= (only_int s) { /*implement*/ }
};

You will have to define the type yourself, it will probably be something like the following:

class only_int
{
  int val_;

public:
  only_int (int v = 0) : val_(v) {}
  
  template <typename T, ENABLE_IF(is_signed_integral<T>)>  
    only_int (T v) : val_(v) {} // can static_assert the size
  
  template <typename T, ENABLE_IF(!is_signed_integral<T>)>
    only_int (T) = delete;

  int get() const { return val_; }
};

is_signed_integral is a type trait I have defined for describing signed integral types:

template <typename T>
using is_signed_integral = typename std::conditional<
  std::is_signed<T>::value && std::is_integral<T>::value,
  std::true_type,
  std::false_type
>::type;

ENABLE_IF is a macro I defined to make the code that uses enable_if tricks more concise. This syntax works in C++11, and the macro is defined as:

# define ENABLE_IF(...)                                   \
  typename std::enable_if<__VA_ARGS__::value, bool>::type \
    = true

The idea behind the three constructors of only_int is the following. The first constructor simply works when an int is used for copy construction. The second and third constructor declarations provide a ‘conditional’ constructor definition. If condition is_signed_integral is satisfied, the object is properly initialized. Otherwise, e.g., when copy-initializing from a double constructor is declared as deleted: a compile-time error is triggered.

(For a full working example see here.)

In the end, by using only_int consistently in the interface of FixedDecimal, you have one place to control which implicit conversions you want to accept and how to handle them.

Regarding the inadvertent use of converting constructors, this is the place where we get the least support from compilers. This is because there is no way to decide that the omission of keyword explicit is intended or not. Some constructors are just meant to be used in conversions. In case of the former warning about implicitly generating a copy constructor, we have a good criterion in the form of having or not having a user-provided destructor. In our case here, there is no such good criterion. Only ICC (to my knowledge) tries to address this by issuing a warning (#2304) whenever a non-explicit constructor with single argument is declared (when compiled with -Weffc++ parameter). But this is too eager: I want some constructors to be non-explicit. You can disable the warning per-class with preprocessor directives, but this is to coarse-grained. I have classes, where I need one or two constructors to be converting, but for the rest I want the compiler to warn me if I forget about keyword explicit.

The only thing I can do about the problem is to give an advice to anyone: get into the habit of declaring every constructor explicit (and only later change the ones you need to non-explicit, if you are sure this is what you want). This applies not only to single-argument constructors. As we have seen in one of the examples, constructors with more arguments can also be selected for conversion if default argument values are involved. You may not have the default values specified, but you cannot guarantee they will never be added in the future. Also, since C++11 you can trigger a sort of implicit conversion from more than one argument:

Tool t;
Pred p;
return {&t, p};

Are you aware that you are returning a std::shared_ptr?

Admittedly, the fact that you are using braces, makes it less likely that you will trigger an implicit conversion inadvertently; but the fact that you are requesting the initialization of something you may not know is scary. I consider it a safety gap in the STL that it has most of the multi-parameter constructors non-explicit.

The only good candidate for a non-explicit constructor by default is the move constructor and the copy constructor.

From my own experience I know that following advice “always declare your constructors explicit” is impossible. Even though I accept the rule, and know the consequences of departing from it, I just forget.

I wish C++ did it the other way: no constructor can be used in implicit conversions unless it is explicitly indicated as potentially converting. Since this is probably an irreversible decision, I wish compiler vendors gave me a warning about all converting non-copy, non-move constructors, plus an attribute that disables this warning per constructor declaration:

template <int Dec>
class FixedDecimal
{
public:
  FixedDecimal ();                     // warning!
  FixedDecimal (tag_t, size_t);        // warning!
  FixedDecimal (FixedDecimal const&);  // no warn (copy)
  FixedDecimal (int) [[converting]];   // no warn (attr)
  FixedDecimal (double) = delete;      // no warn (deleted)
  explicit FixedDecimal (std::string); // no warn (explicit)

  // ...
};

Unfortunately, this is not available either.

Practical advice

The cases described above have one thing in common: our types may allow more valid expressions than we want. One good way to make sure it is not the case is to (unit-)test it. Having unit tests for your types is always a good idea, but here, in our case we want to assert that certain expressions should fail to compile. Typical unit-test frameworks cannot handle this. In general, it is not possible to check from within C++ that certain expressions fail to compile. To some extent SFINAE tricks may do that, but not always, especially not if you are using static_asserts.

What you may be forced to do is to test the desired compile-time failures using an external tool: provide a source file with an expression that you expect to fail to compile and run it through the compiler. If compiler reports success, your tool reports a failure, and vice-versa.

This is how desired compile-time failures are tested in Boost. Let’s look at Boost.Optional’s tests. Here is a one test program that we expect to fail to compile. It reads more-less:

#include "boost/optional.hpp"

struct NoInitFromNull {};

void test_conversion_from_null()
{
  boost::optional<NoInitFromNull> opt = 0;
}

In the config file we specify that the “test” passes if the compiler reports compilation error (returns a non-zero code). Boost.Build system is able to handle this natively. I am not aware of any other C++ build systems with this capability. However, this SO question says you can achieve the same effect in CMake with some effort.

Another conclusion to be drawn from this post is to use more than one compiler. This is what we do in my team. For a number of reasons we want to use GCC to produce the binaries. But since we find the error/warning messages from Clang superior, we will also compile the program with the latter: only to check for potential bugs.

This entry was posted in programming and tagged , , , , , . Bookmark the permalink.

14 Responses to More than you need

  1. quicknir says:

    Halfway through, looks good, but I cannot see the underscore in “parse_xml”. This is especially significant as xml is already used, so the code was very confusing until I understood the issue. Thought I should mention it ASAP if you wanted to fix it for others.

  2. viboes says:

    Do you think that is is too soon to generalize only_int to

    template <class T, template <typename> class Trait>;
    class only {
      T val_;
     
    public:
      only (T v) : val_(v) {}
       
      template <typename U, ENABLE_IF(Trait<U>)>  
        only(U v) : val_(v) {}
       
      template <typename U, ENABLE_IF(!Trait<U>)>
        only(U) = delete;
     
      int get() const { return val_; }
    };
    
    using only_int = only<int, is_signed_integral>;
    
    • I edited your post a bit. WordPress uses its own markup syntax; a short hint for pasting code snippets can be found here. I hope I preserved the spirit though.

      To answer your question: it would be too soon for me. I know what problem I am solving with only_int: it is the third time I was hit with the problem of double converting to int. I cannot recall any other problem I have faced so far that would be solved by the generalized only. When I encounter any, I will consider generalizing the solution.

      You have made an interesting observation, though.

  3. viboes says:

    Why will you want a warning in

      FixedDecimal ();                     // warning!
      FixedDecimal (tag_t, size_t);        // warning!
    
    • To prevent the C++11 type inference from brace-initializers:

      fun({tag, 10}); // what am I actually passing?
      fun({}); // am I passing what I think I am passing?
      

      There are cases where this is useful, but I do not want to get this ‘usefulness’ accidentally.

  4. For Visual Studio, what switches did you compile with for the “definition of implicit copy constructor for ‘String’ is deprecated because it has a user-declared destructor” warning? (or, really, the lack thereof?)

  5. Chris says:

    -Weffc++ is also a thing on GCC, but it had a habit of not working wonderfully in 2012. The example I have is a bit different in that it doesn’t have a destructor, but I once asked about a false positive on SO (*shudder* three-year-old questions): http://stackoverflow.com/questions/11496942/understanding-weffc

  6. Pingback: CppCube » (翻译)超乎你的需要

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.