CS 70

Safe Numeric Conversions

As we've learned, although promotions from one numeric type to another are always safe, conversions may not be. In many situations, C++ will automatically convert from one numeric type to another, but this behavior can lead to unexpected results if the conversion is not safe.

For example, consider this code:

#include <iostream>
#include <cstdint>

int main() {
    int32_t x = 123456789;
    uint16_t y = x;  // Implicit conversion from int32_t to uint16_t
                     // If it was an `int16_t`, the value would still be out of
                     // range: the result is implementation-defined before
                     // C++20 and wraps modulo 2^16 from C++20 onward.
    std::cout << y << "\n";  // Prints 52501, not 123456789!
    return 0;
}

Even with both -Wall and -Wextra turned on, this code compiles without any warnings; it takes a more specialized flag such as -Wconversion to get the compiler to complain. But the conversion from int32_t to uint16_t is not safe, because the value 123456789 cannot be represented in a uint16_t.
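
To see where 52501 comes from: converting to an unsigned type keeps the value modulo 2ⁿ, where n is the width of the destination, so only the low 16 bits of x survive. The sketch below reproduces that arithmetic by hand; the names lowSixteenBits and convertToUint16 are made up for this illustration.

```cpp
#include <cstdint>

// Hypothetical helper for illustration: computes by hand what the
// conversion to uint16_t does, namely reduce the value modulo 2^16.
std::uint32_t lowSixteenBits(std::int32_t x) {
    // Going through uint32_t first keeps the arithmetic well defined
    // for negative inputs too.
    return static_cast<std::uint32_t>(x) % 65536u;
}

// The conversion from the example, written as an explicit cast.
std::uint16_t convertToUint16(std::int32_t x) {
    return static_cast<std::uint16_t>(x);
}
```

Both functions give 52501 for 123456789, because 123456789 = 1883 × 65536 + 52501.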

Using an int16_t instead of a uint16_t would not help: the value is still out of range for the destination type. Before C++20, the result of such a conversion is implementation-defined; from C++20 onward, it is defined to wrap modulo 2¹⁶. (Conversion is different from arithmetic here: unsigned arithmetic wraps around on overflow, but signed-integer arithmetic overflow is undefined behavior in C++.)

Some code does produce warnings. For example,

#include <iostream>
#include <vector>

struct Elephant {
    int age = 42;
    // ... imagine this is a full-featured class ...
};

Elephant& fetchElephant(std::vector<Elephant>& elephantHouse, int index) {
    // Allows Python-style negative indexing.
    // Check the range first.
    if (index < -elephantHouse.size() || index >= elephantHouse.size()) {
        throw std::out_of_range("fetchElephant: index out of range");
    }
    // Convert negative index to positive index.
    if (index < 0) {
        index = elephantHouse.size() + index;
    }
    return elephantHouse[index];
}

int main() {
    std::vector<Elephant> elephantHouse(5);
    Elephant& dumbo = fetchElephant(elephantHouse, 3);
    std::cout << "Dumbo is " << dumbo.age << " years old.\n";
    return 0;
}

This code gives no warnings when compiled with just -Wall, but if we add -Wextra, we get this warning:

cast-problem2.cpp:12:48: warning: comparison of integers of different signs: 'int' and 'size_type' (aka 'unsigned long') [-Wsign-compare]
   12 |     if (index < -elephantHouse.size() || index >= elephantHouse.size()) {
      |                                          ~~~~~ ^  ~~~~~~~~~~~~~~~~~~~~
cast-problem2.cpp:12:15: warning: comparison of integers of different signs: 'int' and 'size_type' (aka 'unsigned long') [-Wsign-compare]
   12 |     if (index < -elephantHouse.size() || index >= elephantHouse.size()) {
      |         ~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.

But when we run this code, it always fails with a std::out_of_range exception. Because elephantHouse.size() is an unsigned type, negating it produces a very large positive number, and index is converted to that same unsigned type before each comparison. So the condition index < -elephantHouse.size() is true whenever index is non-negative, and the condition index >= elephantHouse.size() is true whenever index is negative, because the converted index is then a huge positive number.

A Common Solution: Explicit Casts

One common solution to the problem of signed/unsigned comparisons is to use explicit casts to ensure that both sides of the comparison are of the same type. But doing so requires care. For example, we could change the condition to

if (static_cast<size_t>(index) < -elephantHouse.size()
    || static_cast<size_t>(index) >= elephantHouse.size()) {

but this change doesn't fix the original problem and introduces a new one: if index is negative, then converting it to size_t will produce a very large positive number, and the comparison will not work as intended. A more correct fix would be to cast elephantHouse.size() to int:

if (index < -static_cast<int>(elephantHouse.size())
    || index >= static_cast<int>(elephantHouse.size())) {
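
As a rough check that the signed comparison behaves as intended, here is the same logic pulled out into a standalone helper. The name normalizeIndex is hypothetical, a vector of int stands in for the elephant house, and the sketch assumes the size fits in an int (static_cast does not verify that).

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: maps a Python-style index onto [0, size),
// or throws std::out_of_range if the index is not valid.
std::size_t normalizeIndex(const std::vector<int>& v, int index) {
    // Assumption: v.size() fits in an int. The cast is unchecked.
    int size = static_cast<int>(v.size());
    if (index < -size || index >= size) {
        throw std::out_of_range("normalizeIndex: index out of range");
    }
    // Convert a negative index to the equivalent positive one.
    if (index < 0) {
        index += size;
    }
    return static_cast<std::size_t>(index);
}
```

With five elements, indices -5 through 4 are accepted and everything else throws.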

But if we return to our first example, we can see that explicit casts don't always help. The number 123456789 simply cannot be represented in a uint16_t, so even if we write

uint16_t y = static_cast<uint16_t>(x);

the conversion is still unsafe, and the code still produces an unexpected result.

  • Hedgehog speaking

    Gah. I never realized that using static_cast might actually lead to more problems! Now I see that it doesn't deal with lossy conversions at all!

  • Goat speaking

    Meh. And writing static_cast<uint16_t>(x) is a lot of typing for a little thing.

  • Duck speaking

    Can we write (uint16_t)x instead? That's shorter.

  • LHS Cow speaking

    Well, that doesn't fix our problem in this case, but it's also what's known as a C-style cast, and it's generally discouraged in C++ because it can do many different kinds of casts (including const_cast and reinterpret_cast), which can lead to unsafe code. So it's better to avoid C-style casts in C++.

  • RHS Cow speaking

    But let's see if we can come up with a better solution.

The CS 70 Narrowing-Cast Library

The CS 70 narrowing-cast library provides a set of functions that perform safe conversions between numeric types. It provides a family of functions named to_type, one for each built-in numeric type, such as to_int, to_uint16_t, to_double, etc. Each function checks that the conversion is safe, and if it is not, it throws an exception (specifically, std::overflow_error).
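
To give a feel for what "checks that the conversion is safe" could mean, here is a minimal sketch of one such check for a single pair of types. It is not the library's code: checkedToUint16 is a hypothetical stand-in, and the real functions may be implemented quite differently.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical stand-in, NOT the CS 70 library's implementation:
// converts only when the value is representable, otherwise throws.
std::uint16_t checkedToUint16(std::int32_t value) {
    if (value < 0 || value > std::numeric_limits<std::uint16_t>::max()) {
        throw std::overflow_error("checkedToUint16: value out of range");
    }
    // The range check above guarantees this cast preserves the value.
    return static_cast<std::uint16_t>(value);
}
```

Values from 0 through 65535 pass through unchanged; anything else raises std::overflow_error.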

Thus our first example can be rewritten as

#include <iostream>
#include <cs70/narrowing_cast.hpp>  // Safe numeric conversions
#include <cstdint>

using namespace cs70;

int main() {
    int32_t x = 123456789;
    uint16_t y = to_uint16_t(x);  // Safe conversion from int32_t to uint16_t
    std::cout << y << "\n";       // Never reached because the conversion
                                  // throws an exception
    return 0;
}

Now this program crashes at runtime with an exception, rather than producing an unexpected result.

  • Duck speaking

    So it crashes now? Is that better?

  • LHS Cow speaking

    Well, at least it doesn't produce a wrong answer silently. And we can catch the exception if we want to handle it gracefully.
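
Handling the exception gracefully might look like the sketch below. So that it compiles on its own, it uses a hypothetical stand-in (standInToUint16) in place of the library function; toUint16Or is likewise a made-up name, and falling back to a default value is just one possible policy.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the library's checked conversion.
std::uint16_t standInToUint16(std::int32_t value) {
    if (value < 0 || value > 65535) {
        throw std::overflow_error("standInToUint16: value out of range");
    }
    return static_cast<std::uint16_t>(value);
}

// Returns `fallback` instead of letting the exception escape.
std::uint16_t toUint16Or(std::int32_t value, std::uint16_t fallback) {
    try {
        return standInToUint16(value);
    } catch (const std::overflow_error&) {
        return fallback;
    }
}
```

Whether a fallback value, an error message, or a rethrow is appropriate depends on the program; the point is only that the failure is now visible and catchable.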

Similarly, our second example can be rewritten as

Elephant& fetchElephant(std::vector<Elephant>& elephantHouse, int index) {
    // Allows Python-style negative indexing
    // Check the range first
    int houseSize = to_int(elephantHouse.size());
    if (index < -houseSize || index >= houseSize) {
        throw std::out_of_range("fetchElephant: index out of range");
    }
    // Convert negative index to positive index
    if (index < 0) {
        index = houseSize + index;
    }
    return elephantHouse[index];
}

  • Horse speaking

    Hay, wait a moment. What if elephantHouse.size() is too big to fit in an int? Won't to_int throw an exception then?

  • LHS Cow speaking

    Because we expect the elephant house to have a modest number of elephants, an int is a reasonable type to represent the number of elephants. But because we've used safe casting, if we somehow do end up with more than 2 billion elephants in our elephant house, the code will throw an exception rather than silently producing a wrong result.

  • Pig speaking

    Can we convert MORE types? Like floating point ones?

Floating Point Types

The CS 70 narrowing-cast library also supports conversions to and from floating-point types: float, double, and long double. The conversion functions that produce floating-point values are to_float, to_double, and to_longdouble, and the ones that produce integer types all accept floating-point arguments.

When converting from floating point to integer types, the conversion is safe if the floating point value is within the range of the integer type. We assume that you're okay with rounding down to the nearest integer; we only throw an exception if the value is out of range for the integer type.
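
A range check of this kind could be sketched as follows. This is a hypothetical illustration, not the library's code. It relies on static_cast, which discards the fractional part (truncating toward zero); the text above does not say how the library treats negative fractions, so that behavior is an assumption of the sketch. It also rejects NaN, which has no integer equivalent.

```cpp
#include <cmath>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in: double -> int32_t with a range check.
std::int32_t checkedToInt32(double value) {
    // After truncation, anything strictly between -2^31 - 1 and 2^31
    // lands inside the int32_t range. Both bounds are exact as doubles.
    bool inRange = value > -2147483649.0 && value < 2147483648.0;
    if (std::isnan(value) || !inRange) {
        throw std::overflow_error("checkedToInt32: value out of range");
    }
    return static_cast<std::int32_t>(value);  // drops the fractional part
}
```

Checking the range before casting matters: casting an out-of-range floating-point value to an integer type is undefined behavior, so the check cannot be done after the fact.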

When converting from integer types to floating-point types, the conversion is safe if the integer value can be represented exactly in the floating-point type. So the integer value must be within the range of values that can be represented exactly in the floating-point type, which depends on the number of bits in the mantissa of the floating-point type. Specifically, for float, it must be in the range [-2²⁴, 2²⁴]; for double, it must be in the range [-2⁵³, 2⁵³].
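
Those ranges translate directly into code. The sketch below is a hypothetical illustration (the names are invented and this is not the library's implementation) of the range test for float, plus a checked conversion built on it.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical predicate: is v inside [-2^24, 2^24], the range in which
// every integer is exactly representable as a float?
bool inExactFloatRange(std::int64_t v) {
    constexpr std::int64_t limit = std::int64_t{1} << 24;  // 16,777,216
    return v >= -limit && v <= limit;
}

// Hypothetical stand-in for an integer -> float checked conversion.
float checkedToFloat(std::int32_t v) {
    if (!inExactFloatRange(v)) {
        throw std::overflow_error("checkedToFloat: not exactly representable");
    }
    return static_cast<float>(v);
}
```

The same pattern with a limit of 2^53 would cover double.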

  • Horse speaking

    Hay! So in our first example, with 123456789, if we had converted it to a float instead of a uint16_t, that would have lost information? Because 123456789 is bigger than 2²⁴?

  • LHS Cow speaking

    Yes, that's right. Converting 123456789 to a float would lose information, because float can only represent integers exactly up to 16,777,216. 123456789 actually becomes 123456792 because it's represented as 1.11010110111100110100011₂ × 2²⁶, which is 123456792 in decimal; it only has 24 bits of precision in the mantissa.
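
The arithmetic behind 123456792 can be checked directly. With 24 bits of precision and a leading bit worth 2²⁶, adjacent floats in this range are 2³ = 8 apart, so the conversion picks the nearest multiple of 8. The helper names below are made up for this sketch, which assumes the usual IEEE 754 single-precision float.

```cpp
#include <cstdint>

// Converts to float and back, exposing the rounding that happens.
std::int32_t throughFloat(std::int32_t v) {
    return static_cast<std::int32_t>(static_cast<float>(v));
}

// Integer-only reconstruction for positive values in [2^26, 2^27):
// round to the nearest multiple of 8. Exact ties are not handled the
// way floats handle them, but 123456789 is not a tie (remainder 5).
std::int32_t nearestMultipleOf8(std::int32_t v) {
    return (v + 4) / 8 * 8;
}
```

Both functions map 123456789 to 123456792, since 123456789 = 8 × 15432098 + 5 and the remainder 5 rounds up.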

Wrapping Up

Here are the key points to remember:

  • C++ will automatically convert between numeric types, but the conversion may not be safe.
  • Adding static_cast can help make conversions explicit, but it doesn't guarantee safety—writing static_cast<uint16_t>(x) doesn't make the conversion safe if x is too large to fit in a uint16_t or is negative.
  • The CS 70 narrowing-cast library provides functions that perform safe conversions between numeric types, throwing an exception if the conversion is not safe. You should use it whenever you need to convert between numeric types in this course.

  • Hedgehog speaking

    This isn't just an issue in CS 70, right? It's going to be a problem in other C++ code I write? So what should I do when I write C++ code outside of CS 70?

  • LHS Cow speaking

    Yes, this is a general issue in C++. You have four options:

    • Microsoft's GSL (Guidelines Support Library) provides a narrow function that performs safe conversions.
    • The Boost library provides a numeric_cast function that performs safe conversions.
    • You can use the same narrowing-cast library you used in CS 70. It's available as a Gist on GitHub, but it puts the functions in the meo namespace instead of cs70.
    • You can write your own safe conversion functions, but be careful to handle all the edge cases correctly.

  • Goat speaking

    Meh. Or you could just do things the classic C++ way and not bother checking anything ever and hope for the best.

  • Hedgehog speaking

    rolls eyes … I think I'll opt for safety, thanks!
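
For the last option in LHS Cow's list, one well-known approach for integer types is a round-trip check: cast to the destination type, cast back, and confirm that both the value and the sign survived. The sketch below uses a hypothetical name (checkedIntCast), covers integers only, and is meant as a starting point rather than a finished library.

```cpp
#include <stdexcept>
#include <type_traits>

// Hypothetical integer-only checked cast using a round-trip test.
template <typename To, typename From>
To checkedIntCast(From value) {
    static_assert(std::is_integral_v<To> && std::is_integral_v<From>,
                  "this sketch handles integer types only");
    To result = static_cast<To>(value);
    // The value must survive the trip there and back...
    bool sameValue = static_cast<From>(result) == value;
    // ...and the sign must agree, which catches cases such as
    // int -1 <-> unsigned 4294967295 that round-trip "successfully".
    bool sameSign = (result < To{}) == (value < From{});
    if (!sameValue || !sameSign) {
        throw std::overflow_error("checkedIntCast: value not representable");
    }
    return result;
}
```

The sign comparison is the edge case that is easiest to forget: without it, converting -1 to an unsigned type of the same width would pass the round-trip test.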
