The Source for Our randuint32 Library

This is bonus material. You don't need to look at this code.

Header File: `randuint32.hpp`

/* randuint32.hpp
   ==============

   Interface definition for the RandUInt32 class, which provides a simple
   way to get random numbers.

   This file is provided as part of CS 70, but may be used in other
   code.
*/

#ifndef RANDUINT32_HPP_INCLUDED
#define RANDUINT32_HPP_INCLUDED

#include <iostream>
#include <cstdint>
#include "jsf.hpp"

/**
 * \class RandUInt32
 * \brief Provides a simplified way to get random numbers.
 *
 * \details
 * Contains a C++-style a random number engine as a data member that it
 * uses to produce the numbers.
 *
 */

class RandUInt32 {
 private:
    typedef jsf32 rng_type;
 public:
    /*
     * \brief    Construct a new RandUInt32 with a random seed (uses system
     *           entropy under the hood).
     *
     * \note     Entropy collection is slow. Prefer creating one RandUInt32
     *           and reusing it rather than constructing per draw.  In fact,
     *           this code just delegates to the parameterized constructor
     *           with the seed -1, which causes it to use system entropy.
     */
    RandUInt32();

    /*
     * \brief    Construct a new RandUInt32 with the given seed.
     *
     * \param seedval
     *           63-bit seed used to initialize the generator.  If the value
     *           is positive, it is used directly; if negative, its value is
     *           negated and mixed with some system entropy to produce a
     *           truly random seed.
     *
     * \note     Same positive seed ⇒ same sequence. Initialization is ~100×
     *           slower than drawing a value, so prefer reusing a single
     *           instance.  The negative seed option is provided to make it
     *           easier to truly random seeds from conditional code paths
     *           (e.g., set the seed value to -1 and then if seed is specified
     *           on the command line, set it to that positive value).
     *
     * \warning  Seeding the state with a 32-bit value causes measurable bias
     *           in early outputs. Classic occupancy problem: 2^32 seeds mapping
     *           to 2^32 possible first outputs means ~37% of uint32_t values
     *           will NEVER occur as the first result (n balls → n bins leaves
     *           n/e bins empty). Not a PRNG flaw -- this is fundamental
     *           probability theory.  When you want actual random behavior,
     *           it's easiest to use the default constructor, which provides
     *           a good source of randomness and avoids this issue, but if
     *           you want to use std::random_device or some other source
     *           of 32-bit entropy to seed the generator, make *two* calls
     *           and combine them to get a 64-bit seed.
     */
    RandUInt32(int64_t seedval);

    /*
     * \brief    Get a random unsigned integer in the range [0, 2^32-1].
     *
     * \return   A random unsigned 32-bit integer.
     *
     * \warning  Don't *ever* write `gen.get() % n` to get a number
     *           in the range [0, n-1].  This introduces bias unless
     *           n is a power of two.  Use `gen.get(n)` instead.
     */
    uint32_t get();

    /*
     * \brief    Get a random unsigned integer in the range [0, max-1].
     *
     * \param    max The upper bound (exclusive) for the random number.
     * \return   A random unsigned 32-bit integer in the range [0, max-1].
     *
     * \throws   std::invalid_argument if max is 0.
     */
    uint32_t get(uint32_t max);

    // Although the preferred interface is to use get(), we also allow the
    // PRNG to be used as a standard C++ PRNG, so that it can be used
    // with standard library facilities like std::shuffle,
    // std::uniform_int_distribution, etc.
    using result_type = typename rng_type::result_type;
    static constexpr result_type min() {
        return rng_type::min();
    }
    static constexpr result_type max() {
        return rng_type::max();
    }
    result_type operator()() {
        return rng_();
    }

 private:
    rng_type rng_;
};

#endif

The class is designed so that we could change rng_type to be a different PRNG. Currently it uses jsf32, but it could for example use std::minstd_rand or std::mt19937 instead.

Source File: `randuint32.cpp`

The following would be entirely sufficient:

/* randuint32.cpp
   ==============

   Implementation of the RandUInt32 class, which provides a simple way
   to get random numbers.

   This file is provided as part of CS 70, but may be used in other
   code.
*/

#include "randuint32.hpp"
#include <chrono>
#include <random>
#include <stdexcept>
#include <cinttypes>

namespace {
// This is a private helper function to get 64-bits of system entropy
// for seeding the PRNG.
//
// - Assumes std::random_device actually provides non-deterministic entropy.
uint64_t system_entropy() {
    std::random_device rd;
    uint64_t val = rd();            // Get first 32 bits
    val = (val << 32) | rd();       // Shift and merge in next 32 bits
    return val;
}

}  // namespace

RandUInt32::RandUInt32() : RandUInt32(-1) {
    // Nothing (else) to do.
}

RandUInt32::RandUInt32(int64_t seedval)
    : rng_(seedval < 0 ? system_entropy() : uint64_t(seedval)) {
    // Nothing (else) to do.
}

uint32_t RandUInt32::get(size_t max) {
    std::uniform_int_distribution<uint32_t> udist(0, max - 1);
    return udist(rng_);
}

uint32_t RandUInt32::get() {
    std::uniform_int_distribution<uint32_t> udist(0, UINT32_MAX);
    return udist(rng_);
}

If you want to have you're own version of RandUInt32 to use, the one above might be the best, because the code is the simplest.

For this class though, we like to give examples where we can know for sure what the output would be for a given seed. The std::uniform_int_distribution class template is not guaranteed to produce the same results on different standard library implementations (or even different versions of the same library). To have identical results across systems, we avoid using std::uniform_int_distribution and instead implement its functionality ourselves (unless you switch the underlying generator to something that makes doing that super-awkward).

In general, reimplementing code that is already in the standard library is a bad idea. Significant engineering effort has gone into making the standard-library code perform very well (although obviously that can vary from system to system).

If you want to see the actual code we use for RandUInt32, with all of these details included (and more), you can click the button below. Not that it uses some C++ features we haven't covered in this class, including templated (compile-time) constants, and if constexpr which is an if whose outcome must be determined at compile-time.

/* randuint32.cpp
   ==============

   Implementation of the RandUInt32 class, which provides a simple way
   to get random numbers.

   This file is provided as part of CS 70, but may be used in other
   code.
*/

#include "randuint32.hpp"
#include <chrono>
#include <random>
#include <stdexcept>
#include <cstdint>

namespace {
// This is a private helper function to get 64-bits of system entropy
// for seeding the PRNG.
//
// - In some ways, this function is overkill, but the C++ standard does not
//   actually guarantee that std::random_device is non-deterministic, so
//   we mix in some other sources of (hopefully) non-deterministic entropy.
uint64_t system_entropy() {
    // Phase one, use the random device to get some entropy
    std::random_device rd;
    uint64_t val = rd();
    val = (val << 32) | rd();
    // Phase two, mix in some timing noise
    auto now = std::chrono::high_resolution_clock::now();
    val ^= now.time_since_epoch().count();
    // Phase three, multiply-xorshift mixing
    val *= 17857243053658340393ULL;
    val ^= (val >> 35);
    // Phase four, mix in the address of a local variable
    val ^= reinterpret_cast<uint64_t>(&val);
    // Phase five, multiply-xorshift mixing again
    val *= 1080363914018236217ULL;
    val ^= (val >> 43);
    // Phase six, mix in the address of our own code
    val ^= reinterpret_cast<uint64_t>(&system_entropy);
    // Phase seven, final multiply-xorshift mixing
    val *= 3262792193083468147ULL;
    val ^= (val >> 31);
    return val;
}

}  // namespace

RandUInt32::RandUInt32() : RandUInt32(-1) {
    // Nothing (else) to do.
}

RandUInt32::RandUInt32(int64_t seedval)
    : rng_(seedval < 0 ? system_entropy() ^ uint64_t(seedval)
                       : uint64_t(seedval)) {
    // jsf32 only has a 32-bit seed, which means that we would have bias
    // if we only used 32 bits of our entropy, so we use some of the high
    // bits to advance a little bit.
    uint32_t advance = (seedval >> 32) & 0x1ff;
    for (uint32_t i = 0; i < advance; ++i) {
        rng_();
    }
}

uint32_t RandUInt32::get(uint32_t max) {
    if (max == 0) {
        throw std::invalid_argument("RandUInt32::get(max): max must be > 0");
    }
    if constexpr (rng_type::min() == 0 && rng_type::max() == UINT32_MAX) {
        // From https://www.pcg-random.org/posts/bounded-rands.html
        uint32_t x = rng_();
        uint64_t m = uint64_t(x) * uint64_t(max);
        uint32_t l = uint32_t(m);
        if (l < max) {
            uint32_t t = -max;
            if (t >= max) {
                t -= max;
                if (t >= max)
                    t %= max;
            }
            while (l < t) {
                x = rng_();
                m = uint64_t(x) * uint64_t(max);
                l = uint32_t(m);
            }
        }
        return m >> 32;
    } else {
        std::uniform_int_distribution<unsigned int> udist(0, max - 1);
        return udist(rng_);
    }
}

unsigned int RandUInt32::get() {
    if constexpr (rng_type::min() == 0 && rng_type::max() == UINT32_MAX) {
        return rng_();
    } else {
        std::uniform_int_distribution<unsigned int> udist(0, UINT32_MAX);
        return udist(rng_);
    }
}

Queries

What's jsf32?
It's a C++-style PRNG that is fast and compact. The original algorithm was invented by Bob Jenkins and implemented in C, but we use this C++ implementation.
Prof. Melissa discusses it in depth on her blog here and here.
Are there other generators I can use?
Yes, lots. There are a number built into the C++ standard library, but they tend to be older generators that have various flaws.
The std::mt19937 generator (a.k.a. The Mersenne Twister) is pretty good, but uses up a lot of space—at least 19,937 bits or about 2.5 kiB, and some implementations use twice that to gain added speed.
Technically, the Mersenne Twister fails some PRNG test suites (failing the Matrix Rank and Linear Complexity tests).
What about Prof. Melissa's PRNG?
You can download a C++ implementation of PCG from GitHub.
Why don't you use it here?
Prof. Melissa, as a matter of principle, prefers not to require anyone to use her PRNG.