CS 70

Random Numbers

It is often useful to generate a random number. Perhaps you want to simulate a stochastic process, or maybe you need to randomly decide where you should insert an item into a data structure. There are many ways generate random numbers in C++.

Do Not Use Any C-Style PRNGs in CS 70

One option always available to C++ programmers is to do things the way a C programmer would, so you may have seen code that uses one of the various random-number generators available in C. In this case, you should not use any of them.

Specifically…

Never use the C Library PRNG, rand()

The C standard library provides a function rand(), but we prohibit its use in CS 70. The C standard describes rand() as follows:

Description

The rand function computes a sequence of pseudo-random integers in the range 0 to RAND_MAX inclusive.

The rand function is not required to avoid data races with other calls to pseudo-random sequence generation functions.

Recommended Practice

There are no guarantees as to the quality of the random sequence produced and some implementations are known to produce sequences with distressingly non-random low-order bits. Applications with particular requirements should use a generator that is known to be sufficient for their needs.

Environmental limits

The value of the RAND_MAX macro shall be at least 32767

The potentially very limited range and low quality of rand()'s output are big reasons not to use the C standard library's rand()—not just in CS 70, but almost anywhere.

Running cpplint on code that uses rand() will generate an error to remind you not to use it. Unfortunately, the test is mostly concerned with just one of the issues (data races causing problems for multithreaded code) and so recommends a POSIX variant function, rand_r(), but rand_r() still doesn't address other problems from using rand().

So…

Don't Use POSIX Library PRNGs Either

Linux, Unix systems (including macOS), and Windows all support the POSIX C library, which augments the Standard C library with additional functionality, including “improved” random-number–generation facilities; specifically random() and drand48() (and their thread-safe _r variants).

These are generators with a C-style interface, and are not of particularly high quality. Although widely used in various code that makes casual use of random numbers, don't use them in CS 70.

The CS 70 Random-Number Library

For CS 70, we have provided a simple library for generating random numbers. Our library is a wrapper for a very robust pseudorandom number generator. It is preinstalled on the CS 70 server.

Our library generates random unsigned integers with 32 bits. Hence, it is named randuint32.

In order to use randuint32, you need to include the header in the file where you want to use it:

#include <cs70/randuint32.hpp>

The CS 70 Random-Number–Library Interface

We provide the following functionality:

RandUInt32()
A default constructor. A default constructed RandUInt32 object will generate a different random sequence of numbers every time it is used (identical to calling RandUInt32(-1)).
RandUInt32(int64_t seedval)

A parameterized constructor. A RandUInt32 constructed with a given seedval

  • When given a positive number, this constructor will generate the same random sequence every time it is used. This property is very useful for debugging purposes.
  • When given a negative number, this constructor will generate a different random sequence every time it is used.
get(uint32_t max)
Returns a uint32_t holding the next pseudorandom unsigned int in the range [0, max - 1].
get()
Returns a uint32_t holding the next pseudorandom unsigned int in the range [0, 2^32 - 1]. You probably want to use the other get(…) function, not this one.

Except on ILP32 systems, there is also a get(size_t max) function. This function exists to detect unsafe conversions from size_t to uint32_t—it will throw an exception if max is greater than 2^32 - 1, since, by its very nature, RandUInt32 generates 32-bit unsigned integers. If you really want larger random numbers, see below.

A Note on Seeding

If you want the output of the random-number generator to be different in different runs, you should use the default constructor or pass a negative number to the parameterized constructor.

But sometimes it's useful to have a random-number generator that produces the same sequence of (random-looking) numbers each time you run your program. This option is especially useful for debugging, because it means that if you find a bug, you can easily reproduce it. To get this behavior, pass a positive number to the parameterized constructor.

  • Duck speaking

    If I just want one random number, can I create a RandUInt32 object, call get() once, and then throw the object away?

  • LHS Cow speaking

    Although you could do that, it's not a good idea. Setting up the random-number generator takes some time, especially if you ask it to generate a different sequence each time, because it needs to gather some entropy from the system. So even if you want just one random number at a particular place in your code, if you're going to need additional “single random numbers” in other parts of your code, or if you're going to need multiple random numbers somewhere else, it's better to create one RandUInt32 object and call get() on it wherever you need a random number.

  • Hedgehog speaking

    What's entropy?

  • LHS Cow speaking

    In this context, it means “really random stuff from the system”. For example, the exact timing between keystrokes or mouse movements, or sending and receiving network packets is pretty random. We'll have a deeper-dive section at the end of this page if you're curious.

Compiling

In order to compile, you will need to link with the library by adding -lranduint32 to your linking step. For example, here's the Makefile for the following two examples:

CXX = clang++
CXXFLAGS = -g -Wall -Wextra -pedantic
LIBS = -lranduint32
TARGETS = rand-test dice

# Note: The rules below use useful-but-cryptic make "Automatic variables"
#       to avoid duplicating information in multiple places, the most useful
#       of which are:
#
#   $@  The file name of the target of the rule
#   $^  The names of all the prerequisites, with spaces between them.
#   $<  The name of the first prerequisite
#
# For GNU make, you'll find that it supports quite an extensive list, at
#   http://www.gnu.org/software/make/manual/html_node/Automatic-Variables.html

all: $(TARGETS)

rand-test: rand-test.o
    $(CXX) $^ -o $@ $(LIBS)

rand-test.o: rand-test.cpp
    $(CXX) $(CXXFLAGS) -c $<

dice: dice.o
    $(CXX) $^ -o $@ $(LIBS)

dice.o: dice.cpp
    $(CXX) $(CXXFLAGS) -c $<

clean:
    rm -rf $(TARGETS) *.o

Example 1: Generating Repeatable or Nonrepeatable Pseudorandom Sequences

This example shows how to use RandUInt32 in two different ways to generate random sequences.

#include <iostream>
#include <cs70/randuint32.hpp>

int main() {
    // We are going to generate some number of random samples
    constexpr size_t NUM_TO_GENERATE = 10;

    // Let's generate numbers between 0 and 99
    constexpr size_t RAND_RANGE = 100;

    // Using a specified seed, we can create a pseudorandom
    // number generator that always produces the same
    // random-seeming sequence on every run of the program
    //
    // This is good for debugging programs that need to use
    // random numbers so that you get the same 'random'
    // sequence while you test your code
    size_t SEED = 17;

    // pass the seed to to construct a CS70random object
    RandUInt32 rng_repeatable{SEED};

    // We can now generate the same sequence each time we run the program
    std::cout << "Reproducible pseudorandom sequence:" << std::endl;
    for (size_t i = 0; i < NUM_TO_GENERATE; ++i) {
        std::cout << rng_repeatable.get(RAND_RANGE) << " ";
    }
    std::cout << std::endl;

    // If we do not pass a seed to the constructor, we will get
    // a different random sequence every time we run the program
    RandUInt32 rng_nonrepeatable;

    // We can now generate a different sequence each time we run the program
    std::cout << "Different pseudorandom sequence for every run:" << std::endl;
    for (size_t i = 0; i < NUM_TO_GENERATE; ++i) {
        std::cout << rng_nonrepeatable.get(RAND_RANGE) << " ";
    }
    std::cout << std::endl;

    return 0;
}

Running the code four times gives the following results. Notice the reproducibility of the first method each time the program is run.

Example 2: Simulating a Dice Rolling Game

In this example, we simulate rolling a die many times, trying to get a winning number. We count the number of times we win and get a point, and then estimate the winning probability.

#include <iostream>
#include <cmath>
#include <cs70/randuint32.hpp>

using std::cout;
using std::endl;

int main() {
    // Let's simulate a game where we are trying to roll
    // a die and get the winning number
    constexpr size_t NUM_OF_DICE_ROLLS = 1000;
    constexpr size_t DICE_SIDES = 6;
    constexpr size_t WINNING_NUMBER = 2;

    // create a random number generator with no seed
    RandUInt32 rng_nonrepeatable;

    size_t num_wins = 0;

    for (size_t i = 0; i < NUM_OF_DICE_ROLLS; ++i) {
        unsigned int dice_roll = rng_nonrepeatable.get(DICE_SIDES);
        if (dice_roll == WINNING_NUMBER) {
            // you get 1 point for rolling the winning number
            cout << '+';
            ++num_wins;
        } else {
            // you get 0 points for rolling anything else
            cout << '_';
        }
    }

    // In the code below, notice that we convert the numerator to a double,
    // otherwise, we would get integer division and have a value of 0 for
    // the result!  We either do that with an explicit conversion, or by
    // multiplying by a double constant.

    double expected_wins = double(NUM_OF_DICE_ROLLS) / DICE_SIDES;
    double expected_percent = 100.0 / DICE_SIDES;
    double actual_percentage = 100.0 * num_wins / NUM_OF_DICE_ROLLS;

    cout << "\n\nYou won " << num_wins << " points out of " << NUM_OF_DICE_ROLLS
         << " trials.\n\n"

         << "We'd expect to win " << expected_percent << "% of games ("
         << expected_wins << " wins on average)\n"
         << "In this simulation, we won " << actual_percentage << "% of games"
         << endl;
    if (std::abs(actual_percentage - expected_percent) <= 0.5) {
        cout << "Whoa, that's pretty close!" << endl;
    } else if (actual_percentage < expected_percent) {
        cout << "Better luck next time!" << endl;
    } else if (actual_percentage > expected_percent) {
        cout << "I guess we got lucky!" << endl;
    }

    return 0;
}

Running this program three times gave the following output:

cs70 SERVER > ./dice
_+_+______++_____+____+__++____+_++_____________+________+_______+___+___++_________+____________________+_+_+____+___+______________+_____+______+________+________++_+_+_______+________+_+____+_+_+________________+_+____+_______+________________________+++_________++______+__+____+_+_____________+___+________+_+____+____________+_+_+________________+____++___+___+________________+_____+___________+_+_______+___________________++___+_++_____+______+_+________+___+______________+____+_____+___+_+_++____________+_______+__+_+___+__+_________________+_____+__+____+__________++___+_____++________+____________+_____+_____+_________+_______________++_________+____+__+____++______+____+_____+_____+___+___+_________+_+_________+__+____+________+_++____++_____+++_____+___+_____________+__+______+____________+_+____+____+____++_+__________+_________+____+___+_+___+____________+_+______++____+___+_________________+______+_+______+___+_____++_________+_____+___+_____+____+__________+___+___+___+_+

You won 172 points out of 1000 trials.

We'd expect to win 16.6667% of games (166.667 wins on average)
In this simulation, we won 17.2% of games
I guess we got lucky!
cs70 SERVER > ./dice
____+____________+__________+_______+___________________________+______+_+_____+_____________+++___+___++__________++__+_+____________+___+__________________________+__________________+_______+______+_______+___+_____+_________+_+__+_+__________+______________________+__+___+____________+____________+________+__+__+___+_+_____+__+________+__+_____+_+++_____________+___++_________+______+_+__+____+___+__+_++_________+___+___+_____________++____+__________+____+_________+______+_+___+___+__+________+_+_____________+______________+_____+______+____+_______________+___+______+______++_____+_____+__________+_+_____+_____+_+__+___+___+_+_________+____+_________________+___+______________+___++___+_____________+___________++______+____+___+_+_____+_+________+___________++____+_______+____+_++______+__+_____________________++____++___+____+___+_____________+___________+______________________________+__+___+_++____________+__+++___________________+_+_++______________++____+_______+____+_____+__

You won 154 points out of 1000 trials.

We'd expect to win 16.6667% of games (166.667 wins on average)
In this simulation, we won 15.4% of games
Better luck next time!
cs70 SERVER > ./dice
+_______________++_+_+___+________+_+___++________+_________++______+________+_______________________++_________++______+_________+_____+++__________________+____+_____+______+_++______+__+__________+__+________+___+_____+____+____+__++____________++_+_+___________+___+_________+_+__+__+_+_____+_______+____+___________________+__________+__++__+___+______+_+______________+_+______________++____________++_++_______+__++___++_______+___+_______________________+___+_+__++_+__+______++______________++_+_+______++___+_______+_+_____+____________+________________+_______+___________+__________+_______+___________+______________+_______+_________________________+_________+__+___+_____+___________+_____+___________________++___________+__+_+_+__+_+________+_+_________+______+____+_+_+_______________+______+____+_____________+______________+_________+__+_++_____+________+_____+___+_++_+________+_+_+_______+____+____+++____+_+__+_+____+__+____++_____________+____++________+____________+____+__+_

You won 167 points out of 1000 trials.

We'd expect to win 16.6667% of games (166.667 wins on average)
In this simulation, we won 16.7% of games
Whoa, that's pretty close!

Bonus Features: Using RandUInt32 as a Standard C++ PRNG

The RandUInt32 class can also be used as a standard C++ PRNG. It meets the requirements of a Uniform Random Bit Generator (URBG) as defined in the C++ standard. So you can use it with algorithms that take a URBG as a parameter, such as std::shuffle and std::sample. For example, you can run

    std::shuffle(container.begin(), container.end(), rng);

to shuffle the elements of an array-like container, where rng is a RandUInt32 object.

If you want to generate random numbers larger than 32-bits or use different distributions, you can use RandUInt32 with the random number distribution classes in the C++ standard library. For example,

    std::uniform_int_distribution<uint64_t> dist1(0ul, 1000000000000ul);
    uint64_t big_random_number = dist1(rng);
    std::normal_distribution<double> dist2(0.0, 1.0);
    double normal_random_number = dist2(rng);

As an example, here is a program that shuffles a deck of cards using RandUInt32 as the PRNG:

#include <iostream>
#include <random>
#include <vector>
#include <algorithm>
#include <cs70/randuint32.hpp>

enum suit_t { HEARTS, CLUBS, DIAMONDS, SPADES };
constexpr const char* SUITS[] = {"♥", "♣", "♦", "♠"};
constexpr const char* RANKS[] = {"A", "2", "3",  "4", "5", "6", "7",
                                 "8", "9", "10", "J", "Q", "K"};

int main() {
    // Create a deck of cards, in order
    std::vector<std::string> deck;
    for (const auto& suit : SUITS) {
        for (const auto& rank : RANKS) {
            std::string card = std::string(rank) + suit;
            // If red, wrap with ANSI escape codes to make it red
            if (suit == SUITS[HEARTS] || suit == SUITS[DIAMONDS]) {
                card = "\033[31m" + card + "\033[0m";  // Red
            }
            deck.push_back(card);
        }
    }

    // Create a RandUInt32 object with random seed
    RandUInt32 rng;

    // Show four games of shuffling the deck
    for (int game = 1; game <= 4; ++game) {
        std::cout << "Game " << game << ":" << std::endl;

        // Shuffle the deck using std::shuffle and our RandUInt32 as the PRNG
        std::shuffle(deck.begin(), deck.end(), rng);

        // Print out four 13 card hands
        for (int hand = 0; hand < 4; ++hand) {
            std::cout << "  - Hand " << (hand + 1) << ": ";
            for (int card = 0; card < 13; ++card) {
                if (card > 0) {
                    std::cout << ", ";
                }
                std::cout << deck[hand * 13 + card];
            }
            std::cout << std::endl;
        }
        std::cout << std::endl;
    }

    return 0;
}

And here's the code running…

Can Computers Produce “Real” Randomness?

  • Duck speaking

    I heard that computers can't really produce random numbers.

  • LHS Cow speaking

    This is an oft-repeated claim, but it's not quite true. While a Turing Machine, alone, can't magic true randomness out of thin air, real computers have more going on than just a Turing Machine.

We'll talk more about randomness in the lesson materials, but, briefly, actual real-world computers can produce truly random numbers.

Modern computers have actual hardware random-number generators that use sources of true randomness, such as electronic noise (e.g., the rdseed instruction on Intel CPUs and the Secure Enclave TRNG on Apple Silicon Macs).

But even before machines had dedicated hardware for generating random numbers, they drew on external sources of randomness (known as entropy sources) such as mouse movements, keyboard timings, and network-packet timings. These sources are inherently unpredictable, allowing the system to maintain an “entropy pool” that slowly accretes entropy from these sources. When a program requests random data, the system can extract randomness from the entropy pool. (Such systems use cryptographic techniques to ensure that the output is unpredictable, even if an attacker can observe the same entropy sources.)

Modern operating systems provide access to these sources of randomness through special interfaces. For example, on Unix-like systems (including Linux and macOS), you can read from /dev/random or /dev/urandom to get random bytes. On Windows, you can use the CryptGenRandom function.

These generators can produce truly random numbers, but they are often slow and sometimes have usage limits. For example, for a system based on an entropy pool, if the pool runs low on entropy, requests for random data may be forced to wait until more entropy is gathered (e.g., the system has received more mouse movements or network packets).

In practice, a good pseudorandom number generator (PRNG), once seeded with true random data, can produce sequences of numbers that are indistinguishable from true randomness for all practical purposes, and can generate random numbers vastly faster than true random sources. This is why most applications use PRNGs rather than directly using true random sources.

The Danger of Careless Seeding

Our RandUInt32 can seed itself automatically from system entropy sources, but many other PRNGs require the programmer to provide a seed. Picking a seed carelessly can inadvertently introduce bias into the random numbers the PRNG produces.

One common error is to use a single 32-bit seed value—32-bits just isn't enough entropy for a PRNG with 32-bit outputs. To illustrate the issue, let's show that we'll see bias in the first output of a PRNG if we only use a 32-bit seed.

Let's assume a high-quality PRNG where the correspondence between seeds and initial outputs is essentially random. If we have a 32-bit seed, there are only \(2^{32}\) possible seeds. And, because the PRNG produces 32-bit outputs, there are also \(2^{32}\) possible first outputs. Imagine a grid of 4 billion buckets, where each bucket corresponds to one of the possible first outputs, and we throw 4 billion balls (one for each possible seed) into the air where it will randomly land in one of the buckets. Does it seem likely that every bucket will have exactly one ball? No! In fact, on average, about 37% of the buckets will be empty! So if we use a 32-bit seed, about 37% of the possible first outputs will never occur! And a similar number will occur more than once, meaning that some outputs are more likely than others.

  • Hedgehog speaking

    That's terrible! How can we avoid this problem?

  • LHS Cow speaking

    This issue is one reason we provided RandUInt32 with a default constructor that seeds itself from system entropy. If you want to use your own seed, make sure it's at least 64 bits. If you're using a source of 32-bit entropy (like std::random_device on many systems), make two calls and combine them to get a 64-bit seed.

  • Duck speaking

    But I know I've seen code like this:

        std::random_device rd;
        std::mt19937 rng(rd());
    

    Is that wrong?

  • LHS Cow speaking

    Yes. It really is. It's incredibly common, but it's still wrong. The std::mt19937 PRNG has 19,937 bits of state, so seeding it with only 32 bits is really bad. If you want to use std::random_device to seed std::mt19937, make multiple calls to rd() and combine them to get a larger seed.

(When logged in, completion status appears here.)