Generating Random Numbers in C++
When we implement our randomized insert, we can't exactly pull out an \( N \)-sided die every time we make a recursive call to insert!
Indeed—we can't have our insert function stop and wait for a die roll. Instead, we need to simulate the die roll (or coin toss) by generating a random number in code.
From Dice to C++
Rather than roll a die, we want to generate a random integer between 0 and \( N-1 \) (inclusive) in our code, using a pseudorandom-number generator (PRNG).
Oh, I know about that! You just say
rand() % Nto get a random number between 0 and \( N-1 \), right?
C's built-in random facilities are pretty primitive and have their own issues. And
cpplintconsidersrand()problematic enough that it won't let you use it. And usingrand() % Nintroduces bias unless \( N \) divides evenly into the range of valuesrand()can produce.
Although C++ supports the problematic
rand()function from C (tradition!), it also provides a much more powerful and flexible set of random-number generation facilities in the standard library, via the<random>header.
Actually, Python's
randommodule is so much easier to get started with than C++'s<random>and less error-prone for common tasks.
And even Java's
java.util.Randomis pretty straightforward.
This is one example whwere C++'s power and flexibility comes at the cost of simplicity—when you just want to do something like get a random number in a certain range, it can feel like a lot of scaffolding is needed to get there. In this lesson, and the upcoming assignment, we felt like you had enough things going on without needing to learn all the ins and outs of C++'s random-number generation facilities, so we created a simple-to-use random-number generator for you to use: the RandUInt32 class.
The CS 70 RandUInt32 Class
The CS 70 server has a library called randuint32. You can include it in your code with:
#include <cs70/randuint32.hpp>
The library defines a class called RandUInt32.
Instances of that class can generate uniformly distributed random integers. You access those numbers by calling the class's get(…) member function, which takes in a number N, and returns a random integer between 0 and N-1 (and thus has N distinct possible values).
What if I call
myGenerator.get(0)?
Maybe something spooky happens! Perhaps magic smoke comes out of the computer! You never know with undefined behavior!
At one time, our library did have undefined behavior for that case, but now it throws a
std::invalid_argumentexception instead. That's another way it's a bit friendlier than C++'s built-in random facilities.
Any time you want to compile and link code that uses the RandUInt32 class, you will need to tell the linker
-lranduint32to link in theranduint32library.
So if your program is called myrand.cpp, you would compile to a .o file like normal and then link with our random-number library. For example,
clang++ -c -std=c++17 -Wall -Wextra -pedantic myrand.cpp
clang++ -o myrand myrand.o -lranduint32
Because this library is CS 70–specific, you will need to compile code that uses it on the CS 70 server.
Your Turn
The following program generates a RandUInt32 object, then calls its get(…) member function 6000 times to generate numbers between 0 and 5 (i.e., six different outcomes) and summarizes the results:
#include <cs70/randuint32.hpp>
#include <cstddef>
#include <iostream>
constexpr size_t TOTAL_ROLLS = 6000;
constexpr size_t DICE_SIDES = 6;
int main() {
RandUInt32 rand;
size_t counts[DICE_SIDES] = {0, 0, 0, 0, 0, 0};
for (size_t i = 0; i < TOTAL_ROLLS; ++i) {
size_t roll = rand.get(DICE_SIDES);
++counts[roll];
}
std::cout << "Results from " << TOTAL_ROLLS << " dice rolls" << std::endl;
for (size_t i = 0; i < DICE_SIDES; ++i) {
std::cout << i << "\t" << counts[i] << std::endl;
}
return 0;
}
We'll give you two options for running this code. The first way mirrors how you'd use it in your own code for assignments. Use the link below to open the code in GitHub, and then clone that repository in Visual Studio Code and run it on the CS 70 server.
Remember, you need to make VS Code run the program on the server.
You can build the program just by running
make.
Or, alternatively, if that all seems like too much trouble, we've made it so you can run the code in Online GDB. (But for this case, we had to do a few tricks to let that environment use our library, so we made a small tweak to the #includes.)
When “Random” Isn't Actually Random
Are random numbers from
RandUInt32really random?
Is anything really random, avian creature? Perhaps your experience is the playback of a recording of some previous universe. Would your tiny, meat-based brain even notice?
These are good questions, although maybe Feebleglorp could chill a bit.
Let's think about whether “real” randomness, whatever that might be, is actually something we want.
When our algorithms need to make a random choice, they don't want to wait around while someone rolls real dice out in the world. They want to decide very quickly, otherwise that wait would become a bottleneck for the algorithm's performance.
In fact, modern random-number generators on today's machines are really fast. Good ones can produce a new random number in under a nanosecond!
To run quickly, random-number generators use deterministic algorithms that produce a long sequence of random-looking numbers that eventually cycles back and repeats.
Meh, that's not random at all!
Well, that's one reason why sometimes they're called pseudorandom-number generators (PRNGs).
But the truth is, the fixed sequence of outputs they produce is usually very long.
For example, the Mersenne Twister produces a staggering \( 2^{19937} - 1 \) outputs before it repeats itself (known as the period of the PRNG).
Other modern PRNGs are smaller than the Mersenne Twister, but still have staggeringly large periods.
If you start a modern PRNG from some arbitrary place in its period, you'll be producing a stream of random-looking numbers that no one on Earth has ever seen before.
I guess that seems kinda random.
Meh. Not random enough for me!
Why not? Can you envision a scenario where a sequence of random-looking numbers no one has ever seen before wouldn't feel random when you saw them?
Meh. I don't know. I bet they aren't as random as real randomness.
Oddly enough, the opposite is often true!
A lot of real-world randomness has some bias to it (e.g., flipping a fair coin is biased towards the side that was up when you started), and some processing is required to remove that bias. In contrast, good PRNGs produce random sequences that pass a broad range of statistical tests.
For many PRNGs, even if you see lots of their output, you still can't easily figure out where in their deterministic sequence they are, so they aren't preditable in that sense.
But PRNGs do provide reproducability. If you start the sequence of outputs at the same place, you get the same results.
Deterministic behavior that feels random?!? Cool!!
It is cool!
Seeding...
Where you start your PRNG is known as the seed.
If you give your PRNG the same seed twice, it will produce the same sequence of numbers both times!
If your program's behavior depends on that sequence of numbers, you'll get the same outcomes in both runs, even though your program's behavior is “random”.
When you default construct our RandUInt32 class, it chooses the seed randomly.
Whoa, whoa! You need randomness before you can have randomness? I thought we had all these reasons not to use real randomness!
One way to look at things is that PRNGs are an efficient amplifier for some initial randomness.
Where does the initial randomness for the seed come from?
Good question!
One big problem with C++'s (and C's) PRNG facilities is that they tend to assume that the programmer can figure that out somehow.
Lots of programmers get this part wrong when they have to do it.
Yeah, because computers can't be random. Meh.
Actually, real-world computers have lots of sources of unpredictable values (sometimes called entropy sources).
One example of an entropy source is the high-resolution clock in the machine. It would be very hard to know exactly which nanosecond of the clock we started our program, so we can use that (perhaps scrambled up a bit) to make a seed value.
Even better-quality randomess is maintained by the operating system, based on all kinds of entropy sources the computer has access to (like when you last pressed a key, or when network packets arrive).
The key thing is that although a random seed value might be a bit slow to compute (and require us to find a good entropy source to use), once we've initialized the generator, it can then run quickly, generating lots of good (pseudorandom) randomness.
And remember, the default constructor for our
RandUInt32class takes care of finding entropy to make a random initial seed for you, so you don't have to worry about it!
But I could pick a non-random seed instead?
Yes! In our
RandUInt32class, if you provide a positive 64-bit integer as a constructor argument, it uses that integer to choose the starting point instead.
Why would I do that?
Mostly for debugging. Setting the seed means you can get the same behavior from your program every time, even with “random“ numbers.
For example, if your program segfaults in some runs but not others, you could set the seed yourself to reproduce the error consistently.
That way you can investigate what's happening a lot more easily!
Hay! You said I could pick a positive 64-bit integer as a seed? What about negative ones?
Sometimes you want to be able to choose between random seeding and fixed seeding based on user input. It would be annoying to need to call a different constructor for those two different cases, so we made it so that if you provide a negative integer as the seed, it uses random seeding instead.
A Quick Demo
This program seeds the random-number generator before generating 10 numbers between 0 and 100.
Edit your myrand program to change main to match the code below, recompile and run it, and let us know what happens!
int main() {
RandUInt32 rand_{1994}; // Here, our seed is a favorite year....
for (size_t i=0; i < 10; ++i) {
if (i > 0) {
std::cout << ", ";
}
std::cout << rand_.get(100);
}
std::cout << std::endl;
return 0;
}
If you think that's crazy, try the seed 1279079580, especially if you consider 3 to be a lucky number.
Final Thoughts on Seeding & Usage
I was pretty surprised when I saw 42 appear three times in a row!
That's nothing, that other seed had a pretty serious amount of three-ness.
Yes, those examples are what Prof. Melissa calls a “PRNG party trick”.
One way to come up with party tricks is to just step through seeds until you find one that does what you want.
Maybe you could cheat at the lottery!!
Lotteries generally don't use PRNGs, but if they did, they'd be very careful not to let anyone control how the seed is generated.
In this class, as we've mentioned, our goal for using a specific seed is not for party tricks—
—it's so we can get the same reproducible behavior over and over for debugging!
So, what should we take away from this discussion?
Takeaways
- Because
RandUInt32can seed itself with a good seed, only override it with a specific seed as a bonus feature for debugging. - Although
RandUInt32can seed itself, that doesn't mean you should create newRandUInt32objects all the time. As much as possible, just create one that gets used a lot.- For example, in the homework, one
RandUInt32object per tree object is a good way to go.
- For example, in the homework, one
- Use the
get()member function to get random numbers within a certain range.
Bonus Fun (Optional)
Prof. Melissa is the designer of a (pseudo)random-number generator known as PCG that has a number of desirable properties.
It was chosen as the default PRNG for NumPy! And it's used by the language Go as well.
Prof. Melissa has a website for PCG and a blog which covers various interesting topics related to PRNGs.
She also has a different library that makes using C++ RNGs easier called
randutils, very much modelled after Python's random-number library. It's described in this blog post.
It might be useful if you need easy-to-use random numbers in other projects, But we recommend you use
RandUInt32in this assignment.
If you're interested in how RandUInt32 is implemented, you can find the code on this page:
(When logged in, completion status appears here.)