Writing Tests
Good testing requires attention to detail, systematic thinking, and creativity.
Here are some big ideas to keep in mind when you write tests. (You'll also want to read the documentation for the CS 70 testing library.)
Testing is an Ethical Responsibility
Faulty software can cause real damage. You may recall the story of the Therac-25 radiation therapy machine from Week 2; that's an example where inadequate testing directly contributed to loss of life. Although the stakes aren't always quite that high, bugs can also cause security and privacy failures, as well as significant losses of time, productivity, and revenue.
Learning to test and debug is part of learning to program. Testing and debugging are not “extra work” that will hopefully not be necessary; they are always part of the process!
It's also good for your grade!
Good testing will reveal bugs in your code so that you can fix them before you submit, improving your correctness grade. In addition, we will evaluate your tests by applying them to both correct and incorrect implementations to see how many bugs you can catch!
So approach the testing component of each assignment with just as much creative energy and eagerness to learn as the implementation component.
Boundaries of What Can Be Tested
There are some limits on what your tests can do, however.
Your Tests Must Only Use the Public Interface
Given the public interface, you should be able to write all your tests before you write a single line of code implementing the things you are testing. Because you're writing tests to ensure that your code works the way that the interface promises, your tests should work with any implementation of that interface, not just with your particular solution.
Even if you don't expect your code to ever be used with someone else's implementation, keep in mind that you might change the way your own implementation works (e.g., you learn some cool new way of doing things and decide to rewrite your code). Any test that depends on the particular behavior of your older code (i.e., behaviors outside the specification) will fail.
Your Tests Must Not Break the Rules
When you use any piece of code functionality, there are often rules to follow (e.g., “you must always …” or “you must not …”). Some languages will always detect misuse and give an error during compilation or at run time, but, as you know, C++ interfaces typically make no promises that misuse will be detected, and don't specify what will happen if misuse occurs.
In other words, C++'s legendary “undefined behavior”.
Exactly.
But can I test “undefined behavior”?
No!
Trying to test undefined behavior doesn't make sense—by definition there is no known correct outcome to test against. Thus your tests must follow all usage rules given in the specification.
Your Tests Must Not Make Extra Assumptions
But if I know my code always returns zero for undefined behavior, can't I test that?
No, because your tests aren't just for your own implementation.
We will run your tests on a correct implementation that may differ significantly from your own.
If You Don't Follow the Rules
Each of the sections above sets a boundary for testing. If you, for example,
- Try to test something that wasn't in the specified public interface,
- Write a test that does something that the public interface said wasn't allowed, or
- Lock your tests to specific details of your implementation that aren't part of the specified interface,
then it is highly likely that your tests will fail on other correct implementations, which is a big problem.
If your tests fail a known-correct implementation, you will score ZERO for your tests!
What!!? Why…?
If your tests claim that a known-correct implementation is wrong, something is seriously wrong with the tests. Either they
- Make invalid assumptions, or
- Break the rules in some way and thus trigger undefined behavior (which is even worse!).
There is no way to trust your tests if they fail correct implementations. Such tests might also fail all of our incorrect implementations, but for the wrong reasons. A valid testing suite must reliably distinguish a correct implementation from an incorrect one.
Because this requirement is such a huge deal, the autograder will warn you if your tests fail when run against our correct implementation. Those warnings will allow you to hunt down your flawed test and fix or eliminate it.
You should plan to submit your project well before the deadline (even if it is not quite complete), in case the autograder's output contains important warnings.
Remember that you can submit as many times as you need to before the due date.
Okay, now we know what we shouldn't do, but what should we do?
Practical Advice
Keep Your Tests Limited and Focused
One goal of testing is to detect that something is wrong. Another goal, nearly as important, is to help you determine where the problem is in the code.
If each test focuses on the behavior of a specific, manageable subset of the code, you will have a small region to debug when a test fails.
Should each test test exactly one piece of functionality?
Small tests are good, but that is probably too small.
Since you can only use the public interface, you will often have to perform a number of operations just to set up a scenario that you can test. Often you will really be testing relationships between operations rather than single isolated operations (e.g., if I insert something, do I get the right result from the size function?).
Rule of Thumb: Arrange your tests so that each one exercises a very limited set of new operations or cases, but otherwise relies only on functionality you have already tested.
Adopt the Right Mindset for Testing
At its core, testing is about finding bugs. Rather than trying to prove that your own code is correct, imagine that your job is to find a flaw in someone else's code. Imagine plausible ways they might have screwed up and introduced a bug, and whether you can figure out a way to detect (or even just trigger) that bug.
- For each function, brainstorm as many publicly observable things that must be true after the function is called as you can.
Don't just focus on the main purpose of the function. Often there are other, more peripheral conditions that should (or shouldn't) change as well.
- Each one of those is an opportunity for someone much less clever than you (obviously) to cause incorrect behavior.
- Write tests that check those assumptions.
You can't test every possible scenario! But there are ways to focus your attention:
- Think about “edge cases”—that is, consider behavior at the extremes of the input, as well as at transition points where behavior should change.
- Think about coverage. Do different cases trigger different behavior (and therefore probably different codepaths)? Make sure you test all of the high-level possibilities.
- Notice your own bugs. Did you cause a bug and then fix it? Consider writing a test that would catch that bug, if you don't already have one!
- Imagine other people's bugs. Even if you carefully avoided the bug, where did things get complicated or subtle? What is easy to get wrong?