Phase 6: Move size_ to Node
In this step, you will prepare to implement randomized insert by moving information about the size of the tree from the TreeSet class template to the nested-node class.
We'll adopt our usual process: Plan, Write Tests, Implement, Test & Fix (as described in the development process instructions for HW6).
Check out the hints and tips on the side!
And in this phase, it really helps to draw a diagram.
You will need to submit a planning image for this phase, as Planning/Phase_6.jpg.
Your Task…
Moving size_
In order to make randomized insert work, you will need to know the size of each subtree. For Homework 6, you probably stored size_ as a data member of the TreeStringSet class. But that doesn't give you an easy way to find the size of a subtree!
Remember that a subtree is represented by the node that is the root of the subtree. If you want to know the size of every subtree, you need to move size_ into the Node class. Then the size of any (sub)tree is stored in its root's size_ data member!
In this step you will update your encoding so that size_ is a part of the Node struct (and no longer directly stored in the enclosing TreeSet<T> class) and then adjust your implementation as necessary.
Suggested Helper: nodeSize
Given a subtree, we might want to say subtree->size_ to find its size. However, empty trees are represented by a nullptr value, so this code won't work in that case. Rather than scatter nullptr checks into your code every time you want to know the size of a subtree, it's useful to write a helper function that takes a Node* pointer and
- If the pointer is
nullptr, returns 0. - Otherwise, returns the Node's
size_.
You can then use nodeSize(subtree) any time you need the size of the subtree and have it “just work” even if the subtree is empty. (It's your private helper function, so you don't have to call it nodeSize.)
To be clear, adding this helper function is strongly recommended. Using this function will make your code cleaner and easier to read, and reduce the chance of segfaults caused by trying to access size_ on a nullptr. If you find yourselves writing node->size_ to read the size of a node, stop. You probably want to be calling nodeSize(node) instead.
Maintaining Sizes
Think Through Maintaining Sizes with Rotations Concretely
Human beings are much better at concrete thinking than abstract thinking, and excellent at going from concrete examples to abstract understanding, so the easiest way to figure out what you need to do to update the sizes when rotating the tree is to work through a concrete example of updating an example tree with real sizes. When you do, you'll be able to see the pattern and generalize. Here's an example:

Draw this tree yourselves and see how the sizes of b and d change when you do a right rotation at d. How much did they change by? Can you see where that number comes from? Can you generalize that…?
You'll need to change your insert function and its associated helpers to make sure that the subtree size stored in each node is correct after each insert operation. Thus you will have to update your rotate functions so they properly update size_ in the subtrees involved in the rotation. It is very easy not to think this part through properly and make a mistake.
- Plan before you code. Understand how node sizes need to be updated when you rotate the tree. (See the pullout for more.)
- The amount of code you need to write is quite small, but understanding what that code needs to be requires some thinking. Just as a rotation can be performed locally in constant time by just reassigning pointers in the relevant nodes, we can update the sizes in constant time using the sizes in the relevant nodes.
- Remember, use your
nodeSizehelper function to make your code cleaner and less error-prone!
Required Function: printSizesToStream
Meh. What use is this function?
It'll be a huge help in testing to make sure your subtree sizes are correct!
In fact, it's used in the test code we provide below.
Implement the public printSizesToStream function. It behaves just like the printToStream function, except that instead of printing the item for each node it prints the size of the subtree rooted at each node.
Specifically, your printSizesToStream function should write your tree to the provided ostream (output stream) using the following algorithm, which is almost identical to the one you implemented for printToStream:
DISPLAY-SIZES(subtree): if the subtree is empty, PRINT 0 otherwise: PRINT "(" DISPLAY-SIZES(left subtree) PRINT ", " PRINT the size of this (sub)tree PRINT ", " DISPLAY-SIZES(right subtree) PRINT ")"
For the non-empty case, we also strongly suggest you sanity check your tree's sizes using code similar to the following:
assert(node->size_ ==
1 + nodeSize(node->left_) + nodeSize(node->right_));
If your sizes are wrong somewhere, this check will tell you as soon as you try to print the sizes!
Some Useful Test Code
Because it's so important to get the sizes right, we have some tests that you can add to your test suite that will help you verify that your code is maintaining sizes correctly.
bool sizeTest() {
// Set up the TestingLogger object
TestingLogger log("size");
constexpr size_t NUM_SAMPLES = 50;
constexpr size_t NUM_CHECKS = 16;
int items[NUM_SAMPLES] = {
28, 94, 46, 13, 37, 21, 7, 59, 56, 32, 56, 11, 42, 88, 14, 84, 54,
67, 3, 12, 44, 72, 81, 24, 6, 11, 37, 65, 6, 52, 75, 78, 90, 5,
14, 72, 27, 8, 9, 22, 75, 74, 33, 77, 93, 55, 63, 76, 17, 73};
const std::string expectedLeafOutputs[NUM_CHECKS] = {
"0",
"(0, 1, 0)",
"(0, 2, (0, 1, 0))",
"(0, 3, ((0, 1, 0), 2, 0))",
"((0, 1, 0), 4, ((0, 1, 0), 2, 0))",
"((0, 1, 0), 5, (((0, 1, 0), 2, 0), 3, 0))",
"((0, 2, (0, 1, 0)), 6, (((0, 1, 0), 2, 0), 3, 0))",
"(((0, 1, 0), 3, (0, 1, 0)), 7, (((0, 1, 0), 2, 0), 3, 0))",
"(((0, 1, 0), 3, (0, 1, 0)), 8, (((0, 1, 0), 3, (0, 1, 0)), 4, 0))",
"(((0, 1, 0), 3, (0, 1, 0)), 9, (((0, 1, 0), 4, ((0, 1, 0), 2, 0)), 5, "
"0))",
"(((0, 1, 0), 3, (0, 1, 0)), 10, ((((0, 1, 0), 2, 0), 5, ((0, 1, 0), "
"2, 0)), 6, 0))",
"(((0, 1, 0), 3, (0, 1, 0)), 10, ((((0, 1, 0), 2, 0), 5, ((0, 1, 0), "
"2, 0)), 6, 0))",
"(((0, 2, (0, 1, 0)), 4, (0, 1, 0)), 11, ((((0, 1, 0), 2, 0), 5, ((0, "
"1, 0), 2, 0)), 6, 0))",
"(((0, 2, (0, 1, 0)), 4, (0, 1, 0)), 12, ((((0, 1, 0), 3, (0, 1, 0)), "
"6, ((0, 1, 0), 2, 0)), 7, 0))",
"(((0, 2, (0, 1, 0)), 4, (0, 1, 0)), 13, ((((0, 1, 0), 3, (0, 1, 0)), "
"7, ((0, 1, 0), 3, (0, 1, 0))), 8, 0))",
"(((0, 2, (0, 1, 0)), 5, ((0, 1, 0), 2, 0)), 14, ((((0, 1, 0), 3, (0, "
"1, 0)), 7, ((0, 1, 0), 3, (0, 1, 0))), 8, 0))"};
const std::string expectedRootOutputs[NUM_CHECKS] = {
"0",
"(0, 1, 0)",
"((0, 1, 0), 2, 0)",
"((0, 1, 0), 3, (0, 1, 0))",
"(0, 4, ((0, 1, 0), 3, (0, 1, 0)))",
"((0, 2, (0, 1, 0)), 5, (0, 2, (0, 1, 0)))",
"((0, 1, 0), 6, ((0, 1, 0), 4, (0, 2, (0, 1, 0))))",
"(0, 7, ((0, 1, 0), 6, ((0, 1, 0), 4, (0, 2, (0, 1, 0)))))",
"((0, 6, ((0, 1, 0), 5, ((0, 1, 0), 3, (0, 1, 0)))), 8, (0, 1, 0))",
"((0, 6, ((0, 1, 0), 5, ((0, 1, 0), 3, (0, 1, 0)))), 9, (0, 2, (0, 1, "
"0)))",
"((0, 4, ((0, 1, 0), 3, (0, 1, 0))), 10, ((0, 2, (0, 1, 0)), 5, (0, 2, "
"(0, 1, 0))))",
"((0, 4, ((0, 1, 0), 3, (0, 1, 0))), 10, ((0, 2, (0, 1, 0)), 5, (0, 2, "
"(0, 1, 0))))",
"((0, 1, 0), 11, (((0, 1, 0), 3, (0, 1, 0)), 9, ((0, 2, (0, 1, 0)), 5, "
"(0, 2, (0, 1, 0)))))",
"(((0, 1, 0), 7, (((0, 1, 0), 3, (0, 1, 0)), 5, (0, 1, 0))), 12, ((0, "
"1, 0), 4, (0, 2, (0, 1, 0))))",
"((((0, 1, 0), 7, (((0, 1, 0), 3, (0, 1, 0)), 5, (0, 1, 0))), 11, ((0, "
"1, 0), 3, (0, 1, 0))), 13, (0, 1, 0))",
"(((0, 1, 0), 3, (0, 1, 0)), 14, ((((0, 2, (0, 1, 0)), 4, (0, 1, 0)), "
"8, ((0, 1, 0), 3, (0, 1, 0))), 10, (0, 1, 0)))",
};
for (auto kind : {LEAF, ROOT}) {
TreeSet<int> myTree{kind};
// Our inserted items are all in the range [0, 99] and may contain
// duplicates, so we can use this array to track which items we've seen
// to independently compute the expected size of the tree.
bool seen[100] = {false};
auto expectedOutputs =
(kind == LEAF) ? expectedLeafOutputs : expectedRootOutputs;
size_t expectedSize = 0;
for (size_t i = 0; i < NUM_SAMPLES; ++i) {
int item = items[i];
if (i < NUM_CHECKS) {
std::stringstream ss;
myTree.printSizesToStream(ss);
affirm_expected(ss.str(), expectedOutputs[i]);
}
myTree.insert(item);
if (!seen[item]) {
seen[item] = true;
expectedSize++;
}
affirm_expected(myTree.size(), expectedSize);
}
}
// Print a summary of the all the affirmations and return true
// if they were all successful.
return log.summarize();
}
Feel free to match the naming of the test function to the names of your existing tests, and don't forget to call it from main in your test suite!
Whoa, that looked complicated!
It's actually not too bad once you break it down. It's just inserting a bunch of items into two trees (one with
LEAFinsertion and one withROOTinsertion) and checking that the sizes are correct after each insert. It checks the output ofprintSizesToStreamagainst expected outputs that we computed ahead of time.
Meh. I just pasted it in without reading it.
(When logged in, completion status appears here.)