Assignment 5

Harvey Mudd College
Computer Science 60
Assignment 5, Due Sunday, February 22, by 11:59 pm

Link to the CS 60 submission site
Back to the top-level assignments page
Back to the top-level CS 60 page

Infinite Fun with Rex and 20 Questions in Java

This assignment has three parts. In Part 1 you will use the delayed evaluation mechanism in rex (the $ sign) to write functions that do interesting things with infinite lists. In Part 2 you will implement the 20 Questions game in Java. Both of these parts may be done individually or in pairs.

Finally, in Part 3, you will augment your 20 Questions game to permit the game to be saved to a file and to allow the user to read in a file from a previous game. This is the only problem this week that is individual only. If you did Part 2 with a partner, you should each use the code that you wrote together for Part 2 as the basis for your solution to Part 3.

A Reminder on Style

Please keep in mind that at this point in the course some of the assignment points will be awarded based on your programs' commenting, formatting, and design -- under the heading of style.

Any readable and well-commented style is OK. These are some of the things to look out for when writing programs for yourself (or others) to read:

Design

For large programs (e.g. 20 Questions), be sure to design your program on paper before you sit down at the computer. Think about what functions you will want and how they will interact with one another.
Each function should be fairly short and should have a clear "mission". Use helper functions of your own design, as needed or desired. Large, convoluted functions are difficult to read and understand -- more importantly, however, they're very difficult for the original programmer to modify and debug!
Avoid global variables unless they are absolutely unavoidable.

Formatting

Please use ample whitespace: blank lines between functions, spaces within expressions to make them readable and separate component parts.
This also includes a clear and clear structure to your code, and indenting that indicates that structure.
You should use meaningful variable and function names. One-letter variable names are OK if they are a common idiom, e.g., f and R for first and rest or i for an index variable. Otherwise, you should use longer names that indicate the meaning of the variable.
You should keep your lines of code to less than 80 characters -- this helps to make it readable from within all kinds of programs, and avoids wrapping issues.

Comments

We ask you to include a top-of-file comment with your name, partner (if any), time spent, and any other comments for the graders (or instructors)
Each function, even your own helper functions, should have at least a short top-of-function comment documenting its purpose, inputs, and outputs.
Comments within functions are welcome, but not required unless something tricky is going on. Many people find them helpful to write because they keep the flow of the program in mind.

Part 1: Rex and Infinity (25 Points, Pair or Individual)

This part can be done as a pair or individually. Please write all of the following functions in a file called part1.rex and submit that file on the submission site.

In class we wrote a function called from(N) that takes a number N as input and generates the infinite list of integers greater than or equal to N. The function was written as:

  from(N) => [N |$ from(N+1)];

This function is built-in to rex.

Another useful function is get(N,L). It takes a non-negative integer N and an arbitrary infinite list L as input and returns the Nth element in the list (where the first element is assumed to have index 0). This function can be written as:

  get(N, [F | R]) => N == 0 ? F : get(N-1, R);

Notice that this function definition assumes that N is not larger than the length of the list. (This is OK here because the lists we'll be dealing with are, well, um, infinitely long!) Finally, one last function which you'll use is find_index(X,L). This function takes an element X and an arbitrary list L and returns the index of the first occurence of X in list L. This function can be written as:

  find_index(X, [Y | L]) => X == Y ? 0 : 1 + find_index(X, L);

Notice that this implementation also implicitly assumes that the list is infinitely long. (Otherwise, we would need a base case!) The functions get and find_index are NOT built-in to rex so you may wish to copy and paste them into your own file.

Now you could use these functions as follows:

rex > get(2, from(10));
12

rex > find_index(5, from(4));
1

You're welcome to use these functions in your code if you like.

The functions that you'll write for this assignment are described below:

First, the totally gratuitous story. Professor I. Lai of the Pasadena Institute of Technology (P.I.T.) wants to write a function that takes as input two infinite lists L1 and L2 and returns an infinite list that contains all of the elements in L1 and all of the elements in L2, where the order of the elements in this new list is not important. Professor Lai's implementation works like this:
```
      lai(L1, L2) => [F1 | R1] = L1, [F1 |$ lai(R1, L2)];
    
```
You can try this function out in rex. For example, try lai(from(100), from(0)). What you'll discover is that the numbers from the second infinite list [0, 1, 2, ...] never ever get put on this list. That is, no matter how long you wait, you will never see 0, for example, in this infinite list. So, if you try to run
```
      find_index(0, lai(from(100), from(0)));
    
```
you'll end up running forever! Professor Z. Ip of the Massachusetts Institutue of Typing has proposed another approach to merging infinite lists L1 and L2 which he calls "zipping". Zipping takes the first element of L1, then the first element of L2 then it goes back and takes the next element from L1 followed by the next element of L2, etc. For example, here's what would happen if we zipped from(100) with from(0):
```
      rex > zip(from(100), from(0));
      [100, 0, 101, 1, 102, 2, 103, 3, 104, 4, 105, 5, 106, 6, 107, 7,
       108, 8, 109, 9, 110, 10, 111, 11, 112, 

      more? (return to continue, q to quit, a to abort, g to go
      non-stop, b to break) 
    
```
Your first task (and it's very short) is to implement the zip function. Use delayed evaluation, of course! Specifically, your delayed evaluation implementation of zip should put some finite number of elements on the output list (it can be more than one) and then should delay the evaluation of the rest of the list.
Your next task is a bit more challenging. Now you are going to implement a pairs function for infinite lists. That is, given two infinite lists L1 and L2, pairs(L1, L2) returns an infinite list of all pairs (a pair is list with two elements) in which the first element comes from L1 and the second one comes from L2. Professor Lai's implementation, called lai_pairs has the following behavior:
```
    rex > lai_pairs(from(0), from(0));
    [ [0, 0], [0, 1], [0, 2], [0, 3], [0, 4], [0, 5], 
    [0, 6], [0, 7], [0, 8], [0, 9], [0, 10], [0, 11], 
    [0, 12], [0, 13], [0, 14], [0, 15], [0, 16], 
    [0, 17], [0, 18], [0, 19], [0, 20], [0, 21], 
    [0, 22], [0, 23], [0, 24],

    more? (return to continue, q to quit, a to abort, g to go
    non-stop, b to break) 
```
Professor Di Agnol observes that this is very bad. The pair [1, 0], for example, will never ever appear on this list, no matter how long we wait! Her idea is to write a function called pairs(L1, L2) that will build all pairs from L1 and L2 in such a way that eventually any pair that you're looking for will appear on this list. Her clever insight is that the output could look like this:
```
    rex > pairs(from(0), from(0));
    [[0, 0], [0, 1], [1, 0], [0, 2], [1, 1], [2, 0], 
    [0, 3], [1, 2], [2, 1], [3, 0], 
    [0, 4], [1, 3], [2, 2], [3, 1], [4, 0], 
    [0, 5], [1, 4], [2, 3], [3, 2], [4, 1], [5, 0], 
    [0, 6], [1, 5], [2, 4], [3, 3], 

    more? (return to continue, q to quit, a to abort, g to go
    non-stop, b to break) 
    
```
Take a close look at the pattern of the output. Notice that the first pair contains the 0th element of the first list and the 0th element of the second list. The next two pairs are those in which sums of the indices are exactly 1 (that is, the pair comprising the 0th element in the first list and the 1st element of the second list and then the pair comprising the 1st element of the first list and the 0th element of the second list). The next few pairs are those in which the sums of the indices are exactly 2, etc. In this way, if we are waiting for any given element to show up on the list, it will appear eventually. Your task is to write this pairs function using delayed evaluation. You may need to write several helper functions to do this. Again, your function should output some finite number of pairs and should then delay the evaluation of the rest of the list.

Part 2: Twenty Questions (60 Points, Pair or Individual)

In this part of the assignment you will implement the 20 Questions game demonstrated in class but without the saving and loading of games from files. The file "stuff" happens in Part 3 of this assignment.

You will find a compiled version of our solution to this assignment in the file called TQ1.class in the directory /cs/cs60/hwfiles/a5. Play this game by running

  java TQ1

so that you get a good feel for how the program should work! You may copy to this to your own directory if you wish. If you do that, make sure to copy TreeNode.class. This file contains the compiled version of the TreeNode that defines a node of the tree.

Your game should be in a file called TQ1.java. This file should contain both the definition of your tree node class (please call that class TreeNode) as well as a class called TQ1 that will contain your main function and all of its helper functions. Keep in mind that all of the functions in the TQ1 class must be declared to be static as discussed in class. (The functions in the TreeNode class can be non-static - in fact that is preferable!) When you compile your code, you will notice that it creates two files, one is TQ1.class and one is TreeNode.class. The compiler creates such a file for each compiled class.

Here are the requirements for your program:

The program should start off knowing only one question ("Is it bigger than a breadbox?") and two animals ("a squirrel" and "an elephant"). This data should be organized as a binary tree. To make the graders' lives easier, please use this question and these two animals as the starting data.
Each time the game is played, the program asks a question (the question at the current node in the tree) and waits for the user's "yes" or "no" response. You should assume throughout this assignment that any answer that starts with the letter 'y' or 'Y' stands for "yes" and anything else stands for "no". Based on this response, the program then asks the next question. A handy Java function for doing this is the following: If myString is a string then myString.charAt(0) will return the character at location 0 in the string. In general, you can ask for myString.charAt(i) and get the ith character. Remember that characters and strings are different in Java, so if you want to compare that character to the letter z then you could do something like if(myString.charAt(0) == 'y'.
At the end of the game, the program has either guessed the animal correctly or doesn't know the animal. In the latter case, the program asks the user to tell it the animal. It is assumed that the user will also specify the correct article ("a" or "an") before the name of the animal, e.g., "an aardvark" or "a giraffe". Then the program asks the user to give it a question that distinguishes between this new animal and the one it found in its binary tree. (It is assumed that the user will give an actual question, ending in a question mark.) Finally, the program asks if the answer to this question would be yes or no for the animal that it has just learned. Now the binary tree is updated to contain all of this new information.
The player should then be asked if she or he would like to play again. If the answer is yes, the game should repeat from the root of the tree. If the answer is no, the game should thank the player for playing and then list all of the animals that it knows. This listing should be printed using a recursive traversal of the tree. You should not store the names of the known animals in another data structure (e.g. an array or linked list) - the tree should be the only place where this data is stored. For full credit, keep this code as simple and elegant as possible.
You can design your own TreeNode in any way you like. However, here are a few observations that may make your life easier:
- If a TreeNode keeps a reference to its parent in the tree, this will make it much easier to insert new things in the tree!
- If a TreeNode keeps track of whether it is a question (e.g. "Is it bigger than a breadbox") or an animal (e.g. "an aardvark"), your life as the programmer will be a bit easier.
- For questions (e.g. "Is it bigger than a breadbox"), keeping the entire question stored in the TreeNode is important. However, for animals (non-questions), it is probably easier to just keep a short string including just the article and the animal name (e.g. "an elephant" or "a squirrel"). The reason is that you are likely to ask several different kinds of questions about animals (e.g. "Is it an elephant" and "Tell me a question that distinguishes between a computer scientist and an elephant").
- Feel free to add any other user information that you wish in a TreeNode.

Getting input from the user is not difficult. An example of how to do this is provided in the file IO.java in the directory /cs/cs60/hwfiles/a5. Copy this file to your directory and feel free to use or modify this code.

One important note: Any function that uses Java's IO functions may throw an IOException and must therefore have the words throws IOException in its signature. In fact, any function that calls a function that throws an IOException may itself throw an IOException and so forth! For example, imagine that our main function calls another function called getInput. Imagine that his getInput function uses Java's IO functions to get input from the user. Then getInput must have a signature like this:

public static String getInput() throws IOException

Similarly, main itself must have a throws IOException in its signature because it calls getInput.

Part 3: Storing and Retrieving Files (15 Points, Individual)

This last part of the assignment should be done individually. If you worked with a partner on Part 2, you should now each take that code and augment it to do this last part on your own.

If you play your 20-questions game more than once, you'll notice that there is an important capability missing -- it would be nice if the tree of animals and questions were stored somewhere at the end of the game and read in again the next time you want to play! This last part of the assignment adds the ability to store your game tree at the end of the game and retrieve it next time you play.

A compiled sample program called TQ2.class is available for you to try in the directory /cs/cs60/hwfiles/a5. There is also a sample saved game called game.txt so that you can see the format in which the game should be saved. Indeed, you can load that game in when you run TQ2.

In the directory /cs/cs60/hwfiles/a5 you will also find a file called ReadFile.java that demonstrates the reading of a file. This very simple program expects its input to come from a file called Zach.txt that we have also provided. Finally, we've provided a file called WriteFile.java that demonstrates writing to a file. This simple program will create a file called Ran.txt. The suffix .txt is not important, it's just a convention to alert the user that this is a text file (as opposed to a program or something else).

Now, modify your TQ1.java file to construct your own TQ2.java java program. When the user starts playing, the program should ask if a previous game should be loaded from a file. If not, the game begins with the standard single question "Is it bigger than a breadbox?" and the standard two animals "a squirrel" and "an elephant". At the end of the game, the user should be asked if the game tree should be stored. If so, the contents of the tree should be stored in a file whose name is specified by the user.

Saving a game (i.e. Writing a file)...

Recall that we can't just "dump" a tree into a file. The tree is made up of nodes that are "floating" around in the computer's memory. Therefore, to store the tree for later use, we must "linearize" it by storing the contents of the nodes as a sequence of strings in a file. Later we will be able to rebuild the tree from these strings!

Specifically, the data for each TreeNode is stored as a single line of text where the line is of the form String1*String2. The first string String1 is the representation of the node (e.g. either a question or the word "a" or "an" followed by the name of an animal). Afterwards there is an "*" symbol. Then, String2 is either the string Question or the string Animal, indicating the kind of node this is (i.e. whether it is representing a question or an animal). The order in which the nodes are stored in the file is: root node, followed by recursive enumeration of the "yes" tree, followed by recursive enumeration of the "no" tree. Recursion is the secret to all happiness in storing your tree as a sequence of strings. As you recurse through your tree, store each node as a string in the file! Use our WriteFile.java function as a starting point for writing to a file.

Reloading a previous game (i.e. "Reading a file")

How about reconstructing the tree? To do this you will need to read in the file. Take a look at our ReadFile.java example. This program reads in the content of the file Zach.txt. When you go to read in a previously stored game, you will probably want to call a helper function that knows about the file to be read. This helper function will read in the first string which we know corresponds to the root of the game tree, so construct a new TreeNode here with the information that you've just read in from your file. Now, we want to recursively build the "yes" and "no" trees! Recursion can be exploited to build these two "subtrees". Once each of those two subtrees has been built, we can hook them up to the root node and we're done! Of course, the building of each subtree does the very same thing.

How do we take a string of the form String1*String2 and get the two strings out? That is how do we take a string that we've read in such as Is it bigger than a breadbox*Question and get out the two strings Is it bigger than a breadbox and Question which we will need to build a TreeNode? One nice way to do this is to use the tokenizing tool in the java.util library. Take a look at the sample code in the file token.java in /cs/cs60/hwfiles/a5.

Avoid this pitfall: You will probably need to compare strings in your program. Imagine that the variable myString is a reference to a string. You would like to check if this string is the word "Spam". Using myString == "Spam" will return False, even if myString refers to a string "Spam". (Why!?) To check two strings for equality, you must use the equals method for strings, as in myString.equals("Spam").

In total, the three files that you will submit for this assignment are part1.rex, TQ1.java, and TQ2.java. Please include your definition of theTreeNode class in TQ1.java and TQ2.java. (In fact, if you include it in just TQ1.java this is fine too, since the compiled version of TreeNode will then be available for TQ2 to use.)

Last modified February 2009 by Ran Libeskind-Hadas

Harvey Mudd College Computer Science 60 Assignment 5, Due Sunday, February 22, by 11:59 pm