Assignment 2

Harvey Mudd College
Computer Science 60
Assignment 2, Due Friday, Feb. 8, by midnight

First-class functioning and Unicalc! (part 1)

The problems in this assignment exercise high- and low-level functional programming concepts, including the use of functions as both arguments and return values from other functions. The last problem is an introduction to the Unicalc unit calculator application of Rex. It will be completed in Assignment 3.

Reading

This assignment covers much of the material through Chapter 4 in the text. You may also want to experiment with functions returning functions and anonymous functions in rex.

Practice Problems

If you would like a few more practice problems to try either before or after the assignment, there are several (with solutions) in /cs/cs60/as/a2/practice.probs.

Testing your code

As with assignment 1, there are tests you can run in the /cs/cs60/as/a2 directory to try out your functions. They are Test1.in through Test25.in. Each problem tells you which tests you can run to test that problem's code. However, you can also test ALL of the problems in this assignment with

  rex hw2.rex < /cs/cs60/as/a2/TotalTests.in

Do test your code before submitting it to be sure there are no typos, misspellings, etc.

Submitting your code

You should submit your rex functions in a single file named hw2.rex using

  cs60submit hw2.rex

Grading

As in assignment 1, commenting your functions and your file are important. Be sure your comments indicate what each function does, as well as what the inputs are.

As you gain experience writing code in rex, in addition, it is a good idea (and very much the spirit of rex and functional programming) to try to devise programs as elegant and concise as possible.

The Problems

Create the function fahrToCel(L) that takes a list of temperatures in Fahrenheit and returns a list of equivalent temperatures in Celsius. For example,
```
     rex >  fahrToCel([212,32,-40]);
     [100,0,-40]
     
```
The conversion of individual temperatures is C = (F-32)*(5/9.0). This problem is not, in spirit, different than add42List. In this case, however, you should use an anonymous function as a first argument to map. Don't write the conversion function separately and then use its name.

You can test this function with Tests #1 and #2:
```
       rex hw2.rex < /cs/cs60/as/a2/Test1.in
       rex hw2.rex < /cs/cs60/as/a2/Test2.in
     
```
Create a function named erdos which takes as input a single positive integer x (or whatever name you choose). (You don't need to check that it's a positive integer.) The output it computes is 3x + 1, in the case that x is odd, and x/2, in the case that x is even.
You can test this function with Tests #3 and #4. (See Problem 1 for the full test command.)
Appropriate runs include
```
    rex >  erdos(42);
    21

    rex >  erdos(109);
    328
    
```

(Tests #5 and #6) Write iterate(N, F, X), taking a non-negative integer N, a function F (which is assumed to be a function of one integer argument), and an integer X. This function then computes F(F(F...(F(X)))) where the number of F's is equal to N. (This is called an iterated function system and is used, among other places, in generating fractal images.) If N is 0, the function simply returns X. For example here is some input and output:

  // Here I'm using the erdos function from the example above.
  //   since the first argument is 1, the desired result is erdos(3),
  //   and erdos(3) equals 10.
  
  rex > iterate(1, erdos, 3);
  10

  // Now I'm iterating twice and computing erdos(erdos(3)):
  
  rex > iterate(2, erdos, 3);
  5

  // And this is finding erdos(erdos(erdos(3)))
  
  rex > iterate(3, erdos, 3);
  16

  // Iterating the add1 function twice is equivalent to adding 2.
  //    Don't forget to define add1 before trying this !

  rex > iterate(2, add1, 9);
  11

(Tests #7 and #8). Create a function named collapse(L) that takes a list L of integers as input and as output returns a list of the number of iterations required so that erdos(erdos(...(erdos(x))...)) equals 1, for each element x in the list L. (To be precise, collapse returns the smallest such number of iterations. Thus, collapse([1]) is [0].) Keep in mind that collapse takes a single list as input.

As a bit more explanation, consider collapse([3]). Looking at the previous examples from the iterate specification, we see that
```
  erdos(erdos(erdos(3))) = 16.
  
```
Continuing in this way,
```
  erdos(erdos(erdos(erdos(erdos(erdos(erdos(3))))))) = 1
  
```
with erdos iterated 7 times. Thus, collapse([3]) == [7]. Further examples of collapse in action include
```
  rex >  collapse([3, 10, 5, 16]);
  [7, 6, 5, 4]

  rex >  collapse([129, 55, 85]);
  [121, 112, 9]
  
```
Will every positive integer collapse to 1? No one knows, though many mathematicians have worked on it. The mathematician Paul Erdos actually felt that "mathematics is not yet ready for such problems," as the "3x+1" problem, as this is called. Check out the site http://www.paulerdos.com/0.html for more on Paul Erdos.
(Tests #9 and #10) In assignment 1's scrabble problem, a standard database of scores for each letter was used. Other point-assignment systems might be useful in various contexts, however. The file /cs/cs60/as/a2/letterScoring.rex contains the scrabbleScores database from the game, as well as four other databases named letterOrder, numerology, telephone, and jotto. (That file offers explanations.) Each database consists of a list of 26 [ letter, score ] pairs.

Write a function createScorer(db) that takes one such database as input and returns a function as output. The output function should be a word-scoring function, that is, should take a string as input and output a score for that string based on the database provided to createScorer. The score is the sum of the scores of the letters. For example,
```
    // Include the file with the scoring tables

    rex >  sys(in,"/cs/cs60/as/a2/letterScoring.rex");

    // Let scrabbleScorer be bound to the function created by
    //   createScorer in this case.

    rex >  scrabbleScorer = createScorer(scrabbleScores);
    1

    rex >  scrabbleScorer("rex");
    10

    // In this case the function output by createScorer is used immediately.
    rex >  createScorer(letterOrder)("ronaldwilsonreagan");
    202
    
```
(Tests #11 and #12) Write a function envelope(F,G) that takes two functions as input. Each function is a function of one number. (That is, F and G each take one number as input.) envelope returns a function which takes as input one number, call it X, and returns the larger of F(X) and G(X). (It should return F(X) if they're equal, since in that case F(X) == G(X).) Careful here - envelope returns a function and not a number. For example here is some sample input and output:
```
    // To demonstrate, I'll first define two functions called foo and goo.
    rex > foo(X) => 2*X + 1;
    foo

    rex > goo(X) => 3*X - 2;
    goo

    // Now, envelope(foo, goo) returns a function and here we let moo
    // represent that function.

    rex > moo = envelope(foo, goo);
    1

    // Now, let's evaluate the function moo with input of 5.

    rex > moo(5);
    13
    
```
(Tests #21 and #22 -- note they're out of order) This problem and the following one ask you to write rex functions that operate on another data structure -- graphs -- in particular, graphs that are represented as binary relations, i.e., a list of [parent, child] pairs.

Create the function biggestFan(G) that takes a graph G (as a binary relation) as input and outputs the node of G that has the biggest fan-in. The fan-in of a node is simply the number of parents of that node, so biggestFan returns the node with the most parents. You may assume that biggestFan will only receive input graphs that do have at least one edge.

If there is a tie for the biggest fan-in, you may return any one of the correct nodes. For example:
```
    rex >  biggestFan([ ["A","D"], ["B","D"], ["C","D"], ["A","E"], 
	                ["D","E"], ["E","F"], ["F","A"]  ]);
    D
    
    rex >  biggestFan([ ["A","B"], ["B","C"] ]);
    B
    
```
where the last answer could also have been C.
(Tests #23, #24, and #25) Write a predicate reachable(a,b,G) that answers the question of whether or not you can get from "point a" to "point b." Specifically, the first two inputs to reachable are nodes in the graph G, which is the third input. reachable returns 0 if there is no path from a to b, and it returns 1 if there is a path (possibly through other nodes) from a to b.

NOTE Your predicate will only be given ACYCLIC graphs as input. Thus, you don't have to worry about "running around in circles" searching for whether there is a path from a to b, because there will be no cycles in the graph. Also, remember the graphs are directed. For example,
```
    rex >  reachable("A", "F", [ ["A","D"], ["B","D"], ["C","D"], ["A","E"], 
	                ["D","E"], ["E","F"] ]);
    1
    
    rex >  reachable("C", "A", [ ["A","B"], ["B","C"] ]);
    0
    
    rex >  reachable("D", "D", [ ["A","B"], ["B","C"] ]);
    1
    
```
As the last example illustrates, your function should acknowledge that any node can be reached from itself. What's more the node does not even need to be in the graph! (This is because the graph does not have to be connected -- an unlisted node is one with neither outgoing nor incoming edges.)
(Tests #13 and #14) The file insert.rex in the directory /cs/cs60/as/a2 defines an insert function which adds nodes to binary search trees in such a way that an inorder listing of the tree returns a sorted list of data. That file also explains that our representation for a binary search tree is that it's either empty (the empty list) or consists of a list with three elements: [ rootValue, leftSubtree, rightSubtree ]. The rootValue will always be an integer, and the left and right subtrees are binary search trees. Write the function delete(X, T) that takes a value X and a list T representing a binary search tree and returns a binary search tree with all of the elements of T except with X removed. You can assume that X is definitely in the tree T. For example, the list
```
[10, [3, [], []], [42, [20, [], []], [60, [], []] ] ]
```
represents the tree

We will go over the algorithm for removing an item from a binary search tree in detail in class. As a brief reminder, first find the element you are looking to remove (traversing left (less) or right (greater)) as necessary. When the element to be removed is at the root of a tree - check if it has an empty subtree. If so, replace it with the other subtree. If, on the other hand, there is no empty subtree, replace the root with the minimum of the right subtree and then remove that minimum from the right subtree.

For example, here are some input and output:
```
  rex > mytree = [10, [3, [], []], [42, [20, [], []], [60, [], []] ] ];
  1

  rex > delete(10, mytree);
  [20, [3, [], []], [42, [], [60, [], []]]]
  
```
Hint: you will need to write a function that finds the minimum element in a binary search tree. (Code for findMin will be written in class.)

The Unicalc Application

The last three problems are related to the "Unicalc" unit-conversion application.

The Unicalc application converts quantities expressed in one type of unit to those of another. The end result (not this week's assignment) will be able to do conversions like the following:
```
       rex >  convert( [1, ["mile"], ["hour"]], [1, ["meter"], ["second"]], db);
       multiply by 0.447032 or divide by 2.23698
```
Unicalc works on multiplicative conversions only, so it does not perform conversions such as Fahrenheit to Celsius.
Functional programming provides a natural way to implement the kernel of Unicalc. In this assignment, we walk through some of the data representations and issues, and ask you to write the first half of the code in rex. Here are the key representations used in Unicalc:
- A Quantity List is a list of three elements representing some quantity of a designated unit. For example,
```
       [1.0, ["mile"], ["hour"] ]                 represents 1 mile per hour
       [2.5, ["meter"], ["second", "second"] ]    represents 1 meter per second^2
       [60.0, [ ], ["second"] ]                   represents 60 hertz
       [3.14, [ ], [ ] ]                          represents about pi radians
       
```
  To summarize, a Quantity List is a list of three elements: the first is the quantity (floating point), the second is a list of units in the numerator, and the third is the list of units in the denominator. Note You can assume that the lists of units (within the numerator and within the denominator) are initially in alphabetical order. Your code will need to make sure this property is maintained!
- A Conversion Database is a list of relationships between units. For example,
```
       db = 
       [
         ["foot",    [12.0,       ["inch"],   []]],
         ["mile",    [5280.0,     ["foot"],   []]],
         ["coulomb", [1.0,        ["ampere", "second"],     []]],
       ];
       
```
  Each line in the database is a list of two elements -- the first element is a string that names some unit and the second element is a Quantity List that is equivalent to that named unit.
  
  Not every unit is defined in terms of others in the database. Some are intentionally left undefined. We'll call the latter basic units. In the above example, inch, ampere, and second are basic units. A database does not need to contain conversions for everything to everything; that would tend to make it very large and maintenance would be error-prone. Instead, all conversions done by Unicalc are derived from basic ones in the database.
(Tests #15 and #16) The representation of a quantity is said to be simplified if there is no element common to its numerator and denominator lists. One way to convert a quantity to an equivalent simplified form is to "cancel" any pairs of elements common to its numerator and denominator. Write a rex definition for a function simplify(QL) of one argument (i.e., one quantity list) that yields a corresponding simplified representation. I would suggest adapting the merge example (/cs/cs60/as/a2/merge.rex) in order to take advantage of the fact that the numerator and denominator lists are sorted (though any technique that maintains the sorted order is OK).

Some example uses of simplify include
```
  rex >  simplify([3, ["kg", "meter", "second", "second"], 
                      ["kg", "kg", "second"]]); 
  [3, ["meter", "second"], ["kg"]]

  rex >  simplify([3.14, ["meter", "second"], 
                         ["meter", "second", "second"]]);
  [3.14, [], ["second"]]
```
(Tests #17 and #18) Construct rex functions multiply(QL1, QL2) and divide(QL1, QL2) that respectively multiply and divide two quantity lists, yielding a simplified result. Again, you may increase your code's efficiency by using merge (as discussed in class) to to combine two sorted lists into a single sorted list. However, as with simplify, any technique is OK. Also, you may assume that the first element of a quantity list will never be 0.

Some example runs include
```
  rex >  multiply([2.0, ["kg"], ["second"]], 
                  [3.0, ["meter"], ["second"]]);
  [6.0, ["kg", "meter"], ["second", "second"]]

  rex >  divide([12.5, ["meter"], ["second"]], 
                [0.5, ["meter"], []]);
  [25.0, [], ["second"]]
```
(Hint: don't rewrite simplify!)
(Tests #19 and #20) As mentioned, the Unicalc database is represented as an association list, that is, a list of pairs in which each pair consists of a single unit and an equivalent quantity list for that unit. Essentially, these pairs define equations for various units, and we can refer to the elements of the pairs as the LHS (left-hand side) and RHS (right-hand side) for this reason. For example, a sample database might be defined as:
```
db = 
[
  ["foot",    [12.,       ["inch"],   []]],
  ["mile",    [5280.,     ["foot"],   []]],
  ["inch",    [0.0253995, ["meter"],  []]],
  ["hour",    [60.,       ["minute"], []]],
  ["minute",  [60.,       ["second"], []]],
  ["mph",     [1.,        ["mile"],                 ["hour"]]],
  ["coulomb", [1,         ["ampere", "second"],     []]],
  ["joule",   [1,         ["kg", "meter", "meter"], 
                          ["second", "second"]]],
  ["ohm",     [1,         ["volt"],                 ["ampere"]]],
  ["volt",    [1,         ["joule"],                ["coulomb"]]],
  ["watt",    [1,         ["joule"],                ["second"]]]
];
```
which happens to be the current contents of /cs/cs60/as/a2/unicalc_mini_db.rex. There is a substantial database of elements that you are also welcome to try out in the file /cs/cs60/as/a2/unicalc_db.rex .

If this were the entire database under consideration, note that "ampere", "meter", and "second" are all basic quantities, since they are not the LHS of any definition.

Write a rex function conv_unit(u, DB) that takes a string representing a unit and the database association list as arguments. If the unit is defined in the database, it returns that unit's equivalent quantity list, otherwise it returns a standard quantity list representing the unit: [1.0, [u], []]. For example, if db is the database consisting of the list of the pairs above, then:
```
  rex >  conv_unit("hour", db);
  [60, [minute], [] ]

  rex >  conv_unit("meter", db);
  [1, [meter], [] ]
```
Hint: when you use assoc to look up an element that's not in a database, it returns "false" (actually, it returns [], which is interpreted as false).

Assignment 3 will use these functions in order to complete the unit-conversion application.
For optional bonus credit of 20%, write a function subsets(n,W) that takes as input a nonnegative integer n and a word (a string) W. It outputs a list of lists: the sublists are all the distinct subsets of the letters of W with n elements. The letters in each subset should be in alphabetical order, and the subsets themselves should be in lexicographical order, i.e., dictionary order. To be considered subsets, it is important that each of these sublists have a distinct set of letters from all the rest.

This function would allow a more efficient solution to the bestThree problem on the previous assignment, as there are at most 140 different three-letter substrings formable from seven characters. (You do not have to redo that problem!)
For instance,
```
      rex >  subsets(2,"abcd"); 
      [ [a, b], [a, c], [a, d], [b, c], [b, d], [c, d] ]

      rex >  subsets(2,"aacd") 
      [ [a, a], [a, c], [a, d], [c, d] ]

      rex >  subsets(2,"aaad") 
      [ [a, a], [a, d] ]

      rex >  subsets(5,"aacd") 
      [ ]
      
```
For an alternative (or additional) optional bonus credit of 10%, write a function readInteger(x) that takes in a positive integer x and outputs the number formed when the input is "read" digit-by-digit. (This is the principle behind the sequence 1, 11, 21, 1211, 111221, ... .)

For instance,
```
      rex >  readInteger(111221);
      312211

      rex >  readInteger(8777511);
      18371521
      
```

Harvey Mudd College Computer Science 60 Assignment 2, Due Friday, Feb. 8, by midnight