Harvey Mudd College
Computer Science 60
Assignment 4, Due Monday, October 3, 11:59pm

Unicalc and Unilang

This homework asks you to use Java to implement the Unicalc application that you completed in Scheme in weeks 2 and 3. You will use the OpenList data structure as teh fundamentak building block -- OpenList, after all, is designed to implement Scheme-like list structures. Because Java is organized around creating and combining data structures, we will Java-ize Unicalc by writing a Quantity class that represents a quantity list. This port of unicalc to Java is the first part of the assignment.

In the second part of the assignment, you'll use this unicalc-in-Java in order to implement a language based on the unit-conversion capability. Implementing the language will require you to have a Tokenizer class (you have the option to use the one we've provided, if you wish), to write a Parser class, and to write an Evaluator class. Examples and, optionally, starting points for the Parser and Evaluator are provided via links at the top-level assignment page.

Why so much code provided? I'd prefer to write the application myself... .

Please do! With the support code we are balancing two valid concerns: on one hand, the deep understanding that a from-scratch implementation provides. On the other hand, we want to provide enough examples and framework to build familiarity with what are certainly intricate - and important! - computational ideas. Feel free to use as much - or as little - of the support code as you wish. One important thing we do ask if you implement from scratch is to use (or replicate) our toString conventions, so that we can check your results efficiently!  Also, please adhere to the interfaces we provide (names of functoins, parameters, return types, etc).

Submitting your code

Please submit the following files:

  • OpenList.java (whether you make changes or not), and
  • Tokenizer.java (if you make changes), and
  • Quantity.java (Problem 1) and
  • Parser.java (Problem 2) and
  • Evaluator.java (Problem 3)
  • extra.zip if you try the extra credit...
Submit these files in the usual way, under homework #4.

Pair Programming

Here is the breakdown of which parts may be done with a partner, and which are individual:

  • Part 0 (OpenList changes): Pair programming or individual
  • Part 1 (Quantity and QuantityTester): Pair programming or individual
  • Part 2 (Parser): Pair programming of individual
  • Part 3 (Evaluator): Individual only.

Keep everything in one directory...

Because the .java files you write this week will be interdependent, keep them all in a single folder/directory. That way, java will be able to find classes as it needs to.

If you are using Dr. Java, keeping everything in one directory should work out-of-the-box.

Part 0: Changes to OpenList

0 points

While it's not required, I found it very helpful to add the nonstatic methods

public Object second() ...
public Object third() ...
to my OpenList class, because the second and third elements of lists get accessed lots of time in what follows!

As promised, with this assignment and beyond there is no need to have both the static and nonstatic versions of these methods. Unless you want to... . Do feel free to add any other helper methods to your OpenList class - just be sure to submit your OpenList.java file!

If you use the OpenList class provided in the hw5.zip archive, these methods are already present.

Part 1: Unicalc in Java via Quantity

40 points

This problem asks you to implement the Unicalc application in Java by building a class named Quantity. A starting Quantity.java file is available the top-level assignments page in hw5.zip. You don't have to start with this file, if you'd like to implement from scratch (please be sure your toString method for Quantity objects matches ours, however!) On the other hand, you're more than welcome to use the provided Quantity.java as a starting point, if you wish... .

The Quantity class

Objects of type Quantity will represent quantity lists, that is, something of the form
     { 9.8, [meter], [second, second], 0.1 }
The above Quantity represents 9.8 meters per second squared with an uncertainty of 0.1 . Note that when a Quantity is printed, it is surrounded by curly braces.

As with quantity lists in Racket, a Quantity will always contain four elements. For example, if an object of type Quantity holds only a value, it would look like this:
     { 42.0, [], [], 0.0 }
Here, we are illustrating the provided toString function, along with the OpenList class you wrote last week. The curly braces are used to help distinguish objects of type Quantity from objects of type OpenList. We encourage you to use your own OpenList class again for this assignment -- and to augment it with any functions that may be helpful! If you'd prefer, you may also start with our OpenList solution, which is part of the hw5.zip starter files.

Here is an overview of what capabilities Quantity.java should have

  • Data members:
          private double m;    // the multiplier
    private double u; // the uncertainty
    private OpenList N; // the numerator, an OpenList of Strings (units)
    private OpenList D; // the denominator, also an OpenList of Strings (units)
  • Methods:
          
          public Quantity(double m, OpenList N, OpenList D, double u) ...  // constructor
          public Quantity(double m, double u, OpenList N, OpenList D) ...  // constructor if you like this order better
          public Quantity(double m, OpenList N, OpenList D) ...  // constructor with 0.0 uncertainty
    
          public toString() ...                // PROVIDED prints objects of type Quantity
          public static OpenList getDB() ...   // PROVIDED gets a pre-prepared database of units
          // the getDB() method is at the very bottom of the provided Quantity.java file
          public double getm();     // "getter" - returns the multiplier
          public double getu();     // "getter" - returns the uncertainty
          public OpenList getN();   // "getter" - returns the numerator
          public OpenList getD();   // "getter" - returns the denominator
    			
          // These five are ALMOST translations from prior weeks' Unicalc code
          // However, notice that norm_unit and norm take in a DB, rather than referring
          // to a global DB
          // you may replace these with others, as long as functionality is preserved
          public static Quantity simplify(Quantity Q);                 // cancels shared units in N and D
          public static Quantity multiply(Quantity Q1, Quantity Q2);   // multiplies the inputs
          public static Quantity power(Quantity Q, int p);   // raises the Q to the non-negative power p
          public static Quantity divide(Quantity Q1, Quantity Q2);     // divides the inputs
          public static Quantity norm_unit(String unit, OpenList DB);  // normalizes unit in DB
          public static Quantity norm(Quantity Q, OpenList DB);        // normalizes Q in DB
    
          // Addition requires the unit database, which here is named DB
          //
          // If the units (afer normalizing) do not match, return the following
          // valid Quantity:  { 0.0, ["error:add"], [], 0.0 }
          public static Quantity add(Quantity Q1, Quantity Q2, OpenList DB);   // norm(Q1,DB) + norm(Q2,DB)
    			
          // This last one is negate, rather than subtract.  It turns out that there's no
          // subtraction in our language (see below), only addition and negation.
          public static Quantity negate(Quantity Q);                   // returns negative of Q
          
  • and main:
          public static void main(String[] args);     // provided, though you might not need it

Helper functions    You're welcome to write helper functions to implement the Unicalc API in Java. For example, our solution relies on a few helper functions with these signatures:

public static OpenList cancel(OpenList L, OpenList M)
public static OpenList n_merge(OpenList L, int p)
each of these provides the capabilities of a piece of the Racket Unicalc functionality from last week. If these two methods were nonstatic, they would have to live in the OpenList class. However, because they're static, it's OK that they live in the Quantity class where they're used. (This is reasonable if we imagine these methods will not be used outside of Quantity, which is true.) Remember that, since all of Quantity's methods are static, they will not be called by an object and their implementations will not have a this to use.

Presuming the numerators and denominators are sorted    The constructors of Quantity should make sure that the numerator and denominator units are in alphabetical (sorted) order. Our starter file provides this functionality. If you are using your own OpenList class, you may want to borrow our sort routine - or use your own!

How do I compare two Strings in Java?    As in the merge method that was part of last week's OpenList class, you will want to use the method compareTo

  String s1 = "apple";
String s2 = "zebra";
if (s1.compareTo(s2) < 0) ... // this will be true!
compareTo returns a negative value if this (the calling object) is earlier in the dictionary (i.e., ASCII order) than the input argument s2. If they're equal 0 is returned. If s2 is earlier, a positive number is returned. The compareTo method is fully documented at this link. There is also a compareToIgnoreCase, but to be consistent with our tests, please use compareTo here.

Like OpenLists, objects of type Quantity are not meant to be changed. Rather, the methods above return new objects of type Quantity based on their inputs. Indeed, six of the methods above are based on the prior Unicalc assignment, a solution to which is posted at the top-level assignment page, if you'd like to use it. The reason that these six are all static is to more closely resemble their Racket counterparts.

One new method    There is one method that was not part of the Racket Unicalc implementation:

  • public static Quantity negate(Quantity Q)    This method should return a new object of type Quantity with the same numerator and denominator as Q, but with a multiplier of opposite sign.


And a note about add:
  • public static Quantity add(Quantity Q1, Quantity Q2, OpenList DB)    This method should return a new object of type Quantity that represents the sum of Q1 and Q2 after normalizing each one in the database DB. If, after normalizing both inputs, their units are identical, addition is straightforward. On the other hand, if the units are not identical, a warning message should be printed out and an error Quantity list, { 0.0, ["error:add"], [], 0.0 }, should be returned. This will allow the unicalc-language interpreter to avoid crashing while still alerting the user to an error. Indeed, the former example will be very forgiving - perhaps too forgiving - of user errors! Do not worry about this for this assignment.

    Note to experienced (and new) Java-ers: creating an "error" version of a Quantity object is not really the "Java" way of handling such a problem. Rather, we culd use Java's exception mechanism by throwing an exception (error) and catching it elsewhere. But there's enough new stuff in this assignment already!

Testing your code    For this part we've provided a set of tests in the QuantityTester.java file. As you write the methods of your Quantity class, you should be sure to uncomment and run the tests within the main method of Quantity.java! Then, after you're finished, try compiling and running the main method of QuantityTester. Note that we will only test your code on valid, commensurate inputs, for example, for the add method.


Part 1.5: Tokenizer

A Tokenizer.java file provided for you in the code at the top-level assignment page. You may want to improve it or create your own tokenizer, but neither of these is required. See the optional extra credit portion of the assignment for additional ideas... .


Part 2: Parsing the Unicalc language via Parser

30 points

This problem asks you to implement a Parser for the Unicalc language, whose grammar is as follows:

Unicalc grammar 

Negation is extra... One note for Fall 2011: The rule U -> ( - E ) will be considered extra credit -- feel free to skip that one, but if you've implemented it, you'll certainly receive credit for it. It creates the parse tree ["-", E], where E is the parse tree that results from using the rule E.

Technical notes on the grammar

All symbols listed above in red are terminal symbols.  Similarly, doubles, ints and strings are also terminals.  

The rule for Q uses the asterisk, not as "times," but meaning "zero or more of the preceding symbol." It also uses the plus sign, not as addition, but meaning "one or more of the preceding symbol." Both of these are implemented in the provided Q method in Parser.java.  A, D and V are also implemented and provided in Parser.java.  You will only need to implement rules U and above.

The Parser class    To do this, you should write a Parser class in a file named Parser.java. This should implement a recursive-descent parser for the above grammar, yielding an OpenList that represents the parse tree for a provided set of valid tokens. There is a starting point for such a class available in the hw5.zip archive at the top-level assignment page. In particular, that provided code already implements a number of methods for checking if there are remaining tokens (and what they are). In addition, it provides an implementation of Q(), for the Q nonterminal symbol, as well as the method A() and code for handling D, I and V as well.  

Overview of parsing    The method that will begin all the work - even though all of that work will be delegated recursively - is parse:

public OpenList parse(String[] tokens)    This method should return a new object of type OpenList that represents the parse tree for that set of tokens. We will not test your code on invalid sets of tokens. Here is a simple example of each one - keep in mind that they can be composed, as well. Also, double quotes have been added to highlight strings, but they will not be printed by Java.

USER INPUT                PARSE TREE
---------- ----------
def x 42 ["def", "x", [{ 42.0, [], [], 0.0 }]]

# 1 joule ["#", [{ 1.0, ["joule"], [], 0.0 }]]

1 ~ 0.5 foot + 1 inch ["+", [{ 1.0, ["foot"], [], 0.5 }], [{ 1.0, ["inch"], [], 0.0 }]]

1 foot + ( - 1 ~ 0.25 inch ) ["+", [{ 1.0, ["foot"], [], 0.0 }], ["-", [{ 1.0, ["inch"], [], 0.25 }]]]

1 second * 2 hz ["*", [{ 1.0, ["second"], [], 0.0 }], [{ 2.0, ["hz"], [], 0.0 }]]

1 second / 2 hz ["/", [{ 1.0, ["second"], [], 0.0 }], [{ 2.0, ["hz"], [], 0.0 }]]

2 second ^ 2 ["^", [{ 1.0, ["second"], [], 0.0 }], 2]

(1~0.4 cm) [{ 1.0, ["cm"], [], 0.4 }] // note - no extra nesting!

(- 1 cm) ["-", [{ 1.0, ["cm"], [], 0.0 }]]

42.42 meter [{ 42.42, ["meter"], [], 0.0 }]

meter [{ 1.0, ["meter"], [], 0.0 }]

42 [{ 42.0, [], [],0.0 }]

2+3*4^2 [+, [{ 2.0, [], [], 0.0 }], [*, [{ 3.0, [], [], 0.0 }], ["^" [{ 4.0, [], [] ,0.0 }], 2]]]
and here is a concise summary of the resulting parse tree from each expression in the grammar:

Unicalc grammar with parse trees 

Other methods for Parser    Recursive-descent parsing uses a method for each nonterminal in the grammar - except for D and V in the case of this problem. One way to build up to the full grammar is to implement one nonterminal method at a time, for example, from the bottom towards the top. So, you would implement and test U(), and then K(),and then P(), and then E(), and finally S().

Testing your code    The main method is already written to make this easy - you should be able to try out all of the above inputs (and many more of your own) to be sure that your Parser is working correctly.  You may also write unit tests if you find you prefer the unit testing way of doing things.  We'll be using unit tests to test your code, but it's rather cumbersome to produce them, so you'll probably find using the main function easier for testing.


Part 3: Evaluating the Unicalc language via Evaluator

30 points

This part completes the unicalc language. Here, you should build an Evaluator class that has at least the following four things:

  • Data members:
          private OpenList env;  // the environment - a unit-database association list!
  • Methods:
          public Evaluator(OpenList env) ...                        // constructor
    public Quantity evaluate(OpenList parseTree) ... // returns the result, a Quantity
    public static void main(String[] args) ... // for testing

There is a starting file for Evaluator.java, which evaluates Quantitys; it is available from the hw5.zip archive at the top-level assignment page. Again, its use is 100% optional.

What does everything evaluate to?    Well, every valid statement in the Unicalc language evaluates to an object of type Quantity, that is, a quantity list. For example, the parse tree which is simply a list of a Quantity naturally evaluates to that Quantity object.

Many of the others are clear: addition, subtraction, negation, multiplication, division, and normalizing. Some of these operations (addition and normalizing) require a database of units... this database is the environment in which each expression is evaluated. The data member env (already at the top of Evaluator.java is provided for this purpose. This environment env starts out empty -- meaning that every unit is a basic, irreduceable unit. However, as you define units with def, conversions become possible. These are considered next.

The two parse trees whose evaluations may not be clear are these:

  • [ "#" E ]    This parse tree should evaluate to the normalized equalivalent of E within the current environment. # is simply the operator for normalizing.


  • [ "def" V E ]    This parse tree should evaluate to the value of E. After all, because of Java's strong typing, it will complain if your code does not return an object of type Quantity, and the value of E will be of exactly that type. However, the much more important function of def is that it should add an association to the front of the environment's association list! This is probably best explained by example. Suppose that the Evaluator's association list env is at first
    
    [ 
      ["mile", { 5280.0, ["foot"], [], 0.0 }]
    ]
    
    with the quotes and spacing added only for emphasis. Then, after the statement def league 0.125 mile, the environment should become
    
    [
      ["league", { 0.125, ["mile"], [], 0.0 }],
      ["mile", { 5280.0, ["foot"], [], 0.0 }]
    ]
    
    We could even redefine "mile" - for example, after def mile 1760 yard, the new value of env would be
    
    [
      ["mile", { 1760.0, ["yard"], [], 0.0 }],
      ["league", { 0.125, ["mile"], [], 0.0 }],
      ["mile", { 5280.0, ["foot"], [], 0.0 }]
    ]
    
    and assoc should "do the right thing" by taking the first match at the top. The data member env, already provided in the Evaluator, is the natural place to store the environment... .



Testing your code

The main method is already written to make debugging your language easy. In the starter files you will see something similar to the following in main of Evaluator.java file:

boolean PRINT_ALL = true;

// the read-eval-print while loop is here...

if (PRINT_ALL)
System.out.println("The tokens are " + Tokenizer.printArray(tokens));

When the graders run your file, they will make sure to trigger the printing of all of this debugging information, because it makes it easier to understand the behavior of the code!  We'll again be using unit tests at least in part to test your code, so feel free to design some of those if you like.  Note that you'll want to use the same trick for rounding the error values that we used in the QuantityTester.

Things to check:

Here are some examples, including the printout of the tokens, parse tree, value, and environment each time:

def yard 3 foot
The tokens are < def, yard, 3, foot >
The parseTree is [def, yard, [{ 3.0, [foot], [], 0.0 }]]
The value is { 3.0, [foot], [], 0.0 }
The current env is
[
  [yard, { 3.0, [foot], [], 0.0 }]
]


def mile 5280 foot
The tokens are < def, mile, 5280, foot >
The parseTree is [def, mile, [{ 5280.0, [foot], [], 0.0 }]]
The value is { 5280.0, [foot], [], 0.0 }
The current env is
[
  [mile, { 5280.0, [foot], [], 0.0 }]
  [yard, { 3.0, [foot], [], 0.0 }]
]


1~0.4 mile + 1~0.2 foot
The tokens are < 1, ~, 0.4, mile, +, 1, ~, 0.2, foot >
The parseTree is [+, [{ 1.0, [mile], [], 0.4 }], [{ 1.0, [foot], [], 0.2 }]]
The value is { 5281.0, [foot], [], 2112.0 }
The current env is
[
  [mile, { 5280.0, [foot], [], 0.0 }]
  [yard, { 3.0, [foot], [], 0.0 }]
]



1~0.4 mile + 2~0.2 yard
The tokens are < 1, ~, 0.4, mile, +, 2, ~, 0.2, yard >
The parseTree is [+, [{ 1.0, [mile], [], 0.4 }], [{ 2.0, [yard], [], 0.2 }]]
The value is { 5286.0, [foot], [], 2112.0 }
The current env is
[
  [mile, { 5280.0, [foot], [], 0.0 }]
  [yard, { 3.0, [foot], [], 0.0 }]
]


def x 2~0.01 mile + ( - 1~0.3 foot )
The tokens are < def, x, 2, ~, 0.01, mile, -, 1, ~, 0.3, foot >
The parseTree is [def, x, [+, [{ 2.0, [mile], [], 0.01 }], [-, { 1.0, [foot], [], 0.3 }]]]
The value is { 10559.0, [foot], [], 52.8 }
The current env is
[
  [x, { 10559.0, [foot], [], 52.8 }]
  [mile, { 5280.0, [foot], [], 0.0 }]
  [yard, { 3.0, [foot], [], 0.0 }]
]


x / 1 foot
The tokens are < x, /, 1, foot >
The parseTree is [/, [{ 1.0, [x], [], 0.0 }], [{ 1.0, [foot], [], 0.0 }]]
The value is { 1.0, [x], [foot], 0.0 }
The current env is
[
  [x, { 10559.0, [foot], [], 52.8 }]
  [mile, { 5280.0, [foot], [], 0.0 }]
  [yard, { 3.0, [foot], [], 0.0 }]
]


# x / 1 foot
The tokens are < #, x, /, 1, foot >
The parseTree is [#, [/, [{ 1.0, [x], [], 0.0 }], [{ 1.0, [foot], [], 0.0 }]]]
The value is { 10559.0, [], [], 52.8 }
The current env is
[
  [x, { 10559.0, [foot], [], 52.8 }]
  [mile, { 5280.0, [foot], [], 0.0 }]
  [yard, { 3.0, [foot], [], 0.0 }]
]


3~0.2 yard yard + 2~0.1 foot ^ 2
The tokens are < 3, ~, 0.2, yard, yard, +, 2, ~, 0.1, foot, ^, 2 >
The parseTree is [+, [{ 3.0, [yard, yard], [], 0.2 }], [^, [{ 2.0, [foot], [], 0
.1 }], 2]]
The value is { 31.0, [foot, foot], [], 1.84 }
The current env is
[
  [x, { 10559.0, [foot], [], 52.8 }]
  [mile, { 5280.0, [foot], [], 0.0 }]
  [yard, { 3.0, [foot], [], 0.0 }]
]


3 yard yard + 1 foot * 2 foot
The tokens are < 3, yard, yard, +, 1, foot, *, 2, foot >
The parseTree is [+, [{ 3.0, [yard, yard], [], 0.0 }], [*, [{ 1.0, [foot], [], 0
.0 }], [{ 2.0, [foot], [], 0.0 }]]]
The value is { 29.0, [foot, foot], [], 0.0 }
The current env is
[
  [x, { 10559.0, [foot], [], 52.8 }]
  [mile, { 5280.0, [foot], [], 0.0 }]
  [yard, { 3.0, [foot], [], 0.0 }]
]


Optional extra credit: functionalization!

Functions! For up to an additional +20 points, provide an implementation of functions in the Parser and Evaluator of the unicalc language. How to design and implement this is 100% up to you.

Warning: If you try this, be sure not to edit the only copy of your unicalc language files! -- implementing functions is tricky, and it's happened that people have broken their language in the attempt to do so!

Be sure to work on a copy of your files -- and then, if you submit this, please place those new files into a folder named extra and zip that folder. There is a spot in the submission site to upload extra.zip for this extra-credit challenge.

Be sure to include a comment at the top of your new Evaluator.java on how we could best test your code - give the graders a couple of examples to try!



Here is a (very small) image of one possible grammar for introducing functions into the language. It's not the only approach, to be sure. Click on the image for a larger version.


Personalizing For up to +5 (or more) points, improve the unicalc-language with whatever feature you'd like... . Here are the rules, however:

  • be sure not to change or break valid statements in the language as specified in the assignment (so we can still grade it!)
  • in a comment at the top of Evaluator.java document what you did and how we can test it. Again, give the graders a couple of examples to try.