Harvey Mudd College
Computer Science 42
Assignment 4, Due Monday, Sept. 27, by 11:59pm

Unicalc and Unilang

This week we will complete the implementation of our Unit Calculation language: Unilang.  This homework asks you to use Racket to implement the Unicalc language, using the Unicalc API that you completed in the previous 2 homeworks. 

You've already completed the back-end API that performs all of the Unicalc calculations (with error propagation!).  This week you will make the front-end easier to use by implementing a language based on the unit-conversion capability. Implementing the language will require you to have a function to tokenize the input  (you have the option to use the one we've provided, if you wish), a function to parse the input, and a function to evaluate the parse trees.  Examples and, optionally, starting points for the Parser and Evaluator are provided below.

Starter Files

Why so much code provided? I'd prefer to write the application myself... .
Please do! With the support code we are balancing two valid concerns: on one hand, the deep understanding that a from-scratch implementation provides and, on the other hand, enough examples and framework to build familiarity with what are certainly intricate - and important! - computational ideas. Feel free to use as much - or as little - of the support code as you wish. One important thing we do ask if you implement from scratch is to use (or replicate) our output conventions, so that we can check your results efficiently!

Problem 0.5: tokenizer.rkt

A tokenizer function is provided for you at the top of this page. You may want to improve it or create your own tokenizer, but neither of these is required. See the optional extra credit portion of the assignment for additional ideas... .


Problem 1: Parsing the Unicalc language (PAIR OPTION).  Submit in a file named parser.rkt

60 points

This problem asks you to implement a Parser for the Unicalc language, whose grammar is as follows:

Unilang grammar

 

Technical notes on the grammar

The items in fixed-width font stand for various generic terminals in the grammar.  varname and unitname are any strings, number is any non-negative number and the epsilon in the last rule stands for the empty symbol (i.e., an empty terminal).  It is not necessary to further parse these items, as the tokenizer will take care of this for you.  All other terminals are shown in red.  

The parse-input function    

Your task is to implement a function named (parse-input tokens) that takes a list of tokens, tokens, and returns a list with the following form:

( parse-outcome  parse-result  residual )

The first item in the list, the parse-outcome will be #t if the input was successfully parsed, and #f if it was not

The second item in the list, the parse-result is the parse tree (see below for examples).

The third item in the list, the residual, is the rest of the tokens in the input list that are not included in the parse tree (or an empty list, if all tokens were accounted for).

Of course you will implement many helper functions along the way.  It is important to keep your functions short, focused and well-commented!

Overview of parsing    The method that will begin all the work - even though all of that work will be delegated recursively - is parse-input:

(parse-input tokens)    This method should return a list, as specified above, including the parse tree for that set of tokens. We will not test your code on invalid sets of tokens. Here are some examples of parse trees - keep in mind that they can be composed, as well. All non-numeric items in the parse tree should be symbols.

USER INPUT           		    PARSE TREE
---------- ----------
x = 42+-0 meter (= x ((42 0) (meter) ()))

% x (% x)

z = y + g - x (= z (+ y (- g x))) // Note that this grammar is left associative, unfortunately

x = 3.42 +- .75 foot foot / second (= x ((3.42 0.75) (foot foot) (second)))

var = 3 (= var ((3 0) () ()))

var = -2.5 meter / second (= var ((-2.5 0) (meter) (second)))

x = (y+x)^-2 (= x (^ (+ y x) -2))))

x = y+x^2 (= x (+ y (^ x 2)))

x = y+(x^2-(z/y)) (= x (+ y (- (^ x 2) (/ z y))))))

x = y+(x^2-z/y) (= x (+ y (- (^ x 2) (/ z y))))

a = 3.5 +- .2 / cm (= a ((3.5 0.2) () (cm))) ())

a a

and here is a concise summary of the resulting parse tree from each expression in the grammar (in blue, beneath each rule in the grammar).  

grammar with parse trees 

Other methods for Parser    Recursive-descent parsing uses a method for each nonterminal in the grammar.  We have already written the method for  UL and Units. We have started the definition for Amt (but it doesn't handle negative numbers yet).  One way to build up to the full grammar is to implement one nonterminal method at a time, for example, from the bottom towards the top. So, you would implement and test parse-val(), and then parse-sub(), etc etc.  

Testing your code   You can test your parser in the usual way, by calling test with the parse function (which in turn calls parse-input).  We provide a number of tests in the starter file.  You should add at least two more of your own.

Part 3: Evaluating the Unicalc language (INDIVIDUAL ONLY)  Submit in a file named evaluator.rkt

40 points

This part completes the unicalc language. Here, you should build a unicalc-eval function (or extend the one that's already in the starter file).  unical-eval should take as input a valid parse-tree and output the value of the parse-tree (possibly storing a definition too).

Again, it's up to you whether you want to use the starter file, but you need to support the following functions:

(read-eval-print) - reads the user's input and prints the results of parsing the input and the final result of evaluating the input to the screen.
(unicalc-eval parse-tree) - returns the value of the input parse tree.

What does everything evaluate to?    Well, every valid statement in the Unicalc language evaluates either void or an UQL. This is described below.  However, first a quick note about variables and environments.  For this assignment we will work with only one environment to store the values of all of our variables.  This environment will be global and accessible from everywhere, hence we will not need to pass it as an argument to our unical-eval function.  In other words, all variables are global variables, and our functions will not take any parameters.  However, as extra credit, you have the option of implementing local scoping for your variables.  

(= varnane E)  This parse tree evaluates to void (no return value).  However, evaluating this expression has the side effect of binding the name varname to the expression E in the global (and only) environment, without evaluating E.

(% varname)  Evaluates varname (by looking it up in the global environment and evaluating the value it finds) and then returns its normalized value (a UQL).

varname  Evaluates varname but does not normalize it (but still retuns a UQL).

S  Evaluates the mathematical expression, in the global environment.  Again, this will return a UQL.

Val   Evaluates to a UQL

All other parse trees are straightforward.  They simply evaluate to the result of applying the relevant arithmetic operator(s) to the UQL(s) in question.


Testing your code

We have again provided some tests for your evaluator, but you should add at least 3 more tests.  In addition, here is a sample interaction that you can use to check your results and to make sure you understand what you are trying to do with the evaluator.  Notice the value of the expressions appears after the environment.  In many cases the value of the expression is void.

> (read-eval-print)

> z = (x+y) - w
syntax tree: (= z (- (+ x y) w))
environment: ((z (- (+ x y) w)))

> x = 2.0 +- 0.2 m
syntax tree: (= x ((2.0 0.2) (m) ()))
environment: ((x ((2.0 0.2) (m) ())) (z (- (+ x y) w)))

> y = 3.0 +- 0.6 m
syntax tree: (= y ((3.0 0.6) (m) ()))
environment: ((y ((3.0 0.6) (m) ())) (x ((2.0 0.2) (m) ())) (z (- (+ x y) w)))

> z = 4.52 +- 0.02 m
syntax tree: (= z ((4.52 0.02) (m) ()))
environment: ((z ((4.52 0.02) (m) ())) (y ((3.0 0.6) (m) ())) (x ((2.0 0.2) (m) ())) (z (- (+ x y) w)))

> w = 4.53 +- 0.02 m
syntax tree: (= w ((4.53 0.02) (m) ()))
environment: ((w ((4.53 0.02) (m) ())) (z ((4.52 0.02) (m) ())) (y ((3.0 0.6) (m) ())) (x ((2.0 0.2) (m) ())) (z (- (+ x y) w)))

> z = (x+y)-w
syntax tree: (= z (- (+ x y) w))
environment: ((z (- (+ x y) w)) (w ((4.53 0.02) (m) ())) (z ((4.52 0.02) (m) ())) (y ((3.0 0.6) (m) ())) (x ((2.0 0.2) (m) ())) (z (- (+ x y) w)))

> z
syntax tree: z
environment: ((z (- (+ x y) w)) (w ((4.53 0.02) (m) ())) (z ((4.52 0.02) (m) ())) (y ((3.0 0.6) (m) ())) (x ((2.0 0.2) (m) ())) (z (- (+ x y) w)))
0.47+-0.63 meter

> %w
syntax tree: (% w)
environment: ((z (- (+ x y) w)) (w ((4.53 0.02) (m) ())) (z ((4.52 0.02) (m) ())) (y ((3.0 0.6) (m) ())) (x ((2.0 0.2) (m) ())) (z (- (+ x y) w)))
4.53+-0.02 meter

> w
syntax tree: w
environment: ((z (- (+ x y) w)) (w ((4.53 0.02) (m) ())) (z ((4.52 0.02) (m) ())) (y ((3.0 0.6) (m) ())) (x ((2.0 0.2) (m) ())) (z (- (+ x y) w)))
4.53+-0.02 m


Optional extra credit: language extensions! [INDIVIDUAL]  Submit in a file named unilangExtra.rkt (you may put both the parser and evaluator in this file, or just the evaluator and submit additional support files)

For up to an additional +25 points, provide any extension you wish to the Unicalc language.  Small extensions will be worth small amounts of points, but large extensions, such as implementing "real" functions, together with parameters and local scope, will be worth large amounts of points.  How to design and implement this is 100% up to you. In the spirit of the add-your-own-feature option (below) however, please be sure