Project 07

An Interpreter for a Dynamically-Scoped Mini-Lisp

Due by 5:00 PM on Tueday November 23, 1999

 

Purpose:

  In this phase we transform the calculator language into a mini Lisp-like language by adding structured data in the form of s-expressions.

Background:

 
S-expres- sions As Data Before I can specify the details of the project I must first tell you about how s-expressions are used as data in Lisp. So far we have only simple data types: numbers and strings. In Lisp s-expressions are provided as the principal way of building up compound data structures out of the base types which include not only numbers and strings but also symbols, which so far we have used just for variable names. Lists are constructed using the cons operator (which just stands for construct). For instance, we could build a list of 3 integers with the expression: (cons 2 (cons 6 (cons -3 ()))) which evaluates to the value (2 6 -3). Remember that in Lisp the second item in a cons does not have to be a list itself, though in this case it was for each of the calls to cons. The empty list can be written either as "()", or as "nil". (Note those are meta-quotation marks. If they were included in what you typed, this would be the string of three characters n, i, and l, which is not the same thing.) The parser is already set to return Atom Nil for either representation.

Boolean Values The standard treatment of boolean values in Lisp is somewhat analagous to that used in C and C++. The empty list (nil or ()) is used to represent "false", and any non-nil value (including any other atom, or any other compound s-expression) is "true". (Thus, 0 is now "true", since it is not nil.)

Cons Pair Literals In general, a cons pair is written by writing the two items of the pair inside parentheses, separated by a ".". Thus the result of evaluating (cons 2 3) is (2 . 3). However, in order to cut the nesting of parentheses, the representation is simplified according to the rule that whenever a dot immediately precedes a left parenthesis, the dot, the left parenthesis, and its matching right parenthesis, are not shown. That is why the result of (cons 2 (cons 6 (cons -3 ()))) is printed as (2 6 -3) rather than (2 . (6 . (-3 . ())))

CAR and CDR To get the first thing in a pair the function car is used, while the function cdr is used to return the second thing in a pair (usually the tail of a list). These functions take their names from the first implementation of Lisp on an early IBM mainframe. A pair was always represented as a pair of pointers that were loaded respectively into the address and decrement registers of the cpu. Thus car is ``Contents of Address Register'' and cdr is ``Contents of Decrement Register''.

Anyway, we use these functions as follows:

> (define a (cons 2 (cons 6 (cons -3 ()))))
a <- (2 6 -3)
> (car a)
2
> (cdr a)
(6 ~3)
> (car (cdr a))
6

The operations car and cdr apply only to cons pairs. It is an error to apply either to nil or to a symbol, string, or number (that is, to atoms).

Quote Now, obviously, it would be nice if we could just type (2 6 -3) instead of typing (cons 2 (cons 6 (cons -3 ()))), but there is a problem:
> (2 6 -3)
>>Error: Unknown operator 2 in (FUNCTION 2)
Since we are also using the s-expression syntax for our program syntax, the interpreter sees a list of things in parentheses as a request to treat the first item as a function and send the rest of the items to it as arguments. Of course, 2 is not a function. (This last error was from a real Lisp interpreter, the sample solution just prints "Illegal Expression".)

The way around this is the quote operator which simply says "Don't evaluate my argument!":

> (quote (2 6 -3))
(2 6 -3)
> (+ 2 3)
5
>(quote (+ 2 3))
(+ 2 3)
>(car (+ 2 3))
Runtime Error -- Illegal argument to car.
>(car (quote (+ 2 3)))
+
For any expression exp, the expression (quote exp) can also be abbreviated 'exp, as in '(2 6 -3). However, you do not have to worry about this, as it is handled by the parser. You will always see it as though it had been typed (quote exp).

Notice quote is not idempotent (that is, applying it more than once is not the same as applying it once):

> '(2 6 -3)
(2 6 -3)
> ' '(2 6 -3)
(quote (2 6 -3))

Now, with quote, it becomes meaningful to have symbols as data. Suppose a and b are defined to be 3 and 4 respectively. Well, there is a difference between the variable a and the symbol a. For instance:

> (define a 3)
a <- 3
> (define b 4)
b <- 4
> a
3
> 'a
a
> (+ a b)
7
> (+ 'a b)
Runtime Error -- Illegal operands to math function +
> (car '(a b))
a
> (+ (car '(a b)) b)
Runtime Error -- Illegal operands to math function +
So, once they've been quoted, symbols are just symbols. They become pretty much like the elements of an enumerated type. Note, this is not true of numbers or () which are called self-quoting. (That is, you didn't know it, but they were really quoted all along, you just didn't see the quote!):
> 3
3
> '3
3
> (+ 3 4)
7
> (+ '3 4)
7
>'(3 4)
(3 4)
> (car '(3 4))
3
> (+ (car '(3 4)) 4)
7
Note that explicite use of quote is still not idempotent on the self-quoting items:
> '3
3
> ''3
(quote 3)

As a more complete example, if we wish to search a list for a value, we can use the following code:

(define (memb val list)
   (cond ((null list) ())
         ((= val (car list)) 'T)
         (memb val (cdr list))))
We use the Lisp standard of representing false with nil and true with some non-nil value. Remember that when (quote exp) is evaluated, it's value is exp. So the value returned in the second case is T not (quote T). If we had left off the quote, then it would have appeared we were trying to return the value of some variable named T. Note that there is nothing special about 'T. It is just an arbitrary non-nil value that has become a standard value to use for "true".

Data Structures Because s-expressions don't have to be homogeneous like ML lists, they can be used to build arbitrary data structures. For instance, we can represent a binary search tree as a three element list, the first element of which is the root, the second of which is the left sub-tree, and the third of which is the right sub-tree. As in:
'(4 (2 (1 () ()) (3 () ())) (8 (6 (5 () ()) (7 () ())) (9 () ())))
which can be seen as representing the binary search tree:

Using this representation, the function to determine if a tree contains a given value would be:

(define (tree_memb val tree)
   (cond ((null tree) ())
         ((= val (car tree)) 'T)
         ((< val (car tree)) (tree_memb val (car (cdr tree)))
         ('T                 (tree_memb val (car (cdr (cdr tree))))))))
Here null is a predicate that tests whether its argument is the empty list.

What You Need To Do:

  In order to support the use of s-expressions as data, the type AbstractSyntax.datavalue must be changed from ConcreteSyntax.literal to ConcreteSyntax.sexp. This has already been done and the new version (along with an appropriately modified version of InterpreterFun) are pre-loaded in the new version of ("sml-sexp").

  1. The code for SyntaxConverterFun must be modified to deal with the use of quote. Any time you see a quoted expression, the body of the expression should be wrapped with a Cdata rather than being further evaluated.

    The only other changes to be made to SyntaxConverterFun are those necessary to recognize and check the arity of the new built-in operators described below.

    As stated above, the parser is already set up so that if the user types 'exp you will recieve an sexp as though they had typed (quote exp). You do not have to deal with that conversion yourself.

  2. You should define the built-in functions cons, car, and cdr. You should also define the predicates null which tests its argument and returns 'T if it is nil and nil otherwise, and atom which returns nil if its argument is a cons-pair and 'T otherwise. Note that while car and cdr are not defined for atoms, atom, and null are defined for all value types. This is exactly so that you can use atom to check whether or not it is safe to apply car or cdr.

  3. The relational operators should be redefined to return nil for false and 'T for true. The evaluation of conditional expressions should also be changed to use these values (i.e. nil for false, non-nil for true) rather than zero and non-zero. Note that for symbols, only equality and inequality are defined, and symbols, strings, and numbers may not be compared to one another. You should not allow non-atomic s-expressions to occur as arguments to any of the relational operators.

  4. Suppose we want to create a list out of the values stored in the variables a and b. We can't write '(a b) since the quote blocks the evaluation of the variable names to their respective values. Instead we have to write (cons a (cons b ())) which is pretty cumbersome. Therefore, Lisp defines the function list which takes an arbtrary number of arguments, evaluates them, and then builds a list out of the results. I.e.:
    > (define a 3)
    a <- 3
    > (define b 4)
    b <- 4
    > '(a b)
    (a b)
    > (list a b)
    (3 4)
    
    Add the definition of list to your interpreter.

  5. Suppose we wish to extract the 3 from the s-expression ((0 1) (2 3)) which is stored in the variable a. This requires typing the incredibly horrible expression (car (cdr (car (cdr a)))). Because Lisp does not have pattern matching it is often necessary to write such expressions for extracting parts of a structure.

    To simplify such expressions, most Lisp implementations allow you to abbreviate any combination of car and cdr expressions by simply typing the corresponding combination of as and ds, surrounded by a c and an r. So, for instance, the example expression becomes (cadadr a) which, while still hard to figure out, is at least easy to type.

    Add an implementation of this shorthand to your interpreter. That is, make the interpreter able to understand any such abbreviation of car and cdr expressions.

What to Submit:

  The new version of sml-sexp has already loaded the parser, the concrete syntax, the abstract syntax, and the interpreter. It has also pre-loaded SyntaxConverter.sig and Evaluator.sig. Therefore all you need to load to test your solution (and therefore, all you need to submit) is Evaluator.sml and SyntaxConverter.sml.

The make file in the /cs/cs131/proj_7 directory can be used to test-build your solution. A file of test lisp expressions is also available in that directory.


This page copyright ©1998 by Joshua S. Hodas. It was built on a Macintosh. Last rebuilt on Fri, Apr 10, 1999 at 3:15 PM.
http://cs.hmc.edu/~hodas/courses/cs131/projects/project07.html