Project 06

Interpreter for Simple Calculator Language (Phase 03)
Fruition

Due by 5:00 PM on Tuesday, March 28, 2000

Purpose:

In this phase we complete the construction of the calculator language interpreter by adding a handful of key features and upgrading its data types.

What has the language been missing? What keeps us from writing useful programs in it? The answer is simple:
Conditional Evaluation
Without conditional evaluation, our language can't really be said to be a programming language. It is just a simple macro-expression language. With it, though, it becomes the real thing. (An argument has been made that what really distinguished ENIAC from its predecessors, what made it a computer, rather than a calculator, was the fact that it had a conditional branch instruction.)

In this phase we will add conditional evaluation, as well as sequencing, local variables, a print statement, and, to round things out, strings as a data type.

Special Note for Spring 2000:

In various places below, the project will refer to changes you must make to the SyntaxConverter functor. Because of the delay in completing the sample solution and the project description, you are not required to implement the new syntax converter for this project. It has been provided for you compiled into the version of SML used for this project.

Strings as a Datatype

The lowest level change to the language involves the incorporation of strings as a datatype. To this end, the datatype ConcreteSyntax.number has been replaced with ConcreteSyntax.literal, with three constructors: Int, Real, and String. Similarly, the Number constructor has been replaced with Literal, and all the functions in ConcreteSyntax have been appropriately renamed and modified. So, for instance, we now use the function Note that when ConcreteSyntax.literalToString where we previously used ConcreteSyntax.numberToString. Note that when ConcreteSyntax.literalToString converts a String literal, it includes double quotes around the underlying string to make the type clear.

In order to incorporate this change into the inerpreter, you will need to make various changes in SyntaxConverter, and Evaluator. (In particular, datavalue will now need to be defined as ConcreteSyntax.literal instead of as ConcreteSyntax.number.) The only binary operator to be defined on string literals is '+' which will be overloaded as numeric addition (when it is given a pair of numbers) and as string concatenation (when it is given a pair of strings). In addition, you should implement a new unary built-in called makestring which converts its argument to a string. This operator should accept all three types of literals, but should be a no-op on String literals. (Note, therefore, that you can't just blindly use ConcreteSyntax.literalToString since it would add quotes to the string.) These two changes (along with others described below) will likely require a fair number of changes to the evaluation of applications of built-in operators.

The change from number to literal for datavalues is the only change made in the UserInterface functor. Therefore, that change has been made for you and a useable functor loaded into the sml version for this project.

Write and Newline

You must add a new unary built-in write defined at all object types. (By object type I mean types in the object language, the one we are implementing, as opposed to the types of the meta language we are using for the implementation.)

When printing strings, do not include quotes around the string.

To allow for full generality, the write operator should leave the cursor at the end of the output, and not advance to the next line. Therefore, you should also define the zero-ary built-in operator newline. Since we are implementing a functional language, in which all expressions must have values, even if they are really used just for their side-effects, these two operators must each return a value when called. After producing the desired output as a side-effect, the write operator should return as its value whatever it is given as its argument. The newline operator should return the empty String literal.

While we could require writeln as a built-in variant of write, this is not necessary as the user can define it (using the sequence construct described below) as:

(defun (writeln x) (sequence (write x) (newline)))

Conditonal Evaluation

In eager languages, conditional evaluation is what is known as a special form. That is, while it will look just like the other built-in operators from a syntactic standpoint, it will require a different operational behavior. Unlike the ordinary built-in operators, this one does not necessarilly evaluate all of its arguments. Instead a prescribed procedure is used to determine which arguments are actually evaluated.

The form of conditional evaluation we will include in the language is the cond expression from lisp. It can be seen as a generalization of a case statement, or of the if-then-else-if statement. The syntax of the expression is:

(cond (test_1 result_1) (test_2 result_2) ... (test_n result_n))

The operational semantics (i.e. what the system does when it encounters this expression) are as follows: Evaluate the first element of the first argument pair. If that expression evaluates to a true value (as outlined below) then evaluate the second expression in the pair and return its value as the value of the conditional expression. If, on the other hand, the first expression evaluates to false (as outlined below) then the second expression in the pair is discarded unevaluated, and the process repeats with the next pair. If the last pair is reached and none of the pairs has a test expression that evaluates to true then the value of the whole conditional expression is false.

Of course, our definition of literals does not include a boolean type, so we must pick a suitable notion of true and false. At this point we will take the C/Rex way out and say that any non-zero value is true and any zero value is false. Since the test expressions can evaluate to any literal type, we must, in turn, have a suitable notion of "zero" at each object type. Therefore we will assume that the following are all "zero" for the purposes of deciding whether to evaluate the other expression in the pair: (Int 0), (Real 0.0), and (String "") (the empty string).

In the case that a conditional expression returns false because none of the test expressions evaluated to non-zero values, you should use (Int 0) as the value of false returned.

(In later projects we will change our notion of true and false as our data types get richer. Therefore, you may find it useful to define vals for true and false, and a test function isFalse which determines whether a given literal is one of the acceptable representations of false, so as to make the subsequent changes more modular and easier to implement.)

In order to distinguish the conditional expression from an ordinary application structure, a new case has been added to the definition of the datatype AbstractSyntax.cexp:

| Ccond of (cexpression * cexpression) list

You must modify SyntaxConverter to recognize cond expressions and convert them to this special form. Then add the appropriate code for evaluating them to Evaluator.evaluateCexpression. Note that AbstractSyntax.cexpressionToString has already been updated to print them properly.

Relational Operators

In order to make the conditional statement useful, you need some test expressions. To that end, you should implement the following built-in relational operators: <, <=, >, >=, =, <>. Values of the two numeric types should be able to be compared to one another, but strings can only be compared to other strings. (String comparison should just be based on the underlying string comparison function from ML.) Whatever the types of the arguments, the operators should return (Int 1) if the test is true, and (Int 0) if it is false.

Note, these are just ordinary built-in operators. They are not new special forms. Also, we are not saying these are the only operators that can be used to construct test expressions. Any expression can be used as the first expression in a cond pair. It is just that we will need these for many applications.

Sequencing

As shown above, with the addition of side effects to the language, it is natural to add a notion of sequencing.

Strictly speaking it is not necessary to add a new special form for this to the language, since function application can be used to accomplish the desired efects. For example, if we knew that the arguments to a function were always evaluated left to right, we could define the writeln function above as:

(defun (writeln_aux foo bar) bar) (defun (writeln x) (writeln_aux (write x) (newline)))

Even if we did not know the order of evaluation of function arguments, eager evaluation imparts an ordering to nested applications. So, we could define it as:

(defun (writeln_aux foo) (newline)) (defun (writeln x) (writeln_aux (write x)))

Nevertheless, to avoid such machinations, it is useful to add a syntax for sequencing to the language, and to implement it as a special form. The syntax you are to support is:

(sequence exp_1 exp_2 ... exp_n)

This syntax should be mapped (in SyntaxConverter) to the following new case that has been added to AbstractSyntax.cexp:

| Cseq of cexpression list

and evaluated in the way sequences are evaluated in ML. That is, a sequence should evaluate each of the individual expressions in order, and return the value of the last expression as the value of the sequence.

Local Declarations with Let

As you well know from your experience with ML, because the body of a function is often composed of several sub-computations, it is often desireable to evaluate these sub-expressions individually and assign names to the results, in order to make the computation clearer, and, in the case of a sub-expression used several times, more efficient.

The basic syntax for declaring local variables is the let expression. The syntax of this special form is:

(let ((var_1 exp_1) (var_2 exp_2) ... (var_n exp_n)) body_exp)

The value of the let expression is the result of evaluating each of the n expressions and binding the n variables to the results and then evaluating the body expression in the resulting environment. In this basic form of the let expression each of the n expressions is evaluated in the same environment, the one that is active at the point that the let expression is evaluated.

Often it is desirable to define one of the local variables in terms of one of the others already defined. While this can be done by using nested let expressions, this is annoying in practice. Therefore, a second form of let expression, introduced with the keyword let* is used for this need.

In order to support these two special forms, two new cases have been added to AbstractSyntax.cexp:

| Clet of (vname * cexpression) list * cexpression | Clet' of (vname * cexpression) list * cexpression

Clet' is used for the let* form (since * is not allowed in ML identifiers).

What I have Provided:

I have created a version of the SML-NJ interpreter with the parser structure SexpParser, the two syntax structures ConcreteSyntax and AbstractSyntax, and the SyntaxConverter structure, as well as the UserInterface functor, and the signature for the evaluator functors already loaded. The executable (or a link to it) is in /cs/cs131/bin/sml-sexp. The file /cs/cs131/bin/sml-cs131 (which is the version loaded by emacs) has also been re-linked to this version.

(I have also loaded a sample solution to the evaluator functor and built a structure called testEvaluator as well. You can use the functions in this structure from the SML command line to see what the results of evaluation should be.)

I have also provided an executable sample solution at /cs/cs131/bin/sml-soln-proj6. You can use this to test how your solution should behave (from the standpoint of user interaction). This launches directly into my solution's interpreter.

Finally, the file /cs/cs131/proj_6/tests.funcalccontains a variety of test functions that your solution should be able to handle.

What You Should Turn In

Only the evaluator and the syntax converter get appreciable changes in this phase. As mentioned above, the changes to the syntax converter and user interface have already been made for you. Therefore, you should submit just the new implementation of EvaluatorFun.

We will test your solution (which should be a single file containing just the EvaluatorFun functor and nothing else) by compiling it with the file /cs/cs131/proj_6/make_sml-soln.sml with your file used in place of the evaluator.sml files currently used.

	This page copyright ©2000 by Joshua S. Hodas. It was built on a Macintosh. Last rebuilt on Tuesday March 21, 2000.
http://www.cs.hmc.edu/~hodas/courses/cs131/projects/project06.html


Purpose:
	In this phase we complete the construction of the calculator language interpreter by adding a handful of key features and upgrading its data types. What has the language been missing? What keeps us from writing useful programs in it? The answer is simple: Conditional Evaluation Without conditional evaluation, our language can't really be said to be a programming language. It is just a simple macro-expression language. With it, though, it becomes the real thing. (An argument has been made that what really distinguished ENIAC from its predecessors, what made it a computer, rather than a calculator, was the fact that it had a conditional branch instruction.) In this phase we will add conditional evaluation, as well as sequencing, local variables, a print statement, and, to round things out, strings as a data type.
Special Note for Spring 2000:
	In various places below, the project will refer to changes you must make to the SyntaxConverter functor. Because of the delay in completing the sample solution and the project description, you are not required to implement the new syntax converter for this project. It has been provided for you compiled into the version of SML used for this project.
Strings as a Datatype
	The lowest level change to the language involves the incorporation of strings as a datatype. To this end, the datatype `ConcreteSyntax.number` has been replaced with `ConcreteSyntax.literal`, with three constructors: `Int`, `Real`, and `String`. Similarly, the `Number` constructor has been replaced with `Literal`, and all the functions in `ConcreteSyntax` have been appropriately renamed and modified. So, for instance, we now use the function Note that when `ConcreteSyntax.literalToString` where we previously used `ConcreteSyntax.numberToString.` Note that when `ConcreteSyntax.literalToString` converts a `String` literal, it includes double quotes around the underlying string to make the type clear. In order to incorporate this change into the inerpreter, you will need to make various changes in `SyntaxConverter`, and `Evaluator`. (In particular, `datavalue` will now need to be defined as `ConcreteSyntax.literal` instead of as `ConcreteSyntax.number`.) The only binary operator to be defined on string literals is `'+'` which will be overloaded as numeric addition (when it is given a pair of numbers) and as string concatenation (when it is given a pair of strings). In addition, you should implement a new unary built-in called `makestring` which converts its argument to a string. This operator should accept all three types of literals, but should be a no-op on `String` literals. (Note, therefore, that you can't just blindly use `ConcreteSyntax.literalToString` since it would add quotes to the string.) These two changes (along with others described below) will likely require a fair number of changes to the evaluation of applications of built-in operators. The change from `number` to `literal` for `datavalue`s is the only change made in the `UserInterface` functor. Therefore, that change has been made for you and a useable functor loaded into the sml version for this project.
Write and Newline
	You must add a new unary built-in `write` defined at all object types. (By object type I mean types in the object language, the one we are implementing, as opposed to the types of the meta language we are using for the implementation.) When printing strings, do not include quotes around the string. To allow for full generality, the `write` operator should leave the cursor at the end of the output, and not advance to the next line. Therefore, you should also define the zero-ary built-in operator `newline`. Since we are implementing a functional language, in which all expressions must have values, even if they are really used just for their side-effects, these two operators must each return a value when called. After producing the desired output as a side-effect, the `write` operator should return as its value whatever it is given as its argument. The `newline` operator should return the empty `String` literal. While we could require `writeln` as a built-in variant of `write`, this is not necessary as the user can define it (using the `sequence` construct described below) as: (defun (writeln x) (sequence (write x) (newline)))
Conditonal Evaluation
	In eager languages, conditional evaluation is what is known as a special form. That is, while it will look just like the other built-in operators from a syntactic standpoint, it will require a different operational behavior. Unlike the ordinary built-in operators, this one does not necessarilly evaluate all of its arguments. Instead a prescribed procedure is used to determine which arguments are actually evaluated. The form of conditional evaluation we will include in the language is the `cond` expression from lisp. It can be seen as a generalization of a `case` statement, or of the `if`-`then`-`else`-`if` statement. The syntax of the expression is: (cond (test_1 result_1) (test_2 result_2) ... (test_n result_n)) The operational semantics (i.e. what the system does when it encounters this expression) are as follows: Evaluate the first element of the first argument pair. If that expression evaluates to a true value (as outlined below) then evaluate the second expression in the pair and return its value as the value of the conditional expression. If, on the other hand, the first expression evaluates to false (as outlined below) then the second expression in the pair is discarded unevaluated, and the process repeats with the next pair. If the last pair is reached and none of the pairs has a test expression that evaluates to true then the value of the whole conditional expression is false. Of course, our definition of literals does not include a boolean type, so we must pick a suitable notion of true and false. At this point we will take the C/Rex way out and say that any non-zero value is true and any zero value is false. Since the test expressions can evaluate to any literal type, we must, in turn, have a suitable notion of "zero" at each object type. Therefore we will assume that the following are all "zero" for the purposes of deciding whether to evaluate the other expression in the pair: `(Int 0)`, `(Real 0.0)`, and `(String "")` (the empty string). In the case that a conditional expression returns false because none of the test expressions evaluated to non-zero values, you should use `(Int 0)` as the value of false returned. (In later projects we will change our notion of true and false as our data types get richer. Therefore, you may find it useful to define `val`s for true and false, and a test function `isFalse` which determines whether a given literal is one of the acceptable representations of false, so as to make the subsequent changes more modular and easier to implement.) In order to distinguish the conditional expression from an ordinary application structure, a new case has been added to the definition of the datatype `AbstractSyntax.cexp`: \| Ccond of (cexpression * cexpression) list You must modify `SyntaxConverter` to recognize `cond` expressions and convert them to this special form. Then add the appropriate code for evaluating them to `Evaluator.evaluateCexpression`. Note that `AbstractSyntax.cexpressionToString` has already been updated to print them properly.
Relational Operators
	In order to make the conditional statement useful, you need some test expressions. To that end, you should implement the following built-in relational operators: `<`, `<=`, `>`, `>=`, `=`, `<>`. Values of the two numeric types should be able to be compared to one another, but strings can only be compared to other strings. (String comparison should just be based on the underlying string comparison function from ML.) Whatever the types of the arguments, the operators should return `(Int 1)` if the test is true, and `(Int 0)` if it is false. Note, these are just ordinary built-in operators. They are not new special forms. Also, we are not saying these are the only operators that can be used to construct test expressions. Any expression can be used as the first expression in a `cond` pair. It is just that we will need these for many applications.
Sequencing
	As shown above, with the addition of side effects to the language, it is natural to add a notion of sequencing. Strictly speaking it is not necessary to add a new special form for this to the language, since function application can be used to accomplish the desired efects. For example, if we knew that the arguments to a function were always evaluated left to right, we could define the `writeln` function above as: (defun (writeln_aux foo bar) bar) (defun (writeln x) (writeln_aux (write x) (newline))) Even if we did not know the order of evaluation of function arguments, eager evaluation imparts an ordering to nested applications. So, we could define it as: (defun (writeln_aux foo) (newline)) (defun (writeln x) (writeln_aux (write x))) Nevertheless, to avoid such machinations, it is useful to add a syntax for sequencing to the language, and to implement it as a special form. The syntax you are to support is: (sequence exp_1 exp_2 ... exp_n) This syntax should be mapped (in `SyntaxConverter`) to the following new case that has been added to `AbstractSyntax.cexp`: \| Cseq of cexpression list and evaluated in the way sequences are evaluated in ML. That is, a sequence should evaluate each of the individual expressions in order, and return the value of the last expression as the value of the sequence.
Local Declarations with Let
	As you well know from your experience with ML, because the body of a function is often composed of several sub-computations, it is often desireable to evaluate these sub-expressions individually and assign names to the results, in order to make the computation clearer, and, in the case of a sub-expression used several times, more efficient. The basic syntax for declaring local variables is the `let` expression. The syntax of this special form is: (let ((var_1 exp_1) (var_2 exp_2) ... (var_n exp_n)) body_exp) The value of the `let` expression is the result of evaluating each of the n expressions and binding the n variables to the results and then evaluating the body expression in the resulting environment. In this basic form of the `let` expression each of the n expressions is evaluated in the same environment, the one that is active at the point that the `let` expression is evaluated. Often it is desirable to define one of the local variables in terms of one of the others already defined. While this can be done by using nested `let` expressions, this is annoying in practice. Therefore, a second form of `let` expression, introduced with the keyword `let` is used for this need. In order to support these two special forms, two new cases have been added to `AbstractSyntax.cexp`: \| Clet of (vname cexpression) list * cexpression \| Clet' of (vname * cexpression) list * cexpression `Clet'` is used for the `let` form (since `` is not allowed in ML identifiers).
What I have Provided:
	I have created a version of the SML-NJ interpreter with the parser structure `SexpParser`, the two syntax structures `ConcreteSyntax` and `AbstractSyntax`, and the `SyntaxConverter` structure, as well as the `UserInterface` functor, and the signature for the evaluator functors already loaded. The executable (or a link to it) is in `/cs/cs131/bin/sml-sexp`. The file `/cs/cs131/bin/sml-cs131` (which is the version loaded by `emacs`) has also been re-linked to this version. (I have also loaded a sample solution to the evaluator functor and built a structure called `testEvaluator` as well. You can use the functions in this structure from the SML command line to see what the results of evaluation should be.) I have also provided an executable sample solution at `/cs/cs131/bin/sml-soln-proj6`. You can use this to test how your solution should behave (from the standpoint of user interaction). This launches directly into my solution's interpreter. Finally, the file `/cs/cs131/proj_6/tests.funcalc`contains a variety of test functions that your solution should be able to handle.
What You Should Turn In
	Only the evaluator and the syntax converter get appreciable changes in this phase. As mentioned above, the changes to the syntax converter and user interface have already been made for you. Therefore, you should submit just the new implementation of `EvaluatorFun`. We will test your solution (which should be a single file containing just the `EvaluatorFun` functor and nothing else) by compiling it with the file `/cs/cs131/proj_6/make_sml-soln.sml` with your file `use`d in place of the `evaluator.sml` files currently used.

Project 06

Interpreter for Simple Calculator Language (Phase 03) Fruition

Purpose:

Special Note for Spring 2000:

Strings as a Datatype

Write and Newline

Conditonal Evaluation

Relational Operators

Sequencing

Local Declarations with Let

What I have Provided:

What You Should Turn In

This page copyright ©2000 by Joshua S. Hodas. It was built on a Macintosh. Last rebuilt on Tuesday March 21, 2000.

http://www.cs.hmc.edu/~hodas/courses/cs131/projects/project06.html

Interpreter for Simple Calculator Language (Phase 03)
Fruition