

Just as most of you have, up till now, done most of your programming in imperative languages, I suspect that most of you have worked exclusively with compiler-based languages.

A compiler is sometimes called a transformer because what it does is transform a program in one language to a program in another language such that the second program has the same ``meaning'' as the first. We are not any closer to that ``meaning''; we won't have it until the second program is actually executed. The purpose of the compiler is just to get us a program that can be executed so that we can eventually get the meaning.
The typical mode of working with a compiler is that you edit your program in an editor, then you run your compiler on it, then you run the program that results from the compilation process. Then, if there are bugs, you start the cycle over. Note that the compiler has no meaningful interaction with the user.

An interpreter is a very different beast. Imagine that when you wanted to work in C you ran GCC and it just gave you a prompt. Now at that prompt you could type in function definitions, in which case they would be checked for correct syntax and stored away if they were ok.
But suppose you could also just type in C expressions and the system
would execute them directly. So, you could type sqrt(9); and the
system would tell you 3. That's how an interpreter behaves.
An interpreter takes a program and directly computes its meaning (some value). There is no second object program to execute. Interpreters typically interact heavily with the user. While a program may be written in an editor and loaded into the interpreter for execution, it can also be typed directly into the interpreter. As expressions are typed they are evaluated and their values printed for the user.
This leads to what is known as the read-eval-print loop.
When you enter a function definition the interpreter recognizes that and simply stores it for later use.

Now one of the downsides of interpreters is that every time they use a function, they have to go through the process of ``interpreting'' its meaning relative to the arguments it is passed. In general this is slower than executing the equivalent compiled code.
The result is that nowadays many interpreters are not really interpreters, but what are called ``incremental compilers''. The idea is that when you type an expression at the prompt, the system doesn't interpret it, but rather compiles it, runs the compiled code, and then presents you the result.
If you enter a function definition the system just stores the compiled code for the function as its definition. The ML system we will be using is an incremental compiler.
I want to talk now for a couple of minutes about the history of ML. You don't really need to know this but I think that understanding the background behind a language is kind of important because it influences how the language comes to look and be the way it does.
ML was originally developed in the late 1970's at the University of Edinburgh in Scotland by Mike Gordon and a handful of collaborators.
Interestingly, ML wasn't originally intended by its designers as a general-purpose stand-alone language. Rather, Gordon was working on a theorem proving package called LCF (Logic for Computable Functions) that was designed to prove facts about programs written in a relatively simple language called PCF.
Now theorem proving in that setting is complicated and theorem provers like LCF don't just go off and prove things on their own. You have to give them a lot of hints, which are called tactics, about what to do in various situations.
ML was originally just the custom language used to program the tactics for LCF. Since it was a language in which the user was going to make statements about another object language, Mike Gordon just called it Meta Language. This was inevitably shortened to ML.
Now other people working on the LCF project and related things began to realize that ML was a really nice programming language in its own right. And so it has come to stand alone from the LCF project.
Eventually there were several variants of ML in development and use at a variety of institutions. The inevitable incompatibilities developed, and so in the mid 80's a series of discussions were held, culminating in the standardization of the language. The new language is called SML. Some books say that this stands for ``Standard Meta Language'' but that's a misnomer. ML used to stand for ``Meta Language'' but SML stands for ``Standard ML''.
Most MLs now conform pretty closely to the standard, though there are often a number of non-standard extensions. One ML that continues to develop apart from the standard, and yet has a fairly large following because of some very nice features it includes, is CAML, which stands for Categorical Abstract Machine Language. It was developed in France by the people at INRIA (the national computer science research institutes) in Paris and Nice and grew out of theoretical work in using a mathematical system called Category Theory as the underlying model of computation.
The ML (I should say SML but I get lazy) we will be using is SML-NJ (Standard ML of New Jersey) which is under continuing development by researchers in the Languages and Applied Logic group at Bell Labs (Lucent) together with researchers at Princeton University and the University of Pennsylvania.
Now let's start looking at SML for real.
To start SML, you just type sml
As I described, the nature of the read-eval-print loop is such that you can just type an expression and SML will respond with its value. For instance,
3;
3 + 2;
Notice that when SML responded with the value of the expression it also told me the value's type. Types are an extremely important aspect of SML (the feature which most distinguishes it from languages in the LISP family like Scheme and Common Lisp).
I'll talk about what the val it = ... stuff means in a minute.

There are two numeric base types, real and int.
The usual operators, +, -, and *
are defined for the integers and for the reals (but not for a mixed
pair of an integer and a real, as we will see below). Integer division is
done with the infix operator div, while / is
used for real division.
Note that - is used only for binary subtraction. The unary negation operator
is written '~'.
The full complement of relational operators, =, <, >,
<=, >=, and <> is supported
as well. Each takes either a pair of integers or a pair of reals and returns
a boolean value (one caveat: under the 1997 revision of the standard, = and <>
are not defined on reals); which brings us to the next base type, bool.
The booleans are a built in type with two values: false, and true.
Two booleans may be checked for equality (and inequality), but no other comparisons
or operations are allowed.
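
The operators described above can be tried out in a short sketch like the following. The names a through g are just illustrative bindings (val is explained a little further on), and the values in the comments are what each expression evaluates to.

```sml
(* Integers and reals share operator symbols but are never mixed. *)
val a = 7 div 2;          (* integer division: 3 *)
val b = 7.0 / 2.0;        (* real division: 3.5 *)
val c = ~5 + 3;           (* ~ is unary negation: ~2 *)
val d = 3 - ~2;           (* - is binary subtraction only: 5 *)
val e = 2 < 3;            (* relational operators return a bool: true *)
val f = 2.5 >= 3.5;       (* false *)
val g = (true = false);   (* booleans can be tested for equality: false *)
```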

There are lots of built in functions in ML and function application is pretty much like other languages, except you'll notice that you don't in general use parentheses. So for instance:
Math.sqrt 4.0;
You only use parentheses for grouping and precedence purposes, like:
Math.sqrt (4.0 + 12.0);
Notice that functions are themselves just named values:
Math.sqrt;
op +;
Real.~;

Now, ML is telling us by the arrow that sqrt is a function which takes a real and returns a real. As I said before, SML takes its types seriously, and if you violate them it will slap you on the wrist:
Math.sqrt 3;
A few functions like plus are overloaded so they work on different types:
3 + 4;
3.0 + 4.0;
But there are limits to this, so if we type:
3 + 4.0;
we'll get an error. In general SML does not support defining your own overloaded operators, though SML-NJ does provide hooks for doing it. Later we'll talk about a much richer concept than overloading called Polymorphism that allows appropriate functions to work on many different types.
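
If you really do want to combine an integer with a real, one option is to convert explicitly. The sketch below assumes the standard top-level functions real (int to real) and floor (real to int), which have not been introduced in these notes; the names n, x, sum1, sum2, and root are purely illustrative.

```sml
val n = 3;
val x = 4.0;

(* n + x would be rejected by the type checker, so convert one side. *)
val sum1 = real n + x;          (* both operands are reals: 7.0 *)
val sum2 = n + floor x;         (* both operands are ints: 7 *)

(* The same trick fixes the earlier wrist-slap from Math.sqrt 3. *)
val root = Math.sqrt (real 9);  (* 3.0 *)
```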

Now, to assign a name to a value you just use:
val x = 3+4;
Notice the system echoes back the assignment with the final value
filled in. Whenever you type an expression that is not an assignment
the system automatically assigns it the name it so you can use
the result in subsequent expressions:
3 + 4;
it + 5;
Notice that the result of the second expression now has the name it, so we have lost the previous result.
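
To hang on to a result before it is replaced, give it a name of its own. A minimal sketch, with x and y as illustrative names only:

```sml
val x = 3 + 4;     (* x is bound to 7 *)
3 + 4;             (* no name given, so the result is bound to it: 7 *)
it + 5;            (* it is rebound to 12; the earlier 7 is gone from it *)
val y = it * 2;    (* capture the current it under a permanent name: 24 *)
```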

The last base types are string and char. String constants are written
with double-quotes as in C.
All the relational operators described above are defined for strings.
You may also concatenate two strings using the ^ operator as in:
val s = "Hello " ^ "World!";
Character literals are written in a slightly odd notation, as singleton strings preceded by a '#' character. For example:
val c = #"h";
As above, the usual relational operators are supported for characters.
SML-NJ supports the usual
C-like mechanism of using the backslash to write the standard control codes,
such as \n for newline and \" for the double-quote character. These may be used in characters and strings.
The function explode takes a string and returns a list of
characters (more on lists later) and the function implode
does the opposite.
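
The string and character operations above fit together as in this short sketch. All of the names bound here are illustrative, and the list notation in the comments is explained later in these notes.

```sml
val s = "Hello " ^ "World!";             (* "Hello World!" *)
val earlier = "apple" < "banana";        (* strings compare in dictionary order: true *)
val c = #"h";
val cmp = #"a" < #"b";                   (* true *)

val cs = explode "cat";                  (* [#"c", #"a", #"t"] *)
val t = implode [#"d", #"o", #"g"];      (* "dog" *)
val same = (implode (explode s) = s);    (* round trip gives back s: true *)

val nl = #"\n";                          (* escapes work in characters... *)
val quoted = "She said \"hi\"\n";        (* ...and in strings *)
```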

Actually, I lied, those aren't all the base types. The last is an odd type
called unit, that is even simpler than the booleans.
It has only one value, which is written ().
Why does it exist? Mostly so that functions that are called only for their
effect, such as print, still have something to return.

Now, ML has three built-in structured types: tuples, records, and
lists.

Tuples are just unlabeled ordered pairs, triples, etc. So for
instance, we can have a pair of integers (2,3), a triple of
reals (2.3,3.4,5.6), or a quadruple of strings
("this","is","a","test").
Tuples are heterogeneous, which means that each field of the tuple can
contain a different type. So, we can have a tuple like:
val bigtup = (1,true,(),"test",(2,"hello","world"),3.14159);
Two tuples can be tested for equality and inequality, and they are
equal if each field is equal.
You can select out the fields of a tuple using numeric selectors:
#2 bigtup;
#5 bigtup;
#2 (#5 bigtup);
As it turns out, though, these selectors are rarely used because of the
availability of pattern matching, which I'll explain in a bit.

Records are similar to tuples but use field names rather than position
to distinguish the different fields. So for example:
val emprec1 = {name="Josh",ext=8650};
Fields are selected similarly to tuples:
#ext emprec1;
Records can also be compared for equality and inequality. This is done
fieldwise, by the name of the field, not position:
val emprec2 = {name="Ran",ext=8976};
val emprec3 = {ext=8650,name="Josh"};
emprec1 = emprec2;
emprec1 = emprec3;

Lists are homogeneous variable length structures. A list has a head
and a tail. The head is a single item of some type. The tail is a list
of that type. There is a special element named nil that is used
to terminate a list. Nil is a list of arbitrary type.
The basic notation for lists uses the infix constructor ::
which is pronounced `cons' for `construct'.
You would write a list of integers like:
val ilist1 = (1::2::3::nil);
No internal parentheses are needed here because cons associates to the
right. So the last expression is equivalent to
(1::(2::(3::nil)));
Notice that cons takes an element on the left and a list on the right.
So we can build up a list out of existing ones like:
val ilist2 = (4::ilist1);
Remember, though, that lists must be homogeneous, so:
val badlist = (1::2::3.0::4::nil);
will generate an error.
There is a more compact notation for lists that can be used when you
can enumerate all the elements of the list. Just use a pair of square
brackets with the elements separated by commas. In fact, notice that
this is how ML echoed our lists back to us before. The two notations
are entirely interchangeable. In the latter notation, nil is
written [].
Now I said before that each element of the list is a single item. But
these items can be of any first class type in the system. We can build
lists of ints as before, as well as lists of pairs of ints and strings
as in:
val wierd1 = [(1,"hello"),(2,"goodbye")];
Similarly we can build lists of lists:
val wierd2 = [[1,2],[3,4,5],[6]];
We can even build lists of functions:
val wierd3 = [op *,op div];
val wierd3a = [op *,op /];
You will need to get pretty good at reading ML types and understanding
them. For instance, notice the difference between the types of:
wierd1;
and
val wierd5 = ([1,2],["hello","goodbye"]);
As with tuples and records, lists come with a set of little used
selectors, hd which gives you the head of a list, and tl
which gives you the tail of a list:
wierd2;
hd wierd2;
tl wierd2;
hd (tl wierd2);
hd (tl (hd (tl wierd2)));
(hd wierd3) (2,3);
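
As a quick recap of the three structured types, here is a small sketch that can be loaded as a file. Every name and value in it is made up for illustration (none of it comes from the session above), and the comments give the value each binding receives.

```sml
(* Tuples: fields are picked out by position. *)
val t = (1, "one", true);
val label = #2 t;                                (* "one" *)

(* Records: fields are picked out by name, and order does not matter. *)
val emp = {name = "Ada", ext = 1234};
val who = #name emp;                             (* "Ada" *)
val sameRec = (emp = {ext = 1234, name = "Ada"}); (* true *)

(* Lists: cons and bracket notation describe the same values. *)
val xs = 1 :: 2 :: 3 :: nil;
val sameList = (xs = [1, 2, 3]);                 (* true *)
val ys = 0 :: xs;                                (* [0,1,2,3] *)
val first = hd ys;                               (* 0 *)
val rest = tl ys;                                (* [1,2,3] *)
val second = hd (tl ys);                         (* 1 *)

(* A list of tuples versus a tuple of lists. *)
val pairs = [(1, "a"), (2, "b")];                (* (int * string) list *)
val lists = ([1, 2], ["a", "b"]);                (* int list * string list *)

(* A list of functions; select one and apply it. *)
val ops = [op +, op *];
val product = (hd (tl ops)) (2, 3);              (* 6 *)
```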
This page copyright ©1999
by Joshua S. Hodas. It was built
on a Macintosh.
Last rebuilt on Tuesday, August 31, 1999 at 3:09:10 PM.
http://cs.hmc.edu/~hodas/courses/cs131/lectures/lecture02.html