Lecture 09

Harvey Mudd College
Computer Science 131
Programming Languages
Spring Semester 1999

Lecture 09 (2/22/99)

Continuing Introduction to ML
- The ML Module System, continued
  - Functors
Specifying Syntax
Concrete Syntax vs. Abstract Syntax

In the last lecture I introduced structures and signatures, which form the basis of the SML module system. As useful as these are, however, it is functors that provide the glue that makes the module system truly useful.

Consider the following: if functions did not take parameters, then they would have to do all their calculations on global variables, and would always produce the same results as long as the values of the variables were the same at the time the function was defined. In a similar way, structures can only depend on those type names and structure names already defined. Functors allow a structure definition to be given as a template, and instantiated to some appropriate parameters for an actual application.

Suppose we have the following signature for a lookup table:

(* A signature for dictionaries *)
 
signature DICTIONARY =
sig
   eqtype key
   type def
   type dict
    
   exception Already_Defined of key;
   exception Not_Defined of key;
 
   val newdict : dict;
   val lookup  : dict -> key -> def option
   val defined : dict -> key -> bool
   val define  : dict -> key -> def -> dict
   val undefine: dict -> key -> dict
end;

One structure matching this signature is:

(* A structure for string-string dictionaries *)
 
structure StringStringDictionary : DICTIONARY = 
struct
   type key = string
   type def = string
   datatype dict = dict of (key * def) list
 
   exception Not_Defined of key;
   exception Already_Defined of key;
 
   val newdict = dict nil;
 
   fun lookup_aux nil _ = NONE
     | lookup_aux ((key1,def1)::t) key = 
          if key1 = key
            then SOME def1
            else lookup_aux t key
 
   fun lookup (dict dct) key = lookup_aux dct key
 
   fun defined dct key = not (lookup dct key = NONE)
 
   fun define (dict' as (dict dct)) key def = 
          if (defined dict' key)
            then raise (Already_Defined key)
            else dict ((key,def)::dct)
 
   fun undefine_aux nil key = raise (Not_Defined key)
     | undefine_aux (dct as ((defn as (key1,def1))::t)) key =
          if key1 = key
            then t
            else defn::(undefine_aux t key)
 
   fun undefine (dict dct) key = dict (undefine_aux dct key)
end;

We could similarly define structures for any key/definition types as long as the key type is an equality type. Notice, however, that this very definition should work for all those cases. It would be silly to have to repeat it over and over. Can we just leave out the key and definition types and let the functions be polymorphic? No. The problem is that such a structure would not match the signature we were given.

The solution is to use a functor. Whenever we want to build an instance of this structure for a particular pair of types, we will just call the functor with the appropriate types.

The functor for this example is:

(* A functor that builds dictionary structures *)
 
functor Dictionary (eqtype K; type D) : DICTIONARY = 
struct
   type key = K
   type def = D
   datatype dict = dict of (key * def) list
 
   exception Not_Defined of key
   exception Already_Defined of key
 
   val newdict = dict nil
 
   fun lookup_aux nil _ = NONE
     | lookup_aux ((key1,def1)::t) key = 
          if key1 = key
            then SOME def1
            else lookup_aux t key
 
   fun lookup (dict dct) key = lookup_aux dct key
 
   fun defined dct key = case (lookup dct key) 
                            of NONE   => false
                             | SOME _ => true
 
   fun define (dict' as (dict dct)) key def = 
          if (defined dict' key)
            then raise (Already_Defined key)
            else dict ((key,def)::dct)
 
   fun undefine_aux nil key = raise (Not_Defined key)
     | undefine_aux (dct as ((defn as (key1,def1))::t)) key =
          if key1 = key
            then t
            else defn::(undefine_aux t key)
 
   fun undefine (dict dct) key = dict (undefine_aux dct key)
end;

If we wish to use this functor to build a structure for dictionaries with integer keys and string definitions, we type:

structure IntStringDict = Dictionary(type K = int; 
                                    type D = string);

The syntax of functor headers and applications is, admittedly, a bit strange. The semicolons between parameters are optional, but I think they make it a little easier to read. Also, be aware that a functor application (or a struct...end block, for that matter) is syntactically allowed only on the right-hand side of a structure definition statement. You cannot have unnamed structures.
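To see how a structure built this way is used, here is a minimal self-contained sketch. To keep it short it uses a cut-down signature and functor (MINI_DICT and MiniDictionary are names invented for this example and are not part of the lecture's code), but the header and application syntax are the same as for Dictionary.

```sml
(* Hypothetical cut-down dictionary signature, for illustration only *)
signature MINI_DICT =
sig
   eqtype key
   type def
   type dict

   val newdict : dict
   val define  : dict -> key -> def -> dict
   val lookup  : dict -> key -> def option
end

(* Hypothetical cut-down functor: no duplicate checking, no undefine *)
functor MiniDictionary (eqtype K; type D) : MINI_DICT =
struct
   type key = K
   type def = D
   type dict = (key * def) list

   val newdict = nil

   fun define dct key def = (key, def) :: dct

   fun lookup nil _ = NONE
     | lookup ((key1, def1) :: t) key =
          if key1 = key
            then SOME def1
            else lookup t key
end

(* Instantiate the template for integer keys and string definitions *)
structure IntStringDict = MiniDictionary (type K = int;
                                          type D = string)

val d1 = IntStringDict.define IntStringDict.newdict 1 "one"
val d2 = IntStringDict.define d1 2 "two"

val found   = IntStringDict.lookup d2 1    (* SOME "one" *)
val missing = IntStringDict.lookup d2 3    (* NONE *)
```

Each application of the functor yields a separate structure, so a second application with, say, string keys would sit alongside IntStringDict without any code being repeated.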

Note that I changed the definition of defined slightly. Why?
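A hint toward the answer: the = operator can only be applied at equality types, and the functor declares D with type rather than eqtype, so nothing guarantees that a def option can be compared with =. The small sketch below (isNone and succ are names made up for this illustration) shows that a pattern match has no such requirement.

```sml
(* Testing for NONE by pattern matching places no equality
   requirement on the type carried inside the option. *)
fun isNone NONE     = true
  | isNone (SOME _) = false

(* Function types never admit equality, so an expression such as
   "SOME succ = NONE" would be rejected by the type checker,
   while the pattern-matching test is accepted. *)
fun succ (x : int) = x + 1

val r1 = isNone (SOME succ)                     (* false *)
val r2 = isNone (NONE : (int -> int) option)    (* true *)
```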

A functor can be parameterized by a type, a value, or a structure. As an example of the last, consider that it is possible to build dictionaries that are more efficient if we know that the key type supports an ordering function. One option would be to build the dictionary as a binary search tree. Here I will just show a version that uses ordered lists. The functor now needs to receive the key and definition types as well as the ordering function for the keys. While these could be passed in separately, it makes sense to group the function with the key type. We will use the following signature for this bundle:

(* A signature for types with an ordering *)
 
signature ORDERING =
sig
   eqtype T
 
   val lt : T * T -> bool
end;

Now, we can build a functor for ordered dictionaries as:

(* A functor that builds sorted dictionary structures *)
 
functor SortedListDictionary (structure Ordering : ORDERING; type D) : DICTIONARY = 
struct
   type key = Ordering.T
   type def = D
   datatype dict = dict of (key * def) list
 
   exception Not_Defined of key
   exception Already_Defined of key
 
   val newdict = dict nil
 
   fun lookup_aux nil _ = NONE
     | lookup_aux ((key1,def1)::t) key = 
          if key1 = key
            then SOME def1
            else if Ordering.lt (key1,key)
                   then lookup_aux t key
                   else NONE;
 
   fun lookup (dict dct) key = lookup_aux dct key
 
   fun defined dct key = case (lookup dct key) 
                            of NONE   => false
                             | SOME _ => true
 
   fun define_aux nil key def = [(key,def)]
     | define_aux (dct as ((defn as(key1,def1))::t)) key def = 
          if key1 = key
            then raise (Already_Defined key)
            else if Ordering.lt(key1,key)
                   then defn::(define_aux t key def)
                   else (key,def)::dct
 
   fun define (dict dct) key def = dict (define_aux dct key def)
 
   fun undefine_aux nil key = raise (Not_Defined key)
     | undefine_aux (dct as ((defn as(key1,def1))::t)) key =
          if key1 = key
            then t
            else if Ordering.lt(key1,key)
                   then defn::(undefine_aux t key)
                   else raise (Not_Defined key)
 
   fun undefine (dict dct) key = dict (undefine_aux dct key)
end;

Then, if we have the ORDERING:

(* A structure for the integer ordering *)
 
structure IntOrder : ORDERING = 
struct
   type T = int
 
   val lt = Int.<
end;

we can use this functor to build a structure for fast dictionaries with integer keys and string definitions, by typing:

structure SortedIntStringDict =
              SortedListDictionary(structure Ordering = IntOrder; 
                                   type D = string);
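The payoff of passing the ordering in as a structure is that one functor body serves every ordered key type. The following self-contained sketch illustrates this with a much smaller functor than SortedListDictionary. SortedInsert, StringOrder, IntInsert and StringInsert are hypothetical names introduced only for this example, and IntOrder is redefined here so the block stands alone.

```sml
signature ORDERING =
sig
   eqtype T

   val lt : T * T -> bool
end

(* Hypothetical functor: insertion into a sorted list, using only
   the ordering supplied by its structure parameter. *)
functor SortedInsert (structure Ordering : ORDERING) =
struct
   fun insert nil x = [x]
     | insert (l as (h :: t)) x =
          if Ordering.lt (x, h)
            then x :: l
            else h :: insert t x
end

structure IntOrder : ORDERING =
struct
   type T = int
   fun lt (a : int, b) = a < b
end

structure StringOrder : ORDERING =
struct
   type T = string
   fun lt (a : string, b) = a < b
end

(* Two instances of the same template *)
structure IntInsert    = SortedInsert (structure Ordering = IntOrder)
structure StringInsert = SortedInsert (structure Ordering = StringOrder)

val ints  = foldl (fn (x, acc) => IntInsert.insert acc x) nil [3, 1, 2]
val words = foldl (fn (x, acc) => StringInsert.insert acc x) nil ["pear", "apple"]
(* ints is [1, 2, 3]; words is ["apple", "pear"] *)
```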

Suppose we are writing a functor that makes use of a DICTIONARY and also of a QUEUE, whose signature includes a queue type and an entry type. At some point in the code an element is taken off the queue and looked up in the dictionary. The functor would look something like:

functor SomeFunctor (structure Dict : DICTIONARY  and Queue : QUEUE) : SOME_SIG = 
struct
   ...

   fun lookUpHead dict queue = Dict.lookup dict (Queue.head queue);

   ...
end

But does (or should) this typecheck?

The types of the two functions are:

val Dict.lookup : Dict.dict -> Dict.key -> Dict.def option; 
val Queue.head : Queue.queue -> Queue.entry;

So, it all depends on the particular dictionary and queue that is sent in to the functor. If Dict.key is the same type as Queue.entry then we are fine. Otherwise, not.

We can tell the system that they will be the same, and thus that it should, on application of the functor, check that they are the same, by adding a sharing constraint to the functor header. In particular, the functor becomes:

functor SomeFunctor (structure Dict : DICTIONARY  and Queue : QUEUE
                     sharing type Dict.key = Queue.entry) : SOME_SIG =
struct
   ...

   fun lookUpHead dict queue = Dict.lookup dict (Queue.head queue);

   ...
end
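Since the lecture does not spell out QUEUE or SOME_SIG, here is a complete small instance of the same idea that can actually be loaded. All the names in it (QUEUE as defined here, LOOKUP, HeadLookup, IntQueue, Names, NameOfHead) are assumptions made for the sketch. LOOKUP stands in for DICTIONARY to keep the block short.

```sml
(* Hypothetical minimal signatures, for illustration only *)
signature QUEUE =
sig
   type entry
   type queue

   val empty   : queue
   val enqueue : queue -> entry -> queue
   val head    : queue -> entry
end

signature LOOKUP =
sig
   type key
   type def

   val lookup : key -> def option
end

(* The sharing constraint is what lets the body typecheck: without it
   Table.key and Queue.entry are unrelated abstract types. *)
functor HeadLookup (structure Table : LOOKUP and Queue : QUEUE
                    sharing type Table.key = Queue.entry) =
struct
   fun lookUpHead queue = Table.lookup (Queue.head queue)
end

structure IntQueue : QUEUE =
struct
   type entry = int
   type queue = int list

   val empty = nil
   fun enqueue q x = q @ [x]
   fun head nil      = raise Fail "empty queue"
     | head (x :: _) = x
end

structure Names : LOOKUP =
struct
   type key = int
   type def = string

   fun lookup 1 = SOME "one"
     | lookup _ = NONE
end

(* Accepted because Names.key and IntQueue.entry are both int *)
structure NameOfHead = HeadLookup (structure Table = Names
                                   structure Queue = IntQueue)

val answer = NameOfHead.lookUpHead (IntQueue.enqueue IntQueue.empty 1)
(* answer is SOME "one" *)
```

Applying HeadLookup to a queue whose entry type is, say, string would be rejected at the point of application, which is exactly the check the constraint asks for.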

Here we have enforced a sharing constraint on types. In a similar manner, it is also sometimes necessary to require that two structures share some common substructure. In that case we just use a structure sharing constraint as in:

functor Foo (structure Struct1 : SIG1 and Struct2 : SIG2
             sharing Struct1.sub1 = Struct2.sub2) : FOO_SIG =
struct

   ...

end
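As a concrete (and entirely hypothetical) instance, the sketch below has two parameter structures that each contain a substructure P, and requires those substructures to be shared. The constraint is written in the form given in the Definition of Standard ML, sharing S.P = C.P. The signatures POINT, SHAPE and CANVAS and all the structure names are invented for this example.

```sml
signature POINT  = sig  type t  val origin : t  end
signature SHAPE  = sig  structure P : POINT  val centre : P.t  end
signature CANVAS = sig  structure P : POINT  val plot : P.t -> string  end

(* The shape and the canvas must agree on what a point is *)
functor Draw (structure S : SHAPE and C : CANVAS
              sharing S.P = C.P) =
struct
   val picture = C.plot S.centre
end

structure IntPoint : POINT =
struct
   type t = int * int
   val origin = (0, 0)
end

structure Dot : SHAPE =
struct
   structure P = IntPoint
   val centre = P.origin
end

structure Text : CANVAS =
struct
   structure P = IntPoint
   fun plot (x, y) = "(" ^ Int.toString x ^ "," ^ Int.toString y ^ ")"
end

(* Both arguments were built from IntPoint, so the constraint holds *)
structure Drawing = Draw (structure S = Dot  structure C = Text)

val picture = Drawing.picture    (* "(0,0)" *)
```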

In discussing programming languages, one must first distinguish between the language's syntax, the rules that determine what strings of symbols are legitimate programs in the language, and its semantics, the rules that determine the meaning of a given string of symbols.

In general, it is the semantics that researchers are interested in. Which grammar rules are chosen for a language is rarely very important. So, often when presenting examples, a Programming Languages person will pick some non-specific artificial syntax that is generic to the class of languages she is demonstrating some point about. She will not, for example, use C or Pascal syntax, but rather some generic block-structured-imperative-language syntax.

Nevertheless, to build an interpreter or compiler for a language, we must be in a position to analyze its grammar. Therefore, for the next three or so lectures, we will be discussing the problem of parsing: analyzing a string of symbols to see if it belongs to the language we are interested in, and converting it to some internal, more manageable form that we can hand to the interpreter/code-generator.

The most common way to specify the grammar of a language is in what is known as Backus-Naur Form, or BNF. There are several different styles for writing BNF rules. This is a common one.

A partial BNF for ANSI C functions might be given as:

fundef     ::= [type_id] id "(" [param_list] ")" block
block      ::= "{" [local_list] stat_list "}"
param_list ::= [type_id] id | [type_id] id "," param_list
local_list ::= type_id id_list ";" [local_list]
id_list    ::= id | id "," id_list
stat_list  ::= statement ";" [stat_list]
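To make the reading of such a rule concrete, here is a rough sketch of a recognizer for just the id_list rule, working on a list of tokens that is assumed to have been produced already. The token datatype and the function idList are inventions for this illustration. Note that the function commits to the second alternative as soon as it sees a comma, so it is only a simple approximation of what a real parser does.

```sml
(* Hypothetical token type covering only what id_list needs *)
datatype token = ID of string | COMMA | SEMI

(* id_list ::= id | id "," id_list
   Returns the identifiers found together with the unconsumed tokens,
   or NONE if the input does not start with an id_list. *)
fun idList (ID x :: COMMA :: rest) =
       (case idList rest of
           SOME (ids, rest') => SOME (x :: ids, rest')
         | NONE              => NONE)
  | idList (ID x :: rest) = SOME ([x], rest)
  | idList _              = NONE

val ok  = idList [ID "a", COMMA, ID "b", SEMI]
(* ok is SOME (["a", "b"], [SEMI]) *)

val bad = idList [COMMA, ID "a"]
(* bad is NONE: an id_list cannot begin with a comma *)
```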

We will discuss how to build and analyze grammars starting next class. For the rest of this lecture I want to discuss the difference between concrete syntax and abstract syntax and how they relate to your next assignment. Consider the following two functions:

fun fact x = if (x = 0) then 1 else x * (fact (x - 1));

function swap( var x:integer, var y:integer) : integer;
var temp:integer;
begin
   temp := x;
   x := y;
   y := temp;
end;

Does everyone recognize these languages and understand what the functions do?

Well, whatever you're thinking, you're almost certainly wrong. When I wrote them I had in mind a rex program and a C program, and I argue that that is exactly what they are, for any purpose that you could really care about.

The BNF for C functions above is concrete syntax. It is written in terms of the actual strings of symbols that we will use to distinguish parts of a program. Clearly, though, at the interpreter/code-generator level we are not interested in whether C uses braces or begin/end pairs for delimiting blocks. We can describe the structure of a function much more analytically as:

function = Function(name : identifier, return : type, parameters : var list, body : block)
var      = Var(name : identifier, type : type)
block    = Block(locals : var list, code : statement list)

Notice that this abstracts away many details. For instance, we no longer care that the parameter list for a function and the list of local variables are defined with different syntaxes.

The beauty of this idea (or perhaps the beauty of ML) is that it should be obvious that abstract syntax can be mimicked directly in ML datatypes. And ML programs can be easily built to analyze these structures.

So, for example, the C function:

int foo(int x) { x++; return x; }

could be represented by the ML value:

val foo = Function(Id "foo", Type "int", [Var (Id "x", Type "int")],
                   Block([],[Inc(Var (Id "x")),Return(Var (Id "x"))]));
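For that value to be accepted by an ML system, the constructors have to be declared somewhere. The lecture does not give those declarations, so the ones below are only one plausible guess. In particular, this sketch introduces a separate, hypothetical constructor VarRef for a use of a variable inside an expression, so that Var can be reserved for declarations. The two small analysis functions, paramNames and returnsValue, are likewise made up for the example.

```sml
(* One possible set of datatypes for the abstract syntax (an assumption) *)
datatype identifier = Id of string
datatype ty         = Type of string
datatype var        = Var of identifier * ty
datatype expr       = VarRef of identifier
datatype statement  = Inc of expr
                    | Return of expr
datatype block      = Block of var list * statement list
datatype function   = Function of identifier * ty * var list * block

(* int foo(int x) { x++; return x; } *)
val foo = Function (Id "foo", Type "int", [Var (Id "x", Type "int")],
                    Block ([], [Inc (VarRef (Id "x")),
                                Return (VarRef (Id "x"))]))

(* Analyses are written by pattern matching on the constructors *)
fun paramNames (Function (_, _, params, _)) =
       map (fn Var (Id name, _) => name) params

fun returnsValue (Function (_, _, _, Block (_, code))) =
       List.exists (fn Return _ => true | _ => false) code

val names   = paramNames foo      (* ["x"] *)
val returns = returnsValue foo    (* true *)
```

Nothing in these functions depends on whether the original program used braces or begin/end pairs, which is the point of working with abstract syntax.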

This page copyright ©1999 by Joshua S. Hodas. It was built on a Macintosh. Last rebuilt on Monday, February 22, 1999 at 4:10 PM.

http://cs.hmc.edu/~hodas/courses/cs131/lectures/lecture09.html
