------------------------

Harvey Mudd College
Computer Science 131
Programming Languages
Spring Semester 1998

Lecture 05 (2/3/99)

------------------------

------------------------

The next thing I want to talk about is the question of how names are scoped in ML. At present we have only seen two kinds of names: global ones and local formal parameter names.

You may not have experimented explicitly with the use of global names in your functions yet, and that's a good thing, because they are always a potential source of confusion. But you should understand that they are there.

For instance, we could define the following names:

val x = 3;
fun addx y = x + y;
addx 5;

Now, what happens if we "redefine" x?

val x = 7;
addx 5;

The second call to addx 5 still returns 8, not 12. This may seem weird, but it is a consequence of the meaning of names in ML. Unlike in imperative languages, in which variables are the names of storage locations, in ML and other functional languages variables are just names (or shorthand) for values. The easiest way to think about it is that when we use a val declaration we are not assigning a value to a name but a name to a value.

When I assigned the name x to 3 and then used it in addx, I should have been thinking of x as just a shorthand for 3. If that's so, then the fact that I later decided to use x as shorthand for some other value doesn't mean that I didn't want it to mean 3 at the time I wrote addx.

Obviously, though, addx was a lousy name; I should have called it add3.
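
A minimal sketch of the same session, with the values written as comments. The tuple named results is a hypothetical addition for illustration and is not part of the original example:

```sml
val x = 3;
fun addx y = x + y;   (* this x is the binding above, i.e. 3 *)

val x = 7;            (* a new binding; the old one is hidden, not changed *)

(* addx still uses the x it was defined with; new expressions see the new x *)
val results = (addx 5, x + 5);   (* (8, 12) *)
```

Nothing was overwritten here. The old x is simply no longer reachable by name from the top level, while addx still refers to it.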

While this may seem strange at first, it is actually a much more consistent behavior than in imperative languages. After all, if you write:

x = 3;
y = x + 2;
x = 4;
cout << y;

in C++, you don't expect the system to print 6, do you? It prints 5, because y was computed from the value x had at that moment.

Notice that this rule applies to all names. If I define a function, then another function which uses that function, and then I decide to redefine the original function, the second function will still continue using the original definition of the first function, even though the new definition is the one available at the top level. Watch:

fun add1 x = x + 1;
fun add2 x = add1 (add1 x);
add1 4;
add2 4;
fun add1 x = x + 3;
add1 4;
add2 4;

The second call to add2 4 still returns 6, even though add1 4 now returns 7. This may have confused you at times during your programming/debugging sessions.

Next class we'll talk about what to do if we really, really want to change the value associated with an existing name so that the change propagates to things already defined in terms of that name. For now I just wanted to get the ball rolling in our discussion of scope.

------------------------

Often we want to define a name so that it is local to some scope. This is useful when we want a name that just holds some temporary result, or when we want a function that supports the definition of another function but that shouldn't itself be externally available.

There are two main ways to restrict the scope of a name in ML: the let construct and the local construct. The let construct is written:

let
   definition1
   definition2
   ...
in 
   expression
end

and it can be used anywhere an expression is expected. The value of the whole construct is the value of the expression after the in.
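
As a small illustration of let used as an expression, here is a sketch in which two temporary names exist only long enough to compute one value. The names area, pi, and r are made up for this example:

```sml
val area =
  let
    val pi = 3.14159
    val r = 2.0
  in
    pi * r * r
  end;
(* only area is bound at this point; pi and r were local to the let *)
```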

The local construct is written:

local
   definition1
   definition2
   ...
in 
   definition
end

and it is used anywhere you are making a definition, rather than computing a value.

------------------------

So, for instance, the local construct is most often used to restrict access to some auxiliary function. For example:

local
   fun fact_aux result i n =  
          if (i > n)
            then result
            else fact_aux 
                    (i * result)
                    (i+1) n;
in
   fun fact n = fact_aux 1 1 n;
end;

This way the user has access to fact but has no idea fact_aux exists. Notice the compiler keeps its mouth shut concerning fact_aux.
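
The part between in and end may also hold more than one definition, so a single hidden helper can be shared by several visible functions. A minimal sketch, with hypothetical names (square, sum_of_squares, diff_of_squares) chosen only for illustration:

```sml
local
  (* hidden helper, visible only between local and end *)
  fun square (x : int) = x * x
in
  fun sum_of_squares a b = square a + square b
  fun diff_of_squares a b = square a - square b
end;

val s = sum_of_squares 3 4;    (* 25 *)
val d = diff_of_squares 5 3;   (* 16 *)
```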

------------------------

The let construct can be used either to provide temporary names, or to hide a function definition. For example, we previously defined filter as:

fun filter p nil = nil
  | filter p (h::t) = 
       if p h
         then h::(filter p t)
         else filter p t;

We could save a little typing, and perhaps make things clearer, by writing:

fun filter p nil = nil
  | filter p (h::t) = 
      let
         val filter_t = filter p t;
      in 
         if p h
           then h::filter_t
           else filter_t
      end;

In this case there is no difference in efficiency, since the expression is only ever evaluated in one of the two branches. But if it were actually computed in more than one place we would be saving computation as well as typing.
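
For a case where the saving is real, consider a sketch of exponentiation by repeated squaring. The function power is a hypothetical example, not one from earlier lectures, and it assumes n is not negative:

```sml
(* power x n computes x to the n-th power, assuming n >= 0 *)
fun power x 0 = 1
  | power x n =
      let
         val half = power x (n div 2)   (* computed once, used twice *)
      in
         if n mod 2 = 0
           then half * half
           else x * half * half
      end;

val p = power 2 10;   (* 1024 *)
```

Writing power x (n div 2) out twice instead of naming it would make two recursive calls at every level rather than one.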

As an example of hiding a function, we can use let in much the same way as local:

fun fact n = 
      let
         fun fact_aux result i n =  
                if (i > n)
                  then result
                  else fact_aux 
                         (i * result)
                         (i+1) n
      in
         fact_aux 1 1 n
      end;

One of the nice things about using let to define a sub-function is that, since it lies within the scope of the main function definition, the parameters of the main function are visible to it. So, in the last example we don't need to pass n to fact_aux, since it never changes during a run of that function. Instead we can write the definition as:

fun fact n = 
      let
         fun fact_aux result i =  
                if (i > n)
                  then result
                  else fact_aux 
                         (i * result)
                         (i+1)
      in
         fact_aux 1 1
      end;

------------------------

One problem that programmers often face is what a function should do in aberrant cases where there is no obvious sensible answer. This is generally caused by one of two situations.

------------------------

The first aberrant situation is where the input is reasonable, but there is no answer; for example, when you are looking up a name in a database, but there is no entry for the name. In languages with weaker type systems the typical solution is to pick some value out of the range of the return type and designate it as the undefined value; for example, returning 0 as the age of a person not in the database, as in:

fun lookup _ nil = 0
  | lookup search_name ((name,age)::t) = 
       if (name = search_name) 
         then age
         else lookup search_name t

fun lookup' name db = 
       case (lookup name db) of
          0   => (print "That name (" ; print name ; 
                  print ") was not in the database!")
        | age => (print name ; print " is " ; print (Int.toString age) ; 
                  print " years old.")

But this is a hack. The choice of value for "undefined" will depend heavily on the application, and sometimes there may be no good choice.

SML provides a solution to this problem in the form of option types. In general, for a type 'a, a value of type 'a option is either the value NONE or the compound value SOME v, where v is a value of type 'a.

So, for example, we can rewrite the last function as:

fun lookup _ nil = NONE
  | lookup search_name ((name,age)::t) = 
      if (name = search_name) 
         then SOME age
         else lookup search_name t

fun lookup' name db = 
       case (lookup name db) of
          NONE     => (print "That name (" ; print name ; 
                             print ") was not in the database!")
        | SOME age => (print name ; print " is " ; print (Int.toString age) ; 
                             print " years old.")

You can see from the type reported for lookup that this is a much more general solution. In fact, it would probably be more appropriate to write the first function as:

fun lookup _ nil = NONE
  | lookup search_key ((key,value)::t) = 
      if (key = search_key) 
         then SOME value
         else lookup search_key t
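
A short usage sketch of this general version follows. The sample data and the names ages, found, missing, and word are made up for illustration:

```sml
fun lookup _ nil = NONE
  | lookup search_key ((key,value)::t) =
      if (key = search_key)
         then SOME value
         else lookup search_key t;

(* reported type: ''a -> (''a * 'b) list -> 'b option *)

val ages = [("alice", 30), ("bob", 25)];          (* made-up data *)
val found   = lookup "bob" ages;                  (* SOME 25 *)
val missing = lookup "carol" ages;                (* NONE *)
val word    = lookup 2 [(1, "one"), (2, "two")];  (* SOME "two" *)
```

The same function serves both databases because nothing in it depends on the keys being strings or the values being ints; the keys only need to support equality.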

------------------------

In the other situation, the output is undefined because the input was somehow inappropriate. This is in general an error, and in ML it is handled using the exception mechanism. We have already encountered exceptions in the case of programs with non-exhaustive match errors. All SML run-time error messages occur as the result of some exception being raised because something has gone wrong. The exception mechanism is similar to that of Java, but is somewhat less cumbersome. In particular, you do not have to declare what exceptions you raise, and you are not responsible for wrapping every block that might raise an exception with a handler.

Suppose the hd list destructor were not built into the language. It is simple to define it ourselves as:

fun head (h::t) = h;

But what happens when we call it with an empty list as argument? How does this compare with the behavior of the built-in destructor? In order to have a custom error occur in this situation (which would be much more informative to the programmer) we need to first define the exception, and then raise it appropriately:

exception Head;

fun head nil = raise Head
  | head (h::t) = h;

In a few classes we will discuss how to trap exceptions so that they don't cause the program to break back to the main SML prompt. This will be important when we are writing our interpreters, since we want those to behave like self-contained systems.

In many cases it will be useful to have the exception carry with it some information to be used in the exception handler. For example, we might want to raise an exception when the factorial function is called with a negative number. For debugging purposes it would be useful if the error included the problem value. We can do this as follows:

exception factorial of int;

fun fact 0 = 1
  | fact n = if (n < 0)
               then raise (factorial n)
               else n * fact (n-1);

Unfortunately, the current implementation does not echo back the parameters of untrapped exceptions. But they will be available to you when you actually trap the error yourself.
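
Trapping is a topic for a later class, but as a brief preview, here is a sketch of how the carried value can be recovered. The handle expression and the names report and bad are additions for illustration only:

```sml
exception factorial of int;

fun fact 0 = 1
  | fact n = if (n < 0)
               then raise (factorial n)
               else n * fact (n-1);

(* handle is previewed here only to show that the int is recoverable *)
val report =
  (Int.toString (fact ~3))
    handle factorial bad => "fact called with " ^ Int.toString bad;
(* report is "fact called with ~3" *)
```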

By the way, what is unfortunate about the way I just defined factorial (aside from the fact that it is not tail-recursive)? How would you fix it to be more efficient?

------------------------

This page copyright ©1999 by Joshua S. Hodas. It was built on a Macintosh. Last rebuilt on Tuesday, February 3, 1999 at 1:59:10 PM.
http://cs.hmc.edu/~hodas/courses/cs131/lectures/lecture05.html