CS 60 URL http://cs.hmc.edu/~keller/rex/lite.html
Last updated 5 September 1999 by keller@cs.hmc.edu. Please report any errors or inaccuracies to keller@cs.hmc.edu.
The light-weight guide to the rex language and system.
Not all available functions are given here. For the full list, see http://cs.hmc.edu/~keller/rex.
Content Links
Arithmetic | Arity | Booleans | Characters | Commands | Comments | Definitions | Expressions | Forms | Failure | Incremental | Interrupt | Lists | Local | Logical | Misc | Quick Commands | Operators | Precedence | Read Loop | Relational | Rules | Sequences | Strings | Variables
In UNIX implementations, a file name to be read initially as rex source can be specified on the command line. Example, with the bold-face representing what the user types:
UNIX > rex fac.rex fac.rex loaded 1 rex > fac(10); 3628800
rex runs an interactive "read loop", also called read-eval-print loop. This means that the following cycle repeats until the user types end-of-file (e.g. control-d):
In some cases, it is a little more complicated than this: the evaluation can be interleaved with printing the answer. This is helpful when the answer is very long, and essential when the answer is infinite. An evaluation can be interrupted by typing control-C a couple of times.
Quick commands provide an alternative for entering system and help commands with a minimum of key strokes. A quick command is identified by a single line starting with * and terminated by return (no ;) The following quick commands are available:
*i
Filename
*h
*q
*r
*t
*
N
*
-N
**
Comments are as in C++ and cannot be recursive (no comments inside comments). They either start with
// and continue to end-of-line, or start with
/* and continue to */.
There are three types of rex expressions:
All expressions to be evaluated are terminated with a semi-colon. (end-of-line doesn't terminate an expression, because expressions can take an arbitrary number of lines, and it is not generally evident when there isn't more to an expression.)
Definitions bind variables and function names. A definition appears as
LHS = RHS;
where LHS is the variable or variables being defined, and RHS is what defines them. Examples are:
cube(x) = x*x; // defines the cube function alist = [1, 2, 3, 4]; // defines a list variable called alist [x, y, z] = foo(alist); // defines variables x, y, and z,
Rules define parts of functions by using pattern matching and guards, and are collected together by rex to make a definition. A rule has the form: LHS => RHS; (The symbol => is an = followed by a > and is read "yields"). The following are examples of rules (for the greatest-common-divisor function):
gcd(0, y) => y;
gcd(x, y) => x <= y ? gcd(y % x, x);
gcd(x, y) => gcd(y, x);
The second rule above uses a guard, an expression which must be true in order for the rule to apply. As another example, the following rules use recursion and pattern matching to find the last element in a non-empty list.
last([X]) => X; last([_ | X]) => last(X);
A failure is said to take place when a function evaluation is attempted but there is no matching rule. This is a level-1 failure. For example, last([]) in the case of the above rules would result in such a failure. A level N+1 failure occurs when a function evaluation calls another function which results in a level N failure. A level-0 failure is said to occur if the left-hand side of an equational guard does not match the right-hand side. Failures are usually the result of programming errors.
Although all expressions have a value, the remaining kinds of expressions are entered for purposes of evaluation. They yield a value which is printed. They can also have side-effects, but the routine use of side-effects is discouraged. The computations of these expressions are done in the context of definitions and rules.
For example, to use the gcd and last functions defined above in a computation, we'd need to enter an expression:
gcd(56, 72);
rex would respond with the answer 8. If we entered:
last([1, 2, 3, 4, 5]);
rex would respond with the answer 5.
A variable is a sequence of contiguous characters starting with a letter (upper or lower case) or underscore, and containing letters, underscores, and digits. Names of functions are actually variables bound to anonymous function values.
The following kinds of operators are available:
Arithmetic, logical, and relational operators are pretty much the same as in C and C++.
The following lists operator precedence tightest to weakest. All operators group left unless otherwise indicated.
These functions are defined as follows:
N_{1} +
N_{2} + .... + N_{n}
N_{1} -
N_{2}
N_{1} *
N_{2} * .... * N_{n}
N_{1} /
N_{2}
N_{1} %
N_{2}
For example, divides(3)(N_{1}) will return 1 whenever N_{1} divides 3.
These apply to more than just numeric values. For example, lists and arrays can be compared in lexicographic order.
The Boolean model used by rex is: 0, [ ] (the empty list), and failure values are false, and anything else is true. When a built-in returns a Boolean, it is either 0 or 1.
The logical operators are:
List formation is represented by [X_{0}, X_{1}, ...., X_{n-1}]. A list result will show in this fashion as well, unless it is converted to some other format. The null list is [ ].
Lists may be written to and read from files as S expressions by single commands.
A list may be formed from an item and an existing list by [X | L], meaning to construct the list consisting of X followed by the elements of L. Another way to get the same effect is cons(X, L). With respect to the list thus created, X is the "first" of the list and L is the "rest".
This notation may be generalized: [X_{0}, X_{1}, ...., X_{n-1} | L] forms the list starting with the n elements indicated, followed by the elements of L.
An "improper list" is formed by [X | Y] where Y is not a list. A result of this form will show as [X | Y]. Programming will generally be simplified if improper lists are avoided.
Incremental and "infinite" lists may be created by deferring the rest of the list: In [X |$ L] the $ acts to defer the evaluation of L. Taking the rest of the list will give the seed $L which is not further evaluated until needed (e.g. in the case of checking to see whether the rest is the empty list.
Other list functions may be found under sequence functions.
Other selections from lists may be made by treating
the list as a function from {0, 1, 2, ..., length-1} to values:
i.e. L(i)
is the ith element of L.
If N_{1 }<= N_{2}, range(N_{1}, N_{2}) yields the list [N_{1}, N_{1}+1, N_{1}+2, ...., N_{2}].
If N_{1} > N_{2}, range(N_{1}, N_{2}) yields the list [N_{1}, N_{1}-1, N_{1}-2, ...., N_{2}].
range(N_{1}, N_{2}, K) yields the list [N_{1}, N_{1}+K, N_{1}+2*K, ...., N_{2}] in case that N_{1} <= N_{2} and K is positive, or N_{1} >= N_{2} and K is negative. Otherwise the result is [ ].
Examples:
range(1, 9) ==> [1, 2, 3, 4, 5, 6, 7, 8, 9] range(9, 1) ==> [9, 8, 7, 6, 5, 4, 3, 2, 1] range(1, 9, 2) ==> [1, 3, 5, 7, 9] range(9, 1, -2) ==> [9, 7, 5, 3, 1]
from(N, K) creates the infinite list [N, N+K, N+2*K, ...]. If N is floating, the items will be floating. If K is floating, but N is not, all but the first item will be floating. Otherwise all items will be fixed.
A literal string must be in double quotes, e.g.
"this is a string"
Strings are distinguished from single characters, which are in single quotes, e.g. 'x'. In either case, backslash (\) can be used as an escape character. The standard C escape codes apply:
\n newline |
\\ backslash |
\b backspace |
\v vertical tab |
\t tab |
\f form-feed |
\a bell |
\r carriage return |
Note: As with sequences, a particular character of a string may be extracted by treating the string as a function: S(i), where i is between 0 and (length of the string) -1 yields the ith character of S.
isalpha: returns 1 if its argument is alphabetic, otherwise returns 0
isdigit: returns 1 if its argument is a digit, otherwise returns 0
isupper: returns 1 if its argument is upper-case, otherwise returns 0
islower: returns 1 if its argument is lower-case, otherwise returns 0
ispunct: returns 1 if its argument is punctuation, otherwise returns 0
isspace: returns 1 if its argument is whitespace, otherwise returns 0
iscntrl: returns 1 if its argument is a control character, otherwise returns 0
By a sequence we mean either a list or an array and these functions apply to both. In a few cases they also apply to strings. The ones which apply to incremental or infinite lists, as well as finite ones are annotated as "incremental".
assoc(X, L)
yields the first list element of an association list
having X as its first item. If there is no such
item, yields [ ].
For example, assoc('c', [['a', 1], ['b', 2], ['c', 3], ['d', 4]]) will
return [c, 3].
every(N, L, K)
yields a list consisting of every
Nth element of L starting at the Kth (where K = 0 means the first,
etc. [incremental]
find(P, L)
yields the suffix of L beginning with
the first item X for which P(X). If there is no such
item, yields [ ], unless L is infinite, in which case this
function diverges. [See also keep.]
find(P, L)
yields index (starting with 0 as the
first index) of the first item X for which P(X). If there
is no such item, yields -1, unless L is infinite, in which case
this function diverges.
find(P, L)
yields the list of indices of all
items X for which P(X). If L is infinite but there are only
finitely-many such X, this function diverges after producing the
indices of those X. [See also keep.]
[incremental]
keep(P, S)
is the array or list resulting from
keeping those elements for which P is true. [incremental]
leaves(A)
treats the list as the root of a tree,
and returns a list of the leaves of the tree. The tree can have
arrays or lists as nodes.
leafcount(A)
is the length of leaves(A)
length(S)
returns the length of the list, array,
or string
map(F, S)
constructs a
new array or list the same length as S, by applying F to each
element of S. map(B, S1, S2)
S1 and S2 should be both
arrays or both lists. The result is a new array or list
constructed by applying binary function B to the elements of S1
and S2, element-by-element. [incremental]
mappend(F, L)
Assuming L is a list, and F is a
function which will return a list when applied to a member of L,
returns the result of appending the elements of map(F, L)
together in order.
merge(L, M)
assuming that L and M are in order
according to the built-in order <, merges them together into a
single with respect to <. [incremental]
merge(P, L, M)
assuming that L and M are in order
according to the partial order P, represented as a binary
function, merges them together into a single list in order with
respect to P. [incremental]
no(P, L)
yields 1 if no X occurring in L is such
that P(X). Otherwise yields 0.
prefix(N, L)
yields the first N items in L. If L
doesn't have N items, yields all of L. [incremental]
reduce(B, U, S)
Here B is a binary operator with
unit U and S is an array or list. The result is the accumulation
of applying B to elements in S.
For example, reduce(+, 0, S)
is the sum of
elements in S, while reduce(*, 1, S)
is the product.
Grouping is to the left, i.e. if S were [S0, S1, S2, ... Sn], then
the result is B( .... B(B(B(U, S0), S1), S2) ...., Sn).
[incremental]
remove_duplicates(L)
returns a list with any
duplicates removed [incremental]
reverse(L)
returns the list consisting of
elements of L in reverse order
scale(F, S)
constructs a new array or list the same length as S, in
which each element of the original is multiplied by F.
[incremental]
scan(B, S)
results in an array or list (of
the same length as S) of partial accumulations of B
applied to the elements of S. For example, scan(+, S),
where S = [S0, S1, S2, ... Sn], yields [S0, S0+S1, S0+S1+S2, ....,
S0+....+Sn-1]. [incremental]
some(P, L)
yields 1 if an X occurs in L such that
P(X). Otherwise yields 0.
sort(S)
returns an array or list (whichever S is)
containing the elements of S in sorted order. The ordering is the
natural one built into rex.
suffix(N, L)
yields the last N items in L. If L
doesn't have N items, yields all of L. [incremental]
zip(L, M)
yields the list formed by alternating
an element of L with an element of M [incremental]
sin(N)
returns the sine of N, where N is
in radians. Similarly for cos, tan, atan, ....
log(N)
returns the natural logarithm of N.
log(N_{1}, N_{2})
returns the logarithm of N_{2}
base N_{1}.
id(Datum)
returns Datum (useful in certain
contexts, e.g. make_array).
k(Datum)
makes a constant function with value
Datum. That is, it returns a function which, when applied
to any argument, returns Datum. i.e. k(Datum)(Anything)
is Datum.
test(Expression1, Expression2)
evaluates the
expressions and compares them for equality. Usually Expression2 is
the expected answer. Prints a message indicating whether the test
is "ok" or "bad".
type(A)
returns a string naming the type of its
argument A. The following types are available:
The following functions return 1 or 0, depending on whether the argument is of the corresponding type:
Guard
is
the guard, a logical expression, and
Body is an expression. The rule applies only if the guard is true,
in which case the result is the value of Body.
Condition
? True_Alternative :
False_Alternative
Condition is evaluated first and if true, the result is the value of True_Alternative, otherwise it is the value of False_Alternative.
Head => Body
is used in two ways: to define rules and to define anonymous functions.
To define a rule:
Head
=>
Body
;
where Head is a function name with arguments, defines one rule for the named function.
Example: foo([A | L], N) => foo(L, A+N);
To define an anonymous function:
Head
=>
Body
where Head is just a list of arguments in parentheses, is an "anonymous function" (also known as a "lambda expression").
Example: (F, X) => F(F(X)) is the function of two arguments, say F and X, which applies F to X, then applies F to the resulting value.
Some built-in functions are overloaded. For example,
map has two or three arguments. When a
built-in function is passed as an argument, the ambiguity of which
function is to be used can be resolved by appending to the function
name a # followed by the arity, e.g. map#2
vs.
map#3
. The same applies to operators, e.g.
+#1
vs. +#2
. Some functions can take an
arbitrary number of arguments, e.g. max takes any number greater than
zero of arguments. Similarly, +
is two different
functions: +#1
is the "curried" +, while
+#_
is the addition function with any other number of
arguments. If # is not attached, rex will make a choice of arity. For
example, the binary arithmetic functions are the choice for
+, -, *, /, %.
A local definition takes the form
LHS = RHS, Expression;
The value of the overall expression is the value of Expression in the context of the bindings made in the equation. These bindings are local, and hold only for the evaluation of Expression. They have no impact on existing bindings to the variables. Example:
root = sqrt(x), f(root, g(root));
Here root is bound to the value of sqrt(x)
, for the
evaluation of f(root, g(root))
. One use of a local
definition is to prevent repeated expensive evaluation of an
expression.
When the right-hand side of a rule has this form, we call it an equational guard. The idea is that the rule only applies if the definition succeeds. As with global definitions, a definition will fail if there is a mis-match between the LHS and RHS. Example:
f(x) => root = sqrt(x), g(root,h(root));
Here the value of f(x)
is computed by binding root to
sqrt(x)
, then returning the value g(root,
h(root))
. Since root is only a single variable, the pattern is
trivial and this guard will never fail. If the evaluation of g
fails, this translates to a failure of the whole function f,
not to the failure of a guard.
When an ordinary local definition is made, the values of variables and functions on the RHS is that determined by the surrounding context. Sometimes it is desirable to make those values be exactly the ones being defined locally. This is particularly true in the case of defining a recursive function locally. For example, suppose for some reason we wished to apply a local definition of the factorial function. The following would not work:
f(N) = N < 2 ? 1 : N*f(N-1), f(10);
The reason this doesn't work is that the f on the RHS of the equation (but not the one after the comma) is not the local f, but rather the surrounding f, if indeed there is one. In order to make both f's be the same, so that recursion will work, we need to use a recursive local definition, signified by braces {....}. The proper way to define and use f is:
{ f(N) = N < 2 ? 1 : N*f(N-1); f(10) };
In general, the expressions inside braces consist of some number of equations terminated by semi-colon, followed by an expression, the value of which becomes the value of the overall brace expression. Another use of local recursive definitions is in infinite-list functions with loops. For example, to define a function which is the infinite repetition of an item X, without recomputing X, we can define
repeat(X) = { Y = [X |$ Y]; Y };
Here it is Y which is the same on both the left- and
right-hand sides of the equation
Y = [X |$ Y]
.
System commands look just like expressions to be evaluated, but they generally have side-effects, such as reading a rex source file. The "function" sys is thus not really a mathematical function. The second argument to sys is interpreted literally as one of a number of sub-commands, and there will usually be additional arguments, which are either literal or evaluated, depending on the sub-command.
The in sub-command:
sys(in, Filename);
Here in is literal and Filename is an expression which must evaluate to a string. That string is looked up according to the usual interpretation and loaded as a series of rex expressions.
The on and off sub-commands:
sys(on, Flag); and sys(off, Flag);
Here on and off are literal and Flag is interpreted literally (without evaluation). The corresponding flag is turned on or off. The following are possible flags: