The Type System

Scheme 48 has a rudimentary type system. Its main purpose is to generate helpful compile-time diagnostics.

Currently you don't get much checking beyond wrong number of arguments warnings unless you're compiling a package that has an (OPTIMIZE ...) clause in its definition (e.g. (OPTIMIZE EXPAND) or (OPTIMIZE AUTO-INTEGRATE)). The reason that type checking is disabled most of the time is that it increases compilation time by about 33%.

A design goal is to assign types to all valid Scheme programs. That is, type warnings should not be generated for programs that could work according to Scheme's dynamic semantics. For example, no warning should be produced for

    (define (foo x y) (if x (+ y 1) (car y)))

Warnings could in principle be produced for particular calls to FOO that would definitely go wrong, such as (foo #t 'a).

The type system assumes that all code is potentially reachable. This means that there will be some warnings for programs that cannot go wrong, e.g. (if #t 3 (car 7)).

Additionally, it's assumed that in a (BEGIN ...) or combination, every argument or command will always be executed. This won't be the case if there can be a throw out of the middle. For example, in

    (call-with-current-continuation
      (lambda (k)
        (if (not (number? x))
	    (k #f))
        (+ x 1)))

the type system might deduce that X must be a number (which is false).

The type reconstruction algorithm (such as it is) is in bcomp/recon.scm. The implementation finds some specific procedure types for LAMBDA expressions, but generally gives up pretty quickly.

Notation

F : T means that form F has static type T. T1 <= T2, or T1 is under T2, means that T1 is a subtype of T2; that is, if a form of type T2 is acceptable in some context, then so is a form of type T1.

Non-expressions

Not every valid Scheme form is an expression. Forms that are not expressions are syntactic keywords, definitions, types, and structure names.

If a name is bound to a macro or special operator, then an occurrence of that name has type :SYNTAX. E.g.

    cond : :syntax

Definitions have type :DEFINITION. E.g.

    (begin (define x 1) (define y 2)) : :definition

Thus type checking subsumes syntax checking.

Types (other than :TYPE itself?) have type :TYPE.

The type of a structure is its interface. E.g.

    (define-structure foo (export a b) ...)
    foo : (export a b)

Values

All expressions have type :VALUES. They may have more specific types as well.

If E1 ... En have types T1 ... Tn with Ti <= :VALUE, then the expression (VALUES E1 ... En) has type (SOME-VALUES T1 ... Tn).

If T <= :VALUE then (SOME-VALUES T) is equivalent to T.

Procedure types

Procedure types have the form (PROCEDURE T1 T2), where T1 and T2 are under :VALUES. Examples:

    (lambda (x) (values x 1)) :
       (procedure (some-values :value) (some-values :value :number))

    cons : (procedure (some-values :value :value) :pair)

Fixed-arity procedure types (PROCEDURE (SOME-VALUES T1 ... TN) T) are so common that the abbreviated syntax (PROC (T1 ... Tn) T) is defined to mean the same thing. E.g.

    cons : (proc (:value :value) :pair)

E : (PROCEDURE T1 T2) means that in a call to a value of E, if the argument sequence has any type other than T1, then the call can be expected to "go wrong" (provoke a type error) at run time. This is not to say it will definitely go wrong, but that it is just a matter of luck if it doesn't. If the argument sequence does have type T1, then the call might or might not go wrong, and any return value(s) will have type T2.

For example,

    (lambda (x) (+ (begin (set! x '(3)) 5) (car x))) :
       (proc (:pair) :value),

because if the arguments to + are evaluated from right to left, and X is not a pair, then there will be a run time type error.

Some primitive procedures have their own special typing rules. Examples include VALUES, CALL-WITH-VALUES, and PRIMITIVE-CATCH.

Variable types

Assignable variables have type (VARIABLE T), where T for now will always be :VALUE. In (SET! V E), V must have type (VARIABLE T) for some T.

Loopholes

The construct (loophole T E) is considered to have type T no matter what type E has. Among other things, this allows a rudimentary static abstract data type facility. For example, record types defined using DEFINE-RECORD-TYPE (rts/bummed-jar-defrecord.scm) are established as new base types.

Type lattice

The subtype relation is implemented by the procedure COMPATIBLE-TYPES? (in bcomp/mtypes.scm). If (COMPATIBLE-TYPES? T1 T2) is 'definitely, then T1 <= T2. If it's #T, then T1 and T2 intersect.

The type lattice has no bottom or top elements.

The types :SYNTAX, :VALUES, :DEFINITION, :STRUCTURE, and :TYPE are incomparable and maximal.

The following are a comprehensive set of subtyping rules for the type system as it stands. Additional rules may be added in the future.

(SOME-VALUES T1 ... Tn) <= :VALUES.
If T1 <= T1', ..., Tn <= Tn' then (SOME-VALUES T1 ... Tn) <= (SOME-VALUES T1' ... Tn').
T <= (SOME-VALUES T).
Basic value types, which include :NUMBER, :CHAR, :BOOLEAN, :PAIR, :STRING, and :UNSPECIFIC, are all under :VALUE.
If T1' <= T1 and T2 <= T2', then (PROCEDURE T1 T2) <= (PROCEDURE T1' T2').
(VARIABLE T) <= T.
:ZERO, the result type of infinite loops and calls to continuations, is under :VALUE, but perhaps shouldn't be. (E.g. maybe it should be just under :VALUES instead.)
(EXPORT ( T) ...) is under :STRUCTURE. [Not yet implemented.]

Type well-formedness

In (SOME-VALUES T1 ... Tn), T1 ... Tn must be under :VALUE.

In (PROCEDURE T1 T2), T1 and T2 must be under :VALUES.

In (VARIABLE T), T must be under :VALUE.

Module system

The rules for interfaces and structures are not yet very well worked out.

Interfaces are types. The type of a structure is its interface. (Compare with Pebble's "bindings" and "declarations".)

An interface has the basic form (EXPORT ( ) ...). There are two kinds of abbreviations:

(EXPORT ... ...) means the same as (EXPORT ... ( :VALUE) ...)
(EXPORT ... (( ...) ) ...) means the same as (EXPORT ... ( ) ( ) etc. ...)

Distinct interfaces are not comparable.

If a form S has type (EXPORT ... (name T) ...), then the form (STRUCTURE-REF S name) has type T. Note that T needn't be a :VALUE type; e.g.

    (structure-ref scheme cond) : :syntax

When a package is loaded or otherwise compiled, the type that is reconstructed or inherited for each exported name is checked against the type specified in the signature. (Cf. procedure SCAN-STRUCTURES in bcomp/scan.scm.)

[explain the role of the expander in type checking... compile-call doesn't do much checking if the arguments aren't expanded...]

Future work

There probably ought to be dependent sums and products and/or universal and existential types. In particular, it would be nice to be able to get static checking for abstract types, even if they're not implemented using records.

Type constructors (like STREAM-OF or COMPUTATION-OF) would be nice.

There are many loose ends in the implementation. For example, type and type constructor names aren't always lexically scoped; sometimes their scope is global. Packages that open the LOOPHOLES structure (which exports LOOPHOLE) don't always open TYPES (which would be a bad idea given the way TYPES is currently defined); LOOPHOLE works in spite of that.

Figure out whether :TYPE : :TYPE.

Credits

Original by JAR, 20 July 93. Updated by JAR, 5 December 93.

Ownership, Maintenance and Disclaimers

Scheme48 Manual Top Page

Envision Manual Top Page

Last modified