Compiler Step 4: Pre-type rewriting

This step applies rewriting rules that must follow variable renaming, because they introduce new function names. These rules, however, do not require type information and do not need to follow any later restructuring rules (notably scanner substitution). Its substeps, in order, are:

apply miscellaneous rewriting rules and insert code that triggers run-time errors when missing values are passed to forms that cannot accept them
apply arithmetic identities
expand sheet handlers
purity analysis

Guard clauses must be generated before the restructuring in step 5b. That restructuring assumes that side effects are produced only by forms that do not return values, so a side effect can occur inside an expression only as a non-final element of a begin form.

Purity analysis must also precede the step 5b restructuring, for two reasons. First, it generates useless forms (because parts of them are discovered to be unnecessary tests for missing values) which will be cleaned up by the first step in 5b. Second, step 5b generates code to detect missing inputs to arithmetic expressions and generate missing outputs. It depends on the fact that the purity analysis has marked (with pure-ref) variable references whose values cannot be missing.

In principle (with some rewriting), many parts of step 4 could probably be done during or after step 5a (type inference and type-dependent rewriting). However, keeping step 4 separate from step 5a gives a clean conceptual division that helps organize the compiler.

Rewriting rules

If forms are specialized into unary-if and binary-if. Atan forms are specialized into unary-atan and binary-atan. The operator - is renamed to unary-minus when it has only one input.
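
The arity-based specialization above can be sketched as follows. This is only an illustration, not the compiler's own code: forms are modelled as nested Python tuples, and the assumption is that the number of inputs alone selects the specialized name (a unary-if has a test and one clause, a binary-if a test and two clauses).

```python
# Forms are modelled as nested tuples: (operator, input, ...).
# Assumption: the input count alone decides which specialized name is used.
SPECIALIZED = {
    ("if", 2): "unary-if",      # test + one clause
    ("if", 3): "binary-if",     # test + two clauses
    ("atan", 1): "unary-atan",
    ("atan", 2): "binary-atan",
    ("-", 1): "unary-minus",
}

def specialize(form):
    """Recursively rename operators whose name depends on input count."""
    if not isinstance(form, tuple) or not form:
        return form  # symbols and constants are left alone
    head, *inputs = form
    inputs = [specialize(i) for i in inputs]
    head = SPECIALIZED.get((head, len(inputs)), head)
    return (head, *inputs)
```

For example, `specialize(("if", "t", ("-", "x")))` yields `("unary-if", "t", ("unary-minus", "x"))`, while a two-input `-` form is returned unchanged.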

Make-point forms with only one input are simplified.

Make-missing is rewritten as make-missing-integer. (This will be automatically coerced to make-missing-real when appropriate, by subsequent compiler steps.)

Calls to external functions are wrapped directly inside a set! or multiple-set! form (as appropriate), unless they are already directly inside such a form or do not return any values. New local variables are allocated as required.

Expect forms are analyzed. Expect forms containing conjunctions are broken up into a sequence of expect forms. Expressions of the form (not (missing? expr)) are rewritten using verify-non-missing. Forms containing other expressions are rewritten using unary-if and icp-error.
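
A minimal sketch of the expect rewriting described above, with forms modelled as nested Python tuples. Two details are assumptions rather than facts from this page: that conjunctions are spelled `and`, and that the fallback has the shape `(unary-if (not test) (icp-error))`. The source names only the operators involved.

```python
def rewrite_expect(test):
    """Return the list of forms that replace (expect test)."""
    # A conjunction becomes a sequence of independently rewritten expects.
    # Assumption: conjunctions are headed by "and".
    if isinstance(test, tuple) and test and test[0] == "and":
        out = []
        for part in test[1:]:
            out.extend(rewrite_expect(part))
        return out
    # (not (missing? expr)) becomes (verify-non-missing expr).
    if (isinstance(test, tuple) and len(test) == 2 and test[0] == "not"
            and isinstance(test[1], tuple) and len(test[1]) == 2
            and test[1][0] == "missing?"):
        return [("verify-non-missing", test[1][1])]
    # Anything else: signal an error when the test fails.
    # Assumed output shape; only unary-if and icp-error are named in the text.
    return [("unary-if", ("not", test), ("icp-error",))]
```

Applied to `("and", ("not", ("missing?", "x")), (">", "x", 0))`, this produces one verify-non-missing form for `x` followed by one unary-if form guarding the comparison.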

In principle, verify-non-missing is unnecessary: such forms could be rewritten using unary-if. However, the resulting code tends to be more complex through the middle parts of the compiler because it takes the compiler longer to eliminate the unary-if forms if they turn out to be redundant. Verify-non-missing simplifies debugging.

Finally, the inputs to forms headed by any of the following operators are rewritten to trigger a run-time error if they are missing.

sample-= sample-< sample-<= sample-> sample->= nearest-sample shift-sample sheet-ref = <= < > >= evenly-divides

The rewriting for an input value depends on its type:

constant point (including arithmetic operations applied to constant inputs): no check required
symbol x: (begin (verify-non-missing x) expr)
any other form s: (begin (set! x s) (verify-non-missing x) expr)
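
The three cases above might be sketched like this. Several points are assumptions made for illustration: `fresh_var` is a hypothetical name generator, plain numbers stand in for constant points (constant arithmetic is not folded here), `expr` is taken to be the original form with each guarded input replaced by its variable, and the checks for all inputs are collected into a single begin form.

```python
import itertools

_counter = itertools.count(1)

def fresh_var():
    """Hypothetical generator of fresh temporary variable names."""
    return f"tmp{next(_counter)}"

def guard_input(value):
    """Return (prefix_forms, replacement) for one input of a guarded form."""
    if isinstance(value, (int, float)):
        return [], value                      # constant: no check required
    if isinstance(value, str):
        return [("verify-non-missing", value)], value   # symbol x
    tmp = fresh_var()                         # any other form s
    return [("set!", tmp, value), ("verify-non-missing", tmp)], tmp

def guard_form(form):
    """Wrap a form so that every input is checked before the form runs."""
    head, *inputs = form
    prefix, new_inputs = [], []
    for value in inputs:
        checks, replacement = guard_input(value)
        prefix.extend(checks)
        new_inputs.append(replacement)
    body = (head, *new_inputs)
    return ("begin", *prefix, body) if prefix else body
```

With this sketch, `guard_form(("<", "a", 3))` gives `("begin", ("verify-non-missing", "a"), ("<", "a", 3))`, and a form whose inputs are all constants is returned unchanged.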

Sheet handlers

The following forms for handling samples are rewritten using accessors that disassemble a sample into its constituents and reassemble it. When the rewritten code accesses an input more than once, and that input is not a literal constant or a symbol, the compiler binds it to a temporary variable so that the subform is evaluated only once.

sample comparison forms: sample-=, sample-<, sample-=<, sample->, sample-=>
sample->point
shift-sample
sample-offset
nearest-sample
sheet-ref, sample-ref, sample-set!, sample-erase!
in-focus-area

Purity analysis

This step walks through the code in execution order, maintaining a list of which variables are known not to be missing (pure). References to pure variables are wrapped in the special form pure-ref. Tests for whether a form is missing are suppressed if that form is pure.

A form is pure if (recursive definition):

It is a constant point.
It is headed by a strict operator and its inputs are pure.
It is headed by a pure operator (pure-ref, sheet-domain-offset, sheet-domain-scaling, sheet-codomain-offset, sheet-codomain-scaling, sample-unscaled-point).
It is headed by binary-if and both its inputs are pure.
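
The recursive definition above translates fairly directly into a predicate. In this sketch, forms are nested Python tuples and numbers stand in for constant points. The set of strict operators is a hypothetical subset chosen for illustration, since this page does not enumerate them, and "both its inputs" is read as the two clauses of the binary-if rather than its test.

```python
PURE_OPERATORS = {
    "pure-ref", "sheet-domain-offset", "sheet-domain-scaling",
    "sheet-codomain-offset", "sheet-codomain-scaling",
    "sample-unscaled-point",
}

# Assumption: illustrative subset only; the real list is not given here.
STRICT_OPERATORS = {"+", "*", "unary-minus"}

def is_pure(form):
    """True if the form is known not to evaluate to a missing value."""
    if isinstance(form, (int, float)):
        return True                  # constant (numbers stand in for points)
    if not isinstance(form, tuple) or not form:
        return False                 # bare symbol: pure only inside pure-ref
    head, *inputs = form
    if head in PURE_OPERATORS:
        return True
    if head in STRICT_OPERATORS:
        return all(is_pure(i) for i in inputs)
    if head == "binary-if":
        # Assumption: the two clauses must be pure; the test is not examined.
        return len(inputs) == 3 and is_pure(inputs[1]) and is_pure(inputs[2])
    return False
```

Under these assumptions `("+", 1, ("pure-ref", "x"))` is pure, while `("+", 1, "x")` is not, because the bare reference to `x` has not been marked.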

A variable can become pure in several ways:

It is checked by a verify-non-missing form.
It is set to a pure value.
We are inside the first clause of a binary-if form and the test is a conjunction including the fact that this variable is non-missing.
We are inside the second clause of a binary-if form and the test is a disjunction including the fact that this variable is missing.
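
The last two cases can be illustrated by extracting, from a binary-if test, the variables known to be non-missing in each clause. This is a sketch under stated assumptions: conjunctions and disjunctions are spelled `and` and `or`, and only direct `(missing? x)` tests on symbols are recognized.

```python
def pure_in_first_clause(test):
    """Variables known non-missing when the test is true."""
    if isinstance(test, tuple) and test:
        if test[0] == "and":          # assumed spelling of conjunction
            found = set()
            for part in test[1:]:
                found |= pure_in_first_clause(part)
            return found
        if (len(test) == 2 and test[0] == "not"
                and isinstance(test[1], tuple) and len(test[1]) == 2
                and test[1][0] == "missing?" and isinstance(test[1][1], str)):
            return {test[1][1]}
    return set()

def pure_in_second_clause(test):
    """Variables known non-missing when the test is false."""
    if isinstance(test, tuple) and test:
        if test[0] == "or":           # assumed spelling of disjunction
            found = set()
            for part in test[1:]:
                found |= pure_in_second_clause(part)
            return found
        if (len(test) == 2 and test[0] == "missing?"
                and isinstance(test[1], str)):
            return {test[1]}
    return set()
```

For the test `("or", ("missing?", "x"), ("missing?", "y"))`, the second clause can treat both `x` and `y` as pure, since it is entered only when every disjunct is false.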

The variables pure after a binary-if form are those pure at the end of both clauses. The variables pure after a unary-if form are those pure after the test and also pure after the single clause. Exception: if the unary-if clause always triggers an error, we also include variables which will cause the clause to be entered if they are missing.
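
The merge rules above amount to set intersections, with one extra union for the error case. The function and parameter names below are hypothetical; the sketch assumes that the analysis has already computed the pure sets at the relevant program points.

```python
def pure_after_binary_if(pure_end_first, pure_end_second):
    """Pure variables after a binary-if: pure at the end of both clauses."""
    return set(pure_end_first) & set(pure_end_second)

def pure_after_unary_if(pure_after_test, pure_end_clause,
                        clause_always_errors=False,
                        entered_if_missing=frozenset()):
    """Pure variables after a unary-if.

    entered_if_missing: variables whose being missing would cause the
    clause to be entered. If the clause always triggers an error, control
    only continues past the form when those variables are not missing.
    """
    result = set(pure_after_test) & set(pure_end_clause)
    if clause_always_errors:
        result |= set(entered_if_missing)
    return result
```

For instance, with `{"a", "b"}` pure at the end of one clause and `{"b", "c"}` at the end of the other, only `b` is pure after the binary-if.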

For scan and while, we walk through the loop repeatedly, revising the purity analysis, until we reach a fixed point. This process always converges, because each pass can only remove variables from the list of pure ones, never add them. This fixed point S is the list of variables pure at the start of the loop.
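
The fixed-point iteration can be sketched as below. The callback `walk_body` is hypothetical: it stands for one pass of the analysis over the loop, taking the set of variables pure at the start of an iteration and returning the set pure at its end. Because the candidate set only shrinks and is finite, the loop must terminate.

```python
def loop_entry_pure_set(pure_before_loop, walk_body):
    """Compute S, the variables pure at the start of every iteration."""
    s = set(pure_before_loop)
    while True:
        pure_at_end = set(walk_body(frozenset(s)))
        # An iteration starts either from the code before the loop or from
        # the end of the previous iteration, so a variable stays in S only
        # if it is pure on both paths.
        new_s = s & pure_at_end
        if new_s == s:
            return s              # fixed point reached
        s = new_s                 # strictly smaller than before
```

As a usage example, a body that may set `x` to a missing value and always sets `y` to a pure value can be modelled as `lambda p: (set(p) - {"x"}) | {"y"}`. Starting from `{"x", "z"}`, the result is `{"z"}`: `x` is dropped because the body can make it missing, and `y` is not added because it is not pure on first entry.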

For a while form, S is the set of variables pure when the test is evaluated. When evaluating the while body and forms after the while, we also include any variables made pure during evaluation of the test. We do not do this for scan forms, as one cannot guarantee that the test is ever evaluated or evaluated just prior to halting. Scan forms may also halt, or never run, because they run out of sheet locations.

Verify-non-missing tests are replaced by their inputs if the input is pure. The input cannot be removed entirely, as it might trigger side-effects. Useless forms will be pruned by later parts of the compiler. Similarly, (missing? expr) is rewritten as (begin expr #f) if expr is pure.
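
These two simplifications can be sketched as a single-form rewrite. The purity predicate is passed in as a parameter (hypothetical signature), Python's `False` stands in for `#f`, and the sketch looks only at the outermost form rather than recursing.

```python
def simplify_missing_test(form, is_pure):
    """Drop a missing-value test whose input is known to be pure."""
    if not isinstance(form, tuple) or len(form) != 2:
        return form
    head, arg = form
    if head == "verify-non-missing" and is_pure(arg):
        # Keep the input itself: evaluating it might have side effects.
        return arg
    if head == "missing?" and is_pure(arg):
        # Evaluate the input for its effects, then yield false (#f).
        return ("begin", arg, False)
    return form
```

With a predicate that accepts only pure-ref forms, `("missing?", ("pure-ref", "x"))` becomes `("begin", ("pure-ref", "x"), False)`, while `("missing?", "y")` is left alone.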

Language changes

Step 4 introduces the following new functions:

unary-atan binary-atan sample-unscaled-point binary-if unary-if unary-minus guaranteed-divide point-coord make-missing-integer make-into-integer make-sample verify-non-missing no-op pure-ref minus vector-<= unscaled-sheet-ref icp-error real-sheet-ref evenly-divides unscaled-sheet-set! integer-sheet-ref sheet-domain-offset sheet-codomain-offset sheet-domain-scaling sheet-codomain-scaling

It removes the following functions from the language:

if atan expect sample-= sample-< sample-=< sample-> sample-=> sample->point shift-sample sample-offset nearest-sample sheet-ref sample-ref sample-set! sample-erase! in-focus-area make-missing
