CS 132 Assignments

The directory $TIGER is /cs/cs132/src/tiger on turing.

Chapters 11 and 12
Due April 11, 2003

Get copies of all the files in $TIGER/chap11; these contain a complete implementation of the code for Chapter 10, which creates an interference graph for a sequence of assembly instructions.

Implement the Color and RegAlloc structures, as shown on Page 253 in Appel. Your code need not handle coalescing or spilling. Because of this, it may be easier to implement optimistic coloring from scratch --- based on the presentation given in class --- rather than trying to simplify the pseudocode in Chapter 11.

When you implement the simplify phase (picking an order to remove nodes from the interference graph) you should not actually modify the graph. (You'll need the original graph back for the actual coloring phase.) One possibility --- though certainly not the only one --- is to keep a list of the nodes not yet removed from the graph, and a mapping telling for each node in the graph how many of its neighbors have not yet been removed. There is an implementation of such lookup graphs in the structure Graph.Table.

Appel's suggestion that we add an intruction that uses all the special registers and the callee-save registers doesn't work in this case, for a couple reasons. It is therefore probably easiest to create two lists: SparcFrame.registers which contains all of the Sparc registers (which can be used as the values to map Temporaries 0..31 to hardware registers) and another list, something like SparcFrame.useableRegisters which contains the subset of Sparc registers to which you would be willing to allocate temporaries to (e.g., not the stack pointer or the zero register)

All that's left to do after the register allocator is working is cleanup:

Finish the procedure entry/exit routines as specified on pages 208 and 260--262. For procEntryExit3 you may assume that every function needs space in its stack frame for 6 outgoing arguments (i.e., that no function passes more than 6 arguments to its callees). This function should probably add the save and ret/restore instructions, and an initial .global (function-label) assembler pseudo-op to make sure the function's label is visible to other files.

Thus, the save instruction in the prolog must reserve 92 bytes + (number of escaping_locals * 4 bytes), making sure to keep the stack 8-byte aligned and also remembering that the argument to save should be negative since the stack grows downwards. (92 = 6 outgoing args (minimum permitted) * 4 bytes + 64 bytes for register-window spilling + 4 bytes for (unused) pointer for functions returning structures.)

The Sparc register windows take care of callee-save registers for us automatically, so there's no point in starting the routine by moving the callee-save registers into temporaries, and finishing by moving them back.

Next, the Frame.stringneeds to emit assembly code for string data. The output could look something like

  .section ".data"
(the-label):
  .word (the-length)
  .ascii (the-string-in-quotes)
  .align 4
  .section ".text"
[Actually, since strings are read-only, there's little point in putting them in the data segment, rather than leaving them in the read-only text segment with the code. But, the above a perfectly good example of how to allocate in the data segment.]

For testing purposes, the function Translate.getResult should probably reset the list of fragments back to nil; otherwise fragments from one test can appear in the output of future tests.

More hints on getting code running (some of which I probably should have mentioned for the previous assignment; sorry):

Submit your code as chapter 12.

Chapter 9
Due April 2, 2001

Read Chapter 9, and get copies of canon.sml from $TIGER/chap8 and the files in $TIGER/chap9. This assignment has two parts: implementing the function SparcGen.codegen (see the signature CODEGEN is given on page 207 of Appel), and extending the Frame signature and definition as specified on pages 208-209.

The function SparcFrame.string : Tree.label * string -> string is described on page 262 of Appel (see also the figure on page 260). It emits the SPARC assembly language to emit the given label and the given string in the SparcFrame.string. For this assignment, write a function that returns any string you want. (This will be fixed in a later assignment.)

The code in $TIGER/chap9/main.sml also assumes that your semant.sml has a function transProg which creates a new level for the main routine (with label "tigmain"), invokes transExp, passes it to Translate.procEntryExit to make a fragment for the main program, and finally calls Translate.getResult to return all the fragments.

Be careful with MOVE. Normally a T.TEMP or a T.MEM appearing in an expression stands for the value in the specified location. However, when it appears on the left-hand-side of a MOVE it is the location itself that is being referenced. So the right way to generate code for MOVE(e1,e2) is not to generate code that starts by computing the values of e1 and e2. The dst and src parts of an Assem.instr are exactly the def and use sets seen in class.

Chapter 7 (!)
Due March 26, 2003 (3 days after Spring Break)

First, make sure you have read Chapters 5, 6, and 7 in Appel. Combine the files from $TIGER/chap7 with a copy of your code. Of particular interest are the provided sparcframe.sml, frame.sig and the skeleton signature and implementation for Translate.

You may change any of the files however you like (e.g., rewrite code, change the definitions of types, etc.), but anything machine-specific should go in the sparcframe.sml file, and anything related to generic generation of intermediate-code should go in the translate.sml file.

  1. Implement the remaining parts of the PROGRAM section of chapter 6:
  2. Implement the remaining parts of the PROGRAM section of chapter 7:
    • Extend the code in semant.sml to return not only the type of the code, but the corresponding intermediate representation.

    • Follow Appel's organization: add lots of helper functions to the Translate module and its signature that actually do the translations; each one should generating the intermediate code for a particular language construct given the intermediate code for all of the sub-expressions.

    • A big part of the translation is simply figuring out how to get the right static link for an arbitrary variable. Remember that the static link is always the first formal parameter in a frame. (You can use SparcFrame.formals to get the access information for formal parameters including the static link, which will be at the head of the list.)

      A common bug with static links is to put the code to take one step in the chain after the code the follow the remaining links. The simplest solution is to write a frame-finding function with three arguments: (1) the frame you're currently at; (2) the frame you're looking for; (3) the code that computes a pointer to the frame given as the first argument.

    • Recall that function definitions return no code; they are passed to Translate.procEntryExit and stored as a "fragment".

    • Fix the parse.sml file if necessary if your changes make it stop typechecking.

    • Fix any bugs that you come across in the SparcFrame or Translate modules, or anywhere else.
Submit any files to which you have made changes.

Chapter 5 +
Due March 5, 2003

Implement the module Semant, as specified on pages 121-122 on Chapter 5. Be sure to read the chapter and familiarize yourself with $TIGER/chap5/types.sml before starting; the Tiger type system is a bit different.

The only thing one can do with an SML value of type unit ref (other than assign it the unit value --- which it already contains --- or to access this unit value) is to compare it with another unit ref for equality. Thus, one can (and Appel's supplied code does) use unit refs as an easy way to get "unique stamps" to distinguish between structurally identical types. One could just as well use a global counter to genereate unique type ids; the unit ref hack is arguably simpler.

PLUS: Implement the module FindEscape as discussed on pages 138--139 of the text. Your module must implement the function findEscape: Absyn.exp -> unit, but the names and types of any helper functions you define need not exactly match Appel's suggestions.

This should all be submitted as assignment 5.

Chapter 4
Due February 19, 2003

Adapt your parser to return abstract syntax trees, as specified on pages 100-101 in Chapter 4. You will need the files in $TIGER/chap4. The text of the chapter contains some example abstract syntax that might be informative.

Remember to group together adjacent type or function definitions into a single abstract syntax declaration.

You'll also need to change the %nonterm line to declare that your nonterminals now have interesting values associated with them rather than values of type unit. For example, if you have a nonterminal exp then you'll need to say exp of Absyn.exp .

Chapter 3
Due February 12, 2003

For reference you might want to look at the ML-Yacc documentation. However, it does not appear particularly helpful, so you should look closely at the examples in the book. (Be aware, however, the the book examples tend to be incomplete; every parser requires a %name, for example.)

Chapter 2
Due February 5, 2003

Carefully read the programming assignment on pages 31-32 of Appel, and implement the required lexer. You will need copies of the files from $TIGER/chap2. The given tiger.lex nearly empty, and will need a lot of work to fill in. One possible order would be:
  1. Make sure you are familiar with Appendix A, the Tiger Language Reference Manual.
  2. Extend the lexer to handle keywords, punctuation, and whitespace. (Warning: ML-Lex does not understand the escape sequence "\f" for form-feed, but you can use the escape sequence "\012" instead.)
  3. Extend the lexer to handle identifiers and integer constants. Remember that the string matched by the given regular expression is bound to the ML variable yytext.
  4. Extend the lexer to handle comments. To handle nesting of comments you will have to extend the approach on page 31 to count nesting depth (using a global counter) and changing states (i.e., calling YYBEGIN) only when appropriate. (Remember that that all states except INITIAL must be declared in the second (Lex-Definitions) portion of the input. The format is a line starting with %s and then a space-separated list of states terminated with a semicolon.) Detect unclosed comments.
  5. Extend the lexer to handle string literals. This is tricky because the lexer is required to return strings with all escape sequences have been expanded away.

    One method is to maintain a global buffer. The lexer can extend this buffer each time it recognizes a piece of the string and can call itself recursively via continue() to get the remaining pieces, until it reaches the end of the string. At this point the entire string can be returned. This technique requires adding at least one more state to the lexer.

    An alternate method is simply to match the entire string as a single token, and then write a separate translating function through which the lexer can pass the source text. This simplifies the lexer's organization, but requires you to implement your own code to detect escape sequences.

There are a number of sample Tiger files in $TIGER/testcases to try once you believe your lexer is working, though you will likely need to create your own files for testing a partially-implemented lexer. Submit your code with the following command line: cs132submit tiger.lex

Chapter 1
Due January 29, 2003

Carefully read the programming assignment on pages 10-12 of Appel, and implement the required functions. This will involve extending a copy of the file $TIGER/chap1/slp.sml with the following definitions:
  1. The function maxargs : stm -> int.
  2. The type table.
  3. The functions lookup : table * id -> int and update : table * id * int -> table.
  4. The functions interpStm : stm * table -> table and interpExp : exp * table -> int * table
  5. .
  6. The function interp : stm -> unit.

Submit your code with the following command line: cs132submit slp.sml