Initial design
We would like to design a compiler in a way that separates the complexity of its different parts. This separation will eventually allow us to focus on one piece of the compiler at a time, without having to worry about how the other pieces work or how they fit together.
1 Compiler phases
To manage complexity, compilers are organized into phases, through which the input source code passes on its way to becoming target code.
We have already discussed a simple but complete sequence of phases: tokenize, parse, optimize, and emit.
The compiler’s tokenize phase accepts the input source code, and the compiler’s emit phase produces the target code.
2 Initial design
We will now “implement” this design in Haskell. By “implement”, I mean that we will focus only on the types of the phases, rather than on the actions that they perform. We are designing the architecture and interfaces for the different parts of our compiler, leaving the actual implementation of the phases for later.
Compiler.hs
module Compiler where
{- Compiler pipeline -}
compiler :: String -> String                      -- (1)
compiler = emitter . optimizer . parser . lexer   -- (2)
{- Compiler phases -}                             -- (3)
lexer :: String -> [Token]
lexer = undefined
parser :: [Token] -> AST
parser = undefined
optimizer :: AST -> AST
optimizer = undefined
emitter :: AST -> String
emitter = undefined
{- Compiler data types -}                         -- (4)
data Token = Token
data AST = AST

(1) The overall compiler takes in a String (the contents of the source file) and produces as output another String (the contents of the target code).
(2) We implement the compiler as the composition of phases. Each phase is implemented as its own function.
(3) Each compiler phase indicates its input and output type. Because we are focusing only on the interfaces between phases, the bodies are left unimplemented.
(4) We also have placeholder definitions for the different types that the compiler uses.
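To check that these interfaces really do fit together, the placeholders can be filled in with toy definitions. The sketch below is not the design that the real phases will use: the representations of Token and AST (a wrapped word and a flat list of words) and every function body are assumptions, chosen only so that the pipeline runs end to end.

```haskell
-- Toy instantiation of the pipeline. Every body here is hypothetical.

{- Compiler data types (placeholder representations assumed for this sketch) -}
newtype Token = Token String
  deriving (Eq, Show)

newtype AST = AST [String]
  deriving (Eq, Show)

{- Compiler phases -}
-- Split the source on whitespace; a real lexer would classify tokens.
lexer :: String -> [Token]
lexer = map Token . words

-- Collect the token texts into a flat "tree"; a real parser would build structure.
parser :: [Token] -> AST
parser tokens = AST [text | Token text <- tokens]

-- Perform no optimization at all.
optimizer :: AST -> AST
optimizer = id

-- Write the words back out, separated by single spaces.
emitter :: AST -> String
emitter (AST ws) = unwords ws

{- Compiler pipeline -}
compiler :: String -> String
compiler = emitter . optimizer . parser . lexer
```

With these stand-ins, compiler "  let  x =  1 " evaluates to "let x = 1". The pipeline definition itself is unchanged from the original listing, which is the point of fixing the interfaces first: each body can later be replaced without touching the others.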
Notice that, because the . operator composes functions from right to left, we have to list the phases in reverse order: lexer, the first phase to run, appears last. If we want to list the phases in left-to-right order, we can use the >>> operator from Control.Arrow, like so:
import Control.Arrow ((>>>))
{- Compiler pipeline -}
compiler :: String -> String
compiler = lexer >>> parser >>> optimizer >>> emitter
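For ordinary functions, f >>> g means the same thing as g . f, so the two spellings of the pipeline compute the same result. A small sketch with stand-in phases illustrates this; the names trim, shout, and exclaim are hypothetical and have nothing to do with the compiler itself.

```haskell
import Control.Arrow ((>>>))
import Data.Char (isSpace, toUpper)

-- Three stand-in "phases", each of type String -> String (hypothetical examples).
trim :: String -> String
trim = dropWhile isSpace

shout :: String -> String
shout = map toUpper

exclaim :: String -> String
exclaim s = s ++ "!"

-- Right-to-left: the last phase to run is written first.
composed :: String -> String
composed = exclaim . shout . trim

-- Left-to-right: the phases appear in the order in which they run.
piped :: String -> String
piped = trim >>> shout >>> exclaim
```

Both composed "  hi" and piped "  hi" evaluate to "HI!". Which spelling to use is a matter of taste; the >>> version simply reads in the same order as the data flows.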