Configuration
Configuration allows the person using the compiler to control its behavior, for example by setting flags on the command line. The flag information could be stored in our CompilerConfiguration data structure. To use this information, a phase would need access to a value of type CompilerConfiguration. We can provide this value as an argument to the function. Let us redesign our compiler phases so that each one takes a CompilerConfiguration as input.
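As a rough sketch of what "a phase reading a flag" could look like, here is one possible shape for the configuration. The fields `optimizationLevel` and `warningsAsErrors`, the list-of-strings `AST`, and the body of `optimize` are all assumptions made up for illustration; the tutorial itself leaves these types abstract.

```haskell
-- Hypothetical fields, for illustration only. The tutorial leaves
-- CompilerConfiguration abstract.
data CompilerConfiguration = CompilerConfiguration
  { optimizationLevel :: Int   -- e.g. filled in from a command-line flag
  , warningsAsErrors  :: Bool
  } deriving (Show, Eq)

-- Stand-in AST: just a list of instruction names.
newtype AST = AST [String]
  deriving (Show, Eq)

-- A phase that consults the configuration: "nop" instructions are
-- removed only when optimization is switched on.
optimize :: AST -> CompilerConfiguration -> AST
optimize (AST instructions) configuration
  | optimizationLevel configuration > 0 = AST (filter (/= "nop") instructions)
  | otherwise                            = AST instructions
```

The phase is an ordinary function; the configuration is simply one more argument it can inspect.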
1 Redesign the interface
The new signatures for our compiler and phases will now be:
compiler :: String -> CompilerConfiguration -> String
tokenize :: String -> CompilerConfiguration -> [Token]
parse    :: [Token] -> CompilerConfiguration -> AST
optimize :: AST -> CompilerConfiguration -> AST
emit     :: AST -> CompilerConfiguration -> String

Unfortunately, all phases now take two arguments, so we cannot compose them as easily as we did before.
2 Composing phases
We will have to change how we compose our phases. Here is an implementation that works:
compiler :: String -> CompilerConfiguration -> String
compiler input configuration =
  let tokens = tokenize input configuration
      ast    = parse tokens configuration
      ast'   = optimize ast configuration
      result = emit ast' configuration
  in  result

In this implementation, we have to feed the configuration to each phase. It is still somewhat readable, but there is a neat trick we can do to recover the readability of our original pipeline design.
Threading
Let us create a new phase-composition operator that “threads” (i.e., composes) the phases together, along with the needed configuration. This operator just generalizes what we did above: take the output of one phase and feed it as the input to the next, along with the configuration.
-- | Compose two compiler phases into a single phase
(>.>) :: (a -> CompilerConfiguration -> b) -> (b -> CompilerConfiguration -> c) -> (a -> CompilerConfiguration -> c)
(>.>) phase1 phase2 input configuration =
  let phase1Result = phase1 input configuration
      phase2Result = phase2 phase1Result configuration
  in  phase2Result

I made up a name for our operator: >.>. We could have chosen just about any name. I chose this name because the . conveys function composition, and the > > is similar to our previous design.
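To see the operator do something concrete, here is a small sketch with toy phases. The `keepComments` flag and the phases `splitWords` and `countTokens` are invented for this example and are not part of the compiler; the point is only that threading by hand and threading with the operator give the same result.

```haskell
-- Hypothetical configuration with a single made-up flag.
data CompilerConfiguration = CompilerConfiguration
  { keepComments :: Bool }

-- | Compose two compiler phases into a single phase
(>.>) :: (a -> CompilerConfiguration -> b)
      -> (b -> CompilerConfiguration -> c)
      -> (a -> CompilerConfiguration -> c)
(>.>) phase1 phase2 input configuration =
  phase2 (phase1 input configuration) configuration

-- Toy phase: split on whitespace, dropping words that start with '#'
-- unless the configuration says to keep them.
splitWords :: String -> CompilerConfiguration -> [String]
splitWords input configuration
  | keepComments configuration = words input
  | otherwise                  = filter (not . isComment) (words input)
  where
    isComment word = take 1 word == "#"

-- Toy phase: ignores the configuration, but must still accept it.
countTokens :: [String] -> CompilerConfiguration -> Int
countTokens tokens _ = length tokens

-- Threading the configuration by hand...
countByHand :: String -> CompilerConfiguration -> Int
countByHand input configuration =
  let tokens = splitWords input configuration
  in  countTokens tokens configuration

-- ...and the same pipeline written with the operator.
countThreaded :: String -> CompilerConfiguration -> Int
countThreaded = splitWords >.> countTokens
```

With the input "a #b c", both versions evaluate to 2 when `keepComments` is False and to 3 when it is True.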
Now we define our pipeline as before, but with our new composition operator:
{- Compiler pipeline -}
compiler :: String -> CompilerConfiguration -> String
compiler = tokenize >.> parse >.> optimize >.> emit

With this design, it will be much easier to add a new phase!
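As a sketch of what adding a phase might look like, suppose we wanted a `typecheck` phase between `parse` and `optimize`. That phase is hypothetical, and every phase body below is a trivial stand-in chosen only so the example runs; the data types are simplified for the same reason.

```haskell
-- Stand-in data types, simplified so the example can run.
newtype Token = Token String
  deriving (Show, Eq)

newtype AST = AST [Token]
  deriving (Show, Eq)

data CompilerConfiguration = CompilerConfiguration

-- | Compose two compiler phases into a single phase
(>.>) :: (a -> CompilerConfiguration -> b)
      -> (b -> CompilerConfiguration -> c)
      -> (a -> CompilerConfiguration -> c)
(>.>) phase1 phase2 input configuration =
  phase2 (phase1 input configuration) configuration

-- Stand-in phases: none of them uses the configuration yet.
tokenize :: String -> CompilerConfiguration -> [Token]
tokenize input _ = map Token (words input)

parse :: [Token] -> CompilerConfiguration -> AST
parse tokens _ = AST tokens

-- Hypothetical new phase: a placeholder that accepts every program.
typecheck :: AST -> CompilerConfiguration -> AST
typecheck ast _ = ast

optimize :: AST -> CompilerConfiguration -> AST
optimize ast _ = ast

emit :: AST -> CompilerConfiguration -> String
emit (AST tokens) _ = unwords [text | Token text <- tokens]

-- Only this line changes when the phase is added.
compiler :: String -> CompilerConfiguration -> String
compiler = tokenize >.> parse >.> typecheck >.> optimize >.> emit
```

Because the new phase has the same shape as the others (an input, a configuration, an output), it slots into the chain without touching the neighboring phases.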
3 Full code
Here is all the code for our compiler, whose phases can read from a configuration.
The code that has changed from our previous version is emphasized.
Compiler.hs
module Compiler where
{- Compiler pipeline -}
compiler :: String -> CompilerConfiguration -> String
compiler = tokenize >.> parse >.> optimize >.> emit
-- | Compose two compiler phases into a single phase
(>.>) :: (a -> CompilerConfiguration -> b) -> (b -> CompilerConfiguration -> c) -> (a -> CompilerConfiguration -> c)
(>.>) phase1 phase2 input configuration =
  let phase1Result = phase1 input configuration
      phase2Result = phase2 phase1Result configuration
  in  phase2Result
{- Compiler phases -}
tokenize :: String -> CompilerConfiguration -> [Token]
tokenize = undefined
parse :: [Token] -> CompilerConfiguration -> AST
parse = undefined
optimize :: AST -> CompilerConfiguration -> AST
optimize = undefined
emit :: AST -> CompilerConfiguration -> String
emit = undefined
{- Compiler data types -}
data Token = Token
data AST = AST
data CompilerConfiguration = CompilerConfiguration
data CompilerLog = CompilerLog
data CompilerState = CompilerState
data CompilerError = CompilerError