Example
Now we will show an example of our compiler running, using the threaded information at various points. This is a toy example, not a real compiler, but it should give a sense of how to start using the features of this design, including:
- Running the compiler on some input.
- Logging information about each phase.
- Throwing an error.
- Conditionally executing code based on the value of a compilation flag.
To demonstrate these features, we extend the data types for various phases to be slightly more useful. In particular, we add a compiler flag for optimization, and we use strings for log entries and error messages.
data Token = Token
data AST = AST
newtype CompilerConfiguration = CompilerConfiguration {optimizeFlag :: Bool}
type CompilerLog = [CompilerLogEntry]
type CompilerLogEntry = String
data CompilerState = CompilerState
type CompilerError = String
Here is the full example code, with commentary that follows.
Compiler.hs
module Compiler where
import Control.Monad (when, (>=>))
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Control.Monad.RWS (RWST, ask, runRWST, tell)
{- Compiler pipeline -}
compiler :: CompilerPhase String String
compiler = tokenize >=> parse >=> optimize >=> emit
{- Running the compiler -} -- (1)
runCompilerM :: String -> CompilerConfiguration -> IO (Either CompilerError String, CompilerLog)
runCompilerM source config = do
  (result, _, log) <- runRWST (runExceptT (compiler source)) config CompilerState
  return (result, log)
{- Compiler phases -}
tokenize :: CompilerPhase String [Token]
tokenize input = do
  tell ["Tokenizing source code"] -- (2)
  if input == "nope"
    then throwError "Tokenization failed on the secret word" -- (3)
    else return [Token] -- (4)
parse :: CompilerPhase [Token] AST
parse tokens = do
  tell ["Parsing tokens into AST"]
  return AST
optimize :: CompilerPhase AST AST
optimize ast = do
  config <- ask -- (5)
  when (optimizeFlag config) $ -- (6)
    tell ["Optimizing AST"]
  return ast
emit :: CompilerPhase AST String
emit ast = do
  tell ["Emitting machine code"]
  return "machine code!"
{- Compiler data types -}
type CompilerPhase a b = a -> ExceptT CompilerError (RWST CompilerConfiguration CompilerLog CompilerState IO) b
data Token = Token
data AST = AST
newtype CompilerConfiguration = CompilerConfiguration {optimizeFlag :: Bool}
type CompilerLog = [CompilerLogEntry]
type CompilerLogEntry = String
data CompilerState = CompilerState
type CompilerError = String
- (1) There is a price to pay for hiding most of the complexity in an mtl monad: we have to do a lot to actually run the compiler. The library functions runExceptT and runRWST "unstack" the monad transformers and run the monadic computation: runExceptT removes the outer ExceptT layer first, and runRWST then runs the RWST layer underneath. When we extract the value, we get back a triple of the result, the state, and the log. The type of this value is (Either CompilerError String, CompilerState, CompilerLog). For the implementation of runCompilerM, we choose to ignore the state and return only the result and the log.
- (2) The library function tell takes as input a value of type CompilerLog and appends that value to the existing log.
- (3) The library function throwError takes a value of type CompilerError and essentially "returns" an error in the monad. To demonstrate errors, we have our compiler refuse to tokenize if the input string is "nope".
- (4) Otherwise, the tokenizer returns a valid token list.
- (5) The library function ask retrieves the configuration from the monad. The type of config on this line is CompilerConfiguration.
- (6) We use the library function when to conditionally execute some monadic code. The condition is the value of optimizeFlag in the configuration. When the flag is set, we write a message to the log to simulate optimization. Of course, this compiler can't really perform any optimization, so logging is all that happens. Regardless of whether the flag is set, we pass the AST on to the emitter.
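The four library functions discussed above can also be tried in isolation. The following is a minimal sketch, not part of the compiler itself: the names Mini, step, and runMini are hypothetical, and it assumes a pure RWS stack (rather than RWST over IO) with a Bool as the configuration, to keep the example small.

```haskell
import Control.Monad (when)
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Control.Monad.RWS (RWS, ask, runRWS, tell)

-- Hypothetical miniature stack: a Bool flag as the reader environment,
-- a list of strings as the log, and unit as the (unused) state.
type Mini = ExceptT String (RWS Bool [String] ())

-- Logs a message, logs a second one when the flag is set,
-- and fails on negative input.
step :: Int -> Mini Int
step n = do
  tell ["step started"]
  verbose <- ask
  when verbose $
    tell ["input is " ++ show n]
  if n < 0
    then throwError "negative input"
    else return (n * 2)

-- runExceptT removes the ExceptT layer, then runRWS runs what is left.
-- The state component of the triple is dropped, as in runCompilerM.
runMini :: Bool -> Int -> (Either String Int, [String])
runMini verbose n =
  let (result, _, logEntries) = runRWS (runExceptT (step n)) verbose ()
  in (result, logEntries)
```

In GHCi, runMini True 3 evaluates to (Right 6,["step started","input is 3"]), and runMini False (-1) evaluates to (Left "negative input",["step started"]). Entries written before the error remain in the log because the ExceptT layer sits on top of the layer that holds the log.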
Example: valid input, with no optimization
runCompilerM "abcd" (CompilerConfiguration False)
(Right "machine code!",["Tokenizing source code","Parsing tokens into AST","Emitting machine code"])
Example: valid input, with optimization
runCompilerM "abcd" (CompilerConfiguration True)
(Right "machine code!",["Tokenizing source code","Parsing tokens into AST","Optimizing AST","Emitting machine code"])
Example: invalid input
runCompilerM "nope" (CompilerConfiguration False)
(Left "Tokenization failed on the secret word",["Tokenizing source code"])
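A caller will usually want to do something with the pair that runCompilerM returns. One possible way to render it as text is sketched below. The helper report is hypothetical and not part of the compiler. The two type synonyms have the same shapes as CompilerError and CompilerLog and are repeated only so that the sketch compiles on its own.

```haskell
import Data.List (intercalate)

-- Same shapes as CompilerError and CompilerLog in Compiler.hs,
-- repeated here so that the sketch is self-contained.
type CompilerError = String
type CompilerLog = [String]

-- Hypothetical helper: one line per log entry, followed by a final line
-- that holds either the error message or the emitted output.
report :: (Either CompilerError String, CompilerLog) -> String
report (result, logEntries) =
  intercalate "\n" (map ("[log] " ++) logEntries ++ [outcome])
  where
    outcome = case result of
      Left err -> "error: " ++ err
      Right output -> "output: " ++ output
```

With this helper in scope next to the compiler, a run could be printed with runCompilerM "nope" (CompilerConfiguration False) >>= putStrLn . report.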