Example

Now we will show an example of our compiler running, using the threaded information at various points. This is a toy example, not a real compiler, but it should give a sense of how to start using the features of this design, including reading the configuration, writing to the log, and reporting errors.

To demonstrate these features, we extend the data types for various phases to be slightly more useful. In particular, we add a compiler flag for optimization, and we use strings for log entries and error messages.

data Token = Token
data AST = AST
newtype CompilerConfiguration = CompilerConfiguration {optimizeFlag :: Bool}
type CompilerLog = [CompilerLogEntry]
type CompilerLogEntry = String
data CompilerState = CompilerState
type CompilerError = String

Here is the full example code, followed by commentary.

Compiler.hs
module Compiler where

import Control.Monad        (when, (>=>))
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Control.Monad.RWS    (RWST, ask, runRWST, tell)

{- Compiler pipeline -}
compiler :: CompilerPhase String String
compiler = tokenize >=> parse >=> optimize >=> emit

{- Running the compiler -}  -- (1)
runCompilerM :: String -> CompilerConfiguration -> IO (Either CompilerError String, CompilerLog)
runCompilerM source config = do
    (result, _, log) <- runRWST (runExceptT (compiler source)) config CompilerState
    return (result, log)

{- Compiler phases -}
tokenize :: CompilerPhase String [Token]
tokenize input = do
    tell ["Tokenizing source code"]  -- (2)
    if input == "nope"
        then throwError "Tokenization failed on the secret word"  -- (3)
        else return [Token]  -- (4)

parse :: CompilerPhase [Token] AST
parse tokens = do
    tell ["Parsing tokens into AST"]
    return AST

optimize :: CompilerPhase AST AST
optimize ast = do
    config <- ask  -- (5)
    when (optimizeFlag config) $  -- (6)
        tell ["Optimizing AST"]
    return ast

emit :: CompilerPhase AST String
emit ast = do
    tell ["Emitting machine code"]
    return "machine code!"

{- Compiler data types -}
type CompilerPhase a b = a -> ExceptT CompilerError (RWST CompilerConfiguration CompilerLog CompilerState IO) b

data Token = Token
data AST = AST
newtype CompilerConfiguration = CompilerConfiguration {optimizeFlag :: Bool}
type CompilerLog = [CompilerLogEntry]
type CompilerLogEntry = String
data CompilerState = CompilerState
type CompilerError = String
1
There is a price we have to pay for hiding most of the complexity in an mtl monad: we have to do a lot to actually run the compiler. The library functions runExceptT and runRWST “unstack” the monad transformers, outermost layer first, and run the monadic computation: runExceptT removes the ExceptT layer, and runRWST then runs the RWST layer underneath, given the configuration and an initial state. When we extract the value, we get back a triple of the result, the state, and the log. The type of this value is (Either CompilerError String, CompilerState, CompilerLog). For the implementation of runCompilerM, we choose to ignore the state and return only the result and the log.
2
The library function tell takes as input a value of type CompilerLog and appends that value to the existing log.
3
The library function throwError takes a value of type CompilerError and essentially “returns” an error in the monad. To demonstrate errors, we have our compiler refuse to tokenize if the input string is "nope".
4
Otherwise, the tokenizer returns a valid token list.
5
The library function ask retrieves the configuration from the monad. The type of config on this line is CompilerConfiguration.
6
We use the library function when to conditionally execute some monadic code. The condition is the value of the optimizeFlag in the configuration. When the flag is set, we write a message to the log, to simulate optimization. Of course, this compiler can’t really perform any optimization, so logging is all that happens. Regardless of whether the flag is set, we just pass the AST on to the emitter.
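The library functions described in the callouts can also be tried out in isolation. The following is a minimal sketch, not part of the compiler itself: the names Demo, step, and runDemo are hypothetical, and the stack is a cut-down version of CompilerPhase with a Bool configuration and a unit state. It assumes the mtl package is available. It also illustrates why log entries survive an error: because RWST sits underneath ExceptT, anything written with tell before throwError is still returned.

```haskell
import Control.Monad        (when)
import Control.Monad.Except (ExceptT, runExceptT, throwError)
import Control.Monad.RWS    (RWST, ask, runRWST, tell)

-- Hypothetical miniature stack with the same shape as CompilerPhase:
-- String errors, a Bool configuration, a [String] log, and a unit state.
type Demo = ExceptT String (RWST Bool [String] () IO)

-- Writes to the log, consults the configuration, and fails on negative input.
step :: Int -> Demo Int
step n = do
    tell ["step started"]
    verbose <- ask
    when verbose $
        tell ["input was " ++ show n]
    when (n < 0) $
        throwError "negative input"
    tell ["step finished"]
    return (n * 2)

-- Remove the ExceptT layer first, then run the RWST layer underneath.
-- The final (unit) state is discarded, as runCompilerM does with its state.
runDemo :: Bool -> Int -> IO (Either String Int, [String])
runDemo verbose n = do
    (result, _, entries) <- runRWST (runExceptT (step n)) verbose ()
    return (result, entries)
```

In GHCi, runDemo True 3 evaluates to (Right 6,["step started","input was 3","step finished"]), while runDemo False (-1) evaluates to (Left "negative input",["step started"]): the entry written before the error is kept, and nothing after the error runs.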

Example: valid input, with no optimization

runCompilerM "abcd" (CompilerConfiguration False)

(Right "machine code!",["Tokenizing source code","Parsing tokens into AST","Emitting machine code"])

Example: valid input, with optimization

runCompilerM "abcd" (CompilerConfiguration True)

(Right "machine code!",["Tokenizing source code","Parsing tokens into AST","Optimizing AST","Emitting machine code"])

Example: invalid input

runCompilerM "nope" (CompilerConfiguration False)

(Left "Tokenization failed on the secret word",["Tokenizing source code"])
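The raw pairs printed above are fine for experimenting, but a driver would probably want to present them more readably. As a rough sketch, the hypothetical helper report below turns the pair returned by runCompilerM into printable lines. The two type synonyms are repeated so that the snippet stands alone; they are assumed to match the definitions in Compiler.hs.

```haskell
-- Assumed to match the definitions in Compiler.hs.
type CompilerError = String
type CompilerLog   = [String]

-- Hypothetical helper: one line per log entry, followed by the outcome.
report :: (Either CompilerError String, CompilerLog) -> String
report (result, entries) = unlines (map ("[log] " ++) entries ++ [outcome])
  where
    outcome = case result of
        Left err     -> "error: " ++ err
        Right output -> "output: " ++ output
```

With the compiler module loaded, runCompilerM "nope" (CompilerConfiguration False) >>= putStr . report would print the single log entry on one line and the error message on the next.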