Errors
To support robust error handling, we will change our compiler to return a different type of value, depending on whether compilation succeeds or fails.
1 Either
Haskell has a datatype to support exactly the scenario where we want to compute values of two different types: Either.
Either has two constructors, Left and Right. Typically Left is used to indicate an error value, and Right is used to indicate a success value.
Let us explore how we would use Either, with tokenization as an example. Tokenization normally gives back a value of type [Token]. If we want to support success or failure values, the tokenizer should instead give back a result of type Either CompilerError [Token]. How would it do so? Let us suppose that our tokenizer builds up its list of tokens in a variable named tokens, which has type [Token]. To indicate success, the result of the tokenizer would be Right tokens. If tokenization instead fails, let us suppose that the tokenizer has built some information about the error in a variable named errorInfo, which has type CompilerError. To indicate failure, the result of the tokenizer would be Left errorInfo.
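The description above can be made concrete with a small sketch. Everything in it is an assumption made for illustration: the constructors TNumber and TPlus, the error constructor UnexpectedCharacter, and the function name tokenizeSketch do not come from our compiler, whose Token and CompilerError types are still placeholders. The sketch also leaves out the configuration, log, and state so that only the Either part is visible.

```haskell
import Data.Char (isDigit, isSpace)

-- Hypothetical token and error types, for illustration only.
data Token = TNumber Int | TPlus
  deriving (Show, Eq)

newtype CompilerError = UnexpectedCharacter Char
  deriving (Show, Eq)

-- | Tokenize non-negative integers and '+' signs, skipping whitespace.
-- Success is reported with Right, failure with Left.
tokenizeSketch :: String -> Either CompilerError [Token]
tokenizeSketch = go []
  where
    -- 'tokens' accumulates the result in reverse order.
    go tokens [] = Right (reverse tokens) -- success: Right tokens
    go tokens input@(c : rest)
      | isSpace c = go tokens rest
      | c == '+' = go (TPlus : tokens) rest
      | isDigit c =
          let (digits, remaining) = span isDigit input
           in go (TNumber (read digits) : tokens) remaining
      | otherwise =
          let errorInfo = UnexpectedCharacter c
           in Left errorInfo -- failure: Left errorInfo
```

With these definitions, tokenizeSketch "12 + 3" evaluates to Right [TNumber 12,TPlus,TNumber 3], while tokenizeSketch "1 * 2" evaluates to Left (UnexpectedCharacter '*').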
2 Redesign the interface
All our phases should now give back an Either. Let us update our type alias to reflect this design (you will have to look all the way to the end of the line to see the changes).
type CompilerPhase a b = a -> CompilerConfiguration -> CompilerLog -> CompilerState -> Either CompilerError (b, CompilerLog, CompilerState)
Thanks to our type alias, we don’t need to update the signatures for the compiler or any of the phases!
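To see what a phase looks like under the new alias, here is a minimal sketch of one that can fail. The phase rejectEmpty is hypothetical and not part of our compiler, and the deriving clauses on the placeholder types are an assumption added only so that results can be printed and compared.

```haskell
-- Placeholder types as in our compiler; the deriving clauses are an
-- addition for this sketch.
data CompilerConfiguration = CompilerConfiguration deriving (Show, Eq)
data CompilerLog = CompilerLog deriving (Show, Eq)
data CompilerState = CompilerState deriving (Show, Eq)
data CompilerError = CompilerError deriving (Show, Eq)

type CompilerPhase a b = a -> CompilerConfiguration -> CompilerLog -> CompilerState -> Either CompilerError (b, CompilerLog, CompilerState)

-- | Hypothetical phase: fail on empty source text, otherwise pass the
-- input, log, and state through unchanged.
rejectEmpty :: CompilerPhase String String
rejectEmpty input _configuration compilerLog state
  | null input = Left CompilerError
  | otherwise = Right (input, compilerLog, state)
```

The signature of rejectEmpty mentions only CompilerPhase, so it reads exactly as it would have before the change; only the body has to wrap its result in Left or Right.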
3 Composing phases
To compose phases with error handling, we need to thread everything as before and, before running each new phase, check whether the previous phase succeeded. If it failed, the new phase does not run and the composed phase fails with the same error.
-- | Compose two compiler phases into a single phase
(>.>) :: CompilerPhase a b -> CompilerPhase b c -> CompilerPhase a c
(>.>) phase1 phase2 input configuration log state =
  case phase1 input configuration log state of
    Left err -> Left err -- if the previous phase fails, just propagate the error
    Right (phase1Result, log', state') ->
      case phase2 phase1Result configuration log' state' of
        Left err -> Left err
        Right (phase2Result, log'', state'') -> Right (phase2Result, log'', state'')
4 Full code
Here is all the code for our compiler, whose phases can read a configuration, read/write a log, read/update a state, and fail somewhat gracefully.
The code that has changed from our previous version is emphasized.
Compiler.hs
module Compiler where
{- Compiler pipeline -}
compiler :: CompilerPhase String String
compiler = tokenize >.> parse >.> optimize >.> emit
-- | Compose two compiler phases into a single phase
(>.>) :: CompilerPhase a b -> CompilerPhase b c -> CompilerPhase a c
(>.>) phase1 phase2 input configuration log state =
  case phase1 input configuration log state of
    Left err -> Left err -- if the previous phase fails, just propagate the error
    Right (phase1Result, log', state') ->
      case phase2 phase1Result configuration log' state' of
        Left err -> Left err
        Right (phase2Result, log'', state'') -> Right (phase2Result, log'', state'')
{- Compiler phases -}
tokenize :: CompilerPhase String [Token]
tokenize = undefined
parse :: CompilerPhase [Token] AST
parse = undefined
optimize :: CompilerPhase AST AST
optimize = undefined
emit :: CompilerPhase AST String
emit = undefined
{- Compiler data types -}
type CompilerPhase a b = a -> CompilerConfiguration -> CompilerLog -> CompilerState -> Either CompilerError (b, CompilerLog, CompilerState)
data Token = Token
data AST = AST
data CompilerConfiguration = CompilerConfiguration
data CompilerLog = CompilerLog
data CompilerState = CompilerState
data CompilerError = CompilerError
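Because the real phases are still undefined, the pipeline cannot be run yet. The following sketch exercises the composition operator with stand-in phases instead. The phases double, describe, and failPhase and the helper run are hypothetical names introduced for this illustration, and the deriving clauses are an assumption added so that results can be compared. The definition of (>.>) is copied unchanged from Compiler.hs.

```haskell
data CompilerConfiguration = CompilerConfiguration deriving (Show, Eq)
data CompilerLog = CompilerLog deriving (Show, Eq)
data CompilerState = CompilerState deriving (Show, Eq)
data CompilerError = CompilerError deriving (Show, Eq)

type CompilerPhase a b = a -> CompilerConfiguration -> CompilerLog -> CompilerState -> Either CompilerError (b, CompilerLog, CompilerState)

-- | Compose two compiler phases into a single phase
(>.>) :: CompilerPhase a b -> CompilerPhase b c -> CompilerPhase a c
(>.>) phase1 phase2 input configuration log state =
  case phase1 input configuration log state of
    Left err -> Left err -- if the previous phase fails, just propagate the error
    Right (phase1Result, log', state') ->
      case phase2 phase1Result configuration log' state' of
        Left err -> Left err
        Right (phase2Result, log'', state'') -> Right (phase2Result, log'', state'')

-- Hypothetical stand-in phases.
double :: CompilerPhase Int Int
double n _configuration compilerLog state = Right (n * 2, compilerLog, state)

describe :: CompilerPhase Int String
describe n _configuration compilerLog state =
  Right ("result: " ++ show n, compilerLog, state)

failPhase :: CompilerPhase Int Int
failPhase _input _configuration _compilerLog _state = Left CompilerError

-- | Hypothetical helper: run a phase with placeholder configuration, log,
-- and state, keeping only the result or the error.
run :: CompilerPhase a b -> a -> Either CompilerError b
run phase input =
  fmap
    (\(result, _, _) -> result)
    (phase input CompilerConfiguration CompilerLog CompilerState)
```

Here run (double >.> describe) 21 evaluates to Right "result: 42". As soon as one phase fails, the phases after it are skipped: run (double >.> failPhase >.> describe) 21 evaluates to Left CompilerError, and describe is never applied.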