r1 - 21 Feb 2020 - 20:09:23 - Main.MelissaONeillYou are here: TWiki >  CS131Spring2020 Web  > Week5Class2
\require{AMSsymbols} %\require{AMSmath} \newcommand{\aC}[2]{#1 \;\leftrightarrow_\alpha\; #2} \newcommand{\bC}[2]{#1 \;\leftrightarrow_\beta\; #2} \newcommand{\eC}[2]{#1 \;\leftrightarrow_\eta\; #2} \newcommand{\aE}[2]{#1 \;\equiv_\alpha\; #2} \newcommand{\bE}[2]{#1 \;\equiv_\beta\; #2} \newcommand{\eE}[2]{#1 \;\equiv_\eta\; #2} \newcommand{\br}[2]{#1 \to_\beta #2} \newcommand{\brs}[2]{#1 \to_\beta^* #2} % \newcommand{\num}[1]{\ulcorner\! #1 \!\urcorner} \newcommand{\num}[1]{\underline{\mathrm{#1}}} \newcommand{\alias}[1]{\underline{\mathrm{#1}}} \newcommand{\comb}[1]{\mathrm{#1}} \newcommand{\FV}[1]{\mathrm{\mathbf{FV}}[\![\, #1 \,]\!]} \newcommand{\skTrans}[1]{\mathrm{\mathbf{Trans}}[\![\, #1 \,]\!]} \newcommand{\skOpt}[1]{\mathrm{\mathbf{Opt}}[\![\, #1 \,]\!]}

Haskell Reminders

Haskell data Declarations

Class Exercise

What's wrong with this idea for representing hot dogs?

data HotDog = Meat   MeatType
            | Bun    BunType
            | Spicy  Bool
            | Length Double

This data declaration implies that a HotDog is either just Meat OR a Bun OR Spicy Or a specific Length. (Note how this differs from class data members in an object-oriented programming language like Java. In Haskell, can have multiple pieces of data at once using a tuple. Haskell actually supports tuples with named fields, known as record types, but we have not used them in this class.)

Wiki Extra: Datatypes Quiz

Describe all the parts of this declaration...

data GrutorFood = Thai    ThaiPlace [ThaiDish]
                | Burger  BurgerPlace Int Bool Bool
                | Pizza   PizzaPlace Int [Topping]
                | Spicy   GrutorFood
                | Vegan   GrutorFood
    deriving (Show, Eq, Ord) 

  • Thai, Burger, Pizza, Spicy, and Vegan are all Value Constructors or Case Labels or Tags
  • Show, Eq, and Ord are examples of Type Classes
  • ThaiPlace [ThaiDish], BurgerPlace Int Bool Bool, PizzaPlace Int [Topping], GrutorFood are all Types
  • data is a reserved keyword

More Data Declarations

Electrical Circuit

-- Represents an electric circuit. Assume the circuit is closed
-- (the end is connected to the beginning). Ground, Input, and
-- Output are exceptions.
data Circuit = -- A plain wire, equivalent to (R 0).
               Epsilon
               -- The wire is grounded and the circuit is not closed.
             | Ground
               -- A voltage source determined by external input.
             | Input (a -> Double)
               -- A voltage sink which determines an external output.
             | Output (Double -> a)
               -- A component facing the other way.
             | Backward Circuit
               -- Two components in series.
             | Series Circuit Circuit
               -- Two components sharing a beginning and end.
             | Parallel Circuit Circuit
               -- A battery. The Double determines voltage.
             | Battery Double
               -- An AC battery. The Doubles are voltage and frequency.
             | ACBattery Double Double
               -- A resistor.
             | R Double
               -- An inductor.
             | L Double
               -- A capacitor.
             | C Double
               -- A switch. The Bool is its state (On/Off).
             | Switch Bool
    deriving (Show, Eq, Ord)

Route

data Route  = DriveFor     Distance
            | DriveUntil   Landmark
            | ChangeTo     Lane
            | ExitAt       Exit
            | Segments     [Route]
    deriving (Show, Eq, Ord)

Introduction to Parsing

Story So Far...

We can run programs! Yay!

But, without a parser, we'd have to create the representation directly with code. In Haskell, we'd have to write this:

Let "z" (Lambda "f" (Apply (Lambda "x" (Apply (Var "f") (Lambda "y"
(Apply (Apply (Var "x") (Var "x")) (Var "y"))))) (Lambda "x" (Apply
(Var "f") (Lambda "y" (Apply (Apply (Var "x") (Var "x")) (Var
"y"))))))) (Let "fact" (Apply (Var "z") (Lambda "f" (Lambda "n" (If
(BinOp (Var "n") Equals (Exactly (Number 0))) (Exactly (Number 1))
(BinOp (Var "n") Times (Apply (Var "f") (BinOp (Var "n") Minus
(Exactly (Number 1))))))))) (Apply (Var "fact") (Exactly (Number 6))))

This is ugly, so when testing our code in class, instead of typing this in directly we have been using parsers made by Prof. Melissa.

To parse is to go from concrete syntax to abstract syntax.

In the real world, it is possible to use tools to help write parsers. We won't use these tools in this class, instead, for educational purposes, we'll build our own from scratch.

Abstract Syntax for the Lambda Calculus

type VarName = String

data LambdaExpr = Var    VarName
                | Lambda VarName LambdaExpr
                | Apply  LambdaExpr LambdaExpr
    deriving (Show, Eq, Ord)

Examples: Converting from Concrete to Abstract by Hand

Concrete Syntax:

    (λx.(x y))
Abstract Syntax:
 Lambda "x" (Apply (Var "x") (Var "y")) 

What are some ways to specify what people can say? (i.e. how do we specify Concrete Syntax for Lambda Calculus?)

Ans: We need a grammar.

A Grammar for the Lambda Calculus

Here's our (required parentheses) grammar for the lambda calculus.

‹lexpr›   ::=  ‹var› 
        | ( λ  ‹var› . ‹lexpr› )
        | ( ‹lexpr› ‹lexpr› )

Tokenization (a.k.a. Lexical Analysis, Lexing)

What do you see? (Motivation for Tokenization)

int main()
{
   for (int count = 0; count < 10; ++count) {
      printf("Hello, World!");
   }
   return 0;
}

  • int is a type
  • count and main are identifiers
  • main is a function name
  • 0 and 10 are numeric literals
  • "Hello, World!" is a string literal
  • ++, =, < are operators
  • for and return are reserved keywords

What are the different tokens that should exist in our Lambda Calculus language?

  • (
  • )
  • λ
  • .
  • variable names/ identifiers like x, foo

How do we represent these Tokens?

data LambdaToken = LambdaTok | DotTok | LParenTok | RParenTok
                 | IdentTok String
     deriving (Show, Eq, Ord)

Hand-writing a Tokenizer

tokenize :: String -> [LambdaToken]
tokenize []            = []
tokenize ('λ' : rest)  = LambdaTok : tokenize rest
tokenize ('\\' : rest) = LambdaTok : tokenize rest
tokenize ('.' : rest)  = DotTok    : tokenize rest
tokenize ('(' : rest)  = LParenTok : tokenize rest
tokenize (')' : rest)  = RParenTok : tokenize rest
tokenize s@(c : rest) | isAlpha c =
                            let (name, rest2) = span isAlphaNum s
                            in  IdentTok name : tokenize rest2
                      | isSpace c =
                            tokenize rest
                      | otherwise =
                            error ("Invalid char, '" ++ [c] ++ "'")

Note: 
isSpace:: Char -> Bool 
isAlpha:: Char -> Bool
span:: (Char->Bool) -> String -> (String, String)
                   takes in a function f (that takes in a character and returns a boolean) and a string x, and returns (a,b)
                   where a is the first part of x that satisfies f, and b is the rest of the string x
  

Demo

*LambdaLexer> tokenize ". ( ) λ x y z abc"
[DotTok,LParenTok,RParenTok,LambdaTok,IdentTok "x",IdentTok "y",IdentTok "z",IdentTok "abc"]
*LambdaLexer> tokenize ".()λxyzabc"
[DotTok,LParenTok,RParenTok,LambdaTok,IdentTok "xyzabc"]
*LambdaLexer> tokenize "(λx.(λy.(x y)))"
[LParenTok,LambdaTok,IdentTok "x",DotTok,LParenTok,LambdaTok,IdentTok "y",DotTok,LParenTok,IdentTok "x",IdentTok "y",RParenTok,RParenTok,RParenTok]

Parsing

parse is a function that takes in a list of lambda tokens L, returns the first lambda expression, and the rest of the tokens in L

parse :: [LambdaToken] -> (LambdaExpr, [LambdaToken])
parse (IdentTok vname : rest) = (Var vname, rest)
parse (LParenTok : rest) = 
            let (result, remainder) =
                  case rest of
                      LambdaTok : IdentTok vname : DotTok : rest2 ->
                          let (body, rest3) = parse rest2
                          in  (Lambda vname body, rest3)
                      _ ->
                          let (func, rest2) = parse rest
                              (arg,  rest3) = parse rest2
                          in  (Apply func arg, rest3)
            in case remainder of
                   RParenTok : toe -> (result, toe)
                   _ -> error "')' expected"
parse _ = error "Lambda expression expected"

Demo

*LambdaParser> parse [LParenTok,LambdaTok,IdentTok "x",DotTok,LParenTok,LambdaTok,IdentTok "y",DotTok,LParenTok,IdentTok "x",IdentTok "y",RParenTok,RParenTok,RParenTok]
(Lambda "x" (Lambda "y" (Apply (Var "x") (Var "y"))),[])
*LambdaParser> parse (tokenize "(λx.(λy.(x y)))")
(Lambda "x" (Lambda "y" (Apply (Var "x") (Var "y"))),[])

We have completed Tokenization and Parsing only using recursion! Next time we will do the same thing in an easier way but with better infrastructure.

There is a runnable repl.it demo of this code.

Run

:load LambdaParser.hs
tokenize "(λx.(λy.x))"
parse (tokenize "(λx.(λy.x))")
parse (tokenize "...")

If you want to print this page, you can click to show all the answers.

-- Main.MelissaONeill - 21 Feb 2020

Edit | Attach | Printable | Raw View | Backlinks: Web, All Webs | History: r1 | More topic actions
 
Harvey Mudd College computer science
This site is powered by the TWiki collaboration platformCopyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback