Recall that the syntax of lambda calculus has only three cases:
x — variables
( MN ) — application
( λ x . M ) — lambda abstraction
Manipulating λ Expressions
Let $\FV{M}$ be the free variables in λ-expression $M$.
e.g. $\FV{(\lambda x.(\lambda y.w \; ((x \; y) \; z)))} = \{ w, z \}$
We can actually define FV recursively, by cases on the structure of the expression.
(Note that $\{ x \} \setminus \{ y \}$ means subtract $\{ y \}$ from $\{ x \}$.)
\[
\begin{array}{rcll}
\FV{v} & = & \{ v \} & \mathrm{\ \ \ (Variables)} \\
\FV{(M \; N)} & = & \FV{M} \cup \FV{N} & \mathrm{\ \ \ (Application)} \\
\FV{(\lambda x.M)} & = & \FV{M} \setminus \{ x \} & \mathrm{\ \ \ (Lambda\ abstraction)}
\end{array}
\]
We call FV a static analyzer because it provides information about (analyzes) our lambda expression without "executing" it.
Example:
\[
\begin{array}{rcl}
\FV{(\lambda x. (z \; x))} & = & \FV{(z \; x)} \setminus \{x\} \\
& = & (\FV{z} \cup \FV{x}) \setminus \{x\} \\
& = & (\{z\} \cup \{x\}) \setminus \{x\} \\
& = & \{z, x\} \setminus \{x\} \\
& = & \{z\}
\end{array}
\]
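The recursive definition of FV translates directly into code. Here is a sketch in Python, under an assumed term representation of our own choosing: a variable is a string, an application is the tuple `("app", M, N)`, and an abstraction is `("lam", x, M)`.

```python
def fv(term):
    """Return the set of free variables of a lambda term.

    Representation (an assumption for this sketch): a variable is a
    string, an application is ("app", M, N), an abstraction is ("lam", x, M).
    """
    if isinstance(term, str):        # Variables
        return {term}
    tag = term[0]
    if tag == "app":                 # Application: FV(M) ∪ FV(N)
        _, m, n = term
        return fv(m) | fv(n)
    if tag == "lam":                 # Lambda abstraction: FV(M) \ {x}
        _, x, m = term
        return fv(m) - {x}
    raise ValueError(f"not a lambda term: {term!r}")

# FV((λx.(z x))) = {z}, matching the worked example above
example = ("lam", "x", ("app", "z", "x"))
print(fv(example))  # -> {'z'}
```

The three branches mirror the three cases of the definition exactly.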
Implementing Sets
What are sets? We could implement them as lists that we scan for a particular value, but then we can't have infinite sets. For mathematical sets, the most fundamental operation on a set $S$ is to ask, "Does $S$ contain $x$?" If we can answer that question, then we know most of what there is to know about the set.
We can represent that question as a function that takes in an element ($x$) and returns a boolean ($\alias{TRUE}$ if $x \in S$, $\alias{FALSE}$ otherwise). We call this function the characteristic predicate (also sometimes called the indicator function for a set). To be clear, in the lambda calculus we are defining sets as their characteristic predicates.
How would you implement the following sets as characteristic predicates?
Empty set
Set of everything
The set $\{ 0 \}$ (assuming we'll only test to see if other integers are in the set)
Set of everything in two sets
Set of everything not in a set
Assume that you still have all the macros from lab ($\alias{TRUE}$, $\alias{FALSE}$, $\alias{AND}$, etc). Then,
\[
\begin{eqnarray*}
\alias{EMPTY} & \triangleq & (\lambda e.\alias{FALSE}) \\
\alias{UNIVERSE} & \triangleq & (\lambda e.\alias{TRUE}) \\
\alias{NATURALS} & \triangleq & (\lambda e.(\alias{NOT}\ (\alias{ISZERO}\ e))) \\
\alias{JUSTZERO} & \triangleq & (\lambda e.(\alias{ISZERO}\ e))
\end{eqnarray*}
\]
It is important to consider the domain of each set, which refers to the set of arguments which can be passed to the characteristic predicate. For example, for $\alias{NATURALS}$, the domain is the Church numerals. If we were to pass something other than a Church numeral to $\alias{NATURALS}$, the result would be undefined.
And we can define set operations as follows:
\[
\begin{eqnarray*}
\alias{UNION} & \triangleq & (\lambda p.(\lambda q.(\lambda x.((\alias{OR} \; (p \; x)) \; (q \; x))))) \\
\alias{INTERSECTION} & \triangleq & (\lambda p.(\lambda q.(\lambda x.((\alias{AND} \; (p \; x)) \; (q \; x))))) \\
\alias{COMPLEMENT} & \triangleq & (\lambda p.(\lambda x.(\alias{NOT}\ (p \; x))))
\end{eqnarray*}
\]
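These definitions translate almost symbol-for-symbol into Python, if we let ourselves use Python's own booleans and integers in place of the Church encodings from lab. A sketch:

```python
# Sets as characteristic predicates, with Python booleans standing in
# for the Church-encoded TRUE/FALSE (an assumption of this sketch).

EMPTY    = lambda e: False       # (λe.FALSE)
UNIVERSE = lambda e: True        # (λe.TRUE)
NATURALS = lambda e: e > 0       # (λe.(NOT (ISZERO e)))
JUSTZERO = lambda e: e == 0      # (λe.(ISZERO e))

# Set operations combine predicates pointwise.
def union(p, q):        return lambda x: p(x) or q(x)
def intersection(p, q): return lambda x: p(x) and q(x)
def complement(p):      return lambda x: not p(x)

evens = lambda e: e % 2 == 0     # an extra example set

print(union(JUSTZERO, complement(evens))(3))  # -> True  (3 is odd)
print(intersection(NATURALS, evens)(0))       # -> False (0 not in NATURALS)
```

Note that a "set" here is never enumerated; we only ever ask the membership question, which is exactly why infinite sets like `UNIVERSE` pose no problem.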
Russell's Paradox
Library Analogy
The following story helps illustrate Russell's Paradox. Say every library in the U.S. has a catalog listing all the books in that library. At the Library of Congress, a librarian decides to create a meta-catalog listing all of these catalogs. However, he soon finds that one catalog is still not listed: the meta-catalog itself. Thus, the librarian must either go into an infinite loop, creating ever-newer catalogs to list the previous ones, or settle for an incomplete catalog.
Russell's paradox involves the set
\[
R = \{ x \mid x \notin x \},
\]
which contains every set that doesn't contain itself.
\[
R \triangleq (\lambda x.(\alias{NOT} (x \; x)))
\]
Note that this set is definable in naive set theory, but in more modern constructions (for example ZFC) creating such a set would not be allowed in the model of set theory (for example, because we can't speak of a universal set in the same way).
Exploring the Paradox
We just made $R$ in the lambda calculus. So, let's see what happens if we ask the big question: Does $R$ contain $R$?
\[
\newcommand{\R}{\alias{R}}
\newcommand{\NOT}{\alias{NOT}}
\begin{eqnarray*}
(\R \; \R) & = & ((\lambda x.(\NOT\ (x \; x))) \; \R) \\
& \to_\beta & (\NOT\ (\R \; \R)) \\
& \to_\beta & (\NOT\ (\NOT\ (\R \; \R))) \\
& \to_\beta & (\NOT\ (\NOT\ (\NOT\ (\R \; \R)))) \\
& \to_\beta & \cdots
\end{eqnarray*}
\]
Another way to look at it is
\[
\begin{eqnarray*}
(\R \; \R) & \to_\beta & (\NOT\ (\R \; \R))
\end{eqnarray*}
\]
Recursive Functions
In many modern programming languages, we can write function definitions recursively. This is pretty useful, since there are many functions like the factorial function $\alias{FACT}$ that are naturally defined using recursion.
However, in the lambda calculus we are not allowed to write a recursive definition like
\[
\alias{FACT} \triangleq (\lambda n.((((\alias{ISZERO}\; n)\; \num{1})\; ((\alias{TIMES}\; n)\; (\alias{FACT}\; (\alias{PRED}\; n)))))).
\]
This is because if we tried to eliminate the $\alias{FACT}$ macro by substituting in its definition, the result would contain yet another $\alias{FACT}$, and the substitution would go on forever. In other words, the above definition does not actually fit any of the given formats for expressions in the lambda calculus: the right-hand side is not an application of already-defined expressions, since $\alias{FACT}$ has not yet been defined as an expression. Macro expansion would get stuck in the pre-processing stage, substituting forever, and we would never even reach the stage where the whole expression is evaluated.
Fixed Points
A fixed point of a function $f$ is a value $x$ such that $x = (f \; x)$.
Or, said another way:
The solutions to the equation $x = (f \; x)$ are called the fixed points of the function f.
Question
What are the fixed points of the mathematical function $f(x) = x^2$?
0 and 1 are the obvious ones, although we could also count $\infty$ as a fixed point. We can even consider $\infty$ the least fixed point, in the sense that it is the one requiring the least thought: take an arbitrary number bigger than one and keep squaring it, and $\infty$ is where you end up.
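We can confirm the finite fixed points mechanically; a quick sketch over small integers:

```python
# Find the fixed points of f(x) = x*x among small integers:
# exactly the values where f(x) == x.
f = lambda x: x * x
print([x for x in range(-5, 6) if f(x) == x])  # -> [0, 1]
```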
Recursive Functions as Fixed Points
To get around the issue of not being allowed to explicitly use recursive definitions in the lambda calculus, we can frame recursive definitions as assertions about some expression being a fixed point of another function.
For example, consider our earlier recursive definition of the factorial function as the assertion that the factorial function fact should satisfy the equation
\[
\mathrm{fact} = (\lambda n.((((\alias{ISZERO}\; n)\; \num{1})\; ((\alias{TIMES}\; n)\; (\mathrm{fact}\; (\alias{PRED}\; n)))))).
\]
The above equation is equivalent to the assertion that fact should satisfy
\[
\mathrm{fact} = ((\lambda f.(\lambda n.\;((((\alias{ISZERO}\; n)\; \num{1})\; ((\alias{TIMES}\; n)\; (f\; (\alias{PRED}\; n))))))) \;\mathrm{fact}).
\]
In other words, the function $\alias{FACT}$ in the lambda calculus should be a fixed point of the function
\[
\alias{FACT\_NR} \triangleq (\lambda f.(\lambda n.((((\alias{ISZERO}\; n)\; \num{1})\; ((\alias{TIMES}\; n)\; (f\; (\alias{PRED}\; n))))))).
\]
Note that $\alias{FACT\_NR}$ ("NR" for non-recursive) is not itself the factorial function $\alias{FACT}$ that we want, since it takes in a function $f$ in addition to a number. Rather, $\alias{FACT}$ is a fixed point of $\alias{FACT\_NR}$ because
\[
(\alias{FACT\_NR}\; \alias{FACT}) = \alias{FACT}.
\]
In other words, passing the true factorial function $\alias{FACT}$ as the argument $f$ of $\alias{FACT\_NR}$ (thereby avoiding the infinite substitution) yields back the true factorial function $\alias{FACT}$.
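We can spot-check this fixed-point claim numerically. In the sketch below, `fact_nr` is a Python rendering of the non-recursive functional above, and we verify that feeding it the true factorial function gives back a function that agrees with factorial (at least on the inputs we test):

```python
import math

def fact_nr(f):
    """The non-recursive functional: given f, return a one-step-better
    approximation to factorial that calls f for the recursive case."""
    return lambda n: 1 if n == 0 else n * f(n - 1)

# The true factorial function is a fixed point of fact_nr:
# fact_nr(factorial) agrees with factorial everywhere we check.
improved = fact_nr(math.factorial)
print(all(improved(n) == math.factorial(n) for n in range(10)))  # -> True
```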
Amazing facts about the λ-calculus
In the lambda calculus, it turns out that any function has a fixed point.
Indeed, two amazing facts are:
For every term $M$ there exists $N$ such that $N \equiv_\beta (M \; N)$
$(M (M (M (M (M (M (M (M (\ldots)))))))))$ is exactly such a fixed point
Fixed points can be found uniformly, i.e., there are λ-calculus functions that compute fixed points.
Moreover, below we give an explicit example of a function (the Y-combinator) that computes fixed points.
Remember Russell's Paradox?
When we wrote the characteristic predicate for Russell's paradoxical set,
\[
R = \{ x \mid x \notin x \}
\]
it was
\[
\alias{R} \triangleq (\lambda x.(\alias{NOT}(x\;x)))
\]
We found that
\[ (\alias{R}\;\alias{R}) = (\alias{NOT}(\alias{R}\;\alias{R})) \]
i.e., $\alias{R}\;\alias{R}$ is a fixed point of $\alias{NOT}$.
Defining the $\alias{Y}$ Combinator
The Y-Combinator is the function
\[\alias{Y} = (\lambda f. \; ((\lambda x. \; (f \; (x \; x))) \; (\lambda x. \; (f \; (x \; x))))).\]
When we input a function $M$ into the above function, the resulting value is a fixed point of $M$.
The motivation for this construction comes from the observation that if we have a function $f$, and then define the auxiliary function
\[g = (\lambda x. \; (f \; (x \; x)))\]
then the expression
\[h = (g \; g)\]
is a fixed point of $f$. We can verify this last assertion by trying to $\beta$ reduce $(g \; g)$, and verify that it keeps expanding as a nested sequence of applications of $f$.
\[(g \; g) = ((\lambda x. (f (x \; x))) \; g) = (f (g \; g)) = (f (f (g \; g))) = \dots \]
Note that $(g \; g) = (f \; (g \; g))$, so by definition $(g \; g)$ is a fixed point of $f$.
Using the $\alias{Y}$ Combinator
We can use the $\alias{Y}$ combinator to create recursive functions!
For example, above we asserted that the factorial function was the fixed point of
\[
\alias{FACT\_NR} \triangleq (\lambda f.(\lambda n.((((\alias{ISZERO}\; n)\; \num{1})\; ((\alias{TIMES}\; n)\; (f\; (\alias{PRED}\; n))))))).
\]
Then since applying the Y-combinator to a function returns a fixed point of that function, the expression
\[(\alias{Y} \;\alias{FACT\_NR})\]
should be equal to the factorial function.
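A note of caution if you try this in a strict language: Python evaluates arguments eagerly, so the $\alias{Y}$ combinator exactly as written above would loop forever. The sketch below uses the standard call-by-value variant (often called the Z combinator), which delays the self-application by eta-expanding $(x \; x)$:

```python
# Z combinator: the call-by-value variant of Y.
# (x x) is eta-expanded to (lambda v: x(x)(v)) so Python doesn't
# evaluate the infinite unfolding before it's needed.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# The non-recursive factorial functional FACT_NR, as a Python lambda.
fact_nr = lambda f: lambda n: 1 if n == 0 else n * f(n - 1)

# Applying the fixed-point combinator yields the factorial function.
FACT = Z(fact_nr)
print(FACT(5))  # -> 120
```

No definition here refers to itself by name; the recursion comes entirely from the combinator, just as in the lambda calculus.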
There are three perspectives we can take to explain how and why we use the Y-combinator.
Fixed Points
As we saw above, the $\alias{Y}$ combinator can build fixed points of given functions. So, we can view $\alias{Y}$ just as a function in the lambda calculus that computes fixed points.
Practical Perspective
As we saw above, the $\alias{Y}$ combinator creates infinite structures, in which a single function is applied ad infinitum.
For example, we could model the infinite list containing all $1$s as the expression
\[(\alias{Y} \; (\lambda x.((\alias{CONS}\; \num{1})\; x))).\]
Recipe Perspective
More generally, $\alias{Y}$ allows us to implement recursive functions in the lambda calculus as fixed points of larger functions.
The approach we used with the factorial function can be easily generalized to any recursive definition.
Note that, at a high level, we can characterize the Y-combinator as a sort of "infinite" Church numeral $\omega$. This is because a standard Church numeral $n$ (corresponding to the natural number $n$) is the function that, when given another function $f$ as input, returns the $n$-fold composition of $f$. Analogously, when $\alias{Y}$ is given a function $f$, the output behaves like the infinite-fold composition of $f$.
Compiling the Lambda Calculus down to Primitive Combinators
We can completely get rid of λ-abstractions, leaving only applications and free variables.
Rules
\[
\begin{eqnarray*}
\skTrans{x} & = & x \\
\skTrans{(M \; N)} & = & (\skTrans{M} \; \skTrans{N}) \\
\skTrans{(\lambda x.x)} & = & \comb{I} \\
\skTrans{(\lambda x.M)} & = & (\comb{K}\; \skTrans{M}) \;\;\;\;\;\;\;\;\;\;\mathbf{if}\;x \notin \FV{M} \\
\skTrans{(\lambda x.(\lambda y.M))} & = & \skTrans{(\lambda x. \skTrans{(\lambda y.M)})} \\
\skTrans{(\lambda x.(N \; O))} & = & ((\comb{S}\; \skTrans{(\lambda x.N)}) \; \skTrans{(\lambda x.O)})
\end{eqnarray*}
\]
where $\comb{I}$, $\comb{K}$, and $\comb{S}$ are defined such that
\[
\begin{eqnarray*}
(\comb{I} \; x) & \to & x \\
((\comb{K} \; x) \; y) & \to & x \\
(((\comb{S} \; f) \; g) \; x) & \to & ((f \; x) \; (g \; x))
\end{eqnarray*}
\]
Note: $\comb{I}$ can be viewed as the identity, $\comb{K}$ as a constant function, and $\comb{S}$ as a generalized application that shares its argument between two functions. We can also write $\comb{I} = ((\comb{S} \; \comb{K}) \; \comb{K})$, so we actually only need two combinators to represent the lambda calculus!
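The reduction rules for the three combinators, and the identity $\comb{I} = ((\comb{S}\;\comb{K})\;\comb{K})$, are easy to check with curried Python functions:

```python
# The three combinators as curried Python functions.
I = lambda x: x                              # I x -> x
K = lambda x: lambda y: x                    # K x y -> x
S = lambda f: lambda g: lambda x: f(x)(g(x)) # S f g x -> (f x) (g x)

print(K(1)(2))      # -> 1
print(S(K)(K)(42))  # -> 42   S K K x = (K x) (K x) = x, so S K K acts as I
```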
Here's one we didn't do in class: our lambda value for $\alias{TRUE}$ could be written as just $\comb{K}$, but the rules above translate it to $((\comb{S}\; (\comb{K}\; \comb{K}))\; \comb{I})$. This is correct (it behaves exactly like $\comb{K}$), but not as compact as it could be.
Wiki Extra: Improving Trans
If we add a rule that performs η-reduction, we'll avoid generating some of the overcomplicated results we saw above. η-reduction is the transformation $(\lambda x.(M \; x)) \to_\eta M$.
\[
\begin{eqnarray*}
\skTrans{(\lambda x.(M \; x))} & = & \skTrans{M} \;\;\;\;\;\;\;\;\;\;\mathbf{if}\;x \notin \FV{M}
\end{eqnarray*}
\]
With this rule added, $\alias{TRUE}$ does translate to $\comb{K}$:
>> Trans〚 (λx.(λy.x)) 〛...
>> Trans〚 (λy.x) 〛...
>> Trans〚 x 〛...
<< Trans〚 x 〛 = x
<< Trans〚 (λy.x) 〛 = (K x)
>> Trans〚 (λx.(K x)) 〛...
>> Trans〚 K 〛...
<< Trans〚 K 〛 = K
<< Trans〚 (λx.(K x)) 〛 = K
<< Trans〚 (λx.(λy.x)) 〛 = K
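The full translation, including the η rule, fits in a short Python sketch over an assumed tuple representation: variables are strings, an application is `("app", M, N)`, an abstraction is `("lam", x, M)`, and the combinators are represented as the reserved free variables `"S"`, `"K"`, `"I"`:

```python
def fv(t):
    """Free variables, treating the combinator names as constants."""
    if isinstance(t, str):
        return {t} - {"S", "K", "I"}
    if t[0] == "app":
        return fv(t[1]) | fv(t[2])
    return fv(t[2]) - {t[1]}          # ("lam", x, M)

def trans(t):
    """Bracket abstraction with the eta rule, following the rules above."""
    if isinstance(t, str):
        return t
    if t[0] == "app":
        return ("app", trans(t[1]), trans(t[2]))
    _, x, body = t
    if body == x:                      # (λx.x) = I
        return "I"
    if x not in fv(body):              # (λx.M) = (K M)  if x ∉ FV(M)
        return ("app", "K", trans(body))
    if body[0] == "lam":               # translate the inner lambda first
        return trans(("lam", x, trans(body)))
    # eta rule: (λx.(M x)) = M  if x ∉ FV(M)
    if body[2] == x and x not in fv(body[1]):
        return trans(body[1])
    # (λx.(N O)) = ((S (λx.N)) (λx.O))
    n, o = body[1], body[2]
    return ("app", ("app", "S", trans(("lam", x, n))), trans(("lam", x, o)))

TRUE = ("lam", "x", ("lam", "y", "x"))
print(trans(TRUE))  # -> K
```

Tracing `trans(TRUE)` reproduces the log above: the inner `(λy.x)` becomes `(K x)`, and then the η rule collapses `(λx.(K x))` to `K`.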