Lambda Calculus

Instead of machine descriptions, the λ-calculus traffics in expressions. An expression can be:

  • a variable, e.g., a, b, . . . ,z, possibly primed, e.g. a’,b’, . . . ,z’

  • or a combination (FE) where F and E are expressions, e.g., (fx), ((fx)y), (f(gx)). This represents the application of a function F to parameter E. “F(E)” is the usual way of writing it in regular mathematics.

  • or a function definition (λVE) where E is an expression and V is a variable. You can change V to any other variable throughout the function definition, skipping any function definitions inside E that also use V as its variable. This is called α-conversion. To be on the safe side you should perform α-conversions whenever the possibility of confusion exists. In fact, not doing so has caused much confusion over the years, including a flaw in LISP.

We use different sizes and shapes of parentheses to reduce confusion, for example, [(ef)g], {e(fg)}, ({[(fw)x]y}z), {λt[(tx)(λz{λr[r(xw)]})]}.

There is just one computing step, reduction: an expression or subexpression of the form ((λVE)F) can be replaced by E’, where E’ is derived from E by substituting F for V throughout E. This may create multiple occurrences of the same function definitions inside E’ in which case you should perform α-conversions to distinguish theme. Note that a “reduced” expression can be bigger, not smaller.

While no human could be expected to do much of this kind of symbol manipulation by hand, the λ-calculus computation is a lot more comprehensible than a Turing machine computation, which is too tedious, even for nerds.

An expression can be “evaluated” applying reduction repeatedly until no reducible expressions remain. If the reduction process ever completes, the resulting expression is called the normal form. Interestingly, Church proved that the order you apply reductions doesn’t matter, just as the order of multiplications in 34+582 doesn’t matter. You always get the same answer, ignoring α-conversions.

Just as Turing showed how arithmetic can be done with Turing machines, Church showed how to do it with expressions. He represented the numbers thusly:

0 = {λf[λxx]}

1 = {λf[λx(fx)]}

2 = {λf[λx(f(fx))]}

3 = {λf[λx(f(f(fx)))]}

etc.

In other words, a number n, takes a function f and a value x and applies f to x n times.


A function to add one to a number can be written as


Successor = {λn[λf(λx{f[(nf)x]})]}


(Successor 2) = ({ λn[λf(λx{f[(nf)x]})] }2) = [λf(λx{f[(2f)x]})]

= [λf(λx{f[( {λf’[λx’(f’(f’x’))]}f)x]} )]

= [λf(λx{f[[λx’(f(fx’))]x]} )]

= [λf(λx{f(f(fx))})]

= 3

An addition function is (λm[λn((m Successor)n)])


The truth values can be represented by

True = (λx(λyx))

False = (λx(λyy)) (which, coincidently, is also Zero, but don’t worry)

So the conditional expression (if B then C else D) can be written simply as ((BC)D), assuming B always reduces to a truth value.

There are expressions that can be reduced forever. For example, (λx(xx))(λy(yy)) reduces to (λy’(y’y’))(λy(yy)) to (λy’(y’y’))(λy(yy)), and so on forever. More usefully, there is a famous expression

Y = [ (λg(gg)) (λh {λf[f(hh)]}) ]

= [(λh {λf[f(hh)]} ) (λh’ {λf’[f’(h’h’)]}) ]

= {λf[f( (λh’ {λf’[f’(h’h’)]}) (λh {λf[f(hh)]}) )]}

It is called the fixed-point combinator because

(YF) = F(YF)

In other words F applied to YF is YF again.

This is useful because it allows one to invent recursive expressions that depend upon themselves, like the Factorial function which can be created by (YG) where G is:

(λF[λn if n=1 then 1 else n F(n-1)])

This reduces to λn if n=1 then 1 else n ✕ (YG(n-1)