Page 1 of 2 You may have heard of "the lambda calculus"  it's the reason lambda expressions are so called. You may know that it is something to do with computability, and so any explanation of how it works should be "for programmers". However, lambda calculus is close to logic and pure maths and it can be difficult to see what it is actually all about in practical terms. So there is a need for a plainspoken guide suitable for programmers.
A Programmers Guide To Theory
Contents
 The Transfinite
 Axiom Of Choice
 Finite State Machines
 What is a Turing Machine?
 Computational Complexity
 Noncomputable numbers
 What Is Computable?
 Lambda Calculus
 Grammar and Torture
 Reverse Polish Notation  RPN
 Information Theory
 Kolmogorov Complexity
 Introduction to Boolean Logic
 Confronting The Unprovable  Gödel And All That
 The Programmer's Guide to Chaos
 The Programmer's Guide to Fractals
What is computable?
Lambda calculus is an attempt to be precise about what computation actually is. It is a step on from pure logic but it isn't as easy to understand as the more familiar concept of a Turing Machine.
A Turing Machine defines the act of computing in terms that we understand at a reasonable practical level  a finite state machine and a single tape that can be read and written. A Turing Machine is about the most strippeddown piece of hardware that you can use to compute something.
The important point, however, is that this incredibly simple machine is assumed to be able to compute anything that can be computed. Other types of computer might be faster, but they can't do anything more than a Turing machine can.
If it isn't Turing computable then it isn't computable by any method we currently know.
A Turing machine is computing stripped down to its the heart and soul  even if it is so slow that it is impractical to actually use.
Lambda calculus is an alternative to the "hardware" approach of a Turing machine and it too seeks to specify a simple system that is powerful enough to compute anything that can be computed. One way of putting this is that the Lambda Calculus is equivalent to a Turing machine and vice versa.
This is the ChurchTuring thesis  that every function that can be computed can be computed by a Turing Machine or the lambda calculus.
In fact there is a third simple but universal computational system  the recursive function  and all three systems have been shown to be equivalent.
The lambda calculus started out as an attempt to create a logical foundation for the whole of mathematics, but this project failed due to the paradoxes that are inherent in any such attempt. However, a cutdown version of the idea proved to be a good model for computation.
The basic lambda
So to move on to the lambda calculus.
The most difficult thing to realize about the lambda calculus is that it is simply a game with symbols. If you think about it all of computation is just this  a game with symbols. Any meaning that you care to put on the symbols may be helpful to your understanding but it isn't necessary for the theory.
First we have the three rules that define what a lambda term is.
 any "variable" e.g. x is a lambda term
 if t is a lambda term then (λx.t) is a lambda term where x is a variable
 if t and s are lambda terms then (t s) is also a lambda term.
The use of the term "variable" is also a little misleading in that you don't use it to store values such as numbers. It is simply a symbol that you can manipulate.
The parentheses are sometime dropped if the meaning is clear and many definitions leave them out.
Rule 2 is called an abstraction and Rule 3 is called an application.
If you prefer we can define a simple grammar for a lambda expression <LE>:
<LE> > variable  (λ variable. <LE>)(<LE> <LE>)
Notice that what we have just defined is a simple grammar for a computer language. That really is all it is, as we haven't attempted to supply any meaning or semantics to the lambda terms.
Examples of lambda terms along with an explanation of their structure:
y rule 1 a variable (x y) rule 3 (λx.(x z)) rule 2 (λx.(x z)) (a b) rule 1, 2 and 3
To make sure you follow, let's parse the final example.
The first term is
(λx. <LE>)
with
<LE>=(x z)
and of course x z is an <LE> by rule 1.
and the second term is just (a b) which is again an LE by rule 1.
The two terms fit together as an LE again by rule 1.
A rough syntax tree is:
The parentheses are a nuisance so we usually work with the following conventions:
 outermost parentheses are dropped  a b not (a b)
 terms are left associative  so a b c means ((a b) c)
 the expression following a dot is taken to be as long as possible  λx.x z is taken to mean λx.(x z) not (λx. x) (z)
In practice is it convention 2 that is most difficult to get used to at first, because it implies that abc is three separate lambda expressions not one consisting of three symbols.
If in doubt add parenthesis and notice that grouping of terms does matter as expressed in the basic grammar by the presence of brackets. Without the brackets the grammar is ambiguous.
Also notice that the names of the variables don't carry any information. You can change them all in a consistent manner and the lambda expression doesn't change. That is, two lambda expression that differ only in the names used for the variables are considered to be the same.
Reduction
So far we have a simple grammar that generates lots of strings of characters, but they don't have any meaning.
At this point it is tempting to introduce meanings for what we are doing  but we don't have to. We can regard the whole thing as playing with symbols.
If this sounds like an empty thing to do, you need to remember that this is exactly what a Turing machine, or any computer, does. The symbols don't have any intrinsic meaning. When you write a program to compute the digits of Pi, say, you may think that you are doing something real, but the digits are just symbols and you add the extra interpretation that they are numbers in a place value system.
There are a number of additional rules added to the bare syntax of the lambda expression that can be used to reduce one expression to another. The most important to understand is probably beta reduction.
To the grammar above we can add another rule called beta reduction.
If you have a lambda expression of the form:
(λx.t) s
then you can reduce it to a new lambda expression by replacing every occurrence of x in t by s and erasing the λx. You can think of this as a simple string operation if you like. For example:
(λx. x z) (a b)
is of the form given with t=x z and s=a b.
