When writing Lean code, the compiler first recognises keywords (such as def) and settles ambiguous notation in a step referred to as parsing. The expression a+b-c, for example, could refer either to (a+b)-c or to a+(b-c), and the two may yield different values in Nat. Lean then tries to resolve certain omissions and semantic ambiguities, for example filling in placeholders _ and settling which subtraction you mean when using the minus sign (Nat.sub or Int.sub); this second step is referred to as elaboration. It is during this stage that most of the errors you’ve made in your code will be discovered and stored as messages to be printed by the VS Code infoview. Finally, Lean will type-check your code with respect to a rather elaborate type theory, the crucial feature of Lean that makes it usable as a formal theorem prover.
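To see why the parenthesisation matters, recall that subtraction on Nat truncates at zero, so the two readings of a+b-c can disagree (a small sketch you can paste into any Lean file):

```lean
-- Nat subtraction truncates at zero, so the two readings differ:
#eval (1 + 2) - 4  -- 0, since 3 - 4 truncates to 0
#eval 1 + (2 - 4)  -- 1, since 2 - 4 truncates to 0
```

(Lean in fact settles on the first reading, since + and - associate to the left.)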

We have already had a first look at types in Lean in a previous post. In this post, we’ll take a closer look at type theory and type checking, and how they allow us to express and check the correctness of mathematics. We’ll first discuss types from a theoretical perspective, sketching type-checking in the process. Then we’ll illustrate how to encode mathematics in this framework using explicit examples.

Terms, types, expressions

In programming, types are used to prevent errors when performing operations on variables. Numbers and text, for example, will both be stored as a series of ones and zeros, so you may end up accidentally adding numbers and text. To prevent this from happening, the compiler or interpreter will ask you to label your variables with types (or it will try to infer them from context or during execution), and point you to the part of the code that may cause issues.

At first glance, types can be thought of as having the same purpose in Lean. The objects of interest in Lean are expressions, such as 2+2 = 4 or List.append xs [1,2,3]. These expressions have a type: 2+2 = 4 for example has type Prop and List.append xs [1,2,3] has type List Nat. As we have already seen, these types are expressions too. This can be clearly seen in List Nat, which we think of as the type constructor List being applied to the type Nat. This illustrates that Lean’s types, the labels of our variables and constants, can be manipulated, and are therefore labeled as well. Indeed, if you try to get the type of the natural numbers with #check Nat, Lean will tell you that it has type Type. You can then try #check Type to see that this type has type Type 1. The types Prop, Type, Type 1, Type 2, … are called sorts and they are built-in types of the theory.
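The hierarchy of sorts can be explored directly; the comments indicate what the infoview reports:

```lean
#check Nat     -- Nat : Type
#check Type    -- Type : Type 1
#check Type 1  -- Type 1 : Type 2
#check Prop    -- Prop : Type
```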

We may add constant types via the axiom command: axiom myType : Type. Most fundamental types of Lean, including Nat and List, are however defined as inductive types, which can be thought of as introducing a multitude of axioms at each inductive declaration. We may also build further expressions with the introduction of functions. In Lean, one can define functions fun x : T => e(x), where T is some previously defined type and e(x) stands for some expression that may or may not contain the symbol x. The type of such a function is of the form T → E, (x : T) → E(x) or ∀ x : T, E(x), where E(x) is the type of e(x) and may itself contain the symbol x. For some examples, consider the messages of:

#check fun x : Nat => true
#check fun x : Nat => (⟨x, rfl⟩ : Odd (2*x+1))
#check fun x : Nat => (0 : ZMod x)

T → E, (x : T) → E(x) and ∀ x : T, E(x) are all treated as the same expression. The first is shorthand for the middle one, used when the expression E doesn’t contain the symbol x. The last is the mathematical interpretation of the middle one, which we’ll explore in a moment. What would be the use of functions if we couldn’t evaluate them? The answer is “representation”, as we’ll see soon, but we do introduce function application, denoted f x (instead of the mathematician’s f(x)), as a valid way to form further expressions.
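For instance, applying a function to an argument forms a new expression whose type is the image type of the function:

```lean
#check fun x : Nat => x + 1      -- Nat → Nat
#check (fun x : Nat => x + 1) 2  -- Nat
#eval (fun x : Nat => x + 1) 2   -- 3
```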

We now have the most essential ways in which valid expressions may be formed in Lean. To see how any expression is formed, consider the following diagram (representing a grammar in Backus–Naur form), depicting rules and their names:

E --> V   (variables)
E --> C   (constants)
E --> fun V : E => E    (functions)
E --> ∀ x : E, E    (function-type)
E --> E E   (function-application)

To form an expression, start with the symbol E and iteratively replace upper-case symbols with the right-hand sides indicated above. V may be replaced by a symbol in an alphabet of variables, and C by a symbol in an alphabet of constants, which we imagine to be lower-case symbols. Once no upper-case symbols remain, the expression is formed. In our above description, some expressions may be ill-formed in a sense, as we allow for the presence of variables without guaranteeing that they are bound by an enclosing fun. You can compare this with Lean’s built-in type Lean.Expr.

For example, let’s derive the expression 2+2 = 4, which is notation (or ‘syntactic sugar’) for (Eq (Nat.add 2 2)) 4. Recall that relations and binary operations are treated as functions of two arguments, and that functions of multiple arguments are treated as repeated application of one-variable functions that map to functions, a representation known as currying.

E --> E E
E E --> (E E) E
(E E) E --> (E (E E)) E
(E (E E)) E --> ... --> (C (C C)) C
(C (C C)) C --> ... --> (Eq (Nat.add 2 2)) 4
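You can observe the currying mentioned above by checking partial applications; the comments indicate the expected types:

```lean
#check Eq (Nat.add 2 2)    -- Nat → Prop: a predicate still waiting for one argument
#check Eq (Nat.add 2 2) 4  -- Prop: the statement 2 + 2 = 4
```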

As we hinted at in the introduction, the parts of the Lean compiler responsible for turning the concrete syntax of your code into these expressions are parsing and elaboration. Tactics will produce such expressions as well, as you can tell when #print-ing the lemmata and definitions that made use of them (though the expressions can quickly get very big, even when pretty-printed).
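As a small sketch of this (the theorem name is ours), you can prove a statement with a tactic and then inspect the expression the tactic produced:

```lean
-- `decide` closes the goal; `#print` then reveals the proof term it constructed.
theorem myTwoPlusTwo : 2 + 2 = 4 := by decide
#print myTwoPlusTwo
```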

Any expression has a type that can be computed, or inferred, as we say in type theory, with the following recursive algorithm: for variables, we refer to the type declared in the function that introduced them. For constants, we refer to the context of previous definitions. For functions fun v : d => b, we take note that v has type d, use this to find the type B of b in a recursive call to the type-inference process, and return the type ∀ v : d, B. For function types, we’ll keep our exposition simple and just mention that the returned type will be a sort.

Finally, we reach function application, where a new form of ill-formedness can be discovered. We want, in f x, for f to have a type of the form ∀ y : d, B(y) and simultaneously have x of type d, so that we may assign the type B(x) (with occurrences of y replaced by x) to f x. If this condition fails, Lean will not consider the expression valid, and will throw an error. This is essentially where the type-checking happens. An extremely important aspect we will not discuss in depth is that when checking types at f x, for f of type ∀ x : d1, B and x of type d2, we consider d1 and d2 equal not only up to syntactic equality, but also up to reductions, such as alpha reduction (renaming of bound variables), beta reduction (replacement of (fun x : T => e(x)) v with e(v)) and delta reduction (replacement of constants with their value stored in the context), among others.
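These reductions can be seen at work in a couple of one-line examples, where rfl only succeeds because the two sides reduce to the same expression:

```lean
-- beta reduction: (fun x => x + 1) 3 reduces to 3 + 1
example : (fun x : Nat => x + 1) 3 = 3 + 1 := rfl

-- delta reduction: the constant `four` unfolds to its definition
def four : Nat := 4
example : four = 4 := rfl
```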

Mathematics

Where does math fit into all this? So far, we have only described how to form expressions in Lean, and which ones Lean considers to be valid. We will now exemplify how to encode common mathematical objects and notions in this language, and how the process of checking whether expressions are well-formed corresponds to checking whether proofs are correct.

A first example: the natural numbers

We can imagine the natural numbers to be defined via

axiom myNat : Type
axiom myZero : myNat
axiom mySuccessor : myNat → myNat

where 0 is encoded by myZero, 1 by mySuccessor myZero, 2 by mySuccessor (mySuccessor myZero), … . This is an encoding similar to Peano arithmetic. Lean’s actual definition of the natural numbers is quite similar to this, bundled in the inductive type Nat and with the roles of myZero and mySuccessor being fulfilled by Nat.zero and Nat.succ.
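As a sketch of what that bundling looks like, here is how a Nat-like inductive type can be declared (named MyNat here to avoid clashing with the built-in one):

```lean
-- Each constructor plays the role of one of our axioms above.
inductive MyNat : Type where
  | zero : MyNat
  | succ : MyNat → MyNat

#check MyNat.succ (MyNat.succ MyNat.zero)  -- MyNat, our encoding of 2
```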

We can also define a “less than or equal” relation via

axiom myLe : myNat → myNat → Type
axiom myLe_symm : (x : myNat) → myLe x x
axiom myLe_step : (x : myNat) → (y : myNat) → myLe x y → myLe x (mySuccessor y)

The first axiom postulates the existence of a binary relation on myNat. The second and third axioms say that we have ∀ n : myNat, n ≤ n and ∀ n : myNat, ∀ m : myNat, n ≤ m → n ≤ m + 1. In Lean, these are respectively captured by Nat.le, Nat.le.refl and Nat.le.step. Note that they are function types, which we interpret as universal quantifiers. Let’s look at what happens if we apply them:

#check myLe_symm myZero
#check myLe_step myZero myZero (myLe_symm myZero)
#check myLe_step myZero (mySuccessor myZero) (myLe_step myZero myZero (myLe_symm myZero))

The types of these expressions are 0 ≤ 0, 0 ≤ 1 and 0 ≤ 2, which is why we consider them proofs of these statements.

Let’s follow how the type of the second expression is derived. Due to currying, it is read as (myLe_step myZero myZero) (myLe_symm myZero), so we’re dealing with a function application, with function (myLe_step myZero myZero) and argument (myLe_symm myZero). We should therefore derive the type of both, and check that the function truly has a function type whose domain type is definitionally equal to the type of the argument. Let’s first look at the type of the argument myLe_symm myZero, which is itself a function application. Here, the function has type (x : myNat) → myLe x x, and the argument has type myNat (which we get by querying the context for the type of constants). The domain type and the type of the argument match, so the type check passes, and we assign myLe_symm myZero the type myLe myZero myZero, which is the image type of the function with the symbol of the variable (x) replaced by the argument. Now, let’s return to the task of finding the type of (myLe_step myZero myZero) (myLe_symm myZero), and assume that we recursively derived the type of myLe_step myZero myZero to be myLe myZero myZero → myLe myZero (mySuccessor myZero) (try doing this as an exercise). The type of myLe_symm myZero is exactly the domain type of this function type, so type-checking passes and we obtain the desired 0 ≤ 1.

Next, let’s look at more examples of formalization. We can model equality of natural numbers as follows:

axiom myEq : myNat → myNat → Type
axiom myEq_refl: (x : myNat) → myEq x x
axiom myEq_subst : (x : myNat) → (y : myNat) → (P : myNat → Type) → myEq x y → P x →  P y

This says that there is a binary relation myEq, which is reflexive, in the sense that for any x, we may build a term (a proof) of type (statement) myEq x x, by the axiom myEq_refl. The last axiom says that if x and y are equal, then for any predicate P on natural numbers, if P x then P y. In actual Lean, myEq_refl is Eq.refl and myEq_subst is Eq.rec.
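In actual Lean, the same substitution step is available through the ▸ notation, which is built on Eq.rec; a quick sketch:

```lean
-- From x = y and P x, substitution (the `▸` notation, built on `Eq.rec`) yields P y.
example (x y : Nat) (P : Nat → Prop) (h : x = y) (px : P x) : P y :=
  h ▸ px
```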

We’ll take a brief moment to illustrate the notion of definitional equality we hinted at earlier. Consider the following:

noncomputable
example:  myEq myZero myZero := myEq_refl ((fun _ : myNat => myZero) (mySuccessor myZero))

When we define something with def or example, or state a theorem with lemma, Lean will compute the type of what follows := and check whether it matches what follows :. Technically speaking, the term of the example has type myEq ((fun _ : myNat => myZero) (mySuccessor myZero)) ((fun _ : myNat => myZero) (mySuccessor myZero)), and not myEq myZero myZero. However, these types are congruent up to reduction (in this case, beta reduction), which is why Lean considers them to be the same, and is satisfied.

Next, we’ll define addition of natural numbers. Here, we will deviate slightly from the way addition is implemented for Nat in Lean in order to simplify the exposition.

axiom myAdd : myNat → myNat → myNat
axiom myAdd_reduce_zero : (x : myNat) → myEq x (myAdd x myZero)
axiom myAdd_reduce_succ : (x : myNat) → (y : myNat) → myEq (mySuccessor (myAdd x y)) (myAdd x (mySuccessor y))

We introduce the function myAdd and provide two computation rules which are, morally speaking, ∀ x : myNat, x = x + 0 and ∀ x : myNat, ∀ y : myNat, (x + y) + 1 = x + (y + 1). We can now show that 2 = 1+1.

noncomputable
def myProof :
  let one := mySuccessor myZero
  let two := mySuccessor (mySuccessor myZero)
  myEq two (myAdd one one)
    :=
    let one := mySuccessor myZero
    let two := mySuccessor (mySuccessor myZero)
    myEq_subst
      (mySuccessor (myAdd one myZero))
      (myAdd one one)
      (fun x : myNat => myEq two x)
      (myAdd_reduce_succ
        one
        myZero
        )
      (myEq_subst
        one
        (myAdd one myZero)
        (fun x : myNat => myEq two (mySuccessor x))
        (myAdd_reduce_zero one)
        (myEq_refl two)
        )

We highly recommend you explore the code: you may comment out lines with -- and write a placeholder _ in front of the comment, so as to have the infoview display the expected type we want the term to have. Alternatively, you may copy some parts out and #check them, as follows:

noncomputable def one := mySuccessor myZero
noncomputable def two := mySuccessor (mySuccessor myZero)

#check (myEq_refl two)
#check (myAdd_reduce_zero one)
#check (fun x : myNat => myEq two (mySuccessor x))
#check (myEq_subst one (myAdd one myZero))
#check (myEq_subst
        one
        (myAdd one myZero)
        (fun x : myNat => myEq two (mySuccessor x))
        (myAdd_reduce_zero one)
        (myEq_refl two)
        )

Both the claimed type and the proof term are expressions as we described them previously: they are made of functions, function applications, and constants. The claimed type encodes 2 = 1+1. To prove it, we build a term of this type, which is the content after :=. We get the claimed type from the application of myEq_subst with the predicate fun x : myNat => myEq two x, which lets us deduce it from a proof of 2 = mySuccessor (1+0). This, in turn, we also deduce from an application of myEq_subst, with the predicate fun x : myNat => myEq two (mySuccessor x), which lets us deduce it from a proof of 2 = mySuccessor (mySuccessor 0). The axiom myEq_refl serves as proof of the latter fact.

Type checking and inference thus ensure that we are using correct proofs of the hypotheses needed in intermediate steps of the proof, and that the proof truly produces evidence for the claimed type, which we interpret as a statement. In the previous code excerpt, the #checks reveal that the type checker will be satisfied with the expression, as the arguments have the types that the function we apply them to requests in its type.

In actual Lean, addition is defined using a recursion principle for natural numbers, and the above computation holds by definitional equality. In this context that means that during type checking, a reduction specific to inductive (and quotient) types, called iota reduction, will be applied, so that 1 + 1 reduces to the same expression as 2, and the proof therefore just requires reflexivity of equality.
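You can confirm this in plain Lean, where reflexivity alone closes the goal because both sides reduce to the same expression:

```lean
-- `rfl` succeeds: 1 + 1 and 2 are definitionally equal in Lean.
example : 2 = 1 + 1 := rfl
```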

We take a brief moment to look back at equality. Not all basic properties must be added as axioms. We may sometimes deduce certain basic properties from other ones. For example, the following proves that equality is symmetric, based on the previous axioms:

noncomputable
def myEq_symm :
  (x : myNat) → (y : myNat) →
  myEq x y → myEq y x
    :=
    fun x : myNat =>
      fun y : myNat =>
        fun eq : myEq x y =>
          (myEq_subst
           x
           y
           (fun z => myEq z x)
           eq
           (myEq_refl x))

We see that proving a universally quantified statement is equivalent to building a function that, given an argument, returns a term of the quantified type. Indeed, this is (almost) the only way to get universal quantification in the type during type inference (we could also get it by querying the type of constants).
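The same pattern works with Lean's built-in counterparts of our axioms; a small sketch:

```lean
-- A function taking `n` and returning a proof of `n ≤ n + 1`
-- proves the universally quantified statement.
example : ∀ n : Nat, n ≤ n + 1 :=
  fun _ => Nat.le.step Nat.le.refl
```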

If we add an induction principle for natural numbers, which is the equivalent of Lean’s Nat.rec, we can show that for every natural number n, zero is less than or equal to n:

axiom myNat_rec : (P : myNat → Type) → (P myZero) → ((x : myNat) → (P x) →  P (mySuccessor x)) → (y : myNat) → P y

noncomputable
def myInductionProof :
    (n : myNat) → myLe myZero n
      :=
      fun n : myNat =>
        myNat_rec
          (fun x : myNat => myLe myZero x)
          (myLe_symm myZero)
          (fun x : myNat =>
            fun ineq : myLe myZero x =>
              (myLe_step myZero x ineq))
          n

When using the recursion principle, the first argument is the so-called motive. Then we show the base case, which in the above corresponds to showing that 0 is less than or equal to 0, which follows from reflexivity. Finally we show the induction step: in the above, we get the implication as a function, and show that we may deduce the step from the axiom myLe_step.
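For comparison, here is a sketch of the same induction carried out with Lean's built-in Nat.rec (with the motive passed explicitly for readability):

```lean
-- Induction on `n`: base case 0 ≤ 0, step case 0 ≤ x → 0 ≤ x + 1.
example : ∀ n : Nat, 0 ≤ n :=
  fun n =>
    Nat.rec (motive := fun x => 0 ≤ x)
      Nat.le.refl
      (fun _ ih => Nat.le.step ih)
      n
```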

A second example: logic

So far, we looked mostly at computational proofs. But we can also model logic in this type theory. For example, consider this encoding of the logical “or”:

axiom myOr : Type → Type → Type
axiom myOr_inl : (P : Type) → (Q : Type) → (p : P) → myOr P Q
axiom myOr_inr : (P : Type) → (Q : Type) → (p : Q) → myOr P Q

This gives us two ways of proving P ∨ Q: either we build a proof of type P, or we build one of type Q. Can you guess how one could define the logical “and”? Here is “or” in action, where we show that 2 = 1+1 ∨ 1 = 0:

noncomputable
def mySecondProof :
  let one := mySuccessor myZero
  let two := mySuccessor (mySuccessor myZero)
  myOr (myEq two (myAdd one one)) (myEq one myZero)
    :=
    let one := mySuccessor myZero
    let two := mySuccessor (mySuccessor myZero)
    myOr_inl
    (myEq two (myAdd one one))
    (myEq one myZero)
    myProof
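As for the “and” exercise above, one possible encoding (our own sketch, mirroring Lean's And.intro, And.left and And.right) requires both proofs at introduction:

```lean
axiom myAnd : Type → Type → Type
axiom myAnd_intro : (P : Type) → (Q : Type) → P → Q → myAnd P Q
axiom myAnd_left  : (P : Type) → (Q : Type) → myAnd P Q → P
axiom myAnd_right : (P : Type) → (Q : Type) → myAnd P Q → Q
```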

Here is how one can encode existential quantification:

axiom myExists : (P : myNat → Type) → Type
axiom myExists_intro : (P : myNat → Type) → (e : myNat) → (he : P e) →  myExists P

The introduction rule allows us to produce type ∃ e, P e if we have an e and a proof of P e (a term of this type). Let’s use it to show that for every natural number, we may find a number greater than or equal to it:

noncomputable
def myProofWithExists :
  (n : myNat) → myExists (fun x => myLe n x)
  :=
    fun n : myNat =>
      myExists_intro
        (fun x => myLe n x)
        (mySuccessor n)
        (myLe_step
          n
          n
          (myLe_symm n)
          )

The second argument to the introduction rule says that, for n, a number greater than or equal to n that we claim to exist is n+1. The last argument is the proof of the fact that n ≤ n+1, proven directly with the axiom myLe_step. As an exercise, you may try to prove this statement with a different candidate as evidence for the existential quantifier.

Conclusion

This concludes our little trip around the basics of Lean’s type theory. It is quite stunning to see how the few rules we introduced in the first section allow us to express basic mathematical objects, statements and proofs. In a future post, we’ll perhaps take a closer look at further mathematical and logical foundational formalisms, as we started doing in the previous section, or at implementation aspects of type theory, as we did in the first section. In the meantime, we wish you happy proving!