Skip to content

Categorical Semantics of Simply Typed Lambda Calculus

Published:
\gdef\TUnit{\texttt{Unit}} \gdef\fst{\texttt{fst}} \gdef\snd{\texttt{snd}} \gdef\step{\Rightarrow} \gdef\id{\text{id}} \gdef\pair#1#2{\langle {#1}, {#2}\rangle} \gdef\semant#1{\llbracket {#1} \rrbracket}

Introduction

Programs are complicated. How do we develop a better understanding of how they work? Denotational semantics is the subarea of programming languages where people study the behaviors of programs by mapping them into some semantic domain of mathematical objects.

The lambda calculus is one of the simplest programming languages there is. Written in Backus-Naur form (BNF), it only has three kinds of expressions:

e  :=  x    λx.e    e1  e2e\;:=\;x\;|\;\lambda x.e\;|\;e_1\;e_2

which stands for variable, function abstraction, and function application. Despite its simplicity, the lambda calculus is Turing-complete, and in particular can represent infinite loops. There has been interesting work in giving semantics to the lambda calculus via domain theory, but that’s the topic for another day.

Computational trilogy

We will focus our attention on simply typed lambda calculus(STLC). STLC adds types to the (pure) lambda calculus. As a programming language, STLC is strongly normalizing, meaning that every well-typed term will eventually evaluate to a normal form. That shouldn’t be discouraging, since STLC is closely connected to intuitionistic logic, via the Curry-Howard correspondence. It turns out that logic and type systems are both connected to category theory via the so-called Curry-Howard-Lambek correspondence.

The goal of this post is to explicitly spell out a categorical semantics of STLC. I was unable to find proofs for the following table taken from this nlab page on “computational trilogy”, so I figured it will be a fun exercise to spell them out here.

logictype theorycategory theory
propositionstypesobjects
proofsterms/
impl-introλ\lambda abstractioncounit from tensor-hom adjunction
impl-elimapplicationunit from tensor-hom adjunction
cut eliminationβ\beta reductionzigzag identities
identity eliminationη\eta conversionother zigzag

Categorical semantics

Some of the following definitions are adapted from Benjamin Pierce’s Basic Category for Computer Scientists.

The type theory

We define the grammar for types in STLC:

A  :=  Unit    AB    A  ×  BA\;:=\;\TUnit\;|\;A\to B\;|\;A\;\times\;B

And the grammar for terms:

e  :=  unit    λx:τ.e    e1  e2    (e1,e2)    fst  e    snd  ee\;:=\;\texttt{unit}\;|\;\lambda x:\tau.e\;|\;e_1\;e_2\;|\;(e_1,e_2)\;|\;\fst\;e\;|\;\snd\;e

Here, Unit\TUnit is the type with one element unit\texttt{unit}. we also consider the following typing rules:

  Γ,x:Ax:AAxiomΓe:AΓ,y:Be:AWeaken  Γunit:UnitUnit-I\infer{Axiom}{\;}{\Gamma,x:A\vdash x:A} \quad \infer{Weaken}{\Gamma\vdash e:A}{\Gamma,y:B\vdash e:A} \quad \infer{Unit-I}{\;}{\Gamma\vdash\texttt{unit}:\TUnit} Γ,x:Ae:BΓλx:A.e:ABArrow-IΓf:ABΓe:AΓf  e:BArrow-E\infer{Arrow-I}{\Gamma, x:A\vdash e:B}{\Gamma\vdash\lambda x:A. e:A\to B} \quad \infer{Arrow-E}{\Gamma\vdash f:A\to B \quad \Gamma\vdash e:A}{\Gamma\vdash f\;e:B} Γe1:AΓe2:BΓ(e1,e2):A×BProd-IΓe:A×BΓfst e:AProd-E1Γe:A×BΓsnd e:BProd-E2\infer{Prod-I}{\Gamma\vdash e_1:A \quad \Gamma\vdash e_2:B}{\Gamma\vdash (e_1,e_2):A\times B} \quad \infer{Prod-E1}{\Gamma\vdash e:A\times B}{\Gamma\vdash \fst~e:A} \quad \infer{Prod-E2}{\Gamma\vdash e:A\times B}{\Gamma\vdash \snd~e:B}

Here, the -I and -E refer to introduction rules and elimination rules. Introduction rules describe how to produce terms of a certain type, whereas elimination rules corresponds to how to consume terms of a certain type. We call the syntax associated with them introduction forms and elimination forms. For example, λx:A. e\lambda x: A.~e and (e1,e2)(e_1,e_2) are introduction forms, whereas e1 e2e_1~e_2, or say fst e\fst~e, are elimination forms.

A Cartesian-closed category

For the other piece of the translation, we fix any Cartesian-closed category (CCC) C\mathbf{C}. Recall that a Cartesian-closed category is a category with finite products, such that for each object AA, the right product functor (×A):CC(-\times A):\mathbf{C}\to\mathbf{C} has a right adjoint (A):CC(-^A):\mathbf{C}\to\mathbf{C} called the exponential.

The definition of adjunctions come with a unit natural transformation ϵA:(A×A)IdC\epsilon_A:(-^A\times A)\to \text{Id}_\mathbf{C} and a counit natural transformation ηA:IdC(×A)A\eta_A:\text{Id}_\mathbf{C}\to (-\times A)^A such that the following “zigzag” equalities hold:

Id(×A)=ϵA(×A)(×A)ηA\begin{equation}\mathcal{Id}_{(-\times A)}=\epsilon_A (-\times A)\circ (-\times A)\eta_A\end{equation} Id(A)=(A)ϵAηA(A)\begin{equation}\mathcal{Id}_{(-^A)}=(-^A)\epsilon_A\circ \eta_A (-^A)\end{equation}

Here IdC\text{Id}_\mathbf{C} is the identity functor on C\mathbf{C}, and the caligraphic Id\mathcal{Id} denotes identity natural transformations on the respective functors.

An alternative formulation of the unit equation is useful as follows. For each object BCB\in\mathbf{C}, the component ϵA,B\epsilon_{A,B} of the unit ϵA\epsilon_A has the universal property such that, for any CCC\in\mathbf{C} and morphism ff from (C×A)(\underline{C}\times A) (the product functor applied to CC) to BB, there exists a unique morphism curry(f)curry(f) from CC to BAB^A (the exponential functor applied to BB) making the diagram commute:

Test

The translation

We are finally ready to define the semantic translation! The double brackets   \semant{ \;} takes something from the type theory and maps it into the CCC we specified. On types, it is defined inductive on the syntax and maps each type to an object in the CCC. Again, the following translations are taken from Pierce’s Basic Category for Computer Scientists:

Unit1CA×BA×BABBA\begin{align*} \semant{ \TUnit} &\triangleq 1_\mathbf{C}\\ \semant{ A\times B} &\triangleq \semant{ A} \times \semant{ B} \\ \semant{ A\to B} &\triangleq \semant{ B} ^{\semant{ A} } \end{align*}

The translation for the typing context is inductively defined on the number of variables: 1C\semant{ \varnothing} \triangleq 1_\mathbf{C} and Γ,x:AΓ×A\semant{ \Gamma,x:A} \triangleq \semant{ \Gamma} \times \semant{ A} .

Finally, we inductively define the translation on the typing derivations:

DerivationTranslationDomainTarget
Γunit:Unit\semant{ \Gamma\vdash \texttt{unit}:\TUnit} !Γ!_{\semant{ \Gamma} }Γ\semant{ \Gamma} 1C1_\mathbf{C}
Γ,x:Ax:A\semant{ \Gamma,x:A\vdash x:A} π2\pi_2Γ×A\semant{ \Gamma} \times\semant{ A} A\semant{ A}
Γ,y:Be:A\semant{ \Gamma,y:B\vdash e:A} Γe:Aπ1\semant{ \Gamma\vdash e:A} \circ\pi_1Γ×A\semant{ \Gamma} \times\semant{ A} B\semant{ B}
Γλx:A.e:AB\semant{ \Gamma\vdash \lambda x:A.e:A\to B} curry(Γ,x:Ae:B)curry(\semant{ \Gamma,x:A\vdash e:B} )Γ\semant{ \Gamma} BA\semant{ B} ^{\semant{ A} }
Γe1  e2:B\semant{ \Gamma\vdash e_1\;e_2:B} ϵA,BΓe1:AB,Γe2:A\epsilon_{A,B}\circ\langle\semant{ \Gamma\vdash e_1:A\to B} ,\semant{ \Gamma\vdash e_2:A} \rangleΓ\semant{ \Gamma} B\semant{ B}
Γ(e1,e2):A×B\semant{ \Gamma\vdash (e_1,e_2):A\times B} Γe1:A,Γe2:B\langle\semant{ \Gamma\vdash e_1:A} ,\semant{ \Gamma\vdash e_2:B} \rangleΓ\semant{ \Gamma} A×B\semant{ A} \times\semant{ B}
Γfst  e:A\semant{ \Gamma\vdash \fst\;e:A} π1Γe:A×B\pi_1\circ \semant{ \Gamma\vdash e:A\times B} Γ\semant{ \Gamma} A\semant{ A}
Γsnd  e:B\semant{ \Gamma\vdash \snd\;e:B} π2Γe:A×B\pi_2\circ \semant{ \Gamma\vdash e:A\times B} Γ\semant{ \Gamma} B\semant{ B}

For this particular formulation of STLC, it just so happens that for any fixed context, if a term is typeable then there is a unique typing rule that applies to it. This can be seen just from the fact that each term grammar branch appears in the conclusion of exactly one typing rule. Thus, we can be a little hand-wavy and define the translations on the term syntax directly.

Semantics respects equivalence

β\beta-equivalence

β\beta-equivalence is about how to reduce expressions to simpler ones. First define a Call-by-value small-step operational semantics:

(λx:A.  e1)  e2e1[xe2]Subste1e1e1  e2e1e2Func\infer{Subst}{}{(\lambda x: A.\;e_1)\;e_2 \step e_1[x\mapsto e_2]} \quad \infer{Func}{e_1 \step e_1'}{e_1\;e_2 \step e_1' e_2} \quad \dots

As we can see, these rules tell us how elimination forms can “eliminate” introduction forms. The e[xe]e[x\mapsto e'] notation denotes (capture-avoiding) substitution of free occurences of xx for ee' inside ee. And we can denote β\beta-equivalence, written β\equiv_\beta, as the reflexive-symmetric-transitive closure of this relation. However, proving that beta equivalence plays well with the semantics is now try because substitution is defined inductively on the syntax!

We will prove the following helpful lemma: If Γλx:A.  e1\Gamma\vdash \lambda x: A.\;e_1 and Γe2:A\Gamma\vdash e_2 : A, then

Γe2[xe1]:B=Γ,x:Ae1:BidΓ,Γe1:A\semant{\Gamma\vdash e_2[x \mapsto e_1] : B}=\semant{\Gamma,x: A\vdash e_1: B}\circ \pair{\id_{\semant{\Gamma}}}{\semant{\Gamma\vdash e_1: A}}

Proof: Induction on the typing derivation of BB.

η\eta-equivalence

The η\eta-equivalence for function types says that for whatever types A,BA,B, a function h:ABh:A\to B should be considered “the same” as λx:A. (h x)\lambda x:A.~(h~x). Indeed, the latter seems to be just ”hh with extra steps”. It is an instance of a more general idea called contextual equivalence, or even more abstractly, extensional equivalence. In short, it means that things are identified by “extension”, i.e. how they act from the outside perspective.

One might notice that λx:A.(h  x)\lambda x:A. (h\;x) is an introduction form immediately wrapped around an elimination form, both for arrow types. In general, η\eta-equivalences almost all look like this.

η\eta-equivalence for product types

This amounts to claiming that Γ(fst e,snd e):A×B=Γe:A×B\semant{\Gamma\vdash (\fst~e, \snd~e): A\times B}=\semant{\Gamma \vdash e:A\times B}, which follows immediately from the universal property of the product:

Product

η\eta-equivalence for arrow types

Suppose Γλx:A.e  x:AB\Gamma\vdash \lambda x:A.e\;x:A\to B. Then ee must be of type ABA\to B, but it doesn’t have to be in normal form. The semantics of the typing derivations defined earlier tells us

Γλx:A.e  x:AB=curry(Γ,x:Ae  x:B)=curry(ϵA,BΓ,x:Ae:AB,Γ,x:Ax:A)=curry(ϵA,BΓ,x:Ae:AB,π2)=curry(ϵA,BΓe:ABπ1,π2)\begin{align*} &\semant{ \Gamma\vdash \lambda x:A.e\;x:A\to B} \\ =&curry(\semant{ \Gamma,x:A \vdash e\;x:B} )\\ =&curry(\epsilon_{A,B}\circ \langle\semant{ \Gamma,x:A\vdash e:A\to B} ,\semant{ \Gamma,x:A\vdash x:A} \rangle)\\ =&curry(\epsilon_{A,B}\circ \langle\semant{ \Gamma,x:A\vdash e:A\to B} ,\pi_2\rangle)\\ =&curry(\epsilon_{A,B}\circ \langle\semant{ \Gamma\vdash e:A\to B} \circ\pi_1,\pi_2\rangle) \end{align*}

Tho a mouthful, ϵA,BΓe:ABπ1,π2\epsilon_{A,B}\circ\langle\semant{ \Gamma\vdash e:A\to B} \circ\pi_1,\pi_2\rangle is a morphism from Γ×A\semant{ \Gamma} \times \semant{ A} to B\semant{ B} . If we set CC to be Γ\semant{ \Gamma} , and ff as ϵA,BΓe:ABπ1,π2\epsilon_{A,B}\circ\langle\semant{ \Gamma\vdash e:A\to B} \circ\pi_1,\pi_2\rangle, then the diagram for the unit equation becomes

Unit

Where did the mystery gg come from? Well, since ff is of the form ϵA,B\epsilon_{A,B} composed with something, we can just define gg to be the something, aka Γe:ABπ1,π2\langle\semant{ \Gamma\vdash e:A\to B} \circ\pi_1,\pi_2\rangle, and the above diagram will trivially commute. But, note gg is actually equal to Γe:AB×idA\semant{ \Gamma\vdash e:A\to B} \times \text{id}_{\semant{ A} }.

Finally, for the real kicker, we invoke the uniqueness of currycurry in the universal property of ϵA\epsilon_A, so that Γe:AB=curry(ϵA,BΓe:ABπ1,π2)\semant{ \Gamma\vdash e:A\to B} =curry(\epsilon_{A,B}\circ \langle\semant{ \Gamma\vdash e:A\to B} \circ\pi_1,\pi_2\rangle). Thus,

Γλx:A.e  x:AB=Γe:AB\semant{ \Gamma\vdash \lambda x:A.e\;x:A\to B} =\semant{ \Gamma\vdash e:A\to B}

Conclusion

This was mostly just an exercise for me to make sure that the categorical semantics of STLC is sensible. Indeed, we have verified parts of the analogy from the nlab “computational trilogy” table, that the semantic respects β\beta and η\eta equivalence. Actually, the proof came down exactly to the unit and counit equations, just as in the table.

Note that our version of STLC doesn’t contain any “ground types” besides Unit\TUnit. One can simply add more ground types like Int,Bool,Number\texttt{Int}, \texttt{Bool}, \texttt{Number}, etc. The translation will then be incorporated to map each ground type to some designated object in our CCC, and the proof still works because it was defined inductively on the typing derivations. However, in this way the categorical semantics will not be able to enforce fine-grained equational constraints like 40 + 2 = 42 on the operational semantics.

References

  1. B. C. Pierce, Basic Category Theory for Computer Scientists. The MIT Press, 1991. doi: 10.7551/mitpress/1524.001.0001.
  2. S. Mac Lane, Categories for the Working Mathematician, vol. 5. in Graduate Texts in Mathematics, vol. 5. New York, NY: Springer, 1978. doi: 10.1007/978-1-4757-4721-8.
  3. https://ncatlab.org/nlab/show/computational+trilogy
  4. https://hustmphrrr.github.io/asset/pdf/comp-exam.pdf