
This post is a continuation of the discussion of “the elementary theory of the category of sets” [ETCS] which we had begun last time, here and in the comments which followed. My thanks go to all who commented, for some useful feedback and thought-provoking questions.

Today I’ll describe some of the set-theoretic operations and “internal logic” of ETCS. I have a feeling that some people are going to love this, and some are going to hate it. My main worry is that it will leave some readers bewildered or exasperated, thinking that category theory has an amazing ability to make easy things difficult.

• An aside: has anyone out there seen the book Mathematics Made Difficult? It’s probably out of print by now, but I recommend checking it out if you ever run into it — it’s a kind of extended in-joke which pokes fun at category theory and abstract methods generally. Some category theorists I know take a dim view of this book; I for my part found certain passages hilarious, in some cases making me laugh out loud for five minutes straight. There are category-theory-based books and articles out there which cry out for parody!

In an attempt to nip my concerns in the bud, let me remind my readers that there are major differences between the way standard set theories like ZFC treat membership and the way ETCS treats it. Differences at such a fundamental level are bound to propagate throughout the theoretical development, and to give the two theories a somewhat different character or feel. The differences may be summarized as follows:

• Membership in ZFC is a global relation between objects of the same type (sets).
• Membership in ETCS is a local relation between objects of different types (“generalized” elements or functions, and sets).

Part of what we meant by “local” is that an element per se is always considered relative to a particular set to which it belongs; strictly speaking, as per the discussion last time, the same element is never considered as belonging to two different sets. That is, in ETCS, an (ordinary) element of a set $A$ is defined to be a morphism $x: 1 \to A$; since the codomain is fixed, the same morphism cannot be an element $1 \to B$ of a different set $B$. This implies in particular that in ETCS, there is no meaningful global intersection operation on sets, which in ZFC is defined by:

$A \cap B = \{x: (x \in A) \wedge (x \in B)\}$

Instead, in ETCS, what we have is a local intersection operation on subsets $A \hookrightarrow X, B \hookrightarrow X$ of a set. But even the word “subset” requires care, because of how we are now treating membership. So let’s back up, and lay out some simple but fundamental definitions of terms as we are now using them.

Given two monomorphisms $i: A \to X, j: B \to X$, we write $i \subseteq j$ ($A \subseteq B$ if the monos are understood, or $A \subseteq_X B$ if we wish to emphasize this is local to $X$) if there is a morphism $k: A \to B$ such that $i = j k$. Since $j$ is monic, there can be at most one such morphism $k$; since $i$ is monic, such $k$ must be monic as well. We say $i, j$ define the same subset if this $k$ is an isomorphism. So: subsets of $X$ are defined to be isomorphism classes of monomorphisms into $X$. As a simple exercise, one may show that monos $i, j$ into $X$ define the same subset if and only if $i \subseteq j$ and $j \subseteq i$. The (reflexive, transitive) relation $\subseteq_X$ on monomorphisms thus induces a reflexive, transitive, antisymmetric relation, i.e., a partial order on subsets of $X$.

Taking some notational liberties, we write $A \subseteq X$ to indicate a subset of $X$ (as isomorphism class of monos). If $x: U \to X$ is a generalized element, let us say $x$ is in a subset $A \subseteq X$ if it factors (evidently uniquely) through any representative mono $i: A \to X$, i.e., if there exists $x': U \to A$ such that $x = i x'$. Now the intersection of two subsets $A \subseteq X$ and $B \subseteq X$ is defined to be the subset $A \cap B \subseteq X$ defined by the pullback of any two representative monos $i: A \to X, j: B \to X$. Following the “Yoneda principle”, it may equivalently be defined up to isomorphism by specifying its generalized elements:

$A \cap B :=_i \{x \in X: (x \mbox{ is in } A) \wedge (x \mbox{ is in } B)\}.$

Thus, intersection works essentially the same way as in ZFC, only it’s local to subsets of a given set.
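To make the pullback description concrete, here is a minimal sketch in ordinary finite sets. Everything in it is an illustrative assumption rather than part of ETCS: monos are modeled as Python dicts with distinct values, and the helper names `is_mono`, `pullback` and `is_in` are made up for this example.

```python
# A mono i: A -> X in finite sets, modeled as a dict with distinct values.
def is_mono(f):
    return len(set(f.values())) == len(f)

def pullback(i, j):
    """Pullback of monos i: A -> X and j: B -> X.

    The domain is the set of pairs (a, b) with i(a) == j(b); the returned
    dict is the induced mono from that domain into X.
    """
    return {(a, b): i[a] for a in i for b in j if i[a] == j[b]}

def is_in(x, i):
    """Does the element x of X factor through the mono i?"""
    return x in i.values()

X = range(6)
i = {"a0": 0, "a1": 2, "a2": 4}   # a representative of the subset {0, 2, 4}
j = {"b0": 2, "b1": 3, "b2": 4}   # a representative of the subset {2, 3, 4}
k = pullback(i, j)

assert is_mono(k)
# An element is in the pullback exactly when it is in both subsets.
assert all(is_in(x, k) == (is_in(x, i) and is_in(x, j)) for x in X)
print(sorted(k.values()))  # prints [2, 4]
```

Note that the domains of `i`, `j` and `k` share no elements at all; only the images inside `X` are compared, which is the sense in which intersection is local to `X`.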

While we’re at it, let’s reformulate the power set axiom in this language: it says simply that for each set $B$ there is a set $P(B)$ and a subset $\in_B \subseteq B \times P(B)$, such that for any relation $R \subseteq B \times A$, there is a unique “classifying map” $\chi_R: A \to P(B)$ whereby, under $1_B \times \chi_R: B \times A \to B \times P(B)$, we have

$R = (1_B \times \chi_R)^{-1}(\in_B).$

The equality is an equality between subsets, and the inverse image on the right is defined by a pullback. In categorical set theory notation,

$R = \{\langle b, a \rangle \in B \times A: b \in_B \chi_R(a)\}.$

Hence, there are natural bijections

$\displaystyle \frac{R \subseteq B \times A}{A \to P(B)} \qquad \frac{R \subseteq B \times A}{B \to P(A)}$

between subsets and classifying maps. The subset corresponding to $\phi: A \to P(B)$ is denoted $\left[\phi\right] \subseteq B \times A$ or $\left[\phi\right] \subseteq A \times B$, and is called the extension of $\phi$.

The set $P(1)$ plays a particularly important role; it is called the “subset classifier” because subsets $A \subseteq X$ are in natural bijection with functions $\chi: X \to P(1)$. [Cf. classifying spaces in the theory of fiber bundles.]

In ordinary set theory, the role of $P(1)$ is played by the 2-element set $\{f, t\}$. Here subsets $A \subseteq X$ are classified by their characteristic functions $\chi_A: X \to \{f, t\}$, defined by $\chi_A(x) := t$ iff $x \in A$. We thus have $A = \chi_A^{-1}(t)$; the elementhood relation $\in_1 \hookrightarrow 1 \times P(1)$ boils down to $t: 1 \to P(1)$. Something similar happens in ETCS set theory:

Lemma 1: The domain of elementhood $\in_1 \to 1 \times P(1) \cong P(1)$ is terminal.

Proof: A map $X \to \in_1$, that is, a map $\chi: X \to P(1)$ which is in $\in_1 \subseteq P(1)$, corresponds exactly to a subset $\chi^{-1}(\in_1) \subseteq X$ which contains all of $X$ (i.e., the subobject $1_X: X \subseteq X$). Since the only such subset is $1_X$, there is exactly one map $X \to \in_1$. $\Box$

Hence elementhood $\in_1 \subseteq 1 \times P(1)$ is given by an element $t: 1 \to P(1)$. The power set axiom says that a subset $A \subseteq X$ is retrieved from its classifying map $\chi_A: X \to P(1)$ as the pullback $\chi^{-1}_A(t) \subseteq X$.
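In ordinary finite sets, the bijection between subsets and classifying maps can be checked by brute force. The following sketch assumes the classical two-element model of $P(1)$, with Python's `False` and `True` standing in for $f$ and $t$; the names `chi` and `extension` are hypothetical helpers.

```python
from itertools import product

X = ["p", "q", "r"]

def chi(A):
    """Characteristic function of the subset A, as a dict X -> {False, True}."""
    return {x: (x in A) for x in X}

def extension(c):
    """Pull back t along c: the subset of points that c sends to True."""
    return frozenset(x for x in X if c[x])

# All functions X -> {False, True}.
all_maps = [dict(zip(X, vals)) for vals in product([False, True], repeat=len(X))]
all_subsets = {extension(c) for c in all_maps}

# The two passages are mutually inverse, so subsets and maps are in bijection.
assert len(all_maps) == 8 and len(all_subsets) == 8
assert all(chi(extension(c)) == c for c in all_maps)
assert all(extension(chi(A)) == A for A in all_subsets)
```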

Part of the power of, well, power sets is in a certain dialectic between external operations on subsets and internal operations on $P(1)$; one can do some rather amazing things with this. The intuitive (and pre-axiomatic) point is that if $C$ has finite products, equalizers, and power objects, then $P(1)$ is a representing object for the functor

$Sub: C^{op} \to Set$

which maps an object $X$ to the collection of subobjects of $X$, and which maps a morphism (“function”) $f: X \to Y$ to the “inverse image” function $f^{-1}: Sub(Y) \to Sub(X)$, that sends a subset $j: B \subseteq Y$ to the subset $f^{-1}(B) \subseteq X$ given by the pullback of the arrows $f: X \to Y$, $j: B \subseteq Y$. By the Yoneda lemma, this representability means that external natural operations on the $Sub(X)$ correspond to internal operations on the object $P(1)$. As we will see, one can play off the external and internal points of view against each other to build up a considerable amount of logical structure, enough for just about any mathematical purpose.

• Remark: A category satisfying just the first three axioms of ETCS, namely existence of finite products, equalizers, and power objects, is called an (elementary) topos. Most or perhaps all of this post will use just those axioms, so we are really doing some elementary topos theory. As I was just saying, we can build up a tremendous amount of logic internally within a topos, but there’s a catch: this logic will be in general intuitionistic. One gets classical logic (including law of the excluded middle) if one assumes strong extensionality [where we get the definition of a well-pointed topos]. Topos theory has a somewhat fearsome reputation, unfortunately; I’m hoping these notes will help alleviate some of the sting.

To continue this train of thought: by the Yoneda lemma, the representing isomorphism

$\displaystyle \theta: \hom(-, P(1)) \stackrel{\sim}{\to} Sub(-)$

is determined by a universal element $\theta_{P(1)}(1_{P(1)})$, i.e., a subset of $P(1)$, namely the mono $t: 1 \to P(1)$. In that sense, $t: 1 \to P(1)$ plays the role of a universal subset. The Yoneda lemma implies that external natural operations on general posets $Sub(X)$ are completely determined by how they work on the universal subset.

### Internal Meets

To illustrate these ideas, let us consider intersection. Externally, the intersection operation is a natural transformation

$\cap_X: Sub(X) \times Sub(X) \to Sub(X).$

This corresponds to a natural transformation

$\hom(X, P(1)) \times \hom(X, P(1)) \to \hom(X, P(1))$

which (by Yoneda) is given by a function $\wedge: P(1) \times P(1) \to P(1)$. Working through the details, this function is obtained by putting $X = P(1) \times P(1)$ and chasing

$\langle \pi_1, \pi_2\rangle \in \hom(P(1) \times P(1), P(1)) \times \hom(P(1) \times P(1), P(1))$

through the composite

$\displaystyle \hom(X, P(1)) \times \hom(X, P(1))$

$\displaystyle \stackrel{\sim}{\to} Sub(X) \times Sub(X) \stackrel{\cap}{\to} Sub(X) \stackrel{\sim}{\to} \hom(X, P(1)).$

Let’s analyze this bit by bit. The subset $\left[\pi_1\right] = \pi_{1}^{-1}(t) \subseteq P(1) \times P(1)$ is given by

$t \times id: 1 \times P(1) \to P(1) \times P(1),$

and the subset $\left[\pi_2\right] = \pi_{2}^{-1}(t) \subseteq P(1) \times P(1)$ is given by

$id \times t: P(1) \times 1 \to P(1) \times P(1).$

Hence $\left[\pi_1\right] \cap \left[\pi_2\right] \subseteq P(1) \times P(1)$ is given by the pullback of the functions $t \times id$ and $id \times t$, which is just

$t \times t: 1 \times 1 \to P(1) \times P(1).$

The map $\wedge: P(1) \times P(1) \to P(1)$ is thus defined to be the classifying map of $t \times t: 1 \times 1 \subseteq P(1) \times P(1)$.

To go from the internal meet $\wedge: P(1) \times P(1) \to P(1)$ back to the external intersection operation, let $A \subseteq X, B \subseteq X$ be two subsets, with classifying maps $\chi_A, \chi_B: X \to P(1)$. By the definition of $\wedge$, we have that for all generalized elements $x \in X$

$\chi_A(x) \wedge \chi_B(x) = t$ if and only if $\langle \chi_A(x), \chi_B(x) \rangle = \langle t, t \rangle$

(where the equality signs are interpreted with the help of equalizers). This holds true iff $x$ is in the subset $A \subseteq X$ and is in the subset $B \subseteq X$, i.e., if and only if $x$ is in the subset $A \cap B \subseteq X$. Thus $\chi_A \wedge \chi_B$ is indeed the classifying map of $A \cap B \subseteq X$. In other words, $\chi_{A \cap B} = \chi_A \wedge \chi_B$.

A by-product of the interplay between the internal and external is that the internal intersection operator

$\wedge: P(1) \times P(1) \to P(1)$

is the meet operator of an internal meet-semilattice structure on $P(1)$: it is commutative, associative, and idempotent (because that is true of external intersection). The identity element for $\wedge$ is the element $t: 1 \to P(1)$. In particular, $P(1)$ carries an internal poset structure: given generalized elements $u, v: A \to P(1)$, we may define

$u \leq v$ if and only if $u = u \wedge v$

and this defines a reflexive, transitive, antisymmetric relation $\left[\leq\right] \subseteq P(1) \times P(1)$:

$\left[\leq\right] :=_i \{\langle u, v \rangle \in P(1) \times P(1): u = u \wedge v\},$

equivalently described as the equalizer

$\left[\leq\right] \to P(1) \times P(1) \stackrel{\to}{\to} P(1)$

of the maps $\pi_1: P(1) \times P(1) \to P(1)$ and $\wedge: P(1) \times P(1) \to P(1)$. We have that $u \leq v$ if and only if $\left[u\right] \subseteq \left[v\right]$.
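As a sanity check, both claims (that the pointwise meet classifies the intersection, and that $u \leq v$ matches inclusion of extensions) can be verified exhaustively in the classical two-valued model. This is only a sketch in finite sets, with `ext`, `meet` and `leq` as illustrative names.

```python
from itertools import product

X = [0, 1, 2]
# All maps X -> {False, True}, playing the role of generalized elements of P(1).
maps = [dict(zip(X, vals)) for vals in product([False, True], repeat=len(X))]

def ext(u):
    """The subset [u] classified by u."""
    return frozenset(x for x in X if u[x])

def meet(u, v):
    """Pointwise meet u ∧ v."""
    return {x: u[x] and v[x] for x in X}

def leq(u, v):
    """u <= v is defined as u == u ∧ v."""
    return u == meet(u, v)

for u, v in product(maps, repeat=2):
    assert ext(meet(u, v)) == ext(u) & ext(v)   # chi_{A ∩ B} = chi_A ∧ chi_B
    assert leq(u, v) == (ext(u) <= ext(v))      # u <= v iff [u] ⊆ [v]
```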

### Internal Implication

Here we begin to see some of the amazing power of the interplay between internal and external logical operations. We will prove that $P(1)$ carries an internal Heyting algebra structure (ignoring joins for the time being).

Let’s recall the notion of Heyting algebra in ordinary naive set-theoretic terms: it’s a lattice $P$ that has a material implication operator $\Rightarrow$ such that, for all $x, y, z \in P$,

$x \wedge y \leq z$ if and only if $x \leq y \Rightarrow z.$

Now: by the universal property of $P(1)$, a putative implication operation $\Rightarrow: P(1) \times P(1) \to P(1)$ is uniquely determined as the classifying map of its inverse image $(\Rightarrow)^{-1}(t) \subseteq P(1) \times P(1)$, whose collection of generalized elements is

$\{\langle u, v \rangle \in P(1) \times P(1): (u \Rightarrow v) = t\}$

Given $\langle u, v \rangle: A \to P(1) \times P(1)$, the equality here is equivalent to

$t \leq u \Rightarrow v$

(because $t: 1 \to P(1)$ is maximal), which in turn is equivalent to

$t \wedge u = u \leq v$

This means we should define $\Rightarrow: P(1) \times P(1) \to P(1)$ to be the classifying map of the subset

$\left[\leq\right] \subseteq P(1) \times P(1).$

Theorem 1: $P(1)$ admits internal implication.

Proof: We must check that for any three generalized elements $u, v, w: A \to P(1)$, we have

$w \leq u \Rightarrow v$ if and only if $w \wedge u \leq v.$

Passing to the external picture, let $\left[u\right], \left[v\right], \left[w\right]$ be the corresponding subsets of $A$. Now: according to how we defined $\Rightarrow,$ a generalized element $e \in A$ is in $\left[u \Rightarrow v\right]$ if and only if $u(e) \leq v(e)$. This applies in particular to any monomorphism $e: \left[w\right] \to A$ that represents the subset $\left[w\right] \subseteq A$.

Lemma 2: The composite

$\displaystyle u(e) = (\left[w\right] \stackrel{e}{\to} A \stackrel{u}{\to} P(1))$

is the classifying map of the subset $\left[w\right] \cap \left[u\right] \subseteq \left[w\right]$.

Proof: As subsets of $\left[w\right]$, $(u e)^{-1}(t) = e^{-1} u^{-1}(t) = e^{-1}(\left[u\right]) = \left[w\right] \cap \left[u\right]$ where the last equation holds because both sides are the subsets defined as the pullback of two representative monos $e: \left[w\right] \to A$, $i: \left[u\right] \to A$. $\Box$

Continuing the proof of theorem 1, we see by lemma 2 that the condition $u(e) \leq v(e)$ corresponds externally to the condition

$\left[w\right] \cap \left[u\right] \subseteq \left[w\right] \cap \left[v\right]$

and this condition is equivalent to $\left[w\right] \cap \left[u\right] \subseteq \left[v\right]$. Passing back to the internal picture, this is equivalent to $w \wedge u \leq v$, and the proof of theorem 1 is complete. $\Box$
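To see the defining adjunction at work in a case where the logic is not classical, here is a toy example on a three-element chain, a standard small Heyting algebra that is not Boolean. It is an ordinary set-theoretic illustration, not the internal object $P(1)$ of a topos, and the names `imp` and `neg` are chosen just for this sketch.

```python
H = [0, 1, 2]        # the chain 0 < 1 < 2; the top element 2 plays the role of t
TOP, BOT = 2, 0

def meet(x, y):
    return min(x, y)

def join(x, y):
    return max(x, y)

def imp(y, z):
    """y => z, computed as the largest x with x ∧ y <= z."""
    return max(x for x in H if meet(x, y) <= z)

def neg(x):
    return imp(x, BOT)

# The Heyting adjunction: x ∧ y <= z  iff  x <= (y => z).
assert all((meet(x, y) <= z) == (x <= imp(y, z))
           for x in H for y in H for z in H)

# Excluded middle fails for the middle element: 1 ∨ ¬1 is 1, not the top.
assert join(1, neg(1)) == 1
assert join(1, neg(1)) != TOP
```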

### Cartesian Closed Structure

Next we address a comment made by “James”, that a category satisfying the ETCS axioms is cartesian closed. Like everything else in this article, this uses only the fact that such a category is a topos: it has finite products, equalizers, and “power sets”.

Proposition 1: If $A, B$ are “sets”, then $P(A \times B)$ represents an exponential $P(B)^A.$

Proof: By the power set axiom, there is a bijection between maps into the power set and relations:

$\displaystyle \frac{\phi: X \to P(A \times B)}{R \subseteq X \times (A \times B)}$

which is natural in $X$. By the same token, there is a natural bijection

$\displaystyle \frac{R \subseteq (X \times A) \times B}{\phi': X \times A \to P(B)}.$

Putting these together, we have a natural isomorphism

$\hom(-, P(A \times B)) \cong \hom(- \times A, P(B))$

and this representability means precisely that $P(A \times B)$ plays the role of an exponential $P(B)^A$. $\Box$

Corollary 1: $P(A) \cong P(1)^A$. $\Box$

The universal element of this representation is an evaluation map $A \times P(A) \to P(1)$, which is just the classifying map of the subset $\in_A \subseteq A \times P(A)$.
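In finite sets, Proposition 1 amounts to the familiar fact that a relation between $A$ and $B$ is the same thing as a function from $A$ to subsets of $B$. A brute-force sketch, with `curry` and `uncurry` as hypothetical names for the two directions:

```python
from itertools import combinations, product

A = ["a1", "a2"]
B = [0, 1]

def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def curry(R):
    """A relation R ⊆ A × B, viewed as a function A -> P(B)."""
    return {a: frozenset(b for b in B if (a, b) in R) for a in A}

def uncurry(phi):
    """A function A -> P(B), viewed as a relation R ⊆ A × B."""
    return frozenset((a, b) for a in A for b in phi[a])

relations = powerset(product(A, B))                    # elements of P(A × B)
functions = {tuple(curry(R)[a] for a in A) for R in relations}

assert len(relations) == 2 ** (len(A) * len(B))        # |P(A × B)| = 16
assert len(functions) == (2 ** len(B)) ** len(A)       # |P(B)^A| = 16
assert all(uncurry(curry(R)) == R for R in relations)  # the passage is invertible
```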

Thus, $P(B)^A \cong P(A \times B)$ represents the set of all functions $\phi: A \to P(B)$ (that is, relations from $A$ to $B$). This is all we need to continue the discussion of internal logic in this post, but let’s also sketch how we get full cartesian closure. [Warning: for those who are not comfortable with categorical reasoning, this sketch could be rough going in places.]

As per our discussion, we want to internalize the set of such relations which are graphs of functions, i.e., maps $\phi$ where each $\phi(a) \subseteq B$ is a singleton, in other words which factor as

$\displaystyle A \to B \stackrel{\sigma}{\to} P(B)$

where $\sigma: B \to P(B)$ is the singleton mapping:

$b \mapsto \{b\} = \{c \in B: b = c\}.$

We see from this set-theoretic description that $\sigma: B \to P(B)$ classifies the equality relation

$\{\langle b, c\rangle \in B \times B: b = c\} \subseteq B \times B$

which we can think of as either the equalizer of the pair of maps $\pi_1, \pi_2: B \times B \to B$ or, what is the same, the diagonal map $\delta_B = \langle 1_B, 1_B \rangle: B \to B \times B$.

Thus, we put $\sigma = \chi_{\delta}: B \to P(B)$, and it is not too hard to show that the singleton mapping $\sigma$ is a monomorphism. As usual, we get this monomorphism as the pullback $\chi_{\sigma}^{-1}(t)$ of $t: 1 \to P(1)$ along its classifying map $\chi_{\sigma}: P(B) \to P(1)$.

Now: a right adjoint such as $(-)^A$ preserves all limits, and in particular pullbacks, so we ought then to have a pullback

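$\displaystyle \begin{array}{ccc} B^A & \longrightarrow & 1^A \\ {\scriptstyle \sigma^A} \downarrow & & \downarrow {\scriptstyle t^A} \\ P(B)^A & \stackrel{(\chi_\sigma)^A}{\longrightarrow} & P(1)^A \end{array}$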

Of course, we don’t even have $B^A$ yet, but this should give us an idea: define $\sigma^A$, and in particular its domain $B^A$, by taking the pullback of the right-hand map along the bottom map. In case there is doubt, the map on the bottom is defined Yoneda-wise, applying the isomorphism

$\hom(P(B)^A \times A, P(1)) \cong \hom(P(B)^A, P(1)^A)$

to the element in the hom-set (on the left) given by the composite

$\displaystyle P(B)^A \times A \stackrel{ev}{\to} P(B) \stackrel{\chi_\sigma}{\to} P(1).$

The map on the right of the pullback is defined similarly. That this recipe really gives a construction of $B^A$ will be left as an exercise for the reader.

### Universal Quantification

As further evidence of the power of the internal-external dialectic, we show how to internalize universal quantification.

As we are now dealing with predicate logic, let’s begin by defining some terms as they will be used in ETCS and topos theory:

• An ordinary predicate of type $A$ is a function $\phi: A \to P(1)$. Alternatively, it is an ordinary element $\phi': 1 \to P(1)^A \cong P(A)$. It corresponds (naturally and bijectively) to a subset $\left[\phi\right]: S \subseteq A$.
• A generalized predicate of type $A$ is a function $\phi': X \to P(A) \cong P(1)^A$. It may be identified with (corresponds naturally and bijectively to) a function $\phi: X \times A \to P(1)$, or to a subset $\left[\phi\right]: S \subseteq X \times A$.

We are trying to define an operator $\forall_A$ which will take a predicate of the form $\phi: X \times A \to P(1)$ [conventionally written $\phi(x, a)$] to a predicate $\forall_A \phi: X \to P(1)$ [conventionally written $\forall_{a \in A} \phi(x, a)$]. Externally, this corresponds to a natural operation which takes subsets of $X \times A$ to subsets of $X$. Internally, it corresponds to an operation of the form

$\forall_A: P(A) \cong P(1)^A \to P(1).$

This function is determined by the subset $(\forall_A)^{-1}(t) \subseteq P(1)^A$, defined elementwise by

$\{\phi \in P(1)^A: \forall_A \phi = t\}.$

Now, in ordinary logic, $\forall_{a \in A} \phi(a)$ is true if and only if $\phi(a)$ is true for all $a \in A$, or, in slightly different words, if $\phi: A \to P(1)$ is constantly true over all of $A$:

$\displaystyle \phi = (A \stackrel{!}{\to} 1 \stackrel{t}{\to} P(1)).$

The expression on the right (global truth over $A$) corresponds to a function $t_A: 1 \to P(1)^A$, indeed a monomorphism since any function with domain $1$ is monic. Thus we are led to define the desired quantification operator $\forall_A: P(1)^A \to P(1)$ as the classifying map of $t_A: 1 \subseteq P(1)^A$.

Let’s check how this works externally. Let $\phi: X \to P(1)^A$ be a generalized predicate of type $A$. Then according to how $\forall_A$ has just been defined, $\forall_A \phi: X \to P(1)$ classifies the subset

$\displaystyle \{x \in X: \phi(x, -) = (t_A: A \to P(1))\} \subseteq X.$

There is an interesting adjoint relationship between universal quantification and substitution (aka “pulling back”). By “substitution”, we mean that given any predicate $\psi: X \to P(1)$ on $X$, we can always pull back to a predicate on $X \times A$ (substituting in a dummy variable $a$ of type $A$, forming e.g. $\psi(x) \wedge \left[a=a\right]$) by composing with the projection $\pi: X \times A \to X$. In terms of subsets, substitution along $A$ is the natural external operation

$(\left[\psi\right] \subseteq X) \mapsto (\left[\psi\right]\times A \subseteq X \times A).$

Then, for any predicate $\phi: X \times A \to P(1)$, we have the adjoint relationship

$\left[\psi\right] \times A \subseteq \left[\phi\right]$ if and only if $\left[\psi\right] \subseteq \left[\forall_A \phi\right]$

so that substitution along $A$ is left adjoint to universal quantification along $A$. This is easy to check; I’ll leave that to the reader.
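For readers who would rather see the check than do it, here is a brute-force verification of the adjunction in finite sets. It is a sketch under classical assumptions; `subst` and `forall_A` are illustrative names for substitution along $A$ and universal quantification along $A$.

```python
from itertools import combinations, product

X = [0, 1, 2]
A = ["a", "b"]

def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def subst(psi):
    """Substitution along A: send a subset of X to the subset psi × A of X × A."""
    return frozenset((x, a) for x in psi for a in A)

def forall_A(phi):
    """Universal quantification: the x such that (x, a) is in phi for every a."""
    return frozenset(x for x in X if all((x, a) in phi for a in A))

# psi × A ⊆ phi  iff  psi ⊆ forall_A(phi), over all 8 * 64 pairs (psi, phi).
for psi in powerset(X):
    for phi in powerset(product(X, A)):
        assert (subst(psi) <= phi) == (psi <= forall_A(phi))
```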

### Internal Intersection Operators

Now we put all of the above together, to define an internal intersection operator

$\bigcap: PPX \to PX$

which intuitively takes an element $1 \to PPX$ (a family $F$ of subsets of $X$) to its intersection $1 \to PX$, as a subset $\bigcap F \subseteq X$.

Let’s first write out a logical formula which expresses intersection:

$x \in \bigcap F \ \ \mbox{if and only if} \ \ \forall_{S \in PX} (S \in F) \Rightarrow (x \in S).$

We have all the ingredients to deal with the logical formula on the right: we have an implication operator $\Rightarrow$ as part of the internal Heyting algebra structure on $P(1)$, and we have the quantification operator $\forall_{PX}$. The atomic expressions $(S \in F)$ and $(x \in S)$ refer to internal elementhood: $(x \in S)$ means $\langle x, S\rangle \in X \times PX$ is in $\ \in_{X}\ \subseteq X \times PX$, and $(S \in F)$ means $\langle S, F\rangle \in PX \times PPX$ is in $\ \in_{PX}\ \subseteq PX \times PPX$.

There is a slight catch, in that the predicates “$(S \in_{PX} F)$” and “$(x \in_X S)$” (as generalized predicates over $PX$, where $S$ lives) are taken over different domains. The first is of the form $\phi_1: PPX \to P(1)^{PX}$, and the second is of the form $\phi_2: X \to P(1)^{PX}$. No matter: we just substitute in some dummy variables. That is, we just pull these maps back to a common domain $PPX \times X$, forming the composites

$\displaystyle \psi_1 = (PPX \times X \stackrel{\pi_1}{\to} PPX \stackrel{\phi_1}{\to} P(1)^{PX})$

and

$\displaystyle \psi_2 = (PPX \times X \stackrel{\pi_2}{\to} X \stackrel{\phi_2}{\to} P(1)^{PX}).$

Putting all this together, we form the composite

$\displaystyle PPX \times X \stackrel{\langle \psi_1, \psi_2\rangle}{\to} P(1)^{PX} \times P(1)^{PX}$

$\displaystyle \cong (P(1) \times P(1))^{PX} \stackrel{(\Rightarrow)^{PX}}{\to} P(1)^{PX} \stackrel{\forall_{PX}}{\to} P(1)$

This composite directly expresses the definition of the internal predicate $(x \in \bigcap F)$ given above. By cartesian closure, this map $PPX \times X \to P(1)$ induces the desired internal intersection operator, $\bigcap: PPX \to PX$.
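The logical formula for $\bigcap F$ can be transcribed almost word for word in finite sets. The sketch below assumes classical two-valued logic for $\Rightarrow$ and $\forall$, and the names `implies` and `big_intersection` are made up for the illustration.

```python
from itertools import combinations

X = [0, 1, 2, 3]

def powerset(xs):
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

PX = powerset(X)

def implies(p, q):
    """Classical implication on truth values."""
    return (not p) or q

def big_intersection(F):
    """x is in the intersection of F iff, for all S in PX, (S in F) implies (x in S)."""
    return frozenset(x for x in X if all(implies(S in F, x in S) for S in PX))

F = {frozenset({0, 1, 2}), frozenset({1, 2, 3})}
assert big_intersection(F) == frozenset({1, 2})
# For the empty family the implication holds vacuously, so the result is all of X.
assert big_intersection(set()) == frozenset(X)
```

Quantifying over all of `PX`, rather than just over the members of `F`, mirrors the composite above, where $\forall_{PX}$ ranges over the whole power set and membership in $F$ enters only through the implication.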

This construction provides an important bridge to getting the rest of the internal logic of ETCS. Since we can construct the intersection of arbitrary definable families of subsets, the power sets $PX$ are internal inf-lattices. But inf-lattices are sup-lattices as well; on this basis we will be able to construct the colimits (e.g., finite sums, coequalizers) that we need. Similarly, the intersection operators easily allow us to construct image factorizations: any function $f: X \to Y$ can be factored (in an essentially unique way) as an epi or surjection $X \to I$ to the image, followed by a mono or injection $I \to Y$. The trick is to define the image as the smallest subset of $Y$ through which $f$ factors, by taking the intersection of all such subsets. Image factorization leads in turn to the construction of existential quantification.

As remarked above, the internal logic of a topos is generally intuitionistic (the law of excluded middle need not be satisfied). But, if we add in the axiom of strong extensionality of ETCS, then we’re back to ordinary classical logic, where the law of excluded middle is satisfied, and where we just have the two truth values “true” and “false”. This means we will be able to reason in ETCS set theory just as we do in ordinary mathematics, taking just a bit of care with how we treat membership. The foregoing discussion gives some indication that logical operations in categorical set theory work in ways familiar from naive set theory, and that basic set-theoretic constructions like intersection are well-grounded in ETCS.

Last time in this series on Stone duality, we observed a perfect duality between finite Boolean algebras and finite sets, which we called “baby Stone duality”:

1. Every finite Boolean algebra $B$ is obtained from a finite set $X$ by taking its power set (or set of functions $\hom(X, \mathbf{2})$ from $X$ to $\mathbf{2}$, with the Boolean algebra structure it inherits “pointwise” from $\mathbf{2} = \{0, 1\}$). The set $X$ may be defined to be $\mbox{Bool}(B, \mathbf{2})$, the set of Boolean algebra homomorphisms from $B$ to $\mathbf{2}$.
2. Conversely, every finite set $X$ is obtained from the Boolean algebra $B = \hom(X, \mathbf{2})$ by taking its “hom-set” $\mbox{Bool}(B, \mathbf{2})$.

More precisely, there are natural isomorphisms

$i_B: B \stackrel{\sim}{\to} \hom(\mbox{Bool}(B, \mathbf{2}), \mathbf{2}),$

$j_X: X \stackrel{\sim}{\to} \mbox{Bool}(\hom(X, \mathbf{2}), \mathbf{2})$

in the categories of finite Boolean algebras and of finite sets, respectively. In the language of category theory, this says that these categories are (equivalent to) one another’s opposite — something I’ve been meaning to explain in more detail, and I promise to get to that, soon! In any case, this duality says (among other things) that finite Boolean algebras, no matter how abstractly presented, can be represented concretely as power sets.
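For a very small set, the isomorphism $j_X$ can be confirmed by exhaustive search: the Boolean algebra maps from the power set of $X$ to $\mathbf{2}$ turn out to be exactly the "evaluate at a point" maps. This is a brute-force sketch for a three-element $X$; `is_hom` is an illustrative helper.

```python
from itertools import combinations, product

X = frozenset({0, 1, 2})
subsets = [frozenset(c) for r in range(len(X) + 1)
           for c in combinations(sorted(X), r)]

def is_hom(h):
    """Does h: P(X) -> {0, 1} preserve 0, 1, meets, joins and complements?"""
    return (
        h[frozenset()] == 0 and h[X] == 1
        and all(h[a & b] == min(h[a], h[b]) for a in subsets for b in subsets)
        and all(h[a | b] == max(h[a], h[b]) for a in subsets for b in subsets)
        and all(h[X - a] == 1 - h[a] for a in subsets)
    )

# Search all 2^8 functions P(X) -> {0, 1} for homomorphisms.
homs = []
for vals in product([0, 1], repeat=len(subsets)):
    h = dict(zip(subsets, vals))
    if is_hom(h):
        homs.append(h)

# Each point x of X gives the homomorphism "does the subset contain x?".
evals = [{a: int(x in a) for a in subsets} for x in sorted(X)]

assert len(homs) == len(X)            # exactly one homomorphism per point
assert all(e in homs for e in evals)
```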

Today I’d like to apply this representation to free Boolean algebras (on finitely many generators). What is a free Boolean algebra? Again, the proper context for discussing this is category theory, but we can at least convey the idea: given a finite set $S$ of letters $x, y, z, \ldots$, consider the Boolean algebra $\mathbf{B}(S)$ whose elements are logical equivalence classes of formulas you can build up from the letters using the Boolean connectives $\wedge, \vee, \neg$ (and the Boolean constants $0, 1$), where two formulas $\phi, \phi'$ are defined to be logically equivalent if $\phi \leq \phi'$ and $\phi' \leq \phi$ can be inferred purely on the basis of the Boolean algebra axioms. This is an excellent example of a very abstract description of a Boolean algebra: syntactically, there are infinitely many formulas you can build up, and the logical equivalence classes are also infinite and somewhat hard to visualize, but the mess can be brought under control using Stone duality, as we now show.

First let me cut to the chase, and describe the key property of free Boolean algebras. Let $A$ be any Boolean algebra; it could be a power set, the lattice of regular open sets in a topology, or whatever, and think of a function $f: S \to A$ from the set of letters to $A$ as modeling or interpreting the atomic formulas $x, y, z, \ldots$ as elements $f(x), f(y), f(z), \ldots$ of $A$. The essential property of the free Boolean algebra is that we can extend this interpretation $f$ in a unique way to a Boolean algebra map $\mathbf{B}(S) \to A$. The way this works is that we map a formula like $(x \wedge \neg y) \vee z$ to the obvious formula $(f(x) \wedge \neg f(y)) \vee f(z)$. This is well-defined on logical equivalence classes of formulas because if $p = q$ in $\mathbf{B}(S)$, i.e., if the equality is derivable just from the Boolean algebra axioms, then of course $f(p) = f(q)$ holds in $A$ as the Boolean algebra axioms hold in $A$. Thus, there is a natural bijective correspondence between functions $S \to A$ and Boolean algebra maps $\mathbf{B}(S) \to A$; to get back from a Boolean algebra map $\mathbf{B}(S) \to A$ to the function $S \to A$, simply compose the Boolean algebra map with the function $S \to \mathbf{B}(S)$ which interprets elements of $S$ as equivalence classes of atomic formulas in $\mathbf{B}(S)$.
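Here is a minimal sketch of the extension of an interpretation $f$ to formulas, taking $A$ to be the power set of a four-element set. The encoding of formulas as nested tuples, and the names `extend` and `ev`, are assumptions made purely for illustration.

```python
U = frozenset(range(4))   # the Boolean algebra A is the power set of U

def extend(f):
    """Extend an interpretation f: S -> P(U) of the letters to all formulas.

    Formulas are letters (strings), the constants 0 and 1, or tuples
    ("and", p, q), ("or", p, q), ("not", p).
    """
    def ev(phi):
        if isinstance(phi, str):
            return f[phi]
        if phi in (0, 1):
            return U if phi else frozenset()
        op, *args = phi
        if op == "and":
            return ev(args[0]) & ev(args[1])
        if op == "or":
            return ev(args[0]) | ev(args[1])
        if op == "not":
            return U - ev(args[0])
        raise ValueError(f"unknown connective: {op}")
    return ev

f = {"x": frozenset({0, 1}), "y": frozenset({1, 2}), "z": frozenset({3})}
ev = extend(f)

formula = ("or", ("and", "x", ("not", "y")), "z")   # (x ∧ ¬y) ∨ z
assert ev(formula) == frozenset({0, 3})

# Logically equivalent formulas (here, De Morgan) receive the same value in A.
assert ev(("not", ("and", "x", "y"))) == ev(("or", ("not", "x"), ("not", "y")))
```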

To get a better grip on $\mathbf{B}(S)$, let me pass to the Boolean ring picture (which, as we saw last time, is equivalent to the Boolean algebra picture). Here the primitive operations are addition and multiplication, so in this picture we build up “formulas” from letters using these operations (e.g., $(x + y) \cdot z$ and the like). In other words, the elements of $\mathbf{B}(S)$ can be considered as “polynomials” in the variables $x, y, z, \ldots$. Actually, there are some simplifying features of this polynomial algebra; for one thing, in Boolean rings we have idempotence. This means that $p^n = p$ for $n \geq 1$, and so a monomial term like $x^3 y^2$ reduces to its support $x y$. Since each letter appears in a support with exponent 0 or 1, it follows that there are $2^{|S|}$ possible supports or Boolean monomials, where $|S|$ denotes the cardinality of $S$.

Idempotence also implies, as we saw last time, that $b + b = 0$ for all elements $b \in \mathbf{B}(S)$, so that our polynomials, a priori $\mathbb{Z}$-linear combinations of monomials, are really $\mathbb{Z}_2$-linear combinations of Boolean monomials or supports. In other words, each element of $\mathbf{B}(S)$ is uniquely a linear combination

$\sum_{\sigma \in \mbox{supp}(S)} a_\sigma \sigma$ where $a_\sigma \in \{0, 1\},$

i.e., the set of supports $\mbox{supp}(S)$ forms a basis of $\mathbf{B}(S)$ as a $\mathbb{Z}_2$-vector space. Hence the cardinality of the free Boolean ring is $2^{|\mbox{supp}(S)|} = 2^{2^{|S|}}$.

• Remark: This gives an algorithm for checking logical equivalence of two Boolean algebra formulas: convert the formulas into Boolean ring expressions, and using distributivity, idempotence, etc., write out these expressions as Boolean polynomials, i.e., $\mathbb{Z}_2$-linear combinations of supports. The Boolean algebra formulas are equivalent if and only if the corresponding Boolean polynomials are equal.
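The algorithm in the remark is short enough to sketch directly. In the code below a Boolean polynomial is represented as a set of supports, with addition mod 2 given by symmetric difference; this representation, and the helper names `var`, `add`, `mul`, `NOT`, `AND`, `OR`, are choices made for the example. The translation uses the usual Boolean ring formulas $\neg p = 1 + p$, $p \wedge q = pq$, and $p \vee q = p + q + pq$.

```python
# A Boolean polynomial is a frozenset of monomials; a monomial is the
# frozenset of letters in its support. The empty monomial is the constant 1.
ZERO = frozenset()
ONE = frozenset({frozenset()})

def var(name):
    return frozenset({frozenset({name})})

def add(p, q):
    """Addition with coefficients mod 2: a monomial occurring twice cancels."""
    return p ^ q

def mul(p, q):
    """Multiply out by distributivity; idempotence makes supports combine by union."""
    result = ZERO
    for m in p:
        for n in q:
            result = result ^ frozenset({m | n})
    return result

def NOT(p):
    return add(ONE, p)

def AND(p, q):
    return mul(p, q)

def OR(p, q):
    return add(add(p, q), mul(p, q))

x, y, z = var("x"), var("y"), var("z")

# Equivalent formulas reduce to the same polynomial.
assert NOT(AND(x, y)) == OR(NOT(x), NOT(y))          # De Morgan
assert AND(x, OR(y, z)) == OR(AND(x, y), AND(x, z))  # distributivity
assert OR(x, NOT(x)) == ONE                          # excluded middle
# Inequivalent formulas reduce to different polynomials.
assert OR(x, y) != AND(x, y)
```

Because the supports form a basis, comparing the two sets of monomials for equality is all that is needed once both formulas have been expanded.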

But there is another way of understanding free Boolean algebras, via baby Stone duality. Namely, we have the power set representation

$i: \mathbf{B}(S) \stackrel{\sim}{\to} \hom(\mbox{Bool}(\mathbf{B}(S), \mathbf{2}), \mathbf{2})$

where $\mbox{Bool}(\mathbf{B}(S), \mathbf{2})$ is the set of Boolean algebra maps $\mathbf{B}(S) \to \mathbf{2}$. However, the freeness property says that these maps are in bijection with functions $S \to \mathbf{2}$. What are these functions? They are just truth-value assignments for the elements (atomic formulas, or variables) $x, y, z, \ldots \in S$; there are again $2^{|S|}$ many of these. This leads to the method of truth tables: each formula $b \in \mathbf{B}(S)$ induces (in one-one fashion) a function

$i(b): \mbox{Bool}(\mathbf{B}(S), \mathbf{2}) \to \mathbf{2}$

which takes a Boolean algebra map $\phi: \mathbf{B}(S) \to \mathbf{2}$, aka a truth-value assignment for the variables $x, y, z, \ldots$, to the element of $\{0, 1\}$ obtained by instantiating the assigned truth values $0, 1$ for the variables and evaluating the resulting Boolean expression for $b$ in $\mathbf{2}$. (In terms of power sets,

$\mathbf{B}(S) \cong P(\mbox{Bool}(\mathbf{B}(S), \mathbf{2}))$

identifies each equivalence class of formulas $b \in \mathbf{B}(S)$ with the set of truth-value assignments of variables which render the formula $b$ “true” in $\{0, 1\}$.) The fact that the representation $b \mapsto i(b)$ is injective means precisely that if formulas $b, c$ are inequivalent, then there is a truth-value assignment which renders one of them “true” and the other “false”, hence that they are distinguishable by truth tables.
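The truth-table method can be sketched as follows. The encoding is an assumption made for illustration: a formula is a Python function of a truth-value assignment (a dict from variable names to 0 or 1), and `truth_table` returns the subset of assignments that make it true, mirroring the power set representation above.

```python
from itertools import product

def truth_table(formula, variables):
    """Return the set of assignments (tuples of 0/1) that make `formula` true."""
    return {
        values
        for values in product((0, 1), repeat=len(variables))
        if formula(dict(zip(variables, values)))
    }

def equivalent(f, g, variables):
    """Two formulas are equivalent when they have the same truth table."""
    return truth_table(f, variables) == truth_table(g, variables)

def distinguishing_assignment(f, g, variables):
    """Return an assignment making exactly one of f, g true, or None."""
    difference = truth_table(f, variables) ^ truth_table(g, variables)
    return min(difference) if difference else None

VARS = ("x", "y")
implication = lambda v: (not v["x"]) or v["y"]        # x => y
contrapositive = lambda v: v["x"] == 0 or v["y"] == 1  # (not y) => (not x), rewritten
inclusive_or = lambda v: v["x"] or v["y"]
exclusive_or = lambda v: v["x"] != v["y"]

same = equivalent(implication, contrapositive, VARS)
witness = distinguishing_assignment(inclusive_or, exclusive_or, VARS)
```

Here `witness` is the assignment $x = 1, y = 1$, the only one on which inclusive and exclusive "or" disagree.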

• Remark: This is an instance of what is known as a completeness theorem in logic. On the syntactic side, we have a notion of provability of formulas (that $b$ is logically equivalent to $\top$, or $b = \top$ in $\mathbf{B}(S)$ if this is derivable from the Boolean algebra axioms). On the semantic side, each Boolean algebra homomorphism $\phi: \mathbf{B}(S) \to \mathbf{2}$ can be regarded as a model of $\mathbf{B}(S)$ in which each formula becomes true or false under $\phi$. The method of truth tables then says that there are enough models or truth-value assignments to detect provability of formulas, i.e., $b$ is provable if it is true when interpreted in any model $\phi$. This is precisely what is meant by a completeness theorem.

There are still other ways of thinking about this. Let $\phi: B \to \mathbf{2}$ be a Boolean algebra map, aka a model of $B$. This model is completely determined by

• The maximal ideal $\phi^{-1}(0)$ in the Boolean ring $B$, or
• The maximal filter or ultrafilter $\phi^{-1}(1)$ in $B$.

Now, as we saw last time, in the case of finite Boolean algebras, each (maximal) ideal is principal, i.e., of the form $\{x \in B: x \leq b\}$ for some $b \in B$. Dually, each (ultra)filter is principal, i.e., of the form $\{x \in B: c \leq x\}$ for some $c = \neg b \in B$. The maximality of the ultrafilter means that there is no nonzero element in $B$ strictly smaller than $c$; we say that $c$ is an atom in $B$ (NB: not to be confused with atomic formula!). So, we can also say

• A model of a finite Boolean algebra $B$ is specified by a unique atom of $B$.

Thus, baby Stone duality asserts a Boolean algebra isomorphism

$i: B \to P(\mbox{Atoms}(B)).$

Let’s give an example: consider the free Boolean algebra on three elements $x, y, z$. If you like, draw a Venn diagram generated by three planar regions labeled by $x, y, z$. The atoms or smallest nonzero elements of the free Boolean algebra are then represented by the $2^3 = 8$ regions demarcated by the Venn diagram. That is, the disjoint regions are labeled by the eight atoms

$x \wedge y \wedge z, x \wedge y \wedge \neg z, x \wedge \neg y \wedge z, x \wedge \neg y \wedge \neg z,$

$\neg x \wedge y \wedge z, \neg x \wedge y \wedge \neg z, \neg x \wedge \neg y \wedge z, \neg x \wedge \neg y \wedge \neg z.$

According to baby Stone duality, any element in the free Boolean algebra (with $2^8 = 256$ elements) is uniquely expressible as a disjoint union of these atoms. Another way of saying this is that the atoms form a basis (alternative to Boolean monomials) of the free Boolean algebra as $\mathbb{Z}_2$-vector space. For example, as an exercise one may calculate

$(x \Rightarrow y) \wedge z = x \wedge y \wedge z + \neg x \wedge y \wedge z + \neg x \wedge \neg y \wedge z.$

The unique expression of an element $b \in \mathbf{B}(S)$ (where $b$ is given by a Boolean formula) as a $\mathbb{Z}_2$-linear combination of atoms is called the disjunctive normal form of the formula. So yet another way of deciding when two Boolean formulas are logically equivalent is to put them both in disjunctive normal form and check whether the resulting expressions are the same. (It’s basically the same idea as checking equality of Boolean polynomials, except we are using a different vector space basis.)
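As a check on the example above, here is a small sketch that computes the disjunctive normal form of a formula in three variables by listing the atoms on which it is true. The function names and the string labels for atoms are hypothetical conveniences.

```python
from itertools import product

VARIABLES = ("x", "y", "z")

def atom_label(values):
    """Name the atom picked out by a truth-value assignment."""
    return " ∧ ".join(
        name if bit else "¬" + name for name, bit in zip(VARIABLES, values)
    )

def disjunctive_normal_form(formula):
    """List the atoms (in the order used in the text) lying below `formula`."""
    return [
        atom_label(values)
        for values in product((1, 0), repeat=len(VARIABLES))
        if formula(*values)
    ]

# (x => y) ∧ z, with x => y written as (not x) or y.
dnf = disjunctive_normal_form(lambda x, y, z: ((not x) or y) and z)
```

The list `dnf` contains exactly the three atoms $x \wedge y \wedge z$, $\neg x \wedge y \wedge z$, $\neg x \wedge \neg y \wedge z$ from the displayed equation.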

All of the above applies not just to free (finite) Boolean algebras, but to general finite Boolean algebras. So, suppose you have a Boolean algebra $B$ which is generated by finitely many elements $x_1, x_2, \ldots, x_n \in B$. Generated means that every element in $B$ can be expressed as a Boolean combination of the generating elements. In other words, “generated” means that if we consider the inclusion function $S = \{x_1, \ldots, x_n\} \hookrightarrow B$, then the unique Boolean algebra map $\phi: \mathbf{B}(S) \to B$ which extends the inclusion is a surjection. Thinking of $\phi$ as a Boolean ring map, we have an ideal $I = \phi^{-1}(0)$, and because $\phi$ is a surjection, it induces a ring isomorphism

$B \cong \mathbf{B}(S)/I.$

The elements of $I$ can be thought of as equivalence classes of formulas which become false in $B$ under the interpretation $\phi$. Or, we could just as well (and it may be more natural to) consider instead the filter $F = \phi^{-1}(1)$ of formulas in $\mathbf{B}(S)$ which become true under the interpretation $\phi$. In any event, what we have is a propositional language $\mathbf{B}(S)$ consisting of classes of formulas, and a filter $F \subseteq \mathbf{B}(S)$ consisting of formulas, which can be thought of as theorems of $B$. Often one may find a filter $F$ described as the smallest filter which contains certain chosen elements, which one could then call axioms of $B$.

In summary, any propositional theory (which by definition consists of a set $S$ of propositional variables together with a filter $F \subseteq \mathbf{B}(S)$ of the free Boolean algebra, whose elements are called theorems of the theory) yields a Boolean algebra $B = \mathbf{B}(S)/F$, where dividing out by $F$ means we take equivalence classes of elements of $\mathbf{B}(S)$ under the equivalence relation $b \sim c$ defined by the condition “$b \Leftrightarrow c$ belongs to $F$”. The partial order on equivalence classes [$b$] is defined by [$b$] $\leq$ [$c$] iff $b \Rightarrow c$ belongs to $F$. The Boolean algebra $B$ defined in this way is called the Lindenbaum algebra of the propositional theory.

Conversely, any Boolean algebra $B$ with a specified set of generators $x_1, \ldots, x_n$ can be thought of as the Lindenbaum algebra of the propositional theory obtained by taking the $x_i$ as propositional variables, together with the filter $\phi^{-1}(1)$ obtained from the induced Boolean algebra map $\phi: \mathbf{B}(S) \to B$. A model of the theory should be a Boolean algebra map $\mathbf{B}(S) \to \mathbf{2}$ which interprets the formulas of $\mathbf{B}(S)$ as true or false, but in such a way that the theorems of the theory (the elements of the filter) are all interpreted as “true”. In other words, a model is the same thing as a Boolean algebra map

$B \cong \mathbf{B}(S)/F \to \mathbf{2},$

i.e., we may identify a model of a propositional theory with a Boolean algebra map $f: B \to \mathbf{2}$ out of its Lindenbaum algebra.

So the set of models is the set $\mbox{Bool}(B, \mathbf{2})$, and now baby Stone duality, which gives a canonical isomorphism

$i: B \cong \hom(\mbox{Bool}(B, \mathbf{2}), \mathbf{2}),$

implies the following

Completeness theorem: If a formula of a finite propositional theory is “true” when interpreted under any model $\phi$ of the theory, then the formula is provable (is a theorem of the theory).

Proof: Let $B$ be the Lindenbaum algebra of the theory, and let $b = [p] \in B$ be the class of formulas provably equivalent to a given formula $p$ under the theory. The Boolean algebra isomorphism $i$ takes an element $b \in B$ to the map $\phi \mapsto \phi(b)$. If $\phi(b) = 1$ for all models $\phi$, i.e., if $i(b) = 1$, then $b = 1$. But then [$p$] $= 1$, i.e., $p \in F$, the filter of provable formulas. $\Box$
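To make this concrete, here is a minimal sketch for a hypothetical theory with two variables $x, y$ and a single axiom $x \Rightarrow y$. It works entirely on the semantic side: the class of a formula is represented, via baby Stone duality, by the set of models in which it is true, and the completeness theorem is what licenses reading "true in every model" as "theorem". The names `AXIOMS`, `models`, `lindenbaum_class` and `is_theorem` are illustrative assumptions.

```python
from itertools import product

VARIABLES = ("x", "y")
# Hypothetical axiom generating the filter of theorems: x => y.
AXIOMS = [lambda x, y: (not x) or y]

def models():
    """Truth-value assignments under which every axiom is true."""
    return [
        values
        for values in product((0, 1), repeat=len(VARIABLES))
        if all(axiom(*values) for axiom in AXIOMS)
    ]

def lindenbaum_class(formula):
    """Represent [formula] by the set of models in which the formula is true."""
    return frozenset(m for m in models() if formula(*m))

def is_theorem(formula):
    """True when the formula holds in every model, i.e. when its class is 1."""
    return lindenbaum_class(formula) == frozenset(models())

number_of_models = len(models())
# In this theory x and x ∧ y are provably equivalent, so their classes agree.
same_class = lindenbaum_class(lambda x, y: x) == lindenbaum_class(lambda x, y: x and y)
```

With three models, the Lindenbaum algebra of this theory has $2^3 = 8$ elements, compared with $2^4 = 16$ for the free Boolean algebra on two generators.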

In summary, we have developed a rich vocabulary in which Boolean algebras are essentially the same things as propositional theories, and where models are in natural bijection with maximal ideals in the Boolean ring, or ultrafilters in the Boolean algebra, or [in the finite case] atoms in the Boolean algebra. But as we will soon see, ultrafilters have a significance far beyond their application in the realm of Boolean algebras; in particular, they crop up in general studies of topology and convergence. This is in fact a vital clue; the key point is that the set of models or ultrafilters $\mbox{Bool}(B, \mathbf{2})$ carries a canonical topology, and the interaction between Boolean algebras and topological spaces is what Stone duality is all about.

In this installment, I will introduce the concept of Boolean algebra, one of the main stars of this series, and relate it to concepts introduced in previous lectures (distributive lattice, Heyting algebra, and so on). Boolean algebra is the algebra of classical propositional calculus, and so has an abstract logical provenance; but one of our eventual goals is to show how any Boolean algebra can also be represented in concrete set-theoretic (or topological) terms, as part of a powerful categorical duality due to Stone.

There are lots of ways to define Boolean algebras. Some definitions were for a long time difficult conjectures (like the Robbins conjecture, established only in the last ten years or so with the help of computers) — testament to the richness of the concept. Here we’ll discuss just a few definitions. The first is a traditional one, and one which is pretty snappy:

A Boolean algebra is a distributive lattice in which every element has a complement.

(If $X$ is a lattice and $x \in X$, a complement of $x$ is an element $y$ such that $x \wedge y = 0$ and $x \vee y = 1$. A lattice is said to be complemented if every element has a complement. Observe that the notions of complement and complemented lattice are manifestly self-dual. Since the notion of distributive lattice is self-dual, so therefore is the notion of Boolean algebra.)

• Example: Probably almost everyone reading this knows the archetypal example of a Boolean algebra: a power set $PX$, ordered by subset inclusion. As we know, this is a distributive lattice, and the complement $S^c$ of a subset $S \subseteq X$ satisfies $S \cap S^c = \emptyset$ and $S \cup S^c = X$.
• Example: Also well known is that the Boolean algebra axioms mirror the usual interactions between conjunction $\wedge$, disjunction $\vee$, and negation $\neg$ in ordinary classical logic. In particular, given a theory $\mathbf{T}$, there is a preorder whose elements are sentences (closed formulas) $p$ of $\mathbf{T}$, ordered by $p \leq q$ if the entailment $p \to q$ is provable in $\mathbf{T}$ using classical logic. By passing to logical equivalence classes ($p \equiv q$ iff $p \leftrightarrow q$ in $\mathbf{T}$), we get a poset with meets, joins, and complements satisfying the Boolean algebra axioms. This is called the Lindenbaum algebra of the theory $\mathbf{T}$.

Exercise: Give an example of a complemented lattice which is not distributive.

As a possible leading hint for the previous exercise, here is a first order of business:

Proposition: In a distributive lattice, complements of elements are unique when they exist.

Proof: If both $b$ and $c$ are complementary to $a$, then $b = b \wedge 1 = b \wedge (a \vee c) = (b \wedge a) \vee (b \wedge c) = 0 \vee (b \wedge c) = b \wedge c$. Since $b = b \wedge c$, we have $b \leq c$. Similarly $c = b \wedge c$, so $b = c. \Box$

The definition of Boolean algebra we have just given underscores its self-dual nature, but we gain more insight by packaging it in a way which stresses adjoint relationships — Boolean algebras are the same things as special types of Heyting algebras (recall that a Heyting algebra is a lattice which admits an implication operator satisfying an adjoint relationship with the meet operator).

Theorem: A lattice is a Boolean algebra if and only if it is a Heyting algebra in which either of the following properties holds:

1. $(a \wedge x \leq y)$ if and only if $(a \leq \neg x \vee y)$
2. $\neg \neg x = x$ for all elements $x$

Proof: First let $X$ be a Boolean algebra, and let $x^c$ denote the complement of an element $x \in X$. Then I claim that $a \wedge x \leq y$ if and only if $a \leq x^c \vee y$, proving that $X$ admits an implication $x \Rightarrow y = x^c \vee y$. Then, taking $y = 0$, it follows that $\neg x := (x \Rightarrow 0) = x^c \vee 0 = x^c$, whence 1. follows. Also, since (by definition of complement) $x$ is the complement of $y$ if and only if $y$ is the complement of $x$, we have $x^{c c} = x$, whence 2. follows.

[Proof of claim: if $a \leq x^c \vee y$, then $x \wedge a \leq x \wedge (x^c \vee y) = (x \wedge x^c) \vee (x \wedge y) \leq 0 \vee y = y$. On the other hand, if $x \wedge a \leq y$, then $a = 1 \wedge a \leq (x^c \vee x) \wedge (x^c \vee a) = x^c \vee (x \wedge a) \leq x^c \vee y$. This completes the proof of the claim and of the forward implication.]

In the other direction, given a lattice which satisfies 1., it is automatically a Heyting algebra (with implication $\neg x \vee y$). In particular, it is distributive. From $\neg x \leq \neg x \vee 0$, we have (from 1.) $x \wedge \neg x \leq 0$; since $0 \leq x \wedge \neg x$ is automatic by definition of $0 = \bot$, we get $0 = x \wedge \neg x$. From $1 \wedge x \leq x$, we have also (from 1.) that $1 \leq \neg x \vee x$; since $\neg x \vee x \leq 1$ is automatic by definition of $1$, we have $\neg x \vee x = 1$. Thus under 1., every element $x$ has a complement $\neg x$.

On the other hand, suppose $X$ is a Heyting algebra satisfying 2.: $\neg \neg x = x$. As above, we know $x \wedge \neg x = 0$. By the corollary below, we also know the function $\neg: X \to X$ takes 0 to 1 and joins to meets (De Morgan law); since condition 2. is that $\neg$ is its own inverse, it follows that $\neg$ also takes meets to joins. Hence $\neg x \vee x = \neg x \vee \neg \neg x = \neg(x \wedge \neg x) = \neg 0 = 1$. Thus for a Heyting algebra which satisfies 2., every element $x$ has a complement $\neg x$. This completes the proof. $\Box$

• Exercise: Show that Boolean algebras can also be characterized as meet-semilattices $X$ equipped with an operation $\neg: X \to X$ for which $a \wedge x \leq y$ if and only if $a \leq \neg(x \wedge \neg y)$.

The proof above invoked the De Morgan law $\neg(x \vee y) = \neg x \wedge \neg y$. The claim is that this De Morgan law (not the other $\neg(x \wedge y) = \neg x \vee \neg y$!) holds in a general Heyting algebra — the relevant result was actually posed as an exercise from the previous lecture:

Lemma: For any element $c$ of a Heyting algebra $X$, the function $- \Rightarrow c: X \to X$ is an order-reversing map (equivalently, an order-preserving map $X^{op} \to X$, or an order-preserving map $X \to X^{op}$). It is adjoint to itself, in the sense that $- \Rightarrow c: X^{op} \to X$ is right adjoint to $- \Rightarrow c: X \to X^{op}$.

Proof: First, we show that $a \leq b$ in $X$ (equivalently, $b \leq a$ in $X^{op}$) implies $(b \Rightarrow c) \leq (a \Rightarrow c)$. But this conclusion holds iff $(b \Rightarrow c) \wedge a \leq c$, which is clear from $(b \Rightarrow c) \wedge a \leq (b \Rightarrow c) \wedge b \leq c$. Second, the adjunction holds because

$(b \Rightarrow c) \leq a$ in $X^{op}$ if and only if

$a \leq (b \Rightarrow c)$ in $X$ if and only if

$a \wedge b \leq c$ in $X$ if and only if

$b \wedge a \leq c$ in $X$ if and only if

$b \leq (a \Rightarrow c)$ in $X. \Box$

Corollary: $- \Rightarrow c: X^{op} \to X$ takes any inf which exists in $X^{op}$ to the corresponding inf in $X$. Equivalently, it takes any sup in $X$ to the corresponding inf in $X$, i.e., $(\bigvee_{s \in S} s) \Rightarrow c = \bigwedge_{s \in S} (s \Rightarrow c)$. (In particular, this applies to finite joins in $X$, and in particular, it applies to the case $c = 0$, where we conclude, e.g., the De Morgan law $\neg(x \vee y) = \neg x \wedge \neg y$.)

• Remark: If we think of sups as sums and infs as products, then we can think of implications $x \Rightarrow y$ as behaving like exponentials $y^x$. Indeed, our earlier result that $x \Rightarrow (-)$ preserves infs $\bigwedge_{s \in S} y_s$ can then be recast in exponential notation as saying $(\prod_{s \in S} y_s)^x = \prod_{s \in S} (y_s)^x$, and our present corollary that $(- \Rightarrow y)$ takes sups to infs can then be recast as saying $y^{\sum_{s \in S} x_s} = \prod_{s \in S} y^{x_s}$. Later we will state another exponential law for implication. It is correct to assume that this is no notational accident!

Let me reprise part of the lemma (in the case $c = 0$), because it illustrates a situation which comes up over and over again in mathematics. In part it asserts that $\neg = (-)\Rightarrow 0: X \to X$ is order-reversing, and that there is a three-way equivalence:

$a \leq \neg b$ if and only if $a \wedge b = 0$ if and only if $b \leq \neg a$.

This situation is an instance of what is called a “Galois connection” in mathematics. If $X$ and $Y$ are posets (or even preorders), a Galois connection between them consists of two order-reversing functions $f: X \to Y$, $g: Y \to X$ such that for all $x \in X, y \in Y$, we have $y \leq f(x)$ if and only if $x \leq g(y)$. (It’s actually an instance of an adjoint pair: if we consider $f$ as an order-preserving map $X \to Y^{op}$ and $g$ an order-preserving map $Y^{op} \to X$, then $f(x) \leq y$ in $Y^{op}$ if and only if $x \leq g(y)$ in $X$.)

Here are some examples:

1. The original example arises of course in Galois theory. If $k$ is a field and $k \subseteq E$ is a finite Galois extension with Galois group $G = Gal(E/k)$ (of field automorphisms $g: E \to E$ which fix the elements belonging to $k$), then there is a Galois connection consisting of maps $Aut_{(-)}(E): PE \to PG$ and $Fix: PG \to PE$. This works as follows: to each subset $S \subseteq E$, define $Aut_S(E)$ to be $\{g \in G: g(s) = s \mbox{ for all } s \in S \}$. In the other direction, to each subset $T \subseteq G$, define $Fix(T)$ to be $\{x \in E: g(x) = x \mbox{ for all } g \in T\}$. Both $Aut_{(-)}(E)$ and $Fix(-)$ are order-reversing (for example, the larger the subset $T \subseteq G$, the more stringent the conditions for an element $x \in E$ to belong to $Fix(T)$). Moreover, we have

$S \subseteq Fix(T)$ iff ($g(x) = x$ for all $x \in S, g \in T$) iff $T \subseteq Aut_S(E)$

so we do get a Galois connection. It is moreover clear that for any $T \subseteq G$, $Fix(T)$ is an intermediate subfield between $k$ and $E$, and for any $S \subseteq E$, $Aut_S(E)$ is a subgroup of $G$. A principal result of Galois theory is that $Fix(-)$ and $Aut_{(-)}(E)$ are inverse to one another when restricted to the lattice of subgroups of $G$ and the lattice of fields intermediate between $k$ and $E$. Such a bijective correspondence induced by a Galois connection is called a Galois correspondence.

2. Another basic Galois connection arises in algebraic geometry, between subsets $J \subseteq k[x_1, \ldots, x_n]$ (of a polynomial algebra over a field $k$) and subsets $V \subseteq k^n$. Given $J$, define $Z(J)$ (the zero locus of $J$) to be $\{(a_1, \ldots, a_n): f(a_1, \ldots, a_n) = 0 \mbox{ for each polynomial } f \in J\}$. On the other hand, define $I(V)$ (the ideal of $V$) to be $\{f \in k[x_1, \ldots, x_n]: f(a) = 0 \mbox{ for all } a = (a_1, \ldots, a_n) \in V\}$. As in the case of Galois theory above, we clearly have a three-way equivalence

$V \subseteq Z(J)$ iff ($f(a) = 0$ for all $a \in V, f \in J$) iff $J \subseteq I(V)$

so that $Z(-)$, $I(-)$ define a Galois connection between power sets (of the $n$-variable polynomial algebra and of $n$-dimensional affine space $k^n$). One defines an (affine algebraic) variety $V \subseteq k^n$ to be a zero locus of some set of polynomials. Then, on very general grounds (see below), any variety is the zero locus of its ideal. On the other hand, notice that $I(V)$ is an ideal of the polynomial algebra. Not every ideal $I$ of the polynomial algebra is the ideal of its zero locus, but according to the famous Hilbert Nullstellensatz, when $k$ is algebraically closed, those ideals $I$ equal to their radical $rad(I) = \{f \in k[x_1, \ldots, x_n]: f^m \in I \mbox{ for some } m \geq 1\}$ are. Thus, in that case, $Z(-)$ and $I(-)$ become inverse to one another when restricted to the lattice of varieties and the lattice of radical ideals, by the Nullstellensatz: there is a Galois correspondence between these objects.

3. Both of the examples above are particular cases of a very general construction. Let $X, Y$ be sets and let $R \subseteq X \times Y$ be any relation between them. Then set up a Galois connection which in one direction takes a subset $S \subseteq X$ to $S \backslash R := \{y \in Y: (x, y) \in R \mbox{ for all } x \in S\}$, and in the other takes $T \subseteq Y$ to $R/T := \{x \in X: (x, y) \in R \mbox{ for all } y \in T\}$. Once again we have a three-way equivalence

$S \subseteq R/T$ iff $S \times T \subseteq R$ iff $T \subseteq S \backslash R$.

There are tons of examples of this flavor.
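For finite sets the general construction of example 3 is easy to experiment with. The sketch below uses a hypothetical relation ("$y$ divides $x$") between two small sets of integers; `right_polar` plays the role of $S \backslash R$ and `left_polar` the role of $R/T$, and both names are assumptions made for this illustration.

```python
from itertools import chain, combinations

X = {1, 2, 3, 4, 6}
Y = {2, 3, 5}
# Hypothetical relation: (x, y) is in R when y divides x.
R = {(x, y) for x in X for y in Y if x % y == 0}

def right_polar(S):
    """S \\ R: the elements of Y related to every element of S."""
    return {y for y in Y if all((x, y) in R for x in S)}

def left_polar(T):
    """R / T: the elements of X related to every element of T."""
    return {x for x in X if all((x, y) in R for y in T)}

def subsets(universe):
    """All subsets of a finite set, as Python sets."""
    items = sorted(universe)
    return [
        set(c)
        for c in chain.from_iterable(
            combinations(items, k) for k in range(len(items) + 1)
        )
    ]

def three_way_equivalence_holds():
    """Check S <= R/T iff S x T <= R iff T <= S\\R for all S, T."""
    for S in subsets(X):
        for T in subsets(Y):
            statements = {
                S <= left_polar(T),
                all((x, y) in R for x in S for y in T),
                T <= right_polar(S),
            }
            if len(statements) != 1:
                return False
    return True

galois = three_way_equivalence_holds()
closure_of_4 = left_polar(right_polar({4}))
```

Here `closure_of_4` is the set of elements of `X` divisible by every member of `Y` that divides 4, namely `{2, 4, 6}`.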

As indicated above, a Galois connection between posets $X, Y$ is essentially the same thing as an adjoint pair between the posets $X, Y^{op}$ (or between $X^{op}, Y$ if you prefer; Galois connections are after all symmetric in $X, Y$). I would like to record a few basic results about Galois connections/adjoint pairs.

Proposition:

1. Given order-reversing maps $f: X \to Y$, $g: Y \to X$ which form a Galois connection, we have $x \leq g f(x)$ for all $x \in X$ and $y \leq f g(y)$ for all $y \in Y$. (Given poset maps $f, g$ which form an adjoint pair $f \dashv g$, we have $x \leq g f(x)$ for all $x \in X$ and $f g(y) \leq y$ for all $y \in Y$.)
2. Given a Galois connection as above, $f(x) = f g f(x)$ for all $x \in X$ and $g(y) = g f g(y)$ for all $y \in Y$. (Given an adjoint pair $f \dashv g$ as above, the same equations hold.) Therefore a Galois connection $(f, g)$ induces a Galois correspondence between the elements of the form $f(x)$ and the elements of the form $g(y)$.

Proof: (1.) It suffices to prove the statements for adjoint pairs. But under the assumption $f \dashv g$, $x \leq g f(x)$ if and only if $f(x) \leq f(x)$, which is certainly true. The other statement is dual.

(2.) Again it suffices to prove the equations for the adjoint pair. Applying the order-preserving map $f$ to $x \leq g f(x)$ from 1. gives $f(x) \leq f g f(x)$. Applying $f g(y) \leq y$ from 1. to $y = f(x)$ gives $f g f(x) \leq f(x)$. Hence $f(x) = f g f(x)$. The other equation is dual. $\Box$

Incidentally, the equations of 2. show why an algebraic variety $V$ is the zero locus of its ideal (see example 2. above): if $V = Z(S)$ for some set of polynomials $S$, then $V = Z(S) = Z I Z(S) = Z I(V)$. They also show that for any element $x$ in a Heyting algebra, we have $\neg \neg \neg x = \neg x$, even though $\neg \neg y = y$ is in general false.
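The last remark can be checked by hand on the smallest Heyting algebra that is not Boolean, a three-element chain. In the sketch below the chain is encoded (as an assumption of this example) by the integers $0 < 1 < 2$, with bottom `0` and top `2`; meets are `min`, and `implies` is computed directly from its adjoint characterization.

```python
CHAIN = [0, 1, 2]   # hypothetical encoding: bottom = 0, middle = 1, top = 2

def implies(a, b):
    """The largest c in the chain with (c meet a) <= b."""
    return max(c for c in CHAIN if min(c, a) <= b)

def neg(a):
    """Negation in a Heyting algebra: a => bottom."""
    return implies(a, 0)

triple_negation_collapses = all(neg(neg(neg(a))) == neg(a) for a in CHAIN)
double_negation_of_middle = neg(neg(1))

# The three-way equivalence: a <= neg b  iff  a meet b = 0  iff  b <= neg a.
galois_connection_holds = all(
    (a <= neg(b)) == (min(a, b) == 0) == (b <= neg(a))
    for a in CHAIN
    for b in CHAIN
)
```

The middle element satisfies $\neg \neg 1 = 2$ in this encoding (that is, its double negation is the top element), so double negation is not the identity, while $\neg \neg \neg a = \neg a$ holds for all three elements.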

Let $(f, g)$ be a Galois connection (or $f \dashv g$ an adjoint pair). By the proposition, $c = gf: X \to X$ is an order-preserving map with the following properties:

$x \leq c(x)$ for all $x \in X$

$c c(x) = c(x)$ for all $x \in X$.

Poset maps $c: X \to X$ with these properties are called closure operators. We have earlier discussed examples of closure operators: if for instance $G$ is a group, then the operator $c: PG \to PG$ which takes a subset $S \subseteq G$ to the subgroup generated by $S$ is a closure operator. Or, if $X$ is a topological space, then the operator $c: PX \to PX$ which takes a subset $S \subset X$ to its topological closure $\bar{S}$ is a closure operator. Or, if $X$ is a poset, then the operator $c: PX \to PX$ which takes $S \subseteq X$ to $\{a \in X: a \leq s \mbox{ for some } s \in S\}$ is a closure operator. Examples like these can be multiplied at will.
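The last of these examples can be verified mechanically on a small poset. The sketch below takes, as a hypothetical example, the divisors of 12 ordered by divisibility, and checks that the down-closure operator is order-preserving, satisfies $S \subseteq c(S)$, and satisfies $c c(S) = c(S)$.

```python
from itertools import chain, combinations

X = [1, 2, 3, 4, 6, 12]   # hypothetical poset: divisors of 12 under divisibility

def leq(a, b):
    """a <= b in the poset means a divides b."""
    return b % a == 0

def down_closure(S):
    """All elements lying below some element of S."""
    return frozenset(a for a in X if any(leq(a, s) for s in S))

def subsets(items):
    """All subsets of a finite list, as frozensets."""
    return [
        frozenset(c)
        for c in chain.from_iterable(
            combinations(items, k) for k in range(len(items) + 1)
        )
    ]

ALL = subsets(X)
extensive = all(S <= down_closure(S) for S in ALL)
idempotent = all(down_closure(down_closure(S)) == down_closure(S) for S in ALL)
monotone = all(
    down_closure(S) <= down_closure(T) for S in ALL for T in ALL if S <= T
)
# Fixed points of the closure operator coincide with its image.
fixed_points = {S for S in ALL if down_closure(S) == S}
image = {down_closure(S) for S in ALL}
```

The fixed points here are the down-closed subsets of the poset, and the final two sets agree, as they do for every closure operator.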

One virtue of closure operators is that they give a useful means of constructing new posets from old. Specifically, if $c: X \to X$ is a closure operator, then a fixed point of $c$ (or a $c$-closed element of $X$) is an element $x$ such that $c(x) = x$. The collection $Fix(c)$ of fixed points is partially ordered by the order in $X$. For example, the lattice of fixed points of the operator $c: PG \to PG$ above is the lattice of subgroups of $G$. For any closure operator $c$, notice that $Fix(c)$ is the same as the image $c(X)$ of $c$.

One particular use is that the fixed points of the double negation closure $\neg \neg: X \to X$ on a Heyting algebra $X$ form a Boolean algebra $Fix(\neg\neg)$, and the map $\neg \neg: X \to Fix(\neg \neg)$ is a Heyting algebra map. This is not trivial! And it gives a means of constructing some rather exotic Boolean algebras (“atomless Boolean algebras”) which may not be so familiar to many readers.

The following exercises are in view of proving these results. If no one else does, I will probably give solutions next time or sometime soon.

Exercise: If $X$ is a Heyting algebra and $x, y, z \in X$, prove the “exponential law” $((x \wedge y) \Rightarrow z) = (x \Rightarrow (y \Rightarrow z))$. Conclude that $\neg(x \wedge y) = (y \Rightarrow \neg x) = (x \Rightarrow \neg y)$.

Exercise: We have seen that $(x \Rightarrow y) \wedge x \leq y$ in a Heyting algebra. Use this to prove $(x \Rightarrow y) \wedge (y \Rightarrow z) \leq (x \Rightarrow z)$.

Exercise: Show that double negation $\neg \neg: X \to X$ on a Heyting algebra preserves finite meets. (The inequality $\neg \neg(x \wedge y) \leq \neg \neg x \wedge \neg \neg y$ is easy. The reverse inequality takes more work; try using the previous two exercises.)

Exercise: If $c: X \to X$ is a closure operator, show that the inclusion map $i: Fix(c) \hookrightarrow X$ is right adjoint to the projection $c: X \to Fix(c)$ to the image of $c$. Conclude that meets of elements in $Fix(\neg \neg)$ are calculated as they would be as elements in $X$, and also that $\neg \neg: X \to Fix(\neg \neg)$ preserves joins.

Exercise: Show that the fixed points of the double negation operator on a topology (as Heyting algebra) are the regular open sets, i.e., those open sets equal to the interior of their closure. Give some examples of non-regular open sets. Incidentally, is the lattice you get by taking the opposite of a topology also a Heyting algebra?

In our last installment in this series on Stone duality, we introduced the notion of Heyting algebra, which captures the basic relationships between the logical connectives “and”, “or”, and “implies”. Our discussion disclosed a fundamental relationship between distributive laws and the algebra of implication, which we put to work to discover the structure of the “internal Heyting algebra logic” of a topology.

I’d like to pause and reflect on the general technique we used to establish this relationship; like the Yoneda principle and the Principle of Duality, it comes up with striking frequency, and so it will be useful for us to give it a name. As it turns out, this particular proof technique is analogous to the way adjoints are used in linear algebra. Such analogies go all the way back to work of C. S. Peirce, who like Boole was a great pioneer in the discovery of relationships between algebra and logic. At a deeper level, similar analogies were later rediscovered in category theory, and are connected with some of the most potent ideas category theory has to offer.

Our proof that meets distribute over sups in the presence of an implication operator is an example of this technique. Here is another example of similar flavor.

Theorem: In a Heyting algebra $X$, the operator $p \Rightarrow -: X \to X$ preserves any infs which happen to exist in $X$, for any element $p \in X$. [In particular, this operator is a morphism of meet-semilattices, i.e., $(p \Rightarrow (q \wedge r)) = ((p \Rightarrow q) \wedge (p \Rightarrow r))$, and $(p \Rightarrow 1) = 1$.]

Proof: Suppose that $S \subseteq X$ has an inf, which here will be denoted $\bigwedge_{s \in S} s$. Then for all $a \in X$, we have

$a \leq p \Rightarrow (\bigwedge_{s \in S} s)$ if and only if

$a \wedge p \leq \bigwedge_{s \in S} s$ if and only if

(for all $s \in S$, $a \wedge p \leq s$) if and only if

for all $s \in S$, $a \leq p \Rightarrow s$.

By the defining property of inf, these logical equivalences show that $p \Rightarrow (\bigwedge_{s \in S} s)$ is indeed the inf of the subset $\{p \Rightarrow s: s \in S\}$, or in other words that $p \Rightarrow (\bigwedge_{s \in S} s) = \bigwedge_{s \in S} p \Rightarrow s$, as desired. $\Box \,$

In summary, what we did in this proof is “slide” the operator $p \Rightarrow -$ on the right of the inequality over to the operator $- \wedge p$ on the left, then invoke the defining property of infs, and then slide back to $p \Rightarrow -$ on the right. This sliding trick is analogous to how adjoint mappings work in linear algebra.
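The adjunction behind the sliding trick, and the theorem it proves, can be tested on the Heyting algebra of open sets of a small space. The five-element topology on $\{a, b, c\}$ below is a hypothetical example; `implies(p, q)` is computed as the union of all open sets $w$ with $w \wedge p \leq q$, which is the largest such open set because intersections distribute over unions.

```python
# Hypothetical topology on the points a, b, c: five open sets.
OPENS = [frozenset(s) for s in ("", "a", "b", "ab", "abc")]
TOP = frozenset("abc")

def implies(p, q):
    """The largest open set w with (w & p) <= q."""
    result = frozenset()
    for w in OPENS:
        if w & p <= q:
            result |= w
    return result

# The adjunction: a ∧ p <= q  iff  a <= (p => q).
adjunction_holds = all(
    ((a & p) <= q) == (a <= implies(p, q))
    for a in OPENS
    for p in OPENS
    for q in OPENS
)

# The theorem: p => (q ∧ r) = (p => q) ∧ (p => r), and p => 1 = 1.
preserves_binary_meets = all(
    implies(p, q & r) == implies(p, q) & implies(p, r)
    for p in OPENS
    for q in OPENS
    for r in OPENS
)
preserves_top = all(implies(p, TOP) == TOP for p in OPENS)
```

In this space the pseudocomplement of $\{a\}$ is $\{b\}$, so the algebra is a genuine Heyting algebra in which negation is not set-theoretic complement.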

In fact, everything we have done so far with posets can be translated in terms of matrix algebra, provided that our matrix entries, instead of being real or complex numbers, are truth values $0, 1$ ($1$ for “true”, $0$ for “false”). These truth values are added and multiplied in the way familiar from truth tables, with join playing the role of addition and meet playing the role of multiplication. Indeed, the lattice $\mathbf{2} = \{0, 1\}$ is a very simple distributive lattice, and so most of the familiar arithmetic properties of addition and multiplication (associativity, commutativity, distributivity) do carry over, which is all we need to carry out the most basic aspects of matrix algebra. However, observe that $1$ has no additive inverse (for here $1 + 1 = 1 \vee 1 = 1$); the type of structure we are dealing with is often called a “rig” (like a ring, but without assuming negatives). On the other hand, this lattice is, conveniently, a sup-lattice, thinking of sups as arbitrary sums, whether finitary or infinitary.

Peirce recognized that a relation can be classified by a truth-valued matrix. Take for example a binary relation on a set $X$, i.e., a subset $R \subseteq X \times X$. We can imagine each point $(x, y) \in X \times X$ as a pixel in the plane, and highlight $R$ by lighting up just those pixels which belong to $R$. This is the same as giving an $(X \times X)$-matrix $R(\cdot, \cdot)$, with rows indexed by elements $y$ and columns by elements $x$, where the $(y, x)$-entry $R(y, x)$ is $1$ (on) if $(x, y)$ is in $R$, and $0$ (off) if not. In a similar way, any relation $R \subseteq X \times Y$ is classified by a $(Y \times X)$-matrix whose entries are truth values.

As an example, the identity matrix has a $1$ at the $(x, y)$-entry if and only if $x = y$. Thus the identity matrix classifies the equality relation.

A poset is a set $X$ equipped with a binary relation $R$ satisfying the reflexive, transitive, and antisymmetry properties. Let us translate these into matrix algebra terms. First reflexivity: it says that $x = y$ implies $(x, y) \in R$. In matrix algebra terms, it says $Id(y, x) \leq R(y, x)$, which we abbreviate in the customary way:

(Reflexivity) $Id \leq R$.

Now let’s look at transitivity. It says

($\exists_y (x, y) \in R$ and $(y, z) \in R$) implies $(x, z) \in R$.

The “and” here refers to the meet or multiplication in the rig of truth values $\mathbf{2}$, and the existential quantifier can be thought of as a (possibly infinitary) join or sum indexed over elements $y$. Thus, for each pair $(x, z)$, the hypothesis of the implication has truth value

$\sum_y R(z, y) \cdot R(y, x)$

which is just the $(z, x)$-entry of the square of the matrix $R$. Therefore, transitivity can be very succinctly expressed in matrix algebra terms as the condition

(Transitivity) $R^2 \leq R$.

• Remark: More generally, given a relation $R \subseteq X \times Y$ from $X$ to $Y$, and another relation $S \subseteq Y \times Z$ from $Y$ to $Z$, the relational composite $S \circ R \subseteq X \times Z$ is defined to be the set of pairs $(x, z)$ for which there exists $y$ with $(x, y) \in R$ and $(y, z) \in S$. But this just means that its classifying matrix is the ordinary matrix product $S \cdot R$!

Let’s now look at the antisymmetry condition: ($(x, y) \in R$ and $(y, x) \in R$) implies $x = y$. The clause $(y, x) \in R$ is the flip of $(x, y) \in R$; at the matrix level, this flip corresponds to taking the transpose. Thus antisymmetry can be expressed in matrix terms as

(Antisymmetry) $R \wedge R^\top \leq Id$

where $R^\top$ denotes the transpose of $R$, and the matrix meet $\wedge$ means we take the meet at each entry.
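To make the three matrix conditions concrete, here is a minimal sketch that checks them by brute force over the rig $\mathbf{2}$. The choice of example (divisibility on the set $\{1, 2, 3, 6\}$) and all of the function names are my own illustration, not part of the discussion above; the only convention taken from the text is that the $(y, x)$-entry is $1$ exactly when $(x, y) \in R$.

```python
# Hypothetical example: the divisibility relation on X = {1, 2, 3, 6}.
X = [1, 2, 3, 6]
N = len(X)

# Convention from the text: rows are indexed by y, columns by x,
# and the (y, x)-entry is 1 exactly when (x, y) is in R (here: X[x] divides X[y]).
R = [[int(X[y] % X[x] == 0) for x in range(N)] for y in range(N)]
identity = [[int(x == y) for x in range(N)] for y in range(N)]


def bool_matmul(S, T):
    """Matrix product over the rig 2: the sum is 'or', the product is 'and'."""
    return [[int(any(S[z][y] and T[y][x] for y in range(len(T))))
             for x in range(len(T[0]))]
            for z in range(len(S))]


def transpose(A):
    return [list(column) for column in zip(*A)]


def entrywise_meet(A, B):
    return [[a & b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]


def leq(A, B):
    """Entrywise comparison A <= B of truth-valued matrices."""
    return all(a <= b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


is_reflexive = leq(identity, R)                                    # Id <= R
is_transitive = leq(bool_matmul(R, R), R)                          # R^2 <= R
is_antisymmetric = leq(entrywise_meet(R, transpose(R)), identity)  # R ^ R^T <= Id
```

All three flags come out true for this relation. Swapping in a different matrix `R` is an easy way to experiment with relations that fail one axiom or another.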

• Remark: From the matrix algebra perspective, the antisymmetry axiom is less well motivated than the reflexivity and transitivity axioms. There’s a moral hiding beneath that story: from the category-theoretic perspective, the antisymmetry axiom is relatively insignificant. That is, if we view a poset as a category, then the antisymmetry condition is tantamount to the condition that isomorphic objects are equal (in the parlance, one says the category is “skeletal”) — this extra condition makes no essential difference, because isomorphic objects are essentially the same anyway. So: if we were to simply drop the antisymmetry axiom but keep the reflexivity and transitivity axioms (leading to what are called preordered sets, as opposed to partially ordered sets), then the theory of preordered sets develops exactly as the theory of partially ordered sets, except that in places where we conclude “$x$ is equal to $y$” in the theory of posets, we would generally conclude “$x$ is isomorphic to $y$” in the theory of preordered sets.

Preordered sets do occur in nature. For example, the set of sentences $p, q, ...$ in a theory is preordered by the entailment relation $p \vdash q$ ($q$ is derivable from $p$ in the theory). (The way one gets a poset out of this is to pass to a quotient set, by identifying sentences which are logically equivalent in the theory.)

Exercises:

1. (For those who know some topology) Suppose $X$ is a topological space. Given $x, y \in X$, define $x \leq y$ if $x$ belongs to the closure of $y$; show this is a preorder. Show this preorder is a poset precisely when $X$ is a $T_0$-space.
2. If $X$ carries a group structure, define $x \leq y$ for elements $x, y \in X$ if $x = y^n$ for some integer $n$; show this is a preorder. When is it a poset?

Since posets or preorders are fundamental to everything we’re doing, I’m going to reserve a special pairing notation for their classifying matrices: define

$\langle x, y \rangle = 1$ if and only if $x \leq y$.

Many of the concepts we have developed so far for posets can be succinctly expressed in terms of the pairing.

Example: The Yoneda principle (together with its dual) is simply the statement that if $X$ is a poset, then $x = y$ if and only if $\langle -, x \rangle = \langle -, y \rangle$ (as functionals valued in $\mathbf{2}$) if and only if $\langle x, - \rangle = \langle y, - \rangle$.

Example: A mapping from a poset $(X, \langle, \rangle_X)$ to a poset $(Y, \langle, \rangle_Y)$ is a function $f: X \to Y$ such that $\langle x, y \rangle_X \leq \langle f(x), f(y) \rangle_Y$.

Example: If $X$ is a poset, its dual or opposite $X^{op}$ has the same elements but the opposite order, i.e., $\langle x, y \rangle_X = \langle y, x \rangle_{X^{op}}$. The principle of duality says that the opposite of a poset is a poset. This can be (re)proved by invoking formal properties of matrix transpose, e.g., if $R^2 \leq R$, then $(R^\top)^2 = (R^2)^\top \leq R^\top$.

By far the most significant concept that can be expressed in terms of these pairings is that of adjoint mappings:

Definition: Let $X, Y$ be posets [or preorders], and $f: X \to Y$, $g: Y \to X$ be poset mappings. We say $(f, g)$ is an adjoint pair (with $f$ the left adjoint of $g$, and $g$ the right adjoint of $f$) if

$\langle f(x), y \rangle_Y = \langle x, g(y) \rangle_X$

or, in other words, if $f(x) \leq y$ if and only if $x \leq g(y)$. We write $f \dashv g$. Notice that the concept of left adjoint is dual to the concept of right adjoint (N.B.: they are not the same, because clearly the pairing $\langle x, y \rangle$ is not generally symmetric in $x$ and $y$).

Here are some examples which illustrate the ubiquity of this concept:

1. Let $X$ be a poset. Let $X \times X$ be the poset where $(x, y) \leq (x', y')$ iff ($x \leq x'$ and $y \leq y'$). There is an obvious poset mapping $\delta: X \to X \times X$, the diagonal mapping, which takes $x$ to $(x, x)$. Then a meet operation $\wedge: X \times X \to X$ is precisely a right adjoint to the diagonal mapping. Indeed, it says that $(a, a) \leq (x, y)$ if and only if $a \leq x \wedge y$.
2. Dually, a join operation $\vee: X \times X \to X$ is precisely a left adjoint to the diagonal mapping $\delta: X \to X \times X$.
3. More generally, for any set $S$, there is a diagonal map $\Delta: X \to X^S$ which maps $x \in X$ to the $S$-tuple $(\ldots, x, x, x, \ldots)$. Its right adjoint $X^S \to X$, if one exists, sends an $S$-tuple $(x_s)_{s \in S}$ to the inf of the set $\{x_s: s \in S\}$. Its left adjoint would send the tuple to the sup of that set.
4. If $X$ is a Heyting algebra, then for each $x \in X$, the conjunction operator $x \wedge -: X \to X$ is left adjoint to the implication operator $x \Rightarrow -: X \to X$.
5. If $X$ is a sup-lattice, then the operator $\sup: PX \to X$ which sends a subset $S \subseteq X$ to $\sup(S) \in X$ is left adjoint to the Dedekind embedding $i: X \to PX$. Indeed, we have $\sup(S) \leq a$ if and only if (for all $s \in S, s \leq a$) if and only if $S \subseteq \{x \in X: x \leq a\} = i(a)$.

As items 1, 2, and 4 indicate, the rules for how the propositional connectives $\wedge, \vee, \Rightarrow$ operate are governed by adjoint pairs. This gives some evidence for Lawvere’s great insight that all rules of inference in logic are expressed by interlocking pairs of adjoint mappings.
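For readers who like to test definitions numerically, here is a small sketch of the adjunction condition $\langle f(x), y \rangle = \langle x, g(y) \rangle$. The pair used (multiplication by a positive integer $k$ and floor division by $k$, on the integers with the usual order) is my own choice of illustration and does not appear in the list above; the second check is item 1 of the list, specialized to the integers, where the meet is the minimum.

```python
def pairing(x, y):
    """Classifying matrix entry of the usual order on the integers: 1 iff x <= y."""
    return int(x <= y)


K = 3  # hypothetical positive integer, chosen only for illustration


def f(x):
    """Candidate left adjoint: multiplication by K."""
    return K * x


def g(y):
    """Candidate right adjoint: floor division by K."""
    return y // K


SAMPLE = range(-20, 21)

# <f(x), y> = <x, g(y)> on a finite sample of integers.
is_adjoint_pair = all(
    pairing(f(x), y) == pairing(x, g(y)) for x in SAMPLE for y in SAMPLE
)

# Item 1 of the list: the meet (here min) is right adjoint to the diagonal,
# i.e. (a, a) <= (x, y) in the product order iff a <= min(x, y).
meet_is_right_adjoint = all(
    pairing(a, x) * pairing(a, y) == pairing(a, min(x, y))
    for a in SAMPLE for x in SAMPLE for y in SAMPLE
)
```

A finite sample is of course not a proof, but it is a quick sanity check that one has the left and right adjoints the right way around.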

Proposition: If $f \dashv g$ and $f' \dashv g'$ where $g: X \to Y$ and $g': Y \to Z$ are composable mappings, then $f \circ f' \dashv g' \circ g$.

Proof: $\langle f f' z, x \rangle_X = \langle f' z, g x \rangle_Y = \langle z, g' g x \rangle_Z$. Notice that the statement is analogous to the usual rule $(A B)^\dagger = B^\dagger A^\dagger$, where $A^\dagger$ refers to taking an adjoint with respect to given inner product forms.

We can use this proposition to give slick proofs of some results we’ve seen. For example, to prove that Heyting algebras are distributive lattices, i.e., that $p \wedge (x \vee y) = (p \wedge x) \vee (p \wedge y)$, just take left adjoints on both sides of the tautology $\delta \circ g = (g \times g) \circ \delta$, where $g = p \Rightarrow -$ is right adjoint to $p \wedge -$. The left adjoint of the left side of the tautology is (by the proposition) $p \wedge -$ applied to $\vee$. The left adjoint of the right side is $\vee$ applied to $(p \wedge -) \times (p \wedge -)$. The conclusion follows.

Much more generally, we have the

Theorem: Right adjoints $g: X \to Y$ preserve any infs which exist in $X$. Dually, left adjoints $f: Y \to X$ preserve any sups which exist in $Y$.

Proof: $\langle y, g(\bigwedge_{s \in S} x_s) \rangle_Y = \langle f(y), \bigwedge_{s \in S} x_s \rangle_X = \bigwedge_{s \in S} \langle f(y), x_s \rangle_X$ where the last inf is interpreted in the inf-lattice $\mathbf{2}$. This equals $\bigwedge_{s \in S} \langle y, g(x_s) \rangle_Y$. This completes the proof of the first statement (why?). The second follows from duality.
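Here is a sketch of the theorem in action, using an adjoint pair that is not discussed above but is standard: for a function $\phi: A \to B$, the direct image map $PA \to PB$ is left adjoint to the inverse image map $PB \to PA$. The particular sets and the function `phi` below are hypothetical. The check suggests that inverse image preserves intersections and direct image preserves unions, while direct image need not preserve intersections.

```python
from itertools import combinations

# Hypothetical finite sets and function, chosen only for illustration.
A = [0, 1, 2, 3]
B = ["a", "b", "c"]
phi = {0: "a", 1: "a", 2: "b", 3: "b"}


def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


def image(S):
    """Direct image of S under phi (the candidate left adjoint)."""
    return frozenset(phi[x] for x in S)


def preimage(T):
    """Inverse image of T under phi (the candidate right adjoint)."""
    return frozenset(x for x in A if phi[x] in T)


PA, PB = powerset(A), powerset(B)

# image(S) <= T  iff  S <= preimage(T), where <= on frozensets is inclusion.
is_adjoint_pair = all((image(S) <= T) == (S <= preimage(T)) for S in PA for T in PB)

# The right adjoint preserves meets (intersections) ...
preimage_preserves_meets = all(
    preimage(T & U) == preimage(T) & preimage(U) for T in PB for U in PB
)
# ... and the left adjoint preserves joins (unions) ...
image_preserves_joins = all(image(S | T) == image(S) | image(T) for S in PA for T in PA)
# ... but the left adjoint need not preserve meets.
image_preserves_meets = all(image(S & T) == image(S) & image(T) for S in PA for T in PA)
```

With this choice of `phi`, the last flag is false: the singletons $\{0\}$ and $\{1\}$ are disjoint but have the same image.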

Exercise: If $X$ is a Heyting algebra, then there is a poset mapping $- \Rightarrow c: X^{op} \to X$ for any element $c$. Describe the left adjoint of this mapping. Conclude that this mapping takes infs in $X^{op}$ (i.e., sups in $X$) to the corresponding infs in $X$.

Last time in this series on Stone duality, we introduced the concept of lattice and various cousins (e.g., inf-lattice, sup-lattice). We said a lattice is a poset with finite meets and joins, and that inf-lattices and sup-lattices have arbitrary meets and joins (meaning that every subset, not just every finite one, has an inf and sup). Examples include the poset $PX$ of all subsets of a set $X$, and the poset $Sub(V)$ of all subspaces of a vector space $V$.

I take it that most readers are already familiar with many of the properties of the poset $PX$; there is for example the distributive law $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$, and De Morgan laws, and so on — we’ll be exploring more of that in depth soon. The poset $Sub(V)$, as a lattice, is a much different animal: if we think of meets and joins as modeling the logical operations “and” and “or”, then the logic internal to $Sub(V)$ is a weird one — it’s actually much closer to what is sometimes called “quantum logic”, as developed by von Neumann, Mackey, and many others. Our primary interest in this series will be in the direction of more familiar forms of logic, classical logic if you will (where “classical” here is meant more in a physicist’s sense than a logician’s).

To get a sense of the weirdness of $Sub(V)$, take for example a 2-dimensional vector space $V$. The bottom element is the zero space $\{0\}$, the top element is $V$, and the rest of the elements of $Sub(V)$ are 1-dimensional: lines through the origin. For 1-dimensional spaces $x, y$, there is no relation $x \leq y$ unless $x$ and $y$ coincide. So we can picture the lattice as having three levels according to dimension, with lines drawn to indicate the partial order:

        V = 1
       / | \
      /  |  \
     x   y   z
      \  |  /
       \ | /
         0

Observe that for distinct elements $x, y, z$ in the middle level, we have for example $x \wedge y = 0 = x \wedge z$ (0 is the largest element contained in both $x$ and $y$), and also for example $y \vee z = 1$ (1 is the smallest element containing $y$ and $z$). It follows that $x \wedge (y \vee z) = x \wedge 1 = x$, whereas $(x \wedge y) \vee (x \wedge z) = 0 \vee 0 = 0$. The distributive law fails in $Sub(V)$!
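The failure of distributivity can be checked mechanically. The sketch below assumes a specific choice that the text leaves open: $V$ is the 2-dimensional vector space over the two-element field, which has exactly three lines through the origin, matching the picture. Subspaces are represented as sets of vectors, the meet is intersection, and the join is the span of the union; all names are my own.

```python
ZERO = (0, 0)


def add(u, v):
    """Vector addition in F_2^2 (coordinates mod 2)."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)


def span(vectors):
    """Smallest subspace of F_2^2 containing the given vectors.

    Over F_2 the only scalars are 0 and 1, so closing under addition suffices.
    """
    space = {ZERO} | set(vectors)
    while True:
        bigger = space | {add(u, v) for u in space for v in space}
        if bigger == space:
            return frozenset(space)
        space = bigger


bottom = span([])
x, y, z = span([(1, 0)]), span([(0, 1)]), span([(1, 1)])
top = span([(1, 0), (0, 1)])


def meet(a, b):
    return a & b


def join(a, b):
    return span(a | b)


lhs = meet(x, join(y, z))           # x meet (y join z)
rhs = join(meet(x, y), meet(x, z))  # (x meet y) join (x meet z)
distributive_law_holds = lhs == rhs
```

Here `lhs` is the line `x` and `rhs` is the zero space, so `distributive_law_holds` is false, in agreement with the computation above.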

Definition: A lattice is distributive if $x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z)$ for all $x, y, z$. That is to say, a lattice $X$ is distributive if the map $x \wedge -: X \to X$, taking an element $y$ to $x \wedge y$, is a morphism of join-semilattices.

1. Exercise: Show that in a meet-semilattice, $x \wedge -: X \to X$ is a poset map. Is it also a morphism of meet-semilattices? If $X$ has a bottom element, show that the map $x \wedge -$ preserves it.
2. Exercise: Show that in any lattice, we at least have $(x \wedge y) \vee (x \wedge z) \leq x \wedge (y \vee z)$ for all elements $x, y, z$.

Here is an interesting theorem, which illustrates some of the properties of lattices we’ve developed so far:

Theorem: The notion of distributive lattice is self-dual.

Proof: The notion of lattice is self-dual, so all we have to do is show that the dual of the distributivity axiom, $x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z)$, follows from the distributive lattice axioms.

Expand the right side to $((x \vee y) \wedge x) \vee ((x \vee y) \wedge z)$, by distributivity. This reduces to $x \vee [(x \vee y) \wedge z]$, by an absorption law. Expand this again, by distributivity, to $x \vee (x \wedge z) \vee (y \wedge z)$. This reduces to $x \vee (y \wedge z)$, by the other absorption law. This completes the proof. $\Box$

Distributive lattices are important, but perhaps even more important in mathematics are lattices where we have not just finitary, but infinitary distributivity as well:

Definition: A frame is a sup-lattice for which $x \wedge -: X \to X$ is a morphism of sup-lattices, for every $x \in X$. In other words, for every subset $S \subseteq X$, we have $\sup(\{x \wedge s: s \in S\}) = x \wedge \sup(S)$, or, as is often written,

$\bigvee_{s \in S} x \wedge s = x \wedge \bigvee_{s \in S} s.$

Example: A power set $PX$, as always partially ordered by inclusion, is a frame. In this case, it means that for any subset $A$ and any collection of subsets $\{B_i: i \in I\}$, we have

$A \cap (\bigcup_{i \in I} B_i) = \bigcup_{i \in I} A \cap B_i$

This is a well-known fact from naive set theory, but soon we will see an alternative proof, thematically closer to the point of view of these notes.

Example: If $X$ is a set, a topology on $X$ is a subset $\mathbf{T} \subseteq PX$ of the power set, partially ordered by inclusion as $PX$ is, which is closed under finite meets and arbitrary sups. This means the empty sup or bottom element $\emptyset$ and the empty meet or top element $X$ of $PX$ are elements of $\mathbf{T}$, and also:

1. If $U, V$ are elements of $\mathbf{T}$, then so is $U \cap V$.
2. If $\{U_i: i \in I\}$ is a collection of elements of $\mathbf{T}$, then $\bigcup_{i \in I} U_i$ is an element of $\mathbf{T}$.

A topological space is a set $X$ which is equipped with a topology $\mathbf{T}$; the elements of the topology are called open subsets of the space. Topologies provide a primary source of examples of frames; because the sups and meets in a topology are constructed the same way as in $PX$ (unions and finite intersections), it is clear that the requisite infinite distributivity law holds in a topology.

The concept of topology was originally rooted in analysis, where it arose by contemplating very generally what one means by a “continuous function”. I imagine many readers who come to a blog titled “Topological Musings” will already have had a course in general topology, but just to be on the safe side I’ll now give one example of a topological space, with a promise of more to come later. Let $X$ be the set $\mathbb{R}^n$ of $n$-tuples of real numbers. First, define the open ball in $\mathbb{R}^n$ centered at a point $x \in \mathbb{R}^n$ and of radius $r > 0$ to be the set $\{y \in \mathbb{R}^n: \|x - y\| < r\}$. Then, define a subset $U \subseteq \mathbb{R}^n$ to be open if it can be expressed as the union of a collection, finite or infinite, of (possibly overlapping) open balls; the topology is by definition the collection of open sets.

It’s clear from the definition that the collection of open sets is indeed closed under arbitrary unions. To see it is closed under finite intersections, the crucial lemma needed is that the intersection of two overlapping open balls is itself a union of smaller open balls. A precise proof makes essential use of the triangle inequality. (Exercise?)

Topology is a huge field in its own right; much of our interest here will be in its interplay with logic. To that end, I want to bring in, in addition to the connectives “and” and “or” we’ve discussed so far, the implication connective in logic. Most readers probably know that in ordinary logic, the formula $p \Rightarrow q$ (“$p$ implies $q$“) is equivalent to “either not $p$ or $q$” — symbolically, we could define $p \Rightarrow q$ as $\neg p \vee q$. That much is true — in ordinary Boolean logic. But instead of committing ourselves to this reductionistic habit of defining implication in this way, or otherwise relying on Boolean algebra as a crutch, I want to take a fresh look at material implication and what we really ask of it.

The main property we ask of implication is modus ponens: given $p$ and $p \Rightarrow q$, we may infer $q$. In symbols, writing the inference or entailment relation as $\leq$, this is expressed as $p \wedge (p \Rightarrow q) \leq q$. And, we ask that implication be the weakest possible such assumption, i.e., that material implication $p \Rightarrow q$ be the weakest $a$ whose presence in conjunction with $p$ entails $q$. In other words, for given $p$ and $q$, we now define implication $p \Rightarrow q$ by the property

$(a \wedge p \leq q)$ if and only if $(a \leq p \Rightarrow q).$

As a very easy exercise, show by Yoneda that an implication $p \Rightarrow q$ is uniquely determined when it exists. As the next theorem shows, not all lattices admit an implication operator; in order to have one, it is necessary that distributivity holds:

Theorem:

• (1) If $X$ is a meet-semilattice which admits an implication operator, then for every element $p$, the operator $p \wedge -: X \to X$ preserves any sups which happen to exist in $X$.
• (2) If $X$ is a frame, then $X$ admits an implication operator.

Proof: (1) Suppose $S \subseteq X$ has a sup in $X$, here denoted $\bigvee_{s \in S} s$. We have

$(\bigvee_{s \in S} s) \wedge p \leq q$ if and only if

$\bigvee_{s \in S} s \leq p \Rightarrow q$ if and only if

for all $s \in S, (s \leq p \Rightarrow q)$ if and only if

for all $s \in S, (s \wedge p \leq q)$ if and only if

$\bigvee_{s \in S} (s \wedge p) \leq q$.

Since this is true for all $q$, the (dual of the) Yoneda principle tells us that $(\bigvee_{s \in S} s) \wedge p = \bigvee_{s \in S} (s \wedge p)$, as desired. (We don’t need to add the hypothesis that the sup on the right side exists, for the first four lines after “We have” show that $(\bigvee_{s \in S} s) \wedge p$ satisfies the defining property of that sup.)

(2) Suppose $p, q$ are elements of a frame $X$. Define $p \Rightarrow q$ to be $\sup(\{a \in X: a \wedge p \leq q\})$. By definition, if $a \wedge p \leq q$, then $a \leq p \Rightarrow q$. Conversely, if $a \leq p \Rightarrow q$, then

$a \wedge p \leq \sup\{x: x \wedge p \leq q\} \wedge p = \sup\{x \wedge p: x \wedge p \leq q\},$

where the equality holds because of the infinitary distributive law in a frame, and this last sup is clearly bounded above by $q$ (according to the defining property of sups). Hence $a \wedge p \leq q$, as desired. $\Box$
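Both halves of the theorem can be seen at work in small finite examples. The sketch below is my own illustration: it takes a finite lattice given by its order relation, forms the candidate $p \Rightarrow q := \sup\{a: a \wedge p \leq q\}$ exactly as in the proof of part (2), and tests the defining property of implication by brute force. It is run on a four-element chain (a frame) and on the five-element lattice with bottom, top, and three incomparable middle elements (which is not distributive).

```python
def implication_works(elements, leq):
    """Test whether sup{a : a ^ p <= q} behaves as an implication p => q.

    `elements` is a finite list and `leq` its order relation; the poset is
    assumed to be a lattice, so the sups and meets computed below exist.
    """

    def sup(subset):
        uppers = [u for u in elements if all(leq(s, u) for s in subset)]
        return next(u for u in uppers if all(leq(u, v) for v in uppers))

    def meet(a, b):
        lowers = [l for l in elements if leq(l, a) and leq(l, b)]
        return next(l for l in lowers if all(leq(m, l) for m in lowers))

    def implies(p, q):
        return sup([a for a in elements if leq(meet(a, p), q)])

    # Defining property: (a ^ p <= q) iff (a <= p => q).
    return all(
        leq(meet(a, p), q) == leq(a, implies(p, q))
        for a in elements for p in elements for q in elements
    )


# A four-element chain 0 < 1 < 2 < 3.
chain_ok = implication_works([0, 1, 2, 3], lambda a, b: a <= b)

# Bottom "0", top "1", and three pairwise incomparable middle elements.
m3_ok = implication_works(
    ["0", "x", "y", "z", "1"],
    lambda a, b: a == b or a == "0" or b == "1",
)
```

The chain passes and the five-element lattice fails. In the latter, the candidate for $x \Rightarrow 0$ is the sup of $\{0, y, z\}$, namely $1$, and yet $1 \wedge x = x$ is not below $0$.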

Incidentally, part (1) of this theorem gives an alternative proof of the infinitary distributive law for Boolean algebras such as $PX$, so long as we trust that $p \Rightarrow q := \neg p \vee q$ really does what we ask of implication. We’ll come to that point again later.

Part (2) has some interesting consequences vis-à-vis topologies: we know that topologies provide examples of frames; therefore by part (2) they admit implication operators. It is instructive to work out exactly what these implication operators look like. So, let $U, V$ be open sets in a topology. According to our prescription, we define $U \Rightarrow V$ as the sup (the union) of all open sets $W$ with the property that $W \cap U \subseteq V$. We can think of this inclusion as living in the power set $PX$. Then, assuming our formula $U^c \cup V$ for implication in the Boolean algebra $PX$ (where $U^c$ denotes the complement of $U$), we would have $W \subseteq U^c \cup V$. And thus, our implication $U \Rightarrow V$ in the topology is the union of all open sets $W$ contained in the (usually non-open) set $U^c \cup V$. That is to say, $U \Rightarrow V$ is the largest open set contained in $U^c \cup V$, otherwise known as the interior of $U^c \cup V$. Hence our formula:

$U \Rightarrow V$ = int$(U^c \cup V).$
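As a sanity check on this formula, here is a sketch on a small finite topological space. The space (three points, with the five open sets listed in `opens`) is a hypothetical example of my own. The code computes $U \Rightarrow V$ both ways, as the union of all opens $W$ with $W \cap U \subseteq V$ and as the interior of $U^c \cup V$, and compares the results.

```python
# Hypothetical finite topological space, chosen only for illustration.
X = frozenset({0, 1, 2})
opens = [frozenset(s) for s in [(), (0,), (1,), (0, 1), (0, 1, 2)]]


def union_all(sets):
    result = frozenset()
    for s in sets:
        result |= s
    return result


def interior(S):
    """Largest open set contained in S: the union of all opens inside S."""
    return union_all(W for W in opens if W <= S)


def implies_by_sup(U, V):
    """U => V as the sup (union) of all opens W with W & U contained in V."""
    return union_all(W for W in opens if (W & U) <= V)


def implies_by_interior(U, V):
    """U => V as the interior of (complement of U) union V."""
    return interior((X - U) | V)


formulas_agree = all(
    implies_by_sup(U, V) == implies_by_interior(U, V) for U in opens for V in opens
)
adjunction_holds = all(
    ((W & U) <= V) == (W <= implies_by_sup(U, V))
    for W in opens for U in opens for V in opens
)

# A case where the Boolean formula leaves the topology:
U, V = frozenset({0}), frozenset({1})
boolean_candidate = (X - U) | V          # the set {1, 2}
heyting_implication = implies_by_sup(U, V)
```

In the last case `boolean_candidate` is the set $\{1, 2\}$, which is not open in this space, while `heyting_implication` is its interior $\{1\}$.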

Definition: A Heyting algebra is a lattice $H$ which admits an implication $p \Rightarrow q$ for any two elements $p, q \in H$. A complete Heyting algebra is a complete lattice which admits an implication for any two elements.

Again, our theorem above says that frames are (extensionally) the same thing as complete Heyting algebras. But, as in the case of inf-lattices and sup-lattices, we make intensional distinctions when we consider the appropriate notions of morphism for these concepts. In particular, a morphism of frames is a poset map which preserves finite meets and arbitrary sups. A morphism of Heyting algebras preserves all structure in sight (i.e., all implied in the definition of Heyting algebra — meets, joins, and implication). A morphism of complete Heyting algebras also preserves all structure in sight (sups, infs, and implication).

Heyting algebras are usually not Boolean algebras. For example, it is rare that a topology is a Boolean lattice. We’ll be speaking more about that soon, but for now I’ll remark that Heyting algebra is the algebra which underlies intuitionistic propositional calculus.

Exercise: Show that $(0 \Rightarrow 0) = 1$ in a Heyting algebra.

Exercise: (For those who know some general topology.) In a Heyting algebra, we define the negation $\neg x$ to be $x \Rightarrow 0$. For the Heyting algebra given by a topology, what can you say about $\neg U$ when $U$ is open and dense?

Previously, on “Stone duality”, we introduced the notions of poset and meet-semilattice (formalizing the conjunction operator “and”), as a first step on the way to introducing Boolean algebras. Our larger goal in this series will be to discuss Stone duality, where it is shown how Boolean algebras can be represented “concretely”, in terms of the topology of their so-called Stone spaces — a wonderful meeting ground for algebra, topology, logic, geometry, and even analysis!

In this installment we will look at the notion of lattice and various examples of lattice, and barely scratch the surface — lattice theory is a very deep and multi-faceted theory with many unanswered questions. But the idea is simple enough: lattices formalize the notions of “and” and “or” together. Let’s have a look.

Let $(X, \leq)$ be a poset. If $x, y$ are elements of $X$, a join of $x$ and $y$ is an element $j$ with the property that for any $a \in X$,

$j \leq a$ if and only if ($x \leq a$ and $y \leq a$).

For a first example, consider the poset $PX$ of subsets of $X$ ordered by inclusion. The join in that case is given by taking the union, i.e., we have

$S \cup T \subseteq A$ if and only if ($S \subseteq A$ and $T \subseteq A$).

Given the close connection between unions of sets and the disjunction “or”, we can therefore say, roughly, that joins are a reasonable mathematical way to formalize the structure of disjunction. We will say a little more on that later when we discuss mathematical logic.

Notice there is a close formal resemblance between how we defined joins and how we defined meets. Recall that a meet of $x$ and $y$ is an element $m$ such that for all $a \in X$,

$a \leq m$ if and only if ($a \leq x$ and $a \leq y$).

Curiously, the logical structure in the definitions of meet and join is essentially the same; the only difference is that we switched the inequalities (i.e., replaced all instances of $x \leq y$ by $y \leq x$). This is an instance of a very important concept. In the theory of posets, the act of modifying a logical formula or theorem by switching all the inequalities but otherwise leaving the logical structure the same is called taking the dual of the formula or theorem. Thus, we would say that the dual of the notion of meet is the notion of join (and vice-versa). This turns out to be a very powerful idea, which in effect will allow us to cut our work in half.

(Just to put in some fine print or boilerplate, let me just say that a formula in the first-order theory of posets is a well-formed expression in first-order logic (involving the usual logical connectives and logical quantifiers and equality over a domain $X$), which can be built up by taking $\leq$ as a primitive binary predicate on $X$. A theorem in the theory of posets is a sentence (a closed formula, meaning that all variables are bound by quantifiers) which can be deduced, following standard rules of inference, from the axioms of reflexivity, transitivity, and antisymmetry. We occasionally also consider formulas and theorems in second-order logic (permitting logical quantification over the power set $PX$), and in higher-order logic. If this legalistic language is scary, don’t worry — just check the appropriate box in the End User Agreement, and reason the way you normally do.)

The critical item to install before we’re off and running is the following meta-principle:

Principle of Duality: If a logical formula F is a theorem in the theory of posets, then so is its dual F’.

Proof: All we need to do is check that the duals of the axioms in the theory of posets are also theorems; then F’ can be proved just by dualizing the entire proof of F. Now the dual of the reflexivity axiom, $\forall_{x \in X} x \leq x$, is itself! — and of course an axiom is a theorem. The transitivity axiom, $\forall_{x, y, z \in X} (x \leq y$ and $y \leq z)$ implies $x \leq z$, is also self-dual (when you dualize it, it looks essentially the same except that the variables $x$ and $z$ are switched — and there is a basic convention in logic that two sentences which differ only by renaming the variables are considered syntactically equivalent). Finally, the antisymmetry axiom is also self-dual in this way. Hence we are done. $\Box$

So, for example, by the principle of duality, we know automatically that the join of two elements is unique when it exists — we just dualize our earlier theorem that the meet is unique when it exists. The join of two elements $x$ and $y$ is denoted $x \vee y$.

Be careful, when you dualize, that any shorthand you used to abbreviate an expression in the language of posets is also replaced by its dual. For example, the dual of the notation $x \wedge y$ is $x \vee y$ (and vice-versa of course), and so the dual of the associativity law which we proved for meet is (for all $x, y, z$) $(x \vee y) \vee z = x \vee (y \vee z)$. In fact, we can say

Theorem: The join operation $\vee$ is associative, commutative, and idempotent.

Proof: Just apply the principle of duality to the corresponding theorem for the meet operation.

Just to get used to these ideas, here are some exercises.

• State the dual of the Yoneda principle (as stated here).
• Prove the associativity of join from scratch (from the axioms for posets). If you want, you may invoke the dual of the Yoneda principle in your proof. (Note: in the sequel, we will apply the term “Yoneda principle” to cover both it and its dual.)

To continue: we say a poset is a join-semilattice if it has all finite joins (including the empty join, which is the bottom element $\bot$ satisfying $\bot \leq a$ for all $a$). A lattice is a poset which has all finite meets and finite joins.

Time for some examples.

• The set of natural numbers 0, 1, 2, 3, … under the divisibility order ($x \leq y$ if $x$ divides $y$) is a lattice. (What is the join of two elements? What is the bottom element?)
• The set of natural numbers under the usual order is a join-semilattice (the join of two elements here is their maximum), but not a lattice (because it lacks a top element).
• The set of subsets of a set $X$ is a lattice. The join of two subsets is their union, and the bottom element is the empty set.
• The set of subspaces of a vector space $V$ is a lattice. The meet of two subspaces is their ordinary intersection; the join of two subspaces $U$, $W$ is the vector space which they jointly generate (i.e., the set of vector sums $u + w$ with $u \in U, w \in W$, which is closed under addition and scalar multiplication).

The join in the last example is not the naive set-theoretic union of course (and similar remarks hold for many other concrete lattices, such as the lattice of all subgroups of a group, and the lattice of ideals of a ring), so it might be worth asking if there is a uniform way of describing joins in cases like these. Certainly the idea of taking some sort of closure of the ordinary union seems relevant (e.g., in the vector space example, close up the union of $U$ and $W$ under the vector space operations), and indeed this can be made precise in many cases of interest.

To explain this, let’s take a fresh look at the definition of join: the defining property was

$x \vee y \leq a$ if and only if ($x \leq a$ and $y \leq a$).

What this is really saying is that among all the elements $a$ which “contain” both $x$ and $y$, the element $x \vee y$ is the absolute minimum. This suggests a simple idea: why not just take the “intersection” (i.e., meet) of all such elements $a$ to get that absolute minimum? In effect, construct joins as certain kinds of meets! For example, to construct the join of two subgroups $H \subseteq G$, $J \subseteq G$, take the intersection of all subgroups containing both $H$ and $J$ — that intersection is the group-theoretic closure of the union $H \cup J$.

There’s a slight catch: this may involve taking the meet of infinitely many elements. But there is no difficulty in saying what this means:

Definition: Let $X$ be a poset, and suppose $S \subseteq X$. The infimum of $S$, if it exists, is an element $m \in X$ such that for all $a \in X$, $a \leq m$ if and only if $a \leq s$ for all $s \in S$.

By the usual Yoneda argument, infima are unique when they exist (you might want to write that argument out to make sure it’s quite clear). We denote the infimum of $S$ by $\inf(S)$.

We say that a poset $X$ is an inf-lattice if there is an infimum for every subset. Similarly, the supremum of $S \subseteq X$, if it exists, is an element $\sup(S)$ such that for all $a \in X$, $\sup(S) \leq a$ if and only if $s \leq a$ for all $s \in S$. A poset is a sup-lattice if there is a supremum for every subset. [I’ll just quickly remark that the notions of inf-lattice and sup-lattice belong to second-order logic, since they involve quantifying over all subsets $S \subseteq X$ (or over all elements of $PX$).]

Trivially, every inf-lattice is a meet-semilattice, and every sup-lattice is a join-semilattice. More interestingly, we have the

Theorem: Every inf-lattice is a sup-lattice (!). Dually, every sup-lattice is an inf-lattice.

Proof: Suppose $X$ is an inf-lattice, and let $S \subseteq X$. Let $U = \{u \in X: \forall_{s \in S} s \leq u\}$ be the set of upper bounds of $S$. I claim that $\inf(U)$ (“least upper bound”) is the supremum of $S$. Indeed, from $\inf(U) \leq \inf(U)$ and the definition of infimum, we know that $\inf(U) \leq a$ if $a \in U$, i.e., $\inf(U) \leq a$ if $s \leq a$ for all $s \in S$. On the other hand, we also know that if $s \in S$, then $s \leq u$ for every $u \in U$, and hence $s \leq \inf(U)$ by the defining property of infimum (i.e., $\inf(U)$ really is an upper bound of $S$). So, if $\inf(U) \leq a$, we conclude by transitivity that $s \leq a$ for every $s \in S$. This completes the proof. $\Box$
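The construction in this proof (form the set of upper bounds, then take its infimum) is easy to carry out by machine in a finite example. The sketch below uses the lattice of subgroups of the cyclic group $\mathbb{Z}/12$, my own choice of example in the spirit of the subgroup discussion above, where infima are intersections. It computes the sup of two subgroups as the intersection of all subgroups containing both, and compares the answer with the subgroup they generate. The enumeration of subgroups relies on the standard fact that the subgroups of $\mathbb{Z}/12$ are exactly the multiples of $d$ for $d$ dividing 12.

```python
N = 12  # we work in the additive group Z/12


def cyclic_subgroup(d):
    """The subgroup of Z/12 consisting of multiples of d (d must divide 12)."""
    return frozenset(range(0, N, d))


subgroups = [cyclic_subgroup(d) for d in (1, 2, 3, 4, 6, 12)]


def inf(family):
    """Infimum in the subgroup lattice: intersection (the empty inf is the top)."""
    result = frozenset(range(N))
    for H in family:
        result &= H
    return result


def sup(family):
    """Supremum, built as in the proof: the infimum of the set of upper bounds."""
    family = list(family)
    upper_bounds = [U for U in subgroups if all(H <= U for H in family)]
    return inf(upper_bounds)


def generated(H, J):
    """Subgroup generated by H and J; in an abelian group this is H + J."""
    return frozenset((h + j) % N for h in H for j in J)


H, J = cyclic_subgroup(4), cyclic_subgroup(6)
sup_of_H_and_J = sup([H, J])
```

Here `sup_of_H_and_J` is the subgroup of even residues, which agrees with `generated(H, J)`; the ordinary union of `H` and `J` is strictly smaller and is not a subgroup.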

Corollary: Every finite meet-semilattice is a lattice.

Even though every inf-lattice is a sup-lattice and conversely (sometimes people just call them “complete lattices”), there are important distinctions to be made when we consider what is the appropriate notion of homomorphism. The notions are straightforward enough: a morphism of meet-semilattices $f: X \to Y$ is a function which takes finite meets in $X$ to finite meets in $Y$ ($f(x \wedge x') = f(x) \wedge f(x')$, and $f(1) = 1$ where the 1’s denote top elements). There is a dual notion of morphism of join-semilattices ($f(x \vee x') = f(x) \vee f(x')$ and $f(0) = 0$ where the 0’s denote bottom elements). A morphism of inf-lattices $f: X \to Y$ is a function such that $f(\inf(S)) = \inf(f(S))$ for all subsets $S \subseteq X$, where $f(S)$ denotes the direct image of $S$ under $f$. And there is a dual notion of morphism of sup-lattices: $f(\sup(S)) = \sup(f(S))$. Finally, a morphism of lattices is a function which preserves all finite meets and finite joins, and a morphism of complete lattices is one which preserves all infs and sups.

Despite the theorem above, it is not true that a morphism of inf-lattices must be a morphism of sup-lattices. Nor is it true that a morphism of finite meet-semilattices must be a lattice morphism. Therefore, in contexts where homomorphisms matter (which is just about all the time!), it is important to keep the qualifying prefixes around and keep the distinctions straight.

Exercise: Come up with some examples of morphisms which exhibit these distinctions.

Let’s see if we can build this from the ground up. We first define a statement (or sometimes, a proposition) to be a meaningful assertion that is either true or false. Here, meaningful means we should be able to say for sure whether a statement is true or false. So, something like “Hello, there!” is not counted as a statement but “the sun is made of butter” is. The latter is evidently false but the former is neither true nor false. Now, it can get quite cumbersome after a while if we keep writing out statements such as “the sun is made of butter” every time we need to use them. Thus, it is useful to have variables, or to be precise, propositional variables, to denote all statements. We usually prefer to use $p, q, r$ and so on for such variables.

Now, all of this would be rather boring if we had just symbols such as $p, q, r$ etc. to denote statements. Thus, a statement like “Archimedes was a philosopher” is not that interesting in itself. In fact, all the statements (in our formal system) would be “isolated” ones in the sense that we wouldn’t be able to logically “connect” one statement to another. We want to be able to express sentences like “$x = -2$ and $y=2$“, “$(x = -2)$ implies $(x^2 = 4)$” and so on. So, we add something called logical connectives (also called operator symbols) to the picture. There are four basic ones: $\wedge$ (conjunction), $\vee$ (disjunction), $\rightarrow$ (material implication), which are all of arity 2 and $\neg$ (negation) which is of arity 1. Using these logical connectives, we can now form compound statements such as $p \wedge q$ (i.e. $p$ and $q$), $p \vee q$ (i.e. $p$ or $q$), $\neg p$ (i.e. $\mbox{not} (p)$), and $p \rightarrow q$ (i.e. $p$ implies $q$.) Note that each of $\wedge, \vee$ and $\rightarrow$ requires two propositional variables in order for it to make any sense; this is expressed by saying their arity is 2. On the other hand, $\neg$ has arity 1 since it is applied to exactly one propositional variable.

We also introduce another logical operator called logical equivalence ($\equiv$), which has arity 2. It is really convenient to have logical equivalence on hand, as we shall see later. We say $p \equiv q$ if and only if “$(p \rightarrow q) \mbox{ and } (q \rightarrow p)$”. What this basically means is: if $p$ is true then so is $q$, and if $q$ is true then so is $p$. An equivalent way of saying this is: if $p$ is true then $q$ is true, and if $p$ is false then $q$ is false. In other words, $p$ and $q$ always have the same truth value.
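To make the two phrasings of logical equivalence concrete, here is a small Python sketch that checks them against each other by brute force. It assumes the usual truth-table reading of material implication (false only when the premise is true and the conclusion is false); the function names `implies`, `equivalent` and `same_truth_value` are my own, chosen purely for illustration.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    # Assumed reading of material implication:
    # false only when p is true and q is false.
    return (not p) or q

def equivalent(p: bool, q: bool) -> bool:
    # The definition from the text: (p -> q) and (q -> p).
    return implies(p, q) and implies(q, p)

def same_truth_value(p: bool, q: bool) -> bool:
    # The second phrasing: if p is true then q is true,
    # and if p is false then q is false.
    return implies(p, q) and implies(not p, not q)

# Compare both phrasings (and plain equality of truth values)
# over all four assignments to p and q.
phrasings_agree = all(
    equivalent(p, q) == same_truth_value(p, q) == (p == q)
    for p, q in product([True, False], repeat=2)
)
print("Both phrasings agree on every assignment:", phrasings_agree)
```

Since there are only four ways to assign truth values to $p$ and $q$, an exhaustive check like this is a perfectly good argument that the two phrasings say the same thing.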

Before we proceed further, we make a few observations. First, if $p$ and $q$ are propositional variables, then by definition each of those is either true or false. Formally speaking, the truth value of $p$ or $q$ is either true or false. This is equally true of the compound statements $p \wedge q,\, p \vee q,\, p \rightarrow q$ and $\neg p$. Of course, the truth values of these four compound statements depend on $p$ and $q$. We will delve into this in the next post.
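As a preview, the dependence of the four compound statements on $p$ and $q$ can be tabulated mechanically. The sketch below is only an illustration under the standard truth-table semantics of the connectives; the names `CONNECTIVES` and `truth_table` are hypothetical helpers of mine, not part of any library.

```python
from itertools import product

# Each connective is modelled as a function of (p, q).
# Negation ignores q, reflecting that its arity is 1.
CONNECTIVES = {
    "p and q": lambda p, q: p and q,
    "p or q": lambda p, q: p or q,
    "p -> q": lambda p, q: (not p) or q,
    "not p": lambda p, q: not p,
}

def truth_table():
    """Return one row (a dict) per assignment of truth values to p and q."""
    table = []
    for p, q in product([True, False], repeat=2):
        row = {"p": p, "q": q}
        for name, connective in CONNECTIVES.items():
            row[name] = connective(p, q)
        table.append(row)
    return table

for row in truth_table():
    # Print each row compactly, using T/F for the truth values.
    print(" | ".join(f"{name}: {'T' if value else 'F'}" for name, value in row.items()))
```

Each printed row fixes truth values for $p$ and $q$ and then lists the resulting truth value of every compound statement.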

Second, we don’t really need all four basic operators. Two of them, viz. $\rightarrow$ and $\neg$, suffice for all logical purposes. This means all statements involving $\wedge$ and/or $\vee$ can be “converted” to ones that involve only $\rightarrow$ and $\neg$; for instance, $p \vee q \equiv \neg p \rightarrow q$ and $p \wedge q \equiv \neg (p \rightarrow \neg q)$. However, we could equally well choose the “minimal” set $\{ \wedge, \,\neg \}$ instead of $\{ \rightarrow, \neg \}$. In fact, there are lots of other possible combinations of operators that can serve our purpose equally well. Which minimal set of operators we choose depends sometimes on personal taste and at other times on practical considerations. So, for example, in designing circuits in computer hardware, a commonly used minimal operator set is $\{ \uparrow \}$, where $p \uparrow q \equiv \neg (p \wedge q)$ (the NAND operator, also known as the Sheffer stroke). This single operator is enough to express all the others.
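These claims about “minimal” operator sets are easy to check by exhaustion. The following Python sketch rebuilds $\wedge$ and $\vee$ from $\{ \rightarrow, \neg \}$, and then rebuilds $\neg$, $\wedge$, $\vee$ and $\rightarrow$ from the single operator defined by $\neg (p \wedge q)$ (NAND). It assumes the standard truth-table semantics, and all the function names are my own illustrative choices.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    # Material implication under the usual truth-table reading.
    return (not p) or q

def nand(p: bool, q: bool) -> bool:
    # The single operator defined by not (p and q).
    return not (p and q)

# Conjunction and disjunction using only implication and negation.
def and_via_implies(p, q):
    return not implies(p, not q)

def or_via_implies(p, q):
    return implies(not p, q)

# All four basic connectives using only NAND.
def not_via_nand(p):
    return nand(p, p)

def and_via_nand(p, q):
    return nand(nand(p, q), nand(p, q))

def or_via_nand(p, q):
    return nand(nand(p, p), nand(q, q))

def implies_via_nand(p, q):
    return nand(p, nand(q, q))

# Exhaustively compare each rebuilt connective with the direct definition.
all_conversions_hold = all(
    and_via_implies(p, q) == (p and q)
    and or_via_implies(p, q) == (p or q)
    and not_via_nand(p) == (not p)
    and and_via_nand(p, q) == (p and q)
    and or_via_nand(p, q) == (p or q)
    and implies_via_nand(p, q) == implies(p, q)
    for p, q in product([True, False], repeat=2)
)
print("All conversions hold:", all_conversions_hold)
```

The NAND versions are longer than the originals, which is the usual trade-off: a smaller operator set means fewer kinds of building blocks but bulkier formulas.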

So, what have we got so far? Well, we have a formal notion of a statement (or proposition). We have access to propositional variables ($p, \, q, \, r$, etc.) that may be used to denote statements. We know how to create the negation of a given statement using the $\neg$ logical connective. We also know how to “connect” any two statements using conjunction, disjunction and material implication, which are symbolically represented by the logical connectives $\wedge, \, \vee$ and $\rightarrow$, respectively. And, lastly, given any two statements $p$ and $q$, we have defined what it means for the two to be logically equivalent to each other (which is symbolically represented by $\equiv$). Indeed, $p \equiv q$ if and only if ($p \rightarrow q \mbox{ and } q \rightarrow p$).

We shall see in later posts that the “small” formal system (for propositional calculus) we have built thus far is, in fact, quite powerful. We can, indeed, already employ quite a bit of it in “ordinary” mathematics. But more on this later!

I wish to use this part of the blog to quickly go through the basic elements of propositional calculus, and then later move on to predicate calculus in another part of the blog, followed by the fundamentals of relational algebra in yet another part. I might then go through the problem of query optimization in RDBMS after that. Let’s see how far this goes.
