You are currently browsing the category archive for the ‘Category Theory for Beginners’ category.

In our last post on category theory, we continued our exploration of universal properties, showing how they can be used to motivate the concept of natural transformation, the “right” notion of morphism $\phi: F \to G$ between functors $F, G$. In today’s post, I want to turn things around, applying the notion of natural transformation to explain generally what we mean by a universal construction. The key concept is the notion of representability, at the center of a circle of ideas which includes the Yoneda lemma, adjoint functors, monads, and other things — it won’t be possible to talk about all these things in detail (because I really want to return to Stone duality before long), but perhaps these notes will provide a key of entry into more thorough treatments.

Even for a fanatic like myself, it’s a little hard to see what would drive anyone to study category theory except a pretty serious “need to know” (there is a beauty and conceptual economy to categorical thinking, but I’m not sure that’s compelling enough motivation!). I myself began learning category theory on my own as an undergraduate; at the time I had only the vaguest glimmerings of a vast underlying unity to mathematics, but it was only after discovering the existence of category theory by accident (reading the introductory chapter of Spanier’s Algebraic Topology) that I began to suspect it held the answer to a lot of questions I had. So I got pretty fired-up about it then, and started to read Mac Lane’s Categories for the Working Mathematician. I think that even today this book remains the best serious introduction to the subject — for those who need to know! But category theory should be learned from many sources and in terms of its many applications. Happily, there are now quite a few resources on the Web and a number of blogs which discuss category theory (such as The Unapologetic Mathematician) at the entry level, with widely differing applications in mind. An embarrassment of riches!

Anyway, to return to today’s topic. Way back when, when we were first discussing posets, most of our examples of posets were of a “concrete” nature: sets of subsets of various types, ordered by inclusion. In fact, we went a little further and observed that any poset could be represented as a concrete poset, by means of a “Dedekind embedding” (bearing a familial resemblance to Cayley’s lemma, which says that any group can be represented concretely, as a group of permutations). Such concrete representation theorems are extremely important in mathematics; in fact, this whole series is a trope on the Stone representation theorem, that every Boolean algebra is an algebra of sets! With that, I want to discuss a representation theorem for categories, where every (small) category can be explicitly embedded in a concrete category of “structured sets” (properly interpreted). This is the famous Yoneda embedding.

This requires some preface. First, we need the following fundamental construction: for every category $C$ there is an opposite category $C^{op}$, having the same classes $O, M$ of objects and morphisms as $C$, but with domain and codomain switched ($\mbox{dom}^{op} := \mbox{cod}: M \to O$, and $\mbox{cod}^{op} := \mbox{dom}: M \to O$). The function $O \to M: A \mapsto 1_A$ is the same in both cases, but we see that the class of composable pairs of morphisms is modified:

$(f, g) \in C^{op}_2$ [is a composable pair in $C^{op}$] if and only if $(g, f) \in C_2$

and accordingly, we define composition of morphisms in $C^{op}$ in the order opposite to composition in $C$:

$(g \circ f)^{op} := f \circ g$ in $C$.

Observation: The categorical axioms are satisfied in the structure $C^{op}$ if and only if they are in $C$; also, $(C^{op})^{op} = C$.

This observation is the underpinning of a Principle of Duality in the theory of categories (extending the principle of duality in the theory of posets). As the construction of opposite categories suggests, the dual of a sentence expressed in the first-order language of category theory is obtained by reversing the directions of all arrows and the order of composition of morphisms, but otherwise keeping the logical structure the same. Let me give a quick example:

Definition: Let $X_1, X_2$ be objects in a category $C$. A coproduct of $X_1$ and $X_2$ consists of an object $X$ and maps $i_1: X_1 \to X$, $i_2: X_2 \to X$ (called injection or coprojection maps), satisfying the universal property that given an object $Y$ and maps $f_1: X_1 \to Y$, $f_2: X_2 \to Y$, there exists a unique map $f: X \to Y$ such that $f_1 = f \circ i_1$ and $f_2 = f \circ i_2$. $\Box$

This notion is dual to the notion of product. (Often, one indicates the dual notion by appending the prefix “co” — except of course if the “co” prefix is already there; then one removes it.) In the category of sets, the coproduct of two sets $X_1, X_2$ may be taken to be their disjoint union $X_1 + X_2$, where the injections $i_1, i_2$ are the inclusion maps of $X_1, X_2$ into $X_1 + X_2$ (exercise).

Exercise: Formulate the notion of coequalizer (by dualizing the notion of equalizer). Describe the coequalizer of two functions $\displaystyle f, g: X \stackrel{\to}{\to} Y$ (in the category of sets) in terms of equivalence classes. Then formulate the notion dual to that of monomorphism (called an epimorphism), and by a process of dualization, show that in any category, coequalizers are epic.

Principle of duality: If a sentence expressed in the first-order theory of categories is provable in the theory, then so is the dual sentence. Proof (sketch): A proof $P$ of a sentence proceeds from the axioms of category theory by applying rules of inference. The dualization of $P$ proves the dual sentence by applying the same rules of inference but starting from the duals of the categorical axioms. A formal proof of the Observation above shows that collectively, the set of categorical axioms is self-dual, so we are done. $\Box$

Next, we introduce the all-important hom-functors. We suppose that $C$ is a locally small category, meaning that the class of morphisms $g: c \to d$ between any two given objects $c, d$ is small, i.e., is a set as opposed to a proper class. Even for large categories, this condition is just about always satisfied in mathematical practice (although there is the occasional baroque counterexample, like the category of quasitopological spaces).

Let $Set$ denote the category of sets and functions. Then, there is a functor

$\hom_C: C^{op} \times C \to Set$

which, at the level of objects, takes a pair of objects $(c, d)$ to the set $\hom(c, d)$ of morphisms $g: c \to d$ (in $C$) between them. It takes a morphism $(f: c \to c', h: d \to d')$ of $C^{op} \times C$ (that is to say, a pair of morphisms $(f: c' \to c, h: d \to d')$ of $C$) to the function

$\hom_C(f, h): \hom(c, d) \to \hom(c', d'): g \mapsto hgf.$

Using the associativity and identity axioms in $C$, it is not hard to check that this indeed defines a functor $\hom_C: C^{op} \times C \to Set$. It generalizes the truth-valued pairing $P^{op} \times P \to \mathbf{2}$ we defined earlier for posets.

Now assume $C$ is small. From last time, there is a bijection between functors

$\displaystyle \frac{h: C^{op} \times C \to Set}{f: C \to Set^{C^{op}}}$

and by applying this bijection to $\hom_C: C^{op} \times C \to Set$, we get a functor

$y_C: C \to Set^{C^{op}}.$

This is the famous Yoneda embedding of the category $C$. It takes an object $c$ to the hom-functor $\hom_C(-, c): C^{op} \to Set$. This hom-functor can be thought of as a structured, disciplined way of considering the totality of morphisms mapping into the object $c$, and has much to do with the Yoneda Principle we stated informally last time (and which we state precisely below).

• Remark: We don’t need $C$ to be small to talk about $\hom_C(-, c)$; local smallness will do. The only place we ask that $C$ be small is when we are considering the totality of all functors $C^{op} \to Set$, as forming a category $\displaystyle Set^{C^{op}}$.

Definition: A functor $F: C^{op} \to Set$ is representable (with representing object $c$) if there is a natural isomorphism $\displaystyle \hom_C(-, c) \stackrel{\sim}{\to} F$ of functors.

The concept of representability is key to discussing what is meant by a universal construction in general. To clarify its role, let’s go back to one of our standard examples.

Let $c_1, c_2$ be objects in a category $C$, and let $F: C^{op} \to Set$ be the functor $\hom(-, c_1) \times \hom(-, c_2)$; that is, the functor which takes an object $b$ of $C$ to the set $\hom(b, c_1) \times \hom(b, c_2)$. Then a representing object for $F$ is a product $c_1 \times c_2$ in $C$. Indeed, the isomorphism between sets $\hom(b, c_1 \times c_2) \cong \hom(b, c_1) \times \hom(b, c_2)$ simply recapitulates that we have a bijection

$\displaystyle \frac{b \to c_1 \times c_2}{b \to c_1 \qquad b \to c_2}$

between morphisms into the product and pairs of morphisms. But wait, not just an isomorphism: we said a natural isomorphism (between functors in the argument $b$) — how does naturality figure in?

Enter stage left the celebrated

Yoneda Lemma: Given a functor $F: C^{op} \to Set$ and an object $c$ of $C$, natural transformations $\phi: \hom(-, c) \to F$ are in (natural!) bijection with elements $\xi \in F(c)$.

Proof: We apply the “Yoneda trick” introduced last time: probe the representing object $c$ with the identity morphism, and see where $\phi$ takes it: put $\xi = \phi_c(1_c)$. Incredibly, this single element $\xi$ determines the rest of the transformation $\phi$: by chasing the element $1_c \in \hom(c, c)$ around the diagram

                phi_c
hom(c, c) -----> Fc
|            |
hom(f, c) |            | Ff
V            V
hom(b, c) -----> Fb
phi_b

(which commutes by naturality of $\phi$), we see for any morphism $f: b \to c$ in $\hom(b, c)$ that $\phi_b(f) = F(f)(\xi)$. That the bijection

$\displaystyle \frac{\xi: 1 \to F(c)}{\phi: \hom(-, c) \to F}$

is natural in the arguments $F, c$ we leave as an exercise. $\Box$

Returning to our example of the product $c_1 \times c_2$ as representing object, the Yoneda lemma implies that the natural bijection

$\displaystyle \phi_b: \hom(b, c_1 \times c_2) \cong \hom(b, c_1) \times \hom(b, c_2)$

is induced by the element $\xi = \phi_{c_1 \times c_2}(1_{c_1 \times c_2})$, and this element is none other than the pair of projection maps

$\xi = (\pi_1: c_1 \times c_2 \to c_1, \pi_2: c_1 \times c_2 \to c_2).$

In summary, the Yoneda lemma guarantees that a hom-representation $\phi: \hom(-, c) \cong F$ of a functor is, by the naturality assumption, induced in a uniform way from a single “universal” element $\xi \in F(c)$. All universal constructions fall within this general pattern.

Example: Let $C$ be a category with products, and let $c, d$ be objects. Then a representing object for the functor $\hom(- \times c, d): C^{op} \to Set$ is an exponential $d^c$; the universal element $\xi \in \hom(d^c \times c, d)$ is the evaluation map $d^c \times c \to d$.

Exercise: Let $\displaystyle f, g: x \stackrel{\to}{\to} y$ be a pair of parallel arrows in a category $C$. Describe a functor $F: C^{op} \to Set$ which is represented by an equalizer of this pair (assuming one exists).

Exercise: Dualize the Yoneda lemma by considering hom-functors $\hom_C(c, -): C \to Set$. Express the universal property of the coproduct in terms of representability by such hom-functors.

The Yoneda lemma has a useful corollary: for any (locally small) category $C$, there is a natural isomorphism

$\displaystyle \frac{\hom_C(-, a) \to \hom_C(-, b)}{a \to b}$

between natural transformations between hom-functors and morphisms in $C$. Using $C(a, b)$ as alternate notation for the hom-set, the action of the Yoneda embedding functor $y_C$ on morphisms gives an isomorphism between hom-sets

$\displaystyle C(a, b) \stackrel{\sim}{\to} Set^{C^{op}}(y_C a, y_C b);$

the functor $y_C$ is said in that case to be fully faithful (faithful means this action on morphisms is injective for all $a, b$, and full means the action is surjective for all $a, b$). The Yoneda embedding $y_C$ thus maps $C$ isomorphically onto the category of hom-functors $y_C a = \hom_C(-, a)$ valued in the category $Set$.

It is illuminating to work out the meaning of this last statement in special cases. When the category $C$ is a group $G$ (that is, a category with exactly one object $\bullet$ in which every morphism is invertible), then functors $F: G^{op} \to Set$ are tantamount to sets $X$ equipped with a group homomorphism $G^{op} \to \hom(X, X)$, i.e., a left action of $G^{op}$, or a right action of $G$. In particular, $\hom(-, \bullet): G^{op} \to Set$ is the underlying set of $G$, equipped with the canonical right action $\rho: G \to \hom(G, G)$, where $\rho(g)(h) = hg$. Moreover, natural transformations between functors $G^{op} \to Set$ are tantamount to morphisms of right $G$-sets. Now, the Yoneda embedding

$y_G: G \to Set^{G^{op}}$

identifies any abstract group $G$ with a concrete group $y_G(G)$, i.e., with a group of permutations — namely, exactly those permutations on $G$ which respect the right action of $G$ on itself. This is the sophisticated version of Cayley’s theorem in group theory. If on the other hand we take $C$ to be a poset, then the Yoneda embedding is tantamount to the Dedekind embedding we discussed in the first lecture.

Tying up a loose thread, let us now formulate the “Yoneda principle” precisely. Informally, it says that an object is determined up to isomorphism by the morphisms mapping into it. Using the hom-functor $\hom(-, c)$ to collate the morphisms mapping into $c$, the precise form of the Yoneda principle says that an isomorphism between representables $\hom(-, c) \to \hom(-, d)$ corresponds to a unique isomorphism $c \to d$ between objects. This follows easily from the Yoneda lemma.

But far and away, the most profound manifestation of representability is in the notion of an adjoint pair of functors. “Free constructions” give a particularly ubiquitous class of examples; the basic idea will be explained in terms of free groups, but the categorical formulation applies quite generally (e.g., to free monoids, free Boolean algebras, free rings = polynomial algebras, etc., etc.).

If $X$ is a set, the free group (q.v.) generated by $X$ is, informally, the group $FX$ whose elements are finite “words” built from “literals” $a, b, c, \ldots$ which are the elements of $X$ and their formal inverses, where we identify a word with any other gotten by introducing or deleting appearances of consecutive literals $a a^{-1}$ or $a^{-1}a$. Janis Joplin said it best:

Freedom’s just another word for nothin’ left to lose…

— there are no relations between the generators of $FX$ beyond the bare minimum required by the group axioms.

Categorically, the free group $FX$ is defined by a universal property; loosely speaking, for any group $G$, there is a natural bijection between group homomorphisms and functions

$\displaystyle \frac{FX \to G}{X \to UG}$

where $UG$ denotes the underlying set of the group. That is, we are free to assign elements of $G$ to elements of $X$ any way we like: any function $f: X \to UG$ extends uniquely to a group homomorphism $\hat{f}: FX \to G$, sending a word $x_1 x_2 \ldots x_n$ in $FX$ to the element $f(x_1)f(x_2) \ldots f(x_n)$ in $G$.

Using the usual Yoneda trick, or the dual of the Yoneda trick, this isomorphism is induced by a universal function $i: X \to UFX$, gotten by applying the bijection above to the identity map $id: FX \to FX$. Concretely, this function takes an element $x \in X$ to the one-letter word $x \in UFX$ in the underlying set of the free group. The universal property states that the bijection above is effected by composing with this universal map:

$\displaystyle \hom_{Grp}(FX, G) \to \hom_{Set}(UFX, UG) \stackrel{\hom(i, UG)}{\to} \hom_{Set}(X, UG)$

where the first arrow refers to the action of the underlying-set or forgetful functor $U: Grp \to Set$, mapping the category of groups to the category of sets ($U$ “forgets” the fact that homomorphisms $f: G \to H$ preserve group structure, and just thinks of them as functions $Uf: UG \to UH$).

• Remark: Some people might say this a little less formally: that the original function $f: X \to G$ is retrieved from the extension homomorphism $\hat{f}: FX \to G$ by composing with the canonical injection of the generators $X \to FX$. The reason we don’t say this is that there’s a confusion of categories here: properly speaking, $FX \to G$ belongs to the category of groups, and $X \to G$ to the category of sets. The underlying-set functor $U: Grp \to Set$ is a device we apply to eliminate the confusion.

In different words, the universal property of free groups says that the functor $\hom_{Set}(X, U-): Grp \to Set$, i.e., the underlying functor $U: Grp \to Set$ followed by the hom-functor $\hom(X, -): Set \to Set$, is representable by the free group $FX$: there is a natural isomorphism of functors from groups to sets:

$Grp(FX, -) \stackrel{\sim}{\to} Set(X, U-).$

Now, the free group $FX$ can be constructed for any set $X$. Moreover, the construction is functorial: defines a functor $F: Set \to Grp$. This is actually a good exercise in working with universal properties. In outline: given a function $f: X \to Y$, the homomorphism $Ff: FX \to FY$ is the one which corresponds bijectively to the function

$\displaystyle X \stackrel{f}{\to} Y \stackrel{i_Y}{\to} UFY,$

i.e., $Ff$ is defined to be the unique map $h$ such that $Uh \circ i_X = i_Y \circ f$.

Proposition: $F: Set \to Grp$ is functorial (i.e., preserves morphism identities and morphism composition).

Proof: Suppose $f: X \to Y$, $g: Y \to Z$ is a composable pair of morphisms in $Set$. By universality, there is a unique map $h: FX \to FZ$, namely $F(g \circ f)$, such that $Uh \circ i_X = i_Z \circ (g \circ f)$. But $Fg \circ Ff$ also has this property, since

$U(Fg \circ Ff) \circ i_X = UFg \circ UFf \circ i_X = UFg \circ i_Y \circ f = i_Z \circ g \circ f$

(where we used functoriality of $U$ in the first equation). Hence $F(g \circ f) = Fg \circ Ff$. Another universality argument shows that $F$ preserves identities. $\Box$

Observe that the functor $F$ is rigged so that for all morphisms $f: X \to Y$,

$UFf \circ i_X = i_Y \circ f.$

That is to say, that there is only one way of defining $F$ so that the universal map $i_X$ is (the component at $X$ of) a natural transformation $1_{Set} \to UF$!

The underlying-set and free functors $U: Grp \to Set$, $F: Set \to Grp$ are called adjoints, generalizing the notion of adjoint in truth-valued matrix algebra: we have an isomorphism

$\hom_{Grp}(FX, Y) \cong \hom_{Set}(X, UY)$

natural in both arguments $X, Y$. We say that $F$ is left adjoint to $U$, or dually, that $U$ is right adjoint to $F$, and write $F \dashv U$. The transformation $i: 1 \to UF$ is called the unit of the adjunction.

Exercise: Define the construction dual to the unit, called the counit, as a transformation $\varepsilon: FU \to 1$. Describe this concretely in the case of the free-underlying adjunction $F \dashv U$ between sets and groups.

What makes the concept of adjoint functors so compelling is that it combines representability with duality: the manifest symmetry of an adjunction $\hom(FX, Y) \cong \hom(X, GY)$ means that we can equally well think of $GY$ as representing $\hom(F-, Y)$ as we can $FX$ as representing $\hom(X, G-)$. Time is up for today, but we’ll be seeing more of adjunctions next time, when we resume our study of Stone duality.

[Tip of the hat to Robert Dawson for the Janis Joplin quip.]

I wish to bring the attention of our readers to the $36^{th}$ Carnival of Mathematics hosted by Charles at Rigorous Trivialities. I guess most of you already know about it. Among other articles/posts, one of Todd’s recent post Basic Category Theory I is part of the carnival. He followed it up with another post titled Basic Category Theory II. There will be a third post on the same topic some time soon. This sub-series of posts on basic category theory, if you recall, is part of the larger series on Stone Duality, which all began with Toward Stone Duality: Posets and Meets. Hope you enjoy the Carnival!

I’ll continue then with this brief subseries on category theory. Today I want to talk more about universal properties, and about the notion of natural transformation. Maybe not today, but soon at any rate, I want to tie all this in with the central concept of representability, which leads directly and naturally to the powerful and fundamental idea of adjoint functors. This goes straight to the very heart of category theory.

The term “natural”, often bandied about by mathematicians, is perhaps an overloaded term (see the comments here for a recent disagreement about certain senses of the word). I don’t know the exact history of the word as used by mathematicians, but by the 1930s and 40s the description of something as “natural” was part of the working parlance of many mathematicians (in particular, algebraic topologists), and it is to the great credit of Eilenberg and Mac Lane that they sought to give the word a precise mathematical sense. A motivating problem in their case was to prove a universal coefficient theorem for Cech cohomology, for which they needed certain comparison maps (transformations) which cohered by making certain diagrams commute (which was the naturality condition). In trying to precisely define this concept of naturality, they were led to the concept of a “functor” and then, to define the concept of functor, they were led back to the notion of category! And the rest, as they say, is history.

More on naturality in a moment. Let me first give a few more examples of universal constructions. Last time we discussed the general concept of a cartesian product — obviously in honor of Descartes, for his tremendous idea of the method of coordinates and analytic geometry.

But of course products are only part of the story: he was also interested in the representation of equations by geometric figures: for instance, representing an equation $y = f(x)$ as a subset of the plane. In the language of category theory, the variable $y$ denotes the second coordinate or second projection map $\pi_2: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$, and $f(x)$ denotes the composite of the first projection map followed by some given map $f$:

$\displaystyle \mathbb{R} \times \mathbb{R} \stackrel{\pi_1}{\to} \mathbb{R} \stackrel{f}{\to} \mathbb{R}.$

The locus of the equation $y = f(x)$ is the subset of $\mathbb{R} \times \mathbb{R}$ where the two morphisms $\pi_2$ and $f \circ \pi_1$ are equal, and we want to describe the locus $L$ in a categorical way (i.e., in a way which will port over to other categories).

Definition: Given a pair of morphisms

$\displaystyle f, g: X \stackrel{\to}{\to} Y$

their equalizer consists of an object $L$ and a map $e: L \to X$, such that $f \circ e = g \circ e$, and satisfying the following universal property: for any map $h: A \to X$ such that $f \circ h = g \circ h$, there exists a unique map $j: A \to L$ such that $h = e \circ j$ (any map $h: A \to X$ that equalizes $f$ and $g$ factors in a unique way through the equalizer $e: L \to X$). $\Box$

Another way of saying it is that there is a bijection between $(f, g)$-equalizing maps $h: A \to X$ and maps $j: A \to L$,

$\displaystyle \frac{h: A \to X \mbox{ such that } fh = gh}{j: A \to L \qquad },$

effected by composing such maps $j$ with the universal $(f, g)$-equalizing map $e: L \to X$.

Exercise: Apply a universality argument to show that any two equalizers of a given pair of maps $(f, g)$ are isomorphic.

It is not immediately apparent from the definition that an equalizer $e: L \to X$ describes a “subthing” (e.g., a subset) of $X$, but then we haven’t even discussed subobjects. The categorical idea of subobject probably takes some getting used to anyway, so I’ll be brief. First, there is the idea of a monomorphism (or a “mono” for short), which generalizes the idea of an injective or one-to-one function. A morphism $f: S \to T$ is monic if for all $g, h: A \to S$, $f \circ g = f \circ h$ implies $g = h$. Monos with codomain $T$ are preordered by a relation $\leq$, where

$(e: R \to T) \leq (f: S \to T)$

if there exists $g: R \to S$ such that $e = f \circ g$. (Such a $g$ is unique since $f$ is monic, so it doesn’t need to be specified particularly; also this $g$ is easily seen to be monic [exercise].) Then we say that two monics $e, f$ mapping into $T$ name the same subobject of $T$ if $e \leq f$ and $f \leq e$; in that case the mediator $g$ is an isomorphism. Writing $e \sim f$ to denote this condition, it is standard that $\sim$ is an equivalence relation.

Thus, a subobject of $X$ is an equivalence class of monos into $X$. So when we say an equalizer $e: L \to X$ of maps $f, g: X \to Y$ defines a subobject of $X$, all we really mean is that $e$ is monic. Proof: Suppose $eh = ej$ for maps $h, j: A \to X$. Since $fe = ge$, we have $f(ej) = g(ej)$ for instance. By definition of equalizer, this means there exists a unique map $k: A \to X$ for which $eh = ej = ek$. Uniqueness then implies $h, j$ are equal to this self-same $k$, so $h = j$ and we are done.

Let me turn to another example of a universal construction, which has been used in one form or another for centuries: that of “function space”. For example, in the calculus of variations, one may be interested in the “space” of all (continuous) paths $\alpha: I = \left[0, 1\right] \to X$ in a physical space $X$, and in paths which minimize “action” (principle of least action).

If $X$ is a topological space, then one is faced with a variety of choices for topologizing the path space (denoted $X^I$). How to choose? As in our discussion last time of topologizing products, our view here is that the “right” topology will be the unique one which ensures that an appropriate universal property is satisfied.

To get started on this: the points of the path space $X^I$ are of course paths $\alpha: I \to X$, and paths in the path space, $I \to X^I$, sending each $s \in I$ to a path $\alpha_s: I \to X$, should correspond to homotopies between paths, that is continuous maps $h: I \times I \to X$; the idea is that $h(s, t) := \alpha_s(t)$. Now, just knowing what paths in a space $Y = X^I$ look like (homotopies between paths) may not be enough to pin down the topology on $Y$, but: suppose we now generalize. Suppose we decree that for any space $Z$, the continuous maps $Z \to X^I$ should correspond exactly to continuous maps $h: Z \times I \to X$, also called homotopies. Then that is enough to pin down the topology on $X^I$. (We could put it this way: we use general spaces $Z$ to probe the topology of $X^I$.)

This principle applies not just to topology, but is extremely general: it applies to any category! I’ll state it very informally for now, and more precisely later:

Yoneda principle: to determine any object $Y$ up to isomorphism, it suffices to understand what general maps $Z \to Y$ mapping into it look like.

For instance, a product $X_1 \times X_2$ is determined up to isomorphism by knowing what maps $Z \to X_1 \times X_2$ into it look like [they look like pairs of maps $(Z \to X_1, Z \to X_2)$]. In the first lecture in the Stone duality, we stated the Yoneda principle just for posets; now we are generalizing it to arbitrary categories.

In the case at hand, we would like to express the bijection between continuous maps

$\displaystyle \frac{f: Z \to X^I}{h: Z \times I \to X}$

as a working universal property for the function space $X^I$. There is a standard “Yoneda trick” for doing this: probe the thing we seek a universal characterization of with the identity map, here $1_{X^I}: X^I \to X^I$. Passing to the other side of the bijection, the identity map corresponds to a map

$ev: X^I \times I \to X$

and this is the “universal map” we need. (I called it $ev$ because in this case it is the evaluation map, which maps a pair (path $\alpha: I \to X$, point $t \in I$) to $\alpha(t)$, i.e., evaluates $\alpha$ at $t$.)

Here then is the universal characterization of the path space: a space $X^I$ equipped with a continuous map $ev: X^I \times I \to X$, satisfying the following universal property: given any continuous map $h: Z \times I \to X$, there exists a unique continuous map $f: Z \to X^I$ such that $h$ is retrieved as the composite

$\displaystyle Z \times I \stackrel{f \times 1_I}{\to} X^I \times I \stackrel{ev}{\to} X$

(for the first arrow in the composite, cf. the exercise stated at the end of the last lecture).

Exercise: Formulate a universality argument that this universal property determines $X^I$ up to isomorphism.

Remark: Incidentally, for any space $X$, such a path space exists; its topology turns out to be the topology of “uniform convergence”. We can pose a similar universal definition of any function space $X^Y$ (replacing $I$ by $Y$, mutatis mutandis); a somewhat non-trivial result is that such a function space exists for all $X$ if and only if $Y$ is locally compact; the topology on $X^Y$ is then the so-called “compact-open” topology.

But why stop there? A general notion of “exponential” object is available for any category $C$ with cartesian products: for objects $c, d$ of $C$, an exponential $d^c$ is an object equipped with a map $ev: d^c \times c \to d$, such that for any $h: b \times c \to d$, there exists a unique $f: b \to d^c$ such that $h$ is retrieved as the composite

$\displaystyle b \times c \stackrel{f \times 1_c}{\to} d^c \times c \stackrel{ev}{\to} d.$

Example: If the category is a meet-semilattice, then (assuming $x^y$ exists) there is a bijection or equivalence which takes the form

$\displaystyle \frac{a \leq x^y}{a \wedge y \leq x}$ iff

But wait, we’ve seen this before: $x^y$ is what we earlier called the implication $y \Rightarrow x$. So implication is really a function space object! $\Box$

Okay, let me turn now to the notion of natural transformation. I described the original discovery (or invention) of categories as a kind of reverse engineering (functors were invented to talk about natural transformations, and categories were invented to talk about functors). Moving now in the forward direction, the rough idea can be stated as a progression:

• The notion of functor is appropriately defined as a morphism between categories,
• The notion of natural transformation is appropriately defined as a morphism between functors.

That seems pretty bare-bones: how do we decide what the appropriate notion of morphism between functors should be? One answer is by pursuing an analogy:

• As a space $Y^X$ of continuous functions $X \to Y$ is to the category of topological spaces, so a category $D^C$ of functors $C \to D$ should be to the category of categories.

That is, we already “know” (or in a moment we’ll explain) that the objects of this alleged exponential category $D^C$ are functors $F: C \to D$. Since $D^C$ is defined by a universal property, it is uniquely determined up to isomorphism. This in turn will uniquely determine what the “right” notion of morphism between functors $F, G: C \to D$ should be: morphisms $F \to G$ in the exponential $D^C$! Then, to discover the nature of these morphisms, we employ an appropriate “probe”.

To carry this out, I’ll need two special categories. First, the category $\mathbf{1}$ denotes a (chosen) category with exactly one object and exactly one morphism (necessarily the identity morphism). It satisfies the universal property that for any category $C$, there exists a unique functor $C \to \mathbf{1}$. It is called a terminal category (for that reason). It can also be considered as an empty product of categories.

Proposition: For any category $C$, there is an isomorphism $\mathbf{1} \times C \cong C$.

Proof: Left to the reader. It can be proven either directly, or by applying universal properties. $\Box$

The category $\mathbf{1}$ can also be considered an “object probe”, in the sense that a functor $\mathbf{1} \to C$ is essentially the same thing as an object of $C$ (just look where the object of $\mathbf{1}$ goes to in $C$).

For example, to probe the objects of the exponential category $D^C$, we investigate functors $\mathbf{1} \to D^C$. By the universal property of exponentials $D^C$, these are in bijection with functors $\mathbf{1} \times C \to D$. By the proposition above, these are in bijection with functors $C \to D$. So objects of $D^C$ are necessarily tantamount to functors $C \to D$ (and so we might as well define them as such).

Now we want to probe the morphisms of $D^C$. For this, we use the special category given by the poset $\mathbf{2} = \{0 \leq 1\}$. For if $X$ is any category and $f: x \to y$ is a morphism of $X$, we can define a corresponding functor $F: \mathbf{2} \to X$ such that $F(0) = x$, $F(1) = y$, and $F$ sends the morphism $0 \leq 1$ to $f$. Thus, such functors $F$ are in bijection with morphisms of $X$. Speaking loosely, we could call the category $\mathbf{2}$ the “generic morphism”.

Thus, to probe the morphisms in the category $D^C$, we look at functors $\mathbf{2} \to D^C$. In particular, if $F, G$ are functors $C \to D$, let us consider functors $\phi: \mathbf{2} \to D^C$ such that $\phi(0) = F$ and $\phi(1) = G$. By the universal property of $D^C$, these are in bijection with functors $\eta: \mathbf{2} \times C \to D$ such that the composite

$\displaystyle C \cong \mathbf{1} \times C \stackrel{0 \times 1_C}{\to} \mathbf{2} \times C \stackrel{\eta}{\to} D$

equals $F$, and the composite

$\displaystyle C \cong \mathbf{1} \times C \stackrel{1 \times 1_C}{\to} \mathbf{2} \times C \stackrel{\eta}{\to} D$

equals $G$. Put more simply, this says $\eta(0, c) = F(c)$ and $\eta(1, c) = G(c)$ for objects $c$ of $C$, and $\eta(1_0, f) = F(f)$ and $\eta(1_1, f) = G(f)$ for morphisms $f$ of $C$.

The remaining morphisms of $\mathbf{2} \times C$ have the form $(0 \leq 1, f: c \to d)$. Introduce the following abbreviations:

1. $\phi_c := \eta(0 \leq 1, 1_c)$ for objects $c$ of $C$;
2. $\phi_f := \eta(0 \leq 1, f)$ for morphisms $f$ of $C$.

Since $\eta$ is a functor, it preserves morphism composition. We find in particular that since

$(1_1, f) \circ (0 \leq 1, 1_c) = (1_1 \circ (0 \leq 1), f \circ 1_c) = (0 \leq 1, f)$

$(0 \leq 1, 1_d) \circ (1_0, f) = ((0 \leq 1) \circ 1_0, 1_d \circ f) = (0 \leq 1, f)$

we have

$\eta(1_1, f) \circ \eta(0 \leq 1, 1_c) = \eta(0 \leq 1, f)$

$\eta(0 \leq 1, 1_d) \circ \eta(1_0, f) = \eta(0 \leq 1, f)$

or, using the abbreviations,

$G(f) \circ \phi_c = \phi_f = \phi_d \circ F(f).$

In particular, the data $\phi_f$ is redundant: it may be defined either as either side of the equation

$G(f) \circ \phi_c = \phi_d \circ F(f)$.

Exercise: Just on the basis of this last equation (for arbitrary morphisms $f$ and objects $c$ of $C$), prove that functoriality of $\eta$ follows.

This leads us at last to the definition of natural transformation:

Definition: Let $C, D$ be categories and let $F, G$ be functors from $C$ to $D$. A natural transformation $\phi: F \to G$ is an assignment of morphisms $\phi_c: F(c) \to G(c)$ in $D$ to objects $c$ of $C$, such that for every morphism $f: c \to d$, the following equation holds: $G(f) \circ \phi_c = \phi_d \circ F(f)$. $\Box$

Usually this equation is expressed in the form of a commutative diagram:

          F(f)
F(c) ---> F(d)
|         |
phi_c |         | phi_d
V   G(f)  V
G(c) ---> G(d)

which asserts the equality of the composites formed by following the paths from beginning (here $F(c)$) to end (here $G(d)$). (Despite the inconvenience in typesetting them, commutative diagrams as 2-dimensional arrays are usually easier to read and comprehend than line-by-line equations.) The commutative diagram says that the components $\phi_c$ of the transformation are coherent or compatible with all morphisms $f: c \to d$ of the domain category.

Remarks: Now that I’ve written this post, I’m a little worried that any first-timers to category theory reading this will find this approach to natural transformations a little hardcore or intimidating. In that case I should say that my intent was to make this notion seem as inevitable as possible: by taking seriously the analogy

function space: category of spaces :: functor category: category of categories

we are inexorably led to the “right” (the “natural”) notion of natural transformation as morphism between functors. But if this approach is for some a pedagogical flop, then I urge those readers just to forget it, or come back to it later. Just remember the definition of natural transformation we finally arrived at, and you should be fine. Grasping the inner meaning of fundamental concepts like this takes time anyway, and isn’t merely a matter of pure deduction.

I should also say that the approach involved a kind of leap of faith that these functor categories (the exponentials $D^C$) “exist”. To be sure, the analysis above shows clearly what they must look like if they do exist (objects are functors $C \to D$; morphisms are natural transformations as we’ve defined them), but actually there’s some more work to do: one must show they satisfy the universal property with respect to not just the two probing categories $\mathbf{1}$ and $\mathbf{2}$ that we used, but any category $E$.

A somewhat serious concern here is that our talk of exponential categories played pretty fast and loose at the level of mathematical foundations. There’s that worrying phrase “category of categories”, for starters. That particular phraseology can be avoided, but nevertheless, it must be said that in the case where $C$ is a large category (i.e., involving a proper class of morphisms), the collection of all functors from $C$ to $D$ is not a well-formed construction within the confines of GÃ¶del-Bernays-von Neumann set theory (it is not provably a class in general; in some approaches it could be called a “super-class”).

My own attitude toward these “problems” tends to be somewhat blasÃ©, almost dismissive: these are mere technicalities, sez I. The main concepts are right and robust and very fruitful, and there are various solutions to the technical “problem of size” which have been developed over the decades (although how satisfactory they are is still a matter of debate) to address the apparent difficulties. Anyway, I try not to worry about it much. But, for those fine upstanding citizens who do worry about these things, I’ll just state one set-theoretically respectable theorem to convey that at least conceptually, all is right with the world.

Definition: A category with finite products is cartesian closed if for any two objects $c, d$, there exists an exponential object $d^c$.

Theorem: The category of small categories is cartesian closed. $\Box$

After a long hiatus (sorry about that!), I would like to resume the series on Stone duality. You may recall I started this series by saying that my own point of view on mathematics is strongly informed by category theory, followed by a little rant about the emotional reactions that category theory seems to excite in many people, and that I wouldn’t be “blathering” about categories unless a strong organic need was felt for it. Well, it’s come to that point: to continue talking sensibly about Stone duality, I really feel some basic concepts of category theory are now in order. So: before we pick up the main thread again, I’ll be talking about categories up to the point of the central concept of adjoint pairs, generalizing what we’ve discussed before in the context of truth-valued matrix algebra.

I’ll start by firmly denouncing a common belief: that category theory is some arcane, super-abstract subject. I just don’t believe that’s a healthy way of looking at it. To me, categories are no more and no less abstract than groups, rings, fields, etc. — they are just algebraic structures of a certain type (and a not too complicated type at that). That said, they are particularly ubiquitous and useful structures, which can be studied either as small structures (for example, posets provide examples of categories, and so do groups), or to organize the study of general types of structure in the large (for example, the class of posets and poset morphisms forms a category). Just think of them that way: they are certain sorts of algebraic structures which crop up just about everywhere, and it is very useful to learn something about them.

Usually, the first examples one is shown are large categories, typically of the following sort. One considers the class of mathematical structures of a given type: it could be the class of groups, or of posets, or of Boolean algebras, etc. The elements of a general such class are given the neutral name “objects”. Then, we are also interested in how the objects $A, B, C, \ldots$ are related to each other, typically through transformations $f: A \to B$ which “preserve” the given type of structure. In the case of sets, transformations are just functions; in the case of groups, the transformations are group homomorphisms (which preserve group multiplication, inversion, and identities); in the case of vector spaces, they are linear transformations (preserving vector addition and scalar multiplication); in the case of topological spaces, they are continuous maps (preserving a suitable notion of convergence). In general, the transformations are given the neutral name “homomorphisms”, or more often just “morphisms” or “maps”.

In all of these cases, two morphisms $f: A \to B$, $g: B \to C$ compose to give a new morphism $g \circ f: A \to C$ (for example the composite of two group homomorphisms is a group homomorphism), and do so in an associative way ($h \circ (g \circ f) = (h \circ g) \circ f$), and also there is an identity morphism $1_A: A \to A$ for each object $A$ which behaves as identities should ($f \circ 1_A = f = 1_B \circ f$ for any morphism $f: A \to B$). A collection of objects, morphisms between them, together with an associative law of composition and identities, is called a category.

A key insight of category theory is that in general, important structural properties of objects $A, B, C, \ldots$ can be described purely in terms of general patterns or diagrams of morphisms and their composites. By means of such general patterns, the same concept (like the concept of a product of two objects, or of a quotient object, or of a dual) takes on the same form in many different kinds of category, for many different kinds of structure (whether algebraic, or topological, or analytic, or some mixture thereof) — and this in large part gives category theory the power to unify and simplify the study of general mathematical structure. It came as quite a revelation to me personally that (to take one example) the general idea of a “quotient object” (quotient group, quotient space, etc.) is not based merely on vague family resemblances between different kinds of structure, but can be made absolutely precise and across the board, in a simple way. That sort of explanatory power and conceptual unification is what got me hooked!

In a nutshell, then, category theory is the study of commonly arising structures via general patterns or diagrams of morphisms, and the general application of such study to help simplify and organize large portions of mathematics. Let’s get down to brass tacks.

Definition: A category $C$ consists of the following data:

• A class $O$ of objects;
• A class $M$ of morphisms;
• A function $\mbox{dom}: M \to O$ which assigns to each morphism its domain, and a function $\mbox{cod}: M \to O$ which assigns to each morphism its codomain. If $f \in M$, we write $f: A \to B$ to indicate that $\mbox{dom}(f) = A$ and $\mbox{cod}(f) = B$.
• A function $\mbox{Id}: O \to M$, taking an object $A$ to a morphism $1_A: A \to A$, called the identity on $A$.

Finally, let $C_2$ denote the class of composable pairs of morphisms, i.e., pairs $(f, g) \in M \times M$ such that $\mbox{cod}(f) = \mbox{dom}(g)$. The final datum:

• A function $\mbox{comp}: C_2 \to M$, taking a composable pair $(f: A \to B, g: B \to C)$ to a morphism $g \circ f: A \to C$, called the composite of $f$ and $g$.

These data satisfy a number of axioms, some of which have already been given implicitly (e.g., $\mbox{dom}(g \circ f) = \mbox{dom}(f)$ and $\mbox{cod}(g \circ f) = \mbox{cod}(g)$). The ones which haven’t are

1. Associativity: $h \circ (g \circ f) = (h \circ g) \circ f$ for each composable triple $(f: A \to B, g: B \to C, h: C \to D)$.
2. Identity axiom: Given $f: A \to B$, $f \circ 1_A = f = 1_B \circ f$.

Sometimes we write $C_0$ for the class of objects, $C_1$ for the class of morphisms, and for $n > 1$, $C_n$ for the class of composable $n$-tuples of morphisms. $\Box$

Nothing in this definition says that objects of a category are “sets with extra structure” (or that morphisms preserve such structure); we are just thinking of objects as general “things” and depict them as nodes, and morphisms as arrows or directed edges between nodes, with a given law for composing them. The idea then is all about formal patterns of arrows and their compositions (cf. “commutative diagrams”). Vishal’s post on the notion of category had some picture displays of the categorical axioms, like associativity, which underline this point of view.

In the same vein, categories are used to talk about not just large classes of structures; in a number of important cases, the structures themselves can be viewed as categories. For example:

1. A preorder can be defined as a category for which there is at most one morphism $f: A \to B$ for any two objects $A, B$. Given there is at most one morphism from one object to another, there is no particular need to give it a name like $f$; normally we just write $a \leq b$ to say there is a morphism from $a$ to $b$. Morphism composition then boils down to the transitivity law, and the data of identity morphisms expresses the reflexivity law. In particular, posets (preorders which satisfy the antisymmetry law) are examples of categories.
2. A monoid is usually defined as a set $M$ equipped with an associative binary operation $(a, b) \mapsto a \cdot b$ and with a (two-sided) identity element $e$ for that operation. Alternatively, a monoid can be construed as a category with exactly one object. Here’s how it works: given a monoid $(M, \cdot, e)$, define a category where the class $O$ consists of a single object (which I’ll give a neutral name like $\bullet$; it doesn’t have to be any “thing” in particular; it’s just a “something”, it doesn’t matter what), and where the class of morphisms is defined to be the set $M$. Since there is only one object, we are forced to define $\mbox{dom}(a) = \bullet$ and $\mbox{cod}(a) = \bullet$ for all $a \in M$. In that case all pairs of morphisms are composable, and composition is defined to be the operation in $M$: $a \circ b := a \cdot b$. The identity morphism on $\bullet$ is defined to be $e$. We can turn the process around: given a category with exactly one object, the class of morphisms $M$ can be construed as a monoid in the usual sense.
3. A groupoid is a category in which every morphism is an isomorphism (by definition, an isomorphism is an invertible morphism, that is, a morphism $f: A \to B$ for which there exists a morphism $g: B \to A$ such that $g \circ f = 1_A$ and $f \circ g = 1_B$). For example, the category of finite sets and bijections between them is a groupoid. The category of topological spaces and homeomorphisms between them is a groupoid. A group is a monoid in which every element is invertible; hence a group is essentially the same thing as a groupoid with exactly one object.

Remark: The notion of isomorphism is important in category theory: we think of an isomorphism $f: A \to B$ as a way in which objects $A, B$ are the “same”. For example, if two spaces are homeomorphic, then they are indistinguishable as far as topology is concerned (any topological property satisfied by one is shared by the other). In general there may be many ways or isomorphisms to exhibit such “sameness”, but typically in category theory, if two objects satisfy the same structural property (called a universal property; see below), then there is just one isomorphism between them which respects that property. Those are sometimes called “canonical” or “god-given” isomorphisms; they are 100% natural, no artificial ingredients! $\Box$

Earlier I said that category theory studies mathematical structure in terms of general patterns or diagrams of morphisms. Let me give a simple example: the general notion of “cartesian product”. Suppose $X_1, X_2$ are two objects in a category. A cartesian product of $X_1$ and $X_2$ is an object $X$ together with two morphisms $\pi_1: X \to X_1$, $\pi_2: X \to X_2$ (called the projection maps), satisfying the following universal property: given any object $Y$ equipped with a map $f_i: Y \to X_i$ for $i = 1, 2$, there exists a unique map $f: Y \to X$ such that $f_i = \pi_i \circ f$ for $i = 1, 2$. (Readers not familiar with this categorical notion should verify the universal property for the cartesian product of two sets, in the category of sets and functions.)

I said “a” cartesian product, but any two cartesian products are the same in the sense of being isomorphic. For suppose both $(X, \pi_1, \pi_2)$ and $(X', \pi_1', \pi_2')$ are cartesian products of $X_1, X_2$. By the universal property of the first product, there exists a unique morphism $f: X' \to X$ such that $\pi_i' = \pi_i \circ f$ for $i = 1, 2$. By the universal property of the second product, there exists a unique morphism $g: X \to X'$ such that $\pi_i = \pi_i' \circ g$. These maps $f$ and $g$ are inverse to one another. Why? By the universal property, there is a unique map $\phi: X \to X$ (namely, $\phi = 1_X$) such that $\pi_i = \pi_i \circ \phi$ for $i = 1, 2$. But $\phi = f \circ g$ also satisfies these equations: $\pi_i = \pi_i \circ (f \circ g)$ (using associativity). So $1_X = f \circ g$ by the uniqueness clause of the universal property; similarly, $1_{X'} = g \circ f$. Hence $f: X \to X'$ is an isomorphism.

This sort of argument using a universal property is called a universality argument. It is closely related to what we dubbed the “Yoneda principle” when we studied posets.

So: between any two products $X, X'$ of $X_1$ and $X_2$, there is a unique isomorphism $f: X' \to X$ respecting the product structure; we say that any two products are canonically isomorphic. Very often one also has chosen products (a specific choice of product for each ordered pair $(X_1, X_2)$), as in set theory when we construe the product of two sets as a set of ordered pairs $\{(x_1, x_2): x_1 \in X_1, x_2 \in X_2\}$. We use $X_1 \times X_2$ to denote (the object part of) a chosen cartesian product of $(X_1, X_2)$.

Exercise: Use universality to exhibit a canonical isomorphism $\sigma: X_1 \times X_2 \to X_2 \times X_1$. This is called a symmetry isomorphism for the cartesian product.

Many category theorists (including myself) are fond of the following notation for expressing the universal property of products:

$\displaystyle \frac{f_1: Y \to X_1 \qquad f_2: Y \to X_2}{f = \langle f_1, f_2 \rangle: Y \to X_1 \times X_2}$

where the dividing line indicates a bijection between pairs of maps $(f_1, f_2)$ and single maps $f$ into the product, effected by composing $f$ with the pair of projection maps. We have actually seen this before: when the category is a poset, the cartesian product is called the meet:

$\displaystyle \frac{a \leq x \qquad a \leq y}{a \leq x \wedge y}$

In fact, a lot of arguments we developed for dealing with meets in posets extend to more general cartesian products in categories, with little change (except that instead of equalities, there will typically be canonical isomorphisms). For example, we can speak of a cartesian product of any indexed collection of objects $\{X_i\}_{i \in I}$: an object $\prod_{i \in I} X_i$ equipped with projection maps $\pi_i: \prod_{i \in I} X_i \to X_i$, satisfying the universal property that for every $I$-tuple of maps $f_i: Y \to X_i$, there exists a unique map $f: Y \to \prod_{i \in I} X_i$ such that $f_i = \pi_i \circ f$. Here we have a bijection between $I$-tuples of maps and single maps:

$\displaystyle \frac{(f_i: Y \to X_i)_{i \in I}}{f = \langle f_i \rangle_{i \in I}: Y \to \prod_{i \in I} X_i}$

By universality, such products are unique up to unique isomorphism. In particular, $(X_1 \times X_2) \times X_3$ is a choice of product of the collection $\{X_1, X_2, X_3\}$, as seen by contemplating the bijection between triples of maps and single maps

$\displaystyle \frac{\frac{f_1: Y \to X_1 \qquad f_2: Y \to X_2}{\langle f_1, f_2 \rangle: Y \to X_1 \times X_2} \qquad \frac{f_3: Y \to X_3}{f_3: Y \to X_3}}{f: Y \to (X_1 \times X_2) \times X_3}$

and similarly $X_1 \times (X_2 \times X_3)$ is another choice of product. Therefore, by universality, there is a canonical associativity isomorphism

$\alpha: (X_1 \times X_2) \times X_3 \to X_1 \times (X_2 \times X_3).$

Remark: It might be thought that in all practical cases, the notion of cartesian product (in terms of good old-fashioned sets of tuples) is clear enough; why complicate matters with categories? One answer is that it isn’t always clear from purely set-theoretic considerations what the right product structure is, and in such cases the categorical description gives a clear guide to what we really need. For example, when I was first learning topology, the box topology on the set-theoretic product $\prod_{i \in I} X_i$ seemed to me to be a perfectly natural choice of topology; I didn’t understand the general preference for what is called the “product topology”. (The open sets in the box topology are unions of products $\prod_{i \in I} U_i$ of open sets in the factors $X_i$. The open sets in the product topology are unions of such products where $U_i = X_i$ for all but finitely many $i \in I$.)

In retrospect, the answer is obvious: the product topology on $\prod_{i \in I} X_i$ is the smallest topology making all the projection maps $\pi_i$ continuous. This means that a function $f: Y \to \prod_{i \in I} X_i$ is continuous if and only if each $f_i = \pi_i \circ f: Y \to X_i$ is continuous: precisely the universal property we need. Similarly, in seeking to understand products or other constructions of more abstruse mathematical structures (schemes for instance), the categorical description is de rigeur in making sure we get it right. $\Box$

For just about any mathematical structure we can consider a category of such structures, and this applies to the notion of category itself. That is, we can consider a category of categories! (Sounds almost religious to me: category of categories, holy of holies, light of lights…)

• Remark: Like “set of sets”, the idea of category of categories taken to a naive extreme leads to paradoxes or other foundational difficulties, but there are techniques for dealing with these issues, which I don’t particularly want to discuss right now. If anyone is uncomfortable around these issues, a stopgap measure is to consider rather the category of small categories (a category has a class of objects and morphisms; a small category is where these classes are sets), within some suitable framework like the set theory of GÃ¶del-Bernays-von Neumann.

If categories are objects, the morphisms between them may be taken to be structure-preserving maps between categories, called “functors”.

Definition: If $C$ and $D$ are categories, a functor $F: C \to D$ consists of a function $F_0: C_0 \to D_0$ (between objects) and a function $F_1: C_1 \to D_1$ (between morphisms), such that

• $F_0(\mbox{dom}_C(f)) = \mbox{dom}_D(F_1(f))$ and $F_0(\mbox{cod}_C(f)) = \mbox{cod}_D(F_1(f))$, for each morphism $f \in C_1$ (i.e., $F$ preserves domains and codomains of morphisms);
• $F_1(1_A) = 1_{F_0(A)}$ for each object $A \in C_0$, and $F_1(g \circ f) = F_1(g) \circ F_1(f)$ for each composable pair $(f, g) \in C_2$ (i.e., $F$ preserves identity morphisms and composition of morphisms).

Normally we are not so fussy in writing $F_1(f)$ or $F_0(A)$; we write $F(f)$ and $F(A)$ for morphisms $f$ and objects $A$ alike. Sometimes we drop the parentheses as well. $\Box$

If $X, Y$ are groups or monoids regarded as one-object categories, then a functor between them amounts to a group or monoid homomorphism. If $X, Y$ are posets regarded as categories, then a functor between them amounts to a poset map. So no new surprises in these cases.

Exercise: Define a product $C \times D$ of two categories $C, D$, and verify that the definition satisfies the universal property of products in the “category of categories”.

Exercise: If a category $C$ has chosen products, show how a functor $C \times C \to C$ may be defined which takes a pair of objects $(c, d)$ to its product $c \times d$. (You need to define the morphism part $F_1$ of this functor; this will involve the universal property of products.)

I will write a series of posts as a way of gently introducing category theory to the ‘beginner’, though I will assume that the beginner will have some level of mathematical maturity. This series will be based on the the book, Conceptual Mathematics: A first introduction to categories by Lawvere and Schanuel. So, this won’t go into most of the deeper stuff that’s covered in, say, Categories for the Working Mathematician by Mac Lane. We shall deal only with sets (as our objects) and functions (as our morphisms). This means we deal only with the Category of Sets! Therefore, the reader is not expected to know about advanced stuff like groups and/or group homomorphisms, vector spaces and/or linear transformations, etc. Also, no knowledge of college level calculus is required. Only knowledge of sets and functions, some familiarity in dealing with mathematical symbols and some knowledge of elementary combinatorics are required. That’s all!

Sets, maps and composition

An object (in this category) is a finite set or collection.

A map $f: A \to B$ (in this category) consists of the following:

i) a set $A$ called the domain of the map;

ii) a set $B$ called the codomain of the map; and

iii) a rule assigning to each $a \in A$, an element $b \in B$.

We also use ‘function’, ‘transformation’, ‘operator’, ‘arrow’ and ‘morphism’ for ‘map’ depending on the context, as we shall see later.

An endomap $f: A \to A$ is a map that has the same object as domain and codomain, which in this case is $A$.

An endomap in which $f(a) = a$ for every $a \in A$ is called an identity map, also denoted by $1_A$.

Composition of maps is a process by which two maps are combined to yield a third map. Composition of maps is really at the heart of category theory, and this will be evident in plenty in the later posts. So, if we have two maps $f: X \to Y$ and $g:Y \to Z$, then $g \circ f: X \to Z$ is the third map obtained by composing $f$ and $g$. Note that $g \circ f$ is read ‘$g$ following $f$‘.

Guess what? Those are all the ingredients we need to define our category of sets! Though we shall deal only with sets and functions, the following definition of a category of sets is actually pretty much the same as the general definition of a category.

Definition: A CATEGORY consists of the following:

(1) OBJECTS: these are usually denoted by $A, B, C, \ldots$ etc.

(2) MAPS: these are usually denoted by $f, g, h, \ldots$ etc.

(3) For each map $f$, one object as DOMAIN of $f$ and one object as CODOMAIN of $f$. So, $f: A \to B$ has domain $A$ and codomain $B$. This is also read as ‘$f$ is a map from $A$ to $B$‘.

(4) For each object $A$, there exists an IDENTITY MAP, $1_A$. This is also written as $1_A: A \to A$.

(5) For each pair of maps $f: A \to B$ and $g: B \to C$, there exists a COMPOSITE map, $g \circ f: A \to C$. ( ‘$g$ following $f$‘.)

such that the following RULES are satisfied:

(i) (IDENTITY LAWS): If $f: A \to B$, then we have $1_B \circ f = f$ and $f \circ 1_A = f$.

(ii) (ASSOCIATIVE LAW): If $f: A \to B , \, g: B \to C$ and $h: C \to D$, then we have $(h \circ g)\circ f = h \circ (g \circ f)$. ( ‘$h$ following $g$ following $f$‘) Note that in this case we are allowed to write $h \circ g \circ f$ without any ambiguity.

Exercises: Suppose $A = \{ a, b, c, d \}$ and $B = \{ 1, 2, 3 \}$.

(1) How many maps $f$ are there from $A$ to $B$?

(2) How many maps $f$ are there from $A$ to $A$?

(3) How many maps $f$ are there from $B$ to $A$?

(4) How many maps $f$ are there from $B$ to $B$?

(5) How many maps $f$ are there from $A$ to $A$ satisfying $f \circ f = f$?

(This is a non-trivial exercise for the general case in which $|A| = n$ for some positive integer $n$.)

(6) How many maps $g$ are there from $B$ to $B$ satisfying $g \circ g = g$?

(7) Can you find a pair of maps $f: A \to B$ and $g: B \to A$ such that $g \circ f = 1_A$. If yes, how many pairs can you find?

(8 ) Can you find a pair of maps $h: B \to A$ and $k: A \to B$ such that $k \circ h = 1_B$. If yes, how many pairs can you find?

Bonus exercise:

How many maps $f: A \to B$ are there if $A$ is the empty set and $|B| = n$ for some $n \in \mathbb{N}$? What if $|A| = n$ and $B$ is the empty set? What if both $A$ and $B$ are empty?

• 311,755 hits