In our last post on category theory, we continued our exploration of universal properties, showing how they can be used to motivate the concept of natural transformation, the “right” notion of morphism \phi: F \to G between functors F, G. In today’s post, I want to turn things around, applying the notion of natural transformation to explain generally what we mean by a universal construction. The key concept is the notion of representability, at the center of a circle of ideas which includes the Yoneda lemma, adjoint functors, monads, and other things — it won’t be possible to talk about all these things in detail (because I really want to return to Stone duality before long), but perhaps these notes will provide a key of entry into more thorough treatments.

Even for a fanatic like myself, it’s a little hard to see what would drive anyone to study category theory except a pretty serious “need to know” (there is a beauty and conceptual economy to categorical thinking, but I’m not sure that’s compelling enough motivation!). I myself began learning category theory on my own as an undergraduate; at the time I had only the vaguest glimmerings of a vast underlying unity to mathematics, but it was only after discovering the existence of category theory by accident (reading the introductory chapter of Spanier’s Algebraic Topology) that I began to suspect it held the answer to a lot of questions I had. So I got pretty fired-up about it then, and started to read Mac Lane’s Categories for the Working Mathematician. I think that even today this book remains the best serious introduction to the subject — for those who need to know! But category theory should be learned from many sources and in terms of its many applications. Happily, there are now quite a few resources on the Web and a number of blogs which discuss category theory (such as The Unapologetic Mathematician) at the entry level, with widely differing applications in mind. An embarrassment of riches!

Anyway, to return to today’s topic: back when we were first discussing posets, most of our examples of posets were of a “concrete” nature: sets of subsets of various types, ordered by inclusion. In fact, we went a little further and observed that any poset could be represented as a concrete poset, by means of a “Dedekind embedding” (bearing a familial resemblance to Cayley’s theorem, which says that any group can be represented concretely, as a group of permutations). Such concrete representation theorems are extremely important in mathematics; in fact, this whole series is a trope on the Stone representation theorem, that every Boolean algebra is an algebra of sets! With that, I want to discuss a representation theorem for categories, where every (small) category can be explicitly embedded in a concrete category of “structured sets” (properly interpreted). This is the famous Yoneda embedding.

This requires some preface. First, we need the following fundamental construction: for every category C there is an opposite category C^{op}, having the same classes O, M of objects and morphisms as C, but with domain and codomain switched (\mbox{dom}^{op} := \mbox{cod}: M \to O, and \mbox{cod}^{op} := \mbox{dom}: M \to O). The function O \to M: A \mapsto 1_A is the same in both cases, but we see that the class of composable pairs of morphisms is modified:

(f, g) \in C^{op}_2 [is a composable pair in C^{op}] if and only if (g, f) \in C_2

and accordingly, we define composition of morphisms in C^{op} in the order opposite to composition in C:

(g \circ f)^{op} := f \circ g in C.

Observation: The categorical axioms are satisfied in the structure C^{op} if and only if they are in C; also, (C^{op})^{op} = C.

This observation is the underpinning of a Principle of Duality in the theory of categories (extending the principle of duality in the theory of posets). As the construction of opposite categories suggests, the dual of a sentence expressed in the first-order language of category theory is obtained by reversing the directions of all arrows and the order of composition of morphisms, but otherwise keeping the logical structure the same. Let me give a quick example:

Definition: Let X_1, X_2 be objects in a category C. A coproduct of X_1 and X_2 consists of an object X and maps i_1: X_1 \to X, i_2: X_2 \to X (called injection or coprojection maps), satisfying the universal property that given an object Y and maps f_1: X_1 \to Y, f_2: X_2 \to Y, there exists a unique map f: X \to Y such that f_1 = f \circ i_1 and f_2 = f \circ i_2. \Box

This notion is dual to the notion of product. (Often, one indicates the dual notion by appending the prefix “co” — except of course if the “co” prefix is already there; then one removes it.) In the category of sets, the coproduct of two sets X_1, X_2 may be taken to be their disjoint union X_1 + X_2, where the injections i_1, i_2 are the inclusion maps of X_1, X_2 into X_1 + X_2 (exercise).
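For readers who like to see such things concretely, here is a minimal Python sketch of the disjoint union of sets as a coproduct. The names inl, inr and copair are hypothetical, chosen only for this illustration, and the "sets" are modeled loosely by Python values.

```python
def inl(x):
    """Injection i_1: X_1 -> X_1 + X_2 (hypothetical name), tagging x with 1."""
    return (1, x)

def inr(x):
    """Injection i_2: X_2 -> X_1 + X_2 (hypothetical name), tagging x with 2."""
    return (2, x)

def copair(f1, f2):
    """The map f: X_1 + X_2 -> Y with f . i_1 = f1 and f . i_2 = f2."""
    def f(tagged):
        tag, x = tagged
        return f1(x) if tag == 1 else f2(x)
    return f

# Example: X_1 = strings, X_2 = integers, Y = integers.
f = copair(len, abs)
```

The tags keep the two summands disjoint even when X_1 and X_2 overlap, and copair has no freedom in how it is defined once f1 and f2 are given, which is the uniqueness clause of the universal property.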

Exercise: Formulate the notion of coequalizer (by dualizing the notion of equalizer). Describe the coequalizer of two functions \displaystyle f, g: X \stackrel{\to}{\to} Y (in the category of sets) in terms of equivalence classes. Then formulate the notion dual to that of monomorphism (called an epimorphism), and by a process of dualization, show that in any category, coequalizers are epic.

Principle of duality: If a sentence expressed in the first-order theory of categories is provable in the theory, then so is the dual sentence. Proof (sketch): A proof P of a sentence proceeds from the axioms of category theory by applying rules of inference. The dualization of P proves the dual sentence by applying the same rules of inference but starting from the duals of the categorical axioms. A formal proof of the Observation above shows that collectively, the set of categorical axioms is self-dual, so we are done. \Box

Next, we introduce the all-important hom-functors. We suppose that C is a locally small category, meaning that the class of morphisms g: c \to d between any two given objects c, d is small, i.e., is a set as opposed to a proper class. Even for large categories, this condition is just about always satisfied in mathematical practice (although there is the occasional baroque counterexample, like the category of quasitopological spaces).

Let Set denote the category of sets and functions. Then, there is a functor

\hom_C: C^{op} \times C \to Set

which, at the level of objects, takes a pair of objects (c, d) to the set \hom(c, d) of morphisms g: c \to d (in C) between them. It takes a morphism (f: c \to c', h: d \to d') of C^{op} \times C (that is to say, a pair of morphisms (f: c' \to c, h: d \to d') of C) to the function

\hom_C(f, h): \hom(c, d) \to \hom(c', d'): g \mapsto hgf.

Using the associativity and identity axioms in C, it is not hard to check that this indeed defines a functor \hom_C: C^{op} \times C \to Set. It generalizes the truth-valued pairing P^{op} \times P \to \mathbf{2} we defined earlier for posets.
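To make the variance concrete, here is a small Python sketch in which the morphisms are ordinary functions (so C is, loosely, a category of sets). The helper names compose and hom are assumptions of this sketch. Python cannot compare functions extensionally, so functoriality is only spot-checked at sample inputs.

```python
def compose(*fs):
    """Compose right to left: compose(h, g, f)(x) == h(g(f(x)))."""
    def composite(x):
        for fn in reversed(fs):
            x = fn(x)
        return x
    return composite

def hom(f, h):
    """Action of hom_C on a pair (f: c' -> c, h: d -> d'):
    sends g: c -> d to the composite h g f: c' -> d'."""
    return lambda g: compose(h, g, f)

# Sample morphisms on the integers (purely illustrative choices).
f = lambda n: n + 1      # f : c' -> c
h = lambda n: 2 * n      # h : d  -> d'
g = lambda n: n * n      # g : c  -> d
f2 = lambda n: 3 * n     # f2: c'' -> c'
h2 = lambda n: n - 5     # h2: d'  -> d''

# Functoriality: hom(f2, h2) . hom(f, h) = hom(f . f2, h2 . h).
lhs = hom(f2, h2)(hom(f, h)(g))
rhs = hom(compose(f, f2), compose(h2, h))(g)
```

Note how the first argument composes in the reverse order (f . f2, not f2 . f); that reversal is exactly what the "op" in C^{op} \times C records.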

Now assume C is small. From last time, there is a bijection between functors

\displaystyle \frac{h: C^{op} \times C \to Set}{f: C \to Set^{C^{op}}}

and by applying this bijection to \hom_C: C^{op} \times C \to Set, we get a functor

y_C: C \to Set^{C^{op}}.

This is the famous Yoneda embedding of the category C. It takes an object c to the hom-functor \hom_C(-, c): C^{op} \to Set. This hom-functor can be thought of as a structured, disciplined way of considering the totality of morphisms mapping into the object c, and has much to do with the Yoneda Principle we stated informally last time (and which we state precisely below).

  • Remark: We don’t need C to be small to talk about \hom_C(-, c); local smallness will do. The only place we ask that C be small is when we are considering the totality of all functors C^{op} \to Set, as forming a category \displaystyle Set^{C^{op}}.

Definition: A functor F: C^{op} \to Set is representable (with representing object c) if there is a natural isomorphism \displaystyle \hom_C(-, c) \stackrel{\sim}{\to} F of functors.

The concept of representability is key to discussing what is meant by a universal construction in general. To clarify its role, let’s go back to one of our standard examples.

Let c_1, c_2 be objects in a category C, and let F: C^{op} \to Set be the functor \hom(-, c_1) \times \hom(-, c_2); that is, the functor which takes an object b of C to the set \hom(b, c_1) \times \hom(b, c_2). Then a representing object for F is a product c_1 \times c_2 in C. Indeed, the isomorphism between sets \hom(b, c_1 \times c_2) \cong \hom(b, c_1) \times \hom(b, c_2) simply recapitulates that we have a bijection

\displaystyle \frac{b \to c_1 \times c_2}{b \to c_1 \qquad b \to c_2}

between morphisms into the product and pairs of morphisms. But wait, not just an isomorphism: we said a natural isomorphism (between functors in the argument b) — how does naturality figure in?

Enter stage left the celebrated

Yoneda Lemma: Given a functor F: C^{op} \to Set and an object c of C, natural transformations \phi: \hom(-, c) \to F are in (natural!) bijection with elements \xi \in F(c).

Proof: We apply the “Yoneda trick” introduced last time: probe the representing object c with the identity morphism, and see where \phi takes it: put \xi = \phi_c(1_c). Incredibly, this single element \xi determines the rest of the transformation \phi: by chasing the element 1_c \in \hom(c, c) around the diagram

                 \phi_c
      hom(c, c) -------> Fc
          |              |
hom(f, c) |              | Ff
          V              V
      hom(b, c) -------> Fb
                 \phi_b

(which commutes by naturality of \phi), we see for any morphism f: b \to c in \hom(b, c) that \phi_b(f) = F(f)(\xi). That the bijection

\displaystyle \frac{\xi: 1 \to F(c)}{\phi: \hom(-, c) \to F}

is natural in the arguments F, c we leave as an exercise. \Box
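For those who want the diagram chase written out, here it is as a short worked computation, together with the converse check that every element \xi really does give a natural transformation. The auxiliary morphism e: a \to b is notation introduced only for this computation.

```latex
% From \phi to \xi and back: chase 1_c around the naturality square for f: b \to c.
\begin{aligned}
\phi_b(f) &= \phi_b(1_c \circ f) = \phi_b\bigl(\hom(f, c)(1_c)\bigr) \\
          &= F(f)\bigl(\phi_c(1_c)\bigr) = F(f)(\xi).
\end{aligned}

% Conversely, given any \xi \in F(c), define \phi_b(f) := F(f)(\xi).
% Naturality with respect to e: a \to b uses contravariance, F(f \circ e) = F(e) \circ F(f):
\begin{aligned}
\phi_a\bigl(\hom(e, c)(f)\bigr) &= \phi_a(f \circ e) = F(f \circ e)(\xi) \\
                                &= F(e)\bigl(F(f)(\xi)\bigr) = F(e)\bigl(\phi_b(f)\bigr).
\end{aligned}
```

The two assignments \phi \mapsto \phi_c(1_c) and \xi \mapsto (f \mapsto F(f)(\xi)) are inverse to one another: the first computation handles one direction, and F(1_c)(\xi) = \xi handles the other.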

Returning to our example of the product c_1 \times c_2 as representing object, the Yoneda lemma implies that the natural bijection

\displaystyle \phi_b: \hom(b, c_1 \times c_2) \cong \hom(b, c_1) \times \hom(b, c_2)

is induced by the element \xi = \phi_{c_1 \times c_2}(1_{c_1 \times c_2}), and this element is none other than the pair of projection maps

\xi = (\pi_1: c_1 \times c_2 \to c_1, \pi_2: c_1 \times c_2 \to c_2).

In summary, the Yoneda lemma guarantees that a hom-representation \phi: \hom(-, c) \cong F of a functor is, by the naturality assumption, induced in a uniform way from a single “universal” element \xi \in F(c). All universal constructions fall within this general pattern.
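In the category of sets this pattern can be sketched in a few lines of Python. The names pi1, pi2, phi and phi_inverse are hypothetical, and products are modeled by Python tuples; this is only an illustration of how the bijection is induced by the pair of projections.

```python
def pi1(pair):
    """Projection c1 x c2 -> c1."""
    return pair[0]

def pi2(pair):
    """Projection c1 x c2 -> c2."""
    return pair[1]

def phi(g):
    """phi_b: hom(b, c1 x c2) -> hom(b, c1) x hom(b, c2),
    sending g to (pi1 . g, pi2 . g), i.e. F(g) applied to xi = (pi1, pi2)."""
    return (lambda x: pi1(g(x)), lambda x: pi2(g(x)))

def phi_inverse(g1, g2):
    """Pair two maps out of b into a single map into the product."""
    return lambda x: (g1(x), g2(x))

# Example: b = integers, c1 = integers, c2 = strings.
g = lambda n: (n + 1, str(n))
g1, g2 = phi(g)
```

Applying phi to the identity map on c1 x c2 returns (up to extensional equality) the pair of projections, which is the universal element \xi of the text.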

Example: Let C be a category with products, and let c, d be objects. Then a representing object for the functor \hom(- \times c, d): C^{op} \to Set is an exponential d^c; the universal element \xi \in \hom(d^c \times c, d) is the evaluation map d^c \times c \to d.
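In the category of sets, this example is the familiar operation of currying. Here is a minimal Python sketch, where curry, uncurry and evaluate are hypothetical names and elements of d^c are modeled by Python functions.

```python
def curry(h):
    """Send h: b x c -> d to the corresponding map b -> d^c."""
    return lambda x: (lambda y: h((x, y)))

def evaluate(pair):
    """Evaluation map d^c x c -> d: the universal element xi."""
    k, y = pair
    return k(y)

def uncurry(k):
    """Send k: b -> d^c to the composite eval . (k x 1_c): b x c -> d."""
    return lambda pair: evaluate((k(pair[0]), pair[1]))

# Example map h: b x c -> d on pairs of integers.
h = lambda p: p[0] * 10 + p[1]
```

Here uncurry applied to the identity map on d^c is just evaluate, in line with the Yoneda recipe of probing with the identity to find the universal element.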

Exercise: Let \displaystyle f, g: x \stackrel{\to}{\to} y be a pair of parallel arrows in a category C. Describe a functor F: C^{op} \to Set which is represented by an equalizer of this pair (assuming one exists).

Exercise: Dualize the Yoneda lemma by considering hom-functors \hom_C(c, -): C \to Set. Express the universal property of the coproduct in terms of representability by such hom-functors.

The Yoneda lemma has a useful corollary: for any (locally small) category C, there is a natural bijection

\displaystyle \frac{\hom_C(-, a) \to \hom_C(-, b)}{a \to b}

between natural transformations between hom-functors and morphisms in C. Using C(a, b) as alternate notation for the hom-set, the action of the Yoneda embedding functor y_C on morphisms gives an isomorphism between hom-sets

\displaystyle C(a, b) \stackrel{\sim}{\to} Set^{C^{op}}(y_C a, y_C b);

the functor y_C is said in that case to be fully faithful (faithful means this action on morphisms is injective for all a, b, and full means the action is surjective for all a, b). The Yoneda embedding y_C thus identifies C with the full subcategory of Set^{C^{op}} whose objects are the hom-functors y_C a = \hom_C(-, a).

It is illuminating to work out the meaning of this last statement in special cases. When the category C is a group G (that is, a category with exactly one object \bullet in which every morphism is invertible), then functors F: G^{op} \to Set are tantamount to sets X equipped with a group homomorphism G^{op} \to \hom(X, X), i.e., a left action of G^{op}, or a right action of G. In particular, \hom(-, \bullet): G^{op} \to Set is the underlying set of G, equipped with the canonical right action \rho: G \to \hom(G, G), where \rho(g)(h) = hg. Moreover, natural transformations between functors G^{op} \to Set are tantamount to morphisms of right G-sets. Now, the Yoneda embedding

y_G: G \to Set^{G^{op}}

identifies any abstract group G with a concrete group y_G(G), i.e., with a group of permutations — namely, exactly those permutations on G which respect the right action of G on itself. This is the sophisticated version of Cayley’s theorem in group theory. If on the other hand we take C to be a poset, then the Yoneda embedding is tantamount to the Dedekind embedding we discussed in the first lecture.
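The group case is small enough to check by brute force. The sketch below assumes G = S_3, modeled as permutation tuples, and enumerates all functions G \to G to find those respecting the right action of G on itself; the helper names mul and is_equivariant are inventions of this sketch.

```python
from itertools import permutations, product

# Elements of S_3 as tuples p, read as the permutation i -> p[i].
G = list(permutations(range(3)))
IDENTITY = (0, 1, 2)

def mul(p, q):
    """Group product p*q, meaning 'first q, then p'."""
    return tuple(p[q[i]] for i in range(3))

def is_equivariant(phi):
    """True if phi: G -> G respects the right action: phi(h*g) == phi(h)*g."""
    return all(phi[mul(h, g)] == mul(phi[h], g) for h in G for g in G)

# Filter all 6**6 functions G -> G down to the equivariant ones.
equivariant = []
for values in product(G, repeat=len(G)):
    phi = dict(zip(G, values))
    if is_equivariant(phi):
        equivariant.append(phi)

# Left multiplications h -> a*h, one for each a in G.
left_mults = [{h: mul(a, h) for h in G} for a in G]
```

Under these assumptions the equivariant maps turn out to be exactly the six left multiplications, each determined by where it sends the identity element, which is the Yoneda lemma for the representable \hom(-, \bullet).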

Tying up a loose thread, let us now formulate the “Yoneda principle” precisely. Informally, it says that an object is determined up to isomorphism by the morphisms mapping into it. Using the hom-functor \hom(-, c) to collate the morphisms mapping into c, the precise form of the Yoneda principle says that an isomorphism between representables \hom(-, c) \to \hom(-, d) corresponds to a unique isomorphism c \to d between objects. This follows easily from the Yoneda lemma.

But far and away, the most profound manifestation of representability is in the notion of an adjoint pair of functors. “Free constructions” give a particularly ubiquitous class of examples; the basic idea will be explained in terms of free groups, but the categorical formulation applies quite generally (e.g., to free monoids, free Boolean algebras, free rings = polynomial algebras, etc., etc.).

If X is a set, the free group (q.v.) generated by X is, informally, the group FX whose elements are finite “words” built from “literals” a, b, c, \ldots which are the elements of X and their formal inverses, where we identify a word with any other gotten by introducing or deleting appearances of consecutive literals a a^{-1} or a^{-1}a. Janis Joplin said it best:

Freedom’s just another word for nothin’ left to lose…

— there are no relations between the generators of FX beyond the bare minimum required by the group axioms.

Categorically, the free group FX is defined by a universal property; loosely speaking, for any group G, there is a natural bijection between group homomorphisms and functions

\displaystyle \frac{FX \to G}{X \to UG}

where UG denotes the underlying set of the group. That is, we are free to assign elements of G to elements of X any way we like: any function f: X \to UG extends uniquely to a group homomorphism \hat{f}: FX \to G, sending a word x_1 x_2 \ldots x_n in FX to the element f(x_1)f(x_2) \ldots f(x_n) in G.

Using the usual Yoneda trick, or the dual of the Yoneda trick, this isomorphism is induced by a universal function i: X \to UFX, gotten by applying the bijection above to the identity map id: FX \to FX. Concretely, this function takes an element x \in X to the one-letter word x \in UFX in the underlying set of the free group. The universal property states that the bijection above is effected by composing with this universal map:

\displaystyle \hom_{Grp}(FX, G) \to \hom_{Set}(UFX, UG) \stackrel{\hom(i, UG)}{\to} \hom_{Set}(X, UG)

where the first arrow refers to the action of the underlying-set or forgetful functor U: Grp \to Set, mapping the category of groups to the category of sets (U “forgets” the fact that homomorphisms f: G \to H preserve group structure, and just thinks of them as functions Uf: UG \to UH).
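Here is a minimal Python sketch of this universal property, under some simplifying assumptions: words in FX are lists of (generator, exponent) pairs with exponent +1 or -1, and the target group G is S_3 written as permutation tuples. All function names (reduce_word, unit, extend, and so on) are hypothetical.

```python
def reduce_word(word):
    """Cancel adjacent pairs a a^{-1} and a^{-1} a until none remain."""
    out = []
    for gen, exp in word:
        if out and out[-1] == (gen, -exp):
            out.pop()
        else:
            out.append((gen, exp))
    return out

def unit(x):
    """The universal function i: X -> UFX, sending x to the one-letter word x."""
    return [(x, 1)]

IDENTITY = (0, 1, 2)

def mul(p, q):
    """Product in S_3: 'first q, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    """Inverse permutation."""
    q = [0] * len(p)
    for i, pi in enumerate(p):
        q[pi] = i
    return tuple(q)

def extend(f):
    """Given f: X -> UG (a dict), return the homomorphism hat_f: FX -> G."""
    def hat_f(word):
        result = IDENTITY
        for gen, exp in word:
            g = f[gen]
            result = mul(result, g if exp == 1 else inv(g))
        return result
    return hat_f

# An arbitrary assignment of elements of G to the generators a, b.
f = {"a": (1, 0, 2), "b": (1, 2, 0)}
hat_f = extend(f)
```

Composing hat_f with unit recovers f, which is the statement that the bijection is effected by composing with the universal map.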

  • Remark: Some people might say this a little less formally: that the original function f: X \to G is retrieved from the extension homomorphism \hat{f}: FX \to G by composing with the canonical injection of the generators X \to FX. The reason we don’t say this is that there’s a confusion of categories here: properly speaking, FX \to G belongs to the category of groups, and X \to G to the category of sets. The underlying-set functor U: Grp \to Set is a device we apply to eliminate the confusion.

In different words, the universal property of free groups says that the functor \hom_{Set}(X, U-): Grp \to Set, i.e., the underlying functor U: Grp \to Set followed by the hom-functor \hom(X, -): Set \to Set, is representable by the free group FX: there is a natural isomorphism of functors from groups to sets:

Grp(FX, -) \stackrel{\sim}{\to} Set(X, U-).

Now, the free group FX can be constructed for any set X. Moreover, the construction is functorial: it defines a functor F: Set \to Grp. Checking this is actually a good exercise in working with universal properties. In outline: given a function f: X \to Y, the homomorphism Ff: FX \to FY is the one which corresponds bijectively to the function

\displaystyle X \stackrel{f}{\to} Y \stackrel{i_Y}{\to} UFY,

i.e., Ff is defined to be the unique map h such that Uh \circ i_X = i_Y \circ f.

Proposition: F: Set \to Grp is functorial (i.e., preserves morphism identities and morphism composition).

Proof: Suppose f: X \to Y, g: Y \to Z is a composable pair of morphisms in Set. By universality, there is a unique map h: FX \to FZ, namely F(g \circ f), such that Uh \circ i_X = i_Z \circ (g \circ f). But Fg \circ Ff also has this property, since

U(Fg \circ Ff) \circ i_X = UFg \circ UFf \circ i_X = UFg \circ i_Y \circ f = i_Z \circ g \circ f

(where we used functoriality of U in the first equation). Hence F(g \circ f) = Fg \circ Ff. Another universality argument shows that F preserves identities. \Box

Observe that the functor F is rigged so that for all morphisms f: X \to Y,

UFf \circ i_X = i_Y \circ f.

That is to say, there is only one way of defining F on morphisms so that the universal map i_X is (the component at X of) a natural transformation 1_{Set} \to UF!
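On words, Ff simply relabels each generator by f and then cancels whatever becomes cancellable. The following Python sketch, with hypothetical names and words represented as lists of (generator, exponent) pairs, spot-checks both the naturality equation and the Proposition on a small example.

```python
def reduce_word(word):
    """Cancel adjacent pairs a a^{-1} and a^{-1} a until none remain."""
    out = []
    for gen, exp in word:
        if out and out[-1] == (gen, -exp):
            out.pop()
        else:
            out.append((gen, exp))
    return out

def unit(x):
    """Component i_X of the unit: x goes to the one-letter word x."""
    return [(x, 1)]

def free_map(f):
    """F f: FX -> FY, relabeling generators by f and then reducing."""
    return lambda word: reduce_word([(f(gen), exp) for gen, exp in word])

# Illustrative functions f: X -> Y and g: Y -> Z; note f is not injective.
f = {"a": "p", "b": "p", "c": "q"}.get
g = {"p": 0, "q": 1}.get
g_after_f = lambda x: g(f(x))

word = [("a", 1), ("b", -1), ("c", 1)]
```

Because f identifies a and b, the image of the word a b^{-1} c collapses to the single letter q; this is why the reduction step cannot be skipped.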

The underlying-set and free functors U: Grp \to Set, F: Set \to Grp are called adjoints, generalizing the notion of adjoint in truth-valued matrix algebra: we have an isomorphism

\hom_{Grp}(FX, Y) \cong \hom_{Set}(X, UY)

natural in both arguments X, Y. We say that F is left adjoint to U, or dually, that U is right adjoint to F, and write F \dashv U. The transformation i: 1 \to UF is called the unit of the adjunction.

Exercise: Define the construction dual to the unit, called the counit, as a transformation \varepsilon: FU \to 1. Describe this concretely in the case of the free-underlying adjunction F \dashv U between sets and groups.

What makes the concept of adjoint functors so compelling is that it combines representability with duality: the manifest symmetry of an adjunction \hom(FX, Y) \cong \hom(X, GY) means that we can equally well think of GY as representing \hom(F-, Y) as we can FX as representing \hom(X, G-). Time is up for today, but we’ll be seeing more of adjunctions next time, when we resume our study of Stone duality.

[Tip of the hat to Robert Dawson for the Janis Joplin quip.]