
Last summer, Todd and I discussed a problem and its solution, and I had wondered if it was fit enough to be in the POW-series (on this blog) when he mentioned that the problem might be somewhat too easy for that purpose. Of course, I immediately saw that he was right. But, a few days back, I thought it wouldn’t be bad if we shared this cute problem and its solution over here, the motivation being that some of our readers may perhaps gain something out of it. What is more, an analysis of an egf solution to the problem lends itself naturally to a discussion of combinatorial species. Todd will talk more about it in the second half of this post. Anyway, let’s begin.

PROBLEM: Suppose $A = \{ 1,2, \ldots , n \}$, where $n$ is a positive natural number. Find the number of endofunctions $f: A \rightarrow A$ satisfying the idempotent property, i.e. $f \circ f = f$.

It turns out that finding a solution to the above problem is equivalent to counting the number of forests with $n$ nodes and height at most $1$, which I found here (click only if you wish to see the answer!) at the Online Encyclopedia of Integer Sequences. If you haven’t clicked on that link yet and wish to solve the problem on your own, then please stop reading beyond this point.

SOLUTION: There are two small (and related) observations that need to be made. And, both are easy ones.

Lemma 1: $f$ has at least one fixed point.

Proof: Pick any $i \in A$ and let $f(i) = j$, where $j \in A$. Then, using the idempotent property, we have $f(f(i)) = f(i)$, which implies $f(j) = j$. Therefore, $j$ is a fixed point, and this proves our lemma.

Lemma 2: The elements in $A$ that are not fixed points are mapped to fixed points of $f$.

Proof: Suppose $j \in A$ is not a fixed point, and let $f(j) = k$. Then, using the idempotent property again, we immediately have $f(f(j)) = f(j)$, which implies $f(k) = k$, thereby establishing the fact that $k$ itself is a fixed point. Hence, $j$ (which is not a fixed point) is mapped to some fixed point of $f$.

In both the lemmas above, the idempotent property “forces” everything.

Now, the solution is right before our eyes! Suppose $f$ has $m$ fixed points. Then there are $\displaystyle \binom{n}{m}$ ways of choosing them. Each of the remaining $n - m$ elements of $A$ (those that are not fixed points) must be mapped to one of the $m$ fixed points, and there are a total of $m^{n-m}$ ways of doing that. So, summing over all $m$, our final answer is $\displaystyle \sum_{m=0}^{n} \binom{n}{m} m^{n-m}$.
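For readers who like to double-check such counts by machine, here is a minimal Python sketch (not part of the original discussion) that compares the formula with an exhaustive search for small $n$. The function names are made up for illustration, and the elements of $A$ are relabeled $0, \ldots, n-1$ for convenience.

```python
from itertools import product
from math import comb

def count_idempotents_brute(n):
    """Count maps f on {0, ..., n-1} with f(f(i)) == f(i) by exhaustive search."""
    count = 0
    for f in product(range(n), repeat=n):  # f[i] is the image of i
        if all(f[f[i]] == f[i] for i in range(n)):
            count += 1
    return count

def count_idempotents_formula(n):
    """Sum over the number m of fixed points: C(n, m) * m^(n - m)."""
    return sum(comb(n, m) * m ** (n - m) for m in range(n + 1))

# Compare the two counts for small n (the search visits n^n maps, so keep n small).
for n in range(1, 7):
    assert count_idempotents_brute(n) == count_idempotents_formula(n)
    print(n, count_idempotents_formula(n))
```

The $m = 0$ term contributes $0^n = 0$ for positive $n$, which agrees with Lemma 1: there is no idempotent map without fixed points.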

Exponential Generating Function and Introduction to Species

Hi; Todd here. Vishal asked whether I would discuss this problem from the point of view of exponential generating functions (or egf’s), and also from a categorical point of view, using the concept of species of structure, which gives the basis for a categorical or structural approach to generatingfunctionology.

I’ll probably need to write a new post of my own to do any sort of justice to these topics, but maybe I can whet the reader’s appetite by talking a little about the underlying philosophy, followed by a quick but possibly cryptic wrap-up which I could come back to later for illustrative purposes.

Enumerative combinatorics studies the problem of counting the number $a_n$ of combinatorial structures of some type on an $n$-element set, such as the number of idempotent functions on that set, or the number of equivalence relations, and so on. A powerful idea in enumerative combinatorics is the idea of a generating function, where we study the sequence $a_n$ by rolling its terms into a single analytic function, such as

$\displaystyle A(x) = \sum_{n \geq 0} \frac{a_n x^n}{n!},$

(this is the so-called “exponential” generating function of $\{a_n\}_{n \geq 0}$). In many cases of interest, the function $A(x)$ will be recognizable in terms of operations familiar from calculus (addition, multiplication, differentiation, composition, etc.), and this can then be used to extract information about the sequence $a_n$, such as explicit formulas, asymptotics, and so on. If you’ve never seen this idea in action, you should definitely take a look at Wilf’s book generatingfunctionology, or at the book Concrete Mathematics by Graham, Knuth and Patashnik.

Each of the basic operations one performs on analytic functions (addition, multiplication, composition, etc.) will, it turns out, correspond to some set-theoretic operation directly at the level of combinatorial structures, and one of the trade secrets of generating function technology is to have very clear pictures of the combinatorial structures being counted, and how these pictures are built up using these basic structural operations.

In fact, why don’t we start right now, and figure out what some of these structural operations would be? In other words, let’s ask ourselves: if $A(x)$ and $B(x)$ are generating functions for counting combinatorial structures of type (or species) $A$ and $B$, then what types of structures would the function $A(x) + B(x)$ “count”?  How about $A(x)B(x)$? Composition $A(B(x))$?

The case of $A(x) + B(x)$ is easy: writing

$\displaystyle A(x) + B(x) = \sum_{n \geq 0} \frac{a_n x^n}{n!} + \sum_{n \geq 0} \frac{b_n x^n}{n!} = \sum_{n \geq 0} \frac{(a_n + b_n) x^n}{n!},$

and thinking of $a_n$ as counting structures of type $A$ on an $n$-element set, and $b_n$ as counting structures of type $B$, the quantity $a_n + b_n$ counts elements in the disjoint union of the sets of $A$-structures and $B$-structures.

In the categorical approach we will discuss later, we actually think of structure types (or species of structure) $A$ as functors, which take an $n$-element set $S$ to the set $A\left[S\right]$ of structures of type $A$ on $S$. Here, we have to be a little bit careful about what categories we’re talking about, but the general idea is that if we have a bijection $f: S \to T$ from one $n$-element set to another, then it should always be possible to “transport” $A$-structures on $S$ to $A$-structures on $T$, simply by relabeling points along the bijection $f$. So, let us define a species to be a functor

$A: FB \to Set$

where $FB$ is the category of finite sets and bijections (not all functions, just bijections!), and $Set$ is the category of sets. In enumerative combinatorics, the set $A\left[S\right]$ is normally assumed to be finite, but in other applications of the notion of species, we actually allow a lot more latitude, and allow the functor $A$ to map into other categories $C$, not just $Set$ (“$C$-valued species”). But if we stick for now just to set-valued species $A$, $B$, then we define the species $A + B$ by the functorial formula

$\displaystyle (A + B)\left[S\right] = A\left[S\right] \sqcup B\left[S\right]$

where $\sqcup$ denotes disjoint union. So addition of generating functions will correspond to the concrete operation of taking disjoint unions of sets of combinatorial structures.

More interesting is the case of multiplication. Let’s calculate the product of two egf’s:

$\displaystyle A(x) B(x) = (\sum_{j \geq 0} \frac{a_j x^j}{j!})(\sum_{k \geq 0} \frac{b_k x^k}{k!}) = \sum_{n \geq 0} (\sum_{j + k = n} \frac{n!}{j! k!} a_j b_k) \frac{x^n}{n!}$

The question is: what type of structure does the expression $\displaystyle \sum_{j+k = n} \frac{n!}{j! k!} a_j b_k$ “count”? Look at the individual terms: the binomial coefficient $\displaystyle \frac{n!}{j! k!}$ describes the number of ways of decomposing an $n$-element set into two disjoint subsets, one with $j$ elements and the other with $k$, where $j$ and $k$ add to $n$. Then, $a_j$ is the number of ways of putting an $A$-structure on the $j$-element part, and $b_k$ is the number of $B$-structures on the $k$-element part.

This suggests a new operation on structure types: given structure types or species $A, B$, we define a new species $A \otimes B$ according to the formula

$\displaystyle (A \otimes B)\left[S\right] = \bigsqcup_{T \sqcup U = S} A\left[T\right] \times B\left[U\right]$

(that is, a structure of type $A \otimes B$ on a set $S$ is an ordered pair, consisting of an $A$-structure on a subset of $S$ and a $B$-structure on its complement). This functorial operation is usually called the “convolution product” of the combinatorial species $A, B$: it is the concrete set-theoretic operation which corresponds to multiplication of generating functions.
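To make the convolution concrete, here is a small Python sketch of the coefficient formula $\sum_{j+k=n} \frac{n!}{j! k!} a_j b_k$. The function name `egf_product` is invented for this illustration. As a sanity check, take $a_n = b_n = 1$ (exactly one structure on every set): a structure of the product type on $S$ is then just a choice of subset of $S$, so we should get $2^n$.

```python
from math import comb

def egf_product(a, b):
    """Coefficients c_n = sum_{j+k=n} C(n, j) * a_j * b_k of the product egf.

    `a` and `b` list the counts a_0, a_1, ... and b_0, b_1, ...
    """
    n_terms = min(len(a), len(b))
    return [
        sum(comb(n, j) * a[j] * b[n - j] for j in range(n + 1))
        for n in range(n_terms)
    ]

ones = [1] * 8  # one structure on each finite set
print(egf_product(ones, ones))  # the number of subsets of an n-element set
```

In generating function terms this is just the statement $e^x \cdot e^x = e^{2x}$.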

Finally, let’s look at composition $A(B(x))$. Here we make the technical assumption that $b_0 = 0$ (no $B$-structures on the empty set!), so that we won’t have divergence issues blowing up in our faces: we want to remain safely within the realm of finitary structures. Okay, then, what type of combinatorial structure does this egf count?

Perhaps not surprisingly, this is rather more challenging than the previous two examples. In analytic function language, we are trying here to give a meaning to the Taylor coefficients of a composite function in terms of the Taylor coefficients of the original functions — for this, there is a famous formula attributed to Faà di Bruno, which we then want to interpret combinatorially. If you don’t already know this but want to think about this on your own, then stop reading! But I’ll just give away the answer, and say no more for now about where it comes from, although there’s a good chance you can figure it out just by staring at it for a while, possibly with paper and pen in hand.

Definition: Let $A, B: FB \to Fin$ be species (functors from finite sets and bijections to finite sets), and assume $B\left[\emptyset\right] = \emptyset$. The substitution product $A \circ B$ is defined by the formula

$\displaystyle (A \circ B)\left[S\right] = \sum_{E \in Eq(S)} A\left[S/E\right] \times \prod_{c \in S/E} B\left[c\right]$

This clearly requires some explanation. The sum here denotes disjoint union, and $Eq(S)$ denotes the set of equivalence relations on the finite set $S$. So $E$ here is an equivalence relation, which partitions $S$ into nonempty sets $c$ ($E$-equivalence classes). And the quotient $S/E$ denotes the set of such equivalence classes (so we think of each class $c$ as a point of $S/E$). What this formula says is that a structure of type $A \circ B$ on $S$ consists of a partition of $S$ into a bunch of non-empty blobs, a $B$-structure on each blob, and then an $A$-structure on the set of blobs.

It’s high time for an example! So let’s look at Vishal’s problem, and see if we can picture it in terms of these operations. We’re going to need some basic functions (or functors!) to apply these operations to, and out of thin air I’ll pluck the two basic ones we’ll need:

$\displaystyle E(x) = \exp(x) = \sum_{n \geq 0} \frac{x^n}{n!}$

$F(x) = x$

The first is the generating function for the series $e_n = 1$. So for the species $E$, there’s just one structure of type $E$ for each set $S$ (in categorical language, the functor $E: FB \to Set$ is the terminal functor). We can just think of that structure as the set $S$ itself, if we like, with no other structure appended thereon.

For $F$, we have $f_n = 0$ unless $n = 1$, where $f_1 = 1$. So $F$ is the species for the one-element set structure (meaning that $F\left[S\right] = \emptyset$ unless $S$ has cardinality 1, in which case $F\left[S\right] = \{S\}$).

Okay, on to Vishal’s example. He was counting the number of idempotent functions $f: S \to S$, and now, as promised, I want to determine the corresponding egf. You might be able to find it by looking at his formula, but obviously I want to use the ideas I’ve developed thus far, which focus much more on the pictures. So, let’s picture $f: S \to S$, first as Vishal did, by thinking of the elements of $S$ as ‘nodes’, and then drawing a directed edge from node $x$ to node $y$ if $f(x) = y$. (Then, by idempotence of $f$, $y$ will be a fixed point of $f$. Let’s agree not to bother drawing an edge from $y$ to itself, if $y$ is already a fixed point.)

In this picture, we get a directed graph which consists of a disjoint union of “sprouts”: little bushes, each rooted at a fixed point of $f$, whose only other nodes are “leaves” joined to the root by an edge. We can simplify the picture a little: if you put a circle around each sprout, you don’t need the edges at all: just mark one of the points inside as the root, and you know what to do.

So we arrive at a picture of an idempotent function on $S$: a partition of $S$ into a collection of (nonempty) blobs, and inside each blob, one of the points is marked “root”. In terms of our operations, what does it mean to mark a point in a blob? It just means: break the blob into two pieces, where one piece is given the structure of “one-element set”, and the other piece is just itself. In terms of the ideas developed above, this means each blob carries an $F \otimes E$ structure; we’ll suggestively write this structure type as $X \otimes \exp(X)$.

In this picture of idempotent $f$, there is no extra combinatorial structure imposed on the set of blobs, beyond the set itself. In other words, in this picture, the set of blobs carries merely an “$E$-structure”, nothing more.

So, putting all this together, we picture an idempotent function on $S$ as a partition or equivalence relation on $S$, together with an assignment of a marked point in each equivalence class. In the language of species operations, we may therefore identify the structure type of idempotent functions with

$E \circ (F \otimes E)$

or more suggestively, $\exp \circ (X \otimes \exp(X))$. The exponential generating function is, of course, $e^{x e^x}$!
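As a cross-check, one can expand $e^{x e^x}$ as a truncated power series with exact rational arithmetic and compare $n!$ times its coefficients with Vishal’s sum. The Python sketch below does this. The helper names (`mul`, `exp_series`) and the truncation order `N` are choices made purely for illustration.

```python
from fractions import Fraction
from math import comb, factorial

N = 8  # truncation order (an arbitrary choice for this sketch)

def mul(p, q):
    """Product of two power series truncated at order N (coefficient lists)."""
    r = [Fraction(0)] * N
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            if i + j < N:
                r[i + j] += pi * qj
    return r

def exp_series(p):
    """exp(p(x)) for a series with zero constant term, as sum of p^m / m!."""
    assert p[0] == 0  # the analogue of the assumption b_0 = 0
    result = [Fraction(0)] * N
    power = [Fraction(1)] + [Fraction(0)] * (N - 1)  # p^0
    for m in range(N):  # p^m starts at degree m, so m < N is enough
        for i in range(N):
            result[i] += power[i] / factorial(m)
        power = mul(power, p)
    return result

exp_x = [Fraction(1, factorial(n)) for n in range(N)]      # e^x
x = [Fraction(0), Fraction(1)] + [Fraction(0)] * (N - 2)   # x
egf = exp_series(mul(x, exp_x))                            # e^{x e^x}

counts = [egf[n] * factorial(n) for n in range(N)]
formula = [sum(comb(n, m) * m ** (n - m) for m in range(n + 1)) for n in range(N)]
assert counts == formula
print([int(c) for c in counts])
```

Here `0 ** 0` evaluates to `1` in Python, which accounts for the single (empty) idempotent function on the empty set.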

In summary, the theory of species is a functorial calculus which projects onto its better-known “shadow”, the functional calculus of generating functions. That is to say, we lift operations on enumeration sequences $\{a_n\}$, as embodied in their generating functions, directly up to the level of the sets we’re counting, where the functorial operations become both richer and more concrete. The functorial analogue of the generating function itself is called the “analytic functor” attached to the species (the species itself being the concrete embodiment of the enumeration).

Much more could be said, of course. Instead, here’s a little exercise which can be solved by working through the ideas presented here: write down the egf for the number of ways a group of people can be split into pairs, and give an explicit formula for this number. Those of you who have studied quantum field theory may recognize this already (and certainly the egf is very suggestive!); in that case, you might find interesting the paper by Baez and Dolan, From Finite Sets to Feynman Diagrams, where the functorial point of view is used to shed light on, e.g., creation and annihilation operators in terms of simple combinatorial operations.

The literature on species (in all its guises) is enormous, but I’d strongly recommend reading the original paper on the subject:

• André Joyal, Une théorie combinatoire des séries formelles, Adv. Math. 42 (1981), 1-82.

which I’ve actually referred to before, in connection with a POW whose solution involves counting tree structures. Joyal could be considered to be a founding father of what I would call the “Montreal school of combinatorics”, of which a fine representative text would be

• F. Bergeron, G. Labelle, and P. Leroux, Combinatorial Species and Tree-like Structures, Encyclopedia of Mathematics and its Applications 67, 1998.

More to come, I hope!

I thought I would share with our chess-loving readers the following interesting (and somewhat well-known) mathematical chess paradox, apparently proving that $64=65$, and the accompanying explanation offered by Prof. Christian Hesse, University of Stuttgart (Germany). It shows a curious connection between the well-known Cassini’s identity (related to Fibonacci numbers) and the $8 \times 8$ chessboard ($8$ being a Fibonacci number!). The connection can be exploited further to come up with similar paradoxes wherein any $F_n \times F_n$ square can always be “rearranged” to form an $F_{n-1} \times F_{n+1}$ rectangle such that the difference between their areas is either $+1$ or $-1$. Of course, for the curious reader there are plenty of such dissection problems listed in Prof. David Eppstein’s Dissection page.
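Cassini’s identity, in the form $F_{n-1} F_{n+1} - F_n^2 = (-1)^n$ with the indexing $F_0 = 0$, $F_1 = 1$ (an assumption about conventions on my part), is easy to spot-check. The short Python sketch below does so; the case $n = 6$ is exactly the chessboard: $5 \times 13 - 8 \times 8 = 1$.

```python
def fib(n):
    """n-th Fibonacci number with fib(0) = 0 and fib(1) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

# Cassini's identity: the rectangle and the square differ in area by exactly 1.
for n in range(1, 20):
    assert fib(n - 1) * fib(n + 1) - fib(n) ** 2 == (-1) ** n

print(fib(5) * fib(7), fib(6) ** 2)  # areas of the 5 x 13 rectangle and the 8 x 8 square
```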

The following “polynomial-logarithmic” algebraic identity that one encounters on many occasions turns out to have a rather useful set of applications!

POLYNOMIAL-LOGARITHMIC IDENTITY: If $P(z)$ is a polynomial of degree $n \ge 1$ with roots $a_1, a_2, \ldots, a_n$, then $\displaystyle \frac{P'(z)}{P(z)} = \frac1{z-a_1} + \frac1{z-a_2} + \ldots + \frac1{z-a_n}$.

PROOF: This one is left as a simple exercise. (Hint: Logarithms!)

A nice application of the above identity is found in one of the exercises from the chapter titled Analysis (p120) in Proofs from the Book by Aigner, Ziegler and Hofmann.

EXERCISE: Let $p(x)$ be a non-constant polynomial with only real zeros. Show that $p'(x)^2 \ge p(x) p''(x)$ for all $x \in \mathbb{R}$.

SOLUTION: If $x = a_i$ is a zero of $p(x)$, then the right hand side of the above inequality equals zero, and we are done. So, suppose $x$ is not a root of $p(x)$. Then, differentiating the above identity w.r.t. $x$, we obtain $\displaystyle \frac{p''(x)p(x) - p'(x)^2}{p(x)^2} = - \sum_{k=1}^n \frac1{(x - a_k)^2} < 0$, and we are done.
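Both the identity and the inequality can be checked exactly on a sample polynomial. The Python sketch below uses rational arithmetic. The roots are hypothetical values chosen only for illustration (one of them repeated), and the helper names are my own.

```python
from fractions import Fraction

def poly_from_roots(roots):
    """Integer coefficients (lowest degree first) of the product of (x - r)."""
    coeffs = [1]
    for r in roots:
        shifted = [0] + coeffs                  # x * p(x)
        scaled = [r * c for c in coeffs] + [0]  # r * p(x)
        coeffs = [s - t for s, t in zip(shifted, scaled)]
    return coeffs

def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def evaluate(coeffs, x):
    """Evaluate the polynomial at x using Horner's rule."""
    value = Fraction(0)
    for c in reversed(coeffs):
        value = value * x + c
    return value

roots = [-3, -1, -1, 2, 5]  # hypothetical real roots, one repeated
p = poly_from_roots(roots)
dp = derivative(p)
ddp = derivative(dp)

for k in range(-40, 41):
    x = Fraction(k, 4)
    px, dpx, ddpx = evaluate(p, x), evaluate(dp, x), evaluate(ddp, x)
    if px != 0:
        # the polynomial-logarithmic identity, checked exactly
        assert dpx / px == sum(1 / (x - a) for a in roots)
    # the inequality from the exercise
    assert dpx ** 2 >= px * ddpx
```

A repeated root simply appears twice in the sum, which matches counting roots with multiplicity in the identity.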

It turns out that the above identity can also be used to prove the well-known Gauss-Lucas theorem.

GAUSS-LUCAS: If $P$ is a non-constant polynomial, then the zeros of $P'$ lie in the convex hull of the roots of $P$.

PROOF: See this

HISTORY: The well-known Russian author V.V. Prasolov in his book Polynomials offers a brief and interesting historical background of the theorem, in which he points out that Gauss’ original proof (in 1836) of a variant of the theorem was motivated by physical concepts, and it was only in 1874 that F. Lucas, a French Engineer, formulated and proved the above theorem. (Note that the Gauss-Lucas theorem can also be thought of as some sort of a generalization (at least, in spirit!) of Rolle’s theorem.)

Even though I knew the aforesaid identity before, it was once again brought to my attention through a nice (and elementary) article, titled On an Algebraic Identity by Roberto Bosch Cabrera, available at Mathematical Reflections. In particular, Cabrera offers a simple solution, based on an application of the given identity, to the following problem (posed in the 2006 4th issue of Mathematical Reflections), the solution to which had either escaped regular problem solvers or required knowledge of some tedious (albeit elementary) technique.

PROBLEM: Evaluate the sum $\displaystyle \sum_{k=0}^{n-1} \frac1{1 + 8\sin^2 (k\pi /n)}$. (proposed by Dorin Andrica and Mihai Piticari.)

There is yet another problem which has a nice solution based again on our beloved identity!

PROBLEM: (Putnam A3/2005) Let $p(z)$ be a polynomial of degree $n$, all of whose zeros have absolute value 1 in the complex plane. Put $g(z) = p(z)/z^{n/2}$. Show that all roots of $g'(z) = 0$ have absolute value 1.

“In mathematics you don’t understand things. You just get used to them.”

— John von Neumann

I had been wanting to write on this topic – no, I am not referring to the above quote by von Neumann – for quite some time but I wasn’t too sure if doing so would contribute anything “useful” to the ongoing discussion on the pedagogical roles of concrete and abstract examples in mathematics, a discussion that’s been going on on various blogs for some time now. In part coaxed by Todd, let me share some of my own observations for whatever they are worth.

First, some background. A few months ago, Scientific American published an article titled In Abstract: Avoid Concrete Example When Teaching Math (by Nikhil Swaminathan). Some excerpts from that article can be read below:

New research published in Science suggests that attempts by math teachers to make the subject easier to grasp by providing such practical examples may actually have made it tougher to learn.

For their study, Kaminski and her colleagues taught 80 undergraduate students—split into four 20-person groups—a new mathematical system (based on several simple arithmetic concepts) in different ways.

One group was taught using generic symbols such as circles and diamonds. The other groups were taught using practical scenarios such as combining liquids in measuring cups.

The researchers then tested their grasp of the concept by seeing how well they could apply it to an unrelated situation, in this case a children’s game. The results: students who learned using symbols on average scored 80 percent; the others scored between 40 and 50 percent, according to Kaminski.

One may read the entire article online to learn a bit more about the study done. Let me add that I do agree with the overall conclusion of the study cited: in mathematics concrete examples (in contradistinction to abstract ones) more often than not obfuscate the underlying concepts behind those examples, thus hindering “real” or complete understanding of those concepts. However, I also feel that such a claim must be somewhat qualified because there is more to it than meets the eye.

Sometimes the line between abstract examples and concrete ones can be quite blurry. What is more, some concrete examples may even be more abstract than other concrete ones. In this post, I will assume (and hope others do too) that the distinction between an abstract example and a concrete one (that I have chosen for this post) is sharp enough for our discussion. Of course, my aim is not to highlight such a distinction but to emphasize the importance of both abstract and concrete examples in mathematical education, for I firmly believe that a “concrete” understanding of concepts isn’t necessarily subsumed under an “abstract” one, even though a concrete example may just be a special case of a more general and abstract one. What is more, and this may sound surprising, abstract examples may sometimes not reveal certain useful principles which, on the other hand, may be clearly revealed by concrete ones!

Let me illustrate what I wrote above by discussing a somewhat well-known problem and its two related solutions, one of which employs an abstract approach and the other a concrete one, if you will. Some time ago, Isabel at God Plays Dice pointed to an online article titled An Intuitive Explanation of Bayesian Reasoning by Eliezer Yudkowsky, and I borrow the problem I am about to discuss in this post from that article.

PROBLEM: 1% of women at age forty who participate in routine screening have breast cancer. 80% of women with breast cancer will get positive mammographies. 9.6% of women without breast cancer will also get positive mammographies. A woman in this age group had a positive mammography in a routine screening. What is the probability that she actually has breast cancer?

How may one proceed to solve this problem? Well, first, let us look at an “abstract” solution.

“ABSTRACT” SOLUTION: Here we employ the machinery of set-theoretic probability theory to arrive at our answer. We first note that what we really want to compute is the probability of a woman having breast cancer given that she has tested positive. That is, we want to compute the conditional probability P(A/B), where event A corresponds to that of a woman having breast cancer and event B corresponds to that of a woman testing positive for breast cancer. Now, from Bayes’ theorem, we have

$\displaystyle P(A/B) = \frac{P(B/A) P(A)}{P(B/A) P(A) + P(B/A^{c}) P(A^{c})}$.

Also, we note that $P(A) = 0.01, P(B/A) = 0.8, P(A^{c}) = 0.99$ and $P(B/A^{c}) = 0.096$. Plugging these values into the above formula immediately yields P(A/B) = 7.76%. And, we are done.
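The “black box” character of the formula is easy to see in code. Here is a minimal Python sketch of the computation above; the function and parameter names are made up for illustration.

```python
def posterior(prior, p_pos_given_a, p_pos_given_not_a):
    """Bayes' formula for P(A/B) when B can occur with or without A."""
    numerator = p_pos_given_a * prior
    return numerator / (numerator + p_pos_given_not_a * (1 - prior))

# P(A) = 0.01, P(B/A) = 0.8, P(B/A^c) = 0.096
p = posterior(0.01, 0.8, 0.096)
print(f"{p:.2%}")
```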

A couple of observations.

1. It is not hard to observe that the derivation of Bayes’ formula follows from the definition of conditional probability, viz. P(A/B) = P(AB)/P(B), where P(B) > 0, and the usual set-theoretic rules involving the union and intersection of sets (events). And, this derivation can be carried out through sheer manipulation of symbols under those rules. By that I mean, if a student knows enough set theory as well as the “laws” of set-theoretic probability theory, then the derivation of Bayes’ theorem makes absolutely no (or, almost no) use of the “intuitive” faculty of a student.

2. The abstract method presented above also subsumes the concrete method, as we shall see shortly. What is more, Bayes’ formula can be generalized even further. This means that once we have this particularly useful “abstract” tool at our disposal, we can solve any number of similar problems by repeatedly using this tool in concrete (and even abstract) cases. In addition, Bayes’ theorem can also be thought of as a “black box” to which we apply certain inputs in order to get our output. This should not surprise us, for in mathematics the use of theorems as black boxes is a common one.

Now, the above two observations may lead one to believe that indeed there is almost no need to find a “concrete” solution to the above problem. After all, the abstract case takes care of the concrete cases completely.

However, let us see if we can come up with a concrete (that is, a far less abstract) solution and examine it more closely to see if we can extract some useful ideas/techniques from the same.

“CONCRETE” SOLUTION: Suppose we choose a random sample of 100,000 women of age forty. (We choose that figure for reasons that will be clear soon.) Then, we have two groups of women.

1st group: 1,000 (1%) women who have breast cancer.

2nd group: 99,000 (99%) women who don’t have breast cancer.

Now, in the 1st group, 800 (80% of 1,000) women will test positive, and, in the 2nd group, 9,504 (9.6% of 99,000) women will test positive. So, it is clear that if a woman tests positive, then the probability that she belongs to the 1st group (that is, she really has cancer) is 800/(800+9504) = 7.76 %. And, we are done.

Let me quickly point out a very important advantage the above solution has over the abstract one we saw earlier.

Indeed, we finally “see” what’s really going on. That is, from an intuitive standpoint, we observe in the above solution that there is a “tree structure” involved in our reasoning. The sample of 100,000 women bifurcates into two distinct samples, one of which has 1,000 women who have breast cancer and the other that has 99,000 women who don’t. Next, we observe that each of these two samples in turn bifurcates into two samples, one of which comprises women who test positive and the other that comprises women who don’t. This clearly reveals to the student the “tree structure” in the above reasoning. This makes the concrete solution much more appealing and “satisfying” to the average student. In fact, the generalization we talked about earlier in regard to Bayes’ theorem can even be carried out in this particular method: we will only need to increase the depth and/or breadth of our “tree” by extending more nodes from existing ones!
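The tree can be written down quite literally. In the following Python sketch (an illustration only, with invented names), each call to `split` is one bifurcation, and exact fractions are used so that every node holds a whole number of women.

```python
from fractions import Fraction

def split(count, rate):
    """Split a group of `count` people into (rate * count, the rest)."""
    first = Fraction(rate) * count
    assert first.denominator == 1, "pick a sample size that splits evenly"
    return int(first), count - int(first)

sample = 100_000
cancer, healthy = split(sample, Fraction(1, 100))              # first level of the tree
cancer_pos, cancer_neg = split(cancer, Fraction(80, 100))      # second level, 1st group
healthy_pos, healthy_neg = split(healthy, Fraction(96, 1000))  # second level, 2nd group

tree = {
    "cancer": {"positive": cancer_pos, "negative": cancer_neg},
    "no cancer": {"positive": healthy_pos, "negative": healthy_neg},
}
answer = Fraction(cancer_pos, cancer_pos + healthy_pos)
print(tree)
print(float(answer))
```

Increasing the depth or breadth of the tree, as suggested above, amounts to further calls to `split` on the leaves.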

Moreover, one may recall that the use of such “trees” in reasoning is quite common in mathematics. For instance, the two most basic rules of Combinatorial Principles, viz. the Rule of Sum and the Rule of Product are proved using such “trees”. So, this is one instance in which a concrete solution reveals much more clearly a quite fundamental principle/technique (use of “trees” in reasoning) in mathematics that isn’t clearly revealed at all in the abstract solution we examined earlier.

In other words, much thought needs to be put in deciding if abstract examples should necessarily be “favored” over concrete ones in mathematics education. From a pedagogical standpoint, sometimes concrete examples are simply much better than abstract ones!

Okay, folks, time for another Problem of the Week! I hope it generates more response than last week’s problem:

Let $C$ be a simple closed curve in the plane, and let $P$ be any point strictly in the region interior to $C$. Show there are two points on $C$ whose midpoint is $P$.

Please submit solutions to topological[dot]musings[At]gmail[dot]com by Wednesday, July 9, 11:59 pm (UTC); do not submit solutions in Comments. Everyone with a correct solution will be inducted into our Hall of Fame! We look forward to your response.

This week’s problem is offered more in the spirit of a light and pleasant diversion — I don’t think you’ll need any deep insight to solve it. (A little persistence may come in handy though!)

Define a triomino to be a figure congruent to the union of three of the four unit squares in a $2 \times 2$ square. For which pairs of positive integers $(m, n)$ is an $m \times n$ rectangle tileable by triominoes?

Please submit solutions to topological[dot]musings[At]gmail[dot]com by Wednesday, July 3, 11:59 pm (UTC); do not submit solutions in Comments. Everyone with a correct solution will be inducted into our Hall of Fame! We look forward to your response. Enjoy!

We got some very good response to our last week’s problem from several of our “regular” problem-solvers as well as several others who are “new”. There were solutions that were more “algebraic” than others, some that had a more “trigonometric” flavor to them and some that had a combination of both. All the solutions we received this time were correct and they all deserve to be published, but for the sake of brevity I will post just one.

Solution to POW-5: (due to Animesh Datta, Univ of New Mexico)

Note that the given integral may be written as

$\displaystyle \int \frac{x^2 - 1}{x(x^2 + 1) \sqrt{x^2 + 1/x^2}} \, dx$

$\displaystyle = \int \frac{1 - 1/x^2}{(x + 1/x) \sqrt{(x + 1/x)^2 - 2}} \, dx$.

Now, we use the substitution $t = x + 1/x$, which transforms the integral into

$\displaystyle \int \frac1{t \sqrt{t^2 - 2}} \, dt$.

Finally, we use one last (trigonometric) substitution $t = \sqrt{2} \sec \theta$, which transforms the integral into $\displaystyle \int \frac1{\sqrt{2}} \, d\theta$, which evaluates to $\theta /\sqrt{2} + C$, which equals $\displaystyle \frac1{\sqrt2} \arctan \sqrt{\frac12 (x^2 + \frac1{x^2})} + C$. And this is our final answer!
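An antiderivative can always be sanity-checked by differentiating it numerically. The Python sketch below compares a central difference of the final answer with the integrand at a few sample points. I restrict to $x > 0$, which is the range where the rewriting $\sqrt{x^4 + 1} = x \sqrt{x^2 + 1/x^2}$ used above holds as written. The step size and tolerance are arbitrary choices.

```python
from math import atan, sqrt

def integrand(x):
    return (x**2 - 1) / ((x**2 + 1) * sqrt(x**4 + 1))

def antiderivative(x):
    """The answer above, with the constant of integration set to 0."""
    return atan(sqrt((x**2 + 1 / x**2) / 2)) / sqrt(2)

def central_difference(f, x, h=1e-5):
    """Numerical approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

for x in [0.3, 0.7, 1.0, 1.5, 3.0]:  # sample points with x > 0 only
    assert abs(central_difference(antiderivative, x) - integrand(x)) < 1e-6
```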

Watch out for the next POW that will be posted by Todd!

Source: I had mentioned earlier that Carl Lira had brought this integral to our attention, and he in turn had found it in the MIT Integration Bee archives. This one was from the year 1994.

Trivia: Four out of the six people who sent correct solutions are either Indians or of Indian origin! Coincidence? 🙂

Time for our next problem in the POW series! Earlier, Todd and I deliberated for a bit on whether we should pose a “hard” Ramanujan identity (involving an integral and Gamma function) as the next POW, but decided against doing it. Perhaps, we may do so some time in the future.

Okay, the following integral was brought to our attention by Carl Lira, and for the time being I won’t reveal the actual source of the problem.

Compute $\displaystyle \int \frac{x^2 - 1}{(x^2 + 1) \sqrt{x^4 + 1}} \, dx$.

It is “hard” or “easy” depending on how you look at it!

Please send your solutions to topological[dot]musings[At]gmail[dot]com by Wednesday, June 26, 11:59pm (UTC); do not submit solutions in Comments. Everyone with a correct solution gets entered in our Hall of Fame! We look forward to your response.

The solutions are in! I thought last week’s problem might have been a little more challenging than problems of previous weeks — the identity is just gorgeous, but not at all obvious (I don’t think!) without some correspondingly gorgeous combinatorial insight. Luckily, some of our readers came up with the goods, and their solutions provide a forum for discussing a beautiful circle of ideas, involving the inter-related combinatorics of trees and endofunctions.

I can’t decide which of the solutions we received I like best. They all bear a certain familial resemblance, but each has its own distinct personality. I’ll give two representative examples, and append some comments at the end. Both proofs are conceptual “bijective” proofs, in which the two sides of the identity represent two different ways of counting essentially the same combinatorial objects. And both rely on a famous theorem of Cayley, on the number of tree structures or spanning trees on $n$ distinct labeled nodes (maybe this would be sufficient hint, if you still want to think about it some more by yourself!). Here, I’ll add a little spoiler space:

1. (Solution by David Eppstein) As is well known (see, e.g., http://en.wikipedia.org/wiki/Pr%C3%BCfer_sequence), the number of different spanning trees on a set of $n$ labeled nodes is $n^{n-2}$. Equivalently, the number of ways of choosing a spanning tree together with a specification of a single vertex $s$ is $n^{n-1}$, and the number of ways of choosing a spanning tree together with a specification of two vertices $s$ and $t$ (possibly equal to each other) is $n^n$. So that’s the right hand side of the identity.

Now suppose you are given a tree $T$, and two nodes $s$ and $t$. If $s$ and $t$ are different, let $(s,u)$ be the first edge on the path in $T$ from $s$ to $t$; cutting $T$ at that edge produces two disjoint subtrees, one containing one marked node $s$ and the other containing two (possibly equal) marked nodes, namely $t$ and the first node $u$ on the path after $s$. Conversely, from this information (two trees, one containing a marked node $s$ and the other containing two marks on nodes $u$ and $t$) we can put together $T$ simply by connecting the two trees by an edge $(s,u)$. If $j$ is the number of nodes in the tree containing $s$, the number of ways we can choose two disjoint marked subtrees in this way is

$\displaystyle \sum_{j=1}^{n-1} {n\choose j} j^{j-1} (n-j)^{n-j},$

almost the same as the left hand side of the identity, but missing the final term in the sum.

The final term comes from the case when the marked nodes $s$ and $t$ of tree $T$ coincide. The number of ways this can happen is the same as the number of ways we can pick a single marked node of a tree, that is, $n^{n-1}$, which is the same as the final term in the left hand sum.

Thus, the left side counts (partitions of $n$ vertices into two disjoint subtrees, one subtree having one marked node and one subtree having two possibly-equal marks) + ($n$-vertex trees with one marked node); the right side counts ($n$-vertex trees with two possibly-equal marks), and we have demonstrated a combinatorial equivalence between these two sets. $\Box$
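
As a quick sanity check on the counting above (a sketch only, not part of David’s submission), the two cases can be tallied numerically for small $n$. The function name `split_count` is made up for this illustration.

```python
from math import comb

def split_count(n):
    """Return (count with s != t, count with s == t) for n labeled nodes."""
    # s != t: pick the j nodes on the s-side, a tree on them with s marked
    # (j^(j-1) ways), and a doubly marked tree on the remaining n - j nodes
    # ((n-j)^(n-j) ways).
    distinct = sum(comb(n, j) * j ** (j - 1) * (n - j) ** (n - j)
                   for j in range(1, n))
    # s == t: a tree on all n nodes with a single marked node.
    coincident = n ** (n - 1)
    return distinct, coincident

for n in range(1, 9):
    distinct, coincident = split_count(n)
    # Together the two cases should account for all n^n doubly marked trees.
    assert distinct + coincident == n ** n
```

For $n = 3$, for instance, the two cases contribute $18$ and $9$, which add up to $27 = 3^3$.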

2. (Solution by Sune Jakobsen) Consider all $(n-1)$-tuples $a=(a_1,a_2,...,a_{n-1})$, where each term is from the set $\{1,2, \ldots, n\}.$ Since each of the $n-1$ terms can take $n$ values, there are $n^{n-1}$ such tuples.

Given an $(n-1)$-tuple $a$, construct a graph as follows. Begin with a vertex labeled $n$. Then, for each vertex labeled $k$ in the graph, if $a_i=k$, add a new vertex labeled $i$, and connect $i$ and $k$ by an edge, repeating until no more vertices can be added. This graph must be a tree since each $a_i$ takes only one value and $a_n$ doesn’t exist.

Using this graph, I will count the number of $(n-1)$-tuples in another way. Let $j$ be the number of vertices in such a tree graph. The vertices may be chosen in $\displaystyle \binom{n-1}{j-1}$ ways, since the vertex labeled $n$ is already one of them. Given the vertices, the tree can be formed in $j^{j-2}$ ways, by Cayley’s theorem (see http://www.research.att.com/~njas/sequences/A000272). Given the tree graph, the values of the $a_i$’s, for each vertex labeled $i$ in the graph, can be chosen in one and only one way (namely, $a_i$ is the label of the first vertex after $i$ along the unique path from vertex $i$ to vertex $n$). The remaining $n-j$ components of the tuple, namely those $a_i$ whose index $i$ is not a vertex of the graph, cannot take values among the vertex labels in the graph (otherwise $i$ would have been attached to it), so each takes on one of $n-j$ possible values, giving $(n-j)^{n-j}$ possibilities for the remaining components. Therefore the number of $(n-1)$-tuples must be:

$\displaystyle n^{n-1} = \sum_{j=1}^{n} \binom{n-1}{j-1} j^{j-2} (n-j)^{n-j}$

Multiplying both sides of the previous equation by $n$ and using $\displaystyle \frac{n}{j}\binom{n-1}{j-1} = \binom{n}{j}$, the claim follows. $\Box$
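
The construction in this solution is easy to try out by brute force. The following is a minimal sketch, assuming the tuple is stored as a Python tuple with `a[i-1]` playing the role of $a_i$; the function names are invented for the illustration. It grows the tree from vertex $n$ for every tuple, tallies the tuples by tree size $j$, and compares the tallies with the terms of the sum.

```python
from collections import Counter
from itertools import product
from math import comb

def tree_size(a, n):
    """Number of vertices in the tree grown from vertex n; a[i-1] stands for a_i."""
    in_tree = {n}
    grew = True
    while grew:
        grew = False
        for i in range(1, n):
            # Attach i as soon as the vertex a_i it points to is in the tree.
            if i not in in_tree and a[i - 1] in in_tree:
                in_tree.add(i)
                grew = True
    return len(in_tree)

def sizes_by_enumeration(n):
    """Tally all (n-1)-tuples over {1, ..., n} by the size j of their tree."""
    tuples = product(range(1, n + 1), repeat=n - 1)
    return dict(Counter(tree_size(a, n) for a in tuples))

def sizes_by_formula(n):
    """Terms binom(n-1, j-1) * j^(j-2) * (n-j)^(n-j) for j = 1, ..., n."""
    # j^(j-2) is computed as j^(j-1) // j to stay in integers when j = 1.
    return {j: comb(n - 1, j - 1) * (j ** (j - 1) // j) * (n - j) ** (n - j)
            for j in range(1, n + 1)}

for n in range(1, 7):
    assert sizes_by_enumeration(n) == sizes_by_formula(n)
    assert sum(sizes_by_formula(n).values()) == n ** (n - 1)
```

Python evaluates `0 ** 0` as `1`, which matches the convention $0^0 = 1$ used for the $j = n$ term.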

Remarks:

1. I found this curious identity in HAKMEM, item 118. For those who don’t know, HAKMEM is a kind of archive of cool mathematical observations made by some of the original MIT computer “hackers” from the 60’s and 70’s, including Bill Gosper and Rich Schroeppel. This particular item is credited to Schroeppel, but the accompanying text is a bit cryptic in my view:

Differentiate $ye^{-y} = x$ to get $y + y x y' - x y' = 0$. One observes the curious identity
$\displaystyle \sum_{j=1}^n \binom{n}{j} j^{j-1} (n-j)^{(n-j)} = n^n$ ($0^0 = 1$)
and thus
$\displaystyle y(x) = \sum_{n \geq 1} \frac{n^{n-1} x^n}{n!}$.

Maybe it was just their style to record a lot of their observations in such terse, compact form, but it annoys me that these guys hide their light under a bushel in this way. No motivation whatsoever, even though (I’d be willing to bet) these guys knew about the connection to trees — they’re computer scientists, after all!
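
For readers who want the quoted steps spelled out, here is one way to unpack them (one reading of the terse text, not something stated in HAKMEM itself). Differentiating $ye^{-y} = x$ with respect to $x$ gives

$\displaystyle y' e^{-y} - y y' e^{-y} = 1.$

Substituting $e^{-y} = x/y$ and multiplying through by $y$ gives

$\displaystyle x y' - x y y' = y,$

which rearranges to $y + y x y' - x y' = 0$, the equation in the quote. Writing $\displaystyle y = \sum_{n \geq 1} c_n x^n$ and comparing coefficients of $x^n$ in $y = x y' (1 - y)$ gives

$\displaystyle (n-1) c_n = \sum_{j=1}^{n-1} j c_j c_{n-j}.$

With $c_n = n^{n-1}/n!$, multiplying by $n!$ turns this into

$\displaystyle \sum_{j=1}^{n-1} \binom{n}{j} j^{j} (n-j)^{n-j-1} = (n-1) n^{n-1} = n^n - n^{n-1},$

which is the curious identity with $j$ and $n-j$ interchanged and the $j = n$ term $n^{n-1}$ moved to the right hand side.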

Personally, I find it easier to get from $y = x e^y$ to $\displaystyle y = \sum_{n=1}^\infty \frac{n^{n-1}x^n}{n!}$ by other means than through their intermediate identity. I feel sure that just about anyone who has played around with enumerative combinatorics, and with the combinatorics of trees in particular, could figure this one out.

For, as David pointed out in his solution, $n^{n-1}$ is the number of spanning trees on the set [$n$] $= \{1, 2, \ldots, n\}$ equipped with a distinguished vertex; I’ll call that vertex the root, and such structures rooted trees. (Incidentally, a spanning tree is by definition an acyclic subgraph of the complete graph on the set [$n$], such that any two elements of the set are connected or spanned by a path in the subgraph. The theorem of Cayley mentioned above is that there are $n^{n-2}$ such spanning trees.) Thus,

$\displaystyle y(x) = \sum_{n=1}^\infty \frac{n^{n-1} x^n}{n!}$
is the exponential generating function (egf) for rooted trees.

On the other hand, it is not hard to see that the functional equation $y = xe^y$ holds for the egf of rooted trees (and uniquely determines the power series of the egf). One just applies some basic principles; I’ll just say it briefly and hope it’s somewhat followable: a rooted tree structure on a finite set $S$ is given by the selection of a root $r \in S$, together with a partition of the remainder $S - \{r\}$ into equivalence classes and a choice of rooted tree structure on each class. (Severing the root results in a bunch of disjoint subtrees, whose roots are those vertices adjacent to the original root.) At the level of egf’s, selection of the root accounts for the factor $x$ on the right of the functional equation, and if $y$ is the egf for rooted trees, then the other factor $e^y$ is the egf for the collection of ways of partitioning a set into nonempty classes and putting a rooted tree structure on each class. This is all part of the art and science of generatingfunctionology. It’s beautiful stuff.
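
The claim that the functional equation pins down the power series can be checked mechanically. Below is a minimal sketch using exact rational arithmetic; the helper names are invented here. It starts from $y = 0$ and repeatedly substitutes into $y = xe^y$, with each pass fixing at least one more coefficient, then compares the result with $n^{n-1}/n!$.

```python
from fractions import Fraction
from math import factorial

def multiply(a, b, order):
    """Product of two power series (coefficient lists), truncated at x^order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out

def exp_series(y, order):
    """Truncated exp(y) for a series y with zero constant term."""
    result = [Fraction(1)] + [Fraction(0)] * order
    term = result[:]  # holds y^k / k!, starting with k = 0
    for k in range(1, order + 1):
        term = [c / k for c in multiply(term, y, order)]
        result = [r + t for r, t in zip(result, term)]
    return result

def rooted_tree_egf(order):
    """Iterate y <- x * exp(y) until the coefficients up to x^order settle."""
    y = [Fraction(0)] * (order + 1)
    for _ in range(order):
        # Multiplying by x shifts every coefficient up by one degree.
        y = [Fraction(0)] + exp_series(y, order)[:order]
    return y

order = 8
y = rooted_tree_egf(order)
for n in range(1, order + 1):
    assert y[n] == Fraction(n ** (n - 1), factorial(n))
```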

Somehow I find this explanation much easier to understand than the machinations hinted at in HAKMEM 118.

2. David’s proof was actually the one I myself had in mind. I can’t say what inspired David, but I myself was inspired by an earlier reading of a beautiful (and in many respects revolutionary-for-its-time) article, on a systematic functorial approach to enumerative combinatorics:

• André Joyal, Une théorie combinatoire des séries formelles, Adv. Math. 42 (1981), 1-82.

In particular, I am very fond of the proof Joyal gives for Cayley’s theorem (which he credits to Gilbert Labelle), and this proof is in a line of thought which also leads to David’s solution. I’d like to present that proof now.

Labelle’s proof of Cayley’s theorem:

The expression $n^n$ probably makes most people think of the number of functions $f$: [$n$] $\to$ [$n$] from an $n$-element set to itself. The art of combinatorics lies in drawing appropriate pictures, so draw a picture (a graph) of such a function by drawing a directed edge from $i$ to $j = f(i)$ whenever $i \neq j$ (cf. Sune’s solution). Starting from any vertex and iterating $f$ enough times, you always eventually land in a cycle where points get revisited periodically, infinitely often. Let’s call those points periodic points; the function $f$ acts as a permutation on periodic points. Now, for each periodic point $p$, consider the union of directed paths which end at $p$ without hitting any other periodic points. This union forms a subgraph $T_p$ which is a tree, rooted at $p$ (again, cf. Sune’s solution). The entire set [$n$] is thereby partitioned into (equivalence) classes (the underlying vertex sets of the trees $T_p$), and the structure of a function $f:$ [$n$] $\to$ [$n$] thus determines the following data:

• An equivalence relation on [$n$];
• A rooted tree structure on each equivalence class;
• A permutation structure on the set of equivalence classes (each tagged by the periodic point at the root).

Conversely, these three data determine a function, and the correspondence is bijective.
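
To make the decomposition concrete, here is a small sketch that extracts the periodic points and the tree classes from a function given as a Python dict. The representation and the name `decompose` are assumptions made for this illustration.

```python
def decompose(f):
    """Split an endofunction f (dict i -> f(i)) into periodic points and trees.

    Returns (periodic, root_of), where periodic is the set of points lying on
    a cycle of f and root_of[i] is the first periodic point reached from i.
    """
    n = len(f)
    # After n applications of f, the image has shrunk to the periodic points.
    image = set(f)
    for _ in range(n):
        image = {f[i] for i in image}
    periodic = image
    root_of = {}
    for i in f:
        p = i
        while p not in periodic:
            p = f[p]
        root_of[i] = p
    return periodic, root_of

f = {1: 2, 2: 3, 3: 2, 4: 4, 5: 4}
periodic, root_of = decompose(f)
assert periodic == {2, 3, 4}
assert root_of == {1: 2, 2: 2, 3: 3, 4: 4, 5: 4}
```

In this example the equivalence classes are $\{1, 2\}$, $\{3\}$ and $\{4, 5\}$, rooted at $2$, $3$ and $4$ respectively, and $f$ permutes the roots by swapping $2$ and $3$ and fixing $4$.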

• Remark: It’s not necessary to the proof, but let me add that by basic principles of generatingfunctionology, if $p(x)$ is the egf for permutations [namely, $\displaystyle p(x) = \frac1{1-x}$], and if $y(x)$ is the egf for rooted trees, then $(p \circ y)(x)$ is the egf for structures given by such triplets of data. Thus, by the bijective correspondence, we have

$\displaystyle \sum_{n=0}^\infty \frac{n^n x^n}{n!} = (p \circ y)(x).$
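
This generating function identity can also be checked coefficient by coefficient. The sketch below takes the coefficients $n^{n-1}/n!$ of $y$ as given, expands $\displaystyle \frac1{1-y} = \sum_{k \geq 0} y^k$ with exact rationals, and compares with $n^n/n!$; the helper names are invented for the illustration.

```python
from fractions import Fraction
from math import factorial

def multiply(a, b, order):
    """Product of two power series (coefficient lists), truncated at x^order."""
    out = [Fraction(0)] * (order + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j > order:
                break
            out[i + j] += ai * bj
    return out

def compose_geometric(y, order):
    """Coefficients of 1/(1 - y) = sum of y^k, for y with zero constant term."""
    total = [Fraction(0)] * (order + 1)
    power = [Fraction(1)] + [Fraction(0)] * order  # y^0
    for _ in range(order + 1):
        total = [t + p for t, p in zip(total, power)]
        power = multiply(power, y, order)
    return total

order = 7
# egf of rooted trees: coefficient of x^n is n^(n-1) / n! for n >= 1.
y = [Fraction(0)] + [Fraction(n ** (n - 1), factorial(n))
                     for n in range(1, order + 1)]
# egf of endofunctions: coefficient of x^n is n^n / n!, with 0^0 = 1.
endofunctions = [Fraction(n ** n, factorial(n)) for n in range(order + 1)]
assert compose_geometric(y, order) == endofunctions
```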

On the other hand, consider a tree structure $T$ on $n$ points, and suppose we also specify an ordered pair of such points $(s, t)$, possibly equal. There is a unique path from $s$ to $t$ in $T$, which I’ll call the spine (of the “bipointed tree”); call the points along that path, including $s$ and $t$, vertebrae. Now, for each point $x \in T$, there is a unique shortest path from $x$ to the spine, terminating at a vertebra $p$. The union of all such paths which terminate at a vertebra $p$ again forms a subtree $T_p$ rooted at $p$. Again, the set of $n$ points is partitioned by the (underlying vertex sets of) $T_p$, and the structure of a bipointed tree on an $n$-element set [$n$] is thus encoded [in bijective fashion] by

• An equivalence relation on [$n$];
• A rooted tree structure on each equivalence class;
• A spine structure (that is, a linear ordering) on the roots which tag the equivalence classes.

However, the number of linear orderings on a $k$-element set, $k!$, is the same as the number of permutations on that set. We conclude that the number of bipointed tree structures on an $n$-element set is the same as the number of endofunctions, $n^n$. And, voilà! Since each tree structure admits exactly $n^2$ choices of the ordered pair $(s, t)$, the number of tree structures on the $n$-element set must therefore be $n^n/n^2 = n^{n-2}$. $\Box$
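
For small $n$ the conclusion can be confirmed by brute force. The sketch below (an illustration only, with invented function names) counts spanning trees of the complete graph by testing every set of $n-1$ edges for connectivity, then checks that $n^2$ times that count equals $n^n$.

```python
from itertools import combinations

def find(parent, v):
    """Follow parent links to the representative of v's component."""
    while parent[v] != v:
        v = parent[v]
    return v

def count_labeled_trees(n):
    """Count spanning trees of the complete graph on {0, ..., n-1}."""
    all_edges = list(combinations(range(n), 2))
    count = 0
    for edges in combinations(all_edges, n - 1):
        # n - 1 edges form a tree exactly when they connect all n vertices.
        parent = list(range(n))
        components = n
        for u, v in edges:
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                parent[ru] = rv
                components -= 1
        if components == 1:
            count += 1
    return count

for n in range(1, 7):
    # Each tree carries n^2 ordered pairs (s, t), so bipointed trees number n^n.
    assert count_labeled_trees(n) * n * n == n ** n
```

The search is exponential in $n$, so it is only meant for very small cases; $n = 6$ already means testing $3003$ edge sets.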

Note: regular solver Philipp Lampe, who submitted a solution similar to David’s, pointed out that there are no fewer than four proofs of Cayley’s theorem given in Aigner and Ziegler’s Proofs from The Book, which I referred to in an earlier post. At this point, I really wish I had that book! I’d be delighted if someone were to post one of those nice proofs in comments…

3. I’m not quite sure, but Sune’s solution just might be my current favorite, just because it makes obvious contact with the circle of ideas which embrace endofunctions, trees, and rooted trees (I think of the tuples there as endofunctions, or actually, partial endofunctions on $(n-1)$-element sets). In any event, my sincere thanks go to David, Philipp, and Sune for their insightful responses.

Encouraged and emboldened (embiggened?) by the ingenuity displayed by some of our readers, I’d like to see what sort of response we get to this Problem of the Week:

Establish the following identity: $\displaystyle \sum_{j=1}^n \binom{n}{j} j^{j-1} (n-j)^{(n-j)} = n^n$ for all natural numbers $n > 0$.

(Here we make the convention $0^0 = 1$.) I find this problem tantalizing because it looks as if there should be some sort of conceptual proof — can you find one?

Please send your solutions to topological[dot]musings[At]gmail[dot]com by Wednesday, June 11, 11:59pm (UTC); do not submit solutions in Comments. Everyone with a correct solution gets entered in our Hall of Fame! We look forward to your response.
