The solutions are in! I thought last week’s problem might have been a little more challenging than problems of previous weeks — the identity is just gorgeous, but not at all obvious (I don’t think!) without some correspondingly gorgeous combinatorial insight. Luckily, some of our readers came up with the goods, and their solutions provide a forum for discussing a beautiful circle of ideas, involving the inter-related combinatorics of trees and endofunctions.
I can’t decide which of the solutions we received I like best. They all bear a certain familial resemblance, but each has its own distinct personality. I’ll give two representative examples, and append some comments at the end. Both proofs are conceptual “bijective” proofs, in which the two sides of the identity represent two different ways of counting essentially the same combinatorial objects. And both rely on a famous theorem of Cayley, on the number of tree structures or spanning trees on distinct labeled nodes (maybe this would be sufficient hint, if you still want to think about it some more by yourself!). Here, I’ll add a little spoiler space:
1. (Solution by David Eppstein) As is well known (see, e.g., http://en.wikipedia.org/wiki/Pr%C3%BCfer_sequence), the number of different spanning trees on a set of n labeled nodes is . Equivalently, the number of ways of choosing a spanning tree together with a specification of a single vertex is , and the number of ways of choosing a spanning tree together with a specification of two vertices and (possibly equal to each other) is . So that’s the right hand side of the identity.
Now suppose you are given a tree , and two nodes and . If and are different, let be the first edge on the path in from to ; cutting at that edge produces two disjoint subtrees, one containing one marked node and the other containing two (possibly equal) marked nodes, namely and the first node on the path after . Conversely, from this information (two trees, one containing a marked node and the other containing two marks on nodes and ) we can put together simply by connecting the two trees by an edge . If is the number of nodes in the tree containing , the number of ways we can choose two disjoint marked subtrees in this way is
almost the same as the left hand side of the identity, but missing the final term in the sum.
The final term comes from the case when the marked nodes and of tree T coincide. The number of ways this can happen is the same as the number of ways we can pick a single marked node of a tree, that is, , which is the same as the final term in the left hand sum.
Thus, the left side counts (partitions of n vertices into two disjoint subtrees, one subtree having one marked node and one subtree having two possibly-equal marks) + (-vertex trees with one marked node); the right side counts (-vertex trees with two possibly-equal marks), and we have demonstrated a combinatorial equivalence between these two sets.
2. (Solution by Sune Jakobsen) Consider all -tuples , where each term is from the set Since each of the terms can take values, there are such tuples.
Given a -tuple, , construct a graph as follows. Begin with a vertex labeled . Then, for each vertex labeled in the graph, if , add a new vertex labeled , and connect and by an edge. This graph must be a tree since each only takes one value and doesn’t exist.
Using this graph, I will count the number of -tuples in another way. Let be the number of vertices in such a tree graph. The vertices may be chosen in ways, since the vertex labeled is already one of them. Given the vertices, the tree can be formed in ways, by Cayley’s theorem (see http://www.research.att.com/~njas/sequences/A000272). Given the tree graph, the values of the ‘s, for each vertex labeled in the graph, can be chosen in one and only one way (namely, is the label of the first vertex after along the unique path from vertex to vertex ). The remaining components of the tuple are not among the vertex labels in the graph, so each takes on one of possible values, giving possibilities for the remaining components. Therefore the number of -tuples must be:
1. I found this curious identity in HAKMEM, item 118. For those who don’t know, HAKMEM is a kind of archive of cool mathematical observations made by some of the original MIT computer “hackers” from the 60′s and 70′s, including Bill Gosper and Rick Schroeppel. This particular item is credited to Schroeppel, but the accompanying text is a bit cryptic in my view:
Differentiate to get . One observes the curious identity
Maybe it was just their style to record a lot of their observations in such terse, compact form, but it annoys me that these guys hide their light under a bushel in this way. No motivation whatsoever, even though (I’d be willing to bet) these guys knew about the connection to trees — they’re computer scientists, after all!
Personally, I find it easier to get from to by other means than through their intermediate identity. I feel sure that just about anyone who has played around with enumerative combinatorics, and with the combinatorics of trees in particular, could figure this one out.
For, as David pointed out in his solution, is the number of spanning trees on the set  equipped with a distinguished vertex; I’ll call that vertex the root, and such structures rooted trees. (Incidentally, a spanning tree is by definition an acyclic subgraph of the complete graph on the set , such that any two elements of the set are connected or spanned by a path in the subgraph. The theorem of Cayley mentioned above is that there are such spanning trees.) Thus,
Somehow I find this explanation much easier to understand than the machinations hinted at in HAKMEM 118.
2. David’s proof was actually the one I myself had in mind. I can’t say what inspired David, but I myself was inspired by an earlier reading of a beautiful (and in many respects revolutionary-for-its-time) article, on a systematic functorial approach to enumerative combinatorics:
- André Joyal, Une théorie combinatoire des séries formelles, Adv. Math. 42 (1981), 1-82.
In particular, I am very fond of the proof Joyal gives for Cayley’s theorem (which he credits to Gilbert Labelle), and this proof is in a line of thought which also leads to David’s solution. I’d like to present that proof now.
Labelle’s proof of Cayley’s theorem:
The expression probably makes most people think of the number of functions :   from an -element set to itself. The art of combinatorics lies in drawing appropriate pictures, so draw a picture (a graph) of such a function by drawing a directed edge from to whenever (cf. Sune’s solution). Starting from any vertex and iterating enough times, you always eventually land in a cycle where points get revisited periodically, infinitely often. Let’s call those points periodic points; the function acts as a permutation on periodic points. Now, for each periodic point , consider the union of directed paths which end at without hitting any other periodic points. This union forms a subgraph which is a tree, rooted at (again, cf. Sune’s solution). The entire set  is thereby partitioned into (equivalence) classes (the underlying vertex sets of the trees ), and the structure of a function   thus determines the following data:
- An equivalence relation on ;
- A rooted tree structure on each equivalence class;
- A permutation structure on the set of equivalence classes (each tagged by the periodic point at the root).
Conversely, these three data determine a function, and the correspondence is bijective.
- Remark: It’s not necessary to the proof, but let me add that by basic principles of generatingfunctionology, if is the egf for permutations [namely, ], and if is the egf for rooted trees, then is the egf for structures given by such triplets of data. Thus, by the bijective correspondence, we have
On the other hand, consider a tree structure on points, and suppose we also specify an ordered pair of such points , possibly equal. There is a unique path from to in , which I’ll call the spine (of the “bipointed tree”); call the points along that path, including and , vertebrae. Now, for each point , there is a unique shortest path from to the spine, terminating at a vertebra . The union of all such paths which terminate at a vertebra again forms a subtree rooted at . Again, the set of points is partitioned by the (underlying vertex sets of) , and the structure of a bipointed tree on an -element set  is thus encoded [in bijective fashion] by
- An equivalence relation on ;
- A rooted tree structure on each equivalence class;
- A spine structure (that is, a linear ordering) on the roots which tag the equivalence classes.
However, the number of linear orderings on an -element set, , is the same as the number of permutations on that set. We conclude that the number of bipointed tree structures on an -element set is the same as the number of endofunctions, . And, voilà! the number of tree structures on the -element set must therefore be .
Note: regular solver Philipp Lampe, who submitted a solution similar to David’s, pointed out that there are no fewer than four proofs of Cayley’s theorem given in Aigner and Ziegler’s Proofs from The Book, which I referred to in an earlier post. At this point, I really wish I had that book! I’d be delighted if someone were to post one of those nice proofs in comments…
3. I’m not quite sure, but Sune’s solution just might be my current favorite, just because it makes obvious contact with the circle of ideas which embrace endofunctions, trees, and rooted trees (I think of the tuples there as endofunctions, or actually, partial endofunctions on -element sets). In any event, my sincere thanks go to David, Philipp, and Sune for their insightful responses.