On the independent set sequence of a tree

Alavi, Malde, Schwenk and Erdős asked whether the independent set sequence of every tree is unimodal. Here we make some observations about this question. We show that for the uniformly random (labelled) tree, asymptotically almost surely (a.a.s.) the initial approximately 49.5% of the sequence is increasing while the terminal approximately 38.8% is decreasing. Our approach uses the Matrix Tree Theorem, combined with computation. We also present a generalization of a result of Levit and Mandrescu, concerning the final one-third of the independent set sequence of a König-Egerváry graph.

Theorem 1.2 (Levit and Mandrescu). For a König-Egerváry graph G with maximum independent set size α, the sequence (i_k)_{k=⌈(2α−1)/3⌉}^{α} is weakly decreasing.

Every tree is a König-Egerváry graph, so the (non-zero part of the) independent set sequence of a tree is weakly decreasing for its last one-third. Theorem 1.2 is easily seen to be tight: the graph consisting of α vertex-disjoint edges (and no other vertices) has independent set sequence which is weakly decreasing from exactly i_{⌈(2α−1)/3⌉} on.
In this note we make a number of observations around Question 1.1, the first of which is a generalization of Theorem 1.2, showing that the theorem is more about graphs with independent sets of size at least half the number of vertices than about König-Egerváry graphs.

Theorem 1.3. Let G be a graph (not necessarily a tree or a König-Egerváry graph) with n vertices and maximum independent set size α. The sequence (i_k)_{k=ℓ}^{α} is weakly decreasing, where ℓ = ⌈α(n−1)/(α+n)⌉.
If κ > 0 satisfies α ≥ κn then the sequence (i_k)_{k=⌈(α−κ)/(κ+1)⌉}^{α} is weakly decreasing.  (1)

(See Section 2.1 for the proof.) The second part of Theorem 1.3 follows quickly from the first: if α ≥ κn then n ≤ α/κ, and since α(n−1)/(α+n) is increasing in n it is at most (α−κ)/(κ+1). Every n-vertex graph satisfies µ ≤ n/2, where µ is the matching number, and a König-Egerváry graph satisfies α + µ = n, so every König-Egerváry graph satisfies α ≥ n/2 (the converse of this is not true; for example, K_3 together with two isolated vertices has 3 = α ≥ 5/2 = n/2 but is not König-Egerváry). Thus, taking κ = 1/2 in (1) we recover Theorem 1.2. Theorem 1.3 gives no new information on Question 1.1, the status of the independent set sequence of all trees, because there are trees with α = ⌈n/2⌉. But for trees with α larger than n/2, it gives a decreasing tail longer than one-third of the length of the sequence.
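Theorem 1.3 and the tightness example above can be checked by brute force on small instances. The following sketch (function names are ours, not from the text) computes the independent set sequence of α vertex-disjoint edges and confirms both the decreasing tail guaranteed by Theorem 1.3 and that the decrease starts exactly at ⌈(2α−1)/3⌉:

```python
from itertools import combinations
from math import ceil

def indep_sequence(n, edges):
    """Brute-force independent set sequence (i_0, ..., i_n) of a graph on {0, ..., n-1}."""
    adj = {frozenset(e) for e in edges}
    seq = [0] * (n + 1)
    for k in range(n + 1):
        for S in combinations(range(n), k):
            if all(frozenset(p) not in adj for p in combinations(S, 2)):
                seq[k] += 1
    return seq

def decreasing_from(seq):
    """Smallest index from which the sequence, up to its last non-zero entry, is weakly decreasing."""
    alpha = max(k for k, c in enumerate(seq) if c > 0)
    j = alpha
    while j > 0 and seq[j - 1] >= seq[j]:
        j -= 1
    return j

alpha = 5
n = 2 * alpha
edges = [(2 * i, 2 * i + 1) for i in range(alpha)]  # alpha vertex-disjoint edges
seq = indep_sequence(n, edges)                      # here i_k = C(alpha, k) * 2^k
ell = ceil(alpha * (n - 1) / (alpha + n))           # threshold from Theorem 1.3
assert all(seq[k] >= seq[k + 1] for k in range(ell, alpha))
assert decreasing_from(seq) == ceil((2 * alpha - 1) / 3)
```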
One obvious place to exploit this is in the study of the independent set sequence of the random uniform tree. Our model here is to select T uniformly from among the n^{n−2} labelled trees on vertex set {1, ..., n}, and to consider the sequence (X_0, X_1, ..., X_n) where X_k is the number of independent sets of size k in T. To gain some evidence in favor of an affirmative answer to Question 1.1, it is natural to ask whether (X_0, X_1, ..., X_n) is a.a.s. (asymptotically almost surely, that is, with probability tending to 1 as n tends to infinity) unimodal.
This seemingly simple question turns out to be quite intricate. It is easy, via the Matrix Tree Theorem, to establish
E(X_k) = \binom{n}{k}\left(1 − \frac{k}{n}\right)^{k−1}
(this was probably first observed by Bedrosian [4]), so that the sequence (E(X_0), E(X_1), ..., E(X_n)) is unimodal. One might then try to establish that with high probability the X_k's fall in disjoint intervals centered around the E(X_k)'s, leading to a.a.s. unimodality. Unfortunately the variance of X_k (which can also be explicitly calculated via the Matrix Tree Theorem) turns out to be very large, typically much larger than E(X_k)^2, precluding a straightforward application of the second moment method.
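The expectation formula can be confirmed for small n by enumerating all labelled trees via Prüfer sequences; this sketch (helper names are ours) does so exhaustively for n = 5:

```python
from itertools import combinations, product
from math import comb

def prufer_to_edges(seq, n):
    """Decode a Prüfer sequence (length n-2) into the edge set of a labelled tree on {0, ..., n-1}."""
    degree = [1] * n
    for x in seq:
        degree[x] += 1
    edges = []
    for x in seq:
        leaf = min(v for v in range(n) if degree[v] == 1)
        edges.append(frozenset((leaf, x)))
        degree[leaf] -= 1
        degree[x] -= 1
    u, v = [w for w in range(n) if degree[w] == 1]
    edges.append(frozenset((u, v)))
    return edges

n = 5
counts = [0] * (n + 1)  # counts[k] = total number of size-k independent sets over all n^(n-2) trees
for pseq in product(range(n), repeat=n - 2):
    edges = set(prufer_to_edges(pseq, n))
    for k in range(n + 1):
        for S in combinations(range(n), k):
            if all(frozenset(p) not in edges for p in combinations(S, 2)):
                counts[k] += 1

for k in range(n + 1):
    expectation = counts[k] / n ** (n - 2)
    assert abs(expectation - comb(n, k) * (1 - k / n) ** (k - 1)) < 1e-9
```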
Nonetheless, Theorem 1.3 allows us to say something about the decreasing tail of the independent set sequence of almost all trees, beyond what is given by Theorem 1.2. Pittel [26], tightening an earlier result of Meir and Moon [24], established that for any f(n) = ω(1), a.a.s.
|α(T) − ρn| ≤ f(n)√n,  (2)
where α(T) is the size of the largest independent set in T, and ρ ≈ 0.5671 is the unique real satisfying ρe^ρ = 1.

Theorem 1.4. Let T be a uniformly random labelled tree on n vertices, and let X_k be the number of independent sets of size k in T. A.a.s. the sequence (X_ℓ, X_{ℓ+1}, ..., X_n) is weakly decreasing, where ℓ = 0.347n.
So a.a.s. the (non-zero part of the) independent set sequence of the uniform labelled tree is weakly decreasing for its terminal approximately 38.8%. See Section 2.3 for the proof of Theorem 1.4. With some further computation it is likely that we could improve Theorem 1.4, but an improvement to ℓ = 0.346n is beyond the reach of our current methods. (Problem 2.6 suggests a possible direction of improvement.)

Our second observation around Question 1.1 concerns the start of the independent set sequence. Again, we begin with a general statement:

Theorem 1.5. Let G be a graph in which every maximal (by inclusion) independent set has size at least λ. Then the initial portion (i_0, i_1, ..., i_{⌈λ/2⌉}) of the independent set sequence of G is weakly increasing.

This is a straightforward generalization (see Section 2.2 for the short proof) of a result of Michael and Traves [25], who showed that if every independent set in G is contained in an independent set of size α (that is, if G is well-covered) then i_0 ≤ i_1 ≤ ⋯ ≤ i_{⌈α/2⌉}.
To connect this to the independent set sequence of a tree, we show:

Theorem 1.6. Let T be a tree with n vertices and maximum independent set size α. Every maximal independent set in T has size at least (n−α+1)/2, and so the initial portion (i_0, i_1, ..., i_ℓ) of the independent set sequence of T is weakly increasing, where ℓ = ⌈(n−α+1)/4⌉.

(See Section 2.2 for the proof.) For example, if we know that α = ⌈n/2⌉ (its smallest possible value) then we get that the independent set sequence is increasing up to about n/8 or 0.25α. On the other hand, if we know that α = n − 1 (its largest possible value) then we get no information from Theorem 1.6.
Recalling (2), from Theorem 1.6 we can immediately say that a.a.s. the (non-zero part of the) independent set sequence of the uniform labelled tree is increasing for its initial about 19%, or up to about 0.1n. By modifying the idea that goes into the proof of Theorem 1.5, we can improve this substantially.

Theorem 1.7. Let T be a uniformly random labelled tree on n vertices, and let X_k be the number of independent sets of size k in T. A.a.s. the sequence (X_0, X_1, ..., X_ℓ) is weakly increasing, where ℓ = 0.280n.

So a.a.s. the (non-zero part of the) independent set sequence of the uniform labelled tree is weakly increasing for its initial 49.5%. See Section 2.3 for the proof of Theorem 1.7. With some further computation it is likely that we could improve Theorem 1.7, but an improvement to ℓ = 0.281n is beyond the reach of our current methods. Using quite different methods, Heilman [17] has recently shown that the independent set sequence of T is a.a.s. weakly increasing up to 0.265n, or for the initial about 46% of its non-zero part.
Our proofs of Theorems 1.4 and 1.7 also give some information about the value of another well-studied graph parameter for the random labelled tree, namely the independent domination number. This is defined to be the cardinality of the smallest independent set that is also a dominating set (every vertex outside the set is adjacent to something in the set). Equivalently, it is the cardinality of the smallest independent set that is maximal (by inclusion). For any tree T with n vertices and ℓ = ℓ(T) leaves it is known [11,19] that i(T), the independent domination number of T, satisfies
(n + 2 − ℓ(T))/3 ≤ i(T) ≤ (n + ℓ(T))/3.  (3)
Since ℓ(T) is concentrated around n/e, (3) says that with high probability i(T) is between about 0.210n and 0.456n. We can improve the lower bound. In Section 2.3.2 we present and analyse the asymptotics of the quantity f(n,k,t), the expected number of independent sets of size k in T that have exactly t extensions to an independent set of size k+1. At t = 0 this is exactly the expected number of maximal (by inclusion) independent sets of size k. The analysis of Section 2.3.2 (details omitted) shows that f(n, κn, 0) = o(1) for all κ ≤ 0.307, and so (by Markov's inequality) we deduce that a.a.s. i(T) is at least 0.307n.

We end the introduction with a few further remarks about generalizations of Question 1.1. Recall that there is a sequence of ever-stronger (the first is implied by the second, et cetera, but there are no reverse implications) conditions on a sequence (a_0, ..., a_m) of positive terms:

• Unimodality: a_0 ≤ a_1 ≤ ⋯ ≤ a_k ≥ a_{k+1} ≥ ⋯ ≥ a_m for some k.

• Log-concavity: a_k^2 ≥ a_{k−1} a_{k+1} for k = 1, ..., m−1.
• Ordered log-concavity: a_k^2 ≥ \frac{k+1}{k} a_{k−1} a_{k+1} for k = 1, ..., m−1. (We say "ordered" because ordered log-concavity corresponds to the sequence (k! a_k)_{k=0}^{m} being log-concave, and when a_k counts objects each consisting of k unordered elements, k! a_k counts the same objects when also an order is put on the elements.)

• Ultra-log-concavity: the sequence (a_k/\binom{m}{k})_{k=0}^{m} is log-concave, that is, a_k^2/\binom{m}{k}^2 ≥ (a_{k−1}/\binom{m}{k−1})(a_{k+1}/\binom{m}{k+1}) for k = 1, ..., m−1.
• Real roots: \sum_{k=0}^{m} a_k x^k has all real roots.

Chudnovsky and Seymour [9] showed that the independent set sequence of a claw-free graph satisfies not just unimodality but the real roots property; on the other hand, the independent set sequence of a tree does not in general satisfy ultra-log-concavity, as witnessed by the star on four vertices. It is plausible, however, that there is an affirmative answer to the following question:

Question 1.9. Is the independent set sequence of every tree ordered log-concave?
Radcliffe [27] has verified that every tree on up to 25 vertices has an ordered log-concave independent set sequence (see also [31], where Yosef, Mizrachi and Kadrawi verify log-concavity for trees on up to 20 vertices).
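Such verifications are easy to reproduce on very small trees. The following sketch (names are ours) re-checks ordered log-concavity, in the equivalent integer form (k+1) i_{k+1}^2 ≥ (k+2) i_k i_{k+2}, for every labelled tree on up to 6 vertices:

```python
from itertools import combinations, product

def prufer_to_edges(seq, n):
    """Decode a Prüfer sequence into the edge set of a labelled tree on {0, ..., n-1}."""
    degree = [1] * n
    for x in seq:
        degree[x] += 1
    edges = []
    for x in seq:
        leaf = min(v for v in range(n) if degree[v] == 1)
        edges.append(frozenset((leaf, x)))
        degree[leaf] -= 1
        degree[x] -= 1
    u, v = [w for w in range(n) if degree[w] == 1]
    edges.append(frozenset((u, v)))
    return edges

def indep_seq(n, edges):
    """Independent set sequence (i_0, ..., i_n) by brute force."""
    seq = [0] * (n + 1)
    for k in range(n + 1):
        for S in combinations(range(n), k):
            if all(frozenset(p) not in edges for p in combinations(S, 2)):
                seq[k] += 1
    return seq

checked = 0
for n in range(2, 7):
    for pseq in product(range(n), repeat=n - 2):
        i = indep_seq(n, set(prufer_to_edges(pseq, n)))
        alpha = max(k for k in range(n + 1) if i[k] > 0)
        # ordered log-concavity in integer form: (k+1) i_{k+1}^2 >= (k+2) i_k i_{k+2}
        assert all((k + 1) * i[k + 1] ** 2 >= (k + 2) * i[k] * i[k + 2]
                   for k in range(alpha - 1))
        checked += 1
assert checked == 1 + 3 + 16 + 125 + 1296  # number of labelled trees on 2..6 vertices
```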
One reason to think about ordered log-concavity is that it has a very nice reformulation. For a graph G with maximum independent set size α, let I and I_k be the set of all independent sets of G, and the set of independent sets of size k, respectively. For I ∈ I, denote by e(I) the number of extensions of I to an independent set of size |I| + 1 (or: e(I) is the number of vertices in G that are neither in I nor adjacent to anything in I). Denote by e_k the average number of extensions of an independent set of size k, that is,
e_k = \frac{1}{i_k} \sum_{I ∈ I_k} e(I).

Claim 1.10. The sequence (i_k)_{k=0}^{α} is ordered log-concave if and only if the sequence (e_k)_{k=0}^{α−1} is weakly decreasing.
(See Section 2.2 for the quick proof.) So Question 1.9 is equivalent to:

Question 1.11. For every tree, is the sequence (e_k)_{k=0}^{α−1} weakly decreasing?
Before turning to proofs of Theorems 1.3, 1.4, 1.5, 1.6 and 1.7 and Claim 1.10 (in Section 2), we make a remark concerning the difference between Question 1.1 for trees versus forests. If G has components G_1, ..., G_k, and component G_ℓ has independent set sequence i^ℓ = (i^ℓ_0, i^ℓ_1, ...), then the independent set sequence of G is the convolution of the sequences i^ℓ; that is, it is the coefficient sequence of the polynomial \prod_{ℓ=1}^{k} \sum_{j≥0} i^ℓ_j x^j. It is not in general the case that the convolution of unimodal sequences is unimodal, which means that Question 1.1 for trees is distinct from Question 1.1 for forests. On the other hand, it is the case that the convolution of log-concave sequences is log-concave [10], which means that to establish the log-concavity of the independent set sequence of an arbitrary forest, it is sufficient to do so for an arbitrary tree. We do not at the moment know whether the convolution of ordered log-concave sequences is ordered log-concave.

2.1 Proof of Theorem 1.3
The proof follows from two old results. First, a theorem of Fisher and Ryan [12]:

Theorem 2.1. For any graph G with maximum independent set size α, we have
\left(\frac{i_{k+1}}{\binom{α}{k+1}}\right)^{1/(k+1)} ≤ \left(\frac{i_k}{\binom{α}{k}}\right)^{1/k} for 1 ≤ k ≤ α − 1.

Second, a theorem of Zykov [36]:

Theorem 2.2. For any graph G with n vertices and with maximum independent set size α, and any 0 ≤ k ≤ α, we have
i_k ≤ \binom{α}{k}\left(\frac{n}{α}\right)^{k}.

(This is a corollary of a more general result that among all graphs on n vertices with maximum independent set size α, the one which maximizes the number of independent sets of size k for each 0 ≤ k ≤ α is the balanced union of α cliques.) Combining the two results,
i_{k+1} ≤ \binom{α}{k+1}\left(\frac{i_k}{\binom{α}{k}}\right)^{(k+1)/k} = \frac{α−k}{k+1}\, i_k \left(\frac{i_k}{\binom{α}{k}}\right)^{1/k} ≤ \frac{n(α−k)}{α(k+1)}\, i_k.
In summary: i_{k+1} > i_k forces n(α−k) > α(k+1), that is, k < (αn − α)/(α + n), which implies Theorem 1.3.
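Assuming the statements of Theorems 2.1 and 2.2 as given above, the combined bound i_{k+1}/i_k ≤ n(α−k)/(α(k+1)) holds for every graph. The following sketch (names are ours) verifies the Zykov bound and this consequence, in integer form, for all labelled trees on up to 6 vertices:

```python
from itertools import combinations, product
from math import comb

def prufer_to_edges(seq, n):
    """Decode a Prüfer sequence into the edge set of a labelled tree on {0, ..., n-1}."""
    degree = [1] * n
    for x in seq:
        degree[x] += 1
    edges = []
    for x in seq:
        leaf = min(v for v in range(n) if degree[v] == 1)
        edges.append(frozenset((leaf, x)))
        degree[leaf] -= 1
        degree[x] -= 1
    u, v = [w for w in range(n) if degree[w] == 1]
    edges.append(frozenset((u, v)))
    return edges

def indep_seq(n, edges):
    """Independent set sequence (i_0, ..., i_n) by brute force."""
    seq = [0] * (n + 1)
    for k in range(n + 1):
        for S in combinations(range(n), k):
            if all(frozenset(p) not in edges for p in combinations(S, 2)):
                seq[k] += 1
    return seq

for n in range(2, 7):
    for pseq in product(range(n), repeat=n - 2):
        i = indep_seq(n, set(prufer_to_edges(pseq, n)))
        alpha = max(k for k in range(n + 1) if i[k] > 0)
        # Zykov bound, cleared of denominators: i_k * alpha^k <= C(alpha, k) * n^k
        assert all(i[k] * alpha ** k <= comb(alpha, k) * n ** k for k in range(alpha + 1))
        # combined consequence: i_{k+1} * alpha * (k+1) <= i_k * n * (alpha - k)
        assert all(i[k + 1] * alpha * (k + 1) <= i[k] * n * (alpha - k) for k in range(alpha))
```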
2.2 Proofs of Theorems 1.5 and 1.6, and of Claim 1.10

The proofs of Theorems 1.4, 1.5 and 1.7, and of Claim 1.10, all have an element in common, which we introduce now. Given a graph G with maximum independent set size α, for 0 ≤ j ≤ α − 1 denote by B_j the bipartite graph with classes I_j (the set of independent sets of size j in G) and I_{j+1}, with an edge joining I ∈ I_j, J ∈ I_{j+1} if and only if I ⊆ J. B_j has (j+1) i_{j+1} edges, since each independent set of size j+1 is an extension of exactly j+1 independent sets of size j. It also has \sum_{I ∈ I_j} e(I) edges, where as before e(I) is the number of extensions of I to an independent set of size |I| + 1. So we have the identity
\sum_{I ∈ I_j} e(I) = (j+1)\, i_{j+1}  (4)
for j = 0, ..., α − 1.
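Identity (4) is a double count of the edges of B_j, and is easy to confirm mechanically. A minimal sketch (the test graph, a path with a pendant vertex, and the helper names are ours, chosen arbitrarily):

```python
from itertools import combinations

n = 7
edges = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (2, 6)]}

def independent(S):
    return all(frozenset(p) not in edges for p in combinations(S, 2))

def extensions(S):
    """e(I): number of vertices extending the independent set S to one of size |S| + 1."""
    return sum(1 for v in range(n) if v not in S and independent(tuple(S) + (v,)))

for j in range(n):
    I_j = [S for S in combinations(range(n), j) if independent(S)]
    I_j1 = [S for S in combinations(range(n), j + 1) if independent(S)]
    # identity (4): sum of e(I) over I_j equals (j+1) * i_{j+1}
    assert sum(extensions(S) for S in I_j) == (j + 1) * len(I_j1)
```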
Proof (of Theorem 1.5): For k ≤ ⌈λ/2⌉, each I ∈ I_{k−1} has e(I) ≥ λ − (k−1), since each such I is contained in a maximal independent set, which by hypothesis has size at least λ. By (4), k i_k = \sum_{I ∈ I_{k−1}} e(I) ≥ (λ − k + 1) i_{k−1} ≥ k i_{k−1}, the last inequality because k ≤ ⌈λ/2⌉ implies λ − k + 1 ≥ k. So i_{k−1} ≤ i_k for all k ≤ ⌈λ/2⌉.
Proof (of Theorem 1.6): Let K be a maximal independent set in T. Each of the n − |K| vertices of T − K must have at least one edge to K (else K would not be maximal), and T has only n − 1 edges, so the subgraph induced by T − K is a forest with n − |K| vertices and at most |K| − 1 edges, and so at least n − 2|K| + 1 components. Choosing one vertex from each component, it follows that T − K, and hence T, has an independent set of size at least n − 2|K| + 1. The result follows from n − 2|K| + 1 ≤ α.
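The maximal-independent-set bound of Theorem 1.6 is easy to check exhaustively on small trees; this sketch (names are ours) verifies it for every labelled tree on up to 6 vertices:

```python
from itertools import combinations, product

def prufer_to_edges(seq, n):
    """Decode a Prüfer sequence into the edge set of a labelled tree on {0, ..., n-1}."""
    degree = [1] * n
    for x in seq:
        degree[x] += 1
    edges = []
    for x in seq:
        leaf = min(v for v in range(n) if degree[v] == 1)
        edges.append(frozenset((leaf, x)))
        degree[leaf] -= 1
        degree[x] -= 1
    u, v = [w for w in range(n) if degree[w] == 1]
    edges.append(frozenset((u, v)))
    return edges

checked = 0
for n in range(2, 7):
    for pseq in product(range(n), repeat=n - 2):
        edges = set(prufer_to_edges(pseq, n))
        def independent(S):
            return all(frozenset(p) not in edges for p in combinations(S, 2))
        ind = [S for k in range(n + 1) for S in combinations(range(n), k) if independent(S)]
        alpha = max(len(S) for S in ind)
        maximal = [S for S in ind
                   if not any(independent(S + (v,)) for v in range(n) if v not in S)]
        # every maximal independent set has size at least (n - alpha + 1)/2
        assert all(2 * len(S) >= n - alpha + 1 for S in maximal)
        checked += 1
assert checked == 1 + 3 + 16 + 125 + 1296
```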
Proof (of Claim 1.10): From (4) we have e_k = (k+1) i_{k+1}/i_k, so that monotonicity of (e_k)_{k=0}^{α−1} is equivalent to
\frac{(k+1)\, i_{k+1}}{i_k} ≥ \frac{(k+2)\, i_{k+2}}{i_{k+1}} for k = 0, ..., α − 2,
that is, to (k+1) i_{k+1}^2 ≥ (k+2) i_k i_{k+2}, which is in turn equivalent to ordered log-concavity of (i_k)_{k=0}^{α}.
2.3 Proofs of Theorems 1.4 and 1.7

The proof of Theorem 1.5 hinged on the identity (4), which allows us to deduce that if every independent set of size k has more than k extensions to an independent set of size k+1, then i_k ≤ i_{k+1}. For Theorem 1.7 we modify this to: if all but a vanishing proportion of independent sets of size k have more than k extensions to an independent set of size k+1, then a.a.s. i_k ≤ i_{k+1}. Theorem 1.4 depends on a similar statement: if all but a vanishing proportion of independent sets of size k have fewer than k extensions to an independent set of size k+1, then a.a.s. i_k ≥ i_{k+1}. In Section 2.3.1 below we make these ideas precise. In Section 2.3.2 we do the necessary analysis on the inequalities presented in Section 2.3.1. In Section 2.3.3 we use the Matrix Tree Theorem to establish (5), the key identity used throughout.

2.3.1 Key claim
Let e(n,k,t) denote the probability that, in a uniformly chosen labelled tree on [n] := {1, ..., n}, a particular set of size k is an independent set and has exactly t extensions to an independent set of size k+1. In Section 2.3.3 we use the Matrix Tree Theorem (and inclusion-exclusion) to establish
e(n,k,t) = \binom{n−k}{t}\left(1 − \frac{k}{n}\right)^{t−1} \sum_{j=0}^{n−k−t} (−1)^{j} \binom{n−k−t}{j} \left(1 − \frac{k}{n}\right)^{j} \left(1 − \frac{k+t+j}{n}\right)^{k}.  (5)
Let g_1(n,k) denote the expected number of independent sets of size k that have no more than k+1 extensions to an independent set of size k+1, and let g_2(n,k) denote the expected number of independent sets of size k that have k+1 or more extensions to an independent set of size k+1. By linearity of expectation we have
g_1(n,k) = \binom{n}{k} \sum_{t=0}^{k+1} e(n,k,t) and g_2(n,k) = \binom{n}{k} \sum_{t=k+1}^{n−k} e(n,k,t).
Claim 2.3. Suppose that n and k with k + 2 ≤ n satisfy
g_1(n,k) ≤ \binom{n−k+1}{k} \frac{1}{n^2 \log n}.  (6)
Then all but a proportion 1/(n log n) of trees on [n] satisfy i_k ≤ i_{k+1}. And if
g_2(n,k) ≤ \binom{n−k}{k+1} \frac{1}{n^2 \log n},  (7)
then all but a proportion 1/(n log n) of trees on [n] satisfy i_{k+1} ≤ i_k.
The proof will require the following result, which was possibly first observed by Wingard [30, Theorem 5.1]:

Theorem 2.4. For any tree T on n vertices, and for any 0 ≤ k ≤ n,
i_k(T) ≥ i_k(P_n) = \binom{n−k+1}{k},
where P_n is the path on n vertices and i_k(·) denotes the number of independent sets of size k.
In other words, within the family of trees, the path minimizes the number of independent sets of any size. (See Problem 2.6 for further discussion).
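Both the closed form i_k(P_n) = \binom{n−k+1}{k} and the minimality of the path can be confirmed exhaustively for small n (helper names are ours):

```python
from itertools import combinations, product
from math import comb

def prufer_to_edges(seq, n):
    """Decode a Prüfer sequence into the edge set of a labelled tree on {0, ..., n-1}."""
    degree = [1] * n
    for x in seq:
        degree[x] += 1
    edges = []
    for x in seq:
        leaf = min(v for v in range(n) if degree[v] == 1)
        edges.append(frozenset((leaf, x)))
        degree[leaf] -= 1
        degree[x] -= 1
    u, v = [w for w in range(n) if degree[w] == 1]
    edges.append(frozenset((u, v)))
    return edges

def indep_counts(n, edges):
    """Independent set sequence (i_0, ..., i_n) by brute force."""
    seq = [0] * (n + 1)
    for k in range(n + 1):
        for S in combinations(range(n), k):
            if all(frozenset(p) not in edges for p in combinations(S, 2)):
                seq[k] += 1
    return seq

for n in range(2, 7):
    path = indep_counts(n, {frozenset((j, j + 1)) for j in range(n - 1)})
    # closed form for the path: i_k(P_n) = C(n - k + 1, k)
    assert all(path[k] == comb(n - k + 1, k) for k in range(n + 1))
    # the path minimizes i_k over all labelled trees on n vertices
    for pseq in product(range(n), repeat=n - 2):
        t = indep_counts(n, set(prufer_to_edges(pseq, n)))
        assert all(t[k] >= path[k] for k in range(n + 1))
```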
Proof (of Claim 2.3): By Markov's inequality, under (6) all but a proportion at most 1/(n log n) of trees on [n] have no more than \binom{n−k+1}{k}/n independent sets of size k with no more than k+1 extensions to an independent set of size k+1. In what follows we work within this set T_1 of trees.
As in the proofs of Theorem 1.5 and Claim 1.10, for T ∈ T_1 consider the bipartite graph B_k with classes I_k (the set of independent sets of size k in T) and I_{k+1}, with an edge joining I ∈ I_k, J ∈ I_{k+1} if and only if I ⊆ J. Recalling (4) we have
(k+1)\, i_{k+1} = \sum_{I ∈ I_k} e(I),  (8)
where e(I) denotes the number of extensions of I to an independent set of size k+1. Now lower bounding e(I) by 0 if I has no more than k+1 extensions to an independent set of size k+1, and by k+2 otherwise, we get
\sum_{I ∈ I_k} e(I) ≥ (k+2)\left(i_k − \binom{n−k+1}{k}/n\right).
Inserting into (8) and using k + 2 ≤ n yields
(k+1)\, i_{k+1} ≥ (k+2)\, i_k − \binom{n−k+1}{k} ≥ (k+1)\, i_k, provided i_k ≥ \binom{n−k+1}{k}.
That i_k ≥ \binom{n−k+1}{k} (completing the proof of the first part of the claim) follows from Theorem 2.4.
The proof of the second part of the claim is similar. By Markov's inequality, under (7) all but a proportion at most 1/(n log n) of trees on [n] have no more than \binom{n−k}{k+1}/n independent sets of size k with k+1 or more extensions to an independent set of size k+1. In what follows we work within this set T_2 of trees.
For T ∈ T_2 we again consider the bipartite graph B_k. Upper bounding e(I) by n if I has k+1 or more extensions to an independent set of size k+1, and by k otherwise, we get
(k+1)\, i_{k+1} = \sum_{I ∈ I_k} e(I) ≤ k\, i_k + n \cdot \binom{n−k}{k+1}/n ≤ k\, i_k + i_{k+1},
the last inequality following from Theorem 2.4 applied to independent sets of size k+1; rearranging gives i_{k+1} ≤ i_k.

2.3.2 Analysis
To complete the proofs of Theorems 1.4 and 1.7, it remains to verify (5) (which we do in Section 2.3.3), and to show that for all sufficiently large n, (7) holds for all k ≥ 0.347n (so that, by a union bound, all but a proportion at most 1/log n of trees on [n] satisfy i_k ≥ i_{k+1} for all k ≥ 0.347n), while (6) holds for all k ≤ 0.280n. This section furnishes those verifications. It will be convenient to introduce f(n,k,t), the expected number of independent sets of size k that have exactly t extensions to an independent set of size k+1; we have
f(n,k,t) = \binom{n}{k}\, e(n,k,t).  (9)
By (5), the sum appearing in (9) is of the form \sum_{\ell=0}^{N} \binom{N}{\ell}(x−1)^{\ell}\left(1 − \frac{\ell}{N}\right)^{k}, with x < 1, and is not easy to directly asymptotically analyze, since it is an alternating sum. However, with a little manipulation we can turn the sum into one involving only positive terms. Indeed, we have
\sum_{\ell=0}^{N} \binom{N}{\ell}(x−1)^{\ell}\left(1 − \frac{\ell}{N}\right)^{k} = \frac{(x−1)^{N}}{N^{k}} \sum_{m=0}^{N} \binom{N}{m}\left(\frac{1}{x−1}\right)^{m} m^{k} = \frac{1}{N^{k}} \sum_{j=0}^{k} \left\{{k \atop j}\right\} (N)_j\, x^{N−j},  (10)
where in the second line we use symmetry of the binomial coefficients (substituting m = N − ℓ) and in the last line we use the identity (see, e.g., [6, Proposition 2.5])
\sum_{m=0}^{N} \binom{N}{m} z^{m} m^{k} = \sum_{j=0}^{k} \left\{{k \atop j}\right\} (N)_j\, z^{j} (1+z)^{N−j}  (11)
with z = 1/(x−1). Here \left\{{a \atop b}\right\} is a Stirling number of the second kind and (a)_b is a falling power.
While not necessary for our argument, let us observe that (11) admits a combinatorial proof (for z a positive integer; both sides are polynomials in z, so this suffices). The left-hand side evidently counts triples consisting of a subset S of a set of size N, a word of length k over alphabet S, and a coloring of the elements of S from a palette of z colors. This collection of triples can also be enumerated by first choosing the number j ∈ {0, 1, ..., k} of distinct letters that appear in the word; then choosing the blocks of positions in the word in which the same letter appears (\left\{{k \atop j}\right\} options); then choosing the letters that appear in each of these blocks (let T be this set of letters; note |T| = j), together with the colors that each of those letters receives ((N)_j z^j options); and finally choosing the remainder of S and its colors ((1+z)^{N−j} options, since each of the remaining N − j elements is either left out of S or put in S with one of z colors). This leads to a count of \sum_{j=0}^{k} \left\{{k \atop j}\right\} (N)_j\, z^{j}(1+z)^{N−j} for the number of triples.

With N = n − k − t and x = k/n, (10) yields
f(n,k,t) = \binom{n}{k}\binom{n−k}{t}\left(1 − \frac{k}{n}\right)^{t−1} \frac{1}{n^{k}} \sum_{j=0}^{k} \left\{{k \atop j}\right\} (n−k−t)_j \left(\frac{k}{n}\right)^{n−k−t−j}.  (12)
Recalling Claim 2.3, our goal is to find the largest k_1 = k_1(n) and smallest k_2 = k_2(n) such that for all sufficiently large n we have
\binom{n−k+1}{k}^{-1} \sum_{t=0}^{k+1} f(n,k,t) ≤ \frac{1}{n^{2} \log n} for all k ≤ k_1, and \binom{n−k}{k+1}^{-1} \sum_{t=k+1}^{n−k} f(n,k,t) ≤ \frac{1}{n^{2} \log n} for all k ≥ k_2,  (13)
from which it follows that the independent set sequence of the uniform random labelled tree on n vertices is almost surely weakly increasing up to k_1 and weakly decreasing from k_2 on.

We first give a heuristic analysis of the right-hand side of (12). Setting κ = k/n and τ = t/n, and ignoring polynomial factors of n in the approximations below, we have
\binom{n}{k} ≈ \exp_2\{nH(κ)\} and \binom{n−k}{t} ≈ \exp_2\{(1−κ)nH(τ/(1−κ))\},
where H is the binary entropy function. To estimate the sum in (12) we start with the standard identity
(e^{z} − 1)^{j} = j! \sum_{k ≥ j} \left\{{k \atop j}\right\} \frac{z^{k}}{k!},
from which we deduce
\sum_{j=0}^{k} \left\{{k \atop j}\right\} (N)_j\, y^{j} = k!\, [z^{k}]\left(1 + y(e^{z} − 1)\right)^{N},
where [z^k] is the operation that extracts the coefficient of z^k from a power series in variable z.
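Identity (11), and the positive-term form (10) derived from it, can be checked exactly in rational arithmetic; in the sketch below (names are ours) the Stirling-number routine is the standard inclusion-exclusion formula:

```python
from fractions import Fraction
from math import comb, factorial

def stirling2(k, j):
    """Stirling number of the second kind {k brace j}, via inclusion-exclusion."""
    return sum((-1) ** (j - i) * comb(j, i) * i ** k for i in range(j + 1)) // factorial(j)

def falling(a, b):
    """Falling power (a)_b = a (a-1) ... (a-b+1)."""
    out = 1
    for i in range(b):
        out *= a - i
    return out

N, k = 9, 5
# identity (11), checked exactly at a few rational values of z
for z in (Fraction(-5, 2), Fraction(1, 3), Fraction(2)):
    lhs = sum(comb(N, m) * z ** m * m ** k for m in range(N + 1))
    rhs = sum(stirling2(k, j) * falling(N, j) * z ** j * (1 + z) ** (N - j) for j in range(k + 1))
    assert lhs == rhs

# the positive-term form (10), with 0 < x < 1 rational
x = Fraction(2, 7)
lhs10 = sum(comb(N, l) * (x - 1) ** l * Fraction(N - l, N) ** k for l in range(N + 1))
rhs10 = sum(stirling2(k, j) * falling(N, j) * x ** (N - j) for j in range(k + 1)) / Fraction(N ** k)
assert lhs10 == rhs10
```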
We now appeal to a result of Good [15, Theorem 6.1] (see also [14, Theorem 2]) concerning the asymptotics of a coefficient of a high power of z in the power series expansion of a high power of a power series in z.

Theorem 2.5. Suppose that f(z) = \sum_{k=0}^{∞} f_k z^k is a power series with positive coefficients and with infinite radius of convergence. Suppose that N = N(r) (r a natural number) is such that N/r is bounded away from 0 and from infinity as r → ∞. Then the implicit equation
\frac{ρ f'(ρ)}{f(ρ)} = \frac{r}{N}
defines a unique positive real ρ = ρ(r), and
[z^{r}]\, f(z)^{N} = \left(1 + O\!\left(\frac{g(ρ)}{N}\right)\right) \frac{ρ\, f(ρ)^{N}}{ρ^{r}\, σ \sqrt{2πN}},
where σ = σ(ρ) > 0 and g is a continuous function.
See [15, p. 868] for an explicit description of g. As observed in [14], ρf'(ρ)/f(ρ) and σ have probabilistic interpretations that will be useful for us later: ρf'(ρ)/f(ρ) is the expectation of the probability distribution X, supported on the natural numbers, given by P(X = k) ∝ f_k ρ^k, while σ²/ρ² is the variance of X.
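As a sanity check on this normalization (in which σ/ρ is the standard deviation of X, so the main term is f(ρ)^N/(ρ^r · sd(X) · √(2πN))), take f(z) = e^z, for which [z^r] f(z)^N = N^r/r! exactly and X is Poisson(ρ); the sketch below (names are ours) compares the two:

```python
from math import exp, factorial, pi, sqrt

# f(z) = exp(z): [z^r] f(z)^N = N^r / r! exactly.
# Saddle point: rho f'(rho)/f(rho) = rho = r/N; here X ~ Poisson(rho),
# with standard deviation sqrt(rho), so the main term is
# f(rho)^N / (rho^r * sqrt(2 * pi * N * rho)).
def saddle_estimate(N, r):
    rho = r / N
    return exp(rho) ** N / (rho ** r * sqrt(2 * pi * N * rho))

N, r = 40, 60
exact = 40 ** 60 / factorial(60)
approx = saddle_estimate(N, r)
# relative error is O(1/r), here roughly 1/(12 * 60)
assert abs(approx / exact - 1) < 0.01
```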
Before making this heuristic analysis rigorous, we note one obvious place where both Theorems 1.4 and 1.7 might be improved. Looking at the proof of Claim 2.3, we see that the \binom{n−k+1}{k} and \binom{n−k}{k+1} on the right-hand sides of (6) and (7), and so the \binom{n−k+1}{k}^{-1} and \binom{n−k}{k+1}^{-1} on the left-hand side of (13), come directly from Theorem 2.4 (the number of independent sets of size k, or k+1, in any tree on n vertices is at least the number of size k, or k+1, in the path on n vertices). If we could replace \binom{n−k+1}{k} and \binom{n−k}{k+1} with something larger, then we would have that (13) holds for larger k_1 = k_1(n) and smaller k_2 = k_2(n).
If we were working with all trees, such a replacement would not be possible, since the bound in Theorem 2.4 is tight for some trees (e.g., for paths). But we are working only with almost all trees, and so for our purposes it would be enough to have an a.a.s. lower bound on the number of independent sets of size k in the uniform random labelled tree. We can find such a bound, but it is not much larger than the deterministic bound: for k = κn, when \binom{n−k+1}{k} and \binom{n−k}{k+1} grow exponentially with n (both with base \exp_2\{(1−κ)H(κ/(1−κ))\}), it is only larger by a polynomial in n, and this does not lead to any improvement in our results. What is needed for an improvement is an answer to the following problem:

Problem 2.6. Find an a.a.s. lower bound on the number of independent sets of size k in the uniform random labelled tree on n vertices that is substantially better than \binom{n−k+1}{k}. "Substantially better" here means that when k = κn (and κ < 1/2, to avoid trivialities) the bound should grow exponentially in n, with a base that is larger than \exp_2\{(1−κ)H(κ/(1−κ))\}.
The remainder of this section is devoted to making our heuristic analysis rigorous. We start with Theorem 1.7, by analyzing (12) for k ≤ 0.280n and t ≤ k+1 (the range of values relevant for that theorem). Note that in proving Theorem 1.7 we may assume k ≥ 0.1n, since we already know from Theorem 1.6 that a.a.s. the independent set sequence of the random tree is increasing up to 0.108n. Also, we initially assume only that k ≤ 0.49n. Recall that our specific goal is to establish
\sum_{t=0}^{k+1} f(n,k,t) ≤ \binom{n−k+1}{k} \frac{1}{n^{2} \log n}  (19)
for all large enough n and k ≤ 0.280n. We break [0, n] into finitely many equal intervals, and for each k and t we upper bound the various terms that comprise f(n,k,t) (and lower bound \binom{n−k+1}{k}) in terms of the upper and lower endpoints of the intervals in which k and t fall. To explain the denominator: observe that in putting an upper bound on 1/ρ^k we have to pay attention to whether ρ is smaller than 1 (in which case we should use an upper bound for k) or ρ > 1 (in which case we should use a lower bound for k).
We now argue that σ = σ(n,k,t) is bounded below by a positive constant depending only on p, q and M. This follows from the fact that, as observed after the statement of Theorem 2.5, σ/ρ is the standard deviation of a probability distribution (supported on the natural numbers) that is not almost surely constant, so σ > 0. We may lower bound σ by the minimum value it attains as the left-hand side of (22) varies over the range permitted by (21). (The minimum exists since ρ, and therefore σ, varies continuously as the left-hand side of (22) varies.) Note also that since g is continuous it is bounded on [ρ_min, ρ_max], and that n − k − t grows linearly with n. It follows that for all sufficiently large n (depending on p, q, M) we have
\frac{1 + |g(ρ)|/(n−k−t)}{σ \sqrt{2π(n−k−t)}} ≤ \frac{c}{\sqrt{n}},  (24)
where c = c(p,q,M) is a constant. Since for fixed M there are only finitely many options for p and q, we can find a single constant c = c(M) so that for all large enough n, (24) holds for all k, t under consideration. Combining all of these bounds, it follows that for all large enough n and for all k, t satisfying k ≤ 0.49n and 0 ≤ t ≤ k+1, f(n,k,t) is at most a product of the form just described, where p, q are associated with k and t as described earlier, and A, B, et cetera are the various terms (depending on p, q and M) that we have just defined. It follows that \sum_{t=0}^{k+1} f(n,k,t) is at most n times the maximum of these upper bounds over q ≤ p+1 (taking q up to p+1 in the maximum is necessary since t ranges up to k+1). If there is an M such that (25) holds for all 0.1M ≤ p ≤ 0.280M, then we obtain (19) (equivalently (6)); that is, the independent set sequence of the uniform labelled tree on [n] is a.a.s. weakly increasing up to 0.280n. To make the computation manageable, we proceed in stages. We can begin, for example, by showing that with M = 100, (25) holds for 10 ≤ p ≤ 23, yielding that the independent set sequence of the uniform labelled tree on [n] is a.a.s. increasing up to 0.23n. So from here on we may restrict attention to p ≥ 0.23M. With M = 1000, (25) holds for 230 ≤ p ≤ 274, allowing us in the sequel to restrict to p ≥ 0.274M.
Bootstrapping in this way, we eventually establish (25), and hence (19), for the full range of p, completing the proof of Theorem 1.7. For Theorem 1.4 the same scheme is applied to the second inequality in (13), where now t ranges up to n − k; for t close to n − k the upper bound used above becomes problematic. One way to get around this problem is to use an alternate, slightly weaker, upper bound on f(n,k,t): for t close to n − k we may replace C(IJ)^n/√n with K^n, for a suitably defined constant K. The computational verification of (26) now proceeds in a very similar manner to that of (19), and we omit the details.

2.3.3 Deriving (5)
Here we use the Matrix Tree Theorem to find an explicit expression for e(n, k, t), the probability that, in a uniformly chosen labelled tree on [n], a particular set of size k is independent and has exactly t extensions to an independent set of size k + 1.
Given two disjoint subsets K, L of [n] with |K| = k ≥ 1 and |L| = ℓ, denote by T_{K,L} the set of trees on [n] with K an independent set and with L having no edges to K.

Claim 2.7. |T_{K,L}| = n^{n−2} \left(1 − \frac{k+ℓ}{n}\right)^{k} \left(1 − \frac{k}{n}\right)^{ℓ−1}.
Proof: T_{K,L} is exactly the set of spanning trees of the graph G(K,L) obtained from K_n, the complete graph on vertex set [n], by deleting all the edges inside K, as well as all edges from L to K. The Laplacian of G(K,L), with the rows and columns indexed first by vertices in K, then L, then the rest of the vertices (call this set M), is a block matrix.
• The block with rows indexed by K, columns indexed by K, has 0's off the diagonal, and n − k − ℓ's down the diagonal.
• The block with rows indexed by K, columns indexed by L, is all 0.
• The block with rows indexed by K, columns indexed by M, is all −1.
• The block with rows indexed by L, columns indexed by L, has −1's off the diagonal, and n − k − 1's down the diagonal.
• The block with rows indexed by L, columns indexed by M, is all −1.
• The block with rows indexed by M, columns indexed by M, has −1's off the diagonal, and n − 1's down the diagonal.
(No other blocks need be specified; the matrix is symmetric.) This matrix has

• 0 as an eigenvalue with geometric multiplicity at least 1 (all row sums are 0);

• n − k − ℓ as an eigenvalue with geometric multiplicity at least k (on subtracting n − k − ℓ from each diagonal entry, the first k rows become identical, and the sum of the rows indexed by L is a multiple of this common value);

• n − k as an eigenvalue with geometric multiplicity at least ℓ − 1 (on subtracting n − k from each diagonal entry, the ℓ rows indexed by L become identical); and

• n as an eigenvalue with geometric multiplicity at least n − k − ℓ (on subtracting n from each diagonal entry, the n − k − ℓ rows indexed by M become identical, and the sum of the remaining rows is a multiple of this common value).
Since 1 + k + (ℓ − 1) + (n − k − ℓ) = n it follows that these lower bounds on geometric multiplicities are equalities, and that the algebraic multiplicities of all the eigenvalues coincide with their geometric multiplicities. So from the Matrix Tree Theorem we get
|T_{K,L}| = \frac{1}{n}(n−k−ℓ)^{k}(n−k)^{ℓ−1} n^{n−k−ℓ} = n^{n−2}\left(1 − \frac{k+ℓ}{n}\right)^{k}\left(1 − \frac{k}{n}\right)^{ℓ−1},
as claimed.

Claim 2.8. The number of trees on [n] with K as an independent set (|K| = k), and with T (|T| = t) as the exact set of vertices that extend K to an independent set of size k+1, is
n^{n−2}\left(1 − \frac{k}{n}\right)^{t−1} \sum_{j=0}^{n−k−t} (−1)^{j} \binom{n−k−t}{j} \left(1 − \frac{k}{n}\right)^{j} \left(1 − \frac{k+t+j}{n}\right)^{k}.

Proof: Let U_{K,T} be the set of trees with K as an independent set and with T among the set of vertices that extend K to an independent set of size k+1; we know from Claim 2.7 (applied with L = T) that |U_{K,T}| = n^{n−2}(1 − (k+t)/n)^{k}(1 − k/n)^{t−1}. Let the vertices of [n] \ (K ∪ T) be v_1, ..., v_{n−k−t}. Let A_j be the set of trees in U_{K,T} in which there is no edge from v_j to K. Then the number of trees on [n] with K as an independent set, and with T as the exact set of vertices that extend K to an independent set of size k+1, is |U_{K,T} \ (A_1 ∪ ⋯ ∪ A_{n−k−t})|.
If L is any subset of {1, ..., n−k−t} then ∩_{i∈L} A_i is exactly the set of trees with K independent, and with T ∪ {v_i : i ∈ L} among the set of vertices that extend K to an independent set of size k+1, so by Claim 2.7 we have
\left|\bigcap_{i∈L} A_i\right| = n^{n−2}\left(1 − \frac{k+t+|L|}{n}\right)^{k}\left(1 − \frac{k}{n}\right)^{t+|L|−1}.
So, by inclusion-exclusion, the number of trees on [n] with K as an independent set, and with T as the exact set of vertices that extend K to an independent set of size k+1, is
\sum_{j=0}^{n−k−t} (−1)^{j} \binom{n−k−t}{j}\, n^{n−2} \left(1 − \frac{k+t+j}{n}\right)^{k} \left(1 − \frac{k}{n}\right)^{t+j−1}.
The claimed expression (5) for e(n,k,t) (the probability that in a uniformly chosen labelled tree on [n] a given set of size k is independent and has exactly t extensions to an independent set of size k+1) follows from Claim 2.8 by summing over all \binom{n−k}{t} possible choices for T (the set of extensions) and dividing by Cayley's formula n^{n−2} for the total number of trees.
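The resulting closed form (5) can be checked exactly against brute-force enumeration for small n; the following sketch (names are ours) does this for n = 6 and k = 2, in exact rational arithmetic:

```python
from fractions import Fraction
from itertools import product
from math import comb

def prufer_to_edges(seq, n):
    """Decode a Prüfer sequence into the edge set of a labelled tree on {0, ..., n-1}."""
    degree = [1] * n
    for x in seq:
        degree[x] += 1
    edges = []
    for x in seq:
        leaf = min(v for v in range(n) if degree[v] == 1)
        edges.append(frozenset((leaf, x)))
        degree[leaf] -= 1
        degree[x] -= 1
    u, v = [w for w in range(n) if degree[w] == 1]
    edges.append(frozenset((u, v)))
    return edges

n = 6
K = (0, 1)      # a fixed 2-set; by symmetry of the model any choice gives the same counts
k = len(K)
counts = {}     # counts[t] = number of trees in which K is independent with exactly t extensions
for pseq in product(range(n), repeat=n - 2):
    edges = set(prufer_to_edges(pseq, n))
    if frozenset(K) in edges:
        continue  # K not independent in this tree
    t = sum(1 for v in range(n) if v not in K
            and all(frozenset((v, u)) not in edges for u in K))
    counts[t] = counts.get(t, 0) + 1

one = Fraction(1)
for t in range(n - k + 1):
    e_formula = (comb(n - k, t) * (one - Fraction(k, n)) ** (t - 1)
                 * sum((-1) ** j * comb(n - k - t, j)
                       * (one - Fraction(k, n)) ** j
                       * (one - Fraction(k + t + j, n)) ** k
                       for j in range(n - k - t + 1)))
    assert Fraction(counts.get(t, 0), n ** (n - 2)) == e_formula
```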