Reduced word enumeration, complexity, and randomization

A reduced word of a permutation $w$ is a minimal length expression of $w$ as a product of simple transpositions. We examine the computational complexity, formulas, and (randomized) algorithms for their enumeration. In particular, we prove that the Edelman-Greene statistic, defined by S. Billey-B. Pawlowski, is typically exponentially large. This implies a result of B. Pawlowski, that it has exponentially growing expectation. Our result is established by a formal run-time analysis of A. Lascoux-M.-P. Sch\"utzenberger's transition algorithm. The more general problem of Hecke word enumeration, and the closely related question of counting set-valued standard Young tableaux, is also investigated. The latter enumeration problem is further motivated by work on Brill-Noether varieties due to M. Chan-N. Pflueger and D. Anderson-L. Chen-N. Tarasca.


Reduced word combinatorics
Let $S_n$ denote the symmetric group on $\{1, 2, \ldots, n\}$. Each $w \in S_n$ can be expressed as a product of $\ell(w)$ simple transpositions $s_i = (i \ i+1)$, where $\ell(w)$ is the number of inversions of $w$, i.e., pairs $i < j$ such that $w(i) > w(j)$. Such an expression $w = s_{i_1} s_{i_2} \cdots s_{i_{\ell(w)}}$ is a reduced word for $w$.
Let $\mathrm{Red}(w)$ be the set of reduced words for $w$. R. P. Stanley [36] defined a symmetric function $F_w$ such that

$\#\mathrm{Red}(w) = $ the coefficient of $x_1 x_2 \cdots x_{\ell(w)}$ in $F_w$. (1)

In connection to ibid., P. Edelman-C. Greene [13, Section 8] proved that

$\#\mathrm{Red}(w) = \sum_{\lambda} a_{w,\lambda} f^{\lambda}$, where (2)

• $f^{\lambda}$ is the number of standard Young tableaux of shape $\lambda$, that is, row and column increasing bijective fillings of the Young diagram of $\lambda$ using $1, 2, \ldots, |\lambda|$. The hook-length formula of J. S. Frame-G. de B. Robinson-R. M. Thrall [16] states that $f^{\lambda} = |\lambda|!/\prod_{b \in \lambda} h_b$, where the product is over all boxes $b \in \lambda$ and $h_b$ is the hook length of $b$, i.e., the number of boxes weakly right and strictly below $b$.
• $a_{w,\lambda}$ counts EG tableaux: row and column increasing fillings $T$ of $\lambda$ such that reading the entries $(i_1, i_2, \ldots, i_{|\lambda|})$ of $T$ along columns, top to bottom, and right to left, gives a reduced word $s_{i_1} \cdots s_{i_{|\lambda|}}$ for $w$ (cf. [10]).
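For small permutations, the count in (2) can be cross-checked by enumerating Red(w) directly. A brute-force sketch in Python (the helper names are ours, not from the text):

```python
def inversions(w):
    """Number of inversions of w (one-line notation, values 1..n)."""
    return sum(1 for a in range(len(w)) for b in range(a + 1, len(w)) if w[a] > w[b])

def reduced_words(w):
    """All reduced words of w, found by peeling off a descent: if w(i) > w(i+1),
    then w = w' s_i with l(w') = l(w) - 1, and we recurse on w'.
    A word (i_1, ..., i_l) is recorded so that w = s_{i_1} ... s_{i_l}."""
    if inversions(w) == 0:
        return [()]
    words = []
    for i in range(len(w) - 1):
        if w[i] > w[i + 1]:  # position i+1 (1-indexed) is a descent of w
            wp = list(w)
            wp[i], wp[i + 1] = wp[i + 1], wp[i]
            words.extend(word + (i + 1,) for word in reduced_words(tuple(wp)))
    return words
```

For the longest element of $S_3$, this returns the two reduced words $(1,2,1)$ and $(2,1,2)$.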

Run-time complexity of transition
Our proof of Theorem 1 uses the transition algorithm of A. Lascoux-M.-P. Schützenberger [26] (cf. [29, Sections 2.7, 2.8]). This algorithm constructs the Lascoux-Schützenberger transition tree $T(w)$ whose root is $w$ and whose leaves $L(w)$ are labelled with vexillary permutations (with multiplicity). With this,

$\#\mathrm{Red}(w) = \sum_{v \in L(w)} f^{\lambda(v)}$; (6)

see Section 2 for details. Different $v$ may give the same $\lambda(v)$. After combining such terms, (6) is the same as (2); see Lemma 6. The (practical) efficiency of (extensions/variations of) transition has been mentioned a number of times. S. Billey [3] calls transition "one of the most efficient methods" to compute Schubert polynomials. See also A. Buch [9, Sections 3.4, 3.5], who discusses a quantum cohomology version of transition that is "quite efficient" for computing Gromov-Witten invariants (based on "practical experiments"). Another remark in this vein is found in the abstract of Z. Hamaker-E. Marberg-B. Pawlowski [19], who develop a different variation on the transition algorithm to "efficiently compute the decomposition" of involution Stanley symmetric functions into Schur P-functions. On the other hand, concerning the application of transition to computing the Littlewood-Richardson coefficients [26], A. Garsia [18, p. 52] writes: "Curiously, their algorithm (in spite of their claims to the contrary) is hopelessly inefficient as compared with well known methods." He also refers to transition as "efficient" for a different purpose in his study of $\mathrm{Red}(w)$.
Theorem 1 is actually a reformulation of the following result, which may be interpreted as a lower bound for the typical run-time of transition:

Theorem 2. $E(\#L) = \Omega(c^n)$ for a fixed constant $c > 1$. That is, the average running time of transition, as an algorithm to compute $\#\mathrm{Red}(w)$, is at least exponential in $n$.
Theorem 7 strengthens Theorem 2 to show that the "typical" running time is exponentially large. To prove Theorem 2 we use that the expected number of occurrences of a fixed pattern $\pi \in S_k$ in $w \in S_n$ is $\binom{n}{k}/k!$. Thus for $u = 2143$, this expectation is $\Theta(n^4)$. One shows each step of transition reduces the number of 2143 patterns by $O(n^3)$. Using the graphical description of transition by A. Knutson and the third author [25], a node $u$ of $T(w)$ has exactly one child $u'$ only if $u'$ has weakly more 2143 patterns than $u$ does. Consequently, $T(w)$ has $\Omega(n)$ branch points along any root-to-leaf path and thus exponentially many leaves. (In fact, the $c > 1$ from our argument is close to 1.) Section 6 collects some remarks and questions about the related matter of the computational complexity of counting $\#\mathrm{Red}(w)$.

Hecke words
Section 4 studies the more general problem of counting $\mathrm{Hecke}(w, N)$, the set of Hecke words of length $N$ whose Demazure product is a given $w \in S_n$. Here, the role of Stanley's symmetric polynomial is played by the stable Grothendieck polynomial defined by S. Fomin and A. N. Kirillov [15]. Using work of S. Fomin and C. Greene [14] and of A. Buch, A. Kresch, M. Shimozono, H. Tamvakis and the third author [10], one has two analogues of the results of Edelman-Greene [13]. However, finding useful enumeration formulas for Hecke words is a challenge, even in the case that $w$ is vexillary.
As explained by Proposition 19, enumerating Hecke words is closely related to the problem of counting $f^{\lambda,N}$, the number of set-valued tableaux [8] that are $N$-standard of shape $\lambda$. These are fillings $T$ of the boxes of $\lambda$ by $1, 2, \ldots, N$, where each entry appears exactly once, and if one chooses precisely one entry from each box of $T$, one obtains a semistandard tableau. For example, if $N = 8$ and $\lambda = (3, 2)$, one such tableau has first row $\{1,2\}, \{4,5\}, \{8\}$ and second row $\{3\}, \{6,7\}$.
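Since the definition is finite, $f^{\lambda,N}$ can be brute-forced for tiny cases. A sketch (ours; exponential in $N$, for cross-checking only). It uses the observation that "every one-entry-per-box choice is semistandard" is equivalent to: the maximum of each box is at most the minimum of its right neighbor, and strictly less than the minimum of the box below.

```python
from itertools import product

def num_setvalued_standard(shape, N):
    """Count N-standard set-valued tableaux of partition `shape`: distribute
    labels 1..N into the boxes, every box nonempty, so that every
    one-entry-per-box selection is semistandard (rows weakly increase,
    columns strictly increase)."""
    boxes = [(i, j) for i in range(len(shape)) for j in range(shape[i])]
    count = 0
    for assign in product(range(len(boxes)), repeat=N):  # box index of each label
        content = [[] for _ in boxes]
        for label, b in enumerate(assign, start=1):
            content[b].append(label)
        if any(not c for c in content):
            continue  # some box would be empty
        ok = True
        for b, (i, j) in enumerate(boxes):
            if (i, j + 1) in boxes and max(content[b]) > min(content[boxes.index((i, j + 1))]):
                ok = False; break  # row would fail to weakly increase
            if (i + 1, j) in boxes and max(content[b]) >= min(content[boxes.index((i + 1, j))]):
                ok = False; break  # column would fail to strictly increase
        if ok:
            count += 1
    return count
```

When $N = |\lambda|$, each box holds one label and the count reduces to $f^\lambda$.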
There is no algorithm to compute $f^{\lambda,N}$ that is polynomial-time in the bit-length of the input $(\lambda, N)$ (see Section 6, after Observation 39).
Problem 3. Does there exist an algorithm to compute $f^{\lambda,N}$ that is polynomial in $|\lambda|$ and $N$?
Recent work of M. Chan-N. Pflueger [11] and D. Anderson-L. Chen-N. Tarasca [2] motivates the study of $f^{\lambda,N}$ in terms of Brill-Noether varieties. We remark on two manifestly nonnegative formulas for the Euler characteristics of these varieties (Corollary 26).

Randomization
Section 5 gives three randomized algorithms to estimate $\#\mathrm{Red}(w)$ and/or $\#\mathrm{Hecke}(w, N)$ using importance sampling. That is, let $S$ be a finite set. Assign $s \in S$ probability $p_s$. Let $Z$ be a random variable on $S$ with $Z(s) = 1/p_s$. Then $E(Z) = \sum_{s \in S} p_s \cdot (1/p_s) = \#S$. Using this, one can devise simple Monte Carlo algorithms to estimate $\#S$. The idea goes back to at least a 1951 article of H. Kahn-T. E. Harris [21], who furthermore credit J. von Neumann. The application to combinatorial enumeration was popularized through D. Knuth's article [22], which applies it to estimating the number of self-avoiding walks in a grid. An application to approximating the permanent was given by L. E. Rasmussen [33]. More recently, J. Blitzstein-P. Diaconis [6] developed an importance sampling algorithm to estimate the number of graphs with a given degree sequence. We are suggesting another avenue of applicability, to core objects of algebraic combinatorics.
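As an illustration of the scheme (not one of the three algorithms of Section 5 per se), one can estimate $\#\mathrm{Red}(w)$ by growing a reduced word through uniformly random descent choices and averaging the inverse sampling probabilities. A sketch, with helper names of our own:

```python
import random

def descents(w):
    """Positions i (0-indexed) with w(i) > w(i+1)."""
    return [i for i in range(len(w) - 1) if w[i] > w[i + 1]]

def one_sample(w):
    """One importance-sampling trial: build a random reduced word by repeatedly
    swapping at a uniformly random descent; return Z = 1/p_s for the sampled word."""
    w = list(w)
    weight = 1.0
    while descents(w):
        ds = descents(w)
        weight *= len(ds)  # this step had probability 1/len(ds)
        i = random.choice(ds)
        w[i], w[i + 1] = w[i + 1], w[i]
    return weight

def estimate_num_reduced_words(w, trials=10000):
    """Unbiased estimator of #Red(w): average of Z over independent trials."""
    return sum(one_sample(w) for _ in range(trials)) / trials
```

For $w = 321$ every root-to-leaf path has the same probability, so the estimator returns $\#\mathrm{Red}(321) = 2$ exactly; in general only the expectation is exact.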

Preliminaries
The graph $G(w)$ of a permutation $w \in S_n$ is the $n \times n$ grid, with a • placed in position $(i, w(i))$ (in matrix coordinates). The Rothe diagram of $w$ is $D(w) = \{(i, j) : j < w(i) \text{ and } i < w^{-1}(j)\}$. Pictorially, this is described by striking out boxes below and to the right of each • in $G(w)$; $D(w)$ consists of the remaining boxes. If it exists, the connected component involving $(1,1)$ is the dominant component. The essential set of $w$ consists of the maximally southeast boxes of each connected component of $D(w)$, i.e., those $(i,j) \in D(w)$ with $(i+1, j), (i, j+1) \notin D(w)$. If it exists, the accessible box is the southmost, then eastmost, essential set box not in the dominant component. The Lehmer code of $w$ is $\mathrm{code}(w) = (c_1, c_2, \ldots, c_L)$, where $c_i = \#\{j > i : w(j) < w(i)\}$; we will assume $L$ is minimum (i.e., $\mathrm{code}(w)$ does not have trailing zeros). By this convention, $\mathrm{code}(\mathrm{id}) = ()$. Fulton's criterion [17, Remark 9.17] states that $u$ is vexillary if and only if there do not exist two essential set boxes where one is strictly northwest of the other. Thus, using the above picture of $D(w)$, we can see that $w$ is not vexillary because of, e.g., $(1,4)$ and $(5,6)$.
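The Rothe diagram, essential set, and Fulton's criterion are straightforward to implement; a sketch (function names ours):

```python
def rothe_diagram(w):
    """D(w) = {(i, j) : j < w(i) and i < w^{-1}(j)}, 1-indexed matrix coordinates."""
    n = len(w)
    winv = {w[i - 1]: i for i in range(1, n + 1)}
    return {(i, j) for i in range(1, n + 1) for j in range(1, n + 1)
            if w[i - 1] > j and winv[j] > i}

def essential_set(w):
    """Maximally southeast boxes: (i, j) in D(w) with (i+1, j), (i, j+1) not in D(w)."""
    D = rothe_diagram(w)
    return {(i, j) for (i, j) in D if (i + 1, j) not in D and (i, j + 1) not in D}

def is_vexillary(w):
    """Fulton's criterion: no essential box strictly northwest of another."""
    E = essential_set(w)
    return not any(a[0] < b[0] and a[1] < b[1] for a in E for b in E)
```

For the running example $w = 54278316$, this confirms that $(1,4)$ and $(5,6)$ are essential boxes and that $w$ is not vexillary.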
We describe the graphical version of the transition algorithm to compute $\#\mathrm{Red}(w)$. The root of the tree is labelled by $D(w)$. If $w$ is not vexillary, the children of $w$ are defined as follows. For each $i = 1, 2, \ldots, t$, let $R_i$ be the rectangle defined by the pivot $b_i$ and the accessible box $e$. Remove $b_i$ and its rays from $G(w)$ to form $G^{(i)}(w)$. Order the boxes $\{v_i\}_{i=1}^{m}$ in English reading order. Move $v_1$ strictly north and strictly west to the closest position not occupied by another box of $D(w)$ or a ray from $G^{(i)}(w)$. Now, iterate this procedure with $v_2, v_3, \ldots$. At each step, $v_j$ may move to a position vacated by earlier moves. The result is the diagram $D(w^{(i)})$ of some permutation $w^{(i)}$. These $D(w^{(i)})$'s are the children of $D(w)$. We call the transformation $D(w) \to D(w^{(i)})$ a marching move.
Example 4. Continuing our example, the pivots of $w$ are $(1,5)$, $(2,4)$ and $(3,2)$. We now obtain the child corresponding to the pivot $b_2 = (2, 4)$; we have indicated by "X" the boxes that have moved. This process constructs one of the three children of $w$. In Figure 1 we draw the remainder of $T(w)$.
If $u$ is vexillary, we define $\lambda(u)$ graphically by pushing all boxes of $D(u)$ northwest along the diagonal on which each sits until a partition shape is reached; see [24, Section 3.2]. Concluding our running example, the shapes $\lambda(v)$ of the leaves of Figure 1 compute $\#\mathrm{Red}(w)$ via (6). The following result is a mild variation of [29, Proposition 2.8.1] (cf. [26]) using the marching moves. We make no claim of originality.
Theorem 5 (cf. [26, 29, 25]). $F_w = \sum_{v \in L(w)} s_{\lambda(v)}$. (11)

Proof: We follow [1, Section 5.2], which elaborates on the notions from [25] in the case of Schubert polynomials $\mathfrak{S}_w$. We refer to [29, Chapter 2] for background. Let $(r, c)$ be the accessible box of $w \in S_n$ and set $k = w^{-1}(c)$. Also let $w' = w \cdot (r, k)$. Transition gives this recurrence for the Schubert polynomials:

$\mathfrak{S}_w = x_r \mathfrak{S}_{w'} + \sum_{w''} \mathfrak{S}_{w''},$

where the summation is over the children $w''$ of $w$ in $T(w)$.

[Figure 1: $T(w)$ for $w = 54278316$. The "a" indicates the accessible box of each node. The X's describe which boxes of the parent moved. From this tree, we compute $\#\mathrm{Red}(w) = 730158$.]
Since each $v \in L(w)$ is vexillary, $F_v = s_{\lambda(v)}$; see, e.g., [29, Section 2.8.1]. Taking stable limits in the recurrence and iterating down to the leaves of $T(w)$ expresses $F_w$ as the sum of $s_{\lambda(v)}$ over $v \in L(w)$. Now the result follows from these two facts combined with (11).
3 Proof of Theorems 1 and 2

On the distribution of EG(w)
Lemma 6. For any $w \in S_n$, $\mathrm{EG}(w) = \#L(w)$.
Proof. Combining results of [36, 13] gives

$F_w = \sum_{\lambda} a_{w,\lambda} s_{\lambda},$ (12)

where the sum is over partitions $\lambda$ of size $\ell(w)$, and $a_{w,\lambda}$ is defined in Section 1. The Schur polynomials $s_{\lambda}(x_1, \ldots, x_{\ell(w)})$ for $|\lambda| = \ell(w)$ are a basis of the vector space of homogeneous symmetric polynomials of degree $\ell(w)$ in these variables. Since (12) and (11) (where $n = \ell(w)$) are linear combinations expressing the same vector, we are done.
In view of Lemma 6, Theorems 1 and 2 are equivalent. It is easy to see that Theorem 1 follows from our main result (Theorem 7): there exists $\alpha > 0$ such that, for $n$ sufficiently large, $\#L(w)$ is exponentially large in $n$ with high probability. Our goal is therefore to prove Theorem 7. To do so we need some preparatory results. Let $N_{\pi,n}(w)$ be the number of $\pi$ patterns contained in $w \in S_n$.

Proposition 8. Suppose in $T(w)$ that the node $u$ has exactly one child $u'$. Then $N_{2143,n}(u') \geq N_{2143,n}(u)$.

Proof of Proposition 8: Let the accessible box $z_u$ of $u$ be in position $(x, y)$. By definition of the transition algorithm, all •'s of $G(u)$ and $G(u')$ are in the same position, except that $A$, $B$, $C$ in $G(u)$ are respectively replaced by $A'$, $B'$, $C'$. Schematically, the marching move looks as follows (we have thickened the moving •'s).
Claim 9. If there are two •'s, other than $\{B, C\}$, that are weakly south and weakly east of $z_u$, then one • must be (strictly) southeast of the other.
Proof of Claim 9: Suppose not. Then let the two •'s be at $(q, r)$ and $(m, s)$ where $q > m$ and $r < s$. Then there is a box $z \neq z_u$ of $D(u)$ in position $(m, r)$, which is weakly south and weakly east of $z_u$. Since $z_u$ is not in the dominant component of $D(u)$, $z$ cannot be in that component either. Therefore, $z_u$ is not the accessible box of $D(u)$, a contradiction.

Claim 10.
There is no • of G(u) strictly north of row c and strictly between columns d and y. Similarly, there is no • of G(u) strictly west of column d and strictly between rows c and x.
Proof of Claim 10: We prove only the first sentence of the claim, as the second sentence is analogous. Suppose not; we may assume this • is maximally southeast with the assumed properties. Then $A = (c, d)$ and this • are two pivots for $D(u)$, which implies $u$ has at least two children, contradicting the hypothesis of the Proposition.

Let F_u consist of all embedding positions of 2143 patterns in $u$; the embeddings whose positions do not involve the rows of $A$, $B$ or $C$ are common to $u$ and $u'$, and F''_u denotes the remaining embeddings. We will define an injection ψ from F''_u to F''_{u'}. In what follows, we will let •_i refer to the • in the diagram corresponding to the "i" in the 2143 pattern, for 1 ≤ i ≤ 4. In addition, if the "2" of the pattern is in the row of $A$, we will write "A = •_2", etc. We now define ψ in cases:

Case 1: •_4 and •_3 appear strictly right of column $y$. This contradicts Claim 9. Hence, no elements of F''_u fall into this case.

Case 2: (C = •_1 or C = •_2) •_4 and •_3 appear strictly southeast of z_u. As in Case 1, this contradicts Claim 9. Again, no elements of F''_u fall into this case.

Case 3: (A = •_1) Let •_2 be at position $(r, s)$. Hence $r < c$ and $s > d$. If moreover $s < y$, we contradict the first sentence of Claim 10. Hence, $s > y$. We must have that

Subcase 3c: (•_4 is strictly north of row $x$ and •_3 is strictly south of row $x$) Since $s > y$,

Subcase 3d: (•_3 is in row $x$ and •_4 is strictly above row $x$) Then in fact

Subcase 3e: (•_4 is in row $x$ and •_3 is strictly south of row $x$) Thus •_4 = C and •_3 is strictly southeast of z_u. This contradicts Claim 9. Hence, z_u cannot be the accessible box, a contradiction. Thus, no elements of F''_u are in this case.
Case 6: (A = •_4) •_3 is strictly south of the row of $A$. If it is also weakly north of $x$, we contradict the second sentence of Claim 10. Hence •_3 is strictly south of $x$, i.e., the row of $e$. Now, since the row of $A'$ is $x$, the output is a 2143 pattern in $u'$.

Subcase 8a: ($r < c$) Therefore, •_1 and •_2 are also strictly above row $c$. Since •_1, •_2 and •_4 stay in the same place in $u$ and $u'$ and

Subcase 8b: ($x < r < x'$) This contradicts Claim 9.

Subcase 8c: ($r = c$) This implies A = •_4, which is impossible.

Subcase 8d: ($c < r < x$) We may assume A ≠ •_1, •_2 since those cases are handled by Case 3 and Case 4. Now, •_1 and •_2 are strictly west of column $y$ and strictly north of row $x$. By the assumption that $A = b_1$ is the (unique) pivot, combined with Claim 10, both •_1 and •_2 are strictly northwest of $A$. Thus, •_1 and •_2 are in the same place in $u'$, and

Subcase 8e: ($r = x$) Since $C'$ is further south than $C$, we may set

Subcase 9b: ($c < r < x$) We may also assume that A ≠ •_1 and A ≠ •_2, since those are handled in Case 3 and Case 4, respectively. Thus

Subcase 9c: ($r = c$) Then A = •_4, which is impossible. Thus, $x < r < x'$ and it follows that •_3 is in the same place in $u'$. By the reasoning of the first paragraph of Subcase 8d, •_1, •_2 are strictly northwest of $A$. Hence •_1, •_2 also remain in the same place in $u'$. Summing up, since $B'$ is in row $c$, we may define

ψ is well-defined: The above cases handle each of the possibilities for $A$, $B$, $C$ being one of •_1, •_2, •_3, •_4. Our definition of ψ has been shown to send an element of F''_u to an element of F''_{u'}. We also need that if an element of F''_u occurs in two cases, ψ sends them to the same element of F''_{u'}. By inspection, the only overlapping situations are Subcase 3d↔Subcase 9b and Subcase 8d↔Case 10. In both these cases we define ψ to be consistent on the overlap.

ψ is an injection: This is by inspection of pairs of subcases where ψ's output was given.
By our choice of notation, if •_i appears in the description of the input to ψ, it cannot be equal to $A$, $B$ or $C$, and hence in the output it cannot be equal to $A'$, $B'$ or $C'$ (as $\{A, B, C\}$ and $\{A', B', C'\}$ occupy the same rows). Therefore, if in two cases some coordinate of the two outputs differs symbolically, those outputs cannot be equal. After ruling out these pairs, we are left with a few to check:

Subcase 3b, Subcase 3c: These differ in the fourth coordinate, since in the former case •_3 is strictly north of row $x$ and in the latter case •_3 is strictly south of row $x$.

Case 5 and Subcase 8d: These differ in the third coordinate, since in the former case •_4 appears above row $c$ whereas in the latter case •_4 is below row $c$.

Subcase 9a and Subcase 9b: These differ in the third coordinate for the same reason as the previous pair.
Lemma 11. Let $w \in S_n$ and suppose $u \to u'$ in $T(w)$. Then

$N_{2143,n}(u) - N_{2143,n}(u') \leq O(n) + O(n^2) + O(n^3) = O(n^3).$ (13)

Proof of Lemma 11: Since $u \to u'$ in $T(w)$, exactly three positions $a, i, j$ differ between $u$ and $u'$. We are claiming that the embeddings $(t_1, t_2, t_3, t_4)$ with $\#(\{t_1, t_2, t_3, t_4\} \cap \{a, i, j\}) = 1$ contribute $O(n^3)$ to $N_{2143,n}(u) - N_{2143,n}(u')$, thus explaining the third term of (13). Similar arguments explain the first and second terms of (13) as the contributions to $N_{2143,n}(u) - N_{2143,n}(u')$ from the cases that $\#(\{t_1, t_2, t_3, t_4\} \cap \{a, i, j\}) = 3$ and $\#(\{t_1, t_2, t_3, t_4\} \cap \{a, i, j\}) = 2$, respectively. The lemma thus follows.
The following is known; see work of M. Bóna [7] and of S. Janson, B. Nakamura, and D. Zeilberger [20]. Since the proof is not difficult, we include it for completeness.
Lemma 12. For any $\pi \in S_k$, the expected number of occurrences of $\pi$ as a pattern in $w \in S_n$ (selected using the uniform distribution) is $\binom{n}{k}/k!$.

Lemma 13. Let $T$ be a rooted tree with the property that along any path from the root to a leaf there are $d$ nodes with at least two children. Then that tree has at least $2^d$ leaves.
Proof of Lemma 13: Arbitrarily left-right order the descendants of the root of $T$. After pruning, if necessary, we may assume each node has at most two children. Along any path from the root to a leaf, record "S" if a node has one child, "L" if one steps to the left child, and "R" if one goes to the right child. Thus, each leaf is uniquely encoded by an $\{S, L, R\}$ sequence. By hypothesis, each such sequence has at least $d$ letters from $\{L, R\}$. Also, each of the $2^d$-many $\{L, R\}$-sequences must be a subsequence of a unique leaf sequence.
Hence there are at least $2^d$ leaves.
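Lemma 12 can be verified exhaustively for small $n$; a sketch (helper names ours):

```python
from itertools import permutations, combinations
from math import comb, factorial

def pattern_count(w, pi):
    """N_{pi,n}(w): number of occurrences of the pattern pi in w, i.e., index
    subsets whose values appear in the same relative order as pi."""
    k = len(pi)
    order = sorted(range(k), key=lambda t: pi[t])
    count = 0
    for pos in combinations(range(len(w)), k):
        vals = [w[p] for p in pos]
        if sorted(range(k), key=lambda t: vals[t]) == order:
            count += 1
    return count

def average_pattern_count(n, pi):
    """Average of N_{pi,n} over all of S_n; Lemma 12 says this equals C(n, k)/k!."""
    total = sum(pattern_count(w, pi) for w in permutations(range(1, n + 1)))
    return total / factorial(n)
```

For instance, the average number of inversions (the pattern $\pi = 21$) over $S_3$ is $\binom{3}{2}/2! = 3/2$.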

Remarks
M. Bóna [7] proves that the sequence of (standardized) pattern-counting random variables $X_n$ is asymptotically normal, i.e., $X_n$ converges in distribution to the standard normal variable $N(0, 1)$. In particular, this means that for any $\epsilon > 0$ and any $a, b \in \mathbb{R}$, there exists $N \in \mathbb{N}$ such that for all $n \geq N$, $|P(a \leq X_n \leq b) - P(a \leq N(0,1) \leq b)| < \epsilon$. Thus one could use Bóna's theorem to prove a more refined version of Theorem 7. However, this does not affect our basic conclusions, so we opted to state a result/proof that only appeals to Chebyshev's inequality.
Using the commutation relations $s_i s_j = s_j s_i$ for $|i - j| \geq 2$ and the braid relations $s_i s_{i+1} s_i = s_{i+1} s_i s_{i+1}$, one can transform between any two reduced words; see, e.g., [29, Proposition 2.1.6]. Let $\sigma^{(n)} = 2143 \cdots (2n)(2n-1) \in S_{2n}$. Since $\sigma^{(n)} = s_1 s_3 \cdots s_{2n-1}$ is a product of $n$ pairwise commuting simple transpositions, it follows that

$\mathrm{Red}(\sigma^{(n)})$ consists of the $n!$ rearrangements of $(1, 3, 5, \ldots, 2n-1)$. (20)

This next fact is well-known to experts (for example, an anonymous referee states one can derive it from [28, Theorem 9]). We give a proof here for sake of completeness and make no claims of originality.

Proposition 14. $a_{\sigma^{(n)},\lambda} = f^{\lambda}$.
Proof. Fix any partition $\lambda$ of size $n$. Consider any row and column increasing filling $T$ of $\lambda$ using each of the labels $\{1, 3, 5, \ldots, 2n-1\}$ precisely once. Let $A_\lambda$ be the set of these tableaux. Also, let $B_\lambda$ be the set of EG tableaux for the coefficient $a_{\sigma^{(n)},\lambda}$. By (20), $\mathrm{Red}(\sigma^{(n)})$ consists of all $n!$ rearrangements of the factors of $s_1 s_3 \cdots s_{2n-1}$. Hence, the column reading word of any $T \in A_\lambda$ gives a reduced word for $\sigma^{(n)}$. Thus, $A_\lambda \subseteq B_\lambda$. By (20), if $S \in B_\lambda$, it must use each label of $\{1, 3, 5, \ldots, 2n-1\}$ exactly once. Since $S$ must also be row and column increasing, we see $S \in A_\lambda$. This gives $A_\lambda = B_\lambda$.
Given $T \in A_\lambda$ $(= B_\lambda)$, let $\phi(T) \in \mathrm{SYT}(\lambda)$ be the standard Young tableau of shape $\lambda$ obtained by sending label $i$ in $T$ to $(i+1)/2$. Let $\mathrm{inv}(n)$ be the number of involutions of $S_n$. The following shows that the worst case and average case running times of transition are quite different: $\#L(\sigma^{(n)}) = \mathrm{inv}(n)$.

Proof. The equality holds since

$\#L(\sigma^{(n)}) = \mathrm{EG}(\sigma^{(n)}) = \sum_{\lambda \vdash n} f^{\lambda} = \mathrm{inv}(n).$ (21)

The first equality of (21) is Lemma 6, the second is Proposition 14 and the third is textbook (e.g., [37, Corollary 7.13.9]). The asymptotic statement is [23, Section 5.1.4].
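The textbook identity $\sum_{\lambda \vdash n} f^{\lambda} = \mathrm{inv}(n)$ used in the proof is easy to check numerically; a sketch (helper names ours; $\mathrm{inv}(n)$ is computed via the standard recurrence $\mathrm{inv}(n) = \mathrm{inv}(n-1) + (n-1)\,\mathrm{inv}(n-2)$):

```python
from math import factorial

def num_involutions(n):
    """inv(n) via inv(n) = inv(n-1) + (n-1) * inv(n-2), with inv(0) = inv(1) = 1."""
    if n == 0:
        return 1
    a, b = 1, 1  # inv(0), inv(1)
    for m in range(2, n + 1):
        a, b = b, b + (m - 1) * a
    return b

def partitions(n, maxpart=None):
    """All partitions of n as weakly decreasing tuples."""
    maxpart = n if maxpart is None else maxpart
    if n == 0:
        yield ()
        return
    for first in range(min(n, maxpart), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def num_syt(shape):
    """f^lambda by the hook-length formula."""
    hooks = 1
    for i in range(len(shape)):
        for j in range(shape[i]):
            hooks *= (shape[i] - j) + sum(1 for r in range(i + 1, len(shape)) if shape[r] > j)
    return factorial(sum(shape)) // hooks
```

For $n = 4$: the five partitions give $f^{\lambda} = 1, 3, 2, 3, 1$, summing to $\mathrm{inv}(4) = 10$.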
Since the original preprint version of this paper was posted to the arXiv, this conjecture has been proved by G. Orelowitz [31].

Counting Hecke words
A word $(i_1, i_2, \ldots, i_N)$ is a Hecke word for $w \in S_n$ if $s_{i_1} * s_{i_2} * \cdots * s_{i_N} = w$, where $*$ denotes the Demazure product: $u * s_i = us_i$ if $\ell(us_i) > \ell(u)$, and $u * s_i = u$ otherwise. Therefore, $N \geq \ell(w)$. Let $\mathrm{Hecke}(w, N)$ denote the set of Hecke words for $w$ of length $N$.
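The Demazure product translates directly into code, giving a brute-force count of $\#\mathrm{Hecke}(w, N)$ for small cases (helper names ours):

```python
from itertools import product

def demazure_product(word, n):
    """Compute s_{i_1} * s_{i_2} * ... * s_{i_N} in S_n, where u * s_i = u s_i
    if that increases length (i.e., u(i) < u(i+1)) and u * s_i = u otherwise."""
    w = list(range(1, n + 1))
    for i in word:
        if w[i - 1] < w[i]:  # ascent at position i: length goes up
            w[i - 1], w[i] = w[i], w[i - 1]
    return tuple(w)

def num_hecke_words(w, N):
    """#Hecke(w, N) by brute force over all words in {1, ..., n-1}^N."""
    n = len(w)
    return sum(1 for word in product(range(1, n), repeat=N)
               if demazure_product(word, n) == tuple(w))
```

When $N = \ell(w)$ the Demazure product never absorbs a letter, so the count reduces to $\#\mathrm{Red}(w)$; e.g., $\#\mathrm{Hecke}(321, 3) = \#\mathrm{Red}(321) = 2$.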

Two generalizations of the Edelman-Greene formula (2)
We now give two formulas for computing $\#\mathrm{Hecke}(w, N)$. Both are known to experts, but we are unaware of any specific place where they appear in the literature. We need three (stable) Grothendieck polynomial formulas from the literature. For the purposes of this paper, the reader may take these formulas as definitions.
First, S. Fomin-A. N. Kirillov [15] prove the following combinatorial formula for the stable Grothendieck polynomial $G_w$:

$G_w = \sum_{(i,j)} (-1)^{N - \ell(w)} x_{j_1} x_{j_2} \cdots x_{j_N},$

where $i = (i_1, \ldots, i_N) \in \mathrm{Hecke}(w, N)$ and $j = (j_1 \leq j_2 \leq \cdots \leq j_N)$ are positive integers satisfying $j_t < j_{t+1}$ whenever $i_t \leq i_{t+1}$. This is a formal power series in $x_1, x_2, \ldots$. Second, S. Fomin-C. Greene [14, Theorem 1.2] states that, up to change of conventions, $G_w = \sum_{\lambda} (-1)^{|\lambda| - \ell(w)} b_{w,\lambda} s_{\lambda}$, where $b_{w,\lambda}$ is the number of row strictly increasing and column weakly increasing tableaux of shape $\lambda$ whose top to bottom, right to left, column reading word is a Hecke word for $w$. Third, C. Lenart [27] gave an expression for the symmetric Grothendieck polynomial:

$G_\mu = \sum_{\lambda} (-1)^{|\lambda/\mu|} g_{\mu,\lambda} s_{\lambda},$

where $\mu \subseteq \lambda \subseteq \widehat{\mu}$. Here $\widehat{\mu}$ is the unique maximal partition with $t$ rows obtained by adding at most $i - 1$ boxes to row $i$ of $\mu$ for $2 \leq i \leq t$. In addition, $g_{\mu,\lambda}$ counts the number of Lenart tableaux, i.e., column and row strict tableaux of shape $\lambda/\mu$ with entries in the $i$-th row restricted to $1, 2, \ldots, i-1$ for each $i$.
Since the Schur polynomials form a basis of the ring of symmetric polynomials, the right-hand sides of (24) and (23) coincide, i.e., $b_{\pi,\lambda} = g_{M \times M, \lambda}$ for every $\lambda$. The result follows.
and the latter sum is over all semistandard set-valued tableaux of shape $\lambda$ [8]. Above, $c_{w,\lambda}$ is the number of row and column strict tableaux of shape $\lambda$ whose top to bottom, right to left, column reading word is a Hecke word for $w$. This next generalization of (2) is also manifestly nonnegative. It specializes in the vexillary case in a tantalizing way.
Proof. In view of (26) and (28), (29) follows. For the second statement, by [34, Lemma 5.4], when $w$ is vexillary then $G_w = G_{\lambda(w)}$, and the above sequence of equalities simplifies, as desired.
Proposition 19 is our central motivation for Problem 40.
The second formula expresses $(-1)^{g - |CP|}\, \chi(G^{r,\alpha,\beta}_d(X, p, q))$ as a cancellation-free sum of Euler characteristics of other Brill-Noether varieties. Is there a geometric explanation of this?
5 Three importance sampling algorithms

Estimating #Hecke(w, N )
We propose a different importance sampling algorithm to estimate $\#\mathrm{Hecke}(w, N)$. For $N < \ell(w)$ the random variable $Z_{w,N}$ is equal to 0, and for $N \geq \ell(w)$ it is recursively defined via

$\#\mathrm{Hecke}(w, N) = \begin{cases} 1 & \text{if } w = \mathrm{id} \text{ and } N = 0, \\ 0 & \text{if } w = \mathrm{id} \text{ and } N > 0, \\ \sum_{i \in D(w)} \left(\#\mathrm{Hecke}(ws_i, N-1) + \#\mathrm{Hecke}(w, N-1)\right) & \text{otherwise,} \end{cases}$ (33)

where $D(w) = \{i : w(i) > w(i+1)\}$ is the descent set of $w$. The unique Hecke word for $w = \mathrm{id}$ is the empty word; this explains the first two cases.
Claim 32. $i_N$ is the position of a descent of $w$, i.e., $w(i_N) > w(i_N + 1)$.
Proof of Claim 32: In the former case, $w' = w$; if $i_N$ were the position of an ascent of $w' = w$, then $w' * s_{i_N} = w's_{i_N} \neq w$, a contradiction. In the latter case, $w'$ had an ascent at position $i_N$, which becomes a descent in $w' * s_{i_N} = w's_{i_N} = w$. Claim 32 implies the existence of a bijection (34). Therefore, by taking cardinalities on both sides of (34), we obtain the third case of (33).
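The recursion behind this argument — the last letter $i_N$ is a descent of $w$, and deleting it leaves a Hecke word of length $N-1$ for either $ws_{i_N}$ (if the last step raised length) or $w$ itself (if it was absorbed) — can be memoized. A sketch under our reading of (33):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def num_hecke(w, N):
    """#Hecke(w, N) for w a tuple in one-line notation: peel off the last letter,
    which must be a descent of w, and recurse on length N-1."""
    n = len(w)
    if w == tuple(range(1, n + 1)):  # w = id: only the empty Hecke word
        return 1 if N == 0 else 0
    if N <= 0:
        return 0
    total = 0
    for i in range(n - 1):
        if w[i] > w[i + 1]:  # descent: candidate last letter i+1
            ws = list(w)
            ws[i], ws[i + 1] = ws[i + 1], ws[i]
            total += num_hecke(tuple(ws), N - 1) + num_hecke(w, N - 1)
    return total
```

For example, $\#\mathrm{Hecke}(21, 3) = 1$ (the word $(1,1,1)$), and $\#\mathrm{Hecke}(321, 3) = \#\mathrm{Red}(321) = 2$.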

Remarks and questions about computational complexity
The exponential average run-time of transition (Theorem 2) does not imply that computing $\#\mathrm{Red}(w)$ is hard. Suppose one encodes a permutation $w$ by its Lehmer code $\mathrm{code}(w) = (c_1, c_2, \ldots, c_L)$. What is the worst case complexity of computing $\#\mathrm{Red}(w)$ given the input $\mathrm{code}(w)$?
L. Valiant [38] introduced the complexity class #P of problems that count the number of accepting paths of a non-deterministic Turing machine running in polynomial time in the length of the input. Let FP be the class of function problems solvable in polynomial time on a deterministic Turing machine. It is basic theory that FP ⊆ #P.
Proof. Let $\theta_n$ be the vexillary permutation with $\mathrm{code}(\theta_n) = (n, n)$. Then, using (5),

$\#\mathrm{Red}(\theta_n) = f^{(n,n)} = C_n := \frac{1}{n+1}\binom{2n}{n}.$

The middle equality is textbook: there is a bijection between standard Young tableaux of shape $(n, n)$ and Dyck paths from $(0, 0)$ to $(2n, 0)$; both are enumerated by the Catalan number $C_n$. Now, $\#\mathrm{Red}(\theta_n)$ is doubly exponential in the input length $O(\log n)$. No such problem can be in #P; see [30, Section 3], which also inspired this observation.
(That $\#\mathrm{Red}(w) \notin \mathrm{FP}$ is true from this argument for the simple reason that it takes exponential time just to write down the output.) By Observation 39's reasoning, (36) shows there is no algorithm to compute $f^{\lambda,N}$ that is polynomial-time in the bit-length of the input $(\lambda, N)$.
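The equality $f^{(n,n)} = C_n$ used above is easy to confirm with the hook-length formula; a sketch (helper names ours):

```python
from math import comb, factorial

def num_syt(shape):
    """f^lambda = |lambda|! / prod of hooks (Frame-Robinson-Thrall); the hook of
    box (i, j) counts boxes weakly right in its row plus boxes strictly below
    in its column."""
    hooks = 1
    for i in range(len(shape)):
        for j in range(shape[i]):
            hooks *= (shape[i] - j) + sum(1 for r in range(i + 1, len(shape)) if shape[r] > j)
    return factorial(sum(shape)) // hooks

def catalan(n):
    """C_n = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)
```

E.g., $f^{(3,3)} = 5 = C_3$ and $f^{(4,4)} = 14 = C_4$.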
A counting problem $P$ is #P-hard if any problem in #P has a polynomial-time counting reduction to $P$. Is $\#\mathrm{Red}(w)$ #P-hard?
Observation 39 is dependent on the choice of encoding. For example, if one encodes a permutation $w \in S_n$ in the inefficient one-line notation, the input takes $O(n \log n)$ space. Since $\ell(w) \leq \binom{n}{2}$ is polynomial in the input length, it follows that $\#\mathrm{Red}(w) \in$ #P; see [35].
Problem 40. Does there exist an $n^{O(1)}$-algorithm to compute $\#\mathrm{Red}(w)$?
Knutson, Tejo Nutalapati, Gidon Orelowitz, Colleen Robichaux, Renming Song, John Stembridge, Anna Weigandt and Harshit Yadav for helpful remarks/discussion. We are especially grateful to Brendan Pawlowski for pointing out Theorem 1 appears as Theorem 3.2.7 of [32], as well as other remarks. We also thank the anonymous referee for their careful reading and comments that improved our presentation. This work is part of ICLUE, the Illinois Combinatorics Lab for Undergraduate Experience.