Universal and Near-Universal Cycles of Set Partitions

We study universal cycles of the set ${\cal P}(n,k)$ of $k$-partitions of the set $[n]:=\{1,2,\ldots,n\}$ and prove that the transition digraph associated with ${\cal P}(n,k)$ is Eulerian. But this does not imply that universal cycles (or ucycles) exist, since vertices represent equivalence classes of partitions! We use this result to prove, however, that ucycles of ${\cal P}(n,k)$ exist for all $n \geq 3$ when $k=2$. We reprove that they exist for odd $n$ when $k = n-1$ and that they do not exist for even $n$ when $k = n-1$. An infinite family of $(n,k)$ for which ucycles do not exist is shown to be those pairs for which $S(n-2, k-2)$ is odd ($3 \leq k<n-1$). We also show that there exist universal cycles of partitions of $[n]$ into $k$ subsets of distinct sizes when $k$ is sufficiently smaller than $n$, and therefore that there exist universal packings of the partitions in ${\cal P}(n,k)$. An analogous result for coverings completes the investigation.


Introduction
Universal cycles are often loosely defined. In a recent seminar talk at Virginia Commonwealth University, Glenn Hurlbert offered the following description: "Broadly, universal cycles (ucycles) are special listings of combinatorial objects in which codes for the objects are written in an overlapping, cyclic manner." By "special," Hurlbert means "without repetitions." As an example, the cyclic string 112233 encodes each of the six multisets of size 2 from the set {1, 2, 3}. Another often-quoted example, from [8], is the string 1356725 6823472 3578147 8245614 5712361 2467836 7134582 4681258, where each block is obtained from the previous one by addition of 5 modulo 8. This string is an encoding of the fifty-six 3-subsets of the set [8] := {1, 2, 3, 4, 5, 6, 7, 8}. A seminal paper in the area is that of Chung, Diaconis and Graham [2], who studied ucycles of
• subsets of size k of an n-element set (as in the above example);
• set partitions (the focus of this paper); and
• permutations (with a necessarily augmented ground set and the use of order-isomorphic representations; e.g., the string 124324 encodes each of the six permutations of [3] = {1, 2, 3} in an order-isomorphic fashion, which is clearly not possible using the ground set [3]).
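The two small examples above can be checked mechanically. The following Python sketch reads every cyclic window of the strings 112233 and 124324 and compares the result with the full list of multisets and of permutations; the helper names cyclic_windows and order_pattern are ours and do not come from [2] or [8].

```python
from itertools import combinations_with_replacement, permutations

def cyclic_windows(s, width):
    """All len(s) cyclic windows of the given width."""
    doubled = s + s[:width - 1]
    return [doubled[i:i + width] for i in range(len(s))]

def order_pattern(window):
    """Replace each symbol by its rank among the distinct symbols of the window."""
    ranks = {c: i + 1 for i, c in enumerate(sorted(set(window)))}
    return tuple(ranks[c] for c in window)

# Multisets of size 2 from {1, 2, 3}: a window is read as an unordered pair.
multisets = sorted(tuple(sorted(w)) for w in cyclic_windows("112233", 2))

# Permutations of [3]: a window is read up to order isomorphism.
perms = sorted(order_pattern(w) for w in cyclic_windows("124324", 3))

print(multisets == sorted(combinations_with_replacement("123", 2)))
print(perms == sorted(permutations((1, 2, 3))))
```

Both comparisons print True, so each object is encoded exactly once.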
In [2] it is shown that for n ≥ 4, there does exist a ucycle of all partitions P(n) of the set [n] into an arbitrary number of parts; for example, we have the ucycle abcbccccddcdeec of P(4), where, for example, the substring dcde encodes the partition 13|2|4. Note that the alphabet used in this case has size 5, one more than n = 4, whereas an alphabet of the minimum size 5 is shown to suffice to encode P(5) as DDDDDCHHHCCDDCCCHCHCSHHSDSSDSSHSDDCH SSCHSHDHSCHSJCDC.
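A claimed ucycle of P(n) can be tested by reading off all cyclic windows of length n and relabeling each window in order of first appearance. The sketch below does this for the string abcbccccddcdeec quoted from [2]; the function names are ours and purely illustrative.

```python
def pattern(window):
    """Relabel symbols 1, 2, ... in order of first appearance."""
    labels = {}
    return tuple(labels.setdefault(c, len(labels) + 1) for c in window)

def encoded_partitions(cycle, n):
    """Patterns of all len(cycle) cyclic windows of length n."""
    doubled = cycle + cycle[:n - 1]
    return [pattern(doubled[i:i + n]) for i in range(len(cycle))]

p4 = encoded_partitions("abcbccccddcdeec", 4)
print(len(p4), len(set(p4)))  # 15 windows, all distinct (there are 15 partitions of [4])
print(pattern("dcde"))        # (1, 2, 1, 3), i.e. the partition 13|2|4
```

The same two functions can be applied to any other candidate string, with n set to the size of the ground set.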
The above example reflects tongue-in-cheek humor, since there are 52 partitions of [5] and the above ucycle has 13 cards of each suit, except that one spade has been replaced by a joker! The authors also ask how many partitions of P(n) using an alphabet of size N ≥ n exist. This question will also be of deep relevance to us, as alluded to in the later section on Open Problems.
As noted in [6], however, not much seems to be known about ucycles of the partitions P(n, k) of [n] into k parts (here |P(n, k)| = S(n, k), the Stirling number of the second kind). In [4], it was shown that for k = n − 1, ucycles exist if and only if n is odd. At the other end of the k-spectrum, the authors of [7] showed that for odd n, one could find a ucycle of partitions of [n] into two parts, and that an "asymptotically good universal packing" could be found for k = 3, i.e., that there was a string of length T(n, 3) […]
It is this work that we build on. In Section 2, we generalize the above result on asymptotically good universal packings (upackings) to the case of all fixed k as n → ∞, as well as to ucoverings, which are also shown to be "asymptotically good." Section 3 is devoted to showing that the transition digraph associated with P(n, k) is Eulerian. As noted in [2], this does not necessarily imply that universal cycles exist, since the digraph vertices represent equivalence classes of partitions. We use our result to prove, however, that ucycles of P(n, k) exist for all n ≥ 3 when k = 2 and for odd n when k = n − 1, the latter recovering the result in [4]. We also (re)prove that ucycles do not exist for even n when k = n − 1. Finally, we show that for even n, ucycles do not exist when S(n − 2, k − 2) is odd (3 ≤ k < n − 1). The last result shows, e.g., that ucycles of P(12, 6) or P(6, 3) do not exist. There are infinitely many such pairs (n, k). Moreover, the technique we exhibit in Section 3 has the potential to tease out many more results along these lines.

Universal Packings and Coverings of Partitions of [n] into k parts
One of the main results in [9] was that one could create a ucycle of all surjections from [n] to [k] iff n > k. Since there are k! surjections that yield the same set partition, we need to be more careful, and proceed by showing in Theorem 1 that for sufficiently large n, it is possible to ucycle partitions of [n] into k parts of distinct sizes. We represent such partitions as surjections f : […]
The fact that asymptotically good upackings exist is proved in Theorem 2, where we provide an alternative proof of the intuitively obvious fact that for any fixed k, the k-partitions of [n] with non-distinct part sizes form a vanishing fraction of all partitions into k parts as n → ∞.
Proof. Following the standard process, we create a digraph D in which the vertices are sequences of length n − 1 of numbers in {1, . . . , k} for which the addition of at least one number in {1, . . . , k} at the beginning or the end creates a sequence of length n which, using the special canonical format we have adopted, represents a partition of [n] into k parts of distinct sizes. For example, with n = 10, k = 4, 122333444 is a legal vertex, as is 123334444. On the other hand, 112233344 is not in the underlying graph. A vertex points towards another if its last n − 2 terms are the same as the first n − 2 terms of the second vertex. The edges of the digraph, labeled by vertex label concatenation, are thus sequences representing partitions of [n] into k parts of distinct sizes.
The problem of finding a ucycle is reduced to the problem of finding an Eulerian circuit in this digraph. We know that Eulerian circuits exist if the graph is both strongly connected and balanced, i.e., for each vertex v, the in- and out-degrees of v are equal. It is easy to show that if a digraph is balanced and weakly connected, then it is also strongly connected, so all we need to show is that our digraph is balanced and weakly connected. This approach is used, e.g., in [1].
Showing that D is balanced is simple: if a number from 1 through k can be added to the beginning of the sequence at a vertex, then it can also always be added to the end of the sequence, since it is only the numbers of 1's, 2's, . . . , and k's that actually matter in determining if an edge represents a partition into parts of distinct sizes. Therefore, the number of edges pointing away from a vertex will be the same as the number of edges pointing towards it. Note that the in- and out-degrees of any vertex (the common value is sometimes called the vertex degree), though equal, may be quite different at different vertices. For example, for k = 3 and n = 10, the vertex 123333333 has degree 1; the vertex 122333333 has degree 2; and the vertex 122233333 has degree 3. In general, one may write down a formula for deg(v) depending on the differences between the number of (i + 1)'s and the number of i's in v.
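The degrees quoted above can be recomputed with a few lines of Python. The sketch below assumes, as the examples in this section suggest, that the canonical format requires the number of i's to be strictly smaller than the number of (i + 1)'s for every i; this reading, the function name degree, and the restriction to single-digit symbols are our assumptions.

```python
from collections import Counter

def degree(vertex, k):
    """Number of symbols s in 1..k that can be added to the vertex.

    Assumption: an edge is legal when the counts of 1's, 2's, ..., k's
    are positive and strictly increasing. Symbols are single digits.
    """
    counts = Counter(int(c) for c in vertex)
    deg = 0
    for s in range(1, k + 1):
        sizes = [counts[i] + (1 if i == s else 0) for i in range(1, k + 1)]
        if sizes[0] >= 1 and all(a < b for a, b in zip(sizes, sizes[1:])):
            deg += 1
    return deg

for v in ("123333333", "122333333", "122233333"):
    print(v, degree(v, 3))   # degrees 1, 2 and 3 respectively
```

Under the same assumption, degree("112233344", 4) is 0, in agreement with the remark that this sequence is not a vertex of D when n = 10 and k = 4.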
To show that the digraph is weakly connected, we need only show that it is possible to reach a designated target vertex from any other starting vertex. We will let this target vertex be the one consisting of two 2's, three 3's, . . . , and k − 1 copies of k − 1, in that order, with all of the remaining numbers being k's. For example, for n = 27 and k = 6, the target vertex of length 26 will be 22333444455555666666666666. In fact, we will show something stronger, namely that one can traverse from any edge to the edge 122333 . . . (k − 1) . . . (k − 1)k . . . k, from whence we may reach the target vertex in a single step. Notice that such edges represent legal partitions in P(n, k) only if n ≥ n_0 := k(k + 1)/2.
Key to our algorithm on how to reach one edge from another is the process of "switching" numbers. For example, we can write out all of the steps to go from the edge 33132323 to the edge 33122323 as follows 33132323 → 31323233 → 13232333 → 32323331 → 23233312 → 32333122 → 23331223 → 33312232 → 33122323, or, we can skip all the steps of "rotating around" and just say that we "switched" the 3 to the right of the 1 into a 2. Note that "rotating" is always legal but switching in the above form might not always be, even in several steps. We need to have the "room to maneuver around" while maintaining edge-integrity.
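The "rotate and switch" walk just described is easy to simulate. In the sketch below, the function name switch_by_rotation and its arguments are ours; the code only generates the sequence of edges and does not check that each intermediate edge is legal, which is exactly the point that needs care in the argument.

```python
def switch_by_rotation(edge, pos, new_symbol):
    """Edges visited while rotating once around and replacing edge[pos].

    At each step the first symbol is dropped and re-appended at the end,
    except at step `pos` (0-based), where `new_symbol` is appended instead.
    """
    path = [edge]
    current = edge
    for step in range(len(edge)):
        dropped = current[0]
        appended = new_symbol if step == pos else dropped
        current = current[1:] + appended
        path.append(current)
    return path

# Switch the 3 immediately to the right of the 1 (index 3) into a 2.
print(" -> ".join(switch_by_rotation("33132323", 3, "2")))
```

The printed walk consists of the nine edges listed in the text, from 33132323 to 33122323.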
We will show that the only requirement to reach the designated target edge from any other edge is to have the ability to "switch" any j ∈ {2, 3, . . . , k − 1} into a k and vice versa (possibly through several steps), and that this is equivalent to having n ≥ n_1 := k(k + 1)/2 + (k − 1) = (k + 4)(k − 1)/2 + 1, where the "extra" (k − 1) digits give us the needed room to maneuver around.
In general, our algorithm to reach the target sequence will then be to first, if we have more than one 1, change all extra 1's into k's. We will then underline the single remaining 1 as something we won't touch again. Next, we will switch the number to the right of the 1 into a k, possibly in multiple steps, and then the k back into a 2, again possibly through multiple steps. We will now underline the 1 and the 2 together, as something we won't touch again. Next we will consider the next number to the right of this 2 and switch it to a k, and then switch back from the k into a 2 again, then underline the sequence 122, and switch all remaining 2's in the sequence into k's. Next, we consider the number to the right of the second 2, switch it into a k, and then switch from the k back into a 3, then block off 1223, etc., until we reach the target sequence. The following example illustrates the general technique.
Let n = 19, k = 5. Suppose we begin with the sequence 1432543552543455435 with PSV = (1, 2, 4, 5, 7), the vector recording the numbers of 1's, 2's, . . . , 5's. There are no extra 1's to change into 5's. Then we can first change the 4 to the right of the 1 into a 5, but in order to do this, we must create space between the numbers of 3's and 4's by changing one 3 into a 5; we thus arrive at 1452543552543455435 with PSV = (1, 2, 3, 5, 8). The 4 to the right of the 1 can now be changed into a 5 to get 1552543552543455435 with PSV = (1, 2, 3, 4, 9). We now need to change the 5 to the right of the 1 back into a 2 by creating space between the number of 4's and 3's and then between the number of 3's and 2's. This leads to […]
This process works in general since, given an edge of weight n_1 := (k + 4)(k − 1)/2 + 1, the sums of the gaps between the components of the PSV may be as low as k, corresponding to the PSV (1, 3, 4, . . . , k + 1), or as high as 2k − 1, corresponding to the PSV (1, 2, . . . , k − 1, 2k − 1). Switching numbers, possibly in multiple steps, is always possible whenever there is a gap of two somewhere in the PSV sequence, which is guaranteed by the choice of n_1.
Question: Does a better algorithm allow for a smaller threshold n?
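The bookkeeping in this example can be reproduced as follows. The sketch reads PSV as the vector of symbol counts, which is our interpretation based on the three vectors quoted above; the function names psv and n1 are also ours.

```python
from collections import Counter

def psv(seq, k):
    """Vector of the numbers of 1's, 2's, ..., k's in seq (single-digit symbols)."""
    counts = Counter(seq)
    return tuple(counts[str(i)] for i in range(1, k + 1))

def n1(k):
    """Threshold n_1 = (k + 4)(k - 1)/2 + 1; the product is always even."""
    return (k + 4) * (k - 1) // 2 + 1

for s in ("1432543552543455435", "1452543552543455435", "1552543552543455435"):
    print(s, psv(s, 5))

k = 5
low = (1,) + tuple(range(3, k + 2))      # (1, 3, 4, ..., k + 1)
high = tuple(range(1, k)) + (2 * k - 1,) # (1, 2, ..., k - 1, 2k - 1)
print(n1(k), sum(low), sum(high))        # all three equal 19 for k = 5
```

In particular, n = 19 is exactly the threshold n_1 for k = 5, so the worked example sits at the boundary case.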
The next theorem shows that there exists a ucovering of all partitions of [n] into k parts if n is large enough; for simplicity we let the threshold n be the same as in Theorem 1. This is because any partition may be represented by a surjection satisfying the conditions of Theorem 2, though there may be multiple such representations when two or more of the part sizes are equal.
Proof. Exactly the same as that of Theorem 1, except that the algorithm may terminate faster.
Proof. Our proof will reveal that T(n, k)/S(n, k) […], where T(n, k) and U(n, k) are the lengths of the ucycles in Theorems 1 and 2 respectively. As pointed out by Professor László Székely, however, both these results are special cases of asymptotic results found in [5], where a threshold of k = n^{1/5} is seen to hold for the property "partitions of size k with distinct parts form a 'high' fraction of all partitions of [n] into k parts." Thus such values of k can serve to improve the conclusion of Corollary 3. Our proof is somewhat different, however. Note that T(n, k)/S(n, k) = 1 − Sa(n, k)/S(n, k), where Sa(n, k) denotes the number of partitions in which two or more parts are equal in size. Reframing the question in terms of distributing n distinguishable balls into k distinct boxes, we would like to calculate P(∪_{i,j} I_{i,j}), where I_{i,j} = {B_i = B_j} is the event that boxes i and j contain the same numbers of balls, B_i and B_j respectively. By symmetry, we observe that […] (1)
Setting β(A) = 0, we find easily that A = 1/k. The next step is to show that […] and thus that for any ε > 0, […] The lemma follows.
We now return to (1) and see that for a ϕ(n) to be determined, […] We will next use Stirling's approximation N! ∼ √(2πN) (N/e)^N at various points. Note that whenever N = n/k + o(n) we have that √(2πN) = Θ(√n). Accordingly, we first see that for some constant A, […]
Thus the second part of (2), when using n/k + ϕ(n), is bounded above by […], which tends to zero provided that ϕ(n) = nψ(n) for any ψ(n) → ∞ (noting that k ≥ 3 is fixed). Finally, it is easy to verify that the second part of (2) tends to zero if we consider f(n/k − ϕ(n)) as well. This completes the proof.
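For small parameters the ratio T(n, k)/S(n, k) can simply be computed, which gives a numerical feel for the asymptotic statement. The sketch below is not part of the proof: it counts partitions of [n] into k blocks of pairwise distinct sizes by choosing the blocks in increasing order of size, and computes S(n, k) from the standard recurrence. All function names are ours.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via S(n,k) = k S(n-1,k) + S(n-1,k-1)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

def distinct_size_partitions(n, k):
    """Number of partitions of [n] into k blocks of pairwise distinct sizes."""
    def rec(remaining, parts_left, min_size):
        if parts_left == 0:
            return 1 if remaining == 0 else 0
        total = 0
        for size in range(min_size, remaining + 1):
            # Choose the elements of the next (strictly larger) block.
            total += comb(remaining, size) * rec(remaining - size, parts_left - 1, size + 1)
        return total
    return rec(n, k, 1)

for n in (6, 12, 24, 48):
    t, s = distinct_size_partitions(n, 3), stirling2(n, 3)
    print(n, t, s, t / s)
```

For instance, the only admissible size vector for n = 6 and k = 3 is (1, 2, 3), which gives 6!/(1! 2! 3!) = 60 of the S(6, 3) = 90 partitions.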

Universal Cycles of Partitions P(n, k) of [n] into k parts
As in the previous section, we encode a k-partition of [n] as a string of length n containing k symbols, where i and j are in the same subset of the partition if and only if the ith character in the string is the same symbol as the jth character. Since the cases k = 1 and k = n are trivial, we always assume that 2 ≤ k < n. For convenience, we use {1, 2, . . . , k} as our alphabet. We refer to an encoding of a partition as a representation of that partition. Note that each k-partition of [n] has k! different representations.
Following methods outlined in [2], we construct a transition digraph G_{n,k} for P(n, k) as follows. Let the set of vertices of G_{n,k} be the set of all k- and (k − 1)-partitions of [n − 1]. There is an edge between two vertices v and w of G_{n,k} if and only if w can immediately follow v in a ustring of k-partitions of [n]. That is, there is an edge from v to w if and only if the last n − 2 symbols of a representation for v match the first n − 2 symbols of a representation for w and the string formed by overlaying these two representations at their shared substring of length n − 2 is a representation of a k-partition of [n]. Observe that each vertex that is a k-partition of [n − 1] will have indegree = outdegree = k, and each vertex that is a (k − 1)-partition of [n − 1] will have indegree = outdegree = 1. As an example, G_{5,3} is shown in Figure 1, with all vertices labeled with the representation having symbols appearing in the order 123.
Now, the edges of G_{n,k} are precisely the k-partitions of [n]. Furthermore, a partition p_1 can follow another partition p_2 in a ustring for P(n, k) if and only if the vertex at the tail of p_1 is also at the head of p_2. Thus, there is a bijection between the Eulerian cycles of G_{n,k} and the ustrings of P(n, k).
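The construction of the transition digraph is short enough to carry out by brute force for small n and k. In the sketch below each vertex and edge is stored as the representation whose symbols first appear in the order 1, 2, 3, . . . ; the function names are ours, and the enumeration over all k^n strings is only practical for small cases.

```python
from collections import Counter
from itertools import product

def canon(s):
    """Representation with symbols relabeled in order of first appearance."""
    labels = {}
    return tuple(labels.setdefault(c, len(labels) + 1) for c in s)

def k_partitions(n, k):
    """Canonical representations of all k-partitions of [n]."""
    words = product(range(1, k + 1), repeat=n)
    return sorted({canon(w) for w in words if len(set(w)) == k})

def transition_digraph(n, k):
    """Edges of G_{n,k} as (tail vertex, head vertex, k-partition of [n])."""
    return [(canon(p[:-1]), canon(p[1:]), p) for p in k_partitions(n, k)]

edges = transition_digraph(5, 3)
outdeg = Counter(tail for tail, _, _ in edges)
indeg = Counter(head for _, head, _ in edges)
print(len(edges), len(outdeg))          # 25 edges on 13 vertices
print(outdeg == indeg)                  # the digraph is balanced
print(sorted(set(outdeg.values())))     # degrees 1 and 3 only
```

For G_{5,3} this gives S(5, 3) = 25 edges on S(4, 3) + S(4, 2) = 13 vertices, with degree 3 at the 3-partitions of [4] and degree 1 at the 2-partitions.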
Theorem 5. Let n, k ∈ Z+ with 2 ≤ k < n, and let G_{n,k} be the transition digraph for P(n, k). Then G_{n,k} has an Eulerian cycle.
Proof. G_{n,k} is balanced as remarked above, so we must show that G_{n,k} is weakly connected. To do so, we show that there exists a path from any vertex of G_{n,k} to the vertex w with representation (1, 2, . . . , k − 1, k, k, . . . , k). Accordingly, let u be a vertex of G_{n,k}. We describe an algorithm for obtaining a path from u to w. We first find a path from u to a vertex v which ends in k distinct symbols. We may arrive at such a vertex in k − 1 steps by a path u = v_1, v_2, . . . , v_k where, for i = 1, 2, . . . , k − 1, we choose v_{i+1} to be a vertex connected to v_i such that the representations of v_{i+1} end in i + 1 distinct symbols. Note that choosing v_{i+1} this way is always possible: v_i will have representations ending in i distinct symbols, and if outdegree(v_i) = 1 then the only possible choice for v_{i+1} has representations formed by adding the missing symbol of each representation of v_i to its last n − 2 symbols (the case where outdegree(v_i) = k is clear). Now, v_k has representations ending in k distinct symbols, so for any path of length (n − 1) − k starting at v_k, each vertex on the path will have outdegree k. Thus, there exists a path v_k, v_{k+1}, . . . , v_{n−1}, where v_{k+j} has representations whose last j + 1 symbols are all the same (j = 0, 1, . . . , (n − 1) − k). Then, by construction, v_{n−1} = w. Hence, G_{n,k} is weakly connected and so it follows that G_{n,k} contains an Eulerian cycle.
Hence, we know that Eulerian cycles exist in G_{n,k}, and therefore ustrings of P(n, k) exist as well. However, there may be ustrings which cannot be turned into ucycles, which occurs when the representations of the first and last partitions do not overlap correctly, i.e., they have their symbols permuted. This idea is illustrated in the Eulerian cycle in Figure 2: if we start with the representation 123 of the first vertex, then this Eulerian cycle represents the ustring 123312132, which cannot be turned into a ucycle. This example shows another important concept: once we choose the first representation to use, all other representations used are uniquely determined by the given Eulerian cycle. These observations motivate the following definitions.
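The obstruction in this example can be seen directly from the string. The sketch below treats 123312132 as a ustring for P(4, 3), which is our inference from its length (six windows of length 4, and S(4, 3) = 6); the helper names are ours.

```python
def canon(s):
    """Representation with symbols relabeled in order of first appearance."""
    labels = {}
    return tuple(labels.setdefault(c, len(labels) + 1) for c in s)

def windows(s, n):
    """All (non-cyclic) windows of length n."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]

ustring = "123312132"
n = 4
parts = [canon(w) for w in windows(ustring, n)]
first, last = ustring[:n - 1], ustring[-(n - 1):]

print(len(parts), len(set(parts)))     # 6 windows, 6 distinct partitions
print(canon(first) == canon(last))     # same vertex at both ends ...
print(first == last)                   # ... but different representations
```

The walk returns to its starting vertex, yet the final representation 132 is the initial representation 123 with two symbols exchanged, so the ends of the string cannot be glued together.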

Definition 6.
Suppose v is a vertex in G_{n,k} and r is a representation for v. Form a new string r′ from r by deleting all but the first occurrence of each symbol from r and appending the missing symbol to the end if v is a (k − 1)-partition of [n − 1] […]
Definition 7. Consider an edge vw for some vertices v and w of G_{n,k}. Fix a representation r_v of v and suppose it has relative order π_v. Suppose the corresponding representation of w is r_w with relative order π_w. Then π_w π_v^{−1} is called the associated permutation of the edge vw.
Remark 8. We have defined the associated permutation as the π ∈ S_k such that π π_v = π_w, so that this definition is independent of the choice of representation of v.
The graph G_{5,3} is shown again in Figure 3 with edges labeled with their associated permutations (expressed in cycle notation with fixed points suppressed).
From this definition, we get the following characterization.
Theorem 10. An Eulerian cycle E = e_1, e_2, . . . , e_{S(n,k)} in G_{n,k} can be lifted to a ucycle of P(n, k) if and only if its permutation product is the identity.
Proof. Fix a representation r of the vertex at the tail of e_1 and suppose r has relative order τ. E can be lifted to a ucycle if and only if we arrive back at r at the end of the cycle, and going through E is equivalent to applying the permutation product to τ.
Now, we show that the associated permutation of an edge is completely determined by the vertex at its "tail", and that only certain permutations can be associated permutations.
Lemma 11. Let vw_1 be an edge in G_{n,k}, and suppose vw_1 has associated permutation π. Then π has the form (1 j j−1 · · · 2) for some 1 ≤ j ≤ k, and if vw_2 is another edge from v, then vw_2 has associated permutation π as well.
Proof. Let r = v_1 v_2 · · · v_{n−1} be the representation of v with relative order 12 · · · k. Then there are representations r_1 = v_2 · · · v_{n−1} u_1 and r_2 = v_2 · · · v_{n−1} u_2 of w_1 and w_2 corresponding to r under vw_1 and vw_2, respectively. Suppose that j − 1 distinct symbols appear in r after v_1 (= 1) and before a second appearance of 1 (1 may only occur once in r). Since the first n − 2 characters of both r_1 and r_2 are the same as the last n − 2 characters of r, it follows that r_1 and r_2 both have relative order 23 · · · j 1 j+1 j+2 · · · k. Hence, the associated permutations of vw_1 and vw_2 are both (1 j j−1 · · · 2).
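Since a cycle of length j is an odd permutation exactly when j is even, Lemma 11 makes it easy to count the edges of G_{n,k} whose associated permutation is odd. The sketch below is a brute-force illustration under our reading of the lemma: j is the number of distinct symbols seen up to the second occurrence of the first symbol, and j = k when the first symbol does not recur. The function names are ours.

```python
from itertools import product

def canon(s):
    """Representation with symbols relabeled in order of first appearance."""
    labels = {}
    return tuple(labels.setdefault(c, len(labels) + 1) for c in s)

def vertices(n, k):
    """Vertices of G_{n,k}: k- and (k-1)-partitions of [n-1], in canonical form."""
    words = product(range(1, k + 1), repeat=n - 1)
    return sorted({canon(w) for w in words if len(set(w)) >= k - 1})

def cycle_length(vertex, k):
    """The j of Lemma 11 for a vertex given in canonical form."""
    if 1 not in vertex[1:]:
        return k                      # the first symbol never recurs
    second = vertex.index(1, 1)       # position of the second 1
    return len(set(vertex[:second]))  # the 1 plus j - 1 other symbols

def odd_edge_count(n, k):
    """Number of edges of G_{n,k} with an odd associated permutation."""
    total = 0
    for v in vertices(n, k):
        outdeg = k if max(v) == k else 1
        if cycle_length(v, k) % 2 == 0:   # a j-cycle is odd iff j is even
            total += outdeg
    return total

for n, k in ((3, 2), (4, 3), (5, 3), (5, 4), (6, 3)):
    print(n, k, odd_edge_count(n, k))
```

For k = 2 the count is always even, since the only vertex of outdegree 1 carries the identity and every other vertex contributes two edges with the same permutation.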
Theorem 12. For n ≥ 3, every Eulerian cycle of G_{n,2} can be lifted to a ucycle.
Proof. Observe that the vertex with representation 11 · · · 1 is the only vertex of outdegree 1 and that the edge coming out of this vertex has the identity as its associated permutation. All other vertices have outdegree 2, and the two edges originating from any particular vertex both have the same associated permutation by Lemma 11. In particular, there is an even number of (12) permutations. Since S_2 is abelian, the permutation product of an Eulerian cycle will be the identity, so that the result follows by Theorem 10.
Proof. This follows directly from Theorem 5 and Theorem 12.
We can also use the permutation product to determine cases when ucycles do not exist. The easiest way for this to occur is if the multiset consisting of all associated permutations in G n,k contains an odd number of odd permutations since this ensures that there is no ordering of the associated permutations which multiplies to the identity.
Definition 14. We call the multiset consisting of all associated permutations in G_{n,k} the permutation multiset of G_{n,k}.
Definition 15. Let O be the multiset which contains all odd permutations of the permutation multiset of G_{n,k}. Define the parity function by Par(n, k) = 0 if |O| ≡ 0 mod 2, and Par(n, k) = 1 if |O| ≡ 1 mod 2.
Lemma 16. If Par(n, k) = 1, then there does not exist a ucycle of P(n, k).
The following lemma gives a recursive formula for calculating Par(n, k).
Lemma 17. The function Par(n, k) satisfies the following recurrence relation:
Par(n, k) ≡ k · Par(n − 1, k) + Par(n − 1, k − 1) + S(n − 2, k − 2) mod 2 (3)
with initial conditions Par(n, 2) = 0 for all n, and Par(n, n − 1) = 1 if n ≡ 0 mod 4 and Par(n, n − 1) = 0 otherwise.
Proof. We establish a relationship between the permutation multiset of G_{n,k} and those of G_{n−1,k} and G_{n−1,k−1}. Suppose v is a vertex in G_{n,k}, so v represents a k- or (k − 1)-partition p of [n − 1]. We consider the edge e_p which represents p in either G_{n−1,k} or G_{n−1,k−1}. We know that the associated permutation of e_p is determined by the location of the second occurrence of the first symbol in a representation r_p of the vertex w_p at the tail of e_p by Lemma 11. First, suppose v represents a k-partition of [n − 1]. If the first symbol does actually occur for a second time in r_p, then since there is a representation of v whose first n − 2 characters are precisely r_p, it follows that e_p has the same associated permutation as all the edges coming from v. If the first symbol of r_p does not occur a second time, then e_p has associated permutation (1 k k−1 · · · 2). If w_p has outdegree 1, then the representations of v do not have a second occurrence of their first symbols, and so all edges from v have associated permutation (1 k k−1 · · · 2). If w_p has outdegree k, then the representations of v have an occurrence of all symbols before a second occurrence of the first symbol, so we get that the edges from v have associated permutation (1 k k−1 · · · 2) again. Since each vertex of G_{n,k} which represents a k-partition of [n − 1] has outdegree k, we get the term k · Par(n − 1, k). Now, suppose v represents a (k − 1)-partition of [n − 1]. Case 1: the first symbol of v appears a second time. Then either the first symbol of r_p appears a second time, or the first symbol of r_p is appended by following e_p (if the second appearance in v is at the last character).
If the first symbol of r_p appears a second time, then by previous reasoning e_p has the same associated permutation as all edges from v. If the first symbol of r_p does not occur a second time, then e_p and the edges from v all have associated permutation (1 k k−1 · · · 2). Case 2: the first symbol of v does not appear a second time. Then the first symbol of r_p does not appear a second time, and so e_p must have associated permutation (1 k−1 k−2 · · · 2). However, in this case v has associated permutation (1 k k−1 · · · 2). Note that since the first symbol of v does not appear a second time, the last n − 2 characters of v represent a (k − 2)-partition of [n − 2], so this case occurs exactly S(n − 2, k − 2) times. Thus, we have S(n − 2, k − 2) partitions that either switch from even to odd or odd to even; in either case adding S(n − 2, k − 2) affects the parity in the desired manner.
Thus, each vertex in G_{n,k} which represents a (k − 1)-partition of [n − 1] has the same associated permutation as it does in the graph G_{n−1,k−1}, except for S(n − 2, k − 2) permutations which change sign. Since each such vertex has outdegree 1, we get the Par(n − 1, k − 1) + S(n − 2, k − 2) term.
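Recurrence (3) together with its initial conditions can be evaluated directly. The following Python sketch does so with memoization; the function names par and stirling2 are ours, and the code is only a transcription of the statement of Lemma 17, not an independent proof.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind via S(n,k) = k S(n-1,k) + S(n-1,k-1)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)

@lru_cache(maxsize=None)
def par(n, k):
    """Par(n, k) from recurrence (3), for 2 <= k < n."""
    if not 2 <= k < n:
        raise ValueError("par is defined for 2 <= k < n")
    if k == 2:
        return 0
    if k == n - 1:
        return 1 if n % 4 == 0 else 0
    return (k * par(n - 1, k) + par(n - 1, k - 1) + stirling2(n - 2, k - 2)) % 2

# Pairs (n, k) with Par(n, k) = 1, for which Lemma 16 rules out a ucycle.
print([(n, k) for n in range(3, 11) for k in range(2, n) if par(n, k) == 1])
```

For example, par(6, 3) evaluates to 3 · 0 + 0 + S(4, 1) = 1 mod 2, in line with the nonexistence of a ucycle of P(6, 3) mentioned in the Introduction.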
(d) Throughout this paper, we have insisted on having the alphabet size equal k. How do our results change if we relax this condition?

Acknowledgments
The research of all the authors was supported by NSF Grant 1263009.