On generalizations of separating and splitting families

The work in this article is concerned with two different types of families of finite sets: separating families and splitting families (they are also called "systems"). These families have applications in combinatorial search, coding theory, cryptography, and related fields. We define and study generalizations of these two notions, which we have named $n$-separating families and $n$-splitting families. For each of these new notions, we outline their basic properties and connections with the well-studied notions. We then spend the greatest effort obtaining lower and upper bounds on the minimal size of the families. For $n$-separating families we obtain bounds which are asymptotically tight within a linear factor. For $n$-splitting families this appears to be much harder; we provide partial results and open questions.


§1. INTRODUCTION
Separating families, also called separating systems, play a major role in several areas of applied combinatorics. Before discussing this motivation, let us record the definition: If X is a finite set and A, B ⊆ X, we say that A separates B if both $A \cap B \neq \emptyset$ and $A^c \cap B \neq \emptyset$. We say that F ⊆ P(X) is a separating family if for all B ⊆ X with |B| ≥ 2 there exists A ∈ F such that A separates B. In fact, it is equivalent to consider separating only pairs b ⊆ X, as we prove in Section 2.
Separating families were first studied in [Rén61] in connection with probabilistic questions about boolean algebras. Separating families have found applications in many areas, including combinatorial search, boolean logic, and coding theory. There are numerous similar notions of families that also have significant applications. For most efficient use in algorithms and applications to extremal questions, families of minimum size are desirable, so work has been done on bounding the minimum sizes of various types of separating and splitting families.
Numerous generalizations of the notion of a separating family have been studied in a variety of contexts. One of the most important is the following: F is an (i, j)-separating family if for all P, Q ⊆ X such that |P| ≤ i and |Q| ≤ j, there exists A ∈ F such that P ⊆ A and Q ∩ A = ∅ or vice versa. Thus ordinary separating is equivalent to (1, 1)-separating. Applications of (i, j)-separating families arise in automata theory; see, for instance, [Har65]. Bounds on the minimum size of (i, j)-separating families given by [FK84] are used to produce bounds on the minimum sizes of the new kinds of families introduced in this paper.
We also consider splitting families, which are closely related to separating families. Let us record the definition of a splitting family. If X is a finite set and A, B ⊆ X, we say A splits B if |A ∩ B| = ⌊|B|/2⌋ or |A ∩ B| = ⌈|B|/2⌉. We say that F ⊆ P(X) is a splitting family if for all B ⊆ X there exists A ∈ F such that A splits B. We allow the rounding to go either way for convenience. Some other authors have stricter rounding rules; see, for example, [RH12].
Splitting families have a slightly less illustrious history than separating families. They first appeared in Coppersmith's algorithm, which efficiently computes the discrete logarithm in the low Hamming weight case [Sti02]. Coppersmith's algorithm only requires families that split sets of a specific size. Such families are studied in more detail in [LLvR04] and [DSL+07]. As far as we know, families that split all subsets of [k] have not been previously studied.
In this paper, we define and study generalizations of separating families and splitting families, which we call n-separating families and n-splitting families respectively. We provide bounds on their minimum sizes and establish relationships with other notions of separating and splitting.
We describe an application of the generalization of separating families to error-correcting codes, and we are confident that both generalizations will find other applications similar to those of separating families and splitting families. If X is a finite set and F ⊆ P(X), then F is an n-separating family if and only if for all separable collections $B_1, B_2, \dots, B_n \subseteq X$, there exists A ∈ F that separates each $B_i$. We discuss which collections of sets are separable in Section 3.
Similarly, if F ⊆ P(X), then F is an n-splitting family if and only if for all splittable collections $B_1, B_2, \dots, B_n \subseteq X$, there exists A ∈ F that splits each $B_i$. In each of the previous definitions, we defined F to be a collection of subsets of the finite set X. For the rest of this article, we let X = [k] = {1, 2, . . . , k} for convenience. We discuss partial results on which collections of sets are splittable in Section 4, with a complete characterization in the case n ≤ 3.
We now briefly outline the conclusions proved in this paper. After discussing which collections of sets are separable, we define n-separating families and establish the relationship between n-separating families and the well-studied notion of (i, j)-separating families. We then establish the following lower and upper bounds on the minimal size of an n-separating family.
Theorem. The minimal size of an n-separating family is $\Omega(2^n \log k)$ and $O(n 2^n \log k)$.
Splitting families turn out to be more challenging to handle than separating families. We give a characterization of splittability only for n ≤ 3. We provide a new lower bound on the size of a splitting family in Section 4. We also establish an upper bound on the minimal size of a 2-splitting family.
Theorem. The minimal size of a 2-splitting family is $\Omega(k)$ and $O(k^2)$.
Unfortunately the proof of the above theorem does not immediately generalize to the case of n-splitting. But if the key result Theorem 4.10 can be generalized to this case, then we would obtain the following.

Conjecture. The minimal size of an n-splitting family is $O(f(n)\, k^{n/2+1})$.
Here we say A separates B. We show that it is in fact enough to only consider separating pairs.

Proposition 2.2. A family F ⊆ P(X) is a separating family if and only if for every pair {x, y} ⊆ X there exists A ∈ F that separates {x, y}.
Proof. Clearly if F separates all sets of cardinality at least two then it separates pairs. Conversely, suppose F separates pairs, and consider B ⊆ X with |B| ≥ 2. Choose x, y ∈ B with x ≠ y; then {x, y} is a pair, so there exists A ∈ F such that both $|A \cap \{x, y\}| = 1$ and $|A^c \cap \{x, y\}| = 1$, and hence A separates B.
Separating families are most efficiently used in applications if they are of minimum size. Many of the results in this paper deal with giving bounds on the minimum sizes of various types of families. The minimum size of an ordinary separating family is well-known and typically attributed to Rényi.
Theorem 2.3. If F is a separating family over [k] of minimal size, then |F| = ⌈log k⌉.
Note that while we are mostly concerned with the finite case and the previous theorem is stated in terms of the finite number k, it also holds true for any cardinal κ, with ⌈log k⌉ replaced by $\min\{\lambda : 2^{\lambda} \ge \kappa\}$. The method of proof for Theorem 2.3 relies on the matrix representation of separating families, which we now describe.
In other words, M represents F if and only if its rows are precisely the characteristic vectors of the elements of F. Since F is unordered, the matrix representation is well-defined only up to permutations of its rows.
We are now in a position to prove the following lemma.
Conversely, suppose M has distinct columns. Then for every pair {i, j} the columns i, j disagree at some row, say row r, in which case the set corresponding to row r separates {i, j}.
Proof of Theorem 2.3. Let M be the matrix representing F, and let m be its number of rows. Since there are only $2^m$ binary strings of length m, the k columns of M can be distinct only if $2^m \ge k$, that is, only if M has at least ⌈log k⌉ rows. Furthermore, this suffices, since $2^{\lceil \log k \rceil} \ge k$: we may simply choose k distinct columns of length ⌈log k⌉ and put them into a matrix M which represents a separating family over [k] of size ⌈log k⌉.
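The construction in this proof can be sketched in Python. This is only an illustration under stated assumptions: the function names `binary_separating_family` and `separates` are hypothetical, and the k distinct columns are chosen to be the binary expansions of 0, . . . , k − 1, which is one convenient choice among many.

```python
def binary_separating_family(k):
    """Return a separating family over [k] = {1, ..., k} with ceil(log2 k) sets.

    Assumption for illustration: column x of the matrix is the binary
    expansion of x - 1, so all k columns are distinct.
    """
    m = (k - 1).bit_length()  # equals ceil(log2 k) for every k >= 1
    # Row r is the set of elements whose index x - 1 has bit r set.
    return [frozenset(x for x in range(1, k + 1) if (x - 1) >> r & 1)
            for r in range(m)]


def separates(a, b):
    """A separates B when B meets both A and the complement of A."""
    return bool(b & a) and bool(b - a)
```

For instance, `binary_separating_family(8)` consists of three sets, and every pair of distinct elements of [8] is separated by at least one of them.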
There is an algorithm to recognize separating families that runs in linear time in the size of the matrix representing the family.
Proof. Given $M \in M_{m,k}$, we first sort the columns of M using radix sort. We then check whether any pair of adjacent columns are equal. If any adjacent pair of columns is identical, then the family is not separating; otherwise it is separating. Both the sort and the comparisons require O(mk) bit comparisons.
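The recognition procedure can be sketched as follows. This is a minimal sketch, not the linear-time algorithm itself: it uses Python's built-in comparison sort rather than radix sort, so its running time is O(mk log k) instead of O(mk). The name `is_separating` and the list-of-rows input format are assumptions made for the example.

```python
def is_separating(matrix):
    """Decide whether an m x k 0/1 matrix (a list of m rows) has distinct columns.

    Sort the columns, then compare adjacent ones. A comparison sort stands in
    for the radix sort mentioned in the text.
    """
    columns = sorted(zip(*matrix))  # each column becomes a tuple of m bits
    return all(left != right for left, right in zip(columns, columns[1:]))
```

A matrix whose columns are 00, 01, 10, 11 is accepted, while any matrix with a repeated column is rejected.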
We now describe a classification of separating families.
Definition 2.8. Two separating families F and G are said to be equivalent (written F ≡ G) if and only if G can be obtained from F by means of the following operations:
• replace an element of F with its complement;
• permute the elements of [k].
Since both of these operations are invertible, they generate a natural group acting on the collection of separating families over [k] of cardinality m. This group is a natural quotient of the following group acting on m × k matrices:
Definition 2.9. The group $CCR_{m,k}$, or simply CCR if m and k are understood, is the permutation group acting on $M_{m,k}$ and generated by the following operations:
• Complementation: replace any row r with 1 − r (corresponds to taking the complement of an element of F; here 1 denotes the vector of 1's);
• Column permutation (corresponds to permuting the elements of [k]); and
• Row permutation (corresponds to reordering the elements of F).
Since separating families are unordered, the CCR group modulo row permutations acts on separating families in a natural way, and two separating families are equivalent if and only if they lie in the same orbit of this action.
To classify separating families up to equivalence, it turns out to be very useful to represent such families as subsets of a Hamming cube. We denote the m-dimensional Hamming cube by $Q_m$, and its symmetry group by $H_m$. For $\pi \in CCR_{m,k}$ let $\bar{\pi}$ denote its image in G under projection. Note that G is generated by the following elements:
• $\sigma_{ij}$, where $\sigma_{ij}$ interchanges rows i and j.
We define φ on this set of generators. Note that a rotation of $Q_m$ is uniquely determined by a permutation of the coordinates of its vertices, so $H_m$ is generated by rotations which interchange two coordinates and reflections across coordinate hyperplanes; let $r_{ij}$ be the rotation interchanging coordinates i and j, and $c_i$ be the reflection across the hyperplane normal to the ith coordinate axis.
It remains to verify that (φ, f) is an equivariance. Let M ∈ M and π ∈ G. It is easy to show that f(πx) = φ(π)f(x), which completes the proof.
The number of distinct subsets of the m-cube up to cube symmetry, and hence the values of sep(m, k), can be computed using Pólya theory. For two such calculations, see [Che93] and [HH68].
Unfortunately the formulas these articles produce each have an exponential number of terms. This leads us to conjecture that the following question has an affirmative answer.
Question 2.14. Is the computation of the values of sep(m, k) NP-hard?
For more on the values of sep(m, k), see the OEIS sequence A039754, and the monograph [Har65].

§3. n-SEPARATING FAMILIES
In this section we define the concept of n-separating families and discuss separability. We also investigate an application of n-separating families to error-correcting codes. We then define (i, j)-separating families and describe their relationships with n-separating families. As a consequence of this relationship with (i, j)-separating, we obtain a lower bound on the minimum size of an n-separating family by applying a result from [FK84]. We then describe an application of (n, 1)-disjunct families to combinatorial search and obtain a lower bound on the size of an (n, 1)-separating family. We conclude this section by providing a probabilistic upper bound on the minimum size of an n-separating family.
It is first necessary to characterize separable collections of sets. Indeed, there exist collections that cannot simultaneously be separated by a single set. For example, {1, 2}, {2, 3}, {3, 1} cannot be simultaneously separated by any set.
is a 2-colouring of G, thereby constructing the desired graph.
Remark 3.4. It is worth mentioning that a collection $B_1, \dots, B_n$ is separable if and only if, when the collection is viewed as a hypergraph, it is 2-colorable. The problem of recognizing hypergraph 2-colorability is known to be NP-complete; see [Lov73].
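When every set in the collection is a pair, the hypergraph is an ordinary graph and 2-colorability is just bipartiteness, which can be tested by breadth-first search. The following Python sketch illustrates this special case only; the function name `separator_for_pairs` is hypothetical, and the general hypergraph problem is not addressed.

```python
from collections import deque


def separator_for_pairs(pairs):
    """Return a set A containing exactly one element of every pair,
    or None if the graph with these pairs as edges is not bipartite."""
    adjacency = {}
    for x, y in pairs:
        adjacency.setdefault(x, set()).add(y)
        adjacency.setdefault(y, set()).add(x)
    colour = {}
    for start in adjacency:
        if start in colour:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adjacency[u]:
                if v not in colour:
                    colour[v] = 1 - colour[u]  # neighbours get opposite colours
                    queue.append(v)
                elif colour[v] == colour[u]:
                    return None  # odd cycle found: not separable
    return {v for v, c in colour.items() if c == 1}
```

On the collection {1, 2}, {2, 3}, {3, 1} the function returns `None`, in agreement with the example above; on a path such as {1, 2}, {2, 3}, {3, 4} it returns one colour class.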
We now provide a constructive upper bound for the minimum size of a 2-separating family.
This constructive upper bound is not optimal, as it is bested by the bound given in Theorem 3.12. However, this bound is still noteworthy because it introduces an easily computable construction of a 2-separating family given an ordinary separating family.
Theorem 3.5. We can construct a 2-separating family of subsets of [k] whose size is $O((\log k)^2)$.
This follows immediately from the following construction.
Theorem 3.6. Let F be a separating family on [k], and
Proof. Let $b_1, b_2$ be two pairs in [k] and $A_1, A_2$ be their respective separators in F. Often, $A_1$ or $A_2$ will separate both pairs. Assuming neither does, we have that $A_1$ contains precisely one element of $b_1$ and $A_2$ contains both or zero elements of $b_1$. Then $A_1 \triangle A_2$ contains precisely one element of $b_1$. By identical reasoning, $A_1 \triangle A_2$ contains precisely one element of $b_2$, and so $A_1 \triangle A_2$ separates both $b_1$ and $b_2$ simultaneously.
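The argument in this proof suggests a simple computation. The sketch below assumes that the construction adjoins all pairwise symmetric differences of members of F to F; this reading is inferred from the proof and is an assumption, as are the function names. The check is restricted to collections of two pairs.

```python
from itertools import combinations


def two_separating_from(family):
    """Assumed construction: F together with all A1 ^ A2 for A1, A2 in F."""
    sets = [frozenset(a) for a in family]
    closure = set(sets)
    closure.update(a ^ b for a, b in combinations(sets, 2))
    return closure


def separates_pair(a, pair):
    """A separates a pair when it contains exactly one of its two elements."""
    return len(a & pair) == 1


def is_two_separating_on_pairs(family, k):
    """Check that every two distinct pairs in [k] share a common separator."""
    pairs = [frozenset(p) for p in combinations(range(1, k + 1), 2)]
    return all(
        any(separates_pair(a, p) and separates_pair(a, q) for a in family)
        for p, q in combinations(pairs, 2)
    )
```

Starting from a separating family of m sets, the result has at most m + m(m − 1)/2 sets, which is consistent with the $O((\log k)^2)$ bound when m = ⌈log k⌉.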
This is the 2-separating family obtained from the previously described construction with the separating family given in Example 2.6. This separating family has the following matrix representation.
It is now a natural time to describe an application of n-separating families to error-correcting codes. They will be applied to produce Hamming d-codes. A Hamming d-code is an error-correcting code that can detect up to d − 1 errors and correct up to ⌊(d − 1)/2⌋ errors in a message. It may be represented as an m × k binary matrix whose columns have pairwise Hamming distances of at least d. As we now prove, the matrix representation of an n-separating family is an example of a Hamming $2^{n-1}$-code.
Theorem 3.8. If F is n-separating, then the Hamming distance between any two columns in the matrix representation of F is at least $2^{n-1}$.
The proof will be given in a series of lemmata. We first note that the matrix representation of P[k] contains each subset of [k] as a row. It has k columns and $2^k$ rows. Starting with the base case $P[1] = (0, 1)^T$, we can construct P[k] inductively by starting with a $2^k \times (k - 1)$ matrix with the rows of P[k − 1] repeated twice; that is, the first $2^{k-1}$ rows and the last $2^{k-1}$ rows are both P[k − 1]. We can then add a kth column whose first $2^{k-1}$ entries are 0's and whose last $2^{k-1}$ entries are 1's. The resulting matrix is a representation of P[k], which we will denote $M_k$.
Proof. We proceed inductively, using the matrix construction described above. The two columns of P[2] have Hamming distance 2, so the base case holds. Then, hypothesize that any two columns of $M_{k-1}$ have Hamming distance $d = 2^{k-2}$. Consider the ith and jth columns of $M_k$, with i, j < k. Their Hamming distance is twice that of the ith and jth columns of $M_{k-1}$, so by the inductive hypothesis $d = 2^{k-1}$. Now consider the Hamming distance between the ith column and the kth column, where the ith column of $M_{k-1}$ has Hamming weight x. Then the ith and kth columns disagree on x of the first $2^{k-1}$ rows, and on $2^{k-1} - x$ of the second $2^{k-1}$ rows, for a total of $d = x + (2^{k-1} - x) = 2^{k-1}$.
and for all A ∈ P[k], exactly one of A or $A^c$ ∈ F, then the Hamming distance between any two columns in the matrix representation of F is $2^{k-2}$.
Proof. Fix i, j distinct columns of the matrix representation of P[k]. Consider removing any row ℓ from the matrix. If columns i, j agree at position ℓ, then their Hamming distance is not changed. If columns i, j disagree at position ℓ, then their Hamming distance is reduced by 1. Since A, $A^c$ are distinct for every A ⊆ [k], F has $2^{k-1}$ rows. Columns i, j disagree at the row of A if and only if they disagree at the row of $A^c$, so exactly half of the $2^{k-1}$ disagreeing rows of P[k] are removed, and in exactly half of the rows of F, columns i, j will disagree. So the Hamming distance between these two columns is $2^{k-1} - \frac{1}{4} 2^k = 2^{k-2}$.
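Both distance computations above are easy to check numerically for small k. The following Python sketch builds the matrix of P[k] by the inductive doubling described earlier and then keeps one randomly chosen member of each complementary pair; the function names and the use of a seeded random choice are assumptions made for illustration.

```python
import random
from itertools import combinations, product


def power_set_matrix(k):
    """Rows are the characteristic vectors of all subsets of [k]."""
    rows = [[0], [1]]  # base case: the matrix of P[1]
    for _ in range(2, k + 1):
        # Repeat the previous matrix twice; the new last column is 0 on the
        # first half of the rows and 1 on the second half.
        rows = [r + [0] for r in rows] + [r + [1] for r in rows]
    return rows


def half_family_matrix(k, seed=0):
    """Keep exactly one of A and its complement for every A in P[k]."""
    rng = random.Random(seed)
    rows = []
    for bits in product((0, 1), repeat=k):
        if bits[0] == 0:  # one representative per complementary pair
            complement = tuple(1 - b for b in bits)
            rows.append(rng.choice((bits, complement)))
    return rows


def column_distances(rows):
    """Set of Hamming distances taken over all pairs of distinct columns."""
    width = len(rows[0])
    return {sum(r[i] != r[j] for r in rows)
            for i, j in combinations(range(width), 2)}
```

For each small k tried, `column_distances(power_set_matrix(k))` should contain the single value $2^{k-1}$, and `column_distances(half_family_matrix(k))` the single value $2^{k-2}$, whatever seed is used.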
Lemma 3.11. If F is an n-separating family, then any set of n + 1 columns in F will contain P[n + 1] up to complements.
Proof. Consider the following construction on a connected bipartite graph with n + 1 vertices and n edges. Color i of the vertices red and n + 1 − i of them blue. Connect all red vertices to one particular blue vertex and all remaining blue vertices to one particular red vertex. We have n + 1 − i + (i − 1) = n edges. Allowing i to range up to ⌊n/2⌋ we get the desired result.
Proof of Theorem 3.8. Since F is n-separating, any collection of n + 1 columns from F contains P([n + 1]) up to complements. Thus the Hamming distance between any two columns in F is at least $2^{(n+1)-2} = 2^{n-1}$.
We now provide an upper bound on the minimum size of an n-separating family.
Theorem 3.12. If F is an n-separating family of minimal size, then $|F| \le \frac{2n \log k}{-\log(1 - 2^{-n})}$. In particular, the minimal size of an n-separating family of subsets of [k] is $O(n 2^n \log k)$.
We shall prove this using the probabilistic method. For this, we make use of the following lemma, which gives an upper bound on the number of objects needed to complete N tasks, provided that for each task, a randomly chosen object completes it with probability at least p.
Lemma 3.13. Suppose there is a set of N tasks to be completed, and that for each task, a randomly chosen object completes it with probability at least p. Then the minimal size of a family F of objects which completes all the tasks is at most $\frac{\log N}{-\log(1 - p)}$.
Proof. Choose a particular task τ. The probability that a randomly chosen object does not complete τ is at most (1 − p), so the probability that no object from a collection of m randomly chosen objects completes τ is at most $(1 - p)^m$. Thus the expected number of tasks left uncompleted by a collection of m objects is at most $N(1 - p)^m$. We are thus looking for the least m such that $N(1 - p)^m < 1$ (so that there is at least one family of m objects which completes all the tasks). Solving this inequality for m, we obtain the desired inequality (as m = |F|).
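The threshold in this proof is straightforward to evaluate. The following Python sketch computes the least m with $N(1 - p)^m < 1$ using exact rational arithmetic; the function name `least_family_size` is hypothetical, and the example parameters (k = 10, n = 2) are chosen only for illustration.

```python
from fractions import Fraction
from math import comb


def least_family_size(num_tasks, p):
    """Least m such that num_tasks * (1 - p)**m < 1, computed exactly."""
    p = Fraction(p)
    m = 0
    expected_uncompleted = Fraction(num_tasks)
    while expected_uncompleted >= 1:
        expected_uncompleted *= 1 - p  # one more random object
        m += 1
    return m


# Illustration for 2-separating families over [10]: the tasks are the
# ordered collections of two pairs, and p = 2**-2 as in Lemma 3.14.
example_tasks = comb(10, 2) ** 2
example_m = least_family_size(example_tasks, Fraction(1, 4))
```

By the argument above, some family of `example_m` subsets of [10] separates every separable collection of two pairs.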
Lemma 3.14. The probability that a randomly chosen subset of [k] simultaneously separates a given separable collection of n pairs is at least $2^{-n}$.
Proof. Let $\{b_1, \dots, b_n\}$ be a separable collection of pairs, and G be the graph with edge set $\{b_1, \dots, b_n\}$. Let $p_n$ be the probability that a random set A ⊆ [k] simultaneously separates all pairs in G. We proceed by induction on n. The base case n = 1 is clear. For the inductive step, we break into cases.
Case 1: If $b_{n+1}$ is disjoint from G, then the event that $b_{n+1}$ is separated is independent of the event that all edges in G are separated, and the probability that $b_{n+1}$ is separated is $\frac{1}{2}$. Thus $p_{n+1} = \frac{1}{2} p_n \ge 2^{-(n+1)}$.
Case 4: Both vertices of $b_{n+1}$ are in G, and $b_{n+1}$ connects two components of G. We remove an edge $b_i$ without disconnecting any component of $b_{n+1} \cup G$; as any acyclic graph has at least two leaves, this is always possible. The probability that $b_{n+1} \cup (G \setminus b_i)$ is separated is at least $2^{-n}$ by the inductive hypothesis. Replacing $b_i$ results in either Case 1, Case 2, or Case 3, and gives us $p_{n+1} \ge \frac{1}{2} p_n \ge 2^{-(n+1)}$.
Remark 3.15. The worst-case probability $p_n = 2^{-n}$ is attained for a disjoint set of pairs (because the events that each pair is separated are independent). This worst case is in fact attained when the graph of the pairs forms a forest whose connected components are single edges.
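For small k the separation probability can be computed exactly by enumerating all subsets of [k]. The sketch below does this by brute force; the function name `separation_probability` is hypothetical and the approach is exponential in k, so it is meant only as a sanity check of the bound in Lemma 3.14.

```python
from fractions import Fraction
from itertools import product


def separation_probability(k, pairs):
    """Exact probability that a uniformly random A in P([k]) contains
    exactly one element of every pair (elements are numbered 1..k)."""
    hits = 0
    for bits in product((0, 1), repeat=k):  # bits[x - 1] == 1 iff x is in A
        if all(bits[x - 1] != bits[y - 1] for x, y in pairs):
            hits += 1
    return Fraction(hits, 2 ** k)
```

Three disjoint pairs in [6] give probability exactly 1/8, which matches the worst case $2^{-n}$ with n = 3, while a non-separable collection such as a triangle gives probability 0.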
Proof of Theorem 3.12. Since it suffices to separate collections of n pairs for n-separation, we notice there are $\binom{k}{2}^n$ collections of pairs to be separated (tasks to be completed). Substituting the minimal probability from Lemma 3.14 into the bound provided in Lemma 3.13, we obtain $\frac{\log\left(\binom{k}{2}^n\right)}{-\log(1 - 2^{-n})}$ as an upper bound. Since $\binom{k}{2} \le k^2$, we realize the desired bound of $|F| \le \frac{2n \log k}{-\log(1 - 2^{-n})}$.
We will now define the previously studied notion of an (i, j)-separating family.

Definition 3.16. A family F of subsets of [k] is (i, j)-separating if and only if for all P, Q ⊆ [k] with |P| ≤ i and |Q| ≤ j, there exists A ∈ F such that either P ⊆ A and A ∩ Q = ∅ or Q ⊆ A and A ∩ P = ∅.
In this section we establish the full set of relationships between the notions of (i, j)-separating and the notions of n-separating. In particular, we prove the implications depicted in Figure 1, and also that no other implications hold.
The first implication is obvious from the definition.
We next establish two key implications between an (i, j)-separating notion and an n-separating notion.
Theorem 3.18. Let F be a family of subsets of [k]. If F is (n, n)-separating then F is n-separating.
Proof. Consider a separable collection of pairs $b_1, b_2, \dots, b_n$. Viewing these pairs as edges of a bipartite graph, note there are at most n vertices of either color. Since F is (n, n)-separating, we can find an A ∈ F that contains all vertices of one color and none of the other, so that A contains exactly one element from each $b_i$.
Theorem 3.19. If a family F is (i + j − 1)-separating, then F is (i, j)-separating.
Proof. Given two sets P, Q with respective sizes i, j, we can construct a connected bipartite graph G with vertex classes P and Q using i + j − 1 edges. Since F separates G and G is connected, some element A of F contains exactly one of the two vertex classes; that is, A contains either P or Q and is disjoint from the other. To construct the graph, first draw an edge between min P and each element in Q, then draw an edge between every other element of P and any element of Q. Note that this graph is bipartite, connected, and contains i + j − 1 edges.
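The graph used in this proof can be written down explicitly. In the Python sketch below, the function names are hypothetical, P and Q are assumed to be disjoint and nonempty, and "any element of Q" is taken to be min Q for definiteness. The helper `separators` lists, by brute force, every subset of the vertex set that contains exactly one endpoint of each edge.

```python
from itertools import combinations


def connecting_edges(P, Q):
    """Edges of a connected bipartite graph on classes P and Q
    with |P| + |Q| - 1 edges."""
    P, Q = sorted(P), sorted(Q)
    edges = [(P[0], q) for q in Q]        # min P is joined to all of Q
    edges += [(p, Q[0]) for p in P[1:]]   # every other p is joined to min Q
    return edges


def separators(edges, vertices):
    """All subsets of `vertices` containing exactly one endpoint of each edge."""
    vertices = sorted(vertices)
    found = []
    for size in range(len(vertices) + 1):
        for subset in combinations(vertices, size):
            chosen = set(subset)
            if all((x in chosen) != (y in chosen) for x, y in edges):
                found.append(chosen)
    return found
```

With P = {1, 2, 3} and Q = {4, 5} the graph has 4 edges, and the only subsets of P ∪ Q separating all of them are P and Q themselves, which is the property the proof relies on.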
Theorem 3.20. No other implications hold between (i, j)-separating or n-separating notions aside from the (transitive) consequences of Proposition 3.17 and Theorems 3.18 and 3.19.
We prove this theorem in a sequence of lemmata that give counterexamples to any other implications.
Lemma 3.21. There exists an n-separating family which is not (n, n)-separating.
Under f′, no monochromatic set of vertices from G contains either A or B, so since $b_1, \dots, b_n$ are separated by a monochromatic set with respect to any 2-colouring of G, they are separated by an element of F. This completes the proof that F is n-separating.
Proof. Let $F = [k]^{n-1}$. Note that F contains all sets of size n − 1 and so is (n − 1, j)-separating. However, since it contains no sets of cardinality n, it cannot separate disjoint collections of n pairs, and so is not n-separating.
Lemma 3.24. There exists an n-separating family which is not (n + 1)-separating.
Proof. The family $F = [k]^n$ is clearly n-separating, but cannot separate disjoint collections of n + 1 pairs, so F is n-separating but not (n + 1)-separating.
Lemma 3.25. Let i ≤ j. There exists an (i, j)-separating family which is not (i + 1, j)-separating. And there exists an (i, j)-separating family which is not (i, j + 1)-separating.
Proof. Without loss of generality we may suppose that i ≤ j in each expression above.
Lemma 3.26. Let i < i′ ≤ j′ < j. Then we have the following:
(1) There exists an (i, j)-separating family that is not an (i′, j′)-separating family.
(2) There exists an (i′, j′)-separating family that is not an (i, j)-separating family.
Let $B_j$ be such that $|B_j| = j$ and $A \cap B_j = \emptyset$. Since no set in F contains A, any set in F that would separate A and $B_j$ must contain $B_j$. This is not possible since the sets in F have cardinality at most j′ and j′ < j. Thus F is not (i, j)-separating.
Proof of Theorem 3.20. The preceding lemmata address all other possible implications between (i, j)-separating families and n-separating families; thus the proof is complete.
Theorem 3.20 leads us to believe that n-separating is an interesting concept in its own right. There are two nontrivial implications with previously known concepts and other possible implications do not exist.
We now provide a lower bound on the minimum size of an n-separating family.
Theorem 3.27. The least size of an n-separating family has lower bound $\Omega(2^n \log k)$.
Proof. By Theorem 3.19, every n-separating family is (n/2, n/2)-separating. Therefore we can use the lower bound on (n/2, n/2)-separating families from Fredman-Komlós [FK84].
Remark 3.28. There is an upper bound provided by Fredman and Komlós [FK84] for (i, j)-separating families. Taking this bound for (n, n)-separating families provides an upper bound for n-separating families by Theorem 3.18. However, that bound is bested by our probabilistic upper bound from Theorem 3.12.

§4. n-SPLITTING FAMILIES AND SPLITTABILITY
In this section we introduce a particular type of separating family that is called a splitting family. Similar to the generalization of separating families to n-separating families, we generalize splitting families to n-splitting families. We then provide an upper bound on the minimum size of a splitting family that we obtain from a generalization of a construction of Coppersmith in [Sti02]. We then define splittability and the n-splitting families. We give partial results on the characterization of splittable configurations of sets for n = 3. We provide a lower bound for the size of a splitting family using the volume method, and conclude with an upper bound for the minimum size of a 2-splitting family and conjectures for bounds on n-splitting families.
If A, B are any sets then A splits B if and only if |A ∩ B| = ⌊|B|/2⌋ or |A ∩ B| = ⌈|B|/2⌉. We remark that some authors require |A ∩ B| = ⌊|B|/2⌋, for example [RH12]. We prefer our definition since it is technically handy and just as good for most applications.
We prove this theorem with a generalization of Coppersmith's construction described in [Sti02].
Our version allows for k or t to be odd.
We conjecture the upper bound given above is sharp; however, at the moment we can only compute a lower bound of $\Omega(\sqrt{k})$. In order to obtain this estimate, we will use the volume method for computing lower bounds. This method is used together with more advanced techniques in [FK84] to obtain their lower bound on the size of (i, j)-separating families. The next result describing the volume method has been written in terms of objects and tasks, as was done for Lemma 3.13.

Lemma 4.3. Suppose that there is a set of N tasks to be completed, and each object completes at most v of the tasks (the number of tasks completed by an object is called the object's volume). Then for any family F of objects which jointly complete all the tasks, we have |F| ≥ N/v.
The proof of this lemma is completely trivial: a collection of m objects completes at most mv tasks, so a family of objects completing all the tasks must have size at least N/v. We now apply the volume method to compute a lower bound for the size of a splitting family.
Proof. For the purpose of asymptotics we may assume that k is even. Then following [FK84] it is the case that the splitters of maximum volume are of size k/2.
Meanwhile, the number of sets to be split is clearly $N = 2^k$. Thus the volume bound is asymptotic to $N/v = \Omega(\sqrt{k})$, as desired.
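The volume of a splitter depends only on its size, so the bound N/v can be tabulated directly for small k. The Python sketch below uses the observation that A splits B exactly when |A ∩ B| and |B \ A| differ by at most one; the function names are hypothetical and the code is meant as an illustration of the volume method, not as part of the proof.

```python
from math import comb


def splitter_volume(k, a):
    """Number of sets B in P([k]) that are split by a fixed A with |A| = a.

    Writing i = |A & B| and j = |B - A|, the set A splits B exactly when
    |i - j| <= 1.
    """
    return sum(comb(a, i) * comb(k - a, j)
               for i in range(a + 1)
               for j in range(k - a + 1)
               if abs(i - j) <= 1)


def volume_lower_bound(k):
    """The bound N / v with N = 2**k and v the largest splitter volume."""
    v = max(splitter_volume(k, a) for a in range(k + 1))
    return 2 ** k / v
```

For k = 4 the volumes for sizes 0 through 4 are 5, 11, 14, 11, 5, so the maximum is attained at size k/2, consistent with the statement quoted from [FK84].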
We next study the generalization of splitting families we have called n-splitting families. The definition is analogous to that of n-separating families.
We remark that every collection consisting of just two sets $B_1, B_2$ is 2-splittable. To see this, we can simply let $D \subseteq B_1 \cap B_2$ be such that $|D| = \lceil |B_1 \cap B_2|/2 \rceil$, let $E \subseteq B_1 \setminus B_2$ be such that $|E| = \lfloor |B_1 \setminus B_2|/2 \rfloor$, and let $F \subseteq B_2 \setminus B_1$ be such that $|F| = \lfloor |B_2 \setminus B_1|/2 \rfloor$. Then it is easy to see that D ∪ E ∪ F splits both $B_1$ and $B_2$.
This fact together with the method of Theorem 4.4 gives the following lower bound on the size of 2-splitting families.
Theorem 4.7. The minimal size of a 2-splitting family of subsets of [k] is Ω(k).
Proof. This time we say that the volume of A ⊆ [k] is the number of (ordered) collections $B_1, B_2$ such that A simultaneously splits $B_1, B_2$. This time, it is easy to see that the splitters of maximum volume satisfy
The number of ordered collections $B_1, B_2$ is $(2^k)^2$, so we obtain a bound of
The latter expression is Ω(k), as desired.
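The sector-by-sector construction for two sets, described before Theorem 4.7, can be sketched in Python as follows. The function names are hypothetical, and the particular elements chosen from each sector (the smallest ones) are an arbitrary choice made for the example.

```python
from itertools import combinations


def split_two(b1, b2):
    """Build one set that splits both b1 and b2, sector by sector."""
    both = sorted(b1 & b2)
    only1 = sorted(b1 - b2)
    only2 = sorted(b2 - b1)
    d = both[:(len(both) + 1) // 2]   # round the intersection up
    e = only1[:len(only1) // 2]       # round b1 - b2 down
    f = only2[:len(only2) // 2]       # round b2 - b1 down
    return set(d) | set(e) | set(f)


def splits(a, b):
    """A splits B when |A & B| is the floor or the ceiling of |B| / 2."""
    return abs(2 * len(a & b) - len(b)) <= 1


def subsets(universe):
    """All subsets of a small universe, as Python sets."""
    items = sorted(universe)
    return [set(c) for size in range(len(items) + 1)
            for c in combinations(items, size)]
```

Running `split_two` over all pairs of subsets of a small ground set is a quick way to confirm that the rounding choices are compatible in every parity case.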
The sector-by-sector splitting technique described after Definition 4.6 does not work in general for collections of three or more sets, since for example is simply not 3-splittable. The next result states that this type of example is essentially the only obstacle to 3-splittability.
Proof. In the proof we will make numerous references to the seven regions of the Venn diagram of A, B, C, and for convenience we label them according to the figure shown below.
We first show that if $R_{AB}$, $R_{BC}$, and $R_{AC}$ have odd size, and $R_A = R_B = R_C = R_{ABC} = \emptyset$, then A, B, C is not 3-splittable. Indeed, suppose towards a contradiction that S simultaneously splits A, B, C. Note that each of A, B, C is the union of two odd sectors and so has even size, which means that S must contain exactly half of each. Without loss of generality we can suppose $|S \cap R_{AB}| > |R_{AB}|/2$. It follows that $|S \cap R_{BC}| < |R_{BC}|/2$, and then that $|S \cap R_{AC}| > |R_{AC}|/2$. It follows that S does not split $R_{AB} \cup R_{AC}$, which is a contradiction because $R_{AB} \cup R_{AC} = A$ in this case.
For the converse, we show that if A, B, C is not of this form, then it is 3-splittable. We first consider the case when $R_A = R_B = R_C = \emptyset$. If all four of the sectors $R_{AB}, R_{BC}, R_{AC}, R_{ABC}$ are even, then we can build a splitter S by simply splitting each sector in half. If just one of these four sectors is odd, we can simply round that sector up or down. If just two of these four sectors are odd, we can round one of them up and the other down. This leaves only the three subcases shown in Figure 2. In subcases 1 and 2, we round the sector $R_{ABC}$ up, and each of the other odd sectors down.
In subcase 3, we know that $|R_{ABC}| \ge 2$ (or else we are in the non-splittable configuration described above), and so we can build a splitter S with $|S \cap R_{ABC}| = |R_{ABC}|/2 + 1$, and each intersection of S with $R_{AB}, R_{BC}, R_{AC}$ rounded down, so that S contains exactly half of each of A, B, C.
We next consider the case when at least one of $R_A, R_B, R_C$ is nonempty. If there exists a splitter S for the configuration $A \setminus R_A$, $B \setminus R_B$, $C \setminus R_C$, then we can build a splitter S′ for A, B, C by letting S′ ⊇ S and suitably rounding the intersection of S′ with $R_A, R_B, R_C$ up or down. Finally, if $A \setminus R_A$, $B \setminus R_B$, $C \setminus R_C$ is not splittable, then by the above analysis we must have $R_{AB}, R_{BC}, R_{AC}$ odd and $R_{ABC} = \emptyset$. Suppose for concreteness that $R_A \neq \emptyset$. Then we can build a splitter S such that $S \cap R_{AB}$ is rounded down, $S \cap R_{BC}$ is rounded up, $S \cap R_{AC}$ is rounded down, $|S \cap R_A| = \lfloor |R_A|/2 \rfloor + 1$, and the intersections of S with $R_B$ and $R_C$ are suitably rounded. This completes the proof.
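The characterization proved above can be checked by brute force on small configurations, since whether a set splits A, B, C depends only on how many elements it takes from each of the seven Venn sectors. The following Python sketch is an illustration under that observation; the function names and the dictionary keyed by sector name are assumptions made for the example.

```python
from itertools import product

SHARED = ("AB", "BC", "AC", "ABC")


def _can_finish(letter, size, taken):
    """Given the counts taken from the shared sectors, can the private sector
    of `letter` be rounded so that the whole set is split?"""
    regions = [r for r in SHARED if letter in r]
    total = size[letter] + sum(size[r] for r in regions)
    have = sum(taken[r] for r in regions)
    return any(abs(2 * (have + x) - total) <= 1
               for x in range(size[letter] + 1))


def is_three_splittable(size):
    """`size` maps each sector name (A, B, C, AB, BC, AC, ABC) to its size."""
    for counts in product(*(range(size[r] + 1) for r in SHARED)):
        taken = dict(zip(SHARED, counts))
        # The three private sectors can be chosen independently.
        if all(_can_finish(letter, size, taken) for letter in "ABC"):
            return True
    return False


def is_obstruction(size):
    """The configuration singled out in the text: the three pairwise sectors
    are odd and the remaining four sectors are empty."""
    return (size["A"] == size["B"] == size["C"] == size["ABC"] == 0
            and all(size[r] % 2 == 1 for r in ("AB", "BC", "AC")))
```

Comparing the two predicates over a range of small sector sizes gives a quick empirical check that the obstruction described above is the only one in that range.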
We conjecture that the problem of deciding whether an arbitrary collection of sets is splittable is NP-complete.
We devote the rest of this section to establishing our upper bound on the minimum size of a 2-splitting family, and stating a conjecture concerning the upper bound on the minimum size of an n-splitting family. For this, we will need the following key result.
Although the statement of Theorem 4.10 feels intuitive, the proof is somewhat technical and will be given as the conclusion of a series of lemmata. We first deal with the case where s and t are both even. To this end, we fix sets S, T as in the statement of the theorem. Throughout the proof we will fix elements $x \in S \setminus T$ and $y \in T \setminus S$ (we may assume these exist) and consider two more pairs of sets: $S' = S \setminus \{x\}$ and $T' = T \setminus \{y\}$, and also $S'' = S$ and $T'' = (T \setminus \{y\}) \cup \{x\}$. Refer to Figure 3 to visualize these configurations.
Lemma 4.11. Suppose s and t are both even.
• If $\mathcal{A}$ is the family of sets that split T and S simultaneously and $\mathcal{B}$ is the family of sets that split T′ and S′ simultaneously, then $4|\mathcal{A}| = |\mathcal{B}|$.
• If $\mathcal{C}$ is the family of sets that split T′′ and S′′ simultaneously, then $\frac{1}{4}|\mathcal{B}| \le |\mathcal{C}|$.
Proof. There are k − t − s + b elements in [k] (T ∪ S); we will ignore those elements as they multiply |A| and |B| both by the constant 2 k−s−t+b , which does not affect our calculations. We proceed as if t . The corresponding statement is true for S ′ , so A splits T ′ and S ′ , and hence A ⊆ B.
We can generate four distinct splitters of T ′ and S ′ from each splitter A of T and S. Let A 1 , A 2 , A 3 , A 4 be sets such that . Then given splitters generated from B and A, For the second statement, it suffices to show that at least 1/4 of the elements of B are in fact elements of C. To this end let B ∈ B, so B is a splitter of T ′ and S ′ . Note that B is a splitter of T ′′ and S ′′ if and only if it rounds the same way to split both T ′ and S ′ . That is B ∩ T ′ = t/2 and B ∩ S ′ = s/2; or B ∩ T ′ = t/2 − 1 and B ∩ S ′ = s/2 − 1. Furthermore, a splitter B that rounds down (up) to split both T ′ and S ′ will also split T ′′ and S ′′ iff x is in B (x is not in B). Thus precisely half of the sets that split both T ′ and S ′ and also round the same way both times will also split T ′′ and S ′′ .
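Thus it now suffices to show that at least half of the splitters of both T′ and S′ round the same way for both sets. Writing i = |B ∩ T′ ∩ S′|, the number of splitters that round the same way each time is given by:
$$4\sum_{i=0}^{b}\binom{b}{i}\left[\binom{s-1-b}{s/2-i}\binom{t-1-b}{t/2-i}+\binom{s-1-b}{s/2-1-i}\binom{t-1-b}{t/2-1-i}\right],$$
where the factor 4 accounts for the free choices of x and y, and a binomial coefficient with an out-of-range lower index is zero. On the other hand, the number of splitters that round differently each time is given by:
$$4\sum_{i=0}^{b}\binom{b}{i}\left[\binom{s-1-b}{s/2-i}\binom{t-1-b}{t/2-1-i}+\binom{s-1-b}{s/2-1-i}\binom{t-1-b}{t/2-i}\right].$$
We shall show term-by-term that the first sum is greater than or equal to the second sum. Taking the i-th term of the first sum minus the i-th term of the second sum and factoring, we obtain that the following is equivalent to what we wish to show:
$$\binom{b}{i}\left[\binom{s-1-b}{s/2-i}-\binom{s-1-b}{s/2-1-i}\right]\left[\binom{t-1-b}{t/2-i}-\binom{t-1-b}{t/2-1-i}\right]\ge 0.$$
By the unimodality of the binomial coefficients, both of the bracketed terms in the above product are nonpositive for i < b/2 and both are nonnegative for i ≥ b/2, so the inequality is always true. Consequently at least 1/4 of the splitters in ℬ are also splitters in 𝒞, and (1/4)|ℬ| ≤ |𝒞|.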
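The two counting claims of Lemma 4.11 can be checked by brute force on a small instance. The sketch below is only an illustration, not part of the proof: the helper names `splits` and `count_splitters` and the specific choice k = 6, s = t = 4, b = 2 are assumptions made for this example.

```python
from itertools import combinations


def splits(a, b):
    """Return True if a splits b, i.e. |a ∩ b| is floor(|b|/2) or ceil(|b|/2)."""
    m = len(a & b)
    return m in (len(b) // 2, (len(b) + 1) // 2)


def count_splitters(universe, *targets):
    """Count the subsets of `universe` that split every set in `targets`."""
    elems = sorted(universe)
    total = 0
    for r in range(len(elems) + 1):
        for combo in combinations(elems, r):
            a = frozenset(combo)
            if all(splits(a, target) for target in targets):
                total += 1
    return total


# Hypothetical instance: [k] = {0,...,5}, s = t = 4, b = |S ∩ T| = 2.
universe = frozenset(range(6))
S = frozenset({0, 1, 2, 3})
T = frozenset({2, 3, 4, 5})
x, y = 0, 4                      # x in S \ T, y in T \ S

S1, T1 = S - {x}, T - {y}        # S', T'
S2, T2 = S, (T - {y}) | {x}      # S'', T''

size_A = count_splitters(universe, S, T)     # splitters of S and T
size_B = count_splitters(universe, S1, T1)   # splitters of S' and T'
size_C = count_splitters(universe, S2, T2)   # splitters of S'' and T''

print(size_A, size_B, size_C)
```

For this instance the counts satisfy 4|𝒜| = |ℬ| and (1/4)|ℬ| ≤ |𝒞|, in agreement with the lemma. The enumeration is exponential in k, so it is only practical for very small ground sets.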
For the case where s is odd and t is even, we assume that there exist x ∈ [k] \ (T ∪ S) and y ∈ T \ S, so that this case can be reduced to the case of s, t even treated above. Let S′ = S ∪ {x} and T′ = (T \ {y}) ∪ {x}, so |T′| = t, |S′| = s, and |T′ ∩ S′| = b + 1.

Lemma 4.12. Suppose s is odd and t is even.
• If 𝒜 is the family of sets that split T and S simultaneously and ℬ is the family of sets that split T and S′ simultaneously, then |𝒜| = 2|ℬ|.
• If 𝒞 is the family of sets that simultaneously split S′ and T′, then 2|ℬ| ≤ |𝒞|.
Proof. Let A ∈ 𝒜. Then A splits S′ if and only if it rounds S down and contains x, or rounds S up and omits x. Whether A rounds S up or down is independent of whether it contains x, and so there are two splitters in 𝒜 for each splitter in ℬ, i.e. |𝒜| = 2|ℬ|. The second point follows from Lemma 4.11.
We are now ready to complete the proof of the key result.
Proof of Theorem 4.10. It suffices to prove that if |T| = |T′| = t and |S| = |S′| = s and |T ∩ S| = b and |T′ ∩ S′| = b + 1, then there is an injection from the set 𝒜 of splitters of T and S to the set ℬ of splitters of T′ and S′. Let |T| = t, |S| = s, |T′| = t − 1, |S′| = s − 1. The case t, s even is handled by Lemma 4.11, and the cases t even, s odd and t odd, s even are handled by Lemma 4.12. The remaining case is t, s odd. For this case we adjoin an element to T and reduce to the case t even, s odd, just as this case in turn was reduced to the case t, s even.
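As a numerical illustration of how the number of simultaneous splitters depends on the overlap b = |S ∩ T|, the following sketch computes the exact probability that a uniformly random set splits both S and T, using a direct binomial count over the three regions S ∩ T, S \ T, and T \ S. The function names and the sample sizes s = t = 4 are assumptions chosen for this example; the computation checks only this one case and is no substitute for the proof.

```python
from fractions import Fraction
from math import comb


def split_sizes(n):
    """Intersection sizes that count as splitting an n-element set."""
    return {n // 2, (n + 1) // 2}


def split_probability(s, t, b):
    """Probability that a uniformly random set splits both S and T,
    where |S| = s, |T| = t and |S ∩ T| = b."""
    count = 0
    for i in range(b + 1):              # elements taken from S ∩ T
        for j in range(s - b + 1):      # elements taken from S \ T
            if i + j not in split_sizes(s):
                continue
            for l in range(t - b + 1):  # elements taken from T \ S
                if i + l in split_sizes(t):
                    count += comb(b, i) * comb(s - b, j) * comb(t - b, l)
    # Elements outside S ∪ T are free, so they cancel in the probability.
    return Fraction(count, 2 ** (s + t - b))


# Hypothetical sample: s = t = 4, overlap b ranging from 0 to 4.
probs = [split_probability(4, 4, b) for b in range(5)]
print(probs)
```

In this sample the probabilities are nondecreasing in b, with the minimum attained at b = 0, the disjoint configuration.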
Finally, we can conclude the upper bound calculation for the minimal size of a 2-splitting family.
Proof of Theorem 4.9. If T ⊆ [k] with |T| = t, then by a Stirling-type approximation computed in Section 2 of [Sti02], the probability $p_t$ that T is split by a uniformly random subset of [k] has a lower bound of $c/\sqrt{t}$, where c is a constant. Next, if also S ⊆ [k] with |S| = s, then by Theorem 4.10 the probability $p_{S,T}$ that a uniformly random set simultaneously splits S and T is minimized when S ∩ T = ∅. In this case, the event that S is split and the event that T is split are independent, and so $p_{S,T} = p_s p_t$. Minimizing over the possible sizes s, t ≤ k, and using $p_s p_t \ge c^2/\sqrt{st} \ge c^2/k$, we have that $p_{S,T}$ is bounded below by some $p \ge c^2/k$. We now invoke Lemma 3.13 with this value of p and $N = (2^k)^2$. Thus, letting m be the minimal size of a 2-splitting family of subsets of [k], Lemma 3.13 gives an upper bound on m that is $O(k^2)$, as desired.
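The quantities in this argument can be explored numerically. The sketch below computes the exact single-set splitting probability and then evaluates a union-bound estimate for m. Two points are assumptions and not taken from the text: the estimate uses the standard form "choose m with N(1 − p)^m ≤ 1", which may differ from the precise statement of Lemma 3.13, and the value c² = 1/2 is merely the constant suggested by the small cases computed here, not the constant from [Sti02].

```python
from fractions import Fraction
from math import ceil, comb, log, sqrt


def single_split_probability(t):
    """Exact probability that a uniformly random set splits a fixed t-set."""
    sizes = {t // 2, (t + 1) // 2}
    return Fraction(sum(comb(t, j) for j in sizes), 2 ** t)


def union_bound_family_size(k, c_squared=0.5):
    """An m with N * (1 - p)^m <= 1, for p = c^2 / k and N = (2^k)^2.

    ASSUMPTION: this is the usual union-bound estimate; the exact form of
    Lemma 3.13 is not reproduced here and may differ.
    """
    p = c_squared / k
    log_n = 2 * k * log(2)          # ln N for N = (2^k)^2
    return ceil(log_n / -log(1 - p))


# Smallest value of p_t * sqrt(t) over a hypothetical range of sizes t.
worst_ratio = min(float(single_split_probability(t)) * sqrt(t)
                  for t in range(1, 61))
print(worst_ratio)

# The estimate grows quadratically: m / k^2 stays bounded.
for k in (10, 100, 1000):
    print(k, union_bound_family_size(k) / k ** 2)
```

Since −ln(1 − p) ≥ p, the estimate is at most (ln N)/p + 1 = (2 ln 2 / c²) k² + 1, which is the quadratic growth asserted in the theorem.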
We close by conjecturing that the analog of Theorem 4.10 holds for configurations of n sets as well. If this conjecture holds true, one can easily obtain an upper bound of $O(f(n)k^{n/2+1})$ on the minimal size of an n-splitting family of t-subsets of [k].
Conjecture. Let $B_1, \dots, B_n$ be a collection of subsets of [k]. Then the number of splitters of $B_1, \dots, B_n$ is minimized when the collection is pairwise disjoint.