Structure and Supersaturation for Intersecting Families

The extremal problems regarding the maximum possible size of intersecting families of various combinatorial objects have been extensively studied. In this paper, we investigate supersaturation extensions, which in this context ask for the minimum number of disjoint pairs that must appear in families larger than the extremal threshold. We study the minimum number of disjoint pairs in families of permutations and in $k$-uniform set families, and determine the structure of the optimal families. Our main tool is a removal lemma for disjoint pairs. We also determine the typical structure of $k$-uniform set families without matchings of size $s$ when $n \ge 2sk + 38s^4$, and show that almost all $k$-uniform intersecting families on vertex set $[n]$ are trivial when $n\ge (2+o(1))k$.


Introduction
Determining the size of intersecting families of discrete objects is a line of research with a long history, originating in extremal set theory. A set family is intersecting if any two of its sets share a common element. A classic result of Erdős, Ko and Rado [20] from 1961 states that when $n \ge 2k$, the size of the largest intersecting $k$-uniform set family over $[n]$ is $\binom{n-1}{k-1}$. Furthermore, when $n \ge 2k + 1$, the only extremal configurations are the trivial families, where all edges contain a given element. This fundamental theorem has since inspired a great number of extensions and variations.
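For very small parameters the Erdős-Ko-Rado bound can be checked by exhaustive search. The following is a minimal sketch and not part of the paper; the helper names `is_intersecting`, `star` and `max_intersecting_size` are illustrative assumptions, and the search is only feasible when $\binom{n}{k}$ is tiny.

```python
from itertools import combinations
from math import comb


def is_intersecting(family):
    """True if every two sets of the family share an element."""
    return all(a & b for a, b in combinations(family, 2))


def star(n, k, centre):
    """Trivial family: all k-subsets of [n] containing `centre`."""
    return [frozenset(c) for c in combinations(range(1, n + 1), k) if centre in c]


def max_intersecting_size(n, k):
    """Brute-force the largest intersecting family of k-subsets of [n]."""
    sets = [frozenset(c) for c in combinations(range(1, n + 1), k)]
    best = 0
    for mask in range(1 << len(sets)):
        family = [s for i, s in enumerate(sets) if (mask >> i) & 1]
        if len(family) > best and is_intersecting(family):
            best = len(family)
    return best
```

For instance, `max_intersecting_size(5, 2)` can be compared with `comb(4, 1)`, the size of `star(5, 2, 1)`.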
A recent trend in extremal combinatorics is to study the supersaturation extension of classic results. This problem, sometimes referred to as the Erdős-Rademacher problem, asks for the number of forbidden substructures that must appear in a configuration larger than the extremal threshold. We often observe an interesting phenomenon: while the extremal result only requires one forbidden substructure to appear, we usually find several. The first such line of research extended Mantel's Theorem [37], which states that an $n$-vertex triangle-free graph can have at most $n^2/4$ edges. Rademacher (unpublished) showed that one additional edge would force the appearance of at least $\lfloor n/2 \rfloor$ triangles. Determining the number of triangles in larger graphs attracted a great deal of attention, starting with the works of Erdős [17,18] and Lovász and Simonovits [36] and culminating in the asymptotic solution due to Razborov [42] and the recent exact solution determined by Liu, Pikhurko and Staden [33]. Supersaturation problems have since been studied in various contexts; examples include extremal graph theory [2,30,38,39,41,43], extremal set theory [5,9,13,31,45], poset theory [4,40,46], and group theory [8,27,47].
The first result of our paper concerns supersaturation for the extension of the Erdős-Ko-Rado Theorem to families of permutations. A pair of permutations $\sigma, \pi \in S_n$ is said to be intersecting if $\{i \in [n] : \pi(i) = \sigma(i)\} \neq \emptyset$, and disjoint otherwise. A family $\mathcal{F} \subseteq S_n$ is intersecting if every pair of permutations in the family is. A natural construction of an intersecting family is to fix some pair $i, j \in [n]$, and take all permutations that map $i$ to $j$; we call this a coset, and denote it by $T_{(i,j)}$. Observe that $|T_{(i,j)}| = (n-1)!$, and Deza and Frankl [12] showed that this is the largest possible size of an intersecting family in $S_n$.
In the corresponding supersaturation problem, we seek to determine how many disjoint pairs of permutations must appear in larger families. We write dp(F) for the number of disjoint pairs of permutations in a family F ⊆ S n . By the Deza-Frankl Theorem, when |F| ≤ (n − 1)!, we need not have any disjoint pairs in F, while for |F| > (n − 1)!, dp(F) must be positive.
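The quantities above are easy to compute for small $n$. The sketch below is illustrative only: permutations are represented as tuples (an assumption of this example, with `sigma[i - 1]` the image of `i`), and the names `disjoint`, `dp` and `coset` are hypothetical helpers.

```python
from itertools import combinations, permutations
from math import factorial


def disjoint(sigma, pi):
    """Two permutations are disjoint if they agree in no position."""
    return all(a != b for a, b in zip(sigma, pi))


def dp(family):
    """Number of disjoint (unordered) pairs of permutations in the family."""
    return sum(disjoint(s, p) for s, p in combinations(family, 2))


def coset(n, i, j):
    """All permutations of [n] mapping i to j."""
    return [p for p in permutations(range(1, n + 1)) if p[i - 1] == j]
```

With these helpers, a coset in $S_4$ has $3! = 6$ elements and no disjoint pairs, while adding any single permutation from outside the coset already creates disjoint pairs.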
One might expect a family of permutations with the minimum number of disjoint pairs to contain large intersecting subfamilies, and a candidate construction is therefore the union of an appropriate number of cosets. However, these unions are not isomorphic, as pairs of cosets can intersect each other differently. Indeed, given pairs $(i_1, j_1) \neq (i_2, j_2) \in [n]^2$, we have $T_{(i_1,j_1)} \cap T_{(i_2,j_2)} = \{\pi \in S_n : \pi(i_1) = j_1 \text{ and } \pi(i_2) = j_2\}$, which is empty if $i_1 = i_2$ or $j_1 = j_2$, and has size $(n-2)!$ otherwise. To fit the family within as few cosets as possible, we should take the cosets to be pairwise-disjoint, motivating the following definition.
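The two cases for the intersection of a pair of cosets can be confirmed by direct enumeration. This is a small sketch under the assumption that permutations are stored as tuples; `coset_intersection_size` is a hypothetical helper name.

```python
from itertools import permutations
from math import factorial


def coset_intersection_size(n, first, second):
    """Size of T_(i1,j1) ∩ T_(i2,j2) in S_n, by enumeration."""
    (i1, j1), (i2, j2) = first, second
    return sum(
        1
        for p in permutations(range(1, n + 1))
        if p[i1 - 1] == j1 and p[i2 - 1] == j2
    )
```

Cosets sharing a row or a column index are disjoint, and all other pairs of distinct cosets meet in exactly $(n-2)!$ permutations.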
Our first result shows, for certain ranges of family sizes $s$, that these families indeed minimise $\mathrm{dp}(\mathcal{F})$ over all families $\mathcal{F} \subseteq S_n$ with $|\mathcal{F}| = s$.
Theorem 1.2. There exists a constant $c > 0$ such that the following holds. Let $n$, $k$ and $s$ be positive integers such that $k \le cn^{1/2}$, and $s = (k + \varepsilon)(n-1)!$ for some real $\varepsilon$ with $|\varepsilon| \le ck^{-3}$. Then any family $\mathcal{F} \subseteq S_n$ with $|\mathcal{F}| = s$ satisfies $\mathrm{dp}(\mathcal{F}) \ge \mathrm{dp}(\mathcal{T}(n, s))$.
We next consider the supersaturation extension of the original Erdős-Ko-Rado Theorem, where one seeks to minimise the number of disjoint pairs of sets in a $k$-uniform family of $s$ subsets of $[n]$. Bollobás and Leader [7] provided, for every $s$, a family of constructions known as the $\ell$-balls, and conjectured that for some $1 \le \ell \le k$, an $\ell$-ball is optimal for the supersaturation problem. In particular, when $\ell = 1$, the construction is an initial segment of the lexicographic ordering.
Denote by $\binom{[n]}{k}$ the family of all $k$-element subsets of $[n]$. Letting $\mathcal{L}(n, k, s)$ be the initial segment consisting of the first $s$ sets in $\binom{[n]}{k}$ under the lexicographic ordering, we write $\mathrm{dp}(n, k, s)$ for $\mathrm{dp}(\mathcal{L}(n, k, s))$, where again $\mathrm{dp}(\mathcal{F})$ is the number of disjoint pairs in a set family $\mathcal{F}$. Das, Gan and Sudakov [10] proved that if $n > 108(k^3 r + k^2 r^2)$ and $s \le \binom{n}{k} - \binom{n-r}{k}$, then for any family $\mathcal{F} \subseteq \binom{[n]}{k}$ of size $s$, $\mathrm{dp}(\mathcal{F}) \ge \mathrm{dp}(n, k, s)$. That is, when $n$ is sufficiently large and the families are of small size, the initial segments of the lexicographic order minimise the number of disjoint pairs, confirming the Bollobás-Leader conjecture in this range.
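For concreteness, $\mathrm{dp}(n, k, s)$ can be evaluated directly for small parameters. The sketch below relies on the fact that `itertools.combinations` enumerates $k$-subsets of a sorted range in lexicographic order; the function names are illustrative assumptions rather than notation from the paper.

```python
from itertools import combinations, islice


def lex_initial_segment(n, k, s):
    """First s sets of binom([n], k) in lexicographic order."""
    return [frozenset(c) for c in islice(combinations(range(1, n + 1), k), s)]


def dp_sets(family):
    """Number of disjoint (unordered) pairs of sets in the family."""
    return sum(1 for a, b in combinations(family, 2) if not (a & b))


def dp_lex(n, k, s):
    """dp(n, k, s): disjoint pairs in the lexicographic initial segment."""
    return dp_sets(lex_initial_segment(n, k, s))
```

With $n = 6$ and $k = 2$, the first five sets form the star with centre 1, so `dp_lex(6, 2, 5)` is zero; each further set from the second star is disjoint from three sets of the first.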
Note that, for fixed $r$, the result in [10] requires $k = O(n^{1/3})$. Frankl, Kohayakawa and Rödl [23] showed that initial segments of the lexicographic order are asymptotically optimal even for larger uniformities $k$. In our next result, we extend the exact results to larger $k$ as well, showing that the lexicographic initial segments are still optimal when $k = O(n^{1/2})$.
Theorem 1.3. There is some absolute constant $C$ such that if $n \ge Ck^2r^3$ and $s \le \binom{n}{k} - \binom{n-r}{k}$, then any family $\mathcal{F} \subseteq \binom{[n]}{k}$ with $|\mathcal{F}| = s$ satisfies $\mathrm{dp}(\mathcal{F}) \ge \mathrm{dp}(n, k, s)$; that is, $\mathcal{L}(n, k, s)$ minimises the number of disjoint pairs.
With our next results, we address a different variation of classic extremal problems. Rather than considering the supersaturation phenomenon, we describe the typical structure of set families with a given property, showing that almost all such families are subfamilies of the trivial extremal constructions.
We first consider the famous Erdős Matching Conjecture concerning the largest $k$-uniform set families over a ground set of size $n$ that have no matching of size $s$. There are two constructions that trivially avoid a matching of size $s$: a clique on $ks - 1$ vertices, and the family of all edges intersecting a set of size $s - 1$. In [19], Erdős conjectured that one of these constructions is always optimal.
Conjecture 1.4 (Erdős [19], 1965). Given integers $n$, $k$ and $s$, let $\mathcal{F} \subseteq \binom{[n]}{k}$ be a set family with no matching of size $s$. Then $|\mathcal{F}| \le \max\left\{\binom{ks-1}{k}, \binom{n}{k} - \binom{n-s+1}{k}\right\}$.
Frankl [22] proved the conjecture in the range $n \ge (2s-1)k - s + 1$, showing that the extremal families can be covered by $s - 1$ elements. Adapting the methods of Balogh et al. [3], we show that a slightly larger lower bound on $n$ guarantees that almost all families without a matching of size $s$ have a cover of size $s - 1$.
The $s = 2$ case corresponds to intersecting families. In this case, Balogh et al. [3] showed that when $n \ge (3 + o(1))k$, almost all intersecting families are trivial. Our final result improves the required bound on $n$ to the asymptotically optimal $n \ge (2 + o(1))k$. Indeed, when $n = 2k$, the number of intersecting families is $3^{\frac{1}{2}\binom{n}{k}} = 3^{\binom{n-1}{k-1}}$, since we can freely choose at most one set from each complementary pair of $k$-sets $\{A, [n] \setminus A\}$.
Theorem 1.6. There exists a positive constant $C$ such that for $k \ge 2$ and $n \ge 2k + C\sqrt{k \ln k}$, almost all intersecting families in $\binom{[n]}{k}$ are trivial. In particular, the number of intersecting families in $\binom{[n]}{k}$ is $(n + o(1))2^{\binom{n-1}{k-1}}$, where the term $o(1)$ tends to 0 as $n \to \infty$.
Remark: During the preparation of this paper, Theorem 1.6 (with a superior constant $C = 2$) was proven independently by Frankl and Kupavskii [25] using different methods.
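The count of intersecting families in the boundary case $n = 2k$ can be verified by brute force for the smallest instance. This is only a sketch; `count_intersecting_families` is a hypothetical helper, and the empty family is counted as intersecting.

```python
from itertools import combinations
from math import comb


def count_intersecting_families(n, k):
    """Count all intersecting families (including the empty one) in binom([n], k)."""
    sets = [frozenset(c) for c in combinations(range(1, n + 1), k)]
    count = 0
    for mask in range(1 << len(sets)):
        family = [s for i, s in enumerate(sets) if (mask >> i) & 1]
        if all(a & b for a, b in combinations(family, 2)):
            count += 1
    return count
```

For $n = 4$ and $k = 2$ there are three complementary pairs of 2-sets, so the count should equal $3^{\binom{3}{1}} = 27$.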
Outline and notation. The rest of the paper is organised as follows. We discuss families of permutations in Section 2, in particular proving the supersaturation result of Theorem 1.2. Section 3 is devoted to supersaturation for set families and the proof of Theorem 1.3. In Section 4, we address the typical structure of families, proving Theorems 1.5 and 1.6. Section 5 contains some concluding remarks, including a counterexample to the Bollobás-Leader conjecture.
We use standard set-theoretic and asymptotic notation. We write $\binom{X}{k}$ for the family of all $k$-element subsets of a set $X$. Given two functions $f$ and $g$ of some underlying parameter $n$, if $\lim_{n \to \infty} f(n)/g(n) = 0$, we write $f = o(g)$. For $a, b, c \in \mathbb{R}^+$, we write

Supersaturation for families of permutations
In this section, we study the supersaturation problem concerning the number of disjoint pairs in a family of permutations. Our main tool is a removal lemma for disjoint pairs of permutations, showing that families with relatively few disjoint pairs are close to unions of cosets. We start by collecting some basic facts.
2.1. The derangement graph. Let $S_n$ be the symmetric group on $[n]$. A permutation $\tau \in S_n$ is called a derangement if $\tau(i) \neq i$ for every $i \in [n]$. Let $\mathcal{D}_n$ be the set of all derangements in $S_n$. Denote by $\Gamma_n$ the derangement graph on $S_n$, that is, $\sigma \sim \pi$ if $\sigma \cdot \tau = \pi$ for some $\tau \in \mathcal{D}_n$. In other words, $\sigma$ and $\pi$ are adjacent in $\Gamma_n$ if and only if they are disjoint.
We denote by $d_n$ the number of derangements in $S_n$. By construction, $\Gamma_n$ is a $d_n$-regular graph. A standard application of the inclusion-exclusion principle shows $d_n = n! \sum_{i=0}^{n} \frac{(-1)^i}{i!}$. We also introduce the notation $D_n = d_n + d_{n-1}$, which we will use to keep track of disjoint pairs in certain subgraphs of the derangement graph. For instance, consider the subgraph of $\Gamma_n$ induced by two disjoint cosets $T_{(i_1,j)}$ and $T_{(i_2,j)}$. Since the cosets are intersecting families, they are independent sets in $\Gamma_n$, and so $\Gamma_n[T_{(i_1,j)}, T_{(i_2,j)}]$ is bipartite. For any $\sigma \in T_{(i_1,j)}$ and any neighbour $\pi = \sigma \cdot \tau$, where $\tau$ is a derangement, we have $\pi \in T_{(i_2,j)}$ if and only if $\tau(i_2) = i_1$. It is straightforward to see that there are $d_{n-2}$ such derangements $\tau$ with $\tau(i_1) = i_2$ and $d_{n-1}$ such derangements with $\tau(i_1) \neq i_2$. As a result, every vertex of the bipartite graph has the same degree $d_{n-2} + d_{n-1} = D_{n-1}$.
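The degree computations in the derangement graph can be checked numerically for small $n$. The following sketch assumes the standard inclusion-exclusion formula $d_n = \sum_{i=0}^{n} (-1)^i \, n!/i!$ and represents permutations as tuples; all function names are hypothetical.

```python
from itertools import permutations
from math import factorial


def derangement_count(n):
    """d_n via inclusion-exclusion: sum over i of (-1)^i * n!/i!."""
    return sum((-1) ** i * (factorial(n) // factorial(i)) for i in range(n + 1))


def disjoint(sigma, pi):
    """Adjacency in the derangement graph: no common position."""
    return all(a != b for a, b in zip(sigma, pi))


def coset(n, i, j):
    """All permutations of [n] mapping i to j."""
    return [p for p in permutations(range(1, n + 1)) if p[i - 1] == j]


def cross_degrees(n, first, second):
    """Set of degrees from vertices of one coset into another coset."""
    A, B = coset(n, *first), coset(n, *second)
    return {sum(disjoint(s, p) for p in B) for s in A}
```

In $S_5$, the bipartite graph between the disjoint cosets $T_{(1,3)}$ and $T_{(2,3)}$ should be regular of degree $D_4 = d_4 + d_3$, which `cross_degrees(5, (1, 3), (2, 3))` confirms by returning a single value.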
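We will now prove a removal lemma for disjoint pairs of permutations, which states that any family $\mathcal{F} \subseteq S_n$ of size $s \approx k(n-1)!$ with $\mathrm{dp}(\mathcal{F}) \approx \mathrm{dp}(\mathcal{T}(n, s))$ must be 'close' to a union of $k$ cosets.
Lemma 2.1. There exist positive constants $C$ and $c$ such that the following holds for sufficiently large $n$. Let $1 \le k < n/2$ be an integer, and let $\varepsilon \in \mathbb{R}$ and $\beta \in \mathbb{R}^+$ be such that $\max\{|\varepsilon|, \beta\} \le ck$. If $\mathcal{F} \subseteq S_n$ is a family of size $s = (k + \varepsilon)(n-1)!$ and $\mathrm{dp}(\mathcal{F}) \le \mathrm{dp}(\mathcal{T}(n, s)) + \beta(n-1)! D_{n-1}$, then there is some union $\mathcal{G}$ of $k$ cosets with the property that
In the proof of Lemma 2.1 we shall use a stability result due to Ellis, Filmus and Friedgut [15, Theorem 1]. To state their theorem we need some additional notation. We equip $S_n$ with the uniform distribution. Then, for any function $f : S_n \to \mathbb{R}$, the expected value of $f$ is defined by $\mathbb{E}[f] = \frac{1}{n!} \sum_{\sigma \in S_n} f(\sigma)$. The inner product of two functions $f, g : S_n \to \mathbb{R}$ is defined as $\langle f, g \rangle = \mathbb{E}[fg] = \frac{1}{n!} \sum_{\sigma \in S_n} f(\sigma) g(\sigma)$; this induces the norm $\|f\| = \sqrt{\langle f, f \rangle}$. Given $c > 0$, let $\mathrm{round}(c)$ denote the nearest integer to $c$.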
Theorem 2.2 (Ellis, Filmus and Friedgut). There exist positive constants C 0 and δ 0 such that the following holds. Let F be a subfamily of S n with |F| = α(n − 1)! for some α ≤ n/2. Let f = 1 F be the characteristic function of F and let f U 1 be the orthogonal where g is the characteristic function of a union of round(α) cosets of S n .
We now derive the removal lemma from Theorem 2.2.
Proof of Lemma 2.1. Set c = min{ δ 0 12 , 1 2 } and C = 3C 0 , where δ 0 and C 0 are the positive constants from Theorem 2.2. Let f be the characteristic vector of F. Write f = f 0 + f 1 + f 2 , where f i is the projection of f onto the λ i -eigenspace for i = 0, 1. By the orthogonality of the eigenspaces, Since f is Boolean, Let A be the adjacency matrix of the derangement graph Γ n . Then Dividing both sides by n!, we obtain the following inequalities when n ≥ 4K: On the other hand, by assumption we have dp(F) ≤ dp(T (n, s)) + β(n − 1)!D n−1 Combined with (6), we get Moreover, Since 6(|ε|+β) k ≤ 12c ≤ δ 0 , we may apply Theorem 2.2 to conclude that there exists a union G of k cosets in S n such that This gives |F∆G| = E[(f − 1 G ) 2 ] · n! ≤ Ck 2 1 n + 6(|ε|+β) k (n − 1)!, completing our proof.
We will use this removal lemma to prove Theorem 1.2 (a supersaturation result for disjoint pairs in $S_n$) in Subsection 2.4. However, from the proof above we can immediately deduce that for any¹ $1 \le k \le n$, the union $\mathcal{T}$ of $k$ pairwise disjoint cosets minimises the number of disjoint pairs among all families of $k(n-1)!$ permutations.
Proof. Let $\mathcal{F} \subseteq S_n$ be an extremal family of size $k(n-1)!$, and let $f = 1_{\mathcal{F}}$. By (3), we must have $\mathrm{dp}(\mathcal{F}) \le \mathrm{dp}(\mathcal{T}) = \binom{k}{2}(n-1)! D_{n-1}$. Hence, as in the proof of Lemma 2.1, we can use (7) with $\varepsilon = \beta = 0$, and so $\|f_2\|^2 = 0$. It follows from (6) that $\mathrm{dp}(\mathcal{F}) \ge \binom{k}{2}(n-1)! D_{n-1} = \mathrm{dp}(\mathcal{T})$, showing that $\mathcal{T}$ minimises the number of disjoint pairs.
¹ In the proof of Lemma 2.1, we used that $n$ was sufficiently large to bound $\frac{K d_n}{n^2} \|f_2\|^2$. However, in Proposition 2.3, we have $\|f_2\|^2 = 0$, and so do not require $n$ to be large.
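The value $\binom{k}{2}(n-1)! D_{n-1}$ for a union of $k$ pairwise disjoint cosets can be confirmed by enumeration in small cases. This sketch takes the cosets $T_{(1,1)}, \dots, T_{(1,k)}$ as the disjoint union (an assumption made for illustration) and uses hypothetical helper names.

```python
from itertools import combinations, permutations
from math import comb, factorial


def disjoint(sigma, pi):
    return all(a != b for a, b in zip(sigma, pi))


def dp(family):
    """Number of disjoint (unordered) pairs of permutations."""
    return sum(disjoint(s, p) for s, p in combinations(family, 2))


def union_of_disjoint_cosets(n, k):
    """Permutations of [n] sending 1 to one of 1, ..., k."""
    return [p for p in permutations(range(1, n + 1)) if p[0] <= k]


def predicted_dp(n, k, D):
    """binom(k, 2) * (n-1)! * D, with D standing for D_{n-1}."""
    return comb(k, 2) * factorial(n - 1) * D
```

For $n = 5$ we have $D_4 = d_4 + d_3 = 9 + 2 = 11$, and for $n = 4$ we have $D_3 = d_3 + d_2 = 3$; the enumeration agrees with the formula in both cases.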

Intersection graphs.
The removal lemma states that families with relatively few disjoint pairs must be close to unions of cosets. While this describes their large-scale structure, it falls short of determining the finer details of such families. As we have observed previously, certain pairs of cosets are disjoint, while other pairs share a small number of permutations. In order to keep track of this information, we introduce the notion of an intersection graph. Given a union $\mathcal{G} = T_{(i_1,j_1)} \cup \ldots \cup T_{(i_k,j_k)}$ of $k$ different cosets in $S_n$, its intersection graph is the graph with vertex set $\{(i_1, j_1), \ldots, (i_k, j_k)\}$ and edges between pairs corresponding to cosets with non-empty intersection. As remarked before Definition 1.1, we therefore have $(i, j) \sim (i', j')$ if and only if $i \neq i'$ and $j \neq j'$; that is, when these vertices do not lie on a common axis-aligned line in $\mathbb{Z}^2$.
For Theorem 1.2, we need to show that pairwise disjoint cosets minimise the number of disjoint pairs. To that end, we call a union $\mathcal{G}$ of $k$ cosets canonical if at least $k - 1$ of its cosets are pairwise disjoint. In terms of the intersection graph $G$ of $\mathcal{G}$, this means there is an axis-aligned line containing at least $v(G) - 1$ vertices. For example, when $s = k(n-1)!$, the lexicographic family $\mathcal{T}(n, s)$ is canonical, as all the vertices $(i, j)$ of its intersection graph lie on the line $i = 1$.
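The intersection graph and the canonical property are simple to compute from the list of index pairs. The sketch below encodes a union of cosets by its list of pairs $(i, j)$; the function names are illustrative assumptions, and `is_canonical` implements the axis-aligned-line formulation stated above.

```python
from collections import Counter
from itertools import combinations


def intersection_graph_edges(pairs):
    """Edges join cosets with non-empty intersection: both coordinates differ."""
    return [(a, b) for a, b in combinations(pairs, 2) if a[0] != b[0] and a[1] != b[1]]


def is_canonical(pairs):
    """True if some axis-aligned line contains at least len(pairs) - 1 vertices."""
    rows = Counter(i for i, _ in pairs)
    cols = Counter(j for _, j in pairs)
    return max(max(rows.values()), max(cols.values())) >= len(pairs) - 1
```

For example, the pairs $(1,1), (1,2), (1,3)$ give an edgeless intersection graph and a canonical union, whereas $(1,1), (2,2), (3,3)$ give a triangle and a non-canonical union.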
Our next proposition, central to the proof of Theorem 1.2, describes how the intersection graph of a union $\mathcal{G}$ of cosets can be used to bound the size of $\mathcal{G}$ and the number of disjoint pairs it is involved in. For this we require some further notation. Given a graph $G$ and an integer $t \ge 1$, we denote by $k_t(G)$ the number of $t$-cliques in $G$. In particular, we have $k_1(G) = v(G)$ and $k_2(G) = e(G)$. When the graph $G$ is clear from context, we write $k_t$ for $k_t(G)$.

Proposition 2.4.
There is some $c > 0$ such that if, for $2 \le k_1 \le cn^{1/2}$, $\mathcal{G}$ is the union of $k_1$ cosets in $S_n$ with intersection graph $G$, then the following hold: and (d) $\mathrm{dp}(\mathcal{G}) \ge \mathrm{dp}(\mathcal{T}(n, |\mathcal{G}|))$, with equality if and only if $\mathcal{G}$ is canonical.
The proof of Proposition 2.4, though elementary, is rather technical, involving careful and repeated application of the Bonferroni inequalities to estimate the number of permutations in a union of cosets that are disjoint from a given permutation. We therefore defer the proof to Appendix A, and instead proceed to show how the proposition can be combined with Lemma 2.1 to prove Theorem 1.2.

Supersaturation.
Here we prove Theorem 1.2. Our strategy is to use Lemma 2.1 to reduce the statement to the case when F is a union of some cosets in S n , and then apply Proposition 2.4 to obtain the desired lower bound on the number of disjoint pairs.
Proof of Theorem 1.2. Let $c_{2.1}$ and $C_{2.1}$ be the positive constants from Lemma 2.1, and set $c = \min\{c_{2.1}, 10^{-5} C_{2.1}^{-2}, 10^{-2}\}$. Now letting $n$, $k$ and $\varepsilon$ be as in the statement of the theorem, let $\mathcal{F} \subseteq S_n$ be an extremal family of $s = (k + \varepsilon)(n-1)!$ permutations. In the first part of our proof, we establish Claim 2.5, a rough structural result for $\mathcal{F}$.
Claim 2.5. Either $\mathcal{F}$ contains $k$ cosets or $\mathcal{F}$ is contained in a union of $k$ cosets.
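Proof. Since $|\varepsilon| \le ck^{-3} \le c_{2.1}$ and $\mathrm{dp}(\mathcal{F}) \le \mathrm{dp}(\mathcal{T}(n, s))$ by the extremality of $\mathcal{F}$, we may apply Lemma 2.1 to $\mathcal{F}$ with $\beta_{2.1} = 0$ to find a union $\mathcal{G} = \bigcup_{i=1}^{k} T_i$ of $k$ cosets in $S_n$ such that
Let $\mathcal{A} = \mathcal{F} \setminus \mathcal{G}$ and $\mathcal{B} = \mathcal{G} \setminus \mathcal{F}$. We may assume that $\mathcal{A} \neq \emptyset$ and $\mathcal{B} \neq \emptyset$, as otherwise either $\mathcal{F} \subseteq \mathcal{G}$ or $\mathcal{G} \subseteq \mathcal{F}$, as claimed. We shall show that if the permutations in $\mathcal{A}$ are replaced by those in $\mathcal{B}$, the number of disjoint pairs decreases, which then contradicts the extremality of $\mathcal{F}$. Fix two arbitrary permutations $\sigma \in \mathcal{A}$ and $\pi \in \mathcal{B}$. It suffices to show that $\mathrm{dp}(\sigma, \mathcal{F}) > \mathrm{dp}(\pi, \mathcal{F})$.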
First, using (8), Recall that any two cosets in S n have at most (n − 2)! elements in common, and that a permutation is disjoint from D n−1 other permutations in any coset not containing it.
Next, we combine this claim with Proposition 2.4 to bound dp(F) from below and finish the proof. We consider two cases, depending on the sign of ε. Case 1: ε ≤ 0. We have shown in Claim 2.5 that either F ⊇ G or F ⊆ G, where G is a union of some k cosets. Let t = |G|.
We next deal with the case F ⊆ G. It is convenient to think of F as a family obtained by removing permutations in G one by one. Since G is a union of k cosets in S n , the number of disjoint pairs is decreased by at most (k − 1)D n−1 each time. Following the same process for the family T (n, t), we see that the number of disjoint pairs is decreased by exactly (k − 1)D n−1 each time we remove a permutation from the last coset in T (n, t). Moreover, at the beginning of the process, dp(G) ≥ dp(T (n, t)) by Proposition 2.4(d). Thus dp(F) ≥ dp(G) − (t − s)(k − 1)D n−1 ≥ dp(T (n, t)) − (t − s)(k − 1)D n−1 = dp(T (n, s)), completing the proof in Case 1. Case 2: ε > 0. This case will be handled rather differently. Since ε > 0, formula (3) gives If G is a union of k disjoint cosets, then dp(F) = dp(G) + dp(H, G) + dp(H) ≥ dp(G) + dp(H, G) = dp(T (n, s)), where equality holds if and only if dp(H) = 0, that is, H is intersecting. It remains to verify that dp(F) ≥ dp(T (n, s)) when the k cosets of G are not pairwise disjoint. In this scenario we in fact have a strict inequality. Indeed, let G be the intersection graph of G. We shall use the inequality dp(F) ≥ dp(G) + dp(H, G) to lower bound dp(F). By Proposition 2.4(c), we have We next estimate the number of disjoint pairs between H and G. By Proposition 2.4(b), for every π ∈ H. Furthermore, using Proposition 2.4(a) to estimate |G| gives Combining (10), (11) and (12), and simplifying gives dp(G) + dp(H, G) − dp(T (n, s)) . Thus dp(F) ≥ dp(G) + dp(H, G) > dp(T (n, s)), completing the proof of Theorem 1.2.

Supersaturation for uniform set systems
In this section, we shall prove Theorem 1.3, but first let us examine $\mathrm{dp}(n, k, s)$. When $s = \binom{n}{k} - \binom{n-r+1}{k} + \gamma\binom{n-r}{k-1}$ for some $\gamma \in (0, 1]$, $\mathcal{L}(n, k, s)$ consists of the full stars with centres in $[r-1]$, with a further $\gamma\binom{n-r}{k-1}$ sets from the star with centre $r$. Let $\mathcal{L}(i) = \{L \in \mathcal{L}(n, k, s) : i \in L\}$ and $\mathcal{L}^*(i) = \{L \in \mathcal{L}(n, k, s) : \min L = i\}$. One can then compute the number of disjoint pairs as
(13) $\mathrm{dp}(n, k, s) =$
This expression is quite unwieldy, so we shall make use of a few estimates. We first note that any set outside a star has exactly $\binom{n-k-1}{k-1}$ disjoint pairs with the star, so
$\mathrm{dp}(n, k, s) \le \sum_{1 \le i < j \le r-1} \mathrm{dp}(\mathcal{L}(i), \mathcal{L}(j)) + \sum_{1 \le i \le r-1} \mathrm{dp}(\mathcal{L}(i), \mathcal{L}^*(r))$
This is only an upper bound as we overcount disjoint pairs involving sets belonging to multiple stars. For an even simpler upper bound, observe that every set belongs to at least one of the $r$ stars, and is not disjoint from any other set in its star. In the worst case, there are an equal number of sets in each star, with each set disjoint from at most a $\left(1 - \frac{1}{r}\right)$-proportion of the family. We thus have
We shall use these upper bounds on the number of disjoint pairs present in any extremal family.
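The observation that a set outside a star has exactly $\binom{n-k-1}{k-1}$ disjoint pairs with the star can be checked by enumeration. This is a minimal sketch with a hypothetical helper name.

```python
from itertools import combinations
from math import comb


def disjoint_pairs_with_star(n, k, centre, outside_set):
    """Number of sets in the full star at `centre` that are disjoint from `outside_set`."""
    star = (frozenset(c) for c in combinations(range(1, n + 1), k) if centre in c)
    return sum(1 for member in star if not (member & outside_set))
```

The count depends only on $n$ and $k$: a set disjoint from `outside_set` and containing the centre must take its remaining $k - 1$ elements from the $n - k - 1$ elements outside both.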
3.1. Tools. There are two main tools we use in our proof of Theorem 1.3: a removal lemma for disjoint pairs, and the expander-mixing lemma applied to the Kneser graph. Before proving the theorem, we introduce these tools and explain how we shall use them.
3.1.1. Removal lemma. Using a result of Filmus [21], Das and Tran [11, Theorem 1.2] proved the following removal lemma, showing that large families with few disjoint pairs must be close to a union of stars. (20C) 2 n , then there is a family S that is the union of stars satisfying Observe that the bound on the number of disjoint pairs in the lemma is very similar to the upper bound given in (14). Thus one may interpret this result as a stability version of our previous calculation: any family with size similar to the union of r − 1 stars without many more disjoint pairs can be made a union of r − 1 stars by exchanging only a small number of sets. Given this stability, it is not difficult to show that the lexicographic ordering is optimal in this range.
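Corollary 3.2. There is some constant $c > 0$ such that if $r$, $k$ and $n$ are positive integers satisfying $n \ge 2c^{-1}k^2r^3$, and $s = \binom{n}{k} - \binom{n-r+1}{k} + \gamma\binom{n-r}{k-1}$ for some $\gamma \in (0, \frac{c}{r}]$, then any family $\mathcal{F} \subseteq \binom{[n]}{k}$ of size $s$ has $\mathrm{dp}(\mathcal{F}) \ge \mathrm{dp}(n, k, s)$.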
Proof. Let $C$ be the constant from Lemma 3.1 and choose $c = \frac{n-2k}{2(20C)^2 n}$. For the given range of $s$, the lexicographic initial segment has $r - 1$ full stars with one small partial star, so we wish to apply Lemma 3.1 with $\ell = r - 1$.
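Let $\mathcal{F}$ be a subfamily of $\binom{[n]}{k}$ with $s$ sets and the minimum number of disjoint pairs. Note that, with $\ell = r - 1$, we have $s = (\ell + \alpha)\binom{n-1}{k-1}$, where $\gamma - \frac{kr^2}{2n} \le \alpha \le \gamma$. In particular, we have $|\alpha| \le \frac{c}{r}$.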
By optimality of F, and our calculation in (14), we also have dp(F) ≤ dp(n, k, s) ≤ 2 + (r − 1)γ n−1 k−1 n−k−1 k−1 , and hence take β = (r − 1)γ. We thus have |β| = (r − 1)γ ≤ c < n−2k (20C) 2 n and 2 |α| ≤ 2c = n−2k (20C) 2 n , and hence we may apply Lemma 3.1. This gives a family S, a union of stars, such that Hence, we know an optimal family F must be close to a union of stars S. We first show that S ⊆ F. If not, there is some set F ∈ F \ S in our family, as well as a set G ∈ S \ F missing from our family (note that |F| ≥ |S|). For each star in S, there are at most n−1 On the other hand, the set G is in one of the stars of S, which contains at least strictly increases the number of intersecting pairs, thus decreasing the number of disjoint pairs, contradicting the optimality of F.
Since every set outside a union of stars is contained in exactly the same number of disjoint pairs with sets from the stars, the terms $\mathrm{dp}(\mathcal{S})$ and $\mathrm{dp}(\mathcal{S}, \mathcal{H})$ are determined by $\ell$ and $s$, and independent of the structure of $\mathcal{F}$. It follows that $\mathrm{dp}(\mathcal{F})$ is minimised precisely when $\mathrm{dp}(\mathcal{H})$ is minimised. As $|\mathcal{H}| = |\mathcal{F}| - |\mathcal{S}| = \gamma\binom{n-r}{k-1} \le \binom{n-r}{k-1}$, we may take $\mathcal{H}$ to be an intersecting family, and so $\mathrm{dp}(\mathcal{H}) = 0$ is possible. Since in $\mathcal{L}(n, k, s)$, the set $\mathcal{H}$ corresponds to the final (intersecting) partial star, it follows that $\mathcal{L}(n, k, s)$ is optimal, and so $\mathrm{dp}(\mathcal{F}) \ge \mathrm{dp}(n, k, s)$ for any family $\mathcal{F}$ of $s$ sets.

3.1.2. Expander-mixing lemma. The second tool we shall use is the expander-mixing lemma of Alon and Chung [1], which relates the spectral gap of a $d$-regular graph to its edge distribution. Since the graph is $d$-regular, its largest eigenvalue is trivially $d$, corresponding to the constant eigenvector. In what follows, an $(n, d, \lambda)$-graph is a $d$-regular $n$-vertex graph whose largest non-trivial eigenvalue (in absolute value) is $\lambda$.
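As an illustration of the expander-mixing lemma, the sketch below tests it numerically on a small Kneser graph. It assumes the standard form of the lemma, $\left| e(A, B) - \frac{d|A||B|}{N} \right| \le \lambda\sqrt{|A||B|}$ for an $(N, d, \lambda)$-graph, where $e(A, B)$ counts ordered adjacent pairs, and the standard fact that the Kneser graph $KG(n, k)$ has degree $\binom{n-k}{k}$ and largest non-trivial eigenvalue (in absolute value) $\binom{n-k-1}{k-1}$. The function names are hypothetical.

```python
from itertools import combinations
from math import comb, sqrt


def kneser_vertices(n, k):
    """Vertices of the Kneser graph KG(n, k): all k-subsets of [n]."""
    return [frozenset(c) for c in combinations(range(1, n + 1), k)]


def edges_between(A, B):
    """Number of ordered pairs (a, b) in A x B with a and b disjoint."""
    return sum(1 for a in A for b in B if not (a & b))


def mixing_deviation(n, k, A, B):
    """|e(A, B) - d|A||B|/N| for the Kneser graph KG(n, k)."""
    N, d = comb(n, k), comb(n - k, k)
    return abs(edges_between(A, B) - d * len(A) * len(B) / N)


def mixing_bound(n, k, A, B):
    """lambda * sqrt(|A||B|) with lambda = binom(n-k-1, k-1)."""
    return comb(n - k - 1, k - 1) * sqrt(len(A) * len(B))
```

For any vertex subsets `A` and `B` of `kneser_vertices(7, 2)`, the value `mixing_deviation(7, 2, A, B)` should not exceed `mixing_bound(7, 2, A, B)`.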
As we are interested in counting disjoint pairs, we shall apply the expander-mixing lemma to the Kneser graph, where the vertices are sets and edges represent disjoint pairs. The spectral properties of the Kneser graph were determined by Lovász [35]. In particular, Proof. Without loss of generality we assume i = n − 1 and j = n. Let A = {F \ {n − 1} : F ∈ F(n − 1), n / ∈ F } and B = {F \ {n} : F ∈ F(n), n − 1 / ∈ F }, and observe that dp(F(n − 1), F(n)) = dp (A, B). Furthermore, we have A, B ⊆ [n −2] k−1 , with |A| ≥ |F(n − 1)| − n−2 k−2 and |B| ≥ |F(n)| − n−2 k−2 . Since disjoint pairs between A and B correspond to edges between the corresponding vertex sets in the Kneser graph KG(n − 2, k − 1), Lemma 3.3 gives dp(F(n − 1), F(n)) = dp(A, B) ≥ We now recall that |F(n − 1)| − n−2 k−2 ≤ |A| ≤ |F(n − 1)|, with similar bounds holding for B. We shall also remove the square root by appealing to the AM-GM inequality. Also observe that n−k−1 Noting that n−2 k−2 ≤ k n n−1 k−1 , taking the main order term and collecting the negative terms then gives the desired bound. Hence dp(L(n, k, s)) = 0, which is clearly optimal.
For the induction step, we have $s = \binom{n}{k} - \binom{n-r+1}{k} + \gamma\binom{n-r}{k-1}$ for some $r \ge 2$ and $\gamma \in (0, 1]$. Letting $c$ be the positive constant from Corollary 3.2, if $\gamma \in (0, \frac{c}{r}]$, we are done. Hence we may assume $\gamma \in (\frac{c}{r}, 1]$. Let $\mathcal{F}$ be a $k$-uniform set family over $[n]$ of size $s$ with the minimum number of disjoint pairs. In particular, we must have $\mathrm{dp}(\mathcal{F}) \le \mathrm{dp}(n, k, s)$.
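For any set $F \in \mathcal{F}$, by the induction hypothesis we have $\mathrm{dp}(\mathcal{F} \setminus \{F\}) \ge \mathrm{dp}(n, k, s-1)$. Hence $\mathrm{dp}(\{F\}, \mathcal{F}) = \mathrm{dp}(\mathcal{F}) - \mathrm{dp}(\mathcal{F} \setminus \{F\}) \le \mathrm{dp}(n, k, s) - \mathrm{dp}(n, k, s-1)$, where the right-hand side is the number of disjoint pairs involving the last set $L$ added to $\mathcal{L}(n, k, s)$. The set $L$ is in a star of size $\gamma\binom{n-r}{k-1} > \frac{\gamma}{2}\binom{n-1}{k-1}$, and hence intersects at least $\frac{\gamma}{2}\binom{n-1}{k-1}$ sets in $\mathcal{L}(n, k, s)$. Thus it follows that every set $F \in \mathcal{F}$ must also intersect at least $\frac{\gamma}{2}\binom{n-1}{k-1}$ sets in $\mathcal{F}$.
Now suppose that $\mathcal{F}$ contains a full star; without loss of generality, assume $\mathcal{F}(1)$ consists of all $\binom{n-1}{k-1}$ sets containing the element 1. Let $\mathcal{G} = \mathcal{F} \setminus \mathcal{F}(1)$. Since $\mathcal{F}(1)$ is intersecting, and every set outside $\mathcal{F}(1)$ has exactly $\binom{n-k-1}{k-1}$ disjoint pairs with sets in $\mathcal{F}(1)$, we have $\mathrm{dp}(\mathcal{F}) = \mathrm{dp}(\mathcal{F}(1), \mathcal{G}) + \mathrm{dp}(\mathcal{G}) = |\mathcal{G}|\binom{n-k-1}{k-1} + \mathrm{dp}(\mathcal{G})$.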
Now $\mathcal{G}$ is a $k$-uniform set family over $[n] \setminus \{1\}$ of size $s' = s - \binom{n-1}{k-1}$, and so by induction $\mathrm{dp}(\mathcal{G})$ is minimised by the initial segment of the lexicographic order of size $s'$. However, adding back the full star $\mathcal{F}(1)$ gives the initial segment of the lexicographic order of size $s$, and as a result $\mathrm{dp}(\mathcal{F}) \ge \mathrm{dp}(\mathcal{L}(n, k, s)) = \mathrm{dp}(n, k, s)$.
Hence we may assume that F does not contain any full star. In particular, this means for any set F ∈ F and element i ∈ [n], we have the freedom to replace F with some set containing i. We shall use such switching operations to show that F, like L(n, k, s), must have a cover of size r, from which the result will easily follow.
Relabel the elements if necessary so that for every i ∈ [n], i is the vertex of maximum degree in F| [n]\[i −1] . Let F * (i) = {F ∈ F : min F = i} be those sets containing i that do not contain any previous element. Define We shall show that X is a cover for F (that is, F 1 = F and F 2 = ∅), but to do so we shall first have to establish a few claims. The first shows that X cannot be too big.
Proof. Observe that the families {F * (x) : x ∈ X} partition F 1 . Hence we have from which the claim immediately follows.
The next claim asserts that every set in F must intersect many sets in F 1 . Proof. First observe that any element i ∈ [n] is contained in fewer than γ 4k n−1 k−1 sets in F 2 . Indeed, the elements x ∈ X have all of their sets in F 1 , and hence have F 2 -degree zero. Thus the F 2 -degree of any element is its degree in F| [n]\X . If the element of largest F 2 -degree was contained in at least γ 4k n−1 k−1 sets from F 2 , then it would have been in X, giving a contradiction. Now recall that every set F ∈ F must intersect at least γ 2 n−1 k−1 sets in F. The number of sets in F 2 it can intersect is at most Hence the remaining γ 4 n−1 k−1 intersections must come from sets in F 1 .
The following claim combines our previous results with the expander-mixing corollary to provide much sharper bounds on the size of X. Proof. For every i ∈ X, we shall estimate dp(F * (i), F 1 ). Since {F * (x) : x ∈ X} is a partition of F 1 , we have dp(F * (i), F 1 ) = j∈X\{i} dp(F * (i), F * (j)). Applying Corollary 3.4, we get dp(F * (i), F 1 ) = j∈X\{i} dp(F * (i), F * (j)) By averaging, some F ∈ F * (i) is disjoint from at least n−1 k−1 , we can lower bound this expression by Recalling that γ ≥ c r , we find that F intersects at most sets from F 1 . By Claim 3.6, this quantity must be at least γ 4 n−1 k−1 , which gives since n > Ck 2 r 3 for some large enough constant C. Hence for every i ∈ X, we in fact have the much stronger bound |F * (i)| ≥ γ Our next claim shows that X is indeed a cover for F. Claim 3.8. X is a cover for F; that is, F 1 = F and F 2 = ∅.
Proof. Suppose for contradiction we had some set F ∈ F 2 . By Claim 3.6, at least γ 4 n−1 k−1 sets in F 1 must intersect F . However, each such set must contain at least one element of X, which by Claim 3.7 has size at most 8r γ , together with one element from F . Hence there are at most k |X| n−2 k−2 ≤ 8k 2 r γn n−1 k−1 sets in F 1 intersecting F . Since γ ≥ c r and n > Ck 2 r 3 for some large enough constant C, this is less than γ k−1 ≤ |F * (i)| sets in F * (i) meet X in at least two elements, and thus there must be some set F i ∈ F * (i) such that F i ∩ X = {i}. We shall use this fact to establish the following claim. Proof. Suppose for contradiction |F * (j)| > |F * (i)| + 8k 2 r γn n−1 k−1 . Let F i ∈ F * (i) be such that F i ∩ X = {i}. Then F i intersects only the sets that contain i together with sets containing some other element in X and some element in F i . This gives a total of at most sets. On the other hand, if we replace F i by some set G containing j (which we may do, since we assume the family F(j) is not a full star), we would gain at least |F * (j)| intersecting pairs. Hence F ∪ {G} \ {F i } is a family of s sets with strictly fewer disjoint pairs, contradicting the optimality of F.
This claim shows that the sets in F are roughly equally distributed over the families F * (i), i ∈ X. To simplify the notation, we let m = |X|, and so we have X = [m]. By Claim 3.7, m ≤ 8r γ . We shall now proceed to lower-bound the number of disjoint pairs in F. Note that dp(F) = 1≤i<j≤m dp(F * (i), F * (j)). We shall use Corollary 3.4 to bound these summands. We let s i = |F * (i)| n−1 k−1 −1 and set s = s n−1 Since i s i = s, there must be some with s ≤ s m , and Claim 3.9 then implies that for every i, Claim 3.10. |X| = r; that is, F has a cover of size r.
These two bounds together imply $\frac{1}{m} + \frac{1}{r(r+1)} > \frac{1}{r}$, which in turn gives $m < r + 1$. This shows $m = r$, and $X$ is thus a cover of size $r$.
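The arithmetic behind the last step is that $\frac{1}{r} - \frac{1}{r(r+1)} = \frac{1}{r+1}$, so the inequality is equivalent to $\frac{1}{m} > \frac{1}{r+1}$. The following sketch checks this with exact rational arithmetic; `largest_admissible_m` is a hypothetical helper and the search limit is an arbitrary assumption.

```python
from fractions import Fraction


def largest_admissible_m(r, limit=1000):
    """Largest integer m < limit with 1/m + 1/(r(r+1)) > 1/r."""
    return max(
        m
        for m in range(1, limit)
        if Fraction(1, m) + Fraction(1, r * (r + 1)) > Fraction(1, r)
    )
```

For every positive integer $r$ within the search range, the largest admissible value is $m = r$, in agreement with $m < r + 1$.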
Hence it follows that F is covered by some r elements, which we may without loss of generality assume to be [r]. We now finish with a similar argument as in the proof of Corollary 3.2: let S be the union of the r stars with centres in [r], and let G = S \ F be the missing sets. Then dp(F) = dp(S) − dp(G, S) + dp(G) is minimised when G is an intersecting family of sets that each meet [r] in precisely one element, which is the case for F = L(n, k, s). Hence dp(F) ≥ dp(n, k, s), completing the proof of the theorem.
The problem of minimising the number of disjoint pairs can be viewed as an isoperimetric inequality in the Kneser graph. The following lemma links isoperimetric problems for small and large families (see, for instance, [10, Lemma 2.3]). The following corollary, which is a direct consequence of Theorem 1.3 and Lemma 3.11, shows that the complements of the lexicographical initial segments, which are isomorphic to initial segments of the colexicographical order, are optimal when $s$ is close to $\binom{n}{k}$.
the number of families with property $P$ is $(T + o(1))2^{N_0}$.
We will apply Lemma 4.1 with $P$ being the property of avoiding a matching of size $s$ or, equivalently, of not containing $s$ pairwise disjoint sets. To do so, we first bound the number of maximal families with no matching of size $s$.  $\mathcal{F}_0$ such that $F_i, G_{i,1}, \ldots, G_{i,s-2}$ and $G_{i,s-1}$ are pairwise disjoint, while for every $j \ne i$, $F_j, G_{i,1}, \ldots, G_{i,s-2}$ and $G_{i,s-1}$ are not pairwise disjoint. In other words, if we let Given these conditions, we may apply the Bollobás set-pairs inequality [6] to bound the size of $\mathcal{F}_0$.
The conditions of Theorem 4.3 are satisfied, and hence we deduce $m \le \binom{sk}{k}$. We map each maximal family $\mathcal{F}$ to a minimal generating family $\mathcal{F}_0 \subset \mathcal{F}$. This map is injective because $I(\mathcal{F}_0) = \mathcal{F}$. We have shown that $|\mathcal{F}_0| \le \binom{sk}{k}$, and thus the number of maximal families is bounded from above by

Proof of Theorem 1.5. We shall verify that the condition (17)  . In addition, Proposition 4.2 shows that we may use the estimate $\log M \le n\binom{sk}{k}$. Altogether we have As $n \ge 2sk + 38s^4$ and $s \ge 2$, we find n−k−s+2 , and hence This implies as $s/(s+1) \ge 2/3$, $(k-2)/k \ge 1/3$ and $(n-k-s+2)/n \ge 3/4$. Substituting this inequality into (18), we obtain

4.2. Intersecting set systems. In this section we shall use the removal lemma for disjoint sets (Lemma 3.1) to show that intersecting set systems in $\binom{[n]}{k}$ are typically trivial when $n \ge 2k + C\sqrt{k \ln k}$ for some positive constant $C$. Since the number of trivial intersecting families is it suffices to prove that there are $o\big(2^{\binom{n-1}{k-1}}\big)$ non-trivial intersecting families. We need a few classic theorems from extremal set theory. The first is a theorem of Hilton and Milner [26], bounding the cardinality of a non-trivial uniform intersecting family.

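Theorem 4.4 (Hilton and Milner). Let $\mathcal{F} \subset \binom{[n]}{k}$ be a non-trivial intersecting family with $k \ge 2$ and $n \ge 2k + 1$. Then $|\mathcal{F}| \le \binom{n-1}{k-1} - \binom{n-k-1}{k-1} + 1$.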
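The Hilton-Milner bound $\binom{n-1}{k-1} - \binom{n-k-1}{k-1} + 1$ can be sanity-checked on small cases. The sketch below builds the standard extremal construction (all sets containing 1 that meet $\{2, \ldots, k+1\}$, together with $\{2, \ldots, k+1\}$ itself) and confirms by brute force that it is intersecting, non-trivial and of the stated size. The construction is not spelled out in the text above, and the function names are hypothetical, chosen for illustration only.

```python
from itertools import combinations
from math import comb


def hilton_milner_family(n, k):
    """Sets containing 1 that meet {2,...,k+1}, plus the set {2,...,k+1}."""
    base = frozenset(range(2, k + 2))
    fam = [frozenset(c) for c in combinations(range(1, n + 1), k)
           if 1 in c and base & set(c)]
    fam.append(base)
    return fam


def is_intersecting(fam):
    """True if every two members share at least one element."""
    return all(a & b for a, b in combinations(fam, 2))


def is_trivial(fam):
    """True if some element lies in every member (the family sits in a star)."""
    return bool(frozenset.intersection(*fam))


if __name__ == "__main__":
    n, k = 7, 3
    fam = hilton_milner_family(n, k)
    bound = comb(n - 1, k - 1) - comb(n - k - 1, k - 1) + 1
    print(len(fam), bound, is_intersecting(fam), is_trivial(fam))
```

The check is exhaustive over pairs, so it is only practical for small $n$ and $k$.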
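The next result we require is a theorem of Kruskal [32] and Katona [28]. For a family $\mathcal{F} \subset \binom{[n]}{r}$, its $s$-shadow in $\binom{[n]}{s}$, denoted $\partial^{(s)}\mathcal{F}$, is the family of those $s$-sets contained in some member of $\mathcal{F}$. For $x \in \mathbb{R}$ and $r \in \mathbb{N}$, we define the generalised binomial coefficient $\binom{x}{r} = \frac{x(x-1)\cdots(x-r+1)}{r!}$. The following convenient formulation of the Kruskal-Katona theorem is due to Lovász [34].

Theorem 4.5 (Lovász). If $\mathcal{F} \subset \binom{[n]}{r}$ with $|\mathcal{F}| = \binom{x}{r}$ for some real number $x \ge r$, then $|\partial^{(s)}\mathcal{F}| \ge \binom{x}{s}$.

With these results in hand, we now prove Theorem 1.6.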
Proof of Theorem 1.6. The statement has been established for $n \ge 3k + 8\ln k$ in [3, Theorem 1.4], and so we may assume $n = 2k + s$ for some integer $s$ with $C\sqrt{k \ln k} \le s \le k + 8\ln k$.
For each $\ell \in \mathbb{N}$, let $N_\ell$ denote the number of maximal non-trivial intersecting families of size $\binom{n-1}{k-1} - \ell$. By Theorem 4.4, we know $N_\ell = 0$ for $\ell < \binom{n-k-1}{k-1} - 1$. By taking a simple union bound over the subfamilies of these families, we can bound the number of non-trivial intersecting families by Hence it suffices to show (19) n( 2k We fix some integer $\ell$ with $\binom{n-k-1}{k-1} - 1 \le \ell \le n\binom{2k}{k}$, and fix some maximal intersecting family $\mathcal{F}$ of size $\binom{n-1}{k-1} - \ell$. Let $\mathcal{S}$ be the star that minimises $|\mathcal{F} \Delta \mathcal{S}|$, and without loss of generality assume that $n$ is the centre of $\mathcal{S}$. Let $\mathcal{A} = \mathcal{F} \setminus \mathcal{S}$, and $t = |\mathcal{A}|$. Let $\mathcal{B} = \mathcal{S} \setminus \mathcal{F}$, and note that $|\mathcal{B}| = t + \ell$. Let For the opposite direction, suppose $H \notin \partial^{(k-1)}\mathcal{P}$. Then, following the same argument as above, $(\{n\} \cup H) \cap A \ne \emptyset$ for all $A \in \mathcal{A}$. By maximality of $\mathcal{F}$, we must have $\{n\} \cup H \in \mathcal{F}$, and thus $\{n\} \cup H \notin \mathcal{B}$, resulting in $H \notin \mathcal{Q}$. We shall show that $\ell \ge 2nt$. First let us see why this implies (19). For each family $\mathcal{F}$ counted by $N_\ell$, it suffices to provide the star $\mathcal{S}$ and the family $\mathcal{A}$ outside the star. Indeed, since $\mathcal{Q} = \partial^{(k-1)}\mathcal{P}$, we can compute $\mathcal{F} \cap \mathcal{S}$, and hence completely determine $\mathcal{F}$. Moreover, $|\mathcal{A}| = t \le \ell/(2n)$. Thus It remains to show $\ell \ge 2nt$. Letting $\mathcal{P}$ and $\mathcal{Q}$ be as above, recall that $\mathcal{Q} = \partial^{(k-1)}\mathcal{P}$. According to Theorem 4.5, if $x$ is a real number so that $t = |\mathcal{P}| = \binom{x}{n-k-1}$, then $\ell + t = |\mathcal{Q}| \ge \binom{x}{k-1}$. Now observe that by Lemma 3.1, we have $t \le C'n\ell$ for some absolute constant $C'$. Since $\ell \le n\binom{2k}{k}$, this implies $t \le C'n^2\binom{2k}{k}$. Since $n = 2k + s$, we have $t = \binom{x}{n-k-1} = \binom{x}{k+s-1}$. We next show that $x < 2k + \frac{3}{4}s$. If not, then The bases of the exponential factors are minimised when $s$ is as large as possible; substituting $s \le k + 8\ln k < 1.1k$, we can lower bound the coefficient of $\binom{2k}{k}$ by 2. as $s > C\sqrt{k \ln k} \ge 100\ln n$, contradicting our upper bound $t \le C'n^2\binom{2k}{k}$. Suppose, then, that $x \le 2k + \frac{3}{4}s - 1$.
Since $t = \binom{x}{k+s-1}$ and $\ell + t \ge \binom{x}{k-1}$, we have This product is decreasing in $x$, so we can substitute our upper bound $x \le 2k + \frac{3}{4}s - 1$ to find This is increasing in $s$, so plugging in the lower bound $s \ge C\sqrt{k \ln k}$, we have as required. This completes the proof.

Concluding remarks
We close by offering some final remarks and open problems related to the supersaturation problems discussed in this paper.

5.1. Supersaturation for permutations. Theorem 1.2 shows that, for $k \le cn^{1/2}$ and $s$ (very) close to $k(n-1)!$, one minimises the number of disjoint pairs in a family of $s$ permutations by selecting them from pairwise-disjoint cosets. This leaves large gaps between the ranges where we know the answer to the supersaturation problem, and it would be very interesting to determine the correct behaviour throughout. For instance, which family of $1.5(n-1)!$ permutations minimises the number of disjoint pairs?
Note that the derangement graph is $d_n$-regular, and so we can apply Lemma 3.11 to determine the optimal families for sizes close to $k(n-1)!$ when $k \ge n - cn^{1/2}$ by taking complements. However, the complement of a union of pairwise disjoint cosets is again a union of pairwise disjoint cosets, and hence there may well be a nested sequence of optimal families for this problem. One candidate would be the initial segments of the lexicographic order on $S_n$, where $\pi < \sigma$ if and only if $\pi_j < \sigma_j$ for $j = \min\{i \in [n] : \pi_i \ne \sigma_i\}$.
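To experiment with this candidate on small cases, the following sketch implements the lexicographic order on $S_n$ and counts disjoint pairs by brute force. It assumes permutations are written in one-line notation as tuples over $\{1, \ldots, n\}$, and that two permutations are disjoint when they agree in no position; the function names are hypothetical.

```python
from itertools import permutations


def lex_less(pi, sigma):
    """True if pi precedes sigma in the lexicographic order on S_n."""
    for a, b in zip(pi, sigma):
        if a != b:          # first position where the permutations differ
            return a < b
    return False            # identical permutations


def disjoint(pi, sigma):
    """Two permutations are disjoint if they agree in no position."""
    return all(a != b for a, b in zip(pi, sigma))


def dp(family):
    """Number of unordered disjoint pairs in a family of permutations."""
    fam = list(family)
    return sum(disjoint(fam[i], fam[j])
               for i in range(len(fam)) for j in range(i + 1, len(fam)))


def lex_initial_segment(n, s):
    """First s permutations of [n] in lexicographic order."""
    # itertools.permutations yields tuples in lexicographic order when its
    # input is sorted, which agrees with lex_less above.
    gen = permutations(range(1, n + 1))
    return [next(gen) for _ in range(s)]


if __name__ == "__main__":
    n = 4
    for s in (6, 7, 12):
        print(s, dp(lex_initial_segment(n, s)))
```

For $n = 4$ the first $3! = 6$ permutations form the coset $\{\pi : \pi_1 = 1\}$, which is intersecting, so the count only becomes positive from $s = 7$ onwards.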

5.2. Set systems of very large uniformity.
For set families, we improved the range of uniformities for which the small initial segments of the lexicographic order are known to be optimal. In Corollary 3.2, which applies when $n = \Omega(r^3 k^2)$, we handled the case where the family is a little larger than the union of $r$ stars. However, if one restricts the size of the set families even further, one can obtain optimal bounds on $n$. For instance, Katona, Katona and Katona [29] showed that adding one set to a full star is always optimal.  k is a family with $|\mathcal{F}| = \binom{n-1}{k-1} + t$ for some $1 \le t \le c \cdot \frac{n-2k}{n}\binom{n-k-1}{k-1}$. Letting $s = \binom{n-1}{k-1} + t$, we shall show that $dp(\mathcal{F}) \ge dp(\mathcal{L}_{n,k}(s)) = t\binom{n-k-1}{k-1}$. Suppose otherwise that $dp(\mathcal{F}) < t\binom{n-k-1}{k-1}$. By Lemma 3.1, there exists a star $\mathcal{S}$ such that $|\mathcal{F} \Delta \mathcal{S}| \le \frac{1}{2}\binom{n-k-1}{k-1}$. It follows that $|\mathcal{F} \cap \mathcal{S}| = \binom{n-1}{k-1} - p$ for some integer $p$ with $0 \le p \le \frac{1}{2}\binom{n-k-1}{k-1}$. As $|\mathcal{F}| = \binom{n-1}{k-1} + t$ and $|\mathcal{F} \cap \mathcal{S}| = \binom{n-1}{k-1} - p$, we must have $|\mathcal{F} \setminus \mathcal{S}| = p + t$. Since each set in $\mathcal{F} \setminus \mathcal{S}$ is disjoint from exactly $\binom{n-k-1}{k-1}$ sets in the star $\mathcal{S}$ and $|\mathcal{F} \cap \mathcal{S}| = \binom{n-1}{k-1} - p$, we conclude dp(F, F ∩ S) ≥ n−k−1 where the last inequality holds since p ≤ 1 k−1 sets from the full star. Hence $dp(\mathcal{L}(n,k,s)) = \left(\binom{2k-1}{k} - 1\right)\binom{2k-2}{k-1}$.

Now instead let $\mathcal{F}'$ be the family consisting of $\mathcal{S}_1$, the full star with centre 1, and all but one $k$-element subset of $\{2, 3, \ldots, 2k\}$. Since $\mathcal{F}'$ again consists of a full star and an intersecting family of size $\binom{2k-1}{k} - 1$, we have $dp(\mathcal{F}') = dp(\mathcal{L}(n,k,s))$. Now form the family $\mathcal{F}$ from $\mathcal{F}'$ by replacing the set $A = \{1, 2k+1, \ldots, 3k-1\}$ with the missing $k$-set $B$ from $\{2, 3, \ldots, 2k\}$. We lose $\binom{2k-1}{k} - 1$ disjoint pairs when we remove $A$, and gain only , it follows that $dp(\mathcal{F}) < dp(\mathcal{L}(n,k,s))$, showing the initial segment of the lexicographic order is not optimal.
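The construction above can be checked by brute force for small $k$. The sketch below assumes $n = 3k - 1$ and $s = \binom{n-1}{k-1} + \binom{2k-1}{k} - 1$, as in the discussion, and compares the number of disjoint pairs in the lexicographic initial segment with that in the modified family (the full star with centre 1 minus $A$, plus every $k$-subset of $\{2, \ldots, 2k\}$). The function names are hypothetical, and the enumeration is only feasible for very small $k$.

```python
from itertools import combinations
from math import comb


def dp(family):
    """Number of unordered disjoint pairs in a family of sets."""
    fam = [frozenset(f) for f in family]
    return sum(1 for a, b in combinations(fam, 2) if not (a & b))


def lex_initial_segment(n, k, s):
    """First s k-subsets of [n] in lexicographic order."""
    # combinations() yields sorted tuples in lexicographic order.
    gen = combinations(range(1, n + 1), k)
    return [next(gen) for _ in range(s)]


def modified_family(k):
    """Star with centre 1 minus A, plus all k-subsets of {2,...,2k}; n = 3k-1."""
    n = 3 * k - 1
    star = [c for c in combinations(range(1, n + 1), k) if c[0] == 1]
    a = (1,) + tuple(range(2 * k + 1, 3 * k))   # A = {1, 2k+1, ..., 3k-1}
    inner = list(combinations(range(2, 2 * k + 1), k))
    return [f for f in star if f != a] + inner


def compare(k):
    """Return (dp of lex initial segment, dp of modified family)."""
    n = 3 * k - 1
    s = comb(n - 1, k - 1) + comb(2 * k - 1, k) - 1
    fam = modified_family(k)
    assert len(fam) == s                        # both families have size s
    return dp(lex_initial_segment(n, k, s)), dp(fam)


if __name__ == "__main__":
    for k in (2, 3, 4):
        print(k, compare(k))
```

For $k = 2$ and $k = 3$ this returns $(4, 3)$ and $(54, 50)$ respectively, so in both cases the modified family has strictly fewer disjoint pairs than the initial segment.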
Bollobás and Leader [7] conjectured that the solution to the supersaturation problem is always given by an $\ell$-ball. Given $n$, $k$ and $s$, an $\ell$-ball of size $s$ is a family $\mathcal{B}_\ell(n,k,s)$ of $s$ sets such that there is some $r$ with In particular, the initial segments of the lexicographic order are 1-balls, while their complements are isomorphic to $k$-balls.
We have shown that the construction $\mathcal{F}$ given above has fewer disjoint pairs than the 1-balls of size $s = |\mathcal{F}|$. Computer-aided calculations show that for $n = 3k - 1$, $s = \binom{n-1}{k-1} + \binom{2k-1}{k} - 1$ and $5 \le k \le 15$, the 1-balls have far fewer disjoint pairs than the $\ell$-balls for $\ell \ge 2$, showing that $\mathcal{F}$ gives a counterexample to the Bollobás-Leader conjecture for these parameters. The numerical evidence suggests that $\mathcal{F}$ should be a counterexample for all $k \ge 5$, but it is difficult to estimate the number of disjoint pairs in $\mathcal{B}_\ell(3k-1, k, s)$ for $\ell \ge 2$, and so we have been unable to prove this.
Next, given a graph $G$, $k_t(G)$ denotes the number of $t$-cliques in $G$. We will further write $K_t(G)$ for the set of these $t$-cliques. Moreover, given a vertex subset $X \subseteq V(G)$, we denote by $k_{t,X}(G)$ the number of $t$-cliques in $G$ that contain $X$. In particular, we have $k_t(G) = k_{t,\emptyset}(G) = |K_t(G)|$. Again, we will omit $G$ from the notation when the graph is clear from the context. Finally, $\overline{P_3}$ is the complement of the path on three vertices, which is the union of an edge and an isolated vertex. Let With this additional notation in place, we close these preliminaries with the following crucial observation, which we shall make repeated use of.
Observation A.1. If G is the intersection graph of a union of cosets, the following properties hold.
A.2. Proof of Proposition 2.4(a). Armed with these preliminaries, we may begin to prove the statements in Proposition 2.4, of which the first is by far the simplest.
Proof of Proposition 2.4(a). By the Bonferroni inequalities, we have By Observation A.1(i), $\bigcap_{x \in X} T_x$ is empty unless $X$ induces a clique in $G$, in which case $\left|\bigcap_{x \in X} T_x\right| = (n - |X|)!$. Hence we have as required.
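The fact about intersections of cosets used in this proof can be confirmed by exhaustive enumeration for small $n$. The sketch below assumes the usual indexing, in which the coset $T_x$ for $x = (x_1, x_2)$ consists of the permutations $\pi$ with $\pi(x_1) = x_2$, and two cosets are adjacent in the intersection graph exactly when their indices differ in both coordinates. The helper names are hypothetical.

```python
from itertools import combinations, permutations
from math import factorial


def cosets(n):
    """Map each x = (x1, x2) to T_x = {pi in S_n : pi(x1) = x2}."""
    table = {(i, j): set() for i in range(1, n + 1) for j in range(1, n + 1)}
    for p in permutations(range(1, n + 1)):
        for i in range(1, n + 1):
            table[(i, p[i - 1])].add(p)
    return table


def adjacent(x, y):
    """T_x and T_y intersect iff x and y share neither coordinate."""
    return x[0] != y[0] and x[1] != y[1]


def is_clique(X):
    return all(adjacent(x, y) for x, y in combinations(X, 2))


def check_intersections(n, size):
    """Check |intersection of T_x over X| is (n-|X|)! on cliques and 0 otherwise."""
    table = cosets(n)
    for X in combinations(sorted(table), size):
        common = set.intersection(*(table[x] for x in X))
        expected = factorial(n - size) if is_clique(X) else 0
        if len(common) != expected:
            return False
    return True


if __name__ == "__main__":
    print(check_intersections(4, 2), check_intersections(4, 3))
```

The check ranges over all vertex sets of the given size, so it grows quickly with $n$ and is meant only as a small-case verification.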
A.3. Proof of Proposition 2.4(b). In the second part of the proposition, we count the number of disjoint pairs between G and an arbitrary permutation π ∈ S n \ G. We will in fact prove the more accurate estimate given in the claim below, as this will be required in the proof of part (c).
Claim A.2. Let $\mathcal{G}$ be a union of $k_1 \le n$ cosets with intersection graph $G$. Then, for every $\pi \in S_n \setminus \mathcal{G}$, where the indices $x = (x_1, x_2)$ and $y = (y_1, y_2)$ are vertices of $G$.
We first verify that this implies the bound from the proposition.
Proof of Proposition 2.4(b). The leading term is already in the desired form. For the second-order term, observe that $D_{n-2} = d_{n-2} + d_{n-3}$. In the third term, we bound the coefficient of $d_{n-3}$ above by $k_3 + 2k_2$, and recall that we also have a term of $-k_2 d_{n-3}$ from the second-order term. Thus, in total, the third-order term is at most $(k_3 + k_2)d_{n-3}$. Since $k_3 \le k_1 k_2$ and $d_{n-3} \le (n-3)!$, we can bound this from above by $2k_1 k_2 (n-3)!$.
We now prove the claim.
A.4. Proof of Proposition 2.4(c). In the third part of the proposition, we count the number of disjoint pairs within a union G of cosets, with the result depending on numerous parameters of the intersection graph G. We shall once more prove a more precise estimate that we will need in the proof of part (d).
Claim A.3. Let $\mathcal{G}$ be a union of $k_1 \le cn^{1/2}$ cosets with intersection graph $G$. Then where Let us first verify that this claim suffices for the proposition.
Proof of Proposition 2.4(c). The first two terms in Claim A.3 are exactly as required.
We now prove the claim.
Proof of Claim A.3. The idea behind the proof is to partition the permutations in $\mathcal{G}$ based on how many of the cosets they are contained in. For each vertex set $X \subseteq V(G)$, let $\mathcal{M}_X$ be the family of all permutations $\pi \in \mathcal{G}$ which satisfy $\{x \in V(G) : \pi \in T_x\} = X$.
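We shall use the symbol $\dot\cup$ to denote a union of disjoint sets. From Observation A.1(i) we find $\mathcal{M}_i := \dot\bigcup_{|X| = i} \mathcal{M}_X = \dot\bigcup_{X \in K_i(G)} \mathcal{M}_X$, resulting in $dp(\mathcal{M}_i, \mathcal{G}) = \sum_{X \in K_i(G)} dp(\mathcal{M}_X, \mathcal{G})$. The following claim evaluates these expressions.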
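The partition into the families $\mathcal{M}_X$ can be illustrated on a small union of cosets. The sketch below assumes, as an illustration only, that $T_x$ for $x = (x_1, x_2)$ is the set of permutations with $\pi(x_1) = x_2$; it groups the permutations of the union by the exact set of cosets containing them and checks that only cliques of the intersection graph occur as index sets. The helper names and the example vertex set are hypothetical.

```python
from itertools import combinations, permutations


def adjacent(x, y):
    """Cosets T_x and T_y intersect iff x and y share neither coordinate."""
    return x[0] != y[0] and x[1] != y[1]


def is_clique(X):
    return all(adjacent(x, y) for x, y in combinations(X, 2))


def partition_by_cosets(n, vertices):
    """Map each nonempty X to M_X = {pi : {x in vertices : pi in T_x} = X}."""
    parts = {}
    for p in permutations(range(1, n + 1)):
        X = frozenset((i, j) for (i, j) in vertices if p[i - 1] == j)
        if X:                                   # pi lies in the union of cosets
            parts.setdefault(X, []).append(p)
    return parts


if __name__ == "__main__":
    n = 4
    vertices = [(1, 1), (2, 2), (3, 1)]         # (1,1) and (3,1) are non-adjacent
    parts = partition_by_cosets(n, vertices)
    for X, members in sorted(parts.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(X), len(members), is_clique(X))
```

Since the families $\mathcal{M}_X$ are pairwise disjoint and cover the union, counting disjoint pairs against the whole union is additive over them, which is the identity used above.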
On the other hand, it follows from the Bonferroni inequalities and Observation A.1(i) that $|\mathcal{M}_X| = (n-3)! \pm k_{4,X}(n-4)!$ for every triangle $X \in K_3(G)$. Combining with the trivial bound $k_{4,X} \le k_1$, this gives Summing $dp(\pi, \mathcal{G})$ over all permutations $\pi \in \mathcal{M}_3$ and using the estimate $2k_2 \le k_1^2$ gives the desired result.
We shall use a similar counting argument to estimate dp(M 2 , G).
Summing $dp(\mathcal{M}_X, \mathcal{G})$ over all $X \in K_2(G)$ and using the identity $\sum_{X \in K_2(G)} k_{3,X} = 3k_3$ results in the desired equation.
Finally we come to what is, in some sense, the trickiest part of our proof, which is dealing with permutations in a single coset.
A.5. Proof of Proposition 2.4(d). The final part of this appendix is devoted to showing that if $\mathcal{G}$ is a union of few cosets in $S_n$, then $\mathcal{G}$ has at least as many disjoint pairs as $\mathcal{T}(n,s)$, where $s = |\mathcal{G}|$. To bound the gap $dp(\mathcal{G}) - dp(\mathcal{T}(n,s))$, we shall use part (III) of the following claim concerning structural properties of intersection graphs.
We note that parts (I) and (II) will only be used to prove part (III). Before proving this claim, we show how it implies the final part of the proposition.
Proof of Proposition 2.4(d). If $G$ is canonical, then $\mathcal{G}$ is a union of $k_1 - 1$ pairwise disjoint cosets and an intersecting family, and so $dp(\mathcal{G}) = dp(\mathcal{T}(n,s))$.
Thus to complete the proof of Proposition 2.4, we need to prove Claim A.5. The first part shows that, but for a handful of small exceptions, the intersection graph of a non-canonical union of cosets must have many edges.
Proof of Claim A.5(I). It is not difficult to verify the result for $k_1 \le 5$. It remains to deal with the case that $k_1 \ge 6$ and $k_2 < \max\{k_1, 2k_1 - 6\} = 2k_1 - 6$, in which case we wish to show $G$ to be canonical.
Let $\ell$ be an axis-aligned line that maximises $d := |\ell \cap V(G)|$. If $d \ge k_1 - 1$, then $G$ is canonical, as desired. If $d \le 2$, then $d_G(x) \ge k_1 - 3$ for every $x \in V(G)$. Hence, as $k_1 \ge 6$, a contradiction. We may therefore assume $3 \le d \le k_1 - 2$. Since each vertex $x \in V(G) \setminus \ell$ is adjacent to all but at most one vertex in $\ell$, we must have giving the required contradiction.

The next part of the claim bounds the number of triangles in terms of the number of edges and vertices.
Proof of Claim A.5(II). We use induction on $k_1$. The cases $k_1 \le 6$ can be checked by hand. Now suppose $k_1 \ge 7$. If $k_2 \ge k_1$, $G$ cannot be canonical. It then follows from part (I) that
(21) $k_2 \ge 2k_1 - 6 \ge k_1 + 1$.
Let x be a vertex of G of minimum degree. We distinguish two cases.