Threshold and hitting time for high-order connectedness in random hypergraphs

We consider the following definition of connectedness in k-uniform hypergraphs: two j-sets (sets of j vertices) are j-connected if there is a walk of edges between them such that two consecutive edges intersect in at least j vertices. The hypergraph is j-connected if all j-sets are pairwise j-connected. We determine the threshold at which the random k-uniform hypergraph with edge probability p becomes j-connected with high probability. We also deduce a hitting time result for the random hypergraph process: the hypergraph becomes j-connected at exactly the moment when the last isolated j-set disappears. This generalises the classical hitting time result of Bollobás and Thomason for graphs.


Introduction

Preliminaries and main results
In the study of random graphs, one of the most well-known results concerns the hitting time for connectedness. More precisely, if we add randomly chosen edges one by one to an initially empty graph on n vertices, then with high probability at the moment the last isolated vertex gains its first edge, the whole graph will also become connected (this classical result was first proved by Bollobás and Thomason in [3]). This interplay between local and global properties is an example of the common phenomenon relating graph properties with their smallest obstruction; the graph can certainly not be connected while an isolated vertex still exists, but this smallest obstruction is also the critical one which is last to disappear.
In this paper we generalise the result of Bollobás and Thomason to random k-uniform hypergraphs. For an integer $k \ge 2$, a k-uniform hypergraph consists of a set $V$ of vertices together with a set $E$ of (hyper-)edges, each consisting of $k$ vertices. (A 2-uniform hypergraph is simply a graph.) We need to define the notion of connectedness, for which there is a whole family of possible definitions. For any $1 \le j \le k-1$, we say that two j-sets (sets of j vertices) $J_1, J_2$ are j-connected if there is a sequence of edges $E_1, \ldots, E_m$ such that $J_1 \subseteq E_1$, $J_2 \subseteq E_m$ and $|E_i \cap E_{i+1}| \ge j$ for all $1 \le i \le m-1$. In other words, we may walk from $J_1$ to $J_2$ using edges which consecutively intersect in at least $j$ vertices. A j-component is a maximal set of pairwise j-connected j-sets. A k-uniform hypergraph is j-connected if there is one j-component which contains all j-sets. Note that in the case $k = 2$, $j = 1$ this is simply the usual definition of connectedness for graphs. More generally, for arbitrary $k \ge 2$ the case $j = 1$ is by far the most well-studied. The definition for general $j$ is also entirely natural, although it is harder to visualise and often requires more complex analysis. In this paper we will be interested in arbitrary $1 \le j \le k-1$ and $k \ge 3$.
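The definition of j-components can be made concrete with a small computation. The following is a minimal sketch, not taken from the paper: it assumes vertices are labelled $0, \ldots, n-1$ and uses the observation that two edges intersect in at least $j$ vertices exactly when they share a j-set, so it suffices to merge all j-sets lying in a common edge. The function name `j_components` is hypothetical.

```python
from itertools import combinations


def j_components(n, edges, j):
    """Partition all j-subsets of {0, ..., n-1} into j-components.

    `edges` is an iterable of k-subsets (tuples or sets of vertices).
    Illustrative helper only; not part of the paper.
    """
    j_sets = list(combinations(range(n), j))
    parent = {J: J for J in j_sets}  # union-find forest over j-sets

    def find(J):
        while parent[J] != J:
            parent[J] = parent[parent[J]]  # path halving
            J = parent[J]
        return J

    for e in edges:
        # All j-sets inside one edge are pairwise j-connected.
        inside = list(combinations(sorted(e), j))
        root = find(inside[0])
        for J in inside[1:]:
            parent[find(J)] = root

    comps = {}
    for J in j_sets:
        comps.setdefault(find(J), []).append(J)
    return list(comps.values())


# Two 3-edges sharing the pair {1, 2} give one 2-component with five pairs;
# the remaining five pairs are isolated and form trivial components.
sizes = sorted(len(c) for c in j_components(5, [(0, 1, 2), (1, 2, 3)], 2))
print(sizes)
```

With $j = 1$ the same routine computes the usual vertex-connected components, which matches the remark that the case $j = 1$ is the familiar one.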
There is also more than one model for random hypergraphs. We first define the uniform model, the counterpart of the uniform model of Erdős and Rényi for graphs: given any natural numbers $k, M, n$ such that $M \le \binom{n}{k}$, the random hypergraph $H^k(n, M)$ is a hypergraph chosen uniformly at random from all k-uniform hypergraphs on vertex set $\{1, \ldots, n\}$ which have $M$ edges. This is closely related to the random hypergraph process $\{H^k(n, M)\}_M$ which is defined as follows:
• $H^k(n, 0)$ is the hypergraph on vertex set $\{1, \ldots, n\}$ with no edges;
• for $1 \le M \le \binom{n}{k}$, $H^k(n, M)$ is obtained from $H^k(n, M-1)$ by adding an edge chosen uniformly at random from among those k-sets which do not already form an edge.
Note that the distribution of the random hypergraph obtained in the $M$-th step of the process is the same as in the uniform model $H^k(n, M)$, so the notation is consistent.
We consider asymptotic properties of random hypergraphs and throughout this paper any asymptotics are as $n \to \infty$. In particular we say with high probability (or whp) to mean with probability tending to 1 as $n \to \infty$.
We say that a j-set is isolated if it is not contained in any edges. It is trivial to see that if a hypergraph contains isolated j-sets, then it is not j-connected (assuming it has more than $j$ vertices). Our main result is that this trivial smallest obstruction is also the critical one in a random hypergraph.
Let $\tau_c = \tau_c(n, j, k)$ denote the time step in the hypergraph process $\{H^k(n, M)\}_M$ at which the hypergraph becomes j-connected. Similarly, let $\tau_i$ denote the time at which the last isolated j-set disappears. Note that the properties of being j-connected or of having no isolated j-set are certainly monotone increasing properties, so these two variables are well-defined. Furthermore, as noted above, $\tau_i \le \tau_c$ deterministically.
Theorem 1. For any $1 \le j \le k-1$ and $k \ge 3$, with high probability in the random hypergraph process $\{H^k(n, M)\}_M$ we have $\tau_c = \tau_i$.
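For small parameters the two hitting times can be observed directly by simulation. The sketch below is only an illustration under stated assumptions: vertices are labelled $0, \ldots, n-1$, the process is realised by a uniformly random ordering of all k-sets, and the function name `hitting_times` is hypothetical. Theorem 1 is an asymptotic statement, so for small $n$ the code only guarantees the deterministic inequality $\tau_i \le \tau_c$.

```python
import random
from itertools import combinations


def hitting_times(n, k, j, seed=0):
    """Run one hypergraph process on n vertices and return (tau_i, tau_c)."""
    rng = random.Random(seed)
    k_sets = list(combinations(range(n), k))
    rng.shuffle(k_sets)  # a uniformly random order of the k-sets is the process

    j_sets = list(combinations(range(n), j))
    parent = {J: J for J in j_sets}  # union-find forest over j-sets

    def find(J):
        while parent[J] != J:
            parent[J] = parent[parent[J]]  # path halving
            J = parent[J]
        return J

    isolated = set(j_sets)       # j-sets not yet contained in any edge
    components = len(j_sets)     # current number of j-components
    tau_i = None
    for M, edge in enumerate(k_sets, start=1):
        inside = list(combinations(edge, j))
        isolated.difference_update(inside)
        root = find(inside[0])
        for J in inside[1:]:
            r = find(J)
            if r != root:
                parent[r] = root
                components -= 1
        if tau_i is None and not isolated:
            tau_i = M            # the last isolated j-set has just disappeared
        if components == 1:
            return tau_i, M      # the hypergraph has just become j-connected
    raise ValueError("never j-connected; the sketch assumes n >= k > j >= 1")


for seed in range(3):
    print(hitting_times(8, 3, 2, seed))
```

Comparing the two returned values over many seeds and growing $n$ gives an informal feel for how often the hitting times coincide.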
The case $j = 1$ of this theorem was already proved by Poole as a special case of the results in [7]. The case $j = k-1$ was previously proved by Kahle and Pittel in [6]. For all other $j$, this result was previously unknown.
The uniform model and the associated hypergraph process allow us to formulate exact hitting time results such as Theorem 1. However, the drawback is that the analysis of the model can become tricky due to the fact that the presence of different edges is not independent (the total number is fixed). For this reason, it is often easier to analyse the binomial model: $H^k(n, p)$ is a random k-uniform hypergraph on vertex set $\{1, \ldots, n\}$ in which each k-set is an edge with probability $p$ independently of all other k-sets. In Section 2 we will show that if $p = M/\binom{n}{k}$, then the two models are very similar and we can transfer results from one model to the other.
For the proof of Theorem 1 we will also make use of the following result (Theorem 2), which is interesting in itself and is therefore stated in a significantly more general form than we need for Theorem 1. For integer-valued random variables $Z$ and $Z'$ we denote their total variation distance by $d_{TV}(Z, Z')$, i.e. $$d_{TV}(Z, Z') = \sup_{A \subseteq \mathbb{Z}} \left| P(Z \in A) - P(Z' \in A) \right|.$$
For integer-valued random variables $X_n$ and $Y$, we say $X_n$ converges in distribution to $Y$, denoted by $X_n \xrightarrow{d} Y$, if for every integer $i$ we have $P(X_n = i) \to P(Y = i)$.
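As a small numerical illustration of total variation distance for integer-valued random variables, the sketch below uses the standard equivalent form $d_{TV}(Z, Z') = \frac{1}{2}\sum_i |P(Z = i) - P(Z' = i)|$ and compares a binomial distribution with the Poisson distribution of the same mean. All function names are hypothetical, and the Poisson distribution is truncated at a cut-off chosen for illustration.

```python
import math


def total_variation(p, q):
    """Half the L1 distance between two pmfs given as {value: probability}."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(i, 0.0) - q.get(i, 0.0)) for i in support)


def binomial_pmf(N, p):
    return {i: math.comb(N, i) * p**i * (1 - p) ** (N - i) for i in range(N + 1)}


def poisson_pmf(lam, max_i):
    # Truncated at max_i; the neglected tail is tiny for the values used here.
    return {i: math.exp(-lam) * lam**i / math.factorial(i) for i in range(max_i + 1)}


# Bin(200, 0.01) against Po(2): the distance is small, as Poisson
# approximation of a sum of rare independent indicators suggests.
print(total_variation(binomial_pmf(200, 0.01), poisson_pmf(2.0, 60)))
```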
Theorem 2. For any $k \ge 3$ and $1 \le j \le k-1$ and for any integer $s \ge 0$, let $$p_s = \frac{j \log n + s \log\log n + c_n}{\binom{n}{k-j}},$$ where $c_n = o(\log n)$, and let $D_s$ be the number of j-sets of degree precisely $s$ in $H^k(n, p_s)$ (i.e. which lie in $s$ edges). Then we have […] In particular […]
the electronic journal of combinatorics 23(2) (2016), #P.48
These two theorems together give the following corollary.
Theorem 3. Let $p = \frac{j \log n + c_n}{\binom{n}{k-j}}$, where $c_n = o(\log n)$.
1. If $c_n \to -\infty$, then with high probability $H^k(n, p)$ contains isolated j-sets (and is therefore not j-connected).
2. If $c_n \to \infty$, then with high probability $H^k(n, p)$ is j-connected (and therefore contains no isolated j-sets).
In other words, the properties of being j-connected and having no isolated j-sets both undergo a (sharp) phase transition at threshold $p_{conn}$, defined as $$p_{conn} = \frac{j \log n}{\binom{n}{k-j}}.$$
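To get a feel for the scale of the threshold, one can sample the binomial model and count isolated j-sets. The sketch below is illustrative only. It assumes the threshold is $j \log n / \binom{n}{k-j}$, i.e. the probability $p_s$ of Theorem 2 with $s = 0$ and $c_n = 0$, and labels vertices $0, \ldots, n-1$. The function names are hypothetical.

```python
import math
import random
from itertools import combinations


def p_conn(n, k, j):
    """Assumed threshold: p_s from Theorem 2 with s = 0 and c_n = 0."""
    return j * math.log(n) / math.comb(n, k - j)


def sample_binomial_hypergraph(n, k, p, rng):
    """Each k-set is an edge independently with probability p."""
    return [e for e in combinations(range(n), k) if rng.random() < p]


def isolated_j_sets(n, j, edges):
    """All j-sets that are contained in no edge."""
    covered = set()
    for e in edges:
        covered.update(combinations(e, j))
    return [J for J in combinations(range(n), j) if J not in covered]


rng = random.Random(1)
n, k, j = 30, 3, 2
for factor in (0.5, 1.0, 2.0):
    edges = sample_binomial_hypergraph(n, k, factor * p_conn(n, k, j), rng)
    print(factor, len(isolated_j_sets(n, j, edges)))
```

For such small $n$ the transition is blurred, so the printed counts should be read as qualitative only.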

Methods
The main contribution of this paper is to deduce Theorem 1 from Theorem 2. Attempting to prove this directly using standard techniques generalised from the graph case does not work because j-components in a hypergraph may be strangely and non-intuitively distributed. To overcome this problem we quote a powerful result from [4], which guarantees one component with a large subset which is in some sense smoothly distributed. We then show that whp all non-trivial components are connected to this smooth subset.

Notation and definitions
We introduce a few more definitions before we proceed with the proofs. We fix $k \ge 3$ and $1 \le j \le k-1$ for the remainder of the paper. The order $|H|$ of a hypergraph $H$ is the number of vertices it contains, while its size $e(H)$ is the number of edges. Since a j-component consists of j-sets of vertices, we may view it as a j-uniform hypergraph in which the edges are the j-sets in the component. In particular, the size of a j-component is the number of j-sets it contains. In the remainder of the paper we will use component to mean j-component.
We will sometimes need to relate the j-sets of a component to the edges of the hypergraph which connect them. To allow us to do this, for a k-uniform hypergraph $H$ we define the j-size of $H$ to be the number of j-sets contained in edges of $H$.
We ignore floors and ceilings whenever this does not significantly affect the argument.
Contiguity of $H^k(n, M)$ and $H^k(n, p)$

We need to know that $H^k(n, p)$ and $H^k(n, M)$ are roughly equivalent, which is a generalisation of a standard fact about the corresponding graph models (see [2,5]). In fact, [5] considers a more general setting than we require here, but what we state is an immediate corollary of the results there (see [5], Corollary 1.16). Let $N = \binom{n}{k}$ and, to ease notation, for some property $Q$ we will denote by $P_M(Q)$ the probability that $H^k(n, M)$ has property $Q$. $P_p(Q)$ is defined similarly.
Lemma 4. Let $Q$ be some monotone increasing property of k-uniform hypergraphs and let $M = Np \to \infty$. Then […]
This lemma allows us to transfer properties from $H^k(n, p)$ to $H^k(n, M)$ (transferring in the other direction is also possible, with some small modifications, but we will not need to do this here). However, this only works for monotonically increasing properties. This is fine for the properties of being j-connected or of having no isolated j-sets, but in the proof of Theorem 1 we will need to consider the probability of having a component of size $r$, for various fixed $r$. This property is not even convex (and nor is its complement) and so for this case we will need some more careful arguments.
The following standard argument allows us to transfer high probability events from the binomial to the uniform model provided that the failure probability is small enough.
Lemma 5. Let $Q$ be an arbitrary property. Suppose $M \to \infty$ and $p = M/N \to 0$. Then […]
Proof. The inequality follows from the fact that […] For the equality we use Stirling's approximation to deduce that […]

Proof of Theorem 2
Let $C = \binom{k}{j} - 1$. Fix an integer $s \ge 0$ and suppose $$p = p_s = \frac{j \log n + s \log\log n + c_n}{\binom{n}{k-j}},$$ where $c_n = o(\log n)$. Then the expected number of j-sets of degree $s$ in $H^k(n, p)$ satisfies […] since […]
For the Poisson approximation we use the Chen-Stein method (see [1]). For any j-set $J$ we denote its degree in $H^k(n, p)$ by $\deg(J)$ and analyse how $D_s$ changes by conditioning on the event $\{\deg(J_0) = s\}$ for an arbitrary j-set $J_0$.
First we construct $H^k(n, p)$ and denote by $E_0$ the set of edges containing $J_0$, then we distinguish three cases: […] We denote the resulting hypergraph by $H^* = H^*(J_0)$. For any j-set $J$ we write $\deg^*(J)$ for its degree in $H^*$ and $D^*_s(J_0)$ for the number of j-sets $J \ne J_0$ such that $\deg^*(J) = s$. Furthermore observe that this construction provides a coupling of $H^k(n, p)$ and $H^*$ such that removing all edges containing $J_0$ in either one of them yields the same random hypergraph $H^- = H^-(J_0)$. For any j-set $J$ we write $\deg^-(J)$ for its degree in $H^-$.
We use the following form of the Chen-Stein approximation given by Theorem 1.B in [1].
Theorem 6 (Chen-Stein approximation [1]). Given a finite index set $I$ and a random variable $W = \sum_{i \in I} Z_i$, where $Z_i$ is a Bernoulli random variable with parameter $p_i \in [0, 1]$, denote by $\lambda = \sum_{i \in I} p_i$ the expectation of $W$. Assume that for each $i \in I$ there is a pair of coupled random variables $(U_i, V_i)$ such that $U_i$ has the distribution of $W$ and $V_i + 1$ has the distribution of $W$ conditioned on $\{Z_i = 1\}$. Then we have $$d_{TV}(W, \mathrm{Po}(\lambda)) \le \min\{1, \lambda^{-1}\} \sum_{i \in I} p_i \, E\left(|U_i - V_i|\right).$$
For the proof of Theorem 2, we let $I$ be the set of all j-sets and for all $J$ let […] We observe that $H^*$ has the same distribution as $H$ conditioned on the event $\{\deg(J_0) = s\}$, so $V_{J_0} + 1$ has the same distribution as $W$ conditioned on $\{Z_{J_0} = 1\}$. Now applying Theorem 6 and bounding the factor $\min\{1, 1/\lambda\}$ from above, […] Hence it suffices to estimate the random variable $|D_s - D^*_s(J_0)|$. We first observe that […] To justify the inequality, first note that if $\deg(J_0) = s$, then $H = H^*$ and only the first term contributes. Furthermore, if $\deg(J_0) < s$, say $\deg(J_0) = s - t$ for some $t \in [1, s]$, then the only contribution to $|D_s - D^*_s(J_0)|$ comes from j-sets $J \ne J_0$ whose degree increased, i.e. $\deg^*(J) > \deg(J)$. Similarly, if $\deg(J_0) = s + t$ for some $t \in \left[1, \binom{n-j}{k-j} - s\right]$, observe that for a j-set $J$ to contribute it is necessary to have either $\deg(J) = s$ or $\deg^*(J) = s$. Note that these cannot hold unless $\deg^-(J) \le s$, and we will simply bound the probability of this (more likely) event.
Moreover, each inner sum has at most Ct terms, since we certainly only sum over j-sets J whose degree has changed, and adding or deleting an edge influences the degree of at most C j-sets (other than J 0 ).
Note also that $\deg^-(J)$ […] independently of $\deg(J_0)$, and the probability that $\deg^-(J) \le s$ is maximised when $|J_0 \cup J|$ is minimised. Hence for an upper bound we will assume that $|J_0 \cup J| = j + 1$. This motivates the definition $q = P(\mathrm{Bin}(N', p) \le s)$, where […] Combining these two arguments we obtain the upper bound […] Therefore, using the notation $x_+ = \max\{x, 0\}$ for any $x \in \mathbb{R}$, we have […] We can estimate both probabilities in (4) using […] where the second and third lines follow because $s$ is bounded. Moreover, we have […] and furthermore […] Therefore (3) and (4) provide (1), i.e. […]
Now assume $\lim_{n \to \infty} c_n = c$. By (2) we know that $E(D_s) \to \frac{j^s e^{-c}}{j!\, s!}$ and by the continuity in $\lambda$ of the function $P(\mathrm{Po}(\lambda) = i)$ for each $i$ […] hence by the triangle inequality and (1), case (ii) in the second claim follows. Cases (i) and (iii) can be easily deduced from case (ii).
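The limit of $E(D_s)$ can be checked numerically. The sketch below is an illustration rather than part of the proof: it uses the fact that a fixed j-set lies in $\binom{n-j}{k-j}$ k-sets, so its degree is binomially distributed and $E(D_s) = \binom{n}{j} P(\mathrm{Bin}(\binom{n-j}{k-j}, p_s) = s)$, with $c_n$ held at a constant $c$. The function names are hypothetical. Convergence is slow for $s \ge 1$ because of corrections of order $\log\log n / \log n$, so the example uses $s = 0$.

```python
import math


def expected_degree_count(n, k, j, s, c):
    """E(D_s) = C(n, j) * P(Bin(C(n-j, k-j), p_s) = s), evaluated via logarithms."""
    N = math.comb(n - j, k - j)  # number of k-sets containing a fixed j-set
    p = (j * math.log(n) + s * math.log(math.log(n)) + c) / math.comb(n, k - j)
    log_pmf = (
        math.lgamma(N + 1) - math.lgamma(s + 1) - math.lgamma(N - s + 1)
        + s * math.log(p) + (N - s) * math.log1p(-p)
    )
    return math.comb(n, j) * math.exp(log_pmf)


def limiting_mean(j, s, c):
    """The limit j^s e^{-c} / (j! s!) of E(D_s) when c_n -> c."""
    return j**s * math.exp(-c) / (math.factorial(j) * math.factorial(s))


for c in (0.0, 1.0):
    print(c, expected_degree_count(10**6, 3, 2, 0, c), limiting_mean(2, 0, c))
```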

Proof of Theorem 1
The proof which we present is largely elementary except for the use of Theorem 2, which relies on Theorem 6, and one powerful result from [4] (Lemma 7 below). This result is stated for a much smaller probability than we have in this setting, which is therefore not the optimal range for its application, but nevertheless it will turn out to be strong enough. We first collect two preliminary results which we will need later (Corollary 8 and Proposition 9 below).

Smooth subset
The following result from [4] will be key.
Lemma 7. […] Then whp there is a j-component of $H^k(n, p^*)$ with a subset $S$ of at least $\varepsilon^3 n^j$ j-sets satisfying the following property: […] $n$ j-sets of $S$.
In other words, we can find a reasonably large subset $S$ of a component which is smooth in the sense that all $(j-1)$-sets are in about the "right" number of j-sets of $S$. More precisely, we say that a set $S$ of j-sets is smooth if every $(j-1)$-set is contained in […] $n$ j-sets of $S$.
We note that Lemma 7 is not stated explicitly in this form in [4], but is implicit in the proof. We therefore give a brief outline of how it can be deduced from the results in that paper. The casual reader who is unfamiliar with [4] may skip the following proof.
Proof. Starting from some j-set $J$, we explore the j-component containing $J$ using a breadth-first search process BFS($J$). This partitions the j-sets of the j-component into generations, which can be numbered according to the order they were discovered in.
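The partition into generations can be sketched as follows. This is a simplified illustration and not the exploration process of [4], which is specified more carefully there. Here a j-set belongs to generation $g+1$ if it is first discovered inside an edge containing a j-set of generation $g$. The function name `bfs_generations` is hypothetical.

```python
from itertools import combinations


def bfs_generations(edges, start, j):
    """Explore the j-component of `start`, returning its j-sets by generation."""
    edge_sets = [frozenset(e) for e in edges]
    start = tuple(sorted(start))
    seen = {start}
    generations = [[start]]
    while True:
        next_gen = []
        for J in generations[-1]:
            for e in edge_sets:
                if e.issuperset(J):
                    # Every j-set inside an edge containing J is j-connected to J.
                    for J2 in combinations(sorted(e), j):
                        if J2 not in seen:
                            seen.add(J2)
                            next_gen.append(J2)
        if not next_gen:
            return generations
        generations.append(next_gen)


print(bfs_generations([(0, 1, 2), (1, 2, 3)], (0, 1), 2))
```

The union of the first $g$ generations plays the role of the set of j-sets discovered up to generation $g$.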
We fix a starting j-set $J$ which lies in the largest component of $H^k(n, p)$, let $\partial C_g$ denote the $g$-th generation of this search process BFS($J$), and $C_g = \bigcup_{g' \le g} \partial C_{g'}$. Then there are generations $g_0$ and $g_1$ such that the following statements hold whp.
1. Either $|\partial C_{g_1}| \ge \varepsilon^3 n^j$ or $|C_{g_1}| \ge \varepsilon^{3/2} n^j$;
2. […] (and in particular $g_0 < g_1$);
3. Every generation $\partial C_g$ with $g_0 \le g \le g_1$ is smooth.
We set $g_0 = i_1(j-1)$, where $i_1(j-1)$ is defined immediately before Lemma 16 in [4], and set $g_1 = i_1$, where $i_1$ is defined in [4] to be the generation at which one of three stopping conditions (S1), (S2) or (S3) is invoked (see Section 5.4 of [4]). These stopping conditions contain a parameter $\lambda$, which we choose to be $\lambda = \varepsilon^{3/2}$.
Property (1) follows from these stopping conditions. We use here the fact that $J$ is in the largest component of $H^k(n, p)$, which is a giant component whp by Theorem 2 of [4], therefore whp either (S2) or (S3) is invoked at time $i_1$ (as (S2) would be invoked before (S1)).
We now use these three properties to prove the existence of the set $S$. We make a case distinction based on Property (1). If $|\partial C_{g_1}| \ge \varepsilon^3 n^j$, then we simply set $S = \partial C_{g_1}$, and $S$ is smooth by (3).
On the other hand, if $|C_{g_1}| \ge \varepsilon^{3/2} n^j$, then we let $S = C_{g_1} \setminus C_{g_0 - 1}$. Then since every generation from $g_0$ to $g_1$ is smooth, and since a union of smooth sets is also smooth, we have that $S$ is smooth. Furthermore, $|S| = (1 - o(1))|C_{g_1}| \ge \varepsilon^3 n^j$.
Lemma 7 has the following corollary which we will apply later.
Corollary 8. Suppose $n^{-1/3} \ll \varepsilon \ll 1$ and […]. Then $H^k(n, p)$ has a j-component containing a set $S$ of size at least $\varepsilon^3 n^j$ such that every […] $n$ j-sets of $S$.
Proof. We set […] and $p' = \frac{p - p^*}{1 - p^*}$ and let $H_1 = H^k(n, p^*)$ and $H_2 = H^k(n, p')$ independently. Observe that we may couple in such a way that $H^k(n, p) = H_1 \cup H_2$. Furthermore, by Lemma 7, whp $H_1$ has a component containing a smooth set $S$ of the appropriate size. In $H^k(n, p)$ this component may be bigger than in $H_1$, but certainly still contains $S$.
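The coupling in this proof rests on a one-line computation: a k-set is an edge of the union of two independent binomial hypergraphs with probabilities $p^*$ and $p'$ with probability $1 - (1 - p^*)(1 - p')$, which equals $p$ when $p' = (p - p^*)/(1 - p^*)$. The sketch below checks this with exact rational arithmetic. The function names and the sample values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def sprinkling_probability(p, p_star):
    """p' = (p - p*) / (1 - p*), the edge probability of the second round."""
    return (p - p_star) / (1 - p_star)


def union_edge_probability(p_star, p_prime):
    """Probability that a k-set is an edge of at least one of two independent rounds."""
    return 1 - (1 - p_star) * (1 - p_prime)


p, p_star = Fraction(3, 10), Fraction(1, 10)
p_prime = sprinkling_probability(p, p_star)
print(p_prime, union_edge_probability(p_star, p_prime))
```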

Well-constructed hypergraphs
We will also use the following proposition. We say that a hypergraph is well-constructed if it can be generated from an initial j-set via a search process, i.e. by successively adding edges such that each edge contains at least one previously discovered j-set and also contains at least one previously undiscovered j-set.
Proposition 9. Up to isomorphism, the number of well-constructed k-uniform hypergraphs of j-size $s$ is at most $2^{ks^2}$.
Proof. We explore the hypergraph by adding the edges one by one in the order in which it is well-constructed. The resulting hypergraph is uniquely determined, up to isomorphism, by the intersection of each edge with the previous vertices (though we will multiple count the isomorphism classes, this is permissible for an upper bound). When adding the $i$-th edge, we certainly have at most $(i-1)k$ vertices so far, and so the number of possible intersections is at most $2^{(i-1)k}$. Multiplying over all edges, of which there are certainly at most $s$ (each edge gives at least one new j-set), we have that the number of such hypergraphs is at most $\prod_{i=1}^{s} 2^{(i-1)k} \le 2^{ks^2}$.
[…] By taking a union bound over all $3 \le r \le r_0$, we conclude that with probability at least $1 - 2n^{-12j/7}$ there are no j-components of this size.
Case 2: $r \ge r_0$.
In this case, rather than looking at the full component we look at a well-constructed subgraph $H$ of j-size $r_0$. Such a subgraph certainly exists up to an additive $\binom{k}{j}$ error term in the j-size, which will not affect calculations significantly. Most of the calculations which lead to (6) are still valid, replacing $r$ by $r_0$. However, since we are no longer considering a full component, we must be more careful about the number of non-edges.
At this point we make use of the set $S$ of j-sets guaranteed by Corollary 8, which lie in a different component to $H$. For each of the $r_0$ j-sets of $H$, pick an arbitrary $(j-1)$-set within it and by Corollary 8, this $(j-1)$-set is contained in $(1 \pm o(1))\varepsilon^3 n$ j-sets of $S$. For each such pair of j-sets intersecting in $j-1$ vertices, there are $\binom{n-j-1}{k-j-1}$ k-sets containing both of them, all of which must be non-edges, since the j-sets lie in different components.
It may be that we multiple count the non-edges in this way. However, each k-set may only be counted from a pair of j-sets it contains, and therefore the number of times it is counted is certainly at most $\binom{k}{j}(k-j)$. Thus in total the number of non-edges is at least […] We may thus calculate the expected number of such structures $H$ (cf. (6)): […] and so, letting $Y$ be the number of such well-constructed hypergraphs of j-size $r_0$ which are not in the same component as $S$, we have […] By Markov's inequality, this implies that with probability at least $1 - n^{-12j/7}$ we have $Y = 0$ and therefore no further components of size $r$.
Combining the two cases, this tells us that with probability at least $1 - 3n^{-12j/7}$, $H^k(n, p)$ only has one non-trivial component.
We now take a union bound over all possible $M$, of which there are at most $2\omega \binom{n}{k} / \binom{n}{k-j} = O(\omega n^j)$, and deduce that the probability that there is ever a second non-trivial component within this time period is at most $O(\omega n^j) n^{-8j/7} = O(\omega n^{-j/7}) = o(1)$ as required.
Theorem 3 follows almost immediately from Theorems 1 and 2. In order to apply Theorem 1 in the binomial model, we apply the standard trick of birth times: to each k-tuple we assign a number (the birth time) between 0 and 1 uniformly at random and independently of all other k-tuples. Then the hypergraph process $\{H^k(n, M)\}_M$ can be obtained by adding edges in increasing order of birth time (with probability 1 no two edges have the same birth time), while the hypergraph obtained by taking all edges with birth time at most $p$ is distributed as $H^k(n, p)$.
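The birth-time coupling is easy to realise in code. The following sketch is illustrative only, with vertices labelled $0, \ldots, n-1$ and a hypothetical function name. It draws one birth time per k-set, reads off the process as the ordering by birth time, and reads off the binomial hypergraph as the set of k-sets born by time $p$. By construction the binomial hypergraph is then an initial segment of the process.

```python
import random
from itertools import combinations


def birth_time_coupling(n, k, p, seed=0):
    """Return (process, binomial): the k-sets ordered by birth time, and those born by time p."""
    rng = random.Random(seed)
    birth = {e: rng.random() for e in combinations(range(n), k)}
    process = sorted(birth, key=birth.get)          # order in which edges arrive
    binomial = {e for e in birth if birth[e] <= p}  # edges of the binomial model
    return process, binomial


process, binomial = birth_time_coupling(6, 3, 0.3)
M = len(binomial)
# The binomial hypergraph coincides with the process after M steps.
print(M, set(process[:M]) == binomial)
```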
Theorem 2 (with $s = 0$) tells us that if $c_n \to \infty$, then whp there are no isolated j-sets, and therefore Theorem 1 tells us that whp the hypergraph is j-connected. This proves part (2). Part (1) is simply an application of Theorem 2 with $s = 0$.

Concluding remark
In [7], it is determined for the case $j = 1$ that the hitting time for d-strong 1-connectedness, i.e. the time at which the hypergraph first has the property that deleting any set of less than $d$ vertices still leaves a 1-connected hypergraph, is the same as the hitting time for having no vertices of degree less than $d$ whp. It would be interesting to generalise this result to d-strong j-connectedness (removing fewer than $d$ j-sets still leaves a j-connected hypergraph), which is presumably attained whp when every j-set has degree at least $d$. However, this would present significant additional difficulties, not least that Lemma 7 would no longer give the substructure which we require.
(a) […] the hypergraph;
(b) If $\deg(J_0) = s$, do nothing;
(c) If $\deg(J_0) > s$, delete a set of $\deg(J_0) - s$ edges chosen uniformly at random from $E_0$.

$\log(E(Y)) \le k r_0^2 \log 2 + j \log n + O(r_0 \log\log n) - \Theta(r_0 \varepsilon^3 \log n)$. Now observe that in Corollary 8 we may choose any $n^{-1/3} \ll \varepsilon \ll 1$. Choosing $\varepsilon^3 = \frac{1}{\log\log\log n}$, we have $r_0 \varepsilon^3 \to \infty$ and the last term in the above inequality dominates, and we have $\log(E(Y)) \le -C \log n$ for any constant $C$. In particular, choosing $C = 12j/7$, we have $E(Y) \le n^{-12j/7}$.