On the hyperbolicity of random graphs

Let $G=(V,E)$ be a connected graph with the usual (graph) distance metric $d:V \times V \to \mathbb{N} \cup \{0\}$. Following Gromov, $G$ is said to be $\delta$-hyperbolic if for every four vertices $u,v,x,y \in V$, the two largest values of the three sums $d(u,v)+d(x,y)$, $d(u,x)+d(v,y)$, $d(u,y)+d(v,x)$ differ by at most $2\delta$. In this paper, we determine the value of this hyperbolicity for most binomial random graphs.


Introduction
Hyperbolicity is a property of metric spaces that generalizes the idea of negatively curved spaces like the classical hyperbolic space or Riemannian manifolds of negative sectional curvature (see, for example, [1,8]). Moreover, this concept can be applied to discrete structures such as trees and Cayley graphs of many finitely generated groups. The study of properties of Gromov's hyperbolic spaces from a theoretical point of view is a topic of recent and increasing interest in graph theory and computer science. Informally, in graph theory hyperbolicity measures how similar a given graph is to a tree: trees have hyperbolicity zero and graphs that are "tree-like" have "small" hyperbolicity. Formally, a connected graph G = (V, E) is δ-hyperbolic if for every four vertices u, v, x, y ∈ V, the two largest values in the set {d(u, v) + d(x, y), d(u, x) + d(v, y), d(u, y) + d(v, x)} differ by at most 2δ. The hyperbolicity of G, denoted by δ_H(G), is the smallest δ for which this property holds.
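The four-point condition can be checked directly on small graphs. The following is a minimal brute-force sketch in Python, not an algorithm taken from this paper or from [13]; the names `bfs_distances`, `hyperbolicity`, and `cycle` are illustrative choices, and the enumeration over all four-tuples is only practical for small graphs. It assumes the graph is given as a dictionary mapping each vertex to its neighbours.

```python
from collections import deque
from fractions import Fraction
from itertools import combinations


def bfs_distances(adj, source):
    """Graph distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        w = queue.popleft()
        for z in adj[w]:
            if z not in dist:
                dist[z] = dist[w] + 1
                queue.append(z)
    return dist


def hyperbolicity(adj):
    """Smallest delta for which the four-point condition holds."""
    d = {v: bfs_distances(adj, v) for v in adj}
    if any(len(d[v]) != len(adj) for v in adj):
        raise ValueError("graph must be connected")
    best = 0
    # Four-tuples with a repeated vertex always give difference 0,
    # so it suffices to enumerate sets of four distinct vertices.
    for u, v, x, y in combinations(adj, 4):
        sums = sorted((d[u][v] + d[x][y],
                       d[u][x] + d[v][y],
                       d[u][y] + d[v][x]))
        best = max(best, sums[2] - sums[1])  # equals 2 * delta for this tuple
    return Fraction(best, 2)


def cycle(n):
    """Adjacency lists of the cycle on vertices 0, ..., n - 1."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
```

For instance, `hyperbolicity(cycle(4))` evaluates to 1, while any path (a tree) gives 0, in line with the informal description above.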
In [2] several equivalent conditions for a graph to be 0-hyperbolic are given, and in [3] the authors characterize 1/2-hyperbolic graphs in terms of forbidden subgraphs. On the algorithmic side, by the conditions investigated in [2], 0-hyperbolic graphs can be recognized in linear time, and in [14] it is shown that recognizing 1/2-hyperbolic graphs is equivalent to finding an induced cycle of length 4 in a graph. Fast algorithms for computing the hyperbolicity of large-scale graphs are given in [13].
The study of this parameter is motivated by the following observations. On the algorithmic side, fast algorithms for computing properties related to the diameter of graphs with small hyperbolicity are given in [11]. In [12] the authors give a simple construction that approximates, with small error, the distances in graphs with small hyperbolicity by trees, and in [10] the authors give a polynomial algorithm that finds, for such graphs, the minimum number of edges that need to be added so that the augmented graph still has small diameter. Finally, in [9] it is shown that all cop-win graphs in which the cop and the robber move at different speeds have small hyperbolicity. Moreover, the concept of hyperbolicity turns out to be useful for many applied problems such as visualization of the Internet, the Web graph, and other complex networks [18,19], as well as routing, navigation, and decentralized search in these networks [5,17]. In particular, hyperbolicity plays an important role in investigating the spread of viruses through the network [16].
Let us recall a classic model of random graphs that we study in this paper. The binomial random graph G(n, p) is defined as a random graph with vertex set [n] = {1, 2, . . . , n} in which a pair of vertices appears as an edge with probability p, independently for each such pair. As is typical in random graph theory, we shall consider only asymptotic properties of G(n, p) as n → ∞, where p = p(n) may and usually does depend on n. We say that an event in a probability space holds asymptotically almost surely (a.a.s.) if its probability tends to one as n goes to infinity.
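The definition of G(n, p) translates directly into a sampling procedure. The sketch below is a straightforward illustration and makes no attempt at efficiency for sparse graphs; the function name `sample_gnp` and the `seed` parameter are assumptions introduced here, not notation from the paper.

```python
import random
from itertools import combinations


def sample_gnp(n, p, seed=None):
    """Sample G(n, p) on vertex set {1, ..., n} as a dict of neighbour sets."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(1, n + 1)}
    # Each of the n(n-1)/2 pairs becomes an edge independently with probability p.
    for u, v in combinations(range(1, n + 1), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj
```

In such a sample every vertex has expected degree p(n − 1), which is the quantity denoted by d throughout the paper.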
In this paper, we investigate hyperbolicity for binomial random graphs. Surprisingly, this important graph parameter has not been well investigated for random graphs, even though random graph theory is an important and active research area with numerous applications. In [20], sparse random graphs (p = p(n) = c/n for some real number c > 1) are analyzed. It was shown that G(n, p) is, with positive probability, not δ-hyperbolic for any positive δ. Nothing seems to be known for $p \gg n^{-1}$. On the other hand, it is known that for a random d-regular graph G, for d ≥ 3, we have that a.a.s.
where ω(n) is any function tending to infinity together with n. (In fact, almost geodesic cycles are investigated in [4], and this is an easy consequence of this result.) The hyperbolicity of the class of Kleinberg's small-world random graphs is investigated in [7].
Let j ≥ 2 be the smallest integer such that $d^j/n - 2\log n \to \infty$. Then, the following properties hold a.a.s.
(i) If j is even and $d^{j-1} \le \frac{1}{16} n \log n$, then $\delta_H(G) = j/2$.
(ii) If j is even and $d^{j-1} > \frac{1}{16} n \log n$ (but still $d^{j-1} \le (2 + o(1)) n \log n$), then $j/2 - 1 \le \delta_H(G) \le j/2$.
(iii) If j is odd, then $\delta_H(G) = (j-1)/2$.
Furthermore, the following complementary results hold.
Remark. It seems that with quite a bit more work, we could slightly push the lower bound required for d and require only that $d \gg \log^3 n$ or perhaps even only $d \gg \log^2 n$. Unfortunately, it seems to be more difficult to investigate sparser graphs (that is, assuming only $d \gg \log n$ or even closer to the connectivity threshold). Therefore, we aim for an easier (and cleaner) argument in this paper, leaving the investigation of sparser graphs as an open problem. Let us also mention that the hyperbolicity is not determined precisely for dense graphs right before the diameter decreases from even j to j − 1 (case (ii) in Theorem 1.1). Again, the constant 1/16 could be slightly improved with a more delicate argument but the gap cannot be closed with the current approach. This is also worth investigating and (unfortunately) left open at the moment.

Preliminaries
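In this section, we introduce a few useful lemmas. The following result is well-known but we include the proof for completeness.
Lemma 2.1. Let G be a connected graph of diameter D. Then $\delta_H(G) \le D/2$.
Proof. Consider any four vertices u, v, x, y with their three sums of distances $d_1 = d(u,v) + d(x,y)$, $d_2 = d(u,x) + d(v,y)$, $d_3 = d(u,y) + d(v,x)$, labelled so that $d_1 \ge d_2 \ge d_3$. Clearly, $d_1 \le 2D$. First observe that by applying the triangle inequality four times, $d_1 \le d_2 + d_3$. Hence, if $d_3 \le D$, then $d_1 - d_2 \le d_3 \le D$, so the required condition holds and we are done. Otherwise, $d_3 > D$ and so also $d_2 > D$. As a consequence, $d_1 - d_2 < 2D - D = D$, and we are done as well.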
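For readers who want the step "applying the triangle inequality four times" spelled out, the following is a sketch of the computation. The labelling $d_1 = d(u,v)+d(x,y)$, $d_2 = d(u,x)+d(v,y)$, $d_3 = d(u,y)+d(v,x)$ with $d_1 \ge d_2 \ge d_3$ is an assumption made here for illustration; by symmetry, any other labelling of the three sums works the same way.

```latex
\begin{align*}
d(u,v) &\le d(u,x) + d(x,v), & d(u,v) &\le d(u,y) + d(y,v),\\
d(x,y) &\le d(x,u) + d(u,y), & d(x,y) &\le d(x,v) + d(v,y).
\end{align*}
% Summing the four inequalities:
\begin{align*}
2 d_1 = 2\bigl(d(u,v) + d(x,y)\bigr)
  \le 2\bigl(d(u,x) + d(v,y)\bigr) + 2\bigl(d(u,y) + d(v,x)\bigr)
  = 2 d_2 + 2 d_3 ,
\end{align*}
% and therefore d_1 - d_2 \le d_3.
```

In other words, the difference between the two largest sums is always bounded by the smallest sum, which is what the case distinction on $d_3$ in the proof exploits.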
We can slightly improve the upper bound for graphs with odd diameter.
Proof. As in the previous proof, we consider four vertices u, v, x, y with their three sums of distances Arguing as in the previous proof, we get that So the only case to analyze is when and D is odd, we may assume (without loss of generality) that d(u, y) < D/2. Then, since and D = d(x, y) ≤ d(x, u) + d(u, y), we have d(y, v) > D/2, d(x, u) > D/2, and we have that d 2 = d(u, x) + d(v, y) > D, contradicting our assumption on d 2 . Therefore, d 1 − d 2 ≤ D − 1, and the lemma follows.
In order to bound the hyperbolicity from above, we will make use of the following result for random graphs, see [6,Corollary 10.12]. Then the diameter of G ∈ G(n, p) is equal to i a.a.s.
From the proof of this result, we have the following corollary.
Corollary 2.4. Suppose that $d = p(n-1) \gg \log n$ and that $d^i/n - 2\log n \to \infty$. Then the diameter of G ∈ G(n, p) is at most i a.a.s.
In order to obtain a lower bound on the hyperbolicity, we will need the following expansion lemma investigating the shape of typical neighbourhoods of vertices. Before we state the lemma we need a few definitions. For any j ≥ 0, let us denote by N(v, j) the set of vertices at distance at most j from v, and by S(v, j) the set of vertices at distance exactly j from v. Also, for a set of vertices F ⊆ V and x ∈ V \ F, denote by $N_{V \setminus F}(x, j)$ the set of vertices in V \ F at distance at most j from x in the graph induced by V \ F, and similarly let $S_{V \setminus F}(x, j)$ be the set of vertices in V \ F at distance exactly j from x in the graph induced by V \ F.
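The sets just defined can be computed by a breadth-first search that simply ignores the vertices of F. The sketch below is illustrative only: the names `spheres` and `ball` and the parameter `forbidden` (playing the role of F) are assumptions introduced here, and the graph is assumed to be a dictionary mapping each vertex to its neighbours.

```python
from collections import deque


def spheres(adj, v, max_dist, forbidden=frozenset()):
    """Return [S(v, 0), ..., S(v, max_dist)] in the graph induced by V minus `forbidden`."""
    if v in forbidden:
        raise ValueError("start vertex must lie outside the forbidden set")
    dist = {v: 0}
    layers = [{v}] + [set() for _ in range(max_dist)]
    queue = deque([v])
    while queue:
        w = queue.popleft()
        if dist[w] == max_dist:
            continue  # do not explore beyond the requested radius
        for z in adj[w]:
            if z not in dist and z not in forbidden:
                dist[z] = dist[w] + 1
                layers[dist[z]].add(z)
                queue.append(z)
    return layers


def ball(adj, v, max_dist, forbidden=frozenset()):
    """Return N(v, max_dist) in the graph induced by V minus `forbidden`."""
    return set().union(*spheres(adj, v, max_dist, forbidden))
```

With `forbidden` left empty, `spheres` and `ball` give S(v, j) and N(v, j) in the whole graph. Note that distances in the induced graph can be strictly larger than distances in G, since shortest paths through F are no longer available.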
Let G = (V, E) ∈ G(n, p) and let i ≥ 4 be the largest even integer such that d i−1 ≤ 1 16 n log n.
there is no edge from v to F .
In particular, it follows that a.a.s. the following properties hold: (iii) for all j = 1, 2, . . . , i/2 − 1, , (vii) for any fixed partition of the neighbours of v into two sets, V L and V R , such that ||V L | − |V R || ≤ 1, let S L denote the set of vertices of S(v, i/2 − 1) that are at distance i/2 − 2 from V L , and let S R = S(v, i/2 − 1) \ S L ; then, , and v ∈ V \ F . Consider the random variable X = X(F, v) = |S V \F (v, 1)|. We will bound X in a stochastic sense. There are two things that need to be estimated: the expected value of X, and the concentration of X around its expectation. Since X ∈ Bin(n − f − 1, p), it is clear that A consequence of Chernoff's bound (see e.g. [15,Corollary 2.3]) is that for 0 < ε < 3/2. Hence, after taking ε = 2 log n/d, we get that with probability 1 + o(n −1 ) we have This proves part (i) of the lemma. Part (ii) is straightforward since the probability that there is no edge from v to F is equal to Part (iii) is a straightforward implication of (i). In order to have good bounds on the ratios of the cardinalities of N (v, 1), N (v, 2), and so on, we consider the Breadth First Search (BFS) algorithm that explores vertices one by one (instead of the whole j-th neighbourhood). Formally, the process is initiated by putting v into the queue Q. In each step of the algorithm, one vertex w is taken from Q and edges from w to all vertices that are not in F and have not yet been discovered are examined. All new neighbours of w that are found are put into the queue Q. The process continues until the queue Q is empty or vertices of are in the queue Q; that is, no vertex from this sphere is processed and, in particular, edges in the graph induced by S V \F (v, i/2 − 1) are not exposed yet.) Suppose that N V \F (v, j − 1) is discovered and we continue investigating vertices of the sphere S V \F (v, j − 1), one by one, that are in the queue Q. 
Provided O(d i/2−1 ) vertices have been discovered so far, it follows from part (i) that we may assume that when each vertex of After that we update F by adding all newly discovered vertices to it, adding the vertex processed at this step, and removing the next vertex to be processed-see Figure 1.
We consider this up to the j-th iterated neighbourhood, where j = i/2 − 1 and $d^{i-1} \le n \log n/16$, and thus j = O(log n/ log log n). Then the cumulative multiplicative error term is
Figure 1. Two consecutive steps of BFS started from vertex v. The black vertex is the vertex currently exposed. Grey vertices form the set F that is updated each time. White vertices are newly discovered ones.

This establishes (iii).
For parts (iv), (v), and (vi) we note that in each step of the BFS algorithm the probability that there is no edge from w (the vertex that is processed at this point) to vertices that have been already discovered is, by part (ii), 1 − O(d i/2 /n). Hence, by the union bound, a.a.s. this never happens since the number of vertices processed is The claim follows.
Part (vii) follows immediately (and deterministically) from (iv), (v), and (vi). The proof of the lemma is finished.
We first give the proof of the result for very dense graphs.
Proof of Theorem 1.1(iv)-(v). For p = 1 − o(1/n 2 ), note that the expected number of edges in the complement of G is n 2 (1 − p) = o(1), and thus by Markov's inequality, a.a.s., G is the complete graph on n vertices. If this is the case, then d(u, v) = 1 for any pair of vertices u and v, and thus for any four vertices u, v, x, y, clearly, For p = 1 − 2c/n 2 and c > 0, note that a.a.s. there is no component of size 3 or more in the complement of G. Thus, a.a.s., for all four-tuples of vertices in the original graph, either all edges are present, only one edge is missing, or two disjoint edges are missing. In all of these cases, the non-adjacent vertices are at distance 2, and thus a.a.s. δ H (G) ≤ 1. The expected number of edges in the complement of G equals n 2 (1 − p) = (1 + o(1))c. Also, for any fixed r, the r-th moment of the number of edges in the complement of G equals c r (1 + o(1)), and thus, by the method of moments (see, for example, Theorem 1.22 of [6]) the number of edges converges to a random variable with a Poisson distribution with parameter c. In particular, with probability (1+o(1))e −c , the complement of G is empty, and by the argument in the first case, we have δ H (G) = 0. Also, with probability (1 + o(1))ce −c , the complement of G contains exactly one edge, say {u, v}. For the four-tuples not containing both u and v, the analysis is as before. For a four-tuple u, v, x, y we now have for the distances in the original graph x) = 2, and thus δ H (G) ≥ 1, and part (iv) follows.
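The Poisson limit used in this proof can be illustrated numerically. The sketch below computes, for finite n, the exact probabilities that the complement of G(n, p) with p = 1 − 2c/n² has no edge or exactly one edge, using only the fact that the number of missing edges is binomially distributed; the function names are hypothetical and chosen for this illustration.

```python
from math import comb, exp


def prob_complement_empty(n, c):
    """Exact probability that G(n, p) with p = 1 - 2c/n^2 is complete."""
    p = 1 - 2 * c / n**2
    return p ** comb(n, 2)


def prob_complement_one_edge(n, c):
    """Exact probability that exactly one pair is a non-edge."""
    m = comb(n, 2)          # number of vertex pairs
    r = 2 * c / n**2        # probability that a given pair is a non-edge
    return m * r * (1 - r) ** (m - 1)


if __name__ == "__main__":
    c = 1.0
    for n in (10, 100, 10_000):
        print(n, prob_complement_empty(n, c), exp(-c),
              prob_complement_one_edge(n, c), c * exp(-c))
```

As n grows, the two computed values approach $e^{-c}$ and $c e^{-c}$ respectively, matching the probabilities quoted in the proof.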
The main challenge of this paper is to prove the following result and the whole next section is dedicated to it. Here, we show how Theorem 1.1(i)-(iii) can be derived from it.
Let i ≥ 2 be the largest even integer such that $d^{i-1} \le \frac{1}{16} n \log n$. Let G ∈ G(n, p). Then, a.a.s., $\delta_H(G) \ge i/2$.
Proof of Theorem 1.1(i)-(iii). Fix j to be the smallest integer such that $d^j/n - 2\log n \to \infty$. In particular, $d^{j+1}/n = \omega(\log n)$. Moreover, it follows from Corollary 2.4 that the diameter of G is at most j a.a.s. Hence, by Lemma 2.1, a.a.s. $\delta_H(G) \le j/2$. This establishes the upper bounds in parts (i) and (ii).
Suppose first that j is even and that $d^{j-1} \le \frac{1}{16} n \log n$. Then j is the largest even integer such that $d^{j-1} \le \frac{1}{16} n \log n$. By Theorem 2.6, a.a.s. $\delta_H(G) \ge j/2$ and part (i) holds. Suppose next that j is even and that $d^{j-1} > \frac{1}{16} n \log n$ (note that it follows from the definition of j that $d^{j-1} \le (2 + o(1)) n \log n$). Then j − 2 is the largest even integer such that $d^{j-3} \le \frac{1}{16} n \log n$, and by Theorem 2.6, a.a.s. $\delta_H(G) \ge j/2 - 1$. This finishes part (ii). Finally, suppose that j is odd. Since $d^{j-1}/n = O(\log n)$, we have $d^{j-2}/n = o(\log n)$, and thus j − 1 is the largest even integer such that $d^{j-2} \le \frac{1}{16} n \log n$. By Theorem 2.6, a.a.s. $\delta_H(G) \ge (j-1)/2$. Since a.a.s. the diameter of G is at most j, and j is odd, by Lemma 2.2 we have that a.a.s. $\delta_H(G) \le (j-1)/2$. This finishes part (iii), and so the whole proof.

Proof of Theorem 2.6
Let G = (V, E) ∈ G(n, p) and suppose that $d = p(n-1) \gg \log^5 n/(\log\log n)^2$ and $p = 1 - \omega(1/n^2)$. Let i ≥ 2 be the largest even integer such that $d^{i-1} \le \frac{1}{16} n \log n$. Assume first that $d > (\frac{1}{16} n \log n)^{1/3}$, which implies that i = 2. In this case, we have to prove that a.a.s. $\delta_H(G) \ge 1$. It therefore suffices to find four vertices u, v, x, y such that the subgraph induced by them is a 4-cycle. Since $p > n^{-2/3} (\frac{1}{16} \log n)^{1/3}$ and $1 - p = \omega(n^{-2})$, the expected number of induced cycles of length 4 is $3\binom{n}{4} p^4 (1-p)^2 \to \infty$. It is a straightforward application of the second moment method to show that a.a.s. there is at least one induced cycle of length 4 in G, and the statement follows in this case.
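The first-moment computation for this dense case can be made concrete. Each set of four vertices supports three distinct 4-cycles, and a given one is induced exactly when its four edges are present and its two diagonals are absent, which gives the standard count $3\binom{n}{4}p^4(1-p)^2$. The sketch below evaluates this expectation and counts induced 4-cycles in a small graph by brute force; the function names are hypothetical and the graph is assumed to be simple, given as a dictionary mapping each vertex to its neighbours.

```python
from itertools import combinations
from math import comb


def expected_induced_c4(n, p):
    """Expected number of induced 4-cycles in G(n, p)."""
    return 3 * comb(n, 4) * p**4 * (1 - p) ** 2


def count_induced_c4(adj):
    """Count vertex sets of size four that induce a 4-cycle."""
    count = 0
    for quad in combinations(adj, 4):
        # Four vertices induce a 4-cycle exactly when each of them
        # has precisely two neighbours among the other three.
        if all(sum(1 for b in quad if b in adj[a]) == 2 for a in quad):
            count += 1
    return count
```

Any four vertices found by `count_induced_c4` form a four-tuple whose three distance sums are 4, 2, and 2, so the two largest differ by 2 and the graph has hyperbolicity at least 1.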
Hence, from now on we may assume that which implies that i ≥ 4. We need one more definition: for a given u ∈ V , k ≥ 1, and A ⊆ V , we say that N V \A (u, k) expands well if for all j = 1, 2, . . . , k, Figure 2. left: Hyperbolicity(u, v, x, y), the big picture; right: the neighbourhood exposure around a in more detail and for all j = 1, 2, . . . , k − 1, every vertex of S V \A (u, j) has d(1 + o(log log n/ log 2 n)) neighbours in S V \A (u, j + 1). Finally, fix a four-tuple of different vertices u, v, x, y and consider the following process (see Figure 2): . Make sure that the following properties hold (otherwise stop the process): The reason that here we restrict ourselves to the induced graph is to make sure no edge in this graph is already exposed and so, as typical, we perform BFS by exposing edges one by one, as required.) Make sure that the following properties hold (otherwise stop): There is no edge from N V \D (v, i/2 − 1) to S(u, i/2 − 1) (note that edges from vertices of N (u, i/2 − 2) are already exposed, so that the only chance for the intersection of N V \D (v, i/2 − 1) and N (u, i/2 − 1) to be non-empty is when we reach vertices of S(u, i/2 − 1)).  (4) In this step, the neighbourhood of a is investigated. Unfortunately, this is slightly more complicated since some part of the neighbourhood of a is already "buried" in N (u, i/2 − 1). In order to accomplish our goal, we need to perform BFS not only from a (up to level i/2 − 2), but also from some other vertices of S(u, i/2 − 1) (this time going not as deep as i/2 − 2; the level until which the neighborhood is explored depends on the distance from a)-see Figure 2 (right side). Formally, for 1 ≤ k ≤ i/2 − 2, let S k be the set of vertices of S(u, i/2 − 1) that are at distance k from a in the tree induced by N (u, i/2 − 1). (In fact, k has to be even in order for S k to be non-empty, but we consider all values of k for simplicity.) 
Let We perform BFS from a and from vertices of i/2−2 k=1 S k in the graph induced by V \ F ; we reach vertices at distance i/2 − 2 from a and at distance i/2 − 2 − k from S k . Make sure that the following properties hold (otherwise stop).
(n): N V \F (a, i/2 − 2) expands well. Moreover, for all 1 ≤ k ≤ i/2 − 2 and all ∈ S k we have that N ( , i/2 − 2 − k) expands well. In particular, (o): There is no edge from N V \F (a, i/2−2)\{a} to F and for every k = 1, 2, . . . , i/2− 2 and every vertex ∈ S k , there is no edge from N V \F ( , i/2 − 2 − k) \ { } to F . (p): All graphs exposed in this step are disjoint trees. Note that this implies that N (a, i/2 − 2) is a tree. (q): For 1 ≤ k ≤ i/2 − 2, let S k be the set of vertices of S(v, i/2 − 1) that are at distance k from b in the tree induced by N (v, i/2 − 1) and let Perform BFS from b and from vertices of i/2−2 k=1 S k in the graph induced by V \F ; Properties (n), (o), and (p) hold when a is replaced by b, F is replaced by F , and the sets S k are replaced by S k . (r): For 1 ≤ k ≤ i/2 − 2, let S k be the set of vertices of S(u, i/2 − 1) that are at distance k from c in the tree induced by N (u, i/2 − 1) and let Perform BFS from c and from vertices of Perform BFS from d and from vertices of i/2−2 k=1 S k in the graph induced by V \ F ; Properties (n), (o), and (p) hold when a is replaced by d, F is replaced by F , and the sets S k are replaced by S k . (5) Let We perform BFS from x in the graph induced by V \ Q to expose N V \Q (x, i/2 − 1). Make sure that the following properties hold (otherwise stop):  replaced by y and Q is replaced by R. (6) It is the end of this tedious process so it is time for a short break-perform a fireworks show (fireworks are explosive pyrotechnic devices typically used for aesthetic, cultural, and religious purposes; here the main purpose is to celebrate finding an object with the desired properties).
We say that the process Hyperbolicity(u,v,x,y) terminates successfully if all the required conditions are satisfied, that is, the process does not stop prematurely before reaching the end. For the distance between x and y observe the following: first, by Properties (j) and (l) there is an x − y path of length i going from x to a, then through u to c, and then to y. We will show that N (x, i/2 − 1) ∩ N (y, i/2 − 1) = ∅ and that there is no edge between S(x, i/2 − 1) and S(y, i/2 − 1). Indeed, if a shortest x − y-path first goes from x to a, by Properties (o), (r), (s) and (x), it has to go until S(a, i/2 − 2), and then it has to pass through at least two more edges before entering S(y, i/2 − 1), including S(c, i/2 − 2) and S(d, i/2 − 2), and in each case the length is at least i. By properties (q), (r), (s) and (x), the same holds if the path starts from x to b. If the path from x neither goes through a nor through b, it has to go through N V \Q (x, i/2 − 1). By Properties (u) and (x), it has to arrive at S(x, i/2 − 1), and then it has to go through at least two edges before entering S(y, i/2 − 1), including S(c, i/2 − 2) and S(d, i/2 − 2), and in each case the length is also at least i.
Thus, by Claim 3.1, in order to show that a.a.s. δ H (G) ≥ i/2, it suffices to show that a.a.s. Hyperbolicity(u, v, x, y) succeeds for at least one four-tuple of vertices u, v, x, y. Let X u,v,x,y be the indicator random variable defined as follows: where the sum is taken over all n 4 4-tuples of all disjoint vertices. In order to prove that a.a.s. δ H (G) ≥ i/2, we will apply the second moment method to X. Define q = exp(−d i−1 /(2n)) and note that from the assumption that d i−1 ≤ 1 16 n log n we have q ≥ n −1/32 . Proof. Fix a four-tuple of vertices u, v, x, y. First, we will calculate Pr (X u,v,x,y = 1). We will estimate for each of the five steps of Hyperbolicity(u,v,x,y) the probability that it fails at that step. For z ∈ {a, b, . . . , x}, let P z be the indicator random variable for the event that Property (z) succeeds provided that all previous Properties have succeeded as well. Similarly, for step z ∈ {1, 2, . . . , 5}, let T z be the indicator random variable for the event that step z succeeds provided that all previous steps have succeeded as well. By Lemma 2.5(iii) and (iv), P(P a = 1) = 1 + o(1). Let E be the event that there is no edge within the last sphere S V \B (u, i/2 − 1). By Lemma 2.5(v) and (vi), in order to calculate the probability that Property (b) holds, it remains to estimate the probability that E holds. We have where the last equality follows from Property (a) that is assumed to hold deterministically now. Hence, where the last line follows from the assumption that d i−1 ≤ 1 16 n log n. Hence, the probability that N V \B (u, i/2 − 1) is a tree is asymptotically equal to the probability that the event E holds, and thus (1)). Now, let us move to Property (c). By Lemma 2.5(ii) together with a union bound over all vertices in N V \B (u, i/2 − 2), we see that with probability there is no edge from N V \B (u, i/2 − 2) to B, and thus P(P c = 1) = 1 + o(1). Hence, (1)). 
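The lower bound on q stated at the start of this argument follows from a one-line computation, sketched here using only the definition $q = \exp(-d^{i-1}/(2n))$ and the assumption $d^{i-1} \le \frac{1}{16} n \log n$ from the text.

```latex
\begin{align*}
q = \exp\!\left(-\frac{d^{i-1}}{2n}\right)
  \ge \exp\!\left(-\frac{1}{2n}\cdot\frac{n \log n}{16}\right)
  = \exp\!\left(-\frac{\log n}{32}\right)
  = n^{-1/32},
\qquad\text{and hence}\qquad
q^{16} \ge n^{-1/2}.
\end{align*}
```

The second inequality is the form in which the bound is typically needed later, since the success probability of the process involves the factor $q^{16}$.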
Next, for Property (d), by Lemma 2.5(iii) and (iv), we obtain P(P d = 1) = 1 + o(1).
For Property (e), since Property (d) is assumed to hold deterministically at this point, we have and hence, by the same calculations following (3) we obtain P(P e = 1) = q 2 (1 + o (1)). The probability of having Property (f ) is calculated as before for Property (b), and of having Property (g) as before for Property (c). Thus, (1)). For Property (h) we immediately have by Lemma 2.5(vii) that P(P h = 1) = 1 + o(1), and the same applies to Property (i). For Property (j), since Property (h) is assumed to hold deterministically, we have where the last line follows from the fact that (d i−1 ) 1/2 = O( √ n log n) (by definition of i), and by (2), which implies that (d i−1 ) 1/2 √ d/n = O( d log n/n) = o(1). Note then that Properties (j), (k), (l) and (m) are symmetric and mutually independent, and thus calculated the same way. Therefore Let us move to investigating Properties (n), (o), and (p). First, we perform BFS from a in V \ F . It follows immediately from Lemma 2.5(iii) that N V \F (a, i/2 − 2) expands well and so the bound on |N V \F (a, i/2 − 2)| in Property (n) holds a.a.s. For the vertices in S k , since Properties (a) and (b) are assumed to hold deterministically, for every even value of k such that 2 ≤ k ≤ i/2 − 2, the number of vertices in S k is (1 + o(1))d k/2 . In order to deal with the second bound of Property (n) (and to investigate Properties (o) and (p) at the same time), we mimic the proof of Lemma 2.5(iii). We perform BFS from some other vertex in some S k in V \ F , updating the set F every time a vertex is processed. As shown in Figure 1, the vertex that was processed before, together with all its neighbours, will be added to F , and the next vertex in the queue to be processed will be taken out of F . Once we are done, we take the next vertex in some S k and continue in this way until all neighbourhoods under consideration are discovered. 
Arguing as in the proof of Lemma 2.5(iii), by Lemma 2.5(i) together with a union bound over all vertices processed, we obtain the desired bounds for the sizes of neighbourhoods. Moreover, by Lemma 2.5(ii) together with a union bound over all vertices that are discovered during this step (at most O(d i/2−2 ) vertices), we get that a.a.s. at the time when a given vertex was processed there was no edge to already discovered vertices (neither within the same tree where we started BFS from, nor to other trees, nor to the initial set F ). This deals with Properties (o) and (p). Finally, it follows that a.a.s.
where the last equality follows from the fact that d log 5 n/(log log n) 2 . Thus, P(P n = 1 and P o = 1 and P p = 1) = 1 + o(1). The probabilities for Properties (q), (r) and (s) to hold are calculated in exactly the same way as for Properties (n), (o) and (p), and hence P(T 4 = 1) = 1 + o(1). Finally, for T 5 , when exposing x, Property (t) is investigated as before. Also, by analogous calculations as for T 1 , the probability of having no edge to Q (Property (u)) and the one of being a tree (Property (v)), altogether yield q 5 (1 + o(1)); note that the exponent of 5 comes from the fact that N V \Q (x, i/2 − 1) is a tree (giving one q), and that there is no edge to S(u, i/2 − 1) (giving 2 additional factors of q), and no edge to S(v, i/2 − 1) (giving another 2 additional factors of q). Property (w) clearly also holds with probability 1 + o(1). Similarly, when exposing y, we also have to consider the condition of having no edge to N V \R (x, i/2 − 1) (Property (x)), giving us another factor of q 2 , and thus yielding a probability of q 7 (1 + o(1)). Thus, P(T 5 = 1) = q 12 (1 + o(1)). Combining the events T 1 , T 2 , . . . , T 5 , we obtain P(X u,v,x,y = 1) = (d i/2 /(2n)) 4 q 16 (1 + o(1)), yielding the first part of the lemma. Observing that i is such that d i+1 > 1 16 n log n, and therefore d i > 1 16 n log n/d, and also using (2), we obtain ( 1 16 n log n) 2 384d 2 q 16 (1 + o(1)) = Ω(n 3/2 (log n/d) 2 ) = Ω(n 5/6 (log n) 4/3 ), which finishes the proof of the lemma.
We now move to the second moment method.
Proof. In order to analyze the expected value of X 2 we will consider a number of different cases. Note that where both sums range over all 4-tuples of different vertices. For a fixed 4-tuple of vertices u, v, x, y, it follows from the first part of Lemma 3.2 that P(X u,v,x,y = 1) = (d i/2 /(2n)) 4 q 16 (1 + o(1)).
Conditioning on $X_{u,v,x,y} = 1$, our goal is to investigate $P(X_{u',v',x',y'} = 1 \mid X_{u,v,x,y} = 1)$. Note that the lower bound for $E[X^2]$ trivially holds and so we aim for an upper bound. Therefore, we focus on properties that hold with probability o(1) in the unconditional case (such as Property (b), which holds with probability asymptotic to q). We ignore properties that hold with probability 1 + o(1) in the unconditional case (such as Property (a)), although it might happen that the probability that they hold in the conditional space is smaller (which clearly helps).
Assume that X u,v,x,y = 1, and let U be the set of vertices exposed to certify this, that is, x, y, a, b, c, d}, we always denote by z the vertex in Hyperbolicity(u , v , x , y ) corresponding to vertex z in Hyperbolicity(u, v, x, y).
Before we move to investigating cases, let us note that we may assume the following useful properties.
Claim 1: For a vertex $z' \notin U$, the number of edges between $N(z', i/2 - 1) \setminus U$ and U is O(log n).
Proof. Indeed, we may assume that $|N(z', i/2 - 1) \setminus U| = O(d^{i/2-1})$ and so the expected number of edges between $N(z', i/2 - 1) \setminus U$ and U is $O(p d^{i-2}) = O(d^{i-1}/n) = O(\log n)$. It follows from Chernoff's bound that there exists some constant C > 0 such that the probability of having at least C log n edges of this type is $o(n^{-4})$. The contribution of all such four-tuples u', v', x', y' to $E[X^2]$ is o(1) and so can be safely ignored.
Proof. Indeed, suppose that there is an edge from N (z , i/2 − 1) \ U to U . In fact, since U is exposed during the BFS process, this edge has to be adjacent to a leaf of the graph induced by U . We need to estimate the size of N ( , i/2 − 2) ∩ U , since vertices in N (z , i/2 − 1) ∩ U due to the existence of this edge form a subset of N ( , i/2 − 2) ∩ U . From the fact that U induces an almost regular tree, it follows that |N ( , i/2 − 2) ∩ U | = O(d i/4−1 ). The claim follows now from Claim 1.
Now, we are ready for a case analysis.
It follows from Claim 2 above that the part of the graph that needs to be exposed in order to check whether X u ,v ,x ,y = 1 intersects with U only in a negligible way. It is straightforward to show that Properties (b), (e), (f ), (u), (v) and (x), as well as Properties (j), (k), (l) and (m) are up to a 1 + o(1) factor independent of conditioning on X u,v,x,y = 1, and their calculations are as in Lemma 3.2. We obtain P(X u ,v ,x ,y = 1|X u,v,x,y = 1) ≤ (d i/2 /(2n)) 4 q 16 (1 + o(1)) = P(X u,v,x,y = 1)(1 + o(1)).
Clearly, the number of choices of four-tuples u, v, x, y and u , v , x , y is at most the square of the number of choices for u, v, x, y, so the contribution of this case is at most (1 + o(1))E X 2 . (We will show that the contribution of all other cases is o(E X 2 ) and so the contribution of this case is indeed (1 + o(1))E X 2 .) Since {x , y } ∩ U = ∅, edges emanating from x and y are not exposed yet. Hence, the probabilities that Properties (j), (k), (l) and (m) hold are as in the unconditional case. Ignoring all other properties we get On the other hand, since at least one of u , v is in U , only a fraction of four-tuples is considered here. Hence, the contribution of this case to X 2 is negligible.
From now on we may assume that at least one of x , y has to be in U .
Case 3: {u , v } ∩ U = ∅, and |{x , y } ∩ U | = 1. By symmetry, we may assume that x ∈ U and y / ∈ U . Since y / ∈ U , arguing as in the previous case, we get that Properties (l) and (m) hold with probability up to a 1 + o(1) factor as in the unconditional case. However, it might happen that, say, N (u , i/2 − 1) ∩ U = ∅ and Property (j) holds "for free". But for this to happen, an edge joining N V \U (u , i/2 − 1) and U must occur at the right place, which happens with small probability. Moreover, only a small fraction of four-tuples satisfies x ∈ U .
More precisely, for Property (j) to hold "for free", we must have that d(u , x ) = i/2, and so there must be an edge between S U (x , k) and S V \U (u , i/2−1−k) for some k = 0, 1, . . . , i/2−1. By the union bound, this happens with probability since i = O(log n/ log log n). (In fact, with a slightly more delicate argument one can remove the log n/ log log n factor but this is not needed.) The same argument applies to Property (k). (Recall, that we may assume that N (u , i/2 − 1) and N (v , i/2 − 1) are disjoint.) Hence, by considering only these four properties we get that On the other hand, since x ∈ U , only a O(|U |/n) = O(n −1/2 (log −2 n)(log log n)) = o(q 16 (log n/ log log n) −2 ) fraction of four-tuples is considered here. Hence, the contribution of this case to X 2 is negligible.
Note that if X u ,v ,x ,y = 1 then, in particular, d(x , y ) = i and so the distance between x and y in the graph induced by U is at least i. It follows that, for example, an edge between s ∈ S U (x , k) and t ∈ S V \U (u , i/2 − 1 − k) (for some k = 0, 1, . . . , i/2 − 1) cannot make both Property (j) and Property (l) to hold, since that would imply that d(x , y ) ≤ d(x , s) + d(s, y ) ≤ i − 2. Hence the calculations dealing with y (related to Properties (l) and (m)) are independent of calculations dealing with x (related to Properties (j) and (k)). Arguing as in the previous case, we get that P(X u ,v ,x ,y = 1|X u,v,x,y = 1) = O P(X u,v,x,y = 1)q −16 (log n/ log log n) 4 . From now on we may assume that at least one of u , v has to be in U (and still at least one of x , y ).
By symmetry, suppose that u ∈ U , v / ∈ U , x ∈ U , y / ∈ U . Since y / ∈ U , as before, we get that Properties (l) and (m) hold with probability up to a 1 + o(1) factor as in the unconditional case. Now, since v / ∈ U , by the same calculations as in Case 3, the probability that Property (k) holds is at most O (d i/2 /n) log n/ log log n . Property (j), however, might hold deterministically now, so we lose an additional factor of O(d i/2 /n). By considering only the three properties we have an upper bound on we get that log n log log n .
On the other hand, since u , x ∈ U , only a O((|U |/n) 2 ) = O((d i/2−1 /n)n −1/2 (log −2 n)(log log n)) = o((d i/2 /n)q 16 (log log n/ log n)) fraction of four-tuples is considered here, and so the contribution of this case to X 2 is negligible.
Case 6: |{u , v } ∩ U | = 1, and |{x , y } ∩ U | = 2. By symmetry, suppose that u ∈ U and v / ∈ U . By the same argument as in Case 4, the calculations involving Properties related to x are independent of those related to y . Since v / ∈ U , by the same calculations as in Case 3, the probability that Properties (k) and (m) both hold is at most O (d i/2 /n) 2 (log n/ log log n) 2 . This time, Properties (j) and (l) might hold deterministically, so we lose an additional factor of O((d i/2 /n) 2 ). By considering only the two properties we have a bound on we get that P(X u ,v ,x ,y = 1|X u,v,x,y = 1) = O P(X u,v,x,y = 1)q −16 n d i/2 2 log n log log n 2 .
By symmetry, suppose that x ∈ U and y / ∈ U . Since y / ∈ U , as before, we get that Properties (l) and (m) hold with probability up to a 1 + o(1) factor as in the unconditional case. By considering only these two properties we get that P(X u ,v ,x ,y = 1|X u,v,x,y = 1) = O P(X u,v,x,y = 1)q −16 n d i fraction of four-tuples is considered here, and the contribution of this case to X 2 is negligible.
Case 8: |{u , v } ∩ U | = 2, and |{x , y } ∩ U | = 2. First, let us observe that in order to have X u ,v ,x ,y = 1 the vertices u , v , x , y have to lie on an induced cycle of length 2i (of course, this is a necessary condition only). Moreover, note that there is only one such cycle in the graph induced by U (namely, the one going through u, v, x, y). Hence, in order for this necessary condition to hold "for free," all four vertices of u , v , x , y have to be on this cycle. Thus, there are only (2i) 4 = O((log n/ log log n) 4 ) = o(E [X]) such 4-tuples of vertices, and this contribution is negligible. Otherwise, we observe that at least one edge of the cycle u , v , x , y is not yet present in the graph induced by U , say an edge on the path between u and x . For this to happen, there must be an edge between S(u , k) and S(x , i/2 − 1 − k) for some k ∈ {0, 1, . . . , i/2 − 1} that is not yet exposed. By the same calculations as in Case 3, this happens with probability at most O d i/2 n · log n log log n .
Using Lemma 3.2 and Lemma 3.3, Theorem 2.6 follows now easily by Chebyshev's inequality.
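For completeness, the Chebyshev step can be sketched as follows. The sketch assumes, as the case analysis above indicates, that Lemma 3.3 gives $E[X^2] = (1 + o(1))(E[X])^2$; the exact statement of that lemma is not reproduced here.

```latex
\begin{align*}
\Pr(X = 0)
  \le \Pr\bigl(|X - \mathbb{E}X| \ge \mathbb{E}X\bigr)
  \le \frac{\operatorname{Var} X}{(\mathbb{E}X)^2}
  = \frac{\mathbb{E}[X^2]}{(\mathbb{E}X)^2} - 1
  = o(1).
\end{align*}
```

Hence a.a.s. X ≥ 1, that is, the process Hyperbolicity(u, v, x, y) terminates successfully for at least one four-tuple, which is what is needed for the lower bound δ_H(G) ≥ i/2.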

Concluding remarks and open questions
We have shown that in G(n, p) for $p \gg \log^5 n/(n (\log\log n)^2)$ the hyperbolicity is (up to a possible difference of 1) a monotone decreasing graph parameter. In general, since trees as well as cliques have hyperbolicity 0, the hyperbolicity is not monotone, but we conjecture that in G(n, p) with p above the threshold of connectivity, the same behavior holds. Intuitively, if p is close to the threshold of connectivity, then G(n, p) a.a.s. contains a lot of long cycles, and there will not be many shortcuts, making the hyperbolicity of the graph large. After extending the definition of hyperbolicity to non-connected graphs by defining it as the maximum over all connected components, the situation is quite different. For p < (1 − ε)/n for some ε > 0, the hyperbolicity is 0 a.a.s., but for p > (1 + ε)/n for some ε > 0, the appearance of the giant component makes the hyperbolicity tend to infinity a.a.s. (see also [20]). It would be interesting to investigate for which values of p the hyperbolicity of G(n, p) is maximized, and what this value is. We would also like to know whether the hyperbolicity is monotone increasing up to its maximal value and then decreasing, or whether there are several "peaks."