Random perturbation of sparse graphs

In the model of randomly perturbed graphs we consider the union of a deterministic graph $\mathcal{G}_\alpha$ with minimum degree $\alpha n$ and the binomial random graph $\mathbb{G}(n,p)$. This model was introduced by Bohman, Frieze, and Martin, and for Hamilton cycles their result bridges the gap between Dirac's theorem and the results by P\'{o}sa and Kor\v{s}unov on the threshold in $\mathbb{G}(n,p)$. In this note we extend this result in $\mathcal{G}_\alpha \cup \mathbb{G}(n,p)$ to sparser graphs with $\alpha=o(1)$. More precisely, for any $\varepsilon>0$ and $\alpha \colon \mathbb{N} \to (0,1)$ we show that a.a.s. $\mathcal{G}_\alpha \cup \mathbb{G}(n,\beta /n)$ is Hamiltonian, where $\beta = -(6 + \varepsilon) \log(\alpha)$. If $\alpha>0$ is a fixed constant, this gives the aforementioned result by Bohman, Frieze, and Martin, and if $\alpha=O(1/n)$ the random part $\mathbb{G}(n,p)$ is sufficient for a Hamilton cycle. We also discuss embeddings of bounded degree trees and other spanning structures in this model, which lead to interesting questions on almost spanning embeddings into $\mathbb{G}(n,p)$.


INTRODUCTION AND RESULTS
For $\alpha \in (0,1)$ we let $\mathcal{G}_\alpha$ be an $n$-vertex graph with minimum degree $\delta(\mathcal{G}_\alpha) \ge \alpha n$. A famous result by Dirac [15] says that if $\alpha \ge 1/2$ and $n \ge 3$, then $\mathcal{G}_\alpha$ contains a Hamilton cycle, i.e. a cycle through all vertices of $\mathcal{G}_\alpha$. This motivated the more general question of determining the smallest $\alpha$ such that $\mathcal{G}_\alpha$ contains a given spanning structure. For example, there are results for trees [29], factors [22], powers of Hamilton cycles [26,28], and general bounded degree graphs [12]. These are problems about deterministic graphs and belong to the area of extremal graph theory.
We can consider similar questions for random graphs, in particular for the binomial random graph model $\mathbb{G}(n,p)$, which is the probability space over $n$-vertex graphs in which each edge is present with probability $p$ independently of all others. Analogous to the smallest $\alpha$, we are looking for a function $\hat{p} = \hat{p}(n) \colon \mathbb{N} \to (0,1)$ such that if $p = \omega(\hat{p})$ the probability that $\mathbb{G}(n,p)$ contains some spanning subgraph tends to 1 as $n$ tends to infinity, and if $p = o(\hat{p})$ it tends to 0. We call this $\hat{p}$ the threshold function for the respective property (an easy sufficient criterion for its existence can be found in [8]), and if the first/second statement holds we say that $\mathbb{G}(n,p)$ has/does not have this property asymptotically almost surely (a.a.s.). One often says that $\mathbb{G}(n,p)$ undergoes a phase transition at $\hat{p}$. For the Hamilton cycle problem, Pósa [39] and Koršunov [31] proved independently that $\hat{p} = \log n/n$ gives the threshold. As above, there has been a tremendous amount of research on determining the thresholds for various spanning structures, e.g. for matchings [17], trees [32,36], factors [24], powers of Hamilton cycles [35,37], and general bounded degree graphs [1,18,19,40]. An extensive survey by Böttcher can be found in [9].
Motivated by the smoothed analysis of algorithms [41], both these worlds were combined by Bohman, Frieze, and Martin [7]. For any fixed $\alpha > 0$, they defined the model of randomly perturbed graphs as the union $\mathcal{G}_\alpha \cup \mathbb{G}(n,p)$. They showed that $1/n$ is the threshold for a Hamilton cycle, meaning that there is a graph $\mathcal{G}_\alpha$ such that for $p = o(1/n)$ there is a.a.s. no Hamilton cycle in $\mathcal{G}_\alpha \cup \mathbb{G}(n,p)$, and that for any $\mathcal{G}_\alpha$ and $p = \omega(1/n)$ there is a.a.s. a Hamilton cycle in $\mathcal{G}_\alpha \cup \mathbb{G}(n,p)$. It is important to note that in $\mathbb{G}(n,p)$, $p = 1/n$ is also the threshold for an almost spanning cycle, that is, for any $\varepsilon > 0$ a cycle on at least $(1-\varepsilon)n$ vertices. It should be further remarked that if $p = o(\log n/n)$ there are a.a.s. isolated vertices in $\mathbb{G}(n,p)$; the purpose of $\mathcal{G}_\alpha$ is to compensate for this and to help in turning the almost spanning cycle into a Hamilton cycle.
This first result on randomly perturbed graphs [7] sparked a lot of subsequent research on the thresholds of spanning structures in the randomly perturbed graph model, e.g. trees [10,25,34], factors [4], powers of Hamilton cycles [5,11], and general bounded degree graphs [11]. As for a Hamilton cycle, there is often a log-factor difference to the thresholds in $\mathbb{G}(n,p)$ alone, which is there for local reasons similar to isolated vertices. In most of these cases, a $\mathcal{G}_\alpha$ that is responsible for the lower bound is the complete imbalanced bipartite graph $K_{\alpha n,(1-\alpha)n}$. In this model there are also results with lower bounds on $\alpha$ [6,16,23,38] and for Ramsey-type problems [13,14].
1.1. Hamiltonicity in randomly perturbed sparse graphs. The aim of this note is to investigate a new direction. Instead of fixing an $\alpha \in (0,1)$ in advance, we allow $\alpha$ to tend to zero with $n$. This extends the range of $\mathcal{G}_\alpha$ to sparse graphs, and we want to determine the threshold probability in $\mathcal{G}_\alpha \cup \mathbb{G}(n,p)$. For example, with $\alpha = 1/\log n$ we have a sparse deterministic graph $\mathcal{G}_\alpha$ with minimum degree $n/\log n$. Then $p = \omega(1/n)$ does not suffice in general, but it is sufficient to take $\mathcal{G}_\alpha \cup \mathbb{G}(n,\Theta(\log\log n)/n)$ to a.a.s. guarantee a Hamilton cycle. More generally, we can prove the following.
Theorem 1.1. Let $\alpha = \alpha(n) \colon \mathbb{N} \to (0,1)$ and $\beta = -(6+o(1))\log(\alpha)$. Then a.a.s. $\mathcal{G}_\alpha \cup \mathbb{G}(n,\beta/n)$ is Hamiltonian.
This extends the result of Bohman, Frieze, and Martin [7] for constant $\alpha > 0$. For even $n$ a direct consequence of this theorem is the existence of a perfect matching in the same graph. To prove Theorem 1.1 we use a result by Frieze [20] to find a very long path in $\mathbb{G}(n,p)$ alone and then use the switching technique developed in [11] to turn this into a Hamilton cycle. As it turns out, our method allows us to prove the existence of a perfect matching with a slightly lower edge probability.
To see that in both theorems $\beta$ is optimal up to the constant factor, consider $\mathcal{G}_\alpha = K_{\alpha n,(1-\alpha)n}$ and note that there cannot be a perfect matching if we have more than $\alpha n$ isolated vertices on the $(1-\alpha)n$ side. The number of isolated vertices in $\mathbb{G}(n,\beta/n)$ is roughly $n(1-\beta/n)^{n-1} \approx n\exp(-\beta)$, which is larger than $\alpha n$ if $\beta = o(-\log(\alpha))$.
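The lower-bound heuristic above can be checked numerically. The following is a minimal sketch, not part of the original argument: it compares the expected number of isolated vertices $n(1-\beta/n)^{n-1}$ with $n\exp(-\beta)$ and with the budget $\alpha n$, for $\beta = -c\log(\alpha)$. The function names and the constants $c$ are illustrative assumptions; a constant $c < 1$ only stands in for the regime $\beta = o(-\log(\alpha))$.

```python
import math
from typing import Tuple


def expected_isolated(n: int, beta: float) -> float:
    """Expected number of isolated vertices in G(n, beta/n)."""
    return n * (1.0 - beta / n) ** (n - 1)


def compare(n: int, alpha: float, c: float) -> Tuple[float, float, float]:
    """Return (exact expectation, n*exp(-beta), alpha*n) for beta = -c*log(alpha).

    The constant c is an illustrative assumption, not a quantity from the text.
    """
    beta = -c * math.log(alpha)
    return expected_isolated(n, beta), n * math.exp(-beta), alpha * n


if __name__ == "__main__":
    n = 10**6
    alpha = 1.0 / math.log(n)  # the example alpha = 1/log n from the text
    for c in (0.5, 1.0, 2.0):
        exact, approx, budget = compare(n, alpha, c)
        print(f"c={c}: E[isolated]={exact:.1f}, "
              f"n*exp(-beta)={approx:.1f}, alpha*n={budget:.1f}")
```

For $c < 1$ the expected number of isolated vertices exceeds $\alpha n$, which is the obstruction described above, while for $c > 1$ it falls below $\alpha n$.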
For proving results in the randomly perturbed graph model, good almost spanning results are essential. Typically, by almost spanning one means that for any $\varepsilon > 0$ we can embed the respective structure on at least $(1-\varepsilon)n$ vertices. For paths and cycles in $\mathbb{G}(n,C/n)$ this can, for example, be done using expansion properties and the DFS algorithm [33]. These almost spanning results are much easier than their spanning counterparts, because there is always a linear-size set of available vertices. But for the proof of Theorem 1.1 this is not sufficient, because if $\alpha = o(1)$ we will not be able to take care of a linear-size left-over. Instead we exploit that we have $\mathbb{G}(n,\beta/n)$ and use the following result, showing that we can find a long cycle consisting of all but sublinearly many vertices.

Lemma 1.3 (Frieze [20]). Let $0 < \beta = \beta(n) \le \log n$. Then $\mathbb{G}(n,\beta/n)$ a.a.s. contains a cycle of length at least
This is optimal, because this is asymptotically the size of the 2-core (the maximal subgraph with minimum degree 2) of $\mathbb{G}(n,p)$ [21, Lemma 2.16]. A similar result holds for large matchings.
Lemma 1.4 (Frieze [20]). Let $0 < \beta = \beta(n) \le \log n$. Then $\mathbb{G}(n,\beta/n)$ a.a.s. contains a matching consisting of at
Again this is optimal because of the isolated vertices, whose number in $\mathbb{G}(n,\beta/n)$ is roughly $n\exp(-\beta)$. Observe that a bipartite variant of this lemma also holds, which can be proved by removing vertices of small degree and employing Hall's theorem.
1.2. Bounded degree trees in randomly perturbed sparse graphs. After Hamilton cycles and perfect matchings, the next natural candidates are $n$-vertex trees with maximum degree bounded by a constant $\Delta$. In $\mathbb{G}(n,p)$ the threshold $\log n/n$ was determined in a breakthrough result by Montgomery [36], in $\mathcal{G}_\alpha$ it is enough to have a fixed $\alpha > 1/2$ [27], and in $\mathcal{G}_\alpha \cup \mathbb{G}(n,p)$ with constant $\alpha > 0$ the threshold is $1/n$ [34]. To obtain a result similar to Theorem 1.1 for bounded degree trees using our approach, we need an almost spanning result similar to Lemma 1.3. With an approach similar to that for Theorems 1.1 and 1.2 we obtain the following modular statement.
Theorem 1.6. Let $\Delta \ge 2$ be an integer and suppose that $\alpha, \beta, \varepsilon \colon \mathbb{N} \to [0,1]$ are such that $4(\Delta+1)\varepsilon < \alpha^{\Delta+1}$ and a.a.s. $\mathbb{G}(n,\beta/n)$ contains a given tree with maximum degree $\Delta$ on $(1-\varepsilon)n$ vertices. Then any tree with maximum degree $\Delta$ on $n$ vertices is a.a.s. contained in the union $\mathcal{G}_\alpha \cup \mathbb{G}(n,\beta/n)$.
Next we discuss the almost spanning results that we can obtain in the relevant regime. Improving on a result of Alon, Krivelevich, and Sudakov [2], Balogh, Csaba, Pei, and Samotij [3] proved that for $\Delta \ge 2$ there exists a $C > 0$ such that for $\varepsilon > 0$ a.a.s. $\mathbb{G}(n,\beta/n)$ contains any tree with maximum degree $\Delta$ on at most $(1-\varepsilon)n$ vertices, provided that $\beta \ge \frac{C}{\varepsilon}\log\frac{1}{\varepsilon}$. For the proof they only require that the graph satisfies certain expander properties. This can be extended to the range where $\varepsilon \to 0$ and $\omega(1) = \beta \le \log n$, and following along the lines of their argument we get the following.
a.s. contains any bounded degree tree on at most (1 − ε) n vertices. Then together with Theorem 1.6 we obtain the following.
The proof for the dense case in [34] uses regularity and it is unlikely to give anything better in the sparse regime. As remarked in [2], the condition on the almost spanning embedding in $\mathbb{G}(n,\beta/n)$ could possibly be improved to $\beta > \log\frac{C}{\varepsilon}$, then covering almost all non-isolated vertices. More precisely this asks for the following. With Theorem 1.6 this would then give that already $\beta = -(\Delta+1)\log(C\alpha)$ suffices, which would be optimal up to the constant factors.
We want to briefly argue why it is possible to answer this question for large families of trees and what the difficulties are. For simplicity we only discuss the case $\beta = \log\log n$ and note that by Lemma 1.7 above we can embed trees on roughly $(1 - 1/\log\log n)n$ vertices. A very helpful result for handling trees by Krivelevich [32] states that for any integers $n, k > 2$, a tree on $n$ vertices either has at least $n/(4k)$ leaves or a collection of at least $n/(4k)$ bare paths (paths whose internal vertices have degree 2 in the tree) of length $k$. If there are at least $n/(4\log\log n)$ leaves, we can embed the tree obtained after removing the leaves. Then we can use a fresh random graph and Lemma 1.5 to find a matching for all the leaves, completing the embedding of the tree.
On the other hand, if there are at least $n\log\log n/(4\log n)$ bare paths of length $\log n/\log\log n$, it is possible to embed all but $n/\log n$ of these paths, which amounts to all but $n/\log\log n$ vertices. Then one has to connect the remaining paths, again using ideas from [36]. In between both cases it is not clear what should be done, because we might have $n/\log n$ leaves and $n/(4\log\log n)$ bare paths of length $\log\log n$. The paths are too short to connect them and the leaves are too few for the above argument. Answering this question and thereby improving the result of Alon, Krivelevich, and Sudakov [2] is a challenging open problem.
1.3. Other spanning structures. As mentioned above, embeddings of spanning structures in $\mathcal{G}_\alpha$, $\mathbb{G}(n,p)$, and $\mathcal{G}_\alpha \cup \mathbb{G}(n,p)$ for fixed $\alpha > 0$ have also been studied for other graphs such as powers of Hamilton cycles, factors, and general bounded degree graphs. In most of these cases almost spanning embeddings (e.g. Ferber, Luh, and Nguyen [18]) can be generalised such that previous proofs can be extended to the regime $\alpha = o(1)$ with $\beta = \alpha^{-1/C}$, similar to what we do in Corollary 1.8. Further improvements seem to be hard, because better almost spanning results are similar in difficulty to spanning results in $\mathbb{G}(n,p)$ alone. We want to discuss this on one basic example, the triangle factor, which is the disjoint union of $n/3$ triangles.
In $\mathcal{G}_\alpha$ we need $\alpha \ge 2/3$, in $\mathbb{G}(n,p)$ the threshold is $n^{-2/3}\log^{1/3} n$, and in $\mathcal{G}_\alpha \cup \mathbb{G}(n,p)$ with a fixed $\alpha > 0$ it is $n^{-2/3}$. Note that the log-term in $\mathbb{G}(n,p)$ is needed to ensure that every vertex is contained in a triangle, which is essential for a triangle factor. Using Janson's inequality [21, Theorem 21.12] it is not hard to prove the almost spanning result for a triangle factor on at least $(1-\varepsilon)n$ vertices with $p = \omega(n^{-2/3})$. This can be generalised to $\mathbb{G}(n,\beta n^{-2/3})$, giving a.a.s. a triangle factor on at least $(1 - C/\beta)n$ vertices. Again, this can only give something with $\beta = \alpha^{-1/C}$ in $\mathcal{G}_\alpha \cup \mathbb{G}(n,\beta n^{-2/3})$, and to improve this we ask the following. Observe that this is a.a.s. the number of vertices of $\mathbb{G}(n,\beta n^{-2/3})$ that are not contained in a triangle. Similar questions for other factors or more general structures would be of interest. It took a long time until Johansson, Kahn, and Vu [24] determined the threshold for the triangle factor. This conjecture seems to be of similar difficulty, whereas for our purposes it would already be great to obtain a triangle factor on at least $(1 - C\exp(-\beta^3))n$ vertices for some $C > 1$.
In the remainder of this note we prove Theorems 1.1 and 1.6 in Sections 2 and 3, respectively.

HAMILTONICITY
We will prove the following proposition, which together with known results on Hamilton cycles in $\mathbb{G}(n,p)$ is sufficient to prove the theorem.
Proof of Theorem 1.1. Let $\alpha, \beta > 0$ be such that $\beta = -(6+o(1))\log(\alpha)$. If $\alpha = O(n^{-1/6})$, we have $\beta \ge (1+o(1))\log n$ and we can infer that a.a.s. there is a Hamilton cycle in $\mathbb{G}(n,\beta/n)$ (this follows from an improvement on the result concerning the threshold for Hamiltonicity [30]). On the other hand, if $\alpha = \omega(n^{-1/6})$, then we apply Proposition 2.1 to a.a.s. get the Hamilton cycle.
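The first case of the proof uses a short computation that may be worth spelling out. The following worked step is a sketch under the assumption that $\alpha = O(n^{-1/6})$ is witnessed by a constant $c > 0$ (a symbol introduced here for illustration) with $\alpha \le c\, n^{-1/6}$ for all large $n$.

```latex
\begin{align*}
  -\log(\alpha) &\ge \tfrac{1}{6}\log n - \log c, \\
  \beta = -(6+o(1))\log(\alpha)
    &\ge (6+o(1))\Bigl(\tfrac{1}{6}\log n - \log c\Bigr)
     = (1+o(1))\log n .
\end{align*}
```

In the complementary case $\alpha = \omega(n^{-1/6})$ one has $\alpha^{6} n \to \infty$, which is the form in which the assumption is used at the end of the proof of Proposition 2.1.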

Proof of Proposition 2.1.
To prove the proposition we apply the following strategy. We first find a long path in $\mathbb{G}(n,p)$ alone. Then, by considering the union with $\mathcal{G}_\alpha$, we obtain a reservoir structure for each vertex that allows us to extend the length of the path iteratively. Finally, we will also be able to close this path to a cycle on all vertices. W.l.o.g. we can assume that $\alpha < 1/10$.
Finding a long path. Let $P = p_1, \dots, p_\ell$ be the longest path that we can find in $G_1 = \mathbb{G}(n,(\beta-1)/n)$ and let $V' = \{v_1, \dots, v_k\} = V(G_1) \setminus \{p_1, \dots, p_\ell\}$ be the left-over. Then, by Lemma 1.3, we get a.a.s. that (2.1). Next, let $P'$ be the collection of vertices of $P$ obtained by taking every other vertex, excluding the last (2.2). In the following, we will work on $P'$ instead of all of $P$, which ensures that certain absorbing structures do not overlap.
FIGURE 1. The top shows a path $P = p_1, \dots, p_\ell$ and the left-over vertex $v$. Black edges belong to the random graph, orange edges can be found in $\mathcal{G}_\alpha$. The bottom shows the situation after absorbing $v$ using that $p_j \in B(p_\ell, v)$.
Absorbing the left-over. We now consider the union $\mathcal{G}_\alpha \cup G_1$. The following absorbing structure is the key to the argument.

Definition 2.2. For any vertices
If for some $v \in V'$ there is a $p_j \in B(p_\ell, v)$, we can proceed as follows (see Figure 1). By definition we have $p_{j-1}, p_{j+1} \in N_{\mathcal{G}_\alpha}(v)$ and $p_j \in N_{\mathcal{G}_\alpha}(p_\ell) \cap P$. Then $p_j$ can be replaced by $v$ in the path $P$, and $p_j$ can now be appended to the path $P$ at $p_\ell$. So we get the path $\tilde{P} = p_1, \dots, p_{j-1}, v, p_{j+1}, \dots, p_\ell, p_j$. To iterate this argument we show that a.a.s. for any pair of vertices $u$ and $v$, the set $B(u,v)$ is large enough.
Proof. Let $u, v$ be arbitrary vertices in $V = V(\mathcal{G}_\alpha \cup G_1)$. The set $B(u,v)$ is uniformly distributed over $P'$, because $\mathbb{G}(n,(\beta-1)/n)$ is sampled independently of the deterministic graph $\mathcal{G}_\alpha$. Then by definition
An immediate consequence of $B(u,v)$ being uniformly distributed is that $|B(u,v)| \sim \mathrm{Bin}(|P'|, \alpha^3)$. It follows from (2.4) and the Chernoff bound that there is a sufficiently small, but constant, $\delta > 0$ such that
The claim follows from a union bound over all $n^2$ choices for $u, v$ and (2.5).
We now have everything at hand to absorb all but two of the left-over vertices $v \in V'$ onto a path of length $n - 2$. We do this inductively using Algorithm 1.
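The absorption step described above can be sketched in code. This is a minimal illustration under stated assumptions, not a reproduction of Algorithm 1: the function name `absorb_leftover`, the greedy choice of the smallest admissible position, the choice of $P'$ as every other internal position, and the form of $B(u,v)$ (a vertex $p_j \in P'$ adjacent to $u$ whose two path-neighbours are adjacent to $v$, all in $\mathcal{G}_\alpha$) are read off from the discussion around Figure 1 and are hypothetical.

```python
from typing import Dict, List, Optional, Set

Graph = Dict[int, Set[int]]  # adjacency sets of the deterministic graph G_alpha


def absorb_leftover(path: List[int], leftover: List[int],
                    g_alpha: Graph) -> Optional[List[int]]:
    """Greedily absorb every left-over vertex into the path.

    Returns the extended path, or None if some B(u, v) turns out to be empty.
    """
    path = list(path)
    # Assumed choice of P': every other internal position of the original
    # path, so that the structures around two positions never overlap.
    available = set(range(1, len(path) - 1, 2))
    for v in leftover:
        u = path[-1]  # current end of the path
        # Assumed form of B(u, v): positions j in P' with p_j adjacent to u
        # and both path-neighbours of p_j adjacent to v (all in G_alpha).
        j = next(
            (i for i in sorted(available)
             if path[i] in g_alpha[u]
             and path[i - 1] in g_alpha[v]
             and path[i + 1] in g_alpha[v]),
            None,
        )
        if j is None:
            return None
        available.remove(j)
        p_j = path[j]
        path[j] = v       # v takes the place of p_j on the path ...
        path.append(p_j)  # ... and p_j is appended after the old end u
    return path


if __name__ == "__main__":
    # Toy example: G_alpha is complete on 9 vertices, the path is 0..6.
    complete = {v: set(range(9)) - {v} for v in range(9)}
    print(absorb_leftover(list(range(7)), [7, 8], complete))
```

Each absorption uses one position of $P'$ and leaves its two path-neighbours untouched, which is why at most one element is lost from each $B(u,v)$ per step.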
Let $\tilde{P}$, $B_i(\cdot,\cdot)$ be defined as in Algorithm 1. In order to see that the algorithm terminates with $\tilde{P} = P_k$, it suffices to prove that $B_i(u,v)$ is not empty for any $u, v \in V$ and $i = 1, \dots, k$. By definition of $P'$ in (2.2) we have
Define $\ell_1 = \ell$, $P_1 = P$ with $P_1 = u_1^1 \dots u_{\ell_1}^1$
It remains to prove that we have an edge between $A$ and $B$. For this we reveal $G_2 = \mathbb{G}(n,1/n)$. As $|A|, |B| \ge \alpha^3 n/8$ by (2.6), we get $|A||B|/n \ge \alpha^6 n/64 = \omega(1)$, as $\alpha = \omega(n^{-1/6})$. Together with Chernoff's inequality this implies that a.a.s. $e_{G_2}(A,B) > 0$. As the union of $G_1$ and $G_2$ can be coupled as a subgraph of $\mathbb{G}(n,\beta/n)$, this implies that a.a.s. there is a Hamilton cycle in $\mathcal{G}_\alpha \cup \mathbb{G}(n,\beta/n)$ and finishes the proof of Proposition 2.1.
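The last two steps can be made explicit. The following is a sketch of an elementary route that avoids Chernoff's inequality, under the simplifying assumption (made here for illustration) that $A$ and $B$ are disjoint and fixed before $G_2$ is revealed; if they overlap, the number of available pairs changes only by a constant factor.

```latex
\begin{align*}
  \mathbb{P}\bigl[e_{G_2}(A,B) = 0\bigr]
    = \Bigl(1 - \frac{1}{n}\Bigr)^{|A|\,|B|}
    \le \exp\Bigl(-\frac{|A|\,|B|}{n}\Bigr)
    \le \exp\Bigl(-\frac{\alpha^{6} n}{64}\Bigr)
    \longrightarrow 0,
\end{align*}
% because alpha = omega(n^{-1/6}) gives alpha^6 n -> infinity.
% Coupling: an edge lies in G_1 \cup G_2 with probability
\begin{align*}
  1 - \Bigl(1 - \frac{\beta-1}{n}\Bigr)\Bigl(1 - \frac{1}{n}\Bigr)
    = \frac{\beta}{n} - \frac{\beta-1}{n^{2}}
    \le \frac{\beta}{n}
  \qquad \text{for } \beta \ge 1 .
\end{align*}
```

The second display is why $G_1 \cup G_2$ can be coupled as a subgraph of $\mathbb{G}(n,\beta/n)$; the condition $\beta \ge 1$ holds in the regime considered, since $\alpha < 1/10$ is assumed.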
Observe that when running the same proof for Theorem 1.2 we can obtain the better constant by adapting the definition of $B(u,v)$ to the setup of perfect matchings and then proving that a.a.s. $|B(u,v)| \ge \alpha^2 n/4$. We spare the details here.

BOUNDED DEGREE TREES
Theorem 1.6 is a modular statement that turns almost spanning embeddings in the random graph into spanning embeddings in the union $\mathcal{G}_\alpha \cup \mathbb{G}(n,\beta/n)$. The proof is very similar to the proof for Hamilton cycles and we will spare some details.
Proof of Theorem 1.6. Let $\mathcal{G}_\alpha$ be given and $G = \mathbb{G}(n,\beta/n)$. Let $T$ be an arbitrary tree on $n$ vertices with maximum degree $\Delta$. Denote by $T_\varepsilon$ the tree obtained from $T$ by the following construction.
(2) In every step $i$, check whether $T_i$ has at most $(1-\varepsilon)n$ vertices.
• If this is the case, set $T_\varepsilon = T_i$ and finish the process.
• Otherwise, create $T_{i+1}$ by deleting one leaf of $T_i$.
We denote by $L$ the left-over, that is, the vertices removed during the construction of $T_\varepsilon$. Then $|V(T_\varepsilon)| \le (1-\varepsilon)n$, $|L| \le \varepsilon n + 1$, and $V(T) = V(T_\varepsilon) \cup L$.
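The leaf-removal construction can be sketched as follows. This is a minimal illustration, assuming the process starts from $T$ itself and that the tree is given by adjacency sets; the function name `strip_leaves` and the stack-based choice of which leaf to delete are hypothetical, since the construction allows any leaf.

```python
from typing import Dict, List, Set, Tuple

Tree = Dict[int, Set[int]]  # adjacency sets of a tree


def strip_leaves(tree: Tree, eps: float) -> Tuple[Tree, List[int]]:
    """Delete leaves one at a time until at most (1 - eps) * n vertices remain.

    Returns the pruned tree T_eps and the list L of removed vertices,
    in the order in which they were removed.
    """
    n = len(tree)
    adj = {v: set(nb) for v, nb in tree.items()}  # work on a copy
    removed: List[int] = []
    leaves = [v for v, nb in adj.items() if len(nb) <= 1]
    while len(adj) > (1 - eps) * n and leaves:
        leaf = leaves.pop()
        for parent in adj.pop(leaf):
            adj[parent].discard(leaf)
            if len(adj[parent]) == 1:  # the parent has just become a leaf
                leaves.append(parent)
        removed.append(leaf)
    return adj, removed


if __name__ == "__main__":
    # Toy example: a path on 8 vertices, eps = 1/4.
    path_tree = {i: {j for j in (i - 1, i + 1) if 0 <= j < 8} for i in range(8)}
    t_eps, left_over = strip_leaves(path_tree, 0.25)
    print(sorted(t_eps), left_over)
```

Since only leaves are deleted, the remaining graph stays a tree, and re-attaching the vertices of $L$ in reverse removal order adds one leaf at a time, which matches the way the left-over is embedded later in the proof.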
Next we let $\mathcal{T}$ be an independent subset of the vertices of $T_\varepsilon$ such that the vertices in $\mathcal{T}$ do not have neighbours outside of $T_\varepsilon$ with respect to $T$. Observe that there exists such a $\mathcal{T}$ with $|\mathcal{T}| \ge \frac{(1-\Delta\varepsilon)n}{\Delta+1}$. By assumption we a.a.s. have an embedding $T'_\varepsilon$ of $T_\varepsilon$ into $G$, and we denote by $\mathcal{T}'$ the image of $\mathcal{T}$ under this embedding. We adapt Definition 2.2 and define $B(u,v)$ for any two vertices $u, v$. As before, if we want to embed a vertex $w$ that is a neighbour of an already embedded vertex $u$ in $T_\varepsilon$ and $v$ is an available vertex, we can do it if $B(u,v)$ is non-empty. More precisely, with $x \in B(u,v)$, we can move the vertex embedded onto $x$ to $v$, embed $w$ to $x$, and obtain a valid embedding of $T_\varepsilon$ with an additional neighbour of $u$. Analogous to Claim 2.3 we get the following.
Claim 3.1. We have a.a.s. $|B(u,v)| \ge \frac{\alpha^{\Delta+1} n}{4(\Delta+1)}$ for any $u, v \in V(\mathcal{G}_\alpha \cup G)$.
Therefore, similar to Algorithm 1, we can iteratively append leaves to $T_\varepsilon$ to obtain an embedding of $T$ into $\mathcal{G}_\alpha \cup G$. As in every step we lose at most one vertex from each $B(u,v)$, this works as long as $|L| \le \varepsilon n + 1 < |B(u,v)|$, which holds by Claim 3.1 and the assumption on $\varepsilon$ and $\alpha$.