A limit theorem for the six-length of random functional graphs with a fixed degree sequence

We obtain results on the limiting distribution of the six-length of a random functional graph, also called a functional digraph or random mapping, with given in-degree sequence. The six-length of a vertex $v\in V$ is defined from the associated mapping, $f:V\to V$, to be the maximum $i\in V$ such that the elements $v, f(v), \ldots, f^{i-1}(v)$ are all distinct. This has relevance to the study of algorithms for integer factorisation.


Introduction
We consider random directed graphs with all out-degrees equal to 1, which we call functional graphs (see Section 2 for further notation) or random mappings. The motivation in most of the related literature is a better understanding of Pollard's ρ-algorithm [7] for integer factorisation, or the improved version by Brent and Pollard [3]. The runtime depends on the six-length (also called ρ-length) of a polynomial in $\mathbb{F}_p[x]$. (Pollard's first version used $x^2 - 1$.) Under the assumption that a polynomial mod $p$ 'behaves like' a random mapping (supported by some research listed below), we are interested in the six-length of random mappings in particular. Martins and Panario [6] studied polynomials in $\mathbb{F}_p[x]$, in particular the six-length in several random models. They found significance in the six-length of random polynomials with given in-degree sequence, and gave numerical results for several random models. Our main aim is to derive results on the six-length of random functional graphs with given in-degree sequence, to give a baseline for comparison with random polynomial models.
Results pertinent to our study were obtained by Arney and Bender [1], who were motivated by the study of random shift registers. For a fixed set $D$, they considered a functional graph chosen uniformly at random among those with in-degrees in $D$. They studied various properties such as the in-degrees of vertices, tree size, tail length and six-length. They also obtained some information on the number of origins (vertices of in-degree 0), stopping short of being able to specify the number of origins. Hansen and Jaworski [4] considered a two-stage experiment: (1) Choose random indegrees $D_1, \ldots, D_n$ from an exchangeable probability distribution, (2) Choose a functional graph at random among graphs with indegrees $D_1, \ldots, D_n$. They studied the number of cyclic vertices (vertices lying on a cycle) and of components, and component sizes.
Our main results are stated in Section 2 after some basic definitions. In particular we give the limiting distribution of the six-length for functional graphs with given indegree sequence, and also asymptotics for the moments of the distribution, as well as the joint distribution of the tail- and six-lengths. Proofs for the case that the second moment of the indegree sequence is "large" are given in Section 3, and for the remaining case (except for some almost trivial cases) in Section 4. See also Konyagin, Luca, Mans, Mathieson, Sha and Shparlinski [5] for a study of polynomials over finite fields considering similar aspects, such as largest component and tree size of the associated functional digraphs. Similar to [6], they observe, in [5, Section 4], that the in-degree sequence of these random digraphs is distributed rather differently from that of uniformly random functional digraphs.

Definitions, model and results
Functional Graphs. The functional graph of a function $f : V \to V$ is a directed graph $G_f$ with vertex set $V$ and edge set $\{(v, f(v)) : v \in V\}$. Consider, for example, the vertex set $V = \{0, \ldots, 4\}$ and the function $f(x) = x^2 \pmod 5$. Then $G_f$ has the edges $(0,0)$, $(1,1)$, $(2,4)$, $(3,4)$ and $(4,1)$.
The six-length of a vertex in a functional graph is defined as follows: Let $f : V \to V$ be a function and let $\mathrm{id}$ denote the identity function on $V$. Let $f^k$ denote the $k$-times composition of $f$ with itself, where $f^0 = \mathrm{id}$. The six-length $s_f(v)$ of a vertex $v \in V$ is the maximum $i$ such that $v, f(v), \ldots, f^{i-1}(v)$ are all distinct. An example for the six-length in a functional graph is given in Fig. 1(a). Note that $s_f(v)$ can be decomposed into the tail-length $t_f(v)$ and the cycle-length $c_f(v)$ as indicated in Fig. 1(b). More formally, the tail-length is the unique integer that satisfies $0 \le t_f(v) < s_f(v)$ and $f^{t_f(v)}(v) = f^{s_f(v)}(v)$, and the cycle-length is $c_f(v) = s_f(v) - t_f(v)$.
Random Model. Throughout the paper, a (finite) sequence $d_n = (d_{n,1}, \ldots, d_{n,n})$ is called degree sequence if $\sum_{j=1}^{n} d_{n,j} = n$ and $d_n \in \mathbb{N}_0^n$.
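The six-, tail- and cycle-length defined above can be computed by following the orbit of a vertex until the first repeated value. The following is a minimal sketch, not part of the original paper; the function name `rho_shape` is hypothetical, and the mapping is assumed to be given as a Python list indexed by the vertices $0, \ldots, |V|-1$.

```python
def rho_shape(f, v):
    """Return (six_length, tail_length, cycle_length) of vertex v.

    f is assumed to be a list with f[x] the image of x (hypothetical
    representation chosen for this illustration).
    """
    first_seen = {}          # vertex -> index of its first visit
    x, step = v, 0
    while x not in first_seen:
        first_seen[x] = step
        x = f[x]
        step += 1
    six = step               # number of distinct elements on the orbit of v
    tail = first_seen[x]     # index at which the orbit enters its cycle
    return six, tail, six - tail


# The example from the text: f(x) = x^2 mod 5 on V = {0, ..., 4}.
f = [x * x % 5 for x in range(5)]
shapes = {v: rho_shape(f, v) for v in range(5)}
```

For this example the vertices 2 and 3 both reach the fixed point 1 via the vertex 4, so they share the same six-, tail- and cycle-length.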
A random functional graph with degree sequence $d_n$ is a graph $G_F$ where $F$ is drawn uniformly at random from the set $\mathcal{F}(d_n) := \{f : [n] \to [n] : |f^{-1}(j)| = d_{n,j} \text{ for all } j \in [n]\}$. (1) Here and elsewhere, we use $[n] := \{1, \ldots, n\}$. Note that technically what we call the degree sequence is the indegree sequence of the directed graph. This simplification is sensible because all outdegrees are 1. Now let $\{d_n : n \in \mathbb{N}\}$ be a family of degree sequences. Let $s_n(v)$ and $t_n(v)$ be the six- and tail-length of a vertex $v \in [n]$ in a random functional graph with degree sequence $d_n$. The aim of this paper is to investigate the asymptotic behaviour of $(s_n(v), t_n(v))$.
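One way to sample from this model, sketched here as an illustration and not taken from the paper, is to list every vertex $j$ exactly $d_{n,j}$ times and shuffle the list uniformly: each function with the prescribed in-degrees then arises from the same number of permutations. The function name `random_functional_graph` and the 0-based vertex labels are assumptions of this sketch.

```python
import random


def random_functional_graph(degrees, rng=random):
    """Sample a mapping with the given in-degree sequence uniformly at random.

    degrees[j] is the prescribed in-degree of vertex j (0-based labels,
    an assumption of this sketch). Returns a list f with f[v] the image of v.
    """
    n = len(degrees)
    if any(d < 0 for d in degrees) or sum(degrees) != n:
        raise ValueError("a degree sequence needs non-negative entries summing to n")
    # Vertex j appears degrees[j] times as an image.
    images = [j for j, d in enumerate(degrees) for _ in range(d)]
    # A uniform shuffle of this multiset gives a uniform element of the set:
    # every admissible function corresponds to prod(d_j!) permutations.
    rng.shuffle(images)
    return images
```

Passing a seeded `random.Random` instance as `rng` makes the sample reproducible.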
We use the usual asymptotic notation such as $O$, $\Omega$, $\Theta$, $o$, $\omega$, $\sim$; in particular $a_n = \omega(b_n)$ if $b_n = o(a_n)$. Also, for any positive integers $n, k \in \mathbb{N}$ with $k \le n$ let $n^{\underline{k}} := n!/(n-k)!$.
Degree sequences. For a degree sequence $d_n = (d_{n,1}, \ldots, d_{n,n})$ let $m_2(d_n) := \sum_{j=1}^{n} d_{n,j}^2$ and $\sigma^2(d_n) := m_2(d_n)/n - 1$. The parameter $\sigma^2(d_n)$ is sometimes called the coalescence.
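As a small illustration, the coalescence can be computed directly from a degree sequence. The sketch below relies on the relation $\sigma^2 = m_2/n - 1$ used in Section 3 and assumes that $m_2$ is the sum of the squared degrees, so that $\sigma^2(d_n)$ is the variance of the in-degrees (their mean is always 1). The function name `coalescence` is hypothetical.

```python
def coalescence(degrees):
    """Variance of an in-degree sequence, computed as m_2 / n - 1.

    Assumes m_2 is the sum of squared degrees; since the degrees sum to n,
    their mean is 1 and m_2 / n - 1 is their variance.
    """
    n = len(degrees)
    if sum(degrees) != n:
        raise ValueError("degrees must sum to the number of vertices")
    m2 = sum(d * d for d in degrees)
    return m2 / n - 1
```

A permutation (all in-degrees equal to 1) has coalescence 0, while a sequence with half of the in-degrees equal to 2 and the other half equal to 0 has coalescence 1.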
Throughout this section, let $\{d_n : n \in \mathbb{N}\}$ be a family of degree sequences and let $\{v_n : n \in \mathbb{N}\}$ be a family of vertices with $v_n \in [n]$. For the upcoming limit theorem for $s_n(v_n)$ we assume the following:
Theorem 2.1. Assume (A0), (A1) and (A2). Then $\big(s_n(v_n)/\sqrt{n/\sigma^2(d_n)}\big)_{n \ge 1}$ converges weakly to the standard Rayleigh distribution, that is, $\lim_{n \to \infty} P\big(s_n(v_n)/\sqrt{n/\sigma^2(d_n)} > x\big) = e^{-x^2/2}$ for all $x \ge 0$.
In fact the methods used to prove Theorem 2.1 also yield the convergence of all moments for a wide range of degree sequences. More precisely, let Then the convergence in Theorem 2.1 also holds with respect to all moments, that is:
Theorem 2.2. Assume (A0), (B1) and (B2). Let $X$ be standard Rayleigh distributed. Then $E\big[\big(s_n(v_n)/\sqrt{n/\sigma^2(d_n)}\big)^p\big] \to E[X^p]$ for all $p \ge 1$. In particular, $E[s_n(v_n)] \sim \sqrt{\pi n/(2\sigma^2(d_n))}$ and $\mathrm{Var}(s_n(v_n)) \sim \frac{4 - \pi}{2\sigma^2(d_n)}\, n$.
Moreover, these assumptions also imply that the ratio between tail-length and six-length is asymptotically uniformly distributed. More precisely:
Theorem 2.3. Let $X$ and $U$ be independent, $U$ be uniformly distributed on $[0, 1]$, and $X$ be Rayleigh distributed. Assume (A0), (B1) and (B2). Then $\big(s_n(v_n)/\sqrt{n/\sigma^2(d_n)},\ t_n(v_n)/\sqrt{n/\sigma^2(d_n)}\big) \to (X, UX)$ in distribution.

Remark 2.4. A combination of Theorem 2.2 and Theorem 2.3 yields
.
These results support a conjecture by Brent and Pollard [3, Section 3] on the typical tail- and cycle-length of polynomials mod $p$.
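The first-moment asymptotics can be probed numerically. The sketch below is a rough Monte Carlo check under stated assumptions, not a verification of the theorems: it uses a degree sequence with half of the in-degrees equal to 2 and half equal to 0 (coalescence 1, with $m_2$ taken as the sum of squared degrees), samples mappings by shuffling the multiset of images, and compares the empirical mean six-length with $\sqrt{\pi n/(2\sigma^2(d_n))}$. All names and parameter values are choices made for this illustration.

```python
import math
import random


def six_length(f, v):
    """Number of distinct elements on the orbit v, f(v), f(f(v)), ..."""
    seen = set()
    x = v
    while x not in seen:
        seen.add(x)
        x = f[x]
    return len(seen)


def mean_six_length(degrees, v, trials, seed=0):
    """Empirical mean six-length of v over uniformly sampled mappings."""
    rng = random.Random(seed)
    # Vertex j occurs degrees[j] times as an image; a uniform shuffle of this
    # list is a uniform mapping with the prescribed in-degrees.
    images = [j for j, d in enumerate(degrees) for _ in range(d)]
    total = 0
    for _ in range(trials):
        rng.shuffle(images)
        total += six_length(images, v)
    return total / trials


n = 1000
degrees = [2, 0] * (n // 2)                      # in-degrees 2, 0, 2, 0, ...
sigma2 = sum(d * d for d in degrees) / n - 1     # coalescence, equals 1 here
predicted = math.sqrt(math.pi * n / (2 * sigma2))
observed = mean_six_length(degrees, v=0, trials=1000)
```

For moderate $n$ the two values should agree only up to sampling noise and lower-order terms, so the comparison is indicative rather than exact.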

Proofs for sequences with large coalescence
We first prove all theorems under the additional assumption (A+). Cases with $\sigma^2(d_n) = O(\log n / n^{1/3})$ will be discussed in Section 4. Throughout this section we omit the dependence on $d_n$ in the notation. In particular, we write $\sigma^2 := \sigma^2(d_n)$ and $m_2 := m_2(d_n)$. Moreover, we also omit the dependence on $n$ in the notation of the degrees, that is, $d_j := d_{n,j}$ for $j \in [n]$. Unless stated otherwise, $n$ is a positive integer and asymptotic results are as $n \to \infty$. Condition (A0) is the only condition assumed throughout the section. All other assumptions are stated in the lemmas separately.

Limit theorem for the six-length
This section contains the proof of Theorem 2.1 for degree sequences that additionally satisfy (A+); this statement is referred to as Proposition 3.1. The proof of Proposition 3.1 is based on the following explicit formula for the probabilities. In fact, the formula below remains valid even without making any assumptions on the degree sequence other than (A0).
Recall that $F$ denotes a function drawn uniformly at random from the set $\mathcal{F}(d_n)$ defined in (1). Note that $J_{n,k}(v)$ corresponds to the set of all possible non-self-intersecting $k$-paths starting at $v$. Thus, we have The probability on the right-hand side can be derived by counting the functions in $\mathcal{F}(d_n)$ that lead to the path $J$. Since $J$ determines the images of exactly $k$ elements to be $i_1, \ldots, i_k$, there are possible ways to choose the remaining images. The assertion follows after dividing by the total number $n!/\prod_{\ell=1}^{n} d_\ell!$ of elements in $\mathcal{F}(d_n)$.
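The total count $n!/\prod_{\ell} d_\ell!$ used in this argument can be confirmed by brute force for small $n$. The following sketch, with hypothetical helper names and 0-based vertex labels, enumerates all mappings with a given in-degree sequence as the distinct arrangements of the multiset of images and compares their number with the multinomial formula.

```python
from itertools import permutations
from math import factorial, prod


def all_functions(degrees):
    """All mappings (as tuples f with f[v] the image of v) with the given in-degrees."""
    images = [j for j, d in enumerate(degrees) for _ in range(d)]
    # Distinct arrangements of the multiset of images; only feasible for small n.
    return set(permutations(images))


def count_formula(degrees):
    """The multinomial count n! / (d_1! * ... * d_n!)."""
    n = len(degrees)
    return factorial(n) // prod(factorial(d) for d in degrees)


small = (2, 1, 1, 0)
brute_force = len(all_functions(small))
closed_form = count_formula(small)
```

Both numbers agree for any small sequence one cares to try; the enumeration grows factorially and is meant only as a sanity check.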
where the summation is taken over all The first term equals g n (k) by matching vectors with equal order statistics. For the second sum note that Hence, and the assertion follows from Lemma 3.2.
Note that the previous lemma in particular yields the following bounds: Thus we can focus on the asymptotic behaviour of $g_n(k)$ for $k = \Theta(\sqrt{n/\sigma^2})$ instead. However, since we need some large deviation bounds in later proofs, we formulate the following lemmas so as to cover a wider range for $k$ than necessary for Proposition 3.1.
The first step is to transform the sum in $g_n(k)$ into a probability that is covered by Poisson approximation. To this end let Then $g_n(k)$ can be rewritten as follows: Now let $B_n$ be binomially $B(n, \alpha)$ distributed. Moreover, let $X_1, \ldots, X_n$ be independent, Bernoulli distributed random variables with $P(X_i = 1) = \alpha d_i/(\alpha d_i + 1)$ and let $S_n = X_1 + \cdots + X_n$. Then (5) yields $\frac{\alpha d_j}{\alpha d_j + 1}$ with $\alpha = k/n$. Moreover, let $x \wedge y = \min\{x, y\}$. Then In particular, $\lambda - k = O(k^2 m_2/n^2)$.
Proof. Note that for x ≥ 0 Using this bound in the definition of λ yields the assertion.
Next we apply Chen-Stein Poisson approximation to obtain the following result: It only remains to transform these into relative error bounds. Note that Stirling's approximation yields As formally shown in Lemma 3.6 below, $e^{k-\lambda}(\lambda/k)^k = 1 + o(1)$. Hence, since Lemma 3.4 implies $\lambda \sim k$, Therefore (6) implies the assertion.
and the assertion follows using the above bound on $k - \lambda$. Proof. First note that Lemmas 3.5 and 3.6 yield $g_n(k) = (1 - \alpha)^{n-k} \prod_{j=1}^{n} (\alpha d_j + 1)$ By expanding $\log(1 + x)$ and using $\alpha = k/n$ we find Hence the assertion follows from (7) and $\sigma^2 = m_2/n - 1$, noting that the error term tends to 0.
In preparation for the proof of Proposition 3.9, we note the following. Proof. Same as for Lemma 3.8 up to some obvious changes.
Proof of Proposition 3.9. Let $X_n = s_n(v_n)/\sqrt{n/\sigma^2}$ and let $X$ be standard Rayleigh distributed. First note that if $X_n$ converges in distribution to $X$ and $\sup_{n} E[X_n^p] < \infty$ for all $p \ge 1$ (9), then $E[X_n^p] \to E[X^p]$ for all $p \ge 1$, since (9) and Markov's inequality imply that $(X_n^p)_{n \ge 0}$ is uniformly integrable. Hence, by Proposition 3.1 it is sufficient to show (9).
To this end, note that (4) and Lemma 3.7 imply for every $C > 0$ for some constant $C'$ which only depends on $C$. In particular, since $X_n \le n$, which yields the assertion by (10).

Joint limit for tail-and six-length
In this section we prove Theorem 2.3 under the additional assumption (A+), that is:
Proposition 3.11. Let $X$ and $U$ be independent, $U$ be uniformly distributed on $[0, 1]$, and $X$ be Rayleigh distributed. Assume (A0), (B1), (B2) and (A+). Then $\big(s_n(v_n)/\sqrt{n/\sigma^2},\ t_n(v_n)/\sqrt{n/\sigma^2}\big) \to (X, UX)$ in distribution.
The joint limit of tail- and six-length will be established in two steps:
• Show that, conditioned on $t_n(v) > 0$ and $s_n(v) = k$, $t_n(v)$ is uniformly distributed on $[k - 1]$.
The first observation is true for every degree sequence: Lemma 3.12. Let d n be any degree sequence with (A0).
Let $v$ be such that $P(t_n(v) > 0) > 0$ (i.e. $d_w > 1$ for some $w \ne v$). Then, for every $k \ge 2$, Proof. The assertion is obviously true for $k = 2$, since $t_n(v) \le s_n(v) - 1$ and thus It is sufficient to prove In order to prove (11), let $F_{k,i}$ : since the underlying random function $F$ is drawn uniformly at random from $\mathcal{F}(d_n)$. We prove the equality above by finding bijections $\phi_i : F_{k,i} \to F_{k,i+1}$. First consider the case $i = k - 2$: For $f \in F_{k,k-2}$ let $\phi_{k-2}(f) = g$ where $g$ is the function given by otherwise.
The effect of φ k−2 on a functional graph is illustrated in Fig. 2(a). It is not hard to Thus φ k−2 is a bijection and (11) follows for i = k − 2. A similar bijection works for i < k − 2, as schematically shown in Fig. 2(b). Details are left to the reader.
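The uniformity statement of Lemma 3.12 can be checked exhaustively on a small example. The sketch below is an illustration with hypothetical names and 0-based labels: it enumerates all mappings with in-degree sequence $(2, 1, 1, 0)$ and, for the vertex of in-degree 0, tabulates how often each tail-length $t > 0$ occurs for every value $k$ of the six-length.

```python
from collections import Counter
from itertools import permutations


def six_and_tail(f, v):
    """Return (six_length, tail_length) of vertex v under the mapping f."""
    first_seen = {}
    x, step = v, 0
    while x not in first_seen:
        first_seen[x] = step
        x = f[x]
        step += 1
    return step, first_seen[x]


def tail_counts(degrees, v):
    """For each six-length k, count the mappings with tail-length t > 0."""
    images = [j for j, d in enumerate(degrees) for _ in range(d)]
    counts = {}
    for f in set(permutations(images)):      # all mappings with these in-degrees
        s, t = six_and_tail(f, v)
        if t > 0:
            counts.setdefault(s, Counter())[t] += 1
    return counts


# Vertex 3 has in-degree 0, so it never lies on a cycle and t > 0 always holds.
counts = tail_counts((2, 1, 1, 0), v=3)
```

In this example every tail-length in $\{1, \ldots, k-1\}$ occurs equally often for each $k$, as the lemma asserts.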
Proof of Proposition 3.11. Let $U$ be a random variable that is uniformly distributed on $[0, 1]$ and independent of $(s_n(v))_{n \ge 1}$. Moreover, let $\gamma_n = \sqrt{n/\sigma^2(d_n)}$. Then, by Lemma 3.12, Moreover, by Lemma 3.14 and since $\gamma_n \to \infty$, Finally, Theorem 2.1 and the independent choice of $U$ yield $P(s_n(v) > x\gamma_n,\ U s_n(v) > y\gamma_n) \to P(X > x,\ UX > y)$, which implies the joint convergence as claimed.

An extension to cases with small coalescence
In this section we discuss how to extend Theorem 2.1 to degree sequences with small coalescence, that is sequences with $\sigma^2(d_n) = O(n^{-1/3} \log n)$ and $n\sigma^2(d_n) \to \infty$. The key idea is to contract edges incident to vertices with degree 1 until we obtain a reduced graph that satisfies (A+). The six-length of this reduced graph converges to a standard Rayleigh distribution by Proposition 3.1. Finally, a concentration argument will allow us to deduce a limit theorem for the original graph. (ii) Delete the vertex $v_i$ from the graph.
Remark 4.2. Note that $4/3$ in the definition of $\bar{n}$ is somewhat arbitrary; the proof works equally well for a range of similar numbers. Also note that $\bar{n} = o(n)$ for degree sequences with $\sigma^2(d_n) = o(n^{-1/4})$. Finally, note that $\bar{n} \to \infty$ as $n \to \infty$ for any degree sequence with (A1).
(2) Let w be the smallest element in [n] \ V i . Let X w = 1 with probability 1/(|E i | + 1) and let X w = 0 otherwise. Then do the following: (a) If X w = 1, add w to the graph as an isolated vertex with a single loop, that (b) If X w = 0, choose an edge xy ∈ E i uniformly at random.
Otherwise, increase i by one and return to step 2.
Lemma 4.5. Let $d_n$ be a degree sequence, $w \in [n]$, and let $d_{n,w}$ be as in Definition 4.1.
If $G_w$ is a random functional graph with degree sequence $d_{n,w}$, then an $n$-extension of $G_w$ is a random functional graph with degree sequence $d_n$.
Proof. Let $H$ be any functional graph with degree sequence $d_n$ and let $\mathcal{H}$ denote the $n$-extension of $G_w$. The claim is that $P(\mathcal{H} = H) = 1/|\mathcal{F}(d_n)|$.
Since $H$ can only be an $n$-extension of $G_w$ if $G_w = H_w$, it is sufficient to show that all possible $n$-extensions of a graph $G$ are equally likely. But since there is exactly one way of choosing edges in (2) throughout the procedure that leads to a particular graph $H$, we have where $\{R(n, a, b) : a, b, n \in \mathbb{N}_0\}$ is independent of $s_{n,w}(w)$ and distributed as in Definition 4.6.
Proof. Identify edges contributing to the six-length $s_{n,w}(w)$ with red balls and all other edges (including a 'phantom' edge for step 2a in Definition 4.4) with blue balls in a Pólya urn. Then the dynamics described in Definition 4.4 is equivalent to the procedure of drawing from a Pólya urn. Therefore Lemma 4.5 implies the assertion. It is not hard to check that $(M_k)_{k \ge 0}$ is a martingale. Since $|R(k+1, a, b) - R(k, a, b)| \le 1$ and $R(k, a, b) \le a + k$, one obtains Therefore, the Azuma-Hoeffding inequality yields the assertion.
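The urn dynamics referred to in this proof can be simulated in a few lines. The sketch below assumes the classical Pólya urn, which may differ in details from Definition 4.6: the urn starts with $a$ red and $b$ blue balls, and each drawn ball is returned together with one additional ball of the same colour. Under this assumption `polya_red(k, a, b)` plays the role of $R(k, a, b)$, the number of red balls after $k$ draws; the function name is hypothetical.

```python
import random


def polya_red(k, a, b, rng=random):
    """Number of red balls after k draws from a classical Polya urn.

    Assumed dynamics: start with a red and b blue balls; each drawn ball is
    put back together with one extra ball of the same colour.
    """
    red, total = a, a + b
    for _ in range(k):
        # A red ball is drawn with probability red / total.
        if rng.random() * total < red:
            red += 1
        total += 1
    return red
```

For such an urn the red fraction $R(k, a, b)/(a + b + k)$ is a martingale with increments of order $1/(a + b + k)$, which is the kind of structure the Azuma-Hoeffding inequality exploits; the bounds $a \le R(k, a, b) \le a + k$ hold for every run.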
We end the section with the missing proofs for Theorems 2.1, 2.2, and 2.3. Note that we may assume w.l.o.g. that since the other case is covered by the proofs in Section 3.
Proof of Theorem 2.1. Let $X_n := s_n(v_n)/\sqrt{n/\sigma^2(d_n)}$ and let $X$ be standard Rayleigh distributed. The claim is that $X_n$ converges in distribution to $X$. By Proposition 3.1 this holds for degree sequences with (A+) and thus, we may assume (A-). Let $w = v_n$. Let $s_{n,w}(w)$ and $R(n - \bar{n},\ s_{n,w}(w),\ \bar{n} + 1 - s_{n,w}(w))$ be as in Corollary 4.7. Moreover, let $X_{n,w} = s_{n,w}(w)/\sqrt{\bar{n}/\sigma^2(d_{n,w})}$. Note that (a) $X_{n,w}$ converges in distribution to $X$ by Proposition 3.1 and Remark 4.3; (b) $(s_{n,w}(w))^2/\bar{n} \to \infty$ in probability by (a) and $\sigma^2(d_{n,w}) \sim \bar{n}^{-1/4}$. Hence, using the tail bound in Lemma 4.8 with arbitrary constant $t > 0$, $\frac{R(n - \bar{n},\ s_{n,w}(w),\ \bar{n} + 1 - s_{n,w}(w))}{s_{n,w}(w)(n + 1)/(\bar{n} + 1)} \xrightarrow{P} 1$, where $\xrightarrow{P}$ denotes convergence in probability.
Proof of Theorem 2.2. Let $X_n$, $X_{n,w}$ and $X$ be as in the previous proof. As in the proof of Proposition 3.9 it is sufficient to show that $\sup_{n \in \mathbb{N}} E[X_n^p] < \infty$, $p \ge 1$,