The component counts of random injections

A model of random injections is defined which has domain $A \cup B$ and codomain $A \cup C$, where $A$, $B$ and $C$ are mutually disjoint finite sets such that $|B| \le |C|$. The model encompasses both random permutations, which is the case $B = C = \emptyset$, and random maximum matchings of a complete bipartite graph, which is the case $A = \emptyset$. The possible components of random injections are cycles and paths. Results on the counts of cycles and paths of different sizes are obtained for this model.

Mathematics Subject Classifications: 05C20, 05C30, 05C80


Introduction
Suppose a permutation is chosen uniformly at random from the symmetric group $S_n$. If the chosen permutation is written as a product of cycles, then the number of cycles of size $i$ is a random variable $S_i(n)$ for each index $1 \le i \le n$. The distribution of the $S_i$ has been studied since Montmort, who described a game related to the probability that a random permutation has no fixed points; see the references in [3]. The probability that a random injection (defined in Section 2) has no fixed points is derived in Theorem 8 below.
Let $|\pi|$ be the number of cycles of a permutation $\pi \in S_n$. If permutations are chosen proportionally to $\theta^{|\pi|}$, where $\theta > 0$ is a constant, then the distribution of the process of cycle counts is called the Ewens sampling formula [3,7]. The Ewens sampling formula with $\theta = 1$ is the same as the distribution of cycle counts of random permutations. For the Ewens sampling formula, for each fixed $i \ge 1$ the distribution of $S_i$ converges weakly to the Poisson($\theta/i$) distribution and, moreover, the total variation distance between the process of cycle counts $(S_1, S_2, \ldots, S_b)$ for $b = o(n)$ and the independent Poisson process $(Z_1, Z_2, \ldots, Z_b)$ is $o(1)$, where the $Z_i \sim$ Poisson($\theta/i$) are independent; see [1,3].
The research in this paper is motivated by the observation that permutations are a particular kind of injection for which the domain is the same as the codomain. The other canonical example of an injection is a maximum matching of a complete bipartite graph, for which the domain and codomain are disjoint. Note that the cycles of a permutation $\pi \in S_n$ correspond to cycles in the directed graph with vertices $\{1, 2, \ldots, n\}$ and directed edges $G_\pi = \{(j, \pi(j)) : 1 \le j \le n\}$. It is shown in Section 2 that the directed graphs corresponding to general injections may be decomposed into components in a way similar to the way directed graphs corresponding to permutations are decomposed into cycles. The components of injections are of two types: paths and cycles. It will be convenient to classify paths themselves into different types. After random injections are defined, limiting properties of the counts of components of the same size and type are derived.
Random injections as defined in this paper do not seem to have been studied before. Random injections have been viewed as a saturated (maximum) matching of a complete bipartite graph in [13]. The components of such an injection are simply vertex disjoint directed edges and elements of the codomain not in the range. The random injections studied here are random digraphs with all of their vertices falling into three predetermined categories: vertices with indegree 0 and outdegree 1; vertices with indegree at most 1 and outdegree 1; and vertices with indegree at most 1 and outdegree 0. Previous research [5] on random digraphs imposing a condition on both indegree and outdegree assumes they both equal a constant common to all vertices.
Components of injections are classified and consequences of the classification are derived in Section 2. We mainly study cycle and path counts. Results on the cycle counts of random injections are obtained in Section 3 and path counts are examined in Section 4.

The components of injections
Consider an injection f from a finite labelled domain D to a finite labelled codomain R.
To each such mapping we associate a directed graph $G_f$ which has vertices $D \cup R$ and directed edges $\{(d, f(d)) : d \in D\} \subseteq D \times R$. If $R = D$, then $f$ is a permutation of $D$ and $G_f$ consists of vertex disjoint directed cycles. On the other hand, if $D$ and $R$ are disjoint, then the injection $f$ is simply a matching of all elements of $D$ to a subset of $R$ and the graph $G_f$ consists of vertex disjoint directed edges.
We study an interpolation between these two situations. Let $n_1$, $n_2$, and $n_3$ be nonnegative integers such that $n_1 + n_2 + n_3 > 0$ and $n_2 \le n_3$. For any mutually disjoint sets $A$, $B$ and $C$ of sizes $|A| = n_1$, $|B| = n_2$, and $|C| = n_3$, consider the set of injections with domain $A \cup B$ and codomain $A \cup C$. We will call any injection with domain and codomain defined in this way an $(n_1, n_2, n_3)$-injection. Any injection with finite domain and codomain is an $(n_1, n_2, n_3)$-injection for some $n_1, n_2, n_3$. If $n_1 = 0$, then the domain and codomain are disjoint, while if $n_2 = n_3 = 0$, then they are equal. Now that $(n_1, n_2, n_3)$-injections have been defined, they will usually just be referred to as injections.
The digraph $G_f$ corresponding to an injection $f$ may be decomposed in the following way. A vertex $c \in C$ may not be in the range of $f$, in which case it may be considered as an $A$-path of length 0. We will call such vertices isolated $C$ vertices. (Elements of $A$ are called fixed points if they map to themselves and may also be thought of as being isolated.) If $c \in C$ is in the range of $f$, then there are two possibilities: either $c$ terminates a path of the form $a_1 \to a_2 \to \cdots \to a_\ell \to c$ where $\ell \ge 1$ and $a_i \in A$ for all $i \in [1, \ell]$, or $c$ terminates a directed path of the form $b \to a_1 \to a_2 \to \cdots \to a_\ell \to c$ where $\ell \ge 0$, $a_i \in A$ for all $i \in [1, \ell]$, and $b \in B$. Let us call a directed path of the first type an $A$-path and a path of the second type a $B$-path. There are always exactly $n_2$ $B$-paths and each of them terminates in a unique element of $C$. Consequently, if $n_2 = n_3$, then there can be no $A$-paths. Let $A_1$ denote the set of elements of $A$ which are on $A$-paths or $B$-paths. Vertices in the set $A_2 = A \setminus A_1$ must map to elements of $A \cup C$, but they cannot map to elements of $A_1 \cup C$ or else they would lie in $A_1$. Therefore, elements of $A_2$ map to elements of $A_2$ and, since $f$ is an injection, the restriction $f|_{A_2}$ is a permutation of $A_2$. As was noted above, the components of $G_{f|_{A_2}}$ are disjoint directed cycles. We have shown

Lemma 1. For any injection $f$, the directed graph $G_f$ is the disjoint union of isolated $C$ vertices, $A$-paths, $B$-paths, and cycles.
We call the isolated $C$ vertices, $A$-paths, $B$-paths, and cycles of the lemma the components of $G_f$; the cycles are strongly connected digraph components and the paths are weakly connected digraph components. The size of a component is the number of its vertices. The size of any cycle lies between 1 and $n_1$, the size of any $A$-path lies between 2 and $n_1 + 1$, and the size of any $B$-path lies between 2 and $n_1 + 2$.
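The decomposition of Lemma 1 can be carried out mechanically by walking backwards from each element of $C$ and then collecting the cycles among the remaining elements of $A$. The following is a minimal sketch, assuming the injection is stored as a Python dict; the function name `decompose` and the vertex labels are illustrative choices and do not come from the paper.

```python
def decompose(f, A, B, C):
    """Decompose the digraph of an injection f: A∪B -> A∪C into components.

    f is a dict mapping each domain vertex to its image.
    Returns four lists of vertex lists: isolated C vertices, A-paths,
    B-paths (each listed from its first vertex to its endpoint in C),
    and cycles.
    """
    A, B, C = set(A), set(B), set(C)
    inverse = {image: v for v, image in f.items()}  # well defined: f is injective
    isolated, a_paths, b_paths, on_path = [], [], [], set()
    for c in sorted(C):
        # Walk backwards from c along the unique incoming edges.
        path = [c]
        while path[-1] in inverse:
            path.append(inverse[path[-1]])
        path.reverse()
        on_path.update(path)
        if len(path) == 1:
            isolated.append(path)      # c is not in the range of f
        elif path[0] in B:
            b_paths.append(path)       # path starts at an element of B
        else:
            a_paths.append(path)       # path starts at an element of A
    cycles, seen = [], set()
    for a in sorted(A - on_path):      # these vertices form the set A_2
        if a in seen:
            continue
        cycle = [a]
        while f[cycle[-1]] != a:
            cycle.append(f[cycle[-1]])
        seen.update(cycle)
        cycles.append(cycle)
    return isolated, a_paths, b_paths, cycles


# Illustrative (4, 1, 3)-injection with one component of each type.
A = ["a1", "a2", "a3", "a4"]
B = ["b1"]
C = ["c1", "c2", "c3"]
f = {"a1": "a1", "b1": "a2", "a2": "c1", "a3": "a4", "a4": "c2"}
isolated, a_paths, b_paths, cycles = decompose(f, A, B, C)
print(isolated, a_paths, b_paths, cycles)
```

In this example the fixed point `a1` is a cycle of size 1, `b1 -> a2 -> c1` is a $B$-path of size 3, `a3 -> a4 -> c2` is an $A$-path of size 3, and `c3` is an isolated $C$ vertex.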
Given an injection, let $r$ denote the number of isolated $C$ vertices; let $s_i$, $1 \le i \le n_1$, denote the number of cycles of size $i$; let $t_j$, $2 \le j \le n_1 + 1$, denote the number of $A$-paths of size $j$; and let $u_k$, $2 \le k \le n_1 + 2$, denote the number of $B$-paths of size $k$. The total number of vertices is
\[ r + \sum_{i=1}^{n_1} i s_i + \sum_{j=2}^{n_1+1} j t_j + \sum_{k=2}^{n_1+2} k u_k = n_1 + n_2 + n_3. \]
The counts of the $A$, $B$ and $C$ vertices in the components of an injection must add up to $n_1$, $n_2$, and $n_3$, respectively. Therefore,
\[ \sum_{i=1}^{n_1} i s_i + \sum_{j=2}^{n_1+1} (j-1) t_j + \sum_{k=2}^{n_1+2} (k-2) u_k = n_1, \tag{1} \]
\[ \sum_{k=2}^{n_1+2} u_k = n_2, \tag{2} \]
and
\[ r + \sum_{j=2}^{n_1+1} t_j + \sum_{k=2}^{n_1+2} u_k = n_3. \tag{3} \]

Proposition 2. Let $r$ be a non-negative integer and let $s_i$, $t_j$, and $u_k$ be sequences of nonnegative integers such that (1), (2) and (3) are satisfied. Then the number of injections with exactly $r$ isolated $C$ vertices, $s_i$ cycles of size $i$, $t_j$ $A$-paths of size $j$, and $u_k$ $B$-paths of size $k$ for $1 \le i \le n_1$, $2 \le j \le n_1 + 1$, $2 \le k \le n_1 + 2$ is
\[ \frac{n_1!\, n_2!\, n_3!}{r! \prod_{i=1}^{n_1} s_i! \prod_{j=2}^{n_1+1} t_j! \prod_{k=2}^{n_1+2} u_k! \prod_{i=1}^{n_1} i^{s_i}}. \]

Proof. Consider assigning labels to an unlabelled injection with component counts given by $r$ and the $s_i$, $t_j$ and $u_k$. There are $n_1!\, n_2!\, n_3!$ ways to do this. The injections we obtain in this way all have the desired component structure, but they are not all different. The isolated $C$ vertices can be permuted among each other without changing the resulting injection, as can the vertices forming cycles of the same size, the vertices forming $A$-paths of the same size, and the vertices forming $B$-paths of the same size. This accounts for dividing by $r! \prod_{i=1}^{n_1} s_i! \prod_{j=2}^{n_1+1} t_j! \prod_{k=2}^{n_1+2} u_k!$. Finally, each cycle of size $i$ can be obtained in $i$ different ways as any one of its $i$ entries can be in a fixed position of an unlabelled $i$-cycle, which accounts for dividing by $\prod_{i=1}^{n_1} i^{s_i}$.

Restricting to $n_2 = n_3 = 0$ gives the following well known formula for the number of permutations with a given cycle structure.
Corollary 3 (Cauchy's Formula). Let $s_i$, $i = 1, \ldots, n$, be non-negative integers such that $\sum_{i=1}^{n} i s_i = n$. Then the number of permutations on $n$ vertices with exactly $s_i$ cycles of size $i$, $1 \le i \le n$, is
\[ \frac{n!}{\prod_{i=1}^{n} i^{s_i} s_i!}. \]
the electronic journal of combinatorics 28(1) (2021), #P1.5
A proof of this formula is given on page 92 of [6]. The number of injections is given by $(n_1 + n_3)_{n_1 + n_2}$, where $(x)_k = x(x-1) \cdots (x-k+1)$ denotes the falling factorial for natural numbers $x$ and $k$. We will give each injection the uniform measure $1/(n_1 + n_3)_{n_1 + n_2}$. Our aim is to study the transition from random permutations when $n_2 = n_3 = 0$ to random matchings when $n_1 = 0$.
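For small parameters the count of injections with a given component structure, and the total count $(n_1 + n_3)_{n_1 + n_2}$, can be checked by exhaustive enumeration. The sketch below does this under the assumption that the count for a structure $(r, s_i, t_j, u_k)$ is $n_1!\, n_2!\, n_3!$ divided by $r! \prod s_i! \prod t_j! \prod u_k! \prod i^{s_i}$, the divisors named in the proof of Proposition 2. The helper names `profile`, `proposition_count` and `check` are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def profile(f, A, B, C):
    """Return (r, s, t, u): the number of isolated C vertices and the
    size -> count maps for cycles, A-paths and B-paths of the injection f."""
    inverse = {image: v for v, image in f.items()}
    r, s, t, u, on_path = 0, Counter(), Counter(), Counter(), set()
    for c in C:
        path = [c]                        # walk backwards from c
        while path[-1] in inverse:
            path.append(inverse[path[-1]])
        on_path.update(path)
        if len(path) == 1:
            r += 1
        elif path[-1] in B:               # path[-1] is the first vertex of the path
            u[len(path)] += 1
        else:
            t[len(path)] += 1
    seen = set()
    for a in A:
        if a in on_path or a in seen:
            continue
        size, v = 1, f[a]
        seen.add(a)
        while v != a:                     # follow the cycle through a
            seen.add(v)
            v, size = f[v], size + 1
        s[size] += 1
    return r, frozenset(s.items()), frozenset(t.items()), frozenset(u.items())


def proposition_count(n1, n2, n3, r, s, t, u):
    """Count predicted for the component structure (r, s, t, u)."""
    denom = factorial(r)
    for i, m in s:
        denom *= factorial(m) * i ** m
    for _, m in list(t) + list(u):
        denom *= factorial(m)
    return factorial(n1) * factorial(n2) * factorial(n3) // denom


def check(n1, n2, n3):
    """Enumerate all (n1, n2, n3)-injections and compare with the formula."""
    A = [f"a{i}" for i in range(n1)]
    B = [f"b{i}" for i in range(n2)]
    C = [f"c{i}" for i in range(n3)]
    domain, codomain = A + B, A + C
    counts = Counter(
        profile(dict(zip(domain, images)), A, B, C)
        for images in permutations(codomain, len(domain))
    )
    agrees = all(proposition_count(n1, n2, n3, *p) == c for p, c in counts.items())
    return agrees, sum(counts.values())


print(check(2, 1, 2))
```

The second entry of the returned pair is the total number of injections, which for $(n_1, n_2, n_3) = (2, 1, 2)$ is $(4)_3 = 24$. The enumeration is only practical for very small parameters.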
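We define $R$ to be the number of isolated $C$ vertices in a random injection; $S_i$ to be the number of cycles of size $i$; $T_j$ to be the number of $A$-paths of size $j$; and $U_k$ to be the number of $B$-paths of size $k$. The joint distributions of these variables can be represented by conditioned independent Poisson variables in the following way. Let $\tilde R \sim$ Poisson(1), $\tilde S_i \sim$ Poisson($1/i$), $1 \le i \le n_1$, $\tilde T_j \sim$ Poisson(1), $2 \le j \le n_1 + 1$, and $\tilde U_k \sim$ Poisson(1), $2 \le k \le n_1 + 2$, be mutually independent random variables, and, motivated by (1), (2) and (3), define random variables $W_1$, $W_2$ and $W_3$ by
\[ W_1 = \sum_{i=1}^{n_1} i \tilde S_i + \sum_{j=2}^{n_1+1} (j-1) \tilde T_j + \sum_{k=2}^{n_1+2} (k-2) \tilde U_k, \quad W_2 = \sum_{k=2}^{n_1+2} \tilde U_k, \quad W_3 = \tilde R + \sum_{j=2}^{n_1+1} \tilde T_j + \sum_{k=2}^{n_1+2} \tilde U_k. \]
For any $r$, $s_i$, $t_j$ and $u_k$, let $E$ and $\tilde E$ be the events
\[ E = \{R = r,\ S_i = s_i,\ T_j = t_j,\ U_k = u_k \text{ for all } i, j, k\}, \quad \tilde E = \{\tilde R = r,\ \tilde S_i = s_i,\ \tilde T_j = t_j,\ \tilde U_k = u_k \text{ for all } i, j, k\}. \]
It can be shown using Proposition 2 that
\[ \mathbb{P}(E) = \mathbb{P}(\tilde E \mid W_1 = n_1, W_2 = n_2, W_3 = n_3). \tag{5} \]
The formula (5) is similar to Theorem 1 of [2] which gives the distribution of random combinatorial structures as independent random variables conditioned on the event that a weighted sum of them equals the size of the object. The formula (5) takes the form
\[ \mathbb{P}(S_i = s_i,\ 1 \le i \le n_1) = \mathbb{P}\Big(\tilde S_i = s_i,\ 1 \le i \le n_1 \,\Big|\, \sum_{i=1}^{n_1} i \tilde S_i = n_1\Big) \]
when $n_2 = n_3 = 0$. Poisson process approximations of $(S_1, \ldots, S_b)$ for $b = o(n_1)$ were made for random permutations with the assistance of this equality in [1]. Permutations are viewed as a subspecies of the species of endofunctions (also called mappings) in [10]. The component structure of random combinatorial objects such as permutations and mappings is studied in great detail in [3]. For work on the related subject of random set partitions see [4,8,14].

Conditional on $|A_2|$, the distribution of the process of cycle counts $(S_1, S_2, \ldots, S_{|A_2|})$ is the distribution of the process of cycle counts of a randomly chosen permutation from $S_{|A_2|}$. Unfortunately, the distribution of $|A_2|$ seems difficult to determine from (5), and we will not make use of (5) in this paper. Inclusion-exclusion, the method of moments and the second moment method will be applied instead.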
The expected number of isolated $C$ vertices is
\[ \mathbb{E}(R) = \frac{n_3 (n_3 - n_2)}{n_1 + n_3}. \]
If $n_2 = n_3$, then $R = 0$ because all elements of $C$ end a $B$-path.
In the sequel we consider counts of cycles and paths of different sizes. We let $n_2$ and $n_3$ be functions of $n_1$.

The cycles of random injections
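The probability mass function of $|A_2|$ will now be derived.
Theorem 4. If $n_3 \ge 1$, then the probability mass function of $|A_2|$ is given by
\[ \mathbb{P}(|A_2| = m) = \frac{n_3\, (n_1)_m}{(n_1 + n_3)_{m+1}}, \quad 0 \le m \le n_1, \tag{8} \]
and its cumulative distribution function restricted to its support is
\[ \mathbb{P}(|A_2| \le m) = 1 - \frac{(n_1)_{m+1}}{(n_1 + n_3)_{m+1}}, \quad 0 \le m \le n_1. \]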
Proof. The digraphs of injections with a given $A_2 \subseteq A$, $|A_2| = m$, can be decomposed uniquely into the digraph of a permutation on $A_2$ together with $n_2$ $B$-paths and $n_3 - n_2$ possibly empty $A$-paths on $A_1 \cup B \cup C$. The number of ways of choosing the paths equals the number of ways of partitioning $A_1$ into $n_3$ possibly empty paths, assigning an element of $C$ to each path, and assigning each of $n_2$ of the paths a single element of $B$ in $(n_3)_{n_2}$ ways. Let $D$ be a set of $n_3 - 1$ labelled elements with $D$ disjoint from $A \cup B \cup C$. Given a permutation of $A_1 \cup D$ written as a sequence and forgetting the labels of $D$, the elements of $D$ divide $A_1$, $|A_1| = n_1 - m$, into a sequence of $n_3$ paths which can be successively assigned elements of $C$. There are $(n_1 - m + n_3 - 1)!/(n_3 - 1)!$ ways of doing this. We have shown that there are
\[ \binom{n_1}{m}\, m!\, \frac{(n_1 - m + n_3 - 1)!}{(n_3 - 1)!}\, (n_3)_{n_2} \]
injections for which $|A_2| = m$. Dividing by the total number of injections $(n_1 + n_3)_{n_1 + n_2}$ gives (8). Moreover,
\[ \mathbb{P}(|A_2| \le m) = \sum_{k=0}^{m} \frac{n_3\, (n_1)_k}{(n_1 + n_3)_{k+1}} \tag{9} \]
\[ = 1 - \frac{(n_1)_{m+1}}{(n_1 + n_3)_{m+1}}, \tag{10} \]
where an identity from finite calculus (see [12]) is used at (10).

Corollary 5. Therefore, where $\xrightarrow{p}$ denotes convergence in probability as $n_1 \to \infty$. Proof.
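The counting argument in the proof of Theorem 4 translates directly into a short computation. The following sketch multiplies the three factors from the proof (choosing and permuting $A_2$, splitting $A_1$ into $n_3$ possibly empty paths, attaching the elements of $B$) and divides by the total number of injections. The function names are hypothetical, and the closed form used in `tail` for $\mathbb{P}(|A_2| \ge m)$ is one rearrangement that is consistent with this count, stated here as an assumption rather than as the paper's own display.

```python
from fractions import Fraction
from math import comb, factorial


def falling(x, k):
    """Falling factorial (x)_k = x (x-1) ... (x-k+1)."""
    result = 1
    for j in range(k):
        result *= x - j
    return result


def count_injections(n1, n2, n3, m):
    """Number of injections with exactly m vertices of A on cycles (n3 >= 1)."""
    choose_and_permute_A2 = comb(n1, m) * factorial(m)
    split_A1_into_paths = factorial(n1 - m + n3 - 1) // factorial(n3 - 1)
    attach_B = falling(n3, n2)
    return choose_and_permute_A2 * split_A1_into_paths * attach_B


def pmf(n1, n2, n3):
    """Probability mass function of |A_2| as a list indexed by m = 0..n1."""
    total = falling(n1 + n3, n1 + n2)
    return [Fraction(count_injections(n1, n2, n3, m), total) for m in range(n1 + 1)]


def tail(n1, n3, m):
    """Assumed closed form for P(|A_2| >= m)."""
    return Fraction(falling(n1, m), falling(n1 + n3, m))


p = pmf(2, 1, 2)
print(p)
print(sum(p), [tail(2, 2, m) for m in range(3)])
```

Exact rational arithmetic via `Fraction` keeps the check free of rounding error. The result does not depend on $n_2$, which can be seen by comparing `pmf(5, 0, 4)` with `pmf(5, 4, 4)`.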
Let $C_i(n)$ denote the number of cycles of length $i$ in a random permutation on $n$ letters and let $S_i(n_1)$ denote the number of cycles of size $i$ in a random injection. We can use Lemma 6 to transfer results on the process $(C_1(n), C_2(n), \ldots)$ to $(S_1(n_1), S_2(n_1), \ldots)$. The total variation distance between the laws $\mathcal{L}(X)$ and $\mathcal{L}(Y)$ of random elements taking values in a discrete space $S$ is defined to be
\[ d_{TV}(\mathcal{L}(X), \mathcal{L}(Y)) = \sup_{A \subseteq S} |\mathbb{P}(X \in A) - \mathbb{P}(Y \in A)|. \]
Proof. For n 1 large enough so that b(n 1 ) ω(n 1 ), We have

Let $L_i(n_1)$ be the size of the $i$th largest cycle in a random injection. Lemma 6 and a limit result on the large cycles of a random permutation from [11,15] can be used to show
\[ \frac{1}{|A_2|} (L_1(n_1), L_2(n_1), \ldots) \xrightarrow{d} (L_1, L_2, \ldots), \]
where the left hand side is taken arbitrarily to be $(1, 0, 0, \ldots)$ if $A_2 = \emptyset$, the distribution of the random vector $(L_1, L_2, \ldots)$ is the Poisson-Dirichlet distribution with parameter $\theta = 1$, and the convergence is in distribution. For various representations of the Poisson-Dirichlet distribution see [3].
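Inclusion-exclusion can be used to find the exact probability that a random injection has no fixed points.

Theorem 8. The probability that a random injection has no fixed points is
\[ \sum_{k=0}^{n_1} (-1)^k \binom{n_1}{k} \frac{1}{(n_1 + n_3)_k}. \tag{13} \]

Proof. Let $F \subseteq A$. The injections $f$ for which all elements of $F$ are fixed points are those such that $f|_F$ is the identity map and $f|_{(A \setminus F) \cup B}$ is an injection with codomain $(A \setminus F) \cup C$. Therefore,
\[ \mathbb{P}(f(a) = a \text{ for all } a \in F) = \frac{(n_1 + n_3 - |F|)_{n_1 + n_2 - |F|}}{(n_1 + n_3)_{n_1 + n_2}} = \frac{1}{(n_1 + n_3)_{|F|}}. \]
It follows from inclusion-exclusion that
\[ \mathbb{P}(f \text{ has no fixed points}) = \sum_{F \subseteq A} \frac{(-1)^{|F|}}{(n_1 + n_3)_{|F|}} = \sum_{k=0}^{n_1} (-1)^k \binom{n_1}{k} \frac{1}{(n_1 + n_3)_k}. \]

Formula (13) shows that the probability of having no fixed points does not depend at all on $n_2$. The next theorem shows that the counts of cycles of different sizes exhibit a phase transition when the size of $n_3$ is of the same order as $n_1$. Given sequences $x_{n_1}$, $y_{n_1}$, we write $x_{n_1} \sim y_{n_1}$ to mean $\lim_{n_1 \to \infty} x_{n_1}/y_{n_1} = 1$.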
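The inclusion-exclusion argument can be checked numerically on small cases. The sketch below assumes the simplified form $\sum_{k} (-1)^k \binom{n_1}{k} / (n_1 + n_3)_k$, obtained by cancelling the ratio of falling factorials that arises when $|F| = k$ elements of $A$ are fixed, and compares it with a brute-force enumeration. Function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import comb


def falling(x, k):
    """Falling factorial (x)_k = x (x-1) ... (x-k+1)."""
    result = 1
    for j in range(k):
        result *= x - j
    return result


def no_fixed_point_probability(n1, n3):
    """Inclusion-exclusion sum over the number k of forced fixed points."""
    return sum(
        Fraction((-1) ** k * comb(n1, k), falling(n1 + n3, k))
        for k in range(n1 + 1)
    )


def brute_force(n1, n2, n3):
    """Fraction of (n1, n2, n3)-injections with no fixed point, by enumeration."""
    A = [("a", i) for i in range(n1)]
    B = [("b", i) for i in range(n2)]
    C = [("c", i) for i in range(n3)]
    domain, codomain = A + B, A + C
    total = good = 0
    for images in permutations(codomain, len(domain)):
        total += 1
        # The first n1 entries of images are the images of the elements of A.
        good += all(images[i] != A[i] for i in range(n1))
    return Fraction(good, total)


print(no_fixed_point_probability(2, 2), brute_force(2, 1, 2), brute_force(2, 0, 2))
```

With $n_3 = 0$ the sum reduces to the classical derangement probability, for example $1/3$ when $n_1 = 3$.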
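Theorem 9. The expected number of cycles of size $i$, $1 \le i \le n_1$, is
\[ \mathbb{E}(S_i) = \frac{(n_1)_i}{i\, (n_1 + n_3)_i}. \tag{14} \]
Suppose that $n_3 \sim \gamma n_1$ for a constant $\gamma > 0$. For each fixed $d \ge 1$, the process of small cycles satisfies
\[ (S_1, S_2, \ldots, S_d) \xrightarrow{d} (Z_1, Z_2, \ldots, Z_d), \tag{15} \]
where the $Z_i$ are mutually independent Poisson($\lambda_i$) distributed random variables with parameters
\[ \lambda_i = \frac{1}{i\, (1 + \gamma)^i}. \tag{16} \]
If $\omega(n_1)$ is any integer valued function growing to infinity arbitrarily slowly, the number of vertices on cycles of size at least $\omega(n_1)$ satisfies
\[ \sum_{i = \omega(n_1)}^{n_1} i S_i \xrightarrow{p} 0. \tag{17} \]

Proof. We will calculate the joint falling moments of the $S_i$. For any $\mu_i \ge 0$, $i = 1, \ldots, d$, let $\Gamma(\mu_1, \ldots, \mu_d)$ denote the set of sequences of $\sum_{i=1}^{d} \mu_i$ vertex disjoint cycles such that the first $\mu_1$ of them have size 1, the next $\mu_2$ of them have size 2, and so on until the last $\mu_d$ of them have size $d$. Define $\tau = \sum_{i=1}^{d} i \mu_i$. For $\tau \le n_1$, the size of $\Gamma$ is
\[ |\Gamma(\mu_1, \ldots, \mu_d)| = \frac{(n_1)_\tau}{\prod_{i=1}^{d} i^{\mu_i}}. \]
Given a sequence of cycles $(\alpha_l)$ in $\Gamma$, let $F \subseteq A$ be the set of vertices on the $\alpha_l$. The injections for which all $\alpha_l$ are components are precisely those for which $f|_F$ is determined by the $\alpha_l$ and $f|_{(A \setminus F) \cup B}$ is an injection with codomain $(A \setminus F) \cup C$. Note that $|F| = \tau$. The joint falling moment of the $S_i$ corresponding to the $\mu_i$ is
\[ \mathbb{E}\Big(\prod_{i=1}^{d} (S_i)_{\mu_i}\Big) = \frac{(n_1)_\tau}{\prod_{i=1}^{d} i^{\mu_i}} \cdot \frac{(n_1 + n_3 - \tau)_{n_1 + n_2 - \tau}}{(n_1 + n_3)_{n_1 + n_2}} = \frac{(n_1)_\tau}{(n_1 + n_3)_\tau \prod_{i=1}^{d} i^{\mu_i}}. \]
In particular, the formula (14) follows by taking $\mu_i = 1$ and $\mu_m = 0$ for $m \ne i$ for each $1 \le i \le n_1$. As $n_1 \to \infty$ with $n_3 \sim \gamma n_1$,
\[ \mathbb{E}\Big(\prod_{i=1}^{d} (S_i)_{\mu_i}\Big) \to \prod_{i=1}^{d} \lambda_i^{\mu_i}. \]
The conclusion of weak convergence (15) follows from the method of moments (Theorem 6.2 of [9]) with $\lambda_i$ given by (16) if $n_3 \sim \gamma n_1$ and $\lambda_i = 1/i$ if $n_3 = o(n_1)$ as in Theorem 7. By (14), the expected number of vertices on cycles of size at least $\omega(n_1)$ is
\[ \sum_{i=\omega(n_1)}^{n_1} i\, \mathbb{E}(S_i) = \sum_{i=\omega(n_1)}^{n_1} \frac{(n_1)_i}{(n_1 + n_3)_i} = \frac{n_1!}{(n_1 + n_3)!} \sum_{j=1}^{n_1 + n_3 - \omega(n_1)} (j)_{n_3} = \frac{n_1!}{(n_1 + n_3)!} \cdot \frac{(n_1 + n_3 - \omega(n_1) + 1)_{n_3 + 1}}{n_3 + 1}, \]
which converges to 0 as $n_1 \to \infty$ when $n_3 \sim \gamma n_1$, implying (17).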
The limiting distribution in (15) is the same as for the Ewens sampling formula with $\theta = (1 + \gamma)^{-1}$. However, unlike the Ewens sampling formula, asymptotically a random injection does not have any cycles of size growing to infinity as $n_1 \to \infty$.
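The behaviour of the cycle counts can be explored numerically. The sketch below assumes the expectation $\mathbb{E}(S_i) = (n_1)_i / (i\,(n_1 + n_3)_i)$, which is what the falling-moment computation in the proof of Theorem 9 yields for a single cycle size, and the corresponding limit $1/(i(1+\gamma)^i)$ when $n_3 \sim \gamma n_1$. It also compares direct summation of the expected number of vertices on large cycles with the closed form from the end of the proof. Function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def falling(x, k):
    """Falling factorial (x)_k = x (x-1) ... (x-k+1)."""
    result = 1
    for j in range(k):
        result *= x - j
    return result


def expected_cycles(n1, n3, i):
    """Assumed exact expectation of the number of cycles of size i."""
    return Fraction(falling(n1, i), i * falling(n1 + n3, i))


def limit_rate(gamma, i):
    """Assumed limiting Poisson parameter when n3 ~ gamma * n1."""
    return 1 / (i * (1 + gamma) ** i)


def large_cycle_vertices(n1, n3, omega):
    """Expected number of vertices on cycles of size >= omega, summed directly."""
    return sum(i * expected_cycles(n1, n3, i) for i in range(omega, n1 + 1))


def large_cycle_vertices_closed(n1, n3, omega):
    """Closed form n1!/(n1+n3)! * (n1+n3-omega+1)_{n3+1} / (n3+1)."""
    return Fraction(
        factorial(n1) * falling(n1 + n3 - omega + 1, n3 + 1),
        factorial(n1 + n3) * (n3 + 1),
    )


n1, gamma = 10**5, 2
for i in (1, 2, 3):
    print(i, float(expected_cycles(n1, gamma * n1, i)), limit_rate(gamma, i))
print(large_cycle_vertices(6, 4, 3), large_cycle_vertices_closed(6, 4, 3))
```

For fixed $i$ the exact expectation approaches the limiting rate quickly as $n_1$ grows, while the expected mass on cycles of size at least $\omega$ shrinks geometrically in $\omega$.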

Counts of paths of different sizes
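In this section we estimate the number of $A$-paths and $B$-paths of different sizes and get almost sure asymptotics for the maximum size of the two kinds of paths under certain conditions.

Theorem 10. Suppose $n_3 \ge 1$. The expectation of the number of $A$-paths of size $i$, $2 \le i \le n_1 + 1$, is
\[ \mathbb{E}(T_i) = \frac{(n_1)_{i-1}\, n_3 (n_3 - n_2)}{(n_1 + n_3 - 1)_{i-1} (n_1 + n_3)} \tag{18} \]
and the expectation of the number of $B$-paths of size $i$, $2 \le i \le n_1 + 2$, is
\[ \mathbb{E}(U_i) = \frac{(n_1)_{i-2}\, n_2 n_3}{(n_1 + n_3 - 1)_{i-2} (n_1 + n_3)}. \]
Let the sequence of indices $i = i(n_1)$ be such that $i = o(\sqrt{n_1})$. If $\lim_{n_1 \to \infty} \min(n_2, n_3 - n_2) = \infty$ and $\lim_{n_1 \to \infty} \mathbb{E}(T_i) = \infty$, then
\[ \frac{T_i}{\mathbb{E}(T_i)} \xrightarrow{p} 1, \]
while if $\lim_{n_1 \to \infty} n_2 = \infty$ and $\lim_{n_1 \to \infty} \mathbb{E}(U_i) = \infty$, then
\[ \frac{U_i}{\mathbb{E}(U_i)} \xrightarrow{p} 1. \]

Proof. For each fixed $1 \le \mu \le \min(n_1/(i-1), n_3)$, the number of ways of choosing $\mu$ ordered vertex disjoint $A$-paths of size $i$ is $(n_1)_{\mu(i-1)} (n_3)_\mu$. Let $\alpha$ denote a particular set of $\mu$ ordered vertex disjoint $A$-paths of size $i$. Let $F$ be the subset of $A$ which are vertices of $A$-paths in $\alpha$ and let $G$ be the subset of $C$ which are vertices of $A$-paths in $\alpha$, where $|F| = \mu(i-1)$ and $|G| = \mu$. The injections $f$ for which the chosen $A$-paths are components are those for which $f|_F$ is determined by $\alpha$ and $f|_{(A \setminus F) \cup B}$ is an injection with codomain $(A \setminus F) \cup (C \setminus G)$. Therefore, the $\mu$th falling factorial moment of $T_i$ is
\[ \mathbb{E}(T_i)_\mu = (n_1)_{\mu(i-1)} (n_3)_\mu \frac{(n_1 + n_3 - \mu i)_{n_1 + n_2 - \mu(i-1)}}{(n_1 + n_3)_{n_1 + n_2}} = \frac{(n_1)_{\mu(i-1)} (n_3)_\mu (n_3 - n_2)_\mu}{(n_1 + n_3)_{\mu i}}. \]
Similarly, for each $\mu > 0$ such that $\mu(i-2) \le n_1$ and $\mu \le n_2$, the $\mu$th falling factorial moment of the number of $B$-paths of size $i$ is
\[ \mathbb{E}(U_i)_\mu = \frac{(n_1)_{\mu(i-2)} (n_2)_\mu (n_3)_\mu}{(n_1 + n_3)_{\mu(i-1)}}. \]
Letting $\mu = 1$ in the formulae for the falling moments gives the formulae for $\mathbb{E}(T_i)$ and $\mathbb{E}(U_i)$. By the assumptions $\lim_{n_1 \to \infty} \min(n_2, n_3 - n_2) = \infty$ and $i = o(\sqrt{n_1})$, letting $\mu = 2$ in the formulae for the falling moments results in
\[ \mathbb{E}(T_i)_2 \sim (\mathbb{E}(T_i))^2, \]
while the assumptions $\lim_{n_1 \to \infty} n_2 = \infty$, $n_2 \le n_3$, and $i = o(\sqrt{n_1})$ produce
\[ \mathbb{E}(U_i)_2 \sim (\mathbb{E}(U_i))^2. \]
The identity $\mathrm{Var}(X) = \mathbb{E}(X)_2 + \mathbb{E}(X) - (\mathbb{E}(X))^2$, true for any random variable $X$, together with $\lim_{n_1 \to \infty} \mathbb{E}(T_i) = \infty$ and $\lim_{n_1 \to \infty} \mathbb{E}(U_i) = \infty$, now gives us the estimates $\mathrm{Var}(T_i) = o((\mathbb{E}(T_i))^2)$ and $\mathrm{Var}(U_i) = o((\mathbb{E}(U_i))^2)$. The conclusions about convergence in probability result from the second moment method; see [9], for example.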
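The expectations of the path counts can be sanity-checked with linearity of expectation: the expected numbers of $A$, $B$ and $C$ vertices summed over all components must equal $n_1$, $n_2$ and $n_3$. The sketch below assumes the $\mu = 1$ falling moments from the proof, namely $(n_1)_{i-1} n_3 (n_3 - n_2)/(n_1 + n_3)_i$ for $A$-paths and $(n_1)_{i-2} n_2 n_3/(n_1 + n_3)_{i-1}$ for $B$-paths, together with $(n_1)_i/(i\,(n_1 + n_3)_i)$ for cycles. Treating an isolated $C$ vertex as an $A$-path of size 1 is a convention adopted only for this illustration, and the function names are hypothetical.

```python
from fractions import Fraction


def falling(x, k):
    """Falling factorial (x)_k = x (x-1) ... (x-k+1)."""
    result = 1
    for j in range(k):
        result *= x - j
    return result


def expected_A_paths(n1, n2, n3, i):
    """Assumed E(T_i); i = 1 is used for isolated C vertices."""
    return Fraction(falling(n1, i - 1) * n3 * (n3 - n2), falling(n1 + n3, i))


def expected_B_paths(n1, n2, n3, i):
    """Assumed E(U_i) for B-paths of size i >= 2."""
    return Fraction(falling(n1, i - 2) * n2 * n3, falling(n1 + n3, i - 1))


def expected_cycles(n1, n3, i):
    """Assumed E(S_i) for cycles of size i >= 1."""
    return Fraction(falling(n1, i), i * falling(n1 + n3, i))


def vertex_totals(n1, n2, n3):
    """Expected numbers of A, B and C vertices, summed over components (n3 >= 1)."""
    a = (
        sum(i * expected_cycles(n1, n3, i) for i in range(1, n1 + 1))
        + sum((j - 1) * expected_A_paths(n1, n2, n3, j) for j in range(2, n1 + 2))
        + sum((k - 2) * expected_B_paths(n1, n2, n3, k) for k in range(2, n1 + 3))
    )
    b = sum(expected_B_paths(n1, n2, n3, k) for k in range(2, n1 + 3))
    c = sum(expected_A_paths(n1, n2, n3, j) for j in range(1, n1 + 2)) + b
    return a, b, c


print(vertex_totals(2, 1, 2))
print(vertex_totals(5, 2, 4))
```

Each call should return the triple $(n_1, n_2, n_3)$ it was given, which is a necessary condition for the three expectation formulas to be mutually consistent.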
Define $Y_A = \max\{i : T_i > 0\}$ and $Y_B = \max\{i : U_i > 0\}$ to be the maximum sizes of $A$-paths and $B$-paths, respectively. Under suitable conditions the next theorem provides asymptotics for $Y_A$ and $Y_B$.
Theorem 11. Given a function $\omega(n_1)$ which converges to infinity arbitrarily slowly, define log n 3 (n 3 −n 2 )(n 1 +n 3 −1) where for any real number $x$, $\lfloor x \rfloor$ and $\lceil x \rceil$ are the usual floor and ceiling functions. If $\lim_{n_1 \to \infty} \min(n_2, n_3 - n_2) = \infty$, $\liminf_{n_1 \to \infty} i_A \ge 2$, and $i_A = o(\sqrt{n_1})$, then lim n 1 →∞ Define

Proof. We prove the theorem first for $Y_A$. The lower bound on $i_A$ and (18) imply that by (18), for any $i_A \le i \le n_1 + 2$,
\[ \mathbb{E}(T_i) = \frac{(n_1)_{i-1}\, n_3 (n_3 - n_2)}{(n_1 + n_3 - 1)_{i-1} (n_1 + n_3)}. \]
The proof of (21) is similar and omitted.