Statistical properties of lambda terms

We present a quantitative, statistical analysis of random lambda terms in the de Bruijn notation. Following an analytic approach using multivariate generating functions, we investigate the distribution of various combinatorial parameters of random open and closed lambda terms, including the number of redexes, head abstractions, free variables or the de Bruijn index value profile. Moreover, we conduct an average-case complexity analysis of finding the leftmost-outermost redex in random lambda terms showing that it is on average constant. The main technical ingredient of our analysis is a novel method of dealing with combinatorial parameters inside certain infinite, algebraic systems of multivariate generating functions. Finally, we briefly discuss the random generation of lambda terms following a given skewed parameter distribution and provide empirical results regarding a series of more involved combinatorial parameters such as the number of open subterms and binding abstractions in closed lambda terms.

Lambda calculus (often abbreviated to λ-calculus) is a functional calculus proposed by Alonzo Church in the 1930s as an alternative foundation of mathematics. Although the initial plan failed, due to logical inconsistencies discovered in Church's naive system, it was quickly realised that λ-calculus itself is able to elegantly capture the, by then still informal, notion of computability, see [17]. Nowadays, λ-calculus is considered not only as an important theoretical model of computation, but is also used in practical applications ranging from functional programming languages [38], including the evaluation and testing of functional programming language compilers [18,37], to automated theorem provers [9].
Despite the extensive use of λ-terms (i.e. formal expressions of λ-calculus) as computations in functional programming languages or as components of proof artifacts in various automated theorem provers, quantitative investigations into the combinatorial or statistical properties of λ-terms were initialised only quite recently. Motivated by the uniformly random (conditioned on size) generation of λ-terms, Wang [41] explored a combinatorial model of λ-calculus where α-convertible λ-terms (i.e. terms identical up to bound variable names) are considered equivalent. The central problem of providing asymptotic estimates on the number of λ-terms in this model remained, however, open. Some time later, David et al. [19] investigated a similar model of λ-calculus where variables do not contribute to the term size and showed that asymptotically almost all λ-terms are strongly normalising. In other words, the fraction of λ-terms for which all evaluation strategies terminate approaches one as the term size tends to infinity. Likewise, in this model the central problem of giving accurate estimates on the number of λ-terms of size n remained open. Enumeration problems for restricted classes of closed affine and linear λ-terms, where binders capture at most and exactly one variable, respectively, were investigated by Bodini, Gardy, Jacquot and Gittenberger [10,11,14]. The class of λ-terms with restricted unary height was considered by Bodini, Gardy and Gittenberger in [12].
The canonical models of Wang and David et al. pose considerable difficulties due to the global, intractable structure of binders (abstractions) and their associated variables, all considered modulo α-equivalence. Explicit variable names, though elegant for manual manipulation, introduce also problems with substitution of terms for variables, especially when computations in λ-calculus are meant to be automatised. For the latter purpose, de Bruijn proposed an alternative notation of λ-terms, involving non-negative indices instead of explicit variable names [20]. This notation was later adopted by Lescanne [35,28] who proposed a new combinatorial representation for the enumeration of λ-terms. Within this new representation, λ-terms represent entire α-equivalence classes in the former models. Consequently, it became possible to enumerate also open terms (i.e. containing free variables) not just closed ones. Let us also remark that independently, a different combinatorial model was proposed by Tromp who considered a binary encoding of λ-calculus meant for the construction of a compact and efficient self-interpreter with applications to Kolmogorov complexity [40]. Enumeration problems related to the binary λ-calculus, as well as the effective random generation of λ-terms, were later studied by Grygiel and Lescanne [29].
Investigations into quantitative properties of λ-terms in the de Bruijn notation were continued by Bendkowski et al. [7,8] who showed that, in contrast to the canonical representation of David et al., asymptotically almost all λ-terms are not strongly normalising. In other words, the proportion of terms for which all evaluation strategies terminate approaches zero as the term size tends to infinity. Various size models based on the de Bruijn notation, such as Tromp's binary encoding or the so-called natural size notion introduced by Bendkowski et al. were later generalised in a common framework by Gittenberger and Gołębiewski who provided tight lower and upper asymptotic bounds on the number of closed λ-terms [27]. Recently, the gap between both the lower and bounds was closed by Bodini, Gittenberger and Gołębiewski [15]. Subsequently, efficient sampling methods for closed terms were developed and the enumeration of closed λ-terms was finally completed.
In the current paper we propose to deepen the quantitative analysis of λ-calculus in the de Bruijn notation, offering a detailed statistical analysis of random λ-terms. We investigate the distribution of several combinatorial parameters related to plain (i.e. unrestricted) and closed λ-terms. Table 1 provides a brief overview of the obtained results. In the current paper, we provide limit laws and asymptotic estimates using techniques from analytic combinatorics. While plain λ-terms in de Bruijn size notion can be analysed using classical methods, the respective analysis of closed λ-terms requires solving infinite systems of algebraic equations. Let us mention that an earlier paper by Drmota, Gittenberger and Morgenbesser [23] deals with infinite algebraic systems which are strongly connected and whose Jacobian can be represented as a sum of a scaled identity operator and an operator whose power is compact. Here, we develop a general tool meant for analysis of infinite algebraic systems that resemble in structure systems for closed λ-terms, however do not fit into the framework of Drmota, Gittenberger and Morgenbesser. In this context, our result can be considered as a continuation of [23].
The paper is structured as follows. In Section 2 we provide a concise presentation of preliminary notions and techniques. In particular, we discuss the de Bruijn representation of λ-terms (Section 2.1) and introduce the utilised analytic toolbox (Section 2.2). We then continue with a fairly standard analysis of basic statistics corresponding to plain λ-terms (Section 3). Next, we provide an empirical evaluation of several statistical properties corresponding to plain, closed, and so-called h-shallow λ-terms, i.e. terms with de Bruijn indices whose value does not exceed h (Section 4). We give empirical histograms and relate the discovered distributions with D R A F T considered term types, exhibiting some intriguing correlations. In the next section we develop our main technical tool for investigating combinatorial parameters in closed λ-terms (Section 5).
In the subsequent section we use our advanced marking techniques and study various parameters in closed λ-terms (Section 6). Finally, we conclude the paper with remarks and open questions (Section 7).

Preliminaries
2.1. Lambda calculus. λ-calculus is a theoretical formalism famously equivalent in expressiveness to Turing machines, see [2]. In this calculus, computations are represented as λ-terms defined by the formal grammar T ::= x | (λx.T ) | (T T ) in which x belongs to the countable, infinite alphabet of variables; (λx.T ) is an abstraction of variable x in T ; and (T T ) denotes an application of two λ-terms. Given an abstraction (λx.T ), occurrences of x in T are said to be bound. Unbound variable occurrences are said to occur freely.
Lambda terms, intended to represent anonymous functions, are executed by means of the iterated process of β-reduction. First, an arbitrary β-redex subterm in form of (λx.N )M is selected (if no such subterm exists, computations are terminated). Next, the selected β-redex is replaced with N [x := M ], i.e. N in which each occurrence of x is substituted, in a captureavoiding manner, by M . While substituting M for x in N we have to avoid the unintended situation in which free variable occurrences in M get bound, in other words captured, by some abstractions occurring in N . For instance, let N = (λy.x) and M = y. The term (λx.N )M should not be reduced to λy.y as, by doing so, the free variable occurrence y gets bound due to a coincidental clash with the inner abstraction variable name. Certainly, the arbitrary choice of the formal variable name y should not influence the intended semantics of the represented computation. Following this motivation, λ-terms differing only in bound variable names are considered equivalent (in other words α-convertible). In order to avoid potential name clashes, we can therefore rename bound variable occurrences before proceeding with β-reduction. Since there is an infinite supply of available variable names, it is always possible to avoid variable captures. Consequently, we can equivalently α-convert (λx.λy.x)y into, say, (λx.λw.x)y and proceed with (λx.λw.x)y → β (λw.x)[x := y] = λw.y.
Though intuitive, explicit variable names pose considerable conceptual and implementation problems. For instance, consider the terms λx.x and λy.y. Although syntactically different, both semantically represent the same anonymous identity function as (λx.x)T → β T and (λy.y)T → β T for arbitrary T . In order to facilitate automatic computations, de Bruijn proposed a different notation for λ-terms eliminating in effect the troublesome variable names [20]. In his notation, variable occurrences are replaced with indices represented as non-negative integers. The intention is to view λ-terms as natural tree-like structures and encode variable occurrences as indices denoting their relative distance to respective variable binders -each index n denotes a variable occurrence x whose relative distance to its binder (in terms of passed lambda symbols) is equal to n+1. For instance, 0 corresponds to a variable occurrence bound to the nearest abstraction on its unique path to the root in the associated tree-like representation of the considered λ-term. Consequently, α-convertible λ-terms have the same de Bruijn representation. In effect, each λ-term in the de Bruijn notation represents an entire α-equivalence class of λ-terms in the classic variable notation. For instance, both λx.x and λy.y, being α-convertible, are represented as λ0 in the de Bruijn notation. The use of de Bruijn indices significantly simplifies the automatic substitution operation. Due to the lack of explicit variable names, variable captures and name clashes do not pose implementation issues.
Remark 2.1. There exists a disagreement in the literature whether to start de Bruijn indices with 0 or 1. Although de Bruijn himself assumed the latter [20], some authors follow his footsteps, see e.g [28,29] whereas others do not, including 0 in the set of admissible indices, see e.g. [27,8,16]. Certainly, neither convention is better than the other. In the current paper, we follow the convention of starting de Bruijn indices with 0 so the keep consistent with the most recent literature.
Definition 2.2. Let {0, 1, . . .} be an infinite, denumerable set of available indices. Then, the set L ∞ of λ-terms in the de Bruijn notation is defined inductively as follows: a) Each index n is a λ-term; b) If N and M are λ-terms, then (N M ) is a λ-term; c) If N is a λ-term, then (λN ) is a λ-term. Following usual notational conventions, we omit outermost parentheses and drop parentheses from left-associated λ-terms. For instance, λx.λy.λz.((xy)z) in the classical variable notation is written as λλλ210.
An index occurrence n is said to be bound in the term N if there exist at least n + 1 lambda symbols on the unique path from n to the root of the associated tree-like representation of N , see e.g. Figure 1. Otherwise, n is said to be occurring freely in N and hence corresponds to a free variable in the classical λ-calculus notation. For convenience, we refer to de Bruijn indices both as indices and variables. If each index occurrence in N is bound, then N is said to be closed. Otherwise, it is said to be open. And so, λλλ210 is closed whereas λλ21 is not as here 2 is not bound. If prepending N with m lambdas turns it into a closed λ-term, then N is said to be m-open. Certainly, if N is m-open, then it is also (m + 1)-open. Moreover, 0-open λ-terms correspond exactly to closed λ-terms. Hence, though λλ21 is not closed, it is 1-open as λλλ21 is a closed λ-term. Finally, we write that a λ-term is plain if we mean to denote that it is either open or closed, without specifying which case holds. . Two tree-like representations associated with the same example λ-termλx.λy.λz.xz(yz) and its de Bruijn notation variant λλλ20 (10). Back pointers to abstractions are included for illustrative purposes only.
2.1.1. Enumeration. In the current paper we follow [7,8,27,15] and investigate the statistical properties of random λ-terms in the de Bruijn representation. We assume a unary base encoding of indices, i.e. an encoding in which n is identified with an n-fold application of the successor operator S to zero. Formally, the set L ∞ of λ-terms is described by the following formal grammar: In order to enumerate λ-terms, we have to assign a formal notion of size to each term in such a manner that for each available size n the number of terms of size n is finite. Though various size measures are considered in the literature, most notably the general size model framework of Gittenberger and Gołębiewski [27], we assume the simple natural size notion [7] in which the size of T is equal to the total number of abstractions, applications, successors and zeros occurring in T . Formally, we define the size of T inductively as follows: 3. Note that, in general, n is of size n + 1 as it consists of n successors applied to zero. Consequently, the term λλλ210 is of size 11 as it consists of three λ symbols, two applications between 2, 1 and 0, by convention omitted in writing, and indices 0, 1, 2 of total size six.
Remark 2.4. It is worth noticing that, with some minor technical overhead, the analysis presented in the current paper extends onto the more general size model framework of Gittenberger and Gołębiewski [27] including the assumed natural size notion as a special case. We prefer to avoid technicalities related to the general size notion and so, for the reader's convenience, favour a lucid presentation of the key arguments.
Let L m denote the set of m-open λ-terms, see Definition 2.2 (plain terms can be viewed as "infinitely" open, hence the ∞ symbol in the subscript of L ∞ ). Like plain λ-terms (2.1), L m can be described in terms of a formal, though now infinite, grammar as follows: An m-open λ-term T can take one of the three forms. Either T is in the form of abstraction followed by an (m + 1)-open λ-term; or it is an application of two m-open λ-terms; or, finally, T is one of the indices 0, 1, . . . , m-1.
Due to the infinite combinatorial specification (2.3) for L m standard analytic combinatorics techniques are not readily applicable. Consequently, enumerating closed λ-terms poses a considerable challenge. In [27] a partial solution bounding the asymptotic growth rate of the number of m-open λ-terms of size n was proposed. Although both the lower and upper bounds were of the form Cρ n n −3/2 , a typical trait of various tree-like structures, the asymptotic growth rate of m-open terms remained open. Remarkably, some time later in their joint paper [15] Bodini, Gittenberger and Gołębiewski closed the remaining gap and confirmed the conjectured Cρ n n −3/2 form of the asymptotic growth of m-open λ-terms. Furthermore, two combinatorial problems related to random closed λ-terms were studied. Specifically, the number of terms with an a priori fixed number of abstractions and the number of terms in β-normal form, i.e. without β-redexes. In this context, our contribution is a natural continuation of their work. In addition, we offer a different proof of the asymptotic growth rate of m-open λ-terms.

Analytic tools.
In the following subsection we present standard, analytic combinatorics tools following the exposition of Flajolet and Sedgewick [25]. We also assume conventional notation corresponding to generating functions, their coefficients and asymptotic expansions. We refer the reader to [25,42] for a detailed exposition.
For our purposes, combinatorial parameter analysis outlines as follows: • Let a n,k denote the number of plain (closed) lambda terms of size n for which the investigated parameter takes value k. Note that we do not assume that the numbers a n,k are a priori known. With such a two-dimensional sequence of numbers we associate a bivariate generating function n,k 0 a n,k z n u k ; In order to simultaneously study several different parameters of interest, we introduce multivariate generating functions in form of (2.5) A(z, u) = n,k 0 a n,k z n u k where u = (u 1 , . . . , u d ) is a d-dimensional variable, k = (k 1 , . . . , k d ) is a d-dimensional index satisfying k i 0, u k := u k 1 1 u k 2 2 · · · u k d d , and a n,k denotes the number of plain (closed) lambda terms of size n for which the investigated parameter values equal k 1 , k 2 , . . . , k d , respectively; • Considered combinatorial parameters (patterns) inside plain or closed λ-terms are described in terms of admissible combinatorial specifications (sometimes infinite, as in the case of closed terms); • So obtained specifications are then converted into systems of equations involving multivariate generating functions where additional variables u = (u 1 , u 2 , . . . , u d ) mark associated combinatorial patterns; • In the case of plain lambda terms, the resulting systems of equations are solved, usually approximately, in terms of standard complex-valued functions like f (z) = √ 1 − z. The coefficients of associated generating functions depend on the marking variables u = (u 1 , u 2 , . . . , u d ). In the case of closed lambda terms, novel tools developed in Section 5 are applied; • Finally, an application of Flajolet and Odlyzko's transfer theorem provides access to probability generating functions of the limiting probability distributions. In consequence, properties of investigated combinatorial parameters become readily available.
2.2.1. Asymptotic expansions. In order to access the asymptotic form of the coefficients of corresponding generating functions, we view them as functions analytic at the origin of the complex plane and examine their singularities, in particular so-called dominant singularities of smallest possible absolute value. Typically, at this point, an analytic continuation of the formal power series outside its circle of convergence is required. The following, usual domain in which the function is considered, is called a delta-domain.

D R A F T
Then, for x in any compact subinterval of (0, +∞) the coefficient standing by z n in f (z) k admits an asymptotic estimate where S(x) is the Rayleigh function satisfying 2.2.2. Algebraic systems. The following theorem, commonly known as the Drmota-Lalley-Woods theorem, is a fundamental result obtained independently by several authors [21,43,34] in order to establish limit laws in various families of tree structures specified by context-free grammars. In our exposition, we reference Drmota's book [22,Section 2.2.5], and the papers [21,23,39,3].
Assume that Φ(0, 0, 0) = 0. Then, • Φ(z, y, u) is said to be non-linear if at least one of its component polynomials Φ j is non-linear in one of the formal variables y 1 , . . . , y m ; • Φ(z, y, u) is said to be algebraic positive if all of its component polynomials Φ j have non-negative coefficients; • Φ(z, y, u) is said to be algebraic proper if it admits a unique formal power series solution to which the iteration (2.15) y 0 (z, u) = 0 and y k+1 (z, u) = Φ(z, y k (z, u), u) considered in the metric space of formal power series with valuation, converges as k → ∞, and the Jacobian matrix ∂Φ ∂y is nilpotent at (z, y) = (0, 0).
• Φ(z, y, u) is said to be algebraic irreducible if its dependency graph (i.e. a graph whose vertices are the integers 1, . . . , m and there exists a directed edge k → j if y j figures in a monomial of Φ k ) is strongly connected; • Φ(z, y, u) is said to be algebraic aperiodic if each of its component solutions y j (z, 1) for j = 1, . . . , m is aperiodic in the sense that the greatest common divisor of the pairwise differences of the set of exponent indices of z within y j (z, 1) is equal to 1.
Remark 2.8. The notion of algebraic properness of systems, also referred to as well-foundedness, is extensively studied in [39,Section 5]. As discussed in [39,33], the system has combinatorial meaning only if the Jacobian is nilpotent, i.e. if the recursive definition is well-defined and allows to inductively construct all the instances of combinatorial species. Let us note that the condition Φ(0, 0, 0) = 0 is a technical assumption of the Drmota-Lalley-Woods theorem. Pivoteau, Salvy and Soria consider, inter alia, well-founded systems for which y(0, 0) = 0. One possible characterisation of such systems is the condition that the limit of a suitable iterative approximation procedure yields the solution of the initial functional system.
The assumption that a system is polynomial can be replaced by a more general assumption that the functions are analytic, see [21]. For a detailed and non-trivial study of the conditions regarding analytic functions and the configuration of the critical points in this more general case, see [3].
for u in a neighbourhood of 1, |z − ρ(u)| < ε and arg(z − ρ(u)) = 0, where c k,j (u) and ρ(u) are analytic functions of u, and the functions h j (t, u) are analytic at (t, u) = (0, 1). In addition, if the system is algebraic aperiodic, then all y j have ρ(u) as their unique dominant singularity, and there exist constants 0 < δ < π/2 and η > 0 such that y(z, u) is analytic in a region of the form Remark 2.10. The above Drmota-Lalley-Woods theorem has been further generalised by Drmota, Gittenberger and Morgenbesser in the case of strongly connected systems with infinitely many equations [23]. In their generalisation, the authors require that the Jacobian of the system (or some of its power) is a compact operator. Alas, as the system corresponding to closed λ-terms is not strongly connected, it does not fit into their framework. In the current paper we introduce a new condition of exponential convergence which is independent of the Jacobian and conjecture that it is crucial for obtaining the respective Puiseux expansions of generating functions. 2.3. Limit laws. Consider a bivariate generating function L(z, u) with non-negative coefficients and a sequence of random variables (X n ) n 0 such that (2.19) L(z, u) = n,k 0 a n,k z n u k and P(X n = k) = a n,k j 0 a n,j .
We say that X n is associated with variable u. In order to understand the limiting behaviour of X n we investigate the probability generating function p n (u) of X n defined as Once accessed, p n (u) proves extremely useful in establishing the traits of X n as n tends to infinity. In what follows, we focus on two types of limiting distributions. The first type is related to the case of a so-called fixed singularity, which results in discrete limit law; the second type is related to so-called moving singularity, and typically results in a Gaussian limit law.
as z → ρ uniformly in delta-domain ∆(R) for some R > ρ (see Proposition 2.5). Then, the random variable X n associated with the marking variable u converges in distribution to a discrete limiting distribution with probability generating function The corresponding mean values satisfy 2.3.2. Central limit theorem.
Remark 2.13. In 1983, Bender and Richmond [4] proved a multi-dimensional variant of the central limit theorem for probability generating functions taking the quasi-power form p n (u) ∼ A(u)B(u) n . This line of research was later continued by Hwang [32] who established precise rates of convergence in the one-dimensional case. The two-dimensional case by was next investigated by Heuberger [30]. More recently, in 2016, the full multi-dimensional version has been resolved by Heuberger and Kropf [31] using a multi-dimensional version of the Berry-Esseen inequality. Although we do not touch on the rates of convergence in the current paper, let us mention that they can be obtained using the above results.
In order to formulate the multivariate central limit theorem, it is convenient to introduce the notion of logarithmic derivative which enters the mean value and the covariance matrix of the resulting random variable.
Definition 2.14. The logarithmic derivative of A(z, u) is given by the expression where A(u) is uniformly continuous and B(u) has a quadratic Taylor series expansion with error term O |u k − 1| 3 . Assume that B(u) satisfies the following variability condition: Then, the sequence of random variables X n , after standardization, converges in law to Gaussian random variable satisfying The mean vector and the covariance matrix satisfy In the one-dimensional case (2.28) simplifies to 16. Typically, when the singularity is moving (see Proposition 2.9) the bivariate generating function takes the form uniformly as z → ρ(u) for u in a vicinity of 1.
Consequently, the probability generating function takes form In this form, the probability generating function satisfies the premises of the multivariate quasipower theorem (see Remark 2.13) and so one can also obtain the speed of convergence. In our situations this speed is typically of order O 1 √ n .
Remark 2.17. For convenience, we say that a random vector X n converges in law to multivariate Gaussian distribution with mean nµ and variance nΣ writing

Basic statistics of plain lambda terms
In this section we investigate several basic combinatorial parameters related to random plain λ-terms. Let us start with invoking the combinatorial specification (2.1) describing the set L ∞ of plain λ-terms. Recall that L ∞ is specified as Equivalently, the set L ∞ of λ-terms can be specified using a pictorial tree grammar, see Figure 2 (note the explicit @ symbol for term application). Figure 2. Combinatorial specification for plain λ-terms.
Following symbolic methods [25, Part A: Symbolic Methods] we note that the generating function D(z) corresponding to de Bruijn indices takes the form D(z) = z 1 − z and so the generating function L ∞ (z) associated with plain λ-terms satisfies the following functional equation: 2) for L ∞ (z) we obtain two formal solutions. Since we know a priori that the resulting generating function has non-negative coefficients [z n ]L ∞ (z) we conclude that In this form, we can easily verify that the radicand expression (1 − z) 2 − 4z 2 1 − z carries the single dominant square-root type singularity ρ of L ∞ (z). At this point, a straightforward application of the transfer theorem (see Proposition 2.5) gives us access to the asymptotic growth rate of the counting sequence corresponding to plain λ-terms.
Proposition 3.1 (see [7]). Let L ∞ (z) be the generating function associated with plain λ-terms (3.3). Then, the number [z n ]L ∞ (z) of plain terms of size n admits the following asymptotic approximation: More specifically, ρ is the positive real root of the polynomial z 3 + z 2 + 3z − 1 = 0 whereas π . 3.1. Variables in plain lambda terms. We start our investigations with the variable distribution in plain λ-terms.
Proposition 3.2. Let X n be a random variable corresponding to the number of variables in a random plain λ-term of size n. Then, after standardisation, X n converges in law to a Gaussian distribution. Its expectation µ n and variance σ 2 n satisfy (up to numerical approximation)  Proof. Let us consider a bivariate generating function L ∞ (z, u) in which [z n u k ]L ∞ (z, u), i.e. the coefficient standing by z n u k , denotes the number of plain λ-terms of size n with k variables (equivalently k occurrences of 0). Marking all occurrences of 0 in the defining equation (3.2) of L ∞ (z), see Figure 3, we obtain the following combinatorial specification for L ∞ (z, u): The dominant singularity ρ(u) of L ∞ (z, u), is the real positive root of the radicand expression Moreover, the singularity has a non-zero derivative at u = 1.
According to Remark 2.16 (moving singularity framework) this yields a Gaussian limit law.

D R A F T
The mean and the variance of the resulting normal distribution can be computed by Proposition 2.15 using the values ρ (1) and ρ (1) from the partial derivatives of F (z, u). Since F (ρ(u), u) = 0, after taking the derivative with respect to u we obtain .
whose degree depends on the specific size model parameters a, b, c, d denoting the weights of zero, successor, abstraction and application, respectively, cf. [27]. Specifically, Consequently, for most admissible size notion parameters we cannot explicitly obtain analytic expression of ρ(u). Instead, in order to check the premises of the multivariate central limit theorem we have to work with the implicit equation (3.10).
The main technical obstacle lies in the verification of the requested variability condition The remaining argumentation is virtually identical to the one presented for the specific case of a = b = c = d = 1.

3.2.
Redexes in plain lambda terms. Basic marking techniques allow us also to investigate limiting distributions of various sub-patterns in plain λ-terms. In what follows we study the fundamental pattern of β-redexes. Recall that a β-redex is a λ-term in form of (λN )M where N and M arbitrary λ-terms. In other words, a sub-pattern which can be depicted as Proposition 3.4. Let X n be a random variable denoting the number of β-redexes in a random plain λ-terms of size n. Then, after standardisation, X n converges in law to a Gaussian distribution with expectation µ n and variance σ 2 n satisfying (up to numerical approximation) Proof. We start with establishing a formal specification for β-redexes in plain λ-terms. For that purpose, we introduce an auxiliary class N consisting of de Bruijn indices and terms in application form. Note that if N is in application form, then either N = (λM )P , i.e. N is a β-redex, or N = M P where M belongs itself to class N . Furthermore, with N at hand, we notice that each plain λ-term N takes either the form N = λM for some λ-term M , or is an element of N . We can therefore write down the following joint combinatorial specification (3.12) for L ∞ (z, u) and N (z, u) corresponding to L ∞ and N , respectively, using the variable u to mark the redex occurrences in N , see Figure 4: With the closed-form formula (3.13) we can now easily access the dominant singularity ρ(u) of L ∞ (z, u) carried by the radicand expression Consequently, a straightforward application of the multivariate central limit theorem finishes the proof, see Proposition 2.15. Proposition 3.5. Let X n = (X n(var) , X n(red) , X n(suc) , X n(abs) ) be a random vector denoting a) the number X n(var) of variables; b) the number X n(red) of β-redexes; c) the number X n(suc) of successors, and d) the number X n(abs) of abstractions in a random plane λ-terms of size n. Then, after standardisation, the random vector X n converges in law to a multivariate Gaussian distribution satisfying (up to numerical approximation) Proof. Like in the corresponding proofs for single parameters (see Proposition 3.2 and Proposition 3.4) we base our proof on the multivariate central limit theorem (see Proposition 2.15). We start with a joint multivariate specification for plain λ-terms including the investigated combinatorial parameters marked using auxiliary variable vector u = (u (var) , u (red) , u (suc) , u (abs) ) corresponding to respective components of X n , see Figure 5 (cf. Figure 4). Figure 5. Marking abstractions, variables, successors and redexes in plain λ-terms.
Let u denote the vector of considered marking variables. Such a specification, when converted into a system of functional equations involving the generating functions L ∞ (z, u) and A(z, u) associated with L ∞ and A, respectively, yields Furthermore, reformulating (3.15) we find that In this form, we can access the dominant singularity ρ(u) of L ∞ (z, u) solving ∆(z, u) = 0 for z as a function of u. Since (3.17) is a cubic equation in z(u) we have access to the analytic form of its roots. We can therefore easily check that only one solution z(u) of (3.17) coincides at u = 1 with the dominant singularity ρ corresponding to plain terms (3.5). Consequently, the generating function L ∞ (z, u) admits a corresponding Puiseux expansion in form of where both α(z, u) and β(z, u) are analytic and non-vanishing near (z, u) = (ρ, 1).
The required variability condition can be directly verified once the analytic form of ρ(u) is calculated. At this point, the multivariate central limit theorem is readily applicable yielding the asserted convergence. A direct computation gives the vector of corresponding means and covariance matrix, see (3.14).
Remark 3.6. Arguably, the most interesting part of the covariance matrix (3.14) is the sign of the correlations and the absolute values of associated variances. Note that in the natural size notion, the number of abstractions has greater variance than other constructors. Interestingly, the number of β-redexes is positively correlated with the number of abstractions. Not surprisingly, all other parameters are negatively correlated.

3.4.
Head abstractions in plain lambda terms. In this section we turn to head abstractions in plain λ-terms showing that the corresponding random variable converges to a discrete geometric distribution. Proposition 3.7. Let X n be a random variable denoting the number of head abstractions in a random plain λ-term of size n. Then, X n converges in law to a geometric distribution Geom(ρ) with parameter ρ. Specifically, Proof. Note that each λ-term starts with a (perhaps empty) sequence of consecutive head abstractions followed either by a de Bruijn index or an application of two terms (recall that abstractions therein are no longer considered to be head abstractions). Consequently, the set L ∞ of plain λ-terms can be specified using the auxiliary class H of head abstractions as depicted in Figure 6. The bivariate generating function L ∞ (z, u) corresponding to plain λ-terms with marked head abstractions satisfies therefore In such a form we immediately note that the dominant singularity ρ(u) of L ∞ (z, u) does not depend on u. In fact, it is constant and equal to ρ (i.e. the dominant singularity of L ∞ (z), see (3.5)) as, by construction, L ∞ (z, 1) = L ∞ (z). Figure 6. Marking head abstractions in plain terms.
Following the fact that L ∞ (z) admits a Puiseux expansion near z = ρ we can represent L ∞ (z, 1) 2 as a corresponding Puiseux series in form of Given (3.20) we further note that for u fixed sufficiently close to 1.
At this point, we apply Proposition 2.12 and find that the limit probability generating function p(u) associated with L ∞ (z, u) satisfies which indeed corresponds to the asserted limit geometric distribution (3.19) of X n .
Remark 3.8. The mean number µ n of head abstractions in a random λ-term of size n satisfies The limit mean (3.24) is close to 0.4196. Consequently, sufficiently large plain terms have, on average, less than one head abstraction. Such a result stands in sharp contrast to the canonical representation of David et al. [19] where the number of head abstractions in a random (closed) λ-term of size n is at least of order n/ log n; in particular, it is a moderately growing function of n.

De Bruijn index values in plain lambda terms.
In the current subsection we focus on the distribution of de Bruijn index values in random λ-terms.
Proposition 3.9. Let X n be a random variable denoting the de Bruijn index value m of an index m taken uniformly at random from a random plain λ-term of size n. Then, X n converges in law to a geometric distribution Geom(ρ) with parameter ρ. Specifically, Proof. Let V (m) n be a random variable denoting the number of de Bruijn indices m in a plain λ-term of size n. Marking the index m in the specification for plain terms, see Figure 7, we note that the bivariate generating function L (m)

Figure 7.
Marking the index m in plain λ-terms.
Then, taking the partial derivative ∂u at u = 1 of both sides of (3.26) we arrive at corresponds to the weighted sum over all plain λ-terms of size n where each term comes with weight equal to the total number of occurrences of index m in it. Let Taking the weighted sum over all m 0 of both sides of (3.27) such that weight corresponding to m is w m we obtain denotes the weighted sum of all λ-terms of size n where each term has weight equal to its total number of variables.
In other words, variable w in S ∞ (z, w) marks the probability mass function corresponding to X n . Solving (3.28) we find that where the latter equality follows from the formula (3.3) for L ∞ (z).
Furthermore, given the known Puiseux expansion of the right-hand side square-root expression (see, e.g. Proposition 3.1) we easily note that S ∞ (z, w) admits a Puiseux series expansion required for Proposition 2.12. Finally, a routine computation verifies the asserted geometric limit distribution Geom(p) corresponding to the variable X n .
Remark 3.10. The mean value µ n of a random index with a random plain terms satisfies The limit value (3.30), coinciding in the natural size notion with the corresponding mean for head abstractions (3.24), is close to 0.4196. This result stands, again, is sharp contrast to the canonical model of David et al. [19] in which variables (in closed terms) tend to be arbitrarily far from their binding abstractions. Let us point out that such a disparity is a consequence of the different combinatorial models for λ-terms. In the canonical representation, the distance from a variable to its binding abstraction does not contribute to the weight of the corresponding variable (all variables have weight zero). On the other hand, in the de Bruijn representation weights of bound indices are proportional to the distances to their binding abstractions. Consequently, de Bruijn indices in large random λ-terms tend to be, on average, shallow. This central difference of both combinatorial models leads to remarkably contrasting asymptotic properties, including normalisation of large random λ-terms, cf. [19,7,5].

D R A F T
3.6. Leftmost-outermost redex search. In order to evaluate an expression represented by a λ-term M we iteratively choose a β-redex (i.e. a subterm in form of (λP )Q) in M and contract it using β-reduction, see e.g. Figure 8 (cf. Section 2.1).
The order of evaluation (i.e. the order in which redexes are contracted) has a crucial impact on the computational effect of repeated β-reduction. Consider λ-terms Ω := (λ00)(λ00) and P := (λλ0)Ω. Note that Ω can be evaluated ad infinitum as Ω → β Ω. If we choose the Ω redex over the main (λλ0)Ω redex in P , then P → β P . Otherwise, if we choose the leftmost-outermost redex (λλ0)Ω instead of Ω we note that P → β λ0 since the topmost abstraction in P has no associated indices. We cannot continue β-reducing λ0 as it contains no more redexes (such terms are in so-called (β-)normal form) and so we terminate the evaluation process.
Terms from which it is possible to reach a β-normal form using repeated β-reduction are said to be normalisable. Since it is possible to emulate computations of arbitrary Turing machines in λ-calculus by means of normalisable terms, it is undecidable to determine whether a given λ-term is normalisable or not, see [2]. Remarkably, the following classical result provides a normalising evaluation strategy guaranteed to find normal forms of normalisable λ-terms.
Theorem 3.11 (Standardisation theorem, see e.g [2]). Let N be a normalising λ-term. Then, the iterated process of applying β-reduction to the leftmost-outermost redex in N leads to the (unique) normal form of N .
When searching for the leftmost-outermost redex in a given λ-term, we traverse the associated λ-tree in a depth-first manner favouring left branches of application nodes. Both cases of handling abstractions and indices are trivial -when visiting an abstraction node, we immediately recurse to its subterm looking for the leftmost-outermost redex; indices cannot contain β-redexes hence after arriving at an index, we terminate the traversal.
The most interesting part of the traversal algorithm lies in visiting application nodes. Suppose that we are currently visiting an application node. If its left branch starts with an abstraction node, we terminate the traversal as we have just found the leftmost-outermost redex. Otherwise, we have two possibilities based on whether the left branch is in β-normal form or not. If it is, we move into the left branch and, as we cannot find a β-redex, return from the recursion moving to the corresponding right branch. Otherwise if the left branch is not in β-normal form, we handle it recursively, however since it contains a β-redex, we eventually terminate the search before ever returning to visit the right branch.
Let us note that such a traversal algorithm, henceforth abbreviated to LO, introduces some a priori non-trivial computational overhead to the execution cost of β-reduction in λ-calculus. If carried out on a β-redex, LO takes constant time to run. In contrast, when carried out on a λ-term in β-normal form, LO traverses nearly the whole λ-term. Such a varying traversal cost poses the natural question of the average-case performance of LO. In what follows, we show that the execution cost of LO, when viewed as a random variable ranging over random λ-terms, tends to a discrete limit law with constant expectation. Consequently, finding the leftmost-outermost redex introduces, on average, only a constant overhead to the cost of carrying out a single β-reduction.
Proposition 3.12. Let X n be a random variable denoting the number of nodes visited by the LO traversal algorithm while searching for the leftmost-outermost β-redex in a random plain λ-term of size n. Then, X n converges in law to a discrete limiting distribution with computable probability generating function and constant expectation. The corresponding means µ n satisfy (up to numerical approximation) Proof. We start with providing a combinatorial specification for plain λ-terms marking all nodes that are visited by the leftmost-outermost redex traversal algorithm LO. For that purpose we introduce the following three auxiliary classes: • N for the class of β-normal forms; • M for the class of so-called neutral terms, and • A for the class of de Bruijn indices and plain λ-terms starting with an application. In order to give their combinatorial specification we follow the presentation of [7] and give a joint description for the class N of β-normal forms and the associated class M of neutral terms. A β-normal form is either a plain λ-term starting with an abstraction followed by another β-normal form, or a neutral term. A neutral term, in turn, is either a de Bruijn index, or an application of a neutral term to a β-normal form, see Figure 9. Such a specification provides the following system of functional equations defining the generating functions N (z) and M (z) corresponding to the class of β-normal forms and neutral terms, respectively: Solving (3.32) we note that M (z) and N (z) satisfy With both N (z) and M (z) at hand, we can now proceed and give the announced specification for plain λ-terms with marked nodes visited during the execution of LO, see Figure 10. Let T be a plain λ-term in the class A. Certainly, if T is a de Bruijn index, we mark only its topmost atom (i.e. the topmost successor or 0 if T is equal to 0). Otherwise, T starts with an application. If T is a β-redex, we mark two atoms -the topmost application and the abstraction Figure 10. Specification for plain λ-terms with marked nodes following the execution of the redex finding traversal algorithm LO.
starting the left branch of T . Remaining nodes are left unmarked. If T is not a redex however its left branch is a neutral term, we mark the entire left branch as well as the topmost application. Finally, if the left branch of T is not a neutral term, we take the difference class A \ M of A and (marked) neutral terms M for the left branch. The right branch remains unmarked. Such a specification yields the following system of functional equations defining the generating functions L ∞ (z, u) and A(z, u) corresponding to classes L ∞ and A, respectively: Knowing a priori that L ∞ (z, 1) corresponds to the generating function for plain λ-terms (3.3) we solve system (3.34) and find that L ∞ (z, u) satisfies What remains to finish the proof is to check that L ∞ (z, u) meets the premises of Proposition 2.12. Specifically, it admits a single, fixed dominant singularity ρ(u) = ρ and a corresponding Puiseux expansion in form of Denote the denominator expression of (3.35) as F (z, u). Note that F (ρ, 1) > 0 and hence in a fixed neighbourhood of u = 1 the denominator F (z, u) is non-zero. Consequently, L ∞ (z, u) shares its (fixed) dominant singularity with L ∞ (z, 1). The required form of the Puiseux expansion of L ∞ (z, u) follows as a consequence of the Puiseux expansions of both L ∞ (z, 1) and its power L ∞ (z, 1) 2 , see (3.21). A direct computation gives access to the corresponding probability generating function (omitted for brevity) and also the specific limit mean (3.31) of X n .
Remark 3.13. The generating function M (z) associated with neutral terms, see (3.32), also corresponds to the well-known class of Motzkin numbers enumerating, inter alia, plane unarybinary trees, see e.g. [25,Note I.39,p.68]. We refer the curious reader to [7] for a size-preserving correspondence between neutral terms of size n and Motzkin trees with n nodes.

3.7.
Height profile in plain lambda terms. The goal of this section is to obtain some insight into the mean height profile of plain λ-terms. We distinguish essentially two different notions of height. The first notion which we call unary height, takes into account only the number of abstractions above the considered node. The second notion concerns the natural height of a tree, i.e. the number of predecessors of a node which can be either abstractions or applications. In both situations, the semi-large powers theorem (see Proposition 2.6) can be applied. Consequently, the mean profile is always related to the Rayleigh distribution.
We are interested in mean profile of different types of nodes. In what follows we consider three types of mean profiles: • the mean (unary or natural) profile of leaves; • the mean (unary or natural) profile of abstractions, and • the mean (unary or natural) profile of applications.
Proposition 3.14. Let H n be a random variable denoting the unary (respectively natural) height of a randomly chosen variable (application or abstraction) in a random plain λ-term. Then, with x in any compact subinterval of (0, +∞), H n admits a limiting Rayleigh distribution with C ≈ 4.30187 for unary height, and C ≈ 1.27162 for the natural height. The average value of mean height is √ πn/C whereas the peak value is attained at k * = √ 2n/C. More specifically, the average number of • variables at unary height k is asymptotically equal to 2.839 ke −C 2 k 2 /4n ; • variables at natural height k is asymptotically equal to 0.248 ke −C 2 k 2 /4n ; • abstractions at unary height k is asymptotically equal to 2.383 ke −C 2 k 2 /4n ; • abstractions at natural height k is asymptotically equal to 0.208 ke −C 2 k 2 /4n ; • applications at unary height k is asymptotically equal to 2.839 ke −C 2 k 2 /4n ; • applications at natural height k is asymptotically equal to 0.248 ke −C 2 k 2 /4n .
Proof. We start with the unary height profile of variables. Consider generating functions C k (z, u) corresponding to plain λ-terms with u marking the variables at the unary height k. These functions satisfy the following system of equations: Taking partial derivatives of each equation in (3.38) with respect to u, we obtain a linear system for derivatives of generating functions. Setting u = 1 we can solve this linear system and obtain Furthermore, a direct computation provides the following Puiseux series expansions as z → ρ: Consequently, the numbers M n,k of variables at unary level k in a random plain lambda term of size n satisfy .
An application of the semi-large powers theorem (see Proposition 2.6) and the transfer theorem (see Proposition 2.5) to the numerator and denominator, respectively, result in the following asymptotic estimate: After normalising by the total sum n k=0 M n,k we obtain the declared limiting distribution.

D R A F T
Next, we turn to the case of natural height profile of variables. Consider generating functions C k (z, u) where now u marks the variables at the natural height k, instead of the unary height. As in the previous case, we obtain a system of equations Again, taking partial derivatives ∂u at u = 1 we can solve the resulting system and find that In this case, a direct computation verifies that the function z + 2zL ∞ (z) admits a Puiseux series expansion in form of Consequently, this estimate yields a Rayleigh distribution with parameter 1.27162265120953. In particular, the average number M n,k of variables at natural height k in a random plain λ-term of size n satisfies In order to mark remaining nodes, i.e. abstractions and applications, it is sufficient to change the first equation of the system (3.38). Accordingly, only the constant multiple behind the mean tree width at level k changes. For abstractions, we obtain, respectively, for unary height whereas for natural height. This change gives the constants 2L ∞ (ρ) = (1−ρ) ρ ≈ 2.383 for unary height, and 2ρ 2 L ∞ (ρ) = ρ(1 − ρ) ≈ 0.208 for the natural height, respectively.
Similarly, marking applications yields a change in the first equation for the generating function for unary height, and for natural height. We obtain the constants 2L 2 ∞ (ρ) = (1−ρ) 2 2ρ 2 ≈ 2.839 for unary height, and ≈ 0.248 for natural height, respectively. The mean value is obtained by using the integral approximation for the ratio of sums kM n,k / n k=0 M n,k whereas the peak value is obtained by finding the maximum value of M n,k as a function of k.

Empirical results
The analysis of various combinatorial parameters corresponding to plain λ-terms can be approached using standard proof templates, typical for algebraic structures. Starting from a combinatorial specification associated with the investigated parameter, its analysis follows as an, either direct or indirect, examination of the resulting system of multivariate generating functions and related singularities, see [25,Chapter IX].
Alas, such a general approach to the analysis of combinatorial parameters related to plain λ-terms does not readily transcend to closed λ-terms. Standard analytic tools, such as Bender and Richmond's multivariate central limit theorem (see Proposition 2.15) or the Drmota-Lalley-Woods theorem (see Proposition 2.9) require that the investigated system of generating functions is, inter alia, finite. Although this is true for plain terms, closed λ-terms give rise to a more involved, infinite system of generating functions (2.3) based on the hierarchical notion of m-openness. Consequently, closed λ-terms escape the usual course of parameter analysis, successfully carried out in the case of plain terms, see Section 3.
In the current section we present two, somewhat complementary, empirical approaches to the analysis of combinatorial parameters related to closed λ-terms. We start with an experimental scheme based on the recent development of efficient Boltzmann samplers for closed λ-terms due to Gittenberger, Bodini and Gołębiewski [15]. We generate large, uniformly random (conditioned on size) closed λ-terms and collect empirical data for various interesting parameters related to generated terms. The second approach is based on the empirical evaluation of, appropriately truncated, systems of multivariate generating functions. We compute the coefficients of the corresponding formal power series and consequently analyse the distribution of investigated parameters for relatively small term sizes.
The benefits of such empirical approaches are threefold. Firstly, the empirical parameter distribution for large random λ-terms is closely related to its theoretical, limiting counterpart; hence, develops solid intuitions underlying the successful analysis of a broad class of combinatorial parameters, see Section 6. Secondly, the empirical data for various term sizes provides insight in the convergence rates at which the considered random variables tend to respective limit laws, relating them in effect with practically attainable term sizes. Finally, experimental results for large closed λ-terms provide strong evidence for conjectures regarding practical, though even more advanced parameters escaping the restrictions of our analysis.

4.1.
Empirical evaluation of Boltzmann samplers. Boltzmann samplers are a prominent sampling framework meant for the random generation of large combinatorial structures [24]. Given an admissible combinatorial specification, it is possible to construct an appropriate sampler whose outcome are uniformly random (conditioned on size) structures built according to the input specification. Although Boltzmann samplers are guaranteed to return uniformly random structures, their eventual size is not deterministic, but instead random. Following a proper calibration during the construction of Boltzmann samplers, the randomness of their output structure size can be controlled so that it is centred around a given (not necessarily finite) expectation. A final rejection phase, dismissing structures of undesired properties, such as for instance inadmissible size, provides the means of controlling the generated structures.
In order to conduct our experiments, we have implemented three kinds of Boltzmann samplers 1 for plain, closed and so-called h-shallow λ-terms, i.e. closed terms in which de Bruijn indices do not exceed the shallowness bound h, see e.g. [27,5]. The respective sampler for closed λ-terms follows the ideas of [15]. For each investigated type of terms we sample k = 50, 000 terms of sizes in the interval [20, 000; 50, 000] using a dedicated singular Boltzmann sampler (i.e. a Boltzmann sampler with unbounded outcome size expectation, see e.g. [13]) whose controlling parameter is calculated numerically with accuracy of order 10 −9 . For closed h-shallow terms, we fix h = 30. Finally, we record several combinatorial parameters related to so obtained terms and use them in the subsequent evaluation.
For each considered parameter and term type, we plot a histogram relating the collected samples and the respective parameter values. The x-axis denotes the parameter value (either raw or aptly normalised) whereas the y-axis corresponds to the number of samples attaining the associated value. Averages, variances and standard deviations corresponding to the investigated parameter are rounded up to the fifth decimal point and summarised in respective tables. For brevity, we include only histograms for plain and closed λ-terms. We comment on the missing h-shallow λ-terms at the end of the current section. 4.1.1. Head abstractions. We start with head abstractions, see Section 3.4. Figure 11 depicts the distribution histogram of the obtained data sets. The corresponding numerical approximations of averages, variances and standard deviations are listed in Table 2.    Remark 4.1. The distribution corresponding to plain terms resembles a geometric law, as expected by Proposition 3.7. The observed average corresponds closely to the theoretical limit average (3.24). Notably, head abstractions in closed λ-terms do not follow the same distribution as their plain counterpart.

4.1.2.
Free variables in plain lambda terms. When viewed as programs of an abstract programming language, variables in λ-terms become formal arguments bound to abstractions introducing them in respective name scopes (namespaces). In this perspective, free variable occurrences correspond to expressions available in the global namespace such as for instance predefined operators or constants, see e.g. [38]. Figure 12 depicts the distribution histogram of the obtained data set for plain terms (recall that closed λ-terms contain, by definition, no free variable occurrences). The corresponding numerical approximations of averages, variances and standard deviations are listed in Table 3. Remark 4.2. The empirical distribution of free variables in plain terms resembles closely a geometric law P(Geom = k) = (1 − p) k p with parameter p ≈ 0.14815. Note however that the observed distribution is not geometric; its global maximum is not attained at the parameter value corresponding to the lack of free variables, see Figure 12. Remarkably, the observed value p is a D R A F T quite good estimate for the probability that a sufficiently large random plain λ-term is closed. Using the numerical estimate for the multiplicative constant C in the asymptotic growth rate C · n −3/2 ρ −n corresponding to closed λ-terms [27, Section 6.1] one can find that this probability is in fact close to 0.12840.

4.1.3.
Leftmost-outermost redex search. The next parameter we evaluate is the cost of finding the leftmost-outermost β-redex in plain and closed λ-terms, see Section 3.6. Figure 13 depicts the distribution histogram of the obtained data sets. Corresponding numerical approximations of averages, variances and standard deviations are given in Table 4.    In the current subsection we are interested in the degree to which variables connect various subterms in random λ-terms. In other words, in the proportion of open subterms in a random term. Let us note that such a parameter provides some insight in the extend to which subterms, viewed as components of the entire term (i.e. functional program) depend syntactically on each other. In this perspective, terms consisting of many closed subterms correspond to programs with equally many independent subprograms. On the other hand, λ-terms with few open subterms represent computations in which various parts of the program depend on each other through common variable usage. Since each index n consists of n open proper subterms which do not contribute to the intended degree of functional dependence (variables in functional programs are atomic expressions) for our current considerations we assume that n is itself atomic, i.e. does not have proper subterms. Consequently, the total number of subterms of a given λ-term T becomes equal to the number of indices, applications and abstractions occurring in T , without accounting for successors.
The resulting distribution histograms for plain and closed λ-terms are depicted in Figure 14. The x-axis denotes the (normalised with respect to the total number of subterms as defined above) number of open subterms whereas the y-axis corresponds to the number of terms attaining the corresponding value. Respective averages, variances and standard deviations are listed in Table 5.      Figure 15 depicts the empirical distribution histograms for plain and closed λ-terms, respectively. The x-axis denotes the proportion of binding abstractions among all abstractions. Corresponding averages, variances and standard deviations are presented in Table 6.   In the current subsection we turn to extremal statistics related to binding abstractions. Specifically, we investigate the maximal number of variables bound to a single abstraction in random λ-terms. Figure 16 illustrates the related distribution histogram for plain and closed λ-terms, respectively. Numerical approximations of averages, variances and standard deviations are given in Table 7.    Consequently, the obtained average suggests a similar C · log 1/ρ n type of behaviour occurring, for instance, in the analysis of longest runs in random words, see [25,Example V.4]. Nonetheless, unlike the limit distribution of longest runs, the discovered distribution does not suggest a limit concentration due to the significant values of the observed variances.  [15] and efficient techniques allowing to obtain numerical estimates for the relative asymptotic density of m-open terms in the set of all plain λ-terms [27] it is possible to approximate the limit distribution of the m-openness parameter in plain terms. However, in order to avoid the laborious computations involved in this approach, we offer a simple Monte Carlo approximation scheme. Figure 17 depicts the distribution histogram of the obtained data set for plain λ-terms. The corresponding average, variance and standard deviation are listed in Table 8.  Generalised m-openness of plain and closed terms. Though the notion of m-openness is defined only for non-negative values of m, we propose a natural generalisation to all integers in the following manner. We say that a closed λ-term T is m-closed 2 for m 0 if it is possible to discard m head abstractions of T and still retain a closed λ-term. In other words, a closed term is m-closed if its top m head abstractions are non-binding. Note that this new parameter provides a degree of term closeness; the higher the factor m of an m-closed λ-term, the more closed it is. Like in the case of m-open λ-terms, closed λ-terms are 0-closed. Moreover, if a term is (m + 1)-closed, then it is also m-closed. Figure 18 depicts the distribution histogram of the obtained data sets for plain and closed λ-terms, respectively. The x-axis denotes the generalised m-openness factor whereas the yaxis corresponds to the number of terms attaining the corresponding cost value. Numerical approximations of averages, variances and standard deviations are listed in Table 9.    Remark 4.7. The empirical distribution corresponding generalised m-openness of closed λ-terms suggests that the vast majority of closed terms are 0-closed but not 1-closed. Given the related distribution of head abstractions, see Figure 11, such a result suggests that if a closed term has a leading head abstraction, it is more likely that it is in fact binding.

D R A F T
Let us note that all of the presented empirical histograms exhibit a close correspondence between parameters of closed λ-terms and their corresponding equivalents in h-shallow terms. Such a result should not be surprising given the exponential convergence speed at which h-shallow λ-terms of size N tend to closed λ-terms of size N as h → ∞, see Section 5, cf. [15]. Consequently, virtually all presented histograms for closed λ-terms are also correct histograms for h-shallow terms.

4.2.
Empirical evaluation of systems of generating functions. The probability generating function p n (u) of a random variable X n corresponding to a certain parameter inside a random λ-term of size n, plain or closed, can be obtained by evaluating the bivariate generating function L(z, u) and taking the respective ratio of coefficients, see Section 2.3: Remarkably, for statistics whose limit laws are discrete distributions, the convergence is already evident at n = 20. We empirically evaluate the coefficients of generating functions whenever the associated functional equations are available.

D R A F T
For closed λ-terms, the corresponding systems of functional equations are infinite and hence cannot be directly evaluated. Fortunately, truncating the infinite system at the kth level and replacing the kth function L k (z, u) by its infinite counterpart L ∞ (z, u) yields an exponentially small difference in the dominant term coefficients, see Section 5. In what follows, we truncate the systems at height k = 15.

Head abstractions.
We return to head abstractions in plain and closed λ-terms. The distribution of this parameter for large term sizes was discussed in Section 4.1.1. Here, we study this parameter for small term sizes. The equations for corresponding generating functions are presented in Sections 3.4 and 6.4.   The convergence process for n ∈ [1,15] is depicted in Figure 19. Head abstractions in plain terms admit a geometric distribution while the number of head abstractions in closed lambda terms converges to a certain computable, discrete distribution.

4.2.2.
Leftmost-outermost redex search. As already mentioned in Section 4.1, the distribution associated with the cost of finding the leftmost-outermost β-redex in plain and closed λ-terms tends to a discrete limiting distribution. The functional systems for bivariate generating functions can be found in Sections 3.6 and 6.6.

D R A F T
We depict the convergence process in Figure 20. For both plain and closed λ-terms, a remarkable detail can be observed. For λ-terms of small size, the distribution has a peak at the value corresponding to the term size. Note that such a phenomenon is related to β-normal forms, witnessing the worst-case time complexity of the traversal algorithm. For larger terms, the proportion of β-normal forms in all lambda terms becomes exponentially negligible.

4.2.3.
Free variables and m-openness in plain terms. Next, we consider two parameters of plain λ-terms that require advanced marking techniques. The functional equations for the cases of free variables and m-openness are given in subsequent Sections 6.1 and 6.3.   The convergence process is depicted in Figure 21. Let us remark that the convergence process corresponding to the number of free variables proceeds more slowly than all the other statistics that we consider in the current paper.

4.2.4.
De Bruijn index profile. The final pair of plots concerns the profile of variables or, in other words, the profile of de Bruin indices in random plain and closed λ-terms. In Sections 3.5 and 6.5 we show that both parameters tend to geometric limiting distributions. Figure 22 depicts the probability distributions for lambda terms of the size in the interval n ∈ [1, 15].

Infinite systems of algebraic equations
In this section we present our main technical contribution within analytic combinatorics meant for dealing with certain recursive, infinite systems of generating functions, i.e. Theorem 5.9. Although our results admits broader applications than the one presented in the current paper, for consistency, we focus only on the enumeration and statistical analysis of various combinatorial parameters in closed λ-terms. Our proof is motivated by the papers [15,27] where the authors consider the enumeration problem of closed λ-terms without additional marking parameters. For that purpose, they construct a series of sequences (L m,N ) m 0 that approximate m-open λ-terms with convergence rate of order O 1 √ N . In what follows we improve this rate to an exponential one. Moreover, we abstract the considered system away from λ-terms in the de Bruijn notation, allowing for an analysis of more general varieties of combinatorial systems.
Remarkably, in [27,Section 5] the authors obtain the asymptotic estimate Θ(n −3/2 ρ −n ) for the number of closed lambda terms of size n with the difference between upper and lower bounds on the constant multiple within 10 −7 . Let us notice that this technique is quite different from the technique used in [15]. Our approach is more similar to the former method, however admits certain improvements. Specifically, we simplify the procedure of improved constant estimation, give a rigorous proof about the exponential convergence rate of the constant multiple, and provide more general and simpler tools based on the properties of the Jacobian of the limiting system (instead of exploiting the particular form of this equation in the case of λ-terms).

5.1.
Calculus techniques for formal power series. We start with some basic notation and properties of coefficient-wise inequalities on formal power series and some geometric results on matrices of formal power series. In what follows, we denote the spectral radius of matrix A, i.e. the largest absolute value of its eigenvalues, by r(A). where z = (z 1 , . . . , z d ), n = (n 1 , . . . , n d ) and z n = z n 1 1 z n 2 2 . . . z n d d the domination f (z) g(z) means that for each vector of indices n it holds a n b n . Certainly, if a combinatorial class F is included in a combinatorial class G, then the generating functions f (z) and g(z) corresponding to respective classes satisfy f (z) g(z). The same holds for marked classes and associated multivariate generating functions. Finally, for vectors A and B of identical (however not necessarily finite) dimension, we write A B to denote a coordinate-wise domination of respective components.
In real analysis, the squeeze lemma is a theorem regarding the limit of the sequence which is upper-and lower-bounded by two sequences with the same limit value. The following statement is a variant of this lemma, stated in the context of formal power series admitting coefficient asymptotics suitable for analysis of corresponding limit laws, see Section 2.3.
Lemma 5.2 (Squeeze lemma for formal power series). Let z ∈ C and u = (u 1 , . . . , u r ) ∈ C r . Assume that f (z, u), (h m (z, u)) m 0 , and (g m (z, u)) m 0 are multivariate formal power series in (z, u) with non-negative coefficients, such that for every n and m, the functions • there exists a function A(u) analytic near u = 1 such that A(1) = 0 satisfying uniformly in a complex vicinity of u = 1 Then, uniformly in a complex vicinity of u = 1, as n → ∞: Proof. We divide the proof into two parts. We start with showing that the statement holds for vectors u whose components are real positive numbers. Next, we extend this property onto all complex components of u.
Taking the limit with respect to m we note that condition (5.5) guarantees that for arbitrary small ε > 0 and sufficiently large N (again, independent of u) we have In other words, for values of u ∈ R r within a fixed vicinity of u = 1 it holds Now, let us consider u ∈ C r . Note that since for each n, m 0 the formal power series [z n ]f (z, u), [z n ]h m (z, u) and [z n ]g m (z, u) are polynomials in u, they are analytic in C r . Moreover, as ψ n,m (u) := [z n ]f (z, u)−[z n ]h m (z, u) is a polynomial with non-negative coefficients, for every u ∈ C r we have |ψ n,m (u)| ψ n,m (|u|) and, consequently, After dividing both parts by C n |A(u)||B(u) n | we obtain Following condition (5.4) and the estimate (5.9) for u ∈ R r , we note that for every ε > 0 there exists N := N (ε) independent of m and u, such that for all n > N we further have Since ε does not depend on m, we can take the limit with respect to m. Given condition (5.5) we note that for sufficiently large n we have Hence, uniformly in a fixed complex vicinity of u = 1 (5.14) lim n→∞ [z n ]f (z, u) C n A(u)B(u) n = 1, which finishes the proof.

D R A F T
Remark 5.3. Using the same technique, higher-order error terms can also be transferred, provided that the function is squeezed between two sequences of formal power series with known Puiseux expansions. In such a situation, higher-order terms correspond to coefficients obtained from the summands of Puiseux expansion The next lemma is a formal power series analogue of Lagrange's mean value theorem.
Lemma 5.4 (Mean value lemma for formal power series). Let f (z) and g(z) be two formal power series such that f (z) g(z). Assume that Ψ(t) = n 0 ψ n t n is a formal power series with non-negative coefficients. Then, Likewise, the statement holds for vectors of formal power series in a coordinate-wise manner.
Proof. Coefficient-wise subtraction of the left-hand side of (5.16) yields ). (5.17) The first two equalities hold as a consequence of formal power series composition and the identity a n − b n = (a − b) n−1 i=0 a i b n−i−1 . The subsequent domination follows from the assumption that f (z) g(z).

5.2.
Forward recursive systems. The following definition of forward recursive systems encapsulates the general, abstract features of the infinite systems that we consider in the current paper. Core characteristics of the infinite system corresponding to closed λ-terms are abstracted and divided into three general conditions which are sufficient to access the asymptotic form of respective coefficients.
Then, we say that the system (5.19) is forward recursive. Furthermore, consider a limiting system in form of (5.20) 2 , z, u), respectively, and moreover all series K ∞ i are analytic at ( 1 , 2 , z, u) = (0, 0, 0, 1). In this setting, we say that the system (5.19): a) is infinitely nested if K m ( 1 , 2 , z, u) K ∞ ( 1 , 2 , z, u) for each m 0; b) tends to an irreducible context-free schema if it is infinitely nested and its corresponding limiting system (5.20) satisfies the premises of the Drmota-Lalley-Woods theorem (see Proposition 2.9) i.e. is a polynomial, non-linear system of functional equations which is algebraic positive, proper, irreducible and aperiodic; c) is exponentially converging if there exists a vector A(z, u) = (A 1 (z, u), . . . , A d (z, u)) and a function B(z, u) such that: • for each m 0 we have u) and B(z, u) are analytic functions in the disk |z| < ρ + ε for some ε > 0 and u = 1 where ρ is the dominant singularity of the limit system (5.20) at point u = 1. Moreover, at u = 1 we have |B(ρ(u), u)| < 1 where ρ(u) is the singularity of the limiting system (5.20).
Example 5.6. Consider the infinite system corresponding to m-open λ-terms, see (2.3). Recall that the sequence (L m (z)) m 0 of respective generating functions satisfies Let us show that (5.22) is an infinitely nested, forward recursive system which tends to an irreducible context-free schema of L ∞ (z) at an exponential convergence rate. Here, each intermediate system L m consists of a single equation defining L m (z). Note that there are no additional marking variables u. The vectors K m are one-dimensional and the corresponding functions K m are given by The limiting system L ∞ (z) satisfies One can easily check that it also satisfies the premises of Proposition 2.9; hence, the considered system (5.22) tends to an irreducible context-free schema. Since the trivariate formal power series K ∞ ( 1 , 2 , z) − K m ( 1 , 2 , z) has non-negative coefficients, the system (5.22) is also infinitely nested. Moreover, the difference between the limiting equation and the mth equation computed at 1 = 2 = L ∞ (z) is equal to z m+1 1−z and corresponds to a subset of de Bruijn indices. Certainly, as m tends to infinity, this difference converges to zero exponentially fast.
Given the combinatorial relation between m-open λ-terms and plain terms, we readily obtain the requested condition L m (z) L ∞ (z). However, for arbitrary forward recursive systems (6.2) establishing that L m L ∞ is no longer so straightforward. In what follows, we prove that for this inequality to hold it is sufficient that the limiting system is well-founded.
Lemma 5.7. Let S be an infinitely nested, forward recursive system (5.19). Assume that the coefficients of the formal power series K ∞ ( 1 , 2 , z, u) corresponding to the limiting system are non-negative and the limiting system (5.20) is well-founded (i.e. algebraic proper in the sense D R A F T of Definition 2.7) and has a non-zero solution L ∞ (z, u). Finally, assume that K ∞ (0, 0, 0, u) = 0. Then, (5.25) L m (z, u) L ∞ (z, u). and L ∞ , respectively. Intuitively, L and L + are in a sense vectors of vectors, but for convenience we call them just vectors. Note that both L(z, u) and L + (z, u) satisfy λ 2 , z, u), . . .) (5.27) with λ taken as a flattening concatenation of d-dimensional vectors of free variables (λ 0 , λ 1 , . . .).

Proof. Consider the vectors
Since for each m we have K m (λ m , λ m+1 , z, u) K ∞ (λ m , λ m+1 , z, u) it also holds The idea of the current proof is to consider the difference L + (z, u) − L(z, u) and show that it is non-negative. According to (5.26) this difference can be represented as Since Ψ(λ, z, u) Φ(λ, z, u), the formal power series Ψ can be represented as a sum Ψ(λ, z, u) = Φ(λ, z, u) + Θ(λ, z, u) with Θ(λ, z, u) 0. Hence, the difference (5.29) becomes where J is some non-negative operator whereas I is the corresponding identity. The rest of the proof is dedicated to formalising the above approach, in particular showing that the Neumann series k 0 J k is well-defined. We start by noticing that due to the well-foundedness of the limiting system (5.20) we have L + (0, u) = 0. Since Φ(λ, z, u) Ψ(λ, z, u) there also holds Φ(0, 0, u) = 0. Furthermore, since there exists a unique formal power series solution of the equation L(0, u) = Φ(L(0, u), 0, u) and L(0, u) = 0 satisfies this equation, we note that indeed L(0, u) = 0.
Consider two formal infinite-dimensional variables λ and λ + which are both flattened concatenations of d-dimensional vectors of free variables. Let us show that the difference Φ(λ + , z, u) − Φ(λ, z, u) can be represented as where J = J (z, u, λ, λ + ) is some non-negative operator (i.e. infinite-dimensional matrix). Moreover, after substituting λ = L(z, u) and λ + = L + (z, u) into J , there exists a nonnegative integer K > 0 such that J K is element-wise divisible by z. Since we have established that both L + (0, u) = 0 and L(0, u) = 0, the latter condition is equivalent to the nilpotency of the operator J evaluated at λ = L(z, u), λ + = L + (z, u) and z = 0.

D R A F T
Note that the function Φ(λ, z, u) is a sum of (finite) monomials in formal variables (λ, z, u); although λ is infinitely-dimensional, each of the monomials involves only finitely many factors of λ. Let us consider the difference of arbitrary monomials in form of (5.34) x n 1 1 · · · x n k k − y n 1 1 · · · y n k k . Note that we can rewrite (5.34) as x n 1 1 · · · x n k k − y n 1 1 · · · y n k k = x n 1 1 · · · x n k k − y n 1 1 x n 2 2 · · · x n k k + y n 1 1 x n 2 2 · · · x n k k − y n 1 1 y n 2 2 x n 3 3 · · · x n k k + · · · + y n 1 1 · · · y n k−1 in the final sum is in fact a polynomial Therefore, the difference x n 1 1 · · · x n k k − y n 1 1 · · · y n k k can be represented as a scalar product of and a vector of formal power series in (x, y). Furthermore, the difference Φ(λ + , z, u) − Φ(λ, z, u) consists of the sums of such differences of monomials multiplied by appropriate non-negative coefficients. Grouping these differences together, we obtain the desired form (5.33).
Next, as an intermediate step, let us now show that the Jacobian operator ∂Ψ ∂λ (λ, z, u) is nilpotent at (z, λ) = (0, 0). For convenience, set and J 2 (u) : Then, . Since the limiting system is well-founded (see Definition 2.7) the sum J 1 (u) + J 2 (u) is nilpotent. Moreover, since each of the matrices J 1 (u) and J 2 (u) is non-negative, there exists K such that all the summands of the expanded binomial (J 1 (u) + J 2 (u)) K are zero.
On the other hand, note that following the definition of Ψ(λ, z, u) its Jacobian operator ∂Ψ ∂λ evaluated at (z, λ) = (0, 0) admits the following block structure: If we take the Kth power of this operator, it will have a block structure in which each block element consists of a sum of certain summands from the binomial expansion of (J 1 + J 2 ) K . Since the latter is a zero matrix and the summands corresponding to the blocks are non-negative and dominated by (J 1 + J 2 ) K , all such summands are also zero. This implies that the Jacobian operator ∂Ψ ∂λ (λ, z, u) evaluated at (z, λ) = (0, 0) is nilpotent with a nilpotence index at most K, i.e. the nilpotence index of the Jacobian operator ∂K ∞ ( , , z, u) ∂ evaluated at (z, ) = (0, 0). Now, let us show that the infinitely-dimensional matrix J (z, u, λ, λ + ) evaluated at (z, λ, λ + ) = (0, 0, 0) is equal to the Jacobian operator ∂Φ ∂λ evaluated at (z, λ) = (0, 0). Recall that the operator J is determined by the differences of monomials in Φ(λ + , z, u) − Φ(λ, z, u).
Monomials that have degree zero in λ or λ + cancel out because they depend only on the arguments z and u. Likewise, monomials with degree two or more in λ or λ + vanish after the substitution λ = λ + = 0. The only type of the terms that do not turn to zero upon substitution λ = λ + = 0 are terms coming from differences of monomials linear in λ or λ + . Note that such terms have the same contribution to the infinitely-dimensional matrix J as the corresponding terms of the infinitely-dimensional matrix ∂Φ ∂λ . Hence, J (z, u, λ, λ + ) evaluated at (z, λ, λ + ) = (0, 0, 0) is indeed equal to the Jacobian operator ∂Φ ∂λ evaluated at (z, λ) = (0, 0).
And so, we have established that J evaluated at λ = L(z, u), λ + = L + (z, u) and z = 0 is nilpotent. Equivalently, it meas that after substituting λ = L(z, u) and λ + = L + (z, u) into J there exists a non-negative integer K > 0 such that J K is element-wise divisible by z. Consequently, each coefficient in z of the formal sum j 0 J j is finite. Indeed, for each integer N 0, the coefficient at z N in this formal series is a sum of coefficients at z N in the finite sum K·N j=0 J j . Moreover, since J is non-negative, this sum is also non-negative. Finally, this infinite formal series is equal to (I − J ) −1 where I is the identity operator of appropriate dimension.
Remark 5.8. The condition that K ∞ (0, 0, 0, u) = 0 can be omitted but we keep it for the simplicity of the proof. For the above proof, it is enough to guarantee that each coefficient in z of the infinite formal sum j 0 J j is finite, which is equivalent to saying that some power of J is divisible by z. More details on well-founded systems can be found in [39].

5.3.
Coefficient transfer for infinite systems. Finally, we give our main theorem on the transfer of coefficients for infinitely nested forward-recursive systems.
Theorem 5.9. Let S be an infinitely nested, forward recursive system (5.19) which tends to an irreducible context-free schema at an exponential convergence rate. Then, the respective solutions L m j (z, u) of S admit for each m 0 an asymptotic expansion of their coefficients as n → ∞ in form of (5.39) [ where ρ(u) is the dominant singularity of the corresponding limiting system (5.20) and the coefficients c m j,k (u) are analytic at u = 1. Furthermore, ρ(u) is analytic near u = 1.

D R A F T
The coefficients c m j,k (u) can be approximated by taking first (M − 1) equations from (5.19) and replacing the M th equation by its following limit variant: Such a truncated system can be solved recursively. Consequently, the coefficients of respective Puiseux expansions (5.39) are estimated with an error which is exponentially small in M .
• The condition that the system S tends to an irreducible context-free schema can be replaced by a weaker condition asserting that the limiting system (5.20) admits a suitable Puiseux expansion. • Instead of limiting systems with square-root type singularities, it is also possible to consider rational systems or other types of systems. The same set of conditions is sufficient to establish the transfer of behaviours around the dominant singular point.
Proof. (Theorem 5.9) We divide the proof into three conceptual parts.
a) First, we show that each component of the difference vector L ∞ (z, u) − L m (z, u) can be upper bounded by a Puiseux series expansion whose coefficients decay exponentially fast as m tends to infinity; b) Next, we show that if for m 1 there exist coordinate-wise upper and lower bounds then the vector of functions L m−1 (z, u) obtained from the infinite system S admits upper and lower bounds L m−1 (z, u) and L m−1 (z, u) satisfying for some matrix R(z, u) with spectral radius satisfying r(R(z, u)) 1 for z ∈ [0, ρ(u)] where ρ(u) is the dominant singularity of L ∞ (z, u); c) Finally, we combine two previous results and prove that the difference between the Puiseux coefficients of upper and lower bounds of L m (z, u) can be reduced to zero.
First part. According to Lemma 5.7 we have L m L ∞ . Following the functional definitions of L ∞ and L m from the infinite system of equations, their difference can be represented as For convenience, henceforth we omit the arguments z and u. Moreover, K ∞ and K m become functions of two vector arguments (5.44) In addition, we use the nabla notation to denote the Jacobian operator We start with the following subtraction-addition trick. Each component of the vector difference is evaluated through Lemma 5.4 (mean value lemma for formal power series) and then upper-bounded by the value of the functional K ∞ at L ∞ . Specifically, Since K m K ∞ , the final difference in the above sum is a vector of formal power series with non-negative coefficients. Consequently, the last summand can be bounded by K ∞ (L ∞ , L ∞ ) − K m (L ∞ , L ∞ ). By the condition of exponential convergence, this difference can be even further bounded by a vector of exponentially decaying functions A(z, u)(B(z, u)) m . For brevity, set . Note that ∂ 1 K, ∂ 2 K ∈ C d×d . Now, the upper bound on L ∞ − L m can be stated as or, equivalently, Let us show that the inverse matrix (I d − ∂ 1 K) −1 exists and has non-negative coefficients in the sense of formal power series. As discussed in the proof of Lemma 5.7, the matrix ∂ 1 K is nilpotent at z = 0. Equivalently, there exists a non-negative integer K such that (∂ 1 K) K is divisible by z. It means that the formal series j 0 (∂ 1 K) j is well-defined and hence so is the formal inverse (I d − ∂ 1 K) −1 . Moreover, since ∂ 1 K has non-negative coefficients, the same holds for the investigated inverse matrix (I d − ∂ 1 K) −1 .
Now, let us focus on the behaviour of (I d − ∂ 1 K) −1 as a function near z = ρ(u). As follows from Proposition 2.11 applied to the system of equations L ∞ = K ∞ (L ∞ , L ∞ , z, u), for each real 0 < z < ρ(u) we have the following inequality: By Perron-Frobenius theorem (see e.g. [22, section 2.2.5] and references therein) the spectral radius of a matrix with positive entries is monotonic in its coefficients. Hence, for all real 0 < z < ρ(u) and so, in the same interval Moreover, due to the continuity of the spectral radius, the same identity can be extended to some complex neighbourhood of u = 1.
Consequently, we can multiply both sides of (5.47) by (I d − ∂ 1 K) −1 and obtain can be further iterated for increasing values of m. In doing so, we find that Hence, the difference L ∞ − L m can be bounded by the tail of a geometric progression which appears as a summation of the formal Neumann series Let us now focus on the above formal Neumann series. Note that applying Proposition 2.9 (Drmota-Lalley-Woods theorem) to the limiting system (5.20), we obtain that the vector of functions L ∞ (z, u) admits a coordinate-wise Puiseux expansion in form of where functions 0 (u), 1 (u), ρ(u) are analytic near u = 1. Likewise, both matrices ∂ 1 K and ∂ 2 K admit Puiseux expansions of the same kind.
Let us prove that coordinate-wise Puiseux expansions of the matrix R near the singular point z = ρ(u) have the form where the spectral radius of R 0 satisfies r(R 0 ) = 1. According to Perron-Frobenius theorem, since the coefficients of R are non-negative, the eigenvalue of the matrix R with the largest absolute value, i.e. the eigenvalue corresponding to the spectral radius of R, is the largest real positive solution λ of the characteristic equation Since the determinant of a matrix product is equal to the product of respective determinants and det (I d − ∂ 1 K) = 0, as I d − ∂ 1 K is invertible, the above condition is equivalent to Let us show that the largest positive real solution (as a function of z < ρ(u)) of this equation, does not exceed 1, with equality when z = ρ(u). Assume, by contrary, that λ > 1. The matrix is a matrix with non-negative coefficients, whose coefficients are strictly smaller than the coefficients of the matrix (∂ 1 K + ∂ 2 K). By Perron-Frobenius theorem (see e.g. [22, section 2.2.5] and references therein), the spectral radius of a matrix with positive coefficients is monotonic in its coefficients, so for λ > 1 Therefore, the characteristic equation cannot have a solution λ > 1. Moreover, the spectral radius of R 0 is equal to the spectral radius of R when z = ρ(u) because in this case, the two matrices coincide. Therefore, r(R 0 ) = 1.
From (5.60) we now note that the difference L ∞ − L m can be bounded by a vector of functions having the same singularity ρ(u) as the components of the vector L ∞ . The Puiseux coefficients of this upper bound decay exponentially fast as m → ∞. Moreover, these coefficients are analytic functions near u = 1.
Second part. Assume that function L m (z, u) admits some upper and lower bounds and denote by ∆ m (z, u) the difference between these bounds:

D R A F T
That is, repeating the argument that allows to multiply both sides of the inequality by the inverse matrix, we obtain As we discovered in the first part, the matrix R := (I d − ∂ 1 K) −1 ∂ 2 K has spectral radius 1 at the singular point z = ρ(u). Third part. As a result of the first part, we know that L ∞ (z, u)−L m (z, u) can be bounded in the following manner: Let us assign in a certain delta-domain. According to Proposition 2.5 (transfer theorem), their coefficients admit the following asymptotic estimate: A final application of Lemma 5.2 (squeeze lemma for formal power series) combined with Remark 5.3 finishes the proof.
Remark 5.11. In Theorem 5.9 we prove a so-called weak transfer theorem, i.e. prove that the asymptotics of the coefficients of each L m can be obtained by taking the asymptotic expansion of the correseponding Puiseux expansion. A stronger version would suggest that the generating functions L m (z, u) can be analytically continued beyond the circle of convergence of corresponding formal power series, in a certain delta-domain. However, for our analysis, the presented weak variant is enough. The techniques presented above, can be further extended to obtain a stronger transfer theorem, by computing the Taylor series expansions at points z 0 inside the circle of convergence.

Advanced marking
In the following section we investigate more parameters related to plain and closed λ-terms. In particular, we consider: • Several parameters related to closed λ-terms, resulting in Gaussian limit laws; • Further parameters whose limiting distributions are discrete, including the leftmostoutermost redex search time in closed terms, the number of free variables in plain terms, the number of head abstractions in closed terms, and mean degree profile in closed terms; • Finally, the mean height profile of closed terms for several different notions of height.  Let L m (z) denote the generating function associated with the set of m-open λ-terms, i.e. L m (z) = n 0 a n,m z n where a n,m stands for the number of m-open lambda terms of size n. Using (6.1) we obtain a corresponding infinite system for the functions L m (z): In [15,Lemma 8] the authors prove that for each m 0 the generating functions for m-open λ-terms L m (z) admit Puiseux expansions in form of Moreover, by the virtue of their proof, we obtain an suitable approximation procedure for the coefficients a m and b m by truncating the system (6.2) and replacing the function L m (z) with L ∞ (z). Furthermore, the estimated coefficients a m and b m tend to their respective limits with an error of order O( 1 √ m ). Using Theorem 5.9 it is possible to prove that a m and b m converge to their respective limits exponentially fast. Consequently, the approximation procedure proposed in [15] convergences exponentially fast, as well.
Immediately, this implies that the probability that a random plain λ-terms is m-open, but Certainly, the limiting distribution associated with m-openness is discrete. Note that the probability distribution function of m-openness is proportional to the coefficient at z n in the bivariate generating function

D R A F T
The mean value corresponding to m-openness of plain terms can be calculated as In order to compute this expectation we use the approximation procedure discussed above. Using the (aptly truncated) recurrence for the coefficients a m and b m we obtain the numerical approximation for the mean value corresponding to m-openness. Numerical approximation yields an estimate 2.01922912627.
6.2. Variables, abstractions, successors and redexes in closed terms. In the following section we investigate the joint distribution of several parameters in closed λ-terms, utilising the novel Theorem 5.9.
Proposition 6.1. Let X n = (X n(var) , X n(red) , X n(suc) , X n(abs) ) denote a vector of random variables denoting the number of variables, redexes, successors and abstractions in a random closed λ-term of size n, respectively. Then, after standardisation, the random vector X n converges in law to a multivariate Gaussian distribution with identical parameters as plain terms.
Proof. Let us recall that the system of equations from Proposition 3.5 associated with the four parameters that we consider is, in the general class of plain terms, of the form A(z, u) = u (var) z 1 − u (suc) z + u (red) u (abs) z 2 L(z, u) 2 + zA(z, u)L(z, u) (6.7) with u = (u (var) , u (red) , u (suc) , u (abs) ) corresponding to respective components of X n .
In order to compose a similar, infinite system for m-open terms, we index respective generating functions in accordance with the natural combinatorial interpretation of m-openness; if an abstraction stands before an occurrence of L(z, u), its respective index should be increased by one. This leads us to the following system: Equivalently, we can represent (6.8) as It is straightforward to check that all the conditions of Theorem 5.9 are satisfied. Consequently, the function L 0 (z, u) admits a Puiseux expansion in form of (6.10) with the same ρ(u) as in Proposition 3.5. Therefore, the limiting distribution, after standardisation, is Gaussian with the mean vector and the covariance matrix completely determined by the behaviour of the singularity ρ(u) near the point u = 1. D R A F T 6.3. Free variables in plain terms. Proposition 6.2. Let X n be a random variable denoting the number of free variables in a random plain lambda term of size n. Then, X n converges in law to a computable, discrete limiting distribution.
Proof. Consider the infinite system of functional equations (L m (z, u)) ∞ m=0 where L m (z, u) corresponds to the generating function for plain λ-terms in which each de Bruijn index whose value k exceeds its unary height at least by m, is marked. For example, L 0 (z, u) corresponds to plain λ-terms with marked free variables. Note that L 0 (z) = zL 1 (z) + zL 0 (z) 2 + uz + uz 2 + · · · , Let us apply Theorem 5.9 to this system taking L ∞ (z) as the limiting equation. Certainly, the limiting equation does not depend on the marking variable u. Therefore, the singular point z * also does not depend on u. An application of Proposition 2.12 finishes the proof.
In order to compute the mean value we set L m (z) : and represent the respective derivative as Based on (6.11) we note that Since z 1−2zL∞(z) ∼ 1 − 2b 1 − z/ρ as z → ρ we establish the following recurrence relation for the coefficients c m and d m : Once solved, this implies Consequently, the mean value corresponding the number of free variables in a random plain lambda term is equal to d 0 b∞ = 2 (1−ρ) 3 ≈ 5.7222625231204. 6.4. Head abstractions in closed terms. Proposition 6.3. Let X n be a random variable denoting the number of head abstractions in a closed λ-term of size n, chosen uniformly at random. Then, X n converges in law to a computable, discrete limiting distribution. The corresponding expectation is close to 1.447.

(6.16)
If a λ-term starts with a head abstraction, then after its removal, the openness of the respective subterm increases by one. Consequently, we include the expression uzL m+1 (z, u) in the equation for L m (z, u). On the other hand, if the λ-term does not start with a head abstraction, i.e. starts with an application or is itself a de Bruijn index, we do not mark remaining abstractions as they are no longer head abstractions. Hence, we also include expressions zL 2 m (z, 1) and z 1−z m 1−z in the equation corresponding to L m (z, u).
Denote the final sum in (6.18) as S(z, u). Since for each m the function L m (z, 1) admits a Puiseux series expansion L m (z, 1) ∼ a m − b m 1 − z/ρ, near z = ρ it holds where c(u) comes from the last two summands of the previous expression. Since a 2 m a 2 ∞ and a m b m a ∞ b ∞ , S(z, u) is convergent near (z, u) = (ρ, 1) and the function L 0 (z, u) admits a Puiseux series expansion in form of (6.20) Consequently, p(u) = b 0 (u)/b 0 (1) is the limiting probability generating function corresponding to the number of head abstractions in closed λ-terms. The function b 0 (u) satisfies Proof. Let L m,k (z, u) denote the generating function for m-open λ-terms with u marking the number of occurrences of de Bruijn index k. Note that L m,k (z) satisfies a functional equation (6.23) L m,k (z, u) = zL m+1,k (z, u) + zL m,k (z, u) 2 + z 1 − z m 1 − z + (u − 1)z k+1 1 [k<m] where 1 [·] stands for the Iverson bracket notation.

D R A F T
Taking the partial derivative of (6.23) with respect to u and assigning u = 1, we obtain the generating function corresponding to λ-terms weighted by the number of occurrences of de Bruijn index k. Denote ∂ ∂u L m,k (z, u)| u=1 as L m,k (z). Then, taking into account that L m,k (z, 1) = L m (z) we arrive at (6.24) L m,k (z) = zL m+1,k (z) + 2zL m (z)L m,k (z) + z k+1 1 [k<m] .
These equations generate an infinite system for which Theorem 5.9 with a small modification is applicable. Each of the equations of the infinite system is linear, and the generating functions L m (z) enter the equations as coefficients. This yields the desired behaviour of the Puiseux expansions, because the non-linearity of the components is used only to provide the Puiseux expansion of the limiting equation, which is given in our case by construction. The condition of exponential convergence holds because the difference between the limiting system and the mth equation of the system is equal to (6.27) 2zE ∞ (z, w) (L ∞ (z) − L m (z)) + z 1 − wz (wz) m and decreases at exponential speed. Hence, the limiting distribution of de Bruijn index value is identical to the respective parameter in plain λ-terms.
6.6. Leftmost-outermost redex search time in closed terms.
Proposition 6.5. Let X n denote the number of vertices visited by depth-first traversal algorithm searching for the leftmost-outermost β-redex in a random closed λ-term of size n (see Section 3.6). Then, the random variable X n converges in law to a computable, discrete limiting distribution.
In order to apply Theorem 5.9 to (6.32) we need to replace the condition that the limiting system satisfies the premises of Drmota-Lalley-Woods theorem by an assumption that the limiting system admits Puiseux expansion. It was proven in Proposition 3.12 that L ∞ (z, u) has a fixed singularity z * which is independent of u. Therefore, all the functions L m (z, u) have a fixed singularity z * and by applying Proposition 2.12, we obtain that the limiting distribution of the redex search time is discrete. 6.7. Node height profile in closed terms. Like in Section 3.7, in the current section we consider unary and natural height profile of variables, abstractions and applications in closed lambda terms. For this purpose, we provide a variation of the semi-large powers theorem (see Proposition 2.6).
Then, for x in any compact subinterval of (0, +∞), as n → ∞, it holds (6.35) [z n ] k j=0 f j (z) ∼ σ ρ −n n S(βx) and x = k √ n (L j (z)) ∞ j=0 (respectively, the the sequence of first coefficients L j (ρ), and the sequence of the second coefficients) tend to their respective limits, i.e. to the coefficients of the Puiseux expansion of L ∞ (z) exponentially fast. Comparing with the Puiseux expansion of z 1−2zL∞(z) given in the proof of Proposition 3.14 we obtain the limiting values of the sequences (σ j ) ∞ j=0 and (c j ) ∞ j=0 . Since the speed of convergence is exponential, the product k j=0 σ j converges to some σ, and the sum of the ratios c j /σ j tends to a linear function βk = 2b ∞ k.
Note that up to a normalising constant, the height profile of other parameters, namely the height profile distribution of abstractions and applications, remains asymptotically the same because from the generating function viewpoint only the multiple in front of the product k j=0 f j (z) changes (see Proposition 3.14).
In the same manner, there can be obtained Rayleigh distribution for natural height profile of different parameters. For example, in the case of variable height profile, we obtain the system of equations for the family of generating functions C m,k (z, u) for m-open lambda terms with variable u marking de Bruijn indices at unary height k − m: This implies Using the same argument as in the previous case, and taking into account two first terms of Puiseux expansion of (z + 2zL ∞ (z)) (see proof of Proposition 3.14), we obtain again Rayleigh distribution, with the same parameter as for plain lambda terms.

Conclusions
We investigated the statistical properties of λ-terms in the de Bruijn notation, providing some insight into their internal, quantitative characteristics. In essence, our results suggest that random λ-terms, both plain and closed, exhibit typical traits of various tree-like structures. For instance, the distribution of sub-patterns inside random λ-terms is typically Gaussian whereas their corresponding finding time tends to a discrete limit distribution. Similarly, the height profile of random terms follows the Rayleigh distribution.
Nonetheless, some of the investigated parameters do not have analogues in other tree-like structures such as, for instance, m-openness or the number of free variables. In both cases we have established a discrete limiting distribution. Remarkably, we have not discovered substantially different statistical traits of plain and closed lambda terms; however, we found that among the statistics with discrete limiting distributions, the distribution in closed terms is often different from the associated distribution in plain terms.
Given the general algorithmic frameworks meant for the construction of effective exact-and approximate-size combinatorial samplers, such as Boltzmann samplers [24] and the recursive method [36,26], presented parameter specifications provide a novel source of effective sampling methods for λ-terms with additional control over their specific combinatorial parameters. In this context, most parameter specifications associated with plain terms are finite and hence also readily applicable. Remaining, infinite specifications are a bit more involved. Nonetheless, an appropriate truncation of the specification followed by a final rejection phase allows to discard inadmissible terms. The exponential convergence of intermediate, truncated specifications rationalises such an approach and provides effective samplers for corresponding λ-terms. Let us also remark that so generated terms do not have to be restricted to their natural parameter distributions. It is possible to gain an additional control over the expected parameter values using a dedicated tuning procedure which distorts the intrinsic parameter distribution and hence allows for a skewed parameter distribution sampler construction [6]. Consequently, the presented analysis provides means for an effective construction of various samplers for (plain or closed) λ-terms with additional control over their parameter distribution.
Few more aspects of the parameter analysis of λ-terms remain untouched. For instance, our empirical data suggests that the distribution of binding abstractions, both in plain and closed terms, is Gaussian. The same holds for the number of open subterms. Alas, the theoretical verification of our empirical findings is left open. Moreover, we have not investigated other, well-known parameters. Let us mention, for instance, the height distribution of random closed terms, or the distribution of certain extremal statistics, such as the maximal de Bruijn index, longest lambda run, or the maximal number of variables bound to a single abstraction. We conjecture that the behaviour of these parameters in closed terms does not substantially differ from the behaviour of respective parameters in plain terms. Finally, the question of generalised m-openness also has not been settled and the corresponding techniques are still to be developed.
Arguably, from the viewpoint of analytic combinatorics, our novel result complements the existing result of Drmota, Gittenberger and Morgenbesser [23] on infinite systems. In our formulation, the infinite system is not required to be strongly connected. Consequently, we conjecture that the properties of the Jacobian operator of the infinite system are not sufficient to deduce the result, in contrast with the mentioned paper. In other words, we conjecture that the condition of exponential convergence is essential. Moreover, we discovered that it is possible to rewrite the infinite system defining closed λ-terms as a strongly connected system, however the Jacobian of the resulting system is not compact. Alas, the framework [23] is not applicable.
Consequently, we finish the paper with an even more general question: what can be stated about the properties of infinite systems which are either not strongly connected or have a noncompact Jacobian operator?