Randomly weighted d-complexes: Minimal spanning acycles and Persistence diagrams

A weighted d-complex is a simplicial complex of dimension d in which each face is assigned a real-valued weight. We derive three key results here concerning persistence diagrams and minimal spanning acycles (MSAs) of such complexes. First, we establish an equivalence between the MSA face-weights and death times in the persistence diagram. Next, we show a novel stability result for the MSA face-weights which, due to our first result, also holds true for the death and birth times, separately. Our final result concerns a perturbation of a mean-field model of randomly weighted d-complexes. The d-face weights here are perturbations of some i.i.d. distribution while all the lower-dimensional faces have a weight of 0. If the perturbations decay sufficiently quickly, we show that suitably scaled extremal nearest face-weights, face-weights of the d-MSA, and the associated death times converge to an inhomogeneous Poisson point process. This result completely characterizes the extremal points of persistence diagrams and MSAs. The point process convergence and the asymptotic equivalence of three point processes are new for any weighted random complex model, including even the non-perturbed case. Lastly, as a consequence of our stability result, we show that Frieze's ζ(3) limit for random minimal spanning trees and the recent extension to random MSAs by Hino and Kanazawa also hold in suitable noisy settings.
Mathematics Subject Classifications: Primary: 60C05, 05E45; Secondary: 60G70, 60B99, 05C80
∗Research supported by ARRS project N1-0058. †Research supported by URSAT, ERC Grant 320422. ‡Research supported by DST-INSPIRE faculty award.
the electronic journal of combinatorics 27(2) (2020), #P2.11 https://doi.org/10.37236/8679

Introduction
Broadly, there are two parts to this paper. The first part concerns weighted simplicial complexes. This study significantly deepens the understanding of the relationship of minimal spanning acycles in such complexes to associated persistence diagrams and also to what we refer to as "nearest face" distances. The second part looks at a specific "mean-field" model of complexes with random weights and, in parallel, also considers its perturbations. We refer to these complexes as randomly weighted d-complexes or, simply, weighted random complexes. Our results completely characterize the extremal behaviour of the persistence diagram and the nearest face distances associated with such complexes and then, using the above relationships, also of their minimal spanning acycles.
The motivation for this work comes from the much more studied scenario of weighted graphs, the 1-dimensional analogue of weighted simplicial complexes, and their random counterparts. A weighted graph can either be viewed in its entirety or as a process wherein it is sequentially built by adding edges in an order dictated by their weights. Taking the former perspective, a minimal spanning acycle corresponds to a minimal spanning tree, while the nearest face distances are basically the nearest neighbour distances. The other viewpoint helps interpret the persistence diagram associated with a graph; informally, it is a record of the "death times", i.e., the weight values of those edges that connect a priori disjoint components.
The fact that connectivity and nearest neighbour distances are intertwined can be seen from the earliest work on random graphs by Erdős and Rényi [ER59]. In fact, three years earlier, Kruskal had proposed his algorithm for constructing a minimal spanning tree. The edge weights of this tree are precisely the times at which components get connected in the process type description of the weighted graph. Hence, that work can be viewed as the first to implicitly exploit the connections between minimal spanning trees and persistence diagrams. Admittedly, the notion of persistence diagrams did not exist then, but Kruskal's algorithm can be interpreted via persistence diagrams, and this relation is made clearer in this paper. This implicit relationship was also used later in the seminal work of Frieze [Fri85]. On the other hand, connections between the largest nearest neighbour distances and the longest edges of a minimal spanning tree on randomly weighted graphs have played a key role in [Hen82,ST86,AR02,Pen97,HR05]. In [Pen97], it was shown that the extremal nearest neighbour distances coincide with the extremal edge-weights in a Euclidean random minimal spanning tree. Such a result is crucial to understanding the connectivity threshold for random geometric graphs (see [Pen03,Chapter 13]). More complete accounts of such connections can be found in [Bol01,vdH16,JLR00,Pen03,FK16].
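The link between Kruskal's algorithm and component merge times can be checked on a toy graph. The following is a minimal sketch, not taken from the paper: the function names and the example graph `EDGES` are hypothetical, and edge weights are assumed distinct. It computes the minimal spanning tree weights with Kruskal's algorithm and, independently, the weights at which the number of connected components drops as edges are added in increasing order of weight.

```python
def kruskal_mst_weights(n, edges):
    """Weights of the Kruskal minimal spanning tree; edges are (weight, u, v)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    weights = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components
            parent[ru] = rv
            weights.append(w)
    return weights


def count_components(n, edges):
    """Number of connected components of the graph on 0..n-1 (depth-first search)."""
    adj = {i: [] for i in range(n)}
    for _, u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen, comps = set(), 0
    for s in range(n):
        if s in seen:
            continue
        comps += 1
        seen.add(s)
        stack = [s]
        while stack:
            x = stack.pop()
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
    return comps


def component_death_times(n, edges):
    """Weights at which the component count drops when edges arrive by weight."""
    ordered = sorted(edges)
    deaths, prev = [], n
    for k in range(1, len(ordered) + 1):
        c = count_components(n, ordered[:k])
        if c < prev:
            deaths.append(ordered[k - 1][0])
        prev = c
    return deaths


# Hypothetical example: 4 vertices, 5 weighted edges.
EDGES = [(0.9, 0, 1), (0.2, 1, 2), (0.5, 0, 2), (0.7, 2, 3), (0.4, 1, 3)]
```

On this example both functions return the same list of weights, which is the graph case of the correspondence described above.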
Recent applications in topological data analysis have motivated the extension of the above results to random complexes. While higher dimensional analogues of connectivity thresholds have already been studied [LM06,MW09,Kah14a], this work generalizes some of these latter results to the level of persistence diagrams and minimal spanning acycles. Before delving into the background and details of our results, we summarize our main contributions. Note that a weighted d-complex is a simplicial complex of dimension d in which each face is assigned a real-valued weight. Throughout, we will assume that this weight function is monotone, i.e., the weight of any face is never smaller than that of any of its sub-faces. Such a weighted complex can also be viewed as a process wherein one adds faces in the order dictated by their weights. Because the weight function is monotone, any intermediate construction is also a simplicial complex. With this dual perspective, one can infer properties about minimal spanning acycles from the death and birth times in the persistence diagram and vice versa.
The section numbers in brackets below indicate where one can find a detailed description of the corresponding contribution.
Key Contributions:
1. We first provide a simplicial analogue of Kruskal's algorithm that can be used for finding minimal spanning acycles. Comparing this algorithm with the incremental algorithm used to build a persistence diagram, we establish an equivalence between the face-weights of minimal spanning acycles and the death times; a similar result also holds true for the birth times. This result significantly enhances the connection between minimal spanning acycles and persistence diagrams. In fact, one of the theorems in Hiraoka and Shirai [HS17, Theorem 1.1] now becomes a simple corollary of our result. (Section 1.2)
2. Next, we establish a new stability result for minimal spanning acycles. Because of the equivalence above, this result then automatically applies to the death and birth times as well. Unlike existing stability results for persistence diagrams which concern the multiset of birth-death pairs, our result specifically relates the changes in the set of deaths and, separately, in the set of births to the changes in the face weights. We believe this result can play a crucial role in proving results for randomly weighted complexes with certain dependencies between the different face weights. (Section 1.3)
3. Our final key result concerns randomly weighted d-complexes and suitable noisy perturbations of them, including those with dependencies. If the perturbations decay sufficiently fast, we show that appropriately scaled versions of the following three point processes: (a) nearest face distances, (b) death times in the persistence diagram, and (c) face-weights of the minimal spanning acycle, converge weakly to the same Poisson point process in vague topology. Derivation of (a) and (b) involves use of the method of factorial moments, cohomology theory, and our stability result. On the other hand, going from (b) to (c) is a simple application of our first result. However, unlike in our second contribution, notice that this time we exploit the equivalence in the other direction, i.e., we go from a result on death times to a result on face weights of the minimal spanning acycle. An important paradigm in topological data analysis is that extremal points of a persistence diagram encode meaningful topological information about the underlying structure. Viewed in this light, our result completely characterizes the extremal points of the persistence diagram of these weighted random complexes; this is new even in the non-noisy scenario. (Section 1.4)
4. We conclude by providing another application of our stability result. Namely, the lifetime sum of persistence diagrams converges for randomly weighted d-complexes with noisy weights; this generalizes the ζ(3)-limit for random minimal spanning trees by Frieze [Fri85] and the recent extension to random spanning acycles by Hino and Kanazawa [HK19]. (Section 1.4)
Organisation of Paper: The rest of this section quickly introduces simplicial complexes, minimal spanning acycles, and persistent homology, and then provides more precise statements of some of our main results and places them in context. The next section, Section 2, gives in detail the necessary topological (Section 2.1) and probabilistic (Section 2.2) preliminaries. Section 3 is exclusively devoted to studying various properties of minimal spanning acycles, algorithms to find them, their connection to persistence diagrams, and our crucial stability result. Finally, in Section 4, we study weighted random complexes and prove our point process convergence results. In subsection 4.1, the weights are independent and identically distributed (i.i.d.) uniform [0, 1] random variables while, in subsection 4.2, the weights are either i.i.d. with a more general distribution F or a perturbation of the same. In the Appendix, we give proofs of two results needed for the main part of the paper and a brief explanation of the method of factorial moments.

Simplicial complexes and minimal spanning acycles:
We begin by defining a (simplicial) complex, which is a higher dimensional analogue of a graph.
Definition 1. An (abstract) simplicial complex $K$ on a finite ground set $V$ is a collection of subsets of $V$ such that if $\sigma_1 \in K$ and $\emptyset \neq \sigma_2 \subset \sigma_1$, then $\sigma_2 \in K$ as well. The elements of $K$ are called simplices or faces and the dimension of a simplex $\sigma$ is $|\sigma| - 1$, where $|\cdot|$ means cardinality. A $d$-face of $K$ is a face of $K$ with dimension $d$.
Given a complex $K$ and $d \geq 0$, we denote the $d$-faces of $K$ by $F_d(K)$ and its $d$-skeleton by $K^d$ (i.e., the sub-complex of $K$ consisting of all faces of dimension at most $d$). We use $\sigma, \tau$ to denote faces and the dimension of the face shall not be explicitly mentioned unless required. A graph is a complex that consists only of 0-faces and 1-faces, or in other words, the 1-skeleton of a complex is a graph. Associated to each simplicial complex is a collection of non-negative integers denoted $\beta_0(K), \beta_1(K), \ldots$, called the Betti numbers (see Section 2.1 for detailed definitions), which are a measure of connectivity of the simplicial complex. Informally, the $d$-th Betti number counts the number of $(d+1)$-dimensional holes in the complex or equivalently the number of independent non-trivial cycles formed by $d$-faces. Two points to note at the moment are: (i) $\beta_0(K)$ is one less than the number of connected components in the graph formed by 0-faces and 1-faces and (ii) if the dimension of $K$ (maximum of dimension of faces) is $d$, then $\beta_j(K) = 0$ for all $j \geq d+1$.
The Betti numbers described above are closely connected to spanning acycles. For example, the spanning tree of a graph on a vertex set $V$ can be described in topological terms as a set of edges $S$ such that $\beta_0(V \cup S) = \beta_1(V \cup S) = 0$, i.e., $V \cup S$ is connected and has no cycles. The following higher-dimensional generalization by Kalai [Kal83] is then natural.
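Definition 2 (Spanning and Maximal acycle). Consider a complex $K$ of dimension at least $d$, $d \geq 1$. A subset $S$ of $d$-faces is called spanning if $\beta_{d-1}(K^{d-1} \cup S) = 0$ and an acycle if $\beta_d(K^{d-1} \cup S) = 0$; it is called a spanning acycle if it has both the properties. Separately, a subset $S$ of $d$-faces is called a maximal acycle if it is an acycle and maximal with respect to (w.r.t.) the inclusion of $d$-faces.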
Though this definition of a spanning acycle merely replaces appropriate indices in the definition of a spanning tree, what is not obvious is that this is a good higher-dimensional generalization of a spanning tree. This work of ours is the first to formally ascertain that several key properties of a spanning tree naturally extend to a spanning acycle as well; see our results in Section 3.
An alternative but more explicit algebraic description of a spanning tree is that it consists of a set of columns which form a basis for the column space of the incidence matrix or the boundary matrix; i.e., the matrix $\partial_1$ whose rows are indexed by vertices and columns by edges and whose $(i, j)$-th entry is 1 if the vertex $i$ belongs to the edge $j$ and 0 otherwise. For simplicity, we assume throughout this paper that the underlying field is $\mathbb{F} = \mathbb{Z}_2$, i.e., all vector spaces involved are $\mathbb{Z}_2$-vector spaces. It is well-known that the bases of these vector spaces form a matroid. Such a description also holds for spanning acycles. While we never explicitly work with this latter description in this paper, it implicitly underpins many of our proof ideas. We shall explicitly point this out whenever that is the case.
If we assign weights to the faces, we obtain a weighted complex $K$. Now, setting $w(S) := \sum_{\sigma \in S} w(\sigma)$ for a subset $S$ of simplices, we can naturally define a minimal spanning acycle as a spanning acycle $S$ with minimum weight $w(S)$. Since we deal with only finite complexes, the existence of a minimal spanning acycle is guaranteed once a spanning acycle exists. We shall denote a minimal spanning acycle by $M_d$ or simply $M$ when the dimension is clear. Though Kalai's definition of a spanning acycle and enumeration of the number of spanning acycles (a generalization of Cayley's formula for spanning trees) is more than three decades old, it has received increased attention in the last few years [BBC14, CCK15, DKM09, IKR+11, DKM16, KKL19, KR14, HS17, HS16, Lyo09, LNPR19, MNRR15]. In Section 3, we prove some fundamental properties for minimal spanning acycles: existence, uniqueness, cut property and a simplicial Kruskal's algorithm. Here we would also like to emphasize that properties of spanning acycles are preserved under simplicial isomorphisms but not necessarily under homotopy equivalence.
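To make the matroid viewpoint concrete, here is a minimal sketch of a greedy, Kruskal-style selection of d-faces over $\mathbb{Z}_2$. It is an illustration under stated assumptions, not the paper's algorithm from Section 3: the names `boundary_mask` and `greedy_acycle` are hypothetical, the full (d−1)-skeleton is assumed to be present, and the output is a minimum-weight maximal acycle, which is a minimal spanning acycle only when a spanning acycle exists. A d-face is accepted exactly when its boundary column is linearly independent of the columns accepted so far, i.e., when adding it creates no d-cycle.

```python
from itertools import combinations


def boundary_mask(face, index_of):
    """Z_2 boundary of a face, encoded as a bitmask over its codimension-1 faces."""
    mask = 0
    for i in range(len(face)):
        mask ^= 1 << index_of[face[:i] + face[i + 1:]]
    return mask


def greedy_acycle(weighted_faces):
    """Greedy minimum-weight maximal acycle.

    weighted_faces: list of (weight, face); each face is a sorted tuple of
    vertices and all faces have the same dimension d >= 1.
    """
    subfaces = sorted({f[:i] + f[i + 1:] for _, f in weighted_faces for i in range(len(f))})
    index_of = {s: k for k, s in enumerate(subfaces)}
    basis = {}   # leading bit -> reduced boundary column over Z_2
    chosen = []
    for w, face in sorted(weighted_faces):
        col = boundary_mask(face, index_of)
        while col:
            lead = col.bit_length() - 1
            if lead not in basis:      # independent column: accept the face
                basis[lead] = col
                chosen.append((w, face))
                break
            col ^= basis[lead]         # eliminate the leading bit
        # if col reduced to 0, the face would close a d-cycle and is skipped
    return chosen


# Hypothetical example: the four triangles on 4 vertices (d = 2).
TRIANGLES = list(zip([0.3, 0.1, 0.4, 0.2], combinations(range(4), 3)))
```

On `TRIANGLES`, the three lightest triangles are kept and the heaviest one, which would close the boundary of the tetrahedron, is rejected. For d = 1 the same routine reduces to Kruskal's algorithm on a graph.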
We would like to highlight that some more fundamental properties of the minimal spanning acycles can be found in an earlier version of our article (see [STY17]); we do not include them here since they are not used elsewhere in the paper. Of these, we would like to point the reader to two interesting results which are not known for matroids in general. First is an inclusion-exclusion identity for the cardinality of maximal acycles which is derived using the Mayer-Vietoris exact sequence from algebraic topology. Second, we provide a generalization of Jarník-Prim-Dijkstra's algorithm to spanning acycles. In fact, we need the complex to be 'hypergraph connected' for the Prim's algorithm to work and it is not obvious what is the analogous notion of 'hypergraph connectivity' in general matroids. As part of the proof, we also show that a spanning acycle is 'hypergraph connected', again by using the Mayer-Vietoris sequence.

Persistence diagrams and minimal spanning acycles.
We now preview the connection between persistence diagrams and acycles. Let $K$ be a weighted complex such that the real-valued weight function $w$ is monotone. Then, $K(t) := w^{-1}(-\infty, t]$ is a simplicial complex for all $t \in \mathbb{R}$ and we will refer to $\{K(t) : t \in \mathbb{R}\}$ as the filtration induced by $w$ on $K$.
Let $d \geq 0$ and suppose that $\beta_d(K) = 0$. Let $\beta_d(t) = \beta_d(K(t))$. We remark that $\beta_d(t)$ is a jump function. The times of positive jumps (counted with multiplicity) are birth times $B = \{B_i\}$ of the persistence diagram and the times of negative jumps (counted with multiplicity) are death times $D = \{D_i\}$. The correct way to count multiplicity will be made clear in Definition 9. However, if the weight function is injective, then there is no multiplicity. The non-expert reader may assume weight functions to be injective for ease of understanding the results in the introduction.
Formally, a persistence diagram corresponding to dimension $d$ is the multiset of the points $\{(B_i, D_i)\}$. Note that it is not only a record of the birth and death times, but importantly also of the pairing of a birth with its corresponding death. A persistence diagram is useful for understanding the evolution of topology of a filtration. See Figure 1 for persistence diagrams of two weighted random complexes: the uniformly weighted random d-complexes (see Section 4.1) and Erdős-Rényi clique complexes. The aforementioned persistence diagram will be referred to as the persistence diagram of $H_d(K)$ whenever we wish to avoid ambiguities about the dimension and the underlying complex. In this paper, we shall focus only on their two projections: birth and death times.
Though not everything can be inferred from these projections, a crucial quantity that can be understood from them is the lifetime sum $L_d(K) := \sum_i (D_i - B_i)$, which by Fubini's theorem also equals $\int_0^\infty \beta_d(t)\,dt$ ([HS17, (1.4)]). We now present the first of our main theorems (Theorem 3), which connects persistence diagrams to minimal spanning acycles. Here and elsewhere, when the underlying complex $K$ is clear we shall drop it from all our notations.
This result reveals a stronger connection between persistence diagrams and minimal spanning acycles than what is known in the literature. If $K$ is a weighted $d$-complex with $\beta_{d-1}(K) = \beta_{d-2}(K) = 0$ then, as a corollary of the above theorem, we obtain a relation referred to below as (1). For $d = 1$ (assuming $K^0 \subset w^{-1}(0)$), the above relation is well known and, for $d \geq 2$, this relation was derived recently in [HS17, Theorem 1.1] using different techniques. This latter paper and, in particular, their derivation of (1) served as our stimulus to investigate minimal spanning acycles. Apart from its striking simplicity, we believe Theorem 3 can be useful in studying either of them using the other. In fact, this result is frequently used in this paper.
Much of the complexity in understanding persistent homology arises from the pairing of birth and death times. The above result is useful in understanding death or birth times individually and, in certain cases, this shall yield useful information (e.g., lifetime sum) even without the knowledge of the pairings. The proof of the above theorem and some of its consequences can be found in Section 3.3.
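The Fubini step behind the lifetime-sum identity can be spelled out as follows. This is a sketch under assumptions that are not stated verbatim in the text: all birth and death times are finite and non-negative, and $\beta_d(t)$ equals the number of pairs that are alive at time $t$, i.e., the number of $i$ with $B_i \le t < D_i$.

```latex
\begin{align*}
L_d(K) = \sum_i (D_i - B_i)
  &= \sum_i \int_0^\infty \mathbf{1}[B_i \le t < D_i]\, dt \\
  &= \int_0^\infty \sum_i \mathbf{1}[B_i \le t < D_i]\, dt
   = \int_0^\infty \beta_d(t)\, dt .
\end{align*}
```

The interchange of sum and integral is harmless here because the complex is finite, so the sum has finitely many non-negative terms.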

Stability of birth and death times
Stability results (e.g., [EH10, Section VIII.2], [CCSG+09, CDSO14, CSEH07, CSEHM10]) are an important cog in the wheel of topological data analysis and provide a theoretical justification for the robustness of persistent homology. While $L^\infty$ stability (or bottleneck stability) is the most standard form of stability proven for persistence diagrams, $L^p$ stability for $p \geq 0$ requires restrictive assumptions that are not widely applicable. Using the simplicial version of Kruskal's algorithm and the correspondence (Theorem 3), we prove a stability result (Theorem 4) separately for the birth and death times with minimal assumptions. The usefulness of this stability result will become apparent in Section 1.4. For $p = \infty$ and a sequence $\{x_i\}_{i \geq 1}$, in the usual manner, $\sum_i |x_i|^p$ should be read as $\sup_i |x_i|$.
As part of the proof (see Section 3.4) of the above stability result, we show that, on a fixed simplicial complex, changing the weights of $m$ ($m \geq 1$) faces can change at most $m$ death times and $m$ birth times, and each by at most the difference between the weights on the faces. One might suspect that the $L^\infty$ stability in the above theorem can be deduced from the bottleneck stability of persistence diagrams by a projection argument. This is, however, not the case: the diagonal plays a special role in the definition of bottleneck stability of persistence diagrams, but for point processes on $\mathbb{R}$ there is no equivalent to the diagonal.
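The sup-norm flavour of this stability can be observed numerically in the graph case d = 1, where death times are minimal spanning tree edge weights. The sketch below is only an illustration and not the proof technique of Section 3.4: the function names, the complete graph on `n` vertices, and the noise level `eps` are all hypothetical choices. It perturbs every edge weight by at most `eps` and compares the sorted minimal spanning tree weights before and after.

```python
import random


def sorted_mst_weights(n, weight):
    """Kruskal MST edge weights in increasing order; weight maps (u, v) -> w."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    out = []
    for (u, v), w in sorted(weight.items(), key=lambda item: item[1]):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            out.append(w)
    return out


def stability_demo(n=30, eps=0.01, seed=0):
    """Return (sup change in sorted MST weights, sup change in edge weights)."""
    rng = random.Random(seed)
    w = {(u, v): rng.random() for u in range(n) for v in range(u + 1, n)}
    w_noisy = {e: x + rng.uniform(-eps, eps) for e, x in w.items()}
    before = sorted_mst_weights(n, w)
    after = sorted_mst_weights(n, w_noisy)
    sup_weights = max(abs(w[e] - w_noisy[e]) for e in w)
    sup_deaths = max(abs(a - b) for a, b in zip(before, after))
    return sup_deaths, sup_weights
```

In this setting the first returned value never exceeds the second, since the k-th smallest tree weight is a monotone function of the edge weights.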

Weighted random complexes
Having offered a teaser of our deterministic results, we now turn to a preview of the probabilistic results. Whereas there is a rich recent literature on deterministic aspects of spanning acycles (see Section 1.1) and random complexes (see below), the literature is sparser on weighted random complexes or random minimal spanning acycles. The probabilistic model of interest to us is the one introduced by Linial and Meshulam [LM06] and then extended by Meshulam and Wallach [MW09]. This model, called the random d-complex and denoted by $Y_{n,d}(p)$, consists of all faces on $n$ vertices (i.e., ground set $V = [n] := \{1, \ldots, n\}$) with dimension at most $(d-1)$, and each $d$-face is included with probability $p$ independently. $Y_{n,1}(p)$ is the classical Erdős-Rényi graph on $n$ vertices with edge-connection probability $p$. Just as the Erdős-Rényi graph is a mean-field model of pairwise interactions, the random d-complex can be considered as a model of higher-order interactions. This model has spawned a rich literature in recent years [LM06,MW09,CCFK12,CF15a,CF15b,LP16]. Although we focus on the random d-complex, we alert the reader to the existence of a richer theory of random complexes and topological data analysis [Car14,Kah14b,BK14,Kah17,CFK12].
The focus of many studies on random d-complexes has been the two non-trivial Betti numbers of the complex: $\beta_{d-1}(\cdot)$ and $\beta_d(\cdot)$. The starting point of our study is the following fine phase transition result for $\beta_{d-1}(Y_{n,d}(p))$.
for some fixed $c \in \mathbb{R}$. Then, as $n \to \infty$, $\beta_{d-1}(Y_{n,d}(p_n)) \Rightarrow \mathrm{Poi}(e^{-c})$, where $\mathrm{Poi}(\lambda)$ stands for the Poisson random variable with mean $\lambda$ and $\Rightarrow$ denotes convergence in distribution.
The proof of this result proceeds as follows: First, it is shown that $N_{n,d-1}(p_n) \Rightarrow \mathrm{Poi}(e^{-c})$, where $N_{n,d-1}(p)$ denotes the number of isolated $(d-1)$-faces in $Y_{n,d}(p)$. Then, for $p_n$ as chosen, it is established that $N_{n,d-1}(p_n)$ completely determines the behaviour of the $(d-1)$-th Betti number (see also Appendix C). Building upon this relation, one also has that $\Pr\{\beta_{d-1}(Y_{n,d}(p_n)) = 0\} \to 1$ if $np_n - d \log n \to \infty$ and $\Pr\{\beta_{d-1}(Y_{n,d}(p_n)) = 0\} \to 0$ if $np_n - d \log n \to -\infty$. These were proven by Erdős and Rényi [ER59] in 1959 for $d = 1$, much later by Linial and Meshulam [LM06] in 2006 for $d = 2$ and shortly thereafter in 2009 for $d \geq 3$ by Meshulam and Wallach [MW09].
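The quantity $N_{n,d-1}(p)$ is easy to simulate. The following is a minimal sketch with hypothetical function names, in which a $(d-1)$-face is taken to be isolated when it is contained in no included $d$-face; it samples the $d$-faces of $Y_{n,d}(p)$ on vertices $0, \ldots, n-1$ and counts the isolated $(d-1)$-faces.

```python
import math
import random
from itertools import combinations


def sample_d_faces(n, d, p, rng):
    """Include each d-face on n vertices independently with probability p."""
    return [f for f in combinations(range(n), d + 1) if rng.random() < p]


def count_isolated_faces(n, d, d_faces):
    """Number of (d-1)-faces on n vertices contained in none of the given d-faces."""
    covered = set()
    for f in d_faces:
        for i in range(len(f)):
            covered.add(f[:i] + f[i + 1:])   # a (d-1)-face of f
    return math.comb(n, d) - len(covered)     # all (d-1)-faces are present


def isolated_face_sample(n=12, d=2, p=0.3, seed=0):
    rng = random.Random(seed)
    return count_isolated_faces(n, d, sample_d_faces(n, d, p, rng))
```

For d = 1 this counts isolated vertices of an Erdős-Rényi graph. Repeating `isolated_face_sample` over many seeds gives an empirical distribution of the count for the chosen `p`.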
One of the goals of this paper is to generalize Lemma 5 first to the level of persistence diagrams and then to that of minimal spanning acycles of randomly weighted d-complexes. Before providing the actual statements, we give a formal definition of these weighted complexes.
Definition 6. Let $d \geq 1$ be some integer. Consider $n$ vertices and let $K^d_n$ be the complete $d$-skeleton on them. Let $\phi' : K^d_n \to [0, 1]$ be the weight function with the following properties: every face of dimension less than $d$ has weight 0, and the weight of each $d$-face $\sigma$ is $\phi(\sigma)$ perturbed by $\epsilon_n(\sigma)$. Here, $\{\phi(\sigma) : \sigma \in F_d\}$ are real valued i.i.d. random variables with (cumulative) distribution function $F : \mathbb{R} \to [0, 1]$ perturbed respectively by $\{\epsilon_n(\sigma) : \sigma \in F_d\}$. The latter are another set of real valued random variables, not necessarily identically distributed or independent of each other or of the $\phi(\sigma)$'s. The randomly weighted d-complex $L_{n,d}$ is the simplicial complex $K^d_n$ weighted by $\phi'$. Associated with $L_{n,d}$ is the canonical simplicial process given by the filtration $\{L_{n,d}(t) : t \in \mathbb{R}\}$, where $L_{n,d}(t) = \{\sigma : \phi'(\sigma) \leq t\}$.
For ease of use, we shall write $\sigma \in L_{n,d}$ to mean $\sigma \in K^d_n$. Similarly, $F_i(L_{n,d})$ shall mean $F_i(K^d_n)$ and so on. Finally, let $\|\epsilon_n\|_\infty := \max_{\sigma \in F_d(L_{n,d})} |\epsilon_n(\sigma)|$. Our key result concerning randomly weighted d-complexes is that if the perturbations decay sufficiently fast, then suitably scaled point processes related to the nearest face distances, weights of the faces in the d-minimal spanning acycle, and death times in the associated persistence diagram all converge to the same inhomogeneous Poisson point process. The proof crucially relies upon Theorems 3 and 4.
Formally, we consider the following three scaled point processes on R.
1. (Extremal) nearest face distances, denoted $P^C_{n,d}$.
2. (Extremal) death times in the associated persistence diagram, denoted $P^D_{n,d}$.
3. (Extremal) weights of the faces in the d-minimal spanning acycle, denoted $P^M_{n,d}$.
Observe that the scaling used in the definitions of each of $P^C_{n,d}$, $P^D_{n,d}$, and $P^M_{n,d}$ pushes quantities less than the $d \log(n)/n$ threshold to $-\infty$, asymptotically. In that sense, asymptotically, the three processes only consider the extremal values, i.e., those that are above this threshold. The reason for transforming weights, as will be seen below, is that it yields a limiting point process independent of $F$. If we think of the weighted complex as a dynamic complex with simplices being added at times equal to their weights, then the transformation by $F$ is nothing but a time-change.
At first glance, these are three distinct point processes on $\mathbb{R}$ and there are no obvious reasons why they ought to be connected. However, by applying Theorem 3, we get $P^M_{n,d} = P^D_{n,d}$ and, from Corollary 30 that we establish later, it follows that $P^C_{n,d} \subset P^M_{n,d}$. A natural guess based on this would be that a similar relation holds amongst the three processes asymptotically as well. Surprisingly, the result below shows that the three processes in fact have the same asymptotic behaviour.
Theorem 7. Suppose that $F$ is Lipschitz continuous. If $n \|\epsilon_n\|_\infty \to 0$ in probability, then each of $P^C_{n,d}$, $P^D_{n,d}$, and $P^M_{n,d}$ converges vaguely in distribution to $P_{\mathrm{poi}}$, where $P_{\mathrm{poi}}$ is the Poisson point process with intensity $e^{-x}\,dx$ on $\mathbb{R}$.
Since the $(d-1)$-faces have zero weights, the birth times in the persistence diagram of $H_{d-1}$ are all zero. Hence, if $\|\epsilon_n\|_\infty = 0$ and $F$ is the distribution function of $U[0, 1]$, then $P^D_{n,d}((c, \infty)) = \beta_{d-1}(Y_{n,d}(p_n))$ for $p_n$ and $c$ as in Lemma 5. Thus, a point process convergence for $P^D_{n,d}$ in this special case implies Lemma 5 as a corollary. This and more follows from the above result. See Figure 1(a) for simulations of $P^D_{n,d}$ for $d = 1, 2, 3, 4$ in the above special case.
To the best of our knowledge, a point process convergence result as above is not known even for complete graphs with i.i.d. uniform [0, 1]-weights, which might be considered as a mean-field model for random metric spaces. For random geometric graphs, such a point process convergence result for extremal edge weights of the minimal spanning tree was proven in [Pen97,HR05]. These results were important to understand the connectivity of random geometric graphs. However, reversing the scenario, we have gone from results on connectivity (i.e., H k (·) persistence diagrams) to those for minimal spanning acycles.
The above weak convergence result along with the continuous mapping theorem yields asymptotics of various statistics of $P^D_{n,d}$. Our result could be useful in deriving asymptotics for extremes of other summary statistics of persistence diagrams such as persistence landscapes [Bub15], homological scaffolds [PET+14] or accumulative persistence function [BM19].
As for our proof, we first deal with the case when $F$ is the distribution function of $U[0, 1]$ and $\epsilon_n(\sigma) \equiv 0$. We use the factorial moment method to show convergence of the first point process and then use cohomology theory to show that this is a good enough approximation for the second point process. This yields convergence of the second point process. Finally, this along with Theorem 3 gives the convergence of the third point process (see Section 4.1). This approach is inspired by those of [LM06,MW09,KP14]. Next, we extend this result to the case of the more general i.i.d. weights. Finally, we complete the proof of Theorem 7 by using our stability result (Theorem 4) as well as showing that the topology of bottleneck distance between Radon counting measures is stronger than vague topology (see Section 4.2).
We now present one more powerful consequence of our stability result. While it is believed that introducing weak dependencies between the random variables should not affect the asymptotics, it is often difficult to prove such a statement rigorously. As we again illustrate, our stability result helps bridge this gap in certain situations. In particular, given an arbitrary random complex, it enables one to translate certain limit theorems to noisy variants of this complex once the same has been shown in the noiseless setting.
Consider $L_{n,d}$ from Definition 6 and suppose that $F$ is the distribution function of $U[0, 1]$. We consider the (weighted) lifetime sums, indexed by $\alpha \geq 0$. To begin with, suppose that $\|\epsilon_n\|_\infty = 0$ for all $n \geq 1$. In such a case, we denote the weighted random complex by $U_{n,d}$ and the corresponding lifetime sum by $L_{n,d}$. Then, it follows from a remarkable recent result by Hino-Kanazawa ([HK19, Theorem 4.11]) that this lifetime sum, suitably scaled, converges to a limit given in terms of $I^\alpha_{d-1}$, an explicitly defined constant (see [HK19, (4.10)] for the definition of constants and [HK19, Section 4.4] for more concrete expressions). In the special case of $d = 1$, $\alpha = 1$, this is the famed result of Frieze [Fri85] for random minimal spanning trees with $I^1_0 = \zeta(3)$, where $\zeta$ is the Riemann zeta function. Further, $I^p_0$ for $p \in \{1, 2, \ldots\}$ are shown to be linear combinations of $\zeta(3)$, $\zeta(4)$, etc. Using our stability result, we now extend this result to the noisy case. The proof can be found in Section 4.2.
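Frieze's limit in the special case d = 1, α = 1 can be explored with a short simulation. The sketch below is an illustration only, with hypothetical function names and arbitrary choices of `n`, `runs` and `seed`: it draws i.i.d. U[0, 1] weights on the edges of the complete graph, gives vertices weight 0 so that every birth time is 0, and sums the minimal spanning tree weights, which in this case is the lifetime sum of the connected components.

```python
import random

ZETA_3 = 1.2020569  # numerical value of zeta(3), the limit in Frieze's theorem


def mst_total_weight(n, rng):
    """Total Kruskal MST weight of the complete graph with i.i.d. U[0,1] edge weights."""
    edges = sorted((rng.random(), u, v) for u in range(n) for v in range(u + 1, n))
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total, used = 0.0, 0
    for w, u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            total += w
            used += 1
            if used == n - 1:  # spanning tree complete
                break
    return total


def average_mst_weight(n=200, runs=5, seed=0):
    """Average total MST weight over independent samples."""
    rng = random.Random(seed)
    return sum(mst_total_weight(n, rng) for _ in range(runs)) / runs
```

For moderate `n` the average is typically close to `ZETA_3`; adding a small perturbation to each edge weight before sorting would give a toy version of the noisy setting discussed above.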

Preliminaries
We describe here the basic notions of simplicial homology, persistent homology, and point processes. We remark that, in an earlier version of the paper (see [STY17, Appendix B]), we have rephrased our topological notions in the language of matrices for an alternative and computationally convenient viewpoint.

Topological notions
We point out that we shall always choose our coefficients from a field $\mathbb{F}$. In this regard, 0 stands for the additive identity, 1 stands for the multiplicative identity and −1 for the additive inverse of 1. An often convenient choice in computational topology is $\mathbb{F} = \mathbb{Z}_2$, in which case 1 = −1.

Simplicial Homology
For a good introduction to algebraic topology, see [Hat02], and for simplicial complexes and homology, see [EH10,Mun84]. Let $K$ be a simplicial complex (see Definition 1). We assume throughout that all our simplicial complexes are defined over a finite set $V$. The 0-faces of $K$ are also called vertices. When obvious, we shall omit the reference to the underlying complex $K$ in the notation. A $d$-simplex $\sigma$ is often represented as $[v_0, \ldots, v_d]$ to explicitly indicate the subset of $V$ generating the simplex $\sigma$.
An orientation of a $d$-simplex is given by an ordering of the vertices and denoted by $[v_0, \ldots, v_d]$. Two orderings induce the same orientation if and only if they differ by an even permutation of the vertices. In other words, for a permutation $\pi$ on $\{0, \ldots, d\}$, $[v_0, \ldots, v_d] = \mathrm{sgn}(\pi)\,[v_{\pi(0)}, \ldots, v_{\pi(d)}]$, where $\mathrm{sgn}(\pi)$ denotes the sign of the permutation $\pi$. We assume that each simplex in our complex is assigned a specific orientation (i.e., ordering).
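Let $\mathbb{F}$ be a field. A simplicial $d$-chain is a formal sum of oriented $d$-simplices $\sum_i c_i \sigma_i$ with $c_i \in \mathbb{F}$. The free abelian group generated by the $d$-chains is called the $d$-th chain group and is denoted by $C_d(K)$. Formally, $C_d(K) := \{\sum_i c_i \sigma_i : c_i \in \mathbb{F}, \sigma_i \in F_d(K)\}$.
For $d \geq 1$, the boundary operator $\partial_d : C_d(K) \to C_{d-1}(K)$ is first defined on each oriented $d$-simplex by $\partial_d [v_0, \ldots, v_d] := \sum_{i=0}^{d} (-1)^i [v_0, \ldots, \hat{v}_i, \ldots, v_d]$, where $\hat{v}_i$ indicates that $v_i$ is omitted, and then extended linearly to $C_d(K)$. It can be verified that $\partial_d$ is a linear map of vector spaces and, more importantly, that $\partial_{d-1} \circ \partial_d = 0$ for all $d \geq 1$, i.e., the boundary of a boundary is zero. When the context is clear, we will drop the dimension $d$ from the subscript of $\partial_d$.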
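The identity "boundary of a boundary is zero" can be verified mechanically on small examples. The sketch below assumes the standard alternating-sum boundary formula and, for illustration only, uses integer coefficients and a hypothetical dictionary representation of chains; the boundary of a vertex is taken to be the empty simplex, which matches the reduced convention.

```python
from collections import defaultdict


def boundary(chain):
    """Boundary of a chain given as {simplex (sorted tuple of vertices): coefficient}."""
    out = defaultdict(int)
    for simplex, coeff in chain.items():
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]   # omit the i-th vertex
            out[face] += (-1) ** i * coeff         # alternating sign
    return {s: c for s, c in out.items() if c != 0}


# Boundary of the oriented triangle [0, 1, 2].
TRIANGLE_BOUNDARY = boundary({(0, 1, 2): 1})
```

Applying `boundary` twice to any chain, for instance to the tetrahedron `{(0, 1, 2, 3): 1}`, returns the empty dictionary, i.e., the zero chain.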
Note that the free abelian group of d-chains is defined only using F d (K). When we use a subset S ⊂ F d (K) of d-faces rather than the entire collection of d-faces to generate the free abelian group, we shall use C d (S) to denote the corresponding free abelian (sub)group of d-chains. In other words, The d-th boundary space denoted by B d is im ∂ d+1 and the d-th cycle space Z d is ker ∂ d . Elements of Z d are called cycles or d-cycles to be more specific. The d-dimensional (reduced) 5 homology group is then defined as the quotient group Again, since we are working with field coefficients, B d , Z d and H d are all F-vector spaces. The bases of these vector spaces form a matroid ( [Oxl03,Wel76]). This implies that certain concepts such as the span of a generating set and properties such as the exchange property automatically hold. While it is not necessary for understanding our results, a familiarity with matroids is helpful. The d-th Betti number of the complex β d (K) is defined to be the rank of the vector space H d . Respectively, let b d (K) := β(B d ) and z d (K) := β(Z d ) denote the ranks of the d-th boundary and d-th cycle spaces, respectively. Thus, we have that Note that we drop the adjective reduced henceforth, but all our homology groups and Betti numbers are indeed reduced ones. Some authors prefer to useH d andβ d to denote reduced homology groups and Betti numbers respectively, but we refrain from doing so for notational convenience. However, under such a notation, we note that β d −β d = 1[d = 0]. This gives an easy way to translate results for reduced Betti numbers to Betti numbers and vice-versa. 
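Since all vector spaces here are over a field, Betti numbers reduce to rank computations. The following is a minimal sketch over $\mathbb{Z}_2$, with hypothetical function names and a hypothetical encoding of boundary columns as integer bitmasks; it computes the reduced Betti number as the rank of the cycle space minus the rank of the boundary space, treating the empty face as the single generator in dimension −1.

```python
def gf2_rank(columns):
    """Rank over Z_2 of columns encoded as integer bitmasks."""
    basis = {}
    for col in columns:
        while col:
            lead = col.bit_length() - 1
            if lead not in basis:
                basis[lead] = col
                break
            col ^= basis[lead]
    return len(basis)


def boundary_rank(faces, k):
    """Rank of the Z_2 boundary map from k-chains to (k-1)-chains."""
    k_faces = [f for f in faces if len(f) == k + 1]
    if k < 0 or not k_faces:
        return 0
    lower = sorted({f[:i] + f[i + 1:] for f in k_faces for i in range(len(f))})
    index_of = {s: j for j, s in enumerate(lower)}
    cols = []
    for f in k_faces:
        mask = 0
        for i in range(len(f)):
            mask ^= 1 << index_of[f[:i] + f[i + 1:]]
        cols.append(mask)
    return gf2_rank(cols)


def reduced_betti(faces, d):
    """Reduced d-th Betti number over Z_2: rank of cycles minus rank of boundaries."""
    n_d = sum(1 for f in faces if len(f) == d + 1)
    z_d = n_d - boundary_rank(faces, d)   # rank-nullity for the d-th boundary map
    b_d = boundary_rank(faces, d + 1)
    return z_d - b_d


# Hypothetical examples: a hollow triangle and two isolated points.
HOLLOW = {(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)}
TWO_POINTS = {(0,), (1,)}
```

On `HOLLOW` the sketch gives reduced Betti numbers 0 and 1 in dimensions 0 and 1, and filling in the triangle `(0, 1, 2)` makes both vanish.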
We denote the Euler-Poincaré characteristic by $\chi$ and the Euler-Poincaré formula holds as follows: An important property of homology groups that is often of use is the following: if $K_1, K_2$ are two complexes such that the function $h : K_1 \to K_2$ is a simplicial map, then there exists a homomorphism $h_* : H_d(K_1) \to H_d(K_2)$, called the induced homomorphism between the homology groups. If $K_1 \subseteq K_2$, then a natural simplicial map is the inclusion map from $K_1$ to $K_2$. The case of multiple inclusions now brings us to persistent homology.

Persistent Homology
Put differently, the filtration $\{K(t) : t \in \mathbb{R}\}$ describes how to build $K$ by adding collections of simplices at a time. For a more complete introduction to and survey of persistent homology, see [EH10, Car09, Car14]. We now describe the natural filtration associated with weighted simplicial complexes. Consider a simplicial complex $K$ weighted by $w : K \to \mathbb{R}$ satisfying $w(\sigma) \le w(\tau)$ whenever $\sigma, \tau \in K$ and $\sigma \subset \tau$. Functions having this property are called monotonic functions in [EH10, Chapter VIII]. As $w$ is monotone, $\{K(t) : t \in \mathbb{R}\}$ with $K(t) := w^{-1}(-\infty, t]$ forms a sublevel set filtration of $K$. Further note that $w$ induces a partial order on the faces of $K$. Assuming the axiom of choice, this partial order can always be extended to a total order [Szp30]. Let $<_l$ denote one such total order. We make the standing assumption that for a given weight function $w$, the same total order $<_l$ is chosen and used throughout the paper. One can now view the above sublevel set filtration associated with $(K, w)$ in a dynamic fashion: as the parameter $t$ evolves over $\mathbb{R}$, $K$ gets built one face at a time respecting the total order $<_l$. In this way, with the addition of faces, the topology of $K$ evolves. Here $K(\sigma^-)$ denotes the complex right before the face $\sigma$ is to be added. Thus, given a monotonic weight function $w$, we can construct a filtration with respect to the chosen total order $<_l$. We shall call this filtration the canonical filtration associated with the total order $<_l$, or a linear filtration of the weight function $w$.
To track the changes in topology, akin to the definition for homology given in (6), we define the $(t_1, t_2)$-persistent homology group as the quotient group
$$H_d^{t_1, t_2} := \frac{Z_d(K(t_1))}{B_d(K(t_2)) \cap Z_d(K(t_1))}.$$
The information for all pairs $(t_1, t_2)$ can be encoded in a unique interval representation called a persistence barcode [ZC05] or, equivalently, a persistence diagram [CSEH07]. Before giving the definition, we first note that for a finite simplicial complex endowed with a total ordering $<_l$, we can reindex the filtration by assigning a natural number to each simplex. We refer to this as a discrete weight $w_{\mathbb{N}}$ corresponding to the monotonic weight function $w$, i.e., $w_{\mathbb{N}}(\sigma) < w_{\mathbb{N}}(\tau)$ iff $\sigma <_l \tau$. Note that there is a bijection between total orders $<_l$ and weight functions $w_{\mathbb{N}}$. Thus, the discrete weight has a natural, well-defined projection $\pi$ back to the original function values.

Definition 9. Given a simplicial complex $K$ with a monotonic function $w$ and the corresponding discrete weight $w_{\mathbb{N}} : K \to \mathbb{N}$, the $d$-th persistence diagram $\mathrm{Dgm}(K, w_{\mathbb{N}})$ is the multiset of points in the extended grid $\bar{\mathbb{N}}^2$ such that each point $(i, j)$ in the diagram represents a distinct class (i.e., a topological feature) that is born at index $i$ and dies at index $j$.

This differs from the typical definition of a persistence diagram, where the existence and uniqueness of the persistence diagram is defined in terms of an algebraic decomposition into interval modules; see [CDSGO16, CB15]. For technical reasons, this approach generally discards the points on the diagonal, i.e., topological features which are both born and die at time $t$. In the above definition, the total order guarantees that there are no points on the diagonal of the discrete filtration. However, since we deal with the restricted setting of piecewise constant functions on finite simplicial complexes, we do not lose any information; indeed, we keep more of the chain-level information. We then transform the persistence diagram back to the original monotone function.
After the transformation, points may lie on the diagonal and, as we shall see, we do require these points.
Our definition is used implicitly in [ZC05], which first identified the algebraic decomposition as a consequence of the structure theorem of finitely generated modules over a principal ideal domain. This applies in this setting since the homology groups of finite simplicial complexes are always finitely generated. Therefore, we could have equivalently defined the diagram using the decomposition directly, as done in Corollary 3.1 in [ZC05], as the modified Smith Normal Form of the boundary operator [SVJ13]. We believe that our definition is more accessible to a non-algebraic audience and it is included for completeness. More important for us, however, are the birth and death times defined below.
Definition 10. The death times (respectively birth times) of the filtration associated with (K, w) are equal to the multiset of y-coordinates (x-coordinates) of points in Dgm(K, w).
We now discuss the notion of negative and positive faces which are vital to our proofs.
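Lemma 11. ([DE93, Section 3]) Let $K$ be a simplicial complex on vertex set $V$ and $\sigma \subset V$ be a set of cardinality $d + 1$ for some $d \ge 0$. Additionally, assume that $\sigma \notin K$ but $\partial\sigma \in C_{d-1}(K)$. Then, $\beta_j(K \cup \sigma) = \beta_j(K)$ for all $j \notin \{d - 1, d\}$. Further, one and only one of the following two statements holds:
1. $\beta_{d-1}(K \cup \sigma) = \beta_{d-1}(K) - 1$;
2. $\beta_d(K \cup \sigma) = \beta_d(K) + 1$.

From the definition of the cycle and boundary spaces, the above two numbered statements can be interpreted equivalently in the following manner, which shall be useful for us:
$$\partial\sigma \notin \partial(C_d(K)), \ \text{ equivalently } \ \partial(C_d(K)) \subsetneq \partial(C_d(K \cup \sigma)); \tag{9}$$
$$\partial\sigma \in \partial(C_d(K)). \tag{10}$$

Definition 12 (Positive and Negative faces). Let $K$ be a complex with vertex set $V$ and $\sigma \subset V$ be a set of cardinality $d + 1$ for some $d \ge 0$. Further assume that $\sigma \notin K$ but $\partial\sigma \in C_{d-1}(K)$. Then $\sigma$ is called a negative face w.r.t. $K$ if (9) holds and a positive face w.r.t. $K$ if (10) holds.

This is useful for understanding how the topology evolves in the filtration associated with $w$ (recall (8)). If $\sigma$ is a $d$-face, then Lemma 11 shows that the relationship between the topology of the setup before and after addition of $\sigma$ is as follows: (i) $\beta_j(K(\sigma^-) \cup \sigma) = \beta_j(K(\sigma^-))$ for all $j \notin \{d - 1, d\}$; (ii) one and exactly one of the following is true:
$$\beta_{d-1}(K(\sigma^-) \cup \sigma) = \beta_{d-1}(K(\sigma^-)) - 1, \tag{11}$$
$$\beta_d(K(\sigma^-) \cup \sigma) = \beta_d(K(\sigma^-)) + 1. \tag{12}$$
As in Definition 12, when (11) holds (respectively (12) holds), $\sigma$ will be called a negative face (positive face) w.r.t. the natural filtration of $(K, w)$. We emphasize that the total order $<_l$ uniquely determines the label of faces as either positive or negative. The above discussion can be neatly converted to an algorithm to generate birth and death times of the persistence diagram with respect to a given linear filtration of the weight function $w$.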
This algorithm (Algorithm 1) is a simplification of the persistence algorithm in [ELZ02, Fig. 5], which also used negative and positive simplices. The simplification essentially lies in ignoring the information about the pairing between the birth and death times. The equivalence of negative faces with death times (and hence positive faces with birth times) was established in [ZC05, Fig. 9]. These algorithms extended the incremental algorithm for computing Betti numbers in [DE93]. We summarize the algorithm, for ease of future reference, as follows. Let $\sigma$ be a $d$-face in $K$.
Algorithm 1 Incremental Persistence Algorithm
Input: K, w
Main Procedure:

Remark 13. As already explained, if $K$ is a weighted simplicial complex, there is a unique total ordering of the faces if the weight function is injective. Otherwise, it is only a partial ordering. However, this partial ordering can be extended to a total order. This correspondence between monotonic weights and total orders shall be used to simplify many of our proofs. We shall often prove statements for weighted simplicial complexes with unique weights and appeal to this correspondence in extending the proof to general monotonic weight functions. Equivalently, one can prove results for $w_{\mathbb{N}}$ and then use the natural projection $\pi$ to obtain the corresponding result for the monotonic weight function $w$.
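As an illustration of the incremental procedure, here is a minimal Python sketch that labels each face as positive or negative and records the corresponding (unpaired) birth and death times. It is a sketch under stated assumptions, not the authors' implementation: coefficients are taken in $\mathbb{Z}/2$, the input is assumed to be already sorted according to the chosen total order, and the function name `incremental_persistence` is hypothetical. Because homology is reduced, the first vertex is recorded as a death in dimension $-1$.

```python
def incremental_persistence(filtration):
    """Label faces as positive/negative along a linear filtration.

    `filtration` is a list of (weight, simplex) pairs in the chosen total
    order; each simplex is a sorted tuple of vertices and every face is
    assumed to appear before its cofaces. Coefficients are in GF(2).
    Returns (births, deaths): dicts mapping a dimension to a list of weights.
    """
    index = {(): 0}   # bit position of each face; () is the empty face
    pivots = {}       # leading bit -> reduced boundary vector (a basis)
    births, deaths = {}, {}
    for weight, s in filtration:
        k = len(s) - 1
        index[s] = len(index)
        # Boundary of s as a bitmask over its codimension-one faces.
        v = 0
        for i in range(len(s)):
            v ^= 1 << index[s[:i] + s[i + 1:]]
        # Reduce against the current basis of boundaries.
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                break
            v ^= pivots[h]
        if v:
            # Boundary is independent: negative face, a (k-1)-class dies.
            pivots[v.bit_length() - 1] = v
            deaths.setdefault(k - 1, []).append(weight)
        else:
            # Boundary already in the span: positive face, a k-class is born.
            births.setdefault(k, []).append(weight)
    return births, deaths
```

On a filled triangle with vertex weights 0, edge weights 1, 2, 3 and triangle weight 4, the sketch reports deaths in dimension 0 at times 1 and 2, a birth in dimension 1 at time 3, and a death in dimension 1 at time 4.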

Spanning acycles
As made clear in the title, the other key object of our study is the spanning acycle, which has already been introduced in Definition 2. We now discuss the definition in more detail. Apart from being more restrictive than that in [HS17, Kal83], our definition differs from that of [HS17] in its use of field coefficients rather than integer coefficients. Clearly, in the case of $d = 1$, $S$ is a spanning tree on the graph $K^1$. Strictly speaking, the above definition is that of a $d$-spanning acycle, but since in most cases the dimension $d$ will be clear from the context, we shall not always explicitly refer to the dimension $d$. Recall that for any $S \subseteq K$, $w(S) = \sum_{\sigma \in S} w(\sigma)$ denotes the weight of $S$. Denoting the set of $d$-spanning acycles of $K$ by $\mathcal{S}^d(K)$, $S_0 \in \mathcal{S}^d(K)$ is a minimal spanning acycle if
$$w(S_0) = \min_{S \in \mathcal{S}^d(K)} w(S).$$
Spanning trees and, more generally, connectivity in the case of graphs can be extended in a multitude of ways to higher dimensions. Betti numbers and acycles represent one possible (and indeed a very satisfying) way to generalize to higher dimensions. Another common generalization is via the notion of a hypergraph. In this context, one can define a hypergraph on a simplicial complex by considering all the faces as hyper-edges. We will not use hypergraph connectivity in this paper; we only remark that studying the hypergraph connectivity of spanning acycles yields interesting results.

Remark 14. We would like to highlight one more interpretation of spanning acycles before continuing. As will no doubt be known or obvious to experts in the field, an alternative view of a spanning acycle is as a basis for the space of boundaries. Indeed, Algorithm 1 maintains a basis and, for each insertion, checks whether the boundary of a simplex is in the span of the current basis or not. If it is linearly independent, the simplex (or, more accurately, its weight) is added to the list of death times; otherwise it is added to the list of birth times.

Probabilistic notions
We give here a brief introduction to point processes on R.
Let $m_n, m \in M_p(\mathbb{R})$. We will say that m
2. For any $A \subseteq \mathbb{R}$, $\mathcal{P}(A)$ is a Poisson random variable with mean $\int_A \mu(x)\,dx$.
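Definition 16. Let $\mathcal{P}_n, \mathcal{P}$ be point processes on $\mathbb{R}$, not necessarily defined on the same probability space. We will say that $\mathcal{P}_n$ converges weakly to $\mathcal{P}$, denoted $\mathcal{P}_n \Rightarrow \mathcal{P}$, if
$$\lim_{n \to \infty} E[f(\mathcal{P}_n)] = E[f(\mathcal{P})]$$
for all continuous and bounded $f : (M_p(\mathbb{R}), \mathcal{M}_p(\mathbb{R})) \to \mathbb{R}$. This is equivalent to saying $\lim_{n \to \infty} \Pr\{\mathcal{P}_n \in A\} = \Pr\{\mathcal{P} \in A\}$ for all $A \in \mathcal{M}_p(\mathbb{R})$ such that $\Pr\{\mathcal{P} \in \partial A\} = 0$. Here $\partial A$ denotes the boundary of $A$.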
An alternative topology on M p (R) that arises naturally in computational topology is the so-called bottleneck distance d B . Note that we require a modified definition for point measures in R rather than the more standard definition for persistence diagrams (e.g. [CCSG + 09, EH10]). Though this is not a metric in the classical sense, taking min{d B , 1} we obtain a metric on M p (R). More importantly, the topology induced by d B and min{d B , 1} are the same. We shall prove in Lemma 42 that this topology is stronger than that of vague topology.

Minimal spanning acycles
Our main goal here is to derive Theorems 3 and 4. Additionally, we introduce several relevant combinatorial properties of (minimal) spanning acycles. To avoid tedium, we do not always single out the results for the case of minimal spanning tree, i.e., the d = 1 case; since these are classical results, one can refer to [CLR09,Wik16] for graph-theoretic (and expectedly simpler) proofs. Some of these results are direct consequences of the fact that the space of boundaries is a vector space and, hence, allows for a natural matroid to be defined. Others, such as Kruskal's algorithm are folklore, but are included here for completeness as they do not appear in the literature for acycles.

Basic Properties
Our first aim here is to show that a $d$-spanning acycle exists if and only if $\beta_{d-1}(K) = 0$. We establish this via a series of results. As introduced in Definition 2, a maximal acycle is the natural analogue of a spanning forest; note, however, that we mainly focus on spanning acycles in this paper. We begin by showing that if a spanning acycle exists for a complex $K$, then $\beta_{d-1}(K) = 0$. If $S$ is a spanning acycle, then $\beta_{d-1}(K^{d-1} \cup S) = 0$ by definition. That this extends to the complete skeleton follows from the following corollary of Lemma 11.

Corollary 18. Let $K$ be a simplicial complex with $S_1 \subset S_2 \subset \mathcal{F}^d$. Then, for any $j \ge d$,
$$\beta_{d-1}(K^{d-1}) \ge \beta_{d-1}(K^{d-1} \cup S_1) \ge \beta_{d-1}(K^{d-1} \cup S_2) \ge \beta_{d-1}(K^d) = \beta_{d-1}(K^j) = \beta_{d-1}(K).$$

Proof. By Lemma 11, adding a $d$-simplex either increases $\beta_d$ or decreases $\beta_{d-1}$. Since the inequalities only concern $\beta_{d-1}$, $S_1 \subset S_2 \subset \mathcal{F}^d$ implies the first three inequalities. The equalities $\beta_{d-1}(K^d) = \beta_{d-1}(K^j) = \beta_{d-1}(K)$ follow from the property of simplicial complexes that, for every simplex, all of its faces must be contained in the complex. Hence, adding simplices of dimension higher than $d$ cannot change $\beta_{d-1}$.
It remains to show that β d−1 (K) = 0 implies the existence of a spanning acycle. We omit the case where F d is empty as the d-spanning acycle is simply the empty set in this case. We begin by proving the following fact, which states that a positive simplex remains positive under simplex addition and a negative simplex remains negative under deletion. This is nothing but a restatement that the span of a basis is non-decreasing under the addition of elements. It however will be useful in the proof of correctness of Kruskal's algorithm.
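Lemma 19. Let $K$ be a simplicial complex, $\sigma \in K$ be a $d$-face, and let $S_1 \subseteq S_2 \subseteq \mathcal{F}^d$ be such that $\sigma \notin S_2$. If $\sigma$ is a positive face w.r.t. $K^{d-1} \cup S_1$, then $\sigma$ is a positive face w.r.t. $K^{d-1} \cup S_2$. Equivalently, if $\sigma$ is a negative face w.r.t. $K^{d-1} \cup S_2$, then $\sigma$ is a negative face w.r.t. $K^{d-1} \cup S_1$.

Proof. If $\sigma$ is a positive simplex w.r.t. $K^{d-1} \cup S_1$, then (10) implies $\partial\sigma \in \partial(C_d(K^{d-1} \cup S_1))$. Since $\partial(C_d(K^{d-1} \cup S_1)) \subseteq \partial(C_d(K^{d-1} \cup S_2))$, it follows from (10) that $\sigma$ is a positive simplex for $K^{d-1} \cup S_2$ as well. The second statement is simply the contrapositive, since a simplex must be either positive or negative.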
The second fact we need is a characterization of positive simplices: if a set of $k$-simplices does not decrease $\beta_{k-1}$, then they are positive.
Lemma 20. Let S ⊆ F d be such that β d−1 (K d−1 ∪ S) = β d−1 (K). Then any σ ∈ F d \ S is a positive face w.r.t. K d−1 ∪ S. In particular, this holds when S is a maximal acycle.
Together, the above results imply that we can always find a simplex which will decrease the (d − 1)-th Betti number whenever it is greater than that of the whole complex.
As an acycle corresponds to a basis in matroid, every maximal acycle of d-faces has a constant cardinality (i.e., the rank of the space of boundaries). We shall prove this independently below. Let by applying the Euler-Poincaré formula (7) and then using the definition of γ d (K) yields where the latter equality follows from Corollary 18. Another use of Euler-Poincaré formula yields the following result.
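Lemma 22. ([DKM16, Proposition 2.13]) For a simplicial complex $K$ and a subset $S \subseteq \mathcal{F}^d$ of $d$-faces, any two of the following three statements imply the third.
1. $\beta_d(K^{d-1} \cup S) = 0$.
2. $\beta_{d-1}(K^{d-1} \cup S) = \beta_{d-1}(K)$.
3. $|S| = \gamma_d(K)$.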

Proof. By applying the Euler-Poincaré formula to
and rearranging the terms, we derive the identity: Separately, from Corollary 18, we have β d−1 (K d ) = β d−1 (K). From this, the desired result is easy to see.
From the above Lemma, we also have that the cardinality of every maximal acycle is γ d (K). We can now prove the existence result for spanning acycles.
Proof. From (16), we obtain the identity Since $\gamma_d(K) \ge 0$, it follows from Lemma 21 that, starting with an empty set, we can inductively construct a set $S \subset \mathcal{F}^d$ such that $|S| = \gamma_d(K)$ and $\beta_{d-1}(K^{d-1} \cup S) = \beta_{d-1}(K)$. By Lemma 22, it follows that $\beta_d(K^{d-1} \cup S) = 0$, implying that $S$ is an acycle with $|S| = \gamma_d(K)$, as desired. If $\beta_{d-1}(K) = 0$, then this $S$ is also a spanning acycle.
We now provide a condition for uniqueness of minimal spanning acycles and, towards deriving the same, we first establish the exchange property of spanning acycles.
Lemma 24 (Exchange property). Let S ⊂ F d be a spanning acycle of a simplicial complex K and let σ ∈ F d \ S. Then, for any d-face σ 1 ∈ S such that σ 1 is part of a d-cycle containing σ, S ∪ σ \ σ 1 is also a spanning acycle.
Suppose that for σ 1 ∈ S, S ∪ σ \ σ 1 is not a spanning acycle. Then by Lemma 11, we obtain that Clearly, C 1 ⊂ S as S is a spanning acycle. Hence σ ∈ C 1 and we derive that for some collection of non-zero a τ ∈ F, τ ∈(C 1 ∩S) a τ ∂τ = −∂σ. Setting a τ = 0 for τ ∈ C 1 \ C and similarly for a τ , we derive that But since S is a spanning acycle, the above implies that ∀τ ∈ (C ∪ C 1 ) ∩ S, a τ = a τ and hence C 1 = C . So, we have that σ 1 / ∈ C if S ∪ σ \ σ 1 is not a spanning acycle. By contraposition, we have that if σ 1 ∈ C , then S ∪ σ \ σ 1 is a spanning acycle.
Lemma 25 (Uniqueness). Let K be a simplicial complex weighted by w : K → R which is injective on F d . If a minimal spanning acycle exists, then it must be unique.
Proof. Suppose that $S$ and $M$ are two distinct minimal spanning acycles. Let $\sigma$ be the $d$-face with least weight such that $\sigma \in S \triangle M$ and, without loss of generality, assume $\sigma \in S$. Then there is a $d$-cycle $C \subset M \cup \sigma$ such that $\sigma \in C$. Since $C \not\subset S$, there exists a $d$-face $\sigma_1 \in M \setminus S$ that is part of a $d$-cycle containing $\sigma$. By the choice of $\sigma$, $w(\sigma_1) > w(\sigma)$. From Lemma 24, $M \cup \sigma \setminus \sigma_1$ is a spanning acycle. But $w(M \cup \sigma \setminus \sigma_1) < w(M)$, a contradiction.
Remark 26. Suppose the weight function $w$ is not injective on $\mathcal{F}^d$ but nevertheless monotonic on $K$. Then, as discussed in Remark 13, this weight function yields a total order on $K$ and so on $\mathcal{F}^d$ as well. In such a case, the above lemma guarantees that the minimal spanning acycle is unique with respect to the chosen total order.

Kruskal's algorithm
The classical Kruskal's [Kru56] algorithm helps find minimal spanning trees. We now discuss its generalization that will be useful for finding minimal spanning acycles. Generally, greedy algorithms exist to output a minimal basis for matroids [Wel76,Chapter 19] and the following result can be considered folklore. However, we make use of this repeatedly throughout the remainder of the paper and so we provide a self-contained proof.
Let K be a simplicial complex weighted by w : K → R. By Lemma 11, every σ ∈ F d is either positive or negative, but not both, with respect to a subcomplex K 1 such that K d−1 1 = K d−1 and σ / ∈ K 1 . Using this, we give the simplicial Kruskal's algorithm below.
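Algorithm 2 Simplicial Kruskal's Algorithm
Main Procedure: Set $S = \emptyset$ and consider the $d$-faces $\sigma \in \mathcal{F}^d$ one at a time in increasing order of weight:
• if $\sigma$ is negative w.r.t. $K^{d-1} \cup S$, then add $\sigma$ to $S$.
Output: $M = S$.

Lemma 27. Let $K$ be a weighted simplicial complex with $\beta_{d-1}(K) = 0$ and let $M$ be the output of the simplicial Kruskal's algorithm. Then, $M$ is a minimal spanning acycle.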
Proof. We shall assume that the weight function w is injective. For the general case, similar arguments can be carried out by using Remarks 13 and 26. From Lemmas 23 and 25, it follows that there is a unique minimal spanning acycle which we denote by M 1 . We now show that M is a spanning acycle. Clearly, β d (K d−1 ) = 0 and by our algorithm and Lemma 11, it remains the same at every stage of the algorithm and so β d (K d−1 ∪M ) = 0, proving that M is an acycle. Clearly, each face in F d \ M is positive with respect to K d−1 ∪ M . Hence, M is spanning as using Lemma 19 we have that For the proof of minimality, we argue as in the Kruskal's algorithm for minimal spanning tree. We prove that at any stage of the algorithm, S ⊆ M 1 . Assuming that the above claim is true, M ⊆ M 1 . Since M and M 1 are both spanning acycles, M 1 = M as desired.
It remains to prove that $S \subseteq M_1$ at any stage. We use induction for the same. Trivially, this is true for $S = \emptyset$. Suppose that the claim holds for $S$ at some stage of the algorithm, i.e., $S \subset M_1$ but $S \ne M$. This implies that there exists a $d$-face in $\mathcal{F}^d \setminus S$ which is negative w.r.t. $K^{d-1} \cup S$ and hence, from Lemma 20, $S \ne M_1$. Let $\sigma$ be the next face that is added to $S$ and suppose that $\sigma \notin M_1$. Clearly, $\beta_d(K^{d-1} \cup M_1 \cup \sigma) = 1$. Hence, there exists a $d$-cycle in $K^{d-1} \cup M_1 \cup \sigma$ whose support $C$ contains $\sigma$. Since $\beta_d(K^{d-1} \cup S \cup \sigma) = 0$, $C \not\subseteq S \cup \sigma$ and so there exists $\sigma_1 \in C \cap M_1 \setminus S$. Clearly, either $w(\sigma) < w(\sigma_1)$ or $w(\sigma) > w(\sigma_1)$, as $w$ is injective. Suppose that $w(\sigma) < w(\sigma_1)$. By the exchange property of matroids, it follows that $M_1 \cup \sigma \setminus \sigma_1$ is a spanning acycle with $w(M_1 \cup \sigma \setminus \sigma_1) < w(M_1)$, a contradiction. Suppose instead that $w(\sigma) > w(\sigma_1)$. Since $S \subsetneq M_1$, $\sigma_1 \in M_1 \setminus S$, and $M_1$ is a spanning acycle, it follows from Lemma 19 that $\sigma_1$ is negative w.r.t. $K^{d-1} \cup S$. Thus, the algorithm would have chosen $\sigma_1$ before $\sigma$, a contradiction. The desired claim now follows.
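The greedy procedure analysed in this proof can be sketched in a few lines of Python. This is only an illustrative sketch under assumptions not made in the paper: coefficients are in $\mathbb{Z}/2$, ties in weight are broken lexicographically by the vertex tuple (one admissible extension to a total order), and the name `simplicial_kruskal` is hypothetical. A face is negative w.r.t. $K^{d-1} \cup S$ exactly when its boundary is not in the span of the boundaries of the faces already in $S$, which is what the reduction step tests.

```python
def simplicial_kruskal(weighted_faces):
    """Greedy construction of a minimal maximal acycle over GF(2).

    `weighted_faces` is an iterable of (weight, face) pairs, where each face
    is a sorted tuple of d + 1 vertices. Returns the chosen (weight, face)
    pairs in the order they were added. The result is a spanning acycle
    whenever beta_{d-1}(K) = 0.
    """
    index = {}    # (d-1)-face -> bit position, assigned on first sight
    pivots = {}   # leading bit -> reduced boundary vector of a chosen face
    acycle = []
    for weight, s in sorted(weighted_faces):
        # Boundary of s as a bitmask over its (d-1)-faces.
        v = 0
        for i in range(len(s)):
            face = s[:i] + s[i + 1:]
            v ^= 1 << index.setdefault(face, len(index))
        # Reduce against the boundaries of the faces chosen so far.
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                break
            v ^= pivots[h]
        if v:
            # Negative face: its boundary enlarges the boundary space.
            pivots[v.bit_length() - 1] = v
            acycle.append((weight, s))
    return acycle
```

For $d = 1$ this reduces to the classical Kruskal's algorithm on a graph. For $d = 2$ on the four triangles of the boundary of a tetrahedron with distinct weights, the three lightest triangles are selected and the heaviest one is rejected, since it closes a 2-cycle.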
As with minimal spanning trees, the Kruskal's algorithm has a number of useful consequences. We conclude this section with a definition of a (topological) notion of a cut for a simplicial complex and show that it has the desired properties which will prove useful in Section 4.1.
As expected, this definition yields a corresponding cut property.
Lemma 29 (Cut Property). Let $K$ be a weighted simplicial complex with $\beta_{d-1}(K) = 0$. Let $C \subseteq \mathcal{F}^d$ be a cut. Then $C \cap S \ne \emptyset$ for any spanning acycle $S$, and every minimum weight face in $C$ belongs to some minimal spanning acycle.
Proof. Let S be a spanning acycle and suppose that C ∩S = ∅. On one hand, because S is spanning, β d−1 (K d−1 ∪S) = 0. On the other hand, since C is a cut, we have β d−1 (K−C ) > 0. The latter, when combined with the second inequality in Corollary 18 and the fact that This leads to a contradiction and, thus, the first conclusion holds. Now for the second part. Let σ 1 be a minimum weight face in the cut C and let < l be a total order in which this is the unique minimum weight face in the cut C . Consider the simplicial Kruskal's algorithm under this < l and let S 1 be the acycle constructed when σ 1 is the minimum weight face in F . Clearly K d−1 ∪ S 1 ⊆ K − C . Setting C 1 = C \ σ 1 , the cut property implies that Thus σ 1 is negative w.r.t. K−C and, by Lemma 19, is also negative w.r.t. K d−1 ∪S 1 . Hence σ 1 will be added to the minimal spanning acycle by the simplicial Kruskal's algorithm.
We note that this agrees with the graph notion of a cut. This will prove useful when considering extremal faces. We conclude with the following consequence. Let K be a simplicial complex and let τ ∈ F d−1 . Then σ ∈ F d is said to be a coface of τ, if τ ⊂ σ.
Since the set of all cofaces of a (d − 1)-face forms a cut, the below result is immediate.

Persistence diagrams and minimal spanning acycles
In this section, we prove the connection between persistence diagrams and minimal spanning acycles (Theorem 3) and some consequences. Though this correspondence is striking in its simplicity and completely consistent with the minimal spanning tree case, we will see that it has some non-trivial consequences in the study of weighted complexes.
The minimal spanning acycle represents the persistence boundary basis w.r.t. the sublevel set filtration induced by weights on the simplices. This is explicit from the incremental algorithm (Algorithm 1). From the decomposition of a filtration into a persistence diagram, it follows that a positive simplex generates a new homology class and hence forms a new cycle, while a negative simplex bounds an existing non-trivial homology class and hence is a boundary. Our proof will make this idea precise.
Proof of Theorem 3. We only prove the result for death times D since the result for birth times B is then immediate. This is because, on one hand, every d-simplex is either positive or negative with respect to K(σ − ) (see (11) and (12)). On the other hand, by the incremental algorithm (Algorithm 1), negative simplices correspond to death times and positive simplices correspond to birth times (13).
We again only consider the case when the filtration values are unique and appeal to Remark 13 to complete the proof in the general case. Note that, in the general case, we use the same total ordering for the incremental algorithm (Algorithm 1) generating death and birth times as well as the simplicial Kruskal's algorithm (Algorithm 2).
By uniqueness of weights on $\mathcal{F}^d$, Kruskal's algorithm gives us the minimal spanning acycle $M$. Firstly, by the relation (9), the condition to add $\sigma$ to $S$ in Kruskal's algorithm is equivalent to $\partial(C_d(S)) \subsetneq \partial(C_d(S \cup \sigma))$. Similarly, the incremental algorithm adds Let $c$ be a non-trivial value in the filtration, i.e., there exists $\sigma \in K$ such that $w(\sigma) = c$. Let $M(c)$ denote the acycle generated by Kruskal's algorithm on $K(c)$, i.e., $M(c) := M \cap K(c)$; similarly, define the notation $M(c-)$. By the above discussion on Kruskal's algorithm and the incremental algorithm, our proof is complete if we show that $\partial(C_d(M(c))) = \partial(C_d(K(c)))$. Trivially, $\partial(C_d(M(c))) \subseteq \partial(C_d(K(c)))$ and we shall now show the other inclusion.
The above result has powerful applications for random complexes, as will be seen in the next section, but we now mention a few applications in the deterministic setting as well. As already mentioned in the introduction, we obtain [HS17, Theorem 1.1] (see (1)) as an easy corollary of our previous theorem. Further, we can easily prove a fundamental uniqueness result for minimal spanning acycles relying upon this correspondence and the uniqueness of persistence diagrams [

Lemma 31. Let $K$ be a weighted $d$-complex such that $\beta_{d-1}(K^d) = 0$ and let $M_1, M_2$ be two $d$-minimal spanning acycles in $K$. Let $c \in \mathbb{R}$. Then we have that $|\{\sigma \in M_1 : w(\sigma) = c\}| = |\{\sigma \in M_2 : w(\sigma) = c\}|$.
In the case of unique weights, the minimal spanning acycle is unique, making the above lemma trivially true. In the case of non-unique weights, the minimal spanning acycle we obtain will depend on our choice of extension to a total order. However, the above lemma states that the weights of a minimal spanning acycle will be independent of this choice.
We now give an alternative characterization of a minimal spanning acycle that follows from the proof of Theorem 3. Such a characterization of a minimal spanning tree has been very useful in the study of minimal spanning trees on infinite graphs ([LP17, Chapter 11], [Ale95, Proposition 2.1]). A similar characterization for minimal spanning tree is known as the creek-crossing criterion in [Ale95]. However, we wish to point out now that these different characterizations do not coincide even in the infinite graph case ([Ale95, Proposition 2.1]).
Lemma 32. Let K be a weighted simplicial complex with β d−1 (K) = 0. Let σ ∈ F d and M be the minimal spanning acycle with respect to a total order < l extending the partial order induced by w. Then σ ∈ M iff ∂σ / ∈ ∂(C d (K(σ − ))).

Stability result
Here, we provide a proof for Theorem 4.
Proof of Theorem 4. Again, it suffices to prove the theorem for death times; the proof for birth times is essentially identical. Secondly, due to Theorem 3, we shall prove the stability result for weights of a minimal spanning acycle. We shall also assume $0 \le p < \infty$; the extension to $p = \infty$ follows by a standard limiting argument. Let $M, M'$ be the two minimal spanning acycles corresponding to $f, f'$. We begin with the following case: $f, f'$ differ precisely on one simplex $\sigma$, with $f(\sigma) = a$, $f'(\sigma) = a'$ and $|a - a'| = c$. In this case, as we shall show later, $|M \triangle M'| \le 2$, where $\triangle$ denotes the symmetric difference between the two sets.
Below, we shall also show that |f (σ 1 ) − f (σ 2 )| c. This again shows that By a recursive application of the above case, we can prove the theorem for the general case of f, f differing in many simplices. For the rest of the proof, we shall focus only on the case of f, f differing on exactly one simplex, say σ ∈ F d , and derive the claims made above. Without loss of generality, assume that f, f assign distinct weights to distinct faces; the case of non-distinct weights can be proved by appealing again to Remarks 13 and 26.  ∞, a)). Define these notions, similarly, for notions for M .
We shall break the proof into four cases where the first two take care of the trivial cases, i.e., when M M = ∅. We shall assume that both M and M are generated by simplicial Kruskal's algorithm (Algorithm 2).  M ((a, a )) ∪ τ. The desired results are then easy to see.

Weighted random complexes
Our first aim here is to look at weighted random complexes (Definition 6) and derive our point process convergence result (Theorem 7). Our second aim is to show the other important consequence of our stability result (Corollary 8).
Towards proving Theorem 7, we first consider a special case where the weights are i.i.d. uniform on all possible d-faces and 0 elsewhere.

Random d-complex : I.I.D. uniform weights
The uniformly weighted d-complex U n,d is the randomly weighted d-complex L n,d with n ∞ = 0 and F being the uniform distribution on [0, 1] (see Definition 6); hence, φ = φ in this case. The canonical filtration associated with U n,d is {U n,d (t) : t ∈ [0, 1]}. Trivially, the well-known random d-complex Y n,d (t) defined before Lemma 5 is the same as U n,d (t) in distribution.
Fix $d \ge 1$. In this section, we show that the three point sets corresponding to $U_{n,d}$ (nearest neighbour distances, death times, and weights in the minimal spanning acycle; see below Definition 6) converge, under appropriate scaling, to a Poisson point process as $n \to \infty$.
For each σ ∈ F d−1 (U n,d ), letC(σ) := nC(σ) − d log n + log(d!) and let P C n,d be the scaled point set given by Viewing the latter as a point process, for any R ⊆ R, we set For any c ∈ R, let P C n,d (c, ∞) ≡ P C n,d ((c, ∞)). Separately, let N n,d−1 (p) denote the number of isolated (d − 1)-faces in Y n,d (p).
Since U n,d (p) has the same distribution as Y n,d (p), it follows that P C n,d (np − d log n + log(d!), ∞) has the same distribution as N n,d−1 (p). Also, whenever p n is of the form as in (2), we know from Lemma 5 that, as n → ∞, N n,d−1 (p n ) converges to Poi(e −c ), the Poisson random variable with mean e −c . From this, we have P C n,d (c, ∞) ⇒ Poi(e −c ) as n → ∞. We now extend this to a multivariate convergence, thereby proving convergence of the point processes P C n,d . Recall that P poi is the Poisson point process as in Theorem 7.
Proposition 33. As n → ∞, P C n,d converges in distribution to P poi .
Proof. Let $I := \cup_{j=1}^{m} (a_{2j-1}, a_{2j}] \subseteq \mathbb{R}$ be an arbitrary but fixed union of a finite number of disjoint intervals. Since $\mathcal{P}_{poi}$ is simple and does not contain atoms, as per Lemma 41, it suffices to prove the following two statements in order to prove weak convergence of the point process $\mathcal{P}^C_{n,d}$: In turn, to establish these two statements, we make use of the method of factorial moments, i.e., show that where, for $m \in \mathbb{N}$, the notation $m^{(\ell)} = m(m - 1) \cdots (m - \ell + 1)$, so that $E[(\mathcal{P}^C_{n,d}(I))^{(\ell)}]$ represents the $\ell$-th factorial moment of the random variable $\mathcal{P}^C_{n,d}(I)$. This suffices since Statement (i) above is precisely the $\ell = 1$ case, while Statement (ii) follows due to [vdH16, Theorem 2.4]. For a brief motivation on the method of factorial moments, see Appendix B.
The rest of the proof concerns proving (20). Let $\ell \ge 1$ be fixed. Denote the $\ell$-th factorial moment of $\mathcal{P}^C_{n,d}(I)$ by $M^{(\ell)}_{n,d}$. For $\sigma \in \mathcal{F}^{d-1}(U_{n,d})$ and $R \subseteq \mathbb{R}$, let $\mathbf{1}(\sigma; R) \equiv \mathbf{1}[C(\sigma) \in R]$, where $\mathbf{1}$ denotes the indicator function. Then, clearly, Note that if $X = 1_a + 1_b$, i.e., it is a sum of two indicators, then $X^{(2)} = 1_a 1_b + 1_b 1_a$, while $X^{(\ell)} = 0$ for all $\ell \ge 3$. On the other hand, if $X = 1_a + 1_b + 1_c$, then $X^{(2)} = 2 \times 1_a 1_b + 2 \times 1_a 1_c + 2 \times 1_b 1_c$, $X^{(3)} = 6 \times 1_a 1_b 1_c$, while $X^{(\ell)} = 0$ for all $\ell \ge 4$. Proceeding along these lines, it follows using induction on $\ell$ and linearity of expectation that

For $\boldsymbol{\sigma}, \boldsymbol{\sigma}' \in \mathcal{I}^{(\ell)}_{n,d}$, we will say that both have similar intersection type, denoted by $\boldsymbol{\sigma} \sim \boldsymbol{\sigma}'$, if there exists a permutation $\pi$ of the faces in $\boldsymbol{\sigma}'$ such that $\gamma(\boldsymbol{\sigma}) = \gamma(\pi(\boldsymbol{\sigma}'))$. It is easy to see that $\sim$ is an equivalence relation. Let $\Gamma := \{[\boldsymbol{\sigma}]\}$ denote the quotient of $\mathcal{I}^{(\ell)}_{n,d}$ under $\sim$, with $[\boldsymbol{\sigma}]$ denoting the equivalence class of $\boldsymbol{\sigma}$. Since the number of ways in which $\ell$ distinct $(d - 1)$-faces can intersect each other is finite, the number of equivalence classes in $\Gamma$, i.e., $|\Gamma|$, is upper bounded by some constant (w.r.t. $n$). Indeed, $|\Gamma|$ depends on $d$ and $\ell$, but these are fixed a priori in our setup. Lastly, note that for $\boldsymbol{\sigma} \in \mathcal{I}^{(\ell)}_{n,d}$, the cardinality of its equivalence class $|[\boldsymbol{\sigma}]|$ indeed depends on $n$.
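The indicator computations used above (for instance $X^{(2)} = 2 \times 1_a 1_b$ for a sum of two indicators) are instances of a general identity: the $\ell$-th factorial power of a sum of indicators equals the sum, over ordered $\ell$-tuples of distinct indices, of the product of the corresponding indicators. The following small Python sketch checks this identity by brute force over all 0/1 assignments; the function names are hypothetical and the check is only a sanity test, not part of the proof.

```python
from itertools import permutations, product
from math import prod


def falling_factorial(m, l):
    """m^(l) = m (m - 1) ... (m - l + 1)."""
    out = 1
    for j in range(l):
        out *= m - j
    return out


def ordered_tuple_sum(indicators, l):
    """Sum over ordered l-tuples of distinct indices of the product
    of the corresponding indicator values."""
    n = len(indicators)
    return sum(
        prod(indicators[i] for i in idx)
        for idx in permutations(range(n), l)
    )


def check_identity(n, max_l):
    """Verify X^(l) == ordered_tuple_sum for every 0/1 vector of length n,
    where X is the sum of the indicators, and every 1 <= l <= max_l."""
    for bits in product((0, 1), repeat=n):
        x = sum(bits)
        for l in range(1, max_l + 1):
            if falling_factorial(x, l) != ordered_tuple_sum(bits, l):
                return False
    return True
```

Taking expectations on both sides and using linearity is what turns the $\ell$-th factorial moment into a sum of joint probabilities over ordered tuples of distinct faces.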

Extremal death times
We now discuss death times in the persistence diagram. First, we state a lemma explaining why nearest neighbour distances approximate death times.
This lemma essentially follows from ideas in the proofs in [KP14, Theorem 1.10]. But, to the best of our knowledge, it has not been explicitly mentioned anywhere. The proof for the case $d \ge 2$ requires cohomological arguments, and hence the entire proof, along with more details on cohomology theory, is provided in Appendix C.
Let $P^D_{n,d}$ denote the set of scaled death times in $H_{d-1}(U_{n,d})$ as in the second item listed below (3). Let $c \in \mathbb{R}$ be some arbitrary but fixed constant and let $p_n$ be as defined in (2). Then, for $n$ large enough, we have $P^D_{n,d}(c, \infty) = \beta_{d-1}(U_{n,d}(p_n))$ and $P^C_{n,d}(c, \infty) = N_{d-1}(U_{n,d}(p_n))$.
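From Lemma 34, it then immediately follows that
$$\lim_{n \to \infty} \mathbb{E}\big|P^D_{n,d}(c, \infty) - P^C_{n,d}(c, \infty)\big| = 0. \tag{23}$$
Now we are ready to prove the convergence result for scaled death times.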
Proposition 35. As $n \to \infty$, $P^D_{n,d}$ converges in distribution to the Poisson point process $P_{\mathrm{poi}}$.
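Proof. Let $I := \cup_{j=1}^{m} (a_{2j-1}, a_{2j}] \subseteq \mathbb{R}$ be some finite union of disjoint intervals. Since $P_{\mathrm{poi}}$ is simple and does not contain atoms, again as per Lemma 41, to prove the desired result, it suffices to show that:
(i) $\lim_{n \to \infty} \mathbb{E}[P^D_{n,d}(I)] = \mathbb{E}[P_{\mathrm{poi}}(I)]$, and
(ii) $\lim_{n \to \infty} \mathbb{P}\{P^D_{n,d}(I) = 0\} = \mathbb{P}\{P_{\mathrm{poi}}(I) = 0\}$.
From the triangle inequality,
$$\big|\mathbb{E}[P^D_{n,d}(I)] - \mathbb{E}[P_{\mathrm{poi}}(I)]\big| \le \mathbb{E}\big|P^D_{n,d}(I) - P^C_{n,d}(I)\big| + \big|\mathbb{E}[P^C_{n,d}(I)] - \mathbb{E}[P_{\mathrm{poi}}(I)]\big|.$$
By combining this with (23) and Statement (i) from above (20), we get (i).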
The same argument also shows that |P

We shall now prove our most general point process convergence result (Theorem 7) and then describe corollaries which give simpler bounds to verify the assumptions of this result. For the proof, we shall first consider the simplicial complex $K^d_n$ weighted by $\phi$ alone, which we shall refer to as $L_{n,d}$. With respect to this $L_{n,d}$, define $C(\sigma)$, $D_i$, $M$, $P^C_{n,d}$, $P^D_{n,d}$, and $P^M_{n,d}$ exactly as below Definition 6.
Proposition 37. Suppose that $F$ is continuous. Then, the point processes $P^C_{n,d}$, $P^D_{n,d}$, and $P^M_{n,d}$ converge in distribution to $P_{\mathrm{poi}}$ as $n \to \infty$.

We need a comparison lemma to prove the main point process convergence result. The first inequality is obvious and the next two follow from Theorem 4 for $p = \infty$ and Theorem 3.
Lemma 38. For fixed $n, d \ge 1$, we have the following inequalities: where the infimum is over all possible bijections γ : where the infimum is over all possible bijections γ :

Proof of Theorem 7. We only show that $P^D_{n,d} \Rightarrow P_{\mathrm{poi}}$ as $n \to \infty$ using Lemma 38, as the other results follow similarly. Let $d_v$ be the vague metric given in (28)

Except for a few trivial cases, determining the distribution of the maximum $\|\epsilon_n\|_\infty$ is not easy and hence we give two simple corollaries to verify the bounds.
Corollary 39. For each $n$, let $\{\psi(\sigma) : \sigma \in F^d(L_{n,d})\}$ have the same distribution as the real-valued random variable $\psi$ which, for some $s > 0$, satisfies $\mathbb{E}[e^{s|\psi|}] < \infty$. Define $\epsilon_n(\sigma) = a_n^{-1} \psi(\sigma)$, where $a_n$ is a sequence such that $a_n = \omega(n \log n)$. If $F$ is Lipschitz continuous, then each of $P^C_{n,d}$, $P^D_{n,d}$, and $P^M_{n,d}$ converges in distribution to $P_{\mathrm{poi}}$.
The following corollary follows from Theorem 7 using Markov's inequality and $\|\epsilon_n\|_\infty \le \|\epsilon_n\|_1$.
Corollary 40. For each $n$, let $\{\epsilon_n(\sigma) : \sigma \in F^d(L_{n,d})\}$ be identically distributed random variables with $\mathbb{E}|\epsilon_n(\sigma)| = o(n^{-d-2})$ for each $\sigma$. If $F$ is Lipschitz continuous, then each of $P^C_{n,d}$, $P^D_{n,d}$, and $P^M_{n,d}$ converges in distribution to $P_{\mathrm{poi}}$.
Proof of Corollary 8. Fix a p ∈ {1, 2, . . . , }. Let $\pi$ be a bijection from $\{D_i\}$ to $\{D'_i\}$ achieving the infimum in Theorem 4. Due to the finiteness of the complex, such a bijection exists. Now, we derive from the mean-value theorem, Hölder's inequality and our stability result (Theorem 4) that

Chapter 3], one can check that it is Lipschitz). Then for $m_1, m_2 \in M_p(\mathbb{R})$, (28)

The following is an oft-used result to prove weak convergence of point processes.

Then $P_n \Rightarrow P$ in $M_p(\mathbb{R})$.
We now prove a lemma that will be useful when combining results from computational topology (which uses bottleneck distance) and point process theory (vague topology).
Lemma 42. The topology of bottleneck distance is stronger than that of vague topology on $M_p(\mathbb{R})$. In particular, for every $\epsilon > 0$, there exists a constant $\lambda > 0$ and a compact set $K$ such that, whenever $d_B(m_1, m) \le 1/2$, we have

Proof. We first establish (29). Let $\epsilon > 0$ be arbitrary. For any $m, m_1 \in M_p(\mathbb{R})$, it follows from (28) that we can choose $k$ (independent of $m, m_1$) such that:

Let $K_j$ be the compact support of $h_j$ and $\lambda_j$ the associated Lipschitz constant. Set $\lambda = \sum_{j=1}^{k} \lambda_j$ and $K = \cup_{j=1}^{k} K_j^{1}$, where $K^{\rho} := \{x \in \mathbb{R} : \exists\, y \in K \text{ s.t. } |x - y| \le \rho\}$. Let $m, m_1$ be such that $\delta := 2 d_B(m_1, m) \le 1$. Let $\gamma : \mathrm{supp}(m) \to \mathrm{supp}(m_1)$ be the bijection such that $\max_{x \in \mathrm{supp}(m)} |x - \gamma(x)| \le \delta$. Also, let $M = \mathrm{supp}(m)$ and $M_1 = \mathrm{supp}(m_1)$.
By the definition of bottleneck distance, we have that, for any compact set $K$,

Fix a $j \in \{1, \ldots, k\}$. By the definition of $m(h_j)$,

where in the last inequality we have used (31) and the fact that $\delta \le 1$. Substituting the above relation in (30), we get

as desired. From this, it follows that for every $m \in M_p(\mathbb{R})$ and $\epsilon > 0$, there exists $\rho$ (depending on $m$ and $\epsilon$) such that $d_B(m_1, m) \le \rho$ implies $d_v(m_1, m) \le \epsilon$, which completes the proof.
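To make the bijection $\gamma$ in the proof concrete, the following is a small sketch under simplifying assumptions that are ours: both point measures have the same finite number of points on the real line, and the bottleneck distance is taken to be the minimum over bijections of the largest displacement. In this one-dimensional setting, matching the points in sorted order attains the minimum, which the brute-force helper confirms on small inputs. The function names are hypothetical.

```python
from itertools import permutations


def bottleneck_distance_1d(xs, ys):
    """Largest displacement under the sorted matching of two equal-size point sets on the line."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many points in both sets")
    return max((abs(x - y) for x, y in zip(sorted(xs), sorted(ys))), default=0.0)


def brute_force_bottleneck(xs, ys):
    """Minimum over all bijections of the largest displacement (only feasible for tiny inputs)."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many points in both sets")
    if not xs:
        return 0.0
    return min(
        max(abs(x - y) for x, y in zip(xs, perm))
        for perm in permutations(ys)
    )
```

The sorted matching runs in $O(n \log n)$ time, whereas the brute-force search is only meant as a sanity check.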

B Method of Factorial Moments
Here, we provide a brief motivation for the method of factorial moments. First, this is very closely related to the method of moments, and both these methods are useful when the goal is to establish convergence in distribution. Formally, suppose a random variable $X$ is such that its distribution is completely characterised by its moments $\{\mathbb{E}[X^k] : k \ge 1\}$. Since polynomials are dense in the class of continuous functions, note that the above statement is true for a broad class of random variables, including the Poisson random variable. A standard result in probability theory then states that if all the moments of a sequence of random variables $(X_n)$ converge to those of $X$, then the sequence itself converges in distribution to $X$. Keeping this in mind, the idea of the method of moments is to verify whether or not $\lim_{n \to \infty} \mathbb{E}[X_n^k] = \mathbb{E}[X^k]$ for all $k \ge 1$. Now, since moments of a random variable are linear combinations of its factorial moments and vice versa, we can alternatively also work with factorial moments. In the case of a Poisson random variable, working with the latter makes a lot of sense, since the resulting expressions are much simpler than those for the corresponding moments. In particular, if $X \sim \mathrm{Poi}(\lambda)$, then $\mathbb{E}[X^{(\ell)}] = \lambda^{\ell}$.
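For completeness, the Poisson identity quoted above follows from a short computation with the probability mass function $e^{-\lambda} \lambda^{k} / k!$. The summation index $k$ is our notation, and the terms with $k < \ell$ vanish because the factorial power is then zero.

```latex
\begin{align*}
\mathbb{E}\big[X^{(\ell)}\big]
  &= \sum_{k \ge \ell} k(k-1)\cdots(k-\ell+1)\, e^{-\lambda}\,\frac{\lambda^{k}}{k!} \\
  &= \lambda^{\ell} e^{-\lambda} \sum_{k \ge \ell} \frac{\lambda^{k-\ell}}{(k-\ell)!}
   = \lambda^{\ell}.
\end{align*}
```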

C Betti numbers and Isolated faces in $Y_{n,d}(p)$
Lemma 34 is proved here. The cases $d = 1$ and $d \ge 2$ are dealt with separately, with the latter requiring cohomological arguments.
Proof of Lemma 34 for $d = 1$. Let $V_n$ be the vertex set of $Y_{n,1}(p_n)$. Since there can be at most one component of size bigger than $n/2$, for reduced $\beta_0$,
$$|\beta_0(Y_{n,1}(p_n)) - N_0(Y_{n,1}(p_n))| \le \sum_{V \subset V_n,\ 2 \le |V| \le n/2} 1_V + 1[\text{all vertices of } Y_{n,1}(p_n) \text{ are isolated}],$$
where $1_V = 1$ whenever $V$ forms a connected component in $Y_{n,1}(p_n)$ and there is no edge between a vertex in $V$ and a vertex in $V^c$. If $|V| = k$, then for all sufficiently large $n$,
$$\mathbb{E}[1_V] \le k^{k-2}\, p_n^{k-1} (1 - p_n)^{k(n-k)}.$$
This is because, when $|V| = k$, there are $k^{k-2}$ possible spanning trees in $V$, the probability of getting a particular spanning tree in $V$ is $p_n^{k-1}$, and the probability of having no edge between $V$ and $V^c$ is $(1 - p_n)^{k(n-k)}$. We say sufficiently large because $p_n$ may be negative for small $n$ if $c$ is negative. Hence, for all sufficiently large $n$,
$$\mathbb{E}|\beta_0(Y_{n,1}(p_n)) - N_0(Y_{n,1}(p_n))| \le \sum_{k=2}^{n/2} \binom{n}{k} k^{k-2}\, p_n^{k-1} (1 - p_n)^{k(n-k)} + (1 - p_n)^{\binom{n}{2}}.$$
As $p_n = (\log n + c)/n$, the second term decays to 0 with $n$.
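For the first term, the bounds $\binom{n}{k} \le (en/k)^k$ and $1 - p_n \le e^{-p_n}$ show that the $k$-th summand is at most $e^{T_k}$, where $T_k := k + k \log n - 2 \log k + (k-1) \log p_n - p_n k(n-k)$. Fix $n$. Treating $k$ as a continuous variable, observe that the second derivative of $T_k$ w.r.t. $k$ is strictly positive for $k \in (3, n/2)$. This shows that $T_k$ is convex in $(3, n/2)$ and hence $T_k \le \max\{T_3, T_{n/2}\}$ for $k \in \{3, \ldots, n/2\}$. But $T_3 > T_{n/2}$ for all sufficiently large $n$. Hence, for all sufficiently large $n$,
$$\sum_{k=2}^{n/2} e^{k} n^{k} k^{-2}\, p_n^{k-1} e^{-p_n k(n-k)} \le e^{T_2} + \frac{n}{2}\, e^{T_3}.$$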
But the RHS converges to 0 with n. The desired result now follows.
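The convexity claim used in this proof can be checked directly. The display below is a sketch that reads $T_k$ as the logarithm of the $k$-th summand $e^{k} n^{k} k^{-2} p_n^{k-1} e^{-p_n k(n-k)}$; this reading is our assumption, and it requires $p_n > 0$, which holds for all sufficiently large $n$.

```latex
\begin{align*}
T_k &= k + k\log n - 2\log k + (k-1)\log p_n - p_n k (n-k), \\
\frac{\partial T_k}{\partial k} &= 1 + \log n - \frac{2}{k} + \log p_n - p_n (n - 2k), \\
\frac{\partial^2 T_k}{\partial k^2} &= \frac{2}{k^2} + 2 p_n > 0 .
\end{align*}
```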
We now give a brief exposition of reduced cohomology (w.r.t. $\mathbb{Z}_2$ for simplicity), which is necessary for proving Lemma 34 for the case $d \ge 2$. In one line, cohomology is the dual theory of homology and can be derived by considering the dual of the boundary operator $\partial$.
Consider a simplicial complex $K$. For $d \ge 0$, a $d$-cochain of $K$ is a map $g : F^d \to \mathbb{Z}_2$. Its support is given by $\mathrm{supp}(g) := \{\sigma \in F^d : g(\sigma) = 1\}$. Let $C^d := \{g : F^d \to \mathbb{Z}_2\}$ denote the set of all $d$-cochains, which is a $\mathbb{Z}_2$-vector space under the natural addition and scalar multiplication operations on $C^d$. The $d$-th coboundary operator $\delta_d : C^d \to C^{d+1}$ is defined as follows:
$$\delta_d(g)(\sigma) := \sum_{\tau \in F^d :\, \tau \subset \sigma} g(\tau), \qquad g \in C^d,\ \sigma \in F^{d+1}.$$
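As an illustration of the coboundary operator over $\mathbb{Z}_2$, the following minimal sketch represents a cochain by its support (a set of faces, each a sorted tuple of vertices) and computes the support of its coboundary by a parity count. The function names `faces` and `coboundary` are hypothetical, and the example complex, the full simplex on four vertices, is chosen only for illustration. The sketch also exhibits the standard fact that applying the coboundary twice gives the zero cochain.

```python
from itertools import combinations


def faces(simplex, dim):
    """All dim-dimensional faces of a simplex, each as a sorted tuple of vertices."""
    return [tuple(c) for c in combinations(sorted(simplex), dim + 1)]


def coboundary(g_support, higher_faces):
    """Support of the coboundary of g over Z_2.

    g_support: set of d-faces on which the cochain g equals 1.
    higher_faces: iterable of (d+1)-faces of the complex.
    A (d+1)-face sigma is in the result iff an odd number of its d-faces lie in g_support.
    """
    result = set()
    for sigma in higher_faces:
        d = len(sigma) - 2  # sigma has d + 2 vertices
        parity = sum(tau in g_support for tau in faces(sigma, d)) % 2
        if parity == 1:
            result.add(sigma)
    return result


# Example: the full simplex on vertices 0, 1, 2, 3.
vertices = (0, 1, 2, 3)
edges = faces(vertices, 1)
triangles = faces(vertices, 2)

g = {(0,)}                       # indicator 0-cochain of vertex 0
dg = coboundary(g, edges)        # edges containing vertex 0
ddg = coboundary(dg, triangles)  # empty: every triangle contains 0 or 2 such edges
```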
For $d \ge 1$, let $B^d := \mathrm{im}(\delta_{d-1})$; and, for $d \ge 0$, let $Z^d := \ker(\delta_d)$. Let $B^0 := \{\mathbf{0}, \mathbf{1}\}$, where $\mathbf{0}$ and $\mathbf{1}$ are respectively the 0-cochains that assign 0 and 1 to all vertices. The elements of $B^d$ are called coboundaries while those of $Z^d$ are called cocycles. As in homology, we have that $\delta_d \circ \delta_{d-1} = 0$ and hence we define the $d$-th cohomology group
$$H^d := Z^d / B^d.$$
It is well known that the $d$-th homology group $H_d$ is isomorphic to the $d$-th cohomology group $H^d$, and so we have that $\beta_d(K) := \mathrm{rank}(H^d(K))$.

We first describe upper and lower bounds for Betti numbers which, to the best of our knowledge, have not been explicitly mentioned anywhere. But they follow from the proofs in [LM06, MW09, KP14].
For $d = 0$, call every $A \subseteq F^0$ connected. For $d \ge 1$, call $A \subseteq F^d$ connected if, for every $\sigma_1, \sigma_2 \in A$, there exists a sequence $\tau_1, \ldots, \tau_i \in A$ with $\tau_1 = \sigma_1$ and $\tau_i = \sigma_2$ such that, for each $j$, $\tau_j$ and $\tau_{j+1}$ share a common $(d-1)$-face. Now fix $d \ge 1$ and consider $g \in Z^d$ such that $\mathrm{supp}(g)$ is not connected. Then clearly there exists $\{A_i : A_i \subseteq F^d\}$ such that each $A_i$ is non-empty and connected; $A_i \cap A_j = \emptyset$ for $i \ne j$; for all $\sigma_i \in A_i$ and $\sigma_j \in A_j$ with $i \ne j$, $\sigma_i$ and $\sigma_j$ do not have a common $(d-1)$-face; and $\mathrm{supp}(g) = \cup_i A_i$. Let $g_{A_i} \in C^d$ be such that $\mathrm{supp}(g_{A_i}) = A_i$. It is then easy to see that
$$g = \sum_i g_{A_i}.$$
From the above relation and our assumption that $g \in Z^d$, it necessarily follows that each $g_{A_i} \in Z^d$. Suppose not. Then there exist $i$ and $\sigma \in F^{d+1}$ such that
$$\delta_d(g_{A_i})(\sigma) = 1.$$
Since no $\sigma_i \in A_i$ shares a $(d-1)$-face with any $d$-face in $\cup_{j \ne i} A_j$, the above necessarily implies that $\delta_d(g)(\sigma) = 1$, which is a contradiction. From the above discussion and the fact that the rank only depends upon independent elements, we have, for each $d \ge 0$,
$$\beta_d(K) = \mathrm{rank}(\{[g] : g \in Z^d,\ |\mathrm{supp}(g)| \ge 1,\ w(g) = |\mathrm{supp}(g)|,\ \mathrm{supp}(g) \text{ is connected}\}).$$
This shows that the number of isolated vertices is a lower bound for $\beta_0(K)$, except in one particular case when all vertices in $K$ are isolated. For $d \ge 1$, however, one can easily come up with several examples where the number of isolated $d$-faces exceeds $\beta_d(K)$. From this, it follows that $\beta_d(K)$ for $d \ge 1$ needs to be treated a little differently.
Fix $d \ge 1$. In contrast to the setup used for the upper bound, we will assume here that the $d$-skeleton $K^d$ of the given simplicial complex $K$ is complete. Consider $\tilde{N}_d(K)$, which we define to be the number of disjoint isolated $d$-faces in $K$. We call a $d$-face $\sigma$ disjoint isolated if it is isolated in $K$ and none of its neighbouring $d$-faces (i.e., $\sigma' \in F^d$ which share a $(d-1)$-face with $\sigma$) are isolated. Let $\sigma_1, \ldots, \sigma_{\tilde{N}_d}$ be all the disjoint isolated $d$-faces in $K$ and let $g_{\sigma_1}, \ldots, g_{\sigma_{\tilde{N}_d}}$ be their associated indicator $d$-cochains. We claim that the classes $[g_{\sigma_1}], \ldots, [g_{\sigma_{\tilde{N}_d}}]$ are linearly independent. If $\tilde{N}_d = 1$, then the above is obviously true. We need to verify it for $\tilde{N}_d > 1$. Suppose not. Then there exist a non-empty $I \subseteq \{1, \ldots, \tilde{N}_d\}$ and an $f \in C^{d-1}$ such that $\sum_{i \in I} g_{\sigma_i} = \delta_{d-1}(f)$, i.e., $\sum_{i \in I} g_{\sigma_i} \in [0]$.
But by the property of the coboundary operator, we derive the contradiction that

Proof. Let $\tilde{N}_d(K)$ denote the number of disjoint isolated $d$-faces in $K$. Then from (39), (36) and (40), we have

Combining the above two relations, we get

Then, we have that

where the second sum is over all neighbouring $d$-faces of $\sigma$. For a neighbouring $d$-face $\sigma'$, $1(\sigma) 1(\sigma') \le 1_{g_{\sigma,\sigma'}}$, where $g_{\sigma,\sigma'}$ is that $g \in C^d$ with $\mathrm{supp}(g) = \{\sigma, \sigma'\}$. From the above discussion, we have

The factor 2 comes because $g_{\sigma,\sigma'} = g_{\sigma',\sigma}$. Now using (40) in (41), the proof is complete.
Lemma 46. [KP14, (3.5), (5.1)] Let $d \ge 2$. Consider the random $d$-complex $Y_{n,d}(p_n)$ with
$$p_n = \frac{d \log n + c - \log(d!)}{n}$$
for some fixed $c \in \mathbb{R}$. Let $N_{d-1}(Y_{n,d}(p_n))$ be the number of isolated $(d-1)$-faces in $Y_{n,d}(p_n)$. Also let $G_{d-1}(Y_{n,d}(p_n))$ be as in (35). Then

In [KP14] (see in particular Section 5 there), each $g \in G_{d-1}(Y_{n,d}(p_n))$ is identified by an appropriate hypergraph $H$. $X(H)$ is the number of $d$-faces in $Y_{n,d}(p_n)$ that contain an odd number of faces of $\mathrm{supp}(g)$. Hence the result follows from the following inequality: $\mathbb{E}[1_g] = (1 - p_n)^{X(H)} \le e^{-p_n X(H)}$.
Proof of Lemma 34 for $d \ge 2$. This is easy to see from Theorem 45 and Lemma 46.