Spectra of random regular hypergraphs

In this paper, we study the spectra of regular hypergraphs following the definitions from Feng and Li (1996). Our main result is an analog of Alon's conjecture for the spectral gap of random regular hypergraphs. We then relate the second eigenvalue of a regular hypergraph to both its expansion properties and the mixing rate of the non-backtracking random walk on it. We also prove a spectral gap for the non-backtracking operator of a random regular hypergraph introduced in Angelini et al. (2015). Finally, we obtain the convergence of the empirical spectral distribution (ESD) for random regular hypergraphs in different regimes. Under certain conditions, we can show a local law for the ESD.


Introduction
Since their introduction in the early 1970s (see, for example, Berge's book [5]), hypergraphs have steadily risen to prominence, both from a theoretical perspective and through their potential for applications. Of the most recent fields to recognize their importance we mention machine learning, where they have been used to model data [45], including recommender systems [39], pattern recognition [27] and bioinformatics [40].
As with graphs, one main feature of interest is expansion; see, e.g., studies of regular graphs [1,32,17,2,6], where all vertices have the same degree $d$, and of quasi-regular graphs (e.g., bipartite biregular graphs [8,9], where the graph is bipartite and the two classes are regular with degrees $d_1$ and $d_2$, respectively; or preference models and k-frames [42], which generalize these notions). The key property for graph expansion is fast random walk mixing. There are three main perspectives on examining this property: vertex, edge, and spectral expansion [10]; the last of these, the spectral gap, is the most desirable feature as it controls the others (the bounds on vertex and edge expansion generally involve the second eigenvalue of the Laplacian of the graph).
For general, connected, simple graphs (possibly with loops), the Laplacian is a scaled and shifted version of the adjacency matrix $A = (A_{ij})_{1 \le i,j \le n}$, where $A_{ij} = 1$ if $i$ and $j$ are connected by an edge and $A_{ij} = 0$ otherwise. The Laplacian is defined by $L = I - D^{-1/2} A D^{-1/2}$, where $D$ is the diagonal matrix of vertex degrees.
As mentioned before, spectral expansion of a graph involves the spectral gap of its Laplacian matrix; however, in the case of regular or bipartite biregular graphs, looking at the adjacency matrix or the Laplacian is equivalent (in the case of the regular ones, $D$ is a multiple of the identity, and in the case of the bipartite biregular ones, the block structure of the matrix ensures that $D^{-1/2} A D^{-1/2} = \frac{1}{\sqrt{d_1 d_2}} A$). For regular and bipartite biregular graphs, the largest (Perron-Frobenius) eigenvalue of the adjacency matrix is fixed (it is $d$ for $d$-regular graphs and $\sqrt{d_1 d_2}$ for bipartite biregular ones). So for these special cases, the study of the second largest eigenvalue of the adjacency matrix is sufficient. As we show here, this will also be the case for (d, k)-regular hypergraphs.
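As a quick numerical illustration of the bipartite biregular case, the following sketch compares $D^{-1/2} A D^{-1/2}$ with $A / \sqrt{d_1 d_2}$. The complete bipartite graph $K_{2,3}$ and the use of NumPy are choices made here for illustration only.

```python
import numpy as np

# Complete bipartite graph K_{2,3}: the 2 vertices of V1 have degree d1 = 3
# and the 3 vertices of V2 have degree d2 = 2 (an illustrative choice).
d1, d2 = 3, 2
X = np.ones((2, 3))
A = np.block([[np.zeros((2, 2)), X], [X.T, np.zeros((3, 3))]])

# Normalized adjacency matrix and Laplacian L = I - D^{-1/2} A D^{-1/2}.
degrees = A.sum(axis=1)
D_inv_sqrt = np.diag(1.0 / np.sqrt(degrees))
normalized = D_inv_sqrt @ A @ D_inv_sqrt
laplacian = np.eye(5) - normalized

# Largest adjacency eigenvalue, to be compared with sqrt(d1 * d2).
top_eigenvalue = np.linalg.eigvalsh(A)[-1]
```

Since the normalized matrix is a fixed multiple of $A$, the spectral gap of the Laplacian can be read off directly from the adjacency spectrum in this case.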
The study of the spectral gap in $d$-regular graphs with fixed $d$ had a first breakthrough in the Alon-Boppana bound [1], which states that the second largest eigenvalue in absolute value $\lambda := \max\{\lambda_2, |\lambda_n|\}$ satisfies $\lambda \ge 2\sqrt{d-1} - o(1)$. Later, Friedman [17] proved Alon's conjecture [1] that a uniformly chosen random $d$-regular graph has $\lambda \le 2\sqrt{d-1} + \epsilon$ for any $\epsilon > 0$ with high probability, as the number of vertices goes to infinity. Recently, Bordenave [6] gave a new proof that $\lambda_2 \le 2\sqrt{d-1} + \epsilon_n$ for a sequence $\epsilon_n \to 0$ as $n \to \infty$, based on the non-backtracking (Hashimoto) operator. Following the same idea as in [6], Coste proved the spectral gap for $d$-regular digraphs [14] and Brito et al. [9] proved an analog of Alon's spectral gap conjecture for random bipartite biregular graphs; for deterministic ones, the equivalent of the Alon-Boppana bound had first been shown by Li and Solé [26].
It is thus fair to say that both graph expansion and the spectral gap in regular graphs and quasi-regular graphs are now very well understood; by contrast, despite the natural applications and extension possibilities, hypergraph expansion is a much less understood area. The difficulty here is that it is not immediately clear which operator or structure to associate to the hypergraph. There are three main takes on this: the Feng-Li approach [16], which defined an adjacency matrix, the Friedman-Wigderson tensor approach [18], and the Lu-Peng approach [30,31], which defined a sequence of Laplacian matrices through higher-order random walks.
Several results on hypergraph expansion have been obtained using the Friedman-Wigderson approach. Hyperedge expansion depending on the spectral norm of the associated tensor was studied in the original paper [18], the relation between the spectral gap and quasirandom properties was discussed in Lenz and Mubayi [22,23], and an inverse expander mixing lemma was obtained in Cohen et al. [12]. Very recently, Li and Mohar [24] proved a generalization of the Alon-Boppana bound to (d, k)-regular hypergraphs for their adjacency tensors. On the other side, using the Feng-Li adjacency matrix approach, the original paper [16] proved the Alon-Boppana lower bound for the adjacency matrix of regular hypergraphs, and then Li and Solé [26] defined a (d, k)-regular hypergraph to be Ramanujan if any eigenvalue $\lambda \neq d(k-1)$ satisfies
$$|\lambda - k + 2| \le 2\sqrt{(d-1)(k-1)}. \tag{1.1}$$
Ramanujan hypergraphs were further studied in [33,25,37]. Note that when $k = 2$ (when the hypergraphs are actual graphs), this definition coincides with the definition for Ramanujan graphs. The adjacency matrices and Laplacian matrices of general uniform hypergraphs were analyzed in [4], where the relations between eigenvalues and diameters, random walks, and Ricci curvature of the hypergraphs were studied.
In this paper, we fill in the gaps in the literature by showing a spectral gap for the adjacency matrix of a hypergraph, following the Feng-Li definition; we connect it to the mixing rate of the hypergraph random walk considered in [45] and subsequently studied in [13,20], and we also show that this gap governs hyperedge and vertex expansion of the hypergraph, thus completing the parallel with graph results. Specifically, for (d, k)-regular hypergraphs and their adjacency matrices (the precise definitions are given in the next section), we prove the following:
• Hyperedge and vertex expansion are controlled by the second eigenvalue of the adjacency matrix.
• The mixing rate of the random walk is controlled by the second eigenvalue of the adjacency matrix.
• The uniformly random (d, k)-regular hypergraph model has a spectral gap. This is by far the most exciting result, and it turns out to be a simple consequence of the spectral gap of uniformly random bipartite biregular graphs [9]. Our result shows that, asymptotically, almost all (d, k)-regular hypergraphs are almost Ramanujan in the sense of Li-Solé (see (1.1)).
Other results include the spectral gap and description for the spectrum of the non-backtracking operator of the hypergraph, the limiting empirical distribution for the spectrum of the adjacency matrix of the uniformly random (d, k)-regular hypergraph in different regimes (which was studied by Feng and Li in [16] for deterministic sequences of hypergraphs with few cycles and fixed d, k), and a sort of local law of this empirical spectral distribution.
Our main methodology is to translate the results from bipartite biregular graphs by using the bijection between the spectra (Lemma 4.2). While this bijection has been known for a long time, the results on bipartite biregular graphs [15,9] (especially the spectral gap) are quite recent.
Our spectral gap results are linked to the random walk and offer better control over the mixing rate. Together with the Alon-Boppana result established by Feng-Li [16], they give complete control over the behavior of the random walk and hyperedge/vertex expansion. In our view, this establishes the adjacency matrix perspective of Feng and Li as ultimately more useful not just theoretically, but possibly computationally as well, since second eigenvalues of matrices can be computed in polynomial time, whereas computing spectral norms of tensors is NP-hard [21].
The rest of the paper is structured as follows. In Section 2 we provide definitions and properties of hypergraphs that we use in the paper. In Section 3 we show that several expansion properties of (d, k)-regular hypergraphs are related to the second eigenvalues of their adjacency matrices. In Section 4 we prove the analog of Friedman's second eigenvalue theorem for uniformly random (d, k)-regular hypergraphs. The spectra of the non-backtracking operator for the hypergraph are analyzed in Section 5. Finally, we study the empirical spectral distributions of uniformly random (d, k)-regular hypergraphs in Section 6.

Preliminaries
Definition 2.1 (hypergraph). A hypergraph $H$ consists of a set $V$ of vertices and a set $E$ of hyperedges such that each hyperedge is a nonempty subset of $V$. A hypergraph $H$ is k-uniform for an integer $k \ge 2$ if every hyperedge $e \in E$ contains exactly $k$ vertices. A vertex $i$ is incident to a hyperedge $e$ if and only if $i$ is an element of $e$. The degree of $i$, denoted $\deg(i)$, is the number of hyperedges incident to $i$. A hypergraph is d-regular if all of its vertices have degree $d$. A hypergraph is (d, k)-regular if it is both d-regular and k-uniform.
We can define the incidence matrix $X$ of a hypergraph to be a $|V| \times |E|$ matrix indexed by elements in $V$ and $E$ such that $X_{i,e} = 1$ if $i \in e$ and $0$ otherwise. Moreover, if we regard $X$ as the adjacency matrix of a graph, it defines a bipartite graph $G$ with the two vertex sets being $V$ and $E$. We call $G$ the bipartite graph associated to $H$.
Definition 2.2 (walks and cycles). A walk of length $l$ on a hypergraph $H$ is a sequence $(v_0, e_1, v_1, \dots, e_l, v_l)$ such that $v_{i-1} \neq v_i$ and $\{v_{i-1}, v_i\} \subset e_i$ for all $1 \le i \le l$. In the associated bipartite graph $G$, a cycle of length $l$ in $H$ corresponds to a cycle of length $2l$. We say $H$ is connected if for any $i, j \in V$, there is a walk between $i$ and $j$. It's easy to see that $H$ is connected if and only if the corresponding bipartite graph $G$ is connected.
Definition 2.3 (adjacency matrix). For a hypergraph $H$ with $n$ vertices, we associate an $n \times n$ symmetric matrix $A$ called the adjacency matrix of $H$. For $i \neq j$, $A_{ij}$ is the number of hyperedges containing both $i$ and $j$, and $A_{ii} = 0$ for all $1 \le i \le n$.
If $H$ is 2-uniform, this is the adjacency matrix of an ordinary graph. The largest eigenvalue of $A$ for (d, k)-regular hypergraphs is $d(k-1)$ with eigenvector $\frac{1}{\sqrt{n}}(1, \dots, 1)$.
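The definitions above can be sketched in a few lines of Python. The example hypergraph (the Fano plane, which is (3, 3)-regular on 7 vertices) and the helper names `incidence_matrix` and `adjacency_matrix` are illustrative choices made here, not part of the original text.

```python
from itertools import combinations

# Hyperedges of the Fano plane, a (3, 3)-regular hypergraph on 7 vertices
# (chosen purely as a small illustrative example).
FANO = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
        (1, 4, 6), (2, 3, 6), (2, 4, 5)]


def incidence_matrix(n, hyperedges):
    """Return the n x |E| 0/1 matrix X with X[i][e] = 1 iff vertex i lies in hyperedge e."""
    X = [[0] * len(hyperedges) for _ in range(n)]
    for e, members in enumerate(hyperedges):
        for i in members:
            X[i][e] = 1
    return X


def adjacency_matrix(n, hyperedges):
    """Return A with A[i][j] = number of hyperedges containing both i and j, and A[i][i] = 0."""
    A = [[0] * n for _ in range(n)]
    for members in hyperedges:
        for i, j in combinations(members, 2):
            A[i][j] += 1
            A[j][i] += 1
    return A


n, d, k = 7, 3, 3
X = incidence_matrix(n, FANO)
A = adjacency_matrix(n, FANO)

# Every row of A sums to d(k-1), so the all-ones vector is an eigenvector
# of A with eigenvalue d(k-1).
row_sums = [sum(row) for row in A]
```

Checking the row sums is enough to confirm the eigenvalue $d(k-1)$ here, since a constant row sum means the all-ones vector is an eigenvector.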

Expansion and mixing properties of regular hypergraphs
In this section, we relate the expansion properties of a regular hypergraph to its second eigenvalue. We prove results on expander mixing and vertex expansion, and compute the mixing rates of simple random walks and non-backtracking random walks. These results follow easily from the same methodology used in Chung's book [10]. Let $H = (V, E)$ be a (d, k)-regular hypergraph. For any subsets $V_1, V_2 \subset V$, we consider the quantity which counts the number of hyperedges between the vertex sets $V_1$ and $V_2$ with multiplicity. For each hyperedge $e$, the multiplicity is given by $|e \cap V_1| \cdot |e \cap V_2|$. We first provide an edge mixing result whose analog for regular graphs is given in [10].
Remark 3.2. The above result is qualitatively different from the expander mixing lemma for k-uniform regular hypergraphs studied in [18,12]. Their result considers the number of hyperedges between any $k$ subsets of $V$, and the parameter $\lambda$ there is the spectral norm of a tensor associated with the hypergraph.
Proof. Let 1 V i be the indicator vector of the set V i for i = 1, 2. Let v 1 , . . . , v n be the unit eigenvector associated to λ 1 , . . . λ n of A. We have the following decomposition of 1 V 1 , 1 V 2 : Therefore by the Cauchy-Schwarz inequality, and similarly, i≥2 This implies For any subset S ⊂ V , we define its neighborhood set to be N (S) := {i : there exists j ∈ S such that {i, j} ⊂ e for some e ∈ E}.
We have the following result on vertex expansion of regular hypergraphs.
are the unit eigenvectors of A associated to λ 1 , . . . , λ n , respectively. Then we know γ 1 = |S| √ n and On the other hand, For the rest of this section, we compute the mixing rates of random walks on hypergraphs. The simple random walk on a general hypergraph was first defined in [45], where the authors gave a random walk explanation of the spectral methods for clustering and segmentation on hypergraphs, which generalized the result in Meila and Shi [36] for graphs. A quantum version of random walks on regular hypergraphs was recently studied by Liu et al. [28].
The simple random walk on k-uniform hypergraphs has the following transition rule. Start at a vertex $v_0$. If at the $t$-th step we are at vertex $v_t$, we first choose a hyperedge $e$ uniformly over all hyperedges incident with $v_t$, and then choose a vertex $v_{t+1} \in e$ with $v_{t+1} \neq v_t$ uniformly at random. The sequence of random vertices $(v_t, t \ge 0)$ is a Markov chain. It generalizes the simple random walk on graphs. We denote by $P = (P_{ij})_{1 \le i,j \le n}$ the transition matrix for the Markov chain and let $D$ be the diagonal matrix with $D_{ii} = \deg(i)$, $1 \le i \le n$. From the definition of the simple random walk on hypergraphs, for any (d, k)-regular hypergraph with adjacency matrix $A$, the transition matrix satisfies $P = \frac{1}{d(k-1)} A$.
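The transition rule can be sketched as follows: we build $P$ directly from the two-step rule (uniform incident hyperedge, then uniform other vertex in it) and compare it with $A / (d(k-1))$ in exact rational arithmetic. The Fano plane example and the helper names are assumptions made here for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Fano plane: a (3, 3)-regular hypergraph on 7 vertices (illustrative example).
FANO = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
        (1, 4, 6), (2, 3, 6), (2, 4, 5)]
n, d, k = 7, 3, 3


def transition_matrix(n, hyperedges):
    """Transition matrix of the simple random walk, built from the two-step rule."""
    P = [[Fraction(0) for _ in range(n)] for _ in range(n)]
    for i in range(n):
        incident = [e for e in hyperedges if i in e]
        for e in incident:
            others = [j for j in e if j != i]
            for j in others:
                # Probability of picking hyperedge e, then vertex j inside e.
                P[i][j] += Fraction(1, len(incident) * len(others))
    return P


def adjacency_matrix(n, hyperedges):
    """A[i][j] = number of hyperedges containing both i and j, zero diagonal."""
    A = [[0] * n for _ in range(n)]
    for members in hyperedges:
        for i, j in combinations(members, 2):
            A[i][j] += 1
            A[j][i] += 1
    return A


P = transition_matrix(n, FANO)
A = adjacency_matrix(n, FANO)
scaled_A = [[Fraction(A[i][j], d * (k - 1)) for j in range(n)] for i in range(n)]
matches = (P == scaled_A)
```

Using `Fraction` avoids any floating-point tolerance in the comparison, at the cost of speed on larger examples.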
It's known (see for example [29]) that for any graph (or multigraph) G, if G is connected and non-bipartite, then it has a unique stationary distribution. For d-regular graphs, being connected and non-bipartite is equivalent to requiring λ = max{λ 2 (A), |λ n (A)|} < d, see for example [2]. The simple random walk on (d, k)-regular hypergraphs H = (V, E) can also be seen as a simple random walk on a multigraph G H on V , where the number of edges between i, j in G H is A ij . The adjacency matrix of G H is the same as the adjacency matrix of H. Therefore the simple random walk on H converges to a unique stationary distribution if and only if the multigraph G H is connected and non-bipartite. These two conditions can be satisfied as long as we have the following condition on the second eigenvalue.
Proof. If the corresponding multigraph $G_H$ is bipartite, then we have $\lambda_n = -\lambda_1 = -d(k-1)$. If $G_H$ is not connected, then it has at least two connected components and the largest eigenvalue has multiplicity at least 2, which implies $\lambda = d(k-1)$. Therefore $\lambda < d(k-1)$ if and only if $G_H$ is non-bipartite and connected. From the general theory of Markov chains on graphs and multigraphs, the simple random walk on $G_H$ then converges to a stationary distribution. Therefore the simple random walk on $H$ converges to a stationary distribution.
For any (d, k)-regular hypergraph $H$ with $\lambda < d(k-1)$, a simple calculation shows that the stationary distribution is $\pi(i) = \frac{1}{n}$ for all $i \in V$. The mixing rate of the simple random walk on hypergraphs, which measures how fast the Markov chain converges to the stationary distribution, is defined by
$$\rho(H) := \limsup_{l \to \infty} \max_{i,j \in V} \left| (P^l)_{ij} - \pi(j) \right|^{1/l},$$
where $\pi$ is the unique stationary distribution on $V$. Let $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ be the eigenvalues of $A$, and define the second eigenvalue in absolute value of $A$ by $\lambda := \max\{\lambda_2, |\lambda_n|\}$.
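As an illustration, the sketch below evaluates $\max_{i,j} |(P^l)_{ij} - 1/n|^{1/l}$ for increasing $l$, assuming the mixing rate is measured by the usual limsup of this quantity. The example is again the Fano plane (our choice), whose adjacency matrix is $J - I$, so that $\lambda = 1$ and $\lambda / (d(k-1)) = 1/6$; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

# Fano plane: (3, 3)-regular on 7 vertices; every pair of vertices shares
# exactly one hyperedge, so A = J - I with eigenvalues 6 and -1.
FANO = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
        (1, 4, 6), (2, 3, 6), (2, 4, 5)]
n, d, k = 7, 3, 3


def transition_matrix(n, hyperedges, d, k):
    """P = A / (d(k-1)) for a (d, k)-regular hypergraph, in exact arithmetic."""
    P = [[Fraction(0) for _ in range(n)] for _ in range(n)]
    for members in hyperedges:
        for i, j in combinations(members, 2):
            P[i][j] += Fraction(1, d * (k - 1))
            P[j][i] += Fraction(1, d * (k - 1))
    return P


def matmul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def empirical_rate(P, l):
    """Return max_{i,j} |(P^l)_{ij} - 1/n|^(1/l), using exact powers of P."""
    size = len(P)
    Q = P
    for _ in range(l - 1):
        Q = matmul(Q, P)
    worst = max(abs(Q[i][j] - Fraction(1, size))
                for i in range(size) for j in range(size))
    return float(worst) ** (1.0 / l)


P = transition_matrix(n, FANO, d, k)
rates = {l: empirical_rate(P, l) for l in (5, 20, 60)}
```

Exact rational powers are used because the deviations $(P^l)_{ij} - 1/n$ quickly fall below double-precision resolution.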
The non-backtracking walk on a hypergraph is defined in [38] as a generalization of the non-backtracking walk on a graph. Recall that a walk of length $l$ in a hypergraph is a sequence $w = (v_0, e_1, v_1, \dots, e_l, v_l)$. We say $w$ is a non-backtracking walk if $e_i \neq e_{i+1}$ for $1 \le i \le l - 1$. Define a non-backtracking random walk of length $l$ on $H$ from some given vertex $v_0 \in V$ to be a uniformly chosen member of the set of non-backtracking walks of length $l$ starting at $v_0$. Let $\vec{E}(H) := \{(i, e) : i \in V, e \in E, i \in e\}$ be the set of oriented hyperedges of a k-uniform hypergraph $H$. Similar to the case of regular graphs in [2], we can also consider the non-backtracking random walk on $H$ starting from an initial vertex $v_0$ as a Markov chain $\{X_t\}_{t \ge 0}$ with state space $\vec{E}(H)$ in the following way. The distribution of the initial state is given by for any e ∈ E(H). The transition probability is given by otherwise.
Notice that if $H$ is a (d, k)-regular hypergraph with $(d, k) = (2, 2)$, then $H$ is a 2-regular graph, which is a disjoint union of cycles. The non-backtracking random walk on $H$ is periodic and does not converge to a stationary distribution. Given a (d, k)-regular hypergraph $H = (V, E)$ with $(d, k) \neq (2, 2)$, let $\tilde{P}^{(l)}_{ij}$ be the transition probability that a non-backtracking random walk of length $l$ on $H$ starts at $i$ and ends at $j$. Define
$$\tilde{\rho}(H) := \limsup_{l \to \infty} \max_{i,j \in V} \left| \tilde{P}^{(l)}_{ij} - \frac{1}{n} \right|^{1/l}$$
to be the mixing rate of the non-backtracking random walk. As a generalization of the result in [2], we can connect the second eigenvalue of regular hypergraphs to the mixing rate of the non-backtracking random walk. It turns out that similar results were already studied in [11] for clique-wise non-backtracking walks on regular graphs. In particular, Corollary 1.3 in [11] is equivalent to the following theorem. We include a proof here for completeness.
Theorem 3.5. Let H be a (d, k)-regular hypergraph with d, k ≥ 2 whose adjacency matrix has the second largest eigenvalue in absolute value λ := max{λ 2 , |λ n |} < d(k − 1), then (1) the mixing rate of the simple random walk on H is Then a non-backtracking random walk on H converges to the uniform distribution, and its .
Proof. (1) We first consider simple random walks. For any l ≥ 1, P l = 1 ((k−1)d) l A l and the vector v 1 = 1 √ n (1, . . . , 1) is an eigenvector of P l corresponding to the unique largest eigenvalue 1. Let µ(l) = max{|λ 2 (P l )|, |λ n (P l )|}, we have This implies ρ(H) ≤ λ (k−1)d . On the other hand, let J be a n × n matrix whose entries are all 1, we have This completes the proof of part (1) of Theorem 3.5. For part (2), we follow the steps in [2]. Recall that the Chebyshev polynomials satisfy the following recurrence relation: U k+1 (x) = 2xU k (x) − U k−1 (x), ∀k ≥ 0. We also define U −1 (x) = 0, U 0 (x) = 1. Let A be the adjacency matrix of H and define the matrix A (l) such that A (l) ij is the number of non-backtracking walks of length l from i to j for all i, j. By definition, the matrices A (l) satisfy the following recurrence: where (k − 1)dI in the first equation eliminates the diagonal of A 2 to avoid backtracking, and (k − 1)(d − 1)A (l−1) in the second equation of (3.8) eliminates the walk which backtracks in the (l + 1)-st step. We claim that We can check that Therefore (3.9) holds for l = 1, 2.
i,j is the probability that a non-backtracking random walk of length l on H starts from i and ends in j. The number of all possible non-backtracking walks of length l starting from i is d(k − 1)((k − 1)(d − 1)) l−1 . This is because for the first step we have d(k − 1) many choices for hyperedges and vertices, and for the remaining (l − 1) steps we have ((k − 1)(d − 1)) l−1 many choices in total. Normalizing A (l) yields Letμ 1 (l) = 1,μ 2 (l) ≥ · · · ≥μ n (l) be the eigenvalues ofP (l) ,μ(l) := max{|µ 2 (l)|, |µ n (l)|}. We obtain thatP (l) is precisely the transition matrix of a non-backtracking random walk of length l. Same as Claim 2.2 in [2], we haveμ We sketch the proof of (3.11) here. SinceP (l) is doubly stochastic, v 1 = 1 √ n (1, . . . , 1) is an eigenvector ofP (l) corresponding to the largest eigenvalue 1. We have On the other hand, let J be as above, we have By (3.9) and (3.10), for 1 ≤ i ≤ n, . This completes the proof.
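The count of non-backtracking walks used in the proof, $d(k-1)((k-1)(d-1))^{l-1}$ walks of length $l$ from any starting vertex, can be checked by brute force on a small example. The sketch below does this for the Fano plane (an illustrative choice); the function names are hypothetical.

```python
# Fano plane: a (3, 3)-regular hypergraph on 7 vertices (illustrative example).
FANO = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
        (1, 4, 6), (2, 3, 6), (2, 4, 5)]
n, d, k = 7, 3, 3


def count_nonbacktracking_walks(hyperedges, start, length):
    """Count walks (v_0, e_1, v_1, ..., e_l, v_l) from `start` such that
    consecutive vertices differ and lie in a common hyperedge e_i, and
    consecutive hyperedges differ (e_i != e_{i+1})."""
    def extend(vertex, last_edge, remaining):
        if remaining == 0:
            return 1
        total = 0
        for idx, members in enumerate(hyperedges):
            # The next hyperedge must contain the current vertex and must
            # differ from the hyperedge used in the previous step.
            if vertex in members and idx != last_edge:
                for w in members:
                    if w != vertex:
                        total += extend(w, idx, remaining - 1)
        return total
    return extend(start, None, length)


def predicted_count(d, k, length):
    """The closed-form count d(k-1) * ((k-1)(d-1))^(l-1)."""
    return d * (k - 1) * ((k - 1) * (d - 1)) ** (length - 1)


counts = {l: [count_nonbacktracking_walks(FANO, v, l) for v in range(n)]
          for l in (1, 2, 3, 4)}
```

The enumeration is exponential in the walk length, so it is only meant as a sanity check on very small hypergraphs.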

Spectral gap of random regular hypergraphs
Let $\mathcal{G}(n, m, d_1, d_2)$ be the set of all simple bipartite biregular graphs with vertex set $V = V_1 \cup V_2$ such that $|V_1| = n$, $|V_2| = m$, and every vertex in $V_i$ has degree $d_i$ for $i = 1, 2$. Here we must have $nd_1 = md_2 = |E|$. Let $\mathcal{H}(n, d, k)$ be the set of all simple (without multiple hyperedges) (d, k)-regular hypergraphs with labelled vertex set $[n]$ and $nd/k$ many labelled hyperedges denoted by $\{e_1, \dots, e_{nd/k}\}$.
Remark 4.1. From this section on, we always assume $d \ge k$ for simplicity, since the dual of a (d, k)-regular hypergraph is (k, d)-regular, and the two hypergraphs have the same associated bipartite biregular graph up to swapping the vertex sets $V_1$ and $V_2$.
It's well known (see for example [16]) that there exists a bijection between regular multi-hypergraphs and bipartite biregular graphs. See Figure 1 for an example of the bijection. For a given bipartite biregular graph, if there are two vertices in $V_2$ that share the same set of neighbors in $V_1$, the corresponding regular hypergraph will have multiple hyperedges; see Figure 2. Let $\mathcal{G}'(n, m, d_1, d_2)$ be the subset of $\mathcal{G}(n, m, d_1, d_2)$ such that for any $G \in \mathcal{G}'(n, m, d_1, d_2)$, the vertices in $V_2$ have pairwise different sets of neighbors in $V_1$. We obtain the following bijection.
Lemma 4.2. There is a bijection between $\mathcal{H}(n, d, k)$ and $\mathcal{G}'(n, nd/k, d, k)$.
Proof. Let $G \in \mathcal{G}'(n, nd/k, d, k)$ be an $(n, nd/k, d, k)$-bipartite biregular graph, and let $A_G$ be its adjacency matrix. We then have
$$A_G = \begin{pmatrix} 0 & X \\ X^\top & 0 \end{pmatrix}, \tag{4.1}$$
where $X$ is an $n \times (nd/k)$ matrix with entries $X_{ij} = 1$ if $i \in V_1$ and $j \in V_2$ are adjacent in $G$, and $X_{ij} = 0$ otherwise. Regarding $X$ as an incidence matrix yields a simple (d, k)-regular hypergraph. Conversely, for any simple (d, k)-regular hypergraph $H \in \mathcal{H}(n, d, k)$, $X$ corresponds to the incidence matrix of $H$, and we can associate to $H$ an $(n, nd/k, d, k)$-bipartite biregular graph $G$ whose adjacency matrix is given by (4.1), and it has no two vertices in $V_2$ sharing the same set of neighbors.
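The correspondence can be sketched in code: each hyperedge becomes a vertex of $V_2$ joined to its members in $V_1$, and conversely the neighborhood of each vertex of $V_2$ becomes a hyperedge. The function names below are hypothetical and the Fano plane is used only as a small example.

```python
# Fano plane: a simple (3, 3)-regular hypergraph on 7 vertices (illustrative example).
FANO = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
        (1, 4, 6), (2, 3, 6), (2, 4, 5)]


def hypergraph_to_bipartite_edges(hyperedges):
    """Edges (i, e) of the associated bipartite graph: vertex i in V_1,
    hyperedge index e in V_2, joined whenever i belongs to hyperedge e."""
    return [(i, e) for e, members in enumerate(hyperedges) for i in members]


def bipartite_to_hypergraph(m, edges):
    """Rebuild the hyperedges: the neighborhood in V_1 of each of the m
    vertices of V_2 becomes one hyperedge."""
    neighborhoods = [set() for _ in range(m)]
    for i, e in edges:
        neighborhoods[e].add(i)
    return [frozenset(nb) for nb in neighborhoods]


def has_multiple_hyperedges(hyperedges):
    """True iff two hyperedges have the same vertex set, i.e. two vertices
    of V_2 share the same set of neighbors in V_1."""
    return len(set(frozenset(e) for e in hyperedges)) < len(hyperedges)


edges = hypergraph_to_bipartite_edges(FANO)
rebuilt = bipartite_to_hypergraph(len(FANO), edges)
```

For instance, the multi-hypergraph with hyperedges `[(0, 1), (0, 1)]` is flagged by `has_multiple_hyperedges`, which is exactly the situation excluded from $\mathcal{G}'$.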
From Lemma 4.2, the uniform distribution on G (n, nd/k, d, k) for bipartite biregular graphs induces the uniform distribution on H(n, d, k) for regular hypergraphs. With this observation, we are able to translate the results for spectra of random bipartite biregular graphs into results for spectra of random regular hypergraphs. Our first step is the following spectral gap result.   [17,6]. In terms of Ramanujan hypergraphs defined in (1.1), the theorem implies almost every (d, k)-regular hypergraph is almost Ramanujan.
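For a concrete hypergraph, the Li-Solé condition, namely $|\lambda - k + 2| \le 2\sqrt{(d-1)(k-1)}$ for every eigenvalue $\lambda \neq d(k-1)$ of the adjacency matrix, can be tested numerically. The following is a minimal sketch using NumPy; the function name `is_ramanujan` and the tolerance are our own choices.

```python
import numpy as np


def is_ramanujan(A, d, k, tol=1e-9):
    """Check the Li-Sole condition for the adjacency matrix A of a
    (d, k)-regular hypergraph: every eigenvalue different from d(k-1)
    must satisfy |lambda - (k - 2)| <= 2 * sqrt((d-1)(k-1))."""
    eigs = np.linalg.eigvalsh(np.asarray(A, dtype=float))
    trivial = d * (k - 1)
    nontrivial = eigs[np.abs(eigs - trivial) > tol]
    bound = 2.0 * np.sqrt((d - 1) * (k - 1))
    return bool(np.all(np.abs(nontrivial - (k - 2)) <= bound + tol))


# Adjacency matrix of the Fano plane, a (3, 3)-regular hypergraph in which
# every pair of vertices lies in exactly one hyperedge, so A = J - I.
fano_adjacency = np.ones((7, 7)) - np.eye(7)
fano_is_ramanujan = is_ramanujan(fano_adjacency, d=3, k=3)
```

Note that eigenvalues are compared with $d(k-1)$ up to a numerical tolerance, so a disconnected hypergraph (where $d(k-1)$ is a repeated eigenvalue) would have all copies of that eigenvalue excluded.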
We start with the following lemma connecting the adjacency matrix of a regular hypergraph and its associated bipartite biregular graph.
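Proof. Let $V$ and $E$ be the vertex and hyperedge sets of $H$, respectively. For $i \neq j$, we have $(XX^\top)_{ij} = \sum_{e \in E} X_{ie} X_{je} = |\{e \in E : i, j \in e\}| = (A_H)_{ij}$. For the diagonal elements, we have $(XX^\top)_{ii} = \sum_{e \in E} X_{ie} X_{ie} = \deg(i) = d$. Therefore $A_H + dI = XX^\top$.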
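The identity $A_H + dI = XX^\top$ and its consequence for the spectrum of the associated bipartite graph can be checked numerically. The sketch below (NumPy, Fano plane as an illustrative example, variable names ours) builds $A_H$ from the definition, then compares the positive eigenvalues of $A_G$ with the square roots of the eigenvalues of $A_H + dI$.

```python
import numpy as np

# Fano plane: a (3, 3)-regular hypergraph on 7 vertices (illustrative example).
FANO = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
        (1, 4, 6), (2, 3, 6), (2, 4, 5)]
n, d, k = 7, 3, 3
m = len(FANO)

# Incidence matrix X (vertices x hyperedges).
X = np.zeros((n, m))
for e, members in enumerate(FANO):
    X[list(members), e] = 1.0

# Adjacency matrix of H built directly from the definition:
# A_H[i, j] = number of hyperedges containing both i and j, zero diagonal.
A_H = np.zeros((n, n))
for members in FANO:
    for i in members:
        for j in members:
            if i != j:
                A_H[i, j] += 1.0

# Adjacency matrix of the associated bipartite biregular graph.
A_G = np.block([[np.zeros((n, n)), X], [X.T, np.zeros((m, m))]])

eigs_G = np.linalg.eigvalsh(A_G)
positive_eigs_G = np.sort(eigs_G[eigs_G > 1e-9])
# Singular values of X, obtained as square roots of the eigenvalues of A_H + dI.
shifted = np.clip(np.linalg.eigvalsh(A_H + d * np.eye(n)), 0.0, None)
singular_values = np.sort(np.sqrt(shifted))
```

In this example $X$ is square and invertible, so $A_G$ has no zero eigenvalues; in general one expects extra zeros when $nd/k > n$ or when $X$ is rank deficient.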
It's not hard to show that for d ≥ k, all eigenvalues of A G from (4.1) occur in pairs (λ, −λ), where |λ| is a singular value of X, along with extra (dn/k − n) many zero eigenvalues. The next result for random bipartite biregular graphs is given in [9]. Lemma 4.6 (Theorem 4 in [9]). Let A G be the adjacency matrix of a random bipartite biregular graph G sampled uniformly from G(n, m, d 1 , d 2 ), where d 1 ≥ d 2 are independent of n. Then: (1) Its second eigenvalue λ 2 satisfies asymptotically almost surely as n → ∞.
(2) Its smallest positive eigenvalue λ + min satisfies We will use a result from [35] that estimates the probability that a random bipartite biregular graph sampled from G(n, m, d 1 , d 2 ) contains some subgraph L ⊂ K n,m , where K n,m is the complete bipartite graphs with |V 1 | = n, |V 2 | = m. Let |L| be the number of edges of L and we use the notation [x] a denotes the falling factorial x(x − 1) · · · (x − a + 1). For any vertex v ∈ K n,m , let g v and l v denote the degree of v considered as a vertex of G and L respectively. Let l max be the largest value of l i . Lemma 4.7 (Theorem 3.5 in [35]). Let L ⊂ K n,m . If |L| + 2d 1 (d 1 + l max − 2) ≤ nd 1 − 1, then With Lemma 4.7, we are able to estimate the probability that a random bipartite biregular graph sampled uniformly from G (n, nd/k, d, k) belongs to G (n, nd/k, d, k). Lemma 4.8. Let G be a bipartite biregular graph sampled uniformly from G (n, nd/k, d, k) such that 3 ≤ k ≤ d ≤ n 32 . Then In particular, if 3 ≤ k ≤ d ≤ n 32 and d k = o(n 1/2 ), as n → ∞, Proof. Let V = V 1 ∪ V 2 be the vertex set of a graph G sampled uniformly from G (n, nd/k, d, k).
Let L be the subgraph induced by N (v 1 , v 2 ) and v 1 , v 2 (see Figure 2). Then |L| = 2k and l max = k. When 1 ≤ d ≤ n 32 , the assumption in Lemma 4.7 holds. By Lemma 4.7, we have The number of all possible vertex pairs in V 2 is nd/k 2 and the number of all possible k many distinct vertices in V 1 is n k . Therefore for sufficiently large n, the probability that there exists two vertices in V 2 having the same neighborhood is at most Since x ln(x) is decreasing on x ∈ (0, e −1 ) and 3 ≤ k ≤ d ≤ n 32 , we have for large n, 12e n ≤ 4ek n < e −1 and k ln(4ek/n) ≤ 3 ln(12e/n). Then 4ek n k ≤ 12e n 3 . Therefore This completes the proof.
With the four lemmas above, we are ready to prove Theorem 4.3.
Proof of Theorem 4.3. Let A H be the adjacency matrix of a random (d, k)-regular hypergraph with d ≥ k. Then its associated bipartite biregular graph has adjacency matrix (4.1), where X is a n × nd/k matrix and XX = A H + dI. Let G be a bipartite biregular graphs chosen uniformly from G (n, nd/k, d, k). From Lemma 4.8, we have · P G ∈ G (n, nd/k, d, k) + o(1).
The uniform measure on $\mathcal{G}(n, nd/k, d, k)$ conditioned on the event $\{G \in \mathcal{G}'(n, nd/k, d, k)\}$ is the uniform measure on $\mathcal{G}'(n, nd/k, d, k)$. Hence asymptotically almost surely a bipartite biregular graph $G$ sampled uniformly from $\mathcal{G}'(n, nd/k, d, k)$ satisfies (4.2). Note that $G$ also satisfies (4.3) asymptotically almost surely. Since there is a bijection between $\mathcal{G}'(n, nd/k, d, k)$ and $\mathcal{H}(n, d, k)$ described in Lemma 4.2, by (4.2) and Lemma 4.5, we have with high probability
$$\lambda_2(A_H) + d = \lambda_2(A_G)^2 \le \left( \sqrt{d-1} + \sqrt{k-1} + o(1) \right)^2,$$
and it implies with high probability
$$\lambda_2(A_H) \le k - 2 + 2\sqrt{(d-1)(k-1)} + o(1). \tag{4.5}$$
Similarly, from (4.3), for the smallest eigenvalue $\lambda_n(A_H)$, we have with high probability
$$\lambda_n(A_H) + d \ge \left( \sqrt{d-1} - \sqrt{k-1} - o(1) \right)^2,$$
which implies with high probability
$$\lambda_n(A_H) \ge k - 2 - 2\sqrt{(d-1)(k-1)} - o(1). \tag{4.6}$$
Combining (4.5) with (4.6), and noting that the largest eigenvalue of $A_H$ is $d(k-1)$, we have $|\lambda - k + 2| \le 2\sqrt{(d-1)(k-1)} + o(1)$ for any eigenvalue $\lambda \neq d(k-1)$ asymptotically almost surely. This completes the proof of Theorem 4.3.

Spectra of the non-backtracking operators
Following the definition in [3], for a hypergraph H = (V, E), its non-backtracking operator B is a square matrix indexed by oriented hyperedges E = {(i, e) : i ∈ V, e ∈ E, i ∈ e} with entries given by for any oriented hyperedges (i, e), (j, f ). This is a generalization of the graph non-backtracking operators to hypergraphs. In [3] a spectral algorithm was proposed for solving community detection problems on sparse random hypergraph, and it uses the eigenvectors of the non-backtracking operator defined above. To obtain theoretical guarantees for this spectral algorithm, we need to prove a spectral gap for the non-backtracking operator. To the best of our knowledge, this operator has not been rigorously analyzed for any random hypergraph models. In the first step, we study the spectrum of the non-backtracking operator for the random regular hypergraphs. From the bijection in Lemma 4.2, it is important to find its connection to the non-backtracking operator of the corresponding bipartite biregular graph. Hence B H = M N , this completes the proof.
Remark 5.2. Lemma 5.1 holds for any hypergraph, including non-uniform hypergraphs.
If $H$ is a (d, k)-regular hypergraph, then $G$ is a (d, k)-bipartite biregular graph with $|V_1(G)| = n$, $|V_2(G)| = nd/k$. Our next lemma for the spectrum of $B_G$ is from [9].
Lemma 5.3 (Lemma 2 in [9]). Let $G$ be a (d, k)-bipartite biregular graph with n vertices. Any eigenvalue of $B_G$ belongs to one of the following categories:
(1) $\pm 1$ are both eigenvalues with multiplicity $|E(G)| - |V(G)| = n(d-1) - nd/k$.
(2) ±i √ d − 1 are eigenvalues with multiplicity n − r, where r is the rank of X. (3) ±i √ k − 1 are eigenvalues with multiplicity nd/k − r. (4) Every pair of non-zero eigenvalues (−ξ, ξ) of the adjacency matrix A G generates exactly 4 eigenvalues of B G with the equation We have the following characterization of eigenvalues for B H of a (d, k)-regular hypergraph H. It follows immediately from Lemma 5.1 and Lemma 5.3.
Theorem 5.4. Let H be a (d, k)-regular hypergraph on n vertices and G be its associated (d, k)bipartite biregular graph with adjacency matrix A G given in (4.1). All eigenvalues of B H can be classified into the following: (1) 1 with multiplicity n(d − 1) − nd/k.
(2) $-(d-1)$ with multiplicity $n - r$, where $r$ is the rank of $X$.
(3) $-(k-1)$ with multiplicity $nd/k - r$.
(4) Every pair of non-zero eigenvalues (−ξ, ξ) of A G generates exactly 2 eigenvalues of B H with the equation: Let G be an associated (d, k)-bipartite biregular graph of a regular hypergraph H. From [9, Section 2], ± (d − 1)(k − 1) are eigenvalues of B G with multiplicity 1. Then from Theorem 5.4, B H has an eigenvalue λ 1 (B H ) = (d−1)(k −1) with multiplicity 1. From [9,Theorem 3], for random (d, k)-bipartite biregular graphs, the second largest eigenvalue (in absolute value) λ 2 (B G ) satisfies asymptotically almost surely as n → ∞. Therefore from the discussion above, together with Lemma 4.8, we obtain the following spectral gap result for B H . asymptotically almost surely as n → ∞.
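The following sketch builds a non-backtracking operator for a small hypergraph and inspects its spectrum. It assumes the entries are $B_{(i,e),(j,f)} = 1$ if $j \in e \setminus \{i\}$, $j \in f$ and $f \neq e$, and $0$ otherwise; this is our reading of the definition in [3], and the function name is hypothetical. The example is the Fano plane, with $d = k = 3$.

```python
import numpy as np

# Fano plane: a (3, 3)-regular hypergraph on 7 vertices (illustrative example).
FANO = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5),
        (1, 4, 6), (2, 3, 6), (2, 4, 5)]
n, d, k = 7, 3, 3


def nonbacktracking_operator(hyperedges):
    """Matrix indexed by oriented hyperedges (i, e) with i in e.
    ASSUMED entry rule: B[(i, e), (j, f)] = 1 iff j is in e, j != i,
    j is in f, and f != e."""
    oriented = [(i, e) for e, members in enumerate(hyperedges) for i in members]
    index = {pair: t for t, pair in enumerate(oriented)}
    B = np.zeros((len(oriented), len(oriented)))
    for (i, e), row in index.items():
        for j in hyperedges[e]:
            if j == i:
                continue
            for f, members in enumerate(hyperedges):
                if f != e and j in members:
                    B[row, index[(j, f)]] = 1.0
    return B, oriented


B, oriented = nonbacktracking_operator(FANO)
# Moduli of the eigenvalues, sorted in decreasing order.
moduli = np.sort(np.abs(np.linalg.eigvals(B)))[::-1]
```

Under this entry rule every row of $B$ sums to $(d-1)(k-1)$, so that value is the leading eigenvalue; the gap to the next modulus is what the spectral gap statement above quantifies.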

Empirical spectral distributions
In this final section, we study the empirical spectral distribution of the adjacency matrix of a random regular hypergraph. We define the empirical spectral distribution (ESD) of a symmetric $n \times n$ matrix $M$ to be the probability measure $\mu_n$ on $\mathbb{R}$ given by
$$\mu_n := \frac{1}{n} \sum_{i=1}^{n} \delta_{\lambda_i},$$
where $\delta_x$ is the point mass at $x$ and $\lambda_1, \dots, \lambda_n$ are the eigenvalues of $M$. We always assume $d \ge k$ (see Remark 4.1). Feng and Li in [16] derived the limiting ESD for a sequence of connected (d, k)-regular hypergraphs with fixed d, k as follows. The definition of primitive cycles in [16] is the same as cycles in our Definition 2.2. converges weakly in probability to the measure µ supported on [−2, 2], whose density function is given by We prove that for uniform random regular hypergraphs, the assumptions in Theorem 6.1 hold with high probability, which implies the convergence of ESD in probability for random regular hypergraphs.
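Concretely, the mass that the ESD assigns to an interval is just the fraction of eigenvalues falling in it. A minimal sketch using NumPy follows; the function name `esd_mass` is ours, and the example matrix is the adjacency matrix $J - I$ of the Fano plane.

```python
import numpy as np


def esd_mass(M, a, b):
    """Return mu_n([a, b]), the fraction of eigenvalues of the symmetric
    matrix M that lie in the closed interval [a, b]."""
    eigs = np.linalg.eigvalsh(np.asarray(M, dtype=float))
    return float(np.mean((eigs >= a) & (eigs <= b)))


# Adjacency matrix of the Fano plane (A = J - I on 7 vertices), whose
# eigenvalues are 6 (once) and -1 (six times).
A = np.ones((7, 7)) - np.eye(7)
mass_near_minus_one = esd_mass(A, -1.5, -0.5)
mass_near_six = esd_mass(A, 5.5, 6.5)
```

The same function applies unchanged to a rescaled matrix, which is how the normalized ESDs in the theorems of this section would be evaluated on sampled hypergraphs.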
Lemma 6.2. Let H be a random (d, k)-regular hypergraph with fixed d ≥ k ≥ 3. Then H is connected asymptotically almost surely.
Proof. $H$ is connected if and only if its associated bipartite biregular graph $G$ is connected. The first eigenvalue of the (d, k)-bipartite biregular graph $G$ is $\lambda_1 = \sqrt{dk}$, and we know from Lemma 4.6 and Lemma 4.8 that, for a uniformly chosen random regular hypergraph $H$, the corresponding bipartite biregular graph $G$ satisfies the second eigenvalue bound (4.2) asymptotically almost surely. So when $d \ge k \ge 3$, for sufficiently large $n$, the first eigenvalue has multiplicity one with high probability. If $G$ is not connected, we can decompose $G$ as $G = G_1 \cup G_2$ such that there is no edge between $G_1$ and $G_2$. Then $G_1$ and $G_2$ are both bipartite biregular graphs with largest eigenvalue $\sqrt{dk}$. However, that implies $G$ satisfies $\lambda_2 = \sqrt{dk}$, a contradiction.
The following lemma shows the number of cycles of length l in H is o(n) asymptotically almost surely. Hence X l = o(n) asymptotically almost surely.
Combining Theorem 6.1, Lemma 6.2 and Lemma 6.3, we have the following theorem for the ESDs of random regular hypergraphs with fixed d, k.
Theorem 6.4. Let $A_n$ be the adjacency matrix of a random (d, k)-regular hypergraph on $n$ vertices, and let $M_n := \frac{A_n - (k-2)I}{\sqrt{(d-1)(k-1)}}$. For fixed $d \ge k \ge 3$, the empirical spectral distribution of $M_n$ converges in probability to a measure $\mu$ with density function $f(x)$ given in (6.1).
Remark 6.5. When $k = 2$, $f(x)$ is the density of the Kesten-McKay law [34] with a different scaling factor. For $k \ge 3$, the limiting distribution in (6.1) is not symmetric (i.e., $f(x) \neq f(-x)$), which is quite different from the random graph case. For random bipartite biregular graphs with bounded degrees, the limit of the ESDs was derived in [19], and later in [7] using different methods.
In [16], the cases where d, k grow with n have not been discussed. With the results on random bipartite biregular graphs from [15,41], we can get the following result in this regime. Theorem 6.6. Let A n be the adjacency matrix of a random (d, k)-regular hypergraph on n vertices. For d → ∞ with d k → α ≥ 1 and d = o(n 1/2 ), the empirical spectral distribution of M n := converges in probability to a measure supported on [−2, 2] with a density function To prove Theorem 6.6, we will apply the following results for the global law of random bipartite biregular graphs. Theorem 6.7 (Theorem 1 in [15] and Corollary 2.2 in [41]). Let A G be the adjacency matrix of a random bipartite biregular graph sampled from G(n, m, d, k) with n ≤ m, d k → α ≥ 1, and d = o(n 1/2 ) as n → ∞. Then the ESD of A G √ k converges asymptotically almost surely to a distribution supported on and a point mass of α−1 α+1 at 0, where a = 1 − α −1/2 , b = 1 + α −1/2 . Proof of Theorem 6.6. Let A G be the adjacency matrix of a random (d, k)-bipartite biregular graph sampled from G(n, nd/k, d, k). Since X is a n × m matrix with n ≤ m, the ESD of XX k is the distribution of the squares of the nonzero eigenvalues of A G √ k , and from (6.3), the ESD of XX k is supported on [a 2 , b 2 ] with the density function given bỹ asymptotically almost surely. By Lemma 4.8, the same statement holds for a random bipartite biregular graph G sampled uniformly from G (n, nd/k, d, k). Since the adjacency matrix of the corresponding regular hypergraph H is A n = XX − d, by scaling, this implies that the ESD of is supported on [−2, 2] and the density is given by (6.2).
The convergence of empirical spectral distributions on short intervals (also known as the local law) for random bipartite biregular graphs was studied in [15,41,44]. Universality of eigenvalue statistics was studied in [43]. All of these local eigenvalue statistics can be translated to random regular hypergraphs via the bijection in Lemma 4.2. As an example, we translate the following result about the local law for random bipartite biregular graphs in [15] to random regular hypergraphs. and µ be the limiting ESD defined in (6.2). For any > 0, there exists a constant C such that for all sufficiently large n and δ > 0, for any interval I ⊂ [−2 + , 2] with length |I| ≥ 4(1+ √ α) 2 √ α max 2η, η −δ log δ , it holds that |µ n (I) − µ(I)| ≤ δC |I| with probability 1 − o(1), where η is given by the following quantities: h = min log n 9(log k) 2 , k , r = e 1/h , η = r 1/2 − r −1/2 . (6.5) We prove Theorem 6.8 from the following local law for random bipartite biregular graphs in [15]. Lemma 6.9 (Theorem 3 in [15]). Let G be a random (d, k)-bipartite biregular graph on n + nd/k vertices satisfying d → ∞ as n → ∞ and log k = o √ log n , d k → α ≥ 1. Let A G be the adjacency matrix of G and µ n be the ESD of A G √ k−1 and let µ be the measure defined in (6.3). For any > 0, there exists a constant C such that for all sufficiently large n and 0 < δ < 1, for any interval I ⊂ R avoiding [− , ] and with length |I| ≥ max 2η, η −δ log δ , |µ n (I) − µ(I)| ≤ δC |I| (6.6) with probability 1 − o(1/n), where η is given in (6.5).
Proof of Theorem 6.8. For any interval I ⊂ R and a symmetric matrix M , we denote N M I to be the number of eigenvalues of M in the interval I. For a random (d, k)-regular hypergraph H with adjacency matrix A, let G be its associated bipartite biregular graph and A G be the adjacency matrix of G. With Lemma 4.8, we know (6.6) holds for A G with probability 1 − o(1). Recall that the ESD of W := XX k−1 is the distribution of the squares of the nontrivial eigenvalues of A G .  . Note that from (6.8) and (6.10), the interval length of I 3 satisfies where the last inequality is from (6.7). From (6.9), From Lemma 6.9, since √ β ≥ /2, I 3 is an interval avoiding [− /2, /2], hence there exists a constant C such that where µ G is the limiting measure defined in (6.3). Let µ X be the limiting measure defined in (6.4) and µ A be the limiting measure defined in (6.2). Note that µ A (I 1 ) = µ X (I 2 ) = 2(α + 1)µ G (I 3 ). Therefore (6.12) implies 1 2(n + nd k ) N M I 1 − 1 2(α + 1) µ A (I 1 ) ≤ δC |I 3 |. (6.13) Let µ n be the ESD of M . From (6.13), we get |µ n (I 1 ) − µ A (I 1 )| = 1 n N M I 1 − µ A (I 1 ) = 2 1 + d k 1 2(n + nd k ) where the first inequality in (6.14) is from (6.11), and C is a constant depending on α and . This completes the proof of Theorem 6.8.