Random numerical semigroups and a simplicial complex of irreducible semigroups

We examine properties of random numerical semigroups under a probabilistic model inspired by the Erdős-Rényi model for random graphs. We provide a threshold function for cofiniteness, and bound the expected embedding dimension, genus, and Frobenius number of random semigroups. Our results follow, surprisingly, from the construction of a very natural shellable simplicial complex whose facets are in bijection with irreducible numerical semigroups of a fixed Frobenius number and whose $h$-vector determines the probability that a particular element lies in the semigroup.


Introduction
A numerical semigroup is a subset S of the non-negative integers $\mathbb{Z}_{\ge 0}$ that is closed under addition. A nonnegative integer n is a gap of S if $n \notin S$, and we denote the set of gaps of S by G(S). Numerical semigroups appear in several areas of mathematics [5], and there are several interesting combinatorial invariants of a semigroup [14]. Notable numerical semigroup invariants include the embedding dimension e(S), which is the number of minimal generators of S; the genus g(S), which is the number of gaps of S, i.e., g(S) = #G(S); and the Frobenius number F(S), which is the largest gap of S. The latter two invariants are usually only defined when S is cofinite, that is, when S has finite complement in $\mathbb{Z}_{\ge 0}$. The theory of numerical semigroups is a vibrant subject, with connections to algebraic geometry and commutative algebra [1,6,8,9,12,17] as well as integer optimization and number theory (see [5] and references therein). In this paper, we investigate invariants of numerical semigroups from a probabilistic point of view.
V. Arnol'd [4] and J. Bourgain and Y. Sinai [7] initiated the study of the "average behavior" of numerical semigroups by analyzing the Frobenius function F(S) for "typical" numerical semigroups (see more recent work in [2]). Each of these papers produced random numerical semigroups using the uniform probability distribution on the collection $G(N, T) = \{a \in \mathbb{Z}^N_{>0} : \gcd(a) = 1 \text{ and } a \le T\}$ of generating sets, and each proved several interesting statements about the expected value (in the usual probabilistic sense) of the Frobenius number. See the references in [2] for a thorough overview.
In this paper, we study a different model, which generates at random a numerical semigroup S according to the following procedure: (1) fix a nonnegative integer M and a probability p ∈ [0, 1]; (2) initialize a set of generators A = {0} for S; (3) independently choose with probability p whether to include each n ≤ M in A.
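The sampling procedure above can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: the function names (`sample_generating_set`, `is_cofinite`, `minimal_generators`, `gaps`) and the seeded `random.Random` generator are assumptions introduced here. The sketch relies on two standard facts: the semigroup generated by A is cofinite exactly when gcd(A) = 1, and once m consecutive integers lie in the semigroup (m being the smallest positive generator), every larger integer does too.

```python
import random
from functools import reduce
from math import gcd


def sample_generating_set(M, p, seed=None):
    """Include each n in 1..M independently with probability p.

    The element 0 always lies in the semigroup, so it is left implicit.
    """
    rng = random.Random(seed)
    return [n for n in range(1, M + 1) if rng.random() < p]


def is_cofinite(A):
    """<A> has finite complement exactly when gcd(A) = 1."""
    return reduce(gcd, A, 0) == 1


def minimal_generators(A):
    """Elements of A that are not sums of smaller elements of <A>."""
    chosen = set(A)
    if not chosen:
        return []
    top = max(chosen)
    member = [True] + [False] * top
    minimal = []
    for k in range(1, top + 1):
        # Every generator found so far is smaller than k.
        reachable = any(member[k - g] for g in minimal)
        if k in chosen and not reachable:
            minimal.append(k)
        member[k] = reachable or k in chosen
    return minimal


def gaps(A):
    """Gaps of <A>; only defined when <A> is cofinite."""
    if not is_cofinite(A):
        raise ValueError("the semigroup generated by A is not cofinite")
    gens = minimal_generators(A)
    m = gens[0]  # smallest positive element of <A>
    member, found, run, k = [True], [], 1, 0
    # Stop once m consecutive integers have been found in <A>.
    while run < m:
        k += 1
        inside = any(g <= k and member[k - g] for g in gens)
        member.append(inside)
        if inside:
            run += 1
        else:
            run = 0
            found.append(k)
    return found
```

For a sample `A = sample_generating_set(M, p)`, the embedding dimension is `len(minimal_generators(A))` and, when `is_cofinite(A)` holds, the genus and Frobenius number are `len(gaps(A))` and `max(gaps(A))` (for a nonempty gap set).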
The notation S ∼ S(M, p) indicates S is a random numerical semigroup produced with this model. A similar model was recently used to produce random monomial ideals [10] (that is, each multivariate monomial with bounded total degree is included in a generating set with probability p). The authors dubbed this model the "ER-type model" for its resemblance to the Erdős-Rényi model of random graphs [11]; we will use the same convention.
Unlike previously used models, which sampled uniformly among numerical semigroups with a fixed number of generators, the ER-type model allows one to specify a probability as input, yielding more refined control over the numerical semigroups produced. Our model is also more closely aligned with the "standard" sampling methods from probabilistic combinatorics, and more compatible with the use of numerical semigroups in integer programming, where non-cofinite semigroups occur alongside cofinite ones.
Our main result is as follows.
Although part (a) of Theorem 1.1 follows from standard arguments in probabilistic combinatorics (Theorem 3.4), parts (b) and (c) follow, surprisingly, from the construction of a very natural shellable simplicial complex (Definition 4.3) whose facets are in bijection with irreducible numerical semigroups of a fixed Frobenius number (Definition 4.2). As it turns out, some of the probabilities involved in determining the expected values above require precisely the h-vector (in the sense of algebraic combinatorics [18]) for this simplicial complex. Through the h-vector, we distinguish parts (b) and (c) of Theorem 1.1 (Corollary 5.6) and estimate the finite expectations (Theorem 6.4).
Acknowledgements. The authors would like to thank Iskander Aliev, Martin Henk, Calvin Leng, and Pedro García-Sanchez for several helpful conversations and suggestions, and are grateful to Zachary Spaulding for assisting with the experiments in Table 2. The first and second author were partially supported by NSF grant DMS-1522158 to the University of California Davis. The first author was also partially supported by NSF grant DMS-1440140, while he visited the Mathematical Sciences Research Institute in Berkeley, California, during the Fall 2017 semester. The third author was supported by NSF collaborative grant DMS-1522662 to Illinois Institute of Technology.

Background
We begin by recalling some basic notions from probability theory and establishing notation used throughout the paper. For a more comprehensive resource on methods in probabilistic combinatorics, we refer the reader to the excellent book of Alon and Spencer [3]. The expected value of a discrete random variable X taking values in $\Omega \subseteq \mathbb{N}$ is $E[X] = \sum_{n \in \Omega} n \cdot P[X = n]$, and the variance of X is given by $\mathrm{Var}[X] = E[(X - E[X])^2]$. In Theorem 3.4, we will bound the probability that a nonnegative integer random variable X is non-zero using the so-called moment method techniques. The first moment method, a consequence of Markov's inequality [3], says that the probability X is non-zero is bounded above by its expectation, i.e. $P[X \neq 0] \le E[X]$. On the other hand, the second moment method provides an upper bound for the probability that X is zero in terms of its variance, $P[X = 0] \le \mathrm{Var}[X] / E[X]^2$, and follows from Chebyshev's inequality [3, Theorem 4.1.1].
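For readers who want the one-line justifications, both moment bounds can be derived as below. This is a standard derivation written out for convenience, assuming X takes values in the nonnegative integers and, for the second bound, that E[X] > 0.

```latex
% First moment method: each term with n >= 1 satisfies P[X = n] <= n * P[X = n].
P[X \neq 0] = \sum_{n \ge 1} P[X = n] \le \sum_{n \ge 1} n \cdot P[X = n] = E[X].

% Second moment method: if X = 0, then |X - E[X]| = E[X], so the event {X = 0}
% is contained in the event {|X - E[X]| >= E[X]}. Chebyshev's inequality with
% deviation E[X] then gives the bound.
P[X = 0] \le P\bigl[\, |X - E[X]| \ge E[X] \,\bigr] \le \frac{\mathrm{Var}[X]}{E[X]^2}.
```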
For functions f, g depending on some parameter n, we use the notation $f \gg g$ to indicate that the ratio g/f → 0 as n → ∞, and similarly $f \ll g$ means that f/g → 0 as n → ∞. With this, clearly $f \ll 1$ implies f → 0 as n → ∞. In the theory of Erdős-Rényi random graphs, many graph properties tend to appear or not appear with high probability based on the asymptotics of the probability parameter p. This phenomenon is quantified by the notion of a threshold function (see [3, Chapter 10]). Here we define a notion of threshold function tailored to the context of random numerical semigroups. We say that a property $\mathcal{P}$ of a numerical semigroup S is monotone if, whenever S has $\mathcal{P}$ and S ∪ {s} for s ∈ N is still a numerical semigroup, S ∪ {s} also has $\mathcal{P}$. For example, the property of being cofinite is monotone, whereas the property of having Frobenius number n is not. A threshold function for a monotone numerical semigroup property $\mathcal{P}$ is a function t(M) such that, for S ∼ S(M, p), $\lim_{M \to \infty} P[S \text{ has } \mathcal{P}] = 0$ if $p \ll t(M)$ and $\lim_{M \to \infty} P[S \text{ has } \mathcal{P}] = 1$ if $p \gg t(M)$. Loosely speaking, if t is a threshold for a property $\mathcal{P}$, then when p is much smaller than t, S ∼ S(M, p) will not have $\mathcal{P}$ asymptotically almost surely (a.a.s.), and when p is much larger than t, S will have $\mathcal{P}$ a.a.s.

Distribution and cofiniteness
In this section, we prove that the threshold function for cofiniteness coincides with the threshold function for nonemptiness (Theorem 3.4). First, we give Theorem 3.2, which states the probability of observing a fixed numerical semigroup in terms of its embedding dimension and gaps.

Proof. First, observe that S = S if and only if A ⊃ A and no gap g of S is in
Let X be the number of pairs of coprime integers in A. Bounding the expectation and variance of X and applying the second moment method gives P[X > 0] → 1; thus, when $p \gg 1/M$, A will contain a pair of coprime integers a.a.s., which guarantees that S is cofinite a.a.s. in this case.
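The threshold behavior can be explored empirically with a small simulation such as the following sketch. It is an illustration only: the function name `threshold_frequencies`, the trial count, and the seed are assumptions made here, and cofiniteness is tested through the standard criterion gcd(A) = 1 rather than by searching for a coprime pair.

```python
import random
from functools import reduce
from math import gcd


def threshold_frequencies(M, p, trials=1000, seed=0):
    """Estimate P[A is nonempty] and P[<A> is cofinite] for S ~ S(M, p).

    Returns the pair (nonempty_frequency, cofinite_frequency).
    """
    rng = random.Random(seed)
    nonempty = 0
    cofinite = 0
    for _ in range(trials):
        A = [n for n in range(1, M + 1) if rng.random() < p]
        if A:
            nonempty += 1
        # gcd of the empty list is taken to be 0, so S = {0} is not cofinite.
        if reduce(gcd, A, 0) == 1:
            cofinite += 1
    return nonempty / trials, cofinite / trials
```

Calling `threshold_frequencies(M, c / M)` for a fixed large M and a range of constants c (small, then large) gives a rough picture of how both frequencies move from near 0 toward 1 around the scale p ≈ 1/M.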

The simplicial complex of irreducible semigroups
Before proving the remaining parts of Theorem 1.1, we introduce in Definition 4.3 a simplicial complex whose combinatorial properties govern several questions arising from the ER-type model for sampling random numerical semigroups. We prove that this complex is shellable (Proposition 4.7), in the process uncovering a combinatorial interpretation of its h-vector entries (Corollary 4.10). We begin by recalling the definition of a simplicial complex and some related concepts, as presented in [18,Chapter 2].
in terms of the f -vector. We now define the simplicial complex ∆ n whose facets are in natural bijection with irreducible numerical semigroups with Frobenius number n.  Proof. The first claim follows from [14,Proposition 3.4], and yields a bijection between the gaps of S (excluding n/2) and the elements of S less than n. The second claim now follows.
Remark 4.5. Lemma 4.4 implies that for an irreducible numerical semigroup S, the set of minimal generators less than n/2 (so long as it is nonempty) uniquely determines S. In particular, the minimal generators determine which integers less than n/2 lie in S, and the fact that i ∈ S if and only if $n - i \notin S$ determines the remainder of the gaps of S.
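The reconstruction described in Remark 4.5 can be sketched in Python as follows. The function names `small_part` and `gaps_from_small_generators` are hypothetical, and the input is assumed to be valid (the generators are all less than n/2 and generate a semigroup not containing n). For even n, the sketch also uses the fact that n/2 must be a gap, since otherwise n = n/2 + n/2 would lie in S.

```python
def small_part(gens, bound):
    """Elements of the semigroup <gens> in [0, bound], by dynamic programming."""
    member = [True] + [False] * bound
    for k in range(1, bound + 1):
        member[k] = any(g <= k and member[k - g] for g in gens)
    return [k for k in range(bound + 1) if member[k]]


def gaps_from_small_generators(n, gens):
    """Gaps of the irreducible semigroup with Frobenius number n whose
    minimal generators below n/2 are `gens` (assumed to be valid input)."""
    half = (n - 1) // 2                # largest integer strictly below n/2
    low = small_part(gens, half)       # elements of S below n/2, including 0
    result = {i for i in range(1, half + 1) if i not in low}
    if n % 2 == 0:
        result.add(n // 2)             # n/2 is a gap because 2 * (n/2) = n
    result.update(n - i for i in low)  # i in S forces n - i to be a gap
    return sorted(result)
```

For example, with n = 7 and the single small generator 3, the function returns the gap set of the semigroup generated by 3 and 5.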
We say ∆ is shellable if it has a shelling order.
Proposition 4.7. Fix n ≥ 1, let $S_1, \ldots, S_r$ denote the irreducible numerical semigroups with Frobenius number n, and let $F_i = S_i \cap [n-1]$ be the facet of $\Delta_n$ corresponding to $S_i$. If $F_i \ge F_j$ for all i < j, then $F_1, \ldots, F_r$ is a shelling order for $\Delta_n$.
Proof. Fix i ≥ 2. It suffices to prove that whenever is closed under addition since $S_j$ is closed under addition and $F_j$ and $F_i$ have identical elements less than a. Additionally, $S_i \cup \{b\} \setminus \{a\}$ is closed under addition, since its elements greater than b are identical to those of $S_j$. In particular, $S_k = S_i \cup \{b\} \setminus \{a\}$ is an irreducible numerical semigroup with Frobenius number n, and k < i since $F_k > F_i$. This completes the proof.
Moreover, under any shelling order of $\Delta_n$ (in particular, under any ordering $F_1, \ldots, F_r$ satisfying Proposition 4.7), $F_j \setminus (F_1 \cup \cdots \cup F_{j-1})$ has a unique minimal face $R_j$ for each j, and $h_{n,i} = \#\{R_j : \#R_j = i\}$ counts the number of such minimal faces with i vertices. As such, it suffices to show each $R_i$ equals the set $G_i$ of minimal generators of $S_i$ less than n/2. Now, clearly $h_{n,0} = 1$, so assume $G_i$ is nonempty. Any facet F containing $G_i$ agrees with $F_i$ for all elements less than n/2, and Remark 4.5 implies $F_i = F$. As such, $F_i$ is the only facet containing the face $G_i$, and $R_i \subset G_i$. Conversely, suppose $a \in G_i \setminus R_i$, and let $S' = S_i \setminus \{a\} \cup \{n - a\}$. Since a is a minimal generator of $S_i$, the set $S'$ is closed under addition and has the same number of elements less than n as $S_i$, meaning $S'$ is an irreducible numerical semigroup. Since $S'$ contains $R_i$ and appears before $S_i$ in the shelling order, we have arrived at a contradiction.
Corollary 4.10. There is a bijection between the irreducible numerical semigroups with Frobenius number n and the numerical semigroups not containing n whose generators are all less than n/2. In particular, $h_{n,i}$ equals the number of embedding dimension i semigroups not containing n whose generators are all less than n/2.
Proof. For any irreducible numerical semigroup $S = \langle n_1 < \cdots < n_k \rangle$ with F(S) = n, the semigroup $T = \langle n_1, \ldots, n_t \rangle$ with $n_t < n/2 < n_{t+1}$ is contained in S and thus cannot contain n. Conversely, fix a numerical semigroup $T = \langle n_1 < \cdots < n_t \rangle$ with $n_t < n/2$, let $B = \{s \in (n/2, n) : s \notin T \text{ and } n - s \notin T\}$, and let $S = T \cup B$. If $s + s' \notin T$ for $s \in B$ and $s' \in T$, then $(n - s) - s' \notin T$, meaning $s + s' \in B$. We conclude S is closed under addition, at which point Lemma 4.4 implies S is irreducible.
The second claim now follows from the first and Theorem 4.9.
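For small n, the characterization in Corollary 4.10 can be turned directly into a brute-force computation of the h-vector entries. The sketch below is an illustration only and is not the algorithm of [16]: the names `semigroup_elements`, `is_minimal_generating_set`, and `h_vector` are hypothetical, and the enumeration over all subsets of integers below n/2 is exponential in n, so it is practical only for small values.

```python
from itertools import combinations


def semigroup_elements(gens, bound):
    """Elements of the semigroup <gens> in [0, bound], by dynamic programming."""
    member = [True] + [False] * bound
    for k in range(1, bound + 1):
        member[k] = any(g <= k and member[k - g] for g in gens)
    return {k for k in range(bound + 1) if member[k]}


def is_minimal_generating_set(gens):
    """True when no element of gens lies in the semigroup generated by the others."""
    return all(
        a not in semigroup_elements([g for g in gens if g != a], a)
        for a in gens
    )


def h_vector(n):
    """Entry i counts the sets of i integers below n/2 that minimally
    generate a semigroup not containing n (trailing zeros are dropped)."""
    candidates = [a for a in range(1, n) if 2 * a < n]
    counts = {}
    for size in range(len(candidates) + 1):
        for A in combinations(candidates, size):
            if is_minimal_generating_set(A) and n not in semigroup_elements(A, n):
                counts[size] = counts.get(size, 0) + 1
    return [counts.get(i, 0) for i in range(max(counts) + 1)]
```

For instance, `h_vector(11)` returns `[1, 4, 1]`: the empty set, the four non-divisors 2, 3, 4, 5 of 11, and the single pair {4, 5}. By the bijection in Corollary 4.10, the entries sum to the number of irreducible numerical semigroups with Frobenius number 11.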
Remark 4.11. By Theorem 4.9, the coefficients h n,i of the polynomial h n (x) can be computed using [16], which gives an algorithm to compute the set of irreducible numerical semigroups with Frobenius number n. A precomputed list up to n = 90 can be found at the following webpage (the computations for n ≥ 88 each take over a day to complete with the authors' personal computers): https://gist.github.com/coneill-math/c2f12c94c7ee12ac7652096329417b7d The h-vectors of ∆ 89 and ∆ 90 (some entries of which are given in Table 1) demonstrate an interesting phenomenon: not only are the coefficients h n,i not necessarily monotone for fixed i, but the fewer divisors n has, the larger h n,i tends to be with respect to the surrounding n-values. This is likely due in part to the complex ∆ n having more vertices in this case. The computations also take considerably longer in such cases; indeed, n = 89 took longer than for n = 88 and n = 90 combined.
We remind the reader that the polynomial h n (x) does not depend on a given numerical semigroup; rather, there is precisely one polynomial for each n ∈ Z ≥1 , and it encodes information about all numerical semigroups with Frobenius number n. Though a wide assortment of posets whose elements are numerical semigroups have been studied elsewhere in the literature [15,13], we were unable to locate any that consider ∆ n .
We now give some basic properties of the h-vector of $\Delta_n$.
Proof. We proceed using Corollary 4.10 and characterizing the possible sets A ⊂ (0, n/2) of integers minimally generating a numerical semigroup $S = \langle A \rangle$ with $n \notin S$. Since A = ∅ generates the semigroup S = {0}, we have $h_{n,0} = 1$. Additionally, any non-divisor of n less than n/2 generates a semigroup not containing n, which proves part (b). Now, Theorem 4.13 implies $d_n \le \lfloor (n-1)/2 \rfloor - \lfloor n/3 \rfloor$. Moreover, the set A = (n/3, n/2) ∩ Z minimally generates a semigroup not containing n since the sum of any two elements is strictly less than n while the sum of any three is strictly larger than n. Since $|A| = d_n$, this proves (c).
We conclude this section with the following bounds on the h-vector entries of $\Delta_n$, which play a crucial role in establishing the final threshold function in Section 5 and estimating several expected values in Section 6.
Proof. By Corollary 4.10, $h_{n,i}$ counts sets A minimally generating a semigroup not containing n.
We claim m = min(A) ≥ 2i. Indeed, since A forms a minimal generating set, each element must be distinct modulo m. Additionally, since $n \notin \langle A \rangle$, any element a ∈ A cannot satisfy a ≡ n mod m, and since max(A) ≤ n/2, any two elements a, b ∈ A cannot satisfy a + b ≡ n mod m. The upper bound immediately follows.
For the lower bound, we claim that for any j ≥ 2, each set A of i distinct integers chosen from the open interval (n/(j + 1), n/j) minimally generates a numerical semigroup S with n / ∈ S. Indeed, the sum of any j elements of A is strictly less than n, while the sum of any j + 1 is strictly larger than n. Additionally, since j ≥ 2, the sum of any two elements of A exceeds n/j, ensuring A minimally generates S. This completes the proof.
Remark 4.14. Proposition 4.12 implies the lower bound in Theorem 4.13 is tight for i = 1. The given bounds are tighter than those for more general Cohen-Macaulay simplicial complexes [18] and are sufficient to prove the results in the coming sections, but still leave room for improvement. Table 1 compares values from Theorem 4.13 with those computed in Remark 4.11.

Expected number of minimal generators
The main result of this section is Corollary 5.6, which states that if p → 0 as M → ∞, then the expected number of generators, expected number of gaps, and expected Frobenius number are all unbounded. Our proof uses a surprising connection between the probability a n (p) that a nonnegative integer n lies in the chosen semigroup (Definition 5.2) and the h-vector of the simplicial complex ∆ n introduced in Section 4; see Remark 5.4.
We now consider each summand $\sigma_j$ of the outer sum for a fixed value of j. Using the division algorithm to write each n = k(2j(j + 1)) + r for k ≥ 0 and 1 ≤ r ≤ 2j(j + 1), and supposing M = m(2j(j + 1)) for some $m \in \mathbb{Z}_{\ge 1}$, we obtain Since $p \gg 1/M$, a simple calculus exercise shows $\left((1 - p)^{j(j+1)} (1 + p)^{2}\right)^{m} \to 0$. If p → 0, we obtain which must hold for every N ≥ 2. Proof. Apply Theorem 5.5 and the inequalities which hold for any cofinite numerical semigroup S.

Approximations
In the final section of this paper, we prove the only remaining case in Theorem 1.1, namely where p is bounded away from zero (Theorem 6.4). In this case, it suffices to assume p ∈ (0, 1) is constant. In doing so, we provide explicit bounds on E[e(S)], E[g(S)], and E[F(S)] as M → ∞ using the h-vector bounds in Theorem 4.13; Remark 6.5 discusses the accuracy of these estimates.
Table 2. Comparing estimates of E[e(S)] using the bounds in Theorem 6.4, exact computation using the polynomials in Remark 4.11, and experimental evidence from 100,000 samples.
made better by simply noting that e(S) is at most the smallest generator (which has expected value 1/p). Such improvements to Theorem 4.13 should be possible, given the precise characterization of the h-vector of ∆ n in Corollary 4.10.
It is also worth noticing that the polynomials computed in Remark 4.11 are not sufficient for accuracy for the p values in Table 2. Indeed, with p = 0.01 and p = 0.001, each partial sum for M = 90 fails to reach the lower bound, and even when p = 0.1, the last 10 summands (i.e. for n = 81, . . . , 90) each lie between 0.01 and 0.02, so the next several summands will likely still contribute significantly to the limit.
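An experimental estimate of E[e(S)] of the kind reported in Table 2 can be obtained by direct sampling. The sketch below is an assumption-laden illustration and not the code used for the authors' experiments: the names `embedding_dimension` and `estimate_expected_embedding_dimension`, the default trial count, and the seed are all chosen here for the example.

```python
import random


def embedding_dimension(A):
    """Number of minimal generators of the semigroup generated by A."""
    chosen = set(A)
    if not chosen:
        return 0
    top = max(chosen)
    member = [True] + [False] * top
    minimal = []
    for k in range(1, top + 1):
        # k is a minimal generator when it was chosen and is not a sum
        # of a smaller minimal generator and a smaller semigroup element.
        reachable = any(member[k - g] for g in minimal)
        if k in chosen and not reachable:
            minimal.append(k)
        member[k] = reachable or k in chosen
    return len(minimal)


def estimate_expected_embedding_dimension(M, p, trials=1000, seed=0):
    """Monte Carlo average of e(S) over `trials` samples S ~ S(M, p)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        A = [n for n in range(1, M + 1) if rng.random() < p]
        total += embedding_dimension(A)
    return total / trials
```

Since the truncation at M matters most for small p, any comparison with limiting values should use M large enough that further increases no longer change the estimate noticeably.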