Beyond sum-free sets in the natural numbers

For an interval [1,N] in the natural numbers, investigating subsets S of [1,N] such that |{(x,y) in S^2:x+y in S}|=0, known as sum-free sets, has attracted considerable attention. In this paper, we define r(S):=|{(x,y) in S^2: x+y in S}| and consider its behaviour as S ranges over the subsets of [1,N]. We obtain a comprehensive description of the spectrum of attainable r-values for the s-sets of [1,N], constructive existence results and structural characterizations for sets attaining extremal and near-extremal values.


Introduction
For a finite interval [1, N] ⊆ N, investigating the nature and number of subsets S ⊆ [1, N] with the property that |{(x, y) ∈ S^2 : x + y ∈ S}| = 0 has attracted considerable attention. Such sets, called sum-free, were first studied implicitly by Schur in 1916 ([8]); in 1988, interest was revived by Cameron and Erdős in [1], and their eponymous conjecture regarding the number of such subsets was later proved by Green ([3]) and Sapozhenko ([7]). A precisely analogous problem can also be considered when [1, N] is replaced by the integers mod p, or indeed by a range of other abelian (and even non-abelian) groups; see papers such as [5], [6] and [9].
In this paper, we remain in the setting of [1, N] (where N ∈ N) and consider the quantity r_N(S) := |{(x, y) ∈ S^2 : x + y ∈ S}| as S ranges over all subsets of [1, N]. When r_N(S) = 0, S is of course a sum-free subset of [1, N]. Natural questions which arise are: let N ∈ N; then for a given s ∈ [1, N], what are the minimum and maximum possible values of r_N(S) as S runs through all size-s subsets of [1, N]? (Consideration of cardinalities implies that the minimum cannot always be 0.) What can be said about the structure of sets attaining them? Are all intermediate values between the maximum and minimum attained by some s-set in [1, N]? In a previous paper with Mullen and Yucas [4], the author considered the Z/pZ case, and established best-possible minimum and maximum values. The precise nature of the spectrum of values attained between the extremes remains unknown in this setting, although some partial results and a conjecture are contained in [4].
This paper provides a comprehensive description of the situation in the [1, N] setting. We describe the spectrum of attainable values, establish constructive existence results and obtain characterizations of sets attaining extremal (and some near-extremal) values. It transpires that in order to answer the above questions for subsets of a given interval [1, N] (N ∈ N), it is helpful to consider the problem from another angle. For s ∈ N, we can ask: what is the range of cardinalities |{(x, y) ∈ S^2 : x + y ∈ S}| that can be attained as S runs through all size-s subsets of N? What is the smallest N for which all of these values are attained by subsets of [1, N]? We shall see that the "tipping point" for the problem occurs at N = 2s − 1, and that sum-free sets play a crucial role.

Preliminaries
Throughout, we use the standard notation [a, b] (a, b ∈ N) for the set {x ∈ N : a ≤ x ≤ b}.
For a finite set S = {x_1 < ··· < x_s} ⊆ N, S may be considered as a subset of any [1, N] with N ≥ x_s; since the value of r_N(S) described above remains the same for any such choice of N, we may refer simply to r(S). We shall call r(S) the r-value of S, and say that S is ρ-closed if it has r-value ρ = r(S). Clearly, 0 ≤ r(S) ≤ |S|^2.
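As a concrete illustration of this definition, the following minimal Python sketch computes r(S) by brute force. The function name r_value is a hypothetical helper chosen for illustration; it does not appear in the paper.

```python
from itertools import product

def r_value(S):
    """Count ordered pairs (x, y) in S x S whose sum x + y also lies in S."""
    S = set(S)
    return sum(1 for x, y in product(S, repeat=2) if x + y in S)

# The interval [1, 3]: the pairs (1, 1), (1, 2), (2, 1) have their sums in S.
print(r_value({1, 2, 3}))   # 3
# The interval [3, 5] is sum-free, i.e. 0-closed.
print(r_value({3, 4, 5}))   # 0
```

Since the count depends only on S, the same value is returned whichever interval [1, N] with N ≥ x_s is regarded as the ambient set.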
Next we provide, for reference, formulae for certain "standard constructions" which will be used throughout the rest of the paper. We omit proofs in general, as these can easily be supplied by the interested reader.
Lemma 2.7. Let S ⊆ N be an interval of size s, i.e. S = [i, i + (s − 1)], and let x ∈ S. If i < s, If i ≥ s, r(S \ x) = r(S) = 0.
Proof. For i ≥ s, the interval S = [i, i + (s − 1)] has r-value 0, and clearly every subset of a 0-closed set is 0-closed. Assume i < s. The deleted element x may contribute to r(S) as a summand (i.e. in pairs of the form (x, t), (t, x) ∈ S^2 such that x + t ∈ S), or as a sum (i.e. x = t + u for some pair (t, u) ∈ S^2), or in both roles. Since all elements are positive integers, x cannot be a summand in an expression summing to x. Each x ∈ S occurs as a summand in a sum x + t (t ∈ S) s − x times if x ≤ s, and 0 times otherwise, and similarly for t + x (t ∈ S). Doubling this quantity will over-count by one precisely if x + x ∈ S, i.e. precisely if i ≤ 2x ≤ i + (s − 1). So, as a summand, x ∈ S contributes the following to r(S): 2(s − x) − 1 if x ≤ s and i ≤ 2x ≤ i + (s − 1); 2(s − x) if x ≤ s and 2x ∉ S; and 0 if x > s. Each x ∈ S occurs as an ordered sum in S + S 0 times if x < 2i, and x − 2i + 1 times if 2i ≤ x ≤ 2i + (s − 1). Hence r(S \ x) is equal to r(S) minus the contributions made by x as both summand and sum in (S + S) ∩ S; the result follows.
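The counting argument in this proof can be checked numerically. The sketch below is an illustration only: the helper names summand_pairs and sum_pairs are hypothetical, and the two formulae are transcribed from the counts described in the proof rather than from the (lost) displayed statement of the lemma.

```python
from itertools import product

def r_value(S):
    """Brute-force r(S): ordered pairs (x, y) in S x S with x + y in S."""
    S = set(S)
    return sum(1 for x, y in product(S, repeat=2) if x + y in S)

def summand_pairs(i, s, x):
    """Ordered pairs (x, t) or (t, x), t in S = [i, i+s-1], with x + t in S."""
    if x > s:
        return 0
    double_in_S = i <= 2 * x <= i + s - 1   # the pair (x, x) is counted twice
    return 2 * (s - x) - (1 if double_in_S else 0)

def sum_pairs(i, x):
    """Ordered pairs (t, u) in S x S with t + u = x."""
    return x - 2 * i + 1 if x >= 2 * i else 0

def check(i, s):
    """Compare r(S \\ x) with r(S) minus both contributions, for every x in S."""
    S = set(range(i, i + s))
    return all(
        r_value(S - {x}) == r_value(S) - summand_pairs(i, s, x) - sum_pairs(i, x)
        for x in S
    )

print(all(check(i, s) for s in range(2, 9) for i in range(1, s)))
```

For example, with S = [1, 4] and x = 2, the summand count is 3 (the pairs (2, 1), (1, 2), (2, 2)) and the sum count is 1 (the pair (1, 1)), so r(S \ 2) = 6 − 4 = 2.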

Extremal r-values
We begin by considering the extremal r-values g_{s,N} and f_{s,N}.
. , x_s} is an arithmetic progression with common difference equal to x_1.
For each i = 1, ..., s, all elements of x_i + S are greater than x_i and hence greater than x_j with j < i. So |(x_i + S) ∩ S| ≤ s − i, and this bound is attained precisely if
It is clear that, if S is an arithmetic progression with common difference x_1, then r(S) = s(s−1)/2. Now suppose S is an s-set in [1, N] with r(S) = s(s−1)/2. Equality must be attained in
Hence the maximum possible r-value of a size-s set in [1, N] (s ≤ N) does not depend on N.
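The maximum value s(s−1)/2 and the structure of the sets attaining it can be observed directly for small parameters. The following is a brute-force sketch; the function names are hypothetical and the search is exhaustive, so it is only practical for small s and N.

```python
from itertools import combinations, product

def r_value(S):
    S = set(S)
    return sum(1 for x, y in product(S, repeat=2) if x + y in S)

def is_ap_with_difference_x1(S):
    """True if S = {x_1, 2*x_1, ..., s*x_1}."""
    xs = sorted(S)
    return all(x == (k + 1) * xs[0] for k, x in enumerate(xs))

def maximisers(s, N):
    """Largest r-value over s-subsets of [1, N], and the subsets attaining it."""
    subsets = [set(c) for c in combinations(range(1, N + 1), s)]
    best = max(r_value(S) for S in subsets)
    return best, [sorted(S) for S in subsets if r_value(S) == best]

best, sets = maximisers(3, 8)
print(best, sets)   # 3 [[1, 2, 3], [2, 4, 6]]
print(all(is_ap_with_difference_x1(S) for S in sets))
```

In this instance the only 3-subsets of [1, 8] with r-value 3 are {1, 2, 3} and {2, 4, 6}, both arithmetic progressions whose common difference equals their smallest element.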
We now ask: what is the smallest possible r-value for a set of size s? It is not always possible to obtain a minimum r-value of 0. In fact, the following result describes the situation precisely.
• y_1 ≥ 3. This is not possible since if T = {k, 2k, ..., kt} with k ≥ 3 then S cannot have its last t terms forming an arithmetic progression with common difference 3.
In the case when s ≤ N/2, the techniques of the above proof cannot be used to characterize the structure of an f_{s,N}-closed set with f_{s,N} = 0. In fact, the difficulty of determining the structure of all sum-free sets is well-known. We observe that the case t = s with N = 2s can be shown to have the same two possibilities as the t = s − 1 case, namely an interval or arithmetic progression with common difference 2. Section 4 establishes results on the structure of r-closed sets with small r-values, including this case.
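The remark above that a minimum r-value of 0 is not always attainable can be illustrated by exhaustive search. The sketch below (with the hypothetical helper min_r) tabulates the smallest r-value over all s-subsets of [1, N] for a few small s.

```python
from itertools import combinations, product

def r_value(S):
    S = set(S)
    return sum(1 for x, y in product(S, repeat=2) if x + y in S)

def min_r(s, N):
    """Smallest r-value over all size-s subsets of [1, N] (exhaustive search)."""
    return min(r_value(c) for c in combinations(range(1, N + 1), s))

for s in range(2, 6):
    # One entry for each N = s, s + 1, ..., 2s.
    print(s, [min_r(s, N) for N in range(s, 2 * s + 1)])
```

For instance, min_r(3, 4) is 1 (attained by {2, 3, 4}) while min_r(3, 5) is 0 (attained by {3, 4, 5}).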

Sets corresponding to small r-values
In this section, techniques are developed which allow us to describe the structure of sets with r-values equal or close to f_{s,N}.

Set structure when
Proof. For (a), |(S − S)^+| is at least s − 1 and at most x_s − x_1. The reverse implication is easily seen to hold by taking i = s − 1. For the forward implication, let |(S − S)^+| ≤ (s − 1) + k but suppose that, for some 1 ≤ j < s − 1, ∪_{i=1}^{j} D_S(i) contains at least j + k + 1 distinct values. Choose j to be the smallest such. But then each of the s − 1 − j sets D_S(j + 1), ..., D_S(s − 1) contributes at least one element which did not occur in the previous sets (namely x_s − x_{s−j−1}, ..., x_s − x_1), and so |(S − S)^+| ≥ (j + k + 1) + (s − 1 − j) = s + k, which is impossible. For part (b), part (a) implies that D_S(j + 1) \ D_S(j) contains at most one value, while clearly it contains at least one value, namely x_s − x_{s−j−1}, so it contains precisely one value. Hence ∪_{i=1}^{j+1} D_S(i) contains exactly j + k + 1 distinct values; repeating this argument proves the claim for all l > j.
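To make the objects in this proof concrete, the sketch below computes the positive difference set (S − S)^+ and the vectors D_S(j). The definition of D_S(j) used here, namely the list of differences x_{i+j} − x_i for 1 ≤ i ≤ s − j, is an assumption inferred from the way D_S(j) is used in the proof, and the function names are hypothetical.

```python
def positive_differences(S):
    """The set (S - S)^+ of positive differences of elements of S."""
    xs = sorted(S)
    return {b - a for i, a in enumerate(xs) for b in xs[i + 1:]}

def D(S, j):
    """Assumed reading of D_S(j): the differences x_{i+j} - x_i, in order."""
    xs = sorted(S)
    return [xs[i + j] - xs[i] for i in range(len(xs) - j)]

def cumulative_counts(S):
    """Number of distinct values in D_S(1) u ... u D_S(j), for j = 1, ..., s - 1."""
    seen, counts = set(), []
    for j in range(1, len(S)):
        seen.update(D(S, j))
        counts.append(len(seen))
    return counts

S = {1, 2, 4, 5, 6}
print(sorted(positive_differences(S)))  # [1, 2, 3, 4, 5]
print(D(S, 1))                          # [1, 2, 1, 1]
print(cumulative_counts(S))             # [2, 3, 4, 5]
```

Here s = 5 and |(S − S)^+| = 5 = (s − 1) + 1, and the union of the first j vectors contains j + 1 distinct values for each j, in line with the case k = 1 of the argument above.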
We have the following immediate corollary:
For s > 4, S has one of the following forms:
Proof. Let s ≥ 5. We prove the equivalent claim that D_S(1) has the form (a^i, 2a, a^j) or (2a, a^k, 2a) for some i, j, k ≥ 0 (where the notation a^i denotes i consecutive entries, each with value a). By Lemma 4.1 (a), ∪_{i=1}^{j} D_S(i) contains at most j + 1 distinct values for 1 ≤ j ≤ s − 1. In fact, D_S(1) contains exactly two values, since a single-valued set would correspond to an arithmetic progression and hence |(S − S)^+| = s − 1. Thus by Lemma 4.1 (b), precisely one new value occurs in moving from
Consider the vector D_S(1) as corresponding to a word (of length s − 1) in two symbols {a, b} (a, b ∈ (S − S)^+). We now ask: which words form valid vectors? Consideration of D_S(2) shows that
Note that, for s = 2, 3, this approach gives no restriction on the structure of D_S(1). One immediate application of these results is to establish the following facts about 0-closed sets.
, so this case is impossible as S cannot lie within [1, 2K].
If N = 8, then either S is an arithmetic progression with common difference 1 or 2, or

Sets with small r-values
Here we prove a structural result and, using the same technique, a non-existence result which will form the base case of an inductive argument in the next section. Throughout this section, we assume s > 3.
Here V is an arithmetic progression, which by size considerations has common difference 1 or 2. If the difference is 1, The only remaining case is that V is a (v + 2)-term arithmetic progression with second and second-last elements deleted. Here, Finally, the case when v = 4, i.e. N = 9, can be established by direct verification, e.g. using GAP [2].
We remark that an analogous proof technique can be applied to establish the structure of other s-sets in [1, N] with s close to N/2 and r(S) close to 0. We now use a similar approach to prove a non-existence result.
Proof. By Theorem 3.3, f_{s,N} = 1. We first suppose that N > 8. With a view to obtaining a contradiction, we suppose that there exists S of size s with r(S) = f_{s,N} + 1 = 2.
There are two possibilities for S:
(a) There exists precisely one x ∈ S ∩ (S + S), which has precisely two representations as a(n ordered) sum in S + S, namely x = a + b = b + a for a ≠ b ∈ S;
(b) There exist precisely two elements x ≠ y ∈ S ∩ (S + S), and each has a single representation as a(n ordered) sum in S + S: x = a + a, y = b + b for some a ≠ b ∈ S.
Case (a) Deleting x from S yields a 0-closed set of maximum size N/2, which must be either an interval (i.e.
It now remains to show that the options |(U − U)^+| = u − 1 and u lead to a contradiction.
But a ∈ U and a < N/2 − 1, hence this case is impossible.
• |(U − U)^+| = u: Assume first that u ≥ 5. By Proposition 4.3, U is either a (u + 1)-term arithmetic progression with one (non-extremal) term deleted, or a (u + 2)-term arithmetic progression with the second and second-last terms deleted. By size considerations, such an arithmetic progression must either be an interval or, when U has u + 1 terms, {1, 3, 5, ..., N − 1}. But since 2b ∈ U, the latter is not possible, so U is an interval with one or two elements deleted.
First suppose U is an interval of length u + 1 = N/2 with one element deleted. Since U is maximal 0-closed, the interval of length N/2 cannot itself be 0-closed, and so must be [i, i+  2.4, an interval has an r-value of the form k(k−1)/2 for some k), so this is impossible. Hence there exists no subset S of [1, N] with |S| = s = N/2 + 1 which has r(S) = f_s + 1 = 2, for N > 8. Direct checking (e.g. computationally using GAP) establishes the result for N = 6 and N = 8.
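The direct check for N = 6 and N = 8 mentioned at the end of this proof is small enough to reproduce by exhaustive search. The following Python sketch is an independent illustration and is not the GAP computation referred to in the text; the function name r_values_of_size is hypothetical.

```python
from itertools import combinations, product

def r_value(S):
    S = set(S)
    return sum(1 for x, y in product(S, repeat=2) if x + y in S)

def r_values_of_size(s, N):
    """Set of r-values attained by the size-s subsets of [1, N]."""
    return {r_value(c) for c in combinations(range(1, N + 1), s)}

for N in (6, 8):
    s = N // 2 + 1
    values = r_values_of_size(s, N)
    # Reports whether the value 2 is attained by some s-subset of [1, N].
    print(N, s, sorted(values), 2 in values)
```

For N = 6 and s = 4 there are only 15 subsets to examine, and the r-values attained are 1, 3, 4, 5 and 6.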
We immediately have the following consequence.

Establishing the spectrum of r-values
At the outset, we posed the following question: for N ∈ N and 1 ≤ s ≤ N, does there exist a size-s set S with r(S) = v for each v ∈ [f_s, g_s]? In the previous section, it was shown that the answer cannot be in the affirmative for every N and s. To address this question in a general setting, we will consider the problem from a different angle: first specify s ∈ N, and make the choice of interval a secondary consideration.
Here R(s) denotes the set of r-values attained by size-s subsets of N, and R(s, N) the set of r-values attained by size-s subsets of [1, N]. By Section 3, the maximum possible value in R(s) is s(s−1)/2, and this is attainable in any interval [1, N] with N ≥ s (take the set [1, s]; there are other possible s-sets for sufficiently large N). At the other extreme, Section 3 showed that the minimum possible value in R(s) is 0, but that this is attainable in an interval [1, N] only if N ≥ 2s − 1. For s ≤ N < 2s − 1, the minimum r-value for a size-s set in [1, N] is given by
In the remainder of the paper, we will prove the following result describing the spectrum of values. Let s (> 2) ∈ N.
Proof. All r-values are obtained as r([1, s + 1] \ x) for x ∈ [1, s + 1]; apply Lemma 2.7 to the size t = s + 1 interval T := [i, i + (t − 1)] with i = 1.
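The spectrum R(s, N) discussed in this section can be enumerated by brute force for small parameters, which gives a quick way to observe the "tipping point" at N = 2s − 1. The sketch below uses hypothetical function names and an exhaustive search, so it is intended only for small s.

```python
from itertools import combinations, product

def r_value(S):
    S = set(S)
    return sum(1 for x, y in product(S, repeat=2) if x + y in S)

def spectrum(s, N):
    """Sorted list of r-values attained by the size-s subsets of [1, N]."""
    return sorted({r_value(c) for c in combinations(range(1, N + 1), s)})

def smallest_full_N(s):
    """Smallest N for which every value in [0, s(s-1)/2] is attained in [1, N]."""
    full = list(range(s * (s - 1) // 2 + 1))
    N = s
    while spectrum(s, N) != full:
        N += 1
    return N

print(spectrum(4, 5))   # [3, 4, 5, 6]
print(spectrum(4, 6))   # [1, 3, 4, 5, 6]
print(spectrum(4, 7))   # [0, 1, 2, 3, 4, 5, 6]
for s in (3, 4, 5):
    print(s, smallest_full_N(s), 2 * s - 1)
```

For s = 4 the value 2 is first attained in [1, 7], for example by {2, 5, 6, 7}, and the value 0 by {4, 5, 6, 7}.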

The non-exceptional range
We now establish results leading to a proof that the smallest N such that R(s, N) = [0, s(s−1)/2] is N = 2s − 1.
Deleting x from I: since a − 1 ≤ x ≤ s − 1, x occurs as a summand and may also occur as a sum. As a summand, there are (s − x + 2) pairs of the form (x, a − 1), (x, a), ..., (x, s + a − x); doubling this to count all pairs will overcount by precisely one, so there are 2(s − x) + 3 pairs here in total (since one pair is counted twice). These are the only pairs if x does not occur as a sum; however, if x ≥ 2a − 2, there are also pairs corresponding to x as a sum: these pairs are (a − 1, x − (a − 1)), (a, x − a), ..., (x − a + 1, a − 1) and so there are x − 2a + 3 such. There is clearly no overlap in the sum/summand counts.
Finally, consider the overlap between pairs counted in the x and the (s + a − 1) cases. The quantity s + a − 1 will have x as a summand in the pairs (x, s + a − 1 − x) and (s + a − 1 − x, x); two pairs unless x = s + a − 1 − x, i.e. x = (s+a−1)/2, in which case it is just a single pair. So after subtracting both quantities for the two cases, we must add 2 unless x = (s+a−1)/2, when we add 1 instead. No other type of overlap is possible. Hence
The next proposition complements the previous result.
Proof. The proof of Proposition 5.2 can be replicated, with 2 adaptations:
• in replacing s + a − 1 by s + a as the deleted element, the number of pairs to be subtracted to account for its deletion is increased by 1 to s − a + 3.
• in considering the overlap between pairs counted in the x and the (s + a) cases, s + a will have x as a summand in the pairs (x, s + a − x) and (s + a − x, x); these pairs are distinct unless x = s + a − x, i.e. x = (s+a)/2. Hence, after subtracting both quantities for the two cases, 2 must be added unless x = (s+a)/2, when 1 must be added instead.
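The bookkeeping used in these two proofs is an inclusion-exclusion over the pairs in which each deleted element takes part. The sketch below illustrates that bookkeeping for a general set; it is not a transcription of the propositions themselves. The identification I = [a − 1, s + a] is an assumption inferred from the pair counts quoted in the proofs, and the instance s = 6, a = 3 and all function names are hypothetical.

```python
from itertools import product

def pairs(S):
    """All ordered pairs (x, y) in S x S with x + y in S."""
    S = set(S)
    return {(x, y) for x, y in product(S, repeat=2) if x + y in S}

def involving(S, z):
    """Pairs counted by r(S) in which z appears as a summand or as the sum."""
    return {(x, y) for (x, y) in pairs(S) if z in (x, y, x + y)}

def r_after_deleting_two(S, x, m):
    """r(S \\ {x, m}) via inclusion-exclusion on the pairs involving x or m."""
    Px, Pm = involving(S, x), involving(S, m)
    return len(pairs(S)) - len(Px) - len(Pm) + len(Px & Pm)

# Assumed instance: s = 6, a = 3, I = [a - 1, s + a] = [2, 9], m = s + a - 1 = 8.
I = set(range(2, 10))
for x in range(2, 6):                       # a - 1 <= x <= s - 1
    overlap = involving(I, x) & involving(I, 8)
    direct = len(pairs(I - {x, 8}))
    print(x, len(overlap), r_after_deleting_two(I, x, 8) == direct)
```

In this instance the overlap has size 2 for x = 2, 3, 5 and size 1 for x = 4, where 2x = s + a − 1, matching the correction term described in the text.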
Note that in Proposition 5.

Describing the exceptional values
We are aiming to show that R(s, N) equals [f_{s,N}, g_{s,N}] with some "missing" values if and only if s + 2 ≤ N ≤ 2s − 2. The following theorem shows that, for N in the stated range, it is never possible for the size-s subsets of [1, N] to attain all values in the interval [f_{s,N}, g_{s,N}].
Proof. Let 1 ≠ a ∈ N. We will prove that, for s ≥ a + 2, R(s, s + a) does not contain f_{s,s+a} + 1. We will use induction on s. The base case is s = a + 2: we must show that R(a + 2, 2a + 2) does not contain f_{a+2,2a+2} + 1, i.e. that R(b, 2b − 2) does not contain f_{b,2b−2} + 1 for b = a + 2 (≥ 4) ∈ N.
Consider r(T ∪ {m + a}). Since all elements of T are smaller than m + a, clearly t + (m + a) ∉ T for all t ∈ T; also 2(m + a) ∉ T ∪ {m + a}. So any new contribution to the r-value from the adjoining of m + a must correspond to its arising as a sum in T + T. Now, in [
But clearly the first option is impossible, as there is a unique T with this r-value, which has α = a and was dealt with above. The second case is also impossible, since it requires r(T) to be less than the minimum possible. The only possibility is the third case, but here we must have i = 1; we cannot have i > 1 since this would force α = a + i − 1 > a.
We now establish that all exceptional r-values must be of the form f_{s,N} + k where k is odd and lies in a restricted range.
for 1 ≤ a + 2 ≤ s − 1 and 0 for a + 2 > s − 1. We first consider the former case, i.e. 1 ≤ a ≤ s − 3. Consider adjoining an element x, 1 ≤ x ≤ a + 1. The element x cannot occur as a sum with summands from I ∪ {x}; we consider its role as a summand. The pair (x, x) yields 2x ∈ I precisely if a + 2 ≤ 2x ≤ s + a. Since x ≤ a + 1 ≤ a + 3/2 = (2a+3)/2 ≤ (s+a)/2 (using a ≤ s − 3), 2x can never be too large to lie in I, and so 2x ∈ I precisely if (a+2)/2 ≤ x ≤ a + 1 (clearly a+2
Since here (s−a−1)(s−a)/2 = 1, this is of the same form as the main case.
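The observation used in the inductive step, that adjoining a new largest element m to a set T can only create pairs in which m arises as a sum in T + T, can be checked directly. The sketch below uses hypothetical function names and an arbitrary example set.

```python
from itertools import product

def r_value(S):
    S = set(S)
    return sum(1 for x, y in product(S, repeat=2) if x + y in S)

def representations(T, m):
    """Number of ordered pairs (t, u) in T x T with t + u = m."""
    T = set(T)
    return sum(1 for t in T if m - t in T)

def r_after_adjoining_max(T, m):
    """r(T u {m}) for m > max(T): only sums equal to m contribute new pairs."""
    assert m > max(T)
    return r_value(T) + representations(T, m)

T = {2, 3, 4}
print(r_value(T), representations(T, 6), r_value(T | {6}))   # 1 3 4
```

Here r(T) = 1 (from 2 + 2 = 4), and 6 has the three ordered representations 2 + 4, 4 + 2 and 3 + 3, giving r(T ∪ {6}) = 4.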
Combining several previous results yields the following description and upper bound for the set of exceptions.

Concluding remarks
In this paper, a comprehensive description has been given of the behaviour of the r-values of subsets of [1, N] ⊆ N. The range of possible r-values for any s-set has been described, and exceptional values have been shown to possess a very specific form. We end this paper by conjecturing that the set shown to contain the exceptional values is precisely the set of exceptions:
Inductive proof strategies for this conjecture, extending the approach used for Theorem 5.6, encounter problems due to the presence of the floor and ceiling functions, suggesting that an alternative approach may be required.
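For small parameters, the exceptional values can be listed by exhaustive search, which may be useful for experimenting with the conjecture. The following sketch makes no claim about the general form of the exceptions; the function name missing_values is hypothetical, and the search is only feasible for small s and N.

```python
from itertools import combinations, product

def r_value(S):
    S = set(S)
    return sum(1 for x, y in product(S, repeat=2) if x + y in S)

def missing_values(s, N):
    """Values strictly between the least and greatest attained r-values
    that no size-s subset of [1, N] attains."""
    values = {r_value(c) for c in combinations(range(1, N + 1), s)}
    return [v for v in range(min(values), max(values) + 1) if v not in values]

print(missing_values(4, 5))   # []
print(missing_values(4, 6))   # [2]
print(missing_values(4, 7))   # []
for N in (7, 8):
    print(5, N, missing_values(5, N))
```

For s = 4 the only exceptional case is N = 6, where s + 2 ≤ N ≤ 2s − 2, and the single missing value 2 exceeds the minimum 1 by the odd number 1.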