Cumulative subtraction games

We study zero-sum games, a variant of the classical combinatorial subtraction games (studied for example in the monumental work "Winning Ways" by Berlekamp, Conway and Guy), called Cumulative Subtraction (CS). Two players alternate in moving, and get points for taking pebbles out of a joint pile. We prove that the outcome in optimal play (game value) of a CS with a finite number of possible actions is eventually periodic, with period $2s$, where $s$ is the size of the largest available action. This settles a conjecture by Stewart in his Ph.D. thesis (2011). Specifically, we find a quadratic bound, in the size of $s$, on when the outcome function must have become periodic. In the case of two possible actions, we give an explicit description of optimal play. We generalize the periodicity result to games with a so-called reward function, where, at each stage of the game, the change of `score' does not necessarily equal the number of pebbles you collect.


Introduction
Two players, Alice and Bob, stand next to a single pile of 7 pebbles, alternately taking pebbles from it. They compete on who takes the most pebbles. However, there is a restriction on the number of pebbles they may take each turn: on each turn a player must take exactly 2 or 3 pebbles. Now we ask the question: if Alice starts, should she play greedily and take 3, or make a sacrifice and take 2?
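The question can be settled by a short dynamic program over heap sizes. Below is a minimal sketch in Python (the function and variable names are ours, not from the paper):

```python
def outcome(S, n):
    # o[x] = final score difference under optimal play from a heap of x,
    # from the mover's point of view; positions below min(S) are terminal.
    o = [0] * (n + 1)
    for x in range(min(S), n + 1):
        o[x] = max(s - o[x - s] for s in S if s <= x)
    return o

o = outcome({2, 3}, 7)
# the best first move is the largest action achieving the optimal value
best_first_move = max(s for s in {2, 3} if s - o[7 - s] == o[7])
print(o[7], best_first_move)  # 1 2
```

So Alice should sacrifice and take 2: she then ends up exactly one pebble ahead, whereas the greedy first move 3 only draws.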
In this paper we study generalizations of this game, called cumulative subtraction (CS). CS is related to the famous game of nim. It has a similar type of moves, but a different winning condition. We restrict attention to games with a single heap and a common finite action set of size at least 2.
Definition 1 (cumulative subtraction). An instance of cumulative subtraction (CS), (S, x, p), is composed of a finite action set S, where |S| ≥ 2, a heap of x ∈ Z_{≥0} pebbles, and a current score p. In case p = 0, the game is also denoted by (S, x), and when the heap size is generic, the game is simply called S (e.g. S is viewed as a ruleset). It is a two-player game, in which the players Positive and Negative take turns moving. A position is denoted by (x, p). Positive's moves are of the form (x, p) → (x − s, p + s), for some s ∈ S, provided that x − s ≥ 0. Negative's moves are of the form (x, p) → (x − s, p − s), for some s ∈ S, provided that x − s ≥ 0. A position (t, p_t) is terminal if t < min S. The result of a game is the terminal score p_t.
We are interested in optimal play of CS, which is a zero-sum game, where Positive is the 'maximizer' and Negative is the 'minimizer'. Optimal play is reflected in the outcome function.
Definition 2 (Outcome function). Given a game S, the outcome function o : Z_{≥0} → Z is defined by o(x) = 0 if x < min S, and otherwise o(x) = max {s − o(x − s) | s ∈ S, s ≤ x}.
Note that the maximizing action in Definition 2 might not be unique. However, uniqueness is a convenient tool in proofs of optimal play, as is further explained via Lemma 2. For this purpose we define the opt-function.
Definition 3 (Optimal action). Given a game S, the optimal action, opt : Z_{≥min S} → S, is the mapping from the set of non-terminal positions to the maximum action s such that o(x) = s − o(x − s).
Note that increasing the starting score by p points will increase the outcome by p, but it will not change the optimal sequence of actions.
Observation 1. The outcome is the (von Neumann [5] PSPE) game value if the initial score is 0 and Positive starts.
Definition 4 (Game convergence). A game S converges at position x > 0 if, for all positions y ≥ x, opt(y) is constant, but opt(x − 1) ≠ opt(y). This is denoted by ξ(S) = x. If there is no such x then the game does not converge.
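Convergence can be inspected empirically: compute the opt-function up to some bound and find the last position where the largest action is not chosen. A sketch, assuming the game does converge within the chosen bound (the bound and the helper names are our assumptions):

```python
def opt_table(S, n):
    # opt[x] = largest action s with o(x) = s - o(x - s), per Definition 3
    o, opt = [0] * (n + 1), [None] * (n + 1)
    for x in range(min(S), n + 1):
        moves = [s for s in S if s <= x]
        o[x] = max(s - o[x - s] for s in moves)
        opt[x] = max(s for s in moves if s - o[x - s] == o[x])
    return o, opt

def xi(S, bound=500):
    # empirical convergence point: one past the last non-greedy optimal action
    _, opt = opt_table(S, bound)
    return 1 + max(x for x in range(min(S), bound + 1) if opt[x] != max(S))

print(xi({2, 3}), xi({5, 7}))  # 8 31
```

For instance, the ruleset of the introduction, S = {2, 3}, converges at position 8: the sacrifice at heap size 7 is the last non-greedy optimal action.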
Definition 5 (Eventual periodicity). A function g : Z_{≥0} → Z is eventually periodic if there is a p ∈ Z_{>0} such that g(x) = g(x + p) for all sufficiently large x ∈ Z. If p is the smallest such number, then g is eventually periodic with period p.
Because of our convention that Positive starts, the relative number of actions the players play throughout the game is constrained: either Positive and Negative play the same number of actions, or Positive has one extra turn.
Definition 6 (Greedy action and sacrifice). Consider a game (S, x). A greedy action is max{s ∈ S | s ≤ x}, and a sacrifice is any action that is not greedy.
Lemma 1. For all games (S, x), the outcome is bounded between 0 and the maximum action, i.e. 0 ≤ o(x) ≤ max S.
Proof. Since Positive plays at least as many actions as Negative, by playing greedily she guarantees a result of at least 0. Since Positive plays at most one more action than Negative, if Negative plays greedily he guarantees a result of at most max S.

Contribution
Our main result (see Section 3) is that all CS games converge (to the maximum action), and thus the outcome of any CS is eventually periodic. The results are:
1. In Theorem 3 we give an upper bound on the convergence of any CS game. The bound is quadratic in max S.
2. In Corollary 5 we prove that any game S is eventually periodic with period 2 max S. That is, o(x + 2 max S) = o(x), for any large enough position x. This proves a conjecture by Stewart from [4].
3. In Theorem 6, we fully solve the case where all actions up to max S are permitted (the case of full support).
4. In Theorem 12 we describe explicitly the optimal play for the class of games with exactly two actions, and in Corollary 13 we specify the corresponding outcome. In Corollary 14 we give an explicit formula for convergence.
5. Section 6 concerns so-called truncated games (small actions are cut out). We were not able to solve the whole class, but we guide the reader towards an intriguing conjecture, Conjecture 16.
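The periodicity claim of item 2 can be stress-tested numerically for any particular ruleset. The sketch below first locates the last non-greedy optimal action empirically, and then checks the period 2 max S beyond it (rulesets and bounds are our own choices):

```python
for S in ({2, 3}, {5, 7}, {3, 7, 11}):
    n, top = 1000, max(S)
    o, opt = [0] * (n + 1), [0] * (n + 1)
    for x in range(min(S), n + 1):
        moves = [s for s in S if s <= x]
        o[x] = max(s - o[x - s] for s in moves)
        opt[x] = max(s for s in moves if s - o[x - s] == o[x])
    # last position where the optimal action is not the greedy one
    last = max(x for x in range(min(S), n + 1) if opt[x] != top)
    p = 2 * top
    # past that point the outcome repeats with period 2*max(S)
    assert all(o[x] == o[x + p] for x in range(last + 1, n - p))
print("period 2*max(S) confirmed on samples")
```

Once opt is constantly max S, the recursion o(x) = max S − o(x − max S) immediately forces o(x + 2 max S) = o(x), which is what the check exploits.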

CS with arbitrary support
In this section we do not restrict S beyond its definition, a finite set of size at least two. Let us begin with a general lemma.
Lemma 2. For any pair of sequences of optimal play actions, if one of the players, say Positive, switches the order between two actions such that the larger action is played before the smaller one, then this switch cannot decrease the outcome.
Proof. By this switch, the opponent does not get any new playing possibilities, and thus the opponent's new optimal play is a sequence of actions that were available before the switch.
By this lemma, without loss of generality, in this section we assume that both players play non-increasing sequences of actions, and in particular, for each game, optimal play will give the unique sequence of actions prescribed by the opt-function.
Definition 7. Given a game S, the endgame is the set of positions strictly smaller than max S. A player enters the endgame, playing from position x ≥ max S, if she plays an action a such that x − a < max S. The term endgame play refers to the action that enters the endgame together with all subsequent moves.
Proof (of Theorem 3). Consider play from some large position until one of the players enters the endgame. Suppose that one of the players' strategies, say Positive's, consists in playing at least max S sacrifices before the endgame. We will find a strategy σ for Negative that produces a negative outcome. By Lemma 1, this will imply that Positive's strategy cannot be optimal play.
Negative's strategy σ is greedy play. Positive has played at least max S sacrifices before the endgame. There are two cases: (i) Positive enters the endgame; (ii) Negative enters the endgame. In case (i), the score just before Positive enters the endgame is no more than − max S. In case (ii), the score after Negative enters the endgame is no more than − max S. This is true in both cases because Negative's greedy strategy σ consists exclusively of max S actions, whereas Positive has played at least max S sacrifices, and the score decreases by at least 1 with each sacrifice.
By Lemma 2 we may assume that players play non-increasing actions, i.e., at each stage of the game, if more than one action produces the optimal outcome, players will choose the largest of those actions. Therefore, in case (i) Positive enters the endgame by a sacrifice, hence the score remains strictly smaller than 0 after Positive's action. Thus, Negative assures an outcome strictly smaller than 0 (by Lemma 1 applied with the players in reversed roles).
In case (ii), when Negative enters the endgame, Positive plays first below the heap size max S. By definition of the endgame, Positive can increase the score by at most max S − 1.
Thus, either way the outcome will be negative, and by the lower bound in Lemma 1, we have reached the desired contradiction.
Therefore any optimal strategy, by either player, must consist of fewer than max S sacrifices.
This gives the bound in the theorem, because Positive plays fewer than max S sacrifices in optimal play.

CS with full support
Consider a CS where the set of possible actions contains all the integers from 1 up to s_1, i.e., S = {1, 2, . . . , s_1}. We call this game CS with full support. In this game, optimal play is to play greedily at each position.
Theorem 6. In CS with full support, the optimal action is x for any position x < s_1 and s_1 for any position x ≥ s_1. That is, each CS with full support converges at s_1, and moreover its outcome is periodic with the pattern
(0, 1, . . . , s_1, s_1 − 1, . . . , 1). (3)
Proof. The proof is by induction. For the base case, consider 0 ≤ x ≤ s_1: when playing from position x, Positive takes all the pebbles, and thus o(x) = x. When playing from position x + s_1, Positive's optimal play is to take s_1, and Negative takes the rest. This is Positive's optimal play, since if she takes fewer than s_1 pebbles, then Negative can take more than x. Assume k > 0 repetitions of the pattern (3). We study the next s_1 positions and show that the outcome in those positions will be exactly as in (3).
For the following s_1 positions the outcome is o(x) = s_1 − o(x − s_1), which reproduces the pattern (3).
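The pattern of Theorem 6 is easy to confirm by brute force; a short sketch (the choice of s_1 and the bound are ours):

```python
def outcome(S, n):
    # o[x] = optimal score difference from a heap of x (Definition 2)
    o = [0] * (n + 1)
    for x in range(min(S), n + 1):
        o[x] = max(s - o[x - s] for s in S if s <= x)
    return o

s1 = 6
S = set(range(1, s1 + 1))                     # full support
o = outcome(S, 10 * s1)
# the pattern (0, 1, ..., s1, s1-1, ..., 1) of length 2*s1
pattern = list(range(s1 + 1)) + list(range(s1 - 1, 0, -1))
assert all(o[x] == pattern[x % (2 * s1)] for x in range(10 * s1 + 1))
print("pattern verified for s1 =", s1)
```

Note that for full support the pattern holds from position 0 on, not merely eventually, matching the statement that the game converges already at s_1.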

CS with two actions
In a game with just two possible actions, S = {s_2, s_1}, with s_1 > s_2, we characterize the set of positions where it is optimal to sacrifice; this set will be called X* (see Definition 8 and Theorem 12).
We think of α = s_1 − s_2 as the size of the sacrifice a player makes by taking just s_2 instead of the greedy action s_1.
Definition 8. For i ∈ Z_{>0}, let X*(i) = {i s_2 + (i − 1) s_1 + δ | 0 ≤ δ ≤ α − 1}, whenever
i s_2 > (i − 1) s_1, (4)
and otherwise X*(i) = ∅. Let X* = ∪_{i ≥ 1} X*(i).
Inequality (4) means that i sacrifices are worth more than (i − 1) greedy actions.
A simple observation is that no player can benefit from playing more than ⌊s_1/α⌋ sacrifices.
Proof. Suppose that i ∈ Z_{>0} counts the number of sacrifices by Positive, and assume i s_2 < (i − 1) s_1, where Negative plays i − 1 greedy actions. Then o(x) < 0, which is impossible by Lemma 1. Hence, for any optimal strategy we must have i s_2 ≥ (i − 1) s_1. Therefore, s_1 ≥ iα, which implies the lemma since i is an integer. Moreover, observe that ⌊s_1/α⌋ = ⌊(s_2 + α)/α⌋ = 1 + ⌊s_2/α⌋.
Note that i_max is the largest i such that (4) holds. E.g., in Example 2, i_max = 3. We will see that i_max − 1 is the maximum number of sacrifices a player can beneficially make, to win an extra turn. In Example 2, 2 sacrifices are still beneficial since 3·5 > 2·7, but 3 sacrifices are not, since 4·5 ≯ 3·7.
Next, we develop a tool, Positive's 'complementary strategy', which gives a lower bound on the result, and which equals the outcome if Negative plays optimally (see Lemma 8).
Lemma 8. From a position x ∈ X*(i), Positive's optimal play is the complementary strategy and Negative's optimal play is the greedy strategy. The outcome is
o(x) = i s_2 − (i − 1) s_1. (6)
Proof. If Positive plays the complementary strategy, this produces at least the result in (6). Moreover, Positive gets i turns by the complementary strategy vs. Negative's i − 1 turns. By definition of X*(i), we let x = i s_2 + (i − 1) s_1 + δ, for some δ ∈ {0, . . . , α − 1}. Suppose that Positive deviates from the complementary strategy and plays at least one greedy action; then Negative can play only greedy actions and play the last turn. By estimating the number of remaining pebbles for Positive, Positive can play at most i − 1 actions. If this were optimal play, the outcome would be at most 0, which contradicts (6). Suppose that Negative deviates from the greedy strategy; then Positive still plays the complementary strategy and gets an extra α for each deviation of Negative.
Equivalently, for x ∈ X*(i),
o(x) = s_2 − (i − 1)α, (7)
and a consequence of this is Lemma 9. Look at Table 1: the outcome is 0 in the positions just below X*(i), since both players will play s_1 until the game ends, and they will have equal numbers of turns.
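The outcome formula of Lemma 8 on the sets X*(i) can be cross-checked against the dynamic program; a sketch for the ruleset S = {5, 7} (helper names are ours):

```python
def outcome(S, n):
    # o[x] = optimal score difference from a heap of x
    o = [0] * (n + 1)
    for x in range(min(S), n + 1):
        o[x] = max(s - o[x - s] for s in S if s <= x)
    return o

s2, s1 = 5, 7
alpha = s1 - s2
o = outcome({s2, s1}, 100)
i = 1
while i * s2 > (i - 1) * s1:                   # inequality (4)
    for d in range(alpha):                     # X*(i) = i*s2 + (i-1)*s1 + {0,...,alpha-1}
        x = i * s2 + (i - 1) * s1 + d
        assert o[x] == i * s2 - (i - 1) * s1   # Lemma 8: i sacrifices vs i-1 greedy actions
    i += 1
print("checked X*(i) for i <", i)  # the loop stops at i = 4 for S = {5, 7}
```

For S = {5, 7} this confirms outcomes 5, 3, 1 on X*(1) = {5, 6}, X*(2) = {17, 18}, X*(3) = {29, 30}, respectively.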
Lemma 9. Suppose that x = min X*(i), for any 1 ≤ i ≤ i_max, and y is such that 2(i − 1)s_1 ≤ y < x. Then o(y) = 0, and if y is non-terminal the optimal action is opt(y) = s_1.
Proof. By assumption, 2(i − 1)s_1 ≤ y < x, where the upper bound is by x = min X*(i). If Positive starts by playing s_2, then by the lower bound, Negative can play i − 1 turns of s_1, whereas by the upper bound, Positive can play at most i − 1 actions in total. Hence the result is negative, which is not optimal, by Lemma 1.
On the other hand, by the lower bound, if both players play greedily, the result is 0. This is the outcome, because Positive cannot do better, and by Lemma 1, neither can Negative.
The following result is used in the second part of the proof of Theorem 12.
Lemma 11. Consider S = {s_2, s_1} and α = s_1 − s_2. If x ∉ X* and x ≥ α, then
o(x − α) ≥ o(x) − α. (9)
Proof. We study the function η(x) = o(x − α) − o(x) + α, and show that η(x) ≥ 0 if x ∉ X* and x ≥ α. We think about o(x) as the outcome when Positive starts, and −o(x) as the outcome when Negative starts. It suffices to show that, for all plays by Negative from x, there is a response by Positive from x − α such that the inequality (9) holds.
Case 2: If there is a move from x, but no move from x − α, then s_2 ≤ x < s_1; thus x ∈ X*(1), which is excluded by the assumption x ∉ X*.
Case 3: If there is a move from x, and a move from x − α, then:
1. If Negative plays optimally s_1 from x, and Positive plays s_2 from x − α, both plays reach the position x − s_1, and we get η(x) ≥ α − s_1 + s_2 = 0.
2. If Negative plays optimally s_2 from x, and Positive plays s_2 from x − α, we get η(x) ≥ α + o(x − s_2) − o(x − s_1), and we note that, if Negative has no move from x − α − s_2 = x − s_1, then this implies η(x) ≥ 0. Assume Negative has a move; then there are two cases:
2.1 On the second move, if Negative plays optimally s_2, and Positive plays s_1, we get η(x) ≥ 0.
2.2 On the second move, if Negative's optimal move is s_1, and Positive responds with s_1, we get η(x) ≥ η(x − s_2 − s_1), and, by definition of X*, if x ∉ X* then x − s_1 − s_2 ∉ X*. Therefore η(x − s_2 − s_1) ≥ 0 by induction.
This concludes the proof of inequality (9).
The following theorem is the main result for CS with two actions.
Theorem 12. Consider the game S = {s_2, s_1}, with s_1 > s_2. For each position x, the optimal action is opt(x) = s_2 if and only if x ∈ X*.
Proof. We prove that opt(x) = s_2 if and only if x ∈ X*. The proof is split into two cases:
(i) 2s_2 ≤ s_1: a sacrifice is never optimal;
(ii) 2s_2 > s_1: a sacrifice is optimal if and only if x ∈ X*.
In Case (i), i_max = 1, which means that it is never beneficial to sacrifice. Thus, in this case the optimal play, game convergence and periodicity are analogous to the full support case (Theorem 6). Consider Case (ii). For the direction "x ∈ X* implies opt(x) = s_2", by Lemma 8 we know that optimal play from a position x ∈ X* is given by Positive's complementary strategy, which starts by playing the action s_2.
The proof of the reverse direction "opt(x) = s_2 implies x ∈ X*" uses Lemma 11. We prove by induction that, for each position x ∉ X*, if x ≥ s_1 then s_1 is an optimal move. We begin by stating the base case.
Assume next that x ≥ 2s_1. It suffices to prove that playing s_1 is weakly better than playing s_2. There are three cases, depending on whether x − s_1 belongs to X*, neither does, or x − s_2 belongs to X*. Note that both cannot belong to X*, because x − s_2 − (x − s_1) = α, and, for all i, X*(i) consists of at most α consecutive numbers (and more than s_1 numbers separate two disjoint sets X*(i) and X*(j)).
For 1., use the statement of the theorem as the induction hypothesis, that is, opt(x − s_1) = s_2. For 2., use induction to conclude opt(x − s_1) = s_1 and opt(x − s_2) = s_1. We get that playing s_1 is weakly better than playing s_2 if Lemma 11 applies, i.e. if x − s_1 − s_2 ∉ X*. Thus, in this case we are done. The other case is whenever x − s_1 − s_2 ∈ X*. Since x ∉ X*, this case happens if and only if x − s_2 − s_1 ∈ X*(i_max). By (7) and Lemma 1, in this case,
o(x − 2s_1) ≥ 0 > o(x − s_1 − s_2) − α, (11)
where the inequality (11) is by i_max = ⌊s_2/α⌋ + 1 > s_2/α.
For 3., consider first the case i < i_max. We use the 'duality' (8) between outcomes and the number of consecutive positions with outcome 0 just below X*(i). Indeed, in this case, Lemma 9 implies that there are at least α such consecutive positions with outcome 0; that is, o(x − s_1) = 0. The remaining case is x − s_2 ∈ X*(i_max). We use that o(x − s_2) ≥ 0, and prove that o(x − s_1) = α. This suffices to prove the theorem.
Let us first sketch the idea of this final part of the proof. In fact, by our previous items, playing optimally from x − s_1, there will be an even number of greedy actions, namely 2i_max, of which the last one is s_2. This follows because none of the greedy actions will end up in X*, and we showed already that s_1 is optimal if a player does not start in X*(i), with i < i_max. Indeed, this gives the outcome α.
To finish the proof, let us justify the claim in the previous paragraph.
This proves the theorem.
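Theorem 12 is easy to stress-test numerically: the set of positions where the strictly optimal action is s_2 (ties are resolved greedily, per Definition 3) should coincide with X*. A sketch over a few rulesets of our choosing:

```python
def sacrifice_set(s2, s1, n):
    # positions whose optimal action (largest maximizer) is s2
    o, sac = [0] * (n + 1), set()
    for x in range(s2, n + 1):
        moves = [s for s in (s2, s1) if s <= x]
        o[x] = max(s - o[x - s] for s in moves)
        if max(s for s in moves if s - o[x - s] == o[x]) == s2:
            sac.add(x)
    return sac

def X_star(s2, s1):
    alpha, X, i = s1 - s2, set(), 1
    while i * s2 > (i - 1) * s1:             # inequality (4)
        X |= {i * s2 + (i - 1) * s1 + d for d in range(alpha)}
        i += 1
    return X

for s2, s1 in ((2, 3), (4, 6), (5, 7)):
    assert sacrifice_set(s2, s1, 500) == X_star(s2, s1)
print("Theorem 12 confirmed on samples")
```

For example, for S = {5, 7} both sides evaluate to {5, 6, 17, 18, 29, 30}, and for S = {2, 3} to {2, 7}.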
Let us denote by [x]_y the smallest non-negative number congruent to x modulo y.
Corollary 13. The outcomes of the game S = {s_2, s_1} are as follows.
Proof. This follows from the proof of Theorem 12.
In particular, the periodic outcome pattern, at convergence, is obtained by applying i = i max . See also Figure 2.
Note that the first three items concern the outcomes of the positions in the congruence classes 0, . . . , s 1 − 1 (mod 2s 1 ) and the last item concerns the 'anti-symmetric' part among the heap sizes s 1 , . . . , 2s 1 − 1 (mod 2s 1 ). The third item shows that once the outcomes for positions in X * (i) have been computed, then they stabilize, for congruent larger heap sizes modulo 2s 1 .
Corollary 14. Consider CS with two possible actions, S = {s_2, s_1}, with s_1 > s_2 > s_1/2. Then the largest heap size for which Positive can play the smaller action s_2 until the game ends, and obtain the optimal play outcome, is max X*(i_max) = i_max s_2 + (i_max − 1) s_1 + α − 1.
Proof. This follows by Theorem 12.
Note that the formula in Corollary 14 implies explicit game convergence; for example, in case s_1 = s_2 + 1, then ξ(S) = 2s_2^2. In Figures 1 and 2 we sketch the optimal actions modulo s_1 + s_2 and the outcomes modulo 2s_1, of the two-action games with 2s_2 ≥ s_1.
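For consecutive actions s_1 = s_2 + 1, the convergence value can be confirmed by brute force; a sketch (the helper name and search bound are our assumptions):

```python
def xi(S, bound):
    # empirical convergence point: one past the last non-greedy optimal action
    o, opt = [0] * (bound + 1), [0] * (bound + 1)
    for x in range(min(S), bound + 1):
        moves = [s for s in S if s <= x]
        o[x] = max(s - o[x - s] for s in moves)
        opt[x] = max(s for s in moves if s - o[x - s] == o[x])
    return 1 + max(x for x in range(min(S), bound + 1) if opt[x] != max(S))

for s2 in (2, 3, 4, 5):
    assert xi({s2, s2 + 1}, 200) == 2 * s2 * s2   # convergence at 2*s2^2
print("xi({s2, s2+1}) = 2*s2^2 confirmed for s2 = 2..5")
```

The last sacrifice position for s_1 = s_2 + 1 is thus 2s_2^2 − 1, in agreement with max X*(i_max) for i_max = s_2 and α = 1.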
Figure 1: Optimal actions before convergence, for pile sizes modulo s_1 + s_2. The positions in X* are of the form x + (s_1 + s_2)i, for 0 ≤ i < i_max, where s_2 ≤ x < s_1. The pile sizes are pictured on the inside and the optimal actions on the outside.

Figure 2: Initial outcomes for the first 2s_1 positions (top) and outcomes at convergence (bottom) for pile sizes modulo 2s_1, for two-action games. The pile sizes are pictured on the inside and the outcomes on the outside of the respective circle.

CS with truncated support
In Section 4 we gave a simple proof for the full support case, and this might lead one to think that the generalized case of truncated support is similarly simple. However, we do not yet understand the full class of truncated support games. So far, our efforts have led us to the intriguing Conjecture 16.
For 1 ≤ a < m, the (a − 1)-truncated support game has action set S = {a, . . . , m}, where m is the largest action; that is, the a − 1 smallest actions are cut out. The truncated support games include as special cases both the games with full support (a = 1, i.e. 0-truncated) and the two-action games with consecutive actions (a = m − 1, i.e. (m − 2)-truncated), which are the games with the slowest convergence.
For each a, we estimate in which interval of size 2m optimal play converges to the maximal action m.
The a-th column in Table 2 shows the convergence for (a − 1)-truncated support games, for m ∈ {2, . . . , 10}. The #x column is the number of unique values of tr^m_a. From this table alone, for a ≥ 2, the sequence of numbers of occurrences is non-increasing, but this is not true in general. To obtain some more insight, we plot the entries for m = 25, 50, 100. Via early observations, these pictures seem to converge to some limiting shape (compare the support size 2 result). We have, for all a ≥ 1, x_a < x_{a+1}. But what is the number of elements in the sequence x, for each m? The initial sizes of these sets are displayed in the last column of the table, as #x.
Study the first differences ∆^m_a = x^m_{a+1} − x^m_a, for a ≥ 1. Define, for all m ≥ 3 and for all 1 ≤ j ≤ #x, M_j := #{a | x^m_j = tr^m_a}. One can prove the following result by combining methods and results in Theorem 6 and Theorem 12. This result reflects an emerging 'duality' between individual games and sequences of games, which appears to continue in the inner regions of the pictures. We make the following conjecture.
• The first differences ∆^m equal, in reverse order, the multiplicities of the numbers in tr^m. That is, for all a, M_{m+1−a} = ∆^m_a.
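The entries behind Table 2 can be regenerated by brute force; a sketch in which tr(m, a) is our name for the empirical convergence value of the (a − 1)-truncated game with actions {a, . . . , m} (the search bound is a guess guided by the quadratic bound of Theorem 3):

```python
def tr(m, a, bound=None):
    S = list(range(a, m + 1))          # (a-1)-truncated support
    bound = bound or 4 * m * m + 2 * m
    o, opt = [0] * (bound + 1), [0] * (bound + 1)
    for x in range(a, bound + 1):
        moves = [s for s in S if s <= x]
        o[x] = max(s - o[x - s] for s in moves)
        opt[x] = max(s for s in moves if s - o[x - s] == o[x])
    # one past the last position whose optimal action is not m
    return 1 + max(x for x in range(a, bound + 1) if opt[x] != m)

print([tr(5, a) for a in range(1, 5)])
assert tr(5, 1) == 5          # full support: convergence at m (Theorem 6)
assert tr(5, 4) == 2 * 4 * 4  # two consecutive actions {4, 5}: 2*s2^2
```

The two endpoint columns (full support and consecutive two-action games) act as sanity anchors; the intermediate values are exactly the data the conjecture concerns.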

Discussion
In our work we study four classes of CS games, all with finite support. The convergence theorem, Theorem 3, tells how to play for any given game with a large heap size; however, when the heap size is small, we only have a full understanding of optimal play in the classes of two-action games and full support. As future work we suggest studying optimal play when heaps are small in other classes of CS, in particular the class of truncated games. For the case |S| = 2, Theorem 12 states the positions where it is optimal to sacrifice. The following two observations are immediate from this result.
Observation 3. Consider a game with exactly two actions. In optimal play, if a player makes a sacrifice, then she plays the last move.
Observation 4. Consider a game with exactly two actions. In optimal play, at least one of the players plays only greedy actions.
For games with more than 2 possible actions, the observations do not hold any more.
Example 4. Let S = {1, 5, 7}, played from position x = 18. The (unique) optimal play sequence is 5; 7; 5; 1, showing that sometimes it is beneficial to sacrifice without playing last. Actually, Positive sacrifices in order to play the last 'big' action.
For games with |S| ≥ 4, it is not true that only one player sacrifices in optimal play. Consider the following example.
Example 5. Let S = {2, 10, 13, 14}, played from position x = 35. The unique optimal play sequence is 10; 13; 10; 2, and the first two actions are both sacrifices.
This example triggers another question: is it true that when both players sacrifice, Negative makes a smaller sacrifice than Positive?
Conjecture 17. In a game where optimal play includes sacrifices by both players, Negative's sacrifice is smaller than Positive's sacrifice. (In Example 5, Positive sacrifices 4 while Negative sacrifices 1.)
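Both examples, and the sacrifice sizes behind Conjecture 17, can be replayed mechanically; a sketch in which ties are resolved toward the larger action (function name is ours):

```python
def optimal_line(S, x):
    # follow one optimal line of play; each mover maximizes its own net score
    o = [0] * (x + 1)
    for y in range(min(S), x + 1):
        o[y] = max(s - o[y - s] for s in S if s <= y)
    seq, pos = [], x
    while pos >= min(S):
        s = max(t for t in S if t <= pos and t - o[pos - t] == o[pos])
        seq.append(s)
        pos -= s
    return o[x], seq

assert optimal_line({1, 5, 7}, 18) == (2, [5, 7, 5, 1])           # Example 4
assert optimal_line({2, 10, 13, 14}, 35) == (5, [10, 13, 10, 2])  # Example 5
```

The returned score is from the first player's perspective. In Example 5, Positive's first move sacrifices 14 − 10 = 4 while Negative's first move sacrifices 14 − 13 = 1, which is the asymmetry the conjecture predicts.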

Multi pile CS
CS can be extended naturally to multiple piles. In CS with multiple piles, on each turn the active player first chooses a pile, and then plays as in the single-pile game on that pile. (In CGT jargon, this is disjunctive sum play.) By looking at many games with two piles, such as in Figures 5 and 6, we observe convergence to the greedy action and periodicity in the outcome. By using similar arguments as in the folklore for classical subtraction games, one can show that in CS with two piles the outcome is eventually periodic on any horizontal or vertical line. Here, we strengthen this result to a conjecture in the spirit of Theorem 3.
Conjecture 18. Consider CS on two piles. The outcome is eventually periodic on any horizontal or vertical line, with period at most 2 max S.
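Conjecture 18 can at least be sanity-checked on small rulesets. The sketch below computes the two-pile outcome table for S = {1, 2} and scans one horizontal line (positions (x, 1)) for the predicted period 2 max S (the ruleset and window are our own choices):

```python
def two_pile_outcomes(S, n):
    # o[(x1, x2)]: optimal score difference; the mover picks a pile and an action
    o = {}
    for tot in range(0, 2 * n + 1):            # fill by increasing total pile size
        for x1 in range(max(0, tot - n), min(n, tot) + 1):
            x2 = tot - x1
            moves = [s - o[(x1 - s, x2)] for s in S if s <= x1]
            moves += [s - o[(x1, x2 - s)] for s in S if s <= x2]
            o[(x1, x2)] = max(moves, default=0)
    return o

S, n = {1, 2}, 60
o = two_pile_outcomes(S, n)
p = 2 * max(S)
row = [o[(x, 1)] for x in range(n + 1)]
assert all(row[x] == row[x + p] for x in range(4, n + 1 - p))
print(row[:12])
```

For this small ruleset the line x_2 = 1 settles into the repeating block 1, 0, 1, 2 of length 4 = 2 max S, consistent with the conjectured period bound.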
In addition, we observe regularity of the outcomes along diagonal halflines of the form (x, k + x), called k-diagonals, for any constant k ∈ Z.
Conjecture 19. Consider CS on two piles. The outcome is eventually periodic along any k-diagonal.
Figure 5: Outcomes for a two-pile game, starting from a position of the form (x_1, x_2). The outcome is bounded between 0 and 7; low outcomes are painted in blue and high outcomes in red.
Figure 6: Outcomes for the game S = {2, 10, 13, 14} with two piles, starting from a position of the form (x_1, x_2). The outcome is bounded between 0 and 14; low outcomes are painted in blue and high outcomes in red.