The Tu--Deng Conjecture holds almost surely

The Tu--Deng Conjecture is concerned with the sum of digits $w(n)$ of $n$ in base~$2$ (the Hamming weight of the binary expansion of $n$) and states the following: assume that $k$ is a positive integer and $1\leq t<2^k-1$. Then \[\Bigl \lvert\Bigl\{(a,b)\in\bigl\{0,\ldots,2^k-2\bigr\}^2:a+b\equiv t\bmod 2^k-1, w(a)+w(b)<k\Bigr\}\Bigr \rvert\leq 2^{k-1}.\] We prove that the Tu--Deng Conjecture holds almost surely in the following sense: the proportion of $t\in[1,2^k-2]$ such that the above inequality holds approaches $1$ as $k\rightarrow\infty$. Moreover, we prove that the Tu--Deng Conjecture implies a conjecture due to T.~W.~Cusick concerning the sum of digits of $n$ and $n+t$.


Introduction and results
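Z. Tu and Y. Deng's Conjecture [17] is concerned with the Hamming weight $w(n)$ of the binary expansion of a nonnegative integer $n$ (the sum of digits of $n$ in base two) and addition modulo $2^k-1$. This conjecture is as follows.
Conjecture TD. Assume that $k$ is a positive integer and $1\leq t<2^k-1$. Define \[S_{t,k}=\Bigl\{(a,b)\in\bigl\{0,\ldots,2^k-2\bigr\}^2:a+b\equiv t\bmod 2^k-1,\ w(a)+w(b)<k\Bigr\}\] and $P_{t,k}=\lvert S_{t,k}\rvert/2^k$. Then $P_{t,k}\leq 1/2$.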
The conjecture arose in the construction of Boolean functions with optimal algebraic immunity (see Tu and Deng [17,18]). Indeed, if the conjecture is true, the functions defined by Tu and Deng have this property.
Such functions are used in the construction of stream ciphers, which are widely used encryption methods due to their high speed and low hardware requirements [4]. However, stream ciphers are prone to serious attacks [2,5,6]. The notion of algebraic immunity was introduced in order to protect them against these known attacks [12]. We refer the reader to the above-cited papers by Tu and Deng for a more extensive discussion of the rôle of their conjecture within the cryptographic context.
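For small $k$ the inequality of the conjecture can be checked by direct enumeration. The following is a minimal sketch, not the authors' code; the function names `s_size` and `tu_deng_holds` are hypothetical and chosen for illustration only. It uses the fact that for each $a$ there is exactly one $b\in\{0,\ldots,2^k-2\}$ with $a+b\equiv t\bmod 2^k-1$.

```python
def w(n: int) -> int:
    """Hamming weight of the binary expansion of n."""
    return bin(n).count("1")


def s_size(t: int, k: int) -> int:
    """Number of pairs (a, b) in {0, ..., 2^k - 2}^2 with
    a + b = t (mod 2^k - 1) and w(a) + w(b) < k."""
    m = (1 << k) - 1  # the modulus 2^k - 1
    # For each a, the partner b is determined uniquely modulo m.
    return sum(1 for a in range(m) if w(a) + w((t - a) % m) < k)


def tu_deng_holds(k: int) -> bool:
    """Check the inequality for every t with 1 <= t < 2^k - 1."""
    bound = 1 << (k - 1)
    return all(s_size(t, k) <= bound for t in range(1, (1 << k) - 1))


if __name__ == "__main__":
    for k in range(2, 11):
        print(k, tu_deng_holds(k))
```

The running time grows like $4^k$, so this brute-force check is only practical for small $k$.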
Let us give a probabilistic (and combinatorial) interpretation of the conjecture. Let $S_k:=\bigcup_{t=1}^{2^k-2}S_{t,k}$, and consider an arbitrary pair $(a,b)$ in $S_k$. On the one hand, the number of 1s in the binary expansion of $a$ (and of $b$) is at most $k-1$. On the other hand, the constraint on the Hamming weights implies that the total number of 1s in both integers is less than $k$. Finally, note that all such pairs except $(0,0)$ are part of $S_k$. Therefore, counting the ways to distribute fewer than $k$ 1s among the $2k$ binary digits of $a$ and $b$ together, we get \[\lvert S_k\rvert=\sum_{j=0}^{k-1}\binom{2k}{j}-1=\frac12\biggl(4^k-\binom{2k}{k}\biggr)-1.\] The sequence including $(0,0)$, that is, the sequence for $\lvert S_k\rvert+1$, is A000346 in Sloane's OEIS. It is then easy to compute the asymptotic expansion of this sequence as \[\lvert S_k\rvert+1=\frac{4^k}{2}\biggl(1-\frac{1}{\sqrt{\pi k}}+O\bigl(k^{-3/2}\bigr)\biggr).\] As there are $2^k-2$ possible choices for $t$, we see by the pigeonhole principle that at least one of the sets $S_{t,k}$ has to be asymptotically of size $2^k/2$. Therefore, the Tu-Deng Conjecture describes a uniform distribution among the possible sets $S_{t,k}$.
While working on the Tu-Deng Conjecture, T. W. Cusick (private communication, 2011, 2015) formulated a related conjecture on the Hamming weight.
Conjecture C. Assume that $t$ is a nonnegative integer. Then \[c_t:=\operatorname{dens}\bigl\{n\in\mathbb N:w(n+t)\geq w(n)\bigr\}>\frac12,\] where $\operatorname{dens}A$ denotes the asymptotic density of a set $A\subseteq\mathbb N$ (which exists in this case).
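The counting argument for $\lvert S_k\rvert+1$ can be cross-checked numerically. The sketch below assumes the following reading of that argument: choosing fewer than $k$ positions for 1s among $2k$ binary digits gives $\sum_{j<k}\binom{2k}{j}$, which by the symmetry of the binomial coefficients equals $\bigl(4^k-\binom{2k}{k}\bigr)/2$. The function names are hypothetical.

```python
from math import comb


def w(n: int) -> int:
    """Hamming weight of the binary expansion of n."""
    return bin(n).count("1")


def pairs_brute_force(k: int) -> int:
    """Count pairs (a, b) with 0 <= a, b < 2^k and w(a) + w(b) < k.
    The pair (0, 0) is included, so this corresponds to |S_k| + 1."""
    n = 1 << k
    return sum(1 for a in range(n) for b in range(n) if w(a) + w(b) < k)


def pairs_closed_form(k: int) -> int:
    """(4^k - binom(2k, k)) / 2, an integer since binom(2k, k) is even for k >= 1."""
    return (4**k - comb(2 * k, k)) // 2


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, pairs_brute_force(k), pairs_closed_form(k))
```

The first values produced by the closed form are 1, 5, 22, 93, 386, which agree with the initial terms of A000346.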
Also, note that the density in Conjecture C exists, which follows, for example, from the "Lemma of Bésineau" [3, Lemme 1]; see also [9, Lemma 2.1]. In fact, we have \[c_t=\frac{1}{2^k}\Bigl\lvert\bigl\{n<2^k:w(n+t)\geq w(n)\bigr\}\Bigr\rvert\tag{1.1}\] for $k\geq\alpha+\mu$, where $\alpha=w(t)+1$ and $2^\mu\leq t<2^{\mu+1}$ [9, equation (10) and Section 3.3]. We also studied [9] a statement complementary to Cusick's Conjecture.
Conjecture CC. Assume that $t$ is a nonnegative integer. Then \[\tilde c_t:=\operatorname{dens}\bigl\{n\in\mathbb N:w(n+t)>w(n)\bigr\}\leq\frac12.\]
Analogously to the case of $c_t$, we have \[\tilde c_t=\frac{1}{2^k}\Bigl\lvert\bigl\{n<2^k:w(n+t)>w(n)\bigr\}\Bigr\rvert\tag{1.2}\] for $k$ large enough. Taken together, Conjectures C and CC locate quite precisely the median of the random variable $X_t$ on $\mathbb Z$ defined by $j\mapsto\operatorname{dens}\bigl\{n:w(n+t)-w(n)=j\bigr\}$.
Numerical experiments reveal that $\tilde c_t\leq 1/2<c_t$ for all $t<2^{30}$. In fact, Drmota, Kauers, and the first author [9] proved that Conjectures C and CC are satisfied for almost all $t$ in the sense of asymptotic density. In the present paper, we want to show that an analogous result holds for Conjecture TD.
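A small-scale version of such an experiment can be sketched as follows, using formula (1.1) with $k=\alpha+\mu$. Two points are assumptions of this sketch rather than statements of the text: $\tilde c_t$ is taken to be the density of $\{n:w(n+t)>w(n)\}$, and the same value of $k$ is reused for it. The hypothetical parameter `extra` enlarges $k$ so that one can check that the computed value has stabilized.

```python
from fractions import Fraction


def w(n: int) -> int:
    """Hamming weight of the binary expansion of n."""
    return bin(n).count("1")


def density(t: int, strict: bool = False, extra: int = 0) -> Fraction:
    """Proportion of n < 2^k with w(n+t) >= w(n) (or > if strict),
    where k = alpha + mu + extra as in formula (1.1)."""
    if t == 0:
        return Fraction(0 if strict else 1)
    mu = t.bit_length() - 1  # 2^mu <= t < 2^(mu+1)
    alpha = w(t) + 1
    k = alpha + mu + extra
    if strict:
        hits = sum(1 for n in range(1 << k) if w(n + t) > w(n))
    else:
        hits = sum(1 for n in range(1 << k) if w(n + t) >= w(n))
    return Fraction(hits, 1 << k)


if __name__ == "__main__":
    for t in range(1, 17):
        print(t, density(t, strict=True), density(t))
```

Exact rational arithmetic via `Fraction` avoids rounding issues when comparing the values with $1/2$.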
Theorem 1.1. Define $P_{t,k}$ as before, [...] In particular, \[\lim_{k\to\infty}\frac{1}{2^k}\Bigl\lvert\bigl\{t\in\{1,\ldots,2^k-2\}:1/2-\varepsilon<P_{t,k}<1/2\bigr\}\Bigr\rvert=1.\]
Moreover, we will prove that Conjectures C and CC are in fact implied by Conjecture TD. In fact, we will see that Conjectures C and CC are contained as "extremal cases" in Conjecture TD, choosing $t$ and letting $k\to\infty$.
However, so far we have not succeeded in proving the converse implication. Meanwhile, due to the similarity of the conjectures, it is reasonable to expect that a proof of Conjecture C, when one is found (and if it is found first), will lead to a proof of Conjecture TD. We wish to highlight this similarity between the conjectures.
Proposition 1.3. For integers $k\geq 1$ and $a,b$ we define [...] Conjecture TD is equivalent to the statement that [...]
The binary operation $\oplus_k$ can also be seen as "circular addition" in base 2: if a carry occurs at the index $k-1$ in the addition $a+b$, this carry does not propagate into position $k$, but into the lowest bit instead. Moreover, if $a+b=2^k-1$, the result is set to zero.
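The description of circular addition can be turned into a short bit-level routine. This is a sketch under the assumption that, for $a,b\in\{0,\ldots,2^k-2\}$, the operation $\oplus_k$ agrees with addition modulo $2^k-1$, as the verbal description suggests; the name `circ_add` is hypothetical.

```python
def circ_add(a: int, b: int, k: int) -> int:
    """Circular addition of a and b on k binary digits (end-around carry).
    Assumes 0 <= a, b <= 2^k - 2."""
    mask = (1 << k) - 1
    s = a + b
    # A carry out of position k-1 is fed back into the lowest bit.
    s = (s & mask) + (s >> k)
    # The all-ones word 2^k - 1 represents zero.
    return 0 if s == mask else s


if __name__ == "__main__":
    k = 4
    print(circ_add(9, 8, k), (9 + 8) % ((1 << k) - 1))
```

Since $a+b\leq 2^{k+1}-4$, a single folding step suffices and no second carry can occur.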
By Proposition 1.3, we may summarize the content of Conjectures TD and C by the following elementary question: how does the sum of digits change under (modular) addition of a constant? It is this formulation in particular that makes the Tu-Deng Conjecture a mathematically interesting problem.
The idea of the proof of Theorem 1.1 is to show a concentration result using Chebyshev's inequality. More precisely, we consider the moments \[\frac{1}{2^k}\sum_{0\leq t<2^k}\lvert S_{t,k}\rvert\quad\text{and}\quad\frac{1}{2^k}\sum_{0\leq t<2^k}\lvert S_{t,k}\rvert^2\] and derive asymptotic expansions for them. (Note that $\lvert S_{0,k}\rvert=\lvert S_{2^k-1,k}\rvert=1$, so that the cases $t\in\{0,2^k-1\}$ will not matter asymptotically.) These expansions are then used to prove that the values $P_{t,k}$ concentrate well below $1/2$ as $k\to\infty$. This idea of proof is analogous to the method used by Drmota, Kauers, and the first author [9]. In fact, the trivariate rational generating function we are going to encounter is very similar to the one in that paper.
The remaining part of this paper is dedicated to the proofs of Theorem 1.1 and Propositions 1.2 and 1.3. Throughout the proofs, we will use the notation $t^c_k=2^k-1-t$. We will assume that $0\leq t<2^k$; then the binary expansion of $t^c_k$ is the Boolean complement of the binary expansion of $t$, padded with 1s up to the index $k-1$.
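The concentration phenomenon behind this strategy can be observed numerically for small $k$. The sketch below computes the empirical mean and standard deviation of $P_{t,k}=\lvert S_{t,k}\rvert/2^k$; as a simplification that differs slightly from the normalization in the text, it averages over $1\leq t\leq 2^k-2$ only. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, sqrt


def w(n: int) -> int:
    """Hamming weight of the binary expansion of n."""
    return bin(n).count("1")


def p_values(k: int) -> list:
    """Exact values P_{t,k} = |S_{t,k}| / 2^k for t = 1, ..., 2^k - 2."""
    m = (1 << k) - 1
    return [
        Fraction(sum(1 for a in range(m) if w(a) + w((t - a) % m) < k), 1 << k)
        for t in range(1, m)
    ]


def mean_and_std(k: int):
    """Empirical mean (exact) and standard deviation (float) of P_{t,k}."""
    ps = p_values(k)
    mean = sum(ps) / len(ps)
    var = sum((p - mean) ** 2 for p in ps) / len(ps)
    return mean, sqrt(var)


if __name__ == "__main__":
    for k in range(3, 11):
        mean, std = mean_and_std(k)
        print(k, float(mean), std, float(Fraction(1, 2) - mean))
```

Comparing the last two printed columns shows how the standard deviation relates to the gap between the mean and $1/2$ for the values of $k$ that are within reach of brute force.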

Proof of Proposition 1.2
We first rewrite the Tu-Deng Conjecture. Let us split the set $S_{t,k}$ according to whether $a+b<2^k-1$: set \[S^{(1)}_{t,k}=\bigl\{a\in\{0,\ldots,t\}:w(a)+w(t-a)<k\bigr\},\qquad S^{(2)}_{t,k}=\bigl\{a\in\{t+1,\ldots,2^k-2\}:w(a)+w\bigl(2^k-1+t-a\bigr)<k\bigr\}.\] Note that the corresponding sets of pairs $(a,b)$ form a partition of $S_{t,k}$. We define the quantity \[\beta_{t,k,j}=\Bigl\lvert\bigl\{a\in\{0,\ldots,t\}:w\bigl(a+2^k-1-t\bigr)-w(a)=j\bigr\}\Bigr\rvert,\] where $k\geq 1$, $0\leq t<2^k$ and $j$ are integers. By the identity [...] Since $w(0)\not>w(0+t)$ and $w\bigl(2^k-1-t\bigr)\not>w\bigl(2^k-1\bigr)$, we obtain [...]
Both Conjecture C and Conjecture CC are trivial if $t=0$. Let $t\geq 1$ be given and assume that $k'\geq 1$ is such that $t<2^{k'}-1$; we choose $k\geq 2k'$. With this choice we have $w(a)\leq w\bigl(a+2^k-1-t\bigr)$ as long as $0\leq a\leq t$. This is the case since $2^k-2^{k'}+1\leq a+2^k-1-t\leq 2^k-1$; therefore the block of 1s at the left of the binary expansion of $2^k-1-t$, which has length at least $k'$, is not touched by the addition of $a$. Therefore $\bigl\lvert S^{(1)}_{t,k}\bigr\rvert=t+1$ for large $k$. Assuming that Conjecture TD holds, we obtain \[2^{k-1}\geq t+1+\Bigl\lvert\bigl\{a\in\{0,\ldots,2^k-1-t\}:w(a)>w(a+t)\bigr\}\Bigr\rvert>\Bigl\lvert\bigl\{a\in\{0,\ldots,2^k-1\}:w(a)>w(a+t)\bigr\}\Bigr\rvert.\] This last expression equals $2^k(1-c_t)$ if $k$ is chosen large enough (see (1.1)), which implies $c_t>1/2$.
To derive Conjecture CC, we replace $t$ in the Tu-Deng Conjecture by $2^k-1-t$. Noting that $\sum_{j\in\mathbb Z}\beta_{t,k,j}=t+1$, we obtain [...] Letting $k\to\infty$ and using (1.2) we obtain $\tilde c_t\leq 1/2$.
Proof. By the identity $\nu_2(n!)=n-w(n)$ we have $\nu_2\binom{n}{k}=w(k)+w(n-k)-w(n)$ for $0\leq k\leq n$. By the substitution $a\mapsto t-a$ and the formula $w\bigl(2^k-1-m\bigr)=k-w(m)$, valid for $0\leq m<2^k$, we obtain [...]
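The three digit identities used in this proof are easy to confirm numerically. The following sketch checks Legendre's formula $\nu_2(n!)=n-w(n)$, the resulting formula for $\nu_2\binom{n}{k}$, and the complement rule $w(2^k-1-m)=k-w(m)$ on a finite range; the helper names `nu2` and `check_identities` are hypothetical.

```python
from math import comb, factorial


def w(n: int) -> int:
    """Hamming weight of the binary expansion of n."""
    return bin(n).count("1")


def nu2(n: int) -> int:
    """2-adic valuation of a positive integer n."""
    return (n & -n).bit_length() - 1


def check_identities(limit: int, max_k: int) -> bool:
    """Verify the three identities for n <= limit and 1 <= k <= max_k."""
    for n in range(limit + 1):
        if nu2(factorial(n)) != n - w(n):
            return False
        for k in range(n + 1):
            if nu2(comb(n, k)) != w(k) + w(n - k) - w(n):
                return False
    for k in range(1, max_k + 1):
        for m in range(1 << k):
            if w((1 << k) - 1 - m) != k - w(m):
                return False
    return True


if __name__ == "__main__":
    print(check_identities(60, 10))
```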

Proof of Proposition 1.3
Using the identity $w\bigl(2^k-1-t\bigr)=k-w(t)$ (see also the proof of Proposition 1.2 in the previous section), we see that [...] From this the first equivalence follows.
In order to prove the second statement, it is sufficient to show that the values $2^{-k}\bigl\lvert\{n\in\{0,\ldots,2^k-1\}:w(n+t)\geq w(n)\}\bigr\rvert$ are nonincreasing in $k$. (Note that this is not the case for Tu-Deng; otherwise we would have a proof of the implication C$\Rightarrow$TD.) We proceed by induction on $t$ and show the more general statement that the values \[v_{t,k,j}=2^{-k}\Bigl\lvert\bigl\{n\in\{0,\ldots,2^k-1\}:w(n+t)-w(n)\geq j\bigr\}\Bigr\rvert\] are nonincreasing in $k$, for each $j\in\mathbb Z$.
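The monotonicity of the values $v_{t,k,j}$ in $k$ can be observed directly for small parameters. The sketch below evaluates $v_{t,k,j}$ by brute force with exact rationals; it illustrates the statement and is not a substitute for the induction argument. The function names are hypothetical.

```python
from fractions import Fraction


def w(n: int) -> int:
    """Hamming weight of the binary expansion of n."""
    return bin(n).count("1")


def v(t: int, k: int, j: int) -> Fraction:
    """Proportion of n in {0, ..., 2^k - 1} with w(n+t) - w(n) >= j."""
    hits = sum(1 for n in range(1 << k) if w(n + t) - w(n) >= j)
    return Fraction(hits, 1 << k)


def nonincreasing_in_k(t: int, j: int, k_max: int) -> bool:
    """Check v(t, 0, j) >= v(t, 1, j) >= ... >= v(t, k_max, j)."""
    values = [v(t, k, j) for k in range(k_max + 1)]
    return all(x >= y for x, y in zip(values, values[1:]))


if __name__ == "__main__":
    print([v(1, k, 0) for k in range(6)])
```

For $t=1$ and $j=0$ the printed sequence starts at 1 and stabilizes at $3/4$.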
Moreover, we have $d(2n,2t+1)=d(n,t)+1$ and $d(2n+1,2t+1)=d(n,t+1)-1$; therefore we have for all $k\geq 0$ [...]
Our strategy is to show that the standard deviation of the random variable $t\mapsto\Gamma_{t,k,1}$ is much smaller than the distance to $2^{k-1}$, so that the values $P_{t,k}$ concentrate below $1/2$ by Chebyshev's inequality. We are therefore interested in the mean value and the variance of $t\mapsto\Gamma_{t,k,1}$ on the intervals $[0,2^k)$. First, we want to find a recurrence for the values \[\beta_{t,k,j}=\Bigl\lvert\bigl\{a\in\{0,\ldots,t\}:w\bigl(a+t^c_k\bigr)-w(a)=j\bigr\}\Bigr\rvert,\] where $k\geq 1$, $0\leq t<2^k$ and $j\in\mathbb Z$. For convenience, we set $\beta_{-1,k,j}=0$.
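The two identities for $d$ and the basic properties of $\beta_{t,k,j}$ can be checked by brute force. In the sketch below, $d(n,t)$ is assumed to mean $w(n+t)-w(n)$; this definition is not displayed in the present section, but it is consistent with both identities. The function `beta` is a hypothetical helper that tabulates $\beta_{t,k,j}$ for fixed $t$ and $k$.

```python
def w(n: int) -> int:
    """Hamming weight of the binary expansion of n."""
    return bin(n).count("1")


def d(n: int, t: int) -> int:
    """Assumed definition: change of the Hamming weight when adding t to n."""
    return w(n + t) - w(n)


def beta(t: int, k: int) -> dict:
    """Map j -> beta_{t,k,j}, counting a in {0, ..., t} with
    w(a + t^c_k) - w(a) = j, where t^c_k = 2^k - 1 - t."""
    tc = (1 << k) - 1 - t
    counts = {}
    for a in range(t + 1):
        j = w(a + tc) - w(a)
        counts[j] = counts.get(j, 0) + 1
    return counts


if __name__ == "__main__":
    print(beta(5, 3))
    print(all(d(2 * n, 2 * t + 1) == d(n, t) + 1 for n in range(64) for t in range(64)))
```

Since $a+t^c_k\leq 2^k-1$ for $0\leq a\leq t$, both weights lie between $0$ and $k$, which is why only indices with $\lvert j\rvert\leq k$ occur in the table.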
Proof. The last claim $\beta_{t,k,j}=0$ for $\lvert j\rvert>k$ follows by induction. The first two statements and the cases $t=0$ are clear. We note the almost trivial identities [...] which hold for all $t$ and $k$. We calculate for $1\leq t<2^k$: [...] The statement also holds for $t=0$, using $\beta_{-1,k,j}=0$. Moreover, for $0\leq t<2^k$ we have [...] and the last statement also holds for $t=2^k-1$.
We want to compute the first moments of the values $\beta_{t,k,j}$. Define [...] Clearly, we have $m_{0,j}=\delta_{0,j}$. Using the above recurrence, we obtain for $k\geq 1$ [...] We define the bivariate generating function $F$: [...] Since $\beta_{t,k,j}=0$ for $j>k$ and $0\leq t<2^k$ (which can be proved by induction), this function captures all interesting values. Moreover, we have $\beta_{t,k,j}=0$ for $j\leq -k+1$.
Using the recurrence for $m_{k,j}$, we obtain [...] $m_{k,k-\ell}\,x^ky^\ell$.
As above, we calculate for $k\geq 1$: [...] The first moments of the random variable $t\mapsto\beta_{t,k,j}$, where $t\in\{1,\ldots,2^k-1\}$, are contained in certain diagonals of the bivariate rational function $F(x,y)$ (to be precise, the diagonal contains the values $m_{k,j}$, which are first moments multiplied by $2^k$). The moments corresponding to $j=0$ are contained in the main diagonal.
We define [...] The first moment of $t\mapsto 2^k\Gamma_{t,k,1}$ is therefore given by $M_{k,k-1}=[x^ky^{k-1}]\,G(x,y)$. Extracting this diagonal, we rediscover the result given in the introduction.
While this result was proved in a shorter way in the introduction, we decided to also keep this longer proof, for two reasons: on the one hand, we have captured the first moments of all $\Gamma_{t,k,k-\ell}$ in one generating function, which better shows the underlying structure; on the other hand, this proof is a gentle introduction to the method used for the second moment later.
Proof. The idea of the proof is to extract the (shifted) diagonal of $G(x,y)$. First note that $[x^ky^{k-1}]\,G(x,y)=[x^ky^k]\,yG(x,y)$. The diagonal is given by $\Delta(yG)(z):=\sum_{k\geq 1}M_{k,k-1}z^k$. The computation is then a routine exercise in enumerative combinatorics (see e.g. [16, Chapter 6.3]) and can be automated to a great extent using computer algebra. We do not present this standard argument here. More details can be found in the accompanying Maple worksheet [1] implementing the manipulations on the power series using the gfun package [14].
We get $\Delta(yG)(z)=\frac{1}{2}$ [...], from which we extract coefficients, noting that $\sum_{n\geq 0}\binom{2n}{n}z^n=(1-4z)^{-1/2}$. The asymptotics is directly computed (to any needed order) from the known asymptotics of the central binomial coefficient.
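The generating function of the central binomial coefficients and their leading asymptotics can be sanity-checked in a few lines. The sketch below does not use the Maple worksheet [1]; it verifies the identity $\bigl(\sum_{n\geq 0}\binom{2n}{n}z^n\bigr)^2=\sum_{n\geq 0}4^nz^n$, which is equivalent to the stated closed form $(1-4z)^{-1/2}$, and compares $\binom{2n}{n}$ with the standard approximation $4^n/\sqrt{\pi n}\,\bigl(1-1/(8n)\bigr)$. The function names are hypothetical.

```python
from math import comb, pi, sqrt


def squared_series(n_terms: int) -> list:
    """Coefficients of (sum_n binom(2n, n) z^n)^2 up to z^(n_terms - 1)."""
    c = [comb(2 * n, n) for n in range(n_terms)]
    return [sum(c[i] * c[n - i] for i in range(n + 1)) for n in range(n_terms)]


def normalized_central_binomial(n: int) -> float:
    """binom(2n, n) * sqrt(pi * n) / 4^n, which tends to 1 as n grows."""
    return comb(2 * n, n) / 4**n * sqrt(pi * n)


if __name__ == "__main__":
    print(squared_series(6))
    print(normalized_central_binomial(1000), 1 - 1 / 8000)
```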
We proceed to the second moments of the values $\Gamma_{t,k,j}$. Define [...] The second moment of $t\mapsto\Gamma_{t,k,1}=P_{t,k}$ is given by $\frac{1}{8^k}M_{k,k-1,k-1}$, which we want to realize as a diagonal of a trivariate rational generating function.
Proposition 4.4. We have the asymptotic expansion [...]
Corollary 4.5. Let $X_k$ be the discrete random variable defined by $X_k(t)=P_{t,k}=\lvert S_{t,k}\rvert/2^k$, where $1\leq t<2^k-1$, and let $\sigma_k=\sqrt{\mathbb E(X_k-\mathbb E X_k)^2}$ be the corresponding standard deviation. Then for $k\to\infty$ we have [...]
Proof. The first and second moments of the random variable $t\mapsto\frac{1}{2^k}\lvert S_{t,k}\rvert$ are given by $\frac{1}{4^k}M_{k,k-1}$ and $\frac{1}{8^k}M_{k,k-1,k-1}$ [...] and auxiliary values [...] We calculate, for $k\geq 1$ and $\ell,m\geq 0$: [...] By the obvious identities $a_{k,\ell,m}=a_{k,m,\ell}$ and $b_{k,\ell,m}=c_{k,m,\ell}$ we have [...] We define generating functions [...] Summing over $k,\ell,m$, the above recurrences translate to identities for these functions: [...] noting that $a_{k,\ell,m}=0$ for $\ell<0$ or $m<0$, and that [...] Inserting these identities into the equation for $A(x,y,z)$, we obtain [...] After some rewriting we obtain [...] Note that the denominator is the same as in [9, Equation (19)]. Define [...] We define generating functions [...] We obtain \[\begin{aligned}A(x,y,z)&=1+2xz\,A(x,y,z)+2xy^2z\,B(x,y,z)+2xyz^2\,A(x,y,z)+2xy\,B(x,y,z),\\B(x,y,z)&=xz^2\,A(x,y,z)+\bigl(xy^2z^2+x+4xyz\bigr)B(x,y,z)+xy^2\,C(x,y,z),\\C(x,y,z)&=2xz(1+yz)\,B(x,y,z)+2xy(1+yz)\,C(x,y,z).\end{aligned}\tag{4.1}\]

It follows that [...]
We have [...]
By extending the above argument, which is only a computational issue, we obtain more terms in the asymptotic expansion, which yields the statement of Proposition 4.4. For details see the accompanying Maple worksheet [1].

Conclusion
It is an elementary problem to study the behaviour of the digital expansion of an integer under addition of a constant. More specifically, we wish to understand the sum of digits in base 2 of $n$ and $n+t$, which amounts to studying the number of carries occurring in the addition of the binary expansions of $n$ and $t$. The question arises how often a certain number of carries is attained when adding $n$ to a given integer $t$. At first, this has the appearance of an easy task. However, we soon meet the difficulty that carries may propagate through several blocks of 1s; it is not clear how to capture all of the appearing patterns simultaneously. Both Conjecture TD and Conjecture C concern this question, and neither of them has been resolved in the seven years since their introduction. Only partial results have been obtained so far, including an almost-all result for Cusick's conjecture proved by Drmota, Kauers, and the first author. The current paper adds to our knowledge on the Tu-Deng conjecture by proving an analogous result: Conjecture TD holds almost surely in a precise sense.
Our method can certainly be applied to related questions. While analogues of (1.3) and (1.4) fail for the sum-of-digits function in base 3, they seem to hold for the Hamming weight of the ternary expansion of $n$ (the number of nonzero digits of $n$ in base 3). We are confident that our method yields almost-all results for these questions.
A different kind of extension of the considered problems concerns the sum of digits of $n$, $n+t$ and $n+2t$: do we have \[\Bigl\lvert\bigl\{n\in\{0,\ldots,2^k-1\}:s(n)\leq s(n+t),\ s(n)\leq s(n+2t)\bigr\}\Bigr\rvert>2^{k-2}?\] Is the same true for $\oplus_k$ instead of $+$? Again, we expect that nontrivial results can be obtained using our method.
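The question about $n$, $n+t$ and $n+2t$ can be explored experimentally for small parameters. The sketch below only counts the relevant set for ordinary addition; it makes no claim about whether the inequality holds in general, and the function name `count_joint` is hypothetical.

```python
def s(n: int) -> int:
    """Sum of digits of n in base 2."""
    return bin(n).count("1")


def count_joint(t: int, k: int) -> int:
    """Number of n in {0, ..., 2^k - 1} with s(n) <= s(n+t) and s(n) <= s(n+2t)."""
    return sum(
        1 for n in range(1 << k) if s(n) <= s(n + t) and s(n) <= s(n + 2 * t)
    )


if __name__ == "__main__":
    k = 10
    for t in range(1, 17):
        print(t, count_joint(t, k), 1 << (k - 2))
```

For example, for $t=1$ and $k=3$ the admissible values are $n\in\{0,1,2,4,5\}$, so the count is 5, compared with $2^{k-2}=2$.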
Meanwhile, the full statement of Conjecture TD remains an open problem. One possible approach to proving it is to assume a hypothetical counterexample to the conjecture and to construct from it a large set of counterexamples, which would contradict the asymptotic statement of our main theorem. However, it is a nontrivial task to compare the values $P_{t,k}$ for different $t$, in particular to construct (many) integers $t'$ and $k'$ satisfying $P_{t',k'}\geq P_{t,k}$. Consequently, this approach cannot yet be used to prove the conjecture.
In a similar vein, we may consider the following approach to proving Conjecture C: we have numerically $c_{t'}\leq c_t$, where $t'$ is obtained by appending $01\cdots 1$ to the binary expansion of $t$ and the number of 1s is large enough. If this can be proved, we may iterate the procedure of appending $01\cdots 1$, obtaining $t^{(k)}$; moreover, by asymptotic considerations one can certainly prove that $c_{t^{(k)}}>1/2$ for $k$ large enough. By monotonicity, we obtain $c_t>1/2$. Again, the problem to overcome is the comparison of values of $c_t$ for different $t$, which seems to be difficult.