### Strings with Maximally Many Distinct Subsequences and Substrings

#### Abstract

A natural problem in extremal combinatorics is to maximize the number of distinct subsequences for any length-$n$ string over a finite alphabet $\Sigma$; this value grows exponentially, but slower than $2^n$. We use the probabilistic method to determine the maximizing string, which is a cyclically repeating string. The number of distinct subsequences is exactly enumerated by a generating function, from which we also derive asymptotic estimates. For the alphabet $\Sigma=\{1,2\}$, $\,(1,2,1,2,\dots)$ has the maximum number of distinct subsequences, namely ${\rm Fib}(n+3)-1 \sim \left((1+\sqrt5)/2\right)^{n+3} \! / \sqrt{5}$.

We also consider the same problem with sub*strings* in lieu of sub*sequences*. Here, we show that an appropriately truncated de Bruijn word attains the maximum. For both problems, we compare the performance of random strings with that of the optimal ones.