v11i1r8 — Comment by the authors, Jan 5, 2004.
Just a week after publication of our article, Prof. Jan Snellman of Stockholm University brought to our attention an important reference we overlooked, a 1993 article [Sha93] by Prof. Jeffrey O. Shallit of the University of Waterloo. Our Theorem 8 (for the substring problem) should now be thought of as generalizing Shallit's Theorems 1 and 2 from alphabet size d=2 to arbitrary d, with the added observation that for d > 2 the maximizing strings can be taken as prefixes of a single infinite string, while for d=2 this is not possible.
We regret our oversight, perhaps arising from a failure to search on "factors" or "subwords" as well as "substrings", and we thank Prof. Shallit for his understanding.
Prof. Snellman also pointed out a technical error. In the statement of Theorem 8 (as is already correctly elucidated in its proof), k should not be the integer part of log_{d}(n) but rather the integer such that d^{k}+k-1 ≤ n < d^{k+}^{1}+(k+1)-1.
We are very grateful to Prof. Snellman for pointing out the two errors, and pleased that the electronic format of E-JC allows us to redress them promptly.
Readers may also be interested in a recent paper by Svante Janson, Stefano Lonardi, and Wojciech Szpankowski [JLS03] and additional references cited therein.
References
- [JLS03]
- Svante Janson, Stefano Lonardi, and Wojciech Szpankowski, On the average sequence complexity, Tech. Report 2003:41, Uppsala Univ., November 2003.
- [Sha93]
- Jeffrey Shallit, On the maximum number of distinct factors in a binary string, Graphs Combin. 9 (1993), no. 2, 197-200. MR94b:05015
v11i1r8 — Comment by authors, Jul 15, 2004.
Professor Zoltán Kása of Babes-Bolyai University has pointed out that all of our results for substrings were discovered by Professor Antal Iványi of Eötvös Loránd University in 1987 [Ivá87]. As in our Theorem 8, [Ivá87] used de Bruijn sequences to construct words of any alphabet size with the maximal number of distinct substrings, and showed that for alphabet size > 2 there is an infinite word whose n-prefixes suffice, and that for binary alphabets there is not. We are grateful to Kása and Iványi for their correspondence.
The connection between de Bruijn sequences and strings with maximally many distinct substrings is also used by Anisiu, Blázsik and Kása [ABK02] to calculate, for a given string length, the number of strings with maximally many distinct substrings.
Iványi's work [Ivá87] actually addresses "d-substrings", a notion introduced by Hunyadvári and Iványi [HI84] which interpolates between substrings and subsequences as d is varied. A d-substring of a string is a subsequence which can be obtained with gaps of at most d; for d=1 it is a substring and for d=infinity it is a subsequence. Kása [Kás98] also considers d-complexity (the number of d-substrings), focussing on the existence of a string with a given 1-complexity.
We note that if d is at least as large as the alphabet size, the string we construct having a maximum number of distinct subsequences has an equal number of distinct d-substrings, so our result generalizes immediately to d-complexity in that case. For d smaller than the alphabet size, a simple argument (obtained in a discussion with Don Coppersmith, and paralleling the recursion for Fibonacci numbers) shows that the maximum d-complexity of a string of length n is of asymptotic order L^{n}, where both L and the hidden constants depend on d.
References
- [ABK02]
- Mira-Cristiana Anisiu, Zoltán Blázsik, and Zoltán Kása, Maximal complexity of finite words, Pure Math. Appl. 13 (2002), no. 1-2, 39-48, Algebraic systems (Felix-Oradea, 2001). MR 2004e:68083
- [HI84]
- László Hunyadvári and Antal Iványi, On some complexity measures of words, Conference on Automata, Languages, and Mathematical Systems (Salgótarján, Hungary, May 21-23, 1984), no. DM 84-2, Karl Marx Univ. of Economics, Dept. of Mathematics, 1984, pp. 67-82.
- [Ivá87]
- Antal Iványi, On the d-complexity of words, Ann. Univ. Sci. Budapest. Sect. Comput. 8 (1987), 69-90 (1988). MR 90d:68063
- [Kás98]
- Zoltán Kása, On the d-complexity of strings, Pure Math. Appl. 9 (1998), no. 1-2, 119-128, Modern applied analysis (Ilieni, 1997). MR 2000c:68112