Doubled patterns are $3$-avoidable

In combinatorics on words, a word $w$ over an alphabet $\Sigma$ is said to avoid a pattern $p$ over an alphabet $\Delta$ if there is no factor $f$ of $w$ such that $f=h(p)$ where $h:\Delta^*\to\Sigma^*$ is a non-erasing morphism. A pattern $p$ is said to be $k$-avoidable if there exists an infinite word over a $k$-letter alphabet that avoids $p$. A pattern is said to be doubled if no variable occurs only once. Doubled patterns with at most 3 variables and patterns with at least 6 variables are $3$-avoidable. We show that doubled patterns with 4 and 5 variables are also $3$-avoidable.


Introduction
A pattern p is a non-empty word over an alphabet ∆ = {A, B, C, ...} of capital letters called variables. An occurrence of p in a word w is a non-erasing morphism h : ∆* → Σ* such that h(p) is a factor of w. The avoidability index λ(p) of a pattern p is the size of the smallest alphabet Σ such that there exists an infinite word w over Σ containing no occurrence of p. Bean, Ehrenfeucht, and McNulty [2] and Zimin [13] characterized unavoidable patterns, i.e., patterns p such that λ(p) = ∞. We say that a pattern p is t-avoidable if λ(p) ≤ t. For more information on pattern avoidability, we refer to Chapter 3 of Lothaire's book [8].
It follows from their characterization that every unavoidable pattern contains a variable that occurs once. Equivalently, every doubled pattern is avoidable. Our result is the following.
Theorem 1. Every doubled pattern is 3-avoidable.
Let v(p) be the number of distinct variables of the pattern p. For v(p) ≤ 3, Cassaigne [5] began and I [9] finished the determination of the avoidability index of every pattern with at most 3 variables. It implies in particular that every avoidable pattern with at most 3 variables is 3-avoidable. Moreover, Bell and Goh [3] obtained that every doubled pattern p such that v(p) ≥ 6 is 3-avoidable.
Therefore, as noticed in the conclusion of [10], it remains to prove Theorem 1 for every pattern p such that 4 ≤ v(p) ≤ 5. In this paper, we use both constructions of infinite words and a non-constructive method to settle the cases 4 ≤ v(p) ≤ 5.
As noticed in these papers, if p has length at least $2^{v(p)}$ then p contains a doubled pattern as a factor. Thus, Theorem 1 implies Theorem 2.(b).
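The definitions of pattern occurrence and doubled pattern can be made concrete with a small brute-force check. The following is a minimal sketch, not part of the proofs; the function names `contains_pattern` and `is_doubled` are hypothetical, and the search is exponential in the worst case, so it is only meant for short words.

```python
def contains_pattern(word: str, pattern: str) -> bool:
    """Return True if some factor of `word` equals h(pattern) for a
    non-erasing morphism h (every variable maps to a non-empty word)."""

    def match(pos: int, k: int, images: dict) -> bool:
        # Every variable occurrence of the pattern has been matched.
        if k == len(pattern):
            return True
        var = pattern[k]
        if var in images:
            # The image of this variable is already fixed: it must reappear here.
            img = images[var]
            return word.startswith(img, pos) and match(pos + len(img), k + 1, images)
        # First occurrence of the variable: try every non-empty image.
        for end in range(pos + 1, len(word) + 1):
            images[var] = word[pos:end]
            if match(end, k + 1, images):
                return True
            del images[var]
        return False

    # A factor may start at any position of the word.
    return any(match(start, 0, {}) for start in range(len(word)))


def is_doubled(pattern: str) -> bool:
    """A pattern is doubled if no variable occurs only once."""
    return all(pattern.count(v) >= 2 for v in set(pattern))
```

For instance, `contains_pattern("abcabc", "AA")` holds because the whole word is a square, while `is_doubled("ABCA")` fails because B and C occur only once. Distinct variables are allowed to share the same image, as in the definition of a morphism.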

Extending the power series method
In this section, we borrow an idea from the entropy compression method to extend the power series method as used by Bell and Goh [3], Rampersad [12], and Blanchet-Sadri and Woodhouse [4].
Let us describe the method. Let $L \subset \Sigma_m^*$ be a factorial language defined by a set $F$ of forbidden factors of length at least 2. We denote the factor complexity of $L$ by $n_i = |L \cap \Sigma_m^i|$. We define $L'$ as the set of words $w$ such that $w$ is not in $L$ and the prefix of length $|w| - 1$ of $w$ is in $L$. For every forbidden factor $f \in F$, we choose a number $1 \le s_f \le |f|$. Then, for every $i \ge 1$, we define an integer $a_i$ such that
$$a_i \ge \max_{u \in L} \left| \left\{ v \in \Sigma_m^i : uv \in L',\ uv = bf \text{ for some word } b,\ f \in F,\ s_f = i \right\} \right|.$$
We consider the formal power series $P(x) = 1 - mx + \sum_{i \ge 1} a_i x^i$. If $P(x)$ has a positive real root $x_0$, then $n_i \ge x_0^{-i}$ for every $i \ge 0$.
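Let us rewrite $P(x_0) = 1 - m x_0 + \sum_{i \ge 1} a_i x_0^i = 0$ as
$$m - \sum_{i \ge 1} a_i x_0^{i-1} = x_0^{-1}. \qquad (1)$$
Since $n_0 = 1$, we will prove by induction that $\frac{n_i}{n_{i-1}} \ge x_0^{-1}$ in order to obtain that $n_i \ge x_0^{-i}$ for every $i \ge 0$. By using (1), we obtain the base case: $\frac{n_1}{n_0} = n_1 = m \ge x_0^{-1}$. Now, for every length $i \ge 1$, there are $m^i$ words in $\Sigma_m^i$, $n_i$ words in $L$, at most $\sum_{1 \le j \le i} n_{i-j} a_j$ words in $L'$, and $m(m^{i-1} - n_{i-1})$ words in $\Sigma_m^i \setminus (L \cup L')$. This gives $n_i + \sum_{1 \le j \le i} n_{i-j} a_j + m(m^{i-1} - n_{i-1}) \ge m^i$, that is, $n_i \ge m\, n_{i-1} - \sum_{1 \le j \le i} n_{i-j} a_j$. Dividing by $n_{i-1}$ and using the induction hypothesis in the form $\frac{n_{i-j}}{n_{i-1}} \le x_0^{j-1}$, we obtain
$$\frac{n_i}{n_{i-1}} \ge m - \sum_{1 \le j \le i} a_j x_0^{j-1} \ge m - \sum_{j \ge 1} a_j x_0^{j-1} = x_0^{-1}.$$
The power series method used in previous papers [3,4,12] corresponds to the special case such that $s_f = |f|$ for every forbidden factor. Our condition is that $P(x) = 0$ for some $x > 0$ whereas the condition in these papers is that every coefficient of the series expansion of $\frac{1}{P(x)}$ is positive. The two conditions are actually equivalent. The result in [11] concerns series of the form $S(x) = 1 + a_1 x + a_2 x^2 + a_3 x^3 + \dots$ with real coefficients such that $a_1 < 0$ and $a_i \ge 0$ for every $i \ge 2$. It states that every coefficient of the series $\frac{1}{S(x)}$ is positive if and only if $S(x)$ has a positive real root.
The entropy compression method as developed by Gonçalves, Montassier, and Pinlou [6] uses a condition equivalent to $P(x) = 0$. The benefit of the present method is that we get an exponential lower bound on the factor complexity. It is not clear whether it is possible to get such a lower bound when using entropy compression for graph coloring, since words have a simpler structure than graphs.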
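The link between the root condition and the coefficients of $\frac{1}{P(x)}$ can be observed numerically. The sketch below expands $\frac{1}{P(x)}$ with the recurrence $b_i = m\,b_{i-1} - \sum_{1 \le j \le i} a_j b_{i-j}$, which is the equality case of the counting inequality for $n_i$. The series used here, $P(x) = 1 - 3x + \left(\frac{x}{1-x}\right)^4$, is an illustrative assumption, chosen because its smallest positive root $\frac{3-\sqrt{5}}{2} = 0.3819\ldots$ has a closed form; the function name is hypothetical.

```python
from math import comb, sqrt


def inverse_series_coefficients(m: int, a: list, n_terms: int) -> list:
    """Coefficients b_0, ..., b_{n_terms-1} of 1/P(x), where
    P(x) = 1 - m*x + sum_{j>=1} a[j] * x**j (a[0] is unused)."""
    b = [1]
    for i in range(1, n_terms):
        # Coefficient of x^i in P(x) * (1/P(x)) must vanish for i >= 1.
        total = m * b[i - 1]
        for j in range(1, min(i, len(a) - 1) + 1):
            total -= a[j] * b[i - j]
        b.append(total)
    return b


N = 40
# (x/(1-x))^4 = sum_{j>=4} C(j-1, 3) x^j: compositions of j into 4 positive parts.
a = [0] + [comb(j - 1, 3) for j in range(1, N)]
b = inverse_series_coefficients(3, a, N)
x0 = (3 - sqrt(5)) / 2  # smallest positive root of the illustrative P(x)

print(b[:6])
print(all(b[i] >= x0 ** (-i) for i in range(N)))
```

In this example every computed coefficient is positive and dominates $x_0^{-i}$, in line with the equivalence stated above; this is a finite numerical check and not a proof.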

Applying the method
In this section, we show that some doubled patterns on 4 and 5 variables are 3-avoidable. Given a pattern p, every occurrence f of p is a forbidden factor. With an abuse of notation, we denote by |A| the length of the image of the variable A of p in the occurrence f. This notation is used to define the length $s_f$.
Let us first consider doubled patterns with 4 variables. We begin with patterns of length 9, so that one variable, say A, appears 3 times. We set $s_f = |f|$. Using the obvious upper bound on the number of pattern occurrences, we obtain
$$P(x) = 1 - 3x + \sum_{a,b,c,d \ge 1} 3^{a+b+c+d} x^{3a+2b+2c+2d} = 1 - 3x + \frac{3x^3}{1-3x^3} \left( \frac{3x^2}{1-3x^2} \right)^3.$$
Then $P(x)$ admits $x_0 = 0.3400\ldots$ as its smallest positive real root. So, every doubled pattern p with 4 variables and length 9 is 3-avoidable and there exist at least $x_0^{-n} > 2.941^n$ ternary words of length $n$ avoiding p. Notice that for patterns with 4 variables and length at least 10, every term of $\sum_{a,b,c,d \ge 1} 3^{a+b+c+d} x^{3a+2b+2c+2d}$ in $P(x)$ gets multiplied by some positive power of $x$. Since $0 < x < 1$, every term is now smaller than in the previous case. So $P(x)$ admits a smallest positive real root that is smaller than $0.3400\ldots$ Thus, these patterns are also 3-avoidable.
Now, we consider patterns with length 8, so that every variable appears exactly twice. If such a pattern has ABCD as a prefix, then we set $s_f = \frac{|f|}{2} = |A| + |B| + |C| + |D|$, so that
$$P(x) = 1 - 3x + \sum_{a,b,c,d \ge 1} x^{a+b+c+d} = 1 - 3x + \left( \frac{x}{1-x} \right)^4.$$
Then $P(x)$ admits $0.3819\ldots$ as its smallest positive real root, so that this pattern is 3-avoidable. Among the remaining patterns, we rule out patterns containing an occurrence of a doubled pattern with at most 3 variables. Also, if one pattern is the reverse of another, then they have the same avoidability index and we consider only one of the two. Thus, there remain the following patterns: ABACBDCD, ABACDBDC, ABACDCBD, ABCADBDC, ABCADCBD, ABCADCDB, and ABCBDADC.
Now we consider doubled patterns with 5 variables. Similarly, we rule out every pattern of length at least 11 with the method by setting $s_f = |f|$. Then we check that
$$P(x) = 1 - 3x + \sum_{a,b,c,d,e \ge 1} 3^{a+b+c+d+e} x^{3a+2b+2c+2d+2e} = 1 - 3x + \frac{3x^3}{1-3x^3} \left( \frac{3x^2}{1-3x^2} \right)^4$$
has a positive real root.
We also rule out every pattern of length 10 having ABC as a prefix. We set $s_f = |f| - |ABC| = |A| + |B| + |C| + 2|D| + 2|E|$. Then we check that
$$P(x) = 1 - 3x + \sum_{a,b,c,d,e \ge 1} 3^{d+e} x^{a+b+c+2d+2e} = 1 - 3x + \left( \frac{x}{1-x} \right)^3 \left( \frac{3x^2}{1-3x^2} \right)^2$$
has a positive real root. Again, we rule out patterns containing an occurrence of a doubled pattern with at most 4 variables and patterns whose reversed pattern is already considered. Thus, there remain the following patterns: ABACBDCEDE, ABACDBCEDE, and ABACDBDECE.
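The roots quoted in this section can be checked numerically. The following sketch assumes closed forms for the four series obtained by summing the geometric series $\sum_{t \ge 1} 3^t x^{et} = \frac{3x^e}{1-3x^e}$ and $\sum_{t \ge 1} x^t = \frac{x}{1-x}$; the closed forms for the prefix cases and for 5 variables are assumptions of this sketch, and all function names are hypothetical. A grid scan followed by bisection is a heuristic that could miss two very close roots, so this is a sanity check rather than a proof.

```python
def geo3(x: float, e: int) -> float:
    """Sum over t >= 1 of 3^t x^(e t); converges when 3 x^e < 1."""
    return 3 * x ** e / (1 - 3 * x ** e)


def geo1(x: float) -> float:
    """Sum over t >= 1 of x^t; converges when x < 1."""
    return x / (1 - x)


def P4_len9(x):    # 4 variables, length 9, s_f = |f|
    return 1 - 3 * x + geo3(x, 3) * geo3(x, 2) ** 3


def P4_prefix(x):  # 4 variables, length 8, prefix ABCD, s_f = |f| / 2
    return 1 - 3 * x + geo1(x) ** 4


def P5_len11(x):   # 5 variables, length 11, s_f = |f|
    return 1 - 3 * x + geo3(x, 3) * geo3(x, 2) ** 4


def P5_prefix(x):  # 5 variables, length 10, prefix ABC, s_f = |f| - |ABC|
    return 1 - 3 * x + geo1(x) ** 3 * geo3(x, 2) ** 2


def smallest_positive_root(P, upper=0.57, steps=20000, iters=80):
    """Scan (0, upper) for the first sign change of P, then bisect.
    The default upper bound stays below 1/sqrt(3), where all series converge."""
    prev = 0.0  # P(0) = 1 > 0 for all the series above
    for k in range(1, steps):
        cur = upper * k / steps
        if P(cur) <= 0:
            lo, hi = prev, cur
            for _ in range(iters):
                mid = (lo + hi) / 2
                if P(mid) > 0:
                    lo = mid
                else:
                    hi = mid
            return (lo + hi) / 2
        prev = cur
    return None


for name, P in [("P4_len9", P4_len9), ("P4_prefix", P4_prefix),
                ("P5_len11", P5_len11), ("P5_prefix", P5_prefix)]:
    print(name, smallest_positive_root(P))
```

Under these assumptions, the first two roots agree with the values $0.3400\ldots$ and $0.3819\ldots$ given in the text, and the two 5-variable series do have a positive real root.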

Sporadic doubled patterns
In this section, we consider the 10 doubled patterns on 4 and 5 variables whose 3-avoidability has not been obtained in the previous section.
We define the avoidability exponent AE(p) of a pattern p as the largest real x such that every x-free word avoids p. This notion is not pertinent e.g. for the pattern ABWBAXACYCAZBC studied by Baker, McNulty, and Taylor [1], since for every ε > 0, there exists a (1 + ε)-free word containing an occurrence of that pattern. However, AE(p) > 1 for every doubled pattern. To see that, consider a factor A...A of p. If an x-free word contains an occurrence of p, then the image of this factor is a repetition such that the image of A cannot be too large compared to the images of the variables occurring between the As in p. We have similar constraints for every variable and this set of constraints becomes unsatisfiable as x decreases towards 1.
We present one way of obtaining the avoidability exponent for a doubled pattern p of length 2v(p). We construct the v(p) × v(p) matrix $M$ such that $M_{i,j}$ is the number of occurrences of the variable $X_j$ between the two occurrences of the variable $X_i$. We compute the largest eigenvalue β of $M$ and then we have $AE(p) = 1 + \frac{1}{\beta+1}$. For example, if p = ABACDCBD, then we get
$$M = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 2 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 1 & 1 & 0 \end{pmatrix},$$
so that $\beta = 1.9403\ldots$ and $AE(p) = 1.3400\ldots$
For each of the 10 patterns p considered in this section, we use a morphism $m : \Sigma_5^* \to \Sigma_2^*$ such that for every $(5/4^+)$-free word $w \in \Sigma_5^*$, we have that $m(w)$ avoids p. The proof that p is avoided follows the method in [9]. Since there exist exponentially many $(5/4^+)$-free words over $\Sigma_5$ [7], there exist exponentially many binary words avoiding p.
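The matrix construction described above is easy to mechanize. The sketch below builds $M$ from a pattern in which every variable occurs exactly twice, with variables ordered alphabetically (an assumption of this sketch), and estimates the largest eigenvalue by power iteration on $M + I$. The shift by the identity is a design choice of this sketch to avoid periodicity; it assumes $M$ is non-negative and irreducible, which holds for the example ABACDCBD. All function names are hypothetical.

```python
def between_matrix(pattern: str) -> list:
    """M[i][j] = number of occurrences of the variable X_j strictly between
    the two occurrences of X_i (each variable must occur exactly twice)."""
    variables = sorted(set(pattern))
    index = {v: i for i, v in enumerate(variables)}
    M = [[0] * len(variables) for _ in variables]
    for v in variables:
        first = pattern.index(v)
        second = pattern.index(v, first + 1)
        for other in pattern[first + 1:second]:
            M[index[v]][index[other]] += 1
    return M


def largest_eigenvalue(M: list, iters: int = 500) -> float:
    """Power iteration on M + I, then subtract the shift of 1."""
    n = len(M)
    v = [1.0] * n
    mu = 1.0
    for _ in range(iters):
        # w = (M + I) v, then renormalize so that max(v) = 1.
        w = [v[i] + sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        mu = max(w)
        v = [x / mu for x in w]
    return mu - 1.0


def avoidability_exponent(pattern: str) -> float:
    """AE(p) = 1 + 1 / (beta + 1), with beta the largest eigenvalue of M."""
    beta = largest_eigenvalue(between_matrix(pattern))
    return 1.0 + 1.0 / (beta + 1.0)


print(between_matrix("ABACDCBD"))
print(avoidability_exponent("ABACDCBD"))
```

For p = ABACDCBD, the characteristic polynomial of $M$ is $\lambda^4 - 3\lambda^2 - 2\lambda + 1$, so the computed β can be cross-checked by substituting it into this polynomial.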

Simultaneous avoidance of doubled patterns
Bell and Goh [3] have also considered the avoidance of multiple patterns simultaneously and asked (Question 3) whether there exists an infinite word over a finite alphabet that avoids every doubled pattern. We give a negative answer.
A word $w$ is n-splitted if $|w| \equiv 0 \pmod{n}$ and every factor $w_i$ such that $w = w_1 w_2 \dots w_n$ and $|w_i| = \frac{|w|}{n}$ for $1 \le i \le n$ contains every letter in $w$. An n-splitted pattern is defined similarly. Let us prove by induction on $k$ that every word $w \in \Sigma_k^{n^k}$ contains an n-splitted factor. The assertion is true for $k = 1$. Now, if the word $w \in \Sigma_k^{n^k}$ is not itself n-splitted, then by definition it must contain a factor $w_i$ that does not contain every letter of $w$. So we have $w_i \in \Sigma_{k-1}^{n^{k-1}}$. By induction, $w_i$ contains an n-splitted factor, and so does $w$. This implies that for every fixed $n$, every infinite word over a finite alphabet contains n-splitted factors. Moreover, an n-splitted word is an occurrence of an n-splitted pattern such that every variable has a distinct image of length 1. So, for every fixed $n$, the set of all n-splitted patterns is not avoidable by an infinite word over a finite alphabet.
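The induction above can be checked exhaustively for small parameters. The following is a minimal sketch with hypothetical function names; it enumerates all words of length $n^k$ over $k$ letters, so it is only practical for very small $n$ and $k$.

```python
from itertools import product


def is_n_splitted(w: str, n: int) -> bool:
    """True if |w| is a positive multiple of n and each of the n consecutive
    blocks of length |w| / n contains every letter occurring in w."""
    if not w or len(w) % n != 0:
        return False
    block = len(w) // n
    letters = set(w)
    return all(set(w[i * block:(i + 1) * block]) == letters for i in range(n))


def find_n_splitted_factor(w: str, n: int):
    """Return a shortest n-splitted factor of w, or None if there is none."""
    for length in range(n, len(w) + 1, n):
        for start in range(len(w) - length + 1):
            factor = w[start:start + length]
            if is_n_splitted(factor, n):
                return factor
    return None


def check_claim(n: int, k: int) -> bool:
    """Check that every word of length n^k over k letters has an n-splitted factor."""
    alphabet = "abcdefghij"[:k]
    return all(find_n_splitted_factor("".join(t), n) is not None
               for t in product(alphabet, repeat=n ** k))


print(find_n_splitted_factor("aabb", 2))
print(check_claim(2, 2), check_claim(3, 2), check_claim(2, 3))
```

The word aabb is not itself 2-splitted, but its factor aa is, which mirrors the induction step: a block missing a letter is a word over a smaller alphabet.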

Conclusion
Our results settle the first of the two questions of our previous paper [10]. The second question is whether there exists a finite k such that every doubled pattern with at least k variables is 2-avoidable. As already noticed [10], such a k is at least 5 since, e.g., ABCCBADD is not 2-avoidable.