A ternary square-free sequence avoiding factors equivalent to $abcacba$

We solve a problem of Petrova, finalizing the classification of letter patterns avoidable by ternary square-free words; we show that there is a ternary square-free word avoiding letter pattern $xyzxzyx$. In fact, we: (1) characterize all the (two-way) infinite ternary square-free words avoiding letter pattern $xyzxzyx$; (2) characterize the lexicographically least (one-way) infinite ternary square-free word avoiding letter pattern $xyzxzyx$; (3) show that the number of ternary square-free words of length $n$ avoiding letter pattern $xyzxzyx$ grows exponentially with $n$.


Introduction
A theme in combinatorics on words is pattern avoidance. A word w encounters a word p if f(p) is a factor of w for some non-erasing morphism f. Otherwise w avoids p. A standard question is whether there are infinitely many words over a given finite alphabet Σ, none of which encounters a given pattern p. Equivalently, one asks whether some ω-word over Σ avoids p.
The first problems of this sort were studied by Thue [8,9], who showed that there are infinitely many words over {a, b, c} which are square-free, i.e., do not encounter xx. He also showed that over {a, b} there are infinitely many overlap-free words, which simultaneously avoid xxx and xyxyx. Thue also introduced a variation on pattern avoidance by asking whether one could simultaneously avoid squares xx and factors from a finite set. For example, Thue showed that infinitely many words over {a, b, c} avoid squares, and also have no factors aba or cbc.
In combinatorics, once an existence problem has been solved, it is natural to consider stronger questions: characterizations, enumeration problems and extremal problems. Since Thue, progressively stronger questions about pattern-avoiding sequences have been asked and answered:
• Gottschalk and Hedlund [4] characterized the doubly infinite binary words avoiding overlaps.
• How many square-free words of length n are there over {a, b, c}? The number of such words was shown to grow exponentially by Brandenburg [2].
• Let w be the lexicographically least square-free ω-word over {a, b, c}. As the author [1] has pointed out, the method of Shelton [7] allows one to test whether a given finite word over {a, b, c} is a prefix of w.
Interest in words avoiding patterns continues, and a recent paper by Petrova [6] studied letter pattern avoidance by ternary square-free words. A word w over {1, 2, 3} avoids the letter pattern P ∈ {x, y, z}* if no factor of w is an image of P under an injection from {x, y, z} to {1, 2, 3}. For example, to avoid the letter pattern xyzxzyx, a word w cannot contain any of the factors 1231321, 1321231, 2132312, 2312132, 3123213 and 3213123. Regarding this particular letter pattern, Petrova remarks at the end of her paper that '(p)roving its avoidance will finalize the classification of letter patterns avoidable by ternary square-free words.'
In this note, we show that there is a ternary square-free word avoiding letter pattern xyzxzyx. In fact, we
• characterize all the (two-way) infinite ternary square-free words avoiding letter pattern xyzxzyx;
• characterize the lexicographically least (one-way) infinite ternary square-free word avoiding letter pattern xyzxzyx;
• show that the number of ternary square-free words of length n avoiding letter pattern xyzxzyx grows exponentially with n.

Preliminaries
We will assume standard notations from combinatorics on words. For reference see the books by Lothaire [3,5]. In particular, a word is square-free if it has no non-empty factor xx. Let S = {1, 2, 3}, T = {a, b, c, d} and U = {a, c, d}. For an alphabet Σ, we denote by Σ* the set of all finite words over Σ; by Σ^ω we denote the ω-words over Σ, which are infinite to the right; by Σ^Z we denote the Z-words over Σ, which are doubly infinite. Depending on context, a 'word' over Σ may refer to a finite word, an ω-word or a Z-word.
We put natural orders on alphabets S, T and U: 1 < 2 < 3 and a < b < c < d.
These induce lexicographic orders on words over these alphabets; the definition is recursive: if w is a word and x, y are letters, then wx < wy if and only if x < y. Call a word over S factor-good if it has no factor of the form xyzxzyx where {x, y, z} = S; i.e., the factors 1231321, 1321231, 2132312, 2312132, 3123213, 3213123 are forbidden. Call a word over S good if it is square-free and factor-good. Petrova's question is whether there are infinitely many good words.
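The definitions of square-free, factor-good and good can be tested mechanically on finite words. The following is a minimal Python sketch; the function names are hypothetical and chosen here for illustration, and the six forbidden factors are generated as the images of xyzxzyx under the bijections from {x, y, z} to S.

```python
from itertools import permutations

def is_square_free(w: str) -> bool:
    """Return True if w has no non-empty factor of the form xx."""
    n = len(w)
    for i in range(n):
        for half in range(1, (n - i) // 2 + 1):
            if w[i:i + half] == w[i + half:i + 2 * half]:
                return False
    return True

# Images of xyzxzyx under every bijection {x, y, z} -> {1, 2, 3}.
FORBIDDEN = sorted(
    "".join(p[i] for i in (0, 1, 2, 0, 2, 1, 0)) for p in permutations("123")
)

def is_factor_good(w: str) -> bool:
    """Return True if w contains none of the six forbidden factors."""
    return not any(x in w for x in FORBIDDEN)

def is_good(w: str) -> bool:
    """A word over {1, 2, 3} is good if it is square-free and factor-good."""
    return is_square_free(w) and is_factor_good(w)
```

For instance, `is_good("12131231323")` holds, while `is_good("1231321")` fails because the word is itself a forbidden factor.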

Results on good words
Theorem 3 and Theorem 4 below characterize good Z-words. These turn out to be in 2-to-1 correspondence with square-free Z-words over U.
Let f: T* → S* be the morphism given by
f(a) = 1213, f(b) = 123, f(c) = 1323, f(d) = 1232.
Let g: U* → T* be the map where g(u) is obtained from a word u ∈ {a, c, d}* by replacing each factor ac of u by abc, each factor da of u by dba and each factor dc of u by dbc.
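The maps f and g can be sketched in a few lines of Python. The images of f used below are an assumption: they are the values forced by identities appearing later in the text, such as f(abc) = 12131231323, f(g(qa)d) = f(g(qa))1232 and f(g(qca)) = f(g(qc))1213. The function names are likewise chosen for illustration.

```python
# Assumed images of the letters of T under f (inferred from later identities).
F_IMAGES = {"a": "1213", "b": "123", "c": "1323", "d": "1232"}

def f(v: str) -> str:
    """Apply the morphism f letter by letter to a word over {a, b, c, d}."""
    return "".join(F_IMAGES[x] for x in v)

def g(u: str) -> str:
    """Insert a b inside every factor ac, da or dc of a word over {a, c, d}."""
    out = []
    for i, x in enumerate(u):
        if i > 0 and u[i - 1] + x in ("ac", "da", "dc"):
            out.append("b")
        out.append(x)
    return "".join(out)
```

With these definitions, `f(g("ac"))` evaluates to `"12131231323"`, and `g("dac")` evaluates to `"dbabc"`, since both da and ac receive an inserted b.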
Theorem 4. Let w ∈ S^Z be good. Exactly one of the following is true:
(1) There is a square-free word u ∈ U^Z such that w = f(g(u)).
(2) There is a square-free word u ∈ U^Z such that w = π(f(g(u))).
We can also characterize the lexicographically least good ω-word:
There are 'many' finite good words, in the sense that the number of words grows exponentially with length. For each non-negative integer n, let G(n) be the number of good words of length n.
Theorem 6. The number of good words of length n grows exponentially with n. In particular, there are positive constants A, B and C > 1 such that
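For small n, the quantity G(n) can be computed by brute force, extending good words one letter at a time. The sketch below is illustrative only and says nothing about the constants in Theorem 6; the helper names are hypothetical. Because each word is built from a good prefix, only squares and forbidden factors ending at the last letter need to be checked.

```python
FORBIDDEN = ("1231321", "1321231", "2132312", "2312132", "3123213", "3213123")

def stays_good(w: str) -> bool:
    """Assuming w[:-1] is good, report whether w is still good."""
    n = len(w)
    for half in range(1, n // 2 + 1):
        if w[n - 2 * half:n - half] == w[n - half:]:
            return False  # a square ends at the last letter
    return w[-7:] not in FORBIDDEN

def count_good(max_n: int) -> dict:
    """Return {n: G(n)} for 1 <= n <= max_n by level-by-level extension."""
    counts, level = {}, [""]
    for n in range(1, max_n + 1):
        level = [w + x for w in level for x in "123" if stays_good(w + x)]
        counts[n] = len(level)
    return counts
```

Since the forbidden factors have length 7, G(n) agrees with the number of ternary square-free words for n ≤ 6; the two counts first differ at n = 7, where the six forbidden factors (each of which is square-free) are removed.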

Proof of Theorem 4
The proof of Theorem 4 proceeds via a series of lemmas. Suppose that w ∈ Σ^Z is good. Since w is square-free, w ∈ {12, 123, 1232, 13, 132, 1323}^Z.
Proof. If the lemma is false, then either • w contains a finite factor with prefix 1231 and suffix 1321 or • w contains a factor with prefix 1321 and suffix 1231.
Without loss of generality up to relabeling, suppose that w contains a factor with prefix 1231 and suffix 1321. Since it is good, w cannot have 1231321 as a factor. Consider then a shortest factor 1231v1321 of w; thus |1231v1321|_1231 = 1. Exhaustively listing good words 1231u with |1231u|_1231 = 1, we find that there are only finitely many, and exactly three which are maximal with respect to right extension: 12312131232123, 123132312131232123, 12313231232123. It follows that one of these is a right extension of 1231v1321; however, none of the three has 1321 as a factor. This is a contradiction.
Proof. We prove this via a series of claims:
Claim 9. Neither of 132313 and 21232 is a factor of t.
Proof of Claim. Since t ∈ 1213{12, 123, 1232, 13, 1323}^ω, if 132313 is a factor of t, then so is one of 1323131 and 13231323, both of which end in squares. This is impossible, since t is good. Similarly, if 21232 is a factor of t, so is one of 121232 and 12321232, both of which begin with squares.
Proof of Claim. Word u must be 13 or 1323; otherwise, 12u begins with the square 1212. Suppose u = 1323. By the previous claim, v must have prefix 12. But then 2uv has prefix 2132312 = xyzxzyx, where x = 2, y = 1, z = 3; this is impossible. Thus u = 13.
Proof of Claim. Word u must end with 2; otherwise, u13v contains the square 3131. Thus u must be 12 or 1232. Suppose u = 1232. By the first claim, t must have suffix 3. But then tu13 has suffix 3123213 = xyzxzyx, where x = 3, y = 1, z = 2; this is impossible. Thus u = 12.
We have proved that 12 and 13 only appear in t in the context 1213. It follows that t ∈ {1213, 123, 1232, 1323}^ω.
Proof. We know that w ∈ {12, 123, 1232, 13, 1323}^Z. If neither of 121 and 131 is a factor of w, then w is concatenated from copies of A = 1323, B = 1232 and C = 123. However, CB and AC1 contain squares, while BA12 contains 2132312, which cannot be a factor of a good word. This implies that A, B and C always occur in w in the cyclical order A → B → C → A.
Suppose that w is a factor of v; this implies that w can be walked on the directed graph D. If w does not begin or end with b, then w = g(h(w)).
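The three small facts about the blocks A = 1323, B = 1232 and C = 123 used in the argument above can be confirmed directly. The following Python sketch is only a convenience check; the helper `find_square` is a hypothetical name introduced here.

```python
def find_square(w: str):
    """Return some non-empty square factor xx of w, or None if w is square-free."""
    n = len(w)
    for i in range(n):
        for half in range(1, (n - i) // 2 + 1):
            if w[i:i + half] == w[i + half:i + 2 * half]:
                return w[i:i + 2 * half]
    return None

A, B, C = "1323", "1232", "123"

square_in_CB = find_square(C + B)          # a square inside CB
square_in_AC1 = find_square(A + C + "1")   # a square inside AC1
BA12_is_bad = "2132312" in B + A + "12"    # forbidden factor inside BA12
```

Here `square_in_CB` is `"123123"`, `square_in_AC1` is `"231231"`, and `BA12_is_bad` is `True`.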
Word u must be square-free; otherwise its image w contains a square. Thus the first alternative in Theorem 4 holds.
The other situation occurs if we decide, after Lemma 7, that |w|_1231 = 0. As we remarked at that point in our argument, this amounts to interchanging 2's and 3's, i.e., applying π. In such a case, we find that w = π(f(g(u))).
This completes the proof of Theorem 4.
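The exhaustive listing invoked in the proof of Lemma 7 (good words 1231u containing 1231 only as a prefix) is a small finite search. A possible Python sketch follows; the function names and the safety `limit` are assumptions made here, not part of the original argument. The search extends words one letter at a time and records those that admit no further extension.

```python
FORBIDDEN = ("1231321", "1321231", "2132312", "2312132", "3123213", "3213123")

def stays_good(w: str) -> bool:
    """Assuming w[:-1] is good, report whether w is still good."""
    n = len(w)
    for half in range(1, n // 2 + 1):
        if w[n - 2 * half:n - half] == w[n - half:]:
            return False
    return w[-7:] not in FORBIDDEN

def maximal_words(start: str = "1231", limit: int = 100) -> list:
    """List good words with prefix `start`, containing `start` exactly once,
    that cannot be extended to the right."""
    maximal, stack = [], [start]
    while stack:
        w = stack.pop()
        if len(w) > limit:
            raise RuntimeError("search did not terminate within the limit")
        children = [
            w + x for x in "123"
            # find(start, 1) detects any second occurrence of `start`
            if stays_good(w + x) and (w + x).find(start, 1) == -1
        ]
        if children:
            stack.extend(children)
        else:
            maximal.append(w)
    return sorted(maximal)
```

Running `maximal_words()` returns the three words listed in the proof, and none of them contains 1321.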

Proof of Theorem 3
Proof of Lemma 1. Let w = f(g(u)). Each length 7 factor of w is a factor of f(g(u′)) for some factor u′ ∈ U^3 of u. A finite check establishes that f(g(u′)) is factor-good for each u′ ∈ U^3.
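The finite check in the proof of Lemma 1 ranges over the 27 words of U^3. A short Python sketch is given below; it assumes the letter images f(a) = 1213, f(b) = 123, f(c) = 1323, f(d) = 1232, which are the values consistent with identities such as f(abc) = 12131231323, and the function names are illustrative.

```python
from itertools import product

FORBIDDEN = ("1231321", "1321231", "2132312", "2312132", "3123213", "3213123")
F_IMAGES = {"a": "1213", "b": "123", "c": "1323", "d": "1232"}  # assumed images

def f(v: str) -> str:
    return "".join(F_IMAGES[x] for x in v)

def g(u: str) -> str:
    """Insert b inside each factor ac, da, dc."""
    out = []
    for i, x in enumerate(u):
        if i > 0 and u[i - 1] + x in ("ac", "da", "dc"):
            out.append("b")
        out.append(x)
    return "".join(out)

def is_factor_good(w: str) -> bool:
    return not any(x in w for x in FORBIDDEN)

# Words u' in U^3 whose image f(g(u')) fails to be factor-good (expected: none).
failures = [
    "".join(t) for t in product("acd", repeat=3)
    if not is_factor_good(f(g("".join(t))))
]
```

The list `failures` comes out empty. The inserted letter b matters here: without g, the image f(cdc) = 132312321323 contains the forbidden factor 3123213.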
Proof of Lemma 2. Suppose, for the sake of contradiction, that XX is a non-empty square in w = f(v). If |X| ≤ 2, then XX is a factor of f(v′) for some factor v′ of v with |v′| = 2. However, we need only consider v′ ∈ {ab, ad, ba, bc, ca, cd, db}.
(We can walk v′ on D.) In each case, we check that f(v′) is square-free. From now on, then, suppose that |X| ≥ 3; in this case we can write
From the definition of g and the fact that v_n = b, it follows that v_{n−1} v_n v_1 ∈ {abc, dba, dbc}.
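The case |X| ≤ 2 in the proof of Lemma 2 reduces to checking seven short words. A minimal Python sketch of that check is shown here, again assuming the letter images f(a) = 1213, f(b) = 123, f(c) = 1323, f(d) = 1232 inferred from the identities used elsewhere in the text; the names are illustrative.

```python
F_IMAGES = {"a": "1213", "b": "123", "c": "1323", "d": "1232"}  # assumed images

def f(v: str) -> str:
    return "".join(F_IMAGES[x] for x in v)

def is_square_free(w: str) -> bool:
    """Return True if w has no non-empty factor of the form xx."""
    n = len(w)
    for i in range(n):
        for half in range(1, (n - i) // 2 + 1):
            if w[i:i + half] == w[i + half:i + 2 * half]:
                return False
    return True

# The two-letter words v' that need to be considered.
PAIRS = ["ab", "ad", "ba", "bc", "ca", "cd", "db"]

pair_results = {v: is_square_free(f(v)) for v in PAIRS}
```

Every entry of `pair_results` is `True`, in agreement with the claim that each f(v′) is square-free.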

Proof of Theorem 5
Let u be the lexicographically least square-free ω-word over U = {a, c, d}, and let t = f(g(u)). It follows that u has prefix ac, so that t has prefix p = f(g(ac)) = f(abc) = 12131231323. A finite search shows that p is the lexicographically least good word of length 11. It will therefore suffice to show that t is the lexicographically least good ω-word with prefix p.
Suppose that t_1 is a good ω-word with prefix p. By Lemma 7, it follows that |t_1|_1321 = 0, and from the proof of Theorem 4, we conclude that t_1 = f(g(u_1)), for some square-free word u_1. It remains to show that u_1 is lexicographically greater than or equal to u. Suppose not.
Since t_1 has prefix p, word ac must be a prefix of u_1, and u, u_1 agree on a prefix of length at least 2. Let qrs and qrt be prefixes of u_1 and u, respectively, where r, s, t ∈ {a, c, d}, and s is lexicographically less than t.
• If r = a, then we cannot have s = a, since u_1 is square-free. We therefore must have s = c and t = d. It follows that t_1 has prefix f(g(qa)bc) = f(g(qa))1231323, and t has prefix f(g(qa)d) = f(g(qa))1232, and we see that t_1 is lexicographically less than t. This contradicts the minimality of t.
• If r = c, then we must have s = a and t = d. It follows that t_1 has prefix f(g(qca)) = f(g(qc))1213, and t has prefix f(g(qcd)) = f(g(qc))1232, and again t_1 is lexicographically less than t, giving a contradiction.
• If r = d, then we must have s = a and t = c. It follows that t_1 has prefix f(g(qd)ba) = f(g(qd))1231213, and t has prefix f(g(qd)bc) = f(g(qd))1231323, and again t_1 is lexicographically less than t.
We conclude that u_1 is lexicographically greater than or equal to u, and u is the lexicographically least square-free ω-word over U, as claimed.