On Shuffling of Infinite Square-free Words

In this paper we answer two recent questions from Charlier et al. (2014) and Harju (2013) about self-shuffling words. An infinite word $w$ is called self-shuffling if $w = \prod_{i=0}^{\infty} U_i V_i = \prod_{i=0}^{\infty} U_i = \prod_{i=0}^{\infty} V_i$ for some finite words $U_i$, $V_i$. Harju recently asked whether square-free self-shuffling words exist. We answer this question affirmatively. Besides that, we build an infinite word such that no word in its shift orbit closure is self-shuffling, answering positively a question of Charlier et al.


Introduction
A self-shuffling word, a notion recently introduced by Charlier et al. [2], is an infinite word that can be reproduced by shuffling it with itself. More formally, given two infinite words $x, y \in \Sigma^\omega$ over a finite alphabet $\Sigma$, we define $S(x, y) \subseteq \Sigma^\omega$ to be the collection of all infinite words $z$ for which there exists a factorization $z = \prod_{i=0}^{\infty} U_i V_i$ with finite words $U_i, V_i \in \Sigma^*$ such that $x = \prod_{i=0}^{\infty} U_i$ and $y = \prod_{i=0}^{\infty} V_i$. An infinite word $w \in \Sigma^\omega$ is self-shuffling if $w \in S(w, w)$. Various well-known words, e.g., the Thue-Morse word or the Fibonacci word, were shown to be self-shuffling.
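The definition of $S(x, y)$ concerns infinite words, but its finite analogue is easy to test by machine. The following Python sketch is an illustration only (the name `is_shuffle` is ours and does not come from [2]): it decides whether a finite word `z` can be factored as $U_0 V_0 U_1 V_1 \cdots$ with `x` the product of the $U_i$ and `y` the product of the $V_i$, using the standard dynamic-programming interleaving test.

```python
def is_shuffle(x: str, y: str, z: str) -> bool:
    """Return True if the finite word z is an interleaving of x and y."""
    if len(x) + len(y) != len(z):
        return False
    # reachable[j] is True when z[:i + j] is a shuffle of x[:i] and y[:j];
    # the list is updated row by row, for i = 0, 1, ..., len(x).
    reachable = [False] * (len(y) + 1)
    reachable[0] = True
    for j in range(1, len(y) + 1):
        reachable[j] = reachable[j - 1] and y[j - 1] == z[j - 1]
    for i in range(1, len(x) + 1):
        reachable[0] = reachable[0] and x[i - 1] == z[i - 1]
        for j in range(1, len(y) + 1):
            take_x = reachable[j] and x[i - 1] == z[i + j - 1]
            take_y = reachable[j - 1] and y[j - 1] == z[i + j - 1]
            reachable[j] = take_x or take_y
    return reachable[len(y)]
```

For instance, `is_shuffle("ab", "ab", "aabb")` holds, while `is_shuffle("ab", "ab", "baab")` does not. Such a finite check can only give evidence about prefixes; it says nothing by itself about infinite words.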
Harju [5] studied shuffles of both finite and infinite square-free words, i.e., words that have no factor of the form $uu$ for some non-empty factor $u$. More results on square-free shuffles were obtained independently by Harju and Müller [6], and Currie and Saari [4]. However, the question about the existence of an infinite square-free self-shuffling word, posed in [5], remained open. We give a positive answer to this question in Sections 2 and 3.
The shift orbit closure $S_w$ of an infinite word $w$ can be defined, e.g., as the set of infinite words whose sets of factors are contained in the set of factors of $w$. In [2] it has been proved that each word has a non-self-shuffling word in its shift orbit closure, and the following question has been asked: Does there exist a word for which no element of its shift orbit closure is self-shuffling (Question 7.2)? In Section 4 we provide a positive answer to the question. More generally, we show the existence of a word such that for any three words $x, y, z$ in its shift orbit closure, if $x$ is a shuffle of $y$ and $z$, then the three words are pairwise different. On the other hand, we show that for any infinite word there exist three different words $x, y, z$ in its shift orbit closure such that $x \in S(y, z)$ (see Proposition 7).
Apart from the usual concepts in combinatorics on words, which can be found for instance in the book of Lothaire [7], we make use of the following notations: For every $k \ge 1$, we denote the alphabet $\{0, 1, \ldots, k-1\}$ by $\Sigma_k$. For a word $w = uvz$ we say that $u$ is a prefix of $w$, $v$ is a factor of $w$, and $z$ is a suffix of $w$. We denote these prefix and suffix relations by $u \le_p w$ and $z \le_s w$, respectively. By $w[i, j]$ we denote the factor of $w$ starting at position $i$ and ending after position $j$. Note that we start numbering the positions with 0.
A prefix code is a set of words with the property that none of its elements is a prefix of another element. Similarly, a suffix code is a set of words where no element is a suffix of another one. A bifix code is a set that is both a prefix code and a suffix code. A morphism $h$ is square-free if for all square-free words $w$, the image $h(w)$ is square-free.
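For finite words these notions can be checked directly. The helpers below are a minimal Python sketch; the function names are ours, chosen for illustration, and the square test is a naive quadratic search rather than an optimized algorithm.

```python
def is_square_free(w: str) -> bool:
    """True if w has no factor uu with u non-empty."""
    n = len(w)
    for p in range(1, n // 2 + 1):          # p = |u|
        for i in range(n - 2 * p + 1):      # i = start of the candidate square
            if w[i:i + p] == w[i + p:i + 2 * p]:
                return False
    return True


def is_prefix_code(words) -> bool:
    """True if no element is a proper prefix of another element."""
    ws = set(words)
    return not any(a != b and b.startswith(a) for a in ws for b in ws)


def is_suffix_code(words) -> bool:
    """True if no element is a proper suffix of another element."""
    ws = set(words)
    return not any(a != b and b.endswith(a) for a in ws for b in ws)


def is_bifix_code(words) -> bool:
    return is_prefix_code(words) and is_suffix_code(words)
```

As a small example, the set `{"10", "0"}` is a prefix code but not a suffix code, since `0` is a suffix of `10`.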

A square-free self-shuffling word on four letters
Let $g : \Sigma_4^* \to \Sigma_4^*$ be the morphism defined as follows:
$$g(0) = 0121, \quad g(1) = 032, \quad g(2) = 013, \quad g(3) = 0302.$$
We will show that the fixed point $w = g^\omega(0)$ is square-free and self-shuffling. Note that $g$ is not a square-free morphism, that is, it does not preserve square-freeness, as $g(23) = 0130302$ contains the square $3030$.
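The morphism and its fixed point are easy to experiment with. In the Python sketch below, the table `G` encodes the images $g(0) = 0121$, $g(1) = 032$, $g(2) = 013$, $g(3) = 0302$; these values are an assumption on our part, chosen to agree with $g(23) = 0130302$ and with the factors used in the proofs of this section. The helper names are hypothetical.

```python
# Assumed images of g (consistent with g(23) = 0130302 stated in the text).
G = {"0": "0121", "1": "032", "2": "013", "3": "0302"}


def apply_morphism(morphism, word):
    """Apply a letter-to-word substitution to every letter of word."""
    return "".join(morphism[c] for c in word)


def fixed_point_prefix(morphism, start, length):
    """Prefix of the fixed point obtained by iterating the morphism on start.

    Assumes the image of start begins with start, so that every iterate
    is a prefix of the next one.
    """
    word = start
    while len(word) < length:
        word = apply_morphism(morphism, word)
    return word[:length]


image = apply_morphism(G, "23")
print(image, "3030" in image)   # the image of the square-free word 23 has a square
print(fixed_point_prefix(G, "0", 13))
```

Since `23` is square-free while its image contains `3030`, this confirms on a concrete input that $g$ is not a square-free morphism.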
Lemma 1. The word $w = g^\omega(0)$ contains no factor of the form $3u1u3$ for any $u \in \Sigma_4^*$.
Proof. We assume that there exists a factor of the form $3u1u3$ in $w$, for some word $u \in \Sigma_4^*$. From the definition of $g$, we observe that $u$ cannot be empty. Furthermore, we see that every 3 in $w$ is preceded by either 0 or 1. If $1 \le_s u$, then we would have an occurrence of the factor 11 in $w$, which is not possible by the definition of $g$; hence $0 \le_s u$. Now, every 3 is followed by either 0 or 2 in $w$, and 01 is followed by either 2 or 3. Since both $3u$ and $01u$ are factors of $w$, we must have $2 \le_p u$. This means that the factor 012 appears at the center of $u1u$, which can only be followed by 1 in $w$, thus $21 \le_p u$. However, this results in the factor 321 as a prefix of $3u1u3$, which does not appear in $w$, as seen from the definition of $g$.
Lemma 2. The word $w = g^\omega(0)$ is square-free.
Proof. We first observe that $\{g(0), g(1), g(2), g(3)\}$ is a bifix code. Furthermore, we can verify that there are no squares $uu$ with $|u| \le 3$ in $w$. Let us assume now that the square $uu$ appears in $w$ and that $u$ is the shortest word with this property. If $u = 02u'$, then $u' = u''03$ must hold, since 02 appears only as a factor of $g(3)$, and thus $uu$ is a suffix of the factor $g(3)u''g(3)u''03$ in $w$. As $w = g(w)$, also the shorter square $3g^{-1}(u'')3g^{-1}(u'')$ appears in $w$, a contradiction. The same desubstitution principle also leads to occurrences of shorter squares in $w$ if $u = xu'$ and $x \in \{01, 03, 10, 12, 13, 21, 30, 32\}$.
If $u = 2u'$ then either $03 \le_s u$ or $030 \le_s u$ or $01 \le_s u$, by the definition of $g$. In the last case, that is when $01 \le_s u$, we must have $21 \le_p u$, which is covered by the previous paragraph. If $u' = u''030$, then $uu$ is followed by 2 in $w$ and we can desubstitute to obtain the shorter square $g^{-1}(u'')3g^{-1}(u'')3$ in $w$. If $u = 2u'$ and $u' = u''03$, and $uu$ is preceded by 03 or followed by 2 in $w$, we can desubstitute to $1g^{-1}(u'')1g^{-1}(u'')$ or $g^{-1}(u'')1g^{-1}(u'')1$, respectively. Therefore, assume that $u = 2u''03$, and as we already ruled out the case when $21 \le_p u$, we can assume that $uu$ is preceded by 030 and followed by 02 in $w$. This however means that we can desubstitute to get an occurrence of the factor $3g^{-1}(u'')1g^{-1}(u'')3$ in $w$, a contradiction to Lemma 1.
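Lemma 1 and Lemma 2 can be sanity-checked on a finite prefix of $w$. The Python sketch below is such an experiment, not a proof. It assumes the images $g(0) = 0121$, $g(1) = 032$, $g(2) = 013$, $g(3) = 0302$ (our reading of the definition, consistent with $g(23) = 0130302$); the prefix length 600 and all function names are arbitrary choices made for illustration.

```python
# Assumed images of g; see the hedge in the surrounding text.
G = {"0": "0121", "1": "032", "2": "013", "3": "0302"}


def fixed_point_prefix(morphism, start, length):
    """Iterate the morphism on start until the word is long enough."""
    word = start
    while len(word) < length:
        word = "".join(morphism[c] for c in word)
    return word[:length]


def has_square(w):
    """Naive search for a factor uu with u non-empty."""
    n = len(w)
    return any(
        w[i:i + p] == w[i + p:i + 2 * p]
        for p in range(1, n // 2 + 1)
        for i in range(n - 2 * p + 1)
    )


def has_3u1u3(w):
    """Search for a factor of the form 3 u 1 u 3 (u may be empty)."""
    n = len(w)
    for i in range(n):
        if w[i] != "3":
            continue
        m = 0                                  # m = |u|
        while i + 2 * m + 2 < n:
            middle = i + 1 + m                 # position of the letter 1
            last = i + 2 + 2 * m               # position of the closing 3
            if (w[middle] == "1" and w[last] == "3"
                    and w[i + 1:middle] == w[middle + 1:last]):
                return True
            m += 1
    return False


prefix = fixed_point_prefix(G, "0", 600)
print(has_square(prefix), has_3u1u3(prefix))
```

Both searches are expected to come back negative on any prefix, in line with the two lemmas.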
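We now show that $w = g^\omega(0)$ can be written as $w = \prod_{i=0}^{\infty} U_i V_i = \prod_{i=0}^{\infty} U_i = \prod_{i=0}^{\infty} V_i$ for suitable finite words $U_i$, $V_i$.
Proof. We use the notation $x = v^{-1}u$ meaning that $u = vx$ for finite words $x, u, v$. We are going to show that the self-shuffle is given by the following: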

Now we verify that $w = \prod_{i=0}^{\infty} U_i V_i = \prod_{i=0}^{\infty} U_i = \prod_{i=0}^{\infty} V_i$,
from which it follows that $w$ is self-shuffling. It suffices to show that each of the above products is fixed by $g$. Indeed, straightforward computations show that $\prod_{i=0}^{\infty} U_i$ is fixed by $g$, and thus $w = \prod_{i=0}^{\infty} U_i$. In a similar way we show that $w = \prod_{i=0}^{\infty} V_i$ and $w = \prod_{i=0}^{\infty} U_i V_i$.

Square-free self-shuffling words on three letters
We remark that we can immediately produce a square-free self-shuffling word over $\Sigma_3$ from $g^\omega(0)$: Charlier et al. [2] noticed that the property of being self-shuffling is preserved by the application of a morphism. Furthermore, Brandenburg [1] showed that the morphism $f : \Sigma_4^* \to \Sigma_3^*$ is square-free. Therefore, the word $f(g^\omega(0))$ is a ternary square-free self-shuffling word, from which we can produce a multitude of others by applying square-free morphisms from $\Sigma_3^*$ to $\Sigma_3^*$.

A word with non self-shuffling shift orbit closure
In this section we provide a positive answer to the question from [2] whether there exists a word for which no element of its shift orbit closure is self-shuffling.
The Hall word $H = 012021012102\cdots$ is defined as the fixed point of the morphism $h(0) = 012$, $h(1) = 02$, $h(2) = 1$. Sometimes it is referred to as a ternary Thue-Morse word. It is well known that this word is square-free. We show that no word in the shift orbit closure $S_H$ of the Hall word is self-shuffling. More generally, we show that if $x$ is a shuffle of $y$ and $z$ for $x, y, z \in S_H$, then they are pairwise different.
Proposition 4. There are no words $x, y$ in the shift orbit closure of the Hall word such that $x \in S(y, y)$.
Proof. Suppose the converse, i.e., there exist words $x, y \in S_H$ such that $x \in S(y, y)$. Define the set $X$ of infinite words as follows: $X = \{x \in S_H \mid x \in S(y, y) \text{ for some } y \in S_H\}$. In other words, $X$ consists of the words in $S_H$ which can be obtained as a shuffle of some word $y$ in $S_H$ with itself. Now suppose, for the sake of contradiction, that $X$ is non-empty, and consider $x \in X$ whose first block $U_0$ has the smallest possible positive length. We remark that such $x$ and the corresponding $y$ are not necessarily unique. We can suppose without loss of generality that $y$ starts with 0 or 10. Otherwise, we exchange 0 and 2, consider the morphism $0 \to 1$, $1 \to 20$, $2 \to 210$, and the argument is symmetric.
It is not hard to see from the properties of the morphism $h$ that removing every occurrence of 1 from $x$ and $y$ results in $(02)^\omega$. Hence the blocks in the factorizations of $y$ after removal of 1 are of the form $(02)^i$ for some integer $i$. Thus the first letter of each block $U_i$ and $V_i$ that is different from 1 is 0, and the last letter different from 1 is 2.
Then, $U_i$ and $V_i$ are images by the morphism $h$ of factors of the fixed point of $h$. Therefore, there are words $x', y' \in S_H$ such that $x = h(x')$, $y = h(y')$, and $x' \in S(y', y')$ with blocks $U_i'$, $V_i'$ satisfying $U_i = h(U_i')$ and $V_i = h(V_i')$. Notice that the first block $U_0$ cannot be equal to 1. Indeed, otherwise $x$ starts with 11, which is impossible, since 11 is not a factor of the fixed point of $h$.
Clearly, taking the preimage decreases the lengths of blocks in the factorization (except for those equal to 1), and since $U_0 \neq 1$, the length of the first block in the preimage is smaller, i.e., $|U_0'| < |U_0|$. This is a contradiction with the minimality of $|U_0|$.
Corollary 5. There are no self-shuffling words in the shift orbit closure of $H$.
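Two facts used in the proof above are easy to observe experimentally: the Hall word starts with $012021012102$, and erasing every 1 from it leaves a power of $02$. The short Python sketch below checks both on the eighth iterate $h^8(0)$; the number of iterations and the variable names are arbitrary choices made for illustration.

```python
# The Hall morphism h from the text.
H_MORPHISM = {"0": "012", "1": "02", "2": "1"}


def iterate(morphism, start, rounds):
    """Apply the morphism to start the given number of times."""
    word = start
    for _ in range(rounds):
        word = "".join(morphism[c] for c in word)
    return word


hall = iterate(H_MORPHISM, "0", 8)        # a finite prefix of H
without_ones = hall.replace("1", "")      # erase every occurrence of 1

print(hall[:12])
print(without_ones == "02" * (len(without_ones) // 2))
```

The second check works because $h(0)$ and $h(1)$ both reduce to $02$ once the letter 1 is erased, while $h(2) = 1$ disappears entirely.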

With a similar argument we can prove the following:
the electronic journal of combinatorics 22(1) (2015), #P1.55
Proposition 6. There are no words $x, y$ in the shift orbit closure of $H$ such that $x \in S(x, y)$.
Proof. First we introduce the notation $x \in S_2(y, z)$, meaning that there exists a shuffle starting with the word $z$ (i.e., $U_0 = \varepsilon$, $V_0 \neq \varepsilon$). Next, $x \in S(x, y)$ implies that there exists $z$ in the same shift orbit closure such that $z \in S_2(z, y)$. Indeed, one can remove the prefix $U_0$ of $x$ to get $z$, i.e., $z = (U_0)^{-1}x$, and keep all the other blocks $U_i$, $V_i$ in the shuffle product.
Define the set $Z$ of infinite words as follows: $Z = \{z \in S_H \mid z \in S_2(z, y) \text{ for some } y \in S_H\}$. In other words, $Z$ consists of the words $z$ in $S_H$ which can be obtained as a shuffle of some word $y$ in $S_H$ with $z$, starting with the block $V_0$. Now consider $z \in Z$ whose first block $V_0$ has the smallest possible length. We remark that such $z$ and a corresponding $y$ are not necessarily unique.
As in the proof of Proposition 4, the shuffle cannot start with a block of length 1. Again, if we remove every occurrence of 1 in $y$ (and in $z$), we get $(02)^\omega$ or $(20)^\omega$; moreover, since $V_0$ contains letters different from 1, the first letter different from 1 is the same in $y$ and $z$. So, without loss of generality we assume that both $y$ and $z$ without 1 are $(02)^\omega$, and the blocks $U_i$ and $V_i$ without 1 are integer powers of 02. Then, $U_i$ and $V_i$ are images by the morphism $h$ of factors of $H$. Therefore, there are words $z', y' \in S_H$ such that $z = h(z')$, $y = h(y')$, and $z' \in S_2(z', y')$ with blocks $U_i'$, $V_i'$ satisfying $U_i = h(U_i')$ and $V_i = h(V_i')$. As in the proof of Proposition 4, since $V_0 \neq 1$, the length of the first block in the preimage is smaller, i.e., $|V_0'| < |V_0|$. This is again a contradiction with the minimality of $|V_0|$.
So, we proved that if there are three words $x, y, z$ in the shift orbit closure of the fixed point of $h$ such that $x \in S(y, z)$, then they must be pairwise distinct. Now we are going to prove that for any infinite word there exist three different words in its shift orbit closure such that $x \in S(y, z)$.
An infinite word $x$ is called recurrent if each of its prefixes occurs infinitely many times in it.
Proposition 7. Let $x$ be a recurrent infinite word. Then there exist two words $y, z$ in the shift orbit closure of $x$ such that $x \in S(y, z)$.
Proof. We build the shuffle inductively.
Start from any prefix $U_0$ of $x$. Since $x$ is recurrent, each of its prefixes occurs infinitely many times in it. Find another occurrence of $U_0$ in $x$ and denote its position by $i_1$. Put
At step $k$, suppose that the shuffle of the prefix of $x$ is built:
Find another occurrence of $\prod_{l=0}^{k-1} V_l$ in $x$ at some position $j_k > j_{k-1}$. We can do it since $x$ is recurrent. Put
We note that $\prod_{l=0}^{k} U_l$ is a factor of $x$ by the construction; more precisely, it occurs at position $i_{k-1}$.
Find an occurrence of $\prod_{l=0}^{k} U_l$ at some position $i$
Continuing this line of reasoning, we build the required factorization.
Since each infinite word contains a recurrent (actually, even a uniformly recurrent) word in its shift orbit closure, we obtain the following corollary:
Corollary 8. Each infinite word $w$ contains words $x, y, z$ in its shift orbit closure such that $x \in S(y, z)$.
The following example shows that the recurrence condition in Proposition 7 cannot be omitted:
Example 9. Consider the word $3H = 3012021\cdots$ which is obtained from $H$ by adding a letter 3 in the beginning. Then the shift orbit closure of $3H$ consists of the shift orbit closure of $H$ and the word $3H$ itself. Assume that $3H$ is a shuffle of two words in its shift orbit closure. One of them is $3H$ (there are no other 3's), and the other one, which we denote by $y$, lies in the shift orbit closure of $H$. Clearly, the shuffle starts with 3, and cutting the first letter 3, we get $H \in S(H, y)$, a contradiction with Proposition 6.
There also exist examples where each letter occurs infinitely many times:
Example 10. The following word:
does not have two words $y, z$ in its shift orbit closure such that $x \in S(y, z)$. The idea of the proof is that the shift orbit closure consists of words of the following form: $1^*20^\omega$, $0^*1^\omega$, $x$ itself and all their right shifts. Shuffling any two words of those types, it is not hard to see that there exists a prefix of the shuffle which contains too many or too few occurrences of some letter compared to the prefix of $x$. We leave the details of the proof to the reader.
By Corollary 8, there are $x, y, z$ in the shift orbit closure of $H$ such that $x \in S(y, z)$. To conclude this section, we give an explicit construction of two words in the shift orbit closure of $H$ which can be shuffled to give $H$. We remark though that this construction gives a shuffle different from the one given by Corollary 8. Let:
By definition, the shift orbit closure of the Hall word is closed under $h$. Moreover, this shift orbit closure is also closed under $h'$. One of the ways to see this is the following. It is well known that the Thue-Morse word, which is the fixed point of the morphism $0 \to 01$, $1 \to 10$ starting with 0, is a morphic image of $H$ under the morphism $0 \to 011$, $1 \to 01$, $2 \to 0$. Therefore, the set of factors of the Hall word is closed under reversal. Now by induction we prove that for each word $v$ one has $h'(v) = (h(v^R))^R$ (it is enough to prove this equality for letters and for the concatenation of two words). This implies that the shift orbit closure of the Hall word is closed under $h'$.
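The two ingredients of this argument can be checked on finite data. The Python sketch below verifies that the image of a prefix of $H$ under $0 \to 011$, $1 \to 01$, $2 \to 0$ is a prefix of the Thue-Morse word, and that $h'(v) = (h(v^R))^R$ for all short words $v$. Here `H_PRIME` is an assumption: we take $h'$ to map each letter $a$ to the reversal of $h(a)$, which is what the identity forces on letters. The names `TAU`, `thue_morse_prefix` and `reversal_identity_holds` are ours.

```python
from itertools import product

H = {"0": "012", "1": "02", "2": "1"}
# Assumption: h' sends each letter to the reversed image under h.
H_PRIME = {letter: image[::-1] for letter, image in H.items()}
# The morphism 0 -> 011, 1 -> 01, 2 -> 0 from the text.
TAU = {"0": "011", "1": "01", "2": "0"}


def apply_morphism(morphism, word):
    return "".join(morphism[c] for c in word)


def thue_morse_prefix(n):
    """First n letters of the Thue-Morse word (parity of binary digit sums)."""
    return "".join(str(bin(i).count("1") % 2) for i in range(n))


def reversal_identity_holds(max_len):
    """Check h'(v) == reverse(h(reverse(v))) for all v with |v| <= max_len."""
    for n in range(max_len + 1):
        for letters in product("012", repeat=n):
            v = "".join(letters)
            if apply_morphism(H_PRIME, v) != apply_morphism(H, v[::-1])[::-1]:
                return False
    return True


hall = "0"
for _ in range(8):
    hall = apply_morphism(H, hall)        # finite prefix of the Hall word
tm_image = apply_morphism(TAU, hall)

print(tm_image == thue_morse_prefix(len(tm_image)))
print(reversal_identity_holds(5))
```

The first check relies on the relation $\tau \circ h = \mu \circ \tau$, where $\tau$ denotes the morphism `TAU` and $\mu$ the Thue-Morse morphism; it can be verified letter by letter.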

Conclusions
We showed that infinite square-free self-shuffling words exist. The natural question that arises now is whether we can find infinite self-shuffling words subject to even stronger avoidability constraints. For this we recall the notion of the repetition threshold $RT(k)$, which is defined as the least real number such that there exists an infinite word over $\Sigma_k$ that does not contain repetitions of exponent greater than $RT(k)$. Due to the collective effort of many researchers (see [3,8] and references therein), the repetition threshold for all alphabet sizes is known and characterized as follows:
$$RT(k) = \begin{cases} 7/4 & \text{if } k = 3, \\ 7/5 & \text{if } k = 4, \\ k/(k-1) & \text{otherwise.} \end{cases}$$
A word $w \in \Sigma_k^\omega$ without factors of exponent greater than $RT(k)$ is called a Dejean word. Charlier et al. showed that the Thue-Morse word, which is a binary Dejean word, is self-shuffling [2].
Question 12. Do there exist self-shuffling Dejean words over non-binary alphabets?
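To experiment with Question 12 on finite prefixes, one can compare the largest exponent of a factor with the repetition threshold. The Python sketch below is only an exploratory tool under our own conventions: `repetition_threshold` encodes the values of Dejean's theorem for $k \ge 2$, and `max_exponent` computes the largest ratio of length to period over all factors of a finite word. Both names are hypothetical, and a bound on a prefix proves nothing about the infinite word.

```python
from fractions import Fraction


def repetition_threshold(k: int) -> Fraction:
    """RT(k) as given by Dejean's theorem, for alphabets of size k >= 2."""
    if k < 2:
        raise ValueError("the threshold is considered for k >= 2 only")
    if k == 3:
        return Fraction(7, 4)
    if k == 4:
        return Fraction(7, 5)
    return Fraction(k, k - 1)


def max_exponent(w: str) -> Fraction:
    """Largest exponent |v| / p over factors v of w having period p."""
    best = Fraction(1) if w else Fraction(0)
    n = len(w)
    for p in range(1, n):
        run = 0                      # consecutive positions i with w[i] == w[i + p]
        for i in range(n - p):
            if w[i] == w[i + p]:
                run += 1
                # a run of this length yields a factor of length run + p with period p
                best = max(best, Fraction(run + p, p))
            else:
                run = 0
    return best


thue_morse = "".join(str(bin(i).count("1") % 2) for i in range(64))
print(max_exponent(thue_morse) <= repetition_threshold(2))
```

For the Thue-Morse prefix the comparison holds, as expected for an overlap-free word over two letters.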