Uncountably many minimal hereditary classes of graphs of unbounded clique-width

Given an infinite word over the alphabet $\{0,1,2,3\}$, we define a class of bipartite hereditary graphs $\mathcal{G}^\alpha$, and show that $\mathcal{G}^\alpha$ has unbounded clique-width unless $\alpha$ contains at most finitely many non-zero letters. We also show that $\mathcal{G}^\alpha$ is minimal of unbounded clique-width if and only if $\alpha$ belongs to a precisely defined collection of words $\Gamma$. The set $\Gamma$ includes all almost periodic words containing at least one non-zero letter, which both enables us to exhibit uncountably many pairwise distinct minimal classes of unbounded clique width, and also proves one direction of a conjecture due to Collins, Foniok, Korpelainen, Lozin and Zamaraev. Finally, we show that the other direction of the conjecture is false, since $\Gamma$ also contains words that are \emph{not} almost periodic.


Introduction
Typically, when some graph parameter is bounded for a given graph or collection of graphs, then there exist efficient (polynomial time) algorithms for a range of problems that are in general intractable. To give two examples, Courcelle's Theorem [5] states that any graph property expressible in MSO$_2$ logic can be decided in linear time on graphs with bounded treewidth, and Courcelle, Makowsky and Rotics [4] showed that any graph property expressible in MSO$_1$ logic has a linear time algorithm on graphs with bounded clique-width.
While treewidth is the parameter of choice for minor-closed classes of graphs, clique-width has the property that it can remain bounded for graphs with a high edge density, and is thus of more use with hereditary classes of graphs. Indeed, if H is an induced subgraph of G, then the clique-width of H is at most the clique-width of G. (For formal definitions, see Section 2.) In light of the algorithmic consequences, a natural goal is to characterise which classes of graphs are unbounded with respect to a given parameter. A standard approach is to identify the minimal classes: for example, planar graphs are the unique minimal minor-closed class of graphs of unbounded treewidth.
The rest of this paper is organised as follows. Section 2 provides some background, and introduces key definitions and concepts. Section 3 is devoted to proving that the classes defined over {0, 1, 2, 3} in general have unbounded clique-width. This is done from first principles for classes over the alphabet {2, 3}, and then extended to the full four-letter alphabet using rank-width techniques.
Section 4 provides the central proof that if α is an infinite almost periodic word with at least one non-zero letter then G α is a minimal hereditary class of graphs of unbounded clique-width. To do this we modify the notion of cluster graphs, first used by Lozin [14], and show how this can be used in conjunction with Menger's Theorem to provide an integrated proof of the minimality result. That there are uncountably many distinct minimal hereditary classes of graphs of unbounded clique width follows by considering Sturmian sequences.
Finally, in Section 5, we explore sequences that are recurrent but not almost periodic, and prove the precise characterisation between minimal and non-minimal hereditary classes of graphs of unbounded clique-width.

Preliminaries
A graph G is a pair of sets, vertices V(G) and edges E(G) ⊆ V(G) × V(G). Unless otherwise stated, all graphs in this paper are simple, i.e. undirected, without loops or multiple edges. We denote by N(v) the neighbourhood of a vertex v, that is, the set of vertices adjacent to v.
A set of vertices is independent if no two of its elements are adjacent. A graph is bipartite if its vertices can be partitioned into two independent sets. Given a graph G(V, E), a subset U ⊆ V and a vertex v ∈ V \ U, we say that v distinguishes U if v has both a neighbour and a non-neighbour in U. If U is indistinguishable by the vertices outside U, we call U a module. A module U is trivial if |U| = 1 or U = V(G). A graph, every module of which is trivial, is called prime. We denote the set of prime induced subgraphs of G by Prime(G). We will use the notation $H \le_I G$ to denote that graph H is an induced subgraph of graph G, meaning H can be obtained from G by a sequence of vertex removals. If graph G does not contain the induced subgraph H we say that G is H-free.
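The definitions of distinguishing vertices, modules and prime graphs can be illustrated with a minimal brute-force sketch. The function names and the adjacency-dictionary representation are our own illustrative choices, and the exhaustive search is only practical for very small graphs.

```python
from itertools import combinations

def distinguishes(adj, v, U):
    """True if v has both a neighbour and a non-neighbour in U."""
    inside = [u in adj[v] for u in U]
    return any(inside) and not all(inside)

def is_module(adj, U):
    """True if no vertex outside U distinguishes U."""
    U = set(U)
    return not any(distinguishes(adj, v, U) for v in adj if v not in U)

def is_prime(adj):
    """True if every module is trivial (exhaustive search over subsets)."""
    vertices = list(adj)
    for size in range(2, len(vertices)):
        for U in combinations(vertices, size):
            if is_module(adj, U):
                return False
    return True

# The path a-b-c-d is prime; in the path a-b-c the set {a, c} is a module.
p4 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
p3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
```

Here `is_prime(p4)` holds, while `p3` is not prime because `{a, c}` is a non-trivial module.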
A class of graphs C is hereditary if it is closed under taking induced subgraphs, that is G ∈ C implies H ∈ C for every induced subgraph H of G. It is well known that for any hereditary class C there exists a unique (but not necessarily finite) set of minimal forbidden graphs {H 1 , H 2 , . . . } such that C = Free(H 1 , H 2 , . . . ).
If H is an induced subgraph of G, then this can be witnessed by one or more embeddings, where an embedding of H in G is an injective map φ : V(H) → V(G) such that the subgraph of G induced by the vertices φ(V(H)) is isomorphic to H. In other words, vw ∈ E(H) if and only if φ(v)φ(w) ∈ E(G).
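As a small sketch of this definition, the following checks whether a given map is an embedding, that is, whether it is injective and preserves both adjacency and non-adjacency. The name `is_embedding` and the dictionary representation of graphs and maps are assumptions made for illustration.

```python
def is_embedding(adj_h, adj_g, phi):
    """Check that phi: V(H) -> V(G) is injective and that
    vw is an edge of H exactly when phi(v)phi(w) is an edge of G."""
    if len(set(phi.values())) != len(phi):
        return False
    vertices = list(adj_h)
    return all(
        (w in adj_h[v]) == (phi[w] in adj_g[phi[v]])
        for v in vertices for w in vertices if v != w
    )

# H is the path a-b-c; it embeds in the 4-cycle but not in the triangle.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
phi = {"a": 0, "b": 1, "c": 2}
```

With these inputs, `phi` is an embedding of the path into the 4-cycle, but not into the triangle, where the images of the non-adjacent vertices a and c are adjacent.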

Bipartite hereditary graph classes defined by an infinite word
The graph classes we consider are all formed by taking the set of finite induced subgraphs of an infinite graph defined on a grid of vertices. We start by defining an infinite empty graph P with vertices V(P) = {v i,j : i, j ∈ N}.
In general, we think of P as an infinite two-dimensional array in which $v_{i,j}$ represents the vertex in the i-th row (counting from the top) and j-th column (counting from the left). Hence vertex $v_{1,1}$ is in the top left corner of the grid and the grid extends infinitely to the right and downwards. The j-th column of P is the set $C_j = \{v_{i,j} : i \in \mathbb{N}\}$, and the i-th row of P is the set $R_i = \{v_{i,j} : j \in \mathbb{N}\}$.
We will refer to a (finite or infinite) sequence of letters chosen from a finite alphabet as a word.
We denote by $\alpha_j$ the j-th letter of the word α and we denote by $\alpha_j^k$ the concatenation of k copies of the letter $\alpha_j$. A factor of α is a contiguous subword of α. The length of a word α is the number of letters the word contains, while the weight of α is the number of non-zero letters it has, which we will denote $|\alpha|_1$.
An infinite word α is recurrent if each of its factors occurs in it infinitely many times. We say that α is almost periodic (sometimes called uniformly recurrent or minimal) if for each factor β of α there exists a constant L(β) such that every factor of α of length at least L(β) contains β as a factor. Finally, α is periodic if there is a positive integer p such that α k = α k+p for all k. Clearly, every periodic word is almost periodic, and every almost periodic word is recurrent.
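To make the notions of weight, factor and almost periodicity concrete, here is a minimal sketch that works on a finite prefix of a word, stored as a string. The helper names are hypothetical, and a finite prefix can only give evidence about the constant L(β), not a proof.

```python
def weight(word):
    """Number of non-zero letters of a finite word given as a string."""
    return sum(1 for letter in word if letter != "0")

def factors(word, length):
    """Set of all factors (contiguous subwords) of the given length."""
    return {word[i:i + length] for i in range(len(word) - length + 1)}

def window_bound(prefix, beta):
    """Smallest L such that every length-L factor of the prefix
    contains beta, or None if no such L exists for this prefix."""
    for L in range(len(beta), len(prefix) + 1):
        if all(beta in f for f in factors(prefix, L)):
            return L
    return None

omega_prefix = "23" * 10  # a prefix of the periodic word 2323...
```

For this prefix, every factor of length 3 contains the factor 32, whereas the length-2 factor 23 does not, so `window_bound(omega_prefix, "32")` returns 3.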
Let α be an infinite word such that α j ∈ {0, 1, 2, 3} for each natural j. We define a family of infinite graphs {P α } with vertices V(P), and with edges between consecutive columns C j , C j+1 of V(P α ), such edges determined by the letters of the word α.
(i) If $\alpha_j = 0$ then the edges between $C_j$ and $C_{j+1}$ are given by $\{(v_{i,j}, v_{i,j+1}) : i \in \mathbb{N}\}$ (i.e. a matching).
(ii) If $\alpha_j = 1$ then the edges between $C_j$ and $C_{j+1}$ are given by $\{(v_{i,j}, v_{k,j+1}) : i \neq k;\ i, k \in \mathbb{N}\}$ (i.e. the complement of a matching).
(iii) If $\alpha_j = 2$ then the edges between $C_j$ and $C_{j+1}$ are given by $\{(v_{i,j}, v_{k,j+1}) : i \le k;\ i, k \in \mathbb{N}\}$.
(iv) If $\alpha_j = 3$ then the edges between $C_j$ and $C_{j+1}$ are given by $\{(v_{i,j}, v_{k,j+1}) : i \ge k;\ i, k \in \mathbb{N}\}$.
This notation matches and extends that used in [3].
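The four link rules can be sketched as an adjacency test on the grid. We assume here that the letter 2 corresponds to the condition i ≤ k and the letter 3 to i ≥ k, that vertices are written as (row, column) pairs, and that the word is supplied as a function from positions to letters; all names are illustrative.

```python
def link_adjacent(letter, i, k):
    """Adjacency between row i of column C_j and row k of column C_{j+1}."""
    if letter == 0:
        return i == k   # matching
    if letter == 1:
        return i != k   # complement of a matching
    if letter == 2:
        return i <= k   # assumed direction for the letter 2
    if letter == 3:
        return i >= k   # assumed direction for the letter 3
    raise ValueError("letter must be 0, 1, 2 or 3")

def adjacent(alpha, v, w):
    """Adjacency in P^alpha; alpha(j) is the j-th letter (1-indexed)."""
    (i, j), (k, l) = v, w
    if l == j + 1:
        return link_adjacent(alpha(j), i, k)
    if j == l + 1:
        return link_adjacent(alpha(l), k, i)
    return False        # edges only join consecutive columns

def omega(j):
    """The alternating word 2323..."""
    return 2 if j % 2 == 1 else 3
```

For example, `adjacent(omega, (1, 1), (2, 2))` holds because the first letter of ω is 2 and 1 ≤ 2, while vertices in the same column are never adjacent.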
Let G α denote the class of all finite induced subgraphs of P α . By definition, G α is a hereditary class, and any graph G ∈ G α can be witnessed by an embedding into the infinite graph P α . Given such an embedding of G into P α , we will be especially interested in the induced subgraphs of G that occur in two adjacent columns: an α j -link is the induced subgraph of G on the vertices of G ∩ (C j ∪ C j+1 ), and will be denoted by G j . Letting 2 ∞ stand for the infinite word of all 2s, we note that G 2 ∞ is the class of bipartite permutation graphs. Note, too, that G 3 ∞ = G 2 ∞ (this can be seen by considering a vertical reflection of the 2-dimensional array), so it is not necessarily the case that two words α and β give rise to distinct hereditary classes.
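The reflection argument can be checked on a finite number of rows. The sketch below assumes that a 2-link joins row i to row k when i ≤ k and a 3-link when i ≥ k, and verifies that reversing the order of the m rows turns one into the other. It illustrates the finite case only.

```python
def two_link(m):
    """Edges (i, k) of a 2-link on m rows, assuming the rule i <= k."""
    return {(i, k) for i in range(1, m + 1) for k in range(1, m + 1) if i <= k}

def three_link(m):
    """Edges (i, k) of a 3-link on m rows, assuming the rule i >= k."""
    return {(i, k) for i in range(1, m + 1) for k in range(1, m + 1) if i >= k}

def reflect(edges, m):
    """Reverse the order of the rows: row i becomes row m + 1 - i."""
    return {(m + 1 - i, m + 1 - k) for (i, k) in edges}
```

Since i ≤ k is equivalent to m + 1 − i ≥ m + 1 − k, `reflect(two_link(m), m)` equals `three_link(m)` for every m.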
On the other hand, the word with alternating 2s and 3s, which throughout we will denote by ω = 232323 · · · , defines the class $\mathcal{G}^\omega$, which is distinct from $\mathcal{G}^{2^\infty}$. Indeed, there is a graph that embeds in $P^\omega$ but is one of the minimal forbidden induced subgraphs of permutation graphs.

Clique-width and rank-width
Clique-width is a graph width parameter introduced by Courcelle, Engelfriet and Rozenberg in the 1990s [6]. A recent survey of clique-width for hereditary graph classes is [8]. The clique-width of a graph is denoted cwd(G) and is defined as the minimum number of labels needed to construct G by means of the following four graph operations: (a) creation of a new vertex v with label i (denoted i(v)), (b) taking the disjoint union of two previously-constructed labelled graphs G and H (denoted G ⊕ H), (c) adding an edge between every vertex labelled i and every vertex labelled j for distinct i and j (denoted $\eta_{i,j}$) and (d) giving all vertices labelled i the label j (denoted $\rho_{i \to j}$).
Every graph can be defined by an algebraic expression τ using the four operations above, which we will refer to as a clique-width expression. This expression is called a k-expression if it uses k different labels.
Alternatively, any clique-width expression τ defining G can be represented as a rooted tree, tree(τ), whose leaves correspond to the operations of vertex creation, the internal nodes correspond to the ⊕-operation, and the root is associated with G. The operations η and ρ are assigned to the respective edges of tree(τ).
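The four operations can be made concrete with a small evaluator for clique-width expressions. The nested-tuple encoding and the function names below are our own assumptions, chosen only for illustration; the example builds the path a-b-c-d using three labels.

```python
def evaluate(expr):
    """Return (labels, edges) for a clique-width expression in nested tuples."""
    op = expr[0]
    if op == "v":                       # ("v", name, i): new vertex labelled i
        _, name, i = expr
        return {name: i}, set()
    if op == "union":                   # ("union", e1, e2): disjoint union
        labels1, edges1 = evaluate(expr[1])
        labels2, edges2 = evaluate(expr[2])
        return {**labels1, **labels2}, edges1 | edges2
    if op == "eta":                     # ("eta", i, j, e): join labels i and j
        _, i, j, sub = expr
        labels, edges = evaluate(sub)
        new = {frozenset((u, w)) for u in labels for w in labels
               if labels[u] == i and labels[w] == j}
        return labels, edges | new
    if op == "rho":                     # ("rho", i, j, e): relabel i to j
        _, i, j, sub = expr
        labels, edges = evaluate(sub)
        return {u: (j if lab == i else lab) for u, lab in labels.items()}, edges
    raise ValueError(op)

def labels_used(expr):
    """Set of labels appearing anywhere in the expression."""
    op = expr[0]
    if op == "v":
        return {expr[2]}
    if op == "union":
        return labels_used(expr[1]) | labels_used(expr[2])
    return {expr[1], expr[2]} | labels_used(expr[3])

# A 3-expression for the path a-b-c-d.
e1 = ("eta", 1, 2, ("union", ("v", "a", 1), ("v", "b", 2)))
e2 = ("eta", 1, 2, ("union", ("rho", 1, 3, e1), ("v", "c", 1)))
e3 = ("eta", 1, 2, ("union", ("rho", 2, 3, e2), ("v", "d", 2)))
path_labels, path_edges = evaluate(e3)
```

The relabelling steps retire a vertex to label 3 once all of its edges have been created, which is what allows the path to be built with only three labels.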
A related parameter is that of rank-width, which was introduced by Oum and Seymour in 2006 [16]. They showed that the measures are closely related through the inequality $\mathrm{rwd}(G) \le \mathrm{cwd}(G) \le 2^{\mathrm{rwd}(G)+1} - 1$, so that a graph class has bounded clique-width if and only if it has bounded rank-width.
For a graph G and a vertex v, the local complementation at v is the operation that replaces the subgraph induced by the neighbourhood of v with its complement. For a graph G and an edge vw, the graph obtained by pivoting vw is the graph obtained by applying local complementation at v, then at w and then at v again. When G is bipartite, Oum showed in [15] that this is equivalent to complementing the edges between N(v)\w and N(w)\v.
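Local complementation and pivoting can be sketched as follows, with graphs stored as adjacency dictionaries (an illustrative choice). On the small bipartite example below, the three local complementations give the same graph as complementing the edges between N(v)\w and N(w)\v, once the names of v and w are exchanged; the comparison is therefore made up to that exchange.

```python
def toggle(adj, x, y):
    """Add the edge xy if absent, remove it if present (in place)."""
    if y in adj[x]:
        adj[x].discard(y)
        adj[y].discard(x)
    else:
        adj[x].add(y)
        adj[y].add(x)

def local_complement(adj, v):
    """New graph in which the subgraph induced by N(v) is complemented."""
    new = {u: set(nb) for u, nb in adj.items()}
    nbrs = list(adj[v])
    for a in range(len(nbrs)):
        for b in range(a + 1, len(nbrs)):
            toggle(new, nbrs[a], nbrs[b])
    return new

def pivot(adj, v, w):
    """Local complementation at v, then at w, then at v again."""
    return local_complement(local_complement(local_complement(adj, v), w), v)

def bipartite_pivot(adj, v, w):
    """Complement the edges between N(v) minus w and N(w) minus v."""
    new = {u: set(nb) for u, nb in adj.items()}
    for x in adj[v] - {w}:
        for y in adj[w] - {v}:
            toggle(new, x, y)
    return new

def swap_names(adj, v, w):
    """Exchange the names of v and w throughout the graph."""
    s = lambda u: w if u == v else v if u == w else u
    return {s(u): {s(x) for x in nb} for u, nb in adj.items()}

# The path a - v - w - b, pivoted on the edge vw.
path = {"a": {"v"}, "v": {"a", "w"}, "w": {"v", "b"}, "b": {"w"}}
```

In both descriptions the pivot adds the edge ab and turns the path into a 4-cycle.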
We will use the notation $H \le_V G$ to denote that graph H is a vertex-minor of graph G, meaning H can be obtained from G by a sequence of vertex removals and local complementations. A useful result is the following.
Lemma 2.1. If $H \le_V G$ then $\mathrm{rwd}(H) \le \mathrm{rwd}(G)$.

Graph classes with unbounded clique-width
In this section we show that the graph classes {G α }, where α is an infinite word over {0, 1, 2, 3} that contains infinitely many letters from {1, 2, 3}, have unbounded clique-width. We will extend the methods of Golumbic and Rotics [12] and Brandstädt and Lozin [2], the latter of which proved that the clique-width of the class of bipartite permutation graphs (i.e. G 2 ∞ ) is unbounded. We deal with {2, 3} graphs first as these are susceptible to direct proof methods. We can then use rank-width/local complementation techniques following Collins et al [3] to show that the more complex {0, 1, 2, 3} graphs have a suitable {2, 3} vertex-minor to prove that these too have unbounded clique-width.
In the case of binary words, the following result covers what we require.

{2, 3} graph classes with unbounded clique-width
To assist with determining the clique-width of these graph classes, it is helpful to consider a sequence {W α n } of graphs which are constructed as follows.
The vertices of W α n are an n × n array with vertex u i,j in row i and column j. These vertices are partitioned according to the diagonal they are on, such that u i,j is on diagonal D i+j−1 . Hence, D 1 = {u 1,1 }, D 2 = {u 2,1 , u 1,2 }, and so on.
$W^\alpha_n$ has an edge $(u_{i,j}, u_{k,l})$ whenever $u_{i,j} \in D_m$, $u_{k,l} \in D_{m+1}$ and either: (a) $\alpha_m = 2$ and $k \ge i$, or (b) $\alpha_m = 3$ and $l \ge j$.
Observe that these edges always create an n × n square grid. Furthermore, the edges between two consecutive diagonals $D_m$ and $D_{m+1}$ of $W^\alpha_n$ correspond to the edges between two consecutive columns $C_m$ and $C_{m+1}$ of $P^\alpha$.
Lemma 3.2. For all $n \in \mathbb{N}$, $W^\alpha_n$ can be embedded into $P^\alpha$.
Proof. For each vertex u x,y of W α n we will identify a vertex v i,j of P α so that we can define an injective map φ : V(W α n ) → V(P α ) such that φ(u x,y ) = v i,j as follows: If (x, y) = (1, 1) then (i, j) = (n, 1), otherwise (i, j) is given by: where It can be seen that the subgraph of P α induced by the vertices φ(V(W α n )) is isomorphic to W α n .
An example of embedding W ω 5 is shown in Figure 2.
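The construction of $W^\alpha_n$ can be sketched directly from its definition. The code assumes that the edge conditions are k ≥ i for the letter 2 and l ≥ j for the letter 3, and it checks that every edge of the n × n square grid is present; the function names are illustrative.

```python
def build_w(alpha, n):
    """Edges of W^alpha_n as ordered pairs (vertex on D_m, vertex on D_{m+1}).
    alpha(m) returns the m-th letter (2 or 3), 1-indexed."""
    vertices = [(i, j) for i in range(1, n + 1) for j in range(1, n + 1)]
    edges = set()
    for (i, j) in vertices:
        m = i + j - 1                       # u_{i,j} lies on diagonal D_m
        for (k, l) in vertices:
            if k + l - 1 != m + 1:          # only consecutive diagonals
                continue
            if (alpha(m) == 2 and k >= i) or (alpha(m) == 3 and l >= j):
                edges.add(((i, j), (k, l)))
    return edges

def grid_edges(n):
    """Edges of the n x n square grid, oriented towards the next diagonal."""
    edges = set()
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if i < n:
                edges.add(((i, j), (i + 1, j)))
            if j < n:
                edges.add(((i, j), (i, j + 1)))
    return edges

def omega(m):
    """The alternating word 2323..."""
    return 2 if m % 2 == 1 else 3
```

Both grid neighbours of $u_{i,j}$ on the next diagonal satisfy k ≥ i and l ≥ j, so `grid_edges(n)` is a subset of `build_w(alpha, n)` whichever letter governs the diagonal.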
We will now calculate a lower bound for the clique-width of W α n by demonstrating a minimum number of labels needed to construct W α n using the allowed four graph operations.
Let τ be a clique-width expression defining W α n and tree(τ) the rooted tree representing τ. The subtree of tree(τ) rooted at a node x will be denoted tree(x, τ). This subtree corresponds to a subgraph of W α n we will call W x .
Let a be the lowest ⊕ node in tree(τ) such that W a contains a full row and a full column of W α n . We denote by r and b the two children of a in tree(τ). Let us colour the vertices of W r and W b red and blue, respectively, and all the other vertices in W α n white. We let colour(v) denote the colour of a vertex v as described above, and label(v) denote the label of vertex v (if any) at node a.
We assume that there is neither a blue nor a red column in $W^\alpha_n$. For if $W^\alpha_n$ contains a blue column then obviously it cannot contain a red row, and it cannot contain a blue row due to the choice of a. Likewise, if $W^\alpha_n$ contains a red column. Hence, if it does include a blue or red column, as a consequence of the symmetry of $W^\alpha_n$, we can apply similar types of argument to those that follow to the rows instead of the columns to deliver the same result.
Observation 3.3. Suppose u, v, w are three vertices such that u and v are non-white, $uw \in E(W^\alpha_n)$ but $vw \notin E(W^\alpha_n)$, and colour(w) ≠ colour(u). Then u and v must have different labels at node a because the edge uw still needs to be created, whilst respecting the non-adjacency of v and w.
Let us denote a row without white vertices by r. From the foregoing, we know that every column j in $W^\alpha_n$ must have a vertex with a different colour from that of $v_{r,j}$. We denote by $x_j = v_{i,j}$ a vertex nearest to $v_{r,j}$ in the same column with colour($x_j$) ≠ colour($v_{r,j}$). We then let $y_j$ denote $v_{i+1,j}$ if i < r or $v_{i-1,j}$ if i > r. Notice that $y_j$ is non-white and $x_j$ is adjacent to $y_j$. This creates two sets of n vertices, $X = \{x_j : j = 1, \ldots, n\}$ and $Y = \{y_j : j = 1, \ldots, n\}$, where $x_j$ is adjacent to $y_j$ for each j.
We examine first the case for the word ω = 2323 · · · , the infinite word with alternating 2s and 3s, as this is relatively simple whilst demonstrating the technique required.
Lemma 3.4. The clique-width of $W^\omega_n$ is at least n/2.
Proof. We claim that for $W^\omega_n$, no three vertices in Y can have the same label. From this the lemma will follow since the n vertices in Y are then labelled with at least n/2 labels at node a. Suppose for a contradiction that vertices $Y' = \{y_{j_1}, y_{j_2}, y_{j_3}\} \subseteq Y$ have the same label at a. Without loss of generality we assume that $j_1 < j_2 < j_3$, and we write $X' = \{x_{j_1}, x_{j_2}, x_{j_3}\}$ for the corresponding vertices of X.
Assume a vertex $y_{j_i}$ in $Y'$ is not adjacent to a vertex $x_{j_k}$ in $X'$. Clearly, i ≠ k and hence vertices $y_{j_i}$, $y_{j_k}$ and $x_{j_k}$ form a triple as described in Observation 3.3, so $y_{j_i}$ and $y_{j_k}$ have different labels, a contradiction. Hence, each vertex in $Y'$ must be adjacent to each vertex in $X'$.
Observation 3.5. In $W^\omega_n$: (a) if a vertex $v_{i,j}$ lies on an odd diagonal $D_{2m-1}$ then it has only one neighbour to its right and all its neighbours lie on the even diagonals $D_{2m-2}$ and $D_{2m}$; (b) if a vertex $v_{i,j}$ lies on an even diagonal $D_{2m}$ then it has only one neighbour to its left and all its neighbours lie on the odd diagonals $D_{2m-1}$ and $D_{2m+1}$.
The leftmost vertex in X ′ , x j 1 , must have two neighbours in Y ′ to its right, so from Observation 3.5, x j 1 must sit on an even diagonal and all the vertices in Y ′ must sit on odd diagonals. On the other hand, the rightmost vertex in X ′ , x j 3 , must have two neighbours in Y ′ to its left, so from Observation 3.5, x j 3 must sit on an odd diagonal and all the vertices in Y ′ must sit on even diagonals. We have a contradiction and hence no three vertices in Y can have the same label, as required.
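The behaviour of the diagonals used in this argument can be checked numerically on small cases. The sketch below rebuilds $W^\omega_n$ under the assumption that the edge conditions are k ≥ i for the letter 2 and l ≥ j for the letter 3, and counts, for each vertex, its neighbours in columns strictly to the left and strictly to the right. Names are illustrative.

```python
def omega(m):
    """The alternating word 2323..."""
    return 2 if m % 2 == 1 else 3

def neighbours(n, alpha):
    """Neighbourhoods in W^alpha_n, with vertices written as (row, column)."""
    nb = {(i, j): set() for i in range(1, n + 1) for j in range(1, n + 1)}
    for (i, j) in nb:
        m = i + j - 1
        for (k, l) in nb:
            on_next_diagonal = (k + l - 1 == m + 1)
            if on_next_diagonal and ((alpha(m) == 2 and k >= i)
                                     or (alpha(m) == 3 and l >= j)):
                nb[(i, j)].add((k, l))
                nb[(k, l)].add((i, j))
    return nb

def side_counts(n, alpha):
    """Map each vertex to (neighbours strictly left, neighbours strictly right)."""
    nb = neighbours(n, alpha)
    return {
        v: (sum(1 for (_, l) in nb[v] if l < v[1]),
            sum(1 for (_, l) in nb[v] if l > v[1]))
        for v in nb
    }
```

Under these assumptions, vertices on even diagonals have at most one neighbour strictly to their left and vertices on odd diagonals have at most one neighbour strictly to their right (possibly none at the boundary of the array).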
The extension to consider arbitrary words over {2, 3} is similar, but the behaviour of the diagonals given in Observation 3.5 represents just two of several possible types of behaviour.
Lemma 3.6. If α is any infinite word over the alphabet {2, 3} then the clique-width of $W^\alpha_n$ is at least n/4.
Proof. We claim that for $W^\alpha_n$, no five vertices in Y can have the same label. From this the lemma will follow since the n vertices in Y are then labelled with at least n/4 labels at node a.
Suppose for a contradiction that vertices $Y' = \{y_{j_1}, y_{j_2}, y_{j_3}, y_{j_4}, y_{j_5}\} \subseteq Y$ have the same label at a. Without loss of generality we assume that $j_1 < j_2 < j_3 < j_4 < j_5$, and we write $X' = \{x_{j_1}, \ldots, x_{j_5}\}$ for the corresponding vertices of X. As in the proof of Lemma 3.4, each vertex in $Y'$ must be adjacent to each vertex in $X'$. If a vertex lies on diagonal $D_m$ we say it has type $\alpha_{m-1}\alpha_m$. Hence, we have 4 possible types of vertex, namely 22, 23, 32 and 33. The distinguishing feature of each is its neighbourhood. Each has two branches to its neighbourhood on diagonals $D_{m-1}$ and $D_{m+1}$ (see Figure 3). We have 5 vertices in $X'$ each of which could be one of 4 types, so potentially there are $4^5 = 1024$ different cases, but we can handle these in groups that depend on the type of the two outside vertices $x_{j_1}$ and $x_{j_5}$.
Case 1: x j 1 and x j 5 are each of type 22 or 33.
Since x j 1 must be adjacent to 4 vertices in Y ′ to its right and x j 5 must be adjacent to 4 vertices in Y ′ to its left, then all the vertices in Y ′ must be on the same diagonal (as the neighbourhood of x j 1 has only one branch going rightwards and the neighbourhood of x j 5 has only one branch going leftwards). But now there is no possible neighbourhood type for x j 3 that could make it simultaneously adjacent to y j 1 and y j 5 , so we have a contradiction.
Case 2: x j 1 and x j 5 are each of type 23 or 32.
In Lemma 3.4 we dealt with the case where there are 3 vertices in X ′ that are either 23 or 32 types so we only need to consider the case where the middle 3 vertices in X ′ are type 22 or 33.
Furthermore, as $x_{j_1}$ must be adjacent to 4 vertices in $Y'$ to its right and $x_{j_5}$ must be adjacent to 4 vertices in $Y'$ to its left, then $x_{j_1}$ can only be of type 23 and $x_{j_5}$ of type 32. Then considering vertex $x_{j_3}$ we must have all three vertices $x_{j_1}$, $x_{j_3}$ and $x_{j_5}$ on the same diagonal, otherwise $x_{j_3}$ cannot be adjacent to both $y_{j_1}$ and $y_{j_5}$. Lastly, $x_{j_2}$ and $x_{j_4}$ must be of the same type as $x_{j_3}$ (i.e. either type 22 or 33) and on the same diagonal as $x_{j_3}$ in order to be adjacent to both $y_{j_1}$ and $y_{j_5}$. But then they cannot both be adjacent to $y_{j_3}$ and we have a contradiction.
Case 3: One of x j 1 and x j 5 is of type 22 or 33 and the other is of type 23 or 32.
Suppose x j 1 is type 22 and x j 5 is type 32, then y j 3 , y j 4 and y j 5 must be on the same diagonal. x j 3 cannot be type 23 or 32 as it could not be adjacent to both y j 1 and y j 5 .
If x j 3 is type 33 then it must be on the same diagonal as x j 1 in order for both x j 1 to be adjacent to y j 3 and x j 3 to be adjacent to y j 1 . But then x j 2 cannot be any of the 4 types and still be simultaneously adjacent to y j 1 , y j 3 and y j 5 , hence we have a contradiction.
Likewise if x j 3 is type 22 then it must be on the diagonal D m−2 (where x j 1 is on D m ) in order for both x j 1 to be adjacent to y j 3 and x j 3 to be adjacent to y j 1 . But then, as before, x j 2 cannot be any of the 4 types and still be simultaneously adjacent to y j 1 , y j 3 and y j 5 , hence we have a contradiction.
For the other Case 3 combinations (e.g. x j 1 is type 33 and x j 5 is type 32) we can use an analogous argument and each time reach a contradiction. We omit the details.
We have now considered all possible combinations of vertex type and each one leads to a contradiction. Hence, no five vertices in Y can have the same label.
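The grouping of the 1024 type assignments into the three cases can be tallied with a short enumeration. The function `case_of` is a hypothetical helper that classifies an assignment by the types of the two outside vertices only.

```python
from itertools import product

TYPES = ("22", "23", "32", "33")

def case_of(first, last):
    """Case number determined by the types of the two outside vertices."""
    same = {"22", "33"}      # types with a repeated letter
    mixed = {"23", "32"}     # types with two different letters
    if first in same and last in same:
        return 1
    if first in mixed and last in mixed:
        return 2
    return 3

counts = {1: 0, 2: 0, 3: 0}
for assignment in product(TYPES, repeat=5):
    counts[case_of(assignment[0], assignment[4])] += 1
```

Cases 1 and 2 each account for 256 assignments and Case 3 for 512, which together cover all $4^5 = 1024$ possibilities.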

Lemma 3.7.
If α is any infinite word over the alphabet {2, 3} then the graph class G α is a hereditary class of graphs of unbounded clique-width.
Proof. This follows immediately from Lemmas 3.4 and 3.6.

{0, 1, 2, 3} graph classes with unbounded clique-width
We now extend our results to graph classes G α where α is an infinite word over the alphabet {0, 1, 2, 3} containing an infinite number of letters from {2, 3}. For this we will use the rank-width parameter described in Section 2.2. From [3] we have a toolkit of graph operations which we extend to show that the graph class G α contains a graph with a vertex-minor H γ 1,1 (q, q) for some q where γ is an infinite word from the alphabet {2, 3}. If we can make q as large as we like then combining Lemma 2.1 with Lemma 3.7 gives us the result that G α has unbounded rank-width and therefore unbounded clique-width.
The following graph operations are demonstrated in [3] unless otherwise stated. Each operation takes a graph $H^\alpha_{1,1}(m, n)$ and uses local complementation and vertex deletion to create a p × q vertex-minor $H^\gamma_{1,1}(p, q)$, for some $p \le m$ and $q \le n$, where γ is a q − 1 letter word derived from α with certain letters removed. We use the term 0 removal where the removed letter(s) are 0s and likewise 1 removal where the removed letter(s) are 1s.

0 Removal Operations
(i) Removing a 0 from the factor 00 can be achieved by applying local complementation to each of the vertices in the middle column and then deleting the vertices in that column.
(ii) Removing the 0 from the factor 01 can be achieved by applying local complementation to each of the vertices in the middle column sequentially from top to bottom and then deleting the vertices in that column. If the number of rows m is even this is equivalent to removing the 0 from the factor 01. If the number of rows is odd the same result is achieved by modifying the process so that the local complementation ends on row m − 1 and deleting the last row of vertices. The factor 10 can be reduced to 1 in the same way.
(iii) Removing the 0 from the factor 02 can be achieved by local complementation on the vertices of the middle column and then deleting the odd rows. Also factor 20 can be reduced to 2, 03 to 3 and 30 to 3 in the same way.
These operations allow us to create a vertex-minor H γ 1,1 (p, q) with the letters of γ from the alphabet {1, 2, 3}.
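The first of the 0 removal operations can be checked on a small example: three columns joined by two matchings (the factor 00) are reduced to two columns joined by a single matching. The representation of vertices as (row, column) pairs and all function names are illustrative assumptions.

```python
def local_complement(adj, v):
    """New graph in which the subgraph induced by N(v) is complemented."""
    new = {u: set(nb) for u, nb in adj.items()}
    nbrs = list(adj[v])
    for a in range(len(nbrs)):
        for b in range(a + 1, len(nbrs)):
            x, y = nbrs[a], nbrs[b]
            if y in new[x]:
                new[x].discard(y)
                new[y].discard(x)
            else:
                new[x].add(y)
                new[y].add(x)
    return new

def delete_vertices(adj, removed):
    """Induced subgraph on the vertices not in `removed`."""
    return {u: nb - removed for u, nb in adj.items() if u not in removed}

def double_zero(m):
    """Three columns of m rows with a matching between consecutive columns."""
    adj = {(i, c): set() for i in range(1, m + 1) for c in (1, 2, 3)}
    for i in range(1, m + 1):
        for c in (1, 2):
            adj[(i, c)].add((i, c + 1))
            adj[(i, c + 1)].add((i, c))
    return adj

def remove_zero(m):
    """Locally complement at each middle vertex, then delete the middle column."""
    adj = double_zero(m)
    middle = {(i, 2) for i in range(1, m + 1)}
    for v in sorted(middle):
        adj = local_complement(adj, v)
    return delete_vertices(adj, middle)
```

Each middle vertex has exactly two neighbours, one in each outer column, so local complementation joins them, and deleting the middle column leaves a matching between columns 1 and 3.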

1 Removal Operations
(i) Transforming the factor 211 into 2 can be achieved by using one pivot and deleting the first and last rows to give a 200 factor and then using 0 removal operations to reduce to 2. In the same way we can transform the factor 112 into 2, 311 into 3 and 113 into 3.
(ii) Transforming the factor 212 into 22 can be achieved by using one pivot and deleting the first and last rows to give a 202 factor and then using 0 removal operations to reduce to 22 . In the same way we can also reduce 313 to 33.
(iii) Finally, we claim we can transform the factor 213 into 22. As this is not covered by [3] we give the proof here.
Let $C_k$, $C_{k+1}$, $C_{k+2}$, $C_{k+3}$ be four consecutive columns of H i,j (m, n) such that $C_k \cup C_{k+1}$ induce a 2-link, $C_{k+1} \cup C_{k+2}$ induce a 1-link and $C_{k+2} \cup C_{k+3}$ induce a 3-link. Let x be the vertex in the first row of column $C_{k+1}$ and y be the vertex in the last row of column $C_{k+2}$. It can be seen that by pivoting on the edge xy and then deleting the whole of the first row, the second row to the right of column $C_{k+2}$, and y together with the vertices on the last row to the right of y, we have transformed the factor 213 into 202. We can then use the 0 removal operations to reduce to 22. In the same way we can also reduce 312 to 33.
Observation 3.8. If r is the number of rows prior to the removal of a 0 or 1 by one of these operations then after the operation the number of rows left will be at least (r/2) − 2.
Thus, by starting with a large enough number m in our choice of H i,j (m, n), we may remove a finite number of 0s and 1s and still ensure that there are enough rows left at the end of the process.
We now have a complete set of tools, using local complementation and vertex removal applied to $H^\alpha_{1,1}(m, n)$, to create a vertex-minor $H^\gamma_{1,1}(p, q)$ with the letters of γ coming only from the alphabet {2, 3}.
Lemma 3.9. Let α be an infinite word over the alphabet {0, 1, 2, 3} which has an infinite number of letters from {2, 3}. Further, let β be a factor $\alpha_k \alpha_{k+1} \cdots \alpha_{k+p-1}$ of length p which has q ($0 < q \le p$) letters from {2, 3} and p − q letters from {0, 1}.
Proof. This follows by applying the graph operations described above to remove the 0s and 1s. There are p − q such letters, and using Observation 3.8 it can be seen that by starting with at least $(q + 4)2^{p-q}$ rows, there will be at least q rows remaining after executing the necessary operations to remove them.
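The row count in this proof can be verified by iterating the bound of Observation 3.8. The sketch below uses exact rational arithmetic and treats the guaranteed bound (r/2) − 2 as if it were attained at every step, which is the worst case; the function names are ours.

```python
from fractions import Fraction

def rows_left(start, removals):
    """Apply the worst-case bound r -> r/2 - 2 once per removed letter."""
    r = Fraction(start)
    for _ in range(removals):
        r = r / 2 - 2
    return r

def guaranteed_rows(q, p):
    """Rows guaranteed after removing p - q letters,
    starting from (q + 4) * 2^(p - q) rows."""
    return rows_left((q + 4) * 2 ** (p - q), p - q)
```

Unrolling the recursion gives $q + 4 \cdot 2^{-(p-q)}$ rows in the worst case, which is always at least q.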
We now have the following theorem:
Proof. If there are no 2s or 3s or only a finite number of 2s and 3s in α, we can use Lemma 3.1.
If there is an infinite number of 2s and 3s in α then we can use Lemmas 3.7 and 3.9 as follows. For any q we can find a graph G in $\mathcal{G}^\alpha$ that has a vertex-minor $H^\gamma_{1,1}(q, q)$, for some infinite word γ using only letters from the alphabet {2, 3}. In turn, $H^\gamma_{1,1}(q, q)$ contains an induced subgraph $W^\gamma_{q/2}$, so using Lemma 2.1 we have $\mathrm{rwd}(G) \ge \mathrm{rwd}(H^\gamma_{1,1}(q, q)) \ge \mathrm{rwd}(W^\gamma_{q/2})$.
But from Lemma 3.6, $\mathrm{cwd}(W^\gamma_{q/2}) \ge q/8 \to \infty$ as $q \to \infty$ (and hence also $\mathrm{rwd}(W^\gamma_{q/2}) \to \infty$), so it follows that $\mathcal{G}^\alpha$ is a graph class with unbounded rank-width and hence unbounded clique-width.

Minimality of almost periodic graph classes
Let α be an infinite almost periodic word from the alphabet {0, 1, 2, 3}, with at least one non-zero letter. In this section we will prove that the graph classes $\mathcal{G}^\alpha$ are minimal of unbounded clique-width. To do this, we must show that any proper hereditary subclass has bounded clique-width. If C is a proper hereditary subclass of $\mathcal{G}^\alpha$ then there must exist a non-trivial finite forbidden graph F that is in $\mathcal{G}^\alpha$ but not in C. In turn, this graph F must be an induced subgraph of some $H^\alpha_{1,j}(k, k)$ for some $k \ge 2$.
Consider a graph G ∈ C ⊆ Free(H α 1,j (k, k)). If there exists an embedding φ : V(G) → V(P α ) straddling columns C j , . . . , C j+k−1 of P α then there must be limits on the vertices of φ(V(G)) in these columns to avoid creating an induced subgraph isomorphic to H α 1,j (k, k).
Clearly the most obvious thing to avoid is any k complete horizontal rows, which would automatically generate the forbidden graph. For graphs only involving letters 0 and 1 this is sufficient, and was dealt with in [3]. However, letters 2 and 3 are more complex. Lozin in [14] dealt with the class $\mathcal{G}^{2^\infty}$ by introducing the concept of clusters and creating a cluster graph, which provided a method of defining a partition of the vertices of G that gave the desired bound on clique-width.
In considering {0, 1, 2, 3} graphs we use a modified version of the cluster graph method used in [14] which we describe in Section 4.1.
The following lemma will be used to place a bound on the clique-width of induced subgraphs of $P^\alpha$. Given a graph G and a subset of vertices U ⊆ V(G), two vertices of U will be called U-similar if they have the same neighbourhood outside U. U-similarity is an equivalence relation. The number of equivalence classes of U in G will be denoted $\mu_G(U)$ (or µ(U) when the context is clear). Also, by G[U] we will denote the subgraph of G induced by U. It follows that U is a module of G if and only if µ(U) = 1.

Lemma 4.1 (Lozin [14]). If the vertices of a graph G can be partitioned into subsets $U_1, U_2, \ldots, U_n$ in such a way that for every i
(a) the clique-width of $G[U_i]$ is at most $k \ge 2$, and
(b) $\mu(U_i) \le l$ and $\mu(U_1 \cup \cdots \cup U_i) \le l$,
then the clique-width of G is at most kl.
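The quantity µ(U) is straightforward to compute from its definition: two vertices of U are U-similar exactly when their neighbourhoods outside U coincide. A minimal sketch, with graphs as adjacency dictionaries and a function name of our choosing:

```python
def mu(adj, U):
    """Number of U-similarity classes: distinct neighbourhoods outside U."""
    U = set(U)
    return len({frozenset(adj[u] - U) for u in U})

# A star with centre c, and the path a-b-c-d.
star = {"c": {"a", "b", "d"}, "a": {"c"}, "b": {"c"}, "d": {"c"}}
p4 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
```

In the star, the two leaves a and b form a module and `mu(star, {"a", "b"})` is 1. In the path, a has no neighbour outside {a, b} while b is adjacent to c, so `mu(p4, {"a", "b"})` is 2.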

Proof.
To build $H^\alpha_{i,j}(m, n)$ we partition it into subsets $U_1, U_2, \ldots, U_m$ by including in $U_k$ the vertices of the k-th row of $H^\alpha_{i,j}(m, n)$. This means $G[U_k]$ is a disjoint union of paths so has clique-width at most 3. Also, only a vertex in $U_k$ that is in a column s such that $\alpha_{s-1}$ and/or $\alpha_s$ is non-zero could have a neighbour outside $U_k$, so $\mu(U_k) \le 2t + 1$. Also, it is not difficult to see that $\mu(U_1 \cup \cdots \cup U_k) \le 2t + 1$, as the vertices in each column have the same neighbourhood in $H^\alpha_{i,j}(m, n)$ outside $U_1 \cup \cdots \cup U_k$. Therefore, the result follows by applying Lemma 4.1.
We will also make use of the following lemma. Thus we can assume that our arbitrary graph G is prime, and therefore connected, since if it were not so, we could instead consider a prime induced subgraph H of G which has the same clique-width.

Cluster graphs
Consider a connected graph G embedded in P α such that its leftmost vertex is in column C a and rightmost vertex in column C a+n−1 . Let a left module of G ∩ C j be a maximal set of vertices in that column that are indistinguishable by vertices in G ∩ C j−1 . Similarly, a right module of G ∩ C j is a maximal set of vertices that are indistinguishable by vertices in G ∩ C j+1 . Thus the vertices of every column of G, except the leftmost and rightmost columns, can be partitioned in two ways, as a set of left modules or as a set of right modules.
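Left and right modules can be computed by grouping the vertices of a column according to their neighbourhood in the adjacent column. The sketch below works on a single link, identifies vertices with their row numbers, and assumes the link rules i = k, i ≠ k, i ≤ k and i ≥ k for the letters 0, 1, 2 and 3 respectively; the function names are illustrative.

```python
def link_adjacent(letter, i, k):
    """Adjacency between row i of C_j and row k of C_{j+1} (assumed rules)."""
    return {0: i == k, 1: i != k, 2: i <= k, 3: i >= k}[letter]

def right_modules(letter, left_rows, right_rows):
    """Partition the rows of G in C_j into maximal sets that are
    indistinguishable by the vertices of G in C_{j+1}."""
    groups = {}
    for i in sorted(left_rows):
        nbhd = frozenset(k for k in right_rows if link_adjacent(letter, i, k))
        groups.setdefault(nbhd, []).append(i)
    return sorted(groups.values())

def left_modules(letter, left_rows, right_rows):
    """Partition the rows of G in C_{j+1} into maximal sets that are
    indistinguishable by the vertices of G in C_j."""
    groups = {}
    for k in sorted(right_rows):
        nbhd = frozenset(i for i in left_rows if link_adjacent(letter, i, k))
        groups.setdefault(nbhd, []).append(k)
    return sorted(groups.values())
```

For a 2-link with vertices in rows 1, 2, 4, 5 of the left column and rows 3, 6 of the right column, the right modules are {1, 2} and {4, 5}, and the left modules are the singletons {3} and {6}.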
Now consider an α j -link, G j (the subgraph of G induced by the vertices of G∩C j and G∩C j+1 ). For convenience, we will say G j is in standard form if, without changing the vertical order of the vertices, it is presented as an induced subgraph of H α 1,j (m, 2) with minimum m (i.e. taking out any superfluous gaps). Let G s j be the standard form of G j , noting that the (left and right) modules of G j and G s j are identical. Suppose R is the set of right modules of G s j ∩ C j and L the set of left modules of G s j ∩ C j+1 . We will say that two modules A and B overlap if the set of rows containing vertices of A has non-empty intersection with the set of rows containing vertices of B. It can be seen that a right module in R can only overlap with a left module in L on at most one row, for if they overlapped on two or more rows they would no longer be (right/left) modules. Furthermore, a right module in R cannot overlap with more than one left module in L and vice-versa.
If a right module in R overlaps with a left module in L, the two modules can be paired to form a cluster. This pairing process is well-defined and matches all such modules except for at most one unmatched right module in G ∩ C j and one unmatched left module in G ∩ C j+1 . These unmatched right/left modules have the characteristic they are indistinguishable to all vertices in the column to the right/left respectively. We will refer to them as right/left boundary modules and the vertices in them as right/left boundary vertices respectively. We put each boundary vertex in its own cluster in G j .
Hence, the clusters of G j form a partition of the vertices of the α j -link. If α j is 0 or 1 then every cluster in G j is either a horizontal pair of vertices or a boundary vertex. If α j is 2 or 3 then each cluster is a complete bipartite induced subgraph of G j or a boundary vertex. When there are no boundary vertices, G j is isomorphic to H α 1,j (m, 2) consisting of m clusters, each containing two vertices of the same row.
At this stage, the vertices in the leftmost (G ∩ C a ) and rightmost (G ∩ C a+n−1 ) columns are each only in one cluster as they only appear in one α j -link. We now add two additional columns of clusters, one to the left of G with a cluster for each vertex of G ∩ C a , and one to the right of G with a cluster for each vertex of column G ∩ C a+n−1 . Thus the vertices of every column G ∩ C j of G are now in two clusters.
With any finite induced subgraph G of P α we associate an oriented graph which we call the cluster graph B(G), whose vertices are each associated with one of the clusters of G. The vertices of B(G) representing clusters of the same α j -link, G j , we call a column of B(G), and denote this by B(G j ). Of the two additional cluster columns defined in the previous paragraph we will call the one on the left B(G a−1 ) and the one on the right B(G a+n ). For ease of exposition, we will always present the vertices of B(G j ) in the same vertical order as in G. In the following we denote the i-th cluster of G j , counting from top to bottom, as K i,j , with corresponding vertex, u i,j in B(G j ). The edges of B(G) are defined as follows.
Type A If K r,j has a vertex of G in common with cluster K s,j+1 then B(G) has a directed edge (u r,j , u s,j+1 ) (i.e. an edge oriented from u r,j to u s,j+1 ). Note that if there were more than one vertex of G in the intersection between two clusters of G, then these vertices would form a module of size greater than one and G would not be prime, a contradiction. Hence, any two clusters of G have at most one vertex in the intersection. It follows that each Type A edge of B(G) corresponds to a vertex of G.
Type B Let u i,j and u i+1,j be two consecutive vertices in a column of B(G). If α j = 2 then B(G) has a directed edge (u i,j , u i+1,j ) and if α j = 3 then B(G) has a directed edge (u i+1,j , u i,j ).
Edges of type A are oriented from left to right and go between consecutive columns of B(G) whilst edges of type B are oriented down when α j = 2 and up when α j = 3. Drawing B(G) by arranging the vertices in columns in the same order as the respective clusters of G it becomes clear that B(G) is a planar graph.
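The orientation rule for Type B edges can be made concrete with a minimal Python sketch. The function name type_b_edges and the encoding of a vertex u i,j as the pair (i, j) are illustrative assumptions, not notation from the construction above; columns over the letters 0 and 1 receive no Type B edges, in line with the rule stated for α j = 2 and α j = 3 only.

```python
def type_b_edges(j, m, letter):
    """Type B edges of column j of B(G), which has m clusters.

    A vertex u_{i,j} is encoded as the pair (i, j), with i counted from
    the top (hypothetical encoding chosen for this sketch only).
    """
    if letter == 2:
        # oriented downwards: u_{i,j} -> u_{i+1,j}
        return [((i, j), (i + 1, j)) for i in range(1, m)]
    if letter == 3:
        # oriented upwards: u_{i+1,j} -> u_{i,j}
        return [((i + 1, j), (i, j)) for i in range(1, m)]
    # letters 0 and 1: no Type B edges
    return []


down = type_b_edges(4, 3, 2)
up = type_b_edges(4, 3, 3)
```

Here down lists the two downward edges of a three-cluster column and up lists the same edges reversed.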
If we have a right boundary vertex in C j then it will be in both a cluster in G j−1 , say K r,j−1 , and a singleton cluster in G j , say K s,j . The two vertices, u r,j−1 and u s,j in B(G) associated with these clusters will be joined by a type A edge. However, there can be no type A edge to the immediate right of u s,j as K s,j contains no vertex from column C j+1 . Similarly for left boundary vertices, mutatis mutandis.
An example of a cluster graph is shown in Figure 4.
As we assume G is prime (from Lemma 4.3) and therefore connected, when it is embedded in P α it must occupy vertices in consecutive columns (i.e. if it has one or more vertices in columns C x and C x+2 then it must also have at least one vertex in column C x+1 ). Suppose G straddles a set of columns including the k columns C j , . . . , C j+k−1 with defining factor β = α j α j+1 · · · α j+k−2 . Let us denote the subgraph of G induced by these columns G * . The respective graph B(G * ) will be denoted by B * ; it has k + 1 columns denoted by B 1 , . . . , B k+1 . It can be seen that if B * has k directed disjoint paths from column B 1 to column B k+1 then G * contains the forbidden subgraph H α 1,1 (k, k), although the reverse is not necessarily true.
This leads to the following result:
Lemma 4.4. Let C be a proper subclass of G α such that C ⊆ Free(H α 1,j (k, k)) ⊂ G α . Furthermore, let G be any graph in C with induced subgraph G * and associated cluster graph B * defined as above. Then B * can have at most k − 1 directed paths from column B 1 to column B k+1 .

Applying Menger's Theorem
We will be using Menger's Theorem to help us define a partition of the vertices of G on which to apply Lemma 4.1. Menger's Theorem is one of the cornerstones of graph theory.
Proof. Let X denote the set of vertices of V(G) \ S that can be reached from A by following directed paths, and let Y denote the set of vertices of V(G) \ S from which there starts a directed path that ends in a vertex of B. Now, X and Y are disjoint (by Menger's theorem). If there are any vertices of V(G) \ S that lie in neither X nor Y, we can assign them to either arbitrarily. Now every edge with one endpoint in X and the other in Y must be oriented from Y to X, otherwise we find a directed path from A to B.
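The sets X and Y used in this proof can be computed by two graph searches. The following Python sketch does so on a small hypothetical directed graph; the function names, the toy edge list and the choice of A, B and S are assumptions made purely for illustration.

```python
from collections import deque


def reach(adj, starts, blocked):
    """Vertices reachable from `starts` along `adj` without entering `blocked`."""
    seen = {v for v in starts if v not in blocked}
    queue = deque(seen)
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in blocked and v not in seen:
                seen.add(v)
                queue.append(v)
    return seen


def separator_partition(edges, A, B, S):
    """Return (X, Y): X is reachable from A, Y can reach B, both avoiding S."""
    forward, backward = {}, {}
    for u, v in edges:
        forward.setdefault(u, []).append(v)
        backward.setdefault(v, []).append(u)
    return reach(forward, A, S), reach(backward, B, S)


# Toy example: every directed path from A to B passes through "s".
edges = [("a1", "s"), ("s", "b1"), ("a2", "a1"), ("b1", "b2"), ("b2", "a2")]
A, B, S = {"a1", "a2"}, {"b1", "b2"}, {"s"}
X, Y = separator_partition(edges, A, B, S)
# Edges oriented from X to Y; the argument in the proof says there are none.
crossing = [(u, v) for u, v in edges if u in X and v in Y]
```

In this example the only edge between the two sides is ("b2", "a2"), which is oriented from Y to X as the proof requires.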
We can apply this to the cluster graph B * referred to in Lemma 4.4 with columns B 1 and B k+1 connected to each other by a set P of at most k − 1 disjoint paths. Denote s = |P|. The s paths of P cut B * into s + 1 horizontal stripes, that is, subgraphs induced by two consecutive paths in P and all the vertices between them (s − 1 such stripes) and 2 further stripes for the subgraphs induced by the top path and all vertices above it, and the bottom path and all vertices below it.
From Menger's Theorem these two columns can be separated from each other by a set S of s ≤ k − 1 vertices, containing exactly one vertex in each of the paths, such that there are no paths from B 1 to B k+1 that avoid this set S. From Corollary 4.6 we have a partition of the vertices of B * \ S into two sets X V(B * ) and Y V(B * ) such that there are no directed edges from a vertex in X V(B * ) to a vertex in Y V(B * ) . As B * is planar this means we can draw a curve Ω between X V(B * ) and Y V(B * ) such that this curve crosses B * at precisely the set S and such that there are no directed edges crossing Ω from X V(B * ) to Y V(B * ) . It follows that we can partition the Type A edges of B * into two sets X E(B * ) and Y E(B * ) either side of Ω, and as these Type A edges correspond to the vertices of G * then we also have a partition of these vertices into two sets X V(G * ) and Y V(G * ) .

Almost periodic {0, 1, 2, 3} graph classes are minimal of unbounded clique-width
We now come to the key result of Section 4.

Lemma 4.7.
Let α be an infinite almost periodic word from the alphabet {0, 1, 2, 3} which has at least one non-zero letter, and k a natural number at least 2. Further, let β = α j α j+1 · · · α j+k−2 be a (k − 1)-letter factor of α such that β appears in every factor of α of length L(β), so that H α 1,j (k, k) is a graph in G α whose edges correspond to the subword β. Then any graph G in G α that is H α 1,j (k, k)-free has clique-width bounded by a constant c(k, L(β)) that depends only on k and L(β).
Proof. Let G be a graph in G α that is H α 1,j (k, k)-free. In the following we refer to vertex grid coordinates (x, y) of an embedding of G in P α as described in Section 2.1. As before, we assume G is prime (from Lemma 4.3) and therefore connected.
We define a partition {V 1 , . . . , V n } of the vertices of G as follows. Let a be the first column of P α in which a vertex of G is embedded. Denoting the set of vertices of G in a set of consecutive columns as a bar, let V i be the bar of G in columns [a + (i − 1)(L(β) The corresponding subword for the graph induced by bar V i is of length L(β) so must contain a copy of β by definition. Let this copy of β correspond to columns C y , . . . , C y+k−1 of P α . Following the same notation as Section 4.1 we define G * as the subgraph of G induced by the columns C y , . . . , C y+k−1 and B * its respective cluster graph, with columns B 1 , . . . , B k+1 . We define P, S, s, Ω,
We now show that the partition X V(G * ) /Y V(G * ) of the vertices of G * defined in Section 4.2 gives us a number of equivalence classes, µ G * (X V(G * ) ) and µ G * (Y V(G * ) ), bounded by a function of k. We consider this in 3 cases depending on the alphabet of β:
Case 1 β is a subword from the alphabet {0, 1}.
A {0, 1} cluster graph B(G) contains only edges of type A and only horizontal paths. Every cluster is either a horizontal pair of vertices or a boundary vertex. Each row of B(G) is either a (left to right) directed path or a disjoint union of directed paths. If a row is a disjoint union of paths then the gaps between the paths have either no vertex or a boundary vertex immediately on either side.
It is easy to see that the curve Ω must traverse each stripe of B * by passing through a gap in each row between the paths at the top and bottom of the stripe.
From Section 4.2 the X E(B * ) /Y E(B * ) partition of the Type A edges of B * defined by Ω gives a corresponding X V(G * ) /Y V(G * ) partition of the vertices of G * . We can partition the edges of B * that correspond to vertices of a column C j of G * , into at most 2s + 1 subsets C 1,j , . . . , C 2s+1,j , as follows:
(i) The edges forming the paths of P (s edges/subsets).
(ii) The remaining edges in each stripe (at most s + 1 subsets).
We claim that no vertex of Y V(G * ) can distinguish the vertices of C i,j ∩ X V(G * ) . Suppose to the contrary, that a vertex y ∈ Y V(G * ) is not adjacent to x 1 ∈ C i,j ∩ X V(G * ) but is adjacent to x 2 ∈ C i,j ∩ X V(G * ) . Then x 1 and x 2 cannot be in the same cluster in G j , because they are on different rows, but one of them must be in the same cluster as y. But this cluster is then not a boundary cluster as it contains two vertices and hence the cluster must be on a path of P. But x 1 and x 2 are in different clusters in G j so cannot both be on the same path of P and so are in different subsets, C i,j ∩ X V(G * ) , a contradiction.
So the maximum number of equivalence classes in C j ∩ X V(G * ) is 2s + 1 ≤ 2k − 1 and hence µ G * (X V(G * ) ) is at most the number of different C i,j 's, which is at most k(2s + 1) ≤ 2k² − k.
Case 2 β is a subword from the alphabet {2, 3}.
Without loss of generality, we may assume that no α-link G j , where α j ∈ {2, 3}, contains a boundary vertex. For if such vertices exist, they will be positioned at one extreme (top or bottom) of a column. It is then possible to add an additional vertex in the opposite column to turn them into a cluster. Therefore, by adding at most two vertices to each column of G, we can extend it to a graph G ′ which has no boundary vertices, contains G as an induced subgraph and is H α 1,j (k + 2, k)-free.
Observation 4.8. The curve Ω traverses each stripe of B * monotonically in a horizontal direction, meaning that its x-coordinate changes within a stripe either non-increasingly or non-decreasingly.
Proof of Observation. Suppose for a contradiction that Ω has an unavoidable local maximum within a stripe. Then there is a vertex v (to the left of the curve) that causes this maximum x-coordinate. Clearly, v does not belong to B k+1 (since otherwise B k+1 is not separated from B 1 ), and v must have a neighbour to its right within the stripe (since there are no boundary vertices in G). But then the Type A edge connecting v to that neighbour would cross Ω, which contradicts Corollary 4.6.
This observation allows us to conclude that whenever Ω separates the Type A edges between two columns of B * within a stripe, the result is two intervals, one above Ω and one below it.
From Section 4.2 the X E(B * ) /Y E(B * ) partition of the Type A edges of B * defined by Ω gives a corresponding X V(G * ) /Y V(G * ) partition of the vertices of G * . We can partition the Type A edges of B * that correspond to vertices of a column C j of G * , into at most 4s + 1 ≤ 4k − 3 subsets C 1,j , . . . , C 4s+1,j , as follows:
(i) The Type A edges intersecting the paths of P (s edges/subsets).
(ii) For each such edge e, the Type A edges that have a common vertex with e, at most 2 subsets in each stripe (up to 2s subsets).
(iii) The remaining Type A edges in each stripe (at most s + 1 subsets).
From Observation 4.8 the vertices of each C i,j form an interval, i.e., they are consecutive in C j . We claim that no vertex of Y V(G * ) can distinguish the vertices of C i,j ∩ X V(G * ) . Suppose to the contrary, that a vertex y ∈ Y V(G * ) is not adjacent to x 1 ∈ C i,j ∩ X V(G * ) but is adjacent to x 2 ∈ C i,j ∩ X V(G * ) . Without loss of generality we assume that α j = 2, as the case α j = 3 follows by symmetry.
The vertices x 1 and x 2 cannot be in the same cluster in G j because they are distinguished by y. Furthermore, as α j = 2 then y must be in column C j+1 on a row above that of x 1 but below or level with the row of x 2 . We can assume that y is in the same G j cluster as x 2 because if it is not, then it must be in a cluster positioned between the cluster containing x 2 and the cluster containing x 1 . As C i,j is an interval, this cluster must include some other vertex x 3 ∈ C i,j ∩ X V(G * ) , and we can proceed using x 3 instead of x 2 . Let the cluster of G j including y and x 2 be denoted K r,j .
Let u r,j denote the vertex of B * corresponding to K r,j . Also, let e x 1 , e x 2 , e y be the edges of B * corresponding to vertices x 1 , x 2 , and y respectively. Since e x 2 and e y are incident to u r,j but separated by Ω, vertex u r,j lies on Ω and hence belongs to the separator S. Therefore, u r,j belongs to a path from P. But then C i,j is of the second type and therefore e x 1 must also be incident to u r,j . This contradicts the fact that x 1 does not belong to K r,j . This contradiction shows that any two vertices of C i,j ∩ X V(G * ) have the same neighbourhood in Y.
So µ G * (X V(G * ) ) is at most the number of different C i,j 's, which is at most k(4s + 1) ≤ 4k² − 3k.
Case 3 β is a subword from the alphabet {0, 1, 2, 3}.
Each column B i of cluster graph B * is associated with a letter of β. We can divide B * into alternating {0, 1} bars and {2, 3} bars (reminder, a bar is a set of consecutive columns). Suppose we label these bars D 1 , D 2 , . . . , D m of lengths k 1 , k 2 , . . . , k m so that k 1 + k 2 + · · · + k m = k − 1. Without loss of generality, we will say that if i is odd, D i is a {0, 1} bar and if i is even D i is a {2, 3} bar.
Define the partition curve Ω as before. It can be observed that, within each stripe, Ω can only pass at most once through each bar in B * . For if it passed twice through a column in a {2, 3} bar, in a given stripe, with at least one vertex between the two sections of Ω then there must be a Type B edge passing from X V(B * ) to Y V(B * ) which contradicts Corollary 4.6 of Menger's Theorem.
Also, for the same reasons given in Case 2, within each stripe, the line Ω must pass across {2, 3} bars monotonically in a left/right x-coordinate sense.
Using the same arguments as used in the {0, 1} and {2, 3} proofs we can partition the vertices of G * into sets X V(G * ) and Y V(G * ) such that the maximum number of different equivalence classes for a column
Using this X V(G * ) /Y V(G * ) partition of the vertices of G * we can create a partition of V i . All vertices in V i in columns to the left of G * are added to the vertices of X V(G * ) and all the vertices in V i in columns to the right of G * are added to the vertices of Y V(G * ) to produce a partition of bar V i into two parts X i and Y i . Let U i = Y i−1 ∪ X i . Each G[U i ] has at most 2(L(β) + 1) columns. The subsets U 1 , . . . , U n form a partition of the vertices of G, such that for every i:
(a) using Corollary 4.2 the clique-width of G[U i ] is at most 6(2L(β) + 2) + 3 = 12L(β) + 15, and
Thus from Lemma 4.1 the clique-width of G is at most (12L(β) + 15)(8k² − 6k).
Hence the clique-width of G is bounded by a constant that depends only on k and L(β).
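To get a feel for the size of this constant, the bound stated at the end of the proof can be evaluated directly. The sketch below simply transcribes the expression (12L(β) + 15)(8k² − 6k); the function name and the sample values of k and L(β) are illustrative assumptions.

```python
def clique_width_bound(k, L):
    """Evaluate (12L + 15)(8k^2 - 6k), the bound from the proof of Lemma 4.7.

    k is the parameter of the forbidden graph (k >= 2) and L stands for L(beta).
    """
    if k < 2:
        raise ValueError("the lemma assumes k >= 2")
    return (12 * L + 15) * (8 * k * k - 6 * k)


small = clique_width_bound(2, 1)   # (12 + 15) * (32 - 12)
larger = clique_width_bound(3, 5)  # (60 + 15) * (72 - 18)
```

The bound grows linearly in L(β) and quadratically in k, so it is far from tight for small graphs; its role is only to be independent of the size of G.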
Theorem 4.9. Let α be an infinite almost periodic word over the alphabet {0, 1, 2, 3} containing at least one non-zero letter. Then the class G α is a minimal hereditary class of graphs of unbounded clique-width.
Proof. If C is a proper hereditary subclass of G α then there must exist a non-trivial finite forbidden graph F that is in G α but not in C. But F must be an induced subgraph of some H α 1,j (k, k), so C ⊆ Free(H α 1,j (k, k)) and Lemma 4.7 gives us a bound on the clique-width. Hence, G α is a minimal hereditary class of graphs of unbounded clique-width.

Uncountably many minimal graph classes with unbounded clique-width
We now proceed to show that there is an uncountably infinite number of such graph classes. To do this we will use the class of almost periodic sequences known as Sturmian. One definition of a Sturmian sequence is a binary sequence that has complexity p α (n) = n + 1, where the complexity function p α (n) is the number of different factors of length n in α [10].
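As an illustration of the complexity condition, the following Python sketch counts the distinct factors of each length in a long prefix of the Fibonacci word, a standard example of a Sturmian sequence. The Fibonacci word is not used elsewhere in this paper; it is brought in here only as a familiar test case, and the function names and prefix length are assumptions of the sketch.

```python
def fibonacci_word(iterations):
    """Prefix of the Fibonacci word, via the substitution 0 -> 01, 1 -> 0."""
    word = "0"
    for _ in range(iterations):
        word = "".join("01" if c == "0" else "0" for c in word)
    return word


def complexity(word, n):
    """Number of distinct factors of length n occurring in the finite word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


prefix = fibonacci_word(15)  # a prefix of length 1597
counts = [complexity(prefix, n) for n in range(1, 11)]
```

On a finite prefix this only gives a lower bound for p α (n), but for short factors and a prefix of this length the count already equals n + 1.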
An alternative characterisation of Sturmian sequences is as rotation sequences defined by an irrational number, and hence it follows that the number of such sequences is uncountably infinite. We say that two sequences are locally isomorphic if they have the same factors. If two Sturmian sequences are locally isomorphic this means they have the same n + 1 factors of length n out of a possible 2^n such factors [18]. Hence the set of Sturmian sequences with a particular set of factors is countable in number and so it follows there is an uncountable number of such sets with different factors.
We denote rev(β) as the sequence β in reverse order (mirror image). Then F β can be embedded in P α if and only if β or rev(β) is a factor of α.
Proof. Clearly, by its definition, F β can be embedded in P α in the way described if β (or rev(β)) is a factor of α.
[To avoid much repetition in what follows we will just refer to β to mean β or rev(β).] We prove that F β can only be embedded in P α in the way described, and only if β is a factor of α, by induction on k.
Firstly, if k = 2 then β = 1 and F β = C 6 , the cycle on 6 vertices. It is trivial to see that this can be embedded in P α only if there is at least one 1 in α. In fact, C 6 can be embedded in two ways. Firstly, (Method 1) in the way described in the Lemma, with 6 vertices from 3 rows and 2 consecutive columns, or secondly, (Method 2) over 4 consecutive columns with 1 vertex from the first and last column and 2 from each of the middle columns.
If k = 3 then β must be 10, 01 or 11. F β still includes an induced subgraph C 6 , but now the addition of 3 more vertices and corresponding edges means we can no longer use Method 2. Hence, F β can only be embedded in P α in the way described in the Lemma (Method 1) .
Next using the strong induction hypothesis, we assume that the Lemma is true for all words of length less than k − 1. Thus if β contains a factor that is not a factor of α then F β cannot embed in P α .
If F β does embed in P α then, if β − is the word β without its last letter, we must have β − a factor of α where F β − can only embed in P α by Method 1. Now it is straightforward to see that this cannot be extended to F β if the next letter is not the same as the last letter of β, and that if it is the same, it can only be done by Method 1.
Proof. There exists an uncountably infinite number of Sturmian binary sequences that are not locally isomorphic. Suppose we have Sturmian words α 1 and α 2 that have unique factors β 1 and β 2 respectively. Then using Lemma 4.10, the class G α 1 does not contain the graph F β 2 and the class G α 2 does not contain the graph F β 1 . So G α 1 and G α 2 are different graph classes. It follows from Theorem 4.9 that each one defines a different minimal hereditary class of graphs of unbounded clique-width.

Recurrent but not almost periodic words
We have seen that (with the exception of the all-zeros word) every almost periodic word α over {0, 1, 2, 3} defines a minimal hereditary class G α of unbounded clique-width. At the other extreme, if α is a word over {0, 1, 2, 3} that contains a factor β = α j α j+1 · · · α j+k−2 that either does not repeat, or repeats only a finite number of times, then G α cannot be a minimal class of unbounded clique-width, as forbidding the induced subgraph H α 1,j (k, k) would leave a proper subclass that by Theorem 3.10 still has unbounded clique-width.
Thus, to complete the delineation between minimality and non-minimality (with respect to having unbounded clique-width) of the classes G α , it remains to consider words α that are recurrent but not almost periodic, i.e. words in which each factor occurs infinitely many times, but where the gap between consecutive occurrences of a factor may be arbitrarily large.
Fix a recurrent but not almost periodic word α over {0, 1, 2, 3}. Since α is recurrent, any factor β of α must occur an infinite number of times, and we will call the factors between any consecutive pair of occurrences of β the β-gap factors. Since α is not almost periodic, there exists a factor β of length k − 1, say, such that the β-gap factors can be arbitrarily long. Denote the sequence of β-gap factors by γ 1 , γ 2 , . . . . If, amongst these gap factors, we find that for any integer m there exists some (indeed, infinitely many) γ i which has at least m letters that are not 0, then by the analysis in Section 3 there exist graphs whose clique-width grows as a function of m. Thus, the proper subclass C = Free(H α 1,j (k, k)) ∩ G α (where j denotes the start of the first occurrence of β in α) contains graphs of arbitrarily large clique-width, and thus G α is not minimal.
Now let Γ denote the collection of all recurrent words α over {0, 1, 2, 3} other than the all-zeros word, with the property that for any factor β of α, the weight of every β-gap factor is bounded. We now show that it is precisely the words in Γ that define minimal classes of unbounded clique-width.
Proof. If G γ is a minimal hereditary graph class of unbounded clique-width, and γ is not almost periodic, then from the preamble to Section 5 we have already demonstrated that γ ∈ Γ .
To prove the converse, suppose γ ∈ Γ . In the case that γ is almost periodic, we may appeal directly to Lemma 4.7. For this more general setting, we may proceed in an almost identical manner.
If C is a proper hereditary subclass of G γ then there must exist a non-trivial finite forbidden graph F that is in G γ but not in C. In turn, this graph F must be an induced subgraph of some H γ 1,j (k, k) for some k ∈ N. Any graph G in C must therefore be H γ 1,j (k, k)-free for this fixed value of k ≥ 2.
As before, let β = γ j γ j+1 · · · γ j+k−2 and G * denote the subgraph of G induced by the columns C j . . . C j+k−1 . We can use the same cluster graph arguments to show that there is a partition We know that the factor β appears an infinite number of times in γ and that the weight of the string between each copy of β is bounded by a constant, say, W(β).
Suppose the i-th copy of β in γ generates the subgraph G * i of G, with corresponding partition X i /Y i , then we define U i as the subgraph induced by the vertices of Y i−1 , X i and all the vertices of G in columns between these two sets.
But we know that k and W(β) are fixed dependent on the forbidden graph F and hence the graph class C has bounded clique-width. Thus G γ is a minimal hereditary graph class of unbounded clique-width.
While Γ includes every periodic and almost periodic word over {0, 1, 2, 3}, it does also contain other (recurrent) words. One simple way to generate such sequences is by substitution. We use [10] as our reference work on substitutions. For example, consider the infinite binary word ψ generated by an iterative substitution σ, beginning with 1 such that σ(1) = 1010 and σ(0) = 0.
The first four iterates, and the start of ψ, are as follows. The word ψ has the following characteristics.
(i) The number of ones doubles with each iteration and therefore ψ contains an infinite number of ones.
(iii) By construction ψ is recurrent but is not almost periodic, because it contains arbitrarily long strings of zeros.
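The iterates of σ are easy to generate mechanically. The following Python sketch applies the substitution σ(1) = 1010, σ(0) = 0 repeatedly to the seed 1 and records, for each iterate, the number of ones and the length of the trailing run of zeros; the function names are illustrative assumptions.

```python
def sigma(word):
    """Apply the substitution sigma(1) = 1010, sigma(0) = 0 letter by letter."""
    return "".join("1010" if c == "1" else "0" for c in word)


def iterate(n, seed="1"):
    """Return sigma^n(seed)."""
    word = seed
    for _ in range(n):
        word = sigma(word)
    return word


iterates = [iterate(n) for n in range(1, 5)]
ones = [w.count("1") for w in iterates]
trailing_zeros = [len(w) - len(w.rstrip("0")) for w in iterates]
```

Each iterate is a prefix of the next, so the iterates converge to ψ; the lists ones and trailing_zeros show the doubling of the number of ones and the growth of the final block of zeros.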
The following lemma shows that ψ ∈ Γ , and therefore provides us with the promised counterexample to the conjecture of Collins et al [3].

Lemma 5.2.
For any factor β of the word ψ, the weight of the β-gap factors is bounded, and thus ψ ∈ Γ .
Proof. Suppose the longest subfactor of contiguous zeros in β is 0^k. It can be observed that σ^n(1) ends with the factor 0^n. Hence β must have appeared by the (k + 1)-th iteration, σ^{k+1}(1), or it is not a factor of ψ. Since |σ^{k+1}(1)|_1 = 2^{k+1}, we have this as a bound on the weight between any consecutive occurrences.
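The bound can be checked experimentally on a finite prefix of ψ. In the sketch below, the weight of a gap factor is taken to be its number of non-zero letters; this reading of "weight" is an assumption of the sketch, as are the function names and the convention that overlapping occurrences of β have an empty gap.

```python
def sigma(word):
    """Apply the substitution sigma(1) = 1010, sigma(0) = 0 letter by letter."""
    return "".join("1010" if c == "1" else "0" for c in word)


def gap_weights(word, beta):
    """Weights of the gap factors between consecutive occurrences of beta."""
    starts = [i for i in range(len(word) - len(beta) + 1)
              if word.startswith(beta, i)]
    # The gap runs from the end of one occurrence to the start of the next;
    # it is empty (weight 0) when the two occurrences touch or overlap.
    return [sum(c != "0" for c in word[p + len(beta):q])
            for p, q in zip(starts, starts[1:])]


prefix = "1"
for _ in range(8):
    prefix = sigma(prefix)  # sigma^8(1), a prefix of psi

max_weight_1 = max(gap_weights(prefix, "1"))    # gaps are runs of zeros
max_weight_00 = max(gap_weights(prefix, "00"))  # here k = 2
```

For β = 00 the longest run of zeros has k = 2, so the lemma gives the bound 2^{k+1} = 8 on the weight, which the computed value respects.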
We can extend this idea to construct other recurrent but not almost periodic infinite binary sequences in Γ . Indeed, any iterative substitution σ_γ where σ_γ(1) = δ and σ_γ(0) = 0 such that δ is a finite binary word whose first letter is 1, last letter is 0, and with |δ|_1 ≥ 2 will define a sequence γ. Now |σ_γ^n(1)|_1 = |δ|_1^n and it follows using Lemma 5.2 that the weight of the set of β-gap factors for every factor β is bounded, and hence γ ∈ Γ .
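The count of ones in the n-th iterate follows from a one-line induction, sketched here under the convention that |w|_1 denotes the number of ones in a finite word w. Since the substitution replaces every 1 by δ and every 0 by 0, it multiplies the number of ones by |δ|_1:

```latex
\[
  |\sigma_\gamma(w)|_1 = |\delta|_1 \cdot |w|_1
  \quad\text{for every finite binary word } w,
\]
\[
  |\sigma_\gamma^{n}(1)|_1
  = |\delta|_1 \cdot |\sigma_\gamma^{n-1}(1)|_1
  = \cdots
  = |\delta|_1^{\,n} \cdot |1|_1
  = |\delta|_1^{\,n}.
\]
```

For δ = 1010 this gives 2^n, in agreement with the doubling of the number of ones observed for ψ.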
Finally, notice that Γ does not comprise all recurrent binary sequences. Indeed, for any γ ∈ Γ that is recurrent but not almost periodic, the sequence γ̄, formed as the complement of γ (i.e. by exchanging the 1s and 0s), is a recurrent sequence that does not lie in Γ , and so G γ̄ is not a minimal hereditary graph class of unbounded clique-width.
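The effect of complementation can be seen on the word ψ defined by σ(1) = 1010, σ(0) = 0. The Python sketch below complements successive iterates σ^n(1) and measures the largest weight of a gap between consecutive occurrences of the one-letter factor 0; as before, taking the weight to be the number of non-zero letters, and the function names, are assumptions of the sketch.

```python
def sigma(word):
    """Apply the substitution sigma(1) = 1010, sigma(0) = 0 letter by letter."""
    return "".join("1010" if c == "1" else "0" for c in word)


def complement(word):
    """Exchange the letters 0 and 1."""
    return "".join("1" if c == "0" else "0" for c in word)


def max_gap_weight(word, beta):
    """Largest number of non-zero letters between consecutive occurrences of beta."""
    starts = [i for i in range(len(word) - len(beta) + 1)
              if word.startswith(beta, i)]
    return max(sum(c != "0" for c in word[p + len(beta):q])
               for p, q in zip(starts, starts[1:]))


word, growth = "1", []
for n in range(1, 7):
    word = sigma(word)  # sigma^n(1)
    growth.append(max_gap_weight(complement(word), "0"))
```

The list growth increases with n, reflecting the fact that the runs of zeros in ψ become runs of ones in the complement, so the gap weights for the factor 0 are unbounded.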

Concluding remarks
Linear clique-width The linear clique-width of a graph G is defined as the minimum number of labels needed to construct G by means of the operations allowed for standard clique-width, except for the disjoint union operation. Our minimality of unbounded clique-width arguments rest on constructing partitions that satisfy the conditions in Lemma 4.1. In fact, there exists a 'linear' analogue of this, see [3, Lemma 3], and it is likely that this may be used in conjunction with our arguments above to show that G α for any α ∈ Γ is also minimal of unbounded linear clique-width.
Towards a characterisation of clique width for bipartite graphs While the ultimate goal of characterising which hereditary graph classes have unbounded clique-width remains somewhat remote, a nearer goal is the restriction of this characterisation to cover classes of bipartite graphs.
The identification of the collection of words Γ represents a key step towards a fuller classification: even though we now have uncountably many minimal classes, the collection Γ is relatively easily stated, and gives us the precise delineation between minimal and non-minimal for the classes under consideration.
To extend our work to cover all bipartite graphs still faces a number of hurdles. First, there exist minimal classes of bipartite graphs that are not of the form G α for any α ∈ Γ (for example, the bichain graphs of Atminas, Brignall, Lozin and Stacho [1]), so the current four-letter alphabet {0, 1, 2, 3} is certainly not complete. Second, even with a more complete construction of classes, one must prove that such a list is complete, taking into account the pernicious issue of the class of square grids (which is bipartite and has unbounded clique-width yet contains no minimal class).