A general theory of Wilf-equivalence for Catalan structures

The existence of apparently coincidental equalities (also called Wilf-equivalences) between the enumeration sequences, or generating functions, of various hereditary classes of combinatorial structures has attracted significant interest. We investigate such coincidences among non-crossing matchings and a variety of other Catalan structures including Dyck paths, 231-avoiding permutations and plane forests. In particular we consider principal classes defined by not containing an occurrence of a single given structure. An easily computed equivalence relation among structures is described such that if two structures are equivalent then the associated principal classes have the same enumeration sequence. We give an asymptotic estimate of the number of equivalence classes of this relation among structures of a given size and show that it is exponentially smaller than the corresponding Catalan number. In other words these "coincidental" equalities are in fact very common among principal classes. Our results also allow us to prove, in a unified and bijective manner, several known Wilf-equivalences from the literature.


Introduction
The Catalan numbers are renowned for their ubiquity in problems of combinatorial enumeration. A few of the many contexts in which they arise are: plane forests (counted by number of nodes), non-crossing matchings or arch systems (counted by number of matched pairs or arches), Dyck paths, and 231-avoiding permutations. These contexts share the additional property (to be detailed in Section 2) that each admits a natural substructure relation, and that there are bijections between them which preserve that relation. So one can further consider those structures of each type which do not contain some designated substructure(s). As part of a previous work (see the extended abstract [4], or [5]) the present authors considered certain coincidences of enumeration (often called Wilf-equivalences) between such classes of Catalan structures avoiding a given substructure (in our case, permutations avoiding 231 and π). Using a non-standard bijection we were able to explain some of those coincidences. However, when we turned to the more general question
How many distinct enumeration sequences are there for classes of 231-avoiding permutations defined by a single additional restriction?
we were struck by the difference between the computed numbers and any known general equivalences. Specifically, it seemed that there were many more such coincidences (and so fewer enumeration sequences) than one might have expected. This phenomenon will be explained in the current paper. We will show in Section 5 that although there are $\mathrm{Cat}_n = \binom{2n}{n}/(n+1) \sim (1/\sqrt{\pi})\, n^{-3/2}\, 4^n$ distinct classes of permutations avoiding 231 and an additional 231-avoiding permutation of size n, these classes have asymptotically at most $c\, n^{-3/2} \gamma^n$ distinct enumeration sequences, where $c \approx 1.13$ and $\gamma \approx 2.4975$.
A particularly wide collection of such classes share generating functions derived from the continued fraction representation of $C(t) = \sum_{n \ge 0} \mathrm{Cat}_n t^n$, the generating function of the Catalan numbers. Since $C = 1/(1 - tC)$ it follows that:
\[ C = \cfrac{1}{1 - \cfrac{t}{1 - \cfrac{t}{1 - \cdots}}} \]
This fraction can be truncated after n levels, producing a sequence of generating functions:
\[ C_1 = 1, \qquad C_{n+1} = \frac{1}{1 - t\,C_n} \quad (n \ge 1). \]
The functions $C_n$ enumerate many specific subclasses of the Catalan classes above, for instance the 231-avoiding permutations that also avoid the descending permutation of size n, or the Dyck paths of height less than n. Other examples can be found in [4,14]. Previously these enumeration coincidences were understood on an analytic (or perhaps more properly arithmetic) level only. We can explain them, and many others, bijectively. Among other things we can show, combining Propositions 13, 14 and 19:
The number of 231-avoiding permutations π of size n for which the generating function of the class of permutations avoiding both 231 and π is $C_n(t)$ is the n-th Motzkin number.
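The truncations can be expanded as power series with a few lines of code. The sketch below assumes the indexing $C_1 = 1$, $C_{m+1} = 1/(1 - tC_m)$, which is the choice consistent with $C_n$ enumerating permutations avoiding 231 and the descending permutation of size n; the function names are illustrative only.

```python
def reciprocal_one_minus_t_times(series, n_terms):
    """Coefficients of G(t) = 1 / (1 - t*S(t)), truncated to n_terms terms."""
    out = [0] * n_terms
    out[0] = 1
    for k in range(1, n_terms):
        # G = 1 + t*S*G, so g_k = sum_{j<k} s_j * g_{k-1-j}
        out[k] = sum(series[j] * out[k - 1 - j] for j in range(k))
    return out


def truncated_catalan(n, n_terms=10):
    """Coefficients of C_n(t), assuming C_1 = 1 and C_{m+1} = 1/(1 - t*C_m)."""
    series = [1] + [0] * (n_terms - 1)  # C_1 = 1
    for _ in range(n - 1):
        series = reciprocal_one_minus_t_times(series, n_terms)
    return series


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, truncated_catalan(n, 8))
```

Under this indexing the coefficients of $C_n$ agree with the Catalan numbers up to the term $t^{n-1}$ and fall below them afterwards.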
The proof of this fact also describes (at least in principle) bijections between any two such classes. Furthermore, we show that for any other 231-avoiding permutation θ of size n, the generating function for permutations avoiding 231 and θ is dominated (term by term and eventually strictly) by $C_n(t)$.
The main tool in producing these results is a binary relation on Catalan structures defined purely intrinsically by four very simple rules in Section 4. This relation induces an equivalence relation ∼ on these Catalan structures whose equivalence classes are the connected components of the binary relation. Remarkably, if A ∼ B then the collection of structures not containing A has the same generating function as the collection of structures not containing B, so that one generating function may be associated with each equivalence class of ∼. For convenience in the description and proofs we will work mostly in the domain of arch systems, but of course all the results translate to the other domains directly using the natural bijections of Section 2. We have been able to verify that through size 15 the relation ∼ coincides with Wilf-equivalence, that is, distinct equivalence classes of ∼ have distinct generating functions, and we conjecture (Conjecture 1) that this holds for all sizes. In the final section we discuss this conjecture, and further open problems.
In the next section we consider the quartet of Catalan structures, namely arch systems, Dyck paths, plane forests, and 231-avoiding permutations, in more detail and introduce our basic terminology and notation. This is followed by some preparatory results before we introduce the relation ∼ and prove its main property, namely that it refines Wilf-equivalence, in Theorem 8. We can represent the collection of all ∼-equivalence classes, which we call cohorts, as a slight modification of the family of non-plane forests, and this also permits us to determine the number of cohorts in structures of size n, both through a functional equation or recurrence and asymptotically. We then consider further relationships between the cohorts, and the properties of the special main cohort mentioned above, which is maximal in terms of the associated generating functions and also conjecturally in terms of the cardinality of the cohort. Finally we consider some open problems that arise from this work.

Arch systems, Dyck paths, plane forests, and 231-avoiding permutations
Among the most well-known Catalan structures are certainly the Dyck paths. A Dyck path of semi-length n is a path in the positive quarter-plane, taking steps u = (1, 1) and d = (1, −1), starting at (0, 0) and ending at (2n, 0). Steps u and d of a Dyck path may be paired, by associating to each u step the first d step on its right at the same ordinate. These pairs (u, d) may also be seen as pairs of opening and closing parentheses, and under this correspondence Dyck paths correspond to parenthesis words in which parentheses are properly matched. A subpath of a Dyck path is defined by the deletion of some pairs of steps (u, d) (or equivalently of matched parentheses). The deletion here is intended as a contraction of the segment of each deleted step into a point, so that deleting k pairs of steps in a Dyck path of semi-length n provides a Dyck path of semi-length n − k.
Another natural way of representing proper parentheses words is as non-crossing matchings or arch systems. These form a second family of Catalan structures, and will be essential in the presentation of our results. An arch system of size n is a set of n arches connecting 2n points arranged along a baseline, such that all arches are above the baseline and no pair of arches cross. The left end of each arch encodes an opening parenthesis and its right end the corresponding closing parenthesis. A subsystem of an arch system can be obtained simply by deleting some of the original system's arches.
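As a small illustration of these definitions, the sketch below encodes an arch system as a balanced word over the letters u and d, recovers its arches with a stack, and forms a subsystem by deleting chosen arches. The encoding and the function names are conventions adopted here for illustration.

```python
def arches(word):
    """Return the (left, right) endpoint positions of the arches of a balanced u/d word."""
    stack, pairs = [], []
    for pos, step in enumerate(word):
        if step == "u":
            stack.append(pos)
        else:
            pairs.append((stack.pop(), pos))
    return sorted(pairs)  # sorted by left endpoint


def delete_arches(word, doomed):
    """Delete the arches whose index (in left-endpoint order) belongs to `doomed`."""
    removed = set()
    for index, (left, right) in enumerate(arches(word)):
        if index in doomed:
            removed.update((left, right))
    return "".join(step for pos, step in enumerate(word) if pos not in removed)


if __name__ == "__main__":
    print(arches("uuddud"))
    print(delete_arches("uuddud", {0}))
```

Deleting an arch removes one matched pair of steps, so the result is again a balanced word, one arch smaller.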
We can concatenate arch systems A and B in the obvious way: just draw the arch system B strictly to the right of A on the same baseline. The resulting arch system will be denoted AB.
Definition 2. An atom is a non-empty arch system that cannot be written as the concatenation of two non-empty arch systems, i.e. one that has a single outermost arch. Atoms will generally be denoted by lower case letters. The contents of an atom a are the unique arch system, A, such that a is obtained by adding a single arch outside all of A, and we write $a = \overparen{A}$.
Since every non-empty arch system is a unique concatenation of atoms, and an atom is a single arch around an arbitrary arch system, we see immediately that the generating function for arch systems, A(t), according to the number of arches satisfies
\[ A(t) = \frac{1}{1 - t\,A(t)}, \]
proving (and this should be no surprise) that arch systems are enumerated by the Catalan numbers.
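The factorisation into atoms, and the recurrence it implies, can be sketched as follows. Arch systems are again written as balanced u/d words (a convention assumed here), and the counting function uses the equivalent form A = 1 + tA² of the functional equation.

```python
def atoms(word):
    """Split a balanced u/d word into its atoms, each returned as a word."""
    out, depth, start = [], 0, 0
    for pos, step in enumerate(word):
        depth += 1 if step == "u" else -1
        if depth == 0:  # an outermost arch has just closed
            out.append(word[start:pos + 1])
            start = pos + 1
    return out


def contents(atom):
    """Contents of an atom: remove its single outermost arch."""
    return atom[1:-1]


def count_arch_systems(max_n):
    """Numbers of arch systems with 0..max_n arches, from A = 1 + t*A*A."""
    a = [1]
    for n in range(1, max_n + 1):
        # first atom has contents of size k, the remainder has size n-1-k
        a.append(sum(a[k] * a[n - 1 - k] for k in range(n)))
    return a


if __name__ == "__main__":
    print(atoms("uuddud"), count_arch_systems(6))
```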
There is a bijection between arch systems with n arches, and non-empty plane forests with n nodes obtained simply by mapping each arch to a node in such a way that if one arch lies within another, then its node is a descendant of the other, and if it lies to the left of another, then its node does so too. Equivalently, describing this recursively: take an arch system A, write it as a concatenation of atoms $A = a_1 a_2 \cdots a_m$ and associate to it a forest of m trees whose roots, $r_i$, correspond to the outermost arches of the $a_i$ (and are arranged from left to right for i from 1 through m) and such that the tree rooted at $r_i$ is (up to the addition of the root $r_i$) the forest of the contents of $a_i$. This bijection also preserves the "substructure" relationship provided that in the case of forests we maintain ancestry in substructures (e.g. if a child, x, of a node, y, is deleted, then all the children of x remaining become children of y, preserving their left to right order both among themselves and with respect to their new siblings).
Finally, we can consider 231-avoiding permutations of {1, 2, . . . , n}. These are those permutations π which, when written in one line notation, contain no subsequence bca with a < b < c. Here the substructure relationship (known as the pattern relationship among permutations) involves deleting some symbols and then relabelling the remaining ones to form a permutation of {1, 2, . . . , m} for some m < n while maintaining their relative order (e.g. if we delete 2 from 31254 we obtain 2143). It is perhaps not immediately clear that these are also in bijection with Dyck paths, arch systems or plane forests. However, these permutations are precisely those that can be sorted by a single pass through a stack [12] and we can form a Dyck path by adding a step u whenever pushing an element on to the stack, and a step d whenever popping one from the stack. Since the sequence of push and pop operations to sort a permutation is easily seen to be unique, and every sequence of operations sorts some permutation, this is clearly a bijection. Moreover, it respects the substructure relationship since, when deleting an element, we just delete the pair of matched steps, or equivalently the arch in the corresponding arch system, which corresponds to the push and pop operations that affect that element. This bijection can also be realised intrinsically. The n arches are labelled with the integers from 1 through n according to the following rules: if two arches are nested, then the outer arch has a greater label than the inner one, and if two arches are not nested the arch to the left has a lesser label than the arch to the right. The permutation is then read by reading the labels of the arches in order of their left endpoints. This means that the left to right maxima of the permutation (i.e. the elements that have no greater element to their left) correspond to outermost arches, and within them an arch system is constructed using the same principle recursively on the following lesser elements. An example of these correspondences is given in Figure 1.
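Both directions of this correspondence are short to write down. The sketch below records the pushes and pops of the usual greedy stack-sorting pass as a u/d word, and recovers the permutation from a word by labelling each arch with the rank of its closing step (elements leave the stack in increasing order, which yields exactly the labelling rules above). Function names are illustrative.

```python
from itertools import permutations


def stack_sort_word(perm):
    """Word of pushes (u) and pops (d) when stack sorting perm; None if perm contains 231."""
    stack, word, output = [], [], []
    for x in perm:
        while stack and stack[-1] < x:  # pop everything smaller than x
            output.append(stack.pop())
            word.append("d")
        stack.append(x)
        word.append("u")
    while stack:
        output.append(stack.pop())
        word.append("d")
    return "".join(word) if output == sorted(output) else None


def word_to_perm(word):
    """Label each arch by the rank of its closing step, then read labels by opening steps."""
    stack, label_of_open, closed = [], {}, 0
    for pos, step in enumerate(word):
        if step == "u":
            stack.append(pos)
        else:
            closed += 1
            label_of_open[stack.pop()] = closed
    return tuple(label_of_open[pos] for pos in sorted(label_of_open))


def sortable_words(n):
    """Words of all stack-sortable (231-avoiding) permutations of size n."""
    words = (stack_sort_word(p) for p in permutations(range(1, n + 1)))
    return [w for w in words if w is not None]


if __name__ == "__main__":
    print(stack_sort_word((1, 3, 2)), word_to_perm("uduudd"))
```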
Remark 3. Of course, there are also classical bijections between Dyck paths, plane forests or 231-avoiding permutations and plane binary trees. However, it is deliberate that we do not consider binary trees among the Catalan families of this work, since the substructure relation on Dyck paths, plane forests or 231-avoiding permutations does not translate naturally to the context of binary trees. This fact somehow explains why the link between 231-avoiding permutations and binary trees with respect to pattern avoidance is not as natural as one might hope for; see [8, Section 6].
In these four equivalent contexts we are interested in considering the problem: Given a single structure A, what is the generating function of the collection of structures that do not have A as a substructure?
Going back to some examples discussed in the introduction, note that Dyck paths of height less than n correspond to Dyck paths that do not have $u^n d^n$ as a subpath. Under the correspondences we have described, these correspond to arch systems that do not have $N_n$, the nested arch system with n arches, as a subsystem, to plane forests in which every root-to-leaf path has fewer than n nodes, and to 231-avoiding permutations with no $n(n-1) \cdots 21$ pattern.
Structures that do not have A as a substructure are said to avoid A and we will denote the set (or class) of them by Av(A). If a structure does not avoid A it is said to involve or contain A. In this paper we will only be considering the avoidance of a single structure -but of course in general we could consider any collection of structures closed downwards under the substructure relation. Sometimes par abus de langage we may say that A and B are Wilf-equivalent when we mean that Av(A) and Av(B) are. If A and B are of different sizes, then they cannot possibly be Wilf-equivalent, so effectively Wilf-equivalence is an equivalence relation on structures of size n for each n. As such, the n th Catalan number is an upper bound for the number of its equivalence classes there, but we shall see that this is far from the truth.
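For small sizes the enumeration sequence of Av(A) can be computed by brute force, which makes Wilf-equivalences easy to observe experimentally. The sketch below is an illustration only: arch systems are written as balanced u/d words, containment is tested by trying every subset of arches, and all names are hypothetical.

```python
from itertools import combinations


def dyck_words(n):
    """All balanced u/d words with n arches."""
    if n == 0:
        return [""]
    return ["u" + inner + "d" + rest
            for k in range(n)
            for inner in dyck_words(k)
            for rest in dyck_words(n - 1 - k)]


def arches(word):
    """Endpoint pairs of the arches of a balanced u/d word."""
    stack, pairs = [], []
    for pos, step in enumerate(word):
        if step == "u":
            stack.append(pos)
        else:
            pairs.append((stack.pop(), pos))
    return pairs


def contains(word, pattern):
    """True if some subset of the arches of `word` forms a copy of `pattern`."""
    k = len(pattern) // 2
    for chosen in combinations(arches(word), k):
        keep = {pos for pair in chosen for pos in pair}
        if "".join(s for pos, s in enumerate(word) if pos in keep) == pattern:
            return True
    return False


def avoider_counts(pattern, max_n):
    """Number of arch systems with 0..max_n arches that avoid `pattern`."""
    return [sum(not contains(w, pattern) for w in dyck_words(n))
            for n in range(max_n + 1)]


if __name__ == "__main__":
    for pattern in ("uuuddd", "uduudd", "ududud"):
        print(pattern, avoider_counts(pattern, 5))
```

In this small experiment the nested system with three arches and the system uduudd (the permutation 132) give the same counts, while three side-by-side arches (the permutation 123) give a different sequence.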

Arch systems containing and avoiding subsystems
If an arch system X contains some arch system P then there is a leftmost occurrence of P in X (which we often denote $P_L$) by which we mean the occurrence of P whose rightmost point (i.e. the point of X that corresponds to the final point of P in this occurrence) is as far left as possible. If there are two such occurrences with the same rightmost point, we designate as $P_L$ the one whose second rightmost point is as far left as possible, etc. There is also a corresponding notion of rightmost occurrence.
One advantage of working with arch systems is that it is clear that, when searching for a substructure of X equal to some given arch system, we may proceed in a greedy fashion. That is:
Observation 5. Suppose that P, Q and X are arch systems and that PQ is a substructure of X. Then, in witnessing this we may use the leftmost occurrence, $P_L$, of P in X.
We will use this observation (and some obvious generalisations) repeatedly without further comment. Note however that we do not suggest that X must factor into a part containing P and a part containing Q. For example the atom $\overparen{PQ}$ (a single arch drawn over PQ) has PQ as a substructure, but no such factorisation.
For any arch system A, let $F_A$ denote the generating function of Av(A). It is a result of [13] (expressed in somewhat different terms of course) that $F_A$ is necessarily a rational function. In fact, given a factorisation of A into atoms we can write down a system of equations that allow for the recursive computation of $F_A$ (again, this is already done in [13] and, in somewhat more general terms, in [2]). The following proposition simply translates that result into the current context.
Proposition 6. Let A be an arch system, with $A = a_1 a_2 \cdots a_m$ its factorisation into atoms, and let $A_1$ be the contents of $a_1$. Then the generating function $F_A$ of Av(A) can be computed recursively from $F_{A_1}$ and the generating functions of the classes avoiding shorter factors $a_i \cdots a_j$ of A. In particular, $F_A$ is rational.
Fundamentally the first part of the proposition is proved simply by partitioning A-avoiding arch systems according to "how much of A" can be found within the first arch, and the conclusion of the second part follows by an easy inductive argument.
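One way to make this partition explicit is sketched below. It is a derivation from the greedy argument of Observation 5, not a transcription of the proposition's displayed formula, and the auxiliary symbols $G_k$ are introduced here only for illustration: $G_0 = 0$, $G_1 = F_{A_1}$ (with $A_1$ the contents of $a_1$), and $G_k = F_{a_1 \cdots a_k}$ for $2 \le k \le m$, so that $G_m = F_A$ when $m \ge 2$.

```latex
% An A-avoiding system X is empty, or X = (an atom with contents Y) followed by Z.
% Let k be the largest index such that the first atom of X contains a_1 ... a_k.
% Then Y is counted by G_{k+1} - G_k, and Z must avoid a_{k+1} ... a_m.
\[
  F_A \;=\; 1 + t \sum_{k=0}^{m-1} \bigl(G_{k+1} - G_k\bigr)\, F_{a_{k+1} \cdots a_m}.
\]
% For m = 1 this reads F_A = 1 + t F_{A_1} F_A, i.e. F_A = 1 / (1 - t F_{A_1}).
% For m = 2 it rearranges to
\[
  F_{a_1 a_2}\,\bigl(1 - t F_{A_1} - t F_{a_2}\bigr) \;=\; 1 - t\, F_{A_1} F_{a_2}.
\]
```

Since every generating function on the right other than $F_A$ itself belongs to a strictly smaller pattern, rationality follows by induction on the size of A.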

A refinement of Wilf-equivalence
In this section, we introduce an equivalence relation, ∼, on the collection of arch systems. We will then establish that this relation refines Wilf-equivalence, i.e. that A ∼ B implies Av(A) ≃ Av(B). So, without further ado:
Definition 7. The binary relation, ∼, on arch systems is the finest equivalence relation that satisfies:
(1) $A \sim B \implies \overparen{A} \sim \overparen{B}$,
(2) $a \sim b \implies PaQ \sim PbQ$,
(3) $PabQ \sim PbaQ$,
(4) $a\,\overparen{bc} \sim \overparen{ab}\,c$,
where A, B, P and Q denote arbitrary arch systems; a, b and c denote arbitrary atoms or empty arch systems; and $\overparen{X}$ denotes the atom with contents X. The equivalence classes of ∼ will be called cohorts.
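Note that if A ∼ B then A and B have the same number of arches. Note also that $A \sim B \Leftrightarrow \overparen{A} \sim \overparen{B}$, where $\overparen{X}$ denotes the atom with contents X, since (non-trivial) equivalences between atoms may only be produced by rule (1).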
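For small sizes the cohorts can be computed directly by closing under the four rules with a union-find structure. The sketch below uses the rules in the form in which they appear in the case-by-case proofs that follow (rule (1): equivalent contents give equivalent atoms; rule (2): an atom may be replaced by an equivalent atom inside any concatenation; rule (3): adjacent atoms may be swapped; rule (4): a followed by the atom over bc is equivalent to the atom over ab followed by c). The encoding, an arch system as a tuple of atoms with each atom represented by its contents, and all function names are assumptions made for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def systems(n):
    """All arch systems with n arches, as tuples of atoms (each atom = its contents)."""
    if n == 0:
        return ((),)
    return tuple((inner,) + rest
                 for k in range(n)
                 for inner in systems(k)
                 for rest in systems(n - 1 - k))


def size(x):
    return sum(1 + size(a) for a in x)


@lru_cache(maxsize=None)
def cohort_of(n):
    """Dict sending each arch system with n arches to a representative of its cohort."""
    parent = {x: x for x in systems(n)}

    def find(x):
        while parent[x] != x:
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[ry] = rx

    for x in systems(n):
        if len(x) == 1:  # rule (1): use the cohort of the contents
            union(x, (cohort_of(n - 1)[x[0]],))
        if len(x) >= 2:
            for i, a in enumerate(x):  # rule (2): swap in equivalent atoms
                k = 1 + size(a)
                target = cohort_of(k)[(a,)]
                for (b,) in (y for y in systems(k) if len(y) == 1):
                    if cohort_of(k)[(b,)] == target:
                        union(x, x[:i] + (b,) + x[i + 1:])
            for i in range(len(x) - 1):  # rule (3): swap adjacent atoms
                union(x, x[:i] + (x[i + 1], x[i]) + x[i + 2:])
        if 1 <= len(x) <= 2:  # rule (4): x = a followed by the atom over bc
            a, inner = x[:-1], x[-1]
            for j in range(len(inner) + 1):
                b, c = inner[:j], inner[j:]
                if len(b) <= 1 and len(c) <= 1:
                    union(x, (a + b,) + c)
    return {x: find(x) for x in systems(n)}


def number_of_cohorts(n):
    return len(set(cohort_of(n).values()))


def main_cohort_size(n):
    """Size of the cohort containing the nested arch system with n arches."""
    nested = ()
    for _ in range(n):
        nested = (nested,)
    classes = cohort_of(n)
    return sum(1 for rep in classes.values() if rep == classes[nested])


if __name__ == "__main__":
    print([number_of_cohorts(n) for n in range(1, 7)])
```

Processing sizes in increasing order suffices because rules (1) and (2) only refer to equivalences between strictly smaller structures, which are already final when size n is treated.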
The main result, which we prove in the following subsections, is:
Theorem 8. If A ∼ B, then Av(A) ≃ Av(B).
Interestingly, another equivalence relation (say, ≡) on Catalan structures has been defined in a similar fashion by Rudolph [15]. She proves that two ≡-equivalent 132-avoiding permutations π and τ are equipopular, that is: for any n, the total number of occurrences of π and of τ in 132-avoiding permutations of size n are equal. In other words, ≡ refines equipopularity, and the analogy with ∼ refining Wilf-equivalence is clear. What is further interesting in the case of ≡ is that it coincides with equipopularity, as shown in [7]. As a consequence, the number of equivalence classes for equipopularity among permutations of size n is given by the number of partitions of n.
We separate the proof of Theorem 8 into bijective and analytic proofs, including some bijective proofs for cases where analytic ones are available. One reason for this is that the bijective proofs can frequently be refined to allow for term by term comparisons between the generating functions for inequivalent cohorts, while this is not so easily accomplished when only analytic proofs are available. A second reason is that these bijective proofs are needed for proving our claim of the introduction: that we are able (at least in principle) to provide bijections between any two classes of permutations Av(231, π) and Av(231, τ) for π and τ of size n whose generating function is $C_n$.
To prove Theorem 8 it is sufficient to show that its conclusion holds for each of the four cases arising in Definition 7. The proof is therefore subdivided into such cases. For compactness of notation we have found it convenient to denote functional application in exponential form, i.e. the image of an arch system X under a map τ will be denoted $X^\tau$.
Figure 2: The situation arising in the proof of case (2). In an arch system X involving PQ but avoiding PaQ the leftmost copy of P, denoted $P_L$, and the rightmost copy of Q, denoted $Q_R$, are designated. Arches with one endpoint inside and one endpoint outside the interval between $P_L$ and $Q_R$ create a sequence of subintervals ($I_1$ through $I_4$ here) that must avoid a. To produce a PbQ-avoiding arch system, a bijection mapping a-avoiding systems to b-avoiding systems is applied to the $I_i$ and the remainder of the system is left unchanged.

Bijective proofs
Proof of case (1). Let A and B be given with A ∼ B, and suppose that Av(A) ≃ Av(B). We may further assume that A and B are not empty, or the result trivially holds. Take σ to be any size-preserving bijection between Av(A) and Av(B). Writing $\overparen{X}$ for the atom with contents X, define a map τ on atoms $x = \overparen{X}$ belonging to Av($\overparen{A}$) by $x^\tau = \overparen{X^\sigma}$. This is possible since $x \in$ Av($\overparen{A}$) if and only if $X \in$ Av(A). Now extend τ to concatenations of atoms in the obvious way, $(x_1 x_2 \cdots x_m)^\tau = x_1^\tau x_2^\tau \cdots x_m^\tau$. Since Av($\overparen{A}$) consists exactly of arch systems which are concatenations of atoms whose contents belong to Av(A) (and correspondingly Av($\overparen{B}$) consists exactly of arch systems which are concatenations of atoms whose contents belong to Av(B)), τ : Av($\overparen{A}$) → Av($\overparen{B}$) is a size-preserving bijection.
Proof of case (2). Let arbitrary arch systems P and Q and atoms a and b be given with a ∼ b. Assume that a and b are not empty (or the result trivially holds), and let σ : Av(a) → Av(b) be a size-preserving bijection. We will define a size-preserving bijection τ : Av(PaQ) → Av(PbQ).
Suppose that X ∈ Av(PaQ). If X ∈ Av(PQ) we define $X^\tau = X$. Otherwise take the leftmost copy, $P_L$, of P in X and the rightmost copy, $Q_R$, of Q. The arches that begin before the end of $P_L$ but end after it, and those that end after the beginning of $Q_R$ but begin before it, divide the segment between the end of $P_L$ and the beginning of $Q_R$ into intervals. This is illustrated in Figure 2. Since a is an atom, any occurrence of a between the end of $P_L$ and the beginning of $Q_R$ would have to be entirely contained in one of the intervals. So each of these intervals contains an arch system that avoids a and conversely, if we are given an arch system with this property, it avoids PaQ. So define $X^\tau$ by applying σ to each of the intervals while retaining the structure of X up to the end of $P_L$ and from the beginning of $Q_R$ (including the arches that define the intervals). It is immediate to check that this defines a bijection from Av(PaQ) to Av(PbQ).
Proof of case (3). The claim is trivial when a or b is empty. For the non-trivial case let a and b be arbitrary non-empty atoms and P and Q arbitrary arch systems. We wish to construct a bijection τ : Av(PabQ) → Av(PbaQ). It will be helpful in what follows for the reader to refer to Figure 3. As in the previous case consider an arch system X. If X avoids PaQ then define $X^\tau = X$. Otherwise take $P_L$ to be the leftmost P, $a_L$ the leftmost atom involving a following $P_L$, and $Q_R$ the rightmost Q in X. Furthermore, denote by C the contents of $a_L$. As in the previous proof the interval between $P_L$ and $Q_R$ is subdivided by those arches that have only one endpoint in this interval; say there are i (resp. j) such arches with only their right (resp. left) endpoint between $P_L$ and $Q_R$. But now also one of those intervals (the one containing $a_L$) is further subdivided before and after $a_L$ by $a_L$ itself and any arches nested over $a_L$. Denote by k the number of such arches (including the outermost arch of $a_L$). All the designated subintervals to the left of $a_L$ must avoid a (since $a_L$ was leftmost) while those to the right of it must avoid b (since X avoids PabQ). To define $X^\tau$ simply reverse the order of these subintervals (keeping the arch systems within them fixed, i.e. the contents of a subinterval are not changed, only its position between $P_L$ and $Q_R$). The structure of the arch system outside these intervals is unchanged, that is: the arch system before $P_L$ and after $Q_R$ is not modified, and there are still k arches on top of C, and i (resp. j) arches with only their right (resp. left) endpoint between $P_L$ and $Q_R$. In the resulting arch system $X^\tau$, $P_L$ and $Q_R$ are still the leftmost copy of P and the rightmost copy of Q respectively (since nothing before the end of $P_L$ or after the start of $Q_R$ has been changed). Between these, the atom $a_L$ has become the rightmost atom involving a. Since all of the intervals before it but following $P_L$ avoid b, $X^\tau$ avoids PbaQ. Moreover, it is clear that we can reverse this construction, so τ : Av(PabQ) → Av(PbaQ) is a size-preserving bijection as claimed.
Note that in the proof of case (3), we have chosen to reverse $A_1 \ldots A_{i+k}\, C\, B_1 \ldots B_{j+k}$ to $B_{j+k} \ldots B_1\, C\, A_{i+k} \ldots A_1$ in $X^\tau$. But many variants of τ could have been defined by choosing any other permutation of the $A_\ell$, $B_m$ and C in which all the $B_m$ are to the left of C and all the $A_\ell$ to its right.
Turning now to case (4), we will give an analytic proof below, but here give a bijective proof of a special case of it (which we will make use of later). Namely, writing $\overparen{X}$ for the atom with contents X, we prove that Av($a\,\overparen{b}$) ≃ Av($\overparen{ba}$), which with cases (1) and (3) is equivalent to case (4) with (at least) one of a, b and c empty.
Figure 3: The situation arising in the proof of case (3). In the top diagram the original PabQ-avoiding arch system is shown. Each interval $A_i$ must avoid a and each interval $B_j$ must avoid b. In the bottom diagram its image is shown: the atom $a_L$ and the nest of arches around it are moved to the right to allow copies of the $B_j$ to be placed on the left, and copies of the $A_i$ on the right, as seen in the middle two diagrams.
Bijective proof of specialisation of case (4): Av($a\,\overparen{b}$) ≃ Av($\overparen{ba}$). We may assume that a is not empty (otherwise there is nothing to prove). We will also assume that b is not empty, but will indicate along the way how the proof can be modified in case b is empty. The proof goes along familiar lines, so we will be somewhat brief. Let X ∈ Av($a\,\overparen{b}$) be given. We wish to define its image $X^\tau$, and will assume that $Y^\tau$ has already been defined for all Y of smaller size. If X ∈ Av(b) let $X^\tau = X$. Otherwise consider the rightmost occurrence, $b_R$, of b. Since b is an atom, this occurrence ends with the final arch of something of the form $\overparen{C}$ where the contents of b occur in C, but b does not. Consider the intervals defined by the nest of arches (if any) over $\overparen{C}$. Immediately to the left of $\overparen{C}$, we have an interval M and the only condition is that it must avoid $a\,\overparen{b}$. Once we move past the first enclosing arch to the left, the remaining intervals (of which there are, say, p, called $A_1$ through $A_p$) must avoid a. To the right of $\overparen{C}$ all the intervals (of which there are p + 1, $B_0$ through $B_p$) must avoid b. So X is determined by C, M, the $A_i$ and the $B_i$, and $X^\tau$ is defined by rearranging these pieces, with M replaced by $M^\tau$. In the case where b is empty, we should instead decompose X according to its last arch as $X = A_1\,\overparen{M}$, where $A_1$ avoids a and M avoids a followed by a single arch, and set $X^\tau = \overparen{A_1}\, M^\tau$.
That $X^\tau$ avoids $\overparen{ba}$ follows by induction inside $M^\tau$ and because the $A_i$ (resp. $B_i$) all avoid a (resp. b).
Finally, the decomposition of arch systems avoiding $\overparen{ba}$ according to their leftmost occurrence of b (resp. their first arch if b is empty) allows us to describe them canonically in terms of pieces $B_i$, C, $A_i$ and $M'$, where each $B_i$ avoids b, C avoids b but involves the contents of b, each $A_i$ avoids a, and $M'$ avoids $\overparen{ba}$. So the above construction can be reversed, and τ : Av($a\,\overparen{b}$) → Av($\overparen{ba}$) is a size-preserving bijection as claimed.
Note that, as in the proof of case (3), we can again define many variants of the bijection τ : Av($a\,\overparen{b}$) → Av($\overparen{ba}$), by replacing in $X^\tau$ the sequence $B_0 B_p B_{p-1} \ldots B_1$ (resp. $A_1 \ldots A_{p-1} A_p$) by any permutation of the $B_i$ (resp. $A_i$).

Analytic proofs
To complete the proof of Theorem 8 we need to consider the full version of case (4), i.e. we must show that Av($a\,\overparen{bc}$) ≃ Av($\overparen{ab}\,c$) when none of a, b and c is empty.
Proof of case (4). Let $a = \overparen{A}$, $b = \overparen{B}$ and $c = \overparen{C}$. For an arch system X let $F_X$ be the generating function of Av(X). Using the general technique described in Proposition 6 we can compute the generating function $F_{a\,\overparen{bc}}$ in terms of $F_A$, $F_B$ and $F_C$.
Solving the system of equations provided by Proposition 6 for $F_{a\,\overparen{bc}}$ in terms of $F_A$, $F_B$ and $F_C$ gives a terrible mess which is nevertheless symmetric in $F_A$, $F_B$ and $F_C$. In fact the solution is tidier if written in terms of $F_a$, $F_b$ and $F_c$ (recall that $F_a = 1/(1 - tF_A)$, i.e. $F_A = (F_a - 1)/(tF_a)$, etc.), and in that form $F_{a\,\overparen{bc}}$ is visibly symmetric in $F_a$, $F_b$ and $F_c$. This proves that Av($a\,\overparen{bc}$) ≃ Av($c\,\overparen{ab}$). Now use case (3) to reach the desired conclusion.
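The computation can be sketched as follows. This is a derivation made here for illustration, not a transcription of the authors' displayed formula. It assumes the two-atom recursion obtained by partitioning on the first atom: for atoms x (with contents X) and y, an xy-avoiding system is empty, or begins with an atom whose contents avoid X (the remainder then avoids xy), or begins with an atom whose contents contain X but avoid xy (the remainder then avoids y).

```latex
\[
  F_{xy} = 1 + t F_X F_{xy} + t\,(F_{xy} - F_X)\,F_y
  \quad\Longrightarrow\quad
  F_{xy} = \frac{F_x + F_y - F_x F_y}{1 - t\,F_x F_y},
\]
% using t F_X = 1 - 1/F_x. Hence, writing d for the atom with contents bc,
\[
  F_d = \frac{1}{1 - t F_{bc}} = \frac{1 - t\,F_b F_c}{1 - t\,(F_b + F_c)},
\]
% and substituting x = a, y = d in the first formula:
\[
  F_{a\,\overparen{bc}}
  = \frac{1 - t\,(F_a F_b + F_a F_c + F_b F_c) + t\,F_a F_b F_c}
         {1 - t\,(F_a + F_b + F_c) + t^2\,F_a F_b F_c}.
\]
```

Both numerator and denominator are built from the elementary symmetric polynomials in $F_a$, $F_b$, $F_c$, which makes the symmetry evident. As a check, taking a, b and c to be single arches ($F_a = F_b = F_c = 1$) gives $(1 - 2t)/(1 - 3t + t^2) = 1 + t + 2t^2 + 5t^3 + 13t^4 + \cdots$.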
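We have seen in the above proof that, for any atom a with contents A, $F_a$ completely determines $F_A$ and conversely, via the relations $F_a = 1/(1 - tF_A)$ and $F_A = (F_a - 1)/(tF_a)$. This simple fact also provides an analytic proof that Av(A) ≃ Av(B) if and only if the classes avoiding the atoms with contents A and B are Wilf-equivalent, which in particular covers case (1).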

The combinatorial class of cohorts
From Theorem 8 it follows that the number of different generating functions of classes of arch systems avoiding an arch system with n arches (or equivalently, the number of Wilf-equivalence classes of permutation classes Av(231, π) for π of size n avoiding 231) is at most the number of cohorts (i.e. equivalence classes of ∼) for n element structures. In Conjecture 1 we suggest that these numbers may actually be equal, explaining our interest in the enumeration of cohorts. In any case, the number of cohorts certainly provides an upper bound for the number of such Wilf-equivalence classes. Towards the goal of enumerating cohorts, we first associate with each cohort a single structure, and then enumerate such structures. These structures that represent cohorts may be seen as choosing one representative in the set of all structures (e.g. all arch systems) that form a cohort. Alternatively, and it is rather this point of view we choose, we can think of the structure representing a cohort as an abstract structure from which all structures in the cohort may be recovered.

The structure of a cohort
It is easiest to describe the single (abstract) structure associated with a cohort in the context of plane forests. Note first that these structures representing cohorts should be non-plane objects. Indeed: Proposition 10. If two plane forests A and B are isomorphic as non-plane forests, then A ∼ B.
Proof. This follows directly by induction from rules (1), (2) and (3). Specifically, suppose that plane forests A and B which are isomorphic as non-plane forests are given and that the result holds for all plane forests of lesser size. If A and B are trees (corresponding to atoms in the context of arch systems), then the result applies to the forests obtained by deleting their roots (i.e. the contents of these atoms), and hence by rule (1) to A and B. Otherwise, each of A and B is the concatenation of the same number of trees (i.e. atoms), say m. First, using rule (3) we can find $A' \sim A$ so that $A' = a_1 a_2 \ldots a_m$, $B = b_1 b_2 \ldots b_m$, and each tree $a_i$ is isomorphic to $b_i$ as a non-plane tree. Then $a_i \sim b_i$ by the induction hypothesis, and using rule (2) we are done.
Figure 4: ∼-equivalences on trees that are derived from rule (4).
We note that this proposition already establishes that there are no more cohorts for n element structures than there are rooted non-plane forests with n nodes, or equivalently rooted non-plane trees with n + 1 nodes. As the asymptotic enumeration of these (see for example [9, Proposition VII.5 and note VII.21]) has exponential growth rate approximately 2.956, we already see exponentially fewer Wilf-equivalence classes than there are structures of size n. However, the final rule provides a further reduction.
Let us focus our attention on ∼-equivalences between atoms (or trees) only that may be derived from rule (4). Write $\overparen{X}$ for the atom with contents X. In this context, an equivalent form of this rule is $\overparen{a\,\overparen{bc}} \sim \overparen{\overparen{ab}\,c}$. So in terms of trees, rule (4) allows us to rotate subtrees at binary branches. Furthermore, it also allows unary nodes to be lifted through binary ones (from the case when c is empty) via $\overparen{a\,\overparen{b}} \sim \overparen{\overparen{ab}}$. Finally, in the case where b and c are empty, rule (4) says that the atom whose contents are a followed by a single arch is equivalent to $\overparen{\overparen{a}}$, allowing us to transform a leaf hanging below a binary node x into a unary node between x and its other child. These operations on trees are shown in Figure 4.
So, consider any subtree of a plane forest that has a binary root. In this tree replace any subtree whose root has three or more children by a symbol representing that atom (and temporarily call such atoms, large). As a result we obtain a tree, T , all of whose internal nodes have one or two children and where the leaves are either large atoms, or bare nodes. As shown in Figure 4(ii) and (iii), we can lift the unary nodes and bare nodes through the binary ones to obtain a ∼-equivalent tree T ′ with a chain of unary nodes running from the root, connected to a full binary tree all of whose leaves are labelled with large atoms. Finally, we can rotate the large atoms (see Figure 4(i)), permute them (from P abQ ∼ P baQ), and replace them by equivalent large atoms (from a ∼ b ⇒ P aQ ∼ P bQ). So we see that two such full binary trees (with leaves that are large atoms) are ∼-equivalent if and only if they have the same number of nodes (and hence leaves) and there is a bijection between their sets of leaves such that items in correspondence in these sets are ∼-equivalent large atoms. More properly, note that these "sets" of leaves are actually multisets, since repetitions are allowed.
For ease of explanation, in the rest of this section we will focus on atomic cohorts, i.e. cohorts that contain at least one atom (or tree). Note that this is not an actual restriction: atomic cohorts for (n + 1) element structures are in bijective correspondence with cohorts for n element structures, since $A \sim B \Leftrightarrow \overparen{A} \sim \overparen{B}$, where $\overparen{X}$ denotes the atom with contents X.
The above discussion leads to a recursive description of (representatives for) atomic cohorts. Consider the recursive specification of a variety A of non-plane tree-like structures: where • refers to a class with a single object of size 1, parentheses denote ordered pairs, △_m denotes a class with a single object of size m, ⊎ denotes disjoint union, and MSet denotes the multiset construction, with the subscript denoting the number of elements in the multiset. Equivalently, as non-plane trees: There is a size-preserving bijection between atomic cohorts and A.
Proof. This is essentially a direct translation of the preceding discussion, where we have unravelled all possible equivalences following from rules (1) to (4). The class B represents "large atoms". Then the elements of A are described in order as: a single node, a root with one child, an atom corresponding to a full binary tree with k leaves labelled by large atoms, or a large atom.
We shall use this description to refine the asymptotic enumeration of the number of cohorts. Furthermore, for each cohort of size up to 15, we can produce a representative arch system X for that cohort, and check that the generating functions of the classes Av(X) are all distinct. With Theorem 8, this ensures that the above also shows the first few terms of the sequence enumerating Wilf-equivalence classes of classes Av(A) for A of size n. Notice that more terms of the enumeration sequence of cohorts may be obtained from Equation (6) below; namely, the next few terms are 38 027, 86 993, 200 018, 461 847, 1 070 675. From Theorem 8, these are upper bounds on the number of Wilf-equivalence classes of Av(A), but we cannot ensure that they are equal (although we suspect they are). In the following, we therefore study the asymptotic behaviour of the number of cohorts of arch systems of size n.

The number of cohorts
As already noted, the number of cohorts of arch systems of size $n$ equals the number of atomic cohorts of arch systems of size $n + 1$. Here we can make profitable use of (5) to provide a functional equation for the generating function $A(t) = \sum_n a_n t^n$ counting atomic cohorts, which is susceptible to asymptotic analysis using the techniques of Section VII.5 of [9], or with minor variations of [10]. Specifically we obtain: where $M$, $M_{\ge 2}$ and $M_{\ge 3}$ are operators representing the generating functions that enumerate multisets of objects counted by the generating function $Z$, and respectively such multisets of size at least 2 or 3.
Clearly the power series $A$ dominates $t + tM_{\ge 3}(A)$ term by term, and so $a_n$ is at least the number of non-plane trees with $n$ nodes in which each internal node has at least 3 children. This trivial estimate suffices to show that the radius of convergence, $\rho_A$, of $A$ is less than 1 (and hence so is that of $B$). Now observe that in general If the radius of convergence of $Z$ is $r < 1$, then the radius of convergence of $W$ is easily seen to be at least $\sqrt{r} > r$. This suggests that when analysing the radius of convergence of generating functions defined by functional equations involving the $M$ operator, we treat these as implicit definitions of the desired function in terms of "known" analytic functions which, while related to the function we are analysing, are analytic in a disc around the origin strictly containing the disc of convergence of the function we seek. Effectively these are the first five steps of [10]. So to proceed we view (6) as an implicit definition of $A$ in terms of these "known" functions, after having eliminated $B$ entirely and noting also that the terms corresponding to $Z(t^2)$ in any occurrences of $M_{\ge 3}$ should also be treated as "known". Thus we aim to find the radius of convergence of the solution to $F(t, y) = 0$ where: In this expression we replace the subscripted $M$ operators by their definitions above, and then on the remaining occurrences of $M$ use the form given by (7) to replace the definition of $F$ by one involving $y$, $t$ and some functions of $t$ known to be analytic on the domain of interest. Continuing with the steps of [10], as we know already that the solution $y$ is a generating function, we can find its radius of convergence $\rho_A$ by determining the smallest positive root of the equation $F_y(t, y) = 0$ (where $F_y$ is the derivative of $F$ with respect to $y$).
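For reference, the multiset operators used in this argument can be written in the standard Pólya form below. This is a sketch under an assumption: the factorisation into $\exp(Z(t))$ and a factor $W(t)$ is our reading of the surrounding text, and the precise definition of $W$ used here is assumed, not quoted. It requires $Z(0) = 0$.

```latex
\[
  M(Z)(t) \;=\; \exp\Bigl(\sum_{k \ge 1} \frac{Z(t^k)}{k}\Bigr)
          \;=\; \exp\bigl(Z(t)\bigr)\, W(t),
  \qquad
  W(t) \;=\; \exp\Bigl(\sum_{k \ge 2} \frac{Z(t^k)}{k}\Bigr),
\]
\[
  M_{\ge 2}(Z)(t) \;=\; M(Z)(t) - 1 - Z(t),
  \qquad
  M_{\ge 3}(Z)(t) \;=\; M(Z)(t) - 1 - Z(t) - \tfrac{1}{2}\bigl(Z(t)^2 + Z(t^2)\bigr).
\]
```

With this form, $W$ only involves $Z(t^k)$ for $k \ge 2$, each of which converges for $|t| < \sqrt{r}$ when $Z$ has radius of convergence $r < 1$, which is consistent with the bound $\sqrt{r}$ quoted above.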
Of course in finding this root we first take the derivative formally and then replace $y$ and all the related "known" functions by polynomial approximations of some degree, denoted $n$, obtained by using equation (6)  These values agree well with the numerical estimates obtained by simply looking at computed coefficients of $A$ and fitting an asymptotic expression of the form $a_n \sim \alpha n^{-3/2} \gamma^n$. Note however that the apparent accuracy is significantly less than that given in examples VII.21 and VII.22 of [9]. We suspect that this arises from the iterated application of $M$ and the correction terms that are part of the definitions of $M_{\ge 2}$ and $M_{\ge 3}$. Another possible reason is that we also truncate the "known" parts at degree $n$. Approximate values of $\alpha$ and $\gamma$ are $\alpha \approx 0.454$ and $\gamma \approx 2.4975$.
Recall that atomic cohorts of arch systems with $n + 1$ arches are in bijection with cohorts of arch systems with $n$ arches, so to obtain the general asymptotics we multiply the constant term from the atomic asymptotics by $\gamma$, yielding:

Theorem 12. The number of cohorts of arch systems with $n$ arches behaves asymptotically as $c\,n^{-3/2} \gamma^n$, where $c \approx 1.13$ and $\gamma \approx 2.4975$.
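The passage from the atomic constant to the constant $c$ can be spelled out as follows. This is a short worked computation, assuming that $\alpha \approx 0.454$ denotes the constant in the asymptotics $a_n \sim \alpha n^{-3/2} \gamma^n$ of atomic cohorts.

```latex
\[
  a_{n+1} \;\sim\; \alpha\,(n+1)^{-3/2}\,\gamma^{\,n+1}
          \;=\; (\alpha\gamma)\; n^{-3/2}\Bigl(1 + \tfrac{1}{n}\Bigr)^{-3/2} \gamma^{\,n}
          \;\sim\; (\alpha\gamma)\; n^{-3/2}\,\gamma^{\,n},
\]
\[
  c \;=\; \alpha\gamma \;\approx\; 0.454 \times 2.4975 \;\approx\; 1.13 .
\]
```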

The main cohort, and comparison between cohorts
We start this section by defining a special cohort of arch systems of any size n and studying its properties. We specifically deal with the number of arch systems contained in this cohort, and with the generating function of any class Av(X) for an arch system X in this cohort. This will complete the proofs of our claims of the introduction. This special cohort is called the main cohort, because it appears to be the largest with respect to two criteria.
Accordingly, we report in this section some results about the comparison between cohorts (of structures of the same size, n) with respect to these two criteria. One is the size of these cohorts, i.e. the number of equivalent arch systems they contain. Here, we focus on extremal cases: we conjecture that the main cohort is the one with maximal size, and we describe singleton cohorts, that is: cohorts which contain one single arch system. Cohorts may also be compared with respect to the (common) generating functions of the classes Av(X) they represent. We provide some rules on arch systems that allow the comparison between the generating functions of their cohorts, and show that the main cohort is largest in the sense that its generating function dominates that of any other cohort.

The main cohort
Following the discussion of Subsection 5.1, for each $n$ there is a unique cohort of structures of size $n$ that arises from all unary-binary plane forests (i.e. no large atoms are involved); by definition, such forests consist of at most two trees, which are themselves unary-binary trees. We call this the main cohort for structures of size $n$ and denote it by $\mathcal{M}_n$. A representative of this cohort is the system $N_n$ of $n$ nested arches, whose corresponding forest is a chain of $n$ nodes. From its description in terms of forests, it is clear that the main cohort also includes all the arch systems of size $n$ that can be built using the following operations, and only these: concatenate two atoms that belong to $\mathcal{M}_j$ and $\mathcal{M}_k$ for $j + k = n$, or place an arch over an arch system of $\mathcal{M}_{n-1}$. For the same reason, if we let $M_n$ denote the number of atoms (i.e. trees) of size $n$ in the cohort $\mathcal{M}_n$, it is immediate that the generating function $M(t) = \sum_{n \ge 1} M_n t^n$ satisfies:
$$M(t) = t + tM(t) + tM(t)^2.$$
This identifies $(M_n)$ as the sequence of Motzkin numbers (offset by 1): $M_n = \mathrm{Motz}_{n-1}$. Recalling that the number of atoms in the main cohort for structures of size $n + 1$ is equal to the total number of arch systems in the main cohort for structures of size $n$, we obtain:

Proposition 13. The size of the main cohort for structures of size $n$ is the $n$-th Motzkin number: $|\mathcal{M}_n| = \mathrm{Motz}_n$.
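The count in Proposition 13 can be checked numerically from the description of the main cohort. The sketch below assumes the functional equation $M(t) = t + tM(t) + tM(t)^2$ for atoms (an arch over nothing, over one atom, or over two atoms of the main cohort); the function names are illustrative only.

```python
def main_cohort_atoms(max_size):
    """atoms[n] = number of atoms of size n in the main cohort (atoms[0] = 0)."""
    atoms = [0] * (max_size + 1)
    for n in range(1, max_size + 1):
        single = 1 if n == 1 else 0                      # a single empty arch
        unary = atoms[n - 1]                             # an arch over one atom
        binary = sum(atoms[j] * atoms[n - 1 - j]         # an arch over two atoms
                     for j in range(1, n - 1))
        atoms[n] = single + unary + binary
    return atoms


def main_cohort_sizes(max_size):
    """sizes[n] = atoms of size n plus concatenations of two atoms of total size n."""
    atoms = main_cohort_atoms(max_size)
    sizes = [0] * (max_size + 1)
    for n in range(1, max_size + 1):
        pairs = sum(atoms[j] * atoms[n - j] for j in range(1, n))
        sizes[n] = atoms[n] + pairs
    return sizes


def motzkin(count):
    """First `count` Motzkin numbers via m_n = m_{n-1} + sum_k m_k * m_{n-2-k}."""
    m = [1]
    for n in range(1, count):
        m.append(m[n - 1] + sum(m[k] * m[n - 2 - k] for k in range(n - 1)))
    return m


if __name__ == "__main__":
    print(main_cohort_atoms(8)[1:])   # atoms of sizes 1..8
    print(main_cohort_sizes(8)[1:])   # main cohort sizes for n = 1..8
    print(motzkin(9))                 # Motzkin numbers Motz_0..Motz_8
```

Comparing the three printed lists shows the offset by one between atoms and Motzkin numbers, and the agreement between cohort sizes and $\mathrm{Motz}_n$.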
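Furthermore, to $\mathcal{M}_n$ corresponds one generating function: that of any Av(X) for $X \in \mathcal{M}_n$. Taking $X = N_n$, where $N_n$ is the nest of $n$ arches, and observing that an arch system avoids $N_n$ exactly when it is a sequence of atoms whose contents avoid $N_{n-1}$, these generating functions $C_n$ are easily seen to satisfy
$$C_1 = 1, \qquad C_n = \frac{1}{1 - tC_{n-1}} \quad (n \ge 2),$$
giving that:

Proposition 14. For any structure $X$ in $\mathcal{M}_n$, the generating function of Av(X) is $C_n$.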
This justifies the remarks concerning the sequence of generating functions (C n ) made in the introduction.
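The first coefficients of the functions $C_n$ are easy to compute with truncated power series. The sketch below assumes the recurrence $C_1 = 1$, $C_n = 1/(1 - tC_{n-1})$, which is the standard one for arch systems with fewer than $n$ nested arches; the helper names are illustrative.

```python
def inverse_one_minus(f, order):
    """Coefficients of 1 / (1 - f(t)) up to t^order, for f with zero constant term."""
    g = [0] * (order + 1)
    g[0] = 1
    for n in range(1, order + 1):
        # g = 1 + f * g, so g[n] is a convolution of f with earlier terms of g
        g[n] = sum(f[k] * g[n - k] for k in range(1, n + 1))
    return g


def nest_avoider_series(n, order):
    """Coefficients of C_n up to t^order, under the assumed recurrence."""
    c = [1] + [0] * order              # C_1 = 1: only the empty arch system
    for _ in range(n - 1):
        shifted = [0] + c[:order]      # coefficients of t * C_{m}
        c = inverse_one_minus(shifted, order)
    return c


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, nest_avoider_series(n, 7))
```

For instance the coefficients of $C_3$ are the powers of two, and those of $C_n$ agree with the Catalan numbers up to degree $n - 1$.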
Note that Proposition 14 provides an alternative proof of the enumeration of Av(231, π) (by $C_n$ for $n = |π|$) for several families of patterns π that appear in the literature: namely decreasing patterns and patterns of the form $1\,n\,(n-1) \cdots 3\,2$ [6], reverses of 2-layered permutations and 132-avoiding wedge-patterns of [13,14], and patterns $\lambda_k \oplus \lambda_{n-k}$ of [4]. Indeed, all such patterns belong to the main cohort of the corresponding size.
For any structure $A$ in $\mathcal{M}_n$, it is easy to see that there exists a chain of ∼-equivalences from $A$ to $N_n$ that never uses rule (4) with all of a, b and c non-empty. So the same holds for any pair of structures $A$ and $B$ in $\mathcal{M}_n$. Therefore, the bijective proofs of Subsection 4.1 provide, for any such pair, a bijection between Av(A) and Av(B). A special case of this statement answers a question raised in [14], about the description of a bijection between Av(132, π) and Av(132, τ), for π any 2-layered pattern and τ any 132-avoiding wedge-pattern.
The name main cohort has been chosen because we suspect that this cohort is the largest in two senses. We shall see in Subsection 6.3 that $C_n$ dominates (term by term) the generating function $F_X$ of Av(X) for any arch system $X$ of size $n$. Moreover, unless $X \in \mathcal{M}_n$, eventually $C_n$ dominates $F_X$ strictly.
Since the main cohort is constructed using the smallest building blocks, i.e. any other cohort must involve one or more atoms consisting of at least four arches (such as an arch over three empty arches), it seems natural to suspect that among the cohorts of n-arch systems, the main cohort is largest. Turning this intuition into a proof is however far from immediate, and we offer the following conjecture:

Conjecture 15. For every integer $n \ge 3$ the size of $\mathcal{M}_n$ is greater than the size of any other cohort of an arch system of size $n$.

Singleton cohorts
At the other end of the chain, it is amusing to consider the cohorts that contain only a single arch system. Modulo Conjecture 1 these correspond to the only arch systems, A, that can be recognised directly from the generating function of Av(A).

Proposition 16. The cohort of a (non empty) arch system A is a singleton if and only if:
• $A = b^k$ where $k \ge 3$ and $b$ is an atom which is the only atom in its cohort, or
• $A = a^2$ where $a$ is an atom whose contents are some $b^k$ as in the first condition, or
• $A$ is an atom whose contents are either empty or some $b^k$ as in the first condition.
Moreover, the atoms which are the only atoms in their cohort are: the single empty arch, and the atoms whose (non empty) contents belong to a singleton cohort.
Proof. Suppose first that an arch system A is a concatenation of two or more atoms. For such arch systems rule (3) would yield more than one element in A's cohort unless these atoms were all identical. Further, rule (2) would do likewise if that atom were not the only atom in its cohort.
On the other hand, if these conditions are met, and A is a concatenation of at least three atoms then rules (1) and (4) cannot be applied, so such A are indeed arch systems whose cohort is a singleton.
If the cohort of $A = a^2$ is a singleton, and $X$ denotes the contents of the atom $a$, then clearly the cohort of $X$ must be a singleton (else rule (1) would apply). Furthermore, $X$ must be the concatenation of at least three atoms, or else rule (4) could be applied in $A$. Conversely, if $X$ satisfies these conditions then none of the rules can be applied to yield any other element of $A$'s cohort.
If A = X is an atom that forms a singleton cohort, then its contents X (if not empty) must belong to a singleton cohort (else rule (1) would apply). X cannot be an atom since Y ∼ Y (from rule (4) with c = Y and a and b empty). Similarly, X cannot be the concatenation of two atoms, since ab = a b (from rule (4) with c empty). So X must satisfy the first condition. Conversely if the contents X of A do satisfy this condition then the cohort of A will be a singleton: indeed, the only rules allowing one to find a ∼-equivalent of an atom are rule (1) and the special cases of rule (4) -which do not apply here since X is the concatenation of at least three atoms.
If an atom is the only atom in its cohort, then obviously its contents are either empty or belong to a singleton cohort. Conversely, consider an atom that is either the single empty arch or $\overparen{X}$ (an arch placed over $X$) where the cohort of $X$ is a singleton. Certainly, the single empty arch is the only atom in its cohort (which is indeed a singleton here). We claim that for any arch system $X$ whose cohort is a singleton, $\overparen{X}$ is the only atom in its cohort. Such $X$ satisfies one of the conditions of Proposition 16. If $X = b^k$ as in the first condition, then none of the rules (1) to (4) apply to $\overparen{X}$; note that here the cohort of $\overparen{X}$ is actually a singleton, from the third condition. If $X = a^2$ as in the second condition, then only special cases of rule (4) apply to $\overparen{X} = \overparen{aa}$, producing two arch systems ∼-equivalent to $\overparen{X}$, namely $a\,\overparen{a}$ and $\overparen{a}\,a$. If $X = \overparen{Y}$ is an atom as in the third condition, then only special cases of rule (4) apply to $\overparen{X}$, producing two (one if $Y$ is empty) arch systems ∼-equivalent to $\overparen{X}$, namely the concatenations of $X$ with a single empty arch, in either order. In all cases, we observe that $\overparen{X}$ is indeed the only atom in its cohort.
In order to translate these conditions into recurrences that allow us to count singleton cohorts, we introduce several auxiliary functions: $S_1(n)$ counts the atomic singleton cohorts, $S_2(n)$ counts the singleton cohorts of the form $a^2$, and $S_{\ge 3}(n)$ counts the singleton cohorts of the form $b^k$ for $k \ge 3$. Also $A(n)$ counts the number of cohorts that contain a single atom. Then we obtain as recursive conditions: These together with appropriate boundary conditions determine all the functions and hence the total number $S(n)$ of singleton cohorts, $S(n) = S_1(n) + S_2(n) + S_{\ge 3}(n)$. Note that the actual recurrences really just involve $S_{\ge 3}$ and $A$ as follows: It might be possible to derive from the above some information on the "average behaviour" of $S(n)$, the number of singleton cohorts of n-arch systems. But this would likely involve tricky computations with number theoretic arguments, which we leave aside for the moment.

Comparing avoidance classes between cohorts
One (maybe the most important) purpose of this subsection is to prove that the main cohort is the largest in terms of the generating function associated with Av(X), for X in this cohort. This claim is proved as a consequence of more general statements, that allow the comparison of such generating functions associated with various cohorts.
Let us start by introducing some notation. For any cohort $\mathcal{C}$, and any $A$ and $B$ in $\mathcal{C}$, we know from Theorem 8 that Av(A) and Av(B) have the same generating function. We may therefore associate this generating function with $\mathcal{C}$ and, when doing so, we denote it $F_{\mathcal{C}}$. For two cohorts $\mathcal{C}$ and $\mathcal{D}$, with generating functions $F_{\mathcal{C}} = \sum_n c_n t^n$ and $F_{\mathcal{D}} = \sum_n d_n t^n$, we write $\mathcal{C} \le \mathcal{D}$ when for all $n$, $c_n \le d_n$. We also write $\mathcal{C} < \mathcal{D}$ when $\mathcal{C} \le \mathcal{D}$ and there exists $n_0$ such that for all $n \ge n_0$, $c_n < d_n$. Finally, for any arch system $A$, we denote by $\mathcal{C}_A$ the cohort containing $A$, that is to say the equivalence class of $A$ for ∼.
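For small sizes, the sequences being compared can be computed by brute force. The sketch below is illustrative only: it assumes arch systems are encoded as balanced strings of parentheses and that one arch system contains another when the latter is obtained by deleting arches. On a finite range it can only test the weak comparison, not the eventual strictness.

```python
from itertools import combinations


def dyck_words(n):
    """Yield every arch system with n arches as a balanced string of '(' and ')'."""
    def build(prefix, opened, closed):
        if closed == n:
            yield prefix
            return
        if opened < n:
            yield from build(prefix + "(", opened + 1, closed)
        if closed < opened:
            yield from build(prefix + ")", opened, closed + 1)
    yield from build("", 0, 0)


def arches(word):
    """Return the (open, close) positions of the arches of `word`."""
    stack, pairs = [], []
    for i, ch in enumerate(word):
        if ch == "(":
            stack.append(i)
        else:
            pairs.append((stack.pop(), i))
    return pairs


def contains(big, small):
    """True if keeping some subset of the arches of `big` leaves exactly `small`."""
    k = len(small) // 2
    for chosen in combinations(arches(big), k):
        keep = sorted(p for pair in chosen for p in pair)
        if "".join(big[p] for p in keep) == small:
            return True
    return False


def avoidance_sequence(pattern, max_size):
    """Number of arch systems of each size 0..max_size that avoid `pattern`."""
    return [sum(1 for w in dyck_words(m) if not contains(w, pattern))
            for m in range(max_size + 1)]


def weakly_below(c, d):
    """Term-by-term comparison c_n <= d_n on the computed range."""
    return all(x <= y for x, y in zip(c, d))


if __name__ == "__main__":
    for pattern in ["((()))", "(())()", "()()()"]:
        print(pattern, avoidance_sequence(pattern, 6))
```

Running it on the three arch systems of size 3 listed in the usage lines shows two of them sharing a sequence, while the concatenation of three empty arches gives a sequence that falls behind from size 4 on.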
Variations on the bijective proofs of cases (1), (2) and the specialisation of case (4) of Theorem 8 allow us to provide some recursive rules for the comparison of cohorts C A .
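Proposition 17. For any arch systems $A$ and $B$, write $\overparen{A}$ and $\overparen{B}$ for the atoms obtained by placing an arch over $A$ and over $B$. If $\mathcal{C}_A \le \mathcal{C}_B$ then $\mathcal{C}_{\overparen{A}} \le \mathcal{C}_{\overparen{B}}$, and if $\mathcal{C}_A < \mathcal{C}_B$ then $\mathcal{C}_{\overparen{A}} < \mathcal{C}_{\overparen{B}}$.

Proof. To prove that $\mathcal{C}_{\overparen{A}} \le \mathcal{C}_{\overparen{B}}$ (resp. $\mathcal{C}_{\overparen{A}} < \mathcal{C}_{\overparen{B}}$) we should compare (term by term) the enumeration sequences of Av($\overparen{A}$) and Av($\overparen{B}$), proving that the latter is weakly (resp. eventually strictly) larger. To do that, it is enough to give a size-preserving injection (resp. size-preserving injection which fails to be surjective in any size from some $n_0$) from Av($\overparen{A}$) to Av($\overparen{B}$) given one from Av(A) to Av(B). This follows immediately from the same arguments used in the proof of case (1) of Theorem 8, essentially by replacing "bijection" wherever it occurs by "injection" (resp. "injection which is not surjective in any size from some $n'_0$"; observe that $n_0 = n'_0 + 1$).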
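Proposition 18. For any arch system $A$ and any atom $b$, if $\mathcal{C}_A \le \mathcal{C}_b$ then $\mathcal{C}_{PAQ} \le \mathcal{C}_{PbQ}$, and unless $A = a$ is an atom such that $a \sim b$, $\mathcal{C}_{PAQ} < \mathcal{C}_{PbQ}$. Moreover, if $\mathcal{C}_A < \mathcal{C}_b$ then $\mathcal{C}_{PAQ} < \mathcal{C}_{PbQ}$.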
Proof. To prove $\mathcal{C}_{PAQ} \le \mathcal{C}_{PbQ}$, we describe a size-preserving injection from Av(PAQ) to Av(PbQ), based on one from Av(A) to Av(b).
With the same decomposition used in the proof of case (2) of Theorem 8, we see that, given an injection from Av(A) to Av(b), an injection from Av(PAQ) to Av(PbQ) can be constructed. This uses the fact that if a concatenation $I_1 I_2 \cdots I_k$ of arch systems avoids $A$, then each arch system $I_i$ must avoid $A$.
If $\mathcal{C}_A < \mathcal{C}_b$, this injection cannot possibly be a bijection (except for the first few sizes $n \le n_0$, for some $n_0$). Indeed, it is easy to construct elements of any size $n + |P| + |Q|$ of Av(PbQ) that do not lie in its image from elements of Av(b) of size $n$ that do not lie in the image of the original injection. In fact, for this injection to be a bijection, we need two conditions. The first one is that a concatenation of arch systems should avoid $A$ if and only if each arch system in this sequence avoids $A$: this happens exactly when $A$ is an atom. The second condition is that the injection from Av(A) to Av(b) needs to be a bijection, i.e. that $A \sim b$.
Propositions 17 and 18 are enough to prove that the main cohorts $\mathcal{M}_n = \mathcal{C}_{N_n}$ are the largest in the sense that their generating functions $F_{\mathcal{M}_n}$ eventually dominate the generating functions of any other cohort of arch systems of size $n$. Recall that $N_n$ is the arch system consisting of $n$ nested arches.
Proposition 19. For every arch system $A$ of size $n$, either $A$ is in the cohort of $N_n$ or $\mathcal{C}_A < \mathcal{C}_{N_n}$.
Proof. The proof is by induction. The base case ($n = 1$) is clear. So assume that $n \ge 2$ and that the statement holds for all $n' < n$. Consider an arch system $A$ of size $n$. Either $A = \overparen{X}$ is an atom (an arch placed over an arch system $X$), or $A = Xa$ where $a$ is an atom and $X$ a non empty arch system.
In the first case, by induction we know that exactly one of the following holds:
• $X$ is in the cohort of $N_{n-1}$; and then $A$ is in the cohort of $N_n$ by rule (1).
• $\mathcal{C}_X < \mathcal{C}_{N_{n-1}}$; but then Proposition 17 ensures that $\mathcal{C}_A < \mathcal{C}_{N_n}$.
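In the second case, denoting the size of $X$ by $j$, we know that either $X$ is in the cohort of $N_j$ or $\mathcal{C}_X < \mathcal{C}_{N_j}$.
Assume first that $X \sim N_j$. If $X$ is an atom, then $Xa \sim N_j a$ by rule (2). Now either $a \sim N_{n-j}$, in which case $N_j a \sim N_j N_{n-j} \sim N_n$ so that $A = Xa$ is in the cohort of $N_n$; or $\mathcal{C}_a < \mathcal{C}_{N_{n-j}}$, and Proposition 18 ensures that $\mathcal{C}_A = \mathcal{C}_{Xa} < \mathcal{C}_{XN_{n-j}} \le \mathcal{C}_{N_j N_{n-j}}$ (using Proposition 18 again, since $\mathcal{C}_X \le \mathcal{C}_{N_j}$ by induction). We conclude using $\mathcal{C}_{N_j N_{n-j}} = \mathcal{C}_{N_n}$.
If $X$ is not an atom, we deduce from $X \sim N_j$ that $\mathcal{C}_X \le \mathcal{C}_{N_j}$, and Proposition 18 (applied twice) and induction ensure that $\mathcal{C}_{Xa} < \mathcal{C}_{N_j a} \le \mathcal{C}_{N_j N_{n-j}} = \mathcal{C}_{N_n}$.
The last case is $\mathcal{C}_X < \mathcal{C}_{N_j}$, in which case Proposition 18 gives $\mathcal{C}_{Xa} < \mathcal{C}_{N_j a} \le \mathcal{C}_{N_n}$ (as before).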
Finally, the bijective proof of the specialisation of case (4) of Theorem 8 can also be adapted to the comparison of cohorts.
Proposition 20. For any arch system A, and any arch system b which is an atom or empty, Proof. Let us assume that A is not empty, otherwise the statement is clear. Again, we use the same decomposition as in the proof of the specialisation of case (4) of Theorem 8 to see that an injection from Av(A b ) to Av( bA ) can be constructed.
More precisely, the arch systems of Av(A b ) either avoid b or are of the form where C contains the contents of b but avoids b, the concatenation of arch systems A p . . . A 1 avoids A, every B i avoids b, and the concatenation of arch systems A p . . . A 1 M avoids A b . This last condition implies that M avoids A b , but is more restrictive in general. It is equivalent exactly when A is an atom (given that A p . . . A 1 avoids A).
On the other hand, the arch systems of Av( bA ) either avoid b or are of the form where C contains the contents of b but avoids b, the concatenation of arch systems A p . . . A 1 avoids A, every B i avoids b, and M ′ avoids bA (without further restriction on M ′ ).
So "mapping the blocks" recursively as in the proof of the specialisation of case (4) of Theorem 8 we get a size-preserving injection from Av(A b ) to Av( bA ). If A is not an atom, we claim that starting at some size n 0 , this injection is not surjective. Indeed, there exist arch systems M of all sufficiently large sizes such that M avoids A b but A p . . . A 1 M contains A b for some A i such that A p . . . A 1 avoids A.

Conclusions and open problems
Several questions are left open in this work. An important one is certainly to provide a completely bijective proof of our main result (Theorem 8), that is: proving case 4 of this theorem bijectively. Even a sensible combinatorial explanation of the rather tidy expression for F a bc in terms of F a , F b and F c would represent progress in this direction. Another problem is to prove that the main cohort is the largest also in terms of number of elements it contains.
But the most intriguing problem is certainly to prove a converse statement to our main theorem: that not only does ∼ refine Wilf-equivalence but also coincides with it. This is stated as Conjecture 1 at the beginning of our paper, and we offer a stronger version of this conjecture, by way of conclusion.
Conjecture 21. For any two arch systems A and B, both with n arches, either A and B are in the same cohort (i.e. A ∼ B), or the enumeration sequences of Av(A) and Av(B) differ at the latest at size 2n − 2.
We have been able to check that this stronger conjecture holds up to arch systems A and B of size 15. We further know that the size 2n − 2 is the smallest one for which such a conjecture could be true. Indeed, we have identified families of arch systems A n and B n of any size n ≥ 4 such that the enumeration sequences of Av(A n ) and Av(B n ) coincide up to size 2n − 3 but differ at 2n − 2. These are described below.
Let k denote the concatenation of k empty arches. Now, for any n ≥ 4, set C n = n−4 , A n = C n , and B n = C n . We claim that there is a size preserving bijection between Av(A n ) and Av(B n ) restricted to arch systems with at most 2n − 3 arches, but that there are more arch systems of size 2n − 2 avoiding A n than B n .
Observe that A n = bA and B n = A b for b = C n and A = . So the proof of Proposition 20 provides an injection ϕ from Av(B n ) to Av(A n ). It is relatively easy to see that ϕ is actually a bijection when restricted to arch systems with at most 2n − 3 arches. This essentially amounts to examining where these at most 2n − 3 arches can be in arch systems containing C n but avoiding A n . It is also not hard to see that the arch system C n C n of size 2n − 2 avoids A n but is not in the image of ϕ.
To the best of our knowledge, this work is the first global approach to the study of Wilf-equivalences, a popular topic of research in the field of permutation patterns from its early days until now, and arguably so in the wider context of hereditary classes of combinatorial structures. It is performed in the context of Catalan structures, or equivalently permutations avoiding 231 and another pattern π, which we could call principal subclasses of Av(231). We believe that similar investigations, aiming at classifying all Wilf-equivalences between principal subclasses of (well-behaved) permutation classes, should be carried out. One promising example being considered by the first author, Cheyne Homberger and Jay Pantone is the class of separable permutations, Av(2413, 3142). This comment is motivated in part by the results of [3], which provide a partial parallel of Proposition 6, but more generally because the separable permutations permit several other "well-structured" representations.
We can even hope to extend our ideas further, to a partial classification of Wilf-equivalences between principal permutation classes, i.e. classes of permutations defined by the avoidance of a single pattern. The framework of matchings with excluded sub-matchings, as defined in [11], could provide a good tool for that. Matchings are similar to arch systems, except that arches are allowed to cross. Namely, a matching of size n is a set of n arches connecting 2n points arranged along a baseline, with all arches above the baseline. Obviously, our families Av(A) of arch systems avoiding a given arch system A can be seen as matchings with excluded sub-matchings: namely, those avoiding both the matching formed by two crossing arches and A. But (principal) permutation classes Av(π) can also be represented as matchings with excluded sub-matchings. Indeed, permutations are in immediate correspondence with matchings having all their arches opened before any arch is closed, or equivalently with matchings avoiding the matching formed by two arches side by side. Under this correspondence, a permutation class Av(π) is simply the class of matchings avoiding two side-by-side arches and the matching encoding π. If it were possible to adapt our work to such cases, and in particular to provide an upper bound on the asymptotic number of Wilf-equivalence classes of principal permutation classes, this would be a major achievement in the field.
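The correspondence between permutations and matchings whose arches are all opened before any is closed can be sketched as follows. The specific encoding is an assumption made for illustration (the i-th opening point is joined to the closing point numbered n + π(i)); the convention of [11] may differ, and the function names are hypothetical.

```python
def permutation_to_matching(perm):
    """Encode a permutation of 1..n as n arches on the points 1..2n.

    Assumed convention: arch i joins point i to point n + perm[i - 1].
    """
    n = len(perm)
    return [(i, n + perm[i - 1]) for i in range(1, n + 1)]


def openers_precede_closers(matching):
    """True if every arch is opened before any arch is closed."""
    if not matching:
        return True
    last_opener = max(start for start, _ in matching)
    first_closer = min(end for _, end in matching)
    return last_opener < first_closer


if __name__ == "__main__":
    arches = permutation_to_matching([2, 3, 1])
    print(arches, openers_precede_closers(arches))
    # Two arches side by side form an arch system, not the image of a permutation.
    print(openers_precede_closers([(1, 2), (3, 4)]))
```

Under this encoding, two arches (i, n + π(i)) and (j, n + π(j)) with i < j cross when π(i) < π(j) and are nested otherwise.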