Stack Sorting with Increasing and Decreasing Stacks

We introduce a sorting machine consisting of $k+1$ stacks in series: the first $k$ stacks can only contain elements in decreasing order from top to bottom, while the last one has the opposite restriction. This device generalizes \cite{SM}, which studies the case $k=1$. Here we show that, for $k=2$, the set of sortable permutations is a class with infinite basis, by explicitly finding an antichain of minimal nonsortable permutations. This construction can easily be adapted to each $k \ge 3$. Next we describe an optimal sorting algorithm, again for the case $k=2$. We then analyze two types of left-greedy sorting procedures, obtaining complete results in one case and only some partial results in the other one. We close the paper by discussing a few open questions.


Introduction
The problem of sorting a permutation using a stack was first introduced by Knuth [6] in the 1960s; in its classical formulation, the aim is to sort a permutation using a first-in/last-out device. As is well known, in this case a permutation $\pi = \pi_1 \cdots \pi_n$ is sortable if and only if there do not exist three indices $i < j < k$ such that $\pi_k < \pi_i < \pi_j$. Recall that the set of permutations can be partially ordered by means of the relation of "being a pattern", and we write $\sigma \le \pi$ to mean that $\sigma$ is a pattern of $\pi$. The resulting poset is called the permutation pattern poset, and a downset (i.e., a subset closed downwards) of this poset is usually called a class. Each class is determined by the minimal elements of its complement, which form its basis. In the language of permutation patterns, the set of sortable permutations is thus a class with basis {231}, meaning that a permutation is sortable if and only if it does not contain the pattern 231 as a subpermutation. For the basics on permutation patterns in combinatorics and computer science, we refer to [3].
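The criterion recalled above can be checked mechanically on small cases. The following Python sketch is only an illustration: the names `stack_sort` and `contains_231` are ours, and the popping rule (pop as long as the top of the stack is smaller than the incoming element) is the usual greedy procedure commonly called Stacksort, which we assume here as the sorting strategy.

```python
from itertools import combinations, permutations

def stack_sort(perm):
    """Run the greedy one-stack procedure and return the output sequence."""
    stack, output = [], []
    for x in perm:
        # Any smaller element on top of the stack must leave before x is
        # pushed, otherwise x would be output before it.
        while stack and stack[-1] < x:
            output.append(stack.pop())
        stack.append(x)
    while stack:
        output.append(stack.pop())
    return output

def contains_231(perm):
    """True if there are positions i < j < k with perm[k] < perm[i] < perm[j]."""
    return any(c < a < b for a, b, c in combinations(perm, 3))

# Exhaustive check of the criterion on all permutations of length 5.
for p in permutations(range(1, 6)):
    assert (stack_sort(p) == sorted(p)) == (not contains_231(p))
```

The exhaustive loop is feasible only for short permutations, since `contains_231` inspects all triples and the number of permutations grows factorially.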
More generally (see [11]), one can consider a network of sorting devices, each of which is represented as a node in a directed graph; when there is an arc from node S to node T , the machine is allowed to pop an element from S and push it into T ; if we mark two distinct vertices as the input and the output, then the sorting problem consists of looking for a sequence of operations that allows us to move a permutation from the input to the output, finally obtaining the identity permutation.
In this framework, some of the typical problems are the following:
• characterize the permutations that can be sorted by a given network;
• enumerate sortable permutations with respect to their length;
• if the network is too complex, find a specific algorithm that sorts "many" input permutations and characterize such permutations.
Concerning the last problem, note that, for a given network of devices, although the set of sortable permutations forms a class in general, this is not true anymore if one chooses a specific sorting strategy; this approach leads in general to more complicated characterizations which involve other kinds of patterns (as it happens, for instance, for West 2-stack-sortable permutations [12]).
Although it is very hard to obtain interesting results for large networks, a lot of work has been done for some particular, small networks (see [2] for a dated survey, or [5] for a more recent one); in this work we restrict our attention to the case of stacks connected in series, with the restriction that the elements are maintained inside each stack either in increasing or in decreasing order. Our starting point is [10], where Rebecca Smith proved that the permutations sorted by a decreasing stack followed by an increasing one form a class with basis {3241, 3142}. In the present paper, we investigate what happens when we add more decreasing stacks in front. Our first result is that the device having two decreasing stacks followed by an increasing one does not have a finite basis. Our proof can be easily adapted to show the same property for any larger number of decreasing stacks in front. Next, we provide an optimal algorithm to sort permutations, again in the case of two decreasing stacks followed by an increasing one. Our algorithm is optimal in the sense that it is able to sort all sortable permutations. Finally, we select a couple of (greedy) strategies and we prove that one of them can be studied in a very neat way, whereas the other one seems to be too difficult to allow a simple description of sortable permutations in terms of patterns, even including generalized versions of them.
2 Many decreasing stacks followed by an increasing one.
Generalizing the approach of [10], here we consider a sorting device made of $k$ decreasing stacks in series, denoted by $D_1, \ldots, D_k$, followed by an increasing stack $I$. Recall that in a "decreasing" (resp., "increasing") stack the elements have to be in decreasing (resp., increasing) order from top to bottom. When $k = 0$, we just have a single increasing stack, so we obtain the usual Stacksort procedure. When $k = 1$, we obtain exactly the DI machine described in [10]. In the sequel we denote our machine by $D^kI$.
The $D^kI$ machine can perform the following operations:
• $d_0$: push the next element of the input permutation into the first decreasing stack $D_1$;
• $d_i$, for $i = 1, \ldots, k-1$: pop an element from $D_i$ and push it into the next decreasing stack $D_{i+1}$;
• $d_k$: pop an element from the last decreasing stack $D_k$ and push it into the increasing stack $I$;
• $d_{k+1}$: pop an element from the increasing stack $I$ and output it (by placing it on the right of the list of elements that have already been output).
Notice that each operation can be performed only if it does not violate the restrictions of the stacks; in this case, we call it a legal operation. For the special case of the operation $d_{k+1}$, we assume that it is legal either if it outputs the smallest among the elements not already in the output, or if no other operation is legal.
Remark 2.1. If an occurrence of the pattern 231 is pushed into the last stack I, then the input permutation cannot be sorted. Moreover, this is the only situation that corresponds to a failure in the sorting procedure. This is a consequence of the classical result of Knuth [6], where in fact the only stack is used exactly as if it were increasing.
For any given $k$, we are now interested in characterizing the set $\mathrm{Sort}_k = \{\pi \in S \mid$ there is a sequence of legal operations of the $D^kI$ machine that sorts $\pi\}$.
If $\pi \in \mathrm{Sort}_k$, we say that $\pi$ is $k$-sortable. Notice that we are using the sorting machine in the most general setting, so using a standard argument it is easy to show that $\mathrm{Sort}_k$ is a class for every $k$. The natural way to describe $\mathrm{Sort}_k$ is therefore to understand its basis. Here we show that, even when $k = 2$, the basis of $\mathrm{Sort}_k$ is infinite, by explicitly finding an infinite antichain of permutations which are not 2-sortable and are minimal with respect to the pattern ordering. The construction of the infinite antichain described in the next theorem can be easily adapted to every $k \ge 2$. The software PermLab [1], developed by Michael Albert, has been an extremely useful tool to find such an antichain. This result is in sharp contrast with what happens when $k = 1$, which is the case considered in [10], where it is shown that the basis is finite (of cardinality 2). We start by stating some useful lemmas, whose proofs are straightforward.
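For small inputs, membership in the set of $k$-sortable permutations can be decided by exhaustively exploring the legal operations listed above. The sketch below is a brute-force illustration under our own modelling assumptions, not an algorithm taken from the paper: the function name `is_k_sortable` is hypothetical, the input is assumed to be a permutation of $1, \ldots, n$, stacks are stored as tuples whose last entry is the top, and an output operation is explored only when it emits the next expected value (any other output can never lead to the identity).

```python
from functools import lru_cache

def is_k_sortable(perm, k):
    """Search all legal operation sequences of the machine with k decreasing stacks."""
    perm = tuple(perm)
    n = len(perm)

    def can_push(idx, stack, x):
        if not stack:
            return True
        # Stacks 0..k-1 model D_1..D_k (the top is the largest element);
        # stack k models I (the top is the smallest element).
        return x > stack[-1] if idx < k else x < stack[-1]

    @lru_cache(maxsize=None)
    def search(pos, stacks, next_out):
        if next_out > n:
            return True  # every element has been output in increasing order
        # d_{k+1}: output the top of I when it is the next expected value.
        if stacks[k] and stacks[k][-1] == next_out:
            if search(pos, stacks[:k] + (stacks[k][:-1],), next_out + 1):
                return True
        # d_0: move the next input element into the first stack.
        if pos < n and can_push(0, stacks[0], perm[pos]):
            first = stacks[0] + (perm[pos],)
            if search(pos + 1, (first,) + stacks[1:], next_out):
                return True
        # d_1, ..., d_k: move the top of stack i into stack i + 1.
        for i in range(k):
            if stacks[i] and can_push(i + 1, stacks[i + 1], stacks[i][-1]):
                x = stacks[i][-1]
                moved = (stacks[:i]
                         + (stacks[i][:-1], stacks[i + 1] + (x,))
                         + stacks[i + 2:])
                if search(pos, moved, next_out):
                    return True
        return False

    return search(0, ((),) * (k + 1), 1)

assert not is_k_sortable((2, 3, 1), 0)     # k = 0 behaves like a single stack
assert not is_k_sortable((3, 2, 4, 1), 1)  # a basis element from [10]
assert is_k_sortable((2, 3, 4, 1), 1)
```

Memoizing on the triple (input position, stack contents, next expected output) keeps the search manageable for the short permutations considered here; it is not meant as an efficient decision procedure.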
Lemma 2.2. Let $\pi$ be an input permutation for the $D^kI$ machine; if $i < j$ and $\pi_i > \pi_j$, then $\pi_i$ is necessarily pushed into $I$ before $\pi_j$. In other words, the decreasing stacks $D_1, \ldots, D_k$ cannot repair inversions.
Lemma 2.3. Let $\pi$ be an input permutation for the $D^kI$ machine and let $a < b < c$ be elements of $\pi$. Focus on the instant when, during the sorting process, $b$ is pushed into the increasing stack. Then, if any of the following conditions holds, $\pi$ cannot be sorted anymore:
2. $c$ is in $D_j$, for some $j$, and $a$ is still in the input;
3. $c$ and $a$ are still in the input, with $a$ following $c$.

Proof.
The previous lemma implies that, if any of the above conditions is satisfied, an occurrence of the pattern 231 is pushed into the increasing stack, so π cannot be sorted anymore due to Remark 2.1.
Rephrasing the last lemma, if we try to sort π and, when b is pushed into the increasing stack, one of the listed conditions holds, then there is no hope to complete the procedure to obtain a sorted output.
Theorem 2.4. For $j \ge 0$, let $\alpha^{(j)} = 2j+4,\ 3,\ \omega^{(j)},\ 1,\ 5,\ 2$, where $\omega^{(j)} = 2j+2,\ 2j+5,\ 2j,\ 2j+3,\ 2j-2,\ 2j+1,\ \ldots,\ 6,\ 9,\ 4,\ 7$ (so that $\omega^{(0)}$ is empty). Then the set of permutations $\{\alpha^{(j)}\}_{j \ge 0}$ constitutes an infinite antichain in the permutation pattern poset, each of whose elements is not 2-sortable. Moreover, $\alpha^{(j)}$ is minimal with respect to such a property, i.e. if we remove any element of $\alpha^{(j)}$ we obtain a 2-sortable permutation.
Proof. We start by proving (using induction) that $\alpha^{(j)}$ is not 2-sortable, for every $j$. If $j = 0$, it is easy to check that $\alpha^{(0)} = 43152$ cannot be sorted using the $D^2I$ machine. Let $j \ge 1$ and $\alpha^{(j)} = \alpha_1 \cdots \alpha_{2j+5}$. Since $\alpha_1 = 2j+4 > \alpha_2 = 3$, $\alpha_1$ has to be pushed into $D_2$ before $\alpha_2$ enters $D_1$. Notice that the maximum of $\alpha^{(j)}$ is $\alpha_4 = 2j+5$ and there are elements following it in $\alpha^{(j)}$ which are smaller than both $\alpha_1$ and $\alpha_4$, so we cannot push $\alpha_1$ into $I$ due to the previous lemma. Thus the only option we are left with is to push $\alpha_3 = 2j+2$ into $D_1$ immediately above $\alpha_2$. Now, the next element of the input is the maximum $\alpha_4$, and of course we can push it through the decreasing stacks and finally into $I$. Observe that pushing the maximum available element into $I$ is always convenient. So the second maximum $\alpha_1 = 2j+4$, which is currently contained in $D_2$, can be pushed into $I$ similarly, leaving us with just the elements $\alpha_3$ and $\alpha_2$ in $D_1$, with $\alpha_3$ on top. The next element of the input is $\alpha_5 = 2j < \alpha_3$, so pushing $\alpha_3$ into $D_2$ is forced. Now, disregarding the two maximal elements of $\alpha^{(j)}$ already pushed into $I$, notice that we are in the same configuration that arises when processing $\alpha^{(j-1)}$ after considering the first two elements, so we can conclude that $\alpha^{(j)}$ is not 2-sortable by inductive hypothesis. An example of the above argument for $j = 2$ is shown in Figure 1.
In passing, we observe that the optimal sorting strategy here would be, at each step, to push the maximum and second maximum element still available into $I$; in the general case, this strategy fails since 3 remains stuck in $D_1$, blocked by a larger element in $D_2$, until we reach the final portion of $\alpha^{(j)}$. This crucial remark will be useful in the last part of this proof.
We now prove that $\alpha^{(j)}$ is minimal not 2-sortable. This can be proved with a case-by-case analysis, depending on the element we choose to remove. We show in detail just some of these cases, leaving the remaining ones to the reader.
• If we remove the first element $\alpha_1 = 2j+4$, we can push the new first element $\alpha_2 = 3$ directly into $D_2$; from now on, we can follow the sorting procedure outlined above, pushing at each step the maximum and second maximum available elements into $I$. However in this case, before processing the last three elements 1, 5, 2, both 3 and 4 are in $D_2$, whereas in processing $\alpha^{(j)}$ we have 3 inside $D_1$ and 4 inside $D_2$. Therefore we can now push 1 into $D_1$ and 5 into $I$, and finally 4, 3, 2, 1 in the correct order, as desired.
• If we remove $\alpha_2 = 3$, we can sort the resulting permutation using the same procedure, this time obtaining a configuration with just 4 in $D_2$ and 1, 5, 2 in the input.
• Consider the removal of an element $x = \alpha_i$, for some $i = 3, \ldots, 2j+2$. In the first part of the sorting procedure, the element 3 is stuck in $D_1$, similarly to what happens when processing $\alpha^{(j)}$. However, as soon as we scan the element that follows $x$ in $\alpha^{(j)}$, when we push the maximum and second maximum into $I$ we are left for a moment with the stack $D_2$ empty (and just 3 in $D_1$), because we removed the element $x$ that had to occupy $D_2$. So we can take advantage of this fact and move 3 into $D_2$, concluding the sorting procedure as in the previous cases.
• The removal of the elements 1, 5, 2 can be dealt with in a similar way.
Thus we have seen that, in any case, removing any element of $\alpha^{(j)}$ results in a 2-sortable permutation, so $\alpha^{(j)}$ is minimal not 2-sortable.

Figure 1: The recursive construction described in Theorem 2.4 with input $\alpha^{(2)} = 836947152$ (on the right). The last step corresponds to input $\alpha^{(1)} = 6347152$ after having pushed the first two elements into the machine.
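The first members of the family can be generated and compared programmatically. The following Python sketch is illustrative only: the helper names `alpha`, `standardize` and `contains` are ours, and the formula used for `alpha` is read off from the values $\alpha^{(0)} = 43152$, $\alpha^{(1)} = 6347152$ and $\alpha^{(2)} = 836947152$ quoted in the text. It checks by brute force that none of these three permutations is a pattern of another, which is the antichain property restricted to $j \le 2$.

```python
from itertools import combinations

def alpha(j):
    """Return alpha^(j) = 2j+4, 3, omega^(j), 1, 5, 2 as a tuple."""
    omega = []
    for m in range(j, 0, -1):
        # omega^(j) lists the pairs (2m+2, 2m+5) for m = j, j-1, ..., 1.
        omega += [2 * m + 2, 2 * m + 5]
    return (2 * j + 4, 3, *omega, 1, 5, 2)

def standardize(seq):
    """Replace the entries of seq by their ranks 1, 2, ..., len(seq)."""
    rank = {v: r for r, v in enumerate(sorted(seq), start=1)}
    return tuple(rank[v] for v in seq)

def contains(perm, pattern):
    """True if some subsequence of perm is order-isomorphic to pattern."""
    return any(standardize(sub) == tuple(pattern)
               for sub in combinations(perm, len(pattern)))

family = [alpha(j) for j in range(3)]
assert family[2] == (8, 3, 6, 9, 4, 7, 1, 5, 2)
for shorter, longer in combinations(family, 2):
    assert not contains(longer, shorter)
```

The containment test enumerates all subsequences of the required length, so it is suitable only for the short permutations used in this check.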
($\gamma$) $\mathrm{Top}(D_1) < \mathrm{Input}$, $\mathrm{Input} < \mathrm{Top}(I)$ and the sequence of elements from Input to the first element larger than $\mathrm{Top}(D_2)$ is increasing.
In the sequel, each of the $d_i$'s, for $i = 0, 1, 2, 3$, will be called an operation, exactly as we did until now. Instead, each of the six items in the above description of algorithm $D^2I$ will be called an instruction. Therefore, an instruction of $D^2I$ consists of performing a (legal) operation, provided that some constraints are satisfied.
It is not difficult to realize that instruction 2 of the above algorithm is not essential for its correctness, so in principle we could remove it. However, in some cases (and in particular in the proof of the optimality) it is convenient to have it.
Algorithm $D^2I$ sets certain priorities between operations, provided that certain conditions are fulfilled. In general, given any two operations $d$ and $d'$, we will use the notation $d \triangleright d'$ to mean that $d$ has higher priority than $d'$ (and so, if both $d$ and $d'$ are legal, $d$ is performed). Moreover, we denote by $(\omega)\, d$ any operation $d$ which, in order to be performed, has to be legal and also to satisfy an additional constraint $\omega$.
Using these notations, we can illustrate algorithm $D^2I$ (in which instruction 2 has been removed) with the following chain of priorities:
Notice that condition ($\alpha$) is equivalent to saying that operation $d_2$ is legal; however, for homogeneity's sake, we have preferred to state it explicitly in the description of our algorithm.

Remarks.
1. If, at some point, algorithm $D^2I$ performs instruction 6, then the input permutation is not sorted at the end of the process, and this is the only obstruction to the sorting process. In other words, $D^2I$ sorts a permutation if and only if it never executes instruction 6.
2. To some extent, algorithm $D^2I$ generalizes Smith's algorithm for a decreasing stack and an increasing stack in series. More specifically, interpreting the first stack of our device as the input container (and so removing the decreasing constraint) and operation $d_1$ as the input operation, which inserts the current element of the input permutation into the (new) first decreasing stack, we obtain precisely Smith's algorithm.
The proof of the optimality of our algorithm is not trivial, and requires several steps. Our first goal is to prove some properties of algorithm $D^2I$.
Proof. By induction on the step number. At the beginning of the sorting process, the statement in the lemma is true since all the stacks are empty. Now suppose that the statement holds at step $n$, and consider all possible instructions that can be performed: a simple case-by-case analysis shows that the same inequality is true also at step $n + 1$.
Proof. The previous lemma tells us that condition ($\alpha$) is always true, so instruction 5 of $D^2I$ can always be executed provided that $D_2$ is not empty.
Proof. The proof works by induction, exactly in the same way as Lemma 3.1. However, it is worth giving the details in at least one case. Suppose that, at step $n$ of the algorithm, we have $\mathrm{Top}(D_1) < \mathrm{Top}(I)$ and we perform instruction 5, that is we move $\mathrm{Top}(D_2)$ into $I$. Notice that, at step $n$, we must have $\mathrm{Top}(D_1) < \mathrm{Top}(D_2)$, otherwise condition ($\beta$) would hold, and so instruction 3 would be performed by $D^2I$ instead of instruction 5. Therefore, at step $n + 1$, we have $\mathrm{Top}(D_1) < \mathrm{Top}(I)$, because $\mathrm{Top}(I)$ at step $n + 1$ is exactly $\mathrm{Top}(D_2)$ at step $n$.
Proof. We know from Corollary 3.2 that $D_2$ must be empty in order to execute instruction 6. If $D_1$ were not empty, then condition ($\beta$) would be satisfied, thanks to the previous lemma, and so instruction 3 would be performed.
From now on, we aim at showing that, if $\pi$ is a 2-sortable permutation, then there exists a sorting algorithm for $\pi$ which shares many properties with $D^2I$. In the end, we will prove that such properties do characterize algorithm $D^2I$.
Proposition 3.5. Let $\pi$ be a 2-sortable permutation. There exists a sorting algorithm for $\pi$ which performs operation $d_0$ (resp., $d_1$, $d_2$) only if condition ($\gamma$) (resp., ($\beta$), ($\alpha$)) holds.
Proof. Condition ($\alpha$) is obviously necessary in order to perform $d_2$, since $I$ is an increasing stack.
Consider now condition ($\beta$). Again, in order to perform $d_1$ we must have $\mathrm{Top}(D_2) < \mathrm{Top}(D_1)$, since $D_2$ is a decreasing stack. Moreover, we will show that it is necessary to have $\mathrm{Top}(D_1) < \mathrm{Top}(I)$ if we want to perform $d_1$ and eventually sort the input. Indeed, suppose that $\mathrm{Top}(I) < \mathrm{Top}(D_1)$ and set $\mathrm{Top}(I) = b$, $\mathrm{Top}(D_2) = a$ and $\mathrm{Top}(D_1) = c$. There are two cases to analyze. If $a < b$, then performing $d_1$ would force $b$ to reach the output before $a$, which would cause the sorting process to fail. On the other hand, if $b < a$, then $b$ must be the next element to be output. Therefore we can perform $d_3$ as long as $\mathrm{Top}(I)$ is the next element to be output. But once this is no longer the case, necessarily $a < \mathrm{Top}(I)$, and we are thus led to the previous case.
Finally, we analyze condition ($\gamma$). The inequality $\mathrm{Top}(D_1) < \mathrm{Input}$ is necessary in order to perform $d_0$, since $D_1$ is decreasing; the inequality $\mathrm{Input} < \mathrm{Top}(I)$ is necessary as well, by an argument similar to that employed for condition ($\beta$). We will now show that requiring the third constraint of ($\gamma$) to perform $d_0$ does not prevent the procedure from sorting the input. Suppose that the third constraint of ($\gamma$) is not satisfied and set $x = \mathrm{Top}(D_2)$. This means that currently the input begins with a (nonempty) increasing sequence of elements smaller than $x$ whose last term (call it $b$) is bigger than the next one (call it $a$). Of course, we have $a \le x$ as well. First of all, if it were possible to perform $d_1$, then necessarily $\mathrm{Top}(D_2) < \mathrm{Top}(D_1)$; since we are supposing to be able to perform $d_0$, we already know that $\mathrm{Top}(D_1) < \mathrm{Input}$, thus we would have $\mathrm{Top}(D_2) < \mathrm{Input}$; this would imply that the third constraint of ($\gamma$) is satisfied, which is not the case. If we decide to perform $d_0$, we still cannot perform $d_1$ of course, so we can continue to perform $d_0$ until we reach $a$. At that point, the only possible operation to perform would be $d_2$. However, the same configuration could have been reached by performing $d_2$ before starting to execute $d_0$. This essentially means that the set of configurations that are reachable by performing $d_2$ whenever the third constraint of ($\gamma$) is not satisfied is a superset of the set of configurations that are reachable by performing $d_0$ in the same situation. Thus, if the input is 2-sortable, then it is 2-sortable also by an algorithm which executes $d_0$ only if ($\gamma$) is satisfied.
At this point, it is convenient to make a brief recap. What we have shown until now is that, if $\pi$ is a 2-sortable permutation, then there exists a sorting algorithm for $\pi$ having the following features:
• if $\mathrm{Top}(I)$ is the next element to be output, it performs $d_3$;
• it executes instruction 2 of $D^2I$ whenever it is possible to execute it;
• it performs operation $d_0$ (resp., $d_1$, $d_2$) only if condition ($\gamma$) (resp., ($\beta$), ($\alpha$)) holds;
• if no other operation is allowed, it performs $d_3$.
In order to conclude our proof, we now need to show that, if $\pi$ is 2-sortable, then there exists a sorting algorithm ALG for $\pi$ which satisfies the above listed properties and, in addition, performs operations $d_0$, $d_1$, $d_2$ in exactly the same order as algorithm $D^2I$ does. This would mean precisely that ALG coincides with $D^2I$, as desired.
We start by comparing operations $d_1$ and $d_2$. From now on, any sorting algorithm having the properties listed above will be called special, and we will denote a generic special algorithm by ALG.
Proposition 3.6. Let $\pi$ be a 2-sortable permutation. There exists a special sorting algorithm ALG for $\pi$ for which $(\beta)\, d_1 \triangleright (\alpha)\, d_2$.
Proof. Suppose that, at a certain point of the execution of ALG on $\pi$, it is possible to perform both $d_1$ and $d_2$. Clearly, we can suppose that neither instruction 1 nor instruction 2 of $D^2I$ can be executed by ALG. This implies that there must exist an element $a$ of $\pi$ still in the input, which is smaller than $\mathrm{Top}(D_2)$. Set $x = \mathrm{Top}(D_2)$ and $y = \mathrm{Top}(D_1)$. If we performed $d_2$, then we would have $x = \mathrm{Top}(I) < \mathrm{Top}(D_1) = y$ (since we are supposing that it was possible to perform also $d_1$). This means that $a$ could overcome $y$ only when $y$ is already inside $I$, and this can happen only if $x$ has already been output. This however would cause the output to be unsorted, since in the output $x$ would come before $a$, and $x > a$. We can thus conclude that, in the hypothesis of the proposition, performing $d_2$ would make the sorting process fail, and so $(\beta)\, d_1 \triangleright (\alpha)\, d_2$, as desired.
We can now observe that, if $\pi$ is a 2-sortable permutation, then there exists a special sorting algorithm ALG for $\pi$ such that Lemmas 3.1 and 3.3 and Corollaries 3.2 and 3.4 hold. In fact, all the proofs of the above mentioned results do not depend on the specific algorithm $D^2I$, except for Lemma 3.3, where the fact that $(\beta)\, d_1 \triangleright (\alpha)\, d_2$ is explicitly used. However, in view of the previous proposition, without loss of generality we can assume that there is a special sorting algorithm for $\pi$ which satisfies such a condition. In what follows, a special sorting algorithm with this additional property will be called extraspecial (and still denoted ALG).
Before concluding our tour de force, we still need a final preparatory result.

Proposition 3.7. A permutation $\pi$ is 2-sortable if and only if it does not contain any occurrence $bca$ of the pattern 231 such that, at some step of any extraspecial sorting algorithm for $\pi$, we have $b = \mathrm{Top}(I)$ and $c$ and $a$ are still in the input.
Proof. Suppose that $\pi$ is 2-sortable and that $bca$ is an occurrence of 231 in $\pi$. Moreover, suppose that, at some point of the extraspecial sorting algorithm ALG, we have $b = \mathrm{Top}(I)$ and $c$ and $a$ are still in the input. Then, if we continue the execution of ALG, since the first two stacks are decreasing, $a$ can overcome $c$ only inside the increasing stack; but $c$ can enter the increasing stack only if $b$ is in the output. This will cause $b$ to be output before $a$, and so the input permutation would eventually not be sorted, which is a contradiction.
On the other hand, suppose that $\pi$ is not 2-sortable and let ALG be any extraspecial algorithm. Since $\pi$ is not 2-sortable, at some point ALG outputs an element $y$ which is not the correct one; in other words, there exists $x < y$ which is still inside one of the decreasing stacks or in the input. However, the decreasing stacks must be empty, as a consequence of Corollaries 3.2 and 3.4, hence $x$ must be in the input. Moreover, if $z$ is the first element of the input when $y$ goes to the output, then necessarily $z > y$, since otherwise condition ($\gamma$) would be satisfied (which is not possible, since ALG executes instruction 6). Thus, in particular, $z \ne x$, and the elements $yzx$ constitute an occurrence of 231 in $\pi$ which violates the required condition.
We are finally ready to conclude our proof of the optimality of $D^2I$.
Proof. Let $\pi$ be a sortable permutation. Then there exists an extraspecial algorithm ALG which sorts $\pi$. The only possibility for ALG to be different from $D^2I$ is that the order in which ALG performs operations $d_0$, $d_1$ and $d_2$ may be different. However, we already know that, for an extraspecial algorithm, $(\beta)\, d_1 \triangleright (\alpha)\, d_2$. What remains to do is to compare $d_0$ with $d_2$ and $d_0$ with $d_1$.
First, suppose that ALG is in a certain configuration in which both $d_0$ and $d_2$ can be performed. We can further assume that condition ($\beta$) is not satisfied, otherwise $d_2$ would certainly not be performed, as a consequence of Proposition 3.6. Set $c = \mathrm{Top}(I)$, $b = \mathrm{Top}(D_2)$ and $a = \mathrm{Top}(D_1)$, and call $y$ the first element of the current input which is greater than $b$ (if it exists). Since we are supposing that condition ($\gamma$) is satisfied, the sequence from the beginning of the current input to $y$ is increasing. If there were an element $x < b$ following $y$, then performing $d_2$ would prevent us from successfully sorting the permutation, as a consequence of Proposition 3.7 (the three elements $b$, $y$ and $x$ would constitute the "bad" occurrence of 231). Therefore, also keeping in mind that $b > a$ (since $a < c$ as a consequence of condition ($\gamma$) and we are supposing that condition ($\beta$) is not satisfied), we can assert that the set of all numbers contained in $D_1$, $D_2$ and in the input before $y$ (if such an element exists) is precisely the set of all numbers $\le b$ which are not already in the output. It is now possible to show that, using algorithm $D^2I$, such numbers reach the output before any other number makes any move. Indeed, $D^2I$ performs $d_0$ and pushes the first number of the current input inside $D_1$ (above $a$). Then the algorithm keeps performing $d_0$ until $y$ is reached (in fact condition ($\beta$) keeps failing to be satisfied, since all numbers before $y$ in the input are $< b$); at this point, $D_1$ and $D_2$ contain precisely the next elements to be output, so $D^2I$ performs instruction 2. We can thus conclude that, in the considered configuration, using algorithm $D^2I$ does not prevent the permutation from being sorted, hence performing $d_0$ instead of $d_2$ is irrelevant (if not necessary).
Now suppose that ALG is in a certain configuration in which both $d_0$ and $d_1$ can be performed. Letting $\mathrm{Top}(I) = d$, $\mathrm{Top}(D_2) = a$, $\mathrm{Top}(D_1) = b$ and $\mathrm{Input} = c$, we then know that $a < b < c < d$.
If ALG chose to perform $d_0$, then $c$ would be pushed into $D_1$, with $b$ still in the same stack. Clearly, sooner or later, there would be a step of ALG moving $b$ from $D_1$ to $D_2$. Let us now focus on this exact moment (when $b$ is pushed into $D_2$) and call the resulting configuration $\aleph$: we claim that, if we modify ALG by just performing $d_1$ instead of $d_0$ in the configuration described at the beginning of the present paragraph, we can reach the same configuration $\aleph$. So suppose that, after having performed $d_0$ and before moving $b$ to $D_2$, the elements that ALG has pushed into $D_1$ are $c, c_1, \ldots, c_k$. Clearly, when $b$ is moved into $D_2$, such elements must all be inside $I$, since they are all greater than $b$ and $D_2$ is decreasing. If, in the meanwhile, $a$ has not been pushed into $I$, then we can reach the same configuration by first moving $b$ into $D_2$ (thus performing $d_1$) and then moving all the elements $c, c_1, \ldots, c_k$ into $I$ by performing the same sequence of operations. Otherwise, if $a$ had been moved into $I$ before all elements $c, c_1, \ldots, c_k$ reached $I$ (possibly together with some further elements from $D_2$), this would have been done in a configuration in which both $d_0$ and $d_1$ were not legal (since we have already shown that both $d_0 \triangleright d_2$ and $d_1 \triangleright d_2$). This is however impossible, since we will now see that $d_1$ is certainly legal. Indeed, focusing on the instant immediately before $a$ is pushed into $I$, since $b$ is in $D_1$, we have $\mathrm{Top}(D_1) \ge b$ (since $D_1$ is decreasing) and we know that $b > a$, hence $\mathrm{Top}(D_2) < \mathrm{Top}(D_1)$. Moreover, since ALG is extraspecial, we also know from Lemma 3.3 that $\mathrm{Top}(D_1) < \mathrm{Top}(I)$. Therefore condition ($\beta$) is satisfied, hence $d_1$ is legal. Summing up, we have shown that, if both $d_0$ and $d_1$ are legal, then performing $d_0$ leads to a configuration which can be reached also by performing $d_1$ instead. As a consequence, performing $d_1$ instead of $d_0$ preserves sortability.

Some further algorithms
As we have seen in the previous section, there exists an optimal algorithm for the $D^2I$ machine which is able to sort all sortable permutations. However, it is not a very easy one: in order to understand which operation should be performed at each step, one needs to check certain conditions, which in some cases are rather involved. Another approach is to consider some much simpler algorithms, which of course fail to be optimal, but have the nice feature of being more intuitive.
In the present section we briefly sketch two very natural algorithms, one of which turns out to be "too easy" whereas the other one proves to be "too hard".

A left-greedy algorithm
Our first proposal is a left-greedy procedure for the $D^kI$ machine: at each step, we perform the operation $d_j$ having maximum index $j$ among the legal available operations. In other words, such a left-greedy procedure is characterized by the following chain of priorities: $d_{k+1} \triangleright d_k \triangleright \cdots \triangleright d_1 \triangleright d_0$.
Setting $\mathrm{Sort}^{(lg)}_k = \{\pi : \pi$ is sorted by the left-greedy procedure$\}$, it turns out that $\mathrm{Sort}^{(lg)}_k$ is in fact a class which we are able to characterize completely. The choice of a left-greedy strategy, instead of a right-greedy one, is suggested by the results contained in [9].
Proof. We start by proving that if $\pi$ contains 231, then $\pi \notin \mathrm{Sort}^{(lg)}_k$. Let $bca$ be an occurrence of the pattern 231 in $\pi$. If $b$ is pushed into $I$ before $c$, then $\pi$ cannot be sorted, as a consequence of Lemma 2.3. Then suppose that $b$ is stuck in a decreasing stack $D_j$, for some $j$. In particular, since the algorithm is left-greedy, this implies that $D_k$ is not empty (more precisely, each stack $D_i$, with $i \ge j$, has to contain at least one element). Let $z$ be the first element that reaches $D_k$ without going directly into $I$ and consider the step in which $z$ is pushed into $D_k$; again because we are using a left-greedy strategy, the next stack $I$ cannot be empty at that moment. Let $y = \mathrm{Top}(I)$. Note that $y < z$, otherwise $z$ would be pushed into $I$. Moreover, since $y$ is not pushed into the output, there must still be an element $t < y$ that is not in the output (and neither in $I$, of course). In particular, $t$ follows $z$, because $z$ is the top of $D_k$. We are thus in a position to apply Lemma 2.3 with the three elements $t < y < z$, which is enough to conclude that $\pi \notin \mathrm{Sort}^{(lg)}_k$.
Conversely, we have to show that, if $\pi \notin \mathrm{Sort}^{(lg)}_k$, then $\pi$ contains the pattern 231. Factorize $\pi$ as $\pi = \alpha_1 \alpha_2 \cdots \alpha_r$, where each $\alpha_i$ is a maximal decreasing sequence. W.l.o.g., we can suppose that, if $\alpha_1$ contains $i$ elements, then $\alpha_1 \ne i(i-1) \cdots 21$; otherwise, in fact, we could simply remove $\alpha_1$ and consider the remaining permutation: since by hypothesis $\pi$ is not sortable, there must be an index $h$ such that $\alpha_h$ is not the set of the next elements to be output. So suppose that $\alpha_1 = \pi_1 \pi_2 \cdots \pi_i \ne i(i-1) \cdots 21$, hence $\pi_i < \pi_{i+1}$. All the elements of $\alpha_1$ are pushed into the increasing stack, whereas $\pi_{i+1}$ remains stuck in $D_k$. Notice that the hypothesis on $\alpha_1$ implies that not all elements inside the increasing stack can be output, since there is at least one element $x$ following $\pi_{i+1}$ in $\pi$ which is smaller than all elements of $\alpha_1$. Such an element $x$ is still in the input when $\pi_{i+1}$ reaches $D_k$ (since all the remaining decreasing stacks are clearly empty). Call $y$ the top of the increasing stack when $\pi_{i+1}$ reaches $D_k$: then the three elements $y$, $\pi_{i+1}$ and $x$ are an occurrence of the pattern 231 in $\pi$.
As a consequence of the previous proposition, our left-greedy procedures sort precisely the same permutations as Stacksort does. Thus, in a sense, adding any number of decreasing stacks before an increasing one does not improve the sorting power of the machine, provided that we always perform the leftmost legal operation. This does not mean, however, that the left-greedy algorithms are equivalent to Stacksort. Indeed, taking for instance $k = 1$ and the input permutation 2341, the left-greedy $D^kI$ machine returns 2134 as output, whereas Stacksort returns 2314. In other words, while the preimage of the identity permutation is the same for Stacksort and for every left-greedy $D^kI$ machine, the preimages of other permutations are in general different. It would certainly be interesting to investigate more deeply the preimage of a generic permutation for the left-greedy $D^kI$ machine.
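The left-greedy procedure is simple enough to be simulated directly. The sketch below is our own illustrative model, not code from the paper: the name `left_greedy` is hypothetical, stacks are Python lists whose last entry is the top, and the output operation is treated as legal either when it emits the smallest element not yet output or when no other operation is legal, following the convention stated in Section 2.

```python
def left_greedy(perm, k):
    """Run the left-greedy procedure with k decreasing stacks; return the output."""
    n = len(perm)
    stacks = [[] for _ in range(k + 1)]  # stacks[k] models I
    pending = set(perm)                  # elements not yet output
    pos, output = 0, []

    def can_push(idx, x):
        s = stacks[idx]
        if not s:
            return True
        # Decreasing stacks keep their largest element on top, I its smallest.
        return x > s[-1] if idx < k else x < s[-1]

    def emit():
        x = stacks[k].pop()
        output.append(x)
        pending.discard(x)

    while pending:
        if stacks[k] and stacks[k][-1] == min(pending):
            emit()  # highest priority: output the smallest pending element
            continue
        for i in range(k - 1, -1, -1):  # then d_k, ..., d_1
            if stacks[i] and can_push(i + 1, stacks[i][-1]):
                stacks[i + 1].append(stacks[i].pop())
                break
        else:
            if pos < n and can_push(0, perm[pos]):  # then d_0
                stacks[0].append(perm[pos])
                pos += 1
            else:
                emit()  # forced output: no other operation is legal
    return output

# The example from the text: k = 1 and input 2341.
assert left_greedy((2, 3, 4, 1), 1) == [2, 1, 3, 4]
# With k = 0 the same procedure reproduces the Stacksort output 2314.
assert left_greedy((2, 3, 4, 1), 0) == [2, 3, 1, 4]
```

Running the function on all permutations of a small length and comparing with 231-avoidance gives a quick sanity check of the characterization discussed above.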

A quasi left-greedy algorithm.
There is a better way to design an algorithm which is quasi left-greedy and is able to sort more permutations than the previous one. The idea is to give the increasing stack a privileged role, using it only when no other operation is possible. Formally, at each step we choose to perform the first legal operation according to the following priority rule:
This quasi left-greedy procedure is similar to the optimal algorithm for the D 2 I machine described in Section 3, the only difference being that no additional conditions are required in order to perform operations (other than the fact that each operation can be performed only if it is legal, of course).
In analogy with the previous case, define Sort (qlg) k to be the set of permutations sorted by the quasi left-greedy algorithm with k decreasing stacks; such permutations will be called qlg-k-sortable permutations. We observe immediately that the permutation 231 is qlg-2-sortable. Unfortunately, Sort (qlg) k is not in general a permutation class, except for the case k = 1, for which we have the following result, whose proof can be found in [4].
When k > 1 things become much more involved. As an example, for k = 2, the permutation 631425 is qlg-2-sortable, whereas its subpermutation 52314 is not. In fact, a complete characterization of Sort (qlg) 2 appears to be quite hard. In the rest of the section we will prove some partial results that should make abundantly clear that understanding the set of qlg-k-sortable permutations is a very hard task.
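The containment used in the example for k = 2 can be checked mechanically. The following Python sketch tests pattern containment by brute force; the helper names `standardize` and `contains` are illustrative, and the sketch says nothing about sortability itself, which depends on the quasi left-greedy algorithm.

```python
from itertools import combinations

def standardize(seq):
    """Replace the entries of seq by their ranks 1..len(seq)."""
    rank = {v: r for r, v in enumerate(sorted(seq), start=1)}
    return tuple(rank[v] for v in seq)

def contains(perm, pattern):
    """True if some subsequence of perm is order-isomorphic to pattern."""
    k = len(pattern)
    # combinations() preserves the left-to-right order of perm,
    # so each tuple it yields is a genuine subsequence.
    return any(
        standardize(sub) == tuple(pattern)
        for sub in combinations(perm, k)
    )

# Deleting the entry 1 from 631425 leaves 63425, which standardizes to 52314.
print(standardize((6, 3, 4, 2, 5)))                   # (5, 2, 3, 1, 4)
print(contains((6, 3, 1, 4, 2, 5), (5, 2, 3, 1, 4)))  # True
```

Since 631425 is sortable and its pattern 52314 is not, the set Sort (qlg) 2 fails to be closed downwards in the pattern poset.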

Proof.
We start by proving that any sortable permutation π cannot contain the pattern 3214. Suppose that cbad is an occurrence of 3214 in π and let m be the smallest element that follows b and precedes d. We focus on the instant when d is pushed into D 1 . Notice that:
• c has to be contained in I, because c > b > a and the stacks D 1 and D 2 are decreasing;
• m is still in D 1 ; in fact it cannot go directly into D 2 because there are elements in D 2 which are larger than it (at least b or the elements that replaced it). Moreover it is the smallest element before d, so the next element of the input cannot force m to be pushed into D 2 .
Therefore we can apply Lemma 2.3 with the elements c, d and m and conclude that π cannot be sorted.
We now consider the pattern 52314. Let ebcad be an occurrence of 52314 in π. Without loss of generality, we can suppose that e is the rightmost element of π preceding b which plays the role of 5. In fact, given any other occurrence êbcad of 52314, with ê to the right of e, any extension of such an occurrence to one of the desired patterns would also give a similar extension of ebcad. In other words, we can suppose that there is no element greater than e between e and b in π. The fact that π is sortable, together with Lemma 2.3, guarantees that, when d is pushed into D 1 , one of the two following configurations holds:
In the first case, when a is pushed into I (with d still in the input, of course), D 1 has to be nonempty and the next element of the input z has to be smaller than a. So the elements b, a, z, d form an occurrence of the pattern 3214, which contradicts what we have just proved above.
We now focus on the second case. When d is pushed into D 1 , we have:
Suppose there is an element x between b and c in π such that x < b. If x > a, then bxad is an occurrence of 3214, which is again a contradiction. If x < a, we have that ebxcad is an occurrence of 631425, as desired. Otherwise, suppose that x > b for each x between b and c in π. This implies that b is pushed directly into D 2 by the algorithm, because c lies above b in D 2 and no other element can push b into D 2 before c enters. As a consequence, e must be already in I when b is pushed into D 1 , thus, when e is pushed into I, setting y 1 = top(D 1 ) and denoting with y 2 the next element of the input, we have e > y 1 > y 2 . Moreover, it cannot be y 2 = b, otherwise b could not go directly into D 2 , because it would be blocked by the smallest element t inside D 1 which is greater than y 2 (such an element exists since y 1 > y 2 ). We are now left with two distinct cases: e either precedes or follows y 1 .
1. If e precedes y 1 , we have the pattern ey 1 y 2 bcad. Note that y 1 < d as a consequence of our choice of e, and also y 1 < b, otherwise y 1 bad would be an occurrence of 3214. Therefore we have the following possibilities:
• y 1 > y 2 > a, hence y 1 y 2 ad is an occurrence of 3214, against the fact that π is sortable;
• y 1 > a > y 2 , hence ey 1 y 2 bcad is an occurrence of 7314526.
2. If e follows y 1 , a thorough case by case analysis, similar to the previous one, leads to the remaining four patterns 72814536, 73814526, 82714536 and 83714526.
The above proposition cannot be inverted, since there exist permutations that are not qlg-2-sortable, yet satisfy the two conditions listed above. An example is given by 11 2 10 1 4 9 3 6 7 5 8; notice, in particular, that it contains three occurrences of 52314 and each of them can be extended to one of the above barred patterns (more specifically, two of the occurrences can be extended to 82714536, whereas the remaining one can be extended to 7214536).
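The count of occurrences of 52314 in this example can be confirmed by exhaustive search. The Python sketch below lists every occurrence; the name `occurrences` is illustrative, and the sketch does not attempt to check the extensions to barred patterns.

```python
from itertools import combinations

def occurrences(perm, pattern):
    """List all subsequences of perm order-isomorphic to pattern."""
    k = len(pattern)
    found = []
    for sub in combinations(perm, k):
        # Standardize the subsequence: replace each entry by its rank.
        rank = {v: r for r, v in enumerate(sorted(sub), start=1)}
        if tuple(rank[v] for v in sub) == tuple(pattern):
            found.append(sub)
    return found

perm = (11, 2, 10, 1, 4, 9, 3, 6, 7, 5, 8)
occ = occurrences(perm, (5, 2, 3, 1, 4))
print(len(occ))  # 3
for o in occ:
    print(o)
# (11, 6, 7, 5, 8)
# (10, 6, 7, 5, 8)
# (9, 6, 7, 5, 8)
```

All three occurrences share the entries 6, 7, 5, 8 and differ only in the element playing the role of 5.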
In fact, starting from the permutation 52314, it is possible to construct a sequence of permutations of increasing lengths whose sortability depends on the parity of the length. To be more precise, for m ≥ 1, define the permutation γ m ∈ S 3m+2 as follows: γ m = 3m + 2, 2, 3m + 1, 1, . . . In other words, starting from γ 1 = 52314, γ m is obtained by inserting a new occurrence P 1 = 2, 3m + 1, 1 of the pattern 231 between the first and the second element of γ m−1 , then suitably rescaling the remaining elements. We have the following result:
1. γ i is a pattern of γ i+1 , for each i ≥ 1.
2. γ i ∈ Sort (qlg) 2 if and only if i is even.
Proof (sketch). The first statement follows directly from the definition of γ m . To prove the second one, we analyze how the quasi left-greedy algorithm manages the occurrences P k of the pattern 231 in γ m . The crucial remark is that, when k is even, the elements of P k can be pushed into the decreasing stacks without extracting other elements, whereas this cannot be done when k is odd. Set P k = P k (2)P k (3)P k (1), for k = 1, . . . , m. The behavior of the algorithm in both cases is represented in Figure 2, Figure 3 and Figure 4.
[Figure: The behavior of the algorithm on P k , when k is even. Here the algorithm pushes P k into D 1 and D 2 without extracting other elements.]
As a consequence, it is easy to check that, if m is even, then the last 4 elements of γ m can be pushed into the decreasing stacks, so the permutation is eventually sorted. On the other hand, if m is odd, then the second-to-last element 2m − 1 forces 2m + 1 to be pushed into the increasing stack immediately above 2m + 3, and the final element 2m + 2 will be output in the wrong position. Therefore the algorithm does not sort γ m .
The existence of an infinite chain of permutations which are alternately sortable and nonsortable suggests that it should be quite difficult to obtain a simple characterization of Sort (qlg) 2 ; it is also conceivable that it should be possible to adapt the above proposition to larger values of k, thus obtaining similar (negative) results.
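The recursive description of γ m can be turned into a small Python sketch. It assumes that "suitably rescaling" means moving the first entry to the new maximum 3m + 2 and increasing every other entry of γ m−1 by 2, which is the only order-preserving choice once the values 1, 2 and 3m + 1 are taken by the new block; the names `gamma` and `standardize` are illustrative.

```python
def gamma(m):
    """Build gamma_m from gamma_1 = 52314 by the recursive insertion rule."""
    if m == 1:
        return [5, 2, 3, 1, 4]
    prev = gamma(m - 1)
    # New maximum, then the inserted block 2, 3m+1, 1, then the tail of
    # gamma_{m-1} with every entry increased by 2 (assumed rescaling).
    return [3 * m + 2, 2, 3 * m + 1, 1] + [x + 2 for x in prev[1:]]

def standardize(seq):
    """Replace the entries of seq by their ranks 1..len(seq)."""
    rank = {v: r for r, v in enumerate(sorted(seq), start=1)}
    return [rank[v] for v in seq]

print(gamma(2))  # [8, 2, 7, 1, 4, 5, 3, 6]
print(gamma(3))  # [11, 2, 10, 1, 4, 9, 3, 6, 7, 5, 8]

for m in range(2, 8):
    g = gamma(m)
    # gamma_m is a permutation of 1..3m+2.
    assert sorted(g) == list(range(1, 3 * m + 3))
    # Removing the inserted block and standardizing recovers gamma_{m-1},
    # so gamma_{m-1} is a pattern of gamma_m.
    assert standardize(g[:1] + g[4:]) == gamma(m - 1)
    # The last two entries are 2m-1 and 2m+2, as used in the proof sketch.
    assert g[-2:] == [2 * m - 1, 2 * m + 2]
```

Under this reading, γ 3 coincides with the permutation 11 2 10 1 4 9 3 6 7 5 8 given earlier as a counterexample to the converse of the proposition on 3214 and 52314.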

Final remarks
In the present work we started the analysis of a sorting device consisting of k decreasing stacks followed by an increasing one, generalizing the case k = 1 addressed in [10]. In general, the problem of characterizing sortable permutations in terms of forbidden patterns seems quite hard, due to the fact that the basis is infinite, as shown in Theorem 2.4. We have however been able to describe an optimal algorithm in the case k = 2 which can sort every sortable permutation. Such an algorithm employs a strategy which is surely nontrivial. Thus we have also briefly discussed some simpler algorithms, which cannot sort every sortable permutation but are easier to describe.
There are of course several items that remain to be investigated. Some of them are the following:
• determine the complexity of the optimal algorithm for the D 2 I machine;
• enumerate sortable permutations, both in the general case and in the restricted (left-greedy and quasi left-greedy) cases;
• study the machine consisting of two passes through the DI machine described in [10]: are there analogies with West's 2-stack-sortable permutations?