A new lower bound for the Towers of Hanoi problem

More than a century after its proposal, the Towers of Hanoi puzzle with 4 pegs was solved by Thierry Bousch in a breakthrough paper in 2014. The general problem with p pegs is still open, with the best lower bound on the minimum number of moves due to Chen and Shen. We use some of Bousch's new ideas to obtain an asymptotic improvement on this bound for all p ≥ 5.


Introduction
The Towers of Hanoi is a puzzle invented by the French mathematician Édouard Lucas in 1883 ([6]). The setup consists of 3 pegs and N disks of different sizes, arranged on the first peg in increasing order according to size. The goal is to move the disks from the first peg to another in as few moves as possible, such that the following three rules are always obeyed: (R1) only one disk can be moved at a time; (R2) each move consists of taking the topmost disk on a peg and placing it on another peg; (R3) a disk may only be placed on top of a larger disk or on an empty peg.
It is easy to see that the solution requires 2^N − 1 moves. The puzzle is very popular, and is frequently used to teach recursive algorithms to first-year computer science students.
Several variations of the original problem have been proposed ([11]), with one possibility being to increase the number of pegs available in the game. The puzzle with 4 pegs was first introduced by Dudeney in 1908 in his book The Canterbury Puzzles, under the name "Reve's Puzzle". In 1939, the general problem with p pegs and N disks was proposed in the American Mathematical Monthly in the Advanced Problems section, as Problem 3918 ([9]). Two years later, the journal published the proposer's (B. M. Stewart) claimed solution [10], as well as one solution submitted by a reader (J. S. Frame) [4]. The two solutions presented essentially equivalent formulas for the minimum number of moves needed, as well as an algorithm achieving the given bound. However, as noted by the Editors of the Monthly, the two proofs rested on an unproven assumption about the optimality of the algorithm.
In fact, proving that the Frame-Stewart algorithm is best possible has since become a notorious open problem ([7]). However, in 2014, more than a century after Dudeney's book appeared, the case p = 4 was finally solved by Bousch ([2]) in a very elegant way. We will say more about his beautiful solution later, but first let us describe the Frame-Stewart algorithm.
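Given N disks and p pegs, the algorithm chooses an integer 1 ≤ ℓ < N that minimizes the number of steps in the following procedure:
• Move the top ℓ disks from the initial peg to an intermediate peg, using p pegs.
• Move the remaining N − ℓ disks from the initial peg to the goal peg, using the p − 1 pegs still available.
• Move the initial ℓ disks from the intermediate peg to the goal peg, using p pegs.
Let Φ(p, N) denote the number of steps taken by the Frame-Stewart algorithm for N disks and p pegs. Then we have the recursive formula
Φ(p, N) = min_{1 ≤ ℓ < N} {2Φ(p, ℓ) + Φ(p − 1, N − ℓ)},
with initial data Φ(3, N) = 2^N − 1 and Φ(p, 1) = 1.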
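As an illustration, the Frame-Stewart recursion can be evaluated directly. The following Python sketch is not part of the original argument; the function name frame_stewart is ours, and the code simply memoizes the standard recursion with the initial data Φ(3, N) = 2^N − 1 and Φ(p, 1) = 1.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def frame_stewart(p: int, n: int) -> int:
    """Number of moves Phi(p, n) used by the Frame-Stewart algorithm."""
    if p < 3 or n < 1:
        raise ValueError("need p >= 3 pegs and n >= 1 disks")
    if p == 3:
        return 2 ** n - 1  # initial data: Phi(3, N) = 2^N - 1
    if n == 1:
        return 1           # initial data: Phi(p, 1) = 1
    # Park l disks using p pegs, move the other n - l disks using p - 1 pegs,
    # then bring the l parked disks back on top using p pegs.
    return min(
        2 * frame_stewart(p, l) + frame_stewart(p - 1, n - l)
        for l in range(1, n)
    )


if __name__ == "__main__":
    # First values for 4 pegs: [1, 3, 5, 9, 13, 17, 25, 33, 41, 49]
    print([frame_stewart(4, n) for n in range(1, 11)])
```

The minimum is taken by brute force over all ℓ, which is quadratic in N for each p; this is enough for experimenting with small cases.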
Let H(p, N) denote the minimum number of steps needed to move N disks from one peg to another, using p pegs, according to the rules (R1)-(R3). We already know that H(3, N) = Φ(3, N). Building upon a result of Szegedy ([12]), Chen and Shen showed the following.
Theorem 1 (Chen-Shen, [3]). For all p ≥ 3 and , and so by the above theorem H(p, N ) has the same growth rate as Φ(p, N ). Theorem 1 gives the best known lower bound on H(p, N ). Apart from this, Bousch has proved the following: The main result of this note is the following asymptotic improvement of Theorem 1.
Theorem 3. Let p ≥ 4 and N ≥ 1, and write N − 1 = ∆_p m + ∆_{p−1} t + r, where m ≥ 0 is maximal with ∆_p m ≤ N − 1, t ≥ 0 is maximal with ∆_{p−1} t ≤ N − 1 − ∆_p m, and r is the remainder (this decomposition exists and is unique). Then we have H(p, N) ≥ (m + t) 2^{m − 2(p − 2)}.

The proof relies on the following idea, introduced by Szegedy. Rather than finding a lower bound for H(p, N), one can try to bound the length Γ(p, N) of the shortest sequence of steps that moves every disk at least once (here we also minimize over all possible starting configurations). Clearly Γ(p, N) is then a lower bound for H(p, N), as every disk must move at least once from the initial peg to the destination peg in the Hanoi problem. Szegedy has shown the following.
The main step in the proof of Theorem 3 is the following result, which may be of independent interest.
In fact we believe that the following holds.

Definitions and auxiliary results
For n ∈ ℕ let [n] = {0, 1, …, n − 1} denote the set of natural numbers smaller than n. Given p pegs and N disks, we always label the pegs using numbers from [p], and similarly the disks using numbers from [N].
We now give a more precise description of Φ as follows.
In the case p = 4, this can be written more compactly as follows.
Note for later use the following property of ∆_p:

Let p ≥ 3. We call an arrangement of disks on p pegs a configuration if no disk is placed on top of a larger one. Note that the set of configurations of N disks can be identified with the set [p]^{[N]} of functions [N] → [p]. In particular, given a configuration u, we let u^{−1}(x) denote the set of disks placed on peg x. Furthermore, u|_S represents the configuration obtained from u by deleting all disks in [N] \ S.
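To make the notation concrete, a configuration can be stored as a tuple u with u[d] the peg of disk d. The Python sketch below is only an illustration; the names Config, preimage and restrict are ours, and representing u|_S as a dictionary that keeps the original disk labels is one possible convention, not necessarily the one used in the paper.

```python
from typing import Dict, Iterable, Set, Tuple

# u[d] is the peg holding disk d; disk 0 is the smallest, disk N - 1 the largest.
Config = Tuple[int, ...]


def preimage(u: Config, x: int) -> Set[int]:
    """The set u^{-1}(x) of disks placed on peg x."""
    return {d for d, peg in enumerate(u) if peg == x}


def restrict(u: Config, s: Iterable[int]) -> Dict[int, int]:
    """The restriction u|_S: delete all disks outside S, keeping disk labels."""
    keep = set(s)
    return {d: peg for d, peg in enumerate(u) if d in keep}


if __name__ == "__main__":
    u = (0, 2, 0, 1)               # 4 disks on pegs 0, 2, 0, 1
    print(preimage(u, 0))          # disks on peg 0
    print(restrict(u, range(3)))   # forget the largest disk
```

Since disks on one peg are always stacked by size, the function u determines the arrangement completely, which is why no ordering information needs to be stored.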
We define the Hanoi graph H(p, N) as having vertex set [p]^{[N]}, and an edge between two vertices u and v if the corresponding configurations can be obtained from one another by a single disk move. We consider H(p, N) to be a metric space with the usual metric that has distance 1 between any two adjacent vertices.
The path γ is called essential if every disk is moved by γ at least once. Note that in this case the path γ^*: [T] → H(p, N) given by γ^*(t) := γ(T − t − 1) is also essential. By definition, Γ(p, N) := min{ℓ(γ) : γ is an essential path in H(p, N)}.
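For small parameters the Hanoi graph can be explored exhaustively. The following Python sketch is an illustration only (the names neighbours, distance and tower_distance are ours): it generates the legal moves from a configuration and uses breadth-first search to compute the graph distance, so that the distance between two single-peg towers can be compared with the Frame-Stewart numbers in small cases.

```python
from collections import deque
from typing import Dict, Iterator, Tuple

Config = Tuple[int, ...]  # u[d] = peg of disk d, disk 0 is the smallest


def neighbours(u: Config, p: int) -> Iterator[Config]:
    """Configurations obtained from u by a single legal disk move."""
    top: Dict[int, int] = {}
    # Scan from the largest disk down, so the smallest disk on each peg wins.
    for d in range(len(u) - 1, -1, -1):
        top[u[d]] = d
    for x, d in top.items():
        for y in range(p):
            # Disk d may go to peg y if y is empty or its top disk is larger.
            if y != x and top.get(y, len(u)) > d:
                yield u[:d] + (y,) + u[d + 1:]


def distance(u: Config, v: Config, p: int) -> int:
    """Graph distance between u and v in the Hanoi graph, by BFS."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        w = queue.popleft()
        if w == v:
            return dist[w]
        for z in neighbours(w, p):
            if z not in dist:
                dist[z] = dist[w] + 1
                queue.append(z)
    raise ValueError("target configuration not reachable")


def tower_distance(p: int, n: int) -> int:
    """Moves needed to transfer a tower of n disks from peg 0 to peg 1."""
    return distance((0,) * n, (1,) * n, p)


if __name__ == "__main__":
    print([tower_distance(4, n) for n in range(1, 7)])
```

The search visits up to p^N configurations, so it is only practical for a handful of disks.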
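The quantity Γ(p, N) can also be computed by brute force for very small p and N. The sketch below is a hedged illustration, not the method of the paper: it assumes that a path is a sequence of configurations in which consecutive ones are adjacent, and that its length is the number of moves. It runs a breadth-first search over pairs (configuration, set of disks moved so far), starting from every configuration at once; the function name gamma is ours.

```python
from collections import deque
from itertools import product
from typing import Dict, Tuple

State = Tuple[Tuple[int, ...], int]  # (configuration, bitmask of moved disks)


def gamma(p: int, n: int) -> int:
    """Length of a shortest move sequence that moves each of n disks at least once.

    Assumption: length counts moves, and every starting configuration is allowed.
    """
    full = (1 << n) - 1
    starts = [(u, 0) for u in product(range(p), repeat=n)]
    dist: Dict[State, int] = {s: 0 for s in starts}
    queue = deque(starts)
    while queue:
        u, moved = queue.popleft()
        if moved == full:
            return dist[(u, moved)]
        top: Dict[int, int] = {}
        for d in range(n - 1, -1, -1):   # smallest disk on each peg ends up in top
            top[u[d]] = d
        for x, d in top.items():
            for y in range(p):
                if y != x and top.get(y, n) > d:
                    nxt = (u[:d] + (y,) + u[d + 1:], moved | (1 << d))
                    if nxt not in dist:
                        dist[nxt] = dist[(u, moved)] + 1
                        queue.append(nxt)
    raise ValueError("no essential path found")


if __name__ == "__main__":
    print(gamma(3, 3), gamma(4, 3))
```

For example, with 3 pegs and 3 disks one may start with disks 0 and 2 on peg 0 and disk 1 on peg 1, then move disk 1 to peg 2, disk 0 to peg 2 and disk 2 to peg 1, so three moves suffice in that case.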
The structure of shortest paths (geodesics) in the Hanoi graph has been studied before (see [1]). Note that an essential path need not be a geodesic.
We now introduce a crucial definition, due to Bousch.
and further

The function Ψ(E) is well-defined, as Ψ_L(E) becomes negative for large L, and Ψ_0(E) = |E|. Bousch showed the following.
Proof. The first identity is Lemma 2.2 from [2]. The second follows from (1).

whenever c is a midpoint configuration, but we will not use this stronger statement.
We shall also need the following two lemmas.
Finally, we shall need the following recursive lower bound for Γ.
Moreover, this inequality is tight.
Proof. Let γ: [T] → H(4, N) be a shortest path between u and v. Let t_1 ∈ [T − 1] be the first time when the disk N − 1 moves to one of the pegs 2 and 3. Then we may assume without loss of generality that γ_{t_1}(N − 1) = 0 and γ_{t_1+1}(N − 1) = 2. Set Note that A ∪ B = [N − 1], as no other disk besides N − 1 is on pegs 0 or 2 at time t_1. By Theorem 7, d(γ(t_1), γ(0)) ≥ Ψ(B), as pegs 2 and 3 are empty in γ(0) = u. Similarly, by Theorem 7 and the fact that all disks are placed on pegs 2 and 3 in v, Consequently by Lemma 10, and the fact that the disk N − 1 moves once at time t_1,

We now show that the inequality is tight. Let a, b ∈ ℕ be arbitrary with a + b = N and b ≥ 1. Consider a configuration u_{a,b} with the disk N − 1 on peg 1, disks N − b, N − b + 1, …, N − 2 on peg 0, and disks 0, …, a − 1 arranged on pegs 0 and 1 in such a way that they form a midpoint configuration of a disks on 4 pegs.
Let v_{a,b} be the resulting configuration. It has disks N − b, …, N − 1 on peg 2, and disks 0, …, a − 1 on peg 3. Also We now minimize over all choices of a and b. This gives configurations u and v such that , by Lemma 8,

We would now like to extend this result to configurations which may share a peg, i.e. there is a peg which is occupied in both the starting and ending configuration. Surprisingly, this requires some more effort.

Proof. We prove the lemma by induction on N.
Then a ∈ {0, 3}. Let π be the involution on {0, 1, 2, 3} which exchanges elements 1 and a. We modify γ into a new path γ′ by letting γ′|_{[0, t_1+1]} = γ|_{[0, t_1+1]} and setting for all t > t_1 + 1, At time t_1 + 1, peg 1 is empty and peg a only contains the disk N − 1. Hence all moves represented by γ′ are valid moves. However, γ′ may contain repeated states, so we may need to delete some in order to make it into a proper path. Note that in γ′(T − 1) pegs 1 and 3 − a are empty, as in γ(T − 1) pegs a and 3 − a were empty. Consequently by Theorem 7, By Theorem 7 and the fact that the pegs 1 and 2 are empty in Also, by Lemma 12 and the fact that pegs 1 and 2 are empty in γ(t_1 + 1)|_{[N−1]}, while pegs 0 and 3 are empty in Hence adding the move of the disk N − 1 gives

The proof is nearly identical to that of Lemma 13, and so we omit it.

Proof. We show the equivalent statement 2Φ(4, N + 1) − 2 ≥ Φ(4, N + 2) − 1. Write Thus in both cases Φ(4, N + 2) − 1 is at most 2Φ(4, N + 1) − 2, as desired.
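The inequality 2Φ(4, N + 1) − 2 ≥ Φ(4, N + 2) − 1 from the last proof is easy to test numerically. The Python sketch below is only a sanity check and not a substitute for the proof; the names phi4 and inequality_holds are ours, and the tested range 1 ≤ N < 60 is an assumption made here for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def phi4(n: int) -> int:
    """Phi(4, n) from the Frame-Stewart recursion, using Phi(3, k) = 2^k - 1."""
    if n < 1:
        raise ValueError("need n >= 1")
    if n == 1:
        return 1
    return min(2 * phi4(l) + 2 ** (n - l) - 1 for l in range(1, n))


def inequality_holds(n: int) -> bool:
    """Check 2 * Phi(4, n + 1) - 2 >= Phi(4, n + 2) - 1 for a single n."""
    return 2 * phi4(n + 1) - 2 >= phi4(n + 2) - 1


if __name__ == "__main__":
    # Assumed range for the check; it says nothing about larger n.
    print(all(inequality_holds(n) for n in range(1, 60)))
```

For N = 1 and N = 2 the two sides are equal (4 and 8 respectively), so the inequality cannot be improved in general.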
We are now ready to prove the counterpart to Lemma 12. Proof.
If v^{−1}(1) = ∅ then all disks are on peg 2 in v. But pegs 2 and 3 are empty in u, so by Theorem 7 and Lemma 8, By Lemma 15, this is at least 1 + (Φ(4, N + 2) − 5)/4, proving the claim in this case. So we may assume that v^{−1}(1) ≠ ∅. Let D be the largest disk on peg 1 in v. Then as b ∈ {2, 3} and pegs 2 and 3 are empty in u.
If c = 1, then we claim that γ|_{[t_1+1, T−1]} is an essential path when restricted to the moves of the first N − 1 disks. Indeed, the disks on peg 1 at time t_1 + 1 will all have to move, to make room for the disk D. If b = 2, let t_3 > t_1 be the last time when γ_{t_3}(N − 1) = 2. Then at time t_3, peg b = 2 and some other peg do not contain any disks smaller than N − 1. So by Theorem 7, Thus in any case, as claimed.
We shall further assume that γ_{t_1+1}(N − 2) = 2. We now choose t_2 ∈ [T] such that the disk N − 2 moves at time t_2, and the difference |t_2 − t_1| is minimal. Clearly t_2 exists, although there may be two distinct choices, if the disk N − 2 moves before and after time t_1. If there are two possibilities for t_2, we choose one arbitrarily. Then by definition of t_2, the disk N − 2 does not move in the time interval [min{t_1, t_2 + 1}, max{t_1 + 1, t_2}].
Note that we can always replace γ with γ^*, t_1 with t′_1 := T − t_1 − 2 and t_2 with t′_2 := T − t_2 − 2. Then γ^* is still essential, N − 1 moves at time t′_1, N − 2 moves at time t′_2, and the difference |t′_1 − t′_2| = |t_1 − t_2| is still minimal.

First suppose peg 3 is empty at time t_1 + 1. By replacing γ with γ^* if necessary, we may assume that t_2 > t_1. At time t_1 + 1, all disks in [N − 1] are on peg 2, while at time t_2, pegs 2 and γ_{t_2+1}(N − 2) do not contain any disks from [N − 2]. Consequently, by restricting to the first N − 2 disks and applying Theorem 7, we get ℓ(γ|_{[t_1+1, t_2]}) ≥ (Φ(4, N − 1) − 1)/2. Adding the 2 moves of the disks N − 1 and N − 2 and using Lemma 15, we get Therefore we may assume that peg 3 is not empty at time t_1 + 1. Let D be the largest disk on peg 3 and t_3 any time when the disk D moves.
Let us look at the path γ|_{[t_2+1, t_1]}. If we go backwards from time t_1 to time t_2 + 1, all disks on peg 2, except N − 2 (in other words, the disks in A), will have to move to make room for the move of the disk N − 2 at time t_2 + 1. Hence by restricting to the first N − 2 disks and then applying Theorem 7, we get ℓ(γ|

Case 2. t_2 > t_1 but t_3 < t_1.
This case follows from the previous one by replacing γ with γ^*.
By replacing γ with γ^* if necessary, we may suppose that t_2, t_3 > t_1. As the disk N − 2 does not move in the time interval [t_1, t_2], we have γ_{t_2}(N − 2) = 2.
We shall consider two further subcases.
Then γ|_{[t_1+1, t_2]} is an essential path when restricted to the moves of the first N − 2 disks. Indeed, all disks on peg 3 must move, because D moves, and all disks on peg 2 move, to make room for the move of the disk N − 2 at time t_2. Hence by Lemmas 12 and 16 applied to u := γ(t_1 + 1)|_{[N−2]} and v := γ(t_2)|_{[N−2]}, we get that Adding the further 2 moves of the disks N − 1 and N − 2 gives the result.
If γ_{t_2+1}(N − 2) = 3 then the disk D moves at least once in the time interval [t_1 + 1, t_2] and we may apply the previous subcase.
We will now show that the bound can be achieved. Let a, b ≥ 0 be such that a + b = N − 3. Consider a configuration u_{a,b} with the disk N − 1 on peg 2, the disk N − 2 on peg 1 and the disk N − 3 on peg 0. We put disks N − 3 − b, N − 2 − b, …, N − 4 on peg 0, and distribute the remaining a disks on pegs 0 and 1 in such a way that they form a midpoint configuration on 4 pegs.
Then we can first move the disk N − 1 to peg 3, followed by the disks 0, 1, …, a − 1 to the same peg in at most (Φ(4, a + 1) − 1)/2 moves. Afterwards we move the disk N − 2 to peg 2 (the peg is now free, and there are no more disks on top of the disk N − 2). We further move the disks Finally, we move the disk N − 3 to peg 1.
Let v_{a,b} be the resulting configuration. We have just constructed an essential path γ_{a,b} between u_{a,b} and v_{a,b} with ℓ(γ_{a,b}) ≤ 2 + (Φ(4, a + 1) − 1)/2 Minimizing over all choices of a and b yields an essential path γ of length at most

A similar argument as above shows that Γ(p, N) , for all p ≥ 3 and N ≥ p − 1.

Proof of Theorem 3
Let us recall the statement of Theorem 3. Given p ≥ 4 and N ≥ 1, we write N − 1 = ∆_p m + ∆_{p−1} t + r. Theorem 3 then states that H(p, N) ≥ (m + t) 2^{m − 2(p − 2)}.
Note that this decomposition of N − 1 exists: first choose m ≥ 0 maximal with ∆_p m ≤ N − 1 and then t ≥ 0 maximal with ∆_{p−1} t ≤ N − 1 − ∆_p m. Let r be the remainder. Then t ≤ m, as ∆_p m + ∆_{p−1}(m + 1) = ∆_p(m + 1). One can easily show that this decomposition is unique.
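The greedy choice of m, t and r described above is straightforward to carry out by machine. The Python sketch below is an illustration under a stated assumption: it takes ∆_p m to be the binomial coefficient C(m + p − 3, p − 2), which satisfies the identity ∆_p m + ∆_{p−1}(m + 1) = ∆_p(m + 1) quoted above but is not defined in this section. The names delta, decompose and theorem3_bound are ours.

```python
from fractions import Fraction
from math import comb
from typing import Tuple


def delta(p: int, m: int) -> int:
    """ASSUMPTION: Delta_p m = C(m + p - 3, p - 2), so Delta_3 m = m."""
    return comb(m + p - 3, p - 2)


def decompose(p: int, n: int) -> Tuple[int, int, int]:
    """Greedy (m, t, r) with n - 1 = Delta_p m + Delta_{p-1} t + r, for p >= 4."""
    if p < 4 or n < 1:
        raise ValueError("need p >= 4 and n >= 1")
    target = n - 1
    m = 0
    while delta(p, m + 1) <= target:       # m maximal with Delta_p m <= n - 1
        m += 1
    rest = target - delta(p, m)
    t = 0
    while delta(p - 1, t + 1) <= rest:     # t maximal with Delta_{p-1} t <= rest
        t += 1
    return m, t, rest - delta(p - 1, t)


def theorem3_bound(p: int, n: int) -> Fraction:
    """The quantity (m + t) * 2^(m - 2(p - 2)), kept exact as a fraction."""
    m, t, _ = decompose(p, n)
    return (m + t) * Fraction(2) ** (m - 2 * (p - 2))


if __name__ == "__main__":
    print(decompose(4, 10), decompose(5, 10), theorem3_bound(4, 10))
```

Under this assumption, N = 10 gives (m, t, r) = (3, 3, 0) for p = 4 and (2, 2, 2) for p = 5. The bound is returned as a Fraction because the exponent m − 2(p − 2) is negative for small m.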
Proof of Theorem 3. We prove the stronger statement Γ(p, N) ≥ (m + t) 2^{m − 2(p − 2)} by induction, first on p and then on N. The theorem then follows from the fact that H(p, N) ≥ Γ(p, N).