Switch-based Markov Chains for Sampling Hamiltonian Cycles in Dense Graphs

We consider the irreducibility of switch-based Markov chains for the approximate uniform sampling of Hamiltonian cycles in a given undirected dense graph on $n$ vertices. As our main result, we show that every pair of Hamiltonian cycles in a graph with minimum degree at least $n/2+7$ can be transformed into each other by switch operations of size at most $10$, implying that the switch Markov chain using switches of size at most $10$ is irreducible. As a proof of concept, we also show that this Markov chain is rapidly mixing on dense monotone graphs.


Introduction
In this work, we consider the problem of sampling Hamiltonian cycles in dense graphs using switch-based Markov chains. Throughout, let G be an n-vertex graph and denote its minimum degree by δ(G). A Hamiltonian cycle of G is a simple cycle of G that includes every vertex. A classical theorem by Dirac [3] states that if δ(G) ≥ n/2 then G has a Hamiltonian cycle. Moreover, it is well known that in general it is NP-complete to decide if G has a Hamiltonian cycle even if δ(G) ≥ (1/2 − ε)n. Dyer, Frieze, and Jerrum [4] considered the question of counting and sampling Hamiltonian cycles in dense graphs. They consider a Markov Chain Monte Carlo (MCMC) approach for solving the sampling problem. Here, one defines a suitable Markov chain on the (exponentially large) set of all Hamiltonian cycles, and shows that it is rapidly mixing, i.e., only a polynomial number of steps of the chain are needed in order to obtain a sample that is close to uniform. In particular, they give a fully-polynomial almost uniform sampler for sampling Hamiltonian cycles from graphs G with δ(G) ≥ (1/2 + ε)n, which is then turned into a fully-polynomial randomised approximation scheme for counting Hamiltonian cycles in such graphs by a standard reduction.
For the sampling problem, they take a two-step approach. First, based on a result of Jerrum and Sinclair [7], they show that there is a rapidly mixing Markov chain on the set of all 2-factors of G (which are all subgraphs of G in which every vertex has degree 2). Then it is shown that the number of 2-factors in G is at most a polynomial factor larger than the number of Hamiltonian cycles in G. This then automatically implies (roughly speaking) that if one takes a polynomial number of samples from the Markov chain that samples 2-factors, most likely one of those samples will be a Hamiltonian cycle. This sample is then also an approximately uniform sample from the set of all Hamiltonian cycles in G.
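The two-step approach amounts to rejection sampling, which can be sketched as follows. This is only an illustration: the callables `sample_two_factor` and `is_hamiltonian`, the bound `max_tries`, and the toy representation of a 2-factor by its tuple of cycle lengths are all assumptions made for this sketch, not part of the construction in [4].

```python
import random

def sample_hamiltonian_by_rejection(sample_two_factor, is_hamiltonian, max_tries):
    """Draw 2-factors until one is a Hamiltonian cycle.

    sample_two_factor: hypothetical callable returning an (approximately
        uniform) random 2-factor.
    is_hamiltonian: hypothetical callable testing whether a 2-factor is connected.
    Returns None if no Hamiltonian cycle shows up within max_tries draws.
    """
    for _ in range(max_tries):
        candidate = sample_two_factor()
        if is_hamiltonian(candidate):
            return candidate
    return None

# Toy stand-in: a "2-factor" on 6 vertices is represented only by its cycle lengths.
rng = random.Random(0)
toy_pool = [(6,), (3, 3)]
result = sample_hamiltonian_by_rejection(
    lambda: rng.choice(toy_pool),   # stand-in for the 2-factor chain
    lambda f: len(f) == 1,          # a single cycle means Hamiltonian
    max_tries=50,
)
```

Conditioned on acceptance, the output has the distribution of the 2-factor sampler restricted to Hamiltonian cycles, which is why the polynomial bound on the ratio of 2-factors to Hamiltonian cycles matters.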
At the end of their paper, Dyer, Frieze and Jerrum [4] ask whether there is a Markov chain on the set of Hamiltonian cycles, and possibly 'near-Hamiltonian cycles', that mixes rapidly. As a first step towards addressing this question, we show that there exist switch-based Markov chains on the set of all Hamiltonian cycles of a dense graph that converge to the uniform distribution, provided that δ(G) ≥ n/2 + 7.
Switch Markov chains are arguably the simplest and most natural Markov chains on the set of Hamiltonian cycles of a graph. Given a graph G, let H_G denote the set of Hamiltonian cycles of G. We say that H' ∈ H_G can be obtained from H ∈ H_G by a k-switch if |E(H) △ E(H')| ≤ 2k, that is, a k-switch is an operation for transforming one Hamiltonian cycle into another by altering at most 2k of its edges. For a given constant k ∈ N, the k-switch Markov chain on H_G is defined as follows in this work. Given that the Markov chain is currently in state H ∈ H_G, we first pick ℓ ∈ {1, . . . , k} uniformly at random, and then select a set L ⊆ E(G) with |L| = 2ℓ uniformly at random. If H △ L is a Hamiltonian cycle of G, the chain moves to H △ L; otherwise it stays at H. It is not hard to show that the k-switch Markov chain will converge to the uniform distribution on H_G (because of symmetry of the transition probabilities) provided that the chain is irreducible. Irreducibility here refers to the fact that any two Hamiltonian cycles H_1, H_2 ∈ H_G can be transformed into each other by a sequence of k-switches.
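One transition of such a chain can be sketched in a few lines. The sketch below assumes the move rule "go to H △ L if that is a Hamiltonian cycle, otherwise stay", which is the natural symmetric rule but is an assumption of this illustration; the function names and the edge representation (frozensets of two vertices) are likewise hypothetical.

```python
import random
from itertools import combinations

def is_hamiltonian_cycle(vertices, edges):
    """True if `edges` is 2-regular and connected on `vertices` (assumes n >= 3)."""
    adj = {v: [] for v in vertices}
    for e in edges:
        u, w = tuple(e)
        adj[u].append(w)
        adj[w].append(u)
    if any(len(nb) != 2 for nb in adj.values()):
        return False
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:                      # depth-first search for connectivity
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(adj)

def k_switch_step(vertices, graph_edges, H, k, rng):
    """One step: pick l in {1..k}, pick 2l edges of G, move to H ^ L if valid."""
    ell = rng.randint(1, k)
    if 2 * ell > len(graph_edges):
        return H
    L = set(rng.sample(graph_edges, 2 * ell))
    candidate = H ^ L                 # symmetric difference of edge sets
    return frozenset(candidate) if is_hamiltonian_cycle(vertices, candidate) else H

# Toy run on the complete graph K_5, starting from the cycle 0-1-2-3-4-0.
vertices = list(range(5))
graph_edges = [frozenset(p) for p in combinations(vertices, 2)]
H0 = frozenset(frozenset((i, (i + 1) % 5)) for i in range(5))
rng = random.Random(1)
trajectory = [H0]
for _ in range(200):
    trajectory.append(k_switch_step(vertices, graph_edges, trajectory[-1], 2, rng))
```

Most proposals are rejected, since a random edge set L rarely contains equally many edges inside and outside H in a compatible position; the chain is simple to state rather than efficient per step.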

Our contributions
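The main goal of this work is to provide the first irreducibility results for the k-switch Markov chain. Given a graph G, we say H_G is k-switch irreducible if every H ∈ H_G can be transformed into every H' ∈ H_G by a sequence of k-switches.
(i) We show that H_G is 10-switch irreducible for every graph G with δ(G) ≥ n/2 + 7 (Theorem 3.1).
(ii) For every fixed k, we give examples of graphs G with δ(G) ≥ (n − 3k − 4)/2 for which H_G is not k-switch irreducible (Example 3.6).
(iii) We give examples of graphs G with δ(G) ≥ 2n/3 − 1 for which H_G is not 2-switch irreducible.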
The second item essentially establishes that, for the case k = 10, the result in the first item is best possible (up to a constant-sized gap between n/2 − 17 and n/2 + 7). Moreover, the third item shows that the 2-switch Markov chain (probably the simplest Markov chain on Hamiltonian cycles) cannot be used to address the question of Dyer, Frieze and Jerrum for all dense graphs with δ(G) ≥ n/2.
As a proof of concept, we show that, for dense monotone graphs G (discussed in the related work section), the k-switch Markov chain on H_G is rapidly mixing. We do this by means of a meta-theorem which shows that if the k-switch Markov chain on H_G is (strongly) irreducible for some dense monotone graph G, then it is also rapidly mixing. Here, strong irreducibility roughly refers to the fact that if two Hamiltonian cycles are close to each other in terms of symmetric difference, we should be able to transform them into each other using a small number of k-switches; a formal definition is given later on. In the first item above, we indeed show this strong version of irreducibility for k = 10. Overall, several interesting new questions arise in light of our work and we hope our results will stimulate more work in the area. In particular, what is the smallest k for which the k-switch Markov chain is (strongly) irreducible for dense graphs with δ(G) ≥ n/2 + c, where c is a (small) constant? Furthermore, given the vast interest in the 2-switch Markov chain for other combinatorial objects (see Section 1.2), what is the smallest constant 2/3 < γ < 1 such that the 2-switch Markov chain is irreducible for all dense graphs with δ(G) ≥ γn + c for some (small) constant c?

Related work
The question of irreducibility, as well as being integral to the MCMC method, is studied in its own right under the moniker of reconfiguration problems. Here, one wishes to decide whether the space of solutions to some combinatorial problem is connected (where two solutions are adjacent if one can be obtained from the other by some small prescribed change); see for example the surveys of van den Heuvel [17] and Nishimura [12]. Reconfiguration problems about Hamiltonian cycles have not been widely considered. Takaoka [15] has considered the complexity of deciding whether H_G is 2-switch irreducible when G belongs to particular structural graph classes. This includes a hardness result for chordal bipartite graphs, but also a result establishing the 2-switch irreducibility of Hamiltonian cycles in unit interval graphs and monotone graphs. A slightly different Hamiltonian reconfiguration problem is considered by Lignos [9].
The mixing time of switch-based Markov chains has been studied extensively for sampling subgraphs of K_n with a given degree sequence, see, e.g., [8,2,11,1]. It is well known, see e.g. [16], that every two graphs (thought of as subgraphs of K_n) with the same degree sequence can be transformed into each other with switches of size 2 (in K_n). This remains true if one restricts oneself to the class of all connected subgraphs of K_n with a fixed degree sequence [16]. In particular, relevant to our setting, Feder et al. [6] (implicitly) show that the 2-switch chain is rapidly mixing on the set of all Hamiltonian cycles in case G is the complete graph. There are more direct ways to obtain this result, but we mention it here as we rely on some of their ideas in order to address the mixing time of the switch Markov chain on dense monotone graphs.
Monotone graphs, also known as bipartite permutation graphs, have been widely studied from the structural graph theory perspective, perhaps most notably in their characterisation [14] (we define them formally in Section 4.1). Monotone graphs have also been considered in the context of switch-based Markov chains for the sampling of perfect matchings: in particular, Dyer, Jerrum and Müller [5] show that the 2-switch Markov chain for sampling perfect matchings is rapidly mixing on monotone graphs. We refer the reader to [5] for further results in this direction.
We mentioned above that Takaoka [15] shows that the set of all Hamiltonian cycles in a given monotone graph is 2-switch irreducible. We remark that this is established in the weak sense by showing that every Hamiltonian cycle can be transformed, by switches of size 2, into a fixed canonical Hamiltonian cycle. However, we need the stronger notion of irreducibility for our rapid mixing proof for dense monotone graphs to go through.

Preliminaries
Let G = (V, E) be a simple undirected graph with vertex set V = {v_1, . . . , v_n} and edge set E = {e_1, . . . , e_m}. We use the shorthand notation uv to denote an edge {u, v} ∈ E. A 2-factor of G is a subgraph F in which every vertex v ∈ V has degree precisely d_F(v) = 2. We use F_G to denote the set of all 2-factors of G. A Hamiltonian cycle is a connected 2-factor, i.e., a simple cycle passing through all vertices of G. We use H_G to denote the set of all Hamiltonian cycles of G. Given two graphs G = (V, E) and G' = (V, E') on the same vertex set V, their symmetric difference is denoted by G △ G' = (V, E △ E') = (V, (E \ E') ∪ (E' \ E)). We use N_G(v) = {w : vw ∈ E} to denote the set of neighbours of v ∈ V in G and we write d_G(v) = |N_G(v)| for the degree of v, dropping subscripts when the graph is clear.
For a given k ≥ 2 and (finite) set A of graphs on some vertex set V, a switch of size k (with respect to A) is an operation on a given graph F = (V, E) ∈ A in which we remove exactly k edges from F and add exactly k edges from the complement of F in such a way that the resulting graph F' also satisfies F' ∈ A. We then define a k-switch to be a switch of size at most k. In this work we are mostly interested in A = H_G or A = F_G for a given undirected graph G.
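As a small illustration of these definitions, the sketch below takes a Hamiltonian cycle and a 2-factor with two components on six vertices and computes their symmetric difference, which here corresponds to a single switch of size 2 with respect to the set of 2-factors. The helper names and the six-vertex instance are hypothetical choices made for this example.

```python
def edge(u, w):
    """Undirected edge as a frozenset, so that uv and vu coincide."""
    return frozenset((u, w))

def is_two_factor(vertices, edges):
    """Every vertex must have degree exactly 2."""
    deg = {v: 0 for v in vertices}
    for e in edges:
        for v in e:
            deg[v] += 1
    return all(d == 2 for d in deg.values())

def components(vertices, edges):
    """Connected components (as vertex sets) of the graph (vertices, edges)."""
    adj = {v: set() for v in vertices}
    for e in edges:
        u, w = tuple(e)
        adj[u].add(w)
        adj[w].add(u)
    seen, comps = set(), []
    for s in vertices:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            for w in adj[stack.pop()]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        comps.append(comp)
    return comps

V = list(range(6))
H = {edge(i, (i + 1) % 6) for i in range(6)}                           # cycle 0-1-2-3-4-5-0
T = {edge(0, 1), edge(1, 2), edge(2, 0), edge(3, 4), edge(4, 5), edge(5, 3)}  # two triangles

removed = H - T      # edges deleted when going from H to T: {23, 50}
added = T - H        # edges inserted when going from H to T: {20, 35}
sym_diff = H ^ T     # symmetric difference of the two edge sets
```

Here `removed` and `added` each contain two edges, so T is obtained from H by a switch of size 2 with respect to the 2-factors of K_6, but not with respect to its Hamiltonian cycles, since T is disconnected.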
Fix a graph G and consider a k-switch with respect to F_G or H_G. Writing A for the 2k edges involved in the k-switch, it is easy to see that every vertex of the graph S = (V, A) must have even degree (since all graphs in F_G or H_G are regular of degree 2). Moreover, every connected component of S can be thought of as an alternating circuit, i.e. a circuit whose edges alternate between edges in G and edges not in G.
k-Switch irreducibility. For a given graph G and integer k, we say that H_G is (weakly) k-switch irreducible if for every H_1, H_2 ∈ H_G, there exists a sequence H_1 = Z_1, . . . , Z_q = H_2 of Hamiltonian cycles in H_G such that every consecutive pair of Hamiltonian cycles (Z_i, Z_{i+1}) differs by a k-switch. Moreover, for a given class of graphs G and integer k, we say that G is strongly k-switch irreducible for Hamiltonian cycles if there exists a function φ : N → N with the following property: for all G ∈ G and x ∈ N, whenever H_1, H_2 ∈ H_G satisfy |H_1 △ H_2| ≤ x, there exists a sequence H_1 = Z_1, . . . , Z_q = H_2 of Hamiltonian cycles in H_G such that every consecutive pair of Hamiltonian cycles (Z_i, Z_{i+1}) differs by a k-switch operation and q ≤ φ(x).
Roughly speaking, strong irreducibility states that if two Hamiltonian cycles are somewhat 'close' to each other in terms of symmetric difference, then we should be able to transform one into the other with a 'small' number of k-switches. Similarly we define (strong) irreducibility for 2-factors.
For a given graph G, we say that F_G (the set of 2-factors of G) is (weakly) k-switch irreducible if for every F_1, F_2 ∈ F_G, there exists a sequence F_1 = Z_1, . . . , Z_q = F_2 of 2-factors in F_G such that every consecutive pair of 2-factors (Z_i, Z_{i+1}) differs by a k-switch. For a given class of graphs G and integer k, we say that G is strongly k-switch irreducible for 2-factors if there exists a function φ : N → N with the following property: for all G ∈ G and x ∈ N, whenever F_1, F_2 ∈ F_G satisfy |F_1 △ F_2| ≤ x, there exists a sequence F_1 = Z_1, . . . , Z_q = F_2 of 2-factors in F_G such that every consecutive pair of 2-factors (Z_i, Z_{i+1}) differs by a k-switch operation and q ≤ φ(x).
Markov chains and mixing times. The preliminaries here will only be required in Section 4 onwards. We write M = (Ω, P) to denote an aperiodic, irreducible and time-reversible Markov chain M on state space Ω with transition matrix P. We write P^t(x, ·) for the distribution over Ω at time step t given that the initial state is x ∈ Ω. The total variation distance of this distribution from the (unique) stationary distribution π at time t with initial state x is
$$\Delta_x(t) = \max_{S \subseteq \Omega} \left| P^t(x, S) - \pi(S) \right| = \frac{1}{2} \sum_{y \in \Omega} \left| P^t(x, y) - \pi(y) \right|,$$
and the mixing time of M is
$$\tau(\epsilon) = \max_{x \in \Omega} \min\{ t : \Delta_x(t') \le \epsilon \text{ for all } t' \ge t \}.$$
Informally, τ(ε) is the number of steps until the Markov chain is ε-close to its stationary distribution. When π is the uniform distribution over Ω, we say that a Markov chain is rapidly mixing if the mixing time can be upper bounded by a function polynomial in ln(|Ω|/ε). As the Markov chains we consider are time-reversible, the matrix P only has real eigenvalues, which we denote by 1 = λ_0 > λ_1 ≥ λ_2 ≥ · · · ≥ λ_{|Ω|−1} > −1. We can always replace the transition matrix P of the Markov chain by (P + I)/2, to make the chain lazy, and, hence, guarantee that all its eigenvalues are non-negative. It then follows that the second-largest eigenvalue of (the new transition matrix) P is λ_1. In this work we always consider the lazy versions of the Markov chains involved, but we do not always mention this explicitly. It follows directly from Proposition 1 in [13] that
$$\tau(\epsilon) \le \frac{1}{1 - \lambda_1} \left( \ln(1/\pi_*) + \ln(1/\epsilon) \right),$$
where π_* = min_{x∈Ω} π(x). When π is the uniform distribution, the above bound reduces to
$$\tau(\epsilon) \le \frac{\ln(|\Omega|/\epsilon)}{1 - \lambda_1}.$$
The quantity (1 − λ_1)^{−1} can be upper bounded using the multicommodity flow method of Sinclair [13]. We define the state space graph of the chain M as the directed graph G with vertex set Ω that contains exactly the arcs (x, y) ∈ Ω × Ω for which P(x, y) > 0 and x ≠ y. Let P = ∪_{x≠y} P_{xy}, where P_{xy} is the set of simple paths between x and y in G. A flow f in Ω is a function P → [0, ∞) with the property Σ_{p∈P_{xy}} f(p) = π(x)π(y) for all x, y ∈ Ω, x ≠ y.
The flow f can be extended to a function on oriented edges of G by setting f(e) = Σ_{p∈P : e∈p} f(p), so that f(e) is the total flow routed through the edge e ∈ E(G). Let ℓ(f) = max_{p∈P : f(p)>0} |p| be the length of a longest flow-carrying path, and let ρ(e) = f(e)/Q(e) be the load of the edge e, where Q(e) = π(x)P(x, y) for e = (x, y). The maximum load of the flow is then given by ρ(f) = max_{e∈E(G)} ρ(e). Sinclair, in Corollary 6 of [13], shows that
$$(1 - \lambda_1)^{-1} \le \rho(f)\,\ell(f).$$
We use the following (by now standard) technique for bounding the maximum load of a flow in case the chain M has uniform stationary distribution π. Suppose θ is the smallest positive transition probability of the Markov chain between two distinct states in Ω. If b is such that f(e) ≤ b/|Ω| for all e ∈ E(G), then it follows that ρ(f) ≤ b/θ. This implies that
$$\tau(\epsilon) \le \ell(f)\,\frac{b}{\theta}\,\ln(|\Omega|/\epsilon).$$
Now, if ℓ(f), b and 1/θ can be bounded by a function polynomial in ln(|Ω|), it follows that the Markov chain M is rapidly mixing. In this case, we say that f is an efficient flow. Note that in this approach the transition probabilities do not play a role as long as 1/θ is polynomially bounded.
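To make the notions of total variation distance and mixing time concrete, the following sketch computes both for a toy chain, the lazy random walk on a triangle. The chain, the function names and the cut-off `max_steps` are assumptions chosen for illustration and are unrelated to the switch chains studied here. The sketch uses the standard fact that the total variation distance to stationarity is non-increasing in t, so the first time it drops to ε is the mixing time from that start state.

```python
def step(dist, P):
    """One step of the chain: multiply the row vector `dist` by the matrix P."""
    n = len(P)
    return [sum(dist[x] * P[x][y] for x in range(n)) for y in range(n)]

def tv_distance(p, q):
    """Total variation distance: half the l1-distance between two distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def mixing_time_from(x, P, pi, eps, max_steps=10_000):
    """Smallest t with TV(P^t(x, .), pi) <= eps, or None if not reached."""
    dist = [1.0 if y == x else 0.0 for y in range(len(P))]
    for t in range(max_steps + 1):
        if tv_distance(dist, pi) <= eps:
            return t
        dist = step(dist, P)
    return None

# Lazy random walk on a triangle: stay with probability 1/2, else move to a
# uniformly random neighbour. The stationary distribution is uniform.
P = [[0.50, 0.25, 0.25],
     [0.25, 0.50, 0.25],
     [0.25, 0.25, 0.50]]
pi = [1 / 3, 1 / 3, 1 / 3]

tau = max(mixing_time_from(x, P, pi, eps=0.01) for x in range(3))
```

For this chain the distance from any start state is (2/3)·4^(−t), so it first falls below 0.01 at t = 4.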

Irreducibility of k-switch Markov chain
In this section we will prove various results regarding the (non)-irreducibility of the k-switch Markov chain. The main result of this section is Theorem 3.1 below. Afterwards, we provide various examples of non-irreducibility for certain combinations of δ(G) and k.
Theorem 3.1. If a graph G satisfies δ(G) ≥ n/2 + 7, then the set H_G of all Hamiltonian cycles of G is 10-switch irreducible. Moreover, the class of graphs G for which δ(G) ≥ n/2 + 7 is strongly 10-switch irreducible for Hamiltonian cycles.
In order to prove Theorem 3.1, we rely on the following lemma. It allows us to quickly reconfigure a 2-factor T into a Hamiltonian cycle H' without increasing the symmetric difference with respect to some fixed Hamiltonian cycle H.
Lemma 3.3. Let G be a graph with δ(G) ≥ n/2 + 1, let H be a Hamiltonian cycle of G, and let T be a 2-factor of G with t components. Then there exists a Hamiltonian cycle H', so that T can be transformed into H' with at most t − 1 switches of size at most 3, and for which
|H' △ H| ≤ |T △ H|. (1)
Proof. Let t be the number of components of T. We will prove the statement in the lemma using induction. If t = 1 then T is Hamiltonian and we are done as we may take H' = T. Suppose t > 1. Let C_1, . . . , C_t denote the cyclic components of T. Since H is Hamiltonian, there must be some edge vw ∈ E(H) connecting two components of T (see Figure 3.1). We assume without loss of generality that vw connects C_1 and C_2, i.e. that v ∈ V(C_1) and w ∈ V(C_2) (by renumbering if necessary). Moreover, since v has degree two in H and vw ∈ E(H), it must be that there exists an a ∈ V(C_1) (one of the two neighbours of v on C_1) such that va ∉ E(H); similarly, there exists a b ∈ V(C_2) (one of the two neighbours of w on C_2) such that bw ∉ E(H).
We assign orientations to C_1, . . . , C_t. We will call v^+ the vertex following v in the appropriate orientation and v^− the vertex preceding v. We choose the orientations on C_1 and C_2 such that v = a^+ and b = w^+, see Figure 3.1, and we assign arbitrary orientations on C_3, . . . , C_t. Consider X := {u^+ | u ∈ N(a)}. As δ(G) ≥ n/2 + 1, |X| ≥ n/2 + 1. Also consider N(b), and note that |N(b)| ≥ n/2 + 1, so that X ∩ N(b) is non-empty. We choose y ∈ X ∩ N(b) and set x = y^−, noting that ax ∈ E(G). If y ∉ {a, b^+, w, v^+}, the general case, we now switch along the cycle vaxybwv; see Figure 3.1. Note that the edge xy may lie on C_1, C_2 or a different cycle C_i. In all these cases, we do not increase |T △ H|, as vw ∈ E(H) and va, bw ∉ E(H).
If xy ∉ E(C_1 ∪ C_2), we decrease the number of cycles by two, otherwise by one. For the special cases y ∈ {a, b^+, w, v^+}, we switch along different cycles as follows; see Figure 3.2. If y = v^+, we switch along the cycle vybwv. If y = w, we switch along the cycle vaxwv. If y ∈ {a, b^+}, then ab ∈ E(G), and we switch along the cycle vabwv. It is easy to see that in these cases we decrease |T △ H| by at least two and we decrease the number of cycles by one.
In any case, the resulting 2-factor has fewer components and the symmetric difference is not larger. Repeated application of this procedure proves the statement of the lemma.
We now continue with the proof of Theorem 3.1.
Proof of Theorem 3.1. We claim that for two given Hamiltonian cycles H_1 and H_2 there is a switch of size at most 4 that transforms H_1 into a 2-factor T with at most 3 components such that |T △ H_2| < |H_1 △ H_2|. The theorem then follows from Lemma 3.3 since with two switches of size at most 3, we can transform T into some Hamiltonian cycle H' satisfying |H' △ H_2| ≤ |T △ H_2| < |H_1 △ H_2|. In particular we can transform H_1 to H' with a switch of size at most 4 + 2 × 3 = 10, and repeating this we can transform H_1 into H_2 with at most x = |H_1 △ H_2| switches of size at most 10, proving the theorem (where we take φ(x) = x in the definition of strong irreducibility).
We now prove the claim. Note that the symmetric difference of H_1 and H_2 is the vertex-disjoint union of circuits in which edges alternate between H_1 and H_2 and the circuits visit each vertex zero, one, or two times. If the symmetric difference of H_1 and H_2 contains such alternating circuits with four or six edges (corresponding to switches of size 2 or 3), the claim obviously holds, so assume otherwise. In this case it is not hard to see that we can find an H_1, H_2-alternating walk P = a_1a_2a_3a_4a_5a_6 (here the a_i are vertices and a_1 and a_6 are distinct) such that a_1a_2, a_3a_4, a_5a_6 are edges of H_1, and a_2a_3, a_4a_5 are edges of H_2.
We try to find vertices b and c that are neighbours on H_1 such that b ∈ N(a_1) and c ∈ N(a_6). Then the circuit C := a_1a_2a_3a_4a_5a_6cba_1 is a 4-switch for H_1. Deleting the edges a_1a_2, a_3a_4, a_5a_6 and cb divides H_1 into four paths and adding a_2a_3, a_4a_5, a_6c and ba_1 can connect some of these paths again.
Therefore, switching H_1 along C can produce at most 4 connected components, and this only happens if the four edges a_2a_3, a_4a_5, a_6c and ba_1 connect each path into a cycle (see Figure 3.3, left side). If one of the paths is just an isolated vertex, it cannot be connected to itself in this way. It is easy to check that 4 components are produced if and only if the vertices a_1, a_2, . . . , a_6, c, b are distinct and appear in that order along H_1 (as in Figure 3.3, left side). To prevent this, we choose b and c as follows: orient H_1 so that a_2 follows a_1. We […] This ensures that the resulting 4-switch (along the circuit C := a_1a_2a_3a_4a_5a_6cba_1) produces at most three components. Finally, if T is the 2-factor produced by switching H_1 along C, then compared to H_1, T contains at least two new edges of H_2 (namely a_2a_3, a_4a_5) but T may have lost one edge of H_2 (namely bc if it was in fact an edge of H_2), giving a net gain of one. Since T and H_1 have the same number of edges, we see that |T △ H_2| ≤ |H_1 △ H_2| − 1, as required.
We also give a version of Theorem 3.1 for 2-factors, instead of Hamiltonian cycles, that we will need later. The proof is a simplification of Theorem 3.1 and so we defer its proof to the appendix.
For convenience, we select n such that m is odd and m ≥ 3. We denote the vertices of A_i by v_{i,j} for j = 1, . . . , m. Take as edge set E all edges between vertices in A_1, all edges between vertices in A_3, and all edges from vertices in A_i to vertices in A_{i+1} for i = 1, 2 (see Figure 3.4).
We color edges as follows: all edges incident to a vertex in A_1 are colored blue, and all other edges red. Note that every cycle of length 4 contains an even number of red edges and an even number of blue edges. This means that any switch along a 4-cycle preserves the parity of the number of red edges and of the number of blue edges.
We will finish the construction by describing two Hamiltonian cycles H_1 and H_2 that have different parities of blue edges. As every 2-switch preserves the parity of the number of blue edges, H_1 cannot be converted to H_2 via 2-switches.
The blue edges in H_1 are v_{2,1}v_{1,1}, v_{1,k}v_{1,k+1} for k = 1, . . . , m − 1 and v_{1,m}v_{2,m}. The red edges in […] There are an even number of blue edges and an odd number of red edges in H_1. The Hamiltonian cycle H_2 is constructed by swapping the roles of the blue and red edges.
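The parity claim for 4-cycles can be checked by brute force on a small instance. The sketch below assumes that the vertex set is partitioned into three classes A_1, A_2, A_3 of size m each (so n = 3m), which is how the indices v_{i,j} above are read here; the function names and the choice m = 3 are hypothetical.

```python
from itertools import combinations, permutations

def build_example(m):
    """Cliques on A_1 and A_3, complete bipartite graphs A_1-A_2 and A_2-A_3."""
    A = {i: [(i, j) for j in range(1, m + 1)] for i in (1, 2, 3)}
    edges = set()
    for i in (1, 3):                                   # edges inside A_1 and inside A_3
        edges |= {frozenset(p) for p in combinations(A[i], 2)}
    for i in (1, 2):                                   # edges between A_i and A_{i+1}
        edges |= {frozenset((u, w)) for u in A[i] for w in A[i + 1]}
    return A[1] + A[2] + A[3], edges

def is_blue(e):
    """An edge is blue if it is incident to a vertex of A_1, red otherwise."""
    return any(v[0] == 1 for v in e)

def blue_counts_on_4_cycles(vertices, edges):
    """Set of values taken by the number of blue edges over all 4-cycles."""
    counts = set()
    for a, b, c, d in permutations(vertices, 4):
        cyc = [frozenset((a, b)), frozenset((b, c)), frozenset((c, d)), frozenset((d, a))]
        if all(e in edges for e in cyc):
            counts.add(sum(is_blue(e) for e in cyc))
    return counts

vertices, edges = build_example(3)
counts = blue_counts_on_4_cycles(vertices, edges)
min_degree = min(sum(1 for e in edges if v in e) for v in vertices)
```

Since every 4-cycle has an even number of blue edges, replacing two of its edges by the other two cannot change the parity of the number of blue edges in a Hamiltonian cycle. The check covers only m = 3 and is no substitute for the general argument.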
Example 3.6 (The case δ(G) ≈ n/2 for each fixed k). For k fixed and n ≥ 3k + 5, there is a graph G with δ(G) ≥ (n − 3k − 4)/2 for which H_G is not k-switch irreducible. Our construction relies on the following lemma.
Lemma 3.7. For any ℓ, there is a graph X with 3ℓ + 1 vertices that has exactly two Hamiltonian paths H_1 and H_2. Moreover, these two paths satisfy |H_1 △ H_2| = 2ℓ.
Proof. Without loss of generality let ℓ be odd, and set n = 3ℓ + 1. Let X = (V, E) with V := {v_1, . . . , v_n} and E := […], and E_2 = {v_jv_{j+4} | […] and a_6 are in different parts; say a_1 ∈ A and a_6 ∈ B. Then N(a_1), M ⊆ B, so since |N(a_1)|, |M| ≥ n/2 + 7, we have |N(a_1) ∩ M| ≥ 14, and we continue as before.
For the example we begin by applying the previous lemma with ℓ = k + 1 to obtain the graph X of order r := 3ℓ + 1. For any n such that n + r is odd, we construct our example G by taking an (unbalanced) complete bipartite graph with parts A and B of size (n + (r − 1))/2 and (n − (r − 1))/2 respectively and adding a copy of X inside A. See Figure 3.5, right side. As there are no edges inside B, any Hamiltonian cycle of G must use r − 1 edges inside A, and so these must be within X. Since X has r vertices, any Hamiltonian cycle of G must induce a Hamiltonian path on X. By construction, X has exactly two Hamiltonian paths H_1 and H_2, and they have a symmetric difference of 2k + 2. It is easy to see that G has Hamiltonian cycles that use each of the two Hamiltonian paths in X, but it is impossible to perform a sequence of k-switches to transform a Hamiltonian cycle that uses H_1 into one that uses H_2; indeed, if such a sequence existed, examining its restriction to X would yield a sequence of switches of size at most k that transforms H_1 into H_2 while maintaining a Hamiltonian path in X at each stage. This is impossible since X has only two Hamiltonian paths and their symmetric difference has size 2ℓ = 2(k + 1) > 2k, whereas a k-switch alters at most 2k edges.

Rapid mixing for dense monotone graphs
In this section we will give some rapid mixing results for switch-based Markov chains for sampling Hamiltonian cycles in special classes of dense graphs.
We first present a result for the sampling of 2-factors using switch-based Markov chains, which will be used later on, and that might be of independent interest. Given a graph G, recall the k-switch Markov chain on H_G defined in the introduction. Replacing H_G with F_G (the set of all 2-factors of G) everywhere in that definition defines the k-switch Markov chain on F_G.
Theorem 4.1. Let G be the class of all graphs G on n vertices with δ(G) ≥ n/2. If G is strongly k-switch irreducible for 2-factors for some k ∈ N (this is the case for k = 4 by Proposition 3.4) then there is an efficient multicommodity flow for the k-switch Markov chain on F_G, and, in particular, this Markov chain is rapidly mixing.
Moreover, Theorem 4.1 remains true for the bipartite case of the problem, where we are given a bipartite graph G = (A ∪ B, E) with both |A| = |B| = n, and where every vertex in A ∪ B has degree at least n/2.
The proof of Theorem 4.1 is outlined in Appendix A. It is based on the embedding argument introduced in [1] for the switch Markov chain that samples graphs with a given degree sequence. It is perhaps interesting to note that it seems much harder to prove Theorem 4.1 by using other approaches for that problem, such as [2,11]. These approaches do have the advantage that they get better mixing time bounds than those in [1].

Dense monotone graphs
In this section we will describe a rapid mixing result for sampling Hamiltonian cycles from dense monotone graphs that is based on Theorem 4.1. We will start with the definition of monotone graphs (also known as bipartite permutation graphs).
Definition 4.2. A bipartite graph G = (A ∪ B, E) with |A| = |B| = n is monotone if there exist a permutation (a_1, . . . , a_n) of the vertices in A and a permutation (b_1, . . . , b_n) of the vertices in B, such that the adjacency matrix C of G, with rows indexed by a_1, . . . , a_n and columns indexed by b_1, . . . , b_n, has monotone rows and columns. This means that for each i, there exist 1 ≤ r_i ≤ t_i ≤ n such that C(a_i, b_j) = 1 if and only if r_i ≤ j ≤ t_i, and the sequences (r_i)_{i=1}^n and (t_i)_{i=1}^n are non-decreasing. Intuitively, this means that the 1-entries in every row and column are contiguous. Note that although the definition does not immediately appear to be symmetric in A and B, one can easily check that it is. An example of such an adjacency matrix of a monotone graph is […] Moreover, we say G is a γ-dense monotone graph if every vertex in A ∪ B has degree at least γn.
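The condition on the adjacency matrix is easy to test for a given pair of orderings. The sketch below checks that every row has a contiguous, non-empty block of 1-entries and that the start and end positions of these blocks are non-decreasing down the rows. The function names and both matrices are hypothetical illustrations, and the check concerns one fixed ordering of rows and columns only; it does not search for suitable permutations.

```python
def row_interval(row):
    """Return (r, t) if the 1-entries of `row` are exactly positions r..t, else None."""
    ones = [j for j, x in enumerate(row) if x == 1]
    if not ones or ones != list(range(ones[0], ones[-1] + 1)):
        return None
    return ones[0], ones[-1]

def is_monotone(C):
    """Check the interval condition for the rows of C in the given order."""
    intervals = [row_interval(row) for row in C]
    if any(iv is None for iv in intervals):
        return False
    starts = [r for r, _ in intervals]
    ends = [t for _, t in intervals]
    return starts == sorted(starts) and ends == sorted(ends)

# Hypothetical 4 x 4 example: every row and column has at least n/2 = 2 ones.
C_example = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]

# Not monotone in this ordering: the first row has a gap between its 1-entries.
C_bad = [
    [1, 0, 1],
    [1, 1, 1],
    [0, 1, 1],
]
```

In `C_example` the row intervals are (0, 1), (0, 2), (1, 3), (2, 3) in zero-based indexing, so both sequences are non-decreasing.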
The main theorem of this section is given below.
Theorem 4.3. Let D be the set of all monotone graphs with δ(G) ≥ n/2. If D is strongly k-switch irreducible for Hamiltonian cycles for some k ∈ N (this is the case for k = 10 by Remark 3.2) then for every G ∈ D, the k-switch Markov chain for sampling a Hamiltonian cycle from H_G is rapidly mixing.
As mentioned earlier, the set of all Hamiltonian cycles for (not necessarily dense) monotone graphs is connected under switches of size two [15] in the weak sense as defined in the preliminaries. Takaoka shows that every Hamiltonian cycle can be transformed into a 'canonical' Hamiltonian cycle using switches of size two. This is, however, not enough for the argument we will give below. For that we need the strong sense of irreducibility.
Proof of Theorem 4.3. The proof relies on an embedding argument similar to that in [6], but technically somewhat different. While the argument in [6] corresponds to the case where G is a complete bipartite graph (which is indeed monotone), here we relax the argument so that it extends to monotone graphs.
Let G ∈ D be given. In particular, our goal is to show, for every G ∈ D, the existence of a function φ : F_G → H_G with the properties i) |φ^{−1}(H)| ≤ poly(n) for every H ∈ H_G, and, ii) there exists a function f : N → N such that |φ(F) △ φ(F')| ≤ f(|F △ F'|) for all F, F' ∈ F_G. If such a function exists, one can argue exactly as in [6] that every efficient multi-commodity flow for the k-switch Markov chain on the set of all 2-factors F_G can be transformed into an efficient multi-commodity flow for the k-switch Markov chain on the set of all Hamiltonian cycles H_G. (The embedding argument from [6] that we refer to here is essentially the same as that used to prove Theorem 4.1 in Appendix A.) As we know that there exists an efficient multi-commodity flow for the k-switch Markov chain on F_G (by Theorem 4.1 and Proposition 3.4), this then shows that the k-switch Markov chain on H_G is also rapidly mixing.
The remainder of the proof is dedicated to showing the existence of such a function φ for each G ∈ D, which we will do in three claims. Let G = (A ∪ B, E) ∈ D be a monotone graph with |A| = |B| = n where we assume that n is even for simplicity. Let a_1, . . . , a_n (resp. b_1, . . . , b_n) be the vertices of A (resp. B) in order as given in Definition 4.2. Set A_1 = {a_1, . . . , a_{n/2}}, A_2 = {a_{n/2+1}, . . . , a_n}, B_1 = {b_1, . . . , b_{n/2}} and B_2 = {b_{n/2+1}, . . . , b_n}.
Claim 4. The graphs G[A_1 ∪ B_1] and G[A_2 ∪ B_2] are complete bipartite.
Claim 5. Given G ∈ D, let P_G be the set of all subgraphs K ⊆ G such that K is the union of three vertex-disjoint paths that together cover all vertices of G. Then there exists an injective function φ_1 : F_G → P_G and a function g : N → N such that |φ_1(F) △ φ_1(F')| ≤ g(|F △ F'|) for all F, F' ∈ F_G.
Claim 6. Given G ∈ D, there is a function φ_2 : P_G → H_G such that for every K ∈ P_G, we have that |K △ φ_2(K)| ≤ 9; in particular, for each H ∈ H_G, |φ_2^{−1}(H)| ≤ |E(G)|^9 = poly(n).
The function φ is the composition of φ_1 and φ_2 and can easily be seen to satisfy the desired properties (taking f(k) = g(k) + 18). Therefore it remains only to prove the claims.
Proof of Claim 4. Note that a_1b_1 must be an edge of G. If this is not the case, then b_1 can never have positive degree, because of monotonicity of the rows of the adjacency matrix. As both a_1 and b_1 have degree at least n/2, we can conclude that all edges of the form a_ib_j with 1 ≤ i, j ≤ n/2 are present (again because of monotonicity) so G[A_1 ∪ B_1] is complete bipartite. A similar argument for the edge a_nb_n yields that G[A_2 ∪ B_2] is complete bipartite.
Proof of Claim 5. We use a similar idea as in [6]. We fix the total orderings a_{n/2+1} < a_{n/2+2} < · · · < a_n < a_1 < a_2 < · · · < a_{n/2} on the vertices in A and b_{n/2+1} < b_{n/2+2} < · · · < b_n < b_1 < b_2 < · · · < b_{n/2} on the vertices of B. Fix F ∈ F_G and let C_1, . . . , C_q be the cycles (or connected components) of F. For a given cycle C_r, we use a^r to denote the highest ordered vertex of A in C_r, and we use b^r to denote the highest ordered vertex of B in C_r. We first group the cycles in three sets depending on the vertices a^r and b^r. We define Q_{A_1} = {C_r : a^r ∈ A_1}, Q_{B_1} = {C_r : a^r ∈ A_2 and b^r ∈ B_1}, and Q_{A_2∪B_2} as the set of all remaining cycles not in Q_{A_1} or Q_{B_1}. Note that the cycles in Q_{A_2∪B_2} are fully contained in A_2 ∪ B_2. For each cycle C_r in Q_{A_1} and Q_{A_2∪B_2}, let c^r be an arbitrary neighbour of a^r in C_r and for each cycle C_r in Q_{B_1} let d^r be an arbitrary neighbour of b^r on C_r (in each case there are two choices). We delete the edges a^rc^r and b^rd^r from F to create paths; we will connect the paths in each group together to build the three paths which will define φ_1(F) ∈ P_G. We first explain the idea (of Feder et al. [6]) on how to glue together the paths from Q_{A_2∪B_2} in such a way that we can uniquely recover the original paths from the single glued path: this case is easiest because we know from Claim 4 that the graph G[A_2 ∪ B_2] is complete bipartite.
After renaming the cycles, let us assume the cycles in Q_{A_2∪B_2} are C_1, . . . , C_q where a^1 < a^2 < · · · < a^q. Let P_r be the path obtained by deleting the edge a^rc^r from the cycle C_r. As all the cycles lie entirely within A_2 ∪ B_2 and G[A_2 ∪ B_2] is complete bipartite, we know that all the edges c^ra^{r+1} are present in G for r = 1, . . . , q − 1. Adding these edges to the graph consisting of P_1, . . . , P_q results in a path that we call P_{A_2∪B_2}. Note that, given P_{A_2∪B_2} (without knowing the paths P_1, . . . , P_q), we can uniquely recover these P_1, . . . , P_q as follows. We know that the endpoint of P_{A_2∪B_2} that is contained in A is the first vertex of P_1, i.e., the vertex a^1 (the other endpoint is necessarily in B). In order to recover P_1 we start following the path P_{A_2∪B_2}, starting from a^1, until we reach the first vertex in A that is ordered higher than a^1; this is the first vertex of P_2, i.e., the vertex a^2. Continuing in this fashion we can uniquely recover all the paths P_i.
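The gluing and recovery steps can be sketched as follows. The encoding is an assumption made for illustration: a vertex is a pair (side, rank), where side is "A" or "B" and rank is its position in the fixed total order, and each path P_r is a list that starts at the highest-ranked A-vertex of its cycle and ends at the chosen neighbour in B. The sketch does not verify that the connecting edges exist in G.

```python
def glue(paths):
    """Concatenate the paths in increasing order of their first (highest A-) vertex."""
    ordered = sorted(paths, key=lambda p: p[0][1])
    return [v for p in ordered for v in p]

def recover(glued):
    """Split the glued path again: a new path starts at every A-vertex whose
    rank exceeds the rank of the first vertex of the current path."""
    paths, current = [], []
    for v in glued:
        side, rank = v
        if current and side == "A" and rank > current[0][1]:
            paths.append(current)
            current = []
        current.append(v)
    if current:
        paths.append(current)
    return paths

# Three hypothetical paths, each obtained from a cycle by deleting one edge
# at the cycle's highest-ranked A-vertex.
paths = [
    [("A", 6), ("B", 5), ("A", 4), ("B", 6)],
    [("A", 3), ("B", 1), ("A", 1), ("B", 2)],
    [("A", 5), ("B", 3), ("A", 2), ("B", 4)],
]
glued = glue(paths)
recovered = recover(glued)
```

Recovery works because every A-vertex inside a path ranks below that path's first vertex, while the first vertex of the next path ranks above it; this is exactly what makes the map injective on this group of cycles.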
We apply a similar procedure to the paths obtained from $Q_{A_1}$ and $Q_{B_1}$ to form paths $P_{A_1}$ and $P_{B_1}$, respectively. The problem here is that the underlying graph is not complete bipartite, so we do not know a priori whether the edges needed to 'glue' the paths together are all present: we argue that they are in fact present. The proof for $Q_{A_1}$ that we give below also holds for $Q_{B_1}$ by symmetry of monotonicity (the case of $Q_{B_1}$ is essentially a slightly more restrictive setting in which some of the cases below cannot occur).
Assume that the cycles in $Q_{A_1}$ are $C_1, \dots, C_p$, labelled so that $a^1 < a^2 < \cdots < a^p$. By means of a case distinction, depending on whether $c^r \in B_1$ or $c^r \in B_2$ for $r = 1, \dots, p-1$, we will show that the edges $c^r a^{r+1}$ always exist.
Case 1: $c^r \in B_1$. As we know that $a^{r+1} \in A_1$ by definition of $Q_{A_1}$, it follows that $c^r a^{r+1}$ is an edge of $G$, since $G[A_1 \cup B_1]$ is complete bipartite by Claim 4.
Case 2: $c^r \in B_2$. Since $a^r < a^{r+1} =: a_j$ by assumption, monotonicity tells us that the neighbourhood $N(a^{r+1}) \subseteq B$ ends at either $c^r$ or to the right of $c^r$. Furthermore, we know $a_j b_j \in E(G)$, again since $G[A_1 \cup B_1]$ is complete bipartite by Claim 4. Since $b_j \in B_1$, it lies to the left of $c^r \in B_2$ so, in particular, the neighbourhood $N(a^{r+1})$ starts before $c^r$. Monotonicity then tells us that the edge $c^r a^{r+1}$ is also present in $G$.
We have shown how to construct the paths $P_{A_1}$, $P_{B_1}$, and $P_{A_2 \cup B_2}$, which together clearly cover all vertices of $G$. We define $\phi_1(F) = P_{A_1} \cup P_{B_1} \cup P_{A_2 \cup B_2} \in \mathcal{P}_G$.
In order to see that $\phi_1$ is injective, note first that if $K \in \mathcal{P}_G$ is the image of some (unknown) $F \in \mathcal{F}_G$ under $\phi_1$, then one of the paths in $K$ has all its vertices in $A_2 \cup B_2$ (we call this path $P_{A_2 \cup B_2}$), one has all its vertices from $A$ in $A_2$ and some vertices from $B_1$ (we call this path $P_{B_1}$), and we call the remaining path $P_{A_1}$. As described earlier, we can then easily identify the constituent paths that were glued together to form $P_{A_1}$, $P_{B_1}$, and $P_{A_2 \cup B_2}$. Finally, we can complete each constituent path to a cycle to uniquely recover $F$. Therefore $\phi_1$ is injective.
Finally, suppose $F, F' \in \mathcal{F}_G$ with $|F \triangle F'| \le k$. In particular, there are at most $k$ cycles that belong to one of $F$ or $F'$ but not both. In constructing $\phi_1(F)$ (resp. $\phi_1(F')$), we first delete one edge from each cycle of $F$ (resp. $F'$) to obtain a union of paths, which we call $J$ (resp. $J'$). Then $|J \triangle J'| \le k$ and there are at most $k$ paths that belong to one of $J$ or $J'$ but not both. When gluing the paths of $J$ (resp. $J'$) together to form $\phi_1(F)$ (resp. $\phi_1(F')$), there are at most $2k$ gluing edges that are used for one of $J$ or $J'$ but not both (at most two such edges for each differing path). This shows that $|\phi_1(F) \triangle \phi_1(F')| \le k + 2k = 3k$, so $\phi_1$ has the desired property (taking $g(k) = 3k$).
Proof of Claim 6. This claim follows immediately from Lemma 4.7 below.
Lemma 4.7. Suppose $G = (V, E)$ is an $n$-vertex graph with $\delta(G) > n/2$. If $P_1, \dots, P_k$ are $k$ vertex-disjoint paths in $G$ that together cover all vertices of $V$, then there exists a Hamiltonian cycle $H$ of $G$ such that $|E(H) \triangle E(P_1 \cup \cdots \cup P_k)| \le 3k$.
For bipartite graphs, we have the following. Suppose $G = (V, E)$ is a bipartite graph with bipartition $V = A \cup B$ with $|A| = |B| = n$ and $\delta(G) \ge n/2$. If $P_1, \dots, P_k$ are $k$ vertex-disjoint paths in $G$ that together cover all vertices of $V$, then there exists a Hamiltonian cycle $H$ of $G$ such that $|E(H) \triangle E(P_1 \cup \cdots \cup P_k)| \le 3k$.
We prove the lemma for graphs; an almost identical proof works for bipartite graphs and we indicate where the proofs differ.
Proof. We will inductively modify the system of paths, at each step modifying at most 3 edges and reducing the number of paths by 1.
Let $x_i$ and $y_i$ be the endpoints of $P_i$ and orient the path $P_i$ from $x_i$ to $y_i$. For any vertex $x$, let $x^+$ (resp. $x^-$) be the successor (resp. predecessor) of $x$ on its path (note that these exist except possibly at the $2k$ endpoints of the paths). For any set $S \subseteq V(G)$, we define $S^+ := \{s^+ : s \in S\}$, and we define $S^-$ analogously.
Assuming $k \ge 2$, take any two paths, say $P_1$ and $P_2$. [If $G$ is bipartite, we choose $P_2$ such that $x_1$ and $y_2$ are in different parts, say $x_1 \in A$ and $y_2 \in B$. Note that this is always possible, renaming paths if necessary.] If $x_1$ is adjacent to any of $x_2, \dots, x_k$, say to $x_i$, then we can reduce the number of paths by replacing $P_1$ and $P_i$ by $y_1 P_1 x_1 x_i P_i y_i$ as required (only modifying one edge) and we continue. Therefore we may assume that $x_1$ is not adjacent to any of $x_2, \dots, x_k$, so every neighbour of $x_1$ has a predecessor and, in particular, $|N(x_1)^-| = |N(x_1)| > n/2$. Then since $|N(y_2)| > n/2$, we must have that $N(x_1)^- \cap N(y_2)$ is non-empty. [Note that for $G$ bipartite $N(x_1)^-, N(y_2) \subseteq A$ and therefore $N(x_1)^- \cap N(y_2) \neq \emptyset$ also holds.] Let $z \in N(x_1)^- \cap N(y_2)$ and assume $z \in V(P_i)$ for some $i = 1, \dots, k$. If $i \neq 1, 2$, then we can replace $P_1, P_2, P_i$ with the two paths $y_1 P_1 x_1 z^+ P_i y_i$ and $x_i P_i z y_2 P_2 x_2$, which together cover all the vertices of $V(P_1) \cup V(P_2) \cup V(P_i)$ (see Figure 4.2 (a)). If $i = 1$, we replace $P_1, P_2$ with the path $y_1 P_1 z^+ x_1 P_1 z y_2 P_2 x_2$ (see Figure 4.2 (b)), and if $i = 2$, we replace $P_1, P_2$ with $y_1 P_1 x_1 z^+ P_2 y_2 z P_2 x_2$. In all three of these cases, we delete one edge and add two (i.e., we modify three edges) and reduce the number of paths by 1.
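By iterating this, we obtain a Hamiltonian path $P$ by modifying at most $3(k-1)$ edges. We can then complete this to a Hamiltonian cycle in the standard way. Let $x$ and $y$ be the endpoints of $P$, with $P$ oriented from $x$ to $y$. If $xy \in E(G)$, we simply add this edge. Otherwise $|N(x)^-| = |N(x)| > n/2$ and $|N(y)| > n/2$, and both sets are contained in $V \setminus \{y\}$, so there is some $z \in N(x)^- \cap N(y)$; then $x P z y P z^+ x$ is a Hamiltonian cycle, obtained by deleting the edge $z z^+$ and adding the edges $z y$ and $z^+ x$. In either case we modify at most three further edges, for a total of at most $3k$.
This completes the proof of the three claims and hence of the theorem.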
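As an illustration of the inductive step in the proof of Lemma 4.7 (general, non-bipartite case), the following Python sketch merges a system of vertex-disjoint paths into a single Hamiltonian path. It is a sketch under stated assumptions, not the authors' implementation: the names `merge_step` and `merge_all` are hypothetical, `adj` is assumed to map each vertex to its set of neighbours, and each path is a list of vertices oriented from $x_i$ (first entry) to $y_i$ (last entry).

```python
def merge_step(adj, paths):
    """Reduce the number of paths by one, changing at most three edges."""
    p1, p2 = paths[0], paths[1]
    x1, y2 = p1[0], p2[-1]

    # Easy case: x_1 is adjacent to the start x_i of another path.
    for i in range(1, len(paths)):
        if paths[i][0] in adj[x1]:
            rest = [p for t, p in enumerate(paths) if t not in (0, i)]
            return [p1[::-1] + paths[i]] + rest

    # Otherwise look for z with z^+ in N(x_1) and z in N(y_2).
    for i, pi in enumerate(paths):
        for pos in range(len(pi) - 1):
            z, z_plus = pi[pos], pi[pos + 1]
            if z_plus not in adj[x1] or z not in adj[y2]:
                continue
            if i == 0:  # y_1 P_1 z^+ x_1 P_1 z y_2 P_2 x_2
                return [p1[:pos:-1] + p1[:pos + 1] + p2[::-1]] + paths[2:]
            if i == 1:  # y_1 P_1 x_1 z^+ P_2 y_2 z P_2 x_2
                return [p1[::-1] + p2[pos + 1:] + p2[pos::-1]] + paths[2:]
            # z lies on a third path P_i: two new paths replace three.
            rest = [p for t, p in enumerate(paths) if t not in (0, 1, i)]
            return [p1[::-1] + pi[pos + 1:], pi[:pos + 1] + p2[::-1]] + rest
    raise ValueError("no suitable z found; is the minimum degree above n/2?")


def merge_all(adj, paths):
    """Iterate merge_step until a single Hamiltonian path remains."""
    while len(paths) > 1:
        paths = merge_step(adj, paths)
    return paths[0]
```

The search for $z$ is a plain scan over all consecutive pairs on all paths; the degree condition $\delta(G) > n/2$ is what guarantees that the scan succeeds whenever the easy case does not apply.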

Remarks regarding the density assumption
It is perhaps interesting to note that, in general, some assumption on the minimum degree of the monotone graph is necessary for the argument in the proof of Theorem 4.3 to work. Without it, it is not necessarily true that the number of 2-factors is at most a polynomial factor larger than the number of Hamiltonian cycles of a given graph $G$. See the matrix below for an indication of a family of instances that witnesses this claim.
We next explain why this claim is true. Let the rows be indexed by $A = (a_1, \dots, a_n)$ and the columns by $B = (b_1, \dots, b_n)$. As $a_1$ only has two neighbours, any Hamiltonian cycle must contain the edges $a_1 b_1$ and $a_1 b_2$. This is indicated in the matrix below.
Note that in both the matrices above, there is now one vertex in $B$ that has two neighbours already (and therefore cannot be chosen as a neighbour in any later step). By repeating this argument, one can show that for every row $i = 2, \dots, n-1$ there are two possible choices of extending the current Hamiltonian path, and so the number of Hamiltonian cycles equals $2^{n-2}$.
However, the number of 2-factors is at least $(n/4)!$. To see this, first note that this is a lower bound on the number of Hamiltonian cycles in the (complete) subgraph induced by the vertices $\{a_{3n/4+1}, \dots, a_n\}$ and $\{b_1, \dots, b_{n/4}\}$ (assuming that $n$ is divisible by four). It is not hard to see that any Hamiltonian cycle on this induced subgraph can be extended to a 2-factor of the original bipartite graph.
Nevertheless, we believe that our result can be generalized to monotone graphs with minimum degree $\gamma n$ for any $\gamma \in (0, 1)$. However, this comes at the expense of many more technicalities that (in our opinion) do not offer any additional insights. Recall that in Claim 4, we show that the nodes of $G$ can be partitioned into two complete bipartite graphs whenever $\gamma \ge 1/2$. More generally, for a given $\gamma \in (0, 1)$, it should be possible to partition the nodes of $G$ into a constant number $c = c(\gamma)$ of complete bipartite graphs. The analogue of Claim 5 would then be to show that all cycles in a given 2-factor can be broken up, and glued together again, into a constant number $d(\gamma)$ of (vertex-disjoint) paths, after which one would need to argue that the resulting collection of paths is close, in terms of symmetric difference, to a Hamiltonian cycle in the monotone graph.
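To get a feel for the gap between the two counts discussed in this section, namely $2^{n-2}$ Hamiltonian cycles versus at least $(n/4)!$ 2-factors, the following Python sketch compares them on a logarithmic scale. The function name `log2_gap` is hypothetical, and $n$ is assumed to be divisible by four.

```python
import math


def log2_gap(n):
    """Return log2 of (n/4)! / 2^(n-2), a lower bound on the log-ratio
    of the number of 2-factors to the number of Hamiltonian cycles."""
    return math.log2(math.factorial(n // 4)) - (n - 2)


for n in (40, 400, 4000):
    # Compare against log2 of the polynomial n^10 as a reference point.
    print(n, round(log2_gap(n), 1), round(10 * math.log2(n), 1))
```

For small $n$ the factorial bound is still below $2^{n-2}$, so the gap only becomes visible for larger $n$; asymptotically, $\log_2 (n/4)!$ grows like $\tfrac{n}{4}\log_2 n$ and eventually exceeds $n - 2$ plus the logarithm of any fixed polynomial.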

A Rapid mixing of switch-based chains for sampling 2-factors in dense graphs
This section is a modification of certain parts in [1].
We will tailor all definitions to the notion of 2-factors for the sake of readability. Let $\mathbf{2} = (2, 2, \dots, 2)$ be the all-twos sequence of length $n$. Let $G$ be a given (dense) undirected graph and let $\mathcal{F}_G$ be the set of all 2-factors of $G$.
We write $G(d')$ for the set of all subgraphs of $G$ with degree sequence $d'$. Let $\mathcal{F}'_G = \cup_{d'} G(d')$ with $d'$ ranging over the set $\{d' : d'_j \le 2 \text{ for all } j, \text{ and } \sum_{j=1}^{n} (2 - d'_j) \le 2\}$. In other words, $\mathcal{F}'_G$ is the set of almost 2-factors, that is, subgraphs of $G$ with degree sequence $d'$ where (i) $d' = \mathbf{2}$, or (ii) there exist distinct $\kappa, \lambda$ such that $d'_i = 1$ if $i \in \{\kappa, \lambda\}$ and $d'_i = 2$ otherwise, or (iii) there exists a $\kappa$ such that $d'_i = 0$ if $i = \kappa$ and $d'_i = 2$ otherwise. In case (ii) we say that $d'$ has two vertices with degree deficit one, and in case (iii) we say that $d'$ has one vertex with degree deficit two.
A family $\mathcal{D}$ of graphs $G$ is called P-stable [7] if there exists a polynomial $q(n)$ such that for all $G \in \mathcal{D}$ we have $|\mathcal{F}'_G| / |\mathcal{F}_G| \le q(n)$, where $n$ is the number of vertices of $G$.
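For very small graphs, the ratio in the definition of P-stability can be computed by brute force. The Python sketch below does this under illustrative assumptions: the graph is given as a list of edges on vertices $0, \dots, n-1$, the function name `count_factors` is hypothetical, and almost 2-factors are taken to be the subgraphs with all degrees at most 2 and total degree deficit at most 2 (so they have either $n$ or $n-1$ edges).

```python
from itertools import combinations


def count_factors(n, edges):
    """Return (number of 2-factors, number of almost 2-factors).

    The second count includes the 2-factors themselves.
    """
    exact = almost = 0
    # Total deficit 0 means n edges; total deficit 2 means n - 1 edges.
    for size in (n - 1, n):
        for chosen in combinations(edges, size):
            deg = [0] * n
            for u, v in chosen:
                deg[u] += 1
                deg[v] += 1
            if max(deg) > 2:
                continue
            deficit = sum(2 - d for d in deg)
            almost += 1
            if deficit == 0:
                exact += 1
    return exact, almost


# Example: the complete graph on five vertices.
k5 = list(combinations(range(5), 2))
print(count_factors(5, k5))  # (12, 97)
```

In this example the twelve 2-factors are the Hamiltonian cycles of $K_5$, and the remaining 85 almost 2-factors are Hamiltonian paths, a triangle plus a disjoint edge, or a 4-cycle plus an isolated vertex. The enumeration is exponential in the number of edges and is only meant to make the definition concrete.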
Jerrum and Sinclair [7] define a Markov chain that, tailored to 2-factors, works as follows.
Let $F \in \mathcal{F}'_G$ be the current state of the JS chain. Choose an ordered pair of vertices $(i, j)$ uniformly at random:
1. if $F \in \mathcal{F}_G$ and $(i, j)$ is an edge of $F$, delete $(i, j)$ from $F$ (Type 0 transition);
2. if $F \notin \mathcal{F}_G$, the degree of $i$ in $F$ is less than 2, and $(i, j)$ is not an edge of $F$, add $(i, j)$ to $F$ if this edge is in $G$; if this causes the degree of $j$ to exceed 2, select an edge $(j, k)$ uniformly at random from $F$ and delete it (Type 1 transition).
If the degree of $j$ does not exceed 2 in the second case, we call this a Type 2 transition.
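The three transition types can be sketched in Python as follows. This is an illustrative sketch rather than the chain exactly as defined in [7]: the function names are hypothetical, edges are stored as `frozenset` pairs, the pair $(i, j)$ is drawn among pairs of distinct vertices, and in a Type 1 transition the deleted edge $(j, k)$ is assumed to be chosen among the edges at $j$ that were present before $(i, j)$ was added.

```python
import random


def degrees(n, edges):
    """Degree of every vertex 0..n-1 in the given edge set."""
    deg = [0] * n
    for e in edges:
        for v in e:
            deg[v] += 1
    return deg


def js_step(n, graph_edges, state, rng):
    """Perform one step of the (sketched) JS chain and return the new state."""
    i, j = rng.sample(range(n), 2)  # ordered pair of distinct vertices
    e = frozenset((i, j))
    deg = degrees(n, state)
    new = set(state)
    if all(d == 2 for d in deg):
        if e in new:
            new.remove(e)  # Type 0: delete an edge of a 2-factor
        return new
    if deg[i] < 2 and e not in new and e in graph_edges:
        # Edges at j before the addition (assumption: k is drawn from these).
        old_at_j = sorted(tuple(sorted(f)) for f in new if j in f)
        new.add(e)
        if deg[j] == 2:
            # Type 1: j would exceed degree 2, so drop a random old edge at j.
            new.remove(frozenset(rng.choice(old_at_j)))
        # Otherwise this was a Type 2 transition and nothing is removed.
    return new
```

Starting from a 2-factor, every state reached by `js_step` has all degrees at most 2 and total degree deficit at most 2, i.e., it is an almost 2-factor in the sense defined above.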
The graphs $F, F' \in \mathcal{F}'_G$ are JS adjacent if $F'$ can be obtained from $F$ with positive probability in one transition of the JS chain; note that this relation is symmetric. The properties of the JS chain, stated in Theorem A.1 below, are easy to check [7].
Theorem A.1. The JS chain on $\mathcal{F}'_G$ is irreducible, aperiodic and symmetric, and hence has uniform stationary distribution over $\mathcal{F}'_G$. Moreover, $P(F, F')^{-1} \le 2n^3$ for all JS adjacent $F, F' \in \mathcal{F}'_G$, and the maximum in- and out-degrees of the state space graph of the JS chain are bounded by $n^3$.
We say that two graphs $F, F' \in \mathcal{F}'_G$ are within distance $r$ in the JS chain if there exists a path of length at most $r$ from $F$ to $F'$ in the state space graph of the JS chain. By $\mathrm{dist}(F, \mathbf{2})$ we denote the minimum distance of $F \in \mathcal{F}'_G$ to an element of $\mathcal{F}_G$. The following parameter will play a central role in this work. Let $k_{JS}(G) = \max_{F \in \mathcal{F}'_G} \mathrm{dist}(F, \mathbf{2})$. Based on the parameter $k_{JS}$, we define the notion of strong stability [1].
Definition A.2 (Strong stability). A family of graphs $\mathcal{D}$ is called strongly stable if there exists a constant $\ell$ such that $k_{JS}(G) \le \ell$ for all $G \in \mathcal{D}$.
It is shown by Jerrum and Sinclair [7] that if $\mathcal{D}$ is the set of all graphs $G$ with $\delta(G) \ge n/2$, then $\mathcal{D}$ is strongly stable for $\ell = 3$. (This gives rise to the condition on the minimum degree in the statement of Theorem 4.1.) We now have all the ingredients for the proof of Theorem 4.1. It uses essentially the same argument as that in [1], where it is shown that the switch Markov chain for sampling graphs with given degrees is rapidly mixing for certain strongly stable classes of degree sequences, i.e., for the notion of strong stability in that setting.
Proof of Theorem 4.1. The high-level idea is to use an embedding argument which states that an efficient multi-commodity flow for the JS chain can be transformed into an efficient flow for the k-switch Markov chain.
The fact that there exists an efficient multi-commodity flow for the JS chain can be shown using exactly the same arguments as in Theorem 3.2 in [1]. Without going into all the details, we will give a sketch of this argument. Recall that Sinclair's multi-commodity flow method asks us to define a flow $f$ in the state space graph of the JS chain that routes $\pi(X)\pi(Y)$ units of flow from $X$ to $Y$ for every $X, Y \in \mathcal{F}'_G$. Here, $\pi$ denotes the (uniform) stationary distribution of the JS chain. The notion of strong stability allows us to take a shortcut here: instead of defining a flow between every two states in $\mathcal{F}'_G$, one can first define a flow between any two 2-factors $F, F' \in \mathcal{F}_G$. Then, roughly speaking, in order to define a flow between any two states in $\mathcal{F}'_G$, we use the fact that every 'almost 2-factor' $X \in \mathcal{F}'_G \setminus \mathcal{F}_G$ is close to some actual 2-factor in the state space graph, because of strong stability. These short paths between states in $\mathcal{F}'_G \setminus \mathcal{F}_G$ and $\mathcal{F}_G$ can be exploited to define the desired flow between any two states in $\mathcal{F}'_G$.
In order to define the flow between two 2-factors $F$ and $F'$, we decompose the symmetric difference $F \triangle F'$ into a collection of alternating circuits. We then use the operations defining the JS chain in order to transform $F$ into $F'$ by 'flipping' edges on an alternating resulting (simple) graph $J = (\mathcal{F}_G, A)$ and the resulting flow in this graph $f^*$. By what is said above, we have $\max_e f^*(e) \le p(n)$ for some polynomial $p$, i.e., the congestion of $f^*$ is at most a polynomial factor larger than that of $f$.
The final problem, before we obtain the desired flow $g$, is that the graph $J$ contains edges (possibly with flow) between 2-factors $F, F' \in \mathcal{F}_G$ that might be more than a $k$-switch away from each other. Said differently, these edges do not represent transitions in the $k$-switch Markov chain. Let us partition the edge set $A = A_{\text{switch}} \cup A_{\text{infeasible}}$, where $A_{\text{switch}}$ contains all edges of $A$ that represent a transition in the $k$-switch Markov chain, and $A_{\text{infeasible}}$ all those edges that do not.
We argue that for every edge $a = (F, F') \in A_{\text{infeasible}}$, we can always find a short 'detour' in the graph $J$ using only edges in $A_{\text{switch}}$. To see this, fix some $a \in A_{\text{infeasible}}$. Suppose that $X$ and $Y$ are adjacent in the JS chain and that $F = \psi(X)$ and $F' = \psi(Y)$ (these $X$ and $Y$ exist by existence of the infeasible edge $a$). Since $k_{JS} = 3$, it can be shown that $|F \triangle F'| \le 12$.
Intuitively, this follows from the fact that in the JS chain, $F = \psi(X)$ is close to $X$, which is close to $Y$, which is in turn close to $\psi(Y) = F'$. It follows that with at most $12/k$ switches of size at most $k$, which define the shortcut in $J$ using only edges in $A_{\text{switch}}$, we can transform $F$ into $F'$. This follows from the assumption of $k$-switch irreducibility. Since all these detours take place on a 'local' level, the congestion of the resulting multi-commodity flow for the $k$-switch Markov chain, which we get from rerouting the flow of infeasible edges over their respective shortcuts, increases by at most a polynomial factor on every fixed feasible edge in $J$. That is, for a fixed edge $b = (F_0, F_0') \in A_{\text{switch}}$, the total number of edges $a = (F, F') \in A_{\text{infeasible}}$ that use $b$ in their detour is at most $\mathrm{poly}(n)$, as (roughly speaking) $F_0$ is at most $12/k$ transitions away from $F$ by construction (and $k$ is constant).
This yields the desired flow g. For a precise and detailed outline of this idea, we refer the reader to [1].
Proof of Proposition 3.4. We claim that given $F_1, F_2 \in \mathcal{F}_G$, there is a $T \in \mathcal{F}_G$ that can be obtained from $F_1$ by a 4-switch such that $|T \triangle F_2| < |F_1 \triangle F_2|$. Applying this repeatedly proves the proposition, taking $\phi(k) = k$.
Let $F_1, F_2 \in \mathcal{F}_G$. Note that the symmetric difference of $F_1$ and $F_2$ is the vertex-disjoint union of circuits in which edges alternate between $F_1$ and $F_2$, and the circuits visit each vertex zero, one, or two times. If the symmetric difference of $F_1$ and $F_2$ contains such alternating circuits with four or six edges (corresponding to switches of size 2 or 3), then switching along such a circuit reduces the symmetric difference, so assume otherwise.
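The structural fact used here, that every vertex is incident to equally many $F_1$-only and $F_2$-only edges (zero, one, or two of each), is what makes the symmetric difference decompose into alternating circuits. A minimal Python sketch of this check follows; the function name `alternating_balance` is hypothetical and edges are represented as `frozenset` pairs.

```python
def alternating_balance(f1, f2):
    """Map each vertex of the symmetric difference to the pair
    (number of incident F1-only edges, number of incident F2-only edges)."""
    balance = {}
    for index, only in enumerate((f1 - f2, f2 - f1)):
        for edge in only:
            for v in edge:
                counts = balance.setdefault(v, [0, 0])
                counts[index] += 1
    return {v: tuple(c) for v, c in balance.items()}


# Example on six vertices: a 6-cycle versus two triangles.
f1 = {frozenset(e) for e in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]}
f2 = {frozenset(e) for e in [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]}
print(alternating_balance(f1, f2))
```

In this example the symmetric difference consists of the four edges $23$, $35$, $50$, $02$, which form a single alternating circuit of length four, so a switch of size 2 transforms $F_1$ into $F_2$.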
In this case it is not hard to see that we can find an $F_1, F_2$-alternating walk $P = a_1 a_2 a_3 a_4 a_5 a_6$ (here the $a_i$ are vertices and $a_1$ and $a_6$ are distinct) such that $a_1 a_2$, $a_3 a_4$, $a_5 a_6$ are edges of $F_1$, and $a_2 a_3$, $a_4 a_5$ are edges of $F_2$.
We try to find vertices $b$ and $c$ that are neighbours on $F_1$ such that $b \in N(a_1)$ and $c \in N(a_6)$. Then the circuit $C := a_1 a_2 a_3 a_4 a_5 a_6 c b a_1$ is a 4-switch for $F_1$. We choose $b$ and $c$ as follows. Orient the cycles of $F_1$ arbitrarily. For a vertex $v$, we write $v^+$ for the vertex following $v$ in this orientation and $v^-$ for the vertex preceding it. Set $M = \{v^+ : v \in N(a_6)\}$ and consider