Uniform generation of spanning regular subgraphs of a dense graph

Let $H_n$ be a graph on $n$ vertices and let $\overline{H_n}$ denote the complement of $H_n$. Suppose that $\Delta = \Delta(n)$ is the maximum degree of $\overline{H_n}$. We analyse three algorithms for sampling $d$-regular subgraphs ($d$-factors) of $H_n$. This is equivalent to uniformly sampling $d$-regular graphs which avoid a set $E(\overline{H_n})$ of forbidden edges. Here $d=d(n)$ is a positive integer which may depend on $n$. Two of these algorithms produce a uniformly random $d$-factor of $H_n$ in expected runtime which is linear in $n$ and low-degree polynomial in $d$ and $\Delta$. The first algorithm applies when $(d+\Delta)d\Delta = o(n)$. This improves on an earlier algorithm by the first author, which required constant $d$ and at most a linear number of edges in $\overline{H_n}$. The second algorithm applies when $H_n$ is regular and $d^2+\Delta^2 = o(n)$, adapting an approach developed by the first author together with Wormald. The third algorithm is a simplification of the second, and produces an approximately uniform $d$-factor of $H_n$ in time $O(dn)$. Here the output distribution differs from uniform by $o(1)$ in total variation distance, provided that $d^2+\Delta^2 = o(n)$.


Introduction
Enumeration and uniform generation of graphs of given degrees has been an active research area for four decades, with many applications. Research on graph enumeration started in 1978 when Bender and Canfield [2] obtained the asymptotic number of graphs with bounded degrees. The constraint of bounded degrees was slightly relaxed by Bollobás [3] to the order of $\sqrt{\log n}$. A further relaxation on the degree constraints was obtained by McKay [14] to regular, and the maximum degree of H n can be linear. We are not aware of any other algorithms which have explicit polynomial bounds on the runtime and vanishing bounds on the approximation error. For approximate sampling, the Jerrum-Sinclair algorithm [11, Section 4] generates an approximately uniform d-factor of H n in polynomial time, as long as d + ∆ ≤ n/2 + 1. It seems unlikely that the approach of Cooper et al. [4] can be adapted to the setting of sampling d-factors, due to the complexity of the analysis. Erdős et al. [6] analysed a Markov chain algorithm which uniformly generates bipartite graphs with a given half-regular degree sequence, avoiding a set of edges which is the union of a 1-factor and a star. Here "half-regular" means that the degrees on one side of the bipartition are all the same, with the possible exception of the centre of the star. The aim of this paper is to develop efficient algorithms that sample d-factors of H n uniformly or approximately uniformly. We will describe and analyse three different algorithms, which we call FactorEasy, FactorUniform and FactorApprox. Our main focus is FactorUniform, which is an algorithm for uniformly generating d-factors of an (n − 1 − ∆)-regular host graph H n , when d and ∆ are not too large. For smaller d and ∆, the simpler algorithm FactorEasy is more efficient, and does not require H n to be regular (here ∆ is the maximum degree of $\overline{H_n}$).
Finally, FactorApprox is a linear-time algorithm for generating d-factors of H n asymptotically approximately uniformly, under the same conditions as FactorUniform.
Our results are stated formally below. All asymptotics are as the number of vertices n tends to infinity, along even integers if d is odd. Throughout, d = d(n) and ∆ = ∆(n) are positive integers which may depend on n.
Remark: For simplicity we consider regular spanning subgraphs in this paper, and the host graph is assumed regular for FactorUniform and FactorApprox. However, all of these algorithms are flexible and can be modified to cope with more general degree sequences for the spanning subgraph and for the host graph. For instance, the McKay-Wormald algorithm [15] can uniformly generate a subgraph of K n with a given degree sequence where the maximum degree is not too large. We believe that FactorEasy can be easily modified for general degree sequences, by calling [15] and then slightly modifying the analysis, with essentially the same switching. To cope with denser irregular subgraphs or sparser irregular host graphs, FactorUniform and FactorApprox can be modified accordingly, by possibly introducing new types of switchings. We do not pursue this here.
In Section 2, we describe the common framework of FactorEasy and FactorUniform, and define some key parameters that will appear in these two algorithms and in FactorApprox. More detail on the structure of the paper can be found at the end of Section 3.1.

The framework for uniform generation
The new approach of Gao and Wormald [8,9] gives a common framework for Algorithms FactorEasy and FactorUniform, which we will now describe. The Gao-Wormald scheme reduces to the McKay-Wormald algorithm [15] by setting certain parameters to some trivial values. Our algorithm FactorEasy is indeed an adaptation of the simpler McKay-Wormald algorithm, whereas FactorUniform uses the full power of [8,9], which allows it to cope with a larger range of d and ∆.
It is convenient to think of the host graph H n as being defined by a 2-colouring of the complete graph K n with the colours red and black, where edges of $\overline{H_n}$ are coloured red, while edges of H n are coloured black. Then our aim is to uniformly sample d-factors of K n which contain no red (forbidden) edges.
Both algorithms FactorEasy and FactorUniform begin by generating a uniformly random d-regular graph G on {1, 2, . . . , n}. For the range of d which we consider, this can be done using the Gao-Wormald algorithm REG [8,9]. Typically this initial graph G will contain some red edges. Let S i denote the set of all d-regular graphs containing precisely i red edges. The set S i is called a stratum. For some positive integer parameter i 1 , which we must define for each algorithm, let $A_0 = \bigcup_{i=0}^{i_1} S_i$. If the initial graph G does not belong to A 0 then the algorithm will reject G and restart. Otherwise, the initial graph G belongs to A 0 and so it does not contain too many red edges. Then the algorithm will perform a sequence of switching operations (which we must define), starting from G, until it reaches a d-regular graph with no red edges. At each switching step there is a chance of a rejection: if a rejection occurs then the algorithm will restart. This rejection scheme must also be defined, for each algorithm.
In FactorEasy, only one type of switching (Type I) is used. Each such switching reduces the number of red edges by exactly one. As soon as FactorEasy reaches a d-regular graph with no red edges, it outputs that graph, provided that no rejection has occurred. When (d + ∆)d∆ is of order n or greater, the probability of a rejection occurring in FactorEasy is very close to 1 and FactorEasy becomes inefficient.
FactorUniform reduces the probability of rejection by permitting some switchings that are invalid in FactorEasy, as well as introducing other types of switchings. Switchings which typically reduce the number of red edges by exactly one will still be the most frequently applied switchings, although in FactorUniform we relax these switchings slightly so that certain operations which were forbidden in FactorEasy will be permitted in FactorUniform. The other types of switchings do not necessarily reduce the number of red edges. Rather counterintuitively, some switchings will create more red edges. The new types of switchings are introduced for the same reason as the use of the rejection scheme: to remedy the distortion of the distribution which arises by merely applying Type I switchings.
We will specify parameters ρ τ (i), for 0 ≤ i ≤ i 1 and τ ∈ Γ, where Γ is the set of the types of switchings to be applied in FactorUniform. In each step of FactorUniform, ignoring rejections that may occur with a small probability, a switching type τ ∈ Γ is chosen with probability ρ τ (i) if the current graph G is in S i . Then, given τ , a random switching of type τ is performed. As mentioned before, the Type I switchings are the most common: in fact, we set ρ I (i) close to 1 for each 0 ≤ i ≤ i 1 . If the current graph lies in S 0 then a Type I switching applied to that graph will simply output the graph, but any other type of switching will not. Hence FactorUniform does not always immediately produce output as soon as it reaches a graph in S 0 , unlike FactorEasy.
As a preparation, we compute the expected number of red edges in a random d-regular subgraph of K n .
Lemma 2.1. Let G be a uniformly random d-regular graph on {1, 2, . . . , n}. The expected number of red edges in G is $|E(\overline{H_n})|\,d/(n-1)$.
Proof. Let uv be a red edge in K n (that is, an edge in $\overline{H_n}$). We know that u is incident with d edges in G, and by symmetry each of the n − 1 edges incident with u in K n is equally likely to be in G. Thus, the probability that uv ∈ G is d/(n − 1). There are exactly $|E(\overline{H_n})|$ red edges in K n . By linearity of expectation, the expected number of red edges in G is $|E(\overline{H_n})|\,d/(n-1)$.
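Lemma 2.1 can be checked by brute force on a very small instance. The following Python sketch enumerates every d-regular graph on n labelled vertices and averages the number of red edges. The values n = 6, d = 2 and the red edge set are hypothetical choices made only for illustration, and the function names are not from the paper.

```python
from fractions import Fraction
from itertools import combinations


def regular_graphs(n, d):
    """Yield every d-regular graph on vertices 0..n-1 as a tuple of edges."""
    all_edges = list(combinations(range(n), 2))
    for edges in combinations(all_edges, n * d // 2):
        deg = [0] * n
        for u, v in edges:
            deg[u] += 1
            deg[v] += 1
        if all(x == d for x in deg):
            yield edges


def expected_red_edges(n, d, red):
    """Exact average number of red edges over all d-regular graphs on n vertices."""
    red = {tuple(sorted(e)) for e in red}
    total = count = 0
    for edges in regular_graphs(n, d):
        total += sum(e in red for e in edges)
        count += 1
    return Fraction(total, count)


# Hypothetical small instance: the red edges play the role of the forbidden edge set.
n, d = 6, 2
red = [(0, 1), (1, 2), (3, 4)]
lhs = expected_red_edges(n, d, red)      # brute-force expectation
rhs = Fraction(len(red) * d, n - 1)      # formula from Lemma 2.1
print(lhs, rhs)
```

The enumeration is exponential in the number of edges, so this is only a sanity check for tiny n, not a sampling method.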

The algorithm FactorEasy
Define The following is a direct corollary of Lemma 2.1, using Markov's inequality.
We now define the switching operation which we use in FactorEasy, called a 3-edge-switching. To define a 3-edge-switching from the current graph G, choose a sequence of vertices (v 0 , . . . , v 5 ) such that v 0 v 1 is a red edge in G, v 2 v 3 and v 4 v 5 are edges in G (with repetitions allowed), and the choice satisfies the following conditions:
• v 2 v 3 and v 4 v 5 are black edges in G;
• v 0 v 5 , v 1 v 2 and v 3 v 4 are all non-present in G, and are all black in K n ;
• the vertices v 0 , . . . , v 5 are distinct, except that v 2 = v 5 is permitted.
We say that the 6-tuple v = (v 0 , v 1 , . . . , v 5 ) is valid if it satisfies these conditions. Given a valid 6-tuple v, the 3-edge-switching determined by v deletes the three edges v 0 v 1 , v 2 v 3 , v 4 v 5 and replaces them with the edges v 1 v 2 , v 3 v 4 , v 0 v 5 , producing a new graph G ′ . This switching operation is denoted by (G, v) → G ′ , and is illustrated in Figure 1.

Figure 1: A 3-edge switching
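As a concrete illustration of the validity conditions and of the operation (G, v) → G ′, here is a minimal Python sketch. The representation (a graph as a set of frozenset edges, the red edges as a second set) and the function names are assumptions made for this example, not part of the paper.

```python
def e(u, v):
    """Undirected edge as a frozenset, so that e(u, v) == e(v, u)."""
    return frozenset((u, v))


def is_valid_3_switch(G, red, v):
    """Check the validity conditions for the 6-tuple v = (v0, ..., v5)."""
    v0, v1, v2, v3, v4, v5 = v
    distinct = len(set(v))
    # All vertices distinct, except that v2 == v5 is permitted.
    if distinct < 5 or (distinct == 5 and v2 != v5):
        return False
    # v0v1 must be a red edge of G.
    if e(v0, v1) not in G or e(v0, v1) not in red:
        return False
    # v2v3 and v4v5 must be black edges of G.
    for x in (e(v2, v3), e(v4, v5)):
        if x not in G or x in red:
            return False
    # v1v2, v3v4 and v0v5 must be absent from G and black in K_n.
    for x in (e(v1, v2), e(v3, v4), e(v0, v5)):
        if x in G or x in red:
            return False
    return True


def apply_3_switch(G, v):
    """Delete v0v1, v2v3, v4v5 and insert v1v2, v3v4, v0v5."""
    v0, v1, v2, v3, v4, v5 = v
    removed = {e(v0, v1), e(v2, v3), e(v4, v5)}
    added = {e(v1, v2), e(v3, v4), e(v0, v5)}
    return (G - removed) | added


# Hypothetical example: G is the 8-cycle 0-1-...-7-0 and the edge {0, 1} is red.
G = {e(i, (i + 1) % 8) for i in range(8)}
red = {e(0, 1)}
v = (0, 1, 3, 4, 6, 5)
if is_valid_3_switch(G, red, v):
    G_new = apply_3_switch(G, v)
    print(sorted(tuple(sorted(x)) for x in G_new))
```

In this example the switching removes the only red edge and keeps every degree equal to 2, in line with the statement that each 3-edge switching reduces the number of red edges by exactly one.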
Each 3-edge switching reduces the number of red edges by exactly one. The inverse operation, obtained by reversing the arrow in Figure 1, is called an inverse 3-edge switching. The 6-tuple (v 0 , v 1 , . . . , v 5 ) is valid for the inverse 3-edge switching if exactly one new red edge v 0 v 1 is introduced, no multiple edges are introduced, and all vertices are distinct except Define Let f (G) denote the number of valid 6-tuples (v 0 , v 1 , . . . , v 5 ) which determine a 3-edge switching that can be applied to G, and let b(G) denote the number of valid 6-tuples (v 0 , v 1 , . . . , v 5 ) which determine an inverse 3-edge switching that can be applied to G. In the following lemma, we show that f (G) is approximately m(i) and b(G) is approximately m(i) for G ∈ S i . Hence The proof of Lemma 3.2 is deferred to Section 3.2.

Algorithm FactorEasy: definition and analysis
First, FactorEasy calls the Gao-Wormald algorithm REG [8,9] to generate a uniformly random d-regular graph G on {1, 2, . . . , n}. If G contains more than i 1 red edges then FactorEasy restarts. Otherwise, FactorEasy iteratively performs a sequence of switching steps. Let G t be the graph obtained after t switching steps, and assume G t = G ∈ S i . The (t + 1)-th switching step is composed of the following substeps:
(i) If i = 0 then output G.
(ii) If i > 0 then uniformly at random choose a red edge in G and choose two further edges in G (of any colour), with repetition allowed. Randomly label the two endvertices of the red edge as v 0 and v 1 , and the endvertices of the other two edges as v 2 and v 3 , and v 4 and v 5 respectively. If v = (v 0 , . . . , v 5 ) is a valid 6-tuple then let (G, v) → G ′ be the 3-edge switching induced by this 6-tuple, as in Figure 1. Otherwise (when the 6-tuple is not valid), perform an f-rejection.
(iii) If no f-rejection is performed then perform a b-rejection with probability .
As mentioned before, if any rejection occurs then FactorEasy restarts.
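The substeps above can be sketched in Python as follows. This is a rough illustration under stated assumptions: graphs are sets of frozenset edges, and the b-rejection probability is passed in as a hypothetical callable `b_reject_prob`, a placeholder defaulting to zero rather than the quantity used in the paper. The function names are likewise not from the paper.

```python
import random


def e(u, v):
    return frozenset((u, v))


def is_valid(G, red, v):
    """Validity conditions for a 6-tuple, as in the definition of a 3-edge-switching."""
    v0, v1, v2, v3, v4, v5 = v
    k = len(set(v))
    if k < 5 or (k == 5 and v2 != v5):
        return False
    if e(v0, v1) not in G or e(v0, v1) not in red:
        return False
    if any(x not in G or x in red for x in (e(v2, v3), e(v4, v5))):
        return False
    return not any(x in G or x in red for x in (e(v1, v2), e(v3, v4), e(v0, v5)))


def factor_easy_step(G, red, rng, b_reject_prob=lambda G_new: 0.0):
    """One switching step. Returns ("output", G), ("switched", G') or ("reject", None).

    b_reject_prob is a HYPOTHETICAL placeholder for the b-rejection probability.
    """
    red_in_G = sorted(tuple(sorted(x)) for x in G & red)
    if not red_in_G:                      # substep (i): no red edges remain
        return "output", G
    edges = sorted(tuple(sorted(x)) for x in G)
    # Substep (ii): one red edge, then two further edges with repetition allowed.
    picks = [rng.choice(red_in_G), rng.choice(edges), rng.choice(edges)]
    v = []
    for a, b in picks:                    # random labelling of the endvertices
        v.extend((a, b) if rng.random() < 0.5 else (b, a))
    if not is_valid(G, red, v):
        return "reject", None             # f-rejection
    v0, v1, v2, v3, v4, v5 = v
    G_new = (G - {e(v0, v1), e(v2, v3), e(v4, v5)}) | {e(v1, v2), e(v3, v4), e(v0, v5)}
    # Substep (iii): b-rejection with the (placeholder) probability.
    if rng.random() < b_reject_prob(G_new):
        return "reject", None
    return "switched", G_new


# Hypothetical example: the 8-cycle with a single red edge {0, 1}.
G = {e(i, (i + 1) % 8) for i in range(8)}
red = {e(0, 1)}
status, H = factor_easy_step(G, red, random.Random(1))
print(status)
```

With the placeholder b-rejection the output is not claimed to be uniform; the sketch only shows the control flow of a single step, and a full run would restart from a fresh random d-regular graph after any rejection.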
Proof of Theorem 1.1. First we prove uniformity by induction. Recall that G 0 is a uniformly random d-regular graph in A 0 if no initial rejection occurs. If G 0 ∈ S i then clearly G 0 is uniformly distributed over S i . Next we prove that if G t is uniformly distributed over S i then G t+1 is uniformly distributed over S i−1 , assuming that no rejection occurs. For every G ∈ S i and G ′ ∈ S i−1 , let Ψ(G, G ′ ) denote the set of valid 6-tuples v = (v 0 , . . . , v 5 ) such that (G, v) → G ′ is a 3-edge-switching, and let Ψ( Given G t = G, note that m(i) is exactly the number of choices of a 6-tuple of vertices (v 0 , . . . , v 5 ), with repetition allowed, such that v 0 v 1 is a red edge in G, and v 2 v 3 and v 4 v 5 are edges of G. Thus, the probability that G is converted to G ′ without f-rejection in step t + 1 is equal to The probability that no b-rejection occurs is equal to which is invariant over all graphs G ∈ S i by the inductive hypothesis. Then using the fact that the union in
Next, we prove that if (d + ∆)d∆ = o(n) then FactorEasy runs in $O((d+\Delta)^3 n)$ time in expectation. The runtime for generating a random d-regular graph on {1, 2, . . . , n} using the Gao-Wormald algorithm [9] is $O(d^3 n)$ in expectation. By Lemma 3.2, the probability of an f-rejection or a b-rejection in each step is O((d + ∆)/n). By definition of i 1 and A 0 , the algorithm FactorEasy performs at most $i_1 = O(|E(\overline{H_n})|\,d/n) = O(d\Delta)$ switching steps. Thus, the overall probability of any f-rejection or b-rejection is at most $O(d\Delta \cdot (d+\Delta)/n) = o(1)$. It follows from this and from Corollary 3.1 that in expectation FactorEasy restarts O(1) times. It only remains to show that in each run of FactorEasy without rejection, the time complexity is $O((d+\Delta)^3 n)$. We will prove this part in Section 6.
We close this section by proving the following lemma, which follows easily from Lemma 3.2. This result, which will be useful later, only requires a rather weak condition on d and ∆.
Proof. We have The result follows by Lemma 3.2.
When (d + ∆)d∆ is no longer negligible compared to n, the probability that an f-rejection or b-rejection occurs in FactorEasy before reaching S 0 is very close to 1. In this case, FactorEasy becomes very inefficient as it has to restart many times. In Section 5 we define FactorUniform, which will use 4-edge switchings instead of 3-edge switchings, giving more room for performing valid operations. In addition, various new ideas will be incorporated into the design of FactorUniform to achieve uniformity in the output and efficiency when $d^2 + \Delta^2 = o(n)$. Since the uniform sampler FactorUniform can be treated as an extension of the approximate sampler FactorApprox, we will introduce FactorApprox first, in Section 4 below. All runtime analysis is deferred to Section 6, where the proofs of our main theorems are completed.
For the algorithms FactorUniform and FactorApprox, we restrict the host graph H n to be regular (specifically, (n−∆−1)-regular), to simplify the analysis. However, we believe that FactorUniform (and FactorApprox) can be modified to work for irregular host graphs H n .
Before moving on to these more advanced algorithms, we first complete the analysis of FactorEasy by proving Lemma 3.2.

Proof of Lemma 3.2
For the upper bound of f (G), note that there are exactly 2i ways to choose (v 0 , v 1 ), and then at most dn ways to choose (v 2 , v 3 ) and at most dn ways to choose (v 4 , v 5 ).
For the lower bound of f (G), there are 2i ways to choose (v 0 , v 1 ). Given that, there are at least (dn − 2i − 4d) ways to choose (v 2 , v 3 ) such that v 2 v 3 is a black edge in G, and v 0 , v 1 , v 2 and v 3 are all distinct. Then there are at least (dn − 2i − 7d) ways to choose (v 4 , v 5 ) such that v 4 v 5 is a black edge in G and . This gives at least 2i(dn − 2i − 4d)(dn − 2i − 7d) choices of (v 0 , . . . , v 5 ), but some of these choices are not valid: specifically, we must subtract the number of choices such that one of v 0 v 5 , v 1 v 2 or v 3 v 4 is either an edge in G (either black or red), or is a red non-edge (that is, an edge in $\overline{H_n} \cap \overline{G}$).
• There are at most $3 \cdot 2i\,d^2 \cdot dn$ choices such that v 0 v 5 or v 1 v 2 or v 3 v 4 is an edge in G;
• There are at most $3 \cdot 2i \cdot dn \cdot \Delta d$ choices such that v 0 v 5 or v 1 v 2 or v 3 v 4 is a red non-edge.
Subtracting these yields the desired lower bound for f (G).
, but some of these choices are not valid: specifically, we must subtract the number of choices where one of v 2 v 3 , v 4 v 5 is either an edge in G or a red non-edge, or one of v 0 v 5 , v 1 v 2 is a red edge.
• There are at most $2 \cdot 2|E(\overline{H_n})|\,d^3(d+\Delta)$ choices such that v 2 v 3 or v 4 v 5 is either an edge in G or a red non-edge;
• There are at most $2 \cdot 2i\Delta d \cdot dn$ choices such that v 0 v 5 or v 1 v 2 is a red edge.
Subtracting these gives the stated lower bound for b(G). Finally, the last statement of Lemma 3.2 follows immediately from the above bounds by (3.1) and noting that $|E(\overline{H_n})| \le \Delta n/2$.

The approximate algorithm: FactorApprox
We now assume that H n is (n − ∆ − 1)-regular, which implies that $|E(\overline{H_n})| = \Delta n/2$. By Lemma 2.1, the expected number of red edges in a random d-factor of K n is asymptotic to ∆d/2. This leads to the following definition and corollary.
Define
Gao and Wormald [8,9] gave an algorithm called REG* for generating regular graphs asymptotically approximately uniformly in runtime O(dn). FactorApprox uses REG* to generate a random d-regular graph on K n . When $d^2 = o(n)$, the distribution of the output of REG* differs from the uniform by o(1) in total variation distance. This implies the following result, using Corollary 4.1.
Both FactorUniform and FactorApprox use 4-edge-switchings, which we now define. To define a 4-edge-switching from the current graph G, choose a sequence of 8 vertices
• no two of the eight vertices are equal except for possibly v 2 = v 7 ;
Figure 2. The resulting graph is denoted by G ′ . This operation is called a 4-edge-switching. We say that the 4-edge-switching is valid if it arises from a valid 8-tuple. Under the switching, the colour of each edge or non-edge stays the same, but edges become non-edges and vice-versa. In Figure 2, a solid line indicates an edge of G and a dashed line represents an edge of $\overline{G}$; that is, a non-edge in G. We label each edge (or non-edge) by the allowed colour, where 'b', 'r' and 'b/r' denote 'black', 'red', and 'black or red', respectively.

Figure 2: A 4-edge-switching
The structure of FactorApprox is similar to that of FactorEasy, except that there is no rejection after the first step.
First, FactorApprox uses REG* to repeatedly generate a random d-regular graph on the vertex set {1, 2, . . . , n} until the resulting graph belongs to A 0 . Next, FactorApprox repeatedly applies random 4-edge switchings from the current graph, until a graph without red edges is produced. Finally, FactorApprox outputs this graph, which is a d-factor of H n .
To choose a uniformly random 4-edge switching from a current graph G = G t , we can uniformly at random choose a red edge and label its end vertices by v 0 and v 1 . Then uniformly choose three edges of G (repetition allowed) and label their endvertices. If the resulting 8-tuple is valid then perform the corresponding 4-edge-switching to produce the new graph G ′ = G t+1 . Otherwise, repeat until a valid 4-edge-switching is obtained.
Efficiency. The runtime of REG* is O(dn) and, by Corollary 4.2, in expectation a constant number of attempts will be sufficient to generate a random d-regular graph with at most i 1 red edges. Now, for a given graph G, consider choosing v 0 v 1 to be a uniformly random red edge of G and choosing each of v 2 v 3 , v 4 v 5 , v 6 v 7 to be a uniformly random edge of G (with repetition allowed). It is easy to see that with high probability, this random choice of
Correctness. It only remains to show that FactorApprox outputs a random d-factor of H n whose distribution differs from the uniform by o(1) in total variation distance. We will prove this in Section 6.

The exactly uniform sampler: FactorUniform
In this section, our aim is to define and analyse an algorithm for uniform generation of d-factors of H n , which is efficient for larger values of d and ∆ than FactorEasy. Specifically, we assume that $d^2 + \Delta^2 = o(n)$. As in Section 4, we assume that H n is (n − ∆ − 1)-regular and define i 1 and A 0 as in (4.1), (4.2).
The analysis of FactorEasy given in Section 3.1 shows that the probability of an f-rejection or a b-rejection depends on the gap between the upper and lower bounds of f (G) and b(G). To obtain an algorithm which is still efficient for larger values of d and ∆, we must reduce the variation of f (G) and b(G) among G ∈ S i , for some family of switchings. We use several techniques to reduce this variation:
• We use 4-edge-switchings instead of 3-edge switchings.
• We will perform more careful counting than the analysis of Lemma 3.2.
• We will occasionally allow switchings which create new red edges.
• We will introduce other types of switchings to "boost" the probability of graphs which are otherwise not created sufficiently often. These types of switchings are called boosters.
The last two of these techniques follow the approach of [8,9]. In particular, every switching introduced will have a type and a class. The number of switchings of type τ that can be performed on a given graph G is denoted by f τ (G), and the number of switchings of class α that can be applied to other graphs to produce G is denoted by b α (G). We will also need parameters m τ (i) and m α (i) which satisfy Further details of the structure of algorithm FactorUniform will be explained in Section 5.3.

Type I switchings
FactorUniform will mainly use the 4-edge-switching shown in Figure 2, which replaces four edges by four new edges. We will call this the Type I switching, as we will define other types of switchings for use in FactorUniform later. Suppose that a Type I switching transforms a graph G into a graph G ′ ∈ S i . Then the initial graph G can be in different strata, depending on the colour of v 1 v 2 , v 2 v 3 , v 6 v 7 and v 0 v 7 . If these edges are all black then we say that the Type I switching is in Class A. In this case, G has exactly one more red edge than G ′ . Other Type I switchings are categorised into different classes, as shown in Table 1.
Each row of Table 1, other than the first row, defines two new classes of switching, depending on how the vertices are labelled. For example, the second row defines Classes B1+ and B1−, which we refer to collectively as B1±. Every class with a name ending "+" arises from using the vertex labelling shown on the left of Figure 3, while those classes ending in "−" arise from using the vertex labelling shown on the right of Figure 3. For example, if v 1 v 2 (respectively, v 0 v 7 ) is red but not present in G, and all the other edges and non-edges involved in the switching, except for v 0 v 1 , are black, then this switching is in Class B1+ (respectively, B1−) and both G and G ′ belong to S i .
[Table 1, with columns: class, action, the switching]
Next we bound the number of Type I switchings which can be performed in an arbitrary G ∈ S i . Define The proof of the following lemma is similar to that of Lemma 3.2, and is deferred to the appendix.
Remark 5.2. If we did not allow the creation of structures in classes B1±, B2± and C± then the f-rejection probability would be too big. For instance, suppose that we did not allow Class B1+ (and that the other Class B and C switchings are permitted). Then for most graphs in S i , the typical number of forward switchings would be approximately (This is approximately m I (i) with the term $2i\Delta d(dn)^2$ subtracted, which is the typical number of 8-tuples corresponding to a Class B1+ switching.) However, consider an extreme "worst case", when d = ∆ and d divides 2i, and the d-factor G is composed of two d-regular graphs, one with only red edges and the other with only black edges. Then there are no red edges in G that are incident with a red dashed edge, so no Class B1+ switchings need to be ruled out (as none are possible). In this case, the number of forward switchings in G is Hence, f I (G) differs from max G∈S i f I (G) by $\Omega(i\Delta d(dn)^2)$ for most graphs G ∈ S i , causing an f-rejection probability of $\Omega\bigl(i\Delta d(dn)^2 / (i(dn)^3)\bigr) = \Omega(\Delta/n)$ in a single step. But then the overall rejection probability will be too big, since there will be up to i 1 = Θ(d∆) steps, and $d\Delta^2/n$ may not be o(1) under the conditions of Theorem 1.2. To reduce the probability of an f-rejection we must allow these Class B1+ switchings to proceed (and Class B1− switchings too, by symmetry). Similar arguments explain the introduction of Classes B2± and C±.

Class A
Class A switchings are all of Type I. In this section we will obtain a lower bound for b A (G) and an upper bound for the average of b A (G) over all G ∈ S i . The following lemma will be useful. Proof. Let uvw be a red 2-path in K n . We now bound the probability that uvw is contained in a random G ∈ S i . Let W denote the set of graphs in S i which contain uvw and let W ′ denote the set of graphs in S i−2 which do not contain any of uv and vw. Consider the switching as shown in Figure 4.
The total number of red 2-paths in K n is $O(\Delta^2 n)$. Thus, by linearity of expectation, the expected number of red 2-paths contained in a random G ∈ S i is The proof for the second claim is similar. Fix a red 3-path v 1 u 1 u 2 v 2 in K n . Let G be a uniformly random graph in S i . Using another switching (see Figure 5) and a similar argument as above, we can bound the probability that u 1 v 1 and u 2 v 2 are edges in G by $O(i^2/(\Delta^2 n^2))$. The red dotted edge marked "?" denotes a red edge which may be either present or absent in G. The total number of choices for u 1 , Thus, by linearity of expectation, the expected number of pairs of red edges The following lemma is proved in the appendix.

Classes B2±
Classes B2± are easy to handle, so we discuss them before Classes B1±. Define The proof of the following lemma is deferred to the appendix.

Classes B1±
The inverse switching of Type I Class B1± is in fact the same as the forward switching, up to a permutation of the labelling of the vertices involved in the switching. Recall the example discussed in Remark 5.2, where the d-factor is composed of a union of a d-regular graph with only red edges and a d-regular graph with only black edges. It is easy to see that in such a graph, the number of inverse Type I Class B1± switchings is zero. In general, the number of copies of the following structure in G can vary a lot among G ∈ S i :
[structure diagram with edges labelled b, r, r]
However, we do know that the sum of the number of the following structures in any d-factor G ∈ S i is between 2i(∆ − 1)(d − 1) and 2i(∆ − 1)d:
This motivates the introduction of switchings of other types than Type I. We display these new switchings in Table 2. Here, the colours of the edges and non-edges must be black unless specified as red. These new types of switchings are categorised into Class B1+ or B1−, under the rule that if the type ends with "+" then the class also ends with "+", and similarly for "−". Note that due to symmetry, some switchings of different types have the same definition. For instance, type IIa+ and type IIa− switchings are defined in the same way. However, they are introduced as booster switchings for different classes, and thus are categorised into different types. Again, if the class name ends with a "+" then the vertices are labelled as shown on the left of Figure 3, while if the class name ends with a "−" then the vertices are labelled as shown on the right of Figure 3.
We will show that for any G ∈ S i , the number of inverse Class B1+ (or B1−) switchings does not vary much, even though the number can be zero if restricted to inverse Type I Class B1+ (respectively, Class B1−) switchings only.
As shown in Table 2, Type IIa± switchings are described by an 8-tuple v = (v 0 , . . . , v 7 ), while Type IIb± switchings are described by an 8-tuple (v 0 , . . . , v 7 ) together with 8 additional vertices, and Type IIc± switchings are described by an 8-tuple (v 0 , . . . , v 7 ) together with 12 additional vertices, providing the additional edges used to perform the switching. We denote the sequence of these additional vertices by y, where the vertices are arranged in some prescribed order: see Figure 7 in the appendix for Type IIc+. An inverse Type IIb± switching is described by choosing the 8-tuple v and an 8-tuple y of additional vertices, while an inverse Type IIc± switching is described by choosing the 8-tuple v and a 12-tuple y of additional vertices.
Suppose that a Type IIb± or Type IIc± switching based on the 8-tuple v creates a graph G ′ . We refer to the subgraph of G ′ formed by vertices in v as an octagon. If an 8-tuple v = (v 0 , . . . , v 7 ) in G can be combined with an 8-tuple (respectively, 12-tuple) of additional vertices y on which an inverse Type IIb± (respectively, inverse Type IIc±) switching can be performed, then we call v an octagon of Type IIb± (respectively, Type IIc±). The switching operation is denoted by (G, v, y) → G ′ . Note that octagons of different types induce different subgraph structures and (non-)edge colour restrictions. Each octagon which can result from a Type IIb± (respectively, Type IIc±) switching is not created equally often, due to the varying number of ways to select the additional vertices needed to perform the inverse switching. Thus we introduce another sort of rejection, called pre-b-rejection, to equalise the frequency of the creation of each octagon, given a switching type τ ∈ {IIb±, IIc±}.  on τ ) so that an inverse Type τ switching can be performed using v and y. Define
$$m_{\mathrm{IIb}\pm}(i) = (dn - 2i - 12)^4 - 6(dn)^3 d^2 - 6(dn)^3 \Delta d; \qquad (5.6)$$
$$m_{\mathrm{IIc}\pm}(i) = (dn - 2i - 14)^6 - 9(dn)^5 d^2 - 9(dn)^5 \Delta d. \qquad (5.7)$$
We will prove the following lemma in the appendix.
Lemma 5.6. Let τ ∈ {IIb+, IIc+} and G ∈ S i . For any octagon v = (v 0 , . . . , v 7 ) in G that can be created by a type τ switching, We will specify m τ (i) for τ ∈ {IIa±, IIb±, IIc±} in (5.11)-(5.13). It is trivial to see that f τ (G) ≤ m τ (i) for each such τ and for G ∈ S i . These types of switchings are performed so rarely that the trivial lower bound f τ (G) ≥ 0 is sufficient for our analysis: see the proof of Lemma 5.12.

Pre-b-rejection
When we count the number of inverse Class B1± switchings applicable to G ∈ S i , we count the number of choices of (v 0 , . . . , v 7 ) that are allowed to be created by a Class B1± switching. However, some types of switchings create structures with more vertices than those in an octagon. For instance, let v be an octagon of type IIb+ and let x = (x 1 , . . . , x 4 ) be the four extra edges that are created by a Type IIb+ switching. We can consider (G, v, x) as a pre-state of (G, v). Each octagon v in G corresponds to exactly b τ (G, v) pre-states, and b τ (G, v) ≈ m τ (i) by Lemma 5.6. By carefully designing the pre-b-rejection scheme, we can ensure that each octagon v in G is created equally often if each of its pre-states is created equally often.
When a Type τ switching converting G to G ′ is chosen, corresponding to a valid 8-tuple v, we will reject the algorithm and restart with probability This restart will be called a pre-b-rejection.
The pre-b-rejection is incorporated in the formal definition of the algorithm in Section 5.3. We close this section by bounding b α (G) for α ∈ {B1±}. The proof is given in the appendix. Define Lemma 5.7. For any G ∈ S i , and α ∈ {B1±},

Classes C±
Now consider Class C±. As we will show later, the probability that a Type I switching is in Class C± is very small. Thus, we only need a rather rough lower bound on the number of inverse Class C± switchings, so that the probability of a b-rejection is not too close to 1. However, there are very rare graphs in S i that cannot be created by a Type I Class C± switching. For instance, this may occur if d = ∆ and the set of all red edges in G forms a red d-regular subgraph. Then G does not contain the following structure, and thus cannot be created by a Type I switching. In this case, the probability of a b-rejection would equal 1 due to the existence of such graphs. In order to reduce the probability of a b-rejection, we introduce a new type of switching, namely Type III+ for Class C+ and Type III− for Class C−, that boosts the probability of graphs which contain the following structure: It turns out that for any G ∈ S i , the number of choices of 6-tuples of vertices (x 1 , . . . , x 6 ) such that x 1 x 2 and x 5 x 6 are black edges in G, x 2 x 3 and x 4 x 5 are red non-edges, and x 3 x 4 is either a red edge or a red non-edge, is always sufficiently concentrated. See Lemma 5.8 for a precise bound. This is why we boost the second structure, to transform a highly varying count into a well-concentrated count. The Type III±, Class C± switchings are shown in Table 3.
Table 3: The booster switchings for Classes C±.

type, class    action               the switching
III±, C±       $S_i \to S_i$        (diagram omitted)

Unusually, the Type III± switchings do not perform any switch of edges, except for designating an 8-tuple of vertices satisfying certain constraints, as shown in Table 3. They can be viewed as adding a small "do nothing" probability to the algorithm. As we will see later, the probability of ever performing a Class C± switching is extremely small.
As before, although type III+ and III− switchings have the same definition, they are booster switchings for classes C+ and C− respectively, and thus have to be categorised into different types. Define We will prove the following lemma in the appendix.
For each G ∈ S i and for α ∈ {C±},

The algorithm: FactorUniform
Now we have defined all types and classes of switchings involved in FactorUniform. Figure 6 depicts all switchings which produce an element of $S_i$, labelled by their type and class. We now describe the algorithm FactorUniform formally. First, FactorUniform calls REG to generate a uniformly random $d$-regular graph $G_0$ on $\{1, 2, \ldots, n\}$. If $G_0$ contains more than $i_1$ red edges then FactorUniform restarts. Otherwise, FactorUniform iteratively performs a sequence of switching steps. In each switching step, if the current graph is in $S_i$ then FactorUniform chooses a switching type $\tau$ from a set of types $\Gamma$, with probability $\rho_\tau(i)$. Then FactorUniform chooses a random Type $\tau$ switching, and either performs this chosen switching, or restarts with a small probability (the sum of the rejection probabilities). Here $\Gamma = \{\mathrm{I}, \mathrm{IIa}\pm, \mathrm{IIb}\pm, \mathrm{IIc}\pm, \mathrm{III}\pm\}$ and we insist only that $\sum_{\tau \in \Gamma} \rho_\tau(i) \le 1$ for all $i \le i_1$. With probability $1 - \sum_{\tau \in \Gamma} \rho_\tau(i)$ we perform a rejection called "t-rejection" instead of choosing a switching type. The parameters $\rho_\tau(i)$ will be specified in the next section. If $i = 0$ then a Type I switching is interpreted as outputting the current graph.
To be more specific, let G t be the graph obtained after t switching steps, and suppose that G t = G ∈ S i . The (t + 1)'th switching step is composed of the following substeps: (i) Choose switching type τ ∈ Γ with probability ρ τ (i). If no type is chosen, perform a t-rejection.
(ii) Assume that no t-rejection was performed. If i = 0 and τ = I then output the current graph G. Otherwise, choose a random Type τ switching S for G and let G ′ be the graph obtained from G by performing S. Perform an f-rejection with probability 1 − f τ (G)/m τ (i).
(iii) If no f-rejection is performed then perform a pre-b-rejection (described above Lemma 5.7), if applicable.
(iv) If no f-rejection or pre-b-rejection is performed then let α be the class of S and suppose that G ′ ∈ S i ′ . Perform a b-rejection with probability 1 − m α (i ′ )/b α (G ′ ).
(v) If no b-rejection is performed then set G t+1 = G ′ .
If any rejection occurs then FactorUniform restarts.
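The substeps (i)-(v) can be read as a cascade of independent acceptance tests. The following is a minimal sketch of that control flow only, under simplifying assumptions: the function name `switching_step` and its dictionary arguments are hypothetical, the acceptance ratios are passed in as precomputed numbers rather than derived from a graph, the b-rejection ratio is keyed by type rather than by the class of the chosen switching, and the special case $i = 0$ (output the current graph) is omitted.

```python
import random

def switching_step(rho, f_ratio, pre_b_accept, b_ratio, rng):
    """One switching step as a cascade of rejection tests (illustrative only).

    rho          -- dict: type -> rho_tau(i); the values sum to at most 1
    f_ratio      -- dict: type -> f_tau(G) / m_tau(i), a number in [0, 1]
    pre_b_accept -- dict: type -> probability of passing the pre-b-rejection
                    (types that are absent always pass)
    b_ratio      -- dict: type -> m_alpha(i') / b_alpha(G'), a number in [0, 1]
    Returns ("accept", type) or (name of the rejection, type or None).
    """
    # (i) choose a type; the leftover mass 1 - sum(rho) is the t-rejection
    u = rng.random()
    chosen, acc = None, 0.0
    for tau, p in rho.items():
        acc += p
        if u < acc:
            chosen = tau
            break
    if chosen is None:
        return ("t-rejection", None)
    # (ii) f-rejection with probability 1 - f_tau(G) / m_tau(i)
    if rng.random() >= f_ratio[chosen]:
        return ("f-rejection", chosen)
    # (iii) pre-b-rejection, where applicable
    if rng.random() >= pre_b_accept.get(chosen, 1.0):
        return ("pre-b-rejection", chosen)
    # (iv) b-rejection with probability 1 - m_alpha(i') / b_alpha(G')
    if rng.random() >= b_ratio[chosen]:
        return ("b-rejection", chosen)
    # (v) no rejection: the switching is performed
    return ("accept", chosen)

rng = random.Random(0)
outcome = switching_step({"I": 1.0}, {"I": 1.0}, {}, {"I": 1.0}, rng)
```

In the algorithm itself any outcome other than "accept" triggers a restart; the sketch only reports which test failed.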

Uniformity: fixing ρ τ (i)
We complete the definition of FactorUniform by specifying the parameters $\rho_\tau(i)$, for $\tau \in \Gamma$. Let $\sigma(G)$ denote the expected number of times that $G$ is reached by FactorUniform. We will design $\rho_\tau(i)$ such that $\sigma(G) = \sigma_i$ for some $\sigma_i$, for every $G \in S_i$ and every $0 \le i \le i_1$. A method of designing these parameters is discussed in [9, Section 5] in a general setting. In the rest of this section, we carry out this method and apply it to our specific problem. For convenience, we summarise which switching types occur for each class in Table 4. As usual, a type ending in "+" goes with a class ending in "+", and similarly for those ending in "−".

Class    Types
A        I
B1±      I, IIa±, IIb±, IIc±
B2±      I
C±       I, III±

Table 4: Reference table for types and classes.
Next we list parameters $m_\alpha(i)$, which are lower bounds on the number of ways to perform switchings of each class to produce a given $G \in S_i$:

Fix a class $\alpha$ and let $\tau$ be a type such that class $\alpha$ and type $\tau$ appear together in some row of Table 4. For each relevant $i \le i_1$, let $q^\tau_\alpha(i)$ denote the expected number of times that an element of $S_i$ is reached by a Type $\tau$, Class $\alpha$ switching. We will choose our parameters to ensure that the value of $q^\tau_\alpha(i)$ does not depend on $\tau$, for any type $\tau$ associated with class $\alpha$. This common value is denoted by $q_\alpha(i)$; that is, $q_\alpha(i) = q^\tau_\alpha(i)$ for any type $\tau$ associated with class $\alpha$.
It follows then that
\[ \sigma_i = \frac{1}{|A_0|} + \sum_{\alpha} q_\alpha(i)\, m_\alpha(i). \]
This equation holds because every $G \in S_i$ can be chosen as the initial graph, if not initially rejected; or is reached via some switching. The probability that $G$ is the graph obtained at Step 0 is $1/|A_0|$, since $G_0$ is chosen uniformly, and by our design of the algorithm, $q_\alpha(i)\, m_\alpha(i)$ is exactly the expected number of times that $G$ is reached via some class $\alpha$ switching and is not t-rejected, f-rejected, pre-b-rejected or b-rejected.

Immediately we have
\[ q_A(i) = \frac{\sigma_{i+1}\, \rho_{\mathrm{I}}(i+1)}{m_{\mathrm{I}}(i+1)}. \]
This is because a graph $G \in S_i$ can be created via a Class A switching only via a Type I switching on a graph $G' \in S_{i+1}$. (See the first line of Table 4.) Every graph in $S_{i+1}$ is visited $\sigma_{i+1}$ times in expectation, and given any $G' \in S_{i+1}$ such that $S = (G', G)$ is a valid Type I Class A switching, the probability that FactorUniform chooses Type I is $\rho_{\mathrm{I}}(i+1)$, and the probability that FactorUniform chooses the particular switching $S$ is $1/m_{\mathrm{I}}(i+1)$.

Next, consider $\alpha \in \{\mathrm{B1}\pm\}$. A Class B1± switching can be of Type I, IIa±, IIb± or IIc± (see the second line of Table 4). Now $G \in S_i$ might be created from some $G' \in S_i$ via a Type I Class B1± switching. Thus, arguing as above, we have
\[ q_{\mathrm{B1}+}(i) = q_{\mathrm{B1}-}(i) = \frac{\sigma_i\, \rho_{\mathrm{I}}(i)}{m_{\mathrm{I}}(i)}. \]
To ensure that the expected number of times $G$ is visited via Type $\tau$ Class B1± switchings does not depend on $\tau$, for $\tau \in \{\mathrm{I}, \mathrm{IIa}\pm, \mathrm{IIb}\pm, \mathrm{IIc}\pm\}$, we must choose $\rho_\tau(\cdot)$ for $\tau \in \{\mathrm{IIa}\pm, \mathrm{IIb}\pm, \mathrm{IIc}\pm\}$ such that and

Next we consider $\alpha \in \{\mathrm{B2}\pm\}$. A Class B2± switching can only be of Type I (see the third line of Table 4). Thus we immediately have

Finally, consider $\alpha \in \{\mathrm{C}\pm\}$. A Class C± switching can be of Type I or Type III± (see the fourth row of Table 4). Considering switchings of Type I and Class C±, we have
\[ q_{\mathrm{C}+}(i) = q_{\mathrm{C}-}(i) = \frac{\sigma_{i+1}\, \rho_{\mathrm{I}}(i+1)}{m_{\mathrm{I}}(i+1)}. \qquad (5.24) \]
To ensure that the expected number of times $G$ is visited by Type III± Class C± switchings equals $q_{\mathrm{C}\pm}(i)$, we must choose $\rho_\tau(\cdot)$ for $\tau \in \{\mathrm{III}\pm\}$ such that and Combining (5.15)-(5.26), and using (5.14), we deduce that

Boundary conditions and solving the system
Recall that FactorUniform rejects the initial d-regular graph if it contains more than i 1 red edges. Thus, we set ρ τ (i) = 0 for all i > i 1 . The parameter ρ I (i) is already defined for all 0 ≤ i ≤ i 1 , recalling that ρ I (0) is interpreted as the probability of outputting the current graph. The Type I switchings may be of Class B1±, B2± or C±, as shown in Figure 6. A Type I Class B1± switching converts a graph from S i to S i , and the booster switchings for Class B1± are of Type IIa±, IIb±, IIc±. By Table 2, we must define ρ τ (i) for all Similarly, consider the booster switchings for classes C±. When τ ∈ {III±} we must define ρ τ (i) for all 0 ≤ i ≤ i 1 . Note that in general there are switchings converting graphs in S j to graphs in S i for i − 3 ≤ j ≤ i + 2, as shown in Figure 6. For i close to i 1 or 0 there are fewer switching types involved. For instance, graphs in S i 1 cannot be reached by a Type I Class A switching, because the boundary conditions will be set so that no graphs in strata j > i 1 will ever be reached. Similarly, no graphs in S 0 can be reached by a Type IIa± switching, because any such switching increases the number of red edges in the graph, and this number can never be negative.
Inductive step. Now assume that $i \le i_1 - 2$ and that $\rho_{\mathrm{I}}(j)$, $\rho_{\mathrm{III}}(j)$ and $x_j$ have been determined for all $j > i$. By (5.35) we have We also have As in base case (b), we obtain and immediately this gives $\rho_{\mathrm{III}}(i)$ and $\rho_{\mathrm{I}}(i)$. Thus we have uniquely determined $\rho_{\mathrm{I}}(i)$, $\rho_{\mathrm{III}}(i)$ and $x_i$ that satisfy (5.29) and (5.30). Next we specify $\epsilon$. Define We have already shown that there are unique $x^*_i$, $\rho^*_{\mathrm{I}}(i)$ and $\rho^*_{\mathrm{III}+}(i)$, $\rho^*_{\mathrm{III}-}(i)$ satisfying (5.29) and (5.30). Substituting these values into (5.17)-(5.22) uniquely determines $\rho^*_\tau(i)$ for all $\tau \in \{\mathrm{IIa}\pm, \mathrm{IIb}\pm, \mathrm{IIc}\pm\}$ and all relevant values of $i$.
It is easy to verify that $x^*_{i_1} > 0$ and $x^*_{i_1 - 1} > 0$ and that (5.46) is satisfied for $i \in \{i_1, i_1 - 1\}$. We will prove by induction on $i$, for all $0 \le i \le i_1 - 1$, that (5.46) is satisfied as well as the following strengthening of (5.45).
Lemma 5.10. The output of FactorUniform is a uniformly random d-factor of H n .
Proof. By the definition of FactorUniform, the output graph contains no red edges. Thus it is a d-factor of H n . Recall that the parameters ρ τ (i) are set according to the solution of (5.15)-(5.28). Hence, the expected number of times that a graph in S 0 is visited equals σ 0 , by (5.27), which is the same for every graph in S 0 . It follows that for every d-factor G of H n , the probability that G is the output of FactorUniform is σ 0 ρ I (0), which, again, is independent of G. Hence, FactorUniform is a uniform sampler for d-factors of H n .
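The way f-rejections and b-rejections equalise the visiting probabilities can be checked exactly on a toy example. The sketch below is an illustration only and is not part of FactorUniform: the two "strata", the list `toy` of switchings and the function name `reach_probabilities` are all hypothetical, there is a single switching type and class, and the bounds are taken to be the exact maximum of $f$ and minimum of $b$.

```python
from fractions import Fraction

def reach_probabilities(switchings, start_states):
    """Exact probability that each target is reached in one accepted step.

    switchings   -- list of (G, G_prime) pairs, one per available switching
    start_states -- states of the upper stratum, each started with equal probability
    """
    f = {g: 0 for g in start_states}   # f(G): number of switchings leaving G
    b = {}                             # b(G'): number of switchings arriving at G'
    for g, gp in switchings:
        f[g] += 1
        b[gp] = b.get(gp, 0) + 1
    m_upper = max(f.values())          # upper bound on f(G)
    m_lower = min(b.values())          # lower bound on b(G')
    start = Fraction(1, len(start_states))
    reach = {gp: Fraction(0) for gp in b}
    for g, gp in switchings:
        choose = Fraction(1, f[g])           # uniform choice among the f(G) switchings
        pass_f = Fraction(f[g], m_upper)     # survive the f-rejection
        pass_b = Fraction(m_lower, b[gp])    # survive the b-rejection
        reach[gp] += start * choose * pass_f * pass_b
    return reach

# Toy instance: upper stratum {a, b, c}, lower stratum {x, y}.
toy = [("a", "x"), ("a", "y"), ("b", "x"), ("c", "x"), ("c", "y")]
probs = reach_probabilities(toy, ["a", "b", "c"])
```

Each individual switching is carried out with probability (1/3)(1/m_upper)(m_lower/b(G')), so summing over the b(G') switchings into G' gives a value that does not depend on G'.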

Rejection probability and number of switching steps
We bound the number of switching steps performed by FactorUniform and the probability of any rejection in FactorUniform. Lemma 5.11. FactorUniform performs at most O(i 1 ) switching steps in expectation and with high probability. The proofs of Lemmas 5.11 and 5.12 are standard, and very similar to those in [9]. We include the proofs in the appendix.

Completing the proofs of our main results
To complete the proofs of Theorems 1.1, 1.2 and 1.3 we must analyse the runtime of algorithms FactorEasy and FactorUniform, and prove that the output of FactorApprox is within o(1) of uniform.
Proof of Theorem 1.1. Here we complete the proof of Theorem 1.1, partly presented in Section 3.1. It only remains to prove that each switching step can be implemented efficiently. In each iteration, the switching of a bounded number of edges can be done in O(1) time. The time-consuming part is to compute b(G) to determine the probability of a b-rejection.
To compute $b(G)$, we use brute force to search for all possible choices of $v_5, v_0, v_1, v_2$. This can be done in time $O(d^2 \Delta n)$. Denote the number of choices by $X$. Multiplying $X$ by $dn - 2i$ gives the first estimate for $b(G)$. Next we accurately compute $b(G)$ using inclusion-exclusion. The choices of $v_3$ and $v_4$ must satisfy a set $U$ of constraints. We can count the number of choices which satisfy all constraints using inclusion-exclusion. The inclusion-exclusion argument involves a bounded number of terms counting choices where a subset $W \subseteq U$ of constraints is violated. We will show that each such term can be computed in time $O((d + \Delta)^3 n)$ with the aid of a suitable data structure.
Let $\widehat{G}$ be the supergraph of $G$ consisting of $G$ together with all red edges in $K_n \setminus G$. We use red for the colour of edges in $\widehat{G}$ which are not in $G$. When computing $X$ using brute force search, we can record the number of 3-paths in $\widehat{G}$ between any two vertices of any type (for example, red-black-red or black-red-black); we can also record the number of 3-paths and 2-paths in $\widehat{G}$ of any given type starting from any given vertex. The time complexity for computing all these numbers is $O((d + \Delta)^3 n)$ since the maximum degree in $\widehat{G}$ is bounded above by $d + \Delta$. The number of other local structures on at most 4 vertices (triangles, 4-cycles, etc.) can be computed and recorded within this time complexity bound. We can also record lists of pairs of vertices which are joined by a 3-path, or 2-path, or an edge, within the same time complexity.
Given $W$, let $b_W$ be the number of choices of the sequence of six vertices that violate the constraints in $W$ (here constraints in $U \setminus W$ may or may not be violated). Since the structure counted by $b_W$ uses up to six vertices, it is easy to see that $b_W$ can be computed using the numbers we have recorded. For one such $W$ we can write $b_W = b_{W,1} + b_{W,2} + b_{W,3}$, where $b_{W,1}$ counts those choices where $W$ is violated by taking $v_4 v_5$ as a red edge; $b_{W,2}$ counts those choices where $v_4 v_5$ is a black edge, and $b_{W,3}$ counts those choices where $v_4 v_5$ is a red non-edge. In each case, we can run through $n$ choices for $v_5$, and compute $b_{W,i}$ using the number of 3-paths and 2-paths starting from $v_5$ that have been recorded. The time complexity is then $O(n)$. For every other $W$ it is easy to check that a similar scheme works. Thus, it takes $O((d + \Delta)^3 n)$ time to compute $b(G)$ for the first switching step of FactorEasy.
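The inclusion-exclusion count over violated subsets $W \subseteq U$ can be illustrated on a small instance. This is a sketch under loud assumptions: the function name `count_valid` and the three toy constraints are hypothetical, and each $b_W$ is obtained here by brute force over an explicit list of choices, whereas the proof obtains it from the recorded path counts.

```python
from itertools import combinations, product

def count_valid(choices, constraints):
    """Count choices satisfying every constraint via inclusion-exclusion.

    For each subset W of the constraints, b_W is the number of choices that
    violate every constraint in W (the others may or may not be violated).
    """
    total = 0
    for k in range(len(constraints) + 1):
        for W in combinations(constraints, k):
            b_W = sum(1 for c in choices if all(not ok(c) for ok in W))
            total += (-1) ** k * b_W
    return total

# Toy instance: pairs (v3, v4) from a 6-vertex set, with three constraints.
choices = list(product(range(6), repeat=2))
constraints = [
    lambda c: c[0] != c[1],   # no vertex coincidence
    lambda c: c[0] != 0,      # v3 avoids a forbidden vertex
    lambda c: c[1] != 1,      # v4 avoids a forbidden vertex
]
valid = count_valid(choices, constraints)
brute = sum(1 for c in choices if all(ok(c) for ok in constraints))
```

With a bounded number of constraints the sum has a bounded number of terms, which is why the cost is governed by the cost of evaluating a single $b_W$.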
For the subsequent switching steps, we do not need to recompute b(G) from scratch. After each switching step, we can update the data recorded in our data structure very efficiently, because only 3 new edges are added, and 3 edges are deleted. Since the data we store are counts of structures involving only up to 4 vertices, changing each edge will alter at most O((d + ∆) 2 ) entries. For each entry change, we can update b(G) by updating the corresponding b W terms in the inclusion-exclusion formula. Thus the time complexity for computing b(G) is O((d + ∆) 2 ) after each subsequent switching step and there are O(d∆) switching steps in expectation, since FactorEasy restarts O(1) times in expectation (proved in Section 3.1). Thus the total time complexity for FactorEasy is O((d + ∆) 3 n + d∆(d + ∆) 2 ) = O((d + ∆) 3 n) in expectation.
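The idea of updating recorded counts instead of recomputing them can be seen in the simplest case. The sketch below maintains only the total number of 2-paths in an uncoloured simple graph; the class name `TwoPathCounter` is hypothetical, and the data structure in the proof records considerably more (coloured paths, per-vertex and per-pair counts). It assumes that edges are added only when absent and removed only when present.

```python
from collections import defaultdict

class TwoPathCounter:
    """Maintain the number of 2-paths (paths with two edges) under edge updates."""

    def __init__(self):
        self.adj = defaultdict(set)
        self.two_paths = 0

    def add_edge(self, u, v):
        # each edge already incident with u or v forms one new 2-path with uv
        self.two_paths += len(self.adj[u]) + len(self.adj[v])
        self.adj[u].add(v)
        self.adj[v].add(u)

    def remove_edge(self, u, v):
        self.adj[u].remove(v)
        self.adj[v].remove(u)
        # the 2-paths lost are those pairing uv with a remaining edge at u or v
        self.two_paths -= len(self.adj[u]) + len(self.adj[v])

    def recount(self):
        # from scratch: a 2-path is a centre vertex with an unordered pair of neighbours
        return sum(len(nb) * (len(nb) - 1) // 2 for nb in self.adj.values())

counter = TwoPathCounter()
for edge in [(0, 1), (1, 2), (2, 0), (2, 3)]:
    counter.add_edge(*edge)
counter.remove_edge(1, 2)
```

Each update touches only the two endpoints, in the same spirit as the bound of $O((d + \Delta)^2)$ changed entries per edge for the richer counts used above.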
Proof of Theorem 1.2. In Lemma 5.10 we have shown that FactorUniform is a uniform sampler. It only remains to prove the efficiency. By Corollary 4.2, FactorUniform restarts only $O(1)$ times in expectation and $O(\log n)$ times a.a.s. before finding a $d$-regular graph containing at most $i_1$ red edges. The total time complexity for finding such a graph is $O(d^3 n)$ in expectation, and $O(d^3 n \log n)$ a.a.s. By Lemma 5.12, the probability that FactorUniform restarts afterwards is $o(1)$. Thus, we only need to bound the remaining runtime of FactorUniform assuming no rejections. By Lemma 5.11, after finding a $d$-regular graph with at most $i_1$ red edges, FactorUniform will perform $O(i_1)$ switching steps in expectation and with high probability. In each switching step, the most time-consuming part is to compute $f_\tau(G)$, $b_\tau(G, v)$, and $b_\alpha(G)$ for $\tau \in \{\mathrm{I}, \mathrm{IIa}\pm, \mathrm{IIb}\pm, \mathrm{IIc}\pm, \mathrm{III}\pm\}$ and $\alpha \in \{\mathrm{A}, \mathrm{B1}\pm, \mathrm{B2}\pm, \mathrm{C}\pm\}$.
We first bound the a.a.s. time complexity. Note that $f_\tau(G)$ and $b_\tau(G, v)$ will only need to be evaluated once a Type $\tau$ switching is performed. By (5.30) and (5.44), the probability that a Type $\tau$ switching is ever performed in FactorUniform for any $\tau \notin \{\mathrm{I}, \mathrm{III}\pm\}$ is $O(\epsilon i_1) = o(1)$. It follows immediately that a.a.s., only $f_\tau(G)$ for $\tau \in \{\mathrm{I}, \mathrm{III}\pm\}$ and $b_\alpha(G)$, $\alpha \in \{\mathrm{A}, \mathrm{B1}\pm, \mathrm{B2}\pm, \mathrm{C}\pm\}$, will ever be computed during the implementation of FactorUniform.
First consider $f_\tau(G)$ for $\tau = \mathrm{I}$. We want to count choices of $(v_0, \ldots, v_7)$ such that $v_0 v_1$ is a red edge in $G$, $v_2 v_3$ and $v_6 v_7$ are edges (red or black) in $G$, and $v_4 v_5$ is a black edge in $G$, satisfying a set $U$ of constraints (that is, there is no vertex collision except for $v_2 = v_7$, and certain edges are forbidden in $G$ and must have a certain colour in $K_n$). As before, using inclusion-exclusion, we can express this number in terms of $b_W$, $W \subseteq U$, where $b_W$ is the number of choices where the conditions in $W$ are violated. We use similar data structures as in Theorem 1.1, but we record counts of structures containing up to 5 vertices. Thus the time complexity for constructing the data structures is $O((d + \Delta)^4 n)$ in the first iteration. It is easy to see, as in the proof of Theorem 1.1, that all terms $b_W$ in the inclusion-exclusion can be computed using the recorded data. Moreover, it takes $O((d + \Delta)^3)$ time to update the data structure after each subsequent switching step. Thus, the total time complexity for computing $f_{\mathrm{I}}(G)$ throughout FactorUniform is $O((d + \Delta)^4 n)$. The same time complexity bound holds for computing $f_{\mathrm{III}+}(G)$ and $f_{\mathrm{III}-}(G)$ throughout FactorUniform, as the same number of vertices are involved in a Type III± switching as in a Type I switching.
Next we consider b α (G). Similar arguments as for f I (G) show that the time complexity of computing b α (G) for every α throughout FactorUniform is at most O((d + ∆) 4 n). The Gao-Wormald algorithm [9], used to produce the initial d-regular graph, has time complexity O(d 3 n) in expectation. Hence a.a.s. the time complexity to produce the initial d-regular graph is O(d 3 n log n). Therefore, the a.a.s. time complexity bound for FactorUniform is O(d 3 n log n + (d + ∆) 4 n).
Now we consider the time complexity in expectation. To do this, we obtain an upper bound for the time complexity of computing $f_\tau(G)$ and $b_\tau(G, v)$, $\tau \notin \{\mathrm{I}, \mathrm{III}\pm\}$, and then multiply by the probability that the switching type $\tau$ is chosen in a single step, and finally multiply by $O(i_1) = O(d\Delta)$, which is an upper bound for the expected number of switching steps performed by FactorUniform. These switchings are performed rarely, so we do not attempt to update the data after every switching step. Instead we simply reconstruct the data structure whenever it is needed. Obviously for $d$ and $\Delta$ in different ranges, different counting schemes can be used to optimise the runtime: we have not attempted this. Here we simply use the scheme which naturally extends that given in Theorem 1.1.
This leads to the following bounds on the complexity of computing $f_\tau(G)$ for a particular $G$, in these cases:

For $\tau \ne \mathrm{I}$, the Type $\tau$ switchings are only implemented occasionally. Let $\rho^*_\tau = \max_{0 \le i \le i_1} \rho_\tau(i)$. Multiplying the above bounds by $O(i_1)\rho^*_\tau$, using (5.49)-(5.52), yields the following overall bounds on the expected time complexity for computing $f_\tau(G)$ during FactorUniform:

Combining the contribution from every type $\tau$, the expected time complexity for computing $f_\tau(G)$ throughout FactorUniform is bounded above by on the complexity of constructing the data structure. Multiplying these bounds by $O(i_1)\rho^*_\tau$, using (5.50) and (5.52), yields the following upper bound on the complexity of computing $b_\tau(G, v)$ during the implementation of FactorUniform:

Since $d^2 + \Delta^2 = o(n)$, the above terms are dominated by (6.1). Thus, combining everything together, the expected time complexity of FactorUniform is bounded by (6.1).

If we ignore the f-rejections in the implementation of Type III± switchings then the switching has no effect and we may simply skip that step. Hence this modification of FactorUniform behaves identically to FactorApprox, that is, by repeatedly performing valid Type I switchings. Therefore, we can couple the implementation of FactorUniform and FactorApprox such that FactorUniform and FactorApprox output the same graph $G$ as long as no rejections occur in FactorUniform and no types of switchings other than I, III± are chosen in FactorUniform. By Lemma 5.12, the probability of performing any rejection in FactorUniform is $o(1)$. By (5.30) and (5.44), the probability of performing any switchings in FactorUniform of type other than I, III± is $O(\epsilon) = o(1)$. Hence, the total variation distance between the output of FactorApprox, and that of FactorUniform, which is uniform, is $o(1)$. Here $o(1)$ also accounts for the probability that the coupled REG and REG* have distinct outputs. The time complexity bound of $O(dn)$ was established in Section 4.
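The last step uses the standard coupling inequality, which may be written out as follows. The symbols here are our own labels rather than notation from the text: $X$ and $Y$ denote the coupled outputs of FactorApprox and FactorUniform, and $\mathcal{E}_{\mathrm{rej}}$, $\mathcal{E}_{\mathrm{type}}$, $\mathcal{E}_{\mathrm{REG}}$ denote the events that FactorUniform performs a rejection, that it chooses a switching type other than I, III±, and that the coupled REG and REG* have distinct outputs.

\[
d_{\mathrm{TV}}(X, Y) \;\le\; \Pr[X \ne Y] \;\le\; \Pr[\mathcal{E}_{\mathrm{rej}}] + \Pr[\mathcal{E}_{\mathrm{type}}] + \Pr[\mathcal{E}_{\mathrm{REG}}] \;=\; o(1) + o(1) + o(1) \;=\; o(1).
\]

The first inequality holds for any coupling of two random variables, and the second is a union bound over the only three ways in which the coupled runs can diverge.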
• As a final adjustment (which will appear with positive sign in the inclusion-exclusion) we must consider choices where one of v 1 v 2 , v 2 v 3 is red and one of v 6 v 7 , v 7 v 0 is red: there are at most 2idn(2i + ∆d) 2 ≤ 2i(dn) 3 · 9∆ 2 /n 2 of these, since i ≤ i 1 .
Putting this together, we have proved that (5.3) holds, and that This completes the proof.
Proof of Lemma 5.4. Firstly, note that all Class A switchings are of Type I. (See Table 4.) Thus we only need to count inverse Type I Class A switchings. Consider the right hand side of Figure 2. First we find a lower bound for the number of ways to select an 8-tuple (v 0 , . . . , v 7 ) such that v 0 v 1 is a red non-edge, v 1 v 2 , v 3 v 4 , v 5 v 6 are all edges and v 2 v 3 , v 4 v 5 , v 6 v 7 are non-edges, with all vertices distinct except possibly v 2 and v 7 . There are (∆n − 2i) choices for (v 0 , v 1 ), then d 2 choices for (v 2 , v 7 ), then dn − 4 choices for v 2 v 3 avoiding the two chosen edges, and then dn − 6 choices for v 4 v 5 avoiding the three chosen edges. This gives the expression (∆n − 2i)d 2 (dn − 4)(dn − 6).
For the lower bound, we must subtract from this expression the number of choices of 8-tuple with at least one defect. The possible defects are: vertex coincidence; a dashed edge (other than v 0 v 1 ) is present in G; a dashed edge (other than v 0 v 1 ) is a red non-edge; or a chosen edge is a red edge in G. We now give upper bounds on the number of choices with particular defects.
• Vertex coincidences: Out of $\binom{8}{2} = 28$ possible vertex coincidences, one is allowed and 7 are impossible, leaving 20 vertex coincidences that must be explicitly ruled out: at most $O(d^4 \Delta n^2)$ choices.
• One of the dashed edges (other than v 0 v 1 ) is present in G: at most choices.
• v 1 v 2 is red, or v 0 v 7 is red: at most 2 · 2i∆d(dn) 2 choices. For later use, we remark that this upper bound includes cases where the edge v 0 v 1 is red and present in G.
Subtracting these choices leads to the inequality proving the first statement of the lemma.
To prove the second statement, we must investigate the average value of b A (G) over all G ∈ S i . We continue inclusion-exclusion, calculating the number (or, in two cases, the expected number) of 8-tuples containing two defects.
• Two or more dashed edges (other than v 0 v 1 ) are present: at most O(d 6 ∆n) choices, giving a relative error of O(d 2 /n 2 ).
• One of the dashed edges (other than $v_0 v_1$) is a red non-edge and one of the chosen edges is red: at most $O(i \Delta^2 d^3 n)$ such choices, giving a relative error of $O(\Delta^2/n^2)$, since $i \le i_1$.
• One of the dashed edges (other than v 0 v 1 ) is present in G, and one of the chosen edges is red: at most O(i∆d 4 n) such choices, giving a relative error of O(d∆/n 2 ).
• Two of the chosen edges are red: at most $O(i^2 d^2 \Delta n)$ choices for all $G \in S_i$, giving a relative error of $O(i^2/(d^2 n^2)) = O(\Delta^2/n^2)$, unless the two chosen edges are $v_1 v_2$ and $v_0 v_7$. The number of choices such that $v_1 v_2$ and $v_0 v_7$ are red, with $v_0 v_1$ a red non-edge, can vary a lot across $S_i$, and here we will need to calculate the average. (We come back to this, below.)
• One dashed edge is a red non-edge and another dashed edge is present in $G$ (neither edge is $v_0 v_1$): at most $O(\Delta^2 d^5 n)$ such choices, giving a relative error of $O(d\Delta/n^2)$.
• Two dashed edges (other than v 0 v 1 ) are both red non-edges: at most O(d 4 ∆ 3 n) choices, giving a relative error of O(∆ 2 /n 2 ).
There are two cases that must be considered further.
* Recall that in the lower bound, we subtracted some "illegal" cases where $v_0 v_1$ is red and present. For the average-case expression we must add these cases back in. The term $4i\Delta d(dn)^2$, which we subtracted to obtain the lower bound, was an upper bound for the number of 8-tuples in which $v_1 v_2$ or $v_7 v_0$ is red. To obtain this bound, we first choose a red edge $v_0 v_7$, say, in $2i$ ways, and then there are at most $\Delta$ choices for $v_1$. This upper bound of $\Delta$ includes the possibility that the edge $v_0 v_1$ is present in $G$. But such choices are not valid inverse Type I switchings, and so we must undo this subtraction by adding them back in now. The number of choices of $(v_7, v_0, v_1, v_2)$ such that $v_0 v_7$ and $v_0 v_1$ are red edges in $G$, and $v_1 v_2$ is an edge in $G$ (of any colour), varies quite widely among different $G \in S_i$. By Lemma 5.3, the expected number of choices for this 4-tuple for a uniformly random $G \in S_i$ is $O(i^2 d/n)$.
* The second thing we must consider is the choices for the 8-tuple in which $v_0 v_7$ and $v_1 v_2$ are both red edges in $G$. By Lemma 5.3, the expected number of 4-tuples $(v_7, v_0, v_1, v_2)$ with this property (and with $v_0 v_1$ a red non-edge in $G$) is $O(i^2 \Delta/n)$.
Adding these counts together and multiplying by (dn) 2 , the expected number of choices of (v 0 , v 1 , . . . , v 7 ) which must be added to the lower bound leading to a relative error of O((d + ∆)∆/n 2 ). This completes the proof of the second statement of the lemma.
Proof of Lemma 5.5. Observe that all Class B2± switchings are of Type I. (See Table 4.) Thus we only need to count inverse Type I Class B2+ switchings, say, and the same bounds will hold for Class B2−, by symmetry. There are $\Delta n - 2i$ ways to choose $v_0$ and $v_1$. Then there are $d^2$ ways to fix $v_7$ and $v_2$. Then there are at most $\Delta d$ ways to choose $v_3$ and $v_4$ and finally at most $dn$ ways to choose $v_5$ and $v_6$. So the total number of inverse switchings is at most $(\Delta n - 2i)\Delta d^4 n$, giving the upper bound as desired. To deduce a lower bound, we subtract the number of the following structures, for which we only need an upper bound:
• $v_0 v_7$ or $v_1 v_2$ is red: at most $2 \cdot 2i\Delta^2 d^2 \cdot dn = 4i\Delta^2 d^3 n$ choices.
This immediately gives the required lower bound on the number of available inverse Class B2+ switchings, completing the proof.
Proof of Lemma 5.6. We only prove the result for τ = IIb+, as the argument for τ = IIc+ is similar and by symmetry, the same bounds will hold for τ = IIb− and τ = IIc−, respectively.
Let v = (v 0 , . . . , v 7 ) be a fixed 8-tuple which gives rise to the octagon shown in Figure 7. We bound the number of ways to choose an 8-tuple y = (y 1 , y 2 , y 3 , y 4 , y 5 , y 6 , y 7 , y 8 ) of additional vertices so that dashed lines in Figure 7 correspond to black non-edges in G. The upper bound (dn) 4 is obvious. For the lower bound, first notice that this number is at least (dn − 2i − 12) 4 , as the 4 extra edges involved in the inverse switching are black, and are distinct from each other and from the 3 black edges in the octagon Further, we need to subtract the number of choices where at least one defect appears. There are at most 6 · (dn) 3 d 2 choices where one designated non-edge (such as v 0 y 1 or y 2 y 4 ) is actually present, and at most 6 · (dn) 3 d∆ choices where one designated non-edge is red. Subtracting these counts gives the required lower bound.
Proof of Lemma 5.7. The number of ways to choose v 7 , v 0 , v 1 and v 2 is between 2i(∆ − 1)(d − 1) and 2i(∆ − 1)d. The number of ways to choose the other four vertices is at least (dn − 2i − 10d) 2 so that there are no vertex coincidence and both v 3 v 4 and v 5 v 6 are black. We subtract the choices where v 2 v 3 or v 4 v 5 or v 6 v 7 is present in G, or is a red non-edge. There are at most 3 · 2i(∆ − 1)d(d 2 (dn) + ∆d(dn)) such choices. This verifies the desired lower bound. The upper bound 2i(∆ − 1)d(dn) 2 is trivial, which yields the required relative error.
Proof of Lemma 5.8. We only discuss the case $\tau = \mathrm{III}+$, as the case $\tau = \mathrm{III}-$ is symmetric. There are at most $\Delta n - 2i \le \Delta n$ ways to choose $(v_0, v_1)$, and then at most $\Delta^2$ ways to choose $(v_2, v_7)$. Then there are at most $d^2$ ways to choose $(v_3, v_6)$. Finally, there are at most $dn - 2i \le dn$ ways to choose $(v_4, v_5)$. This gives the required upper bound for $f_{\mathrm{III}+}(G)$. To obtain the lower bound, we need to subtract from these $d^2 \Delta^2 (\Delta n - 2i)(dn - 2i)$ choices of $(v_0, \ldots, v_7)$ the following cases: (a) $v_1 v_2$ or $v_0 v_7$ is a red edge in $G$; (b) $v_2 v_3$ or $v_6 v_7$ is a red edge in $G$; (c) $v_3 v_4$ or $v_5 v_6$ is an edge in $G$.
The number of choices for (a) is at most $2 \cdot 2i d^2 \Delta^2 \cdot dn = 4i d^3 \Delta^2 n$. To see this, there are at most $2i$ ways to fix $v_1$ and $v_2$ if $v_1 v_2$ is a red edge in $G$; then at most $d$ ways to fix $v_3$, at most $\Delta^2$ ways to fix $v_0$ and $v_7$, at most $d$ ways to fix $v_6$ and finally at most $dn$ ways to fix $v_4$ and $v_5$. The factor of 2 covers the case that $v_0 v_7$ is a red edge present in $G$.
Next we bound $b_\alpha(G)$ for $\alpha = \mathrm{C}+$, as the case $\alpha = \mathrm{C}-$ is symmetric. To perform an inverse Class C+ switching, we need to designate an 8-tuple $v = (v_0, \ldots, v_7)$ such that either an inverse Type I Class C+ switching can be performed on $v$, or an inverse Type III+ switching can be performed on $v$. Note that an inverse Type III+ switching is just the same as a Type III+ switching. The lower bound on $f_{\mathrm{III}+}(G)$ naturally is a lower bound for $b_{\mathrm{C}+}(G)$. So immediately we have $b_{\mathrm{C}+}(G) \ge m_{\mathrm{C}+}(i)$ as specified in (5.9). It is not hard to see that $d^3 \Delta^3 n^2$ is an upper bound for $b_{\mathrm{C}+}(G)$, because there are at most $\Delta n$ ways to fix $v_0$ and $v_1$ (either $v_0 v_1$ is present or not present in $G$), and at most $\Delta^2 d^2$ ways to fix $v_2, v_3, v_7, v_6$ and then at most $dn$ ways to fix $v_4$ and $v_5$.
Proof of Lemma 5.11. The proof is almost identical to [9,Lemma 8]. We omit the details here and only sketch the main idea. It is easy to verify that with probability 1−o(1), a Type I Class A switching is performed in each step. Thus, we can easily bound the probability by o(1) that more than 5% of steps are implemented with a switching that is not of Type I Class A. With a Type I Class A switching, the number of red edges reduces by exactly one. On the other hand, the number of red edges can increase by at most 3 in each step. Since initially there are at most i 1 red edges, it follows immediately that with high probability the number of switching steps performed by FactorUniform is O(i 1 ).
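The counting behind the last step of this sketch can be made explicit. The following is a worked version under the assumptions stated in the proof sketch only: $T$ is our own label for the number of switching steps performed, at most $T/20$ of them are not Type I Class A, each Type I Class A step removes exactly one red edge, and every other step adds at most 3 red edges.

\[
0 \;\le\; \#\{\text{red edges after } T \text{ steps}\} \;\le\; i_1 - \tfrac{19}{20}\,T + 3 \cdot \tfrac{1}{20}\,T \;=\; i_1 - \tfrac{4}{5}\,T,
\qquad\text{so}\qquad T \;\le\; \tfrac{5}{4}\, i_1 = O(i_1).
\]

The constant $5/4$ depends on the illustrative 5% threshold and plays no role beyond showing that the bound is linear in $i_1$.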
Proof of Lemma 5.12. By Lemma 5.9, the probability of a t-rejection in each step is at most $\epsilon$. Thus, the probability of a t-rejection in FactorUniform is $O(\epsilon i_1) = o(1)$ by Lemma 5.11.
Next, consider f-rejections. Let $G_t$ be the graph obtained after $t$ switching steps. Given $G_t = G \in S_i$, the probability that an f-rejection occurs at step $t + 1$ is Summing over all $G \in S_i$ and summing over all steps $t$, the probability of an f-rejection in FactorUniform is at most Next, we verify that for every $\tau \in \Gamma$ and for every $i \le i_1$, For $\tau \in \Gamma \setminus \{\mathrm{I}, \mathrm{III}\pm\}$, we use the bounds in (5.50)-(5.52) for $\rho_\tau(i)$ and the trivial bound 1 for $1 - f_\tau(G)/m_\tau(i)$. For $\tau = \mathrm{III}\pm$, we use (5.49) for $\rho_\tau(i)$ and Lemma 5.8 for $1 - f_\tau(G)/m_\tau(i)$. Lastly, for $\tau = \mathrm{I}$, we use the trivial bound 1 for $\rho_{\mathrm{I}}(i)$ and Lemma 5.1 for $1 - f_{\mathrm{I}}(G)/m_{\mathrm{I}}(i)$. These yield the desired bound above.
We also have which is the number of switching steps in FactorUniform. By Lemma 5.11, this is O(i 1 ) in expectation and with high probability. Thus, the probability of an f-rejection is at most and this is o(1) when d 2 + ∆ 2 = o(n). Next, consider pre-b-rejections. Pre-b-rejections can happen when a switching of type τ ∈ {IIb±, IIc±} is performed. Given G t = G ∈ S i , the probability that a pre-b-rejection occurs at step t + 1 is Thus the above summation is Since τ |Ψ τ,α (G ′ )| = b α (G ′ ), by definition of b α (G ′ ), the probability of a b-rejection in FactorUniform is (since ρ I (i ′ + 1) ≤ 1 and σ i ′ +1 = O((i + 1)σ i ′ /d∆) by Lemma 5.9) where Eb α (G ′ ) is the expectation of b α (G ′ ) on a uniformly random G ′ ∈ S i ′ . Thus, the contribution to (6.2) from α = A is O d 2 + ∆ 2 n 2 + 1 n ∆nd 2 (dn) 2 d∆(dn) 3 (1), Thus, the contribution to (6.2) from α ∈ {B1±} is Thus, the contribution to (6.2) from α ∈ {C±} is This completes the proof that the probability of any b-rejection in FactorUniform is o(1).