Comparison of two convergence criteria for the variable-assignment Lopsided Lovász Local Lemma

The Lopsided Lovász Local Lemma (LLLL) is a cornerstone probabilistic tool for showing that it is possible to avoid a collection of "bad" events as long as their probabilities and interdependencies are sufficiently small. The strongest possible criterion in these terms is due to Shearer (1985), although it is technically difficult to apply to constructions in combinatorics. The original formulation of the LLLL was non-constructive; the seminal work of Moser & Tardos (2010) gave an efficient algorithm for nearly all its applications, including to $k$-SAT instances where each variable appears in a bounded number of clauses. Harris (2015) later gave an alternate criterion for this algorithm to converge; unlike the LLL criterion or its variants, this criterion depends in a fundamental way on the decomposition of bad-events into variables. In this note, we show that the criterion given by Harris can be stronger in some cases even than Shearer's criterion. We construct $k$-SAT formulas with bounded variable occurrence, and show that the criterion of Harris is satisfied while the criterion of Shearer is violated. In fact, there is an exponentially growing gap between the bounds provable from any form of the LLLL and the bound shown by Harris.


Introduction
The Lovász Local Lemma (LLL) is a general probabilistic principle for showing that, in a probability space Ω with a finite set B of "bad" events which are not too interdependent and not too likely, there is a positive probability that no event in B occurs.
Since its introduction in [3], it has become a cornerstone of the probabilistic method of combinatorics.
There have been numerous extensions of the LLL since its original formulation. One important generalization known as the Lopsided Lovász Local Lemma (LLLL) [4] observes that it is not necessary for bad-events to be fully independent. If the bad-events are positively correlated in a certain sense, then for the purposes of the LLL this is just as good as independence. This type of correlation, which we discuss shortly, is known as lopsidependency.
In order to explain the LLL formally, we need to introduce a number of definitions. For any collection of events $S \subseteq \mathcal{B}$, we define $\overline{S} = \bigcap_{B \in S} \overline{B}$; we refer to this event as avoiding $S$. A dependency graph is a graph G on vertex set $\mathcal{B}$ such that for any $B \in \mathcal{B}$ […]
A probability space Ω and collection of bad-events $\mathcal{B}$ does not have a unique dependency graph or lopsidependency graph. Rather, we suppose that we are given Ω, $\mathcal{B}$ and some chosen graph G which is a (lopsi-)dependency graph for them.
For such a graph G with vertex set $V = \mathcal{B}$, we say a set S ⊆ V is stable if no two elements of S are adjacent in G. For real numbers $p_v$, indexed by the vertices v ∈ V, we define the stable set polynomial of G with respect to base set S ⊆ V, denoted $Q(G, S, \vec{p})$, by […]
3. (Cluster-expansion criterion [2]) If there is a function µ : […]
The symmetric LLLL uses only a few crude parameters of the problem instance, namely, the maximum probability of a bad-event and the maximum degree of the lopsidependency graph. The other variants use progressively more information and take advantage of refined dependency structure. See also [14] for another criterion in this vein. In [20], Shearer derived the most powerful possible criterion in these terms.
1. Suppose that $Q(G, \emptyset, \vec{p}) > 0$ and $Q(G, S, \vec{p}) \geq 0$ for all S ⊆ V. Then for any probability space Ω, and any events $B_1, \dots, B_n \subseteq \Omega$ in that space such that $\Pr(B_i) = p_i$ for i = 1, …, n and such that G is a lopsidependency graph for $\mathcal{B} = \{B_1, \dots, B_n\}$, […] In this case, we say that Shearer's criterion is satisfied by $G, \vec{p}$.

Suppose that either
Then there is some probability space Ω and events $B_1, \dots, B_n \subseteq \Omega$ such that $\Pr_{\Omega}(B_i) = p_i$ for i = 1, …, n and such that G is a dependency graph for $\mathcal{B} = \{B_1, \dots, B_n\}$ and $\Pr(\overline{\mathcal{B}}) = 0$.
In this case, we say that Shearer's criterion is violated by G, ⃗ p.
Having bad-events with probability 0 or 1 is not so interesting, and Theorem 2 can be simplified when we disallow these cases.
[…] Lemma 5.27). Suppose that $p_1, \dots, p_n \in (0, 1)$. Shearer's criterion is satisfied by $G, \vec{p}$ if and only if $Q(G, S, \vec{p}) > 0$ for all stable sets S.
Thus, Shearer's criterion exactly characterizes which probability and lopsidependency structure of the bad-events guarantees a positive probability of avoiding B. From a theoretical point of view, alternate bounds such as Theorem 1 are all weaker than, and are implied by, Shearer's criterion. However, Shearer's criterion is technically difficult to apply to constructions in combinatorics.
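To make the criterion concrete on small examples, the following Python sketch evaluates a stable set polynomial by brute force. It assumes the standard definition $Q(G, S, \vec{p}) = \sum_{U \supseteq S,\ U \text{ stable}} (-1)^{|U| - |S|} \prod_{v \in U} p_v$ and uses the characterization "Q(G, S, p) > 0 for all stable S" for probabilities in (0, 1). The function names are ours and purely illustrative, and the enumeration is exponential in |V|, so this is only suitable for tiny graphs.

```python
from itertools import combinations

def _adjacency(vertices, edges):
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def _is_stable(U, adj):
    # U is stable when no vertex of U has a neighbor inside U.
    return not any(adj[u] & U for u in U)

def stable_set_polynomial(vertices, edges, p, base=()):
    """Q(G, base, p) under the assumed definition: signed sum over stable U containing base."""
    adj = _adjacency(vertices, edges)
    base = set(base)
    rest = [v for v in vertices if v not in base]
    total = 0.0
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            U = base | set(extra)
            if not _is_stable(U, adj):
                continue
            weight = 1.0
            for v in U:
                weight *= p[v]
            total += (-1) ** r * weight  # r = |U| - |base|
    return total

def shearer_satisfied(vertices, edges, p):
    """For probabilities in (0, 1): check Q(G, S, p) > 0 for every stable set S."""
    adj = _adjacency(vertices, edges)
    for r in range(len(vertices) + 1):
        for S in combinations(vertices, r):
            if _is_stable(set(S), adj) and stable_set_polynomial(vertices, edges, p, S) <= 0:
                return False
    return True
```

For the complete bipartite graph with two vertices on each side and all probabilities equal to p, the polynomial with empty base set works out to 2(1 − p)^2 − 1, which is positive at p = 1/4 and negative at p = 0.3.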

The variable-assignment LLLL
The LLLL has been applied to diverse probability spaces such as random permutations [16], Hamiltonian cycles [1], and perfect matchings [17]. However, by far the most common form of the LLL and LLLL concerns what we refer to as the variable-assignment setting. Here, the probability space Ω has m independent discrete random variables $X_1, \dots, X_m$, and the bad-events can be taken to be "monomial events"; that is, each $B \in \mathcal{B}$ can be written in the form $B \equiv (X_{i_1} = j_1) \wedge \cdots \wedge (X_{i_k} = j_k)$. For such a monomial event, we define $\mathrm{var}(B) = \{i_1, \dots, i_k\}$. We say that two events […]
It is immediate that the canonical dependency graph is, indeed, a dependency graph for Ω, $\mathcal{B}$. The fact that the canonical lopsidependency graph is a lopsidependency graph follows from the FKG inequality. Most applications of the LLL use only the canonical dependency graph; some noteworthy applications of the canonical lopsidependency graph include monochromatic hypergraph coloring [18] and boolean satisfiability [6]. We will discuss the latter in much more detail later.
In [13], Kolipaka & Szegedy noted that the Shearer criterion is not tight for the variable-assignment LLL setting. Namely, they found an explicit dependency graph and vector of probabilities where the Shearer criterion is violated yet any variable-assignment realization must have a satisfying assignment. Later work [11] provided a more systematic description of which dependency graphs were satisfiable in the variable-assignment setting.

The Moser-Tardos algorithm
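The LLLL ensures that $\Pr(\overline{\mathcal{B}}) > 0$, and this is usually sufficient for combinatorics, where the main goal is to show existence results. However, typically $\Pr(\overline{\mathcal{B}})$ is exponentially small, and hence the LLLL does not give efficient algorithms for constructing such a configuration. In [19], Moser & Tardos introduced a remarkably simple algorithm for the variable-assignment LLLL setting:
1. Draw each variable independently from the distribution Ω.
2. While some bad-event is true:
3. Choose a true bad-event B arbitrarily.
4. Resample the variables in var(B) according to the distribution Ω.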
They showed that when the asymmetric LLLL criterion is satisfied with respect to the canonical lopsidependency graph, then this algorithm terminates in expected polynomial time with a configuration avoiding B. Later work [13] showed that this algorithm terminates quickly whenever the Shearer criterion is satisfied. Thus, at least for the variable-assignment LLLL setting, this gives an efficient algorithm for nearly every construction based on the LLLL.
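As an illustration, here is a minimal Python sketch of the resampling algorithm specialized to CNF formulas with independent uniform variables. The clause encoding (tuples of `(variable, is_positive)` literals), the function name, the fixed seed, and the `max_steps` safety cap are our own assumptions for the example and are not part of the algorithm as analyzed in [19].

```python
import random

def moser_tardos_sat(num_vars, clauses, seed=0, max_steps=1_000_000):
    """Resampling sketch: returns a satisfying assignment, or None if the cap is hit."""
    rng = random.Random(seed)
    # Draw each variable independently (here: uniform on {True, False}).
    x = {v: rng.random() < 0.5 for v in range(1, num_vars + 1)}

    def is_bad(clause):
        # The bad-event of a clause is true when every one of its literals is false.
        return all(x[v] != positive for v, positive in clause)

    for _ in range(max_steps):
        # Choose a true bad-event arbitrarily (here: the first one found).
        bad = next((c for c in clauses if is_bad(c)), None)
        if bad is None:
            return x
        # Resample only the variables of the chosen bad-event.
        for v, _positive in bad:
            x[v] = rng.random() < 0.5
    return None
```

The choice of "first violated clause" is just one admissible selection rule; the description above allows any rule.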
In [7], Harris gave a different type of criterion for the Moser-Tardos algorithm. Unlike the symmetric LLLL or other similar criteria, this cannot be stated solely in terms of the dependency graph and the probabilities of the bad-events. We summarize it here (in a slightly simplified form).

Definition 5 (Orderability). Given B ∈ B, we say that a set of bad-events
Then the Moser-Tardos algorithm terminates with probability 1.
Theorem 6 is superficially similar to the cluster-expansion criterion. It is strictly stronger than the asymmetric LLLL and certain simplified forms of the cluster-expansion criterion. However, its relation to the Shearer criterion is not clear. It is quite plausible, along the lines of [13,9], that it truly takes advantage of extra information in the variable-assignment LLLL. On the other hand, it is also quite plausible that Theorem 6 is more along the lines of [14], namely, that it provides a more accurate and computationally efficient approximation to Shearer's criterion.
In this paper, we will construct a problem instance for which Theorem 6 is satisfied, yet Shearer's criterion is violated. Thus, it is impossible to deduce the fact that $\Pr(\overline{\mathcal{B}}) > 0$ based only on the probabilities and interdependency structure of the bad-events; it is necessary to take into account the decomposition of the bad-events into variables (as is provided by Theorem 6). In other words, Theorem 6 can be stronger than Shearer's criterion.
We emphasize that Shearer's criterion concerns arbitrary probability spaces; one cannot hope to provide a stronger criterion than Shearer's for the level of generality to which the latter applies. The strength of Theorem 6 comes from its less general setting (the variable-assignment LLLL), which nevertheless encompasses many applications in combinatorics.
We also remark on other related criteria for the variable-assignment LLL setting. For instance, [9,10] derive certain convergence conditions in terms of the bipartite graph H on vertex sets {1, . . . , m} and B and an edge on (i, B) when i ∈ var(B), and [11] derives conditions in terms of the probabilities that certain neighboring bad-events hold simultaneously.

Satisfiability with bounded variable occurrence
Consider boolean k-satisfiability instances, where we have m boolean variables $X_1, \dots, X_m$ and n clauses $C_1, \dots, C_n$ of width k, each of the form $C_i = l_{i1} \vee l_{i2} \vee \cdots \vee l_{ik}$ for distinct literals $l_{i1}, \dots, l_{ik}$ (i.e. expressions of the form $X_j$ or $\neg X_j$). The goal is to produce a value for the boolean variables $(X_1, \dots, X_m) \in \{T, F\}^m$ such that all the clauses $C_i$ are simultaneously true. Equivalently, we want to find a satisfying assignment of the conjunctive normal form (CNF) formula $\Phi = C_1 \wedge C_2 \wedge \cdots \wedge C_n$.
the electronic journal of combinatorics 29(4) (2022), #P4.10
We are interested specifically in instances where each variable appears in a bounded number of clauses. For each i = 1, …, m, define $R_0(\Phi, i)$ and $R_1(\Phi, i)$ to be the number of clauses which contain the literal $X_i$ (respectively $\neg X_i$), and let $R(\Phi, i) = R_0(\Phi, i) + R_1(\Phi, i)$.
In [15], Kratochvíl, Savický, and Tuza defined the function f(k) as the largest integer L such that whenever $R(\Phi, i) \leq L$ for all i, then Φ is satisfiable; they showed $f(k) \geq \frac{2^k}{ek}$. A series of later works [21,12,5,6] showed a variety of upper and lower bounds on f(k). In particular, [6] showed […]
The lower bound comes from the variable-assignment LLLL. Here, the probability space Ω is defined by setting each variable $X_i = T$ with a certain probability $p_i$ given by […] for some carefully chosen parameter x ⩾ 0. Then, for each clause $C_i$, there is a corresponding bad-event $B_i$ that $C_i$ is false, namely $B_i$ has the form […] where $j_{i1}, \dots, j_{ik} \in \{T, F\}$. Using Theorem 6 in place of the LLLL, and using a slightly different value for the probabilities $p_i$, Harris [7] showed a stronger bound […]
With these constructions, we thus know the asymptotic bound […]
Nevertheless, there are two main reasons to determine f(k) as precisely as possible. First, since f(k) grows exponentially in k, the asymptotic value is not as relevant for practical applications.
Second, [15] showed a sudden gap in the computational complexity of k-SAT: for problem instances where variables may appear in f(k) + 1 clauses, it is NP-complete to determine satisfiability. On the other hand, problem instances where they appear in at most f(k) clauses are always satisfiable and the problem is computationally vacuous. Thus, tiny gaps in the value of f(k) can lead to huge gaps in computational hardness.

Restricting the number of occurrences of each literal
Our goal is to demonstrate that the bound in Eq. (3) cannot be shown from the Shearer criterion. If the probability space Ω is allowed to vary in a problem-specific way, then any satisfiable formula can trivially satisfy the LLL: namely, Ω puts probability mass 1 on some satisfying assignment. Thus, in order to separate the LLL and Theorem 6, we must restrict Ω to be problem-independent.
In both the constructions of [6] and [7], the probabilities p i depend solely on the imbalance between R 0 (Φ, i) and R 1 (Φ, i). They use slightly different formulas; however, in both constructions, the extremal case is when R 0 (Φ, i) = R 1 (Φ, i), in which case p i is set to 1/2.
Accordingly, let us define f ′ (k) to be the largest integer L such that whenever R 0 (Φ, i) ⩽ L and R 1 (Φ, i) ⩽ L for all i, then the formula Φ is satisfiable. Clearly f ′ (k) ⩾ f (k)/2. This function is also studied in [5], with slightly different terminology, in terms of a combinatorial object called a (k, d)-tree.

Definition 7 ([6]). A (k, d)-tree is a binary tree T where every leaf has depth at least k, and every node u of T has at most d descendant leaves within distance k of u.
We quote the following two results from [5] and [6]:
Theorem 8. […]
This immediately gives the following result: […]
Let us use the LLL and Theorem 6 to show more precise lower bounds on $f'(k)$. We will fix a problem-independent probability space Ω which sets each $X_i$ to be T with probability $p_i = 1/2$. For each clause $C_i$, we have a bad-event $B_i$ with probability $\Pr(B_i) = p = 2^{-k}$.
Theorem 10 (Follows easily from the symmetric LLLL). $f'(k) \geq \lfloor \frac{2^k}{ek} - \frac{1}{k} \rfloor$.
Proof. Consider some bad-event, without loss of generality $B \equiv (X_1 = T) \wedge \cdots \wedge (X_k = T)$. The neighbors of B in the canonical lopsidependency graph G are bad-events involving $X_i = F$ for some i = 1, …, k; as each literal occurs at most L times, there are at most d = kL such bad-events. The symmetric LLLL criterion $ep(d + 1) \leq 1$ then holds if $L \leq \frac{2^k}{ek} - \frac{1}{k}$.
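The arithmetic in this proof can be checked directly. The short Python sketch below searches for the largest L satisfying the symmetric criterion e·p·(d + 1) ⩽ 1 with p = 2^(−k) and d = kL, exactly as stated in the proof; the function name is ours, and floating-point evaluation of e is assumed to be accurate enough for small k.

```python
import math

def largest_L_symmetric_lll(k):
    """Largest integer L with e * 2^-k * (k*L + 1) <= 1 (returns -1 if even L = 0 fails)."""
    L = -1
    # Keep increasing L while the symmetric criterion still holds for L + 1.
    while math.e * 2.0 ** (-k) * (k * (L + 1) + 1) <= 1:
        L += 1
    return L
```

Solving the inequality for L gives L ⩽ 2^k/(ek) − 1/k, so the search should agree with the floor of that expression.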

Theorem 11 (From Theorem 6). Suppose that […] for all i. Then the Moser-Tardos algorithm finds a satisfying assignment of Φ in expected polynomial time.
In particular, […]
Proof. We will set µ(B) = α for all $B \in \mathcal{B}$, where α ⩾ 0 is some parameter to be determined. Consider some bad-event, without loss of generality […] It is difficult to list all orderable sets of neighbors of B according to Definition 5. However, to apply Theorem 6, we only need to provide an upper bound on the sum over such orderable sets (possibly including some additional neighbor-sets Y as well). Any such orderable set will have, for each j = 1, …, k, a choice of zero or one bad-events $A_j$ which first disagree with B on variable $X_j$. (That is, in Definition 5, we have $B_i = A_j$ where $z_i = X_j$.) Thus, we have an upper bound: […]
So a sufficient criterion to satisfy Theorem 6 is […] We choose α to maximize $\alpha - 2^{-k}(\alpha + (1 + L\alpha)^k)$; simple calculus gives α = […] and $L \leq \frac{2^k - 1}{k}$, then Theorem 6 is satisfied. The second condition $L \leq \frac{2^k - 1}{k}$ can be easily seen to be redundant, leading to the given bounds.
Let us define $F_{\mathrm{LLL}}(k)$ and $F_{\mathrm{MT}}(k)$ […] to be the bounds on $f'(k)$ which are provable respectively from the symmetric LLLL (Theorem 10) and from the criterion of Theorem 11. We observe that […] So the gap between the LLL and Theorem 6 appears to be growing exponentially in k. (The relative difference between the formulas approaches zero, however.)
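The optimization over α can also be explored numerically. The Python sketch below assumes that the sufficient criterion is "there exists α ⩾ 0 with α − 2^(−k)(α + (1 + Lα)^k) ⩾ 0", which is our reading of the expression maximized in the proof of Theorem 11. The stationary point used in the code is our own calculation from that expression (the expression is concave in α), and the function names are hypothetical. It requires k ⩾ 2.

```python
def theorem6_criterion_holds(k, L):
    """Assumed criterion: exists alpha >= 0 with alpha - 2^-k (alpha + (1 + L alpha)^k) >= 0."""
    p = 2.0 ** (-k)
    if L == 0:
        return True  # the expression is (1 - p) * alpha - p, nonnegative for large alpha
    ratio = (2 ** k - 1) / (k * L)
    if ratio < 1:
        return False  # expression is decreasing on alpha >= 0 and equals -p at alpha = 0
    # Setting the derivative to zero gives (1 + L alpha)^(k-1) = (2^k - 1) / (k L).
    t = ratio ** (1.0 / (k - 1))
    alpha = (t - 1) / L
    return alpha - p * (alpha + t ** k) >= 0

def largest_L_theorem6(k):
    """Largest L for which the assumed criterion holds (it is monotone in L)."""
    L = 0
    while theorem6_criterion_holds(k, L + 1):
        L += 1
    return L
```

Under these assumptions, k = 10 gives L = 39, compared with L = 37 from the symmetric criterion e·p·(kL + 1) ⩽ 1.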

Constructing the extremal formula Φ
Let us fix integers L, k. We will construct a k-SAT instance Φ with R 0 (Φ, i), R 1 (Φ, i) ⩽ L, in which the Shearer criterion is violated for the canonical lopsidependency graph corresponding to the natural space Ω where Pr(X i = T ) = 1/2, and all variables X i are independent, and with the natural collection of bad-events corresponding to the clauses. However, L ⩽ F MT (k); thus Theorem 6 ensures that Φ is satisfiable.
To begin the construction, start with Φ 0 containing no clauses (i.e. Φ 0 is the tautology). At stage i of the process, we modify Φ i−1 to produce a new formula Φ i by adding L − 1 clauses in which i appears positively and L − 1 clauses in which i appears negatively. All other variables in these clauses are completely new, not appearing in any clause of Φ i−1 ; they all appear positively in the 2L − 2 new clauses, and each of the new variables (other than variable i) appears in exactly one new clause.
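The staged construction can be sketched in a few lines of Python. The variable numbering is our own assumption: variable 1 is the starting variable and fresh variables are numbered consecutively, so that variable i already exists when stage i begins (this needs L ⩾ 2 and k ⩾ 2). Clauses are encoded as tuples of `(variable, is_positive)` literals, and the helper names are illustrative only.

```python
from collections import Counter

def build_formula(L, k, stages):
    """Return the clauses obtained after the given number of stages."""
    clauses = []
    next_fresh = 2  # variable 1 is the starting variable; fresh ones follow
    for i in range(1, stages + 1):
        # L - 1 clauses with literal X_i, then L - 1 clauses with literal not-X_i.
        for positive in (True, False):
            for _ in range(L - 1):
                fresh = range(next_fresh, next_fresh + k - 1)
                next_fresh += k - 1
                # Fresh variables appear positively and in exactly one clause.
                clauses.append(((i, positive),) + tuple((v, True) for v in fresh))
    return clauses

def literal_counts(clauses):
    """Count positive and negative occurrences of each variable."""
    pos = Counter(v for c in clauses for v, s in c if s)
    neg = Counter(v for c in clauses for v, s in c if not s)
    return pos, neg
```

Counting occurrences with `literal_counts` is a quick way to confirm that no literal appears more than L times in the formula produced by this sketch.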
Note that $\Pr(B) = p = 2^{-k}$ for all bad-events. Furthermore, since each variable i has exactly one positive occurrence added in some iteration $\Phi_{i'}$ for $i' \neq i$, we have […]
Define $G_\ell$ to be the canonical lopsidependency graph corresponding to the bad-events for the formula $\Phi_\ell$. Although these graphs are complex, they contain a relatively simple and regular family of subgraphs $H_j$. We will show that Shearer's criterion is violated for these subgraphs; as shown in [20], this implies that Shearer's criterion is violated for the overall graph $G_\ell$.
The graph family H j will consist of many copies of K L−1,L−1 , the complete bipartite graph with L − 1 vertices on each side. Each graph H j has a special copy of K L−1,L−1 , called the root of H j . We define these graphs recursively. First, H 0 is the empty graph.
To form H j+1 , we start by taking a new copy of K L−1,L−1 designated as the root of H j+1 . For each vertex v in this root, we add k − 1 separate new copies of H j , along with an edge connecting v to all the vertices in the right-half of the root of the corresponding H j .
For example, H 1 consists of a single copy of K L−1,L−1 . See Figure 1.

[Figure: the root of $H_{j+1}$, with (k − 1) copies of $H_j$ attached to a vertex v.]
Figure 1: Construction of $H_{j+1}$ from $H_j$. We have only shown here two copies of $H_j$ corresponding to a single vertex v in the root of $H_{j+1}$. There are k − 1 copies of $H_j$ for each vertex in the root of $H_{j+1}$ (a total of 2(L − 1)(k − 1) copies of $H_j$).
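The recursive definition of the graphs H_j translates directly into code. The following Python sketch builds H_j as an edge list on vertices 0, …, n − 1; the vertex numbering and the returned tuple layout are our own choices for illustration.

```python
def build_H(j, L, k):
    """Return (n, edges, root_left, root_right) for the graph H_j."""
    if j == 0:
        return 0, [], [], []  # H_0 is the empty graph
    sub_n, sub_edges, _, sub_right = build_H(j - 1, L, k)
    # New root: a copy of K_{L-1,L-1} on vertices 0 .. 2L-3.
    left = list(range(L - 1))
    right = list(range(L - 1, 2 * L - 2))
    edges = [(u, v) for u in left for v in right]
    n = 2 * L - 2
    for v in left + right:
        for _ in range(k - 1):
            # Fresh copy of H_{j-1}, with its vertices shifted by the offset n.
            edges += [(a + n, b + n) for a, b in sub_edges]
            # Join v to the right-half of the root of that copy.
            edges += [(v, r + n) for r in sub_right]
            n += sub_n
    return n, edges, left, right
```

For L = 2 and k = 2 the root is a single edge and each root vertex gets one copy of H_{j−1}, so H_2 is a path on six vertices.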

Proposition 12.
Any graph H j appears as a subgraph of G ℓ for some ℓ sufficiently large.
Proof. Define A i to be the collection of clauses in Φ i but not Φ i−1 . We can also define a tree structure T on the variables of Φ: variable i is a parent of variable j if variable j appears in Φ i but not Φ i−1 . For any variable i, let T i denote the subtree of T rooted at i.
For any set of variables S, define G ℓ [S] to be the subgraph of G ℓ induced on the clauses ϕ of Φ ℓ such that all variables in ϕ come from S. Observe that if S, S ′ are disjoint sets of variables then G ℓ [S], G ℓ [S ′ ] are also vertex-disjoint graphs.
We will prove by induction on j a stronger claim: for any variable i, there is some integer D(i, j) sufficiently large such that the induced subgraph G D(i,j) [T i ] contains a copy of H j , and the root of this copy of H j corresponds to the clauses of A i .
When j = 0 this is vacuously true. For the induction step, consider some variable i. Let C denote the (2L − 2)(k − 1) variables which are children of i in T . By inductive hypothesis, for each i ′ ∈ C, the graph G D(i ′ ,j−1) [T i ′ ] contains a copy of H j−1 whose root corresponds to A i ′ .
Let ℓ = i + max i ′ ∈C D(i ′ , j − 1); we claim that the choice D(i, j) = ℓ satisfies the induction claim. For, in the graph G ℓ [T i ], the clauses of A i in which i appears positively are lopsidependent with those clauses in which i appears negatively. Thus, it has a copy of K L−1,L−1 corresponding to A i ; we denote this copy by J.
Consider some clause ϕ ∈ A i , corresponding to a vertex of J, and some variable i ′ ̸ = i in this clause. The root of J i ′ corresponds to the clauses A i ′ . Note that ϕ is the only clause of A i in which i ′ appears, and it appears positively in ϕ. Variable i ′ also appears negatively in exactly L − 1 clauses of A i ′ , which correspond to the right-half of J i ′ . Thus, there are edges from ϕ in J to all the right-vertices in k − 1 copies of H j−1 . As this is true for every ϕ ∈ J, the resulting graph is precisely H j . This completes the induction.
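The two structural facts used in this proof can be checked on small instances. The Python sketch below rebuilds the clause sets A_i stage by stage and tests adjacency under the assumption that two clauses are lopsidependent exactly when some variable occurs in them with opposite signs; this adjacency rule is our reading of the canonical lopsidependency graph, and all names and the variable numbering are illustrative.

```python
def build_blocks(L, k, stages):
    """blocks[i] = clauses added at stage i (the set A_i), as tuples of literals."""
    blocks, next_fresh = {}, 2
    for i in range(1, stages + 1):
        blocks[i] = []
        for positive in (True, False):
            for _ in range(L - 1):
                fresh = range(next_fresh, next_fresh + k - 1)
                next_fresh += k - 1
                blocks[i].append(((i, positive),) + tuple((v, True) for v in fresh))
    return blocks

def lopsidependent(c1, c2):
    """Assumed adjacency rule: some variable occurs with opposite signs."""
    literals2 = set(c2)
    return any((v, not s) in literals2 for v, s in c1)

def is_complete_bipartite(block, i):
    """Check that A_i, split by the sign of variable i, induces K_{L-1,L-1}."""
    left = [c for c in block if (i, True) in c]
    right = [c for c in block if (i, False) in c]
    cross = all(lopsidependent(a, b) for a in left for b in right)
    inside = any(lopsidependent(a, b)
                 for side in (left, right) for a in side for b in side)
    return cross and not inside
```

With these helpers, one can also confirm that a clause of A_i containing a child variable i′ is adjacent to exactly the L − 1 clauses of A_{i′} in which i′ appears negatively.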

Computing the Shearer criterion for H j
We now compute the Shearer criterion for the family of graphs H j . For our intermediate calculations, we also need to work with another closely-related family of graphs. For each j ⩾ 0, define a graph H ′ j by taking a single vertex v along with k − 1 new copies of H j . We include an edge from v to all the vertices in the right-half of the roots of H j . See Figure 2.  We will make use of two computational tricks for stable set polynomials; the proofs of these are elementary and are omitted here.

Proposition 13. If vertex set V is partitioned into connected-components as
We now begin the calculation.
Proposition 15. Let us define […] Then $r_0 = 1 - p$, $s_0 = 1$, and r, s satisfy the mutual recurrence relations for j ⩾ 1: […]
Proof. The base cases are clear, since $H_0$ is empty and $H'_0$ is a single node. We first show the bound on $s_j$ for j ⩾ 1. In any stable set U of $H_j$, either U contains zero vertices from the left-half of the root of $H_j$, or zero vertices from the right-half of the root of $H_j$, or both. In the first two cases, when we remove the vertices in the left (respectively right) half of the root of $H_j$, then we are left with L − 1 copies of $H'_{j-1}$ and (k − 1)(L − 1) copies of $H_{j-1}$. In the third case, we are left with (k − 1)(2L − 2) copies of $H_{j-1}$. We can sum the first two contributions and subtract the third, as it is double-counted: this gives […]
Next consider the bound for $r_j$. Let v denote the root node of $H'_j$ and let $J_1, \dots, J_{k-1}$ be the copies of $H_j$ to which it is connected, and let $P_i$ denote the root of each $J_i$. We apply Proposition 14 with X = {v}, and so either […]: the vertices in the left half of $P_i$ now yield L − 1 disconnected copies of $H'_{j-1}$ and each vertex u in the right half of $P_i$ now yields k − 1 disconnected copies of $H_{j-1}$. Over all k − 1 choices of i and all (k − 1)(L − 1) choices for u in each $P_i$, we see that […] Figure 3.
Summing the contributions of these two terms according to Proposition 14 gives We will show a recurrence relation for a j . Using Proposition 15, we calculate for j ⩾ 1: Here again using Proposition 15, we get and, substituting this into the equation for a j , this implies: We must have a j > 2  Otherwise, suppose that g(a) ⩾ a for some a ∈ (2 − −2 2L−2 , 1]. Observe that g(1) = 1 − p < 1. Hence, the function g(a) − a changes sign on the interval (2 − −2 2L−2 , 1]. This implies there must be a fixed point g(a) = a on this interval.
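For small parameters, the quantities in Proposition 15 can be computed exactly. The Python sketch below assumes that $s_j$ and $r_j$ denote the stable set polynomials (with empty base set and all probabilities equal to p = 2^(−k)) of $H_j$ and $H'_j$ respectively, which matches the stated base cases. The two recurrences in the code are our own reconstruction from the counting argument in the proof, not a transcription of the original displayed formulas, and the function name is hypothetical.

```python
from fractions import Fraction

def shearer_recurrence(L, k, j_max):
    """Return lists (r, s) with r[j], s[j] for j = 0 .. j_max, in exact arithmetic."""
    p = Fraction(1, 2 ** k)
    r, s = [1 - p], [Fraction(1)]
    for j in range(1, j_max + 1):
        # One half of the root removed: L-1 copies of H'_{j-1}, (k-1)(L-1) copies of H_{j-1}.
        half = r[j - 1] ** (L - 1) * s[j - 1] ** ((k - 1) * (L - 1))
        # Both halves removed: (k-1)(2L-2) copies of H_{j-1}; this case is counted twice.
        both = s[j - 1] ** ((k - 1) * (2 * L - 2))
        s_j = 2 * half - both
        # H'_j: either v is unused (k-1 copies of H_j), or v is used and the
        # right-half of each attached root is removed.
        r_j = s_j ** (k - 1) - p * half ** (k - 1)
        s.append(s_j)
        r.append(r_j)
    return r, s
```

As a sanity check, for L = 2 and k = 2 the graphs $H_1$, $H'_1$ and $H_2$ are paths on 2, 3 and 6 vertices, whose stable set polynomials are 1 − 2p, 1 − 3p + p^2 and 1 − 6p + 10p^2 − 4p^3; the reconstructed recurrences reproduce these values. Exact fractions are used because the values shrink very quickly with j.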
For any k ⩾ 1, let us define the quantity $F_{\mathrm{Shearer}}(k)$ by: […] In light of Proposition 17, this is an upper bound on the value of $f'(k)$ that can be shown using the LLL or any variant of it. We observe that $F_{\mathrm{Shearer}}(k) \geq F_{\mathrm{LLL}}(k)$ for all values of k; this must be the case, since the bound $F_{\mathrm{LLL}}$ was indeed derived using the LLL and this is always weaker than Shearer's criterion. To illustrate, we list $F_{\mathrm{LLL}}$, $F_{\mathrm{Shearer}}$, and $F_{\mathrm{MT}}$ for a few small values of k in Table 1.
The gap between $F_{\mathrm{Shearer}}$ and $F_{\mathrm{LLL}}$ is very small, suggesting that there is little to no improvement possible in the bound for $f'(k)$ from a more advanced form of the LLL.
We next derive an asymptotic approximation to $F_{\mathrm{Shearer}}$.