Extended Formulation for CSP that is Compact for Instances of Bounded Treewidth

In this paper we provide an extended formulation for the class of constraint satisfaction problems and prove that its size is polynomial for instances whose constraint graph has bounded treewidth. This implies new upper bounds on extension complexity of several important NP-hard problems on graphs of bounded treewidth.


Introduction
Many important combinatorial optimization problems belong to the class of constraint satisfaction problems (CSP). Naturally, much effort has gone into designing efficient approximation algorithms for CSP, proving complexity lower bounds for CSP, and identifying tractable instances of CSP (e.g., from the point of view of parameterized complexity). It has been shown that CSP is solvable in polynomial time for instances whose constraint graph has bounded treewidth [7].
In recent years, much attention has been given to the extension complexity of problems [5]: for a problem Q, what is the minimum number of inequalities describing a polytope whose (suitably chosen) linear projection coincides with the convex hull H of all integral solutions of Q? Such a polytope is called an extended formulation of H. Note that membership of a problem in the class P of polynomially solvable problems does not necessarily imply the existence of an extended formulation of polynomial size [16]. In this work, we present an extended formulation for CSP and show that its size is polynomial for instances of CSP whose constraint graph has bounded treewidth.
An instance Q = (V, D, H, C) of CSP consists of a set of variables $z_v$, one for each $v \in V = \{1, \dots, n\}$; a set D of finite domains $D_v \subseteq \mathbb{R}$, one for each $v \in V$; a set of hard constraints $H \subseteq \{C_U \mid U \subseteq V\}$; and a set of soft constraints $C \subseteq \{C_U \mid U \subseteq V\}$, where each constraint $C_U$ with $U = \{i_1, i_2, \dots, i_k\}$ and $i_1 < i_2 < \dots < i_k$ is a $|U|$-ary relation $C_U \subseteq D_{i_1} \times D_{i_2} \times \dots \times D_{i_k}$.
The constraint graph of Q is defined as $G = (V, E)$ where $E = \{\{u, v\} \mid \exists C_U \in C \cup H \text{ s.t. } \{u, v\} \subseteq U\}$. We say that a CSP instance Q has bounded treewidth if the constraint graph of Q has bounded treewidth. In binary CSP, every hard and soft relation is a unary or binary relation, and in boolean CSP, the domain of every variable is $\{0, 1\}$. We use D to denote the maximal size of all domains, that is, $D = \max_{u \in V} |D_u|$.
For a vector $z = (z_1, z_2, \dots, z_n)$ and $U = \{i_1, i_2, \dots, i_k\} \subseteq V$ with $i_1 < i_2 < \dots < i_k$, we define the projection of z on U as $z|_U = (z_{i_1}, z_{i_2}, \dots, z_{i_k})$. A vector $z \in \mathbb{R}^n$ satisfies the constraint $C_U \in C \cup H$ if and only if $z|_U \in C_U$. We say that a vector $z^\star = (z^\star_1, \dots, z^\star_n)$ is a feasible assignment for Q if $z^\star \in D_1 \times D_2 \times \dots \times D_n$ and $z^\star$ satisfies every hard constraint in H. For a given feasible assignment $z^\star$ we define an extended feasible assignment $\mathrm{ex}(z^\star) = (z^\star, h^\star) \in \mathbb{R}^{n + |C|}$ as follows: the coordinates of $h^\star$ are indexed by the soft constraints from C (to be more precise: by the subsets U of V used as lower indices of the soft constraints) and for each $C_U \in C$, we have $h^\star_U = 1$ if $z^\star|_U \in C_U$, and $h^\star_U = 0$ otherwise.
We denote by $F(Q)$ the set of all feasible assignments for Q, and by $F_{ex}(Q) = \{\mathrm{ex}(z^\star) \mid z^\star \in F(Q)\}$ the set of all extended feasible assignments for Q. For every instance Q we define two polytopes: $CSP(Q)$ is the convex hull of $F_{ex}(Q)$ and $CSP'(Q)$ is the convex hull of $F(Q)$. We also define three trivial linear projections, $\mathrm{proj}_V(z, h) = z$, $\mathrm{proj}_E(z, h) = h$ and $\mathrm{proj}_{id}(z, h) = (z, h)$, where $z \in \mathbb{R}^n$ and $h \in \mathbb{R}^{|C|}$, and observe that $\mathrm{proj}_V(CSP(Q)) = CSP'(Q)$.
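The definitions of the constraint graph, the projection z|_U and the extended feasible assignment ex(z) can be illustrated by a minimal Python sketch. The function names and the toy instance below are hypothetical and serve only as an illustration; membership of z in the product of the domains is not checked.

```python
from itertools import combinations

def constraint_graph(scopes):
    """Edges {u, v} such that some constraint scope U contains both u and v."""
    return {pair for U in scopes for pair in combinations(sorted(U), 2)}

def project(z, U):
    """z|_U: values of z on the variables of U, in increasing index order (1-indexed)."""
    return tuple(z[i - 1] for i in sorted(U))

def extended_assignment(z, hard, soft):
    """Return ex(z) = (z, h), or None if z violates a hard constraint.

    hard, soft: lists of (scope, relation) pairs; a relation is a set of tuples.
    """
    if any(project(z, U) not in rel for U, rel in hard):
        return None
    # h has one 0/1 coordinate per soft constraint, in the order given.
    h = tuple(int(project(z, U) in rel) for U, rel in soft)
    return tuple(z), h

# Hypothetical toy instance on V = {1, 2, 3} with Boolean domains.
hard = [({1, 2}, {(0, 0), (0, 1), (1, 0)})]          # z_1 and z_2 are not both 1
soft = [({1}, {(1,)}), ({2, 3}, {(0, 0), (1, 1)})]   # z_1 = 1; z_2 = z_3
edges = constraint_graph([U for U, _ in hard + soft])
```

In this toy instance the unary soft constraint contributes no edge, so the constraint graph is the path 1, 2, 3.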
In the decision version of CSP, the set C of soft constraints is empty and the task is to decide whether there exists a feasible assignment. In the maximization (minimization, resp.) version of the problem, the task is to find a feasible assignment that maximizes (minimizes, resp.) the number of satisfied (unsatisfied, resp.) soft constraints. Note that there is no difference between maximization and minimization versions of the problem with respect to optimal solutions but the two versions differ significantly from an approximation perspective.
In the weighted version of CSP we are also given a weight function w : C → R that specifies for each soft constraint C ∈ C its weight w(C). The goal is to find a feasible assignment that maximizes (minimizes, resp.) the total weight of satisfied (unsatisfied, resp.) constraints. The unweighted version of CSP is equivalent to the weighted version with w(C) = 1 for all C ∈ C.
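As an illustration of the weighted maximization version, the following sketch enumerates all assignments of a small instance by brute force (exponential time, meant only to make the objective concrete). The function name, the 0-indexed scopes and the Max Cut toy data are assumptions made for this example.

```python
from itertools import product

def best_assignment(domains, hard, soft, weight):
    """Return (value, z) maximizing the total weight of satisfied soft constraints.

    hard, soft: dicts mapping a scope (tuple of 0-indexed variables) to a relation.
    weight: dict mapping each soft scope to its weight.
    """
    best_value, best_z = None, None
    for z in product(*domains):
        # Skip assignments that violate some hard constraint.
        if any(tuple(z[i] for i in U) not in rel for U, rel in hard.items()):
            continue
        value = sum(weight[U] for U, rel in soft.items()
                    if tuple(z[i] for i in U) in rel)
        if best_value is None or value > best_value:
            best_value, best_z = value, z
    return best_value, best_z

# Max Cut on a triangle: an edge constraint is satisfied when its endpoints differ.
cut = {(0, 1), (1, 0)}
soft = {(0, 1): cut, (1, 2): cut, (0, 2): cut}
unit = {U: 1 for U in soft}                     # unweighted version: w(C) = 1
heavy = {(0, 1): 5, (1, 2): 1, (0, 2): 1}       # weighted version
```

With unit weights the sketch reports the maximum number of satisfied constraints; the minimum number of unsatisfied constraints is then the number of soft constraints minus this value.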
Even more generally, the relations in the soft constraints can be replaced by bounded real valued payoff functions: a soft constraint $C_U \in C$ with $U = \{i_1, i_2, \dots, i_k\}$ is not a $|U|$-ary relation but a function $w : D_{i_1} \times D_{i_2} \times \dots \times D_{i_k} \to \mathbb{R}$, and the payoff of the soft constraint $C_U$ for a feasible assignment $z^\star$ is $w(z^\star|_U)$; the objective is to maximize (minimize, resp.) the total payoff. For the sake of simplicity of the presentation we do not consider the problem in this generality although the techniques used in this paper apply in the general setting as well.
For notions related to the treewidth of a graph, we stick to the standard terminology as given in the book by Kloks [10].

Related Work
CSP for graphs of bounded treewidth. As CSP captures many NP-hard problems, it is natural to identify tractable special cases of CSP. Freuder [7] showed that CSP instances with treewidth bounded by τ can be solved in time $O(D^\tau n)$. Later, Grohe et al. [8] proved that, assuming FPT ≠ W[1], this is essentially the only nontrivial class of graphs for which CSP is solvable in polynomial time (cf. Marx [12]).
Describing the polytope of CSP solutions by means of linear programming, for instances of bounded treewidth, is not a new idea. In 2007, Sellmann et al. published a paper [18] in which they described a linear program that was supposed to define the convex hull of all feasible solutions of a binary CSP when the constraint graph is a tree. They also provided a procedure to convert a given CSP instance with bounded treewidth into one whose constraint graph is a tree, at the cost of blowing up the number of variables and constraints by a function of the treewidth. Unfortunately, there was a substantial bug in their proof and one of the main theorems in the paper does not even hold [17].
The paper [18] also implicitly includes this folklore result: if the constraint graph has treewidth at most τ, then CSP can be solved by τ levels of the Sherali-Adams hierarchy. The resulting formulation is of size $O(n^\tau)$ while our approach yields size $O(D^\tau n)$.
CSP for general graphs. Chan et al. [4] study the extent to which linear programming relaxation can be used in dealing with approximating CSP. They show that polynomial-sized LPs are exactly as powerful as LPs obtained from a constant number of rounds of the Sherali-Adams hierarchy. They also prove integrality gaps for polynomial-sized LPs for some CSP.
Raghavendra [13] shows that under the Unique Games Conjecture, a certain simple SDP relaxation achieves the best approximation ratio for every CSP. In a follow up paper, Raghavendra and Steurer [14] describe an efficient rounding scheme that achieves the integrality gap of the simple SDP relaxation, and, in another paper [15], they show unconditionally that the integrality gap of this SDP relaxation cannot be reduced by Sherali-Adams hierarchies.
Other related results. Buchanan and Butenko [3] provide an extended formulation for the independent set problem, a special case of CSP, that has size $O(2^\tau n)$ where τ denotes the treewidth of the given graph. Our results can be viewed as a generalization of this result: the size of our formulation, when applied to the independent set problem, is also $O(2^\tau n)$.
In a recent work, Bienstock and Munoz [2] define a class of so-called general binary optimization problems which are essentially weighted boolean CSP problems, and for instances of treewidth τ provide an LP formulation of size $O(2^\tau n)$. Again, this is a special case of our result in this paper. It is worth mentioning at this point that every CSP instance can be transformed into a boolean CSP instance; however, the standard transformation results in a substantial increase (in some cases even Ω(D)) of the treewidth of the constraint graph.

New Results
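Our main result is summarized as the following theorem.
Theorem 1. For every instance Q of CSP whose constraint graph has treewidth τ, the polytopes $CSP(Q)$ and $CSP'(Q)$ have extended formulations of size $O(D^\tau n)$.
As a corollary we obtain upper bounds on the extension complexity for several NP-hard problems on the class of graphs with bounded treewidth; as far as we know, these results were not previously known.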

Integer Linear Programming Formulation
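We start by introducing the terms and notation that we use throughout this section. We assume that Q = (V, D, H, C) is a given instance of CSP. For every subset $W \subseteq V$ we define the set of all configurations of W as
$$K(W) = \{(\alpha_1, \dots, \alpha_n) \mid \alpha_i \in D_i \text{ for } i \in W,\ \alpha_i = \lambda \text{ for } i \notin W, \text{ and } \alpha|_U \in C_U \text{ for every } C_U \in H \text{ with } U \subseteq W\},$$
where λ is a symbol not appearing in any of the domains $D_u$, $u \in V$. For a configuration $K \in K(U)$ and $v \in V$, we use the notation $K(v)$ to refer to the v-th element of K. Also, for a configuration $K \in K(U)$, $v \in V \setminus U$ and $\alpha \in D_v$, we use the notation $K[v \leftarrow \alpha]$ to denote the configuration obtained from K by setting its v-th element to α, that is, $K[v \leftarrow \alpha](v) = \alpha$ and $K[v \leftarrow \alpha](u) = K(u)$ for $u \neq v$.
For an n-dimensional vector $K = (\alpha_1, \dots, \alpha_n)$ and a subset of variables $U \subseteq V$ we denote by $K \upharpoonright U$ the restriction of K to U, defined as the n-dimensional vector with $(K \upharpoonright U)(i) = K(i)$ for $i \in U$ and $(K \upharpoonright U)(i) = \lambda$ for $i \notin U$ (i.e., we set to λ all coordinates of K outside of U). We denote by Λ the configuration $(\lambda, \dots, \lambda) \in K(\emptyset)$; note that for $\alpha \in D_v$, $\Lambda[v \leftarrow \alpha]$ is the configuration from $K(\{v\})$ with exactly one non-λ element, namely the v-th element, equaling α.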
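The configuration notation can be made concrete with a short Python sketch. It is only an illustration under simplifying assumptions: variables are 0-indexed, λ is represented by `None`, any filtering of configurations by hard constraints is left out, and the helper names `configurations`, `extend` and `restrict` are hypothetical.

```python
from itertools import product

LAMBDA = None  # stands for the symbol λ, assumed not to occur in any domain

def configurations(W, domains):
    """All n-vectors with a domain value on each coordinate in W and λ elsewhere."""
    W = sorted(W)
    result = []
    for values in product(*(domains[v] for v in W)):
        K = [LAMBDA] * len(domains)
        for v, a in zip(W, values):
            K[v] = a
        result.append(tuple(K))
    return result

def extend(K, v, alpha):
    """K[v <- alpha]: set the (previously λ) v-th coordinate of K to alpha."""
    assert K[v] is LAMBDA
    return K[:v] + (alpha,) + K[v + 1:]

def restrict(K, U):
    """Restriction of K to U: coordinates outside U are replaced by λ."""
    return tuple(K[i] if i in U else LAMBDA for i in range(len(K)))

domains = [(0, 1), (0, 1, 2), (0, 1)]
empty_configuration = configurations(set(), domains)[0]   # the configuration Λ
```

For example, with these domains there are 2 · 3 = 6 configurations of W = {0, 1}, and extending Λ at coordinate 1 yields a configuration with exactly one non-λ entry.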
In our linear program, for every index $v \in V$ and every $i \in D_v$, we introduce a binary variable $y_v^i$. The task of the variable $y_v^i$ is to encode the value of the CSP-variable $z_v$: the variable $y_v^i$ is set to one if and only if $z_v = i$. Since in every solution each variable assumes a unique value, we enforce the constraint $\sum_{i \in D_v} y_v^i = 1$ for each $v \in V$. For every configuration $K \in \bigcup_{U : C_U \in C \cup H} K(U)$ we introduce a binary variable $g(K)$. The intended meaning of the variable $g(K)$, for $K \in K(U)$ and $U \subseteq V$, is to provide information about the values of the CSP-variables $z_u$ for $u \in U$ in the following way: $g(K) = 1$ if and only if $z_u = K(u)$ for every $u \in U$. To ensure consistency between the y and g variables, for every $C_U \in C \cup H$, every $v \in U$ and every $i \in D_v$, we enforce the constraint $\sum_{K \in K(U) : K(v) = i} g(K) = y_v^i$. Note that for binary CSP, the g variables capture the values of CSP-variables z for pairs of elements from V that correspond to edges of the constraint graph.
Relaxing the integrality constraints we obtain the following initial LP relaxation of the CSP problem Q = (V, D, H, C):
$$\sum_{i \in D_v} y_v^i = 1 \qquad \forall v \in V \qquad (1)$$
$$\sum_{K \in K(U) : K(v) = i} g(K) = y_v^i \qquad \forall C_U \in C \cup H,\ \forall v \in U,\ \forall i \in D_v \qquad (2)$$
$$0 \le y, g \le 1 \qquad (3)$$
Note that there is a one-to-one correspondence between the (extended) feasible assignments of Q and integral solutions of (1)-(3); from now on we denote by $\mathrm{proj}_1$ the linear projection of the convex hull of integral solutions of (1)-(3) to $CSP(Q)$. Also observe that the total weight of CSP-constraints satisfied by an integral vector (y, g) satisfying (1)-(3) is $\sum_{C_U \in C} w(C_U) \sum_{K \in K(U) : K|_U \in C_U} g(K)$.
Unfortunately, even for CSP problems whose constraint graph is series-parallel, the polytope given by the LP (1)-(3) is not integral (consider, e.g., the instance of CSP corresponding to the independent set problem on $K_3$). The weakness of the formulation is that no global consistency among the y variables is guaranteed. To strengthen the relaxation, we introduce new variables and constraints derived from a tree decomposition of the constraint graph of Q.
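The non-integrality on $K_3$ can be checked directly. The sketch below is an illustration under stated assumptions: the LP is taken to consist of the normalization constraints on y and the y-g consistency constraints described in the text, and g variables exist only for edge configurations allowed by the hard constraint "not both endpoints selected". It verifies, in exact arithmetic, that the all-halves point is feasible although its value 3/2 exceeds the size of every independent set in $K_3$, so the point cannot be a convex combination of integral solutions.

```python
from fractions import Fraction
from itertools import product

half = Fraction(1, 2)
V = [0, 1, 2]
edges = [(0, 1), (0, 2), (1, 2)]
allowed = [(0, 0), (0, 1), (1, 0)]  # hard constraint: endpoints not both selected

# Fractional point: every vertex is "half selected".
y = {(v, i): half for v in V for i in (0, 1)}
g = {(e, K): (half if K in [(0, 1), (1, 0)] else Fraction(0))
     for e in edges for K in allowed}

# Each variable takes exactly one value (fractionally).
normalized = all(y[v, 0] + y[v, 1] == 1 for v in V)

# Consistency between g and y for both endpoints of every edge.
consistent = all(
    sum(g[e, K] for K in allowed if K[pos] == i) == y[e[pos], i]
    for e in edges for pos in (0, 1) for i in (0, 1)
)

fractional_size = sum(y[v, 1] for v in V)

# Largest independent set of K_3 by enumeration of integral assignments.
max_integral = max(
    sum(z) for z in product((0, 1), repeat=3)
    if all((z[u], z[v]) in allowed for u, v in edges)
)
```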

Extended Formulation
Here we describe, for every CSP instance Q = (V, D, H, C), a polytope P(Q), and in the next subsection we prove that P(Q) is an extended formulation of $CSP(Q)$ and $CSP'(Q)$. The set of variables in the given LP description of P(Q) is substantially different from the set of variables used in the LP (1)-(3), and the set of new constraints is completely different from the set of constraints in the LP (1)-(3). Whereas in the previous subsection, there is (roughly) a variable g(K) for every feasible assignment of every subset of CSP variables corresponding to a soft or hard constraint, here we have a variable for every feasible assignment of every subset of CSP variables corresponding to a bag in a given tree decomposition of the constraint graph. Nevertheless, as we show after defining P(Q), there exists a simple linear projection of P(Q) to the convex hull of all integral points in the polytope given by the LP (1)-(3).
1 In the case of general payoff functions, the total weight is given by
Let $T = (V_T, E_T)$ be a fixed nice tree decomposition [10] of the constraint graph of Q and for every node $a \in V_T$, let $B(a) \subseteq V$ denote the corresponding bag. Let $\mathcal{B} = \{B(a) \mid a \in V_T\}$ denote the set of all bags of T. Let $K_{\mathcal{B}} = \bigcup_{B \in \mathcal{B}} K(B)$ be the set of all configurations of all bags in T. We use $V_I \subseteq V_T$ to denote the subset of all introduce nodes in T and $V_F \subseteq V_T$ to denote the subset of all forget nodes in T.
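For every configuration $K \in K_{\mathcal{B}}$ we introduce a binary variable $f(K)$. As in the previous subsection, the intended meaning of the variable $f(K)$, for $K \in K(B)$ and a bag $B \in \mathcal{B}$, is to provide information about the values of the CSP-variables $z_u$ for $u \in B$ in the following way: $f(K) = 1$ if and only if $z_u = K(u)$ for every $u \in B$. To ensure consistency among variables indexed by the configurations of the same bag, namely to ensure that for every bag $B \in \mathcal{B}$ there exists exactly one configuration $K \in K(B)$ with $f(K) = 1$, we introduce for every bag $B \in \mathcal{B}$ the LP constraint $\sum_{K \in K(B)} f(K) = 1$. For every introduce node $c \in V_T$ with a child $b \in V_T$ and for every configuration $K \in K(B(b))$ we have the constraint $\sum_{K' \in K(B(c)) : K' \upharpoonright B(b) = K} f(K') = f(K)$, and symmetrically, for every forget node $c \in V_T$ with a child $b \in V_T$ and for every configuration $K \in K(B(c))$ we have the constraint $\sum_{K' \in K(B(b)) : K' \upharpoonright B(c) = K} f(K') = f(K)$. Relaxing the integrality constraints and putting all these additional constraints together, we obtain:
$$\sum_{K \in K(B)} f(K) = 1 \qquad \forall B \in \mathcal{B} \qquad (4)$$
$$\sum_{K' \in K(B(c)) : K' \upharpoonright B(b) = K} f(K') = f(K) \qquad \forall c \in V_I,\ b \text{ the only child of } c,\ \forall K \in K(B(b)) \qquad (5)$$
$$\sum_{K' \in K(B(b)) : K' \upharpoonright B(c) = K} f(K') = f(K) \qquad \forall c \in V_F,\ b \text{ the only child of } c,\ \forall K \in K(B(c)) \qquad (6)$$
$$0 \le f \le 1 \qquad (7)$$
For the given CSP instance Q, we denote the polytope associated with the LP (4)-(7) as P(Q).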
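The bag-based constraints can be sketched in Python for a tiny example. The code below is an illustration under stated assumptions: it uses a hypothetical nice path decomposition of the path on vertices 0, 1, 2 with Boolean domains and no hard constraints, represents λ by `None`, and reads the introduce and forget constraints as "the f values of the larger bag that restrict to K sum to f(K)". It checks the per-bag normalization and the introduce/forget consistency of a given vector f.

```python
from itertools import product

LAM = None  # stands for λ
domains = [(0, 1), (0, 1), (0, 1)]
# Nodes listed from the leaf to the root; the child of each node is the previous one.
nodes = [("leaf", {0}), ("introduce", {0, 1}), ("forget", {1}), ("introduce", {1, 2})]

def configs(bag):
    """All configurations of a bag (domain values on the bag, λ elsewhere)."""
    bag = sorted(bag)
    out = []
    for vals in product(*(domains[v] for v in bag)):
        K = [LAM] * len(domains)
        for v, a in zip(bag, vals):
            K[v] = a
        out.append(tuple(K))
    return out

def restrict(K, bag):
    return tuple(K[i] if i in bag else LAM for i in range(len(K)))

def indicator(z):
    """Integral f induced by an assignment z: f(K) = 1 iff K is z restricted to a bag."""
    return {K: int(K == restrict(z, bag)) for _, bag in nodes for K in configs(bag)}

def satisfies_lp(f):
    for idx, (kind, bag) in enumerate(nodes):
        if sum(f[K] for K in configs(bag)) != 1:      # exactly one configuration per bag
            return False
        if kind == "leaf":
            continue
        child = nodes[idx - 1][1]
        small, big = (child, bag) if kind == "introduce" else (bag, child)
        for K in configs(small):                       # introduce / forget consistency
            if sum(f[Kp] for Kp in configs(big) if restrict(Kp, small) == K) != f[K]:
                return False
    return True

f_good = indicator((0, 1, 0))
f_bad = indicator((0, 0, 0))
f_bad[(LAM, 0, 0)], f_bad[(LAM, 1, 0)] = 0, 1   # last bag now disagrees with bag {1}
```

The number of f variables in this example is 2 + 4 + 2 + 4 = 12, one per configuration of each bag.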
Consider now a vector $f \in P(Q)$ and the following set of linear equations:
It is just a technical exercise to check that for a given $f \in P(Q)$, there always exists a unique solution (y, g) of this system and that the unique (y, g) is a linear projection of f. Moreover, such a vector (y, g) also satisfies the LP constraints (1)-(3). The point is that there exists a linear projection of P(Q) into the polytope defined by the LP (1)-(3); moreover, an integral point from P(Q) is mapped to an integral point. From now on we denote this projection $\mathrm{proj}_2$.

Proof of Theorem 1
As in the previous subsections, we assume that Q = (V, D, H, C) is a given instance of CSP, $G = (V, E)$ is the constraint graph of Q and $T = (V_T, E_T)$ is a fixed nice tree decomposition of G. We start by introducing several notions that will help us deal with tree decompositions and our linear program.
For a node $a \in V_T$, let $T(a) = (V_a, E_a)$ be the subtree of T rooted in a; the configurations relevant to T(a) are those in the set $R(a) = \bigcup_{b \in V_a} K(B(b))$, and the variables relevant to T(a) are those $f(K)$ for which $K \in R(a)$. For succinctness of notation, we denote the projection $f|_{R(a)}$ of the vector f on the set of variables relevant to T(a) also by $f|_a$. The constraints relevant to T(a) are those containing only the variables relevant to T(a). We say that a vector $I \in \{0, 1\}^{R(a)}$ agrees with the configuration $K \in R(a)$ if $I(K) = 1$.
Let f be a fixed solution of the LP (4)-(7) that corresponds to a vertex of the polytope P(Q). Our main tool is the following lemma.
Lemma 1. For every node $a \in V_T$ there exist a positive integer M and integral vectors $I_1, \dots, I_M \in \{0, 1\}^{R(a)}$ such that
♠ every vector $I_i$ satisfies all constraints relevant to T(a), and
♣ $f|_a = \frac{1}{M} \sum_{i=1}^{M} I_i$.
Proof. By induction. We start in the leaves of T and proceed in a bottom-up fashion.
Base case. Assume that $b \in V_T$ is a leaf of the nice decomposition tree T. By definition of a nice tree decomposition, the bag B(b) consists of a single vertex, say a vertex $v \in V$. The only variables relevant to T(b) are $f(\Lambda[v \leftarrow j])$ for $j \in D_v$, and the only relevant constraints are those of the type (4) and (7).
Let $M' \in \mathbb{N}$ be such that an M′-multiple of every relevant variable is integral; as f is a solution corresponding to a vertex of the polytope P(Q), all the variables are rational, which guarantees that such an M′ exists. For every $j \in D_v$ we define an integral vector $I_j$ such that $I_j(K) = 1$ if $K = \Lambda[v \leftarrow j]$ and $I_j(K) = 0$ otherwise. The vector $I_j$ will appear with multiplicity $M' \cdot f(\Lambda[v \leftarrow j])$ among the integral solutions $I_1, \dots, I_{M'}$ for T(b). Then, obviously, both properties ♠ and ♣ are satisfied.
Inductive step. Consider an internal node c ∈ V T of the nice decomposition tree T . We distinguish three cases: c is a join node, c is an introduce node and c is a forget node.
Join node. Assume that the two children of the join node c are the nodes a and b; by the definition of a nice tree decomposition, B(a) = B(b) = B(c). By the inductive assumption, there exist integers M and M′ and integral vectors $I_1, \dots, I_M \in \{0, 1\}^{R(a)}$ and $J_1, \dots, J_{M'} \in \{0, 1\}^{R(b)}$ satisfying the properties ♠ and ♣ for T(a) and T(b), respectively.
Two vectors $I_i$ and $J_j$ that agree with a given configuration $K \in K(B(c))$ can be easily merged into an integral vector $L \in \{0, 1\}^{R(c)}$ that satisfies $L|_a = I_i$ and $L|_b = J_j$; as the set of all constraints relevant to T(c) is the union of the constraints relevant to T(a) and the constraints relevant to T(b), the vector L also satisfies all the constraints relevant to T(c).
For simplicity we assume, without loss of generality, that M = M′. Then, by the property ♣ and since B(a) = B(b) = B(c), for every configuration $K \in K(B(c))$, the number of vectors $I_i$ that agree with K is equal to the number of vectors $J_j$ that agree with K, namely $M \cdot f(K)$. Thus, it is possible to match the vectors $I_i$ and $J_j$ one to one in such a way that both vectors in each pair agree with the same configuration; let $L_1, L_2, \dots, L_M$ denote the result of their merging as described above. Then the vectors $L_i$ satisfy the property ♠ as explained in the previous paragraph, and by construction they also satisfy the property ♣.
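The matching and merging step for a join node can be sketched as follows. This is only an illustration: integral vectors are represented as Python dicts from (hypothetically named) configurations to 0/1, the configurations "K0" and "K1" play the role of the shared bag, and the helper `agrees_with` is an assumption of the example.

```python
from collections import defaultdict

def merge_join(I, J, agrees_with):
    """Pair vectors of the two subtrees that agree on the shared bag and merge each pair.

    I, J: lists of dicts (configuration -> 0/1) for the two children of the join node.
    agrees_with(vec): the configuration of the shared bag that vec agrees with.
    """
    groups = defaultdict(list)
    for vec in J:
        groups[agrees_with(vec)].append(vec)
    merged = []
    for vec in I:
        # A partner exists when, for every bag configuration, both lists contain
        # the same number of agreeing vectors; otherwise pop() raises IndexError.
        partner = groups[agrees_with(vec)].pop()
        merged.append({**vec, **partner})
    return merged

# Toy data with M = 2: "a" and "b" are configurations private to each subtree.
I = [{"K0": 1, "K1": 0, "a": 1}, {"K0": 0, "K1": 1, "a": 0}]
J = [{"K0": 0, "K1": 1, "b": 1}, {"K0": 1, "K1": 0, "b": 0}]
agrees_with = lambda vec: next(K for K in ("K0", "K1") if vec[K] == 1)
L = merge_join(I, J, agrees_with)
```

Each merged vector restricts to its two source vectors, which is the property used in the text to carry over the relevant constraints.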
Introduce node. Assume that the only child of the introduce node c is a node b and $B(c) = B(b) \cup \{v\}$. By the inductive assumption, there exist an integer M and integral vectors $I_1, \dots, I_M \in \{0, 1\}^{R(b)}$, each of them satisfying the relevant constraints for T(b) and such that $f|_b = \frac{1}{M} \sum_{i=1}^{M} I_i$. Without loss of generality we assume that for every variable relevant to T(c), its M-multiple is integral. We partition the vectors $I_1, \dots, I_M$ into several groups indexed by the configurations from K(B(b)): the group $Z_K$, for $K \in K(B(b))$, consists exactly of those vectors $I_i$ that agree with K.
Consider a fixed configuration $K \in K(B(b))$ and the corresponding group $Z_K$. Note that the size of this group is $M \cdot f(K)$. We further partition the group $Z_K$ into subgroups $Z_{K'}$, for $K' = K[v \leftarrow j] \in K(B(c))$ with $j \in D_v$, in such a way that $Z_{K'}$ contains exactly $M \cdot f(K')$ vectors (it does not matter which ones); the LP constraint (5) makes this possible. Then, for every $j \in D_v$, we create from every vector $I \in Z_{K[v \leftarrow j]}$ a new integral vector $J_I$ in the following way: for every $\tilde{K} \in R(b)$, $J_I(\tilde{K}) = I(\tilde{K})$ (this guarantees $J_I|_b = I$), and for every $\tilde{K} \in K(B(c))$, $J_I(\tilde{K}) = 1$ if $\tilde{K} = K[v \leftarrow j]$ and $J_I(\tilde{K}) = 0$ otherwise. Obviously, the new vectors $J_I$ satisfy all constraints relevant to T(b), and it is easy to check that they satisfy all constraints relevant to T(c) as well, given the definitions above. Moreover, the definitions above imply that the vectors $J_I$ satisfy the property ♣.
Forget node. Assume that the only child of the forget node c is a node b, B(c) = B(b) \ {v}. This case is symmetric to the previous one in that instead of splitting the groups Z K into smaller groups Z K ′ , we merge them into bigger Z K ′ .
By the inductive assumption, there exist an integer M and integral vectors $I_1, \dots, I_M \in \{0, 1\}^{R(b)}$, each of them satisfying the relevant constraints for T(b) and such that $f|_b = \frac{1}{M} \sum_{i=1}^{M} I_i$. Without loss of generality we assume that for every variable relevant to T(c), its M-multiple is integral. We partition the vectors $I_1, \dots, I_M$ into several groups indexed by the configurations from K(B(b)): the group $Z_K$, for $K \in K(B(b))$, consists exactly of those vectors $I_i$ that agree with K. Note that the size of $Z_K$ is $M \cdot f(K)$.
For every $K' \in K(B(c))$ we create a bigger group $Z_{K'}$ by merging $|D_v|$ of the groups $Z_K$, namely those satisfying $K \upharpoonright B(c) = K'$. By the LP constraint (6), the new group $Z_{K'}$ contains exactly $M \cdot f(K')$ vectors. For every $K' \in K(B(c))$, we create from every vector $I \in Z_{K'}$ a new integral vector $J_I$ in the following way: for every $\tilde{K} \in R(b)$, $J_I(\tilde{K}) = I(\tilde{K})$, and for every $\tilde{K} \in K(B(c))$, $J_I(\tilde{K}) = 1$ if $\tilde{K} = K'$ and $J_I(\tilde{K}) = 0$ otherwise.
We have to check that the vectors $J_I$ satisfy all constraints relevant to T(c). The only possibly new constraints are those using variables $f(K')$ for $K' \in K(B(c))$ and it is easily seen that they are satisfied, given the definitions above. Also, the definitions above imply that the vectors $J_I$ satisfy the property ♣. □
By applying Lemma 1 to the whole tree T, that is, to the subtree rooted in the root of T, we obtain that f is a convex combination of integral points of P(Q); since f corresponds to a vertex of P(Q), it follows that f is an integral vector, and, thus, the corresponding vertex of P(Q) is integral. As this holds for every vertex of P(Q), we conclude that P(Q) is an integral polytope.
Considering the notes at the ends of the previous two subsections, we also conclude that $CSP(Q) = \mathrm{proj}_1(\mathrm{proj}_2(P(Q)))$ and $CSP'(Q) = \mathrm{proj}_V(CSP(Q))$.
To complete the proof of Theorem 1, we observe that the number of variables and constraints in the LP (4)-(7) is $O(D^\tau n)$.

Applications
The purpose of this section is to make explicit the extension complexity upper bounds given in Theorem 1 for several well known graph problems. We find it interesting that the attained extension complexity upper bounds meet the best possible (assuming Strong ETH) time complexity lower bounds, given by Lokshtanov et al. [11]; the only exception is the Multiway Cut problem. To state our results, we use for each problem the following template:

Problem name | Projection | Extension complexity | Time complexity
Instance: . . .
Solution: . . .
CSP formulation: V, D, H, C.
CSP version: Decision / Max / Min

where Projection is the name of the linear projection that yields the natural polytope of the problem Q from the $CSP(Q)$ polytope (or from the P(Q) polytope, in case of the OCT problem). We use the notation $[n] = \{1, \dots, n\}$.

Coloring / Chromatic Number [1] | $\mathrm{proj}_V$ | $O(q^\tau n)$ | $\Theta(q^\tau n)$
Solution: A coloring of G with q colors with no monochromatic edges.
CSP version: Decision
Comment: Note that the Chromatic Number χ(G) of G is always upper bounded by τ + 1 since graphs of treewidth τ are τ-degenerate and thus (τ + 1)-colorable. Thus, if the goal is to determine χ(G), it suffices to find the smallest q such that $CSP(Q)$ is non-empty.
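The remark on determining χ(G) can be illustrated with a brute-force Python sketch that searches for the smallest q admitting a proper coloring, i.e., the smallest q for which the set of feasible assignments is non-empty. It enumerates assignments directly instead of using the extended formulation, and the function names and 0-indexed vertices are assumptions of the example.

```python
from itertools import product

def has_proper_coloring(n, edges, q):
    """True if some assignment of q colors to n vertices has no monochromatic edge."""
    return any(all(z[u] != z[v] for u, v in edges)
               for z in product(range(q), repeat=n))

def chromatic_number(n, edges):
    """Smallest q for which a proper q-coloring exists (n >= 1)."""
    q = 1
    while not has_proper_coloring(n, edges, q):
        q += 1
    return q

five_cycle = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
path = [(0, 1), (1, 2)]
```

The 5-cycle has treewidth 2, so the bound τ + 1 = 3 from the comment is attained in this example.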

List-H-Coloring / List Homomorphism [6]
CSP version: Decision
Comment: Note that the problems List Coloring, Precoloring Extension and H-Coloring (or Graph Homomorphism) are all special cases of this problem. The lower bound given by Lokshtanov et al. [11] applies to all of them since Coloring is a special case of each of them.

Unique Games [9] | $\mathrm{proj}_{id}$ | $O(t^\tau n)$ | -
Instance: Graph G = (V, E), an integer $t \in \mathbb{N}$, a permutation $\pi_e$ of order t for every edge $e \in E$.
Solution: A mapping $\ell : V \to [t]$ such that the number of edges $uv \in E$ with $\pi_{uv}(\ell(u)) = \ell(v)$ is maximized.
The decision variant of this problem is not interesting as it is trivially solvable in polynomial time.

Multiway Cut [1] | $\mathrm{proj}_E$ | $O(t^\tau n)$ | $O(t^\tau n)$
Instance: Graph G = (V, E), an integer $t \in \mathbb{N}$ and t vertices $s_1, \dots, s_t \in V$.
Solution: A partition of V into sets $V_1, \dots, V_t$ such that $s_i \in V_i$ for every i and the total number of edges between $V_i$ and $V_j$ for $i \neq j$ is minimized.
CSP formulation: V = [n], $D_v = [t]$ for every $v \in V$, H = ∅, $C_{uv} = \{(i, i) \mid i \in [t]\}$ for every edge $uv \in E$.
CSP version: Min
Comment: Setting $z_v = i$ models vertex v belonging to the set $V_i$. Not satisfying the constraint $C_{uv}$ means that the edge uv belongs to the multiway cut.
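The Multiway Cut formulation can be tried out on a small graph with a brute-force Python sketch. It is an illustration under assumptions not stated in the formulation above: vertices and parts are 0-indexed, and the requirement that each terminal lies in its own part is modeled by giving the i-th terminal the singleton domain {i}. The function name is hypothetical.

```python
from itertools import product

def multiway_cut(n, edges, terminals):
    """Return (z, cut) minimizing the number of unsatisfied edge constraints.

    z[v] is the index of the part containing vertex v; an edge constraint C_uv is
    satisfied when both endpoints lie in the same part.
    """
    t = len(terminals)
    domains = [list(range(t)) for _ in range(n)]
    for i, s in enumerate(terminals):
        domains[s] = [i]            # assumption: terminal s_i is pinned to part i
    best = None
    for z in product(*domains):
        cut = [(u, v) for u, v in edges if z[u] != z[v]]   # unsatisfied constraints
        if best is None or len(cut) < len(best[1]):
            best = (z, cut)
    return best

edges = [(0, 1), (1, 2), (2, 3), (0, 2)]
z, cut = multiway_cut(4, edges, terminals=[0, 3])
```

In this example the terminals are vertices 0 and 3, and the only way to separate them with a single edge is to cut the edge (2, 3).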

Max Cut [1] | $\mathrm{proj}_E$ | $O(2^\tau n)$ | $\Theta(2^\tau n)$
Instance: Graph G = (V, E).
Solution: A partition of vertices into two sets $V_1$, $V_2$ such that the number of edges between $V_1$ and $V_2$ is maximized.
CSP version: Max
Comment: The values 0, 1 model the vertex belonging to the set $V_1$ or $V_2$. If we replace maximization by minimization, the problem becomes the Edge Bipartization (aka Edge OCT) problem which is a parametric dual of Max Cut.

Vertex Cover [1] | $\mathrm{proj}_V$ | $O(2^\tau n)$ | $\Theta(2^\tau n)$
Instance: Graph G = (V, E).
Solution: A set of vertices $C \subseteq V$ of minimal size such that every edge has at least one endpoint in C.
CSP version: Min
Comment: The values 0, 1 model the vertex belonging to C or V \ C. If we replace minimization by maximization, the problem becomes the Independent Set problem which is a parametric dual of Vertex Cover.

Open problems
A natural research direction is to examine more closely the extension complexity for CSP and the specific graph problems on graphs with bounded treewidth, in particular, what are the best possible upper bounds?