Approximation via Correlation Decay when Strong Spatial Mixing Fails* Ivona Bezakovat Andreas Galanist Leslie Ann Goldbergf Heng Guos Daniel Stefankovic 23 January 2019 Abstract Approximate counting via correlation decay is the core algorithmic technique used in the sharp delineation of the computational phase transition that arises in the approxima- tion of the partition function of anti-ferromagnetic two-spin models. Previous analyses of correlation-decay algorithms implicitly depended on the occur- rence of strong spatial miring.This,roughly,means that one uses worst-case analysis of the recursive procedure that creates the sub-instances.In this paper,we develop a new analysis method that is more refined than the worst-case analysis.We take the shape of instances in the computation tree into consideration and we amortise against certain "bad"instances that are created as the recursion proceeds.This enables us to show correlation decay and to obtain an FPTAS even when strong spatial mixing fails. We apply our technique to the problem of approximately counting independent sets in hypergraphs with degree upper-bound A and with a lower bound k on the arity of hyperedges.Liu and Lin gave an FPTAS for k >2 and A 3 and A 6.The best previously-known approximation scheme for A=6 is the Markov-chain simulation based FPRAS of Bordewich,Dyer and Karpinski,which only works for k>8. Our technique also applies for larger values of k,giving an FPTAS for k >A.This bound is not substantially stronger than existing randomised results in the literature. Nevertheless,it gives the first deterministic approximation scheme in this regime.More- over,unlike existing results,it leads to an FPTAS for counting dominating sets in regular graphs with sufficiently large degree. We further demonstrate that in the hypergraph independent set model,approximating the partition function is NP-hard even within the uniqueness regime.Also,approximately *This work was done in part while the authors were visiting the Simons Institute for the Theory of Com- puting. Department of Computer Science,Rochester Institute of Technology,Rochester,NY,USA.Research sup- ported by NSF grant CCF-1319987. Department of Computer Science,University of Oxford,Wolfson Building,Parks Road,Oxford,OX1 3QD, UK.The research leading to these results has received funding from the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013)ERC grant agreement no.334828. The paper reflects only the authors'views and not the views of the ERC or the European Commission.The European Union is not liable for any use that may be made of the information contained therein. SSchool of Informatics,University of Edinburgh,Informatics Forum,Edinburgh,EH8 9AB,UK. Department of Computer Science,University of Rochester,Rochester,NY 14627.Research supported by NSF grant CCF-1563757
Approximation via Correlation Decay when Strong Spatial Mixing Fails∗ Ivona BezákovᆠAndreas Galanis‡ Leslie Ann Goldberg‡ Heng Guo§ Daniel Štefankovič¶ 23 January 2019 Abstract Approximate counting via correlation decay is the core algorithmic technique used in the sharp delineation of the computational phase transition that arises in the approximation of the partition function of anti-ferromagnetic two-spin models. Previous analyses of correlation-decay algorithms implicitly depended on the occurrence of strong spatial mixing. This, roughly, means that one uses worst-case analysis of the recursive procedure that creates the sub-instances. In this paper, we develop a new analysis method that is more refined than the worst-case analysis. We take the shape of instances in the computation tree into consideration and we amortise against certain “bad” instances that are created as the recursion proceeds. This enables us to show correlation decay and to obtain an FPTAS even when strong spatial mixing fails. We apply our technique to the problem of approximately counting independent sets in hypergraphs with degree upper-bound ∆ and with a lower bound k on the arity of hyperedges. Liu and Lin gave an FPTAS for k ≥ 2 and ∆ ≤ 5 (lack of strong spatial mixing was the obstacle preventing this algorithm from being generalised to ∆ = 6). Our technique gives a tight result for ∆ = 6, showing that there is an FPTAS for k ≥ 3 and ∆ ≤ 6. The best previously-known approximation scheme for ∆ = 6 is the Markov-chain simulation based FPRAS of Bordewich, Dyer and Karpinski, which only works for k ≥ 8. Our technique also applies for larger values of k, giving an FPTAS for k ≥ ∆. This bound is not substantially stronger than existing randomised results in the literature. Nevertheless, it gives the first deterministic approximation scheme in this regime. Moreover, unlike existing results, it leads to an FPTAS for counting dominating sets in regular graphs with sufficiently large degree. We further demonstrate that in the hypergraph independent set model, approximating the partition function is NP-hard even within the uniqueness regime. Also, approximately ∗This work was done in part while the authors were visiting the Simons Institute for the Theory of Computing. †Department of Computer Science, Rochester Institute of Technology, Rochester, NY, USA. Research supported by NSF grant CCF-1319987. ‡Department of Computer Science, University of Oxford, Wolfson Building, Parks Road, Oxford, OX1 3QD, UK. The research leading to these results has received funding from the European Research Council under the European Union’s Seventh Framework Programme (FP7/2007-2013) ERC grant agreement no. 334828. The paper reflects only the authors’ views and not the views of the ERC or the European Commission. The European Union is not liable for any use that may be made of the information contained therein. §School of Informatics, University of Edinburgh, Informatics Forum, Edinburgh, EH8 9AB, UK. ¶Department of Computer Science, University of Rochester, Rochester, NY 14627. Research supported by NSF grant CCF-1563757. 1
counting dominating sets of bounded-degree graphs(without the regularity restriction)is NP-hard. 1 Introduction We develop a new method for analysing correlation decays in spin systems.In particular,we take the shape of instances in the computation tree into consideration and we amortise against certain "bad"instances that are created as the recursion proceeds.This enables us to show correlation decay and to obtain an FPTAS even when strong spatial mixing fails.To the best of our knowledge,strong spatial mixing is a requirement for all previous correlation-decay based algorithms.To illustrate our technique,we focus on the computational complexity of approximately counting independent sets in hypergraphs,or equivalently on counting the satisfying assignments of monotone CNF formulas. The problem of counting independent sets in graphs(denoted #IS)is extensively studied. A beautiful connection has been established,showing that approximately counting indepen- dent sets in graphs of maximum degree A undergoes a computational transition which coin- cides with the uniqueness phase transition from statistical physics on the infinite A-regular tree.The computational transition can be described as follows.Weitz [22]designed an FP- TAS for counting independent sets on graphs with maximum degree at most A =5.On the other hand,Sly [19]proved that there is no FPRAS for approximately counting independent sets on graphs with maximum degree at most A =6(unless NP =RP).The same connection has been established in the more general context of approximating the partition function of the hard-core model [22,16,19,4,6,20]and in the even broader context of approximating the partition functions of generic antiferromagnetic 2-spin models [18,6,20,11](which includes, for example,the antiferromagnetic Ising model).As a consequence,the boundary for the existence of efficient approximation algorithms for these models has been mapped out. Approximate counting via correlation decay is the core technique in the algorithmic devel- opments which enabled the sharp delineation of the computational phase transition.Another standard approach for approximate counting,namely Markov chain Monte Carlo (MCMC) simulation,is also conjectured to work up to the uniqueness threshold,but the current analy- sis tools that we have do not seem to be powerful enough to show that.For example,sampling independent sets via MCMC simulation is known to have fast mixing only for graphs with degree at most 4 [14,3],rather than obtaining the true threshold of 5. In this work,we consider counting independent sets in hypergraphs with upper-bounded vertex degree,and lower-bounded hyperedge size.A hypergraph H=(V,F)consists of a vertex set V and a set F of hyperedges,each of which is a subset of V.A hypergraph is said to be k-uniform if every hyperedge contains exactly k vertices.Thus,a 2-uniform hypergraph is the same as a graph.We will consider the more general case where each hyperedge has arity at least k,rather than exactly k. An independent set in a hypergraph H is a subset of vertices that does not contain a hyperedge as a subset.We will be interested in computing ZH,which is the total number of independent sets in H(also referred to as the partition function of H).Formally,the problem of counting independent sets has two parameters-a degree upper bound A and a INote that there are non-monotonic examples of antiferromagnetic 2-spin systems where the boundary is more complicated because the uniqueness threshold fails to be monotonic in A [10.However,this can be cleared up by stating the uniqueness condition as uniqueness for all d A.See [10,11]for details
counting dominating sets of bounded-degree graphs (without the regularity restriction) is NP-hard. 1 Introduction We develop a new method for analysing correlation decays in spin systems. In particular, we take the shape of instances in the computation tree into consideration and we amortise against certain “bad” instances that are created as the recursion proceeds. This enables us to show correlation decay and to obtain an FPTAS even when strong spatial mixing fails. To the best of our knowledge, strong spatial mixing is a requirement for all previous correlation-decay based algorithms. To illustrate our technique, we focus on the computational complexity of approximately counting independent sets in hypergraphs, or equivalently on counting the satisfying assignments of monotone CNF formulas. The problem of counting independent sets in graphs (denoted #IS) is extensively studied. A beautiful connection has been established, showing that approximately counting independent sets in graphs of maximum degree ∆ undergoes a computational transition which coincides with the uniqueness phase transition from statistical physics on the infinite ∆-regular tree. The computational transition can be described as follows. Weitz [22] designed an FPTAS for counting independent sets on graphs with maximum degree at most ∆ = 5. On the other hand, Sly [19] proved that there is no FPRAS for approximately counting independent sets on graphs with maximum degree at most ∆ = 6 (unless NP = RP). The same connection has been established in the more general context of approximating the partition function of the hard-core model [22, 16, 19, 4, 6, 20] and in the even broader context of approximating the partition functions of generic antiferromagnetic 2-spin models [18, 6, 20, 11] (which includes, for example, the antiferromagnetic Ising model). As a consequence, the boundary for the existence of efficient approximation algorithms for these models has been mapped out1 . Approximate counting via correlation decay is the core technique in the algorithmic developments which enabled the sharp delineation of the computational phase transition. Another standard approach for approximate counting, namely Markov chain Monte Carlo (MCMC) simulation, is also conjectured to work up to the uniqueness threshold, but the current analysis tools that we have do not seem to be powerful enough to show that. For example, sampling independent sets via MCMC simulation is known to have fast mixing only for graphs with degree at most 4 [14, 3], rather than obtaining the true threshold of 5. In this work, we consider counting independent sets in hypergraphs with upper-bounded vertex degree, and lower-bounded hyperedge size. A hypergraph H = (V, F) consists of a vertex set V and a set F of hyperedges, each of which is a subset of V . A hypergraph is said to be k-uniform if every hyperedge contains exactly k vertices. Thus, a 2-uniform hypergraph is the same as a graph. We will consider the more general case where each hyperedge has arity at least k, rather than exactly k. An independent set in a hypergraph H is a subset of vertices that does not contain a hyperedge as a subset. We will be interested in computing ZH, which is the total number of independent sets in H (also referred to as the partition function of H). Formally, the problem of counting independent sets has two parameters — a degree upper bound ∆ and a 1Note that there are non-monotonic examples of antiferromagnetic 2-spin systems where the boundary is more complicated because the uniqueness threshold fails to be monotonic in ∆ [10]. However, this can be cleared up by stating the uniqueness condition as uniqueness for all d ≤ ∆. See [10, 11] for details. 2
lower bound k on the arity of hyperedges.The problem is defined as follows.2 Name #HyperlndSet(k,A) Instance A hypergraph H with maximum degree at most A where each hyperedge has car- dinality (arity)at least k. Output The number ZH of independent sets in H. Previously,#HyperlndSet(,A)has been studied using the MCMC technique by Border- wich,Dyer,and Karpinski [1,2](see also [3]).They give an FPRAS for all k >A+2 >5 and for k >2 and A =3.Despite equipping path coupling with optimized metrics obtained using linear programming,these bounds are not tight for small k.Liu and Lu [12]showed that there exists an FPTAS for all k>2 and A 5 using the correlation decay technique. Thus,the situation seems to be similar to the graph case-given the analysis tools that we have,correlation-decay brings us closer to the truth than the best-tuned analysis of MCMC simulation algorithms.On the other hand,the technique of Liu and Lu [12]does not extend beyond A =5.To explain the reason why it does not,we need to briefly describe the correlation-decay-based algorithm framework introduced by Weitz [22].The main idea is to build a recursive procedure for computing the marginal probability that any given vertex is in the independent set.The recursion works by examining sub-instances with "boundary conditions"which require certain vertices to be in,or out,of the independent set.The recur- sion structure is called a "computation tree".Nodes of the tree correspond to intermediate instances,and boundary conditions are different in different branches.The computation tree allows one to compute the marginal probability exactly but the time needed to do so may be exponentially large since,in general,the tree is exponentially large.Typically,an approx- imate marginal probability is obtained by truncating the computation tree to logarithmic depth so that the(approximation)algorithm runs in polynomial time.If the correlation be- tween boundary conditions at the leaves of the(truncated)computation tree and the marginal probability at the root decays exponentially with respect to the depth,then the error incurred from the truncation is small and the algorithm succeeds in obtaining a close approximation. All previous instantiations under this framework require a property called strong spatial miring (SSM)3,which roughly states that,conditioned on any boundary condition on inter- mediate nodes,the correlation decays.In other words,SSM guards against the worst-case boundary conditions that might be created by the recursive procedure. Let the (A-1)-ary k-uniform hypertree T&,A be the recursively-defined hypergraph in which each vertex has A-1 "descending"hyperedges,each containing k-1 new vertices. Observation1.Letk≥2.For△≥6,strong spatial miring does not hold on Tk.△. Observation 1 follows from the fact that the infinite (A-1)-ary tree T2.A can be embedded in the hypertree TkA,and from well-known facts about the phase transition on T2.A. Observation 1 prevents the generalisation of Liu and Lu's algorithm [12]so that it applies for A 6.even with an edge-size lower bound k.The problem is that the construction of the computation tree involves constructing intermediate instances in which the arity of a hyperedge can be as small as 2.So,even if we start with a k-uniform hypergraph,the 2Equivalently,one may think of this problem as the problem of counting satisfying assignments of a mono- tone CNF formulas,where vertices are variables and hyperedges are clauses.Being out of the independent set (as a vertex)corresponds to being true (as a variable). 3See Section 2.1 for a definition. 3
lower bound k on the arity of hyperedges. The problem is defined as follows.2 Name #HyperIndSet(k, ∆). Instance A hypergraph H with maximum degree at most ∆ where each hyperedge has cardinality (arity) at least k. Output The number ZH of independent sets in H. Previously, #HyperIndSet(k, ∆) has been studied using the MCMC technique by Borderwich, Dyer, and Karpinski [1, 2] (see also [3]). They give an FPRAS for all k ≥ ∆ + 2 ≥ 5 and for k ≥ 2 and ∆ = 3. Despite equipping path coupling with optimized metrics obtained using linear programming, these bounds are not tight for small k. Liu and Lu [12] showed that there exists an FPTAS for all k ≥ 2 and ∆ ≤ 5 using the correlation decay technique. Thus, the situation seems to be similar to the graph case — given the analysis tools that we have, correlation-decay brings us closer to the truth than the best-tuned analysis of MCMC simulation algorithms. On the other hand, the technique of Liu and Lu [12] does not extend beyond ∆ = 5. To explain the reason why it does not, we need to briefly describe the correlation-decay-based algorithm framework introduced by Weitz [22]. The main idea is to build a recursive procedure for computing the marginal probability that any given vertex is in the independent set. The recursion works by examining sub-instances with “boundary conditions” which require certain vertices to be in, or out, of the independent set. The recursion structure is called a “computation tree”. Nodes of the tree correspond to intermediate instances, and boundary conditions are different in different branches. The computation tree allows one to compute the marginal probability exactly but the time needed to do so may be exponentially large since, in general, the tree is exponentially large. Typically, an approximate marginal probability is obtained by truncating the computation tree to logarithmic depth so that the (approximation) algorithm runs in polynomial time. If the correlation between boundary conditions at the leaves of the (truncated) computation tree and the marginal probability at the root decays exponentially with respect to the depth, then the error incurred from the truncation is small and the algorithm succeeds in obtaining a close approximation. All previous instantiations under this framework require a property called strong spatial mixing (SSM)3 , which roughly states that, conditioned on any boundary condition on intermediate nodes, the correlation decays. In other words, SSM guards against the worst-case boundary conditions that might be created by the recursive procedure. Let the (∆ − 1)-ary k-uniform hypertree Tk,∆ be the recursively-defined hypergraph in which each vertex has ∆ − 1 “descending” hyperedges, each containing k − 1 new vertices. Observation 1. Let k ≥ 2. For ∆ ≥ 6, strong spatial mixing does not hold on Tk,∆. Observation 1 follows from the fact that the infinite (∆−1)-ary tree T2,∆ can be embedded in the hypertree Tk,∆, and from well-known facts about the phase transition on T2,∆. Observation 1 prevents the generalisation of Liu and Lu’s algorithm [12] so that it applies for ∆ ≥ 6, even with an edge-size lower bound k. The problem is that the construction of the computation tree involves constructing intermediate instances in which the arity of a hyperedge can be as small as 2. So, even if we start with a k-uniform hypergraph, the 2Equivalently, one may think of this problem as the problem of counting satisfying assignments of a monotone CNF formulas, where vertices are variables and hyperedges are clauses. Being out of the independent set (as a vertex) corresponds to being true (as a variable). 3See Section 2.1 for a definition. 3
computation tree will contain instances with small hyperedges.Without strong spatial mixing, these small hyperedges cause problems in the analysis.Lu,Yang and Zhang [13]discuss this problem and say "How to avoid this effect is a major open question whose solution may have applications in many other problems."This question motivates our work. To overcome this difficulty,we introduce a new amortisation technique in the analysis. Since lack of correlation decay is caused primarily by the presence of small-arity hyperedges within the intermediate instances,we keep track of such hyperedges.Thus,we track not only the correlation,but also combinatorial properties of the intermediate instances in the computation tree.Using this idea,we obtain the following result. Theorem 2.There is an FPTAS for #HyperlndSet(3,6). Note that #HyperlndSet(2,6)is NP-hard to approximate due to [19],so our result is tight for A =6.This also shows that A=6 is the first case where the complexity of approximately counting independent sets differs on hypergraphs and graphs,as for A A allows us to connect the problem of counting independent sets in hypergraphs with the problem of counting dominating sets in A-regular graphs and show that the latter admits an FPTAS when A is sufficiently large.Recall that a dominating set in a graph G is a subset S of the vertices such that every vertex not in S is adjacent to at least one vertex in S.We then consider the following problem. Name#RegDomSet(△) Instance A A-regular graph G. Output The number of dominating sets in G. Our theorems have the following corollary. Corollary4.For all po.sitive integers△satisfying either△≤5or△≥l99,there is an FPTAS for the problem #RegDomSet(A). We remark that Corollary 4 cannot be obtained using the result of [2],i.e.,the seemingly small difference between k≥△andk≥△+2 does matter in deriving Corollary4fom Theorems 2 and 3.We should also emphasise that it is necessary to consider A-regular graphs as inputs to the dominating set problem,since otherwise for graphs of maximum degree A (not necessarily regular),we show that the problem is NP-hard to approximate for A >18 (Theorem 52).It is relevant to remark here that we believe that Corollary 4 should hold for all△;to do this,it would be sufficient to remove the restriction△≥200 from Theorem 3.Note that,while we do not know how to remove the restriction,it would at least be possible to improve"200"to some smaller number.However,we have chosen to stick
computation tree will contain instances with small hyperedges. Without strong spatial mixing, these small hyperedges cause problems in the analysis. Lu, Yang and Zhang [13] discuss this problem and say “How to avoid this effect is a major open question whose solution may have applications in many other problems.” This question motivates our work. To overcome this difficulty, we introduce a new amortisation technique in the analysis. Since lack of correlation decay is caused primarily by the presence of small-arity hyperedges within the intermediate instances, we keep track of such hyperedges. Thus, we track not only the correlation, but also combinatorial properties of the intermediate instances in the computation tree. Using this idea, we obtain the following result. Theorem 2. There is an FPTAS for #HyperIndSet(3, 6). Note that #HyperIndSet(2, 6) is NP-hard to approximate due to [19], so our result is tight for ∆ = 6. This also shows that ∆ = 6 is the first case where the complexity of approximately counting independent sets differs on hypergraphs and graphs, as for ∆ ≤ 5 both admit an FPTAS [12]. Moreover, Theorem 2 is stronger than the best MCMC algorithm [2] when ∆ = 6 as [2] only works for k ≥ 8. We also apply our technique to large k. Theorem 3. Let k and ∆ be two integers such that k ≥ ∆ and ∆ ≥ 200. Then there is an FPTAS for the problem #HyperIndSet(k, ∆). In the large k case, our result is not substantially stronger than that obtained by analysis of the MCMC algorithm [2] (k ≥ ∆ rather than k ≥ ∆ + 2) but it is incomparable since our algorithm is deterministic rather than randomised. Perhaps more importantly, the bound k ≥ ∆ allows us to connect the problem of counting independent sets in hypergraphs with the problem of counting dominating sets in ∆-regular graphs and show that the latter admits an FPTAS when ∆ is sufficiently large. Recall that a dominating set in a graph G is a subset S of the vertices such that every vertex not in S is adjacent to at least one vertex in S. We then consider the following problem. Name #RegDomSet(∆). Instance A ∆-regular graph G. Output The number of dominating sets in G. Our theorems have the following corollary. Corollary 4. For all positive integers ∆ satisfying either ∆ ≤ 5 or ∆ ≥ 199, there is an FPTAS for the problem #RegDomSet(∆). We remark that Corollary 4 cannot be obtained using the result of [2], i.e., the seemingly small difference between k ≥ ∆ and k ≥ ∆ + 2 does matter in deriving Corollary 4 from Theorems 2 and 3. We should also emphasise that it is necessary to consider ∆-regular graphs as inputs to the dominating set problem, since otherwise for graphs of maximum degree ∆ (not necessarily regular), we show that the problem is NP-hard to approximate for ∆ ≥ 18 (Theorem 52). It is relevant to remark here that we believe that Corollary 4 should hold for all ∆; to do this, it would be sufficient to remove the restriction ∆ ≥ 200 from Theorem 3. Note that, while we do not know how to remove the restriction, it would at least be possible to improve “200” to some smaller number. However, we have chosen to stick 4
with "200"in order to keep the proof accessible.We explain next the difficulties in obtaining Theorems 2 and 3. The main technical difficulty in correlation-decay analysis is bounding a function that we call the "decay rate".This boils down to solving an optimisation problem with (k- 1)(A-1)variables.In previous work(e.g.[17]),this optimisation has been solved using a so- called "symmetrization"argument,which reduces the problem to a univariate optimisation via convexity.However,the many variables represent different branches in the computation tree. Since our analysis takes the shape of intermediate instances in the tree into consideration.the symmetrization argument does not work for us,and different branches take different values at the maximum.This problem is compounded by the fact that the shape of the sub-tree consisting of "bad"intermediate instances is heavily lopsided,and the assignment of variables achieving the maximum is far from uniform.Given these problems,there does not seem to be a clean solution to the optimisation in our analysis.Instead of optimizing,we give an upper bound on the maximum decay rate.In Theorem 2,as k and A are small,the number of variables is manageable,and our bounds are much sharper than those in Theorem 3.On the other hand,because of this,the proof of Theorem 3 is much more accessible,and we will use Theorem 3 as a running example to demonstrate our technique. We also provide some insight on the hardness side.Recall that for graphs it is NP-hard to approximate #IS beyond the uniqueness threshold (A =6)[19].4 We prove that it is NP-hard to approximate #HyperlndSet(6,22)(Corollary 50).In contrast,we show that uniqueness holds on the 6-uniform A-regular hypertree iff A 28 (Corollary 59).Thus,efficient ap- proximation schemes cease to exist well below the uniqueness threshold on the hypertree.In fact,we show that this discrepancy grows exponentially in k:for large k,it is NP-hard to approximate #HyperlndSet(,A)when A>5.2/2(Theorem 49 and Corollary 51),despite the fact that uniqueness holds on the hypertree for all A 2) are substantially different from those in the graph case(k=2). As mentioned earlier,there are models where efficient(randomised)approximation schemes exist (based on MCMC simulation)even though SSM does not hold.In fact,this can happen even when uniqueness does not hold.A striking example is the ferromagnetic Ising model (with external field).As [18]shows,there are parameter regimes where uniqueness holds but strong spatial mixing fails.It is easy to modify the parameters so that even uniqueness fails.Nevertheless,Jerrum and Sinclair [9]gave an MCMC-based FPRAS that applies for all parameters and for general graphs (with no degree bounds).It is still an open question to give a correlation decay based FPTAS for the ferromagnetic Ising model. 1.1 Note added in revision-recent work After our work,Hermon,Sly,and Zhang [8]have shown that the Glauber dynamics for sampling independent sets on n-vertex k-uniform hypergraphs has mixing time O(n log n), if A <c2k/2 for some constant c.Combined with our hardness result,Corollary 51,this establishes a sharp computational complexity phase transition,up to constants.Moreover, Moitra [15]has given a deterministic algorithm for approximately counting and sampling the satisfying assignments of k-CNF formulas,provided that the variable degree A/60 4For graphs the uniqueness and SSM thresholds coincide,but for hypergraphs they differ. 5
with “200” in order to keep the proof accessible. We explain next the difficulties in obtaining Theorems 2 and 3. The main technical difficulty in correlation-decay analysis is bounding a function that we call the “decay rate”. This boils down to solving an optimisation problem with (k − 1)(∆ −1) variables. In previous work (e.g. [17]), this optimisation has been solved using a socalled “symmetrization” argument, which reduces the problem to a univariate optimisation via convexity. However, the many variables represent different branches in the computation tree. Since our analysis takes the shape of intermediate instances in the tree into consideration, the symmetrization argument does not work for us, and different branches take different values at the maximum. This problem is compounded by the fact that the shape of the sub-tree consisting of “bad” intermediate instances is heavily lopsided, and the assignment of variables achieving the maximum is far from uniform. Given these problems, there does not seem to be a clean solution to the optimisation in our analysis. Instead of optimizing, we give an upper bound on the maximum decay rate. In Theorem 2, as k and ∆ are small, the number of variables is manageable, and our bounds are much sharper than those in Theorem 3. On the other hand, because of this, the proof of Theorem 3 is much more accessible, and we will use Theorem 3 as a running example to demonstrate our technique. We also provide some insight on the hardness side. Recall that for graphs it is NP-hard to approximate #IS beyond the uniqueness threshold (∆ = 6) [19].4 We prove that it is NP-hard to approximate #HyperIndSet(6, 22) (Corollary 50). In contrast, we show that uniqueness holds on the 6-uniform ∆-regular hypertree iff ∆ ≤ 28 (Corollary 59). Thus, efficient approximation schemes cease to exist well below the uniqueness threshold on the hypertree. In fact, we show that this discrepancy grows exponentially in k: for large k, it is NP-hard to approximate #HyperIndSet(k, ∆) when ∆ ≥ 5 · 2 k/2 (Theorem 49 and Corollary 51), despite the fact that uniqueness holds on the hypertree for all ∆ ≤ 2 k/(2k) (Lemma 60). Theorem 49 follows from a rather standard reduction to the hard-core model on graphs. Nevertheless, it demonstrates that the computational-threshold phenomena in the hypergraph case (k > 2) are substantially different from those in the graph case (k = 2). As mentioned earlier, there are models where efficient (randomised) approximation schemes exist (based on MCMC simulation) even though SSM does not hold. In fact, this can happen even when uniqueness does not hold. A striking example is the ferromagnetic Ising model (with external field). As [18] shows, there are parameter regimes where uniqueness holds but strong spatial mixing fails. It is easy to modify the parameters so that even uniqueness fails. Nevertheless, Jerrum and Sinclair [9] gave an MCMC-based FPRAS that applies for all parameters and for general graphs (with no degree bounds). It is still an open question to give a correlation decay based FPTAS for the ferromagnetic Ising model. 1.1 Note added in revision — recent work After our work, Hermon, Sly, and Zhang [8] have shown that the Glauber dynamics for sampling independent sets on n-vertex k-uniform hypergraphs has mixing time O(n log n), if ∆ ≤ c2 k/2 for some constant c. Combined with our hardness result, Corollary 51, this establishes a sharp computational complexity phase transition, up to constants. Moreover, Moitra [15] has given a deterministic algorithm for approximately counting and sampling the satisfying assignments of k-CNF formulas, provided that the variable degree ∆ ≲ 2 k/60 . 4For graphs the uniqueness and SSM thresholds coincide, but for hypergraphs they differ. 5
Moitra's work relies on connections to the Lovasz local lemma.His work improves our A {0,1}which encode a valid independent set of H.For an assignment o:V{0,1}and a subset A C V,we denote by o the restriction of a to the subset A. For a hypergraph H=(V,F)and a subset A C V,we denote by HA the subgraph of H induced by A,i.e.,HA:=(A,Ueer(enA)).Also,for a vertex vV and A CV,we denote 6
Moitra’s work relies on connections to the Lovász local lemma. His work improves our ∆ ≤ k bound in Theorem 3 and removes the monotonicity assumption. Despite this recent progress, Theorem 2 remains the best algorithmic result for k = 3. 1.2 Outline of Paper In Section 2, we first give some preliminaries. We give a formal definition of strong spatial mixing (Section 2.1) and a reformulation of #HyperIndSet(k, ∆) as the problem of counting satisfying assignments in monotone CNF formulas (Section 2.2). This will allow us to use the computation tree used by Liu and Lu [12]. A formal description of the computation tree of [12] is given in Section 2.3. In Section 3, we give an overview of our proof approach, i.e., the main idea behind our new amortisation technique. Section 3.2 concludes the proof of Theorems 2 and 3, using two (not yet proved) technical lemmas (Lemma 21 and 22) which solve a complicated multivariate optimisation problem, and represent the bulk of the technical work of the paper. Section 3.3 gives the proof of Corollary 4. In Section 4, we give the proof of Lemma 21, which applies to the large-∆ setting of Theorem 3 and is by far the technically simpler of the two lemmas. Section 5 contains the proof of the technically more challenging Lemma 22 which applies to the k = 3, ∆ = 6 setting of Theorem 2. Section 6 gives the formal statements and proofs of the hardness results stated in the Introduction; Section 6.1 has the hardness results for independent sets in hypergraphs and Section 6.2 has the hardness results for dominating sets in graphs. Also, Section 7 studies the uniqueness threshold on the k-uniform ∆-regular hypertree (and gives the proofs of the uniqueness statements made in the Introduction). Finally, Section 8 gives the proof for several technical inequalities used in Section 5. Throughout the paper we use computer algebra to prove multivariate polynomial inequalities over the field of real numbers (the coefficients of the polynomials are rational). More specifically, we use the Resolve command in Mathematica. The underlying quantifier elimination algorithm (described in [21]) provides a rigorous decision procedure that determines feasibility of a collection of polynomial inequalities. 2 Preliminaries 2.1 Strong Spatial Mixing For the purposes of this section, it will be convenient to view the independent set model as a 2-spin model. Namely, if H = (V, F) is a hypergraph, each independent set I can be viewed as a {0, 1}-assignment σ to the vertices in V , where a vertex v is assigned the spin 1 under σ if v ∈ I and 0 otherwise. We denote by ΩH the set of all independent sets in H. The Gibbs distribution µH(·) is the uniform distribution over ΩH. The Gibbs distribution of H can clearly be viewed as the uniform distribution over those assignments σ : V → {0, 1} which encode a valid independent set of H. For an assignment σ : V → {0, 1} and a subset Λ ⊂ V , we denote by σΛ the restriction of σ to the subset Λ. For a hypergraph H = (V, F) and a subset Λ ⊂ V , we denote by HΛ the subgraph of H induced by Λ, i.e., HΛ := (Λ, ∪ e∈F (e ∩ Λ)). Also, for a vertex v ∈ V and Λ ⊂ V , we denote 6
by dist(v,A)the length of the shortest path'between v and a vertex of A. Definition 5.Let 6:Z0,1].The independent set model exhibits strong spatial mixing on a hypergraph H=(V,F)with decay rate 6()iff for every vEV,for every A CV;for any two configurations n,n:A->10,1}encoding independent sets of HA,it holds that uH(a()=1|aA=)-H(a(u)=1|oA=7)≤6(dist(,A), where A'denotes the set of vertices in A such that n and n'differ 2.2 Reformulation in terms of Monotone CNF formulas The problem of counting the independent sets of a hypergraph has an equivalent formulation in terms of monotone CNF formulas.In order to describe the equivalent formulation,we first describe the problem of counting satisfying assignments of monotone CNF formulas. A monotone CNF formula C consists of a set of variables V and a set of clauses {ci,c2,...}. Each clause ci is associated with some subset S;of V and is the disjunction of all variables in Si.The arity of a clause ci,denoted cil,is defined to be Sil.For a variable x V,its degree d(C)is the number of clauses where z appears.The marimum degree of C is given by maxrev d(C). Definition 6.Let Ck.be the set of all monotone CNF formulas which have marimum degree at most A and whose clauses have arity at least k. Note that a formula in Ck.A may have some clauses with arbitrarily large arities.A satisfying assignment of the formula is an assignment of truth values to the variables which nakes the formula evaluate to“true”. Suppose that H =(V,F)is a hypergraph with maximum degree at most A where each hyperedge has arity at least k.Let C be the corresponding formula in Ck.with variable set V. The correspondence is that each hyperedge Si of H is associated with exactly one clause ci of C.Independent sets of H are in one-to-one correspondence with satisfying assignments of C-a variable is assigned value "true"in an assignment if and only if it is out of the corresponding independent set. Going the other direction,any monotone CNF formula can be viewed as a hypergraph. In the technical sections of this paper,we use the monotone CNF formulation. In this article,when we consider a monotone CNF formula C we will typically use n to denote VI.Variables in V will be denoted by x1,2,...When z and C are clear from context, we will sometimes use d to denote d(C).When C is clear from context,we will sometimes use A to denote maxrev d(C). 2.3 The Computation Tree In this section,we set up relevant notation and give an exposition of the computation tree of Liu and Lu [12]which will also be used in our proof(though our analysis will be different). The computation tree of Liu and Lu is given in terms of the monotone CNF version of the problem.Below we give the relevant definitions and notation;our notation aligns as much as possible with that of [12]. 5A path in a hypergraph with hyperedge set F is a sequence of edges eo,...,eF such that eine for all i=0,...,-1
by dist(v,Λ) the length of the shortest path5 between v and a vertex of Λ. Definition 5. Let δ : Z+ → [0, 1]. The independent set model exhibits strong spatial mixing on a hypergraph H = (V, F) with decay rate δ(·) iff for every v ∈ V , for every Λ ⊂ V , for any two configurations η, η′ : Λ → {0, 1} encoding independent sets of HΛ, it holds that µH(σ(v) = 1 | σΛ = η) − µH(σ(v) = 1 | σΛ = η ′ ) ≤ δ ( dist(v,Λ ′ ) ) , where Λ ′ denotes the set of vertices in Λ such that η and η ′ differ. 2.2 Reformulation in terms of Monotone CNF formulas The problem of counting the independent sets of a hypergraph has an equivalent formulation in terms of monotone CNF formulas. In order to describe the equivalent formulation, we first describe the problem of counting satisfying assignments of monotone CNF formulas. A monotone CNF formula C consists of a set of variables V and a set of clauses {c1, c2, . . .}. Each clause ci is associated with some subset Si of V and is the disjunction of all variables in Si . The arity of a clause ci , denoted |ci |, is defined to be |Si |. For a variable x ∈ V , its degree dx(C) is the number of clauses where x appears. The maximum degree of C is given by maxx∈V dx(C). Definition 6. Let Ck,∆ be the set of all monotone CNF formulas which have maximum degree at most ∆ and whose clauses have arity at least k. Note that a formula in Ck,∆ may have some clauses with arbitrarily large arities. A satisfying assignment of the formula is an assignment of truth values to the variables which makes the formula evaluate to “true”. Suppose that H = (V, F) is a hypergraph with maximum degree at most ∆ where each hyperedge has arity at least k. Let C be the corresponding formula in Ck,∆ with variable set V . The correspondence is that each hyperedge Si of H is associated with exactly one clause ci of C. Independent sets of H are in one-to-one correspondence with satisfying assignments of C — a variable is assigned value “true” in an assignment if and only if it is out of the corresponding independent set. Going the other direction, any monotone CNF formula can be viewed as a hypergraph. In the technical sections of this paper, we use the monotone CNF formulation. In this article, when we consider a monotone CNF formula C we will typically use n to denote |V |. Variables in V will be denoted by x1, x2, . . . When x and C are clear from context, we will sometimes use d to denote dx(C). When C is clear from context, we will sometimes use ∆ to denote maxx∈V dx(C). 2.3 The Computation Tree In this section, we set up relevant notation and give an exposition of the computation tree of Liu and Lu [12] which will also be used in our proof (though our analysis will be different). The computation tree of Liu and Lu is given in terms of the monotone CNF version of the problem. Below we give the relevant definitions and notation; our notation aligns as much as possible with that of [12]. 5A path in a hypergraph with hyperedge set F is a sequence of edges e0, . . . , eℓ ∈ F such that ei ∩ ei+1 ̸= ∅ for all i = 0, . . . , ℓ − 1. 7
Our goal is to approximately count the number of satisfying assignments of a formula C∈Ck.△,which we denote by Z(C).Since C is monotone,.an assignment o:V→{0,l}is satisfying if,for every clause in C,there is at least one variable x Ec with a(r)=1.Note that Z(C)>0 since the all-1 assignment satisfies every monotone CNF formula.For convenience, we will use the simplified notation "x=1"to denote (the set of)satisfying assignments of C in which z is set to 1,and we similarly use "x =0".We associate the formula C with a probability distribution in which each satisfying assignment has probability mass 1/Z(C). We will denote probabilities with respect to this distribution by Prc(). Let be avariable in V.Define R(C.),this is well-defined since Prc( 1)>0 by the monotonicity of C.In fact,the monotonicity of C also implies that 00 and computes a quantity R(C,x)satisfying R(C,)-R(C,)0 and C be a monotone CNF formula C with maximum degree A whose clauses have arity at least k.Let 1,...,n be the variables in C.Let Ci be the formula obtained from C by setting 1=...=xi=1 and removing all the clauses that are satisfied (i.e.,all clauses that contain a variable from x1,...,xi).We have zZC=Pcm1==n=)=ⅡPrc(c+1=1|1=…=4=1) (1) -PIG(I=1)-1+R() 0 Note that every Ci is a monotone CNF formula with maximum degree A whose clauses have arity at least k.By the assumption in the lemma,we can compute (in poly(n,1/e)time) quantities R(Ci,+)such that R(C,xi+1)-R(C,x+1)≤e/(100m)for all i=0,,n-1. (2) Let Π(1+(C,x+1) (3) =( It is not hard to conclude from (1),(2)and (3)that (1-)Z(C)<Z(C)<(1+)Z(C).This completes the proof. ▣ Liu and Lu [12]established that a computation tree approach gives a recursive procedure for eractly calculating R(C,x)for any monotone CNF formula C and any variable x E C. 8
Our goal is to approximately count the number of satisfying assignments of a formula C ∈ Ck,∆, which we denote by Z(C). Since C is monotone, an assignment σ : V → {0, 1} is satisfying if, for every clause in C, there is at least one variable x ∈ c with σ(x) = 1. Note that Z(C) > 0 since the all-1 assignment satisfies every monotone CNF formula. For convenience, we will use the simplified notation “x = 1” to denote (the set of) satisfying assignments of C in which x is set to 1, and we similarly use “x = 0”. We associate the formula C with a probability distribution in which each satisfying assignment has probability mass 1/Z(C). We will denote probabilities with respect to this distribution by PrC(·). Let x be a variable in V . Define R(C, x) := PrC (x=0) PrC (x=1) , this is well-defined since PrC(x = 1) > 0 by the monotonicity of C. In fact, the monotonicity of C also implies that 0 ≤ R(C, x) ≤ 1, where the upper bound follows from the fact that, for every satisfying assignment with x = 0, flipping the assignment of x to 1 does not affect satisfiability. Our interest in the quantity R(C, x) stems from the following simple lemma from [12]. Lemma 7 ([12]). Let k and ∆ be positive integers. Suppose that there is a polynomial-time algorithm (in n and 1/ε) that takes an n-variable formula C ∈ Ck,∆, a variable x of C, and an ε > 0 and computes a quantity Rb(C, x) satisfying |Rb(C, x) − R(C, x)| ≤ ε. Then, there is an FPTAS which approximates Z(C) for every C ∈ Ck,∆. Proof. The proof is actually identical to the argument in [12, Appendix A]. We include the proof for completeness, and also because an examination of the proof is necessary to check that the FPTAS for approximating Z(C) invokes the algorithm that computes Rb(C, x) only on formulas C whose clauses have arity at least k (and maximum degree ∆). Let ε > 0 and C be a monotone CNF formula C with maximum degree ∆ whose clauses have arity at least k. Let x1, . . . , xn be the variables in C. Let Ci be the formula obtained from C by setting x1 = · · · = xi = 1 and removing all the clauses that are satisfied (i.e., all clauses that contain a variable from x1, . . . , xi). We have 1 Z(C) = PrC(x1 = . . . = xn = 1) = n∏−1 i=0 PrC(xi+1 = 1 | x1 = · · · = xi = 1) = n∏−1 i=0 PrCi (xi+1 = 1) = n∏−1 i=0 1 1 + R(Ci , xi+1) . (1) Note that every Ci is a monotone CNF formula with maximum degree ∆ whose clauses have arity at least k. By the assumption in the lemma, we can compute (in poly(n, 1/ε) time) quantities Rb(Ci , xi+1) such that Rb(Ci , xi+1) − R(Ci , xi+1) ≤ ε/(100n) for all i = 0, . . . , n − 1. (2) Let Zb(C) = n∏−1 i=0 ( 1 + Rb(Ci , xi+1) ) . (3) It is not hard to conclude from (1), (2) and (3) that (1−ε)Z(C) ≤ Zb(C) ≤ (1 +ε)Z(C). This completes the proof. Liu and Lu [12] established that a computation tree approach gives a recursive procedure for exactly calculating R(C, x) for any monotone CNF formula C and any variable x ∈ C. 8
We next give the details of this recursive procedure (see [12,Lemma 5]).First,we introduce the following definitions. Definition 8.Let C be a monotone CNF formula and let x be a variable in C.We call the variable r forced (in C)if x appears in a clause of arity 1 in C (note that in every satisfying assignment of C it must be the case that r=1 and hence R(C,x)=0).We call the variable x free if x does not appear in any clause of C (note that R(C,)=1 in this case). Definition 9.Let C be a monotone CNF formula and let c be a clause in C.We call the clause c redundant (in C)if there is a clause c in C such that c is a (strict)superset of c (note that removing c from C does not affect the set of satisfying assignments of C). We next give the details of the computation tree.The nodes in the computation tree will be pairs (C,z)such that C is a monotone CNF formula and x is a variable which is not forced in C. (4) Let (C,x)satisfy (4).We first perform a pre-processing step on C which involves (i)initially removing all of the redundant clauses,(ii)then,removing all clauses of arity 1.Note that part (ii)of the preprocessing step removes all forced variables that were present in C;at the time of the removal,forced variables appear only in clauses of arity 1 since part (i)of the preprocessing step has already removed all redundant clauses in C (and hence all clauses of arity greater than 1 that contain forced variables).Denote the formula after the completion of the preprocessing step by C.Note that every clause in C is also a clause in the initial formula C.It follows that x is not forced in C.Further,since removing redundant clauses does not change the set of satisfying assignments of C and x is not forced in C,we have that R(C,x)=R(C,E). If z is free in C(the formula after the pre-processing step),then the start node (C,z)is (declared)a leaf of the computation tree (note that in this case R(C,x)=1).In the sequel, we assume that z is not free in C.Denote by {cilield the clauses where x occurs in C and let wi=cil-1 (note that d >1).We will use w to denote the vector (w1,...,wd).The variables in clause ci other than will be denoted byi.For the pair (C,),we next construct pairs(Cii,ij)for i[d and j [wi],where Cij is an appropriate subformula obtained from C,roughly,by hard-coding (some of)the occurrences of the variables in C to either 1 or 0(this will be explained below and will henceforth be referred to as pinning)6. Precisely,for i [d,let Ci be the formula obtained from C by removing clauses c1,...,ci-1 (note that this has the same effect as pinning the occurrences of r in these clauses to 1)and pinning the occurrences of x in ci+1,...,ca to 0(this corresponds to removing x from these clauses,and thus reducing their arities).For j [wil,the formula Cii is obtained from Ci by further removing clause ci and pinning all the occurrences of xi1,...,xij-1 to 0. Before proceeding,let us argue that the pairs (Ci.j,xij)satisfy (4)for all ie[d and j[wi].For such i,j,we first prove that Cii is a (satisfiable)monotone CNF formula. That is,we prove that the various pinnings in the construction of Cii from C do not pin all variables of some clause of C to 0.For the sake of contradiction,assume otherwise.Observe that Ci;is obtained from C by either removing some clauses or by pinning some occurrences of the variables to 0.Clearly,removal of clauses does not affect satisfiability,so we may focus on the effect of pinning.For i[d and j [wil,the only variables whose (some of the) 6Note that our notation for i,j is different from the one in [12];there,the roles of i,j are interchanged. 9
We next give the details of this recursive procedure (see [12, Lemma 5]). First, we introduce the following definitions. Definition 8. Let C be a monotone CNF formula and let x be a variable in C. We call the variable x forced (in C) if x appears in a clause of arity 1 in C (note that in every satisfying assignment of C it must be the case that x = 1 and hence R(C, x) = 0). We call the variable x free if x does not appear in any clause of C (note that R(C, x) = 1 in this case). Definition 9. Let C be a monotone CNF formula and let c be a clause in C. We call the clause c redundant (in C) if there is a clause c ′ in C such that c is a (strict) superset of c ′ (note that removing c from C does not affect the set of satisfying assignments of C). We next give the details of the computation tree. The nodes in the computation tree will be pairs (C, x) such that C is a monotone CNF formula and x is a variable which is not forced in C. (4) Let (C, x) satisfy (4). We first perform a pre-processing step on C which involves (i) initially removing all of the redundant clauses, (ii) then, removing all clauses of arity 1. Note that part (ii) of the preprocessing step removes all forced variables that were present in C; at the time of the removal, forced variables appear only in clauses of arity 1 since part (i) of the preprocessing step has already removed all redundant clauses in C (and hence all clauses of arity greater than 1 that contain forced variables). Denote the formula after the completion of the preprocessing step by Ce. Note that every clause in Ce is also a clause in the initial formula C. It follows that x is not forced in Ce. Further, since removing redundant clauses does not change the set of satisfying assignments of C and x is not forced in C, we have that R(C, x e ) = R(C, x). If x is free in Ce (the formula after the pre-processing step), then the start node (C, x) is (declared) a leaf of the computation tree (note that in this case R(C, x) = 1). In the sequel, we assume that x is not free in Ce. Denote by {ci}i∈[d] the clauses where x occurs in Ce and let wi = |ci | − 1 (note that d ≥ 1). We will use w to denote the vector (w1, . . . , wd). The variables in clause ci other than x will be denoted by xi,1, . . . , xi,wi . For the pair (C, x), we next construct pairs (Ci,j , xi,j ) for i ∈ [d] and j ∈ [wi ], where Ci,j is an appropriate subformula obtained from Ce, roughly, by hard-coding (some of) the occurrences of the variables in Ce to either 1 or 0 (this will be explained below and will henceforth be referred to as pinning)6 . Precisely, for i ∈ [d], let Ci be the formula obtained from Ce by removing clauses c1, . . . , ci−1 (note that this has the same effect as pinning the occurrences of x in these clauses to 1) and pinning the occurrences of x in ci+1, . . . , cd to 0 (this corresponds to removing x from these clauses, and thus reducing their arities). For j ∈ [wi ], the formula Ci,j is obtained from Ci by further removing clause ci and pinning all the occurrences of xi,1, . . . , xi,j−1 to 0. Before proceeding, let us argue that the pairs (Ci,j , xi,j ) satisfy (4) for all i ∈ [d] and j ∈ [wi ]. For such i, j, we first prove that Ci,j is a (satisfiable) monotone CNF formula. That is, we prove that the various pinnings in the construction of Ci,j from Ce do not pin all variables of some clause of Ce to 0. For the sake of contradiction, assume otherwise. Observe that Ci,j is obtained from Ce by either removing some clauses or by pinning some occurrences of the variables to 0. Clearly, removal of clauses does not affect satisfiability, so we may focus on the effect of pinning. For i ∈ [d] and j ∈ [wi ], the only variables whose (some of the) 6Note that our notation for i, j is different from the one in [12]; there, the roles of i, j are interchanged. 9
occurrences in C get pinned to 0 are i1,...,ij-1.Since we assumed (for contradiction) that Cii is unsatisfiable,it must be the case that there exists a clause c in C all of whose variables are (a subset of),.1,...1.It follows that ci is redundant in C since it is a strict superset of clause c.This gives a contradiction,since the pre-processing operation ensures that C has no redundant clauses.Thus,Cij is satisfiable as wanted.Next,we show that rij is not forced in Cij.First,observe that xij is not forced in C since the second part of the preprocessing step ensures that C does not contain forced variables.Thus,the only way that zij can be forced in Cij is if there existed a clause d in C whose variables were ij together with a subset of i,1,...,ij-1.Since C includes ci and C does not have redundant clauses,it must be the case that c'=ci.It remains to observe that Cij does not include (any subclause of)ci,from which it follows that rij is not forced in Cij. We are now ready to state the relation between R(C,z)and the quantities R(Cij,ij) with i∈[d and j∈[ul. Lemma 10 ([12,Lemma 5).It holds that R(C,x - R(Cjxi】 1+R(Cij,ij) (5) Proof.The proof is identical to the proof of [12,Lemma 5](which in turn builds on the technique of [22]),we give the proof for completeness. Recall that C is the formula after the preprocessing step and that R(C,)=R(C,z).We may assume that x is not free in C(otherwise,it holds that R(C,r)=1,which coincides with the evaluation of the right hand side of (5)under the standard convention that the empty product evaluates to 1). Equation(5)follows immediately from the following two equalities. R(C,x)=II R(Ci,)R(Ci,x)=1- R(Ci,c) (6) ield j=1 1+R(Ci.j;Tij) Prc(x=0) The first equality in (6)is a consequence of a telescoping expansion ofTo see this, let C be the formula obtained from C by replacing,for all i [d,the occurrence of the variable x in clause ci by a new variable z.We have that Prc(x=0)Prc(=0,.,4=0) C,=Pre=司=m=1%可 Pra(=1,-1=1,2=0…4=0) -Ⅱ8-14-141-0…4=0 = Prc;(=0) ield ΠPcx=D iel which yields the first equlty (6)after substitutingR(C) PrC;(z=0) For the second equality in(6),observe that x appears only in clause ci of the formula Ci, and thus (denoting by Ci\ci the formula which is obtained from Ci by deleting clause ci) R(Ci,) PCz=1=1-PrCe(1=0,4,=0)=】 PrC(x=0) IIPrc.=0), j=1 which proves the desired equality after substituting Prc(0)= R(C,x,) 10
occurrences in Ce get pinned to 0 are x, xi,1, . . . , xi,j−1. Since we assumed (for contradiction) that Ci,j is unsatisfiable, it must be the case that there exists a clause c ′ in Ce all of whose variables are (a subset of) x, xi,1, . . . , xi,j−1. It follows that ci is redundant in Ce since it is a strict superset of clause c ′ . This gives a contradiction, since the pre-processing operation ensures that Ce has no redundant clauses. Thus, Ci,j is satisfiable as wanted. Next, we show that xi,j is not forced in Ci,j . First, observe that xi,j is not forced in Ce since the second part of the preprocessing step ensures that Ce does not contain forced variables. Thus, the only way that xi,j can be forced in Ci,j is if there existed a clause c ′ in Ce whose variables were xi,j together with a subset of x, xi,1, . . . , xi,j−1. Since Ce includes ci and Ce does not have redundant clauses, it must be the case that c ′ = ci . It remains to observe that Ci,j does not include (any subclause of) ci , from which it follows that xi,j is not forced in Ci,j . We are now ready to state the relation between R(C, x) and the quantities R(Ci,j , xi,j ) with i ∈ [d] and j ∈ [wi ]. Lemma 10 ([12, Lemma 5]). It holds that R(C, x) = ∏ d i=1 ( 1 − ∏wi j=1 R(Ci,j , xi,j ) 1 + R(Ci,j , xi,j ) ) . (5) Proof. The proof is identical to the proof of [12, Lemma 5] (which in turn builds on the technique of [22]), we give the proof for completeness. Recall that Ce is the formula after the preprocessing step and that R(C, x e ) = R(C, x). We may assume that x is not free in Ce (otherwise, it holds that R(C, x e ) = 1, which coincides with the evaluation of the right hand side of (5) under the standard convention that the empty product evaluates to 1). Equation (5) follows immediately from the following two equalities. R(C, x e ) = ∏ i∈[d] R(Ci , x), R(Ci , x) = 1 − ∏wi j=1 R(Ci,j , xi,j ) 1 + R(Ci,j , xi,j ) . (6) The first equality in (6) is a consequence of a telescoping expansion of PrCe(x=0) PrCe(x=1) . To see this, let Ce′ be the formula obtained from Ce by replacing, for all i ∈ [d], the occurrence of the variable x in clause ci by a new variable x ′ i . We have that R(C, x e ) = PrCe(x = 0) PrCe(x = 1) = PrCe′(x ′ 1 = 0, . . . , x′ d = 0) PrCe′(x ′ 1 = 1, . . . , x′ d = 1) = ∏ i∈[d] PrCe′(x ′ 1 = 1, . . . , x′ i−1 = 1, x′ i = 0, . . . x′ d = 0) PrCe′(x ′ 1 = 1, . . . , x′ i = 1, x′ i+1 = 0, . . . x′ d = 0) = ∏ i∈[d] PrCi (x = 0) PrCi (x = 1), which yields the first equality in (6) after substituting R(Ci , x) = PrCi (x=0) PrCi (x=1) . For the second equality in (6), observe that x appears only in clause ci of the formula Ci , and thus (denoting by Ci\ci the formula which is obtained from Ci by deleting clause ci) R(Ci , x) = PrCi (x = 0) PrCi (x = 1) = 1 − PrCi\ci (xi,1 = 0, . . . , xi,wi = 0) = 1 − ∏wi j=1 PrCi,j (xi,j = 0), which proves the desired equality after substituting PrCi,j (xi,j = 0) = R(Ci,j ,xi,j ) 1+R(Ci,j ,xi,j ) . 10