Electronic Colloquium on Computational Complexity, Report No. 55 (2001)
Improved Resolution Lower Bounds for the Weak Pigeonhole Principle
Alexander A. Razborov
July 26, 2001

Abstract
Recently, Raz [Raz01] established exponential lower bounds on the size of resolution proofs of the weak pigeonhole principle. We give another proof of this result which leads to better numerical bounds. Specifically, we show that every resolution proof of $PHP^m_n$ must have size $\exp\left(\Omega(n/\log m)^{1/2}\right)$, which implies an $\exp\left(\Omega(n^{1/3})\right)$ bound when the number of pigeons $m$ is arbitrary. As a step toward extending this bound to the functional version of $PHP^m_n$ (in which one pigeon may not split between several holes), we introduce one intermediate version (in the form of a PHP-oriented calculus) which, roughly speaking, allows arbitrary "monotone reasoning" about the location of an individual pigeon. For this version we prove an $\exp\left(\Omega(n/\log^2 m)^{1/2}\right)$ lower bound ($\exp\left(\Omega(n^{1/4})\right)$ for arbitrary $m$).
Author's address: Institute for Advanced Study, Princeton, US and Steklov Mathematical Institute, Moscow, Russia, [email protected]. Supported by the State of New Jersey and NSF grant CCR-9987077.
ISSN 1433-8092

1. Introduction

Propositional proof complexity is an area of study that has seen a rapid development over the last decade. It plays as important a role in the theory of feasible proofs as the role played by the complexity of Boolean circuits in the theory of efficient computations. Propositional proof complexity is
in a sense complementary to the (non-uniform) computational complexity; moreover, there exist extremely rich and productive relations between the two areas (see e.g. [Raz96, BP98]).

Much of the research in proof complexity is centered around the resolution proof system that was introduced in [Bla37] and further developed in [DP60, Rob65]. In fact, it was for a subsystem of this system (nowadays called regular Resolution) that Tseitin proved the first non-trivial lower bounds in his seminal paper of more than 30 years ago [Tse68]. Despite the apparent (and deceptive) simplicity of Resolution, the first exponential lower bounds for general Resolution were proven only in 1985 by Haken [Hak85]. These bounds were achieved for the pigeonhole principle $PHP^{n+1}_n$ (which asserts that $(n+1)$ pigeons cannot sit in $n$ holes so that every pigeon is alone in its hole), and they were followed by many other strong results on the complexity of resolution proofs (see e.g. [Urq87, CS88, BT88, BP96a, Juk97]).

Ben-Sasson and Wigderson [BW99] established a very general trade-off between the minimal width $w_R(\tau)$ and the minimal size $S_R(\tau)$ of resolution proofs for any tautology $\tau$. Their inequality (strengthening a previous result for Polynomial Calculus from [CEI96]) says that

$$w_R(\tau) \leq O\left(\sqrt{n(\tau) \log S_R(\tau)}\right), \tag{1}$$
where $n(\tau)$ is the number of variables. It is much easier to bound the width $w_R(\tau)$ than the size $S_R(\tau)$ and, remarkably, Ben-Sasson and Wigderson pointed out that (apparently) all lower bounds on $S_R(\tau)$ known at that time can be viewed as lower bounds on $w_R(\tau)$ followed by applying the inequality (1) (although sometimes with some extra work).

This "width method" seemed to fail bitterly for tautologies $\tau$ with a huge number of variables $n(\tau)$. There are two prominent examples of such tautologies. The first example is the weak pigeonhole principle $PHP^m_n$, where the word "weak" refers to the fact that the number of pigeons $m$ may be much larger (potentially infinite) than the number of holes $n$. The second example is made by the tautologies expressing the hardness of the Nisan-Wigderson generator for propositional proof systems [ABRW00].

Accordingly, other methods were developed for handling the weak pigeonhole principle $PHP^m_n$ (as far as the resolution size is concerned, the case of generator tautologies is still completely open). [RWY97] proved exponential
lower bounds for a subsystem of regular resolution (the so-called rectangular calculus), [PR00] proved such bounds for unrestricted regular resolution, and recently Raz [Raz01] completely solved the case of general resolution proofs for the version of the weak pigeonhole principle in which the axioms forbidding pigeons to split between several holes are missing.

The main goal of this paper is to present another (and, probably, simpler) proof of the latter result; we also get a stronger bound $\exp\left(\Omega(n^{1/3})\right)$ (Theorem 2.2 below; the bound resulting from [Raz01] would be something like $\exp\left(\Omega(n^{1/10})\right)$). This is already quite close to the best known upper bound $\exp\left(O(n \log n)^{1/2}\right)$ [BP96b]. What is, however, more important is that we essentially show how to match some basic ideas from [RWY97, PR00, Raz01] with the width-bounding argument from [BW99]. More specifically, our main technical tool (Lemma 3.1, essentially borrowed from [RWY97, PR00, Raz01]) allows us to prove some analogue of the relation (1) (Lemma 3.3, Claim 4.2) even in certain situations when the number of variables is huge.

Neither the methods from [Raz01] nor our methods apply directly to the functional version $FPHP^m_n$ in which one pigeon may not split between several holes. This version of the weak pigeonhole principle appears to be at least as natural and traditional as the "ordinary" one, and some more reasons to be interested in it can be found in the concluding Section 5. As a step toward the goal of getting resolution lower bounds for $FPHP^m_n$, we introduce an intermediate version (in terms of the so-called monotone functional calculus) which essentially allows arbitrary "monotone reasoning" about the locations of any individual pigeon. For this stronger version we prove a slightly weaker bound $\exp\left(\Omega(n^{1/4})\right)$ (Theorem 2.7).

The paper is organized as follows. In Section 2 we give necessary definitions and preliminaries. In Section 3 we prove our "base result" (which is an $\exp\left(\Omega(n^{1/4})\right)$ lower bound for the ordinary version): the proof is simpler than for the better bound $\exp\left(\Omega(n^{1/3})\right)$, but nonetheless illustrates all basic ideas of our approach. The two improvements of this result already mentioned above (Theorems 2.2, 2.7) are presented in Section 4. The paper is concluded with a brief discussion in Section 5 that also includes several open problems.
2. Preliminaries
Let $x$ be a Boolean variable, i.e. a variable that ranges over the set $\{0,1\}$. A literal of $x$ is either $x$ (denoted sometimes as $x^1$) or $\bar{x}$ (denoted sometimes as $x^0$). A clause is a disjunction of literals. The empty clause will be denoted by $0$. A clause is positive if it contains only positive literals $x^1$. For two clauses $C', C$, let $C' \leq C$ mean that every literal appearing in $C'$ also appears in $C$. A CNF is a conjunction of pairwise different clauses. For a CNF $\tau$, let $n(\tau)$ be the overall number of distinct variables appearing in it.

An assignment to the variables $\{x_1, \ldots, x_n\}$ is a mapping $\alpha : \{x_1, \ldots, x_n\} \to \{0,1\}$. A restriction of these variables is a mapping $\rho : \{x_1, \ldots, x_n\} \to \{0, 1, \star\}$. The restriction of a Boolean function $f(x_1, \ldots, x_n)$ by $\rho$, denoted by $f|_\rho$, is the function obtained from $f$ by setting the value of each $x \in \rho^{-1}(\{0,1\})$ to $\rho(x)$, and leaving each $x \in \rho^{-1}(\star)$ as a variable.

One of the simplest and the most widely studied propositional proof systems is Resolution, which operates with clauses and has one rule of inference called the resolution rule:

$$\frac{C_0 \vee x \qquad C_1 \vee \bar{x}}{C} \quad (C_0 \vee C_1 \leq C).$$

A resolution refutation of a CNF $\tau$ is a resolution proof of the empty clause $0$ from the clauses appearing in $\tau$. The size $S_R(P)$ of a resolution proof $P$ is the overall number of clauses in it. For an unsatisfiable CNF $\tau$, $S_R(\tau)$ is the minimal size of its resolution refutation.

For a non-negative integer $n$, let $[n] \stackrel{\mathrm{def}}{=} \{1, 2, \ldots, n\}$, and for $\ell \leq n$ let $[n]^{\ell} \stackrel{\mathrm{def}}{=} \{\, I \subseteq [n] \mid |I| = \ell \,\}$.
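Definition 2.1 $(\neg PHP^m_n)$ is the unsatisfiable CNF in the variables $\{\, x_{ij} \mid i \in [m],\ j \in [n] \,\}$ that is the conjunction of the following clauses:

$$Q_i \stackrel{\mathrm{def}}{=} \bigvee_{j=1}^{n} x_{ij} \quad (i \in [m]);$$
$$Q_{i_1,i_2;j} \stackrel{\mathrm{def}}{=} (\bar{x}_{i_1 j} \vee \bar{x}_{i_2 j}) \quad (i_1 \neq i_2 \in [m];\ j \in [n]).$$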
The first main result of this paper is the following

Theorem 2.2 $S_R(\neg PHP^m_n) \geq \exp\left(\Omega(n/\log m)^{1/2}\right)$.

Corollary 2.3 For every $m$, $S_R(\neg PHP^m_n) \geq \exp\left(\Omega(n^{1/3})\right)$.

Proof of Corollary 2.3 from Theorem 2.2. Let $S_R(\neg PHP^m_n) = S$. Since a resolution proof of size $S$ can use at most $S$ axioms from $(\neg PHP^m_n)$, and these axioms involve at most $2S$ pigeons $i \in [m]$, we also have $S_R(\neg PHP^{2S}_n) \leq S$. Now the required bound $S \geq \exp\left(\Omega(n^{1/3})\right)$ immediately follows from Theorem 2.2.
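The last step of this proof can be spelled out as a short calculation. The following is only a sketch: the constant $c > 0$ stands for the unspecified constant hidden in the $\Omega$ of Theorem 2.2, logarithms are natural (changing the base only changes $c$), and we assume $S \geq 2$.

```latex
% Theorem 2.2 applied to \neg PHP^{2S}_n, whose refutation size is at most S:
\[
  S \;\geq\; S_R(\neg PHP^{2S}_n) \;\geq\; \exp\!\left( c \left( \frac{n}{\log (2S)} \right)^{1/2} \right)
  \quad\Longrightarrow\quad
  (\log S)^2 \cdot \log (2S) \;\geq\; c^2 n .
\]
% For S \geq 2 we have \log(2S) \leq 2 \log S, hence
\[
  2 (\log S)^3 \;\geq\; c^2 n
  \quad\Longrightarrow\quad
  \log S \;\geq\; \left( \frac{c^2 n}{2} \right)^{1/3} = \Omega\!\left(n^{1/3}\right).
\]
```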
The following normal form for resolution refutations of the pigeonhole principle was proposed in [BP96b] (they used for it the longer name "monotone resolution proof system" which we abbreviate to "monotone calculus"). For $I \subseteq [m]$, $J \subseteq [n]$ let

$$X_{IJ} \stackrel{\mathrm{def}}{=} \bigvee_{i \in I} \bigvee_{j \in J} x_{ij}$$

(these are exactly the "rectangular clauses" from [RWY97]), and we will also naturally abbreviate $X_{I,\{j\}}$ to $X_{Ij}$ and $X_{\{i\},J}$ to $X_{iJ}$.
Definition 2.4 ([BP96b]) Fix $m > n$. The monotone calculus operates with positive clauses in the variables $\{\, x_{ij} \mid i \in [m],\ j \in [n] \,\}$, and has one inference rule which is the following monotone rule:

$$\frac{C_0 \vee X_{I_0,j} \qquad C_1 \vee X_{I_1,j}}{C} \quad (C_0 \vee C_1 \leq C;\ I_0 \cap I_1 = \emptyset). \tag{2}$$
A monotone calculus refutation of a set of positive clauses A is a monotone calculus proof of 0 from A, and the size S (P ) of a monotone calculus proof is the overall number of clauses in it.
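To make rule (2) concrete, here is a minimal sketch of a checker for a single inference step. The representation is an assumption made for illustration only: a positive clause is a Python set of `(pigeon, hole)` pairs, and the function name `monotone_rule_applies` is hypothetical. The check uses the observation that the literals dropped from each premise must all lie in hole $j$, and the pigeons dropped from the two premises must be disjoint.

```python
def monotone_rule_applies(premise0, premise1, conclusion, j):
    """Check whether `conclusion` follows from the two premises by rule (2) at hole j.

    Clauses are sets of (pigeon, hole) pairs standing for positive literals x_ij.
    (Illustrative representation, not taken from the paper.)
    """
    # Literals of each premise that are not carried over to the conclusion;
    # these must be absorbed by the rectangles X_{I0,j} and X_{I1,j}.
    dropped0 = premise0 - conclusion
    dropped1 = premise1 - conclusion
    # Every dropped literal has to sit in hole j.
    if any(hole != j for _, hole in dropped0 | dropped1):
        return False
    # The smallest admissible I0, I1 are the pigeons of the dropped literals;
    # rule (2) requires them to be disjoint.
    i0 = {pigeon for pigeon, _ in dropped0}
    i1 = {pigeon for pigeon, _ in dropped1}
    return i0.isdisjoint(i1)
```

For example, with $n = 2$ the axioms $Q_1 = x_{11} \vee x_{12}$ and $Q_2 = x_{21} \vee x_{22}$ yield $x_{12} \vee x_{22}$ by resolving on hole 1, whereas two copies of $x_{11}$ do not yield the empty clause.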
Proposition 2.5 ([BP96b]) $S_R(\neg PHP^m_n)$ coincides, up to a polynomial, with the minimal possible size of a monotone calculus refutation of the set of axioms $\{Q_1, Q_2, \ldots, Q_m\}$ from Definition 2.1.
In the rest of this section we will be discussing some modifications of the base principle $PHP^m_n$. The reader interested only in the original formulation may skip this and proceed directly to Section 3.
The (negation of the) functional pigeonhole principle $(\neg FPHP^m_n)$ is obtained from $(\neg PHP^m_n)$ by adding new clauses

$$Q_{i;j_1,j_2} \stackrel{\mathrm{def}}{=} (\bar{x}_{i j_1} \vee \bar{x}_{i j_2}) \quad (i \in [m];\ j_1 \neq j_2 \in [n]).$$

[BW99] also introduced the extended pigeonhole principle $(\neg EPHP^m_n)$ by allowing abbreviations for arbitrary Boolean functions that depend on a single pigeon $i \in [m]$. $EPHP^m_n$ is obviously reducible to $PHP^m_n$ (in the sense that every resolution proof of $PHP^m_n$ leads to a resolution proof of $EPHP^m_n$ that is roughly of the same size). The reduction from $FPHP^m_n$ to $EPHP^m_n$ may seem somewhat counter-intuitive but it is actually not hard (see e.g. Section 5). Thus, $EPHP^m_n$ is intermediate between $PHP^m_n$ and $FPHP^m_n$.

The following monotone version of $EPHP^m_n$, formulated as a natural extension of the monotone calculus, is in turn intermediate between $PHP^m_n$ and $EPHP^m_n$. Let $F_n^{mon}$ be the set of monotone Boolean functions in the auxiliary variables $\{x_1, \ldots, x_n\}$, and let $Vars^{mon}(m,n) \stackrel{\mathrm{def}}{=} \{\, x_{if} \mid i \in [m],\ f \in F_n^{mon} \,\}$. Denote by $\rho_j$ the restriction of the variables $\{x_1, \ldots, x_n\}$ that assigns $x_j$ to $0$ and leaves all other variables unassigned. For $I \subseteq [m]$ and $j \in [n]$, let $\rho_{Ij}$ be the restriction of the variables $Vars^{mon}(m,n)$ defined by

$$\rho_{Ij}(x_{if}) \stackrel{\mathrm{def}}{=} \begin{cases} x_{i,\rho_j(f)}, & i \in I \\ x_{if}, & i \notin I. \end{cases}$$

$\rho_{Ij}$ also naturally acts on clauses in the variables $Vars^{mon}(m,n)$.
Definition 2.6 The monotone functional calculus operates with positive clauses in the variables $Vars^{mon}(m,n)$ and has the following two inference rules:

$$\frac{C_0 \vee x_{i,f_1} \vee \ldots \vee x_{i,f_r} \qquad C_1 \vee x_{i,g_1} \vee \ldots \vee x_{i,g_s}}{C \vee x_{i,h_1} \vee \ldots \vee x_{i,h_t}}$$
$$(C_0 \vee C_1 \leq C;\ i \in [m];\ (f_1 \vee \ldots \vee f_r) \wedge (g_1 \vee \ldots \vee g_s) \leq (h_1 \vee \ldots \vee h_t))$$

and

$$\frac{C_0 \qquad C_1}{C} \quad (\rho_{I_0,j}(C_0) \vee \rho_{I_1,j}(C_1) \leq C \text{ for some } I_0, I_1 \text{ such that } I_0 \cap I_1 = \emptyset \text{ and } j \in [n]).$$

As always, a refutation in this calculus is a proof of $0$, and the size is measured by the number of clauses.
Our second main result is this:
Theorem 2.7 Every monotone functional calculus refutation of $\{Q_1, \ldots, Q_m\}$ must have size $\exp\left(\Omega(n/\log^2 m)^{1/2}\right)$.

By the same trick as before, we get

Corollary 2.8 For every $m$, every monotone functional calculus refutation of $\{Q_1, \ldots, Q_m\}$ must have size $\exp\left(\Omega(n^{1/4})\right)$.

Remark 1 The choice of inference rules for the monotone functional calculus may seem somewhat arbitrary. It is worth noting in this respect that Theorem 2.7 holds for the semantical version as well. Namely, we may allow arbitrary binary rules that are sound w.r.t. the set of assignments satisfying the axioms $Q_{i_1,i_2;j}$.
3. Proof of the base result
We begin with the bound $S_R(\neg PHP^m_n) \geq \exp\left(\Omega(n^{1/4})\right)$. It is weaker than both Corollary 2.3 and Corollary 2.8, but the proof is simpler and already illustrates all the major ideas.

Fix $m > n$. Given Proposition 2.5, we may assume that we have a monotone calculus refutation $P$ of $\{Q_1, \ldots, Q_m\}$, and we should lower bound its size $S(P)$. For analyzing the refutation $P$ we are going to allow stronger axioms of the form $X_{iJ}$ (note that $Q_i = X_{i,[n]}$). $X_{iJ}$ will be allowed as an axiom if $|J|$ exceeds a certain threshold $d_i$ depending on the pigeon $i$. In this way we will be able to simplify the refutation $P$ by "filtering out" of it all clauses $C$ containing at least one such axiom.

Our first task (Section 3.1) will be to show that if the thresholds $d_i$ are chosen cleverly, then in every clause $C$ passing this filter, almost all pigeons pass it safely, i.e. their degree in $C$ is well below the corresponding threshold $d_i$. This part in a sense replaces the inequality (1), and it is inspired by the papers [RWY97, PR00, Raz01]. The pseudo-width of a clause $C$ will be defined as the number of pigeons that narrowly pass the filter $(d_1, \ldots, d_m)$. The second task (Section 3.2) will be to get lower bounds on the pseudo-width, and this will be accomplished by an easy adaptation of the standard argument from [BW99].
3.1. Pseudo-width and its reduction
For a positive clause $C$ in the variables $\{\, x_{ij} \mid i \in [m],\ j \in [n] \,\}$, let

$$J_i(C) \stackrel{\mathrm{def}}{=} \{\, j \in [n] \mid x_{ij} \text{ occurs in } C \,\}$$

and

$$d_i(C) \stackrel{\mathrm{def}}{=} |J_i(C)|.$$

Suppose that we are given a vector $d = (d_1, \ldots, d_m)$ of elements from $[n]$ ("pigeon filter"), and let $\delta$ be another parameter. We let

$$I_{d,\delta}(C) \stackrel{\mathrm{def}}{=} \{\, i \in [m] \mid d_i(C) \geq d_i - \delta \,\} \tag{3}$$

and we define the pseudo-width $w_{d,\delta}(C)$ of a clause $C$ as

$$w_{d,\delta}(C) \stackrel{\mathrm{def}}{=} |I_{d,\delta}(C)|.$$

The pseudo-width $w_{d,\delta}(P)$ of a monotone calculus refutation $P$ is naturally defined as $\max \{\, w_{d,\delta}(C) \mid C \in P \,\}$.

Our main tool for reducing the pseudo-width of a monotone calculus proof is the following "pigeon filter" lemma which is in fact a rather general combinatorial statement (in particular, we will use it in the same form in Section 4).
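The definitions of $d_i(C)$, $I_{d,\delta}(C)$ and the pseudo-width translate directly into code. The sketch below assumes, purely for illustration, that a positive clause is given as a set of `(pigeon, hole)` pairs with pigeons numbered from 1, and that the filter `d` is a list with `d[i-1]` holding $d_i$; the function names are hypothetical.

```python
from collections import Counter


def degrees(clause, m):
    """Return [d_1(C), ..., d_m(C)]: the number of holes j with x_ij in C, per pigeon."""
    counts = Counter(pigeon for pigeon, _ in set(clause))
    return [counts.get(i, 0) for i in range(1, m + 1)]


def pseudo_width(clause, d, delta):
    """Pseudo-width w_{d,delta}(C) = |{ i : d_i(C) >= d_i - delta }|, following (3)."""
    degs = degrees(clause, len(d))
    return sum(1 for deg, threshold in zip(degs, d) if deg >= threshold - delta)
```

Note that enlarging $\delta$ can only enlarge $I_{d,\delta}(C)$, so the pseudo-width is monotone in $\delta$.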
Lemma 3.1 Suppose that we are given $S$ integer vectors $r^1, r^2, \ldots, r^S$ of length $m$ each: $r^\nu = (r^\nu_1, \ldots, r^\nu_m)$. Then there exists an integer vector $(r_1, \ldots, r_m)$ such that $r_i < \lfloor \log_2 m \rfloor$ for all $i \in [m]$ and for every $\nu \in [S]$ at least one of the following two events happens:

1. $\exists i \in [m]\ (r^\nu_i \leq r_i)$;
2. $|\{\, i \in [m] \mid r^\nu_i \leq r_i + 1 \,\}| \leq O(\log S)$.
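The statement can be explored numerically on small inputs. The sketch below samples candidate vectors from the truncated geometric distribution that appears in the proof of the lemma later in this section ($p_r = 2^{-r}$ for $r < t$, $p_t = 2^{1-t}$, $t = \lfloor \log_2 m \rfloor - 1$) and tests the two events directly. The function names and the explicit `bound` argument (standing in for the unspecified $O(\log S)$ quantity) are assumptions made for illustration, and the retry loop is a plain random search rather than anything from the paper.

```python
import random


def sample_filter(m, rng):
    """Sample (r_1, ..., r_m) independently with P[r = k] = 2^-k for k < t and 2^(1-t) for k = t."""
    if m < 4:
        raise ValueError("need m >= 4 so that t = floor(log2 m) - 1 >= 1")
    t = m.bit_length() - 2  # floor(log2 m) - 1
    result = []
    for _ in range(m):
        r = 1
        # Move to the next value with probability 1/2, stopping at t.
        while r < t and rng.random() < 0.5:
            r += 1
        result.append(r)
    return result


def satisfies(r, vec, bound):
    """True if event 1 or event 2 of Lemma 3.1 holds for the vector `vec`."""
    event1 = any(v <= ri for v, ri in zip(vec, r))
    event2 = sum(1 for v, ri in zip(vec, r) if v <= ri + 1) <= bound
    return event1 or event2


def find_filter(vectors, m, bound, tries=1000, seed=0):
    """Randomly search for a filter that works for all given vectors; None if not found."""
    rng = random.Random(seed)
    for _ in range(tries):
        r = sample_filter(m, rng)
        if all(satisfies(r, vec, bound) for vec in vectors):
            return r
    return None
```

Every sampled coordinate lies in $\{1, \ldots, t\}$, so the requirement $r_i < \lfloor \log_2 m \rfloor$ holds by construction.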
We postpone the proof and first show how to use this lemma for reducing the pseudo-width.
Definition 3.2 Given a vector $d = (d_1, \ldots, d_m)$, a $d$-axiom is an arbitrary clause of the form $X_{iJ}$, where $|J| \geq d_i$.
Lemma 3.3 Suppose that there exists a monotone calculus refutation $P$ of $\{Q_1, \ldots, Q_m\}$ that has size $S$. Then there exists an integer vector $d = (d_1, \ldots, d_m)$ with $n/(2 \log_2 m) < d_i \leq n/2$ for all $i \in [m]$ and a monotone calculus refutation $P'$ of a set of $d$-axioms which also has size $\leq S$ and such that$^1$

$$w_{d,\, n/(2 \log_2 m)}(P') \leq O(\log S).$$
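Proof of Lemma 3.3 from Lemma 3.1. Fix a monotone calculus refutation $P$ of $\{Q_1, \ldots, Q_m\}$ with $S(P) \leq S$. Let $\delta \stackrel{\mathrm{def}}{=} n/(2 \log_2 m)$, and for $C \in P$ define

$$r_i(C) \stackrel{\mathrm{def}}{=} \begin{cases} 1 & \text{if } d_i(C) \geq n/2 \\ \left\lfloor \frac{(n/2) - d_i(C)}{\delta} \right\rfloor + 1 & \text{otherwise.} \end{cases}$$

We apply Lemma 3.1 to the vectors $\{\, r(C) \stackrel{\mathrm{def}}{=} (r_1(C), \ldots, r_m(C)) \mid C \in P \,\}$, and let $(r_1, \ldots, r_m)$ satisfy the conclusion of that lemma. Set $d_i \stackrel{\mathrm{def}}{=} \left\lfloor \frac{n}{2} - \delta r_i \right\rfloor + 1$ (so that $d_i$ is the minimal integer with the property $\left\lfloor \frac{(n/2) - d_i}{\delta} \right\rfloor + 1 \leq r_i$). Note that since $r_i < \lfloor \log_2 m \rfloor$, we have $d_i > \delta$.

Consider now an arbitrary $C \in P$. If for the vector $r(C)$ the first case in Lemma 3.1 takes place, then $\left\lfloor \frac{(n/2) - d_i(C)}{\delta} \right\rfloor + 1 \leq r_i$ for some $i \in [m]$. This implies $d_i(C) \geq d_i$; thus, $C$ contains a subclause which is a $d$-axiom. We may replace $C$ by this axiom, which will reduce its pseudo-width $w_{d,\delta}(C)$ to $1$.

In the second case, $\left|\left\{\, i \in [m] \;\middle|\; \left\lfloor \frac{(n/2) - d_i(C)}{\delta} \right\rfloor \leq r_i \,\right\}\right| \leq O(\log S)$. Since $i \in I_{d,\delta}(C)$ implies the inequality $\left\lfloor \frac{(n/2) - d_i(C)}{\delta} \right\rfloor \leq r_i$, for all such $C$ we have $w_{d,\delta}(C) \leq O(\log S)$. This completes the proof of Lemma 3.3.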
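The passage between degrees, ranks and thresholds in this proof is a small piece of arithmetic that is easy to check mechanically. The following sketch (function names `rank` and `threshold` are hypothetical) implements $r_i(C)$ as a function of the degree $d_i(C)$ and the threshold $d_i$ as a function of $r_i$, so that the equivalence "$r_i(C) \leq r_i$ if and only if $d_i(C) \geq d_i$" can be verified on concrete parameters.

```python
import math


def rank(degree, n, delta):
    """r_i(C) from the proof of Lemma 3.3, as a function of the degree d_i(C)."""
    if degree >= n / 2:
        return 1
    return math.floor((n / 2 - degree) / delta) + 1


def threshold(r_i, n, delta):
    """d_i = floor(n/2 - delta * r_i) + 1, the least degree whose rank is at most r_i."""
    return math.floor(n / 2 - delta * r_i) + 1


# Example parameters (chosen only for illustration): n = 64 holes, m = 16 pigeons.
n, m = 64, 16
delta = n / (2 * math.log2(m))  # equals 8.0 here
```

With these parameters a pigeon of degree 10 has rank 3, the threshold for $r_i = 3$ is 9, and every threshold for $r_i < \lfloor \log_2 m \rfloor = 4$ exceeds $\delta$, as the proof asserts.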
Proof of Lemma 3.1. This lemma is proved by an easy probabilistic argument. For $r = (r_1, \ldots, r_m)$, let $W(r) \stackrel{\mathrm{def}}{=} \sum_{i=1}^{m} 2^{-r_i}$. It suffices to prove the existence of a vector $r$ such that for every $\nu \in [S]$ we have:

$$W(r^\nu) \geq 2 \ln S \implies \exists i \in [m]\ (r^\nu_i \leq r_i); \tag{4}$$
$$W(r^\nu) \leq 2 \ln S \implies |\{\, i \in [m] \mid r^\nu_i \leq r_i + 1 \,\}| \leq O(\log S). \tag{5}$$
Footnote 1: The condition $d_i \leq n/2$ will not be needed in Section 3. Also, we will not need there the upper bound on the size of the whole refutation $P'$, only its consequence that $P'$ actually employs at most $S$ $d$-axioms. Both these conditions, however, will be essential for the improvements in Section 4.
Let $t \stackrel{\mathrm{def}}{=} \lfloor \log_2 m \rfloor - 1$ and let $R$ be the distribution on $[t]$ given by $p_r \stackrel{\mathrm{def}}{=} 2^{-r}$ $(1 \leq r \leq t-1)$, $p_t \stackrel{\mathrm{def}}{=} 2^{1-t}$. Pick independent random variables $r_1, \ldots, r_m$ according to this distribution. Let us check that for any individual $\nu \in [S]$ the related condition (4), (5) is satisfied with high probability.

Case 1. $W(r^\nu) \geq 2 \ln S$.

Note that $\sum_{r^\nu_i > t} 2^{-r^\nu_i} \leq m \cdot 2^{-t-1} \leq 2$, therefore $\sum_{r^\nu_i \leq t} 2^{-r^\nu_i} \geq 2 \ln S - 2$. On the other hand, for every $i$ with $r^\nu_i \leq t$ we have $\mathbf{P}[r^\nu_i \leq r_i] \geq 2^{-r^\nu_i}$ and these events are independent. Therefore,

$$\mathbf{P}[\forall i \in [m]\ (r^\nu_i > r_i)] \leq \prod_{r^\nu_i \leq t} \left(1 - 2^{-r^\nu_i}\right) \leq \exp\left(- \sum_{r^\nu_i \leq t} 2^{-r^\nu_i}\right) \leq O(S^{-2}).$$

Case 2. $W(r^\nu) \leq 2 \ln S$.

In this case $\mathbf{P}[r^\nu_i \leq r_i + 1] \leq 2^{2 - r^\nu_i}$ and, therefore,

$$\mathbf{E}[\,|\{\, i \in [m] \mid r^\nu_i \leq r_i + 1 \,\}|\,] \leq 4 W(r^\nu) \leq 8 \ln S.$$

Since these events are independent, we may apply Chernoff's bound and conclude that $\mathbf{P}[\,|\{\, i \in [m] \mid r^\nu_i \leq r_i + 1 \,\}| \geq C \log S\,] \leq S^{-2}$ for any sufficiently large constant $C$.

So, for every individual $\nu \in [S]$ the probability that the related property (4), (5) fails is at most $O(S^{-2})$. Therefore, for at least one choice of $r_1, \ldots, r_m$ they will be satisfied for all $\nu \in [S]$. This completes the proof of Lemma 3.1.

3.2. Lower bounds on pseudo-width
Lemma 3.4 Let $(d_1, \ldots, d_m)$ be an integer vector, $\delta$ be any parameter such that $\delta < d_i$ for all $i \in [m]$ and $\mathcal{A}$ be an arbitrary set of $d$-axioms. Then every monotone calculus refutation $P$ of $\mathcal{A}$ must satisfy $w_{d,\delta}(P) \geq \Omega\left(\delta^2/(n \log |\mathcal{A}|)\right)$.

Proof. Let $w_0 \stackrel{\mathrm{def}}{=} \frac{\epsilon \delta^2}{n \log |\mathcal{A}|}$, where $\epsilon$ is a sufficiently small constant. We will show that every refutation of $\mathcal{A}$ must have pseudo-width $> w_0$.

For an assignment $a$ to the variables $\{\, x_{ij} \mid i \in [m],\ j \in [n] \,\}$, let

$$J_i(a) \stackrel{\mathrm{def}}{=} \{\, j \mid a_{ij} = 1 \,\}.$$

Set $\ell \stackrel{\mathrm{def}}{=} \lfloor \delta/(4 w_0) \rfloor$, and let $D$ be the set of those assignments $a$ for which:

1. $a$ satisfies all axioms $Q_{i_1,i_2;j}$, i.e., $J_{i_1}(a) \cap J_{i_2}(a) = \emptyset$ for $i_1 \neq i_2$;
2. $|J_i(a)| \leq \ell$ for all $i \in [m]$.

For a set of positive clauses $\Gamma$ and another positive clause $C$, let $\Gamma \models C$ mean that every assignment $a \in D$ satisfying all clauses from $\Gamma$ also satisfies $C$.

Fix now any proof $P$ from the set of axioms $\mathcal{A}$ with $w_{d,\delta}(P) \leq w_0$. Our goal is to show that $0 \notin P$. Let $\mathcal{A}_i$ consist of those axioms in $\mathcal{A}$ that have the form $X_{iJ}$, $\mathcal{A}_I \stackrel{\mathrm{def}}{=} \bigcup_{i \in I} \mathcal{A}_i$ and $\mathcal{A}_C \stackrel{\mathrm{def}}{=} \mathcal{A}_{I_{d,\delta}(C)}$ (recall that $I_{d,\delta}(C)$ is given by (3)). For $C \in P$ we will show by induction on the number of steps in the derivation of $C$ that $\mathcal{A}_C \models C$.

Base case $C \in \mathcal{A}$ is obvious since $C \in \mathcal{A}_C$.

Inductive step. $\mathcal{A}_{C_0} \models C_0$, $\mathcal{A}_{C_1} \models C_1$ and $C$ is obtained from $C_0, C_1$ by a single application of the rule (2). Since the rule (2) is sound on $D$, $\mathcal{A}_{I_{d,\delta}(C_0) \cup I_{d,\delta}(C_1)} \models C$, and also

$$|I_{d,\delta}(C_0) \cup I_{d,\delta}(C_1)| \leq 2 w_0.$$

Let us choose the minimal $I \subseteq [m]$ such that $\mathcal{A}_I \models C$; then still $|I| \leq 2 w_0$. We will show that in fact $I \subseteq I_{d,\delta}(C)$, and this will obviously imply $\mathcal{A}_C \models C$.

Assume the contrary, and pick an arbitrary $i_0 \in I \setminus I_{d,\delta}(C)$. Since $I$ is minimal, $\mathcal{A}_{I \setminus \{i_0\}} \not\models C$, and let $a \in D$ satisfy all clauses in $\mathcal{A}_{I \setminus \{i_0\}}$ and falsify $C$. Re-assigning in $a$ all values $a_{ij}$ with $i \notin I \setminus \{i_0\}$ to $0$ will preserve these properties (remember that $C$ is positive!), therefore we may assume from the beginning that $a_{ij} = 0$ for all $i \notin I \setminus \{i_0\}$ and $j \in [n]$.

Let now

$$J_0 \stackrel{\mathrm{def}}{=} \bigcup_{i \in I \setminus \{i_0\}} J_i(a) \cup J_{i_0}(C) \tag{6}$$

and $J_1 \stackrel{\mathrm{def}}{=} [n] \setminus J_0$. Note that

$$|J_1| \geq n - (2 w_0 \ell + (d_{i_0} - \delta)) \geq n - d_{i_0} + \delta/2. \tag{7}$$

$J_1$ is the set of holes "permissible" for the pigeon $i_0$: if we change $a$ by picking an arbitrary $\ell$-subset $J$ of $J_1$ and letting $a_{i_0,j} = 1$ for $j \in J$, then we will get yet another assignment from $D$ which will still falsify $C$. We want to show that $J$ can be chosen in such a way that this assignment will also satisfy all axioms in $\mathcal{A}_{i_0}$, and for that purpose we pick $J$ uniformly at random among all $\ell$-subsets of $J_1$. Let $a'$ be the (random) assignment resulting from $a$ by re-assigning all $a_{i_0,j}$ $(j \in J)$ to $1$.

Take an arbitrary $A \in \mathcal{A}_{i_0}$. Since $|J_{i_0}(A)| \geq d_{i_0}$, by (7) we have

$$|J_{i_0}(A) \cap J_1| \geq \delta/2. \tag{8}$$

Now we can apply Chernoff's bound and conclude that

$$\mathbf{P}[A(a') = 1] = \mathbf{P}[J_{i_0}(A) \cap J \neq \emptyset] \geq 1 - \exp(-\Omega(\delta \ell / n)) \geq 1 - |\mathcal{A}|^{-2}$$

if the constant $\epsilon$ in the definition of $w_0$ is small enough. Hence, for at least one choice of $a'$ all axioms in $\mathcal{A}_{i_0}$ will be satisfied. This contradicts our assumption $\mathcal{A}_I \models C$, and this contradiction completes the inductive step.

We have shown that $\mathcal{A}_C \models C$ for every $C \in P$. Finally, since $\delta < d_i$ for all $i \in [m]$, we have $I_{d,\delta}(0) = \emptyset$ and $\mathcal{A}_0 = \emptyset$. Therefore, $\mathcal{A}_0 \not\models 0$, $0 \notin P$ and Lemma 3.4 is completely proved.

Combining Lemma 3.4 with Lemma 3.3 (and observing that, as always, we may assume $m \leq 2S$), we get
Theorem 3.5 For every $m$, $S_R(\neg PHP^m_n) \geq \exp\left(\Omega(n^{1/4})\right)$.
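The way the two lemmas combine can be written out as a short calculation. This is a sketch only: $c, c' > 0$ denote the unspecified constants hidden in the $O$ of Lemma 3.3 and the $\Omega$ of Lemma 3.4, and we use that the refutation $P'$ from Lemma 3.3 employs at most $S$ $d$-axioms, so $\log |\mathcal{A}| \leq \log S$.

```latex
% Lemma 3.3 with \delta = n / (2 \log_2 m) gives an upper bound on the pseudo-width,
% Lemma 3.4 gives a lower bound:
\[
  c \log S \;\geq\; w_{d,\delta}(P') \;\geq\; \frac{c' \delta^2}{n \log |\mathcal{A}|}
  \;\geq\; \frac{c'\, n}{4 (\log_2 m)^2 \log S}
  \quad\Longrightarrow\quad
  (\log S)^2 (\log_2 m)^2 \;\geq\; \frac{c'}{4c}\, n .
\]
% Using m \leq 2S, i.e. \log_2 m = O(\log S):
\[
  (\log S)^4 \;\geq\; \Omega(n)
  \quad\Longrightarrow\quad
  \log S \;\geq\; \Omega\!\left(n^{1/4}\right).
\]
```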
4. Improvements

In this section we prove Theorems 2.2 and 2.7. Each of these two improvements is achieved by letting one more ingredient of the basic proof from the previous section depend on the candidate refutation $P$. To get the numerical improvement (Theorem 2.2), we will pre-process the set of legitimate assignments $D$ according to the content of $P$. In proving lower bounds for the monotone functional calculus (Theorem 2.7), the ranking function $r_i(C)$ (cf. the proof of Lemma 3.3) will depend on $P$ and will be constructed dynamically.

For Theorem 2.2 we will show improved lower bounds on the pseudo-width $w_{d,\delta}(P)$. Now we do need the bound $d_i \leq n/2$ promised in Lemma 3.3. Also, we need to know an upper bound on the size of the whole proof (as opposed to Lemma 3.4 for which we only needed a bound on the number of $d$-axioms).
Lemma 4.1 Let $(d_1, \ldots, d_m)$ be an integer vector and $\delta$ be a parameter such that $\delta < d_i \leq n/2$ for all $i \in [m]$. Then for every monotone calculus refutation $P$ of any set of $d$-axioms we have the trade-off $w_{d,\delta}(P) \cdot \log S(P) \geq \Omega(\delta)$.

Proof. Fix an arbitrary proof $P$ from a set of $d$-axioms $\mathcal{A}$. Set $w_0 \stackrel{\mathrm{def}}{=} \frac{\epsilon \delta}{\log S(P)}$, where $\epsilon$ is a sufficiently small constant; we will show that $w_{d,\delta}(P) > w_0$. Let $\ell \stackrel{\mathrm{def}}{=} \left\lfloor \frac{n}{20 w_0} \right\rfloor$.

Analyzing the proof of Lemma 3.4, we see that it almost goes through with this new value of $\ell$. The only problem is that now the set of forbidden holes $\bigcup_{i \in I \setminus \{i_0\}} J_i(a)$ in (6) may have as many as $\Omega(n)$ elements. Thus, if we are unlucky, it may have a huge intersection with the set

$$J' \stackrel{\mathrm{def}}{=} J_{i_0}(A) \setminus J_{i_0}(C) \tag{9}$$

for some $C \in P$, $i_0 \notin I_{d,\delta}(C)$, $A \in \mathcal{A}_{i_0}$, or even completely cover it. All this means that we do not have any useful analogue of (8).

We take care of this by pre-processing the set $D$. Namely, we are going to remove from it in advance all trouble-making assignments that may eventually contribute to the unpleasant situation described above.

Formally, let us call any set of the form (9), where $C \in P$, $i_0 \notin I_{d,\delta}(C)$ and $A \in \mathcal{A}_{i_0}$, a difference set. Notice for the record that every difference set has size at least $\delta$, and altogether there are at most $S(P)^2$ of them. Next, let us call $J \subseteq [n]$ good if its intersection with every difference set $J'$ has size at most $|J'|/(4 w_0)$. We will call an assignment $a \in D$ good if $J_i(a)$ is good for all $i \in [m]$.

Now, we define the main relation $\Gamma \models C$ as the semantical implication with respect to good assignments $a \in D$, and literally repeat the argument from the proof of Lemma 3.4 up to and including the definition (6) of $J_0, J_1$. We no longer have (7) but, using the premise $d_i \leq n/2$, we can at least observe a weaker bound

$$|J_1| \geq n - \left( (2 w_0) \cdot \frac{n}{20 w_0} + (d_{i_0} - \delta) \right) \geq \frac{9n}{10} - d_{i_0} \geq \frac{2n}{5}. \tag{10}$$

The most crucial observation for our improvement is that the bound (8) still holds for any $A \in \mathcal{A}_{i_0}$, although for a different reason. Indeed, $J_{i_0}(A) \cap J_1 = J' \setminus \bigcup_{i \in I \setminus \{i_0\}} J_i(a)$, where $J'$ is given by (9). Since every one of $J_i(a)$ is good and $J'$ is a difference set, $|J' \cap J_i(a)| \leq |J'|/(4 w_0)$. This implies $|J_{i_0}(A) \cap J_1| \geq |J'|/2 \geq \delta/2$, i.e., exactly (8).
Similarly to the proof of Lemma 3.4, we now choose $J$ as a random $\ell$-subset of $J_1$ and denote by $a'$ the resulting variation of the original assignment $a$. Given (8), the same calculation based on Chernoff's bound as before shows that with probability $1 - o(1)$ $a'$ satisfies all axioms in $\mathcal{A}_{i_0}$. In order to complete the proof in our case, we, however, still need to make sure that there exists a good $J$ with this property.

For this purpose notice that for any fixed difference set $J'$ we have

$$\mathbf{P}[|J \cap J'| > |J'|/(4 w_0)] = \mathbf{P}[|J \cap (J' \cap J_1)| > |J'|/(4 w_0)].$$

By (10),

$$\frac{|J| \cdot |J' \cap J_1|}{|J_1|} \leq \frac{\ell \cdot |J'|}{(2n/5)} \leq \frac{|J'|}{8 w_0}.$$

Therefore, we may apply Chernoff's bound and conclude that (as long as the constant $\epsilon$ in the definition of $w_0$ is small enough) $\mathbf{P}[|J \cap J'| > |J'|/(4 w_0)] \leq S(P)^{-3}$. Thus, $J$ is good with probability $1 - o(1)$. Fixing it in such a way that it is at the same time good and satisfies all axioms from $\mathcal{A}_{i_0}$, we complete the inductive step in the proof of $\mathcal{A}_C \models C$.

Finally, good assignments do exist (see the above argument or simply take the identically zero assignment). Therefore, $\mathcal{A}_0 \not\models 0$, and this completes the proof of Lemma 4.1.

Theorem 2.2 now straightforwardly follows from Lemma 3.3 and Lemma 4.1.

Finally we prove Theorem 2.7. For a positive clause $C$ in the variables $Vars^{mon}(m,n)$ denote by $f_i(C)$ the following monotone function in the variables $x_1, \ldots, x_n$: $f_i(C) \stackrel{\mathrm{def}}{=} \bigvee \{\, f \mid x_{if} \in C \,\}$. The vector $f(C) \stackrel{\mathrm{def}}{=} (f_1(C), \ldots, f_m(C))$ bears all the information about $C$ necessary for our proof. Its overall strategy once more naturally generalizes the proof of Theorem 3.5. Namely, we are going to construct an appropriate ranking function $rk : F_n^{mon} \to \mathbb{N}$, form (similarly to the proof of Lemma 3.3) the family of integer vectors $\{\, (rk(f_1(C)), \ldots, rk(f_m(C))) \mid C \in P \,\}$, apply Lemma 3.1 to this family, define the corresponding notion of pseudo-width, cross our fingers and hope that the proof of Lemma 3.4 also goes through.

And indeed there exists a particular combinatorial choice of the ranking function $rk$ for which this plain strategy gives a lower bound $\exp\left(\Omega(n^{1/6})\right)$. We, however, skip this and proceed immediately to the better bound $\exp\left(\Omega(n^{1/4})\right)$ whose proof does involve some new and potentially useful ideas.

Fix a monotone functional calculus refutation $P$ of $\{Q_1, \ldots, Q_m\}$ that has size $S$, and let

$$M_i \stackrel{\mathrm{def}}{=} \{\, f_i(C) \mid C \in P \,\} \cup \left\{ 0,\ \bigvee_{j=1}^{n} x_j \right\}.$$

Our ranking function $rk$ will essentially depend on $\{M_i\}$.
It is also natural (although not absolutely necessary) to let it depend on $i$, so that we will actually have individual ranking functions $rk_1, \ldots, rk_m$ for every pigeon. Moreover, $rk_i$ will be defined only on $M_i$.

Instead of trying to guess in advance what might be good ranking functions, we will take the opposite approach and define them as "universal" (w.r.t. $M_1, \ldots, M_m$) functions, by which we roughly mean "the best possible ranking function for which the inductive step in the proof of Lemma 3.4 goes through". After that it will turn out that these universal functions in fact possess a clean combinatorial meaning (implicit in the proof of Lemma 4.3).

Formally, let $w_0 = C \log S$, where $C$ is the constant assumed in the right-hand side of the second case in Lemma 3.1. Let $\ell$ be an arbitrary parameter (to be specified later). Similarly to the proof of Lemma 3.4, define $D$ as the set of all assignments satisfying the axioms $Q_{i_1,i_2;j}$ and such that $|J_i(a)| \leq \ell$ for all $i \in [m]$.

For every $i \in [m]$ we recursively construct an increasing chain $R_{i1} \subseteq R_{i2} \subseteq \ldots \subseteq R_{ir} \subseteq \ldots \subseteq M_i$ ($R_{ir}$ will be the set of all functions $f \in M_i$ with $rk_i(f) \leq r$, and, in sharp contrast with the proof of Lemma 4.1, these constructions will be totally independent for different $i$).

Base. $R_{i1} \stackrel{\mathrm{def}}{=} \left\{ \bigvee_{j=1}^{n} x_j \right\}$.

Recursive step. Suppose that $R_{ir}$ is already constructed and $f \in M_i$. Then $f \in R_{i,r+1}$ if and only if there exists $J_0 \in [n]^{(2 w_0 \ell)}$ such that every assignment $b \in \{0,1\}^n$ that contains $\leq \ell$ ones, satisfies all functions in $R_{ir}$ and, moreover, has the property $\forall j \in J_0\ (b_j = 0)$, also satisfies $f$.

It is important (and easy to see) that indeed $R_{ir} \subseteq R_{i,r+1}$.
Claim 4.2 For any $\ell > 0$ there exists $i \in [m]$ such that in the above construction we have $0 \in R_{i,\lfloor \log_2 m \rfloor}$.
Proof. For $i \in [m]$ and $f \in M_i$ define $rk_i(f) \stackrel{\mathrm{def}}{=} \min \{\, r \mid f \in R_{ir} \,\}$ ($rk_i(f) \stackrel{\mathrm{def}}{=} \infty$ if no such $r$ exists). For $C \in P$, let $r_i(C) \stackrel{\mathrm{def}}{=} rk_i(f_i(C))$, and let us apply Lemma 3.1 to the set of vectors $\{\, (r_1(C), \ldots, r_m(C)) \mid C \in P \,\}$. Let $r = (r_1, \ldots, r_m)$ be the resulting pigeon filter.

Define an $r$-axiom as an arbitrary clause $C$ of the form $x_{i,f_1} \vee \ldots \vee x_{i,f_s}$ with $rk_i(f_1 \vee \ldots \vee f_s) \leq r_i$. Let $I_r(C) \stackrel{\mathrm{def}}{=} \{\, i \in [m] \mid rk_i(f_i(C)) \leq r_i + 1 \,\}$, let $w_r(C) \stackrel{\mathrm{def}}{=} |I_r(C)|$ and let $w_r(P) \stackrel{\mathrm{def}}{=} \max \{\, w_r(C) \mid C \in P \,\}$. Arguing as in the proof of Lemma 3.3, we come up with a monotone functional calculus refutation $P'$ from a set of $r$-axioms $\mathcal{A}$ which has the same size $S$ and satisfies the additional property $w_r(P') \leq w_0$. What is important to us (and can be easily checked) is that still $f_i(C) \in M_i$ for all $C \in P'$.

Now we only have to go through the proof of Lemma 3.4 and check that it applies in the current situation. This is quite straightforward up to the definition (6) of $J_0, J_1$. Essentially the only thing to be checked up to that point is that the rules of the monotone functional calculus are sound on $D$, and this is easy.

The rest of the proof does not make sense now since there does not appear to exist any reasonable way to define $J_{i_0}(C)$. What we, however, know is that $\forall A \in \mathcal{A}_{i_0}\ (f_{i_0}(A) \in R_{i_0,r})$, where $r \stackrel{\mathrm{def}}{=} r_{i_0}$. On the other hand, $f_{i_0}(C) \notin R_{i_0,r+1}$ since $i_0 \notin I_r(C)$. According to the definition of $R_{ir}$, this implies that for every $J_0 \in [n]^{(2 w_0 \ell)}$ there exists an assignment $b \in \{0,1\}^n$ that contains $\leq \ell$ ones, has the property $\forall j \in J_0\ (b_j = 0)$, satisfies all $f_{i_0}(A)$ $(A \in \mathcal{A}_{i_0})$ and falsifies $f_{i_0}(C)$. In particular, an assignment $b$ with these properties exists for $J_0 \stackrel{\mathrm{def}}{=} \bigcup_{i \in I \setminus \{i_0\}} J_i(a)$. Re-assigning the values $a_{i_0,j}$ to $b_j$ for all $j \in [n]$, we complete the inductive step in proving $\mathcal{A}_{I_r(C)} \models C$.

Since $0 \in P'$, we in particular have $\mathcal{A}_{I_r(0)} \models 0$ which implies $I_r(0) \neq \emptyset$. But $i \in I_r(0)$ in turn implies $0 \in R_{i,\lfloor \log_2 m \rfloor}$ since $r_i < \lfloor \log_2 m \rfloor$. Claim 4.2 is proved.

Thus, we are only left to show a lower bound on $rk_i(0)$, and this turns out to be (relatively) easy.

Lemma 4.3 Let $\left\{ 0, \bigvee_{j=1}^{n} x_j \right\} \subseteq M \subseteq F_n^{mon}$ and $L, \ell$ be parameters such that

$$|M| \leq \exp\left( \frac{\epsilon L \ell}{n} \right), \tag{11}$$

where $\epsilon$ is a sufficiently small constant. Define the sets $R_1 \subseteq R_2 \subseteq \ldots \subseteq R_r \subseteq \ldots \subseteq M$ by the following recursion:

Base. $R_1 \stackrel{\mathrm{def}}{=} \left\{ \bigvee_{j=1}^{n} x_j \right\}$.

Recursive step. $f \in R_{r+1}$ if and only if there exists $J_0 \in [n]^{L}$ such that every assignment $b \in \{0,1\}^n$ that contains $\leq \ell$ ones, satisfies all functions in $R_r$ and, moreover, has the property $\forall j \in J_0\ (b_j = 0)$, also satisfies $f$.

Then $0 \notin R_{\lfloor n/(2L) \rfloor}$.
Proof. Given $f \in M$, let $rk(f) \stackrel{\mathrm{def}}{=} \min \{\, r \mid f \in R_r \,\}$. For every $f \in M$ with $rk(f) < \infty$ fix once and for all some $J_0 = J_0(f) \in [n]^{L}$ witnessing the fact $f \in R_{rk(f)}$. Pick $b$ at random among all assignments in $\{0,1\}^n$ that contain exactly $\ell$ ones. Given (11), we may apply Chernoff's bound and conclude that

$$\mathbf{P}\left[ \forall f \in M \left( rk(f) < \infty \implies |\{\, j \in J_0(f) \mid b_j = 1 \,\}| \leq \frac{2 L \ell}{n} \right) \right] > 0. \tag{12}$$

Fix an arbitrary $b \in \{0,1\}^n$ with this property. We claim that $f(c) = 1$ for every $f \in R_r$ and every $c \leq b$ that contains at least $1 + (r-1) \cdot \frac{2 L \ell}{n}$ ones.

The base $r = 1$ is obvious. Suppose that $f \in R_{r+1}$, $c \leq b$ and $c$ has at least $1 + r \cdot \frac{2 L \ell}{n}$ ones. Let $d \leq c$ be obtained from $c$ by re-assigning all positions in $J_0(f)$ to $0$. Since $b$ satisfies the condition in (12), $d$ still has at least $1 + (r-1) \cdot \frac{2 L \ell}{n}$ ones. Therefore, by the inductive assumption, $d$ satisfies all functions in $R_r$. By the definition of $R_{r+1}$, $d$ satisfies $f$ and, since $f$ is monotone, $c$ satisfies it, too. This completes the inductive step.

In particular, $f(b) = 1$ for every $f \in R_{\lfloor n/(2L) \rfloor}$. Lemma 4.3 follows.

Theorem 2.7 is immediately implied by Claim 4.2 and Lemma 4.3. Indeed, let $\ell \stackrel{\mathrm{def}}{=} C n^{1/2}$, where $C > 0$ is a sufficiently large constant. Then by Claim 4.2, $0 \in R_{i,\lfloor \log_2 m \rfloor}$ for some $i \in [m]$. On the other hand, letting in Lemma 4.3 $M \stackrel{\mathrm{def}}{=} M_i$ and $L \stackrel{\mathrm{def}}{=} (2 \ell w_0)$, we observe that the bound (11) is satisfied (if the constant $C$ is large enough). Therefore, $\log_2 m \geq \frac{n}{4 \ell w_0} = \Omega\left( \frac{n^{1/2}}{\log S} \right)$. Theorem 2.7 follows.
5. Conclusion and open problems

Neither the techniques from [Raz01] nor our techniques can be directly applied to the functional version $FPHP^m_n$ of the pigeonhole principle (in which one pigeon may not split between several holes). Another problem of a very similar nature which still remains open is to construct a pseudo-random generator $G : \{0,1\}^m \to \{0,1\}^n$ with $m$ [?] $n^2$ that would be hard for Resolution (see [ABRW00]). Lower bounds for tautologies from either of these two classes would unconditionally imply that Resolution does not possess a poly-size proof of $\mathrm{NP} \not\subseteq \mathrm{P/poly}$ (as formalized e.g. in [Raz98, Section 5]). At the moment we only know that this independence result follows from the existence of one-way functions, and exponential lower bounds for the ordinary pigeonhole principle only imply that Resolution cannot efficiently prove the stronger variant "NP is not doable by poly-size circuits of unbounded fan-in".

Let us clarify some connections that might be useful in this respect. As we noted in Section 2, [BW99] defined the so-called extended pigeonhole principle $EPHP^m_n$. In the terminology of our paper, its equivalent formulation can be described as the result of removing all references to monotonicity from the definition of the monotone functional calculus. That is, the set of variables will be $Vars(m,n) \stackrel{\mathrm{def}}{=} \{\, x_{if} \mid i \in [m],\ f \text{ is an arbitrary function in } n \text{ variables} \,\}$, the clauses $C$ are no longer required to be positive, and we also restore the resolution rule. Somewhat counter-intuitively, $EPHP^m_n$ is stronger than $FPHP^m_n$: if we have a refutation of $\neg(EPHP^m_n)$, then the substitution

$$x_{if} \mapsto \bigvee \{\, x_{ij} \mid f(j) = 1 \,\}$$

will (essentially) take it to a resolution refutation of $\neg(FPHP^m_n)$. On the other hand, it is easy to see that the reduction from $FPHP^m_n$ to the propositional statement expressing $\mathrm{NP} \not\subseteq \mathrm{P/poly}$ given in [Raz98, Section 5] already works with $EPHP^m_n$. Moreover, it needs only the rectangular extension variables $X_{iJ}$. Unfortunately, in order to prove lower bounds even for this weakest possible form of extension, the methods from both [Raz01] and the current paper have yet to be enhanced with some new ideas.

The best known upper bound on $S_R(\neg PHP^m_n)$ is $\exp(O(n \log n)^{1/2})$ [BP96b], and we have shown the lower bound $S_R(\neg PHP^m_n) \ge \exp(\Omega(n^{1/3}))$. It would be interesting to further narrow this gap. Specifically, what is the value of

$$\limsup_{n \to \infty} \frac{\log_2 \log_2 S_R(\neg PHP^\infty_n)}{\log_2 n}\,?$$
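The substitution $x_{if} \mapsto \bigvee \{ x_{ij} \mid f(j) = 1 \}$ from the discussion of $EPHP^m_n$ can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not a construction from the paper: $f$ is treated as a 0/1 function on hole indices $1, \ldots, n$ (which is how $f(j)$ appears in the substitution), variables $x_{ij}$ are encoded as pairs `(i, j)`, and the names `substitute` and `disjunction_value` are hypothetical.

```python
def substitute(i, f, n):
    """Hole variables (i, j) whose disjunction replaces the variable x_{i,f}."""
    return [(i, j) for j in range(1, n + 1) if f(j) == 1]

def disjunction_value(variables, hole_of):
    """Value of OR x_{ij} when each pigeon i sits in exactly one hole hole_of[i]."""
    return int(any(hole_of[i] == j for (i, j) in variables))

n = 5

def is_even(j):
    # Hypothetical example function on holes: 1 on even holes, 0 otherwise.
    return 1 if j % 2 == 0 else 0

clause = substitute(3, is_even, n)    # variables for pigeon 3
print(clause)

# Under a functional placement the disjunction agrees with f at the pigeon's hole.
for j in range(1, n + 1):
    assert disjunction_value(clause, {3: j}) == is_even(j)
```

The final loop shows one way to see why the substitution fits the functional version: when a pigeon occupies exactly one hole $j$, the substituted disjunction evaluates to $f(j)$.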
Finally, in Section 4 we saw two separate improvements of our basic technique from Section 3 that nonetheless have a similar flavour. Namely, in the proof of Lemma 4.1 we made the set of legitimate assignments $D$ depend on the candidate refutation $P$, and in the proof of Theorem 2.7 a somewhat similar construction is applied to the ranking function rk. Are these two really different, or can we interpret them as two special cases of a single construction? Can one get better results (specifically, can it be useful for solving the open problems posed above) if both $D$ and rk are constructed dynamically?
6. Acknowledgements

I am grateful to Stasys Jukna and Toni Pitassi for catching several misprints.
References

[ABRW00] M. Alekhnovich, E. Ben-Sasson, A. Razborov, and A. Wigderson. Pseudorandom generators in propositional complexity. In Proceedings of the 41st IEEE FOCS, 2000.

[Bla37] A. Blake. Canonical expressions in Boolean algebra. PhD thesis, University of Chicago, 1937.

[BP96a] P. Beame and T. Pitassi. Simplified and improved resolution lower bounds. In Proceedings of the 37th IEEE FOCS, pages 274-282, 1996.

[BP96b] S. Buss and T. Pitassi. Resolution and the weak pigeonhole principle. Manuscript, 1996.

[BP98] P. Beame and T. Pitassi. Propositional proof complexity: Past, present and future. Technical Report TR98-067, Electronic Colloquium on Computational Complexity, 1998.

[BW99] E. Ben-Sasson and A. Wigderson. Short proofs are narrow - resolution made simple. In Proceedings of the 31st ACM STOC, pages 517-526, 1999.

[BT88] S. Buss and G. Turán. Resolution proofs of generalized pigeonhole principle. Theoretical Computer Science, 62:311-317, 1988.

[CEI96] M. Clegg, J. Edmonds, and R. Impagliazzo. Using the Groebner basis algorithm to find proofs of unsatisfiability. In Proceedings of the 28th ACM STOC, pages 174-183, 1996.

[CS88] V. Chvátal and E. Szemerédi. Many hard examples for resolution. Journal of the ACM, 35(4):759-768, 1988.

[DP60] M. Davis and H. Putnam. A computing procedure for quantification theory. Journal of the ACM, 7(3):210-215, 1960.

[Hak85] A. Haken. The intractability of resolution. Theoretical Computer Science, 39:297-308, 1985.

[Juk97] S. Jukna. Exponential lower bounds for semantic resolution. In P. Beame and S. Buss, editors, Proof Complexity and Feasible Arithmetics: DIMACS workshop, April 21-24, 1996, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 39, pages 163-172. American Math. Soc., 1997.

[PR00] T. Pitassi and R. Raz. Exponential lower bound for the weak pigeonhole principle in regular resolution. Manuscript, 2000.

[Raz96] A. Razborov. Lower bounds for propositional proofs and independence results in Bounded Arithmetic. In F. Meyer auf der Heide and B. Monien, editors, Proceedings of the 23rd ICALP, Lecture Notes in Computer Science, 1099, pages 48-62, New York/Berlin, 1996. Springer-Verlag.

[Raz98] A. Razborov. Lower bounds for the polynomial calculus. Computational Complexity, 7:291-324, 1998.

[Raz01] R. Raz. Resolution lower bounds for the weak pigeonhole principle. Manuscript, 2001.

[Rob65] J. A. Robinson. A machine-oriented logic based on the resolution principle. Journal of the ACM, 12(1):23-41, 1965.

[RWY97] A. Razborov, A. Wigderson, and A. Yao. Read-once branching programs, rectangular proofs of the pigeonhole principle and the transversal calculus. In Proceedings of the 29th ACM Symposium on Theory of Computing, pages 739-748, 1997.

[Tse68] G. C. Tseitin. On the complexity of derivations in propositional calculus. In Studies in constructive mathematics and mathematical logic, Part II. Consultants Bureau, New-York-London, 1968.

[Urq87] A. Urquhart. Hard examples for resolution. Journal of the ACM, 34(1):209-219, 1987.
ECCC ISSN 1433-8092 http://www.eccc.uni-trier.de/eccc ftp://ftp.eccc.uni-trier.de/pub/eccc
[email protected], subject ’help eccc’