Lecture 5.Formula complexity I A Boolean function can be computed by different models,and an important one is the class of formulas.Roughly speaking,formulasare circuits with fanout 1.To be more specific,a formula is a rooted tree s.t. ● each internal node is associated with a gate which is either AND or OR, ● each leaf is associated with a variable xi,the negation ofa variable or a constant 0 or 1. (Note that we have used DeMorgan rule to push all negations to the leaf level.)The formula computes a function f on input x by evaluating the gates in a bottom-up manner,and the root outputs f(x). Different measures of formula are studied,among which depth,size(number of gates)and leaf size (number of leaves)are the most important ones.The leaf size L(F)ofa formula F is simply the number of leaves.The leaf size L(f)ofa function f is the minimum leaf size L(F)of any formula F that computes f. A function f:{0,1}n→{0,1 is monotone if f(x)≤fy)whenever xi≤y,∀i∈[n]In other words,it requiresthat changing an input variable from 0 to I doesn't decrease the function value.For monotone functions,we can require the formula contain only variables and constants in leaves,but no negation of variables.(Convince yourself.)These are called monotone formulas.So for monotone functions,we can define the monotone leafsize L(f)as the minimal number of leaves ofany monotone formula that computes f. While circuit lower bounds are too hard to prove---the best unconditional lower bound is only cn for some constant c,people stepped back and tried to prove fomula lower bounds.In this and the next lectures,we will learn various methods to prove lower bounds for formula complexity. 1.(Random)Restriction The idea of restriction is as follows.We will pick some variables and fix their values to be some constants,resulting in a"subfunction"f.The variables are picked and fixed in such a ways that the complexity ofthis subfunction is considerably smaller than that off.But if we keep doing so,we'll go to a very small scale function with complexity 0.But this small scale function is still not a constant function,thus has a positive complexity---contradiction.This idea went all the way back to Subbotovskaya [Sub61]
Lecture 5. Formula complexity I A Boolean function can be computed by different models, and an important one is the class of formulas. Roughly speaking, formulas are circuits with fanout 1. To be more specific, a formula is a rooted tree s.t. ⚫ each internal node is associated with a gate which is either AND or OR, ⚫ each leaf is associated with a variable xi , the negation of a variable 𝑥̅𝑖 , or a constant 0 or 1. (Note that we have used DeMorgan rule to push all negations to the leaf level.) The formula computes a function f on input x by evaluating the gates in a bottom-up manner, and the root outputs f(x). Different measures of formula are studied, among which depth, size (number of gates) and leaf size (number of leaves) are the most important ones. The leaf size L(F) of a formula F is simply the number of leaves. The leaf size L(f) of a function f is the minimum leaf size L(F) of any formula F that computes f. A function 𝑓:{0,1} 𝑛 → {0,1} is monotone if 𝑓(𝑥) ≤ 𝑓(𝑦) whenever 𝑥𝑖 ≤ 𝑦i ,∀𝑖 ∈ [𝑛]. In other words, it requires that changing an input variable from 0 to 1 doesn’t decrease the function value. For monotone functions, we can require the formula contain only variables and constants in leaves, but no negation of variables. (Convince yourself.) These are called monotone formulas. So for monotone functions, we can define the monotone leaf size 𝐿+(𝑓) as the minimal number of leaves of any monotone formula that computesf. While circuit lower bounds are too hard to prove---the best unconditional lower bound is only cn for some constant c, people stepped back and tried to prove formula lower bounds. In this and the next lectures, we will learn various methods to prove lower bounds for formula complexity. 1. (Random) Restriction The idea of restriction is as follows. We will pick some variables and fix their values to be some constants, resulting in a “subfunction” f’. The variables are picked and fixed in such a ways that the complexity of this subfunction is considerably smaller than that of f. But if we keep doing so, we’ll go to a very small scale function with complexity 0. But this small scale function is still not a constant function, thus has a positive complexity---contradiction. This idea went all the way back to Subbotovskaya [Sub61]
Lemma 1.1.For every Boolean function f ofn variables,it is possible to fix one variable such that the resulting subfunction f satisfies 0s(-).0n Before proving the lemma,let's get some feelings of the parameters and see why it's nontrivial. Suppose F is a minimal formula which computesf,and F has s=L(f)leaves.Since f has n variables, there exists one variable x;which appears at least s/n times in the leaf level(though some of the appearances are in the form of )If we fix x,to be some constant,then the leaf size will reduce to s-s/n =(1-1/n)s.The lemma says that actually one can reduce the size by more.Now we prove it. Proof.Define F and s as in the above discussion.The idea is that if there is a subtree H in F which is of the form xi AG for some subtree G,then setting xi=0 will kill the whole subtree G.(You can handle the cases where the gate is OR,or where the occurrence ofx;is in negation,in a similar way.) Therefore,suppose variable iappears t>s/n times,then foreach appearance there is an assignment of x;=0/1 to kill its"neighbor"subtree G.Then one of the two assignments,x;=0 and x;=1,kills more,at least t/2,subtrees.Use that assignment.The leaf size reduces to at most s-t-s(-别ss1-)s as claimed. The above analysis is almost correct except that it forgets to consider the case in which the subtrees G intersect.Consider the extreme case that there is a subtree H of the form xixixi.This looks weird since clearly some xi's are redundant.We can indeed make this rigorous by claiming the following:For any leaf xi or i its neighbor subtree G doesn't contain variable i.(Thus the G's don't intersect.) The claim is not hard to prove.Basically,if we see xiA G and G contains xi,then it is observed that one can simply replace the x;in G by I without changing the function.(If x;=1,then we"guessed correctly"by replacing xi in G by 1.If xi=0,then xiAG=0 anyway.)Othercases such as G contains i or the gate connecting xi and G is OR,can be handled similarly.This completes the proof of the lemma. Repeatedly applying the lemma gives the following theorem
Lemma 1.1. For every Boolean function f of n variables, it is possible to fix one variable such that the resulting subfunction 𝑓′ satisfies 𝐿(𝑓 ′) ≤ (1 − 1 𝑛 ) 3/2 ⋅ 𝐿(𝑓). Before proving the lemma, let’s get some feelings of the parameters and see why it’s nontrivial. Suppose F is a minimal formula which computes f, and F has s = L(f) leaves. Since f has n variables, there exists one variable xi which appears at least s/n times in the leaf level (though some of the appearances are in the form of 𝑥̅𝑖 ). If we fix xi to be some constant, then the leaf size will reduce to s − s/n = (1 − 1/n)s. The lemma says that actually one can reduce the size by more. Now we prove it. Proof. Define F and s as in the above discussion. The idea is that if there is a subtree H in F which is of the form xi ∧ 𝐺 for some subtree G, then setting xi = 0 will kill the whole subtree G. (You can handle the cases where the gate is OR, or where the occurrence of xi is in negation, in a similar way.) Therefore, suppose variable i appears t ≥ s/n times, then for each appearance there is an assignment of xi = 0/1 to kill its “neighbor” subtree G. Then one of the two assignments, xi = 0 and xi = 1, kills more, at least t/2, subtrees. Use that assignment. The leaf size reduces to at most s − t − t 2 ≤ (1 − 3 2n) s ≤ (1 − 1 𝑛 ) 3/2 s, as claimed. The above analysis is almost correct except that it forgets to consider the case in which the subtrees G intersect. Consider the extreme case that there is a subtree H of the form 𝑥𝑖 ∧𝑥𝑖 ∧ 𝑥𝑖… This looks weird since clearly some xi’s are redundant. We can indeed make this rigorous by claiming the following: For any leaf 𝑥𝑖 or 𝑥̅𝑖 , its neighbor subtree G doesn’t contain variable i. (Thus the G’s don’t intersect.) The claim is not hard to prove. Basically, if we see xi ∧ 𝐺 and G contains xi , then it is observed that one can simply replace the xi in G by 1 without changing the function. (If xi= 1, then we “guessed correctly” by replacing xi in G by 1. If xi= 0, then xi ∧ 𝐺 = 0 anyway.) Other cases such as G contains 𝑥̅𝑖 , or the gate connecting xi and G is OR, can be handled similarly. This completes the proof of the lemma. □ Repeatedly applying the lemma gives the following theorem
Theorem 1.2.For every Boolean function f ofn variables and every kE [n],there is a way to fix n-k variables such that the resulting subfunction ofk variables satisfies Proof.Applying the lemma n-k times,we obtain a subfunction ofk variables with leafsize at most 0s-月)(-(1-.0=月.0 In the above theorems,we carefully constructed an assignment to shrink the leaf size.Next we will show that actually this construction doesn't need to be so careful---a random one does the job with good probability. Theorem 1.3.Let f be a Boolean function ofn variables.Pick k variables unifomly at random,and for the rest n-k variables,assign each of them a random constant c E {0,1).Then with probability at least 3/4, s4(月 ·L(0D Proof.Let's choose a variable randomly and assign a random value.(We will choose variables one by one;note that the process is actually the same as the random assignment in the statement of the theorem.)Denote by si the number of resulting leaves.The proof of the previous theorem actually shows that the expected number of reduced leaves is at leastThus 2n 3/2 s1≤s-器≤1-& Continue the above process for n-k variables,resulting in a subfunction ofk variables with the expected leafsize -1s-)1-”-2s=( s The claimed statement now follows from Markov Inequality (The above analysis is a bit informal about the expectation ofmultiple stages.Canyou make it more formal?Just compute the stage 2,namely E[s2J,andyou'll see why it's correct.)
Theorem 1.2. For every Boolean function f of n variables and every k ∈ [n], there is a way to fix n − k variables such that the resulting subfunction of k variables satisfies 𝐿(𝑓 ′) ≤ ( 𝑘 𝑛 ) 3/2 ⋅ 𝐿(𝑓). Proof. Applying the lemma n − k times, we obtain a subfunction of k variables with leaf size at most 𝐿(𝑓 ′) ≤ (1 − 1 𝑛 ) 3/2 (1 − 1 𝑛−1 ) 3/2 … (1 − 1 𝑘+1 ) 3/2 ⋅ 𝐿(𝑓) = ( 𝑘 𝑛 ) 3/2 ⋅ 𝐿(𝑓). □ In the above theorems, we carefully constructed an assignment to shrink the leaf size. Next we will show that actually this construction doesn’t need to be so careful---a random one does the job with good probability. Theorem 1.3. Let f be a Boolean function of n variables. Pick k variables uniformly at random, and for the rest n − k variables, assign each of them a random constant c ∈ {0,1}. Then with probability at least 3/4, 𝐿(𝑓 ′) ≤ 4( 𝑘 𝑛 ) 3/2 ⋅ 𝐿(𝑓). Proof. Let’s choose a variable randomly and assign a random value. (We will choose variables one by one; note that the process is actually the same as the random assignment in the statement of the theorem.) Denote by s1 the number of resulting leaves. The proof of the previous theorem actually shows that the expected number of reduced leaves is at least 3 2 𝑠 𝑛 . Thus 𝐸[𝑠1 ] ≤ 𝑠 − 3𝑠 2𝑛 ≤ (1 − 1 𝑛 ) 3/2 𝑠. Continue the above process for n − k variables, resulting in a subfunction of k variables with the expected leaf size 𝐸[𝑠𝑛−𝑘] ≤ (1 − 1 𝑛 ) 3/2 (1 − 1 𝑛−1 ) 3/2 … (1 − 1 𝑘+1 ) 3/2 ⋅ 𝑠 = ( 𝑘 𝑛 ) 3/2 ⋅ 𝑠. The claimed statement now follows from Markov Inequality. □ (The above analysis is a bit informal about the expectation of multiple stages. Can you make it more formal? Just compute the stage 2, namely E[s2], and you’ll see why it’s correct.)
We can apply the theorem to show a lower bound for the parity function f(x)=xx2..xn Theorem 1.4.For the parity function,L(f)2n3/2 Proof.Fix all but one variable,resulting in the parity function(or its negation)ofone variable,which has leaf size at least 1.So 1≤)≤( 3/2 L(f). which implies the claimed lower bound. Q 2.Rectangle and tiling number We now introduce another lower bound method for leaf size,due to Khrapchenko [Khr72]and Rychkov [Ryc85]. A rectangle is a pair (A,B)of disjoint sets A,B (0,1)m.A rectangle (A,B)is monochromatic if there exists an iE [n],s.t.for all xEA and y E B,it holds that xi yi.A rectangle(A,B)is 1-monochromatic if there exists an iE [n],s.t.for all x EA and y E B,it holds that xi=1 and yi=0.A rectangle (A,B)is 0-monochromatic if there exists an i E [n],s.t.for all xEA and yB, it holds that xi=0 and yi 1.Any monochromatic rectangle is either 1-or 0-monochromatic.(See why?) The tiling number x(R)of a rectangle R is the minimum number r s.t.R can be partitioned intor monochromatic rectangles.Please note that by "partition",we emphasize that the rectangles are disjoint (and,of course,their union needs to be the whole set R).The tiling numberx(f)of a function fis x(f-1(1)x f-1(0)),the tiling number ofthe rectangle f-1(1)xf-1(0) Theorem 2.1.For every Boolean function f,L(f)xf).If f is monotone,then L(f)(f). Proof.We prove the first inequality by induction on L(f),and the second one is similar.The base is easy to confirm:If L(f)=1,then f(x)=xi or f(x)=i.Then the rectangle f-1(1)xf-1(0) is itself monochromatic by definition. Now the induction step.Take a formula F with the minimal leaf size.Assume that the root is AND function;the other case can be handled similarly.So the function f=foAf,and L(f)=L(fo)+
We can apply the theorem to show a lower bound for the parity function 𝑓(𝑥) = 𝑥1 ⊕ 𝑥2 ⊕ … ⊕𝑥𝑛. Theorem 1.4. For the parity function, 𝐿(𝑓) ≥ 𝑛 3/2 . Proof. Fix all but one variable, resulting in the parity function (or its negation) of one variable, which has leaf size at least 1. So 1 ≤ 𝐿(𝑓 ′) ≤ ( 1 𝑛 ) 3/2 𝐿(𝑓), which implies the claimed lower bound. □ 2. Rectangle and tiling number We now introduce another lower bound method for leaf size, due to Khrapchenko [Khr72] and Rychkov [Ryc85]. A rectangle is a pair (A,B) of disjoint sets 𝐴, 𝐵 ⊆ {0,1} 𝑛. A rectangle (A,B) is monochromatic if there exists an i ∈ [n], s.t. for all 𝑥 ∈ 𝐴 and 𝑦 ∈ 𝐵, it holds that 𝑥𝑖 ≠ 𝑦𝑖 . A rectangle (A,B) is 1-monochromatic if there exists an i ∈ [n], s.t. for all x ∈ A and y ∈ B, it holds that 𝑥𝑖 = 1 and 𝑦𝑖 = 0. A rectangle (A,B) is 0-monochromatic if there exists an i ∈ [n], s.t. for all 𝑥 ∈ 𝐴 and 𝑦 ∈ 𝐵, it holds that 𝑥𝑖 = 0 and 𝑦𝑖 = 1. Any monochromatic rectangle is either 1- or 0-monochromatic. (See why?) The tiling number 𝜒(𝑅) of a rectangle R is the minimum number r s.t. R can be partitioned into r monochromatic rectangles. Please note that by “partition”, we emphasize that the rectangles are disjoint (and, of course, their union needs to be the whole set R). The tiling number 𝜒(𝑓) of a function f is χ(𝑓 −1(1) × 𝑓 −1(0)), the tiling number of the rectangle 𝑓 −1(1) × 𝑓 −1(0). Theorem 2.1. For every Boolean function f, 𝐿(𝑓) ≥ 𝜒(𝑓). If f is monotone, then 𝐿+(𝑓) ≥ 𝜒+(𝑓). Proof. We prove the first inequality by induction on L(f), and the second one is similar. The base is easy to confirm: If 𝐿(𝑓) = 1, then 𝑓(𝑥) = 𝑥𝑖 or 𝑓(𝑥) = 𝑥̅𝑖 . Then the rectangle 𝑓 −1(1) × 𝑓 −1(0) is itself monochromatic by definition. Now the induction step. Take a formula F with the minimal leaf size. Assume that the root is AND function; the other case can be handled similarly. So the function 𝑓 = 𝑓0 ∧ 𝑓1 , and 𝐿(𝑓) = 𝐿(𝑓0 ) +
L(f).(Convince yourself that the subtrees are also optimal for the comresponding subfunctions.) Define Bo=f-1(0)nf01(0),andB1=f-1(0)nf61(1) First,we observe that f-1(1)sf01(1),f-1(1)-1(1),B0sf61(0),B1Sf-1(0),f-1(0)=B0UB. The first two inclusions are because f(x)=1 implesthat fo(x)=fi(x)=1.(Recall that f= fo A f1.)The third one is by definition of Bo.The fourth one is because f(x)=0 and fo(x)=1 fi(x)=0.The last one is trivial by definition of Bo and B.Now x()=xf-1(1)×f-1(0) ∥def.ofxf) ≤x(f-1(1)×B)+x(f-1(1)×B) /∥tiling separately ≤x(61(1)×f61(0)+x1(1)×1(0)】 //the four inclusions above ≤x(f)+x() ∥def.ofxf) ≤L(fo)+L() /induction hypothesis =L(f) Now we need to further lower bound x(f). Theorem 2.2.Let A=f-1(1),B=f-1(0),and S={(x,y):xEA,y E B,lx yl 1).We have Is12 (0≥A·B where Ix⊕yl is the Hamming weight of x⊕y. Proof.Suppose we have a decomposition of A x B into r monochromatic rectangles Ri of dimensions six ti.Letc,be the number ofelements of S in Ri.We claim that each cis minfsi,ti}≤,√sit.Indeed,suppose that all elements(xy)inR:differ at positionj:xj≠y.Then each x has at most oney s.t.(x,y)ESn Ri,therefore ci<si.Similarly,we can showthat cisti. Then Rearranging the terms gives the bound. ◇
𝐿(𝑓1). (Convince yourself that the subtrees are also optimal for the corresponding subfunctions.) Define 𝐵0 = 𝑓 −1(0) ∩ 𝑓0 −1 (0), and 𝐵1 = 𝑓 −1(0) ∩ 𝑓0 −1 (1). First, we observe that 𝑓 −1(1) ⊆ 𝑓0 −1 (1), 𝑓 −1(1) ⊆ 𝑓1 −1 (1), 𝐵0 ⊆ 𝑓0 −1 (0), 𝐵1 ⊆ 𝑓1 −1 (0), 𝑓 −1(0) = 𝐵0 ∪ 𝐵1. The first two inclusions are because 𝑓(𝑥) = 1 imples that 𝑓0 (𝑥) = 𝑓1(𝑥) = 1. (Recall that 𝑓 = 𝑓0 ∧ 𝑓1.) The third one is by definition of B0. The fourth one is because 𝑓(𝑥) = 0 and 𝑓0 (𝑥) = 1 ⇒ 𝑓1(𝑥) = 0. The last one is trivial by definition of B0 and B1. Now 𝜒(𝑓) = 𝜒(𝑓 −1(1) × 𝑓 −1(0)) // def. of 𝜒(𝑓) ≤ 𝜒(𝑓 −1(1) × 𝐵0 ) + 𝜒(𝑓 −1(1) × 𝐵1 ) // tiling separately ≤ 𝜒 (𝑓0 −1 (1) × 𝑓0 −1 (0)) + 𝜒(𝑓1 −1 (1) × 𝑓1 −1 (0)) // the four inclusions above ≤ 𝜒(𝑓0 ) + 𝜒(𝑓1) // def. of 𝜒(𝑓𝑖) ≤ 𝐿(𝑓0 ) + 𝐿(𝑓1 ) // induction hypothesis = 𝐿(𝑓) □ Now we need to further lower bound 𝜒(𝑓). Theorem 2.2. Let 𝐴 = 𝑓 −1 (1), 𝐵 = 𝑓 −1 (0), and 𝑆 = {(𝑥, 𝑦): 𝑥 ∈ 𝐴,𝑦 ∈ 𝐵,|𝑥 ⊕𝑦| = 1}. We have 𝐿(𝑓) ≥ |𝑆|2 |𝐴| ⋅ |𝐵| where |𝑥 ⊕ 𝑦| is the Hamming weight of 𝑥 ⊕𝑦. Proof. Suppose we have a decomposition of 𝐴 × 𝐵 into r monochromatic rectangles Ri of dimensions 𝑠𝑖 × 𝑡𝑖 . Let ci be the number of elements of S in Ri . We claim that each 𝑐𝑖 ≤ 𝑚𝑖𝑛{𝑠𝑖 ,𝑡𝑖 } ≤ √𝑠𝑖 𝑡𝑖 . Indeed, suppose that all elements (x,y) in Ri differ at position j: 𝑥𝑗 ≠ 𝑦𝑗 . Then each x has at most one y s.t. (𝑥,𝑦) ∈ 𝑆 ∩ 𝑅𝑖 , therefore 𝑐𝑖 ≤ 𝑠𝑖 . Similarly, we can show that 𝑐𝑖 ≤ 𝑡𝑖 . Then |𝑆|2 = (∑𝑐𝑖 𝑟 𝑖=1 ) 2 ≤ 𝑟∑𝑐𝑖 2 𝑟 𝑖=1 ≤ 𝑟∑𝑠𝑖 𝑡𝑖 𝑟 𝑖=1 = 𝑟|𝐴 × 𝐵| = 𝑟|𝐴| ⋅ |𝐵|, Rearranging the terms gives the bound. □
We can use this to show a quadratic lower bound for parity function,improving the bound given in the last section. Theorem 2.3.Let f be the parity function ofn variables.Then L(f)>n2. Proof.Clearly,IAl IBl 2n-1,and ISl n2n-1.So L(f)>n2. One may wonder whether this lower bound can be further improved.The answer is no:actually there is a formula to compute parity function with O(n2)leaves.Can you find one? (hint:Consider the two-variable version,x1x2first.) References [Sub61]B.A.Subbotovskaya,Realizations of linear functions by formulas using +,.,-Soviet Math.Dokl.2.110-112.1961 [Khr72]V.M.Khrapchenko,A method of obtaining lower bounds for the complexity of nt-schemes,Math.Notes Acad.ofSci.USSR 10,474-479,1972 [Ryc85]K.L.Rychkov,A modification of Khrapchenko's method and its application to lower bounds for-schemes of code functions,in Metody Diskretnogo Analiza,42 (Novosibirsk),91-98, 1985
We can use this to show a quadratic lower bound for parity function, improving the bound given in the last section. Theorem 2.3. Let f be the parity function of n variables. Then 𝐿(𝑓) ≥ 𝑛 2 . Proof. Clearly, |A| = |B| = 2 n−1 , and |S| = n2 n−1 . So 𝐿(𝑓) ≥ 𝑛 2 . □ One may wonder whether this lower bound can be further improved. The answer is no: actually there is a formula to compute parity function with O(n 2 ) leaves. Can you find one? (hint: Consider the two-variable version, 𝑥1 ⊕ 𝑥2, first.) References [Sub61] B. A. Subbotovskaya, Realizations of linear functions by formulas using +, ., -, Soviet Math. Dokl. 2, 110–112, 1961. [Khr72] V.M. Khrapchenko, A method of obtaining lower bounds for the complexity of 𝛑-schemes, Math. Notes Acad. of Sci. USSR 10, 474–479, 1972. [Ryc85] K. L. Rychkov, A modification of Khrapchenko’s method and its application to lower bounds for -schemes of code functions, in Metody Diskretnogo Analiza, 42 (Novosibirsk), 91–98, 1985