1: 10 · Y. Yin —The typical case_中国高校课件下载中心

正在加载图片...

1:10 ·Y.Yin -The typical case for practically efficient data structures:m=@(logn)no(n), s=Poly(n),andt=o(o) 5.NEAREST NEIGHBOR SEARCH We consider the decision version of nearest neighbor search,4-near neighbor(- NN),in a high dimensional Hamming cube (0,1)4.Here,=(0,1)4,Y=(0.1), and f(x,y)E(0,1)answer whether there exists a point in y within distance from the x.As in Borodin et al.[1999]and Barkol and Rabani [2002],we assume that d=@(logn)nn(1)to make the problem nontrivial. We prove that with the above setting,there does not exist a (s,b,t)-CPP for the -NN problem,if s =Poly(n),b nl-for some positive constant a and t o(log(d/logn)).To prove this,we prove that the lower bound holds for the partial match problem [Indyk et al.2004;Jayram et al.2004; Patrascu 2008,which is an instantiation of the A-NN problem as shown in Borodin et al.[1999]. The partial match problem is defined as follows:The domain is a Hamming cube (0,1]d,where d =o(logn)nno(1,and each data instance y is a set of n points from the domain,that is,Y=().The set of queries is=(0.1, Given a data instance y()and a queryxe(01,),f(x,y)=1if and only if there is a z e y such that z matches x,except for the bits assigned with“*”. THEOREM 5.1.There is no (s,b,1)-CPP for the partial match problem if s< (d/2logn)alogn and b nl-a for some constant 0<a 1. PROOF.We denote the problem as f.From the characterization of(s,b,1)- CPP given in Theorem 3.3,it is sufficient to show that for any s x 25 partition Fof Y,there exist xeX and y e Y such that for all FeF(y),f(x,F)=2.We prove this with the probabilistic method.With some distribution ofx and y,we show that for any s x 25 partition F of Y,Pr[VFF(y),f(x,F)=2]>0. For the rest of the proof,we assume that y is uniformly selected from Y and that x is generated by uniformly choosing r=1+logn bits and fixing each of them uniformly and independently at random with 0 or 1,and setting the other bits to“*”. We then prove two supporting lemmas.Recall that for a partition P of y, Py)denotes the set F∈P that y∈F. LEMMA 5.2.For any partition P of Y,if Ps 25,then for k logn it holds that [po≤(-] exp (6 In2-n2-*) PROOF.We let P=(F1,F2,...,Fe),whereks 25,and let pi=Fl/Yl.Be- cause P is a partition of Y,we know that pi=1.We define a random variable ACM Transactions on Computation Theory,Vol.2,No.1,Article 1,Pub.date:November 2010.1: 10 · Y. Yin —The typical case for practically efficient data structures: m = ω(log n) ∩ o(n), s = Poly(n), and t = o n log n . 5. NEAREST NEIGHBOR SEARCH We consider the decision version of nearest neighbor search, λ-near neighbor (λ- NN), in a high dimensional Hamming cube {0, 1} d. Here, X = {0, 1} d, Y = {0,1}d n , and f(x, y) ∈ {0, 1} answer whether there exists a point in y within distance λ from the x. As in Borodin et al. [1999] and Barkol and Rabani [2002], we assume that d = ω(log n) ∩ no(1) to make the problem nontrivial. We prove that with the above setting, there does not exist a (s, b,t)-CPP for the λ-NN problem, if s = Poly(n), b < n1−α for some positive constant α and t = o log d/ log n . To prove this, we prove that the lower bound holds for the partial match problem [Indyk et al. 2004; Jayram et al. 2004; Patras ˘ ¸cu 2008], which is an instantiation of the λ-NN problem as shown in Borodin et al. [1999]. The partial match problem is defined as follows: The domain is a Hamming cube {0, 1}d, where d = ω(log n) ∩ no(1), and each data instance y is a set of n points from the domain, that is, Y = {0,1}d n . The set of queries is X = {0, 1, ∗}d. Given a data instance y ∈ {0,1}d n and a query x ∈ {0, 1, ∗}d, f(x, y) = 1 if and only if there is a z ∈ y such that z matches x, except for the bits assigned with “∗”. THEOREM 5.1. There is no (s, b, 1)-CPP for the partial match problem if s < (d/2 log n) α log n and b < n1−α for some constant 0 <α< 1. PROOF. We denote the problem as f. From the characterization of (s, b, 1)- CPP given in Theorem 3.3, it is sufficient to show that for any s × 2b partition F of Y, there exist x ∈ X and y ∈ Y such that for all F ∈ F(y), f(x, F) = 2. We prove this with the probabilistic method. With some distribution of x and y, we show that for any s × 2b partition F of Y, Pr[∀F ∈ F(y), f(x, F) = 2] > 0. For the rest of the proof, we assume that y is uniformly selected from Y and that x is generated by uniformly choosing r = 1 + log n bits and fixing each of them uniformly and independently at random with 0 or 1, and setting the other bits to “∗”. We then prove two supporting lemmas. Recall that for a partition P of Y, P(y) denotes the set F ∈ P that y ∈ F. LEMMA 5.2. For any partition P of Y, if |P| ≤ 2b , then for k < log n it holds that Pr y P(y) ≤ 1 − 2−k 2d n ≤ exp b ln 2 − n2−k . PROOF. We let P = {F1, F2,..., Fk}, where k ≤ 2b , and let pi = |Fi|/|Y|. Because P is a partition of Y, we know that i pi = 1. We define a random variable ACM Transactions on Computation Theory, Vol. 2, No. 1, Article 1, Pub. date: November 2010.

<<向上翻页向下翻页>>