Advanced Algorithms: Dimension Reduction
尹一通 (Yitong Yin), Nanjing University, 2022 Fall
Metric Embedding

• Two metric spaces: (X, dX) and (Y, dY)
• Low-distortion embedding ϕ : X → Y, for small α ≥ 1:
  ∀x1, x2 ∈ X : (1/α) dX(x1, x2) ≤ dY(ϕ(x1), ϕ(x2)) ≤ α dX(x1, x2)
Dimension Reduction

Input: n points x1, x2, …, xn ∈ ℝ^d
Output: n points y1, y2, …, yn ∈ ℝ^k s.t. ∀1 ≤ i, j ≤ n:
  (1 − ϵ)∥xi − xj∥ ≤ ∥yi − yj∥ ≤ (1 + ϵ)∥xi − xj∥

• Usually we want k ≪ d.
• How small can k be?
• For what distance ∥·∥?
• The embedding should be efficiently constructible.
Johnson-Lindenstrauss Theorem/Transformation (JLT)
Johnson-Lindenstrauss Theorem

"In Euclidean space, it is always possible to embed a set of n points in arbitrary dimension into O(log n) dimensions with constant distortion."

Theorem (Johnson-Lindenstrauss 1984): ∀0 < ϵ < 1, for any set S of n points from ℝ^d, there is a ϕ : ℝ^d → ℝ^k with k = O(ϵ⁻² log n), such that ∀x, y ∈ S:
  (1 − ϵ)∥x − y∥₂² ≤ ∥ϕ(x) − ϕ(y)∥₂² ≤ (1 + ϵ)∥x − y∥₂²
Johnson-Lindenstrauss Theorem (matrix form)

Theorem (Johnson-Lindenstrauss 1984): ∀0 < ϵ < 1, for any set S of n points from ℝ^d, there is a matrix A ∈ ℝ^{k×d} with k = O(ϵ⁻² log n), such that ∀x, y ∈ S:
  (1 − ϵ)∥x − y∥₂² ≤ ∥Ax − Ay∥₂² ≤ (1 + ϵ)∥x − y∥₂²

• The probabilistic method: for a random A ∈ ℝ^{k×d},
  Pr[∀x, y ∈ S : (1 − ϵ)∥x − y∥₂² ≤ ∥Ax − Ay∥₂² ≤ (1 + ϵ)∥x − y∥₂²] = 1 − O(1/n),
  i.e., a random A works with high probability.
• Efficient constructions of random A ∈ ℝ^{k×d}:
  • projection onto a uniform random k-dimensional subspace (Johnson-Lindenstrauss; Dasgupta-Gupta);
  • independent Gaussian entries (Indyk-Motwani);
  • i.i.d. −1/+1 entries (Achlioptas).
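The sign-matrix construction in the last bullet can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the slides; the 1/√k scaling is the standard normalization (assumed here) that makes E[∥Au∥₂²] = ∥u∥₂² for every u, and the dimensions are arbitrary.

```python
import numpy as np

def achlioptas_matrix(k, d, rng):
    """Random projection matrix with i.i.d. +/-1 entries, scaled by
    1/sqrt(k) so that E[||Au||_2^2] = ||u||_2^2 (Achlioptas-style sketch)."""
    signs = rng.choice([-1.0, 1.0], size=(k, d))
    return signs / np.sqrt(k)

rng = np.random.default_rng(1)
A = achlioptas_matrix(200, 500, rng)

u = rng.normal(size=500)
u /= np.linalg.norm(u)          # unit vector
print(np.sum((A @ u)**2))       # concentrates around 1
```

Sign matrices behave like Gaussian ones in this respect, but are cheaper to sample and store.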
Dimension Reduction

Input: n points x1, x2, …, xn ∈ ℝ^d
Output: n points y1, y2, …, yn ∈ ℝ^k s.t. ∀1 ≤ i, j ≤ n:
  (1 − ϵ)∥xi − xj∥₂² ≤ ∥yi − yj∥₂² ≤ (1 + ϵ)∥xi − xj∥₂²
for some suitable k = O(ϵ⁻² log n):

J-L Transformation (i.i.d. Gaussian entries):
  Entries of A ∈ ℝ^{k×d} are chosen i.i.d. from 𝒩(0, 1/k)
  (the Gaussian distribution with mean 0 and variance 1/k);
  for i = 1, 2, …, n: let yi = A xi.

• Recall: a Gaussian random variable X ∼ 𝒩(μ, σ²) has
  Pr[X ≤ t] = ∫_{−∞}^{t} (1/√(2πσ²)) e^{−(x−μ)²/(2σ²)} dx,  E[X] = μ,  Var[X] = σ².
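The transformation above can be sketched directly in NumPy. The helper name jl_transform and the sizes n, d, k are chosen only for illustration:

```python
import numpy as np

def jl_transform(X, k, rng):
    """Map each row x_i of X (n points in R^d) to y_i = A x_i in R^k,
    where A has i.i.d. N(0, 1/k) entries, as in the J-L transformation."""
    d = X.shape[1]
    A = rng.normal(loc=0.0, scale=np.sqrt(1.0 / k), size=(k, d))
    return X @ A.T

rng = np.random.default_rng(0)
n, d, k = 50, 1000, 400            # illustrative sizes only
X = rng.normal(size=(n, d))
Y = jl_transform(X, k, rng)

# worst-case pairwise distortion of squared Euclidean distances
ratios = [np.sum((Y[i] - Y[j])**2) / np.sum((X[i] - X[j])**2)
          for i in range(n) for j in range(i + 1, n)]
print(min(ratios), max(ratios))    # both close to 1
```

Here the ratios land close to 1 because k is fairly generous relative to log n; shrinking k trades dimension for distortion, per k = O(ϵ⁻² log n).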
Norm Preservation

• ∀0 < ϵ < 1, ∀ set S of n points from ℝ^d
• Random matrix A ∈ ℝ^{k×d} with k = O(ϵ⁻² log n):

Johnson-Lindenstrauss Theorem: With high probability (≥ 1 − O(1/n)): ∀x, y ∈ S,
  (1 − ϵ)∥x − y∥₂² ≤ ∥Ax − Ay∥₂² ≤ (1 + ϵ)∥x − y∥₂²

Dividing through by ∥x − y∥₂², this is equivalent to
  1 − ϵ ≤ ∥A(x − y)/∥x − y∥₂∥₂² ≤ 1 + ϵ,
where (x − y)/∥x − y∥₂ is a unit vector. So it suffices to show, for any unit vector u ∈ ℝ^d,
  Pr[ |∥Au∥₂² − 1| > ϵ ] < 1/n³,
and then take a union bound over all O(n²) pairs x, y ∈ S.
A ∈ ℝ^{k×d}: each entry of A is chosen i.i.d. from 𝒩(0, 1/k).

Goal: for any unit vector u ∈ ℝ^d, Pr[ |∥Au∥₂² − 1| > ϵ ] < 1/n³.

∥Au∥₂² = Σ_{i=1}^{k} (Au)i², so by linearity of expectation,
  E[∥Au∥₂²] = Σ_{i=1}^{k} E[(Au)i²].

Each coordinate is (Au)i = Σ_{j=1}^{d} Aij uj, where the Aij ∼ 𝒩(0, 1/k) are independent.
Recall: for independent X ∼ 𝒩(μ1, σ1²) and Y ∼ 𝒩(μ2, σ2²), X + Y ∼ 𝒩(μ1 + μ2, σ1² + σ2²).
Therefore (Au)i ∼ 𝒩(0, Σ_{j=1}^{d} uj²/k) = 𝒩(0, 1/k), since u is a unit vector.
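The two facts just derived, (Au)i ∼ 𝒩(0, 1/k) and hence E[∥Au∥₂²] = k · (1/k) = 1, can be checked empirically. A sketch with arbitrary sample sizes:

```python
import numpy as np

rng = np.random.default_rng(2)
k, d, trials = 100, 300, 5000

# a fixed unit vector u in R^d
u = rng.normal(size=d)
u /= np.linalg.norm(u)

first_coord = np.empty(trials)
norms = np.empty(trials)
for t in range(trials):
    # fresh A with i.i.d. N(0, 1/k) entries
    A = rng.normal(scale=np.sqrt(1.0 / k), size=(k, d))
    v = A @ u
    first_coord[t] = v[0]          # one coordinate (Au)_1
    norms[t] = np.sum(v**2)        # ||Au||_2^2

print(first_coord.var())           # ≈ 1/k: (Au)_i has variance 1/k
print(norms.mean())                # ≈ 1:   E[||Au||_2^2] = 1
```

The remaining step of the proof, not shown here, is a concentration bound showing ∥Au∥₂² stays within ϵ of its mean with probability 1 − 1/n³.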