are performed for training. In this paper, we adopt the same column sampling method as LFH (Zhang et al. 2014) for our COSDISH. Unlike LFH, which adopts continuous relaxation for learning, in COSDISH we propose a novel discrete optimization (learning) method based on column sampling.

More specifically, in each iteration we randomly sample a subset $\Omega$ of $N = \{1, 2, \dots, n\}$ and then choose the semantic similarity between all $n$ points and the points indexed by $\Omega$. That is to say, we sample $|\Omega|$ columns of $S$, with the column numbers indexed by $\Omega$. We use $\widetilde{S} \in \{-1,1\}^{n \times |\Omega|}$ to denote the sampled sub-matrix of similarity.

There exist two different kinds of points in each iteration: those indexed by $\Omega$ and those indexed by $\Gamma = N - \Omega$. We use $\widetilde{S}^\Omega \in \{-1,1\}^{|\Omega| \times |\Omega|}$ to denote the sub-matrix formed by the rows of $\widetilde{S}$ indexed by $\Omega$; $\widetilde{S}^\Gamma \in \{-1,1\}^{|\Gamma| \times |\Omega|}$, $B^\Omega \in \{-1,1\}^{|\Omega| \times q}$ and $B^\Gamma \in \{-1,1\}^{|\Gamma| \times q}$ are defined in a similar way.

According to the problem in (1), the problem associated with the sampled columns in each iteration can be reformulated as follows:

$$\min_{B^\Omega,\, B^\Gamma} \; \|q\widetilde{S}^\Gamma - B^\Gamma [B^\Omega]^T\|_F^2 + \|q\widetilde{S}^\Omega - B^\Omega [B^\Omega]^T\|_F^2. \tag{2}$$
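To make the indexing concrete, the following is a minimal NumPy sketch of the column-sampling step. It assumes $S$ is available as a dense $n \times n$ matrix with entries in $\{-1,1\}$; the function and variable names are ours for illustration, not from the paper.

```python
import numpy as np

def sample_columns(S, num_omega, rng):
    """Sample |Omega| columns of the n x n similarity matrix S and
    split the rows into the Omega and Gamma = N - Omega index sets."""
    n = S.shape[0]
    omega = rng.choice(n, size=num_omega, replace=False)  # sampled subset Omega
    gamma = np.setdiff1d(np.arange(n), omega)             # remaining points Gamma
    S_tilde = S[:, omega]          # n x |Omega| sampled sub-matrix of S
    S_omega = S_tilde[omega, :]    # |Omega| x |Omega|: rows indexed by Omega
    S_gamma = S_tilde[gamma, :]    # |Gamma| x |Omega|: rows indexed by Gamma
    return omega, gamma, S_omega, S_gamma

# Usage: omega, gamma, S_om, S_ga = sample_columns(S, 100, np.random.default_rng(0))
```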
Alternating Optimization

We propose an alternating optimization strategy to solve the problem in (2). It consists of several iterations, with each iteration divided into two alternating steps: we update $B^\Gamma$ with $B^\Omega$ fixed, and then update $B^\Omega$ with $B^\Gamma$ fixed. This two-step alternating optimization procedure is repeated several times.

Update $B^\Gamma$ with $B^\Omega$ Fixed

When $B^\Omega$ is fixed, the objective function of $B^\Gamma$ is given by:

$$f_2(B^\Gamma) = \|q\widetilde{S}^\Gamma - B^\Gamma [B^\Omega]^T\|_F^2, \quad B^\Gamma \in \{-1,1\}^{|\Gamma| \times q}.$$

Minimizing $f_2(\cdot)$ can be viewed as a discrete least squares problem. By relaxing $B^\Gamma$ to continuous values, it is easy to get the optimal continuous solution. However, after we quantize the continuous solution into a discrete one, the error caused by quantization cannot be bounded. Hence, even if the continuous solution is optimal for the relaxed continuous problem, the discrete solution might be far away from the optimal solution of the original problem.

Here, we design a method to solve it in a discrete way with a constant-approximation bound. Consider the following problem $f_1(\cdot)$, which changes the loss from the Frobenius norm in $f_2(\cdot)$ to the $L_1$ norm:

$$f_1(B^\Gamma) = \|q\widetilde{S}^\Gamma - B^\Gamma [B^\Omega]^T\|_1, \quad B^\Gamma \in \{-1,1\}^{|\Gamma| \times q}. \tag{3}$$

It is easy to find that $f_1(\cdot)$ reaches its minimum when $B^\Gamma = \mathrm{sgn}(\widetilde{S}^\Gamma B^\Omega)$. Furthermore, we have the following theorem.

Theorem 1. Suppose that $f_1(F_1^*)$ and $f_2(F_2^*)$ reach their minimum at the points $F_1^*$ and $F_2^*$, respectively. We have $f_2(F_1^*) \le 2q\, f_2(F_2^*)$.

The proof of Theorem 1 can be found in the supplementary material.¹

That is to say, if we use the solution of $f_1(\cdot)$, i.e., $B^\Gamma = F_1^* = \mathrm{sgn}(\widetilde{S}^\Gamma B^\Omega)$, as the solution of $f_2(\cdot)$, it is a $2q$-approximation solution. Because $q$ is usually small, we can say that it is a constant-approximation solution, which means that we get an error bound for the original problem.

Because some elements of $\widetilde{S}^\Gamma B^\Omega$ can be zero, in practice we set $B^\Gamma_{(t)} = \widetilde{\mathrm{sgn}}(\widetilde{S}^\Gamma B^\Omega, B^\Gamma_{(t-1)})$, where

$$\widetilde{\mathrm{sgn}}(b_1, b_2) = \begin{cases} 1 & b_1 > 0 \\ b_2 & b_1 = 0 \\ -1 & b_1 < 0 \end{cases}$$

$t$ is the iteration number, and $\widetilde{\mathrm{sgn}}(\cdot,\cdot)$ is applied in an element-wise manner.

¹The supplementary material can be downloaded from http://cs.nju.edu.cn/lwj/paper/COSDISH_sup.pdf.
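The $B^\Gamma$ step therefore has a closed form. Below is a short NumPy sketch of this update under the same assumptions as above; the tie-breaking mirrors $\widetilde{\mathrm{sgn}}$, keeping the bit from iteration $t-1$ wherever $\widetilde{S}^\Gamma B^\Omega$ is exactly zero.

```python
import numpy as np

def update_B_gamma(S_gamma, B_omega, B_gamma_prev):
    """Closed-form 2q-approximate update B_Gamma = sgn~(S_Gamma @ B_Omega),
    resolving zeros with the previous iterate's bits (element-wise)."""
    M = S_gamma @ B_omega               # |Gamma| x q; the factor q > 0 in eq. (3)
                                        # would not change any signs, so it is dropped
    B_gamma = np.sign(M).astype(int)
    ties = (M == 0)                     # entries where sgn is ambiguous
    B_gamma[ties] = B_gamma_prev[ties]  # keep the bit from iteration t-1
    return B_gamma
```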
Update $B^\Omega$ with $B^\Gamma$ Fixed

When $B^\Gamma$ is fixed, the sub-problem of $B^\Omega$ is given by:

$$\min_{B^\Omega \in \{-1,1\}^{|\Omega| \times q}} \; \|q\widetilde{S}^\Gamma - B^\Gamma [B^\Omega]^T\|_F^2 + \|q\widetilde{S}^\Omega - B^\Omega [B^\Omega]^T\|_F^2.$$

Inspired by TSH (Lin et al. 2013a), we can transform the above problem into $q$ binary quadratic programming (BQP) problems. The optimization of the $k$th bit of $B^\Omega$ is given by:

$$\min_{b^k \in \{-1,1\}^{|\Omega|}} \; [b^k]^T Q^{(k)} b^k + [b^k]^T p^{(k)}, \tag{4}$$

where $b^k$ denotes the $k$th column of $B^\Omega$, and

$$Q^{(k)}_{i,j}\Big|_{i \neq j} = -2\Big(q\widetilde{S}^\Omega_{i,j} - \sum_{m=1}^{k-1} b^m_i b^m_j\Big), \qquad Q^{(k)}_{i,i} = 0,$$

$$p^{(k)}_i = -2 \sum_{l=1}^{|\Gamma|} B^\Gamma_{l,k} \Big(q\widetilde{S}^\Gamma_{l,i} - \sum_{m=1}^{k-1} B^\Gamma_{l,m} B^\Omega_{i,m}\Big).$$

Note that the formulation in (4) is not the same as that in TSH due to the additional linear term. More details about the above derivation can be found in the supplementary material.

Then, we need to turn the problem in (4) into a standard BQP form, in which the domain of the binary variable is $\{0,1\}$ and there are no linear terms.

First, we transform the domain of $b^k$ from $\{-1,1\}^{|\Omega|}$ to $\{0,1\}^{|\Omega|}$. Letting $\hat{b}^k = \frac{1}{2}(b^k + 1)$, we have:

$$[b^k]^T Q^{(k)} b^k + [b^k]^T p^{(k)} = 4 \sum_{m=1}^{|\Omega|} \sum_{l=1}^{|\Omega|} \hat{b}^k_m \hat{b}^k_l Q^{(k)}_{m,l} + 2 \sum_{m=1}^{|\Omega|} \hat{b}^k_m \Big(p^{(k)}_m - \sum_{l=1}^{|\Omega|} \big(Q^{(k)}_{m,l} + Q^{(k)}_{l,m}\big)\Big) + \text{const},$$

where const is a constant. Hence, problem (4) can be reformulated as follows:

$$\min_{\hat{b}^k \in \{0,1\}^{|\Omega|}} \; [\hat{b}^k]^T \hat{Q}^{(k)} \hat{b}^k + [\hat{b}^k]^T \hat{p}^{(k)}, \tag{5}$$

where, reading off the identity above, $\hat{Q}^{(k)} = 4Q^{(k)}$ and $\hat{p}^{(k)}_m = 2\big(p^{(k)}_m - \sum_{l=1}^{|\Omega|} (Q^{(k)}_{m,l} + Q^{(k)}_{l,m})\big)$.
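For concreteness, here is a NumPy sketch that assembles $Q^{(k)}$ and $p^{(k)}$ from the already-solved bits and then applies the $\{0,1\}$ domain change. This is our own illustrative code under the same assumptions as before, not the paper's implementation; solving the resulting BQP itself is not shown here. Note that `k` is zero-based, so `[:, :k]` covers the first $k-1$ bits of the paper's one-based indexing.

```python
import numpy as np

def build_bqp(k, S_omega, S_gamma, B_omega, B_gamma, q):
    """Assemble Q^(k) and p^(k) for the k-th bit of B_Omega (eq. (4)),
    then convert them to the {0,1}-domain form of eq. (5)."""
    # Residuals left over after accounting for the previously solved bits.
    R_omega = q * S_omega - B_omega[:, :k] @ B_omega[:, :k].T  # |Omega| x |Omega|
    R_gamma = q * S_gamma - B_gamma[:, :k] @ B_omega[:, :k].T  # |Gamma| x |Omega|
    Q = -2.0 * R_omega
    np.fill_diagonal(Q, 0.0)              # Q^(k)_{ii} = 0
    p = -2.0 * (B_gamma[:, k] @ R_gamma)  # p^(k), length |Omega|
    # Change of variable b = 2*b_hat - 1 moves the domain to {0,1}:
    Q_hat = 4.0 * Q
    p_hat = 2.0 * (p - Q.sum(axis=1) - Q.sum(axis=0))
    return Q_hat, p_hat
```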