to the vertices of binary hypercube is minimized. Compared to the data-dependent methods, the data-independent methods need longer codes to achieve satisfactory performance [7].

For most existing projection functions, such as those mentioned above, the variances of different projected dimensions are different. Many researchers [7, 12, 21] have argued that using the same number of bits for different projected dimensions with unequal variances is unreasonable, because larger-variance dimensions will carry more information. Some methods [7, 12] apply an orthogonal transformation to the PCA-projected data with the expectation of balancing the variances of different PCA dimensions, and achieve better performance than the original PCA-based hashing. However, to the best of our knowledge, there exist no methods which can guarantee learning a projection with equal variances for different dimensions. Hence, the viewpoint that using the same number of bits for different projected dimensions is unreasonable has still not been verified by either theory or experiment.

In this paper, a novel hashing method, called isotropic hashing (IsoHash), is proposed to learn a projection function which can produce projected dimensions with isotropic variances (equal variances). To the best of our knowledge, this is the first work which can learn projections with isotropic variances for hashing. Experimental results on real data sets show that IsoHash can outperform its counterpart with anisotropic variances for different dimensions, which verifies the intuitive viewpoint that projections with isotropic variances will be better than those with anisotropic variances. Furthermore, the performance of IsoHash is also comparable, if not superior, to the state-of-the-art methods.

2 Isotropic Hashing

2.1 Problem Statement

Assume we are given $n$ data points $\{x_1, x_2, \ldots, x_n\}$ with $x_i \in \mathbb{R}^d$, which form the columns of the data matrix $X \in \mathbb{R}^{d \times n}$. Without loss of generality, in this paper the data are assumed to be zero-centered, which means $\sum_{i=1}^{n} x_i = 0$. The basic idea of hashing is to map each point $x_i$ into a binary string $y_i \in \{0, 1\}^m$, with $m$ denoting the code size. Furthermore, close points in the original space $\mathbb{R}^d$ should be hashed into similar binary codes in the code space $\{0, 1\}^m$ to preserve the similarity structure of the original space. In general, we compute the binary code of $x_i$ as $y_i = [h_1(x_i), h_2(x_i), \ldots, h_m(x_i)]^T$ with $m$ binary hash functions $\{h_k(\cdot)\}_{k=1}^{m}$.
Because it is NP-hard to directly compute the best binary functions $h_k(\cdot)$ for a given data set [31], most hashing methods adopt a two-stage strategy to learn $h_k(\cdot)$. In the projection stage, $m$ real-valued projection functions $\{f_k(x)\}_{k=1}^{m}$ are learned, and each function generates one real value. Hence, we have $m$ projected dimensions, each of which corresponds to one projection function. In the quantization stage, the real values are quantized into a binary string by thresholding.

Currently, most methods use one bit to quantize each projected dimension. More specifically, $h_k(x_i) = \mathrm{sgn}(f_k(x_i))$, where $\mathrm{sgn}(x) = 1$ if $x \geq 0$ and $0$ otherwise. The only exceptions are AGH [21], DBQ [14], and MH [15], which use two bits to quantize each dimension. In sum, all of these methods adopt the same number (either one or two) of bits for different projected dimensions. However, the variances of different projected dimensions are unequal, and larger-variance dimensions typically carry more information. Hence, using the same number of bits for different projected dimensions with unequal variances is unreasonable, which has also been argued by many researchers [7, 12, 21]. Unfortunately, there exist no methods which can learn projection functions with equal variances for different dimensions. In the rest of this section, we present a novel model to learn projections with isotropic variances.
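For concreteness, the two-stage strategy can be sketched in a few lines of Python. The sketch below assumes a linear projection stage, i.e., $f_k(x) = w_k^T x$ with the projection vectors collected as the columns of a matrix $W$; the random $W$ in the usage example is purely illustrative and is not the projection learned by any particular method.

```python
import numpy as np

def two_stage_hash(X, W):
    """Generic two-stage hashing: projection followed by one-bit quantization.

    X : d x n matrix of zero-centered data points (columns are points).
    W : d x m projection matrix, so f_k(x) = w_k^T x is the k-th
        real-valued projection function (a linear projection is assumed
        here for illustration).
    Returns an m x n matrix of binary codes in {0, 1}.
    """
    # Projection stage: m real values per point.
    projections = W.T @ X                       # shape (m, n)
    # Quantization stage: h_k(x_i) = sgn(f_k(x_i)),
    # with sgn(v) = 1 if v >= 0 and 0 otherwise.
    return (projections >= 0).astype(np.uint8)

# Toy usage: 5-dimensional data, 3-bit codes from a random projection.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 100))
X -= X.mean(axis=1, keepdims=True)              # zero-center the data
W = rng.standard_normal((5, 3))
Y = two_stage_hash(X, W)                        # Y.shape == (3, 100)
```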
2.2 Model Formulation

The idea of our IsoHash method is to learn an orthogonal matrix to rotate the PCA projection matrix. To generate a code of $m$ bits, PCAH performs PCA on $X$ and then uses the top $m$ eigenvectors of the covariance matrix $XX^T$ as the columns of the projection matrix $W \in \mathbb{R}^{d \times m}$. Here, the top $m$ eigenvectors are those corresponding to the $m$ largest eigenvalues $\{\lambda_k\}_{k=1}^{m}$, generally arranged in non-increasing order $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_m$.
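As an illustration of the PCAH projection stage just described, the following minimal numpy sketch computes $W$ by taking the eigendecomposition of the $d \times d$ covariance matrix $XX^T$ directly; this direct eigendecomposition is an assumption made for simplicity and is not necessarily how the authors implement it.

```python
import numpy as np

def pcah_projection(X, m):
    """Sketch of the PCAH projection stage.

    X : d x n zero-centered data matrix.
    m : code length (number of bits).
    Returns W (d x m), whose columns are the eigenvectors of X X^T
    associated with the m largest eigenvalues, together with those
    eigenvalues in non-increasing order.
    """
    cov = X @ X.T                               # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)      # eigh returns eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:m]       # indices of the m largest eigenvalues
    W = eigvecs[:, order]                       # top-m eigenvectors as columns
    lambdas = eigvals[order]                    # lambda_1 >= lambda_2 >= ... >= lambda_m
    return W, lambdas

# The PCAH binary codes then follow the same one-bit quantization as above:
# codes = (W.T @ X >= 0).astype(np.uint8)
```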