learn the hash functions in (He et al. 2011). One of the most recent data-dependent methods is iterative quantization (ITQ) (Gong and Lazebnik 2011), which finds an orthogonal rotation matrix to refine the initial projection matrix learned by principal component analysis (PCA) so that the quantization error of mapping the data to the vertices of the binary hypercube is minimized. It outperforms most other state-of-the-art methods with relatively short codes.

Because it is NP-hard to directly compute the best binary codes for a given data set (Weiss, Torralba, and Fergus 2008), both data-independent and data-dependent hashing methods typically adopt a two-stage strategy. In the first stage, several projected dimensions of real values are generated. Then, in the second stage, the real values are quantized into binary codes by thresholding. Currently, most existing methods use one single bit to quantize each projected dimension. More specifically, given a point x, each projected dimension i is associated with a real-valued projection function f_i(x). The ith hash bit of x will be 1 if f_i(x) ≥ θ; otherwise, it will be 0. One problem with this single-bit quantization (SBQ) is that the threshold θ (0 in most cases if the data are zero-centered) typically lies in the region of highest point density, so many neighboring points close to the threshold may be hashed to totally different bits, which contradicts the similarity-preserving principle of hashing. Figure 1 illustrates an example distribution of the real values before thresholding, computed by PCA¹. We can see that points "B" and "C" in Figure 1(a), which lie in the region of highest density, will be quantized into 0 and 1 respectively although they are very close to each other. On the contrary, points "A" and "B" will be quantized into the same code 0 although they are far away from each other. Because many points lie close to the threshold, adopting this kind of SBQ strategy for hashing is unreasonable.

¹ The distributions of the real values computed by other hashing methods, such as SH and ITQ, are similar.

To the best of our knowledge, only one existing method, called AGH (Liu et al. 2011), has identified this problem of SBQ and proposed a new quantization method, called hierarchical hashing (HH), to solve it. The basic idea of HH is to use three thresholds to divide the real values of each dimension into four regions, and to encode each dimension with double bits. However, for any projected dimension, the Hamming distance between the two farthest points is the same as that between two relatively close points, which is unreasonable. Furthermore, although the HH strategy can achieve very promising performance when combined with AGH projection functions (Liu et al. 2011), it is still unclear whether HH will truly be better than SBQ when it is combined with other projection functions.
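To make this distance pathology concrete, consider the following minimal sketch. The three thresholds and the left-to-right code assignment are hypothetical values chosen purely for illustration (HH derives its thresholds and bits hierarchically rather than this way), but any fixed double-bit labeling of four ordered regions exhibits the same effect:

```python
# Minimal sketch of a three-threshold, four-region double-bit encoding.
# Thresholds and code assignment are hypothetical, for illustration only.
T = (-1.0, 0.0, 1.0)              # t0 < t1 < t2: three thresholds
CODES = ["00", "01", "10", "11"]  # one 2-bit code per region, left to right

def encode(v):
    """Map a real value to the 2-bit code of the region it falls in."""
    return CODES[sum(v > t for t in T)]

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

# Two far-apart extreme points vs. two nearly touching middle points:
print(hamming(encode(-2.0), encode(2.0)))   # 2 (farthest pair)
print(hamming(encode(-0.1), encode(0.1)))   # 2 (close pair, same distance!)
```

With only four 2-bit codes available, some pair of neighboring regions always ends up as far apart in Hamming distance as the two extreme regions, which is exactly the flaw noted above.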
In this paper, we claim that using double bits with adaptively learned thresholds to quantize each projected dimension can completely solve the problem of SBQ. The result is our novel quantization strategy called double-bit quantization (DBQ). Extensive experiments on real data sets demonstrate that our DBQ can significantly outperform SBQ and HH.

Problem Definition

Given a set of n data points S = {x_1, x_2, …, x_n} with x_i ∈ R^d, the goal of hashing is to learn a mapping that encodes each point x_i as a binary string y_i ∈ {0, 1}^c, where c denotes the code size. To achieve the similarity-preserving property, we require close points in the original space R^d to have similar binary codes in the code space {0, 1}^c. To get the c-bit codes, we need c binary hash functions {h_k(·)}_{k=1}^{c}. The binary code is then computed as y_i = [h_1(x_i), h_2(x_i), …, h_c(x_i)]^T. Most hashing algorithms adopt the following two-stage strategy:

• In the first stage, c real-valued functions {f_k(·)}_{k=1}^{c} are used to generate an intermediate vector z_i = [f_1(x_i), f_2(x_i), …, f_c(x_i)]^T, where z_i ∈ R^c. These real-valued functions are often called projection functions (Andoni and Indyk 2008; Wang, Kumar, and Chang 2010; Gong and Lazebnik 2011), and each function corresponds to one of the c projected dimensions;

• In the second stage, the real-valued vector z_i is encoded into a binary vector y_i, typically by thresholding. When the data have been normalized to have zero mean, which is the convention adopted by most methods, a common encoding approach is to use the function sgn(x), where sgn(x) = 1 if x ≥ 0 and 0 otherwise. For a matrix or a vector, sgn(·) denotes the result of applying this function element-wise. Hence, letting y_i = sgn(z_i), we get the binary code of x_i. This also means that h_k(x_i) = sgn(f_k(x_i)).

We can see that the above sgn(·) function actually quantizes each projected dimension into one single bit with the threshold 0. As stated in the Introduction, this SBQ strategy is unreasonable, which motivates the DBQ work of this paper.
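As a concrete illustration of this two-stage strategy with SBQ, the following minimal sketch projects zero-centered data and thresholds at 0. The random projection matrix W here is a hypothetical stand-in: actual methods learn the projection functions (e.g., PCA directions refined by ITQ) rather than sampling them.

```python
import numpy as np

def sbq_encode(X, W):
    """Two-stage hashing with single-bit quantization (SBQ).

    Stage 1: project each point, z_i = W^T x_i (real-valued).
    Stage 2: threshold each projected value at 0, h_k(x) = sgn(f_k(x)).
    X: (n, d) zero-centered data; W: (d, c) projection matrix.
    Returns the (n, c) binary code matrix.
    """
    Z = X @ W                          # stage 1: c projected dimensions
    return (Z >= 0).astype(np.uint8)   # stage 2: element-wise sgn

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 16))
X -= X.mean(axis=0)                    # zero-center, as assumed in the text
W = rng.standard_normal((16, 8))       # hypothetical stand-in, c = 8 bits
print(sbq_encode(X, W))                # 5 points -> 8-bit codes
```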
Double-Bit Quantization

This section introduces our double-bit quantization (DBQ) in detail. First, we describe the motivation for DBQ based on observations from real data. Then, we propose the adaptive threshold learning algorithm for DBQ. Finally, we give some qualitative analysis and discussion of the performance of DBQ.

Observation and Motivation

Figure 1 illustrates the point distribution (histogram) of the real values before thresholding on one of the projected dimensions computed by PCA on the 22K LabelMe data set (Torralba, Fergus, and Weiss 2008), which will be used in our experiments. It clearly reveals that the point density is highest near the mean, which is zero here. Note that, unless otherwise stated, we assume the data have been normalized to have zero mean, which is a typical choice of existing methods.

The popular coding strategy SBQ, which adopts zero as the threshold, is shown in Figure 1(a). Due to the thresholding, the intrinsic neighboring structure of the original space is destroyed. For example, points A, B, C, and D are four points sampled from the X-axis of the point distribution.
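The effect can be reproduced numerically. In the sketch below, the coordinates of A, B, C, and D are hypothetical values chosen to mimic Figure 1(a): B and C sit just on opposite sides of the zero threshold, while A and D lie far out in the tails.

```python
# Hypothetical projected values mimicking points A, B, C, D in Figure 1(a).
points = {"A": -2.5, "B": -0.05, "C": 0.05, "D": 2.5}

# SBQ: one bit per dimension, threshold 0.
sbq = {name: int(v >= 0) for name, v in points.items()}
print(sbq)   # {'A': 0, 'B': 0, 'C': 1, 'D': 1}

# B and C are only 0.1 apart yet receive different bits, while
# A and B are 2.45 apart yet share the same bit 0.
```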