Column Sampling based Discrete Supervised Hashing

Wang-Cheng Kang, Wu-Jun Li and Zhi-Hua Zhou
National Key Laboratory for Novel Software Technology
Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023
Department of Computer Science and Technology, Nanjing University, China
kwc.oliver@gmail.com, {liwujun,zhouzh}@nju.edu.cn

Abstract

By leveraging semantic (label) information, supervised hashing has demonstrated better accuracy than unsupervised hashing in many real applications. Because the hashing-code learning problem is essentially a discrete optimization problem which is hard to solve, most existing supervised hashing methods try to solve a relaxed continuous optimization problem by dropping the discrete constraints. However, these methods typically suffer from poor performance due to the errors caused by the relaxation. Some other methods try to directly solve the discrete optimization problem. However, they are typically time-consuming and unscalable. In this paper, we propose a novel method, called column sampling based discrete supervised hashing (COSDISH), to directly learn the discrete hashing code from semantic information. COSDISH is an iterative method, in each iteration of which several columns are sampled from the semantic similarity matrix and then the hashing code is decomposed into two parts which can be alternately optimized in a discrete way. Theoretical analysis shows that the learning (optimization) algorithm of COSDISH has a constant-approximation bound in each step of the alternating optimization procedure. Empirical results on datasets with semantic labels illustrate that COSDISH can outperform the state-of-the-art methods in real applications like image retrieval.

Introduction

Although many different methods have been proposed for approximate nearest neighbor (ANN) search (Indyk and Motwani 1998), hashing has become one of the most popular candidates for ANN search because it can achieve better performance than other methods in real applications (Weiss, Torralba, and Fergus 2008; Kulis and Grauman 2009; Zhang et al. 2010; Zhang, Wang, and Si 2011; Zhang et al. 2012; Strecha et al. 2012; Zhen and Yeung 2012; Rastegari et al. 2013; Lin et al. 2013b; Xu et al. 2013; Jin et al. 2013; Zhu et al. 2013; Wang, Zhang, and Si 2013; Zhou, Ding, and Guo 2014; Yu et al. 2014).

Hashing methods fall into two main categories (Kong and Li 2012; Liu et al. 2012; Zhang et al. 2014): data-independent methods and data-dependent methods. Representative data-independent methods include locality-sensitive hashing (LSH) (Andoni and Indyk 2006) and its variants. Data-dependent hashing can be further divided into unsupervised and supervised hashing methods (Liu et al. 2012; Zhang et al. 2014). Unsupervised hashing methods, such as iterative quantization (ITQ) (Gong and Lazebnik 2011), isotropic hashing (IsoHash) (Kong and Li 2012), discrete graph hashing (DGH) (Liu et al. 2014) and scalable graph hashing (SGH) (Jiang and Li 2015), use only the feature information of the data points for learning, without using any semantic (label) information. In contrast, supervised hashing methods try to leverage semantic (label) information for hashing function learning. Representative supervised hashing methods include sequential projection learning for hashing (SPLH) (Wang, Kumar, and Chang 2010b), minimal loss hashing (MLH) (Norouzi and Fleet 2011), supervised hashing with kernels (KSH) (Liu et al. 2012), two-step hashing (TSH) (Lin et al. 2013a), latent factor hashing (LFH) (Zhang et al. 2014), FastH (Lin et al. 2014), graph cuts coding (GCC) (Ge, He, and Sun 2014) and supervised discrete hashing (SDH) (Shen et al. 2015).

Supervised hashing has attracted more and more attention in recent years because it has demonstrated better accuracy than unsupervised hashing in many real applications. Because the hashing-code learning problem is essentially a discrete optimization problem which is hard to solve, most existing supervised hashing methods, such as KSH (Liu et al. 2012), try to solve a relaxed continuous optimization problem by dropping the discrete constraints. However, these methods typically suffer from poor performance due to the errors caused by the relaxation, which has been verified by the experiments in (Lin et al. 2014; Shen et al. 2015). Some other methods, such as FastH (Lin et al. 2014), try to directly solve the discrete optimization problem. However, they are typically time-consuming and unscalable. Hence, they have to sample only a small subset of the entire dataset for training even if a large-scale training set is given, which cannot achieve satisfactory performance in real applications.

In this paper, we propose a novel method, called column sampling based discrete supervised hashing (COSDISH), to directly learn the discrete hashing code from semantic information. The main contributions of COSDISH are listed as follows:

• COSDISH is iterative, and in each iteration column sampling is performed to sample several columns from the semantic similarity matrix.

Copyright © 2016, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
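The iterative scheme described in the abstract — sample columns of the semantic similarity matrix, split the codes into a sampled part and a remaining part, then alternate discrete updates — can be sketched as follows. This is a simplified illustration under our own assumptions, not the paper's implementation: the function and parameter names are ours, the plain sign-based updates are heuristic placeholders for the paper's actual subproblem solvers, and the sketch does not carry the constant-approximation guarantee proved in the paper.

```python
import numpy as np

def cosdish_sketch(S, q, r, n_iter=5, seed=0):
    """Schematic COSDISH-style alternating optimization (simplified).

    S      : (n, n) semantic similarity matrix with entries in {-1, +1}.
    q      : number of columns sampled per iteration.
    r      : code length in bits.
    Returns an (n, r) matrix of {-1, +1} hashing codes.
    """
    rng = np.random.default_rng(seed)
    n = S.shape[0]
    B = rng.choice([-1.0, 1.0], size=(n, r))  # random initial codes

    for _ in range(n_iter):
        # Column sampling: pick q columns of the similarity matrix.
        omega = rng.choice(n, size=q, replace=False)
        gamma = np.setdiff1d(np.arange(n), omega)

        # Decompose the codes into the sampled part and the rest.
        B_omega, B_gamma = B[omega], B[gamma]

        # Alternating discrete updates (simplified sign heuristics):
        # (1) fix B_omega, update B_gamma from the sampled columns;
        B_gamma = np.sign(S[np.ix_(gamma, omega)] @ B_omega)
        # (2) fix B_gamma, update B_omega from both similarity blocks.
        B_omega = np.sign(S[np.ix_(omega, gamma)] @ B_gamma
                          + S[np.ix_(omega, omega)] @ B_omega)

        # np.sign can emit 0; resolve ties to +1 to keep codes binary.
        B_gamma[B_gamma == 0] = 1.0
        B_omega[B_omega == 0] = 1.0
        B[gamma], B[omega] = B_gamma, B_omega
    return B
```

Because each iteration only touches q sampled columns of S, the scheme never needs the full n-by-n similarity matrix in a dense solve, which is the source of the scalability claimed for column sampling.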