192 Q. Cui et al.

The contributions of this paper are summarized as follows:

– We study the novel fine-grained hashing topic to leverage the search and storage efficiency of hash codes for solving the challenging large-scale fine-grained image retrieval problem.
– We propose a unified end-to-end trainable network, i.e., ExchNet, to first learn fine-grained tailored features and then generate the final binary hash codes.
Particularly, the proposed attention constraints, local feature alignment and anchor-based learning fashion contribute well to obtaining discriminative fine-grained representations.
– We conduct extensive experiments on five fine-grained datasets to validate both the effectiveness and the efficiency of our proposed ExchNet. Especially on large-scale datasets, ExchNet exhibits superior retrieval performance in terms of speedup, memory usage and retrieval accuracy.

2 Related Work

Fine-Grained Image Retrieval. Fine-Grained Image Retrieval (FGIR) is an active research topic that has emerged in recent years, where the database and query images could share small inter-class variance but large intra-class variance. In previous works [36], handcrafted features were initially utilized to tackle the FGIR problem. Powered by deep learning techniques, more and more deep learning based FGIR methods [19,26,31–33,36,41,42] were proposed. These deep methods can be roughly divided into two groups, i.e., supervised and unsupervised methods. In supervised methods, FGIR is defined as a metric learning problem. Zheng et al. [41] designed a novel ranking loss and a weakly-supervised attractive feature extraction strategy to facilitate the retrieval performance. Zheng et al. [42] improved their former work [41] with a normalize-scale layer and de-correlated ranking loss. As to unsupervised methods, Selective Convolutional Descriptor Aggregation (SCDA) [31] was proposed to first localize the main object in fine-grained images, and then discard the noisy background and keep useful deep descriptors for fine-grained image retrieval.

Deep Hashing. Hashing methods can be divided into two categories, i.e., data-independent methods [6] and data-dependent methods [10,17], based on whether training points are used to learn hash functions.
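To make the data-independent case concrete, the following is a minimal sketch under stated assumptions, not the method of [6] and not ExchNet. It uses random-hyperplane sign hashing, a well-known scheme whose hash functions are drawn without looking at any training points, and ranks database items by Hamming distance between binary codes. All function names, the 16-bit code length and the toy vectors are hypothetical choices made for illustration.

```python
import random


def make_random_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes; no training data is involved."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def hash_code(vector, hyperplanes):
    """Map a real-valued vector to a binary code, one sign bit per hyperplane."""
    code = 0
    for plane in hyperplanes:
        dot = sum(v * w for v, w in zip(vector, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def hamming_distance(a, b):
    """Count the bit positions in which two codes differ."""
    return bin(a ^ b).count("1")


def retrieve(query_code, database_codes, top_k=3):
    """Return indices of the top_k database codes nearest in Hamming distance."""
    order = sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
    return order[:top_k]


# Toy usage: three 4-dimensional feature vectors hashed to 16-bit codes.
planes = make_random_hyperplanes(dim=4, n_bits=16)
database = [
    [1.0, 0.2, -0.3, 0.5],
    [-1.0, -0.2, 0.3, -0.5],
    [0.9, 0.1, -0.2, 0.6],
]
codes = [hash_code(x, planes) for x in database]
query = hash_code([1.0, 0.2, -0.3, 0.5], planes)
nearest = retrieve(query, codes, top_k=1)
```

Each item is stored as a single small integer instead of a floating-point vector, and comparing two items reduces to an XOR followed by a bit count. A data-dependent (L2H) method would instead fit the hash functions to training data rather than sampling them at random.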
Generally speaking, data-dependent methods, also known as Learning to Hash (L2H) methods, can achieve better retrieval performance by learning from training data. With the rise of deep learning, some L2H methods integrate deep feature learning into hashing frameworks and achieve promising performance. Many deep hashing methods [2,3,7,14,16,17,21,22,30,35,38,39] for large-scale image retrieval have previously been proposed. Compared with deep unsupervised hashing methods [7,14,21], deep supervised hashing methods [14,16,17,35] can achieve superior retrieval accuracy as they can fully explore the semantic information. Specifically, the previous work [35] was essentially a two-stage method