to learn two different binary codes for the same training sample. The learning procedure is as follows:

$$u_i = g([\hat{G}_i; f_i^{global}]_{cat}) = \mathrm{sign}(W^{(g)}[\hat{G}_i; f_i^{global}]_{cat}), \quad (9)$$

$$v_i = h([\hat{G}_i; f_i^{global}]_{cat}) = \mathrm{sign}(W^{(h)}[\hat{G}_i; f_i^{global}]_{cat}), \quad (10)$$

where $[\cdot;\cdot]_{cat}$ denotes the concatenation operator, and $u_i, v_i \in \{-1,+1\}^q$ denote the two different binary codes of the $i$-th sample. $q$ represents the code length. $W^{(g)}$ and $W^{(h)}$ represent the parameters of the hash functions $g(\cdot)$ and $h(\cdot)$,¹ respectively. We denote $U = \{u_i\}_{i=1}^n$ and $V = \{v_i\}_{i=1}^n$ as the learned binary codes. Inspired by [14], we keep only the binary codes $v_i$ and define the hash function $h(\cdot)$ implicitly. Hence, we can perform feature learning and binary code learning simultaneously.

To preserve the pairwise similarity, we adopt the squared loss and define the following objective function:

$$L_{sq}(u_i, v_j, C) = \left(u_i^\top v_j - qS_{ij}\right)^2, \quad (11)$$

where $u_i = g([\hat{G}_i; f_i^{global}]_{cat})$, $S_{ij}$ is the pairwise similarity label, and $C = \{C_i\}_{i=1}^M$. We use $\Theta$ to denote the parameters of the deep neural network and the hash layer. The aforementioned process is illustrated in Fig. 4.

Due to the zero-gradient problem caused by the $\mathrm{sign}(\cdot)$ function, $L_{sq}(\cdot,\cdot,\cdot)$ becomes intractable to optimize. In this paper, we relax $g(\cdot) = \mathrm{sign}(\cdot)$ into $\hat{g}(\cdot) = \tanh(\cdot)$ to alleviate this problem. Then, we can derive the following loss function:

$$\hat{L}_{sq}(\hat{u}_i, v_j, C) = \left(\hat{u}_i^\top v_j - qS_{ij}\right)^2, \quad (12)$$

where $\hat{u}_i = \hat{g}([\hat{G}_i; f_i^{global}]_{cat})$ and $U$ is relaxed as $\hat{U} = \{\hat{u}_i\}_{i=1}^n$.
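To make the relaxation concrete, the following is a minimal PyTorch sketch of the hash branch and the relaxed pairwise loss of Eq. (12). It is illustrative only, not the authors' implementation; `HashLayer`, `relaxed_squared_loss`, and `feat_cat` are our names, with `feat_cat` standing for the concatenated $[\hat{G}_i; f_i^{global}]_{cat}$ vector.

```python
import torch
import torch.nn as nn


class HashLayer(nn.Module):
    """Hash branch g(.): a single linear map W^(g) without a bias term
    (see footnote 1), followed by tanh as the relaxed surrogate of sign."""

    def __init__(self, feat_dim: int, q: int):
        super().__init__()
        self.W_g = nn.Linear(feat_dim, q, bias=False)  # W^(g)

    def forward(self, feat_cat: torch.Tensor) -> torch.Tensor:
        # Training-time relaxation \hat{g}(.) = tanh(.); at retrieval time,
        # torch.sign of the same pre-activation yields the binary code u_i.
        return torch.tanh(self.W_g(feat_cat))


def relaxed_squared_loss(u_hat: torch.Tensor, v: torch.Tensor,
                         S: torch.Tensor, q: int) -> torch.Tensor:
    """Eq. (12) summed over all pairs: (u_hat_i^T v_j - q * S_ij)^2.

    u_hat: (n, q) relaxed codes from the network branch.
    v:     (m, q) binary codes in {-1, +1}^q kept as free variables.
    S:     (n, m) pairwise similarity labels.
    """
    inner = u_hat @ v.t()              # all inner products u_hat_i^T v_j
    return ((inner - q * S) ** 2).sum()


# Toy usage with random features (shapes only, not the paper's data):
layer = HashLayer(feat_dim=2048, q=32)
u_hat = layer(torch.randn(4, 2048))
```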
Then, given a set of image samples $X = \{x_1, \dots, x_n\}$ and their pairwise labels $S = \{S_{ij}\}_{i,j=1}^n$, we can get the following objective function by combining Eqs. (5), (7) and (12):

$$\min_{V, \Theta, C} \; L(X) = \sum_{i,j=1}^{n} \hat{L}_{sq}(\hat{u}_i, v_j; S_{ij}) + \lambda \sum_{i=1}^{n} L_{sp}(x_i) + \gamma \sum_{i=1}^{n} L_{cp}(x_i) \quad (13)$$

$$\mathrm{s.t.} \quad \forall i \in \{1, \dots, n\}, \; \hat{u}_i = \hat{g}([\hat{G}_i; f_i^{global}]_{cat}), \; v_j \in \{-1,+1\}^q,$$

where $S_{ij}$ represents the similarity between the $i$-th and $j$-th samples, $q$ denotes the code length, and $\lambda$ and $\gamma$ are hyper-parameters.

3.4 Learning Algorithm

To solve the optimization problem in Eq. (13), we design an alternating algorithm to learn $V$, $\Theta$, and $C$. Specifically, we learn one parameter with the others fixed.

¹ We omit the bias term for simplicity.
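The exact update rules for each subproblem follow in the remainder of Sec. 3.4 (not reproduced on this page). Purely to illustrate the alternating idea, below is a minimal sketch of the $V$-subproblem, assuming a standard bit-wise coordinate-descent update of the kind used in asymmetric hashing methods such as ADSH [14]; `update_V`, `n_sweeps`, and all tensor names are our own, not the paper's.

```python
import torch


def update_V(U_hat: torch.Tensor, S: torch.Tensor, V: torch.Tensor,
             q: int, n_sweeps: int = 3) -> torch.Tensor:
    """With the network (hence U_hat) fixed, minimize
    sum_{i,j} (u_hat_i^T v_j - q * S_ij)^2 over V in {-1,+1}^(m x q)
    by cyclic bit-wise coordinate descent: with all other bit columns
    fixed, the minimizer of bit column k is V[:, k] = sign(R^T u_k),
    where R is the residual q*S minus the other columns' contribution."""
    for _ in range(n_sweeps):
        for k in range(q):
            u_k = U_hat[:, k]                              # (n,) k-th bit of all relaxed codes
            # Residual of Eq. (12) with bit k's own contribution excluded.
            R = q * S - U_hat @ V.t() + torch.outer(u_k, V[:, k])
            v_k = torch.sign(R.t() @ u_k)                  # (m,) closed-form bit update
            v_k[v_k == 0] = 1.0                            # keep codes in {-1, +1}
            V[:, k] = v_k
    return V


# Toy alternating loop on random data (shapes only):
n, q = 8, 16
U_hat = torch.tanh(torch.randn(n, q))            # stand-in for the relaxed codes
S = (torch.rand(n, n) > 0.5).float() * 2 - 1     # random +/-1 similarity labels
V = (torch.rand(n, q) > 0.5).float() * 2 - 1
for _ in range(5):
    # Step 1 (omitted): fix V, update Theta and C by SGD on Eq. (13).
    # Step 2: fix Theta, update the binary codes V in closed form.
    V = update_V(U_hat, S, V, q)
```

The reason a closed-form step exists is that each bit $v_{jk}$ enters Eq. (12) quadratically with $v_{jk}^2 = 1$, so with all other bits fixed only the linear term matters and the optimal sign can be read off directly; no relaxation is needed for $V$.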