TABLE IX
COMPARISON WITH DEEP BASELINES ON NUS-WIDE DATASET.

    Method    MAP (%)                                      Time
              I → T          T → I          Avg.
    UGACH     61.3           60.3           60.8           N/A
    CMHH      66.82±1.02     67.35±0.84     67.08±0.91     48128.5
    DCMH      64.79±0.19     68.84±0.28     66.62±0.18     35777.2
    DLFH      80.67±0.77     75.76±1.68     78.22±1.44     36.0
    KDLFH     82.32±1.23     76.04±1.99     79.18±1.78     6651.9

IAPR-TC12, MIRFLICKR-25K and NUS-WIDE datasets, respectively. Other cases with different numbers of bits show a similar phenomenon. We use boldface to denote the cases where the average accuracy is better than that of the baselines. We can see that KDLFH and DLFH can outperform deep CMH methods in most cases. Furthermore, KDLFH and DLFH are more efficient than the deep CMH methods in all cases, even though KDLFH and DLFH are trained on a CPU while the deep CMH methods are trained on a GPU.

H. Comparison with LFH-extension

To verify the effectiveness of the proposed method, we extend LFH [50] to cross-modal hashing and adopt this extension as a baseline. Specifically, we utilize the LFH method to learn unified binary codes for the two modalities. Then we learn two linear functions to perform out-of-sample extension. We denote this method as LFHcm.
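Such linear out-of-sample functions can be obtained in closed form by regressing each modality's features onto the learned binary codes. Below is a minimal sketch, assuming an n x d feature matrix X, an n x c code matrix B with entries in {-1, +1}, and a ridge weight reg; these names and the ridge regularizer are our illustrative choices, not the paper's notation:

    import numpy as np

    def learn_linear_hash(X, B, reg=1e-2):
        # Closed-form ridge regression: W = (X^T X + reg*I)^{-1} X^T B,
        # so that sign(X @ W) approximates the learned codes B.
        d = X.shape[1]
        return np.linalg.solve(X.T @ X + reg * np.eye(d), X.T @ B)

    def hash_out_of_sample(x, W):
        # Binarize the linear projection of an unseen query;
        # map exact zeros to +1 so codes stay in {-1, +1}.
        return np.where(x @ W >= 0, 1, -1)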
We present the MAP results on the NUS-WIDE dataset in Figure 5. We can see that DLFH outperforms LFHcm thanks to its discrete learning and bit-wise learning strategies.

Fig. 5. Comparison with LFHcm on NUS-WIDE dataset (MAP with code lengths of 8, 16, 32 and 64 bits).

To verify that DLFH learns more highly uncorrelated binary codes because of its bit-wise learning strategy, we present the correlation matrix E of the learned binary codes for DLFH and LFHcm. We also calculate the mean Absolute Correlation (mAC) based on the following equation: $\mathrm{mAC} = \frac{2\sum_{i=1}^{c}\sum_{j<i}|E_{ij}|}{c(c-1)}$, where $E = \{E_{ij}\}_{i,j=1}^{c} \in \mathbb{R}^{c\times c}$ is the correlation matrix of the learned binary codes.
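For concreteness, mAC can be computed directly from the learned codes. The following is a minimal sketch assuming U is an n x c matrix of binary codes with no constant bit column (function and variable names are ours):

    import numpy as np

    def mean_absolute_correlation(U):
        # E is the c x c correlation matrix between the c bit columns.
        c = U.shape[1]
        E = np.corrcoef(U, rowvar=False)
        # Average |E_ij| over the c*(c-1)/2 pairs with j < i,
        # matching mAC = 2 * sum_{i, j<i} |E_ij| / (c*(c-1)).
        off_diag = np.abs(E[np.tril_indices(c, k=-1)])
        return 2.0 * off_diag.sum() / (c * (c - 1))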
We show the correlation matrices (absolute values) of the learned binary codes U and V for DLFH in Figure 6 (a) and (b); the correlation matrix for LFHcm is shown in Figure 6 (c). From Figure 6, we can see that most values in the correlation matrices of DLFH are lower than the corresponding values in the correlation matrix of LFHcm. Furthermore, we can find that DLFH achieves a lower mAC than LFHcm. Hence, DLFH can learn more highly uncorrelated binary codes than LFHcm.

Fig. 6. Correlation matrix and mAC on NUS-WIDE dataset. (a) DLFH image database codes, mAC = 0.223. (b) DLFH text database codes, mAC = 0.234. (c) LFHcm database codes, mAC = 0.296.

I. Sensitivity to Hyper-Parameters

We study the influence of γx, ηx, λ and the number of sampled points m on the IAPR-TC12, MIRFLICKR-25K and NUS-WIDE datasets.

We first present the MAP values for different γx in the range [10^-4, 10^3] with the code length being 16 bits. The results are shown in Figure 7. We can find that DLFH is not sensitive to γx in a large range, i.e., when 10^-4 ≤ γx ≤ 1. Furthermore, from Figure 8, we can find that DLFH is not sensitive to ηx in a large range, i.e., when 10^-3 ≤ ηx ≤ 1.

Fig. 7. MAP values with different γx on three datasets. (a) IAPR-TC12. (b) MIRFLICKR-25K. (c) NUS-WIDE.

Fig. 8. MAP values with different ηx on three datasets. (a) IAPR-TC12. (b) MIRFLICKR-25K. (c) NUS-WIDE.

We also report the MAP values for different λ in the range [1, 24] with the code length being 16 bits. The results are shown in Figure 9. We can find that DLFH is not sensitive to λ in a large range, i.e., when 6 ≤ λ ≤ 10.

Fig. 9. MAP values with different λ on three datasets. (a) IAPR-TC12. (b) MIRFLICKR-25K. (c) NUS-WIDE.

Furthermore, we present the influence of m in Figure 10. We can find that as the number of sampled points increases, the accuracy increases at the beginning and then remains unchanged. As more sampled points require higher computation cost, we simply set m = c in our experiments to get a good tradeoff between accuracy and efficiency for DLFH and KDLFH.
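The cost side of this tradeoff can be illustrated with a schematic sketch of the sampling step, in which each iteration only touches the similarity entries of m sampled points; this is a simplified illustration under our own naming, not the exact update rule of DLFH:

    import numpy as np

    def sample_similarity_columns(S, m, rng=None):
        # Keep only the similarity columns of m randomly sampled points:
        # one iteration then costs O(n*m) instead of O(n^2) entries,
        # so the per-iteration cost grows linearly with m.
        if rng is None:
            rng = np.random.default_rng()
        n = S.shape[0]
        idx = rng.choice(n, size=m, replace=False)
        return idx, S[:, idx]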