Table 3. MAP (%) for hashing based NDVR methods.

| Dataset | LSH 16 bits | LSH 32 bits | ITQ 16 bits | ITQ 32 bits | IsoH 16 bits | IsoH 32 bits | HDML 16 bits | HDML 32 bits |
|---|---|---|---|---|---|---|---|---|
| CCWEB | 68.12 | 83.15 | 70.16 | 87.14 | 72.24 | 86.75 | 82.72 | 90.23 |
| VCDB | 10.33 | 30.88 | 10.68 | 33.31 | 10.60 | 33.30 | 35.96 | 68.92 |
| SVD | 4.34 | 28.36 | 5.16 | 30.14 | 4.85 | 30.88 | 6.47 | 31.59 |
| SVD_Cropping | 0.32 | 2.65 | 0.70 | 4.41 | 0.96 | 4.01 | 1.23 | 5.39 |
| SVD_BlackBorder | 0.76 | 4.61 | 1.18 | 7.08 | 1.15 | 5.58 | 1.61 | 10.54 |
| SVD_Rotation | 0.06 | 0.09 | 0.04 | 0.43 | 0.07 | 0.24 | 0.54 | 1.95 |
| SVD_Speeding | 3.34 | 23.56 | 4.42 | 25.82 | 4.14 | 26.63 | 4.56 | 28.60 |

Table 4. Top-100 MAP (%), storage cost and retrieval time on all datasets.

| Methods | Dim/#bits | Top-100 MAP CCWEB | Top-100 MAP VCDB | Top-100 MAP SVD | Storage CCWEB | Storage VCDB | Storage SVD | Time CCWEB (ms) | Time VCDB (ms) | Time SVD (ms) |
|---|---|---|---|---|---|---|---|---|---|---|
| DML | 500D | 97.93 | 84.60 | 81.27 | 48.83M | 0.40G | 2.25G | 41.2 | 278.3 | 2203.5 |
| CNNL | 4096D | 97.88 | 84.48 | 61.04 | 99.96M | 3.29G | 18.42G | 266.6 | 2290.3 | 15887.3 |
| CNNV | 4096D | 97.86 | 79.44 | 25.10 | 99.96M | 3.29G | 18.42G | 266.6 | 2290.3 | 15887.3 |
| LSH+ | 16 bits | 98.29 | 66.55 | 76.02 | 0.06M | 0.60M | 3.37M | 1.4 | 17.8 | 88.2 |
| ITQ+ | 16 bits | 98.11 | 66.65 | 77.96 | 0.06M | 0.60M | 3.37M | 1.4 | 17.8 | 88.2 |
| IsoH+ | 16 bits | 97.92 | 66.58 | 78.19 | 0.06M | 0.60M | 3.37M | 1.4 | 17.8 | 88.2 |
| HDML+ | 16 bits | 97.74 | 77.96 | 76.29 | 0.06M | 0.60M | 3.37M | 1.4 | 17.8 | 88.2 |
| LSH+ | 32 bits | 97.81 | 67.19 | 78.80 | 0.09M | 0.80M | 4.49M | 2.5 | 24.8 | 174.8 |
| ITQ+ | 32 bits | 97.75 | 66.65 | 78.92 | 0.09M | 0.80M | 4.49M | 2.5 | 24.8 | 174.8 |
| IsoH+ | 32 bits | 97.79 | 67.01 | 79.00 | 0.09M | 0.80M | 4.49M | 2.5 | 24.8 | 174.8 |
| HDML+ | 32 bits | 97.69 | 78.36 | 78.63 | 0.09M | 0.80M | 4.49M | 2.5 | 24.8 | 174.8 |

5.4. Hashing based NDVR

Accuracy. In this section, we present the retrieval results of hashing based methods on all datasets. The MAP results are reported in Table 3. From Table 3, we can see that the retrieval accuracy of hashing based methods is not as good as that of real-value based NDVR methods on any dataset. Among the three datasets, CCWEB, VCDB and SVD, the retrieval accuracy is lowest on SVD. Furthermore, the MAP results on the transformed variants of SVD (SVD_Cropping, SVD_BlackBorder, SVD_Rotation and SVD_Speeding) are worse than those on SVD in all cases.

Reranking. We also carry out experiments that use reranking to improve the retrieval accuracy of hashing based methods. For reranking, we set N = 0.1 × M, where M is the number of videos in the database for each query (see footnote 7). Here the videos in the database comprise labeled videos and either background distraction videos or probable negative unlabeled videos. In Table 4, we report the top-100 MAP, the storage cost for the database and the average retrieval time per query. "LSH+" denotes the LSH algorithm with reranking, and the other notations are defined similarly. From Table 4, we can see that after reranking, the retrieval accuracy of hashing based methods is comparable with that of real-value based methods in most cases. Furthermore, the storage cost of hashing based methods is much smaller than that of real-value based methods. In addition, hashing based methods are much faster than real-value based methods. Hence, for large-scale applications, hashing based methods are usually more practical than real-value based methods.

Footnote 7: As the number of labeled videos differs across query videos, M also differs across query videos.

6. Conclusion

In this paper, we introduce a novel large-scale short video dataset, called SVD, for NDVR. This dataset contains over 500,000 short videos collected from a large video platform and over 30,000 labeled videos of near-duplicate videos. We utilize multiple mining strategies to mine hard positive/negative samples from massive short videos. Furthermore, we design some temporal and spatial transformations to mimic users' copy-and-edit behavior in real applications and construct more challenging variants of SVD. SVD is the first short video dataset, and it is also the largest dataset for NDVR. The release of SVD will foster the research of NDVR, especially NDVR for short videos.

7. Acknowledgement

This work is supported by the NSFC-NRF Joint Research Project (No. 61861146001) and the Program A for Outstanding Ph.D. candidate of Nanjing University. We thank Yubo Du and Ming-Wei Li for their help in data annotation and filtering. Lei Li and Wu-Jun Li are corresponding authors.
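As an illustration of the reranking setup described in Section 5.4, the following is a minimal sketch, not the authors' implementation. It assumes that reranking means ranking the database by Hamming distance on the binary codes and then re-sorting only the top N candidates by Euclidean distance between real-valued features; the distance choice, the function names (`hamming_rank`, `rerank`, `search`) and the `ratio` parameter are hypothetical and introduced here for illustration only.

```python
import numpy as np


def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query.

    query_code: 0/1 array of shape (n_bits,)
    db_codes:   0/1 array of shape (n_videos, n_bits)
    """
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    # A stable sort keeps database order among ties.
    return np.argsort(dists, kind="stable")


def rerank(query_feat, db_feats, ranking, n_rerank):
    """Re-sort the first n_rerank candidates by Euclidean distance
    on real-valued features (an assumed choice of distance)."""
    head, tail = ranking[:n_rerank], ranking[n_rerank:]
    d = np.linalg.norm(db_feats[head] - query_feat, axis=1)
    return np.concatenate([head[np.argsort(d, kind="stable")], tail])


def search(query_code, query_feat, db_codes, db_feats, ratio=0.1):
    """Hamming ranking followed by reranking of the top N = ratio * M items."""
    m = db_codes.shape[0]            # M: number of videos in the database
    n = max(1, int(ratio * m))       # N = 0.1 x M by default
    ranking = hamming_rank(query_code, db_codes)
    return rerank(query_feat, db_feats, ranking, n)


# Toy usage with 4 database videos and 4-bit codes (made-up data).
db_codes = np.array([[0, 0, 0, 0], [0, 0, 0, 1], [1, 1, 1, 1], [0, 0, 1, 1]])
db_feats = np.array([[5.0], [1.0], [0.0], [2.0]])
query_code = np.array([0, 0, 0, 0])
query_feat = np.array([1.0])
result = search(query_code, query_feat, db_codes, db_feats, ratio=0.5)
```

Only the N reranked candidates need their real-valued features read, which is why a scheme of this kind can keep most of the speed of pure Hamming ranking.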