7. References

[1] S. J. D. Prince and J. H. Elder, "Probabilistic linear discriminant analysis for inferences about identity," in IEEE International Conference on Computer Vision (ICCV), 2007, pp. 1–8.

[2] N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, "Front-end factor analysis for speaker verification," IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, no. 4, pp. 788–798, 2011.

[3] C. Li, X. Ma, B. Jiang, X. Li, X. Zhang, X. Liu, Y. Cao, A. Kannan, and Z. Zhu, "Deep speaker: an end-to-end neural speaker embedding system," CoRR, vol. abs/1705.02304, 2017.

[4] D. Snyder, D. Garcia-Romero, D. Povey, and S. Khudanpur, "Deep neural network embeddings for text-independent speaker verification," in Annual Conference of the International Speech Communication Association (INTERSPEECH), 2017, pp. 999–1003.

[5] D. Snyder, D. Garcia-Romero, G. Sell, D. Povey, and S. Khudanpur, "X-vectors: Robust DNN embeddings for speaker recognition," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 5329–5333.

[6] Y.-Q. Yu, L. Fan, and W.-J. Li, "Ensemble additive margin softmax for speaker verification," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 6046–6050.

[7] L. Fan, Q.-Y. Jiang, Y.-Q. Yu, and W.-J. Li, "Deep hashing for speaker identification and retrieval," in Annual Conference of the International Speech Communication Association (INTERSPEECH), 2019, pp. 2908–2912.

[8] J. S. Chung, A. Nagrani, and A. Zisserman, "Voxceleb2: Deep speaker recognition," in Annual Conference of the International Speech Communication Association (INTERSPEECH), 2018, pp. 1086–1090.

[9] D. Snyder, D. Garcia-Romero, G. Sell, A. McCree, D. Povey, and S. Khudanpur, "Speaker recognition for multi-speaker conversations using x-vectors," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019, pp. 5796–5800.

[10] J. Villalba, N. Chen, D. Snyder, D. Garcia-Romero, A. McCree, G. Sell, J. Borgstrom, F. Richardson, S. Shon, F. Grondin, R. Dehak, L. P. García-Perera, D. Povey, P. A. Torres-Carrasquillo, S. Khudanpur, and N. Dehak, "State-of-the-art speaker recognition for telephone and video speech: The JHU-MIT submission for NIST SRE18," in Annual Conference of the International Speech Communication Association (INTERSPEECH), 2019, pp. 1488–1492.

[11] D. Povey, G. Cheng, Y. Wang, K. Li, H. Xu, M. Yarmohammadi, and S. Khudanpur, "Semi-orthogonal low-rank matrix factorization for deep neural networks," in Annual Conference of the International Speech Communication Association (INTERSPEECH), 2018, pp. 3743–3747.

[12] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 2261–2269.

[13] Y. Jiang, Y. Song, I. McLoughlin, Z. Gao, and L.-R. Dai, "An effective deep embedding learning architecture for speaker verification," in Annual Conference of the International Speech Communication Association (INTERSPEECH), 2019, pp. 4040–4044.

[14] Z. Gao, Y. Song, I. McLoughlin, P. Li, Y. Jiang, and L.-R. Dai, "Improving aggregation and loss function for better embedding learning in end-to-end speaker verification system," in Annual Conference of the International Speech Communication Association (INTERSPEECH), 2019, pp. 361–365.

[15] A. Hajavi and A. Etemad, "A deep neural network for short-segment speaker recognition," in Annual Conference of the International Speech Communication Association (INTERSPEECH), 2019, pp. 2878–2882.

[16] J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2018, pp. 7132–7141.

[17] F. Wang, J. Cheng, W. Liu, and H. Liu, "Additive margin softmax for face verification," IEEE Signal Processing Letters, vol. 25, no. 7, pp. 926–930, 2018.

[18] J. Deng, J. Guo, N. Xue, and S. Zafeiriou, "Arcface: Additive angular margin loss for deep face recognition," in IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 4690–4699.

[19] A. Nagrani, J. S. Chung, and A. Zisserman, "Voxceleb: A large-scale speaker identification dataset," in Annual Conference of the International Speech Communication Association (INTERSPEECH), 2017, pp. 2616–2620.

[20] D. Povey, A. Ghoshal, G. Boulianne, L. Burget, O. Glembek, N. Goel, M. Hannemann, P. Motlicek, Y. Qian, P. Schwarz, J. Silovsky, G. Stemmer, and K. Vesely, "The kaldi speech recognition toolkit," in IEEE Workshop on Automatic Speech Recognition and Understanding, 2011.