Zhang Qingqing et al.: Continuous Speech Recognition Based on Convolutional Neural Networks

[...] features, the size and number of convolution filters, the computational cost, and the model size were compared in detailed experiments, and the results were set against the widely used deep neural network (DNN). A convolutional neural network (CNN) observes local features through its convolutional layers and then integrates that information in fully connected layers to produce the output probabilities, which gives it a clearer physical interpretation than a DNN. In addition, weight sharing greatly reduces the model complexity of the CNN. Experiments on several standard corpora show that, with less computation than the DNN, the CNN achieves better recognition performance and generalizes better.
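The weight-sharing argument in the conclusion can be made concrete with a parameter count for a single layer. The sketch below is only an illustration under assumed sizes: the 40 frequency bands, 11 context frames, 1024 hidden units, and 128 filters of height 8 are hypothetical values chosen for the example, not the configuration reported in the paper, and the helper names `dense_params` and `conv_params` are invented here. It compares parameter counts only and says nothing about computation or accuracy.

```python
def dense_params(n_inputs: int, n_outputs: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_outputs + n_outputs


def conv_params(filter_height: int, filter_width: int,
                in_channels: int, n_filters: int) -> int:
    """Weights plus biases of one convolutional layer.

    The count is independent of the input size because each filter
    is reused (shared) at every position it is applied to.
    """
    return filter_height * filter_width * in_channels * n_filters + n_filters


# Hypothetical input patch: 40 frequency bands x 11 frames, one channel.
bands, frames = 40, 11

# Fully connected layer: every input value has its own weight per hidden unit.
dense = dense_params(bands * frames, 1024)

# Convolution along frequency: each filter covers 8 bands x all 11 frames.
conv = conv_params(filter_height=8, filter_width=frames,
                   in_channels=1, n_filters=128)

print(f"fully connected layer: {dense} parameters")
print(f"convolutional layer:   {conv} parameters")
```

With these assumed sizes the convolutional layer needs far fewer parameters than the fully connected one, which is the effect the conclusion attributes to weight sharing. A complete CNN still ends in fully connected layers, so the saving for a whole model depends on the full architecture.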