Poster Session 5 MM ’21, October 20–24, 2021, Virtual Event, China
REFERENCES
[1] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473 (2014).
[2] Kshitij Bantupalli and Ying Xie. 2018. American sign language recognition using deep learning and computer vision. In 2018 IEEE International Conference on Big Data (Big Data). IEEE, 4896–4899.
[3] Jan Bungeroth and Hermann Ney. 2004. Statistical sign language translation. In Workshop on representation and processing of sign languages, LREC, Vol. 4. Citeseer, 105–108.
[4] Necati Cihan Camgoz, Simon Hadfield, Oscar Koller, and Richard Bowden. 2017. Subunets: End-to-end hand shape and continuous sign language recognition. In ICCV. IEEE, 3075–3084.
[5] N. C. Camgoz, S. Hadfield, O. Koller, H. Ney, and R. Bowden. 2018. Neural Sign Language Translation. In CVPR. 7784–7793. https://doi.org/10.1109/CVPR.2018.00812
[6] Necati Cihan Camgoz, Oscar Koller, Simon Hadfield, and Richard Bowden. 2020. Multi-channel Transformers for Multi-articulatory Sign Language Translation. arXiv preprint arXiv:2009.00299 (2020).
[7] Necati Cihan Camgoz, Oscar Koller, Simon Hadfield, and Richard Bowden. 2020. Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation. In CVPR. 10023–10033.
[8] Zhe Cao, Tomas Simon, Shih-En Wei, and Yaser Sheikh. 2017. Realtime multi-person 2d pose estimation using part affinity fields. In CVPR. 7291–7299.
[9] Xiujuan Chai, Guang Li, Yushun Lin, Zhihao Xu, Yili Tang, Xilin Chen, and Ming Zhou. 2013. Sign language recognition and translation with kinect. In IEEE Conf.
on AFGR, Vol. 655. 4.
[10] Runpeng Cui, Hu Liu, and Changshui Zhang. 2019. A deep neural framework for continuous sign language recognition by iterative training. IEEE Transactions on Multimedia 21, 7 (2019), 1880–1891.
[11] Amanda Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, and Xavier Giro-i Nieto. 2021. How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language. In Conference on Computer Vision and Pattern Recognition (CVPR).
[12] K. Grobel and M. Assan. 1997. Isolated sign language recognition using hidden Markov models. In SMC, Vol. 1. 162–167. https://doi.org/10.1109/ICSMC.1997.625742
[13] Dan Guo, Shuo Wang, Qi Tian, and Meng Wang. 2019. Dense Temporal Convolution Network for Sign Language Translation. In IJCAI. 744–750.
[14] Dan Guo, Wengang Zhou, Anyang Li, Houqiang Li, and Meng Wang. 2019. Hierarchical recurrent deep fusion using adaptive clip summarization for sign language translation. TIP 29 (2019), 1575–1590.
[15] Dan Guo, Wengang Zhou, Houqiang Li, and Meng Wang. 2018. Hierarchical lstm for sign language translation. In AAAI, Vol. 32.
[16] Dan Guo, Wengang Zhou, Meng Wang, and Houqiang Li. 2016. Sign language recognition based on adaptive hmms with data augmentation. In 2016 IEEE International Conference on Image Processing (ICIP). IEEE, 2876–2880.
[17] Jie Huang, Wengang Zhou, Houqiang Li, and Weiping Li. 2018. Attention-based 3D-CNNs for large-vocabulary sign language recognition. IEEE Transactions on Circuits and Systems for Video Technology 29, 9 (2018), 2822–2832.
[18] Jie Huang, Wengang Zhou, Qilin Zhang, Houqiang Li, and Weiping Li. 2018. Video-based sign language recognition without temporal segmentation. In AAAI.
[19] Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
[20] Oscar Koller, Cihan Camgoz, Hermann Ney, and Richard Bowden. 2019.
Weakly supervised learning with multi-stream CNN-LSTM-HMMs to discover sequential parallelism in sign language videos. TPAMI (2019).
[21] Oscar Koller, Sepehr Zargaran, and Hermann Ney. 2017. Re-sign: Re-aligned end-to-end sequence modelling with deep recurrent CNN-HMMs. In CVPR. 4297–4305.
[22] Dongxu Li, Chenchen Xu, Xin Yu, Kaihao Zhang, Benjamin Swift, Hanna Suominen, and Hongdong Li. 2020. TSPNet: Hierarchical Feature Learning via Temporal Semantic Pyramid for Sign Language Translation. In Advances in Neural Information Processing Systems, Vol. 33.
[23] Chin-Yew Lin. 2004. Rouge: A package for automatic evaluation of summaries. In Text summarization branches out. 74–81.
[24] Tao Liu, Wengang Zhou, and Houqiang Li. 2016. Sign language recognition with long short-term memory. In ICIP. IEEE, 2871–2875.
[25] Alptekin Orbay and Lale Akarun. 2020. Neural sign language translation by learning tokenization. arXiv preprint arXiv:2002.00479 (2020).
[26] Kishore Papineni, Salim Roukos, Todd Ward, and Wei-Jing Zhu. 2002. BLEU: a method for automatic evaluation of machine translation. In ACL. 311–318.
[27] Lionel Pigou, Mieke Van Herreweghe, and Joni Dambre. 2017. Gesture and sign language recognition with temporal residual networks. In Proceedings of the IEEE International Conference on Computer Vision Workshops. 3086–3093.
[28] Junfu Pu, Wengang Zhou, Hezhen Hu, and Houqiang Li. 2020. Boosting Continuous Sign Language Recognition via Cross Modality Augmentation. In Proceedings of the 28th ACM International Conference on Multimedia. 1497–1505.
[29] Junfu Pu, Wengang Zhou, and Houqiang Li. 2019. Iterative alignment network for continuous sign language recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 4165–4174.
[30] Zhaofan Qiu, Ting Yao, and Tao Mei. 2017. Learning spatio-temporal representation with pseudo-3d residual networks. In CVPR. 5533–5541.
[31] Lei Shi, Yifan Zhang, Jian Cheng, and Hanqing Lu. 2019.
Skeleton-based action recognition with directed graph neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7912–7921.
[32] Karen Simonyan and Andrew Zisserman. 2014. Two-stream convolutional networks for action recognition in videos. arXiv preprint arXiv:1406.2199 (2014).
[33] Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).
[34] Ke Sun, Bin Xiao, Dong Liu, and Jingdong Wang. 2019. Deep high-resolution representation learning for human pose estimation. In CVPR. 5693–5703.
[35] Ilya Sutskever, Oriol Vinyals, and Quoc V Le. 2014. Sequence to sequence learning with neural networks. In NIPS. 3104–3112.
[36] Ao Tang, Ke Lu, Yufei Wang, Jie Huang, and Houqiang Li. 2015. A real-time hand posture recognition system using deep neural networks. ACM Transactions on Intelligent Systems and Technology (TIST) 6, 2 (2015), 1–23.
[37] Dominique Uebersax, Juergen Gall, Michael Van den Bergh, and Luc Van Gool. 2011. Real-time sign language letter and word recognition from depth data. In 2011 IEEE international conference on computer vision workshops (ICCV Workshops). IEEE, 383–390.
[38] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. 2017. Attention is all you need. In NIPS. 5998–6008.
[39] Subhashini Venugopalan, Marcus Rohrbach, Jeffrey Donahue, Raymond Mooney, Trevor Darrell, and Kate Saenko. 2015. Sequence to sequence-video to text. In ICCV. 4534–4542.
[40] Limin Wang, Yuanjun Xiong, Zhe Wang, Yu Qiao, Dahua Lin, Xiaoou Tang, and Luc Van Gool. 2016. Temporal segment networks: Towards good practices for deep action recognition. In European conference on computer vision. Springer, 20–36.
[41] Sijie Yan, Yuanjun Xiong, and Dahua Lin. 2018. Spatial Temporal Graph Convolutional Networks for Skeleton-Based Action Recognition. In AAAI.
[42] Siyuan Yang, Jun Liu, Shijian Lu, Meng Hwa Er, and Alex C Kot. 2020. Collaborative learning of gesture recognition and 3D hand pose estimation with multi-order feature analysis. In European Conference on Computer Vision. Springer, 769–786.
[43] Zhaoyang Yang, Zhenmei Shi, Xiaoyong Shen, and Yu-Wing Tai. 2019. SF-Net: Structured Feature Network for Continuous Sign Language Recognition. arXiv preprint arXiv:1908.01341 (2019).
[44] Jihai Zhang, Wengang Zhou, and Houqiang Li. 2014. A threshold-based hmm-dtw approach for continuous sign language recognition. In ICIMCS. 237–240.
[45] Hao Zhou, Wengang Zhou, Yun Zhou, and Houqiang Li. 2020. Spatial-Temporal Multi-Cue Network for Continuous Sign Language Recognition. In AAAI. 13009–13016.
[46] Hao Zhou, Wengang Zhou, Yun Zhou, and Houqiang Li. 2021. Spatial-temporal multi-cue network for sign language recognition and translation. IEEE Transactions on Multimedia (2021).