CAAI Transactions on Intelligent Systems, Vol. 8 No. 3, Jun. 2013
DOI: 10.3969/j.issn.1673-4785.201211023
Online publication address: http://www.cnki.net/kcms/detail/23.1538.TP.20130515.0927.005.html

A survey of statistical machine translation using paraphrasing technology

HU Jinming 1,2, SHI Xiaodong 1,2, SU Jinsong 3, CHEN Yidong 1,2
(1. School of Information Science and Engineering, Xiamen University, Xiamen 361005, China; 2. Fujian Key Laboratory of the Brain-like Intelligent Systems, Xiamen University, Xiamen 361005, China; 3. College of Software, Xiamen University, Xiamen 361005, China)

Abstract: Based on an analysis of the current state of research on statistical machine translation (SMT) that incorporates paraphrasing technology, this paper proposes research directions worth pursuing. It first summarizes the concept of paraphrase and the main classes of methods that bring paraphrasing into SMT. It then reviews, compares, and analyzes the mainstream methods for applying paraphrase knowledge to SMT in four areas: translation model training, parameter tuning, rewriting of input sentences, and automatic machine translation evaluation. This review shows that paraphrasing and SMT are closely related, and that the key issues in applying paraphrasing to SMT are the correctness and diversity of the paraphrases. Finally, the paper points out that improving the precision of paraphrase resources, building joint models of paraphrasing and machine translation, and adopting new methods to address data sparseness are topics that need further study.

Keywords: paraphrasing technology; machine translation; statistical machine translation

CLC number: TP391  Document code: A  Article ID: 1673-4785(2013)03-0199-09

Citation: HU Jinming, SHI Xiaodong, SU Jinsong, et al. A survey of statistical machine translation using paraphrasing technology[J]. CAAI Transactions on Intelligent Systems, 2013, 8(3): 199-207.

Received: 2012-11-16. Published online: 2013-05-15.
Funding: National Key Technology R&D Program of China (2012BAH14F03); National Natural Science Foundation of China (60573189, 61005052); Natural Science Foundation of Fujian Province (2006J0043).
Corresponding author: SHI Xiaodong. E-mail: mandel@xmu.edu.cn.

Machine translation (MT) is the use of computer programs to translate from one natural language into another. It belongs to the field of computational linguistics. After decades of research, machine translation has made considerable progress in both theory and practice. From a methodological point of view, current mainstream research uses statistical methods. Statistical machine translation (SMT) builds a statistical translation model through statistical analysis of large bilingual parallel corpora,