Because the optimal semantic combination weight differs across datasets, this experiment sets the weight to γ = 1 for WS353, γ = 1 for SCWS, and γ = 15 for RW.

Table 2  Spearman correlation coefficients of KbEMF compared with other approaches on different datasets

Method              WS353     SCWS      RW
EMF                 0.7918    0.6474    0.6786
Retro(CBOW)         0.7816    0.6685    0.6071
Retro(Skip-gram)    0.6930    0.6449    0.7143
SWE                 0.7965    0.6593    0.6429
KbEMF               0.7999    0.6740    0.7500

As Table 2 shows, KbEMF improves the Spearman correlation coefficient on all three datasets. Unlike Retro, KbEMF incorporates semantic knowledge-base information already during the corpus-based embedding learning stage, and unlike SWE it also exploits the global co-occurrence statistics of the corpus, which is why it performs best overall. The improvement is especially pronounced on RW, indicating that incorporating semantic knowledge-base information helps to learn better vectors for rare words.
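For reference, word-similarity benchmarks such as WS353, SCWS, and RW are usually scored by ranking word pairs with the cosine similarity of their vectors and comparing that ranking against human judgments via Spearman's rank correlation. The following Python sketch shows this evaluation procedure; the benchmark file format, the variable names, and the in-memory vector dictionary are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    # Cosine similarity between two word vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def evaluate_similarity(vectors, pairs):
    """vectors: dict mapping word -> np.ndarray.
    pairs: list of (word1, word2, human_score) benchmark triples.
    Returns Spearman's rho between model and human similarity rankings."""
    model_scores, human_scores = [], []
    for w1, w2, gold in pairs:
        if w1 in vectors and w2 in vectors:   # skip out-of-vocabulary pairs
            model_scores.append(cosine(vectors[w1], vectors[w2]))
            human_scores.append(gold)
    rho, _ = spearmanr(model_scores, human_scores)
    return rho

# Hypothetical usage with a benchmark file whose lines read "tiger cat 7.35":
# pairs = [(w1, w2, float(s)) for w1, w2, s in
#          (line.split() for line in open("ws353.txt", encoding="utf-8"))]
# print(evaluate_similarity(vectors, pairs))
```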
4 Conclusion

Learning effective word vectors is essential for natural language processing. Word vectors learned from a corpus alone cannot adequately capture the meanings of individual words or the complex relations between them. This paper therefore extracts valuable semantic information from a rich knowledge base and uses it as supervisory constraints on purely corpus-based learning, proposing a matrix-factorization word-embedding model that integrates semantic information and substantially improves the quality of the word vectors. In the experiments, Enwik9 serves as the training corpus and WordNet as the prior knowledge base, and the learned word vectors are evaluated on two tasks, word similarity measurement and word analogy reasoning (sketched below), which fully demonstrates the advantages of the proposed model.
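As a concrete illustration of the word-analogy task, the standard 3CosAdd procedure answers questions of the form "a is to b as c is to ?" by returning the vocabulary word whose vector is closest to vec(b) − vec(a) + vec(c). The sketch below assumes the embeddings are held in a word-to-array dictionary; it is a generic illustration, not the paper's implementation.

```python
import numpy as np

def answer_analogy(vectors, a, b, c):
    """Solve 'a : b :: c : ?' with 3CosAdd over unit-normalized vectors."""
    norm = {w: v / np.linalg.norm(v) for w, v in vectors.items()}
    target = norm[b] - norm[a] + norm[c]
    best_word, best_score = None, float("-inf")
    for w, v in norm.items():
        if w in (a, b, c):            # the question words themselves are excluded
            continue
        score = float(np.dot(v, target))
        if score > best_score:
            best_word, best_score = w, score
    return best_word

# e.g. answer_analogy(vectors, "man", "king", "woman") should return "queen"
# for embeddings that capture this analogy.
```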
In future work, we will explore incorporating other knowledge bases (such as PPDB and WAN), extracting more types of semantic information from them (such as part-whole relations and polysemy), and defining different, more targeted semantic constraint models to further improve the word vectors, which we will then apply to text mining and other natural language processing tasks.