CAAI Transactions on Intelligent Systems, Vol. 15 No. 2, Mar. 2020
DOI: 10.11992/tis.201811022
Online publication address: http://kns.cnki.net/kcms/detail/23.1538.TP.20190520.1347.006.html

Knowledge graph embedding based on similarity negative sampling (基于相似性负采样的知识图谱嵌入)

RAO Guanjun, GU Tianlong, CHANG Liang, BIN Chenzhong, QIN Saige, XUAN Wen
(Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin 541004, China)

Abstract: Existing knowledge graph embedding models generate negative triples by drawing an entity at random from the entity set. The resulting negative triples are of low quality, which weakens the model's ability to learn the features of entities and relations. This paper studies the factors that affect the quality of negative triples and proposes a negative sampling method based on entity similarity to generate high-quality negative triples. The method first uses the K-Means clustering algorithm to partition all entities into groups. Then, for each positive triple, it selects an entity from the cluster that contains the head entity to replace the head entity, and replaces the tail entity in the same way. Combining similarity negative sampling with TransE yields TransE-SNS. Experimental results show that TransE-SNS achieves significant improvements on the link prediction and triple classification tasks.

Keywords: knowledge graph; representation learning; random sampling; similarity negative sampling; K-Means clustering; stochastic gradient descent; link prediction; triple classification

CLC number: TP391  Document code: A  Article ID: 1673-4785(2020)02-0218-09

Citation: RAO Guanjun, GU Tianlong, CHANG Liang, et al. Knowledge graph embedding based on similarity negative sampling[J]. CAAI transactions on intelligent systems, 2020, 15(2): 218-226.

The concept of the knowledge graph was formally proposed by Google in 2012, mainly to improve search engine performance. With the arrival of the big data era, knowledge graphs have grown rapidly in scale, and a variety of large-scale knowledge graphs have appeared (such as Freebase[1], WordNet[2], and NELL[3]). Current knowledge

Received: 2018-12-04. Published online: 2019-05-21.
Funding: National Natural Science Foundation of China (U1501252, 61572146); Guangxi Innovation-Driven Development Major Project (AA17202024); Guangxi Natural Science Foundation (2016GXNSFDA380006); Basic Ability Improvement Project for Young and Middle-aged Teachers in Guangxi Universities (2018KYD203); Innovation Project of Guangxi Graduate Education (YCSW2018139).
Corresponding author: BIN Chenzhong. E-mail: cz_bin@guet.edu.cn.
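The sampling procedure summarized in the abstract (cluster the entities with K-Means, then replace the head or tail of a positive triple with an entity from the same cluster) can be illustrated with a minimal Python sketch. Several details here are assumptions rather than the authors' implementation: the abstract does not say which entity vectors are clustered, so the sketch simply takes a list of vectors as input; choosing between head and tail replacement with equal probability and discarding candidates that are already known triples are common conventions, not steps stated in the text; and the function names `kmeans`, `build_clusters`, and `corrupt` are hypothetical.

```python
import random
from collections import defaultdict


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's K-Means; returns one cluster label per vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        # An empty cluster keeps its previous centroid.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def build_clusters(labels):
    """Map each cluster label to the list of entity ids it contains."""
    clusters = defaultdict(list)
    for entity_id, label in enumerate(labels):
        clusters[label].append(entity_id)
    return clusters


def corrupt(triple, labels, clusters, known_triples, rng):
    """Build one negative triple by swapping the head or the tail for
    another entity from the same cluster.

    Assumptions (not stated in the source): head vs. tail is chosen with
    probability 0.5, and candidates that are known positives are skipped.
    Returns None when the cluster offers no valid substitute.
    """
    head, relation, tail = triple
    replace_head = rng.random() < 0.5
    target = head if replace_head else tail
    candidates = []
    for entity in clusters[labels[target]]:
        if entity == target:
            continue
        negative = (entity, relation, tail) if replace_head else (head, relation, entity)
        if negative not in known_triples:
            candidates.append(negative)
    return rng.choice(candidates) if candidates else None


# Toy demo: six entities with 1-D stand-in vectors and two known triples.
entity_vectors = [[0.0], [1.0], [2.0], [100.0], [101.0], [102.0]]
labels = kmeans(entity_vectors, k=2)
clusters = build_clusters(labels)
known_triples = {(0, 0, 3), (1, 0, 4)}
rng = random.Random(42)
print(corrupt((0, 0, 3), labels, clusters, known_triples, rng))
```

Because the substitute comes from the same cluster as the entity it replaces, the negative triple stays close to the positive one, which is what the abstract means by a higher-quality negative. In a real training loop the vectors would have many dimensions and the clustering would more likely come from a library such as scikit-learn; the hand-written K-Means is used here only to keep the sketch dependency-free.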