CAAI Transactions on Intelligent Systems, Vol. 15, No. 6, Nov. 2020
DOI: 10.11992/tis.201903033
CLC number: TP311  Document code: A  Article ID: 1673-4785(2020)06-1163-12
Citation format: LI Pan, ZHANG Xiaoyan, MENG Xiangfu, et al. Spatial keyword personalized and semantic approximate query approach[J]. CAAI Transactions on Intelligent Systems, 2020, 15(6): 1163–1174.
Spatial keyword personalized and semantic approximate query approach

LI Pan, ZHANG Xiaoyan, MENG Xiangfu, ZHAO Lulu, QI Xueyue
(School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China)

Abstract: Most existing spatial keyword query processing models support only location proximity and text similarity matching, so spatial objects that are semantically similar to the query but differ from it in form cannot be returned to users. Furthermore, current spatial-text index structures cannot process the numerical attributes of spatial objects. To address these problems, this paper proposes a spatial keyword query method that supports semantic approximate queries. First, word embedding is used to expand the user's original query, generating a series of query keywords semantically related to the original ones. Then, a hybrid index structure, AIR-Tree, is proposed that supports both text and semantic matching and uses the Skyline method to process numerical attributes. Finally, AIR-Tree is used for query matching to return the top-k ordered spatial objects most relevant to the query conditions. Experimental results show that, compared with existing methods of the same kind, the proposed method achieves higher execution efficiency and better user satisfaction: query efficiency with the AIR-Tree index is 3.6% higher than with the IRS-Tree index, and the accuracy of the query results is 10.14% and 16.15% higher than with the IR-Tree and IRS-Tree indexes, respectively.
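Keywords: spatial keyword query; word embedding; semantic approximate query; text; numerical attribute; index structure; query matching

Received: 2019-03-25.
Foundation item: General Program of the National Natural Science Foundation of China (61772249).
Corresponding author: MENG Xiangfu. E-mail: marxi@126.com.

The widespread use of mobile networks and the emergence of large numbers of spatial Web objects have made spatial keyword query an important supporting technology for LBS (location-based systems). Existing spatial keyword query processing models mainly comprise the top-k range query and the top-k k-nearest-neighbor query (top-k kNN query). Both models build a result scoring function from the text similarity and the location proximity between a spatial object and the spatial keyword query, and then use hybrid text-spatial indexing techniques to improve query efficiency. Existing spatial-text index techniques that combine spatial data with text information mainly include IR-Tree[1], IR2-Tree[2], QuadTree[3], R*-Tree[4], and S2I[5]; text search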