

CAAI Transactions on Intelligent Systems, Vol. 14 No. 4, Jul. 2019
DOI: 10.11992/tis.201806002
Online publication address: http://kns.cnki.net/kcms/detail/23.1538.TP.20180629.1153.004.html

Parallel collaborative filtering recommendation algorithm based on graph walk

GU Junhua1,2, XIE Zhijian1,2, WU Junyan1,2, XU Xinyun1,2, ZHANG Suqi3
(1. School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China; 2. Hebei Province Key Laboratory of Big Data Computing, Tianjin 300401, China; 3. School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China)

Abstract: This study addresses the data sparsity and scalability problems of collaborative filtering recommendation algorithms. For the sparsity problem, an intersection-ratio (cross-ratio) coefficient is introduced into the traditional Pearson correlation similarity to calculate the direct similarity between users, which alleviates the problem posed by the proportion of items that two users have rated in common. An indirect similarity calculation method based on graph walk is also proposed: a user network graph is built from the direct similarities between users, the indirect similarity between users is calculated by walking on this graph, and recommendations are then made. The method is parallelized on the Spark platform, which mitigates the scalability problem caused by growth in data size. Experimental results on the MovieLens dataset and an IPTV dataset show that the proposed algorithm achieves good results on different datasets, effectively improves recommendation accuracy, and scales well in a distributed environment.

Keywords: collaborative filtering; recommendation; user network graph; walk; similarity; indirect similarity; parallel; Spark platform

CLC number: TP391 Document code: A Article ID: 1673-4785(2019)04-0743-09

Citation format: GU Junhua, XIE Zhijian, WU Junyan, et al. Parallel collaborative filtering recommendation algorithm based on graph walk[J]. CAAI transactions on intelligent systems, 2019, 14(4): 743-751.

In recent years, with the development of Internet technology, big data has promoted social progress but has also brought the problem of "information overload". How to quickly extract valuable information from massive data has become a key problem in the development of big data[1]. Recommender systems emerged to meet the need to obtain valuable information quickly from big data. The goal of a recommender system is to select, according to a user's personalized needs, the information that best matches the user's preferences and recommend it to the

Received: 2018-06-01. Published online: 2018-07-02.
Foundation items: Science and Technology Program of Hebei Province (17210305D); Science and Technology Program of Tianjin (16ZXHLSF0023); Natural Science Foundation of Tianjin (15JCQNJC00600).
Corresponding author: ZHANG Suqi. E-mail: zhangsuqie@163.com.
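
The two similarity steps summarized in the abstract can be illustrated with a minimal single-machine Python sketch. The exact formulas are not given on this page, so everything specific below is an assumption made for illustration: the overlap coefficient is taken to be the Jaccard ratio of the two users' rated-item sets, an edge is created whenever the direct similarity exceeds a threshold, and the indirect similarity is taken to be the best product of edge weights along a walk of bounded length. The names `direct_similarity`, `build_user_graph`, and `indirect_similarity` are hypothetical, and the sketch does not use Spark.

```python
from itertools import combinations
from math import sqrt


def direct_similarity(ratings_u, ratings_v):
    """Pearson correlation over co-rated items, scaled by an overlap ratio.

    Assumption: the overlap ratio is |I_u & I_v| / |I_u | I_v| (Jaccard);
    the coefficient actually used in the paper may differ.
    """
    common = set(ratings_u) & set(ratings_v)
    if len(common) < 2:
        return 0.0  # Pearson is undefined for fewer than two shared items
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    num = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    den_u = sqrt(sum((ratings_u[i] - mean_u) ** 2 for i in common))
    den_v = sqrt(sum((ratings_v[i] - mean_v) ** 2 for i in common))
    if den_u == 0 or den_v == 0:
        return 0.0  # a user with constant ratings has no defined correlation
    overlap = len(common) / len(set(ratings_u) | set(ratings_v))
    return overlap * num / (den_u * den_v)


def build_user_graph(ratings, threshold=0.0):
    """Undirected weighted graph: users are nodes, direct similarity is the weight."""
    graph = {user: {} for user in ratings}
    for u, v in combinations(ratings, 2):
        sim = direct_similarity(ratings[u], ratings[v])
        if sim > threshold:  # assumption: keep only edges above the threshold
            graph[u][v] = sim
            graph[v][u] = sim
    return graph


def indirect_similarity(graph, source, max_steps=2):
    """Walk at most max_steps edges from source.

    Assumption: a walk is scored by the product of its edge weights, and the
    best score is kept for each reached user that is not a direct neighbor.
    """
    best = {}
    frontier = {source: 1.0}
    for _ in range(max_steps):
        reached = {}
        for user, score in frontier.items():
            for neighbor, weight in graph[user].items():
                if neighbor == source:
                    continue  # do not walk back to the starting user
                candidate = score * weight
                if candidate > reached.get(neighbor, 0.0):
                    reached[neighbor] = candidate
        for user, score in reached.items():
            if user not in graph[source] and score > best.get(user, 0.0):
                best[user] = score
        frontier = reached
    return best


if __name__ == "__main__":
    # Users "a" and "c" share no rated items, so they have no direct edge;
    # the walk connects them through "b".
    ratings = {
        "a": {"i1": 5, "i2": 3},
        "b": {"i1": 4, "i2": 2, "i3": 5, "i4": 1},
        "c": {"i3": 4, "i4": 2},
    }
    graph = build_user_graph(ratings)
    print(indirect_similarity(graph, "a"))
```

In this toy data the sparsity problem shows up as a missing edge between `a` and `c`; the walk through `b` still yields a nonzero similarity that a neighbor-based predictor could use. In a distributed setting the pairwise loop in `build_user_graph` is the expensive step and is the natural candidate for parallelization.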

