
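Parallel collaborative filtering recommendation algorithm based on graph walk
Cite this article: 顾军华, 谢志坚, 武君艳, 许馨匀, 张素琪. 基于图游走的并行协同过滤推荐算法[J]. 智能系统学报, 2019, 14(4): 743-751.
Authors: GU Junhua (顾军华)    XIE Zhijian (谢志坚)    WU Junyan (武君艳)    XU Xinyun (许馨匀)    ZHANG Suqi (张素琪)
Affiliations: 1. School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401; 2. Hebei Province Key Laboratory of Big Data Computing, Hebei University of Technology, Tianjin 300401; 3. School of Information Engineering, Tianjin University of Commerce, Tianjin 300134
Abstract: This paper studies the data sparsity and scalability problems of current collaborative filtering recommendation algorithms. To address sparsity, an intersection-ratio coefficient is introduced into the traditional Pearson correlation similarity to compute the direct similarity between users, which mitigates the problem of the proportion of items that two users have rated in common. An indirect similarity calculation method based on graph walk is also proposed: a user network graph is built from the direct similarities between users, the indirect similarity between users is computed by walking on that graph, and recommendations are made from the result. The method is parallelized on the Spark platform, which alleviates the scalability problem caused by growing data volume. Experimental results show that the proposed algorithm performs well on different datasets, effectively improves recommendation accuracy, and scales well in a distributed environment.

Keywords: collaborative filtering  recommendation  user network graph  walk  similarity  indirect similarity  parallel  Spark platform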

Parallel collaborative filtering recommendation algorithm based on graph walk
GU Junhua, XIE Zhijian, WU Junyan, XU Xinyun, ZHANG Suqi. Parallel collaborative filtering recommendation algorithm based on graph walk[J]. CAAI Transactions on Intelligent Systems, 2019, 14(4): 743-751.
Authors: GU Junhua    XIE Zhijian    WU Junyan    XU Xinyun    ZHANG Suqi
Affiliation: 1. School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China; 2. Hebei Province Key Laboratory of Big Data Computing, Tianjin 300401, China; 3. School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China
Abstract: This study addresses the data sparsity and scalability problems of collaborative filtering recommendation algorithms. For the sparsity problem, an intersection-ratio coefficient is introduced into the traditional Pearson correlation similarity to calculate the direct similarity between users, which alleviates the problem of the proportion of items that users have rated in common. An indirect similarity calculation method based on graph walk is also proposed: it builds a user network graph from the direct similarities between users, calculates the indirect similarity between users by walking on that graph, and makes recommendations. Parallelizing the method on the Spark platform mitigates the scalability problem caused by the growth in data size. Experimental results on the MovieLens and IPTV datasets show that the proposed algorithm achieves good results on different datasets, effectively improves recommendation accuracy, and scales well in a distributed environment.
Keywords: collaborative filtering  recommendation  user network graph  walk  similarity  indirect similarity  parallel  Spark platform
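
The two similarity steps named in the abstract can be illustrated with a small single-machine sketch. The abstract does not give the formulas, so everything specific here is an assumption made for illustration and not the authors' method: the overlap coefficient is taken to be the number of co-rated items divided by the number of items rated by either user, Pearson correlation is computed over co-rated items only, and the "walk" is limited to two steps, scoring a path by the product of its edge weights. The function names and the `threshold` parameter are hypothetical.

```python
from itertools import combinations
from math import sqrt


def direct_similarity(ratings_u, ratings_v):
    """Pearson correlation over co-rated items, scaled by an overlap ratio.

    ASSUMPTION: overlap ratio = |co-rated items| / |items rated by either user|.
    The abstract does not state the exact coefficient used in the paper.
    """
    common = ratings_u.keys() & ratings_v.keys()
    if len(common) < 2:
        return 0.0  # Pearson is undefined with fewer than two shared items
    mean_u = sum(ratings_u[i] for i in common) / len(common)
    mean_v = sum(ratings_v[i] for i in common) / len(common)
    cov = sum((ratings_u[i] - mean_u) * (ratings_v[i] - mean_v) for i in common)
    var_u = sum((ratings_u[i] - mean_u) ** 2 for i in common)
    var_v = sum((ratings_v[i] - mean_v) ** 2 for i in common)
    if var_u == 0 or var_v == 0:
        return 0.0  # constant ratings carry no correlation signal
    pearson = cov / sqrt(var_u * var_v)
    overlap = len(common) / len(ratings_u.keys() | ratings_v.keys())
    return pearson * overlap


def build_user_graph(ratings, threshold=0.0):
    """Undirected user graph: an edge exists where direct similarity > threshold."""
    graph = {user: {} for user in ratings}
    for u, v in combinations(ratings, 2):
        sim = direct_similarity(ratings[u], ratings[v])
        if sim > threshold:
            graph[u][v] = sim
            graph[v][u] = sim
    return graph


def indirect_similarity(graph, source, target):
    """Similarity via the best two-step walk source -> k -> target.

    ASSUMPTION: a path is scored by the product of its edge weights and the
    best path wins; a direct edge, when present, is returned unchanged.
    """
    if target in graph[source]:
        return graph[source][target]
    return max(
        (w * graph[k][target] for k, w in graph[source].items() if target in graph[k]),
        default=0.0,
    )


# Toy data: alice and carol share no items, but both overlap with bob.
ratings = {
    "alice": {"m1": 5, "m2": 3, "m3": 4},
    "bob": {"m2": 3, "m3": 4, "m4": 2, "m5": 5},
    "carol": {"m4": 2, "m5": 5, "m6": 1},
}
graph = build_user_graph(ratings)
print(indirect_similarity(graph, "alice", "carol"))
```

In this toy data the direct similarity between alice and carol is zero because they have no co-rated items, yet the walk through bob still yields a positive score, which is the kind of sparsity relief the abstract describes. The pairwise loop in `build_user_graph` is the step one would expect to distribute on Spark; that part is not shown here.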