Parallel collaborative filtering recommendation algorithm based on graph walk


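Cite this article:	GU Junhua, XIE Zhijian, WU Junyan, XU Xinyun, ZHANG Suqi. Parallel collaborative filtering recommendation algorithm based on graph walk[J]. 智能系统学报 (CAAI Transactions on Intelligent Systems), 2019, 14(4): 743-751.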

Authors:	GU Junhua, XIE Zhijian, WU Junyan, XU Xinyun, ZHANG Suqi

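Affiliations:	1. School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401; 2. Hebei Province Key Laboratory of Big Data Computing, Hebei University of Technology, Tianjin 300401; 3. School of Information Engineering, Tianjin University of Commerce, Tianjin 300134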

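Abstract:	This paper studies the data sparsity and scalability problems of current collaborative filtering recommendation algorithms. To address sparsity, an intersection-ratio coefficient is introduced into the traditional Pearson correlation similarity to compute the direct similarity between users, which alleviates the problem posed by the proportion of co-rated items between users. An indirect similarity calculation method based on graph walk is also proposed: it builds a user network graph from the direct similarities between users, computes the indirect similarity between users by walking on that graph, and makes recommendations. The method is parallelized on the Spark platform, which alleviates the scalability problem caused by growing data size. Experimental results show that the proposed algorithm performs well on different datasets, effectively improves recommendation accuracy, and scales well in a distributed environment.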
Keywords:	collaborative filtering; recommendation; user network graph; walk; similarity; indirect similarity; parallel; Spark platform
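
The abstract describes the direct similarity only at a high level, so the following is a minimal sketch rather than the authors' formula. It assumes the intersection-ratio coefficient is the share of co-rated items among all items either user rated (a Jaccard-style ratio) and that it multiplies a Pearson correlation computed over the co-rated items. The names `pearson`, `overlap_ratio`, and `direct_similarity` are hypothetical.

```python
from math import sqrt


def pearson(ratings_u, ratings_v):
    """Pearson correlation over the items both users rated.

    Each argument is a dict mapping item id -> rating.
    Means are taken over the co-rated items only (one common variant).
    """
    common = sorted(set(ratings_u) & set(ratings_v))
    if len(common) < 2:
        return 0.0
    xs = [ratings_u[i] for i in common]
    ys = [ratings_v[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def overlap_ratio(ratings_u, ratings_v):
    """ASSUMPTION: coefficient = |I_u & I_v| / |I_u | I_v| (Jaccard-style)."""
    union = set(ratings_u) | set(ratings_v)
    if not union:
        return 0.0
    return len(set(ratings_u) & set(ratings_v)) / len(union)


def direct_similarity(ratings_u, ratings_v):
    """ASSUMPTION: the coefficient scales the Pearson correlation."""
    return overlap_ratio(ratings_u, ratings_v) * pearson(ratings_u, ratings_v)
```

Under this assumption, two users whose few co-rated items happen to correlate perfectly still receive a low direct similarity when those items are a small fraction of everything they rated.
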
Parallel collaborative filtering recommendation algorithm based on graph walk

GU Junhua, XIE Zhijian, WU Junyan, XU Xinyun, ZHANG Suqi. Parallel collaborative filtering recommendation algorithm based on graph walk[J]. CAAI Transactions on Intelligent Systems, 2019, 14(4): 743-751.

Authors:	GU Junhua, XIE Zhijian, WU Junyan, XU Xinyun, ZHANG Suqi

Affiliation:	1. School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China; 2. Hebei Province Key Laboratory of Big Data Computing, Tianjin 300401, China; 3. School of Information Engineering, Tianjin University of Commerce, Tianjin 300134, China

Abstract:	This study addresses the data sparsity and scalability problems of collaborative filtering recommendation algorithms. To address sparsity, an intersection-ratio coefficient is introduced into the traditional Pearson correlation similarity to calculate the direct similarity between users, which alleviates the problem posed by the proportion of co-rated items between users. An indirect similarity calculation method based on graph walk is also proposed: it builds a user network graph from the direct similarities between users, calculates the indirect similarity between users by walking on that graph, and makes recommendations. Parallelizing the method on the Spark platform mitigates the scalability problem caused by increasing data size. Experimental results on the MovieLens and IPTV datasets show that the proposed algorithm performs well on both, effectively improves recommendation accuracy, and scales well in a distributed environment.

Keywords:	collaborative filtering; recommendation; user network graph; walk; similarity; indirect similarity; parallel; Spark platform
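
The abstract does not specify how the walk turns direct similarities into an indirect similarity, so the sketch below illustrates one plausible reading and should not be taken as the paper's algorithm. It assumes that edges connect users whose direct similarity is positive, and that the indirect similarity between two non-adjacent users is the largest product of edge weights along any simple path of at most `max_steps` edges. `build_user_graph`, `indirect_similarity`, `threshold`, and `max_steps` are hypothetical names.

```python
def build_user_graph(direct_sim, threshold=0.0):
    """Build an undirected weighted user graph.

    direct_sim maps (user_u, user_v) pairs to a direct similarity.
    ASSUMPTION: only pairs with similarity above `threshold` become edges.
    """
    graph = {}
    for (u, v), sim in direct_sim.items():
        graph.setdefault(u, {})
        graph.setdefault(v, {})
        if u != v and sim > threshold:
            graph[u][v] = sim
            graph[v][u] = sim
    return graph


def indirect_similarity(graph, source, max_steps=2):
    """Indirect similarity from `source` to users that are not its neighbours.

    ASSUMPTION: score = best product of edge weights over simple paths
    of at most `max_steps` edges, with edge weights in (0, 1].
    """
    if max_steps < 1:
        return {}
    best = {}

    def walk(node, product, visited, steps_left):
        for nxt, weight in graph.get(node, {}).items():
            if nxt in visited:
                continue
            score = product * weight
            if score > best.get(nxt, 0.0):
                best[nxt] = score
            if steps_left > 1:
                walk(nxt, score, visited | {nxt}, steps_left - 1)

    walk(source, 1.0, {source}, max_steps)
    neighbours = graph.get(source, {})
    # Keep only users with no direct edge to the source.
    return {user: s for user, s in best.items() if user not in neighbours}
```

A walk of this kind gives a sparse-data user additional candidate neighbours: users with no co-rated items can still be linked through intermediaries. The exhaustive path enumeration here is only practical for small `max_steps`; the paper's Spark implementation is not reproduced.
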
