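Research on an Improved Frequent Itemset Mining Algorithm and Its Application
Cite this article: Gu Junhua, Li Ruting, Zhang Yajuan, Dong Yanqi. Research on an Improved Frequent Itemset Mining Algorithm and Its Application [J]. Computer Applications and Software, 2019, 36(9): 260-269.
Authors: Gu Junhua  Li Ruting  Zhang Yajuan  Dong Yanqi
Affiliations: School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401; Hebei Province Key Laboratory of Big Data Computing, Tianjin 300401; School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401
Funding: Hebei Province Science and Technology Program; Tianjin Science and Technology Program
Abstract: The frequent pattern growth (FP-growth) algorithm is a classic algorithm for mining frequent itemsets. It removes the need to scan the database many times and to generate large numbers of candidate itemsets, but most algorithms based on the FP-growth idea remain procedurally complex and space-intensive when generating frequent itemsets. To address this, the paper proposes PFLFIM, a frequent itemset mining algorithm based on the pre-order fully constructed list (PF-List). The algorithm represents itemsets with PF-Lists and mines frequent itemsets by simply comparing and joining two PF-Lists, which avoids complex join operations. It optimizes the search space with three strategies (subsume index, early stopping of intersections, and parent-child equivalence), which reduces space usage. Experiments show that the algorithm outperforms the FIN and negFIN algorithms in both running time and memory usage. The algorithm is applied to association rule mining in a university human resource management system to identify the factors that influence talent development, providing decision support for talent introduction and selection at universities.

Keywords: association rules  frequent itemset mining  tree construction  pruning strategy  talent introduction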

IMPROVED FREQUENT ITEMSETS MINING ALGORITHM AND ITS APPLICATION
Gu Junhua,Li Ruting,Zhang Yajuan,Dong Yanqi.IMPROVED FREQUENT ITEMSETS MINING ALGORITHM AND ITS APPLICATION[J].Computer Applications and Software,2019,36(9):260-269.
Authors: Gu Junhua  Li Ruting  Zhang Yajuan  Dong Yanqi
Affiliation: (School of Artificial Intelligence and Data Science, Hebei University of Technology, Tianjin 300401, China; Hebei Province Key Laboratory of Big Data Computing, Tianjin 300401, China)
Abstract: The frequent pattern growth (FP-growth) algorithm is a classic algorithm for mining frequent itemsets. It avoids scanning the database multiple times and generating a large number of candidate itemsets, but most algorithms based on the FP-growth idea are procedurally complex and space-intensive when generating frequent itemsets. We therefore propose a frequent itemset mining algorithm (PFLFIM) based on the PF-List. The algorithm uses PF-Lists to represent itemsets and mines frequent itemsets by simply comparing and joining two PF-Lists, which avoids complex join operations. The search space is optimized with three strategies (subsume index, early stopping of intersections, and parent-child equivalence), which reduces space usage. Experimental results show that the algorithm outperforms the FIN and negFIN algorithms in both running time and memory usage. The algorithm is applied to association rule mining in a university human resource management system to identify factors affecting talent development, and it provides decision support for talent introduction and selection at universities.
Keywords: Association rule  Frequent itemsets mining  Building tree  Pruning strategy  Talent introduction
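
The abstract describes mining frequent itemsets by comparing and joining two lists, and pruning with an early-stopping intersection. The record does not define the PF-List structure, so the sketch below does not implement PFLFIM. It substitutes plain sorted transaction-ID lists (an Eclat-style vertical layout) to illustrate the general idea of list intersection with early stopping. The function names `intersect_with_early_stop` and `mine_frequent_itemsets` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, FrozenSet, List, Optional, Sequence, Tuple


def intersect_with_early_stop(
    a: List[int], b: List[int], min_support: int
) -> Optional[List[int]]:
    """Intersect two sorted ID lists; return None if min_support is unreachable.

    Assumption: this stands in for the paper's PF-List join, which is not
    specified in the abstract.
    """
    result: List[int] = []
    i = j = 0
    while i < len(a) and j < len(b):
        # Upper bound on the final size: matches so far plus the shorter tail.
        if len(result) + min(len(a) - i, len(b) - j) < min_support:
            return None  # stop early, the itemset cannot be frequent
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return result if len(result) >= min_support else None


def mine_frequent_itemsets(
    transactions: Sequence[Sequence[str]], min_support: int
) -> Dict[FrozenSet[str], int]:
    """Return every itemset whose absolute support is at least min_support."""
    # Vertical layout: item -> sorted list of IDs of transactions containing it.
    tidlists: Dict[str, List[int]] = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            tidlists.setdefault(item, []).append(tid)

    frequent: Dict[FrozenSet[str], int] = {}
    # Frequent single items, ordered by ascending support, then by name.
    start: List[Tuple[FrozenSet[str], List[int]]] = [
        (frozenset([item]), tids)
        for item, tids in sorted(tidlists.items(), key=lambda kv: (len(kv[1]), kv[0]))
        if len(tids) >= min_support
    ]

    def extend(prefix_class: List[Tuple[FrozenSet[str], List[int]]]) -> None:
        # All itemsets in prefix_class share the same prefix and differ in one item.
        for idx, (itemset, tids) in enumerate(prefix_class):
            frequent[itemset] = len(tids)
            next_class = []
            for other_set, other_tids in prefix_class[idx + 1:]:
                joined = intersect_with_early_stop(tids, other_tids, min_support)
                if joined is not None:
                    next_class.append((itemset | other_set, joined))
            if next_class:
                extend(next_class)

    extend(start)
    return frequent
```

For example, calling `mine_frequent_itemsets([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"], ["a", "b", "c"]], 3)` keeps the three single items and the three pairs, while the intersection for `{a, b, c}` is abandoned as soon as the remaining IDs can no longer reach the threshold.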
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.