
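Research on Frequent Closed Pattern Mining Algorithms for Long Biological Datasets
Cite this article: ZHOU Ming, LI Hong. Research on Frequent Closed Pattern Mining Algorithms for Long Biological Datasets[J]. Computer Engineering, 2007, 33(2): 74-76.
Authors: ZHOU Ming, LI Hong
Affiliation: School of Information Science and Engineering, Central South University, Changsha 410083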
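Abstract: Traditional frequent itemset mining algorithms are inefficient and produce large numbers of redundant patterns when applied to dense or long datasets such as gene expression datasets. To address these problems, researchers have proposed the concept of closed patterns together with algorithms for mining them, and studies have shown that mining closed patterns can significantly reduce the number of itemsets and eliminate many redundant patterns. Targeting the characteristics of biological data, this paper proposes REMFOR, a novel algorithm for mining frequent closed patterns. Building on the concept of closed patterns and the idea of row enumeration, the algorithm uses a vertical data structure and the fp-tree technique, constructing a row fp-tree over row sets to mine frequent closed patterns. An example and experiments show that the algorithm is correct and effective.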

Keywords: data mining; frequent itemsets; closed pattern
Article ID: 1000-3428(2007)02-0074-03
Revision date: 2006-02-05

Research of Frequent Closed Pattern Mining in Long Biological Datasets
ZHOU Ming, LI Hong. Research of Frequent Closed Pattern Mining in Long Biological Datasets[J]. Computer Engineering, 2007, 33(2): 74-76.
Authors: ZHOU Ming, LI Hong
Affiliation: Institute of Information Science and Engineering, Central South University, Changsha 410083
Abstract: Traditional algorithms for mining frequent itemsets are inefficient and produce many redundant patterns when applied to dense or long datasets, such as gene expression datasets. To solve this problem, some researchers have proposed the concept of closed patterns and algorithms for mining them. Algorithms based on closed patterns have been shown to substantially reduce the number of itemsets and to eliminate many redundant patterns. Based on the characteristics of biological datasets, a novel algorithm called REMFOR is designed to mine frequent closed patterns. It builds on the concept of closed patterns, using row enumeration and a vertical data structure, and constructs a row fp-tree on row sets to mine frequent closed patterns. An example and experiments show that it is correct and efficient.
Keywords: Data mining; Frequent itemsets; Closed pattern
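
The two ideas the abstract relies on, closed patterns and row enumeration over a vertical layout, can be illustrated with a small brute-force sketch. This is not REMFOR and builds no fp-tree; the function name `closed_patterns_by_row_enumeration`, the toy gene labels, and the support threshold are all hypothetical, chosen only for illustration. It relies on the fact that the set of items shared by any group of rows is always a closed itemset, so enumerating row subsets (cheap when rows are few and items are many, as in gene expression data) reaches every closed pattern.

```python
from itertools import combinations


def closed_patterns_by_row_enumeration(rows, min_support):
    """Brute-force sketch: enumerate row subsets and intersect their items.

    rows: list of sets of items, one set per row (sample).
    Returns a dict mapping each frequent closed itemset to its support.
    Exponential in the number of rows; for illustration only.
    """
    n = len(rows)

    # Vertical layout: item -> set of indices of the rows that contain it.
    tidsets = {}
    for idx, row in enumerate(rows):
        for item in row:
            tidsets.setdefault(item, set()).add(idx)

    closed = {}
    for size in range(1, n + 1):
        for combo in combinations(range(n), size):
            # Items common to every chosen row form a closed itemset.
            common = set.intersection(*(set(rows[i]) for i in combo))
            if not common:
                continue
            # Support = number of rows containing every item of the pattern.
            supporting = set.intersection(*(tidsets[item] for item in common))
            if len(supporting) >= min_support:
                closed[frozenset(common)] = len(supporting)
    return closed


# Toy data: 3 rows (samples) over 4 items (hypothetical gene labels).
rows = [
    {"g1", "g2", "g3", "g4"},
    {"g1", "g2", "g4"},
    {"g1", "g3", "g4"},
]
result = closed_patterns_by_row_enumeration(rows, min_support=2)
for pattern, support in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pattern), support)
```

On this toy input, 11 itemsets reach a support of 2, but only 3 of them are closed ({g1, g4}, {g1, g2, g4}, and {g1, g3, g4}), which is the kind of reduction the abstract describes. A practical algorithm would prune the row-subset search rather than enumerate it exhaustively.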
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.