
Efficient Mining of Frequent Closed XML Query Pattern
Authors: Jian-Hua Feng, Qian Qian, Jian-Yong Wang, and Li-Zhu Zhou
Affiliation: Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
Funding: This work is supported in part by the National Natural Science Foundation of China under Grant No. 60573094, the National Grand Fundamental Research 973 Program of China under Grant No. 2006CB303103, the National High Technology Development 863 Program of China under Grant No. 2006AA01A101, and the Tsinghua Basic Research Foundation under Grant No. JCqn2005022.
Abstract: Previous research has argued convincingly that a frequent pattern mining algorithm should mine only the closed frequent patterns rather than all frequent ones, because the closed set is more compact yet still complete and can be mined more efficiently. Once frequent closed XML query patterns are discovered, indexing and caching can be applied effectively to improve query performance. Most previous algorithms for finding frequent patterns adopted a straightforward generate-and-test strategy. In this paper, we present SOLARIA*, an efficient algorithm for mining frequent closed XML query patterns without candidate maintenance or costly tree-containment checking. It applies an efficient sequence mining algorithm to the discovery of frequent tree-structured patterns, replacing expensive containment testing with cheap parent-child checking in sequences. SOLARIA* uses the parent-child relationship constraint to prune unrelated parts of the search space during frequent pattern enumeration. A thorough experimental study on various real-life data demonstrates the efficiency and scalability of SOLARIA* over the previously known alternative. SOLARIA* also scales linearly with the size of the XML queries.
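
The claim that closed patterns are "more compact yet complete" can be made concrete with a toy example. A frequent pattern is closed when no proper super-pattern has the same support. The sketch below is only an illustration under simplifying assumptions: it uses plain itemsets instead of XML query trees, enumerates candidates by brute force, and is not SOLARIA*. The function names and the sample transactions are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration: map each frequent itemset to its support count."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            # Support = number of transactions that contain the candidate.
            support = sum(1 for t in transactions if cset <= t)
            if support >= min_support:
                result[cset] = support
    return result


def closed_only(frequent):
    """Keep only itemsets that have no proper superset with identical support."""
    return {
        p: s
        for p, s in frequent.items()
        if not any(p < q and s == frequent[q] for q in frequent)
    }


# Hypothetical toy data: four transactions over the items a, b, c.
transactions = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("a")]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_only(frequent)
```

On this toy input, seven itemsets are frequent but only three are closed ({a}, {a, b}, and {a, b, c}). The support of any dropped itemset can still be recovered as the largest support among its closed supersets, which is the sense in which the closed set stays complete.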

Keywords: computer software; frequent closed pattern; XML; data mining; query pattern
Received: 12 October 2006
Revised: 12 October 2006

Jian-Hua Feng, Qian Qian, Jian-Yong Wang, and Li-Zhu Zhou. Efficient Mining of Frequent Closed XML Query Pattern[J]. Journal of Computer Science and Technology, 2007, 22(5): 725-735.
Keywords: computer software; frequent closed pattern; data mining; XML; XPath
This article is indexed in CNKI, VIP, Wanfang Data, SpringerLink, and other databases.