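Clustering GML Documents by Structure Based on Closed Frequent Induced Subtrees
Cite this article: Miao Jianxin, Ji Genlin, Zhu Yingwen. Clustering GML Documents by Structure Based on Closed Frequent Induced Subtrees [J]. Journal of Nanjing Normal University, 2009, 9(2): 61-64.
Authors: Miao Jianxin  Ji Genlin  Zhu Yingwen
Affiliation: School of Computer Science and Technology, Nanjing Normal University, Nanjing, Jiangsu 210097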
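Abstract: A new algorithm, MCF_CLU, is proposed for clustering GML documents by structure. Unlike other related algorithms, it clusters on the basis of closed frequent induced subtrees and requires no pairwise similarity comparison between trees. Instead, it mines the closed frequent induced subtrees of the GML document database, selects one closed frequent induced subtree for each document as that document's representative tree, and groups documents that share the same representative tree into one cluster. The number of clusters is determined automatically during clustering, a cluster description is produced for each cluster, and outliers can be detected. Experimental results show that MCF_CLU is effective and outperforms other algorithms of the same kind.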

Keywords: closed frequent induced subtree  GML structural clustering  clustering

Clustering GML Documents by Structure Based on Closed Frequent Induced Subtrees
Miao Jianxin, Ji Genlin, Zhu Yingwen. Clustering GML Documents by Structure Based on Closed Frequent Induced Subtrees [J]. Journal of Nanjing Nor Univ: Eng and Technol, 2009, 9(2): 61-64.
Authors: Miao Jianxin  Ji Genlin  Zhu Yingwen
Affiliation: School of Computer Sciences; Nanjing Normal University; Nanjing 210097; China
Abstract: This paper presents MCF-CLU, an algorithm for clustering GML documents by structure. Unlike other algorithms, it clusters on the basis of closed frequent induced subtrees and does not need to compare the similarity between trees. The closed frequent induced subtrees of all the GML documents are computed, and a representative closed frequent induced subtree is obtained for every document. Documents that have the same representative tree are regarded as a cluster. During the clustering process, not onl...
Keywords:closed frequent induced subtree  clustering GML by structure  clustering  
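
The clustering step described in the abstract can be illustrated with a small sketch. This is not the authors' MCF_CLU implementation: the subtree mining step is not shown (candidate patterns are supplied by hand), each tree is approximated as a set of (parent label, child label) edges so that containment becomes a subset test, and the rule "pick the largest contained closed pattern as the representative" is an assumption, since the abstract does not say how the representative tree is chosen. The file names and the `Road` and `Lake` elements are hypothetical.

```python
from collections import defaultdict

def support(pattern, docs):
    """Count the documents whose edge set contains the pattern."""
    return sum(1 for edges in docs.values() if pattern <= edges)

def closed_frequent(patterns, docs, min_support):
    """Keep frequent patterns with no proper superset of equal support.

    Closedness is checked only against the supplied candidates; a real
    miner would enumerate the candidates itself.
    """
    freq = {p: support(p, docs) for p in patterns}
    freq = {p: s for p, s in freq.items() if s >= min_support}
    return [p for p, s in freq.items()
            if not any(p < q and s == freq[q] for q in freq)]

def cluster_by_representative(docs, closed_patterns):
    """Group documents that share the same representative pattern."""
    clusters = defaultdict(list)
    outliers = []
    for name, edges in docs.items():
        contained = [p for p in closed_patterns if p <= edges]
        if not contained:
            outliers.append(name)  # no closed frequent pattern fits
            continue
        # Assumption: the representative is the largest contained pattern,
        # with ties broken by the sorted edge list.
        rep = max(contained, key=lambda p: (len(p), sorted(p)))
        clusters[rep].append(name)
    return dict(clusters), outliers

# Hypothetical documents, each reduced to its parent-child label edges.
FC = ("FeatureCollection", "featureMember")
docs = {
    "roads_a.gml": frozenset({FC, ("featureMember", "Road"), ("Road", "LineString")}),
    "roads_b.gml": frozenset({FC, ("featureMember", "Road"), ("Road", "LineString"),
                              ("Road", "name")}),
    "lakes_a.gml": frozenset({FC, ("featureMember", "Lake"), ("Lake", "Polygon")}),
    "lakes_b.gml": frozenset({FC, ("featureMember", "Lake"), ("Lake", "Polygon")}),
    "misc.gml": frozenset({("Metadata", "title")}),
}

# Hand-written candidate patterns standing in for the mining step.
candidates = [
    frozenset({FC}),
    frozenset({FC, ("featureMember", "Road")}),
    frozenset({FC, ("featureMember", "Road"), ("Road", "LineString")}),
    frozenset({FC, ("featureMember", "Lake"), ("Lake", "Polygon")}),
    frozenset({("Metadata", "title")}),
]

closed = closed_frequent(candidates, docs, min_support=2)
clusters, outliers = cluster_by_representative(docs, closed)

for rep, members in clusters.items():
    print(sorted(rep), "->", members)
print("outliers:", outliers)
```

In this toy run the two road documents share one representative pattern, the two lake documents share another, and `misc.gml` is reported as an outlier because no closed frequent pattern is contained in it. The number of clusters falls out of the grouping rather than being set in advance, and each representative pattern doubles as a description of its cluster.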
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.