
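Multi-Representative-Point Feature Tree and a Spatial Clustering Algorithm
Cite this article: Huang Tianqiang, Qin Xiaolin, Wang Jindong. Multi-Representative-Point Feature Tree and a Spatial Clustering Algorithm [J]. Computer Science, 2006, 33(12): 189-195.
Authors: Huang Tianqiang  Qin Xiaolin  Wang Jindong
Affiliations: 1. Department of Computer Science, School of Mathematics and Computer Science, Fujian Normal University, Fuzhou 350007; Department of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016
2. Department of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016
Funding: National Natural Science Foundation of China; National High Technology Research and Development Program of China (863 Program); Aeronautical Basic Science Foundation; Natural Science Foundation of Jiangsu Province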
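Abstract: Spatial data are massive, complex, continuous, and spatially autocorrelated, and they contain missing values and errors. A spatial clustering algorithm must therefore be efficient, handle clusters of complex shapes, produce results that do not depend on the order in which the data are distributed in space, and be robust to outliers. Existing algorithms struggle to meet all of these requirements at once. This paper proposes a data structure suited to massive, complex spatial data: the multi-representative-point feature tree. Based on this structure, it proposes CAMFT, a clustering algorithm for mining massive, complex spatial data. CAMFT uses the multi-representative-point feature tree to compress large volumes of data and combines it with random sampling to further strengthen its ability to handle massive data sets. Because the tree can preserve the clustering features of complex shapes, it is also well suited to complex spatial data. Experiments show that CAMFT quickly processes spatial data that contain outliers and complex-shaped clusters, that its results are independent of the spatial distribution order of the objects, and that it is more efficient than the existing comparable clustering algorithms BIRCH and CURE.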

Keywords: spatial clustering  spatial data  multi-representative-point feature tree

Multi-representation Feature Tree and Spatial Clustering Algorithm
HUANG Tian-Qiang, QIN Xiao-Lin, WANG Jin-Dong. Multi-representation Feature Tree and Spatial Clustering Algorithm [J]. Computer Science, 2006, 33(12): 189-195.
Authors:HUANG Tian-Qiang  QIN Xiao-Lin  WANG Jin-Dong
Affiliation: 1. Department of Computer Science and Engineering, Fuzhou University, Fuzhou 350002; 2. Department of Computer Science and Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016
Abstract: Spatial data in spatial databases are large, complex, continuous, and spatially autocorrelated, and they contain missing values and errors. These characteristics require that a good spatial clustering algorithm be highly efficient, detect clusters of complicated shapes, find clusters that are independent of the order in which the points in the space are examined, and remain unaffected by outliers. Existing algorithms cannot meet all of these requirements. A clustering algorithm based on the multi-representation feature tree, named CAMFT, is proposed. First, a new data structure that draws on the strengths of the BIRCH and CURE algorithms is proposed to condense the data. The algorithm then incorporates random sampling to enhance its ability to handle very large data sets. In addition, the multi-representation feature tree can preserve clusters of complicated shapes, so it can be used to detect spatial clusters. Experimental results show that the algorithm efficiently identifies clusters of complicated shapes in large spatial databases that contain many outliers, and that it outperforms the BIRCH and CURE algorithms in efficiency.
Keywords: Spatial clustering  Spatial data  Multi-representation feature tree
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.