
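A Data Stream Clustering Algorithm Based on an Incremental DFT Synopsis
Cite as: KONG Ying-hui (孔英会), AN Jing (安静), CHE Lin-lin (车辚辚), LIU Yun-feng (刘云峰). A data stream clustering algorithm based on an incremental DFT synopsis [J]. Journal of North China Electric Power University (华北电力大学学报), 2007, 34(5): 85-89.
Authors: KONG Ying-hui (孔英会)  AN Jing (安静)  CHE Lin-lin (车辚辚)  LIU Yun-feng (刘云峰)
Affiliation: School of Electrical and Electronic Engineering, North China Electric Power University, Baoding 071003, Hebei, China
Abstract: Data stream clustering is an important branch of data stream mining. Because data streams are massive and arrive rapidly and dynamically, traditional static data mining techniques cannot meet the needs of online analysis. The core of data stream clustering is to design a single-pass scan algorithm that stores a small amount of synopsis information in limited memory and so supports real-time, online clustering of the stream. Using the sliding window model widely adopted in data stream processing, this paper proposes a new data stream synopsis algorithm based on the incremental discrete Fourier transform (DFT) and applies k-means clustering on top of it to mine the stream online. The clustering algorithm based on the incremental DFT synopsis reduces running time and saves memory, and experiments on real electricity load data demonstrate its effectiveness.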
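
The abstract does not give the update rule for the DFT synopsis, so the following is only a minimal sketch of the standard sliding-window DFT recurrence, which may differ from the authors' formulation. When the window of length N slides by one sample, each retained coefficient can be updated in constant time as X'_k = (X_k - x_old + x_new) * exp(2*pi*j*k/N), with no need to recompute the transform over the whole window. The class name `SlidingDFT`, the helper `direct_dft`, and the parameters `window` and `m` (number of coefficients kept) are hypothetical names chosen for illustration.

```python
import cmath
from collections import deque


class SlidingDFT:
    """Keeps the first m DFT coefficients of the last `window` samples.

    Illustrative sketch only; not taken from the paper.
    """

    def __init__(self, window, m):
        self.window = window
        # The window starts filled with zeros, whose DFT is all zeros.
        self.buf = deque([0.0] * window, maxlen=window)
        self.coeffs = [0j] * m
        # Per-coefficient rotation factor exp(2*pi*j*k/N).
        self.twiddle = [cmath.exp(2j * cmath.pi * k / window) for k in range(m)]

    def update(self, x_new):
        """Slide the window by one sample and update each coefficient in O(1)."""
        x_old = self.buf[0]          # sample about to leave the window
        self.buf.append(x_new)       # maxlen drops x_old automatically
        delta = x_new - x_old
        self.coeffs = [(c + delta) * w for c, w in zip(self.coeffs, self.twiddle)]
        return self.coeffs


def direct_dft(samples, m):
    """First m DFT coefficients computed from scratch, for comparison."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(samples))
        for k in range(m)
    ]


sdft = SlidingDFT(window=8, m=3)
for x in [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, 2.5, 1.5, -0.5, 4.0]:
    sdft.update(x)
```

After the loop, `sdft.coeffs` should agree with `direct_dft(list(sdft.buf), 3)` up to floating-point rounding. In a long-running stream the rounding error of the recurrence accumulates, so a periodic recomputation from the buffer is a common safeguard.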

Keywords: data stream  sliding window  incremental Fourier transform  clustering  k-means
Article ID: 1007-2691(2007)05-0085-05
Revision received: 2006-12-15

An algorithm for clustering data streams using incremental DFT
KONG Ying-hui, AN Jing, CHE Lin-lin, LIU Yun-feng. An algorithm for clustering data streams using incremental DFT [J]. Journal of North China Electric Power University, 2007, 34(5): 85-89.
Authors:KONG Ying-hui  AN Jing  CHE Lin-lin  LIU Yun-feng
Affiliation:School of Electrical and Electronic Engineering, North China Electric Power University, Baoding 071003, China
Abstract: Clustering is an important branch of data stream mining. Because data streams are massive and dynamic, traditional data mining algorithms cannot meet the requirements of online analysis. Data stream techniques therefore focus on one-pass scan algorithms that incrementally maintain in memory an effective synopsis data structure (digest) far smaller than the whole data set. This paper presents a novel algorithm for clustering data streams: the k-means method divides the data into subsets, a sliding window model handles data change and update, and a DFT digest, which can be maintained incrementally, reduces the data. The algorithm saves main memory and running time and is suitable for online clustering. Experiments on clustering real electricity consumption data verify its effectiveness.
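
As a rough illustration of clustering on DFT digests rather than on raw windows, the sketch below reduces each stream's window to its first few DFT coefficients and runs a plain k-means (Lloyd's iteration) on those features. It is a sketch under stated assumptions, not the paper's implementation: the function names `dft_synopsis`, `nearest`, and `kmeans`, the choice of `m`, the 1/sqrt(N) normalisation, the deterministic farthest-point initialisation, and the synthetic sine data are all illustrative choices that do not come from the abstract.

```python
import cmath
import math


def dft_synopsis(window, m):
    """First m DFT coefficients of a window as a real vector [Re0, Im0, Re1, Im1, ...]."""
    n = len(window)
    feats = []
    for k in range(m):
        c = sum(x * cmath.exp(-2j * cmath.pi * k * i / n) for i, x in enumerate(window))
        c /= math.sqrt(n)  # unitary scaling (assumed), keeps distances comparable
        feats.extend([c.real, c.imag])
    return feats


def nearest(point, centers):
    """Index of the center closest to `point` in Euclidean distance."""
    return min(range(len(centers)), key=lambda j: math.dist(point, centers[j]))


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with farthest-point initialisation (illustrative)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next center: the point farthest from all centers chosen so far.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        labels = [nearest(p, centers) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            new_centers.append(
                [sum(col) / len(members) for col in zip(*members)] if members else centers[j]
            )
        if new_centers == centers:
            break
        centers = new_centers
    return [nearest(p, centers) for p in points], centers


# Synthetic stand-in for stream windows: three slow and three fast sine waves.
n = 32
slow = [[math.sin(2 * math.pi * i / n) + 0.01 * s for i in range(n)] for s in range(3)]
fast = [[math.sin(2 * math.pi * 3 * i / n) + 0.01 * s for i in range(n)] for s in range(3)]
points = [dft_synopsis(w, m=4) for w in slow + fast]
labels, centers = kmeans(points, k=2)
```

Each 32-sample window is represented here by 8 real numbers, and the slow and fast groups end up with different labels. The reduction is reasonable when most of a signal's energy sits in the low-frequency coefficients, since distances between truncated spectra then approximate distances between the original windows.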
Keywords:data stream  sliding window  incremental DFT  cluster  k-means
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.