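Clustering algorithm for uncertain data streams in a damped window
Cite this article: Tu Li, Chen Ling. Clustering algorithm for uncertain data streams in a damped window[J]. Application Research of Computers, 2021, 38(9): 2673-2677, 2682.
Authors: Tu Li, Chen Ling
Affiliations: Department of Computer Science, Jiangyin Polytechnic College, Jiangyin, Jiangsu 214405; College of Information Engineering, Yangzhou University, Yangzhou, Jiangsu 225127; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093
Funding: National Natural Science Foundation of China (61702441); Natural Science Foundation of Jiangsu Province (BK20201430)
Abstract: Real-world uncertain data streams have non-convex distributions and contain a large amount of noise. To cluster recent data efficiently in real time and track how the clusters evolve, this paper proposes the uncertain data stream clustering algorithm Clu_Ustream (clustering on uncertain stream). First, the online part uses a sub-window sampling mechanism to collect the uncertain stream data in the sliding window, and stores the statistical information of the probability density grids in a linked list with a double-layer summary statistics structure. Then, the offline clustering process uses a damped window mechanism to weaken the influence of old data and periodically cleans up the expired sub-windows in the window. At the same time, a dynamic abnormal-grid deletion mechanism filters out outliers effectively, which lowers the time and space complexity of the algorithm. Simulation results on synthetic datasets and a real network intrusion dataset show that Clu_Ustream achieves higher clustering quality and efficiency than other algorithms of the same kind.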

Keywords: uncertain data stream; clustering; damped window; sampling mechanism; density grid; network intrusion
Received: 2021/1/23
Revised: 2021/8/10

Clustering algorithm for uncertain data stream based on damped sliding window
Tu Li and Chen Ling. Clustering algorithm for uncertain data stream based on damped sliding window[J]. Application Research of Computers, 2021, 38(9): 2673-2677, 2682.
Authors: Tu Li and Chen Ling
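Affiliation: Department of Computer Science, Jiangyin Polytechnic College, Jiangyin, Jiangsu 214405; College of Information Engineering, Yangzhou University, Yangzhou, Jiangsu 225127; State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210093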
Abstract: Real-world uncertain data streams tend to have non-convex distributions and contain a lot of noise. To cluster recent data efficiently in real time and track how the clusters evolve, this paper proposes Clu_Ustream (clustering on uncertain stream), a clustering algorithm for uncertain data streams. In the online part, Clu_Ustream uses a sub-window sampling mechanism to collect the uncertain stream data in the sliding window, and it stores the statistical information of the probability density grids in a linked list with a double-layer summary statistics structure to improve processing efficiency. In the offline part, it uses a damped window mechanism to weaken the influence of old data and periodically deletes expired sub-windows to keep the clustering effective. In addition, a dynamic abnormal-grid deletion mechanism filters out most outliers, which improves space and time efficiency. Experimental results on synthetic datasets and a real network intrusion dataset show that Clu_Ustream achieves higher clustering quality and efficiency than other similar algorithms.
Keywords: uncertain data stream; clustering; damped window; sampling mechanism; density grid; network intrusion
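
The abstract names a damped window over probability density grids and a mechanism that deletes abnormal grids, but it does not give the formulas. The following is only a minimal sketch of that general idea, not the Clu_Ustream implementation. It assumes an exponential decay factor per time step, assumes that a grid's density is the decayed sum of the existence probabilities of the points that fall into it, and assumes a fixed pruning threshold. The class `DampedDensityGrid`, its parameters `cell_width`, `decay`, and `min_density`, and the `GridStats` record are all hypothetical names introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class GridStats:
    density: float    # decayed sum of existence probabilities (assumed definition)
    last_update: int  # time step at which density was last brought up to date


class DampedDensityGrid:
    """Hypothetical sketch: density grid with exponential (damped window) decay."""

    def __init__(self, cell_width=1.0, decay=0.5, min_density=0.1):
        self.cell_width = cell_width    # side length of each grid cell (assumed)
        self.decay = decay              # per-step decay factor in (0, 1) (assumed)
        self.min_density = min_density  # pruning threshold for sparse grids (assumed)
        self.grids = {}                 # cell coordinates -> GridStats

    def _cell(self, point):
        # Map a point to the integer coordinates of the cell that contains it.
        return tuple(math.floor(x / self.cell_width) for x in point)

    def insert(self, point, prob, t):
        # Decay the stored density up to time t, then add the new point's
        # existence probability.
        key = self._cell(point)
        g = self.grids.setdefault(key, GridStats(0.0, t))
        g.density = g.density * self.decay ** (t - g.last_update) + prob
        g.last_update = t

    def density_at(self, key, t):
        # Density of a grid as seen at time t, without modifying the grid.
        g = self.grids[key]
        return g.density * self.decay ** (t - g.last_update)

    def prune(self, t):
        # Delete grids whose decayed density fell below the threshold and
        # return their keys; this stands in for the outlier-grid removal.
        stale = [k for k in self.grids if self.density_at(k, t) < self.min_density]
        for k in stale:
            del self.grids[k]
        return stale
```

Decaying lazily, only when a grid is touched or inspected, avoids updating every grid at every time step. A caller would invoke `insert` for each arriving point and `prune` periodically, and could then cluster the surviving dense grids.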
This article is indexed in Wanfang Data and other databases.