A Scalable Method of Maintaining Order Statistics for Big Data Stream
Authors: Zhaohui Zhang, Jian Chen, Ligong Chen, Qiuwen Liu, Lijun Yang, Pengwei Wang, Yongjun Zheng
Affiliation: School of Computer Science and Technology, Donghua University, 201620, China. The Key Laboratory of Embedded System and Service Computing, Ministry of Education, Tongji University, 201804, China. Shanghai Engineering Research Center of Network Information Services, 201804, China. School of Electronics, Computing and Mathematics, University of Derby, Derby, United Kingdom.
Abstract: Several recent online quantile algorithms analyze order statistics over high-volume, high-velocity data streams. These algorithms are not scalable, however, because they use the GK algorithm as a subroutine, and GK is not known to be mergeable. They also fail to maintain correctness: the error grows as the window slides. In this paper, we use a novel data structure to store a sketch that maintains order statistics over sliding windows, and we propose three algorithms based on it. The fixed-size window algorithm keeps a sketch of the last W elements and is scalable because the sketch is mergeable. The time-based window algorithm keeps a sketch of the data from the last T time units. Finally, the window aggregation algorithm allows our approach to be extended to distributed systems, which improves speed and makes it better suited to modern applications such as system/network monitoring and anomaly detection. Experimental results show that our algorithms achieve acceptable performance while maintaining correctness and mergeability.
Keywords: Big data stream; online analytical processing; sliding windows; mergeable data sketches
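
The abstract does not describe the paper's data structure, so the following is not the authors' method. It is only a minimal Python sketch of the general idea of a mergeable summary over a fixed-size sliding window, under loudly stated assumptions: the class name `BlockedWindowQuantiles`, the `block_size` parameter, and the helper `merged_quantile` are all hypothetical, and each block stores its elements exactly (sorted) rather than in a compressed sketch. Expiry happens one block at a time, so the summary covers between W - block_size + 1 and W of the most recent elements.

```python
import heapq
import math
from collections import deque


class BlockedWindowQuantiles:
    """Illustrative only: summarizes roughly the last `window` items as sorted blocks."""

    def __init__(self, window, block_size):
        if window % block_size != 0:
            raise ValueError("window must be a multiple of block_size")
        self.window = window
        self.block_size = block_size
        self.blocks = deque()  # completed blocks, oldest first, each a sorted list
        self.current = []      # block that is still being filled

    def insert(self, value):
        self.current.append(value)
        if len(self.current) == self.block_size:
            self.blocks.append(sorted(self.current))
            self.current = []
        # Expire the oldest completed block once more than `window` items are held.
        total = len(self.blocks) * self.block_size + len(self.current)
        if total > self.window:
            self.blocks.popleft()

    def snapshot(self):
        """Sorted list of every element currently covered by the summary."""
        return list(heapq.merge(*self.blocks, sorted(self.current)))

    def quantile(self, phi):
        return _pick(self.snapshot(), phi)


def _pick(sorted_items, phi):
    """Element of rank ceil(phi * n) in a sorted list, clamped to valid ranks."""
    n = len(sorted_items)
    if n == 0:
        raise ValueError("no elements in the window")
    index = max(0, min(n - 1, math.ceil(phi * n) - 1))
    return sorted_items[index]


def merged_quantile(summaries, phi):
    """Quantile over the union of several window summaries (e.g. one per node)."""
    merged = list(heapq.merge(*(s.snapshot() for s in summaries)))
    return _pick(merged, phi)
```

Because every block is a sorted list, summaries from different windows or nodes combine with a plain sorted merge, which is the sense of "mergeable" this sketch illustrates. A real sketch would compress each block to bound memory, at the cost of approximate ranks.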