
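A Data Cleaning Model for Electric Power Big Data Based on the Spark Framework
Cite this article: Wang Chong, Zhang Wenlong, Wang Yongwen. A Data Cleaning Model for Electric Power Big Data Based on the Spark Framework[J]. Electrical Measurement & Instrumentation (电测与仪表), 2017, 54(14).
Authors: Wang Chong (王冲)  Zhang Wenlong (张文龙)  Wang Yongwen (王永文)
Affiliations: 1. Information and Communication Branch, State Grid East Inner Mongolia Electric Power Co., Ltd., Hohhot 010020, China; 2. School of Mathematics and Statistics, Lanzhou University, Lanzhou 730000, China
Funding: Supported by the National Natural Science Foundation of China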
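Abstract: Cleaning electric power big data faces two problems: a unified anomaly detection pattern is difficult to extract, and the correction of anomalous data has low continuity and accuracy. To address them, a data cleaning model for electric power big data based on the Spark framework is proposed. First, normal clusters are obtained with an improved CURE clustering algorithm. Second, a method for obtaining the boundary samples of the normal clusters is implemented, and an anomaly identification algorithm based on those boundary samples is designed. Finally, anomalous data are corrected using an exponentially weighted moving average. Data cleaning experiments on wind power generation monitoring data from a wind farm verify the efficiency and accuracy of the cleaning model.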

Keywords: electric power big data  data cleaning  anomaly identification  anomaly correction  Spark framework
Received: 2016/9/6
Revised: 2016/9/6

A Data Cleaning Model for Electric Power Big Data Based on Spark Framework
Wang Chong, Zhang Wenlong and Wang Yongwen. A Data Cleaning Model for Electric Power Big Data Based on Spark Framework[J]. Electrical Measurement & Instrumentation, 2017, 54(14).
Authors: Wang Chong  Zhang Wenlong and Wang Yongwen
Abstract: To address the difficulty of extracting a unified anomaly detection pattern and the low accuracy and continuity of anomalous-data correction when cleaning electric power big data, a data cleaning model for electric power big data based on the Spark framework is proposed. First, the normal clusters and their boundary samples are obtained using an improved CURE clustering algorithm. Then, an anomaly identification algorithm based on the boundary samples is designed. Finally, anomalous data are corrected using an exponentially weighted moving average. Data cleaning experiments on wind power generation monitoring data from a wind farm demonstrate the efficiency and accuracy of the model.
Keywords: big data of electric power  data cleaning  anomaly identification  anomaly modification  Spark framework
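
The abstract names an exponentially weighted moving average as the correction step but gives no formula, smoothing factor, or handling of edge cases. The following is only a minimal sketch of one way such a correction could work, written in plain Python rather than on Spark. The function name `ewma_correct`, the parameter `alpha`, the externally supplied anomaly flags, and the rule that anomalous readings do not update the running average are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Optional, Sequence


def ewma_correct(
    values: Sequence[float],
    is_anomaly: Sequence[bool],
    alpha: float = 0.5,
) -> List[float]:
    """Replace flagged readings with the EWMA of the preceding normal readings.

    Hypothetical helper: `alpha` is an assumed smoothing factor in (0, 1].
    Flagged readings that occur before any normal reading are left unchanged,
    because there is no history to estimate them from.
    """
    if len(values) != len(is_anomaly):
        raise ValueError("values and is_anomaly must have the same length")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    corrected: List[float] = []
    ewma: Optional[float] = None  # running average over normal readings only

    for x, bad in zip(values, is_anomaly):
        if bad:
            # Substitute the current smoothed estimate when one exists.
            corrected.append(x if ewma is None else ewma)
            continue
        # Standard EWMA update: s_t = alpha * x_t + (1 - alpha) * s_{t-1}.
        ewma = x if ewma is None else alpha * x + (1.0 - alpha) * ewma
        corrected.append(x)

    return corrected


if __name__ == "__main__":
    readings = [10.0, 12.0, 100.0, 14.0]
    flags = [False, False, True, False]
    print(ewma_correct(readings, flags, alpha=0.5))  # [10.0, 12.0, 11.0, 14.0]
```

In this sketch the flagged reading 100.0 is replaced by 11.0, the average accumulated from the two preceding normal readings, so the corrected series stays continuous with its recent history. In the paper's setting the flags would come from the boundary-sample anomaly identification step.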
This article is indexed in CNKI, Wanfang Data, and other databases.