
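Research on Multi-Source Heterogeneous Data Governance Technology Based on a Cloud Data Center
Cite this article: Sun Yu (孙瑜). Research on Multi-Source Heterogeneous Data Governance Technology Based on a Cloud Data Center [J]. Computer Measurement & Control (计算机测量与控制), 2024, 32(3): 286-292
Author: Sun Yu (孙瑜)
Affiliation: Unit 92941, Detachment 45 (92941部队45分队)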
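Abstract: Conventional multi-source heterogeneous data governance methods mainly clean data region by region, based on judgments about data attributes. Because they do not analyze nonlinear data, their governance performance is poor. To address this, a multi-source heterogeneous data governance technique based on a cloud data center is proposed. The ETL function of a relational database is used to clean the data, and the data transformation modes and data cleaning rules are defined. A mutual information coefficient is introduced to judge the degree of correlation between data and to analyze nonlinear correlations. With the cloud data center as the carrier, a multi-source heterogeneous data governance system is constructed. The governance performance of the proposed technique is tested experimentally. The results show that the technique achieves high precision and a satisfactory governance effect on multi-source heterogeneous data in a cloud data center.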

Keywords: cloud data center; multi-source heterogeneous data; data governance; data cleaning
Received: 2023-08-20
Revised: 2023-09-09

CLC number: TP206.3    Document code: A
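
The abstract says a mutual information coefficient is used to judge nonlinear correlation, but this page does not give the estimator the paper uses. As a rough illustration only, the sketch below assumes a simple equal-width histogram estimate. The function names `mutual_information` and `pearson`, the choice of `bins=8`, and the sample data are all hypothetical and not taken from the paper. It shows why such a measure helps: for y = x², the linear (Pearson) correlation is close to zero, while the mutual information is clearly positive.

```python
import math
from collections import Counter


def mutual_information(xs, ys, bins=8):
    """Histogram estimate of mutual information (in nats).

    Assumption: equal-width binning; the paper's estimator may differ.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")

    def discretize(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant input
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretize(xs), discretize(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))   # counts of joint bin occupancy
    px, py = Counter(bx), Counter(by)  # marginal bin counts
    # Sum of p(i,j) * log(p(i,j) / (p(i) * p(j))) over occupied cells.
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def pearson(xs, ys):
    """Plain Pearson correlation coefficient, for comparison."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical sample: a purely nonlinear dependence, y = x^2 on [-1, 1].
xs = [i / 50 - 1 for i in range(101)]
ys = [x * x for x in xs]

print("Pearson correlation:", pearson(xs, ys))
print("Mutual information (nats):", mutual_information(xs, ys))
```

A histogram estimate is sensitive to the bin count and sample size, so a threshold for treating two fields as correlated would need to be calibrated on the actual data.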