Toward feature selection in big data preprocessing based on hybrid cloud-based model
Authors: Shehab, Noha; Badawy, Mahmoud; Ali, H. Arafat
Affiliation: 1. Computers and Control Systems Engineering Department, Faculty of Engineering, Mansoura University, Mansoura, 35516, Egypt;
2. Information Technology Institute, Open Source Dept., Ministry of Communications and Information Technology, Cairo, Egypt;
3. Taibah University, Computer Science and Information Dept., Madinah, Saudi Arabia
Abstract:

Big data has recently attracted wide attention in fields such as machine learning, pattern recognition, medicine, finance, and transportation. Data analysis is crucial for converting raw data into the more specific information that decision-making systems consume. As datasets grow more diverse and complex, knowledge discovery becomes more difficult. One solution is feature subset selection, a preprocessing step that reduces this complexity and makes computation and analysis more tractable. Preprocessing produces a reliable and suitable source for any data-mining algorithm. Effective feature selection can improve a model's performance and help reveal the characteristics and underlying structure of complex data. This study introduces a novel hybrid, cloud-based feature selection model for imbalanced data, built on the k-nearest-neighbor algorithm. The proposed model combines the firefly distance metric with the Euclidean distance used in k-nearest neighbor. In the experiments, it compared favorably with the simple weighted nearest neighbor algorithm in both time usage and feature weights, and it improved classification accuracy by 12% over that baseline. Using the cloud-distributed model reduced processing time by up to 30%, which is considered substantial compared with recent state-of-the-art methods.
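
The abstract does not spell out how the firefly metric and the Euclidean distance are combined, so the following is only a minimal sketch of the baseline idea it builds on: a k-nearest-neighbor classifier whose Euclidean distance is scaled by per-feature weights. The function names, the toy data, and the weight values are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def weighted_euclidean(a, b, weights):
    """Euclidean distance with each squared difference scaled by a feature weight."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))


def knn_predict(train_X, train_y, query, weights, k=3):
    """Majority vote among the k training points closest to `query`."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: weighted_euclidean(pair[0], query, weights),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): the class is determined by feature 0 in this
# training set, while feature 1 is deliberately a poor guide for the query.
train_X = [(1.0, 0.0), (1.2, 0.5), (0.9, 1.0), (3.0, 9.0), (3.1, 9.5), (2.9, 8.5)]
train_y = ["a", "a", "a", "b", "b", "b"]
query = (1.1, 9.0)

# Equal weights let the misleading feature dominate the distance.
unweighted = knn_predict(train_X, train_y, query, weights=(1.0, 1.0))
# A zero weight on feature 1 amounts to deselecting that feature.
weighted = knn_predict(train_X, train_y, query, weights=(1.0, 0.0))
print(unweighted, weighted)
```

In this sketch a weight of zero removes a feature from the distance entirely, which is one simple way feature weighting and feature selection relate. How the weights are actually learned in the proposed model is not described in the abstract.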

Keywords:
This article is indexed in SpringerLink and other databases.