PREDICTIVE MODELING WITH MISSING DATA USING AN AUTOMATIC RELEVANCE DETERMINATION ENSEMBLE: A COMPARATIVE STUDY
Authors: Mlungisi Duma, Bhekisipho Twala, Fulufhelo Nelwamondo, Tshilidzi Marwala
Affiliation: 1. Department of Electrical Engineering and the Built Environment, University of Johannesburg, Auckland Park, Johannesburg, South Africa, mlungisiduma@gmail.com; 3. Department of Electrical Engineering and the Built Environment, University of Johannesburg, Auckland Park, Johannesburg, South Africa; 4. Modelling and Digital Science, Council for Scientific and Industrial Research, Pretoria, South Africa
Abstract: The objective of this article is to present an automatic relevance determination ensemble as an effective variable extraction method for insurance datasets with large numbers of variables. Automatic relevance determination is a method that uses a Bayesian neural network and the evidence framework to rank variables in order of relevance to the target variable. The existing approach uses a single Bayesian neural network, which searches only for local minima or maxima. In large datasets with numerous variables, this is a concern because there is no guarantee that the outcome is optimal. To address this issue, this study uses an automatic relevance determination ensemble with various configurations (or structures) of Bayesian neural networks. Each outcome in the ensemble is determined by using a confidence factor rather than by scrutinizing the most probable weight values or hyperparameters directly. To evaluate performance, the extraction method is used with three models: repeated incremental pruning to produce error reduction (RIPPER), logistic discriminant analysis, and k-nearest neighbor. Furthermore, the datasets employed contain escalating amounts of missing data, in order to measure the accuracy and resilience of the models when they are used with the proposed ensemble. The ensemble is compared with the principal component analysis method. The results show that the models achieve higher accuracy with the automatic relevance determination ensemble than with principal component analysis. The resilience and strength of the models are also higher with the ensemble than with principal component analysis.