Similar articles
1.
The aim of bankruptcy prediction in data mining and machine learning is to develop an effective model that provides higher prediction accuracy. In the prior literature, various classification techniques have been developed and studied, among which classifier ensembles that combine multiple classifiers have been shown to outperform many single classifiers. However, in constructing classifier ensembles, three critical issues can affect their performance. The first is the classification technique adopted; the other two are the method used to combine the multiple classifiers and the number of classifiers to be combined. Since few relevant studies have examined these issues, this paper conducts a comprehensive comparison of classifier ensembles built with three widely used classification techniques, namely multilayer perceptron (MLP) neural networks, support vector machines (SVM), and decision trees (DT), based on two well-known combination methods, bagging and boosting, and different numbers of combined classifiers. Our experimental results on three public datasets show that DT ensembles composed of 80–100 classifiers using the boosting method perform best. The Wilcoxon signed-rank test also demonstrates that DT ensembles built by boosting perform significantly differently from the other classifier ensembles. Moreover, a further study on a real-world Taiwan bankruptcy dataset was conducted, which also demonstrates the superiority of DT ensembles built by boosting over the others.

2.
Using neural network ensembles for bankruptcy prediction and credit scoring   Cited by: 2 (self-citations: 0, citations by others: 2)
Bankruptcy prediction and credit scoring have long been regarded as critical topics and have been studied extensively in the accounting and finance literature. Artificial intelligence and machine learning techniques have been used to solve these financial decision-making problems. The multilayer perceptron (MLP) network trained by the back-propagation learning algorithm is the most widely used technique for financial decision-making problems. In addition, it is usually superior to traditional statistical models. Recent studies suggest that combining multiple classifiers (classifier ensembles) should be better than using single classifiers. However, the performance of multiple classifiers in bankruptcy prediction and credit scoring is not fully understood. In this paper, we use a single neural network classifier as the baseline and compare it with multiple classifiers and diversified multiple classifiers on three datasets. Measured against the single classifier in terms of average prediction accuracy, the multiple classifiers perform better on only one of the three datasets. The diversified multiple classifiers, trained with both different classifier parameters and different sets of training data, perform worse on all datasets. However, for Type I and Type II errors, there is no clear winner. We suggest that it is better to consider all three classifier architectures when making financial decisions.

3.
Incremental construction of classifier and discriminant ensembles   Cited by: 2 (self-citations: 0, citations by others: 2)
We discuss approaches to incrementally construct an ensemble. The first constructs an ensemble of classifiers by choosing a subset from a larger set, and the second constructs an ensemble of discriminants, where a classifier is used for some classes only. We investigate criteria including accuracy, significant improvement, diversity, correlation, and the role of search direction. For discriminant ensembles, we test subset selection and trees. Fusion is by voting or by a linear model. Using 14 classifiers on 38 data sets, incremental search finds small, accurate ensembles in polynomial time. The discriminant ensemble uses a subset of discriminants and is simpler, interpretable, and accurate. We see that an incremental ensemble has higher accuracy than bagging and the random subspace method, and accuracy comparable to AdaBoost with fewer classifiers.

4.
Previous studies have presented ensembles of classifiers for bankruptcy prediction and credit scoring. In these studies, different ensemble schemes for complex classifiers were applied, and the best results were obtained using the Random Subspace method. The Bagging scheme was one of the ensemble methods used in the comparison. However, it was not used correctly. It is very important to apply this ensemble scheme to weak and unstable classifiers in order to produce diversity in the combination. To improve the comparison, the Bagging scheme is applied to several decision tree models for bankruptcy prediction and credit scoring. Decision trees encourage diversity in the combination of classifiers. Finally, an experimental study shows that the Bagging scheme on decision trees presents the best results for bankruptcy prediction and credit scoring.

5.
This paper presents an alternative technique for financial distress prediction systems. The method is based on a type of neural network called hybrid associative memory with translation. While many different neural network architectures have been used successfully to predict credit risk and corporate failure, the power of associative memories for financial decision-making has not yet been explored in any depth. The performance of the hybrid associative memory with translation is compared to that of four traditional neural networks, a support vector machine and a logistic regression model in terms of prediction capability. The experimental results over nine real-life data sets show that the proposed associative memory is an appropriate solution for bankruptcy and credit risk prediction, performing significantly better than the rest of the models under class imbalance and data overlapping conditions in terms of the true positive rate and the geometric mean of the true positive and true negative rates.
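The two evaluation measures named in this abstract can be illustrated with a minimal sketch. The function name `geometric_mean_score` and the choice of `1` as the positive (distressed) label are assumptions made for illustration; they do not come from the paper.

```python
import math


def geometric_mean_score(y_true, y_pred, positive=1):
    """Return (TPR, TNR, G-mean) for a binary problem.

    `positive` marks the minority class of interest (here assumed to be
    the bankrupt/distressed class).
    """
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    # Guard against a class that is absent from y_true.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return tpr, tnr, math.sqrt(tpr * tnr)


if __name__ == "__main__":
    y_true = [1, 1, 0, 0, 0, 0]
    y_pred = [1, 0, 0, 0, 0, 1]
    print(geometric_mean_score(y_true, y_pred))
```

Unlike plain accuracy, the geometric mean drops to zero when either class is never predicted correctly, which is why it is a common choice under class imbalance.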

6.
7.
The primary concern of rating policies in the banking industry is to develop a more objective, accurate and competitive scoring model to avoid losses from potential bad debt. This study proposes an artificial immune classifier based on the artificial immune network (named the AINE-based classifier) to evaluate applicants’ credit scores. Two experimental credit datasets are used to show the accuracy rate of the artificial immune classifier. Ten-fold cross-validation is applied to evaluate the performance of the classifier. The classifier is compared with other data mining techniques. Experimental results show that the AINE-based classifier is more competitive in credit scoring than the SVM and hybrid SVM-based classifiers, but not the BPN classifier. We further compare our classifier with three other AIS-based classifiers on the benchmark datasets, and show that the AINE-based classifier can rival the AIRS-based classifiers and outperforms the SAIS classifier when the number of attributes and classes increases. Our classifier can provide the credit card issuer with accurate and valuable credit scoring information to avoid incorrect decisions that result in losses from applicants’ bad debt.

8.
Financial distress prediction (FDP) has been a widely and continually studied topic in the field of corporate finance. One of the core problems in FDP is to design effective feature selection algorithms. In contrast to existing approaches, we propose an integrated approach to feature selection for the FDP problem that embeds expert knowledge in the wrapper method. The financial features are categorized into seven classes according to their financial semantics, based on experts’ domain knowledge surveyed from the literature. We then apply the wrapper method to search for “good” feature subsets consisting of top candidates from each feature class. For concept verification, we compare several scholars’ models as well as leading feature selection methods with the proposed method. Our empirical experiment indicates that the prediction model based on the feature set selected by the proposed method outperforms models based on traditional feature selection methods in terms of prediction accuracy.

9.
We set out in this study to review a vast amount of recent literature on machine learning (ML) approaches to predicting financial distress (FD), including supervised, unsupervised and hybrid supervised–unsupervised learning algorithms. Four supervised ML models, namely the traditional support vector machine (SVM), the recently developed hybrid associative memory with translation (HACT), hybrid GA-fuzzy clustering and extreme gradient boosting (XGBoost), were compared in prediction performance to the unsupervised classifier deep belief network (DBN) and the hybrid DBN-SVM model. A total of sixteen financial variables were selected from the financial statements of publicly listed Taiwanese firms as inputs to the six approaches. Our empirical findings, covering the 2010–2016 sample period, demonstrated that among the four supervised algorithms, XGBoost provided the most accurate FD prediction. Moreover, the hybrid DBN-SVM model generated more accurate forecasts than either the SVM or the classifier DBN in isolation.

10.
We present attribute bagging (AB), a technique for improving the accuracy and stability of classifier ensembles induced using random subsets of features. AB is a wrapper method that can be used with any learning algorithm. It establishes an appropriate attribute subset size and then randomly selects subsets of features, creating projections of the training set on which the ensemble classifiers are built. The induced classifiers are then used for voting. This article compares the performance of our AB method with bagging and other algorithms on a hand-pose recognition dataset. It is shown that AB gives consistently better results than bagging, both in accuracy and stability. The performance of ensemble voting in bagging and the AB method as a function of the attribute subset size and the number of voters for both weighted and unweighted voting is tested and discussed. We also demonstrate that ranking the attribute subsets by their classification accuracy and voting using only the best subsets further improves the resulting performance of the ensemble.
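The procedure described in this abstract (draw random feature subsets, train one classifier per projection, then vote) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the nearest-centroid base learner, the function names, and passing `subset_size` in as a fixed parameter (rather than establishing it automatically) are all assumptions, and only unweighted voting is shown.

```python
import random
from collections import Counter


def fit_centroids(X, y, features):
    """Base learner (assumed): per-class mean over the chosen features."""
    sums, counts = {}, Counter()
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, f in enumerate(features):
            acc[j] += row[f]
        counts[label] += 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict_centroid(centroids, row, features):
    """Predict the class whose centroid is closest in the projected space."""
    def sq_dist(centroid):
        return sum((row[f] - centroid[j]) ** 2
                   for j, f in enumerate(features))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def attribute_bagging_fit(X, y, subset_size, n_voters, seed=0):
    """Train one base learner per random feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_voters):
        features = rng.sample(range(n_features), subset_size)
        ensemble.append((features, fit_centroids(X, y, features)))
    return ensemble


def attribute_bagging_predict(ensemble, row):
    """Unweighted majority vote over all ensemble members."""
    votes = Counter(predict_centroid(centroids, row, features)
                    for features, centroids in ensemble)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    X = [[0.0, 0.0, 0.0, 0.0], [0.1, 0.2, 0.0, 0.1],
         [1.0, 1.0, 1.0, 1.0], [0.9, 1.0, 0.8, 1.0]]
    y = ["a", "a", "b", "b"]
    model = attribute_bagging_fit(X, y, subset_size=2, n_voters=5)
    print(attribute_bagging_predict(model, [0.05, 0.1, 0.0, 0.0]))
```

Any learner could replace the nearest-centroid stand-in, since the technique only changes which columns each ensemble member sees.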

11.
Combining Classifiers with Meta Decision Trees   Cited by: 4 (self-citations: 0, citations by others: 4)
The paper introduces meta decision trees (MDTs), a novel method for combining multiple classifiers. Instead of giving a prediction, MDT leaves specify which classifier should be used to obtain a prediction. We present an algorithm for learning MDTs based on the C4.5 algorithm for learning ordinary decision trees (ODTs). An extensive experimental evaluation of the new algorithm is performed on twenty-one data sets, combining classifiers generated by five learning algorithms: two algorithms for learning decision trees, a rule learning algorithm, a nearest neighbor algorithm and a naive Bayes algorithm. In terms of performance, stacking with MDTs combines classifiers better than voting and stacking with ODTs. In addition, the MDTs are much more concise than the ODTs and are thus a step towards comprehensible combination of multiple classifiers. MDTs also perform better than several other approaches to stacking.

12.
Tzong-Huei. Neurocomputing, 2009, 72(16-18): 3507
In 2008, a financial tsunami began to impair the economic development of many countries, including Taiwan. Predicting financial crises becomes much more important, and undoubtedly draws public attention, when the world economy goes into depression. This study examined the predictive ability of the four most commonly used financial distress prediction models and thus constructed reliable failure prediction models for public industrial firms in Taiwan. Multiple discriminant analysis (MDA), logit, probit, and artificial neural network (ANN) methodologies were applied to a dataset of matched samples of failed and non-failed Taiwan public industrial firms during 1998–2005. The final models were validated using within-sample and out-of-sample tests, respectively. The results indicated that the probit, logit, and ANN models used in this study achieve higher prediction accuracy and possess the ability to generalize. The probit model shows the best and most stable performance. However, if the data do not satisfy the assumptions of the statistical approach, then the ANN approach would demonstrate its advantage and achieve higher prediction accuracy. In addition, the models used in this study achieve higher prediction accuracy and generalize better than those of [Altman, Financial ratios, discriminant analysis and the prediction of corporate bankruptcy using capital market data, Journal of Finance 23 (4) (1968) 589–609, Ohlson, Financial ratios and the probability prediction of bankruptcy, Journal of Accounting Research 18 (1) (1980) 109–131, and Zmijewski, Methodological issues related to the estimation of financial distress prediction models, Journal of Accounting Research 22 (1984) 59–82]. In summary, the models used in this study can assist investors, creditors, managers, auditors, and regulatory agencies in Taiwan in predicting the probability of business failure.

13.
Financial distress prediction (FDP) has always been an important issue in business and financial management. This research proposes a novel multiple classifier ensemble model for FDP based on the firm life cycle and the Choquet integral, named MCELCCh-FDP, as a new approach to tackling financial distress. An empirical study based on real data from Chinese listed companies is conducted, and the results show that the proposed MCELCCh-FDP model has higher prediction accuracy than single classifiers. To verify the predictive contribution of the firm life cycle and the Choquet integral to the FDP model, a comparative analysis is conducted. The experimental results indicate that introducing the firm life cycle and the Choquet integral into FDP can greatly enhance prediction accuracy.

14.
How to effectively predict financial distress is an important problem in corporate financial management. Though much attention has been paid to financial distress prediction methods based on a single classifier, the uncertainty inherent in a single classifier and the benefit of combining multiple classifiers for financial distress prediction have been neglected. This paper puts forward a financial distress prediction method based on a weighted majority voting combination of multiple classifiers. The framework of the multiple classifier combination system, the weighted majority voting combination model, the voting weight model for the basic classifiers, and the principles for selecting basic classifiers are discussed in detail. An empirical experiment with real-world data from Chinese listed companies indicates that this method can greatly improve average prediction accuracy and stability, and that it is more suitable for financial distress prediction than single classifiers.
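Weighted majority voting, the combination rule named in this abstract, can be sketched in a few lines. The abstract does not specify how the voting weights are derived, so the weights below (for example, each classifier's validation accuracy) are purely an assumption, as is the function name `weighted_majority_vote`.

```python
from collections import defaultdict


def weighted_majority_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: one predicted label per base classifier.
    weights: one non-negative voting weight per base classifier
             (assumed here to be supplied by the caller).
    Ties are resolved in favour of the label that received a vote first.
    """
    if len(predictions) != len(weights):
        raise ValueError("predictions and weights must have equal length")
    totals = defaultdict(float)
    for label, weight in zip(predictions, weights):
        totals[label] += weight
    return max(totals, key=totals.get)


if __name__ == "__main__":
    # Two weak classifiers say "healthy"; one strong classifier says "distress".
    print(weighted_majority_vote(["distress", "healthy", "healthy"],
                                 [0.9, 0.4, 0.4]))
```

With equal weights the rule reduces to plain majority voting; unequal weights let one reliable classifier outvote several weaker ones.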

15.
In recent years, financial distress prediction (FDP), also known as corporate failure prediction or bankruptcy prediction, has gained significant importance due to its impact on organizations, especially during unexpected events like pandemics and wars. Machine learning (ML) models have emerged as innovative and essential tools in predicting financial distress, leveraging the ever-increasing volume of databases and computing power. This study uses bibliographic techniques to review the field's literature, with the aim of addressing the disorganized nature of existing work on FDP, reducing confusion, and providing clarity to domain researchers. These techniques make it possible to identify the progress of articles published over the years, influential authors, and highly cited articles. Additionally, the study examines crucial aspects of data preprocessing, such as missing data, imbalanced data, feature selection, and outliers, as they significantly impact the robustness and performance of ML models. Furthermore, it discusses essential models employed in FDP, focusing on recent advancements that represent promising trends. In conclusion, this study contributes to the field by uncovering novel trends and proposing possible directions for advancing FDP research. These findings will guide researchers, practitioners, and stakeholders in their quest for improved prediction and decision-making in financial distress.

16.
Z. Zhu, H. He. Information Sciences, 2007, 177(5): 1180-1192
A new self-organizing learning array (SOLAR) system has been implemented in software. It is an information theory based learning machine capable of handling a wide variety of classification problems. It has self-reconfigurable processing cells (neurons) and an evolvable system structure. Entropy based learning is performed locally at each neuron, where neural functions and connections that correspond to the minimum entropy are adaptively learned. By choosing connections for each neuron, the system sets up the wiring and completes its self-organization. SOLAR classifies input data based on weighted statistical information from all neurons. Unlike artificial neural networks, its multi-layer structure scales well to large systems capable of solving complex pattern recognition and classification tasks. This paper shows its application in economic and financial fields. A reference to influence diagrams is also discussed. Several prediction and classification cases are studied. The results have been compared with the existing methods.

17.
The simultaneous use of multiple classifiers has been shown to improve performance in classification problems. The selection of an optimal set of classifiers is an important part of multiple classifier systems, and the independence of classifier outputs is generally considered an advantage for obtaining better multiple classifier systems. In this paper, the need for classifier independence is questioned from the point of view of classification performance. The performance achieved with classifiers having independent joint distributions is compared to that of other classifiers defined to have the best and worst joint distributions. These distributions are obtained by formulating the combination operation as an optimization problem. The analysis reveals several important observations about classifier selection, which are then used to analyze the problem of selecting an additional classifier to be used with an available multiple classifier system.

18.
Support vector machine (SVM) is an effective tool for financial distress identification (FDI). However, a potential issue that keeps SVM from being applied efficiently in identifying financial distress is how to select features in SVM-based FDI. Although filters are commonly employed, this type of approach does not consider the predictive capability of the SVM itself when selecting features. This research is devoted to constructing a statistics-based wrapper for SVM-based FDI by using statistical indices of ranking-order information from predictive performances on various parameters. This wrapper consists of four levels: the data level, the SVM-based model level, the feature ranking-order level, and the feature selection index level. When the data are ready, the predictive accuracies of one type of SVM model, i.e., linear SVM (LSVM), polynomial SVM (PSVM), Gaussian SVM (GSVM), or sigmoid SVM (SSVM), on various pairs of parameters are first calculated. Then, the performances of the SVM models on each candidate feature are converted into ranking-order indices. After this step, two statistical indices, the mean and the standard deviation, are calculated from the ranking-order information for each feature. Finally, the feature selection indices of the SVM are produced by a combination of the statistical indices. Each feature whose feature selection index is smaller than half of the average index is selected to compose the optimal feature set. With a dataset collected for Chinese FDI prior to 3 years, we statistically verified the performance of this statistics-based wrapper against a non-statistics-based wrapper, two filters, and no feature selection for SVM-based FDI. Results on an unseen dataset indicate that GSVM with the statistics-based wrapper significantly outperformed the other SVM models with the other feature selection methods, as well as two wrapper-based classical statistical models.

19.
The idea of performing model combination, instead of model selection, has a long theoretical background in statistics. However, making use of theoretical results is ordinarily subject to the satisfaction of strong hypotheses (weak error correlation, availability of large training sets, possibility to rerun the training procedure an arbitrary number of times, etc.). In contrast, the practitioner is frequently faced with the problem of combining a given set of pre-trained classifiers, with highly correlated errors, using only a small training sample. Overfitting is then the main risk, which cannot be overcome but with a strict complexity control of the combiner selected. This suggests that SVMs should be well suited for these difficult situations. Investigating this idea, we introduce a family of multi-class SVMs and assess them as ensemble methods on a real-world problem. This task, protein secondary structure prediction, is an open problem in biocomputing for which model combination appears to be an issue of central importance. Experimental evidence highlights the gain in quality resulting from combining some of the most widely used prediction methods with our SVMs rather than with the ensemble methods traditionally used in the field. The gain increases when the outputs of the combiners are post-processed with a DP algorithm. Received: 15 November 2000, Received in revised form: 26 October 2001, Accepted: 13 December 2001

20.
Industrialized building construction is an approach that integrates manufacturing techniques into construction projects to achieve improved quality, shortened project duration, and enhanced schedule predictability. Time savings result from concurrently carrying out factory operations and site preparation activities. In an industrialized building construction factory, the accurate prediction of production cycle time is crucial to reap the advantage of improved schedule predictability leading to enhanced production planning and control. With the large amount of data being generated as part of the daily operations within such a factory, the present study proposes a machine learning approach to accurately estimate production time using (1) the physical characteristics of building components, (2) the real-time tracking data gathered using a radio frequency identification system, and (3) a set of engineered features constructed to capture the real-time loading conditions of the job shop. The results show a mean absolute percentage error and correlation coefficient of 11% and 0.80, respectively, between the actual and predicted values when using random forest models. The results confirm the significant effects of including shop utilization features in model training and suggest that predicting production time can be reasonably achieved.
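For readers unfamiliar with the two metrics reported in this abstract, a minimal sketch of how they are computed follows. The abstract does not say which correlation coefficient was used; Pearson's r is assumed here, and the function names and sample values are illustrative only.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero.
    """
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical production times (actual vs. predicted).
    actual = [100.0, 200.0, 400.0]
    predicted = [110.0, 180.0, 400.0]
    print(mape(actual, predicted), pearson_r(actual, predicted))
```

The two numbers answer different questions: MAPE measures the typical relative size of the error, while the correlation coefficient measures how well predictions track the ordering and spread of the actual values.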
