1.
《Expert systems with applications》2008,34(4):847-856
The credit card industry has grown rapidly in recent years, and banks' credit departments have therefore collected huge volumes of consumer credit data. The credit scoring manager often evaluates a consumer’s credit using intuitive experience. With the support of a credit classification model, however, the manager can evaluate the applicant’s credit score accurately. Support Vector Machine (SVM) classification is currently an active research area and successfully solves classification problems in many domains. This study used three strategies to construct hybrid SVM-based credit scoring models that evaluate the applicant’s credit score from the applicant’s input features. Two credit datasets from the UCI database were selected as experimental data to demonstrate the accuracy of the SVM classifier. Compared with neural networks, genetic programming, and decision tree classifiers, the SVM classifier achieved an identical classificatory accuracy with relatively few input features. Additionally, by combining genetic algorithms with the SVM classifier, the proposed hybrid GA-SVM strategy can perform feature selection and model parameter optimization simultaneously. Experimental results show that SVM is a promising addition to the existing data mining methods.
2.
Diwakar Tripathi Damodar Reddy Edla Ramalingaswamy Cheruku Venkatanareshbabu Kuppili 《Computational Intelligence》2019,35(2):371-394
Credit scoring focuses on the development of empirical models to support the financial decision‐making processes of financial institutions and credit industries. It makes use of applicants' historical data and statistical or machine learning techniques to assess the risk associated with an applicant. However, the historical data may contain redundant and noisy features that degrade the performance of credit scoring models. The main focus of this paper is to develop a hybrid model, combining feature selection and a multilayer ensemble classifier framework, to improve the predictive performance of credit scoring. The proposed hybrid credit scoring model is built in three phases. The initial phase performs preprocessing and assigns ranks and weights to classifiers. In the next phase, the ensemble feature selection approach is applied to the preprocessed dataset. In the last phase, the dataset with the selected features is used in a multilayer ensemble classifier framework. In addition, a classifier placement algorithm based on the Choquet integral value is designed, as the classifier placement affects the predictive performance of the ensemble framework. The proposed hybrid credit scoring model is validated on real‐world credit scoring datasets, namely, the Australian, Japanese, German‐categorical, and German‐numerical datasets.
3.
Developing rule extraction algorithms from machine learning techniques such as artificial neural networks and support vector
machines (SVMs), which are considered incomprehensible black-box models, is an important topic in current research. This study
proposes a rule extraction algorithm from SVMs that uses a kernel-based clustering algorithm to integrate all support vectors
and genetic algorithms into extracted rule sets. This study uses measurements of accuracy, sensitivity, specificity, coverage,
fidelity and comprehensibility to evaluate the performance of the proposed method on the public credit screening data sets.
Results indicate that the proposed method performs better than other rule extraction algorithms. Thus, the proposed algorithm
is an essential analysis tool that can be effectively used in data mining fields.
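The evaluation measures named in this abstract can be sketched as follows. This is a minimal illustration under common textbook definitions, not the paper's own code: the function names `screening_metrics` and `fidelity` are hypothetical, and fidelity is assumed to mean the share of samples on which the extracted rules agree with the SVM they were extracted from.

```python
def screening_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity: share of true positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of true negatives that were detected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


def fidelity(svm_pred, rule_pred):
    """Assumed definition: fraction of samples where rules and SVM agree."""
    if not svm_pred:
        return 0.0
    agree = sum(1 for s, r in zip(svm_pred, rule_pred) if s == r)
    return agree / len(svm_pred)
```

Calling `screening_metrics(labels, rule_predictions)` scores the rules against the ground truth, while `fidelity(svm_predictions, rule_predictions)` scores them against the black-box model they are meant to explain.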
4.
The development of credit scoring models has become a very important issue, as the credit industry is highly competitive. Numerous credit scoring models have therefore been studied in the field of statistics over the past few years to improve the accuracy of credit scoring. This study constructs a hybrid SVM-based credit scoring model to evaluate the applicant’s credit score according to the applicant’s input features: (1) using neighborhood rough set to select input features; (2) using grid search to optimize RBF kernel parameters; (3) using the hybrid optimal input features and model parameters to solve the credit scoring problem with 10-fold cross validation; (4) comparing the accuracy of the proposed method with other methods. Experimental results demonstrate that the hybrid classifier based on neighborhood rough set and SVM has the best credit scoring capability compared with other hybrid classifiers. It also outperforms linear discriminant analysis, logistic regression and neural networks.
5.
6.
Akhil Bandhu Hens Manoj Kumar Tiwari 《Expert systems with applications》2012,39(8):6774-6781
With the rapid growth of the credit industry, credit scoring models are of great significance for issuing credit cards to applicants with minimum risk. Credit scoring is therefore very important for financial firms such as banks. A model is built from previous data, and that model is used to decide whether an applicant is granted a loan or credit card or is rejected. There are several methodologies for constructing a credit scoring model, e.g., neural network models, statistical classification techniques, genetic programming, and support vector models. The computational time needed to run a model is of great importance in the 21st century. Algorithms or models with less computational time are more efficient and thus give more profit to banks or firms. In this study, we propose a new strategy to reduce the computational time for credit scoring. In this approach, we use an SVM together with feature reduction based on the F score, and we build the credit scoring model from a sample rather than the whole dataset. We run our method on two real datasets to assess its performance and compare its results with those obtained from other well-known methods. The new method is shown to be highly competitive with the other methods in terms of accuracy while requiring less computational time.
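The abstract does not define the F score, so the sketch below assumes the common two-class form: for each feature, the squared distances of the two class means from the overall mean, divided by the sum of the two within-class sample variances. The names `f_score` and `top_features` are hypothetical, and the subsequent SVM training and data sampling steps are not shown.

```python
def f_score(column, labels, positive=1):
    """F score of one feature: between-class spread over within-class spread."""
    pos = [x for x, y in zip(column, labels) if y == positive]
    neg = [x for x, y in zip(column, labels) if y != positive]
    if len(pos) < 2 or len(neg) < 2:
        raise ValueError("need at least two samples in each class")
    mean_all = sum(column) / len(column)
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    # Sample variances (n - 1 in the denominator) of each class.
    within = (sum((x - mean_pos) ** 2 for x in pos) / (len(pos) - 1)
              + sum((x - mean_neg) ** 2 for x in neg) / (len(neg) - 1))
    if within == 0:
        return float("inf") if between > 0 else 0.0
    return between / within


def top_features(rows, labels, k, positive=1):
    """Return the indices of the k highest-scoring features and all scores."""
    n_features = len(rows[0])
    scores = [f_score([r[j] for r in rows], labels, positive)
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores
```

Features with a larger score separate the two classes more clearly, so keeping only the top `k` shrinks the input that a classifier has to process, which is one plausible route to the shorter running time the abstract describes.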
7.
A data driven ensemble classifier for credit scoring analysis (total citations: 2; self-citations: 0; citations by others: 2)
This study focuses on predicting whether a credit applicant can be categorized as good, bad or borderline from the information initially supplied. This is essentially a classification task for credit scoring. Given its importance, many researchers have recently worked on ensembles of classifiers. However, to the best of our knowledge, unrepresentative samples drastically reduce the accuracy of the deployed classifier, and few have attempted to preprocess the input samples into more homogeneous cluster groups and then fit the ensemble classifier accordingly. For this reason, we introduce the concept of class-wise classification as a preprocessing step in order to obtain an efficient ensemble classifier. This strategy would work better than a direct ensemble of classifiers without the preprocessing step. The proposed ensemble classifier is constructed by incorporating several data mining techniques: optimal associate binning is used to discretize continuous values, and a neural network, a support vector machine, and a Bayesian network are used to augment the ensemble classifier. In particular, the Markov blanket concept of the Bayesian network allows for a natural form of feature selection, which provides a basis for mining association rules. The learned knowledge is represented in multiple forms, including a causal diagram and constrained association rules. The data driven nature of the proposed system distinguishes it from existing hybrid/ensemble credit scoring systems.
8.
9.
《Calphad》2021
Deriving and discovering the physical dynamics inherent in big data is one of the major purposes of machine learning (ML) in modern natural science. In materials science, phase diagrams are often called “road maps” for understanding the conditions for phase formation and/or transformation in any material system caused by the associated thermodynamics. In this paper, we report a numerical experiment investigating whether the underlying thermodynamics can be derived, with the help of ML, from big data consisting of local spatial composition and phase distribution data. The artificial data analysed were created assuming a steel composition, based on calculation of phase diagrams (CALPHAD) thermodynamics combined with an order-statistics-based sampling model. The hypothetical data acquisition procedure assumed in this numerical experiment is as follows: (i) obtaining local analysis data on the composition and phase distribution in the same observation area using instruments such as an electron probe micro analyser (EPMA) and electron backscattering diffraction (EBSD), and (ii) training a classification model based on an ML algorithm with the compositional data as input and the phase data as output. The accuracies of the reconstructed phase diagrams have been estimated for three ML algorithms, i.e. support vector machine (SVM), random forest, and multilayer perceptron (MLP). The phase diagrams predicted using SVM and MLP are found to be adequately consistent with those of the CALPHAD method. We have also investigated the regression performance for the continuous data involved in the CALPHAD thermodynamics, such as the phase fractions of the body-centred cubic, face-centred cubic, and cementite phases. Compared with the ML algorithms, the CALPHAD method is found to show superior predictive performance since it is based on a sophisticated physical model.
10.
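Image retrieval is one of the important research topics in image-sharing social networks. Traditional image retrieval methods usually match the keywords entered by the user against the textual descriptions of images. Because textual information is ambiguous and describing images in text is very difficult, the accuracy of the retrieval results is low. To improve the accuracy of image retrieval, an image retrieval method based on learning to rank is proposed. Each image is described by multiple feature descriptors; when the user's input is an image, retrieval is performed by comparing the similarity between the query image and the images in the image library. Two learning methods, support vector machines and association rules, are used to learn the weight combination of the feature descriptors, and corresponding learning algorithms are proposed. Experiments show that the proposed learning-based image retrieval method achieves higher accuracy than related image retrieval methods. In addition, when support vector machines and association rules are used to learn the classification function, the results obtained are correlated, because both algorithms learn the weights of the image descriptors from the same data instances.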
11.
Arash Jalali Hassan Farsi Shahrokh Ghaemmaghami 《Multimedia Tools and Applications》2018,77(13):16347-16366
Achieving high detection rates at low embedding rates is still a challenging problem in many steganalysis systems. A recently proposed steganalysis system based on a sparse representation classifier has shown remarkable detection rates at low embedding rates. In this paper, we propose a new steganalysis system based on a double sparse representation classifier. We compare our proposed method with other steganalysis systems that use different classifiers (including nearest neighbor, support vector machine, ensemble support vector machine and sparse representation). In all of our experiments, the input features to the classifier are fixed and the ability of the classifier is examined. We also provide a complexity analysis in terms of execution time for the different classifiers. In most experiments, our proposed method shows superior performance in terms of detection rate and complexity at low embedding rates.
12.
A new fuzzy support vector machine to evaluate credit risk (total citations: 7; self-citations: 0; citations by others: 7)
Due to recent financial crises and regulatory concerns, financial intermediaries' credit risk assessment is an area of renewed interest in both the academic world and the business community. In this paper, we propose a new fuzzy support vector machine to discriminate good creditors from bad ones. Because in credit scoring we usually cannot label a customer as absolutely good (certain to repay on time) or absolutely bad (certain to default), our new fuzzy support vector machine treats every sample as belonging to both the positive and the negative class, but with different memberships. In this way we expect the new fuzzy support vector machine to generalize better while preserving the merit of being insensitive to outliers, like the fuzzy support vector machine (SVM) proposed in previous papers. We reformulate this kind of two-group classification problem as a quadratic programming problem. Empirical tests on three public datasets show that it can have better discriminatory power than the standard support vector machine and the fuzzy support vector machine if an appropriate kernel and membership generation method are chosen.
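One way to write such a two-membership formulation as a quadratic program is sketched below. The abstract does not give the formula, so the notation is an assumption: $m_k \in [0,1]$ is the membership of sample $x_k$ in the positive class, $1 - m_k$ its membership in the negative class, $\phi$ the kernel feature map, $C$ the regularization constant, and $\xi_k$, $\eta_k$ the slack variables for the two roles each sample plays.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\eta}\quad
  & \frac{1}{2}\, w^{\top} w
    + C \sum_{k=1}^{N} \bigl[\, m_k\,\xi_k + (1 - m_k)\,\eta_k \,\bigr] \\
\text{s.t.}\quad
  & w^{\top}\phi(x_k) + b \;\ge\; 1 - \xi_k, \\
  & w^{\top}\phi(x_k) + b \;\le\; -1 + \eta_k, \\
  & \xi_k \ge 0,\quad \eta_k \ge 0, \qquad k = 1,\dots,N .
\end{aligned}
```

Under this reading, a sample with $m_k = 1$ behaves like an ordinary positive example and one with $m_k = 0$ like an ordinary negative example; intermediate memberships split the misclassification penalty between the two roles.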
13.
Lu Han Liyan Han Hongwei Zhao 《Engineering Applications of Artificial Intelligence》2013,26(2):848-862
The most commonly used technique for credit scoring is logistic regression, and more recent research has proposed that the support vector machine is a more effective method. However, both logistic regression and the support vector machine suffer from the curse of dimensionality. In this paper, we introduce a new way to address this problem, which is defined as orthogonal dimension reduction. We discuss the related properties of this method in detail and test it against other common statistical approaches (principal component analysis and hybridized logistic regression) to better solve and evaluate the data. In experiments on the German data set, there is also an interesting phenomenon with respect to the use of the support vector machine, which we define as ‘Dimensional interference’ and discuss in general. Based on the results of cross-validation, it can be found that, by using logistic regression to filter the dummy variables and orthogonal feature extraction, the support vector machine not only reduces complexity and accelerates convergence, but also achieves better performance.
14.
Technology credit scoring models have been used to screen loan applicant firms based on their technology. Typically, a logistic regression model is employed to relate the probability of a firm's loan default to several evaluation attributes associated with its technology. However, these attributes are evaluated in linguistic expressions represented by fuzzy numbers. The possibility of loan default can likewise be described in verbal terms. To handle these fuzzy input and output data, we proposed a fuzzy credit scoring model that can be applied to predict the possibility of loan default for a firm that is approved based on its technology. Fuzzy logistic regression was presented in this study as an appropriate prediction approach for credit scoring with fuzzy input and output. The performance of the model is improved compared to that of typical logistic regression. This study is expected to contribute to the practical use of technology credit scoring with linguistic evaluation attributes.
15.
Credit scoring has become a critical and challenging management science issue, as the credit industry has been facing fiercer competition in recent years. Many methods have been suggested in the literature to tackle this problem. In this paper, we propose a hybrid support vector machine technique based on three strategies: (1) using CART to select input features, (2) using MARS to select input features, (3) using grid search to optimize model parameters. To verify the feasibility and effectiveness of the proposed hybrid SVM model, a credit card dataset provided by a local bank in China is used in this study. Analytic results demonstrate that the hybrid SVM technique not only has the best classification rate but also the lowest Type II error in comparison with CART, MARS and SVM, and they support the presumption that SVM is better able to capture nonlinear relationships among variables.
16.
Development of a quick credibility scoring decision support system using fuzzy TOPSIS (total citations: 1; self-citations: 0; citations by others: 1)
In this study, a quick credibility scoring decision support system is developed for banks to determine the credibility of manufacturing firms in Turkey. The proposed decision support system is expected to be used by banks when they want to determine whether an applicant firm is worth a detailed credit check. Using such a quick credit scoring decision model reduces the banks’ workload. The proposed credit scoring model is based on financial ratios and the fuzzy TOPSIS approach. It produces two separate scores, which reflect the attractiveness of manufacturing industries within the overall economy and the performance of manufacturing firms relative to their competitors in the same industry. These two scores are then used to determine the credibility of applicant manufacturing firms. The developed decision support system is tested with various real cases and satisfactory results are obtained. An application is also provided in the paper for illustrative purposes.
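The ranking idea behind TOPSIS can be sketched with ordinary (crisp) numbers. This is a deliberate simplification and not the fuzzy TOPSIS of the paper, which works with fuzzy ratings; the function name `topsis`, the vector normalization, and the argument layout are assumptions made for illustration.

```python
import math


def topsis(matrix, weights, benefit):
    """Closeness score in [0, 1] for each alternative (row) of `matrix`.

    weights: one weight per criterion (column).
    benefit: True where larger is better, False where smaller is better.
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] if norms[j] else 0.0
          for j in range(n_crit)] for row in matrix]
    cols = list(zip(*v))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(cols)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(cols)]
    scores = []
    for row in v:
        d_pos = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, ideal)))
        d_neg = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, anti)))
        # Closer to the ideal and farther from the anti-ideal scores higher.
        scores.append(d_neg / (d_pos + d_neg) if (d_pos + d_neg) else 0.0)
    return scores
```

For example, `topsis([[3, 1], [1, 3], [2, 2]], [0.5, 0.5], [True, False])` treats the first column as a benefit criterion and the second as a cost criterion, so the first alternative coincides with the ideal point and receives the highest score.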
17.
18.
We propose a novel architecture for a higher order fuzzy inference system (FIS) and develop a learning algorithm to build the FIS. The consequent part of the proposed FIS is expressed as a nonlinear combination of the input variables, which can be obtained by introducing an implicit mapping from the input space to a high dimensional feature space. The proposed learning algorithm consists of two phases. In the first phase, the antecedent fuzzy sets are estimated by the kernel-based fuzzy c-means clustering. In the second phase, the consequent parameters are identified by support vector machine whose kernel function is constructed by fuzzy membership functions and the Gaussian kernel. The performance of the proposed model is verified through several numerical examples generally used in fuzzy modeling. Comparative analysis shows that, compared with the zero-order fuzzy model, first-order fuzzy model, and polynomial fuzzy model, the proposed model exhibits higher accuracy, better generalization performance, and satisfactory robustness.
19.
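A Dirichlet compound multinomial (DCM) manifold is proposed. Exploiting the property that the DCM manifold is homeomorphic and isometric to the positive hemisphere manifold, the geodesic distance on the positive hemisphere manifold is mapped to the geodesic distance on the DCM manifold through a pullback mapping, which establishes a distance metric on the DCM manifold. On this basis, a DCM diffusion kernel and a DCM inverse document frequency (DCMIDF) diffusion kernel are constructed on the statistical manifold. Experiments on the WebKBTop4 and 20Newsgroups corpora show that the DCM manifold describes text more accurately than Euclidean space. Compared with the multinomial-kernel support vector machine and the negative-geodesic-distance-kernel support vector machine, the experimental results show that the support vector machine algorithms based on the DCM and DCMIDF diffusion kernels proposed in the paper achieve good text classification performance.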