Found 20 similar documents; search time: 31 ms
1.
As churn management is a major task for companies to retain valuable customers, the ability to predict customer churn is necessary. In the literature, neural networks have shown their applicability to churn prediction. On the other hand, hybrid data mining techniques that combine two or more techniques have been shown to outperform many single techniques across a number of problem domains. This paper considers two hybrid models for churn prediction that combine two different neural network techniques: back-propagation artificial neural networks (ANN) and self-organizing maps (SOM). The hybrid models are ANN combined with ANN (ANN + ANN) and SOM combined with ANN (SOM + ANN). In particular, the first technique of each hybrid model performs the data reduction task by filtering out unrepresentative training data. The outputs, as representative data, are then used to create the prediction model based on the second technique. To evaluate the performance of these models, three different kinds of testing sets are considered: the general testing set and two fuzzy testing sets based on the data filtered out by the first technique of the two hybrid models, i.e. ANN and SOM, respectively. The experimental results show that the two hybrid models outperform the single neural network baseline model in terms of prediction accuracy and Type I and Type II errors over the three kinds of testing sets. In addition, the ANN + ANN hybrid model performs significantly better than the SOM + ANN hybrid model and the ANN baseline model.
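The two-stage idea in this abstract (filter the training data with a first model, then train the final predictor on what remains) can be sketched as follows. This is only a minimal illustration under stated assumptions: a nearest-centroid classifier stands in for both neural networks, "unrepresentative" is assumed to mean "misclassified by the first-stage model", and every function name and data point is hypothetical.

```python
from statistics import mean

def fit_centroids(samples, labels):
    """Return {label: centroid} for a list of equal-length feature tuples."""
    centroids = {}
    for label in sorted(set(labels)):
        members = [s for s, l in zip(samples, labels) if l == label]
        centroids[label] = tuple(mean(col) for col in zip(*members))
    return centroids

def predict(centroids, sample):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(sample, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))

def hybrid_fit(samples, labels):
    """Stage 1 filters the training set; stage 2 is trained on the kept data."""
    # Stage 1 (assumption): keep only samples the first model classifies correctly.
    stage1 = fit_centroids(samples, labels)
    kept = [(s, l) for s, l in zip(samples, labels) if predict(stage1, s) == l]
    # Stage 2: train the final predictor on the reduced training set.
    stage2 = fit_centroids([s for s, _ in kept], [l for _, l in kept])
    return stage2, kept

# Hypothetical toy data: the last class-0 point sits inside the class-1 cluster.
X = [(0, 0), (0, 1), (1, 0), (5.5, 5.5), (5, 5), (5, 6), (6, 5)]
y = [0, 0, 0, 0, 1, 1, 1]
model, kept = hybrid_fit(X, y)
```

On this toy data the mislabelled-looking point is dropped in stage 1, so the stage-2 centroids are computed from the six remaining samples.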
2.
Customer retention in telecommunication companies is one of the most important issues in customer relationship management, and customer churn prediction is a major instrument in customer retention. Churn prediction aims at identifying potential churning customers. Traditional approaches for determining potential churning customers are based only on customers' personal information, without considering the relationships among customers. However, the subscribers of telecommunication companies are connected with other customers, and network properties among people may affect churn. For this reason, we proposed a new churn prediction procedure that examines the communication patterns among subscribers and considers a propagation process in a network built from call detail records, which transfers churning information from churners to non-churners. A fast and effective propagation process is possible through community detection and through setting the initial energy of churners (the amount of information transferred) differently according to churn date or centrality. The proposed procedure was evaluated based on the performance of the prediction model trained with a social network feature and traditional personal features.
3.
XML documents have recently become ubiquitous because of their varied applicability in a number of applications. Classification is an important problem in the data mining domain, but current classification methods for XML documents use IR-based methods in which each document is treated as a bag of words. Such techniques ignore a significant amount of information hidden inside the documents. In this paper we discuss the problem of rule-based classification of XML data by using frequent discriminatory substructures within XML documents. Such a technique is more capable of finding the classification characteristics of documents. In addition, the technique can also be extended to cost-sensitive classification. We show the effectiveness of the method with respect to other classifiers. We note that the methodology discussed in this paper is applicable to any kind of semi-structured data.
Editors: Hendrik Blockeel, David Jensen and Stefan Kramer
An erratum to this article is available.
4.
Lot output time prediction is a critical task for a wafer fab (fabrication plant). To further enhance the accuracy of wafer lot output time prediction, an intelligent hybrid system is constructed in this study. Firstly, the concept of input classification is applied to Chen’s fuzzy back propagation network (FBPN) approach by pre-classifying wafer lots with the k-means (kM) classifier before predicting the output times with FBPN. Examples belonging to different categories are then learned with different FBPNs that share the same topology. Secondly, the future release plan of the fab, which is influential but has been ignored in traditional approaches, is also incorporated in the intelligent hybrid system. To evaluate the effectiveness of the proposed methodology, production simulation has been applied to generate test examples. According to experimental results, the prediction accuracy of the intelligent hybrid system was significantly better than those of six approaches (BPN, case-based reasoning (CBR), FBPN, look-ahead FBPN, evolving fuzzy rules (EFR), and kM-FBPN without look-ahead) in most cases, achieving a 17-47% (on average 34%) reduction in the root-mean-squared error (RMSE) relative to the comparison basis, BPN.
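For readers unfamiliar with the metric quoted in this abstract, the following sketch shows how RMSE and a percentage reduction relative to a baseline are typically computed. The function names are hypothetical, and the numbers in the usage lines are illustrative only; they are not taken from the paper's experiments.

```python
import math

def rmse(actual, predicted):
    """Root-mean-squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))

def rmse_reduction(baseline_rmse, new_rmse):
    """Percentage reduction of new_rmse relative to baseline_rmse."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Illustrative values only: a baseline RMSE of 10.0 and a new RMSE of 6.6
# correspond to a reduction of about 34%.
example_error = rmse([0, 0], [3, 4])
example_reduction = rmse_reduction(10.0, 6.6)
```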
5.
We studied the problem of optimizing the performance of a DSS for churn prediction. In particular, we investigated the beneficial effect of adding the voice of customers through call center emails (i.e. textual information) to a churn-prediction system that only uses traditional marketing information. We found that adding unstructured, textual information to a conventional churn-prediction model resulted in a significant increase in predictive performance. From a managerial point of view, this integrated framework helps marketing decision makers to better identify the customers most prone to switch. Consequently, their customer retention campaigns can be targeted more effectively because the prediction method is better at detecting those customers who are likely to leave.
6.
In this paper, an effective hybrid algorithm based on the estimation of distribution algorithm (EDA) is proposed to solve the multidimensional knapsack problem (MKP). Within the framework of EDA, the probability model is built from the superior population and new individuals are generated based on the probability model. In addition, an updating mechanism for the probability model is proposed, together with a mechanism for initializing the probability model based on specific knowledge of the MKP, to improve the convergence speed. Meanwhile, an adaptive local search is proposed to enhance the exploitation ability. Furthermore, the influences of parameters are investigated based on the Taguchi method of design of experiments, and the importance of the repair operator is also studied via simulation testing and comparisons. Finally, numerical simulation is carried out on benchmark instances, and comparisons with some existing algorithms demonstrate the effectiveness of the proposed algorithm.
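The abstract's core loop (build a probability model from the superior individuals, then sample new individuals from it) can be illustrated with a generic sketch. The update rule below is a PBIL-style blend chosen for illustration; it is an assumption, not the updating mechanism proposed in the paper, and the names `update_model`, `sample` and `learning_rate` are hypothetical.

```python
import random

def update_model(prob, elite, learning_rate=0.3):
    """Shift each bit probability toward that bit's frequency among the elite."""
    n_elite = len(elite)
    return [
        (1 - learning_rate) * p
        + learning_rate * sum(ind[i] for ind in elite) / n_elite
        for i, p in enumerate(prob)
    ]

def sample(prob, rng):
    """Draw one binary individual; bit i is 1 with probability prob[i]."""
    return [1 if rng.random() < p else 0 for p in prob]

# Two items, uniform initial model, and an elite set that always packs item 0.
model = update_model([0.5, 0.5], [[1, 0], [1, 0]], learning_rate=0.2)
individual = sample([1.0, 0.0], random.Random(0))
```

In an MKP setting each bit would indicate whether an item is packed, and infeasible samples would still need a repair step such as the repair operator the abstract mentions.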
7.
The university course timetabling problem is an optimisation problem in which a set of events has to be scheduled in timeslots and located in suitable rooms. Recently, a set of benchmark instances was introduced and used for an ‘International Timetabling Competition’ to which 24 algorithms were submitted by various research groups active in the field of timetabling. We describe and analyse a hybrid metaheuristic algorithm which was developed under the very same rules and deadlines imposed by the competition and outperformed the official winner. It combines various construction heuristics, tabu search, variable neighbourhood descent and simulated annealing. Due to the complexity of developing hybrid metaheuristics, we relied strongly on an experimental methodology for configuring the algorithms as well as for choosing proper parameter settings. In particular, we used racing procedures that allow an automatic or semi-automatic configuration of algorithms with considerable time savings. Our successful example shows that the systematic design of hybrid algorithms through an experimental methodology leads to high-performing algorithms for hard combinatorial optimisation problems.
9.
The behaviours of hybrid dynamic systems (HDS) are determined by combining continuous variables with discrete switching logic. The identification of an HDS aims to find an accurate model of the system’s dynamics based on its past inputs and outputs. In pattern recognition (PR) methods, each mode is represented by a set of similar patterns that form restricted regions in the feature space. These sets of patterns are called classes. A pattern is a vector built from past inputs and outputs. HDS identification is a challenging problem since it involves the estimation of different sets of parameters without knowing in advance which sections of the measured data correspond to the different modes of the system. Therefore, HDS identification can be achieved by combining two steps: clustering and parameter estimation. In the clustering step, the number of discrete modes (i.e., the classes to which input-output data points belong) is estimated. The parameter estimation step finds the parameters of the models that govern the continuous dynamics in each mode. In this paper, an unsupervised PR method is proposed to achieve the clustering step of the identification of temporally switched linear HDS. The determination of the number of modes does not require prior information about the modes or their number.
10.
A neural network architecture is introduced which implements a supervised clustering algorithm for the classification of feature vectors. The network is self-organising, and is able to adapt to the shape of the underlying pattern distribution as well as detect novel input vectors during training. It is also capable of determining the relative importance of the feature components for classification. The architecture is a hybrid of supervised and unsupervised networks, and combines the strengths of three well-known architectures: learning vector quantisation, back-propagation and adaptive resonance theory. Network performance is compared to that of learning vector quantisation, back-propagation and cascade-correlation. It is found that performance is generally as good as or better than the performance of these other architectures, while training time is considerably shorter. However, the main advantage of the hybrid architecture is its ability to gain insight into the feature pattern space.
Nomenclature
O_j: The output value of the jth unit
I_i: The ith component of the input pattern
W_ij: The weight of the cluster connection between the ith input and the jth unit
B_ij: The weight of the shape connection between the ith input and the jth unit
N: The dimension of the input patterns
v_j: The vigilance parameter of the jth unit
v_init: The initial vigilance parameter value
v_rate: The change in the vigilance parameter value
X_i: The ith direction in an N-dimensional coordinate system
T_k: The classification tag of the kth unit
C: The classification tag of the current input vector
[symbol missing](p): The learning rate at the pth epoch for the cluster weights
p: The current epoch
P: The total number of epochs
E_k: The error associated with the kth unit
[symbol missing]: The constant learning rate for the shape weights
a_j: The age in epochs of the jth unit
11.
To build a successful customer churn prediction model, a classification algorithm should be chosen that fulfills two requirements: strong classification performance and a high level of model interpretability. In recent literature, ensemble classifiers have demonstrated superior performance in a multitude of applications and data mining contests. However, due to their increased complexity, they result in models that are often difficult to interpret. In this study, GAMensPlus, an ensemble classifier based upon generalized additive models (GAMs) in which performance and interpretability are reconciled, is presented and evaluated in the context of churn prediction modeling. The recently proposed GAMens, based upon Bagging, the Random Subspace Method and semi-parametric GAMs as constituent classifiers, is extended to include two instruments for model interpretability: generalized feature importance scores, and bootstrap confidence bands for smoothing splines. In an experimental comparison on data sets from six real-life churn prediction projects, the competitive performance of the proposed algorithm against a set of well-known benchmark algorithms is demonstrated in terms of four evaluation metrics. Further, the ability of the technique to deliver valuable insight into the drivers of customer churn is illustrated in a case study on data from a European bank. Firstly, it is shown how the generalized feature importance scores allow the analyst to identify the relative importance of churn predictors as a function of the criterion that is used to measure the quality of the model predictions. Secondly, the ability of GAMensPlus to identify nonlinear relationships between predictors and churn probabilities is demonstrated.
12.
The prevention of subscriber churn through customer retention is a core issue of Customer Relationship Management (CRM). By minimizing customer churn a company maximizes its profit. This paper proposes a hybridized architecture to deal with customer retention problems. It does so not only by predicting churn probability but also by proposing retention policies. The architecture works in two modes: learning and usage. In the learning mode, the churn model learner seeks potential associations from the subscriber database. This historical information is used to form a churn model. This mode also calls for a policy model constructor to use the attributes identified in the churn model to divide all ‘churners’ into distinct groups. The policy model constructor is also responsible for developing a policy model for each churner group. In the usage mode, a churn predictor uses the churn model to predict the churn probability of a given subscriber. When the churn model finds that the subscriber has a high churn probability, the policy model is used to suggest specific retention policies. This study’s experiments show that the churn model has an evaluation accuracy of approximately eighty-five percent. This suggests that policy model construction represents an interesting and important technique in investigating the characteristics of churner groups. Furthermore, this study indicates that understanding the relationships between churns is essential in creating effective retention policy models for dealing with ‘churners’.
13.
As a typical manufacturing and scheduling problem with a strong industrial background, flow shop scheduling with limited buffers has gained wide attention in both academic and engineering fields. With the objective of minimizing the total completion time (or makespan), such a problem is very hard to solve effectively due to its NP-hardness and the constraint on the intermediate buffers. In this paper, an effective hybrid genetic algorithm (HGA) is proposed for permutation flow shop scheduling with limited buffers. In the HGA, not only are multiple genetic operators based on an evolutionary mechanism used simultaneously in a hybrid sense, but a neighborhood structure based on a graph model is also employed to enhance the local search, so that the exploration and exploitation abilities can be well balanced. Moreover, a decision probability is used to control the utilization of the genetic mutation operation and local search based on problem-specific information, so as to prevent premature convergence and concentrate computing effort on promising neighbor solutions. Simulation results and comparisons based on benchmarks demonstrate the effectiveness of the HGA. Meanwhile, the effects of buffer size and decision probability on optimization performance are discussed.
14.
A hybrid learning algorithm for multilayered perceptrons (MLPs) and pattern-by-pattern training, based on optimized instantaneous learning rates and the recursive least squares method, is proposed. This hybrid solution is developed for on-line identification of process models based on the use of MLPs, and can speed up the learning process of the MLPs substantially, while simultaneously preserving the stability of the learning process. For illustration and test purposes the proposed algorithm is applied to the identification of a non-linear dynamic system.
15.
In this paper, a hybrid intelligent system that consists of the Fuzzy Min–Max neural network, the Classification and Regression Tree, and the Random Forest model is proposed, and its efficacy as a decision support tool for medical data classification is examined. The hybrid intelligent system aims to exploit the advantages of the constituent models and, at the same time, alleviate their limitations. It is able to learn incrementally from data samples (owing to the Fuzzy Min–Max neural network), explain its predicted outputs (owing to the Classification and Regression Tree), and achieve high classification performance (owing to the Random Forest). To evaluate the effectiveness of the hybrid intelligent system, three benchmark medical data sets, viz., Breast Cancer Wisconsin, Pima Indians Diabetes, and Liver Disorders from the UCI Repository of Machine Learning, are used. A number of performance metrics useful in medical applications, including accuracy, sensitivity, specificity, and the area under the Receiver Operating Characteristic curve, are computed. The results are analyzed and compared with those from other methods published in the literature. The experimental outcomes demonstrate that the hybrid intelligent system is effective in undertaking medical data classification tasks. More importantly, the hybrid intelligent system is able not only to produce good results but also to elucidate its knowledge base with a decision tree. As a result, domain users (i.e., medical practitioners) are able to comprehend the predictions given by the hybrid intelligent system, and hence to accept its role as a useful medical decision support tool.
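Three of the metrics named in this abstract follow directly from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the counts in the usage line are hypothetical and are not results from the paper. The area under the ROC curve is omitted because it requires ranked scores rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,     # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),     # true-positive rate (recall)
        "specificity": tn / (tn + fp),     # true-negative rate
    }

# Illustrative counts only.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```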
16.
In this paper, an effective hybrid algorithm based on particle swarm optimization (HPSO) is proposed for the permutation flow shop scheduling problem (PFSSP) with limited buffers between consecutive machines, with the objective of minimizing the maximum completion time (i.e., makespan). First, a novel encoding scheme based on random key representation is developed, which converts the continuous position values of particles in PSO to job permutations. Second, an efficient population initialization based on the well-known Nawaz–Enscore–Ham (NEH) heuristic is proposed to generate an initial population with a certain quality and diversity. Third, a local search strategy based on the generalization of the block elimination properties, named block-based local search, is probabilistically applied to some good particles. Moreover, simulated annealing (SA) with multiple neighborhoods, guided by an adaptive meta-Lamarckian learning strategy, is designed to prevent premature convergence and concentrate computing effort on promising solutions. Simulation results and comparisons demonstrate the effectiveness of the proposed HPSO. Furthermore, the effects of some parameters are discussed.
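A random-key encoding of the kind mentioned in this abstract maps a real-valued particle position to a job permutation by ranking its components. The sketch below assumes the common convention of sorting jobs by ascending position value; the abstract does not state which ranking rule the paper uses, and `decode_random_keys` is a hypothetical name.

```python
def decode_random_keys(position):
    """Return job indices ordered by ascending position value.

    Ties keep their original index order because Python's sort is stable.
    """
    return sorted(range(len(position)), key=lambda job: position[job])

# Job 1 has the smallest key, then job 2, then job 0.
permutation = decode_random_keys([0.7, 0.1, 0.4])  # [1, 2, 0]
```

Because any real-valued vector decodes to a valid permutation, the continuous PSO update can be applied unchanged and feasibility is handled entirely by the decoder.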
18.
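To strengthen the ability of hybrid Petri nets to handle resource sharing and resource conflicts, a new hybrid Petri net model, the resource allocation hybrid Petri net, is defined, and the corresponding enabling and firing rules are proposed. Control actions on continuous and discrete transitions are introduced into the hybrid Petri net, and resource allocation transitions and resource release transitions are added to allocate reusable resources effectively. Taking a typical hybrid production process as an example, the modelling of production processes in hybrid systems is studied. The results show that the defined model has strong descriptive power and correct, reasonable semantics, and can effectively describe and analyse production processes in hybrid systems.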
19.
This paper describes a hybrid model formed by a mixture of various regressive neural network models, such as temporal self-organising maps and support vector regressions, for modelling and prediction of foreign exchange rate time series. A selected set of influential trading indicators, including the moving average convergence/divergence and the relative strength index, is also utilised in the proposed method. A genetic algorithm is applied to fuse all the information from the mixture regression models and the economic indicators. Experimental results and comparisons show that the proposed method outperforms global modelling techniques such as generalised autoregressive conditional heteroscedasticity in terms of profit returns. A virtual trading system is built to examine the performance of the methods under study.
20.
We propose a new technique for the identification of discrete-time hybrid systems in the piecewise affine (PWA) form. This problem can be formulated as the reconstruction of a possibly discontinuous PWA map with a multi-dimensional domain. To achieve our goal, we provide an algorithm that exploits the combined use of clustering, linear identification, and pattern recognition techniques. This makes it possible to identify both the affine submodels and the polyhedral partition of the domain on which each submodel is valid, while avoiding gridding procedures. Moreover, the clustering step (used for classifying the datapoints) is performed in a suitably defined feature space, which also allows the reconstruction of different submodels that share the same coefficients but are defined on different regions. Measures of confidence on the samples are introduced and exploited in order to improve the performance of both the clustering and the final linear regression procedure.