Similar literature
Found 20 similar documents (search time: 15 ms)
1.
To survive in today's telecommunication business, it is imperative to identify customers who are likely to move to a competitor. Customer churn prediction has therefore become an essential issue in the telecommunication business, and in such a competitive market a reliable churn predictor is invaluable. This paper employs data mining classification techniques, including Decision Tree, Artificial Neural Networks, K-Nearest Neighbors, and Support Vector Machine, in order to compare their performance. Using the data of an Iranian mobile company, we not only tested these techniques and compared them with one another, but also compared several prominent data mining software packages. After analyzing the techniques' behavior and identifying their strengths, we proposed a hybrid methodology that considerably improved the values of some of the evaluation metrics. The results of the proposed methodology show that Recall and Precision above 95% are easily achievable. In addition, a new methodology for extracting influential features from a dataset was introduced and tested.

2.
As churn management is a major task for companies seeking to retain valuable customers, the ability to predict customer churn is necessary. In the literature, neural networks have shown their applicability to churn prediction. On the other hand, hybrid data mining techniques that combine two or more techniques have been shown to perform better than many single techniques over a number of different domain problems. This paper considers two hybrid models for churn prediction that combine two different neural network techniques: back-propagation artificial neural networks (ANN) and self-organizing maps (SOM). The hybrid models are ANN combined with ANN (ANN + ANN) and SOM combined with ANN (SOM + ANN). In particular, the first technique of each hybrid model performs the data reduction task by filtering out unrepresentative training data. The outputs, as representative data, are then used to create the prediction model based on the second technique. To evaluate the performance of these models, three different kinds of testing sets are considered: the general testing set and two fuzzy testing sets based on the data filtered out by the first technique of the two hybrid models, i.e. ANN and SOM, respectively. The experimental results show that the two hybrid models outperform the single neural network baseline model in terms of prediction accuracy and Type I and Type II errors over the three kinds of testing sets. In addition, the ANN + ANN hybrid model performs significantly better than the SOM + ANN hybrid model and the ANN baseline model.
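
The two-stage "filter, then train" idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a nearest-centroid classifier stands in for both neural network stages, the rule "drop the training examples that the first stage misclassifies" is an assumed way of identifying unrepresentative data, and all function names are hypothetical.

```python
# Hypothetical sketch of two-stage "filter, then train" learning.
# A nearest-centroid classifier stands in for both ANN stages (an assumption).

def centroid_fit(X, y):
    """Return {label: centroid} computed from feature vectors X and labels y."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def centroid_predict(model, x):
    """Predict the label whose centroid is closest to x (squared Euclidean)."""
    return min(model, key=lambda label: sum((a - b) ** 2 for a, b in zip(model[label], x)))


def hybrid_fit(X, y):
    """Stage 1 filters the training set; stage 2 is trained on what remains."""
    stage1 = centroid_fit(X, y)
    # Assumed filtering rule: keep only examples that stage 1 classifies correctly.
    kept = [(x, label) for x, label in zip(X, y) if centroid_predict(stage1, x) == label]
    stage2 = centroid_fit([x for x, _ in kept], [label for _, label in kept])
    return stage2, len(X) - len(kept)


# Toy data: two clusters, each containing one mislabeled example.
X = [[0, 0], [0, 1], [1, 0], [9, 9], [10, 10], [10, 9], [9, 10], [1, 1]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model, removed = hybrid_fit(X, y)
print(removed, centroid_predict(model, [9, 9]))
```

On this toy data the first stage discards the two mislabeled points, so the second stage is fitted on cleaner clusters. Whether such filtering helps on real churn data depends on how many of the discarded examples are genuine noise rather than hard but valid cases.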

3.
We present a comparative study of the most popular machine learning methods applied to the challenging problem of customer churn prediction in the telecommunications industry. In the first phase of our experiments, all models were applied and evaluated using cross-validation on a popular, public domain dataset. In the second phase, the performance improvement offered by boosting was studied. In order to determine the most efficient parameter combinations, we performed a series of Monte Carlo simulations for each method and for a wide range of parameters. Our results demonstrate the clear superiority of the boosted versions of the models over the plain (non-boosted) versions. The best overall classifier was the SVM-POLY using AdaBoost, with an accuracy of almost 97% and an F-measure over 84%.
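
The gap between an accuracy near 97% and an F-measure near 84% is typical of imbalanced churn data, where the many non-churners dominate accuracy. The sketch below computes the standard metrics from confusion-matrix counts; the counts are hypothetical and chosen only for illustration, they are not taken from the study.

```python
def churn_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)  # share of predicted churners who really churn
    recall = tp / (tp + fn)     # share of real churners who are detected
    f_measure = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f_measure


# Hypothetical counts: 1000 customers, 50 of whom churn.
acc, prec, rec, f1 = churn_metrics(tp=42, fp=8, fn=8, tn=942)
print(f"accuracy={acc:.3f} precision={prec:.2f} recall={rec:.2f} F={f1:.2f}")
# prints: accuracy=0.984 precision=0.84 recall=0.84 F=0.84
```

Because the F-measure ignores true negatives, it is usually the more informative figure when churners are a small minority.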

4.
Customer retention in telecommunication companies is one of the most important issues in customer relationship management, and customer churn prediction is a major instrument in customer retention. Churn prediction aims at identifying potential churning customers. Traditional approaches for determining potential churning customers are based only on customers' personal information, without considering the relationships among customers. However, the subscribers of telecommunication companies are connected with other customers, and network properties among people may affect churn. For this reason, we propose a new churn prediction procedure that examines the communication patterns among subscribers and considers a propagation process in a network, based on call detail records, which transfers churning information from churners to non-churners. A fast and effective propagation process is made possible through community detection and by setting the initial energy of churners (the amount of information transferred) differently according to churn date or centrality. The proposed procedure was evaluated based on the performance of the prediction model trained with a social network feature and traditional personal features.

5.
XML documents have recently become ubiquitous because of their varied applicability in a number of applications. Classification is an important problem in the data mining domain, but current classification methods for XML documents use IR-based methods in which each document is treated as a bag of words. Such techniques ignore a significant amount of information hidden inside the documents. In this paper we discuss the problem of rule-based classification of XML data using frequent discriminatory substructures within XML documents. Such a technique is more capable of finding the classification characteristics of documents. In addition, the technique can be extended to cost-sensitive classification. We show the effectiveness of the method with respect to other classifiers. We note that the methodology discussed in this paper is applicable to any kind of semi-structured data. Editors: Hendrik Blockeel, David Jensen and Stefan Kramer. An erratum to this article is available.

6.
An intelligent hybrid system for wafer lot output time prediction   Cited by: 1 (self-citations: 0, citations by others: 1)
Lot output time prediction is a critical task for a wafer fab (fabrication plant). To further enhance the accuracy of wafer lot output time prediction, an intelligent hybrid system is constructed in this study. Firstly, the concept of input classification is applied to Chen’s fuzzy back propagation network (FBPN) approach by pre-classifying wafer lots with the k-means (kM) classifier before predicting the output times with FBPN. Examples belonging to different categories are then learned with different FBPNs that share the same topology. Secondly, the future release plan of the fab, which is influential but has been ignored in traditional approaches, is also incorporated in the intelligent hybrid system. To evaluate the effectiveness of the proposed methodology, production simulation was applied to generate test examples. According to the experimental results, the prediction accuracy of the intelligent hybrid system was significantly better in most cases than that of six approaches: BPN, case-based reasoning (CBR), FBPN, look-ahead FBPN, evolving fuzzy rules (EFR), and kM-FBPN without look-ahead. It achieved a 17-47% (34% on average) reduction in the root-mean-squared error (RMSE) relative to the comparison baseline, BPN.

7.
We studied the problem of optimizing the performance of a DSS for churn prediction. In particular, we investigated the beneficial effect of adding the voice of customers through call center emails (i.e., textual information) to a churn-prediction system that only uses traditional marketing information. We found that adding unstructured, textual information to a conventional churn-prediction model resulted in a significant increase in predictive performance. From a managerial point of view, this integrated framework helps marketing decision makers to better identify the customers most prone to switch. Consequently, their customer retention campaigns can be targeted more effectively because the prediction method is better at detecting those customers who are likely to leave.

8.
The support vector machine (SVM) is currently state-of-the-art for classification tasks due to its ability to model nonlinearities. However, the main drawback of SVM is that it generates a “black box” model, i.e. it does not reveal the knowledge learnt during training in a human-comprehensible form. The process of converting such opaque models into a transparent model is often referred to as rule extraction. In this paper we propose a hybrid approach for extracting rules from SVM for customer relationship management (CRM) purposes. The proposed hybrid approach consists of three phases. (i) During the first phase, SVM-RFE (SVM-recursive feature elimination) is employed to reduce the feature set. (ii) The dataset with reduced features is then used in the second phase to obtain an SVM model, and the support vectors are extracted. (iii) Rules are then generated using Naive Bayes Tree (NBTree) in the final phase. The dataset analyzed in this study concerns churn prediction for bank credit card customers (Business Intelligence Cup 2004), and it is highly unbalanced, with 93.24% loyal and 6.76% churned customers. Further, we employed various standard balancing approaches to balance the data and extracted rules. The empirical results show that the proposed hybrid outperformed all other techniques tested. Because the reduced-feature dataset is used, the proposed approach also extracts shorter rules, thereby improving the comprehensibility of the system. The generated rules act as an early-warning expert system for the bank management.

9.
In this paper, an effective hybrid algorithm based on the estimation of distribution algorithm (EDA) is proposed to solve the multidimensional knapsack problem (MKP). Within the EDA framework, the probability model is built from the superior population, and new individuals are generated based on the probability model. In addition, an updating mechanism for the probability model is proposed, together with a mechanism for initializing the probability model based on specific knowledge of the MKP, to improve the convergence speed. Meanwhile, an adaptive local search is proposed to enhance the exploitation ability. Furthermore, the influence of the parameters is investigated based on the Taguchi method of design of experiments, and the importance of the repair operator is also studied via simulation testing and comparisons. Finally, numerical simulation is carried out on benchmark instances, and comparisons with some existing algorithms demonstrate the effectiveness of the proposed algorithm.
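
The core EDA loop described in this abstract (sample individuals from a probability model, select the superior population, update the model) can be sketched as below. This is a basic univariate EDA written for illustration, not the paper's hybrid algorithm: it omits the adaptive local search and the knowledge-based initialization, and the repair rule, the parameter values, the function names and the problem instance are all assumptions.

```python
import random


def total(coeffs, x):
    """Dot product of a coefficient row with a 0/1 selection vector."""
    return sum(c * b for c, b in zip(coeffs, x))


def feasible(x, weights, capacities):
    """True if x satisfies every resource constraint."""
    return all(total(row, x) <= cap for row, cap in zip(weights, capacities))


def repair(x, values, weights, capacities):
    """Assumed repair rule: drop the least valuable items until x is feasible."""
    x = list(x)
    for j in sorted(range(len(x)), key=lambda j: values[j]):
        if feasible(x, weights, capacities):
            break
        x[j] = 0
    return x


def eda_mkp(values, weights, capacities, pop=40, elite=10, alpha=0.3, iters=50, seed=0):
    rng = random.Random(seed)
    n = len(values)
    prob = [0.5] * n  # probability model: P(item j is selected)
    best, best_value = [0] * n, 0
    for _ in range(iters):
        # Sample new individuals from the probability model and repair them.
        population = [
            repair([int(rng.random() < prob[j]) for j in range(n)], values, weights, capacities)
            for _ in range(pop)
        ]
        population.sort(key=lambda x: total(values, x), reverse=True)
        superior = population[:elite]
        if total(values, superior[0]) > best_value:
            best, best_value = superior[0], total(values, superior[0])
        # Move the model towards the item frequencies of the superior population.
        for j in range(n):
            freq = sum(x[j] for x in superior) / elite
            prob[j] = (1 - alpha) * prob[j] + alpha * freq
    return best, best_value


# Hypothetical instance: 6 items, 2 resource constraints.
values = [10, 13, 7, 8, 12, 5]
weights = [[3, 4, 2, 3, 4, 1], [2, 3, 3, 1, 4, 2]]
capacities = [9, 8]
best, best_value = eda_mkp(values, weights, capacities)
print(best, best_value)
```

The learning rate `alpha` controls how quickly the model commits to the current superior population; a smaller value keeps more diversity at the cost of slower convergence.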

10.
Remaining useful life prediction is one of the key requirements in prognostics and health management. While a system or component exhibits degradation during its life cycle, there are various methods to predict its future performance and assess the time frame until it no longer performs its desired functionality. The proposed data-driven and model-based hybrid/fusion prognostics framework interfaces a classical Bayesian model-based prognostics approach, namely the particle filter, with two data-driven methods with the aim of improving the prediction accuracy. The first data-driven method establishes the measurement model (inferring the measurements from the internal system state) to account for situations where the internal system state is not accessible through direct measurements. The second data-driven method extrapolates the measurements beyond the range of actually available measurements and feeds them back to the model-based method, which further updates the particles and their weights during the long-term prediction phase. By leveraging the strengths of the data-driven and model-based methods, the proposed fusion prognostics framework can bridge the gap between data-driven prognostics and model-based prognostics when both abundant historical data and knowledge of the physical degradation process are available. The proposed framework was successfully applied to lithium-ion battery remaining useful life prediction and achieved significantly better accuracy than the classical particle filter approach.

11.
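Research on the current status and development of customer churn prediction   Cited by: 5 (self-citations: 1, citations by others: 5)
Based on the historical development of customer churn prediction research and its degree of intelligence, the research is divided into three stages: prediction methods based on traditional statistics, prediction methods based on artificial intelligence, and prediction methods based on statistical learning theory. By analyzing the problems at each stage, directions for future research are proposed.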

12.
Prediction of stock market trends is considered an important task and attracts great attention, as predicting stock prices successfully may lead to attractive profits through proper decisions. Stock market prediction is a major challenge owing to non-stationary, noisy, and chaotic data, which makes it difficult for investors to invest profitably. Several techniques have been devised in the existing literature to predict stock market trends. This work presents a detailed review of 50 research papers on stock market prediction that suggest methodologies such as the Bayesian model, Fuzzy classifier, Artificial Neural Networks (ANN), Support Vector Machine (SVM) classifier, Neural Network (NN), Machine Learning Methods and so on. The papers are classified based on different prediction and clustering techniques. The research gaps and the challenges faced by the existing techniques are listed and elaborated to help researchers improve future work. The works are analyzed in terms of the datasets, software tools, performance evaluation measures, and prediction techniques used, and the performance attained by different techniques. The commonly used techniques for attaining effective stock market prediction are ANN and fuzzy-based techniques. Despite considerable research effort, current stock market prediction techniques still have many limitations. From this survey, it can be concluded that stock market prediction is a very complex task, and different factors should be considered in order to predict the future of the market more accurately and efficiently.

13.
A case-based reasoning system for PCB defect prediction   Cited by: 1 (self-citations: 0, citations by others: 1)
The manufacturing process for a new Printed Circuit Board (PCB) design is often unstable and might generate a number of defects during the complicated production process. Defects reduce the yield rate and increase production costs. Although skilled engineers can predict the possible defect items for a new PCB product, this approach requires strong engineering experience and is time consuming. To overcome this problem, this research applies the case-based reasoning (CBR) methodology to develop a defect prediction system for new PCB products. In the CBR system, each case is represented using the design specifications, defect items and corresponding costs. A vantage-based case indexing mechanism is developed to improve case retrieval efficiency. In addition, a reasoning algorithm that considers the defect cost is proposed to infer the defect items that are of interest to PCB manufacturers. The system performance is analyzed to show the efficiency and accuracy of the proposed system. A practical implementation using a case base provided by a PCB manufacturer is demonstrated.

14.
The present article proposes an advanced methodology for numerically simulating complex noise problems. More precisely, we consider the so-called multi-stage acoustic hybrid approach, whose principle is to couple the sound generation and acoustic propagation stages. Under that approach, we propose an advanced hybrid method whose acoustic propagation stage relies on Computational AeroAcoustics (CAA) techniques. To this end, an innovative weak-coupling technique is first developed, which allows an implicit forcing of the CAA stage with a given source signal coming from an a priori evaluation, whether that evaluation is analytical or computational in nature. Then, thanks to additional innovative solutions, the resulting CAA-based hybrid approach is optimized so that it can be applied to realistic and complex acoustic problems in an easier and safer way. All these innovative features are then validated on an academic test case, before the resulting advanced CAA-based hybrid methodology is applied to two problems of flow-induced noise radiation. This demonstrates the ability of the proposed method to address realistic problems by handling both acoustic generation and propagation phenomena at the same time, despite their intrinsic multiscale character.

15.
An effective hybrid algorithm for university course timetabling   Cited by: 3 (self-citations: 0, citations by others: 3)
The university course timetabling problem is an optimisation problem in which a set of events has to be scheduled in timeslots and located in suitable rooms. Recently, a set of benchmark instances was introduced and used for an ‘International Timetabling Competition’ to which 24 algorithms were submitted by various research groups active in the field of timetabling. We describe and analyse a hybrid metaheuristic algorithm which was developed under the very same rules and deadlines imposed by the competition and outperformed the official winner. It combines various construction heuristics, tabu search, variable neighbourhood descent and simulated annealing. Due to the complexity of developing hybrid metaheuristics, we relied strongly on an experimental methodology for configuring the algorithms as well as for choosing proper parameter settings. In particular, we used racing procedures that allow an automatic or semi-automatic configuration of algorithms with a substantial saving in time. Our successful example shows that the systematic design of hybrid algorithms through an experimental methodology leads to high-performing algorithms for hard combinatorial optimisation problems.

16.
A new relational learning system using novel rule selection strategies   Cited by: 1 (self-citations: 0, citations by others: 1)
Mahmut Uludag, Mehmet R. Tolun. Knowledge, 2006, 19(8): 765-771
This paper describes a new rule induction system, rila, which can extract frequent patterns from multiple connected relations. The system supports two different rule selection strategies, namely the select early and select late strategies. Pruning heuristics are used to control the number of hypotheses generated during the learning process. Experimental results are provided on the mutagenesis and the segmentation data sets. The present rule induction algorithm is also compared with similar relational learning algorithms. The results show that its performance is comparable to theirs.

17.
This paper presents an Ethernet-based hybrid method for predicting random time-delay in a networked control system. First, the db3 wavelet is used to decompose and reconstruct the time-delay sequence, and the approximation component and detail components of the time-delay sequence are figured out. Next, a one-step prediction of the time-delay is obtained through an echo state network (ESN) model and an auto-regressive integrated moving average (ARIMA) model, according to the different characteristics of the approximation component and the detail components. Then, the final predicted value of the time-delay is obtained by summation. Meanwhile, the parameters of the echo state network are optimized by a genetic algorithm. The simulation results indicate that higher accuracy can be achieved through this prediction method.

18.
19.
The behaviours of hybrid dynamic systems (HDS) are determined by combining continuous variables with discrete switching logic. The identification of a HDS aims to find an accurate model of the system’s dynamics based on its past inputs and outputs. In pattern recognition (PR) methods, each mode is represented by a set of similar patterns that form restricted regions in the feature space. These sets of patterns are called classes. A pattern is a vector built from past inputs and outputs. HDS identification is a challenging problem since it involves the estimation of different sets of parameters without knowing in advance which sections of the measured data correspond to the different modes of the system. Therefore, HDS identification can be achieved by combining two steps: clustering and parameter estimation. In the clustering step, the number of discrete modes (i.e., the classes to which the input-output data points belong) is estimated. The parameter estimation step finds the parameters of the models that govern the continuous dynamics in each mode. In this paper, an unsupervised PR method is proposed to achieve the clustering step of the identification of temporally switched linear HDS. The determination of the number of modes does not require prior information about the modes or their number.

20.
A neural network architecture is introduced which implements a supervised clustering algorithm for the classification of feature vectors. The network is self-organising, and is able to adapt to the shape of the underlying pattern distribution as well as detect novel input vectors during training. It is also capable of determining the relative importance of the feature components for classification. The architecture is a hybrid of supervised and unsupervised networks, and combines the strengths of three well-known architectures: learning vector quantisation, backpropagation and adaptive resonance theory. Network performance is compared to that of learning vector quantisation, back-propagation and cascade-correlation. It is found that performance is generally as good as or better than the performance of these other architectures, while training time is considerably shorter. However, the main advantage of the hybrid architecture is its ability to gain insight into the feature pattern space.
Nomenclature
O_j: the output value of the jth unit
I_i: the ith component of the input pattern
W_ij: the weight of the cluster connection between the ith input and the jth unit
B_ij: the weight of the shape connection between the ith input and the jth unit
N: the dimension of the input patterns
v_j: the vigilance parameter of the jth unit
v_init: the initial vigilance parameter value
v_rate: the change in the vigilance parameter value
X_i: the ith direction in an N-dimensional coordinate system
T_k: the classification tag of the kth unit
C: the classification tag of the current input vector
[symbol lost](p): the learning rate at the pth epoch for the cluster weights
p: the current epoch
P: the total number of epochs
E_k: the error associated with the kth unit
[symbol lost]: the constant learning rate for the shape weights
a_j: the age in epochs of the jth unit
