Similar literature
 20 similar documents found (search time: 46 ms)
1.
A large number of AI methods have now been developed in the field of pattern classification. In this paper, we compare the performance of a well-known machine learning algorithm (C4.5) with a recently proposed algorithm from the fuzzy set community (NEFCLASS). We compare the algorithms on both the accuracy attained and the size of the induced rule base. Additionally, we investigate how the selected algorithms perform after the data have been pre-processed by discretization and feature selection.

2.
In this paper, we propose a microcalcification classification scheme, assisted by content-based mammogram retrieval, for breast cancer diagnosis. We recently developed a machine learning approach for mammogram retrieval where the similarity measure between two lesion mammograms was modeled after expert observers. In this work, we investigate how to use retrieved similar cases as references to improve the performance of a numerical classifier. Our rationale is that by adaptively incorporating local proximity information into a classifier, it can help to improve its classification accuracy, thereby leading to an improved “second opinion” to radiologists. Our experimental results on a mammogram database demonstrate that the proposed retrieval-driven approach with an adaptive support vector machine (SVM) could improve the classification performance from 0.78 to 0.82 in terms of the area under the ROC curve.  相似文献   

3.
In this paper, the classification of two binary bioinformatics datasets, leukemia and colon tumor, is further studied using the recently developed neural network-based finite impulse response extreme learning machine (FIR-ELM). A time series analysis of the microarray samples is first performed to determine the filtering properties of the hidden layer of the neural classifier with FIR-ELM for feature identification. The linear separability of the data patterns in the microarray datasets is then studied. To improve the robustness of the neural classifier against noise and errors, a frequency domain gene feature selection algorithm is also proposed. The simulation results show that the FIR-ELM algorithm performs excellently on the classification of bioinformatics data in comparison with many existing classification algorithms.

4.
Classifying online network traffic is becoming critical in network management and security. Recently, new classification methods based on analysis of statistical features of transport layer traffic have been proposed. While these new methods address the limitations of port-based and payload-based traffic classification, the current software-based solutions are not fast enough to deal with the traffic of today’s high-speed networks. In this paper, we propose an online statistical traffic classifier using the C4.5 machine learning algorithm running on the NetFPGA platform. Our NetFPGA classifier is constructed by adding three main modules to the NetFPGA reference switch design: a Netflow module, a feature extractor module, and a C4.5 search tree classifier. The proposed classifier is able to classify the input traffic at the maximum line speed of the NetFPGA platform, i.e. 8 Gbps, without any packet loss. Our method is based on the statistical features of the first few packets of a flow. The flow is classified just a few microseconds after receiving the desired number of packets.

5.

Selecting the right set of features for classification is one of the most important problems in designing a good classifier. Decision tree induction algorithms such as C4.5 have incorporated in their learning phase an automatic feature selection strategy, while some other statistical classification algorithms require the feature subset to be selected in a preprocessing phase. It is well known that correlated and irrelevant features may degrade the performance of the C4.5 algorithm. In our study, we evaluated the influence of feature preselection on the prediction accuracy of C4.5 using a real-world data set. We observed that accuracy of the C4.5 classifier could be improved with an appropriate feature preselection phase for the learning algorithm. Beyond that, the number of features used for classification can be reduced, which is important for image interpretation tasks since feature calculation is a time-consuming process.  相似文献   

6.
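With the development of computer technology, more and more medical image analysis techniques have emerged. Analyzing medical images with data mining methods is currently one of the active research topics. Such methods first extract statistical features from the medical images and then mine further on that basis; they depend heavily on the extracted features and are affected by subjective factors such as experience. For mammograms (breast X-ray images), a new medical image analysis method is adopted that automatically learns features from images and uses the learned features to classify them: the Discriminative Restricted Boltzmann Machine (DRBM). The DRBM is an undirected discriminative model that can automatically learn features from images. Experimental results on a standard mammogram data set show that the classification accuracy of the DRBM on medical images is clearly higher than that of other medical image classification methods based on statistical feature extraction.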

7.
The aim of this research was to compare classifier algorithms including the C4.5 decision tree classifier, the least squares support vector machine (LS-SVM) and the artificial immune recognition system (AIRS) for diagnosing macular and optic nerve diseases from pattern electroretinography signals. The pattern electroretinography signals were obtained by electrophysiological testing devices from 106 subjects with optic nerve and macular diseases. To assess the test performance of the classifier algorithms, the classification accuracy, receiver operating characteristic curves, sensitivity and specificity values, confusion matrix and 10-fold cross-validation were used. The classification results obtained are 85.9%, 100% and 81.82% for the C4.5 decision tree classifier, the LS-SVM classifier and the AIRS classifier, respectively, using 10-fold cross-validation. It is shown that the LS-SVM classifier is a robust and effective classifier system for the determination of macular and optic nerve diseases.

8.
The analysis of logs related to social communities has recently received considerable attention because it sheds light on social concerns by identifying different groups, and hence helps in resolving issues such as predicting terrorist groups. In the customer analysis domain, identifying calling communities can be used to determine a particular customer’s value according to the general behavior pattern of the community that the customer belongs to; this supports effective targeted marketing design, which is important for increasing profitability. In the telecommunication industry, machine learning techniques have been applied to the Call Detail Record (CDR) for predicting customer behavior, for example in churn prediction. In this paper, we identify calling communities and demonstrate how cluster analysis can be used to effectively identify communities using information derived from the CDR data. We use the information extracted from the cluster analysis to identify customer calling patterns. Customer calling patterns are then given to a classification algorithm to generate a classifier model for predicting the calling communities of a customer. We apply different machine learning techniques to build classifier models and compare them in terms of classification accuracy and computational performance. The reported test results demonstrate the applicability and effectiveness of the proposed approach.

9.
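Studies have shown that the extreme learning machine and discriminative dictionary learning algorithms are highly efficient and accurate in image classification. However, each method has its own drawbacks: the extreme learning machine is not robust to noise, and discriminative dictionary learning algorithms are time-consuming during classification. To unify these complementary properties and improve classification performance, this paper proposes a discriminative analysis dictionary learning model that incorporates the extreme learning machine. The model uses an iterative optimization algorithm to learn the optimal discriminative analysis dictionary and extreme learning machine classifier. To verify the effectiveness of the proposed algorithm, classification is performed on face data sets. Experimental results show that, compared with currently popular dictionary learning algorithms and the extreme learning machine, the proposed algorithm achieves better classification results.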

10.
Minimal Learning Machine (MLM) is a recently proposed supervised learning algorithm with performance comparable to most state-of-the-art machine learning methods. In this work, we propose ensemble methods for classification and regression using MLMs. The goal of ensemble strategies is to produce more robust and accurate models when compared to a single classifier or regression model. Despite its successful application, MLM solves a computationally intensive optimization problem as part of its test procedure (out-of-sample data estimation). This becomes even more noticeable in the context of ensemble learning, where multiple models are used. Aiming to provide fast alternatives to the standard MLM, we also propose the Nearest Neighbor Minimal Learning Machine and the Cubic Equation Minimal Learning Machine to cope with classification and single-output regression problems, respectively. The experimental assessment conducted on real-world datasets shows that ensembles of fast MLMs perform comparably to or better than reference machine learning algorithms.

11.
Improving the accuracy of machine learning algorithms is vital in designing high-performance computer-aided diagnosis (CADx) systems. Research has shown that the performance of a base classifier might be enhanced by ensemble classification strategies. In this study, we construct rotation forest (RF) ensemble classifiers of 30 machine learning algorithms and evaluate their classification performance on Parkinson's, diabetes and heart disease datasets from the literature. In the experiments, first, the feature dimension of the three datasets is reduced using the correlation-based feature selection (CFS) algorithm. Second, the classification performance of the 30 machine learning algorithms is calculated for the three datasets. Third, 30 classifier ensembles are constructed based on the RF algorithm to assess the performance of the respective classifiers on the same disease data. All the experiments are carried out with a leave-one-out validation strategy, and the performance of the 60 algorithms is evaluated using three metrics: classification accuracy (ACC), kappa error (KE) and area under the receiver operating characteristic (ROC) curve (AUC). The base classifiers achieved average accuracies of 72.15%, 77.52% and 84.43% for the diabetes, heart and Parkinson's datasets, respectively. The RF classifier ensembles produced average accuracies of 74.47%, 80.49% and 87.13% for the respective diseases. RF, a newly proposed classifier ensemble algorithm, might be used to improve the accuracy of miscellaneous machine learning algorithms in designing advanced CADx systems.

12.
Several pattern classifiers give high classification accuracy, but their storage requirements and processing time are very high. On the other hand, some classifiers need very little storage and processing time, but their classification accuracy is not satisfactory. In either case the overall performance of the classifier is poor. In this paper, we present a technique based on the combination of a minimum distance classifier (MDC), class-dependent principal component analysis (PCA) and linear discriminant analysis (LDA), which gives improved performance compared with other standard techniques when tested on several machine learning corpora.

13.
High accuracy and low overhead are two key features of a well-designed classifier for different classification scenarios. In this paper, we propose an improved classifier using a single-hidden layer feedforward neural network (SLFN) trained with extreme learning machine. The novel classifier first utilizes principal component analysis to reduce the feature dimension and then selects the optimal architecture of the SLFN based on a new localized generalization error model in the principal component space. Experimental and statistical results on the NSL-KDD data set demonstrate that the proposed classifier can achieve a significant performance improvement compared with previous classifiers.  相似文献   

14.
Visual inspection of the surface of components is a main application of machine vision. Visual inspection is applied to identify defects such as scratches, cracks and bubbles, and to measure cutting tool wear and welding quality. A machine learning approach to machine vision helps in automating the design process of machine vision systems. This approach involves image acquisition, preprocessing, feature extraction and classification. Studies show that a library of features and classifiers is available to classify the data. However, only the best combination of them can yield the highest classification accuracy. In this study, images with different known conditions were acquired and preprocessed, and histogram features were extracted. The classification accuracies of the C4.5 classifier algorithm and the Naïve Bayes algorithm were compared, and the results are reported. The study shows that the C4.5 algorithm performs better.

15.
Optic nerve disease is an important and common disease. In this paper, we propose a hybrid diagnostic system based on a discretization (quantization) method and classification algorithms, including the C4.5 decision tree classifier, an artificial neural network (ANN), and the least square support vector machine (LSSVM), to diagnose optic nerve disease from Visual Evoked Potential (VEP) signals with discrete values. The aim of this paper is to investigate the effect of the discretization method on the classification of optic nerve disease. Since the VEP signals are not linearly separable, classifier algorithms may attain only low classification accuracy. To overcome this problem, we use the discretization method for data pre-processing. The proposed method consists of two phases: (i) quantization of the VEP signals using the discretization method, and (ii) diagnosis from the discretized VEP signals using classification algorithms including the C4.5 decision tree classifier, ANN, and LSSVM. The classification accuracies obtained by these hybrid methods (C4.5 decision tree classifier with quantization, ANN with quantization, and LSSVM with quantization) with and without quantization strategy are 84.6-96.92%, 94.20-96.76%, and 73.44-100%, respectively. As these results show, the best model for classifying optic nerve disease from VEP signals is the combination of the LSSVM classifier and the quantization strategy. The results indicate that the proposed method can provide an effective interpretation and point to the feasibility of designing a new intelligent diagnosis assistance system.
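The quantization step described in this abstract can be illustrated with a minimal sketch. The abstract does not say which discretization method was used, so equal-width binning is assumed here purely for illustration; the function name `equal_width_bins` and the sample values are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each real value to a bin index in 0..n_bins-1.

    Assumption: equal-width binning; the abstract does not name
    the discretization method actually used.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls into the first bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Hypothetical signal samples, not VEP data from the paper.
signal = [0.0, 1.0, 2.0, 3.0, 4.0]
print(equal_width_bins(signal, 4))
```

The discretized indices would then be passed to a classifier in place of the raw amplitudes.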

16.
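Research on a C4.5 Bagging algorithm for Chinese text classification   Total citations: 2, self-citations: 0, citations by others: 2
For the Chinese text classification problem, a new Bagging method is proposed. The method uses the C4.5 decision tree algorithm as the weak classifier, obtains multiple training sets by resampling instances, and combines the results by a voting rule to obtain the final classification. Experiments show that the precision, recall, and F1 score of this algorithm are all higher than those of C4.5, kNN, and naive Bayes classifiers, giving it better performance.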
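The resample-then-vote procedure summarized in this abstract can be sketched as follows. This is only an illustration of generic bagging: a one-feature threshold rule (a decision stump) stands in for C4.5, which is not implemented here, and all names (`train_stump`, `bagging_fit`, `bagging_predict`) and the toy data are hypothetical.

```python
import random
from collections import Counter


def train_stump(samples):
    """Fit a one-feature threshold rule (a stand-in for C4.5, not C4.5 itself)."""
    best = None
    n_features = len(samples[0][0])
    for f in range(n_features):
        for x, _ in samples:
            t = x[f]
            left = [y for xs, y in samples if xs[f] <= t]
            right = [y for xs, y in samples if xs[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        # No usable split in this resample: always predict the majority label.
        label = Counter(y for _, y in samples).most_common(1)[0][0]
        return lambda x: label
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab


def bagging_fit(samples, n_models=11, seed=0):
    """Train one weak classifier per bootstrap resample of the training set."""
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        boot = [rng.choice(samples) for _ in samples]  # sample with replacement
        models.append(train_stump(boot))
    return models


def bagging_predict(models, x):
    """Combine the weak classifiers by majority vote."""
    votes = Counter(m(x) for m in models)
    return votes.most_common(1)[0][0]


# Toy one-feature data, not taken from the paper.
data = [([0.1], "a"), ([0.2], "a"), ([0.3], "a"), ([0.4], "a"),
        ([1.1], "b"), ([1.2], "b"), ([1.3], "b"), ([1.4], "b")]
models = bagging_fit(data)
print(bagging_predict(models, [0.0]), bagging_predict(models, [2.0]))
```

An odd number of models is used so that a two-class vote cannot tie.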

17.
Various methods for ensemble selection and classifier combination have been designed to optimize the performance of ensembles of classifiers. However, using a large number of features in the training data can affect the classification performance of machine learning algorithms. The objective of this paper is to present a novel feature elimination (FE) based ensemble learning method, which is an extension to an existing machine learning environment. Standard 12-lead ECG signal recordings are used to diagnose arrhythmia by classifying subjects as normal or abnormal. The advantage of the proposed approach is that it reduces the size of the feature space by using various feature elimination methods. The decisions obtained from these methods are fused into a single data set. The idea behind this work is thus to discover a reduced feature space such that a classifier built on this small data set performs no worse than a classifier built from the original data set. A random subspace based ensemble classifier is used with the PART tree as the base classifier. The proposed approach has been implemented and evaluated on the UCI ECG signal data. The classification performance has been evaluated using measures such as mean absolute error, root mean squared error, relative absolute error, F-measure, classification accuracy, receiver operating characteristics and area under curve. The proposed approach achieved an overall classification accuracy of 91.11 % on the unseen test data set. This work shows that the approach performs well with ensemble sizes of 15 and 20.

18.
Identifying a discriminative feature can effectively improve the performance of aerial scene classification. Deep convolutional neural networks (DCNN) have been widely used in aerial scene classification for their ability to learn discriminative features. The DCNN feature can be made more discriminative by optimizing the training loss function and using transfer learning methods. To enhance the discriminative power of a DCNN feature, the improved loss functions of pretraining models are combined with a softmax loss function and a centre loss function. To further improve performance, in this article, we propose hybrid DCNN features for aerial scene classification. First, we use DCNN models with joint loss functions and transfer learning from pretrained deep DCNN models. Second, the dense DCNN features are extracted, and the discriminative hybrid features are created using linear connection. Finally, an ensemble extreme learning machine (EELM) classifier is adopted for classification due to its general superiority and low computational cost. Experimental results on three public benchmark data sets demonstrate that the hybrid features obtained using the proposed approach and classified by the EELM classifier can result in remarkable performance.

19.
In this paper, we propose a new constructive method, based on cooperative coevolution, for designing automatically the structure of a neural network for classification. Our approach is based on a modular construction of the neural network by means of a cooperative evolutionary process. This process benefits from the advantages of coevolutionary computation as well as the advantages of constructive methods. The proposed methodology can be easily extended to work with almost any kind of classifier. The evaluation of each module that constitutes the network is made using a multiobjective method. So, each new module can be evaluated in a comprehensive way, considering different aspects, such as performance, complexity, or degree of cooperation with the previous modules of the network. In this way, the method has the advantage of considering not only the performance of the networks, but also other features. The method is tested on 40 classification problems from the UCI machine learning repository with very good performance. The method is thoroughly compared with two other constructive methods, cascade correlation and GMDH networks, and other classification methods, namely, SVM, C4.5, and k nearest-neighbours, and an ensemble of neural networks constructed using four different methods.

20.
Static security analysis is an important study carried out in the control centers of electric utilities. Static security assessment (SSA) is the process of determining whether the current operational state is secure or in an emergency (insecure) state. The conventional method of security evaluation involves performing continuous load flow analysis, which is highly time consuming and infeasible for real-time applications. This led to the application of the pattern recognition (PR) approach to static security analysis. This paper presents a more efficient design of a PR system suitable for on-line SSA. The feature selection stage in the PR system uses many algorithms to select the optimal feature set. This paper proposes the use of the Support Vector Machine (SVM), a recently introduced machine learning tool, in the classifier design stage of the PR system. The developed PR system is implemented in IEEE standard test systems for SSA and classification. The performance of the SVM classifier is compared with the conventional K-nearest neighbor, method of least squares and neural network classifiers. Simulation results show that the SVM-PR classifier outperforms other equivalent classifier algorithms, giving high classification accuracy and a lower misclassification rate. The feasibility of the SVM-PR classifier for the on-line security assessment process is also presented.
