Similar Documents
20 similar documents retrieved (search time: 281 ms)
1.
Anomaly Detection Based on PCA and an Improved Nearest-Neighbor Rule   Cited by: 1 (self-citations: 0, citations by others: 1)
A novel anomaly detection method based on feature extraction is proposed. The preprocessed data are first standardized, principal component analysis (PCA) is then applied to extract intrusion features, and finally an improved nearest-neighbor classifier, the centroid-based nearest-neighbor classifier (CNN), is used to detect intrusions. On the KDD Cup'99 dataset, PCA-CNN is compared with PCA-NN, PCA-SVM, and standard SVM. The results show that the feature extraction step effectively reduces the dimensionality of the input data without degrading classifier performance, and that among the compared methods the combination of PCA and CNN yields the best intrusion detection performance.
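For illustration, a minimal sketch of this pipeline (standardization, PCA feature extraction, then a centroid-based nearest-neighbor classifier) using scikit-learn; the random matrix stands in for the preprocessed KDD Cup'99 records, and all dimensions and labels are hypothetical:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Hypothetical stand-in for the preprocessed KDD Cup'99 connection records.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 41))        # 41 numeric features after encoding
y = rng.integers(0, 2, size=2000)      # 0 = normal, 1 = intrusion

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Standardize -> PCA feature extraction -> centroid-based nearest-neighbor classifier.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=10),   # reduced dimensionality; tune on validation data
    NearestCentroid(),      # classifies by distance to each class centroid
)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test)))
```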

2.
Face recognition is limited by factors such as pose and illumination variation. By introducing a multi-channel Gaborface representation combined with a subspace-based two-dimensional bidirectional linear dimensionality reduction algorithm, a feature extraction algorithm combining an optimized multi-channel Gaborface with 2D linear dimensionality reduction is proposed. First, the multi-channel Gaborface representation (MGFR) model is used to preprocess the sample set: Gabor feature representations of the face are extracted for the different channels and an optimized channel fusion scheme combines them into a new feature. Class information between samples is then introduced to obtain an improved linear 2D bidirectional feature reduction algorithm, which performs dimensionality reduction and feature extraction on the resulting face representation. Finally, a nearest-neighbor classifier produces the classification result. Comparative experiments on the AR, ORL, and YALE face databases show that the improved algorithm is robust to variations such as face pose and achieves better recognition performance than the other algorithms.
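As a rough sketch of the multi-channel Gabor representation step only (the optimized channel fusion and the improved 2D bidirectional reduction described above are not shown), assuming scikit-image is available; the test image and the filter-bank parameters are placeholders:

```python
import numpy as np
from skimage.filters import gabor
from skimage.data import camera   # stand-in image; a cropped face would be used in practice

def multichannel_gabor_features(image, frequencies=(0.1, 0.2, 0.3), n_orientations=8):
    """Stack Gabor magnitude responses over several frequency/orientation channels."""
    channels = []
    for f in frequencies:
        for k in range(n_orientations):
            theta = k * np.pi / n_orientations
            real, imag = gabor(image, frequency=f, theta=theta)
            channels.append(np.sqrt(real ** 2 + imag ** 2))   # magnitude response
    return np.stack(channels)   # (n_channels, H, W); each channel would then be reduced

features = multichannel_gabor_features(camera()[::4, ::4])    # downsampled for speed
print(features.shape)
```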

3.
In this paper, two new methods to segment infrared images of the finger in order to perform finger vein pattern extraction are presented. The first uses the widely known K nearest neighbor (KNN) classifier, a very effective supervised method for partitioning data sets. The second uses a novel clustering algorithm named the nearest neighbor clustering algorithm (NNCA), which is unsupervised and has recently been proposed for retinal vessel segmentation. In both cases, two features serve as the feature vector for classification: the multidirectional response of a matched filter and the minimum eigenvalue of the Hessian matrix. The multidirectional filter response is essential for robust classification because it distinguishes vein-like from edge-like structures, which Hessian-based approaches cannot do. As the experimental results show, both algorithms perform well, with NNCA having the advantage of being unsupervised and therefore usable for fully automatic finger vein pattern extraction. It is also worth noting that the proposed vector, composed of only two features, is the simplest feature set proposed in the literature to date, yet it yields performance comparable with approaches that use much larger feature vectors (31 features). NNCA was also evaluated quantitatively on a database of artificial finger images and achieved segmentation rates of 0.88 sensitivity, 0.80 specificity, and 0.82 accuracy.

4.

Presently, while automated depression diagnosis has made great progress, most recent works have focused on combining multiple modalities rather than strengthening a single one. In this research work, we present a unimodal framework for depression detection based on facial expressions and facial motion analysis. We investigate a wide set of visual features extracted from different facial regions. Due to the high dimensionality of the obtained feature sets, identification of informative and discriminative features is a challenge. This paper suggests a hybrid dimensionality reduction approach which leverages the advantages of the filter and wrapper methods. First, we use a univariate filter method, the Fisher Discriminant Ratio, to initially reduce the size of each feature set. Subsequently, we propose an Incremental Linear Discriminant Analysis (ILDA) approach to find an optimal combination of complementary and relevant feature sets. We compare the performance of the proposed ILDA with batch-mode LDA and also the Composite Kernel based Support Vector Machine (CKSVM) method. The experiments conducted on the Distress Analysis Interview Corpus Wizard-of-Oz (DAIC-WOZ) dataset demonstrate that the best depression classification performance is obtained by using different feature extraction methods in combination rather than individually. ILDA generates better depression classification results than CKSVM. Moreover, ILDA-based wrapper feature selection incurs lower computational cost than the CKSVM and batch-mode LDA methods. The proposed framework significantly improves depression classification performance, with an F1 score of 0.805, which is better than all video-based depression detection models reported in the literature for the DAIC-WOZ dataset. Salient facial regions and well-performing visual feature extraction methods are also identified.
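A small sketch of the univariate filter step (Fisher Discriminant Ratio) used to shrink each feature set before the ILDA wrapper; the feature matrix and the number of retained features are hypothetical:

```python
import numpy as np

def fisher_discriminant_ratio(X, y):
    """Per-feature FDR for a binary problem: (mu0 - mu1)^2 / (var0 + var1)."""
    X0, X1 = X[y == 0], X[y == 1]
    num = (X0.mean(axis=0) - X1.mean(axis=0)) ** 2
    den = X0.var(axis=0) + X1.var(axis=0) + 1e-12   # guard against zero variance
    return num / den

# Hypothetical visual-feature matrix (n_samples x n_features) with binary depression labels.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 500))
y = rng.integers(0, 2, size=200)

scores = fisher_discriminant_ratio(X, y)
keep = np.argsort(scores)[::-1][:50]    # retain the 50 highest-scoring features
X_reduced = X[:, keep]
print(X_reduced.shape)
```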


5.
As network intrusions become more diverse and intelligent, traditional intrusion detection algorithms suffer from insufficient feature extraction and imprecise classification when facing massive, high-dimensional, nonlinear data. To address this, an intrusion detection algorithm combining convolutional neural networks (CNN) and three-way decision (TWD) is proposed. Convolutional neural networks have excellent feature extraction capability, while three-way decision avoids the risk of blind classification under insufficient information and reduces the time spent on classification. The method uses a convolutional neural network to extract features from the high-dimensional data and construct a multi-granularity feature space, then makes immediate decisions on network behaviors based on three-way decision theory; behaviors that cannot be decided immediately are deferred, i.e., features are extracted again for those behaviors to construct a feature space at a different granularity, and the classification result is finally output. Experimental results of the resulting model on the NSL-KDD and CIC-IDS2017 datasets show that the proposed algorithm improves the performance of the intrusion detection system.
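A minimal sketch of the three-way decision step applied to the classifier's attack probabilities; the CNN feature extractor, the NSL-KDD/CIC-IDS2017 preprocessing, and the thresholds alpha/beta are all assumptions:

```python
import numpy as np

def three_way_decision(prob_attack, alpha=0.8, beta=0.2):
    """Return 'attack', 'normal', or 'defer' for each sample from P(attack)."""
    decisions = np.full(prob_attack.shape, "defer", dtype=object)
    decisions[prob_attack >= alpha] = "attack"   # positive region: accept
    decisions[prob_attack <= beta] = "normal"    # negative region: reject
    return decisions                             # boundary region: delay the decision

# Hypothetical softmax outputs from the first-stage CNN.
probs = np.array([0.95, 0.10, 0.55, 0.83, 0.40])
print(three_way_decision(probs))
# Samples labeled 'defer' would be re-represented at a different feature granularity
# and classified again, as described above.
```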

6.
Power data security has become especially critical as power information networks connect to the Internet, and the data keep growing in volume and complexity. To enable effective security analysis and feature extraction, a feature-extraction-based SQL injection attack detection model is proposed. Syntactic features and behavioral features of SQL injection are extracted from Web access logs to obtain a syntax feature matrix dataset and a behavior feature matrix dataset. Using the miss rate and false alarm rate as evaluation metrics, the K-means, Naive Bayes, SVM, and RF algorithms are tested on the two datasets. The experimental results show that the behavior feature matrix yields better SQL injection detection than the syntax feature matrix. In addition, SVM and RF perform best, with low miss and false alarm rates, and the method can effectively detect SQL injection attacks.
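A hedged sketch of the evaluation described above, training an SVM and a random forest on a synthetic, hypothetical behavior feature matrix and reporting the miss rate and false alarm rate; the real features would come from Web access logs:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix

# Hypothetical behavior feature matrix (rows: requests, columns: behavioral statistics);
# label 1 = SQL injection, 0 = benign.
rng = np.random.default_rng(0)
X = rng.normal(size=(3000, 12))
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(scale=0.5, size=3000) > 1.5).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

for name, clf in [("SVM", SVC()), ("RF", RandomForestClassifier(random_state=0))]:
    clf.fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(y_te, clf.predict(X_te)).ravel()
    miss_rate = fn / (fn + tp)       # attacks classified as benign
    false_alarm = fp / (fp + tn)     # benign requests flagged as attacks
    print(f"{name}: miss rate={miss_rate:.3f}, false alarm rate={false_alarm:.3f}")
```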

7.
Traditional aircraft detection algorithms have weak feature learning ability and low detection accuracy on remote sensing images with complex backgrounds, dense targets, and poor imaging quality. To address this, an optimized aircraft detection algorithm for remote sensing images based on the Faster-RCNN (Faster Regions with Convolutional Neural Network) framework is proposed. With ResNet50 as the backbone feature extraction network, dilated residual blocks are introduced for multi-layer feature fusion, building a new feature extraction network that improves the algorithm's feature extraction capability. Cross-validation training is first used on the UCAS-AOD dataset to verify the model's stability across different training and test splits and to compare the detection performance of different algorithms; aircraft detection comparison experiments are then carried out on the NWPU VHR-10 dataset to verify the model's generalization. The experimental results show an average precision of 97.1% for the optimized algorithm on UCAS-AOD and 96.2% on NWPU VHR-10. The optimized algorithm improves aircraft detection accuracy in remote sensing images with stronger generalization, and provides a useful reference for fast aircraft detection in remote sensing imagery.

8.
In the real world, available training data are usually scarce and quickly become outdated, so large amounts of new data must be continually collected and labeled. To address this problem, a transfer learning classification method based on the SAMME and TrAdaBoost algorithms is proposed. Its core idea is to select useful samples from an old video-stream dataset to help the model recognize samples from a new, unseen video-stream set, where the feature distributions of the old and new datasets differ. The method also incorporates the SAMME algorithm to extend TrAdaBoost from binary to multi-class classification. Experimental results show that, compared with existing methods, the method achieves finer-grained classification of six types of video streams and reduces the waste of large amounts of already-labeled old data.

9.
A Fast Nearest-Neighbor Prototype Selection Algorithm Considering Local Means and Class-Global Information   Cited by: 1 (self-citations: 0, citations by others: 1)
Li Juan, Wang Yuping. Acta Automatica Sinica, 2014, 40(6): 1116-1125
The condensed nearest neighbor rule is a simple non-parametric prototype selection algorithm whose prototype selection is easily disturbed by the sample reading order, outliers, and similar factors. To overcome these problems, a nearest-neighbor prototype selection method based on local means and class-global information is proposed. During prototype selection, the method makes full use of the local means of the k same-class and different-class nearest neighbors of each sample to be learned within the prototype set, together with class-global information, and it defines an update strategy that dynamically updates the prototype set. The method not only largely overcomes the influence of reading order and outliers on prototype selection and reduces the size of the prototype set, but also achieves a high compression ratio on the data set while maintaining high classification accuracy. Experimental results on image recognition tasks and UCI (University of California Irvine) benchmark data sets show that the proposed algorithm achieves more effective classification performance than the compared algorithms.

10.
To effectively improve the classification accuracy of hyperspectral image data and reduce the dependence on large data sets, an improved prototype-space feature extraction scheme based on the weighted fuzzy C-means algorithm is proposed, building on the prototype-space feature extraction method. The scheme applies a different weight to each feature through the weighted fuzzy C-means algorithm, ensuring that the extracted features carry a high amount of useful information, so that the training set can be reduced without lowering the amount of information needed for classification. Experimental results show that, compared with the widely accepted prototype-space extraction algorithm, the scheme retains fairly stable performance on relatively small data sets and achieves relatively high classification accuracy.

11.
Local feature weighting in nearest prototype classification   Cited by: 1 (self-citations: 0, citations by others: 1)
The distance metric is the cornerstone of nearest neighbor (NN)-based methods and, therefore, of nearest prototype (NP) algorithms, because they classify according to the similarity of the data. When the data are characterized by a set of features that may contribute to the classification task at different levels, feature weighting or selection is required, sometimes in a local sense. However, local weighting is typically restricted to NN approaches. In this paper, we introduce local feature weighting (LFW) in NP classification. LFW provides each prototype with its own weight vector, in contrast to the typical global weighting methods found in the NP literature, where all prototypes share the same one. Giving each prototype its own weight vector has a novel effect on the borders of the generated Voronoi regions: they become nonlinear. We have integrated LFW with a previously developed evolutionary nearest prototype classifier (ENPC). The experiments performed on both artificial and real data sets demonstrate that the resulting algorithm, which we call LFW in nearest prototype classification (LFW-NPC), avoids overfitting on training data in domains where the features may contribute differently to the classification task in different areas of the feature space. This generalization capability is also reflected in automatically obtaining an accurate and reduced set of prototypes.
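A compact sketch of the prediction rule with a separate weight vector per prototype (the evolutionary learning of the prototypes and weights in ENPC is not shown; all arrays below are toy values):

```python
import numpy as np

def predict_lfw_np(X, prototypes, weights, labels):
    """Nearest-prototype prediction where each prototype has its own feature weights.

    X          : (n_samples, n_features)
    prototypes : (n_prototypes, n_features)
    weights    : (n_prototypes, n_features) non-negative per-prototype feature weights
    labels     : (n_prototypes,) class label of each prototype
    """
    diff = X[:, None, :] - prototypes[None, :, :]
    dist = np.einsum("spf,pf->sp", diff ** 2, weights)   # weighted squared distances
    return labels[np.argmin(dist, axis=1)]

# Two prototypes of different classes, each weighting the features differently,
# which is what makes the induced decision boundary nonlinear.
prototypes = np.array([[0.0, 0.0], [2.0, 2.0]])
weights = np.array([[1.0, 0.1], [0.1, 1.0]])
labels = np.array([0, 1])
X = np.array([[0.5, 1.8], [1.9, 0.3]])
print(predict_lfw_np(X, prototypes, weights, labels))
```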

12.
Feature extraction is a key step in software defect prediction, and its quality determines the performance of the defect prediction model, yet traditional feature extraction methods struggle to capture the deep, essential features of software defect data. Autoencoders from deep learning can automatically learn features from raw data and obtain feature representations. To further strengthen the robustness of the autoencoder, this paper proposes a feature extraction method based on stacked denoising sparse autoencoders; by configuring different numbers of hidden layers, sparsity constraints, and noise injection schemes, the hierarchical feature representations needed for classification and prediction can be extracted directly and efficiently from software defect data. Experimental results on the Eclipse defect dataset show that the method outperforms traditional feature extraction methods.
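A short Keras sketch of one possible stacked denoising sparse autoencoder: Gaussian input noise for denoising, an L1 activity penalty for sparsity, and greedy layer-wise stacking; the defect-metric matrix, layer sizes, and hyperparameters are assumptions, not the paper's configuration:

```python
import numpy as np
from tensorflow.keras import layers, regularizers, Model

def build_denoising_sparse_ae(input_dim, hidden_dim, noise_std=0.1, l1=1e-5):
    inputs = layers.Input(shape=(input_dim,))
    noisy = layers.GaussianNoise(noise_std)(inputs)               # corrupt the input
    encoded = layers.Dense(hidden_dim, activation="relu",
                           activity_regularizer=regularizers.l1(l1))(noisy)  # sparsity
    decoded = layers.Dense(input_dim, activation="linear")(encoded)
    autoencoder = Model(inputs, decoded)
    encoder = Model(inputs, encoded)
    autoencoder.compile(optimizer="adam", loss="mse")
    return autoencoder, encoder

# Hypothetical defect data (rows: software modules, columns: static code metrics).
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 20)).astype("float32")

# Greedy layer-wise stacking: each layer is trained on the previous layer's encoding.
features = X
for hidden in (16, 8):
    ae, enc = build_denoising_sparse_ae(features.shape[1], hidden)
    ae.fit(features, features, epochs=5, batch_size=32, verbose=0)
    features = enc.predict(features, verbose=0)
print(features.shape)   # deep representation fed to a downstream defect classifier
```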

13.
K nearest neighbor and Bayesian methods are effective machine learning methods, and expectation maximization is an effective Bayesian classifier. In this work, a data elimination approach is proposed to improve data clustering. The proposed method hybridizes the k nearest neighbor and expectation maximization algorithms: the k nearest neighbor algorithm acts as a preprocessor for the expectation maximization algorithm, eliminating training data that make learning difficult. The suggested method is tested on the well-known machine learning data sets iris, wine, breast cancer, glass, and yeast. Simulations are carried out in the MATLAB environment and performance results are reported.
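A minimal sketch of the hybridization on the iris data set using scikit-learn in place of MATLAB: k-NN is used to eliminate ambiguous training points before fitting an EM-based Gaussian mixture; the elimination criterion (disagreement with the cross-validated k-NN prediction) is an assumption:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# Step 1 (k-NN preprocessing): drop samples whose cross-validated k-NN prediction
# disagrees with their label, i.e., likely noisy or borderline points.
knn_pred = cross_val_predict(KNeighborsClassifier(n_neighbors=5), X, y, cv=5)
keep = knn_pred == y
X_clean, y_clean = X[keep], y[keep]

# Step 2 (EM): fit a Gaussian mixture on the reduced data.
gmm = GaussianMixture(n_components=3, random_state=0).fit(X_clean)

# Map each mixture component to the majority class of its members.
comp = gmm.predict(X_clean)
mapping = {c: np.bincount(y_clean[comp == c], minlength=3).argmax() for c in range(3)}
y_hat = np.array([mapping[c] for c in gmm.predict(X)])
print("accuracy:", accuracy_score(y, y_hat))
```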

14.
In this article, we propose a feature extraction method based on median–mean and feature line embedding (MMFLE) for the classification of hyperspectral images. In MMFLE, we maximize the class separability using discriminant analysis. Moreover, we remove the negative effect of outliers on the class mean using the median–mean line (MML) measurement and virtually enlarge the training set using the feature line (FL) distance metric. The experimental results on Indian Pines and University of Pavia data sets show the better performance of MMFLE compared to nearest feature line embedding (NFLE), median–mean line discriminant analysis (MMLDA), and some other feature extraction approaches in terms of classification accuracy using a small training set.

15.
The traditional CCA and 2D-CCA algorithms are unsupervised multiple feature extraction methods. Hence, introducing the supervised information of samples into these methods should be able to promote the classification performance. In this paper, a novel method is proposed to carry out multiple feature extraction for classification, called two-dimensional supervised canonical correlation analysis (2D-SCCA), in which the supervised information is added to the criterion function. Then, by analyzing the relationship between GCCA and 2D-SCCA, another feature extraction method called multiple-rank supervised canonical correlation analysis (MSCCA) is also developed. Different from 2D-SCCA, in MSCCA k pairs of left transforms and k pairs of right transforms are sought to maximize the correlation. The convergence behavior and computational complexity of the algorithms are analyzed. Experimental results on real-world databases demonstrate the viability of the formulation; they also show that the classification results of our methods are higher than those of the other methods and that the computing time is competitive. The proposed methods thus prove to be competitive multiple feature extraction and classification methods. As such, the two methods may well help to improve image recognition tasks, which are essential in many advanced expert and intelligent systems.

16.
Despite the significant progress of automatic speech recognition (ASR) in the past three decades, it has not reached the level of human performance, particularly in adverse conditions. To improve the performance of ASR, various approaches have been studied, which differ in feature extraction method, classification method, and training algorithms. Different approaches often utilize complementary information; therefore, using them in combination can be a better option. In this paper, we propose a novel approach that uses the best characteristics of conventional, hybrid, and segmental HMMs by integrating them with the help of the ROVER system combination technique. In the proposed framework, three different recognizers are created and combined, each having its own feature set and classification technique. For the design and development of the complete system, three separate acoustic models are used with three different feature sets and two language models. Experimental results show that the word error rate (WER) can be reduced by about 4% using the proposed technique as compared to conventional methods. Various modules are implemented and tested for Hindi language ASR, in typical field conditions as well as in noisy environments.

17.
Two semi-supervised feature extraction methods are proposed for electroencephalogram (EEG) classification. They aim to alleviate two important limitations in brain–computer interfaces (BCIs). The first is the requirement of small training sets owing to the need for short calibration sessions. The second is the time-varying property of the signals, e.g., EEG signals recorded in the training and test sessions often exhibit different discriminant features. These limitations are common in current practical applications of BCI systems and often degrade the performance of traditional feature extraction algorithms. In this paper, we propose two strategies to obtain semi-supervised feature extractors by improving a previous feature extraction method, extreme energy ratio (EER). The two methods are termed semi-supervised temporally smooth EER and semi-supervised importance-weighted EER, respectively. The former constructs a regularization term on the preservation of the temporal manifold of test samples and adds this as a constraint to the learning of spatial filters. The latter defines two kinds of weights by exploiting the distribution information of test samples and assigns the weights to training data points and trials to improve the estimation of covariance matrices. Both methods regularize the spatial filters to make them more robust and adaptive to the test sessions. Experimental results on data sets from nine subjects, with comparisons to the previous EER, demonstrate their better capability for classification.

18.
Feature extraction is an important step before actual learning. Although many feature extraction methods have been proposed for clustering, classification, and regression, very limited work has been done on multi-class classification problems. This paper proposes a novel feature extraction method, called orientation distance–based discriminative (ODD) feature extraction, particularly designed for multi-class classification problems. Our proposed method works in two steps. In the first step, we extend the Fisher Discriminant idea to determine an appropriate kernel function and map the input data with all classes into a feature space where the classes of the data are well separated. In the second step, we put forward two variants of ODD features, i.e., one-vs-all-based ODD and one-vs-one-based ODD features. We first construct hyperplanes (SVMs) based on the one-vs-all or one-vs-one scheme in the feature space; we then extract one-vs-all-based or one-vs-one-based ODD features between a sample and each hyperplane. These newly extracted ODD features are treated as the representative features and are thereafter used in the subsequent classification phase. Extensive experiments have been conducted to investigate the performance of one-vs-all-based and one-vs-one-based ODD features for multi-class classification. The statistical results show that the classification accuracy based on ODD features outperforms that of the state-of-the-art feature extraction methods.
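A rough sketch of the one-vs-all variant on a standard data set: one kernel SVM hyperplane per class, with the signed distances to the hyperplanes used as the new ODD-style features for a downstream classifier; the data set, kernel, and downstream k-NN are assumptions:

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_wine(return_X_y=True)
X = StandardScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# One hyperplane per class in the kernel-induced feature space (one-vs-all scheme).
ovr = OneVsRestClassifier(SVC(kernel="rbf", gamma="scale")).fit(X_tr, y_tr)

# ODD-style features: distance of each sample to every class hyperplane.
F_tr = ovr.decision_function(X_tr)
F_te = ovr.decision_function(X_te)

# The extracted features feed a subsequent classifier (here a simple k-NN).
clf = KNeighborsClassifier(n_neighbors=3).fit(F_tr, y_tr)
print("accuracy on ODD-style features:", accuracy_score(y_te, clf.predict(F_te)))
```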

19.
Feature selection and feature weighting are useful techniques for improving the classification accuracy of the K-nearest-neighbor (K-NN) rule. The term feature selection refers to algorithms that select the best subset of the input feature set. In feature weighting, each feature is multiplied by a weight value proportional to the ability of the feature to distinguish pattern classes. In this paper, a novel hybrid approach is proposed for simultaneous feature selection and feature weighting of the K-NN rule based on the Tabu Search (TS) heuristic. The proposed TS heuristic in combination with the K-NN classifier is compared with several classifiers on various available data sets. The results indicate a significant improvement in classification accuracy. The proposed TS heuristic is also compared with various feature selection algorithms. The experiments performed reveal that the proposed hybrid TS heuristic is superior to both simple TS and sequential search algorithms. We also present results for the classification of prostate cancer using multispectral images, an important problem in biomedicine.
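A simplified, selection-only sketch of the tabu search loop wrapped around a k-NN classifier (the simultaneous feature weighting, aspiration criteria, and the paper's parameter settings are omitted); the breast cancer data set stands in for the data sets used above:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)
n_features = X.shape[1]

def evaluate(mask):
    """Cross-validated k-NN accuracy on the selected feature subset."""
    if not mask.any():
        return 0.0
    return cross_val_score(KNeighborsClassifier(n_neighbors=5), X[:, mask], y, cv=3).mean()

rng = np.random.default_rng(0)
current = rng.random(n_features) < 0.5            # random initial feature subset
best, best_score = current.copy(), evaluate(current)
tabu = []                                         # indices of recently flipped features
for _ in range(20):                               # search iterations
    # Neighborhood: flip each non-tabu feature bit and keep the best neighbor.
    candidates = []
    for j in range(n_features):
        if j in tabu:
            continue
        neighbor = current.copy()
        neighbor[j] = ~neighbor[j]
        candidates.append((evaluate(neighbor), j, neighbor))
    score, j, current = max(candidates, key=lambda c: c[0])
    tabu.append(j)
    if len(tabu) > 7:                             # tabu tenure
        tabu.pop(0)
    if score > best_score:
        best, best_score = current.copy(), score
print("selected features:", np.flatnonzero(best), "cv accuracy:", round(best_score, 3))
```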

20.
Several studies have demonstrated the superior performance of ensemble classification algorithms, whereby multiple member classifiers are combined into one aggregated and powerful classification model, over single models. In this paper, two rotation-based ensemble classifiers are proposed as modeling techniques for customer churn prediction. In Rotation Forests, feature extraction is applied to feature subsets in order to rotate the input data for training base classifiers, while RotBoost combines Rotation Forest with AdaBoost. In an experimental validation based on data sets from four real-life customer churn prediction projects, Rotation Forest and RotBoost are compared to a set of well-known benchmark classifiers. Moreover, variations of Rotation Forest and RotBoost are compared, implementing three alternative feature extraction algorithms: principal component analysis (PCA), independent component analysis (ICA) and sparse random projections (SRP). The performance of rotation-based ensemble classifiers is found to depend upon: (i) the performance criterion used to measure classification performance, and (ii) the implemented feature extraction algorithm. In terms of accuracy, RotBoost outperforms Rotation Forest, but none of the considered variations offers a clear advantage over the benchmark algorithms. However, in terms of AUC and top-decile lift, results clearly demonstrate the competitive performance of Rotation Forests compared to the benchmark algorithms. Moreover, ICA-based Rotation Forests outperform all other considered classifiers and are therefore recommended as a well-suited alternative classification technique for the prediction of customer churn that allows for improved marketing decision making.
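A condensed sketch of the Rotation Forest idea (PCA on random disjoint feature subsets builds a rotation matrix per tree); the published algorithm also bootstraps class subsets before PCA, which is omitted here, and synthetic data stand in for the churn data sets:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

def fit_rotation_forest(X, y, n_trees=10, n_subsets=4, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    ensemble = []
    for _ in range(n_trees):
        # Split the feature indices into random disjoint subsets and run PCA on each.
        subsets = np.array_split(rng.permutation(n_features), n_subsets)
        rotation = np.zeros((n_features, n_features))
        for idx in subsets:
            pca = PCA().fit(X[:, idx])
            rotation[np.ix_(idx, idx)] = pca.components_.T   # block of PCA loadings
        tree = DecisionTreeClassifier(random_state=0).fit(X @ rotation, y)
        ensemble.append((rotation, tree))
    return ensemble

def predict_proba_rotation_forest(ensemble, X):
    # Average the churn probability over the rotated base trees.
    return np.mean([tree.predict_proba(X @ rot)[:, 1] for rot, tree in ensemble], axis=0)

# Synthetic stand-in for a churn data set (imbalanced binary target).
X, y = make_classification(n_samples=2000, n_features=20, weights=[0.9], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
model = fit_rotation_forest(X_tr, y_tr)
print("AUC:", roc_auc_score(y_te, predict_proba_rotation_forest(model, X_te)))
```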
