Similar literature
Found 20 similar documents (search time: 62 ms)
1.
Li ZHANG, Cong WANG. Journal on Communications (《通信学报》), 2018, 39(5): 111-122
Feature selection has played an important role in machine learning and artificial intelligence in the past decades. Many existing feature selection algorithms select redundant and irrelevant features, which leads to overestimation of some features. Moreover, a larger number of features significantly slows down machine learning and leads to classification over-fitting. Therefore, a new nonlinear feature selection algorithm based on forward search was proposed. The algorithm used mutual information theory to find the optimal subset associated with multi-task labels and reduced the computational complexity. Experiments on nine UCI datasets with four different classifiers show that the feature subsets selected by the proposed algorithm outperform both the original feature set and the subsets selected by other feature selection algorithms.
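
As a rough illustration of forward search driven by mutual information, the following minimal sketch may help. It is not the algorithm of the paper above: it assumes discrete features, a single label vector, and a simple "relevance minus mean redundancy" score, and the names `mutual_information` and `forward_select` as well as the toy data are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def forward_select(columns, labels, k):
    """Greedy forward search: at each step add the feature with the highest
    relevance to the labels minus its mean redundancy with chosen features."""
    selected = []
    remaining = list(range(len(columns)))

    def score(j):
        relevance = mutual_information(columns[j], labels)
        if not selected:
            return relevance
        redundancy = sum(
            mutual_information(columns[j], columns[s]) for s in selected
        ) / len(selected)
        return relevance - redundancy

    while remaining and len(selected) < k:
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (assumed for illustration): f1 duplicates f0, f2 is weakly informative.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
f0 = [0, 0, 0, 1, 1, 1, 1, 1]
f1 = list(f0)
f2 = [0, 0, 1, 0, 0, 1, 1, 1]
print(forward_select([f0, f1, f2], labels, 2))  # prints [0, 2]
```

In this toy run the duplicate feature `f1` is skipped in favor of the weaker but non-redundant `f2`, which is the kind of redundancy handling the abstract alludes to.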

2.
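Feature selection for radar emitter signals based on the resemblance coefficient   (Total citations: 10, self-citations: 0, citations by others: 10)
A new feature selection method based on the resemblance coefficient (RC) is proposed. The definition of RC and an RC-based class separability criterion are given, a feature selection algorithm for radar emitter signals based on RC and a quantum genetic algorithm is described, and a neural network classifier is designed. The method is compared, in feature selection and classification experiments, with sequential forward selection based on a distance criterion (SFSDC) and with Lü Tiejun's method (GADC). The results show that the proposed method does not require the dimension of the optimal feature subset to be specified in advance and selects the best feature subset reliably and effectively. It greatly reduces the dimension of the feature vector and simplifies classifier design, and it also achieves a higher correct recognition rate and recognition efficiency than the original feature set, SFSDC, and GADC.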

3.
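Feature subset selection method based on particle swarm optimization and support vector machines   (Total citations: 9, self-citations: 0, citations by others: 9)
Qiao Liyan, Peng Xiyuan, Peng Yu. Acta Electronica Sinica (《电子学报》), 2006, 34(3): 496-498
In pattern classification systems, large numbers of irrelevant or redundant features often degrade classifier performance, so feature selection is needed. This paper proposes a wrapper feature subset selection method based on discrete binary particle swarm optimization (BPSO) and support vector machines (SVM). A population of candidate feature subsets is first generated at random; the BPSO algorithm then optimizes the features, with the 10-fold cross-validation result of the SVM guiding the search; finally, the subset with the best fitness is selected to train the SVM. Experimental results on two UCI machine learning datasets (outdoor image and Ionosphere) demonstrate the effectiveness of the proposed algorithm.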
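
The wrapper idea in the entry above (binary PSO searching over feature masks, with a classifier's cross-validation score as fitness) can be sketched roughly as follows. This is a generic sigmoid-based binary PSO, not the authors' implementation: the inertia and acceleration constants, the velocity clamp, and the function names are assumptions, and `toy_fitness` is only a stand-in for the cross-validated SVM accuracy.

```python
import math
import random


def bpso_select(n_features, fitness, n_particles=10, n_iters=30, seed=0):
    """Binary PSO over 0/1 feature masks; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed values)
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                v = (w * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                v = max(-4.0, min(4.0, v))  # clamp the velocity
                vel[i][d] = v
                # The sigmoid of the velocity is the probability that the bit is 1.
                prob_one = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


def toy_fitness(mask):
    """Stand-in for a cross-validated classifier score (hypothetical):
    features 0 and 3 are useful, and every selected feature costs a little."""
    return 10 * (mask[0] + mask[3]) - sum(mask)


best_mask, best_fit = bpso_select(6, toy_fitness)
print(best_mask, best_fit)
```

In a real wrapper, `toy_fitness` would be replaced by a function that trains and cross-validates the classifier on the columns where the mask is 1, which is by far the most expensive part of the loop.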

4.
The main aim of this study is to select the optimal set of genes from microarray cancer datasets that contribute to the prediction of specific cancer types. This study proposes the enhancement of the feature selection filter algorithm based on Joe's normalized mutual information and its use for gene selection. The proposed algorithm is implemented and evaluated on seven benchmark microarray cancer datasets, namely, central nervous system, leukemia (binary), leukemia (3 class), leukemia (4 class), lymphoma, mixed lineage leukemia, and small round blue cell tumor, using five well‐known classifiers, including the naive Bayes, radial basis function network, instance‐based classifier, decision‐based table, and decision tree. An average increase in the prediction accuracy of 5.1% is observed on all seven datasets averaged over all five classifiers. The average reduction in training time is 2.86 seconds. The performance of the proposed method is also compared with those of three other popular mutual information–based feature selection filters, namely, information gain, gain ratio, and symmetric uncertainty. The results are impressive when all five classifiers are used on all the datasets.

5.
Similarity-based online feature selection in content-based image retrieval.   (Total citations: 2, self-citations: 0, citations by others: 2)
Content-based image retrieval (CBIR) has become increasingly important in the last decade, and the gap between high-level semantic concepts and low-level visual features hinders further performance improvement. Online feature selection is critical to bridging this gap. In this paper, we investigate online feature selection in the relevance feedback learning process to improve the retrieval performance of the region-based image retrieval system. Our contributions are mainly in three areas. 1) A novel feature selection criterion is proposed, which is based on the psychological similarity between the positive and negative training sets. 2) An effective online feature selection algorithm is implemented in a boosting manner to select the most representative features for the current query concept and combine classifiers constructed over the selected features to retrieve images. 3) To apply the proposed feature selection method in region-based image retrieval systems, we propose a novel region-based representation to describe images in a uniform feature space with real-valued fuzzy features. Our system is suitable for online relevance feedback learning in CBIR by meeting three requirements: learning with a small training set, the intrinsic asymmetry of training samples, and fast response. Extensive experiments, including comparisons with many state-of-the-art methods, show the effectiveness of our algorithm in improving retrieval performance and saving processing time.

6.
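Cai Canhui, Zhu Jianqing. Journal of Signal Processing (《信号处理》), 2013, 29(8): 956-963
This paper proposes a fast face detector based on Gentle AdaBoost and a nesting cascade structure. In the classical AdaBoost cascade classifier, the node classifiers are trained independently, so many weak classifiers with identical features exist across different nodes and slow down detection. Adopting the nesting cascade structure and removing, during training, the features already used by preceding node classifiers solves this problem and increases face detection speed. The Gentle AdaBoost algorithm is used to train the node classifiers in order to improve their generalization ability and further reduce the number of weak classifiers in the nesting cascade structure. Experimental results show that the proposed face detection algorithm greatly reduces the number of weak classifiers required by the cascade classifier and markedly increases detection speed, reaching 8 ms per frame on CIF (352×288) video, which is faster than existing face detection algorithms. Its detection accuracy is also slightly higher than that of existing face detection algorithms.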

7.
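Combining view-based 3D model classification with deep learning can effectively improve classification accuracy. However, current methods assign the views from all viewpoints of 3D models of the same category to a single class, ignoring the differences between views from different viewpoints, which makes it hard for the classifier to learn a reasonable decision surface. To solve this problem, this paper proposes a 3D model classification method based on deep neural networks. The method places multiple viewpoint groups uniformly around the 3D model and trains one view classifier for each viewpoint group, fully exploiting the deep information of the 3D model under different viewpoint groups. These classifiers share one feature extraction network, but each has its own classification network. To make the extracted view features discriminative, an attention mechanism is added to the feature extraction network; to model views that do not belong to a given viewpoint group, an additional class is added to the classification network. In the classification stage, a view selection strategy first selects a small number of views from the many available, which improves classification efficiency. A classification strategy then achieves reliable 3D model classification by classifying the selected views. Experimental results on ModelNet10 and ModelNet40 show that, using only 3 views, the method reaches classification accuracies of 93.6% and 91.0%, respectively.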

8.
We propose a method that can detect humans in a single image based on a novel cascaded structure. In our approach, both intensity-based rectangle features and gradient-based 1-D features are employed in the feature pool for weak-learner selection. The Real AdaBoost algorithm is used to select critical features from a combined feature set and learn the classifiers from the training images for each stage of the cascaded structure. Instead of using the standard boosted cascade, the proposed method employs a novel cascaded structure that exploits both the stage-wise classification information and the interstage cross-reference information. We introduce meta-stages to enhance the detection performance of a boosted cascade. Experimental results show that the proposed approach achieves high detection accuracy and efficiency.

9.
In pattern classification problems, the choice of variables to include in the feature vector is a difficult one. The authors have investigated the use of stepwise discriminant analysis as a feature selection step in the problem of segmenting digital chest radiographs. In this problem, locally calculated features are used to classify pixels into one of several anatomic classes. The feature selection step was used to choose a subset of features that gave performance equivalent to the entire set of candidate features while using fewer computational resources. The impact of using the reduced/selected feature set on classifier performance is evaluated for two classifiers: a linear discriminator and a neural network. The results from the reduced/selected feature set were compared with those of the full feature set as well as a randomly selected reduced feature set. The results of the different feature sets were also compared after applying an additional postprocessing step that used a rule-based spatial information heuristic to improve the classification results. This work shows that, in the authors' pattern classification problem, using a feature selection step reduced the number of features used, reduced the processing time requirements, and gave results comparable to the full set of features.

10.
Feature selection algorithm based on XGBoost   (Total citations: 2, self-citations: 0, citations by others: 2)
Feature selection in classification has always been an important but difficult problem. It requires feature selection algorithms not only to help classifiers improve classification accuracy but also to remove as many redundant features as possible. To better solve feature selection in classification problems, a new wrapper feature selection algorithm, XGBSFS, was proposed. It draws on the tree-building process of XGBoost and measures feature importance with three importance metrics to avoid the limitation of a single importance metric. An improved sequential floating forward selection (ISFFS) was then applied to search for a high-quality feature subset. Experimental results on eight UCI datasets show that the proposed algorithm performs well.
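
For readers unfamiliar with floating search, here is a minimal sketch of the classic sequential floating forward selection (SFFS) loop. It is not the ISFFS variant of the entry above, whose modifications the abstract does not describe, and it does not call XGBoost: `score` is any subset-quality function, and `toy_score` is a hypothetical stand-in for classifier accuracy.

```python
def sffs(n_features, score, k):
    """Classic SFFS: after each forward step, drop features again for as long as
    that beats the best subset already seen at the smaller size."""
    selected = []
    best = {}  # best[size] = (score, subset) seen so far for that subset size

    while len(selected) < k:
        # Forward step: add the single feature that helps most.
        candidates = [f for f in range(n_features) if f not in selected]
        add = max(candidates, key=lambda f: score(selected + [f]))
        selected = selected + [add]
        s = score(selected)
        if len(selected) not in best or s > best[len(selected)][0]:
            best[len(selected)] = (s, selected[:])

        # Floating (backward) steps: remove the least useful feature if the
        # reduced subset strictly improves on the best one of that size.
        while len(selected) > 2:
            drop = max(selected,
                       key=lambda f: score([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            if score(reduced) > best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (score(reduced), reduced[:])
            else:
                break
    return best[k][1]


def toy_score(subset):
    """Hypothetical subset quality with one feature interaction."""
    weights = [5, 4, 4, 1]
    s = set(subset)
    total = sum(weights[f] for f in s)
    if {1, 2} <= s:
        total += 5      # features 1 and 2 are most useful together
        if 0 in s:
            total -= 6  # feature 0 becomes redundant once 1 and 2 are present
    return total


print(sffs(4, toy_score, 3))  # prints [1, 2, 3]
```

On this toy score, plain greedy forward selection would stop at `[0, 1, 2]` (score 12), while the floating step discards feature 0 and ends at `[1, 2, 3]` (score 14).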

11.
Chinese Journal of Electronics (《电子学报:英文版》), 2017, (6): 1168-1176
As conventional feature selection algorithms are prone to poor running efficiency on large-scale datasets with interacting features, this paper proposes a novel rough feature selection algorithm whose innovation centers on a layered co-evolutionary strategy with a neighborhood radius hierarchy. This hierarchy can adapt the rough feature scales among different layers and produce reasonable decompositions by exploiting any correlation and interdependency among feature subsets. Both neighborhood interaction within a layer and neighborhood cascade between layers are adopted to implement the interactive optimization of the neighborhood radius matrix, so that both the optimal rough feature selection subsets and their global optimal set are obtained efficiently. Our experimental results substantiate that the proposed algorithm achieves better effectiveness, accuracy, and applicability than some traditional feature selection algorithms.

12.
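The space-ground integrated network operates in an open electromagnetic environment and is frequently subjected to malicious network intrusions. To address attacks on the system by unauthorized behavior that bypasses security mechanisms, an improved genetic algorithm is proposed. The algorithm uses a decision tree algorithm as its fitness function and, by removing redundant features from the dataset, significantly improves the interception rate of network attacks. Machine learning is used for anomaly classification, and the feature selection capability of the genetic algorithm is used to improve the classification efficiency of the machine learning methods. To verify the effectiveness of the algorithm, the UNSW_NB15 and UGRansome1819 datasets were used for training and testing. Four machine learning classifiers (random forest, artificial neural network, K-nearest neighbors, and support vector machine) were used for evaluation, with accuracy, F1 score, recall, and the confusion matrix as performance metrics. Experiments show that the genetic algorithm, used as a feature selection tool, significantly improves classification accuracy and overall algorithm performance. In addition, to address the instability of weak classifiers, an ensemble learning optimization technique is proposed that integrates weak and strong classifiers. Experiments confirm that this optimization markedly improves the stability of weak classifiers.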

13.
Intrusion detection systems (IDSs) have an important effect on system defense and security. Recently, most IDS methods have used transformed features, selected features, or original features. Both feature transformation and feature selection have their advantages. Neighborhood component analysis feature transformation and genetic feature selection (NCAGAFS) is proposed in this research. NCAGAFS is based on soft computing and data mining and uses the advantages of both transformation and selection. This method transforms features via neighborhood component analysis and chooses the best features with a classifier based on a genetic feature selection method. This novel approach is verified using the KDD Cup99 dataset, demonstrating higher performance than other well‐known methods under various classifiers.

14.
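For the classification of imbalanced data, this paper proposes a new method that applies feature selection to imbalanced datasets. The dataset is first preprocessed; then, from the perspective of feature selection, the features that best represent the dataset are selected, which simplifies the data and improves classification performance. Experiments show that the method effectively improves classification accuracy.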

15.
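Mutual information is a commonly used evaluation function for feature selection, but studies show that it leads to relatively low classification accuracy. To address the tendency of mutual information to select low-frequency words, this paper proposes a new feature evaluation function, TFMIIE, which combines information entropy with an improved mutual information. The improved mutual information avoids the bias toward rare low-frequency words, while the feature entropy helps remove feature words whose class is uncertain. Experimental results show that when TFMIIE is used for feature selection and the resulting feature subset is used to represent texts and build the classifier, the precision and recall of text classification are about 40% higher than with the mutual information method, which verifies that the proposed text feature selection method based on improved mutual information and information entropy is effective.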

16.
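To address the insufficient robustness, in complex environments, of visual tracking algorithms based on the traditional co-training framework, this paper proposes a compressive tracking algorithm under an improved co-training framework. First, using spatial layout information, an online feature selection technique based on energy entropy maximization improves the discriminative power of the compressive sensing classifier, and two independent classifiers based on structured compressive features are built in the gray-level space and the local binary pattern space, respectively. Then, a classifier combination mechanism based on the entropy of the confidence distribution of candidate samples adaptively fuses the complementary features, enhancing the robustness of the tracking results. Finally, with the aid of a cascaded histogram-of-gradients classifier, a new co-training criterion with sample selection capability accurately updates the joint appearance model, which solves the accumulation of co-training errors. Comparative experiments on a large number of challenging sequences verify that the algorithm outperforms other similar tracking algorithms.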

17.
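To address the large classification errors caused by the high dimensionality and small sample size of bio-omics data, an ensemble feature selection method based on constrained niching binary particle swarm optimization is proposed. The method uses binary particle swarm optimization to search for the feature subsets with the highest classification accuracy, constrains the number of set bits in the particle encoding to limit the number of selected features, and incorporates the niching technique from multimodal optimization so that the algorithm obtains several substantially different feature subsets in a single run. Finally, ensemble learning combines the base classifiers built on the multiple feature subsets into a strong classifier, which is used to classify the data. Experimental results show that, on bio-omics data, the feature selection method stably selects a small number of features and achieves good classification performance.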

18.
The incredible growth of telecom data and fierce competition among telecommunication operators for customer retention demand continuous improvements, both strategic and analytical, in current customer relationship management (CRM) systems. One of the key objectives of a typical CRM system is to classify and predict a group of potential churners from a large set of customers in order to devise profitable and targeted retention campaigns for keeping a long-term relationship with valued customers. To achieve this objective, several churn prediction models have been proposed for the accurate identification of customers who are prone to churn. However, these models suffer from a number of limitations that hinder their direct applicability for accurate prediction. Firstly, the feature selection methods adopted in the majority of past work neglected the information-rich variables present in call detail records for model development. Secondly, important features were selected through statistical methods only. Although statistical methods have been applied successfully in diverse domains, these methods alone, without the augmentation of domain knowledge, tend to yield erroneous results. Thirdly, the previous models have been validated mainly with benchmark datasets, which do not provide a true representation of real-world telecom data containing noise and a large number of missing values. Fourthly, the evaluation measures used in the past neglected the true positive (TP) rate, which highlights the ability of a model to correctly classify churners as compared to non-churners. Finally, the classifiers used in the previous models completely neglected fuzzy classification methods, which perform reasonably well on datasets with noise.
In this paper, a fuzzy-based churn prediction model is proposed and validated using real data from a telecom company in South Asia. A number of predominant classifiers, namely neural network, linear regression, C4.5, SVM, AdaBoost, gradient boosting, and random forest, are compared with fuzzy classifiers to highlight the superiority of fuzzy classifiers in predicting the accurate set of churners.

19.
A new suboptimal search strategy suitable for feature selection in very high-dimensional remote sensing images (e.g., those acquired by hyperspectral sensors) is proposed. Each solution of the feature selection problem is represented as a binary string that indicates which features are selected and which are disregarded. In turn, each binary string corresponds to a point of a multidimensional binary space. Given a criterion function to evaluate the effectiveness of a selected solution, the proposed strategy is based on the search for constrained local extremes of such a function in the above-defined binary space. In particular, two different algorithms are presented that explore the space of solutions in different ways. These algorithms are compared with the classical sequential forward selection and sequential forward floating selection suboptimal techniques, using hyperspectral remote sensing images (acquired by the airborne visible/infrared imaging spectrometer [AVIRIS] sensor) as a data set. Experimental results point out the effectiveness of both algorithms, which can be regarded as valid alternatives to classical methods, as they allow interesting tradeoffs between the qualities of selected feature subsets and computational cost.
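
The binary-string formulation above lends itself to a short sketch. The code below shows one plausible reading of "constrained local extremes": a steepest-ascent local search in which the constraint is a fixed number of selected features, so each move swaps one selected feature for one unselected feature. The abstract does not spell out the two algorithms, so this neighborhood definition, the function names, and the toy criterion are all assumptions.

```python
def steepest_ascent(mask, criterion):
    """Local search over binary masks with a fixed number of 1 bits.
    Each step moves to the best single-swap neighbor until none improves."""
    mask = list(mask)
    current = criterion(mask)
    while True:
        best_neighbor, best_value = None, current
        for i, bit_i in enumerate(mask):
            if bit_i != 1:
                continue
            for j, bit_j in enumerate(mask):
                if bit_j != 0:
                    continue
                # Swap: deselect feature i, select feature j.
                neighbor = mask[:]
                neighbor[i], neighbor[j] = 0, 1
                value = criterion(neighbor)
                if value > best_value:
                    best_neighbor, best_value = neighbor, value
        if best_neighbor is None:
            return mask, current  # constrained local maximum reached
        mask, current = best_neighbor, best_value


def toy_criterion(mask):
    """Hypothetical separability measure: a weighted sum of selected features."""
    weights = [3, 1, 4, 1, 5]
    return sum(w for w, bit in zip(weights, mask) if bit)


print(steepest_ascent([1, 1, 0, 0, 0], toy_criterion))
# prints ([0, 0, 1, 0, 1], 9)
```

With a realistic criterion (for example, a class-separability distance computed on the selected bands), the result depends on the starting mask, which is why such searches are usually described as suboptimal.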

20.
In remotely sensed hyperspectral imagery, many samples are collected on a given flight and many variable factors contribute to the distribution of samples. Various factors transform spectral responses, causing them to appear differently in different contexts. We develop a method that infers context via spectra population distribution analysis. In this manner, feature space orientations of sets of spectral signatures are characterized using random set models. The models allow for the characterization of complex and irregular patterns in a feature space. The developed random set framework for context-based classification applies context-specific classifiers in an ensemble-like manner and aggregates their decisions based on their contextual relevance to the spectra under test. Results indicate that the proposed method improves classification accuracy over similar classifiers that make no use of contextual information, and performs well when compared to similar context-based approaches.
