
1.

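谢娟英 雷金虎 谢维信 高新波 《计算机应用》 (Journal of Computer Applications) 2011,31(12):3292-3296

Used as a feature evaluation criterion, the F-score does not account for the effect that differing measurement scales of features have on feature importance. To address this, a new feature evaluation criterion, D-score, is proposed. It measures the discriminative power of a feature between two or more classes and is unaffected by the measurement scale of the features. With D-score as the feature importance criterion, three feature search strategies (sequential forward search, sequential forward floating search, and sequential backward floating search) are combined with support vector machine (SVM) classification accuracy as the measure of a feature subset's classification performance, giving three hybrid feature selection methods. These methods combine the respective strengths of the Filter and Wrapper approaches. Experiments on 9 standard datasets from the UCI machine learning repository, together with a comparison against hybrid feature selection methods based on an improved F-score and SVM, show that D-score is an effective criterion for measuring feature importance, that is, a feature's discriminative power. The hybrid methods based on D-score and SVM achieve effective feature selection, reducing dimensionality while preserving the discriminative power of the dataset.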
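The filter-plus-wrapper pattern described in this abstract can be sketched roughly as follows. The abstract does not give the D-score formula, so this sketch ranks features with the commonly used two-class F-score instead, and a leave-one-out nearest-centroid classifier stands in for the SVM. All function names and the toy data are hypothetical.

```python
from statistics import mean, variance

def f_score(column, labels):
    """Two-class F-score of one feature (a stand-in for D-score, which the abstract does not define)."""
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y == 0]
    overall = mean(column)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1 denominator)
    return between / within if within > 0 else float("inf")

def loo_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of a nearest-centroid classifier (assumed stand-in for the SVM)."""
    correct = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        best_class, best_dist = None, float("inf")
        for cls in sorted(set(labels)):
            # Class members excluding the held-out sample i.
            members = [r for j, (r, y) in enumerate(zip(rows, labels)) if y == cls and j != i]
            if not members:
                continue
            centroid = [mean(m[f] for m in members) for f in subset]
            dist = sum((row[f] - c) ** 2 for f, c in zip(subset, centroid))
            if dist < best_dist:
                best_class, best_dist = cls, dist
        correct += best_class == label
    return correct / len(rows)

def forward_select(rows, labels):
    """Try features in descending F-score order; keep one only if accuracy improves."""
    ranked = sorted(range(len(rows[0])),
                    key=lambda f: f_score([r[f] for r in rows], labels),
                    reverse=True)
    selected, best = [], 0.0
    for f in ranked:
        acc = loo_accuracy(rows, labels, selected + [f])
        if acc > best:
            selected, best = selected + [f], acc
    return selected, best

# Toy data: only feature 0 separates the two classes.
rows = [[1.0, 5.0, 0.3], [1.2, 3.0, 0.9], [0.9, 4.0, 0.1],
        [3.0, 4.5, 0.8], [3.2, 3.5, 0.2], [2.9, 5.5, 0.6]]
labels = [0, 0, 0, 1, 1, 1]
selected, accuracy = forward_select(rows, labels)
print(selected, accuracy)
```

On this toy data only feature 0 separates the classes, so the search keeps it and adds nothing further once accuracy can no longer improve. The floating variants mentioned in the abstract would additionally try removing previously selected features after each addition.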

2.

Genetic algorithm wrapped Bayesian network feature selection applied to differential diagnosis of erythemato-squamous diseases

Akın Özçift Arif Gülten 《Digital Signal Processing》2013,23(1):230-237

This paper presents a new method for the differential diagnosis of erythemato-squamous diseases based on Genetic Algorithm (GA) wrapped Bayesian Network (BN) Feature Selection (FS). To this end, a GA-based FS algorithm combined in parallel with a BN classifier is proposed. The erythemato-squamous dataset contains six dermatological diseases described by 34 features. In the GA–BN algorithm, the GA performs a heuristic search for the most relevant feature model, the one that increases the accuracy of the BN algorithm, using a 10-fold cross-validation strategy. The subsets of features are used sequentially to identify the six dermatological diseases via a BN fitted to the corresponding data. In this case, the algorithm achieves 99.20% classification accuracy in the diagnosis of erythemato-squamous diseases. The strength of the feature model generated for the BN is further tested with Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), Simple Logistics (SL) and Functional Decision Tree (FT) classifiers, whose classification accuracies are 98.36%, 97.00%, 98.36% and 97.81% respectively. By comparison, the 99.20% classification accuracy of the BN algorithm is a high diagnostic performance for erythemato-squamous diseases. The proposed algorithm makes no more than 3 misclassifications out of 366 instances. Furthermore, the FS power of the GA is compared with two alternative search algorithms, Best First (BF) and Sequential Floating (SF). Taken together, the results show that the proposed GA–BN based FS and prediction strategy is very promising for the diagnosis of erythemato-squamous diseases.
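The GA-wrapped search in this abstract can be illustrated with a minimal sketch in which each individual is a 0/1 mask over the features. In the paper the fitness is the 10-fold cross-validated accuracy of a Bayesian network; here a toy fitness function stands in for it, and the population size, mutation rate, elitism scheme, and all names are assumptions made for illustration.

```python
import random

def ga_select(n_features, fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for a 0/1 feature mask that maximizes `fitness` (requires n_features >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[: pop_size // 2]              # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)           # pick two distinct parents
            cut = rng.randrange(1, n_features)    # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability p_mut.
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness)

# Hypothetical fitness: reward the (assumed) relevant features 0 and 3,
# and penalize every selected feature slightly to favor small subsets.
RELEVANT = {0, 3}

def toy_fitness(mask):
    hits = sum(mask[i] for i in RELEVANT)
    return hits - 0.1 * sum(mask)

best_mask = ga_select(6, toy_fitness)
print(best_mask, toy_fitness(best_mask))
```

Because the better half of the population is carried over unchanged, the best fitness never decreases from one generation to the next. Replacing `toy_fitness` with a cross-validated classifier score gives the wrapper arrangement the abstract describes.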

3.

Whodunnit – Searching for the most important feature types signalling emotion-related user states in speech

Anton Batliner Stefan Steidl Björn Schuller Dino Seppi Thurid Vogt Johannes Wagner Laurence Devillers Laurence Vidrascu Vered Aharonson Loic Kessous Noam Amir 《Computer Speech and Language》2011,25(1):4-28

In this article, we describe and interpret a set of acoustic and linguistic features that characterise emotional/emotion-related user states, confined to the one database processed: four classes in a German corpus of children interacting with a pet robot. To this end, we collected a very large feature vector consisting of more than 4000 features extracted at different sites. We performed extensive feature selection (Sequential Forward Floating Search) for seven acoustic and four linguistic types of features, ending up with a small number of ‘most important’ features, which we try to interpret by discussing the impact of different feature and extraction types. We establish different measures of impact and discuss the mutual influence of acoustics and linguistics.

4.

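Feature selection and multi-layer SVM classification of polarimetric SAR images using the separability index

李平 徐新 董浩 邓旭 《计算机应用》 (Journal of Computer Applications) 2018,38(1):132-136

The separability index (SI) can be used to select effective classification features for each type of ground object, but when the feature space is high-dimensional and the ground objects are well separated, selecting features by separability index alone cannot effectively remove redundancy among the features. To address this, a method is proposed that performs feature selection using the separability index supplemented by the sequential backward selection (SBS) algorithm, followed by multi-layer support vector machine (SVM) classification. First, the ground object to classify and its features are chosen according to the separability indices of all ground-object types under all features. Then, with the classification accuracy of that ground object as the evaluation criterion, the features are screened by sequential backward selection. Next, the classification features of each remaining ground-object type are selected in turn using the separability indices among the remaining ground objects and sequential backward selection. Finally, a multi-layer SVM performs the classification. Experimental results show that, compared with multi-layer SVM classification using features selected by separability index alone, the proposed method improves classification accuracy by 2%, achieves a classification accuracy above 86% for every ground-object type, and runs in half the time of the original method.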

5.

A novel ACO–GA hybrid algorithm for feature selection in protein function prediction

Shahla Nemati Mohammad Ehsan Basiri Nasser Ghasem-Aghaee Mehdi Hosseinzadeh Aghdam 《Expert Systems with Applications》2009,36(10):12086-12094

Protein function prediction is an important problem in functional genomics. Typically, protein sequences are represented by feature vectors. A major problem of protein datasets, one that increases the complexity of classification models, is their large number of features. Feature selection (FS) techniques are used to deal with this high-dimensional feature space. In this paper, we propose a novel feature selection algorithm that combines genetic algorithms (GA) and ant colony optimization (ACO) for faster and better search capability. The hybrid algorithm makes use of the advantages of both the ACO and GA methods. The proposed algorithm is easy to implement and, because it uses a simple classifier, its computational complexity is very low. Its performance is compared with that of two prominent population-based algorithms, ACO and genetic algorithms. Experiments are carried out on two challenging biological datasets involving the hierarchical functional classification of GPCRs and enzymes. The criteria used for comparison are maximizing predictive accuracy and finding the smallest subset of features. The experimental results indicate the superiority of the proposed algorithm.

6.

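Stock market volatility prediction method based on improved genetic algorithm and graph neural network

李晓寒 贾华丁 程雪 李太勇 《计算机应用》 (Journal of Computer Applications) 2022,42(5):1624-1633

Intelligent algorithms such as the support vector machine (SVM) and the long short-term memory (LSTM) network face two problems in stock market volatility prediction: selecting stock evaluation features is difficult, and features along the temporal-relationship dimension are missing. To predict stock volatility accurately and guard effectively against financial market risk, a stock market volatility prediction method based on an improved genetic algorithm (IGA) and a graph neural network (GNN), named IGA-GNN, is proposed. First, graph data of stock market trading indicators are constructed from the temporal relationships between adjacent trading days. Second, the genetic algorithm (GA) is improved by optimizing the crossover and mutation probabilities according to the characteristics of the evaluation indicators, which realizes node feature selection. Then, a weight matrix for the edges and node features of the graph data is established. Finally, the GNN aggregates and classifies the graph nodes to predict stock market volatility. In the experiments, the stocks studied have 130 evaluation indicators in total, of which IGA extracts 87 effective indicators under the GNN method, reducing the number of indicators by 33.08%. Applying the proposed IGA for feature extraction in the intelligent algorithms raises prediction accuracy by 7.38 percentage points overall compared with the same algorithms without feature extraction, and shortens total training time by 17.97% compared with feature extraction by the traditional GA. The IGA-GNN method has the highest prediction accuracy, 19.62 percentage points higher overall than the GNN method without feature extraction, and its training time is 15.97% shorter on average than that of the GNN method with traditional GA feature extraction. The experimental results show that the proposed method extracts stock features effectively and predicts well.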
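The reported 33.08% reduction follows directly from the two indicator counts given in this abstract (130 in total, 87 retained). A short check, with variable names chosen only for illustration:

```python
total_indicators = 130   # all evaluation indicators, from the abstract
kept_indicators = 87     # indicators retained by IGA, from the abstract

# Relative reduction in the number of indicators, as a percentage.
reduction_pct = (total_indicators - kept_indicators) / total_indicators * 100
print(f"{reduction_pct:.2f}%")  # prints 33.08%
```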

7.

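Feature selection based on a random forest algorithm with statistical characteristics

宋源 梁雪春 张然 《计算机应用》 (Journal of Computer Applications) 2015,35(5):1459-1461

When traditional feature selection methods are applied to brain functional connectivity matrices obtained from resting-state functional magnetic resonance imaging (R-fMRI), the results contain redundant features and the final feature dimension cannot be determined. To address these problems, a new feature selection algorithm is proposed. The algorithm combines statistical characteristics with the random forest (RF) algorithm, determines which features to retain from the classification performance on out-of-bag data, and is applied to an experiment that distinguishes patients with schizophrenia from normal subjects. Experimental results show that, compared with the traditional principal component analysis (PCA) method, the algorithm effectively retains important features, improves recognition accuracy, and retains features with good medical interpretability.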

8.

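Feature selection method integrating mutual information and fuzzy C-means clustering

朱接文 肖军 《计算机应用》 (Journal of Computer Applications) 2014,34(9):2608-2611

To address the problem that a large number of redundant features in large datasets may degrade classification performance, an automatic feature selection method named FCC-MI, based on the integration of mutual information (MI) and fuzzy C-means (FCM) clustering, is proposed. First, mutual information features and their relevance function are analyzed, and the features are ranked by relevance. Then, the data are grouped according to the feature with the maximum relevance, and the FCM clustering method automatically determines the optimal number of features. Finally, features are selected on the basis of relevance. Experiments are carried out on 7 datasets from the UCI machine learning repository, with comparisons against methods from the related literature: the feature selection method combining within-class variance and relevance (WCMFS), the feature selection algorithm based on approximate Markov blanket and dynamic mutual information (B-AMBDMI), and the two-stage feature selection method based on mutual information and a genetic algorithm (T-MI-GA). Theoretical analysis and experimental results show that FCC-MI improves the efficiency of data classification, automatically determines the optimal feature subset while maintaining classification accuracy, and reduces the number of features in the dataset. It is suited to feature reduction and data analysis for massive data with highly correlated features.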
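The relevance-ranking step in this abstract can be illustrated with the standard mutual information between a discrete feature and the class label. The abstract does not specify its relevance function, so plain mutual information is used here as an assumption, and the FCM step that chooses the number of features is not covered. The function name and the toy features are hypothetical.

```python
from collections import Counter
from math import log2

def mutual_information(xs, ys):
    """Mutual information I(X; Y) in bits between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

labels = [0, 0, 1, 1]
features = {
    "copy_of_label": [0, 0, 1, 1],   # fully determines the label
    "independent":   [0, 1, 0, 1],   # carries no information about the label
}

# Rank features from most to least relevant to the label.
ranking = sorted(features,
                 key=lambda name: mutual_information(features[name], labels),
                 reverse=True)
print(ranking)
```

A feature that fully determines a balanced binary label has 1 bit of mutual information with it, while a statistically independent feature has 0 bits, so the ranking places `copy_of_label` first.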

9.

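Feature selection algorithm based on the Filter Wrapper mode

周传华 柳智才 丁敬安 周家亿 《计算机应用研究》 (Application Research of Computers) 2019,36(7)

Feature selection is a persistent and important problem in data mining, machine learning, and pattern recognition. To address the selection bias that traditional information gain shows in feature selection when classes and features are unevenly distributed, this paper proposes a feature selection algorithm based on information gain ratio and random forest. The algorithm combines the advantages of the Filter and Wrapper modes. It first measures features comprehensively in terms of both information relevance and classification ability, then selects features using the sequential forward selection (SFS) strategy, evaluating feature subsets by classification accuracy to obtain the optimal feature subset. Experimental results show that the algorithm not only reduces the dimensionality of the feature space but also effectively improves the classification performance and recall of the classification algorithm.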
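The information gain ratio named in this abstract divides information gain by the entropy of the feature itself (the split information), which penalizes features with many distinct values. The sketch below uses the textbook definition for discrete features; how the paper combines it with random forest is not stated in the abstract and is not shown. Function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Shannon entropy in bits of a discrete sequence."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by the split information."""
    n = len(labels)
    conditional = 0.0
    for v in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == v]
        conditional += len(subset) / n * entropy(subset)
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)
    return info_gain / split_info if split_info > 0 else 0.0

labels = [0, 0, 1, 1]
binary_feature = ["x", "x", "y", "y"]    # two values, each pure in the label
id_like_feature = ["a", "b", "c", "d"]   # one distinct value per sample

print(gain_ratio(binary_feature, labels), gain_ratio(id_like_feature, labels))
```

Both toy features have the same information gain (1 bit), but the ID-like feature has twice the split information, so its gain ratio is half that of the binary feature.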

10.

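Hybrid feature selection algorithm combining sequential backward selection and support vector machine

吴清寿 刘长勇 林丽惠 《计算机系统应用》 (Computer Systems & Applications) 2019,28(7):174-179

The curse of dimensionality is a common problem in machine learning tasks; feature selection algorithms can pick the optimal feature subset from the original dataset and reduce the feature dimension. A hybrid feature selection algorithm is proposed. First, an important feature subset is selected with the chi-square test and a filter method and then standardized. Next, the SBS-SVM algorithm, in which sequential backward selection (SBS) wraps a support vector machine (SVM), selects the optimal feature subset, maximizing classification performance while effectively reducing the number of features. In the experiments, the wrapper-stage SBS-SVM is tested against two other algorithms on 3 classic datasets, and the results show that SBS-SVM performs well in both classification performance and generalization ability.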
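The wrapper stage described in this abstract can be sketched as a generic sequential backward selection loop. In SBS-SVM the score would be an SVM's classification performance on the candidate subset; here a toy scoring function stands in for it, and the stopping rule (stop when every single removal lowers the score) is an assumption, as are all names.

```python
def sequential_backward_selection(features, score, min_features=1):
    """Greedily drop one feature at a time while the score does not decrease."""
    current = list(features)
    best = score(current)
    while len(current) > min_features:
        # Build every subset obtained by removing exactly one feature.
        candidates = [[f for f in current if f != drop] for drop in current]
        top = max(candidates, key=score)
        top_score = score(top)
        if top_score < best:
            break  # every possible removal lowers the score, so stop
        current, best = top, top_score
    return current, best

# Hypothetical score: features "a" and "b" are useful; every feature costs 0.1.
USEFUL = {"a", "b"}

def toy_score(subset):
    return sum(1 for f in subset if f in USEFUL) - 0.1 * len(subset)

kept, kept_score = sequential_backward_selection(["a", "b", "c", "d"], toy_score)
print(kept, kept_score)
```

With this toy score the loop removes the two uninformative features and then stops, because dropping either remaining feature would lower the score.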