Feature selection using forest optimization algorithm based on duplication analysis
Citation: JI Ruohan, DONG Hongbin. Feature selection using forest optimization algorithm based on duplication analysis[J]. CAAI Transactions on Intelligent Systems (智能系统学报), 2022, 17(6): 1113-1122.
Authors: JI Ruohan, DONG Hongbin
Affiliation: School of Computer Science and Technology, Harbin Engineering University, Harbin 150001, Heilongjiang, China
Abstract: The forest optimization algorithm is an evolutionary algorithm inspired by how trees in a forest spread their seeds. It searches the feature space well and is easy to implement. However, the convergence speed and optimization ability of the forest as a whole leave room for improvement, and the algorithm adapts poorly to high-dimensional datasets. To address these problems, this paper proposes a feature selection algorithm using forest optimization based on duplication analysis (DAFSFOA). The algorithm introduces an adaptive initialization strategy based on information gain, a forest duplication analysis mechanism, a forest restart mechanism, a candidate optimal tree generation strategy, and a fitness function that accounts for both the number of selected features and the classification accuracy. Experimental results show that DAFSFOA achieves the highest classification accuracy on most datasets. On the high-dimensional dataset SRBCT, DAFSFOA improves substantially on feature selection using forest optimization algorithm (FSFOA) in both dimensionality reduction rate and classification accuracy. DAFSFOA explores the feature space more effectively than FSFOA and adapts to datasets of different dimensionality.

Keywords: feature selection; evolutionary algorithm; duplication analysis; information entropy; information gain; restart mechanism; forest optimization algorithm; dimensionality reduction
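
The abstract names three quantities without giving their formulas: information gain (used for initialization), a fitness function that weighs classification accuracy against the number of selected features, and a measure of duplication within the forest. The sketch below is only one plausible reading of those ideas, not the paper's definitions. The function names, the linear weighting, the default `alpha=0.9`, and the definition of duplication as the share of repeated feature masks are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the conditional entropy given a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def fitness(accuracy, n_selected, n_total, alpha=0.9):
    """Hypothetical fitness: weighted sum of accuracy and dimensionality reduction.

    The linear form and the weight alpha are assumptions, not taken from the paper.
    """
    reduction = 1.0 - n_selected / n_total
    return alpha * accuracy + (1.0 - alpha) * reduction


def duplication_rate(forest):
    """Assumed measure: fraction of trees (0/1 feature masks) that repeat another tree."""
    if not forest:
        return 0.0
    unique = {tuple(tree) for tree in forest}
    return 1.0 - len(unique) / len(forest)
```

Under these assumptions, a feature that perfectly separates two balanced classes has an information gain of 1 bit, and a forest in which one of four trees is a copy of another has a duplication rate of 0.25. A high rate of this kind is the sort of signal a restart mechanism could act on.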