
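Simultaneous Optimization Feature Selection Based on a Genetic Sooty Tern Algorithm
Cite this article: Jia Heming, Li Yao, Sun Kangjian. Simultaneous optimization feature selection based on a genetic sooty tern algorithm [J]. Acta Automatica Sinica, 2022, 48(6): 1601-1615.
Authors: Jia Heming (贾鹤鸣), Li Yao (李瑶), Sun Kangjian (孙康健)
Affiliation: 1. School of Information Engineering, Sanming University, Sanming 365004
Funding: Natural Science Foundation of Fujian Province (2021J011128);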
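Abstract: Traditional support vector machine (SVM) methods suffer from low classification accuracy when used for data classification. To address this, SVM classification is combined synchronously with feature selection, and an intelligent optimization algorithm is used to optimize the algorithm parameters. First, the genetic algorithm (GA) and the sooty tern optimization algorithm (STOA) are hybridized. The average fitness value is evaluated first; when an individual's fitness function value is below the average, the GA is applied to strengthen its local search, and otherwise the STOA's own optimization process is carried out. The SVM kernel function and the feature selection target are treated jointly as the optimization objects, and the improved STOA-GA searches for the best-fit solution to obtain the classification result for the selected features. Second, data classification is studied on 16 classic UCI data sets and a real breast cancer data set, with comparisons of best fitness value, number of selected features, specificity, sensitivity, and running time. The experimental results show that the algorithm processes data more accurately, avoids interference from redundant features, and has broad prospects for engineering application in data mining.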

Keywords: sooty tern optimization algorithm; hybrid optimization; feature selection; support vector machine; data classification
Received: 2020-05-18

Simultaneous Feature Selection Optimization Based on Hybrid Sooty Tern Optimization Algorithm and Genetic Algorithm
Affiliation: 1. School of Information Engineering, Sanming University, Sanming 365004; 2. School of Mechanical and Electrical Engineering, Northeast Forestry University, Harbin 150040
Abstract: To address the shortcomings of the traditional support vector machine (SVM) in data classification, this paper combines SVM classification with feature selection synchronously and uses an intelligent optimization algorithm to optimize the algorithm parameters. First, the genetic algorithm (GA) is hybridized with the sooty tern optimization algorithm (STOA). The average fitness value is evaluated first: when an individual's fitness function value is less than the average, the GA is used to strengthen the local search; otherwise, the optimization process of the STOA itself is carried out. The SVM kernel function and the feature selection target are taken together as the optimization objects, and the improved STOA-GA is used to find the best-fit solution and obtain the classification results for the selected features. Second, classification experiments on sixteen classic UCI data sets and a real breast cancer data set compare the best fitness value, the number of selected features, specificity, sensitivity, and running time. The experimental results show that the proposed algorithm processes data more accurately, avoids interference from redundant features, and has broad prospects for engineering application in the field of data mining.
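
The hybrid scheme in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract only states that individuals whose fitness is below the population average go through GA operators and the rest through the STOA update, and that SVM parameters and the feature mask are optimized together. Everything else here is an assumption made for illustration: the position encoding (two SVM parameter slots followed by one slot per feature), the weighted fitness form, treating fitness as a quantity to minimize, the `toy_fitness` stand-in that replaces real SVM training, and `move_toward_best`, which is a simplified placeholder and not the actual STOA update equations.

```python
import random


def decode(position, n_features):
    """Split a position vector into two SVM parameter slots and a feature mask.

    Assumed encoding: [c, gamma, f_1, ..., f_n], all in [0, 1]. A real
    implementation would rescale c and gamma to the SVM's parameter ranges.
    """
    c, gamma = position[0], position[1]
    mask = [p > 0.5 for p in position[2:2 + n_features]]
    return c, gamma, mask


def toy_fitness(position, n_features, relevant=(0, 2)):
    """Stand-in for an SVM-based fitness (lower is better, an assumption).

    Pretends the classification error is the share of 'relevant' features
    left unselected, then adds a small penalty for the number selected.
    """
    _, _, mask = decode(position, n_features)
    error = sum(1 for i in relevant if not mask[i]) / len(relevant)
    ratio = sum(mask) / n_features
    return 0.9 * error + 0.1 * ratio


def ga_step(ind, best, rng, mutation_rate=0.1):
    """GA branch: uniform crossover with the best individual, then mutation."""
    child = [b if rng.random() < 0.5 else x for x, b in zip(ind, best)]
    return [rng.random() if rng.random() < mutation_rate else g for g in child]


def move_toward_best(ind, best, rng):
    """Placeholder for the STOA branch: a random step toward the best position."""
    return [x + rng.random() * (b - x) for x, b in zip(ind, best)]


def hybrid_search(n_features=5, pop_size=12, iterations=30, seed=0):
    rng = random.Random(seed)
    dim = 2 + n_features
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    fit = [toy_fitness(p, n_features) for p in pop]
    best_fit, best = min(zip(fit, pop))
    best = list(best)
    history = [best_fit]
    for _ in range(iterations):
        avg = sum(fit) / len(fit)
        # Gate each individual on the population's average fitness.
        pop = [
            ga_step(ind, best, rng) if f < avg else move_toward_best(ind, best, rng)
            for ind, f in zip(pop, fit)
        ]
        fit = [toy_fitness(p, n_features) for p in pop]
        gen_fit, gen_best = min(zip(fit, pop))
        if gen_fit < best_fit:  # keep the best solution seen so far
            best_fit, best = gen_fit, list(gen_best)
        history.append(best_fit)
    return best, best_fit, history


if __name__ == "__main__":
    best, best_fit, _ = hybrid_search()
    print(decode(best, 5), best_fit)
```

In a real experiment, `toy_fitness` would be replaced by a function that trains an SVM on the selected feature columns with the decoded parameters and returns a score built from its classification error.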