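Label-Correlation-Based Common and Specific Feature Selection for Hierarchical Classification
Cite this article: LIN Yao-Jin, BAI Sheng-Xing, ZHAO Hong, LI Shao-Zi, HU Qing-Hua. Label-correlation-based common and specific feature selection for hierarchical classification [J]. Journal of Software (软件学报), 2022, 33(7): 2667-2682.
Authors: LIN Yao-Jin, BAI Sheng-Xing, ZHAO Hong, LI Shao-Zi, HU Qing-Hua
Affiliations: School of Computer Science, Minnan Normal University, Zhangzhou, Fujian 363000; Fujian Province University Key Laboratory of Data Science and Intelligent Application (Minnan Normal University), Zhangzhou, Fujian 363000; Department of Artificial Intelligence, Xiamen University, Xiamen, Fujian 361005; College of Intelligence and Computing, Tianjin University, Tianjin 300072
Funding: National Natural Science Foundation of China (62076116, 61672272, 61925602, 61732011)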
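Abstract: In the era of big data, the number of samples, the feature dimensionality, and the number of classes are all growing rapidly, and the classes are usually organized in a hierarchy. Feature selection for such hierarchical data is therefore important. Feature selection algorithms for this setting have been proposed in recent years, but they do not fully exploit the hierarchical structure of the classes, and they overlook the fact that different class nodes have both common and specific attributes. This paper therefore proposes a label-correlation-based common and specific feature selection algorithm for hierarchical classification. The algorithm uses recursive regularization to select specific features for each internal node of the hierarchy, analyzes label correlation by making full use of the hierarchy, and then uses a regularization penalty to learn the common features of each subtree. The model handles tree-structured hierarchical data and can also directly handle the more complex and more common directed acyclic graph (DAG) hierarchies. Experiments on six tree-structured data sets and four DAG-structured data sets verify the effectiveness of the algorithm.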

Keywords: feature selection; hierarchical classification; common features; specific features; recursive regularization
Received: 2020/11/27
Revised: 2021/1/27

Label-correlation-based Common and Specific Feature Selection for Hierarchical Classification
LIN Yao-Jin, BAI Sheng-Xing, ZHAO Hong, LI Shao-Zi, HU Qing-Hua. Label-correlation-based Common and Specific Feature Selection for Hierarchical Classification [J]. Journal of Software, 2022, 33(7): 2667-2682.
Authors:LIN Yao-Jin  BAI Sheng-Xing  ZHAO Hong  LI Shao-Zi  HU Qing-Hua
Affiliation:School of Computer Science, Minnan Normal University, Zhangzhou 363000, China;Key Laboratory of Data Science and Intelligent Application (Minnan Normal University), Zhangzhou 363000, China;Department of Artificial Intelligence, Xiamen University, Xiamen 361005, China; College of Intelligence and Computing, Tianjin University, Tianjin 300072, China
Abstract: In the era of big data, data sets have grown dramatically in the number of samples, features, and classes, and the classes are usually organized in a hierarchical structure. Selecting features for such hierarchical data is therefore of great significance. Relevant feature selection algorithms have been proposed in recent years. However, the existing algorithms do not take full advantage of the hierarchical structure of the classes, and they ignore the fact that different class nodes have both common and specific features. This study proposes a label-correlation-based feature selection algorithm for hierarchical classification with common and specific features. The algorithm uses recursive regularization to select the specific features for each internal node of the hierarchical structure, makes full use of the hierarchical structure to analyze the label correlation, and then uses a regularization penalty to select the common features of each subtree. The proposed model can handle hierarchical tree data and can also directly handle the more complex hierarchical directed acyclic graph (DAG) data. Experimental results on six hierarchical tree data sets and four hierarchical DAG data sets demonstrate the effectiveness of the proposed algorithm.
Keywords: feature selection; hierarchical classification; common features; specific features; recursive regularization
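
The abstract does not state the objective function, so the following is only a minimal sketch of the general idea behind per-node feature selection with recursive regularization, not the authors' model. It assumes a row-sparse least-squares objective, ||XW - Y||_F^2 + lam * ||W||_{2,1} + alpha * ||W - W_parent||_F^2, where the last term pulls a node's weights toward those of its parent in the hierarchy. The function names `l21_node_weights` and `top_features`, the parameters `lam` and `alpha`, and the synthetic data are all hypothetical. The common-feature and label-correlation terms described in the abstract are not modeled here.

```python
import numpy as np

def l21_node_weights(X, Y, lam=1.0, alpha=0.0, W_parent=None, n_iter=50, eps=1e-8):
    """Assumed objective (not taken from the paper):
        ||XW - Y||_F^2 + lam * ||W||_{2,1} + alpha * ||W - W_parent||_F^2
    solved by iteratively reweighted least squares."""
    d, m = X.shape[1], Y.shape[1]
    if W_parent is None:
        W_parent = np.zeros((d, m))
    XtX, XtY = X.T @ X, X.T @ Y
    # Ridge-style starting point so that every row norm is well defined.
    W = np.linalg.solve(XtX + (lam + alpha) * np.eye(d), XtY + alpha * W_parent)
    for _ in range(n_iter):
        # Reweighting: D_jj = 1 / (2 * ||w_j||_2) comes from the gradient of the l2,1 norm.
        row_norms = np.sqrt((W ** 2).sum(axis=1)) + eps
        D = np.diag(1.0 / (2.0 * row_norms))
        # Stationarity condition: (X^T X + lam*D + alpha*I) W = X^T Y + alpha*W_parent.
        W = np.linalg.solve(XtX + lam * D + alpha * np.eye(d), XtY + alpha * W_parent)
    return W

def top_features(W, k):
    """Rank features by the l2 norm of their weight rows and return the k best indices."""
    scores = np.linalg.norm(W, axis=1)
    return np.argsort(-scores)[:k]

# Synthetic check: only the first three of ten features carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
B = np.array([[2.0, -1.0], [1.5, 1.0], [-1.0, 2.0]])
Y = X[:, :3] @ B + 0.01 * rng.normal(size=(200, 2))

W_root = l21_node_weights(X, Y, lam=1.0)                               # root node, no parent
W_child = l21_node_weights(X, Y, lam=1.0, alpha=0.5, W_parent=W_root)  # child coupled to its parent
selected = sorted(top_features(W_child, 3).tolist())
print(selected)
```

In a real hierarchy each internal node would be trained on its own samples and child labels, and the label matrices would need a common width for the parent coupling term to be well defined; the sketch sidesteps both points by reusing one synthetic data set.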