
Application of the loss balance function to the imbalanced multi-classification problems

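Cite this article:	HUANG Qingkang, SONG Kaitao, LU Jianfeng. Application of the loss balance function to the imbalanced multi-classification problems[J]. 智能系统学报 (CAAI Transactions on Intelligent Systems), 2019, 14(5): 953-958.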

Authors:	HUANG Qingkang, SONG Kaitao, LU Jianfeng

Affiliation:	School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu 210094, China

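Abstract:	Traditional classification algorithms generally require a balanced class distribution in the dataset, yet in practice the class distribution is often imbalanced. Existing data-level and model-level algorithms try to address this problem from different angles, but they face issues such as parameter selection and the extra computation caused by repeated sampling. To address this, a method is proposed that adaptively equalizes sample losses within a mini-batch. The algorithm uses a dynamically learned loss function that adjusts the loss weight of each sample according to the label information of the samples in the mini-batch, so that the total loss of each class within the mini-batch is balanced. Experiments on the Caltech101 and ILSVRC2014 datasets show that the algorithm effectively reduces computational cost and improves classification accuracy, and to some extent avoids the overfitting risk introduced by oversampling methods.
Keywords:	imbalanced learning; imbalanced data classification; imbalanced multi-classification; loss balance; classification algorithm for imbalanced data; imbalanced dataset; F₁ measure; convolutional neural networks; deep learning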
Application of the loss balance function to the imbalanced multi-classification problems

HUANG Qingkang, SONG Kaitao, LU Jianfeng. Application of the loss balance function to the imbalanced multi-classification problems[J]. CAAI Transactions on Intelligent Systems, 2019, 14(5): 953-958.

Authors:	HUANG Qingkang, SONG Kaitao, LU Jianfeng

Affiliation:	School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China

Abstract:	Traditional classification algorithms generally require a balanced distribution of classes in the dataset. In practice, however, they often face an imbalanced class distribution. Existing data-level and classifier-level methods attempt to solve this problem from different perspectives but have disadvantages, including parameters that must be chosen carefully and the additional computation caused by repeated sampling. To address these disadvantages, a method that adaptively balances the losses of the samples in a mini-batch is proposed. The algorithm uses a dynamically learned loss function to adjust the loss weight of each sample based on the label information in each mini-batch, thereby balancing the total loss of each class within the mini-batch. Experiments on the Caltech101 and ILSVRC2014 datasets show that the algorithm effectively reduces the computational cost, improves the classification accuracy, and to some extent avoids the overfitting risk that oversampling methods introduce.
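
The abstract does not state the weighting formula, so the following is only a minimal sketch of in-batch loss balancing under one assumed rule: each sample is weighted by the inverse of its class count in the mini-batch, so every class present in the batch receives the same total weight. The function names (batch_balanced_weights, balanced_batch_loss) and the toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cross_entropy(probs, label):
    """Negative log-likelihood of the true label for one sample."""
    return -math.log(probs[label])


def batch_balanced_weights(labels):
    """Per-sample weights computed from the labels of one mini-batch.

    Assumed rule (not necessarily the paper's): a sample of class c gets
    weight 1 / (K * n_c), where n_c is the number of samples of class c
    in the batch and K is the number of distinct classes in the batch.
    Every class present then has total weight 1 / K.
    """
    counts = Counter(labels)
    num_classes_present = len(counts)
    return [1.0 / (num_classes_present * counts[y]) for y in labels]


def balanced_batch_loss(batch_probs, labels):
    """Weighted sum of per-sample cross-entropy losses for one mini-batch."""
    weights = batch_balanced_weights(labels)
    losses = [cross_entropy(p, y) for p, y in zip(batch_probs, labels)]
    return sum(w * l for w, l in zip(weights, losses))


# Toy mini-batch: three samples of class 0 and one sample of class 1.
labels = [0, 0, 0, 1]
batch_probs = [[0.7, 0.3], [0.8, 0.2], [0.6, 0.4], [0.4, 0.6]]

weights = batch_balanced_weights(labels)
loss = balanced_batch_loss(batch_probs, labels)
```

Under this rule the weights depend only on the labels of the current mini-batch, so no resampling pass and no global class-frequency hyperparameter is needed. Each class then contributes its mean in-batch loss, scaled by 1/K, to the batch objective.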

Keywords:	imbalanced learning; imbalanced data classification; imbalanced multi-classification; loss balance; classification algorithm for imbalanced data; imbalanced dataset; F₁ measure; convolutional neural networks; deep learning
