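Research on a Sampling Algorithm in Gaussian Kernel Scale Space
Cite this article: ZHU Shunzhi, SHI Hua, LIU Lizhao, YE Dongyi. Research on a Sampling Algorithm in Gaussian Kernel Scale Space[J]. Journal of Frontiers of Computer Science and Technology (计算机科学与探索), 2012, 6(7): 644-653
Authors: ZHU Shunzhi  SHI Hua  LIU Lizhao  YE Dongyi
Affiliations: 1. Department of Computer Science and Technology, Xiamen University of Technology, Xiamen, Fujian 361024, China
2. Key Laboratory of Spatial Data Mining and Information Sharing of the Ministry of Education, Fuzhou University, Fuzhou 350002, China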
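Abstract: The problem of expanding feature points in a linear scale space is recast as a classification problem within a single scale of a multi-scale dataset; this is a scale-invariant classification problem on an imbalanced dataset. A sampling algorithm based on kernel learning in scale space is proposed to handle support vector machine (SVM) classification on imbalanced datasets. The core idea is to first over-sample the minority class in kernel space, then locate the pre-images of the synthetic samples in input space through the distance relation between input space and kernel space, and finally train an SVM on the result. This overcomes the data inconsistency that arises when current sampling methods process training samples in different spaces. The sampling strategy not only lowers the imbalance ratio of the data but also enlarges the convex hull formed by the minority-class samples, and so corrects the shift of the optimal separating hyperplane more effectively. Experimental results show that the resulting classifier generalizes better and can effectively increase the number of stable feature points within a single scale.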

Keywords: classification  Gaussian kernel  scale space  convex hull  imbalanced datasets

Sampling Method Based on Scale Space with Gaussian Kernel
ZHU Shunzhi, SHI Hua, LIU Lizhao, YE Dongyi. Sampling Method Based on Scale Space with Gaussian Kernel[J]. Journal of Frontiers of Computer Science and Technology, 2012, 6(7): 644-653
Authors:ZHU Shunzhi    SHI Hua    LIU Lizhao    YE Dongyi
Abstract: The problem of expanding feature points in a linear scale space is recast as a classification problem within a single scale of a multi-scale dataset, which is a scale-invariant classification problem on an imbalanced dataset. This paper presents a sampling approach based on kernel learning in scale space with a Gaussian kernel, aimed at classification on imbalanced datasets with a support vector machine (SVM). The method first over-samples the minority class in kernel space, then finds the pre-images of the synthetic samples using a distance relation between kernel space and input space, and finally appends these pre-images to the original dataset for training. This overcomes the inconsistency that arises when current sampling methods process training samples in different spaces. The sampling strategy not only lowers the imbalance ratio of the training dataset but also enlarges the convex hull of the minority class, so the boundary skew of the optimal separating hyperplane is corrected more effectively. Experimental results on real datasets indicate that the resulting classifier generalizes better and that the algorithm effectively increases the number of stable feature points within a given scale.
Keywords:classification   Gaussian kernel   scale space   convex hull   imbalanced datasets
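
The first two steps described in the abstract (over-sampling in kernel space, then recovering a pre-image through a distance relation) can be illustrated with a minimal sketch. It is not the authors' algorithm, whose details are not given on this page. It assumes a Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2)), for which the feature-space distance d_F^2 = 2 - 2k(x, y) can be inverted to ||x - y||^2 = -2 sigma^2 ln(1 - d_F^2 / 2). As a further simplifying assumption, the pre-image is searched only on the segment between the two parent samples, so this sketch stays inside the minority-class convex hull and does not reproduce the hull enlargement the abstract reports. All function names (`gaussian_kernel`, `input_sqdist`, `synth_preimage`, `oversample`) are hypothetical.

```python
import math
import random

def gaussian_kernel(x, y, sigma):
    """Gaussian kernel k(x, y) = exp(-||x - y||^2 / (2 sigma^2))."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))

def input_sqdist(feature_sqdist, sigma):
    """Invert d_F^2 = 2 - 2k to get the squared input-space distance."""
    # Clamp to guard against rounding pushing the log argument out of (0, 1].
    arg = min(1.0, max(1e-12, 1.0 - feature_sqdist / 2.0))
    return -2.0 * sigma ** 2 * math.log(arg)

def synth_preimage(xi, xj, delta, sigma):
    """Approximate pre-image of z = (1 - delta) phi(xi) + delta phi(xj).

    Assumption of this sketch: the pre-image lies on the segment xi -> xj.
    """
    k = gaussian_kernel(xi, xj, sigma)
    zz = (1 - delta) ** 2 + delta ** 2 + 2 * delta * (1 - delta) * k  # <z, z>
    z_i = (1 - delta) + delta * k   # <z, phi(xi)>
    z_j = (1 - delta) * k + delta   # <z, phi(xj)>
    # Feature-space squared distances from z to each parent, mapped to input space.
    d_i = input_sqdist(zz + 1.0 - 2.0 * z_i, sigma)
    d_j = input_sqdist(zz + 1.0 - 2.0 * z_j, sigma)
    seg = sum((a - b) ** 2 for a, b in zip(xi, xj))
    if seg == 0.0:
        return list(xi)
    # Subtracting the two distance constraints gives a linear equation in t.
    t = 0.5 + (d_i - d_j) / (2.0 * seg)
    t = min(1.0, max(0.0, t))
    return [a + t * (b - a) for a, b in zip(xi, xj)]

def oversample(minority, n_new, sigma, seed=0):
    """Create n_new synthetic minority samples from random parent pairs."""
    rng = random.Random(seed)
    out = []
    for _ in range(n_new):
        xi, xj = rng.sample(minority, 2)
        out.append(synth_preimage(xi, xj, rng.random(), sigma))
    return out
```

The synthetic points returned by `oversample` would then be appended to the original training set before fitting an SVM, which is the third step in the abstract. A fuller distance-constrained pre-image method would use more than two reference points, which is what lets the recovered point leave the segment.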
This article is indexed in CNKI, Wanfang Data, and other databases.