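Research on Clustering with Feature Preferences (基于特征偏好的聚类研究)
Cite as: FANG Ling, CHEN Song-can. Research on Clustering with Feature Preferences[J]. Computer Science, 2015, 42(5): 57-61.
Authors: FANG Ling, CHEN Song-can
Affiliation: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016
Abstract: Traditional clustering methods such as k-means and fuzzy c-means usually do not distinguish how much each data feature contributes to, or matters for, the clustering. On high-dimensional data they therefore often perform poorly, which is attributed to ignoring the high correlation or redundancy among the features of high-dimensional data. Introducing a weight for each feature and optimizing the clustering objective not only yields the weights automatically but also improves clustering performance. Even so, feature weights obtained without supervision do not necessarily agree with the relative importance (or preferences) among features that users expect. This paper therefore attempts to use actual preferences supplied by users to design a clustering method that reflects feature preferences. It extends the existing globally weighted preference clustering method, whose weights are independent of individual clusters, to a cluster-dependent, locally feature-weighted method, thereby remedying the shortcomings of the former and improving the performance of preference clustering algorithms.

Keywords: clustering analysis; feature preferences; feature weights; cluster-dependent; quadratic programming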

Research on Clustering with Feature Preferences
FANG Ling and CHEN Song-can. Research on Clustering with Feature Preferences[J]. Computer Science, 2015, 42(5): 57-61.
Authors:FANG Ling and CHEN Song-can
Affiliation: Department of Computer Science & Engineering, Nanjing University of Aeronautics & Astronautics, Nanjing 210016, China
Abstract: Traditional clustering methods, such as k-means and fuzzy c-means, generally do not distinguish the different contributions or importance of data features to individual clusters. On high-dimensional data they therefore often yield poor clustering performance, because they barely account for the high correlation or redundancy between features. To mitigate this, feature weights for each cluster can be introduced into the clustering objective, which automatically yields cluster-dependent weights and also improves clustering performance. Even so, the feature weights obtained by an unsupervised clustering algorithm do not necessarily match the relative importance (or preferences) between features that users expect. This paper therefore attempts to use actual preferences from users to design a clustering method that reflects feature preferences. The proposed method extends existing clustering methods with globally weighted, cluster-independent features to one with locally weighted, cluster-dependent features, and also improves the performance of clustering with feature preferences.
Keywords:Clustering analysis  Feature preferences  Feature weighting  Cluster-dependent  Quadratic programming
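
The abstract does not give the paper's objective function, so the following is only a minimal sketch of the general idea of cluster-dependent (local) feature weighting, not the authors' method. It assumes an entropy-regularized weight update, which is a swapped-in technique chosen for brevity; the user-preference constraints and the quadratic programming step named in the keywords are not shown. The function name `local_weighted_kmeans` and the parameter `gamma` are hypothetical.

```python
import numpy as np


def local_weighted_kmeans(X, k, gamma=1.0, n_iter=50, seed=0):
    """k-means variant with one feature-weight vector per cluster.

    Illustrative assumption: weights come from a softmax over the negative
    within-cluster dispersion of each feature, scaled by gamma.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    rng = np.random.default_rng(seed)
    # Start from k distinct data points and uniform weights.
    centers = X[rng.choice(n, size=k, replace=False)].copy()
    weights = np.full((k, d), 1.0 / d)

    def assign():
        # Weighted squared distance of every point to every center: (n, k).
        sq = (X[:, None, :] - centers[None, :, :]) ** 2
        return (sq * weights[None, :, :]).sum(axis=2).argmin(axis=1)

    for _ in range(n_iter):
        labels = assign()
        for j in range(k):
            members = X[labels == j]
            if len(members) == 0:
                continue  # keep the previous center and weights
            centers[j] = members.mean(axis=0)
            # Per-feature dispersion inside cluster j.
            disp = ((members - centers[j]) ** 2).sum(axis=0)
            logits = -disp / gamma
            logits -= logits.max()  # numerical stability
            w = np.exp(logits)
            weights[j] = w / w.sum()  # each row sums to 1
    return assign(), centers, weights
```

Each row of the returned `weights` array belongs to one cluster, so two clusters can emphasize different features; a globally weighted method would instead share a single row across all clusters. A small `gamma` concentrates weight on the least dispersed feature, while a large `gamma` keeps the weights close to uniform.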
This article is indexed in Wanfang Data (万方数据) and other databases.