
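Supervised divisive hierarchical clustering method based on PCA and K-means clustering*
Cite this article: PU Luping, ZHAO Pengda, HU Guangdao, ZHANG Zhenfei, XIA Qinglin. Supervised divisive hierarchical clustering method based on PCA and K-means clustering*[J]. Application Research of Computers, 2008, 25(5): 1412-1414.
Authors: PU Luping  ZHAO Pengda  HU Guangdao  ZHANG Zhenfei  XIA Qinglin
Affiliation: 1. Institute of Remote Sensing Geology and Mathematical Geology, China University of Geosciences, Wuhan 430074; Modern Education Center, Guilin University of Technology, Guilin, Guangxi 541004
2. Institute of Remote Sensing Geology and Mathematical Geology, China University of Geosciences, Wuhan 430074
Funding: National Natural Science Foundation of China; Scientific Research Project of the Guangxi Education Department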
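Abstract: A new supervised binary divisive hierarchical clustering method based on PCA and K-means clustering, PCASHC, is proposed. K-means clustering performs successive binary splits of clusters, and the two samples farthest apart along the first PCA principal component are chosen as the initial K-means cluster centers, which resolves the nondeterministic results caused by random selection of initial centers. The variance of the sample classes within a cluster is used as the cluster impurity to control the level of splitting, which avoids over-fitting and allows a suitable number of clusters to be learned. 10-fold cross-validation classification error tests were carried out on four UCI benchmark datasets, and comparison with seven other classifiers shows that PCASHC achieves relatively high classification accuracy.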

Keywords: data mining  machine learning  supervised clustering  divisive hierarchical clustering
Article ID: 1001-3695(2008)05-1412-03
Received: 2008-04-20
Revised: 2007-02-28

PCA and K-means based supervised split hierarchy clustering method
PU Lu-ping, ZHAO Peng-da, HU Guang-dao, ZHANG Zhen-fei, XIA Qing-lin. PCA and K-means based supervised split hierarchy clustering method[J]. Application Research of Computers, 2008, 25(5): 1412-1414.
Authors: PU Lu-ping  ZHAO Peng-da  HU Guang-dao  ZHANG Zhen-fei  XIA Qing-lin
Affiliation: (1. Research Institute of Mathematical Geology, China University of Geosciences, Wuhan 430074, China; 2. Modern Education Technical Center, Guilin University of Technology, Guilin Guangxi 541004, China)
Abstract: This paper presents a new supervised binary-split hierarchical clustering method, PCASHC (PCA split supervised hierarchy clustering). The method splits each cluster in two by K-means clustering, taking as initial centers the samples with the maximum and minimum values on the first principal component of the cluster. This solves the problem of nondeterministic results caused by randomly chosen initial centers. The variance of the sample classes within a cluster is used as the measure of cluster impurity; it controls the level of splitting, avoids over-fitting, and allows a suitable number of clusters to be found. The method was tested by 10-fold cross validation of classification error on 4 UCI datasets. Compared with 7 other representative classifiers tested on the same datasets in the same way, PCASHC shows high classification accuracy.
Keywords: data mining  machine learning  supervised clustering  split hierarchy clustering
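
The procedure outlined in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact impurity formula, the splitting threshold, or the rule used to classify new samples, so several pieces here are assumptions. Impurity is taken to be the sum over classes of the variance of the class-indicator variable (which equals the Gini index), `max_impurity` is a hypothetical threshold, prediction uses the nearest leaf centroid, and the names `first_pc_scores`, `two_means`, `fit_pcashc_sketch` and `predict` are invented for this example. The first principal component is found by power iteration so that the sketch needs only the standard library.

```python
import math
from collections import Counter

def first_pc_scores(X, iters=200):
    """Scores of the rows of X on the first principal component (power iteration)."""
    n = len(X)
    mean = [sum(col) / n for col in zip(*X)]
    C = [[x - m for x, m in zip(row, mean)] for row in X]  # centered data
    # Deterministic start (assumption): the centered row with the largest norm.
    v = max(C, key=lambda r: sum(x * x for x in r))
    for _ in range(iters):
        s = [sum(c * vj for c, vj in zip(row, v)) for row in C]          # s = C v
        w = [sum(si * ci for si, ci in zip(s, col)) for col in zip(*C)]  # w = C^T s
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all points identical: no direction to find
            break
        v = [x / norm for x in w]
    return [sum(c * vj for c, vj in zip(row, v)) for row in C]

def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(X, c0, c1, iters=100):
    """K-means with K=2 started from the two given centers; returns 0/1 labels."""
    centers, labels = [list(c0), list(c1)], None
    for _ in range(iters):
        new = [0 if dist2(x, centers[0]) <= dist2(x, centers[1]) else 1 for x in X]
        if new == labels:  # assignments stable: converged
            break
        labels = new
        for k in (0, 1):
            members = [x for x, lab in zip(X, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def impurity(labels):
    """Assumed impurity: summed variance of class indicators (Gini index)."""
    n = len(labels)
    return sum((c / n) * (1 - c / n) for c in Counter(labels).values())

def fit_pcashc_sketch(X, y, max_impurity=0.1):
    """Split impure clusters in two until every leaf is pure enough.

    Returns a list of leaves as (centroid, majority_class) pairs.
    """
    leaves, stack = [], [list(range(len(X)))]
    while stack:
        idx = stack.pop()
        pts, labs = [X[i] for i in idx], [y[i] for i in idx]
        if impurity(labs) > max_impurity:
            scores = first_pc_scores(pts)
            lo = min(range(len(pts)), key=scores.__getitem__)
            hi = max(range(len(pts)), key=scores.__getitem__)
            if pts[lo] != pts[hi]:
                # Seeds: the two samples farthest apart on the first PC.
                assign = two_means(pts, pts[lo], pts[hi])
                parts = [[i for i, a in zip(idx, assign) if a == k] for k in (0, 1)]
                if parts[0] and parts[1]:
                    stack.extend(parts)
                    continue
        centroid = [sum(col) / len(pts) for col in zip(*pts)]
        leaves.append((centroid, Counter(labs).most_common(1)[0][0]))
    return leaves

def predict(leaves, x):
    """Assumed decision rule: class of the nearest leaf centroid."""
    return min(leaves, key=lambda leaf: dist2(leaf[0], x))[1]
```

Calling `fit_pcashc_sketch(X, y)` with a list of feature vectors and a list of class labels returns the leaf clusters, and `predict(leaves, x)` classifies a new sample. Because the seeds come from the data rather than a random draw, repeated runs on the same input give the same tree, which is the property the abstract emphasizes.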
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.