Similar literature
1.
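Research on an Improved Fuzzy C-Means Clustering Algorithm   Total citations: 10 (self-citations: 1, by others: 9)
The fuzzy C-means (FCM) clustering algorithm is sensitive to noise and isolated points and handles unevenly distributed samples poorly. To address this, two concrete improvements are proposed: the membership function is modified to eliminate the influence of isolated points on the clustering result, and each sample point is assigned a quantitative weight that distinguishes the contributions of different samples to knowledge discovery, which improves clustering of noisy and unevenly distributed sample sets. Experimental results show that the algorithm is more robust and clusters better.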
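As a rough illustration of the sample-weighting idea, the sketch below performs one textbook FCM membership update followed by a center update in which each sample's contribution is scaled by a weight. The abstract gives neither the paper's modified membership function nor its weighting rule, so the function names, the example `weights`, and the fuzzifier `m = 2` are assumptions made only for illustration.

```python
def fcm_memberships(points, centers, m=2.0):
    """Textbook FCM update: u[i][k] = 1 / sum_j (d_ik / d_ij) ** (2 / (m - 1))."""
    p = 2.0 / (m - 1.0)
    u = []
    for x in points:
        d = [sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5 for c in centers]
        if any(dk == 0.0 for dk in d):
            # The point coincides with a center: share full membership
            # among the coinciding centers to avoid dividing by zero.
            hits = [1.0 if dk == 0.0 else 0.0 for dk in d]
            u.append([h / sum(hits) for h in hits])
        else:
            u.append([1.0 / sum((dk / dj) ** p for dj in d) for dk in d])
    return u


def weighted_centers(points, u, weights, m=2.0):
    """Center update with a per-sample weight (hypothetical weighting scheme)."""
    dim = len(points[0])
    centers = []
    for k in range(len(u[0])):
        coef = [w * row[k] ** m for w, row in zip(weights, u)]
        total = sum(coef)
        centers.append(
            [sum(c * x[j] for c, x in zip(coef, points)) / total for j in range(dim)]
        )
    return centers


# One-dimensional toy data with an isolated point at 100.0.
points = [[0.0], [1.0], [10.0], [11.0], [100.0]]
u = fcm_memberships(points, [[0.5], [10.5]])
# Assumed weights: the isolated point gets weight 0, so it cannot move a center.
print(weighted_centers(points, u, [1.0, 1.0, 1.0, 1.0, 0.0]))
```

With the isolated point weighted at 0, both updated centers stay within the range of the remaining four points; with uniform weights the same function reduces to the ordinary FCM center update.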

2.
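A Support Vector Machine Classifier Based on Fuzzy Partition and Neighboring Pairs   Total citations: 1 (self-citations: 0, by others: 1)
Support vector machines are sensitive to noise points and outliers. Fuzzy support vector machines were proposed to address this, but their fuzzy membership function has to be set manually. This paper proposes a support vector machine classifier based on fuzzy partition and neighboring pairs. Guided by cluster validity, the algorithm first clusters the positive-class and negative-class training data separately with fuzzy c-means. It then constructs c binary classification problems from the clustering results and solves them to obtain c binary classifiers. Finally, it classifies sample points with a neighboring-pair strategy. Numerical experiments on four well-known data sets show that the algorithm effectively improves prediction accuracy on data sets containing noise points and outliers.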

3.
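Fuzzy Clustering with a Fuzzy c-Means Algorithm Improved by Subtractive Clustering   Total citations: 2 (self-citations: 0, by others: 2)
The fuzzy c-means (FCM) clustering algorithm depends on the initial cluster centers, easily falls into local optima, and is sensitive to isolated points. The proposed solution initializes the cluster centers with a fast subtractive clustering algorithm and assigns each sample point a quantitative weight that distinguishes the contributions of different samples to the final clustering result. To speed up clustering, the membership matrix is corrected during iteration. The algorithm is compared with conventional FCM. Experimental results show that it largely resolves the initialization problem and that, compared with random initialization, it needs fewer iterations, converges faster, and gives good clustering results.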

4.
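Partition-based clustering algorithms, typified by K-means and fuzzy C-means, cannot effectively handle clustering tasks in which samples are to be grouped by style. To address this, the paper proposes a fuzzy clustering algorithm that partitions data by style. A style normalization matrix represents the style information of the samples in each cluster, the distance matrix is computed from the samples after they have been mapped toward the standard style, and memberships express how well a sample point represents a cluster. The membership matrix and the style normalization matrices are optimized together using the usual alternating optimization strategy. The algorithm exploits both the style information of the samples and the relationship between sample points and clusters. Experiments on synthetic and real data sets demonstrate its effectiveness.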

5.

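The conventional fuzzy c-means (FCM) algorithm requires the memberships of a sample across all clusters to satisfy a normalization condition (they must sum to 1). This makes the algorithm sensitive to noise and isolated points and reduces its clustering validity on unevenly distributed samples. To address this, an FCM clustering algorithm with an improved constraint on the fuzzy membership function is proposed. By relaxing the normalization condition, a new membership formula is derived, and memberships are corrected continually during clustering, which suppresses noisy samples and improves clustering validity. Comparative experiments verify the correctness of the improved algorithm.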
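The effect of the normalization condition can be seen in a small sketch using the textbook FCM membership formula (not the relaxed formula derived in the paper, which the abstract does not give). The function name and the two example centers are assumptions for illustration, and the function assumes the point does not coincide with a center.

```python
def fcm_memberships_for_point(x, centers, m=2.0):
    """Textbook FCM memberships of one point; by construction they sum to 1."""
    d = [sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5 for c in centers]
    p = 2.0 / (m - 1.0)
    return [1.0 / sum((dk / dj) ** p for dj in d) for dk in d]


centers = [[-1.0, 0.0], [1.0, 0.0]]
# A point lying between the two centers, and a distant noise point.
# Both are equidistant from the two centers.
near = fcm_memberships_for_point([0.0, 0.1], centers)
far = fcm_memberships_for_point([0.0, 1000.0], centers)
print(near)  # [0.5, 0.5]
print(far)   # [0.5, 0.5]
```

Because only distance ratios enter the formula, the distant noise point receives exactly the same memberships as the point lying between the clusters, so it pulls on both centers. This is the sensitivity that relaxing the sum-to-1 constraint is meant to remove.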


6.
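Robust Image Segmentation Based on an Improved Fuzzy Clustering Algorithm   Total citations: 2 (self-citations: 0, by others: 2)
An improved fuzzy clustering segmentation algorithm is proposed for noisy images. Because the fuzzy C-means (FCM) clustering algorithm is sensitive to noisy data, the algorithm modifies the objective function of fuzzy clustering so that the fuzzy memberships carry a clearer meaning: the Euclidean distance to a cluster prototype used in standard FCM is replaced by the distance to the cluster's Voronoi cell, which makes the clustering result more robust. Experimental results show that the improved algorithm segments noisy images more robustly than FCM.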

7.
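An Evolutionary Clustering Algorithm Based on Membership Optimization   Total citations: 1 (self-citations: 0, by others: 1)
Computing the memberships of data points is the main factor limiting the efficiency of FCM. A new accelerated fuzzy C-means algorithm (AFCM) is therefore proposed to speed up FCM and FCM-based evolutionary clustering algorithms. AFCM uses a sampling-based initialization to produce good initial cluster centers. For data points with large memberships it updates the fuzzy cluster centers with a single k-means step, and it updates only the small memberships, which accelerates FCM. To verify the effectiveness of the method and to improve clustering efficiency, AFCM is applied to fuzzy clustering based on evolutionary algorithms. Experiments show that the method reduces the computation time of clustering on large data sets while maintaining good clustering results.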

8.
王治和  王淑艳  杜辉 《计算机工程》2021,47(5):88-96,103
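The fuzzy C-means (FCM) clustering algorithm cannot identify non-convex data, and its Euclidean-distance similarity measure considers only the local consistency of data points while ignoring global consistency. This paper proposes an FCM algorithm that builds the similarity matrix from a density-sensitive distance measure. The affinity propagation algorithm is used to obtain a coarse number of clusters, which serves as the upper bound of the search range for the optimal number of clusters. This addresses two problems of FCM: the number of clusters must be set manually in advance, and randomly chosen initial cluster centers make the results unstable. On this basis, the max-min distance algorithm is improved to select representative sample points as initial cluster centers, and the silhouette coefficient is used to determine the optimal number of clusters automatically. Experiments on UCI and synthetic data sets show that, compared with classic FCM, K-means, and CFSFDP, the algorithm can identify complex non-convex data and converges faster while preserving clustering performance and stability.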

9.
《微型机与应用》2014,(15):40-42
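An improved fuzzy clustering image segmentation algorithm based on quantum particle swarm optimization is proposed. Because FCM image segmentation is sensitive to the initial values of the cluster centers, the strong global search ability of quantum particle swarm optimization is used to find the optimal solution, which effectively reduces the dependence of segmentation on the initial values. In addition, a new distance measure based on cluster density is used to compute the distance between image feature points and cluster centers. It takes the global information of the data set into account when determining cluster centers, and dynamic memberships are used during iteration, which reduces noise interference. Simulation results show that the improved algorithm performs well.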

10.
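Because the FCM clustering algorithm easily falls into local optima and is very sensitive to the initial points, a point-density-weighted FCM algorithm based on search space smoothing is proposed to obtain the optimal solution. The resulting cluster centers are used as input to run FCM again, and data samples whose membership is below a threshold are examined: if deleting a sample changes the objective function value markedly, the sample is anomalous. Samples in the small clusters produced at the end of clustering are also treated as anomalous. Tests on the KDDCUP99 data set show that the algorithm achieves a high detection rate and a low false-detection rate.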

11.
In the fuzzy c-means (FCM) clustering algorithm, almost no data point has a membership value of 1. Moreover, noise and outliers can make it difficult to obtain appropriate clustering results from FCM. The embedding of FCM into switching regressions, called fuzzy c-regressions (FCRs), has the same drawbacks as FCM. In this paper, we propose alpha-cut implemented fuzzy clustering algorithms, referred to as FCMalpha, which allow data points to belong completely to one cluster. The proposed FCMalpha algorithms form a cluster core for each cluster; data points inside a cluster core have a membership value of 1, which resolves these drawbacks of FCM. The fuzziness index m also plays different roles in FCM and FCMalpha: we find that, when a larger m is used, the clustering results obtained by FCMalpha are more robust to noise and outliers than those of FCM. Moreover, the cluster cores generated by FCMalpha work for clusters of various shapes, so FCMalpha is well suited to embedding into switching regressions. This embedding is called FCRalpha. FCRalpha gives better results than FCR in environments with noise or outliers. Numerical examples show the robustness and superiority of the proposed methods.
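A minimal sketch of the cluster-core idea, under the assumption that a point enters a cluster's core when its largest membership reaches a threshold `alpha`. The abstract does not state the exact rule, so the function name `alpha_cut_row` and the thresholding shown here are illustrative only.

```python
def alpha_cut_row(memberships, alpha):
    """Assumed rule: if the largest membership is at least alpha, the point
    belongs completely (membership 1) to that cluster; otherwise the fuzzy
    memberships are left unchanged."""
    k = max(range(len(memberships)), key=lambda j: memberships[j])
    if memberships[k] >= alpha:
        return [1.0 if j == k else 0.0 for j in range(len(memberships))]
    return list(memberships)


print(alpha_cut_row([0.9, 0.1], alpha=0.8))  # [1.0, 0.0]
print(alpha_cut_row([0.6, 0.4], alpha=0.8))  # [0.6, 0.4]
```

Under this rule, points inside a core act like crisp members, while points between clusters keep graded memberships.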

12.
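A Collaborative Possibilistic Fuzzy Clustering Algorithm   Total citations: 1 (self-citations: 0, by others: 1)
Fuzzy C-means (FCM) clustering is sensitive to noisy data, and possibilistic C-means (PCM) clustering is very sensitive to the initial centers and tends to produce coincident clusters. Collaborative clustering exploits the collaborative relationships between different feature subsets and, when combined with other algorithms, can improve their clustering performance. Accordingly, PCM is combined with collaborative clustering to give a collaborative possibilistic C-means fuzzy clustering algorithm (C-FCM). Built on an improved PCM, the algorithm improves clustering on data sets. Tests on the Wine and Iris data sets show that the method outperforms PCM, which demonstrates its effectiveness.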

13.

The fuzzy c-means (FCM) algorithm computes the membership degree of each data point with respect to the cluster centers. This computation requires the distance matrix between the cluster centers and the data points. The main bottleneck of FCM is computing the membership matrix for all data points. This work presents a new clustering method, bdrFCM (boundary data reduction fuzzy c-means). The algorithm is based on the original FCM proposal, adapted to detect and remove the boundary regions of clusters. The implementation effort targets two aspects: processing large datasets in less time, and reducing the data volume while maintaining cluster quality. A large volume of real application data (> 10^6 records) was used, and the bdrFCM implementation was found to scale well to datasets with millions of data points.


14.
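The fuzzy C-means (FCM) clustering algorithm ignores both the differing importance of sample attributes and neighborhood information. To address this, an FCM algorithm based on entropy and neighborhood constraints is proposed. First, the entropy of each attribute is computed to assign attribute weights, and the distance measure is modified to use these weights. Next, neighborhood membership weights are computed from the distances between neighboring samples and the center sample, and a weighted neighborhood membership is obtained. This neighborhood membership constrains the objective function and corrects the membership iteration, which improves the performance of FCM. Theoretical analysis and experiments on synthetic data sets and several UCI data sets show that the improved algorithm outperforms conventional FCM, PCM, KFCM, KPCM, and DSFCM in clustering quality and robustness, demonstrating its effectiveness.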

15.
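A Kernel-Based Fast Possibilistic Clustering Algorithm   Total citations: 1 (self-citations: 1, by others: 0)
Most conventional fast clustering algorithms are based on fuzzy C-means (FCM), but FCM is sensitive to the initial cluster centers and to noisy data and easily converges to local minima, so its clustering accuracy is limited. Possibilistic C-means clustering handles the noise sensitivity of FCM better but tends to produce coincident clusters. Clustering algorithms that combine FCM with possibilistic C-means largely resolve the coincident-cluster problem. To further improve convergence speed and robustness, a kernel-based fast possibilistic clustering algorithm is proposed. The method introduces the idea of kernel clustering and uses the sample variance to optimize the parameter η in the objective function. Experiments on standard and synthetic data sets show that the algorithm improves clustering accuracy and converges faster.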

16.

Fuzzy c-means clustering is an important unsupervised classification method for remote-sensing images and is based on type-1 fuzzy set theory. Type-1 fuzzy sets use singleton values to express the membership grade; therefore, such sets cannot describe the uncertainty of the membership grade. Interval type-2 fuzzy c-means (IT2FCM) clustering and related methods are based on interval type-2 fuzzy sets. Real vectors are used to describe the clustering centres, and the average values of the upper and lower membership grades are used to determine the classification of each pixel. Thus, the width information of the interval clustering centres and interval membership grades is ignored. The main contribution of this article is an improved IT2FCM* algorithm that adopts interval number distance (IND) and ranking methods, which use the width information of interval clustering centres and interval membership grades and thereby distinguish this method from existing fuzzy clustering methods. Three different IND definitions are tested, and the distance definition proposed by Li shows the best performance. The second contribution of this work is that two fuzzy cluster validity indices, FS- and XB-, are improved using the IND. Three types of multi/hyperspectral remote-sensing data sets are used to test this algorithm, and the experimental results show that the IT2FCM* algorithm based on the IND proposed by Li performs better than the IT2FCM algorithm according to four cluster validity indices, the confusion matrix, and the kappa coefficient (κ). Additionally, the improved FS- index is more indicative than the original FS- index.

17.
Fuzzy C-means (FCM) clustering has been used successfully in many real-world applications. However, the FCM algorithm is sensitive to the initial prototypes, and it cannot handle non-traditional curved clusters. In this paper, a multi-center fuzzy C-means algorithm based on transitive closure and spectral clustering (MFCM-TCSC) is proposed. The algorithm does not require initial guesses of the cluster center locations or of the membership values. Multiple centers are adopted to represent the non-spherical shape of clusters, so the algorithm can handle non-traditional curved clusters. The algorithm has three phases. First, the dataset is partitioned into subclusters by the FCM algorithm with multiple centers. Then, the subclusters are merged by spectral clustering. Finally, the final results are obtained from these two clustering results. When merging subclusters, we adopt the lattice similarity method as the distance between two subclusters, which has an explicit form when the fuzzy membership values of the subclusters are used as features. Experimental results on two artificial datasets, UCI data, and real image segmentation show that the proposed method clearly outperforms the traditional FCM algorithm and spectral clustering in efficiency and robustness.

18.
石文峰  商琳 《计算机科学》2017,44(9):45-48, 66
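Fuzzy C-Means (FCM) is a widely used fuzzy clustering algorithm with good clustering performance, but its sensitivity to the initial number of clusters makes choosing a good value of C very important. Determining the number of clusters is therefore a crucial step in cluster analysis with FCM. Cluster validity is analyzed by extending the decision-theoretic rough set model, and the number of clusters for FCM is then determined, which avoids the effect of a poor initialization. The paper proposes a method for determining the number of fuzzy C-means clusters based on an extended rough set model and verifies the clustering with image segmentation experiments. The experiments obtain a good segmentation by comparing the cost of the classification results under different numbers of clusters, and compare the results with the ant colony fuzzy C-means hybrid algorithm (AFHA) proposed by Z. Yu et al. in 2015 and with the improved AFHA algorithm (IAFHA). The results show that the proposed method clusters well and segments images clearly: its Bezdek partition coefficient is higher than those of AFHA and IAFHA, and it also has a clear advantage on the Xie-Beni index.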
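For reference, the two validity measures named in this abstract can be sketched as follows, using their commonly cited definitions. This is not the rough-set cost criterion proposed in the paper; the function names and the toy data are assumptions for illustration. A higher partition coefficient and a lower Xie-Beni index indicate a better fuzzy partition.

```python
def partition_coefficient(u):
    """Bezdek partition coefficient: mean over samples of sum_k u[i][k] ** 2."""
    return sum(uik ** 2 for row in u for uik in row) / len(u)


def xie_beni(points, centers, u, m=2.0):
    """Xie-Beni index: fuzzy within-cluster scatter divided by
    (number of samples * smallest squared distance between two centers)."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    scatter = sum(
        u[i][k] ** m * sqdist(points[i], centers[k])
        for i in range(len(points))
        for k in range(len(centers))
    )
    separation = min(
        sqdist(centers[j], centers[k])
        for j in range(len(centers))
        for k in range(j + 1, len(centers))
    )
    return scatter / (len(points) * separation)


# Toy one-dimensional data with a crisp two-cluster partition.
points = [[0.0], [2.0], [10.0], [12.0]]
centers = [[1.0], [11.0]]
u = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
print(partition_coefficient(u))       # 1.0
print(xie_beni(points, centers, u))   # 0.01
```

Evaluating both measures for each candidate number of clusters is one common way to compare values of C, which is the kind of comparison the abstract reports against AFHA and IAFHA.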

19.
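Particle-swarm-based fuzzy clustering with membership encoding is sensitive to noise and performs poorly on data sets that have fewer samples than dimensions. By improving how the fuzzy clustering constraint is enforced, an improved particle-swarm-based fuzzy clustering method is proposed. When the memberships of a sample across the clusters do not sum to 1, the new method redistributes the memberships produced by particle swarm optimization according to the distances between the sample and each cluster, so that the memberships satisfy the fuzzy clustering constraint. The new method markedly improves particle-swarm fuzzy clustering under membership encoding, as verified on typical data sets.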

20.
Most variants of fuzzy c-means (FCM) clustering that incorporate prior knowledge modify either the objective function or the clustering process. This paper proposes a new weighted semi-supervised FCM algorithm (SSFCM-HPR) that transforms the prior knowledge in the labeled samples into constraints on the fuzzy membership degrees, assigns different weights according to the representativeness of the samples, and then uses the HPR multiplier to solve the clustering problem. The "representativeness" of a labeled sample is determined by its distance to the center of the cluster it belongs to. In this paper, we take the ratio of the largest to the second-largest fuzzy membership degree of a labeled sample as its weight. The algorithm retains the fuzzy partition of the labeled samples, which guarantees effective guidance of the clustering process, and can also detect whether a sample is an outlier. Moreover, when part of the supervised information in the labeled samples is wrong, the algorithm can reduce the influence of the incorrectly labeled samples on the final clustering results. Experimental evaluation on synthetic and real data sets demonstrates the efficiency and effectiveness of the approach.
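The weighting rule stated in this abstract, the ratio of the largest to the second-largest membership degree, can be sketched in a few lines. The function name and the handling of a zero second-largest membership are assumptions; the abstract does not say how that case is treated.

```python
def representativeness_weight(memberships):
    """Ratio of the largest to the second-largest membership of one sample."""
    top, second = sorted(memberships, reverse=True)[:2]
    if second == 0.0:
        # Assumed convention: a crisp membership gives an unbounded weight.
        return float("inf")
    return top / second


print(representativeness_weight([0.5, 0.25, 0.25]))  # 2.0
print(representativeness_weight([0.4, 0.4, 0.2]))    # 1.0
```

A sample that sits clearly inside one cluster gets a large weight, while a sample torn between two clusters gets a weight close to 1, which matches the abstract's notion of representativeness.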
