Similar literature
 19 similar articles found (search time: 906 ms)
1.
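Cluster-Level Testing of Object-Oriented Software Based on UML Sequence Diagrams   Total citations: 2 (self-citations: 0, citations by others: 2)
苏荟, 张毅坤, 姚海波, 费蓉. 《计算机工程》 (Computer Engineering), 2005, 31(24): 78-79, 101
A new cluster-level testing method for object-oriented software is proposed. Based on UML sequence diagrams, the method first extracts inter-class interaction information from *.MDL documents; it then uses program instrumentation to extract from the source code the inter-class interactions that occur at run time; finally, it compares the two to verify whether the inter-class interactions in the source code are correct. An example is used to verify the correctness and effectiveness of the method.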
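The final comparison step described in this abstract could be sketched roughly as follows. This is only an illustration under assumptions: the `(sender, receiver, message)` tuple format, the function name `compare_interactions`, and the sample data are all hypothetical, and the sketch does not parse *.MDL files or instrument any code.

```python
from typing import List, Set, Tuple

# One interaction: (sender class, receiver class, message name).
# This representation is an assumption made for the sketch.
Interaction = Tuple[str, str, str]


def compare_interactions(expected: List[Interaction],
                         observed: List[Interaction]) -> dict:
    """Compare designed interactions with those recorded at run time."""
    expected_set: Set[Interaction] = set(expected)
    observed_set: Set[Interaction] = set(observed)
    return {
        # In the design but never seen while running.
        "missing": sorted(expected_set - observed_set),
        # Seen while running but absent from the design.
        "unexpected": sorted(observed_set - expected_set),
        # True only when the two sequences are identical, order included.
        "same_order": expected == observed,
    }


# Hypothetical data: what a sequence diagram might specify ...
design = [("Order", "Inventory", "reserve"), ("Order", "Payment", "charge")]
# ... and what instrumented code might log.
trace = [("Order", "Payment", "charge"), ("Order", "Logger", "write")]

report = compare_interactions(design, trace)
print(report)
```

A set difference reports which interactions are missing or unexpected, while the list comparison additionally checks message order, which a sequence diagram also constrains.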

2.
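Ecological niche factor analysis (ENFA) is a multivariate analysis method based on the niche concept. However, the covariance it uses to compute correlation captures only linear relationships between variables, whereas most variables are related nonlinearly. Mutual information measures the strength of the dependence between two variables and is not limited to linear correlation. This paper proposes an ENFA method based on mutual information, which uses mutual information to compute the correlation between variables, and applies it to analyze habitat selection and habitat suitability of the bar-headed goose in the Qinghai Lake region. Compared with traditional ENFA, the proposed method changes the specialization vector and improves the accuracy of habitat suitability prediction.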
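The contrast between covariance and mutual information can be illustrated with a small sketch. It is not the paper's method: it assumes discrete values, uses a simple plug-in (count-based) estimate of mutual information, and the toy data `y = x²` is chosen only because its covariance with `x` is zero although `y` is fully determined by `x`.

```python
import math
from collections import Counter


def covariance(xs, ys):
    """Population covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) of two discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written with raw counts.
        mi += (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
    return mi


xs = [-2, -1, 0, 1, 2]
ys = [x * x for x in xs]  # y depends on x, but not linearly

cov_xy = covariance(xs, ys)
mi_xy = mutual_information(xs, ys)
print(cov_xy, mi_xy)
```

Here the covariance vanishes because the positive and negative halves of `x` cancel, while the mutual information is clearly positive. For continuous variables an estimator with binning or kernels would be needed instead of raw counts.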

3.
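K-means is an unsupervised learning algorithm that partitions data into clusters based on a distance measure between data objects. Measures such as Euclidean distance have problems: for example, when there are many outliers, the algorithm's accuracy is low. Mutual information can measure the degree to which any two data objects contain each other. Improving K-means with mutual information gives a better measure of the distance between data objects, ensuring high similarity within clusters and high dissimilarity between clusters, with the aim of addressing the low accuracy of K-means when outliers are numerous. Experimental results show that, compared with K-means and fuzzy K-means, the improved K-means algorithm reaches an accuracy of 97.8%, a clear improvement over K-means.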

4.
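An image retrieval method that uses mutual information to measure similarity is presented. The method first generates a set of statistically representative visual patterns and uses the distribution of these patterns as the descriptor of image content; based on this content description, a method for computing mutual information is designed to measure image similarity. Experimental results show that, in image retrieval, mutual information is a more effective similarity measure than alternatives such as KL divergence and the L2 norm.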

5.
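雷鸣, 马荣, 赵丽, 赵晓寒. 《计算机应用》 (Journal of Computer Applications), 2021, 41(z2): 234-240
To address the computation of consistent correspondences for clusters of 3D non-isometric models, a correspondence computation method based on kernel density estimation is proposed. First, the discrete time evolution process descriptor (DEP) is introduced to extract feature descriptors of the 3D model surface, yielding the different distribution characteristics of different regions. Second, kernel density estimation is used to establish the mapping between non-isometric models. Finally, an elastic-net penalty function is used to perform convex optimization of the mapping over the non-isometric model cluster, yielding more accurate point-to-point correspondences. Experimental results show that combining the time-varying descriptor with kernel density estimation reduces, to some extent, the geodesic error of consistent correspondence across the model cluster; compared with Aubry's algorithm, the average geodesic error drops to 0.054. Compared with methods that use functional maps or random forests, the kernel-density-estimation-based matching algorithm constructs more accurate consistent correspondences for non-isometric model clusters.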

6.
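Based on an analysis of the energy consumed by data transmission within and between clusters in clustered wireless sensor networks, a method that uses MIMO for inter-cluster data transmission is proposed. Built on the HEED protocol, the method introduces MIMO into inter-cluster data transmission, greatly reducing transmission energy consumption. An energy consumption model for clustered sensor networks based on MIMO is also created, and the optimal values of the system parameters that minimize network energy consumption are derived. Experimental results show that, compared with the HEED protocol, the method reduces overall network energy consumption more significantly and gives the network a longer lifetime.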

7.
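Based on an analysis of node communication patterns and the cluster formation process, a method for building energy-balanced clusters under non-uniform node distribution is proposed. The algorithm takes into account both the energy consumed by communication between cluster members and the cluster head and the energy load of each cluster, balancing energy consumption across clusters. Simulation shows that the method outperforms the shortest-distance-based clustering algorithm in network lifetime, average node lifetime, and network scalability.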

8.
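Biclustering models help cluster local patterns that exhibit correlation. This paper proposes a biclustering algorithm that can identify multiple types of correlated patterns. It uses quadratic mutual information as the correlation criterion and the Parzen window method to efficiently estimate mutual information between high-dimensional variables; it also introduces the concept of maximal correlated dimension clusters. Taking multiple maximal correlated dimension clusters as seeds and refining the clustering iteratively, the algorithm can effectively discover long correlated patterns in high-dimensional data. Experiments on real gene expression data demonstrate its effectiveness.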

9.
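Smart healthcare is developing rapidly. Factor analysis is a commonly used feature selection method in multidimensional data analysis, but it cannot handle nonlinear relationships. Mutual information evaluates the strength of dependence between features and handles nonlinear relationships well. A factor analysis method combined with mutual information is therefore proposed: mutual information is used to compute the correlation between features, the result is converted into an eigenvalue matrix that serves as the criterion for determining the common factors, and new features are selected by cumulative contribution rate to reduce dimensionality and improve model accuracy. With a neural network as the classifier, comparative experiments on real data give a classification accuracy of 96.51% and a loss of 0.1138. The simulation results show that classification accuracy improves on high-dimensional cancer datasets, verifying the effectiveness of the method.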

10.
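To address the very large cost of computing mutual information for large-scale categorical data, a parallel mutual information computation method for categorical data is proposed on the Spark in-memory computing platform. The algorithm first uses a column transform to convert the dataset into multiple data subsets; it then uses two variable-length arrays to cache intermediate results, addressing the heavy and highly repetitive computation of mutual information between pairs of categorical features; finally, the algorithm is validated on synthetic and real datasets on a Spark cluster with 24 compute nodes. Experimental results show that the algorithm achieves high performance in efficiency, scalability, and extensibility.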
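The idea of transforming rows into columns and caching intermediate counts can be sketched on a single machine as below. This is an assumption-laden illustration, not the paper's Spark algorithm: it uses no Spark API, the cache is a plain list of per-column counters rather than the two variable-length arrays the abstract mentions, and the function name and toy rows are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pairwise_mutual_information(rows):
    """Mutual information (nats) for every pair of categorical columns."""
    n = len(rows)
    # Column transform: one tuple of values per feature.
    columns = list(zip(*rows))
    # Cache each column's value counts so they are computed only once,
    # instead of once per feature pair.
    marginals = [Counter(col) for col in columns]
    result = {}
    for i, j in combinations(range(len(columns)), 2):
        joint = Counter(zip(columns[i], columns[j]))
        mi = sum(
            (c / n) * math.log(c * n / (marginals[i][a] * marginals[j][b]))
            for (a, b), c in joint.items()
        )
        result[(i, j)] = mi
    return result


# Hypothetical toy data: column 1 copies column 0, column 2 is unrelated.
rows = [
    ("a", "a", "x"),
    ("a", "a", "y"),
    ("b", "b", "x"),
    ("b", "b", "y"),
]
mi_table = pairwise_mutual_information(rows)
print(mi_table)
```

With d features there are d(d-1)/2 pairs, so reusing the cached marginal counts avoids recomputing them for every pair; in a distributed setting the per-pair joint counts would be the natural unit of parallel work.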

11.
The evaluation of the relationships between clusters is important to identify vital unknown information in many real-life applications, such as in the fields of crime detection, evolution trees, metallurgical industry and biology engraftment. This article proposes a method called ‘mode pattern + mutual information’ to rank the inter-relationship between clusters. The idea of the mode pattern is used to find outstanding objects from each cluster, and the mutual information criterion measures the close proximity of a pair of clusters. Our approach is different from the conventional algorithms of classifying and clustering, because our focus is not to classify objects into different clusters, but instead, we aim to rank the inter-relationship between clusters when the clusters are given. We conducted experiments on a wide range of real-life datasets, including image data and cancer diagnosis data. The experimental results show that our algorithm is effective and promising.

12.
In this paper we propose a new unsupervised dimensionality reduction algorithm that looks for a projection that optimally preserves the clustering data structure of the original space. Formally we attempt to find a projection that maximizes the mutual information between data points and clusters in the projected space. In order to compute the mutual information, we neither assume the data are given in terms of distributions nor impose any parametric model on the within-cluster distribution. Instead, we utilize a non-parametric estimation of the average cluster entropies and search for a linear projection and a clustering that maximizes the estimated mutual information between the projected data points and the clusters. The improved performance is demonstrated on both synthetic and real world examples.

13.
In data mining and knowledge discovery, pattern discovery extracts previously unknown regularities in the data and is a useful tool for categorical data analysis. However, the number of patterns discovered is often overwhelming. It is difficult and time-consuming to 1) interpret the discovered patterns and 2) use them to further analyze the data set. To overcome these problems, this paper proposes a new method that clusters patterns and their associated data simultaneously. When patterns are clustered, the data containing the patterns are also clustered; and the relation between patterns and data is made explicit. Such an explicit relation allows the user on the one hand to further analyze each pattern cluster via its associated data cluster, and on the other hand to interpret why a data cluster is formed via its corresponding pattern cluster. Since the effectiveness of clustering mainly depends on the distance measure, several distance measures between patterns and their associated data are proposed. Their relationships to the existing common ones are discussed. Once pattern clusters and their associated data clusters are obtained, each of them can be further analyzed individually. To evaluate the effectiveness of the proposed approach, experimental results on synthetic and real data are reported.

14.
Clustering is one of the important data mining tasks. Nested clusters or clusters of multi-density are very prevalent in data sets. In this paper, we develop a hierarchical clustering approach, a cluster tree, to determine such cluster structure and understand hidden information present in data sets of nested clusters or clusters of multi-density. We embed the agglomerative k-means algorithm in the generation of the cluster tree to detect such clusters. Experimental results on both synthetic data sets and real data sets are presented to illustrate the effectiveness of the proposed method. Compared with some existing clustering algorithms (DBSCAN, X-means, BIRCH, CURE, NBC, OPTICS, Neural Gas, Tree-SOM, EnDBSCAN and LDBSCAN), our proposed cluster tree approach performs better than these methods.

15.
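Unsupervised clustering algorithms in machine learning have been widely applied to various object recognition tasks. The density-peak-based fast search clustering algorithm (DPC) can quickly and effectively determine cluster centers and the number of clusters, but when handling data with complex distribution shapes and high-dimensional image data, cluster centers remain hard to determine and the number of clusters tends to be underestimated. To improve robustness on complex high-dimensional data, this paper proposes a density-peak fast search clustering algorithm based on learned feature representations (AE-MDPC). The algorithm uses an unsupervised autoencoder (AutoEncoder) to learn an optimal feature representation of the data and combines it with a manifold similarity that captures the global consistency of the data, which improves compactness within a class and separation between classes and drives the density of potential cluster centers to local maxima. AE-MDPC is compared with the classic K-means, DBSCAN, and DPC algorithms and with DPC combined with PCA on 4 synthetic datasets and 4 real image datasets. Experimental results show that, on the external evaluation metric clustering accuracy and the internal evaluation metrics adjusted mutual information and adjusted Rand index, AE-MDPC outperforms the compared algorithms and also provides better visualization. In short, AE-MDPC, based on feature representation learning combined with manifold distance, can effectively handle complex manifold data and high-dimensional image data.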
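For readers unfamiliar with the adjusted Rand index named above, a minimal pure-Python sketch of the standard pair-counting formula follows. It is not taken from the paper; the function name and the sample labelings are assumptions, it needs Python 3.8+ for `math.comb`, and it assumes at least two items.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same items."""
    n = len(labels_true)
    # Contingency table cells, row sums and column sums as counts.
    pairs = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_cells = sum(comb(c, 2) for c in pairs.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    # Agreement expected by chance, and the maximum attainable agreement.
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case (e.g. every item in one cluster on both sides).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


truth = [0, 0, 1, 1, 2, 2]
same_up_to_renaming = [1, 1, 0, 0, 2, 2]
unrelated = [0, 1, 0, 1, 0, 1]

ari_same = adjusted_rand_index(truth, same_up_to_renaming)
ari_unrelated = adjusted_rand_index(truth, unrelated)
print(ari_same, ari_unrelated)
```

The index depends only on which items are grouped together, so renaming cluster labels leaves it unchanged, and the chance correction lets it fall to zero or below for partitions that agree no better than random ones.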

16.
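Fuzzy Outlier Detection Based on Mathematical Morphology   Total citations: 1 (self-citations: 0, citations by others: 1)
Outlier detection, an important data mining task, can lead to unexpected knowledge discovery. However, traditional outlier detection techniques ignore the natural structure of the data, namely the relationship between outliers and clusters. Combining outlier scores with clustering methods helps in studying that relationship. This paper proposes fuzzy outlier detection and analysis based on mathematical morphology, integrating mathematical morphology techniques and connectivity-based outlier detection into a fuzzy model, and analyzes the fuzzy relationship between objects and the set of clusters in terms of both outlier membership degree and fuzzy membership degree. Extensive experiments show that, on datasets with complex planar shapes and varying density, the algorithm finds outliers correctly and efficiently, discovers the cluster information associated with them, and explores the depth of association between outliers and cluster cores, which offers insight into the meaning of the outliers themselves.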

17.
Clustering of symbolic data is a necessary process in many scientific disciplines, and fuzzy c-means clustering for interval data (IFCM) is one of the most popular algorithms. This paper presents an adaptive fuzzy c-means clustering algorithm for interval-valued data based on an interval-dividing technique. The method gives a fuzzy partition and a prototype for each fuzzy cluster by optimizing an objective function. The adaptive distance between a pattern and its cluster center varies with each iteration of the algorithm and may be either different from one cluster to another or the same for all clusters. The novel part of this approach is that it takes into account every point in both intervals when computing the distance between the cluster and its representative. Experiments are conducted on synthetic data sets and a real data set. To compare the overall performance of the proposed method with four other existing methods, the corrected Rand index, the value of the objective function, and the number of iterations are used as evaluation criteria. Clustering results demonstrate that the algorithm proposed in this paper has remarkable advantages.

18.
The popularity of GPS-equipped gadgets and mapping mashup applications has motivated the growth of geotagged Web resources as well as georeferenced multimedia applications. More and more research attention has been devoted to mining collaborative knowledge from mass user-contributed geotagged content. However, little attention has been paid to generating high-quality geographical clusters, which is an important preliminary data-cleaning process for most geographical mining work. Previous works mainly use geotags to derive geographical clusters. Using a single channel of information is not sufficient for generating distinguishable clusters, especially when the location ambiguity problem occurs. In this paper, we propose a two-level clustering framework that uses both the spatial and the semantic features of photographs for clustering. In the first-level geoclustering phase, we cluster geotagged photographs according to their spatial ties to roughly partition the dataset in an efficient way. Then we leverage the textual semantics in the photographs' annotations to further refine the grouping results in the second-level semantic clustering phase. To effectively measure the semantic correlation between photographs, a semantic enhancement method and a new term weighting function are proposed. We also propose a method for automatic parameter determination for the second-level spectral clustering process. Evaluation of our implementation on a real georeferenced photograph dataset shows that our algorithm performs well, producing distinguishable geographical clusters with high accuracy and mutual information.

19.
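Existing hierarchical clustering algorithms have difficulty handling incomplete datasets. To address this, and to account for the uncertain relationship between samples and clusters, a set-pair granular hierarchical clustering algorithm for incomplete data, SPGCURE, is proposed. First, set-pair information granules are used to handle missing values: unlike earlier algorithms that delete or impute missing attributes, the discrepancy degree of the set-pair connection degree is used to represent missing attribute values, and an improved set-pair information distance measure is proposed to gauge the closeness between incomplete data samples. Second, based on the improved set-pair distance measure, the average intra-cluster distance of each cluster is defined, forming a set-pair granular hierarchical clustering expressed by the positive identity region Cs (the sample definitely belongs to the cluster), the boundary region Cu (the sample may belong to the cluster), and the negative contrary region Co (the sample does not belong to the cluster). SPGCURE applies to both complete and incomplete data. Finally, on 5 classic UCI datasets, it is evaluated experimentally against commonly used classic and improved clustering algorithms; the results show that SPGCURE achieves good clustering performance on accuracy, F-measure, adjusted Rand index, and normalized mutual information.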
