Metric and trigonometric pruning for clustering of uncertain data in 2D geometric space
Authors: Wang Kay Ngai, Ben Kao, Reynold Cheng, Michael Chau, Sau Dan Lee, David W Cheung, Kevin Y Yip
Affiliation: 1. Department of Computer Science, The University of Hong Kong, Hong Kong; 2. School of Business, The University of Hong Kong, Hong Kong; 3. Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong; 4. Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, United States
Abstract: We study the problem of clustering data objects with location uncertainty. In our model, a data object is represented by an uncertainty region over which a probability density function (pdf) is defined. One method to cluster such uncertain objects is to apply the UK-means algorithm [1], an extension of the traditional K-means algorithm, which assigns each object to the cluster whose representative has the smallest expected distance from it. For an arbitrary pdf, calculating the expected distance between an object and a cluster representative requires expensive integration of the pdf. We study two pruning methods, pre-computation (PC) and cluster shift (CS), that can significantly reduce the number of integrations computed. Both pruning methods rely on good bounding techniques. We propose and evaluate two such techniques, one based on metric properties (Met) and one based on trigonometry (Tri). Our experimental results show that Tri offers very high pruning power: in some cases, more than 99.9% of the expected distance calculations are pruned. This results in a very efficient clustering algorithm.
Keywords: Clustering; Data uncertainty
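
The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the paper's Met or Tri bound: it assumes each uncertain object is approximated by equally weighted sample points (standing in for integration over the pdf) and uses a generic triangle-inequality bound around a hypothetical anchor point whose expected distance has been computed in advance. All function and parameter names are illustrative.

```python
import math


def dist(p, q):
    """Euclidean distance between two 2D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def expected_distance(samples, c):
    """Approximate the expected distance from an uncertain object to c.

    Assumption: the pdf is represented by equally weighted sample points,
    so the integral reduces to an average of point distances.
    """
    return sum(dist(x, c) for x in samples) / len(samples)


def assign_with_pruning(samples, anchor, ed_anchor, reps):
    """Pick the representative with the smallest expected distance.

    anchor    -- a fixed point (hypothetical choice, e.g. the sample mean)
    ed_anchor -- expected_distance(samples, anchor), computed beforehand
    reps      -- list of cluster representatives

    By the triangle inequality, for every representative c:
        |ed_anchor - d(anchor, c)| <= ED(o, c) <= ed_anchor + d(anchor, c)
    A representative whose lower bound exceeds the smallest upper bound
    cannot be the closest one, so its expected distance is never computed.

    Returns (index of the chosen representative,
             number of exact expected-distance computations performed).
    """
    bounds = []
    for c in reps:
        d = dist(anchor, c)
        bounds.append((abs(ed_anchor - d), ed_anchor + d))

    min_upper = min(ub for _, ub in bounds)
    candidates = [j for j, (lb, _) in enumerate(bounds) if lb <= min_upper]

    if len(candidates) == 1:
        # Every other representative was pruned; no exact computation needed.
        return candidates[0], 0

    best = min(candidates, key=lambda j: expected_distance(samples, reps[j]))
    return best, len(candidates)
```

In this sketch the one-off cost of `ed_anchor` is shared across all representatives, so when clusters are well separated most candidates are eliminated by two cheap point-distance checks. Tighter bounds leave fewer candidates and therefore fewer exact computations.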
This article is indexed in ScienceDirect and other databases.