A local-density based spatial clustering algorithm with noise |
| |
Authors: | Lian Duan, Lida Xu, Feng Guo, Jun Lee, Baopin Yan |
| |
Affiliation: | 1. Computer Network Information Center, Chinese Academy of Sciences, Beijing, China; 2. The Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; 3. Zhejiang University, Hangzhou, China; 4. Old Dominion University, VA, USA |
| |
Abstract: | Density-based clustering algorithms are attractive for the task of class identification in spatial databases. In many cases, however, clusters of very different local density exist in different regions of the data space, so the DBSCAN method [M. Ester, H.-P. Kriegel, J. Sander, X. Xu, A density-based algorithm for discovering clusters in large spatial databases with noise, in: E. Simoudis, J. Han, U.M. Fayyad (Eds.), Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, Portland, OR, AAAI, Menlo Park, CA, 1996, pp. 226–231], which uses a global density parameter, is not suitable. Although OPTICS [M. Ankerst, M.M. Breunig, H.-P. Kriegel, J. Sander, OPTICS: ordering points to identify the clustering structure, in: A. Delis, C. Faloutsos, S. Ghandeharizadeh (Eds.), Proceedings of ACM SIGMOD International Conference on Management of Data, Philadelphia, PA, ACM, New York, 1999, pp. 49–60] provides an augmented ordering of the database to represent its density-based clustering structure, it only generates clusters whose local density exceeds certain thresholds, not clusters of similar local density; in addition, it does not produce the clusters of a data set explicitly. Furthermore, the parameters required by almost all the major clustering algorithms are hard to determine, although they significantly affect the clustering result. In this paper, a new clustering algorithm, LDBSCAN, which relies on a local-density-based notion of clusters, is proposed. With this technique, selecting appropriate parameters is not difficult; in contrast to other density-based clustering algorithms, it also takes advantage of the LOF [M.M. Breunig, H.-P. Kriegel, R.T. Ng, J. Sander, LOF: identifying density-based local outliers, in: W. Chen, J.F. Naughton, P.A. Bernstein (Eds.), Proceedings of ACM SIGMOD International Conference on Management of Data, Dallas, TX, ACM, New York, 2000, pp. 93–104] to detect noise. The proposed algorithm has potential applications in business intelligence. |
| |
Keywords: | Data mining; Local outlier factor; Local reachability density; Local-density-based clustering |
This document has been indexed by ScienceDirect and other databases. |
|
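The abstract and keywords name the local reachability density and the local outlier factor (LOF) as the building blocks for detecting noise. As a rough illustration only, the sketch below computes both quantities following the standard LOF definitions of Breunig et al.; it is not the LDBSCAN algorithm itself, whose details the abstract does not give. The function names `k_neighbors` and `lof_scores`, the parameter `k`, and the use of Euclidean distance are assumptions made for this example, and the code assumes no duplicate points.

```python
import math


def k_neighbors(points, i, k):
    """Return (k-distance of point i, indices of all points within that distance)."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    k_dist = dists[k - 1][0]  # distance to the k-th nearest neighbor
    # Ties at the k-distance are included, so the list may hold more than k points.
    return k_dist, [j for d, j in dists if d <= k_dist]


def lof_scores(points, k):
    """Return (lrd, lof) lists for all points (hypothetical helper, not from the paper)."""
    n = len(points)
    info = [k_neighbors(points, i, k) for i in range(n)]

    def reach_dist(p, o):
        # Reachability distance of p with respect to o: max(k-distance(o), d(p, o)).
        return max(info[o][0], math.dist(points[p], points[o]))

    # Local reachability density: inverse of the mean reachability distance
    # from p to its neighbors (assumes distinct points, so the sum is non-zero).
    lrd = []
    for p in range(n):
        neigh = info[p][1]
        lrd.append(len(neigh) / sum(reach_dist(p, o) for o in neigh))

    # LOF: mean ratio of the neighbors' density to the point's own density.
    lof = []
    for p in range(n):
        neigh = info[p][1]
        lof.append(sum(lrd[o] for o in neigh) / (len(neigh) * lrd[p]))
    return lrd, lof


if __name__ == "__main__":
    # A small dense 3x3 grid plus one far-away point.
    grid = [(float(x), float(y)) for x in range(3) for y in range(3)]
    lrd, lof = lof_scores(grid + [(10.0, 10.0)], k=3)
    for score in lof:
        print(round(score, 2))
```

Points inside a region of roughly uniform density get an LOF close to 1, while an isolated point gets a value well above 1. That contrast is what makes LOF usable as a noise indicator regardless of how dense each individual cluster is.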