
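Construction and practice of a geographic spatio-temporal tri-clustering analysis method
Cite this article: CHENG Changxiu, SONG Changqing, WU Xiaojing, SHEN Shi, GAO Peichao, YE Sijing. Construction and practice of a geographic spatio-temporal tri-clustering analysis method [J]. Acta Geographica Sinica, 2020, 75(5): 904-916.
Authors: CHENG Changxiu  SONG Changqing  WU Xiaojing  SHEN Shi  GAO Peichao  YE Sijing
Affiliations: 1. State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing 100875; 2. Faculty of Geographical Science, Beijing Normal University, Beijing 100875; 3. National Tibetan Plateau Data Center, Beijing 100101
Funding: National Key Research and Development Program of China (2019YFA0606901); Strategic Priority Research Program of the Chinese Academy of Sciences (XDA23100303)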
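Abstract: As the capacity to acquire geographic data keeps improving, the volume of geographic data is growing exponentially, and data types and properties are becoming more diverse. Effectively identifying and classifying data has become key to understanding the spatio-temporal characteristics, evolutionary processes, and behavioral mechanisms of geographic phenomena. Traditional clustering methods are challenged by data that are large in volume, high in dimensionality, and poor in quality; together with the need to analyze spatial and temporal associations jointly, this makes improving clustering methods increasingly urgent. This paper describes how the design of clustering methods has changed from one-way clustering to tri-clustering. One-way clustering groups data along either the sample direction or the attribute direction only. It tends to overlook local features that are highly similar, and its conclusions can depend on the direction from which the data are viewed ("a ridge seen from the front, a peak seen from the side"). Co-clustering (two-way clustering) uses the similarity of element values within a data matrix to partition the matrix into submatrices so that similarity is as high as possible within each submatrix and as low as possible between submatrices. It thereby clusters rows and columns simultaneously and avoids the shortcomings of one-way clustering. Because co-clustering cannot meet the need of geographic research to interpret data along more than two directions, this paper proposes and develops a new tri-clustering method, gives a workflow for using it to detect geographic spatio-temporal patterns and processes, and summarizes how to construct a 3D data body according to the "space-time-scale-attribute" aspects involved in a study. Finally, it presents geographic case studies of tri-clustering. The results show that: (1) tri-clustering is an effective method for detecting the spatio-temporal differentiation of geographic data in the era of big data, and it can cope with high dimensionality and low data quality; (2) across different geographic problems, tri-clustering is generic at the algorithmic level; what differs is the data body, which is constructed according to the space, time, scale, and attributes that each problem involves, and clustering different data bodies yields different results that answer different geographic questions; (3) tri-clustering supports joint interpretation of the spatio-temporal differentiation of geographic data along multiple directions, at multiple scales, and at multiple levels, and it reveals how the effects of spatial and temporal scales on geographic features are superimposed. Finally, the paper stresses the importance of organizing data according to the geographic problem and looks forward to extending tri-clustering in practice to geographic studies with multiple spatial scales and multiple attributes.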

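Keywords: tri-clustering  space-time-scale-attribute  joint interpretation  spatio-temporal local similarity  spatio-temporal differentiation
Received: 2020-02-06
Revised: 2020-04-22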

Tri-clustering: Construction and practice of space-time integrated analysis tool
CHENG Changxiu,SONG Changqing,WU Xiaojing,SHEN Shi,GAO Peichao,YE Sijing.Tri-clustering: Construction and practice of space-time integrated analysis tool[J].Acta Geographica Sinica,2020,75(5):904-916.
Authors:CHENG Changxiu  SONG Changqing  WU Xiaojing  SHEN Shi  GAO Peichao  YE Sijing
Affiliation: 1. State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing 100875, China; 2. Faculty of Geographical Science, Beijing Normal University, Beijing 100875, China; 3. National Tibetan Plateau Data Center, Beijing 100101, China
Abstract: With the improvement of geographic data acquisition capabilities, the volume of geographic data has been growing exponentially, and data types and characteristics have become more diverse. Effective identification and classification of data has become key to understanding the spatio-temporal patterns, evolutionary processes, and driving mechanisms of geographic phenomena. However, traditional clustering methods face challenges from data that are large in volume, high in dimensionality, and poor in quality, so improved clustering methods are needed. This paper first describes the transition from one-way clustering to tri-clustering. One-way clustering methods cluster along either the samples or the attributes. They played an important role in previous studies but tend to overlook local features that are very similar. Co-clustering methods partition the data matrix into submatrices based on the similarity of element values, placing similar elements in the same submatrix and dissimilar elements in different submatrices. By clustering rows and columns simultaneously, they avoid the shortcomings of one-way clustering. However, they cannot satisfy the need of geographical research for interpretation along more than two directions, because they do not support 3D data. We therefore develop a new tri-clustering method, present a workflow for applying tri-clustering to the study of spatio-temporal patterns, and summarize how to construct the 3D data matrix for clustering according to the aspects of "space-time-scale-attribute" involved in the analysis. Finally, we show some applications of tri-clustering. The results show that: (1) Tri-clustering is an effective method for identifying the spatio-temporal differentiation of geographic data in the era of big data, and it can handle data of high dimensionality and low quality. (2) Tri-clustering is generic at the algorithmic level across different geographic topics; what differs is the 3D data matrix, which is constructed according to the aspects of "space-time-scale-attribute" involved in the analysis. Clustering different data matrices yields different results, which answer different questions. (3) Tri-clustering can interpret the spatio-temporal differentiation of geographic data in multiple directions, at multiple scales, and at multiple hierarchical levels, and thereby reveal the superposition effects of spatio-temporal scales on geographic features. Finally, we emphasize the importance of constructing 3D data matrices based on the geographic topic at hand and expect that tri-clustering methods can enhance the analysis of geographic data with multiple spatial scales and attributes in the future.
Keywords:tri-clustering  space-time-scale-attribute  integrated interpretation  spatio-temporal local similarity  spatio-temporal differentiation  
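The abstract describes partitioning a 3D data body so that values are similar within each block and dissimilar between blocks. The sketch below only illustrates that idea and is not the authors' algorithm: it assumes a hypothetical cube indexed as (location, time step, attribute), takes the three label vectors as given rather than learning them, and uses the sum of squared deviations from each block mean as an assumed measure of within-block homogeneity. The function name `tricluster_loss` and the toy data are invented for illustration.

```python
import numpy as np

def tricluster_loss(cube, space_labels, time_labels, attr_labels):
    """Within-block sum of squared deviations for a given 3D partition.

    cube has shape (n_locations, n_times, n_attributes). Each label vector
    assigns every index along one axis to a cluster. Lower values mean more
    homogeneous blocks. The squared-error measure is an assumption made for
    this sketch.
    """
    cube = np.asarray(cube, dtype=float)
    s = np.asarray(space_labels)
    t = np.asarray(time_labels)
    a = np.asarray(attr_labels)
    loss = 0.0
    for i in np.unique(s):
        for j in np.unique(t):
            for k in np.unique(a):
                # Select the sub-cube formed by one cluster per axis.
                block = cube[np.ix_(s == i, t == j, a == k)]
                loss += float(((block - block.mean()) ** 2).sum())
    return loss

# Toy data body: 4 locations x 4 time steps x 2 attributes whose values
# depend only on the (hypothetical) true cluster of each index.
true_s = np.array([0, 0, 1, 1])
true_t = np.array([0, 0, 1, 1])
true_a = np.array([0, 1])
cube = (10 * true_s[:, None, None]
        + 5 * true_t[None, :, None]
        + true_a[None, None, :])

good = tricluster_loss(cube, true_s, true_t, true_a)
bad = tricluster_loss(cube, np.array([0, 1, 0, 1]), true_t, true_a)
print(good, bad)
```

A partition that matches the structure of the toy cube gives a lower loss than one that mixes locations from different groups. Choosing different axes for the cube (for example, replacing the attribute axis with a scale axis) changes which question the resulting blocks answer, which is the point the abstract makes about data organization.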
This article is indexed in CNKI and other databases.