Similar literature
 19 similar documents found (search time: 343 ms)
1.
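To address the difficulty of choosing parameters when constructing neighborhoods and the high time complexity on high-dimensional data in traditional spatial outlier detection algorithms, a geostatistics-based spatial outlier detection algorithm is proposed. The algorithm introduces spatial autocorrelation theory into spatial outlier detection: it first identifies global outliers with the 3σ rule, then builds spatial neighborhoods with a Delaunay triangulation and replaces each global outlier with the mean of its neighboring nodes, and finally uses the local Moran's I as the measure of spatial anomaly. Simulation results show that the method requires no parameter selection, is robust, and achieves a high detection rate with a low false-alarm rate.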
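The three steps named in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the neighbor lists are assumed to come from a Delaunay triangulation computed elsewhere, every point is assumed to have at least one neighbor, and the function names are hypothetical. The local Moran's I uses the common row-standardized form, in which each neighbor carries weight 1/|N(i)|.

```python
from statistics import mean, pstdev


def three_sigma_outliers(values):
    """Return indices whose value is more than 3 standard deviations from the mean."""
    mu, sigma = mean(values), pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) > 3 * sigma]


def replace_with_neighbor_mean(values, neighbors, flagged):
    """Copy `values`, replacing each flagged entry by the mean of its neighbors."""
    cleaned = list(values)
    for i in flagged:
        cleaned[i] = mean(values[j] for j in neighbors[i])
    return cleaned


def local_morans_i(values, neighbors):
    """Local Moran's I with row-standardized weights.

    neighbors[i] lists the indices adjacent to point i (assumed here to be
    the Delaunay neighbors; any adjacency structure works).
    """
    mu = mean(values)
    z = [v - mu for v in values]
    m2 = sum(d * d for d in z) / len(z)  # second moment of the deviations
    return [z[i] * mean(z[j] for j in neighbors[i]) / m2 for i in range(len(z))]
```

A strongly negative score means a point's deviation from the global mean has the opposite sign to that of its neighbors, which is the pattern a spatial outlier produces.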

2.
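Research on local outlier mining algorithms   Cited by: 14 (self-citations: 0; by others: 14)
Outliers can be divided into global outliers and local outliers. In many cases, mining local outliers is more meaningful than mining global ones. Existing outlier mining algorithms based on local outlier degree have limitations: their detection accuracy depends on user-supplied parameters and their computational complexity is high. This paper proposes dividing object attributes into intrinsic attributes and environmental attributes, using the environmental attributes to determine an object's neighborhood and the intrinsic attributes to compute its outlier degree, which overcomes these limitations. Taking spatial data as an example, it separates spatial from non-spatial attributes, determines the spatial neighborhood from the spatial attributes, computes the spatial outlier degree from the non-spatial attributes, and designs a spatial outlier mining algorithm accordingly. Experimental results show that the proposed algorithm depends little on the user, detects accurately, scales well, and runs efficiently.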
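The separation of roles described here (spatial attributes choose the neighborhood, non-spatial attributes produce the score) can be illustrated with a small sketch. The abstract does not give the paper's actual neighborhood rule or outlier-degree formula, so both are assumptions: the neighborhood is the k nearest points by Euclidean distance, and the score is the absolute difference between a point's value and the mean value of its neighbors.

```python
from math import dist


def spatial_knn(coords, k):
    """For each point, return the indices of its k nearest points by Euclidean distance."""
    neighbors = []
    for i, p in enumerate(coords):
        others = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: dist(p, coords[j]),
        )
        neighbors.append(others[:k])
    return neighbors


def outlier_degree(values, neighbors):
    """Score each point by |own value - mean value of its spatial neighbors|."""
    return [
        abs(values[i] - sum(values[j] for j in nb) / len(nb))
        for i, nb in enumerate(neighbors)
    ]
```

Note that the coordinates never enter the score and the attribute values never influence the neighborhood, which is the point of the split.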

3.
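Distance-based and density-based outlier detection algorithms are challenged by scalability in both dimensionality and data volume, while the autocorrelation and heterogeneity of spatial data mean that information-theoretic outlier detection algorithms, which assume mutually independent, categorical attributes, are also ill-suited to spatial outlier detection. A holo-entropy-based spatial outlier detection algorithm for mixed attributes is therefore proposed. The algorithm partitions the data into regions using a region-label attribute, determines spatial neighborhoods within each region from spatial relationships, and retrieves them with an R*-tree. On this basis, a holo-entropy-based measure of spatial outlier degree and a spatial outlier mining algorithm are proposed, which effectively solve the problems of measuring the outlier degree of mixed attributes and of mining the outliers. Because region partitioning lends itself to parallel computation, the algorithm can handle large data volumes. Theory and experiments show that the proposed algorithm has advantages in both computational efficiency and the interpretability of its results.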

4.
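Outlier search algorithms fall into two main classes. The first targets statistical data and treats all data as a multidimensional space without distinguishing spatial from non-spatial dimensions; such algorithms may make wrong judgments or find meaningless outliers. The second targets spatial data and does distinguish spatial from non-spatial dimensions, but these algorithms are either too inefficient or unable to find neighborhood outliers. Introducing the concept of entropy weight, this paper proposes a new entropy-weight-based algorithm for measuring spatial neighborhood outliers. The algorithm targets spatial data, distinguishes spatial from non-spatial dimensions, partitions spatial neighborhoods with a spatial index, and computes a spatial deviation factor from the non-spatial attributes, which is then used to measure outliers in each spatial neighborhood. Theoretical analysis shows that the algorithm is sound. Experimental results show that it depends little on the user and offers high detection accuracy and computational efficiency.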
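For readers unfamiliar with entropy weighting, the sketch below shows the standard entropy weight method applied to non-spatial attributes, followed by one possible weighted deviation. The weight computation is the textbook formula; the `weighted_deviation` function is a hypothetical stand-in, since the abstract does not define the paper's spatial deviation factor. Attribute values are assumed to be non-negative with a positive column sum, and at least one attribute is assumed to be non-uniform.

```python
from math import log


def entropy_weights(rows):
    """Entropy weight method: attributes whose values vary more get larger weights.

    rows is a list of records, each a list of non-negative attribute values.
    """
    n, m = len(rows), len(rows[0])
    divergences = []
    for j in range(m):
        column = [r[j] for r in rows]
        total = sum(column)
        probs = [v / total for v in column]
        # Normalized Shannon entropy in [0, 1]; 1 means a perfectly uniform column.
        entropy = -sum(p * log(p) for p in probs if p > 0) / log(n)
        divergences.append(1 - entropy)
    scale = sum(divergences)
    return [d / scale for d in divergences]


def weighted_deviation(record, neighbor_records, weights):
    """Hypothetical deviation factor: weighted sum of |value - neighborhood mean|."""
    count = len(neighbor_records)
    means = [sum(r[j] for r in neighbor_records) / count for j in range(len(weights))]
    return sum(w * abs(record[j] - means[j]) for j, w in enumerate(weights))
```

An attribute that is identical across all records carries no information, has entropy 1, and therefore receives a weight of (approximately) zero.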

5.
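To detect outlier regions in spatial datasets, a spatial outlier region detection algorithm based on an edge-cutting strategy is proposed. It first uses a Delaunay triangulation to determine spatial adjacency and describes the difference between adjacent nodes by their non-spatial attributes; it then repeatedly cuts the edge with the largest weight while concurrently detecting outlier regions, until enough outliers have been found. Experimental results show that the algorithm detects outlier regions effectively and reports local outlierness accurately, overcoming two limitations of ordinary algorithms: susceptibility to interference from bad neighbors and regions that lack atomicity.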
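A minimal sketch of the edge-cutting idea follows. It is an assumption-laden simplification, not the published algorithm: the edge list is taken as given (in the paper it would come from the Delaunay triangulation), the edge weight is the absolute difference of a single non-spatial value, detection is done sequentially rather than concurrently, and a connected component counts as an outlier region when it has at most `max_region_size` nodes, a parameter introduced here purely for illustration.

```python
def components(n, edges):
    """Connected components of an undirected graph on nodes 0..n-1."""
    adjacent = {i: set() for i in range(n)}
    for a, b in edges:
        adjacent[a].add(b)
        adjacent[b].add(a)
    seen, result = set(), []
    for start in range(n):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            u = stack.pop()
            comp.append(u)
            for v in adjacent[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        result.append(sorted(comp))
    return result


def cut_edges_until(values, edges, wanted, max_region_size):
    """Cut the heaviest edge repeatedly until small components hold `wanted` points."""
    # Sort ascending by weight so that pop() always removes the heaviest edge.
    remaining = sorted(edges, key=lambda e: abs(values[e[0]] - values[e[1]]))
    while True:
        regions = [
            c for c in components(len(values), remaining)
            if len(c) <= max_region_size
        ]
        if sum(len(c) for c in regions) >= wanted or not remaining:
            return regions
        remaining.pop()
```

Because whole components are reported, two adjacent anomalous points are returned together as one region rather than masking each other.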

6.
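Outlier mining based on spatial constraints   Cited by: 1 (self-citations: 0; by others: 1)
Because existing spatial outlier detection algorithms do not adequately handle the autocorrelation and heterogeneity constraints of spatial data, this paper proposes computing a neighborhood distance to address the spatial autocorrelation constraint and computing a spatial local outlier coefficient to address the spatial heterogeneity constraint. The outlier coefficient expresses how outlying an object is; the coefficients are sorted in descending order and the m objects with the largest coefficients are taken as outliers. An outlier mining algorithm based on spatial constraints is proposed on this basis. Experimental results show that the proposed algorithm achieves higher detection accuracy, lower user dependence, and higher efficiency than existing algorithms.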
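The ranking step (sort the coefficients in descending order and keep the top m) is simple enough to sketch. The two quantities feeding it are not defined in the abstract, so the definitions below are assumptions made for illustration: the neighborhood distance is the mean absolute difference between a point and its neighbors, and the local coefficient divides that distance by the average neighborhood distance of the neighbors, in the spirit of density-ratio measures. The neighbor lists are assumed to be given.

```python
from heapq import nlargest


def neighborhood_distance(values, neighbors):
    """Mean absolute difference between each point and its neighbors (assumed form)."""
    return [
        sum(abs(values[i] - values[j]) for j in nb) / len(nb)
        for i, nb in enumerate(neighbors)
    ]


def local_outlier_coefficient(values, neighbors, eps=1e-12):
    """Own neighborhood distance relative to that of the neighbors (assumed form)."""
    d = neighborhood_distance(values, neighbors)
    return [
        d[i] / (sum(d[j] for j in nb) / len(nb) + eps)
        for i, nb in enumerate(neighbors)
    ]


def top_m_outliers(coefficients, m):
    """Indices of the m largest coefficients, in descending order of coefficient."""
    return nlargest(m, range(len(coefficients)), key=coefficients.__getitem__)
```

Normalizing by the neighbors' own distances is one way to keep a point in a naturally noisy area from outranking a point that stands out in a quiet one.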

7.
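Zhang Yang, Wang Chen. Journal of Computer Applications (《计算机应用》), 2013, 33(10): 2981-2983
The paper first introduces the research content and basic methods of current spatial data visualization techniques, and analyzes and summarizes the two commonly used classes of methods, entity-based and region-based. On this basis it proposes a clustering-based spatial data visualization method. The basic idea is to perform cluster analysis with spatial clustering algorithms, represented by the adaptive spatial clustering algorithm based on Delaunay triangulation (ASCDT), obtain parameters that describe the results, and combine the basic methods with the characteristics of those parameters to design visualization objects dedicated to expressing clustering results, thereby projecting the spatial data onto a map. Finally, it outlines aspects of this class of methods that need further study and improvement.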

8.
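A model of spatial outliers and a jump-sampling search algorithm   Cited by: 3 (self-citations: 0; by others: 3)
At present, the search for both general outliers and spatial outliers emphasizes deviation in non-spatial attributes, but in many application areas such as image processing and location-based services, spatial and non-spatial attributes must be considered together. To this end, the paper first proposes a definition of spatial outliers that takes both into account, and then proposes a new density-based method for finding spatial outliers: the density-based jump-sampling spatial outlier search algorithm DBSODLS. Existing density-based outlier search methods require a neighborhood query for every point and are therefore inefficient, whereas this algorithm makes full use of known neighbor information and need not compute the neighborhood of every point, so it finds spatial outliers quickly. Analysis and experimental results show that its time performance is clearly better than that of existing density-based algorithms.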

9.
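In spatial datasets, the non-spatial attribute values of outliers differ considerably from those of normal data. Based on this, a spatial outlier mining algorithm based on the K-nearest-neighbor (KNN) graph is proposed. The algorithm builds a KNN graph from the K-nearest-neighbor relationships of all objects, uses the difference between the non-spatial attribute values of adjacent objects as the weight of the edge between the two object points, and applies an edge-cutting strategy to remove the edges with higher weights, thereby identifying spatial outliers and outlier regions. Experimental results show that the algorithm's time performance is better than that of the POD algorithm.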

10.
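To improve the efficiency and accuracy of outlier mining, an outlier detection algorithm is proposed on the basis of an analysis of the strengths and weaknesses of traditional outlier mining algorithms. The algorithm uses a Voronoi diagram to determine the proximity relationships between sample points, takes as the outlier factor the information entropy of the non-spatial attribute values of the other sample points within the neighborhood, and marks the outliers in the sample set according to that factor. Taking soil nutrients in Daxing District, Beijing as an example, experimental results show that the algorithm detects outliers among the soil sample points efficiently and accurately.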

11.
Spatial outlier detection is a research hot spot in the field of spatial data mining. Because of the lack of specific research on spatial point events, this study presents an adaptive approach for spatial point events outlier detection (SPEOD) using multilevel constrained Delaunay triangulation. First, the spatial proximity relationships between spatial point events are roughly captured by Delaunay triangulation. Then, three-level constraints are described and used to refine spatial proximity relationships with the consideration of statistical characteristics. Finally, those spatial point events connected by remaining edges are gathered to form a series of subgraphs. Those subgraphs containing very few point events are regarded as spatial outliers. Experiments on both synthetic and real-world spatial data sets are used to show that the proposed SPEOD algorithm can detect various types of spatial point event outliers with high efficiency. Moreover, there is no need to input any parameter in SPEOD.

12.
The paper presents problems pertaining to spatial data mining. Based on the existing solutions a new method of knowledge extraction in the form of spatial association rules and collocations has been worked out and is proposed herein. Delaunay diagram is used for determining neighborhoods. Based on the neighborhood notion, spatial association rules and collocations are defined. A novel algorithm for finding spatial rules and collocations has been presented. The approach allows eliminating the parameters defining neighborhood of objects, thus avoiding multiple “test and trial” repetitions of the process of mining for various parameter values. The presented method has been implemented and tested. The results of the experiments have been discussed.

13.
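A spatial outlier is a spatial object whose attribute values differ markedly from those of its neighbors. A major drawback of existing spatial outlier detection algorithms is that they overlook some true outliers while treating some non-outliers as spatial outliers. This paper proposes an iterative algorithm that detects outliers over multiple iterations and achieves good results. Experiments show that the algorithm is practical.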
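The abstract does not say how the iterations work. One common scheme in the spatial outlier literature, sketched here purely as an assumption, flags only the single most deviant point per round and then replaces its value with its neighborhood mean, so that it no longer distorts the scores of its normal neighbors in later rounds. The neighbor lists, the `threshold`, and the `max_rounds` cap are all hypothetical inputs.

```python
from statistics import mean


def iterative_outliers(values, neighbors, threshold, max_rounds=10):
    """Flag one outlier per round, then neutralize it before the next round."""
    vals = list(values)
    found = []
    for _ in range(max_rounds):
        # Deviation of each point from the mean of its neighbors.
        diffs = [
            abs(vals[i] - mean(vals[j] for j in neighbors[i]))
            for i in range(len(vals))
        ]
        worst = max(range(len(vals)), key=diffs.__getitem__)
        if diffs[worst] <= threshold:
            break
        found.append(worst)
        # Replace the flagged value so it stops inflating its neighbors' scores.
        vals[worst] = mean(vals[j] for j in neighbors[worst])
    return found
```

On the values `[1, 1, 30, 1, 1, 12, 1, 1]` arranged along a path with a threshold of 10, a single pass would also flag the two normal points next to the value 30, because that value drags their neighborhood means upward; the iterative version returns only the two genuinely anomalous points.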

14.
An adaptive spatial clustering algorithm based on Delaunay triangulation   Cited by: 7 (self-citations: 0; by others: 7)
In this paper, an adaptive spatial clustering algorithm based on Delaunay triangulation (ASCDT for short) is proposed. The ASCDT algorithm employs both statistical features of the edges of Delaunay triangulation and a novel spatial proximity definition based upon Delaunay triangulation to detect spatial clusters. Normally, this algorithm can automatically discover clusters of complicated shapes, and non-homogeneous densities in a spatial database, without the need to set parameters or prior knowledge. The user can also modify the parameter to fit with special applications. In addition, the algorithm is robust to noise. Experiments on both simulated and real-world spatial databases (i.e. an earthquake dataset in China) are utilized to demonstrate the effectiveness and advantages of the ASCDT algorithm.

15.
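A triangulation algorithm for constrained data domains with islands   Cited by: 6 (self-citations: 0; by others: 6)
The paper studies triangulation of polygon interiors and D-triangulation of constrained data domains that carry attributes and contain islands. It proposes a D-triangulation algorithm for polygon interiors based on the "smallest interior angle first" principle, and a general triangulation algorithm applicable to building the mesh both inside and outside polygons. The algorithms take full account of the multiple blocks with different attributes present in the data domain, and have been successfully applied in engineering projects.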

16.
SLOM: a new measure for local spatial outliers   Cited by: 13 (self-citations: 1; by others: 13)
We propose a measure, spatial local outlier measure (SLOM), which captures the local behaviour of datum in their spatial neighbourhood. With the help of SLOM, we are able to discern local spatial outliers that are usually missed by global techniques, like “three standard deviations away from the mean”. Furthermore, the measure takes into account the local stability around a data point and suppresses the reporting of outliers in highly unstable areas, where data are too heterogeneous and the notion of outliers is not meaningful. We prove several properties of SLOM and report experiments on synthetic and real data sets that show that our approach is novel and scalable to large datasets. Sanjay Chawla is a Senior Lecturer in the School of Information Technologies at the University of Sydney. His research interests span the area of data mining and spatial database management. He is a co-author of the textbook “Spatial Databases: A Tour”, which is published by Prentice Hall. His research work has appeared in leading publications, including IEEE Transaction on Knowledge and Data Engineering and GeoInformatica. He received his Ph.D. in Mathematics from the University of Tennessee, USA. Pei Sun is currently a Ph.D. student in the School of Information Technology, Sydney University, Australia. His research interests include data mining and spatial database. He received his M.E. degree from the University of New South Wales, Sydney, Australia, in 2002 and a B.E. degree from Beijing Forestry University, China, in 1990.

17.
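Incremental point insertion for Delaunay triangulation based on the Qi algorithm   Cited by: 1 (self-citations: 0; by others: 1)
Delaunay triangulation is widely used in many fields, so generating it quickly and efficiently is important. Incremental point insertion is one of the most widely used methods for building a Delaunay triangulation. This paper studies in depth the process of building an unconstrained Delaunay triangulation by incremental point insertion. The Qi algorithm is introduced into the establishment of node topology, a key step that determines construction efficiency, which reduces the complexity of generating the triangulation with this method. The Qi algorithm is then introduced again when inserting constrained edges into the Delaunay triangulation, further improving construction efficiency. To validate the model, the improved incremental insertion method was implemented from the ground up in C# under Microsoft Visual Studio 2005. Experiments show that introducing the Qi algorithm improves the efficiency of both building the Delaunay triangulation by incremental insertion and inserting constrained edges.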

18.
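Traditional K-nearest-neighbor-based algorithms require a value for the neighbor parameter k, and a suitable value can be chosen only with relevant prior knowledge. To reduce the influence of parameters on outlier detection, a parameter-free outlier detection algorithm based on Delaunay triangulation is proposed. Delaunay triangulation is an important foundation in numerical analysis and computer graphics. Its construction needs no parameters, and in the triangulation every data object is directly connected by an edge to the points spatially adjacent to it, which yields an effective neighbor relationship. The algorithm first forms the set of spatial neighbors of each point through Delaunay triangulation, then computes each point's degree of outlierness from the distribution characteristics between the point and its spatial neighbors, and decides from that degree whether the point is an outlier. Experimental comparison with related algorithms shows that the algorithm performs better.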

19.
Keyframe-based video summarization using Delaunay clustering   Cited by: 1 (self-citations: 0; by others: 1)
Recent advances in technology have made tremendous amounts of multimedia information available to the general population. An efficient way of dealing with this new development is to develop browsing tools that distill multimedia data as information oriented summaries. Such an approach will not only suit resource poor environments such as wireless and mobile, but also enhance browsing on the wired side for applications like digital libraries and repositories. Automatic summarization and indexing techniques will give users an opportunity to browse and select multimedia document of their choice for complete viewing later. In this paper, we present a technique by which we can automatically gather the frames of interest in a video for purposes of summarization. Our proposed technique is based on using Delaunay Triangulation for clustering the frames in videos. We represent the frame contents as multi-dimensional point data and use Delaunay Triangulation for clustering them. We propose a novel video summarization technique by using Delaunay clusters that generates good quality summaries with fewer frames and less redundancy when compared to other schemes. In contrast to many of the other clustering techniques, the Delaunay clustering algorithm is fully automatic with no user specified parameters and is well suited for batch processing. We demonstrate these and other desirable properties of the proposed algorithm by testing it on a collection of videos from Open Video Project. We provide a meaningful comparison between results of the proposed summarization technique with Open Video storyboard and K-means clustering. We evaluate the results in terms of metrics that measure the content representational value of the proposed technique.
