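Mining Frequent Co-location Patterns from Uncertain Data Sets
Cite this article: LU Ye, WANG Lizhen, ZHANG Xiaofeng. Mining Frequent Co-location Patterns from Uncertain Data Sets[J]. Journal of Frontiers of Computer Science and Technology, 2009, 3(6): 656-664.
Authors: LU Ye  WANG Lizhen  ZHANG Xiaofeng
Affiliation: Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650091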
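Abstract: The Join-based algorithm, a classical algorithm for mining frequent co-location patterns, is extended to the UJoin-based algorithm, which solves the problem of mining frequent co-location patterns from uncertain data sets. Because computing expected distances (ED) is expensive in the UJoin-based algorithm, two pruning techniques are introduced: bounding rectangle pruning and triangle inequality pruning. For triangle inequality pruning, the cases of 1, 5, and 9 anchor points are discussed separately. Extensive experiments show that the pruning strategies avoid a large number of ED computations and improve the efficiency of the algorithm.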

Keywords: uncertain data  co-location patterns  UJoin-based algorithm  bounding rectangle pruning  triangle inequality pruning

Mining Frequent Co-location Patterns from Uncertain Data
LU Ye, WANG Lizhen, ZHANG Xiaofeng. Mining Frequent Co-location Patterns from Uncertain Data[J]. Journal of Frontiers of Computer Science and Technology, 2009, 3(6): 656-664.
Authors: LU Ye  WANG Lizhen  ZHANG Xiaofeng
Affiliation: Department of Computer Science and Engineering, School of Information Science and Engineering, Yunnan University, Kunming 650091, China
Abstract: This paper studies the problem of mining frequent co-location patterns from uncertain data whose locations are described by probability density functions (PDFs). It is shown that the UJoin-based algorithm, which generalizes the Join-based algorithm to handle uncertain instances, is very inefficient. The inefficiency arises because UJoin-based computes expected distances (ED) between instances. For arbitrary PDFs, expected distances are computed by numerical integration, which is costly. Several pruning methods are studied to avoid these expensive expected distance calculations. Experiments have been conducted to evaluate the effectiveness of these pruning techniques.
Keywords: uncertain data  co-location patterns  UJoin-based algorithm  bounding rectangle (BR) pruning  triangle inequality pruning
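
The pruning idea named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes each uncertain instance is approximated by a list of equally weighted 2-D sample points (a discrete stand-in for the PDF), that two instances count as neighbors when their expected distance is at most a threshold `d`, and that anchor points are chosen by the caller. All function names (`expected_distance`, `bounding_rect`, `rect_distance_bounds`, `neighbor_test`) are hypothetical. The bounds used are the standard ones: the minimum and maximum distance between two bounding rectangles bracket the ED, and for any anchor point p, |ED(a,p) - ED(b,p)| <= ED(a,b) <= ED(a,p) + ED(b,p).

```python
import math
from itertools import product


def expected_distance(a, b):
    """Mean Euclidean distance over all sample pairs.

    Assumption: equally weighted samples stand in for the double
    integral over the two PDFs. This is the expensive step.
    """
    return sum(math.dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))


def bounding_rect(samples):
    """Axis-aligned bounding rectangle as (xmin, ymin, xmax, ymax)."""
    xs, ys = zip(*samples)
    return min(xs), min(ys), max(xs), max(ys)


def rect_distance_bounds(r1, r2):
    """Smallest and largest distance between a point of r1 and a point of r2."""
    dx_min = max(r1[0] - r2[2], r2[0] - r1[2], 0.0)
    dy_min = max(r1[1] - r2[3], r2[1] - r1[3], 0.0)
    dx_max = max(r1[2] - r2[0], r2[2] - r1[0])
    dy_max = max(r1[3] - r2[1], r2[3] - r1[1])
    return math.hypot(dx_min, dy_min), math.hypot(dx_max, dy_max)


def neighbor_test(a, b, d, anchors=()):
    """Return (is_neighbor, rule_that_decided) for instances a and b."""
    # Bounding rectangle pruning: every sample-pair distance, and hence
    # the ED, lies between the rectangle min and max distances.
    lo, hi = rect_distance_bounds(bounding_rect(a), bounding_rect(b))
    if lo > d:
        return False, "bounding rectangle"
    if hi <= d:
        return True, "bounding rectangle"
    # Triangle inequality pruning with anchor points. In a real run the
    # instance-to-anchor EDs would be computed once per instance and
    # cached; they are recomputed here only to keep the sketch short.
    for p in anchors:
        ed_a = expected_distance(a, [p])
        ed_b = expected_distance(b, [p])
        if abs(ed_a - ed_b) > d:
            return False, "triangle inequality"
        if ed_a + ed_b <= d:
            return True, "triangle inequality"
    # Neither bound was conclusive: fall back to the full ED computation.
    return expected_distance(a, b) <= d, "expected distance"
```

Calling `neighbor_test(a, b, d, anchors=[...])` for each candidate pair shows which pairs are resolved by the cheap bounds and which still need the full ED. More anchors tighten the triangle-inequality bounds at the cost of more precomputed instance-to-anchor distances, which is the trade-off behind comparing 1, 5, and 9 anchor points.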