Similar literature
1.
Updating generalized association rules with evolving fuzzy taxonomies   Total citations: 1, self-citations: 1, citations by others: 0
Mining generalized association rules with fuzzy taxonomic structures has been recognized as an important extension of the generalized association mining problem. To date, however, most work on this problem has required the taxonomies to be static, ignoring the fact that the taxonomies of items cannot necessarily be kept unchanged. For instance, some items may be reclassified from one hierarchy tree to another for more suitable classification, removed from the taxonomies if they are no longer produced, or added to the taxonomies as new items. Additionally, the membership degrees expressing the fuzzy classification may also need to be adjusted. Under these circumstances, effectively updating the discovered generalized association rules is a crucial task. In this paper, we examine this problem and propose two novel algorithms, called FDiff_ET and FDiff_ET*, to update the discovered generalized frequent itemsets. Empirical evaluations show that our algorithms maintain their performance even under a high degree of taxonomy evolution, and are significantly faster than applying the contemporary fuzzy generalized association mining algorithm FGAR to a database with an evolving taxonomy.

2.
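Wang Xin, Wang Yong. Computer Engineering and Applications (《计算机工程与应用》), 2002, 38(17): 206-207, 220
Items are usually classified in a fuzzy class hierarchy. Based on the concept of fuzzy class hierarchies, this paper discusses how to compute the support and confidence of fuzzy association rules and gives two extended algorithms for mining generalized fuzzy association rules.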

3.
This paper presents some new algorithms to efficiently mine max frequent generalized itemsets (g-itemsets) and essential generalized association rules (g-rules). These are compact and general representations for all frequent patterns and all strong association rules in the generalized environment. Our results fill an important gap among algorithms for frequent patterns and association rules by combining two concepts. First, generalized itemsets employ a taxonomy of items, rather than a flat list of items. This produces more natural frequent itemsets and associations such as (meat, milk) instead of (beef, milk), (chicken, milk), etc. Second, compact representations of frequent itemsets and strong rules, whose result size is exponentially smaller, can solve a standard dilemma in mining patterns: with small threshold values for support and confidence, the user is overwhelmed by the extraordinary number of identified patterns and associations; but with large threshold values, some interesting patterns and associations fail to be identified. Our algorithms can also expand those max frequent g-itemsets and essential g-rules into the much larger set of ordinary frequent g-itemsets and strong g-rules. While that expansion is not recommended in most practical cases, we do so in order to present a comparison with existing algorithms that only handle ordinary frequent g-itemsets. In this case, the new algorithm is shown to be thousands, and in some cases millions, of times faster than previous algorithms. Further, the new algorithm succeeds in analyzing deeper taxonomies, with depths of seven or more. Experimental results for previous algorithms were limited to taxonomies with depth at most three or four. For each of the two problems, a straightforward lattice-based approach is briefly discussed and then a classification-based algorithm is developed.
In particular, the two classification-based algorithms are MFGI_class for mining max frequent g-itemsets and EGR_class for mining essential g-rules. The classification-based algorithms feature conceptual classification trees and dynamic generation and pruning algorithms.
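
The taxonomy idea in this abstract, counting (meat, milk) rather than (beef, milk) and (chicken, milk) separately, can be illustrated with a minimal sketch. It is not MFGI_class or EGR_class: it only shows the common technique of extending each transaction with the ancestors of its items before counting support. The taxonomy, the transactions, and the function names are hypothetical.

```python
# Hypothetical taxonomy (child -> parent); not taken from the paper.
TAXONOMY = {"beef": "meat", "chicken": "meat", "milk": "dairy"}

def ancestors(item, taxonomy):
    """Return every ancestor of `item` by walking up the taxonomy."""
    result = []
    while item in taxonomy:
        item = taxonomy[item]
        result.append(item)
    return result

def extend(transaction, taxonomy):
    """Return the transaction plus all ancestors of its items."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, taxonomy))
    return extended

def support(itemset, transactions, taxonomy):
    """Fraction of extended transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= extend(t, taxonomy))
    return hits / len(transactions)

# Made-up transactions for illustration only.
transactions = [
    {"beef", "milk"},
    {"chicken", "milk"},
    {"beef"},
    {"milk"},
]

print(support({"beef", "milk"}, transactions, TAXONOMY))     # 0.25
print(support({"chicken", "milk"}, transactions, TAXONOMY))  # 0.25
print(support({"meat", "milk"}, transactions, TAXONOMY))     # 0.5
```

With a support threshold of 0.5, neither leaf-level itemset would be reported, while the generalized itemset (meat, milk) would.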

4.
In this paper, we study the issues of mining and maintaining association rules in a large database of customer transactions. The problem of mining association rules can be mapped into the problem of finding large itemsets, which are sets of items bought together in a sufficient number of transactions. We revise a graph-based algorithm to further speed up the process of itemset generation. In addition, we extend our revised algorithm to maintain discovered association rules when incremental or decremental updates are made to the databases. Experimental results show the efficiency of our algorithms. The revised algorithm is a significant improvement over the original one on mining association rules. The algorithms for maintaining association rules are more efficient than re-running the mining algorithms for the whole updated database and outperform previously proposed algorithms that need multiple passes over the database. Received 4 August 1999 / Revised 18 March 2000 / Accepted in revised form 18 October 2000

5.
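Hybrid association rules and their mining algorithms   Total citations: 1, self-citations: 0, citations by others: 1
Negative items are introduced into the itemset, and on this basis a generalized model of association rules, the hybrid association rule, is defined. The paper analyzes the value of this model, gives a formal description of the corresponding mining problem, and defines several key algorithms used in the mining.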

6.
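Mining weighted association rules   Total citations: 24, self-citations: 0, citations by others: 24
Association rules can reveal hidden relationships among data and have been widely applied in many fields. Many effective algorithms for discovering association rules have been proposed, but they all treat every data record as equally important to the rules. In practice, however, users care more about recent data: the older a record is, the smaller its influence on the rules should be, so the influence of such data should be weakened. To this end, this paper poses the problem of vertically weighted association rules. In addition, users may sometimes want to strengthen or weaken the influence of certain items on the rules, which gives horizontally weighted association rules. Finally, the problem of mixed weighted association rules is posed, and an algorithm for solving it, MWAL, is given. Experiments demonstrate the effectiveness of MWAL.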
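
The vertical weighting described in this abstract (older records should influence the rules less) can be sketched as a time-weighted support. This is not the MWAL algorithm. The exponential decay, the `decay` value, and the transactions are assumptions made purely for illustration.

```python
def weighted_support(itemset, transactions, weights):
    """Weight of the transactions containing `itemset`, divided by the total weight."""
    itemset = set(itemset)
    total = sum(weights)
    matched = sum(w for t, w in zip(transactions, weights) if itemset <= t)
    return matched / total

# Made-up transactions, ordered from oldest to newest.
transactions = [{"a", "b"}, {"a", "b"}, {"a"}, {"a", "c"}]

# Assumed weighting scheme: the newest record has weight 1 and each step
# back in time halves the weight, giving [0.125, 0.25, 0.5, 1.0].
decay = 0.5
n = len(transactions)
weights = [decay ** (n - 1 - i) for i in range(n)]

uniform = [1.0] * n
print(weighted_support({"a", "b"}, transactions, uniform))  # 0.5
print(weighted_support({"a", "b"}, transactions, weights))  # 0.2
```

The itemset {a, b} occurs in half of the records, but only in the two oldest ones, so its time-weighted support drops from 0.5 to 0.2.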

7.
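Several mining algorithms for Boolean weighted association rules and their comparison   Total citations: 1, self-citations: 0, citations by others: 1
Association rule mining has been widely applied in many fields, and many algorithms for discovering association rules exist. These algorithms all treat every item as equally important to the rules. In practice, however, users value some items more than others. To strengthen the influence of these items on the rules, a number of weighted association rule algorithms have been proposed. This paper introduces several existing algorithms and analyzes and compares them.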

8.
Several efforts have been devoted to mining association rules whose premise and conclusion parts are conjunctions of items. Such rules convey information about the co-occurrence relations between items. However, other links among items (such as complementary occurrence of items or absence of items) may occur and offer interesting knowledge to end users. Looking for such relationships is a real challenge, since they are not based on conjunctive patterns. Indeed, capturing such links requires semantically richer association rules, the generalized ones. These rules generalize classic ones by also offering disjunction and negation connectors between items, in addition to the conjunctive one. For this purpose, we propose in this paper a complete process for mining generalized association rules starting from an extraction context. Our experimental study, which focuses on mining performance as well as on quantitative aspects, demonstrates the soundness of our proposal.

9.
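An improved algorithm for weighted association rules   Total citations: 7, self-citations: 2, citations by others: 7
This paper discusses the weighted association rule problem and proposes an improved algorithm for Boolean weighted association rules. The algorithm first uses an ordinary association rule algorithm to generate the frequent itemsets and then derives the weighted frequent itemsets from them. It also gives an optimal way to set the minimum support, which guarantees that the frequent itemsets produced by the ordinary algorithm are a superset of the weighted frequent itemsets. The algorithm is efficient and can make effective use of existing association rule algorithms.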
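
The two-phase scheme in this abstract (mine ordinary frequent itemsets first, then filter them down to the weighted frequent itemsets) can be sketched as follows. The abstract does not give the paper's weighted-support definition or its threshold formula, so both are assumptions here: weighted support is taken as the mean item weight times the ordinary support, which implies wsup(X) <= max_weight * sup(X), so an ordinary threshold of w_min_support / max_weight cannot miss any weighted frequent itemset. A brute-force enumerator stands in for a real miner such as Apriori.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Brute-force stand-in for any ordinary frequent-itemset miner."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = support(frozenset(combo), transactions)
            if s >= min_support:
                result[frozenset(combo)] = s
    return result

def weighted_support(itemset, sup, weights):
    """Assumed definition: mean weight of the items times the ordinary support."""
    return sum(weights[i] for i in itemset) / len(itemset) * sup

def weighted_frequent_itemsets(transactions, weights, w_min_support):
    # Phase 1: ordinary mining with a lowered threshold. Under the assumed
    # definition, wsup(X) <= max_weight * sup(X), so nothing is missed.
    min_support = w_min_support / max(weights.values())
    candidates = frequent_itemsets(transactions, min_support)
    # Phase 2: keep only candidates whose weighted support meets the threshold.
    result = {}
    for itemset, sup in candidates.items():
        wsup = weighted_support(itemset, sup, weights)
        if wsup >= w_min_support:
            result[itemset] = wsup
    return result

# Made-up data and weights (weights must be positive).
transactions = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
weights = {"a": 0.9, "b": 0.3, "c": 0.6}

found = weighted_frequent_itemsets(transactions, weights, w_min_support=0.25)
print(sorted(sorted(x) for x in found))  # [['a'], ['a', 'b'], ['a', 'c'], ['c']]
```

Item b is frequent in the ordinary sense (support 0.75), but its low weight keeps {b} out of the weighted result, while {a, b} is retained because of the heavier item a.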

10.
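Redundancy removal and clustering of association rules   Total citations: 9, self-citations: 0, citations by others: 9
Association rule mining often produces a large number of rules, which makes it very difficult for users to analyze and use them; the problem is especially pronounced when the attributes in the database are highly correlated. To support exploratory analysis, various techniques can be used to reduce the number of rules effectively, such as constrained association rule mining and the clustering or generalization of rules. This paper proposes ADRR, an algorithm for removing redundant association rules, and ACAR, an algorithm for clustering association rules. Based on properties of sets, it proves that the mined association rules contain many redundant rules that can be removed, which leads to ADRR. ACAR uses a new method that defines the distance between rules in terms of the correlation between items, and clusters the rules following the idea of the DBSCAN algorithm. Finally, the proposed algorithms are implemented, and the experimental results show that they are effective, feasible, and fairly efficient.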

11.
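Mining horizontally weighted association rules based on the Apriori algorithm   Total citations: 19, self-citations: 2, citations by others: 19
Association rule mining can discover interesting associations or correlations among itemsets in large amounts of data and has been widely applied in many fields. Many algorithms for discovering association rules have been proposed, and they all treat every data item as equally important to the rules. In practice, however, users tend to favor the items they find most interesting or most important, so it is necessary to strengthen the influence of these items on the rules while weakening the influence of items that users find uninteresting or unimportant. To this end, the paper poses the problem of horizontally weighted association rules and, by improving the Apriori algorithm, gives a solution and an effective algorithm, New_Apriori.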

12.
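Methods and algorithms for discovering constrained association rules   Total citations: 47, self-citations: 0, citations by others: 47
This paper studies the problem of discovering association rules subject to constraints in large transaction databases and proposes two methods that implement constrained association rule discovery effectively: Filtering, a database-filtering algorithm, and Separate, a frequent-itemset generation algorithm. The two methods can be used together, and they run significantly more efficiently than existing algorithms.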

13.
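A new weighted association rule model   Total citations: 5, self-citations: 3, citations by others: 5
Association rule mining can discover hidden relationships among itemsets in large amounts of data and has been widely applied in many fields. Many association rule mining algorithms have been proposed, and they generally treat every data item as equally important. In reality, however, items often differ in importance: decision makers tend to give priority to items with higher profit and to neglect items with lower profit. The paper analyzes the problems in the existing literature on weighted association rules, proposes a new weighted association rule model, and gives MWFI, an algorithm that mines weighted frequent itemsets effectively.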

14.
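Many real-world databases carry temporal semantics, so adding temporal constraints when mining association rules makes the rules more meaningful in practice. However, most temporal association rule mining algorithms proposed so far treat every data item as equally important, whereas decision makers tend to give priority to items with higher profit. This paper proposes a weighted temporal association rule mining algorithm that uses the life cycle of an item as its temporal feature and allows users to set different item weights. Experimental results show that the algorithm not only discovers weighted temporal association rules effectively but also yields more valuable rules.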

15.
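An effective incremental updating algorithm for association rules   Total citations: 8, self-citations: 2, citations by others: 6
Association rules are an important research topic in data mining. Many algorithms have been proposed for efficiently discovering association rules in large databases, but the updating and maintenance of discovered rules has received much less attention. This paper proposes an incremental updating algorithm for association rules based on the frequent pattern tree, which handles the updating of association rules after a new set of transactions has been added to the transaction database, and analyzes its performance.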

16.
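A practical incremental updating algorithm for association rules   Total citations: 2, self-citations: 0, citations by others: 2
Xue Jin, Chen Yuanbin. Computer Engineering and Applications (《计算机工程与应用》), 2003, 39(13): 212-213, 217
Association rules are an important research topic in data mining. Many algorithms have been proposed for efficiently discovering association rules in large databases, but the updating and maintenance of discovered rules has received much less attention. This paper proposes a practical incremental updating algorithm for association rules, which handles the updating of association rules after a new set of transactions has been added to the transaction database, and analyzes its performance.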

17.
Association rule mining is one of the most popular data mining techniques. However, users often experience difficulties in interpreting and exploiting the association rules extracted from large, high-dimensional transaction data. The primary reasons for such difficulties are twofold. First, conventional association rule mining algorithms can produce too many association rules, and second, some association rules can partly overlap. This problem can be addressed if the user can select the relevant items to be used in association rule mining; however, the relations among the items in large transaction data are often quite complex. In this context, this paper proposes a novel visual exploration tool, the structured association map (SAM), which enables users to find groups of relevant items visually. The appearance of SAM is similar to that of the well-known cluster heat map, but the items in SAM are sorted in a more intelligent way so that users can easily find the interesting areas formed by sets of associated items, which are likely to constitute interesting many-to-many association rules. Moreover, this paper introduces an index called S2C, designed to evaluate the quality of SAM, and explains the SAM-based association analysis procedure in a comprehensive manner. For illustration, this procedure is applied to a mass health examination result data set. The experimental results demonstrate that a SAM with a high S2C value significantly reduces the complexity of association analysis and makes it possible to focus on a specific region of the search space of association rule mining while avoiding irrelevant association rules.

18.
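An improved Apriori algorithm for mining association rules   Total citations: 2, self-citations: 0, citations by others: 2
Association rule mining can discover interesting relationships among itemsets in large amounts of data and has been widely applied in many fields. However, traditional association rule mining rarely considers the importance of data items: these algorithms treat every data item as equally important to the rules, so the results mined in practice are not very satisfactory. To mine more valuable rules, this paper proposes a weighted association rule algorithm that uses frequency and profit to indicate the importance of an item and then improves the classical Apriori algorithm accordingly. Finally, the improved algorithm is verified on an example; the results show that it is reasonable and effective and can mine more valuable information.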

19.
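Simplest association rules and their mining algorithm   Total citations: 3, self-citations: 0, citations by others: 3
Li Jie, Xu Yong, Wang Yunfeng, Wang You. Computer Engineering (《计算机工程》), 2007, 33(13): 46-48
Traditional association rule mining algorithms often produce too many rules for decision makers to use. To address this problem, the paper proposes, from an application perspective, the simplest association rule: its consequent contains only one product, and the number of products in its antecedent is kept as small as possible. On this basis, an algorithm for mining simplest association rules is given. The set of simplest association rules obtained with this algorithm contains far fewer rules yet leads to the same decisions as the full set of association rules, which avoids a large amount of redundant mining and improves both mining efficiency and practical usefulness.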
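
One possible reading of the rule shape described in this abstract (a single-item consequent and an antecedent that is as small as possible) is sketched below. This is not the paper's algorithm: the pruning criterion (skip an antecedent if one of its proper subsets already yields a valid rule for the same consequent), the thresholds, and the transactions are assumptions for illustration.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)

def minimal_rules(transactions, min_support, min_confidence):
    """Rules with a one-item consequent whose antecedent has no smaller valid subset."""
    items = sorted(set().union(*transactions))
    rules = []
    for consequent in items:
        others = [i for i in items if i != consequent]
        kept = []  # antecedents already accepted for this consequent
        for size in range(1, len(others) + 1):  # smallest antecedents first
            for combo in combinations(others, size):
                antecedent = frozenset(combo)
                # Assumed pruning: a proper subset already predicts this consequent.
                if any(k < antecedent for k in kept):
                    continue
                sup_rule = support(antecedent | {consequent}, transactions)
                sup_ante = support(antecedent, transactions)
                if (sup_ante > 0 and sup_rule >= min_support
                        and sup_rule / sup_ante >= min_confidence):
                    kept.append(antecedent)
                    rules.append((antecedent, consequent))
    return rules

# Made-up transactions for illustration only.
transactions = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]

for antecedent, consequent in minimal_rules(transactions, 0.5, 0.9):
    print(sorted(antecedent), "->", consequent)
```

On this data the sketch reports c -> a and a -> c, and skips the longer rules {b, c} -> a and {a, b} -> c because a one-item antecedent already covers each consequent.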

20.
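Fast algorithms for mining fuzzy rules in relational databases   Total citations: 10, self-citations: 0, citations by others: 10
Chen Ning, Chen An, Zhou Longxiang. Journal of Software (《软件学报》), 2001, 12(7): 949-959
Discovering association rules and sequential rules is one of the tasks of data mining. In previous algorithms, rules are usually expressed with exact values or concepts, which often carry little practical meaning and are hard for users to understand. This paper studies the problem of mining fuzzy association rules and fuzzy sequential rules from large relational databases. Based on fuzzy set theory, it proposes two algorithms for mining fuzzy association rules and then extends each of them into an algorithm for mining fuzzy sequential rules. Rules expressed with fuzzy concepts are closer to the way people think and express themselves, which makes the rules easier to understand.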
