1.
Wen-Yang Lin, Ja-Hwung Su, Ming-Cheng Tseng. Soft Computing - A Fusion of Foundations, Methodologies and Applications, 2012, 16(7): 1109-1118
Mining generalized association rules with fuzzy taxonomic structures has been recognized as an important extension of the generalized association mining problem. To date, however, most work on this problem has required the taxonomies to be static, ignoring the fact that item taxonomies cannot necessarily be kept unchanged. For instance, some items may be reclassified from one hierarchy tree to another for more suitable classification, removed from the taxonomies if they will no longer be produced, or added to the taxonomies as new items. Additionally, the membership degrees expressing the fuzzy classification may also need to be adjusted. Under these circumstances, effectively updating the discovered generalized association rules is a crucial task. In this paper, we examine this problem and propose two novel algorithms, called FDiff_ET and FDiff_ET*, to update the discovered generalized frequent itemsets. Empirical evaluations show that our algorithms maintain their performance even under a high degree of taxonomy evolution, and are significantly faster than applying the contemporary fuzzy generalized association mining algorithm FGAR to the database with evolving taxonomy.
2.
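Items are usually classified in a fuzzy class hierarchy. Based on the concept of fuzzy class hierarchies, this paper discusses how to compute the support and confidence of fuzzy association rules and presents two extended algorithms for mining generalized fuzzy association rules.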
3.
Mining Frequent Generalized Itemsets and Generalized Association Rules Without Redundancy (cited by: 1; self-citations: 0; citations by others: 1)
This paper presents some new algorithms to efficiently mine max frequent generalized itemsets (g-itemsets) and essential generalized association rules (g-rules). These are compact and general representations for all frequent patterns and all strong association rules in the generalized environment. Our results fill an important gap among algorithms for frequent patterns and association rules by combining two concepts. First, generalized itemsets employ a taxonomy of items, rather than a flat list of items. This produces more natural frequent itemsets and associations such as (meat, milk) instead of (beef, milk), (chicken, milk), etc. Second, compact representations of frequent itemsets and strong rules, whose result size is exponentially smaller, can solve a standard dilemma in mining patterns: with small threshold values for support and confidence, the user is overwhelmed by the extraordinary number of identified patterns and associations; but with large threshold values, some interesting patterns and associations fail to be identified. Our algorithms can also expand those max frequent g-itemsets and essential g-rules into the much larger set of ordinary frequent g-itemsets and strong g-rules. While that expansion is not recommended in most practical cases, we do so in order to present a comparison with existing algorithms that only handle ordinary frequent g-itemsets. In this case, the new algorithm is shown to be thousands, and in some cases millions, of times faster than previous algorithms. Further, the new algorithm succeeds in analyzing deeper taxonomies, with depths of seven or more. Experimental results for previous algorithms were limited to taxonomies with depth at most three or four. For each of the two problems, a straightforward lattice-based approach is briefly discussed and then a classification-based algorithm is developed. In particular, the two classification-based algorithms are MFGI_class for mining max frequent g-itemsets and EGR_class for mining essential g-rules. The classification-based algorithms feature conceptual classification trees and dynamic generation and pruning algorithms.
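The role of a taxonomy in the abstract above can be illustrated with a small sketch. This is not MFGI_class or EGR_class; it only shows the common baseline idea of extending each transaction with the ancestors of its items before counting support. The taxonomy, the transactions, and the helper names `ancestors`, `extend`, and `support` are all assumptions made up for illustration.

```python
# Hypothetical taxonomy (child -> parent), assumed for illustration only.
TAXONOMY = {"beef": "meat", "chicken": "meat"}


def ancestors(item, taxonomy):
    """Return all ancestors of an item by following child -> parent links."""
    result = []
    while item in taxonomy:
        item = taxonomy[item]
        result.append(item)
    return result


def extend(transaction, taxonomy):
    """Return the transaction plus every ancestor of every item in it."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, taxonomy))
    return extended


def support(itemset, transactions, taxonomy):
    """Fraction of transactions whose extended form contains the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= extend(t, taxonomy))
    return hits / len(transactions)


# Toy transactions, also assumed for illustration.
transactions = [
    {"beef", "milk"},
    {"chicken", "milk"},
    {"beef", "bread"},
    {"milk", "bread"},
]

print(support({"beef", "milk"}, transactions, TAXONOMY))  # leaf-level itemset
print(support({"meat", "milk"}, transactions, TAXONOMY))  # generalized itemset
```

In this toy data the generalized itemset (meat, milk) collects the support that is otherwise split between (beef, milk) and (chicken, milk), which is why a generalized itemset can pass a threshold that neither leaf-level itemset reaches.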
4.
Efficient Graph-Based Algorithms for Discovering and Maintaining Association Rules in Large Databases (cited by: 4; self-citations: 2; citations by others: 2)
In this paper, we study the issues of mining and maintaining association rules in a large database of customer transactions.
The problem of mining association rules can be mapped into the problems of finding large itemsets which are sets of items brought together in a sufficient number of transactions. We revise a graph-based algorithm to further
speed up the process of itemset generation. In addition, we extend our revised algorithm to maintain discovered association
rules when incremental or decremental updates are made to the databases. Experimental results show the efficiency of our algorithms.
The revised algorithm is a significant improvement over the original one on mining association rules. The algorithms for maintaining
association rules are more efficient than re-running the mining algorithms for the whole updated database and outperform previously
proposed algorithms that need multiple passes over the database.
Received 4 August 1999 / Revised 18 March 2000 / Accepted in revised form 18 October 2000
5.
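Mixed Association Rules and Their Mining Algorithms (cited by: 1; self-citations: 0; citations by others: 1)
Negative items are introduced into itemsets, and on that basis a generalized model of association rules, the mixed association rule, is defined. The paper analyzes the value of this model, gives a formal description of its mining problem, and defines several key algorithms used in mining.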
6.
7.
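Several Mining Algorithms for Boolean Weighted Association Rules and Their Comparison (cited by: 1; self-citations: 0; citations by others: 1)
Association rule mining is widely applied in many fields, and many algorithms for discovering association rules exist. These algorithms all assume that every item is equally important to a rule. In practice, however, users care more about some items than others, so several weighted association rule algorithms have been proposed to strengthen the influence of those items on the rules. This paper introduces several existing algorithms and analyzes and compares them.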
8.
Tarek Hamrouni, Sadok Ben Yahia, Engelbert Mephu Nguifo. Annals of Mathematics and Artificial Intelligence, 2010, 59(2): 201-222
Several efforts have been devoted to mining association rules having a conjunction of items in the premise and conclusion parts. Such rules convey information about the co-occurrence relations between items. However, other links among items (such as complementary occurrence of items, absence of items, etc.) may occur and offer interesting knowledge to end-users. In this respect, looking for such relationships is a real challenge since they are not based on conjunctive patterns. Indeed, catching such links requires obtaining semantically richer association rules, the generalized ones. These latter rules generalize classic ones to also offer disjunction and negation connectors between items, in addition to the conjunctive one. For this purpose, we propose in this paper a complete process for mining generalized association rules starting from an extraction context. Our experimental study, focusing on mining performance as well as the quantitative aspect, demonstrates the soundness of our proposal.
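The three connectors mentioned in the abstract above can be made concrete with a small sketch. It assumes the usual counting definitions: a transaction supports a conjunctive pattern if it contains all of its items, a disjunctive pattern if it contains at least one, and a negative pattern if it contains none. The function names and the toy extraction context are hypothetical and are not taken from the paper.

```python
def conjunctive_support(items, transactions):
    """Number of transactions that contain every item of the pattern."""
    return sum(1 for t in transactions if items <= t)


def disjunctive_support(items, transactions):
    """Number of transactions that contain at least one item of the pattern."""
    return sum(1 for t in transactions if items & t)


def negative_support(items, transactions):
    """Number of transactions that contain none of the pattern's items."""
    return sum(1 for t in transactions if not (items & t))


# Toy extraction context, assumed for illustration only.
transactions = [{"a", "b"}, {"a"}, {"b", "c"}, {"c"}, {"a", "b", "c"}]
pattern = {"a", "b"}

print(conjunctive_support(pattern, transactions))
print(disjunctive_support(pattern, transactions))
print(negative_support(pattern, transactions))
```

Under these definitions the disjunctive and negative supports of a pattern always add up to the number of transactions, so either one can be derived from the other.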
9.
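An Improved Algorithm for Weighted Association Rules (cited by: 7; self-citations: 2; citations by others: 7)
This paper discusses the weighted association rule problem and proposes an improved algorithm for Boolean weighted association rules. The algorithm first uses an ordinary association rule algorithm to generate the frequent itemsets, and then derives the weighted frequent itemsets from them. It also gives an optimal way to set the minimum support, which guarantees that the frequent itemsets produced by the ordinary algorithm are a superset of the weighted frequent itemsets. The algorithm is efficient and can make effective use of existing association rule algorithms.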
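The two-phase idea in the abstract above can be sketched as follows. The abstract does not give the paper's weighted support definition or its minimum support setting, so this sketch assumes one simple choice: weighted support is the mean item weight times the ordinary support, with weights in (0, 1]. Under that assumption, mining with the ordinary threshold `w_min_support / max_weight` cannot miss any weighted frequent itemset. All names and data are hypothetical, and the brute-force miner stands in for any ordinary frequent itemset algorithm.

```python
from itertools import combinations


def support(itemset, transactions):
    """Ordinary support: fraction of transactions containing the itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Brute-force frequent itemset mining (only suitable for tiny examples)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            s = support(frozenset(combo), transactions)
            if s >= min_support:
                result[frozenset(combo)] = s
    return result


def weighted_frequent_itemsets(transactions, weights, w_min_support):
    """Phase 1: ordinary mining; phase 2: filter by weighted support."""
    max_w = max(weights.values())
    # Assumed definition: weighted support = mean item weight * support.
    # Mean weight <= max_w, so weighted support >= w_min_support implies
    # support >= w_min_support / max_w; phase 1 therefore yields a superset.
    candidates = frequent_itemsets(transactions, w_min_support / max_w)
    result = {}
    for itemset, s in candidates.items():
        mean_w = sum(weights[i] for i in itemset) / len(itemset)
        if mean_w * s >= w_min_support:
            result[itemset] = mean_w * s
    return result


# Toy data, assumed for illustration only.
transactions = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
weights = {"a": 1.0, "b": 0.5, "c": 0.2}

for itemset, ws in weighted_frequent_itemsets(transactions, weights, 0.35).items():
    print(sorted(itemset), ws)
```

The point of the design is reuse: phase 1 can be any existing frequent itemset miner, and only the cheap filtering step in phase 2 needs to know about weights.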
10.
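Redundancy Removal and Clustering of Association Rules (cited by: 9; self-citations: 0; citations by others: 9)
Association rule mining often produces a large number of rules, which makes it very difficult for users to analyze and use them; the problem is especially pronounced when the attributes in the database are highly correlated. To support exploratory analysis, various techniques can be used to reduce the number of rules effectively, such as constrained association rule mining and rule clustering or generalization. This paper proposes ADRR, an algorithm for removing redundant association rules, and ACAR, an algorithm for clustering association rules. Based on properties of sets, it is proved that the mined association rules contain many redundant rules that can be removed, which leads to ADRR. ACAR defines the distance between rules by a new method based on the correlation between items, and clusters the rules following the idea of the DBSCAN algorithm. Finally, the proposed algorithms are implemented, and experimental results show that they are effective, feasible, and efficient.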
11.
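Mining Horizontally Weighted Association Rules Based on the Apriori Algorithm (cited by: 19; self-citations: 2; citations by others: 19)
Association rule mining can discover interesting associations or correlations among itemsets in large amounts of data and has been widely applied in many fields. Many algorithms for discovering association rules have been proposed, all of which assume that every data item is equally important to a rule. In practice, however, users tend to favor the items they find most interesting or most important, so it is necessary to strengthen the influence of those items on the rules while weakening the influence of items that users find less interesting or unimportant. To this end, the paper poses the problem of horizontally weighted association rules and, by improving the Apriori algorithm, gives a solution and an effective algorithm, New_Apriori.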
12.
13.
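A New Weighted Association Rule Model (cited by: 5; self-citations: 3; citations by others: 5)
Association rule mining can discover hidden relationships among itemsets in large amounts of data and has been widely applied in many fields. Many association rule mining algorithms have been proposed, and they generally assume that every data item is equally important. In reality, however, items often differ in importance: decision makers tend to give priority to items with higher profit and to neglect items with lower profit. The paper analyzes problems in the existing literature on weighted association rules, proposes a new weighted association rule model, and gives the MWFI algorithm for effectively mining weighted frequent itemsets.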
14.
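Many real-world databases involve temporal semantics, so adding temporal constraints when mining association rules makes the rules more practically meaningful. However, most temporal association rule mining algorithms proposed so far assume that every data item is equally important, whereas decision makers tend to give priority to items with higher profit. This paper proposes a weighted temporal association rule mining algorithm that uses the life cycle of an item as its temporal feature and allows users to set different item weights. Experimental results show that the algorithm not only discovers weighted temporal association rules effectively but also yields more valuable rules.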
15.
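An Effective Incremental Updating Algorithm for Association Rules (cited by: 8; self-citations: 2; citations by others: 6)
Association rules are an important research topic in data mining. Many algorithms have been proposed for efficiently discovering association rules in large databases, but comparatively little research has addressed updating and maintaining the rules already discovered. This article proposes an incremental updating algorithm for association rules based on the frequent pattern tree, which handles updating the association rules after a new set of transactions is added to the transaction database, and analyzes its performance.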
16.
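A Practical Incremental Updating Algorithm for Association Rules (cited by: 2; self-citations: 0; citations by others: 2)
Association rules are an important research topic in data mining. Many algorithms have been proposed for efficiently discovering association rules in large databases, but comparatively little research has addressed updating and maintaining the rules already discovered. This paper proposes a practical incremental updating algorithm for association rules, which handles updating the association rules after a new set of transactions is added to the transaction database, and analyzes its performance.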
17.
Association rule mining is one of the most popular data mining techniques. However, users often have difficulty interpreting and exploiting the association rules extracted from large, high-dimensional transaction data. The reasons for this difficulty are twofold. First, conventional association rule mining algorithms can produce too many rules, and second, some rules partly overlap. This problem can be addressed if the user can select the relevant items to be used in association rule mining, but the relations among items in large transaction data are often quite complex. In this context, this paper proposes a novel visual exploration tool, the structured association map (SAM), which enables users to find groups of relevant items visually. SAM looks similar to the well-known cluster heat map, but its items are sorted more intelligently so that users can easily find the interesting areas formed by sets of associated items, which are likely to constitute interesting many-to-many association rules. Moreover, this paper introduces an index called S2C, designed to evaluate the quality of a SAM, and explains the SAM-based association analysis procedure in a comprehensive manner. For illustration, this procedure is applied to a mass health examination data set. The experimental results demonstrate that a SAM with a high S2C value significantly reduces the complexity of association analysis and makes it possible to focus on a specific region of the association rule search space while avoiding irrelevant rules.
18.
19.