Similar literature
1.
Much academic research has been conducted on the process of association rule mining. More effort is now required to apply association rules in commercial fields. One potential application is the product assignment problem in retail: how to assign items to sites in a store most effectively so as to grow sales. Effective product assignment facilitates cross-selling and makes shopping more convenient for customers, which helps retailers maximize sales. However, little practical research has addressed the issue. The current study approaches the product assignment problem in retail environments using association rule mining. Several barriers must be overcome to do so. The study applies generalization to overcome the drawbacks caused by the short life cycles of current products. Lift is used as a measure of cross-selling to compare the effectiveness of different product assignments. The proposed algorithm consists of three processes: mining associations among items, nearest-neighbor assignment, and updating assignments. The algorithm was tested on synthetic databases. The results show very effective product assignment in terms of the potential for cross-selling to drive sales for retailers.
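The abstract names lift as its cross-selling measure but does not spell it out. As a minimal sketch, assuming the standard definition lift(A, B) = P(A and B) / (P(A) · P(B)), the value for a pair of items can be computed from a list of baskets as follows. The function name `lift` and the sample baskets are illustrative and are not taken from the paper.

```python
def lift(transactions, a, b):
    """Lift of the item pair (a, b): P(a and b) / (P(a) * P(b)).

    A value above 1 suggests the two items are bought together more often
    than independence would predict; a value below 1 suggests the opposite.
    """
    n = len(transactions)
    count_a = sum(1 for t in transactions if a in t)
    count_b = sum(1 for t in transactions if b in t)
    count_ab = sum(1 for t in transactions if a in t and b in t)
    if count_a == 0 or count_b == 0:
        return 0.0  # lift is undefined here; 0.0 is a convention of this sketch
    return (count_ab / n) / ((count_a / n) * (count_b / n))


# Hypothetical baskets, for illustration only.
baskets = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"milk"},
    {"bread"},
]
bread_butter = lift(baskets, "bread", "butter")
butter_milk = lift(baskets, "butter", "milk")
```

An assignment procedure such as the one summarized above could then favor placing high-lift pairs near each other, although the abstract does not give the exact rule it uses.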

2.
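Taking the quantitative attributes of supermarket data as its object of study, this paper proposes a quantitative association rule algorithm based on fuzzy clustering and subtractive clustering. The basic idea is to incorporate fuzzy clustering into the discretization process so that the data are discretized into reasonable intervals, and then to mine the result with Apriori, the classical Boolean association rule mining algorithm. Experiments show that the method can effectively mine quantitative association rules and increase the likelihood of cross-selling.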

3.
Privacy-preserving association rule mining has recently become an active research area. Two different approaches to the problem exist: perturbation-based and secure multiparty computation-based. One drawback of the perturbation-based approach is that it cannot always fully preserve individuals' privacy while keeping the mining results precise. The secure multiparty computation-based approach works only in distributed environments and needs sophisticated protocols, which constrains its practical use. In this paper, we propose a new approach to preserving privacy in association rule mining. The main idea is to use keyed Bloom filters to represent transactions as well as data items. The proposed approach can fully preserve privacy while maintaining the precision of the mining results. The tradeoff between mining precision and storage requirement is investigated. We also propose a δ-folding technique that further reduces the storage requirement without sacrificing mining precision or running time.
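The abstract does not describe how its keyed Bloom filters are built. The sketch below is therefore only a generic illustration of the idea, assuming that the bit positions are derived from an HMAC-SHA256 of each item under a secret key, so that a party without the key cannot recompute which bits an item sets. The class name, parameters, and sizes are hypothetical, and the δ-folding technique is not covered.

```python
import hashlib
import hmac


class KeyedBloomFilter:
    """Bloom filter whose hash positions depend on a secret key (illustrative)."""

    def __init__(self, key: bytes, num_bits: int = 256, num_hashes: int = 3):
        self.key = key
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # the bit array, stored as a Python integer

    def _positions(self, item: str):
        # One HMAC per hash function; the index i separates the hash functions.
        for i in range(self.num_hashes):
            digest = hmac.new(self.key, f"{i}:{item}".encode(), hashlib.sha256).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True may be a false positive.
        return all((self.bits >> p) & 1 for p in self._positions(item))


# Represent one transaction as a filter over its items (hypothetical data).
transaction = KeyedBloomFilter(key=b"secret-key")
for item in ("bread", "milk"):
    transaction.add(item)
```

A larger `num_bits` lowers the false-positive rate at the cost of storage, which is the kind of precision versus storage tradeoff the abstract says it investigates.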

4.
Modern production planning and inventory control have developed to handle more practical and more complicated circumstances, such as studying a supply chain instead of a single stock point, or multiple correlated items instead of a single item. This paper discusses how to classify inventory items that are correlated with each other, using the concept of the ‘cross-selling effect’. Historically, ABC classification has been used to aggregate inventory items because the number of items is so large that it is not computationally feasible to set stock and service control guidelines for each one individually. A fundamental principle of ABC classification is to rank all inventory items by a notion of profit based on historical transactions. The difficulty is that the profit of an item comes not only from its own sales but also from its influence on the sales of other items, and vice versa, i.e., the ‘cross-selling effect’. We previously developed a classification approach for inventory items that uses association rules to deal with the ‘cross-selling effect’ and found that it can yield a classification very different from the traditional ABC classification. However, the ‘cross-selling effect’ may be considered in different ways. This paper presents a new view of inventory classification based on the loss rule. The lost profit of an item or itemset with the ‘cross-selling effect’ is discussed and defined as the criterion for evaluating the importance of an item, and on this basis new algorithms for classifying inventory items and for discovering the maximum-profit item selection are presented. A simple example explains the new algorithm, and a large number of empirical experiments, on a real database collected from a Japanese convenience store and on a downloaded benchmark database, evaluate its effectiveness and utility. The results show that the proposed approach gives good insight into the cross-selling effect among items and is applicable to large transaction databases.

5.
In a recent paper, Toloo et al. [Toloo, M., Sohrabi, B., & Nalchigar, S. (2009). A new method for ranking discovered rules from data mining by DEA. Expert Systems with Applications, 36, 8503–8508] proposed a new integrated data envelopment analysis (DEA) model for finding the most efficient association rule in data mining. Using this model, they developed an algorithm for ranking association rules by multiple criteria. In this paper, we show that their model selects one efficient association rule only by chance and depends entirely on the solution method or software used to solve the problem. In addition, we show that their proposed algorithm can rank efficient rules only randomly and fails to rank inefficient decision-making units (DMUs). We also point out some other drawbacks of that paper and propose another approach that establishes a full ranking of the association rules. A numerical example illustrates parts of the paper.

6.
Association rule mining is one of the important data mining activities and has received substantial attention in the literature. It is a computationally and I/O-intensive task. In this paper, we propose an approach for mining optimized fuzzy association rules of different orders. We also propose an approach that uses clustering techniques to define membership functions for all the continuous attributes in a database. Although single-objective genetic algorithms are used extensively, they degrade the solution. In our approach, fuzzy association rules are extracted and optimized together by a multi-objective genetic algorithm whose objectives include fuzzy support, fuzzy confidence, and rule length. The effectiveness of the proposed approach is tested on a computer activity dataset, to analyze the performance of a multiprocessor system, and on network audit data, to detect anomaly-based intrusions. Experiments show that the proposed method is efficient in many scenarios.
V. S. Ananthanarayana

7.
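Given a database, and setting aside support and confidence, can one know in advance how many association rules will ultimately be mined? This is a question worth studying. To answer it, the paper introduces the concept of expected association rules, which turns the question into one of computing the cardinality of the set of expected association rules. Formulas are given for both the Boolean and the quantitative case. For quantitative datasets, the paper discusses the mutual exclusivity that itemset elements exhibit after conversion to Boolean data. This property is used to derive an expansion matrix and an expansion algorithm. The method solves, relatively concisely, the problem of computing the cardinality of the expected association rule set for quantitative datasets. Both calculation and test results show that the total number of expected association rules decreases as the number of mutually exclusive elements increases. These results are helpful for understanding the essence of association rule mining and for developing more efficient mining algorithms.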

8.
In data mining applications, it is important to develop evaluation methods for selecting high-quality and profitable rules. This paper uses a non-parametric approach, Data Envelopment Analysis (DEA), to estimate and rank the efficiency of association rules under multiple criteria. The interestingness of association rules is conventionally measured by support and confidence. For specific applications, domain knowledge can additionally be turned into measures for evaluating the discovered rules. For example, in market basket analysis, the product value and cross-selling profit associated with an association rule can serve as essential measures of rule interestingness. In this paper, these domain measures are also included in the rule ranking procedure for selecting valuable rules for implementation. A market basket analysis example illustrates the DEA-based methodology for measuring the efficiency of association rules under multiple criteria.

10.
Building a high-accuracy classifier is a problem in real applications. One such classifier is based on association rules. Past studies have shown that classification based on association rules (class-association rules, CARs) is more accurate than other rule-based methods such as ILA and C4.5. However, mining CARs takes more time because it mines a complete rule set. Improving the execution time of CAR mining is therefore one of the main problems that this method needs to solve. In this paper, we propose a new method for mining class-association rules. First, we design a tree structure for storing the frequent itemsets of a dataset. We then develop some theorems for pruning nodes and computing information in the tree and, based on these theorems, propose an efficient algorithm for mining CARs. Experimental results show that our approach is more efficient than those used previously.

11.
Association rule mining has contributed to many advances in the area of knowledge discovery. However, the quality of the discovered association rules is a big concern and has drawn more and more attention recently. One problem with the quality of the discovered association rules is the huge size of the extracted rule set. Often for a dataset, a huge number of rules can be extracted, but many of them can be redundant to other rules and thus useless in practice. Mining non-redundant rules is a promising approach to solve this problem. In this paper, we first propose a definition for redundancy, then propose a concise representation, called a Reliable basis, for representing non-redundant association rules. The Reliable basis contains a set of non-redundant rules which are derived using frequent closed itemsets and their generators instead of using frequent itemsets that are usually used by traditional association rule mining approaches. An important contribution of this paper is that we propose to use the certainty factor as the criterion to measure the strength of the discovered association rules. Using this criterion, we can ensure the elimination of as many redundant rules as possible without reducing the inference capacity of the remaining extracted non-redundant rules. We prove that the redundancy elimination, based on the proposed Reliable basis, does not reduce the strength of belief in the extracted rules. We also prove that all association rules, their supports and confidences, can be retrieved from the Reliable basis without accessing the dataset. Therefore the Reliable basis is a lossless representation of association rules. Experimental results show that the proposed Reliable basis can significantly reduce the number of extracted rules. We also conduct experiments on the application of association rules to the area of product recommendation. 
The experimental results show that the non-redundant association rules extracted using the proposed method retain the same inference capacity as the entire rule set. This result indicates that the non-redundant rules alone are sufficient to solve real problems, without needing the entire rule set.
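The abstract uses the certainty factor as its measure of rule strength without giving a formula. A minimal sketch follows, assuming the definition commonly used for association rules X → Y, which compares the rule's confidence with the prior support of the consequent Y. The function name and the sample numbers are illustrative and are not taken from the paper.

```python
def certainty_factor(confidence: float, support_consequent: float) -> float:
    """Certainty factor of a rule X -> Y, in the range [-1, 1].

    confidence          -- conf(X -> Y), i.e. P(Y | X)
    support_consequent  -- supp(Y), i.e. P(Y)

    Positive values mean X raises the belief in Y, negative values mean X
    lowers it, and 0 means X tells us nothing about Y.
    """
    if confidence > support_consequent:
        return (confidence - support_consequent) / (1.0 - support_consequent)
    if confidence < support_consequent:
        return (confidence - support_consequent) / support_consequent
    return 0.0


# Illustrative values: a rule with 80% confidence whose consequent
# already appears in 50% of all transactions.
cf_positive = certainty_factor(0.8, 0.5)
cf_negative = certainty_factor(0.25, 0.5)
```

Unlike raw confidence, this measure discounts rules whose consequent is frequent anyway, which is one reason it is used to judge whether a rule carries real inferential strength.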

12.
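Application of association rule analysis to cross-selling in telecommunications   Cited by: 3 (self-citations: 0, citations by others: 3)
After describing the problems that telecom operators face in market competition and marketing, and taking the characteristics of telecom enterprises into account, the paper analyzes why a cross-selling strategy is necessary in this industry's marketing and applies association rule analysis from data mining to cross-selling analysis. It describes the principle of the Apriori algorithm in detail, uses the algorithm to analyze telecom service data, and presents the associations found among services, giving enterprises strong data support for implementing cross-selling.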
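The Apriori principle mentioned in this abstract can be sketched in a few lines: frequent itemsets are grown level by level, and a candidate is kept only if all of its smaller subsets are already frequent. The code below is a minimal textbook-style sketch and not the authors' implementation; the service names in the sample data are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset whose count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical subscriber records: the set of services each customer uses.
subscriptions = [
    {"voice", "sms", "data"},
    {"voice", "data"},
    {"voice", "sms"},
    {"data", "ringtone"},
]
frequent_services = apriori(subscriptions, min_support=2)
```

Service pairs that come out frequent, such as voice with data in this toy input, are the kind of association a cross-selling campaign could start from.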

13.
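A matrix-based frequent itemset mining algorithm   Cited by: 9 (self-citations: 3, citations by others: 6)
Mining frequent itemsets efficiently is the main problem in association rule mining. Drawing on set theory and matrix theory, this paper proposes a matrix-based frequent itemset mining algorithm. The algorithm scans the database only once, converting all transactions into rows of a matrix and all items and itemsets into its columns. Operating on the matrix produces all frequent itemsets in one pass, and the database need not be rescanned when the support threshold changes. Experimental results show that the algorithm mines more efficiently than Apriori.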
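The abstract gives only the outline of the matrix method. The sketch below illustrates the general idea under stated assumptions: a single pass builds one 0/1 column per item (stored here as an integer bitmask over the rows), and the support of an itemset is the number of 1 bits in the bitwise AND of its columns. The function names and data are hypothetical, and the paper's actual matrix operations may differ.

```python
from itertools import combinations


def build_columns(transactions):
    """Single scan: bit r of columns[item] is 1 if transaction r contains item."""
    columns = {}
    for row, t in enumerate(transactions):
        for item in t:
            columns[item] = columns.get(item, 0) | (1 << row)
    return columns


def support(columns, itemset):
    """Support count of a non-empty itemset: popcount of the ANDed columns."""
    cols = [columns.get(item, 0) for item in itemset]
    mask = cols[0]
    for c in cols[1:]:
        mask &= c
    return bin(mask).count("1")


def frequent_itemsets(columns, min_support):
    """Enumerate itemsets by size, reusing the columns without rescanning."""
    items = sorted(columns)
    result = {}
    for k in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, k):
            s = support(columns, combo)
            if s >= min_support:
                result[combo] = s
                found = True
        if not found:
            break  # no frequent k-itemset means no larger one can be frequent
    return result


rows = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b"]]
cols = build_columns(rows)                  # the only pass over the data
at_two = frequent_itemsets(cols, min_support=2)
at_three = frequent_itemsets(cols, min_support=3)  # new threshold, same columns
```

Changing the threshold only repeats the column arithmetic, which mirrors the abstract's claim that no rescan is needed. This sketch enumerates candidates naively, so it shows the data layout rather than the efficiency of the published algorithm.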

14.
In recent years, data mining has become one of the most popular techniques that data owners use to determine their strategies. Association rule mining is a data mining approach that is widely used on traditional databases, usually to find positive association rules. However, there are other challenging rule mining topics, such as data stream mining and negative association rule mining. In addition, organizations want to concentrate on their own business and outsource the rest of their work. This approach, known as the “database as a service” concept, offers many benefits to the data owner but at the same time raises some security problems. In this paper, a rule mining system is proposed that provides an efficient and secure solution for computing positive and negative association rules on XML data streams under the database as a service concept. The system has been implemented, and several experiments with different synthetic data sets demonstrate its performance and efficiency.

15.
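Association rule mining has been a research hotspot in recent years, but its classical algorithm, Apriori, scans the database many times and generates a large number of candidate itemsets, which severely limits mining efficiency. To address this, a weighted association rule mining algorithm based on matrix compression is proposed. It scans the database only once and converts it into a 0-1 matrix, then compresses the matrix according to relevant properties, which reduces the computation required during execution. To account for the importance of items, it also applies weighting, setting the weights of item attributes by computing probabilities. Compared with Apriori, the algorithm can find high-order frequent itemsets directly during mining. Experimental results show that it effectively improves the efficiency of association rule mining.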

16.
A novel multi-objective genetic algorithm (GA)-based rule-mining method for affective product design is proposed to discover a set of rules relating design attributes with customer evaluation based on survey data. The proposed method can generate approximate rules to consider the ambiguity of customer assessments. The generated rules can be used to determine the lower and upper limits of the affective effect of design patterns. For a rule-mining problem, the proposed multi-objective GA approach could simultaneously consider the accuracy, comprehensibility, and definability of approximate rules. In addition, the proposed approach can deal with categorical attributes and quantitative attributes, and determine the interval of quantitative attributes. Categorical and quantitative attributes in affective product design should be considered because they are commonly used to define the design profile of a product. In this paper, a two-stage rule-mining approach is proposed to generate rules with a simple chromosome design in the first stage of rule mining. In the second stage of rule mining, entire rule sets are refined to determine solutions considering rule interaction. A case study on mobile phones is used to demonstrate and validate the performance of the proposed rule-mining method. The method can discover rule sets with good support and coverage rates from the survey data.

17.
This study proposes a new approach to identifying core technologies from the perspective of technological cross-impacts, based on patent co-classification information and taking into account the overall interrelationships among technologies. The proposed approach combines two methods: association rule mining (ARM) and the analytic network process (ANP). First, ARM is employed to calculate the technological cross-impact indexes. Because the confidence measure in ARM is defined as a conditional probability between two technologies, it is adopted as the index for evaluating technological cross-impacts. The technological cross-impact matrix is then constructed from all the calculated cross-impact indexes. Second, the ANP, which is a generalization of the analytic hierarchy process (AHP), is applied to produce priorities of technologies that reflect both their direct and their indirect impacts. The proposed approach can be used for technology monitoring, both in firms' technology planning and in governments' innovation policy making. A case from telecommunication technology illustrates the proposed approach.
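The first step described above, using ARM confidence as a cross-impact index, can be sketched as follows. The code assumes that each patent is represented by the set of classification codes assigned to it and that entry (i, j) of the matrix is conf(i → j) = P(j | i). Setting the diagonal to zero is an assumption of this sketch, the sample patents are invented, and the ANP step is not shown.

```python
def cross_impact_matrix(patents, classes):
    """matrix[i][j] = conf(classes[i] -> classes[j]) = P(classes[j] | classes[i])."""
    counts = {c: sum(1 for p in patents if c in p) for c in classes}
    matrix = []
    for a in classes:
        row = []
        for b in classes:
            if a == b or counts[a] == 0:
                row.append(0.0)  # diagonal and unseen classes: assumed 0 here
            else:
                both = sum(1 for p in patents if a in p and b in p)
                row.append(both / counts[a])
        matrix.append(row)
    return matrix


# Hypothetical co-classification data: the codes assigned to each patent.
patents = [
    {"H04L", "H04W"},
    {"H04L"},
    {"H04W", "G06F"},
    {"H04L", "H04W", "G06F"},
]
classes = ["H04L", "H04W", "G06F"]
impact = cross_impact_matrix(patents, classes)
```

The matrix is generally asymmetric because P(j | i) and P(i | j) differ, which lets it express that one technology depends on another more than the reverse.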

18.
In the field of data mining, frequent itemset discovery is an important issue in association rule generation and the key factor in implementing association rule mining. This study therefore considers user-assigned constraints in the mining process. Constraint-based mining lets users concentrate on mining the itemsets that interest them, which improves the efficiency of mining tasks. In addition, in the real world, users may prefer to record more than one attribute and to set multi-dimensional constraints. Thus, this study intends to solve the multi-dimensional constraints problem for association rule generation. The ant colony system (ACS) is one of the newest meta-heuristics for combinatorial optimization problems, and this study uses it to mine a large database and find association rules effectively. If the system can take multi-dimensional constraints into account, the association rules will be generated more effectively. Therefore, this study proposes a novel approach that applies the ant colony system to extract association rules from the database while taking multi-dimensional constraints into account. Results on a real case, the National Health Insurance Research Database, show that the proposed method provides more condensed rules than the Apriori method. The computational time is also reduced.

19.
A hybrid multi-group approach for privacy-preserving data mining   Cited by: 6 (self-citations: 6, citations by others: 0)
In this paper, we propose a hybrid multi-group approach for privacy-preserving data mining. We make two contributions. First, we propose a hybrid approach. Previous work has used either the randomization approach or the secure multi-party computation (SMC) approach. However, these two approaches have complementary features: the randomization approach is much more efficient but less accurate, while the SMC approach is less efficient but more accurate. We propose a novel hybrid approach that takes advantage of the strengths of both to balance the accuracy and efficiency constraints. Compared with the two existing approaches, our approach achieves much better accuracy than the randomization approach and much lower computation cost than the SMC approach. Second, we propose a multi-group scheme that gives the data miner flexible control over the balance between data mining accuracy and privacy. This scheme is motivated by the fact that existing randomization schemes, which randomize data at the level of individual attributes, can produce insufficient accuracy when the number of dimensions is high. We partition attributes into groups and develop a scheme for group-based randomization that achieves better data mining accuracy. To demonstrate the effectiveness of the proposed general schemes, we have implemented them for the ID3 decision tree algorithm and for the association rule mining problem, and we present experimental results.
Wenliang Du

20.
Most incremental mining and online mining algorithms concentrate on finding association rules or patterns that are consistent with the entire current set of data. Users cannot easily obtain results from only the portion of the data that interests them. This may hinder the use of mining in online decision support for multidimensional data. To provide ad hoc, query-driven, online mining support, we first propose a relation called the multidimensional pattern relation, which stores context and mining information structurally and systematically for later analysis. Each tuple in the relation comes from a dataset inserted into the database. We then develop an online mining approach called three-phase online association rule mining (TOARM), based on the proposed multidimensional pattern relation, to support online generation of association rules under multidimensional considerations. The TOARM approach consists of three phases, during which the final sets of patterns satisfying various mining requests are found. It first selects and integrates related mining information in the multidimensional pattern relation and then, if necessary, re-processes itemsets without sufficient information against the underlying datasets. Some implementation considerations for the algorithm are also described in detail. Experiments on homogeneous and heterogeneous datasets show the effectiveness of the proposed approach.
