1.

Mining interestingness measures for string pattern mining

M. Baena-García R. Morales-Bueno 《Knowledge》2012,25(1):45-50

2.

Ranking efficient decision-making units in data envelopment analysis using fuzzy concept

Majid Zerafat Angiz L. Adli Mustafa Ali Emrouznejad 《Computers & Industrial Engineering》2010

This paper introduces a new mathematical method for improving the discrimination power of data envelopment analysis and for completely ranking the efficient decision-making units (DMUs). The method uses fuzzy concepts. First, all DMUs are evaluated with the CCR model. The resulting weights for each output are then treated as fuzzy sets and converted to fuzzy numbers. The introduced model is a multi-objective linear model whose endpoints are the highest and lowest of the weighted values. An added advantage of the model is its ability to handle the infeasibility sometimes encountered by previously introduced models.
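As a rough illustration of the CCR evaluation step mentioned in this abstract, the sketch below covers only the special case of one input and one output, where the CCR score of a DMU reduces to its output/input ratio divided by the best ratio in the sample. The DMU names and figures are invented, and the paper's fuzzy ranking model is not reproduced here.

```python
def ccr_efficiency_single(dmus):
    """CCR efficiency for the one-input, one-output case.

    dmus maps a DMU name to (input, output). With a single input and a
    single output, the CCR score equals the DMU's productivity ratio
    divided by the best ratio observed in the sample.
    """
    ratios = {name: out / inp for name, (inp, out) in dmus.items()}
    best = max(ratios.values())
    return {name: r / best for name, r in ratios.items()}


# Hypothetical data: (input, output) per DMU.
dmus = {"A": (2.0, 4.0), "B": (4.0, 8.0), "C": (5.0, 5.0)}
scores = ccr_efficiency_single(dmus)

# DMUs whose score is 1 are CCR-efficient.
efficient = sorted(name for name, s in scores.items() if abs(s - 1.0) < 1e-9)
```

In this toy sample both A and B score 1, so the CCR model alone cannot separate them, which is the kind of tie that a complete-ranking method is meant to break.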

3.

Assessment of data quality in accounting data with association rules

《Expert systems with applications》2014,41(5):2259-2268

Business rules are an effective way to control data quality. Business experts can directly enter the rules into appropriate software without error-prone communication with programmers. However, not all business situations and possible data quality problems can be considered in advance. In situations where business rules have not been defined yet, patterns of data handling may arise in practice. We apply data mining to accounting transactions in order to discover such patterns. The discovered patterns are represented in the form of association rules. Then, deviations from discovered patterns can be marked as potential data quality violations that need to be examined by humans. Data quality breaches can be expensive, but manual examination of many transactions is also expensive. Therefore, the goal is to find a balance between marking too many and too few transactions as being potentially erroneous. We apply appropriate procedures to evaluate the classification accuracy of developed association rules and support the decision on the number of deviations to be manually examined based on economic principles.

4.

Railway station site selection using analytical hierarchy process and data envelopment analysis

Nahid Mohajeri Gholam R. Amin 《Computers & Industrial Engineering》2010

This paper deals with the problem of finding the optimum site for a railway station for the city of Mashhad, northeast Iran, using the methods of analytical hierarchy process (AHP) and data envelopment analysis (DEA). The paper identifies a four-level hierarchy model for the railway station site-selection problem. The model uses four main criteria: (1) rail-related, (2) passenger services, (3) architecture and urbanism, and (4) economics. In addition, there are 26 subcriteria as well as five (potential) candidates or alternatives. Comparison matrices are used to obtain the local weights and priorities of the railway-station candidates. A DEA model is proposed to determine the optimum site for a railway station. It is shown that the local priorities (or weights) obtained from the AHP can be defined as the multiple outputs of a DEA model for finding the best site for a railway station.
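To make the step from comparison matrices to local weights concrete, here is a minimal sketch. The abstract does not say how the authors derive the weights, so the row geometric mean method is used as an assumption; it is a common approximation of the principal-eigenvector weights and is exact for a perfectly consistent matrix. The 3x3 matrix is invented.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric mean method."""
    n = len(matrix)
    # Geometric mean of each row of the pairwise comparison matrix.
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    # Normalize so that the weights sum to 1.
    return [g / total for g in geo_means]


# Hypothetical, perfectly consistent comparison of three candidate sites:
# site 1 is twice as preferable as site 2 and four times as preferable as site 3.
comparison = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(comparison)
```

For this matrix the weights come out in the ratio 4 : 2 : 1. In the approach described above, such local priorities would then serve as outputs of a DEA model.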

5.

Novel measurement for mining effective association rules

Jin-Mao Wei Wei-Guo Yi Ming-Yang Wang 《Knowledge》2006,19(8):739-743

Association rule mining is widely studied in the data mining community. In this paper, we analyze the support–confidence framework for mining association rules and find that it tends to mine many redundant or unrelated rules besides the interesting ones. To improve the criterion, we propose a new measure, match, as a substitute for confidence. We analyze in detail the properties of the proposed measure. Experimental results show that the rules generated by the improved method reveal a high correlation between the antecedent and the consequent when compared with those produced by the support–confidence framework. Furthermore, the improved method decreases the generation of redundant rules.
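The support–confidence framework criticized in this abstract can be sketched in a few lines. The transactions below are invented, and the paper's match measure is not defined in the abstract, so it is not implemented; lift, a standard measure, is used instead to show how a rule can pass a confidence threshold while its two sides are negatively related.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item of the itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical market-basket data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

s = support(transactions, {"bread", "milk"})
c = confidence(transactions, {"bread"}, {"milk"})
# Lift below 1 means the antecedent makes the consequent less likely.
lift = c / support(transactions, {"milk"})
```

Here the rule bread => milk has support 0.5 and confidence about 0.67, yet milk appears in 75% of all transactions, so the lift is below 1. A rule like this is accepted by a support–confidence filter even though its antecedent and consequent are not positively correlated.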

6.

A data mining approach to database compression

Chin-Feng Lee S. Wesley Changchien Wei-Tse Wang Jau-Ji Shen 《Information Systems Frontiers》2006,8(3):147-161

Data mining can extract valuable information from databases to assist a business in knowledge discovery and in improving business intelligence. Databases store large amounts of structured data, and the amount of data keeps increasing due to advanced database technology and the extensive use of information systems. Despite the falling price of storage devices, it is still important to develop efficient techniques for database compression. This paper develops a database compression method that eliminates redundant data, which often exist in transaction databases. The proposed approach uses a data mining structure to extract association rules from a database. Redundant data are then replaced by means of compression rules. A heuristic method is designed to resolve conflicts among the compression rules. To demonstrate its efficiency and effectiveness, the proposed approach is compared with two other database compression methods.

Chin-Feng Lee is an associate professor with the Department of Information Management at Chaoyang University of Technology, Taiwan, R.O.C. She received her M.S. and Ph.D. degrees in 1994 and 1998, respectively, from the Department of Computer Science and Information Engineering at National Chung Cheng University. Her current research interests include database design, image processing and data mining techniques.

S. Wesley Changchien is a professor with the Institute of Electronic Commerce at National Chung-Hsing University, Taiwan, R.O.C. He received a BS degree in Mechanical Engineering (1989) and completed his MS (1993) and Ph.D. (1996) degrees in Industrial Engineering at State University of New York at Buffalo, USA. His current research interests include electronic commerce, internet/database marketing, knowledge management, data mining, and decision support systems.

Jau-Ji Shen received his Ph.D. degree in Information Engineering and Computer Science from National Taiwan University at Taipei, Taiwan in 1988. From 1988 to 1994, he was the leader of the software group in Institute of Aeronautic, Chung-Sung Institute of Science and Technology. He is currently an associate professor in the information management department at the National Chung Hsing University at Taichung. His research areas focus on digital multimedia, database and information security. His current research areas focus on data engineering, database techniques and information security.

Wei-Tse Wang received the B.A. (2001) and M.B.A (2003) degrees in Information Management at Chaoyang University of Technology, Taiwan, R.O.C. His research interests include data mining, XML, and database compression.

7.

Alternative mixed integer linear programming models for identifying the most efficient decision making unit in data envelopment analysis

Ying-Ming Wang Peng Jiang 《Computers & Industrial Engineering》2012

A mixed integer linear model for selecting the best decision making unit (DMU) in data envelopment analysis (DEA) has recently been proposed by Foroughi [Foroughi, A. A. (2011a). A new mixed integer linear model for selecting the best decision making units in data envelopment analysis. Computers and Industrial Engineering, 60(4), 550–554], which involves many unnecessary constraints and requires specifying an assurance region (AR) for input weights and output weights, respectively. Its selection of the best DMU is easily affected by outliers and may sometimes be incorrect. To avoid these drawbacks, this paper proposes three alternative mixed integer linear programming (MILP) models for identifying the most efficient DMU under different returns to scale, which contain only essential constraints and decision variables and are much simpler and more succinct than Foroughi’s. The proposed alternative MILP models can make full use of input and output information without the need to specify any assurance regions for input and output weights to avoid zero weights, can make correct selections without being affected by outliers, and are of significant importance to decision makers whose concern is not DMU ranking but the correct selection of the most efficient DMU. The potential applications of the proposed alternative MILP models and their effectiveness are illustrated with four numerical examples.

8.

Ranking alternatives in a preferential voting system using fuzzy concepts and data envelopment analysis

M. Zerafat Angiz A. Tajaddini A. Mustafa M. Jalal Kamali 《Computers & Industrial Engineering》2012

In this paper, a new method for aggregating the opinions of experts in a preferential voting system is proposed. The method, which uses fuzzy concepts in handling crisp data, is computationally efficient and is able to completely rank the alternatives. Through this method, the votes that each alternative receives for each rank position are first grouped together to form fuzzy numbers. The concept of the nearest point to a fuzzy number is then used to introduce an artificial ideal alternative. Data envelopment analysis is next used to find the efficiency scores of the alternatives in a pair-wise comparison with the artificial ideal alternative. Alternatives are ranked based on these efficiency scores. If the alternatives are not completely ranked, a weight restriction method also based on fuzzy concepts is used on the un-discriminated alternatives until they are completely ranked. Two examples are given to illustrate the method.

9.

Online mining of fuzzy multidimensional weighted association rules

Mehmet Kaya Reda Alhajj 《Applied Intelligence》2008,29(1):13-34

This paper addresses the integration of fuzziness with On-Line Analytical Processing (OLAP) based association rules mining. It contributes to the ongoing research on multidimensional online association rules mining by proposing a general architecture that utilizes a fuzzy data cube for knowledge discovery. A data cube is mainly constructed to provide users with the flexibility to view data from different perspectives as some dimensions of the cube contain multiple levels of abstraction. The first step of the process described in this paper involves introducing a fuzzy data cube as a remedy to the problem of handling quantitative values of dimensional attributes in a cube. This facilitates the online mining of fuzzy association rules at different levels within the constructed fuzzy data cube. Then, we investigate combining the concepts of weight and multiple-level to mine fuzzy weighted multi-cross-level association rules from the constructed fuzzy data cube. For this purpose, three different methods are introduced for single dimension, multidimensional and hybrid (integrates the other two methods) fuzzy weighted association rules mining. Each of the three methods utilizes a fuzzy data cube constructed to suit the particular method. To the best of our knowledge, this is the first effort in this direction. We compared the proposed approach to an existing approach that does not utilize fuzziness. Experimental results obtained for each of the three methods on a synthetic dataset and on the adult data of the United States census in the year 2000 demonstrate the effectiveness and applicability of the proposed fuzzy OLAP based mining approach. OLAP is one of the most popular tools for on-line, fast and effective multidimensional data analysis. In the OLAP framework, data is mainly stored in data hypercubes (simply called cubes).

10.

Meta-association rules for mining interesting associations in multiple datasets

《Applied Soft Computing》2016

Association rules have been widely used in many application areas to extract new and useful information, expressed in a comprehensible way for decision makers, from raw data. However, raw data may not always be available; it can be distributed across multiple datasets, and the resulting number of association rules to be inspected is therefore overwhelming. In the light of these observations, we propose meta-association rules, a new framework for mining association rules over previously discovered rules in multiple databases. Meta-association rules are a new tool that conveys new information from the patterns extracted from multiple datasets and gives a “summarized” representation of the most frequent patterns. We propose and compare two different algorithms based respectively on crisp rules and fuzzy rules, concluding that fuzzy meta-association rules are suitable for incorporating into the meta-mining procedure the quality assessment provided by the rules in the first step of the process, although this approach consumes more time than the crisp approach. In addition, fuzzy meta-rules give a more manageable set of rules for subsequent analysis and allow the use of fuzzy items to express additional knowledge about the original databases. The proposed framework is illustrated with real-life data about crime incidents in the city of Chicago. Issues such as the difference from traditional approaches are discussed using synthetic data.

11.

Top-down mining of frequent closed patterns from very high dimensional data

Hongyan Liu Xiaoyu Wang 《Information Sciences》2009,179(7):899-127

Frequent pattern mining is an essential theme in data mining. Existing algorithms usually use a bottom-up search strategy. However, for very high dimensional data, this strategy cannot fully utilize the minimum support constraint to prune the rowset search space. In this paper, we propose a new method called top-down mining together with a novel row enumeration tree to make full use of the pruning power of the minimum support constraint. Furthermore, to efficiently check if a rowset is closed, we develop a method called the trace-based method. Based on these methods, an algorithm called TD-Close is designed for mining a complete set of frequent closed patterns. To enhance its performance further, we improve it by using new pruning strategies and new data structures that lead to a new algorithm TTD-Close. Our performance study shows that the top-down strategy is effective in cutting down search space and saving memory space, while the trace-based method facilitates the closeness-checking. As a result, the algorithm TTD-Close outperforms the bottom-up search algorithms such as Carpenter and FPclose in most cases. It also runs faster than TD-Close.
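For readers unfamiliar with the term, the sketch below shows what a frequent closed pattern is, using exhaustive enumeration on a tiny invented dataset. It is deliberately a brute-force definition check and not the top-down row enumeration of TD-Close or TTD-Close, which the abstract does not describe in enough detail to reproduce.

```python
from itertools import combinations


def frequent_closed_patterns(rows, min_support):
    """Brute-force search for frequent closed itemsets.

    A pattern is frequent if at least min_support rows contain it, and
    closed if no proper superset occurs in the same number of rows.
    """
    items = sorted(set().union(*rows))
    counts = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            count = sum(1 for row in rows if set(combo) <= row)
            if count >= min_support:
                counts[frozenset(combo)] = count

    closed = {}
    for pattern, count in counts.items():
        # A superset with equal support is itself frequent, so checking
        # the frequent patterns only is sufficient.
        has_equal_superset = any(
            pattern < other and counts[other] == count for other in counts
        )
        if not has_equal_superset:
            closed[pattern] = count
    return closed


# Hypothetical rows (each row is the set of items it contains).
rows = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}]
closed = frequent_closed_patterns(rows, min_support=2)
```

With a minimum support of 2 rows, {b} and {c} are frequent but not closed, because {a, b} and {a, c} occur in exactly the same rows; the closed patterns are {a}, {a, b} and {a, c}.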

12.

An efficient algorithm for mining temporal high utility itemsets from data streams (cited by: 1 in total; self-citations: 0; citations by others: 1)

Chun-Jung Chu Tyne Liang 《Journal of Systems and Software》2008,81(7):1105-1117

The utility of an itemset is considered to be the value of this itemset, and utility mining aims at identifying the itemsets with high utilities. The temporal high utility itemsets are the itemsets whose support is larger than a pre-specified threshold in the current time window of the data stream. Discovery of temporal high utility itemsets is an important process for mining interesting patterns like association rules from data streams. In this paper, we propose a novel method, namely THUI (Temporal High Utility Itemsets)-Mine, for mining temporal high utility itemsets from data streams efficiently and effectively. To the best of our knowledge, this is the first work on mining temporal high utility itemsets from data streams. The novel contribution of THUI-Mine is that it can effectively identify the temporal high utility itemsets by generating fewer candidate itemsets, such that the execution time can be reduced substantially in mining all high utility itemsets in data streams. In this way, the process of discovering all temporal high utility itemsets under all time windows of data streams can be achieved effectively with less memory space and execution time. This meets the critical requirements on time and space efficiency for mining data streams. Through experimental evaluation, THUI-Mine is shown to significantly outperform other existing methods, such as the Two-Phase algorithm, under various experimental conditions.
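The abstract does not spell out how the value of an itemset is computed. The sketch below assumes the definition commonly used in utility mining: the utility of an itemset in a transaction is the sum of quantity times unit profit over its items, counted only in transactions that contain the whole itemset. The item names, profits and the two-transaction window are invented, and the THUI-Mine candidate generation is not reproduced.

```python
def itemset_utility(transactions, unit_profit, itemset):
    """Total utility of an itemset over a list of transactions.

    Each transaction maps an item to the purchased quantity. Only
    transactions containing every item of the itemset contribute.
    """
    total = 0
    for quantities in transactions:
        if all(item in quantities for item in itemset):
            total += sum(quantities[item] * unit_profit[item] for item in itemset)
    return total


# Hypothetical unit profits and transaction stream (oldest first).
unit_profit = {"pen": 1, "ink": 5, "paper": 2}
stream = [
    {"pen": 3, "ink": 1},
    {"pen": 1, "paper": 4},
    {"pen": 2, "ink": 2, "paper": 1},
]

# Treat the two most recent transactions as the current time window.
window = stream[-2:]
u_all = itemset_utility(stream, unit_profit, {"pen", "ink"})
u_window = itemset_utility(window, unit_profit, {"pen", "ink"})
```

The same itemset has a utility of 20 over the whole stream but only 12 in the current window, which is why results have to be tracked per time window rather than once for the entire stream.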

13.

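A fuzzy association rule mining algorithm based on immune principles

张雷 (Zhang Lei) 李人厚 (Li Renhou) 《控制与决策》 (Control and Decision) 2008,23(8)

An artificial immune algorithm based on immune principles is proposed for mining fuzzy association rules. The algorithm performs its optimization by borrowing the clonal selection principle of the biological immune system. Working directly from the given data, it uses the optimization mechanism to automatically determine the fuzzy sets for each attribute so that the number of derived fuzzy association rules satisfying the conditions is maximized. Its performance is compared with that of related algorithms on real datasets, and the experimental results demonstrate the effectiveness of the proposed algorithm.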

14.

Multi-objective PSO algorithm for mining numerical association rules without a priori discretization

《Expert systems with applications》2014,41(9):4259-4273

In the domain of association rules mining (ARM) discovering the rules for numerical attributes is still a challenging issue. Most of the popular approaches for numerical ARM require a priori data discretization to handle the numerical attributes. Moreover, in the process of discovering relations among data, often more than one objective (quality measure) is required, and in most cases, such objectives include conflicting measures. In such a situation, it is recommended to obtain the optimal trade-off between objectives. This paper deals with the numerical ARM problem using a multi-objective perspective by proposing a multi-objective particle swarm optimization algorithm (i.e., MOPAR) for numerical ARM that discovers numerical association rules (ARs) in only one single step. To identify more efficient ARs, several objectives are defined in the proposed multi-objective optimization approach, including confidence, comprehensibility, and interestingness. Finally, by using the Pareto optimality the best ARs are extracted. To deal with numerical attributes, we use rough values containing lower and upper bounds to show the intervals of attributes. In the experimental section of the paper, we analyze the effect of operators used in this study, compare our method to the most popular evolutionary-based proposals for ARM and present an analysis of the mined ARs. The results show that MOPAR extracts reliable (with confidence values close to 95%), comprehensible, and interesting numerical ARs when attaining the optimal trade-off between confidence, comprehensibility and interestingness.
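The Pareto-optimality step mentioned in this abstract can be illustrated independently of the swarm optimizer. The sketch below filters a set of candidate rules down to the non-dominated ones, assuming all three objectives (confidence, comprehensibility, interestingness) are to be maximized. The rule names and scores are invented, and the particle swarm search of MOPAR itself is not shown.

```python
def dominates(a, b):
    """True if score vector a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return {
        name: scores
        for name, scores in candidates.items()
        if not any(dominates(other, scores) for other in candidates.values())
    }


# Hypothetical rules scored as (confidence, comprehensibility, interestingness).
rules = {
    "r1": (0.95, 0.60, 0.30),
    "r2": (0.90, 0.70, 0.40),
    "r3": (0.85, 0.55, 0.25),
    "r4": (0.95, 0.60, 0.20),
}
front = pareto_front(rules)
```

Rules r3 and r4 are both dominated by r1, while r1 and r2 trade confidence against the other two objectives, so neither dominates the other and both stay on the front.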

15.

Benchmarking of service quality with data envelopment analysis

《Expert systems with applications》2014,41(8):3761-3768

This paper proposes a data envelopment analysis (DEA) approach to the measurement and benchmarking of service quality. Treating the measurement of overall service quality of multiple units with SERVPERF as a multiple-criteria decision-making (MCDM) problem, the proposed approach utilizes DEA, in particular the pure output model without inputs. The five dimensions of SERVPERF are considered as outputs of the DEA model. A case study of auto repair services is provided for the purpose of illustration. The current practice of benchmarking service quality with SERVQUAL/SERVPERF is limited in that there is little guidance on whom to benchmark against and to what degree service quality should be improved. This study contributes to the field of service quality benchmarking by overcoming the above limitations, taking advantage of DEA’s capability to handle MCDM problems and provide benchmarking guidelines.

16.

Common weights data envelopment analysis with uncertain data: A robust optimization approach

H. Omrani 《Computers & Industrial Engineering》2013

One of the primary issues in data envelopment analysis (DEA) models is the reduction of weight flexibility. Several studies determine common weights in DEA, but none of them considers uncertainty in the data. This paper introduces a robust optimization approach to find common weights in DEA with uncertain data. Uncertainty is considered in both inputs and outputs, and a suitable robust counterpart of the DEA model is developed. The proposed robust DEA model is solved and the ideal solution is found for each decision making unit (DMU). Then, the common weights are found for all DMUs by utilizing the goal programming technique. To illustrate the performance of the proposed model, a numerical example is solved. The proposed model is also applied to actual data from provincial gas companies in Iran.

17.

Prioritization of association rules in data mining: Multiple criteria decision approach

Duke Hyun Choi Byeong Seok Ahn Soung Hie Kim 《Expert systems with applications》2005,29(4):203-878

Data mining techniques, which extract patterns from large databases, are processes that focus on the automatic exploration and analysis of large quantities of raw data in order to discover meaningful patterns and rules. In applying these methods, most managers engaged in the business encounter a multitude of rules resulting from the data mining technique. Given their multi-faceted characteristics, such rules are generally characterized by multiple conflicting criteria that are directly related to business values, such as expected monetary value or incremental monetary value.

In the paper, we present a method for rule prioritization that takes into account business values composed of objective metrics or managers’ subjective judgments. The proposed methodology is an attempt to create synergy with decision analysis techniques for solving problems in the domain of data mining. We believe that this approach would be particularly useful for business managers who suffer from rule quality or quantity problems, conflicts between extracted rules, and difficulties in building a consensus when several managers are involved in rule selection.

18.

Association rules on significant rare data using second support

《International Journal of Computer Mathematics》2012,89(1):69-80

Association rule mining is one of the data mining techniques used to discover information that represents associations among data. Some data appear infrequently in the database but are highly associated with specific data. This paper proposes a technique for significant rare data that introduces a second support for discovering the association rules of such data. We show that the proposed approach provides better performance than standard association rule techniques.

19.

An interactive application of data envelopment analysis in microcomputers

Yih-Long Chang Toshiyuki Sueyoshi 《Computational Economics》1991,4(1):51-64

This article describes a general-purpose microcomputer code for data envelopment analysis (DEA) that incorporates four different DEA models in the form of a user-friendly, menu-driven structure. Research financially supported by Dean's Professorship, College of Business, the Ohio State University.

20.

Finding association rules in semantic web data

Victoria Nebot Rafael Berlanga 《Knowledge》2012,25(1):51-62

The number of ontologies and semantic annotations available on the Web is constantly growing. This new type of complex and heterogeneous graph-structured data raises new challenges for the data mining community. In this paper, we present a novel method for mining association rules from semantic instance data repositories expressed in RDF/(S) and OWL. We take advantage of the schema-level (i.e. Tbox) knowledge encoded in the ontology to derive appropriate transactions, which will later feed traditional association rule algorithms. This process is guided by the analyst's requirements, expressed in the form of query patterns. Initial experiments performed on semantic data of a biomedical application show the usefulness and efficiency of the approach.