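  Paid full text   17 articles
  Free   0 articles
Industrial technology   17 articles
  2018   1 article
  2013   3 articles
  2012   1 article
  2011   3 articles
  2010   1 article
  2009   3 articles
  1998   1 article
  1997   2 articles
  1996   1 article
  1994   1 article
Sort order: 17 results found in total; search time 0 ms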
1.
Designing Templates for Mining Association Rules   Total citations: 3, self-citations: 0, citations by others: 3
Current approaches to data mining usually address specific user requests, while no general design criteria for the extraction of association rules are available for the end-user. In this paper, we propose a classification of association rule types, which provides a general framework for the design of association rule mining applications. Based on the identified association rule types, we introduce predefined templates as a means to capture the user specification of mining applications. Furthermore, we propose a general language to design templates for the extraction of arbitrary association rule types.
2.
Internet measurement data collected via passive measurement are analyzed to obtain localization information on nodes by clustering (i.e., grouping together) nodes that exhibit similar network path properties. Since traditional clustering algorithms fail to correctly identify clusters of homogeneous nodes, we propose NetCluster, a novel framework suited to analyzing Internet measurement datasets. We show that the proposed framework correctly analyzes synthetically generated traces. Finally, we apply it to real traces collected at the access link of the Politecnico di Torino campus LAN and discuss the network characteristics as seen at the vantage point.
3.
The lack of tools for rule generation, analysis, and run-time monitoring appears to be one of the main obstacles to the widespread adoption of active database applications. This paper describes a complete tool environment for assisting the design of active rule applications; the tools were developed at Politecnico di Milano in the context of the IDEA Project, a four-year Esprit project sponsored by the European Commission which was launched in June 1992. We describe tools for active rule generation, analysis, debugging, and browsing; rules are defined in Chimera, a conceptual design model and language for the specification of active rule applications. We also introduce a tool for mapping from Chimera into Oracle, a relational product supporting triggers. Most of the tools described in this paper are fully implemented and currently in operation (beta-testing) within the companies participating in the IDEA Project, with the exception of two of them (called Argonaut-V and Pandora), which will be completed by the end of 1996. Research presented in this paper is supported by Esprit project P6333 IDEA, and by ENEL contract VDS 1/94: Integrity Constraint Management.
4.
Associative classification is characterized by accurate models and high model generation time. Most time is spent in extracting and postprocessing a large set of irrelevant rules, which are eventually pruned. We propose I‐prune, an item‐pruning approach that selects uninteresting items by means of an interestingness measure and prunes them as soon as they are detected. Thus, the number of extracted rules is reduced and model generation time decreases correspondingly. A wide set of experiments on real and synthetic data sets has been performed to evaluate I‐prune and select the appropriate interestingness measure. The experimental results show that I‐prune allows a significant reduction in model generation time, while increasing (or at worst preserving) model accuracy. Experimental evaluation also points to the chi‐square measure as the most effective interestingness measure for item pruning. © 2012 Wiley Periodicals, Inc.
5.
Cagliero, Luca; Garza, Paolo; Kavoosifar, Mohammad Reza; Baralis, Elena. Scientometrics, 2018, 116(2): 1273-1301

Identifying the most relevant scientific publications on a given topic is a well-known research problem. The Author-Topic Model (ATM) is a generative model that represents the relationships between research topics and publication authors. It allows us to identify the most influential authors on a particular topic. However, since most research works are co-authored by many researchers, the information provided by ATM can be complemented by the study of the most fruitful collaborations among multiple authors. This paper addresses the discovery of research collaborations among multiple authors on single or multiple topics. Specifically, it exploits an exploratory data mining technique, i.e., weighted association rule mining, to analyze publication data and to discover correlations between ATM topics and combinations of authors. The mined rules characterize groups of researchers with fairly high scientific productivity by indicating (1) the research topics covered by their most cited publications and the relevance of their scientific production separately for each topic, (2) the nature of the collaboration (topic-specific or cross-topic), (3) the names of the external authors who have (occasionally) collaborated with the group either on a specific topic or on multiple topics, and (4) the underlying correlations between the addressed topics. The applicability of the proposed approach was validated on real data acquired from the Online Mendelian Inheritance in Man catalog of genetic disorders and from the PubMed digital library. The results confirm the effectiveness of the proposed strategy.

6.
Nowadays, wireless sensor networks are being used for a fast-growing number of different application fields (e.g., habitat monitoring, highway traffic monitoring, remote surveillance). Monitoring (i.e., querying) the sensor network entails the frequent acquisition of measurements from all sensors. Since sensor data acquisition and communication are the main sources of power consumption and sensors are battery-powered, an important issue in this context is energy saving during data collection. Hence, the challenge is to extend sensor lifetime by reducing communication cost and computation energy. This paper thoroughly describes the complete design, implementation and validation of the SeReNe framework. Given historical sensor readings, SeReNe discovers energy-saving models to efficiently acquire sensor network data. SeReNe exploits different clustering algorithms to discover spatial and temporal correlations which allow the identification of sets of correlated sensors and sensor data streams. Given clusters of correlated sensors, a subset of representative sensors is selected. Rather than directly querying all network nodes, only the representative sensors are queried, thereby reducing the communication, computation and power costs. Experiments performed on both a real sensor network deployed at the Politecnico di Torino labs and a publicly available dataset from the Intel Berkeley Research lab demonstrate the adaptability and the effectiveness of the SeReNe framework in providing energy-saving sensor network models.
7.
This paper presents the IMine index, a general and compact structure which provides tight integration of itemset extraction in a relational DBMS. Since no constraint is enforced during the index creation phase, IMine provides a complete representation of the original database. To reduce the I/O cost, data accessed together during the same extraction phase are clustered on the same disk block. The IMine index structure can be efficiently exploited by different itemset extraction algorithms. In particular, IMine data access methods currently support the FP-growth and LCM v.2 algorithms, but they can straightforwardly support the enforcement of various constraint categories. The IMine index has been integrated into the PostgreSQL DBMS and exploits its physical level access methods. Experiments, run for both sparse and dense data distributions, show the efficiency of the proposed index and its linear scalability even for large datasets. Itemset mining supported by the IMine index shows performance always comparable with, and sometimes better than, state-of-the-art algorithms accessing data on flat files.
8.
Sentence-based multi-document summarization is the task of generating a succinct summary of a document collection, which consists of the most salient document sentences. In recent years, the increasing availability of semantics-based models (e.g., ontologies and taxonomies) has prompted researchers to investigate their usefulness for improving summarizer performance. However, semantics-based document analysis is often applied as a preprocessing step, rather than integrating the discovered knowledge into the summarization process. This paper proposes a novel summarizer, namely Yago-based Summarizer, that relies on an ontology-based evaluation and selection of the document sentences. To capture the actual meaning and context of the document sentences and generate sound document summaries, an established entity recognition and disambiguation step based on the Yago ontology is integrated into the summarization process. The experimental results, which were achieved on the DUC’04 benchmark collections, demonstrate the effectiveness of the proposed approach compared to a large number of competitors as well as the qualitative soundness of the generated summaries.
9.
The analysis of medical data is a challenging task for health care systems since a huge amount of interesting knowledge can be automatically mined to effectively support both physicians and health care organizations. This paper proposes a data analysis framework based on a multiple-level clustering technique to identify the examination pathways commonly followed by patients with a given disease. This knowledge can support health care organizations in evaluating the medical treatments usually adopted, and thus the incurred costs. The proposed multiple-level strategy allows clustering patient examination datasets with a variable distribution. To measure the relevance of specific examinations for a given disease complication, patient examination data has been represented in the Vector Space Model using the TF-IDF method. As a case study, the proposed approach has been applied to the diabetic care scenario. The experimental validation, performed on a real collection of diabetic patients, demonstrates the effectiveness of the approach in identifying groups of patients with a similar examination history and increasing severity in diabetes complications.
10.
Many real‐life databases are updated by means of incoming business information. In these databases (e.g., transactional data from large retail chains, call‐detail records), the content evolves through periodical insertions (or deletions) of data blocks. Since data evolve over time, algorithms have to be devised to incrementally update data mining models. This paper presents a novel index, called I‐Forest, to support itemset mining on incoming data blocks, where new blocks are inserted periodically, or old blocks are discarded. The I‐Forest structure provides a complete data representation and allows different kinds of analyses (e.g., investigating quarterly data), besides supporting user‐defined time and support constraints. The I‐Forest index has been implemented in the PostgreSQL open source DBMS and exploits its physical level access methods. Experiments, run for both sparse and dense data distributions, show the effectiveness of the I‐Forest‐based approach to perform itemset mining with both time and support constraints. The execution time of the I‐Forest‐based itemset mining technique is often lower than that of the Prefix‐Tree algorithm accessing static data on flat files. © 2010 Wiley Periodicals, Inc.
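
The idea in entry 10 of mining itemsets over periodically inserted and discarded data blocks, under a time constraint and a support constraint, can be illustrated with a small sketch. This is not the I‐Forest index and says nothing about how it is stored in PostgreSQL; it is a naive in-memory counter written only to show what the two constraints mean. The function name `frequent_itemsets`, the integer period labels, and the sample transactions are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(blocks, start, end, min_support, max_size=2):
    """Naive itemset counting over data blocks (illustration only, not I-Forest).

    blocks: dict mapping a period label (assumed here to be an int, e.g. a
            quarter number) to a list of transactions; each transaction is a
            set of items.
    start, end: time constraint, only blocks with start <= period <= end are used.
    min_support: support constraint, the minimum fraction of the selected
                 transactions that must contain the itemset.
    max_size: largest itemset size enumerated (keeps the brute force small).
    """
    selected = [
        tx
        for period, transactions in blocks.items()
        if start <= period <= end
        for tx in transactions
    ]
    if not selected:
        return {}

    counts = Counter()
    for tx in selected:
        items = sorted(tx)  # canonical order so each itemset has one key
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))

    total = len(selected)
    return {
        itemset: count / total
        for itemset, count in counts.items()
        if count / total >= min_support
    }


# Hypothetical data: one block of transactions per period.
blocks = {
    1: [{"bread", "milk"}, {"bread"}],
    2: [{"bread", "milk"}, {"milk", "eggs"}],
}
blocks[3] = [{"bread", "milk", "eggs"}]  # a new block arrives
del blocks[1]                            # an old block is discarded

result = frequent_itemsets(blocks, start=2, end=3, min_support=0.6)
for itemset, support in sorted(result.items()):
    print(itemset, round(support, 2))
```

The sketch recounts every selected block on each call, which is exactly the cost an incremental index is meant to avoid; it only makes the query semantics concrete.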