Similar literature
 Found 20 similar documents (search time: 546 ms)
1.
In data mining, the usefulness of a data pattern depends on the user of the database and does not solely depend on the statistical strength of the pattern. Based on the premise that heuristic search in combinatorial spaces built on computer and human cognitive theories is useful for effective knowledge discovery, this study investigates how the use of self-organizing maps as a tool of data visualization in data mining plays a significant role in human–computer interactive knowledge discovery. This article presents the conceptual foundations of the integration of data visualization and query processing for knowledge discovery, and proposes a set of query functions for the validation of self-organizing maps in data mining. Received 1 November 1999 / Revised 2 March 2000 / Accepted in revised form 20 October 2000

2.
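Research and Development of Spatial Data Mining   Total citations: 19, self-citations: 0, citations by others: 19
Data mining research has expanded from relational and transactional databases to spatial databases. Spatial data mining, the discovery of knowledge in large volumes of spatial data, is a promising field. This paper summarizes research results in spatial data mining, outlines its architecture, query languages, and related methods, and discusses open problems and directions for future development.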

3.
Self-organising maps (SOM) have become a commonly used cluster analysis technique in data mining. However, SOM cannot process incomplete data. To extend the data mining capability of SOM, this study proposes an SOM-based fuzzy map model for data mining with incomplete data sets. Using this model, incomplete data are translated into fuzzy data and used to generate fuzzy observations. These fuzzy observations, along with observations without missing values, are then used to train the SOM to generate fuzzy maps. Compared with the standard SOM approach, fuzzy maps generated by the proposed method can provide more information for knowledge discovery.

4.
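A Knowledge Discovery Method for Environments with Partially Missing Data   Total citations: 12, self-citations: 0, citations by others: 12
Wang Qingyi, Cai Zhi, Zou Xiang, Cai Qingsheng. Journal of Software (《软件学报》), 2001, 12(10): 1516-1524
This paper reviews current research on knowledge discovery from incomplete data and proposes, in two parts, a method for knowledge discovery in incomplete databases. It first discusses in detail how to guess missing data, and gives the definition of distance-based association rules together with a method for mining them. On this basis, it then describes in detail a knowledge discovery algorithm for incomplete databases, analyzes the algorithm's complexity, and reports the corresponding experimental results. Finally, the proposed method is compared with other related methods.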

5.
An Overview of Data Mining and Knowledge Discovery   Total citations: 9, self-citations: 0, citations by others: 9
With massive amounts of data stored in databases, mining information and knowledge in databases has become an important issue in recent research. Researchers in many different fields have shown great interest in data mining and knowledge discovery in databases. Several emerging applications in information providing services, such as data warehousing and on-line services over the Internet, also call for various data mining and knowledge discovery techniques to understand user behavior better, to improve the service provided, and to increase business opportunities. In response to such a demand, this article provides a comprehensive survey of recently developed data mining and knowledge discovery techniques and introduces some real application systems as well. In conclusion, this article also lists some problems and challenges for further research.

6.
Data mining with incomplete survey data is an immature subject area. When mining a database with incomplete data, the patterns of missing data, as well as the potential implications of these missing data, constitute valuable knowledge. This paper presents the conceptual foundations of data mining with incomplete data through classification that is relevant to a specific decision-making problem. The proposed technique generally supposes that incomplete data and complete data may come from different sub-populations. The major objective of the proposed technique is to detect the interesting patterns of data-missing behavior that are relevant to a specific decision-making problem, rather than to estimate individual missing values. Using this technique, a set of complete data is used to acquire a near-optimal classifier. This classifier provides the prediction reference information for analyzing the incomplete data. The data-missing behavior concealed in the missing data is then revealed. Using a real-world survey data set, the paper demonstrates the usefulness of this technique.

7.
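Skyline queries are an important research direction in data mining and play an important role in applications such as data-driven decision support. Real-world applications contain large numbers of incomplete data streams, yet most existing skyline query algorithms rely on the following assumptions: (1) every dimension value of every data point is known; (2) the data set is stable, bounded, and can be accessed at will. In addition, as the dimensionality of the data grows, the number of skyline points becomes too large, which motivated the concept of the k-dominant skyline; however, the k-dominance relation on incomplete data is not transitive, so none of the existing skyline query algorithms apply. To address these problems, and taking into account that data streams are high-dimensional, unbounded, and sequential, and may have missing values in some dimensions, the paper proposes a new sliding-window-based k-dominant skyline query algorithm for incomplete data streams. Experimental results show that the algorithm not only supports k-dominant skyline computation over incomplete data streams but also maintains efficiency and performance.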
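
The non-transitivity mentioned in the abstract above can be illustrated with a small sketch. It is not the paper's algorithm; it only assumes one common reading of k-dominance on incomplete data: smaller values are better, `None` marks a missing value, and a point k-dominates another if, on the dimensions both points have observed, it is no worse on at least k of them and strictly better on at least one. The function name `k_dominates` and the sample points are hypothetical.

```python
def k_dominates(p, q, k):
    """Return True if p k-dominates q (assumed definition, smaller is better).

    Only dimensions observed in both points are compared; None marks a
    missing value.
    """
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    no_worse = sum(1 for a, b in common if a <= b)
    strictly_better = any(a < b for a, b in common)
    return no_worse >= k and strictly_better


# Hypothetical three-dimensional points with missing values.
a = (1, 1, None)
b = (2, 2, 1)
c = (None, 3, 2)

print(k_dominates(a, b, 2))  # a vs b share dimensions 0 and 1
print(k_dominates(b, c, 2))  # b vs c share dimensions 1 and 2
print(k_dominates(a, c, 2))  # a vs c share only dimension 1
```

Here `a` 2-dominates `b` and `b` 2-dominates `c`, but `a` and `c` share only one observed dimension, so `a` cannot 2-dominate `c`. Pruning strategies that rely on transitivity therefore cannot be applied directly under this definition.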

8.
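Data mining algorithms are widely used in data analysis. Industry, science, and business need to analyze large, geographically distributed data sets, and grids can effectively provide infrastructure for high-performance and distributed applications. To use the grid for data mining and knowledge representation, this paper builds on the concept of the knowledge grid and on the Globus Toolkit to analyze the architecture of the knowledge grid and its main components. Following the data mining process, it designs a software model for a grid data mining system and identifies the services the model should provide. These services hide all details of the underlying grid, so that end users need only concern themselves with the knowledge discovery process.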

9.
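In view of the characteristics of epidemiological research, this paper proposes an architecture for a computer-aided medical data mining system. Taking diabetic complications as a case study, it explores redundancy elimination, normalized storage, knowledge induction, and visual presentation of medical data. Using screening data for 3,022 cases from Tianjin General Hospital (天津总医院), it attempts computer-based quantitative data mining and knowledge discovery on qualitative data such as diabetic complications. Mining qualitative data on 43 complications yields 18 knowledge rules describing conditions with a marked tendency to co-occur, such as hyperlipidemia, coronary heart disease, hypertension, and cerebrovascular disease. The knowledge rules are presented visually using knowledge trees, decision trees, and similar methods. A computer-aided medical data mining system based on data mining and knowledge discovery can automatically analyze the data in existing medical record databases and provide valuable medical knowledge. It is particularly suited to epidemiological analysis and population-wide health assessment, so integration with community healthcare and hospital HIS systems is a very practical direction for future development.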

10.
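A Survey of Outlier Mining Methods   Total citations: 10, self-citations: 2, citations by others: 10
Outlier mining can reveal rare events and phenomena and discover interesting patterns; it has broad application prospects and has therefore attracted wide attention. This paper first introduces the definition of outliers, the causes of outlying behavior, and a classification of outlier mining algorithms. It then discusses distance-based and density-based outlier mining algorithms in some detail, pointing out their strengths, weaknesses, and directions for development. It focuses on current research hotspots, namely mining high-dimensional, large-volume data, spatial data mining, time-series outlier mining, and applications of outlier mining techniques, and identifies directions for further research.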

11.
Large databases are becoming increasingly common in civil infrastructure applications. Although it is relatively simple to specifically query these databases at a low level, more abstract questions like ‘How does the environment affect pavement cracking?’ are difficult to answer with traditional methods. Data mining techniques can provide a solution for learning abstract knowledge from civil infrastructure databases. However, data mining needs to be performed within a systematic process to ensure correct and reproducible results. Many decisions must be made during this process, making it difficult for novice analysts to apply data mining techniques thoroughly. This paper presents an application of a knowledge discovery process to data collected for an ‘intelligent’ building. The knowledge discovery process is illustrated and explained through this case study. Additionally, we discuss the importance of this case study in the context of a research effort to develop an interactive guide for the knowledge discovery process.

12.
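Based on an analysis of the characteristics of data mining in a scientific data grid environment, this paper proposes a grid service framework for scientific data mining. The scientific data mining grid service offers a data mining solution for scientific data grid environments in the form of a grid service. Compared with traditional data mining systems, it has many advantages and is better suited to scientific data grids and scientific databases. It has already been applied in practice to several databases, where it provides not only simple query and retrieval but also statistical data analysis and knowledge discovery, further raising the level of scientific data grid services.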

13.
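Finding accompanying vehicles is a practical requirement of police criminal investigation departments when searching massive volumes of vehicle passage records; the goal is to use fuzzy-condition queries to identify vehicles that may be travelling together to commit crimes. In essence, such queries can be transformed into an association rule mining problem in data mining. By analyzing the vehicle passage data collected by an intelligent highway vehicle monitoring and recording system, the paper converts accompanying-vehicle queries into association rule mining, applies data mining techniques to analyze the passage-data query problem comprehensively, and implements an efficient accompanying-vehicle query algorithm, AVD (Accompany Vehicles Discovery). Analysis of the algorithm shows that AVD not only returns accurate accompanying-vehicle query results but is also efficient, highly scalable, and highly feasible.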

14.
This paper reports on conceptual development in applications of neural networks to data mining and knowledge discovery. Hypothesis generation is one of the significant differences of data mining from statistical analyses. Nonlinear pattern hypothesis generation is a major task of data mining and knowledge discovery. Yet, few methods of nonlinear pattern hypothesis generation are available.

This paper proposes a model of data mining to support nonlinear pattern hypothesis generation. This model is an integration of a linear regression analysis model, Kohonen's self-organizing maps, the algorithm for convex polytopes, and back-propagation neural networks.


15.
XML data mining     
With the spread of XML sources, mining XML data can be an important objective in the near future. This paper presents a project focused on designing a general-purpose query language in support of mining XML data. In our framework, raw data, mining models and domain knowledge are represented by way of XML documents and stored inside native XML databases. Data mining (DM) tasks are expressed in an extension of XQuery. Special attention is given to the frequent pattern discovery problem, and a way of exploiting domain-dependent optimizations and efficient data structures as deep as possible in the extraction process is presented. We report the results of a first set of experiments, showing that a good trade-off between expressiveness and efficiency in XML DM is not a chimera. Copyright © 2009 John Wiley & Sons, Ltd.

16.
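Data Mining and Knowledge Discovery: Review and Prospects   Total citations: 20, self-citations: 0, citations by others: 20
How to uncover deeper knowledge and information from large-scale databases, rather than only the trivial content obtained through traditional database query methods, is attracting growing attention. As an application-independent research topic, it has become a hot spot in many research fields, with a considerable number of reported applications and fruitful results. This paper attempts a complete review of the various aspects of data mining and knowledge discovery, such as the mining process, methods, algorithms, and applications, and also discusses future work and challenges in the field.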

17.
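A Data Mining Service Discovery Algorithm Based on Domain Ontology   Total citations: 3, self-citations: 0, citations by others: 3
With the widespread use of databases, data mining faces the problems of massive and distributed data. Building data mining systems on a service-oriented architecture is one way to address them. This paper proposes a data mining service discovery algorithm based on domain ontology; by introducing domain knowledge and defining a data mining ontology, it effectively solves the data mining service discovery problem. It first presents a data mining service discovery framework that incorporates domain knowledge, then defines a data mining method ontology and a quality ontology, and gives an algorithm for discovering data mining services according to domain knowledge and user requirements, providing a fairly complete solution for data mining service selection.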

18.
Abstract: Although data mining and knowledge discovery techniques have recently been used to diagnose human disease, little research has been conducted on disease diagnostic modelling using human gene information. Furthermore, to our knowledge, no study has reported on diagnosis models using single nucleotide polymorphism (SNP) information. A disease diagnosis model using data mining techniques and SNP information should prove promising from a practical perspective as more information on human genes becomes available. Data mining and knowledge discovery techniques can be put to practical use detecting human disease, since a haplotype analysis using high-density SNP markers has gained great attention for evaluating human genes related to various human diseases. This paper explores how data mining and knowledge discovery can be applied to medical informatics using human gene information. As an example, we applied case-based reasoning to a cancer detection problem using human gene information and SNP analysis because case-based reasoning has been applied in medicine relatively less often than other data mining techniques. We propose a modified case-based reasoning method that is appropriate for associated categorical variables to use in detecting gastric cancer.

19.
Although knowledge discovery from large relational databases has gained popularity and its significance is well recognized, the prohibitive nature of the cost associated with extracting such knowledge, as well as the lack of suitable declarative query language support, act as limiting factors. Surprisingly, little or no relational technology has yet been significantly exploited in data mining even though data often reside in relational tables. Consequently, no relational optimization has yet been possible for data mining. We exploit the transitive nature of large item sets and the so-called anti-monotonicity property of support thresholds of large item sets to develop a natural least fixpoint operator for set-oriented data mining from relational databases. The operator proposed has several advantages including optimization opportunities, and traditional candidate set free large item set generation. We present an SQL3 expression for association rule mining and discuss its mapping to the least fixpoint operator developed in this paper.
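
The anti-monotonicity property named in the abstract above says that an item set can be large (frequent) only if every one of its subsets is large. The sketch below shows how that property prunes the search in a classic level-wise (Apriori-style) procedure. It is not the paper's least fixpoint operator or its SQL3 expression, and unlike the operator described there it does generate candidate sets; the function name `frequent_itemsets` and its absolute-count `min_support` parameter are assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for itemsets in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    result = {}
    k = 1
    while level:
        for itemset in level:
            result[itemset] = support(itemset)
        k += 1
        # Join step: unions of two frequent (k-1)-itemsets with exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (anti-monotonicity): drop any candidate that has an
        # infrequent (k-1)-subset, without counting its support at all.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
        level = {c for c in candidates if support(c) >= min_support}
    return result
```

The prune step is where the property pays off: a candidate is discarded as soon as one of its subsets is known to be infrequent, so the transactions are scanned only for candidates that could still qualify.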

20.
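Rough set theory can support many steps of data mining and knowledge discovery, such as data preprocessing, data reduction, rule generation, and acquisition of data dependencies, and offers new ideas and methods for data mining and knowledge discovery. This paper introduces rough set theory into the field of spatial data mining, presents its basic theory and a series of methods, gives application examples, and discusses the application of rough set theory in spatial data mining.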
