Found 20 similar documents (search time: 15 ms)
1.
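Incremental mining methods offer many benefits, such as adapting to large-scale dynamic data, reducing memory requirements, and enabling parallel processing, but current incremental clustering methods suffer from numerous parameter constraints and insufficiently accurate results. Within a data mining architecture in which the information sources change, a group of specialized intelligent agents is used to incrementally modify the knowledge model, and a method for constructing a swarm intelligence clustering model is proposed together with an incremental model maintenance algorithm. The method uses information entropy to speed up the clustering process and adjusts the existing clusters according to pheromones and to incremental insert and delete operations on the database. It requires few parameters, and experiments show that the clustering results are accurate.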
2.
A survey: algorithms simulating bee swarm intelligence (cited by: 4; self-citations: 1; by others: 3)
Swarm intelligence is an emerging area in the field of optimization, and researchers have developed various algorithms by modeling the behaviors of swarms of animals and insects such as ants, termites, bees, birds, and fish. In the 1990s, Ant Colony Optimization, based on ant swarms, and Particle Swarm Optimization, based on bird flocks and fish schools, were introduced, and over two decades they have been applied to optimization problems in various areas. The intelligent behaviors of bee swarms, however, have inspired researchers to develop new algorithms, especially during the last decade. This work presents a survey of the algorithms based on the intelligence in bee swarms and their applications.
3.
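Research on the application of data mining to Web intelligence (cited by: 3; self-citations: 9; by others: 3)
The paper analyzes the characteristics of Web information and the limitations of its current exploitation, and proposes applying data mining techniques to the Web, i.e., Web mining, to make the Web more intelligent. It comprehensively describes several important applications of Web mining in Web intelligence. It points out that Web mining is an important research area within Web technology and is the key to discovering knowledge hidden in the Web, distinguishing authoritative links, and understanding user access patterns and the semantic structure of web pages. Web mining makes it possible to fully exploit the large amount of truly valuable information on the Web and lays the foundation for an intelligent Web.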
4.
Distributed data mining: a survey (cited by: 1; self-citations: 1; by others: 0)
Li Zeng Ling Li Lian Duan Kevin Lu Zhongzhi Shi Maoguang Wang Wenjuan Wu Ping Luo 《Information Technology and Management》2012,13(4):403-409
Most data mining approaches assume that the data can be provided from a single source. If the data is produced at many physically distributed locations, as at Wal-Mart, these methods require a data center that gathers data from the distributed locations. Sometimes, transmitting large amounts of data to a data center is expensive and even impractical. Distributed and parallel data mining algorithms were therefore developed to solve this problem. In this paper, we survey the state-of-the-art algorithms and applications in distributed data mining and discuss future research opportunities.
5.
Stefan Lessmann Marco Caserta Idel Montalvo Arango 《Expert systems with applications》2011,38(10):12826-12838
The paper is concerned with practices for tuning the parameters of metaheuristics. Settings such as the cooling factor in simulated annealing may greatly affect a metaheuristic’s efficiency as well as its effectiveness in solving a given decision problem. However, procedures for organizing parameter calibration are scarce and commonly limited to particular metaheuristics. We argue that the parameter selection task can appropriately be addressed by means of a data mining based approach. In particular, a hybrid system is devised, which employs regression models to learn suitable parameter values from past moves of a metaheuristic in an online fashion. In order to identify a suitable regression method and, more generally, to demonstrate the feasibility of the proposed approach, a case study of particle swarm optimization is conducted. Empirical results suggest that characteristics of the decision problem as well as search history data indeed embody information that allows suitable parameter values to be determined, and that this type of information can successfully be extracted by means of nonlinear regression models.
6.
《Expert systems with applications》2014,41(17):7987-7994
Crowdsourcing allows large-scale and flexible invocation of human input for data gathering and analysis, which introduces a new paradigm for the data mining process. Traditional data mining methods often require experts in the analytic domain to annotate the data, which is expensive and usually takes a long time. Crowdsourcing enables the use of heterogeneous background knowledge from volunteers and distributes the annotation process into small portions of effort from different contributors. This paper reviews the state of the art in crowdsourcing for data mining in recent years. We first review the challenges and opportunities of data mining tasks that use crowdsourcing and summarize their framework. We then highlight several exemplary works for each component of the framework, including question design, data mining, and quality control. Finally, we conclude with the limitations of crowdsourcing for data mining and suggest related areas for future research.
7.
Phan Han Duy Ellis Kirsten Barca Jan Carlo Dorin Alan 《Neural computing & applications》2020,32(2):567-588
Parameter settings for nature-inspired optimization algorithms are essential for their effective performance. Evolutionary algorithms and swarm intelligence...
9.
Agostino Forestiero Clara Pizzuti Giandomenico Spezzano 《Data mining and knowledge discovery》2013,26(1):1-26
Existing density-based data stream clustering algorithms use a two-phase scheme consisting of an online phase, in which raw data is processed to gather summary statistics, and an offline phase that generates the clusters from the summary data. In this article we propose a data stream clustering method based on a multi-agent system that uses a decentralized, bottom-up, self-organizing strategy to group similar data points. Data points are associated with agents and deployed onto a 2D space, where they work simultaneously by applying a heuristic strategy based on a bio-inspired model known as the flocking model. Agents move in the space for a fixed time and, when they encounter other agents within a predefined visibility range, they can decide to form a flock if they are similar. Flocks can join to form swarms of similar groups. This strategy makes it possible to merge the two phases of density-based approaches and thus to avoid the computationally demanding offline cluster computation, since a swarm represents a cluster. Experimental results show that the bio-inspired approach can obtain very good results on real and synthetic data sets.
10.
From visual data exploration to visual data mining: a survey (cited by: 8; self-citations: 0; by others: 8)
Ferreira de Oliveira M.C. Levkowitz H. 《IEEE transactions on visualization and computer graphics》2003,9(3):378-394
We survey work on the different uses of graphical mapping and interaction techniques for visual data mining of large data sets represented as table data. Basic terminology related to data mining, data sets, and visualization is introduced. Previous work on information visualization is reviewed in light of different categorizations of techniques and systems. The role of interaction techniques is discussed, in addition to work addressing the question of selecting and evaluating visualization techniques. We review some representative work on the use of information visualization techniques in the context of mining data. This includes both visual data exploration and visually expressing the outcome of specific mining algorithms. We also review recent innovative approaches that attempt to integrate visualization into the DM/KDD process, using it to enhance user interaction and comprehension.
12.
Bilal Alatas Erhan Akin 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2008,12(12):1205-1218
This paper proposes a novel particle swarm optimization algorithm, rough particle swarm optimization algorithm (RPSOA), based on the notion of rough patterns that use rough values defined with upper and lower intervals that represent a range or set of values. In this paper, various operators and evaluation measures that can be used in RPSOA have been described and efficiently utilized in data mining applications, especially in automatic mining of numeric association rules, which is a hard problem.
13.
Data mining for Web intelligence (cited by: 2; self-citations: 0; by others: 2)
Searching, comprehending, and using the semistructured HTML, XML, and database-service-engine information stored on the Web poses a significant challenge. This data is more sophisticated and dynamic than the information that commercial database systems store. To supplement keyword-based indexing, researchers have applied data mining to Web-page ranking. In this context, data mining helps Web search engines find high-quality Web pages and enhances Web click stream analysis. For the Web to reach its full potential, however, we must improve its services, make it more comprehensible, and increase its usability. As researchers continue to develop data mining techniques, the authors believe this technology will play an increasingly important role in meeting the challenges of developing the intelligent Web. Ultimately, data mining for Web intelligence will make the Web a richer, friendlier, and more intelligent resource that we can all share and explore. The paper considers how data mining holds the key to uncovering and cataloging the authoritative links, traversal patterns, and semantic structures that will bring intelligence and direction to our Web interactions.
14.
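A survey of research on Web-based data mining (cited by: 4; self-citations: 0; by others: 4)
Web-based data mining is a popular research topic that combines data mining with the WWW. The article introduces the most widely used classification of Web data mining: Web content mining, Web structure mining, and Web usage mining. Based on recent research on Web data mining, it summarizes several research hotspots and introduces WebSIFT, a framework for Web usage mining.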
15.
Particle swarm optimization (PSO) is a popular meta-heuristic for black-box optimization. In essence, within this paradigm, the system is fully defined by a swarm of “particles”, each characterized by a set of features such as its position, velocity and acceleration. The optimized global best solution is then obtained by comparing the personal best solutions of the entire swarm. Many variations and extensions of PSO have been developed since its creation in 1995, and the algorithm remains a popular topic of research. In this work we submit a new, abstracted perspective of the PSO system, in which we move away from the swarm of individual particles and instead characterize each particle by a field or distribution. The strategy that updates the various fields is akin to Thompson sampling. By invoking such an abstraction, we present the novel particle field optimization algorithm, which harnesses this new perspective to achieve a model and behavior that are completely distinct from the family of traditional PSO systems.
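For readers unfamiliar with the traditional PSO scheme that the abstract above contrasts itself with, the following is a minimal sketch of the common inertia-weight variant, not the particle field optimization algorithm the paper proposes. The function name `pso`, the test function `sphere`, and the parameter values (`w`, `c1`, `c2`, swarm size, iteration count) are illustrative assumptions and do not come from the abstract.

```python
import random


def pso(objective, dim, bounds, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` with a basic inertia-weight PSO (illustrative)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Each particle has a position and a velocity.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # The global best is found by comparing all personal bests.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Simple convex test objective with its minimum at the origin."""
    return sum(v * v for v in x)
```

Calling `pso(sphere, dim=2, bounds=(-5.0, 5.0))` returns a position and its objective value. Because the global best is only ever replaced by a strictly better value, running more iterations with the same seed cannot make the result worse.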
18.
《International journal of human-computer studies》2007,65(5):421-433
Serendipity is the making of fortunate discoveries by accident, and is one of the cornerstones of scientific progress. In today's world of digital data and media, there is now a vast quantity of material that we could potentially encounter, and so there is an increased opportunity to discover interesting things. However, the availability of material does not imply that we will actually be able to find it; the sheer quantity of data militates against our being able to discover the interesting nuggets. This paper explores approaches we have taken to support users in their search for interesting and relevant information. The primary concept is the principle that it is more useful to augment user skills in information foraging than it is to try to replace them. We have taken a variety of artificial intelligence, statistical, and visualisation techniques, and combined them with careful design approaches to provide supportive systems that monitor user actions, garner additional information from their surrounding environment, and use this enhanced understanding to offer supplemental information that aids the user in their interaction with the system. We present two different systems that have been designed and developed according to these principles. The first system is a data mining system that allows interactive exploration of the data, allowing the user to pose different questions and understand information at different levels of detail. The second supports information foraging of a different sort, aiming to augment users' browsing habits in order to help them surf the internet more effectively. Both use ambient intelligence techniques to provide a richer context for the interaction and to help guide it in more effective ways: both have the user as the focal point of the interaction, in control of an iterative exploratory process, working in indirect collaboration with the artificial intelligence components. Each of these systems contains some important concepts of its own: the data mining system has a symbolic genetic algorithm which can be tuned in novel ways to aid knowledge discovery, and which reports results in a user-comprehensible format. The visualisation system supports high-dimensional data, dynamically organised in a three-dimensional space and grouped by similarity. The notions of similarity are further discussed in the internet browsing system, in which an approach to measuring similarity between web pages and a user's interests is presented. We present details of both systems and evaluate their effectiveness.
19.
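This paper focuses on analyzing the current state and achievements of aerospace exploration information system construction, points out some problems in the current management of airborne geophysical survey data, and proposes strengthening the construction of aerospace exploration databases so as to implement three major functions: data loading, checking, and querying. To address the cumbersome querying process, the paper introduces the idea of data mining, which makes querying and using the data more efficient and convenient and further improves the construction of our aerospace database system.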