共查询到20条相似文献,搜索用时 78 毫秒
1.
There are hidden and rich information for data mining in the topology of topic-specific websites. A new topic-specific association rules mining algorithm is proposed to further the research on this area. The key idea is to analyze the frequent hyperlinked relati ons between pages of different topics. In the topic-specific area, if pages of onetopic are frequently hyperlinked by pages of another topic, we consider the two topics are relevant. Also, if pages oftwo different topics are frequently hyperlinked together by pages of the other topic, we consider the two topics are relevant.The initial experiments show that this algorithm performs quite well while guiding the topic-specific crawling agent and it can be applied to the further discovery and mining on the topic-specific website. 相似文献
2.
一种增量式规则提取算法 总被引:6,自引:0,他引:6
扩展了决策矩阵的定义,并在此基础上提出一种增量式规则提取算法(IREA),该算法能够以增量的方式从样本数据中提取确定性和可能性规则.对于缺乏领域知识时的知识/规则获取具有重要使用价值. 相似文献
3.
文章以机器人的模糊控制为背景,基于Kohonen自组织竞争网络和改进的DCL算法,从输入输出数据中提取模糊控制规则,所得结果明显优于Kong和Kosko[1]的结果。 相似文献
4.
.基于规则提取量的Web日志关联规则挖掘方法* 总被引:2,自引:0,他引:2
引入规则提取量的度量标准,提出一种基于免疫多克隆遗传策略的Web日志关联规则挖掘方法。该算法在遗传算法的基础上引入免疫多克隆算子,有效地克服了遗传算法容易陷入局部最优的缺点,具有更强的全局与局部搜索能力。实验结果表明,该算法能高效地解决Web日志关联规则挖掘问题。 相似文献
5.
文章重点研究了Web日志挖掘以及关联分析中的关联规则挖掘算法FP_Growth算法,提出了一种改进的关联规则挖掘算法,并将该算法应用于某高校图书馆个性化服务系统My Library的设计过程中,从服务器日志中得到用户感兴趣的隐式模式,并将该隐式兴趣集推荐给用户,从而在一定程度上实现了个性化服务。 相似文献
6.
分类是许多研究领域的关键问题,模糊规则的提取质量对分类器的性能又有着极大影响.所提取的规则不仅在分类能力上要达到最优,同时在规则数量上也不能太多,否则会影响规则搜索和匹配的速度.结合人工免疫的克隆选择原理,采用克隆选择算法,提取通过多精度模糊分割产生的大量模糊if—then规则中的少数精华规则,从而建立了模糊分类所需要的有效规则集合,同时还对优化目标函数进行了改进.经仿真实验证明,该方法所提取的模糊规则具有分类准确率高,规则数目较少等特点。 相似文献
7.
8.
一种改进的相联规则提取算法 总被引:3,自引:1,他引:3
相联规则的提取是数据挖掘的一个重要方面。Apriori算法是提取相联规则的经典算法,效率较高。AprioriPro算法是对Apriori算法的改进,它利用大项集生成过程中的中间结果对数据库进行过滤,从而加快候选项集的计数速度,提高了整个算法的效率。该文在AprioriPro算法的基础上,首先对其基本理论进行扩展并加以证明,提出了AprioriPro2算法。该算法相对于AprioriPro算法能更多地去掉数据库中的无效元组,从而进一步提高了算法的效率。 相似文献
9.
10.
11.
12.
This paper deals with learning first-order logic rules from data lacking an explicit classification predicate. Consequently, the learned rules are not restricted to predicate definitions as in supervised inductive logic programming. First-order logic offers the ability to deal with structured, multi-relational knowledge. Possible applications include first-order knowledge discovery, induction of integrity constraints in databases, multiple predicate learning, and learning mixed theories of predicate definitions and integrity constraints. One of the contributions of our work is a heuristic measure of confirmation, trading off novelty and satisfaction of the rule. The approach has been implemented in the Tertius system. The system performs an optimal best-first search, finding the k most confirmed hypotheses, and includes a non-redundant refinement operator to avoid duplicates in the search. Tertius can be adapted to many different domains by tuning its parameters, and it can deal either with individual-based representations by upgrading propositional representations to first-order, or with general logical rules. We describe a number of experiments demonstrating the feasibility and flexibility of our approach. 相似文献
13.
XPATH在Web信息提取中起重要作用,但是这些XPATH规则通常要人工生成。文中讨论了在XPATH与基于文本上下文规则的信息提取方法结合的系统中如何归纳学习XPATH规则。生成的XPATH规则结构简单,可以为基于文本上下文的信息提取系统提供较为准确的信息定位。 相似文献
14.
一种Web信息的启发式检索方法 总被引:3,自引:0,他引:3
Internet是一个开放的全球分布式网络 ,资源分布在世界上不同的地方 ,并且网上资源没有统一的管理和结构 ,导致了信息搜索的困难 .同时 ,Internet是一个有巨大价值的信息源 .因此 ,研究一种快速、高效的 Web信息检索方法是很有实用意义的 .本文提出了一种用相关度及用户兴趣作为评价函数在 Internet上进行启发式搜索及在此基础上利用机器学习有效的实现搜索知识重用的方法 相似文献
15.
This paper presents an infrastructure and methodology to extract conceptual structure from Web pages, which are mainly constructed by HTML tags and incomplete text. Human beings can easily read Web pages and grasp an idea about the conceptual structure of underlying data, but cannot handle excessive amounts of data due to lack of patience and time. However, it is extremely difficult for machines to accurately determine the content of Web pages due to lack of understanding of context and semantics. Our work provides a methodology and infrastructure to process Web data and extract the underlying conceptual structure, in particular relationships between ontological concepts using Inductive Logic Programming in order to help with automating the processing of the excessive amount of Web data by capturing its conceptual structures. 相似文献
16.
17.
18.
19.
The representation formalism as well as the representation language is of great importance for the success of machine learning. The representation formalism should be expressive, efficient, useful, and applicable. First-order logic needs to be restricted in order to be efficient for inductive and deductive reasoning. In the field of knowledge representation, term subsumption formalisms have been developed which are efficient and expressive. In this article, a learning algorithm, KLUSTER, is described that represents concept definitions in this formalism. KLUSTER enhances the representation language if this is necessary for the discrimination of concepts. Hence, KLUSTER is a constructive induction program. KLUSTER builds the most specific generalization and a most general discrimination in polynomial time. It embeds these concept learning problems into the overall task of learning a hierarchy of concepts. 相似文献
20.
WANG Juan 《数字社区&智能家居》2008,(34)
针对FP算法的缺陷,将OLAP技术和Apriori关联规则相结合,提出了一种针对FP算法的改进的多层次关联规则数据挖掘算法,在分析了关联规则数据挖掘结构的基础上,给出了该算法的思想与执行步骤,对于关联规则数据挖掘的研究具有一定的理论意义。 相似文献