Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
Abstract: Cardiovascular diseases constitute one of the main causes of mortality in the world, and machine learning has become a powerful tool for analysing medical data in the last few years. In this paper we present an interdisciplinary work based on an ambulatory blood pressure study and the development of a new classification algorithm named REMED. We focused on the discovery of new patterns for abnormal blood pressure variability as a possible cardiovascular risk factor. We compared our results with other classification algorithms based on Bayesian methods, decision trees, and rule induction techniques. In the comparison, REMED showed similar accuracy to these algorithms but was better at classifying sick people correctly. Therefore, our method could represent an innovative approach that might be useful in medical decision support for cardiovascular disease prognosis.

2.
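Liu Yang, Zhang Zhuo, Zhou Qinglei. Computer Science (《计算机科学》), 2014, 41(12): 164-167
Medical and health data typically have many attributes and mix continuous and discrete values, which greatly limits the efficiency of knowledge discovery methods when mining such data. Based on fuzzy rough set theory, this paper studies classification rule mining on mixed data. It introduces a generalization threshold into the rule acquisition algorithm to control the size and complexity of the acquired rule set, improving the classification efficiency of rough-set knowledge discovery methods on medical and health data. Finally, comparative experiments verify the effectiveness of the algorithm for mining rules from medical decision tables.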

3.
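In the knowledge discovery process, classification rule mining is one of the main mining tasks. To address the shortcomings of traditional statistics-based mining algorithms in guaranteeing the interestingness of knowledge, this paper proposes using the global search and fully fitness-guided characteristics of evolutionary computation, an intelligent computing model, to mine and process classification knowledge automatically, without prior knowledge, so as to ensure the interestingness of the knowledge. It proposes the high-level IF-THEN knowledge representation to improve comprehensibility. It also gives design principles and methods for several components that play an important role in evolutionary algorithms: individual representation, genetic operators, and fitness evaluation.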

4.
Knowledge-based systems such as expert systems are of particular interest in medical applications as extracted if-then rules can provide interpretable results. Various rule induction algorithms have been proposed to effectively extract knowledge from data, and they can be combined with classification methods to form rule-based classifiers. However, most rule-based classifiers cannot directly handle numerical data such as blood pressure. A data preprocessing step called discretization is required to convert such numerical data into a categorical format. Existing discretization algorithms do not take into account the multimodal class densities of numerical variables in datasets, which may degrade the performance of rule-based classifiers. In this paper, a new Gaussian Mixture Model based Discretization Algorithm (GMBD) is proposed that preserves the most frequent patterns of the original dataset by taking into account the multimodal distribution of the numerical variables. The effectiveness of the GMBD algorithm was verified using six publicly available medical datasets. According to the experimental results, the GMBD algorithm outperformed five other static discretization methods in terms of the number of generated rules and classification accuracy in the associative classification algorithm. Consequently, our proposed approach has the potential to enhance the performance of rule-based classifiers used in clinical expert systems.

5.
Associative classification (AC), which is based on association rules, has shown great promise over many other classification techniques. To implement AC effectively, we need to tackle the very large search space of candidate rules during the rule discovery process and incorporate the discovered association rules into the classification process. This paper proposes a new approach that we call artificial immune system-associative classification (AIS-AC), which is based on AIS, for mining association rules effectively for classification. Instead of exhaustively searching for all possible association rules, AIS-AC finds only a subset of association rules that are suitable for effective AC in an evolutionary manner. In this paper, we also evaluate the performance of the proposed AIS-AC approach on large datasets. The results show that the proposed approach deals efficiently with the complexity of the rule search space while achieving good classification accuracy. This is especially important for mining association rules from large datasets in which the search space of rules is huge.

6.
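To address the problem that traditional associative classification algorithms must scan the dataset many times when constructing a classifier, consuming large amounts of computing and storage resources, this paper proposes a classification rule construction method based on a knowledge evolution algorithm. The method first encodes the data in the dataset. It then uses conjecture and refutation operators to extract conjectured knowledge and counter-knowledge from the encoded data. Next it computes the coverage and accuracy of the extracted conjectured knowledge and, according to the continually changing statistics, uses an extraction operator to convert between conjectured knowledge and counter-knowledge as appropriate. When the coverage of the knowledge in the resulting knowledge set reaches a preset threshold, that knowledge is used to generate a classifier. The method reads the dataset to be classified in blocks, which greatly reduces the number of dataset scans, markedly reduces the required storage space, and improves the efficiency of classifier construction. Experimental results show that the method is feasible and effective and, while maintaining classification accuracy, largely resolves the inefficiency and time cost of constructing associative classifiers.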
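
The coverage and accuracy statistics mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact formulas, so the definitions below are assumptions (the ones commonly used for IF-THEN rules): coverage is the fraction of records that match the rule's condition, and accuracy is the fraction of matched records that carry the predicted class. The function name `rule_stats` and the toy data are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, str]


def rule_stats(
    records: List[Tuple[Record, str]],
    condition: Callable[[Record], bool],
    predicted_class: str,
) -> Tuple[float, float]:
    """Return (coverage, accuracy) of the rule IF condition THEN predicted_class.

    Assumed definitions (not taken from the paper):
      coverage = matched records / all records
      accuracy = matched records with the predicted class / matched records
    """
    covered = [label for rec, label in records if condition(rec)]
    if not records or not covered:
        return 0.0, 0.0
    coverage = len(covered) / len(records)
    correct = sum(1 for label in covered if label == predicted_class)
    return coverage, correct / len(covered)


# Hypothetical toy data: (attributes, class label).
DATA = [
    ({"bp": "high", "age": "old"}, "sick"),
    ({"bp": "high", "age": "young"}, "sick"),
    ({"bp": "high", "age": "young"}, "healthy"),
    ({"bp": "normal", "age": "old"}, "healthy"),
]

# Rule: IF bp == "high" THEN sick.
coverage, accuracy = rule_stats(DATA, lambda r: r["bp"] == "high", "sick")
```

On this toy data the rule matches three of the four records, two of which are labelled "sick". A rule set could then be accepted once its coverage passes a preset threshold, as the abstract describes.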

7.
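Given the characteristics of epidemiological research, this paper proposes an architecture for a computer-aided medical data mining system. Using diabetic complications as a case study, it examines redundancy elimination, normalized storage, knowledge induction, and visual presentation of medical data. Taking survey data on 3022 cases from Tianjin General Hospital (天津总医院) as the research object, it attempts to use computers to perform quantitative data mining and knowledge discovery on qualitative data such as diabetic complications. Mining qualitative data on 43 complications yielded 18 knowledge rules with a clear tendency to co-occur, involving for example hyperlipidemia, coronary heart disease, hypertension, and cerebrovascular disease. Knowledge trees, decision trees, and other methods are used to visualize the knowledge rules. A computer-aided medical data mining system based on data mining and knowledge discovery can automatically analyze data in existing medical record databases and provide valuable medical knowledge. It is especially suited to epidemiological analysis and population health assessment, so integrating it with community healthcare and hospital HIS systems is a very practical direction for future development.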

8.
G3P-MI: A genetic programming algorithm for multiple instance learning (total citations: 1; self-citations: 0; citations by others: 1)
This paper introduces a new Grammar-Guided Genetic Programming algorithm for resolving multi-instance learning problems. This algorithm, called G3P-MI, is evaluated and compared to other multi-instance classification techniques in different application domains. Computational experiments show that G3P-MI often obtains better results than other algorithms in terms of accuracy, sensitivity and specificity. Moreover, it makes the knowledge discovery process clearer and more comprehensible, by expressing information in the form of IF-THEN rules. Our results confirm that evolutionary algorithms are very appropriate for dealing with multi-instance learning problems.

9.
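Liu Xiaoping. Computer Simulation (《计算机仿真》), 2005, 22(12): 76-79
Most data mining tools for knowledge discovery use rule discovery and decision tree classification techniques to find data patterns and rules. By adopting a discretization method based on simulation attributes, a probability-and-statistics-based method for handling unknown attributes and noisy data, and an error-based pruning algorithm, this paper implements a general algorithm template for automatically generating decision trees. With this template, designers of decision tree algorithms can quickly validate new algorithms designed for specific decision problems. The basic mechanism for constructing a decision tree is that the algorithm designer initializes the general template with self-defined formulas and then uses the interactive graphical environment provided by the system to test the algorithm on different decision problems, thereby finding the algorithm suited to a particular problem.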

10.
Information Fusion, 2007, 8(3): 295-315
An important component of higher level fusion is knowledge discovery. One form of knowledge is a set of relationships between concepts. This paper addresses the automated discovery of ontological knowledge representations such as taxonomies/thesauri from imagery-based data. Multi-target classification is used to transform each source data point into a set of conceptual predictions from a pre-defined lexicon. This classification pre-processing produces co-occurrence data that is suitable for input to an ontology learning algorithm. A neural network with an associative, incremental learning (NAIL) algorithm processes this co-occurrence data to find relationships between elements of the lexicon, thus uncovering the knowledge structure ‘hidden’ in the dataset. The efficacy of this approach is demonstrated on a dataset created from satellite imagery of a metropolitan region. The flexibility of the NAIL algorithm is illustrated by employing it on an additional dataset comprised of topic categories from a text document collection. The usefulness of the knowledge structure discovered from the imagery data is illustrated via construction of a Bayesian network, which produces an inference engine capable of exploiting the learned knowledge model. Effective automation of knowledge discovery in an information fusion context has considerable potential for aiding the development of machine-based situation awareness capabilities.

11.
This paper describes the application of evolutionary fuzzy systems for subgroup discovery to a medical problem: a study of the type of patients who tend to visit the psychiatric emergency department in a given period of the day. In this problem, the objective is to characterise subgroups of patients according to their time of arrival at the emergency department. To solve this problem, several subgroup discovery algorithms have been applied to determine which of them obtains better results. The multiobjective evolutionary algorithm MESDIF for the extraction of fuzzy rules obtains better results and so it has been used to extract interesting information regarding the rate of admission to the psychiatric emergency department.

12.
The selection of an appropriate classification model and algorithm is crucial for effective knowledge discovery on a dataset. For large databases, common in data mining, such a selection is necessary, because the cost of invoking all alternative classifiers is prohibitive. This selection task is impeded by two factors. First, there are many performance criteria, and the behaviour of a classifier varies considerably with them. Second, a classifier's performance is strongly affected by the characteristics of the dataset. Classifier selection implies mastering a lot of background information on the dataset, the models and the algorithms in question. An intelligent assistant can reduce this effort by inducing helpful suggestions from background information. In this study, we present such an assistant, NOEMON. For each registered classifier, NOEMON measures its performance for a collection of datasets. Rules are induced from those measurements and accommodated in a knowledge base. The suggestion on the most appropriate classifier(s) for a dataset is then based on those rules. Results on the performance of an initial prototype are also given.

13.
A method for finding all deterministic and maximally general rules for a target classification is explained in detail and illustrated with examples. Maximally general rules are rules with minimal numbers of conditions. The method has been developed within the context of the rough sets model and is based on the concepts of a decision matrix and a decision function. The problem of finding all the rules is reduced to the problem of computing prime implicants of a group of associated Boolean expressions. The method is particularly applicable to identifying all potentially interesting deterministic rules in a knowledge discovery system but can also be used to produce possible rules or nondeterministic rules with decision probabilities, by adapting the method to the definitions of the variable precision rough sets model.

14.
In manufacturing systems, only a small training dataset can be obtained in the early stages. A small training dataset usually leads to low learning accuracy in machine learning classification, and the knowledge derived is often fragile; this is called the small sample problem. This research aims to overcome this problem by using a special nonlinear classification technique to generate virtual samples that enlarge the training dataset and improve learning. It proposes a new sample generation method, named non-linear virtual sample generation (NVSG), which combines a unique group discovery technique and a virtual sample generation method using the parametric equations of a hypersphere. With a back-propagation neural network (BPN) as the learning tool, computational experiments on a simulated dataset and on a real dataset taken from the Iris Plant Database show that the learning accuracy can be significantly improved using the NVSG method for very small training datasets.
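
To make the phrase "parametric equations of a hypersphere" concrete, here is a minimal sketch that maps n - 1 angles to a point on a sphere in n dimensions and uses it to place virtual samples around a centre. It illustrates only the geometry. The abstract does not say how NVSG chooses the centre, the radius, or the angles, nor how its group discovery step works, so `virtual_samples`, its parameters, and the uniform draw of angles are assumptions made here for illustration.

```python
import math
import random
from typing import List, Sequence


def hypersphere_point(
    center: Sequence[float], radius: float, angles: Sequence[float]
) -> List[float]:
    """Map n - 1 angles to a point on the sphere of the given radius around center.

    Uses the standard parametric form:
      x1 = r cos(a1), x2 = r sin(a1) cos(a2), ..., xn = r sin(a1) ... sin(a_{n-1})
    """
    n = len(center)
    if n < 2 or len(angles) != n - 1:
        raise ValueError("need n >= 2 dimensions and exactly n - 1 angles")
    offsets = []
    sin_product = 1.0
    for angle in angles:
        offsets.append(radius * sin_product * math.cos(angle))
        sin_product *= math.sin(angle)
    offsets.append(radius * sin_product)
    return [c + o for c, o in zip(center, offsets)]


def virtual_samples(
    center: Sequence[float], radius: float, count: int, rng: random.Random
) -> List[List[float]]:
    """Draw `count` points on the sphere (assumed sampling scheme, not NVSG's)."""
    n = len(center)
    samples = []
    for _ in range(count):
        # The first n - 2 angles range over [0, pi], the last over [0, 2*pi).
        angles = [rng.uniform(0.0, math.pi) for _ in range(n - 2)]
        angles.append(rng.uniform(0.0, 2.0 * math.pi))
        samples.append(hypersphere_point(center, radius, angles))
    return samples


# Hypothetical usage: 20 virtual samples around a 4-dimensional centre,
# matching the four attributes of the Iris data mentioned in the abstract.
points = virtual_samples([5.0, 3.4, 1.5, 0.2], 0.3, 20, random.Random(0))
```

Drawing the angles uniformly does not spread the points uniformly over the sphere's surface in more than two dimensions. That is acceptable for a geometric illustration but would need attention in a real generator.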

15.
In this research work, a novel framework for the construction of augmented Fuzzy Cognitive Maps based on Fuzzy Rule-Extraction methods for decisions in medical informatics is investigated. Specifically, the issue of designing augmented Fuzzy Cognitive Maps combining knowledge from experts and knowledge from data in the form of fuzzy rules generated from rule-based knowledge discovery methods is explored. Fuzzy cognitive maps (FCMs) are knowledge-based techniques which combine elements of fuzzy logic and neural networks and work as artificial cognitive networks. The knowledge extraction methods used in this study extract the available knowledge from data in the form of fuzzy rules and insert them into the FCM, contributing to the development of a dynamic decision support system. The fuzzy rules, which are derived by these extraction algorithms (such as fuzzy decision trees, association rule-based methods and neuro-fuzzy methods), are used to restructure the FCM model, producing new weights in the FCM model that was initially structured by experts. In conclusion, our aim is to present a new methodology through a framework for decision making tasks using the soft computing technique of FCMs based on knowledge extraction methods. A well known medical decision making problem, radiotherapy treatment planning selection, is presented to illustrate the application of the proposed framework and its functioning.

16.
Associative classification is a new classification approach integrating association mining and classification. It has become a significant tool for knowledge discovery and data mining. However, high-order association mining is time consuming when the number of attributes becomes large. The recent development of the AdaBoost algorithm indicates that boosting simple rules could often achieve better classification results than the use of complex rules. In view of this, we apply the AdaBoost algorithm to an associative classification system for both learning time reduction and accuracy improvement. In addition to exploring many advantages of the boosted associative classification system, this paper also proposes a new weighting strategy for voting multiple classifiers.
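
For readers unfamiliar with boosting, the following minimal sketch shows one round of the standard discrete AdaBoost update: the weighted error of a weak rule determines its vote weight, and misclassified samples gain weight for the next round. This is the textbook update, not the new weighting strategy the paper proposes, which the abstract does not describe. The function name `adaboost_round` is hypothetical, labels and predictions are assumed to be +1 or -1, and the weights are assumed to sum to 1.

```python
import math
from typing import List, Sequence, Tuple


def adaboost_round(
    weights: Sequence[float],
    labels: Sequence[int],
    predictions: Sequence[int],
) -> Tuple[float, List[float]]:
    """Run one discrete AdaBoost round; return (alpha, new normalized weights).

    Assumes labels and predictions are +1 or -1 and weights sum to 1.
    """
    # Weighted error of the weak rule on the current distribution.
    error = sum(w for w, y, p in zip(weights, labels, predictions) if y != p)
    if not 0.0 < error < 0.5:
        raise ValueError("weak rule must have weighted error in (0, 0.5)")
    # Vote weight of this rule in the final ensemble.
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Increase the weight of misclassified samples, decrease the rest.
    raw = [
        w * math.exp(-alpha * y * p)
        for w, y, p in zip(weights, labels, predictions)
    ]
    total = sum(raw)
    return alpha, [w / total for w in raw]


# Hypothetical example: four equally weighted samples, the last one misclassified.
alpha, new_weights = adaboost_round(
    [0.25, 0.25, 0.25, 0.25],
    [1, 1, -1, -1],
    [1, 1, -1, 1],
)
```

After this round the single misclassified sample carries half of the total weight, so the next simple rule is pushed to get it right.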

17.
The comprehensibility aspect of rule discovery is of emerging interest in the realm of knowledge discovery in databases. Of the many cognitive and psychological factors relating to the comprehensibility of knowledge, we focus on the use of human amenable concepts as a representation language in expressing classification rules. Existing work in neural logic networks (or neulonets) provides impetus for our research; its strength lies in its ability to learn and represent complex human logic in decision-making using symbolic-interpretable net rules. A novel technique is developed for neulonet learning by composing net rules using genetic programming. Coupled with a sequential covering approach for generating a list of neulonets, the straightforward extraction of human-like logic rules from each neulonet provides an alternate perspective to the greater extent of knowledge that can potentially be expressed and discovered, while the entire list of neulonets together constitutes an effective classifier. We show how the sequential covering approach is analogous to association-based classification, leading to the development of an association-based neulonet classifier. An empirical study shows that associative classification integrated with the genetic construction of neulonets performs better than general association-based classifiers in terms of higher accuracies and smaller rule sets. This is due to the richness in logic expression inherent in the neulonet learning paradigm.

18.
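In applications such as intrusion detection systems and stateful inspection firewalls, rule conflict detection and conflict resolution algorithms are key to security and quality of service. This paper first models and classifies the relationships between firewall filtering rules. Based on this classification, it then proposes a conflict detection algorithm. The algorithm can automatically detect and discover rule conflicts and potential problems, and can insert, delete, and modify firewall filtering rules without introducing conflicts. A software tool implementing the algorithm can significantly simplify firewall policy management and eliminate firewall rule conflicts.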
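
The idea of classifying the relationship between two filtering rules before checking for conflicts can be sketched as follows. The abstract does not give the paper's rule model or its relation categories, so everything here is an assumption: rules are reduced to two integer ranges (a source address range and a destination port range) plus an action, and the relation names and the `Rule`, `field_relation`, `rule_relation`, and `is_conflict` identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

Interval = Tuple[int, int]  # inclusive (low, high)


@dataclass(frozen=True)
class Rule:
    src: Interval       # source address range, as integers
    dst_port: Interval  # destination port range
    action: str         # "accept" or "deny"


def field_relation(a: Interval, b: Interval) -> str:
    """Relation of interval a to interval b."""
    if a == b:
        return "equal"
    if a[1] < b[0] or b[1] < a[0]:
        return "disjoint"
    if b[0] <= a[0] and a[1] <= b[1]:
        return "subset"
    if a[0] <= b[0] and b[1] <= a[1]:
        return "superset"
    return "overlap"


def rule_relation(first: Rule, second: Rule) -> str:
    """Classify how the packets matched by two rules relate (assumed categories)."""
    relations = {
        field_relation(first.src, second.src),
        field_relation(first.dst_port, second.dst_port),
    }
    if "disjoint" in relations:
        return "disjoint"  # no packet matches both rules
    if relations == {"equal"}:
        return "exact"
    if relations <= {"equal", "subset"}:
        return "first_inside_second"
    if relations <= {"equal", "superset"}:
        return "second_inside_first"
    return "partial_overlap"


def is_conflict(first: Rule, second: Rule) -> bool:
    """Two rules conflict if some packet matches both and the actions differ."""
    return rule_relation(first, second) != "disjoint" and first.action != second.action


# Hypothetical rules for illustration.
allow_web = Rule(src=(0, 255), dst_port=(80, 80), action="accept")
deny_half = Rule(src=(0, 127), dst_port=(80, 80), action="deny")
deny_other = Rule(src=(300, 400), dst_port=(80, 80), action="deny")
```

Here `deny_half` matches a subset of the packets matched by `allow_web` with the opposite action, so the pair is flagged, while `deny_other` matches a disjoint source range and is not. A tool for conflict-free insertion could run such a check against every existing rule before accepting a new one.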

19.
The CN2 Induction Algorithm (total citations: 37; self-citations: 1; citations by others: 36)
Peter Clark, Tim Niblett. Machine Learning, 1989, 3(4): 261-283

20.
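Liu Xiaoping. Computer Simulation (《计算机仿真》), 2006, 23(4): 103-105, 113
Data mining is the process of extracting hidden knowledge from large amounts of raw data. Most data mining tools use rule discovery and decision tree classification techniques to find data patterns and rules, and their core is the induction algorithm. Compared with traditional statistical methods, classification results obtained with machine learning techniques are more interpretable. When mining a particular dataset without the relevant domain knowledge, users or decision makers find it hard to determine which induction algorithm to choose, so various algorithms must be tried. With MLC++, decision makers can easily compare the effectiveness of different classification algorithms on a particular dataset and choose a suitable one. System developers can also use MLC++ to design various hybrid algorithms.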
