Similar literature
20 similar articles found.
1.
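Yang Chengyun, Zhang Mingqing, Tang Jun. Computer Engineering (《计算机工程》), 2010, 36(7): 122-125.
Decision tree methods are applied to identify regularities that can improve the technical service capability of farmers' professional cooperatives. Business data samples from cooperatives are mined and classified to build a decision tree classification model based on members' production capability, and a rule set is generated from the model. The rules expose latent relationships between the cooperatives' technical service management and its main influencing factors, providing a reference for enriching service content, improving service capability, and guiding members toward sound, optimal production.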

2.
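Application of the variable precision rough set model in decision tree generation (total citations: 2; self-citations: 0; citations by others: 2)
The classification described by the Pawlak rough set model is fully exact and admits no degree of approximation. When a decision tree is built with the Pawlak model, the generation method singles out a small number of special instances, which makes the tree too large and weakens its ability to predict and classify future data. Using the variable precision rough set model, the Pawlak-based generation method is improved: the concept of a variable precision explicit region is introduced, which tolerates a degree of inconsistency among the classes of instances assigned to the explicit region during tree construction. This simplifies the generated tree and improves its generalization ability.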
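The idea of an explicit region that tolerates some inconsistency can be sketched as follows. This is a minimal illustration, not the paper's algorithm: it assumes categorical condition attributes, and the function name `explicit_region` and the threshold parameter `beta` are hypothetical. An equivalence class (instances with identical attribute values) is accepted when its majority class covers at least a fraction `beta` of its members; with `beta = 1.0` this reduces to the exact Pawlak case.

```python
from collections import Counter, defaultdict


def explicit_region(rows, labels, attrs, beta=0.8):
    """Map row index -> assigned class for rows in the beta-explicit region.

    rows   : list of dicts holding categorical attribute values
    labels : list of class labels, parallel to rows
    attrs  : attribute names that define the equivalence classes
    beta   : minimum majority fraction (hypothetical parameter, 0.5 < beta <= 1)
    """
    # Group row indices into equivalence classes by their attribute values.
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)

    region = {}
    for members in blocks.values():
        majority, count = Counter(labels[i] for i in members).most_common(1)[0]
        # Accept the block if it is "pure enough"; beta = 1.0 demands exactness.
        if count / len(members) >= beta:
            for i in members:
                region[i] = majority
    return region
```

Lowering `beta` lets a block with a few contradicting instances become a leaf instead of being split further, which is how the tree gets smaller.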

3.
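Decision tree algorithms are built recursively, so training is relatively inefficient, and an over-partitioned tree may overfit. This paper therefore proposes the model decision tree algorithm. An incomplete decision tree is first grown recursively on the training set using the Gini index; a simple classification model is then used to classify the impure pseudo-leaf nodes (non-leaf nodes whose samples do not all belong to one class), producing the final tree. Compared with the original decision tree algorithm, the resulting model decision tree improves training efficiency with little or no loss of accuracy. Experiments on standard data sets show that the model decision tree is clearly faster than the decision tree algorithm and has some resistance to overfitting.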
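The two stages described in this abstract can be sketched roughly as below. The abstract does not say which simple model is used at the impure pseudo-leaf nodes or how growth is stopped, so this sketch assumes a nearest-centroid classifier and a depth limit (`max_depth`); all function names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(y)
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def centroid_model(X, y):
    """Assumed stand-in for the 'simple model': nearest class centroid."""
    sums, counts = {}, Counter(y)
    for row, lab in zip(X, y):
        acc = sums.setdefault(lab, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
    centroids = {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

    def predict_row(row):
        def sq_dist(lab):
            return sum((a - b) ** 2 for a, b in zip(row, centroids[lab]))
        return min(centroids, key=sq_dist)

    return predict_row


def build(X, y, depth=0, max_depth=1):
    """Grow an incomplete tree; impure nodes at the depth limit get a model."""
    if len(set(y)) == 1:
        return ("leaf", y[0])
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return ("model", centroid_model(X, y))  # impure pseudo-leaf
    f, t = split
    left = [(r, lab) for r, lab in zip(X, y) if r[f] <= t]
    right = [(r, lab) for r, lab in zip(X, y) if r[f] > t]
    return ("node", f, t,
            build([r for r, _ in left], [lab for _, lab in left], depth + 1, max_depth),
            build([r for r, _ in right], [lab for _, lab in right], depth + 1, max_depth))


def predict(tree, row):
    """Walk down to a leaf or pseudo-leaf and return its prediction."""
    while tree[0] == "node":
        _, f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree[1] if tree[0] == "leaf" else tree[1](row)
```

Stopping the recursion early saves the cost of the deeper splits, and the pseudo-leaf model takes over where the tree would otherwise keep partitioning.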

4.
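Based on many years of bank credit records, a decision tree algorithm is used for modeling and classification, constructing a decision tree that supports the bank's decisions. The decision tree can learn by itself: as business experience accumulates and the model is retrained, its decision accuracy keeps improving.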

5.
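Malicious URL detection has long been a research focus in information security defense. Traditional malicious URL detection techniques cannot detect unknown URLs on their own and do not scale to the big data era. To address this, a malicious URL detection model is designed and implemented that builds on big data technology and combines a decision tree algorithm with blacklist and whitelist techniques. The model runs on the Spark distributed computing framework: it extracts features from a training set of known URLs, trains a decision tree classifier, and then uses the classifier to predict the class of URLs that the blacklist and whitelist cannot resolve. Experiments show that the model detects malicious URLs effectively and stably.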
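The order of checks in this abstract (lists first, classifier only for unresolved URLs) can be sketched as below. Everything concrete here is an assumption made for illustration: the list contents, the feature set, and `tree_stub`, a hand-written rule that merely stands in for a trained decision tree. The Spark-based training step is not shown.

```python
from urllib.parse import urlparse

# Hypothetical list contents, for illustration only.
BLACKLIST = {"malware.example"}
WHITELIST = {"example.org"}


def extract_features(url):
    """Turn a URL into a few simple lexical features (assumed feature set)."""
    host = urlparse(url).hostname or ""
    return {
        "length": len(url),
        "dots": host.count("."),
        "has_ip": host.replace(".", "").isdigit(),  # host is a bare IPv4 address
        "has_at": "@" in url,
    }


def tree_stub(features):
    """Stand-in for a trained decision tree; the thresholds are invented."""
    if features["has_ip"] or features["has_at"]:
        return "malicious"
    if features["length"] > 75 and features["dots"] > 3:
        return "malicious"
    return "benign"


def classify(url, model=tree_stub):
    """Consult the lists first; fall back to the model for unknown hosts."""
    host = urlparse(url).hostname or ""
    if host in BLACKLIST:
        return "malicious"
    if host in WHITELIST:
        return "benign"
    return model(extract_features(url))
```

Keeping the list lookup in front means known hosts never reach the classifier, so the model only has to handle the URLs the lists cannot resolve.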

6.
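This paper describes decision tree classification techniques and the R-C4.5 decision tree model. Taking the personal, educational, and employment data of recent graduating classes at a higher vocational college as the research object, the experimental data are preprocessed and mined with R-C4.5 decision tree classification to identify the factors that affect graduates' employment quality, providing a basis for decisions by government and schools on measures and reforms to improve employment quality.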

7.
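This paper describes decision tree classification techniques and the R-C4.5 decision tree model. Taking the personal, educational, and employment data of recent graduating classes at a higher vocational college as the research object, the experimental data are preprocessed and mined with R-C4.5 decision tree classification to identify the factors that affect graduates' employment quality, providing a basis for decisions by government and schools on measures and reforms to improve employment quality.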

8.
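A service subscription prediction model combining cluster analysis and decision tree analysis is built. The paper describes the principles of the K-means clustering algorithm and the C5.0 decision tree algorithm and the design of the modeling workflow, applies the model to predict the willingness of users in a certain region to subscribe to interactive cable TV services, and finally identifies the customer group with a high response rate. Experiments show that, compared with prediction by a decision tree alone, the model improves classification accuracy to a greater degree and identifies high-response customers more effectively.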

9.
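Random forest (RF) is robust to noise, predicts accurately, and handles high-dimensional data, so it is widely used in machine learning. The model decision tree (MDT) is an accelerated decision tree algorithm; it improves training efficiency, but its accuracy declines as the number of impure pseudo-leaf nodes grows. To address this, a model decision forest (MDF) algorithm is proposed to improve the classification accuracy of the model decision tree. MDF uses MDT as the base classifier and follows the random forest idea to generate multiple model decision trees. The algorithm first obtains different sample subsets through a rotation matrix, then trains a different model decision tree on each subset, combines the trees by voting, and finally outputs the classification given by the resulting forest. Experiments on standard data sets show that MDF is clearly more accurate than the model decision tree algorithm and reaches good accuracy even with few trees, avoiding the higher time complexity that comes with more trees.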
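The final voting step of such an ensemble can be sketched as follows. This covers only the combination of base classifiers; the rotation-matrix sampling and the training of each model decision tree are not shown. The base classifiers are assumed to be plain callables, and `majority_vote` is a hypothetical name.

```python
from collections import Counter


def majority_vote(classifiers, row):
    """Return the label predicted by the most base classifiers.

    classifiers : list of callables, each mapping a row to a class label
    row         : a single sample
    Ties go to the label that was voted for first.
    """
    votes = Counter(clf(row) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Three toy threshold rules standing in for trained model decision trees.
forest = [
    lambda r: "pos" if r[0] > 0 else "neg",
    lambda r: "pos" if r[1] > 0 else "neg",
    lambda r: "pos" if r[0] + r[1] > 1 else "neg",
]
```

For example, `majority_vote(forest, (1, -1))` collects the votes pos, neg, neg and returns `"neg"`.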

10.
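Guo Qiang, Zou Guangtian. CAAI Transactions on Intelligent Systems (《智能系统学报》), 2017, 12(1): 117-123.
To improve architects' ability to make scientific predictions during architectural programming, a prediction method for extension architectural programming based on decision tree classification is proposed. First, data collection software gathers architectural case data from the Internet in bulk, and the data are preprocessed and stored in an architectural case base. Second, a decision tree model is obtained through evaluation feature selection, generation of the evaluation information element set, and decision tree construction. Finally, the model predicts whether the performance indicators of the current programming project meet the requirements and, when they do not, indicates how the indicators can be transformed. A case test shows that the method effectively improves architects' ability to use Internet data and can mine decision tree classification knowledge, accelerating computer-aided extension architectural programming.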

11.
Knowledge discovery refers to identifying hidden and valid patterns in data, and it can be used to build knowledge inference systems. The decision tree is one such successful technique for supervised learning and for extracting knowledge or rules. This paper aims to develop a decision tree model that predicts the occurrence of diabetes. Traditional decision tree algorithms have a problem with crisp boundaries; much better decision rules can be identified from clinical data sets by using fuzzy decision boundaries. The key step in constructing a decision tree is identifying split points, and in this work the best split points are identified using the Gini index. The authors propose a method that minimizes the calculation of Gini indices by identifying false split points, and they use the Gaussian fuzzy function because the clinical data sets are not crisp. Because the efficiency of a decision tree depends on many factors, such as the number of nodes and the length of the tree, pruning plays a key role. The modified Gini index-Gaussian fuzzy decision tree algorithm is proposed and tested for accuracy on the Pima Indian Diabetes (PID) clinical data set. This algorithm outperforms other decision tree algorithms.

12.
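To make customer service terminal data more usable, reduce interference from redundant data, uncover potential customers, and inform sales strategy, an iterative method for eliminating redundant customer service terminal data based on a decision tree algorithm is studied. A data warehouse approach extracts and integrates the terminal data; text data are preprocessed by stop-word removal and Chinese word segmentation, and numeric data by filling missing values and deleting outliers. An ID3 decision tree is built to classify the terminal data, the similarity among data of the same class is computed, a rule for judging redundant data is constructed, redundant data are detected, and an eliminator removes them. Experimental results show that the method eliminates redundant customer service terminal data and that the space reduction ratio is closer to the redundancy rate.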
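The step of judging redundancy by similarity within a class might look like the sketch below. The abstract does not name the similarity measure or the decision rule, so Jaccard similarity over token sets and a fixed `threshold` are assumptions, as are the function names. Records are taken to be already classified and tokenized.

```python
def jaccard(a, b):
    """Jaccard similarity of two token collections (1.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def drop_redundant(records, threshold=0.8):
    """Keep a record only if no kept record of the same class is too similar.

    records   : list of (class_label, tokens) pairs, in arrival order
    threshold : assumed similarity cut-off for calling a record redundant
    """
    kept = []
    for label, tokens in records:
        is_redundant = any(
            kept_label == label and jaccard(tokens, kept_tokens) >= threshold
            for kept_label, kept_tokens in kept
        )
        if not is_redundant:
            kept.append((label, tokens))
    return kept
```

Comparing only within a class keeps the number of pairwise comparisons down, which is one plausible reason for classifying before checking similarity.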

13.
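This work puts into practice an equipment condition diagnosis method based on expert knowledge and decision trees. Expert knowledge is used in two ways: to prune the attributes of the sample data, and to construct manually the edge-case sample points that rarely occur in normal operation, yielding a more complete sample data set. A decision tree algorithm then extracts rules, and the resulting tree-structured rules enable fast condition diagnosis.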

14.
In this exploratory research, we aim to find the most important service experience variables that determine customers' purchasing decisions and the clerks’ influence on customers’ purchases. This study was conducted as a case study of a children’s apparel company, denoted Company L, which has 243 retail stores. Company L has implemented Point of Sale (POS) systems in its retail stores and would like to know what functions could be added to induce storefront employees to deliver better customer service. We therefore focus on observing the services provided by storefront employees and their effect on a customer’s purchasing decision in a retail store. The study generated decision trees via Weka, an open source data mining software platform, to analyze multiple data sources in order to (1) understand what makes a good service experience for a customer, (2) get explicit knowledge from service encounter information, and (3) externalize the tacit knowledge of storefront service experiences. These findings can be used to improve Company L’s POS system to guide storefront employees to learn from trained decision rules. Moreover, the company can internalize service experience knowledge by aggregating learned rules from the company’s retail stores.

15.
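Decision tree algorithms and their application in data mining of breast disease images (total citations: 5; self-citations: 1; citations by others: 5)
The basic principle by which the ID3 algorithm builds a decision tree is introduced, with emphasis on tree pruning and two typical pruning algorithms: reduced error pruning and minimal cost-complexity pruning. The decision tree and pruning algorithms are then applied to data mining of breast disease images, yielding rules of practical reference value and a high classification accuracy, which shows that decision tree algorithms have broad application prospects in medical image data mining.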
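ID3 chooses, at each node, the attribute with the largest information gain. A minimal sketch of that selection step for categorical attributes is given below; the pruning algorithms mentioned in the abstract are not covered, and the attribute names used in the usage note (`shape`, `size`) are invented for illustration.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Entropy reduction obtained by partitioning the samples on `attr`."""
    groups = defaultdict(list)
    for row, lab in zip(rows, labels):
        groups[row[attr]].append(lab)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attrs):
    """Attribute that ID3 would split on at this node."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))
```

With four samples where `shape` separates the two classes perfectly and `size` does not, `best_attribute` returns `"shape"`, since its gain equals the full entropy of the labels while the gain of `size` is zero.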

16.
Intelligent Data Analysis, 1998, 2(1-4): 165-185.
Classification, which involves finding rules that partition a given dataset into disjoint groups, is one class of data mining problems. Approaches proposed so far for mining classification rules from databases are mainly decision tree based on symbolic learning methods. In this paper, we combine artificial neural network and genetic algorithm to mine classification rules. Some experiments have demonstrated that our method generates rules of better performance than the decision tree approach and the number of extracted rules is fewer than that of C4.5.

17.
Knowledge inference systems are built to identify hidden and logical patterns in large data sets. Decision trees play a vital role in knowledge discovery, but crisp decision tree algorithms have sharp decision boundaries, which may not be applicable to all knowledge inference systems. A fuzzy decision tree algorithm overcomes this drawback. Fuzzy decision trees are implemented by fuzzifying the decision boundaries without disturbing the attribute values. Data reduction also plays a crucial role in many classification problems. This research article presents an approach that uses principal component analysis (PCA) and a modified Gini index based fuzzy SLIQ decision tree algorithm. PCA is used for dimensionality reduction, and the modified Gini index fuzzy SLIQ decision tree algorithm is used to construct decision rules. Finally, the method is validated on the PID data set in a simulation experiment in MATLAB.
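The difference between a crisp and a fuzzified boundary can be illustrated with a Gaussian membership function. This is only a sketch of the general idea, not the fuzzy SLIQ algorithm: the two-set layout (a "low" and a "high" fuzzy set with assumed centers and a shared `sigma`) and the function names are hypothetical.

```python
import math


def gaussian_membership(x, center, sigma):
    """Degree (0..1) to which x belongs to a fuzzy set centered at `center`."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def fuzzy_branch_weights(x, low_center, high_center, sigma):
    """Normalized memberships of x in the 'low' and 'high' branches.

    A crisp test sends x entirely to one branch; here both branches get a
    weight, and the weights sum to 1.
    """
    low = gaussian_membership(x, low_center, sigma)
    high = gaussian_membership(x, high_center, sigma)
    total = low + high
    if total == 0.0:
        # Both memberships underflowed: fall back to the nearer center.
        nearer_low = abs(x - low_center) <= abs(x - high_center)
        return (1.0, 0.0) if nearer_low else (0.0, 1.0)
    return low / total, high / total
```

A value exactly halfway between the two centers gets weights `(0.5, 0.5)`, while a value at the low center is weighted mostly toward the low branch, so values near the boundary are no longer forced to one side.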

18.
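A minimal rule extraction algorithm based on value reduction and decision trees (total citations: 7; self-citations: 0; citations by others: 7)
Luo Qiujin, Chen Shilian. Journal of Computer Applications (《计算机应用》), 2005, 25(8): 1853-1855.
Value reduction in rough set theory and decision trees in data mining are both effective classification methods, but each has limitations. The two are combined: a new minimization method based on the value core is used to prune the decision tree, and a criterion for judging reduced rules is proposed, which narrows the scope of reduction. The generated rules are finally maximized to keep the information covered by the rules consistent. Experiments verify the effectiveness of the algorithm.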

19.
Attribute Generation Based on Association Rules (total citations: 1; self-citations: 0; citations by others: 1)
A decision tree is considered to be appropriate (1) if the tree can classify unseen data accurately, and (2) if the size of the tree is small. One approach to inducing such a good decision tree is to add new attributes and their values to enhance the expressiveness of the training data at the data pre-processing stage. There are many existing methods for attribute extraction and construction, but constructing new attributes is still an art. These methods are very time consuming, and some of them need a priori knowledge of the data domain. They are not suitable for data mining dealing with large volumes of data. We propose a novel approach in which knowledge on attributes relevant to the class is extracted as association rules from the training data. The new attributes and their values are generated from the association rules among the originally given attributes. We elaborate on the method and investigate its features. The effectiveness of our approach is demonstrated through some experiments. Received 6 December 1999 / Revised 28 October 2000 / Accepted in revised form 9 March 2001

20.
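Construction and study of a multivariate decision tree (total citations: 3; self-citations: 0; citations by others: 3)
Univariate decision tree algorithms produce trees that are large, with rules that are complex and hard to understand, whereas the multivariate decision tree is an effective data mining method for classification. The key to constructing one is to select, according to the correlation between attributes, a suitable combination of attributes that forms a new attribute to serve as a node. Combining the knowledge dependency measure from rough set theory with the concept of the dispersion of the condition attribute set in an information system, a construction algorithm for multivariate decision trees (RD) is proposed. Experiments on several UCI data sets show that the proposed algorithm classifies somewhat better than the traditional ID3 algorithm and the core-based (核方法) multivariate decision tree.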
