Similar literature
20 similar documents found.
1.
In recent years, several association rule-based classification methods have been proposed; these algorithms can quickly generate accurate rules. However, the generated rule sets are often very large, and the rules are usually complex and hard for users to understand. Of all the rules generated, only some are likely to interest the domain expert analyzing the data; most are redundant, irrelevant or obvious. In this paper, a new method for selecting interesting class association rules is proposed, using an evolutionary method named the genetic relation algorithm. In each generation, the algorithm evaluates the relevance and interestingness of the discovered association rules through the relationships between them, using a specific measure of distance among rules, and yields a reduced rule set in the final generation. This small rule set is (i) accurate, as it has at least the same classification accuracy as the complete association rule set; (ii) interesting, because of the diversity of its rules; and (iii) comprehensible, because the number of attributes involved in the rules is also small. The proposed method is compared with conventional methods, including genetic network programming-based mining, on ten databases, and the experimental results show that it outperforms the others while keeping a good balance between classification accuracy and rule comprehensibility. © 2011 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.

2.
Attribute selection is a technique for pruning less relevant information and discovering high-quality knowledge. It is especially useful for classifying a large database, because preprocessing the data makes it more likely that the predictor attributes given to the mining algorithm are relevant to the class attribute. In this paper, a method is proposed for acquiring the optimal attribute subset for genetic network programming (GNP)-based class association rule mining; this attribute selection process, which uses a genetic algorithm (GA), leads to higher classification accuracy. Class association rule mining through GNP is conducted with a small subset of data rather than the original large number of attributes; thus simple but important rules are obtained for classification while the local optimum problem is avoided. Simulation results with educational data show that the classification accuracy improves substantially, from 52.73% to 74.54%, when classification uses the optimal attribute subset. © 2014 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.

3.
Most existing association rule mining algorithms extract knowledge from databases whose attributes take binary values. However, in real-world applications, databases usually contain continuous values such as height, length or weight. If the attributes are continuous, the algorithms are commonly combined with a discretization method that transforms them into discrete attributes. Discretization is the process of dividing the range of a continuous attribute into a finite number of intervals and assigning each interval a discrete numerical value. However, the user most often must specify the number of intervals, or provide heuristic rules to be used during discretization, and it is then difficult to obtain the highest attribute interdependency and the lowest number of intervals at the same time. In this paper we present an association rule mining algorithm suited to the continuous-valued attributes commonly found in scientific and statistical databases. We propose a method using a new graph-based evolutionary algorithm named 'genetic network programming (GNP)' that can deal with continuous values directly, that is, without any discretization method as a preprocessing step. GNP represents its individuals as graph structures and evolves them in order to find a solution; this feature contributes to creating very compact programs and implicitly memorizing past action sequences. In the proposed method, the significance of the extracted association rules is measured with the χ² test, and only important association rules are stored in a common pool across generations. Results of experiments conducted on a real-life database suggest that the proposed method provides an effective technique for handling continuous attributes. Copyright © 2008 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
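
The χ² significance check mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard 2×2 contingency form of the statistic for a rule X → Y, written in terms of supports; the function name, the example numbers and the 5% threshold are illustrative assumptions, not details taken from the paper.

```python
def rule_chi_squared(n, support_x, support_y, support_xy):
    """Chi-squared statistic (1 degree of freedom) for the rule X -> Y.

    n           -- number of records
    support_x   -- fraction of records satisfying X
    support_y   -- fraction of records satisfying Y
    support_xy  -- fraction of records satisfying both X and Y
    """
    denominator = support_x * support_y * (1 - support_x) * (1 - support_y)
    if denominator == 0:
        raise ValueError("supports of X and Y must lie strictly between 0 and 1")
    return n * (support_xy - support_x * support_y) ** 2 / denominator


# Critical value of chi-squared with 1 degree of freedom at the 5% level.
CHI2_CRITICAL_5_PERCENT = 3.84

# Hypothetical numbers: 100 records, X and Y each hold in half of them,
# and both hold together in 40% (independence would give 25%).
chi2 = rule_chi_squared(n=100, support_x=0.5, support_y=0.5, support_xy=0.4)
is_significant = chi2 > CHI2_CRITICAL_5_PERCENT
```

Under independence, `support_xy` equals `support_x * support_y` and the statistic is zero, so larger values indicate a stronger association between X and Y.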

4.
Quantitative attributes are partitioned into several fuzzy sets using the fuzzy c-means algorithm. The fuzzy c-means algorithm can reflect the actual distribution of the data, and fuzzy sets soften the partition boundaries. We then improve the search technique of the Apriori algorithm and present an algorithm for mining fuzzy association rules. As databases grow larger, it becomes preferable to mine fuzzy association rules in parallel. In the parallel mining algorithm, quantitative attributes are partitioned into several fuzzy sets using a parallel fuzzy c-means algorithm. A Boolean parallel algorithm is improved to discover frequent fuzzy attribute sets, and fuzzy association rules meeting a minimum confidence are generated on all processors. Experimental results obtained on distributed, networked PCs/workstations show that the parallel mining algorithm has good scaleup, sizeup and speedup. Finally, we discuss the application of fuzzy association rules to classification. The example shows that classification systems based on fuzzy association rules are more accurate than two popular classification methods, C4.5 and CBA. Translated from Journal of Southeast University (Natural Science Edition), 2005, 35(2): 165–170 (in Chinese)
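
As a rough illustration of how fuzzy sets soften partition boundaries, the sketch below computes standard fuzzy c-means memberships of a single value, given cluster centres that are assumed to have been found already. The fuzzy support shown (mean membership over all records) is one common definition and is an assumption here, as are the function names and the example centres.

```python
def fcm_memberships(value, centers, m=2.0):
    """Membership of one value in each fuzzy set, given distinct cluster centres.

    Uses the standard fuzzy c-means membership formula with fuzzifier m > 1.
    """
    distances = [abs(value - c) for c in centers]
    # A value sitting exactly on a centre belongs fully to that set.
    if 0.0 in distances:
        return [1.0 if d == 0.0 else 0.0 for d in distances]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]


def fuzzy_support(values, centers, set_index, m=2.0):
    """Fuzzy support of one fuzzy set: mean membership over all records (assumed definition)."""
    total = sum(fcm_memberships(v, centers, m)[set_index] for v in values)
    return total / len(values)


# Hypothetical centres for "low", "medium" and "high" values of one attribute.
centers = [10.0, 20.0, 40.0]
memberships = fcm_memberships(15.0, centers)  # equal membership in "low" and "medium"
```

A value halfway between two centres receives equal membership in both sets instead of being forced into one interval, which is the boundary-softening effect the abstract refers to.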

5.
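Holiday electric load forecasting based on a fuzzy multi-objective genetic optimization algorithm   Cited by: 10 (self-citations: 1, citations by others: 10)
One advantage of multi-objective genetic optimization algorithms is that they can find multiple non-inferior optimal solutions to a problem in a single iterative run. This paper applies a multi-objective genetic algorithm and an association rule algorithm to build a fuzzy-rule-based classification system for electric load patterns. In this system, the multi-objective genetic optimization algorithm automatically selects, from a large number of fuzzy classification rules, those with good recognition performance and interpretability, and fuzzy association rule mining is used to improve the search performance of the genetic algorithm through heuristic rule selection. Simulation tests show that the classification system performs well and can supply more adequate historical data for holiday load forecasting, thereby improving forecasting performance.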

6.
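Application of big data mining and analysis to condition assessment of power equipment   Cited by: 2 (self-citations: 0, citations by others: 2)
To improve the accuracy and efficiency of power equipment condition assessment, an approach and method for applying big data mining and analysis to this task are proposed. The architecture of big data mining and analysis is introduced; the multi-dimensional data on power equipment condition are divided into four categories (static, dynamic, quasi-dynamic and external parameters); data association rules, degrees of association and weights are analysed; and finally the application prospects of big data mining and analysis technology are outlined.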

7.
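Drawing on rough set theory, an improvement to association rule mining algorithms is proposed. The main idea is to use the approximation quality of a set as the iteration criterion: the initial reduct is the full set of condition attributes, and the reduct is obtained by gradually shrinking it while keeping the approximation quality unchanged, which ensures that the resulting reduct does not weaken the ability to make classification decisions. Reduction yields a new decision table, to which the greedy Apriori algorithm is applied to mine association rules. The main advantage of the algorithm is that it generates decision rules with fewer attributes and candidate itemsets and a limited number of scans, without impairing classification decision ability. An application example and experimental analysis verify the effectiveness of the algorithm.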
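
The reduction step described in this abstract can be sketched as follows. This is a minimal illustration that assumes the usual rough set definition of approximation quality (size of the positive region divided by the size of the universe) and a simple one-pass backward elimination; the function names and the toy decision table are hypothetical, and the paper's exact procedure may differ.

```python
from collections import defaultdict


def approximation_quality(rows, condition_attrs, decision_attr):
    """gamma = |positive region| / |universe| for the given condition attributes."""
    blocks = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in condition_attrs)
        blocks[key].append(row[decision_attr])
    # An equivalence class is in the positive region if all its objects
    # share the same decision value.
    consistent = sum(len(d) for d in blocks.values() if len(set(d)) == 1)
    return consistent / len(rows)


def backward_reduct(rows, condition_attrs, decision_attr):
    """Start from all condition attributes and drop each one whose removal
    leaves the approximation quality unchanged."""
    target = approximation_quality(rows, condition_attrs, decision_attr)
    reduct = list(condition_attrs)
    for attr in condition_attrs:
        trial = [a for a in reduct if a != attr]
        if trial and approximation_quality(rows, trial, decision_attr) == target:
            reduct = trial
    return reduct


# Toy decision table (hypothetical): the decision "d" depends on "b" only.
table = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "no"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
]
reduct = backward_reduct(table, ["a", "b", "c"], "d")
```

The reduced table would then be passed to a rule miner such as Apriori, as the abstract describes; the result of a one-pass elimination like this can depend on the order in which attributes are tried.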

8.
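Lightning faults are one of the main causes of voltage sags in power grids, and accurately assessing the severity of lightning-induced voltage sags provides a basis for formulating optimal mitigation schemes and for siting sensitive customers. This paper proposes a data-driven, self-learning method for assessing voltage sag severity. First, based on the mechanism by which lightning causes voltage sags, the parameters to be mined are selected from the monitoring information of the lightning location system and the power quality monitoring system. Second, to reduce the influence of the discretization result on rule accuracy, a discretization evaluation coefficient is used to determine the number of discrete intervals for each parameter. Then, to address the low efficiency of mining algorithms when the grid database changes dynamically, an association rule mining algorithm based on incremental learning continuously updates the mined rules, which gives the method its self-learning capability. Finally, a weighted Euclidean distance based on a comprehensive weighting method is proposed to assess voltage sag severity in real scenarios. An empirical analysis using monitoring data from a regional power grid and simulation data from the IEEE 30-bus system shows that the method can accurately mine valuable rules in practical applications and assess voltage sag severity at the nodes of interest.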

9.
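Based on data from power distribution and utilization information systems and an association rule algorithm, a method is proposed for diagnosing ungrounded line-break faults on branch lines of medium-voltage distribution networks. By analysing interrelated data from these information systems, an association rule mining method based on data feature selection is proposed. Continuous features are converted into Boolean features by a chi-square splitting algorithm, the MSApriori algorithm is used to handle the rare-item problem in fault information, and the kulc criterion is then applied to eliminate redundant rules and form a reduced family of representative rules. A practical case study based on historical data from the distribution and utilization information system of a region in East China shows that the proposed method greatly reduces ineffective mining, significantly improves efficiency and accuracy, and is suitable for online diagnosis of line-break faults in medium-voltage distribution networks.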
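
For readers unfamiliar with the kulc criterion named above, the sketch below computes the Kulczynski measure of an item pair from supports. The function name and the numbers are illustrative assumptions; how the paper uses the measure to eliminate redundant rules is not specified in the abstract and is not reproduced here.

```python
def kulczynski(support_a, support_b, support_ab):
    """Kulczynski measure: the mean of the two confidences P(B|A) and P(A|B)."""
    if support_a <= 0 or support_b <= 0:
        raise ValueError("supports must be positive")
    return 0.5 * (support_ab / support_a + support_ab / support_b)


# Hypothetical supports: A is a rare fault signal (2% of records),
# B is a common alarm (50%), and they co-occur in 1.8% of records.
kulc = kulczynski(support_a=0.02, support_b=0.5, support_ab=0.018)
```

Because it averages both conditional probabilities, the measure does not depend on the number of records that contain neither item, which is why it is often preferred when rare items are involved.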

10.
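The Mining Default Rules Based on Rough Set (MDRBR) algorithm is applied to short-term load forecasting in power systems. First, a rough set discretization algorithm based on the Gini index is used to discretize condition attributes that affect load, such as temperature and humidity, taking both the condition attributes and the decision attribute into account. On this basis, the confidence and support of rules are calculated to form, at different levels, a network rule set with rough set operators that satisfies preset thresholds. This reduces the redundant rules caused by noise, improves the efficiency of rule generation and of actual classification, greatly shrinks the resulting classification rule set, and speeds up rule retrieval when the rules are used. For load forecasting, the rule network is searched top-down, level by level, until a rule matching the given information is found. The rough set operator reflects the importance of a rule and also serves as the criterion for selecting rules. Practical application shows that the method effectively removes noise and improves the efficiency of default rule mining, thereby improving load forecasting accuracy, and that it has practical value.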
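
To give a feel for Gini-based discretization, the sketch below picks a single cut point on a continuous attribute by minimising the weighted Gini impurity of the decision labels on each side. This is the generic Gini criterion, offered as an assumption; it is not the paper's rough set variant, and the function names and sample data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_gini_cut(values, labels):
    """Return (cut, weighted Gini) for the best binary split, or None if no cut exists."""
    pairs = sorted(zip(values, labels))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut can be placed between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        cut = (pairs[i - 1][0] + pairs[i][0]) / 2
        if best is None or score < best[1]:
            best = (cut, score)
    return best


# Hypothetical data: temperature (condition attribute) against load class (decision).
temperatures = [18, 21, 24, 30, 33, 35]
load_classes = ["low", "low", "low", "high", "high", "high"]
cut, impurity = best_gini_cut(temperatures, load_classes)
```

Because the cut is scored against the decision labels, the discretization takes the decision attribute into account rather than splitting the temperature range into equal-width bins.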

11.
Genetic network programming (GNP)-based class association rule mining has been shown to be efficient for misuse and anomaly detection. However, misuse detection is weak at detecting brand-new attacks, while anomaly detection suffers from a high false positive rate. In this paper, a unified detection method is proposed that integrates misuse detection and anomaly detection to overcome their disadvantages. In addition, the GNP-based class association rule mining method extracts an overwhelming number of rules that contain much redundant and irrelevant information. Therefore, an efficient class association rule-pruning method based on matching degree and a genetic algorithm (GA) is also proposed. In the first stage, a matching degree-based method is applied to preprune the rules in order to improve the efficiency of the GA. In the second stage, the GA selects the effective rules from those remaining after the first stage. Simulations on KDDCup99 show the high performance of the proposed method. © 2012 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.

12.
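Based on an analysis of fault diagnosis for smart electricity, water and gas meters and of the characteristics of the actual defect-elimination workflow, this paper proposes a method for diagnosing and handling smart meter anomalies based on a "multi-meter integration" system. The method first obtains diagnostic rules through association mining, then converts the monitored data into event elements and matches them against the rules to obtain a diagnosis, and finally confirms the diagnosis through on-site inspection and feeds the result back. The method combines anomaly diagnosis with closed-loop defect-elimination management, improves diagnostic accuracy by continuously refining the diagnostic knowledge base, and makes full use of the advantages of association mining.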

13.
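Power grid fault diagnosis method based on feature mining   Cited by: 4 (self-citations: 0, citations by others: 4)
The main bottlenecks in applying expert systems are the maintenance of the rule base and the trade-off between inference speed and accuracy. Information that is always present in, or unique to, fault information sequences is analysed, and an association rule mining method based on feature mining is proposed. Taking the characteristics of grid fault information into account, the frequent pattern (FP) algorithm is improved in three ways: characteristics of fault information are considered, such as temporal and causal relationships, fault type, severe faults and rare faults; OR logic is added to the rules; and the pruning technique for the FP-tree is improved. A case study shows that the algorithm greatly reduces ineffective mining, significantly improves inference speed and accuracy, and is suitable for online diagnosis.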

14.
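To find the relationship between load values and the factors that affect load forecasting accuracy, and thereby handle missing data, a model for handling missing short-term load data based on associative classification is proposed. The model first applies data reduction to the load information system to obtain a reduced set, then uses an associative classification algorithm to mine the interesting strong association rules hidden in it that satisfy the user-specified minimum support and minimum confidence, and finally repairs records containing missing data through rule matching and flags problematic data as anomalous. Simulation analysis shows that this new strategy for handling missing data yields more accurate forecasts.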

15.
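Because faults occur frequently during distribution network operation, quickly and efficiently locating the weak points of a distribution network has become a major challenge for its safe operation. This paper adopts the frequent pattern network (FP-network) model, builds a transaction-item association matrix, and stores in it the data to be mined, so that association rules can be mined from the matrix. A case study confirms that the FP-network association rule mining algorithm can be used for weak-point analysis of distribution networks, and the actual operating conditions of a distribution network verify its feasibility. The algorithm needs to scan the fault data in the distribution network database only once, which improves the efficiency of mining association rules from fault data, makes it easier to update the database in real time, and provides technical support for analysing and detecting weak points in distribution network operation.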

16.
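A transient power quality assessment method is proposed that uses Naive Bayes classification on a big data processing architecture. The data sources are extended to grid operation monitoring data, power customer data and public information data, and the assessment results are divided by severity into four states: transient normal, short-duration voltage sag, short-duration deep voltage sag, and short-duration voltage loss. A distributed Naive Bayes algorithm based on the MapReduce architecture is designed for state classification. In the classifier training stage, massive historical data are learned in a distributed manner, and an assessment rule base is generated periodically and deployed to all assessment nodes. In the state assessment stage, each assessment node uses a stream processing framework to quickly generate real-time assessment samples and derives assessment results in real time from the current rule base. Test results show that the proposed method is feasible and performs well in both accuracy and processing speed.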

17.
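With the development of smart grids, communication networks and the analysis of electric power production safety accidents and incidents, the volume of accident and incident data has grown rapidly and its complexity has increased, gradually forming big data on power production safety accidents and incidents. To classify and identify accident causes efficiently and reliably on the basis of prior accident and incident big data, key causes are screened using association rule mining. According to the characteristics of the accidents and incidents, a cause analysis framework is established, different types of accidents are Boolean-discretized, a method for calculating the inducing degree of accident causes is proposed based on association rule mining, the Apriori algorithm is used for in-depth association rule mining, and key causes are screened and analysed according to the strong association rules. Accident cases from a region over the past five years verify the effectiveness of the method.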
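
The Apriori step and the extraction of strong rules can be sketched as below. This is a minimal textbook-style Apriori over Boolean items, not the paper's implementation; the item names and thresholds are hypothetical, and the paper's inducing-degree calculation is not reproduced.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    level = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while level:
        current = {s: support(s) for s in level}
        current = {s: v for s, v in current.items() if v >= min_support}
        frequent.update(current)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset.
        k += 1
        level = {a | b for a in current for b in current if len(a | b) == k}
        level = {
            c for c in level
            if all(frozenset(sub) in current for sub in combinations(c, k - 1))
        }
    return frequent


def strong_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(itemset, r):
                antecedent = frozenset(antecedent)
                confidence = sup / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical Boolean-discretized accident records (cause and outcome items).
records = [
    {"fatigue", "no_permit", "fall"},
    {"fatigue", "fall"},
    {"no_permit", "shock"},
    {"fatigue", "no_permit", "fall"},
    {"fatigue", "shock"},
]
frequent = apriori(records, min_support=0.4)
rules = strong_rules(frequent, min_confidence=0.9)
```

Every antecedent looked up in `strong_rules` is guaranteed to be present in `frequent`, because every subset of a frequent itemset is itself frequent.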

18.
Traditional principal component analysis (PCA)-based face recognition algorithms have low recognition accuracy because of noise and illumination changes. This paper proposes a robust, intelligent PCA-based face recognition framework for a complicated illumination database with multiple training images per person (MTIP-CID). The proposed method makes two main improvements. First, a face-recognition-oriented, genetic-based clustering algorithm is introduced to reduce the influence of a large number of classes on classification accuracy in the MTIP-CID. Second, a classifier based on fuzzy class association rules (FCARs) is applied to mine the inherent relationships between eigenfaces and to improve the robustness of PCA-based face recognition in noisy environments. Experimental results on the extended Yale-B database demonstrate that the proposed framework performs better and is more robust against noise than other traditional face recognition algorithms, namely linear discriminant analysis (LDA) and local binary patterns (LBPs). © 2013 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.

19.
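Non-intrusive load monitoring and decomposition (NILMD) is a key technology for obtaining appliance-level electricity consumption information. Current NILMD methods neglect usage patterns in which different appliances operate in association, as well as the strong fluctuation of appliance states, which leads to low decomposition accuracy. To address this, a new load decomposition method that takes association rules among appliance states into account is proposed. Appliance operating states are extracted by affinity propagation clustering, and, based on mutual information entropy, an association rule algorithm mines the associations among appliance states. The weights of samples covered by association rules are adjusted and combined with the k-nearest-neighbour algorithm for state identification, and maximum likelihood estimation is used to complete the load power decomposition. Test cases verify the effectiveness and accuracy of the proposed method.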

20.
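Concrete dam construction information is mostly presented as document text, which is large in volume, widely distributed and internally complex; manual processing can hardly extract the knowledge it contains accurately and efficiently or untangle the intricate relationships among construction information. In natural language processing, named entities are the carriers of textual knowledge, and accurate, fast entity recognition is an important prerequisite for construction knowledge mining. This paper proposes a method for intelligent recognition and mining analysis of knowledge in concrete dam construction documents that combines deep learning with association rule techniques. The method couples a bi-directional long short-term memory (Bi-LSTM) network with a conditional random field (CRF), defines entity types for concrete dam construction, builds a named entity recognition model, and forms a knowledge set of concrete dam construction entities. On this basis, considering the expression patterns of construction text and the entity types, relationships between entities are predefined and the combination forms of construction entities are determined, yielding an entity association rule extraction technique. Guided by this technique, the Apriori algorithm is improved to compute frequent itemsets and obtain strong association rules between entities. Applied to actual weekly supervision reports for concrete dam construction, the method achieves a named entity recognition precision of 86.42%, which verifies its accuracy. Analysing the association rules between entities with the improved Apriori algorithm demonstrates the advantages of the improved algorithm and helps make knowledge analysis of concrete dam construction documents more intelligent and fine-grained.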
