Similar Literature
19 similar documents found (search time: 218 ms)
1.
Building on a review of existing split methods for numeric attributes, this paper introduces the concept of the pure interval and proposes a numeric-attribute split method based on pure-interval reduction. The method partitions the attribute's value range into multiple intervals using an equal-width histogram and then handles pure and impure intervals separately. Theoretical analysis and experimental results show that the method reduces the search space while preserving split accuracy.
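The pure-interval idea above can be sketched as follows. This is an illustrative reading of the abstract, not the paper's algorithm: partition the attribute range into equal-width bins and mark a bin "pure" when every sample falling in it carries the same class label.

```python
# Sketch: equal-width binning with pure-interval detection (assumed reading
# of the abstract, not the paper's exact method).
def equal_width_bins(values, k):
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # guard against a zero-width range
    return [lo + i * width for i in range(k + 1)]

def pure_intervals(samples, k):
    """samples: list of (value, label). Returns [(bin_index, is_pure), ...]."""
    values = [v for v, _ in samples]
    edges = equal_width_bins(values, k)
    width = edges[1] - edges[0]
    bins = [set() for _ in range(k)]
    for v, label in samples:
        i = min(int((v - edges[0]) / width), k - 1)  # clamp the max value
        bins[i].add(label)
    # a bin is "pure" when at most one distinct label landed in it
    return [(i, len(labels) <= 1) for i, labels in enumerate(bins)]
```

Pure bins can then be labeled wholesale, so only the impure ones need a finer split-point search.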

2.
Zhou Jun, Lin Qing, Hu Ruirui. Journal of Computer Applications, 2009, 29(6): 1608-1611
Rough set theory has notable strengths in classification analysis and knowledge acquisition over imprecise, uncertain, and incomplete data. This paper dynamically analyzes the boundary region of rough sets from the perspective of granularity and, combining the theory of attribute association, defines the concept of the dynamic granularity quotient. Based on granularity theory, a new attribute reduction algorithm is proposed: the optimal reduct is selected via the dynamic granularity quotient, abandoning the traditional approach of first computing the core and then selecting the optimal reduct. A case study shows that the proposed granularity computation method is reliable and effective, offering a feasible approach for further research on the granular computation of knowledge.

3.
Termination analysis is a core problem for active databases. Existing work addresses it with triggering graphs and activation graphs, a key technique being the reduction of the active rule set by a reduction algorithm. Existing methods, however, fail to recognize some reducible rules. This paper introduces the concepts of independent triggering cycles, non-independent triggering cycles, activation paths, activation-forbidding cycles, and activation-forbidden rules, and on this basis proposes a new reduction algorithm that recognizes more reducible rules.

4.
For composite services modeled as directed acyclic graphs (DAGs), this paper proposes a new QoS measurement method for composite services: QoS computation for Web services based on topological-sequence reduction (QCMTSR). Borrowing the basic structures and QoS formulas of iterative-reduction measurement methods, it defines two basic structures in a DAG, the serial reduction structure and the parallel reduction structure, and gives QoS attribute formulas for both. By reducing each node of the DAG's topological sequence in turn, the QoS attribute values of the last remaining node become the measured QoS attributes of the composite service. It is proved that QCMTSR applies to all DAG-described composite services, and experiments show that it measures reliability and availability more accurately.
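A minimal sketch of topological-order QoS aggregation over a DAG, in the spirit of the abstract but not the paper's exact QCMTSR formulas: response time accumulates along the critical path (serial composition sums, concurrent branches take the max) while reliability multiplies over all invoked services.

```python
# Illustrative QoS aggregation by topological order (assumed formulas:
# critical-path response time, multiplicative reliability).
from graphlib import TopologicalSorter

def composite_qos(dag, qos):
    """dag: {node: set of predecessors}; qos: {node: (time, reliability)}."""
    finish = {}
    reliability = 1.0
    for node in TopologicalSorter(dag).static_order():
        t, r = qos[node]
        # a node starts once all predecessors finish; no predecessor -> t=0
        start = max((finish[p] for p in dag[node]), default=0.0)
        finish[node] = start + t
        reliability *= r
    return max(finish.values()), reliability
```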

5.
This paper describes the grammar-reduction stage of a two-dimensional extended attribute grammar method for recognizing handwritten and printed Chinese characters. After components are extracted from the four directions, they are reduced according to their combination attributes and the positional relations of their bounding boxes. By exploiting the informational redundancy in how Chinese character components combine, together with the semantic processing power of extended attribute grammars, the method lowers the requirement for correct component extraction while still recognizing heavily distorted characters and effectively distinguishing highly similar ones.

6.
For composite services modeled as directed acyclic graphs (DAGs), this paper proposes a new QoS measurement method for composite services: QoS computation for Web services based on topological-sequence reduction (QCMTSR). Borrowing the basic structures and QoS formulas of iterative-reduction measurement methods, it defines two basic structures in a DAG, the serial reduction structure and the parallel reduction structure, and gives QoS attribute formulas for both. By reducing each node of the DAG's topological sequence in turn, the QoS attribute values of the last remaining node become the measured QoS attributes of the composite service. It is proved that QCMTSR applies to all DAG-described composite services, and experiments show that it measures reliability and availability more accurately.

7.
Research on multi-attribute decision making with hybrid information   (Cited 1 time: 0 self, 1 other)
Zhang Zhuo. Computer & Digital Engineering, 2014, (12): 2433-2438, 2442
For hybrid multi-attribute decision problems whose attributes comprise real numbers, intervals, triangular fuzzy numbers, and uncertain linguistic values, this paper first analyzes the measurement relations among the hybrid attribute types and, drawing on their characteristics, discusses the conversions between them, unifying the hybrid attributes within an interval-valued framework and thereby enabling uncertainty measurement across them. Next, using a grey-relational interval multi-attribute decision method, the closeness degree of each alternative is computed, and decisions are made over the interval information converted from the hybrid attributes according to closeness. Finally, a weapon-threat assessment example verifies the practicality of the proposed hybrid multi-attribute decision method.
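The unification step above can be sketched as a mapping from each hybrid value onto an interval. The mappings chosen here are illustrative assumptions, not the paper's cut-based conversions: a real number x becomes the degenerate interval [x, x], and a triangular fuzzy number (l, m, u) is mapped to its support [l, u].

```python
# Sketch: unify real numbers, intervals, and triangular fuzzy numbers into
# interval form (assumed mappings for illustration).
def to_interval(value):
    if isinstance(value, (int, float)):     # real number -> degenerate interval
        return (float(value), float(value))
    if len(value) == 2:                     # already an interval [lo, hi]
        return (float(value[0]), float(value[1]))
    l, m, u = value                         # triangular fuzzy number (l, m, u)
    return (float(l), float(u))             # use its support as the interval
```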

8.
An efficient algorithm for computing the irreducible rule set in active databases   (Cited 5 times: 1 self, 5 other)
Termination analysis of rule sets in active databases is an important problem and has become a research focus. Some work addresses it at compile time using triggering graphs and activation graphs, a key technique being the computation of the irreducible rule set of the active rule set. Because existing computation methods are somewhat conservative, the irreducible rule sets they produce can still be reduced further, which inevitably affects both the accuracy of termination analysis and the efficiency of run-time rule analysis. After a close analysis of how activated rules can execute indefinitely, this paper introduces concepts such as the activation path and, based on them, presents an efficient algorithm for computing the irreducible rule set of an active rule set, further reducing the sets obtained by existing methods.
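The classic conservative pruning behind the irreducible-rule-set computation (the baseline the paper refines, not its improved algorithm) can be sketched as: a rule that triggers nothing, or that nothing triggers, can never sustain an infinite execution, so such rules are deleted repeatedly; the rules that remain all lie on triggering cycles.

```python
# Sketch: prune the triggering graph down to rules that lie on cycles
# (the baseline reduction, not the paper's refined algorithm).
def irreducible_rules(edges, rules):
    """edges: set of (r1, r2) pairs meaning rule r1 may trigger rule r2."""
    rules = set(rules)
    changed = True
    while changed:
        changed = False
        live = {(a, b) for a, b in edges if a in rules and b in rules}
        sources = {a for a, _ in live}   # rules that trigger something
        targets = {b for _, b in live}   # rules that something triggers
        removable = {r for r in rules if r not in sources or r not in targets}
        if removable:
            rules -= removable
            changed = True
    return rules
```

An empty result means the triggering graph is acyclic, so the rule set is guaranteed to terminate.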

9.
Bao Di, Zhang Nan, Tong Xiangrong, Yue Xiaodong. Journal of Computer Applications, 2019, 39(8): 2288-2296
Real applications contain large amounts of dynamically growing interval-valued data. If a traditional non-incremental positive-region attribute reduction method is used, the positive-region reduct of the updated interval-valued dataset must be recomputed from scratch, greatly lowering the efficiency of attribute reduction. To address this, positive-region incremental attribute reduction methods for interval-valued decision tables are proposed. First, the relevant concepts of positive-region reduction for interval-valued decision tables are given. Then the positive-region update mechanisms for single-object and group increments are analyzed and proved, and single-increment and group-increment positive-region attribute reduction algorithms for interval-valued decision tables are presented. Finally, experiments on eight UCI datasets show that, as the data volume grows from 60% to 100%, the traditional non-incremental algorithm takes 36.59, 72.35, 69.83, 154.29, 80.66, 1498.11, 4124.14, and 809.65 s on the eight datasets, the single-increment algorithm takes 19.05, 46.54, 26.98, 26.12, 34.02, 1270.87, 1598.78, and 408.65 s, and the group-increment algorithm takes 6.39, 15.66, 3.44, 15.06, 8.02, 167.12, 180.88, and 61.04 s. The results demonstrate the efficiency of the proposed positive-region incremental attribute reduction algorithms for interval-valued decision tables.
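A toy sketch of the incremental idea (for an ordinary decision table rather than the paper's interval-valued one): keep the equivalence classes induced by the conditional attributes, and when a new object arrives, update only the class it falls into and re-check that class's consistency, instead of recomputing the whole positive region.

```python
# Sketch: incrementally maintained positive region (objects whose condition
# class has a single decision value). Simplified to crisp attribute values.
from collections import defaultdict

class PositiveRegion:
    def __init__(self):
        self.classes = defaultdict(list)  # condition tuple -> decision values

    def add(self, conditions, decision):
        # O(1) per insertion: only one equivalence class is touched
        self.classes[tuple(conditions)].append(decision)

    def size(self):
        # count objects in consistent classes (exactly one distinct decision)
        return sum(len(d) for d in self.classes.values() if len(set(d)) == 1)
```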

10.
Reduction is one of the basic algorithms of parallel computing and is widely used in scientific computing, image processing, and other fields, so accelerating it matters. To fully exploit GPU computing power on heterogeneous platforms, this paper proposes reduction optimizations at three levels: in-thread reduction, intra-work-group reduction, and inter-work-group reduction. Breaking with prior work, which concentrated its optimization effort on intra-work-group reduction, it argues that in-thread reduction is the real bottleneck. Experiments show that, across data sizes, the proposed reduction achieves speedups of 3.91-15.93x on AMD W8000 and 2.97-20.24x on NVIDIA Tesla K20M over the carefully optimized CPU version in the OpenCV library; 2.25-5.97x and 1.25-1.75x over OpenCV's CUDA and OpenCL versions, respectively, on the Tesla K20M; and 1.24-5.15x over the OpenCL version on the W8000. The work achieves both high reduction performance on GPU platforms and performance portability across them.
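A plain-Python sketch of the two-level reduction pattern the paper tunes on GPUs: each "work-group" reduces its chunk with a tree-shaped pass, and a final pass reduces the per-group partial results. The Python loops are stand-ins for OpenCL kernels, shown only to illustrate the data flow; group size is an arbitrary assumption.

```python
# Sketch: tree reduction within groups, then a reduction of group partials
# (CPU stand-in for the GPU work-group pattern).
def tree_reduce(vals, op):
    vals = list(vals)
    while len(vals) > 1:
        nxt = [op(vals[i], vals[i + 1]) for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:          # odd element carries over to the next level
            nxt.append(vals[-1])
        vals = nxt
    return vals[0]

def group_reduce(data, op, group_size=4):
    # "intra-work-group" phase: one partial result per chunk
    partials = [tree_reduce(data[i:i + group_size], op)
                for i in range(0, len(data), group_size)]
    # "inter-work-group" phase: reduce the partials
    return tree_reduce(partials, op)
```

The operator must be associative (sum, max, min, ...) for the tree-shaped order to give the same result as a sequential loop.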

11.
We present a method to learn maximal generalized decision rules from databases by integrating discretization, generalization and rough set feature selection. Our method reduces the data horizontally and vertically. In the first phase, discretization and generalization are integrated and the numeric attributes are discretized into a few intervals. The primitive values of symbolic attributes are replaced by high level concepts and some obviously superfluous or irrelevant symbolic attributes are also eliminated. Horizontal reduction is accomplished by merging identical tuples after the substitution of an attribute value by its higher level value in a pre-defined concept hierarchy for symbolic attributes, or the discretization of continuous (or numeric) attributes. This phase greatly decreases the number of tuples in the database. In the second phase, a novel context-sensitive feature merit measure is used to rank the features; a subset of relevant attributes is chosen based on rough set theory and the merit values of the features. A reduced table is obtained by removing those attributes which are not in the relevant attribute subset, and the data set is further reduced vertically without destroying the interdependence relationships between classes and the attributes. Then rough set-based value reduction is further performed on the reduced table and all redundant condition values are dropped. Finally, tuples in the reduced table are transformed into a set of maximal generalized decision rules. The experimental results on UCI data sets and a real market database demonstrate that our method can dramatically reduce the feature space and improve learning accuracy.
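The horizontal-reduction step described above can be sketched directly: after discretization and generalization many tuples become identical, so merging them while keeping a count shrinks the table without losing the class/attribute interdependence.

```python
# Sketch: merge identical tuples, keeping a multiplicity count
# (the horizontal-reduction step after discretization/generalization).
from collections import Counter

def horizontal_reduce(tuples):
    merged = Counter(map(tuple, tuples))
    return [(row, count) for row, count in merged.items()]
```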

12.
Inductive learning systems can be effectively used to acquire classification knowledge from examples. Many existing symbolic learning algorithms can be applied in domains with continuous attributes when integrated with a discretization algorithm that transforms the continuous attributes into ordered discrete ones. In this paper, a new information-theoretic discretization method optimized for supervised learning is proposed and described. This approach seeks to maximize the mutual dependence, as measured by the interdependence redundancy between the discrete intervals and the class labels, and can automatically determine the most preferred number of intervals for an inductive learning application. The method has been tested on a number of inductive learning examples to show that the class-dependent discretizer can significantly improve the classification performance of many existing learning algorithms in domains containing numeric attributes.
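The criterion named above, interdependence redundancy R = I(intervals; class) / H(intervals, class), can be sketched as follows. The candidate binnings here are simple equal-width splits, an assumption made for illustration rather than the paper's search procedure.

```python
# Sketch: pick the number of equal-width intervals maximizing the
# interdependence redundancy R = I(X;Y) / H(X,Y) (illustrative search space).
from math import log2
from collections import Counter

def interdependence_redundancy(bins, labels):
    n = len(bins)
    joint = Counter(zip(bins, labels))
    px, py = Counter(bins), Counter(labels)
    # mutual information I(X;Y): sum p(x,y) log [p(x,y) / (p(x) p(y))]
    mi = sum(c / n * log2(c * n / (px[x] * py[y]))
             for (x, y), c in joint.items())
    # joint entropy H(X,Y)
    h = -sum(c / n * log2(c / n) for c in joint.values())
    return mi / h if h else 0.0

def best_equal_width_k(values, labels, ks=(2, 3, 4, 5)):
    lo, hi = min(values), max(values)
    def binned(k):
        w = (hi - lo) / k
        return [min(int((v - lo) / w), k - 1) for v in values]
    return max(ks, key=lambda k: interdependence_redundancy(binned(k), labels))
```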

13.
In data mining many datasets are described with both discrete and numeric attributes. Most Ant Colony Optimization-based classifiers can only deal with discrete attributes and need a pre-processing discretization step in the case of numeric attributes. We propose an adaptation of AntMiner+ for rule mining that intrinsically handles numeric attributes. We describe the new approach and compare it to the existing algorithms. The proposed method achieves results comparable with existing methods on UCI datasets, but has advantages on datasets with strong interactions between numeric attributes. We analyse the effect of parameters on the classification accuracy and propose sensible defaults. We describe an application of the new method to a real-world medical domain, where it achieves results comparable with the existing method.

14.
An agglomerative clustering algorithm for decomposing sets of probabilistic logic formulas   (Cited 1 time: 2 self, 1 other)
To make uncertain reasoning in probabilistic logic applicable to larger knowledge bases, and drawing on experience from developing the knowledge base of a real expert system, this paper designs an agglomerative clustering algorithm for decomposing sets of probabilistic logic formulas, built on a general algorithm for the consistency intervals of probabilistic logic formulas. For probabilistic-logic knowledge bases of different backgrounds, as long as the formula set has a certain hierarchical structure, the algorithm guarantees that the joint computation model of Dantzig-Wolfe decomposition applies to probabilistic logic reasoning. Test results show that the algorithm performs well on instances with dozens of variables and clauses.

15.
We present a data mining method which integrates discretization, generalization and rough set feature selection. Our method reduces the data horizontally and vertically. In the first phase, discretization and generalization are integrated. Numeric attributes are discretized into a few intervals. The primitive values of symbolic attributes are replaced by high level concepts and some obviously superfluous or irrelevant symbolic attributes are also eliminated. The horizontal reduction is done by merging identical tuples after substituting an attribute value by its higher level value in a pre-defined concept hierarchy for symbolic attributes, or by the discretization of continuous (or numeric) attributes. This phase greatly decreases the number of tuples considered further in the database. In the second phase, a novel context-sensitive feature merit measure is used to rank features; a subset of relevant attributes is chosen based on rough set theory and the merit values of the features. A reduced table is obtained by removing those attributes which are not in the relevant attribute subset, and the data set is further reduced vertically without changing the interdependence relationships between the classes and the attributes. Finally, the tuples in the reduced relation are transformed into different knowledge rules based on different knowledge discovery algorithms. Based on these principles, a prototype knowledge discovery system, DBROUGH-II, has been constructed by integrating discretization, generalization, rough set feature selection and a variety of data mining algorithms. Tests on a telecommunication customer data warehouse demonstrate that different kinds of knowledge rules, such as characteristic rules, discriminant rules, maximal generalized classification rules, and data evolution regularities, can be discovered efficiently and effectively.

16.
K-means-type clustering algorithms for mixed data consisting of numeric and categorical attributes suffer from the cluster center initialization problem: the final clustering results depend upon the initial cluster centers. Random cluster center initialization is a popular initialization technique, but clustering results are not consistent across different initializations. The K-Harmonic means clustering algorithm mitigates this problem for purely numeric data. In this paper, we extend the K-Harmonic means clustering algorithm to mixed datasets. We propose a definition for a cluster center and a distance measure, which are used with the cost function of K-Harmonic means clustering in the proposed algorithm. Experiments were carried out with pure categorical datasets and mixed datasets. The results suggest that the proposed clustering algorithm is quite insensitive to the cluster center initialization problem. Comparative studies with other clustering algorithms show that the proposed algorithm produces better clustering results.
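A hedged sketch of the two ingredients such an extension needs for mixed data: a cluster "center" that keeps the mean of each numeric attribute plus the category-frequency distribution of each categorical attribute, and a distance combining squared numeric difference with one minus the frequency of the point's category in the center. The exact weighting is an illustrative assumption, not the paper's definition.

```python
# Sketch: mixed-data cluster center and distance (assumed forms for
# illustration; the paper's definitions may differ).
from collections import Counter

def mixed_center(points, num_idx, cat_idx):
    """points: list of tuples; num_idx/cat_idx: attribute index lists."""
    means = [sum(p[i] for p in points) / len(points) for i in num_idx]
    freqs = [Counter(p[i] for p in points) for i in cat_idx]
    return means, [{k: v / len(points) for k, v in f.items()} for f in freqs]

def mixed_distance(point, center, num_idx, cat_idx):
    means, freqs = center
    d = sum((point[i] - m) ** 2 for i, m in zip(num_idx, means))
    # categorical part: rarer categories in the cluster are "farther"
    d += sum(1.0 - f.get(point[i], 0.0) for i, f in zip(cat_idx, freqs))
    return d
```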

17.
18.
《Knowledge》2007,20(4):419-425
Many classification algorithms require that training examples contain only discrete values. In order to use these algorithms when some attributes have continuous numeric values, the numeric attributes must be converted into discrete ones. This paper describes a new way of discretizing numeric values using information theory. Our method is context-sensitive in the sense that it takes into account the value of the target attribute. The amount of information each interval gives about the target attribute is measured using Hellinger divergence, and the interval boundaries are decided so that the intervals contain amounts of information as equal as possible. In order to compare our discretization method with some current discretization methods, several popular classification data sets are selected for discretization. We use the naive Bayesian classifier and C4.5 as classification tools to compare the accuracy of our discretization method with that of other methods.
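The per-interval information measure can be sketched as the Hellinger divergence between the class distribution inside an interval and the overall class distribution; the boundary search that equalizes it across intervals is omitted here, and treating the global distribution as the reference is an assumption for illustration.

```python
# Sketch: Hellinger divergence of an interval's class distribution from the
# overall class distribution (reference choice is an assumption).
from math import sqrt
from collections import Counter

def hellinger(p, q):
    keys = set(p) | set(q)
    return sqrt(sum((sqrt(p.get(k, 0.0)) - sqrt(q.get(k, 0.0))) ** 2
                    for k in keys)) / sqrt(2)

def interval_information(interval_labels, all_labels):
    def dist(labels):
        c = Counter(labels)
        return {k: v / len(labels) for k, v in c.items()}
    return hellinger(dist(interval_labels), dist(all_labels))
```

An interval whose class mix matches the overall mix carries no information (divergence 0); a class-pure interval scores high.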

19.
Instance-based prediction of real-valued attributes   (Cited 2 times: 0 self, 2 other)
Instance-based representations have been applied to numerous classification tasks with some success. Most of these applications involved predicting a symbolic class based on observed attributes. This paper presents an instance-based method for predicting a numeric value based on observed attributes. We prove that, given enough instances, if the numeric values are generated by continuous functions with bounded slope, then the predicted values are accurate approximations of the actual values. We demonstrate the utility of this approach by comparing it with a standard approach for value prediction. The instance-based approach requires neither ad hoc parameters nor background knowledge.
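A minimal sketch of instance-based numeric prediction in the spirit of the abstract: store the training instances and predict the target of a query as a distance-weighted average over its k nearest neighbours. The inverse-distance weighting is an illustrative assumption, not the paper's exact method.

```python
# Sketch: k-nearest-neighbour regression with inverse-distance weighting
# (weighting scheme assumed for illustration).
def knn_predict(train, query, k=3):
    """train: list of (attribute_vector, numeric_target) pairs."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    nearest = sorted(train, key=lambda t: dist(t[0], query))[:k]
    # small epsilon avoids division by zero for an exact attribute match
    weights = [1.0 / (dist(x, query) + 1e-9) for x, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)
```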
