Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
An important step in building expert and intelligent systems is obtaining the knowledge they will use. This knowledge can be obtained from experts or, more often nowadays, from machine learning processes applied to large volumes of data. However, for some of these learning processes, a large volume of data makes the knowledge extraction phase very slow (or even impossible). Moreover, the data sets used for learning often originate from measurement processes in which the collected data can contain errors, so noise in the data is inevitable. In such environments, an initial step of noise filtering and data set size reduction plays a fundamental role. For both tasks, instance selection emerges as a possible solution that has proved useful in various fields. In this paper we focus mainly on instance selection for noise removal. In addition, in contrast to most existing methods, which apply instance selection to classification tasks (discrete prediction), the proposed approach is used to obtain instance selection methods for regression tasks (prediction of continuous values). The different nature of the value to predict poses an extra difficulty, which explains the low number of articles on instance selection for regression. More specifically, the idea used in this article to adapt “classic” instance selection algorithms for classification to regression problems is as simple as discretizing the numerical output variable. In the experiments, the proposed method is compared with much more sophisticated methods designed specifically for regression and proves to be very competitive. The main contributions of the paper are: (i) a simple way to adapt instance selection algorithms for classification to regression, (ii) the use of this approach to adapt a popular noise filter called ENN (edited nearest neighbor), and (iii) a comparison of this noise filter against two others designed specifically for regression, in which it proves very competitive despite its simplicity.
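The idea in this abstract (discretize the continuous target, then run a classification noise filter such as ENN) can be illustrated with a minimal sketch. The equal-width binning, the function names, and the parameters `n_bins` and `k` are assumptions made for illustration; the abstract does not say which discretization scheme or settings the authors use.

```python
from collections import Counter


def discretize(y, n_bins):
    """Map each continuous target to an equal-width bin index (assumed scheme)."""
    lo, hi = min(y), max(y)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 when all targets are equal
    return [min(int((v - lo) / width), n_bins - 1) for v in y]


def enn_filter(X, labels, k=3):
    """Edited nearest neighbor: keep an instance only if its label matches
    the majority label among its k nearest neighbors."""
    keep = []
    for i, xi in enumerate(X):
        # Squared Euclidean distance from instance i to every other instance.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(xi, xj)), j)
            for j, xj in enumerate(X)
            if j != i
        )
        votes = Counter(labels[j] for _, j in dists[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            keep.append(i)
    return keep


def enn_for_regression(X, y, n_bins=3, k=3):
    """Discretize y, run ENN on the bin labels, return indices of kept instances."""
    return enn_filter(X, discretize(y, n_bins), k)
```

Called on a small data set in which one instance has a target far from those of its neighbors, `enn_for_regression` returns the indices of all instances except that one. The retained instances keep their original continuous targets; the bin labels are used only for filtering.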

2.
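Edge computing extends the computing resources and storage capacity of resource-constrained Internet of Things (IoT) devices and can improve the execution performance of IoT applications. In IoT environments, most applications are deployed across sites in a distributed architecture, and the sites must cooperate to complete tasks. To address cost optimization for multi-site collaborative computing in IoT environments, a genetic-algorithm-based multi-site collaborative computation offloading algorithm, GAMCCO, is proposed. The algorithm abstracts an application as a task dependency graph, analyzes the dependencies among tasks, formulates multi-site collaborative computation offloading as a cost model, and uses a genetic algorithm to search for the minimum-cost offloading scheme. Experimental and evaluation results show that GAMCCO effectively reduces the latency of IoT applications while lowering the energy consumption of terminal devices.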

3.
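To address instance selection for big data, an instance selection algorithm based on random forest (RF) and a voting mechanism is proposed. First, the big data set is split into two subsets: the first must be large, and the second small to medium-sized. The first, large subset is then partitioned into q smaller subsets that are deployed to q cloud computing nodes, and the second subset is broadcast to all q nodes. Next, each node trains a random forest on its local subset and uses it to select instances from the second subset; the instances selected at the individual nodes are then merged to obtain the subset selected in this round. This process is repeated p times to obtain p instance subsets. Finally, the p subsets vote to yield the final selected instance subset. The algorithm was implemented on two big data platforms, Hadoop and Spark, and the implementation mechanisms of the two platforms were compared. In addition, the algorithm was compared with the condensed nearest neighbor (CNN) and reduced nearest neighbor (RNN) algorithms on six large data sets. The experimental results show that the larger the data set, the higher the test accuracy and the shorter the running time of the proposed algorithm relative to these two algorithms. This demonstrates that the proposed algorithm generalizes well and runs efficiently on big data, and can effectively solve the big data instance selection problem.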
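The final voting step over the p selected subsets might look like the sketch below. The abstract does not state the voting rule, so the threshold `min_votes` and the function name `vote_select` are assumptions; instances are represented by hypothetical integer IDs.

```python
from collections import Counter


def vote_select(subsets, min_votes):
    """Keep every instance ID that appears in at least min_votes of the p subsets."""
    # set(s) ensures an ID counts at most once per subset.
    counts = Counter(i for s in subsets for i in set(s))
    return sorted(i for i, c in counts.items() if c >= min_votes)
```

With `min_votes` set to more than half of p this is a majority vote; a lower threshold keeps more instances at the cost of a weaker agreement requirement.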

4.
Graph-based semi-supervised learning is an important semi-supervised learning paradigm. Although graph-based semi-supervised learning methods have been shown to be helpful in various situations, using unlabeled data may in some cases degrade their performance. In this paper, we propose a new graph-based semi-supervised learning method based on instance selection in order to reduce the chances of performance degeneration. Our basic idea is that, given a set of unlabeled instances, exploiting all of them is not the best approach; instead, we should exploit the unlabeled instances that are highly likely to help improve performance and leave out the ones with high risk. We develop both transductive and inductive variants of our method. Experiments on a broad range of data sets show that the chances of performance degeneration of our proposed method are much smaller than those of many state-of-the-art graph-based semi-supervised learning methods.

5.
The main limitation of data mining algorithms is their inability to deal with the huge amount of available data in a reasonable processing time. One way to produce fast and accurate results is instance and feature selection. This process eliminates noisy or redundant data in order to reduce storage and computational cost without degrading performance. In this paper, a new instance selection approach called the Ensemble Margin Instance Selection (EMIS) algorithm is proposed. This approach is based on the ensemble margin. To evaluate our approach, we conducted several experiments on different real-world classification problems from the UCI Machine Learning Repository. Pixel-based image segmentation is a field in which the storage requirement and computational cost of the applied model become high. To address these limitations, we conduct a study based on the application of EMIS and other instance selection techniques to the segmentation and automatic recognition of white blood cells (WBC; nucleus and cytoplasm) in cytological images.

6.
Biological data often contain redundant and irrelevant features. These features can mislead the modeling algorithms and cause overfitting. Without a feature selection method, it is difficult for existing models to accurately capture the patterns in the data. The aim of feature selection is to choose a small number of relevant or significant features to enhance classification performance. Existing feature selection methods suffer from problems such as becoming stuck in local optima and being computationally expensive. To solve these problems, an efficient global search technique is needed. The Black Hole Algorithm (BHA) is a new and efficient global search technique, inspired by the behavior of black holes, that has been applied to several optimization problems. However, the potential of BHA for feature selection has not yet been investigated. This paper proposes a binary version of the Black Hole Algorithm, called BBHA, for solving the feature selection problem in biological data. BBHA extends the existing BHA through appropriate binarization. Moreover, the performance of six well-known decision tree classifiers (Random Forest (RF), Bagging, C5.0, C4.5, Boosted C5.0, and CART) is compared in this study in order to employ the best one as the evaluator of the proposed algorithm. The performance of the proposed algorithm is tested on eight publicly available biological datasets and is compared with Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Simulated Annealing (SA), and Correlation based Feature Selection (CFS) in terms of accuracy, sensitivity, specificity, Matthews’ Correlation Coefficient (MCC), and Area Under the receiver operating characteristic (ROC) Curve (AUC). To verify the applicability and generality of BBHA, it was integrated with a Naive Bayes (NB) classifier and applied to further datasets from the text and image domains. The experimental results confirm that RF performs better than the other decision tree algorithms and that the proposed BBHA wrapper-based feature selection method is superior to BPSO, GA, SA, and CFS on all criteria. BBHA performs significantly better than BPSO and GA in terms of CPU time, the number of parameters needed to configure the model, and the number of selected features. BBHA also has competitive or better performance than the other methods in the literature.

7.
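An instance selection algorithm based on probabilistic neural networks and K-L divergence is proposed. The algorithm uses a probabilistic neural network to estimate the probability distribution of the training instances and uses K-L divergence as a heuristic for instance selection; most of the instances selected in this way lie near the classification boundary. In an experimental comparison with five well-known instance selection algorithms (CNN, ENN, RNN, MCS, and ICF), the algorithm achieved a lower selection ratio and the trained classifiers generalized better, showing that the proposed method is effective.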
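For reference, the K-L divergence between two discrete distributions can be computed as in the sketch below. How the authors turn it into a selection heuristic is not described in the abstract; ranking instances by the divergence of their estimated class posterior from the uniform distribution is only one plausible reading, and `rank_by_uniform_kl` is a hypothetical helper, not the published method.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) = sum_i p_i * log(p_i / q_i) for discrete distributions.
    Terms with p_i == 0 contribute nothing; q_i is floored at eps."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def rank_by_uniform_kl(posteriors):
    """Order instance indices by how close their class posterior is to uniform.
    Assumption: a near-uniform posterior suggests the instance lies near a boundary."""
    def score(p):
        return kl_divergence(p, [1.0 / len(p)] * len(p))
    return sorted(range(len(posteriors)), key=lambda i: score(posteriors[i]))
```

Under this assumed heuristic, instances at the front of the ranking are the ones whose estimated class membership is most ambiguous.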

8.
The process of mutation has been studied extensively in the field of biology, and it has been shown to be one of the major factors that aid the process of evolution. Inspired by this, a novel genetic algorithm (GA) is presented here. Various mutation operators, such as small mutation, gene mutation, and chromosome mutation, are applied in this genetic algorithm. To facilitate the implementation of these mutation operators, a modified way of representing the variables is presented. It resembles the way genetic information is coded in living beings. Using different mutation operators makes it challenging to determine the optimal mutation rate. This problem is overcome by using adaptive mutation operators. The main purpose of this approach is to improve the efficiency of GAs and to find widely distributed Pareto-optimal solutions. The algorithm was tested on benchmark test functions and compared with other GAs. It was observed that the introduction of these mutations does improve genetic algorithms in terms of convergence and the quality of the solutions.

9.
田锦  袁家政  刘宏哲 《计算机应用》2020,40(7):1932-1937
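Lane line detection is an important component of intelligent driving systems. Traditional lane detection methods rely heavily on manually selected features, which is labor-intensive, and their accuracy is low under interference from complex scenes such as object occlusion, illumination changes, and road wear, so designing a robust detection algorithm is very challenging. To overcome these shortcomings, a lane line detection model based on a deep learning instance segmentation method is proposed. The model is based on an improved Mask R-CNN. First, an instance segmentation model segments the road image, improving the ability to detect lane feature information. Then a clustering model extracts the discrete lane line feature points. Finally, an adaptive fitting method is proposed that combines line and polynomial fitting to fit the feature points in different fields of view and generate the optimal lane line parametric equations. Experimental results show that the method improves detection speed, achieves good detection accuracy in different scenes, and robustly extracts lane line information under a variety of complex real-world conditions.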

10.
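Because the clonal selection algorithm has rarely been applied to antenna pattern synthesis, the application of the clonal selection optimization algorithm to antenna pattern synthesis is studied. The concrete steps of an algorithm that encodes only the phases of the weights are described, and on this basis the effect of the number of phase-encoding bits on the optimization result is analyzed. The analysis shows that four-bit encoding of the weight phases already yields a fairly satisfactory optimization result, which verifies the practicality of the algorithm to some extent.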

11.
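Negative selection algorithms train detectors by treating a single self point and its neighboring points as the self region. This work studies the real-valued negative selection algorithm, defines continuous self regions, and uses dynamic clustering to assign self sample points to self regions. During training, local training is performed according to the self region radius and the cosine distance to the self region's center of gravity, and variable-threshold detectors are used within the self region. Experiments show that treating the tolerated self points as a whole provides more information and makes it possible to detect the boundary of the self region, improving system efficiency and detection rate.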
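As background for this entry, the baseline real-valued negative selection scheme can be sketched as follows. This is the generic point-plus-radius formulation, not the clustered self regions or variable-threshold detectors of the paper; the Euclidean distance, the fixed radii, and the function names are assumptions made for illustration.

```python
import math


def train_detectors(self_points, candidates, self_radius):
    """Keep only candidate detectors that lie outside every self hypersphere."""
    return [
        c for c in candidates
        if all(math.dist(c, s) > self_radius for s in self_points)
    ]


def is_nonself(x, detectors, detector_radius):
    """Flag x as non-self if at least one surviving detector covers it."""
    return any(math.dist(x, d) <= detector_radius for d in detectors)
```

In practice the candidates are usually drawn at random from the feature space; they are passed in explicitly here so that the behavior is deterministic.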

12.
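Network Selection Algorithm Based on Ant Colony Optimization   (cited 1 time in total; self-citations: 0; citations by others: 1)
As communication technology keeps developing, more and more wireless communication network standards are being defined. To protect investment and ensure a smooth transition, the various wireless networks will inevitably converge. How a terminal chooses which network to use in an area covered by multiple networks has therefore become a research hotspot. However, the many existing network selection algorithms all suffer from shortcomings: too many parameters take part in the decision, the algorithms are so complex that they consume too much of the terminal's power and processing capacity, network load balancing is not well considered, and the terminal's feedback mechanism is ignored. This paper briefly introduces network selection in heterogeneous converged networks, including the heterogeneous converged network scenario, existing network selection algorithms, and ant colony optimization and its characteristics. On this basis, a new network selection algorithm based on the ant colony model (ANSA) is proposed. The performance of ANSA is simulated and analyzed in Matlab and compared with the TOPSIS algorithm, showing that ANSA achieves better load balancing than existing network selection algorithms and reduces terminal complexity.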

13.
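ICSA: An Improved Clonal Selection Algorithm   (cited 2 times in total; self-citations: 0; citations by others: 2)
To address the shortcomings of the traditional clonal selection algorithm, an improved clonal selection algorithm, ICSA, is proposed. Building on the clonal selection algorithm, ICSA uses the negative selection algorithm to optimize how the initial antibody population for cloning is generated, adds a step that evaluates the nature of the antigen, and introduces a clonal selection dynamics model that simulates the dynamic behavior of antibody proliferation in the biological immune system to guide antibody proliferation in ICSA. To meet the requirements of real-time risk identification in shield tunneling underground engineering, it adopts online, incremental learning, so that it learns, identifies, and updates at the same time. Simulation experiments on standard data sets and shield tunneling data show that ICSA achieves high classification performance in two-class pattern recognition.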

14.
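Traditional negative selection algorithms cannot effectively recognize samples that fall into low-dimensional subspaces, which leads to poor detection performance in high-dimensional spaces. This paper therefore proposes the Subspace-oriented Real Negative Selection Algorithm (SONSA). In addition to training conventional detectors, SONSA searches for low-dimensional subspaces in which the samples are densely distributed and trains subspace-oriented detectors there, improving the algorithm's ability to recognize samples in low-dimensional subspaces. Experimental results on the standard data sets Haberman’s Survival (3 dimensions) and Breast Cancer Wisconsin (9 dimensions) show that, compared with the classic V-Detector algorithm and V-Detector with PCA dimensionality reduction, SONSA significantly raises the detection rate at a similar false alarm rate.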

15.
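A Novel Clonal Selection Algorithm   (cited 1 time in total; self-citations: 0; citations by others: 1)
To address the weak adaptive ability of the clonal selection algorithm, an adaptive clonal selection algorithm based on danger theory is presented. A danger signal operator is designed that treats changes in population concentration as the environmental factor and computes each antibody's danger signal under that factor from the antibody-antigen affinity; the danger signal then adaptively guides the subsequent immune responses such as immune cloning, mutation, and selection. Experimental results show that the algorithm has good adaptive ability and multi-value search ability.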

16.
This article argues that one problem of computing lies in realizing a significant instance given a class or type. Analysis of a case study on digital narrative suggests two general processes for instantiating significant instances: interaction and optimization. The article then explains how the problem of universals needs to be deconstructed when trying to understand what type of entities significant instances are and what the process for obtaining them is.
Kumiko Tanaka-Ishii

17.
翟婷婷  何振峰 《计算机应用》2012,32(11):3034-3037
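The instance selection algorithm INSIGHT has two problems: the class distribution of the selected instances is imbalanced, and the importance of instances with equal scores cannot be distinguished. An improved algorithm is proposed for each. The improved algorithm B INSIGHT1 follows a divide-and-conquer idea: it picks the most representative instances from each class of the training set so that the class distribution of the selected instances is as balanced as possible. The improved algorithm B INSIGHT2 replaces the single-level ranking of B INSIGHT1 with a two-level ranking to measure instance importance more effectively. Experimental results show that, with essentially unchanged time complexity, both proposed algorithms outperform INSIGHT in classification accuracy.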

18.
Web service selection, an important part of web service composition, directly influences the quality of the composite service. Much work has been carried out in recent years to find efficient algorithms for the quality of service (QoS)-aware service selection problem. In this paper, a negative selection immune algorithm (NSA) is proposed; as far as we know, this is the first time that NSA has been applied to the web service selection problem. First, the domain terms and operations of NSA are redefined for the QoS-aware service selection problem. Second, NSA is constructed to demonstrate how the negative selection principle can be used to solve this problem. Third, an analysis of the inconsistency between local exploitation and global planning is presented, through which a local alteration of a composite service scheme can be correctly transferred to the global exploration. It is a general adjusting method and is independent of the algorithm. Finally, extensive experimental results illustrate that NSA, especially NSA with the consistency weights adjusting strategy (NSA+), significantly outperforms particle swarm optimization and the clonal selection algorithm on the QoS-aware service selection problem. The superiority of NSA+ over the others becomes more evident as the number of component tasks and related candidate services increases.

19.
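Feature Selection Algorithm Based on a Quantum Genetic Algorithm   (cited 6 times in total; self-citations: 1; citations by others: 6)
Feature selection is an important and difficult research topic in fields such as pattern recognition and machine learning. An evaluation criterion for the optimal feature subset and a novel quantum genetic algorithm (NQGA) for feature selection are proposed. NQGA adopts a new method for updating the rotation angle of the quantum gate, together with immigration and catastrophe strategies that strengthen the search ability and prevent premature convergence. The efficiency of NQGA is analyzed qualitatively. Tests on typical complex functions and an application to feature selection for radar emitter signals show that NQGA has strong search ability, converges quickly, and effectively prevents premature convergence. Feature selection with the proposed criterion function and search strategy greatly reduces the feature dimension and achieves a higher correct recognition rate.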

20.
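For the gateway selection problem that arises when a mobile ad hoc network (MANET) is interconnected with the Internet, a multi-factor gateway selection algorithm based on the Jelger algorithm is proposed for gateway selection and handover. The algorithm jointly considers the influence of constraints such as hop count, gateway benefit, and communication cost on gateway selection, builds an objective function on this basis, and introduces a gateway selection metric, the gateway usability degree (GUD), to carry out gateway selection and handover. Simulation results show that the algorithm effectively mitigates the frequent gateway handover and load imbalance caused by the Jelger algorithm, reduces the number of gateway handovers, lowers transmission delay and network load, and improves network performance.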
