Similar Documents
20 similar documents found (search time: 453 ms)
1.
This paper investigates an online gradient method with penalty for training feedforward neural networks with linear output. The usual penalty is considered: a term proportional to the norm of the weights. The main contribution is a theoretical proof that the weights remain bounded throughout network training; this boundedness is then used to prove almost sure convergence of the algorithm to the zero set of the gradient of the error function.
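As a concrete illustration of this family of methods, here is a minimal numpy sketch of an online (per-sample) gradient step with an L2-style weight penalty for a two-layer network with linear output; the layer sizes, the constants eta and lam, and the toy data stream are illustrative assumptions, not values from the paper.

import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden = 4, 6
V = rng.normal(0.0, 0.1, (n_hidden, n_in))  # input-to-hidden weights
w = rng.normal(0.0, 0.1, n_hidden)          # hidden-to-output weights (linear output unit)
eta, lam = 0.05, 1e-3                       # learning rate and penalty coefficient (assumed)

def online_step(x, y):
    """One online step on the penalized error E = 0.5*(out-y)^2 + 0.5*lam*(||w||^2 + ||V||^2)."""
    global w, V
    h = np.tanh(V @ x)                      # hidden activations
    err = (w @ h) - y                       # linear output minus target
    grad_w = err * h + lam * w              # the penalty gradient keeps the weights bounded
    grad_V = err * np.outer(w * (1.0 - h**2), x) + lam * V
    w -= eta * grad_w
    V -= eta * grad_V

for _ in range(1000):                       # samples arrive one at a time
    x = rng.normal(size=n_in)
    online_step(x, np.sin(x).sum())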

2.
Xiong Y, Wu W, Kang X, Zhang C. Neural Computation, 2007, 19(12): 3356-3368
A pi-sigma network is a class of feedforward neural networks with product units in the output layer. The online gradient algorithm is the simplest and most frequently used training method for feedforward networks, but when applied to pi-sigma networks the update increment of the weights may become very small, especially early in training, resulting in very slow convergence. To overcome this difficulty, we introduce an adaptive penalty term into the error function so as to enlarge the update increment of the weights whenever it is too small. Numerical experiments in this letter show that this strategy brings about faster convergence.
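A sketch of the difficulty and of one possible remedy, under stated assumptions: in a pi-sigma network the gradient for one summing unit is scaled by the product of the other summing units, so small initial weights yield vanishing increments. The paper's exact adaptive penalty is not reproduced here; the term below, which pushes a summing unit's output toward magnitude 1 only when the plain increment is tiny, is a hypothetical stand-in.

import numpy as np

rng = np.random.default_rng(1)
n_in, n_units = 3, 2
W = rng.normal(0.0, 0.05, (n_units, n_in))   # small initial weights -> tiny early increments
eta, alpha, thresh = 0.1, 0.05, 1e-3         # assumed constants

def step(x, y):
    z = W @ x                                # summing (sigma) units
    err = z.prod() - y                       # product (pi) output unit
    for j in range(n_units):
        others = np.prod(np.delete(z, j))    # d(prod z)/dz_j: product of the other units
        grad = err * others * x              # vanishes when the other z's are near zero
        if np.linalg.norm(grad) < thresh:    # stand-in "adaptive penalty": when the plain
            grad = grad + alpha * z[j] * (z[j] ** 2 - 1.0) * x   # increment is tiny, push |z_j| toward 1
        W[j] -= eta * grad

for _ in range(500):
    x = rng.normal(size=n_in)
    step(x, x.prod())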

3.
In this paper, the deterministic convergence of an online gradient method with penalty and momentum is investigated for training two-layer feedforward neural networks. The monotonicity of the new error function with the penalty term over the training iterations is proved first. Based on this result, we show that the weights are uniformly bounded during training and that the algorithm converges deterministically. Sufficient conditions are also provided for both weak and strong convergence.
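A minimal sketch of a penalized gradient step with classical momentum, assuming a toy quadratic error in place of the network error function; eta, mu, and lam are illustrative constants, and the flattened-weight view is a simplification.

import numpy as np

rng = np.random.default_rng(2)
w = rng.normal(0.0, 0.1, 10)       # all weights of a two-layer net, flattened (toy view)
v = np.zeros_like(w)               # momentum buffer
eta, mu, lam = 0.05, 0.9, 1e-3     # learning rate, momentum factor, penalty coefficient

def grad_E(w):                     # placeholder for the network's error gradient;
    return w - 1.0                 # here a toy quadratic error E(w) = 0.5*||w - 1||^2

for _ in range(200):
    g = grad_E(w) + lam * w        # gradient of the penalized error function
    v = mu * v - eta * g           # momentum: a decaying accumulation of past steps
    w = w + v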

4.
In this paper, we study the convergence of an online gradient method with an inner-product penalty and adaptive momentum for feedforward neural networks, assuming that the training samples are permuted stochastically in each training cycle. Both two-layer and three-layer network models are considered, and two convergence theorems are established. Sufficient conditions are proposed for both weak and strong convergence. The algorithm is applied to the classical two-spiral problem and to a Gabor function identification problem to support the theoretical findings.

5.
A stochastic single-sample online gradient algorithm with the multiplier method for Pi-sigma neural networks
Yu Xin, Deng Fei, Tang Lixia. Application Research of Computers, 2011, 28(11): 4074-4077
When gradient algorithms are used to train Pi-sigma neural networks, too-small initial weights lead to very slow convergence. Ordinary penalty-function methods can overcome this drawback, but they require the penalty factor to tend to infinity and involve a non-differentiable absolute-value penalty term, which makes numerical solution difficult. To overcome these shortcomings, a stochastic single-sample online gradient algorithm based on the multiplier method is proposed: using techniques from optimization theory, the constrained problem is converted into an unconstrained one, and the multiplier method is used to minimize the network error function. The convergence rate and stability of the algorithm are analyzed theoretically, and simulation results verify its effectiveness.
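A minimal sketch of the multiplier (augmented Lagrangian) idea the abstract relies on, with toy stand-ins for the error function and constraint: the constrained problem becomes an unconstrained minimization of L(w) = E(w) + lam*g(w) + (c/2)*g(w)^2, and the multiplier update drives g(w) toward 0 with a finite, differentiable penalty.

import numpy as np

def E(w):
    return 0.5 * np.sum((w - 2.0) ** 2)    # toy error function (stand-in)

def g(w):
    return np.sum(w) - 1.0                 # toy equality constraint g(w) = 0 (stand-in)

w = np.zeros(3)
lam, c, eta = 0.0, 10.0, 0.05              # multiplier, FINITE penalty factor, step size

for outer in range(20):
    for _ in range(200):                   # minimize the augmented Lagrangian in w
        grad = (w - 2.0) + (lam + c * g(w)) * np.ones_like(w)
        w -= eta * grad
    lam += c * g(w)                        # multiplier update: c need not tend to infinity

print(w, g(w))                             # g(w) -> 0 without an unbounded penalty factor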

6.
This paper investigates the split-complex back-propagation algorithm with momentum and penalty for training complex-valued neural networks. The momentum term is used to accelerate convergence, and the penalty term is used to control the magnitude of the network weights. Sufficient conditions on the learning rate, the momentum factor, the penalty coefficient, and the activation functions are proposed to establish the theoretical results. We prove the boundedness of the network weights during training, which is usually taken as a precondition for convergence analysis in the literature; the monotonicity of the error function and the convergence of the algorithm are also guaranteed.

7.
To counter attackers who use generative adversarial network (GAN) techniques to reconstruct data from the training set and thereby leak users' private information, a differentially private Wasserstein GAN with gradient penalty (WGAN-GP) is proposed. During deep-learning training, precisely calibrated Gaussian noise is added to the gradients and a gradient penalty is applied for gradient correction, thereby achieving differential privacy; the gradient-penalty Wasserstein GAN is then used to generate data similar to the original data. Experimental results show that, while preserving data utility, the method effectively protects the privacy of the data, and the generated data are of good quality.
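A hedged sketch of the noise-adding step such a method typically builds on (DP-SGD-style clipping plus Gaussian noise); clip_c and sigma are illustrative, and the paper's exact noise calibration is not reproduced.

import numpy as np

rng = np.random.default_rng(3)

def dp_gradient(per_sample_grads, clip_c=1.0, sigma=1.1):
    """Differentially private batch gradient: clip each per-sample gradient to
    L2 norm clip_c (bounding sensitivity), average, then add Gaussian noise
    calibrated to clip_c."""
    clipped = [g * min(1.0, clip_c / max(np.linalg.norm(g), 1e-12))
               for g in per_sample_grads]
    avg = np.mean(clipped, axis=0)
    noise = rng.normal(0.0, sigma * clip_c / len(per_sample_grads), size=avg.shape)
    return avg + noise

batch = [rng.normal(size=5) for _ in range(32)]   # stand-in per-sample gradients
print(dp_gradient(batch))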

8.
Generative adversarial networks (GANs) are powerful but suffer from slow convergence, unstable training, and insufficient sample diversity. Combining the advantages of the conditional deep convolutional GAN (CDCGAN) and the Wasserstein GAN with gradient penalty (WGAN-GP), this paper proposes a hybrid model, the conditional gradient-penalty Wasserstein GAN (CDCWGAN-GP): training the adversarial network with the gradient-penalized Wasserstein distance guarantees training stability and faster convergence, while a condition c is added to guide data generation. In addition, to strengthen the discriminator's feature-extraction ability, a global discriminator and a local discriminator are designed to score jointly; finally, the discriminator is extracted for image recognition. Experimental results show that the method effectively improves image-recognition accuracy.
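A sketch of the gradient-penalty term itself, using a toy linear critic so that the input gradient is available in closed form; a real implementation would compute dD/dx by automatic differentiation, and lam = 10 is the conventional choice, assumed here.

import numpy as np

rng = np.random.default_rng(4)
w = rng.normal(size=8)                      # toy linear critic D(x) = w @ x

def gradient_penalty(x_real, x_fake, lam=10.0):
    """WGAN-GP term: evaluate the critic's input gradient at random interpolates
    between real and fake samples and penalize deviations of its norm from 1.
    For the toy linear critic dD/dx = w everywhere."""
    eps = rng.uniform(size=(x_real.shape[0], 1))
    x_hat = eps * x_real + (1.0 - eps) * x_fake          # random interpolates
    grads = np.tile(w, (x_hat.shape[0], 1))              # dD/dx at each x_hat (constant here)
    norms = np.linalg.norm(grads, axis=1)
    return lam * np.mean((norms - 1.0) ** 2)

x_real = rng.normal(size=(16, 8))
x_fake = rng.normal(2.0, 1.0, size=(16, 8))
print(gradient_penalty(x_real, x_fake))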

9.
In this paper, we introduce a smoothed piecewise linear network (SPLN) and develop second-order training algorithms for it. An embedded feature selection algorithm is developed that minimizes training error with respect to the distance-measure weights. A method is then presented that adjusts center-vector locations in the SPLN, and a gradient method is given for optimizing the SPLN output weights. Results on several data sets show that distance-measure optimization, center-vector optimization, and output-weight optimization, individually and together, reduce testing error in the final network.

10.
Differential Evolution Training Algorithm for Feed-Forward Neural Networks
An evolutionary optimization method over continuous search spaces, differential evolution, has recently been successfully applied to real-world and artificial optimization problems and has also been proposed for neural network training. However, differential evolution has not been comprehensively studied in the context of training neural network weights, i.e., how useful it is in finding the global optimum at the expense of convergence speed. In this study, differential evolution is analyzed as a candidate global optimization method for feed-forward neural networks. Compared to gradient-based methods, differential evolution does not appear to provide any distinct advantage in terms of learning rate or solution quality. It can instead be used to validate reached optima and to develop regularization terms and non-conventional transfer functions that do not necessarily provide gradient information.
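For reference, a compact numpy sketch of the classic DE/rand/1/bin scheme applied to a flattened weight vector; the population size, F, CR, and the stand-in loss are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(5)

def loss(w):                                  # stand-in for the network training error
    return np.sum((w - 0.5) ** 2)

NP, dim, F, CR = 20, 10, 0.8, 0.9             # population size, weight count, DE constants
pop = rng.uniform(-1.0, 1.0, (NP, dim))       # each row is one candidate weight vector
fit = np.array([loss(p) for p in pop])

for gen in range(200):
    for i in range(NP):
        idx = rng.choice([j for j in range(NP) if j != i], size=3, replace=False)
        a, b, c = pop[idx]
        mutant = a + F * (b - c)              # DE/rand/1 mutation
        mask = rng.uniform(size=dim) < CR     # binomial crossover
        mask[rng.integers(dim)] = True        # guarantee at least one mutant gene
        trial = np.where(mask, mutant, pop[i])
        f = loss(trial)
        if f < fit[i]:                        # greedy selection
            pop[i], fit[i] = trial, f

print(fit.min())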

11.
How to efficiently train recurrent networks remains a challenging and active research topic. Most proposed training approaches are based on computational ways to efficiently obtain the gradient of the error function and can be grouped into five major classes. In this study we present a derivation that unifies these approaches, demonstrating that they are merely five different ways of solving a particular matrix equation. The second goal of this paper is to develop a new algorithm based on the insights gained from this formulation. The new algorithm, which approximates the error gradient, has lower computational complexity per weight update than competing techniques for most typical problems, and reaches the error minimum in far fewer iterations. A desirable characteristic of recurrent-network training algorithms is the ability to update the weights online; we have also developed an online version of the proposed algorithm that updates the error-gradient approximation recursively.

12.
Traditional gradient algorithms suffer from slow convergence. To address this, a gradient algorithm that adds a penalty term to the conventional error function is proposed for training recurrent pi-sigma neural networks. The algorithm not only improves the generalization ability of the network but also overcomes the slow convergence caused by too-small initial weights, converging faster than the gradient algorithm without the penalty term. The convergence of the penalized gradient algorithm is analyzed theoretically, and experiments verify its effectiveness.

13.
Two structure-optimization algorithms for neural networks
A new method is proposed that combines a "pruning algorithm" based on weight quasi-entropy with weight sensitivity: the weight quasi-entropy is added to the objective function as a penalty term, so that a multilayer feedforward network automatically constrains its weight distribution during learning, and weight sensitivity is used as the pruning criterion, avoiding the randomness of pruning purely by weight magnitude. In addition, to address the heavy computational cost and low efficiency of pruning when optimizing multi-input multi-output networks, a fast "constructive algorithm" based on cascade-correlation (CC) is proposed that builds the network starting from an appropriate initial structure. Simulation results show that the fast constructive algorithm is superior in convergence speed, running efficiency, and even generalization performance.

14.
Huang Degen, Zhang Yunxia, Lin Hongmei, Zou Li, Liu Zhuang. Journal of Software, 2020, 31(4): 1063-1078
To alleviate the poor interpretability caused by the "black box" nature of neural networks, a rule-inference network model is proposed based on the belief-rule-base inference methodology using the evidential reasoning approach (RIMER). The model improves network interpretability through the belief rules and inference mechanism of RIMER. First, the inference function based on evidential reasoning is proved to be partially differentiable, guaranteeing the feasibility of the algorithm. Then the network architecture and learning algorithm of the rule-inference network are given, with the RIMER inference process serving as the network's feedforward pass so that interpretability is preserved; gradient descent is used to adjust the parameters of the rule base to build a more reasonable belief rule base, and the notion of a "pseudo-gradient" is introduced to reduce the learning complexity. Finally, comparative classification experiments analyze the advantages of the proposed algorithm in accuracy and interpretability. The results show that the rule-inference network performs well when the training set is small and still achieves satisfactory results as the training data grow.

15.
In this paper, a hybrid method is proposed to control a nonlinear dynamic system using a feedforward neural network. The learning procedure applies different learning algorithms to different layers: the weights connecting the input and hidden layers are first adjusted by a self-organized learning procedure, whereas the weights between the hidden and output layers are trained by a supervised learning algorithm, such as gradient descent. A comparison with backpropagation (BP) shows that the new algorithm can considerably reduce network training time.
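A minimal sketch of such a hybrid scheme under stated assumptions: plain k-means stands in for the self-organized stage that fixes the input-to-hidden parameters (here, Gaussian centers), and gradient descent then trains only the hidden-to-output weights.

import numpy as np

rng = np.random.default_rng(6)
X = rng.normal(size=(200, 2))
y = np.sin(X[:, 0]) + X[:, 1] ** 2            # toy regression target

# Stage 1: self-organized layer -- k-means as a stand-in for the paper's
# unsupervised procedure that sets the input-to-hidden parameters.
k = 10
centers = X[rng.choice(len(X), k, replace=False)].copy()
for _ in range(20):
    assign = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
    for j in range(k):
        if np.any(assign == j):
            centers[j] = X[assign == j].mean(axis=0)

# Stage 2: supervised layer -- gradient descent on the output weights only.
H = np.exp(-((X[:, None, :] - centers[None]) ** 2).sum(-1))   # Gaussian hidden layer
w, eta = np.zeros(k), 0.01
for _ in range(500):
    err = H @ w - y
    w -= eta * (H.T @ err) / len(X)           # only hidden-to-output weights are trained

print(np.mean((H @ w - y) ** 2))              # training MSE of the hybrid network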

16.
Reinforcement learning is an important approach to adaptive problems and is widely used for learning control in continuous state spaces, but it suffers from low efficiency and slow convergence. Building on a back-propagation (BP) neural network and combining it with eligibility traces, an algorithm is proposed that realizes multi-step updates in the reinforcement-learning process. It solves the problem of back-propagating the output layer's local gradients to the hidden-layer nodes, enabling fast updates of the hidden-layer weights, and an algorithmic description is provided. An improved residual method is also proposed that applies optimized linear weighting to the layer weights during network training, obtaining both the learning speed of gradient descent and the convergence of the residual-gradient method; applied to the hidden-layer weight updates, it improves the convergence of the value function. The algorithm is validated and analyzed in a simulation of an inverted-pendulum balancing system: after a short period of learning, the method successfully controls the inverted pendulum and significantly improves learning efficiency.
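A sketch of the eligibility-trace mechanism that enables multi-step updates, using a linear value approximator as a stand-in for the BP network; the featurization, reward, and constants are toy assumptions, and the paper's residual-weighting scheme is not reproduced.

import numpy as np

rng = np.random.default_rng(7)
n_feat = 6
w = np.zeros(n_feat)                           # value-function weights (linear stand-in)
alpha, gamma, lam = 0.1, 0.99, 0.8             # step size, discount, trace decay

def features(s):                               # toy state featurization
    return np.cos(np.arange(n_feat) * s)

e = np.zeros(n_feat)                           # eligibility trace
s = 0.0
for t in range(1000):
    s_next = s + rng.normal(0.0, 0.1)
    r = -abs(s_next)                           # toy reward: stay near zero
    phi, phi_next = features(s), features(s_next)
    delta = r + gamma * (w @ phi_next) - w @ phi   # TD error
    e = gamma * lam * e + phi                  # trace spreads credit over past steps,
    w += alpha * delta * e                     # so one TD error updates many of them
    s = s_next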

17.
Neural-network feature selector
Feature selection is an integral part of most learning algorithms. Because data often contain irrelevant and redundant attributes, selecting only the relevant attributes can be expected to yield higher predictive accuracy from a machine learning method. In this paper, we propose the use of a three-layer feedforward neural network to select those input attributes that are most useful for discriminating classes in a given set of input patterns. A network pruning algorithm is the foundation of the proposal: by adding a penalty term to the network's error function, redundant connections can be distinguished from relevant ones by their small weights once training has completed. A simple criterion for removing an attribute, based on the accuracy rate of the network, is developed; the network is retrained after each removal, and the selection process repeats until no attribute meets the criterion. Our experimental results suggest that the proposed method works very well on a wide variety of classification problems.
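A sketch of the removal loop described above; train_eval is a hypothetical placeholder for retraining the penalty-regularized network and reporting its accuracy, and tol is an assumed threshold for the accuracy-based removal criterion.

import numpy as np

def select_features(X, y, train_eval, tol=0.01):
    """Backward attribute removal in the spirit of the abstract. train_eval(X, y)
    is a hypothetical user-supplied routine that retrains the network on the
    given columns and returns accuracy."""
    keep = list(range(X.shape[1]))
    base = train_eval(X[:, keep], y)
    removed = True
    while removed and len(keep) > 1:
        removed = False
        for j in list(keep):
            trial = [c for c in keep if c != j]
            acc = train_eval(X[:, trial], y)   # retrain after removing attribute j
            if acc >= base - tol:              # attribute meets the removal criterion
                keep, base, removed = trial, acc, True
                break
    return keep

# hypothetical usage with a dummy scorer that ignores its inputs
X = np.random.default_rng(8).normal(size=(50, 5))
print(select_features(X, None, lambda Xs, ys: 0.9))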

18.
In the above paper by Ergezinger and Thomsen (ibid. vol.6 (1991)), a new method for training multilayer perceptrons, called optimization layer by layer (OLL), was introduced. The present paper analyzes the performance of OLL. We show, from theoretical considerations, that the amount of work required by OLL learning scales as the third power of the network size, compared with the square of the network size for commonly used conjugate gradient (CG) training algorithms. This theoretical estimate is confirmed by a practical example. Thus, although OLL functions very well for small neural networks (fewer than about 500 weights per layer), it is slower than CG for large networks. We then show that OLL does not always improve on the accuracy obtainable with CG; the final accuracy appears to depend strongly on the initial network weights.

19.
A gradient descent algorithm suitable for training multilayer feedforward networks of processing units with hard-limiting output functions is presented. The conventional backpropagation algorithm cannot be applied in this case because the required derivatives are not available. However, if the network weights are random variables with smooth distribution functions, the probability of a hard-limiting unit taking one of its two possible values is a continuously differentiable function. In this paper, that observation is used to develop an algorithm similar to backpropagation, but for the hard-limiting case. The computational framework of this algorithm is shown to resemble standard backpropagation, with an additional computational expense for estimating gradients; upper bounds on this estimation penalty are given. Two examples are presented which indicate that, when this algorithm is used to train networks of hard-limiting units, its performance is similar to that of conventional backpropagation applied to networks of units with sigmoidal characteristics.
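A sketch of the key observation under stated assumptions: if the weights of a hard-limiting unit are Gaussian with mean m and scale s, the firing probability is a smooth function of m and can be differentiated in place of the step function. The single-unit setting and the value of s are illustrative.

import numpy as np
from math import erf, sqrt, exp, pi

def Phi(t):
    return 0.5 * (1.0 + erf(t / sqrt(2.0)))    # standard Gaussian CDF

def phi(t):
    return exp(-t * t / 2.0) / sqrt(2.0 * pi)  # standard Gaussian pdf

def fire_prob_and_grad(m, x, s=0.5):
    """Hard-limiting unit y = step(w . x) with random weights w ~ N(m, s^2 I):
    P(y = 1 | x) = Phi(m.x / (s*||x||)) is smooth in the mean m, so it has a
    gradient even though step() does not."""
    nx = np.linalg.norm(x)
    t = (m @ x) / (s * nx)
    return Phi(t), phi(t) * x / (s * nx)

m = np.array([0.2, -0.1, 0.4])
x = np.array([1.0, 2.0, -1.0])
p, g = fire_prob_and_grad(m, x)
print(p, g)     # firing probability and a usable training gradient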

20.
Online tool wear monitoring based on an auto-associative neural network
This paper presents a new tool wear monitoring method based on an auto-associative neural network. The main advantage of the model is that it can be built using only data collected under normal cutting conditions, so training samples for each tool wear status are no longer needed, which makes the method easier to apply in a real industrial environment than other neural network models. An averaged distance indicator is proposed that signals not only the occurrence of tool wear but also its severity. Moreover, the Levenberg-Marquardt (LM) training algorithm is introduced to improve the convergence accuracy of the auto-associative network. Based on the proposed method, a framework for online tool condition monitoring is illustrated, and cutting-force data under different tool wear statuses are collected to simulate the online modeling and monitoring process for rough and finish milling, respectively. The results show that the proposed indicator correctly reflects the evolution of tool wear and that the LM algorithm is more accurate than gradient descent methods, casting new light on the practical application of neural networks to online tool condition monitoring.
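A sketch of the averaged distance indicator, with a hypothetical placeholder for the trained auto-associative network (any callable x -> reconstructed x); the toy data below only illustrate that the indicator grows as samples drift away from the normal-condition distribution.

import numpy as np

def wear_indicator(model, X_window):
    """Averaged distance indicator in the spirit of the abstract: mean
    reconstruction distance of recent cutting-force samples under an
    auto-associative model trained only on normal-condition data."""
    d = [np.linalg.norm(x - model(x)) for x in X_window]
    return float(np.mean(d))          # grows with both onset and severity of wear

# toy usage: a crude stand-in reconstructor that maps everything to the
# normal-condition mean, so abnormal samples reconstruct poorly
rng = np.random.default_rng(9)
normal_mean = rng.normal(size=4)
model = lambda x: normal_mean
healthy = [normal_mean + rng.normal(0.0, 0.1, 4) for _ in range(30)]
worn = [normal_mean + rng.normal(1.0, 0.3, 4) for _ in range(30)]
print(wear_indicator(model, healthy), wear_indicator(model, worn))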
