Similar Documents
 20 similar documents retrieved (search time: 31 ms)
1.
In this brief, we consider an online gradient method with penalty for training feedforward neural networks. Specifically, the penalty is a term proportional to the norm of the weights. Its roles in the method are to control the magnitude of the weights and to improve the generalization performance of the network. By proving that the weights are automatically bounded when the network is trained with the penalty, we simplify the conditions required in the literature for convergence of the online gradient method. A numerical example is given to support the theoretical analysis.
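The weight-decay penalty this abstract describes can be illustrated with a minimal sketch; the function name, the toy quadratic loss, and all constants below are invented for illustration:

```python
def penalized_step(w, g, eta=0.1, lam=0.01):
    # One gradient step on E(w) + (lam/2)*||w||^2: the penalty adds
    # lam*w to the gradient, decaying each weight toward zero and
    # keeping the weight sequence bounded during training.
    return [wi - eta * (gi + lam * wi) for wi, gi in zip(w, g)]

# Toy quadratic loss E(w) = 0.5*||w - t||^2, whose gradient is w - t.
t = [1.0, -2.0]
w = [0.0, 0.0]
for _ in range(500):
    w = penalized_step(w, [wi - ti for wi, ti in zip(w, t)])
# The penalized fixed point is t/(1 + lam), slightly shrunk toward zero.
```

Note that the penalty biases the fixed point away from the unpenalized minimizer; this is the price paid for the boundedness guarantee.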

2.
This paper investigates the split-complex back-propagation algorithm with momentum and penalty for training complex-valued neural networks. Here the momentum term is used to accelerate the convergence of the algorithm and the penalty term is used to control the magnitude of the network weights. Sufficient conditions on the learning rate, the momentum factor, the penalty coefficient, and the activation functions are proposed to establish the theoretical results. We prove that the network weights remain bounded during training, a property usually assumed as a precondition for convergence analysis in the literature. The monotonicity of the error function and the convergence of the algorithm are also guaranteed.

3.
In this paper, the deterministic convergence of an online gradient method with penalty and momentum is investigated for training two-layer feedforward neural networks. The monotonicity of the new error function with the penalty term during the training iteration is first proved. Based on this result, we show that the weights are uniformly bounded during training and that the algorithm is deterministically convergent. Sufficient conditions are also provided for both weak and strong convergence results.

4.
A stochastic single-sample online gradient algorithm with the multiplier method for Pi-sigma neural networks    Cited: 1 (self-citations: 0; other citations: 1)
喻昕  邓飞  唐利霞 《计算机应用研究》2011,28(11):4074-4077
When the gradient algorithm is used to train Pi-sigma neural networks, overly small initial weights lead to very slow convergence. The ordinary penalty-function method can overcome this drawback, but it requires the penalty factor to tend to infinity and its penalty term involves a non-differentiable absolute value, which makes numerical solution difficult. To overcome these shortcomings, a stochastic single-sample online gradient algorithm based on the multiplier method is proposed. Using optimization theory, the constrained problem is converted into an unconstrained one, and the multiplier method is used to minimize the network error function. The convergence rate and stability of the algorithm are analyzed theoretically, and simulation results verify its effectiveness.

5.
Xiong Y  Wu W  Kang X  Zhang C 《Neural computation》2007,19(12):3356-3368
A pi-sigma network is a class of feedforward neural networks with product units in the output layer. An online gradient algorithm is the simplest and most often used training method for feedforward neural networks. A problem arises, however, when the online gradient algorithm is applied to pi-sigma networks: the update increment of the weights may become very small, especially early in training, resulting in very slow convergence. To overcome this difficulty, we introduce an adaptive penalty term into the error function so as to increase the magnitude of the update increment of the weights when it is too small. This strategy brings about faster convergence, as shown by the numerical experiments carried out in this letter.

6.
The traditional gradient algorithm suffers from slow convergence. To address this problem, a gradient algorithm that adds a penalty term to the conventional error function is proposed for training recurrent pi-sigma neural networks. The algorithm not only improves the generalization ability of the network but also overcomes the slow convergence caused by overly small initial weights, converging faster than the gradient algorithm without the penalty term. The convergence of the penalized gradient algorithm is analyzed theoretically, and experiments verify its effectiveness.

7.
In this paper, we study the convergence of an online gradient method with inner-product penalty and adaptive momentum for feedforward neural networks, assuming that the training samples are permuted stochastically in each cycle of iteration. Both two-layer and three-layer neural network models are considered, and two convergence theorems are established. Sufficient conditions are proposed to prove weak and strong convergence results. The algorithm is applied to the classical two-spiral problem and to a Gabor function identification problem to support these theoretical findings.

8.
Generative adversarial networks (GANs) are powerful but suffer from slow convergence, unstable training, and insufficient sample diversity. Combining the strengths of the conditional deep convolutional GAN (CDCGAN) and the Wasserstein GAN with gradient penalty (WGAN-GP), this paper proposes a hybrid model, the conditional gradient-penalty Wasserstein GAN (CDCWGAN-GP). Training the adversarial network with the gradient-penalized Wasserstein distance guarantees training stability and faster convergence, while a condition c is added to guide data generation. In addition, to strengthen the discriminator's feature-extraction ability, a global discriminator and a local discriminator are designed to score samples jointly; the trained discriminator is then extracted for image recognition. Experimental results show that the method effectively improves image-recognition accuracy.
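The gradient-penalty term that stabilises WGAN-GP training can be sketched with a toy one-dimensional critic whose input gradient is known analytically; the critic, the interpolation point, and all constants here are illustrative, not the paper's model:

```python
def critic(x, c=0.3):
    # Toy 1-D critic D(x) = c*x^2; its gradient w.r.t. the input is 2*c*x.
    return c * x * x

def critic_input_grad(x, c=0.3):
    return 2.0 * c * x

def gradient_penalty(x_real, x_fake, lam=10.0, eps=0.5):
    # WGAN-GP penalty lam*(||grad_x D(x_hat)|| - 1)^2, evaluated at a
    # point interpolated between a real and a generated sample; it pushes
    # the critic's gradient norm toward 1, enforcing the Lipschitz
    # constraint of the Wasserstein formulation.
    x_hat = eps * x_real + (1.0 - eps) * x_fake
    grad_norm = abs(critic_input_grad(x_hat))
    return lam * (grad_norm - 1.0) ** 2

gp = gradient_penalty(2.0, 0.0)  # x_hat = 1.0, gradient norm 0.6
```

In a real implementation the input gradient is obtained by automatic differentiation rather than a closed form, but the penalty formula is the same.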

9.
In this paper, we propose an actor-critic neuro-control scheme for a class of continuous-time nonlinear systems under nonlinear abrupt faults, combined with an adaptive fault diagnosis observer (AFDO). Together with its estimation laws, an AFDO scheme that estimates the faults in real time is designed based on Lyapunov analysis. Then, based on the designed AFDO, a fault-tolerant actor-critic control scheme is proposed in which the critic neural network (NN) approximates the value function and the actor NN updates the fault-tolerant policy based on the value function approximated by the critic NN. The weight update laws for the critic NN and actor NN are designed using the gradient descent method. By Lyapunov analysis, we prove the uniform ultimate boundedness (UUB) of all the states, their estimation errors, and the NN weights of the fault-tolerant system under unpredictable faults. Finally, we verify the effectiveness of the proposed method through numerical simulations.

10.
To counter attackers who use generative adversarial network (GAN) techniques to reconstruct training data and leak users' private information, a differentially private Wasserstein GAN with gradient penalty (WGAN-GP) is proposed. During deep-learning training, precisely calibrated Gaussian noise is added to the gradients, and the gradient penalty is used for gradient correction, achieving differential privacy. The gradient-penalized Wasserstein GAN is then used to generate data similar to the original data. Experimental results show that, while preserving data utility, the method effectively protects private information, and the generated data are of good quality.
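The clip-then-add-Gaussian-noise step at the heart of this kind of differentially private training can be sketched as follows; the function name, clip bound, and noise multiplier are illustrative choices, not values from the paper:

```python
import math
import random

def privatize_gradient(g, clip=1.0, sigma=0.1, rng=None):
    # DP-SGD-style gradient sanitization: clip the gradient to L2 norm
    # at most `clip`, then add Gaussian noise whose scale is calibrated
    # to the clip bound, so each sample's influence is bounded.
    rng = rng or random.Random(0)
    norm = math.sqrt(sum(gi * gi for gi in g))
    scale = min(1.0, clip / norm) if norm > 0 else 1.0
    clipped = [gi * scale for gi in g]
    return [gi + rng.gauss(0.0, sigma * clip) for gi in clipped]

noisy = privatize_gradient([3.0, 4.0])  # [3, 4] has norm 5, clipped to 1
```

The clipping bound controls sensitivity, and the noise scale relative to that bound determines the privacy budget spent per step.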

11.
Application of the conjugate gradient method to BP networks    Cited: 7 (self-citations: 0; other citations: 7)
To address the slow convergence and local oscillation of the widely used BP algorithm for feedforward multilayer networks, this paper proposes a conjugate-gradient improvement of BP: weights are updated along conjugate gradient directions, and a probabilistic acceptance rule decides whether a change in the objective value is accepted. A penalty-function method is also given to improve the network's resistance to overfitting. Examples show that, under different initial values, the conjugate gradient method consistently exhibits fast global convergence.
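The conjugate-gradient update the abstract refers to can be sketched on a small quadratic, where conjugate directions find the minimiser in at most n steps; the 2x2 system below is purely illustrative:

```python
def conjugate_gradient(A, b, x, iters=10):
    # Minimizes 0.5*x^T A x - b^T x for symmetric positive definite A
    # (equivalently solves A x = b) using conjugate search directions
    # instead of plain steepest descent.
    def mat_vec(M, v):
        return [sum(m * vi for m, vi in zip(row, v)) for row in M]
    r = [bi - ai for bi, ai in zip(b, mat_vec(A, x))]  # residual = -gradient
    d = r[:]
    for _ in range(iters):
        rr = sum(ri * ri for ri in r)
        if rr < 1e-30:          # converged
            break
        Ad = mat_vec(A, d)
        alpha = rr / sum(di * adi for di, adi in zip(d, Ad))
        x = [xi + alpha * di for xi, di in zip(x, d)]
        r = [ri - alpha * adi for ri, adi in zip(r, Ad)]
        beta = sum(ri * ri for ri in r) / rr
        d = [ri + beta * di for ri, di in zip(r, d)]  # new conjugate direction
    return x

x = conjugate_gradient([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0], [0.0, 0.0])
```

For non-quadratic objectives such as a BP network's error surface, the same direction-update idea is applied with a line search in place of the closed-form step size.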

12.
A gradient descent algorithm suitable for training multilayer feedforward networks of processing units with hard-limiting output functions is presented. The conventional backpropagation algorithm cannot be applied in this case because the required derivatives are not available. However, if the network weights are random variables with smooth distribution functions, the probability of a hard-limiting unit taking one of its two possible values is a continuously differentiable function. In the paper, this fact is used to develop an algorithm similar to backpropagation, but for the hard-limiting case. It is shown that the computational framework of this algorithm is similar to standard backpropagation, but there is an additional computational expense involved in the estimation of gradients. Upper bounds on this estimation penalty are given. Two examples are presented which indicate that, when this algorithm is used to train networks of hard-limiting units, its performance is similar to that of conventional backpropagation applied to networks of units with sigmoidal characteristics.

13.
Adaptive learning rate methods have been successfully applied in many fields, especially in training deep neural networks. Recent results have shown that adaptive methods with exponentially increasing weights on squared past gradients (i.e., ADAM, RMSPROP) may fail to converge to the optimal solution. Though many algorithms, such as AMSGRAD and ADAMNC, have been proposed to fix the non-convergence issues, achieving a data-dependent regret bound similar to or better than ADAGRAD's remains a challenge for these methods. In this paper, we propose a novel adaptive method, the weighted adaptive algorithm (WADA), to tackle the non-convergence issues. Unlike AMSGRAD and ADAMNC, we use a milder weighting strategy on squared past gradients, in which the weights grow linearly. Based on this idea, we propose the weighted adaptive gradient method framework (WAGMF) and implement the WADA algorithm on this framework. Moreover, we prove that WADA can achieve a weighted data-dependent regret bound, which can be better than the original regret bound of ADAGRAD when the gradients decrease rapidly. This bound may partially explain the good performance of ADAM in practice. Finally, extensive experiments demonstrate the effectiveness of WADA and its variants in comparison with several variants of ADAM on training convex problems and deep neural networks.
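The idea of weighting squared past gradients linearly rather than exponentially can be sketched on a scalar problem; this is an illustrative toy in the spirit of the abstract, not the paper's exact WADA update:

```python
def linear_weighted_step(w, g, state, t, eta=0.1):
    # Accumulate squared gradients with weight t at step t (linear
    # growth), a milder schedule than ADAM's exponential weighting, and
    # normalise by the total weight t*(t+1)/2 to get a weighted average.
    state["v"] = state.get("v", 0.0) + t * g * g
    avg = state["v"] / (t * (t + 1) / 2.0)
    return w - (eta / t ** 0.5) * g / (avg ** 0.5 + 1e-8)

# Minimise f(w) = w^2 (gradient 2w) starting from w = 1.0.
w, state = 1.0, {}
for t in range(1, 201):
    w = linear_weighted_step(w, 2.0 * w, state, t)
```

Because recent gradients carry the largest weights, the denominator adapts to the current gradient scale while still retaining some long-term memory, which is the trade-off the linear schedule targets.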

14.
In network intrusion detection, training on class-imbalanced data produces biased classifiers, mainly because the penalty coefficient for misclassifying each training sample is the same. A weighted support vector machine assigns different penalty coefficients to different misclassified samples, which raises classification accuracy on the minority class and overcomes the standard SVM's inability to treat samples flexibly, though at the cost of lower accuracy on the majority class and lower overall accuracy. Experimental results show that applying weighted support vector machines to network intrusion detection is feasible and efficient.
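The class-weighted penalty the abstract describes can be sketched with a weighted hinge-loss subgradient update; the tiny dataset, cost ratio, and step size below are invented for illustration:

```python
def weighted_hinge_step(w, b, x, y, C_pos=5.0, C_neg=1.0, eta=0.01):
    # SVM-style subgradient step in which misclassifying the minority
    # (positive) class costs C_pos/C_neg times more than the majority
    # class, pushing the decision boundary toward the majority side.
    C = C_pos if y > 0 else C_neg
    margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
    if margin < 1.0:  # inside the margin or misclassified: hinge is active
        w = [wi + eta * C * y * xi for wi, xi in zip(w, x)]
        b = b + eta * C * y
    return w, b

# One minority positive sample against three majority negatives.
data = [([2.0, 2.0], 1), ([-1.0, -1.0], -1), ([-2.0, 0.0], -1), ([0.0, -2.0], -1)]
w, b = [0.0, 0.0], 0.0
for _ in range(200):
    for x, y in data:
        w, b = weighted_hinge_step(w, b, x, y)
```

In practice the same effect is obtained by setting per-class penalty coefficients in an SVM solver rather than hand-rolling the update.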

15.
In this paper, it is found that the weights of a perceptron are bounded for all initial weights if there exists a nonempty set of initial weights for which the weights of the perceptron are bounded. Hence, the boundedness condition of the weights of the perceptron is independent of the initial weights. Also, a necessary and sufficient condition for the weights of the perceptron to exhibit limit cycle behavior is derived, and the range of the number of updates required for the weights to reach the limit cycle is estimated. Finally, it is suggested that a perceptron exhibiting limit cycle behavior can be employed to solve a recognition problem when downsampled sets of bounded training feature vectors are linearly separable. Numerical computer simulation results show that the perceptron exhibiting the limit cycle behavior can achieve better recognition performance than a multilayer perceptron.

16.
Liu  Yan  Yang  Dakun  Li  Long  Yang  Jie 《Neural Processing Letters》2019,50(2):1589-1609

To broaden the study of the most popular and general Takagi–Sugeno (TS) system, we propose and develop a complex-valued neuro-fuzzy inference system that realises the zero-order TS system in a complex-valued network architecture. In the complex domain, boundedness and analyticity cannot be achieved together; the splitting strategy therefore computes the gradients of the real-valued error function with respect to the real and imaginary parts of the weight parameters independently. Specifically, the system has four layers: in the Gaussian layer, the L-dimensional complex-valued input features are mapped to a Q-dimensional real-valued space, and in the output layer, complex-valued weights project the result back to the complex domain. Hence, split-complex-valued gradients of the real-valued error function are obtained, forming the split-complex-valued neuro-fuzzy (split-CVNF) learning algorithm based on gradient descent. Another contribution of this paper is that the deterministic convergence of the split-CVNF algorithm is analysed. It is proved that the error function is monotone during the training iteration process and that the sum of gradient norms tends to zero. Under a moderate additional condition, the weight sequence itself is also proved to be convergent.


17.
Another algorithm for improving the convergence rate of BP networks    Cited: 3 (self-citations: 1; other citations: 3)
陈玉芳  雷霖 《计算机仿真》2004,21(11):74-77
Improving the training rate of a BP network is an important way to improve its performance. Building on the error back-propagation (BP) algorithm, this paper proposes a new training algorithm that modifies the traditional momentum method for BP networks, adopting dynamic weight adjustment to reduce training time. Simulation examples of the improved algorithm are provided; the results show its advantage over the traditional BP algorithm on certain problems.

18.
Fast training of multilayer perceptrons    Cited: 5 (self-citations: 0; other citations: 5)
Training a multilayer perceptron by an error backpropagation algorithm is slow and uncertain. This paper describes a new approach that is much faster and more reliable than error backpropagation. The proposed approach is based on combined iterative and direct solution methods: an inverse transformation linearizes the nonlinear output activation functions; direct matrix solution methods train the weights of the output layer; and gradient descent, the delta rule, and other proposed techniques train the weights of the hidden layers. The approach has been implemented and tested on many problems. Experimental results, including training times and recognition accuracy, are given. Generally, the approach achieves accuracy as good as or better than perceptrons trained using error backpropagation, and the training process is much faster than the error backpropagation algorithm while also avoiding local minima and paralysis.

19.
Recurrent neural networks have been successfully used for analysis and prediction of temporal sequences. This paper is concerned with the convergence of a gradient-descent learning algorithm for training a fully recurrent neural network. In the literature, stochastic process theory has been used to establish convergence results of a probabilistic nature for the online gradient training algorithm, based on the assumption that a very large number (in theory, infinitely many) of training samples of the temporal sequences are available. In this paper, we consider the case in which only a limited number of training samples are available, so that the stochastic treatment of the problem is no longer appropriate. Instead, we use an offline gradient training algorithm for the fully recurrent neural network and accordingly prove convergence results of a deterministic nature. The monotonicity of the error function in the iteration is also guaranteed. A numerical example is given to support the theoretical findings.

20.
Long-range random infection is introduced into the classical SIRS model to study epidemic spreading on complex networks: while an infected node infects its neighbors with a given probability, it also randomly selects one non-adjacent node (one with no connecting edge) and infects it with a long-range infection probability. For small-world and scale-free networks, rewiring-probability-dependent and degree-dependent long-range infection probabilities are adopted, respectively, and the mean-field method is used to derive the epidemic threshold and the steady-state infection density of the improved SIRS model on both kinds of networks. Numerical simulations show that, for small-world networks, when the effective spreading rate lies within a certain range, the rewiring probability has a marked effect on the steady-state infection density and the spreading speed; beyond this range, its effect on the steady-state infection density is negligible, whereas … (abstract truncated in the source)
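The mean-field analysis described in this abstract can be illustrated with a minimal well-mixed SIRS integration in which the long-range channel simply adds to the effective transmission rate; all rates below are invented, and the network-dependent probabilities of the paper are not modelled:

```python
def sirs_mean_field(beta, beta_long, gamma, delta, k, T=20.0, dt=0.01, i0=0.01):
    # Euler integration of mean-field SIRS with an extra long-range
    # infection channel: new infections at rate (beta*k + beta_long)*s*i,
    # recovery at gamma*i, loss of immunity at delta*r. The epidemic
    # threshold is beta*k + beta_long > gamma.
    s, i, r = 1.0 - i0, i0, 0.0
    for _ in range(int(T / dt)):
        new_inf = (beta * k + beta_long) * s * i
        s, i, r = (s + dt * (-new_inf + delta * r),
                   i + dt * (new_inf - gamma * i),
                   r + dt * (gamma * i - delta * r))
    return s, i, r

# Above threshold (0.1*4 + 0.05 > 0.2): an endemic steady state persists.
s1, i1, r1 = sirs_mean_field(beta=0.1, beta_long=0.05, gamma=0.2, delta=0.1, k=4)
# Below threshold (0.01*4 < 0.2): the infection dies out.
s2, i2, r2 = sirs_mean_field(beta=0.01, beta_long=0.0, gamma=0.2, delta=0.1, k=4)
```

The paper's degree-dependent and rewiring-dependent probabilities would replace the single constant `beta_long` with network-structure-aware terms.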


Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23
