Similar Literature
1.
《Neurocomputing》1999,24(1-3):173-189
The real-time recurrent learning (RTRL) algorithm, originally proposed for training recurrent neural networks, requires a large number of iterations to converge because a small learning rate must be used. The obvious remedy, a larger learning rate, can produce undesirable convergence characteristics. This paper improves the convergence capability and characteristics of RTRL by incorporating conjugate gradient computation into its learning procedure. The resulting algorithm, referred to as the conjugate gradient recurrent learning (CGRL) algorithm, is applied to train fully connected recurrent neural networks to simulate a second-order low-pass filter and to predict the chaotic intensity pulsations of an NH3 laser. Results show that CGRL converges substantially faster (in terms of the reduction in mean squared error per epoch) than the RTRL and batch-mode RTRL algorithms.
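For orientation, here is a minimal sketch of the conjugate-gradient step that an algorithm like CGRL couples with RTRL-derived gradients. It uses the Polak-Ribière formula with restarts; the gradient itself (which RTRL would supply) is assumed given, and all names are illustrative rather than taken from the paper.

```python
import numpy as np

def cg_step(w, grad, prev_grad, prev_dir, lr=0.01):
    """One Polak-Ribiere conjugate-gradient update of a flat weight vector.

    `grad` stands in for an RTRL-computed gradient; a CGRL-style algorithm
    embeds a step like this inside the recurrent learning loop.
    """
    if prev_dir is None:
        direction = -grad                              # first step: steepest descent
    else:
        beta = grad @ (grad - prev_grad) / (prev_grad @ prev_grad + 1e-12)
        direction = -grad + max(beta, 0.0) * prev_dir  # restart when beta < 0
    return w + lr * direction, direction
```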

2.
Neural network optimization is a fundamental frontier topic in machine learning. Compared with purely gradient-based training algorithms, gradient-free algorithms show greater promise in overcoming slow convergence, susceptibility to local optima, and the inability to handle non-differentiable objectives. Building on an analysis of the strengths and weaknesses of gradient-based methods, this paper surveys a selection of gradient-free optimization methods, covering feedforward neural network optimization and random-search optimization. It analyzes the advantages, drawbacks, and applications of gradient-free methods in terms of basic theory, the steps for training neural networks, and convergence, then summarizes the theoretical and practical challenges facing gradient-free training algorithms and outlines directions for future work.

3.
A clustering network is applied to unsupervised image segmentation. A dynamic adjustment mechanism for the competitive-layer neurons and a revisiting, non-repetitive training scheme are proposed, which allow the number of clusters to grow adaptively, eliminate the dead-neuron problem caused by randomly initialized weight matrices, and improve the convergence of the algorithm. Experimental results show that the improved clustering network segments images better than the C-means clustering algorithm and conventional clustering networks.
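A hedged sketch of the general idea of dynamic competitive-layer growth described above (an ART-style mechanism, not necessarily the paper's exact rule): if no existing prototype is close enough to a sample, a new cluster neuron is created, so dead neurons from random initialization never arise.

```python
import numpy as np

def competitive_step(x, W, vigilance=0.5, lr=0.1):
    """One step of a growing competitive layer; W holds one prototype per row.

    Returns the (possibly grown) prototype matrix and the winning index.
    """
    if len(W) == 0:
        return np.array([x]), 0              # seed the first cluster
    dists = np.linalg.norm(W - x, axis=1)
    j = int(dists.argmin())
    if dists[j] > vigilance:                 # no neuron close enough:
        return np.vstack([W, x]), len(W)     # grow the layer adaptively
    W[j] += lr * (x - W[j])                  # winner moves toward the sample
    return W, j
```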

4.
Training recurrent neural networks (RNNs) is computationally demanding because of the gradient evaluations it requires; achieving fast convergence at low computational cost remains a challenging open topic. The transient response of the learning process is also a critical issue, especially for online applications. Conventional RNN training algorithms such as backpropagation through time and real-time recurrent learning have not adequately met these requirements because they often converge slowly; if a large learning rate is chosen to compensate, training may become unstable, with diverging weights. This paper develops a novel RNN training algorithm, robust recurrent simultaneous perturbation stochastic approximation (RRSPSA), with a specially designed recurrent hybrid adaptive parameter and adaptive learning rates. RRSPSA is a powerful twin-engine simultaneous perturbation stochastic approximation (SPSA) type of training algorithm: it uses three specially designed adaptive parameters to maximize training speed for a recurrent training signal while retaining weight-convergence properties, and it needs only two objective-function measurements per step, as in the original SPSA algorithm. RRSPSA is proved to guarantee weight convergence and system stability in the sense of a Lyapunov function. Computer simulations demonstrate the applicability of the theoretical results.
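For reference, a minimal plain SPSA step, the two-measurement scheme that RRSPSA builds on; the paper's recurrent hybrid adaptive parameters and adaptive learning rates are not reproduced here.

```python
import numpy as np

def spsa_step(w, loss, k, a=0.1, c=0.1, alpha=0.602, gamma=0.101):
    """One SPSA update: the whole gradient is estimated from just two
    loss evaluations along a random simultaneous perturbation."""
    ak = a / (k + 1) ** alpha                            # decaying step size
    ck = c / (k + 1) ** gamma                            # decaying perturbation size
    delta = np.random.choice([-1.0, 1.0], size=w.shape)  # Rademacher directions
    g_hat = (loss(w + ck * delta) - loss(w - ck * delta)) / (2 * ck * delta)
    return w - ak * g_hat
```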

5.
Real-time algorithms for gradient-descent supervised learning in recurrent dynamical neural networks fail to support scalable VLSI implementation because their complexity grows sharply with the network dimension. We present an alternative implementation in analog VLSI, which employs a stochastic perturbation algorithm to observe the gradient of the error index directly on the network, in random directions of the parameter space, thereby avoiding the tedious task of deriving the gradient from an explicit model of the network dynamics. The network contains six fully recurrent neurons with continuous-time dynamics, providing 42 free parameters comprising connection strengths and thresholds. The chip implementing the network includes local provisions supporting both the learning and the storage of the parameters, integrated in a scalable architecture that can readily be expanded for learning recurrent dynamical networks of larger dimensionality. We describe and characterize the functional elements of the implemented recurrent network and integrated learning system, and include experimental results obtained from training the network to represent a quadrature-phase oscillator.

6.
Simultaneous perturbation stochastic approximation (SPSA) belongs to the class of gradient-free optimization methods that extract gradient information from successive objective-function evaluations. This paper describes an improved SPSA algorithm, which adds fuzzy adaptive gain sequences, gradient smoothing, and a step-rejection procedure to enhance convergence and stability. The proposed fuzzy adaptive simultaneous perturbation stochastic approximation (FASPSA) algorithm is particularly well suited to problems involving a large number of parameters, such as those encountered in nonlinear system identification using neural networks (NNs). Accordingly, a multilayer perceptron (MLP) network with popular training algorithms was used to predict the system response. We found that an MLP trained by FASPSA had accuracy comparable to results obtained by traditional system-identification algorithms. Simulation results for typical nonlinear systems demonstrate that the proposed NN architecture trained with FASPSA yields improved system identification, as measured by reduced convergence time and a smaller identification error.

7.
Decision feedback recurrent neural equalization with fast convergence rate
Real-time recurrent learning (RTRL), commonly employed for training a fully connected recurrent neural network (RNN), has the drawback of a slow convergence rate. Because of this deficiency, a decision feedback recurrent neural equalizer (DFRNE) trained with RTRL requires long training sequences to achieve good performance. In this paper, extended Kalman filter (EKF) algorithms based on RTRL are presented for the DFRNE in a state-space formulation of the system, in particular for complex-valued signal processing. The main features of the global EKF and decoupled EKF algorithms are fast convergence and good tracking performance. Through nonlinear channel equalization, the performance of the DFRNE with the EKF algorithms is evaluated and compared with that of the DFRNE with RTRL.
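A simplified sketch of the global-EKF weight update underlying this kind of training, for a real-valued scalar-output network (the paper works in a complex-valued state-space formulation); `model` and `jacobian` are hypothetical stand-ins for the network and its derivative with respect to the weights.

```python
import numpy as np

def ekf_weight_update(w, P, x, d, model, jacobian, q=1e-4, r=1e-2):
    """Treat the weights as the EKF state and the network output as the
    nonlinear measurement; one update per training pair (x, d)."""
    H = jacobian(w, x).reshape(1, -1)          # d(output)/d(weights)
    e = d - model(w, x)                        # innovation (scalar)
    S = float(H @ P @ H.T) + r                 # innovation variance
    K = (P @ H.T) / S                          # Kalman gain, shape (n, 1)
    w = w + (K * e).ravel()                    # weight update
    P = P - K @ (H @ P) + q * np.eye(len(w))   # covariance update + process noise
    return w, P
```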

8.
In this paper, a constrained optimization technique is explored for a substantial problem: accelerating the training of globally recurrent neural networks. Unlike most previous methods, which target feedforward neural networks, the authors adopt the constrained optimization technique to improve the gradient-based algorithm of the globally recurrent neural network, adapting the learning rate during training. Using the recurrent network with the improved algorithm, experiments were performed on two real-world problems, namely filtering additive noise in acoustic data and classifying temporal signals for speaker identification. The experimental results show that the recurrent neural network with the improved learning algorithm trains significantly faster and achieves satisfactory performance.

9.
A nonlinear dynamic model is developed for a process system, namely a heat exchanger, using the recurrent multilayer perceptron network as the underlying model structure. The recurrent multilayer perceptron is a dynamic neural network that appears effective for input-output modeling of complex process systems. Dynamic gradient descent learning is used to train it, yielding an order-of-magnitude improvement in convergence speed over a static learning algorithm used to train the same network. In developing the empirical process model, the effects of actuator, process, and sensor noise on the training and testing sets are investigated. Learning and prediction both appear very effective, despite the presence of noise in the training and testing sets, respectively. The recurrent multilayer perceptron appears to learn the deterministic part of a stochastic training set, and it predicts approximately a moving-average response of the various testing sets. Extensive model-validation studies with signals encountered in the operation of the modeled process system, that is, steps and ramps, indicate that the empirical model generalizes well across operational transients, including accurate prediction of instabilities not in the training set. However, the accuracy of the model beyond these operational transients has not been investigated. Furthermore, online learning is necessary during some transients and for tracking slowly varying process dynamics. In some cases, neural-network-based empirical models appear to provide a serious alternative to first-principles models.

10.
In this paper, we study the convergence of an online gradient method with an inner-product penalty and adaptive momentum for feedforward neural networks, assuming that the training samples are permuted stochastically in each cycle of iteration. Both two-layer and three-layer network models are considered, and two convergence theorems are established, with sufficient conditions proposed for weak and strong convergence. The algorithm is applied to the classical two-spiral problem and to the identification of a Gabor function to support these theoretical findings.
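A minimal sketch of an online gradient step with a penalty term and momentum. As an assumption for illustration, the common L2 penalty replaces the paper's inner-product penalty, and a fixed momentum coefficient replaces its adaptive schedule.

```python
import numpy as np

def penalized_momentum_step(w, grad, v, lr=0.05, lam=1e-4, mu=0.9):
    """One online update: gradient of (loss + lam * ||w||^2 / 2) plus momentum.

    NOTE: L2 penalty and constant mu are simplifying assumptions, not the
    paper's exact inner-product penalty or adaptive momentum.
    """
    g = grad + lam * w        # penalty term pulls weights toward zero
    v = mu * v - lr * g       # accumulate momentum
    return w + v, v
```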

11.
This paper examines the inductive inference of a complex grammar with neural networks. Specifically, the task considered is training a network to classify natural-language sentences as grammatical or ungrammatical, thereby exhibiting the same kind of discriminatory power provided by the Principles and Parameters linguistic framework, or Government-and-Binding theory. Neural networks are trained, without the division into learned vs. innate components assumed by Chomsky (1956), in an attempt to produce the same judgments as native speakers on sharply grammatical/ungrammatical data. How a recurrent neural network could possess linguistic capability is discussed, along with the properties of various common recurrent neural network architectures. The problem exhibits training behavior often absent with smaller grammars, and training was initially difficult; however, after several techniques aimed at improving the convergence of the gradient-descent backpropagation-through-time training algorithm were implemented, significant learning was possible. Certain architectures proved better able to learn an appropriate grammar. The operation of the networks and their training is analyzed. Finally, the extraction of rules in the form of deterministic finite-state automata is investigated.

12.
To address the tendency of stochastic gradient descent to converge to local optima, this paper proposes a stochastic gradient descent method with fractional-order momentum to improve the recognition accuracy and convergence speed of convolutional neural networks. Combining momentum-based stochastic gradient descent with the fractional-order difference operation, the parameter update rule is modified; the influence of the fractional order on training is discussed, and a method for tuning the order is given. Experiments on the MNIST and CIFAR-10 datasets show that the proposed method improves both the recognition accuracy and the convergence speed of convolutional neural networks.
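A hedged sketch of a fractional-order momentum update via a truncated Grünwald-Letnikov difference over recent gradients; the paper's exact update rule and order-tuning method are not reproduced here.

```python
import numpy as np

def fractional_momentum_step(w, grad_history, lr=0.01, order=0.9, K=5):
    """SGD step whose momentum term is a truncated Grunwald-Letnikov
    fractional difference (of the given order) of the last K gradients."""
    coeff, update = 1.0, np.zeros_like(w)
    for k, g in enumerate(reversed(grad_history[-K:])):  # newest gradient first
        update += coeff * g
        coeff *= (k - order) / (k + 1)                   # GL coefficient recursion
    return w - lr * update
```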

13.
To address the problems of the synergetic approach in pattern recognition, a method for reconstructing the order parameters in a synergetic neural network is proposed. The method exploits the global search capability of a genetic algorithm: after learning from the training sample set, it searches the parameter space of the order-parameter construction globally for the optimal reconstruction parameters. Tests of the new algorithm on real sampled data show that it reliably finds a set of order-parameter reconstruction parameters and substantially improves recognition performance.
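A minimal real-coded genetic-algorithm sketch of the kind of global search used here; tournament selection, blend crossover, and Gaussian mutation are generic choices for illustration, not the paper's specific operators.

```python
import numpy as np

def ga_search(fitness, dim, pop=30, gens=100, pm=0.1, sigma=0.1):
    """Maximize `fitness` over R^dim; returns the best individual found."""
    X = np.random.uniform(-1.0, 1.0, (pop, dim))
    for _ in range(gens):
        fit = np.array([fitness(x) for x in X])
        def pick():                                 # binary tournament selection
            i, j = np.random.randint(pop, size=2)
            return X[i] if fit[i] > fit[j] else X[j]
        children = []
        for _ in range(pop):
            a = np.random.rand(dim)
            child = a * pick() + (1.0 - a) * pick()  # blend crossover
            mask = np.random.rand(dim) < pm
            child[mask] += sigma * np.random.randn(int(mask.sum()))  # mutation
            children.append(child)
        X = np.array(children)
    fit = np.array([fitness(x) for x in X])
    return X[fit.argmax()]
```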

14.
This paper studies indoor thermal comfort from the perspective of smart homes. It analyzes the PMV thermal-comfort index and points out that some of its parameters are difficult to obtain in smart-home scenarios. Ignoring air velocity and mean radiant temperature, climate and environmental features are introduced to fit the PMV formula. A BP neural network optimized by the differential evolution (DE) algorithm (DE-BP) is used to build the fitting model: DE optimizes the network parameters, the network is trained with momentum-accelerated stochastic gradient descent, and a normalization layer with affine transformation and L2 regularization are added. Test results show that the model outperforms a conventional BP neural network in convergence speed, stability, and generalization; within a small error range it can be applied in systems that compute thermal comfort, reducing the difficulty of obtaining the input parameters.
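A minimal DE/rand/1/bin sketch of the differential-evolution search that a DE-BP scheme uses to set network parameters before gradient training; `loss` is an assumed callable mapping a flattened weight vector to training error.

```python
import numpy as np

def de_search(loss, dim, pop=20, gens=100, F=0.5, CR=0.9):
    """Classic DE/rand/1/bin minimization over a flat parameter vector."""
    X = np.random.uniform(-1.0, 1.0, (pop, dim))
    fit = np.array([loss(x) for x in X])
    for _ in range(gens):
        for i in range(pop):
            idx = np.random.choice([j for j in range(pop) if j != i], 3,
                                   replace=False)
            a, b, c = X[idx]
            mutant = a + F * (b - c)                  # differential mutation
            cross = np.random.rand(dim) < CR
            cross[np.random.randint(dim)] = True      # at least one gene crosses
            trial = np.where(cross, mutant, X[i])
            f_trial = loss(trial)
            if f_trial < fit[i]:                      # greedy one-to-one selection
                X[i], fit[i] = trial, f_trial
    return X[fit.argmin()]
```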

15.
A classification algorithm based on a rough-set neural network
When the input dimensionality is high, a neural network becomes structurally complex and large, which slows its convergence. To overcome this drawback, a decision-rule-based neural network (RDRN) is proposed: rough-set theory is used to extract the most concise decision rules from the data samples, and an incompletely connected neural network is constructed according to the semantics of those rules. The network parameters are computed and initialized from the decision rules, which reduces the number of training iterations and speeds up convergence. An ant colony algorithm is used to find the optimal discretization of the continuous input attributes, yielding an optimal network structure. Finally, experiments compare the proposed method with conventional neural networks and support vector machine classifiers, showing that the proposed network converges faster and classifies more efficiently.

16.
Rule revision with recurrent neural networks
Recurrent neural networks readily process, recognize, and generate temporal sequences. By encoding grammatical strings as temporal sequences, recurrent neural networks can be trained to behave like deterministic sequential finite-state automata, and algorithms have been developed for extracting grammatical rules from trained networks. Using a simple method for inserting prior knowledge (or rules) into recurrent neural networks, we show that recurrent neural networks are able to perform rule revision. Rule revision is performed by comparing the inserted rules with the rules in the finite-state automata extracted from trained networks. The results from training a recurrent neural network to recognize a known, non-trivial, randomly generated regular grammar show that the networks not only preserve correct rules but can also, through training, correct inserted rules that were initially incorrect (i.e., rules that were not in the randomly generated grammar).

17.
This article presents efficient training algorithms, based on first-order, second-order, and conjugate-gradient optimization methods, for a class of convolutional neural networks (CoNNs) known as shunting inhibitory convolutional neural networks. Furthermore, a new hybrid method is proposed, derived from the principles of Quickprop, Rprop, SuperSAB, and least squares (LS). Experimental results show that the new hybrid method performs as well as the Levenberg-Marquardt (LM) algorithm, but at much lower computational cost and with less memory storage. For comparison's sake, the visual pattern-recognition task of face/nonface discrimination is chosen as the classification problem for evaluating the training algorithms. Sixteen training algorithms are implemented for three variants of the proposed CoNN architecture: binary-, Toeplitz-, and fully connected. All implemented algorithms train the three network architectures successfully, but their convergence speeds vary markedly. In particular, the combinations of LS with the new hybrid method and LS with the LM method achieve the best convergence rates in terms of the number of training epochs. In addition, the classification accuracies of all three architectures are assessed using ten-fold cross-validation. The binary- and Toeplitz-connected architectures slightly outperform the fully connected architecture: the lowest error rates across all training algorithms are 1.95% for the Toeplitz-connected, 2.10% for the binary-connected, and 2.20% for the fully connected network. In general, the modified Broyden-Fletcher-Goldfarb-Shanno (BFGS) methods, the three variants of the LM algorithm, and the new hybrid/LS method perform consistently well, achieving error rates below 3% averaged across all three architectures.
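For reference, a minimal sketch of one of the named ingredients, an Rprop-style update (the iRprop- variant), which adapts a per-weight step size from gradient sign agreement; this is a textbook formulation, not the paper's hybrid method.

```python
import numpy as np

def rprop_step(w, grad, prev_grad, step, up=1.2, down=0.5,
               step_min=1e-6, step_max=50.0):
    """One iRprop- update: only gradient signs are used, never magnitudes."""
    agree = grad * prev_grad
    step = np.where(agree > 0, np.minimum(step * up, step_max), step)
    step = np.where(agree < 0, np.maximum(step * down, step_min), step)
    grad = np.where(agree < 0, 0.0, grad)   # suppress update after a sign flip
    return w - np.sign(grad) * step, grad, step
```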

18.
Conventional gradient algorithms suffer from slow convergence. To address this, a gradient algorithm with a penalty term added to the conventional error function is proposed for training recurrent pi-sigma neural networks. The algorithm not only improves the network's generalization ability but also overcomes the slow convergence caused by overly small initial weights, converging faster than the gradient algorithm without the penalty term. The convergence of the penalized gradient algorithm is analyzed theoretically, and the effectiveness of the algorithm is verified experimentally.

19.
Language model (LM) data augmentation based on maximum likelihood estimation (MLE) suffers from exposure bias and therefore cannot generate sampled data with long-range semantic information. This paper proposes an adversarial training strategy for language-model data augmentation: an auxiliary convolutional neural network discriminator judges whether generated data are real, thereby guiding a recurrent neural network generator to learn the distribution of real data. LM data augmentation is essentially a discrete-sequence generation problem, and when the generator's output is discrete, the discriminator's error cannot be backpropagated to the generator. To solve this, the discrete-sequence generation problem is formulated as a reinforcement-learning problem, and the generator is optimized using the discriminator's output as the reward; moreover, since the discriminator can only evaluate complete generated sequences, a Monte Carlo search algorithm is used to evaluate the intermediate states of generated sequences. Multi-candidate rescoring experiments in speech recognition show that, under limited text data, the proposed method further reduces the character error rate (CER) as the amount of training data grows, and consistently outperforms MLE-based data augmentation. When the training data reach 6M words, the proposed method reduces the CER relative to the baseline system by 5.0% on the THCHS30 dataset and by 7.1% on the AISHELL dataset.
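A hedged sketch of the Monte Carlo evaluation of intermediate states: a partial sequence is completed several times by the generator, and the discriminator's scores of the completions are averaged into a per-step reward. `rollout` and `discriminator` are hypothetical callables standing in for the generator's sampling routine and the CNN discriminator.

```python
import numpy as np

def mc_state_reward(prefix, rollout, discriminator, n_rollouts=16):
    """Reward for a partial sequence: mean discriminator score (probability
    of being real) over n_rollouts generator completions of the prefix."""
    scores = [discriminator(rollout(prefix)) for _ in range(n_rollouts)]
    return float(np.mean(scores))
```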

20.
The prediction model for mine gas content is a multivariate, nonlinear functional relationship, and the accuracy of the model depends on the interacting, mutually coupled characteristics of the influencing factors. This paper combines a neural network with particle swarm optimization: on the basis of neural network theory, the particle swarm algorithm optimizes the number of hidden-layer neurons and the connection weights, yielding a gas-content prediction model that overcomes the slow convergence and susceptibility to local optima of BP neural networks. Training and test sample sets are built from historical data, and simulation in MATLAB shows that the particle-swarm neural network model is reliable and achieves high prediction accuracy.
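A minimal particle-swarm sketch for optimizing a flattened network weight vector (hidden-layer sizing, which the paper also optimizes, is omitted); `loss` is an assumed callable evaluating the network's prediction error.

```python
import numpy as np

def pso_search(loss, dim, n_particles=30, iters=100, inertia=0.7, c1=1.5, c2=1.5):
    """Global-best PSO minimizing `loss` over R^dim."""
    pos = np.random.uniform(-1.0, 1.0, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([loss(p) for p in pos])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(iters):
        r1 = np.random.rand(n_particles, dim)
        r2 = np.random.rand(n_particles, dim)
        vel = inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        vals = np.array([loss(p) for p in pos])
        better = vals < pbest_val                 # update personal bests
        pbest[better], pbest_val[better] = pos[better], vals[better]
        gbest = pbest[pbest_val.argmin()].copy()  # update global best
    return gbest
```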


