Similar literature
 20 similar documents found
1.
Based on adaptive dynamic programming (ADP), the fixed-point tracking control problem is solved by a value iteration (VI) algorithm. First, a class of discrete-time (DT) nonlinear systems with disturbance is considered. Second, the convergence of the VI algorithm is established: the iterative cost function is proven to converge precisely to the optimal value, and the control input and disturbance input converge to their optimal values. Third, a novel analysis of the range of the discount factor is presented, in which the cost function serves as a Lyapunov function. Finally, neural networks (NNs) are employed to approximate the cost function, the control law, and the disturbance law. Simulation examples illustrate the effectiveness of the proposed method.
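The value iteration idea in this abstract can be illustrated on a much simpler problem than the one the paper treats. The sketch below is only an assumption-laden toy: it drops the disturbance input, the tracking target, and the neural networks, and uses a hypothetical scalar linear system x[k+1] = a*x[k] + b*u[k] with stage cost q*x^2 + r*u^2 and discount factor gamma, so that the iterative cost function is V_i(x) = p_i*x^2. The function name and all parameters are illustrative and do not come from the paper.

```python
def value_iteration(a, b, q, r, gamma, tol=1e-12, max_iter=10_000):
    """Discounted value iteration for x[k+1] = a*x[k] + b*u[k].

    Stage cost is q*x^2 + r*u^2. The iterative cost function is
    V_i(x) = p_i * x^2, started from V_0 = 0.
    Returns the list of p_i and the final feedback gain k (u = -k*x).
    """
    p = 0.0
    history = [p]
    for _ in range(max_iter):
        # Greedy control for V_i: minimise q*x^2 + r*u^2 + gamma*p*(a*x + b*u)^2 over u.
        k = gamma * a * b * p / (r + gamma * b * b * p)
        # Bellman backup using that control.
        p_next = q + r * k * k + gamma * p * (a - b * k) ** 2
        history.append(p_next)
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    # Gain that is greedy with respect to the final cost function.
    k = gamma * a * b * p / (r + gamma * b * b * p)
    return history, k


if __name__ == "__main__":
    hist, gain = value_iteration(a=1.2, b=1.0, q=1.0, r=1.0, gamma=0.95)
    print(len(hist) - 1, hist[-1], gain)
```

Starting from V_0 = 0, the sequence p_i is non-decreasing and settles at the fixed point of the discounted Bellman equation. In this toy case the resulting closed loop a - b*k is stable even though the open-loop system is not; whether that holds in general depends on the discount factor, which is the kind of question the abstract's discount-factor analysis addresses.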

2.
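Drawing on the idea of data-driven control, an iterative neural dynamic programming method is established for designing near-optimal regulators for discrete-time nonlinear systems. An iterative adaptive dynamic programming algorithm for general discrete-time nonlinear systems is proposed, and its convergence and optimality are proved. By constructing three neural networks, the globalized dual heuristic dynamic programming technique and its detailed implementation are given, where the action network is trained within the neural dynamic programming framework. This novel architecture approximates the cost function and its derivative while adaptively learning the near-optimal control law without relying on the system dynamics. Notably, it relaxes the requirement on the control matrix or its neural network representation, a clear improvement over existing results for iterative adaptive dynamic programming algorithms, and can promote the development of data-based optimization and control design for complex nonlinear systems. Two simulation experiments verify the effectiveness of the proposed data-driven optimal regulation method.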

3.
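For the finite-horizon optimal control problem of partially unknown continuous-time nonlinear systems with saturating actuators, an online integral reinforcement learning algorithm based on adaptive dynamic programming (ADP) is designed, and a proof of its convergence is given. First, a non-quadratic function is introduced to handle control saturation. Second, a single network composed of constant weights and time-varying activation functions is designed to approximate the unknown continuous value function, which reduces the computational load compared with the conventional dual-network structure. Meanwhile, taking both the residual error produced by the neural network and the terminal error into account, the least-squares method is used to update the network weights, and the neural-network-based iterative value function is proved to converge to the optimal value. Finally, two simulation examples verify the effectiveness of the algorithm.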
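The abstract does not state which non-quadratic function is used. A form that often appears in the constrained-input ADP literature, and which is assumed here purely for illustration, is W(u) = 2*r*u_max * ∫_0^u atanh(v/u_max) dv for a scalar input bounded by u_max. The sketch below evaluates its closed form and the bounded control obtained by minimising W(u) + grad_v*g*u over u; the names `nonquadratic_cost`, `saturated_control`, `u_max`, `r`, `g`, and `grad_v` are hypothetical and not taken from the paper.

```python
import math


def nonquadratic_cost(u, u_max, r=1.0):
    """Closed form of W(u) = 2*r*u_max * integral_0^u atanh(v/u_max) dv, for |u| < u_max."""
    s = u / u_max
    return 2.0 * r * u_max * u * math.atanh(s) + r * u_max ** 2 * math.log(1.0 - s * s)


def saturated_control(grad_v, g, u_max, r=1.0):
    """Minimiser of W(u) + grad_v*g*u over u.

    Setting dW/du + g*grad_v = 0 gives 2*r*u_max*atanh(u/u_max) = -g*grad_v,
    so the result always lies strictly inside (-u_max, u_max).
    """
    return -u_max * math.tanh(g * grad_v / (2.0 * r * u_max))


if __name__ == "__main__":
    for grad in (0.1, 1.0, 10.0, 100.0):
        u = saturated_control(grad, g=1.0, u_max=0.5)
        print(grad, u, nonquadratic_cost(u, u_max=0.5))
```

For small inputs this cost behaves like the usual quadratic r*u^2, while near the bound it grows steeply, which is why the minimising control never exceeds the actuator limit no matter how large the value-function gradient becomes.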

4.
This paper proposes an online adaptive approximate solution for the infinite-horizon optimal tracking control problem of continuous-time nonlinear systems with unknown dynamics. The requirement of complete knowledge of the system dynamics is avoided by employing an adaptive identifier in conjunction with a novel adaptive law, such that the estimated identifier weights converge to a small neighborhood of their ideal values. An adaptive steady-state controller is developed to maintain the desired tracking performance at steady state, and an adaptive optimal controller is designed to stabilize the tracking error dynamics in an optimal manner. For this purpose, a critic neural network (NN) is utilized to approximate the optimal value function of the Hamilton-Jacobi-Bellman (HJB) equation, which is used in the construction of the optimal controller. The two NNs, i.e., the identifier NN and the critic NN, learn continuously and simultaneously by means of a novel adaptive law design methodology based on the parameter estimation error. Stability of the whole system, consisting of the identifier NN, the critic NN, and the optimal tracking control, is guaranteed using Lyapunov theory, and convergence to a near-optimal control law is proved. Simulation results exemplify the effectiveness of the proposed method.

5.
In this paper, we aim to solve the finite-horizon optimal control problem for a class of nonlinear discrete-time switched systems using an adaptive dynamic programming (ADP) algorithm. A new ε-optimal control scheme based on the iterative ADP algorithm is presented, which makes the value function converge iteratively, in finite time, to the greatest lower bound of all value function indices within an error bound determined by ε. Two neural networks are used as parametric structures to implement the iterative ADP algorithm with the ε-error bound; they approximate the value function and the control policy, respectively. The optimal control policy is then obtained. Finally, a simulation example is included to illustrate the applicability of the proposed method.

6.
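Lin Xiaofeng, Ding Qiang. 《控制与决策》 (Control and Decision), 2015, 30(3): 495-499
To solve finite-horizon optimal control problems, the adaptive dynamic programming (ADP) algorithm requires that the controlled system can be driven to zero in one step. For nonlinear systems that cannot be driven to zero in one step, an improved ADP algorithm is proposed whose initial cost function is constructed from an arbitrary finite-time admissible sequence. The iteration procedure of the algorithm is derived and its convergence is proved. When the approximation error of the critic network is taken into account and the stated assumptions hold, the iterative cost function converges to a bounded neighborhood of the optimal cost function. A simulation example verifies the effectiveness of the proposed method.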

7.
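For the attitude tracking problem of reentry vehicles, an optimal tracking control based on recurrent neural networks is proposed. Using backstepping and a recurrent neural network, an adaptive feedforward control is designed that converts the optimal attitude tracking problem into an equivalent optimal regulation problem for the attitude-angle error and angular-rate error. Adaptive dynamic programming is then used to solve the optimal regulation problem. A neural network is introduced to estimate the cost function in the optimal control, and the optimal feedback control law is derived while the estimation error of the Hamilton-Jacobi-Isaacs (HJI) equation is minimized. Lyapunov theory guarantees that all signals in the closed-loop system, including the attitude-angle tracking error, are uniformly ultimately bounded. Simulations in MATLAB/Simulink verify the effectiveness of the proposed control strategy.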

8.
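To overcome the limitation that existing near-optimal tracking control methods can only track continuously differentiable reference inputs, this paper proposes a new robust near-optimal tracking control method based on adaptive dynamic programming for a class of continuous-time nonlinear time-invariant affine systems with unknown dynamics. First, a recurrent neural network is used to model the system. Then a critic neural network is built to estimate the optimal performance index, which yields an estimate of its partial derivative and hence the near-optimal tracking controller. Finally, a robust term is designed from the tracking error between the system output and the reference input to compensate for the neural network modeling error. Simulation experiments on two nonlinear systems demonstrate the effectiveness and superiority of the proposed method.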

9.
This paper develops a new iterative adaptive dynamic programming algorithm, the discrete-time time-varying policy iteration (DTTV) algorithm, for infinite-horizon optimal control problems of discrete time-varying nonlinear systems. The iterative control law is designed to update the iterative value function, which approximates the optimal performance index function. The admissibility of the iterative control law is analyzed. The results show that the iterative value function converges in a non-increasing manner to the optimal solution of the Bellman equation. To implement the algorithm, neural networks are employed and a new implementation structure is established, which avoids solving the generalized Bellman equation in each iteration. Finally, the optimal control laws for torsional pendulum and inverted pendulum systems are obtained using the DTTV policy iteration algorithm, where the mass and the pendulum bar length are permitted to be time-varying parameters. The effectiveness of the developed method is illustrated by numerical results and comparisons.

10.
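A new optimal tracking control scheme is proposed for a class of discrete-time nonlinear systems with unknown dynamics and actuator saturation. The scheme is based on the iterative adaptive dynamic programming algorithm. To achieve optimal control, a data-based identifier of the unknown system dynamics is first established. By introducing an M network, an exact expression for the steady-state control is obtained. To eliminate the effect of actuator saturation, a non-quadratic performance index function is proposed. An iterative adaptive dynamic programming algorithm is then developed to solve for the optimal tracking control, and a convergence analysis is given. To implement the scheme, neural networks are used to build the data-based identifier, compute the performance index function, approximate the optimal control policy, and solve for the steady-state control. Simulation results verify the effectiveness of the proposed optimal tracking control method.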

11.
In this paper, a novel iterative adaptive dynamic programming (ADP) algorithm, called the generalised policy iteration ADP algorithm, is developed to solve optimal tracking control problems for discrete-time nonlinear systems. The idea is to use two iteration procedures, an i-iteration and a j-iteration, to obtain the iterative tracking control laws and the iterative value functions. By system transformation, we first convert the optimal tracking control problem into an optimal regulation problem. Then the generalised policy iteration ADP algorithm, which generalises the policy iteration and value iteration algorithms by interleaving them, is introduced to deal with the optimal regulation problem. The convergence and optimality properties of the generalised policy iteration algorithm are analysed. Three neural networks are used to implement the developed algorithm. Finally, simulation examples are given to illustrate the performance of the present algorithm.
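The interplay of an outer i-iteration and an inner j-iteration can be pictured on a toy regulation problem. The sketch below is not the paper's algorithm: it assumes a hypothetical scalar linear system x[k+1] = a*x[k] + b*u[k] with undiscounted cost sum(q*x^2 + r*u^2), a quadratic value function V(x) = p*x^2, and a fixed number `n_eval` of inner sweeps, with no neural networks and no tracking transformation. All names are illustrative.

```python
def generalised_policy_iteration(a, b, q, r, p0, n_eval, n_outer=200):
    """Toy i-/j-iteration scheme for x[k+1] = a*x[k] + b*u[k].

    V(x) = p*x^2 throughout. Each outer i-iteration improves the control
    law u = -k*x against the current p; the inner j-iteration then applies
    n_eval policy-evaluation backups with that fixed control law.
    n_eval = 1 reduces to value iteration; a large n_eval approaches
    policy iteration.
    """
    p = p0
    gains = []
    for _ in range(n_outer):
        # i-iteration: policy improvement against the current value function.
        k = a * b * p / (r + b * b * p)
        # j-iteration: partial policy evaluation under the fixed gain k.
        for _ in range(n_eval):
            p = q + r * k * k + (a - b * k) ** 2 * p
        gains.append(k)
    return p, gains


if __name__ == "__main__":
    for n_eval in (1, 3, 20):
        p, gains = generalised_policy_iteration(1.2, 1.0, 1.0, 1.0, p0=10.0, n_eval=n_eval)
        print(n_eval, p, gains[-1])
```

In this toy setting every choice of `n_eval` reaches the same fixed point, the positive root of the scalar Riccati equation. The initial value p0 = 10 is chosen large enough that one Bellman backup lowers it, which is an assumption of this sketch about how such schemes are typically initialised.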

12.
In this paper, a data-based scheme is proposed to solve the optimal tracking problem of autonomous nonlinear switching systems. The system state is forced to track the reference signal by minimizing the performance function. First, the problem is transformed into solving the corresponding Bellman optimality equation in terms of the Q-function (also called the action-value function). Then, an iterative algorithm based on adaptive dynamic programming (ADP) is developed to find the optimal solution using sampled data only. A linear-in-parameter (LIP) neural network is taken as the value function approximator. Considering the presence of approximation error at each iteration step, the generated sequence of approximate value functions is proved to be bounded around the exact optimal solution under some verifiable assumptions. Moreover, the effect of terminating the learning process after a finite number of iterations is investigated. A sufficient condition for asymptotic stability of the tracking error is derived. Finally, the effectiveness of the algorithm is demonstrated with three simulation examples.

13.
An online adaptive optimal control is proposed for continuous-time nonlinear systems with completely unknown dynamics, which is achieved by developing a novel identifier-critic-based approximate dynamic programming algorithm with a dual neural network (NN) approximation structure. First, an adaptive NN identifier is designed to obviate the requirement of complete knowledge of the system dynamics, and a critic NN is employed to approximate the optimal value function. Then, the optimal control law is computed from the information provided by the identifier NN and the critic NN, so that an actor NN is not needed. In particular, a novel adaptive law design method based on the parameter estimation error is proposed to update the weights of both the identifier NN and the critic NN online and simultaneously; the weights converge to small neighbourhoods around their ideal values. The closed-loop system stability and the convergence to a small vicinity of the optimal solution are proved by means of Lyapunov theory. The proposed adaptation algorithm is also improved to achieve finite-time convergence of the NN weights. Finally, simulation results are provided to exemplify the efficacy of the proposed methods.

14.
Although the optimal regulation problem has been well studied, optimal tracking control via adaptive dynamic programming (ADP) has not been completely resolved, particularly for nonlinear uncertain systems. In this paper, an online adaptive learning method is developed to realize the optimal tracking control design for nonlinear motor driven systems (NMDSs), which combines the concepts of ADP, an unknown system dynamic estimator (USDE), and a prescribed performance function (PPF). To this end, a USDE in a simple form is first proposed to address NMDSs with bounded disturbances. Then, based on the estimated unknown dynamics, we define an optimal cost function and derive the optimal tracking control. The derived optimal tracking control is divided into two parts, namely steady-state control and optimal feedback control. The steady-state control can be obtained directly from the tracking commands. The optimal feedback control can be obtained via ADP based on the PPF, which improves the convergence of the critic neural network (CNN) weights and the tracking accuracy of NMDSs. Simulations are provided to demonstrate the feasibility of the designed control method.

15.
In this paper, a novel optimal control design scheme is proposed for continuous-time nonaffine nonlinear dynamic systems with unknown dynamics by adaptive dynamic programming (ADP). The proposed methodology iteratively updates the control policy online by using the state and input information without identifying the system dynamics. An ADP algorithm is developed, and can be applied to a general class of nonlinear control design problems. The convergence analysis for the designed control scheme is presented, along with rigorous stability analysis for the closed-loop system. The effectiveness of this new algorithm is illustrated by two simulation examples.

16.
Based on a combination of a PD controller and a switching-type two-parameter compensation force, an iterative learning controller with a projection-free adaptive algorithm is presented in this paper for repetitive control of uncertain robot manipulators. The adaptive iterative learning controller is designed without any a priori knowledge of the robot parameters, relying only on certain properties of the dynamics of robot manipulators with revolute joints. The new adaptive algorithm uses a combined time-domain and iteration-domain adaptation law that guarantees the boundedness of the tracking error and the control input, in the sense of the infinity norm, as well as the convergence of the tracking error to zero. Simulation results are provided to illustrate the effectiveness of the learning controller.

17.
In this paper, a decentralised tracking control (DTC) scheme is developed for unknown large-scale nonlinear systems by using observer-critic structure-based adaptive dynamic programming. The control consists of a local desired control, a local tracking error control, and a compensator. By introducing a local neural network observer, the subsystem dynamics can be identified. The identified subsystems provide the local desired control and the control input matrix, which is used in the local tracking error control. Meanwhile, the Hamilton-Jacobi-Bellman equation can be solved by constructing a critic neural network, so the local tracking error control can be derived directly. To compensate for the overall error caused by substitution, observation, and approximation of the local tracking error control, an adaptive robustifying term is employed. Simulation examples are provided to demonstrate the effectiveness of the proposed DTC scheme.

18.
This paper concerns a novel optimal self-learning battery sequential control scheme for smart home energy systems. The main idea is to use the adaptive dynamic programming (ADP) technique to obtain the optimal battery sequential control iteratively. First, the battery energy management system model is established, where the power efficiency of the battery is considered. Next, considering the power constraints of the battery, a new non-quadratic performance index function is established, which guarantees that the iterative control law cannot exceed the maximum charging/discharging power of the battery, thereby extending its service life. Then, the convergence properties of the iterative ADP algorithm are analyzed, which guarantees that the iterative value function and the iterative control law both reach their optimums. Finally, simulation and comparison results are given to illustrate the performance of the presented method.

19.
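For the optimal control problem with a quadratic performance index function for a class of nonlinear systems with time delays in both the state and control variables, this paper proposes an optimal control scheme based on a new iterative adaptive dynamic programming algorithm. By introducing a delay matrix function and applying dynamic programming theory, an explicit expression for the optimal control is obtained, and the optimal control is then computed through the adaptive critic technique. A convergence proof is given to guarantee that the performance index function converges to the optimum. To implement the proposed algorithm, neural networks are used to approximate the performance index function, compute the optimal control policy, solve for the delay matrix function, and model the nonlinear system. Finally, two simulation examples illustrate the effectiveness of the proposed optimal policy.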

20.
In this paper, we aim to solve the finite-horizon optimal control problem for a class of discrete-time nonlinear systems with unfixed initial state using the adaptive dynamic programming (ADP) approach. A new ε-optimal control algorithm based on the iterative ADP approach is proposed, which makes the performance index function converge iteratively, in finite time, to the greatest lower bound of all performance indices within an error bound determined by ε. The optimal number of control steps can also be obtained by the proposed ε-optimal control algorithm when the initial state of the system is unfixed. Neural networks are used to approximate the performance index function and to compute the optimal control policy, respectively, which facilitates the implementation of the ε-optimal control algorithm. Finally, a simulation example is given to show the results of the proposed method.
