Similar literature
 Found 20 similar documents (search time: 0 ms)
1.
In this paper, a novel optimal control design scheme is proposed for continuous-time nonaffine nonlinear dynamic systems with unknown dynamics by adaptive dynamic programming (ADP). The proposed methodology iteratively updates the control policy online by using the state and input information without identifying the system dynamics. An ADP algorithm is developed that can be applied to a general class of nonlinear control design problems. The convergence analysis for the designed control scheme is presented, along with rigorous stability analysis for the closed-loop system. The effectiveness of this new algorithm is illustrated by two simulation examples.

2.
S. Sen, S. J. Yakowitz. Automatica, 1987, 23(6): 749-752
We develop a quasi-Newton differential dynamic programming algorithm (QDDP) for discrete-time optimal control problems. In the spirit of dynamic programming, the quasi-Newton approximations are performed in a stagewise manner. We establish the global convergence of the method and also show a superlinear convergence rate. Among other advantages of the QDDP method, second derivatives need not be calculated. In theory, the computational effort of each recursion grows proportionally to the number of stages N, whereas with conventional quasi-Newton techniques, which do not take advantage of the optimal control problem structure, the growth is as N^2. Computational results are also reported.

3.
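To overcome the limitation that existing approximate optimal tracking control methods can only track continuously differentiable reference inputs, this paper proposes a new robust approximate optimal tracking control method based on adaptive dynamic programming for a class of continuous-time nonlinear time-invariant affine systems with unknown dynamics. First, a recurrent neural network is used to build the system model. Then a critic neural network is built to estimate the optimal performance index, which yields an estimate of the partial derivative of the optimal performance index and, in turn, the approximate optimal tracking controller. Finally, a robust term is designed from the tracking error between the system output and the reference input to compensate for the neural network modeling error. Simulation experiments on two nonlinear systems demonstrate the effectiveness and superiority of the proposed method.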

4.
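林小峰, 丁强. 《控制与决策》 (Control and Decision), 2015, 30(3): 495-499
To solve finite-horizon optimal control problems, the adaptive dynamic programming (ADP) algorithm requires that the controlled system can be driven to zero in one step. For nonlinear systems that cannot be driven to zero in one step, an improved ADP algorithm is proposed whose initial cost function is constructed from an arbitrary finite-time admissible sequence. The iteration of the algorithm is derived and its convergence is proved. When the approximation error of the critic network is taken into account and the stated assumptions hold, the iterative cost function converges to a bounded neighborhood of the optimal cost function. A simulation example verifies the effectiveness of the proposed method.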

5.
We present a numerical procedure for solving optimal control problems with both linear terminal constraints and multiple criteria. Using a Chebyshev spectral procedure, the problem reduces to a constrained optimization problem, which can be solved using the hybrid penalty partial quadratic interpolation (HPPQI) technique. The proposed procedure compares quite favorably with other methods on a sample of well-known examples.

6.
This paper presents an approximate/adaptive dynamic programming (ADP) algorithm that uses the idea of integral reinforcement learning (IRL) to determine online the Nash equilibrium solution for the two-player zero-sum differential game with linear dynamics and infinite-horizon quadratic cost. The algorithm is built around an iterative method that has been developed in the control engineering community for solving the continuous-time game algebraic Riccati equation (CT-GARE), which underlies the game problem. We show how the ADP techniques enhance the capabilities of the offline method, allowing an online solution without requiring complete knowledge of the system dynamics. The feasibility of the ADP scheme is demonstrated in simulation for a power system control application. The adaptation goal is the best control policy, one that optimally counteracts the highest load disturbance.

7.
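For a class of nonlinear zero-sum differential games, this paper proposes an event-triggered adaptive dynamic programming (ET-ADP) algorithm that solves for the saddle point online. First, a new adaptive event-triggering condition is proposed. Then a neural network whose input is sampled data (the critic network) is used to approximate the optimal value function, and a new neural network weight update law is designed so that the value function, the control policy and the disturbance policy are updated synchronously only at the triggering instants. Furthermore, Lyapunov stability theory is used to prove that the proposed algorithm obtains the saddle point of the nonlinear zero-sum differential game online without causing Zeno behavior. Because the ET-ADP algorithm updates the value function, the control policy and the disturbance policy only when the event-triggering condition is satisfied, it effectively reduces both the computational load and the network load. Finally, two simulation examples verify the effectiveness of the proposed ET-ADP algorithm.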

8.
We investigate the optimization of linear impulse systems with the reinforcement learning based adaptive dynamic programming (ADP) method. For linear impulse systems, the optimal objective function is shown to be a quadratic form of the pre-impulse states. The ADP method provides solutions that iteratively converge to the optimal objective function. If an initial guess of the pre-impulse objective function is selected as a quadratic form of the pre-impulse states, the objective function iteratively converges to the optimal one through ADP. Though direct use of the quadratic objective function of the states within the ADP method is theoretically possible, a numerical singularity problem may occur due to the matrix inversion therein when the system dimensionality increases. A neural network based ADP method can circumvent this problem. A neural network with polynomial activation functions is selected to approximate the pre-impulse objective function and trained iteratively using the ADP method to achieve optimal control. After successful training, optimal impulse control can be derived. Simulations are presented for illustrative purposes.

9.
In this paper, a new iterative adaptive dynamic programming (ADP) method is proposed to solve a class of continuous-time nonlinear two-person zero-sum differential games. The idea is to use the ADP technique to obtain the optimal control pair iteratively, which makes the performance index function reach the saddle point of the zero-sum differential games. If the saddle point does not exist, the mixed optimal control pair is obtained to make the performance index function reach the mixed optimum. Stability analysis of the nonlinear systems is presented and the convergence property of the performance index function is also proved. Two simulation examples are given to illustrate the performance of the proposed method.

10.
In this paper, a finite-horizon neuro-optimal tracking control strategy for a class of discrete-time nonlinear systems is proposed. Through system transformation, the optimal tracking problem is converted into designing a finite-horizon optimal regulator for the tracking error dynamics. Then, with convergence analysis in terms of cost function and control law, the iterative adaptive dynamic programming (ADP) algorithm via heuristic dynamic programming (HDP) technique is introduced to obtain the finite-horizon optimal tracking controller, which makes the cost function close to its optimal value within an ε-error bound. Three neural networks are used as parametric structures to implement the algorithm; they approximate the cost function, the control law, and the error dynamics, respectively. Two simulation examples are included to complement the theoretical discussions.
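As a rough illustration of the value-iteration idea behind HDP, the sketch below iterates the cost function for a scalar linear-quadratic problem, where the minimization over the control has a closed form. It is not the algorithm of the abstract: the system `x[k+1] = a*x[k] + b*u[k]`, the weights `q` and `r`, the zero initial cost and the stopping rule on `eps` are all assumptions chosen for illustration, and no neural networks are involved.

```python
def hdp_value_iteration(a, b, q, r, eps=1e-9, max_iter=10_000):
    """Iterate V_{i+1}(x) = min_u [q x^2 + r u^2 + V_i(a x + b u)] with V_0 = 0.

    In this scalar linear-quadratic case V_i(x) = p_i * x^2, so only the
    coefficient p_i has to be iterated (hypothetical example setup).
    """
    p = 0.0
    for i in range(1, max_iter + 1):
        # The minimizing control is u = -k x with k = a b p / (r + b^2 p);
        # substituting it back gives the update below.
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) <= eps:
            # Successive cost functions differ by at most eps: stop here.
            return p_next, i
        p = p_next
    return p, max_iter


if __name__ == "__main__":
    p, iterations = hdp_value_iteration(a=1.2, b=1.0, q=1.0, r=1.0)
    print(f"cost coefficient {p:.6f} after {iterations} iterations")
```

Starting from a zero cost function, the coefficient increases monotonically toward the fixed point of the update, which is the scalar discrete-time Riccati solution for these assumed parameters.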

11.
This paper studies data-driven learning-based methods for the finite-horizon optimal control of linear time-varying discrete-time systems. First, a novel finite-horizon Policy Iteration (PI) method for linear time-varying discrete-time systems is presented. Its connections with existing infinite-horizon PI methods are discussed. Then, both data-driven off-policy PI and Value Iteration (VI) algorithms are derived to find approximate optimal controllers when the system dynamics is completely unknown. Under mild conditions, the proposed data-driven off-policy algorithms converge to the optimal solution. Finally, the effectiveness and feasibility of the developed methods are validated by a practical example of spacecraft attitude control.
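For orientation, the model-based solution that such data-driven methods aim to reproduce can be sketched with the classical backward Riccati recursion. The sketch below is only that baseline, not the PI or VI algorithm of the abstract: it assumes a scalar system `x[k+1] = a[k]*x[k] + b[k]*u[k]` with known coefficients, and the function names and cost weights are hypothetical.

```python
def finite_horizon_lq(a, b, q, r, q_final):
    """Backward Riccati recursion for x[k+1] = a[k] x[k] + b[k] u[k].

    Returns the time-varying gains (u[k] = -gains[k] * x[k]) and the
    cost-to-go coefficients p[0..N], where the optimal cost is p[0] * x0^2.
    """
    n = len(a)
    p = [0.0] * (n + 1)
    gains = [0.0] * n
    p[n] = q_final                       # terminal cost weight
    for k in range(n - 1, -1, -1):       # sweep backward in time
        gains[k] = a[k] * b[k] * p[k + 1] / (r + b[k] ** 2 * p[k + 1])
        p[k] = q + a[k] ** 2 * p[k + 1] - a[k] * b[k] * p[k + 1] * gains[k]
    return gains, p


def simulate_cost(a, b, q, r, q_final, gains, x0):
    """Roll the closed loop forward and accumulate the quadratic cost."""
    x, cost = x0, 0.0
    for k in range(len(a)):
        u = -gains[k] * x
        cost += q * x * x + r * u * u
        x = a[k] * x + b[k] * u
    return cost + q_final * x * x
```

A quick consistency check is that the simulated closed-loop cost equals `p[0] * x0**2`, and that perturbing any gain increases the cost.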

12.
This paper presents an Enhanced Self-adaptive Differential Evolution with Mixed Crossover (ESDE-MC) algorithm to solve multiobjective optimal power flow problems with conflicting objectives that reflect the minimization of total production cost, emission pollution, L-index, and active power loss. In this algorithm, a combination of eigenvector and binomial crossovers is used to move the current population towards better search positions and provide good-quality solutions. In addition, an adaptive dynamic parameter adjusting strategy is adopted to obtain appropriate parameter settings for the differential evolution algorithm during the evolution process. Further, an external archive is used to preserve all the nondominated solutions evaluated in each iteration, and a fuzzy decision-making technique is applied to extract the best compromise solution from all the nondominated solutions in the archive set. Finally, to investigate the usefulness of the proposed algorithm, IEEE 30-bus, IEEE 57-bus and Algerian 59-bus systems with different single and multiobjective OPF problems are solved, and the simulation results are evaluated and compared with other algorithms recently reported in the literature. The results indicate that the proposed algorithm is competent, effective and well suited to solving single- and multiobjective optimal power flow problems.
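The abstract does not state which fuzzy decision rule is used. A minimal sketch of one common choice, a linear membership function per objective followed by a normalized sum, is given below; the function name `best_compromise` and the assumption that all objectives are minimized are illustrative only.

```python
def best_compromise(front):
    """Pick a best compromise from a list of objective vectors (all minimized).

    Assumed rule: linear membership per objective, summed per solution and
    normalized over the front; the solution with the largest value wins.
    """
    n_obj = len(front[0])
    lo = [min(f[j] for f in front) for j in range(n_obj)]
    hi = [max(f[j] for f in front) for j in range(n_obj)]

    def membership(value, j):
        # 1 at the best (smallest) value of objective j, 0 at the worst.
        if hi[j] == lo[j]:
            return 1.0
        return (hi[j] - value) / (hi[j] - lo[j])

    scores = [sum(membership(f[j], j) for j in range(n_obj)) for f in front]
    total = sum(scores)
    normalized = [s / total for s in scores]
    best = max(range(len(front)), key=normalized.__getitem__)
    return best, normalized


if __name__ == "__main__":
    archive = [(1.0, 10.0), (2.0, 4.0), (10.0, 1.0)]   # made-up objective values
    index, weights = best_compromise(archive)
    print(index, weights)
```

With this rule, a solution that is moderately good in every objective is preferred over one that is best in a single objective and worst in another.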

13.
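A dynamic-programming-based multi-parametric programming method for constrained optimization problems and its application   Total citations: 1, self-citations: 0, citations by others: 1
Combining dynamic programming with single-step multi-parametric quadratic programming, a new multi-parametric programming method for constrained optimal control problems is proposed. On the one hand, it yields the explicit functional relationship between the optimal control sequence and the state for constrained linear-quadratic optimal control problems, reducing the work needed to solve the multi-parametric program. On the other hand, it simultaneously yields the state-feedback optimal control law. Using the proposed multi-parametric quadratic programming method, an explicit state-feedback optimal control law is established for the infinite-horizon constrained optimization problem. Numerical simulations are carried out on a vibration control model of an elevator mechanical system.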

14.
In this paper, the mixed H-two/H-infinity control synthesis problem is stated as a multiobjective optimization problem, with the objectives of minimizing the H-two and H-infinity norms simultaneously. Instead of building an LMI-based synthesis algorithm, a self-adaptive control parameter multiobjective differential evolution algorithm is developed directly in the controller parameter space. In the case of systems with polytopic uncertainties, the worst-case norm computation is formulated as an implicit optimization problem, and the proposed self-adaptive differential evolution is employed to calculate the worst-case H-two and H-infinity norms. The numerical examples illustrate the power and validity of the proposed approach for mixed H-two/H-infinity control multiobjective optimal design.

15.
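Multiobjective chaotic differential evolution algorithm   Total citations: 12, self-citations: 1, citations by others: 11
Differential evolution is applied to multiobjective optimization problems, and a multiobjective chaotic differential evolution algorithm (CDEMO) is proposed. The algorithm initializes the population with chaotic sequences and performs replacement operations using a chaotic reserve population. This operation not only helps maintain the uniformity of the non-dominated solution set but also strengthens the search capability of the algorithm. The performance of CDEMO is studied, and numerical experiments demonstrate its effectiveness.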
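The abstract does not say which chaotic map CDEMO uses. As a sketch only, the example below initializes a population from the logistic map `x <- 4 x (1 - x)`, a frequent choice for chaotic initialization; the function name, the default seed and the bounds are assumptions, and the chaotic reserve population is not modeled.

```python
def chaotic_population(pop_size, bounds, seed=0.37):
    """Initialize a DE population from a logistic-map sequence (assumed map)."""
    if not 0.0 < seed < 1.0 or seed in (0.25, 0.5, 0.75):
        # These seeds fall onto fixed points of the map and stop being chaotic.
        raise ValueError("seed must lie in (0, 1) and avoid 0.25, 0.5, 0.75")
    x = seed
    population = []
    for _ in range(pop_size):
        individual = []
        for low, high in bounds:
            x = 4.0 * x * (1.0 - x)                     # next chaotic value in [0, 1]
            individual.append(low + x * (high - low))   # scale into the variable range
        population.append(individual)
    return population


if __name__ == "__main__":
    for row in chaotic_population(5, [(-5.0, 5.0), (0.0, 1.0)]):
        print(row)
```

Unlike pseudo-random initialization, the sequence is fully determined by the seed, so the same seed always reproduces the same initial population.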

16.
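黄英博, 吕永峰, 赵刚, 那靖, 赵军. 《控制与决策》 (Control and Decision), 2022, 37(12): 3197-3206
For the problem of jointly optimizing multiple performance indices of a nonlinear active suspension system, an adaptive optimal control method is proposed. First, by introducing a first-order low-pass filtering operation, an unknown nonlinear dynamics estimator with a simple structure and few tuning parameters is constructed from the system input and output to estimate the unknown nonlinear dynamics online. Second, a performance index function covering ride comfort, suspension travel and input energy consumption is constructed; a single-layer neural network approximates the optimal performance index function online, and a new Hamiltonian function is obtained. To enable an online solution, a new adaptive law based on parameter estimation error information is constructed to update the neural network weights online and compute the optimal control law. Finally, the stability and convergence of the closed-loop system are analyzed theoretically, and comparative simulation results from a co-simulation platform built with the professional software Carsim and Matlab/Simulink verify that the proposed method effectively solves the multi-objective performance optimization control problem of active suspension systems and improves their overall performance.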

17.
Optimal control of batch reactors by iterative dynamic programming   Total citations: 2, self-citations: 0, citations by others: 2
Four batch reactor systems are chosen to examine the viability of using iterative dynamic programming (IDP) for highly nonlinear systems encountered by chemical engineers. The first system is mildly nonlinear, and rapid convergence resulted with the use of only a single state grid point. The use of piecewise linear continuous control with 40 stages yielded better results than the use of 80 stages with piecewise constant control. The need for more than a single grid point for the other three systems led to a systematic study of the effects of the number of grid points, of the number of allowable values for control and of the region contraction factor on the convergence of IDP. In every case the global optimum could be obtained with reasonable computational effort, and no difficulties were encountered even with systems exhibiting several local optima. The use of stages of different length allowed a refined solution to be obtained with a reasonably small number of stages in the last example.

18.
In this paper, an adaptive optimal control strategy is proposed for a class of strict-feedback nonlinear systems with output constraints by using dynamic surface control. The controller design procedure is divided into two parts: the design of the feedforward controller and the design of the optimal controller. To guarantee the satisfaction of output constraints in the feedforward controller, nonlinear mapping is utilized to transform the constrained system into an unconstrained system. A neural-network based adaptive dynamic programming algorithm is employed to approximate the optimal cost function and the optimal control law. By theoretical analysis, all the signals in the closed-loop system are proved to be semi-globally uniformly ultimately bounded and the output constraints are not violated. A numerical example illustrates the effectiveness of the proposed scheme.

19.
Recently, the multiobjective evolutionary algorithm based on decomposition (MOEA/D) has been found to be very effective and efficient for solving complicated multiobjective optimization problems (MOPs). However, the selected differential evolution (DE) strategies and their parameter settings strongly affect the performance of MOEA/D when tackling various kinds of MOPs. Therefore, in this paper, a novel adaptive control strategy is designed for a recently proposed MOEA/D with stable matching model, in which multiple DE strategies coupled with their parameter settings are adaptively applied at different evolutionary stages so that their advantages can be combined to further enhance performance. By exploiting historically successful experience, an execution probability is learned for each DE strategy to perform adaptive adjustment on the candidate solutions. The proposed adaptive strategies for operator selection and parameter settings aim at improving both the convergence speed and the population diversity, which is validated by numerous experiments. When compared with several variants of MOEA/D such as MOEA/D, MOEA/D-DE, MOEA/D-DE+PSO, ENS-MOEA/D, MOEA/D-FRRMAB and MOEA/D-STM, our algorithm performs better on most of the test problems.
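How the execution probabilities are learned is not described in the abstract. One simple stand-in is probability matching, sketched below: each strategy's probability is proportional to its recent success rate, with a floor `p_min` so that no strategy is ever switched off. The function names, the floor value and the counts in the example are hypothetical.

```python
import random


def strategy_probabilities(successes, trials, p_min=0.05):
    """Map per-strategy success rates to execution probabilities.

    Probability matching (an assumed rule, not the paper's): probabilities are
    proportional to success rates, with at least p_min reserved per strategy.
    """
    k = len(successes)
    rates = [s / t if t else 0.0 for s, t in zip(successes, trials)]
    total = sum(rates)
    if total == 0.0:
        return [1.0 / k] * k          # no evidence yet: stay uniform
    return [p_min + (1.0 - k * p_min) * rate / total for rate in rates]


def pick_strategy(probabilities, rng=random):
    """Sample one strategy index according to its execution probability."""
    return rng.choices(range(len(probabilities)), weights=probabilities, k=1)[0]


if __name__ == "__main__":
    probs = strategy_probabilities(successes=[30, 10, 0], trials=[50, 50, 50])
    print(probs, pick_strategy(probs))
```

The floor keeps a currently unsuccessful strategy available, which matters when a strategy that is weak early in the run becomes useful at a later evolutionary stage.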

20.
In this paper, we aim to solve the finite-horizon optimal control problem for a class of discrete-time nonlinear systems with unfixed initial state using the adaptive dynamic programming (ADP) approach. A new ε-optimal control algorithm based on the iterative ADP approach is proposed, which makes the performance index function converge iteratively, in finite time, to the greatest lower bound of all performance indices within an error bound ε. The optimal number of control steps can also be obtained by the proposed ε-optimal control algorithm for the situation where the initial state of the system is unfixed. Neural networks are used to approximate the performance index function and to compute the optimal control policy, respectively, to facilitate the implementation of the ε-optimal control algorithm. Finally, a simulation example is given to show the results of the proposed method.
