Near-optimal Stabilization for a Class of Nonlinear Systems with Control Constraint Based on Single Network Greedy Iterative DHP Algorithm

Cite this article:	Luo Yanhong, Zhang Huaguang, Cao Ning, Chen Bing. Near-optimal Stabilization for a Class of Nonlinear Systems with Control Constraint Based on Single Network Greedy Iterative DHP Algorithm[J]. Acta Automatica Sinica, 2009, 35(11): 1436-1445.

Authors:	Luo Yanhong, Zhang Huaguang, Cao Ning, Chen Bing

Affiliation:	1. Key Laboratory of Integrated Automation for the Process Industry, Ministry of Education, Northeastern University, Shenyang 110004; 2. School of Information Science and Engineering, Northeastern University, Shenyang 110004; 3. Institute of Complexity Science, Qingdao University, Qingdao 266071

Abstract:	A greedy iterative DHP (dual heuristic programming) algorithm is proposed to solve the near-optimal stabilization problem for a class of nonlinear systems with control constraints. To handle the control constraint, a nonquadratic functional is first introduced to transform the constrained problem into an unconstrained one. Then, based on the costate function, the greedy iterative DHP algorithm is used to solve the Hamilton-Jacobi-Bellman (HJB) equation of the system. At each iteration step, a single neural network approximates the costate function, and the optimal control policy is computed directly from the costate function, which eliminates the action network used in conventional approximate dynamic programming (ADP) methods. Finally, two simulation examples demonstrate the validity and feasibility of the proposed optimal control scheme.

Keywords:	Greedy iteration; constraint; nonquadratic functional; optimal control; neural network

Received:	2008-9-10
Revised:	2009-6-12
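
The abstract does not give the form of the nonquadratic functional or of the control law, so the following is only an illustrative sketch of the general idea, not the authors' formulation. It assumes a scalar input bounded by a hypothetical limit `u_bar`, a discrete-time affine system with input gain `g`, and the inverse-hyperbolic-tangent penalty that is common in the constrained-control literature, W(u) = 2 r u_bar ∫_0^u atanh(s / u_bar) ds. Under those assumptions, setting the derivative of W(u) plus the costate term to zero yields a control that is bounded by construction, so no separate action network is needed. All function and parameter names are hypothetical.

```python
import math


def nonquadratic_cost(u, u_bar, r=1.0):
    """Assumed penalty W(u) = 2*r*u_bar * integral_0^u atanh(s/u_bar) ds.

    Closed form for the scalar case; valid only for |u| < u_bar.
    For small u it behaves like the usual quadratic cost r*u**2.
    """
    z = u / u_bar
    return 2.0 * r * u_bar * (u * math.atanh(z) + 0.5 * u_bar * math.log(1.0 - z * z))


def control_from_costate(costate_next, g, u_bar, r=1.0):
    """Control obtained directly from the costate (assumed scalar case).

    Solves dW/du + g * costate_next = 0, i.e.
    2*r*u_bar*atanh(u/u_bar) = -g*costate_next.
    The tanh keeps the result inside [-u_bar, u_bar].
    """
    return u_bar * math.tanh(-g * costate_next / (2.0 * r * u_bar))


# Small demonstration: as the costate grows, the control approaches the bound.
for lam in (0.1, 1.0, 5.0):
    u = control_from_costate(lam, g=1.0, u_bar=0.5)
    print(lam, u, nonquadratic_cost(u, u_bar=0.5))
```

In a DHP-style iteration, `costate_next` would come from the neural network's costate estimate at the next state. The cost function is only evaluated strictly inside the bound, since `math.atanh` is undefined at ±1 and `math.tanh` rounds to exactly ±1.0 for large arguments.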
