1.

Bounded robust control of nonlinear systems using neural network–based HJB solution

Dipak M. Adhyaru I. N. Kar M. Gopal 《Neural computing & applications》2011,20(1):91-103

In this paper, a Hamilton–Jacobi–Bellman (HJB) equation–based optimal control algorithm for robust controller design is proposed for nonlinear systems. The HJB equation is formulated using a suitable nonquadratic term in the performance functional to handle constraints on the control input. Using Lyapunov's direct method, the controller is shown to be optimal with respect to a cost functional that includes a penalty on the control effort and the maximum bound on system uncertainty. The bounded controller requires knowledge of the upper bound of the system uncertainty. In the proposed algorithm, a neural network is used to approximate the solution of the HJB equation using the least squares method. The proposed algorithm has been applied to nonlinear systems with matched and unmatched system uncertainties and with uncertainties in the input matrix. Theoretical and simulation results are presented to validate the proposed algorithm.
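
As an illustration of how a nonquadratic term can encode an input bound, the following is a minimal sketch of a form commonly used in the constrained-input HJB literature. It is not taken from the abstract: the scalar input u, the bound λ, the weight r, the state penalty Q(x), and the dynamics ẋ = f(x) + g(x)u are all assumptions made for the example, and the uncertainty-bound term mentioned in the abstract is omitted.

```latex
% Assumed setup (hypothetical, for illustration only): scalar input u with |u| <= \lambda,
% weight r > 0, state penalty Q(x) >= 0, dynamics \dot{x} = f(x) + g(x) u.
\begin{align}
  W(u) &= 2 r \int_{0}^{u} \lambda \tanh^{-1}\!\left(\frac{v}{\lambda}\right) dv, \\
  0 &= \min_{|u| < \lambda} \left[ Q(x) + W(u) + \nabla V(x)^{\top} \bigl( f(x) + g(x) u \bigr) \right], \\
  u^{*}(x) &= -\lambda \tanh\!\left( \frac{1}{2 \lambda r}\, g(x)^{\top} \nabla V(x) \right).
\end{align}
```

The third line follows from setting the derivative of the bracketed term with respect to u to zero, which gives 2rλ tanh⁻¹(u/λ) + g(x)ᵀ∇V(x) = 0. Because tanh is bounded by 1 in magnitude, the resulting control never exceeds λ.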

2.

State observer design for nonlinear systems using neural network

Dipak M. Adhyaru 《Applied Soft Computing》2012,12(8):2530-2537

In this paper, an observer design is proposed for nonlinear systems. A Hamilton–Jacobi–Bellman (HJB) equation–based formulation is developed. The HJB equation is formulated using a suitable non-quadratic term in the performance functional to handle magnitude constraints on the observer gain. Using Lyapunov's direct method, the observer is proved to be optimal with respect to a meaningful cost. In the present algorithm, a neural network (NN) is used to approximate the value function and thereby find an approximate solution of the HJB equation using the least squares method. With the time-varying HJB solution, a dynamic optimal observer is proposed for the nonlinear system. The proposed algorithm has been applied to nonlinear systems with finite and infinite time horizons. Theoretical and simulation results are presented to validate the proposed algorithm.

3.

Fixed-Final-Time-Constrained Optimal Control of Nonlinear Systems Using Neural Network HJB Approach (Total citations: 2; self-citations: 0; citations by others: 2)

Tao Cheng Lewis F.L. Abu-Khalaf M. 《Neural Networks, IEEE Transactions on》2007,18(6):1725-1737

In this paper, fixed-final time-constrained optimal control laws using neural networks (NNs) to solve Hamilton-Jacobi-Bellman (HJB) equations for general affine-in-the-input constrained nonlinear systems are proposed. An NN is used to approximate the time-varying cost function using the method of least squares on a predefined region. The result is an NN nearly -constrained feedback controller that has time-varying coefficients found by a priori offline tuning. Convergence results are shown. The results of this paper are demonstrated in two examples, including a nonholonomic system.

4.

A neural network solution for fixed-final time optimal control of nonlinear systems

Tao Cheng Frank L. Lewis 《Automatica》2007,43(3):482-490

In this paper, fixed-final time optimal control laws using neural networks and HJB equations for general affine-in-the-input nonlinear systems are proposed. The method uses Kronecker matrix methods along with neural network approximation over a compact set to solve a time-varying HJB equation. The result is a neural network feedback controller that has time-varying coefficients found by a priori offline tuning. Convergence results are shown. The results of this paper are demonstrated on an example.

5.

Neural-based online finite-time optimal tracking control for wheeled mobile robotic system with inequality constraints

Liang Ding Miao Zheng Shu Li Huaiguang Yang Haibo Gao Zongquan Deng 《Asian journal of control》2024,26(1):297-311

In this study, a finite-time online optimal controller was designed for a nonlinear wheeled mobile robotic system (WMRS) with inequality constraints, based on reinforcement learning (RL) neural networks. In addition, an extended cost function, obtained by introducing a penalty function to the original long-time cost function, was proposed to deal with the optimal control problem of the system with inequality constraints. A novel Hamilton-Jacobi-Bellman (HJB) equation containing the constraint conditions was defined to determine the optimal control input. Furthermore, two neural networks (NNs), a critic and an actor NN, were established to approximate the extended cost function and the optimal control input, respectively. The adaptation laws of the critic and actor NN were obtained with the gradient descent method. The semi-global practical finite-time stability (SGPFS) was proved using Lyapunov's stability theory. The tracking error converges to a small region near zero within the constraints in a finite period. Finally, the effectiveness of the proposed optimal controller was verified by a simulation based on a practical wheeled mobile robot model.

6.

Optimal Synchronization Control of Heterogeneous Asymmetric Input-Constrained Unknown Nonlinear MASs via Reinforcement Learning

Lina Xia Qing Li Ruizhuo Song Hamidreza Modares 《IEEE/CAA Journal of Automatica Sinica》2022,9(3):520-532

The asymmetric input-constrained optimal synchronization problem of heterogeneous unknown nonlinear multiagent systems (MASs) is considered in the paper. Intuitively, a state-space transformation is performed such that satisfaction of symmetric input constraints for the transformed system guarantees satisfaction of asymmetric input constraints for the original system. Then, considering that the leader’s information is not available to every follower, a novel distributed observer is designed to estimate the leader’s state using only exchange of information among neighboring followers. After that, a network of augmented systems is constructed by combining observers and followers dynamics. A nonquadratic cost function is then leveraged for each augmented system (agent) for which its optimization satisfies input constraints and its corresponding constrained Hamilton-Jacobi-Bellman (HJB) equation is solved in a data-based fashion. More specifically, a data-based off-policy reinforcement learning (RL) algorithm is presented to learn the solution to the constrained HJB equation without requiring the complete knowledge of the agents’ dynamics. Convergence of the improved RL algorithm to the solution to the constrained HJB equation is also demonstrated. Finally, the correctness and validity of the theoretical results are demonstrated by a simulation example.

7.

Generalized Hamilton–Jacobi–Bellman Formulation-Based Neural Network Control of Affine Nonlinear Discrete-Time Systems

Zheng Chen Jagannathan S. 《Neural Networks, IEEE Transactions on》2008,19(1):90-106

In this paper, we consider the use of nonlinear networks towards obtaining nearly optimal solutions to the control of nonlinear discrete-time (DT) systems. The method is based on least squares successive approximation solution of the generalized Hamilton-Jacobi-Bellman (GHJB) equation which appears in optimization problems. Successive approximation using the GHJB has not been applied for nonlinear DT systems. The proposed recursive method solves the GHJB equation in DT on a well-defined region of attraction. The definition of GHJB, pre-Hamiltonian function, HJB equation, and method of updating the control function for the affine nonlinear DT systems under small perturbation assumption are proposed. A neural network (NN) is used to approximate the GHJB solution. It is shown that the result is a closed-loop control based on an NN that has been tuned a priori in offline mode. Numerical examples show that, for the linear DT system, the updated control laws will converge to the optimal control, and for nonlinear DT systems, the updated control laws will converge to the suboptimal control.
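
The successive-approximation idea (evaluate the current control law by least squares, then update the control law) can be sketched on the simplest possible case. The example below is an assumption-laden toy, not the paper's GHJB algorithm: it uses a scalar linear DT system x[k+1] = a·x[k] + b·u[k], a single quadratic basis function in place of a neural network, and the exact linear-quadratic policy update rather than the small-perturbation update described in the abstract. The function name `dt_policy_iteration` and all parameter values are hypothetical.

```python
import numpy as np

def dt_policy_iteration(a, b, q, r, k0, n_iter=20, n_samples=50):
    """Successive approximation for x[k+1] = a*x[k] + b*u[k] with V(x) = w*x**2.

    Stage cost is q*x**2 + r*u**2; k0 must be a stabilizing gain (|a - b*k0| < 1).
    """
    xs = np.linspace(-1.0, 1.0, n_samples)   # sample points on the region of interest
    phi = xs**2                              # single quadratic basis function
    k = k0                                   # current policy is u = -k*x
    w = 0.0
    for _ in range(n_iter):
        a_cl = a - b * k                     # closed-loop dynamics under the current policy
        # Policy evaluation: w*phi(x) - w*phi(a_cl*x) = (q + r*k**2)*x**2,
        # solved for w by least squares over the sample points.
        A = (phi - (a_cl * xs) ** 2).reshape(-1, 1)
        y = (q + r * k**2) * xs**2
        w = np.linalg.lstsq(A, y, rcond=None)[0][0]
        # Policy improvement: minimize r*u**2 + w*(a*x + b*u)**2 over u.
        k = a * b * w / (r + b**2 * w)
    return w, k

w, k = dt_policy_iteration(a=0.9, b=1.0, q=1.0, r=1.0, k0=0.0)
```

For these assumed values the fixed point satisfies the scalar discrete-time Riccati equation, which here reduces to w² − 0.81·w − 1 = 0. This is consistent with the abstract's remark that the updated control laws converge to the optimal control for linear DT systems.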

9.

Asymptotic optimal control of uncertain nonlinear Euler–Lagrange systems

Keith Dupree Parag M. Patre Zachary D. Wilcox Warren E. Dixon 《Automatica》2011,(1):99-107

A sufficient condition to solve an optimal control problem is to solve the Hamilton–Jacobi–Bellman (HJB) equation. However, finding a value function that satisfies the HJB equation for a nonlinear system is challenging. For an optimal control problem when a cost function is provided a priori, previous efforts have used feedback linearization methods, which assume exact model knowledge, or have developed neural network (NN) approximations of the HJB value function. The result in this paper uses the implicit learning capabilities of the RISE control structure to learn the dynamics asymptotically. Specifically, a Lyapunov stability analysis is performed to show that the RISE feedback term asymptotically identifies the unknown dynamics, yielding semi-global asymptotic tracking. In addition, it is shown that the system converges to a state space system that has a quadratic performance index which has been optimized by an additional control element. An extension is included to illustrate how an NN can be combined with the previous results. Experimental results are given to demonstrate the proposed controllers.

10.

Online optimal control of nonlinear discrete-time systems using approximate dynamic programming (Total citations: 1; self-citations: 0; citations by others: 1)

Travis DIERKS Sarangapani JAGANNATHAN 《Journal of Control Theory and Applications》2011,9(3):361-369

In this paper, the optimal control of a class of general affine nonlinear discrete-time (DT) systems is undertaken by solving the Hamilton-Jacobi-Bellman (HJB) equation online and forward in time. The proposed approach, normally referred to as adaptive or approximate dynamic programming (ADP), uses online approximators (OLAs) to solve the infinite horizon optimal regulation and tracking control problems for affine nonlinear DT systems in the presence of unknown internal dynamics. Both the regulation and tracking contro...

11.

A neural network model predictive controller (Total citations: 2; self-citations: 0; citations by others: 2)

Bernt M. Åkesson Hannu T. Toivonen 《Journal of Process Control》2006,16(9):937-946

A neural network controller is applied to the optimal model predictive control of constrained nonlinear systems. The control law is represented by a neural network function approximator, which is trained to minimize a control-relevant cost function. The proposed procedure can be applied to construct controllers with arbitrary structures, such as optimal reduced-order controllers and decentralized controllers.

12.

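Near-optimal stabilization of a class of nonlinear systems with control constraints based on a single-network greedy iterative DHP algorithm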

Yanhong Luo Huaguang Zhang Ning Cao Bing Chen 《Acta Automatica Sinica》2009,35(11):1436-1445

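A greedy iterative DHP (dual heuristic programming) algorithm is proposed to solve the near-optimal stabilization problem for a class of nonlinear systems with control constraints. To handle the control constraints, a nonquadratic functional is first introduced to convert the constrained problem into an unconstrained one. A greedy iterative DHP algorithm based on the costate function is then proposed to solve the HJB (Hamilton-Jacobi-Bellman) equation of the system. At each iteration step, a neural network is used to approximate the costate function, and the optimal control policy is then computed directly from the costate function, which eliminates the control network used in conventional approximate dynamic programming methods. Finally, two simulation examples demonstrate the effectiveness and feasibility of the proposed optimal control scheme.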

13.

Uplink power adjustment in wireless communication systems: a stochastic control analysis

Minyi Huang Caines P.E. Malhame R.P. 《Automatic Control, IEEE Transactions on》2004,49(10):1693-1708

This paper considers mobile to base station power control for lognormal fading channels in wireless communication systems within a centralized information stochastic optimal control framework. Under a bounded power rate of change constraint, the stochastic control problem and its associated Hamilton-Jacobi-Bellman (HJB) equation are analyzed by the viscosity solution method; then the degenerate HJB equation is perturbed to admit a classical solution and a suboptimal control law is designed based on the perturbed HJB equation. When a quadratic type cost is used without a bound constraint on the control, the value function is a classical solution to the degenerate HJB equation and the feedback control is affine in the system power. In addition, in this case we develop approximate, but highly scalable, solutions to the HJB equation in terms of a local polynomial expansion of the exact solution. When the channel parameters are not known a priori, one can obtain on-line estimates of the parameters and get adaptive versions of the control laws. In numerical experiments with both of the above cost functions, the following phenomenon is observed: whenever the users have different initial conditions, there is an initial convergence of the power levels to a common level and then subsequent approximately equal behavior which converges toward a stochastically varying optimum.

14.

Online Adaptive Approximate Optimal Tracking Control with Simplified Dual Approximation Structure for Continuous-time Unknown Nonlinear Systems

Jing Na Guido Herrmann 《IEEE/CAA Journal of Automatica Sinica》2014,1(4):412-422

This paper proposes an online adaptive approximate solution for the infinite-horizon optimal tracking control problem of continuous-time nonlinear systems with unknown dynamics. The requirement of the complete knowledge of system dynamics is avoided by employing an adaptive identifier in conjunction with a novel adaptive law, such that the estimated identifier weights converge to a small neighborhood of their ideal values. An adaptive steady-state controller is developed to maintain the desired tracking performance at the steady-state, and an adaptive optimal controller is designed to stabilize the tracking error dynamics in an optimal manner. For this purpose, a critic neural network (NN) is utilized to approximate the optimal value function of the Hamilton-Jacobi-Bellman (HJB) equation, which is used in the construction of the optimal controller. The learning of two NNs, i.e., the identifier NN and the critic NN, is continuous and simultaneous by means of a novel adaptive law design methodology based on the parameter estimation error. Stability of the whole system consisting of the identifier NN, the critic NN and the optimal tracking control is guaranteed using Lyapunov theory; convergence to a near-optimal control law is proved. Simulation results exemplify the effectiveness of the proposed method.

15.

Neural network solution for finite-horizon H-infinity constrained optimal control of nonlinear systems

Tao CHENG Frank L. LEWIS 《Journal of Control Theory and Applications》2007,5(1):1-11

In this paper, neural networks are used to approximately solve the finite-horizon constrained input H-infinity state feedback control problem. The method is based on solving a related Hamilton-Jacobi-Isaacs equation of the corresponding finite-horizon zero-sum game. The game value function is approximated by a neural network with time-varying weights. It is shown that the neural network approximation converges uniformly to the game-value function and the resulting almost optimal constrained feedback controller provides closed-loop stability and bounded L2 gain. The result is an almost optimal H-infinity feedback controller with time-varying coefficients that is solved a priori off-line. The effectiveness of the method is shown on the Rotational/Translational Actuator benchmark nonlinear control problem.

16.

Policy Iterations on the Hamilton–Jacobi–Isaacs Equation for H_∞ State Feedback Control With Input Saturation

Abu-Khalaf M. Lewis F. L. Huang J. 《Automatic Control, IEEE Transactions on》2006,51(12):1989-1995

An H_∞ suboptimal state feedback controller for constrained input systems is derived using the Hamilton-Jacobi-Isaacs (HJI) equation of a corresponding zero-sum game that uses a special quasi-norm to encode the constraints on the input. The unique saddle point in feedback strategy form is derived. Using policy iterations on both players, the HJI equation is broken into a sequence of differential equations linear in the cost for which closed-form solutions are easier to obtain. Policy iterations on the disturbance are shown to converge to the available storage function of the associated L₂-gain dissipative dynamics. The resulting constrained optimal control feedback strategy has the largest domain of validity within which L₂-performance for a given gamma is guaranteed.

17.

A model-based deep reinforcement learning method applied to finite-horizon optimal control of nonlinear control-affine system

《Journal of Process Control》2020

The Hamilton–Jacobi–Bellman (HJB) equation can be solved to obtain optimal closed-loop control policies for general nonlinear systems. As it is seldom possible to solve the HJB equation exactly for nonlinear systems, either analytically or numerically, methods to build approximate solutions through simulation-based learning have been studied under various names such as neurodynamic programming (NDP) and approximate dynamic programming (ADP). The aspect of learning connects these methods to reinforcement learning (RL), which also tries to learn optimal decision policies through trial-and-error based learning. This study develops a model-based RL method, which iteratively learns the solution to the HJB and its associated equations. We focus particularly on the control-affine system with a quadratic objective function and the finite horizon optimal control (FHOC) problem with time-varying reference trajectories. The HJB solutions for such systems involve time-varying value, costate, and policy functions subject to boundary conditions. To represent the time-varying HJB solution in high-dimensional state space in a general and efficient way, deep neural networks (DNNs) are employed. It is shown that the use of DNNs, compared to shallow neural networks (SNNs), can significantly improve the performance of a learned policy in the presence of uncertain initial state and state noise. Examples involving a batch chemical reactor and a one-dimensional diffusion-convection-reaction system are used to demonstrate this and other key aspects of the method.

18.

Optimal adaptive output tracking control for uncertain nonlinear systems with actuator faults

Shaojie Zhang Xue Wu Chunsheng Liu 《Acta Automatica Sinica》2018,44(12):2188-2197

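For a class of multi-input multi-output (MIMO) uncertain continuous-time affine nonlinear systems with actuator faults, an optimal adaptive output tracking control scheme is proposed. A weight-tuning algorithm is designed for the neural network that estimates the uncertain terms, guaranteeing system stability, and a single critic network suffices to obtain both the infinite-horizon cost function and the optimal control input satisfying the Hamilton-Jacobi-Bellman (HJB) equation. Considering actuator stuck faults and partial loss-of-effectiveness faults, an optimal adaptive compensation control law is designed that achieves uniformly ultimately bounded tracking of the reference output. Flight control simulations and comparative verification demonstrate the effectiveness and superiority of the proposed method.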

19.

HJB equation-based minimax controller design for wireless sensor network systems

Yuanbo Shi Jianhui Wang Xiaoke Fang Yueyang Huang Shusheng Gu 《Control and Decision》2021,36(4):947-952

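To address the stability of wireless sensor network systems subject to large external disturbances in industrial environments, a method combining the Hamilton-Jacobi-Bellman (HJB) equation with minimax control is proposed. First, for the bounded network delay and bounded consecutive packet dropouts that arise in wireless sensor networks under complex operating conditions, a model of the wireless sensor network system with delay and packet dropouts is given. Then, under a minimax performance index, the HJB equation is used to design the minimax optimal controller of the system; a test function is further used to obtain an expression for the maximum disturbance, from which sufficient conditions for system stability are derived. Finally, numerical examples and simulations verify the feasibility and effectiveness of the proposed method when the system is subject to sudden large disturbances.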

20.

Least squares solutions of the HJB equation with neural network value-function approximators (Total citations: 1; self-citations: 0; citations by others: 1)

Yuval Tassa Tom Erez 《Neural Networks, IEEE Transactions on》2007,18(4):1031-1041

In this paper, we present an empirical study of iterative least squares minimization of the Hamilton-Jacobi-Bellman (HJB) residual with a neural network (NN) approximation of the value function. Although the nonlinearities in the optimal control problem and NN approximator preclude theoretical guarantees and raise concerns of numerical instabilities, we present two simple methods for promoting convergence, the effectiveness of which is presented in a series of experiments. The first method involves the gradual increase of the horizon time scale, with a corresponding gradual increase in value function complexity. The second method involves the assumption of stochastic dynamics which introduces a regularizing second derivative term to the HJB equation. A gradual reduction of this term provides further stabilization of the convergence. We demonstrate the solution of several problems, including the 4-D inverted-pendulum system with bounded control. Our approach requires no initial stabilizing policy or any restrictive assumptions on the plant or cost function, only knowledge of the plant dynamics. In the Appendix, we provide the equations for first- and second-order differential backpropagation.
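
To make "iterative least squares minimization of the HJB residual" concrete, here is a minimal sketch on a toy problem. Everything in it is an assumption for illustration rather than the authors' setup: a scalar continuous-time linear system ẋ = a·x + b·u with cost q·x² + r·u², a one-parameter value function V(x) = w·x² standing in for the neural network, and plain Gauss-Newton steps on the sampled residual. The function name `hjb_residual_gauss_newton` is hypothetical.

```python
import numpy as np

def hjb_residual_gauss_newton(a, b, q, r, w0=0.0, n_iter=20, n_samples=41):
    """Fit V(x) = w*x**2 by Gauss-Newton minimization of the squared HJB residual."""
    xs = np.linspace(-1.0, 1.0, n_samples)   # collocation points in the state space
    w = w0
    for _ in range(n_iter):
        # HJB residual with the minimizing control u = -(b*w/r)*x substituted in:
        # q*x^2 + 2*a*w*x^2 - (b^2/r)*w^2*x^2
        res = (q + 2.0 * a * w - (b**2 / r) * w**2) * xs**2
        # Derivative of the residual with respect to the parameter w
        jac = (2.0 * a - 2.0 * (b**2 / r) * w) * xs**2
        # Gauss-Newton step: linearized least-squares solution for one parameter
        w -= (jac @ res) / (jac @ jac)
    return w

w_hat = hjb_residual_gauss_newton(a=-1.0, b=1.0, q=1.0, r=1.0)
```

With these assumed values the residual vanishes when 1 − 2w − w² = 0, and starting from w0 = 0 the iteration settles on the positive root √2 − 1. The residual also has a negative root that does not correspond to a valid value function, which is a small-scale instance of the convergence concerns the abstract raises.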