Similar articles
 18 similar articles found (search time: 62 ms)
1.
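This paper considers the distributed tracking control problem for conformable multi-agent systems. $P$-type and $PD^{\alpha}$-type iterative learning control strategies with an initial learning mechanism are designed to solve the tracking problem. The conformable derivative has good properties and can describe practical data sampling with different step sizes. The initial learning mechanism relaxes the initial-value condition and improves the consensus tracking performance of the algorithm. Under the assumptions of a repeatable operating environment and a directed communication topology, a distributed iterative learning scheme is proposed that achieves finite-time consensus by repeating control attempts on the same trajectory and correcting unsatisfactory control signals with the tracking error. It is rigorously proved that, as the number of iterations increases, the proposed $P$-type and $PD^{\alpha}$-type iterative learning control strategies enable all agents to asymptotically track the reference trajectory. Two representative numerical simulations verify the effectiveness of the algorithm.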
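The P-type update mentioned in the abstract above can be illustrated with a minimal sketch. It is a strong simplification and not the paper's scheme: it assumes a single agent, an ordinary integer-order scalar plant x(t+1) = a·x(t) + b·u(t), and an identical initial state on every trial, so it models neither the conformable derivative, the $PD^{\alpha}$ term, the initial learning mechanism, nor the communication topology. The plant parameters, the learning gain `gamma`, and the sine reference are all hypothetical.

```python
import numpy as np

def p_type_ilc(a=0.5, b=1.0, gamma=0.8, horizon=20, iterations=60):
    """P-type ILC on the scalar plant x(t+1) = a*x(t) + b*u(t), y(t) = x(t).

    Returns the maximum absolute tracking error recorded on each trial.
    """
    t = np.arange(horizon + 1)
    y_ref = np.sin(2 * np.pi * t / horizon)  # hypothetical reference trajectory
    u = np.zeros(horizon)                    # control signal used on trial 0
    max_errors = []
    for _ in range(iterations):
        # Repeat the same task from the same initial state (repeatable environment).
        x = np.zeros(horizon + 1)
        x[0] = y_ref[0]
        for k in range(horizon):
            x[k + 1] = a * x[k] + b * u[k]
        e = y_ref - x                        # tracking error of this trial
        max_errors.append(float(np.max(np.abs(e))))
        # P-type law: u_{j+1}(t) = u_j(t) + gamma * e_j(t+1).
        u = u + gamma * e[1:]
    return max_errors

errors = p_type_ilc()
print(errors[0], errors[-1])
```

For this plant the usual contraction condition |1 − gamma·b| < 1 holds (here it equals 0.2), which is why the error shrinks from trial to trial in the sketch.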

2.
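Li Hua (李华), Tian Guohua (田国华). Measurement & Control Technology (《测控技术》), 2015, 34(6): 74-76
A model reference adaptive control system is designed based on a double-weight neural network to achieve adaptive control of weld seam tracking. The convergence and stability conditions of the double-weight neural network controller under steepest-descent training are analyzed to ensure that the system reaches stable operation. Simulation experiments show that, because both the direction weights and the core weights are adjusted, the weld seam tracking control system based on the double-weight neural network performs far better than a BP network control system and achieves better real-time weld seam tracking control.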

3.
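Partially observable Markov decision processes (POMDPs) are solved by introducing a belief state space that converts a non-Markovian problem into a Markovian one. Their ability to describe the real world makes them an important branch of research on stochastic decision processes. This paper introduces the basic principles and decision procedure of POMDPs and proposes a POMDP decision algorithm based on policy iteration and value iteration. Using ideas from linear programming and dynamic programming, the algorithm addresses the "curse of dimensionality" that arises when the belief state space is large and obtains an approximately optimal solution of the Markov decision problem. Experimental data show that the algorithm is feasible and effective.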
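The belief state that the abstract above relies on is maintained with the standard Bayes update b'(s') ∝ O(o | s', a) · Σ_s T(s' | s, a) · b(s). A minimal sketch of that update follows. The array layouts, the function name `belief_update`, and the two-state example numbers are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def belief_update(b, T, O, a, o):
    """Bayes update of a POMDP belief state.

    b: shape (S,), current belief over states.
    T: shape (A, S, S), T[a, s, s2] = P(s2 | s, a).
    O: shape (A, S, Z), O[a, s2, o] = P(o | s2, a).
    """
    predicted = b @ T[a]                    # sum_s b(s) * T(s2 | s, a)
    unnormalized = O[a][:, o] * predicted   # weight by observation likelihood
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return unnormalized / total

# Hypothetical two-state example: one "listen" action that leaves the state
# unchanged and reports the true state with probability 0.85.
T = np.array([[[1.0, 0.0], [0.0, 1.0]]])
O = np.array([[[0.85, 0.15], [0.15, 0.85]]])
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, T, O, a=0, o=0)
print(b1)
```

Because the updated belief depends only on the previous belief, the action, and the observation, the process over beliefs is Markovian even though the process over observations is not.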

4.
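Xiao Huimin (肖会敏), Meng Xin (孟欣). Control and Decision (《控制与决策》), 2019, 34(10): 2157-2163
For a class of uncertain singular time-delay systems with external disturbances, an integral-type sliding surface function is first designed, and a sufficient criterion for robust asymptotic stability of the sliding mode equation is given based on Lyapunov stability theory combined with linear matrix inequality techniques. A new adaptive sliding mode controller is then designed so that the state of the closed-loop system reaches the sliding surface in finite time and subsequently slides along it. Finally, a numerical simulation example verifies that the proposed method is effective and feasible.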

5.
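Yu Danning (于丹宁), Ni Kun (倪坤), Liu Yunlong (刘云龙). Computer Engineering (《计算机工程》), 2021, 47(2): 90-94, 102
QMDP-net, a value iteration algorithm for partially observable Markov decision processes (POMDPs) based on convolutional neural networks, performs well without prior knowledge, but it suffers from optimization difficulties such as unstable training and parameter sensitivity. This paper proposes RQMDP-net, a POMDP value iteration algorithm based on recurrent convolutional neural networks, which uses a gated recurrent unit network to implement the value iteration update and, while preserving the convolutional properties of the input and recurrent weight matrices, strengthens the network's temporal...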

6.
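Design of optimal filters for singular systems using singular value decomposition   Total citations: 6, self-citations: 2, citations by others: 4
This paper discusses the state estimation problem for singular discrete-time stochastic linear systems. Using the singular value decomposition of matrices, it gives a practical method for designing a reduced-order optimal filter. The method also yields estimates of the disturbance noise of both the dynamic system and the measurement system.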

7.
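An event-based iterative adaptive critic algorithm is designed to solve the zero-sum game optimal tracking control problem for a class of nonaffine systems. The steady control of the reference trajectory is obtained by a numerical solution method, and the zero-sum game optimal tracking control problem of the unknown nonlinear system is thereby transformed into an optimal regulation problem for the error system. To improve resource utilization while maintaining good control performance of the closed-loop system, a suitable event-triggering condition is introduced to obtain a tracking policy pair that is updated only at triggering instants. Based on the designed triggering condition, the asymptotic stability of the error system is then proved by the Lyapunov method. Four neural networks are constructed to facilitate implementation of the proposed algorithm. To improve the accuracy of the steady control corresponding to the target trajectory, the model network directly approximates the unknown system function rather than the error dynamics. The critic network, the action network, and the disturbance network are constructed to approximate the iterative cost function and the iterative tracking policy pair. Finally, two simulation examples verify the feasibility and effectiveness of the control method.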

8.
9.
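Using the idea of data-driven control, an iterative neural dynamic programming method is established for designing near-optimal regulators of discrete-time nonlinear systems. An iterative adaptive dynamic programming algorithm for general discrete-time nonlinear systems is proposed, and its convergence and optimality are proved. By constructing three neural networks, the globalized dual heuristic programming technique and its detailed implementation are given, in which the action network is trained under the framework of neural dynamic programming. This novel structure can approximate the cost function and its derivative while adaptively learning the near-optimal control law without relying on the system dynamics. Notably, by reducing the requirement for the control matrix or its neural network representation, this clearly improves on existing results for iterative adaptive dynamic programming algorithms and can promote data-based optimization and control design for complex nonlinear systems. Two simulation experiments verify the effectiveness of the proposed data-driven optimal regulation method.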

10.
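For a class of discrete-time nonlinear systems with unknown dynamics and actuator saturation, a new optimal tracking control scheme is proposed. The scheme is based on an iterative adaptive dynamic programming algorithm. To achieve optimal control, a data-based identifier of the unknown system dynamics is first established. By introducing an M network, an accurate expression for the steady-state control is obtained. To eliminate the effect of actuator saturation, a nonquadratic performance index function is proposed. An iterative adaptive dynamic programming algorithm is then developed to obtain the solution of the optimal tracking control problem, and a convergence analysis is given. To implement the optimal control scheme, neural networks are used to build the data-based identifier, compute the performance index function, approximate the optimal control policy, and solve for the steady-state control. Simulation results verify the effectiveness of the proposed optimal tracking control method.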

11.
The core task of tracking control is to make the controlled plant track a desired trajectory. The traditional performance index used in previous studies cannot completely eliminate the tracking error as the number of time steps increases. In this paper, a new cost function is introduced to develop the value-iteration-based adaptive critic framework to solve the tracking control problem. Unlike the regulator problem, the iterative value function of tracking control problem cannot be regarded as a ...

12.
This paper addresses infinite-horizon optimal control problems for discrete-time time-varying nonlinear systems and develops a new iterative adaptive dynamic programming algorithm, the discrete-time time-varying policy iteration (DTTV) algorithm. The iterative control law is designed to update the iterative value function, which approximates the optimal performance index function. The admissibility of the iterative control law is analyzed. The results show that the iterative value function converges non-increasingly to the optimal solution of the Bellman equation. To implement the algorithm, neural networks are employed and a new implementation structure is established, which avoids solving the generalized Bellman equation in each iteration. Finally, the optimal control laws for torsional pendulum and inverted pendulum systems are obtained by using the DTTV policy iteration algorithm, where the mass and pendulum bar length are permitted to be time-varying parameters. The effectiveness of the developed method is illustrated by numerical results and comparisons.

13.
Based on adaptive dynamic programming (ADP), the fixed-point tracking control problem is solved by a value iteration (VI) algorithm. First, a class of discrete-time (DT) nonlinear systems with disturbance is considered. Second, the convergence of the VI algorithm is established. It is proven that the iterative cost function converges precisely to the optimal value, and that the control input and disturbance input also converge to their optimal values. Third, a novel analysis pertaining to the range of the discount factor is presented, where the cost function serves as a Lyapunov function. Finally, neural networks (NNs) are employed to approximate the cost function, the control law, and the disturbance law. Simulation examples are given to illustrate the effective performance of the proposed method.

14.
This paper studies the problem of optimal parallel tracking control for continuous-time general nonlinear systems. Unlike existing optimal state feedback control, the control input of the optimal parallel control is introduced into the feedback system. However, because the control input is introduced into the feedback system, the optimal state feedback control methods cannot be applied directly. To address this problem, an augmented system and an augmented performance index function are first proposed. Thus, the general nonlinear system is transformed into an affine nonlinear system. The difference between the optimal parallel control and the optimal state feedback control is analyzed theoretically. It is proven that the optimal parallel control with the augmented performance index function can be seen as the suboptimal state feedback control with the traditional performance index function. Moreover, an adaptive dynamic programming (ADP) technique is utilized to implement the optimal parallel tracking control using a critic neural network (NN) to approximate the value function online. The stability analysis of the closed-loop system is performed using the Lyapunov theory, and the tracking error and NN weight errors are shown to be uniformly ultimately bounded (UUB). Also, the optimal parallel controller guarantees the continuity of the control input when there are finite jump discontinuities in the reference signals. Finally, the effectiveness of the developed optimal parallel control method is verified in two cases.

15.
This paper is concerned with a novel integrated multi-step heuristic dynamic programming (MsHDP) algorithm for solving optimal control problems. It is shown that, initialized by the zero cost function, MsHDP can converge to the optimal solution of the Hamilton-Jacobi-Bellman (HJB) equation. Then, the stability of the system is analyzed using control policies generated by MsHDP. Also, a general stability criterion is designed to determine the admissibility of the current control policy. That is, the criterion is applicable not only to traditional value iteration and policy iteration but also to MsHDP. Further, based on the convergence and the stability criterion, the integrated MsHDP algorithm using immature control policies is developed to greatly improve learning efficiency. In addition, an actor-critic structure is utilized to implement the integrated MsHDP scheme, where neural networks are used as the parametric architecture to evaluate and improve the iterative policy. Finally, two simulation examples are given to demonstrate that the learning effectiveness of the integrated MsHDP scheme surpasses those of other fixed or integrated methods.
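The abstract above refers to traditional value iteration started from the zero cost function. As a minimal sketch of that baseline only (one-step value iteration, not MsHDP), consider a scalar linear-quadratic problem in which the iterative value function stays quadratic, V_i(x) = p_i·x², so each iteration reduces to a scalar Riccati recursion. The plant x(t+1) = a·x(t) + b·u(t), the stage cost q·x² + r·u², and all numeric values are hypothetical choices for illustration.

```python
def value_iteration_lqr(a=1.2, b=1.0, q=1.0, r=1.0, tol=1e-12, max_iter=1000):
    """One-step value iteration for x(t+1) = a*x + b*u with cost q*x^2 + r*u^2.

    Starts from the zero cost function V_0(x) = 0 and returns the converged
    coefficient p of V(x) = p*x^2 together with the number of iterations used.
    """
    p = 0.0  # V_0(x) = 0
    for i in range(max_iter):
        # V_{i+1}(x) = min_u { q*x^2 + r*u^2 + V_i(a*x + b*u) }, solved in closed form.
        p_next = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
        if abs(p_next - p) < tol:
            return p_next, i + 1
        p = p_next
    return p, max_iter

p_star, n_iter = value_iteration_lqr()
gain = 1.2 * 1.0 * p_star / (1.0 + 1.0 * p_star)  # u = -gain * x for the default a, b, r
print(p_star, n_iter, abs(1.2 - 1.0 * gain))
```

The open-loop plant in this sketch is unstable (a = 1.2), yet the policy obtained at convergence gives a closed-loop pole a − b·gain with magnitude below one. Policies from early iterations carry no such guarantee in general, which is the kind of situation an admissibility criterion is meant to detect.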

16.
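Zhang Shaojie (张绍杰), Wu Xue (吴雪), Liu Chunsheng (刘春生). Acta Automatica Sinica (《自动化学报》), 2018, 44(12): 2188-2197
For a class of uncertain continuous-time affine nonlinear multi-input multi-output (MIMO) systems with actuator faults, this paper proposes an optimal adaptive output tracking control scheme. A weight-tuning algorithm for the neural network that estimates the uncertain terms is designed to guarantee system stability, and a single critic network suffices to obtain both the infinite-horizon cost function and the optimal control input satisfying the Hamilton-Jacobi-Bellman (HJB) equation. Considering stuck-actuator and partial loss-of-effectiveness faults, an optimal adaptive compensation control law is designed that achieves uniformly ultimately bounded tracking of the reference output. Aircraft control simulations and comparisons demonstrate the effectiveness and superiority of the proposed method.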

17.
In this paper, we present an optimal neuro-control scheme for continuous-time (CT) nonlinear systems with asymmetric input constraints. Initially, we introduce a discounted cost function for the CT nonlinear systems in order to handle the asymmetric input constraints. Then, we develop a Hamilton-Jacobi-Bellman equation (HJBE), which arises in the discounted cost optimal control problem. To obtain the optimal neurocontroller, we utilize a critic neural network (CNN) to solve the HJBE under the framework of reinforcement learning. The CNN's weight vector is tuned via the gradient descent approach. Based on the Lyapunov method, we prove that uniform ultimate boundedness of the CNN's weight vector and the closed-loop system is guaranteed. Finally, we verify the effectiveness of the present optimal neuro-control strategy by performing simulations of two examples.

18.
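An iterative adaptive critic design (ACD) algorithm is proposed to solve the two-player zero-sum game problem for a class of discrete-time Roesser-type 2-D systems. The main idea is to use the adaptive critic technique to iteratively obtain the optimal control pair that drives the performance index function to the saddle point of the zero-sum game. The proposed ACD can be implemented from input-output data without a model of the system. To implement the iterative ACD algorithm, neural networks are used to approximate the performance index function and to compute the optimal control law, respectively. Finally, the optimal control policy is applied to the control of an air drying process to demonstrate its effectiveness.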
