
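Event-triggered Control Design for Optimal Tracking of Unknown Nonlinear Zero-sum Games
Cite this article: Wang Ding, Hu Lingzhi, Zhao Mingming, Ha Mingming, Qiao Junfei. Event-triggered control design for optimal tracking of unknown nonlinear zero-sum games [J]. Acta Automatica Sinica (自动化学报), 2023, 49(1): 91-101.
Authors: Wang Ding (王鼎), Hu Lingzhi (胡凌治), Zhao Mingming (赵明明), Ha Mingming (哈明鸣), Qiao Junfei (乔俊飞)
Affiliation: 1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124
Funding: Supported by the Science and Technology Innovation 2030 Major Project on "New Generation Artificial Intelligence" (2021ZD0112302), the Beijing Natural Science Foundation (JQ19013), and the National Natural Science Foundation of China (62222301, 61890930-5, 62021003)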
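Abstract: An event-based iterative adaptive critic algorithm is designed to solve the zero-sum game optimal tracking control problem for a class of nonaffine systems. The steady control of the reference trajectory is obtained by a numerical solution method, which transforms the zero-sum game optimal tracking control problem of the unknown nonlinear system into an optimal regulation problem of the error system. To ensure that the closed-loop system achieves good control performance while effectively improving resource utilization, a suitable event-triggering condition is introduced so that the tracking policy pair is updated only intermittently. Based on the designed triggering condition, the Lyapunov method is then used to prove the asymptotic stability of the error system. Next, four neural networks are constructed to implement the proposed algorithm. To improve the accuracy of the steady control corresponding to the target trajectory, the model network approximates the unknown system function directly rather than the error dynamics. A critic network, an action network, and a disturbance network are constructed to approximate the iterative cost function and the iterative tracking policy pair. Finally, two simulation examples verify the feasibility and effectiveness of the control method.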

Keywords: adaptive critic design; event-triggered control; neural networks; optimal tracking control; stability analysis; zero-sum games
Received: 2022-05-09

Event-triggered Control Design for Optimal Tracking of Unknown Nonlinear Zero-sum Games
Affiliation:
1. Faculty of Information Technology, Beijing University of Technology, Beijing 100124
2. Beijing Key Laboratory of Computational Intelligence and Intelligent System, Beijing 100124
3. Beijing Institute of Artificial Intelligence, Beijing 100124
4. Beijing Laboratory of Smart Environmental Protection, Beijing 100124
5. School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing 100083
Abstract: In this paper, an event-based iterative adaptive critic algorithm is designed to address optimal tracking control for a class of nonaffine zero-sum games. The steady control of the reference trajectory is obtained by numerical calculation. The optimal tracking control problem of unknown nonlinear zero-sum games is then transformed into the optimal regulation problem of the corresponding error dynamics. To ensure that the closed-loop system achieves favourable control performance while effectively improving resource utilization, an appropriate event-triggering condition is introduced so that the tracking policy pair is updated aperiodically. Based on the designed triggering condition and Lyapunov stability theory, the error system is proved to be asymptotically stable. In addition, four neural networks are constructed to implement the proposed algorithm. To improve the accuracy of the steady control for the target trajectory, the model network approximates the unknown system function directly instead of the error dynamics. The critic network, the action network, and the disturbance network are constructed to obtain the approximate iterative cost function and the approximate iterative tracking policy pair. Finally, two examples are presented to demonstrate the feasibility and effectiveness of the proposed algorithm.
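To illustrate the event-triggering idea mentioned in the abstract, the following is a minimal sketch on a toy scalar error system. It is not the paper's algorithm: the linear dynamics, the fixed feedback gains `K` and `L` (standing in for the action and disturbance policies), the threshold `c`, and the relative-gap triggering rule are all assumptions chosen for illustration, since the abstract does not state the actual condition or network structure.

```python
# Toy event-triggered regulation of a scalar tracking error (illustrative only).
# Assumed dynamics: e[k+1] = a*e[k] + b*u[k] + d*w[k]; all numbers are hypothetical.

def simulate_event_triggered(steps=200, e0=1.0, c=0.3):
    a, b, d = 1.05, 1.0, 0.5   # assumed error dynamics (open loop is unstable)
    K, L = 0.15, 0.1           # assumed fixed gains for the control/disturbance pair
    e = e0                     # current tracking error
    e_s = e0                   # error sampled at the most recent triggering instant
    updates = 0                # number of times the policy pair was refreshed
    history = []
    for k in range(steps):
        # Assumed triggering rule: refresh the sampled error only when the gap
        # between the held sample and the current error exceeds c * |e|.
        if k == 0 or abs(e_s - e) > c * abs(e):
            e_s = e
            updates += 1
        u = -K * e_s           # control held constant between triggering instants
        w = L * e_s            # disturbance policy, also based on the held sample
        e = a * e + b * u + d * w
        history.append(e)
    return history, updates


history, updates = simulate_event_triggered()
print(f"policy updates: {updates} of {len(history)} steps")
print(f"final |error|: {abs(history[-1]):.2e}")
```

In this toy setting the error still decays toward zero even though the policy pair is recomputed at only a fraction of the time steps, which is the resource-saving effect the abstract attributes to event triggering.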