Similar literature
20 similar articles found.
1.
This paper investigates the consensus problem for linear multi-agent systems from the viewpoint of two-dimensional systems when the state information of each agent is not available. An observer-based, fully distributed adaptive iterative learning protocol is designed. A local observer is designed for each agent, and it is shown that, without using any global information about the communication graph, all agents achieve perfect consensus for every undirected connected communication graph as the number of iterations tends to infinity. A Lyapunov-like energy function is employed to facilitate the learning protocol design and property analysis. Finally, a simulation example is given to illustrate the theoretical analysis.
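As a point of reference for what consensus on an undirected connected graph means, the sketch below runs a plain neighbour-averaging iteration. It is not the observer-based adaptive iterative learning protocol of the abstract; the path graph, the step size `step`, and the initial states are illustrative assumptions.

```python
def consensus_step(states, neighbors, step=0.3):
    """One update x_i <- x_i + step * sum_j (x_j - x_i) over the neighbours j of agent i."""
    return [
        x_i + step * sum(states[j] - x_i for j in neighbors[i])
        for i, x_i in enumerate(states)
    ]


def run_consensus(states, neighbors, iterations=200, step=0.3):
    """Repeat the neighbour-averaging update a fixed number of times."""
    for _ in range(iterations):
        states = consensus_step(states, neighbors, step)
    return states


# Hypothetical undirected, connected path graph 0 - 1 - 2 - 3.
NEIGHBORS = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
INITIAL_STATES = [1.0, 4.0, 2.0, 9.0]

final_states = run_consensus(INITIAL_STATES, NEIGHBORS)
```

Because the graph is undirected, each update preserves the sum of the states, so with a sufficiently small step the agents settle on the average of their initial values.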

2.
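For nonlinear discrete-time multi-agent systems with unknown models, this paper studies the event-triggered iterative learning bipartite consensus problem. First, the compact-form dynamic linearization method is used to build a dynamic linearization data model of the multi-agent system, and a parameter estimation algorithm for this data model is proposed. Second, an output observer and a dead-zone controller are designed based on the data model and, combined with signed graph theory, an event-triggered distributed model-free iterative learning bipartite consensus control strategy is constructed. Then, a Lyapunov function is designed to rigorously prove the convergence of the control strategy. Finally, numerical simulations further verify the correctness and effectiveness of the control protocol.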

3.
Shoham et al. identify several important agendas which can help direct research in multi-agent learning. We propose two additional agendas, called “modelling” and “design”, which cover the problems we need to consider before our agents can start learning. We then consider research goals for modelling, design, and learning, and identify the problem of finding learning algorithms that guarantee convergence to Pareto-dominant equilibria against a wide range of opponents. Finally, we conclude with an example: starting from an informally specified multi-agent learning problem, we illustrate how one might formalize and solve it by stepping through the tasks of modelling, design, and learning.

4.
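For the finite-time consensus tracking control problem of multi-agent systems under periodic denial-of-service (DoS) attacks, this paper proposes a model-free adaptive iterative learning control (MFAILC) algorithm. The multi-agent system is assumed to have a fixed topology, and only some of the agents can access the desired trajectory. Data transmitted within the multi-agent system are quantized by a logarithmic quantizer. First, pseudo-partial derivatives are used to dynamically linearize the agent dynamics, taking into account periodic DoS attacks that follow a Bernoulli distribution, and the MFAILC algorithm is designed on this basis. Second, the contraction mapping method is used to give a sufficient condition that guarantees convergence of the tracking error in the sense of expectation, and the convergence of the proposed algorithm is proved theoretically. The proposed algorithm needs only the input and output data of the system to complete the consensus tracking task. Finally, simulation results verify the effectiveness of the proposed algorithm.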

5.
In context-aware ubiquitous learning, students are guided to learn in the real world with personalized support from the learning system. Because the learning resources are physical objects in the real world, certain physical constraints need to be taken into account, such as the limit on the number of people visiting the same learning object, the time needed to move from one object to another, and environmental parameters. Moreover, the values of these context-dependent parameters are likely to change swiftly during the learning process, which makes it a challenging and important issue to find a navigation support mechanism that suggests learning paths for individual students in real time. In this paper, the navigation support problem for context-aware ubiquitous learning is formulated, and two navigation support algorithms are proposed that take learning efficacy and navigation efficiency into consideration. The simulation results for learning in a butterfly museum setting indicate that the approach helps students use the learning resources more effectively and efficiently and achieve better learning efficacy.

6.
7.
In this paper, an adaptive fuzzy iterative learning control scheme is proposed for coordination problems of Mth-order (M ≥ 2) distributed multi-agent systems. Every follower agent is a higher-order integrator with unknown nonlinear dynamics and input disturbance. The leader's dynamics form a higher-order nonlinear system and are available to only a portion of the follower agents. With distributed initial state learning, unified distributed protocols that combine time-domain and iteration-domain adaptive laws guarantee that the follower agents track the leader uniformly on [0, T]. The proposed algorithm is then extended to achieve formation control. A numerical example and a multiple robotic system are provided to demonstrate the performance of the proposed approach.

8.
This research treats a bargaining process as a Markov decision process, in which a bargaining agent’s goal is to learn the optimal policy that maximizes the total reward it receives over the process. Reinforcement learning is an effective method for agents to learn how to choose actions at each time step in a Markov decision process. Temporal-difference (TD) learning is a fundamental method for solving the reinforcement learning problem, and it can tackle the temporal credit assignment problem. This research designs agents that apply TD-based reinforcement learning to online bilateral bargaining with incomplete information, and evaluates the agents’ bargaining performance in terms of average payoff and settlement rate. The results show that agents using TD-based reinforcement learning are able to achieve good bargaining performance. This learning approach is sufficiently robust and convenient, and is therefore suitable for online automated bargaining in electronic commerce.
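The temporal-difference idea mentioned in this abstract can be illustrated with the textbook tabular TD(0) value update. The sketch below is not the bargaining agent from the paper: the three-state chain, its reward, and the parameters `alpha` and `gamma` are hypothetical choices made only to show how a reward received at the end propagates back to earlier states.

```python
def td0_update(value, reward, next_value, alpha=0.1, gamma=0.9):
    """One TD(0) step: move `value` toward the target reward + gamma * next_value."""
    td_error = reward + gamma * next_value - value
    return value + alpha * td_error


def evaluate_chain(episodes=500, alpha=0.1, gamma=0.9):
    """Estimate state values on a hypothetical chain s0 -> s1 -> s2 (terminal).

    The only reward (1.0) is received on the transition into s2.
    """
    values = [0.0, 0.0, 0.0]  # values[2] belongs to the terminal state and stays 0
    for _ in range(episodes):
        for state in (0, 1):
            next_state = state + 1
            reward = 1.0 if next_state == 2 else 0.0
            values[state] = td0_update(
                values[state], reward, values[next_state], alpha, gamma
            )
    return values


chain_values = evaluate_chain()
```

The value of s0 becomes positive even though s0 itself never receives a reward, which is the temporal credit assignment effect: the estimate for s1 is bootstrapped into the estimate for s0.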

9.
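To address the heavy consumption of communication and computing resources in multi-agent reinforcement learning, this paper proposes an event-driven multi-agent reinforcement learning algorithm, focusing on event-driven mechanisms at the learning-policy level of multi-agent systems. During the interaction between agents and the environment, the algorithm follows the event-driven idea and designs a trigger function based on the rate of change of each agent's observations, so that communication and learning need not take place in real time or periodically; the number of data transmissions and computations in a given period can therefore be reduced. In addition, the computing resource consumption of the algorithm is analyzed and its convergence is demonstrated. Finally, simulation experiments show that the algorithm reduces the number of communications and policy traversals during learning, which alleviates the consumption of communication and computing resources.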
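The event-driven idea in this abstract can be sketched with a simple threshold trigger. The abstract does not give the actual trigger function, so the rule below (broadcast only when the observation has changed by more than `threshold` since the last broadcast) and all names in it are assumptions for illustration only.

```python
def should_broadcast(current_obs, last_sent_obs, threshold):
    """Hypothetical trigger: fire when the observation moved more than `threshold`."""
    return abs(current_obs - last_sent_obs) > threshold


def count_broadcasts(observations, threshold):
    """Count how many observations an agent would transmit under the trigger."""
    last_sent = observations[0]
    sent = 1  # the first observation is always transmitted
    for obs in observations[1:]:
        if should_broadcast(obs, last_sent, threshold):
            last_sent = obs
            sent += 1
    return sent


# Hypothetical scalar observation sequence for one agent.
observations = [0.0, 0.05, 0.1, 0.5, 0.52, 1.2]
event_driven_count = count_broadcasts(observations, threshold=0.3)
every_change_count = count_broadcasts(observations, threshold=0.0)
```

With the larger threshold the agent transmits only on the two large jumps after the initial observation, whereas a zero threshold transmits every changed sample. This is the kind of saving in transmissions that the abstract describes.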

10.
This paper addresses the consensus problem of leader-following nonlinear multi-agent systems with iterative learning control. It is assumed that only a small portion of the follower agents can receive the leader's information. A radial basis function neural network is introduced to approximate the nonlinear dynamics of the system. A distributed adaptive iterative learning control protocol with an auxiliary control term is then designed, in which the estimates of the nonlinear dynamics are used in the protocol design and three adaptive laws are presented. Furthermore, the convergence of the proposed control protocol is analysed using Lyapunov stability theory. Finally, a simulation example is provided to demonstrate the validity of the theoretical results.

11.
In this paper, a novel problem of adaptive awareness coverage is formulated. We model the mission domain using a density function which characterizes the importance of each point and is unknown beforehand. The desired awareness coverage level over the mission domain is defined as a non-decreasing differentiable function of the density distribution. A decentralized adaptive control strategy is developed to accomplish the awareness coverage task and the learning task simultaneously. The proposed control law is memoryless and guarantees that satisfactory awareness coverage of the mission domain is achieved in finite time, with the approximation error of the density function converging to zero.

12.
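This paper studies the distributed optimization problem for continuous-time multi-agent systems with flocking behavior governed by nonlinear functions. The objective is to minimize the sum of the local cost functions, where each agent knows only its own cost function. To solve this problem, a distributed control law is designed that depends only on the velocities of the agent itself and of its neighbors. Convergence of the multi-agent system is proved via Lyapunov stability, and all agents avoid collisions while the sum of the local cost functions is minimized. Finally, a simulation example illustrates the analytical results.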

13.
In this paper, we investigate the use of hierarchical reinforcement learning (HRL) to speed up the acquisition of cooperative multi-agent tasks. We introduce a hierarchical multi-agent reinforcement learning (RL) framework, and propose a hierarchical multi-agent RL algorithm called Cooperative HRL. In this framework, agents are cooperative and homogeneous (use the same task decomposition). Learning is decentralized, with each agent learning three interrelated skills: how to perform each individual subtask, the order in which to carry them out, and how to coordinate with other agents. We define cooperative subtasks to be those subtasks in which coordination among agents significantly improves the performance of the overall task. Those levels of the hierarchy which include cooperative subtasks are called cooperation levels. A fundamental property of the proposed approach is that it allows agents to learn coordination faster by sharing information at the level of cooperative subtasks, rather than attempting to learn coordination at the level of primitive actions. We study the empirical performance of the Cooperative HRL algorithm using two testbeds: a simulated two-robot trash collection task, and a larger four-agent automated guided vehicle (AGV) scheduling problem. We compare the performance and speed of Cooperative HRL with other learning algorithms, as well as several well-known industrial AGV heuristics. We also address the issue of rational communication behavior among autonomous agents in this paper. The goal is for agents to learn both action and communication policies that together optimize the task given a communication cost. We extend the multi-agent HRL framework to include communication decisions and propose a cooperative multi-agent HRL algorithm called COM-Cooperative HRL. In this algorithm, we add a communication level to the hierarchical decomposition of the problem below each cooperation level. 
Before an agent makes a decision at a cooperative subtask, it decides whether it is worthwhile to perform a communication action. A communication action has a certain cost and provides the agent with the actions selected by the other agents at a cooperation level. We demonstrate the efficiency of the COM-Cooperative HRL algorithm as well as the relation between the communication cost and the learned communication policy using a multi-agent taxi problem.

14.
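徐鹏, 谢广明, 文家燕, 高远. 《智能系统学报》, 2019, 14(1): 93-98
Multi-agent formation based on classical reinforcement learning consumes substantial communication and computing resources. To address this, this paper introduces an event-driven control mechanism: an agent's action decisions need not be made at a fixed period; instead, the agent's action is updated when an event-driven condition is met. The event-driven condition takes into account not only the agent's cumulative reward but also the deviation between the agent's reward and those of its neighbors, and the agents interact to seek an optimal joint policy that achieves the formation. Numerical simulation results show that the event-driven reinforcement learning formation control algorithm effectively reduces the agents' action-decision frequency and resource consumption while maintaining system performance.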

15.
Multi-agent systems require adaptability to perform effectively in complex and dynamic environments. This article shows that agents should be able to benefit from dynamically adapting their decision-making frameworks. A decision-making framework describes the set of multi-agent decision-making interactions exercised by members of an agent group in the course of pursuing a goal or set of goals. The decision-making interaction style an agent adopts with respect to other agents influences that agent's degree of autonomy. The article introduces the capability of Dynamic Adaptive Autonomy (DAA), which allows an agent to dynamically modify its autonomy along a defined spectrum (from command-driven to consensus to locally autonomous/master) for each goal it pursues. This article presents one motivation for DAA through experiments showing that the ‘best’ decision-making framework for a group of agents depends not only on the problem domain and pre-defined characteristics of the system, but also on run-time factors that can change during system operation. This result holds regardless of which performance metric is used to define ‘best’. Thus, it is possible for agents to benefit by dynamically adapting their decision-making frameworks to their situation during system operation.

16.
This paper considers the adaptive containment control of high-order nonlinear multi-agent systems with nonlinear parameterisation. Without imposing any conditions on the unknown nonlinearities and unknown parameters, the distributed controllers are constructed recursively using only neighbours’ information via the backstepping design method. Under the assumption that the leader set is globally reachable, it is shown that all the signals of the closed-loop systems are globally uniformly ultimately bounded (UUB), and that all the followers converge exponentially to the convex hull spanned by the dynamic leaders with adjustable tracking errors. Finally, two simulation examples demonstrate the effectiveness of the control scheme.

17.
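In recent years, the rapid development of reinforcement learning and adaptive dynamic programming algorithms, and their successful application to a series of challenging problems (such as optimal decision-making and optimal coordinated control of large-scale multi-agent systems), have made them a research focus in artificial intelligence, systems and control, applied mathematics, and related fields. In view of this, the paper first briefly introduces the fundamentals and core ideas of reinforcement learning and adaptive dynamic programming. It then surveys the development of these two closely related classes of algorithms in different research fields, focusing on how they have evolved from sequential decision (optimal control) problems for a single agent (controlled plant) to sequential decision (optimal coordinated control) problems for multi-agent systems. Further, after briefly reviewing the structural evolution of adaptive dynamic programming algorithms and their progression from model-based offline planning to model-free online learning, it surveys research progress on adaptive dynamic programming for optimal coordinated control of multi-agent systems. Finally, it identifies several challenging topics that deserve attention in multi-agent reinforcement learning and in using adaptive dynamic programming to solve optimal coordinated control problems for multi-agent systems.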

18.
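For the cluster consensus problem of a class of discrete-time nonlinear multi-agent systems with unknown models, a model-free adaptive control algorithm is proposed. First, assuming a fixed topology, the concept of pseudo-partial derivatives is used to obtain a data relationship model of the system; the cluster consensus error is defined taking into account the coupling coefficients among agents, and a data-driven cluster consensus tracking control protocol is designed on this basis. Then, the contraction mapping method is used to analyze the convergence of the tracking error theoretically; the results show that the proposed algorithm completes the tracking task without requiring agent model information and is thus a data-driven control method. Finally, the results are extended to multi-agent systems with randomly switching topologies, and numerical simulations verify the effectiveness of the proposed algorithm.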

19.
In this paper, we investigate the perfect consensus problem for second-order linearly parameterised multi-agent systems (MAS) with an imprecise communication topology structure. Takagi-Sugeno (T–S) fuzzy models are used to describe the imprecise communication topology structure of the leader-following MAS, and a distributed adaptive iterative learning control protocol is proposed in which the leader's dynamics are unknown to all of the agents. The proposed protocol guarantees that the follower agents track the leader perfectly on [0,T]. Under the alignment condition, a sufficient condition for consensus of the closed-loop MAS is given based on Lyapunov stability theory. Finally, a numerical example and a multiple pendulum system are given to illustrate the effectiveness of the proposed algorithm.

20.
Distributed learning and cooperative control for multi-agent systems   Cited by: 1 (self-citations: 0, citations by others: 1)
This paper presents an algorithm and analysis of distributed learning and cooperative control for a multi-agent system so that a global goal of the overall system can be achieved by locally acting agents. We consider a resource-constrained multi-agent system, in which each agent has limited capabilities in terms of sensing, computation, and communication. The proposed algorithm is executed by each agent independently to estimate an unknown field of interest from noisy measurements and to coordinate multiple agents in a distributed manner to discover peaks of the unknown field. Each mobile agent maintains its own local estimate of the field and updates the estimate using collective measurements from itself and nearby agents. Each agent then moves towards peaks of the field using the gradient of its estimated field while avoiding collision and maintaining communication connectivity. The proposed algorithm is based on a recursive spatial estimation of an unknown field. We show that the closed-loop dynamics of the proposed multi-agent system can be transformed into a form of a stochastic approximation algorithm and prove its convergence using Ljung’s ordinary differential equation (ODE) approach. We also present extensive simulation results supporting our theoretical results.
