Similar Literature
20 similar documents found (search time: 31 ms)
1.
To handle the continually changing demands placed on agents in a continuous, dynamic environment, we study the multi-robot box-pushing problem using reinforcement learning and obtain a method by which an agent can complete a cooperative task without any information about the other agents. Reinforcement learning is applicable to both cooperative and non-cooperative settings, and in the presence of noise and communication difficulties it offers advantages that other artificial intelligence methods cannot match.
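The abstract does not give the exact update rule; as a hedged illustration, the sketch below shows plain independent tabular Q-learning, in which each robot learns only from its own local observation and reward and never needs information about its peers. The state and action encodings and the environment interface are assumptions.

```python
import random
from collections import defaultdict

class IndependentQLearner:
    """One learner per robot; no access to other agents' states or actions."""

    def __init__(self, actions, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)          # (state, action) -> value
        self.actions = actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state):
        # epsilon-greedy over the agent's own local observation
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])

# Usage: one learner per robot, each trained only on its own experience.
robots = [IndependentQLearner(actions=["push", "wait", "left", "right"]) for _ in range(3)]
```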

2.
Information fusion techniques are applied to multi-agent systems: the spatially distributed local observations perceived by other agents are fused to obtain a more complete situation assessment, which is then used to plan and coordinate the cooperative behaviour of the multi-agent system, yielding an information-fusion-based method for multi-agent cooperation. Applying the method to a robot rescue simulation shows that it can realise a globally informed task-decomposition strategy and effectively improves the agents' ability to cooperate.

3.
Distributed hierarchical control design of multi-agent systems based on approximate simulation
This paper studies the design of distributed hierarchical control for multi-agent systems based on approximate simulation. To reduce the complexity of the problem, a simple abstract system is first constructed to guide the individual agents, and the simulation relation between this abstract system and the multi-agent system is then discussed. With the help of the abstract system, a distributed hierarchical controller is given that allows the multi-agent system to accomplish a class of coordination tasks. Using simulation functions and a common Lyapunov function, the collective behaviour of the multi-agent system under switching topologies is further analysed.

4.
Multi-agent deep reinforcement learning (MADRL) applies the ideas and algorithms of deep reinforcement learning to the learning and control of multi-agent systems and is an important approach to building multi-agent systems with swarm intelligence. Existing MADRL research mostly designs algorithms under the assumption that the environment is fully observable or that communication resources are unlimited. Partial observability, however, is unavoidable in practical multi-agent applications: an agent's observation range is usually limited, the environment outside that range is not observed, and coordination among agents therefore becomes difficult. To address partial observability in practical scenarios, this work extends the Actor-Critic deep reinforcement learning algorithm to multi-agent systems under the centralised-training, decentralised-execution paradigm, adds inter-agent communication channels and a gating mechanism, and proposes the recurrent gated multi-agent Actor-Critic algorithm (RGMAAC). Agents can communicate efficiently based on their historical action-observation memory sequences, and finally make decisions using their local observations, the historical memory sequence, and the observations explicitly shared by other agents through the communication channel. A task in which multiple agents must reach target points synchronously and quickly is designed on the multi-agent particle environment, together with two reward functions and task scenarios. Experimental results show that when partial observability is explicitly present in the task, agents trained with RGMAAC perform well, and in terms of stability...
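The abstract does not specify the RGMAAC network layout; the sketch below is only a minimal PyTorch illustration of the general idea of a recurrent agent with a gated communication channel. The layer sizes, the gating formulation, and the message format are assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class GatedCommAgent(nn.Module):
    """Illustrative recurrent actor with a learned gate over incoming messages
    (not the exact RGMAAC architecture)."""

    def __init__(self, obs_dim, msg_dim, n_actions, hidden=64):
        super().__init__()
        self.rnn = nn.GRUCell(obs_dim + msg_dim, hidden)     # memory over action-observation history
        self.gate = nn.Sequential(nn.Linear(hidden + msg_dim, 1), nn.Sigmoid())
        self.policy = nn.Linear(hidden, n_actions)            # actor head
        self.msg_out = nn.Linear(hidden, msg_dim)              # message broadcast to peers

    def forward(self, obs, incoming_msg, h):
        # the gate decides how much of the peers' shared observation to use
        g = self.gate(torch.cat([h, incoming_msg], dim=-1))
        x = torch.cat([obs, g * incoming_msg], dim=-1)
        h_next = self.rnn(x, h)
        action_logits = self.policy(h_next)
        return action_logits, self.msg_out(h_next), h_next
```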

5.
蒋伟进 (Jiang Weijin), 夏可 (Xia Ke). 《软件学报》 (Journal of Software), 2009, 20(Z1): 66-75
To improve the efficiency with which an enterprise uses its knowledge and to strengthen its capacity for innovation, and targeting the enterprise's existing knowledge and systems, this work proposes separating the business logic of enterprise knowledge management from knowledge-processing transactions. A knowledge-reuse model based on multiple agents and component knowledge is established, a rule model for the knowledge-management business logic and an activity-behaviour model for the agents are designed, and a multi-agent rule-coordination pattern is discussed. The approach effectively supports dynamic reuse of knowledge and dynamic recomposition of the knowledge-use process, strengthening the distributed processing capability and scalability of the knowledge management system. In the distributed component repository system, agents jointly fulfil task requirements through cooperation; each agent has its own knowledge base and learning ability, and can update its knowledge base to keep its execution results valid.

6.
Progress in the application of swarm intelligence to multi-agent systems
Swarm intelligence algorithms are distributed problem-solving methods inspired by the collective behaviour of social insect colonies; applying them to multi-agent systems aims to improve the robustness, flexibility, and adaptability of such systems. Taking the application of swarm intelligence in multi-agent systems as its thread, this survey first introduces the core mechanisms of swarm intelligence, and then summarises existing work on applying swarm-intelligence theory to multi-agent systems from the perspectives of communication mechanisms, cooperation techniques, learning problems, and architecture construction. Finally, it analyses and discusses open problems in applying swarm intelligence to multi-agent systems and outlines directions for future work.

7.
Research on the architecture of distributed reinforcement learning systems
Reinforcement learning is an important machine learning method, and with the rapid development of computer networks and distributed processing, distributed reinforcement learning for multi-agent systems is attracting increasing attention. The paper groups existing distributed reinforcement learning methods into four classes: central reinforcement learning, independent reinforcement learning, group reinforcement learning, and social reinforcement learning. It then examines the architectural framework of each class and gives formal definitions for all four.

8.
In a distributed, dynamic environment, cooperation in a multi-agent system is a dynamic process built on a set of rules, so dynamic cooperation rules must be established. The stationary state of multi-agent reinforcement learning is, in essence, the cooperation rule between agents. Based on this observation, a reinforcement-learning-based method for extracting cooperation rules is proposed, and a new agent decision structure is built from it. A worked example shows that, compared with plain reinforcement learning, the proposed method has the following advantages: 1) the extracted rules speed up the agents' cooperative decision-making; 2) the dynamic evolution of the rules adapts to changes in the environment; 3) the rules avoid repeatedly relearning the same behaviour.
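The extraction procedure itself is not spelled out in the abstract; as a hedged illustration, the sketch below reads a (near-)converged Q-table and turns each sufficiently visited state into an explicit state-to-action rule that can be reused without further learning. The table layout and the visit threshold are assumptions.

```python
from collections import defaultdict

def extract_rules(q_table, actions, visit_counts=None, min_visits=10):
    """Turn a (state, action) -> value table into explicit decision rules.

    Only states seen often enough are kept, so the rule set reflects the
    (approximately) stationary policy rather than noisy early estimates.
    """
    rules = {}
    states = {s for (s, _) in q_table}
    for s in states:
        if visit_counts is not None and visit_counts.get(s, 0) < min_visits:
            continue
        rules[s] = max(actions, key=lambda a: q_table.get((s, a), 0.0))
    return rules

# Usage: agents consult `rules` first and fall back to learning for unseen states.
q = defaultdict(float)
q[("near_box", "push")] = 1.2
q[("near_box", "wait")] = 0.1
rules = extract_rules(q, actions=["push", "wait"])
print(rules)   # {'near_box': 'push'}
```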

9.
Research on distributed artificial intelligence and multi-agent systems
Agent theory is a rapidly developing frontier field. Distributed artificial intelligence emerged to solve complex problems, and the cooperation of multiple agents fits its requirements well, which gave rise to multi-agent systems. This paper introduces the characteristics of distributed artificial intelligence and focuses on one of its classifications, the multi-agent system. Multi-agent systems are an effective approach within distributed artificial intelligence, and distributed artificial intelligence in turn has driven the development of multi-agent systems.

10.
Cooperation has long been one of the key problems in multi-agent systems research. This paper presents a method that uses a genetic algorithm to realise multi-agent cooperation: the genetic algorithm is used to generate effective coordinated motion when the multi-agent system cannot obtain environment information, or can do so only at excessive cost. The method is evaluated in computer simulations in which three agents cooperate to carry a box to a target point, after which the target point is moved and the agents must continue the cooperative task. The results demonstrate the feasibility and effectiveness of genetic algorithms for multi-agent cooperation in dynamic environments.
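The paper's chromosome encoding and fitness function are not given in the abstract; the sketch below is only a generic genetic-algorithm loop of the kind such a method might use, evolving real-valued candidate motion parameters scored by a user-supplied fitness function. The operators and parameters are assumptions.

```python
import random

def evolve(fitness, n_genes, pop_size=50, generations=200,
           crossover_rate=0.8, mutation_rate=0.05):
    """Generic GA: chromosomes are real-valued vectors (e.g. per-agent motion
    parameters); `fitness` scores how well the resulting joint motion performs."""
    pop = [[random.uniform(-1, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=fitness, reverse=True)
        next_pop = ranked[:2]                                  # elitism: keep the two best plans
        while len(next_pop) < pop_size:
            p1, p2 = random.sample(ranked[:pop_size // 2], 2)  # truncation selection
            if random.random() < crossover_rate:
                cut = random.randrange(1, n_genes)             # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            child = [g + random.gauss(0, 0.1) if random.random() < mutation_rate else g
                     for g in child]                           # Gaussian mutation
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)

# Toy usage: three agents with two motion parameters each; fitness rewards values near 0.5.
best = evolve(lambda c: -sum((g - 0.5) ** 2 for g in c), n_genes=6)
```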

11.
Reinforcement learning is one of the more prominent machine-learning technologies because of its unsupervised learning structure and its ability to keep learning, even in a dynamic operating environment. Applying it to cooperative multi-agent systems not only allows each individual agent to learn from its own experience, but also lets agents learn from one another, so the speed of learning can be accelerated. In the proposed learning algorithm, an agent adapts to comply with its peers by learning cautiously when it receives a positive reinforcement signal, but learning more aggressively when a negative reward follows the action just taken. These two properties are used to develop the proposed cooperative learning method, which combines the policy hill-climbing method Win or Lose Fast (WoLF) with policy sharing. Results from the multi-agent cooperative domain show that the proposed algorithms outperform Q-learning alone in a piano-mover environment, and that agents can learn to accomplish a task together efficiently through repeated trials.
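For reference, the sketch below shows a compact tabular version of WoLF policy hill-climbing, the standard "learn slowly when winning, fast when losing" update that the paper builds on; the policy-sharing mechanism itself is omitted here, and the learning rates are assumptions.

```python
import random
from collections import defaultdict

class WoLFPHCAgent:
    """Tabular WoLF policy hill-climbing: small step (delta_win) when winning,
    large step (delta_lose) when losing."""

    def __init__(self, actions, alpha=0.1, gamma=0.9, delta_win=0.01, delta_lose=0.04):
        self.actions = actions
        self.alpha, self.gamma = alpha, gamma
        self.delta_win, self.delta_lose = delta_win, delta_lose
        self.q = defaultdict(float)
        self.pi = defaultdict(lambda: 1.0 / len(actions))      # current policy
        self.avg_pi = defaultdict(lambda: 1.0 / len(actions))  # average policy
        self.counts = defaultdict(int)

    def act(self, s):
        r, acc = random.random(), 0.0
        for a in self.actions:
            acc += self.pi[(s, a)]
            if r <= acc:
                return a
        return self.actions[-1]

    def update(self, s, a, reward, s_next):
        # 1. ordinary Q-learning update
        best_next = max(self.q[(s_next, b)] for b in self.actions)
        self.q[(s, a)] += self.alpha * (reward + self.gamma * best_next - self.q[(s, a)])
        # 2. update the average policy
        self.counts[s] += 1
        for b in self.actions:
            self.avg_pi[(s, b)] += (self.pi[(s, b)] - self.avg_pi[(s, b)]) / self.counts[s]
        # 3. "winning" if the current policy beats the average policy in expectation
        winning = (sum(self.pi[(s, b)] * self.q[(s, b)] for b in self.actions)
                   > sum(self.avg_pi[(s, b)] * self.q[(s, b)] for b in self.actions))
        delta = self.delta_win if winning else self.delta_lose
        # 4. move the policy toward the greedy action by at most delta
        greedy = max(self.actions, key=lambda b: self.q[(s, b)])
        for b in self.actions:
            if b != greedy:
                step = min(self.pi[(s, b)], delta / (len(self.actions) - 1))
                self.pi[(s, b)] -= step
                self.pi[(s, greedy)] += step
```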

12.
A two-layer reinforcement learning approach to multi-agent cooperation
A two-layer reinforcement learning method for multi-agent cooperation is proposed. The method is realised mainly by building two layers of reinforcement learning units inside each individual agent. Applied to a computer simulation in which three agents cooperate to lift a circular object, it is shown to produce better cooperation than agents using conventional reinforcement learning.

13.
In this paper, we investigate the use of hierarchical reinforcement learning (HRL) to speed up the acquisition of cooperative multi-agent tasks. We introduce a hierarchical multi-agent reinforcement learning (RL) framework, and propose a hierarchical multi-agent RL algorithm called Cooperative HRL. In this framework, agents are cooperative and homogeneous (use the same task decomposition). Learning is decentralized, with each agent learning three interrelated skills: how to perform each individual subtask, the order in which to carry them out, and how to coordinate with other agents. We define cooperative subtasks to be those subtasks in which coordination among agents significantly improves the performance of the overall task. Those levels of the hierarchy which include cooperative subtasks are called cooperation levels. A fundamental property of the proposed approach is that it allows agents to learn coordination faster by sharing information at the level of cooperative subtasks, rather than attempting to learn coordination at the level of primitive actions. We study the empirical performance of the Cooperative HRL algorithm using two testbeds: a simulated two-robot trash collection task, and a larger four-agent automated guided vehicle (AGV) scheduling problem. We compare the performance and speed of Cooperative HRL with other learning algorithms, as well as several well-known industrial AGV heuristics. We also address the issue of rational communication behavior among autonomous agents in this paper. The goal is for agents to learn both action and communication policies that together optimize the task given a communication cost. We extend the multi-agent HRL framework to include communication decisions and propose a cooperative multi-agent HRL algorithm called COM-Cooperative HRL. In this algorithm, we add a communication level to the hierarchical decomposition of the problem below each cooperation level. Before an agent makes a decision at a cooperative subtask, it decides if it is worthwhile to perform a communication action. A communication action has a certain cost and provides the agent with the actions selected by the other agents at a cooperation level. We demonstrate the efficiency of the COM-Cooperative HRL algorithm as well as the relation between the communication cost and the learned communication policy using a multi-agent taxi problem.
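The exact COM-Cooperative HRL decomposition cannot be reconstructed from the abstract alone; the sketch below is only a toy illustration of the communication decision it describes: at a cooperative subtask, the agent first decides whether the costly communication action is worthwhile, and only then picks a subtask using whatever information it has. All table layouts and names are assumptions.

```python
import random
from collections import defaultdict

class CommunicationLevel:
    """Toy decision layer below a cooperation level: decide whether to pay the
    communication cost before choosing a cooperative subtask."""

    def __init__(self, subtasks, epsilon=0.1):
        self.subtasks = subtasks
        self.epsilon = epsilon
        self.q_comm = defaultdict(float)    # (state, True/False) -> value of communicating or not
        self.q_solo = defaultdict(float)    # (state, subtask) -> value without the peers' choices
        self.q_joint = defaultdict(float)   # (state, peers' choices, subtask) -> value with them

    def choose(self, state, ask_peers):
        # the communication cost is assumed to be folded into the learned q_comm values
        if random.random() < self.epsilon:
            communicate = random.choice([True, False])
        else:
            communicate = self.q_comm[(state, True)] > self.q_comm[(state, False)]
        if communicate:
            peers = ask_peers()             # tuple of subtasks currently selected by the other agents
            best = max(self.subtasks, key=lambda t: self.q_joint[(state, peers, t)])
        else:
            best = max(self.subtasks, key=lambda t: self.q_solo[(state, t)])
        return communicate, best

# Usage: coordinator = CommunicationLevel(["goto_dump", "collect", "wait"])
```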

14.
This paper attempts to bridge the fields of machine learning, robotics, and distributed AI. It discusses the use of communication in reducing the undesirable effects of locality in fully distributed multi-agent systems in which multiple agents (robots) learn in parallel while interacting with each other. Two key problems, hidden state and credit assignment, are addressed by applying local undirected broadcast communication in a dual role: as sensing and as reinforcement. The methodology is demonstrated in two multi-robot learning experiments: the first involves learning a tightly coupled coordination task with two robots, the second a loosely coupled task with four robots learning social rules. Communication is used to (1) share sensory data to overcome hidden state and (2) share reinforcement to overcome the credit-assignment problem between the agents and bridge the gap between local individual and global group pay-off.
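As a hedged sketch of the two roles of communication described above (illustrative only, not the paper's implementation), the helpers below treat peers' broadcasts first as extra sensing that augments the local state, and then as shared reinforcement that is blended into the local reward. The blending weight is an assumption.

```python
def augmented_state(own_obs, broadcast_obs):
    """Communication as sensing: peers' broadcast observations reduce hidden state
    by extending the agent's local view."""
    return (own_obs, tuple(sorted(broadcast_obs)))

def shared_reward(own_reward, peers_rewards, weight=0.5):
    """Communication as reinforcement: blending in the peers' pay-off couples local
    credit assignment to the group outcome."""
    if not peers_rewards:
        return own_reward
    return own_reward + weight * sum(peers_rewards) / len(peers_rewards)

# Toy usage
s = augmented_state("at_corner", ["peer_sees_box"])
r = shared_reward(1.0, [0.0, 2.0])
```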

15.
Research on reinforcement learning algorithms for dynamic multi-robot formation
In artificial intelligence, reinforcement learning has attracted wide attention for its self-learning and adaptive properties. With the continued development of multi-agent theory within distributed artificial intelligence, distributed reinforcement learning algorithms have gradually become a research focus. This paper first reviews the state of reinforcement learning research, and then, using dynamic multi-robot formation as the study model, describes how distributed reinforcement learning can be applied to multi-robot behaviour control. A SOM (self-organising map) neural network is used to partition the state space autonomously so as to speed up learning; a BP (back-propagation) neural network is used to implement the reinforcement learning so as to improve the generalisation ability of the system; and two reinforcement signals, an internal and an external one, are used to balance each robot's individual interest against the interest of the group. To make the control task explicit, the system uses blackboard communication for hierarchical control. Simulation experiments finally demonstrate the effectiveness of the method.
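The SOM and BP network parameters are not given in the abstract; as a hedged illustration of the state-partitioning step, the sketch below trains a small one-dimensional SOM on continuous observations and uses the index of the winning node as the discrete state handed to the learner. Sizes and learning rates are assumptions.

```python
import numpy as np

class SOMStatePartitioner:
    """Map a continuous observation to a discrete state index (the winning node)."""

    def __init__(self, n_nodes, obs_dim, lr=0.3, sigma=1.5, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.uniform(-1, 1, size=(n_nodes, obs_dim))   # prototype vectors
        self.lr, self.sigma = lr, sigma

    def winner(self, x):
        return int(np.argmin(np.linalg.norm(self.w - x, axis=1)))

    def train_step(self, x):
        win = self.winner(x)
        # neighbourhood function over node indices (1-D topology)
        dist = np.abs(np.arange(len(self.w)) - win)
        h = np.exp(-(dist ** 2) / (2 * self.sigma ** 2))
        self.w += self.lr * h[:, None] * (x - self.w)
        return win

# Usage: discretise robot observations before tabular reinforcement learning.
som = SOMStatePartitioner(n_nodes=16, obs_dim=4)
state_index = som.train_step(np.array([0.2, -0.5, 0.1, 0.9]))
```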

16.
A survey of multi-agent reinforcement learning methods under the stochastic game framework
宋梅萍 (Song Meiping), 顾国昌 (Gu Guochang), 张国印 (Zhang Guoyin). 《控制与决策》 (Control and Decision), 2005, 20(10): 1081-1090
Multi-agent learning studies, within the framework of stochastic games, how multiple agents master interaction skills through self-learning. The success of single-agent reinforcement learning, the solid mathematical foundation of game theory, and the broad prospects for application in complex task environments have made multi-agent reinforcement learning an important topic in machine learning research. This survey first gives formal definitions of the basic concepts of stochastic games for multi-agent systems; it then reviews learning algorithms for stochastic games and repeated games together with other related work; finally, drawing on recent developments, it surveys applications of multi-agent learning in e-commerce, robotics, and military domains, and discusses remaining problems and directions for future research.

17.
Multi-agent reinforcement learning is mainly investigated from two perspectives: concurrent learning and game theory. The former chiefly applies to cooperative multi-agent systems, while the latter usually applies to coordinated multi-agent systems, but both suffer from problems such as credit assignment and multiple Nash equilibria. In this paper, we propose a new multi-agent reinforcement learning model and algorithm, LMRL, from a layered perspective. The LMRL model consists of an off-line training layer, which uses single-agent reinforcement learning to acquire stationary strategy knowledge, and an online interaction layer, which uses multi-agent reinforcement learning together with that strategy knowledge, revised dynamically, to interact with the environment. An agent with LMRL can improve its generalization capability, adaptability, and coordination ability. Experiments show that LMRL can perform better than single-agent reinforcement learning and Nash-Q.

18.
Research on multi-agent cooperation based on reinforcement learning
Reinforcement learning provides a robust learning method for cooperation among multiple agents. This paper first introduces the principles and components of reinforcement learning, then describes the multi-agent Markov decision process (MMDP) and gives an agent reinforcement learning model. On this basis, it compares the two reinforcement learning styles found in multi-agent cooperation, IL (independent learners) and JAL (joint-action learners). Finally, it analyses several coordination mechanisms commonly used in cooperative multi-agent systems when multiple optimal policies exist.
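As a hedged illustration of the IL/JAL distinction (the standard construction from the literature, not code from this paper), the sketch below contrasts the two value tables: an independent learner keys its Q-values only on its own action, while a joint-action learner keys them on the joint action and weights the peers' actions by their observed frequencies.

```python
from collections import defaultdict

class IndependentLearner:
    """Q(s, a_i): ignores what the other agents do."""
    def __init__(self):
        self.q = defaultdict(float)

    def value(self, s, my_action):
        return self.q[(s, my_action)]

class JointActionLearner:
    """Q(s, (a_i, a_others)) plus empirical frequencies of the others' actions."""
    def __init__(self, other_actions):
        self.q = defaultdict(float)
        self.freq = defaultdict(int)        # how often each peer joint action was seen
        self.other_actions = other_actions

    def observe_others(self, s, others_action):
        self.freq[(s, others_action)] += 1

    def value(self, s, my_action):
        # expected value of my_action, averaging over the peers' observed behaviour
        total = sum(self.freq[(s, oa)] for oa in self.other_actions) or 1
        return sum(self.q[(s, (my_action, oa))] * self.freq[(s, oa)] / total
                   for oa in self.other_actions)
```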

19.
Addressing the stability of multi-agent cooperation in complex, dynamic environments, a cooperation method based on game theory and a penalty mechanism is proposed. A utility function is used to select the optimal strategy and achieve equilibrium cooperation; to improve the stability and success rate of cooperation, a penalty mechanism is introduced and the penalty coefficient is adjusted continually to maintain the stability of multi-agent cooperation, and when cooperation teams are formed, the reputation values of the participating agents are fully taken into account. Simulation results show that the method effectively reduces task completion time, prevents agents from arbitrarily quitting during dynamic cooperation, and improves cooperation efficiency and stability.

20.
In this paper, a multi-agent reinforcement learning method based on predicting the actions of other agents is proposed. In a multi-agent system, the learning agent's action selection is unavoidably affected by the other agents' actions, so the joint state and joint action are involved in the multi-agent reinforcement learning system. A novel agent action prediction method based on the probabilistic neural network (PNN) is proposed, in which the PNN is used to predict the actions of other agents. In addition, a policy-sharing mechanism is used to exchange the learning policies of multiple agents in order to speed up learning. Finally, the application of the presented method to robot soccer is studied: through learning, the robot players master the mapping from state information to the action space, and coordination and cooperation among multiple robots are realized well.
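The PNN configuration used in the paper is not given in the abstract; the sketch below is a hedged illustration of a Parzen-window (PNN-style) classifier that predicts a teammate's next action from the observed state. The kernel width and the state encoding are assumptions.

```python
import numpy as np

class PNNActionPredictor:
    """Probabilistic neural network (Parzen-window classifier) that predicts
    another agent's next action from the observed state."""

    def __init__(self, sigma=0.5):
        self.sigma = sigma
        self.samples = {}                  # action -> list of state vectors seen before it

    def record(self, state, action):
        self.samples.setdefault(action, []).append(np.asarray(state, dtype=float))

    def predict(self, state):
        state = np.asarray(state, dtype=float)
        best_action, best_score = None, -np.inf
        for action, patterns in self.samples.items():
            diffs = np.stack(patterns) - state
            # average of Gaussian kernels centred on the stored patterns of this action
            score = np.exp(-np.sum(diffs ** 2, axis=1) / (2 * self.sigma ** 2)).mean()
            if score > best_score:
                best_action, best_score = action, score
        return best_action

# Usage: record (state, action) pairs observed for a teammate, then query.
pnn = PNNActionPredictor()
pnn.record([0.1, 0.2], "pass")
pnn.record([0.9, 0.8], "shoot")
print(pnn.predict([0.15, 0.25]))   # likely "pass"
```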
