Dual Reinforcement Learning Based on Bias Learning

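Cite this article:	Lin Fen, Shi Chuan, Luo Jiewen, Shi Zhongzhi. Dual Reinforcement Learning Based on Bias Learning[J]. Journal of Computer Research and Development, 2008, 45(9).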

Authors:	Lin Fen, Shi Chuan, Luo Jiewen, Shi Zhongzhi

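Affiliations:	1. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; Graduate University of Chinese Academy of Sciences, Beijing 100049 2. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; Beijing Key Laboratory of Intelligent Software and Multimedia, Beijing University of Posts and Telecommunications, Beijing 100876 3. Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190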

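Funding:	National High Technology Research and Development Program of China (863 Program); National Basic Research Program of China (973 Program); National Natural Science Foundation of China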

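Abstract:	Conventional reinforcement learning suffers from problems such as slow convergence. Presetting certain biases from prior knowledge can speed up learning, but when the prior knowledge is incorrect, the learning process may fail to converge. To address this, a dual-layer reinforcement learning model based on bias information learning is proposed. The model couples the reinforcement learning process with a bias information learning process: bias information guides the action selection policy of reinforcement learning, while reinforcement learning in turn guides the bias information learning process. The method makes effective use of prior knowledge while eliminating the influence of incorrect prior knowledge. Experiments on maze problems show that the method converges stably to the optimal policy, and that it uses prior knowledge effectively to improve learning efficiency and accelerate convergence.
Keywords:	reinforcement learning; Q-learning algorithm; bias information; bias information learning; prior knowledge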
Dual Reinforcement Learning Based on Bias Learning

Lin Fen, Shi Chuan, Luo Jiewen, Shi Zhongzhi. Dual Reinforcement Learning Based on Bias Learning[J]. Journal of Computer Research and Development, 2008, 45(9).

Authors:	Lin Fen, Shi Chuan, Luo Jiewen, Shi Zhongzhi

Affiliation:	Lin Fen (1,2), Shi Chuan (1,3), Luo Jiewen (1), Shi Zhongzhi (1). 1: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Beijing 100190; 2: Graduate University of Chinese Academy of Sciences, Beijing 100049; 3: Smart Software & Multimedia of Beijing Key Laboratory, Beijing University of Posts & Telecommunications, Beijing 100876

Abstract:	Reinforcement learning has received much attention in the past decade. Its incremental nature and adaptive capabilities make it suitable for use in various domains, such as automatic control, mobile robotics and multi-agent systems. A critical problem in conventional reinforcement learning is the slow convergence of the learning process. To accelerate learning, bias information is incorporated to boost the learning process with prior knowledge. Current methods use bias information for the action selec...

Keywords:	reinforcement learning; Q-learning; bias; bias learning; prior knowledge
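
The abstract describes a two-way coupling (bias guides action selection, and reinforcement learning in turn corrects the bias) but does not give the update rules. The following is only an illustrative sketch of that idea, not the authors' algorithm. Everything specific in it is an assumption made for the example: the 4x4 wall-free grid, the reward values, the additive score `Q + weight * B`, the rule that moves each bias entry toward the action's estimated advantage, and the names `train`, `helpful_prior`, `misleading_prior`, and `greedy_path_length`.

```python
import random

ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
SIZE, GOAL = 4, (3, 3)                        # assumed toy maze, no inner walls

def step(state, action):
    """Deterministic move; bumping into the border leaves the state unchanged."""
    r = min(max(state[0] + ACTIONS[action][0], 0), SIZE - 1)
    c = min(max(state[1] + ACTIONS[action][1], 0), SIZE - 1)
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else -0.04), nxt == GOAL

def train(prior_bias, episodes=3000, alpha=0.5, gamma=0.9,
          beta=0.1, weight=1.0, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    states = [(r, c) for r in range(SIZE) for c in range(SIZE)]
    Q = {s: [0.0] * 4 for s in states}
    B = {s: list(prior_bias(s)) for s in states}  # bias table from prior knowledge
    for _ in range(episodes):
        s = (0, 0)
        for _ in range(200):  # step cap per episode
            if rng.random() < epsilon:
                a = rng.randrange(4)
            else:
                # Layer 1: the bias steers action selection.
                a = max(range(4), key=lambda i: Q[s][i] + weight * B[s][i])
            s2, reward, done = step(s, a)
            # Standard Q-learning update.
            target = reward + (0.0 if done else gamma * max(Q[s2]))
            Q[s][a] += alpha * (target - Q[s][a])
            # Layer 2 (assumed rule): pull the bias toward the learned advantage,
            # so a misleading prior fades as experience accumulates.
            advantage = Q[s][a] - sum(Q[s]) / 4
            B[s][a] += beta * (advantage - B[s][a])
            s = s2
            if done:
                break
    return Q, B

def helpful_prior(state):     # favours down/right, toward the goal
    return [0.0, 1.0, 0.0, 1.0]

def misleading_prior(state):  # favours up/left, away from the goal
    return [1.0, 0.0, 1.0, 0.0]

def greedy_path_length(Q, limit=50):
    """Steps taken by the greedy policy on Q alone, or None if the goal is not reached."""
    s, steps = (0, 0), 0
    while s != GOAL and steps < limit:
        s, _, _ = step(s, max(range(4), key=lambda i: Q[s][i]))
        steps += 1
    return steps if s == GOAL else None
```

Calling `train(helpful_prior)` and `train(misleading_prior)` and passing each returned `Q` to `greedy_path_length` gives a quick way to compare how the two priors affect the learned policy in this toy setting. Because the bias enters only the behaviour policy, the Q-learning update itself stays off-policy and unchanged.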
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
