Semi-Markov adaptive critic heuristics with application to airline revenue management
Authors: Ketaki KULKARNI, Abhijit GOSAVI, Susan MURRAY, and Katie GRANTHAM
Affiliation:Department of Engineering Management and Systems Engineering, Missouri University of Science and Technology, Rolla, MO 65409, U.S.A.
Abstract: The adaptive critic heuristic has been a popular algorithm in reinforcement learning (RL) and approximate dynamic programming (ADP) alike, and it is one of the first RL and ADP algorithms. RL and ADP algorithms are particularly useful for solving Markov decision processes (MDPs) that suffer from the curses of dimensionality and modeling. Many real-world problems, however, tend to be semi-Markov decision processes (SMDPs), in which the time spent in each transition of the underlying Markov chains is itself a random variable. Unfortunately, for the average reward case, unlike the discounted reward case, the MDP does not have an easy extension to the SMDP. Examples of SMDPs can be found in supply chain management, maintenance management, and airline revenue management. In this paper, we propose an adaptive critic heuristic for the SMDP under the long-run average reward criterion. We present a convergence analysis of the algorithm, which shows that under certain mild conditions, which can be ensured within a simulator, the algorithm converges to an optimal solution with probability 1. We test the algorithm extensively on a problem of airline revenue management in which the manager has to set prices for airline tickets over the booking horizon. The problem is large in scale and suffers from the curse of dimensionality, and hence it is difficult to solve via classical methods of dynamic programming. Our numerical results are encouraging and show that the algorithm outperforms an existing heuristic used widely in the airline industry.
Keywords: Adaptive critics; Actor critics; Semi-Markov; Approximate dynamic programming; Reinforcement learning
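
Illustrative note: the abstract refers to the long-run average reward criterion for an SMDP, where each transition takes a random amount of time. The sketch below is not the paper's adaptive critic algorithm; it only illustrates the criterion itself for a fixed policy, as total reward divided by total elapsed time. The two-state process, its transition probabilities, rewards, and exponential sojourn times are all hypothetical values chosen for illustration. The simulated estimate is compared with the standard renewal-reward ratio computed from the stationary distribution of the embedded chain.

```python
import random

# Hypothetical two-state semi-Markov reward process under a fixed policy.
# None of these numbers come from the paper; they are illustrative assumptions.
P = [[0.3, 0.7], [0.6, 0.4]]   # embedded-chain transition probabilities
REWARD = [5.0, 1.0]            # reward earned per transition out of each state
MEAN_TIME = [2.0, 0.5]         # mean sojourn time in each state


def simulate_average_reward(n_transitions, seed=0):
    """Estimate the average reward as total reward / total elapsed time."""
    rng = random.Random(seed)
    state, total_reward, total_time = 0, 0.0, 0.0
    for _ in range(n_transitions):
        total_reward += REWARD[state]
        # The transition time is itself random (exponential is an assumption).
        total_time += rng.expovariate(1.0 / MEAN_TIME[state])
        state = 0 if rng.random() < P[state][0] else 1
    return total_reward / total_time


def exact_average_reward():
    """Renewal-reward ratio: sum(pi_i * r_i) / sum(pi_i * tau_i)."""
    p01, p10 = P[0][1], P[1][0]
    # Stationary distribution of a two-state embedded Markov chain.
    pi = [p10 / (p01 + p10), p01 / (p01 + p10)]
    expected_reward = sum(pi[i] * REWARD[i] for i in range(2))
    expected_time = sum(pi[i] * MEAN_TIME[i] for i in range(2))
    return expected_reward / expected_time


if __name__ == "__main__":
    print("simulated:", simulate_average_reward(200_000))
    print("exact:    ", exact_average_reward())
```

Dividing by elapsed time rather than by the number of transitions is what separates the SMDP average reward from the MDP one; in an MDP every transition takes unit time, so the two ratios coincide.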
Indexed in: CNKI, VIP (维普), Wanfang Data (万方数据), SpringerLink, and other databases.
Source: 《控制理论与应用(英文版)》 (Control Theory and Applications, English edition)