Hierarchical power management of a system with autonomously power-managed components using reinforcement learning |
| |
Affiliation: | 1. Carthage University, MMA Laboratory, Institut National des Sciences Appliquées et de Technologie, Centre Urbain Nord, B.P. 676, Tunis, Cedex 1080, Tunisia; 2. Department of Electrical Engineering, University of Southern California, Los Angeles, CA, USA; 3. Department of Electrical & Computer Engineering, Faculty of Engineering, King Abdulaziz University, P.O. Box 21589, Jeddah 21589, Saudi Arabia |
| |
Abstract: | This paper presents a hierarchical dynamic power management (DPM) framework based on a reinforcement learning (RL) technique, which aims at power savings in a computer system with multiple I/O devices running a number of heterogeneous applications. The proposed framework interacts with the CPU scheduler to perform effective application-level scheduling, thereby enabling further power savings. Moreover, it handles non-stationary workloads and differentiates between the service request generation rates of the various software applications. The online adaptive DPM technique consists of two layers: a component-level local power manager and a system-level global power manager. The component-level PM policy is pre-specified and fixed, whereas the system-level PM employs temporal difference learning on a semi-Markov decision process as the model-free RL technique and is specifically optimized for a heterogeneous application pool. Experiments show that the proposed approach considerably enhances power savings while maintaining good performance levels. In comparison with other reference systems, the proposed RL-based DPM approach further enhances power savings, performs well under various workloads, can simultaneously consider power and performance, and achieves wide and deep power-performance tradeoff curves. Experiments conducted with multiple service providers confirm that a maximum energy saving of up to 63% per service provider can be achieved. |
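The system-level power manager described above can be illustrated with a minimal sketch of model-free temporal-difference learning over a semi-Markov decision process. All names, states, actions, and the reward model below are illustrative assumptions, not taken from the paper; the key SMDP detail is that decision epochs are separated by variable-length sojourn times, so the discount factor depends on how long the system stayed in the previous state.

```python
import math
import random

# Hypothetical states and actions for a single I/O device; the paper's actual
# state and action spaces are not specified here.
STATES = ["busy", "idle_short", "idle_long"]
ACTIONS = ["active", "sleep"]

class SMDPPowerManager:
    """Sketch of a model-free TD learner on an SMDP (assumed Q-learning form)."""

    def __init__(self, alpha=0.1, beta=0.5):
        self.alpha = alpha  # learning rate
        self.beta = beta    # continuous-time discount rate
        self.q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

    def choose(self, state, epsilon=0.1):
        # Epsilon-greedy selection of the power command for this state.
        if random.random() < epsilon:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, s, a, reward, s_next, sojourn):
        # SMDP TD update: discount by exp(-beta * sojourn), since the time
        # between decision epochs is a random variable rather than one step.
        gamma = math.exp(-self.beta * sojourn)
        best_next = max(self.q[(s_next, a2)] for a2 in ACTIONS)
        td_error = reward + gamma * best_next - self.q[(s, a)]
        self.q[(s, a)] += self.alpha * td_error
```

In use, the global power manager would observe the device/workload state at each decision epoch, issue a command via `choose`, and call `update` with the observed reward (e.g., a weighted combination of energy saved and latency penalty) and the elapsed sojourn time.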
| |
Keywords: | Power management; Reinforcement learning; Temporal difference learning; Semi-Markov decision process |
This document is indexed in ScienceDirect and other databases.
|