ROUTE: run‐time robust reducer workload estimation for MapReduce
Authors: Zhihong Liu, Qi Zhang, Raouf Boutaba, Yaping Liu, Zhenghu Gong
Affiliation: 1. College of Computer, National University of Defense Technology, Changsha, Hunan, China; 2. David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada; 3. Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada
Abstract: MapReduce has become a popular model for large‐scale data processing in recent years. Many works on MapReduce scheduling (e.g., load balancing and deadline‐aware scheduling) have emphasized the importance of predicting the workload received by individual reducers. However, because the input characteristics and user‐specified map function of a given job are unknown to the MapReduce framework before the job starts, accurately predicting the workload of reducers can be challenging. To address this challenge, we present ROUTE, a run‐time robust reducer workload estimation technique for MapReduce. ROUTE progressively samples the partition sizes of the early completed mappers, allowing it to perform estimation at run time while still meeting the accuracy requirement specified by users. Moreover, by using robust estimation and bootstrapping resampling techniques, ROUTE can achieve high applicability to a wide variety of applications. Through experiments using both real and synthetic data on an 11‐node Hadoop cluster, we show that ROUTE can achieve high accuracy, with an error rate of no more than 10.92% and an improvement of 40.6% in error rate when compared with the state‐of‐the‐art solution. In addition, through simulations using synthetic data, we show that ROUTE is robust to a variety of skewed distributions. Finally, we apply ROUTE to existing load balancing and deadline‐aware scheduling frameworks and show that ROUTE significantly improves the performance of these frameworks. Copyright © 2016 John Wiley & Sons, Ltd.
Keywords: MapReduce, Workload estimation, Load balancing, Task scheduling
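
The abstract names three ingredients: progressive sampling of partition sizes from early completed mappers, robust estimation, and bootstrap resampling. The following Python sketch illustrates how these ideas could fit together in general; it is not the authors' algorithm. The trimmed mean as the robust estimator, the percentile bootstrap interval, the stopping rule, and all function and parameter names (`trimmed_mean`, `bootstrap_interval`, `estimate_reducer_workload`, `max_rel_error`, `min_samples`) are assumptions made for illustration.

```python
import random
import statistics


def trimmed_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction (assumed robust estimator)."""
    if not values:
        raise ValueError("need at least one partition size")
    ordered = sorted(values)
    k = int(len(ordered) * trim)
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.fmean(kept)


def bootstrap_interval(values, estimator, rounds=500, alpha=0.05, rng=None):
    """Percentile bootstrap interval for `estimator` over `values`."""
    rng = rng or random.Random(0)
    n = len(values)
    stats = sorted(
        estimator([rng.choice(values) for _ in range(n)]) for _ in range(rounds)
    )
    lo = stats[int((alpha / 2) * rounds)]
    hi = stats[min(rounds - 1, int((1 - alpha / 2) * rounds))]
    return lo, hi


def estimate_reducer_workload(completed_sizes, total_mappers,
                              max_rel_error=0.1, min_samples=5, rng=None):
    """Estimate the total input of one reducer.

    completed_sizes: size of the partition each finished mapper produced for
    this reducer, in completion order (hypothetical input format).
    Returns (estimate, samples_used, converged).
    """
    rng = rng or random.Random(0)
    # Progressive sampling: grow the sample one finished mapper at a time and
    # stop once the bootstrap interval is tight enough for the requested error.
    for n in range(min_samples, len(completed_sizes) + 1):
        sample = completed_sizes[:n]
        centre = trimmed_mean(sample)
        lo, hi = bootstrap_interval(sample, trimmed_mean, rng=rng)
        if centre > 0 and (hi - lo) / 2 <= max_rel_error * centre:
            return centre * total_mappers, n, True
    # Accuracy target not reached: fall back to all available samples.
    centre = trimmed_mean(completed_sizes)
    return centre * total_mappers, len(completed_sizes), False
```

In this sketch the per-mapper estimate is scaled by `total_mappers` to obtain the reducer's expected total input, and the `converged` flag tells a scheduler whether the user-specified accuracy target was met before the completed mappers ran out.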