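A Reduce Scheduling Optimization Algorithm Based on a Delay Scheduling Policy
Cite this article: Shi Yilong, Lin Hong, Li Yuqiang, Wang Yan. A Reduce scheduling optimization algorithm based on a delay scheduling policy [J]. Application Research of Computers (计算机应用研究), 2017, 34(7).
Authors: Shi Yilong (石义龙), Lin Hong (林泓), Li Yuqiang (李玉强), Wang Yan (王彦)
Affiliation: School of Computer Science and Technology, Wuhan University of Technology (all four authors)
Funding: Hubei Provincial Natural Science Foundation (No. 2013CFB351)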
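Abstract: In large-scale Hadoop clusters, a good task scheduling strategy has an important effect on improving data locality, reducing network transfer overhead, shortening job execution time, and raising the cluster's job throughput. To address the low data locality of Reduce tasks in the Hadoop architecture, this paper proposes a Reduce task scheduling optimization algorithm based on a delay scheduling policy, which shortens job execution time and raises job throughput by improving the data locality of Reduce tasks. During the Early Shuffle phase of the Hadoop architecture, the algorithm applies a multi-level delay scheduling policy to improve Reduce task data locality. Finally, the algorithm is implemented by rewriting the native fair scheduler code and is compared experimentally with the native fair scheduler. The experimental results show that the algorithm markedly reduces job execution time and improves the cluster's job throughput.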

Keywords: Reduce task; data locality; delay scheduling; MapReduce task scheduling
Received: 2016/5/3
Revised: 2017/5/10

A Reduce scheduling optimization algorithm based on delay scheduling policy
Shi Yilong, Lin Hong, Li Yuqiang and Wang Yan. A Reduce scheduling optimization algorithm based on delay scheduling policy [J]. Application Research of Computers, 2017, 34(7).
Authors: Shi Yilong, Lin Hong, Li Yuqiang and Wang Yan
Affiliation: School of Computer Science and Technology, Wuhan University of Technology (all four authors)
Abstract: In large-scale Hadoop clusters, a good task scheduling strategy is important for improving data locality, reducing network transmission overhead, shortening job execution time, and raising job throughput. To address the low data locality of Reduce tasks in the Hadoop architecture, this paper proposes a Reduce task scheduling optimization algorithm based on a delay scheduling policy, which shortens job execution time and raises job throughput by improving the data locality of Reduce tasks. In the Early Shuffle phase, the algorithm uses a multi-level delay scheduling policy to improve Reduce task data locality. The authors implement the algorithm by rewriting the native fair scheduler code and compare it experimentally with the native fair scheduler. The experimental results show that the proposed algorithm significantly reduces job execution time and improves job throughput.
Keywords: Reduce task; data locality; delay scheduling; MapReduce task scheduling
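
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea behind multi-level delay scheduling, not the authors' implementation. It assumes that a pending Reduce task knows which nodes and racks hold most of its map output, and that the scheduler counts declined slot offers rather than elapsed time. All names (`ReduceTask`, `offer_slot`, `node_wait`, `rack_wait`) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass

# Locality levels, ordered from best (0) to worst (2).
NODE_LOCAL, RACK_LOCAL, ANY = 0, 1, 2


@dataclass
class ReduceTask:
    """A pending Reduce task with assumed locality hints (hypothetical)."""
    task_id: str
    preferred_nodes: set   # nodes assumed to hold most of its map output
    preferred_racks: set   # racks containing those nodes
    skipped: int = 0       # slot offers declined so far


def allowed_level(skipped, node_wait, rack_wait):
    """Worst locality level the task may accept after `skipped` declines."""
    if skipped < node_wait:
        return NODE_LOCAL
    if skipped < node_wait + rack_wait:
        return RACK_LOCAL
    return ANY


def locality_of(task, node, rack):
    """Locality level that the offered (node, rack) slot gives this task."""
    if node in task.preferred_nodes:
        return NODE_LOCAL
    if rack in task.preferred_racks:
        return RACK_LOCAL
    return ANY


def offer_slot(task, node, rack, node_wait=3, rack_wait=3):
    """Return True if the task launches on the offered slot.

    Otherwise the offer is declined and counted, so the task gradually
    relaxes from node-local to rack-local to any slot.
    """
    limit = allowed_level(task.skipped, node_wait, rack_wait)
    if locality_of(task, node, rack) <= limit:
        task.skipped = 0
        return True
    task.skipped += 1
    return False
```

With the default thresholds, a task declines up to three offers while waiting for a node-local slot, then up to three more while waiting for a rack-local slot, and after that accepts any slot. The thresholds trade waiting time against locality and would need tuning for a real cluster.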