
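Hadoop local task scheduling optimization algorithm based on a Logistic regression model
Cite this article: Shuai Renjun, Shen Yang, Chen Ping, Pan Jing, Dong Yanan. Hadoop local task scheduling optimization algorithm based on a Logistic regression model[J]. Application Research of Computers, 2017, 34(3).
Authors: Shuai Renjun, Shen Yang, Chen Ping, Pan Jing, Dong Yanan
Affiliations: School of Computer Science and Technology, Nanjing Technology University; School of Computer Science and Technology, Nanjing Technology University; Nanjing Health Information Center; School of Computer Science and Technology, Nanjing Technology University; School of Computer Science and Technology, Nanjing Technology University
Funding: National Public Welfare Research Special Project (201310162, 201210022); Lianyungang Science and Technology Support Program (SH1110)
Abstract: When a worker node has several local tasks available to run, the default scheduler executes them in the order in which they were discovered, which is inefficient. To optimize the scheduling of local tasks, this paper proposes a Hadoop local task scheduling optimization algorithm based on a Logistic regression model. First, feature vectors related to the tasks are selected and defined. The weight of each vector is then learned with Logistic regression, the tasks are ranked by priority, and the model is updated continuously through overload rules. Experiments show that the proposed algorithm improves the data locality of map tasks while reducing job running time.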

Keywords: Hadoop; MapReduce; local scheduling; task priority; overload rules; Logistic regression model
Received: 2016/2/28
Revised: 2017/1/15

Hadoop local task scheduling optimization algorithm based on a Logistic regression model
Shuai Renjun, Shen Yang, Chen Ping, Pan Jing and Dong Yanan. Hadoop local task scheduling optimization algorithm based on a Logistic regression model[J]. Application Research of Computers, 2017, 34(3).
Authors: Shuai Renjun, Shen Yang, Chen Ping, Pan Jing and Dong Yanan
Affiliation: School of Computer Science and Technology, Nanjing Technology University; School of Computer Science and Technology, Nanjing Technology University; Nanjing Health Information Center; School of Computer Science and Technology, Nanjing Technology University; School of Computer Science and Technology, Nanjing Technology University
Abstract: When a TaskTracker has multiple local tasks available, the default scheduler executes them in the order in which they were found, which is inefficient. To optimize local task scheduling, this paper presents a Hadoop local task scheduling optimization algorithm based on a Logistic regression model. First, feature vectors related to the local tasks are selected and defined. These vectors are then trained with a Logistic regression model to obtain the weight of each vector, the weights are used to decide task priority, and the model is updated constantly by the overload rules. The experimental results show that the proposed algorithm improves map task data locality while reducing job running time.
Keywords: Hadoop; MapReduce; local task scheduling; task priority; overload rules; Logistic regression model
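
The pipeline described in the abstract (score each local task with a Logistic regression model, rank by score, keep updating the model) can be illustrated with a minimal Python sketch. The abstract does not list the paper's features or define its overload rules, so everything concrete here is an assumption: the two-element feature vectors, the function names, and the update step, which is an ordinary stochastic gradient step on the log-loss with a hypothetical binary label rather than the authors' overload rules.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def priority(weights, bias, features):
    # Logistic regression score in (0, 1); a higher score means "run sooner".
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def rank_local_tasks(weights, bias, tasks):
    # tasks maps a task id to its feature vector (hypothetical features).
    # Returns task ids ordered from highest to lowest priority.
    return sorted(tasks, key=lambda t: priority(weights, bias, tasks[t]),
                  reverse=True)

def sgd_update(weights, bias, features, label, lr=0.1):
    # One gradient step on the log-loss for a single observation.
    # The label (0 or 1) stands in for whatever feedback signal is used;
    # it is an assumption, not the paper's overload rule.
    err = priority(weights, bias, features) - label
    new_weights = [w - lr * err * x for w, x in zip(weights, features)]
    return new_weights, bias - lr * err

# Hypothetical local tasks on one node, each with two made-up features.
tasks = {"t1": [0.2, 0.9], "t2": [0.8, 0.1], "t3": [0.5, 0.5]}
order = rank_local_tasks([2.0, -1.0], 0.0, tasks)
print(order)  # ['t2', 't3', 't1']
```

Because the logistic function is monotonic, the ranking depends only on the weighted sum of the features. The probability form matters mainly for training, where it defines the log-loss gradient used in `sgd_update`.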