
1.

A Meta-Brokering Framework for Science Gateways

Krisztian Karoczkai Attila Kertesz Peter Kacsuk 《Journal of Grid Computing》2016,14(4):687-703

Scientific communities have recently been producing a growing number of computation-intensive applications, which calls for the interoperation of distributed infrastructures including Clouds, Grids and private clusters. The European SHIWA and ER-flow projects have enabled the combination of heterogeneous scientific workflows, and their execution in a large-scale system consisting of multiple Distributed Computing Infrastructures. One of the resource management challenges of these projects is parameter study job scheduling. A parameter study job of a workflow generally has a large number of input files to be consumed by independent job instances. In this paper we propose a meta-brokering framework for science gateways to support the execution of such workflows. In order to cope with the high uncertainty and unpredictable load of the utilized distributed infrastructures, we introduce so-called resource priority services. These tools are capable of determining and dynamically updating the priorities of the available infrastructures to be selected for job instances. Our evaluations show that this approach distributes job instances efficiently among the available computing resources, resulting in a shorter makespan for parameter study workflows.

2.

A Set of Successive Job Allocation Models in Distributed Computing Infrastructures

Gábor Bacsó Tamás Kis Ádám Visegrádi Attila Kertész Zsolt Németh 《Journal of Grid Computing》2016,14(2):347-358

The growing number of scientific computation-intensive applications calls for an efficient utilization of large-scale, potentially interoperable distributed infrastructures. Parameter sweep applications represent a large body of workflows. While the principle of workflows is easy to conceive, their execution is very complex and no universally accepted solution exists. In this paper we focus on the resource allocation challenges of parameter study jobs in distributed computing infrastructures. To cope with this NP-hard problem and the high uncertainty present in these systems, we propose a series of job allocation models that help refine and simplify the problem. In this way we present some special cases that are polynomial and show how more complex scenarios can be reduced to these models. It is known from practice that a small number of job sizes improves the result of job allocation; therefore we state a hypothesis relying on this fact in one of our models. Unfortunately, the reduction of the general problem (using K-means clustering) did not help, and thus the hypothesis proved to be false. In the future, we shall look for clustering techniques which fit this goal better.

3.

Multi-Job Execution Scheduling Strategies and Optimization in Distributed Environments

季航旭 姜苏 赵宇海 吴刚 王国仁 《计算机工程与科学》2021,43(6):951-961

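Distributed big data computing engines are indispensable tools for research institutions, Internet companies and government departments that process large-scale data; their use and adoption have promoted rapid development in many fields and contributed greatly to social progress. However, when handling multiple jobs, the current mainstream big data computing engines still have many shortcomings in resource allocation and job scheduling: they usually divide memory resources evenly among the jobs and schedule jobs in first-in-first-out (FIFO) order, and such simple resource partitioning and job scheduling mechanisms cannot make full use of system performance. To address this problem, improvements are made at the job level of the computing engine. For resource partitioning, the workload of each job is estimated by extracting job features, the difference between a job's workload and its pre-allocated resources is assessed, and jobs that waste a large share of cluster resources are merged so that computing resources are fully used. For job scheduling, features are extracted from the jobs in the job pool, a multi-way K-means algorithm is used to cluster the jobs, and then, based on the clustering results, a self-balancing round-robin scheduling algorithm schedules the jobs to achieve load balancing. To verify the effectiveness of the proposed algorithms, comparative experiments were carried out in a distributed cluster environment using large-scale text datasets. The results show that the proposed job merging and multi-job scheduling algorithms reduce job running time by 5%~23% and increase system throughput by 7.5%~29%, and in the best case reduce the number of threads started by 40%.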

4.

Data-Locality Aware Scientific Workflow Scheduling Methods in HPC Cloud Environments

Jieun Choi Theodora Adufu Yoonhee Kim 《International journal of parallel programming》2017,45(5):1128-1141

Efficient data-aware methods in job scheduling, distributed storage management and data management platforms are necessary for successful execution of data-intensive applications. However, research on methods for data-intensive scientific applications is insufficient in large-scale distributed cloud and cluster computing environments, and data-aware methods are becoming more complex. In this paper, we propose a Data-Locality Aware Workflow Scheduling (D-LAWS) technique and a locality-aware resource management method for data-intensive scientific workflows in HPC cloud environments. D-LAWS applies data-locality and data transfer time based on network bandwidth to scientific workflow task scheduling and balances resource utilization and parallelism of tasks at the node-level. Our method consolidates VMs and considers task parallelism by data flow during the planning of task executions of a data-intensive scientific workflow. We additionally consider more complex workflow models and data locality pertaining to the placement and transfer of data prior to task executions. We implement and validate the methods based on fairness in cloud environments. Experimental results show that the proposed methods can improve performance and data-locality of data-intensive workflows in cloud environments.

5.

Multi-QoS constrained and Profit-aware scheduling approach for concurrent workflows on heterogeneous systems

《Future Generation Computer Systems》2017

The execution of a workflow application can result in an imbalanced workload among allocated processors, ultimately resulting in a waste of resources and a higher cost to the user. Here, we consider a dynamic resource management system in which processors are reserved not for a job but only to run a task, thus allowing a higher resource usage rate. This paper presents a scheduling algorithm that manages concurrent workflows in a dynamic environment in which jobs are submitted by users at any moment in time, on shared heterogeneous resources, and constrained to a specified budget and deadline for each job. Recent research attempted to propose dynamic strategies for concurrent workflows but only addressed fairness in resource sharing among applications while minimizing the execution time. The Multi-QoS Profit-Aware scheduling algorithm (MQ-PAS) proposed here is able to increase the profit achieved by the provider by considering the budget available for each job to define task priorities. We study the scalability of the algorithm with different types of workflows and infrastructures. The experimental results show that our strategy improves provider revenue significantly and obtains comparable success rates of completed jobs.

6.

Scheduling parameter sweep applications on global Grids: a deadline and budget constrained cost–time optimization algorithm

Rajkumar Buyya Manzur Murshed David Abramson Srikumar Venugopal 《Software》2005,35(5):491-512

Computational Grids and peer‐to‐peer (P2P) networks enable the sharing, selection, and aggregation of geographically distributed resources for solving large‐scale problems in science, engineering, and commerce. The management and composition of resources and services for scheduling applications, however, becomes a complex undertaking. We have proposed a computational economy framework for regulating the supply of and demand for resources and allocating them for applications based on the users' quality‐of‐service requirements. The framework requires economy‐driven deadline‐ and budget‐constrained (DBC) scheduling algorithms for allocating resources to application jobs in such a way that the users' requirements are met. In this paper, we propose a new scheduling algorithm, called the DBC cost–time optimization scheduling algorithm, that aims not only to optimize cost, but also time when possible. The performance of the cost–time optimization scheduling algorithm has been evaluated through extensive simulation and empirical studies for deploying parameter sweep applications on global Grids. Copyright © 2005 John Wiley & Sons, Ltd.

7.

Online execution time prediction for computationally intensive applications with periodic progress updates

Maria Chtepen Filip H. A. Claeys Bart Dhoedt Filip De Turck Jan Fostier Piet Demeester Peter A. Vanrolleghem 《The Journal of supercomputing》2012,62(2):768-786

The effectiveness of distributed execution of computationally intensive applications (jobs) largely depends on the quality of the applied scheduling approach. However, most of the existing non-trivial scheduling algorithms rely on prior knowledge or on prediction of application parameters, such as execution time, size of input and output, dependencies, etc., to assign applications to the available computational resources. A major issue is that these parameters are hard to determine in advance, especially if the end user does not possess an extensive history of previous application runs. In this work we propose an online method for execution time prediction of applications, for which execution progress can be collected at run-time. Using dynamic progress information, the total job execution time can be predicted using extrapolation. However, the predictions achieved by extrapolation are far from precise and often vary over time as a result of changing application dynamics and varying resource load. Therefore, to compute the actual job execution time we match a number of predefined prediction evolution models against the consecutive extrapolations, by adopting nonlinear curve-fitting. The "best-fit" coefficients allow for more accurate execution time prediction. The predictions made are used to enhance a dynamic scheduling algorithm for workflows introduced in our earlier work. The scheduling algorithm is run with and without curve-fitting, showing a performance improvement of up to 15% in the former case.
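
The extrapolation step mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes plain linear extrapolation (constant progress rate); the function name `extrapolate_total_time` and the progress reports are hypothetical and are not taken from the paper, which goes further by fitting prediction evolution models to the consecutive estimates.

```python
def extrapolate_total_time(elapsed_seconds: float, progress: float) -> float:
    """Estimate total run time from one progress report.

    Assumes a constant progress rate: if a fraction `progress` of the work
    took `elapsed_seconds`, the whole job takes elapsed_seconds / progress.
    """
    if not 0.0 < progress <= 1.0:
        raise ValueError("progress must be in (0, 1]")
    return elapsed_seconds / progress


# Hypothetical periodic progress updates: (elapsed seconds, fraction done).
updates = [(10.0, 0.0625), (20.0, 0.125), (40.0, 0.25), (60.0, 0.5)]

# One estimate per update; the last one differs because the job sped up.
estimates = [extrapolate_total_time(t, p) for t, p in updates]
print(estimates)
```

The estimates change once the progress rate changes, which is the instability the abstract attributes to simple extrapolation.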

8.

Campus Grids Meet Applications: Modeling, Metascheduling and Integration

Yonghong Yan Barbara M. Chapman 《Journal of Grid Computing》2006,4(2):159-175

Air Quality Forecasting (AQF) is a new discipline that attempts to reliably predict atmospheric pollution. An AQF application has complex workflows and in order to produce timely and reliable forecast results, each execution requires access to diverse and distributed computational and storage resources. Deploying AQF on Grids is one option to satisfy such needs, but requires the related Grid middleware to support automated workflow scheduling and execution on Grid resources. In this paper, we analyze the challenges in deploying an AQF application in a campus Grid environment and present our current efforts to develop a general solution for Grid-enabling scientific workflow applications in the GRACCE project. In GRACCE, an application’s workflow is described using GAMDL, a powerful dataflow language for describing application logic. The GRACCE metascheduling architecture provides the functionalities required for co-allocating Grid resources for workflow tasks, scheduling the workflows and monitoring their execution. By providing an integrated framework for modeling and metascheduling scientific workflow applications on Grid resources, we make it easy to build a customized environment with end-to-end support for application Grid deployment, from the management of an application and its dataset, to the automatic execution and analysis of its results. The work has been performed as part of the University of Houston’s Sun Microsystems Center of Excellence in Geosciences [38].

9.

An SCP-based heuristic approach for scheduling distributed data-intensive applications on global grids (cited by: 1 in total; self-citations: 0; citations by others: 1)

Srikumar Venugopal Rajkumar Buyya 《Journal of Parallel and Distributed Computing》2008

Data-intensive Grid applications need access to large data sets that may each be replicated on different resources. Minimizing the overhead of transferring these data sets to the resources where the applications are executed requires that appropriate computational and data resources be selected. In this paper, we consider the problem of scheduling an application composed of a set of independent tasks, each of which requires multiple data sets that are each replicated on multiple resources. We break this problem into two parts: one, to match each task (or job) to one compute resource for executing the job and one storage resource each for accessing each data set required by the job and two, to assign the set of tasks to the selected resources. We model the first part as an instance of the well-known Set Covering Problem (SCP) and apply a known heuristic for SCP to match jobs to resources. The second part is tackled by extending existing MinMin and Sufferage algorithms to schedule the set of distributed data-intensive tasks. Through simulation, we experimentally compare the SCP-based matching heuristic to others in conjunction with the task scheduling algorithms and present the results.
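
For readers unfamiliar with the MinMin heuristic named in this abstract, the following Python sketch shows the classic form for independent tasks. It is not the extended, data-aware version the paper develops: data-set transfer times are ignored, and the function name `min_min` and the run-time table `etc` are hypothetical illustrations.

```python
def min_min(etc: dict[str, dict[str, float]]) -> tuple[dict[str, str], float]:
    """Classic Min-Min for independent tasks.

    etc[task][resource] is the expected run time of `task` on `resource`.
    Returns (task -> resource assignment, makespan).
    """
    # Time at which each resource becomes free.
    ready = {res: 0.0 for times in etc.values() for res in times}
    unscheduled = set(etc)
    assignment: dict[str, str] = {}
    while unscheduled:
        best = None  # (completion time, task, resource)
        # Pick the task/resource pair with the smallest completion time,
        # i.e. the task whose minimum completion time is the lowest.
        for task in sorted(unscheduled):
            for res, run_time in etc[task].items():
                finish = ready[res] + run_time
                if best is None or finish < best[0]:
                    best = (finish, task, res)
        finish, task, res = best
        assignment[task] = res
        ready[res] = finish
        unscheduled.remove(task)
    return assignment, max(ready.values(), default=0.0)


# Hypothetical expected run times (seconds) on two resources, A and B.
etc = {
    "t1": {"A": 4, "B": 6},
    "t2": {"A": 3, "B": 5},
    "t3": {"A": 8, "B": 4},
}
assignment, makespan = min_min(etc)
print(assignment, makespan)
```

Scheduling the shortest-finishing task first tends to keep resources evenly loaded for small tasks, but it can delay long tasks; the Sufferage heuristic mentioned alongside it orders tasks differently to reduce that effect.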

10.

Towards the Scheduling of Multiple Workflows on Computational Grids

Luiz Fernando Bittencourt Edmundo R. M. Madeira 《Journal of Grid Computing》2010,8(3):419-441

The workflow paradigm has become the standard to represent processes and their execution flows. With the evolution of e-Science, workflows are becoming larger and more computationally demanding. Such e-Science necessities match what computational Grids have to offer. Grids are shared distributed platforms which will eventually receive multiple requests to execute workflows. This creates a demand for a scheduler which deals with multiple workflows on the same set of resources; thus the development of multiple workflow scheduling algorithms is necessary. In this paper we describe four different initial strategies for scheduling multiple workflows on Grids and evaluate them in terms of schedule length and fairness. We present results for the initial schedule and for the makespan after the execution with external load. From the results we conclude that interleaving the workflows on the Grid leads to good average makespan and provides fairness when multiple workflows share the same set of resources.

11.

History-dependent scheduling: Models and algorithms for scheduling with general precedence and sequence dependence

《Computers & Operations Research》2015

In this paper, we extend job scheduling models to include aspects of history-dependent scheduling, where setup times for a job are affected by the aggregate activities of all predecessors of that job. Traditional approaches to machine scheduling typically address objectives and constraints that govern the relative sequence of jobs being executed using available resources. This paper optimises the operations of multiple unrelated resources to address sequential and history-dependent job scheduling constraints along with time window restrictions. We denote this consolidated problem as the general precedence scheduling problem (GPSP). We present several applications of the GPSP and show that many problems in the literature can be represented as special cases of history-dependent scheduling. We design new ways to model this class of problems and then proceed to formulate it as an integer program. We develop specialized algorithms to solve such problems. An extensive computational analysis over a diverse family of problem data instances demonstrates the efficacy of the novel approaches and algorithms introduced in this paper.

12.

The Virtual Laboratory: a toolset to enable distributed molecular modelling for drug design on the World‐Wide Grid

Rajkumar Buyya Kim Branson Jon Giddy David Abramson 《Concurrency and Computation》2003,15(1):1-25

Computational Grids are emerging as a new paradigm for sharing and aggregation of geographically distributed resources for solving large‐scale compute and data intensive problems in science, engineering and commerce. However, application development, resource management and scheduling in these environments is a complex undertaking. In this paper, we illustrate the development of a Virtual Laboratory environment by leveraging existing Grid technologies to enable molecular modelling for drug design on geographically distributed resources. It involves screening millions of compounds in the chemical database (CDB) against a protein target to identify those with potential use for drug design. We have used the Nimrod‐G parameter specification language to transform the existing molecular docking application into a parameter sweep application for executing on distributed systems. We have developed new tools for enabling access to ligand records/molecules in the CDB from remote resources. The Nimrod‐G resource broker along with molecule CDB data broker is used for scheduling and on‐demand processing of docking jobs on the World‐Wide Grid (WWG) resources. The results demonstrate the ease of use and power of the Nimrod‐G and virtual laboratory tools for grid computing. Copyright © 2003 John Wiley & Sons, Ltd.

13.

A Resource Management Method for High-Performance Computing Systems with a Distributed Object Storage Architecture

卢宇彤 杨学军 《计算机研究与发展》2009,46(Z2)

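Resource management and scheduling in current high-performance computing (HPC) systems focus on computing resources; however, as HPC systems grow in scale and computing power, the I/O bottleneck is becoming increasingly prominent. Because the diversity of storage architectures in HPC systems makes storage resources hard to manage and allocate, current mainstream resource management systems do not yet schedule or manage I/O storage resources. With the development and wide use of object storage, most mainstream high-performance systems adopt distributed object storage systems. Studying the management of distributed object storage systems and combining it with the resource management system to achieve storage-oriented optimized job scheduling is of great significance for improving the actual performance of HPC systems. For HPC systems with a distributed object storage architecture, this paper studies a resource management method oriented to distributed storage that takes the I/O requirements of different applications into account during job scheduling and resource allocation. By building a distributed object storage resource model and an application I/O capability requirement model, and by assigning suitable storage resources to jobs according to their I/O application level during scheduling and allocation, a job scheduling and resource allocation algorithm based on I/O capability grading is designed and implemented. System tests show that the method can significantly improve application performance in multi-job environments, keep application performance stable, and increase system throughput.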

14.

Real-Time Scheduling with a Budget

Joseph Naor Hadas Shachnai Tami Tamir 《Algorithmica》2007,47(3):343-364

Suppose that we are given a set of jobs, where each job has a processing time, a non-negative weight, and a set of possible time intervals in which it can be processed. In addition, each job has a processing cost. Our goal is to schedule a feasible subset of the jobs on a single machine, such that the total weight is maximized, and the cost of the schedule is within a given budget. We refer to this problem as budgeted real-time scheduling (BRS). Indeed, the special case where the budget is unbounded is the well-known real-time scheduling problem. The second problem that we consider is budgeted real-time scheduling with overlaps (BRSO), in which several jobs may be processed simultaneously, and the goal is to maximize the time in which the machine is utilized. Our two variants of this real-time scheduling problem have important applications, in vehicle scheduling, linear combinatorial auctions, and Quality-of-Service management for Internet connections. These problems are the focus of this paper. Both BRS and BRSO are strongly NP-hard, even with unbounded budget. Our main results are (2 + ε)-approximation algorithms for these problems. This ratio coincides with the best known approximation factor for the (unbudgeted) real-time scheduling problem, and is slightly weaker than the best known approximation factor of e/(e - 1) for the (unbudgeted) real-time scheduling with overlaps, presented in this paper. We show that better ratios (or simpler approximation algorithms) can be derived for some special cases, including instances with unit-costs and the budgeted job interval selection problem (JISP). Budgeted JISP is shown to be APX-hard even when overlaps are allowed and with unbounded budget. Finally, our results can be extended to instances with multiple machines.

15.

Predictive resource management for meta-applications

N. Floros A. J. G. Hey K. E. Meacham J. Papay M. Surridge 《Future Generation Computer Systems》1999,15(5-6):723-734

This paper defines meta-applications as large, related collections of computational tasks, designed to achieve a specific overall result, running on a (possibly geographically) distributed, non-dedicated meta-computing platform. To carry out such applications in an industrial context, one requires resource management and job scheduling facilities (including capacity planning), to ensure that the application is feasible using the available resources, that each component job will be sent to an appropriate resource, and that everything will finish before the computing resources are needed for other purposes.

This requirement has been addressed by the PAC in three major European collaborative projects: PROMENVIR, TOOLSHED and HPC-VAO, leading to the creation of job scheduling software, in which scheduling is brought together with performance modelling of applications and systems, to provide meta-applications management facilities. This software is described, focusing on the performance modelling approach which was needed to support it.

Early results from this approach are discussed, raising some new issues in performance modelling and software deployment for meta-applications. An indication is given about ongoing work at the PAC designed to overcome current limitations and address these outstanding issues.

16.

Distributed bottleneck control for repetitive production systems (cited by: 1 in total; self-citations: 1; citations by others: 0)

ZBIGNIEW BANASZAK 《Journal of Intelligent Manufacturing》1997,8(5):415-424

A bottleneck control problem for general periodic job shops with blocking where each machine has an input buffer of finite capacity is investigated. The job shop considered consists of a set of workflows competing with each other for access to common machines. A distributed buffer control policy that restricts a job entering an input buffer of a local machine in a specific sequence is proposed. The conditions sufficient for design and allocation of dispatching rules are presented. The system time and the rate of machine utilization are considered as the evaluation criteria. Finally, the procedure aimed at scheduling periodic job shops is provided.

17.

Distributed Late-binding Scheduling and Cooperative Data Caching

Antonio Delgado Peris José M. Hernández Eduardo Huedo 《Journal of Grid Computing》2017,15(2):235-256

Pull-based overlays are used in some of today’s largest computational grids. Job agents are submitted to resources with the duty of retrieving real workload from a central queue at runtime and executing it. This model helps overcome the problems of direct job submission in the highly complex grid environments, namely, heterogeneity, imprecise status information, relatively high failure rates and slow adaptation to changes of grid conditions or user priorities. This article presents a distributed scheduling architecture for such late-binding overlays. In this architecture, execution nodes share a distributed hash table and cooperatively perform job assignment. As our experiments prove, scalability problems of centralized matching are avoided, achieving low and predictable scheduling overheads even for execution of large workflows, and total turnaround times are improved. This is in line with the predictions of a theoretical model of grid workflow execution that the article also discusses. Scalability makes fine-grained scheduling possible and enables new functionalities, like a distributed data cache shared by the execution nodes, which helps alleviate the commonly congested storage services. In addition, we show that our system is more resilient to problems like communication breakdowns between computation centres. Moreover, the new architecture is better prepared to deal with demanding scenarios like intense demand of popular data files or remote data processing.

18.

Task-resource scheduling problem

Anna Gorbenko Vladimir Popov 《国际自动化与计算杂志》2012,9(4):429-441

Cloud computing is a new and rapidly emerging computing paradigm where applications, data and IT services are provided over the Internet. Task-resource management plays a key role in cloud computing systems. Task-resource scheduling problems are of primary importance because they affect the efficiency of the whole cloud computing facility. The task-resource scheduling problem is NP-complete. In this paper, we consider an approach to solve this problem optimally. This approach is based on constructing a logical model for the problem. Using this model, we can apply algorithms for the satisfiability problem (SAT) to solve the task-resource scheduling problem. Also, this model allows us to create a testbed for particle swarm optimization algorithms for scheduling workflows.

19.

Double Auction-based Scheduling of Scientific Applications in Distributed Grid and Cloud Environments (cited by: 1 in total; self-citations: 0; citations by others: 1)

Radu Prodan Marek Wieczorek Hamid Mohammadi Fard 《Journal of Grid Computing》2011,9(4):531-548

Economy models have long been considered a promising complement to classical distributed resource management, not only due to their dynamic and decentralized nature, but also because the concept of financial valuation of resources and services is an inherent part of any such model. In its broadest sense, scheduling of scientific applications in distributed Grid and Cloud environments can be regarded as a market-based negotiation between a scheduling service optimizing user-centric objectives (execution time, budget), and a resource manager optimizing provider-centric metrics (resource utilization, income, job throughput). In this paper, we propose a new instantiation of the negotiation protocol between the scheduler and resource manager using a market-based Continuous Double Auction (CDA) model. We analyze different scheduling strategies that can be applied and identify general strategic patterns that can lead to a fast and cheap workflow execution. In the experimental study, we demonstrate that under certain circumstances one can benefit by applying an aggressive scheduling strategy.

20.

Wingrid: A Grid Computing System for Parameter Sweep Applications

刘文懋 张伟哲 张宏莉 《计算机工程与应用》2006,(Z1)

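Parameter sweep applications play a very important role in computational grid environments. In the Wingrid project, we propose and implement an adaptive scheduling mechanism for parameter sweep applications. The scheduling infrastructure of clients, master nodes and slave nodes, together with a leader-node-based communication system, improves scheduling efficiency. We also compare the adaptive workqueue algorithm with standard heuristic scheduling algorithms. Experimental results show that under large network latency the heuristic scheduling algorithms are more efficient than the workqueue algorithm, and that among the heuristics the min-min heuristic yields the shortest task completion time.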