Similar Documents
20 similar documents found (search time: 62 ms)
1.
Although various strategies have been developed for scheduling parallel applications with independent tasks, very little work exists for scheduling tightly coupled parallel applications on cluster environments. In this paper, we compare four different strategies, based on performance models of tightly coupled parallel applications, for scheduling the applications on clusters. In addition to algorithms based on existing popular optimization techniques, we also propose a new algorithm, called Box Elimination, that searches the space of performance model parameters to determine the best schedule of machines. By means of real and simulation experiments, we evaluated the algorithms on single-cluster and multi-cluster setups. We show that our Box Elimination algorithm generates up to 80% more efficient schedules than the other algorithms, and that the execution times of the schedules it produces are more robust against performance modeling errors. Copyright © 2009 John Wiley & Sons, Ltd.
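The abstract gives no pseudocode for Box Elimination; as a hedged illustration of the general idea, the Python sketch below performs a branch-and-bound search that recursively splits a box of candidate machine counts and eliminates boxes whose optimistic lower bound cannot beat the best schedule found so far. The cost model and all constants are assumptions, not the paper's.

```python
# Minimal box-elimination-style search (hypothetical, not the published
# algorithm): split the box of machine counts, prune boxes whose
# optimistic prediction cannot beat the incumbent schedule.

WORK, DATA, SPEED, BW = 1e12, 1e9, 1e9, 1e8   # flops, bytes, flop/s, B/s (toy)

def predict_time(m):
    """Toy model: compute shrinks with m, communication grows with m."""
    return WORK / (m * SPEED) + DATA * m / BW

def lower_bound(m_lo, m_hi):
    """Bound each additive term separately: a valid bound over the box."""
    return WORK / (m_hi * SPEED) + DATA * m_lo / BW

def box_elimination(m_lo, m_hi, best=(float("inf"), None)):
    if lower_bound(m_lo, m_hi) >= best[0]:     # eliminate the whole box
        return best
    if m_hi - m_lo <= 3:                       # small box: enumerate it
        for m in range(m_lo, m_hi + 1):
            if (t := predict_time(m)) < best[0]:
                best = (t, m)
        return best
    mid = (m_lo + m_hi) // 2                   # otherwise split and recurse
    best = box_elimination(m_lo, mid, best)
    return box_elimination(mid + 1, m_hi, best)

t, m = box_elimination(1, 1024)
print(f"best predicted time {t:.1f} s on {m} machines")
```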

2.
Distributed applications executing in clustered environments typically share resources (computers and network links) with other applications. In such systems, application execution may be slowed by competition for these shared resources. In this paper, we define a model that calculates the slowdown imposed on applications in time-shared multi-user clusters. Our model focuses on three kinds of slowdown: local slowdown, which synthesizes the effect of contention for the CPU in a single workstation; communication slowdown, which synthesizes the effect of contention for the workstations and network links on communication costs; and aggregate slowdown, which determines the effect of contention on a parallel task caused by other applications executing on the entire cluster, i.e., on the nodes used by the parallel application. We verify empirically that this model provides an accurate estimate of application performance for a set of compute-intensive parallel applications on different clusters with a variety of emulated loads.
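As a hedged illustration of how the three slowdowns might compose (the formulas below are assumptions made for the sketch, not the paper's calibrated model), local and communication slowdowns can be expressed as reciprocal resource shares, and the aggregate slowdown of a bulk-synchronous task is gated by its slowest node:

```python
# Illustrative composition of the three slowdowns (assumed formulas,
# not the paper's calibrated model).

def local_slowdown(cpu_share: float) -> float:
    """CPU contention on one node: share 1.0 is dedicated; 0.5 -> 2x."""
    return 1.0 / cpu_share

def comm_slowdown(link_share: float) -> float:
    """Contention on the network links used by the application."""
    return 1.0 / link_share

def aggregate_slowdown(node_cpu_shares, link_share,
                       compute_frac=0.7, comm_frac=0.3):
    """Slowdown of a bulk-synchronous step: the slowest node gates it."""
    worst_local = max(local_slowdown(s) for s in node_cpu_shares)
    return compute_frac * worst_local + comm_frac * comm_slowdown(link_share)

# A 4-node run where one node also hosts a competing local job:
shares = [1.0, 1.0, 0.5, 1.0]
print(f"predicted slowdown: {aggregate_slowdown(shares, link_share=0.8):.2f}x")
```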

3.
This paper proposes a new platform for implementing services in future service-oriented architectures. The basic premise of our proposal is that by combining a large volume of uncontracted resources with small clusters of dedicated resources, we can dramatically reduce the amount of dedicated resources while the goodput provided by the overall system remains at a high level. This paper presents particular strategies for implementing this idea for a particular class of applications. We performed detailed simulations on synthetic and real traces to evaluate the performance of the proposed strategies. Our findings on compute-intensive applications show that preemptive reallocation of resources is necessary for assured services. The proposed preemption-based scheduling heuristic can significantly improve utilization of the dedicated resources by opportunistically offloading peak loads onto uncontracted resources, while keeping the service quality virtually unaffected.
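The preemption-based heuristic is only summarized above; the following Python sketch illustrates one assumption-level reading of it: assured jobs claim dedicated slots, preempting best-effort work, which is then restarted opportunistically on uncontracted machines.

```python
# Sketch of a preemption-based dispatch rule (an assumption-level
# illustration of the idea, not the paper's heuristic).

from collections import deque

DEDICATED_SLOTS = 4
dedicated = []                 # (job, assured?) currently on dedicated nodes
uncontracted = deque()         # best-effort queue for harvested machines

def dispatch(job: str, assured: bool) -> str:
    if len(dedicated) < DEDICATED_SLOTS:
        dedicated.append((job, assured))
        return f"{job}: dedicated"
    if assured:
        # Peak load: preempt a best-effort job and offload it.
        for i, (victim, v_assured) in enumerate(dedicated):
            if not v_assured:
                dedicated[i] = (job, True)
                uncontracted.append(victim)
                return f"{job}: dedicated (preempted {victim} -> uncontracted)"
        return f"{job}: queued (all dedicated slots hold assured jobs)"
    uncontracted.append(job)
    return f"{job}: uncontracted"

for j, a in [("be1", False), ("be2", False), ("svc1", True), ("svc2", True),
             ("svc3", True), ("be3", False)]:
    print(dispatch(j, a))
```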

4.
The abundant computing resources in current organizations provide new opportunities for executing parallel scientific applications and improving resource usage. The Enterprise Desktop Grid Computing (EDGC) paradigm addresses the potential for harvesting the idle computing resources of an organization's desktop PCs to support the execution of the company's large-scale applications. In these environments, the accuracy of response-time predictions is essential for effective metascheduling that maximizes resource usage without harming the performance of the parallel and local applications. However, this accuracy is a major challenge due to the heterogeneity and non-dedicated nature of EDGC resources. In this paper, two new prediction techniques are presented based on the state of resources. A thorough analysis by linear regression demonstrated that the proposed techniques capture the real behavior of the parallel applications better than other common techniques in the literature. Moreover, it is possible to reduce deviations with proper modeling of prediction errors; thus, a Self-adjustable Correction method (SAC) for detecting and correcting prediction deviations is proposed, with the ability to adapt to changes in load conditions. An extensive evaluation in a real environment was conducted to validate the SAC method. The results show that the use of SAC increases the accuracy of response-time predictions by 35%. The cost of predictions with self-correction and their accuracy in a real environment were analyzed using a combination of the proposed techniques. The results demonstrate that the cost of predictions is negligible and that the combined use of the prediction techniques is preferable.
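The abstract does not spell out SAC's mechanics; a minimal self-adjusting corrector can be sketched as exponential smoothing of recent prediction errors applied as an additive correction, with the smoothing weight increased when deviations grow (a proxy for changed load conditions). All constants below are assumptions.

```python
# Minimal self-adjusting correction sketch (assumed mechanics, not the
# published SAC): smooth recent prediction errors and fold the smoothed
# error back into the next response-time prediction.

class SelfAdjustingCorrector:
    def __init__(self, alpha: float = 0.3):
        self.alpha = alpha      # smoothing weight for new errors
        self.bias = 0.0         # running estimate of prediction error

    def correct(self, raw_prediction: float) -> float:
        return raw_prediction + self.bias

    def observe(self, raw_prediction: float, measured: float) -> None:
        error = measured - raw_prediction
        # Adapt faster when the deviation is large (load-condition change).
        a = min(1.0, self.alpha * (1 + abs(error) / max(measured, 1e-9)))
        self.bias = (1 - a) * self.bias + a * error

sac = SelfAdjustingCorrector()
history = [(10.0, 13.0), (10.0, 12.5), (11.0, 14.2)]  # (raw, measured) seconds
for raw, measured in history:
    print(f"corrected: {sac.correct(raw):5.2f}s, actual: {measured}s")
    sac.observe(raw, measured)
```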

5.
Metaschedulers co-allocate resources by requesting a fixed number of processors and a usage time for each cluster. These static requests, defined by users, limit the initial scheduling and prevent rescheduling of applications to other resource sets. It is also difficult for users to estimate application execution times, especially in heterogeneous environments. To overcome these problems, metaschedulers can use performance predictions for automatic resource selection. This paper proposes a resource co-allocation technique with rescheduling support, based on performance predictions, for multi-cluster iterative parallel applications. Iterative applications have been used to solve a variety of problems in science and engineering, more recently including large-scale computations based on the asynchronous model. We performed experiments using an iterative parallel application, consisting of benchmark multiobjective problems, with both synchronous and asynchronous communication models on Grid'5000. The results show run-time predictions with an average error of 7% and prevention of up to 35% and 57% of run-time overestimations to support rescheduling for the synchronous and asynchronous models, respectively. The performance predictions require no access to application source code. One of the main findings is that because the asynchronous model masks communication with computation, it requires no network information to predict execution times. By using our co-allocation technique, metaschedulers become responsible for run-time predictions, process mapping, and application rescheduling, relieving the user of these burdensome tasks.
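As a hedged sketch of a prediction-driven rescheduling decision (the cost model is an assumption; the abstract only states that the asynchronous model needs no network information because communication is masked):

```python
# Sketch of prediction-driven rescheduling for an iterative application
# (illustrative assumptions, not the paper's predictor).

def predict_runtime(iters_left, t_compute, t_comm, asynchronous):
    """Per-iteration cost; communication is masked in the async model."""
    per_iter = t_compute if asynchronous else t_compute + t_comm
    return iters_left * per_iter

def should_reschedule(current, candidate, migration_cost, margin=0.1):
    """Move only if the candidate wins even after paying migration."""
    return candidate + migration_cost < (1 - margin) * current

cur = predict_runtime(800, t_compute=0.50, t_comm=0.20, asynchronous=False)
alt = predict_runtime(800, t_compute=0.45, t_comm=0.05, asynchronous=False)
print(f"current: {cur:.0f}s, candidate: {alt:.0f}s,"
      f" reschedule: {should_reschedule(cur, alt, migration_cost=30)}")
```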

6.
A Grid Task Scheduling Model Based on Resource Prediction
Cheng Hongbing. Journal of Computer Applications, 2010, 30(9): 2530-2534.
Grid task scheduling across multiple domains (or clusters) of a virtual organization is an urgent problem in grid applications because of resource uncertainty (i.e., dynamism and heterogeneity). This paper proposes an effective resource-prediction-based grid task scheduling model, RPTS, which predicts host load in the grid environment using an autoregressive moving average (ARMA) method whose parameters are estimated by weighted least squares. Using these resource predictions together with a model of a class of data-parallel grid tasks, RPTS preprocesses and matches the tasks and schedules them for execution. RPTS fully accounts for the dynamism and heterogeneity of resources in grid environments and provides a sound approach to the grid task scheduling problem. A series of simulation experiments comparing RPTS with other grid task scheduling methods shows that RPTS achieves the shortest task execution times and good stability.
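To make the prediction step concrete, here is a hedged Python sketch that fits an autoregressive model to a host-load trace by weighted least squares, a simplification of the ARMA method the paper describes; the model order and decay weights are assumptions.

```python
# Hedged sketch of host-load prediction via a weighted-least-squares AR
# fit (a simplification of the paper's ARMA approach; order and weights
# are assumptions, not taken from the paper).

import numpy as np

def fit_ar_wls(load, p=3, decay=0.95):
    """Fit load[t] ~ c + sum_i a_i * load[t-i]; newer samples weigh more."""
    y = load[p:]
    X = np.column_stack([np.ones(len(y))] +
                        [load[p - i:-i] for i in range(1, p + 1)])
    w = decay ** np.arange(len(y) - 1, -1, -1)        # most recent -> 1.0
    coef, *_ = np.linalg.lstsq(np.sqrt(w)[:, None] * X,
                               np.sqrt(w) * y, rcond=None)
    return coef

def predict_next(load, coef, p=3):
    return coef[0] + coef[1:] @ load[-1:-p - 1:-1]    # lags 1..p

rng = np.random.default_rng(0)
trace = 0.5 + 0.3 * np.sin(np.arange(200) / 10) + 0.05 * rng.standard_normal(200)
coef = fit_ar_wls(trace)
print(f"next-step load forecast: {predict_next(trace, coef):.3f}")
```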

7.
In recent years, a variety of computational sites and resources have emerged, and users often have access to multiple distributed resources. These sites are heterogeneous in nature, and the performance of different tasks in a workflow varies from one site to another. Additionally, users typically have a limited resource allocation at each site, capped by administrative policies. In such cases, a judicious scheduling strategy is required to map tasks in the workflow to resources so that the workload is balanced among sites and data-transfer overhead is minimized. Most existing systems either run the entire workflow at a single site, use naïve approaches to distribute the tasks across sites, or leave it to the user to optimize the allocation of tasks to distributed resources. This results in a significant loss of productivity. We propose a multi-site workflow scheduling technique that uses performance models to predict the execution time on resources and dynamic probes to identify the achievable network throughput between sites. We evaluate our approach on real-world applications using the Swift parallel and distributed execution framework, in two distinct computational environments: geographically distributed multiple clusters and multiple clouds. We show that our approach improves resource utilization and reduces execution time compared to the default schedule.
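A hedged sketch of the placement decision described above: each task goes to the site minimizing predicted execution time plus input-transfer time over the probed throughput, subject to the site's remaining allocation. The cost model and numbers are assumptions, not taken from the Swift-based system.

```python
# Hedged sketch of the multi-site mapping decision (assumed cost model):
# place each task where predicted compute time plus input-transfer time
# is lowest, respecting each site's remaining allocation.

sites = {                      # probed/predicted values (made-up numbers)
    "clusterA": {"t_exec": 30.0, "mbps": 800.0, "quota": 3},
    "clusterB": {"t_exec": 45.0, "mbps": 200.0, "quota": 5},
    "cloudC":   {"t_exec": 60.0, "mbps": 400.0, "quota": 8},
}

def place(task_mb: float) -> str:
    def cost(s):
        info = sites[s]
        return info["t_exec"] + 8 * task_mb / info["mbps"]   # MB -> Mb
    eligible = [s for s in sites if sites[s]["quota"] > 0]
    best = min(eligible, key=cost)
    sites[best]["quota"] -= 1
    return best

for task_mb in [100, 100, 2000, 100, 100]:
    print(f"{task_mb:>5} MB task -> {place(task_mb)}")
```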

8.
The efficient design of computation-intensive multidimensional signal processing applications requires dealing with three kinds of constraints: those implied by the data dependencies, the non-functional requirements (real time, power consumption), and the resource availability of the execution platform. The Modeling and Analysis of Real-time and Embedded systems (MARTE) UML profile, through its repetitive structure modeling (RSM) package, is well suited to model the inherent parallelism within these applications, to compactly represent parallel execution platforms, and to describe the distributive mapping of one onto the other. The execution of such a specification respects the whole set of constraints defined above, while the quality of the scheduling is directly linked to the quality of the mapping of the multidimensional structures (data arrays or parallel loop nests) into time and space. We propose a strategy for using a refactoring tool dedicated to this kind of application that makes it possible to find good trade-offs in the usage of storage and computation resources and in the exploitation of parallelism (both task and data parallelism). This strategy is illustrated on an industrial radar application.

9.
Scientific workflows are a topic of great interest in the grid community, which sees in the workflow model an attractive paradigm for programming distributed wide-area grid infrastructures. Traditionally, grid workflow execution is approached as a pure best-effort scheduling problem that maps the activities onto the grid processors based on appropriate optimization or local matchmaking heuristics such that the overall execution time is minimized. Even though such heuristics often deliver effective results, execution in dynamic and unpredictable grid environments is prone to severe performance losses that must be understood in order to minimize the completion time or use high-performance resources efficiently. In this paper, we propose a new systematic approach to help scientists and middleware developers understand the most severe sources of performance losses that occur when executing scientific workflows in dynamic grid environments. We introduce an ideal model for the lowest execution time that a workflow can achieve and explain the difference from the real measured grid execution time through a hierarchy of performance overheads for grid computing. We describe how to systematically measure and compute the overheads, from individual activities to larger workflow regions, and adjust well-known parallel processing metrics, including speedup and efficiency, to the scope of grid computing. We present a distributed online tool for computing and analyzing the performance overheads in real time based on event correlation techniques, and introduce several performance contracts as quality-of-service parameters to be enforced during workflow execution beyond traditional best-effort practices. We illustrate our method through postmortem and online performance analysis of two real-world workflow applications executed in the Austrian Grid environment.
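The metric side of the approach can be illustrated with a hedged sketch: the loss is the measured time minus the ideal time, decomposed into overhead categories with an explicit unidentified residual, and speedup and efficiency are computed as usual. The categories and numbers below are illustrative assumptions, not the paper's data.

```python
# Illustrative overhead decomposition (categories and numbers are
# assumptions in the spirit of the paper's hierarchy, not its results).

measured = 520.0                     # real grid execution time (s)
ideal = 300.0                        # model's lowest achievable time (s)
overheads = {                        # measured components of the loss
    "resource brokerage": 40.0,
    "queue waiting":      90.0,
    "data transfer":      55.0,
    "load imbalance":     25.0,
}

explained = sum(overheads.values())
unidentified = measured - ideal - explained   # residual, kept explicit
t_seq, n_procs = 2400.0, 16
speedup = t_seq / measured
efficiency = speedup / n_procs

for name, t in overheads.items():
    print(f"{name:>20}: {t:6.1f}s ({t / measured:5.1%} of run time)")
print(f"{'unidentified':>20}: {unidentified:6.1f}s")
print(f"speedup {speedup:.2f}, efficiency {efficiency:.1%}")
```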

10.
Adaptive computing on the Grid using AppLeS
Ensembles of distributed, heterogeneous resources, also known as computational grids, have emerged as critical platforms for high-performance and resource-intensive applications. Such platforms provide the potential for applications to aggregate enormous bandwidth, computational power, memory, secondary storage, and other resources during a single execution. However, achieving this performance potential in dynamic, heterogeneous environments is challenging. Recent experience with distributed applications indicates that adaptivity is fundamental to achieving application performance in dynamic grid environments. The AppLeS (Application Level Scheduling) project provides a methodology, application software, and software environments for adaptively scheduling and deploying applications in heterogeneous, multiuser grid environments. We discuss the AppLeS project and outline our findings.

11.
Scheduling large-scale applications in heterogeneous grid systems is a fundamental NP-complete problem that is critical to obtaining good performance and low execution cost. Achieving high performance in a grid system requires effective task partitioning, resource management, and load balancing. The heterogeneous and dynamic nature of a grid, as well as the diverse demands of applications running on it, makes grid scheduling a major challenge. Existing schedulers in wide-area heterogeneous systems require a large amount of information about the application and the grid environment to produce reasonable schedules. However, this information may not be available, may be too expensive to collect, or may increase the scheduler's runtime overhead to the point that the scheduler is rendered ineffective. We believe that no single scheduler is appropriate for all grid systems and applications: data-parallel applications, in which further data partitioning is possible, can benefit from efficient resource management, smart resource selection, and load balancing, whereas in functional (non-divisible-task) parallel applications such partitioning is either impossible, difficult, or expensive in terms of performance. In this paper, we propose a scheduler for data-parallel applications (SDPA) that offers an efficient task partitioning and load balancing strategy for data-parallel applications in grid environments. SDPA offers two major features: maintaining job priority even when an insufficient number of free resources is available, and pre-assigning tasks to cut the idle time of nodes. SDPA selects nodes smartly according to the nature of the task and the nodes' resource availability. Simulation results reveal that SDPA achieves performance improvements over strategies reported in the reviewed literature in terms of execution time, throughput, and waiting time.
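A hedged sketch of the pre-task assignment feature (assumed mechanics, not the paper's algorithm): when no node is free, a task is assigned in advance to the node expected to become idle soonest, so the node has work queued the moment it frees.

```python
# Hedged sketch of SDPA-style pre-task assignment (assumed mechanics):
# a priority job is placed even when free nodes are scarce, with tasks
# pre-assigned to the nodes about to become idle.

import heapq

free_nodes = ["n1", "n2"]
finishing = [(5.0, "n3"), (9.0, "n4")]        # (expected finish time, node)
heapq.heapify(finishing)

def assign(job, n_tasks):
    plan = []
    for k in range(n_tasks):
        if free_nodes:
            plan.append((f"{job}.{k}", free_nodes.pop(), 0.0))
        else:                                  # pre-assign to a busy node
            t_free, node = heapq.heappop(finishing)
            plan.append((f"{job}.{k}", node, t_free))
    return plan

for task, node, start in assign("jobA", 3):   # only 2 nodes are free,
    print(f"{task} -> {node} at t={start}")   # yet all 3 tasks get a node
```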

12.
Simulation has become an indispensable tool for researchers to explore systems without having recourse to real experiments. Depending on the characteristics of the modeled system, the methods used to represent it may vary. Multi-agent systems are often used to model and simulate complex systems. In all cases, increasing the size and precision of the model increases the amount of computation, requiring the use of parallel systems when it becomes too large. In this paper, we focus on parallel platforms that support multi-agent simulations and their execution on high-performance resources such as parallel clusters. Our contribution is a survey of existing platforms and their evaluation in the context of high-performance computing. We present a qualitative analysis of several multi-agent platforms, their tests in high-performance computing execution environments, and the performance results for the only two platforms that fulfill the high-performance computing constraints.

13.
Writing large-scale parallel and distributed scientific applications that make optimum use of the multiprocessor is a challenging problem. Typically, computational resources are underused due to performance failures in the application being executed. Performance-tuning tools are essential for exposing these performance failures and for suggesting ways to improve program performance. In this paper, we first address fundamental issues in building useful performance-tuning tools and then describe our experience with the AIMS toolkit for tuning parallel and distributed programs on a variety of platforms. AIMS supports source-code instrumentation, run-time monitoring, graphical execution profiles, performance indices, and automated modeling techniques as ways to expose performance problems in programs. Using several examples representing a broad range of scientific applications, we illustrate AIMS' effectiveness in exposing performance problems in parallel and distributed programs.

14.
Conventional performance evaluation mechanisms focus on dedicated systems. Grid computing infrastructure, on the other hand, is a shared collaborative environment constructed on virtual organizations, each with its own resource management policy and usage pattern. The non-dedicated characteristic of grid computing prevents the leverage of conventional performance evaluation systems. In this study, we introduce the Grid Harvest Service (GHS), a performance evaluation and task scheduling system for solving large-scale applications in a shared environment. GHS is based on a novel performance prediction model and a set of task scheduling algorithms. GHS supports three classes of task scheduling: single-task, parallel processing, and meta-task. Experimental results show that GHS provides a satisfactory solution for performance prediction and task scheduling of large applications and has real potential.

15.
Due to its potential, the use of virtual machines in grid computing is attracting increasing attention. Most research focuses on how to create or destroy virtual execution environments for different kinds of applications, while policies for managing these virtual environments are not widely discussed. This paper presents the design, implementation, and evaluation of ADVE, an adaptive and dependable virtual execution environment for grid computing that focuses on the policies for managing virtual machines in grid environments. To build a dependable virtual execution environment for grid applications, ADVE provides a set of adaptive policies for managing virtual machines, such as when to create and destroy a new virtual execution environment and when to migrate applications from one virtual execution environment to another. We conducted experiments over a cluster to evaluate the performance of ADVE, and the results show that ADVE can improve the throughput and reliability of grid resources through the adaptive management of virtual machines.

16.
Data Grids enable the sharing, selection, and connection of a wide variety of geographically distributed computational and storage resources for content needed by large-scale data-intensive applications such as high-energy physics, bioinformatics, and virtual astrophysical observatories. In Data Grids, co-allocation architectures were developed to enable parallel downloads of data sets from selected replica servers. As the Internet is usually the underlying network of a grid, network bandwidth is the main factor affecting file transfers between clients and servers. In this paradigm, some challenges still need to be solved, such as reducing the differences in finish times among selected replica servers, avoiding the traffic congestion that results from transferring the same blocks over different links among servers and clients, and managing network performance variations among parallel transfers. In this paper, we propose the Anticipative Recursively Adjusting Mechanism (ARAM) to adjust the workloads of selected replica servers and handle unpredictable variations in their network performance. Our algorithm uses the finish rates of previously assigned transfers to anticipate the bandwidth status of the next section, adjust workloads accordingly, and thereby reduce file transfer times in grid environments. Our approach is useful in grid environments with unstable network links: it not only reduces the idle time wasted waiting for the slowest server, but also decreases file transfer completion times. Copyright © 2010 John Wiley & Sons, Ltd.
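A hedged sketch of the recursively adjusting idea: the file is transferred in sections, and each server's share of the next section follows its observed transfer rate, which equalizes per-section finish times. The section policy and constants are assumptions, not ARAM's exact rules; the rate drift below stands in for real measurements.

```python
# Hedged sketch of recursively adjusting co-allocated transfers (the
# smoothing and section policy are assumptions, not ARAM's exact rules):
# each server's share of the next section follows its observed rate.

def next_shares(rates_mbps):
    total = sum(rates_mbps.values())
    return {s: r / total for s, r in rates_mbps.items()}

def transfer(file_mb, servers, sections=4):
    rates = dict(servers)                      # start from probed rates
    per_section = file_mb / sections
    for k in range(sections):
        shares = next_shares(rates)
        times = {s: per_section * shares[s] / rates[s] for s in rates}
        finish = max(times.values())           # section ends with slowest
        print(f"section {k}: shares " +
              ", ".join(f"{s}={shares[s]:.2f}" for s in shares) +
              f", section time {finish:.1f}s")
        # Anticipate next-section bandwidth from observed finish rates
        # (here: one server's rate drifts; a real run would measure it).
        rates = {s: r * (0.9 if s == "replicaB" else 1.0)
                 for s, r in rates.items()}

transfer(800, {"replicaA": 12.0, "replicaB": 6.0, "replicaC": 10.0})
```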

17.
In this paper two new heuristics, named Min–min-C and Max–min-C, are proposed that provide near-optimal solutions to the mapping of parallel applications, modeled as Task Interaction Graphs, onto computational clouds. The aim of these heuristics is to determine mapping solutions that best exploit the available cloud resources to execute such applications concurrently with the other cloud services. Unlike the Min–min and Max–min models from which they originate, the two introduced heuristics also take communications into account. Their effectiveness is assessed on a set of artificial mapping problems differing in applications and in node working conditions. The analysis, carried out also by means of statistical tests, reveals the robustness of the two proposed algorithms in coping with the mapping of small- and medium-sized high-performance computing applications onto non-dedicated cloud nodes.
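Below is a hedged sketch of a Min–min-style mapper extended with a communication term in the spirit of Min–min-C: a task's completion time on a node includes its interaction volume with already-placed neighbours on other nodes, and the task with the smallest minimum completion time is placed first. The cost model is an assumption, not the paper's.

```python
# Hedged Min-min-C-style sketch (assumed cost model): completion time
# counts computation plus communication with already-mapped neighbours.

tasks = {"t1": 100, "t2": 40, "t3": 60}           # work units
edges = {("t1", "t2"): 20, ("t2", "t3"): 10}      # interaction volumes
nodes = {"n1": 10.0, "n2": 6.0}                   # effective speeds
BW = 5.0                                          # inter-node bandwidth

def completion(task, node, placed, ready):
    t = ready[node] + tasks[task] / nodes[node]
    for (a, b), vol in edges.items():             # pay for remote edges
        other = b if a == task else a if b == task else None
        if other in placed and placed[other] != node:
            t += vol / BW
    return t

placed, ready = {}, {n: 0.0 for n in nodes}
while len(placed) < len(tasks):
    c, t, n = min((completion(t, n, placed, ready), t, n)
                  for t in tasks if t not in placed for n in nodes)
    placed[t], ready[n] = n, c                    # Min-min: smallest min first
    print(f"map {t} -> {n} (completion {c:.1f})")
```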

18.
Performance Evaluation, 2006, 63(4-5): 265-277.
Performance prediction for parallel applications running in heterogeneous clusters is difficult to accomplish due to the unpredictable resource contention patterns found in such environments. Typically, the components of a parallel application contend for resources both among themselves and with entities external to the application, such as other processes running on the computers of the cluster. A performance modeling approach should be able to represent these sources of contention and produce an estimate of the execution time, preferably in polynomial time. This paper presents a polynomial-time static performance prediction approach in which the prediction takes the form of an interval of values instead of a single value. The extra information given by an interval of values represents the variability of the underlying environment more accurately, as the practical examples presented indicate.
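A hedged sketch of interval-valued prediction: propagate [lo, hi] bounds through the usual composition rules (sequential phases add; concurrent phases take the maximum), yielding an execution-time interval rather than a point estimate. The phase costs below are assumptions.

```python
# Hedged sketch of interval-valued prediction: propagate [lo, hi]
# bounds through sequence/parallel composition (phase costs assumed).

from typing import NamedTuple

class Interval(NamedTuple):
    lo: float
    hi: float
    def __add__(self, other):                 # sequential composition
        return Interval(self.lo + other.lo, self.hi + other.hi)

def par(*ivs):                                # concurrent composition
    return Interval(max(i.lo for i in ivs), max(i.hi for i in ivs))

# Per-phase costs on a heterogeneous, contended cluster (assumed):
compute_fast = Interval(10.0, 14.0)           # lightly loaded node
compute_slow = Interval(12.0, 30.0)           # node shared with others
exchange = Interval(2.0, 6.0)                 # message exchange phase

iteration = par(compute_fast, compute_slow) + exchange
total = Interval(0.0, 0.0)
for _ in range(50):                           # 50 iterations, all in bounds
    total = total + iteration
print(f"predicted execution time in [{total.lo:.0f}, {total.hi:.0f}] s")
```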

19.
The increasing prevalence of networked storage and computational resources, along with middleware for managing resource access and sharing, raises the prospect that queries can be run over resources obtained on demand rather than on dedicated infrastructures. However, the movement of query processing into non-dedicated environments means that it is necessary to take account of the partial information and unstable conditions that characterise autonomous, shared, distributed settings. Thus, query processing on grid platforms needs to be adaptive, revising evaluation strategies at query runtime in response to the evolving environment, such as changes in machine load and availability. To address this challenge, adaptive techniques are described that: (i) balance load across plan partitions supporting intra-operator parallelism; (ii) remove bottlenecks in pipelined plans supporting inter-operator parallelism; and (iii) combine the two aforementioned techniques. The approach has been empirically evaluated in a grid-enabled adaptive query processor.
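As a hedged illustration of technique (i), the sketch below redistributes the unprocessed input across plan partitions in proportion to each partition's recently observed throughput; the policy and numbers are assumptions, not the evaluated processor's algorithm.

```python
# Hedged sketch of adaptive intra-operator load balancing (assumed
# policy): redistribute the unprocessed input across plan partitions
# in proportion to each partition's recently observed throughput.

def rebalance(remaining_tuples: int, throughput: dict) -> dict:
    total = sum(throughput.values())
    return {p: round(remaining_tuples * r / total)
            for p, r in throughput.items()}

# Mid-query: partition "p2" slowed down (e.g., external load appeared).
observed = {"p1": 900.0, "p2": 300.0, "p3": 800.0}   # tuples/s, measured
print(rebalance(remaining_tuples=100_000, throughput=observed))
```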

20.
Networks of workstations (NOWs) are receiving increased attention as a viable platform for high-performance parallel computations. Heterogeneity and time-sharing are two characteristics that distinguish NOW systems from conventional multiprocessor/multicomputer systems, which are homogeneous and dedicated. It is important to have a practical model with which users can predict the execution times of large-scale parallel applications on non-dedicated heterogeneous NOWs. Another objective of this study is to provide insight into the dynamic performance of parallel computing and into the effects of program structures and system factors on such a platform. In this paper, we study performance prediction for parallel computing on non-dedicated heterogeneous networks of workstations. Our approach is based on a two-level model. At the top level, a semideterministic task graph captures the parallel execution behavior, including the variances of communication and synchronization. At the bottom level, a discrete-time model quantifies the effects of the NOW system. An iterative process determines the interactive effects between network contention and task execution. We validate the prediction model through experiments on a non-dedicated heterogeneous NOW: the maximum differences between predicted and measured results were less than 10% in most cases and 15% in the worst cases.
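The iterative interaction between the two levels can be illustrated with a hedged fixed-point sketch: a longer run spreads the same traffic over a larger window, lowering contention, which in turn shortens the run, iterating until the estimate stabilizes. The contention model below is an assumption-level caricature, not the paper's discrete-time model.

```python
# Hedged fixed-point sketch of the contention/execution interaction
# (an assumption-level caricature of the paper's two-level model).

def estimate_runtime(compute_s=100.0, msg_mb=400.0, base_mbps=100.0,
                     flows=4, tol=1e-6):
    t = compute_s + 8 * msg_mb / base_mbps     # optimistic first guess
    for i in range(100):
        # Bottom level: contention grows when the same traffic is
        # squeezed into a shorter window, shared among active flows.
        comm_window = max(t - compute_s, 1e-9)
        offered = 8 * msg_mb / comm_window     # Mbps this task offers
        eff_bw = base_mbps / (1 + (flows - 1) * min(1.0, offered / base_mbps))
        # Top level: recompute the run time with that effective bandwidth.
        t_new = compute_s + 8 * msg_mb / eff_bw
        if abs(t_new - t) < tol:
            return t_new, i + 1
        t = t_new
    return t, 100

t, iters = estimate_runtime()
print(f"converged to {t:.1f}s after {iters} iterations")
```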
