期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Experimental Analysis of Timing Validation Methods for Distributed Real-Time Systems

Cha Hojung Ha Rhan Liu Jane W. S. 《The Journal of supercomputing》2003,25(1):73-94

Scheduling jobs dynamically on processors is likely to achieve better performance in multiprocessor and distributed real-time systems. Exhaustive methods for determining whether all jobs complete by their deadlines, in systems that use modern priority-driven scheduling strategies, are often infeasible or unreliable since the execution time of each job may vary. We previously published research results on finding worst-case bounds and efficient algorithms for validating systems in which independent jobs have arbitrary release times and deadlines, and are scheduled on processors dynamically in a priority-driven manner. An efficient method has been proposed to determine how late the completion times of jobs can be in dynamic systems where the jobs are preemptable and nonmigratable. This paper further presents the performance characteristics of the proposed methods, and shows its soundness by providing extensive simulation results. The worst-case completion times of jobs obtained with the proposed methods are compared with the values by simulations under different workload characteristics. The simulation results show that the proposed algorithm performs considerably well for diverse workloads. Considering the previous work showed the unlikelihood of finding tighter bounds than the one given in the paper, the simulation results indicate that the proposed methods effectively constitute a theoretical basis needed for a comprehensive validation strategy that is capable of dealing with dynamic distributed real-time systems. 相似文献

2.

A Dynamic Load Balancing Framework for Real-time Applications in Message Passing Systems

Ghada F. El Kabbany Nayer M. Wanas Nadia H. Hegazi Samir I. Shaheen 《International journal of parallel programming》2011,39(2):143-182

Load balancing algorithms are designed essentially to equally distribute the load on processors and maximize their utilities while minimizing the total task execution time. In order to achieve these goals, the load-balancing mechanism should be “fair” in distributing the load across the different processors. This implies that the difference between the heaviest-loaded and the lightest-loaded processors should be minimized. Therefore, the load information on each processor must be updated such that the load-balancing mechanism can be more effective. In this work, we present an application independent dynamic algorithm for scheduling tasks and load- balancing in message passing systems. We propose a DAG-based Dynamic Load Balancing algorithm for Real time applications (DAG-DLBR) that is designed to work dynamically to cope with possible changes in the load that might occur during runtime. This algorithm addresses the challenge of devising a load balancing scheme which judicially deals with the hybrid execution of existing real-time application (represented by a Direct Acyclic Graph (DAG)) together with newly arriving jobs. The main objective of this algorithm is to reduce response times of the newly arriving jobs while maintaining the time constrains of the existing DAG. To evaluate the performance of the DAG-DLBR algorithm, a comparison with the performance of two common dynamic load balancing algorithms is presented. This comparison is performed by evaluating, experimentally, the execution time of different load balancing algorithms on a homogenous real parallel machine. In addition, the values of load imbalance, the execution time, and the communication overhead time are evaluated analytically using different benchmarks as test-bed workloads. These workloads cover a wide range of dynamic applications with different task types. Experimental results illustrate the improved performance of the DAG-DLBR algorithm compared to both distributed and hierarchal based algorithms by at least 12 and 19%, respectively. This improvement is true for all workloads, even with highly dependent workload. The DAG-DLBR algorithm achieves lower computation time than its corresponding values of both the distributed and the hierarchical-based algorithms for 4, 8, 12 and 16 processors. 相似文献

3.

The solution algorithms for the multiprocessor scheduling with workspan criterion

Radosław Rudek Agnieszka Rudek Andrzej Kozik 《Expert systems with applications》2013,40(8):2799-2806

In this paper, we consider multiprocessor scheduling problems, where each job (task) must be executed simultaneously by the specified number of processors, but the indices of the processors allotted to each job do not have to be contiguous (i.e., jobs can be fragmentable). Unlike other research in this domain, we analyse the problem under the workspan criterion, which is defined as the product of the maximum job completion time (makespan) and the number of used processors. Moreover, the job processing times can be described by non-increasing or non-decreasing functions dependent on the start times of jobs that model improvement (learning) or degradation (deteriorating), respectively. To solve the problems, we construct some polynomial time algorithms and analyse numerically their efficiency. 相似文献

4.

Adaptive parallel job scheduling with flexible coscheduling

Frachtenberg E. Feitelson G. Petrini F. Fernandez J. 《Parallel and Distributed Systems, IEEE Transactions on》2005,16(11):1066-1077

Many scientific and high-performance computing applications consist of multiple processes running on different processors that communicate frequently. Because of their synchronization needs, these applications can suffer severe performance penalties if their processes are not all coscheduled to run together. Two common approaches to coscheduling jobs are batch scheduling, wherein nodes are dedicated for the duration of the run, and gang scheduling, wherein time slicing is coordinated across processors. Both work well when jobs are load-balanced and make use of the entire parallel machine. However, these conditions are rarely met and most realistic workloads consequently suffer from both internal and external fragmentation, in which resources and processors are left idle because jobs cannot be packed with perfect efficiency. This situation leads to reduced utilization and suboptimal performance. Flexible coscheduling (FCS) addresses this problem by monitoring each job's computation granularity and communication pattern and scheduling jobs based on their synchronization and load-balancing requirements. In particular, jobs that do not require stringent synchronization are identified, and are not coscheduled; instead, these processes are used to reduce fragmentation. FCS has been fully implemented on top of the STORM resource manager on a 256-processor alpha cluster and compared to batch, gang, and implicit coscheduling algorithms. This paper describes in detail the implementation of FCS and its performance evaluation with a variety of workloads, including large-scale benchmarks, scientific applications, and dynamic workloads. The experimental results show that FCS saturates at higher loads than other algorithms (up to 54 percent higher in some cases), and displays lower response times and slowdown than the other algorithms in nearly all scenarios. 相似文献

5.

一种作业弹性与截止时间感知的作业调度算法

黄春秋陈志荣垂田《计算机与现代化》2019,(4):30

针对采用MapReduce模型的大数据分析作业的调度问题进行深入研究，并分析现有任务调度算法的缺陷，现有算法没有考虑资源分配对于作业截止时间的影响，也未考虑不同类型作业截止时间的敏感性问题。因作业的完成时间随着分配资源的不同而改变，故称之为弹性作业，截止时间敏感性是指不同类型作业对截止时间要求的严格程度不同。针对以上问题，提出一种截止时间感知的弹性作业调度算法（DA）。该算法将作业依据截止时间敏感程度进行分类，在基于作业整体执行时间预测的基础上，通过调控不同的资源分配策略来改变作业完成时间，同时结合用户对于截止时间的需求及作业预执行的收益来提前规划作业的资源分配及调度次序使得整体收益最大化。将算法在仿真拥有210个物理节点的集群中进行实验，实验表明该算法满足了截止时间的限制并使得作业整体收益值平均提高了2.37倍。相似文献

6.

Performance and energy aware cluster-level scheduling of compute-intensive jobs with unknown service times

Stylianos Zikos Helen D. Karatza 《Simulation Modelling Practice and Theory》2011,19(1):239-250

In this paper we examine three local resource allocation policies, which are based on shortest queue, in a cluster with heterogeneous servers. Two of them are optimized for performance and the third one is optimized for energy conservation. We assume that there are two types of processors in the cluster, with different performance and energy characteristics. We consider that service times of jobs are unknown to the scheduler. A simulation model is used to evaluate the performance and energy behavior of the policies. Simulation results indicate that the differences among the policies depend on system load and there is a trade-off between performance and energy consumption. 相似文献

7.

Information retrieval with distributed databases: analytic models of performance

Losee R.M. Church L. Jr. 《Parallel and Distributed Systems, IEEE Transactions on》2004,15(1):18-27

The major emphasis is on analytical techniques for predicting the performance of various collection fusion scenarios. Knowledge of analytical models of information retrieval system performance, both with single processors and with multiple processors, increases our understanding of the parameters (e.g., number of documents, ranking algorithms, stemming algorithms, stop word lists, etc.) affecting system behavior. While there is a growing literature on the implementation of distributed information retrieval systems and digital libraries, little research has focused on analytic models of performance. We analytically describe the performance for single and multiple processors, both when different processors have the same parameter values and when they have different values. The use of different ranking algorithms and parameter values at different sites is examined. 相似文献

8.

Speed Scaling on Parallel Processors

Susanne Albers Fabian Müller Swen Schmelzer 《Algorithmica》2014,68(2):404-425

In this paper we investigate dynamic speed scaling, a technique to reduce energy consumption in variable-speed microprocessors. While prior research has focused mostly on single processor environments, in this paper we investigate multiprocessor settings. We study the basic problem of scheduling a set of jobs, each specified by a release date, a deadline and a processing volume, on variable-speed processors so as to minimize the total energy consumption. We first settle the problem complexity if unit size jobs have to be scheduled. More specifically, we devise a polynomial time algorithm for jobs with agreeable deadlines and prove NP-hardness results if jobs have arbitrary deadlines. For the latter setting we also develop a polynomial time algorithm achieving a constant factor approximation guarantee. Additionally, we study problem settings where jobs have arbitrary processing requirements and, again, develop constant factor approximation algorithms. We finally transform our offline algorithms into constant competitive online strategies. 相似文献

9.

Scheduling of computational processes in real-time distributed systems with uncertain task execution times

N. V. Kolesov M. V. Tolmacheva P. V. Yukhta 《Journal of Computer and Systems Sciences International》2012,51(5):636-647

An approach to scheduling computational processes in real-time distributed computing systems is considered. It is assumed that the task execution time is inexactly; more precisely, it is assumed to belog to a certain time interval. The problem is formulated as the scheduling of jobs of which each is characterized by its priority and consists of a set of tasks (with respect to the number of processors) executing on different processors and associated by a hierarchical precedence relationship. The proposed approach is based on algorithms with low computational complexity for suboptimal scheduling of equal-priority tasks. 相似文献

10.

网络集群计算系统中的并行任务调度 总被引：12，自引：0，他引：12

黄金贵陈建二陈松乔《计算机学报》2004,27(6):765-771

基于多处理机并行任务调度模型，探讨网络集群计算系统中的并行任务调度问题，首先证明了一般网络集群计算系统中调度算法的可近似性难度，然后提出了三种不同的启发式算法：最大长度优先调度算法、最大宽度优先调度算法和最大面积优先调度算法；然后根据大量的模拟实验对这些算法以及文献中已提出的调度算法进行了比较分析，结果表明该文的启发式算法比文献中的算法在性能上效果更好。相似文献

11.

超级计算机作业运行稳定性分析

罗红兵曹小林曹立强武林平莫则尧《计算机工程与科学》2009,31(11)

通过作业日志分析和考核实验方式,对超级计算机并行作业运行稳定性进行了分析。日志分析结果表明,并行作业运行的稳定性会随作业执行时间的增长、作业使用CPU数的增多而下降;当并行作业的计算量达到105CPU小时量级,超过20%的作业会因系统故障而中止。考核实验结果表明,使用数千CPU的并行作业很容易受到多种因素的干扰而中止,很难持续运行超过24小时。最后给出了有关超级计算机稳定性改进、系统管理使用和并行程序研制的几点建议。相似文献

12.

When the herd is smart: Aggregate behavior in the selection of job request

Cirne W. Berman F. 《Parallel and Distributed Systems, IEEE Transactions on》2003,14(2):181-192

In most parallel supercomputers, submitting a job for execution involves specifying how many processors are to be allocated to the job. When the job is moldable (i.e., there is a choice on how many processors the job uses), an application scheduler called SA can significantly improve job performance by automatically selecting how many processors to use. Since most jobs are moldable, this result has great impact to the current state of practice in supercomputer scheduling. However, the widespread use of SA can change the nature of workload processed by supercomputers. When many SAs are scheduling jobs on one supercomputer, the decision made by one SA affects the state of the system, therefore impacting other instances of SA. In this case, the global behavior of the system comes from the aggregate behavior caused by all SAs. In particular, it is reasonable to expect the competition for resources to become tougher with multiple SAs, and this tough competition to decrease the performance improvement attained by each SA individually. This paper investigates this very issue. We found that the increased competition indeed makes it harder for each individual instance of SA to improve job performance. Nevertheless, there are two other aggregate behaviors that override increased competition when the system load is moderate to heavy. First, as load goes up, SA chooses smaller requests, which increases efficiency, which effectively decreases the offered load, which mitigates long wait times. Second, better job packing and fewer jobs in the system make it easier for incoming jobs to fit in the supercomputer schedule, thus reducing wait times further. As a result, in moderate to heavy load conditions, a single instance of SA benefits from the fact that other jobs are also using SA. 相似文献

13.

Strategyproof Mechanisms for Scheduling Divisible Loads in Bus-Networked Distributed Systems

Carroll T.E. Grosu D. 《Parallel and Distributed Systems, IEEE Transactions on》2008,19(8):1124-1135

The scheduling of arbitrarily divisible loads on a distributed system is studied by Divisible Load Theory (DLT). DLT has the underlying assumption that the processors will not cheat. In the real world, this assumption is unrealistic as the processors are owned and operated by autonomous rational organizations that have no a priori motivation for cooperation. Consequently, they will manipulate the algorithms if it benefits them to do so. In this work, we propose strategyproof mechanisms for scheduling divisible loads on three types of bus-connected distributed systems. These mechanisms provide incentives to the processors to obey the prescribed algorithms and to truthfully report their parameters, leading to an efficient load allocation and execution. 相似文献

14.

Bounding completion times of jobs with arbitrary release times,variable execution times and resource sharing

Sun J. Gardner M.K. Liu J.W.S. 《IEEE transactions on pattern analysis and machine intelligence》1997,23(10):603-615

The workload of many real time systems can be characterized as a set of preemptable jobs with linear precedence constraints. Typically their execution times are only known to lie within a range of values. In addition, jobs share resources and access to the resources must be synchronized to ensure the integrity of the system. The paper is concerned with the schedulability of such jobs when scheduled on a priority driven basis. It describes three algorithms for computing upper bounds on the completion times of jobs that have arbitrary release times and priorities. The first two are simple but do not yield sufficiently tight bounds, while the last one yields the tightest bounds but has the greatest complexity 相似文献

15.

On-line maintenance job scheduling and assignment to resources in distributed systems by heuristic-based optimization

Kossi P. Adzakpa Kondo H. Adjallah Farouk Yalaoui 《Journal of Intelligent Manufacturing》2004,15(2):131-140

A heuristic-based optimization algorithm is proposed in this paper for on-line scheduling and assignment of preventive maintenance jobs to processors, to minimize under availability constraints, on a given time-window, the total cost of the maintenance operations of a distributed system. This algorithm minimizes the cost of discharge of preventive maintenance tasks or jobs, while assigning the tasks along with balancing the processors load. It is shown that the problem is NP-hard. To solve it, the concept of job emergency is introduced and the priority rule for total flow time (PRTF) criterion is used in an adapted heuristic job-scheduling model. In addition, the algorithm considers the constraints of precedence among consecutive standby jobs and their emergency. It is depicted the specific properties of the proposed heuristic allowing jobs scheduling in the right order. Computational results illustrate the efficiency of the approach implemented on different system configurations. 相似文献

16.

Complete exchange on the CM-5 and Touchstone Delta

Rajeev Thakur Ravi Ponnusamy Alok Choudhary Geoffrey Fox 《The Journal of supercomputing》1995,8(4):305-328

The complete exchange (or all-to-all personalized) communication pattern occurs frequently in many important parallel computing applications. It is the densest form of communication because all processors need to communicate with all other processors. This can result in severe link contention and degrade performance considerably. Hence, it is necessary to use efficient algorithms in order to get good performance over a wide range of message and multiprocessor sizes. In this paper we present several algorithms to perform complete exchange on the Thinking Machines CM-5 and the Intel Touchstone Delta multiprocessors. Since these machines have different architectures and communication capabilities, different algorithms are needed to get the best performance on each of them. We present four algorithms for the CM-5 and six algorithms for the Delta. Complete exchange algorithms generally assume that the number of processors is a power of two. However, on the Delta the number of processors allocated by a user need not be a power of two. We propose algorithms that are even applicable to non-power-of-two meshes on the Delta. We have developed analytical models to estimate the performance of the algorithms on the basis of system parameters. Performance results on the CM-5 and Delta are also presented and analyzed. 相似文献

17.

Common due date assignment and scheduling with ready times

T. C. E. Z. -L. N. V. 《Computers & Operations Research》2002,29(14)

We consider the problem of scheduling a set of nonsimultaneously available jobs on one machine. Each job has a ready time only at or after which the job can be processed. All the jobs have a common due date, which needs to be determined. The problem is to determine a due date and a schedule so as to minimize a total penalty depending on the earliness, tardiness and due date. We show that this problem is strongly NP-hard and give an efficient algorithm that finds an optimal due date and schedule when either the job sequence is predetermined or all jobs have the same processing time. We also propose three approximation algorithms for the general and special cases together with their experimental analysis.

Scope and purpose

We consider the single machine due date assignment problem for scheduling jobs which are ready for processing at different times. The problem under consideration arises in production planning and scheduling concerning the setting of appropriate due dates for a number of customer orders arriving over time. Most of the earlier publications on this subject assumed that the jobs are ready for processing simultaneously. This assumption is too restrictive for real-life production systems where jobs arrive at different times. We show that the problem with unequal ready times is NP-hard and develop fast heuristic algorithms for it, and exact algorithms for two special cases. 相似文献

18.

An improved heuristic for minimizing makespan among m identical parallel processors

John H. Blackstone Jr. Don T. Phillips 《Computers & Industrial Engineering》1981,5(4):279-287

A simple heuristic for minimizing makespan among identical processors assigned independent tasks is presented and explored. The heuristic initially assigns jobs to processors by applying a quick and effective algorithm which at present is commonly applied to this problem. The heuristic then seeks to identify pairs of jobs that may be interchanged between processors to improve the solution. The conditions under which an optimal makespan may be achieved by such an interchange are derived for the case of two processors. The procedure is then extended to three or more processors. Results of 700 randomly generated problems are reported. The heuristic achieved an optimal solution for most of the problems. The worst case performance for the heuristic has not been established; however, evidence is presented that the worst case for the heuristic is considerably smaller than that of algorithms presently used. 相似文献

19.

Graph approach to job assignment in distributed real-time systems

A. M. Gruzlikov N. V. Kolesov Yu. M. Skorodumov M. V. Tolmacheva 《Journal of Computer and Systems Sciences International》2014,53(5):702-712

Two heuristic algorithms for assigning jobs to processors of a distributed real-time system are proposed and studied. Each job is described by a directed graph of sufficiently general form. The first algorithm is based on the principle of assigning adjacent tasks to the processor while the second algorithm is based on the principle of assigning tasks with most intensive communication. Efficiency of these algorithms is studied as compared to optimal one and using the random generation of examples. For each algorithm, the domain of efficient application is found and given by the value of the ratio of processor/communication channel costs. 相似文献

20.

On the Design and Implementation of a Shared Memory Dispatcher for Partially Clairvoyant Schedulers

K. Subramani Kiran Yellajyosula 《International journal of parallel programming》2008,36(4):386-411

With the onset of distributed computing in hard real-time applications, the problem of assigning to, scheduling in, and executing jobs on processors, has received a lot of attention. Usually, real-time systems are embedded in closed loop reactive environments with uncertain behaviors and such systems take varying times to respond to such stimuli. One of the fundamental features of such systems is the presence of complex timing constraints between pairs of jobs. A secondary feature is the non-constant nature of the execution times of jobs. Real-time operating systems such as MARUTI can measure the interval within which the execution time varies (Mosse et al. In: Second IEEE Workshop on Experimental Distributed System, pp. 29–34., IEEE, 1990; Levi et al. 1989, ACM Special Interest Group Operat Syst 23(3):90–106). Partially clairvoyant scheduling was introduced in (Saksena, Parametric Scheduling in Hard Real-Time Systems. PhD thesis, University of Maryland, College Park, June 1994) to schedule jobs with varying execution times and non-trivial timing constraints. The schedulability of the job set is determined offline and a set of dispatch functions are produced from the given set of constraints if the job set is schedulable. The dispatch functions bind the start time of a job J to an interval that depends on the start and execution times of jobs sequenced before J. The online dispatcher of the system reads these dispatch functions and computes the interval within which a job can start without violating the constraints imposed on the system. In certain situations, the dispatcher fails to dispatch a job as the time to compute the dispatch functions associated with a job is greater than the interval within which the job needs to be dispatched. This phenomenon is called Loss of Dispatchability (Subramani, Duality in the Parametric Polytope and its Applications to a Scheduling Problem. PhD thesis, University of Maryland, College Park, August 2000). In this paper, we propose and implement a partially clairvoyant dispatching algorithm on a shared memory cluster with Concurrent Read Exclusive Write (CREW) architecture and contrast it with the sequential approach. For a preset number of processors, our approach has O(1) dispatch complexity while using a total of O(n ²) space, while the sequential approach requires Ω(n) time. The detailed implementation profile obtained clearly demonstrates the superiority of the multiprocessor approach to dispatching. We also address the issue of scalability of the dispatcher for increasing number of processors and show that job sets of different sizes require different number of processors. Finally, we demonstrate the effect of execution time on the dispatchability of schedules. 相似文献