Similar literature
1.
For fine-grained task graphs, duplication-based scheduling algorithms are generally more efficient than list-based and cluster-based algorithms. However, most duplication-based heuristics try to duplicate all possible ancestor nodes of a given join node in order to reduce the join node's earliest start time (EST), even though these ancestor nodes have already been allocated in previous steps. These heuristics therefore inevitably induce redundant duplications, which consume resources needlessly and generally degrade the scheduling result when the number of processors is bounded. When the algorithms are used with an unbounded number of processors, the required number of processors grows excessively with the size of the task graph, which limits their practicality for large task graphs. In this paper, we propose a novel algorithm designed to allocate join nodes without redundant duplications. In the proposed algorithm, if the ancestor nodes of a join node are duplicated when scheduling the join node, the original allocations of these ancestor nodes are removed using a very efficient method. The performance of the proposed algorithm, in terms of its normalized schedule length and efficiency, is compared with that of some recently proposed algorithms. The proposed algorithm generates better or comparable schedules with minimized duplication. Specifically, the simulation results show that it is most useful on a bounded number of processors.

2.
On exploiting task duplication in parallel program scheduling   Total citations: 1, self-citations: 0, citations by others: 1
One of the main obstacles in obtaining high performance from message-passing multicomputer systems is the inevitable communication overhead incurred when tasks executing on different processors exchange data. Given a task graph, duplication-based scheduling can mitigate this overhead by allocating some of the tasks redundantly on more than one processor. In this paper, we focus on the problem of using duplication in static scheduling of task graphs on parallel and distributed systems. We discuss five previously proposed algorithms and examine their merits and demerits. We describe some of the essential principles for exploiting duplication in a more useful manner and, based on these principles, propose an algorithm which outperforms the previous algorithms. The proposed algorithm generates optimal solutions for a number of task graphs. The algorithm assumes an unbounded number of processors. For scheduling on a bounded number of processors, we propose a second algorithm which controls the degree of duplication according to the number of available processors. The proposed algorithms are analytically and experimentally evaluated and are also compared with the previous algorithms.
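
The effect of duplication described in this abstract can be illustrated on the smallest possible case. The following is only a sketch under stated assumptions, not the algorithm proposed in the paper: it assumes a fork graph (one parent, several children), one processor per child, and a hypothetical helper name `fork_makespan`. Child 0 shares the parent's processor; every other child either waits for a message from the parent or runs its own duplicated copy of the parent.

```python
from typing import List, Tuple


def fork_makespan(
    parent_time: float,
    children: List[Tuple[float, float]],
    duplicate_parent: bool,
) -> float:
    """Makespan of a fork graph with one processor per child (illustrative only).

    children holds (execution_time, communication_cost) for each child.
    Child 0 runs on the parent's processor, so it never pays communication.
    """
    finish_times = []
    for index, (exec_time, comm_cost) in enumerate(children):
        # The parent's result is local if the child shares the parent's
        # processor or if a copy of the parent was duplicated onto it.
        local = index == 0 or duplicate_parent
        start = parent_time if local else parent_time + comm_cost
        finish_times.append(start + exec_time)
    return max(finish_times)


without_dup = fork_makespan(2.0, [(3.0, 4.0), (3.0, 4.0)], duplicate_parent=False)
with_dup = fork_makespan(2.0, [(3.0, 4.0), (3.0, 4.0)], duplicate_parent=True)
print(without_dup, with_dup)
```

With a parent of length 2 and two children of length 3 behind a communication cost of 4, duplicating the parent shortens the makespan from 9 to 5 in this toy model, at the price of executing the parent twice.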

3.
Task graph pre-scheduling, using Nash equilibrium in game theory   Total citations: 1, self-citations: 1, citations by others: 0
Prescheduling algorithms restructure task graphs so that they can be scheduled optimally. Task graph scheduling is an NP-complete problem. This article offers a prescheduling algorithm for tasks to be executed on networks of homogeneous processors. The proposed algorithm merges tasks to minimize their earliest start time while reducing the overall completion time. To this end, treating each task as a player that attempts to reduce its earliest start time as much as possible, we apply the idea of Nash equilibrium in game theory to determine the most appropriate merging. In addition, treating each level of a task graph as a player that seeks distinct processors on which to execute its independent tasks in parallel, the idea of Nash equilibrium can be applied to determine an appropriate number of processors such that the overall idle time of the processors is minimized and the throughput is maximized. The communication delay is considered explicitly in the comparisons. Our experiments with a number of known benchmark task graphs and with two well-known problems of linear algebra, LU decomposition and Gauss–Jordan elimination, demonstrate the scheduling results obtained with our algorithm. In our study, we consider ten scheduling algorithms: min–min, chaining, A*, genetic algorithms, simulated annealing, tabu search, HLFET, ISH, DSH with task duplication, and our proposed algorithm (PSGT).

4.
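A new algorithm for scheduling Fork-Join task graphs   Total citations: 17, self-citations: 1, citations by others: 16
Liu Zhenying, Fang Binxing, Jiang Yu, Zhang Yi, Zhao Hong. Journal of Software (软件学报), 2002, 13(4): 693-697
Task scheduling is one of the key factors affecting the efficiency of a network of workstations. Fork-Join task graphs can represent many parallel structures, but other existing algorithms for scheduling Fork-Join task graphs ignore the fact that communications cannot proceed in parallel in a network of workstations that is not fully connected, and some highly efficient algorithms do not consider saving processors. Therefore, specifically for this kind of task graph, a static scheduling algorithm based on task duplication, TSA_FJ, is proposed; it jointly considers schedule length, non-parallel communication, and the number of processors used. Multiple Fork-Join task graphs were generated with random task execution times and communication times, and were scheduled with TSA_FJ and with other scheduling algorithms. The results show that ...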

5.
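For bus-based cluster systems, this paper proposes a new task-duplication-based algorithm for scheduling Fork-Join task graphs. The algorithm computes the schedule length by partitioning the task set and, without increasing the schedule length, schedules tasks onto processors already in use wherever possible, which saves processors. The new algorithm has a higher time complexity than existing algorithms, but its scheduling performance is the best.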

6.
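Task scheduling is one of the challenging problems in parallel and distributed computing. Most practical scheduling algorithms are heuristic and therefore often leave room for improvement. For the Out-Tree task graph, a basic structure, a heuristic scheduling algorithm based on task duplication is proposed. While guaranteeing the shortest schedule length, the algorithm pays attention to processor load balancing in order to save processors. Comparative experiments show that the algorithm guarantees the shortest schedule length and uses the fewest processors. The algorithm therefore improves system utilization, avoids excessive resource consumption, and is more practical.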

7.
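Many existing scheduling algorithms suffer from excessive time complexity or a low scheduling success rate. A new scheduling algorithm (HRTSA) is proposed to raise the scheduling success rate of real-time tasks. HRTSA first initializes clusters with the METC strategy to reduce the algorithm's time complexity; then, when placing tasks, it controls processor load according to processor load balance; finally, it applies task-duplication scheduling to raise the scheduling success rate. Comparative experiments show that, compared with RTSDA, HRTSA has a lower time complexity and a higher scheduling success rate.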

8.
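Most existing scheduling algorithms for Fork-Join task graphs assume homogeneous processors and consider neither the heterogeneity of processors in real applications nor the saving of processors, which makes them inefficient in practice. Therefore, the scheduling of Fork-Join task graphs is studied and a greedy scheduling algorithm for heterogeneous environments is proposed. The algorithm achieves a high speedup and overall efficiency, and its time complexity is O(v^2), where v is the number of tasks in the task set. Experimental results show that, compared with other algorithms, it yields a shorter schedule length and a shorter completion time, uses fewer processors, and is more practical.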

9.
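A clustering scheduling algorithm based on task duplication for multiprocessors   Total citations: 2, self-citations: 1, citations by others: 1
The quality of task scheduling is one of the important factors determining the performance of parallel and distributed computer systems. To optimize task scheduling, a new task scheduling algorithm is proposed, building on some typical algorithms (such as LG and PPA). On the one hand, the algorithm duplicates predecessor tasks that satisfy certain conditions to shorten the schedule length; on the other hand, it duplicates other predecessor tasks where appropriate and merges redundant clusters to reduce the number of processors required. Experiments show that the algorithm outperforms these typical algorithms in schedule length and in the number of processors required, has a lower time complexity, and is of some value for improving the performance of parallel computer systems.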

10.
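A processor pre-allocation algorithm based on task duplication   Total citations: 12, self-citations: 2, citations by others: 12
Scheduling algorithms based on task duplication perform better than those without task duplication. After analyzing several typical task-duplication-based algorithms (such as TDS and OSA) and their assumptions, the article proposes the processor pre-allocation algorithm PPA, whose primary objective is to minimize the schedule length and whose secondary objective is to reduce the number of processors. The algorithm places no restrictions on task computation times or inter-task communication times (that is, it does not consider task granularity). Comparison with related work shows that PPA is better than or comparable to other algorithms in both schedule length and the number of processors used, while having the same time complexity as TDS and OSA. This is of significance for embedded real-time distributed systems.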

11.
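Most existing scheduling algorithms for Join task graphs were not developed for environments with communication contention and do not consider saving processors, so they perform poorly in practice. Therefore, a scheduling algorithm for Join task graphs in a communication-contention environment is proposed. The algorithm improves scheduling efficiency by serializing communication edges, and its time complexity is O(v log v), where v is the number of tasks in the graph. Experimental results show that, compared with other algorithms, its schedule length is shorter and it uses ...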

12.
This paper addresses the problem of scheduling parallel programs represented as directed acyclic task graphs for execution on distributed memory parallel architectures. Because of the high communication overhead in existing parallel machines, a crucial step in scheduling is task clustering, the process of coalescing fine grain tasks into single coarser ones so that the overall execution time is minimized. The task clustering problem is NP-hard, even when the number of processors is unbounded and task duplication is allowed. A simple greedy algorithm is presented for this problem which, for a task graph with arbitrary granularity, produces a schedule whose makespan is at most twice optimal. Indeed, the quality of the schedule improves as the granularity of the task graph becomes larger. For example, if the granularity is at least 1/2, the makespan of the schedule is at most 5/3 times optimal. For a task graph with n tasks and e inter-task communication constraints, the algorithm runs in O(n(n lg n+e)) time, which is n times faster than the currently best known algorithm for this problem. Similar algorithms are developed that produce: (1) optimal schedules for coarse grain graphs; (2) 2-optimal schedules for trees with no task duplication; and (3) optimal schedules for coarse grain trees with no task duplication.

13.
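Complex scheduling problems in modern parallel systems can be transformed into the task scheduling of Fork-join graphs. In real computing environments, however, communication between two processing nodes is mostly exclusive, and most existing task scheduling algorithms ignore the exclusiveness of communication channels. A scheduling algorithm for Fork-join graphs with communication constraints, CCTD, is proposed. The algorithm introduces the communication exclusiveness constraint of real environments, guarantees duplication-based optimized scheduling of Fork-join graphs, and reduces processor usage as far as possible. Experimental results show that CCTD is an adaptable and efficient scheduling algorithm for Fork-join graphs.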

14.
Contention-aware scheduling with task duplication   Total citations: 1, self-citations: 0, citations by others: 1
Finding an efficient schedule for a task graph on several processors is a trade-off between maximising concurrency and minimising interprocessor communication. Task duplication is a technique that has been employed to reduce or avoid interprocessor communication. Certain tasks are duplicated on several processors to produce the data locally and avoid the communication among processors. Most of the algorithms using task duplication have been proposed for the classic scheduling model, which allows concurrent communication and ignores contention for communication resources. It is increasingly recognised that this classic model is unrealistic and does not permit creating accurate and efficient schedules. The recently proposed contention model introduces contention awareness into task scheduling by assigning the edges of the task graph to the links of the communication network. It is intuitive that scheduling under such a model benefits even more from task duplication, yet no such algorithm has been proposed as it is not trivial to duplicate tasks under the contention model. This paper proposes a contention-aware task duplication scheduling algorithm. We investigate the fundamentals for task duplication in the contention model and propose an algorithm that is based on state-of-the-art techniques found in task duplication and contention-aware algorithms. An extensive experimental evaluation demonstrates the significant improvements to the speedup of the produced schedules.
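
The difference between the classic model and a contention model can be made concrete with a deliberately simplified sketch. It is not the algorithm of the paper: it assumes a single shared link, messages served in order of their ready time, and a hypothetical function name `arrival_times`. Under the classic model every message travels concurrently; under contention the link carries one message at a time.

```python
from typing import List, Tuple


def arrival_times(messages: List[Tuple[float, float]], contention: bool) -> List[float]:
    """Arrival time of each (ready_time, duration) message (illustrative only).

    contention=False: classic model, all messages are sent concurrently.
    contention=True: one shared link, messages are serialized by ready time.
    """
    if not contention:
        return [ready + duration for ready, duration in messages]

    arrivals = [0.0] * len(messages)
    link_free = 0.0  # time at which the shared link becomes idle
    # sorted() is stable, so messages with equal ready times keep input order.
    for index in sorted(range(len(messages)), key=lambda i: messages[i][0]):
        ready, duration = messages[index]
        start = max(ready, link_free)
        link_free = start + duration
        arrivals[index] = link_free
    return arrivals


print(arrival_times([(2.0, 4.0), (2.0, 4.0)], contention=False))
print(arrival_times([(2.0, 4.0), (2.0, 4.0)], contention=True))
```

Two messages that become ready at time 2 and each take 4 time units both arrive at time 6 in the classic model, but at times 6 and 10 once the link is exclusive. That delay is the kind of cost a duplicated task can avoid by producing the data locally.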

15.
Improving scheduling of tasks in a heterogeneous environment   Total citations: 1, self-citations: 0, citations by others: 1
Optimal scheduling of parallel tasks with precedence relationships onto a parallel machine is known to be NP-complete. The complexity of the problem increases when task scheduling is to be done in a heterogeneous environment, where the processors in the network may not be identical and take different amounts of time to execute the same task. We introduce a task duplication-based scheduling algorithm for network of heterogeneous systems (TANH), with complexity O(V^2), which provides optimal results for applications represented by directed acyclic graphs (DAGs), provided that a simple set of conditions on task computation and network communication time is satisfied. The performance of the algorithm is illustrated by comparing the scheduling time with an existing "best imaginary level scheduling (BIL)" scheme for heterogeneous systems. The scalability to a higher or lower number of processors, according to their availability, is also discussed. We show that the algorithm provides a substantial improvement over existing work on the task duplication-based scheduling algorithm (TDS).

16.
Complex parallel applications can often be modeled as directed acyclic graphs of coarse-grained application tasks with dependences. These applications exhibit both task and data parallelism, and combining these two (also called mixed parallelism) has been shown to be an effective model for their execution. In this paper, we present an algorithm to compute the appropriate mix of task and data parallelism required to minimize the parallel completion time (makespan) of these applications. In other words, our algorithm determines the set of tasks that should be run concurrently and the number of processors to be allocated to each task. The processor allocation and scheduling decisions are made in an integrated manner and are based on several factors such as the structure of the task graph, the runtime estimates and scalability characteristics of the tasks, and the intertask data communication volumes. A locality-conscious scheduling strategy is used to improve intertask data reuse. Evaluation through simulations and actual executions of task graphs derived from real applications and synthetic graphs shows that our algorithm consistently generates schedules with a lower makespan as compared to Critical Path Reduction (CPR) and Critical Path and Allocation (CPA), two previously proposed scheduling algorithms. Our algorithm also produces schedules that have a lower makespan than pure task- and data-parallel schedules. For task graphs with known optimal schedules or lower bounds on the makespan, our algorithm generates schedules that are closer to the optima than other scheduling approaches.

17.
In this paper, we propose a static scheduling algorithm for allocating task graphs to fully connected multiprocessors. We discuss six recently reported scheduling algorithms and show that each has some drawback that can lead to poor performance. The proposed algorithm, which is called the Dynamic Critical-Path (DCP) scheduling algorithm, differs from the previously proposed algorithms in a number of ways. First, it determines the critical path of the task graph and selects the next node to be scheduled in a dynamic fashion. Second, it rearranges the schedule on each processor dynamically, in the sense that the positions of the nodes in the partial schedules are not fixed until all nodes have been considered. Third, it selects a suitable processor for a node by looking ahead at the potential start times of the remaining nodes on that processor, and schedules relatively less important nodes to the processors already in use. A global as well as a pair-wise comparison is carried out for all seven algorithms under various scheduling conditions. The DCP algorithm outperforms the previous algorithms by a considerable margin. Despite having a number of new features, the DCP algorithm has admissible time complexity, is economical in terms of the number of processors used, and is suitable for a wide range of graph structures.
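
The critical path mentioned in this abstract can be computed for a fixed task graph as the longest path through it, counting both execution and communication costs. The sketch below shows only that static computation and is not the DCP algorithm, which re-evaluates the critical path as nodes are scheduled. The function name `critical_path_length` and the dictionary-based graph encoding are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def critical_path_length(
    exec_time: Dict[str, float],
    edges: Dict[Tuple[str, str], float],
) -> float:
    """Length of the longest path in a task DAG (illustrative only).

    exec_time maps each task to its execution time.
    edges maps (predecessor, successor) to the communication cost.
    """
    successors: Dict[str, List[Tuple[str, float]]] = {task: [] for task in exec_time}
    for (pred, succ), cost in edges.items():
        successors[pred].append((succ, cost))

    memo: Dict[str, float] = {}

    def bottom_level(task: str) -> float:
        # Longest path from the start of this task to the end of the graph.
        if task not in memo:
            tail = max(
                (cost + bottom_level(succ) for succ, cost in successors[task]),
                default=0.0,
            )
            memo[task] = exec_time[task] + tail
        return memo[task]

    return max(bottom_level(task) for task in exec_time)


times = {"a": 2.0, "b": 3.0, "c": 3.0, "d": 1.0}
costs = {("a", "b"): 4.0, ("a", "c"): 4.0, ("b", "d"): 1.0, ("c", "d"): 1.0}
print(critical_path_length(times, costs))
```

For this diamond-shaped graph the longest path a, b, d has length 2 + 4 + 3 + 1 + 1 = 11. Placing two adjacent tasks on the same processor zeroes the communication cost between them, which is why the critical path changes as scheduling proceeds.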

18.
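Zhang Yan, Li Yanhong. Journal of Computer Applications (计算机应用), 2006, 26(5): 1161-1163
Out-Tree task graphs represent a large class of problems solved by divide-and-conquer algorithms. This paper proposes a new scheduling algorithm specifically for this kind of task graph. It uses the optimal schedule of the fork structure to define a priority for each task, which accurately reflects each task's influence on the schedule, guarantees the correct scheduling order of the tasks, and yields a good schedule length. Without changing the schedule length, it assigns nodes to processors already in use wherever possible, which saves processors. Experiments show that the scheduling performance of the algorithm is better than that of existing algorithms of the same kind.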

19.
This paper presents a hybrid methodology for scheduling task graphs on multiprocessor embedded systems. The proposed methodology is designed for task graphs which are dynamic in nature due to the presence of conditional tasks as well as tasks whose execution times are unpredictable but bounded. We present the methodology as a three-phase strategy in which task nodes are mapped to the processors in the first (static mapping) phase. In the second (selective duplication) phase, some critical nodes are identified and duplicated for possible rescheduling at run-time, depending on the code memory constraints of the processors. The third (online) phase is a run-time scheduling algorithm that performs list scheduling based on the actual dynamics of the schedule up to the current time. We show that this technique provides better schedule length (up to 20%) compared to previous techniques which are predominantly static in nature, with low overhead and complexity comparable to that of existing online techniques. The effects of model parameters like number of processors, memory and various task graph parameters on performance are investigated in this paper.

20.
Genetic algorithms for task scheduling problem   Total citations: 1, self-citations: 0, citations by others: 1
The scheduling and mapping of a precedence-constrained task graph onto processors is considered to be the most crucial NP-complete problem in parallel and distributed computing systems. Several genetic algorithms have been developed to solve this problem. A common feature of most of them is the use of a chromosomal representation for a schedule. However, these algorithms are monolithic, as they attempt to scan the entire solution space without considering how to reduce the complexity of the optimization process. In this paper, two genetic algorithms have been developed and implemented. They are genetic algorithms to which some heuristic principles have been added to improve performance. In the first genetic algorithm, two fitness functions are applied one after the other. The first fitness function is concerned with minimizing the total execution time (schedule length), and the second is concerned with load balance. The second genetic algorithm is based on a task duplication technique to overcome the communication overhead. The proposed algorithms have been implemented and evaluated using benchmarks. According to the results obtained, our algorithms always outperform the traditional algorithms.
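
The two fitness criteria named in this abstract, schedule length and load balance, can be sketched for a single candidate schedule. The abstract does not give the exact formulas, so both functions below are assumptions for illustration: `schedule_length` simulates a given task order and processor assignment, and `load_imbalance` uses the ratio of the heaviest processor load to the average load (1.0 means perfectly balanced). All names are hypothetical.

```python
from typing import Dict, List, Tuple


def schedule_length(
    order: List[str],
    assignment: Dict[str, int],
    exec_time: Dict[str, float],
    edges: Dict[Tuple[str, str], float],
) -> float:
    """Finish time of the last task; order must be a topological order."""
    finish: Dict[str, float] = {}
    proc_ready: Dict[int, float] = {}
    for task in order:
        proc = assignment[task]
        ready = proc_ready.get(proc, 0.0)
        for (pred, succ), cost in edges.items():
            if succ == task:
                # Communication is free when both tasks share a processor.
                delay = 0.0 if assignment[pred] == proc else cost
                ready = max(ready, finish[pred] + delay)
        finish[task] = ready + exec_time[task]
        proc_ready[proc] = finish[task]
    return max(finish.values())


def load_imbalance(assignment: Dict[str, int], exec_time: Dict[str, float]) -> float:
    """Heaviest processor load divided by the average load (assumed metric)."""
    loads: Dict[int, float] = {}
    for task, proc in assignment.items():
        loads[proc] = loads.get(proc, 0.0) + exec_time[task]
    average = sum(loads.values()) / len(loads)
    return max(loads.values()) / average


exec_time = {"a": 2.0, "b": 3.0, "c": 3.0, "d": 1.0}
edges = {("a", "b"): 4.0, ("a", "c"): 4.0, ("b", "d"): 1.0, ("c", "d"): 1.0}
order = ["a", "b", "c", "d"]
one_proc = {"a": 0, "b": 0, "c": 0, "d": 0}
two_procs = {"a": 0, "b": 0, "c": 1, "d": 0}
print(schedule_length(order, one_proc, exec_time, edges))
print(schedule_length(order, two_procs, exec_time, edges))
```

In this example the single-processor assignment finishes at time 9, while moving task c to a second processor finishes at time 11 because of the communication costs. A genetic algorithm would evaluate each chromosome with functions of this kind and prefer the shorter schedule.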
