Similar Documents
 20 similar documents found; search took 158 ms.
1.
方明  袁由光 《计算机科学》2007,34(2):284-288
For Out-Tree tasks in real-time distributed systems, a heuristic scheduling algorithm (HSA-OT) is proposed, together with an optimal checkpointing strategy for multiprocessors. The scheduling algorithm guarantees a minimal schedule length while using as few processors as possible and incurring no interprocessor communication overhead. The checkpointing strategy has no global-consistency overhead and minimizes the failure rate of each processor.
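The optimal-checkpoint idea in this abstract can be illustrated with Young's classic first-order approximation of the optimal checkpoint interval. This is a generic stand-in, not the paper's exact HSA-OT strategy; the function and parameter names are illustrative:

```python
import math

def optimal_checkpoint_interval(ckpt_cost: float, mtbf: float) -> float:
    """Young's first-order approximation of the optimal checkpoint
    interval: T_opt = sqrt(2 * C * MTBF), where C is the time to take
    one checkpoint and MTBF is the mean time between failures."""
    return math.sqrt(2.0 * ckpt_cost * mtbf)

# e.g. a 2 s checkpoint cost with a 100 s MTBF gives a 20 s interval
```

The approximation balances checkpoint overhead (which grows with shorter intervals) against expected rework after a failure (which grows with longer ones).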

2.
Checkpointing is a software fault-tolerance mechanism; combining it with the grid environment improves the quality of service of grid computing and better satisfies the requirements of grid systems. This paper studies how to implement checkpointing for grid applications so that, after a compute node fails, the grid environment can restore the affected processes to the state recorded at the last checkpoint and resume execution from there. This avoids re-executing the entire task, saves a large amount of redundant computation time, and provides a fault-tolerance service.

3.
《软件》2019,(12):6-12
An improved task-partitioning strategy with duplication, D-ITPS (Improved Task Partitioning Strategy with duplication), is proposed. The algorithm first merges those tasks in the DAG that satisfy the merging condition, then partitions all tasks into packages according to the partitioning strategy, and schedules each package as a whole onto a processor using the Max-Min strategy. After the basic mapping is complete, it checks each chromosome (candidate schedule) to see whether task duplication can reduce communication time; if so, tasks are duplicated in the processor's idle time slots to shorten the overall schedule length.
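The Max-Min mapping step used above can be sketched as follows. This is the generic Max-Min heuristic, not D-ITPS itself, and the input layout (a task-by-processor runtime matrix) is an assumption for illustration:

```python
def max_min_schedule(exec_times):
    """Max-Min heuristic: exec_times[i][j] = runtime of task i on processor j.
    Repeatedly pick the unscheduled task whose best (minimum) completion
    time is largest, and assign it to the processor achieving that minimum."""
    n_tasks = len(exec_times)
    n_procs = len(exec_times[0])
    ready = [0.0] * n_procs          # time at which each processor becomes free
    assignment = [None] * n_tasks
    unscheduled = set(range(n_tasks))
    while unscheduled:
        best = None                   # (completion_time, task, processor)
        for t in unscheduled:
            # minimum completion time of task t over all processors
            ct, p = min((ready[j] + exec_times[t][j], j) for j in range(n_procs))
            if best is None or ct > best[0]:
                best = (ct, t, p)
        ct, t, p = best
        assignment[t] = p
        ready[p] = ct
        unscheduled.remove(t)
    return assignment, max(ready)
```

Scheduling the "big" tasks first tends to keep long tasks off the critical path, which is why Max-Min is a common choice for package-level mapping.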

4.
As a key component of high-performance computing, matrix multiplication is a typical application that is both compute- and memory-intensive, so optimizing its performance is very important for general-purpose processors. To improve matrix-multiplication performance, this paper proposes a performance model that predicts the execution time of matrix multiplication on a general-purpose processor. The model captures the relationship between execution time and architectural parameters such as the functional units, memory bandwidth, and number of registers, and can guide architectural optimization to balance compute and memory capability and raise execution speed. Based on the model, theoretical lower bounds are derived for the number of registers and the memory bandwidth an optimized general-purpose processor should provide. The model is validated on the Godson-3B processor platform; experimental results show that the predicted execution time of matrix multiplication is more than 95% accurate. Based on the model, an optimization of the Godson-3B architecture is also proposed that reduces matrix-multiplication execution time by about 50%.
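A minimal sketch of the kind of compute-versus-memory performance model the abstract describes, in the spirit of a roofline bound. The traffic formula and all parameter names are illustrative assumptions, not the paper's actual model:

```python
def predict_gemm_time(n, flops_per_cycle, bytes_per_cycle, block):
    """Roofline-style estimate for an n x n x n double-precision GEMM:
    execution time (in cycles) is the max of pure compute time and pure
    memory-traffic time.  With B x B cache blocking, the A and B panels
    are re-read roughly n/B times, so traffic ~ 8 * (2*n^3/B + n^2) bytes."""
    flops = 2.0 * n ** 3
    traffic = 8.0 * (2.0 * n ** 3 / block + n * n)
    return max(flops / flops_per_cycle, traffic / bytes_per_cycle)
```

When the compute term dominates, adding bandwidth no longer helps; the crossover point is exactly the kind of balance condition such a model can use to derive lower bounds on registers and bandwidth.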

5.
A nonuniform distributed processor system consists of several processors; this paper studies the assignment of program modules in such a system. The goal is to distribute modules across the processors, taking total interprocessor communication overhead into account, so that the modules execute as fast as possible. The factors relevant to distribution cost are: 1) the amount of computation each module requires; 2) the volume of data transferred between each pair of modules; 3) the speed of each processor; 4) the speed of the communication network between each pair of processors. The paper describes a shortest-tree algorithm that minimizes the sum of execution time and communication cost in a distributed system with any number of processors connected in any fashion. The module assignment produced by the algorithm has a tree structure. Using dynamic programming, the algorithm solves the problem for m modules and n processors in O(mn²) time. The paper notes that the algorithm also applies to the optimal scheduling of a set of tasks whose precedence relation forms a tree, where processor execution costs and communication loads vary over time. In that setting the system assigns tasks to processors, delaying a task's execution as its deadline approaches so as to exploit brief windows of light load on individual processors and on the communication network. The algorithm can then minimize the sum of execution cost (which depends on the processor and on time), communication cost (which depends on the particular network and on time), and the penalty paid to avoid missing deadlines. Index terms: computer networks, distributed processing, precedence trees, program assignment, scheduling, shortest trees.
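The O(mn²) dynamic program over a module tree can be sketched roughly as follows. The data structures (`children`, `comm`, `link`) and their shapes are hypothetical, and this is the textbook tree-DP idea rather than the paper's exact shortest-tree algorithm:

```python
def assign_modules(children, exec_cost, comm, link, root=0):
    """Tree DP for module-to-processor assignment, O(m * n^2) overall:
    dp[v][p] = minimum cost of running the subtree rooted at module v
    when v is placed on processor p.
      children  : dict mapping module -> list of child modules (a tree)
      exec_cost : exec_cost[v][p] = cost of running module v on processor p
      comm      : comm[(v, c)] = data volume between module v and child c
      link      : link[p][q] = per-unit cost of the p<->q network link"""
    n = len(exec_cost[0])
    dp = {}

    def solve(v):
        for c in children.get(v, []):
            solve(c)
        dp[v] = []
        for p in range(n):
            cost = exec_cost[v][p]
            for c in children.get(v, []):
                # best placement of child c given that v sits on p
                cost += min(dp[c][q] + comm[(v, c)] * link[p][q]
                            for q in range(n))
            dp[v].append(cost)

    solve(root)
    return min(dp[root])
```

Each module is examined once, and for each of its n placements the best of the child's n placements is taken, giving the quoted O(mn²) bound.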

6.
To address the performance degradation during secure scheduling caused by the rapidly growing latency of data synchronization in existing mimic storage architectures, a pre-synchronization model is proposed in which standby executors in the heterogeneous pool use checkpoints to pre-synchronize data, reducing the time needed to bring an executor online. Building on the model's synchronization characteristics and switching/scheduling behavior, an execution cycle maximum efficiency checkpointing (CMEC) method is further proposed. By maximizing the effective work rate of each execution cycle it derives the optimal checkpoint interval, striking a good balance between checkpoint overhead and rollback overhead. Experiments show that, compared with the existing full-synchronization strategy, the method shortens the synchronization time during executor bring-up, improves synchronization efficiency, and preserves service stability and continuity as workload keeps increasing.
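The "maximize effective work per execution cycle" idea behind CMEC can be illustrated with a toy search over candidate intervals. The efficiency formula below is an illustrative stand-in, not the paper's derivation:

```python
def best_interval(ckpt_cost, rollback_cost, fail_rate, candidates):
    """Pick the checkpoint interval T maximizing the effective work
    rate per execution cycle:
        efficiency(T) = T / (T + C + lambda*T*(R + T/2))
    where C is the checkpoint cost, R the rollback cost, and lambda
    the failure rate (expected loss ~ half an interval per failure).
    A stand-in objective, assumed for illustration only."""
    def efficiency(T):
        expected_rollback = fail_rate * T * (rollback_cost + T / 2.0)
        return T / (T + ckpt_cost + expected_rollback)
    return max(candidates, key=efficiency)
```

Very short intervals waste time on checkpoint overhead, very long ones on rollback after a failure; the maximizer of the efficiency curve sits between the two, which is the trade-off CMEC targets.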

7.
As multiprocessor systems continue to grow in scale, energy saving has become an urgent problem. This paper therefore proposes an online energy-aware real-time scheduling algorithm for stochastic tasks on multiprocessor systems. Using statistical methods, it estimates from the arrival times and computation demands of past tasks the voltage/frequency at which a new task should run on an idle processor, so that tasks yet to arrive can meet their deadlines while saving energy effectively. For the tasks running on a single processor, the algorithm computes the average voltage/frequency needed to execute them, keeping execution speeds as balanced as possible; when some tasks cannot meet their deadlines, the voltage/frequency of tasks not yet executed is raised. Experimental results show that, compared with the EDF, HVEA, MEG, and ME-MC algorithms, the proposed algorithm has clear advantages in both meeting deadlines and saving energy.

8.
In safety-critical systems, to address the execution-time variability caused by contention for shared hardware resources on multicore processors, a general measurement method based on the PMR and RSB performance counters is proposed. By capturing the hardware events correlated with execution-time variability, it analyzes the sharing of hardware resources, the variability of execution times, and the black-box or grey-box behavior of the hardware platform. The method can be used both for performance evaluation of hardware platforms and for resource-consumption assessment of application tasks, providing guidance for WCET estimation.

9.
A DVS soft real-time scheduling algorithm for uncertain environments
To reduce the energy consumption of embedded soft real-time systems, a DVS scheduling algorithm is proposed. Its distinguishing feature is that it tolerates the interference caused by uncertain task execution times and searches at run time for the optimal voltage-scaling plan. Experiments show that the algorithm preserves the efficiency and stability of soft real-time systems well: even when the processor is overloaded it adapts automatically, and more than 99% of jobs complete before their deadlines. Evaluations on a variety of random task sets show that the algorithm reduces system energy consumption by more than 15% on average.

10.
Power optimization for preemption threshold scheduling
Applying DVS (Dynamic Voltage Scaling) lengthens task execution times and thus rapidly increases the processor's static power consumption (caused by leakage current in CMOS circuits). Procrastination scheduling, proposed in recent years, is an effective way to reduce static power: it postpones the normal execution of tasks so that the processor stays asleep or powered off for as long as possible, avoiding excessive static leakage. For periodic task sets scheduled under a preemption-threshold policy on a voltage-scalable processor, this paper combines energy-aware scheduling with procrastination scheduling into a two-phase energy-saving algorithm: an offline algorithm first computes the optimal processor speed for each task, and an online simulated-scheduling algorithm then computes each task's procrastination time, dynamically determining when to power the processor on and off. Case studies and simulation experiments show that the authors' approach further reduces the power consumption of preemption-threshold task scheduling.

11.
Judicious use of dynamic voltage scaling can effectively reduce the energy needed to run real-time tasks. A new single-task DVS scheduling method is proposed that uses the program's average-case execution profile, combined with a parameterized dynamic prediction strategy, to place voltage/frequency scaling points sensibly. Experimental results show that the method fully exploits dynamic slack time, keeps scaling overhead under control, and achieves a high energy-optimization ratio.

12.
王一拙  陈旭  计卫星  苏岩  王小军  石峰 《软件学报》2016,27(7):1789-1804
Task-parallel programming models have become the mainstream of parallel programming, improving parallel-computer system performance by exploiting task parallelism. This paper proposes a fault-tolerant task-parallel programming model that integrates fault-tolerance techniques into the programming model, improving system reliability while preserving performance. The model takes the task as the basic unit of scheduling, execution, error detection, and recovery, implementing fault-tolerance support at the application level. A Buffer-Commit computation model supports the detection of and recovery from transient errors; application-level diskless checkpointing recovers from permanent errors such as node failures; and a fault-tolerant work-stealing scheduling policy provides dynamic load balancing. Experimental results show that the model provides fault tolerance against hardware errors at low performance cost.

13.
This paper presents a flexible, portable, and transparent solution for strong mobility of composed Web services relying on policy-oriented techniques. The proposed approach provides a checkpoint solution based on automatic code instrumentation using correct source-code transformation rules. This checkpoint technique makes it possible to save the execution state of a mobile orchestration process as well as the execution states of its orchestrated partners, so that after migration only code that has not yet executed is resumed. In addition, our approach enables dynamic adaptation of the employed checkpointing and mobility techniques using aspects. For that, we use policies allowing dynamic selection of the checkpointing and mobility techniques according to the execution context. Moreover, the proposed solution includes a module for determining the checkpointing interval that satisfies QoS requirements. Experiments show the efficiency of the proposed solution.

14.
Adaptive checkpointing strategy to tolerate faults in economy based grid
In this paper, we develop a fault-tolerant job scheduling strategy in order to tolerate faults gracefully in an economy-based grid environment. We propose a novel adaptive task-checkpointing-based fault-tolerant job scheduling strategy for such a grid. The proposed strategy maintains a fault index of grid resources and dynamically updates it based on the successful or unsuccessful completion of each assigned task. Whenever a grid resource broker has tasks to schedule on grid resources, it uses the fault index from the fault-tolerant schedule manager in addition to a time-optimization heuristic. While scheduling a grid job on a grid resource, the resource broker uses the fault index to apply different intensities of task checkpointing (inserting checkpoints in a task at different intervals). To simulate and evaluate the performance of the proposed strategy, this paper enhances the GridSim Toolkit-4.0 to exhibit fault-tolerance-related behavior. We also compare the checkpointing fault-tolerant job scheduling strategy with the well-known time-optimization heuristic in an economy-based grid environment. From the measured results, we conclude that even in the presence of faults the proposed strategy schedules grid jobs effectively, tolerates faults gracefully, and executes more jobs successfully within the specified deadline and allotted budget. It also improves the overall execution time and minimizes the execution cost of grid jobs.
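The fault-index bookkeeping described above might look roughly like this. The update rule and the index-to-interval mapping are assumptions for illustration, not the paper's exact scheme:

```python
def update_fault_index(index: int, success: bool, step: int = 1) -> int:
    """Decrease a resource's fault index on a successful task
    completion, increase it on a failure (floored at zero)."""
    return max(0, index - step) if success else index + step

def checkpoint_interval(base_interval: float, fault_index: int) -> float:
    """Map a fault index to a checkpointing intensity: the higher the
    index (the less reliable the resource), the shorter the interval
    between the checkpoints inserted into a task."""
    return base_interval / (1 + fault_index)
```

A broker using this scheme checkpoints aggressively on resources with a history of failures and sparsely on reliable ones, which is the adaptivity the strategy aims for.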

15.
Scientific workflow systems often operate in unreliable environments, and have accordingly incorporated different fault tolerance techniques. One of them is the checkpointing technique combined with its corresponding rollback recovery process. Different checkpointing schemes have been developed and at various levels: task- (or activity-) level and workflow-level. At workflow-level, the usually adopted approach is to establish a checkpointing frequency in the system which determines the moment at which a global workflow checkpoint – a snapshot of the whole workflow enactment state at normal execution (without failures) – has to be accomplished. We describe an alternative workflow-level checkpointing scheme and its corresponding rollback recovery process for hierarchical scientific workflows in which every workflow node in the hierarchy accomplishes its own local checkpoint autonomously and in an uncoordinated way after its enactment. In contrast to other proposals, we utilise the Reference net formalism for expressing the scheme. Reference nets are a particular type of Petri nets which can more effectively provide the abstractions to support and to express hierarchical workflows and their dynamic adaptability.

16.
An online energy-aware real-time scheduling algorithm based on Global EDF for multicore systems
张冬松  吴彤  陈芳园  金士尧 《软件学报》2012,23(4):996-1009
As energy consumption in multicore systems becomes an increasingly pressing concern, reducing system energy while meeting timing constraints is one of the urgent problems in real-time energy-aware multicore scheduling. Existing work assumes the attributes of real-time tasks are known in advance, whereas in practice a task's attributes only become available once it arrives. For a general task model and without any prior knowledge, this paper proposes an online energy-aware hard real-time scheduling algorithm for multicore systems based on Global EDF. By introducing a speed-scaling factor, exploiting slack time, and combining dynamic power management with dynamic voltage/frequency scaling, it lowers task execution speeds and reaches a sensible trade-off between real-time constraints and energy savings. The proposed algorithm performs voltage/frequency scaling only at context switches and task completions, has low computational complexity, and is easy to implement in a real-time operating system. Experimental results show that the algorithm works with different kinds of on-chip voltage/frequency scaling and consistently saves more energy than Global EDF: 15%-20% at best and 5%-10% at worst.

17.
The execution times of large-scale parallel applications on today's multi/many-core systems are usually longer than the mean time between failures. Therefore, parallel applications must tolerate hardware failures to ensure that not all completed computation is lost on machine failures. Checkpointing and rollback recovery is one of the most popular techniques for implementing fault-tolerant applications. However, checkpointing parallel applications is expensive in terms of computing time, network utilization, and storage resources. Thus, current checkpoint-recovery techniques should minimize these costs in order to be useful for large-scale systems. In this paper, three different and complementary techniques to reduce the size of the checkpoints generated by application-level checkpointing are proposed and implemented. Detailed experimental results obtained on a multicore cluster show the effectiveness of the proposed methods in reducing checkpointing cost.

18.
Cloud computing offers new computing paradigms, capacity, and flexible solutions to high-performance computing (HPC) applications. For example, Hardware as a Service (HaaS) allows users to provision a large number of virtual machines (VMs) for computation-intensive applications. Due to the large number of VMs and electronic components in an HPC system in the cloud, any fault during execution would result in re-running the applications, costing time, money, and energy. In this paper we present a proactive fault tolerance (FT) approach for HPC systems in the cloud that reduces the wall-clock execution time and dollar cost in the presence of faults. We also develop a generic FT algorithm for HPC systems in the cloud; our algorithm does not rely on a spare node being provisioned prior to the prediction of a failure. We further develop a cost model for executing computation-intensive applications on HPC systems in the cloud, and analyze the dollar cost of provisioning spare nodes versus checkpointing FT to assess the value of our approach. Our experimental results, obtained from a real cloud execution environment, show that the wall-clock execution time and cost of running computation-intensive applications in the cloud can be reduced by as much as 30%. The checkpointing frequency of computation-intensive applications can be reduced by up to 50% with our FT approach for HPC in the cloud compared with current FT approaches.
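The spare-node-versus-checkpointing cost comparison can be sketched with a toy dollar-cost model. All formulas and parameter names here are illustrative assumptions, not the paper's cost model:

```python
def run_cost(hours, nodes, price_per_node_hour, overhead=0.0):
    """Dollar cost of one run, with an optional fractional time
    overhead (e.g. the slowdown added by taking checkpoints)."""
    return hours * (1.0 + overhead) * nodes * price_per_node_hour

def expected_cost_no_ft(hours, nodes, price, p_fail):
    """Without fault tolerance, a failure (probability p_fail per run)
    forces a complete re-run of the application."""
    return run_cost(hours, nodes, price) * (1.0 + p_fail)

def expected_cost_ckpt(hours, nodes, price, p_fail, overhead, redo_frac):
    """With checkpointing, every run pays the checkpoint overhead, but
    a failure only redoes the fraction of work since the last checkpoint."""
    return run_cost(hours, nodes, price, overhead) * (1.0 + p_fail * redo_frac)
```

Comparing the two expected costs for a given failure probability shows when the checkpoint overhead pays for itself, which is the kind of trade-off the paper's cost analysis quantifies.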

19.
Identification of unnatural control chart patterns (CCPs) from manufacturing process measurements is a critical task in quality control, as these patterns indicate that the manufacturing process is out of control. Recently, there have been numerous efforts to develop pattern recognition and classification methods based on artificial neural networks to automatically recognize unnatural patterns. Most of them assume that only a single type of unnatural pattern exists in the process data. Because of this restrictive assumption, these methods degrade severely when unnatural concurrent CCPs are present in the process data. To address this problem, this paper proposes a novel approach based on singular spectrum analysis (SSA) and a learning vector quantization network to identify concurrent CCPs. The main advantage of the proposed method is that it can be applied to the identification of concurrent CCPs in univariate manufacturing processes. Moreover, there are no permutation or scaling ambiguities in the CCPs recovered by the SSA. These desirable features make the proposed algorithm an attractive alternative for the identification of concurrent CCPs. Computer simulations and a real application to aluminium smelting processes confirm the superior performance of the proposed algorithm on sets of typical concurrent CCPs.

20.
In its simplest structure, cloud computing technology is a massive collection of connected servers residing in a datacenter and continuously changing to provide services to users on demand through a front-end interface. Task failures during execution are no longer rare accidents but a routine attribute of scheduling systems in large-scale distributed environments. Recently, computational intelligence techniques have mostly been used to tackle scheduling problems in the cloud environment, but only a few address fault tolerance. This paper puts forward a Checkpointed League Championship Algorithm (CPLCA) scheduling scheme for cloud computing systems: a fault-tolerance-aware task scheduling mechanism that uses a checkpointing strategy together with task migration to handle unexpected failures of independent task executions. The simulation results show that the proposed CPLCA scheme improves total average makespan by 41%, 33%, and 23% compared with Ant Colony Optimization (ACO), the Genetic Algorithm (GA), and the basic League Championship Algorithm (LCA), respectively. In terms of total average response time, CPLCA improves on ACO, GA, and LCA by 54%, 57%, and 30%, respectively. It also yields a significant decrease in job execution failures, as measured by failure metrics and the performance-improvement rate. From the results obtained, CPLCA improves both task scheduling performance and failure awareness, making it well suited to scheduling in the cloud computing model.
