Similar literature
 Found 20 similar articles (search time: 140 ms)
1.
Efficient data-aware methods in job scheduling, distributed storage management and data management platforms are necessary for the successful execution of data-intensive applications. However, research on methods for data-intensive scientific applications is insufficient in large-scale distributed cloud and cluster computing environments, and data-aware methods are becoming more complex. In this paper, we propose a Data-Locality Aware Workflow Scheduling (D-LAWS) technique and a locality-aware resource management method for data-intensive scientific workflows in HPC cloud environments. D-LAWS applies data locality and data transfer time, based on network bandwidth, to scientific workflow task scheduling and balances resource utilization and task parallelism at the node level. Our method consolidates VMs and considers task parallelism by data flow when planning the task executions of a data-intensive scientific workflow. We additionally consider more complex workflow models and data locality pertaining to the placement and transfer of data prior to task execution. We implement and validate the methods based on fairness in cloud environments. Experimental results show that the proposed methods can improve the performance and data locality of data-intensive workflows in cloud environments.

2.
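To optimize both the monetary cost and the execution efficiency of cloud workflow scheduling, a workflow scheduling algorithm based on directed acyclic graph (DAG) partitioning, PBWS, is proposed. Aiming at the simultaneous optimization of scheduling efficiency and cost, the algorithm divides the scheduling process into three phases: workflow DAG partitioning, partition adjustment, and resource allocation. The DAG partitioning phase computes an initial task partition graph while preserving the execution-order dependencies between tasks; the partition adjustment phase reassigns tasks among partitions to reduce the makespan; the resource allocation phase selects the most cost-efficient task-to-resource mapping so that the total idle time of resources is minimized. Simulation experiments were carried out on five scientific workflow DAG models. The results show that PBWS greatly reduces workflow execution cost at the expense of only a small increase in makespan, achieving simultaneous optimization of scheduling efficiency and scheduling cost, and that its overall performance is better than that of comparable algorithms.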

3.
Security is increasingly critical for scientific workflows, which are big data applications that typically take a considerable amount of time to execute on large-scale distributed infrastructures. A cloud computing platform is one such infrastructure and enables dynamic resource scaling on demand. Nevertheless, under pay-per-use, hourly pricing models, users should pay attention to the cost incurred by renting virtual machines (VMs) from cloud data centers. Meanwhile, workflow tasks are generally heterogeneous and require different instance series (e.g., compute optimized, memory optimized, storage optimized). In this paper, we propose a security and cost aware scheduling (SCAS) algorithm for heterogeneous tasks of scientific workflows in clouds. The proposed algorithm is based on a meta-heuristic optimization technique, particle swarm optimization (PSO), whose coding strategy is devised to minimize the total workflow execution cost while meeting the deadline and risk rate constraints. Extensive experiments using three real-world scientific workflow applications and the CloudSim simulation framework demonstrate the effectiveness and practicality of our algorithm.

4.
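With the rapid development of cloud computing, deploying workflows on cloud platforms has become a common choice. Compared with traditional local workflows, cloud workflows must account not only for requirements such as computation time but also for the monetary cost they incur. To improve resource utilization, cloud providers offer preemptible VM instances, a resource that is very cheap but unstable. To address workflow scheduling and execution in the cloud, a method for configuring and scheduling preemptible VM instances that meets the workflow deadline is proposed. The method uses a Markov model and dynamic programming to predict the price of preemptible VM instances and to derive the lowest-cost bidding strategy. Combined with the workflow deadline requirement, the instances used by the workflow are then configured under the estimated bidding strategy. Experimental results show that, compared with using only on-demand VM instances, the method can save up to 89.9% of the computing cost while meeting the workflow deadline.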

5.
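To reduce the execution cost of scientific workflow scheduling in cloud environments and the energy consumption of data centers, an energy-efficiency-aware cost optimization algorithm for workflow scheduling, CWCO-EA, is proposed. Under a deadline constraint, the algorithm aims to minimize workflow execution cost and reduce energy consumption, and divides workflow task scheduling into four steps. First, a VM selection strategy designed around the concept of cost utility maps each task to its best VM under a sub-makespan constraint. Second, a strategy for merging sequential and parallel tasks reduces both the execution cost and the energy consumption of the workflow. Third, a mechanism for reusing idle VMs improves the utilization of rented VMs and further raises energy efficiency. Finally, a task slack strategy reclaims the capacity of rented VMs and saves energy. Simulation experiments on four scientific workflows show that, compared with algorithms of the same type, CWCO-EA reduces both execution cost and energy consumption while meeting the deadline.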

6.
As cost-driven public cloud services emerge, the budget constraint has become one of the primary design issues for large-scale scientific applications executed on heterogeneous cloud computing systems. Minimizing the schedule length while satisfying the budget constraint of an application is one of the most important quality of service requirements for cloud providers. A directed acyclic graph (DAG) can be used to describe an application consisting of multiple tasks with precedence constraints. Previous DAG scheduling methods presupposed the minimum cost assignment for each task in order to minimize the schedule length of budget-constrained applications on heterogeneous cloud computing systems. However, our analysis revealed that preassigning tasks with the minimum cost does not necessarily minimize the schedule length. In this study, we propose an efficient algorithm, minimizing the schedule length using the budget level (MSLBL), which selects processors so as to satisfy the budget constraint and minimize the schedule length of an application. The problem is decomposed into two sub-problems, namely, satisfying the budget constraint and minimizing the schedule length. The first sub-problem is solved by transferring the budget constraint of the application to that of each task, and the second sub-problem is solved by heuristically scheduling each task with low time complexity. Experimental results on several real parallel applications validate that the proposed MSLBL algorithm obtains shorter schedule lengths than existing methods in various situations while satisfying the budget constraint of an application.
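The step of transferring an application-level budget to per-task budgets can be illustrated with a minimal Python sketch. It assumes a "budget level" defined as the fraction of the gap between the cheapest and the most expensive total assignment that the budget covers; the abstract does not give the exact formula, so this definition, the function name `task_budgets`, and the `cost_table` input are hypothetical.

```python
def task_budgets(cost_table, budget):
    """Split an application budget into per-task budgets.

    cost_table maps each task to the list of its costs on every processor
    (hypothetical input format). The budget level used here is an assumed
    definition, not necessarily the one used by MSLBL.
    """
    min_total = sum(min(costs) for costs in cost_table.values())
    max_total = sum(max(costs) for costs in cost_table.values())
    if budget < min_total:
        raise ValueError("budget is below the cheapest possible assignment")
    if max_total == min_total:
        level = 1.0
    else:
        # Fraction of the min-to-max cost gap that the budget can cover.
        level = min(1.0, (budget - min_total) / (max_total - min_total))
    # Each task receives its minimum cost plus the same fraction of its own gap.
    return {
        task: min(costs) + level * (max(costs) - min(costs))
        for task, costs in cost_table.items()
    }
```

Under this assumption the per-task budgets always sum to at most the application budget, so a scheduler that respects each task's budget also respects the overall constraint.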

7.
This paper proposes a scheduling algorithm for task scheduling in a cloud computing system with time-varying communication conditions. The algorithm converts the scheduling problem with communication changes into a directed acyclic graph (DAG) scheduling problem that contains fuzzy communication task nodes, that is, the scheduling problem for a communication-change DAG (CC-DAG). The CC-DAG contains both computation task nodes and communication task nodes. First, this paper proposes a weighted time-series network bandwidth model to solve the indefinite processing time (cost) problem for a fuzzy communication task node. This model can accurately predict the processing time of a fuzzy communication task node. Second, to address the scheduling order problem for the computation task nodes, a dynamic pre-scheduling search strategy (DPSS) is proposed. This strategy computes the essential paths for the pre-scheduling of the computation task nodes based on the actual computation costs (times) of the computation task nodes and the predicted processing costs (times) of the fuzzy communication task nodes during the scheduling process. The computation task node with the longest essential path is scheduled first because its completion time directly influences the completion time of the task graph. Finally, we demonstrate the proposed algorithm via simulation experiments. The experimental results show that the proposed DPSS improves the total execution time by between 11.5% and 21.2%. In view of these results, the proposed algorithm provides a better-quality scheduling solution for scientific application task execution in the cloud computing environment than the HEFT, PEFT, and CEFT algorithms.
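The two ideas in this abstract, predicting the cost of a communication node from bandwidth history and scheduling the node with the longest remaining path first, can be sketched in Python. This is an illustration under stated assumptions, not the paper's DPSS: the bandwidth predictor is a plain weighted average of past samples, and the "essential path" is taken to be the longest path from a node to an exit node. All function names are hypothetical.

```python
def predict_bandwidth(samples, weights):
    """Weighted average of past bandwidth samples (assumed predictor)."""
    return sum(s * w for s, w in zip(samples, weights)) / sum(weights)

def longest_path_to_exit(succ, cost):
    """Length of the longest path from each node to an exit node.

    succ maps a node to its successors; cost maps a node to its processing
    time. Communication nodes are ordinary nodes whose cost is predicted.
    """
    memo = {}

    def rank(node):
        if node not in memo:
            tail = max((rank(s) for s in succ.get(node, [])), default=0.0)
            memo[node] = cost[node] + tail
        return memo[node]

    return {node: rank(node) for node in cost}

def schedule_order(succ, cost):
    """Order nodes by decreasing longest path to exit.

    With strictly positive costs, a node always ranks above its successors,
    so the order respects precedence constraints.
    """
    ranks = longest_path_to_exit(succ, cost)
    return sorted(cost, key=lambda node: -ranks[node])
```

For example, a communication node that moves 35 units of data over a link whose recent samples are 10 and 20 (weighted 1 and 3) gets a predicted bandwidth of 17.5 and therefore a predicted cost of 2.0 time units, which then enters the path computation like any other node cost.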

8.
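Two-tier virtualized cloud architectures, in which containers are deployed on virtual machines, are increasingly used in cloud data centers. To address data center energy consumption under this architecture, a workflow task scheduling algorithm, TUMS-RTC, is proposed. For parallel workflows with deadline constraints, the algorithm divides scheduling into two phases: time utilization maximization scheduling and runtime compression. The first phase reduces the number of VMs and servers needed to complete the workflow by making full use of the given time window; the second phase compresses VM idle time to shorten the working time of VMs and servers, ultimately reducing energy consumption. The performance of TUMS-RTC was tested on a large number of random workflows with controllable characteristics. Experimental results show that, relative to the comparison algorithms, TUMS-RTC achieves higher resource utilization, a higher VM reduction rate, and a higher energy saving rate, and that it handles large, highly parallel cloud workflows well.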

9.
Cloud computing has established itself as an interesting computational model that provides a wide range of resources, such as storage, databases and computing power, to several types of users. Recently, the concept of cloud computing was extended with the concept of federated clouds, where resources from different cloud providers are interconnected to perform a common action (e.g., execute a scientific workflow). Users can benefit from both single-provider and federated cloud environments to execute their scientific workflows, since they can obtain the necessary amount of resources on demand. Many of these workflows demand high performance and parallelism techniques, since many activities are data and computing intensive and can execute for hours, days or even weeks. Some Scientific Workflow Management Systems (SWfMS) already provide parallelism capabilities for scientific workflows in a single-provider cloud. Most of them rely on creating a virtual cluster to execute the workflow in parallel. However, they also rely on the user to estimate the number of virtual machines to be allocated to this virtual cluster. Most SWfMS use this initial user-defined virtual cluster configuration for the entire workflow execution. Dimensioning the virtual cluster is therefore a top priority, since an under- or over-dimensioned virtual cluster can degrade workflow performance or increase financial costs unnecessarily. This dimensioning is far from trivial in a single-provider cloud, and especially in federated clouds, due to the huge number of virtual machine types to choose from in each location and provider. In this article, we propose an approach named GraspCC-fed to produce the optimal (or near-optimal) estimate of the number of virtual machines to allocate for each workflow. GraspCC-fed extends a previously proposed heuristic based on GRASP for executing standalone applications to consider scientific workflows executed in both single-provider and federated clouds. For the experiments, GraspCC-fed was coupled to an adapted version of the SciCumulus workflow engine for federated clouds. We believe that GraspCC-fed can be an important decision support tool for users and can help determine an optimal configuration of the virtual cluster for parallel cloud-based scientific workflows.

10.
In recent years, scientific workflows have emerged as a fundamental abstraction for structuring and executing scientific experiments in computational environments. Scientific workflows are becoming increasingly complex and more demanding in terms of computational resources, thus requiring the use of parallel techniques and high performance computing (HPC) environments. Meanwhile, clouds have emerged as a new paradigm in which resources are virtualized and provided on demand. By using clouds, scientists have expanded beyond single parallel computers to hundreds or even thousands of virtual machines. Although the initial focus of clouds was to provide high throughput computing, clouds are already being used to provide an HPC environment where elastic resources can be instantiated on demand during the course of a scientific workflow. However, this model also raises many open, yet important, challenges, such as scheduling workflow activities. Scheduling parallel scientific workflows in the cloud is a very complex task, since many different criteria must be taken into account and elasticity must be exploited to optimize workflow execution. In this paper, we introduce an adaptive scheduling heuristic for the parallel execution of scientific workflows in the cloud that is based on three criteria: total execution time (makespan), reliability and financial cost. Besides scheduling workflow activities based on a 3-objective cost model, this approach also scales resources up and down according to the restrictions imposed by scientists before workflow execution. This tuning is based on provenance data captured and queried at runtime. We conducted a thorough validation of our approach using a real bioinformatics workflow. The experiments were performed in SciCumulus, a cloud workflow engine for managing scientific workflow execution.

11.
Workflow scheduling has become one of the hottest topics in cloud environments, and efficient scheduling approaches offer promising ways to maximize the profit of cloud providers by minimizing their cost while guaranteeing the QoS of users’ applications. However, existing scheduling approaches are inadequate for dynamic workflows with uncertain task execution times running in cloud environments, because those approaches assume that cloud computing environments are deterministic and that pre-computed schedule decisions will be followed statically during schedule execution. To address this issue, we introduce an uncertainty-aware scheduling architecture to mitigate the impact of uncertain factors on workflow scheduling quality. Based on this architecture, we present a scheduling algorithm that incorporates both event-driven and periodic rolling strategies (EDPRS) for scheduling dynamic workflows. Lastly, we conduct extensive experiments to compare EDPRS with two typical baseline algorithms using real-world workflow traces. The experimental results show that EDPRS performs better than those algorithms.

12.
Cloud computing, an important source of computing power for the scientific community, requires enhanced tools for the efficient use of resources. Current solutions for workflow execution lack frameworks that analyze applications in depth and consider realistic execution times as well as computation costs. In this study, we propose cloud user–provider affiliation (CUPA) to guide workflow owners in identifying the tools required to run their applications. Additionally, we develop PSO-DS, a specialized scheduling algorithm based on particle swarm optimization. CUPA encompasses the interaction of cloud resources, the workflow manager system and the scheduling algorithm. Its featured scheduler, PSO-DS, converges on a strategic distribution of tasks among resources that efficiently optimizes makespan and monetary cost. We compared the performance of PSO-DS against four well-known scientific workflow schedulers. In a test bed based on VMware vSphere, the schedulers mapped five up-to-date benchmarks representing different scientific areas. PSO-DS proved its efficiency by reducing the makespan and monetary cost of the tested workflows by 75% and 78%, respectively, when compared with the other algorithms. CUPA, with the featured PSO-DS, opens the path to developing a full system in which scientific cloud users can run their computationally expensive experiments.

13.
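In recent years, with the development of distributed computing technologies such as grids and cloud computing workflows, the scheduling of tasks modeled as DAGs (directed acyclic graphs) in distributed systems has become a research focus attracting wide attention. Based on the latest research progress, the DAG task scheduling problem in distributed systems and related techniques are studied and discussed in four parts: a systematic description of the concepts of distributed and heterogeneous distributed systems, together with the DAG task scheduling problem, scheduling models, and typical applications in heterogeneous distributed systems; a classification of existing research on DAG task scheduling in distributed systems along several dimensions; a review of the state of research on scheduling multiple DAGs that share heterogeneous distributed resources; and a discussion of the open problems in that research and possible future directions.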

14.
Optimizing cloud provisioning for scientific workflow applications is a challenging problem, since workflows generally contain dependencies between tasks and must meet specific deadlines. Cloud providers usually offer many options to consumers. These options include the number of virtual machines, the type of each virtual machine and the purchasing method for each machine. Cloud provisioning cost optimization is currently an active research topic. Most of this literature is concerned with task scheduling, cloud option selection, and cloud option selection for scientific workflow applications. However, research that attempts to find solutions covering both cloud option selection and workflow task scheduling is very limited. In this paper, we focus on optimizing the cost of purchasing infrastructure-as-a-service cloud capabilities to achieve scientific workflow execution within the specified deadlines. The proposed system considers the number of purchased instances, instance types, purchasing options, and task scheduling as constraints in an optimization process. Particle swarm optimization augmented with a variable neighborhood search technique is used to find the optimal solution. Our approach finds the configurations of purchasing options with the optimum budget for a specified workflow application based on the required performance. The solutions from the proposed system show promising performance in terms of total cost and fitness convergence when compared with other state-of-the-art algorithms.

15.
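To optimize task execution efficiency and execution cost simultaneously, a multi-objective scheduling optimization algorithm for DAG tasks in cloud computing environments is proposed. The algorithm models the multi-objective optimization problem as a set of trade-off solutions that satisfy Pareto optimality and solves the model heuristically. To measure the quality of the multi-objective trade-off solutions, an evaluation mechanism based on the hypervolume method is designed, which yields balanced scheduling solutions between conflicting objectives. Simulation experiments with a configured cloud environment, three synthetic workflows, and two real-world scientific workflows show that, compared with single-objective algorithms and multi-objective heuristic algorithms of the same type, the algorithm produces solutions of higher quality and better balance, and better matches the resource usage characteristics and workflow scheduling patterns of real clouds.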
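Pareto optimality and the hypervolume indicator mentioned in this abstract can be made concrete with a short Python sketch for two minimized objectives, for example (makespan, cost). This is the standard two-dimensional construction, not the paper's own evaluation code; the function names and the choice of reference point are assumptions.

```python
def pareto_front(points):
    """Return the points that no other point dominates (both objectives minimized)."""
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(set(front))

def hypervolume_2d(front, ref):
    """Area dominated by the front and bounded by the reference point ref.

    A larger area means the front is closer to the ideal point and covers
    the trade-off range better.
    """
    area = 0.0
    prev_y = ref[1]
    for x, y in sorted(front):
        if x >= ref[0] or y >= prev_y:
            continue  # point adds no new area inside the reference box
        area += (ref[0] - x) * (prev_y - y)
        prev_y = y
    return area
```

For the schedules (1, 5), (2, 3), (4, 1), (3, 4) and (5, 5) with reference point (6, 6), the front keeps the first three points and the hypervolume is 5 + 8 + 4 = 17.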

16.
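Cloud computing can provide resources to user workflows on a pay-as-you-go basis. To address the cost of scheduling cloud workflow tasks in environments where resource service costs are heterogeneous, a cloud workflow task scheduling algorithm based on improved particle swarm optimization, WSA-IPSO, is proposed. Taking into account both the execution cost of tasks and the communication cost of data transfers between dependent tasks, the algorithm formalizes total cost optimization as a task scheduling model on a directed acyclic graph (DAG) and solves it with an optimization model based on improved particle swarm optimization. By improving the particle velocity update strategy and the inertia weight update strategy of traditional particle swarm optimization, the algorithm obtains a cost-minimizing schedule with faster convergence. Simulation experiments compare its performance with the MCT algorithm and standard particle swarm optimization. The results show that WSA-IPSO outperforms comparable algorithms in total cost reduction, load balance of the task distribution, and convergence.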
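The cost model described in this abstract, execution cost plus communication cost between dependent tasks, can be sketched as a fitness function in Python. The data layout and the rule that a transfer is charged only when the two tasks run on different VMs are assumptions for illustration. The search shown is a plain exhaustive enumeration for tiny inputs, standing in for the particle swarm search, which the abstract does not specify in detail.

```python
from itertools import product

def total_cost(assignment, exec_cost, edges, transfer_price):
    """Execution cost of every task plus transfer cost on cross-VM edges.

    assignment: task -> VM; exec_cost[task][vm]: cost of running task on vm;
    edges: (producer, consumer, data_size); transfer_price[(vm_a, vm_b)]:
    price per unit of data (all hypothetical input formats).
    """
    cost = sum(exec_cost[task][vm] for task, vm in assignment.items())
    for producer, consumer, data_size in edges:
        src, dst = assignment[producer], assignment[consumer]
        if src != dst:  # assumed: no charge when both tasks share a VM
            cost += data_size * transfer_price[(src, dst)]
    return cost

def cheapest_assignment(tasks, vms, exec_cost, edges, transfer_price):
    """Exhaustive search over all task-to-VM mappings (small inputs only)."""
    best = None
    for combo in product(vms, repeat=len(tasks)):
        assignment = dict(zip(tasks, combo))
        cost = total_cost(assignment, exec_cost, edges, transfer_price)
        if best is None or cost < best[1]:
            best = (assignment, cost)
    return best
```

A metaheuristic such as PSO would call `total_cost` as its fitness function and replace the enumeration, since the number of mappings grows exponentially with the number of tasks.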

17.
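Cloud computing provides a more efficient runtime environment for executing large-scale scientific workflow applications. To address cost optimization in scientific workflow scheduling in cloud environments, a coevolution-based genetic algorithm for workflow scheduling, CGAA, is proposed. The algorithm introduces an adaptive penalty function into a strictly constrained genetic algorithm and uses coevolution to adaptively adjust the crossover and mutation probabilities of individuals in the population, which accelerates convergence and prevents premature convergence of the population. Simulation experiments on four scientific workflows show that the schedules produced by CGAA outperform those of comparable algorithms in the combined ability to meet the workflow deadline constraint and reduce task execution cost.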

18.
Workflow scheduling is a key issue and remains a challenging problem in cloud computing. Faced with the large number of virtual machine (VM) types offered by cloud providers, cloud users need to choose the most appropriate VM type for each task. Multiple task scheduling sequences exist in a workflow application, and different sequences have a significant impact on scheduling performance. It is not easy to determine the most appropriate set of VM types for the tasks and the best task scheduling sequence. Besides, the idle time slots on VM instances should be fully used to increase resource utilization and reduce the execution cost of a workflow. This paper considers these three aspects simultaneously and proposes a cloud workflow scheduling approach that combines particle swarm optimization (PSO) with idle time slot-aware rules to minimize the execution cost of a workflow application under a deadline constraint. A new particle encoding is devised to represent the VM type required by each task and the scheduling sequence of tasks. An idle time slot-aware decoding procedure is proposed to decode a particle into a scheduling solution. To handle invalid task priorities caused by the randomness of PSO, a repair method is used to repair those priorities and produce valid task scheduling sequences. The proposed approach is compared with state-of-the-art cloud workflow scheduling algorithms. Experiments show that the proposed approach outperforms the comparative algorithms in terms of both execution cost and the success rate in meeting the deadline.

19.
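Mobile cloud computing can offload tasks from mobile devices to the cloud to augment device computing capability, and achieving an energy-efficient computation offloading mechanism is the main current challenge. To address this, with the goals of reducing mobile device energy consumption and application completion time, the computation offloading problem is formalized as an energy-efficiency cost minimization problem subject to task-order and deadline constraints, and a dynamic energy-efficiency-aware computation offloading algorithm is proposed. The algorithm consists of three sub-algorithms: computation offloading selection, clock frequency control, and transmission power allocation. Experimental results show that, by tuning the CPU clock frequency of the mobile device during local computation and adaptively allocating transmission power during cloud computation, the new algorithm effectively reduces the energy-efficiency cost of application execution while satisfying the constraints and improving execution efficiency.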

20.
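For the scheduling of DAG-type grid workflow tasks with strict execution time limits, and considering that a grid environment may contain several grid resources with identical performance but different availability and prices, which affects workflow task scheduling, this paper proposes a grid workflow scheduling algorithm based on the mathematical model of a finite-state continuous-time Markov process. On the critical path of the DAG, among resources whose system availability meets the confidence level required by the user, the algorithm selects the resource with relatively low execution cost. Simulation results verify the effectiveness of the algorithm.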
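The selection rule in this abstract can be illustrated with the simplest continuous-time Markov model of a resource: two states (up and down) with a constant failure rate and repair rate, whose steady-state availability is repair_rate / (failure_rate + repair_rate). The abstract only says "finite-state", so the two-state model, the tuple layout, and the function names below are assumptions rather than the paper's formulation.

```python
def steady_state_availability(failure_rate, repair_rate):
    """Long-run fraction of time a two-state (up/down) Markov resource is up."""
    return repair_rate / (failure_rate + repair_rate)

def pick_resource(resources, required_availability):
    """Cheapest resource whose availability meets the required level.

    resources: list of (name, failure_rate, repair_rate, price) tuples
    (hypothetical format). Returns the resource name, or None if no
    resource is reliable enough.
    """
    eligible = [
        r for r in resources
        if steady_state_availability(r[1], r[2]) >= required_availability
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r[3])[0]
```

With this rule, a cheap but unreliable resource is skipped for critical-path tasks, and a more expensive resource is chosen only when the required availability level rules out the cheaper ones.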
