
1.

An elasticity model for High Throughput Computing clusters

Ruben S. Montero, Rafael Moreno-Vozmediano, Ignacio M. Llorente 《Journal of Parallel and Distributed Computing》2011,71(6):750-757

Different methods have been proposed to dynamically provide scientific applications with execution environments that hide the complexity of distributed infrastructures. Recently, virtualization has emerged as a promising technology to provide such environments. In this work we present a generic cluster architecture that extends the classical benefits of virtual machines to the cluster level, thereby providing cluster consolidation, cluster partitioning and support for heterogeneous environments. Additionally, the capacity of the virtual clusters can be supplemented with resources from a commercial cloud provider. The performance of this architecture has been evaluated in the execution of High Throughput Computing (HTC) workloads. Results show that, in spite of the overhead induced by the virtualization and cloud layers, these virtual clusters constitute a feasible and well-performing HTC platform. Additionally, we propose a performance model to characterize these variable capacity (elastic) cluster environments. The model can be used to dynamically dimension the cluster using cloud resources, according to a fixed budget, or to estimate the cost of completing a given workload in a target time.

2.

Dynamic energy efficient data placement and cluster reconfiguration algorithm for MapReduce framework (Total citations: 1; self-citations: 0; citations by others: 1)

Nitesh Maheshwari, Radheshyam Nanduri, Vasudeva Varma 《Future Generation Computer Systems》2012,28(1):119-127

With the recent emergence of cloud-computing-based services on the Internet, MapReduce and distributed file systems like HDFS have emerged as the paradigm of choice for developing large-scale, data-intensive applications. Given the scale at which these applications are deployed, minimizing the power consumption of these clusters can significantly cut down operational costs and reduce their carbon footprint, thereby increasing the utility from a provider’s point of view. This paper addresses energy conservation for clusters of nodes that run MapReduce jobs. The algorithm dynamically reconfigures the cluster based on the current workload and turns cluster nodes on or off when the average cluster utilization rises above or falls below administrator-specified thresholds, respectively. We evaluate our algorithm using the GridSim toolkit and our results show that the proposed algorithm achieves an energy reduction of 33% under average workloads and up to 54% under low workloads.
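The threshold rule described in this abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the function name `reconfigure`, the default threshold values, and the choice to change the cluster by one node per control step are all assumptions made for illustration, and the data placement part of the paper is not modeled.

```python
def reconfigure(active_nodes, total_nodes, avg_utilization,
                upper=0.8, lower=0.3, min_nodes=1):
    """Return the number of active nodes after one control step.

    upper/lower are hypothetical administrator-specified thresholds
    on the average cluster utilization (a fraction between 0 and 1).
    """
    if avg_utilization > upper and active_nodes < total_nodes:
        # Utilization rose above the upper threshold: turn one node on.
        return active_nodes + 1
    if avg_utilization < lower and active_nodes > min_nodes:
        # Utilization fell below the lower threshold: turn one node off.
        return active_nodes - 1
    # Within the band (or at a capacity limit): leave the cluster unchanged.
    return active_nodes
```

Keeping a gap between the two thresholds avoids switching nodes on and off repeatedly when the utilization hovers around a single value.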

3.

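COS: Measuring the efficiency of distributed big data processing systems

Li Xiaohan (李晓涵), Chen Wenguang (陈文光) 《Frontiers of Data and Computing (数据与计算发展前沿)》2020,2(1):93-104

[Objective] Distributed computing systems are widely used in big data processing. Their scalability receives close attention, but their absolute performance is often overlooked. We aim to propose a sound and up-to-date metric for evaluating the performance of distributed systems. [Methods] This paper discusses the performance of distributed systems by comparing single-machine and distributed implementations of specific tasks, and proposes the COS (Configuration that Outperforms a Single machine) metric, which measures the amount of hardware resources a distributed system needs in order to match the performance of a single machine. We select two classic machine learning algorithms, k-means clustering and logistic regression, implement them as single-machine multithreaded programs, and optimize their performance through vectorized computation and optimized memory allocation and access, providing a reference point for the performance of distributed multi-machine systems. [Results] With Apache Spark as the system under comparison, experiments show that, whether using its native programming interface or its carefully optimized machine learning library, several times or even hundreds of times as many machines are needed to match the performance of the single-machine multithreaded implementation. [Limitations] Comparing distributed systems with single-machine implementations is not entirely fair, because the extra overhead of distributed systems is unavoidable. [Conclusions] Nevertheless, the COS metric still reflects problems of distributed systems such as poor absolute performance and failure to fully exploit the hardware.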
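As a rough illustration of the COS idea, the sketch below returns the smallest machine count at which a distributed run matches or beats a single-machine baseline. The function name `cos_metric` and the input layout (a mapping from machine count to measured runtime) are assumptions for this sketch; the paper may define the required hardware resources differently, for example in cores rather than machines.

```python
def cos_metric(single_machine_seconds, distributed_runs):
    """Smallest machine count whose runtime matches or beats one machine.

    distributed_runs maps a machine count to the measured runtime in
    seconds for the same task. Returns None if no measured
    configuration reaches the single-machine runtime.
    """
    for machines in sorted(distributed_runs):
        if distributed_runs[machines] <= single_machine_seconds:
            return machines
    return None
```

For example, `cos_metric(100.0, {1: 900.0, 4: 300.0, 16: 120.0, 64: 80.0})` scans the configurations in increasing size and stops at the first one that takes at most 100 seconds. These runtimes are invented for illustration and are not results from the paper.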

4.

Distributed shared abstractions (DSA) on multiprocessors

Clemencon C., Mukherjee B., Schwan K. 《IEEE Transactions on Software Engineering》1996,22(2):132-152

Any parallel program has abstractions that are shared by the program's multiple processes. Such shared abstractions can considerably affect the performance of parallel programs, on both distributed and shared memory multiprocessors. As a result, their implementation must be efficient, and such efficiency should be achieved without unduly compromising program portability and maintainability. The primary contribution of the DSA library is its representation of shared abstractions as objects that may be internally distributed across different nodes of a parallel machine. Such distributed shared abstractions (DSA) are encapsulated so that their implementations are easily changed while maintaining program portability across parallel architectures. The principal results presented are: a demonstration that the fragmentation of object state across different nodes of a multiprocessor machine can significantly improve program performance; and that such object fragmentation can be achieved without changing object interfaces, and therefore without compromising portability. These results are demonstrated using implementations of the DSA library on several medium-scale multiprocessors, including the BBN Butterfly, Kendall Square Research, and SGI shared memory multiprocessors. The DSA library's evaluation uses synthetic workloads and a parallel implementation of a branch and bound algorithm for solving the traveling salesperson problem (TSP).

5.

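Research on scheduling strategies for mixed multiple applications on distributed computing platforms (Total citations: 1; self-citations: 0; citations by others: 1)

Qin Deze (覃德泽) 《Application Research of Computers (计算机应用研究)》2011,28(5):1850-1853

This paper proposes and analyzes several scheduling strategies for mixed multiple applications on distributed computing platforms. The strategies target scheduling among multiple parallel applications rather than scheduling within an application; scheduling within an application uses the common fault-tolerant work-queue scheduling algorithm. Unlike knowledge-based scheduling, which depends on resource information, these strategies use knowledge-free scheduling, which is independent of resource information. This makes them simpler and easier to implement, and better suited to highly volatile distributed computing systems. Experimental scenarios are defined by varying computational intensity, resource availability, and task granularity, and the strategies are evaluated and compared across them. The experimental results show that each strategy has its own advantages and disadvantages, and that they can serve as effective strategies for evaluating parallel distributed applications in large-scale distributed computing environments.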

6.

Optimizing the execution of multiple data analysis queries on parallel and distributed environments

Andrade H., Kurc T., Sussman A., Saltz J. 《Parallel and Distributed Systems, IEEE Transactions on》2004,15(6):520-532

We investigate techniques for efficiently executing multiquery workloads from data and computation-intensive applications in parallel and/or distributed computing environments. In this context, we describe a database optimization framework that supports data and computation reuse, query scheduling, and active semantic caching to speed up the evaluation of multiquery workloads. Its most striking feature is the ability to optimize the execution of queries in the presence of application-specific constructs by employing a customizable data and computation reuse model. Furthermore, we discuss how the proposed optimization model is flexible enough to work efficiently irrespective of the underlying parallel/distributed environment. In order to evaluate the proposed optimization techniques, we present experimental evidence using real data analysis applications. For this purpose, a common implementation for the queries under study was provided according to the database optimization framework and deployed on top of three distinct experimental configurations: a shared memory multiprocessor, a cluster of workstations, and a distributed computational Grid-like environment.

7.

Affinity-aware modeling of CPU usage with communicating virtual machines

Sujesha Sudevalayam, Purushottam Kulkarni 《Journal of Systems and Software》2013

Use of virtualization in Infrastructure as a Service (IaaS) environments provides benefits to both users and providers: users can make use of resources following a pay-per-use model and negotiate performance guarantees, whereas providers can provide quick, scalable and hardware-fault tolerant service and also utilize resources efficiently and economically. With increased acceptance of virtualization-based systems, an important issue is that of virtual machine migration-enabled consolidation and dynamic resource provisioning. Effective resource provisioning can result in higher gains for users and providers alike. Most hosted applications (for example, web services) are multi-tiered and can benefit from their various tiers being hosted on different virtual machines. These mutually communicating virtual machines may get colocated on the same physical machine or placed on different machines, as part of consolidation and flexible provisioning strategies. In this work, we argue the need for network affinity-awareness in resource provisioning for virtual machines. First, we empirically quantify the change in CPU resource usage due to colocation or dispersion of communicating virtual machines for both Xen and KVM virtualization technologies. Next, we build models based on these empirical measurements to predict the change in CPU utilization when transitioning between colocated and dispersed placements. Because the modeling process is independent of virtualization technology and specific applications, the resultant model is generic and application-agnostic. Via extensive experimentation, we evaluate the applicability of our models for synthetic and benchmark application workloads. We find that the models have high prediction accuracy: the maximum prediction error is within 2% absolute CPU usage.

8.

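A multi-task group scheduling strategy and its implementation on the Storm platform

Wang Zhonghua (王中华), Chai Xiaoli (柴小丽) 《Computer Systems & Applications (计算机系统应用)》2021,30(2):250-254

With the rapid development of big data and artificial intelligence technologies, high-performance, real-time stream computing systems are gradually replacing traditional batch computing systems based on data warehouses. Apache Storm, an open-source, highly fault-tolerant, real-time distributed platform for big data stream computing, supports several task assignment schemes, such as even task distribution and assignment of tasks to a designated single machine. When a topology contains multiple tasks and only some machines in the cluster can execute a given task, the traditional scheduling method can only assign a single task to a single designated machine, so the resources of the cluster are not fully used. The proposed approach adjusts the scheduling strategy: it obtains the queue of machines that satisfy the condition, checks the available worker nodes in that queue, and distributes the designated task evenly across those worker nodes, while the other tasks are still assigned to the remaining machines in the cluster by the default strategy. This realizes a group scheduling strategy for multiple tasks.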
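The grouping step described in this abstract (select the machines that can run a designated task, collect their free worker slots, and spread the task's executors evenly over them) can be sketched as follows. This is a plain Python illustration and does not use Storm's scheduler interface; the function name `group_schedule`, the `tags` and `free_slots` fields, and the round-robin rule are all assumptions.

```python
from collections import defaultdict


def group_schedule(executors, machines, required_tag):
    """Spread a designated task's executors over eligible worker slots.

    machines maps a machine name to a dict with hypothetical fields
    "tags" (set of capabilities) and "free_slots" (list of slot ids).
    """
    # Step 1: keep only free slots on machines able to run this task.
    slots = [
        slot
        for name in sorted(machines)
        if required_tag in machines[name]["tags"]
        for slot in machines[name]["free_slots"]
    ]
    if not slots:
        raise ValueError("no free worker slot on any eligible machine")

    # Step 2: assign executors to those slots in round-robin order,
    # so slot loads differ by at most one executor.
    assignment = defaultdict(list)
    for index, executor in enumerate(executors):
        assignment[slots[index % len(slots)]].append(executor)
    return dict(assignment)
```

Executors of tasks without a placement constraint would be left to the default scheduler, which this sketch does not model.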

9.

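Hybrid auto-scaling of microservices based on load prediction

Song Chenghao (宋程豪), Jiang Lingyun (江凌云) 《Application Research of Computers (计算机应用研究)》2022,39(8)

Because the computing capacity of an edge cloud does not exceed that of a central cloud, handling dynamic loads easily leads to pointless scaling jitter or insufficient resource capacity. This paper therefore experimentally evaluates microservice applications in a real edge cloud environment using two synthetic and two real workloads, and proposes a hybrid auto-scaling method based on load prediction (predictively horizontal and vertical pod autoscaling, Pre-HVPA). The method first uses machine learning to predict the features of the load data and obtain the final load prediction. It then uses the predicted load to perform hybrid horizontal and vertical auto-scaling. Simulation results show that auto-scaling based on this method reduces scaling jitter and the number of containers used, which makes it suitable for microservice applications in edge cloud environments.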

10.

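Research on a machine learning-based load prediction method for service computing environments

Wang Jun (王俊), Zheng Di (郑笛), Wu Quanyuan (吴泉源), Guan Yan'an (官延安) 《Computer Science (计算机科学)》2007,34(9):269-272

With the rapid development of computer technology, the scale of distributed applications is growing quickly, and more and more software systems are adopting a service-oriented architecture (SOA). An effective way to improve the reliability and scalability of SOA is to provide service replicas and to balance load among them through a middleware-based load balancing service. By using middleware, we can meet the performance, scalability, and availability requirements of current service-oriented applications. However, the load calculation must be predictive to some degree in order to avoid the effects of load peaks. For complex service-oriented applications, a load peak means that the system may experience extremely high load for a short time while the load remains fairly steady most of the time; because load sampling is delayed, the system becomes overloaded, response time increases, and overall throughput suffers. Therefore, to reduce response time and to use service replicas effectively even when the load fluctuates frequently, we propose and implement a machine learning-based prediction mechanism, built on middleware, that meets the need for an adaptive and flexible load balancing mechanism.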

11.

Adaptive parallel job scheduling with flexible coscheduling

Frachtenberg E., Feitelson G., Petrini F., Fernandez J. 《Parallel and Distributed Systems, IEEE Transactions on》2005,16(11):1066-1077

Many scientific and high-performance computing applications consist of multiple processes running on different processors that communicate frequently. Because of their synchronization needs, these applications can suffer severe performance penalties if their processes are not all coscheduled to run together. Two common approaches to coscheduling jobs are batch scheduling, wherein nodes are dedicated for the duration of the run, and gang scheduling, wherein time slicing is coordinated across processors. Both work well when jobs are load-balanced and make use of the entire parallel machine. However, these conditions are rarely met and most realistic workloads consequently suffer from both internal and external fragmentation, in which resources and processors are left idle because jobs cannot be packed with perfect efficiency. This situation leads to reduced utilization and suboptimal performance. Flexible coscheduling (FCS) addresses this problem by monitoring each job's computation granularity and communication pattern and scheduling jobs based on their synchronization and load-balancing requirements. In particular, jobs that do not require stringent synchronization are identified, and are not coscheduled; instead, these processes are used to reduce fragmentation. FCS has been fully implemented on top of the STORM resource manager on a 256-processor Alpha cluster and compared to batch, gang, and implicit coscheduling algorithms. This paper describes in detail the implementation of FCS and its performance evaluation with a variety of workloads, including large-scale benchmarks, scientific applications, and dynamic workloads. The experimental results show that FCS saturates at higher loads than other algorithms (up to 54 percent higher in some cases), and displays lower response times and slowdown than the other algorithms in nearly all scenarios.

12.

Performance Tradeoffs in Scheduling Techniques for Mixed Workloads

Golubchik Leana, Lui John C.S., de Souza e Silva Edmundo, Gail H. Richard 《Multimedia Tools and Applications》2003,21(2):147-172

Many modern multimedia applications require the retrieval of different classes of data with drastically different characteristics. For instance, digital-library-type systems must be designed to deliver not only text files and still images, but voice and video as well. These applications can benefit from the sharing of resources such as disk and network bandwidth, instead of the conservative approach of partitioning the resources according to the characteristics of each type of data being retrieved. Continuous and non-continuous media applications require different performance metrics to be achieved, so that the necessary quality of service (QoS) is satisfied. As a consequence, proper resource management is a major issue in providing complete sharing of resources while still reaching the QoS goals. This work focuses on multimedia storage systems that are capable of serving a mixture of continuous and non-continuous workloads. Our main objective is to expose and investigate the tradeoffs involved in managing the system resources, in particular, I/O bandwidth. The performance metrics of interest are the mean and variance of response time for non-continuous media requests and the probability of missing an imposed deadline for continuous media workloads. Different scheduling algorithms are considered and tradeoffs to achieve performance goals are studied, including those involving buffer sizing.

13.

Scheduling Policies for Processor Coallocation in Multicluster Systems (Total citations: 1; self-citations: 0; citations by others: 1)

Bucur A.I.D., Epema D.H.J. 《Parallel and Distributed Systems, IEEE Transactions on》2007,18(7):958-972

Building multicluster systems out of multiple, geographically distributed clusters interconnected by high-speed wide-area networks can provide access to a larger computational power and to a wider range of resources. Jobs running on multiclusters and, more generally, in grids, may require (processor) coallocation, i.e., the simultaneous allocation of resources (processors) in different clusters or subsystems of a grid. In this paper, we propose four scheduling policies for processor coallocation in multiclusters, and we assess with simulations their performance under a wide variety of parameter settings. In particular, in our simulations we use synthetic workloads and workloads derived from the logs of actual systems and from runtime measurements. We conclude that although coallocation makes scheduling more difficult and the wide-area communication critically impacts the performance, there is a wide range of realistic applications that may benefit from coallocation. However, unrestricted coallocation is not recommended: limiting the total job size, or the number or sizes of job components, improves performance.

14.

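An adaptive management method for cloud computing resources based on container technology

Shu An (树岸), Peng Xin (彭鑫), Zhao Wenyun (赵文耘) 《Computer Science (计算机科学)》2017,44(7):120-127

The development of cloud computing has led more and more software applications to choose cloud platforms for deployment. To cope with dynamically changing workloads, application scenarios, and quality-of-service goals, application providers want to adjust cloud computing resources dynamically in a scalable way. Resource management based on virtual machines is heavyweight, which makes fine-grained dynamic resource adjustment and fast cross-platform service migration in hybrid clouds difficult. Container technology makes up for some of the shortcomings of virtual machines, but traditional resource management methods are in many respects not well suited to containers. To address this problem, an adaptive management method for cloud computing resources based on container technology is proposed, with a resource architecture better suited to containers and a scheduling scheme among resources. Unlike traditional linear modeling, the proposed method uses nonlinear functions to model cloud computing resources more accurately and uses a genetic algorithm for parameter tuning, so that adaptive adjustment responds faster and overall performance is better. The method also takes into account the multidimensional heterogeneity of different containers and places containers sensibly to improve physical resource utilization. In addition, it incorporates several low-level characteristics of container technology and makes adaptive adjustments in areas such as load distribution. Finally, experimental analysis provides preliminary confirmation of the method's effectiveness.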

15.

An analysis of definition and placement of virtual machines for high performance applications on Clouds

Giacomo Mc Evoy, Antonio R. Mury, Bruno Schulze 《Concurrency and Computation》2015,27(7):1789-1814

16.

Active semantic caching to optimize multidimensional data analysis in parallel and distributed environments

《Parallel Computing》2007,33(7-8):497-520

In this paper, we present a multi-query optimization framework based on the concept of active semantic caching. The framework permits the identification and transparent reuse of data and computation in the presence of multiple queries (or query batches) that specify user-defined operators and aggregations originating from scientific data-analysis applications. We show how query scheduling techniques, coupled with intelligent cache replacement policies, can further improve the performance of query processing by leveraging the active semantic caching operators. We also propose a methodology for functionally decomposing complex queries in terms of primitives so that multiple reuse sites are exposed to the query optimizer, to increase the amount of reuse. The optimization framework and the database system implemented with it are designed to be efficient irrespective of the underlying parallel and/or distributed machine configuration. We present experimental results highlighting the performance improvements obtained by our methods using real scientific data-analysis applications on multiple parallel and distributed processing configurations (e.g., single symmetric multiprocessor (SMP) machine, cluster of SMP nodes, and a Grid computing configuration).

17.

Simulating peer-to-peer cloud resource scheduling

D. Cenk Erdil 《Peer-to-Peer Networking and Applications》2012,5(3):219-230

Resource scheduling in large-scale distributed systems, such as grids and clouds, is difficult due to the size, dynamism, and volatility of resources. These resources are eclectic and autonomous, and may exhibit different usage policies, levels of participation, capabilities, local load, and reliability. Moreover, applications are likely to exhibit various patterns and levels, and distributed resources may organize into various different overlay topologies for information and query dissemination. Researchers have proposed a wide variety of approaches and policies for mapping offered load onto resources and for solving the various component parts of the scheduling problem. However, production clouds and grids may be underutilized, and may not exhibit the load to effectively characterize all of the scheduling system inputs. The composition of large-scale systems is also changing, potentially to include more individual and peer-to-peer resources. These factors will influence the effectiveness of proposed scheduling solutions. Therefore, a simulation environment is necessary to study different approaches under different scenarios, especially those that are expected, but that are not currently characteristic of existing systems. This article describes a general-purpose peer-to-peer simulation environment that allows a wide variety of parameters, protocols, strategies and policies to be varied and studied. To provide a proof of concept, utilization of the simulation environment is presented in a large-scale distributed system problem that includes a core model and related mechanisms. In particular, this article presents a definition and possible peer-to-peer solutions for the large-scale scheduling problem. Moreover, this article describes a general simulation model, some policies that can be varied, an implementation, and some sample results.

18.

Elastic management of web server clusters on distributed virtual infrastructures

Rafael Moreno‐Vozmediano, Ruben S. Montero, Ignacio M. Llorente 《Concurrency and Computation》2011,23(13):1474-1490

Cluster‐based solutions are being widely adopted for implementing flexible, scalable, low‐cost and high‐performance web server platforms. One of the main difficulties in implementing these platforms is the correct dimensioning of the cluster size, so as to satisfy variable and peak demand periods. In this context, virtualization is being adopted by many organizations as a solution not only to provide service elasticity, but also to consolidate server workloads and improve server utilization rates. A virtualized web server can be dynamically adapted to the client demands by deploying new virtual nodes when the demand increases, and powering off and consolidating virtual nodes during periods of low demand. Furthermore, the resources from the in‐house infrastructure can be complemented with a cloud provider (cloud bursting), so that peak demand periods can be satisfied by deploying cluster nodes in the external cloud, on an on‐demand basis. In this paper, we analyze the scalability of hybrid virtual infrastructures for two different distributed web server cluster implementations: a simple web cluster serving static files and a multi‐tier web server platform running the CloudStone benchmark. Copyright © 2011 John Wiley & Sons, Ltd.

19.

SWSL: a synthetic workload specification language for real-time systems

Kiskis D.L., Shin K.G. 《IEEE Transactions on Software Engineering》1994,20(10):798-811

We discuss the issues that must be addressed in the specification and generation of synthetic workloads for distributed real-time systems. We describe a synthetic workload specification language (SWSL) that defines a workload in a form that can be compiled by a synthetic workload generator (SWG) to produce an executable synthetic workload. The synthetic workload is then downloaded to the target machine and executed while performance and dependability measurements are made. SWSL defines the workload at the task level using a data flow graph, and at the operation level using control constructs and synthetic operations taken from a library. It is intended to be easy to use, flexible, and capable of creating synthetic workloads that are representative of real-time workloads. It provides a compact, parameterized notation. It supports automatic replication of objects to facilitate the specification of large workloads for distributed real-time systems. It also provides extensive support for the experimentation process.

20.

Dynamic balancing of communication and computation load for HLA-based simulations on large-scale distributed systems (Total citations: 1; self-citations: 0; citations by others: 1)

Robson E. De Grande, Azzedine Boukerche 《Journal of Parallel and Distributed Computing》2011,71(1):40-52

Dynamic balancing of computation and communication load is vital for the execution stability and performance of distributed, parallel simulations deployed on the shared, unreliable resources of large-scale environments. High Level Architecture (HLA) based simulations can experience a decrease in performance due to imbalances that are produced initially and/or during run time. These imbalances are generated by the dynamic load changes of distributed simulations or by unknown, non-managed background processes resulting from the non-dedication of shared resources. Due to the dynamic execution characteristics of elements that compose distributed applications, the computational load and interaction dependencies of each simulation entity change during run time. These dynamic changes lead to an irregular load and communication distribution, which increases overhead of resources and latencies. A static partitioning of load is limited to deterministic applications and is incapable of predicting the dynamic changes caused by distributed applications or by external background processes. Therefore, a scheme for balancing the communication and computational load during the execution of distributed simulations is devised in a scalable hierarchical architecture. The proposed balancing system employs local and cluster monitoring mechanisms to observe the distributed load changes and identify imbalances, and repartitioning policies to determine a distribution of load and minimize imbalances. A migration technique is also employed by this proposed balancing system to perform reliable and low-latency load transfers. Such a system successfully improves the use of shared resources and increases distributed simulations’ performance by minimizing communication latencies and partitioning the load evenly. Experiments and comparative analyses were conducted in order to identify the gains that the proposed balancing scheme provides to large-scale distributed simulations.