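In heterogeneous Hadoop clusters, MapReduce jobs run inefficiently because erasure-coded and replicated storage are used together and because server nodes differ in real-time computing power. To mitigate this, this paper implements a scheduling strategy that dynamically adjusts MapReduce task assignment under multi-job concurrency according to where data is stored and the real-time load of each node. The strategy modifies the data placement policy of the current Hadoop framework and dynamically controls per-node task concurrency, which gives a more balanced allocation of resources among jobs when several jobs run concurrently. Experimental results show that, compared with Hadoop's two default job scheduling strategies, the proposed scheduling mode shortens job completion time by about 17% and effectively prevents the starvation that some jobs would otherwise face.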

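Applying an equal data distribution method in a heterogeneous Hadoop environment severely degrades MapReduce performance; to address this, a proportional data distribution strategy is proposed. By computing the computing ratio of each node in the heterogeneous cluster, the already-split data blocks are recombined into several blocks partitioned by proportion. Each node selects the blocks it is assigned and stores according to its own performance, so that nodes in the heterogeneous Hadoop cluster take roughly the same time to process their data and less data has to move between nodes. Experiments verify that the proposed proportional data distribution method effectively improves MapReduce performance and balances the data load.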
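The proportional split described in this abstract can be illustrated with a small sketch. It is not the paper's algorithm: the function name `proportional_block_counts`, the integer "speed" scores, and the largest-remainder rounding are all assumptions chosen for illustration.

```python
from fractions import Fraction


def proportional_block_counts(total_blocks, node_speeds):
    """Split total_blocks among nodes in proportion to integer speed scores.

    node_speeds maps a (hypothetical) node name to a positive integer score.
    Largest-remainder rounding keeps the counts summing to total_blocks.
    """
    total_speed = sum(node_speeds.values())
    # Exact (fractional) share of blocks for each node.
    exact = {n: Fraction(total_blocks * s, total_speed)
             for n, s in node_speeds.items()}
    # Start from the integer part of each share.
    counts = {n: int(q) for n, q in exact.items()}
    leftover = total_blocks - sum(counts.values())
    # Give the remaining blocks to the nodes with the largest fractional part.
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for n in by_remainder[:leftover]:
        counts[n] += 1
    return counts


if __name__ == "__main__":
    print(proportional_block_counts(12, {"slow": 1, "medium": 2, "fast": 3}))
```

With these assumed scores, a node three times as fast receives three times as many blocks, so all nodes would finish at roughly the same time if processing time scaled linearly with block count.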

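In large-scale cloud storage systems, storage nodes fail frequently because of disk or network faults, so the system needs data redundancy techniques to guarantee data reliability and availability. Compared with replication, erasure coding greatly improves storage space utilization, but its cost for repairing redundant data is much higher. Most current research on erasure-code repair treats all storage nodes identically; in real distributed storage systems, however, nodes usually differ in bandwidth, computing, and storage capacity resources, and this heterogeneity strongly affects repair performance. This paper identifies the key factors that affect repair performance and selects bandwidth overhead, disk access overhead, repair time, number of nodes participating in repair, and repair cost as the evaluation criteria; analyzes how existing methods reduce these five costs, with emphasis on their strengths and weaknesses; describes the current state of research on erasure-code repair in heterogeneous distributed storage systems; and finally points out open problems in erasure-code data repair and possible future directions.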

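Distributed storage systems, as the carriers of data storage, are widely used in big data. Compared with replication, erasure-coded storage offers higher space efficiency while still guaranteeing reliability, so it is increasingly adopted in storage systems. In EB-scale erasure-coded distributed storage systems, metadata management is costly, and the efficiency of querying metadata such as location information affects I/O latency and throughput. Centralized data placement algorithms based on recorded location information must access the metadata server frequently, which limits performance optimization, so decentralized placement algorithms based on hash mapping are increasingly used. However, during node changes and data recovery, decentralized placement algorithms for erasure codes suffer from difficult location changes, large volumes of migrated data, and low concurrency of data recovery and migration. This paper proposes a stripe-based consistent hash data placement algorithm (consistent Hash data placement algorithm based on stripe, SCHash). SCHash places data in units of stripes: by converting the mapping from data blocks to nodes into a mapping from stripes to node groups, it reduces the amount of data migrated when nodes change, which lowers the proportion of data that changes during recovery and increases recovery bandwidth. Building on SCHash, a stripe-based concurrent I/O scheduling recovery strategy is designed: it raises I/O parallelism by avoiding I/O on data blocks that reside on the same node, and it shortens data recovery time by scheduling the execution order of recovery I/O and migration I/O. Compared with the APHash data placement algorithm, SCHash reduces migrated data by 46.71%~85.28% during data recovery. Recovery bandwidth improves by 48.16% for rebuilding within the stripe and by 138.44% for rebuilding on nodes outside the stripe.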
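The idea of mapping whole stripes to node groups, rather than individual blocks to nodes, can be sketched with an ordinary consistent hash ring. This is not SCHash itself, whose details the abstract does not give: the class `StripeRing`, the virtual-node count, and the use of SHA-256 are assumptions for illustration only.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class StripeRing:
    """Consistent hash ring whose points are node groups, not single nodes."""

    def __init__(self, groups, vnodes=64):
        # groups: dict of group name -> list of node names (one per stripe block).
        self.groups = groups
        # Each group contributes `vnodes` points to smooth the distribution.
        self._ring = sorted((_hash(f"{g}#{i}"), g)
                            for g in groups for i in range(vnodes))
        self._keys = [k for k, _ in self._ring]

    def locate(self, stripe_id):
        """Return (group name, node list); block i of the stripe goes to node i."""
        idx = bisect.bisect_right(self._keys, _hash(str(stripe_id))) % len(self._ring)
        group = self._ring[idx][1]
        return group, self.groups[group]


if __name__ == "__main__":
    groups = {f"g{i}": [f"g{i}-n{j}" for j in range(9)] for i in range(4)}
    ring = StripeRing(groups)
    print(ring.locate(42))
```

Because every block of a stripe lands in the same group on distinct nodes, adding a group only moves the stripes whose ring successor becomes the new group; all other stripes stay where they were.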

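On the Hadoop platform, data load balancing has a profound effect on platform performance. This paper first analyzes the limitations of the default data load balancing. Because the existing default HDFS (Hadoop Distributed File System) data load balancing algorithm considers only storage space utilization and ignores heterogeneity among nodes, a mathematical model that quantifies data load balance in heterogeneous clusters is proposed. The model computes a theoretical space utilization for each node from the node's storage space and performance, and dynamically adjusts each node's maximum load according to the current storage space utilization of the cluster. Experimental results show that the proposed strategy brings a heterogeneous cluster to a more reasonable balanced state, improves cluster efficiency, and effectively reduces job execution time.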

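To address the problems of early cloud computing models for storing small medical files, namely the single-node problem, data inconsistency caused by high data redundancy, and low retrieval efficiency, a new cloud storage model is proposed. The model introduces the BWFS algorithm to optimize the serialized merging of massive numbers of small medical files, optimizes the erasure coding algorithm used for data block encoding to reduce redundant storage of data blocks, and combines bitmap indexing with HBase indexing into a new parallel indexing strategy that remedies the shortcomings of the HBase primary index. Experiments show that, by using the BWFS algorithm and erasure coding, the new storage model reduces memory consumption on the cluster master node and reduces redundant storage of cluster data while still ensuring fast data recovery, and that the parallel indexing technique improves retrieval efficiency for medical image data.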

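Hadoop has become a basic platform for cloud computing research, and MapReduce is its computing model for distributed processing of big data. To address problems of data distribution, data locality, and job execution flow for MapReduce in heterogeneous clusters, a DAG-based MapReduce scheduling algorithm is proposed. Nodes in the cluster are partitioned by computing capacity, MapReduce jobs are converted into a DAG model, and the computation of the upward rank value is improved so that it is more accurate in heterogeneous clusters and yields a more reasonable task priority ordering. Taking node computing capacity, data locality, and cluster utilization together, the algorithm selects suitable data nodes to assign and execute tasks, reducing the completion time of the current task. Experiments show that the algorithm distributes data reasonably, effectively improves data locality, reduces communication overhead, and shortens the schedule length of the whole job set, thereby improving cluster utilization.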
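For readers unfamiliar with the upward rank mentioned in this abstract, the sketch below computes the classic definition used in list scheduling on DAGs: a task's rank is its average cost plus the largest (communication cost + rank) over its successors. The paper's improved formula is not given in the abstract, so this shows only the unmodified baseline; the task names and costs are made up.

```python
from functools import lru_cache


def upward_ranks(avg_cost, succ, comm):
    """Classic upward rank: rank(t) = cost(t) + max over successors s of
    (comm(t, s) + rank(s)); exit tasks have rank equal to their own cost.

    avg_cost: task -> average execution cost across nodes
    succ:     task -> list of successor tasks in the DAG
    comm:     (task, successor) -> average communication cost
    """
    @lru_cache(maxsize=None)
    def rank(task):
        tail = max((comm.get((task, s), 0) + rank(s) for s in succ.get(task, ())),
                   default=0)
        return avg_cost[task] + tail

    return {t: rank(t) for t in avg_cost}


if __name__ == "__main__":
    # Hypothetical two-map, one-reduce job.
    cost = {"map1": 4, "map2": 6, "reduce": 5}
    succ = {"map1": ["reduce"], "map2": ["reduce"]}
    comm = {("map1", "reduce"): 2, ("map2", "reduce"): 1}
    ranks = upward_ranks(cost, succ, comm)
    # Tasks are scheduled in decreasing rank order.
    print(sorted(ranks, key=ranks.get, reverse=True))
```

Sorting by decreasing rank always places a task before its successors, which is why the rank can serve directly as a scheduling priority.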

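To address the low scheduling performance caused by the mismatched distribution of computing and data resources in heterogeneous Hadoop clusters, a parallel job scheduling algorithm based on a Hadoop cluster and an improved LATE algorithm is designed. The paper first introduces the scheduling principles of the Hadoop framework and the Map-Reduce model. Then, building on the classic LATE scheduling algorithm, it stores and updates the proportion of execution time taken by each stage of Map and Reduce tasks. To further improve scheduling efficiency, slow tasks are migrated to data-local nodes or to physical nodes close to the data, and the job scheduling flow based on the improved LATE algorithm is given. To validate the method, it was tested on a Hadoop cluster configured with 1 JobTracker master node and 7 TaskTracker nodes. Experimental results show that the method achieves job scheduling on heterogeneous clusters and, compared with other methods, has lower prediction error and higher scheduling efficiency.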
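The core estimate in the classic LATE heuristic, on which this abstract builds, can be sketched as follows: a task's progress rate is its progress score divided by elapsed time, and its estimated time left is the remaining progress divided by that rate. The sketch omits LATE's slow-task threshold and speculation cap, and it does not include the stage-ratio updates this paper adds; the function names are illustrative.

```python
import math


def estimated_time_left(progress, elapsed):
    """Estimate remaining time from a progress score in [0, 1] and elapsed seconds.

    Equivalent to (1 - progress) / (progress / elapsed).
    """
    if progress <= 0.0:
        return math.inf  # no progress yet, so no finite estimate
    return elapsed * (1.0 - progress) / progress


def pick_speculation_candidate(tasks):
    """Return the running task with the longest estimated time left.

    tasks: task id -> (progress score, elapsed seconds). Finished tasks are skipped.
    """
    running = {t: v for t, v in tasks.items() if v[0] < 1.0}
    if not running:
        return None
    return max(running, key=lambda t: estimated_time_left(*running[t]))


if __name__ == "__main__":
    tasks = {"t1": (0.9, 100.0), "t2": (0.2, 40.0), "t3": (1.0, 50.0)}
    print(pick_speculation_candidate(tasks))
```

Ranking by time left rather than by raw progress is what lets the heuristic favor the task whose backup copy would shorten the job the most.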

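Existing erasure codes assume that all storage nodes have the same availability, whereas real systems often consist of heterogeneous storage nodes with different availabilities, so the system's actual availability deviates from the designed optimum. To reduce this gap, storage nodes are treated as heterogeneous, and an availability analysis framework for heterogeneous storage nodes is proposed together with an optimized erasure code deployment method. Experiments show that the availability of the proposed erasure code deployment method differs little from the actual availability of the system, and that its performance is clearly better than existing related work.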
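To make the effect of heterogeneous availability concrete, the sketch below computes the probability that at least k of n independent nodes are up when each node has its own availability. It is a generic calculation under an independence assumption, not the analysis framework of the paper; the function name and the example numbers are illustrative.

```python
def k_of_n_availability(node_avail, k):
    """Probability that at least k nodes are available.

    node_avail: per-node availability probabilities (nodes assumed independent).
    A (k, n) erasure-coded object is readable when any k of its n blocks are.
    """
    # dist[j] = probability that exactly j of the nodes seen so far are up.
    dist = [1.0]
    for p in node_avail:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)   # this node is down
            nxt[j + 1] += q * p       # this node is up
        dist = nxt
    return sum(dist[k:])


if __name__ == "__main__":
    uniform = k_of_n_availability([0.9, 0.9, 0.9], 2)
    mixed = k_of_n_availability([0.99, 0.9, 0.81], 2)
    print(uniform, mixed)
```

Placing the same code on nodes with different availabilities changes the result even when the average availability is the same, which is the gap between designed and actual availability that the abstract refers to.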

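Erasure-coded storage clusters have become a typical fault-tolerant storage solution for large-scale data centers. Research on erasure-coded storage mainly follows two directions: new codes and optimization of the access process. This work optimizes the reliability and energy efficiency of erasure-coded storage clusters from the access-process perspective. Specifically, it builds an elastic I/O scheduling strategy matched to the system's running state: when nodes run normally, some nodes are switched to a sleep state to lower the current power of the storage system while user performance is preserved; when a node fails, the lost data are reconstructed at high speed in a pipeline, improving system reliability by minimizing data recovery time. An energy-saving normal-mode scheme, ECS2, and a recovery-accelerating scheme, Pipe-Rec, are designed, and a prototype is implemented in a Reed-Solomon-coded storage cluster (with coding parameters k=6, r=3). Energy comparison tests show that ECS2 saves 29.8% and 28% of energy under read-intensive and write-intensive workloads, respectively; reconstruction comparison tests show that the reconstruction performance of Pipe-Rec is 5.76 times that of the traditional synchronous reconstruction scheme.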

The average PC now contains a large and increasing amount of storage with an ever greater amount left unused. We believe there is an opportunity for organizations to harness the vast unused storage capacity on their PCs to create a very large, low-cost, shared storage system. What is needed is the proper storage system architecture and software to exploit and manage the unused portions of existing PC storage devices across an organization and make it reliably accessible to users and applications. We call our vision of such a storage system Storage@desk (SD). This paper describes our first step towards the realization of SD: a study of machine and storage characteristics and usage in a model organization. We studied 729 PCs in an academic institution for 91 days, monitoring the configuration, load and usage of the major machine subsystems, i.e. disk, memory, CPU and network. To further analyze the availability characteristics of storage in an SD system, we performed a trace-driven simulation of some basic storage allocation strategies. This paper presents the results of our data collection efforts, our analysis of the data, our simulation results and our conclusion that an SD system is indeed feasible and holds promise as a cost-effective way to create massive storage systems. Copyright © 2007 John Wiley & Sons, Ltd.

操顺德, 华宇, 冯丹, 孙园园, 左鹏飞. 《软件学报》 (Journal of Software), 2017, 28(8): 1999-2009
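Based on an analysis of the characteristics of video surveillance data and of traditional storage solutions, a high-performance distributed storage system is proposed. Unlike traditional file-based storage, it uses a logical volume structure: unstructured video stream data are organized in this structure and written directly to RAW disk devices, which solves the storage performance degradation caused by random disk reads and writes and by disk fragmentation in traditional solutions. The scheme organizes metadata into a two-level index, managed by a state manager and by the storage servers respectively, which greatly reduces the amount of metadata the state manager must manage, removes the performance bottleneck, and provides retrieval precision down to the second. In addition, the flexible storage server grouping strategy and the mutual backup relationship within groups give the storage system fault tolerance and linear scalability. System tests show that, on low-cost PC servers, a single server can record 400 streams of 1080P video simultaneously, with a write speed 2.5 times that of the local file system.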

This paper describes the design criteria and implementation details of a dynamic storage allocator for real-time systems. The main requirements that have to be considered when designing a new allocator are concerned with temporal and spatial constraints. The proposed algorithm, called TLSF (two-level segregated fit), has an asymptotic constant cost, O(1), maintaining a fast response time (less than 200 processor instructions on an x86 processor) and a low level of memory usage (low fragmentation). TLSF uses two levels of segregated lists to arrange free memory blocks and an incomplete search policy. This policy is implemented with word-size bitmaps and logical processor instructions. Therefore, TLSF can be categorized as a good-fit allocator. The incomplete search policy is shown also to be a good policy in terms of fragmentation. The fragmentation caused by TLSF is slightly smaller (better) than that caused by best fit (which is one of the best allocators regarding memory fragmentation). In order to evaluate the proposed allocator, three analyses are presented in this paper. The first one is based on worst-case scenarios. The second one provides a detailed consideration of the execution cost of the internal operations of the allocator and its fragmentation. The third analysis is a comparison with other well-known allocators from the temporal (number of cycles and processor instructions) and spatial (fragmentation) points of view. In order to compare them, a task model has been presented. Copyright © 2007 John Wiley & Sons, Ltd.
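The two-level indexing and bitmap search that this abstract mentions can be sketched in a few lines. The sketch assumes the commonly described scheme in which the first level is the position of the most significant set bit of the size and the second level is the next few bits below it; the function names and the default of 16 second-level lists are assumptions, and real allocators treat very small sizes separately.

```python
def tlsf_mapping(size, sl_log2=4):
    """Map a block size to (first-level, second-level) free-list indices.

    sl_log2 is the log2 of the number of second-level lists per first level.
    Sizes below 2**sl_log2 are rejected in this simplified version.
    """
    if size < (1 << sl_log2):
        raise ValueError("size too small for this simplified mapping")
    fl = size.bit_length() - 1                      # most significant set bit
    sl = (size >> (fl - sl_log2)) - (1 << sl_log2)  # next sl_log2 bits below it
    return fl, sl


def first_set_at_or_above(bitmap, start):
    """Return the index of the lowest set bit >= start, or None if none exists.

    Hardware would use a find-first-set instruction; this uses integer tricks.
    """
    masked = (bitmap >> start) << start  # clear all bits below `start`
    if masked == 0:
        return None
    return (masked & -masked).bit_length() - 1


if __name__ == "__main__":
    print(tlsf_mapping(460))                 # indices for a 460-byte request
    print(first_set_at_or_above(0b10100, 3)) # next non-empty list at or above 3
```

Both steps take a fixed number of bit operations regardless of how many free blocks exist, which is where the constant-time bound comes from.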

The control of battery energy storage systems (BESSs) plays an important role in the management of microgrids. In this paper, the problem of balancing the state-of-charge (SoC) of the networked battery units in a BESS while meeting the total charging/discharging power requirement is formulated and solved as a distributed control problem. Conditions on the communication topology among the battery units are established under which a control law is designed for each battery unit to solve the control problem based on distributed average reference power estimators and distributed average unit state estimators. Two types of estimators are proposed. One achieves asymptotic estimation and the other achieves finite time estimation. We show that, under the proposed control laws, SoC balancing of all battery units is achieved and the total charging/discharging power of the BESS tracks the desired power. A simulation example is shown to verify the theoretical results.

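With the development of modern information technology and the economy, information platforms of all kinds keep growing in scale, bringing ever higher requirements for data storage capacity, speed, and security. The development of information technology has also driven the development of network storage technology, allowing it to break through its bottlenecks and greatly improving storage performance. NAS is currently the most widely used network storage technology; with its economy, efficiency, and reliability, it is widely applied in machine-room network storage solutions. However, because the technology is still in its early stages, related research remains insufficient. This paper presents a systematic analysis and study of the topic, in the hope of helping to promote the application and development of network storage technology.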

Threshold-Based Dynamic Replication in Large-Scale Video-on-Demand Systems
Recent advances in high speed networking technologies and video compression techniques have made Video-on-Demand (VOD) services feasible. A large-scale VOD system imposes a large demand on I/O bandwidth and storage resources, and therefore, parallel disks are typically used for providing VOD service. Although striping of movie data across a large number of disks can balance the utilization among these disks, such a striping technique can exhibit additional complexity, for instance, in data management, such as synchronization among disks during data delivery, as well as in supporting fault tolerant behavior. Therefore, it is more practical to limit the extent of data striping, for example, by arranging the disks in groups (or nodes) and then allowing intra-group (or intra-node) data striping only. With multiple striping groups, however, we may need to assign a movie to multiple nodes so as to satisfy the total demand of requests for that movie. Such an approach gives rise to several design issues, including: (1) what is the right number of copies of each movie we need so as to satisfy the demand and at the same time not waste storage capacity, (2) how to assign these movies to different nodes in the system, and (3) what are efficient approaches to altering the number of copies of each movie (and their placement) when the need for that arises. In this paper, we study an approach to dynamically reconfiguring the VOD system so as to alter the number of copies of each movie maintained on the server as the access demand for these movies fluctuates. We propose various approaches to addressing the above stated issues, which result in a VOD design that is adaptive to the changes in data access patterns. Performance evaluation is carried out to quantify the costs and the performance gains of these techniques.

Data grids are middleware systems that offer secure shared storage of massive scientific datasets over wide area networks. The main challenge in their design is to provide reliable storage, search, and transfer of numerous or large files over geographically dispersed heterogeneous platforms. The Storage Resource Broker (SRB) is an example of a system that provides these services and that has been deployed in multiple high-performance scientific projects during the past few years. In this paper, we take a detailed look at several of its functional features and examine its scalability using synthetic and trace-based workloads. Unlike traditional file systems, SRB uses a commodity database to manage both system- and user-defined metadata. We quantitatively evaluate this decision and draw insightful conclusions about its implications to the system architecture and performance characteristics. We find that the bulk transfer facilities of SRB demonstrate good scalability properties, and we identify the bottleneck resources across different data search and transfer tasks. We examine the sensitivity to several configuration parameters and provide details about how different internal operations contribute to the overall performance.

High-performance storage systems are evolving towards decentralized commodity clusters that can scale in capacity, processing power, and network throughput. Building such systems requires: (a) Sharing physical resources among applications; (b) Sharing data among applications; (c) Allowing customized data views. Current solutions typically satisfy the first two requirements through a cluster file-system, resulting in monolithic, hard-to-manage systems. In this paper we present a storage system that addresses all three requirements by extending the block layer below the file-system. First, we discuss how our system provides customized (virtualized) storage views within a single node. Then, we discuss how it scales in clustered setups. To achieve efficient resource and data sharing we support block-level allocation and locking as in-band mechanisms. We implement a prototype under Linux and use it to build a shared cluster file-system. Our evaluation in a 24-node cluster setup concludes that our approach offers flexibility, scalability and reduced effort to implement new functionality.

The speed of mass storage devices has a significant impact on the performance of computer systems. The speed that is realized on a particular mass storage device, however, depends heavily on how that device is used. Operating systems, such as the UNIX time-sharing system, use layout policies and head-scheduling disciplines that are designed to work well on average. Numerous studies have shown that disk access patterns exhibit a high degree of locality. Further, studies have shown that these access patterns do not necessarily correspond to the usage patterns anticipated by the system's designers, and that head scheduling is used infrequently enough that it has limited effect. This paper describes the design, implementation, and use of a disk subsystem that adaptively corrects the disparity between expected access patterns and actual access patterns by reorganizing disk data. A representative experiment that demonstrates the resulting performance improvement is presented. (UNIX is a trademark of AT&T Bell Laboratories.)

Performance needs of many database applications dictate that the entire database be stored in main memory. The Dali system is a main memory storage manager designed to provide the persistence, availability and safety guarantees one typically expects from a disk-resident database, while at the same time providing very high performance by virtue of being tuned to support in-memory data. User processes map the entire database into their address space and access data directly, thus avoiding expensive remote procedure calls and buffer manager interactions typical of accesses in disk-resident commercial systems available today. Dali recovers the database to a consistent state in the case of system as well as process failures. It also provides unique concurrency control and memory protection features, as well as ordered and unordered index structures. Both object-oriented and relational database management systems have been implemented on top of Dali. Dali provides access to multiple layers of application programming interface, including its low-level recovery, concurrency control and indexing components as well as its high-level relational component. Finally, various features of Dali can be tailored to the needs of an application to achieve high performance; for example, concurrency control and logging can be turned off if not desired, enabling Dali to efficiently support applications that require non-persistent memory-resident data to be shared by multiple processes.

