期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

dCompaction: Speeding up Compaction of the LSM-Tree via Delayed Compaction

Feng-Feng Pan Yin-Liang Yue Jin Xiong 《计算机科学技术学报》2017,32(1):41-54

Key-value (KV) stores have become a backbone of large-scale applications in today’s data centers. Writeoptimized data structures like the Log-Structured Merge-tree (LSM-tree) and their variants are widely used in KV storage systems like BigTable and RocksDB. Conventional LSM-tree organizes KV items into multiple, successively larger components, and uses compaction to push KV items from one smaller component to another adjacent larger component until the KV items reach the largest component. Unfortunately, current compaction scheme incurs significant write amplification due to repeated KV item reads and writes, and then results in poor throughput. We propose a new compaction scheme, delayed compaction (dCompaction) that decreases write amplification. dCompaction postpones some compactions and gathers them into the following compaction. In this way, it avoids KV item reads and writes during compaction, and consequently improves the throughput of LSM-tree based KV stores. We implement dCompaction on RocksDB, and conduct extensive experiments. Validation using YCSB framework shows that compared with RocksDB, dCompaction has about 40% write performance improvements and also comparable read performance. 相似文献

2.

SSDKV:一种SSD友好的键值对存储系统

梅飞曹强《计算机工程与科学》2016,38(7):1299-1308

当前大量键值对(Key-Value)存储系统使用固态硬盘(SSD)改善系统的I/O响应速度。但是现有的键值对存储系统应用程序使用标准文件系统处理数据在固态硬盘上的存储,这对应用程序而言底层固态盘的物理特性被屏蔽,同时固态盘也无法针对应用程序的特定I/O模式进行优化,使得基于固态盘的键值对系统性能没有得到充分发挥。针对此问题,设计了同时考虑键值对应用程序存取行为和SSD存储器访问特性的存储管理模块,并与LevelDB结合实现了一种轻量级的、将上层应用与底层存储集成一体的键值对系统—SSDKV。它提供键值对接口给外部程序,结合键值对数据的特点构造适应SSD的数据布局。SSDKV简化了传统文件系统对键值对数据的额外处理,并根据键值对数据的类型及其存取模式对SSD存储空间进行有效管理,使得基于SSD设备的键值对系统性能进一步提高。通过基准程序测试,与运行于传统文件系统上的LevelDB相比,SSDKV使得写性能提高达4倍,读性能提高达1.5倍。相似文献

3.

Cascaded Write Amplification of LSM-tree-based Key-Value Stores underlying Solid-State Disks

《Microprocessors and Microsystems》2020

Log-structured merge tree (i.e., LSM-tree)-based key–value stores (i.e., KV stores) are widely used in big-data applications and provide high performance. NAND Flash-based Solid-state disks (i.e., SSDs) have become a popular storage device alternative to hard disk drives (i.e., HDDs) because of their high performance and low power consumption. LSM-tree KV stores with SSDs are deployed in large-scale storage systems, which aims to achieve high performance in the cloud. Write amplification in LSM-tree KV stores and NAND Flash memory in SSDs are defined as WA1 and WA2 in this paper. The former, which is attributed to compaction operations in LSM-tree-based KV stores, is a burden on I/O bandwidth between the host and the device. The latter, which results from out-place updates in NAND Flash memory, blocks user I/O requests between the host and NAND Flash memory, thereby degrading the SSD performance. Write amplification impairs the overall system performance. In this study, we explored the two-level cascaded write amplification in LSM-tree KV stores with SSDs. The cascaded write amplification is represented as WA. Our primary goal is to comprehensively study two-level cascaded write amplification on the host-side LSM-tree KV stores and the device-side SSDs. We quantitatively analyze the impact of two-level write amplification on overall performance. The cascaded write amplification is 16.44 (WA1 is 16.55; WA2 is 0.99) and 35.51 (WA1 is 16.6; WA2 is 2.14) for SSD-I and SSD-S with LevelDB’s default setting under DB_bench. The larger cascaded write amplification of KV stores has a bad impact on SSD performance and lifetime. The throughput of SSD-S and SSD-I under an 80%-write workload is approximately 0.28x and 0.31x of that under a 100%-write workload. Therefore, it is important to design a novel approach to balance the cost of an SSD lifetime caused by cascaded write amplification and its high performance under the read-write-mixed workloads. We attempt to reveal details of cascaded write amplification and hope that this study is useful for developers of LSM-tree-based KV stores and SSD software stacks. 相似文献

4.

On the necessity of hot and cold data identification to reduce the write amplification in flash-based SSDs

《Performance Evaluation》2014

相似文献

5.

一种混合映射闪存转换层的设计与实现

郁志平刘伟彭虎《计算机工程》2014,(2):300-302,307

使用NAND Flash作为存储媒介的存储设备常需要闪存转换层(FTL)对NAND进行管理。页映射是一种常见的映射方式,但需要很大的内存存放页映射表,在嵌入式环境下这一条件往往无法满足。针对该问题,提出一种基于超级块的混合映射FTL,包括坏块管理、地址翻译、垃圾回收、上电恢复,使用的SRAM空间不到128 KB,远小于页映射,同时不需要存储映射表,程序在固态硬盘开发板上成功运行,实现固态硬盘基本读写功能。测试结果表明,该混合映射FTL方案具有较好的顺序读写性能。相似文献

6.

Performance of a class of multi-robot deploy and search strategies based on centroidal voronoi configurations

K.R. Guruprasad Debasish Ghose 《International journal of systems science》2013,44(4):680-699

This article considers a class of deploy and search strategies for multi-robot systems and evaluates their performance. The application framework used is deployment of a system of autonomous mobile robots equipped with required sensors in a search space to gather information. The lack of information about the search space is modelled as an uncertainty density distribution. The agents are deployed to maximise single-step search effectiveness. The centroidal Voronoi configuration, which achieves a locally optimal deployment, forms the basis for sequential deploy and search (SDS) and combined deploy and search (CDS) strategies. Completeness results are provided for both search strategies. The deployment strategy is analysed in the presence of constraints on robot speed and limit on sensor range for the convergence of trajectories with corresponding control laws responsible for the motion of robots. SDS and CDS strategies are compared with standard greedy and random search strategies on the basis of time taken to achieve reduction in the uncertainty density below a desired level. The simulation experiments reveal several important issues related to the dependence of the relative performances of the search strategies on parameters such as the number of robots, speed of robots and their sensor range limits. 相似文献

7.

On the cost of uniform protocols whose memory consumption is adaptive to interval contention

Burkhard Englert 《Theoretical computer science》2009,410(6-7):533-545

Recently, we introduced a novel term, memory-adaptive, whose goal it is to capture what it means for a distributed protocol to most efficiently make use of its shared memory. We proved three results that relate to the memory-adaptive model in the uniform setting. We considered a store/release protocol where processes are required to store a value in shared MWMR memory so that it cannot be overwritten until it has been released by the process. We showed that there do not exist uniformly wait-free store/release protocols using only the basic operations read and write that are memory-adaptive to point contention. We further showed that there exists a uniformly wait-free store/release protocol using only the basic operations read, write, and read-modify-write that is memory-adaptive to interval contention and time-adaptive to total contention. This left a significant gap — it remained open as to whether there exists a uniform, memory adaptive to interval contention store/release protocol that only uses read/write (no read-modify-write) registers. In this paper, we close this gap by showing that no such protocol can exist. We furthermore illustrate the validity and practicality of the concept of memory adaptiveness by providing a uniform, memory-adaptive to interval contention store/release protocol for Network Attached Disks. 相似文献

8.

Sensor selection for fault diagnosis in uncertain systems

Daniel Jung Yi Dong Erik Frisk Mattias Krysander Gautam Biswas 《International journal of control》2020,93(3):629-639

ABSTRACT

Finding the cheapest, or smallest, set of sensors such that a specified level of diagnosis performance is maintained is important to decrease cost while controlling performance. Algorithms have been developed to find sets of sensors that make faults detectable and isolable under ideal circumstances. However, due to model uncertainties and measurement noise, different sets of sensors result in different achievable diagnosability performance in practice. In this paper, the sensor selection problem is formulated to ensure that the set of sensors fulfils required performance specifications when model uncertainties and measurement noise are taken into consideration. However, the algorithms for finding the guaranteed global optimal solution are intractable without exhaustive search. To overcome this problem, a greedy stochastic search algorithm is proposed to solve the sensor selection problem. A case study demonstrates the effectiveness of the greedy stochastic search in finding sets close to the global optimum in short computational time. 相似文献

9.

Blocking analysis of transaction processing queues

Tugrul Vidyadhar G. 《Performance Evaluation》2001,46(4)

A transaction processing queue manages a database which is partitioned into N items. Each arriving class-i customer requests to read and write a certain subset of the N items (called the shared and exclusive access sets and ). Classes i and j are said to conflict if

Full-size image

. No two conflicting classes of customers can be processed simultaneously. All classes arrive according to independent Poisson processes and have general i.i.d. service times.In this paper, we discuss database systems without queuing. We show the insensitivity property of the system, and derive analytical expressions for performance measures such as blocking probabilities, throughput, etc. 相似文献

10.

Avoiding game‐tree pathology in 2‐player adversarial search

下载免费PDF全文

Inon Zuckerman Brandon Wilson Dana S. Nau 《Computational Intelligence》2018,34(2):542-561

Adversarial search, or game‐tree search, is a technique for analyzing an adversarial game to determine what moves a player should make in order to win a game. Until recently, lookahead pathology (in which deeper game‐tree search results in worse play) has been thought to be quite rare. We provide an analysis that shows that every game should have some sections that are locally pathological, assuming that both players can potentially win the game. We also modify the minimax algorithm to recognize local pathologies in arbitrary games and cut off search accordingly (shallower search is more effective than deeper search when local pathologies occur). We show experimentally that our modified search procedure avoids local pathologies and consequently provides improved performance, in terms of decision accuracy, when compared with the minimax algorithm. In addition, we provide an experimental evaluation on the African game of Kalah, which shows the improved performances of our suggested error‐minimizing minimax algorithm when there is a large degree of pathology. 相似文献

11.

Optimizing the costs of hierarchical quorum consensus

Akhil Kumar Kavindra Malik 《Acta Informatica》1996,33(3):255-275

We study the problem of how to minimize the cost of maintaining consistency among at least N copies of an object in an enviroment where the mix of read and write operations can vary. Quorum consensus requires that read and write operations must assemble appropriate quorums before an operation can succeed. The cost of an operation is proportional to the size of a quorum, and the objective is obviously to minimize the cost while still maintaining consistency. It is known that the quorum size can be reduced by organizing a number of copies into logical structures such as grids and hierarchies. In this paper, we show (a) how methods based on grids and hierarchies can be treated in a common framework, and (b) how these hierarchies can be optimized so as to minimize the cost of consensus. Of course, the optimal solution depends upon the mix of read and write operations that is present. Consequently, given N copies of an object and a ratio of write operations F, our algorithms determine the optimal values for the number of levels in the hierarchy and the read/write quorum sizes at each level. The algorithms, which run in O(N ^1.63) and O(N ²) time, were tested, and the results are reported and discussed. Received September 1, 1992/February 16, 1995 相似文献

12.

面向数值模拟数据的HDF5性能优化

沈卫超曹立强夏芳宋磊《计算机研究与发展》2012,(Z1):314-318

大规模数值模拟数据对可视化分析提出了挑战,I/O是影响可视化交互性能的重要因素.HDF5是科学计算领域广泛采用的存储格式,介绍了HDF5的抽象数据模型、数据读写流程,并使用典型数值模拟数据测试了HDF5的读性能.测试发现HDF5的数据集定位开销较大.根据数值模拟数据的数据块以整数有规律编号的特点,通过在HDF5中增加数据块视图对象来提高读性能.测试表明,该方法可显著加速数据的读取性能. 相似文献

13.

Improving Shift‐Reduce Phrase‐Structure Parsing with Constituent Boundary Information

下载免费PDF全文

Wenliang Chen Muhua Zhu Min Zhang Yue Zhang Jingbo Zhu 《Computational Intelligence》2017,33(3):428-447

Shift‐reduce parsing enjoys the property of efficiency because of the use of efficient parsing algorithms like greedy/deterministic search and beam search. In addition, shift‐reduce parsing is much simpler and easy to implement compared with other parsing algorithms. In this article, we explore constituent boundary information to improve the performance of shift‐reduce phrase‐structure parsing. In previous work, constituent boundary information has been used to speed up chart parsers successfully. However, whether it is useful for improving parsing accuracy has not been investigated. We propose two different models to capture constituent boundary information, based on which two sets of novel features are designed for a shift‐reduce parser. The first model is a boundary prediction model that uses a classifier to predict the boundaries of constituents. We use automatically parsed data to train the classifier. The second one is a Tree Likelihood Model that measures the validity of a constituent by its likelihood which is calculated on automatically parsed data. Experimental results show that our proposed method outperforms a strong baseline by 0.8% and 1.6% in F‐score on English and Chinese data, respectively, achieving the competitive parsing accuracies on Chinese (84.8%) and English (90.8%). To our knowledge, this is the first time for shift‐reduce phrase‐structure parsing to advance the state‐of‐the‐art with constituent boundary information. 相似文献

14.

Approximate algorithms for static and continuous range queries in mobile navigation

Haidar AL-Khalidi David Taniar Maytham Safar 《Computing》2013,95(10-11):949-976

For many years, spatial range search has been applied to computational geometry and multimedia problems to find interest objects within a given radius. Range search query has traditionally been used to return all objects within a given radius. However, having all objects is not necessary, especially when there are already enough objects closer to the query point. Furthermore, expanding the radius may give users better results, especially when there are a lot of objects just outside the search boundary. Therefore, in this paper, we focus on approximate range search, where the query results are approximate, rather than exact. We propose approximate static range search (ARS) which combines two approaches, namely (i) lowerbound approximate range search, and (ii) upperbound approximate range search. Using ARS, we are able to deliver a better performance, together with low false hit and reasonable false miss. We also extend ARS in the context of a continuous query setting, in which the query moves. This is particularly important in spatial databases as a mobile user who invokes the query is moving. In terms of continuous range search, the intention is to find split points—the locations where the query results will be updated. Accordingly, we propose two methods for approximate continuous range search, namely (i) range search minimization, and (ii) split points minimization. Our performance evaluation which compares our methods with the traditional continuous range search shows that our methods considerably reduce the number of split points, thereby improving overall performance. 相似文献

15.

Using eager strategies to improve NFS I/O performance

《International Journal of Parallel, Emergent and Distributed Systems》2013,28(2):134-158

Typical network file system (NFS) clients write lazily: they leave dirty pages in the page cache and defer writing to the server. This reduces network traffic when applications repeatedly modify the same set of pages. However, this approach can lead to memory pressure, when the number of available pages on the client system is so low that the system must work harder to reclaim dirty pages. We show that NFS performance is poor under memory pressure and present two mechanisms to solve it: eager writeback and eager page laundering. These mechanisms change the client's data management policy from lazy to eager, in which dirty pages are written back proactively, resulting in higher throughput for sequential writes. In addition, we show that NFS servers suffer from out-of-order file operations, which further reduce performance. We introduce request ordering, a server mechanism to process operations, as much as possible, in the order they were sent by the client, which improves read performance substantially. We have implemented these techniques in the Linux operating system. I/O performance is improved, with the most pronounced improvement visible for sequential access to large files. We see 33% improvement in the performance of streaming write workloads and more than triple the performance of streaming read workloads. We evaluate several non-sequential workloads and show that these techniques do not degrade performance, and can sometimes improve performance. 相似文献

16.

Optimal Search and One-Way Trading Online Algorithms 总被引：15，自引：0，他引：15

R. El-Yaniv A. Fiat R. M. Karp G. Turpin 《Algorithmica》2001,30(1):101-139

This paper is concerned with the time series search and one-way trading problems. In the (time series) search problem a player is searching for the maximum (or minimum) price in a sequence that unfolds sequentially, one price at a time. Once during this game the player can decide to accept the current price p in which case the game ends and the player's payoff is p . In the one-way trading problem a trader is given the task of trading dollars to yen. Each day, a new exchange rate is announced and the trader must decide how many dollars to convert to yen according to the current rate. The game ends when the trader trades his entire dollar wealth to yen and his payoff is the number of yen acquired. The search and one-way trading are intimately related. Any (deterministic or randomized) one-way trading algorithm can be viewed as a randomized search algorithm. Using the competitive ratio as a performance measure we determine the optimal competitive performance for several variants of these problems. In particular, we show that a simple threat-based strategy is optimal and we determine its competitive ratio which yields, for realistic values of the problem parameters, surprisingly low competitive ratios. We also consider and analyze a one-way trading game played against an adversary called Nature where the online player knows the probability distribution of the maximum exchange rate and that distribution has been chosen by Nature. Finally, we consider some applications for a special case of portfolio selection called two-way trading in which the trader may trade back and forth between cash and one asset. Received October 19, 1998; revised August 12, 1999. 相似文献

17.

pRPL + pGTIOL: The marriage of a parallel processing library and a parallel I/O library for big raster data

《Environmental Modelling & Software》2017

Data I/O has become a major bottleneck of computational performance of geospatial analysis and modeling. In this study, a parallel GeoTIFF I/O library (pGTIOL) was developed. Through the storage mapping and data arrangement techniques, pGTIOL can operate on files in either strip or tile storage mode, read/write any sub-domain of data within the raster dataset. pGTIOL enables asynchronized I/O, which means a process can read/write its own sub-domains of data when necessary without synchronizing with other processes. pGTIOL was integrated into the parallel raster processing library (pRPL). Several pGTIOL-based data I/O functions and options were added to pRPL, while the existing functions of pRPL stay intact. Experiments showed that the integration of pRPL and pGTIOL achieved higher performance than the original pRPL that uses GDAL as the I/O interface. Therefore, pRPL + pGTIOL enables transparent parallelism for high-performance raster processing with the capability of true parallel I/O of massive raster datasets. 相似文献

18.

Search algorithms for automated validation

Tom A.N. Engels Jan Friso Groote Muck J. van Weerdenburg Tim A.C. Willemse 《The Journal of Logic and Algebraic Programming》2009,78(4):274-287

A novel search technique called highway search is introduced. The search technique relies on a highway simulation which takes several homogeneous walks through a (possibly infinite) state space. Furthermore, we provide a memory-efficient algorithm that approximates a highway search and we prove that, under particular conditions, they coincide. The effectiveness of highway search is compared to two mainstream search techniques, viz. random search and randomised depth-first search. Our results demonstrate that randomised depth-first search explores the least amount of states in the effort of finding states of interest, whereas a highways search yields the shortest witnessing traces to such states. 相似文献

19.

A parallel computational approach for similarity search using Bloom filters

下载免费PDF全文

Sachendra Singh Chauhan Shalini Batra 《Computational Intelligence》2018,34(2):713-733

Finding similar items in a large and unstructured dataset is a challenging task in many applications of data science, such as searching, indexing, and retrieval. With the increasing data volume and demand for real time responses, similarity search has gained much consideration. In this paper, a parallel computational approach for similarity search using Bloom filters (PCASSB) has been proposed, which uses Bloom filter for the representation of features of document and comparison with user's query. Query features are stored in integer query array (^IQ_A), an array of integer. The PCASSB, an approximate similarity search technique, has been implemented on graphics processing unit with compute unified device architecture as the programming platform. To compute the similarity score between query and reference dataset, Dice coefficient has been used as a baseline method. The accuracy of the results generated by PCASSB is compared with the baseline method and other state‐of‐the‐art methods. The experimental results show that the proposed technique is quite effective in processing large number of text documents as it takes less computational time. 相似文献

20.

Persistent clustered main memory index for accelerating <Emphasis Type="Italic">k</Emphasis>-NN queries on high dimensional datasets

Alexander Thomasian Lijuan Zhang 《Multimedia Tools and Applications》2008,38(2):253-270

Similarity search implemented via k-nearest neighbor— k-NN queries on multidimensional indices is an extremely useful paradigm for content-based image retrieval. As the dimensionality of feature vectors increases the curse of dimensionality sets in, i.e., the performance of k-NN search of disk-resident indices in the R-tree family degrades rapidly due to the overlap in index pages in high dimensions. This problem is dealt with in this study by utilizing the double filtering effect of clustering and indexing. The clustering algorithm ensures that the largest cluster fits into main memory and that only clusters closest to a query point need to be searched and hence loaded into main memory. We organize the data in each cluster according to the ordered-partition—OP-tree main memory resident index, which is not prone to the curse of dimensionality and highly efficient for processing k-NN queries. We serialize an OP-tree by writing its dynamically allocated nodes into contiguous memory locations, optimize its parameters, and make it persistent by writing it to disk. The time to read and write clusters constituting an OP-tree with a single sequential access to disk benefits from higher data transfer rates of modern disk drives. The performance of the index is further improved by applying the Karhunen–Loève transformation—KLT to the dataset, since this results in a more efficient computation of distances for k-NN queries. We compare OP-trees and sequential scans with and without a KL-transformation and with and without using a shortcut method in calculating Euclidean distances. A comparison against the OMNI-sequential scan is also reported. We finally compare a clustered and persistent version of the OP-tree against a clustered version of the SR-tree and the VA-file method. CPU time is measured and elapsed time is estimated in this study. It is observed that the OP-tree index outperforms the other two methods and that the improvement increases with the number of dimensions.

Lijuan ZhangEmail:

相似文献