期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Weighted adaptive concurrency control for software transactional memory

Mohammad Ansari 《The Journal of supercomputing》2014,68(3):1027-1047

Transactional memory programs may have dynamic available parallelism, which is defined as the number of transactions that can be committed concurrently. Prior work presented adaptive concurrency control, which adapts the number of active threads at runtime, and thus the number of concurrently executing transactions, based on available parallelism. Reducing threads when available parallelism is low, and vice versa, improved speedup and reduced wasted work (in aborted transactions). However, prior work did not consider the case where individual threads exhibit dynamic available parallelism. Deactivating threads with low available parallelism, and vice versa, may improve speedup and reduce wasted work further. This paper introduces weighted adaptive concurrency control to exploit the variance in available parallelism between threads. Four algorithms are designed, implemented, and evaluated. They improve speedup and reduce wasted work over prior non-weighted algorithms in applications whose threads exhibit such variance, while maintaining performance parity in applications whose threads do not. 相似文献

2.

Selective dynamic serialization for reducing energy consumption in hardware transactional memory systems

Epifanio Gaona J. Rubén Titos-Gil Juan Fernández Manuel E. Acacio 《The Journal of supercomputing》2014,68(2):914-934

In the search for new paradigms to simplify multithreaded programming, Transactional Memory (TM) is currently being advocated as a promising alternative to deadlock-prone lock-based synchronization. In this way, future many-core CMP architectures may need to provide hardware support for TM. On the other hand, power dissipation constitutes a first class consideration in multicore processor designs. In this work, we propose Selective Dynamic Serialization (SDS) as a new technique to improve energy consumption without degrading performance in applications with conflicting transactions by avoiding wasted work due to aborted transactions. Our proposal, which is implemented on top of a hardware transactional memory (HTM) system with an eager conflict management policy, detects and serializes conflicting transactions dynamically (at run-time). In its simplest form, in case of conflict, one transaction is allowed to continue whilst the rest are completely stalled. Once the executing transaction has finished, it wakes up several of the stalling transactions. More elaborated implementations of SDS try to delay this behavior until serialization of transactions is profitable, achieving the best trade-off between performance, energy savings and network traffic. SDS implementations differ from each other in the condition that triggers the serialization mode. We have evaluated several SDS schemes using GEMS, a full-system simulator implementing the LogTM-SE Eager–Eager HTM system, and several benchmarks from the STAMP suite. Results for a 16-core CMP show that SDS obtains reductions of 6 % on average in energy consumption (more than 20 % in high contention scenarios) in a wide range of benchmarks without affecting, on average, execution time. At the same time, network traffic level is also reduced by 22 %. 相似文献

3.

An abort-aware model of transactional programming

Kousha?Etessami Email author Patrice?Godefroid 《International Journal on Software Tools for Technology Transfer (STTT)》2011,13(6):537-551

There has been a lot of recent research on transaction-based concurrent programming, aimed at offering an easier concurrent programming paradigm that enables programmers to better exploit the parallelism of modern multi-processor machines, such as multi-core microprocessors. We introduce Transactional State Machines (TSMs) as an abstract finite-data model of transactional shared-memory concurrent programs. TSMs are a variant of concurrent boolean programs (or concurrent extended recursive state machines) augmented with additional constructs for specifying potentially nested transactions. Namely, some procedures (or code segments) can be marked as transactions and are meant to be executed “atomically”, and there are also explicit commit and abort operations for transactions. The TSM model is non-blocking and allows interleaved executions where multiple processes can simultaneously be executing inside transactions. It also allows nested transactions, transactions which may never terminate, and transactions which may be aborted explicitly, or aborted automatically by the run-time environment due to memory conflicts. We show that concurrent executions of TSMs satisfy a correctness criterion closely related to serializability, which we call stutter-serializability, with respect to shared memory. We initiate a study of model checking problems for TSMs. Model checking arbitrary TSMs is easily seen to be undecidable, but we show it is decidable in the following case: when recursion is exclusively used inside transactions in all (but one) of the processes, we show that model checking such TSMs against all stutter-invariant ω-regular properties of shared memory is decidable. 相似文献

4.

Profiling and Optimizing Transactional Memory Applications

Ferad Zyulkyarov Srdjan Stipic Tim Harris Osman S. Unsal Adrián Cristal Ibrahim Hur Mateo Valero 《International journal of parallel programming》2012,40(1):25-56

Many researchers have developed applications using transactional memory (TM) with the purpose of benchmarking different implementations, and studying whether or not TM is easy to use. However, comparatively little has been done to provide general-purpose tools for profiling and optimizing programs which use transactions. In this paper we introduce a series of profiling and optimization techniques for TM applications. The profiling techniques are of three types: (i) techniques to identify multiple potential conflicts from a single program run, (ii) techniques to identify the data structures involved in conflicts by using a symbolic path through the heap, rather than a machine address, and (iii) visualization techniques to summarize how threads spend their time and which of their transactions conflict most frequently. Altogether they provide in-depth and comprehensive information about the wasted work caused by aborting transactions. To reduce the contention between transactions we suggest several TM specific optimizations which leverage nested transactions, transaction checkpoints, early release and etc. To examine the effectiveness of the profiling and optimization techniques, we provide a series of illustrations from the STAMP TM benchmark suite and from the synthetic WormBench workload. First we analyze the performance of TM applications using our profiling techniques and then we apply various optimizations to improve the performance of the Bayes, Labyrinth and Intruder applications. We discuss the design and implementation of the profiling techniques in the Bartok-STM system. We process data offline or during garbage collection, where possible, in order to minimize the probe effect introduced by profiling. 相似文献

5.

Improving performance of software transactional memory through contention locality

Ehsan Atoofian 《The Journal of supercomputing》2013,64(2):527-547

In this paper, we introduce contention locality in Transactional Memory (TM) which describes the likelihood that a previously aborted transaction conflicts again in the future. We find that conflicts are highly predictable in TMs and we propose two optimization techniques based on contention locality: The first optimization technique is Speculative Contention Avoidance (SCA). SCA dynamically controls the number of concurrently executing transactions and serializes those transactions that are likely to conflict. As such, SCA reduces contention in TMs and improves performance. The second optimization technique is Adaptive Validation (AV). We show that there is no single validation policy that works well across all applications. AV adjusts validation based on applications’ behavior and improves performance of TMs. In this paper, SCA and AV are evaluated using Transactional Locking II (TL2) and Stamp v0.9.10 benchmark suite. The evaluation reveals that SCA and AV are effective and improve performance significantly. 相似文献

6.

STM多事务竞争冲突下竞争管理策略的研究

谢灵均胡大裟蒋玉明《现代计算机》2014,(10):3-6

事务存储系统是一种高层次抽象并行编程模型,目的为方便开发并行程序。事务存储系统中的竞争管理模块用于解决事务之间的冲突。传统的事务竞争管理策略只负责仲裁两个冲突事务之间的冲突．提出将多个事务及事务冲突关联转换成一张无向图,基于全局事务冲突情景,利用图顶点着色技术求解无向图中最大独立集。最大独立集中事务相互不冲突,CM仲裁处理并发执行,实现系统并发最大化。相似文献

7.

事务存储中的一种自适应冲突检测算法

下载免费PDF全文

彭林谢伦国张小强《计算机工程与科学》2009,31(11)

事务存储被认为是极具前景的多核处理器并行编程的手段,但存在开销过大的问题。采用Bloom Filter对事务间访问共享变量进行冲突检测,能够有效地降低开销,但其存在误判会导致不必要的事务作废,因此要尽可能减少。简要介绍了Bloom Filter和事务存储,提出了一种事务存储的自适应冲突检测算法ACDA,根据事务读写集合大小自适应地调整Bloom Filter的位串大小,在较低开销的情况下,保持误判率不增加。分析了软件事务存储中实现ACDA的特点,初步实现ACDA,与主流软件事务存储实现RSTM相比,在事务存储测试程序STAMP中,开销可接受的前提下,减少因误判而作废的事务最高达93%。给出了对ACDA哈希函数进一步优化的思路。相似文献

8.

Inherent Limitations on Disjoint-Access Parallel Implementations of Transactional Memory

Hagit Attiya Eshcar Hillel Alessia Milani 《Theory of Computing Systems》2011,49(4):698-719

Transactional memory (TM) is a popular approach for alleviating the difficulty of programming concurrent applications; TM guarantees that a transaction, consisting of a sequence of operations, appear to be executed atomically. Two fundamental properties of TM implementations are disjoint-access parallelism and the invisibility of read operations. Disjoint access parallelism ensures that operations on disconnected data do not interfere, and thus it is critical for TM scalability. The invisibility of read operations means that their implementation does not write to the memory, thereby reducing memory contention. 相似文献

9.

基于依赖图的硬件事务存储技术研究 总被引：1，自引：0，他引：1

曾坤杨学军《计算机研究与发展》2012,49(1):44-54

事务存储技术能够简化并行程序中对共享资源的访问控制,是当前的研究热点之一.目前,多数基于硬件的事务存储系统采用基于冲突检测与处理的并发控制协议,当检测到两事务发生冲突时就中止二者之一.但是对事务间"冲突"更深入的分析表明,某些"冲突"并不一定会导致事务的回退,这种冲突称为"弱冲突".基于依赖图的硬件事务存储技术能够避免弱冲突引发的多余事务回退.模拟实验表明,基于依赖图的事务存储系统与基于冲突处理的事务存储系统相比具有明显的性能优势. 相似文献

10.

Checkpointing for optimistic concurrency control methods

Thomasian A. 《Knowledge and Data Engineering, IEEE Transactions on》1995,7(2):332-339

Restart-oriented concurrency control (CC) methods, such as optimistic CC, outperform blocking-oriented methods, such as standard two-phase locking in a high data contention environment, but this is at the cost of wasted processing due to restarts. Volatile savepoints are considered in this study as a method to reduce this wasted processing and to improve response time. There is the usual tradeoff between the checkpointing overhead and the wasted processing when a transaction is restarted. Our study shows that in a system where objects are accessed and updated uniformly during the lifetime of transactions, significant improvement in performance at high data conflict levels are attainable only when the checkpointing cost is low. This implies few optimally placed checkpoints per transaction. It is observed that checkpointing may result in a significant improvement in performance when access to database hot-spots are deferred to the final steps of transaction execution. The parametric studies reported in this paper are facilitated by closed-form analytic solutions expressing system performance, as well as an iterative solution which takes into account hardware resource contention in addition to data contention 相似文献

11.

Transactional Contention Management as a Non-Clairvoyant Scheduling Problem

Hagit Attiya Leah Epstein Hadas Shachnai Tami Tamir 《Algorithmica》2010,57(1):44-61

The transactional approach to contention management guarantees consistency by making sure that whenever two transactions have a conflict on a resource, only one of them proceeds. A major challenge in implementing this approach lies in guaranteeing progress, since transactions are often restarted. Inspired by the paradigm of non-clairvoyant job scheduling, we analyze the performance of a contention manager by comparison with an optimal, clairvoyant contention manager that knows the list of resource accesses that will be performed by each transaction, as well as its release time and duration. The realistic, non-clairvoyant contention manager is evaluated by the competitive ratio between the last completion time (makespan) it provides and the makespan provided by an optimal contention manager. 相似文献

12.

Atomic RMI: A Distributed Transactional Memory Framework

Konrad?Siek Email author Pawe??T.?Wojciechowski 《International journal of parallel programming》2016,44(3):598-619

This paper presents Atomic RMI, a distributed transactional memory framework that supports the control flow model of execution. Atomic RMI extends Java RMI with distributed transactions that can run on many Java virtual machines located on different network nodes. Our system employs SVA, a fully-pessimistic concurrency control algorithm that provides exclusive access to shared objects and supports rollback and fault tolerance. SVA is capable of achieving a relatively high level of parallelism by interweaving transactions that access the same objects and by making transactions that do not share objects independent of one another. It also allows any operations within transactions, including irrevocable ones, like system calls, and provides an unobtrusive API. Our evaluation shows that in most cases Atomic RMI performs better than fine grained mutual-exclusion and read/write locking mechanisms. Atomic RMI also performs better than an optimistic transactional memory in environments with high contention and a high ratio of write operations, while being competitive otherwise. 相似文献

13.

Pinpointing and scheduling access conflicts to improve internal resource utilization in solid-state drives

Xuchao XIE Liquan XIAO Dengping WEI Qiong LI Zhenlong SONG Xiongzi GE 《Frontiers of Computer Science》2019,13(1):35

Modern solid-state drives (SSDs) are integrating more internal resources to achieve higher capacity. Parallelizing accesses across internal resources can potentially enhance the performance of SSDs. However, exploiting parallelism inside SSDs is challenging owing to real-time access conflicts. In this paper, we propose a highly parallelizable I/O scheduler (PIOS) to improve internal resource utilization in SSDs from the perspective of I/O scheduling. Specifically, we first pinpoint the conflicting flash requests with precision during the address translation in the Flash Translation Layer (FTL). Then, we introduce conflict eliminated requests (CERs) to reorganize the I/O requests in the device-level queue by dispatching conflicting flash requests to different CERs. Owing to the significant performance discrepancy between flash read and write operations, PIOS employs differentiated scheduling schemes for read and write CER queues to always allocate internal resources to the conflicting CERs that are more valuable. The small dominant size prioritized scheduling policy for the write queue significantly decreases the average write latency. The high parallelism density prioritized scheduling policy for the read queue better utilizes resources by exploiting internal parallelism aggressively. Our evaluation results show that the parallelizable I/O scheduler (PIOS) can accomplish better SSD performance than existing I/O schedulers implemented in both SSD devices and operating systems. 相似文献

14.

Implementation tradeoffs in the design of flexible transactional memory support

Arrvindh Shriraman Sandhya Dwarkadas Michael L. Scott 《Journal of Parallel and Distributed Computing》2010

We present FlexTM (FLEXible Transactional Memory), a high performance TM framework that allows software to determine when (eagerly, lazily, or in a mixed fashion) and how to manage conflicts, while employing hardware to manage transactional state and to track conflicts. FlexTM coordinates four decoupled hardware mechanisms: read and write signatures, which summarize per-thread access sets; per-thread conflict summary tables (CSTs), which identify the processors with which conflicts have occurred; Programmable Data Isolation, which buffers speculative updates in the local cache and uses an overflow table to handle unbounded updates; and Alert-On-Update, which notifies a thread immediately when a specified location is written by another processor. The CSTs enable an STM-inspired commit protocol that manages conflicts in a decentralized manner (no global arbitration) and allows parallel commits. 相似文献

15.

A time complexity lower bound for adaptive mutual exclusion

Yong-Jik Kim James H. Anderson 《Distributed Computing》2012,24(6):271-297

We consider the time complexity of adaptive mutual exclusion algorithms, where “time” is measured by counting the number of remote memory references required per critical-section access. For systems that support (only) read, write, and comparison primitives (such as compare-and-swap), we establish a lower bound that precludes a deterministic algorithm with o(k) time complexity, where k is point contention. In particular, it is impossible to construct a deterministic O(log k) algorithm based on such primitives. 相似文献

16.

A Commit Strategy for Distributed Real-Time Transaction

下载免费PDF全文

覃飙刘云生杨进才《计算机科学技术学报》2003,18(5):0-0

Ramamritham gives three common types of constraints for the execution his-tory of concurrent transactions. This paper extends the constraints and gives the fourth type of constraint. Then the weak commit dependency and abort dependency between transactions, be-cause of data access conflicts, axe analyzed. Based on the analysis, an optimistic commit protocol 2LC (two-Level Commit) is proposed, which is specially designed for the distributed real-time do-main. It allows transactions to optimistically access the locked data in a controlled manner, which reduces the data inaccessibility and priority inversion inherent and undesirable in distributed real-time database systems. Furthermore, if the prepared transaction is aborted, the transactions in its weak commit dependency set will execute as normal according to 2LC. Extensive simulation ex-periments have been performed to compare the performance of 2LC with that of the base protocol,the permits reading of modified prepared-data for timeliness (PROMPT) and the deadline-driven conflict resolution (DDCR). The simulation results show that 2LC is effective in reducing the num-ber of missed transaction deadlines. Furthermore, it is easy to be incorporated with the existing concurrency control protocols. 相似文献

17.

A new fast-path mechanism for mutual exclusion

James H. Anderson Yong-Jik Kim 《Distributed Computing》2001,14(1):17-29

Summary. Several years ago, Yang and Anderson presented an N-process algorithm for mutual exclusion under read/write atomicity that has time complexity, where “time” is measured by counting remote memory references. In this algorithm, instances of a two-process mutual exclusion algorithm are embedded within a binary arbitration tree. In the two-process algorithm that was used, all busy-waiting is done by “local spinning.” Performance studies presented by Yang and Anderson showed that their N-process algorithm exhibits scalable performance under heavy contention. One drawback of using an arbitration tree, however, is that each process is required to perform remote memory operations even when there is no contention. To remedy this problem, Yang and Anderson presented a variant of their algorithm that includes a “fast-path” mechanism that allows the arbitration tree to be bypassed in the absence of contention. This algorithm has the desirable property that contention-free time complexity is O(1). Unfortunately, the fast-path mechanism that was used caused time complexity under contention to rise to in the worst case. To this day, the problem of designing a read/write mutual exclusion algorithm with O(1) time complexity in the absence of contention and O(logN) time complexity under contention has remained open. In this paper, we close this problem by presenting a fast-path mechanism that achieves these time complexity bounds when used in conjunction with Yang and Anderson's arbitration-tree algorithm. Received: July 1999 / Accepted: July 2000 相似文献

18.

Observations on optimistic concurrency control schemes

Theo Härder 《Information Systems》1984,9(2):111-120

Optimistic concurrency control schemes allow uncontrolled access to shared data objects during transaction processing under the explicit assumption that read and write conflicts among transactions are rare events. Before a transaction commits, the DBMS has to validate that no conflict has occurred. Conflict resolution mainly relies on transaction abort.Two different optimistic concurrency control schemes are introduced and compared to each other. The problems of implementing such schemes and their implications on DBMS processing is investigated in some detail. A number of general properties of optimistic concurrency control schemes is derived, and their advantages and drawbacks w.r.t. two-phase locking approaches are discussed. 相似文献

19.

Transactional scheduling for read-dominated workloads

Hagit Attiya Alessia Milani 《Journal of Parallel and Distributed Computing》2012

The transactional approach to contention management guarantees atomicity by aborting transactions that may violate consistency. A major challenge in this approach is to schedule transactions in a manner that reduces the total time to perform all transactions (the makespan), since transactions are often aborted and restarted. The performance of a transactional scheduler can be evaluated by the ratio between its makespan and the makespan of an optimal, clairvoyant scheduler that knows the list of resource accesses that will be performed by each transaction, as well as its release time and duration. 相似文献

20.

Transactional support for adaptive indexing

Goetz Graefe Felix Halim Stratos Idreos Harumi Kuno Stefan Manegold Bernhard Seeger 《The VLDB Journal The International Journal on Very Large Data Bases》2014,23(2):303-328

Adaptive indexing initializes and optimizes indexes incrementally, as a side effect of query processing. The goal is to achieve the benefits of indexes while hiding or minimizing the costs of index creation. However, index-optimizing side effects seem to turn read-only queries into update transactions that might, for example, create lock contention. This paper studies concurrency control and recovery in the context of adaptive indexing. We show that the design and implementation of adaptive indexing rigorously separates index structures from index contents; this relaxes constraints and requirements during adaptive indexing compared to those of traditional index updates. Our design adapts to the fact that an adaptive index is refined continuously and exploits any concurrency opportunities in a dynamic way. A detailed experimental analysis demonstrates that (a) adaptive indexing maintains its adaptive properties even when running concurrent queries, (b) adaptive indexing can exploit the opportunity for parallelism due to concurrent queries, (c) the number of concurrency conflicts and any concurrency administration overheads follow an adaptive behavior, decreasing as the workload evolves and adapting to the workload needs. 相似文献