Similar Literature
20 similar documents found.
1.
2.
A Type-Based Storage Management Algorithm for the Runtime Environment   (total citations: 1; self-citations: 0; citations by others: 1)
张武生  杨广文  郑纬民 《软件学报》2005,16(8):1506-1512
A garbage collection algorithm that manages the heap by classifying objects according to their types reduces the time the runtime environment needs to create objects by reusing discarded objects, and further improves the storage management efficiency of the runtime system through techniques such as thread-local caching and leasing. Experiments show that the algorithm achieves fine-grained parallelism between collector threads and worker threads and shortens object allocation and reclamation times, thereby reducing worker-thread pauses, making server applications run more smoothly, and improving heap space utilization.
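The abstract gives no implementation detail, but the core idea of reusing discarded objects through per-type, per-thread caches can be pictured with a minimal C++ sketch. All names here (`TypedPool`, `Node`) are invented for illustration and are not taken from the paper.

```cpp
#include <cstddef>
#include <new>

// Hypothetical sketch: recycle discarded objects of one type T through a
// thread-local free list, so most allocations avoid the general-purpose heap.
template <typename T>
class TypedPool {
    struct Node { Node* next; };
    // Each worker thread keeps its own cache of reusable slots for T.
    static thread_local Node* free_list_;

public:
    static void* allocate() {
        static_assert(sizeof(T) >= sizeof(Node),
                      "object must be large enough to hold a free-list link");
        if (Node* n = free_list_) {           // fast path: reuse a dead object
            free_list_ = n->next;
            return n;
        }
        return ::operator new(sizeof(T));     // slow path: grow the heap
    }

    // Called when the collector decides an object of type T is garbage.
    static void recycle(void* p) {
        Node* n = static_cast<Node*>(p);
        n->next = free_list_;
        free_list_ = n;
    }
};

template <typename T>
thread_local typename TypedPool<T>::Node* TypedPool<T>::free_list_ = nullptr;
```

With such a pool, the common-case allocation is a single pointer pop from the thread-local list, which is the effect the paper reports as shorter allocation and reclamation times and fewer worker-thread pauses.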

3.
Andrew W. Appel 《Software》1989,19(2):171-183
Generational garbage collection algorithms achieve efficiency because newer records point to older records; the only way an older record can point to a newer record is by a store operation to a previously created record, and such operations are rare in many languages. A garbage collector that concentrates just on recently allocated records can take advantage of this fact. Such a garbage collector can be so efficient that the allocation of records costs more than their disposal. A scheme for quick record allocation attacks this bottleneck. Many garbage-collected environments do not know when to ask the operating system for more memory. A robust heuristic solves this problem. This paper presents a simple, efficient, low-overhead version of generational garbage collection with fast allocation, suitable for implementation in a Unix environment.
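As a rough illustration of the "quick record allocation" mentioned above, the following is a hedged C++ sketch of bump-pointer allocation in a nursery, with a minor collection triggered when the nursery fills; the structure and names are assumptions for the example, not Appel's implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical nursery for a generational collector: new records are carved
// off a contiguous region with a single pointer bump, which is why allocating
// a record can cost less than disposing of it.
struct Nursery {
    std::uint8_t* base;    // start of the nursery
    std::uint8_t* top;     // next free byte
    std::uint8_t* limit;   // end of the nursery

    void* allocate(std::size_t bytes) {
        if (top + bytes > limit) {
            minor_collect();               // evacuate survivors, reset `top`
            // Sketch assumes a single record always fits in an empty nursery.
        }
        void* record = top;
        top += bytes;                      // the entire cost of the common case
        return record;
    }

    void minor_collect() {
        // A real collector would copy the (usually few) reachable records to
        // the older generation here before recycling the whole region.
        top = base;
    }
};
```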

4.
Coordination systems, in particular Linda, have established themselves as important tools for the development of applications to open systems such as the Internet. This paper shows how to tackle a forgotten, but crucial problem in open coordination systems: memory management. As with any system which intends to be of wide use and because memory is a finite resource, coordination systems must address the problems of memory exhaustion. This paper first explores the orthogonality between coordination and computation in order to make it clear that the problem of memory exhaustion in coordination systems cannot be solved using garbage collection schemes implemented at the computation language—a garbage collection scheme must exist in the coordination environment as well. Following the explanation on orthogonality, the paper will focus on describing a garbage collection scheme for the Linda family of coordination systems. It is expected that the solution in Linda can be adapted to other coordination systems as long as they are based on tuple space communication. Copyright © 2003 John Wiley & Sons, Ltd.

5.
Techniques such as data race detection and deterministic replay are widely used to address data races and deadlocks in multithreaded programs caused by nondeterministic memory access ordering. However, because these techniques must monitor the program's memory accesses, they usually incur significant runtime overhead. Experiments show that multithreaded programs contain a large number of objects that are assigned only once; removing the monitoring of memory accesses to such objects does not affect the correctness of these techniques and effectively reduces runtime overhead. On this basis, this paper formally defines single-assignment objects, proposes a static single-assignment analysis algorithm for objects, and applies its results to several mature data race detection and deterministic replay systems. Test results show that single-assignment object analysis effectively reduces the runtime overhead of data race detection and deterministic replay systems, thereby broadening their range of applications.
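A loose illustration of how a detector might consume the analysis result (this is a toy consumer of such information, not the paper's static single-assignment analysis itself): accesses whose allocation site has been proven single-assignment are simply not instrumented.

```cpp
#include <unordered_set>
#include <cstdint>

// Toy consumer of a single-assignment analysis: allocation sites proven to
// produce objects that are written exactly once are exempted from the
// expensive monitoring that race detection or deterministic replay needs.
struct AccessMonitor {
    std::unordered_set<std::uintptr_t> single_assignment_sites;  // analysis output

    void on_write(std::uintptr_t alloc_site, void* addr) {
        if (single_assignment_sites.count(alloc_site)) {
            return;                          // skip: monitoring adds no information
        }
        record_for_detection(addr);          // normal, costly path
    }

    void record_for_detection(void* /*addr*/) {
        // Shadow-memory updates, vector clocks, or log appends would go here.
    }
};
```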

6.
Automatic memory management and the hiding of the notion of pointers are the prominent features of symbolic processing languages. They make programming easy and guarantee the safety of memory references. For the memory management of linked data structures, copying garbage collection is most widely used because of its simplicity and desirable properties. However, if certain properties about runtime storage allocation and the behavior of pointers can be obtained by static analysis, a compiler may be able to generate object code closer to that of procedural programs. In the fields of parallel, distributed and real-time computation, it is highly desirable to be able to identify data structures in a program that can be managed without using garbage collection. To this end, this paper proposes a framework of linearity analysis for a concurrent logic language Moded Flat GHC, and proves its basic property. The purpose of linearity analysis is to distinguish between fragments of data structures that may be referenced by two or more pointers and those that cannot be referenced by two or more pointers. Data structures with only one reader are amenable to compile-time garbage collection or local reuse. The proposed framework of linearity analysis is constraint-based and involves both equality and implicational constraints. It has been implemented as part of klint v2, a static analyzer for KL1 programs.

7.
雷兵兵  严华 《计算机应用》2017,37(4):1149-1152
To address the low reclamation performance, poor wear leveling, and high memory overhead of existing NAND flash garbage collection algorithms, a garbage collection algorithm based on the heat of logical regions is proposed. The algorithm redefines the heat computation formula, treats a range of consecutive logical addresses of NAND memory as a heat region, and uses the heat of a logical region in place of the heat of individual logical pages; data of different heat levels is stored separately on flash blocks with different erase counts, effectively separating hot and cold data while saving memory. In addition, the algorithm constructs a new reclamation cost function to select victim blocks, taking both reclamation efficiency and wear leveling into account. Experimental results show that, compared with the high-performing FaGC algorithm, the proposed algorithm reduces the total number of erases by 11%, the total number of page copies by 13%, the maximum difference in erase counts by 42%, and memory consumption by 75%. The algorithm therefore helps increase the available flash space, improve the read/write performance of the flash system, and extend flash lifetime.
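The abstract does not reproduce the heat formula or the reclamation cost function, so the sketch below only illustrates the general shape of such victim selection: favor blocks that are cheap to reclaim and lightly worn. The weights, the `Block` fields, and `pick_victim` are made up for the example.

```cpp
#include <vector>
#include <cstddef>

// Illustrative victim selection for flash garbage collection: prefer blocks
// with many invalid pages (cheap to reclaim) and few erases so far (wear
// leveling).  The weights below are invented; the paper's cost function and
// heat formula are not reproduced here.
struct Block {
    int invalid_pages;   // pages reclaimable without copying
    int valid_pages;     // pages that must be copied out first
    int erase_count;     // wear suffered so far
};

std::size_t pick_victim(const std::vector<Block>& blocks, int max_erase) {
    std::size_t best = 0;
    double best_score = -1.0;
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        const Block& b = blocks[i];
        int pages = b.invalid_pages + b.valid_pages;
        double efficiency = pages ? double(b.invalid_pages) / pages : 0.0;
        double wear = max_erase ? 1.0 - double(b.erase_count) / max_erase : 1.0;
        double score = 0.7 * efficiency + 0.3 * wear;   // illustrative weights
        if (score > best_score) { best_score = score; best = i; }
    }
    return best;  // index of the block to erase next
}
```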

8.
As multithreaded server applications and runtime systems prevail, garbage collection is becoming an essential feature to support high performance systems, especially those running data-intensive applications. The fundamental issue of garbage collector (GC) design is to maximize the recycled space with minimal time overhead. This paper proposes two innovative solutions: one to improve space efficiency, and the other to improve time efficiency. To achieve space efficiency, we propose the Space Tuner that utilizes the novel concept of allocation speed to reduce wasted space. Conventional static space partitioning techniques often lead to inefficient space utilization. The Space Tuner adjusts the heap partitioning dynamically such that when a collection is triggered, all space partitions are fully filled. To achieve time efficiency, we propose a novel parallelization method that reduces the compacting GC parallelization problem into a tree traversal parallelization problem. This method can be applied for both normal and large object compaction. Object compaction is hard to parallelize due to strong data dependencies such that the source object cannot be moved to its target location until the object originally in the target location has been moved out. Our proposed algorithm overcomes the difficulties by dividing the heap into equal-sized blocks and parallelizing the movement of the independent blocks. It is noteworthy that these proposed algorithms are generic such that they can be utilized in different GC designs. The proposed techniques have been implemented in the Apache Harmony JVM and we evaluated the proposed algorithms with the SPECjbb and DaCapo benchmark suites. The experimental results demonstrate that our proposed algorithms greatly improve space utilization and the corresponding parallelization schemes are scalable, which brings time efficiency.
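The Space Tuner's use of allocation speed can be pictured with a short calculation; the sketch below is a guess at the mechanism for two partitions, not the Apache Harmony code, and the names (`Partition`, `retune`) are invented.

```cpp
#include <cstddef>

// Hypothetical Space Tuner step: split a fixed heap between two spaces in
// proportion to their observed allocation speeds, so that both spaces become
// full at roughly the same moment and no partition sits half-empty when a
// collection is triggered.
struct Partition { std::size_t bytes; };

void retune(Partition& nursery, Partition& mature,
            double nursery_alloc_rate,    // bytes allocated per ms, measured
            double mature_promotion_rate, // bytes promoted per ms, measured
            std::size_t total_heap_bytes) {
    double total_rate = nursery_alloc_rate + mature_promotion_rate;
    if (total_rate <= 0) return;          // nothing measured yet; keep sizes
    nursery.bytes = static_cast<std::size_t>(
        total_heap_bytes * (nursery_alloc_rate / total_rate));
    mature.bytes = total_heap_bytes - nursery.bytes;
}
```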

9.
This paper presents a low-power tag organization for physically tagged caches in embedded processors with virtual memory support. An exceedingly small subset of tag bits is identified for each application hot-spot so that only these tag bits are used for cache access with no performance sacrifice as they provide complete address resolution. The minimal subset of physical tag bits is dynamically updated following the changes in the physical address space of the application. Operating system support is introduced in order to maintain the reduced tags during program execution. Efficient algorithms are incorporated within the memory allocator and the dynamic linker in order to achieve dynamic update of the reduced tags. The only hardware support needed within the I/D-caches is the support for disabling bitlines of the tag arrays. An extensive set of experimental results demonstrates the efficacy of the proposed approach.
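As an illustration of why a small subset of tag bits can suffice, the following greedy routine picks bit positions until every resident physical tag is distinguishable. It is only a sketch: the paper drives this selection from OS-maintained hot-spot information rather than a standalone search, and `minimal_tag_bits` is an invented name.

```cpp
#include <vector>
#include <set>
#include <cstdint>
#include <cstddef>

// Greedily add the tag bit position that best separates the still-colliding
// physical tags; the remaining tag bitlines could then be gated off.
std::vector<int> minimal_tag_bits(const std::vector<std::uint32_t>& tags,
                                  int tag_width) {
    std::vector<int> chosen;
    auto groups = [&](const std::vector<int>& bits) {
        std::set<std::uint32_t> seen;
        for (std::uint32_t t : tags) {
            std::uint32_t key = 0;
            for (std::size_t i = 0; i < bits.size(); ++i)
                key |= ((t >> bits[i]) & 1u) << i;
            seen.insert(key);
        }
        return seen.size();                 // number of distinguishable groups
    };
    while (groups(chosen) < tags.size()) {  // some tags still collide
        int best_bit = -1;
        std::size_t best_groups = groups(chosen);
        for (int b = 0; b < tag_width; ++b) {
            std::vector<int> trial = chosen;
            trial.push_back(b);
            std::size_t g = groups(trial);
            if (g > best_groups) { best_groups = g; best_bit = b; }
        }
        if (best_bit < 0) break;            // duplicate tags cannot be separated
        chosen.push_back(best_bit);
    }
    return chosen;   // tag bits that must stay enabled for full resolution
}
```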

10.
Multilevel graphics structures are highly suited for interactive applications of computer graphics. The facilities concerning the semantics of picture naming and structuring provided by a graphics package are highly influenced by the ease of implementation and operational requirements. This paper presents a multilevel graphics structure provided by GRASP, a 3D graphics system for Pascal users. Picture segmentation commands provided by GRASP allow pictures to be hierarchically structured as collections of sub-pictures which themselves are collections of instances of predefined picture segments. Complex changes in a picture can be effected by commands which allow modifications of graphical entities at specified levels in the picture hierarchy. An edited sub-picture can be redefined as a segment to allow restructuring of the picture hierarchy. The implementation of the scheme presented here has been facilitated through the use of dynamic data structures composed of pointers and records. Powerful error reporting functions aid the programmer considerably in detecting errors, if any. A garbage collection scheme conserves working memory space. A couple of applications described at the end of the paper illustrate the flexibility provided by the scheme.

11.
To address the high communication overhead of existing radio frequency identification (RFID) anti-collision algorithms, an improved multi-bit identification algorithm is proposed. Without lowering the identification efficiency of the original algorithm, it adopts a framed-slot structure to avoid repeatedly transmitting query prefixes; in addition, by locating the collided bits and recovering only those bits, it further reduces the algorithm's communication overhead. Simulation results show that, compared with the anti-collision algorithm based on multi-bit identification, the proposed algorithm reduces both the tag-side and the total communication overhead, with the total communication overhead reduced by up to 20%.

12.
N-step incremental straight-line algorithms   (total citations: 7; self-citations: 0; citations by others: 7)
This class of algorithms extends Bresenham's (1965) integer straight-line algorithm to generate more than one pixel per inner loop, thus reducing inner loop overhead. The quad-step algorithm is too large to justify its use in older hardware with limited memory space, but it can be viable in the context of modern memory and software sizes. Because the algorithm reduces both calculation overhead and the number of memory accesses for adjacent pixels, it can improve the performance of current systems that are limited in their processor speed and of future systems that might be limited in their memory speed. The algorithm gives results identical to those from Bresenham's single-step routine while drawing pixels in the expected direction from start to end point. Furthermore, as the gradual trend towards more bits per pixel continues, a processor supporting multi-word burst data instructions could make good use of this algorithm in speeding up line drawing into a 24-bits-per-pixel, 1-pixel-per-word color frame buffer. I chose to implement 4 steps per loop because it gave a useful performance improvement without exceeding the resources of the target processor, and it was small enough to hand-code. However, the techniques described can be used to construct a straight-line algorithm that generates more than 4 steps per loop. The relatively small average decision tree sizes indicate that algorithms of greater than 4 pixels per step might further improve line-drawing efficiency.
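For reference, the single-step integer Bresenham loop that the N-step algorithms unroll looks roughly like the following C++ sketch (restricted to the first octant, 0 <= dy <= dx); the multi-step variants emit several pixels per iteration to amortize the loop and decision overhead while producing identical pixels.

```cpp
#include <functional>

// Classic single-step Bresenham for the first octant (0 <= dy <= dx).
// The decision variable d tracks twice the signed distance between the ideal
// line and the pixel centre, using only integer additions.
void bresenham_first_octant(int x0, int y0, int x1, int y1,
                            const std::function<void(int, int)>& plot) {
    int dx = x1 - x0, dy = y1 - y0;
    int d = 2 * dy - dx;              // initial decision value
    int y = y0;
    for (int x = x0; x <= x1; ++x) {
        plot(x, y);
        if (d > 0) { ++y; d -= 2 * dx; }
        d += 2 * dy;
    }
}
```

Unrolling this loop N times and tabulating the possible decision outcomes yields the N-step variants; the output stays pixel-identical because the decision variable is updated exactly as above.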

13.
Benjamin Zorn 《Software》1993,23(7):733-756
Because dynamic memory management is an important part of a large class of computer programs, high-performance algorithms for dynamic memory management have been, and will continue to be, of considerable interest. Experience indicates that for many programs, dynamic storage allocation is so important that programmers feel compelled to write and use their own domain-specific allocators to avoid the overhead of system libraries. As an alternative to explicit storage management techniques, conservative garbage collection has been suggested as an important algorithm for dynamic storage management in C programs. In this paper, I evaluate the costs of different dynamic storage management algorithms, including domain-specific allocators, widely-used general-purpose allocators, and a publicly available conservative garbage collection algorithm. Surprisingly, I find that programmer enhancements often have little effect on program performance. I also find that the true cost of conservative garbage collection is not the CPU overhead, but the memory system overhead of the algorithm. I conclude that conservative garbage collection is a promising alternative to explicit storage management and that the performance of conservative collection is likely to improve in the future. C programmers should now seriously consider using conservative garbage collection instead of explicitly calling free in programs they write.
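To make the alternative concrete, here is a small sketch using the Boehm-Demers-Weiser conservative collector in place of explicit `free`. The entry points shown (`GC_INIT`, `GC_MALLOC`, `GC_get_heap_size`) are the usual bdwgc API, but header paths and link flags vary by installation, so treat this as a sketch rather than a verified build.

```cpp
// Sketch of replacing explicit deallocation with conservative collection.
// Assumes the Boehm-Demers-Weiser collector (bdwgc) is installed and linked
// with -lgc; the exact header path can vary (<gc.h> or <gc/gc.h>).
#include <gc.h>
#include <cstdio>

struct Node {
    int value;
    Node* next;
};

int main() {
    GC_INIT();                       // initialize the collector once
    Node* head = nullptr;
    for (int i = 0; i < 1000000; ++i) {
        // GC_MALLOC returns zero-filled, collectable memory; no free() later.
        Node* n = static_cast<Node*>(GC_MALLOC(sizeof(Node)));
        n->value = i;
        n->next = head;
        head = n;
        if (i % 1000 == 0) head = nullptr;   // drop the list; it becomes garbage
    }
    std::printf("heap size: %zu bytes\n", GC_get_heap_size());
    return 0;
}
```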

14.
To address the poor wear leveling and high garbage collection overhead of flash management in the Android storage system, a hot/cold data separation strategy is introduced: files are written to physical storage units matching their heat level, and the garbage collection policy is improved, so as to achieve good wear leveling and reduce garbage collection overhead. Experimental results on the Android platform show that the improved strategy effectively reduces the extra garbage collection overhead of NAND flash while also improving its wear leveling.

15.
A New Method for Detecting Overlap Between the Runtime Stack and the Static Data Area   (total citations: 1; self-citations: 0; citations by others: 1)
Because of the limited memory in embedded systems, errors in which the runtime stack overlaps the data area occur easily. The two existing methods for detecting this error have shortcomings in accuracy and ease of use and are not suitable for large-scale regression testing based on software simulators. By changing the layout of the runtime stack and the static data area, this paper turns the error of the stack overlapping the static data area into the error of the stack exceeding the memory address space. The new method greatly simplifies the detection and debugging of this kind of runtime error.
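A crude C++ sketch of the check being simplified (not the paper's simulator-side relayout itself): compare the current stack pointer against the linker-provided end of the static data area. The `_end` symbol is a common convention on Unix and embedded toolchains, but its name and the downward-growing-stack layout assumed here are toolchain-specific assumptions.

```cpp
#include <cstdint>
#include <cstdio>

// Toolchain-provided symbol marking the end of the static data area (.bss) on
// many Unix/embedded linkers; the name is an assumption and may differ.
extern "C" char _end;

// Crude runtime check: if the stack (assumed to grow downward toward the data
// segment) has reached the end of static data, the two regions are about to
// overlap.  The paper instead rearranges the layout so that this condition
// becomes an out-of-address-space access the simulator can trap directly.
bool stack_overlaps_static_data() {
    volatile char marker = 0;                       // lives near the stack top
    std::uintptr_t sp = reinterpret_cast<std::uintptr_t>(&marker);
    std::uintptr_t data_end = reinterpret_cast<std::uintptr_t>(&_end);
    return sp <= data_end;
}

int main() {
    std::printf("overlap: %s\n", stack_overlaps_static_data() ? "yes" : "no");
}
```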

16.
Monitoring resource consumption is fundamental in software engineering, e.g., in validation of quality requirements, performance engineering, or adaptive software systems. However, resource monitoring does not come for free as it typically leads to overhead in the observed program. Minimizing this overhead and increasing the reliability of the monitored data are major goals in realizing resource monitoring tools. Typically, this is achieved by limiting capabilities, e.g., supported resources, granularity of the monitoring focus, or runtime access to results. Thus, in practice, several approaches must often be combined to obtain relevant information. We describe SPASS-meter, a novel resource monitoring approach for Java and Android apps, which combines these conflicting capabilities with low overhead. SPASS-meter supports a large set of resources, flexible configuration of the monitoring scope even for user-defined semantic units (components), runtime analysis and online access to monitoring results in a platform-independent way. We discuss the concepts of SPASS-meter, its architecture, realization and validation, the latter in terms of case studies and an overhead analysis based on performance experiments with SPASS-meter, OpenCore and Kieker. SPASS-meter provides a detailed view of the runtime resource consumption at reasonable overhead of less than 3% processing power and 0.5% memory consumption in our experiments.

17.
Memory leaks are usually not associated with runtime environments that use automatic garbage collection; however, memory leaks do happen in such environments and are challenging to detect and trace to a root cause. In industry, manual heap dump analysis is currently the most popular way of finding memory leaks, despite the many automated methods proposed by researchers over the years. However, heap dump analysis alone cannot answer all the questions needed to fix a leak effectively. This paper reviews memory leak detection approaches proposed over the years and classifies them from the point of view of assessed metrics, performance overhead and intrusiveness. In addition, we classify the methods into online, offline and hybrid groups based on their features.

18.
This paper develops a formalism that precisely characterizes when class tables are required for C++ memory layouts. A memory layout is a particular choice of data structures for implementing run-time support for object-oriented languages. We use this formalism to quantify and evaluate, on a set of benchmarks, the space overhead for a set of C++ memory layouts. In particular, this paper studies the space overhead due to three language features: virtual dispatch, virtual inheritance, and dynamic typing. To date, there has been no scientific quantification or evaluation of C++ memory layouts. Our approach can help C++ implementors. This work has already influenced the memory layout design choices in IBM's Visual Age C++ V5 compiler. Applying our approach to a set of five benchmarks, we demonstrate that the impact of object-oriented space overhead can vary dramatically between applications (ranging from 0.42% to 99.79% for our benchmarks). In particular, applications whose object space is dominated by instances of classes that heavily use object-oriented language features will be significantly impacted by the choice of a memory layout. Copyright © 2003 John Wiley & Sons, Ltd.
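The per-object cost of virtual dispatch that the paper quantifies can be observed directly with `sizeof`; the self-contained C++ example below only shows the hidden vtable pointer, and the exact sizes are implementation-defined.

```cpp
#include <cstdio>

// Identical data members, with and without virtual dispatch.  The virtual
// version carries a hidden vtable pointer per object, which is the kind of
// per-instance space overhead the paper measures across benchmarks.
struct Plain {
    int x;
    void get() {}
};

struct WithVirtual {
    int x;
    virtual void get() {}
    virtual ~WithVirtual() = default;
};

int main() {
    // On a typical 64-bit ABI this prints 4 and 16 (vptr plus padding),
    // but the exact numbers are implementation-defined.
    std::printf("Plain: %zu bytes, WithVirtual: %zu bytes\n",
                sizeof(Plain), sizeof(WithVirtual));
    return 0;
}
```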

19.
NAND Flash is the most prevalent memory technology used today in data storage systems covering a wide range of applications, from consumer devices to high-end enterprise systems. In this work, we present a modular and versatile FPGA-based platform that achieves accurate emulation of multiple NAND Flash channels. The NAND Flash emulator is based on an expandable and reconfigurable architecture that can be used for developing and testing new NAND Flash controllers and for analysing the behaviour of existing NAND Flash controllers and/or host device drivers. The presented NAND Flash emulator is based on PCIe-based FPGA boards attached to a high-end server, supports standard memory interfaces, responds to all memory commands in proper time and has the capability to emulate memory space in the range of a few TBs. The NAND Flash emulator has been prototyped and tested, and experimental results demonstrate that all timing requirements are satisfied under maximum read/write workloads. The NAND Flash emulator also includes a hardware tracer unit that records information of all commands exchanged at the NAND Flash interfaces along with high resolution timestamps. The recorded information can be used to analyse higher level functions, like wear leveling and garbage collection, and combined with other software tools for analysing cognitive functions. Experimental results demonstrate the advantage of using this emulator for analysing how host device drivers implement wear leveling and garbage collection functions.

20.
Stride prefetching is recognized as an important technique to improve memory access performance. Prior work usually profiles and/or analyzes the program behavior offline, and uses the identified stride patterns to guide the compilation process by injecting prefetch instructions at appropriate places. Some research has tried to enable stride prefetching in runtime systems with online profiling, but those approaches either cannot discover cross-procedural prefetch opportunities or require special support in hardware or garbage collection. In this paper, we present a prefetch engine for the JVM (Java Virtual Machine). It first identifies candidate load operations during just-in-time (JIT) compilation, and then instruments the compiled code to profile the addresses of those loads. The runtime profile is collected in a trace buffer, which triggers a prefetch controller upon a protection fault. The prefetch controller analyzes the trace to discover any stride patterns, then modifies the compiled code to inject the prefetch instructions in place of the instrumentation. One major advantage of this engine is that it can detect striding loads at arbitrary code locations, for both regular and irregular code, not being limited to plain loop or procedure scopes. In fact, we found cross-procedural patterns account for about 30% of all prefetches in the representative Java benchmarks. Another major advantage of the engine is that its runtime overhead (at most 4.0%) is much smaller than the benefits it brings. Our evaluation with the Apache Harmony JVM shows that the engine achieves an average 6.2% speed-up with SPECJVM98 and DaCapo on an Intel Pentium 4 platform, in spite of the runtime overhead.
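A hedged sketch of the core stride-detection step: given the last few addresses recorded for one load site, check whether consecutive differences agree and, if so, prefetch a few strides ahead. `__builtin_prefetch` is the GCC/Clang intrinsic used here only for illustration; the engine described above instead injects prefetch instructions into the JIT-compiled code, and `StrideProfile` is an invented name.

```cpp
#include <cstdint>
#include <cstddef>

// Toy stride detector for one profiled load site: if the recent addresses form
// a constant stride, issue a software prefetch a few iterations ahead.  This
// mirrors the idea of the engine described above, not its implementation.
struct StrideProfile {
    static constexpr int kHistory = 4;
    std::uintptr_t last[kHistory] = {};
    int seen = 0;

    void record(std::uintptr_t addr) {
        last[seen % kHistory] = addr;
        ++seen;
        if (seen < kHistory) return;
        // Compare differences between consecutive samples, oldest to newest.
        std::intptr_t stride = 0;
        bool constant = true;
        for (int i = 1; i < kHistory; ++i) {
            std::intptr_t d =
                static_cast<std::intptr_t>(last[(seen - kHistory + i) % kHistory]) -
                static_cast<std::intptr_t>(last[(seen - kHistory + i - 1) % kHistory]);
            if (i == 1) stride = d;
            else if (d != stride) { constant = false; break; }
        }
        if (constant && stride != 0) {
            // Prefetch a few strides ahead of the newest address (GCC/Clang
            // intrinsic; a distance of 4 strides is an arbitrary choice here).
            __builtin_prefetch(
                reinterpret_cast<const void*>(addr + 4 * stride), 0, 1);
        }
    }
};
```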
