首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Java uses exceptions to provide elegant error handling capabilities during program execution. However, the presence of exception handlers complicates the job of the just‐in‐time (JIT) compiler, while exceptions are rarely used in most programs. This paper describes two techniques for reducing such complications. First, we delay the translation of an exception handler until the exception really occurs. This on‐demand translation of exception handlers allows more optimizations when translating the main flow, without being hindered by constraints caused by the exception flows. Secondly, for those exceptions that are actually thrown during program execution we insert exception‐type check code and a direct branch to the translated exception handlers. This exception handler prediction is motivated by an observation that frequently thrown exceptions are likely to be handled by the same exception handlers, so this will eliminate the exception processing overhead of the Java virtual machine. Our experiments indicate that the code quality of the main flow is no longer affected by the presence of exception handlers. Also, frequently thrown exceptions can be efficiently handled by the exception handler prediction. Copyright © 2004 John Wiley & Sons, Ltd.  相似文献   

2.
3.
One of the most promising approaches to Java acceleration in embedded systems is a bytecode-to-C ahead-of-time compiler (AOTC). It improves the performance of a Java virtual machine (JVM) by translating bytecode into C code, which is then compiled into machine code via an existing C compiler. One important design issue in AOTC is efficient exception handling. Since the excepting point and the exception handler may locate in different methods on a call stack, control transfer between them should be streamlined, while an exception would be an “exceptional” event, so it should not slow down normal execution paths. Previous AOTCs often employed a technique called stack cutting based on a setjmp()/longjmp() pair, which we found is involved with too much performance overheads. Also, when the AOTC and the interpreter are employed concurrently (e.g., some methods are AOTCed while other methods are interpreted), the performance of normal execution paths is affected more seriously. This paper proposes a simpler solution based on an exception check after each method call, merged with garbage collection check for reducing its overhead. Our evaluation results on SPECjvm98 on Sun's CVM indicate that our technique can improve the performance of stack cutting by more than 25%. A similar performance benefit can be noted on a hybrid execution environment of both the AOTC and the interpreter.  相似文献   

4.
Complex graphical user interfaces (GUIs) that support a large amount of user interaction require a fast response time, a rich set of building blocks for an esthetic look‐and‐feel, and a development environment that supports ongoing change. On the World Wide Web, client‐side technologies offer more of these features than do server‐side solutions. Java and JavaScript are the two most popular languages used for client‐side GUI implementations. Java implementations require a user to download a plug‐in that contains a virtual machine to execute the Java byte‐code. The installation and maintenance of this plug‐in is sometimes an unsurmountable barrier to using Java. JavaScript lacks some of the desirable features of Java, such as easy to use object‐oriented features and having a GUI class library, but does not require a plug‐in. We have enhanced JavaScript by implementing a new language Object‐JavaScript (OJS) and by providing an OJS library of GUI components, thus making it a viable alternative to Java. Copyright © 2000 John Wiley & Sons, Ltd.  相似文献   

5.
Since its release, the Java programming language has attracted considerable attention from the high‐performance computing (HPC) community because of its portability, high programming productivity, and built‐in multithreading and networking support. As a consequence, several initiatives have been taken to develop a high‐performance Java message‐passing library to program distributed memory architectures, such as clusters. The performance of Java message‐passing applications relies heavily on the communications performance. Thus, the design and implementation of low‐level communication devices that support message‐passing libraries is an important research issue in Java for HPC. MPJ Express is our Java message‐passing implementation for developing high‐performance parallel Java applications. Its public release currently contains three communication devices: the first one is built using the Java New Input/Output (NIO) package for the TCP/IP; the second one is specifically designed for the Myrinet Express library on Myrinet; and the third one supports thread‐based shared memory communications. Although these devices have been successfully deployed in many production environments, previous performance evaluations of MPJ Express suggest that the buffering layer, tightly coupled with these devices, incurs a certain degree of copying overhead, which represents one of the main performance penalties. This paper presents a more efficient Java message‐passing communications device, based on Java Input/Output sockets, that avoids this buffering overhead. Moreover, this device implements several strategies, both in the communication protocol and in the HPC hardware support, which optimizes Java message‐passing communications. In order to evaluate its benefits, this paper analyzes the performance of this device comparatively with other Java and native message‐passing libraries on various high‐speed networks, such as Gigabit Ethernet, Scalable Coherent Interface, Myrinet, and InfiniBand, as well as on a shared memory multicore scenario. The reported communication overhead reduction encourages the upcoming incorporation of this device in MPJ Express ( http://mpj‐express.org ). Copyright © 2011 John Wiley & Sons, Ltd.  相似文献   

6.
Java just‐in‐time compilers often compile only hot methods because the compilation overhead is a part of the running time. This requires precise and efficient hot spot detection, which includes distinguishing hot methods from cold ones, detecting them as early as possible, and paying a small detection overhead. Hot spot detection is especially important in embedded applications because they show more of a start‐up phase behavior of a regular application where methods are not executed heavily, so the hot methods are not definite. Because a long‐running method is likely to be a hot method, we can detect a hot method by measuring its running time during interpretation. However, precise measurement of the running time during execution is too expensive, especially in embedded systems, so many counter‐based heuristics have been proposed to estimate it such as Oracle's HotSpot heuristic. One problem is that although the overhead of these heuristics is low, they do not estimate the running time precisely, which may lead to imprecise hot spot detection.This paper proposes a new hot spot detection heuristic called flow‐sensitive runtime estimation, which can estimate the running time more precisely than others with a relatively low overhead. It only counts important bytecode instructions dynamically, but it can obtain the precise count of all interpreted bytecode instructions with a simple arithmetic calculation. We also propose a static analysis technique to predict those hot methods which spends a huge execution time once invoked, so as to compile them at their first invocation. Our experimental results show that these techniques can improve the performance by as much as an average of 7.4% compared with the HotSpot heuristic for the benchmarks when they run once, which is often regarded as showing the start‐up phase behavior. Even for real embedded Java applications such as the digital TV Java Xlet applications, our techniques can improve the user response time by an average of 7.1%. Copyright © 2015 John Wiley & Sons, Ltd.  相似文献   

7.
As network‐enabled embedded devices and Java grow in their popularity, embedded system researchers start seeking ways to make these devices Java‐enabled. However, it is a challenge to apply Java technology to these devices due to their shortage of resources. In this paper, we propose EJVM (Economic Java Virtual Machine), an economic way to run Java programs on network‐enabled and resource‐limited embedded devices. Espousing the architecture proposed by distributed JVM, we store all Java codes on the server to reduce the storage needs of the client devices. In addition, we use two novel techniques to reduce the client‐side memory footprints: server‐side class representation conversion and on‐demand bytecode loading. Finally, we maintain client‐side caches and provide performance evaluation on different caching policies. We implement EJVM by modifying a freely available JVM implementation, Kaffe. From the experiment results, we show that EJVM can reduce Java heap requirements by about 20–50% and achieve 90% of the original performance. Copyright © 2001 John Wiley & Sons, Ltd.  相似文献   

8.
王朝坤 《计算机教育》2011,(11):48-51,60
针对编译原理教学实际,在分析和修改工业级开源编译器实现代码的基础上,提出一个基于Java的编译原理课程案例教学过程,结合Java这种日益普及的面向对象程序设计语言,这种教学过程在编译原理课程教学方面取得良好效果。  相似文献   

9.
在嵌入式Java芯片中使用即时编译技术   总被引:1,自引:0,他引:1  
Java虚拟机具有面向堆栈与面向对象的特点,不利于硬件有效支持字节码的直接执行,传统JIT也不适应嵌入式系统的应用环境,介绍了在自行设计的嵌入式Java芯片中使用JIT的技术途径,通过对Java虚拟机堆栈和复杂指令的支持,密切配合JIT软件,较好地解决了Java芯片设计中的问题。测试结果表明,相对于目前前界最好的picoJava-Ⅱ内核而言内核而言,JC401的编译后代码性能提高了1.2至1.9倍,在硬件复杂度、执行速度、内存开销等方面都有较大程度的改善,适合于嵌入式应用。  相似文献   

10.
将Java程序静态编译成可执行程序是使用Java虚拟机动态编译/解释执行Java程序的另一种运行Java程序的方式。针对Java异常机制的特点和静态编译的需求,在介绍Java异常处理逻辑的基础上,提出一种在静态编译器中实现Java异常机制的算法,结合Open64开源编译器,给出该算法的具体步骤以及实现方式,以SPECjvm98为测试集,验证该算法的有效性。  相似文献   

11.
Virtual execution environments, such as the Java virtual machine, promote platform‐independent software development. However, when it comes to analyzing algorithm complexity and performance bottlenecks, available tools focus on platform‐specific metrics, such as the CPU time consumption on a particular system. Other drawbacks of many prevailing profiling tools are high overhead, significant measurement perturbation, as well as reduced portability of profiling tools, which are often implemented in platform‐dependent native code. This article presents a novel profiling approach, which is entirely based on program transformation techniques, in order to build a profiling data structure that provides calling‐context‐sensitive program execution statistics. We explore the use of platform‐independent profiling metrics in order to make the instrumentation entirely portable and to generate reproducible profiles. We implemented these ideas within a Java‐based profiling tool called JP. A significant novelty is that this tool achieves complete bytecode coverage by statically instrumenting the core runtime libraries and dynamically instrumenting the rest of the code. JP provides a small and flexible API to write customized profiling agents in pure Java, which are periodically activated to process the collected profiling information. Performance measurements point out that, despite the presence of dynamic instrumentation, JP causes significantly less overhead than a prevailing tool for the profiling of Java code. Copyright © 2008 John Wiley & Sons, Ltd.  相似文献   

12.
Mobile code makes it easier to maintain, debug, update, and customize a system. Active networks are one of the more interesting applications of mobile code: code is injected into the nodes of a network to customize the network's functionality, such as routing, and to add new features, such as special‐purpose congestion control and filtering algorithms. The challenge is to develop a communication‐oriented platform for such systems. We refer to mobile code targeted at low‐level, communication‐oriented systems like active networks as liquid software, the key distinction being that liquid software is focused on the efficient transfer of data, not high‐performance computation. To this end, we have designed and implemented Joust, which consists of a complete re‐implementation of the Java virtual machine (including both the runtime system and a just‐in‐time compiler), running on the Scout operating system (a configurable, communication‐oriented OS). The result is a configurable, high‐performance platform for running liquid software. We present the results of implementing two different applications of liquid software on Joust, including a prototype architecture for active networks. Copyright © 2000 John Wiley & Sons, Ltd.  相似文献   

13.
Providing high‐performance inter‐node communication is a key capability for running high performance computing applications efficiently on parallel architectures. In fact, current systems deployments are aggregating a significant number of cores interconnected via advanced networking hardware with Remote Direct Memory Access (RDMA) mechanisms, that enable zero‐copy and kernel‐bypass features. The use of Java for parallel programming is becoming more promising thanks to some useful characteristics of this language, particularly its built‐in multithreading support, portability, easy‐to‐learn properties, and high productivity, along with the continuous increase in the performance of the Java virtual machine. However, current parallel Java applications generally suffer from inefficient communication middleware, mainly based on protocols with high communication overhead that do not take full advantage of RDMA‐enabled networks. This paper presents efficient low‐level Java communication devices that overcome these constraints by fully exploiting the underlying RDMA hardware, providing low‐latency and high‐bandwidth communications for parallel Java applications. The performance evaluation conducted on representative RDMA networks and parallel systems has shown significant point‐to‐point performance increases compared with previous Java communication middleware, allowing to obtain up to 40% improvement in application‐level performance on 4096 cores of a Cray XE6 supercomputer. Copyright © 2015 John Wiley & Sons, Ltd.  相似文献   

14.
The Java language is popular because of its platform independence, making it useful in a lot of technologies ranging from embedded devices to high‐performance systems. The platform‐independent property of Java, which is visible at the Java bytecode level, is only made possible thanks to the availability of a Virtual Machine (VM), which needs to be designed specifically for each underlying hardware platform. More specifically, the same Java bytecode should run properly on a 32‐bit or a 64‐bit VM. In this paper, we compare the behavioral characteristics of 32‐bit and 64‐bit VMs using a large set of Java benchmarks. This is done using the Jikes Research VM as well as the IBM JDK 1.4.0 production VM on a PowerPC‐based IBM machine. By running the PowerPC machine in both 32‐bit and 64‐bit mode we are able to compare 32‐bit and 64‐bit VMs. We conclude that the space an object takes in the heap in 64‐bit mode is 39.3% larger on average than in 32‐bit mode. We identify three reasons for this: (i) the larger pointer size, (ii) the increased header and (iii) the increased alignment. The minimally required heap size is 51.1% larger on average in 64‐bit than in 32‐bit mode. From our experimental setup using hardware performance monitors, we observe that 64‐bit computing typically results in a significantly larger number of data cache misses at all levels of the memory hierarchy. In addition, we observe that when a sufficiently large heap is available, the IBM JDK 1.4.0 VM is 1.7% slower on average in 64‐bit mode than in 32‐bit mode. Copyright © 2005 John Wiley & Sons, Ltd.  相似文献   

15.
This paper describes JaRec, a portable record/replay system for Java. It correctly replays multi‐threaded, data‐race free Java applications, by recording the order of synchronization operations, and by executing them in the same order during replay. The record/replay infrastructure is developed in Java, and does not require a modification of the Java Virtual Machine (JVM) if it provides the JVM Profiler Interface (JVMPI). If the JVM does not support JVMPI, which is used for intercepting the loaded classes, only a minor modification to the JVM is required in order to run the system. On ystems with limited memory resources, JaRec can be executed in a distributed fashion. This also makes it suitable to aid debugging of multi‐threaded applications on embedded systems. Copyright © 2004 John Wiley & Sons, Ltd.  相似文献   

16.
17.
Java supports the monitor construct for language‐level synchronization in the context of multi‐threading. This paper introduces the lightweight monitor, an efficient user‐level monitor implementation. The lightweight monitor is useful for single‐threaded Java programs as well as for multi‐threaded Java programs with little lock contention. A 32‐bit lock is embedded in each object for efficient lock access while other monitor data structures are managed using a hash table. We highly optimized the lock manipulation code, which is translated and inlined by a just‐in‐time (JIT) compiler. In the most probable cases, only nine SPARC instructions are spent for lock acquisition and five instructions are spent for lock release. Our experimental results indicate that the lightweight monitor is faster than the monitor implementation in the SUN JDK 1.2 RC1 by up to 21 times in the absence of lock contention and by up to seven times in the presence of lock contention. Copyright © 2004 John Wiley & Sons, Ltd.  相似文献   

18.
Code transformation and analysis tools provide support for software engineering tasks such as style checking, testing, calculating software metrics as well as reverse‐ and re‐engineering. In this paper we describe the architecture and the applications of JTransform, a general Java source code processing and transformation framework. It consists of a Java parser generating a configurable parse tree and various visitors (transformers, tree evaluators) which produce different kinds of outputs. While our framework is written in Java, the paper further opens an opportunity for a new generation of XML‐based source code tools. Copyright © 2004 John Wiley & Sons, Ltd.  相似文献   

19.
Scratch‐pad memory (SPM), a small, fast, software‐managed on‐chip SRAM (Static Random Access Memory) is widely used in embedded systems. With the ever‐widening performance gap between processors and main memory, it is very important to reduce the serious off‐chip memory access overheads caused by transferring data between SPM and off‐chip memory. In this paper, we propose a novel compiler‐assisted technique, ISOS (Iteration‐access‐pattern‐based Space Overlapping SPM management), for dynamic SPM management with DMA (Direct Memory Access). In ISOS, we combine both SPM and DMA for performance optimization by exploiting the chance to overlap SPM space so as to further utilize the limited SPM space and reduce the number of DMA operations. We implement our technique based on IMPACT and conduct experiments using a set of benchmarks from DSPstone and Mediabench on the cycle‐accurate VLIW simulator of Trimaran. The experimental results show that our technique achieves run‐time performance improvement compared with the previous work. The average improvements are 13.15, 19.05, and 25.52% when the SPM sizes are 1KB, 512 bytes, and 256 bytes, respectively. Copyright © 2010 John Wiley & Sons, Ltd.  相似文献   

20.
Distributed Java virtual machine (dJVM) systems enable concurrent Java applications to transparently run on clusters of commodity computers. This is achieved by supporting Java's shared‐memory model over multiple JVMs distributed across the cluster's computer nodes. In this work, we describe and evaluate selective dynamic diffing and lazy home allocation, two new runtime techniques that enable dJVMs to efficiently support memory sharing across the cluster. Specifically, the two proposed techniques can contribute to reduce the overheads due to message traffic, extra memory space, and high latency of remote memory accesses that such dJVM systems require for implementing their memory‐coherence protocol either in isolation or in combination. In order to evaluate the performance‐related benefits of dynamic diffing and lazy home allocation, we implemented both techniques in Cooperative JVM (CoJVM), a basic dJVM system we developed in previous work. In subsequent work, we carried out performance comparisons between the basic CoJVM and modified CoJVM versions for five representative concurrent Java applications (matrix multiply, LU, Radix, fast Fourier transform, and SOR) using our proposed techniques. Our experimental results showed that dynamic diffing and lazy home allocation significantly reduced memory sharing overheads. The reduction resulted in considerable gains in CoJVM system's performance, ranging from 9% up to 20%, in four out of the five applications, with resulting speedups varying from 6.5 up to 8.1 for an 8‐node cluster of computers. Copyright © 2007 John Wiley & Sons, Ltd.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号