11.
In today's consumer electronics market, Java has become one of the most important programming languages for the rapid development of mobile applications, ranging from home appliances and controllers, through mobile and communication devices, to network-centric applets. However, the demand for high-performance, low-power Java-based consumer mobile applications poses new challenges for system design and implementation. This paper analyzes the energy consumption, execution efficiency, and speed of Java applications in a typical consumer mobile device environment. Adopting a hardware-assisted approach, we introduce a Java accelerator with a companion Java virtual machine. The accelerator is designed in an asynchronous style and can be integrated with most existing processors and operating systems. The core architecture, design philosophy, and implementation considerations are presented in detail.
12.
Practical Performance Analysis Methods for Parallel Programs  Cited by: 2 (self-citations: 0; citations by others: 2)
莫则尧 《数值计算与计算机应用》2000,21(4):266-275
First, after discussing in detail the main factors in reaching peak floating-point performance on current high-performance microprocessors, this paper analyzes the principal sources of speedup for parallel application codes on parallel computers built from these microprocessors. Second, it presents a set of performance evaluation rules for parallel codes, which can reveal their overall numerical and parallel performance relative to the serial codes, suggest strategies for improving performance, and explain exactly the reasons for super-linear speedup. Numerical experimental results for two realistic application codes on two parallel computers are also given.
13.
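Scalability is an important performance metric for parallel computing systems, yet although heterogeneous systems are increasingly common, their scalability has received little study. This paper gives a definition of efficiency that suits both homogeneous and heterogeneous parallel computing systems, analyzes scalability on the basis of that definition, and derives an isoefficiency model applicable to both kinds of system. From the overhead ratio, it then derives how system size and workload must change for the efficiency to stay at a given constant. Finally, an experimental analysis shows that the model evaluates efficiency and scalability well and can predict high scalability in parallel computing systems.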
14.
15.
16.
《International Journal of Parallel, Emergent and Distributed Systems》2012,27(4):283-299
This paper proposes a parallel algorithm for computing an N (= k^n) point Lagrange interpolation on k-ary n-cube networks. The algorithm consists of three phases: initialisation, main and final. There is no computation in the initialisation phase. The main phase is composed of N/2 steps, each consisting of four multiplications and four subtractions, and an additional step including one division and one multiplication. Communication in the main phase is based on an all-to-all broadcast algorithm on a Hamiltonian ring embedded in a k-ary n-cube. The final phase is carried out in n × ⌈k/2⌉ steps, each requiring one addition. A performance evaluation of the proposed algorithm reveals near-optimal speedup for a typical range of system parameters used in current state-of-the-art implementations. Our study also reveals that, when implementation cost is taken into account, low-dimensional k-ary n-cubes achieve better speedup than their higher-dimensional counterparts.
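For reference, the quantity that the abstract's algorithm computes in parallel is the N-point Lagrange interpolant. The sketch below is a plain sequential Python version, not the paper's k-ary n-cube algorithm; the function name `lagrange_interpolate` and the sample points are illustrative assumptions.

```python
from typing import Sequence


def lagrange_interpolate(xs: Sequence[float], ys: Sequence[float], x: float) -> float:
    """Evaluate the Lagrange interpolating polynomial through (xs[i], ys[i]) at x.

    Sequential O(N^2) reference version (hypothetical helper, not from the paper).
    """
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis polynomial L_i(x) = prod over j != i of (x - x_j) / (x_i - x_j)
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


if __name__ == "__main__":
    # Three samples of f(x) = x**2; the interpolant reproduces the parabola,
    # so evaluating at x = 3 gives a value very close to 9.
    print(lagrange_interpolate([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 3.0))
```

Each of the N basis terms is an independent product, which is what makes the computation a natural candidate for distribution across N nodes.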
17.
18.
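Through experiments, the performance of three typical clustering algorithms on the Spark platform (K-means, bisecting K-means, and Gaussian mixture clustering) is compared in terms of running time, speedup, scalability, and sizeup. The results show that: 1) as the number of nodes increases, the running time of all three algorithms on datasets of 100 MB or more drops markedly; 2) when the dataset is larger than 500 MB, the speedup of all three algorithms improves noticeably and grows approximately linearly with the number of nodes; 3) the scalability of all three algorithms decreases as the number of nodes increases, and for datasets larger than 500 MB bisecting K-means scales worst compared with K-means and the Gaussian mixture algorithm; 4) when the dataset is larger than 100 MB, the sizeup of the Gaussian mixture algorithm is far higher than that of K-means and bisecting K-means.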
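The abstract does not state the formulas behind its speedup, scalability, and sizeup measurements. As a rough sketch, the Python below uses the definitions commonly found in the parallel clustering literature, which are an assumption here; the function names and all timings are hypothetical and are not results from the paper.

```python
def speedup(t_one_node: float, t_m_nodes: float) -> float:
    """Same dataset, 1 node vs. m nodes: T(1, D) / T(m, D). Ideal value is m."""
    return t_one_node / t_m_nodes


def scaleup(t_base: float, t_scaled: float) -> float:
    """Nodes and data grown together: T(1, D) / T(m, m*D). Ideal value is 1."""
    return t_base / t_scaled


def sizeup(t_base_data: float, t_scaled_data: float) -> float:
    """Same nodes, data grown k times: T(m, k*D) / T(m, D). Below k is good."""
    return t_scaled_data / t_base_data


if __name__ == "__main__":
    # Hypothetical running times in seconds, chosen only for illustration.
    print("speedup:", speedup(120.0, 40.0))   # 1 node vs. several nodes, same data
    print("scaleup:", scaleup(120.0, 150.0))  # data and nodes scaled together
    print("sizeup:", sizeup(40.0, 100.0))     # larger data on a fixed cluster
```

Under these assumed definitions, a scaleup that falls as nodes are added corresponds to the decreasing scalability described in finding 3.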
19.
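§1. Introduction. In recent years, mesoscale numerical models have become familiar to the meteorological community in China. With improvements made during their application, increasingly complex model physics, and higher resolution, their computational cost has grown markedly, which makes parallel computing especially important. Anhui Province has recently acquired a domestically built 曙光-1000 (Dawning-1000) parallel computer, which gives us favorable conditions. To parallelize the model, a thorough analysis and restructuring of the MM4 serial program is essential. Through experiments, we achieved breakthroughs in both data structure and program structure. The results show that the parallel computation agrees exactly with the execution of the serial program, and the speedup rose from an initial 1:1.5 to 1:6.72 (16 CPUs), which is preliminarily satisfactory. This work was carried out at 中国…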
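The 1:6.72 (16 CPUs) figure can be put in context by converting it to parallel efficiency, that is, speedup divided by processor count. The short Python sketch below assumes "1:6.72" means a speedup of 6.72 over the serial program; the function name `parallel_efficiency` is illustrative.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Parallel efficiency E = S / p, where S is speedup on p processors."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


if __name__ == "__main__":
    # Speedup of 6.72 on 16 CPUs, as quoted in the abstract.
    eff = parallel_efficiency(6.72, 16)
    print(f"efficiency = {eff:.2f}")  # prints "efficiency = 0.42"
```

An efficiency of about 0.42 means each of the 16 CPUs contributes, on average, a little under half of its ideal share of the work.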
20.
A Genetic Algorithm (GA) is a heuristic for finding exact or approximate solutions to optimization and search problems within an acceptable time. We discuss GAs from an architectural perspective, offering a general analysis of GA performance on multi-core CPUs and many-core GPUs. Based on the widely used Parallel GA (PGA) schemes, we propose the best one for each architecture. More specifically, the Asynchronous Island scheme, the Island/Master–Slave Hierarchy PGA and the Island/Cellular Hierarchy PGA are the best for multi-core, multi-socket multi-core and many-core architectures, respectively. Optimization approaches and rules based on a deep understanding of multi- and many-core architectures are also analyzed and proposed. Finally, the comparison of GA performance on multi-core and many-core architectures is discussed. Three real GA problems are used as benchmarks to evaluate our analysis and findings. There are three additional contributions compared with previous work. First, because our findings come from a deep analysis of the architectures, they apply to all GA problems, and even to other parallel computations, rather than to one particular GA problem. Second, performance in our work concerns not only execution speed but also solution quality, which has not been considered seriously enough. Third, we propose theoretical performance and optimization models of PGAs on multi-core and many-core architectures and obtain a more practical performance comparison of the GA on these architectures, so that the speedup presented in this work is more reasonable and is a better guide to practical decisions.