11.
In today's consumer electronics market, Java has become one of the most important programming languages for the rapid development of mobile applications, ranging from home appliances and controllers through mobile and communication devices to network-centric applets. However, the demand for high-performance, low-power Java-based consumer mobile applications poses new challenges for system design and implementation. This paper analyzes the energy consumption, execution efficiency, and speed of Java applications in a typical consumer mobile device environment. Adopting a hardware-assisted approach, we introduce a Java accelerator with a companion Java virtual machine. The accelerator is designed in an asynchronous style and can be integrated with most existing processors and operating systems. The core architecture, design philosophy, and implementation considerations are presented in detail.
12.
A Practical Performance Analysis Method for Parallel Programs   Total citations: 2, self-citations: 0, citations by others: 2
First, after discussing in detail the main factors in achieving peak floating-point performance on current high-performance microprocessors, this paper analyzes the principal sources of speedup for parallel application codes on parallel computers built from these microprocessors. Second, it presents a set of performance evaluation rules for parallel codes, which can reveal the overall numerical and parallel performance relative to the serial code, suggest performance improvement strategies, and explain precisely the reasons for super-linear speedup. Numerical experimental results for two realistic application codes on two parallel computers are also given.
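The quantities this abstract refers to (speedup relative to the serial code, and super-linear speedup) can be illustrated with a minimal sketch. This is not the paper's rule set, which the abstract does not spell out; the function names and the timings in the usage note are hypothetical and only follow the standard textbook definitions.

```python
def speedup(t_serial, t_parallel):
    """Standard speedup: serial run time divided by parallel run time."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, processors):
    """Parallel efficiency: speedup divided by the number of processors."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return speedup(t_serial, t_parallel) / processors


def is_superlinear(t_serial, t_parallel, processors):
    """True when the measured speedup exceeds the processor count."""
    return speedup(t_serial, t_parallel) > processors
```

For example, with assumed timings of 100 s serial and 10 s on 8 processors, `is_superlinear(100.0, 10.0, 8)` is true, since the speedup of 10 exceeds 8. In practice such cases are often attributed to cache effects, because each processor's share of the data fits better in fast memory.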
13.
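祝永志, 田甜. 《计算机科学》 (Computer Science), 2010, 37(12): 287-291
Scalability is an important performance metric for parallel computing systems. Although heterogeneous systems are increasingly common, their scalability has received little study. This paper gives a definition of efficiency that suits both homogeneous and heterogeneous parallel computing systems, analyzes scalability on the basis of that definition, and derives an isoefficiency model applicable to both kinds of system. From the overhead ratio, it also derives how system size and workload vary when efficiency is held at a given constant. Finally, an experimental analysis shows that the model evaluates efficiency and scalability well and can predict high scalability in parallel computing systems.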
14.
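Traditional MPI automatic parallelizing compilers generate message-passing programs for distributed-memory systems from the perspective of data redistribution, but the extra overhead of extensive data-redistribution communication leads to low speedup. To address this problem, a message-passing code generation algorithm is proposed for the back end of an Open64-based MPI automatic parallelizing compiler. Centered on a unified data distribution, and given a set of parallelized loops and a set of communication arrays, the algorithm modifies the syntax tree of the serial code in its WHIRL representation to generate more precise message-passing code. Experimental results show that the algorithm substantially reduces the communication overhead of message-passing programs and clearly improves their speedup.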
15.
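Design of a Hybrid Parallel Algorithm for Tridiagonalization of Symmetric Matrices   Total citations: 2, self-citations: 0, citations by others: 2
赵永华, 迟学斌, 陈江. 《计算机工程》 (Computer Engineering), 2005, 31(22): 39-41, 53
Based on the Householder transformation, an MPI+OpenMP hybrid parallel algorithm for tridiagonalizing dense symmetric matrices is presented. The discussion focuses on the algorithm's load balance, communication overhead, and performance evaluation in an SMP cluster environment. The OpenMP shared-memory parallelism uses a coarse-grained approach, which resolves the load-balance problem of the MPI algorithm and reduces communication overhead. Experimental results on the DeepComp 6800 (深腾6800) show that the MPI+OpenMP version achieves better performance and scalability than the pure MPI version.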
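As background for the numerical kernel named in this abstract, the following is a minimal serial sketch of Householder tridiagonalization in plain Python. It is not the paper's MPI+OpenMP hybrid algorithm and contains no parallelism; the function name and the unblocked, dense update are assumptions chosen for readability.

```python
import math


def householder_tridiagonalize(a):
    """Return a tridiagonal matrix similar to the symmetric matrix `a`.

    `a` is a list of rows. Each step applies a Householder reflector
    H = I - 2 v v^T from both sides, zeroing column k below the subdiagonal.
    """
    n = len(a)
    t = [[float(value) for value in row] for row in a]
    for k in range(n - 2):
        # Part of column k that lies below the diagonal.
        x = [t[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(c * c for c in x))
        if norm_x == 0.0:
            continue  # column is already in the required form
        # Choose the sign that avoids cancellation when forming v.
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        norm_v = math.sqrt(sum(c * c for c in v))
        v = [c / norm_v for c in v]
        m = len(v)
        # Left multiplication: rows k+1..n-1 become H times those rows.
        for j in range(n):
            s = sum(v[i] * t[k + 1 + i][j] for i in range(m))
            for i in range(m):
                t[k + 1 + i][j] -= 2.0 * v[i] * s
        # Right multiplication: columns k+1..n-1 become those columns times H.
        for r in range(n):
            s = sum(t[r][k + 1 + i] * v[i] for i in range(m))
            for i in range(m):
                t[r][k + 1 + i] -= 2.0 * s * v[i]
    return t
```

Because each step is an orthogonal similarity transformation, the result has the same eigenvalues as the input. The sketch uses dense O(n^3) updates and ignores symmetry for clarity; a production code would exploit symmetry and blocking.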
16.
This paper proposes a parallel algorithm for computing an N (= k^n)-point Lagrange interpolation on k-ary n-cube networks. The algorithm consists of three phases: initialisation, main and final. There is no computation in the initialisation phase. The main phase is composed of N/2 steps, each consisting of four multiplications and four subtractions, and an additional step including one division and one multiplication. Communication in the main phase is based on an all-to-all broadcast algorithm on a Hamiltonian ring embedded in a k-ary n-cube. The final phase is carried out in n x ?k/l? steps, each requiring one addition. A performance evaluation of the proposed algorithm reveals a near-optimal speedup for a typical range of system parameters used in current state-of-the-art implementations. Our study also reveals that when implementation cost is taken into account, low-dimensional k-ary n-cubes achieve better speedup than their higher-dimensional counterparts.
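For reference, the computation being parallelized is ordinary Lagrange interpolation. The sketch below is a plain serial version, not the three-phase k-ary n-cube algorithm from the abstract; the function name is hypothetical, and exact rational arithmetic is an illustrative choice that keeps the result free of rounding error.

```python
from fractions import Fraction


def lagrange_interpolate(xs, ys, x):
    """Evaluate at `x` the polynomial through the points (xs[i], ys[i])."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if len(set(xs)) != len(xs):
        raise ValueError("interpolation nodes must be distinct")
    total = Fraction(0)
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Build y_i * L_i(x), where L_i is the i-th Lagrange basis polynomial.
        term = Fraction(yi)
        for j, xj in enumerate(xs):
            if j != i:
                term *= Fraction(x - xj) / Fraction(xi - xj)
        total += term
    return total
```

Each of the N terms is independent of the others until the final summation, which is the structure a parallel version can exploit: the terms can be computed on different nodes and then added together.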
17.
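Naive Parallel LDA
Parallel latent Dirichlet allocation (LDA) topic models spend considerable time on both computation and communication, so model training takes too long and the approach cannot be widely applied. A naive parallel LDA algorithm is proposed, with separate improvements for computation and communication. On the one hand, a word influence factor and a threshold are introduced to reduce the granularity of text training; on the other, the communication frequency is lowered to cut communication time. Experimental results show that, with accuracy loss held to 1%, the optimized parallel LDA raises training speed by 36% and effectively improves parallel speedup.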
18.
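海沫, 张游. 《计算机科学》 (Computer Science), 2017, 44(Z6): 414-418
Through experiments, the performance of three typical clustering algorithms on the Spark platform (K-means, bisecting K-means, and Gaussian mixture clustering) is compared in terms of running time, speedup, scalability, and sizeup. The experimental results show that: 1) as the number of nodes increases, the running time of all three algorithms on datasets of more than a hundred megabytes drops markedly; 2) when the dataset is larger than 500 MB, the speedup of all three algorithms improves markedly and grows approximately linearly with the number of nodes; 3) the scalability of all three algorithms decreases as the number of nodes increases, and when the dataset is larger than 500 MB, bisecting K-means scales worst relative to K-means and Gaussian mixture clustering; 4) when the dataset is larger than 100 MB, the sizeup of Gaussian mixture clustering is far higher than that of K-means and bisecting K-means.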
19.
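§1. Introduction. In recent years, mesoscale numerical models have become familiar to the meteorological community in China. With improvements made during application, increasingly complex model physics, and higher resolution, their computational cost has grown markedly, which makes parallel computing especially important. Anhui Province has recently acquired a domestically built Dawning-1000 (曙光-1000) parallel computer, which provides good conditions for this work. To parallelize the model, a thorough analysis and restructuring of the MM4 serial program is essential. Through experiments, we achieved breakthrough progress in both data structure and program structure. The results show that the parallel computation agrees exactly with the execution of the serial program, and the speedup rose from an initial 1:1.5 to 1:6.72 (16 CPUs), a preliminarily satisfactory level. This work was carried out in 中国…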
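The reported figure of 6.72 on 16 CPUs can be put in context with two standard derived quantities: parallel efficiency and the Karp-Flatt experimentally determined serial fraction. This is a minimal sketch; the excerpt itself reports only the speedup, so the function names and the choice of these metrics are assumptions, not part of the original analysis.

```python
def parallel_efficiency(speedup, processors):
    """Efficiency: speedup divided by the processor count."""
    if processors < 1:
        raise ValueError("processors must be at least 1")
    return speedup / processors


def karp_flatt_serial_fraction(speedup, processors):
    """Karp-Flatt metric: e = (1/S - 1/p) / (1 - 1/p), defined for p > 1."""
    if processors <= 1:
        raise ValueError("the metric needs more than one processor")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (1.0 / speedup - 1.0 / processors) / (1.0 - 1.0 / processors)


# Figures taken from the excerpt: a speedup of 6.72 on 16 CPUs.
eff = parallel_efficiency(6.72, 16)
serial_part = karp_flatt_serial_fraction(6.72, 16)
```

With these inputs the efficiency is 0.42 and the estimated serial fraction is roughly 0.09. The earlier speedup of 1.5 is not analyzed here because the excerpt does not state the CPU count for that run.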
20.
A Genetic Algorithm (GA) is a heuristic for finding exact or approximate solutions to optimization and search problems within an acceptable time. We discuss GAs from an architectural perspective, offering a general analysis of GA performance on multi-core CPUs and many-core GPUs. Based on the widely used Parallel GA (PGA) schemes, we propose the best one for each architecture. More specifically, the Asynchronous Island scheme, the Island/Master–Slave Hierarchy PGA, and the Island/Cellular Hierarchy PGA are the best for multi-core, multi-socket multi-core, and many-core architectures, respectively. Optimization approaches and rules based on a deep understanding of multi- and many-core architectures are also analyzed and proposed. Finally, GA performance on multi-core and many-core architectures is compared. Three real GA problems are used as benchmarks to evaluate our analysis and findings. There are three additional contributions compared with previous work. First, our findings, based on a deep analysis of the architectures, can be applied to all GA problems, and even to other parallel computing workloads, rather than to one particular GA problem. Second, our performance study considers not only execution speed but also solution quality, which has not been considered seriously enough. Third, we propose theoretical performance and optimization models of PGAs on multi-core and many-core architectures and obtain a more practical performance comparison of the GA on these architectures, so that the speedup presented in this work is more reasonable and is a better guide to practical decisions.