Similar literature
20 similar articles found (search time: 109 ms)
1.
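Object-Oriented Design of the IR in an Automatic Program Parallelization System   Total citations: 3, self-citations: 0, citations by others: 3
Starting from the requirements of building a high-performance automatic program parallelization system, this paper introduces the design principles and design methods of the IR (Intermediate Representation) in the automatic parallelization system AGASSIZ, and explains how this IR design simplifies the design of the parallelization system as a whole.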

2.
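This paper gives a system overview of FAX (Fortran Automated Xlator), an automatic program parallelization tool for massively parallel processing systems, with emphasis on the advanced techniques FAX adopts. Test results show that FAX has reached a certain level of usability and effectiveness; as an automatic parallelization tool for distributed-memory parallel machines, it has largely met its design goals.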

3.
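Targeting computational fluid dynamics (CFD), this paper proposes a parallel computing model with automatic data migration (the ADM model). A parallel CFD program that conforms to the model can automatically migrate data to compute nodes with greater computing power and lighter load, based on each node's computing power and current load, so that the program achieves good parallel efficiency on networked computing platforms. The paper also discusses how an automatic parallelization system can support the ADM model, and finally presents performance test results.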

4.
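FAX, an Automatic Program Parallelization Tool   Total citations: 1, self-citations: 0, citations by others: 1
This paper gives a system overview of FAX (Fortran Automated Xlator), an automatic program parallelization tool for massively parallel processing systems, with emphasis on the advanced techniques FAX adopts. Test results show that FAX has reached a certain level of usability and effectiveness; as an automatic parallelization tool for distributed-memory parallel machines, it has largely met its design goals.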

5.
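邝宏斌, 罗贵明. 《计算机工程》 (Computer Engineering), 2008, 34(19): 23-25, 2
Parallelization is an important means of improving the efficiency of model checking. This paper studies model checking of C programs based on labeled transition systems and proposes a method for parallelizing software model checking. The method exploits the modularity of the software model checking tool MAGIC (modular verification) to decompose a C program into components, distributes the components evenly across several compute nodes, and has each node invoke MAGIC to perform the verification. Because only a small amount of communication and synchronization between nodes is needed, the method achieves a good parallel speedup and scales well. Experimental results show that the method substantially reduces checking time, which benefits the formal verification of large-scale software.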

6.
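Fully automatic parallelization methods are currently quite limited in both parallelization capability and range of application, and interactive parallelization can make up for the shortcomings of fully automatic systems. On this basis, an interactive parallelization method and its system, ZIPS, are proposed. The paper describes the system's parallelization mechanism: a computation-type-driven parallelization algorithm serves as the theoretical basis, and for two different types of computation the system uses its interactive facilities to obtain the relevant program information and combines it with automatic parallelization techniques to perform source-to-source transformation. Experiments show that the interactive parallelization method achieves good performance.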

7.
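The book 《并行分布式程序设计》 (Parallel and Distributed Programming), written by Professor Liu Jian (刘键) of the Department of Computer Science at Huazhong University of Science and Technology (华中理工大学), is scheduled for publication by the Huazhong University of Science and Technology Press in early 1995. The book runs to about 280,000 characters in seven chapters; its content is rich and novel, giving equal weight to theory and practice. It systematically studies and analyzes the large body of literature on parallel and distributed system software published in China and abroad over the past decade, with particular emphasis on the author's own experience leading the development of the Fortran parallelizing compilation systems HZPARA and HZPARA-Ⅰ. On this basis, it sets out the mathematical models, basic theory, and basic methods of parallel and distributed programming (especially parallelizing compilation), as well as the basic structure of the discipline. The book is mainly intended for parallel and distributed programs for high-performance, large-scale computing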

8.
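This paper analyzes the shortcomings of existing parallel association rule mining algorithms and proposes an improved multi-core parallel optimization algorithm for association rule mining. The algorithm modifies the compressed matrix of the Apriori algorithm and, on a multi-core platform, uses OpenMP and TBB to parallelize the serial program's loops and task assignment, so that association rule mining is parallelized as far as possible.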

9.
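A parallelizing compiler improves a program's run-time performance by exploiting the parallelism in serial programs. However, when the ratio of parallelizable workload to the number of parallel threads is small, parallel execution may actually degrade the program's overall performance. Based on the SUIF framework, this work studies an accurate workload computation method and implements a workload-based conditional parallelization technique, which effectively improves the execution performance of parallel programs.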
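
The workload-based decision described in this abstract can be illustrated with a small runtime sketch. This is not the authors' SUIF implementation: the threshold `MIN_WORK_PER_THREAD`, the cost estimate, and the helper names are all hypothetical, and a thread pool stands in for compiler-generated parallel code. The idea shown is only that a loop runs in parallel when the estimated work per thread is large enough to outweigh the overhead.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical threshold: minimum estimated work units per thread
# below which parallel overhead is assumed to outweigh the benefit.
MIN_WORK_PER_THREAD = 10_000


def should_parallelize(n_iterations, cost_per_iteration, n_threads,
                       threshold=MIN_WORK_PER_THREAD):
    """Return True when the estimated workload per thread meets the threshold."""
    work = n_iterations * cost_per_iteration
    return n_threads > 1 and work / n_threads >= threshold


def conditional_map(func, items, cost_per_iteration=1, n_threads=4):
    """Apply func to every item, in parallel only if the workload justifies it."""
    items = list(items)
    if should_parallelize(len(items), cost_per_iteration, n_threads):
        with ThreadPoolExecutor(max_workers=n_threads) as pool:
            return list(pool.map(func, items))  # pool.map preserves input order
    return [func(x) for x in items]  # small workload: stay serial
```

In a compiler setting the same test would be emitted as a guard around the parallel loop, with the workload estimate derived from static analysis rather than passed in by the caller.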

10.
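This paper describes the key techniques for parallelizing UTCHEM, a numerical simulator for chemical combination flooding, on the distributed-memory multicomputer parallel system SMP-CLUSTER. The parallel model adopts the single program, multiple data (SPMD) programming model and uses domain decomposition to split the whole solution domain into subdomains, so that multiple compute nodes solve a single simulation problem simultaneously. The compute nodes exchange the shared data in the overlap regions by message passing in order to coordinate the computation between nodes. So far, only the solution of the pressure equations has been parallelized. Test results show good parallel efficiency.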

11.
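Parallelization of a Three-Dimensional Laser Ablation Fluid Interface Instability Code   Total citations: 1, self-citations: 0, citations by others: 1
On shared-memory parallel machines and MPP parallel machines, and based on the MPI (Message Passing Interface) parallel programming environment, this paper studies the parallel implementation of the three-dimensional laser ablation interface instability code (Lared-S). The numerical simulation of three-dimensional laser ablation uses a splitting method, and more than 90% of its computational load lies in solving the fluid equations and the heat conduction equation (the fluid equations are solved with a split explicit scheme, the heat conduction equation with a split implicit scheme). The paper presents an alternating-plane data communication pattern based on the three-dimensional splitting scheme. Solving the split implicit scheme reduces to solving tridiagonal systems, which are parallelized with a block pipeline algorithm. Numerical experiments show that the alternating-plane communication strategy and the block pipeline algorithm are effective and scalable. On the shared-memory machine, 64 processors achieve a parallel efficiency above 93%; on the MPP machine, 128 processors achieve a parallel efficiency above 90%.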

12.
Gordon Lyon  Raghu Kacker  Arnaud Linz 《Software》1995,25(12):1299-1314
Code scalability, crucial on any parallel system, determines how well parallel code avoids becoming a bottleneck as its host computer is made larger. The scalability of computer code can be estimated by statistically designed experiments that empirically approximate a multivariate Taylor expansion of the code's execution response function. Each suspected code bottleneck corresponds to a first-order term in the expansion, the coefficient for that term indicating how sensitive execution is to changes in the suspect location. However, it is the expansion coefficients for second-order interactions between code segments and the number of processors that are fundamental to discovering which program elements impede parallel speedup. A new, unified view of these second-order coefficients yields an informal relative scalability test of high utility in code development. Discussion proceeds through actual examples, including a straightforward illustration of the test applied to SLALOM, a complex, multiphase benchmark. A quick graphical shortcut makes the scalability test readily accessible.

13.
This work illustrates results obtained by implementing in a parallel computer environment the gl rate formulation of the theory of plasticity and its integration scheme as illustrated in the preceding Part I. The Preconditioned Conjugate Gradient is used as a tool for repeatedly solving linear systems of equations. Although the investigation about the performance of the workstation cluster as a parallel virtual computer is still far from being completed, it is already possible to conclude that, in such an environment, several techniques proposed for reducing the computer time required by the iterative solver are not applicable. One purpose of the work is therefore to give guidelines in terms of expected performances of the Conjugate Gradient method when applied to stiff problems, in which the condition number may shoot up to the billions. Even if the implemented computer code is not based on the most convenient rate formulation in terms of parallelizability, as shown in Part I of this work, the obtained results indicate anyway that, for some categories of structural problems, the development of a parallel, element-by-element computer code is a promising line of work.

14.
The Fortran extended full-potential (FPX) helicopter rotor computational fluid dynamics (CFD) code, in its reduced two-dimensional version, is successfully converted into a parallel version for multiprocessing. The FPX code, which has an internal grid generator, solves the compressible full-potential equation using an approximately factored finite-difference scheme with numerous added physical modeling enhancements, including viscous boundary layers, shock-induced entropy corrections and wake-vortex embedding. The parallel version of the code uses open multi-processing (OpenMP) directives as the parallel programming tool in a shared-memory (SM) environment. The OpenMP code is portable and scalable, and can run on various computer platforms, including UNIX and Windows NT platforms. A performance study of the parallel code is carried out on an SGI Origin 2000 UNIX platform. The results show that reasonable speedups are obtained through parallelization and that OpenMP is an easy-to-use and efficient parallel programming tool for the present problem.

15.
We describe the application of pD, a small para-functional language that we developed as a high-level programming interface for the parallel computer algebra package PACLIB. pD provides several facilities to express parallel algorithms in a flexible way on different levels of abstraction. The compiler translates a pD module into statically typed parallel C code with explicit task creation and synchronization constructs. This target code can be linked with the PACLIB kernel, the multi-processor runtime system of the computer algebra library SACLIB. The parallelization of several computer algebra algorithms on a shared memory multi-processor demonstrates the elegance and efficiency of this approach.

16.
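Matlab-Based Parallel Computing Analysis of Dimensionality Reduction for Hyperspectral Remote Sensing Data   Total citations: 1, self-citations: 1, citations by others: 0
刘春, 陈燕, 辛亮. 《遥感信息》 (Remote Sensing Information), 2010, (3): 13-17
For computation on massive remote sensing data, serial processing places high demands on computer performance and takes a long time. This paper therefore proposes using parallel computing, which not only lowers the demands on computer performance but also greatly increases running and computation speed. It first introduces the parallel computing mechanism based on MPI (Message Passing Interface), presents its parallel mode using Matlab as an example, and describes in detail the procedure for converting existing serial code into parallel code. Taking massive hyperspectral image data as an example, the serial computation of intrinsic dimension estimation is converted to a parallel one, and its running efficiency is analyzed and tested experimentally. The results show that, compared with serial computation, parallel computation greatly shortens the time needed to compute the intrinsic dimension.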

17.
In this study, parallel computing technology is applied to the simulation of a wind turbine flow problem. A third-order Roe-type flux-limited splitting based on a preconditioning matrix, with an explicit time-marching method, is used to solve the Navier–Stokes equations. The original FORTRAN code was parallelized with the Message Passing Interface (MPI) and tested on a 64-CPU IBM SP2 parallel computer. The test results show a significant reduction in the computing time needed to run the model, with a super-linear speedup achieved up to 32 CPUs on the IBM SP2. The speedup is as high as 49 on 64 IBM SP2 processors. The test shows the very promising potential of parallel processing to provide prompt simulation of current wind turbine problems.

18.
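An Effective Parallelism Analysis Algorithm   Total citations: 3, self-citations: 0, citations by others: 3
Parallelism analysis plays a very important role in a parallelizing compilation system, and its quality directly determines the success or failure of the compilation system. With the development of cluster systems and their parallel development environments, most parallel systems can support the execution of multiply nested parallel loops. For programming systems that support only a single level of parallel loop, choosing the loop with the highest parallel efficiency is also important. To this end, this paper proposes an effective loop parallelism analysis scheme, which not only reports the parallelism of multi-level loops but also handles the vast majority of parallelism problems that arise in practical applications. The paper modifies the traditional parallelism analysis algorithm and presents an effective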

19.
At present, the meshless element free Galerkin (EFG) method is being successfully applied in areas such as solid mechanics, fracture mechanics and thermal problems. Being a meshless method, it has many advantages over the finite element method. One major hurdle to the wide adoption of this method is its computational cost. Therefore, in this paper, a parallel algorithm is proposed for the EFG method. The parallel code has been written in FORTRAN using the MPI message-passing library and executed on a four-node (eight-processor) MIMD-type, distributed-memory ‘PARAM 10000’ parallel computer. The total time, communication time, speedup and efficiency have been estimated for a three-dimensional heat transfer problem to validate the proposed algorithm. For eight processors, the speedup and efficiency are obtained to be 4.66 and 58.22%, respectively, for a data size of 1320 nodes.
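
The speedup and efficiency figures quoted in this abstract are related by the usual definitions, speedup = serial time / parallel time and efficiency = speedup / number of processors. The abstract does not state which definitions the authors used, so the sketch below assumes the standard ones; the function names are illustrative only.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup as the ratio of serial to parallel run time."""
    if t_parallel <= 0:
        raise ValueError("parallel time must be positive")
    return t_serial / t_parallel


def parallel_efficiency(s: float, processors: int) -> float:
    """Efficiency as speedup divided by the number of processors."""
    if processors <= 0:
        raise ValueError("processor count must be positive")
    return s / processors


# Rounded speedup from the abstract, on eight processors.
eff = parallel_efficiency(4.66, 8)
print(f"{eff:.2%}")  # about 58.25%
```

Under that assumption, the rounded speedup of 4.66 gives roughly 58.25%, which agrees with the reported 58.22% to within what rounding of the speedup would explain.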

20.
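赵博, 赵荣彩, 徐金龙, 高伟. 《计算机科学》 (Computer Science), 2015, 42(1): 50-53, 58
To fully exploit the computing power of high-performance computers, ease the burden on programmers of designing and writing parallel programs, and enlarge the set of usable software, a method is designed and implemented that uses an interactive interface to mine the vectorizable statements in a program in depth, optimizes the vectorized statements in the generated code, and improves the execution efficiency of that code. The method is significant for fully exploiting the computing power of high-performance computers, improving system usability, and broadening the range of applications, and it also provides effective auxiliary means and tool support. The progressive intelligent-backtracking vectorization code tuning framework analyzes and transforms the serial program submitted by the user: using serial program analysis, data dependence analysis, vectorization analysis, and related techniques, it transforms and optimizes the program according to the analysis results and automatically generates the final vectorized code. By analyzing the potential parallelism in a serial program and automatically transforming it into an equivalent vectorized form, the method greatly simplifies the programmer's work.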
