Found 20 similar documents; search took 109 ms
1.
2.
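This paper gives a system overview of FAX (Fortran Automated Xlator), an automatic program parallelization tool for massively parallel processing systems, and focuses on the advanced techniques FAX employs. Test results show that FAX has reached a certain level of usability and effectiveness; as an automatic parallelization tool for distributed-memory parallel machines, it has essentially met its design goals.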
3.
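This paper proposes a parallel computing model with automatic data migration (the ADM model) for computational fluid dynamics (CFD). A parallel CFD program that conforms to the model can automatically migrate data to compute nodes with greater computing power and lighter load, according to each node's computing capacity and current load, so that the program achieves good parallel efficiency on networked computing platforms. The paper also discusses how an automatic parallelization system can support the ADM model, and closes with performance test results.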
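As a rough illustration of the load-aware data distribution this abstract describes, the sketch below splits a number of grid cells among nodes in proportion to an assumed effective capacity, speed / (1 + load). The weighting formula, the function name `partition_cells`, and the sample numbers are hypothetical; the abstract does not say how the ADM model measures capacity or load.

```python
def partition_cells(total_cells, speeds, loads):
    """Split total_cells among nodes in proportion to speed / (1 + load).

    The weight formula is an assumption made for illustration only.
    """
    weights = [s / (1.0 + l) for s, l in zip(speeds, loads)]
    total_weight = sum(weights)
    # Initial shares, rounded down.
    shares = [int(total_cells * w / total_weight) for w in weights]
    # Hand the cells lost to rounding to the most capable nodes first.
    leftover = total_cells - sum(shares)
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


# Hypothetical cluster: one fast idle node, one slow idle node, one slow busy node.
shares = partition_cells(700, speeds=[2.0, 1.0, 1.0], loads=[0.0, 0.0, 1.0])
```

Re-running such a partition whenever the measured loads change, and moving only the difference between the old and new shares, would be one simple way to approximate migration.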
4.
FAX, an Automatic Program Parallelization Tool  Total citations: 1, self-citations: 0, citations by others: 1
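This paper gives a system overview of FAX (Fortran Automated Xlator), an automatic program parallelization tool for massively parallel processing systems, and focuses on the advanced techniques FAX employs. Test results show that FAX has reached a certain level of usability and effectiveness; as an automatic parallelization tool for distributed-memory parallel machines, it has essentially met its design goals.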
5.
6.
7.
8.
9.
10.
11.
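Parallelization of a Three-Dimensional Laser Ablation Fluid Interface Instability Code  Total citations: 1, self-citations: 0, citations by others: 1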
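On shared-memory parallel machines and MPP parallel machines, and based on the MPI (Message Passing Interface) parallel programming environment, this paper studies the parallel implementation of the three-dimensional laser ablation interface instability code (Lared-S). The numerical simulation of three-dimensional laser ablation uses a splitting method, and more than 90% of the computational load lies in solving the fluid equations and the heat conduction equation (the fluid equations are solved with a split explicit scheme, the heat conduction equation with a split implicit scheme). The paper presents an alternating-plane data communication pattern based on the three-dimensional splitting scheme. Solving the split implicit scheme reduces to solving tridiagonal systems of equations, which are parallelized with a block pipeline parallel algorithm. Numerical experiments show that the alternating-plane communication strategy and the block pipeline parallel algorithm are effective and scalable: on a shared-memory parallel machine, 64 processors achieve a parallel efficiency above 93%; on an MPP parallel machine, 128 processors achieve a parallel efficiency above 90%.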
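Tridiagonal systems of the kind this abstract mentions are commonly solved with the Thomas algorithm. The sketch below is a serial version only, given for orientation; it is not the paper's block pipeline parallel algorithm, and the function name and argument layout are assumptions.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm.

    lower[i] multiplies x[i-1] in row i (lower[0] is unused),
    upper[i] multiplies x[i+1] in row i (upper[-1] is unused).
    No pivoting is done, so the matrix should be diagonally dominant.
    """
    n = len(diag)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    # Forward elimination.
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom if i < n - 1 else 0.0
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


# Small check: the system below has the exact solution x = [1, 2, 3].
x = solve_tridiagonal([0.0, -1.0, -1.0], [2.0, 2.0, 2.0], [-1.0, -1.0, 0.0],
                      [0.0, 0.0, 4.0])
```

Both sweeps carry a dependency from one row to the next, which is why a parallel version needs a scheme such as pipelining across blocks rather than a plain loop split.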
12.
Code scalability, crucial on any parallel system, determines how well parallel code avoids becoming a bottleneck as its host computer is made larger. The scalability of computer code can be estimated by statistically designed experiments that empirically approximate a multivariate Taylor expansion of the code's execution response function. Each suspected code bottleneck corresponds to a first-order term in the expansion, the coefficient for that term indicating how sensitive execution is to changes in the suspect location. However, it is the expansion coefficients for second-order interactions between code segments and the number of processors that are fundamental to discovering which program elements impede parallel speedup. A new, unified view of these second-order coefficients yields an informal relative scalability test of high utility in code development. Discussion proceeds through actual examples, including a straightforward illustration of the test applied to SLALOM, a complex, multiphase benchmark. A quick graphical shortcut makes the scalability test readily accessible.
13.
Anna Feriani, Francesco Genna 《Computer Methods in Applied Mechanics and Engineering》1996,130(3-4):299-318
This work illustrates results obtained by implementing in a parallel computer environment the gl rate formulation of the theory of plasticity and its integration scheme as illustrated in the preceding Part I. The Preconditioned Conjugate Gradient is used as a tool for repeatedly solving linear systems of equations. Although the investigation about the performance of the workstation cluster as a parallel virtual computer is still far from being completed, it is already possible to conclude that, in such an environment, several techniques proposed for reducing the computer time required by the iterative solver are not applicable. One purpose of the work is therefore to give guidelines in terms of expected performances of the Conjugate Gradient method when applied to stiff problems, in which the condition number may shoot up to the billions. Even if the implemented computer code is not based on the most convenient rate formulation in terms of parallelizability, as shown in Part I of this work, the obtained results indicate anyway that, for some categories of structural problems, the development of a parallel, element-by-element computer code is a promising line of work.
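For readers unfamiliar with the solver named in this abstract, here is a minimal serial sketch of the Preconditioned Conjugate Gradient method. The abstract does not say which preconditioner the authors used; the Jacobi (diagonal) preconditioner here is an assumption chosen for brevity, and the dense list-of-rows matrix stands in for whatever element-by-element storage the paper uses.

```python
def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Preconditioned Conjugate Gradient with a Jacobi (diagonal) preconditioner.

    A is a symmetric positive-definite matrix given as a list of rows.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    r = list(b)                                # residual b - A x for x = 0
    z = [r[i] / A[i][i] for i in range(n)]     # apply M^-1 = diag(A)^-1
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:             # stop once the residual is small
            break
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [r[i] / A[i][i] for i in range(n)]
        rz_new = dot(r, z)
        beta = rz_new / rz
        p = [z[i] + beta * p[i] for i in range(n)]
        rz = rz_new
    return x


# Small symmetric positive-definite example; exact solution is [1/11, 7/11].
solution = pcg_jacobi([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
```

The matrix-vector product and the dot products dominate each iteration, so they are the natural candidates for distribution across processors.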
14.
《Advances in Engineering Software》2001,32(8):665-671
The Fortran extended full-potential (FPX) helicopter rotor computational fluid dynamics (CFD) code, in its reduced two-dimensional version, is successfully converted into a parallel version for multiprocessing. The FPX code, which has an internal grid generator, solves the compressible full-potential equation using an approximately factored finite-difference scheme with numerous added physical modeling enhancements, including viscous boundary layers, shock-induced entropy corrections and wake-vortex embedding. The parallel version of the code uses Open Multi-Processing (OpenMP) directives as the parallel programming tool in a shared-memory (SM) environment. The OpenMP code is portable and scalable, and can run on various computer platforms, including UNIX and Windows NT platforms. A performance study of the parallel code on the SGI Origin 2000 UNIX platform is carried out. The results show that reasonable speedups are obtained through parallelization and that OpenMP is an easy-to-use and efficient parallel programming tool for the present problem.
15.
WOLFGANG SCHREINER 《Journal of Symbolic Computation》1996,21(4-6)
We describe the application of pD, a small para-functional language that we developed as a high-level programming interface for the parallel computer algebra package PACLIB. pD provides several facilities to express parallel algorithms in a flexible way on different levels of abstraction. The compiler translates a pD module into statically typed parallel C code with explicit task creation and synchronization constructs. This target code can be linked with the PACLIB kernel, the multi-processor runtime system of the computer algebra library SACLIB. The parallelization of several computer algebra algorithms on a shared memory multi-processor demonstrates the elegance and efficiency of this approach.
16.
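For computations on massive remote sensing data, serial processing places high demands on computer performance and is time-consuming. This paper therefore proposes using parallel computing, which both lowers the hardware requirements and greatly increases processing speed. It first introduces the parallel computing mechanism based on MPI (Message Passing Interface), presents its parallel mode using Matlab as an example, and describes in detail the workflow for converting existing serial code into parallel code. Taking massive hyperspectral image data as an example, the serial computation of intrinsic dimension estimation is converted to a parallel one, and its running efficiency is analyzed and tested experimentally. The results show that parallel computation greatly shortens the time needed to compute the intrinsic dimension compared with serial computation.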
17.
In this study, parallel computing technology is applied to the simulation of a wind turbine flow problem. A third-order Roe-type flux-limited splitting based on a preconditioning matrix, with an explicit time-marching method, is used to solve the Navier–Stokes equations. The original FORTRAN code was parallelized with the Message Passing Interface (MPI) and tested on a 64-CPU IBM SP2 parallel computer. The test results show a significant reduction in the computing time needed to run the model, and a super-linear speedup is achieved for up to 32 CPUs on the IBM SP2. The speedup is as high as 49 when using 64 IBM SP2 processors. The tests show the very promising potential of parallel processing to provide prompt simulation of current wind turbine problems.
18.
An Effective Parallelism Analysis Algorithm  Total citations: 3, self-citations: 0, citations by others: 3
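Parallelism analysis plays a very important role in a parallelizing compiler system, and its quality directly determines whether the compiler succeeds. With the development of cluster systems and their parallel development environments, most parallel systems can run multiply nested parallel loops. For programming systems that support only a single level of parallel loop, however, choosing the loop with the highest parallel efficiency is also very important. This paper therefore proposes an effective loop parallelism analysis scheme that not only reports the parallelism of multi-level loops but also handles the great majority of parallelism problems arising in practical applications. The paper modifies the traditional parallelism analysis algorithm and gives an effective …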
19.
《Advances in Engineering Software》2005,36(8):554-560
At present, the meshless element-free Galerkin (EFG) method is being successfully applied in areas such as solid mechanics, fracture mechanics and thermal analysis. Being a meshless method, it has many advantages over the finite element method. One big hurdle to the wide adoption of this method is its computational cost. Therefore, in this paper, a parallel algorithm is proposed for the EFG method. The parallel code has been written in FORTRAN using the MPI message passing library and executed on a four-node (eight-processor) MIMD-type, distributed-memory ‘PARAM 10000’ parallel computer. The total time, communication time, speedup and efficiency have been estimated for a three-dimensional heat transfer problem to validate the proposed algorithm. For eight processors, the speedup and efficiency obtained are 4.66 and 58.22%, respectively, for a data size of 1320 nodes.
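The speedup and efficiency figures quoted in this abstract follow the usual definitions: speedup is serial time divided by parallel time, and efficiency is speedup divided by the number of processors. The short sketch below applies them to the reported numbers; the function names are illustrative. The quoted speedup of 4.66 on eight processors gives roughly 58.25%, close to the reported 58.22%; the small gap presumably comes from rounding of the published speedup.

```python
def speedup(serial_time, parallel_time):
    """Speedup: how many times faster the parallel run is than the serial run."""
    return serial_time / parallel_time


def parallel_efficiency(speedup_value, processors):
    """Parallel efficiency: speedup divided by the number of processors."""
    return speedup_value / processors


# Figures taken from the abstract: speedup 4.66 on 8 processors.
efficiency = parallel_efficiency(4.66, 8)
print(f"{efficiency:.2%}")
```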
20.
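To fully exploit the computing power of high-performance computers, relieve programmers of the burden of designing and writing parallel programs, and enlarge the set of usable software, a method was designed and implemented that uses an interactive interface to dig deeper for vectorizable statements in a program, optimizes the vectorized statements in the generated code, and improves the execution efficiency of the generated code. The method is significant for fully exploiting the computing power of high-performance computers, improving system usability, and broadening the range of applications, and it also provides effective auxiliary means and tool support. The progressive intelligent-backtracking vectorization tuning architecture analyzes and transforms the serial program submitted by the user: using serial program analysis, data dependence analysis, vectorization analysis, and related techniques, it transforms and optimizes the program according to the analysis results and automatically generates the final vectorized code. By analyzing the latent parallelism in a serial program and automatically transforming it into an equivalent vectorized form, the method greatly simplifies the programmer's work.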