首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 78 毫秒
1.
The implementations of the domain decomposition, SOR, multigrid and conjugate gradient method on CM-5 and Cray C-90 are described for the Laplace's equation on the unit square and L-shaped region. Domain decomposition method uses the Schwarz alternating method. In each domain we take the one-dimensional FFT to convert the problem into the tridiagonal systems which are solved by the scientific libraries installed in the CM-5 and the Cray C-90. On the CM-5 the V-cycle multigrid with symmetric smoothings on P-1 finite element spaces is run with red/black Gauss-Seidel relaxation. Multigrid with natural order Gauss-Seidel relaxation is used on the Cray C-90. While natural order SOR is used in the Cray C-90, R/B SOR is performed on the CM-5. Multigrid is the fastest method on the CM-5 and three methods except SOR give similar performances on Cray C-90.This research was partially supported by the National Science Foundation under Grant No. CDA-9024618.  相似文献   

2.
通过推广修正埃尔米特和反埃尔米特(MHSS)迭代法,我们进一步得到了求解大型稀疏非埃尔米特正定线性方程组的广义MHSS(GMHSS)迭代法.基于不动点方程,我们还将超松弛(SOR)技术运用到了GMHSS迭代法,得到了关于GMHSS迭代法的SOR加速,并分析了它的收敛性.数值算例表明,SOR技术能够大大提高加速GMHSS迭代法的收敛效率.  相似文献   

3.
We present a polynomial preconditioner that can be used with the conjugate gradient method to solve symmetric and positive definite systems of linear equations. Each step of the preconditioning is achieved by simultaneously taking an iteration of the SOR method and an iteration of the reverse SOR method (equations taken in reverse order) and averaging the results. This yields a symmetric preconditioner that can be implemented on parallel computers by performing the forward and reverse SOR iterations simultaneously. We give necessary and sufficient conditions for additive preconditioners to be positive definite.

We find an optimal parameter, ω, for the SOR-Additive linear stationary iterative method applied to 2-cyclic matrices. We show this method is asymptotically twice as fast as SSOR when the optimal ω is used.

We compare our preconditioner to the SSOR polynomial preconditioner for a model problem. With the optimal ω, our preconditioner was found to be as effective as the SSOR polynomial preconditioner in reducing the number of conjugate gradient iterations. Parallel implementations of both methods are discussed for vector and multiple processors. Results show that if the same number of processors are used for both preconditioners, the SSOR preconditioner is more effective. If twice as many processors are used for the SOR-Additive preconditioner, it becomes more efficient than the SSOR preconditioner when the number of equations assigned to a processor is small. These results are confirmed by the Blue Chip emulator at the University of Washington.  相似文献   


4.
介绍一种基于随机行走方法与松弛迭代(SOR)算法相结合的快速电源网络求解方法,它先将P/G网分为若干块,然后用简化的随机行走方法求取电路块边界结点的电压,最后采用松弛迭代算法求出电路块内部结点的电压.同时还给出了一种电路块从对角顶点向中央求解的策略,并将此方法推广到采用RLC瞬态网络的求解.大量的实验数据表明,受限于P/G网供电PAD的数目较少这一现实,随机行走方法的效率比较低,在此情形下,该方法比随机行走方法快20倍.  相似文献   

5.
The building-cube method (BCM) is a new generation algorithm for CFD simulations. The basic idea of BCM is to simplify the algorithm in all stages of flow computation to achieve large-scale simulations. Calculation of a pressure field using the Successive Over Relaxation (SOR) method consumes most of the total execution time required for BCM. In this paper, effective implementations on modern vector and scalar processors are investigated. NEC SX-9 and Intel Nehalem-EX are the latest vector and scalar processors. Those processors have much higher peak performances than their previous-generation processors. However, their memory bandwidth improvement cannot catch up with the performance improvement of processors. This is the so-called memory wall problem. In our paper, we discuss optimization techniques for implementation of the SOR method based on architectural characteristics of these modern processors, and evaluate their effects on the sustained performances of these processors for BCM.  相似文献   

6.
The paper describes the implementation of the Successive Overrelaxation (SOR) method on an asynchronous multiprocessor computer for solving large, linear systems. The parallel algorithm is derived by dividing the serial SOR method into noninterfering tasks which are then combined with an optimal schedule of a feasible number of processors. The important features of the algorithm are: (i) achieves a speedup Sp O(N/3) and an efficiency Ep 2/3 using P = [N/2] processors, where N is the number of the equations, (ii) contains a high level of inherent parallelism, whereas on the other hand, the convergence theory of the parallel SOR method is the same as its sequential counterpart and (iii) may be modified to use block methods in order to minimise the overhead due to communication and synchronisation of the processors.  相似文献   

7.
Given a linear system Ax = b and an iterative method x (m + 1) = Gx (m)+k, m = 0,1,2,…(1) to solve it, we determine analytically the optimum extrapolation factor of the extrapolated method of (1), when all the eigenvalues of G have the same modulus (Section 2). Then using the SOR theory in the case of consistently ordered matrices A and applying the results of Section 2 to the extrapolated SOR (ESOR) method, we show (Section 3), that the globally optimum parameters of it (and also of the AOR method) [2, 11, 14] are recovered.  相似文献   

8.
逐次松弛迭代算法(SOR)是求解线性方程组的一种常用迭代算法,当系数矩阵正定时,它具有较快的收敛速度。但是,由于每个迭代步内存在数据相关,它难以实现并行计算。目前的SOR并行算法采用数据分解的方法,但由于该法并行区域过小,同步通讯代价大,并行效率低。本文提出了SOR的一种新型并行算法,该算法与传统SOR方法等价,具有相同的收敛性和迭代结果。该并行算法通过矩阵分块增大了可并行计算的区域,并引入流水线技术,利用各处理器间通讯与计算时间的重叠,获得较理想的并行加速效率。通过多核微机以及小规模集群上的数值实验证明,本文提出的SOR并行算法在求解大型稠密线性方程组时具有较好的并行效率。  相似文献   

9.
In this paper, some convergence theorems for the double splitting of a monotone matrix or a Hermitian positive definite matrix are presented. Two comparison theorems for two double splittings of a monotone matrix are obtained. Meanwhile, we establish a new sufficient condition for convergence of the Gauss-Seidel double SOR method for an H-matrix.  相似文献   

10.
This paper describes a successive overrelaxation (SOR) method for computing a multivariate C1 piecewise polynomial interpolant. Given a collection of points in Rm together with a triangulation of those points, the scheme described requires only the values of the function to be interpolated at the given points. The result is a C1 interpolant whose restriction to each of the triangles in the triangulation is a polynomial of degree n.  相似文献   

11.
We present a modification on the successive overrelaxation (SOR) method and the iteration of the Green's function integral representation for the solution of the (nonlinear) Poisson-Boltzmann equation between two spheres. In comparison with other attempts, which approximate the geometry or the nonlinearity, the computations here are done for the full problem and compared with those done by the finite element method as a typical method for such problems. For the parameters of general interest, while the SOR method does not work, and the iteration of the integral representation is limited in its convergence, our modification to these iterative schemes converge. The modified SOR surpasses both methods in simplicity and speed; it is about 100 times faster than the modified iteration of the integral representation, with the latter being still simpler and faster than the finite element method. These two examples further illustrate the advantage of our recent modification to iterative methods, which is based on an analytical fixed point argument.  相似文献   

12.
The implementation of a variant of the Successive Overrelaxation (SOR) method for the numerical solution of elliptic partial differential equations is discussed. The convergence of this variant (SORV) is faster than that of SOR for a fixed prescribed error bound, and it is asymptotically as fast as SOR. In addition, SORV is much better suited for vector computers than the classical SOR or SOR with so-called red-black ordering. In the following, we compare SORV with SOR, red-black SOR (SORB1), SORB1 with stride 1 (SORB2) and a one-dimensional long-vector version of SORB2 (SORB3). Implementation is discussed and results are presented for the IBM 3090-200VF and the CRAY-2.  相似文献   

13.
Recently, several proposals for the generalization of Young's SOR method to the saddle point problem or the augmented system has been presented. One of the most practical versions is the SOR-like method given by Golub et al., [(2001). SOR-like methods for augmented systems. BIT, 41, 71–85.], where the convergence and the determination of its optimum parameters were given. In this article, a full characterization of the spectral radius of the SOR-like iteration matrix is given, and an explicit expression for the optimum parameter is given in each case. The new results also lead to different results to that of Golub et al. Besides, it is shown that by the choices of the preconditioning matrix, the optimum SOR-like iteration matrix has no complex eigenvalues, therefore, it can be accelerated by semi-iterative methods.  相似文献   

14.
边缘海静力数值模式是国内针对边缘海特点自主开发的数值预报模式,但该模式因物理求解方程较多且采用不宜并行化的SOR求解算法而程序计算时间过长。针对上述问题,提出基于三维网格和海洋模式特点的SOR并行求解算法,该算法在保留三维网格数据间依赖关系的同时,有效解决了SOR迭代算法难以并行化的问题。同时,引入通信避免算法,采用MPI非阻塞通信方式,细分计算和通信过程,利用计算有效隐藏通信开销,提高了并行程序效率。实验结果表明,并行后的边缘海静力数值模式程序的性能相对串行程序提升了60.71倍,3天(25920计算时间步)预报结果的均方根误差低于0.001,满足海洋数值预报的时效性和精度要求。  相似文献   

15.
逐次超松弛迭代方法被广泛应用于油藏数值模拟中压力方程的求解.其并行实现是提高模拟速度的重要途径.传统并行方案大都只是在一次迭代内进行数据划分,而没有进一步将数据划分与迭代空间划分相结合,故针对SOR算法和SMP(symmetric multi-processors)系统的特点,以OpenMP为并行化实现工具,提出了基于SMP的并行逐次超松弛迭代方法(parallelSOR).方法通过改变不同迭代步内数据点的更新次序,使不同区域内的数据点可以并行执行多次迭代.总结出针对三维油藏区域在数据空间划分和迭代空间合并上相对较优的策略,分析了迭代过程中网格块的生长形状.与传统的并行策略相比,该方法具有可减小同步开销、改进数据局部性、cache命中率高等优点.实验结果表明,该方法具有较高的加速比和效率.  相似文献   

16.
《国际计算机数学杂志》2012,89(3-4):355-369
For the solution of the linear system Ax=b many iterative methods based on a splitting of A exist. Among them the Jacobi, the Gauss-Seidel and the Successive Overrelaxation (SOR) methods as well as their extrapolated counterparts are the most popular. This paper presents a new general method such that the aforementioned methods become special cases of it. Besides its four degrees of freedom, which make it a very flexible method, another of its main characteristics is that it is well-defined even when some elements on the diagonal of A are zero. The first results concerning the new method show that a proper exploitation of its basic properties will make it a very powerful technique.  相似文献   

17.
A parallelized point rowwise Successive Over-Relaxation (SOR) iterative algorithm is developed for the Heterogeneous Element Processor (HEP) multiple instruction stream computer. The classical point SOR method is not easily vectorizable with rowwise ordering of the grid points, but it can be effectively parallelized on a multiple instruction stream machine without suffering in computational and convergence rate. The details of the implementation including restructuring of a serial FORTRAN program and techniques needed to exploit the parallel processing architectural concept of the HEP are presented. The parallelized algorithm is analyzed in detail. The lessons learned in this study are documented and may provide some guidelines for similar future coding since new approaches and restructuring techniques are required for programming a multiple instruction stream machine, which are totally different than those required for programming an algorithm on a vector processor. To assess the capabilitiesof the parallelized algorithm it was used to solve the Laplace's equation on a rectangular field with Dirichlet boundary conditions. Computer run times are presented which indicate significant speed gain over a scalar version of the code. For a moderate to large size problem seventeen or more processes are required to make efficient use of the parallel processing hardware. Also, to demonstrate the capability of the algorithm for a realistic problem, it was used to obtain the numerical solution of a viscous incompressible fluid in a square cavity. Since point iterative relaxation schemes are at the core of many systems of elliptic as well as non-elliptic partial differential equations occuring in engineering and scientific applications, the present study suggests the possibility of both reducing the real time processing and increasing the scope of computational modeling.  相似文献   

18.
In the high-performance IC design with increasing design complexity,it is a very important design content to efficiently analyze IC parameters.Thus,the electro-thermal (ET) analyses including power/ground (P/G) analysis and thermal analysis are hot topics in today’s IC research.Since ET analysis equation has a sparse,positive definite and strictly diagonally dominant coefficient-matrix,we prove that the ET analysis has the advantage of locality.Owing to this advantage,localized relaxation method is formally proposed,which has the same accuracy as the global relaxation done with the constraint of the same truncation error limitation.Based on the localized relaxation theory,an efficient and practical localized successive over-relaxation algorithm (LSOR2) is introduced and applied to solve the following three ET analysis problems.(1) Single-node statistical voltage analysis for over-IR-drop nodes in P/G networks;(2) single-node statistical temperature analysis for hot spots in 3D thermal analysis;(3) fast single open-defect analysis for P/G networks.A large amount of experimental data demonstrates that compared with the global successive over-relaxation (SOR) algorithm,LSOR2 can speed up 1-2 orders of magnitudes with the same accuracy in ET analyses.  相似文献   

19.
The alternating direction method of multipliers (ADMM) has been successfully applied to solve structured convex optimization problems due to its superior practical performance. The convergence properties of the 2-block ADMM have been studied extensively in the literature. Specifically, it has been proven that the 2-block ADMM globally converges for any penalty parameter \(\gamma >0\). In this sense, the 2-block ADMM allows the parameter to be free, i.e., there is no need to restrict the value for the parameter when implementing this algorithm in order to ensure convergence. However, for the 3-block ADMM, Chen et al. (Math Program 155:57–79, 2016) recently constructed a counter-example showing that it can diverge if no further condition is imposed. The existing results on studying further sufficient conditions on guaranteeing the convergence of the 3-block ADMM usually require \(\gamma \) to be smaller than a certain bound, which is usually either difficult to compute or too small to make it a practical algorithm. In this paper, we show that the 3-block ADMM still globally converges with any penalty parameter \(\gamma >0\) if the third function \(f_3\) in the objective is smooth and strongly convex, and its condition number is in [1, 1.0798), besides some other mild conditions. This requirement covers an important class of problems to be called regularized least squares decomposition (RLSD) in this paper.  相似文献   

20.
《国际计算机数学杂志》2012,89(3-4):269-282
In this paper, a new explicit 4-pint block over-relaxation scheme is presented for the numerical solution of the sparse linear systems derived from the discretization of self-adjoint elliptic partial differential equations. A comparison with the implicit line and 2-line block SOR schemes for the model problem shows the new technique to be competitive  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号