期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

莫则尧《计算机学报》2001,24(2):183-190

提出了一个适合同构和异构并行计算环境的高效一维动态负载平衡方法;多层均权法,并成功地解决了多物质非定常流体力学Lagrange法并行数值模拟过程中的动态负载不平衡问题。文中给出了详细的理论分析以及两台并行机上结合某实际物理问题组织的并行数值实验。相似文献

2.

一种基于实测的高维动态负载平衡方法 总被引：3，自引：0，他引：3

曹小林莫则尧《计算机学报》2005,28(9):1440-1446

针对大规模科学计算中的强非规则结构负载问题,作者开发出一种基于实测的动态负载平衡方法．首先,将由规则结构化网格组成的模拟区域剖分成多块;其次,把块的高维坐标转换成一维Hilbert空间填充曲线（HSFC）索引;然后,基于实测信息采用多层均权法剖分按一维HSFC索引排列的块;最后根据剖分信息重分配块以平衡负载．它把仅适用于一维的多层均权法扩展到二维和三维,并引入更多的实测信息和块数据结构．与ISP方法相比,该方法在64个CPU上提高负载平衡效率10％,在某MPP的500个CPU上模拟强非规则结构负载问题时,获得了88％的负载平衡效率和84％的并行效率．相似文献

3.

基于消息传递的数据交错重分布负载平衡技术

刘杰迟利华胡庆丰李晓梅《计算机工程与设计》2005,26(2):312-314,319

数据重分布是实现消息传递环境下负载平衡的重要手段,提出了数据交错分布的模型问题及模型问题的并行计算模型,分析了模型问题在消息传递环境下的实现,讨论了性能和适用条件,给出了分析结果,讨论了通信与计算的时间重叠问题,将数据交错重分布负载平衡技术应用到非平衡刚性动力学方程组的并行计算中,获得了很好的负载平衡效果。相似文献

4.

一种基于实测的自动负载建模算法

刘旭莫则尧安恒斌曹小林张爱清《计算机科学》2015,42(1):63-66,78

负载平衡是影响大规模并行计算效率的一个关键因素,准确的负载建模是负载平衡的基础.提出了一种基于实测的自动负载建模算法.该算法无需用户提供信息,具有良好的理论保证以及近似线性的计算复杂度和完全的并行性.2400个进程上的分子动力学模拟表明,该算法执行速度快,同时能够保证60％以上的负载平衡效率. 相似文献

5.

Parallel implementation of molecular dynamics simulation for short-ranged interaction

Jong-Shinn Wu Yu-Lin Hsu Yun-Min Lee 《Computer Physics Communications》2005,170(2):175-185

A parallel molecular dynamics simulation method, designed for large-scale problems, employing dynamic spatial domain decomposition for short-ranged molecular interactions is proposed. In this parallel cellular molecular dynamics (PCMD) simulation method, the link-cell data structure is used to reduce the searching time required for forming the cut-off neighbor list as well as for domain decomposition, which utilizes the multi-level graph-partitioning technique. A simple threshold scheme (STS), in which workload imbalance is monitored and compared with some threshold value during the runtime, is proposed to decide the proper time for repartitioning the domain. The simulation code is implemented and tested on the memory-distributed parallel machine, e.g., PC-cluster system. Parallel performance is studied using approximately one million L-J atoms in the condensed, vaporized and supercritical states. Results show that fairly good parallel efficiency at 49 processors can be obtained for the condensed and supercritical states (∼60%), while it is comparably lower for the vaporized state (∼40%). 相似文献

6.

Parallel computing of diatomic molecular rarefied gas flows

《Parallel Computing》1997,23(9):1249-1260

A parallel algorithm for direct simulation Monte Carlo calculation of diatomic molecular rarefied gas flows is presented. For reliable simulation of such flow, an efficient molecular collision model is required. Using the molecular dynamics method, the collision of N₂ molecules is simulated. For this molecular dynamics simulation, the parameter decomposition method is applied for parallel computing. By using these results, the statistical collision model of diatomic molecule is constructed. For validation this model is applied to the direct simulation Monte Carlo method to simulate the energy distribution at equilibrium condition and the structure of normal shock wave. For this DSMC calculation, the domain decomposition is applied. It is shown that the collision process of diatomic molecules can be calculated precisely and the parallel algorithm can be efficiently implemented on the parallel computer. 相似文献

7.

Chaotic local search algorithm 总被引：24，自引：0，他引：24

Changkyu Choi Ju-Jang Lee 《Artificial Life and Robotics》1998,2(1):41-47

The steepest descent search algorithm is modified in conjunction withchaos to solve the optimization problem of an unstructured search space. The problem is that given only the gradient information of the quality function at the present configuration,X(t), we must find the value of a configuration vector that minimizes the quality function. The proposed algorithm starts basically from the steepest descent search technique but at the prescribed points, i.e., local minimum points, the chaotic jump is performed by the dynamics of a chaotic neuron. Chaotic motions are mainly caused because the Gaussian function has a hysteresis as a refractoriness. An adaptation mechanism to adjust the size of the chaotic jump is also given. In order to enhance the probability of finding the global minimum, a parallel search strategy is developed. The validity of the proposed method is verified in simulation examples of the function minimization problem and the motion planning problem of a mobile robot. This work was presented, in part, at the International Symposium on Artificial Life and Robotics, Oita, Japan, February 18–20, 1996 相似文献

8.

Hybrid parallelization of molecular dynamics simulations to reduce load imbalance

Morillo Julian Vassaux Maxime Coveney Peter V. Garcia-Gasulla Marta 《The Journal of supercomputing》2022,78(7):9184-9215

The most widely used technique to allow for parallel simulations in molecular dynamics is spatial domain decomposition, where the physical geometry is divided into boxes, one per processor. This technique can inherently produce computational load imbalance when either the spatial distribution of particles or the computational cost per particle is not uniform. This paper shows the benefits of using a hybrid MPI+OpenMP model to deal with this load imbalance. We consider LAMMPS (Large-scale Atomic/Molecular Massively Parallel Simulator), a prototypical molecular dynamics simulator that provides its own balancing mechanism and an OpenMP implementation for many of its modules, allowing for a hybrid setup. In this work, we extend the current OpenMP implementation of LAMMPS and optimize it and evaluate three different setups: MPI-only, MPI with the LAMMPS balance mechanism, and hybrid setup using our improved OpenMP version. This comparison is made using the five standard benchmarks included in the LAMMPS distribution plus two additional test cases. Results show that the hybrid approach can deal with load balancing problems better and more effectively (50% improvement versus MPI-only for a highly imbalanced test case) than the LAMMPS balance mechanism (only 43% improvement) and improve simulations with issues other than load imbalance.

相似文献

9.

基于极值优化的多核集群负载均衡

谭鹤毅《测控技术》2017,36(6):109-111

针对分布式多核节点系统的负载均衡难以取得最优解的问题,提出了一种基于改进极值优化的负载均衡方法.该方法通过节点的CPU占用率发现负载不均衡情况,然后用一个衡量模型估计计算与通信开销使改进的极值优化方法能够实现集群的负载均衡.仿真与实验结果表明该算法能够提高分布式集群的计算效率,是一种理想的负载均衡算法. 相似文献

10.

基于任务的Cholesky分解多核并行化研究

吴华平郑晓薇张建强《计算机工程与设计》2011,32(12):4057-4059,4255

针对Cholesky分解算法采用OpenMP并行程序设计时的并行性开销增大和线程负载不平衡的问题,利用并行性能分析工具对串行程序进行热点分析,提出了一种基于任务的Cholesky分解多核并行算法。该算法将大循环问题划分成各个相互独立的小任务,并运用任务窃取技术和动态负载均衡算法使多个任务能够并行完成。采用ParallelAmplifier对并行程序进行调试和优化,实验结果表明,其性能得到较大幅度的提升。相似文献

11.

针对复杂系统并行模拟问题的并发式多级矩阵重排算法 总被引：1，自引：0，他引：1

王健陈丙珍何小荣《计算机与应用化学》2002,19(5):589-592

在解决复杂化工过程优化与模拟问题时,大规模代数差分方程的存在导致大量的计算时间花费在重复求解稀疏大型线性方程组的过程中。随着并行计算和网络技术的发展,为了提高优化或模拟工作的速度,可以通过将非对称矩阵重排为带边块对角形式,从而实现对线性系统的高效并行求解。本文提出一种基于Kernighan-Lin算法的并发式的多层次矩阵重排策略,它以最小化边块为目标,同时保证尽可能小的负荷非平衡性,从而获得好的重排结果。应用该重排策略可以对大型稀疏矩阵进行压缩和并行重排,提高重排算法的效率。在研究过程中应用了基于该技术的并行计算程序对一系列标准矩阵进行了检验,并与一些现有的算法进行了比较,证明了其有效性和可行性。相似文献

12.

基于联合权重超图划分的SNN负载均衡方法

徐聪叶均超黄尧柴志雷《计算机应用研究》2023,40(7)

大规模脉冲神经网络并行模拟是探究大脑机能的重要手段。其难点在于合理地将负载映射到并行分布式平台上,提升模拟速度。为解决该问题,提出一种基于联合权重超图划分的SNN负载均衡方法,解决并行计算中进程间计算负载与通信负载的均衡问题,提高SNN模拟速度。并使用稀疏通信的方式替代集体通信,解决事件通信过程中的数据冗余问题,提升通信效率。实验结果表明,该方法使带有STDP突触20%规模的皮质层微电路模型的模拟时间,比标准循环分配算法缩短约64.5%,比普通超图分配算法缩短约57.4%,同时事件通信数据量减少了90%以上。相似文献

13.

A hybrid algorithm for queueing systems

Wai-On Yuen Wai-Ki Ching Michael K. Ng 《Calcolo》2004,41(3):139-151

Abstract In this paper, we propose a hybrid algorithm based on [12] for solving linear systems of equations. The hybrid algorithm combines the evolutionary algorithm and the successive over-relaxation (SOR) method. The evolutionary algorithm allows the relaxation parameter w to be adaptive in the SOR method. We prove the convergence of the hybrid algorithm for strictly diagonal dominant linear systems. We then apply it to solve the steady-state probability distributions of Markovian queueing systems. Numerical examples are given to demonstrate the fast convergence rate of the method. 相似文献

14.

Parameter Estimation Algorithms for Hierarchical Distributed Systems

《国际计算机数学杂志》2012,89(1):65-88

There has been a great deal of research activity in the area of identification of distributed parameter systems over the past two decades. An extensive treatment of off-line schemes ( e.g. , output least squares, estimation error, etc. ) together with a comprehensive survey of the literature can be found in the monograph by Banks and Kunisch [4] . In the case of on-line, or adaptive, schemes, the available literature is less extensive and more recent ( Isermann et al . [7] ). The on-line methods give estimates recursively as the measurements are obtained within the time limit imposed by the sampling period. These include recursive projection algorithm ( Baumeister et al. [5] ), recursive least squares algorithm ( Glentis et al . [6] ), on-line excitation algorithms ( Ludwig et al. [8] ), etc . In this paper an equivalent 2nd order dynamical system is formulated from a given trajectory representing the pattern to be recognised and simulated in order to estimate the parameters for hierarchical distributed systems using 1st and 2nd order dynamics. Recommendations for the best estimation strategy are given. 相似文献

15.

一种提高并行数据挖掘效率的方法

佘春东范植华孙世新车著明唐剑《计算机科学》2004,31(2):132-134

发现关联规则是数据挖掘的一项重要任务，本文介绍了几种数据挖掘的串行和并行算法。其中IDD算法是一种高效的和易于扩展的发现关联规则的并行算法，然而，当处理嚣数目增加时，由于负载的失衡导致其效率的严重下降，于是通过引入近似算法成功地解决了这个问题。我们给出了两种近似算法和其性能证明，其一是在线算法，另一种是离线算法。在本文的最后，我们进行了改进的IDD算法的复杂性分析。相似文献

16.

基于Slice的H.264并行视频编码算法 总被引：1，自引：0，他引：1

宁华梅铮李锦涛《计算机工程》2005,31(4):181-182

从H.264视频编码标准的特点出发,提出了基于Slice级别的H.264视频编码并行算法,该算法不仅能够保证节点间的负载平衡,减少各节点间数据的依赖关系,还充分利用了已有的计算能力。最后给出了在曙光3000上的实验结果。相似文献

17.

一种改进的并行挖掘关联规则的算法

佘春东孙世新范植华胡世祥车著明《小型微型计算机系统》2004,25(12):2139-2142

发现关联规则是数据挖掘的一个重要的任务．简要介绍了几种发现关联规则的串行算法和并行算法，并针对IDD和HD这两种效率和可扩展性较好的算法，引入在线LPT调度算法，有效地解决了IDD和HD算法中非常重要的候选项目集在各个处理器节点之间的划分问题，尽可能使得各个节点负载平衡，从而提高算法的效率．相似文献

18.

Model‐Free H∞ Control Design for Unknown Continuous‐Time Linear System Using Adaptive Dynamic Programming

下载免费PDF全文

Chunbin Qin Huaguang Zhang Yanhong Luo 《Asian journal of control》2016,18(2):609-618

In this paper, a new online model‐free adaptive dynamic programming algorithm is developed to solve the H_∞ control problem of the continuous‐time linear system with completely unknown system dynamics. Solving the game algebraic Riccati equation, commonly used in H_∞ state feedback control design, is often referred to as a two‐player differential game where one player tries to minimize the predefined performance index while the other tries to maximize it. Using data generated in real time along the system trajectories, this new method can solve online the game algebraic Riccati equation without requiring the full knowledge of system dynamics. A rigorous proof of convergence of the proposed algorithm is given. Finally, simulation studies on two examples demonstrate the effectiveness of the proposed method. 相似文献

19.

Numerical methods for minimizing real functions unimodal on a set

Liana Lupşa Ioana Chiorean 《Calcolo》2006,43(2):83-94

Abstract An extension of the classical concept of unimodality was recently proposed in [4] and slightly modified in [5]. Here we present a numerical method based on the idea of bisection for determining the minimum points of a real unimodal function on a set. A serial and a parallel algorithm are given. 相似文献

20.

Parallel Algorithm Oriented Mesh Database 总被引：1，自引：1，他引：0

Jean-François Remacle Ottmar Klaas Joseph E. Flaherty Mark S. Shephard 《Engineering with Computers》2002,18(3):274-284

In this paper, we present a new point of view for efficiently managing general parallel mesh representations. Taking as a slarting point the Algorithm Oriented Mesh Database (AOMD) of [1] we extend the concepts to a parallel mesh representation. The important aspects of parallel adaptivity and dynamic load balancing are discussed. We finally show how AOMD can be effectively interfaced with mesh adaptive partial differential equation solvers. Results of the calculation of an elasticity problem and of a transient fluid dynamics problem involving thousands of mesh refinements, and load balancings are finally presented. ID="A1" Correspondence and offprint requests to: J. Remacle, Scientific Computation Research Center, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, USA. E-mail: remacle@scorec.rpi.edu 相似文献