Similar documents
20 similar documents found (search time: 0 ms)
1.
Methods are investigated for constructing computational environments for solving large problems on available distributed heterogeneous computer resources. The approaches developed form the basis of the X-Com system, which is intended for large-scale metacomputing calculations. The principles of the system's design are briefly described, and results of actual computations are given.

2.
Resource distribution in hierarchical systems is formulated as a multi-index linear programming problem under transport-type constraints. Conditions are stated under which this problem reduces to finding the minimal-cost circulation in a transport network.

3.
4.
A scalable model and methods of resource co-allocation for organizing data processing in distributed systems by families of basic plans (strategies) are proposed. The strategies are multilevel in character, since they are designed for structurally different but functionally equivalent models of the same job, which is a complex set of interrelated tasks. A concrete basic plan of computations is selected depending on the time parameters of control events that occur in the system, related above all to the load and to the changing composition of heterogeneous computational nodes.

5.
In this article we focus on the implementation of a lattice Monte Carlo simulation for a generic pair potential on a reconfigurable computing platform. The approach presented was used to simulate a specific soft-matter system. We found the simulations to be in excellent agreement with previous theoretical and simulation studies. By taking advantage of the shortened processing time, we were also able to find new micro- and macroscopic properties of this system. Furthermore, we analyzed analytically the effects of the spatial discretization introduced by the lattice Monte Carlo algorithm.
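The core of any lattice Monte Carlo scheme is a Metropolis acceptance step applied to moves on a discrete grid. As a minimal sketch (not the paper's soft-matter model), the following simulates a 2-D lattice gas with a nearest-neighbour pair energy; the potential, lattice size, and move set are illustrative assumptions:

```python
import math, random

def lattice_mc_sweep(occ, L, beta, eps):
    """One Metropolis sweep of a 2-D lattice gas: pair energy -eps per
    occupied nearest-neighbour pair (illustrative pair potential)."""
    def site_energy(x, y):
        # energy of a particle at (x, y) from its occupied neighbours
        e = 0.0
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            if occ[(x + dx) % L][(y + dy) % L]:
                e -= eps
        return e

    for _ in range(L * L):
        # propose swapping the occupancy of a random site and a neighbour
        x, y = random.randrange(L), random.randrange(L)
        dx, dy = random.choice(((1, 0), (-1, 0), (0, 1), (0, -1)))
        nx, ny = (x + dx) % L, (y + dy) % L
        if occ[x][y] == occ[nx][ny]:
            continue  # swap would change nothing
        e_old = site_energy(x, y) if occ[x][y] else site_energy(nx, ny)
        occ[x][y], occ[nx][ny] = occ[nx][ny], occ[x][y]
        e_new = site_energy(x, y) if occ[x][y] else site_energy(nx, ny)
        dE = e_new - e_old
        if dE > 0 and random.random() >= math.exp(-beta * dE):
            occ[x][y], occ[nx][ny] = occ[nx][ny], occ[x][y]  # reject: undo
    return occ
```

Because moves only exchange occupancies, the particle number is conserved exactly, which is the usual check on such a kernel.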

6.
Parallelization of the global extremum searching process
A parallel algorithm for finding the global extremum of a function of several variables is designed. The algorithm is based on the method of non-uniform coverings proposed by Yu.G. Evtushenko for functions satisfying the Lipschitz condition. The algorithm is implemented in C using the Message Passing Interface (MPI). To speed up the computations, auxiliary procedures for finding local extrema are used. The operation of the algorithm is illustrated by the example of atomic cluster structure calculations.
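The covering idea is that a region can be discarded once a Lipschitz lower bound over it exceeds the best value found so far. A serial one-dimensional sketch (the paper's algorithm is parallel and multivariate; the tolerance handling here is an assumption):

```python
import heapq

def global_min_coverings(f, a, b, lip, tol=1e-4):
    """Global minimum of a Lipschitz-continuous f on [a, b] by non-uniform
    coverings: an interval with centre c and half-width h is covered (and
    discarded) once f(c) - lip * h >= best - tol."""
    best_x, best_f = a, f(a)
    # priority queue of (lower_bound, left, right), best bound first
    heap = [(f((a + b) / 2) - lip * (b - a) / 2, a, b)]
    while heap:
        bound, lo, hi = heapq.heappop(heap)
        if bound >= best_f - tol:
            continue  # interval cannot improve the record: covered
        mid = (lo + hi) / 2
        fm = f(mid)
        if fm < best_f:
            best_x, best_f = mid, fm
        for l, h in ((lo, mid), (mid, hi)):  # bisect and re-bound halves
            c, half = (l + h) / 2, (h - l) / 2
            lb = f(c) - lip * half
            if lb < best_f - tol:
                heapq.heappush(heap, (lb, l, h))
    return best_x, best_f
```

Subdivision stops once intervals are narrower than roughly `tol / lip`, so the work concentrates near the global minimizer, which is what makes the coverings non-uniform.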

7.
A technique is proposed for describing a complex structural diagram of plants with progressive operations as a system of logical functions whose variables can be both aggregates of the plants' transportation network and larger blocks, i.e., channels. A method for finding and eliminating potentially false paths in the system is devised, as well as a method for taking into account specified constraints on the joint operation of parallel channels.

8.
A characterization is given of the development of supercomputers, which over the last 10–15 years have come to be built as microprocessor-based massively parallel structures. Parallel data processing is expected to continue developing over the next 10 years, both at the general system level and at the processor-structure level; the physical performance of components will grow, and the application area of computer systems based on reconfigurable structures will expand.

9.
Consideration is given to the problem of maze routing: compiling a route that leads from an arbitrary point to a given one. A class of mazes subject to routing (namely, a class of mazes for which the routing problem can be solved) is described.
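The classical approach to grid-maze routing is a breadth-first wavefront (Lee-style) search, which finds a shortest route whenever one exists. A minimal sketch on a 0/1 grid (the grid encoding and neighbourhood are assumptions, not the paper's maze class):

```python
from collections import deque

def route(maze, start, goal):
    """Breadth-first maze routing on a grid of 0 = free, 1 = wall.
    Returns a shortest path as a list of cells, or None if unreachable."""
    rows, cols = len(maze), len(maze[0])
    prev = {start: None}            # also serves as the visited set
    q = deque([start])
    while q:
        cell = q.popleft()
        if cell == goal:            # backtrack through predecessors
            path = []
            while cell is not None:
                path.append(cell)
                cell = prev[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and maze[nr][nc] == 0 and (nr, nc) not in prev):
                prev[(nr, nc)] = cell
                q.append((nr, nc))
    return None  # goal not reachable: maze is not routable from start
```

A maze is "subject to routing" from a given start exactly when this search reaches the goal, so the same code doubles as a routability test.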

10.
In this paper, an algorithm is proposed for retrieving a wide class of invariants in quasi-polynomial systems. The invariance properties of the algorithm under different transformations are discussed. The application of the algorithm is illustrated with physical and numerical examples. The algorithm has been implemented in the MATLAB computing environment.

11.
This paper examines some issues in the numerical modeling of seismology in three-dimensional space on high-performance computing systems. The grid-characteristic method is used for the modeling. This method allows accurate treatment of different contact conditions and is suitable for physically correct solutions of problems of seismology and seismic prospecting in complex heterogeneous media. We use grid-characteristic schemes of up to fourth-order accuracy. The software package is parallelized for distributed cluster environments using MPI. We present the results of simulating Love and Rayleigh surface seismic waves, as well as the passage of seismic waves initiated at an earthquake's hypocenter through a multilayer geological formation to the earth's surface.

12.
In this paper, a programming model is presented that enables scalable parallel performance on multi-core shared-memory architectures. The model has been developed for application to a wide range of numerical simulation problems. Such problems involve time-stepping or iteration algorithms where synchronization of multiple threads of execution is required. It is shown that traditional approaches to parallelism, including message passing and scatter-gather, can be improved upon in terms of speed-up and memory management. Using spatial decomposition to create orthogonal computational tasks, a new task management algorithm called H-Dispatch is developed. This algorithm makes efficient use of memory resources by limiting the need for garbage collection and takes optimal advantage of multiple cores by employing a "hungry" pull strategy. The technique is demonstrated on a simple finite difference solver, and results are compared to traditional MPI and scatter-gather approaches. The H-Dispatch approach achieves near-linear speed-up, with an efficiency of 85% on a 24-core machine. The H-Dispatch algorithm is quite general and can be applied to a wide class of computational tasks on heterogeneous architectures involving multi-core and GPGPU hardware.
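The essence of a "hungry" pull strategy is that idle workers fetch the next task themselves instead of being assigned a fixed share up front, so faster workers naturally absorb more work. A thread-based sketch of that idea only (the H-Dispatch algorithm itself, with its memory-recycling scheme, is not reproduced here):

```python
import queue, threading

def pull_dispatch(tasks, worker_fn, n_workers=4):
    """Run worker_fn over tasks with a pull-based dispatch: each worker
    thread grabs a new task the moment it becomes idle."""
    task_q = queue.Queue()
    for t in tasks:
        task_q.put(t)

    results, lock = [], threading.Lock()

    def worker():
        while True:
            try:
                t = task_q.get_nowait()  # "hungry" pull from the shared queue
            except queue.Empty:
                return                   # no work left: worker retires
            r = worker_fn(t)
            with lock:
                results.append(r)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return results
```

Compared with a static scatter-gather split, this load-balances automatically when task costs vary, at the price of contention on the shared queue.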

13.
This paper describes a package written in MATHEMATICA that automates typical operations performed during the evaluation of Feynman graphs with Mellin-Barnes (MB) techniques. The main procedure analytically continues an MB integral in a given parameter without any intervention from the user, and thus resolves the singularity structure in this parameter. The package can also perform numerical integration at specified kinematic points, as long as the integrands have satisfactory convergence properties. It is demonstrated that, at least in the case of massive graphs in the physical region, the convergence may turn out to be poor, making naïve numerical integration of MB integrals unusable. Possible solutions to this problem are presented, but full automation in such cases may not be achievable.

Program summary

Title of program: MB
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADYG_v1_0
Catalogue identifier: ADYG_v1_0
Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland
Computers: All
Operating systems: All
Programming language used: MATHEMATICA; Fortran 77 for numerical evaluation
Memory required to execute with typical data: sufficient for a typical installation of MATHEMATICA
No. of lines in distributed program, including test data, etc.: 12 013
No. of bytes in distributed program, including test data, etc.: 231 899
Distribution format: tar.gz
Libraries used: CUBA [T. Hahn, Comput. Phys. Commun. 168 (2005) 78] for numerical evaluation of multidimensional integrals, and CERNlib [CERN Program Library, obtainable from: http://cernlib.web.cern.ch/cernlib/] for the implementation of Γ and ψ functions in Fortran
Nature of physical problem: analytic continuation of Mellin-Barnes integrals in a parameter and subsequent numerical evaluation, necessary for the evaluation of Feynman integrals from Mellin-Barnes representations
Method of solution: recursive accumulation of residue terms occurring when singularities cross integration contours; numerical integration of multidimensional integrals with the help of the CUBA library
Restrictions on the complexity of the problem: limited by the size of the available storage space
Typical running time: problem-dependent; usually seconds for integrals of moderate dimensionality

14.
The efficiency and scalability of traditional parallel force-decomposition (FD) algorithms are poor because of the high communication cost introduced when the skew-symmetric character of the force matrix is exploited. This paper proposes a new parallel algorithm called UTFBD (Under-Triangle Force Block Decomposition), based on a new, efficient force-matrix decomposition strategy. This strategy decomposes only the lower triangle of the force matrix and greatly reduces parallel communication cost; e.g., the communication cost of the UTFBD algorithm is only one third of that of Taylor's FD algorithm. UTFBD is implemented on a cluster system and applied to a physical nucleation problem with 500,000 particles. Numerical results are analyzed and compared among three algorithms: FRI, Taylor's FD, and UTFBD. The efficiency of UTFBD on 105 processors is 41.3%, while the efficiencies of FRI and Taylor's FD on 100 processors are 4.3% and 35.2%, respectively. In other words, the efficiency of UTFBD on about 100 processors is 37.0 and 6.1 percentage points higher than that of FRI and Taylor's FD, respectively. The results show that UTFBD raises the efficiency of parallel MD (molecular dynamics) simulation and has better scalability.
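The reason only the lower triangle of the force matrix is needed is Newton's third law: F(i, j) = -F(j, i), so each pair is evaluated once and its reaction reused. A serial sketch of that accumulation pattern (1-D positions for brevity; the parallel block decomposition itself is not shown):

```python
def pair_forces(pos, force_law):
    """Total force on each particle, evaluating force_law only over the
    lower triangle i > j of the pair matrix; the upper triangle follows
    from the skew symmetry F(j, i) = -F(i, j)."""
    n = len(pos)
    f = [0.0] * n
    for i in range(1, n):
        for j in range(i):
            fij = force_law(pos[i], pos[j])  # force on i due to j
            f[i] += fij
            f[j] -= fij                      # reaction on j, free of charge
    return f
```

The halved evaluation count is exactly what a parallel FD scheme then has to distribute; the communication cost the abstract discusses comes from sharing these reaction terms across processors.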

15.
16.
For multiprocessor systems of hierarchical-architecture relational databases, a new approach to data layout and load balancing is proposed. A database multiprocessor model is described that enables simulation and examination of arbitrary hierarchical multiprocessor configurations in the context of on-line transaction processing applications. An important subclass, the symmetrical multiprocessor hierarchies, is considered, and a new data layout strategy based on the method of partial mirroring is proposed for it. The disk space used to replicate the data is evaluated analytically. For symmetrical hierarchies having a certain regularity, theorems estimating the cost of replica formation are proved. An efficient load-balancing method based on the partial mirroring technique is proposed. The methods described are oriented toward clusters and Grid systems.

17.
Quantum Monte Carlo (QMC) is among the most accurate methods for solving the time-independent Schrödinger equation. Unfortunately, the method is very expensive and requires a vast array of computing resources to obtain results at a reasonable convergence level. On the other hand, the method is not only easily parallelizable across CPU clusters but, as we report here, also has a high degree of data parallelism. This facilitates the use of recent technological advances in graphics processing units (GPUs), a powerful type of processor well known to computer gamers. In this paper we report on an end-to-end QMC application with core elements of the algorithm running on a GPU. With individual kernels achieving as much as 30× speedup, the overall application performs up to 6× faster than an optimized CPU implementation, yet requires only a modest increase in hardware cost. This demonstrates the speedups possible for QMC on advanced hardware, exploring a path toward providing QMC-level accuracy as a more standard tool. The major current challenge in running codes of this type on the GPU arises from the lack of fully IEEE-compliant floating-point implementations. To achieve better accuracy we propose the use of the Kahan summation formula in matrix multiplications. While this reduces overall performance, we demonstrate that the proposed algorithm can match CPU single precision.
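Kahan summation is a standard, well-documented technique: it carries a compensation term holding the low-order bits lost in each addition, which is why it can recover accuracy on hardware with weak floating-point rounding. A minimal reference version (in Python rather than a GPU kernel):

```python
def kahan_sum(values):
    """Kahan compensated summation: c accumulates the rounding error of
    each addition and feeds it back into the next term."""
    s = 0.0
    c = 0.0                # running compensation for lost low-order bits
    for v in values:
        y = v - c          # corrected next term
        t = s + y          # low-order bits of y may be lost here...
        c = (t - s) - y    # ...and are recovered into c
        s = t
    return s
```

Naively summing one large value followed by many tiny ones loses the tiny contributions entirely; the compensated loop retains them, which is the effect the paper exploits inside matrix multiplications.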

18.
Given the resurgent attractiveness of single-instruction-multiple-data (SIMD) processing, it is important for high-performance computing applications to be SIMD-capable. The Hartree-Fock SCF (HF-SCF) application, in its canonical form, cannot fully exploit SIMD processing. Prior attempts to implement electron repulsion integral (ERI) sorting functionality to essentially "SIMD-ify" the HF-SCF application have been frustrated by the low throughput of the sorting functionality. With greater awareness of computer architecture, we discuss how the sorting functionality may be practically implemented to provide high performance. Overall system performance analysis, including memory-locality analysis, is also conducted and further emphasizes that a system with ERI sorting is capable of very high throughput. We discuss two alternative implementation options, with one immediately accessible software-based option discussed in detail. The impact of workload characteristics on expected performance is also discussed; in general, as basis-set size increases, the potential performance of the system also increases. Consideration is given to conventional CPUs, GPUs, FPGAs, and the Cell Broadband Engine architecture.
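The point of ERI sorting is to group integral work items by class so that every item in a batch follows the same code path, which is the precondition for SIMD execution. A schematic sketch only; the class key (sorted angular momenta of a shell quartet) and the data layout are illustrative assumptions, not the paper's scheme:

```python
from collections import defaultdict

def sort_eris_into_batches(shell_quartets, eri_class):
    """Bucket ERI work items by integral class; each bucket can then be
    dispatched as one uniform, SIMD-friendly batch."""
    batches = defaultdict(list)
    for q in shell_quartets:
        batches[eri_class(q)].append(q)
    return dict(batches)
```

The throughput problem the abstract describes lives in this bucketing step: done naively on billions of quartets it becomes the bottleneck, which is why the paper treats its memory behaviour so carefully.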

19.
We propose a method for pattern generation based on a light guide for application in an edge-lit backlight. This method uses the molecular dynamics method with a generalized force model as a generation scheme to produce a microstructure with a pattern combining a variable aspect ratio and a variable microstructure orientation. The generation scheme can accommodate the needs of the subsequent optical design phase and allows easier optical optimization to reach the equal-luminance condition. The scheme incorporates several new techniques to meet these needs, the most important of which is the cell-division technique, which allows adjustment of the microstructure density in each sub-domain, or cell. In addition, boundary treatments allow precise control of the microstructure density in each cell and smoothing of the microstructure distribution across cell boundaries. Finally, we present an example of a backlight with one light-emitting diode, showing the integration of the generation scheme and the optical design phase in order to validate the proposed scheme.

20.
State-of-the-art molecular dynamics (MD) simulations generate massive datasets involving billion-vertex chemical bond networks, which makes data mining based on graph algorithms, such as K-ring analysis, a challenge. This paper proposes an algorithm that improves the efficiency of ring analysis of large graphs by exploiting properties of K-rings and spatial correlations of vertices in the graph. The algorithm uses dual-tree expansion (DTE) and spatial hash-function tagging (SHAFT) to optimize computation and memory access. Numerical tests show nearly perfect linear scaling of the algorithm, and a parallel implementation of the DTE + SHAFT algorithm achieves high scalability. The algorithm has been successfully employed to analyze large MD simulations involving up to 500 million atoms.
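Spatial hash tagging exploits the fact that ring searches only ever combine vertices that are physically close: mapping each vertex's grid cell to a hash bucket makes "who is nearby" a constant-time lookup. A generic sketch of that idea, assuming a common prime-XOR hash (not the paper's SHAFT scheme):

```python
def spatial_hash(point, cell, table_size):
    """Map a 3-D point to a bucket index; points in the same grid cell of
    side `cell` always share a bucket."""
    ix, iy, iz = (int(c // cell) for c in point)
    # large primes spread the integer cell indices over the table
    return ((ix * 73856093) ^ (iy * 19349663) ^ (iz * 83492791)) % table_size

def build_buckets(points, cell, table_size=1 << 20):
    """Tag every vertex with its spatial hash, grouping nearby vertices."""
    buckets = {}
    for i, p in enumerate(points):
        buckets.setdefault(spatial_hash(p, cell, table_size), []).append(i)
    return buckets
```

With vertices tagged this way, a ring search needs to touch only the buckets overlapping its current neighbourhood, which is what keeps both computation and memory traffic local.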
