Similar literature
1.
Given the resurgent attractiveness of single-instruction-multiple-data (SIMD) processing, it is important for high-performance computing applications to be SIMD-capable. The Hartree-Fock SCF (HF-SCF) application, in its canonical form, cannot fully exploit SIMD processing. Prior attempts to implement Electron Repulsion Integral (ERI) sorting functionality to essentially “SIMD-ify” the HF-SCF application have met frustration because of the low throughput of the sorting functionality. With greater awareness of computer architecture, we discuss how the sorting functionality may be practically implemented to provide high performance. Overall system performance analysis, including memory locality analysis, is also conducted, and further emphasises that a system with ERI sorting is capable of very high throughput. We discuss two alternative implementation options, with one immediately accessible software-based option discussed in detail. The impact of workload characteristics on expected performance is also discussed, and it is found that, in general, as basis set size increases the potential performance of the system also increases. Consideration is given to conventional CPUs, GPUs, FPGAs, and the Cell Broadband Engine architecture.

2.
Quantum Monte Carlo (QMC) is among the most accurate methods for solving the time-independent Schrödinger equation. Unfortunately, the method is very expensive and requires a vast array of computing resources in order to obtain results of a reasonable convergence level. On the other hand, the method is not only easily parallelizable across CPU clusters, but as we report here, it also has a high degree of data parallelism. This facilitates the use of recent technological advances in Graphical Processing Units (GPUs), a powerful type of processor well known to computer gamers. In this paper we report on an end-to-end QMC application with core elements of the algorithm running on a GPU. With individual kernels achieving as much as 30× speedup, the overall application runs up to 6× faster than an optimized CPU implementation, yet requires only a modest increase in hardware cost. This demonstrates the speedup improvements possible for QMC in running on advanced hardware, thus exploring a path toward providing QMC level accuracy as a more standard tool. The major current challenge in running codes of this type on the GPU arises from the lack of fully compliant IEEE floating point implementations. To achieve better accuracy we propose the use of the Kahan summation formula in matrix multiplications. While this drops overall performance, we demonstrate that the proposed new algorithm can match CPU single precision.
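
The Kahan summation idea mentioned in the abstract can be sketched as follows. This is a minimal illustration in plain Python (double precision on the CPU), not the authors' GPU kernel; the function names `kahan_dot`, `naive_dot` and `kahan_matmul` are hypothetical and chosen only for this example.

```python
def kahan_dot(xs, ys):
    """Dot product with Kahan (compensated) accumulation."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for x, y in zip(xs, ys):
        term = x * y - comp              # re-inject the previously lost bits
        new_total = total + term         # low-order bits of term may be lost here
        comp = (new_total - total) - term  # recover what was just lost
        total = new_total
    return total


def naive_dot(xs, ys):
    """Plain left-to-right accumulation, for comparison."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def kahan_matmul(a, b):
    """Matrix product of two lists-of-rows, each entry via kahan_dot."""
    cols = list(zip(*b))
    return [[kahan_dot(row, col) for col in cols] for row in a]


if __name__ == "__main__":
    # One large term followed by many terms too small to register individually.
    xs = [1.0] + [1e-16] * 1_000_000
    ys = [1.0] * len(xs)
    print("naive:", naive_dot(xs, ys))
    print("kahan:", kahan_dot(xs, ys))
```

In this sketch the naive loop never moves away from 1.0 because each small term is rounded off, whereas the compensated loop carries the lost bits forward. The extra subtractions per term are the source of the performance drop the abstract refers to.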

3.
We present HONEI, an open-source collection of libraries offering a hardware oriented approach to numerical calculations. HONEI abstracts the hardware, and applications written on top of HONEI can be executed on a wide range of computer architectures such as CPUs, GPUs and the Cell processor. We demonstrate the flexibility and performance of our approach with two test applications, a Finite Element multigrid solver for the Poisson problem and a robust and fast simulation of shallow water waves. By linking against HONEI's libraries, we achieve a two-fold speedup over straightforward C++ code using HONEI's SSE backend, and an additional 3–4 and 4–16 times faster execution on the Cell and a GPU, respectively. A second important aspect of our approach is that the full performance capabilities of the hardware under consideration can be exploited by adding optimised application-specific operations to the HONEI libraries. HONEI provides all necessary infrastructure for development and evaluation of such kernels, significantly simplifying their development.

Program summary

Program title: HONEI
Catalogue identifier: AEDW_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDW_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GPLv2
No. of lines in distributed program, including test data, etc.: 216 180
No. of bytes in distributed program, including test data, etc.: 1 270 140
Distribution format: tar.gz
Programming language: C++
Computer: x86, x86_64, NVIDIA CUDA GPUs, Cell blades and PlayStation 3
Operating system: Linux
RAM: at least 500 MB free
Classification: 4.8, 4.3, 6.1
External routines: SSE: none; [1] for GPU, [2] for Cell backend
Nature of problem: Computational science in general and numerical simulation in particular have reached a turning point. The revolution developers are facing is not primarily driven by a change in (problem-specific) methodology, but rather by the fundamental paradigm shift of the underlying hardware towards heterogeneity and parallelism. This is particularly relevant for data-intensive problems stemming from discretisations with local support, such as finite differences, volumes and elements.
Solution method: To address these issues, we present a hardware aware collection of libraries combining the advantages of modern software techniques and hardware oriented programming. Applications built on top of these libraries can be configured trivially to execute on CPUs, GPUs or the Cell processor. In order to evaluate the performance and accuracy of our approach, we provide two domain specific applications; a multigrid solver for the Poisson problem and a fully explicit solver for 2D shallow water equations.
Restrictions: HONEI is actively being developed, and its feature list is continuously expanded. Not all combinations of operations and architectures might be supported in earlier versions of the code. Obtaining snapshots from http://www.honei.org is recommended.
Unusual features: The considered applications as well as all library operations can be run on NVIDIA GPUs and the Cell BE.
Running time: Depending on the application, and the input sizes. The Poisson solver executes in few seconds, while the SWE solver requires up to 5 minutes for large spatial discretisations or small timesteps.
References:
  • [1] http://www.nvidia.com/cuda.
  • [2] http://www.ibm.com/developerworks/power/cell.

4.
In this paper, a programming model is presented which enables scalable parallel performance on multi-core shared memory architectures. The model has been developed for application to a wide range of numerical simulation problems. Such problems involve time stepping or iteration algorithms where synchronization of multiple threads of execution is required. It is shown that traditional approaches to parallelism including message passing and scatter-gather can be improved upon in terms of speed-up and memory management. Using spatial decomposition to create orthogonal computational tasks, a new task management algorithm called H-Dispatch is developed. This algorithm makes efficient use of memory resources by limiting the need for garbage collection and takes optimal advantage of multiple cores by employing a “hungry” pull strategy. The technique is demonstrated on a simple finite difference solver and results are compared to traditional MPI and scatter-gather approaches. The H-Dispatch approach achieves near linear speed-up with results for efficiency of 85% on a 24-core machine. It is noted that the H-Dispatch algorithm is quite general and can be applied to a wide class of computational tasks on heterogeneous architectures involving multi-core and GPGPU hardware.
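
The general idea of a pull-based dispatch over spatially decomposed tasks can be sketched as below. This is a generic worker-pool illustration and not the H-Dispatch algorithm itself, whose details the abstract does not give. The names `step_chunk` and `run`, the chunk size, and the 1D explicit diffusion update are all assumptions made for the example. In CPython the global interpreter lock prevents real speed-up here; the sketch only shows the control flow.

```python
import queue
import threading


def step_chunk(u, out, lo, hi, alpha):
    """Explicit 1D diffusion update for cells lo..hi-1, reading u, writing out."""
    for i in range(lo, hi):
        out[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])


def run(u0, steps, alpha=0.25, n_workers=4, chunk=8):
    """Advance u0 by `steps` time steps; boundary cells stay fixed."""
    n = len(u0)
    # Two buffers that are swapped each step, so no new arrays are allocated.
    state = {"u": list(u0), "out": list(u0)}
    tasks = queue.Queue()

    def worker():
        while True:
            item = tasks.get()  # pull: an idle worker asks for the next chunk
            if item is None:    # sentinel: shut down
                tasks.task_done()
                return
            lo, hi = item
            step_chunk(state["u"], state["out"], lo, hi, alpha)
            tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for _ in range(steps):
        # Spatial decomposition: each chunk of interior cells is one task.
        for lo in range(1, n - 1, chunk):
            tasks.put((lo, min(lo + chunk, n - 1)))
        tasks.join()  # synchronization point: every chunk of this step is done
        state["u"], state["out"] = state["out"], state["u"]
    for _ in threads:
        tasks.put(None)
    for t in threads:
        t.join()
    return state["u"]


if __name__ == "__main__":
    u0 = [0.0] * 20
    u0[10] = 1.0
    print(run(u0, steps=5))
```

Because workers take a new chunk only when they are free, faster workers naturally process more chunks, which is the load-balancing property a pull strategy is meant to provide.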

5.
Modern high energy physics experiments have to process terabytes of input data produced in particle collisions. The core of many data reconstruction algorithms in high energy physics is the Kalman filter. Therefore, the speed of Kalman filter based algorithms is of crucial importance in on-line data processing. This is especially true for the combinatorial track finding stage where the Kalman filter based track fit is used very intensively. Therefore, developing fast reconstruction algorithms, which use the maximum available power of processors, is important, in particular for the initial selection of events which carry signals of interesting physics.

One such powerful feature, supported by almost all up-to-date PC processors, is a SIMD instruction set, which allows packing several data items in one register and operating on all of them, thus achieving more operations per clock cycle. The novel Cell processor extends the parallelization further by combining a general-purpose PowerPC processor core with eight streamlined coprocessing elements which greatly accelerate vector processing applications.

In the investigation described here, after a significant memory optimization and a comprehensive numerical analysis, the Kalman filter based track fitting algorithm of the CBM experiment has been vectorized using inline operator overloading. Thus the algorithm continues to be flexible with respect to any CPU family used for data reconstruction.

Because of all these changes, the SIMDized Kalman filter based track fitting algorithm takes 1 μs per track, which is 10000 times faster than the initial version. Porting the algorithm to a Cell Blade computer gives another factor of 10 in speedup.

Finally, we compare the performance of the tracking algorithm running on three different CPU architectures: Intel Xeon, AMD Opteron and Cell Broadband Engine.

6.
In this article we focus on the implementation of a Lattice Monte Carlo simulation for a generic pair potential within a reconfigurable computing platform. The approach presented was used to simulate a specific soft matter system. We found the performed simulations to be in excellent accordance with previous theoretical and simulation studies. By taking advantage of the shortened processing time, we were also able to find new micro- and macroscopic properties of this system. Furthermore, we analyzed analytically the effects of the spatial discretization introduced by the Lattice Monte Carlo algorithm.

7.
We present two sequential and one parallel global optimization codes that belong to the stochastic class, and an interface routine that enables the use of the Merlin/MCL environment as a non-interactive local optimizer. This interface proved extremely important, since it provides flexibility, effectiveness and robustness to the local search task that is in turn employed by the global procedures. We demonstrate the application of the parallel code to a molecular conformation problem.

Program summary

Title of program: PANMIN
Catalogue identifier: ADSU
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADSU
Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland
Computer for which the program is designed and others on which it has been tested: PANMIN is designed for UNIX machines. The parallel code runs on either shared memory architectures or on a distributed system. The code has been tested on a SUN Microsystems ENTERPRISE 450 with four CPUs, and on a 48-node cluster under Linux, with both the GNU g77 and the Portland group compilers. The parallel implementation is based on MPI and has been tested with LAM MPI and MPICH
Installation: University of Ioannina, Greece
Programming language used: Fortran-77
Memory required to execute with typical data: Approximately O(n^2) words, where n is the number of variables
No. of bits in a word: 64
No. of processors used: 1 or many
Has the code been vectorised or parallelized?: Parallelized using MPI
No. of bytes in distributed program, including test data, etc.: 147163
No. of lines in distributed program, including the test data, etc.: 14366
Distribution format: gzipped tar file
Nature of physical problem: A multitude of problems in science and engineering are often reduced to minimizing a function of many variables. There are instances that a local optimum does not correspond to the desired physical solution and hence the search for a better solution is required. Local optimization techniques can be trapped in any local minimum. Global Optimization is then the appropriate tool. For example, solving a non-linear system of equations via optimization, one may encounter many local minima that do not correspond to solutions, i.e. they are far from zero
Method of solution: PANMIN is a suite of programs for Global Optimization that take advantage of the Merlin/MCL optimization environment [1,2]. We offer implementations of two algorithms that belong to the stochastic class and use local searches either as intermediate steps or as solution refinement
Restrictions on the complexity of the problem: The only restriction is set by the available memory of the hardware configuration. The software can handle bound constrained problems. The Merlin Optimization environment must be installed. Availability of an MPI installation is necessary for executing the parallel code
Typical running time: Depending on the objective function
References:
  • [1] D.G. Papageorgiou, I.N. Demetropoulos, I.E. Lagaris, Merlin-3.0. A multidimensional optimization environment, Comput. Phys. Commun. 109 (1998) 227-249.
  • [2] D.G. Papageorgiou, I.N. Demetropoulos, I.E. Lagaris, The Merlin Control Language for strategic optimization, Comput. Phys. Commun. 109 (1998) 250-275.

8.
A new method that employs grammatical evolution and a stopping rule for finding the global minimum of a continuous multidimensional, multimodal function is considered. The genetic algorithm used is a hybrid genetic algorithm in conjunction with a local search procedure. We list results from numerical experiments with a series of test functions and we compare with other established global optimization methods. The accompanying software accepts objective functions coded either in Fortran 77 or in C++.

Program summary

Program title: GenMin
Catalogue identifier: AEAR_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEAR_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 35 810
No. of bytes in distributed program, including test data, etc.: 436 613
Distribution format: tar.gz
Programming language: GNU-C++, GNU-C, GNU Fortran 77
Computer: The tool is designed to be portable in all systems running the GNU C++ compiler
Operating system: The tool is designed to be portable in all systems running the GNU C++ compiler
RAM: 200 KB
Word size: 32 bits
Classification: 4.9
Nature of problem: A multitude of problems in science and engineering are often reduced to minimizing a function of many variables. There are instances that a local optimum does not correspond to the desired physical solution and hence the search for a better solution is required. Local optimization techniques are frequently trapped in local minima. Global optimization is hence the appropriate tool. For example, solving a nonlinear system of equations via optimization, employing a least squares type of objective, one may encounter many local minima that do not correspond to solutions (i.e. they are far from zero).
Solution method: Grammatical evolution and a stopping rule.
Running time: Depending on the objective function. The test example given takes only a few seconds to run.

9.
We present a software library for numerically estimating first and second order partial derivatives of a function by finite differencing. Various truncation schemes are offered resulting in corresponding formulas that are accurate to order O(h), O(h^2), and O(h^4), h being the differencing step. The derivatives are calculated via forward, backward and central differences. Care has been taken that only feasible points are used in the case where bound constraints are imposed on the variables. The Hessian may be approximated either from function or from gradient values. There are three versions of the software: a sequential version, an OpenMP version for shared memory architectures and an MPI version for distributed systems (clusters). The parallel versions exploit the multiprocessing capability offered by computer clusters, as well as modern multi-core systems, and due to the independent character of the derivative computation, the speedup scales almost linearly with the number of available processors/cores.

Program summary

Program title: NDL (Numerical Differentiation Library)
Catalogue identifier: AEDG_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEDG_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 73 030
No. of bytes in distributed program, including test data, etc.: 630 876
Distribution format: tar.gz
Programming language: ANSI FORTRAN-77, ANSI C, MPI, OPENMP
Computer: Distributed systems (clusters), shared memory systems
Operating system: Linux, Solaris
Has the code been vectorised or parallelized?: Yes
RAM: The library uses O(N) internal storage, N being the dimension of the problem
Classification: 4.9, 4.14, 6.5
Nature of problem: The numerical estimation of derivatives at several accuracy levels is a common requirement in many computational tasks, such as optimization, solution of nonlinear systems, etc. The parallel implementation that exploits systems with multiple CPUs is very important for large scale and computationally expensive problems.
Solution method: Finite differencing is used with carefully chosen step that minimizes the sum of the truncation and round-off errors. The parallel versions employ both OpenMP and MPI libraries.
Restrictions: The library uses only double precision arithmetic.
Unusual features: The software takes into account bound constraints, in the sense that only feasible points are used to evaluate the derivatives, and given the level of the desired accuracy, the proper formula is automatically employed.
Running time: Running time depends on the function's complexity. The test run took 15 ms for the serial distribution, 0.6 s for the OpenMP and 4.2 s for the MPI parallel distribution on 2 processors.
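
The three accuracy levels and the bound-aware choice of formula can be illustrated with a small one-dimensional sketch. This is not NDL's code or interface: the function names are hypothetical, and the default steps are the common textbook heuristics (step proportional to eps^(1/2), eps^(1/3) and eps^(1/5)), assumed here as stand-ins for the library's "carefully chosen step".

```python
import math
import sys

EPS = sys.float_info.epsilon  # double precision machine epsilon


def forward_diff(f, x, h=None):
    """First derivative, forward difference, truncation error O(h)."""
    if h is None:
        h = math.sqrt(EPS) * max(1.0, abs(x))
    return (f(x + h) - f(x)) / h


def central_diff2(f, x, h=None):
    """First derivative, central difference, truncation error O(h^2)."""
    if h is None:
        h = EPS ** (1.0 / 3.0) * max(1.0, abs(x))
    return (f(x + h) - f(x - h)) / (2.0 * h)


def central_diff4(f, x, h=None):
    """First derivative, five-point central stencil, truncation error O(h^4)."""
    if h is None:
        h = EPS ** (1.0 / 5.0) * max(1.0, abs(x))
    return (-f(x + 2.0 * h) + 8.0 * f(x + h)
            - 8.0 * f(x - h) + f(x - 2.0 * h)) / (12.0 * h)


def bounded_diff(f, x, lower, upper):
    """Pick a formula that evaluates f only at points inside [lower, upper]."""
    h2 = EPS ** (1.0 / 3.0) * max(1.0, abs(x))
    if lower <= x - h2 and x + h2 <= upper:
        return central_diff2(f, x, h2)
    h1 = math.sqrt(EPS) * max(1.0, abs(x))
    if x + h1 <= upper:
        return forward_diff(f, x, h1)
    # Backward difference; assumes the box is wide enough that x - h1 >= lower.
    return (f(x) - f(x - h1)) / h1


if __name__ == "__main__":
    exact = math.cos(1.0)
    for name, fn in [("O(h)", forward_diff), ("O(h^2)", central_diff2),
                     ("O(h^4)", central_diff4)]:
        print(name, abs(fn(math.sin, 1.0) - exact))
```

The step heuristics balance truncation error, which shrinks with h, against round-off error, which grows as h shrinks; higher-order formulas therefore tolerate larger steps.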

10.
A new stochastic method for locating the global minimum of a multidimensional function inside a rectangular hyperbox is presented. A sampling technique is employed that makes use of the procedure known as grammatical evolution. The method can be considered as a “genetic” modification of the Controlled Random Search procedure due to Price. The user may code the objective function either in C++ or in Fortran 77. We offer a comparison of the new method with others of similar structure, by presenting results of computational experiments on a set of test functions.

Program summary

Title of program: GenPrice
Catalogue identifier: ADWP
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADWP
Program available from: CPC Program Library, Queen's University of Belfast, N. Ireland
Computer for which the program is designed and others on which it has been tested: the tool is designed to be portable in all systems running the GNU C++ compiler
Installation: University of Ioannina, Greece
Programming language used: GNU-C++, GNU-C, GNU Fortran-77
Memory required to execute with typical data: 200 KB
No. of bits in a word: 32
No. of processors used: 1
Has the code been vectorized or parallelized?: no
No. of lines in distributed program, including test data, etc.: 13 135
No. of bytes in distributed program, including test data, etc.: 78 512
Distribution format: tar.gz
Nature of physical problem: A multitude of problems in science and engineering are often reduced to minimizing a function of many variables. There are instances that a local optimum does not correspond to the desired physical solution and hence the search for a better solution is required. Local optimization techniques are frequently trapped in local minima. Global optimization is hence the appropriate tool. For example, solving a nonlinear system of equations via optimization, employing a “least squares” type of objective, one may encounter many local minima that do not correspond to solutions, i.e. minima with values far from zero.
Method of solution: Grammatical Evolution is used to accelerate the process of finding the global minimum of a multidimensional, multimodal function, in the framework of the original “Controlled Random Search” algorithm.
Typical running time: Depending on the objective function.

11.
12.
State-of-the-art molecular dynamics (MD) simulations generate massive datasets involving billion-vertex chemical bond networks, which makes data mining based on graph algorithms such as K-ring analysis a challenge. This paper proposes an algorithm to improve the efficiency of ring analysis of large graphs, exploiting properties of K-rings and spatial correlations of vertices in the graph. The algorithm uses dual-tree expansion (DTE) and spatial hash-function tagging (SHAFT) to optimize computation and memory access. Numerical tests show nearly perfect linear scaling of the algorithm. In addition, a parallel implementation of the DTE + SHAFT algorithm achieves high scalability. The algorithm has been successfully employed to analyze large MD simulations involving up to 500 million atoms.

13.
This paper describes a package written in MATHEMATICA that automates typical operations performed during evaluation of Feynman graphs with Mellin-Barnes (MB) techniques. The main procedure allows one to analytically continue an MB integral in a given parameter without any intervention from the user and thus to resolve the singularity structure in this parameter. The package can also perform numerical integrations at specified kinematic points, as long as the integrands have satisfactory convergence properties. It is demonstrated that, at least in the case of massive graphs in the physical region, the convergence may turn out to be poor, making naïve numerical integration of MB integrals unusable. Possible solutions to this problem are presented, but full automation in such cases may not be achievable.

Program summary

Title of program: MB
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADYG_v1_0
Catalogue identifier: ADYG_v1_0
Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland
Computers: All
Operating systems: All
Programming language used: MATHEMATICA, Fortran 77 for numerical evaluation
Memory required to execute with typical data: Sufficient for a typical installation of MATHEMATICA.
No. of lines in distributed program, including test data, etc.: 12 013
No. of bytes in distributed program, including test data, etc.: 231 899
Distribution format: tar.gz
Libraries used: CUBA [T. Hahn, Comput. Phys. Commun. 168 (2005) 78] for numerical evaluation of multidimensional integrals and CERNlib [CERN Program Library, obtainable from: http://cernlib.web.cern.ch/cernlib/] for the implementation of Γ and ψ functions in Fortran.
Nature of physical problem: Analytic continuation of Mellin-Barnes integrals in a parameter and subsequent numerical evaluation. This is necessary for evaluation of Feynman integrals from Mellin-Barnes representations.
Method of solution: Recursive accumulation of residue terms occurring when singularities cross integration contours. Numerical integration of multidimensional integrals with the help of the CUBA library.
Restrictions on the complexity of the problem: Limited by the size of the available storage space.
Typical running time: Depending on the problem. Usually seconds for moderate dimensionality integrals.

14.
A modification of the standard Simulated Annealing (SA) algorithm is presented for finding the global minimum of a continuous multidimensional, multimodal function. We report results of computational experiments with a set of test functions and we compare to methods of similar structure. The accompanying software accepts objective functions coded both in Fortran 77 and C++.

Program summary

Title of program: GenAnneal
Catalogue identifier: ADXI_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADXI_v1_0
Program available from: CPC Program Library, Queen's University of Belfast, N. Ireland
Computer for which the program is designed and others on which it has been tested: The tool is designed to be portable in all systems running the GNU C++ compiler
Installation: University of Ioannina, Greece on Linux based machines
Programming language used: GNU-C++, GNU-C, GNU Fortran 77
Memory required to execute with typical data: 200 KB
No. of bits in a word: 32
No. of processors used: 1
Has the code been vectorized or parallelized?: No
No. of bytes in distributed program, including test data, etc.: 84 885
No. of lines in distributed program, including test data, etc.: 14 896
Distribution format: tar.gz
Nature of physical problem: A multitude of problems in science and engineering are often reduced to minimizing a function of many variables. There are instances that a local optimum does not correspond to the desired physical solution and hence the search for a better solution is required. Local optimization techniques are frequently trapped in local minima. Global optimization is hence the appropriate tool. For example, solving a non-linear system of equations via optimization, employing a “least squares” type of objective, one may encounter many local minima that do not correspond to solutions (i.e. they are far from zero).
Typical running time: Depending on the objective function.
Method of solution: We modified the process of step selection that the traditional Simulated Annealing employs and instead we used a global technique based on grammatical evolution.
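
For orientation, the traditional Simulated Annealing loop that GenAnneal modifies can be sketched as follows. This shows only the standard algorithm with a uniform random step, Metropolis acceptance and geometric cooling; it does not reproduce the grammatical-evolution step selection described above. The function name, parameter names and default values are assumptions chosen for the example.

```python
import math
import random


def simulated_annealing(f, x0, bounds, t0=1.0, cooling=0.999,
                        n_iter=5000, step=0.5, seed=0):
    """Minimize f over a box; returns (best_point, best_value)."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    x = list(x0)
    fx = f(x)
    best_x, best_f = list(x), fx
    t = t0
    for _ in range(n_iter):
        # Step selection: uniform perturbation, clamped to the bounds.
        cand = [min(hi, max(lo, xi + rng.uniform(-step, step)))
                for xi, (lo, hi) in zip(x, bounds)]
        fc = f(cand)
        delta = fc - fx
        # Metropolis rule: accept improvements, and uphill moves with
        # probability exp(-delta / t).
        if delta <= 0.0 or rng.random() < math.exp(-delta / t):
            x, fx = cand, fc
            if fx < best_f:
                best_x, best_f = list(x), fx
        t *= cooling  # geometric cooling schedule
    return best_x, best_f


if __name__ == "__main__":
    def sphere(v):
        return sum(c * c for c in v)

    print(simulated_annealing(sphere, [4.0, 4.0], [(-5.0, 5.0)] * 2))
```

The step-selection line is the part the program summary says was replaced; everything else in the loop is the conventional scheme.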

15.
16.
Resource distribution in hierarchical systems is formulated as a multiindex linear programming problem under transport-type constraints. Conditions under which this problem is reduced to the determination of the minimal-cost circulation in a transport network are stated.

17.
A new stochastic clustering algorithm is introduced that aims to locate all the local minima of a multidimensional continuous and differentiable function inside a bounded domain. The accompanying software (MinFinder) is written in ANSI C++. However, the user may code his objective function either in C++, C or Fortran 77. We compare the performance of this new method to the performance of Multistart and Topographical Multilevel Single Linkage Clustering on a set of benchmark problems.

Program summary

Title of program: MinFinder
Catalogue identifier: ADWU
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADWU
Program obtainable from: CPC Program Library, Queen's University of Belfast, N. Ireland
Computer for which the program is designed and others on which it has been tested: The tool is designed to be portable in all systems running the GNU C++ compiler
Installation: University of Ioannina, Greece
Programming language used: GNU-C++, GNU-C, GNU Fortran 77
Memory required to execute with typical data: 200 KB
No. of bits in a word: 32
No. of processors used: 1
Has the code been vectorized or parallelized?: no
No. of lines in distributed program, including test data, etc.: 5797
No. of bytes in distributed program, including test data, etc.: 588 121
Distribution format: gzipped tar file
Nature of the physical problem: A multitude of problems in science and engineering are often reduced to minimizing a function of many variables. There are instances that a local optimum does not correspond to the desired physical solution and hence the search for a better solution is required. Local optimization techniques can be trapped in any local minimum. Global optimization is then the appropriate tool. For example, solving a non-linear system of equations via optimization, employing a “least squares” type of objective, one may encounter many local minima that do not correspond to solutions, i.e. they are far from zero.
Method of solution: Using a uniform pdf, points are sampled from the rectangular search domain. A clustering technique, based on a typical distance and a gradient criterion, is used to decide from which points a local search should be started. The employed local procedure is a BFGS version due to Powell. Further searching is terminated when all the local minima inside the search domain are thought to be found. This is accomplished via the double-box rule.
Typical running time: Depending on the objective function

18.
A new version of the “MinFinder” program is presented that offers an augmented linking procedure for Fortran-77 subprograms, two additional stopping rules and a new start-point rejection mechanism that saves a significant portion of gradient and function evaluations. The method is applied on a set of standard test functions and the results are reported.

New version program summary

Program title: MinFinder v2.0
Catalogue identifier: ADWU_v2_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/ADWU_v2_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC Licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 14 150
No. of bytes in distributed program, including test data, etc.: 218 144
Distribution format: tar.gz
Programming language used: GNU C++, GNU FORTRAN, GNU C
Computer: The program is designed to be portable in all systems running the GNU C++ compiler
Operating system: Linux, Solaris, FreeBSD
RAM: 200 000 bytes
Classification: 4.9
Catalogue identifier of previous version: ADWU_v1_0
Journal reference of previous version: Computer Physics Communications 174 (2006) 166-179
Does the new version supersede the previous version?: Yes
Nature of problem: A multitude of problems in science and engineering are often reduced to minimizing a function of many variables. There are instances that a local optimum does not correspond to the desired physical solution and hence the search for a better solution is required. Local optimization techniques can be trapped in any local minimum. Global optimization is then the appropriate tool. For example, solving a non-linear system of equations via optimization, one may encounter many local minima that do not correspond to solutions, i.e. they are far from zero.
Solution method: Using a uniform pdf, points are sampled from a rectangular domain. A clustering technique, based on a typical distance and a gradient criterion, is used to decide from which points a local search should be started. Further searching is terminated when all the local minima inside the search domain are thought to be found. This is accomplished via three stopping rules: the “double-box” stopping rule, the “observables” stopping rule and the “expected minimizers” stopping rule.
Reasons for the new version: The link procedure for source code in Fortran 77 is enhanced, two additional stopping rules are implemented and a new criterion for accepting start points, that economizes on function and gradient calls, is introduced.
Summary of revisions:
1. Addition of command line parameters to the utility program make_program.
2. Augmentation of the link process for Fortran 77 subprograms, by linking the final executable with the g2c library.
3. Addition of two probabilistic stopping rules.
4. Introduction of a rejection mechanism to the Checking step of the original method, that reduces the number of gradient evaluations.
Additional comments: A technical report describing the revisions, experiments and test runs is packaged with the source code.
Running time: Depending on the objective function.

19.
In this paper we present an inversion algorithm for nonlinear ill-posed problems arising in atmospheric remote sensing. The proposed method is the iteratively regularized Gauss-Newton method. The dependence of the performance and behaviour of the algorithm on the choice of the regularization matrices and sequences of regularization parameters is studied by means of simulations. A method for improving the accuracy of the solution when the identity matrix is used as regularization matrix is also discussed. Results are presented for atmospheric temperature retrievals from a far infrared spectrum observed by an airborne uplooking heterodyne instrument.
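
For reference, one common textbook form of the iteratively regularized Gauss-Newton step is written out below. The notation is an assumption and is not taken from the paper: F is the forward model, y the measured spectrum, K_k the Jacobian of F at the current iterate x_k, x_a an a priori state, L the regularization matrix, and α_k a decreasing sequence of regularization parameters (the geometric sequence shown is only one possible choice).

```latex
% Each step minimizes the linearized residual plus a penalty anchored at x_a:
x_{k+1} = \arg\min_{x}\;
  \bigl\| y - F(x_k) - K_k (x - x_k) \bigr\|^2
  + \alpha_k \bigl\| L (x - x_a) \bigr\|^2 ,
\qquad K_k = F'(x_k).

% Setting the gradient to zero gives the closed-form update:
x_{k+1} = x_k + \bigl( K_k^{T} K_k + \alpha_k L^{T} L \bigr)^{-1}
  \Bigl[ K_k^{T} \bigl( y - F(x_k) \bigr) - \alpha_k L^{T} L \,(x_k - x_a) \Bigr].

% One possible parameter sequence (an assumption), with ratio 0 < q < 1:
\alpha_{k+1} = q\,\alpha_k .
```

With L equal to the identity matrix, the penalty reduces to the squared distance from x_a, which is the case the abstract singles out for an accuracy improvement.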

20.
Grid computing is distributed computing performed transparently across multiple administrative domains. Grid middleware, which is meant to enable access to grid resources, is currently widely seen as being too heavyweight and, in consequence, unwieldy for general scientific use. Its heavyweight nature, especially on the client-side, has severely restricted the uptake of grid technology by computational scientists. In this paper, we describe the Application Hosting Environment (AHE) which we have developed to address some of these problems. The AHE is a lightweight, easily deployable environment designed to allow the scientist to quickly and easily run legacy applications on distributed grid resources. It provides a higher level abstraction of a grid than is offered by existing grid middleware schemes such as the Globus Toolkit. As a result, the computational scientist does not need to know the details of any particular underlying grid middleware and is isolated from any changes to it on the distributed resources. The functionality provided by the AHE is ‘application-centric’: applications are exposed as web services with a well-defined standards-compliant interface. This allows the computational scientist to start and manage application instances on a grid in a transparent manner, thus greatly simplifying the user experience. We describe how a range of computational science codes have been hosted within the AHE and how the design of the AHE allows us to implement complex workflows for deployment on grid infrastructure.
