Similar literature
1.
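Parallel scientific computing based on the MapReduce model   Cited by: 4 (self-citations: 1, other citations: 3)
As multicore processors become commonplace, developing parallel programming models that are both efficient and easy to use has become a new challenge. MapReduce is a parallel distributed computing model developed by Google that has been highly successful in its search business. This paper introduces the MapReduce model into scientific computing and uses examples to show how the HPC-oriented HPMR/HPMR-s systems can describe and implement parallel scientific computations in a uniform way on distributed-memory or shared-memory systems.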
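
The map, group, and reduce stages that the MapReduce model relies on can be illustrated with a minimal single-process sketch. This is not the HPMR/HPMR-s interface, which the abstract does not show; the names `map_reduce` and `matvec` are hypothetical and chosen only for illustration, and a real system would distribute the map and reduce calls across workers.

```python
from collections import defaultdict
from functools import reduce


def map_reduce(records, mapper, reducer):
    """Apply mapper to each record, group emitted values by key, fold each group."""
    groups = defaultdict(list)
    for record in records:
        # A mapper may emit zero or more (key, value) pairs per record.
        for key, value in mapper(record):
            groups[key].append(value)
    # Reduce phase: combine all values that share a key.
    return {key: reduce(reducer, values) for key, values in groups.items()}


def matvec(matrix, vector):
    """Matrix-vector product expressed as map (multiply) and reduce (add)."""
    # One record per matrix entry: (row index, column index, value).
    entries = [(i, j, a) for i, row in enumerate(matrix) for j, a in enumerate(row)]
    partial = map_reduce(
        entries,
        mapper=lambda e: [(e[0], e[2] * vector[e[1]])],  # key = row index
        reducer=lambda x, y: x + y,
    )
    return [partial[i] for i in range(len(matrix))]
```

For example, `matvec([[1, 2], [3, 4]], [1, 1])` returns `[3, 7]`. Because the reducer is associative, the per-key folds could run independently on separate workers.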

2.
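On a high-performance computer with a CC-NUMA architecture, this work studies the parallel computation of non-ideal detonation wave propagation in three-dimensional space. Analysis and testing of the original serial program identified the curvature calculation and the first and second difference calculations as the main targets for parallelization. Taking the architecture of the parallel machine into account, the serial program was converted into a parallel program using parallel-processing techniques such as divide and conquer and load balancing. Numerical simulations and tests on a high-performance server show that the parallel detonation wave propagation program runs substantially faster.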

3.
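Targeting the structured-grid characteristics of LSFC2D, an in-house two-dimensional detonation-driven dynamics code, this paper adopts a recursive-bisection domain partitioning technique together with a dynamic load-balancing technique that combines global repartitioning with local fine-tuning, and implements the transfer of physical quantities between Eulerian grids, so that the program can run on high-performance parallel computers. This addresses the problem-size and run-time limitations of the physical models. The parallel program was verified for correctness and tested for parallel performance on a compute cluster; the results show that parallel efficiency exceeds 50% with 1.5 million grid cells.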
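
Recursive bisection of a structured grid can be sketched as follows. This is a generic static partitioner written under the assumption that every cell costs the same; it is not the LSFC2D implementation, and it does not show the dynamic rebalancing described in the abstract. The function name and the half-open index convention are hypothetical.

```python
def recursive_bisection(x0, x1, y0, y1, parts):
    """Split the cell range [x0, x1) x [y0, y1) into `parts` rectangles.

    Each step cuts the longer side so that the two halves receive cell
    counts roughly proportional to the number of parts assigned to them.
    """
    if parts == 1:
        return [(x0, x1, y0, y1)]
    left = parts // 2
    right = parts - left
    if (x1 - x0) >= (y1 - y0):
        # Cut along x; the left piece gets a left/parts share of the columns.
        cut = x0 + (x1 - x0) * left // parts
        return (recursive_bisection(x0, cut, y0, y1, left)
                + recursive_bisection(cut, x1, y0, y1, right))
    # Otherwise cut along y.
    cut = y0 + (y1 - y0) * left // parts
    return (recursive_bisection(x0, x1, y0, cut, left)
            + recursive_bisection(x0, x1, cut, y1, right))
```

Calling `recursive_bisection(0, 8, 0, 4, 4)` yields four subdomains of 8 cells each. Cutting the longer side keeps subdomains close to square, which tends to reduce the boundary data exchanged between neighbours.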

4.
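Heterogeneous BSP model and its communication protocol   Cited by: 9 (self-citations: 0, other citations: 9)
Heterogeneous parallel computing has been widely studied in high-performance scientific computing and general-purpose applications because of its favorable performance-to-price ratio. However, because the design and performance analysis of heterogeneous parallel programs remain largely empirical, developing practical programs is difficult. This paper proposes the HBSP model for heterogeneous environments and derives a corresponding cost prediction method that can effectively guide the design and analysis of heterogeneous parallel programs. The communication protocol designed and implemented for the HBSP model runs on all computing platforms that support the MPICH package. Finally, taking a parallel FFT algorithm as an example, the paper presents the corresponding algorithm design and measured test results.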

5.
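Research on MPI-based cluster systems   Cited by: 1 (self-citations: 0, other citations: 1)
MPI is currently the most important parallel programming tool on cluster systems; it implements communication between parallel programs by message passing. This paper studies how to build an MPI-based cluster computing system, designs a Linux cluster using a parallel program example, and evaluates the performance of the Linux cluster system.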

6.
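Parallel three-dimensional JEC-FDTD algorithm for magnetized plasma and its applications   Cited by: 2 (self-citations: 1, other citations: 1)
李毅, 徐利军, 袁乃昌. 《电子学报》 (Acta Electronica Sinica), 2008, 36(6): 1119-1123
Iteration formulas are given for the current density convolution finite-difference time-domain (JEC-FDTD) algorithm for three-dimensional magnetized plasma. The paper points out how parallelizing this algorithm differs from parallelizing the ordinary FDTD algorithm: the current density must also be iterated, and some additional data must be exchanged across subdomain interfaces during parallel computation. A parallel JEC-FDTD algorithm based on MPI (Message Passing Interface) is implemented. The reliability of the parallel program is then verified by computing the radar cross section (RCS) of a plasma-coated metal sphere, and its parallel efficiency is measured on a cluster. Finally, the monostatic RCS of a full-size aircraft coated with magnetized plasma is computed. The results show that the parallel JEC-FDTD algorithm is reliable, achieves high parallel efficiency, and can compute scattering from electrically large targets with anisotropic magnetized plasma.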

7.
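Three-dimensional fully electromagnetic PIC parallel simulation of high-power microwaves   Cited by: 5 (self-citations: 0, other citations: 5)
A three-dimensional fully electromagnetic particle-in-cell (PIC) parallel code has been developed to simulate the nonlinear time evolution of the interaction between electromagnetic fields and electron beams in high-power microwave source devices with complex geometries. This paper presents a parallel domain decomposition strategy based on "block-grid patch" and a parallel time integration algorithm built on it. Applying the parallel code to a typical microwave source device yields scalable parallel performance on upward of a thousand processors, and the simulation results are comparable with those of similar international software.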

8.
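Three-dimensional fully electromagnetic PIC parallel simulation of high-power microwaves   Cited by: 1 (self-citations: 0, other citations: 1)
陈军, 董烨, 杨温渊, 董志伟. 《电子学报》 (Acta Electronica Sinica), 2009, 37(9): 2051-2054
A three-dimensional fully electromagnetic particle-in-cell (PIC) parallel code has been developed to simulate the nonlinear time evolution of the interaction between electromagnetic fields and electron beams in high-power microwave source devices with complex geometries. This paper presents a parallel domain decomposition strategy based on "block-grid patch" and a parallel time integration algorithm built on it. Applying the parallel code to a typical microwave source device yields scalable parallel performance on upward of a thousand processors, and the simulation results are comparable with those of similar international software.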

9.
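张凌洁, 赵英. 《电子设计工程》 (Electronic Design Engineering), 2012, 20(17): 15-18, 22
The Floyd-Warshall algorithm is the classic algorithm for the all-pairs shortest paths (APSP) problem in graph theory. To speed up the computation, the authors propose implementing it with general-purpose GPU computing. Starting from the principles of the algorithm, the article works step by step toward a parallel F-W algorithm that runs on the GPU. It then implements an improved GPU parallel F-W algorithm based on matrix blocking and the use of GPU shared memory. Extensive tests lead to the conclusion that the GPU parallel program achieves a speedup of more than 100x over a conventional CPU parallel program.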
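
The matrix-blocking idea can be sketched on the CPU in plain Python. The code below is the standard three-phase tiled Floyd-Warshall scheme, offered as an assumption about what "matrix blocking" refers to; it is not the authors' GPU kernel, and the function names and tile size `b` are hypothetical. On a GPU, the tiles within phase 2 and within phase 3 are independent of one another and could be processed by separate thread blocks, with each tile staged in shared memory.

```python
def floyd_warshall(dist):
    """Reference O(n^3) all-pairs shortest paths on an adjacency matrix."""
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d


def _relax_tile(d, bi, bj, bk, b, n):
    """Relax tile (bi, bj) through the intermediate vertices of round bk."""
    for k in range(bk * b, min((bk + 1) * b, n)):
        for i in range(bi * b, min((bi + 1) * b, n)):
            for j in range(bj * b, min((bj + 1) * b, n)):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]


def blocked_floyd_warshall(dist, b=2):
    """Tiled Floyd-Warshall with tile size b (three phases per round)."""
    n = len(dist)
    d = [row[:] for row in dist]
    nb = (n + b - 1) // b  # number of tiles per side
    for bk in range(nb):
        # Phase 1: the diagonal tile depends only on itself.
        _relax_tile(d, bk, bk, bk, b, n)
        # Phase 2: tiles in the same tile row and tile column as the diagonal.
        for t in range(nb):
            if t != bk:
                _relax_tile(d, bk, t, bk, b, n)
                _relax_tile(d, t, bk, bk, b, n)
        # Phase 3: all remaining tiles, using the phase 2 results.
        for bi in range(nb):
            for bj in range(nb):
                if bi != bk and bj != bk:
                    _relax_tile(d, bi, bj, bk, b, n)
    return d
```

Both functions return the same distance matrix for a graph with no negative cycles; missing edges can be encoded as `float("inf")`.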

10.
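As graphics content in industrial and consumer electronics applications becomes more complex and more tightly coupled with computation, embedded developers need cost-effective processor solutions that deliver outstanding multimedia capabilities. Freescale's two newly launched, highly integrated i.MX35 multimedia applications processors target the industrial and consumer markets and combine high performance, connectivity, and graphics processing, enabling developers to raise system designs to a higher performance level at lower cost and power consumption.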

11.
Although most community colleges don't have high-performance computing programs, a consortium of four community colleges developed an HPC curriculum with help from the US National Science Foundation. Contra Costa College's HPC program is vocational, training students to be Linux cluster administrators. Courses also prepare students for various Computing Technology Industry Association exams.

12.
Advances in sensor technology are revolutionizing the way remotely sensed data is collected, managed and analyzed. The incorporation of latest-generation sensors into airborne and satellite platforms is currently producing a nearly continual stream of high-dimensional data, and this explosion in the amount of collected information has rapidly created new processing challenges. For instance, hyperspectral signal processing is a new technique in remote sensing that generates hundreds of spectral bands at different wavelength channels for the same area on the surface of the Earth. Many current and future applications of remote sensing in Earth science, space science, and soon in exploration science will require (near) real-time processing capabilities. In recent years, several efforts have been directed towards the incorporation of high-performance computing (HPC) systems and architectures in remote sensing missions. With the aim of providing an overview of current and new trends in parallel and distributed systems for remote sensing applications, this paper explores three HPC-based paradigms for efficient implementation of the Pixel Purity Index (PPI) algorithm, available from the popular Kodak’s Research Systems ENVI software package, as a representative case study for demonstration purposes. Several different parallel programming techniques are used to improve the performance of the PPI on a variety of parallel platforms, including a set of message passing interface (MPI)-based implementations on a massively parallel Beowulf cluster at NASA’s Goddard Space Flight Center in Maryland and on a variety of heterogeneous networks of workstations at the University of Maryland; a Handel-C implementation of the algorithm on a Virtex-II field programmable gate array (FPGA); and a compute unified device architecture (CUDA)-based implementation on NVIDIA graphics processing units (GPUs). Combined, these parts deliver an excellent snapshot of the state-of-the-art in those areas, and offer a thoughtful perspective on the potential and emerging challenges of adapting HPC systems to remote sensing problems.

13.
High-performance computing for vision   Cited by: 2 (self-citations: 0, other citations: 2)
The main focus of the paper is on effectively using commercial-off-the-shelf (COTS) based general purpose parallel computing platforms to realize high speed implementations of vision tasks. Due to the successful use of the COTS-based systems in a variety of high performance applications, it is attractive to consider their use for vision applications as well. However, the irregular data dependencies in vision tasks lead to large communication overheads in the HPC systems. At the University of Southern California, our research efforts have been directed toward designing scalable parallel algorithms for vision tasks on the HPC systems. In our approach, we use the message passing programming model to develop portable code. Our algorithms are specified using C and MPI. In this paper, we summarize our efforts, and illustrate our approach using several example vision tasks.

14.
Parallel processing over the Internet is now becoming a realistic possibility. There are numerous off-the-shelf high-performance computing (HPC) platforms available with Internet access, on which to implement computationally intensive algorithms. HPC can be applied in the field of computational electromagnetics. The networking capabilities of the Internet now allow these computing resources to be used as a remote service. Additionally, the pragmatics of their utilization can be abstracted by adopting a World Wide Web (WWW) interface. A Web-based environment can provide the supportive tools for data entry, program initiation, result visualization, and even interactive modifications of the geometry and/or electromagnetic (EM) properties. For realistic interaction, the emerging question is which algorithm to use that supports the exploitation of parallelism. In order to exploit and utilize all the available performance of current and predicted HPC platforms, inherently-parallel-based algorithms have to be devised. One such algorithm is the parallel method of moments/method of auxiliary sources, P(MoM/MAS), introduced in this paper. The resulting algorithm parallelization enables the MoM/MAS method to be applied to solving electrically-large-in-size and complex EM structures on various computational platforms. This paper concentrates on the parallel-processing issues, and on the importance of adopting suitable algorithms, such as the MoM/MAS technique.

15.
Today, parallel programming is dominated by message passing libraries such as the message passing interface (MPI). This article aims to simplify parallel programming by generating parallel programs from parallelized algorithm design strategies. It uses skeletons to abstract parallelized algorithm design strategies as well as parallel architectures. Starting from the problem specification, an abstract parallel abstract programming language+ (Apla+) program is generated from parallelized algorithm design strategies and problem-specific function definitions. By combining these with parallel architectures, the implicit parallelism inside the parallelized algorithm design strategies is exploited. Through implementation and transformation, a C++ and parallel virtual machine (CPPVM) parallel program is finally generated. The parallelized branch and bound (B&B) and parallelized divide and conquer (D&C) algorithm design strategies are studied as examples, and the approach is illustrated with a case study.

16.
17.
The ParaScope parallel programming environment   Cited by: 1 (self-citations: 0, other citations: 1)
The ParaScope parallel programming environment, developed to support scientific programming of shared-memory multiprocessors, is described. It includes a collection of tools that use global program analysis to help users develop and debug parallel programs. The focus is on ParaScope's compilation system. The compilation system extends the traditional single-procedure compiler by providing a mechanism for managing the compilation of complete programs. The ParaScope editor brings both compiler analysis and user expertise to bear on program parallelization. The debugging system detects and reports timing-dependent errors, called data races, in execution of parallel programs. A project aimed at extending ParaScope to support programming in FORTRAN D, a machine-independent parallel programming language for use with both distributed-memory and shared-memory parallel computers, is described.

18.
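Matlab implementation and application of BP networks   Cited by: 17 (self-citations: 2, other citations: 15)
刘浩, 白振兴. 《现代电子技术》 (Modern Electronics Technique), 2006, 29(2): 49-51, 54
Artificial neural networks, with advantages such as distributed storage of information, parallel processing, and self-learning, are being applied ever more widely in information processing, pattern recognition, intelligent control, and system modeling. The multilayer feedforward network based on the error backpropagation algorithm, known as the BP network, is widely used in nonlinear modeling, function approximation, and pattern recognition. The paper introduces the basic principles of the BP network, analyzes the BP-related functions in the Matlab neural network toolbox, and gives practical applications of some of the important ones.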

19.
Optical loop mirror multiplexer   Cited by: 6 (self-citations: 0, other citations: 6)
A novel fiber-optic configuration has been developed to generate high-repetition-rate optical pulses up to hundreds of GHz. The set-up consists of a parallel connection of optical loop mirrors. The multiplexer has potential applications in multichannel parallel processing and in high speed optical communications. With the use of fiber switches, our set-up can be converted into an optical bit pattern generator.

20.
The 3D vector problem of diffraction of the fields of two coupled parallel electric dipoles located in parallel to an infinitely thin rectangular perfectly conducting screen is considered. On the basis of the asymptotic solution to this problem, fast algorithms and programs are developed for computation of patterns, the directive gain, and the radiation resistance as functions of the geometric parameters and electric dimensions of the radiating structure. It is shown that, when the distance between a dipole and the screen is fixed, the appropriately chosen distance between the dipoles and the appropriately chosen dimensions of the screen provide for axially symmetric patterns and the maximum maximorum directive gain.
