Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
A parallelized version of the level-set algorithm based on the MPI technique is presented. TM-polarized plane waves are used to illuminate two-dimensional perfect electric conducting targets. A variety of performance measures such as the efficiency, the load balance, the weak scaling, and the communication/computation times are discussed. For electromagnetic inverse scattering problems, retrieving the target's arbitrary shape and location in real time is considered a main goal, even as a trade-off with algorithm efficiency. For the three cases considered here, a maximum speedup of 53X-84X is achieved when using 256 processors. However, the overall efficiency of the parallelized level-set algorithm is 21%-33% when using 256 processors and 26%-52% when using 128 processors. The effects of the bottlenecks of the level-set algorithm on the algorithm efficiency are discussed.
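The efficiency figures quoted for 256 processors are consistent with the usual definition of parallel efficiency as speedup divided by processor count. A minimal sketch of that arithmetic follows; the function name `parallel_efficiency` is illustrative and does not come from the paper.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Parallel efficiency under the common definition: speedup / processor count."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Speedup range reported in the abstract for 256 processors.
for speedup in (53, 84):
    eff = parallel_efficiency(speedup, 256)
    print(f"speedup {speedup}X on 256 processors: efficiency {eff:.0%}")
```

Under this definition the 53X and 84X speedups correspond to the reported 21% and 33% efficiencies.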

2.
The abundance of parallel and distributed computing platforms, such as MPP, SMP, and the Beowulf clusters, to name just a few, has added many more possibilities and challenges to high performance computing (HPC), parallel I/O, mass data storage, scalable architectures, and large-scale simulations, which traditionally belong to the realm of custom-tailored parallel systems. The intent of this special issue is to discuss problems and solutions, to identify new issues, and to help shape future research directions in these areas. From these perspectives, this special issue addresses the problems encountered at the hardware, architectural, and application levels, while providing conceptual as well as empirical treatments to the current issues in high performance computing, and the I/O architectures and systems utilized therein.

3.
Effective task scheduling is essential for obtaining high performance in heterogeneous distributed computing systems (HeDCSs). However, finding an effective task schedule in HeDCSs requires the consideration of both the heterogeneity of processors and high interprocessor communication overhead, which results from non-trivial data movement between tasks scheduled on different processors. In this paper, we present a new high-performance scheduling algorithm, called the longest dynamic critical path (LDCP) algorithm, for HeDCSs with a bounded number of processors. The LDCP algorithm is a list-based scheduling algorithm that uses a new attribute to efficiently select tasks for scheduling in HeDCSs. The efficient selection of tasks enables the LDCP algorithm to generate high-quality task schedules in a heterogeneous computing environment. The performance of the LDCP algorithm is compared to two of the best existing scheduling algorithms for HeDCSs: the HEFT and DLS algorithms. The comparison study shows that the LDCP algorithm outperforms the HEFT and DLS algorithms in terms of schedule length and speedup. Moreover, the improvement in performance obtained by the LDCP algorithm over the HEFT and DLS algorithms increases as the inter-task communication cost increases. Therefore, the LDCP algorithm provides a practical solution for scheduling parallel applications with high communication costs in HeDCSs.
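The abstract does not define the LDCP selection attribute, so it cannot be reproduced here. To illustrate what a list-based scheduler on heterogeneous processors looks like in general, the sketch below uses a HEFT-style upward rank for task priority and earliest-finish-time processor selection, without HEFT's insertion policy. The task graph, cost tables, and the name `list_schedule` are all made-up assumptions, and every task is assumed to have a positive run time.

```python
from functools import lru_cache


def list_schedule(comp, succ, comm):
    """Generic list scheduling (HEFT-style ranking, NOT the LDCP attribute).

    comp[t][p]   : run time of task t on processor p (assumed > 0)
    succ[t]      : list of successor tasks of t
    comm[(u, v)] : transfer time from u to v when they run on different processors
    Returns {task: (processor, start, finish)}.
    """
    tasks = list(comp)
    n_proc = len(next(iter(comp.values())))
    pred = {t: [] for t in tasks}
    for u in tasks:
        for v in succ.get(u, []):
            pred[v].append(u)

    @lru_cache(maxsize=None)
    def rank(t):
        # Upward rank: average run time plus the costliest path to an exit task.
        avg = sum(comp[t]) / n_proc
        tail = max((comm.get((t, s), 0) + rank(s) for s in succ.get(t, [])), default=0)
        return avg + tail

    proc_free = [0.0] * n_proc  # time at which each processor becomes idle
    placed = {}
    # Decreasing rank order places every task after all of its predecessors.
    for t in sorted(tasks, key=rank, reverse=True):
        best = None
        for p in range(n_proc):
            # Data from a predecessor on another processor pays the transfer cost.
            ready = max(
                (placed[u][2] + (0 if placed[u][0] == p else comm.get((u, t), 0))
                 for u in pred[t]),
                default=0.0,
            )
            start = max(ready, proc_free[p])
            finish = start + comp[t][p]
            if best is None or finish < best[2]:
                best = (p, start, finish)
        placed[t] = best
        proc_free[best[0]] = best[2]
    return placed


# Hypothetical diamond-shaped task graph on two processors.
comp = {"A": [2, 3], "B": [3, 2], "C": [4, 4], "D": [2, 1]}
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
comm = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
schedule = list_schedule(comp, succ, comm)
makespan = max(finish for _, _, finish in schedule.values())
print(schedule, makespan)
```

The communication term in the ready-time computation is where higher inter-task communication costs push dependent tasks onto the same processor.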

4.
Heterogeneous network-based distributed and parallel computing is gaining increasing acceptance as an alternative or complementary paradigm to multiprocessor-based parallel processing as well as to conventional supercomputing. While algorithmic and programming aspects of heterogeneous concurrent computing are similar to their parallel processing counterparts, system issues, partitioning and scheduling, and performance aspects are significantly different. In this paper, we discuss the evolution of heterogeneous concurrent computing, in the context of the parallel virtual machine (PVM) system, a widely adopted software system for network computing. In particular, we highlight the system level infrastructures that are required, aspects of parallel algorithm development that most affect performance, system capabilities and limitations, and tools and methodologies for effective computing in heterogeneous networked environments. We also present recent developments and experiences in the PVM project, and comment on ongoing and future work.

5.
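Addressing the control, invocation, and data-access relationships between processes in a software system, this paper analyzes the degree of coupling between processes and gives a method for determining the restart correlation between processes together with rules for constructing the system restart tree. Drawing on the principles and characteristics of DNA computing, it presents a DNA computing model for determining inter-process restart correlation and outlines a preliminary restart strategy, providing support for intelligent, fine-grained software rejuvenation.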

6.
The aim of the proposed fault tolerant model is to attain reliability and high performance for distributed computing on the Internet. The novelty of this model lies in the integration of three unique schemes that work in unison within a single framework. These three schemes are consecutive message transmission, adaptive buffer control, and message balancing. Message balancing essentially seeks to ensure that each message queue is served by the processor for an interval that depends on the current length of the queue. In the experiments, only two parameters, the current buffer length and the rate of change of the actual queue length, were used for proportional and derivative feedback control of adaptive buffer management. Test results have indicated clearly that the model goes a considerable way towards achieving the stated aim.
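The abstract names the two inputs of the proportional and derivative controller but gives neither the control law nor the gains. The sketch below is one plausible form under stated assumptions: the gains `kp` and `kd`, the `headroom` factor, and the function name are all hypothetical and are not taken from the paper.

```python
def pd_buffer_adjustment(buffer_len, queue_len, prev_queue_len,
                         kp=0.5, kd=0.25, headroom=1.2, min_len=1):
    """One control step of a hypothetical PD rule for buffer sizing.

    Proportional term: gap between the desired size (queue length plus
    headroom) and the current buffer length.
    Derivative term: change in queue length since the previous step.
    """
    error = headroom * queue_len - buffer_len
    rate = queue_len - prev_queue_len
    new_len = buffer_len + kp * error + kd * rate
    return max(min_len, round(new_len))


# A growing queue enlarges the buffer; a shrinking queue reduces it.
print(pd_buffer_adjustment(100, 100, 80))
print(pd_buffer_adjustment(100, 50, 70))
```

The derivative term lets the buffer react to a queue that is still growing before the proportional term alone would have caught up.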

7.
The paper presents a new approach for the introduction of computational science into high level school curricula. It also discusses a set of real life problems that are appropriate for these curricula because they can be described through simple models. The computer-based simulation of these systems requires an ad hoc environment, including a programming language, suitable for this target age. The paper proposes a new environment, the ORESPICS environment, including a new programming language. The sequential part of the language integrates the classical imperative constructs with a simple set of graphical primitives, mostly taken from the LOGO language. The concurrent part of the language is based on the message passing paradigm. The solutions of some classical problems through ORESPICS are shown.

8.
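Li Chunyan, Zhang Xuejie. 《计算机应用》 (Journal of Computer Applications), 2013, 33(12): 3580-3585
Cloud computing is a new model of Internet resource utilization that provides a variety of IT services, and it has been widely applied in many fields, including high performance computing. However, virtualization introduces some performance overhead; moreover, because cloud platforms implement virtualization differently, the performance of high performance computing services varies widely across these platforms. Using the HPC Challenge (HPCC) Benchmark and the NAS Parallel Benchmark (NPB), the CPU, memory, network, scalability, and real high performance computing workloads were evaluated, and the high performance computing performance of Nimbus, OpenNebula, and OpenStack was compared and analyzed. The experiments show that OpenStack performs better for compute-intensive high performance application workloads; OpenStack is therefore a good choice of open-source cloud platform for high performance computing.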

9.
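You Hongtao, Zhang Yanyuan, Lin Yi, Liu Sheng. 《计算机科学》 (Computer Science), 2013, 40(Z6): 112-114, 148
With the explosive growth of digital information, storage technology has become a new driving force for the IT industry. As storage systems keep growing in scale, energy efficiency becomes an increasingly prominent problem: it raises the cost of operating, maintaining, and cooling the system, reduces system reliability and scalability, and aggravates pollution of the environment around the storage system. Research on storage energy efficiency therefore has considerable economic value and practical significance. This paper reviews the progress and current state of research on disk energy efficiency in storage systems and, starting from semantic information, designs and implements a driver based on semantic information. Experiments show that the driver effectively reduces disk energy consumption, improves storage system I/O performance, and optimizes storage energy efficiency.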

10.
Atomistic simulations of thin film deposition, based on the lattice Monte Carlo method, provide insights into the microstructure evolution at the atomic level. However, large-scale atomistic simulation is limited on a single computer because of memory and speed constraints. Parallel computation, although promising in memory and speed, has not been widely applied in these simulations because of the intimidating overhead. The key issue in achieving optimal performance is, therefore, to reduce communication overhead among processors. In this paper, we propose a new parallel algorithm for the simulation of large-scale thin film deposition incorporating two optimization strategies: (1) domain decomposition with sub-domain overlapping and (2) asynchronous communication. This algorithm was implemented both on message-passing-processor systems (MPP) and on cluster computers. We found that both architectures are suitable for parallel Monte Carlo simulation of thin film deposition in either a distributed memory mode or a shared memory mode with message-passing libraries.

11.
In this paper we examine the performance of parallel approximate inverse preconditioning for solving finite element systems, using a variety of clusters containing the Message Passing Interface (MPI) communication library, the Globus toolkit and the Open MPI open-source software. The techniques outlined in this paper contain parameters that can be varied so as to tune the execution to the underlying platform. These parameters include the number of CPUs, the order of the linear system (n) and the "retention parameter" (δl) of the approximate inverse used as a preconditioner. Numerical results are presented for solving finite element sparse linear systems on platforms with various CPU types and counts, different compilers, different file system types, different MPI implementations and different memory sizes.
J. P. Morrison

12.
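Because the energy of wireless sensor networks is limited, optimizing network energy consumption and evaluating network lifetime are the primary challenges in current wireless sensor network research. Based on an analysis of the energy consumption characteristics of wireless sensor networks, this paper surveys energy optimization strategies for sensor nodes and network systems. In view of the shortcomings of energy optimization, it analyzes the energy consumption modeling work for wireless sensor networks that has emerged in recent years, and it summarizes the modeling methods of wireless sensor network energy consumption models from the perspectives of wireless communication, state transition, protocol stack, and others. It points out that cross-layer energy optimization and combined hardware/software energy consumption modeling are the focus of energy research on wireless sensor networks.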

13.
From microarrays and next generation sequencing to clinical records, the amount of biomedical data is growing at an exponential rate. Handling and analyzing these large amounts of data demands that computing power and methodologies keep pace. The goal of this paper is to illustrate how high performance computing methods in SAS can be easily implemented without the need of extensive computer programming knowledge or access to supercomputing clusters to help address the challenges posed by large biomedical datasets. We illustrate the utility of database connectivity, pipeline parallelism, multi-core parallel process and distributed processing across multiple machines. Simulation results are presented for parallel and distributed processing. Finally, a discussion of the costs and benefits of such methods compared to traditional HPC supercomputing clusters is given.

14.
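Energy consumption modeling and simulation of VAV central air conditioning   Total citations: 2, self-citations: 0, citations by others: 2
Central air conditioning is a complex nonlinear system with time delay. A central air-conditioning system has great energy-saving potential during operation, and research on its energy-saving optimization should be based on an energy consumption model. Based on the mathematical energy consumption models of the individual devices of a VAV central air-conditioning system, and taking into account the coupling between those devices, a simulation model reflecting the relationship between the operating variables and the system energy consumption was built with the Simulink toolbox in MATLAB. Simulations were run, and the results were analyzed and verified. The model can be used in parameter optimization studies for central air-conditioning energy saving and is of significance for energy-saving optimal control of central air conditioning.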

15.
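Taking into account transition overhead and inter-core communication overhead, an overhead-aware integrated energy-saving method for real-time multi-core embedded systems is proposed for applications with time constraints and dependent tasks on variable-voltage multi-core processors. After using the RDAG algorithm to make the tasks independent, the method effectively combines dynamic power management, adaptive body biasing, and dynamic voltage scaling. Simulation experiments on 2 and 3 processor cores, using several random task graphs and task graphs representing real applications, show that the proposed method outperforms the original method.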

16.
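To develop an efficient energy management strategy for solar racing cars, a full-vehicle model of a solar racing car was established that accounts for time-varying parameters such as photovoltaic cell temperature, air density, tire pressure, and suspension damping. An online road-grade estimation algorithm based on recursive least squares with a forgetting factor was designed, and a full-vehicle MATLAB/Simulink model was built. The effects of changes in cruising speed, vehicle mass, and tire pressure on the vehicle's energy consumption characteristics were studied by simulation and verified experimentally. The experimental results show that the model faithfully reflects the influence of vehicle and environmental parameter changes on vehicle energy consumption, laying a foundation for the subsequent development of energy strategies for solar racing cars.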
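The abstract does not give the vehicle model used for grade estimation, so the sketch below only shows the generic estimator it names: recursive least squares with a forgetting factor, here for a single scalar parameter in the model y = θ·x. The synthetic data, the forgetting factor of 0.95, and the function name `rls_scalar` are assumptions made for illustration.

```python
def rls_scalar(xs, ys, forgetting=0.95, theta0=0.0, p0=1000.0):
    """Scalar recursive least squares with a forgetting factor.

    Estimates theta in y = theta * x one sample at a time. A forgetting
    factor below 1 discounts old samples so the estimate can follow a
    slowly changing parameter. Returns the estimate after each sample.
    """
    theta, p = theta0, p0
    history = []
    for x, y in zip(xs, ys):
        gain = p * x / (forgetting + x * p * x)   # update gain
        theta = theta + gain * (y - x * theta)    # correct by the prediction error
        p = (p - gain * x * p) / forgetting       # covariance update with forgetting
        history.append(theta)
    return history


# Synthetic, noise-free data: the true parameter steps from 0.05 to 0.10.
xs = [1.0] * 200
ys = [0.05] * 100 + [0.10] * 100
estimates = rls_scalar(xs, ys)
print(estimates[99], estimates[-1])
```

A smaller forgetting factor tracks changes faster but makes the estimate more sensitive to measurement noise, which this noise-free example does not show.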

17.
This paper describes the FPGA implementation of FastCrypto, which extends a general-purpose processor with a crypto coprocessor for encrypting/decrypting data. Moreover, it studies the trade-offs between FastCrypto performance and design parameters, including the number of stages per round, the number of parallel Advanced Encryption Standard (AES) pipelines, and the size of the queues. Besides, it shows the effect of memory latency on the FastCrypto performance. FastCrypto is implemented with VHDL programming language on Xilinx Virtex V FPGA. A throughput of 222 Gb/s at 444 MHz can be achieved on four parallel AES pipelines. To reduce the power consumption, the frequency of four parallel AES pipelines is reduced to 100 MHz while the other components are running at 400 MHz. In this case, our results show a FastCrypto performance of 61.725 bits per clock cycle (b/cc) when 128-bit single-port L2 cache memory is used. However, increasing the memory bus width to 256-bit or using 128-bit dual-port memory improves the performance to 112.5 b/cc (45 Gb/s at 400 MHz), which represents 88% of the ideal performance (128 b/cc).
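The relation between bits per clock cycle and throughput in the last sentence is a unit conversion. The short sketch below reproduces it; the function names are illustrative, and only the 112.5 b/cc, 400 MHz, and 128 b/cc figures come from the abstract.

```python
def throughput_gbps(bits_per_cycle: float, clock_mhz: float) -> float:
    """Throughput in Gb/s from bits per clock cycle and clock frequency in MHz."""
    return bits_per_cycle * clock_mhz * 1e6 / 1e9


def fraction_of_ideal(bits_per_cycle: float, ideal_bits_per_cycle: float = 128.0) -> float:
    """Share of the ideal per-cycle performance that is actually achieved."""
    return bits_per_cycle / ideal_bits_per_cycle


print(throughput_gbps(112.5, 400))        # Gb/s at 400 MHz
print(f"{fraction_of_ideal(112.5):.0%}")  # share of the 128 b/cc ideal
```

With the abstract's numbers this gives 45 Gb/s and about 88% of the ideal, matching the reported values.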

18.
19.
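Energy-efficient scheduling algorithm for multiple directed acyclic graph workflows in cloud computing environments   Total citations: 1, self-citations: 0, citations by others: 1
Liu Danqi, Yu Jiong, Ying Changtian. 《计算机应用》 (Journal of Computer Applications), 2013, 33(9): 2410-2415
To address problems in energy-saving scheduling algorithms for multiple directed acyclic graph (DAG) workflows, namely poor energy savings, narrow applicability, and the inability to also optimize performance, a new energy-saving scheduling method for multi-DAG workflows, called MREO, is proposed. Based on an analysis of the characteristics of compute-intensive and communication-intensive tasks, MREO reduces the number of processors by consolidating independent tasks, and it uses backtracking and branch-and-bound algorithms to dynamically optimize the selection of task consolidation paths, effectively reducing the complexity of the consolidation algorithm. Experimental results show that, while maintaining the performance of multi-DAG workflows, MREO effectively reduces the system's computation and communication energy overhead and achieves good energy savings.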

20.
Since its release, the Java programming language has attracted considerable attention from the high‐performance computing (HPC) community because of its portability, high programming productivity, and built‐in multithreading and networking support. As a consequence, several initiatives have been taken to develop a high‐performance Java message‐passing library to program distributed memory architectures, such as clusters. The performance of Java message‐passing applications relies heavily on the communications performance. Thus, the design and implementation of low‐level communication devices that support message‐passing libraries is an important research issue in Java for HPC. MPJ Express is our Java message‐passing implementation for developing high‐performance parallel Java applications. Its public release currently contains three communication devices: the first one is built using the Java New Input/Output (NIO) package for the TCP/IP; the second one is specifically designed for the Myrinet Express library on Myrinet; and the third one supports thread‐based shared memory communications. Although these devices have been successfully deployed in many production environments, previous performance evaluations of MPJ Express suggest that the buffering layer, tightly coupled with these devices, incurs a certain degree of copying overhead, which represents one of the main performance penalties. This paper presents a more efficient Java message‐passing communications device, based on Java Input/Output sockets, that avoids this buffering overhead. Moreover, this device implements several strategies, both in the communication protocol and in the HPC hardware support, which optimize Java message‐passing communications.
In order to evaluate its benefits, this paper analyzes the performance of this device comparatively with other Java and native message‐passing libraries on various high‐speed networks, such as Gigabit Ethernet, Scalable Coherent Interface, Myrinet, and InfiniBand, as well as on a shared memory multicore scenario. The reported communication overhead reduction encourages the upcoming incorporation of this device in MPJ Express (http://mpj-express.org). Copyright © 2011 John Wiley & Sons, Ltd.
