1.

李国燕顾军华宋庆增陆益财周博君《计算机测量与控制》2013,(1):250-253

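Traditional Poisson equation solvers suffer from low execution efficiency and high power consumption, which makes them hard to use in practice. To address this, a fast Poisson equation solver is designed for an FPGA hardware platform. The design combines software and hardware: software handles tasks with complex control and light computation, while hardware handles tasks with simple control and heavy computation, which achieves hardware acceleration. On the FPGA platform, an independently designed FFT coprocessor processes data in a pipelined and highly parallel manner, improving solver performance. Experimental results show that the hardware implementation of the FFT-based fast Poisson solver delivers high computational performance and good scalability.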
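As a rough illustration of the transform-based idea behind such a solver (not the paper's design), the sketch below solves the 1D problem -u'' = f with zero boundary values by diagonalizing the second-difference operator in a sine basis. The function names, the 1D setting, and the naive O(n²) sine transform are assumptions made for brevity; a fast solver would compute the same transform with an FFT in O(n log n).

```python
import math


def sine_transform(values):
    """Naive DST-I: out[k] = sum_j values[j] * sin((j+1)(k+1)*pi/(n+1))."""
    n = len(values)
    return [
        sum(values[j] * math.sin(math.pi * (j + 1) * (k + 1) / (n + 1))
            for j in range(n))
        for k in range(n)
    ]


def solve_poisson_1d(f, h):
    """Solve (-u[i-1] + 2u[i] - u[i+1]) / h^2 = f[i] with u = 0 at both ends.

    f holds the right-hand side at the n interior grid points.
    """
    n = len(f)
    f_hat = sine_transform(f)
    # In the sine basis the second-difference operator is diagonal with
    # eigenvalues (2 - 2*cos((k+1)*pi/(n+1))) / h^2, so solving is a division.
    u_hat = [
        f_hat[k] * h * h / (2.0 - 2.0 * math.cos(math.pi * (k + 1) / (n + 1)))
        for k in range(n)
    ]
    # DST-I is its own inverse up to the factor 2/(n+1).
    return [2.0 / (n + 1) * v for v in sine_transform(u_hat)]


# Illustrative problem (hypothetical): f = 1 on (0, 1), exact u(x) = x(1-x)/2.
n = 15
h = 1.0 / (n + 1)
f = [1.0] * n
u = solve_poisson_1d(f, h)
```

Because the exact solution here is quadratic, the discrete solution matches x(1-x)/2 at the grid points up to rounding error, which gives a simple check of the sketch.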

2.

A binary translation system for heterogeneous multi-robot environments

蔡慧玲刘博《微型电脑应用》2010,26(2):18-20

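This article proposes a dynamic binary translation system based on hardware/software co-design. At the hardware layer, a virtual machine coprocessor implements part of the critical path in the execution flow of the dynamic binary translation system. Tight coupling between software and hardware manages the coexistence of the virtual machine and the original system. The system can be used to resolve code compatibility problems between heterogeneous robots that arise from differing architectures. Measurements show a clear performance improvement over a pure software solution.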

3.

Distributed model predictive control based on a series structure (total citations: 2; self-citations: 0; citations by others: 2)

蔡星谢磊苏宏业古勇《自动化学报》2013,39(5):510-518

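Distributed model predictive control (DMPC) is a class of control methods for large-scale multi-input multi-output systems, in which agents cooperate to control the whole system. Existing DMPC algorithms fall into iterative and non-iterative types. Iterative algorithms, once converged, match the performance of centralized model predictive control (CMPC), but they need many iterations and heavy communication between subsystems. Non-iterative algorithms avoid iteration at some cost in performance. This paper proposes a non-iterative DMPC algorithm based on a series structure. The algorithm effectively reduces computation in series-structured systems, and its effectiveness is verified by simulation on a series process, the alumina continuous carbonation decomposition process (ACCDP). The paper also analyzes the algorithm's performance under a series structure and proves its stability.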

4.

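Application of a floating-point coprocessor in an embedded integrated navigation computer (total citations: 1; self-citations: 1; citations by others: 0)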

孙炼赵伟刘建业《计算机测量与控制》2008,16(4):555-557

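To improve the floating-point performance of the navigation computer and meet the real-time requirements of an integrated navigation system, a coprocessor unit dedicated to floating-point operations is designed from the on-chip logic resources of a newer FPGA in an FPGA-based embedded navigation computer, so that the floating-point operations of integrated navigation execute in hardware. To exploit the coprocessor fully, the integrated navigation software is also optimized, so that the hardware and software performance of the embedded navigation computer improve together. Tests with real navigation data show that floating-point performance improves substantially and that the expected real-time improvement is achieved.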

5.

Research on a space-time adaptive anti-jamming algorithm based on CG-MSNWF

潘延明卢艳娥骆艳卜李思佳《电子技术应用》2011,(10)

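The multi-stage nested Wiener filter (MSNWF) algorithm is computationally expensive and cannot meet real-time requirements when processing high-dimensional data. To address this, a multi-stage Wiener filter algorithm based on the conjugate gradient (CG) method, namely CG-MSNWF, is adopted. For the same interference suppression performance, the algorithm needs no backward iteration, unlike MSNWF. This reduces the computational load and the demands on hardware memory, speeds up convergence, and meets the real-time requirement. Simulation results confirm the effectiveness of the algorithm.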

6.

Design and implementation of a Java obfuscator based on a random insertion strategy

宋亚奇李莉《计算机工程与设计》2009,30(4)

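Control-flow obfuscation obscures a program's execution flow to hinder reverse engineering, but the obfuscated program usually grows considerably in both code size and execution time. A random insertion obfuscation strategy is proposed that combines a branch insertion algorithm with a loop-condition insertion algorithm and introduces a random function to limit code insertion, thereby controlling the growth in code length. A control-flow obfuscation tool for Java bytecode is designed and implemented with BCEL. It supports iterative obfuscation of Java bytecode, and its results are to some degree non-reproducible. Experimental results show that the strategy effectively limits the performance overhead of obfuscating transformations while still effectively resisting reverse-engineering attacks.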

7.

FPGA design and implementation of sparse matrix-vector multiplication

宋庆增顾军华《计算机工程》2011,37(23):214-216

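To address the low efficiency of sparse matrix-vector multiplication on traditional general-purpose processor (GPP) platforms, an SpMXV coprocessor design based on a reconfigurable computing platform is proposed. The design uses a highly pipelined data flow with a binary-tree structure, the IEEE-754 32-bit floating-point data format, and a diagonal storage format. The data path is organized as a pipeline to optimize computational performance. Simulation results show that the hardware design achieves a speedup of up to 2.69 times over a software implementation on a GPP platform.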
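A minimal software sketch of sparse matrix-vector multiplication in a diagonal storage format may help make the data layout concrete. The layout below is an assumption, not the paper's: each stored diagonal is indexed by row, so `data[d][i]` holds `A[i][i + offsets[d]]`, with padding where the diagonal leaves the matrix. The function name `spmv_dia` is hypothetical.

```python
def spmv_dia(offsets, data, x):
    """Compute y = A x for a square matrix stored by diagonals.

    offsets[d] is the diagonal's distance from the main diagonal
    (negative below, positive above); data[d][i] is A[i][i + offsets[d]].
    """
    n = len(x)
    y = [0.0] * n
    for offset, diagonal in zip(offsets, data):
        for i in range(n):
            j = i + offset
            # Skip padding entries that fall outside the matrix.
            if 0 <= j < n:
                y[i] += diagonal[i] * x[j]
    return y


# Illustrative 4x4 tridiagonal matrix with 2 on the main diagonal and -1 beside it.
offsets = [-1, 0, 1]
data = [
    [0.0, -1.0, -1.0, -1.0],  # sub-diagonal; the row-0 entry is padding
    [2.0, 2.0, 2.0, 2.0],     # main diagonal
    [-1.0, -1.0, -1.0, 0.0],  # super-diagonal; the row-3 entry is padding
]
x = [1.0, 2.0, 3.0, 4.0]
y = spmv_dia(offsets, data, x)
```

Within one diagonal the accesses to `diagonal`, `x`, and `y` are all contiguous and every row update is independent, which is one reason a diagonal format can suit a pipelined data path.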

8.

A parallel method for Intel Many Integrated Core based on JNI and C++

桑喆 邓川 苟聪 刘开兴 白明泽 《计算机与现代化》2018,(4):32

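The Intel Many Integrated Core (MIC) coprocessor currently supports parallel computing only in C, C++, and Fortran, and offers no high-performance computing support for existing Java programs. To address this, a hybrid MIC parallel computing method based on Java Native Interface (JNI) technology and C++ is proposed. The method uses JNI to build a data exchange mechanism between Java code and C++ code, which makes it possible to accelerate Java applications with the MIC coprocessor's floating-point capability. Experiments analyze the computational performance of Java programs that use MIC multithreaded parallelism. The results show that the method makes effective use of the MIC coprocessor and significantly improves the computational performance of Java programs.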

9.

A coprocessor for encryption and decryption flow control

王剑非马德黄凯杰陈亮黄凯葛海通《计算机系统应用》2013,22(11):204-208,217

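This paper designs and implements a coprocessor dedicated to controlling encryption and decryption flows. Based on specific application requirements, the coprocessor defines a compact custom 8-bit instruction set while using a 32-bit data width consistent with the SoC system. It uses a three-stage pipeline, with data bypassing to resolve data hazards in the pipeline. Joint simulation with encryption and decryption algorithm IP verifies that the coprocessor can flexibly carry out flow control. An SM1 encryption experiment shows that the coprocessor delivers better performance than the main processor while freeing a large share of main-processor resources, significantly improving SoC performance. Finally, DC synthesis results show that the coprocessor occupies only a small area.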

10.

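Integrated design of division and square root computation in an embedded coprocessor (total citations: 2; self-citations: 0; citations by others: 2)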

梁政沈绪榜《计算机研究与发展》2001,38(8):1016-1020

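Implementing division and square root serially in a floating-point processing element is slow, but the design is simple and regular and uses few resources, which suits embedded applications. Drawing on the development of the embedded coprocessor LSC87, this paper presents a radix-4 SRT algorithm for serial division and square root computation. It describes a method for verifying the uncertain regions that arise when determining the SRT selection constants, gives a radix-4 SRT lookup table design shared by division and square root, and discusses converting the redundant iteration result to non-redundant binary. The coprocessor design makes maximum use of the general data path to implement the SRT algorithm, which saves design resources and shortens iteration time.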

11.

A pipelined-loop-compatible architecture and algorithm to reduce variable-length sets of floating-point data on a reconfigurable computer

Gerald R. Morris Viktor K. Prasanna 《Journal of Parallel and Distributed Computing》2008

12.

Optimized design for hardware simulation of a TPC decoder

郭丽蒋卓勤《电子技术应用》2007,33(12):45-47

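A hardware design for an iterative TPC decoder is introduced. Based on soft-decision decoding rules, with a fully parallel and regular decoding structure, and written in the VHDL hardware description language, it implements an iterative decoder for a rate-1/2 (8,4) two-dimensional product code. In particular, hardware test stimuli are used to measure the decoder's bit error rate in real time, and an optimized design is proposed that greatly improves simulation efficiency compared with traditional hardware simulation methods. Simulation results show that the decoder is highly practical and flexible.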

13.

Implementation and performance evaluation of a distributed conjugate gradient method in a cloud computing environment

Leila Ismail Rajeev Barua 《Software》2013,43(3):281-304

Cloud computing is an emerging technology where information technology resources are provisioned to users as a set of unified computing resources on a pay-per-use basis. The resources are dynamically chosen to satisfy a user service level agreement and a required level of performance. A cloud is seen as a computing platform for heavy load applications. The conjugate gradient (CG) method is an iterative linear solver that is used by many scientific and engineering applications to solve a linear system of algebraic equations. CG generates a heavy load of computation, and therefore, it slows the performance of the applications using it. Distributing CG is considered as a way to increase its performance. However, running a distributed CG, based on a standard API, such as Message Passing Interface, in a cloud faces many challenges, such as the cloud processing and networking capabilities. In this work, we present an in‐depth analysis of the CG algorithm and its complexity to develop adequate distributed algorithms. The implementation of these algorithms and their evaluation in our cloud environment reveal the gains and losses achieved by distributing the CG. The performance results show that despite the complexity of the CG processing and communication, a speedup gain of at least 1157.7 is obtained using 128 cores compared with National Aeronautics and Space Administration Advanced Supercomputing sequential execution. Given the emergence of clouds, the results in this paper analyze performance issues when a generic public cloud, along with a standard development library, such as Message Passing Interface, is used for high‐performance applications, without the need for specialized hardware and software. Copyright © 2012 John Wiley & Sons, Ltd.
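For readers unfamiliar with the method, here is a minimal sequential sketch of the textbook conjugate gradient iteration for a symmetric positive-definite system. It is not the distributed algorithm evaluated in the paper; the function name, tolerance, and dense list-of-lists matrix are assumptions for illustration. The matrix-vector product and the dot products are the operations a distributed version would typically partition across workers.

```python
def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    x = [0.0] * n
    r = list(b)          # residual b - A x for the initial guess x = 0
    p = list(r)          # first search direction
    rs_old = dot(r, r)
    for _ in range(max_iter):
        if rs_old ** 0.5 < tol:
            break        # residual norm is small enough
        Ap = matvec(p)
        alpha = rs_old / dot(p, Ap)                     # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        # Next direction is conjugate to the previous ones with respect to A.
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Illustrative 2x2 system (hypothetical); the exact solution is [1/11, 7/11].
A = [[4.0, 1.0],
     [1.0, 3.0]]
b = [1.0, 2.0]
solution = conjugate_gradient(A, b)
```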

14.

Research on an FPGA-based Jacobi iterative solver (total citations: 1; self-citations: 0; citations by others: 1)

宋庆增顾军华张金珠《计算机工程与应用》2011,47(29):74-77

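Hardware acceleration of specific numerical algorithms is one of the current trends in computer architecture. Jacobi iteration is a typical iterative numerical algorithm. Because software Jacobi solvers are slow and have poor real-time behavior, a hardware Jacobi iterative solver is designed and implemented on an FPGA platform. The solver uses a highly parallel, pipelined data path and an optimized reduction circuit. It exploits both the inherent parallelism of Jacobi iteration and the concurrent structure of the FPGA, which effectively improves solver performance. Experimental results show that the Jacobi solver has good scalability and high computational performance.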
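The parallelism the abstract refers to is visible in a plain software version of Jacobi iteration. The sketch below is a generic textbook formulation, not the paper's hardware design; the function name, tolerance, and test matrix are assumptions. Each new component depends only on the previous iterate, so all n updates in a sweep are independent, and the inner sum is the reduction step.

```python
def jacobi_solve(A, b, tol=1e-10, max_iter=10_000):
    """Solve A x = b by Jacobi iteration.

    Convergence is guaranteed when A is strictly diagonally dominant.
    Returns the solution and the number of sweeps performed.
    """
    n = len(b)
    x = [0.0] * n
    for iteration in range(1, max_iter + 1):
        # Each x_new[i] reads only the old iterate x, so the n updates
        # are independent of one another.
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        change = max(abs(x_new[i] - x[i]) for i in range(n))
        x = x_new
        if change < tol:
            return x, iteration
    raise RuntimeError("Jacobi iteration did not converge")


# Illustrative strictly diagonally dominant system; the exact solution is [1, 1, 1].
A = [[4.0, 1.0, 0.0],
     [1.0, 5.0, 2.0],
     [0.0, 2.0, 6.0]]
b = [5.0, 8.0, 8.0]
x, iterations = jacobi_solve(A, b)
```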

15.

Discrete adjoint gradient evaluations for linear stress and vibration analysis

Schwalbach Marc Verstraete Tom Gauger Nicolas R. 《Computing and Visualization in Science》2019,21(1-6):23-31

This paper presents methods used to perform discrete adjoint gradient evaluations for linear stress and vibration analysis. The methods are implemented within the framework of a discrete adjoint structural solver being developed for multidisciplinary adjoint optimizations of turbomachinery components. The code is differentiated using the algorithmic differentiation tool CoDiPack in tandem with manual treatment of the iterative solvers. Stress analysis leads to a linear system of equations that is typically solved by an iterative solver (e.g. GMRES). To ensure accuracy, the adjoint problem is formulated as a new linear system of equations to be solved. Vibration analysis results in a generalized eigenvalue problem that is also typically solved by an iterative solver. The adjoint problem takes out the generalized eigenvalue solve and replaces it by one outer product per eigenfrequency, leading to significantly cheaper eigenfrequency gradients for vibration analysis.

16.

Design of a Linux-based embedded remote monitoring system (total citations: 2; self-citations: 0; citations by others: 2)

艾红王洪涛《工业控制计算机》2008,21(8)

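A Linux-based embedded remote monitoring system is introduced. The system uses an ARM9 chip as its control solution and ports Linux, an open-source embedded operating system with a degree of real-time capability. The concrete software and hardware designs are also given. Experiments demonstrate the correctness and feasibility of the proposed system and its implementation.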

17.

Tailored least-squares solvers implementation for high-performance gravity field research

Oliver Baur 《Computers & Geosciences》2009,35(3):548-556

Least-squares (LS) problems occur in almost every scientific and engineering discipline. Basically, they are generated by providing more observations than unknown parameters to be resolved. The choice of an appropriate LS solver depends on both quality and computational issues. With regard to the latter, this paper focuses on the tailored parallel implementation of two LS solvers: the iterative LSQR method (representative of any Krylov-space method) and the “brute-force” inversion approach. Both implementations demonstrate very good scaling results in a parallel processing environment. Even so, the present investigations show that, from the computational and hardware point of view, iterative solvers outperform the “brute-force” approach. LSQR not only provides superior speed-up values; in addition, source code portability and hardware requirements are much more convenient for the iterative solver. These conclusions are drawn in the context of state-of-the-art terrestrial geopotential recovery with regard to the forthcoming Gravity field and steady-state Ocean Circulation Explorer (GOCE) satellite mission.

18.

Comparisons of air traffic control implementations on an associative processor with a MIMD and consequences for parallel computing

Man Yuan Johnnie W. Baker Will C. Meilander 《Journal of Parallel and Distributed Computing》2013

This paper has two complementary focuses. The first is the system design and algorithmic development for air traffic control (ATC) using an associative SIMD processor (AP). The second is the comparison of this implementation with a multiprocessor implementation and the implications of these comparisons. This paper demonstrates how one application, ATC, can more easily, more simply, and more efficiently be implemented on an AP than is generally possible on other types of traditional hardware. The AP implementation of ATC will take advantage of its deterministic hardware to use static scheduling. The software will be dramatically smaller and cheaper to create and maintain. Likewise, a large AP system will be considerably simpler and cheaper than the MIMD hardware currently used. While APs were used for ATC-type applications earlier, these are no longer available. We use a ClearSpeed CSX600 accelerator to emulate the AP solutions of ATC on an ATC prototype consisting of eight data-intensive ATC real-time tasks. Its performance is compared with an 8-core multiprocessor (MP) using OpenMP. Our extensive experiments show that the AP implementation meets all deadlines while the MP will regularly miss a large number of deadlines. The AP code will be similar in size to sequential code for the same tasks and will avoid all of the additional support software needed with an MP to handle dynamic scheduling, load balancing, shared resource management, race conditions, false sharing, etc. At this point, essentially only MIMD systems are built. Many of the advantages of using an AP to solve an ATC problem would carry over to other applications. AP solutions for a wide variety of applications will be cited in this paper.
Applications that involve a high degree of data parallelism, such as database management, text processing, image processing, graph processing, bioinformatics, weather modeling, and managing UAS (Unmanned Aircraft Systems or drones), are good candidates for AP solutions. This raises the issue of whether we should routinely consider using non-multiprocessor hardware like the AP for applications where substantially simpler software solutions will normally exist. It also raises the question of whether the use of both AP and MIMD hardware in a single heterogeneous system could provide more versatility and efficiency. Either the AP or MIMD could serve as the primary system, but could hand off jobs it could not handle efficiently to the other system.

19.

A paged measurement method for virtual machine process code based on hardware virtualization

蔡梦娟陈兴蜀金鑫赵成殷明勇《计算机应用》2018,38(2):305-309

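In cloud environments, malware can use a variety of means to tamper with critical business code in a virtual machine (VM), threatening its stable operation. Traditional host-based measurement systems are easily bypassed or disabled by attacks, and at the virtual machine monitor (VMM) layer it is difficult to obtain the complete code segment of a process running in a VM and verify its integrity. To address this, a paged measurement method for VM process code based on hardware virtualization is proposed. The method uses the Kernel-based Virtual Machine (KVM) as the VMM and captures system calls of VM processes at the VMM layer as the trigger for the measurement flow. It uses relative address offsets to resolve semantic differences between VM versions, so that the paged measurement method transparently verifies, at the VMM layer, the integrity of the code segments of processes running in the VM. The implemented prototype, the Virtual Machine Paging Measurement System (VMPMS), can effectively measure processes in a VM with a performance overhead within an acceptable range.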

20.

Research on the security of the computer basic input/output system

严霄凤《网络安全技术与应用》2013,(3):67-71

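The computer basic input/output system (BIOS) is the first set of programs executed after a computer is powered on and serves as the interface between hardware and software. Its functions include power-on self-test and initialization, hardware interrupt handling, and program service handling. It provides the lowest-level, most direct hardware control and plays a very important role in a computer system. Code running at the BIOS level has strong control over the computer system, and malicious damage to the BIOS may directly paralyze the entire hardware system. This paper introduces the characteristics of the BIOS, analyzes the security threats and risks it faces, and proposes measures for mitigating BIOS risk.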