
1.

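Huang Dongliang, Zhou Haibing, Gu Binggen 《硅谷》 (Silicon Valley) 2009,(11)

To speed up video matting, a GPU (graphics processing unit) accelerated video matting method is proposed. The chroma-key algorithm is recast as a texture-image rendering pass on the GPU, so that it exploits the GPU's parallel computation and high-speed floating-point arithmetic. This effectively improves the algorithm's computation speed.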
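The abstract gives no formulas, so the following is only a minimal sketch of per-pixel chroma keying, written in Python rather than as a GPU shader. It assumes a plain RGB distance to the key color and two hypothetical thresholds (`inner`, `outer`); none of these names or values come from the paper. The point it illustrates is that each output pixel depends only on the corresponding input pixels, which is the property a texture-rendering pass can exploit.

```python
def chroma_key_alpha(pixel, key=(0, 255, 0), inner=60.0, outer=140.0):
    """Return alpha in [0, 1]: 0 means background (key color), 1 means foreground.

    Assumption: Euclidean distance in RGB stands in for a real chroma metric.
    """
    dist = sum((p - k) ** 2 for p, k in zip(pixel, key)) ** 0.5
    if dist <= inner:
        return 0.0
    if dist >= outer:
        return 1.0
    # Linear ramp between the two thresholds gives a soft matte edge.
    return (dist - inner) / (outer - inner)


def composite(foreground, background, key=(0, 255, 0)):
    """Blend two equally sized RGB images (lists of rows of tuples)."""
    result = []
    for fg_row, bg_row in zip(foreground, background):
        row = []
        for fg, bg in zip(fg_row, bg_row):
            alpha = chroma_key_alpha(fg, key)
            # Every pixel is computed independently of its neighbours.
            row.append(tuple(round(alpha * f + (1 - alpha) * b)
                             for f, b in zip(fg, bg)))
        result.append(row)
    return result
```

On a GPU, the body of the inner loop would correspond to one fragment-shader invocation, with both images bound as textures.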

2.

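Yuan Bin 《工程图学学报》 (Journal of Engineering Graphics) 2010,31(3):76-83

Rapid advances in graphics hardware can be used to accelerate visualization. For non-uniform rectilinear grids, this paper presents a CPU ray-casting algorithm based on a uniform auxiliary grid, a GPU ray-casting algorithm based on an auxiliary texture, and a slice-based 3D-texture volume rendering algorithm, and tests them on an Nvidia GeForce 6800GT graphics card. The results show that the GPU algorithms are far faster than the CPU algorithm, and that slice-based 3D-texture volume rendering is faster than GPU ray casting.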

3.

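A Parallel Rendering Algorithm Based on Photon Mapping

Chen Hao, Liu Xiaoping 《工程图学学报》 (Journal of Engineering Graphics) 2009,30(3)

Photon mapping is a global-illumination rendering algorithm proposed to address shortcomings of ray tracing and radiosity. Because of its heavy computational load, however, it is hard to apply in settings with strict rendering-speed requirements, such as visual simulation and virtual reality. Parallelization is an important way to improve rendering efficiency. By comparing different ways of partitioning the rendering task, the authors identify a suitable partitioning scheme, and on that basis propose and implement, in an MPI environment, a parallel rendering algorithm for photon mapping. Experiments show that the algorithm adapts well to cluster computing environments and improves rendering speed to some extent.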

4.

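Parallel Implementation of Ultrasound Elastography on the GPU

Peng Bo, Chen Yong, Liu Dongquan 《光电工程》 (Opto-Electronic Engineering) 2013,(5):97-105

To increase the computation speed of ultrasound elastography, the authors propose using GPU hardware to accelerate elastography based on cross-correlation and on phase zero estimation. The paper first describes the implementation details and characteristics of the two techniques, then analyzes how the computation-intensive parts of each can be parallelized, and finally implements both GPU-based methods with the GPU development tool ArrayFire. The GPU-based methods are tested and validated on pre- and post-compression data frames obtained by simulation and by scanning an elastography phantom that mimics human tissue. The experiments show that the GPU-based approach greatly increases elastogram computation speed: when processing a single elastogram frame, the speedup for the cross-correlation-based method reaches 42, while the phase-zero-estimation method reaches a speedup of 65 when data throughput is increased.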
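As a rough illustration of the cross-correlation step, the sketch below estimates the integer shift of one window between a pre-compression and a post-compression signal by maximizing normalized cross-correlation. It is a plain-Python sketch under assumptions: the function names, the integer-lag search, and the window parameters are hypothetical and are not taken from the paper or from ArrayFire. Each window's search is independent of the others, which is why this kind of computation parallelizes well.

```python
import math


def ncc(a, b):
    """Normalized cross-correlation of two equal-length windows."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) *
                    sum((y - mean_b) ** 2 for y in b))
    return num / den if den else 0.0


def estimate_shift(pre, post, start, window, max_lag):
    """Return the integer lag in [-max_lag, max_lag] that best aligns
    pre[start:start + window] with the post-compression signal."""
    reference = pre[start:start + window]
    best_lag, best_score = 0, -2.0  # NCC is always >= -1
    for lag in range(-max_lag, max_lag + 1):
        lo = start + lag
        if lo < 0 or lo + window > len(post):
            continue  # candidate window would fall outside the signal
        score = ncc(reference, post[lo:lo + window])
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag
```

Repeating `estimate_shift` over many window positions yields a displacement profile; a strain image would then follow from its spatial derivative.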

5.

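Application of Texture Memory in a CUDA-Based Ray Tracing Implementation

Lu Jianyong, Jiao Liangbao 《中国新技术新产品》 (China New Technologies and Products) 2009,(23):40-41

CUDA is a development platform for general-purpose parallel computing created by NVIDIA, and it makes parallel algorithms convenient to program. Taking advantage of the inherent parallelism of ray tracing, this paper implements a parallel ray tracing algorithm on CUDA using a KD-tree acceleration structure. With optimized use of texture memory, interactive ray tracing can be achieved.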

6.

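A Fast GPU-Based Sobel Edge Detection Algorithm  Total citations: 2, self-citations: 1, citations by others: 1

Zuo Haorui, Zhang Qiheng, Xu Yong, Zhao Rujin 《光电工程》 (Opto-Electronic Engineering) 2009,36(1)

Existing optimizations and implementations of the traditional Sobel edge detection algorithm target common processors (CPU, DSP, FPGA, etc.) and are difficult to apply on graphics processors (GPUs). This paper proposes a fast Sobel edge detection algorithm for GPUs based on NVIDIA's CUDA architecture. Guided by the GPU's parallel structure and hardware characteristics, the fast algorithm uses three acceleration techniques (texture storage, multi-point access, and symmetric computation) to optimize the data storage layout, improve data access efficiency, and reduce algorithmic complexity. Experiments show that the fast algorithm makes full use of the GPU's parallel processing capability: on 8-bit grayscale images at 4096×4096 resolution it reaches 190 fps, 122 times the speed of the CPU-based implementation.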
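For reference, the baseline operator being accelerated can be sketched as follows. This is an ordinary serial Sobel filter in plain Python, not the paper's CUDA kernel, and it makes two simplifying assumptions that are not from the source: the gradient magnitude is approximated as |gx| + |gy|, and border pixels are left at zero.

```python
def sobel_magnitude(image):
    """Apply the 3x3 Sobel operator to a grayscale image (list of rows).

    Returns |gx| + |gy| per pixel; border pixels are left at 0.
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient: right column minus left column.
            gx = ((image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1])
                  - (image[y - 1][x - 1] + 2 * image[y][x - 1] + image[y + 1][x - 1]))
            # Vertical gradient: bottom row minus top row.
            gy = ((image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1])
                  - (image[y - 1][x - 1] + 2 * image[y - 1][x] + image[y - 1][x + 1]))
            out[y][x] = abs(gx) + abs(gy)
    return out
```

The two kernels share their four corner samples, and neighbouring output pixels share most of their inputs. That reuse is presumably the kind of redundancy that multi-point access and symmetric computation target, although the abstract does not spell out the details.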

7.

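A CUDA-Based Point-Matching Synthesis Algorithm

Fu Gang 《中国科技博览》 (China Science and Technology Review) 2014,(25):323-324

Texture synthesis plays an important role in computer animation production. To overcome the low efficiency of traditional serial point-matching texture synthesis, a parallel synthesis algorithm based on the Compute Unified Device Architecture (CUDA) is proposed. By carefully arranging data transfers between the CPU and the GPU and offloading the tedious, time-consuming computation to the GPU, the algorithm's efficiency is markedly improved.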

8.

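A Ray Tracing Acceleration Algorithm Based on Octree Neighborhood Analysis

Zhang Wensheng, Xie Qian, Zhong Jin, Liu Junping, Hao Qing, Guo Guangli 《工程图学学报》 (Journal of Engineering Graphics) 2015,(3)

The octree is a hierarchical partitioning structure commonly used to accelerate ray tracing. To speed up ray traversal of an octree, the paper studies how octree neighborhood analysis can accelerate collision detection between rays and octree nodes, and proposes a neighborhood analysis algorithm for octree nodes that is structurally simple and computationally more efficient. With this algorithm, the next node a ray hits can be computed quickly from the current one, which avoids a large number of recursive searches and thus increases image rendering speed. Experiments show that collision detection using the proposed neighborhood analysis improves efficiency more than threefold over the traditional algorithm, greatly increasing ray tracing speed.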

9.

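A Fast Grain-Shape Measurement Algorithm Based on Parallel Processing  Total citations: 1, self-citations: 1, citations by others: 0

Jiang Ni, Duan Lingfeng, Yang Wanneng, Liu Qian 《光电工程》 (Opto-Electronic Engineering) 2012,39(3):66-71

Grain shape is one of the important parameters that determine grain quality and yield. Traditional manual measurement of grain shape is time-consuming, labor-intensive, and subjective. This paper first introduces an automatic grain-shape measurement system based on line-array acquisition and industrial conveying technology. To improve the system's measurement efficiency, the authors apply graphics processing unit (GPU) parallel processing and optimize the measurement algorithm under the Compute Unified Device Architecture (CUDA). Experiments show that the GPU-based parallel algorithm effectively improves measurement efficiency: with nearly 2000 grains in an image, the optimized algorithm runs more than 400 times faster than the algorithm on a central processing unit (CPU), and the speedup becomes more pronounced as the number of grains in the acquired image increases.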

10.

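Real-Time Watercolor-Style Rendering and Animation on the GPU  Total citations: 1, self-citations: 0, citations by others: 1

Wang Miaoyi, Wang Bin, Yong Junhai 《工程图学学报》 (Journal of Engineering Graphics) 2012,33(3):73-80

The paper proposes a GPU-based method for rendering 3D scenes with a real-time watercolor effect. Most of the method is implemented with image-space techniques. The algorithm splits the picture into a detail layer, an ambient layer, and a stroke layer, renders each separately, and then composites them. Ambient occlusion, shadow mapping, and similar techniques are used for fast shadow computation, and image filters are used to simulate several of the main characteristics of watercolor. Because the method relies mainly on image-space techniques, it can use the GPU's parallel processing to accelerate the computation and reach real-time rendering speeds. Finally, an animation script analysis system is built for real-time animation rendering, showing that the method has considerable potential in digital entertainment fields such as computer animation and games.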

11.

A parallel built-in self-diagnostic method for nontraditional faults of embedded memory arrays

Arora V. Jone W.B. Huang D.C. Das S.R. 《IEEE transactions on instrumentation and measurement》2004,53(4):915-932

In this paper, we propose a built-in self-diagnostic march-based algorithm that identifies faulty memory cells based on a recently introduced nontraditional fault model. It is developed based on the DiagRSMarch algorithm, which is a diagnostic algorithm to identify traditional faults for embedded memory arrays. A minimal set of additional operations is added to DiagRSMarch for identifying the nontraditional faults without affecting the diagnostic coverage of the traditional faults. The embedded memory arrays are accessed using a bidirectional serial interfacing architecture which minimizes the routing overhead introduced by the diagnosis hardware. Using the concepts of the bidirectional interfacing technique, parallel testing, and redundant-tolerant operations, the diagnostic process can be accomplished efficiently at-speed with minimal hardware overhead.
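To show what a march-based memory test looks like in general, here is a small sketch that runs the well-known March C- sequence on a simulated bit-addressable memory with injected stuck-at faults. This is not DiagRSMarch and does not model the nontraditional faults or the serial interface from the abstract; the `FaultyMemory` class and its fault-injection scheme are hypothetical helpers for illustration only.

```python
class FaultyMemory:
    """Bit memory in which selected addresses are stuck at a fixed value."""

    def __init__(self, size, stuck_at=None):
        self.cells = [0] * size
        self.stuck_at = dict(stuck_at or {})  # address -> stuck value

    def write(self, addr, value):
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr):
        return self.stuck_at.get(addr, self.cells[addr])


# March C-: each element is (address order, operations applied per cell).
MARCH_C_MINUS = [
    ("up", [("w", 0)]),
    ("up", [("r", 0), ("w", 1)]),
    ("up", [("r", 1), ("w", 0)]),
    ("down", [("r", 0), ("w", 1)]),
    ("down", [("r", 1), ("w", 0)]),
    ("up", [("r", 0)]),
]


def run_march(memory, size, elements=MARCH_C_MINUS):
    """Run a march test and return the sorted addresses that misread."""
    faulty = set()
    for direction, operations in elements:
        addresses = range(size) if direction == "up" else range(size - 1, -1, -1)
        for addr in addresses:
            for op, value in operations:
                if op == "w":
                    memory.write(addr, value)
                elif memory.read(addr) != value:
                    faulty.add(addr)  # read returned an unexpected value
    return sorted(faulty)
```

A diagnostic variant differs from this pass/fail style mainly in that it keeps enough information to locate and classify each fault, which is what the added operations in the abstract are for.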

12.

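A Novel Parallel Rhombus-Chain Connected Model for Haptic Rendering  Total citations: 1, self-citations: 1, citations by others: 0

Zhang Xiaorui, Song Aiguo, Liu Jia, Li Jianqing 《高技术通讯》 (Chinese High Technology Letters) 2009,19(7)

To improve the accuracy and real-time performance of virtual haptic rendering, a novel physically based haptic deformation model built from parallel rhombus chains is proposed. In this model the rhombus lengths in the chain units change in equal proportion, so the computational load is small. Different soft bodies can be modeled conveniently by changing the rhombus lengths and angles in the chain units. The superposition of the relative displacements of the chain units is equivalent, externally, to the deformation of the object surface, and the resultant of the elastic forces of the attached springs is equivalent to the contact force on the object surface. Contact deformation of a soft body and real-time virtual haptic feedback were simulated with a hand controller. Experiments show that the proposed method is suitable for computing haptic feedback for soft bodies and can meet the requirements that fine manipulation places on virtual reality systems.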

13.

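Architecture Design and Optimization of an FPGA-Based Pencil-Drawing Rendering Algorithm

Zhang Jianghong, Zhao Yang, Zhang Xuejie, Xu Dan 《工程图学学报》 (Journal of Engineering Graphics) 2006,27(4):77-87

Because FPGAs combine the speed of hardware with the flexibility of software, they are increasingly used in image and video processing. At present, however, FPGAs are mainly used to process photorealistic images or video, and their application to non-photorealistic rendering remains unexplored. The authors therefore propose an FPGA-based method that turns an input image into a pencil drawing, or a real-time input video into a pencil-drawing video. The pencil-drawing generation algorithm is first analyzed for parallelism to obtain a version suited to FPGAs, and is then optimized with a hardware-specific pipelined multiplication technique to increase the processing speed of the hardware system.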

14.

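Parallel Computation of Structural Dynamic Response Based on the EBE Strategy

Nie Xutao, Fan Dapeng 《振动与冲击》 (Journal of Vibration and Shock) 2007,26(10):51-55

Parallel algorithms based on the EBE (Element by Element) strategy do not need to form the global stiffness matrix, and they require no domain decomposition of the 3D model. This improves the speed and efficiency of parallel computation and makes the strategy an effective route to fast analysis of structural dynamic response. The Newmark method is combined with the EBE parallel algorithm and Jacobi preconditioning to solve the structural dynamic equations in parallel. On this basis, the pseudo-excitation method is used to compute structural random vibration in parallel. Finally, in a networked cluster environment, using several programming languages and analysis tools, the parallel algorithm is applied to simulate the shock response and random vibration of a 3D part, and the results are compared with those from Ansys and the precise time-integration method. The results show that the parallel algorithm has small computational error and high parallel efficiency, and is suitable for engineering computation.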
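The core idea of avoiding the global stiffness matrix can be sketched with a matrix-vector product accumulated element by element. The example below is a toy under stated assumptions: a chain of two-node 1D bar elements with unit stiffness, plain Python lists, and hypothetical function names. It is not the paper's implementation, which also involves Newmark time stepping and Jacobi preconditioning.

```python
def ebe_matvec(elements, x):
    """Compute y = K x without assembling K.

    elements: list of (dof_indices, element_matrix) pairs, where
    element_matrix is a small dense matrix indexed by local DOF.
    """
    y = [0.0] * len(x)
    for dofs, ke in elements:
        xe = [x[d] for d in dofs]  # gather local values
        for i, global_i in enumerate(dofs):
            # Scatter-add this element's contribution to the global vector.
            y[global_i] += sum(ke[i][j] * xe[j] for j in range(len(dofs)))
    return y


def bar_elements(n_elements, stiffness=1.0):
    """Chain of 1D two-node bar elements (toy mesh, assumed for illustration)."""
    ke = [[stiffness, -stiffness], [-stiffness, stiffness]]
    return [((e, e + 1), ke) for e in range(n_elements)]
```

An iterative solver such as preconditioned conjugate gradients needs only this product, never the assembled matrix. The per-element products are independent, so they can be distributed across processors, with the scatter-add as the only step that needs coordination.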

15.

NURBS Modeling and Curve Interpolation Optimization of 3D Graphics

Hao Zhu, Mulan Wang, Kun Liu, Weiye Xu 《计算机、材料和连续体（英文）》 (Computers, Materials & Continua) 2021,66(2):1799-1811

In order to solve the problem of complicated Non-Uniform Rational B-Splines (NURBS) modeling and improve the real-time performance of the high-order derivative of the curve interpolation process, the method of NURBS modeling based on the slicing and layering of triangular mesh is introduced. The research and design of NURBS curve interpolation are carried out from the two aspects of software algorithm and hardware structure. Based on the analysis of the characteristics of traditional computing methods with Taylor series expansion, the Adams formula and the Runge-Kutta formula are used in the NURBS curve interpolation process, and the process is then optimized according to the characteristics of NURBS interpolation. This can ensure accuracy, and avoid the calculation of higher-order derivatives. Furthermore, the hardware modules for the Adams and Runge-Kutta formulas are designed by using the parallel hardware construction technology of Field Programmable Gate Array (FPGA) chips. The parallel computing process using FPGA is compared with the traditional serial computing process using CPUs. Simulation and experimental results show that this scheme can improve the computational speed of the system and that the algorithm is feasible.
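Why a Runge-Kutta formula avoids higher-order derivatives can be seen from the standard feedrate relation du/dt = V / |C'(u)|: a second-order Taylor update of u needs C''(u), whereas a Runge-Kutta step only evaluates the first derivative at a few trial points. The sketch below applies the classical fourth-order formula to a generic parametric curve. It is an assumption-laden illustration: the curve is supplied as a hypothetical `derivative(u)` callback rather than a NURBS evaluator, and the paper's optimized variant and Adams formula are not reproduced.

```python
import math


def rk4_next_parameter(u, feedrate, dt, derivative):
    """Advance the curve parameter by one interpolation period.

    derivative(u) must return (dx/du, dy/du); only first derivatives are used.
    """
    def rate(v):
        dx, dy = derivative(v)
        return feedrate / math.hypot(dx, dy)  # du/dt = V / |C'(u)|

    k1 = rate(u)
    k2 = rate(u + 0.5 * dt * k1)
    k3 = rate(u + 0.5 * dt * k2)
    k4 = rate(u + dt * k3)
    return u + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
```

For a circle of radius 10 the speed |C'(u)| is constant, so a feedrate of 50 and a period of 0.001 advance u by 0.005 per step. On a curve with varying speed, the chord between successive points should stay close to `feedrate * dt`.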

16.

Parallel terrain rendering using a cluster of computers

Tung-Ju Hsieh, Falko Kuester, Tara Hutchinson 《中国工程学刊》 (Journal of the Chinese Institute of Engineers) 2013,36(2):212-223

This article presents a distributed parallel processing technique for rendering massive terrain using a cluster of machines consisting of one designated rendering node and 20 computing nodes. With a novel approach, the presented technique achieves an increase in rendering speed and an improvement in rendering capability. Adaptive terrain mesh constructions are done in parallel at computing nodes and the resulting meshes are combined and subsequently rendered at the rendering node. This study uses a height field of the United States at 30-m resolution spacing. It is divided into smaller blocks consisting of 4096 × 4096 vertices. Each computing node is assigned one or four blocks and tasked with creating the level-of-detail mesh that corresponds to view-dependent parameters provided by the rendering node. These individual terrain meshes are subsequently combined and rendered as seamless terrain meshes with a continuous terrain surface. The high rendering capacity of the presented technique is essential to the high-resolution large display system.

17.

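A Novel Planar Parallel Robot Mechanism and Its Decoupling Control  Total citations: 1, self-citations: 0, citations by others: 1

Liu Yanjie, Sun Lining, Liu Pinkuan, Cai Hegao 《高技术通讯》 (Chinese High Technology Letters) 2003,13(7):69-73

Following the idea of considering mechanism design and control strategy together, a novel two-degree-of-freedom planar parallel robot mechanism is proposed for the microelectronics packaging and assembly industry. It offers high stiffness, high speed, and high precision. Kinematic and dynamic analyses of the mechanism are carried out, along with a study of decoupling control based on acceleration feedback. Simulation of the decoupling control demonstrates the effectiveness of the control strategy.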

18.

Design of an optical content-addressable parallel processor for expert systems

Louri A Na J 《Applied optics》1995,34(23):5053-5063

The slow execution speed of current rule-based systems (RBS's) has restricted their application areas. To improve the speed of RBS's, researchers have proposed various electronic multiprocessor systems as well as optical systems. However, the electronic systems still suffer in performance from the large amount of required time-consuming pattern-matching and comparison operations at the core of RBS's. And optical systems do not fully exploit the available parallelism in RBS's. We propose an optical content-addressable parallel processor for expert systems. The processor executes the three basic RBS operations, match, select, and act, in a highly parallel fashion. Additionally, it extracts and exploits all possible parallelism in a RBS. Distinctive features of the proposed system include the following: (1) two-dimensional representation of data (knowledge) and control information to exploit the parallelism of optics in the three RBS units; (2) capability of processing general-domain knowledge expressed in terms of variables, numbers, symbols, and comparison operators such as greater than and less than; (3) the parallel optical match unit, which performs the two-dimensional optical pattern matching and comparison operations; (4) a novel conflict-resolution algorithm to resolve conflicts in a single step within the optical select unit. The three units and the general-knowledge representation scheme are designed to make the optical content-addressable parallel processor for expert systems suitable for any high-speed general-purpose RBS.
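The match, select, and act cycle named in the abstract can be sketched in software as a small forward-chaining loop. This is a serial toy, not the optical processor: the rule format (dictionaries with `if`, `then`, `priority`, and `name` keys) and the priority-based conflict resolution are assumptions made for illustration.

```python
def run_rules(initial_facts, rules, max_cycles=100):
    """Run match-select-act cycles until no rule can add a new fact.

    Each rule is a dict with keys 'name', 'if' (set of required facts),
    'then' (set of facts to add), and 'priority' (higher fires first).
    """
    facts = set(initial_facts)
    fired = []
    for _ in range(max_cycles):
        # Match: rules whose conditions hold and that would add something new.
        conflict_set = [r for r in rules
                        if r["if"] <= facts and not r["then"] <= facts]
        if not conflict_set:
            break
        # Select: resolve the conflict by priority (first rule wins ties).
        chosen = max(conflict_set, key=lambda r: r["priority"])
        # Act: apply the chosen rule's conclusions.
        facts |= chosen["then"]
        fired.append(chosen["name"])
    return facts, fired
```

In this loop the match step tests every rule against every fact, which is the pattern-matching cost the abstract identifies as the bottleneck and the part that a parallel match unit would evaluate all at once.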

19.

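A Parallel Identification Algorithm for Frame Structure Parameters Based on Cloud Computing

Jiang Shaofei, Ren Hui, Luo Jianbin 《工程力学》 (Engineering Mechanics) 2018,35(4):135-143

Structural Health Monitoring (SHM) systems installed on large, complex structures generate massive amounts of monitoring data. With traditional structural analysis and data processing techniques, these data cannot be analyzed in real time, so the working state of the structure cannot be assessed promptly and hazard warnings cannot be issued in time. To solve this problem, the paper adds distributed parallelism to the traditional Multi-Particle Swarm Coevolution Optimization (MPSCO) algorithm and develops a cloud-computing-based PMPSCO algorithm. On this basis, a method for identifying the physical parameters of frame structures using PMPSCO is proposed. Physical parameters are identified on a MATLAB distributed cloud computing platform for a numerical test of a 15-story frame and a laboratory test of a 7-story steel frame, and the relationship between speedup and the number of distributed parallel nodes is examined. The identification results show that PMPSCO has good accuracy, stability, and scalability, and that its computation speed can be raised flexibly by adding distributed parallel nodes, meeting the requirement for real-time processing of structural monitoring data.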
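The building block that MPSCO extends is ordinary particle swarm optimization. The sketch below is a basic single-swarm PSO in plain Python, applied to a generic objective such as the misfit between measured and simulated responses. It is not the paper's multi-swarm, distributed PMPSCO: the inertia and acceleration coefficients, the position clamping, and all function names are assumptions chosen for illustration.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 seed=0, inertia=0.7, c_self=1.5, c_swarm=1.5):
    """Minimize objective(position) within box bounds using basic PSO."""
    rng = random.Random(seed)  # seeded for reproducible runs
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    best_i = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[best_i][:], pbest_val[best_i]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                # Pull toward the particle's own best and the swarm's best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_self * rng.random() * (pbest[i][d] - pos[i][d])
                             + c_swarm * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            value = objective(pos[i])
            if value < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], value
                if value < gbest_val:
                    gbest, gbest_val = pos[i][:], value
    return gbest, gbest_val
```

The objective evaluations for different particles do not depend on one another within an iteration, so a distributed version can farm them out to worker nodes, which fits the abstract's observation that speed scales with the number of nodes.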