1.

Zheng Jie, Ji Hongbing 《中国图象图形学报》 (Journal of Image and Graphics) 2008,13(2):316-321

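To overcome the limits that graphics hardware imposes on conventional texture-mapped volume rendering, an effective method is proposed for volume rendering of large-scale data fields on an ordinary PC. The volume data is partitioned into blocks of suitable size, which are dynamically loaded into graphics hardware and rendered using 3D texture mapping. Throughout rendering, only one block resides on the graphics hardware at any time, which effectively improves the ability to render large-scale volume data. The method also exploits the mature programmability of current PC graphics hardware, computing gradients in real time to reduce the large memory consumption of conventional texture-mapped volume rendering. Experimental results show that, on an ordinary PC, the method supports interactive volume rendering of large-scale volume data that exceeds texture-memory capacity.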
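
The block-by-block scheme in the abstract above can be illustrated with a minimal sketch. It only enumerates the index ranges of each block and hands them to a callback one at a time; `brick_ranges`, `render_in_bricks` and `upload_and_draw` are hypothetical names, and the actual texture upload and gradient computation of the paper are not reproduced here.

```python
from itertools import product
from typing import Callable, Iterator, Tuple

Range = Tuple[int, int]
Brick = Tuple[Range, Range, Range]


def brick_ranges(dims: Tuple[int, int, int],
                 brick: Tuple[int, int, int]) -> Iterator[Brick]:
    """Yield half-open (start, stop) index ranges per axis, one brick at a time."""
    axes = []
    for size, step in zip(dims, brick):
        # The last brick along an axis is clipped to the volume boundary.
        axes.append([(s, min(s + step, size)) for s in range(0, size, step)])
    yield from product(*axes)


def render_in_bricks(dims: Tuple[int, int, int],
                     brick: Tuple[int, int, int],
                     upload_and_draw: Callable[[Brick], None]) -> int:
    """Visit bricks sequentially so that only one is 'resident' at any moment."""
    count = 0
    for ranges in brick_ranges(dims, brick):
        upload_and_draw(ranges)  # assumed stand-in for texture upload + draw
        count += 1
    return count
```

A real renderer would also have to visit the blocks in visibility order and handle interpolation at block boundaries; both are left out of this sketch.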

2.

A Highly Parallel Multi-Task Parallel Rendering System Architecture  Cited by: 2 (self-citations: 0, citations by others: 2)

Peng Minfeng, Zeng Liang, Lu Xiaoxia, Li Sikun 《计算技术与自动化》 (Computing Technology and Automation) 2006,25(3):63-66

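As computer graphics technology has come into practical use, more realistic and more detailed complex 3D scenes must be built. Their data volumes keep growing, the requirements for real-time interaction keep rising, and demand for multi-screen high-resolution display increases steadily, so a multi-task parallel graphics rendering system for large-scale complex scenes is urgently needed. This paper describes the architecture of a highly parallel multi-task, multi-screen parallel graphics rendering system for large-scale complex scenes, which supports parallel processing of graphics tasks and multi-screen display. The architecture separates geometry computation tasks from graphics rendering tasks and parallelizes each separately: on the compute nodes, tasks are classified by the type of object to be rendered to ease parallel computation and task assignment; on the rendering nodes, the small screen tiles are composited in parallel. Experimental results show that the architecture achieves good parallel efficiency and scalability for multiple tasks, makes full use of the system's parallel computing resources, and delivers good rendering results.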

3.

Incremental Volume Rendering Using Hierarchical Compression

Michael B. Haley Edwin H. Blake 《Computer Graphics Forum》1996,15(3):45-55

We present a new algorithm for efficient incremental rendering of volumetric datasets. The primary goal of this algorithm is to give average workstations the ability to efficiently render volume data received over relatively low-bandwidth network links in such a way that rapid user feedback is maintained. Common limitations of workstation rendering of volume data include large memory overheads, the requirement of expensive rendering hardware, and the need for high-speed processing. The rendering algorithm presented here overcomes these problems by making use of the efficient Shear-Warp Factorisation method, which does not require specialised graphics hardware. However, the original Shear-Warp algorithm suffers from a high memory overhead and does not provide for incremental rendering, which is required if rapid user feedback is to be maintained. Our algorithm represents the volumetric data using a hierarchical data structure which provides for the incremental classification and rendering of volume data. This exploits the multiscale nature of the octree data structure. The algorithm reduces the memory footprint of the original Shear-Warp Factorisation algorithm by a factor of more than two, while maintaining good rendering performance. These factors make our octree algorithm more suitable for implementation on average desktop workstations for the purposes of interactive exploration of volume models over a network. Results from tests using typical volume datasets will be presented which demonstrate the ability of the algorithm to achieve high rendering rates for both incremental rendering and standard rendering while reducing the runtime memory requirements.

4.

Distributed shared memory for roaming large volumes  Cited by: 1 (self-citations: 0, citations by others: 1)

Castanié L Mion C Cavin X Lévy B 《IEEE transactions on visualization and computer graphics》2006,12(5):1299-1306

We present a cluster-based volume rendering system for roaming very large volumes. This system allows a gigabyte-sized probe to be moved in real time inside a total volume of several tens or hundreds of gigabytes. While the size of the probe is limited by the total amount of texture memory on the cluster, the size of the total data set has no theoretical limit. The cluster is used as a distributed graphics processing unit that aggregates both graphics power and graphics memory. A hardware-accelerated volume renderer runs in parallel on the cluster nodes and the final image compositing is implemented using a pipelined sort-last rendering algorithm. Meanwhile, volume bricking and volume paging allow efficient data caching. On each rendering node, a distributed hierarchical cache system implements a global software-based distributed shared memory on the cluster. In case of a cache miss, this system first checks page residency on the other cluster nodes instead of directly accessing local disks. Using two Gigabit Ethernet network interfaces per node, we accelerate data fetching by a factor of 4 compared to directly accessing local disks. The system also implements asynchronous disk access and texture loading, which makes it possible to overlap data loading, volume slicing and rendering for optimal volume roaming.
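
The lookup order described in the abstract above (local cache, then other nodes, then disk) can be sketched as follows. This is a single-process illustration under stated assumptions: `NodeCache`, `peers` and `read_disk` are hypothetical names, peers are modelled as plain dictionaries instead of network calls, and no eviction policy is included.

```python
from typing import Callable, Dict, List, Tuple


class NodeCache:
    """Hypothetical per-node page cache; all names are illustrative only."""

    def __init__(self, peers: List[Dict[int, bytes]],
                 read_disk: Callable[[int], bytes]) -> None:
        self.local: Dict[int, bytes] = {}
        self.peers = peers          # stand-ins for caches on other cluster nodes
        self.read_disk = read_disk  # stand-in for a local disk read

    def fetch(self, page_id: int) -> Tuple[bytes, str]:
        """Return (page, source), trying the local cache, then peers, then disk."""
        if page_id in self.local:
            return self.local[page_id], "local"
        for peer in self.peers:
            if page_id in peer:
                page, source = peer[page_id], "peer"
                break
        else:
            # No peer holds the page, so fall back to the disk.
            page, source = self.read_disk(page_id), "disk"
        self.local[page_id] = page  # keep the page for later requests
        return page, source
```

For example, with `cache = NodeCache(peers=[{7: b"remote"}], read_disk=lambda pid: b"disk%d" % pid)`, the first `cache.fetch(7)` is served by a peer and the second from the local cache.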

5.

An introduction to parallel rendering

《Parallel Computing》1997,23(7):819-843

6.

Equalizer: A Scalable Parallel Rendering Framework

Eilemann Stefan Makhinya Maxim Pajarola Renato 《IEEE transactions on visualization and computer graphics》2009,15(3):436-452

Continuing improvements in CPU and GPU performance, as well as increasing multi-core processor and cluster-based parallelism, demand flexible and scalable parallel rendering solutions that can exploit multipipe hardware-accelerated graphics. In fact, to achieve interactive visualization, scalable rendering systems are essential to cope with the rapid growth of data sets. However, parallel rendering systems are non-trivial to develop and often only application-specific implementations have been proposed. The task of developing a scalable parallel rendering framework is even more difficult if it should be generic to support various types of data and visualization applications, and at the same time work efficiently on a cluster with distributed graphics cards. In this paper we introduce a novel system called Equalizer, a toolkit for scalable parallel rendering based on OpenGL which provides an application programming interface (API) to develop scalable graphics applications for a wide range of systems, from large distributed visualization clusters and multi-processor multipipe graphics systems to single-processor single-pipe desktop machines. We describe the system architecture and the basic API, discuss its advantages over previous approaches, and present example configurations and usage scenarios as well as scalability results.

7.

Rendering large scenes using parallel ray tracing

《Parallel Computing》1997,23(7):873-885

Ray tracing is a powerful technique to generate realistic images of 3D scenes. However, rendering complex scenes may easily exceed the processing and memory capabilities of a single workstation. Distributed processing offers a solution if the algorithm can be parallelized in an efficient way. In this paper a hybrid scheduling approach is presented that combines demand-driven and data-parallel techniques. Which tasks are processed demand driven and which data parallel is decided by the data intensity of the task and the amount of data locality (coherence) that will be present in the task. By combining demand-driven and data-driven tasks, a better load balance may be achieved, while at the same time the communication is spread evenly across the network. This leads to a scalable and efficient parallel implementation of the ray tracing algorithm with little restriction on the size of the model database to be rendered.

8.

A Parallel Support Vector Machine Algorithm on GPU (in English)

DO Thanh-Nghi, NGUYEN Van-Hoa, POULET François 《计算机科学与探索》 (Journal of Frontiers of Computer Science and Technology) 2009,3(4):368-377

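A new parallel incremental support vector machine algorithm is proposed for classifying large-scale datasets on graphics processing units (GPUs). SVMs and kernel-related methods can build accurate classification models, but learning requires a large amount of memory and a long time. The least squares SVM (LS-SVM) method proposed by Suykens and Vandewalle is extended to build an incremental and parallel algorithm. The new algorithm uses a graphics processor to obtain high performance at low cost. Implementation results show that, on UCI and Delve datasets, the GPU-based parallel incremental algorithm is 130 times faster than the CPU implementation and much faster (more than 2500 times) than current algorithms such as LibSVM, SVM-perf and CB-SVM.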

9.

An Architecture for a Graphics Accelerator and Shader  Cited by: 4 (self-citations: 0, citations by others: 4)

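Han Jungang, Jiang Lin, Du Huimin, Cao Xiaopeng, Dong Liang, Meng Lilin, Zhao Quanliang, Yin Chengxin, Zhang Jun 《计算机辅助设计与图形学学报》 (Journal of Computer-Aided Design & Computer Graphics) 2010,22(3)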

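To meet the demand for graphics accelerators in smartphones and netbooks, an architecture for a 2D graphics accelerator and a 3D pixel shader is proposed. The architecture comprises a self-designed VLIW instruction set and a reconfigurable data-driven pipeline. The usual approach of dividing an image frame into multiple blocks, each processed by one micro-engine, can leave the micro-engines with unbalanced loads. To address this, a parallel storage and processing structure allocated by scanline is adopted, in which the processing task for each scanline is dynamically assigned to a micro-engine as needed. To evaluate and implement the architecture, a performance simulation platform, a system simulation platform and an RTL simulation platform were built, and the performance simulation platform, written in C++, was used to evaluate the architecture's effect on performance. Simulation results show that the novel storage/task mapping method makes full use of processor resources, reduces memory access conflicts, and helps improve the scalability of parallel processing. The paper also discusses the structure of the self-designed graphics generator, image transformer and VLIW micro-engine, as well as the related hardware graphics acceleration algorithms.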
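
The load-balancing argument in the abstract above can be made concrete with a toy model, which is not the paper's simulator. It assumes each scanline has a known processing cost and compares a fixed split into contiguous bands with handing each scanline to whichever engine becomes free first; `static_assign` and `dynamic_assign` are hypothetical names.

```python
import heapq
from typing import List, Sequence


def static_assign(costs: Sequence[float], engines: int) -> List[float]:
    """Split scanlines into contiguous equal-sized bands, one band per engine."""
    band = -(-len(costs) // engines)  # ceiling division
    return [sum(costs[i * band:(i + 1) * band]) for i in range(engines)]


def dynamic_assign(costs: Sequence[float], engines: int) -> List[float]:
    """Give each scanline to the engine that becomes free first."""
    heap = [(0.0, e) for e in range(engines)]  # (time the engine is free, id)
    busy = [0.0] * engines
    for cost in costs:
        free_at, e = heapq.heappop(heap)
        busy[e] = free_at + cost
        heapq.heappush(heap, (busy[e], e))
    return busy  # per-engine finishing time
```

With the made-up costs `[8, 8, 1, 1, 1, 1, 1, 1]` and two engines, the static split finishes at times 18 and 4, while the dynamic assignment finishes both engines at time 11.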

10.

Real-time visualization of large volume datasets on standard PC hardware

Xie K Yang J Zhu YM 《Computer methods and programs in biomedicine》2008,90(2):117-123

In the medical field, interactive three-dimensional visualization of large volume datasets is a challenging task. One of the major challenges in graphics processing unit (GPU)-based volume rendering algorithms is the limited size of texture memory imposed by current GPU architecture. We attempt to overcome this limitation by rendering only visible parts of large CT datasets. In this paper, we present an efficient, high-quality volume rendering algorithm using GPUs for rendering large CT datasets at interactive frame rates on standard PC hardware. We subdivide the volume dataset into uniformly sized blocks and take advantage of combinations of early ray termination, empty-space skipping and visibility culling to accelerate the whole rendering process and render visible parts of volume data. We have implemented our volume rendering algorithm for a large volume dataset of 512 x 304 x 1878 voxels (Visible Female), and achieved real-time performance (i.e., 3-4 frames per second) on a Pentium 4 2.4 GHz PC equipped with an NVIDIA GeForce 6600 graphics card (256 MB video memory). This method can be used as a 3D visualization tool for large CT datasets by doctors or radiologists.
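
Two of the accelerations named in the abstract above, early ray termination and empty-space skipping, can be shown on a single ray. The sketch below is a CPU illustration with scalar colors, not the paper's GPU implementation; `composite_ray` and `opacity_cutoff` are hypothetical names, and skipping is done per sample here, whereas a block-based renderer would skip whole empty blocks.

```python
from typing import Iterable, Tuple


def composite_ray(samples: Iterable[Tuple[float, float]],
                  opacity_cutoff: float = 0.99) -> Tuple[float, float, int]:
    """Front-to-back compositing of (color, alpha) samples along one ray.

    Returns the accumulated color, accumulated opacity, and the number of
    samples that actually had to be blended.
    """
    color, alpha, shaded = 0.0, 0.0, 0
    for sample_color, sample_alpha in samples:
        if sample_alpha == 0.0:
            continue  # empty-space skipping: transparent samples add nothing
        color += (1.0 - alpha) * sample_alpha * sample_color
        alpha += (1.0 - alpha) * sample_alpha
        shaded += 1
        if alpha >= opacity_cutoff:
            break  # early ray termination: later samples are hidden
    return color, alpha, shaded
```

For a ray whose first non-empty sample is fully opaque, only that one sample is blended, however many samples follow it.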

11.

A GPU-Based Parallel Elevation Interpolation Algorithm for Massive Discrete Points

Wang Zhiguang, Zhang Tengchang, Wu Xiangjin, Lu Qiang 《计算机工程与科学》 (Computer Engineering & Science) 2021,43(4):614-619

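A GPU-based parallel elevation interpolation algorithm is proposed that accelerates, in parallel, the rendering of massive discrete points on a 3D terrain surface. Elevation data of the 3D terrain mesh is organized as an elevation texture, which serves as the basis for rendering the discrete points, and GPU shader programs written in GLSL dynamically control the graphics rendering pipeline to implement a view-dependent parallel elevation interpolation algorithm. Experimental results show that, compared with the conventional in-memory interpolation algorithm, the proposed GPU-based algorithm raises the number of discrete points that can be rendered on the 3D terrain surface from the order of millions to the order of tens of millions.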

12.

Hybrid Parallel Volume Rendering with Cooperating Distributed Graphics Hardware

Cao Yi, Mo Zeyao, Wang Hongkun, Yuan Bin 《中国图象图形学报》 (Journal of Image and Graphics) 2008,13(7):1379-1384

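Shared-memory parallel machines generally lack graphics hardware, so the 3D scientific computing data they produce cannot be visualized in place with hardware-accelerated parallel volume rendering. A hybrid parallel rendering model is therefore presented, based on a local parallel machine and distributed graphics workstations. In this model, the source data stays on the parallel machine, whose processors issue remote rendering command streams that drive the workstations' graphics hardware to perform the rendering; the subsequent image compositing is executed on the parallel machine to exploit the advantages of shared-memory communication. Through load-balancing optimization, the parallel rendering pipeline effectively overlaps rendering, compositing and display. Experimental results show that the method can interactively render large-scale data fields residing on the parallel machine at an image resolution of 1024×1024.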

13.

Transparent Parallelization of Shader-Enabled Direct3D9 Applications

Liu Zhen, Shi Jiaoying, Xiong Hua, Peng Haoyu 《计算机研究与发展》 (Journal of Computer Research and Development) 2007,44(10):1673-1681

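Based on the architecture of the GPU's latest programmable units, the Vertex Shader and Pixel Shader, and on the execution flow of single-machine Direct3D9 applications, a strategy is proposed for transparently parallelizing Shader-enabled Direct3D9 applications on a graphics cluster. The cluster nodes are divided into resource allocation nodes and resource rendering nodes. A resource allocation node intercepts the rendering interface and converts the application, in real time, into six kinds of rendering resources: command streams, Vertex Shaders, Pixel Shaders, vertex streams, index streams and texture streams. A resource rendering node reconstructs the Direct3D9 rendering commands from the description information and data of the rendering resources. All rendering nodes in the cluster keep the full set of rendering resources, and rendering tasks are partitioned by computing the screen-space bounding boxes of the multi-stream scene data. Experiments show that this strategy fully achieves transparent parallelization of Shader-enabled Direct3D9 applications. Compared with single-machine rendering, parallel rendering on the graphics cluster both improves rendering performance and attains a high rendering speedup.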

14.

An efficient and scalable parallel algorithm for out-of-core isosurface extraction and rendering

Qin Wang Joseph JaJa Amitabh Varshney 《Journal of Parallel and Distributed Computing》2007

We consider the problem of isosurface extraction and rendering for large-scale time-varying data. Such data sets have been appearing at an increasing rate, especially from physics-based simulations, and can range in size from hundreds of gigabytes to tens of terabytes. Isosurface extraction and rendering is one of the most widely used visualization techniques to explore and analyze such data sets. A common strategy for isosurface extraction involves the determination of the so-called active cells, followed by a triangulation of these cells based on linear interpolation, and ending with a rendering of the triangular mesh. We develop a new simple indexing scheme for out-of-core processing of large-scale data sets, which enables the identification of the active cells extremely quickly, using a more compact indexing structure and more effective bulk data movement than previous schemes. Moreover, our scheme leads to an efficient and scalable implementation on multiprocessor environments in which each processor has access to its own local disk. In particular, our parallel algorithm provably achieves load balancing across the processors independent of the isovalue, with almost no overhead in the total amount of work relative to the sequential algorithm. We conduct a large number of experimental tests on the University of Maryland Visualization Cluster using the Richtmyer–Meshkov instability data set, and obtain results that consistently validate the efficiency and the scalability of our algorithm.
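
The "common strategy" mentioned in the abstract above rests on two small computations: deciding whether a cell is active, and linearly interpolating where the isosurface crosses a cell edge. The sketch below shows only those two steps under the usual min/max definition of an active cell; it does not reproduce the paper's indexing scheme, and `is_active` and `edge_crossing` are hypothetical names.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def is_active(corner_values: Sequence[float], isovalue: float) -> bool:
    """A cell is treated as active when the isovalue lies within its value range."""
    return min(corner_values) <= isovalue <= max(corner_values)


def edge_crossing(p0: Point, p1: Point, v0: float, v1: float,
                  isovalue: float) -> Point:
    """Linearly interpolate the point on edge p0-p1 where the field equals isovalue.

    Assumes v0 != v1, i.e. the edge is actually crossed by the isosurface.
    """
    t = (isovalue - v0) / (v1 - v0)
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))
```

For an edge from (0, 0, 0) to (1, 0, 0) with endpoint values 0 and 10, an isovalue of 2.5 gives a crossing a quarter of the way along the edge.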

15.

Parallel sphere rendering

《Parallel Computing》1997,23(7):961-974

Sphere rendering is an important method for visualizing molecular dynamics data. This paper presents a parallel algorithm that is almost 90 times faster than current graphics workstations. To render extremely large data sets and large images, the algorithm uses the MIMD features of the supercomputers to divide up the data, render independent partial images, and then finally composite the multiple partial images using an optimal method. The algorithm and performance results are presented for the CM-5 and the T3D.

16.

Interactive volume rendering of large sparse data sets using adaptive mesh refinement hierarchies  Cited by: 2 (self-citations: 0, citations by others: 2)

Kahler R. Simon M. Hege H.-C. 《IEEE transactions on visualization and computer graphics》2003,9(3):341-351

In this paper, we present an algorithm that accelerates 3D texture-based volume rendering of large, sparse data sets, i.e., data sets where only a fraction of the voxels contain relevant information. In texture-based approaches, the rendering performance is affected by the fill-rate, the size of texture memory, and the texture I/O bandwidth. For sparse data, these limitations can be circumvented by restricting most of the rendering work to the relevant parts of the volume. In order to efficiently enclose the corresponding regions with axis-aligned boxes, we employ a hierarchical data structure, known as an AMR (adaptive mesh refinement) tree. The hierarchy is generated utilizing a clustering algorithm. A good balance is thereby achieved between the size of the enclosed volume, i.e., the amount to render in graphics hardware, and the number of axis-aligned regions, i.e., the number of texture coordinates to compute in software. The waste of texture memory by the power-of-two restriction is minimized by a 3D packing algorithm which arranges texture bricks economically in memory. Compared to an octree approach, the rendering performance is significantly increased and less parameter tuning is necessary.

17.

GPU-Oriented Batched LOD Real-Time Terrain Rendering  Cited by: 1 (self-citations: 0, citations by others: 1)

Zhang Bingqiang, Zhang Limin, Zhang Jianting 《中国图象图形学报》 (Journal of Image and Graphics) 2012,17(4):582-588

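To improve rendering efficiency in real-time rendering of large-scale terrain, a batched LOD algorithm is proposed that uses terrain blocks as the processing unit. In the preprocessing stage, multi-resolution terrain data is divided into blocks suited to GPU batch processing, and the blocks are organized with a quadtree. On this basis, a block-based LOD error criterion is proposed that reduces the computation needed for level selection, and transitions between levels are handled by adding "skirts" and applying geomorphing. During real-time rendering, view-frustum culling reduces the amount of data entering the graphics hardware, and a terrain quadtree list together with a prediction mechanism manages the loading of terrain data. Experimental results show that the algorithm makes full use of graphics hardware performance and renders terrain in real time with high efficiency.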

18.

Research and Development of Parallel Processing in Computer Graphics  Cited by: 2 (self-citations: 0, citations by others: 2)

Wu Enhua, He Ruirong 《计算机学报》 (Chinese Journal of Computers) 1991,14(5):380-388

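This paper surveys the emergence and development of research on parallel processing in computer graphics. It focuses on the research and development of parallel processing functional units, and on progress in parallel processing for polygon rendering, global illumination models (ray tracing and radiosity), rendering of physical field data and volumetric media data, animation, and parallel graphics standards. The paper describes research work in this field and closes with an outlook on further directions for parallel processing in computer graphics.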

19.

Data Partitioning for Parallel Spatial Join Processing  Cited by: 1 (self-citations: 0, citations by others: 1)

Xiaofang Zhou David J. Abel David Truffet 《GeoInformatica》1998,2(2):175-204

The cost of spatial join processing can be very high because of the large sizes of spatial objects and the computation-intensive spatial operations. While parallel processing seems a natural solution to this problem, it is not clear how spatial data can be partitioned for this purpose. Various spatial data partitioning methods are examined in this paper. A framework combining the data-partitioning techniques used by most parallel join algorithms in relational databases and the filter-and-refine strategy for spatial operation processing is proposed for parallel spatial join processing. Object duplication caused by multi-assignment in spatial data partitioning can result in extra CPU cost as well as extra communication cost. We find that the key to overcoming this problem is to preserve spatial locality in task decomposition. In this paper we show that a near-optimal speedup can be achieved for parallel spatial join processing using our new algorithms.
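
The filter-and-refine strategy named in the abstract above can be sketched sequentially: a cheap test on minimum bounding rectangles (MBRs) produces candidate pairs, and an exact geometric test then confirms or rejects each one. Using circles as the spatial objects is an assumption made purely for brevity, all function names are hypothetical, and the paper's partitioning and parallel execution are not shown.

```python
from math import hypot
from typing import List, Sequence, Tuple

Circle = Tuple[float, float, float]        # (center_x, center_y, radius)
Rect = Tuple[float, float, float, float]   # (min_x, min_y, max_x, max_y)
Pair = Tuple[int, int]


def mbr(c: Circle) -> Rect:
    """Minimum bounding rectangle of a circle."""
    x, y, r = c
    return (x - r, y - r, x + r, y + r)


def mbrs_overlap(a: Rect, b: Rect) -> bool:
    """Filter step: cheap axis-aligned rectangle overlap test."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def circles_intersect(a: Circle, b: Circle) -> bool:
    """Refine step: exact test on the real geometry."""
    return hypot(a[0] - b[0], a[1] - b[1]) <= a[2] + b[2]


def spatial_join(left: Sequence[Circle],
                 right: Sequence[Circle]) -> Tuple[List[Pair], List[Pair]]:
    """Return (candidate pairs after filtering, confirmed pairs after refining)."""
    candidates = [(i, j) for i, a in enumerate(left) for j, b in enumerate(right)
                  if mbrs_overlap(mbr(a), mbr(b))]
    results = [(i, j) for i, j in candidates
               if circles_intersect(left[i], right[j])]
    return candidates, results
```

Two unit circles centered at (0, 0) and (1.8, 1.8) pass the filter because their rectangles overlap, but fail the refine step because their centers are more than two units apart.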

20.

Topology-controlled volume rendering

Weber GH Dillard SE Carr H Pascucci V Hamann B 《IEEE transactions on visualization and computer graphics》2007,13(2):330-341

Topology provides a foundation for the development of mathematically sound tools for processing and exploration of scalar fields. Existing topology-based methods can be used to identify interesting features in volumetric data sets, to find seed sets for accelerated isosurface extraction, or to treat individual connected components as distinct entities for isosurfacing or interval volume rendering. We describe a framework for direct volume rendering based on segmenting a volume into regions of equivalent contour topology and applying separate transfer functions to each region. Each region corresponds to a branch of a hierarchical contour tree decomposition, and a separate transfer function can be defined for it. The novel contributions of our work are: 1) a volume rendering framework and interface where a unique transfer function can be assigned to each subvolume corresponding to a branch of the contour tree, 2) a runtime method for adjusting data values to reflect contour tree simplifications, 3) an efficient way of mapping a spatial location into the contour tree to determine the applicable transfer function, and 4) an algorithm for hardware-accelerated direct volume rendering that visualizes the contour tree-based segmentation at interactive frame rates using graphics processing units (GPUs) that support loops and conditional branches in fragment programs.