Similar Documents
20 similar documents found.
1.
A polyhedron P is castable if its boundary can be partitioned by a plane into two polyhedral terrains. Castable polyhedra can be manufactured easily using two cast parts, where each cast part can be removed from the object without breaking the cast part or the object. If we assume that the cast parts are each removed by a single translation, it is shown that for a simple polyhedron with n vertices, castability can be decided in time and linear space using a simple algorithm. A more complicated algorithm solves the problem in time and space, for any fixed ε > 0. In the case where the cast parts are to be removed in opposite directions, a simple O(n^2)-time algorithm is presented. Finally, if the object is a convex polyhedron and the cast parts are to be removed in opposite directions, a simple algorithm is presented. Received June 1, 1994; revised May 25, 1995.

2.
Compute unified device architecture (CUDA) is a software development platform that allows us to run C-like programs on the NVIDIA graphics processing unit (GPU). This paper presents an acceleration method for cone beam reconstruction using CUDA-compatible GPUs. The proposed method accelerates the Feldkamp, Davis, and Kress (FDK) algorithm using three techniques: (1) off-chip memory access reduction for saving the memory bandwidth; (2) loop unrolling for hiding the memory latency; and (3) multithreading for exploiting multiple GPUs. We describe how these techniques can be incorporated into the reconstruction code. We also show an analytical model to understand the reconstruction performance on multi-GPU environments. Experimental results show that the proposed method runs at 83% of the theoretical memory bandwidth, achieving a throughput of 64.3 projections per second (pps) for reconstruction of a 512^3-voxel volume from 360 512^2-pixel projections. This performance is 41% higher than the previous CUDA-based method and is 24 times faster than a CPU-based method optimized by vector intrinsics. Some detailed analyses are also presented to understand how effectively the acceleration techniques increase the reconstruction performance of a naive method. We also demonstrate out-of-core reconstruction for large-scale datasets, up to a 1024^3-voxel volume.
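A toy version of the kind of analytical model mentioned above can make the bandwidth-bound argument concrete. The sketch below is illustrative only: the function name and all numeric inputs are assumptions, not the paper's model or measurements.

```python
# Illustrative sketch only: if FDK backprojection is memory-bound, throughput is
# limited by how many bytes each projection pass must move through off-chip memory.
# All numbers in the example are placeholders, not measurements from the paper.

def projections_per_second(volume_voxels, bytes_per_voxel_update, effective_bandwidth_gbs):
    """Bandwidth-bound upper limit on projections/s if every voxel is updated once per projection."""
    bytes_per_projection = volume_voxels * bytes_per_voxel_update
    return effective_bandwidth_gbs * 1e9 / bytes_per_projection

# With assumed values (a 512^3 volume, 8 bytes of off-chip traffic per voxel update,
# 100 GB/s sustained bandwidth) this gives roughly 93 projections/s:
# projections_per_second(512**3, 8, 100)
```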

3.
In this paper, a simple and efficient algorithm is proposed for manifold-guaranteed out-of-core simplification of large meshes with controlled topological type. By dual-sampling the input large mesh model, the proposed algorithm utilizes a set of Hermite data (sample points with normals) as an intermediate model representation, which allows the topological structure of the mesh model to be encoded implicitly and thus makes it particularly suitable for out-of-core mesh simplification. Benefiting from the construction of an in-core signed distance field, the proposed algorithm possesses a set of features including manifoldness of the simplified meshes, toleration of nonmanifold mesh data input, topological noise removal, topological type control, and preservation of sharp features and boundaries. A novel, detailed implementation of the proposed algorithm is presented, and experimental results demonstrate that very large meshes can be simplified quickly on a low-cost off-the-shelf PC with tightly bounded approximation errors and with time and space efficiency.

4.
Fast rendering of large-scale scenes is one of the important research topics in virtual reality. To accelerate scene rendering, level-of-detail (LOD) models and visibility culling are commonly used, but existing algorithms have limitations when handling large-scale scenes. This paper proposes a new algorithm for fast rendering of large-scale scenes. Based on a hierarchical partition of the scene, the algorithm uses a topology-modifying mesh simplification method to compute continuous hierarchical levels of detail (HLOD) for the scene hierarchy. In the real-time rendering stage, the HLOD is refined globally and locally in a view-dependent manner and combined with fast, effective view-frustum culling, which greatly accelerates scene rendering. Experimental results show that the algorithm is simple and effective, and it can be further extended to an out-of-core setting.
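To make the view-dependent refinement step concrete, the sketch below shows a standard screen-space-error test for selecting HLOD nodes; the node attributes (center, error, children), the culling callback, and the tolerance test are assumptions, not the paper's exact criterion.

```python
# Illustrative sketch only: a conventional screen-space-error test for view-dependent
# HLOD refinement. `node.center`, `node.error` (object-space error in world units) and
# `node.children` are assumed attributes; `visible` is a view-frustum culling callback.
import math

def screen_space_error(geometric_error, distance, fov_y, viewport_height):
    """Project an object-space error (world units) to pixels at the given distance."""
    if distance <= 0.0:
        return float("inf")
    pixels_per_unit = viewport_height / (2.0 * distance * math.tan(fov_y / 2.0))
    return geometric_error * pixels_per_unit

def refine(node, eye, fov_y, viewport_height, tolerance_px, visible, out):
    """Collect the coarsest visible nodes whose projected error is within tolerance."""
    if not visible(node):                      # global culling against the view frustum
        return
    d = math.dist(eye, node.center)
    if not node.children or screen_space_error(node.error, d, fov_y, viewport_height) <= tolerance_px:
        out.append(node)                       # coarse enough (or a leaf): render this node
    else:
        for child in node.children:            # otherwise refine locally
            refine(child, eye, fov_y, viewport_height, tolerance_px, visible, out)
```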

5.
马军, 杨波, 马绍汉. 软件学报 (Journal of Software), 2000, 11(2): 260-264
The minimum rectilinear Steiner tree (MRST) problem, i.e., finding an optimal Manhattan-metric Steiner tree, is a combinatorial optimization problem arising in VLSI routing and network communication, and it is NP-hard. This paper gives an O(n^2)-time approximation algorithm for the problem whose worst-case approximation ratio is strictly less than 3/2. Computational experiments show that the average cost of the trees it produces differs from that of an optimal algorithm by only 0.8%. With minor modifications, the algorithm can be applied to the Steiner problem in three- or higher-dimensional Manhattan spaces, and it is easy to implement in parallel and distributed environments.
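As a point of reference for the 3/2 bound, the sketch below computes a rectilinear minimum spanning tree with a Prim-style O(n^2) loop; by Hwang's theorem its cost is at most 3/2 of the optimal rectilinear Steiner tree. It is a baseline for comparison, not the paper's algorithm.

```python
# Minimal baseline sketch, not the paper's algorithm: a Prim-style minimum spanning
# tree under Manhattan (L1) distance, the usual yardstick that rectilinear Steiner
# approximation algorithms improve on.

def manhattan(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def rectilinear_mst(points):
    """Return MST edges (as index pairs) over the terminals; O(n^2) time."""
    n = len(points)
    in_tree = [False] * n
    best = [float("inf")] * n       # cheapest connection of each point to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = manhattan(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges
```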

6.
This paper presents a unified framework that optimizes out-of-core programs by exploiting locality and parallelism, and reducing communication overhead. For out-of-core problems where the data set sizes far exceed the size of the available in-core memory, it is particularly important to exploit the memory hierarchy by optimizing the I/O accesses. We present algorithms that consider both iteration space (loop) and data space (file layout) transformations in a unified framework. We show that the performance of an out-of-core loop nest containing references to out-of-core arrays can be improved by using a suitable combination of file layout choices and loop restructuring transformations. Our approach considers array references one by one and attempts to optimize each reference for parallelism and locality. When there are references for which parallelism optimizations do not work, communication is vectorized so that data transfer can be performed before the innermost loop. Results from hand-compiles on IBM SP-2 and Intel Paragon distributed-memory message-passing architectures show that this approach reduces the execution times and improves the overall speedups. In addition, we extend the base algorithm to work with file layout constraints and show how it is useful for optimizing programs that consist of multiple loop nests.
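The sketch below illustrates the basic idea of matching the loop order to the file layout for an out-of-core array; it uses numpy.memmap and assumed names, and it is not the paper's compiler framework.

```python
# Illustrative sketch only (not the paper's compiler framework): processing an
# out-of-core 2D array tile by tile. Traversing tiles in the array's storage order
# (row-major here) keeps the I/O largely sequential, which is the kind of agreement
# between loop order and file layout the framework tries to establish.
import numpy as np

def scale_out_of_core(path, shape, tile=1024, factor=2.0, dtype=np.float64):
    """Multiply an existing on-disk raw array (row-major, given shape) by `factor`."""
    a = np.memmap(path, dtype=dtype, mode="r+", shape=shape)
    rows, cols = shape
    for i in range(0, rows, tile):          # outer loop over row tiles
        for j in range(0, cols, tile):      # inner loop matches the row-major layout
            block = a[i:i + tile, j:j + tile]
            block *= factor                 # in-core computation on the current tile
    a.flush()                               # write dirty tiles back to disk
```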

7.
Building on a scene partition, this paper proposes organizing the scene with a continuous hierarchical level-of-detail (HLOD) model and subdividing the viewpoint space. For each viewpoint cell, a list of scene-graph nodes, called the cell-front, is computed that just satisfies the screen-space pixel error bound for any viewpoint within the cell; the out-of-core storage of the model is then redesigned around the cell-front. Rendering uses multiple threads: the rendering thread performs visibility culling and view-dependent selective rendering on the current cell-front, while the prefetching thread uses a viewpoint-cell-based strategy that schedules data loading according to how the cell-front changes between neighboring viewpoint cells. While preserving rendering quality, the algorithm achieves real-time interactive display of large out-of-core scenes on an ordinary PC.

8.
Parallel Computing, 2014, 40(10): 754-767
The processing of massive amounts of data on clusters with a finite amount of memory has become an important problem facing the parallel/distributed computing community. While MapReduce-style technologies provide an effective means for addressing various problems that fit within the MapReduce paradigm, there are many classes of problems for which this paradigm is ill-suited. In this paper we present a runtime system for traditional MPI programs that enables the efficient and transparent out-of-core execution of distributed-memory parallel programs. This system, called BDMPI, leverages the semantics of MPI's API to orchestrate the execution of a large number of MPI processes on far fewer compute nodes, so that the running processes maximize the amount of computation that they perform with the data fetched from the disk. BDMPI enables the development of efficient out-of-core parallel distributed-memory codes without the high engineering and algorithmic complexities associated with multiple levels of blocking. BDMPI achieves significantly better performance than existing technologies on a single node as well as on a small cluster, and performs within 30% of optimized out-of-core implementations.
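For contrast, the sketch below shows the hand-blocked style that BDMPI is meant to make unnecessary: a plain mpi4py program in which each rank explicitly streams its partition from disk in memory-sized chunks. The file name, chunk size, and reduction are illustrative assumptions, not part of BDMPI.

```python
# A sketch of hand-written out-of-core blocking in plain MPI, the kind of explicit
# chunking that BDMPI aims to make unnecessary. File name and chunk size are assumptions.
import numpy as np
from mpi4py import MPI

def chunked_sum(path, n_elems, chunk=1_000_000):
    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    per_rank = (n_elems + size - 1) // size       # contiguous partition per process
    lo, hi = rank * per_rank, min((rank + 1) * per_rank, n_elems)
    data = np.memmap(path, dtype=np.float64, mode="r", shape=(n_elems,))
    local = 0.0
    for start in range(lo, hi, chunk):            # explicit, hand-written blocking
        local += float(data[start:min(start + chunk, hi)].sum())
    return comm.allreduce(local, op=MPI.SUM)      # combine the partial sums
```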

9.
We give substantially improved exact exponential-time algorithms for a number of NP-hard problems. These algorithms are obtained using a variety of techniques, which include: obtaining exact algorithms by enumerating maximal independent sets in a graph, obtaining exact algorithms from parameterized algorithms, and a variant of the usual branch-and-bound technique which we call the "colored" branch-and-bound technique. These techniques are simple in that they avoid detailed case analyses and yield algorithms that can be easily implemented. We show the power of these techniques by applying them to several NP-hard problems and obtaining new improved upper bounds on the running time. The specific problems that we tackle are: (1) the Odd Cycle Transversal problem in general undirected graphs, (2) the Feedback Vertex Set problem in directed graphs of maximum degree 4, (3) the Feedback Arc Set problem in tournaments, (4) the 4-Hitting Set problem, and (5) the Minimum Maximal Matching and the Edge Dominating Set problems. The algorithms that we present for these problems are the best known and are a substantial improvement over previous best results. For example, for Minimum Maximal Matching we give an O*(1.4425^n) algorithm, improving the previous best result of O*(1.4422^m) [35]. For the Odd Cycle Transversal problem, we give an O*(1.62^n) algorithm which improves the previous time bound of O*(1.7724^n) [3].
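The sketch below shows only the enumeration primitive mentioned above, listing maximal independent sets via a plain Bron-Kerbosch recursion on the complement adjacency; it is a building block, not any of the paper's algorithms.

```python
# A small sketch of the enumeration primitive: listing all maximal independent sets
# of a graph by running Bron-Kerbosch with non-adjacency in place of adjacency
# (maximal independent sets of G are the maximal cliques of its complement).

def maximal_independent_sets(adj):
    """adj: dict mapping each vertex to the set of its neighbours."""
    vertices = set(adj)
    non_adj = {v: (vertices - adj[v]) - {v} for v in vertices}   # complement adjacency

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            yield set(current)                    # maximal: nothing more can be added
            return
        for v in list(candidates):
            yield from expand(current | {v},
                              candidates & non_adj[v],
                              excluded & non_adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    yield from expand(set(), vertices, set())

# Example: the path a-b-c has maximal independent sets {a, c} and {b}:
# list(maximal_independent_sets({'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b'}}))
```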

10.
Recently ElGindy and Avis (EA) presented an O(n) algorithm for solving the two-dimensional hidden-line problem in an n-sided simple polygon. In this paper we show that their algorithm can be used to solve other geometric problems. In particular, triangulating an L-convex polygon and finding the convex hull of a simple polygon can be accomplished in O(n) time, whereas testing a simple polygon for L-convexity can be done in O(n^2) time.

11.
Melab, N., Talbi, E.-G., Petiton, S. The Journal of Supercomputing, 2000, 17(2): 167-185
This paper presents a parallel adaptive version of the block-based Gauss-Jordan algorithm, used to invert large matrices. This version includes a characterization of the workload and a mechanism for folding/unfolding it. Furthermore, this paper proposes a work scheduling strategy and an application-oriented solution to the fault tolerance problem. The application is implemented and evaluated with MARS in dedicated and non-dedicated environments. The results show that an absolute efficiency of 92% is possible on a cluster of DEC/ALPHA processors interconnected by a Gigaswitch network, and an absolute efficiency of 67% can be obtained on an Ethernet network of SUN-Sparc 4 workstations. Moreover, the algorithm is tested on a meta-system including both pools of machines. Finally, an out-of-core solution for the algorithm is proposed. This solution saves 66% of the data input operations and reduces the main memory required for storing the data space of the algorithm by a factor of q, where q is the dimension of the matrix to be inverted in terms of data blocks.
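A minimal sequential NumPy sketch of block-based Gauss-Jordan inversion is given below; it assumes every diagonal block remains nonsingular (no block pivoting) and omits the parallel scheduling, folding/unfolding, and fault-tolerance machinery described above.

```python
# Sequential sketch of blocked Gauss-Jordan inversion on the augmented matrix [A | I].
# Assumes each b x b diagonal block stays nonsingular; not the parallel adaptive version.
import numpy as np

def block_gauss_jordan_inverse(A, b):
    """Invert the (q*b) x (q*b) matrix A using b x b blocks."""
    n = A.shape[0]
    assert n % b == 0
    q = n // b
    M = np.hstack([A.astype(float), np.eye(n)])   # augmented [A | I]
    for k in range(q):
        rk = slice(k * b, (k + 1) * b)
        piv_inv = np.linalg.inv(M[rk, rk])        # invert the current pivot block
        M[rk, :] = piv_inv @ M[rk, :]             # normalise the pivot block row
        for i in range(q):
            if i == k:
                continue
            ri = slice(i * b, (i + 1) * b)
            M[ri, :] -= M[ri, rk] @ M[rk, :]      # eliminate block column k
    return M[:, n:]                               # right half now holds A^{-1}

# Sanity check on a diagonally dominant matrix:
# A = np.random.rand(8, 8) + 8 * np.eye(8)
# assert np.allclose(block_gauss_jordan_inverse(A, 2) @ A, np.eye(8))
```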

12.
In this paper we study the problems of detecting holes and antiholes in general undirected graphs, and we present algorithms for these problems. For an input graph G on n vertices and m edges, our algorithms run in O(n + m^2) time and require O(nm) space; we thus provide a solution to the open problem posed by Hayward et al. asking for an O(n^4)-time algorithm for finding holes in arbitrary graphs. The key element of the algorithms is the use of the depth-first-search traversal on appropriate auxiliary graphs in which moving between any two adjacent vertices is equivalent to walking along a P_4 (i.e., a chordless path on four vertices) of the input graph or of its complement, respectively. The approach can be generalized so that for a fixed constant k ≥ 5 we obtain an O(n^{k-1})-time algorithm for the detection of a hole (antihole resp.) on at least k vertices. Additionally, we describe a different approach which allows us to detect antiholes in graphs that do not contain chordless cycles on five vertices in O(n + m^2) time requiring O(n + m) space. Again, for a fixed constant k ≥ 6, the approach can be extended to yield O(n^{k-2})-time and O(n^2)-space algorithms for detecting holes (antiholes resp.) on at least k vertices in graphs which do not contain holes (antiholes resp.) on k - 1 vertices. Our algorithms are simple and can be easily used in practice. Finally, we also show how our detection algorithms can be augmented so that they return a hole or an antihole whenever such a structure is detected in the input graph; the augmentation takes O(n + m) time and space.
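For small graphs, a brute-force checker can serve as a reference for what "hole" means here; the sketch below grows chordless paths and reports a chordless cycle on at least five vertices. It is exponential in the worst case and does not implement the paper's O(n + m^2) auxiliary-graph technique.

```python
# Brute-force reference checker (exponential in the worst case), useful only for
# small graphs: it searches for a hole, i.e. a chordless cycle on >= 5 vertices.

def find_hole(adj):
    """adj: dict vertex -> set of neighbours. Returns a hole as a vertex list, or None."""
    def extend(path, on_path):
        first, last = path[0], path[-1]
        for v in adj[last]:
            if v in on_path:
                continue
            if any(u in adj[v] for u in path[1:-1]):   # chord to an interior vertex
                continue
            closes = first in adj[v]
            if closes and len(path) >= 4:
                return path + [v]                      # chordless cycle on >= 5 vertices
            if not closes:                             # a chord back to `first` can never stay chordless
                found = extend(path + [v], on_path | {v})
                if found:
                    return found
        return None

    for u in adj:
        for w in adj[u]:
            found = extend([u, w], {u, w})
            if found:
                return found
    return None

# A 5-cycle contains a hole; a 4-cycle or a triangle does not.
```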

13.
With the rapid development of 3D scanning and modeling technology, the data volume of 3D scenes has grown so dramatically that they can no longer be loaded into memory at once and are difficult to render interactively. We developed QuickWalk, an interactive walkthrough system for large out-of-core scenes, which integrates view-dependent level-of-detail techniques, visibility culling, and in-core/out-of-core scheduling of scene data. It effectively reduces the memory required at run time and lets the CPU, the GPU, and I/O all work at full efficiency. Experiments show that, while preserving rendering quality, the system achieves real-time display and interactive manipulation of large out-of-core scenes on a current commodity PC.

14.
We present a novel approach for interactive rendering of massive 3D models. Our approach integrates adaptive sampling-based simplification, visibility culling, out-of-core data management, and level-of-detail. We use a unified scene-graph representation for the acceleration techniques. In preprocessing, we subdivide large objects and build a BVH clustering hierarchy. We use a novel adaptive sampling method, AdaptiveVoxels, to generate LOD models; it reduces the preprocessing cost, and our out-of-core rendering algorithm improves rendering efficiency. We have implemented our algorithm on a desktop PC. We can render massive CAD and isosurface models, consisting of hundreds of millions of triangles, interactively with little loss in image quality.

15.
We present a memory- and time-efficient surface reconstruction algorithm for large sets of points, without normal information, that may not fit within the main memory. Our algorithm treats the input points as forming layers of surface patches, and then performs reconstruction by working on multiple layers of surface patches concurrently based on an out-of-core octree. Additionally, the memory usage of the algorithm can be pre-estimated from a few simple quantities related to the spatial coherence of the input points. Tests of the algorithm on large point sets verify that it also produces good outputs.

16.
Algorithms for a Class of Isotonic Regression Problems
The isotonic regression problem has applications in statistics, operations research, and image processing. In this paper a general framework for the isotonic regression algorithm is proposed. Under this framework, we discuss the isotonic regression problem in the case where the directed graph specifying the order restriction is a directed tree with n vertices. A new algorithm is presented for this case, which can be regarded as a generalization of the PAV algorithm of Ayer et al. Using a simple tree structure such as the binomial heap, the algorithm can be implemented in O(n log n) time, improving the previously best known O(n^2)-time algorithm. We also present linear-time algorithms for special cases where the directed graph is a path or a star. Received September 2, 1997; revised January 2, 1998, and February 16, 1998.
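The sketch below shows the classic PAV algorithm on a totally ordered sequence (the path case mentioned above), pooling adjacent violators with a block stack; the weighted-mean bookkeeping is standard and not taken from the paper.

```python
# Classic pool-adjacent-violators (PAV) on a totally ordered sequence: the path case
# of isotonic regression, not the paper's generalization to directed trees.

def pav(y, w=None):
    """Non-decreasing least-squares fit to y with optional positive weights w."""
    if w is None:
        w = [1.0] * len(y)
    blocks = []                                   # each block: [weighted sum, weight, count]
    for yi, wi in zip(y, w):
        blocks.append([yi * wi, wi, 1])
        # Pool adjacent violators: merge while the previous block's mean exceeds the
        # last one's (compared without division: s_prev / w_prev > s_last / w_last).
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, ww, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += ww
            blocks[-1][2] += c
    fit = []
    for s, ww, c in blocks:
        fit.extend([s / ww] * c)                  # every element of a block gets the block mean
    return fit

# Example: pav([1, 3, 2, 4]) -> [1.0, 2.5, 2.5, 4.0]
```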

17.
Advances in Large-Scale 3D Terrain Visualization Algorithms
Large-scale terrain visualization is an indispensable part of simulating large outdoor environments and has been a research hotspot in visualization in recent years, with wide applications in games, simulation, virtual reality, and geographic information systems. This paper focuses on the research methods, latest progress, and open problems in this area, both domestic and international. Adaptive terrain modeling methods are described from two perspectives, data fitting and model simplification; according to how well they suit modern graphics hardware, terrain simplification algorithms are classified into CPU-oriented fine-grained LOD algorithms and GPU-oriented coarse-grained LOD algorithms. The spatial discontinuity problems that arise during modeling and their various solutions are described, out-of-core techniques supporting the rendering of large datasets are discussed in detail, and finally the development trends and future research priorities of terrain visualization modeling are summarized and analyzed.

18.
A new method for bandwidth allocation in large-scale communication networks is proposed. The decomposition-coordination approach from large-scale systems theory is applied to the bandwidth allocation optimization problem: the optimization problem of a large network is decomposed into optimization problems over interrelated small subnetworks. The overall bandwidth allocation problem is solved in three stages: decomposition, coordinated optimization, and merged optimization. Computational results show that, compared with existing algorithms, the decomposition-coordination large-scale bandwidth management algorithm (DCLPBM) achieves high accuracy while reducing time and space complexity. Because the coordination mechanism it uses is simple, DCLPBM is easy to extend to distributed computing environments, and it has broad prospects from the perspective of managing the network by means of the network itself.

19.
We present a technique for steganography in polygonal meshes. Our method hides a message in the indexed representation of a mesh by permuting the order in which faces and vertices are stored. The permutation is relative to a reference ordering that encoder and decoder derive from the mesh connectivity in a consistent manner. Our method is distortion-free because it does not modify the geometry of the mesh. Compared to previous steganographic methods for polygonal meshes, our capacity is up to an order of magnitude better. Our steganography algorithm is universal and can be used instead of the standard permutation steganography algorithm on arbitrary datasets. The standard algorithm runs in Ω(n^2 log^2 n log log n) time and achieves optimal O(n log n) bit capacity on datasets with n elements. In contrast, our algorithm runs in O(n) time, achieves a capacity that is only one bit per element less than optimal, and is extremely simple to implement.
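The sketch below shows the standard permutation-steganography primitive that the abstract compares against: an integer message is embedded as one of the n! orderings of a reference sequence via the factorial number system, giving roughly log2(n!) = O(n log n) bits. The paper's O(n)-time near-optimal scheme is not reproduced here.

```python
# Standard permutation steganography via the factorial number system: the message
# selects one of the n! permutations of a reference ordering shared by encoder and
# decoder. This is the baseline scheme, not the paper's O(n)-time algorithm.

def embed(message, reference):
    """Encode 0 <= message < len(reference)! as a permutation of `reference`."""
    items = list(reference)           # reference ordering shared by both sides
    out = []
    for i in range(len(items), 0, -1):
        digit = message % i           # next factorial-base digit
        message //= i
        out.append(items.pop(digit))  # emit the digit-th remaining element
    return out

def extract(permuted, reference):
    """Recover the embedded integer from the permuted sequence."""
    items = list(reference)
    message, base = 0, 1
    for i, x in enumerate(permuted):
        digit = items.index(x)
        items.pop(digit)
        message += digit * base
        base *= len(reference) - i
    return message

# Round trip: extract(embed(4097, list(range(8))), list(range(8))) == 4097
```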

20.
In this paper, we present a framework for synthesizing I/O-efficient out-of-core programs for block recursive algorithms, such as the fast Fourier transform (FFT) and block matrix transposition algorithms. Our framework uses an algebraic representation based on tensor products and other matrix operations. The programs are optimized for the striped Vitter and Shriver two-level memory model, in which data can be distributed using various cyclic(B) distributions in contrast to the normally used physical track distribution cyclic(B_d), where B_d is the physical disk block size. We first introduce tensor bases to capture the semantics of block-cyclic data distributions of out-of-core data and also data access patterns to out-of-core data. We then present program generation techniques for tensor products and matrix transposition. We accurately represent the number of parallel I/O operations required for the synthesized programs for tensor products and matrix transposition as a function of tensor bases and data distributions. We introduce an algorithm to determine the data distribution which optimizes the performance of the synthesized programs. Further, we formalize the procedure of synthesizing efficient out-of-core programs for tensor product formulas with various block-cyclic distributions as a dynamic programming problem. We demonstrate the effectiveness of our approach through several examples. We show that the choice of an appropriate data distribution can reduce the number of passes over out-of-core data by as much as eight times for a tensor product, and the dynamic programming approach can greatly reduce the number of passes for the overall tensor product formulas.
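The sketch below illustrates what a cyclic(B) distribution across D disks means at the element level; the indexing convention is an assumption for illustration, not the paper's tensor-basis formalism.

```python
# Illustration of a cyclic(B) distribution: consecutive runs of B elements form blocks,
# and blocks are dealt to D disks round-robin. The indexing convention is assumed for
# illustration only.

def cyclic_map(i, B, D):
    """Map global element index i to (disk, local block on that disk, offset in block)."""
    block = i // B                    # which block the element falls in
    return block % D, block // D, i % B

def elements_on_disk(n, B, D, disk):
    """All global indices stored on `disk` under cyclic(B) across D disks."""
    return [i for i in range(n) if (i // B) % D == disk]

# With n = 16, B = 2, D = 2:
#   disk 0 holds [0, 1, 4, 5, 8, 9, 12, 13], disk 1 holds [2, 3, 6, 7, 10, 11, 14, 15].
# Changing B changes which accesses of a tensor-product computation hit the same disk
# in the same pass, which is what the distribution-selection step optimizes.
```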
