首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
Building a visual hull model from multiple two-dimensional images provides an effective way of understanding the three-dimensional geometries inherent in the images. In this paper, we present a GPU accelerated algorithm for volumetric visual hull reconstruction that aims to harness the full compute power of the many-core processor. From a set of binary silhouette images with respective camera parameters, our parallel algorithm directly outputs the triangular mesh of the resulting visual hull in the indexed face set format for a compact mesh representation. Unlike previous approaches, the presented method extracts a smooth silhouette contour on the fly from each binary image, which markedly reduces the bumpy artifacts on the visual hull surface due to a simple binary in/out classification. In addition, it applies several optimization techniques that allow an efficient CUDA implementation. We also demonstrate that the compact mesh construction scheme can easily be modified for also producing a time- and space-efficient GPU implementation of the marching cubes algorithm.  相似文献   

2.
针对目前并行Prim最小生成树算法效率不高的问题,在分析现有并行Prim算法的基础上,提出了适于GPU架构的压缩邻接表图表示形式,开发了基于GPU的minreduction数据并行原语,在NVIDIA GPU上设计并实现了基于Prim算法思想的并行最小生成树算法。该算法通过使用原语缩短关键步骤的查找时间,从而获得较高效率。实验表明,相对于传统CPU实现算法和不使用原语的算法,该算法具有较明显的性能优势。  相似文献   

3.
In this work we describe some parallel algorithms for solving nonlinear systems using CUDA (Compute Unified Device Architecture) over a GPU (Graphics Processing Unit). The proposed algorithms are based on both the Fletcher–Reeves version of the nonlinear conjugate gradient method and a polynomial preconditioner type based on block two-stage methods. Several strategies of parallelization and different storage formats for sparse matrices are discussed. The reported numerical experiments analyze the behavior of these algorithms working in a fine grain parallel environment compared with a thread-based environment.  相似文献   

4.
提出一种基于GPU的高程并行插值算法,实现了对三维地表上海量离散点的并行加速渲染。通过高程纹理组织三维地表网格高程数据作为离散点渲染的基础,并通过GLSL编写GPU着色器程序动态控制图形渲染管线,实现视点相关的高程并行插值算法。实验结果表明,提出的基于GPU的高程并行插值算法较传统的内存插值算法,将三维地表上海量离散点的渲染量级从百万级提高到了千万级。  相似文献   

5.
许建  林泳  秦勇  黄翰 《计算机应用研究》2013,30(9):2656-2659
为提高协同过滤算法的可伸缩性, 加快其运行速度, 提出了一种基于GPU(graphic processing unit)的并行协同过滤算法来实现高速并行处理。GPU的运算模式采用单指令多数据流, 适用于逻辑性弱、数据量巨大的运算, 而这正是协同过滤算法所具有的特点。使用统一计算设备框架(compute unified device architecture, CUDA)实现了此协同过滤算法。实验表明, 在中低端的GPU上该算法与在高端的四核CPU上的协同过滤算法相比, 其加速比达到40倍以上, 显著地提高了算法的可伸缩性, 而算法在准确率方面也有优秀的表现。  相似文献   

6.
We introduce a GPU-based parallel vertex substitution (pVS) algorithm for the p-median problem using the CUDA architecture by NVIDIA. pVS is developed based on the best profit search algorithm, an implementation of vertex substitution (VS), that is shown to produce reliable solutions for p-median problems. In our approach, each candidate solution in the entire search space is allocated to a separate thread, rather than dividing the search space into parallel subsets. This strategy maximizes the usage of GPU parallel architecture and results in a significant speedup and robust solution quality. Computationally, pVS reduces the worst case complexity from sequential VS’s O(p · n2) to O(p · (n ? p)) on each thread by parallelizing computational tasks on GPU implementation. We tested the performance of pVS on two sets of numerous test cases (including 40 network instances from OR-lib) and compared the results against a CPU-based sequential VS implementation. Our results show that pVS achieved a speed gain ranging from 10 to 57 times over the traditional VS in all test network instances.  相似文献   

7.
韩玉  闫镔  宇超群  李磊  李建新 《计算机应用》2012,32(5):1407-1410
针对FDK算法重建耗时长的问题,提出了一种基于图形处理器(GPU)的FDK并行加速算法。通过采用合理的线程分配方式,对反投影参数计算过程中与体素无关的中间变量的提取和预计算、对全局存储器访问次数的细致优化等策略,提高FDK算法的执行效率。仿真实验结果表明,在不牺牲重建质量的前提下,完全优化后的FDK并行加速算法重建2563规模的体数据需要0.5s,重建5123规模的体数据需要2.5s,这与较新的研究成果相比有很大幅度的提升。  相似文献   

8.
程博  郭振宇  王军平  曹秉刚 《控制与决策》2007,22(12):1395-1398
基于克隆选择原理,提出一种自适应并行免疫进化策略.在算法中根据抗体抗原亲和度将初始抗体种群分为两个子群,相应地提出了精英克隆算子和超变异算子.通过精英克隆算子提高算法局部搜索能力,同时利用超变异算子维持种群多样性,通过这两个功能互补算子的并行操作实现种群进化.仿真表明,自适应并行免疫进化策略搜索效率高,能有效抑制早熟收敛现象,可用于解决复杂机器学习问题.  相似文献   

9.
提出了一种针对离群数据规则挖掘的决策树构造方法。通过给出一个平均致密度的新定义和对离群数据产生机制的深入分析,提出离群数据的致密度往往比正常样本数据高的新认识,指出离群数据本质上也是不平衡数据,基于此提出了一种自动标记离群数据的新算法,并进一步在该算法和C4.5算法部分功能的基础上提出了一种基于离群数据自动标记的模糊决策树构造方法。仿真实验结果表明,该方法具有高效的离群数据规则挖掘能力,能处理不平衡数据,优化决策树的结构,挖掘出更高信任度的规则,有一定的实用价值。  相似文献   

10.
In the past few decades, much success has been achieved in the use of artificial neural networks for classification, recognition, approximation and control. Flexible neural tree (FNT) is a special kind of artificial neural network with flexible tree structures. The most distinctive feature of FNT is its flexible tree structures. This makes it possible for FNT to obtain near-optimal network structures using tree structure optimization algorithms. But the modeling efficiency of FNT is always a problem due to its two-stage optimization. This paper designed a parallel evolving algorithm for FNT (PE-FNT). This algorithm uses PIPE algorithm to optimize tree structures and PSO algorithm to optimize parameters. The evaluation processes of tree structure populations and parameter populations were both parallelized. As an implementation of PE-FNT algorithm, two parallel programs were developed using MPI. A small data set, two medium data sets and three large data sets were applied for the performance evaluations of these programs. Experimental results show that PE-FNT algorithm is an effective parallel FNT algorithm especially for large data sets.  相似文献   

11.
A tree T is labeled when the n vertices are distinguished from one another by names such as v1, v2…vn . Two labeled trees are considered to be distinct if they have different vertex labels even though they might be isomorphic. According to Cayley's tree formula, there are nn-2 labeled trees on n vertices. Prufer used a simple way to prove this formula and demonstrated that there exists a mapping between a labeled tree and a number sequence. From his proof, we can find a naive sequential algorithm which transfers a labeled tree to a number sequence and vice versa. However, it is hard to parallelize. In this paper, we shall propose an O(log n) time parallel algorithm for constructing a labeled tree by using O(n) processors and O(n log n) space on the EREW PRAM computational model  相似文献   

12.
In this paper we propose a simple GPU-based approach for discrete incremental approximation of 3D Voronoi diagram. By constructing region maps via GPU. Nearest sites, space clustering, and shortest distance query can be quickly answered by looking up the region map. In addition, we propose another representation of the 3D Voronoi diagram for visualization.  相似文献   

13.
General Purpose Graphic Processing Unit (GPGPU) computing with CUDA has been effectively used in scientific applications, where huge accelerations have been achieved. However, while today’s traditional GPGPU can reduce the execution time of parallel code by many times, it comes at the expense of significant power and energy consumption. In this paper, we propose ubiquitous parallel computing approach for construction of decision tree on GPU. In our approach, we exploit parallelism of well-known ID3 algorithm for decision tree learning by two levels: at the outer level of building the tree node-by-node, and at the inner level of sorting data records within a single node. Thus, our approach not only accelerates the construction of decision tree via GPU computing, but also does so by taking care of the power and energy consumption of the GPU. Experiment results show that our approach outperforms purely GPU-based implementation and CPU-based sequential implementation by several times.  相似文献   

14.
This paper deals with the optimization of parameters of technical indicators for stock market investment. Price prediction is a problem of great complexity and, usually, some technical indicators are used to predict market trends. The main difficulty in using technical indicators lies in deciding a set of parameter values. We proposed the use of Multi-Objective Evolutionary Algorithms (MOEAs) to obtain the best parameter values belonging to a collection of indicators that will help in the buying and selling of shares. The experimental results indicate that our MOEA offers a solution to the problem by obtaining results that improve those obtained through technical indicators with standard parameters. In order to reduce execution time is necessary to parallelize the executions. Parallelization results show that distributing the workload of indicators in multiple processors to improve performance is recommended. This parallelization has been performed taking advantage of the idle time in a corporate technology infrastructure. We have configured a small parallel grid using the students Labs of a Computer Science University College.  相似文献   

15.
Robust optimization is a popular method to tackle uncertain optimization problems. However, traditional robust optimization can only find a single solution in one run which is not flexible enough for decision-makers to select a satisfying solution according to their preferences. Besides, traditional robust optimization often takes a large number of Monte Carlo simulations to get a numeric solution, which is quite time-consuming. To address these problems, this paper proposes a parallel double-level multiobjective evolutionary algorithm (PDL-MOEA). In PDL-MOEA, a single-objective uncertain optimization problem is translated into a bi-objective one by conserving the expectation and the variance as two objectives, so that the algorithm can provide decision-makers with a group of solutions with different stabilities. Further, a parallel evolutionary mechanism based on message passing interface (MPI) is proposed to parallel the algorithm. The parallel mechanism adopts a double-level design, i.e., global level and sub-problem level. The global level acts as a master, which maintains the global population information. At the sub-problem level, the optimization problem is decomposed into a set of sub-problems which can be solved in parallel, thus reducing the computation time. Experimental results show that PDL-MOEA generally outperforms several state-of-the-art serial/parallel MOEAs in terms of accuracy, efficiency, and scalability.  相似文献   

16.
蒙特卡洛树搜索算法是一种常用的强化学习算法,博弈过程中动态空间的指数级增长是制约该算法学习效率的因素。基于并行方法对蒙特卡洛树搜索算法进行优化,提出基于胜率估值传递的并行蒙特卡洛树搜索算法。改进后的并行博弈搜索策略框架包含一个主进程和多个子进程,其中子进程用于探索,主进程根据子进程传递的胜率估值数据进行决策。结合多智能体博弈平台Pommerman进行实验验证,与传统的蒙特卡罗树搜索算法相比,并行蒙特卡罗树搜索算法有效提高了资源利用率、博弈胜率及决策效率。  相似文献   

17.
Hardware/software partitioning is an essential step in hardware/software co-design. For large size problems, it is difficult to consider both solution quality and time. This paper presents an efficient GPU-based parallel tabu search algorithm (GPTS) for HW/SW partitioning. A single GPU kernel of compacting neighborhood is proposed to reduce the amount of GPU global memory accesses theoretically. A kernel fusion strategy is further proposed to reduce the amount of GPU global memory accesses of GPTS. To further minimize the transfer overhead of GPTS between CPU and GPU, an optimized transfer strategy for GPU-based tabu evaluation is proposed, which considers that all the candidates do not satisfy the given constraint. Experiments show that GPTS outperforms state-of-the-art work of tabu search and is competitive with other methods for HW/SW partitioning. The proposed parallelization is significant when considering the ordinary GPU platform.  相似文献   

18.
19.
提出了三种新的GPU并行的自适应邻域模拟退火算法,分别是GPU并行的遗传-模拟退火算法,多条马尔可夫链并行的退火算法,基于BLOCK分块的GPU并行模拟退火算法,并通过对GPU端的程序采取合并内存访问,避免bank冲突,归约法等方式进一步提升了性能。实验中选取了11个典型的基准函数,实验结果证明这三种GPU并行退火算法比nonu-SA算法具有更好的精度和更快的收敛速度。  相似文献   

20.
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号