
1.

GPU-based parallel construction of compact visual hull meshes

Byungjoon Chang Sangkyu Woo Insung Ihm 《The Visual computer》2014,30(2):201-211

Building a visual hull model from multiple two-dimensional images provides an effective way of understanding the three-dimensional geometries inherent in the images. In this paper, we present a GPU-accelerated algorithm for volumetric visual hull reconstruction that aims to harness the full compute power of the many-core processor. From a set of binary silhouette images with respective camera parameters, our parallel algorithm directly outputs the triangular mesh of the resulting visual hull in the indexed face set format for a compact mesh representation. Unlike previous approaches, the presented method extracts a smooth silhouette contour on the fly from each binary image, which markedly reduces the bumpy artifacts that a simple binary in/out classification leaves on the visual hull surface. In addition, it applies several optimization techniques that allow an efficient CUDA implementation. We also demonstrate that the compact mesh construction scheme can easily be modified to produce a time- and space-efficient GPU implementation of the marching cubes algorithm.

2.

GPU-based parallel algorithms for sparse nonlinear systems

V. Galiano H. Migallón V. Migallón J. Penadés 《Journal of Parallel and Distributed Computing》2012

In this work we describe some parallel algorithms for solving nonlinear systems using CUDA (Compute Unified Device Architecture) over a GPU (Graphics Processing Unit). The proposed algorithms are based on both the Fletcher–Reeves version of the nonlinear conjugate gradient method and a polynomial preconditioner type based on block two-stage methods. Several strategies of parallelization and different storage formats for sparse matrices are discussed. The reported numerical experiments analyze the behavior of these algorithms working in a fine grain parallel environment compared with a thread-based environment.

3.

GPU-based parallel vertex substitution algorithm for the p-median problem

Gino J. Lim Likang Ma 《Computers & Industrial Engineering》2013,64(1):381-388

We introduce a GPU-based parallel vertex substitution (pVS) algorithm for the p-median problem using the CUDA architecture by NVIDIA. pVS is developed based on the best profit search algorithm, an implementation of vertex substitution (VS) that is shown to produce reliable solutions for p-median problems. In our approach, each candidate solution in the entire search space is allocated to a separate thread, rather than dividing the search space into parallel subsets. This strategy maximizes the usage of the GPU parallel architecture and results in a significant speedup and robust solution quality. Computationally, pVS reduces the worst-case complexity from sequential VS's O(p · n²) to O(p · (n − p)) on each thread by parallelizing computational tasks in the GPU implementation. We tested the performance of pVS on two sets of numerous test cases (including 40 network instances from OR-lib) and compared the results against a CPU-based sequential VS implementation. Our results show that pVS achieved a speed gain ranging from 10 to 57 times over the traditional VS in all test network instances.
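
As a rough illustration of the vertex substitution idea, the following sequential Python sketch evaluates every (outgoing median, incoming candidate) swap and applies the best one until no swap improves the cost. It is not the authors' pVS code: the function names, the distance-matrix input, and the stopping rule are assumptions made for this example. Each swap evaluation is independent of the others, which is the property a one-candidate-per-thread GPU mapping would rely on.

```python
def total_cost(dist, medians):
    """Sum over all nodes of the distance to the nearest chosen median."""
    return sum(min(row[m] for m in medians) for row in dist)


def best_swap(dist, medians):
    """Evaluate every single swap; return (best cost, (out, cand)) or (cost, None)."""
    n = len(dist)
    chosen = set(medians)
    best_cost, best_pair = total_cost(dist, medians), None
    for out in medians:
        for cand in range(n):
            if cand in chosen:
                continue
            # Each trial is independent, so it could be evaluated by its own thread.
            trial = [cand if m == out else m for m in medians]
            cost = total_cost(dist, trial)
            if cost < best_cost:
                best_cost, best_pair = cost, (out, cand)
    return best_cost, best_pair


def vertex_substitution(dist, medians):
    """Repeat the best improving swap until a local optimum is reached."""
    medians = list(medians)
    while True:
        cost, pair = best_swap(dist, medians)
        if pair is None:
            return medians, cost
        out, cand = pair
        medians[medians.index(out)] = cand
```

For example, with six points on a line at 0, 1, 2, 10, 11, 12 and p = 2, starting from the medians at indices 0 and 1, the loop ends with one median in the middle of each cluster.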

4.

A parallel algorithm for constructing a labeled tree

Yue-Li Wang Hon-Chan Chen Wei-Kai Liu 《Parallel and Distributed Systems, IEEE Transactions on》1997,8(12):1236-1240

A tree T is labeled when the n vertices are distinguished from one another by names such as v₁, v₂, …, v_n. Two labeled trees are considered to be distinct if they have different vertex labels even though they might be isomorphic. According to Cayley's tree formula, there are n^(n−2) labeled trees on n vertices. Prüfer used a simple way to prove this formula and demonstrated that there exists a mapping between a labeled tree and a number sequence. From his proof, we can find a naive sequential algorithm which transfers a labeled tree to a number sequence and vice versa. However, it is hard to parallelize. In this paper, we shall propose an O(log n) time parallel algorithm for constructing a labeled tree by using O(n) processors and O(n log n) space on the EREW PRAM computational model.
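
The naive sequential mapping mentioned in the abstract can be sketched as follows. This is the textbook Prüfer encoding and decoding, not the paper's parallel algorithm; the function names and the choice of vertex labels 1..n are assumptions for this example, and both functions assume n ≥ 2.

```python
import heapq


def tree_to_prufer(n, edges):
    """Encode a labeled tree on vertices 1..n as its Prüfer sequence (length n - 2)."""
    adj = {v: set() for v in range(1, n + 1)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    leaves = [v for v in adj if len(adj[v]) == 1]
    heapq.heapify(leaves)
    seq = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)      # remove the smallest-labeled leaf
        parent = adj[leaf].pop()          # its only remaining neighbor
        adj[parent].remove(leaf)
        seq.append(parent)
        if len(adj[parent]) == 1:         # the neighbor has become a leaf
            heapq.heappush(leaves, parent)
    return seq


def prufer_to_tree(seq):
    """Decode a Prüfer sequence back into the edge list of the labeled tree."""
    n = len(seq) + 2
    degree = [1] * (n + 1)                # index 0 is unused
    for v in seq:
        degree[v] += 1
    leaves = [v for v in range(1, n + 1) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in seq:
        leaf = heapq.heappop(leaves)      # smallest current leaf attaches to v
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges
```

Each step depends on which vertex became a leaf in the previous step, which is one way to see why this formulation is hard to parallelize directly.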

5.

A simple GPU-based approach for 3D Voronoi diagram construction and visualization

Hsien-Hsi Hsieh Wen-Kai Tai 《Simulation Modelling Practice and Theory》2005,13(8):681-692

In this paper we propose a simple GPU-based approach for discrete incremental approximation of the 3D Voronoi diagram. By constructing region maps on the GPU, nearest-site, space-clustering, and shortest-distance queries can be answered quickly by looking up the region map. In addition, we propose another representation of the 3D Voronoi diagram for visualization.
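
To make the region-map idea concrete, here is a brute-force CPU sketch in Python that labels every voxel of a small grid with the index of its nearest site, so that a nearest-site query becomes a table lookup. The paper's incremental GPU construction is not reproduced here; the function names, the cubic grid, and the integer voxel coordinates are assumptions for this example.

```python
def build_region_map(sites, size):
    """Return grid[z][y][x] = index of the nearest site (squared Euclidean distance)."""
    def nearest(x, y, z):
        return min(
            range(len(sites)),
            key=lambda i: (sites[i][0] - x) ** 2
            + (sites[i][1] - y) ** 2
            + (sites[i][2] - z) ** 2,
        )

    return [
        [[nearest(x, y, z) for x in range(size)] for y in range(size)]
        for z in range(size)
    ]


def nearest_site(region_map, point):
    """Answer a nearest-site query by looking up the precomputed region map."""
    x, y, z = point
    return region_map[z][y][x]
```

Ties between equidistant sites go to the lower site index in this sketch.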

6.

Decision tree construction on GPU: ubiquitous parallel computing approach

Aziz Nasridinov Yangsun Lee Young-Ho Park 《Computing》2014,96(5):403-413

General Purpose Graphic Processing Unit (GPGPU) computing with CUDA has been effectively used in scientific applications, where huge accelerations have been achieved. However, while today's traditional GPGPU can reduce the execution time of parallel code by many times, it comes at the expense of significant power and energy consumption. In this paper, we propose a ubiquitous parallel computing approach for the construction of decision trees on the GPU. In our approach, we exploit the parallelism of the well-known ID3 algorithm for decision tree learning at two levels: at the outer level of building the tree node by node, and at the inner level of sorting data records within a single node. Thus, our approach not only accelerates the construction of decision trees via GPU computing, but also does so while taking care of the power and energy consumption of the GPU. Experimental results show that our approach outperforms a purely GPU-based implementation and a CPU-based sequential implementation by several times.
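
For readers unfamiliar with ID3, the sketch below shows its split criterion, information gain, in plain sequential Python. It illustrates only the per-node attribute choice, not the paper's two-level GPU scheme; the function names and the dictionary-based record format are assumptions for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy reduction obtained by splitting the records on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """ID3 picks the attribute with the highest information gain at each node."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))
```

The gain of each attribute can be computed independently of the others, and so can each tree node once its records are known.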

7.

A parallel evolutionary algorithm for technical market indicators optimization

Diego José Bodas-Sagi Pablo Fernández-Blanco José Ignacio Hidalgo Francisco José Soltero-Domingo 《Natural computing》2013,12(2):195-207

This paper deals with the optimization of parameters of technical indicators for stock market investment. Price prediction is a problem of great complexity and, usually, some technical indicators are used to predict market trends. The main difficulty in using technical indicators lies in deciding on a set of parameter values. We propose the use of Multi-Objective Evolutionary Algorithms (MOEAs) to obtain the best parameter values for a collection of indicators that will help in the buying and selling of shares. The experimental results indicate that our MOEA offers a solution to the problem by obtaining results that improve on those obtained through technical indicators with standard parameters. In order to reduce execution time, it is necessary to parallelize the executions. Parallelization results show that distributing the workload of indicators across multiple processors to improve performance is recommended. This parallelization has been performed by taking advantage of the idle time in a corporate technology infrastructure. We have configured a small parallel grid using the student labs of a Computer Science University College.

8.

A parallel double-level multiobjective evolutionary algorithm for robust optimization

《Applied Soft Computing》2017

Robust optimization is a popular method to tackle uncertain optimization problems. However, traditional robust optimization can only find a single solution in one run, which is not flexible enough for decision-makers to select a satisfying solution according to their preferences. Besides, traditional robust optimization often takes a large number of Monte Carlo simulations to get a numeric solution, which is quite time-consuming. To address these problems, this paper proposes a parallel double-level multiobjective evolutionary algorithm (PDL-MOEA). In PDL-MOEA, a single-objective uncertain optimization problem is translated into a bi-objective one by treating the expectation and the variance as two objectives, so that the algorithm can provide decision-makers with a group of solutions with different stabilities. Further, a parallel evolutionary mechanism based on the message passing interface (MPI) is proposed to parallelize the algorithm. The parallel mechanism adopts a double-level design, i.e., a global level and a sub-problem level. The global level acts as a master, which maintains the global population information. At the sub-problem level, the optimization problem is decomposed into a set of sub-problems which can be solved in parallel, thus reducing the computation time. Experimental results show that PDL-MOEA generally outperforms several state-of-the-art serial/parallel MOEAs in terms of accuracy, efficiency, and scalability.
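
The translation from one uncertain objective to two objectives can be sketched with a Monte Carlo estimate of the expectation and the variance. This is a minimal illustration, not PDL-MOEA itself: the function name, the `noise_sampler` callback, the sample count, and the fixed seed are all assumptions for this example.

```python
import random
import statistics


def mean_variance_objectives(f, x, noise_sampler, samples=200, seed=0):
    """Estimate (expectation, variance) of f(x, noise) by Monte Carlo sampling.

    f             -- uncertain objective, called as f(x, noise)
    noise_sampler -- hypothetical callback drawing one noise value from an rng
    """
    rng = random.Random(seed)  # fixed seed so repeated evaluations are comparable
    values = [f(x, noise_sampler(rng)) for _ in range(samples)]
    return statistics.fmean(values), statistics.pvariance(values)
```

A multiobjective optimizer would then minimize both returned values at once, so that a solution with a slightly worse mean but a much smaller variance can survive as a more stable alternative.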

9.

GPU-based parallel genetic approach to large-scale travelling salesman problem

Semin Kang Sung-Soo Kim Jongho Won Young-Min Kang 《The Journal of supercomputing》2016,72(11):4399-4414


10.

GPU-based image method for room impulse response calculation

Zhong-hua Fu Jian-wei Li 《Multimedia Tools and Applications》2016,75(9):5205-5221

Room impulse response (RIR) simulation based on the image-source method is widely used in room acoustic research. Computing the RIR digitally requires quantizing the sound propagation delay into discrete samples. Accounting carefully for this digitization error greatly increases the already heavy computational load of the image-source method. Therefore, many real-time audio applications simply round the propagation delay to its nearest sample. This approximation, however, especially when the sampling frequency is low, degrades the phase precision that is required by applications such as microphone arrays. In this paper, a more precise image-source model is studied, which uses a Hanning-windowed ideal low-pass filter to reduce the digitization error. We analyze its parallel calculation procedure and propose to use a Graphics Processing Unit (GPU) to accelerate the calculation. The calculation procedure is divided into many parallel threads and arranged according to the GPU architecture and its optimization criteria. We evaluate the calculation speeds of different RIRs using a general 5-core CPU, an ordinary GPU (GTX750) and an advanced GPU (K20C). The results show that, with similarly precise RIR results, the speedup ratios of the GTX750 and K20C over the general CPU can reach 20 and 120, respectively.
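
The difference between rounding a delay and spreading it with a windowed low-pass filter can be sketched in a few lines of Python. The code below is an illustration under stated assumptions, not the paper's implementation: the function names, the filter half-width of 8 samples, the 1/distance attenuation, and the default speed of sound are choices made for this example.

```python
import math


def fractional_delay_taps(delay, half_width=8):
    """Hann-windowed sinc taps centered on a (possibly fractional) delay in samples."""
    taps = {}
    first = max(0, math.ceil(delay - half_width))
    last = math.floor(delay + half_width)
    for n in range(first, last + 1):
        t = n - delay
        sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        window = 0.5 * (1.0 + math.cos(math.pi * t / half_width))
        taps[n] = sinc * window
    return taps


def add_image_source(rir, distance_m, sample_rate_hz, gain=1.0, speed_of_sound=343.0):
    """Accumulate one image source into the impulse response without rounding its delay."""
    delay = distance_m / speed_of_sound * sample_rate_hz
    for n, tap in fractional_delay_taps(delay).items():
        if n < len(rir):
            rir[n] += gain / distance_m * tap
```

Every image source contributes independently to the response, so the per-source work is a natural unit for parallel threads; on a GPU the accumulation into shared samples would additionally need atomic adds or a reduction.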

11.

A parallel nodal-based evolutionary structural optimization algorithm

Y.-M. Chen A. Bhaskar A. Keane 《Structural and Multidisciplinary Optimization》2002,23(3):241-251


12.

GPU-based parallel solver via the Kantorovich theorem for the nonlinear Bernstein polynomial systems

Feifei Wei Jieqing Feng Hongwei Lin 《Computers & Mathematics with Applications》2011,62(6):2506-2517

This paper proposes a parallel solver for the nonlinear systems in Bernstein form based on subdivision and the Newton-Raphson method, where the Kantorovich theorem is employed to identify the existence of a unique root and guarantee the convergence of the Newton-Raphson iterations. Since the Kantorovich theorem accommodates a singular Jacobian at the root, the proposed algorithm performs well in a multiple root case. Moreover, the solver is designed and implemented in parallel on a Graphics Processing Unit (GPU) with SIMD architecture; thus, efficiency for solving a large number of systems is improved greatly, an observation validated by our experimental results.

13.

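A CUDA-based parallel multi-objective evolutionary algorithm

Binbin Hu Rongbin Qi Feng Qian 《计算机与应用化学》 (Computers and Applied Chemistry) 2015,32(1)

Traditional multi-objective evolutionary algorithms are mostly quasi-random search algorithms based on the concept of Pareto optimality, and they solve problems slowly. The issue becomes more pronounced as the problem dimension grows and a larger population is required, and it has drawn increasing attention from researchers and practitioners. Simulation experiments show that two tasks, constructing the non-dominated set and maintaining population diversity, account for more than 99% of the algorithm's execution time. An effective remedy is to parallelize these parts of the algorithm. This paper proposes a parallel solution based on the CUDA platform. It uses a niching technique to implement fitness sharing and thereby maintain the diversity of the candidate solution set, and it places the entire multi-objective evolutionary algorithm on the GPU, unlike previous studies in which part of the non-dominated sorting and all of the diversity maintenance still ran on the CPU. Simulation results on the ZDT series of test functions show that the proposed algorithm performs far better than NSGA-II and NPGA. Finally, solving an oil blending process, a constrained multi-objective optimization problem, shows that the algorithm still achieves an excellent speedup on constrained multi-objective optimization problems in chemical engineering applications.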
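
The two costly steps named in the abstract, building the non-dominated set and computing niche counts for fitness sharing, can be sketched sequentially as follows. This is a generic textbook formulation for minimization, not the paper's CUDA kernels; the function names, the triangular sharing function, and the niche radius `sigma` are assumptions for this example.

```python
import math


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def nondominated(points):
    """Keep the points that no other point dominates."""
    return [
        p
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


def niche_counts(points, sigma):
    """Niche count per point using the sharing function 1 - d/sigma for d < sigma."""
    counts = []
    for p in points:
        count = 0.0
        for q in points:
            d = math.dist(p, q)
            if d < sigma:
                count += 1.0 - d / sigma
        counts.append(count)
    return counts
```

Both steps compare every individual against every other one, so the work grows quadratically with population size while each pairwise comparison stays independent, which is what makes them candidates for one-thread-per-individual parallelization.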

14.

A parallel cooperative team of multiobjective evolutionary algorithms for motif discovery

David L. González-Álvarez Miguel A. Vega-Rodríguez 《The Journal of supercomputing》2013,66(3):1576-1612

When solving a wide range of complex scenarios of a given optimization problem, it is very difficult, if not impossible, to develop a single technique or algorithm that is able to solve all of them adequately. In this case, it is necessary to combine several algorithms by applying the most appropriate one in each case. Parallel computing can be used to improve the quality of the solutions obtained in a cooperative algorithms model. Exchanging information between parallel cooperative algorithms will alter their behavior in terms of solution searching, and it may be more effective than a sequential metaheuristic. To demonstrate this, a parallel cooperative team of four multiobjective evolutionary algorithms based on OpenMP is proposed for solving different scenarios of the Motif Discovery Problem (MDP), which is an important real-world problem in the biological domain. As we will see, the results show that the application of a properly configured parallel cooperative team achieves high quality solutions when solving the addressed problem, improving on those achieved by the algorithms executed independently for a much longer time.

15.

A parallel micro evolutionary algorithm for heterogeneous computing and grid scheduling

Sergio Nesmachnow Héctor Cancela Enrique Alba 《Applied Soft Computing》2012,12(2):626-639

This work presents a novel parallel micro evolutionary algorithm for scheduling tasks in distributed heterogeneous computing and grid environments. The scheduling problem in heterogeneous environments is NP-hard, so a significant effort has been made in order to develop an efficient method to provide good schedules in reduced execution times. The parallel micro evolutionary algorithm is implemented using MALLBA, a general-purpose library for combinatorial optimization. Efficient numerical results are reported in the experimental analysis performed on both well-known problem instances and large instances that model medium-sized grid environments. The comparative study of traditional methods and evolutionary algorithms shows that the parallel micro evolutionary algorithm achieves a high problem solving efficacy, outperforming previous results already reported in the related literature, and also showing a good scalability behavior when facing high dimension problem instances.

16.

GPU-based simulation of the long-range Potts model via parallel tempering

Attila Boer 《Computer Physics Communications》2014

We discuss the efficiency of parallelization on graphical processing units (GPUs) for the simulation of the one-dimensional Potts model with long-range interactions via parallel tempering. We investigate the behavior of some thermodynamic properties, such as equilibrium energy and magnetization, critical temperatures as well as the separation between the first- and second-order regimes. By implementing multispin coding techniques and an efficient parallelization of the interaction energy computation among threads, the GPU-accelerated approach reached speedup factors of up to 37.
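
The replica-exchange step at the heart of parallel tempering can be sketched as below. It uses the standard Metropolis swap rule for neighboring inverse temperatures; the Potts energy computation and the GPU multispin coding from the paper are not shown, and the function names and the even/odd pairing via `offset` are assumptions for this example.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Metropolis acceptance probability for exchanging two replicas."""
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swaps(betas, energies, states, rng, offset=0):
    """Try to exchange neighboring replicas (i, i+1) for i = offset, offset+2, ..."""
    for i in range(offset, len(betas) - 1, 2):
        p = swap_probability(betas[i], betas[i + 1], energies[i], energies[i + 1])
        if rng.random() < p:
            states[i], states[i + 1] = states[i + 1], states[i]
            energies[i], energies[i + 1] = energies[i + 1], energies[i]
```

Between swap attempts each replica evolves independently at its own temperature, so the replicas themselves form a coarse level of parallelism on top of any per-spin parallelism. Alternating `offset` between 0 and 1 on successive calls lets every neighboring pair get a chance to swap.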

17.

A new method for actuating parallel manipulators

Awad Khidir Nik Abdullah Mohd Jailani Mohd Mohd Marzuki 《Sensors and actuators. A, Physical》2008,147(2):593-599

This paper presents a new technique for actuating a parallel platform manipulator using shape memory alloy (SMA). SMAs are smart materials that can attain a high strength-to-weight ratio, which makes them ideal for miniature applications. The work mainly develops a new SMA actuator and then incorporates it into a parallel manipulator prototype. The SMA used in this study is a commercial NiTi wire. The SMA wire provides an actuating force that produces a large bending and end displacement. A 3-UPU (universal–prismatic–universal) parallel manipulator using linear SMA actuators was developed. The manipulator consists of a fixed platform, a moving platform and three SMA actuators. The manipulator workspace was specified based on the restrictions due to actuator strokes and joint angle limits. System identification techniques were used to model both heating and cooling processes. An ON/OFF control was performed and the results showed close agreement between simulation and experiment. This study showed that a shape memory alloy actuated beam can successfully be used to provide linear displacement. The built prototype indicates the feasibility of using SMA actuators in parallel manipulators.

18.

A parallel alpha/beta tree searching algorithm

Robert M. Hyatt Bruce W. Suter Harry L. Nelson 《Parallel Computing》1989,10(3):299-308

This paper gives a theoretical foundation for using parallel processing to search an alpha/beta minimax game tree. A mathematical discussion predicts the behavior of a particular parallel algorithm, Principal Variation Splitting (PVS), along with an enhancement that improves performance for most test cases. After the theoretical discussion, some practical results obtained by running two implementations of the PVS algorithm on a 30-processor tightly coupled shared memory machine are given to show the speedup obtained by parallel processing.

19.

A parallel multi-p method

《Computers & Mathematics with Applications》2000,39(9-10):115-123

A parallel implementation of the multi-p method is discussed, using the master/slave model and the Parallel Virtual Machine (PVM) message passing library. In a series of performance tests, significant speed-up was achieved in those typical cases for the p-version where there was sufficient computational granularity to justify use of the parallel method. These tests indicate that the algorithms devised for load distribution and load balancing are sufficiently robust. These tests also indicate that, even though communication overhead in a network environment is relatively high, there is significant potential for scaling the method to larger processor ensembles.

20.

A Java-based parallel platform for the implementation of evolutionary computation for engineering applications

Chun Che Fung Jia Bin Li Kok Wai Wong Kit Po Wong 《International journal of systems science》2013,44(13-14):741-750

As computers continually improve in performance and decrease in manufacturing cost, distributed systems consisting of multiple computers implemented as parallel computation platforms have become viable for engineering applications which demand intensive computation power. This paper proposes an extended version of a previously developed low-cost parallel computation platform called para worker. The system is based on a cluster structure, which is a form of distributed system. The new system is termed para worker 2 to differentiate it from the earlier system. The new system adds improved dynamic object reallocation, adaptive consistency protocols, and location transparency. The performance of para worker 2 has proven to be superior to that of para worker. Testing was based on executing a Genetic Algorithm to solve the Economic Dispatch problem in Power Engineering. The proposal is particularly useful for the implementation and execution of computational intelligence techniques such as evolutionary computing for engineering applications.