首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
张建平  杜学东 《计算机工程》2007,33(15):96-97,100
提出了一种奇序列双调排序算法,通过分析发现,该算法对某些奇双调序列不能得到正确的排序结果。在该算法的基础上,通过增加CCI操作,得到一种改进算法,改进后的算法能对任意奇双调序列进行正确排序,且不增加存储空间,计算复杂度级别也不变。  相似文献   

2.
翁玉萍  顾乃杰  李恺  陈强 《计算机工程》2011,37(20):255-257
分析归并排序算法和快速排序算法,根据国产CPU龙芯3A的体系结构特性,提出2种优化算法并进行实现。综合利用访存特性,引入拷贝优化、循环展开、交换操作优化和不同基本排序混用等优化技术。测试结果表明,在不影响排序稳定性的前提下,与Glibc 2.11库中的排序函数相比,2种优化算法均能提升16.9%~90.5%的排序性能。  相似文献   

3.
归并排序是一种稳定,高效的排序算法。归并排序算法一般是用顺序存储结构实现的。如Sun公司JDK中Java Collection库中对数组、List的排序。使用顺序存储结构实现归并排序需要空间复杂度为O(n)的辅助存储空间,对于链表来说,还需要转换为顺序存储结构,所以共需要2n的辅助存储空间。本文提出一种链表非递归归并排序算法,可以对链表进行原地(In Place)排序,只需要O(logn)的辅助存储空间,时间复杂度不变。  相似文献   

4.
一种新的并行归并排序算法   总被引:5,自引:0,他引:5  
文章提出了一种新的并行归并排序算法。算法充分利用并行系统中各个处理机中数据排序后序列长度相等的特点,计算出归并段对中的一个元素和最后一个元素的位置,然后再从相应的位置进行归并排序。该算法可使排序后的数据分布完全达到平衡,具有较高的负载平衡性、可扩展性和排序稳定性。文章最后给出了基于PC集群的实验结果,并把该结果与PSRS算法作了比较。  相似文献   

5.
MapReduce已经发展成为大数据领域标准的并行计算模型。为了使MapReduce系统下参与计算的所有节点高度负载均衡,并且最小化空间使用率、CPU、I/O的使用时长和网络传输开销等指标,在保持算法良好并行性的基础上,提出了一种MapReduce优化算法的设计规范,对多个指标同时进行优化。针对数据处理领域最重要的排序算法进行理论分析,给出了多指标约束下的最优算法,并证明了该优化算法满足MapReduce 优化算法规范。最后通过实验验证了该优化的排序算法在有效性和效率方面严格优于传统的排序算法。  相似文献   

6.
金菁 《计算机科学》2014,41(12):155-159
MapReduce已经发展成为大数据领域标准的并行计算模型。理想情况下,一个MapReduce系统应该使参与计算的所有节点高度负载均衡,并且最小化空间使用率、CPU和I/O的使用时长以及网络传输开销。传统的算法往往只针对上述指标中的一种进行优化。在保持算法良好并行性基础上,对多个指标同时进行优化,提出了MapReduce优化算法的设计规范。针对数据处理领域最重要的排序算法进行理论分析,给出了多指标约束下的最后算法,并证明了该优化算法满足MapReduce优化算法规范。最后通过实验验证了优化的排序算法的有效性和效率。  相似文献   

7.
本文介绍一种归并排序算法--插入归并算法的基本原理,并通过该算法的Systolic阵列映射,重点阐述了正则映射生成VLSI阵列的理论和方法,最后,还指出了改进脉动阵列通用性和灵活性的途径。  相似文献   

8.
分组排序算法   总被引:3,自引:0,他引:3       下载免费PDF全文
提出了分组排序算法,详细分析了算法的原理及其时间与空间复杂度,得出了在最坏情况下的时间复杂度是θmn);最好情况和平均情况下的时间复杂度均是θnlog(n/mk));在最坏情况下的空间复杂度是O(mn-m2m);最好情况和平均情况下的空间复杂度均是O(mklog(n/mk));并用多组随机数据与效率较高的快速算法进行仿真对比实验,试验结果说明了文中结论的正确性。这一结果,将有助于进一步设计高效的海量数据分析方法。  相似文献   

9.
本文主要描述了分治策略的基本思想,并且用分治策略实现了快速排序和归并排序两种排序算法。从分、解、合三方面剖析排序,从而得出分割方式是影响排序效率的关键,并将分治法扩展应用到更多排序方法中。  相似文献   

10.
为了解决经典快速排序算法在面对待排序数据事先有序,大量重复数据,递归层数过深以及排序稳定性等诸多问题时暴露出来的缺陷,从枢轴的合理选择、三路划分、与其他排序法结合和尾递归优化等多个方面分析和总结了优化经典快速排序算法的各种策略,在实际使用快速排序算法时具有一定的参考价值.  相似文献   

11.
Graphics processor units (GPUs) have emerged as powerful parallel processors in recent years. Although floating point computations and high level programming languages are now available, the efficient use of the enormous computing power of GPUs still requires a significant amount of graphics specific knowledge.The paper explains how to use GPUs for scientific computations without graphics specific terminology. It offers an algorithmic view on GPUs with comparisons to cache aware and parallel programming of CPUs. Two typical simulation techniques, namely grid based and particle based methods are discussed.  相似文献   

12.
Graphics Processing Units (GPUs) were originally designed to manipulate images, but due to their intrinsic parallel nature, they turned into a powerful tool for scientific applications. In this article, we evaluated GPU performance in an implementation of a traditional stochastic simulation – the correlated Brownian motion. This movement can be described by the Generalized Langevin Equation (GLE), which is a stochastic integro-differential equation, with applications in many areas like anomalous diffusion, transport in porous media, noise analysis, quantum dynamics, among many others. Our results show the power inherent in GPU programming when compared to traditional CPUs (Intel): we observed acceleration values up to sixty times by using a NVIDIA GPU in place of a single-core Intel CPU.  相似文献   

13.
The sense of being within a three-dimensional (3D) space and interacting with virtual 3D objects in a computer-generated virtual environment (VE) often requires essential image, vision and sensor signal processing techniques such as differentiating and denoising. This paper describes novel implementations of the Gaussian filtering for characteristic signal extraction and wavelet-based image denoising algorithms that run on the graphics processing unit (GPU). While significant acceleration over standard CPU implementations is obtained through exploiting data parallelism provided by the modern programmable graphics hardware, the CPU can be freed up to run other computations more efficiently such as artificial intelligence (AI) and physics. The proposed GPU-based Gaussian filtering can extract surface information from a real object and provide its material features for rendering and illumination. The wavelet-based signal denoising for large size digital images realized in this project provided better realism for VE visualization without sacrificing real-time and interactive performances of an application.  相似文献   

14.
一种三路划分快速排序的改进算法   总被引:1,自引:0,他引:1  
快速排序是一种经典的排序算法,它的平均性能非常突出。针对快速排序在某些特殊情况下(如数据已有序或重复数据较多时)效率较低的问题进行了研究,对三路快速排序进行改进,使快速排序在特殊情况下也能保持较好的效率。通过大量的数据测试发现,该算法在最好情况下其性能在几个数量级上优于普通快速排序,在最坏情况下,其性能较普通快速排序无明显差距。改进后的三路快速排序是一种通用高效的排序算法,因此在某些情况下选用、该算法会获得更好的效率。  相似文献   

15.
Many complex natural systems studied in the geosciences are characterized by simple local-scale interactions that result in complex emergent behavior. Simulations of these systems, often implemented in parallel using standard central processing unit (CPU) clusters, may be better suited to parallel processing environments with large numbers of simple processors. Such an environment is found in graphics processing units (GPUs) on graphics cards.This paper discusses GPU implementations of three example applications from computational fluid dynamics, seismic wave propagation, and rock magnetism. These candidate applications involve important numerical modeling techniques, widely employed in physical system simulations, that are themselves examples of distinct computing classes identified as fundamental to scientific and engineering computing. The presented numerical methods (and respective computing classes they belong to) are: (1) a lattice-Boltzmann code for geofluid dynamics (structured grid class); (2) a spectral-finite-element code for seismic wave propagation simulations (sparse linear algebra class); and (3) a least-squares minimization code for interpreting magnetic force microscopy data (dense linear algebra class). Significant performance increases (between 10× and 30× in most cases) are seen in all three applications, demonstrating the power of GPU implementations for these types of simulations and, more generally, their associated computing classes.  相似文献   

16.
Parallelization strategies are presented for Virtual Quake, a numerical simulation code for earthquakes based on topologically realistic systems of interacting earthquake faults. One of the demands placed upon the simulation is the accurate reproduction of the observed earthquake statistics over three to four decades. This requires the use of a high‐resolution fault model in computations, which demands computational power that is well beyond the scope of off‐the‐shelf multi‐core CPU computers. However, the recent advances in general‐purpose graphic processing units have the potential to address this problem at moderate cost increments. A functional decomposition of Virtual Quake is performed, and opportunities for parallelization are discussed in this work. Computationally intensive modules are identified, and these are implemented on graphics processing units, significantly speeding up earthquake simulations. In the current best case scenario, a computer with six graphics processing units can simulate 500 years of fault activity in California at 1.5 km × 1.5 km element resolution in less than 1 hour, whereas a single CPU requires more than 2 days to perform the same simulation. Copyright © 2015 John Wiley & Sons, Ltd.  相似文献   

17.
BRAD3D, a low-cost hardware platform for the development of a realtime 3D graphics software is presented. The BRAD3D configuration is derived from a generalization of 3D image synthesis. Three basic processes have been identified: the geometric process, dealing with the measurements of the scene; the topologic process, extracting visible information from the polygonal structure; and the scan-conversion process, producing pixel values on a frame buffer. BRAD3D is implemented as a three-stage pipeline and accommodates depth-list and scan-line hidden-surface-removal algorithms. Each stage of the pipeline can be implemented using different hardware solutions. A microprocessor-based solution is presented as a general prototyping approach.  相似文献   

18.
The hierarchical Z-buffer is application-invisible and more efficient than the traditional Z-buffer for quickly rejecting hidden geometries. But there are construction and management issues associated with integrating a hierarchical Z-buffer into current graphics hardware. Here we present a two-level hierarchical Z-buffer algorithm, and provide solutions to these issues. Simulation results show that the bandwidth can be reduced by up to 35%. Moreover we propose a dynamic bi-level HZ-buffer compression technique that reduces the buffer size up by to 40%, and for which there is little performance degradation.  相似文献   

19.
3D图形流水线像素处理后期的设计和实现   总被引:1,自引:0,他引:1  
针对3D图形流水线像素处理后期的实时大批量数据处理和存储器读写要求,以及嵌入式系统资源和功耗的特殊性,给出一种像素处理后期的硬件设计方案。设计首先实现所有测试功能,确保各种效果,其次采用了基于屏幕分割渲染的设计思想,减少存储器需求,然后吸收了Early Z算法,尽早抛弃不可见的三角面信息,减少渲染的数据,最后实现了Flip Quad反走样算法,提高图像的质量。模块已经完成了RTL级建模,并在FPGA上通过验证。  相似文献   

20.
View-dependent multiresolution rendering places a heavy load on CPU. This paper presents a new method on view-dependent refinement of multiresolution meshes by using the computation power of modern programmable graphics hardware (GPU). Two rendering passes using this method are included. During the first pass, the level of detail selection is performed in the fragment shaders. The resultant buffer from the first pass is taken as the input texture to the second rendering pass by vertex texturing, and then the node culling and triangulation can be performed in the vertex shaders. Our approach can generate adaptive meshes in real-time, and can be fully implemented on GPU. The method improves the efficiency of mesh simplification, and significantly alleviates the computing load on CPU.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号