
1.

K-Dimensional Optimal Parallel Algorithm for the Solution of a General Class of Recurrence Equations (Total citations: 1; self-citations: 0; citations by others: 1)

Gao Qingshi, Liu Zhiyong 《计算机科学技术学报》 (Journal of Computer Science and Technology) 1995,10(5):417-424

This paper proposes a parallel algorithm, called KDOP (K-Dimensional Optimal Parallel algorithm), to solve a general class of recurrence equations efficiently. The KDOP algorithm partitions the computation into a series of subcomputations, each of which is executed with all the processors working simultaneously, each one running an optimal sequential algorithm on a subcomputation task. The algorithm solves the equations in O(N/p) steps on the EREW PRAM model (Exclusive Read Exclusive Write Parallel Random Access Machine) using p ≤ N^(1-ε) processors, where N is the size of the problem and ε is a given constant. This is an optimal algorithm (its speedup is O(p)) in the case of p ≤ N^(1-ε). Such an optimal speedup for this problem was previously achieved only in the case of p ≤ N^0.5. The algorithm can be implemented on machines with multiple processing elements or on pipelined vector machines with parallel memory systems.

2.

A Fast, Scalable Parallel Sorting Algorithm on Arrays with Pipelined Optical Buses

Chen Hongjian, Chen Ling, Qin Ling, Xu Xiaohua, Tu Li 《计算机工程》 (Computer Engineering) 2004,30(24):17-18,191

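Y. Pan proposed a parallel sorting algorithm on the array with pipelined optical buses model (LARPBS) that sorts N elements using N processors in O(log N) time in the best case and O(N) time in the worst case. Building on it, this paper proposes a scalable fast parallel sorting algorithm on the LARPBS model that sorts N elements using p (1 ≤ p ≤ N) processors in O(N log N / p) time in the best case and O(N^2 / p) time in the worst case. It also proposes an improved fast and efficient parallel sorting algorithm on the LARPBS model that sorts N elements using N processors in O(log √N) time in the best case and O(√N) time in the worst case.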

3.

Pyramidal thinning algorithm for SIMD parallel machines

Stéphane 《Pattern recognition》1995,28(12):1993-2000

We propose a parallel thinning algorithm for binary pictures. Given an N × N binary image including an object, our algorithm computes the skeleton of the object in O(N²), using a pyramidal decomposition of the picture. The behavior of this algorithm is studied by considering a family of digitizations of the same object at different levels of resolution. On the Exclusive Read Exclusive Write (EREW) Parallel Random Access Machine (PRAM), our algorithm runs in O(log N) time using O(N²/log N) processors, and it is work-optimal. The same result is obtained with high-connectivity distributed memory SIMD machines having strong hypercube and pyramid. We describe the basic operator, the pyramidal algorithm and some experimental results on the SIMD MasPar parallel machine.

4.

An Optimized Parallel Chinese Character/String Matching Algorithm (Total citations: 1; self-citations: 1; citations by others: 0)

Wang Suqin, Zou Xukai 《中文信息学报》 (Journal of Chinese Information Processing) 1995,9(1):49-53

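String searching means finding all occurrences of a string Pat = p1…pm in a text Text = t1…tn. This paper presents an algorithm for searching Chinese characters/strings in parallel on the CREW/CRCW PRAM machine model. It uses n/m processors, with preprocessing time O(m + |Σ|) and parallel execution time O(m).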
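
The n/m processor count suggests a block decomposition of the text. The following is a minimal sequential simulation of that idea, not the authors' algorithm: it assumes each hypothetical processor owns m consecutive start positions and scans a window of 2m-1 characters with KMP, so per-processor work is O(m). The function names (`failure_table`, `kmp_find`, `blocked_search`) are illustrative, and the O(|Σ|) part of the paper's preprocessing is not modeled.

```python
def failure_table(pat):
    """KMP failure function: fail[i] is the longest proper border of pat[:i+1]."""
    fail = [0] * len(pat)
    k = 0
    for i in range(1, len(pat)):
        while k and pat[i] != pat[k]:
            k = fail[k - 1]
        if pat[i] == pat[k]:
            k += 1
        fail[i] = k
    return fail


def kmp_find(window, pat, fail):
    """Return every start offset of pat inside window."""
    hits, k = [], 0
    for i, ch in enumerate(window):
        while k and ch != pat[k]:
            k = fail[k - 1]
        if ch == pat[k]:
            k += 1
        if k == len(pat):
            hits.append(i - len(pat) + 1)
            k = fail[k - 1]
    return hits


def blocked_search(text, pat):
    """Simulate about n/m processors, each owning m consecutive start positions."""
    n, m = len(text), len(pat)
    if m == 0 or m > n:
        return []
    fail = failure_table(pat)                    # preprocessing, O(m)
    hits = []
    for start in range(0, n, m):                 # one iteration = one assumed processor
        window = text[start:start + 2 * m - 1]   # covers starts start .. start+m-1
        hits.extend(start + off for off in kmp_find(window, pat, fail))
    return hits
```

Because Python strings are sequences of Unicode code points, the same sketch works unchanged on Chinese text, for example `blocked_search("中文信息中文", "中文")`.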

5.

Multiway merging in parallel

Zhaofang Wen 《Parallel and Distributed Systems, IEEE Transactions on》1996,7(1):11-17

The problem of merging k (k⩾2) sorted lists is considered. We give an optimal parallel algorithm which takes O((n log k/p)+log n) time using p processors on a parallel random access machine that allows concurrent reads and exclusive writes, where n is the total size of the input lists. This algorithm achieves O(log n) time using p=n log k/log n processors. Most of the previous research for this problem has been focused on the case when k=2. Very recently, parallel solutions for the case when k>2 have been reported. Our solution is the first logarithmic time optimal parallel algorithm for the problem when k⩾2. It can also be seen as a unified optimal parallel algorithm for sorting and merging. In order to support the algorithm, a new processor assignment strategy is also presented.
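
For orientation, the sequential work that a work-optimal parallel merge has to match is O(n log k). The sketch below is a plain heap-based k-way merge, not the parallel algorithm of the paper; `multiway_merge` is an illustrative name.

```python
import heapq


def multiway_merge(sorted_lists):
    """Merge k sorted lists with a size-k heap: O(n log k) comparisons in total."""
    # Each heap entry is (value, list index, position inside that list).
    heap = [(lst[0], i, 0) for i, lst in enumerate(sorted_lists) if lst]
    heapq.heapify(heap)
    merged = []
    while heap:
        value, i, j = heapq.heappop(heap)
        merged.append(value)
        if j + 1 < len(sorted_lists[i]):
            heapq.heappush(heap, (sorted_lists[i][j + 1], i, j + 1))
    return merged
```

With k = n lists of one element each the same routine sorts, and with k = 2 it is an ordinary merge, which is the sense in which a k-way merge unifies the two problems.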

6.

An Optimal Parallel Algorithm for the Knapsack Problem (Total citations: 10; self-citations: 2; citations by others: 10)

Li Qinghua, Li Kenli, Jiang Shengyi, Zhang Wei 《软件学报》 (Journal of Software) 2003,14(5):891-896

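Using a divide-and-conquer strategy, this paper proposes a parallel algorithm for the knapsack problem based on the SIMD shared-memory machine model. The algorithm allows O((2^(n/4))^(1-ε)) parallel processing units, 0 ≤ ε ≤ 1, and O(2^(n/2)) memory cells, and solves the n-dimensional knapsack problem in O(2^(n/4) (2^(n/4))^ε) time, for a cost of O(2^(n/2)). A comparison with results in the existing literature shows that the algorithm improves on those results and is a cost-optimal parallel algorithm for the knapsack problem. The paper also points out errors in the main conclusions of related work.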
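
The O(2^(n/2)) cost is the same order as the classic sequential two-list (meet-in-the-middle) approach. The sketch below shows that sequential baseline for the subset-sum form of the problem. It is an assumption for illustration only, not the parallel algorithm of the paper, and the names `subset_sums` and `two_list_subset_sum` are hypothetical.

```python
def subset_sums(weights):
    """All 2^len(weights) subset sums, in ascending order."""
    sums = [0]
    for w in weights:
        sums += [s + w for s in sums]
    return sorted(sums)


def two_list_subset_sum(weights, target):
    """Decide whether some subset of weights adds up to target."""
    half = len(weights) // 2
    left = subset_sums(weights[:half])     # about 2^(n/2) entries
    right = subset_sums(weights[half:])    # about 2^(n/2) entries
    i, j = 0, len(right) - 1
    # Walk left upward and right downward, as in a two-pointer search.
    while i < len(left) and j >= 0:
        total = left[i] + right[j]
        if total == target:
            return True
        if total < target:
            i += 1
        else:
            j -= 1
    return False
```

Sorting adds a factor of n to the 2^(n/2) list size in this simple version.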

7.

An Efficient Parallel Algorithm for Block Tridiagonal Linear Systems

Xiao Manyu, Lü Quanyi 《计算机应用与软件》 (Computer Applications and Software) 2006,23(6):107-108,134

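This paper proposes a parallel algorithm, suited to MIMD distributed-memory machines, for solving linear systems whose coefficient matrix is block tridiagonal. The algorithm is based on a decomposition of the coefficient matrix, exploits the special structure of that matrix, and applies an approximation so that the entire computation needs only two communications between adjacent processors, which gives high parallel efficiency. A sufficient condition for the validity of the algorithm is given theoretically. Finally, numerical experiments on an HP rx2600 cluster show that the speedup grows linearly and the parallel efficiency exceeds 90%.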

8.

Fast parallel algorithm for distance transform

Datta A., Soundaralakshmi S. 《IEEE transactions on systems, man, and cybernetics. Part A, Systems and humans : a publication of the IEEE Systems, Man, and Cybernetics Society》2003,33(4):429-434

We present an O((log log N)^2)-time algorithm for computing the distance transform of an N × N binary image. Our algorithm is designed for the common concurrent read concurrent write parallel random access machine (CRCW PRAM) and requires O(N^(2+ε)/log log N) processors, for any ε such that 0 < ε < 1. Our algorithm is based on a novel deterministic sampling scheme and can be used for computing distance transforms for a very general class of distance functions. We also present a scalable version of our algorithm for when the number of available processors is p^(2+ε)/log log p for some p < N. In this case, our algorithm runs in O((N^2/p^2) + (N/p) log log p + (log log p)^2) time. This scalable algorithm is more practical since usually the number of available processors is much less than the size of the image.

9.

An Optimal Parallel Heuristic Algorithm for the Traveling Salesman Problem

Sun Wei, Ma Shaohan 《计算机研究与发展》 (Journal of Computer Research and Development) 1994,31(4):45-47

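This paper presents a parallel heuristic algorithm for the traveling salesman problem satisfying the triangle inequality. On a SIMD CREW PRAM parallel machine the algorithm uses O(n^3/log^2 n) processors and takes O(log^2 n) time, where n is the number of given cities, so the parallel algorithm is optimal.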

10.

An Improved Quantum Genetic Algorithm for the Parallel Multi-Machine Scheduling Problem with Fuzzy Due Dates

Wu Lingzhi, Huang Decai 《计算机系统应用》 (Computer Systems & Applications) 2011,20(9):73-77

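Combining genetic algorithms with quantum theory, this paper proposes an improved quantum genetic algorithm (IQGA) for the parallel multi-machine scheduling problem with fuzzy due dates. Qubits are updated by a qubit phase comparison method to increase the speed and efficiency of the search, and a negation decoding operation is used to enlarge the population. Simulation shows that the improved quantum genetic algorithm has good optimization ability when solving the parallel multi-machine scheduling problem with fuzzy due dates.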

11.

A Parallel Algorithm for Block Tridiagonal Linear Systems

Xiao Manyu, Lü Quanyi, Wang Bao, Ouyang Jie 《数值计算与计算机应用》 (Journal on Numerical Methods and Computer Applications) 2007,28(4):241-249

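This paper proposes a parallel algorithm for solving block tridiagonal linear systems in a distributed environment. By splitting the coefficient matrix and exploiting its special structure, the algorithm needs only two communications between adjacent processors. A sufficient condition for the convergence of the algorithm is given theoretically, and the error is analyzed. Finally, numerical experiments on an HP rx2600 cluster show that the computed results agree with the theory and that the parallel efficiency is high.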

12.

Optimal parallel algorithms for finding proximate points, with applications

Hayashi T., Nakano K., Olariu S. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(12):1153-1166

Consider a set P of points in the plane sorted by the x-coordinate. A point p in P is said to be a proximate point if there exists a point q on the x-axis such that p is the closest point to q over all points in P. The proximate point problem is to determine all the proximate points in P. Our main contribution is to propose optimal parallel algorithms for solving instances of size n of the proximate points problem. We begin by developing a work-time optimal algorithm running in O(log log n) time and using n/log log n Common-CRCW processors. We then go on to show that this algorithm can be implemented to run in O(log n) time using n/log n EREW processors. In addition to being work-time optimal, our EREW algorithm turns out to also be time-optimal. Our second main contribution is to show that the proximate points problem finds interesting, and quite unexpected, applications to digital geometry and image processing. As a first application, we present a work-time optimal parallel algorithm for finding the convex hull of a set of n points in the plane sorted by x-coordinate; this algorithm runs in O(log log n) time using n/log log n Common-CRCW processors. We then show that this algorithm can be implemented to run in O(log n) time using n/log n EREW processors. Next, we show that the proximate points algorithms afford us work-time optimal (resp. time-optimal) parallel algorithms for various fundamental digital geometry and image processing problems.
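
The definition of a proximate point can be checked directly with a brute-force O(n²) test, sketched here as an illustration rather than as the paper's algorithm. It assumes integer coordinates and pairwise distinct x-coordinates. For two points, the part of the x-axis that is closer to one of them is a half-line bounded by the point where their perpendicular bisector crosses the axis, so a point is proximate exactly when the intersection of its half-lines is non-empty. The names `bisector_x` and `proximate_points` are hypothetical.

```python
from fractions import Fraction


def bisector_x(p, q):
    """Where the perpendicular bisector of p and q (p[0] < q[0]) meets the x-axis."""
    (px, py), (qx, qy) = p, q
    return Fraction(qx * qx + qy * qy - px * px - py * py, 2 * (qx - px))


def proximate_points(points):
    """Return the points that are closest to at least one location on the x-axis."""
    result = []
    for p in points:
        # p beats a point q on its left for every t >= bisector_x(q, p) ...
        lowers = [bisector_x(q, p) for q in points if q[0] < p[0]]
        # ... and beats a point q on its right for every t <= bisector_x(p, q).
        uppers = [bisector_x(p, q) for q in points if q[0] > p[0]]
        if not lowers or not uppers or max(lowers) <= min(uppers):
            result.append(p)
    return result
```

Ties are counted as closest in this sketch, so a point whose interval shrinks to a single location is still reported.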

13.

Space and time optimal parallel sequence alignments

Rajko S., Aluru S. 《Parallel and Distributed Systems, IEEE Transactions on》2004,15(12):1070-1081

We present the first space and time optimal parallel algorithm for the pairwise sequence alignment problem, a fundamental problem in computational biology. This problem can be solved sequentially in O(mn) time and O(m+n) space, where m and n are the lengths of the sequences to be aligned. The fastest known parallel space-optimal algorithm for pairwise sequence alignment takes optimal O((m+n)/p) space, but suboptimal O((m+n)^2/p) time, where p is the number of processors. On the other hand, the most space economical time-optimal parallel algorithm takes O(mn/p) time, but O(m+n/p) space. We close this gap by presenting an algorithm that achieves both time and space optimality, i.e. requires only O((m+n)/p) space and O(mn/p) time. We also present an experimental evaluation of the proposed algorithm on an IBM xSeries cluster. Although presented in the context of full sequence alignments, our algorithm is applicable to other alignment problems in computational biology including local alignments and syntenic alignments. It is also a useful addition to the range of techniques available for parallel dynamic programming.
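
The sequential O(mn) time bound comes from the standard dynamic program, and linear space is possible because each row depends only on the previous one. The sketch below computes a global alignment score with two rows. The scoring values (match +1, mismatch -1, gap -1) and the name `alignment_score` are assumptions for illustration, and the sketch returns the score only; recovering the alignment itself in linear space needs a divide-and-conquer refinement such as Hirschberg's method.

```python
def alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score in O(len(a) * len(b)) time and O(min(len)) space."""
    if len(b) > len(a):
        a, b = b, a                              # keep the shorter sequence as the row
    prev = [j * gap for j in range(len(b) + 1)]  # row 0: b aligned against gaps only
    for i in range(1, len(a) + 1):
        curr = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(diag, prev[j] + gap, curr[j - 1] + gap)
        prev = curr                              # only two rows are ever alive
    return prev[len(b)]
```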

14.

A Fast Parallel Sorting Algorithm on the Reconfigurable Computing Model with Wide Bus Networks

Chen Hongjian, Chen Ling, Qin Ling, Xu Xiaohua, Tu Li 《计算机工程与应用》 (Computer Engineering and Applications) 2004,40(19):48-50

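This paper introduces the basic structure of the reconfigurable computing model with wide bus networks (RAPWBN) and its prefix-sum operation on binary values, and proposes a fast parallel sorting algorithm. For a sequence of length N, on a RAPWBN model with N^2 processors and N row buses, sorting completes in O(1) time if the bus width ω > log N bytes. The cost of the algorithm is optimal.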

15.

Data parallel sorting for particle simulation

Leonardo Dagum 《Concurrency and Computation》1992,4(3):241-255

Sorting on a parallel architecture is a communications intensive event which can incur a high penalty in applications where it is required. In the case of particle simulation, only integer sorting is necessary, and sequential implementations easily attain the minimum performance bound of O(N) for N particles. Parallel implementations, however, have to cope with the parallel sorting problem which, in addition to incurring a heavy communications cost, can make the minimum performance bound difficult to attain. This paper demonstrates how the sorting problem in a particle simulation can be reduced to a merging problem, and describes an efficient data parallel algorithm to solve this merging problem in a particle simulation. The new algorithm is shown to be optimal under conditions usual for particle simulation, and its fieldwise implementation on the Connection Machine is analysed in detail. The new algorithm is about four times faster than a fieldwise implementation of radix sort on the Connection Machine.
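
The O(N) sequential bound for integer sorting can be illustrated with a counting sort keyed on a particle's cell index. This is a sketch of that sequential baseline only, not the data parallel merging algorithm of the paper. It assumes cell indices are integers in the range 0 to num_cells-1, and `sort_particles_by_cell` is a hypothetical name.

```python
def sort_particles_by_cell(cell_ids, num_cells):
    """Return particle indices ordered by cell, in O(N + num_cells) time."""
    counts = [0] * (num_cells + 1)
    for c in cell_ids:
        counts[c + 1] += 1                 # histogram, shifted by one slot
    for c in range(num_cells):
        counts[c + 1] += counts[c]         # counts[c] is now the first slot of cell c
    order = [0] * len(cell_ids)
    for idx, c in enumerate(cell_ids):
        order[counts[c]] = idx             # stable: earlier particles come first
        counts[c] += 1
    return order
```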

16.

A Parallel O(2^(5n/6)) Time-Space-Processor Tradeoff for Knapsack-like Problems

Li Kenli, Zhao Huan, Li Renfa, Li Qinghua 《软件学报》 (Journal of Software) 2007,18(6)

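Applying the sequential dynamic two-list algorithm to the design of a parallel three-list algorithm, this paper proposes a parallel three-list six-sublist algorithm for knapsack-like NP-complete problems such as knapsack, exact satisfiability, and set covering. Based on the EREW PRAM model, the algorithm solves n-dimensional knapsack-like problems with O(2^(n/8)) processors in O(2^(7n/16)) time and O(2^(13n/48)) space, for a time-space-processor tradeoff of O(2^(5n/6)), since n/8 + 7n/16 + 13n/48 = 5n/6. A performance comparison with the existing literature shows that the algorithm greatly improves the time-space-processor tradeoff for solving knapsack-like problems in parallel. Because the algorithm can break knapsack-based public-key and digital watermarking systems of higher dimension, its conclusions have theoretical and practical significance in the field of cryptanalysis.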

17.

A Note on Parallel Selection on Coarse-Grained Multicomputers (Total citations: 1; self-citations: 0; citations by others: 1)

E. L. G. Saukas, S. W. Song 《Algorithmica》1999,24(3-4):371-380

Consider the selection problem of determining the kth smallest element of a set of n elements. Under the CGM (coarse-grained multicomputer) model with p processors and O(n/p) local memory, we present a deterministic parallel algorithm for the selection problem that requires O(log p) communication rounds. Besides requiring a low number of communication rounds, the algorithm also attempts to minimize the total amount of data transmitted in each round (only O(p) except in the last round). In addition to showing theoretical complexities, we present very promising experimental results obtained on a parallel machine that show almost linear speedup, indicating the efficiency and scalability of the proposed algorithm. Received June 1, 1997; revised March 10, 1998.
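
One common way to get few rounds with only O(p) data per round is to exchange one local median and one count per processor, pick the weighted median of those medians as a pivot, and discard the side that cannot contain the answer. The sketch below simulates that scheme sequentially. It is an assumption about the general technique and not a transcription of the paper's algorithm; the stopping threshold of n/p and the name `cgm_select` are illustrative choices.

```python
def cgm_select(parts, k):
    """Return the k-th smallest (1-based) element of data split across parts."""
    parts = [list(p) for p in parts]             # one list per simulated processor
    n = sum(len(p) for p in parts)
    if not 1 <= k <= n:
        raise ValueError("k out of range")
    threshold = max(1, n // len(parts))          # assumed cut-off for the last round
    while True:
        remaining = sum(len(p) for p in parts)
        if remaining <= threshold:
            # Last round: gather what is left and finish sequentially.
            return sorted(x for p in parts for x in p)[k - 1]
        # Each processor reports (local median, local count): O(p) data per round.
        medians = sorted((sorted(p)[len(p) // 2], len(p)) for p in parts if p)
        acc = 0
        for pivot, weight in medians:            # weighted median of the medians
            acc += weight
            if 2 * acc >= remaining:
                break
        less = sum(1 for p in parts for x in p if x < pivot)
        equal = sum(1 for p in parts for x in p if x == pivot)
        if k <= less:
            parts = [[x for x in p if x < pivot] for p in parts]
        elif k <= less + equal:
            return pivot
        else:
            k -= less + equal
            parts = [[x for x in p if x > pivot] for p in parts]
```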

18.

Railway Computation for Infinite Linear Systems

Paola Favati, Grazia Lotti, Ornella Menchi, Francesco Romani 《International journal of parallel programming》2002,30(6):419-439

The problem of solving an infinite system of linear equations finitely expressed is addressed. Modifications of the Gauss–Seidel method are presented that are especially suitable for implementation on SMP machines with a small number of processors. One of the proposed parallel algorithms, which concentrates the computational effort where it is most needed, turns out to be more efficient than the sequential algorithm, even from the point of view of the total number of operations.

19.

A Nearly Fastest and Asymptotically Optimal Parallel Branch-and-Bound Algorithm

Wu Jigang, Ji Yongchang, Chen Guoliang 《软件学报》 (Journal of Software) 2000,11(12):1572-1580

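Branch-and-bound is one of the techniques for solving combinatorial optimization problems and is widely used in operations research and combinatorial mathematics. For general best-first parallel branch-and-bound algorithms on shared memory, a lower bound of Ω(m/p + h log p) on the running time is given, where p is the number of available processors, h is the number of expanded nodes, and m is the number of live nodes in the state space. By organizing the shared memory as p cubic heaps, a new general parallel branch-and-bound algorithm on the PRAM-EREW is proposed, and it is proved theoretically that for h<p^2p the algorithm is the fastest and asymptotically optimal parallel branch-and-bound algorithm. Finally, simulation results are given for the 0-r knapsack problem.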

20.

Optimal computation of prefix sums on a binary tree of processors

Henk Meijer, Selim G. Akl 《International journal of parallel programming》1987,16(2):127-136

Given n numbers a₀, a₁, ..., aₙ₋₁, it is required to compute all sums of the form a₀ + a₁ + ... + aᵢ, for i = 0, 1, ..., n-1. This problem arises in many applications and is trivial to solve sequentially in O(n) time. Besides its practical importance, the problem gains an additional theoretical interest in parallel computation. A technique known as recursive doubling allows all sums to be computed in O(log n) time on a model of computation where n processors communicate through an inverse perfect shuffle interconnection network. In this paper we show how the problem can be solved on a simple network, namely a binary tree of processors. In addition, we show how to extend our solution to obtain an optimal-cost algorithm. The algorithm uses p processors and runs in O((n/p) + log p) time, for a cost of O(n + p log p). This cost is optimal when p log p = O(n). Finally, two applications of our results are illustrated, namely job scheduling with deadlines and the knapsack problem. This work was supported by the Natural Sciences and Engineering Research Council of Canada under Grants A0282 and A3336.
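
The O((n/p) + log p) bound reflects a three-phase structure: each processor scans its own block of about n/p values, the p block totals are combined, and each block is then shifted by the total of everything before it. The sketch below simulates those phases sequentially under the assumption p ≥ 1. In a real tree of processors the middle phase would take O(log p) steps, whereas here it is an ordinary loop. The name `blocked_prefix_sums` is illustrative and the code is not taken from the paper.

```python
from itertools import accumulate


def blocked_prefix_sums(values, p):
    """Prefix sums computed block by block, as p processors would."""
    n = len(values)
    if n == 0:
        return []
    size = -(-n // p)                                  # ceil(n / p) values per block
    blocks = [values[i:i + size] for i in range(0, n, size)]
    # Phase 1: each simulated processor scans its own block, O(n/p).
    local = [list(accumulate(block)) for block in blocks]
    # Phase 2: exclusive prefix sums over the block totals (O(log p) on a tree).
    totals = list(accumulate(scan[-1] for scan in local))
    offsets = [0] + totals[:-1]
    # Phase 3: each processor adds its offset to its local results, O(n/p).
    return [x + off for scan, off in zip(local, offsets) for x in scan]
```

Multiplying the running time by p gives the stated cost of O(n + p log p), which matches the sequential O(n) whenever p log p = O(n).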