期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Parallel Algorithms for Maximum Matching in Complements of Interval Graphs and Related Problems

M. G. Andrews M. J. Atallah D. Z. Chen D. T. Lee 《Algorithmica》2000,26(2):263-289

Given a set of n intervals representing an interval graph, the problem of finding a maximum matching between pairs of disjoint (nonintersecting) intervals has been considered in the sequential model. In this paper we present parallel algorithms for computing maximum cardinality matchings among pairs of disjoint intervals in interval graphs in the EREW PRAM and hypercube models. For the general case of the problem, our algorithms compute a maximum matching in O( log ³ n) time using O(n/ log² n) processors on the EREW PRAM and using n processors on the hypercubes. For the case of proper interval graphs, our algorithm runs in O( log n ) time using O(n) processors if the input intervals are not given already sorted and using O(n/ log n ) processors otherwise, on the EREW PRAM. On n -processor hypercubes, our algorithm for the proper interval case takes O( log n log log n ) time for unsorted input and O( log n ) time for sorted input. Our parallel results also lead to optimal sequential algorithms for computing maximum matchings among disjoint intervals. In addition, we present an improved parallel algorithm for maximum matching between overlapping intervals in proper interval graphs. Received November 20, 1995; revised September 3, 1998. 相似文献

2.

An optimal speed-up parallel algorithm for triangulating simplicial point sets in space

Hossam ElGindy 《International journal of parallel programming》1986,15(5):389-398

Previous research on developing parallel triangulation algorithms concentrated on triangulating planar point sets.O(log³ n) running time algorithms usingO(n) processors have been developed in Refs. 1 and 2. Atallah and Goodrich⁽³⁾ presented a data structure that can be viewed as a parallel analogue of the sequential plane-sweeping paradigm, which can be used to triangulate a planar point set inO(logn loglogn) time usingO(n) processors. Recently Merks⁽⁴⁾ described an algorithm for triangulating point sets which runs inO(logn) time usingO(n) processors, and is thus optimal. In this paper we develop a parallel algorithm for triangulating simplicial point sets in arbitrary dimensions based on the idea of the sequential algorithm presented in Ref. 5. The algorithm runs inO(log² n) time usingO(n/logn) processors. The algorithm hasO(n logn) as the product of the running time and the number of processors; i.e., an optimal speed-up. 相似文献

3.

A Randomized Algorithm for the Voronoi Diagram of Line Segments on Coarse-Grained Multiprocessors

Xiaotie Deng Binhai Zhu 《Algorithmica》1999,24(3-4):270-286

We present a randomized algorithm for computing the Voronoi diagram of line segments using coarse-grained parallel machines. Operating on P processors, for any input of n line segments, this algorithm performs O((n log n)/P) local operations per processor, O(n/P) messages per processor, and O(1) communication phases, with high probability for n=Ω(P ^3+ε ) . Received June 1, 1997; revised March 10, 1998. 相似文献

4.

Fast Parallel Reordering and Isomorphism Testing of k -Trees

Del Greco Sekharan Sridhar 《Algorithmica》2008,32(1):61-72

Abstract. In this paper two problems on the class of k -trees, a subclass of the class of chordal graphs, are considered: the fast reordering problem and the isomorphism problem. An O(log ² n) time parallel algorithm for the fast reordering problem is described that uses O(nk(n-k)/\kern -1ptlog n) processors on a CRCW PRAM proving membership in the class NC for fixed k . An O(nk(k+1)!) time sequential algorithm for the isomorphism problem is obtained representing an improvement over the O(n ² k(k+1)!) algorithm of Sekharan (the second author) [10]. A parallel version of this sequential algorithm is presented that runs in O(log ² n) time using O((nk((k+1)!+n-k))/log n) processors improving on a parallel algorithm of Sekharan for the isomorphism problem [10]. Both the sequential and parallel algorithms use a concept introduced in this paper called the kernel of a k -tree. 相似文献

5.

Optimal Sublogarithmic Time Parallel Algorithms on Rooted Forests

G. Sajith S. Saxena 《Algorithmica》2000,27(2):187-197

The problem of finding a sublogarithmic time optimal parallel algorithm for 3 -colouring rooted forests has been open for long. We settle this problem by obtaining an O(( log log n) log^* ( log^* n)) time optimal parallel algorithm on a TOLERANT Concurrent Read Concurrent Write (CRCW) Parallel Random Access Machine (PRAM). Furthermore, we show that if f(n) is the running time of the best known algorithm for 3 -colouring a rooted forest on a COMMON or TOLERANT CRCW PRAM, a fractional independent set of the rooted forest can be found in O(f(n)) time with the same number of processors, on the same model. Using these results, it is shown that decomposable top-down algebraic computation and, hence, depth computation (ranking), 2 -colouring and prefix summation on rooted forests can be done in O( log n) optimal time on a TOLERANT CRCW PRAM. These algorithms have been obtained by proving a result of independent interest, one concerning the self-simulation property of TOLERANT: an N -processor TOLERANT CRCW PRAM that uses an address space of size O(N) only, can be simulated on an n -processor TOLERANT PRAM in O(N/n) time, with no asymptotic increase in space or cost, when n=O(N/ log log N) . Received May 20, 1997; revised June 15, 1998. 相似文献

6.

Fast Generation of Random Permutations Via Networks Simulation

A. Czumaj P. Kanarek M. Kutylowski K. Lorys 《Algorithmica》1998,21(1):2-20

We consider the problem of generating random permutations with uniform distribution. That is, we require that for an arbitrary permutation π of n elements, with probability 1/n! the machine halts with the i th output cell containing π(i) , for 1 ≤ i ≤ n . We study this problem on two models of parallel computations: the CREW PRAM and the EREW PRAM. The main result of the paper is an algorithm for generating random permutations that runs in O(log log n) time and uses O(n ^1+o(1) ) processors on the CREW PRAM. This is the first o(log n) -time CREW PRAM algorithm for this problem. On the EREW PRAM we present a simple algorithm that generates a random permutation in time O(log n) using n processors and O(n) space. This algorithm outperforms each of the previously known algorithms for the exclusive write PRAMs. The common and novel feature of both our algorithms is first to design a suitable random switching network generating a permutation and then to simulate this network on the PRAM model in a fast way. Received November 1996; revised March 1997. 相似文献

7.

An Optimal Parallel Co-Connectivity Algorithm

Ka Wong Chong Stavros D. Nikolopoulos Leonidas Palios 《Theory of Computing Systems》2004,37(4):527-546

In this paper we consider the problem of computing the connected components of the complement of a given graph. We describe a simple sequential algorithm for this problem, which works on the input graph and not on its complement, and which for a graph on n vertices and m edges runs in optimal O(n+m) time. Moreover, unlike previous linear co-connectivity algorithms, this algorithm admits efficient parallelization, leading to an optimal O(log n)-time and O((n+m)log n)-processor algorithm on the EREW PRAM model of computation. It is worth noting that, for the related problem of computing the connected components of a graph, no optimal deterministic parallel algorithm is currently available. The co-connectivity algorithms find applications in a number of problems. In fact, we also include a parallel recognition algorithm for weakly triangulated graphs, which takes advantage of the parallel co-connectivity algorithm and achieves an O(log² n) time complexity using O((n+m²) log n) processors on the EREW PRAM model of computation. 相似文献

8.

Optimal parallel algorithms for finding proximate points, withapplications

Hayashi T. Nakano K. Olariu S. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(12):1153-1166

Consider a set P of points in the plane sorted by the x-coordinate. A point p in P is said to be a proximate point if there exists a point q on the x-axis such that p is the closest point to q over all points in P. The proximate point problem is to determine all the proximate points in P. Our main contribution is to propose optimal parallel algorithms for solving instances of size n of the proximate points problem. We begin by developing a work-time optimal algorithm running in O(log log n) time and using n/loglogn Common-CRCW processors. We then go on to show that this algorithm can be implemented to run in O(log n) time using n/logn EREW processors. In addition to being work-time optimal, our EREW algorithm turns out to also be time-optimal. Our second main contribution is to show that the proximate points problem finds interesting, and quite unexpected, applications to digital geometry and image processing. As a first application, we present a work-time optimal parallel algorithm for finding the convex hull of a set of n points in the plane sorted by x-coordinate; this algorithm runs in O(log log n) time using n/logn Common-CRCW processors. We then show that this algorithm can be implemented to run in O(log n) time using n/logn EREW processors. Next, we show that the proximate points algorithms afford us work-time optimal (resp, time-optimal) parallel algorithms for various fundamental digital geometry and image processing problems 相似文献

9.

Parallel algorithms for relational coarsest partition problems 总被引：2，自引：0，他引：2

Rajasekaran S. Lee I. 《Parallel and Distributed Systems, IEEE Transactions on》1998,9(7):687-699

Relational Coarsest Partition Problems (RCPPs) play a vital role in verifying concurrent systems. It is known that RCPPs are P-complete and hence it may not be possible to design polylog time parallel algorithms for these problems. In this paper, we present two efficient parallel algorithms for RCPP in which its associated label transition system is assumed to have m transitions and n states. The first algorithm runs in O(n^1+ϵ) time using m/n^ϵ CREW PRAM processors, for any fixed ϵ<1. This algorithm is analogous to and optimal with respect to the sequential algorithm of P.C. Kanellakis and S.A. Smolka (1990). The second algorithm runs in O(n log n) time using m/n CREW PRAM processors. This algorithm is analogous to and nearly optimal with respect to the sequential algorithm of R. Paige and R.E. Tarjan (1987) 相似文献

10.

Detecting Race Conditions in Parallel Programs that Use Semaphores

Klein Netzer Lu 《Algorithmica》2008,35(4):321-345

Abstract. We address the problem of detecting race conditions in programs that use semaphores for synchronization. Netzer and Miller showed that it is NP-complete to detect race conditions in programs that use many semaphores. We show in this paper that it remains NP-complete even if only two semaphores are used in the parallel programs. For the tractable case, i.e., using only one semaphore, we give two algorithms for detecting race conditions from the trace of executing a parallel program on p processors, where n semaphore operations are executed. The first algorithm determines in O(n) time whether a race condition exists between any two given operations. The second algorithm runs in O( np log n) time and outputs a compact representation from which one can determine in O(1) time whether a race condition exists between any two given operations. The second algorithm is near-optimal in that the running time is only O( log n) times the time required simply to write down the output. 相似文献

11.

An Efficient Output-Size Sensitive Parallel Algorithm for Hidden-Surface Removal for Terrains

N. Gupta S. Sen 《Algorithmica》2001,31(2):179-207

We describe an efficient parallel algorithm for hidden-surface removal for terrain maps. The algorithm runs in O(log ⁴ n) steps on the CREW PRAM model with a work bound of O((n+k) \polylog ( n)) where n and k are the input and output sizes, respectively. In order to achieve the work bound we use a number of techniques, among which our use of persistent data structures is somewhat novel in the context of parallel algorithms. Received July 29, 1998; revised October 5, 1999. 相似文献

12.

Constructing the Voronoi diagram of a set of line segments in parallel 总被引：1，自引：1，他引：0

Michael T. Goodrich Colm Ó'Dúnlaing Chee K. Yap 《Algorithmica》1993,9(2):128-141

In this paper we give a parallel algorithm for constructing the Voronoi diagram of a polygonal scene, i.e., a set of line segments in the plane such that no two segments intersect except possibly at their endpoints. Our algorithm runs inO(log² n) time usingO(n) processors in the CREW PRAM model.The research of M. T. Goodrich was supported by NSF under Grants CCR-8810568 and CCR-9003299 and by NSF/DARPA under Grant CCR-8908092. C. K. Yap's research was supported in part by NSF Grants DCR-8401898 and CCR-9002819. 相似文献

13.

Finding least-weight subsequences with fewer processors

Tak Wah Lam Kwong-fai Chan 《Algorithmica》1993,9(6):615-628

By restricting weight functions to satisfy the quadrangle inequality or the inverse quadrangle inequality, significant progress has been made in developing efficient sequential algorithms for the least-weight subsequence problem [10], [9], [12], [16]. However, not much is known on the improvement of the naive parallel algorithm for the problem, which is fast but demands too many processors (i.e., it takesO(log² n) time on a CREW PRAM with n³/logn processors). In this paper we show that if the weight function satisfies the inverse quadrangle inequality, the problem can be solved on a CREW PRAM in O(log² n log logn) time withn/log logn processors, or in O(log² n) time withn logn processors. Notice that the processor-time complexity of our algorithm is much closer to the almost linear-time complexity of the best-known sequential algorithm [12]. 相似文献

14.

An optimal and processor efficient parallel sorting algorithm on a linear array with a reconfigurable pipelined bus system

Min He Xiaolong Wu Si Qing Zheng 《Computers & Electrical Engineering》2009,35(6):951-965

Optical interconnections attract many engineers and scientists’ attention due to their potential for gigahertz transfer rates and concurrent access to the bus in a pipelined fashion. These unique characteristics of optical interconnections give us the opportunity to reconsider traditional algorithms designed for ideal parallel computing models, such as PRAMs. Since the PRAM model is far from practice, not all algorithms designed on this model can be implemented on a realistic parallel computing system. From this point of view, we study Cole’s pipelined merge sort [Cole R. Parallel merge sort. SIAM J Comput 1988;14:770–85] on the CREW PRAM and extend it in an innovative way to an optical interconnection model, the LARPBS (Linear Array with Reconfigurable Pipelined Bus System) model [Pan Y, Li K. Linear array with a reconfigurable pipelined bus system—concepts and applications. J Inform Sci 1998;106;237–58]. Although Cole’s algorithm is optimal, communication details have not been provided due to the fact that it is designed for a PRAM. We close this gap in our sorting algorithm on the LARPBS model and obtain an O(log N)-time optimal sorting algorithm using O(N) processors. This is a substantial improvement over the previous best sorting algorithm on the LARPBS model that runs in O(log N log log N) worst-case time using N processors [Datta A, Soundaralakshmi S, Owens R. Fast sorting algorithms on a linear array with a reconfigurable pipelined bus system. IEEE Trans Parallel Distribut Syst 2002;13(3):212–22]. Our solution allows efficiently assign and reuse processors. We also discover two new properties of Cole’s sorting algorithm that are presented as lemmas in this paper. 相似文献

15.

Optimal Sequential And Parallel Algorithms To Compute A Steiner Tree On Permutation Graphs

《国际计算机数学杂志》2012,89(8):937-943

This paper presents an optimal sequential and an optimal parallel algorithm to compute a minimum cardinality Steiner set and a Steiner tree. The sequential algorithm takes O ( n ) time and parallel algorithm takes O (log n ) time and O ( n /log n ) processors on an EREW PRAM model. 相似文献

16.

Efficient parallel algorithms for computing all pair shortest paths in directed graphs

Yijie Han V. Y. Pan J. H. Reif 《Algorithmica》1997,17(4):399-415

We present parallel algorithms for computing all pair shortest paths in directed graphs. Our algorithm has time complexityO(f(n)/p+I(n)logn) on the PRAM usingp processors, whereI(n) is logn on the EREW PRAM, log logn on the CCRW PRAM,f(n) iso(n ³). On the randomized CRCW PRAM we are able to achieve time complexityO(n ³/p+logn) usingp processors. A preliminary version of this paper was presented at the 4th Annual ACM Symposium on Parallel Algorithms and Architectures, June 1992. Support by NSF Grant CCR 90-20690 and PSC CUNY Awards #661340 and #662478. 相似文献

17.

Parallel parsing algorithms for static dictionary compression

Nagumo H. Mi Lu Watson K.L. 《Parallel and Distributed Systems, IEEE Transactions on》1999,10(12):1241-1251

The data compression based on dictionary techniques works by replacing phrases in the input string with indexes into some dictionary. The dictionary can be static or dynamic. In static dictionary compression, the dictionary contains a predetermined fixed set of entries. In dynamic dictionary compression, the dictionary changes its entries during compression. We present parallel algorithms for two parsing strategies for static dictionary compression. One is the optimal parsing strategy with dictionaries that have the prefix properly, for which our algorithm requires O(L+log n) time and O(n) processors, where n is the number of symbols in the input string, and L is the maximum length of the dictionary entries, while previous results run in O(L+log n) time using O(n²) processors or in O(L+log² n) time using O(n) processors. The other is the longest fragment first (LFF) parsing strategy, for which our algorithm requires O(L+log n,) time and O(n log L) processors, while a previous result obtained an O(L log n) time performance on O(n/log n) processors. For both strategies, we derive our parallel algorithms by modifying the on-line algorithms using a pointer doubling technique 相似文献

18.

Efficient Parallel Computation of the Characteristic Polynomial of a Sparse, Separable Matrix

J. H. Reif 《Algorithmica》2001,29(3):487-510

{This paper is concerned with the problem of computing the characteristic polynomial of a matrix. In a large number of applications, the matrices are symmetric and sparse : with O(n) non-zero entries. The problem has an efficient sequential solution in this case, requiring O(n ² ) work by use of the sparse Lanczos method. A major remaining open question is: to find a polylog time parallel algorithm with matching work bounds. Unfortunately, the sparse Lanczos method cannot be parallelized to faster than time Ω (n) using n processors. Let M(n) be the processor bound to multiply two n \times n matrices in O(log n) parallel time. Giesbrecht [G2] gave the best previous polylog time parallel algorithms for the characteristic polynomial of a dense matrix with O (M(n)) processors. There is no known improvement to this processor bound in the case where the matrix is sparse. Often, in addition to being symmetric and sparse, the matrix has a sparsity graph (which has edges between indices of the matrix with non-zero entries) that has small separators. This paper gives a new algorithm for computing the characteristic polynomial of a sparse symmetric matrix, assuming that the sparsity graph is s(n) -separable and has a separator of size s(n)=O(n ^γ ) , for some γ , 0 < γ < 1 , that when deleted results in connected components of ≤α n vertices, for some 0 < α < 1 , with the same property. We derive an interesting algebraic version of Nested Dissection, which constructs a sparse factorization of the matrix A-λ I _n where A is the input matrix and I _n is the n \times n identity matrix. While Nested Dissection is commonly used to minimize the fill-in in the solution of sparse linear systems, our innovation is to use the separator structure to bound also the work for manipulation of rational functions in the recursively factored matrices. The matrix elements are assumed to be over an arbitrary field. We compute the characteristic polynomial of a sparse symmetric matrix in polylog time using P(n)(n+M(s(n))) ≤ P(n)(n+ s(n) ^2.376 ) processors, where P(n) is the processor bound to multiply two degree n polynomials in O(log n) parallel time using a PRAM (P(n) = O(n) if the field supports an FFT of size n but is otherwise O(nlog log n) [CK]. Our method requires only that a matrix be symmetric and non-singular (it need not be positive definite as usual for Nested Dissection techniques). For the frequently occurring case where the matrix has small separator size, our polylog parallel algorithm has work bounds competitive with the best known sequential algorithms (i.e., the Ω(n ² ) work of sparse Lanczos methods), for example, when the sparsity graph is a planar graph, s(n) ≤ O( \sqrt n ) , and we require polylog time with only P(n)n ^1.188 processors. } Received September 26, 1997; revised June 5, 1999. 相似文献

19.

Design and Implementation of a Practical Parallel Delaunay Algorithm 总被引：1，自引：0，他引：1

G. E. Blelloch G. L. Miller J. C. Hardwick D. Talmor 《Algorithmica》1999,24(3-4):243-269

This paper describes the design and implementation of a practical parallel algorithm for Delaunay triangulation that works well on general distributions. Although there have been many theoretical parallel algorithms for the problem, and some implementations based on bucketing that work well for uniform distributions, there has been little work on implementations for general distributions. We use the well known reduction of 2D Delaunay triangulation to find the 3D convex hull of points on a paraboloid. Based on this reduction we developed a variant of the Edelsbrunner and Shi 3D convex hull algorithm, specialized for the case when the point set lies on a paraboloid. This simplification reduces the work required by the algorithm (number of operations) from O(n log ² n) to O(n log n) . The depth (parallel time) is O( log ³ n) on a CREW PRAM. The algorithm is simpler than previous O(n log n) work parallel algorithms leading to smaller constants. Initial experiments using a variety of distributions showed that our parallel algorithm was within a factor of 2 in work from the best sequential algorithm. Based on these promising results, the algorithm was implemented using C and an MPI-based toolkit. Compared with previous work, the resulting implementation achieves significantly better speedups over good sequential code, does not assume a uniform distribution of points, and is widely portable due to its use of MPI as a communication mechanism. Results are presented for the IBM SP2, Cray T3D, SGI Power Challenge, and DEC AlphaCluster. Received June 1, 1997; revised March 10, 1998. 相似文献

20.

A Randomized Linear-Work EREW PRAM Algorithm to Find a Minimum Spanning Forest

Chung Keung Poon Vijaya Ramachandran 《Algorithmica》2008,35(3):257-268

We present a randomized EREW PRAM algorithm to find a minimum spanning forest in a weighted undirected graph. On an n -vertex graph the algorithm runs in o(( log n) ¹⁺ ^ɛ ) expected time for any ɛ >0 and performs linear expected work. This is the first linear-work, polylog-time algorithm on the EREW PRAM for this problem. This also gives parallel algorithms that perform expected linear work on two general-purpose models of parallel computation—the QSM and the BSP. 相似文献