Similar Literature
 Found 20 similar documents (search time: 31 ms)
1.
Shuffle-unshuffle sorting networks are a class of comparator networks whose structure maps efficiently to the hypercube and any of its bounded-degree variants. Recently, n-input shuffle-unshuffle sorting networks with depth have been discovered. These networks are the only known sorting networks of depth o(lg^2 n) that are not based on expanders, and their existence raises the question of whether a depth of O(lg n) can be achieved by any shuffle-unshuffle sorting network. In this paper we resolve this question by establishing an Ω(lg n lg lg n / lg lg lg n) lower bound on the depth of any n-input shuffle-unshuffle sorting network. Our lower bound can be extended to certain restricted classes of nonoblivious sorting algorithms on hypercubic machines. Received September 9, 1999, and in final form December 20, 1999.

2.
Given k terminals and n axis-parallel rectangular obstacles on the plane, our algorithm finds a plane region R* such that, for any point p in R*, the total length of the k shortest rectilinear paths connecting p and the k terminals without passing through any obstacle is minimum. The algorithm is output-sensitive, and takes O((K+n) log n) time and O(K+n) space if k is a fixed constant, where K is the total number of polygonal vertices of the found region R*.

3.
We study some minimum-area hull problems that generalize the notion of convex hull to star-shaped and monotone hulls. Specifically, we consider the minimum-area star-shaped hull problem: Given an n-vertex simple polygon P, find a minimum-area, star-shaped polygon P* containing P. This problem arises in lattice packings of translates of multiple, nonidentical shapes in material layout problems (e.g., in clothing manufacture), and has been recently posed by Daniels and Milenkovic. We consider two versions of the problem: the restricted version, in which the vertices of P* are constrained to be vertices of P, and the unrestricted version, in which the vertices of P* can be anywhere in the plane. We prove that the restricted problem falls in the class of "3SUM-hard" (sometimes called "n^2-hard") problems, which are suspected to admit no solutions in o(n^2) time. Further, we give an O(n^2) time algorithm, improving the previous bound of O(n^5). We also show that the unrestricted problem can be solved in O(n^2 p(n)) time, where p(n) is the time needed to find the roots of two equations in two unknowns, each a polynomial of degree O(n). We also consider the case in which P* is required to be monotone with respect to an unspecified direction; we refer to this as the minimum-area monotone hull problem. We give a matching lower and upper bound of Θ(n log n) time for computing P* in the restricted version, and an upper bound of O(n q(n)) time in the unrestricted version, where q(n) is the time needed to find the roots of two polynomial equations in two unknowns with degrees 2 and O(n). Received November 1996; revised March 1997.

4.
An optimal tree contraction algorithm for the boolean hypercube and the constant-degree hypercubic networks, such as the shuffle exchange or the butterfly network, is presented. The algorithm is based on novel routing techniques and, for certain small subtrees, simulates optimal PRAM algorithms. For trees of size n, stored on a p processor hypercube in in-order, the running time of the algorithm is . The resulting speed-up of is optimal due to logarithmic communication overhead, as shown by a corresponding lower bound. The same algorithmic ingredients can also be used to solve the term matching problem, one of the fundamental problems in logic programming. Received August 10, 1994; revised May 2, 1995.

5.
G. Sajith  S. Saxena 《Algorithmica》2000,27(2):187-197
The problem of finding a sublogarithmic time optimal parallel algorithm for 3-colouring rooted forests has long been open. We settle this problem by obtaining an O((log log n) log*(log* n)) time optimal parallel algorithm on a TOLERANT Concurrent Read Concurrent Write (CRCW) Parallel Random Access Machine (PRAM). Furthermore, we show that if f(n) is the running time of the best known algorithm for 3-colouring a rooted forest on a COMMON or TOLERANT CRCW PRAM, a fractional independent set of the rooted forest can be found in O(f(n)) time with the same number of processors, on the same model. Using these results, it is shown that decomposable top-down algebraic computation and, hence, depth computation (ranking), 2-colouring and prefix summation on rooted forests can be done in O(log n) optimal time on a TOLERANT CRCW PRAM. These algorithms have been obtained by proving a result of independent interest, one concerning the self-simulation property of TOLERANT: an N-processor TOLERANT CRCW PRAM that uses an address space of size O(N) only can be simulated on an n-processor TOLERANT PRAM in O(N/n) time, with no asymptotic increase in space or cost, when n = O(N / log log N). Received May 20, 1997; revised June 15, 1998.

6.
Given a set of n intervals representing an interval graph, the problem of finding a maximum matching between pairs of disjoint (nonintersecting) intervals has been considered in the sequential model. In this paper we present parallel algorithms for computing maximum cardinality matchings among pairs of disjoint intervals in interval graphs in the EREW PRAM and hypercube models. For the general case of the problem, our algorithms compute a maximum matching in O(log^3 n) time using O(n / log^2 n) processors on the EREW PRAM and using n processors on the hypercubes. For the case of proper interval graphs, our algorithm runs in O(log n) time using O(n) processors if the input intervals are not given already sorted and using O(n / log n) processors otherwise, on the EREW PRAM. On n-processor hypercubes, our algorithm for the proper interval case takes O(log n log log n) time for unsorted input and O(log n) time for sorted input. Our parallel results also lead to optimal sequential algorithms for computing maximum matchings among disjoint intervals. In addition, we present an improved parallel algorithm for maximum matching between overlapping intervals in proper interval graphs. Received November 20, 1995; revised September 3, 1998.

7.
N. Gupta  S. Sen 《Algorithmica》2001,31(2):179-207
We describe an efficient parallel algorithm for hidden-surface removal for terrain maps. The algorithm runs in O(log^4 n) steps on the CREW PRAM model with a work bound of O((n+k) polylog(n)), where n and k are the input and output sizes, respectively. In order to achieve the work bound we use a number of techniques, among which our use of persistent data structures is somewhat novel in the context of parallel algorithms. Received July 29, 1998; revised October 5, 1999.

8.
We present an optimal parallel randomized algorithm for the Voronoi diagram of a set of n nonintersecting (except possibly at endpoints) line segments in the plane. Our algorithm runs in O(log n) time with high probability using O(n) processors on a CRCW PRAM. This algorithm is optimal in terms of work done since the sequential time bound for this problem is Ω(n log n). Our algorithm improves by an O(log n) factor the previously best known deterministic parallel algorithm, given by Goodrich, Ó'Dúnlaing, and Yap, which runs in O(log^2 n) time using O(n) processors. We obtain this result by using a new "two-stage" random sampling technique. By choosing large samples in the first stage of the algorithm, we avoid the hurdle of problem-size "blow-up" that is typical in recursive parallel geometric algorithms. We combine the two-stage sampling technique with efficient search and merge procedures to obtain an optimal algorithm. This technique gives an alternative optimal algorithm for the Voronoi diagram of points as well (all other optimal parallel algorithms for this problem use the transformation to three-dimensional half-space intersection).

9.
We present parallel algorithms for computing all-pairs shortest paths in directed graphs. Our algorithm has time complexity O(f(n)/p + I(n) log n) on the PRAM using p processors, where I(n) is log n on the EREW PRAM and log log n on the CCRW PRAM, and f(n) is o(n^3). On the randomized CRCW PRAM we are able to achieve time complexity O(n^3/p + log n) using p processors. A preliminary version of this paper was presented at the 4th Annual ACM Symposium on Parallel Algorithms and Architectures, June 1992. Support by NSF Grant CCR 90-20690 and PSC CUNY Awards #661340 and #662478.

10.
Dijkstra (Commun ACM 17(11):643–644, 1974) introduced the notion of self-stabilizing algorithms and presented three such algorithms for the problem of mutual exclusion on a ring of n processors. The third algorithm is the most interesting of these three but is rather nonintuitive. In Dijkstra (Distrib Comput 1:5–6, 1986) a proof of its correctness was presented, but the question of determining its worst-case complexity (that is, providing an upper bound on the number of moves of this algorithm until it stabilizes) remained open. In this paper we solve this question and prove an upper bound of \(3\frac{13}{18} n^2 + O(n)\) for the complexity of this algorithm. We also show a lower bound of \(1\frac{5}{6} n^2 - O(n)\) for the worst-case complexity. For computing the upper bound, we use two techniques: potential functions and amortized analysis. We also present a new three-state self-stabilizing algorithm for mutual exclusion and show a tight bound of \(\frac{5}{6} n^2 + O(n)\) for the worst-case complexity of this algorithm. Beauquier and Debas (Proceedings of the second workshop on self-stabilizing systems, pp 17.1–17.13, 1995) presented a similar three-state algorithm, with an upper bound of \(5\frac{3}{4} n^2 + O(n)\) and a lower bound of \(\frac{1}{8} n^2 - O(n)\) for its stabilization time. For this algorithm we prove an upper bound of \(1\frac{1}{2} n^2 + O(n)\) and show a lower bound of \(n^2 - O(n)\). As far as worst-case performance is concerned, the algorithm in Beauquier and Debas (Proceedings of the second workshop on self-stabilizing systems, pp 17.1–17.13, 1995) is better than the one in Dijkstra (Commun ACM 17(11):643–644, 1974), and our algorithm is better than both.

11.
Let A and B be two sets of n objects in R^d, and let Match be a (one-to-one) matching between A and B. Let min(Match), max(Match), and Σ(Match) denote the length of the shortest edge, the length of the longest edge, and the sum of the lengths of the edges of Match, respectively. Bottleneck matching, a matching that minimizes max(Match), is suggested as a convenient way of measuring the resemblance between A and B. Several algorithms for computing, as well as approximating, this resemblance are proposed. The running time of all the algorithms involving planar objects is roughly O(n^1.5). For instance, if the objects are points in the plane, the running time of the exact algorithm is O(n^1.5 log n). A semidynamic data structure for answering containment problems for a set of congruent disks in the plane is developed. This data structure may be of independent interest. Next, the problem of finding a translation of B that maximizes the resemblance to A under the bottleneck matching criterion is considered. When A and B are point sets in the plane, an O(n^5 log n)-time algorithm for determining whether for some translated copy the resemblance gets below a given ρ is presented, thus improving the previous result of Alt, Mehlhorn, Wagener, and Welzl by a factor of almost n. This result is used to compute the smallest such ρ in time O(n^5 log^2 n), and an efficient approximation scheme for this problem is also given. The uniform matching problem (also called the balanced assignment problem, or the fair matching problem) is to find Match*_U, a matching that minimizes max(Match) - min(Match). A minimum deviation matching Match*_D is a matching that minimizes (1/n)Σ(Match) - min(Match). Algorithms for computing Match*_U and Match*_D in roughly O(n^{10/3}) time are presented.
These algorithms are more efficient than the previous O(n^4)-time algorithms of Martello, Pulleyblank, Toth, and de Werra, and of Gupta and Punnen, who studied these problems for general bipartite graphs. Received October 21, 1997; revised July 16, 1998.
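The definition of bottleneck matching in the abstract above can be illustrated with a brute-force sketch for points in the plane. This is not one of the algorithms the abstract describes (those run in roughly O(n^1.5) time); it simply tries every one-to-one matching, so it is only practical for very small inputs. The function name `bottleneck_matching` and the choice of Euclidean distance are assumptions made for illustration.

```python
from itertools import permutations
from math import dist


def bottleneck_matching(A, B):
    """Brute-force bottleneck matching between two equal-size point lists.

    Returns (cost, pairs), where cost is the smallest achievable max(Match)
    and pairs lists (index in A, index in B) for one optimal matching.
    Hypothetical helper: exhaustive search over all n! matchings.
    """
    if len(A) != len(B) or not A:
        raise ValueError("A and B must be non-empty and of equal size")
    best_cost, best_perm = float("inf"), None
    for perm in permutations(range(len(B))):
        # max(Match): length of the longest edge in this matching
        cost = max(dist(A[i], B[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, list(enumerate(best_perm))


if __name__ == "__main__":
    A = [(0, 0), (10, 0)]
    B = [(11, 0), (1, 0)]
    print(bottleneck_matching(A, B))
```

Replacing `max` with `sum` in the cost line would instead minimize Σ(Match), which shows how the three quantities named in the abstract lead to different optimization criteria.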

12.
The wavelet tree has become a very useful data structure to efficiently represent and query large volumes of data in many different domains, from bioinformatics to geographic information systems. One problem with wavelet trees is their construction time. In this paper, we introduce two algorithms that reduce the time complexity of a wavelet tree's construction by taking advantage of now-ubiquitous multicore machines. Our first algorithm constructs all the levels of the wavelet tree in parallel with O(n) time and \(O(n\lg \sigma + \sigma \lg n)\) bits of working space, where n is the size of the input sequence and \(\sigma \) is the size of the alphabet. Our second algorithm constructs the wavelet tree in a domain decomposition fashion, using our first algorithm in each segment, reaching \(O(\lg n)\) time and \(O(n\lg \sigma + p\sigma \lg n/\lg \sigma )\) bits of extra space, where p is the number of available cores. Both algorithms are practical and report good speedup for large real datasets.

13.
We present efficient algorithms for computing very sparse low distortion spanners in distributed networks and prove some non-trivial lower bounds on the tradeoff between time, sparseness, and distortion. All of our algorithms assume a synchronized distributed network, where relatively short messages may be communicated in each time step. Our first result is a fast distributed algorithm for finding an \(O(2^{\log^{*} n} \log n)\)-spanner with size O(n). Besides being nearly optimal in time and distortion, this algorithm appears to be the first that constructs an O(n)-size skeleton without requiring unbounded length messages or time proportional to the diameter of the network. Our second result is a new class of efficiently constructible (α, β)-spanners called Fibonacci spanners whose distortion improves with the distance being approximated. At their sparsest, Fibonacci spanners can have nearly linear size, namely \(O(n(\log \log n)^{\phi})\), where \(\phi = (1 + \sqrt{5})/2\) is the golden ratio. As the distance increases, the multiplicative distortion of a Fibonacci spanner passes through four discrete stages, moving from logarithmic to log-logarithmic, then into a period where it is constant, tending to 3, followed by another period tending to 1.
On the lower bound side we prove that many recent sequential spanner constructions have no efficient counterparts in distributed networks, even if the desired distortion only needs to be achieved on the average or for a tiny fraction of the vertices. In particular, any distance preservers, purely additive spanners, or spanners with sublinear additive distortion must either be very dense, slow to construct, or have very weak guarantees on distortion.

14.
J. H. Reif 《Algorithmica》2001,29(3):487-510
This paper is concerned with the problem of computing the characteristic polynomial of a matrix. In a large number of applications, the matrices are symmetric and sparse, with O(n) non-zero entries. The problem has an efficient sequential solution in this case, requiring O(n^2) work by use of the sparse Lanczos method. A major remaining open question is to find a polylog time parallel algorithm with matching work bounds. Unfortunately, the sparse Lanczos method cannot be parallelized to faster than time Ω(n) using n processors. Let M(n) be the processor bound to multiply two n × n matrices in O(log n) parallel time. Giesbrecht [G2] gave the best previous polylog time parallel algorithms for the characteristic polynomial of a dense matrix with O(M(n)) processors. There is no known improvement to this processor bound in the case where the matrix is sparse. Often, in addition to being symmetric and sparse, the matrix has a sparsity graph (which has edges between indices of the matrix with non-zero entries) that has small separators. This paper gives a new algorithm for computing the characteristic polynomial of a sparse symmetric matrix, assuming that the sparsity graph is s(n)-separable and has a separator of size s(n) = O(n^γ), for some γ, 0 < γ < 1, that when deleted results in connected components of ≤ αn vertices, for some 0 < α < 1, with the same property. We derive an interesting algebraic version of Nested Dissection, which constructs a sparse factorization of the matrix A - λI_n, where A is the input matrix and I_n is the n × n identity matrix. While Nested Dissection is commonly used to minimize the fill-in in the solution of sparse linear systems, our innovation is to use the separator structure to bound also the work for manipulation of rational functions in the recursively factored matrices. The matrix elements are assumed to be over an arbitrary field.
We compute the characteristic polynomial of a sparse symmetric matrix in polylog time using P(n)(n + M(s(n))) ≤ P(n)(n + s(n)^{2.376}) processors, where P(n) is the processor bound to multiply two degree n polynomials in O(log n) parallel time using a PRAM (P(n) = O(n) if the field supports an FFT of size n but is otherwise O(n log log n) [CK]). Our method requires only that a matrix be symmetric and non-singular (it need not be positive definite as usual for Nested Dissection techniques). For the frequently occurring case where the matrix has small separator size, our polylog parallel algorithm has work bounds competitive with the best known sequential algorithms (i.e., the Ω(n^2) work of sparse Lanczos methods); for example, when the sparsity graph is a planar graph, s(n) ≤ O(√n), and we require polylog time with only P(n) n^{1.188} processors. Received September 26, 1997; revised June 5, 1999.

15.
The center of area of a convex polygon P is the unique point p* that maximizes the minimum area overlap between P and any halfplane that includes p*. We show that p* is unique and present two algorithms for its computation. The first is a combinatorial algorithm that runs in time O(n^6 log^2 n). The second is a numerical algorithm that runs in time O(GK(n+K)), where K represents the number of desired bits of precision in the output coordinates and G the number of bits used to represent the coordinates of the input polygon vertices. We conclude with a discussion of implementation issues and related results. Research partially supported by the second author's NSF grant CCR-8351468, at Johns Hopkins University and Smith College.

16.
In this paper, we present randomized algorithms for selection on the hypercube. We identify two variants of the hypercube, namely, the sequential model and the parallel model. In the sequential model, any node at any time can handle only communication along a single incident edge, whereas in the parallel model a node can communicate along all its incident edges at the same time. We specify three variations of the parallel model and present optimal randomized algorithms on all three versions of the parallel model. In particular, we show that selection on an input of size n can be performed on a p-node hypercube in time O((n/p) + log p) with high probability, on any of the three versions of the parallel model. This result is important in view of a lower bound that implies that selection needs Ω((n/p) log log p + log p) time on a p-node sequential hypercube. We modify our selection algorithm to run on the sequential hypercube, in which case it runs in an expected time nearly matching this lower bound. For the special case when n = p, our selection algorithm runs in an optimal O(log n) time on the sequential hypercube. Our algorithms are very simple and are most likely to perform well in practice.

17.
Previous research on developing parallel triangulation algorithms concentrated on triangulating planar point sets. O(log^3 n) running time algorithms using O(n) processors have been developed in Refs. 1 and 2. Atallah and Goodrich (3) presented a data structure that can be viewed as a parallel analogue of the sequential plane-sweeping paradigm, which can be used to triangulate a planar point set in O(log n log log n) time using O(n) processors. Recently Merks (4) described an algorithm for triangulating point sets which runs in O(log n) time using O(n) processors, and is thus optimal. In this paper we develop a parallel algorithm for triangulating simplicial point sets in arbitrary dimensions based on the idea of the sequential algorithm presented in Ref. 5. The algorithm runs in O(log^2 n) time using O(n / log n) processors. The algorithm has O(n log n) as the product of the running time and the number of processors; i.e., an optimal speed-up.

18.
T. Uno  M. Yagiura 《Algorithmica》2000,26(2):290-309
Given two permutations of n elements, a pair of intervals of these permutations consisting of the same set of elements is called a common interval. Some genetic algorithms based on such common intervals have been proposed for sequencing problems and have exhibited good prospects. In this paper we propose three types of fast algorithms to enumerate all common intervals: (i) a simple O(n^2) time algorithm (LHP), whose expected running time becomes O(n) for two randomly generated permutations, (ii) a practically fast O(n^2) time algorithm (MNG) using the reverse Monge property, and (iii) an O(n+K) time algorithm (RC), where K is the number of common intervals. It will also be shown that the expected number of common intervals for two random permutations is O(1). This result gives a reason for the phenomenon that the expected time complexity O(n) of the algorithm LHP is independent of K. Among the proposed algorithms, RC is most desirable from the theoretical point of view; however, it is quite complicated compared with LHP and MNG. Therefore, it is possible that RC is slower than the other two algorithms in some cases. For this reason, computational experiments for various types of problems with up to n = 10^6 are conducted. The results indicate that (i) LHP and MNG are much faster than RC for two randomly generated permutations, and (ii) MNG is rather slower than LHP for random inputs; however, there are cases in which LHP requires Ω(n^2) time, but MNG runs in o(n^2) time and is faster than both LHP and RC. Received December 21, 1996; revised June 2, 1998.
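The definition of a common interval given in the abstract above can be made concrete with a naive quadratic enumeration. This sketch is not LHP, MNG, or RC; it is a plain O(n^2) scan that relies on a standard observation: if the second permutation is rewritten in terms of positions in the first, an interval is common exactly when its maximum minus its minimum equals its length minus one. The function name `common_intervals`, the 0-based inclusive index pairs, and the choice to skip single-element intervals are all assumptions made for illustration.

```python
def common_intervals(a, b):
    """Enumerate common intervals of two permutations of the same elements.

    Returns a list of ((i, j), (x, y)) such that a[i..j] and b[x..y]
    (0-based, inclusive) contain the same set of elements.
    Single-element intervals are skipped (an assumption of this sketch).
    """
    pos_in_a = {value: index for index, value in enumerate(a)}
    # Rewrite b as the positions its elements occupy in a.
    c = [pos_in_a[value] for value in b]
    n = len(c)
    found = []
    for x in range(n):
        lo = hi = c[x]
        for y in range(x + 1, n):
            lo = min(lo, c[y])
            hi = max(hi, c[y])
            # b[x..y] maps onto a contiguous block of a exactly when
            # the span of positions equals the interval length.
            if hi - lo == y - x:
                found.append(((lo, hi), (x, y)))
    return found


if __name__ == "__main__":
    print(common_intervals([1, 2, 3, 4], [2, 1, 4, 3]))
```

For the two permutations in the usage example, the common intervals are the element sets {1, 2}, {1, 2, 3, 4}, and {3, 4}.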

19.
Finding a vast array of applications, the list-ranking problem has emerged as one of the fundamental techniques in parallel algorithm design. Surprisingly, the best previously known algorithm to rank a list of n items on a reconfigurable mesh of size was running in O(log n) time. Obtaining a faster algorithm for this important problem remained open for more than 8 years. Our main contribution is to provide the first breakthrough: we propose a deterministic list-ranking algorithm that runs in O(log* n) time as well as a randomized one running in O(1) expected time, both on a reconfigurable mesh of size . Our results open the door to a large number of efficient list-ranking-based algorithms on reconfigurable meshes. Received February 1997, and in final form February 1998.

20.
Finding a vast array of applications, the list-ranking problem has emerged as one of the fundamental techniques in parallel algorithm design. Surprisingly, the best previously known algorithm to rank a list of n items on a reconfigurable mesh of size was running in O(log n) time. Obtaining a faster algorithm for this important problem remained open for more than 8 years. Our main contribution is to provide the first breakthrough: we propose a deterministic list-ranking algorithm that runs in O(log* n) time as well as a randomized one running in O(1) expected time, both on a reconfigurable mesh of size . Our results open the door to a large number of efficient list-ranking-based algorithms on reconfigurable meshes. Received November 1996, and in final form February 1998.
