首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 46 毫秒
1.
K. Diks  A. Pelc 《Algorithmica》2000,28(1):37-50
We consider broadcasting among n processors, f of which can be faulty. A fault-free processor, called the source, holds a piece of information which has to be transmitted to all other fault-free processors. We assume that the fraction f/n of faulty processors is bounded by a constant γ<1 . Transmissions are fault free. Faults are assumed to be of the crash type: faulty processors do not send or receive messages. We use the whispering model: pairs of processors communicating in one round must form a matching. A fault-free processor sending a message to another processor becomes aware of whether this processor is faulty or fault free and can adapt future transmissions accordingly. The main result of the paper is a broadcasting algorithm working in O( log n) rounds and using O(n) messages of logarithmic size, in the worst case. This is an improvement of the result from [17] where O ((log n) 2 ) rounds were used. Our method also gives the first algorithm for adaptive distributed fault diagnosis in O( log n) rounds. Received May 1997; revised May 1998.  相似文献   

2.
T. Takaoka 《Algorithmica》1998,20(3):309-318
In this paper we give three subcubic cost algorithms for the all pairs shortest distance (APSD) and path (APSP) problems. The first is a parallel algorithm that solves the APSD problem for a directed graph with unit edge costs in O(log 2 n) time with processors where μ = 2.688 on an EREW PRAM. The second parallel algorithm solves the APSP, and consequently APSD, problem for a directed graph with nonnegative general costs (real numbers) in O(log 2 n) time with o(n 3 ) subcubic cost. Previously this cost was greater than O(n 3 ) . Finally we improve with respect to M the complexity O((Mn) μ ) of a sequential algorithm for a graph with edge costs up to M to O(M 1/3 n (6+ω)/3 (log n) 2/3 (log log n) 1/3 ) in the APSD problem, where ω = 2.376 . Received October 15, 1995; revised June 21, 1996.  相似文献   

3.
Xiaotie Deng  Binhai Zhu 《Algorithmica》1999,24(3-4):270-286
We present a randomized algorithm for computing the Voronoi diagram of line segments using coarse-grained parallel machines. Operating on P processors, for any input of n line segments, this algorithm performs O((n log n)/P) local operations per processor, O(n/P) messages per processor, and O(1) communication phases, with high probability for n=Ω(P 3+ε ) . Received June 1, 1997; revised March 10, 1998.  相似文献   

4.
A planar monotone circuit (PMC) is a Boolean circuit that can be embedded in the plane and that contains only AND and OR gates. A layered PMC is a PMC in which all input nodes are in the external face, and the gates can be assigned to layers in such a way that every wire goes between gates in successive layers. Goldschlager, Cook and Dymond, and others have developed NC 2 algorithms to evaluate a layered PMC when the output node is in the same face as the input nodes. These algorithms require a large number of processors (Ω(n 6 ), where n is the size of the input circuit). In this paper we give an efficient parallel algorithm that evaluates a layered PMC of size n in time using only a linear number of processors on an EREW PRAM. Our parallel algorithm is the best possible to within a polylog factor, and is a substantial improvement over the earlier algorithms for the problem. Received April 18, 1994; revised April 7, 1995.  相似文献   

5.
We consider the problem of generating random permutations with uniform distribution. That is, we require that for an arbitrary permutation π of n elements, with probability 1/n! the machine halts with the i th output cell containing π(i) , for 1 ≤ i ≤ n . We study this problem on two models of parallel computations: the CREW PRAM and the EREW PRAM. The main result of the paper is an algorithm for generating random permutations that runs in O(log log n) time and uses O(n 1+o(1) ) processors on the CREW PRAM. This is the first o(log n) -time CREW PRAM algorithm for this problem. On the EREW PRAM we present a simple algorithm that generates a random permutation in time O(log n) using n processors and O(n) space. This algorithm outperforms each of the previously known algorithms for the exclusive write PRAMs. The common and novel feature of both our algorithms is first to design a suitable random switching network generating a permutation and then to simulate this network on the PRAM model in a fast way. Received November 1996; revised March 1997.  相似文献   

6.
Given a set of n intervals representing an interval graph, the problem of finding a maximum matching between pairs of disjoint (nonintersecting) intervals has been considered in the sequential model. In this paper we present parallel algorithms for computing maximum cardinality matchings among pairs of disjoint intervals in interval graphs in the EREW PRAM and hypercube models. For the general case of the problem, our algorithms compute a maximum matching in O( log 3 n) time using O(n/ log 2 n) processors on the EREW PRAM and using n processors on the hypercubes. For the case of proper interval graphs, our algorithm runs in O( log n ) time using O(n) processors if the input intervals are not given already sorted and using O(n/ log n ) processors otherwise, on the EREW PRAM. On n -processor hypercubes, our algorithm for the proper interval case takes O( log n log log n ) time for unsorted input and O( log n ) time for sorted input. Our parallel results also lead to optimal sequential algorithms for computing maximum matchings among disjoint intervals. In addition, we present an improved parallel algorithm for maximum matching between overlapping intervals in proper interval graphs. Received November 20, 1995; revised September 3, 1998.  相似文献   

7.
Jansen  Porkolab 《Algorithmica》2008,32(3):507-520
Abstract. A malleable parallel task is one whose execution time is a function of the number of (identical) processors alloted to it. We study the problem of scheduling a set of n independent malleable tasks on a fixed number of parallel processors, and propose an approximation scheme that for any fixed ε > 0 , computes in O(n) time a non-preemptive schedule of length at most (1+ε) times the optimum.  相似文献   

8.
N. Gupta  S. Sen 《Algorithmica》2001,31(2):179-207
We describe an efficient parallel algorithm for hidden-surface removal for terrain maps. The algorithm runs in O(log 4 n) steps on the CREW PRAM model with a work bound of O((n+k) \polylog ( n)) where n and k are the input and output sizes, respectively. In order to achieve the work bound we use a number of techniques, among which our use of persistent data structures is somewhat novel in the context of parallel algorithms. Received July 29, 1998; revised October 5, 1999.  相似文献   

9.
Abstract. We present an optimal parallel randomized algorithm for the Voronoi diagram of a set of n nonintersecting (except possibly at endpoints) line segments in the plane. Our algorithm runs in O(log n) time with high probability using O(n) processors on a CRCW PRAM. This algorithm is optimal in terms of work done since the sequential time bound for this problem is Ω(n log n) . Our algorithm improves by an O(log n) factor the previously best known deterministic parallel algorithm, given by Goodrich, ó'Dúnlaing, and Yap, which runs in O( log 2 n) time using O(n) processors. We obtain this result by using a new ``two-stage' random sampling technique. By choosing large samples in the first stage of the algorithm, we avoid the hurdle of problem-size ``blow-up' that is typical in recursive parallel geometric algorithms. We combine the two-stage sampling technique with efficient search and merge procedures to obtain an optimal algorithm. This technique gives an alternative optimal algorithm for the Voronoi diagram of points as well (all other optimal parallel algorithms for this problem use the transformation to three-dimensional half-space intersection).  相似文献   

10.
In this paper we consider the problem of finding perfect matchings in parallel. We present a RNC algorithm with almost optimal work with respect to sequential algorithms, i.e., it uses O(n ω ) processors, where ω is the matrix multiplication exponent. Our algorithm is based on an RNC algorithm for computing determinant of a degree one polynomial matrix which is of independent interest. Research supported by KBN grant 1P03A01830.  相似文献   

11.
We present parallel algorithms for computing all pair shortest paths in directed graphs. Our algorithm has time complexityO(f(n)/p+I(n)logn) on the PRAM usingp processors, whereI(n) is logn on the EREW PRAM, log logn on the CCRW PRAM,f(n) iso(n 3). On the randomized CRCW PRAM we are able to achieve time complexityO(n 3/p+logn) usingp processors. A preliminary version of this paper was presented at the 4th Annual ACM Symposium on Parallel Algorithms and Architectures, June 1992. Support by NSF Grant CCR 90-20690 and PSC CUNY Awards #661340 and #662478.  相似文献   

12.
Previous research on developing parallel triangulation algorithms concentrated on triangulating planar point sets.O(log3 n) running time algorithms usingO(n) processors have been developed in Refs. 1 and 2. Atallah and Goodrich(3) presented a data structure that can be viewed as a parallel analogue of the sequential plane-sweeping paradigm, which can be used to triangulate a planar point set inO(logn loglogn) time usingO(n) processors. Recently Merks(4) described an algorithm for triangulating point sets which runs inO(logn) time usingO(n) processors, and is thus optimal. In this paper we develop a parallel algorithm for triangulating simplicial point sets in arbitrary dimensions based on the idea of the sequential algorithm presented in Ref. 5. The algorithm runs inO(log2 n) time usingO(n/logn) processors. The algorithm hasO(n logn) as the product of the running time and the number of processors; i.e., an optimal speed-up.  相似文献   

13.
14.
Samal and Henderson claim that any parallel algorithm for enforcing arc consistency in the worst case must have (na) sequential steps, wheren is the number of nodes, anda is the number of labels per node. We argue that Samal and Henderson's argument makes assumptions about how processors are used and give a counterexample that enforces arc consistency in a constant number of steps usingO(n[su2a22na) processors. It is possible that the lower bound holds for a polynomial number of processors; if such a lower bound were to be proven it would answer an important open question in theoretical computer science concerning the relation between the complexity classesP andNC. The strongest existing lower bound for the arc consistency problem states that it cannot be solved in polynomial log time unlessP=NC.  相似文献   

15.
G. Sajith  S. Saxena 《Algorithmica》2000,27(2):187-197
The problem of finding a sublogarithmic time optimal parallel algorithm for 3 -colouring rooted forests has been open for long. We settle this problem by obtaining an O(( log log n) log * ( log * n)) time optimal parallel algorithm on a TOLERANT Concurrent Read Concurrent Write (CRCW) Parallel Random Access Machine (PRAM). Furthermore, we show that if f(n) is the running time of the best known algorithm for 3 -colouring a rooted forest on a COMMON or TOLERANT CRCW PRAM, a fractional independent set of the rooted forest can be found in O(f(n)) time with the same number of processors, on the same model. Using these results, it is shown that decomposable top-down algebraic computation and, hence, depth computation (ranking), 2 -colouring and prefix summation on rooted forests can be done in O( log n) optimal time on a TOLERANT CRCW PRAM. These algorithms have been obtained by proving a result of independent interest, one concerning the self-simulation property of TOLERANT: an N -processor TOLERANT CRCW PRAM that uses an address space of size O(N) only, can be simulated on an n -processor TOLERANT PRAM in O(N/n) time, with no asymptotic increase in space or cost, when n=O(N/ log log N) . Received May 20, 1997; revised June 15, 1998.  相似文献   

16.
D. Bini  A. Fontani 《Calcolo》1987,24(1):65-84
Fast numerical methods for the evaluation of the eigenvalues of the finite-differences laplacian over a regular hexagon are devised. At first we show how this eigenvalue problem can be splitted into 3 eigenvalue problems, of lower dimension (reduced by a factor of 1/6), for the discrete laplacian over a regular triangle with suitable boundary conditions. Then, expressing explicitely the eigenvalues and the eigenvectors of the discrete laplacian over a triangle in terms of the coefficients of the discrete Fourier transform, we show how to deal efficiently with each subproblem. In particular we show that each step of the shifted inverse power method, for the approximation of the eigenvalues, costs O(n2log n) arithmetic operations in a sequential model of computation, and O(log n) steps with n2 processors in a parallel model of computation, where n is the number of the nodes on the edge of the hexagon. Similar estimates hold for the orthogonal iterations (subspaces iterations) method and for Lanczos method. This approach includes the deflation of the eigenvalues of the triangle from those of the hexagon. These results improve the methods given by Bauer and Reiss [1] allowing a higher precision in the approximation of the eigenvalues of the laplacian and reducing the computational cost, either in a sequential or in a parallel model of computation.  相似文献   

17.
This paper shows that an N -node AKS network (as described by Paterson) can be embedded in a ( 3N / 2 ) -node twinbutterfly network (i.e., a multibutterfly constructed by superimposing two butterfly networks) with load 1, congestion 1, and dilation 2. The result has several implications, including the first deterministic algorithms for sorting and finding the median of n log n items on an n -input multibutterfly in O ( log n ) time, a work-efficient deterministic algorithm for finding the median of n log 2 n log log n items on an n -input multibutterfly in O (log n log log n ) time, and a three-dimensional VLSI layout for the n -input AKS network with volume O(n 3/2 ) . While these algorithms are not practical, they provide further evidence of the robustness of multibutterfly networks. We also present a separate, and more practical, deterministic algorithm for routing h -relations on an n -input multibutterfly in O(h + log n) time. Previously, only algorithms for solving h one-to-one routing problems were known. Finally, we show that a twinbutterfly, whose individual splitters do not exhibit expansion, can emulate a bounded-degree multibutterfly with (α,β) -expansion, for any α⋅β < 1/4 . Received July 23, 1997; revised May 18, 1998.  相似文献   

18.
Riccati and Lyapunov equations can be solved using the recursive matrix sign method applied to symmetric matrices constructed from the corresponding Hamiltonian matrices. In this paper we derive an efficient systolic implementation of that algorithm where theLDL T andUDU T decompositions of those symmetric matrices are propagated. As a result the solution of a class of Riccati and Lyapunov equations can be obtained inO(n) time steps on a bidimensional (triangular) grid ofO(n 2) processors, leading to an optimal speedup.  相似文献   

19.
Design and Implementation of a Practical Parallel Delaunay Algorithm   总被引:1,自引:0,他引:1  
This paper describes the design and implementation of a practical parallel algorithm for Delaunay triangulation that works well on general distributions. Although there have been many theoretical parallel algorithms for the problem, and some implementations based on bucketing that work well for uniform distributions, there has been little work on implementations for general distributions. We use the well known reduction of 2D Delaunay triangulation to find the 3D convex hull of points on a paraboloid. Based on this reduction we developed a variant of the Edelsbrunner and Shi 3D convex hull algorithm, specialized for the case when the point set lies on a paraboloid. This simplification reduces the work required by the algorithm (number of operations) from O(n log 2 n) to O(n log n) . The depth (parallel time) is O( log 3 n) on a CREW PRAM. The algorithm is simpler than previous O(n log n) work parallel algorithms leading to smaller constants. Initial experiments using a variety of distributions showed that our parallel algorithm was within a factor of 2 in work from the best sequential algorithm. Based on these promising results, the algorithm was implemented using C and an MPI-based toolkit. Compared with previous work, the resulting implementation achieves significantly better speedups over good sequential code, does not assume a uniform distribution of points, and is widely portable due to its use of MPI as a communication mechanism. Results are presented for the IBM SP2, Cray T3D, SGI Power Challenge, and DEC AlphaCluster. Received June 1, 1997; revised March 10, 1998.  相似文献   

20.
A parallel algorithm for transforming an n × n symmetric matrix to tridiagonal form is described. The algorithm implements Givens rotations on a square array of n × n processors in such a way that the transformation can be performed in time O(n log n). The processors require only nearest-neighbor communication. The reduction to tridiagonal form could be the first step in the parallel solution of the symmetric eigenvalue problem in time O(n log n).  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号