首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 633 毫秒
1.
The paper presents parallel algorithms for solving Poisson equation at N2 mesh points. The methods based on marching techniques are structured for efficient parallel realization. Using orthogonal decomposition properties of arising matrices, the algorithms can be formulated in terms of transformed vectors. On a MIMD computer with not more than N processors, the computations can be performed in horizontal slices with minimal synchronization requirements. Considering an SIMD machine with N2 processors, the complexity bound O(log N) has been achieved, whereby the single marching requires 10 log N steps only.  相似文献   

2.
Parallel clustering algorithms   总被引:3,自引:0,他引:3  
Clustering techniques play an important role in exploratory pattern analysis, unsupervised learning and image segmentation applications. Many clustering algorithms, both partitional clustering and hierarchical clustering, require intensive computation, even for a modest number of patterns. This paper presents two parallel clustering algorithms. For a clustering problem with N = 2n patterns and M = 2m features, the time complexity of the traditional partitional clustering algorithm on a single processor computer is O(MNK), where K is the number of clusters. The proposed algorithm on anSIMD computer with MN processors has a time complexity O(K(n + m)). The time complexity of the proposed single-link hierarchical clustering algorithm is reduced from O(MN2) of the uniprocessor algorithm to O(nN) with MN processors.  相似文献   

3.
The problem of finding a rectilinear minimum bend path (RMBP) between two designated points inside a rectilinear polygon has applications in robotics and motion planning. In this paper, we present efficient algorithms to solve the query version of the RMBP problem for special classes of rectilinear polygons given their visibility graphs. Specifically, we show that given an unweighted graph G = (V, E), with ¦V¦ = N and ¦E¦ = M, algorithms to preprocess G in linear space and time such that the shortest distance queries — queries asking for the distance between any pair of nodes in the graph — can be answered in constant time and space are presented in this paper. For the case of a chordal graph G, our algorithms give a distance which is at most one away from the actual shortest distance. When G is a K-chordal graph, our algorithm produces an exact shortest distance in O(K) time. We also present a non-trivial parallel implementation of the sequential preprocessing algorithm for the CREW-PRAM model which runs in O(log2 N) time using O(N + M) processors. After the preprocessing, we can answer the queries in constant time using a single processor.  相似文献   

4.
Several fast and space-optimal sequential and parallel algorithms for solving the satisfaction problem of functional and multivalued dependencies (FDs and MVDs) are presented. Two frameworks to verify an MVD for a relation and their implementation by exploring the existing fast space-optimal sorting techniques are described. The space optimality means that only a constant amount of extra memory space is needed for the sequential implementations, and O(M) amount of extra memory space for parallel algorithms that use M processors. This feature makes the algorithms attractive whenever space is a critical resource and I/O transfers should be reduced to the minimal, as is often the case for relational database systems. The time requirements for in-place FD and MVD verification are given in terms of M and of N, which is the number of tuples in a relation. The effect of relation modification on FD and MVD verification is examined  相似文献   

5.
This paper concerns generally the satisfaction and the inference problem involving functional and/or multivalued dependencies in a relational database. In particular, two independent aids in solving an inference problem, concerning the logical counterparts of functional as well as multivalued dependencies, are introduced. The first aid is provided by establishing a pair of complementary inequivalence and equivalence theorems between the propositional formula corresponding to the difference, U-X, in set theory and the propositional formula not(X) where U is a relation scheme and X is a subset of U. By applying these theorems, correctness of solving an inference problem is assured. The second aid is the application of a Venn diagram for simplifying a propositional formula involving conjunctions, differences, etc., for solving an inference problem. A guideline for constructing simplified Venn diagrams is also given and discussed.  相似文献   

6.
This paper makes an improvement of computing two nearest-neighbor problems of images on a reconfigurable array of processors (RAP) by increasing the bus width between processors. Based on a base-n system, a constant time algorithm is first presented for computing the maximum/minimum of N log N-bit unsigned integers on a RAP using N processors each with N1/c-bit bus width, where c is a constant and c ≥ 1. Then, two basic operations such as image component labeling and border following are also derived from it. Finally, these algorithms are used to design two constant time algorithms for the nearest neighbor black pixel and the nearest neighbor component problems on an N1/2 × N1/2 image using N1/2 × N1/2 processors each with N1/c-bit bus width, where c is a constant and c ≥ 1. Another contribution of this paper is that the execution time of the proposed algorithms is tunable by the bus width.  相似文献   

7.
In the literature, there are quite a few sequential and parallel algorithms for solving problems on distance-hereditary graphs. With an n-vertex and m-edge distance-hereditary graph G, we show that the efficient domination problem on G can be solved in O(log/sup 2/ n) time using O(n + m) processors on a CREW PRAM. Moreover, if a binary tree representation of G is given, the problem can be optimally solved in O(log n) time using O(n/log n) processors on an EREW PRAM.  相似文献   

8.
Nested dissection is a very popular direct method for solving sparse linear systems that arise from finite difference and finite element methods. Worley and Schreiber [16] give a fine grain algorithm for a square array of processors. Their algorithm uses O(N2) processors, each with O(N) memory, to factor an N2 by N2 sparse matrix whose graphs is an N × N mesh. The efficiency of their method is between 1/46 and 1/12. George et al. [6] [8] give a medium grain algorithm for hypercube architecture, while George et al. [7] give an algorithm for shared memory machines. These papers present a column oriented approach which can exploit O(N) parallelism and yield efficiencies up to 50%. Lucas [11] also gives a column oriented scheme which achieves up to 75% efficiency and O(N) parallelism. In this paper, we present a medium to fine grain algorithm for a P × P array of processors with local memory. This algorithm can exploit up to O(N2) parallelism. The efficiency of the fine grain version is comparable to [16] while as a medium grain algorithm achieves about 49% efficiency. The strength of the method is due to three factors: its ability to pipeline much of the computation, overlapping computation and communication, and the use of level 3 BLAS like primitives. In addition to its high efficiency its memory requirement is optimal, only O(N2 log N/P2) words memory is needed per processor.  相似文献   

9.
In the long-lived M-renaming problem, N processes repeatedly acquire and release names ranging over {0, …, M − 1}, where M < N. It is assumed that at most k M processes concurrently request or hold names. Efficient solutions to the long-lived renaming problem can be used to improve the performance of applications in which processes repeatedly perform computations whose time complexity depends on the size of the name space containing the processes that participate concurrently. In this paper, we consider wait-free solutions to the long-lived M -renaming problem that use only read and write instructions in an asynchronous, shared-memory multiprocessor. A solution to long-lived renaming is fast if the time complexity of acquiring and releasing a name once is independent of N. We present a new fast, long-lived (k(k + 1)/2)renaming algorithm that significantly improves upon the time and space complexity of similar previous algorithms, while providing a much simpler solution. We also show that fast, long-lived (2k − 1)-renaming can be implemented with reads and writes. This result is optimal with respect to the size of the name space.  相似文献   

10.
The paper describes the implementation of the Successive Overrelaxation (SOR) method on an asynchronous multiprocessor computer for solving large, linear systems. The parallel algorithm is derived by dividing the serial SOR method into noninterfering tasks which are then combined with an optimal schedule of a feasible number of processors. The important features of the algorithm are: (i) achieves a speedup Sp O(N/3) and an efficiency Ep 2/3 using P = [N/2] processors, where N is the number of the equations, (ii) contains a high level of inherent parallelism, whereas on the other hand, the convergence theory of the parallel SOR method is the same as its sequential counterpart and (iii) may be modified to use block methods in order to minimise the overhead due to communication and synchronisation of the processors.  相似文献   

11.
Given a graph G=(V, E) with n vertices and m edges, the k-connectivity of G denotes either the k-edge connectivity or the k-vertex connectivity of G. In this paper, we deal with the fully dynamic maintenance of k-connectivity of G in the parallel setting for k=2, 3. We study the problem of maintaining k-edge/vertex connected components of a graph undergoing repeatedly dynamic updates, such as edge insertions and deletions, and answering the query of whether two vertices are included in the same k-edge/vertex connected component. Our major results are the following: (1) An NC algorithm for the 2-edge connectivity problem is proposed, which runs in O(log n log(m/n)) time using O(n3/4) processors per update and query. (2) It is shown that the biconnectivity problem can be solved in O(log2 n ) time using O(nα(2n, n)/logn) processors per update and O(1) time with a single processor per query or in O(log n logn/m) time using O(nα(2n, n)/log n) processors per update and O(logn) time using O(nα(2n, n)/logn) processors per query, where α(.,.) is the inverse of Ackermann's function. (3) An NC algorithm for the triconnectivity problem is also derived, which takes O(log n logn/m+logn log log n/α(3n, n)) time using O(nα(3n, n)/log n) processors per update and O(1) time with a single processor per query. (4) An NC algorithm for the 3-edge connectivity problem is obtained, which has the same time and processor complexities as the algorithm for the triconnectivity problem. To the best of our knowledge, the proposed algorithms are the first NC algorithms for the problems using O(n) processors in contrast to Ω(m) processors for solving them from scratch. In particular, the proposed NC algorithm for the 2-edge connectivity problem uses only O(n3/4) processors. All the proposed algorithms run on a CRCW PRAM  相似文献   

12.
An efficient parallel algorithm for merging two sorted lists is presented. The algorithm is based on a novel partitioning algorithm that splits the two lists among the processors, in a way that ensures load balance during the merge. The partitioning algorithm can itself be efficiently parallelized, allowing the solution to scale with increased numbers of processors. A shared memory multiprocessor is assumed. The time complexity for partitioning and merging is O(N/p + log N), where p is the number of processors and N is the total number of elements in the two lists. Implementation results on a twenty node Sequent Symmetry multiprocessor are also presented.  相似文献   

13.
A subsequence of a given string is any string obtained by deleting none or some symbols from the given string. A longest common subsequence (LCS) of two strings is a common subsequence of both that is as long as any other common subsequences. The problem is to find the LCS of two given strings. The bound on the complexity of this problem under the decision tree model is known to be mn if the number of distinct symbols that can appear in strings is infinite, where m and n are the lengths of the two strings, respectively, and m⩽n. In this paper, we propose two parallel algorithms far this problem on the CREW-PRAM model. One takes O(log2 m + log n) time with mn/log m processors, which is faster than all the existing algorithms on the same model. The other takes O(log2 m log log m) time with mn/(log2 m log log m) processors when log2 m log log m > log n, or otherwise O(log n) time with mn/log n processors, which is optimal in the sense that the time×processors bound matches the complexity bound of the problem. Both algorithms exploit nice properties of the LCS problem that are discovered in this paper  相似文献   

14.
Given n points in the plane the planar dominance counting problem is to determine for each point the number of points dominated by it. Point p is said to dominate point q if x(q)x(p) and y(q)y(p), when x(p) and y(p) are the x− and y-coordinate of p, respectively. We present two CREW PRAM parallel algorithms for the problem, one running in O(log n loglog n) time and and the other in O(lognloglogn/logloglogn) time both using O(n) processors. Some applicationsare also given.  相似文献   

15.
Consider a set P of points in the plane sorted by the x-coordinate. A point p in P is said to be a proximate point if there exists a point q on the x-axis such that p is the closest point to q over all points in P. The proximate point problem is to determine all the proximate points in P. Our main contribution is to propose optimal parallel algorithms for solving instances of size n of the proximate points problem. We begin by developing a work-time optimal algorithm running in O(log log n) time and using n/loglogn Common-CRCW processors. We then go on to show that this algorithm can be implemented to run in O(log n) time using n/logn EREW processors. In addition to being work-time optimal, our EREW algorithm turns out to also be time-optimal. Our second main contribution is to show that the proximate points problem finds interesting, and quite unexpected, applications to digital geometry and image processing. As a first application, we present a work-time optimal parallel algorithm for finding the convex hull of a set of n points in the plane sorted by x-coordinate; this algorithm runs in O(log log n) time using n/logn Common-CRCW processors. We then show that this algorithm can be implemented to run in O(log n) time using n/logn EREW processors. Next, we show that the proximate points algorithms afford us work-time optimal (resp, time-optimal) parallel algorithms for various fundamental digital geometry and image processing problems  相似文献   

16.
In this paper, a distributed selectsort algorithm and a parameterized selectsort algorithm are presented to be applied on distributed systems for cases when N P where N is the number of elements to be sorted and P is the number of processors in the system. The distributed system considered in this paper uses a broadcasting channel for communication between processors. We show that the number of messages required for the parameterized selectsort algorithm is independent of N and is of complexity O(P), which is optimal in a distributed system with P processors. Furthermore, the amount of communication required in terms of elements is N + O(P3) and the computation time complexity is O((N/P)lgN + P2lg(N/P)). Hence, when N P3, the computation time complexity is O((N/P)lgN), which is optimal using P processors. In addition, this parameterized algorithm provides us with a parameter K such that by choosing the value of K allows us to trade among processing requirement, memory requirement, and communication requirement. It is shown that this parameterized algorithm can reduce the communication requirements significantly while only slightly increasing the computation requirements.  相似文献   

17.
“Complex Random Sample Scheduling(CRSS)” was proposed in this paper as an efficient heuristic method for solving any permutation scheduling problems. To show the effectiveness of the proposed CRSS, it was applied to an N-job, M-machine, permutation flowshop scheduling problem to minimize makespan, N/M/F/Fmax. Numerical experiments made it clear that the proposed CRSS provides a schedule very close to the near-optimal schedule obtained by the existing promising heuristic methods such as taboo search and simulated annealing, within less computation time than these heuristic methods.  相似文献   

18.
Two parallel algorithms for finding minimum spanning forest (MSF) of a weighted undirected graph on hypercube computers, consisting of a fixed number of processors, are presented. One algorithm is suited for sparse graphs, the other for dense graphs. Our design strategy is based on successive elimination of non-MSF edges. The input graph is partitioned equally among different processors, which then repeatedly eliminate non-MSF edges and merge results to gradually construct the desired MSF of the entire graph. Low communication overhead is achieved by restricting the message-flow to between the neighboring processors in the hypercube topology. The correctness of our approach is due to a theorem which states that with total-ordered edges, if an edge of an arbitrary subgraph does not belong to its MSF, then it does not belong to the MSF of the entire graph. For a graph of n vertices and m edges, our first algorithm finds an MSF in O(m log m)/p) time using p processors for p ≤ (mlog m)/n(1+log(m/n)). The second algorithm, efficient for dense graphs, requires O(n2/p) time for pn/log n.  相似文献   

19.
A heap structure designed for secondary storage is suggested that tries to make the best use of the available buffer space in primary memory. The heap is a complete multi-way tree, with multi-page blocks of records as nodes, satisfying a generalized heap property. A special feature of the tree is that the nodes may be partially filled, as in B-trees. The structure is complemented with priority-queue operations insert and delete-max. When handling a sequence of S operations, the number of page transfers performed is shown to be O(∑i = 1S(1/P) log(M/P)(Ni/P)), where P denotes the number of records fitting into a page, M the capacity of the buffer space in records, and Ni, the number of records in the heap prior to the ith operation (assuming P 1 and S> M c · P, where c is a small positive constant). The number of comparisons required when handling the sequence is O(∑i = 1S log2 Ni). Using the suggested data structure we obtain an optimal external heapsort that performs O((N/P) log(M/P)(N/P)) page transfers and O(N log2 N) comparisons in the worst case when sorting N records.  相似文献   

20.
A set of vector fields on a differentiable manifold M is said to be uniformly completely controllable (u.c.c.) if there exists a nonnegative integer N such that evert pair (p, q) of point of M can be joined by a trajectory, or positive orbit, of which involves at most N switches.

In this article we show that if M is a Lie group G and a set of left-invariant vector fields on G, N must be greater than or equal to dim(G)-1. We also construct sets of vector fields which are uniformly completely controllable in dim(G)-1 switches when G is the Lie group of any compact real form of g and g runs over all classical simple Lie algebras over .  相似文献   


设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号