20 similar articles found (search time: 15 ms)
An on-line parser processes each word as soon as it is typed by the user, without waiting for the end of the sentence. Thus, in an interactive system, a sentence will be parsed almost immediately after the last word has been presented.

The complexity of an on-line parser is determined by the resources needed for the analysis of a single word, as it is assumed that previous words have been processed already. Sequential parsing algorithms like CYK or Earley need $O(n^2)$ time for the $n$th word. A parallel implementation in $O(n)$ time on $O(n)$ processors is straightforward. In this paper a novel parallel on-line parser is presented that needs $O(1)$ time on $O(n^2)$ processors.
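The per-word cost quoted for CYK can be made concrete with a small sketch. The class below is a hypothetical illustration (the names `OnlineCYK` and `add_word` and the toy grammar are assumptions, not taken from the paper): it is a sequential recognizer for a grammar in Chomsky normal form that, for each new word, fills only the chart column ending at that word. For the $n$th word this touches about $n$ cells with up to $n$ split points each, which is where the $O(n^2)$ figure comes from.

```python
from collections import defaultdict


class OnlineCYK:
    """Incremental CYK recognizer for a grammar in Chomsky normal form."""

    def __init__(self, binary_rules, lexical_rules, start="S"):
        # binary_rules: iterable of (A, B, C) meaning A -> B C
        # lexical_rules: iterable of (A, word) meaning A -> word
        self.by_children = defaultdict(set)
        for a, b, c in binary_rules:
            self.by_children[(b, c)].add(a)
        self.by_word = defaultdict(set)
        for a, w in lexical_rules:
            self.by_word[w].add(a)
        self.start = start
        self.n = 0
        # chart[(i, j)] = nonterminals deriving words i .. j-1
        self.chart = {}

    def add_word(self, word):
        """Process one word; return True if the prefix so far is a sentence."""
        j = self.n + 1
        self.chart[(j - 1, j)] = set(self.by_word.get(word, ()))
        # Visit start positions right to left so chart[(k, j)] is already
        # filled when it is needed as the right part of a split.
        for i in range(j - 2, -1, -1):
            cell = set()
            for k in range(i + 1, j):
                for b in self.chart[(i, k)]:
                    for c in self.chart[(k, j)]:
                        cell |= self.by_children.get((b, c), set())
            self.chart[(i, j)] = cell
        self.n = j
        return self.start in self.chart[(0, j)]


# Toy grammar (an assumption for illustration only).
parser = OnlineCYK(
    binary_rules=[("S", "NP", "VP"), ("VP", "V", "NP")],
    lexical_rules=[("NP", "she"), ("NP", "fish"), ("V", "eats")],
)
results = [parser.add_word(w) for w in ["she", "eats", "fish"]]
```

The inner loops over `i` and `k` are independent of the words still to come, which is the property a parallel on-line parser would exploit; how the paper distributes this work over processors is not described in the abstract.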

Yi Pan, Keqin Li. Information Sciences, 1999, 120(1-4): 209-221
The computation of Euclidean distance maps (EDM), also called the Euclidean distance transform, is a basic operation in computer vision, pattern recognition, and robotics. Fast computation of the EDM is needed since most of the applications using the EDM require real-time computation. It is shown in L. Chen and H.Y.H. Chuang [Information Processing Letters, 51, pp. 25–29 (1994)] that a lower bound of $\Omega(n^2)$ holds for any sequential EDM algorithm, because each of the $n^2$ pixels has to be scanned at least once. Recently, many parallel EDM algorithms have been proposed to speed up its computation. Chen and Chuang proposed an algorithm for computing the EDM on an $n \times n$ mesh in $O(n)$ time [L. Chen and H.Y.H. Chuang, Parallel Computing, 21, pp. 841–852 (1995)]. Clearly, the VLSI complexities of both the sequential and the mesh algorithm described in L. Chen and H.Y.H. Chuang [Parallel Computing, 21, pp. 841–852 (1995)] are $AT^2 = O(n^4)$, where $A$ is the VLSI layout area of the design and $T$ is the computation time using area $A$ when implemented in VLSI. In this paper, we propose a new and faster parallel algorithm for computing the EDM on the reconfigurable VLSI mesh model. For the same problem, our algorithm runs in $O(1)$ time on a two-dimensional $n^2 \times n^2$ reconfigurable mesh. We show that the VLSI complexity of our algorithm is the same as those of the above sequential algorithm and the mesh algorithm, while it uses much less time. To the best of our knowledge, this is the first constant-time EDM algorithm on any parallel computational model.
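The equality of the $AT^2$ figures can be checked with a short calculation. The sketch below assumes that layout area is proportional to the number of processing elements, which the abstract does not state explicitly; it covers only the two parallel designs.

```latex
\begin{align*}
\text{mesh } (n \times n):\quad
  & A = O(n^2),\; T = O(n)
  &&\Rightarrow\; AT^2 = O(n^2 \cdot n^2) = O(n^4),\\
\text{reconfigurable mesh } (n^2 \times n^2):\quad
  & A = O(n^4),\; T = O(1)
  &&\Rightarrow\; AT^2 = O(n^4 \cdot 1) = O(n^4).
\end{align*}
```

Under that assumption, the constant-time design trades a factor of $n^2$ more area for a factor of $n$ less time, leaving $AT^2$ unchanged.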

In this paper, we derive time-minimal systolic arrays for Gaussian elimination and the Algebraic Path Problem (APP) that use a minimal number of processors. For a problem of size $n$, we obtain an execution time $T(n) = 3n - 1$ using $A(n) = n^2/4 + O(n)$ processors for Gaussian elimination, and $T(n) = 5n - 2$ and A(n) = n3/+O(n) for the APP.

Parallel clustering algorithms (total citations: 3; self-citations: 0; citations by others: 3)
Clustering techniques play an important role in exploratory pattern analysis, unsupervised learning and image segmentation applications. Many clustering algorithms, both partitional and hierarchical, require intensive computation, even for a modest number of patterns. This paper presents two parallel clustering algorithms. For a clustering problem with $N = 2^n$ patterns and $M = 2^m$ features, the time complexity of the traditional partitional clustering algorithm on a single-processor computer is $O(MNK)$, where $K$ is the number of clusters. The proposed algorithm on an SIMD computer with $MN$ processors has a time complexity of $O(K(n + m))$. The time complexity of the proposed single-link hierarchical clustering algorithm is reduced from $O(MN^2)$ for the uniprocessor algorithm to $O(nN)$ with $MN$ processors.
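To see where the uniprocessor $O(MNK)$ bound comes from, here is a minimal sketch of one assignment pass of a squared-error partitional method. The abstract does not name the specific algorithm, so the k-means-style step and the function name `assign_to_nearest` are assumptions for illustration: every one of the $N$ patterns is compared with each of the $K$ centers over all $M$ features.

```python
def assign_to_nearest(patterns, centers):
    """Return, for each pattern, the index of its nearest cluster center."""
    labels = []
    for x in patterns:                          # N patterns
        best_k, best_d = 0, float("inf")
        for k, c in enumerate(centers):         # K cluster centers
            # Squared Euclidean distance over the M features.
            d = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
            if d < best_d:
                best_k, best_d = k, d
        labels.append(best_k)
    return labels


labels = assign_to_nearest(
    patterns=[(0, 0), (0, 1), (10, 10), (9, 10)],
    centers=[(0, 0), (10, 10)],
)
```

With $MN$ processors, the $M$-term sum and the minimum over patterns can each be reduced in logarithmic time, which is consistent with the $O(K(n + m))$ bound quoted above, where $n = \log_2 N$ and $m = \log_2 M$.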

The parallel stratagem in this paper uses scattered square decomposition, introduced by G. Fox, for its data assignment and then exploits parallelism in the solution steps of the sequential Householder tridiagonalization algorithm. With this stratagem, a real symmetric full matrix $A$ of order $n$ can be condensed into tridiagonal form on concurrent machines with $N$ $(= D^2)$ processors. Expressions for efficiency and speedup are given for the evaluation of the stratagem. An alternative stratagem which requires less data transmission but more computation is also discussed. The results show that the Householder method of tridiagonalization may be implemented efficiently on a concurrent machine by scattered square decomposition, provided that the number of matrix elements contained in each processor is much larger than the number of processors of the concurrent machine, and the ratio of the time to transmit one data item from one processor to any other processor to the time to perform a floating-point arithmetic operation is small enough.

We substantially improve the known algorithms for approximating all the complex zeros of an $n$th degree polynomial $p(x)$. Our new algorithms save both Boolean and arithmetic sequential time, versus the previous best algorithms of Schönhage [1], Pan [2], and Neff and Reif [3]. In parallel (NC) implementation, we dramatically decrease the number of processors, versus the parallel algorithm of Neff [4], which was the only NC algorithm known for this problem so far. Specifically, under the simple normalization assumption that the variable $x$ has been scaled so as to confine the zeros of $p(x)$ to the unit disc $\{x : |x| \le 1\}$, our algorithms (which promise to be practically effective) approximate all the zeros of $p(x)$ within the absolute error bound $2^{-b}$, by using order of $n$ arithmetic operations and order of $(b + n)n^2$ Boolean (bitwise) operations (in both cases up to within polylogarithmic factors). The algorithms allow their optimal (work preserving) NC parallelization, so that they can be implemented by using polylogarithmic time and the orders of $n$ arithmetic processors or $(b + n)n^2$ Boolean processors. All the cited bounds on the computational complexity are within polylogarithmic factors from the optimum (in terms of $n$ and $b$) under both arithmetic and Boolean models of computation (in the Boolean case, under the additional (realistic) assumption that $n = O(b)$).

Nested dissection is a very popular direct method for solving sparse linear systems that arise from finite difference and finite element methods. Worley and Schreiber [16] give a fine grain algorithm for a square array of processors. Their algorithm uses $O(N^2)$ processors, each with $O(N)$ memory, to factor an $N^2$ by $N^2$ sparse matrix whose graph is an $N \times N$ mesh. The efficiency of their method is between 1/46 and 1/12. George et al. [6] [8] give a medium grain algorithm for hypercube architecture, while George et al. [7] give an algorithm for shared memory machines. These papers present a column oriented approach which can exploit $O(N)$ parallelism and yield efficiencies up to 50%. Lucas [11] also gives a column oriented scheme which achieves up to 75% efficiency and $O(N)$ parallelism. In this paper, we present a medium to fine grain algorithm for a $P \times P$ array of processors with local memory. This algorithm can exploit up to $O(N^2)$ parallelism. The efficiency of the fine grain version is comparable to [16], while as a medium grain algorithm it achieves about 49% efficiency. The strength of the method is due to three factors: its ability to pipeline much of the computation, its overlapping of computation and communication, and the use of level 3 BLAS-like primitives. In addition to its high efficiency, its memory requirement is optimal: only $O(N^2 \log N / P^2)$ words of memory are needed per processor.

A linear rotation based algorithm is proposed for solving systems of linear equations, $Ax = b$. This algorithm modifies the conventional Gaussian elimination method and can avoid the problems of numerical singularity and ill-conditioning. In this study, a trapezoidal systolic array of $n^2/2 + n - 2$ processors and a linear array of $n$ processors are implemented for this algorithm. The trapezoidal systolic array performs the triangularization of the matrix $A$ by using the modified linear rotation algorithm, while the linear array performs the backward substitution to evaluate the solution $x$. The computing time for solving a linear equation system is $O(5n)$ time units. An implicit representation of the elimination factor by means of a sign parameter sequence, instead of a numerical value, is also introduced to simplify the hardware. This systolic architecture is simple, uniform, and regular, and therefore well suited to implementation as a VLSI chip.

A parallel two-list algorithm for the knapsack problem (total citations: 10; self-citations: 0; citations by others: 10)
An n-element knapsack problem has $2^n$ possible solutions to search over, so it can be solved in $2^n$ trials if an exhaustive search is used. Because of this exponential running time, the problem is considered to be very hard. In the past decade, much effort has been made to find techniques which could lead to practical algorithms with reasonable running time. In 1994, Chang et al. proposed a brilliant parallel algorithm, which needs $O(2^{n/8})$ processors to solve the knapsack problem in $O(2^{n/2})$ time; that is, the cost of Chang et al.'s parallel algorithm is $O(2^{5n/8})$. In this paper, we propose a parallel algorithm to improve Chang et al.'s parallel algorithm by reducing the time complexity to $O(2^{3n/8})$ with the same $O(2^{n/8})$ processors. Thus, the proposed parallel algorithm has a cost of $O(2^{n/2})$. It is an improvement over previous literature. We believe that the proposed parallel algorithm is pragmatically feasible now that multiprocessor systems are becoming more and more popular.
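The abstract does not spell out the two-list procedure itself. As background, the sketch below shows the classic sequential two-list idea usually attributed to Horowitz and Sahni, applied to the subset-sum form of the knapsack problem; treating the problem in that form, and the function names `subset_sums` and `two_list_subset_sum`, are assumptions made for illustration. The weights are split into two halves, all subset sums of each half are listed (about $2^{n/2}$ entries per list), and the two sorted lists are scanned from opposite ends. It is not the parallel algorithm of the paper.

```python
def subset_sums(weights):
    """All (sum, bitmask) pairs over subsets of the given weights."""
    sums = [(0, 0)]
    for bit, w in enumerate(weights):
        sums += [(s + w, mask | (1 << bit)) for s, mask in sums]
    return sums


def two_list_subset_sum(weights, target):
    """Return indices of a subset summing to target, or None if none exists."""
    half = len(weights) // 2
    left = sorted(subset_sums(weights[:half]))                 # ascending sums
    right = sorted(subset_sums(weights[half:]), reverse=True)  # descending sums
    i = j = 0
    while i < len(left) and j < len(right):
        total = left[i][0] + right[j][0]
        if total == target:
            lmask, rmask = left[i][1], right[j][1]
            chosen = [k for k in range(half) if (lmask >> k) & 1]
            chosen += [half + k for k in range(len(weights) - half)
                       if (rmask >> k) & 1]
            return chosen
        if total < target:
            i += 1   # need a larger sum: advance in the ascending list
        else:
            j += 1   # need a smaller sum: advance in the descending list
    return None


weights = [3, 34, 4, 12, 5, 2]
solution = two_list_subset_sum(weights, 9)
```

The cost figures in the abstract follow from multiplying processors by time: $2^{n/8} \cdot 2^{n/2} = 2^{5n/8}$ for Chang et al., and $2^{n/8} \cdot 2^{3n/8} = 2^{n/2}$ for the proposed algorithm, which matches the running time of the sequential two-list search up to polynomial factors.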

Two parallel algorithms for finding the minimum spanning forest (MSF) of a weighted undirected graph on hypercube computers, consisting of a fixed number of processors, are presented. One algorithm is suited for sparse graphs, the other for dense graphs. Our design strategy is based on successive elimination of non-MSF edges. The input graph is partitioned equally among different processors, which then repeatedly eliminate non-MSF edges and merge results to gradually construct the desired MSF of the entire graph. Low communication overhead is achieved by restricting the message flow to neighboring processors in the hypercube topology. The correctness of our approach is due to a theorem which states that, with totally ordered edges, if an edge of an arbitrary subgraph does not belong to that subgraph's MSF, then it does not belong to the MSF of the entire graph. For a graph of $n$ vertices and $m$ edges, our first algorithm finds an MSF in $O((m \log m)/p)$ time using $p$ processors for $p \le (m \log m)/n(1 + \log(m/n))$. The second algorithm, efficient for dense graphs, requires $O(n^2/p)$ time for $p \le n/\log n$.
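The elimination theorem can be illustrated with a small sequential simulation. In the sketch below (function names and the round-robin partitioning are assumptions; the hypercube communication pattern of the paper is not modeled), each of `p` simulated processors computes the MSF of its own share of the edges with Kruskal's method, discards everything else, and the surviving edges are merged in a single final pass. Edges are `(weight, u, v)` tuples, so comparing tuples gives the total order the theorem requires.

```python
def kruskal_msf(n, edges):
    """Minimum spanning forest of edges (weight, u, v) on vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    for w, u, v in sorted(edges):   # tuple order breaks ties: a total order
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((w, u, v))
    return forest


def partitioned_msf(n, edges, p):
    """Simulate p processors: local elimination, then one merge step."""
    chunks = [edges[i::p] for i in range(p)]
    # Edges outside a chunk's local MSF cannot be in the global MSF.
    survivors = [e for chunk in chunks for e in kruskal_msf(n, chunk)]
    return kruskal_msf(n, survivors)


edges = [(4, 0, 1), (1, 1, 2), (3, 0, 2), (2, 2, 3),
         (5, 3, 4), (7, 1, 4), (6, 5, 6)]
forest = partitioned_msf(7, edges, 3)
```

Each local MSF has at most $n - 1$ edges, so the merge step handles at most $p(n - 1)$ edges regardless of $m$; the paper's algorithms merge along hypercube dimensions rather than in one step.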

The paper presents parallel algorithms for solving the Poisson equation at $N^2$ mesh points. The methods, based on marching techniques, are structured for efficient parallel realization. Using orthogonal decomposition properties of the arising matrices, the algorithms can be formulated in terms of transformed vectors. On an MIMD computer with not more than $N$ processors, the computations can be performed in horizontal slices with minimal synchronization requirements. For an SIMD machine with $N^2$ processors, the complexity bound $O(\log N)$ has been achieved, whereby the single marching requires only $10 \log N$ steps.

The problem of planning a path for a point robot from a source point s to a destination point d so as to avoid a set of polygonal obstacles in the plane is considered. Using well-known methods, a shortest path from s to d can be computed with a time complexity of $O(n^2)$, where $n$ is the total number of obstacle vertices. The focus here is on

(a) planning paths faster at the expense of settling for suboptimal path lengths, and

(b) performance analysis of simple and/or well-known suboptimal methods.

A method that enables a hierarchical implementation of any path planning algorithm with no increase in the worst-case time complexity is presented; this implementation enables fast planning of simple paths. Then methods are presented based on the Voronoi diagrams, trapezoidal decomposition and triangulation, which compute (suboptimal) paths in $O(n \log n)$ time with preprocessing costs of $O(n \log n)$, $O(n^2)$ and $O(n \log n)$, respectively. Using existing navigational algorithms for unknown terrains, algorithms that run in $O(n \log n)$ time (after preprocessing) and yield suboptimal paths are presented. For all these algorithms, upper bounds on the path lengths are estimated in terms of the shortest of the obstacles, etc.

Yang Meijiao, Liu Jinglei. 《计算机应用》 (Journal of Computer Applications), 2018, 38(9): 2515-2522
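To address the low efficiency of feature extraction in movie rating, a Nyström method combined with QR decomposition is proposed. First, sampling is performed with an adaptive method; then QR decomposition is applied to the inner matrix, and the decomposed matrix is recombined with the inner matrix and eigendecomposed. The approximation quality of the Nyström method depends closely on the number of landmark points and on how they are selected: a set of representative landmark points is chosen to preserve the approximation after sampling, and the adaptive sampling method guarantees the accuracy of the approximation. QR decomposition ensures the stability of the matrix and improves the accuracy of preference feature extraction. The more accurate the preference feature extraction, the more stable the recommender system and the more accurate its recommendations. Finally, feature extraction experiments were carried out on a real dataset of audience movie ratings containing 480189 users and 17770 movies. The results show that, when the same number of landmark points is extracted, both the accuracy and the efficiency of the algorithm improve to some extent: compared with no sampling, the time complexity is reduced from $O(n^3)$ to $O(nc^2)$ ($c \ll n$); compared with the standard Nyström method, the error is kept below 25%.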

This paper describes several parallel algorithms for image edge relaxation on array processors with different numbers of processing elements (PEs) connected by a mesh or hypercube network. The time complexity of Prager's original edge relaxation scheme is $O(N^2)$ per iteration using floating-point operations on a sequential machine, where $N^2$ is the number of pixels in the image. Modifications to the scheme are made so that no multiplications are employed and only integer operations are required. Moreover, with parallel processing, the time complexity per iteration is reduced to a constant. A time complexity analysis of two parallel algorithms is performed. Although the algorithm on an array processor with $4N^2$ PEs achieves a higher degree of parallelism, the algorithm with $N^2$ PEs is preferred. Further modifications to the latter algorithm are made to accommodate fewer PEs.

You Jie, Li Jin, Zhang Sai, Li Ting. 《智能系统学报》 (CAAI Transactions on Intelligent Systems), 2019, 14(4): 761-768
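Existing link prediction algorithms have high complexity and are not suitable for link prediction on large-scale graphs. To address this, this paper optimizes existing link prediction methods using graph sketching approximation techniques and proposes a link prediction method based on graph sketches. The method reduces the computational complexity of the link prediction algorithm from $O(n^3)$ to O(n2k2log2n). To further improve the efficiency of link prediction, a parallel implementation of the method based on Spark is given. Tests on real graph datasets show that the proposed method effectively improves the efficiency of the algorithm while preserving link prediction accuracy.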

In this paper we consider the unbounded single machine parallel batch scheduling problem with family jobs and release dates to minimize makespan. We show that this problem is strongly NP-hard, and give an $O(n(n/m+1)^m)$ time dynamic programming algorithm and an $O(mk^{k+1}P^{2k-1})$ time dynamic programming algorithm, where $n$ is the number of jobs, $m$ is the number of families, $k$ is the number of distinct release dates and $P$ is the sum of the processing times of all families. We further give a heuristic with a performance ratio of 2. We also give a polynomial-time approximation scheme for the problem.

Some recent results claimed the existence of a class of algorithms for certain NP-complete problems, with running time $O(n \lg k \cdot 2^{n/2})$ and storage requirements $O(k \cdot 2^{n/k})$, for $2 \le k \le n$. In this note we show that those results do not hold, implying that an algorithm with time $O(n \cdot 2^{n/2})$ and space $O(2^{n/4})$ is still the best-known solution for this class of NP-complete problems.

This paper presents an efficient algorithm for enumerating all minimal a-b separators separating given non-adjacent vertices a and b in an undirected connected simple graph $G = (V, E)$. Our algorithm requires $O(n^3 R_{ab})$ time, which improves the known result of $O(n^4 R_{ab})$ time for solving this problem, where $|V| = n$ and $R_{ab}$ is the number of minimal a-b separators. The algorithm can be generalized for enumerating all minimal A-B separators that separate non-adjacent vertex sets $A, B \subset V$, and it requires $O(n^2 (n - n_A - n_B) R_{AB})$ time in this case, where $n_A = |A|$, $n_B = |B|$ and $R_{AB}$ is the number of all minimal A-B separators. Using the algorithm above as a routine, an efficient algorithm for enumerating all minimal separators of G separating G into at least two connected components is constructed. The algorithm runs in time $O(n^3 R_\Sigma^+ + n^4 R_\Sigma)$, which improves the known result of $O(n^6 R_\Sigma)$ time, where $R_\Sigma$ is the number of all minimal separators of G and $R_\Sigma \le R_\Sigma^+ = \sum_{1 \le i < j \le n,\ (v_i, v_j) \notin E} R_{v_i v_j} \le (n(n-1)/2 - m) R_\Sigma$. Efficient parallelization of these algorithms is also discussed. It is shown that the first algorithm requires at most $O((n/\log n) R_{ab})$ time and the second one runs in time $O((n/\log n) R_\Sigma^+ + n \log n \, R_\Sigma)$ on a CREW PRAM with $O(n^3)$ processors.

A new parallel algorithm for transforming an arithmetic infix expression into a parse tree is presented. The technique is based on a result due to Fischer (1980) which enables the construction of the parse tree by appropriately scanning the vector of precedence values associated with the elements of the expression. The algorithm presented here is suitable for execution on a shared memory model of an SIMD machine with no read/write conflicts permitted. It uses $O(n)$ processors and has a time complexity of O(log2n), where $n$ is the expression length. Parallel algorithms for generating code for an SIMD machine are also presented.

In this paper, a distributed selectsort algorithm and a parameterized selectsort algorithm are presented for distributed systems, for cases when N P, where $N$ is the number of elements to be sorted and $P$ is the number of processors in the system. The distributed system considered in this paper uses a broadcasting channel for communication between processors. We show that the number of messages required for the parameterized selectsort algorithm is independent of $N$ and is of complexity $O(P)$, which is optimal in a distributed system with $P$ processors. Furthermore, the amount of communication required in terms of elements is $N + O(P^3)$ and the computation time complexity is $O((N/P)\lg N + P^2 \lg(N/P))$. Hence, when $N \ge P^3$, the computation time complexity is $O((N/P)\lg N)$, which is optimal using $P$ processors. In addition, this parameterized algorithm provides a parameter $K$ whose value can be chosen to trade off processing, memory, and communication requirements. It is shown that this parameterized algorithm can reduce the communication requirements significantly while only slightly increasing the computation requirements.
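The simplification of the computation time for large $N$ can be checked directly. The sketch below assumes the condition is $N \ge P^3$; the relation symbol is an assumption here, chosen because it is sufficient for the second term to be absorbed by the first.

```latex
\begin{align*}
N \ge P^3 \;&\Rightarrow\; P^2 \le \frac{N}{P},
  \qquad \lg\frac{N}{P} \le \lg N \\
&\Rightarrow\; P^2 \lg\frac{N}{P} \le \frac{N}{P}\lg N \\
&\Rightarrow\; O\!\left(\frac{N}{P}\lg N + P^2 \lg\frac{N}{P}\right)
  = O\!\left(\frac{N}{P}\lg N\right).
\end{align*}
```

The bound $O((N/P)\lg N)$ is the sequential comparison-sorting cost $O(N \lg N)$ divided evenly over $P$ processors, which is why the abstract calls it optimal.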
