首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 47 毫秒
1.
The implementations of the Viterbi algorithm (VA) and the interacting multiple model (IMM) algorithm on a shared-bus and shared-memory multiple-input multiple-data (MIMD) multiprocessor are discussed. The computational complexity as well as the speedup and efficiency are examined in detail. It is shown that the computational complexity of the parallel implementation of these algorithms is about the same in both memory space and processing time categories. Efficiency with P processors is about 1-1/P for small P and is expected to be relatively high for large P, especially when many filters and large state and measurement vectors are considered  相似文献   

2.
Parallel simulated annealing using speculative computation   总被引:1,自引:0,他引:1  
A parallel simulated annealing algorithm that is problem-independent, maintains the serial decision sequence, and obtains speedup which can exceed log2P on P processors is discussed. The algorithm achieves parallelism by using the concurrency technique of speculative computation. Implementation of the parallel algorithm on a hypercube multiprocessor and application to a task assignment problem are described. The simulated annealing solutions are shown to be, on average, 28% better than the solutions produced by a random task assignment algorithm and 2% better than the solutions produced by a heuristic  相似文献   

3.
The performance of job scheduling is studied in a large parallel processing system where a job is modeled as a concatenation of two stages which must be processed in sequence. Pi is the number of processors required by stage P as the total number of processors in the system. A large parallel computing system is considered where Max(P1, P2)⩾P≫1 and Max(P1 , P2)≫Min(P1, P2). For such systems, exact expressions for the mean system delay are obtained for various job models and disciplines. The results show that the priority should be given to jobs working on the stage which requires fewer processors. The large parallel system (i.e. P≫1) condition is then relaxed to obtain the mean system time for two job models when the priority is given to the second stage. Moreover, a scale-up rule is introduced to obtain the approximated delay performance when the system provides more processors than the maximum number of processors required by both stages (i.e. P>Max(P1, P2)). An approximation model is given for jobs with more than two stages  相似文献   

4.
Parallelization of Bresenham's line and circle algorithms   总被引:1,自引:0,他引:1  
Parallel algorithm for line and circle drawing that are based on J.E. Bresenham's line and circle algorithms (see Commun. ACM, vol.20, no.2, p.100-6 (1977)) are presented. The new algorithms are applicable on raster scan CRTs, incremental pen plotters, and certain types of printers. The line algorithm approaches a perfect speedup of P as the line length approaches infinity, and the circle algorithm approaches a speedup greater than 0.9P as the circle radius approaches infinity. It is assumed that the algorithm are run in a multiple-instruction-multiple-data (MIMD) environment, that the raster memory is shared, and that the processors are dedicated and assigned to the task (of line or circle drawing)  相似文献   

5.
6.
Computing the width of a set   总被引:1,自引:0,他引:1  
For a set of points P in three-dimensional space, the width of P, W (P), is defined as the minimum distance between parallel planes of support of P. It is shown that W(P) can be computed in O(n log n +I) time and O(n) space, where I is the number of antipodal pairs of edges of the convex hull of P, and n is the number of vertices; in the worst case, I=O( n2). For a convex polyhedra the time complexity becomes O(n+I). If P is a set of points in the plane, the complexity can be reduced to O(nlog n). For simple polygons, linear time suffices  相似文献   

7.
A simple model of parallel computation which is capable of explaining speedups greater than n on n processors is presented. Necessary and sufficient conditions for these exceptional speedups are derived from the model. Several of the contradictory previous results relating to parallel speedup are resolved by using the model  相似文献   

8.
A parallel algorithm is proposed for the two-dimensional discrete Fourier transform (2-D DFT) computation which eliminates interprocessor communications and uses only O(N) processors. The mapping of the algorithm onto architectures with broadcast and report capabilities is discussed. Expressions are obtained for estimating the speed performance on these machines as a function of the size N×N of the 2-D DFT, the bandwidth of the communications channel, the time for an addition, the time T( FN) for a single processing element to perform an N-point DFT, and the degree of parallelism. For single I/O channel machines that are capable of exploiting the full degree of parallelism of the algorithm, attainable execution times are as low as the time T(FN) plus the I/O time for data upload and download. An implementation on a binary tree computer is discussed  相似文献   

9.
Kharitonov's theorems are generalized to the problem of so-called weak Kharitonov regions for robust stability of linear uncertain systems. Given a polytope of (characteristic) polynomials P and a stability region D in the complex plane, P is called D-stable if the zeros of every polynomial in P are contained in D. It is of interest to know whether the D-stability of the vertices of P implies the D-stability of P. A simple approach is developed which unifies and generalizes many known results on this problem  相似文献   

10.
The problem of determining whether a polytope P of n ×n matrices is D-stable-i.e. whether each point in P has all its eigenvalues in a given nonempty, open, convex, conjugate-symmetric subset D of the complex plane-is discussed. An approach which checks the D-stability of certain faces of P is used. In particular, for each D and n the smallest integer m such that D-stability of every m-dimensional face guarantees D-stability of P is determined. It is shown that, without further information describing the particular structure of a polytope, either (2n-4)-dimensional or (2n-2)-dimensional faces need to be checked for D-stability, depending on the structure of D. Thus more work needs to be done before a computationally tractable algorithm for checking D-stability can be devised  相似文献   

11.
In a general algebraic framework, starting with a bicoprime factorization P=NprD-1 Npl, a right-coprime factorization Np Dp-1, a left-coprime factorization D-1pNp, and the generalized Bezout identities associated with the pairs (Np, Dp) and (D˜ p, N˜p) are obtained. The set of all H-stabilizing compensators for P in the unity-feedback configuration S(P, C) are expressed in terms of (Npr, D, N pt) and the elements of the Bezout identity. The state-space representation P=C(sI-A)-1B is included as an example  相似文献   

12.
A hypercube algorithm to solve the list ranking problem is presented. Let n be the length of the list, and let p be the number of processors of the hypercube. The algorithm described runs in time O(n/p) when n=Ω(p 1+ε) for any constant ε>0, and in time O(n log n/p+log3 p) otherwise. This clearly attains a linear speedup when n=Ω(p 1+ε). Efficient balancing and routing schemes had to be used to achieve the linear speedup. The authors use these techniques to obtain efficient hypercube algorithms for many basic graph problems such as tree expression evaluation, connected and biconnected components, ear decomposition, and st-numbering. These problems are also addressed in the restricted model of one-port communication  相似文献   

13.
A unified analytical model for computing the task-based dependability (TDB) of hypercube architectures is presented. A hypercube is deemed operational as long as a task can be executed on the system. The technique can compute both reliability and availability for two types of task requirements-I-connected model and subcube model. The I-connected TBD assumes that a connected group of at least I working nodes is required for task execution. The subcube TBD needs at least an m-cube in an n-cube, mn, for task execution. The dependability is computed by multiplying the probability that x nodes (xI or x⩾2m) are working in an n-cube at time t by the conditional probability that the hypercube can satisfy any one of the two task requirements from x working nodes. Recursive models are proposed for the two types of task requirements to find the connection probability. The subcube requirement is extended to find multiple subcubes for analyzing multitask dependability. The analytical results are validated through extensive simulation  相似文献   

14.
Simultaneous controller design for linear time-invariant systems   总被引:1,自引:0,他引:1  
The use of generalized sampled-data hold functions (GSHF) in the problem of simultaneous controller design for linear time-invariant plants is discussed. This problem can be stated as follows: given plants P1, P2, . . ., PN , find a controller C which achieves not only simultaneous stability, but also simultaneous optimal performance in the N given systems. By this, it is meant that C must optimize an overall cost function reflecting the closed-loop performance of each plant when it is regulated by C. The problem is solved in three aspects: simultaneous stabilization, simultaneous optimal quadratic performance, and simultaneous pole assignment in combination with simultaneous intersampling performance  相似文献   

15.
Two arrays of numbers sorted in nondecreasing order are given: an array A of size n and an array B of size m, where n<m. It is required to determine, for every element of A, the smallest element of B (if one exists) that is larger than or equal to it. It is shown how to solve this problem on the EREW PRAM (exclusive-read exclusive-write parallel random-access machine) in O(logm logn/log log m) time using n processors. The solution is then extended to the case in which fewer than n processors are available. This yields an EREW PRAM algorithm for the problem whose cost is O(n log m, which is O(m)) for nm/log m. It is shown how the solution obtained leads to an improved parallel merging algorithm  相似文献   

16.
An adaptive parallel algorithm for inducing a priority queue structure on an n-element array is presented. The algorithm is extended to provide optimal parallel construction algorithms for three other heap-like structures useful in implementing double-ended priority queues, namely min-max heaps, deeps, and min-max-pair heaps. It is shown that an n-element array can be made into a heap, a deap, a min-max heap, or a min-max-pair heap in O(log n+(n /p)) time using no more than n/log n processors, in the exclusive-read-exclusive-write parallel random-access machine model  相似文献   

17.
An O(n2) time serial algorithm is developed for obtaining the medial axis transform (MAT) of an n×n image. An O(log n) time CREW PRAM algorithm and an O(log2 n) time SIMD hypercube parallel algorithm for the MAT are also developed. Both of these use O(n2) processors. Two problems associated with the MAT, the area and perimeter reporting problem, are studied. An O(log n) time hypercube algorithm is developed for both of them, where n is the number of squares in the MAT, and the algorithms use O(n2) processors  相似文献   

18.
A new parallel algorithm is proposed for fat image labeling using local operators on image pixels. The algorithm can be implemented on an n×n mesh-connected computer such that, for any integer k in the range [1, log (2n)], the algorithm requires Θ(kn1k/) bits of local memory per processor and takes Θ(kn) time. Bit-serial processors and communication links can be used without affecting the asymptotic time complexity of the algorithm. The time complexity of the algorithm has very small leading constant factors, which makes it superior to previous mesh computer labeling algorithms for most practical image sizes (e.g. up to 4096×4096 images). Furthermore, the algorithm is based on using stacks that can be realized using very fast shift registers within each processing element  相似文献   

19.
The binary-image-compression problem is analyzed using irreducible cover of maximal rectangles. A bound on the minimum-rectangular-cover problem for image compression is given under certain conditions that previously have not been analyzed. It is demonstrated that for a simply connected image, the irreducible cover proposed uses less than four times the number of the rectangles in a minimum cover. With n pixels in a square, the parallel algorithm for obtaining the irreducible cover uses (n/log n) concurrent-read-exclusive write (CREW) processors in O(log n) time  相似文献   

20.
An application-specific architecture for the parallel calculation of the decimation in time and radix 2 fast Hartley (FHT) and Fourier (FFT) transforms is presented. A real sequence with N=2n data items is considered as input. The system calculates the FHT and the FFT in n and n+1 stages. respectively. The modular and regular parallel architecture is based on a constant geometry algorithm using butterflies of four data items and the perfect unshuffle permutation. With this permutation, the mapping of the algorithm in VLSI technology is simplified and the communications among processors are minimized. Organization of the processor memory based on first-in, first-out (FIFO) queues facilitates a systolic data flow and permits the implementation in a direct way of the complex data movements and address sequences of the transforms. This is accomplished by means of simple multiplexing operations, using hardwired control. The total calculation time is (Nlog2N)/4Q cycles for the FHT and N(1+log2N)/4Q cycles for the FFT, where Q is the number of processors ( Q= 2q, QN/4)  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号