期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Parallel implementation of multiple model tracking algorithms

Averbuch A. Itzikowitz S. Kapon T. 《Parallel and Distributed Systems, IEEE Transactions on》1991,2(2):242-252

The implementations of the Viterbi algorithm (VA) and the interacting multiple model (IMM) algorithm on a shared-bus and shared-memory multiple-input multiple-data (MIMD) multiprocessor are discussed. The computational complexity as well as the speedup and efficiency are examined in detail. It is shown that the computational complexity of the parallel implementation of these algorithms is about the same in both memory space and processing time categories. Efficiency with P processors is about 1-1/P for small P and is expected to be relatively high for large P, especially when many filters and large state and measurement vectors are considered 相似文献

2.

Parallel simulated annealing using speculative computation 总被引：1，自引：0，他引：1

Witte E.E. Chamberlain R.D. Franklin M.A. 《Parallel and Distributed Systems, IEEE Transactions on》1991,2(4):483-494

A parallel simulated annealing algorithm that is problem-independent, maintains the serial decision sequence, and obtains speedup which can exceed log₂P on P processors is discussed. The algorithm achieves parallelism by using the concurrency technique of speculative computation. Implementation of the parallel algorithm on a hypercube multiprocessor and application to a task assignment problem are described. The simulated annealing solutions are shown to be, on average, 28% better than the solutions produced by a random task assignment algorithm and 2% better than the solutions produced by a heuristic 相似文献

3.

Performance evaluation of dynamic sharing of processors intwo-stage parallel processing systems

Jau-Hsiung Huang Kleinrock L. 《Parallel and Distributed Systems, IEEE Transactions on》1993,4(3):306-317

The performance of job scheduling is studied in a large parallel processing system where a job is modeled as a concatenation of two stages which must be processed in sequence. P_i is the number of processors required by stage P as the total number of processors in the system. A large parallel computing system is considered where Max(P₁, P₂)⩾P≫1 and Max(P₁, P₂)≫Min(P₁, P₂). For such systems, exact expressions for the mean system delay are obtained for various job models and disciplines. The results show that the priority should be given to jobs working on the stage which requires fewer processors. The large parallel system (i.e. P≫1) condition is then relaxed to obtain the mean system time for two job models when the priority is given to the second stage. Moreover, a scale-up rule is introduced to obtain the approximated delay performance when the system provides more processors than the maximum number of processors required by both stages (i.e. P>Max(P₁, P₂)). An approximation model is given for jobs with more than two stages 相似文献

4.

Parallelization of Bresenham's line and circle algorithms 总被引：1，自引：0，他引：1

Wright W.E. 《Computer Graphics and Applications, IEEE》1990,10(5):60-67

Parallel algorithm for line and circle drawing that are based on J.E. Bresenham's line and circle algorithms (see Commun. ACM, vol.20, no.2, p.100-6 (1977)) are presented. The new algorithms are applicable on raster scan CRTs, incremental pen plotters, and certain types of printers. The line algorithm approaches a perfect speedup of P as the line length approaches infinity, and the circle algorithm approaches a speedup greater than 0.9P as the circle radius approaches infinity. It is assumed that the algorithm are run in a multiple-instruction-multiple-data (MIMD) environment, that the raster memory is shared, and that the processors are dedicated and assigned to the task (of line or circle drawing) 相似文献

5.

Debugging concurrent Ada programs by deterministic execution

Tai K.-C. Carver R.H. Obaid E.E. 《IEEE transactions on pattern analysis and machine intelligence》1991,17(1):45-63

相似文献

6.

Computing the width of a set 总被引：1，自引：0，他引：1

Houle M.E. Toussaint G.T. 《IEEE transactions on pattern analysis and machine intelligence》1988,10(5):761-765

For a set of points P in three-dimensional space, the width of P, W (P), is defined as the minimum distance between parallel planes of support of P. It is shown that W(P) can be computed in O(n log n +I) time and O(n) space, where I is the number of antipodal pairs of edges of the convex hull of P, and n is the number of vertices; in the worst case, I=O( n²). For a convex polyhedra the time complexity becomes O(n+I). If P is a set of points in the plane, the complexity can be reduced to O(nlog n). For simple polygons, linear time suffices 相似文献

7.

Modelling speedup (n) greater than n

Helmbold D.P. McDowell C.E. 《Parallel and Distributed Systems, IEEE Transactions on》1990,1(2):250-256

A simple model of parallel computation which is capable of explaining speedups greater than n on n processors is presented. Necessary and sufficient conditions for these exceptional speedups are derived from the model. Several of the contradictory previous results relating to parallel speedup are resolved by using the model 相似文献

8.

A parallel algorithm for 2-D DFT computation with no interprocessorcommunication

Gertner I. Rofheart M. 《Parallel and Distributed Systems, IEEE Transactions on》1990,1(3):377-382

A parallel algorithm is proposed for the two-dimensional discrete Fourier transform (2-D DFT) computation which eliminates interprocessor communications and uses only O(N) processors. The mapping of the algorithm onto architectures with broadcast and report capabilities is discussed. Expressions are obtained for estimating the speed performance on these machines as a function of the size N×N of the 2-D DFT, the bandwidth of the communications channel, the time for an addition, the time T( F_N) for a single processing element to perform an N-point DFT, and the degree of parallelism. For single I/O channel machines that are capable of exploiting the full degree of parallelism of the algorithm, attainable execution times are as low as the time T(F_N) plus the I/O time for data upload and download. An implementation on a binary tree computer is discussed 相似文献

9.

A class of weak Kharitonov regions for robust stability of linearuncertain systems

Fu M. 《Automatic Control, IEEE Transactions on》1991,36(8):975-978

Kharitonov's theorems are generalized to the problem of so-called weak Kharitonov regions for robust stability of linear uncertain systems. Given a polytope of (characteristic) polynomials P and a stability region D in the complex plane, P is called D-stable if the zeros of every polynomial in P are contained in D. It is of interest to know whether the D-stability of the vertices of P implies the D-stability of P. A simple approach is developed which unifies and generalizes many known results on this problem 相似文献

10.

The minimal dimension of stable faces required to guaranteestability of a matrix polytope: D-stability

Cobb J.D. 《Automatic Control, IEEE Transactions on》1990,35(4):469-473

The problem of determining whether a polytope P of n ×n matrices is D-stable-i.e. whether each point in P has all its eigenvalues in a given nonempty, open, convex, conjugate-symmetric subset D of the complex plane-is discussed. An approach which checks the D-stability of certain faces of P is used. In particular, for each D and n the smallest integer m such that D-stability of every m-dimensional face guarantees D-stability of P is determined. It is shown that, without further information describing the particular structure of a polytope, either (2n-4)-dimensional or (2n-2)-dimensional faces need to be checked for D-stability, depending on the structure of D. Thus more work needs to be done before a computationally tractable algorithm for checking D-stability can be devised 相似文献

11.

Bicoprime factorizations of the plant and their relation to right and left-coprime factorizations

Desoer C.A. Gundes A.N. 《Automatic Control, IEEE Transactions on》1988,33(7):672-676

In a general algebraic framework, starting with a bicoprime factorization P=N_prD^-1 N_pl, a right-coprime factorization N_pD_p^-1, a left-coprime factorization D^-1_pN_p, and the generalized Bezout identities associated with the pairs (N_p, D_p) and (D˜ _p, N˜_p) are obtained. The set of all H-stabilizing compensators for P in the unity-feedback configuration S(P, C) are expressed in terms of (N_pr, D, N _pt) and the elements of the Bezout identity. The state-space representation P=C(sI-A)^-1B is included as an example 相似文献

12.

Efficient algorithms for list ranking and for solving graphproblems on the hypercube

Ryu K.W. Jaja J. 《Parallel and Distributed Systems, IEEE Transactions on》1990,1(1):83-90

A hypercube algorithm to solve the list ranking problem is presented. Let n be the length of the list, and let p be the number of processors of the hypercube. The algorithm described runs in time O(n/p) when n=Ω(p ^1+ε) for any constant ε>0, and in time O(n log n/p+log³ p) otherwise. This clearly attains a linear speedup when n=Ω(p ^1+ε). Efficient balancing and routing schemes had to be used to achieve the linear speedup. The authors use these techniques to obtain efficient hypercube algorithms for many basic graph problems such as tree expression evaluation, connected and biconnected components, ear decomposition, and st-numbering. These problems are also addressed in the restricted model of one-port communication 相似文献

13.

A unified task-based dependability model for hypercube computers

Das C.R. Kim J. 《Parallel and Distributed Systems, IEEE Transactions on》1992,3(3):312-324

A unified analytical model for computing the task-based dependability (TDB) of hypercube architectures is presented. A hypercube is deemed operational as long as a task can be executed on the system. The technique can compute both reliability and availability for two types of task requirements-I-connected model and subcube model. The I-connected TBD assumes that a connected group of at least I working nodes is required for task execution. The subcube TBD needs at least an m-cube in an n-cube, m⩽ n, for task execution. The dependability is computed by multiplying the probability that x nodes (x⩾I or x⩾2^m) are working in an n-cube at time t by the conditional probability that the hypercube can satisfy any one of the two task requirements from x working nodes. Recursive models are proposed for the two types of task requirements to find the connection probability. The subcube requirement is extended to find multiple subcubes for analyzing multitask dependability. The analytical results are validated through extensive simulation 相似文献

14.

Simultaneous controller design for linear time-invariant systems 总被引：1，自引：0，他引：1

Kabamba P.T. Yang C. 《Automatic Control, IEEE Transactions on》1991,36(1):106-111

The use of generalized sampled-data hold functions (GSHF) in the problem of simultaneous controller design for linear time-invariant plants is discussed. This problem can be stated as follows: given plants P₁, P₂, . . ., P_N, find a controller C which achieves not only simultaneous stability, but also simultaneous optimal performance in the N given systems. By this, it is meant that C must optimize an overall cost function reflecting the closed-loop performance of each plant when it is regulated by C. The problem is solved in three aspects: simultaneous stabilization, simultaneous optimal quadratic performance, and simultaneous pole assignment in combination with simultaneous intersampling performance 相似文献

15.

Parallel binary search

Akl S.G. Meijer H. 《Parallel and Distributed Systems, IEEE Transactions on》1990,1(2):247-250

Two arrays of numbers sorted in nondecreasing order are given: an array A of size n and an array B of size m, where n<m. It is required to determine, for every element of A, the smallest element of B (if one exists) that is larger than or equal to it. It is shown how to solve this problem on the EREW PRAM (exclusive-read exclusive-write parallel random-access machine) in O(logm logn/log log m) time using n processors. The solution is then extended to the case in which fewer than n processors are available. This yields an EREW PRAM algorithm for the problem whose cost is O(n log m, which is O(m)) for n⩽m/log m. It is shown how the solution obtained leads to an improved parallel merging algorithm 相似文献

16.

Optimal parallel initialization algorithms for a class of priorityqueues

Olariu S. Wen Z. 《Parallel and Distributed Systems, IEEE Transactions on》1991,2(4):423-429

An adaptive parallel algorithm for inducing a priority queue structure on an n-element array is presented. The algorithm is extended to provide optimal parallel construction algorithms for three other heap-like structures useful in implementing double-ended priority queues, namely min-max heaps, deeps, and min-max-pair heaps. It is shown that an n-element array can be made into a heap, a deap, a min-max heap, or a min-max-pair heap in O(log n+(n /p)) time using no more than n/log n processors, in the exclusive-read-exclusive-write parallel random-access machine model 相似文献

17.

Serial and parallel algorithms for the medial axis transform

Jenq J.-F. Sahni S. 《IEEE transactions on pattern analysis and machine intelligence》1992,14(12):1218-1224

An O(n²) time serial algorithm is developed for obtaining the medial axis transform (MAT) of an n×n image. An O(log n) time CREW PRAM algorithm and an O(log² n) time SIMD hypercube parallel algorithm for the MAT are also developed. Both of these use O(n²) processors. Two problems associated with the MAT, the area and perimeter reporting problem, are studied. An O(log n) time hypercube algorithm is developed for both of them, where n is the number of squares in the MAT, and the algorithms use O(n²) processors 相似文献

18.

Fast image labeling using local operators on mesh-connectedcomputers

Alnuweiri H.M. Kumar V.K.P. 《IEEE transactions on pattern analysis and machine intelligence》1991,13(2):202-207

A new parallel algorithm is proposed for fat image labeling using local operators on image pixels. The algorithm can be implemented on an n×n mesh-connected computer such that, for any integer k in the range [1, log (2n)], the algorithm requires Θ(kn¹k/) bits of local memory per processor and takes Θ(kn) time. Bit-serial processors and communication links can be used without affecting the asymptotic time complexity of the algorithm. The time complexity of the algorithm has very small leading constant factors, which makes it superior to previous mesh computer labeling algorithms for most practical image sizes (e.g. up to 4096×4096 images). Furthermore, the algorithm is based on using stacks that can be realized using very fast shift registers within each processing element 相似文献

19.

A new method of image compression using irreducible covers ofmaximal rectangles

Cheng Y. Iyengara S.S. Kashyap R.L. 《IEEE transactions on pattern analysis and machine intelligence》1988,14(5):651-658

The binary-image-compression problem is analyzed using irreducible cover of maximal rectangles. A bound on the minimum-rectangular-cover problem for image compression is given under certain conditions that previously have not been analyzed. It is demonstrated that for a simply connected image, the irreducible cover proposed uses less than four times the number of the rectangles in a minimum cover. With n pixels in a square, the parallel algorithm for obtaining the irreducible cover uses (n/log n) concurrent-read-exclusive write (CREW) processors in O(log n) time 相似文献

20.

A VLSI constant geometry architecture for the fast Hartley andFourier transforms

Zapata E.L. Arguello F. 《Parallel and Distributed Systems, IEEE Transactions on》1992,3(1):58-70

An application-specific architecture for the parallel calculation of the decimation in time and radix 2 fast Hartley (FHT) and Fourier (FFT) transforms is presented. A real sequence with N=2ⁿ data items is considered as input. The system calculates the FHT and the FFT in n and n+1 stages. respectively. The modular and regular parallel architecture is based on a constant geometry algorithm using butterflies of four data items and the perfect unshuffle permutation. With this permutation, the mapping of the algorithm in VLSI technology is simplified and the communications among processors are minimized. Organization of the processor memory based on first-in, first-out (FIFO) queues facilitates a systolic data flow and permits the implementation in a direct way of the complex data movements and address sequences of the transforms. This is accomplished by means of simple multiplexing operations, using hardwired control. The total calculation time is (Nlog₂N)/4Q cycles for the FHT and N(1+log₂N)/4Q cycles for the FFT, where Q is the number of processors ( Q= 2^q, Q⩽N/4) 相似文献