
1.

Özalp Babaoğlu Keith Marzullo Fred B. Schneider 《Real-Time Systems》1993,5(4):285-303

A priority inversion occurs when a low-priority task causes the execution of a higher-priority task to be delayed. The possibility of priority inversions complicates the analysis of systems that use priority-based schedulers because priority inversions invalidate the assumption that a task can be delayed by only higher-priority tasks. This paper formalizes priority inversion and gives sufficient conditions as well as some new protocols for preventing priority inversions. Supported by the Commission of the European Communities under the ESPRIT Programme Basic Research Action Number 3092 (Predictably Dependable Computing Systems) and the Italian Ministry of Research and University, and in part by the Defense Advanced Research Projects Agency (DoD) under NASA Ames grant number NAG-2-593. Supported in part by the Defense Advanced Research Projects Agency (DoD) under NASA Ames grant number NAG 2-593, and by grants from IBM T.J. Watson Research Laboratory, the IBM Endicott Programming Laboratory, Siemens RTL, and Xerox Webster Research Center. Supported in part by the Office of Naval Research under contract N00014-91-J-1219, the National Science Foundation under Grant No. CCR-8701103, DARPA/NSF Grant No. CCR-9014363, and by the IBM Endicott Programming Laboratory.

2.

Optimal tracing and replay for debugging message-passing parallel programs

Robert H. B. Netzer Barton P. Miller 《The Journal of supercomputing》1995,8(4):371-388

A common debugging strategy involves reexecuting a program (on a given input) over and over, each time gaining more information about bugs. Such techniques can fail on message-passing parallel programs. Because of nondeterminacy, different runs on the given input may produce different results. This nonrepeatability is a serious debugging problem, since an execution cannot always be reproduced to track down bugs. This paper presents a technique for tracing and replaying message-passing programs. By tracing the order in which messages are delivered, a reexecution can be forced to deliver messages in their original order, reproducing the original execution. To reduce the overhead of such a scheme, we show that the delivery order of only messages involved in races need be traced (and not every message). Our technique makes run-time decisions to detect and trace racing messages and is usually optimal in the sense that the minimal number of racing messages is traced. Experiments indicate that only 1% of the messages are often traced, gaining a reduction of two orders of magnitude over traditional techniques that trace every message. These traces allow an execution to be reproduced any number of times for debugging. Our work is novel in that we adaptively decide what to trace, and trace only those messages that introduce nondeterminacy. With our strategy, large reductions in trace size allow long-running programs to be replayed that were previously unmanageable. In addition, the reduced tracing requirements alleviate tracing bottlenecks, allowing executions to be debugged with substantially lower execution time overhead. This work was supported in part by National Science Foundation grants CCR-8815928 and CCR-9100968, Office of Naval Research grant N00014-89-J-1222, and a grant from Sequent Computer Systems, Inc.

3.

Extended parallelism in the Gröbner basis algorithm

Stephen A. Schwab 《International journal of parallel programming》1992,21(1):39-66

This paper presents a new parallel implementation to compute Gröbner bases utilizing two different forms of parallelism. A coarse-grain technique developed by Jean-Phillipe Vidal expands and reduces S-polynomials in parallel. A fine-grain technique, proposed by Melenk and Neun, constructs a pipeline of processors to overlap execution of the reduction operations. A hybrid algorithm that outperforms both of the original approaches is presented. The combined algorithm requires the user to select the appropriate allocation of processors to the two styles of parallelism, and uses this static assignment throughout the computation. The paper also discusses the design and implementation approaches used to construct an efficient version of this algorithm. The author was partially supported by an NSF graduate fellowship. This research was sponsored in part by the Avionics Laboratory, Wright Research and Development Center, Aeronautical Systems Division (AFSC), U.S. Air Force, Wright-Patterson AFB, Ohio 45433-6543 under Contract F33615-90-C-1465, ARPA Order No. 7597 and in part by the National Science Foundation under grant CCR-87-226-33. The views and conclusions contained in this document are those of the author and should not be interpreted as representing the official policies, either expressed or implied, of the National Science Foundation or the U.S. government.

4.

Rewriting-Based Techniques for Runtime Verification (total citations: 1; self-citations: 0; citations by others: 1)

Grigore Roşu Klaus Havelund 《Automated Software Engineering》2005,12(2):151-197

Techniques for efficiently evaluating future time Linear Temporal Logic (abbreviated LTL) formulae on finite execution traces are presented. While the standard models of LTL are infinite traces, finite traces appear naturally when testing and/or monitoring real applications that only run for limited time periods. A finite trace variant of LTL is formally defined, together with an immediate executable semantics which turns out to be quite inefficient if used directly, via rewriting, as a monitoring procedure. Then three algorithms are investigated. First, a simple synthesis algorithm for monitors based on dynamic programming is presented; despite the efficiency of the generated monitors, they unfortunately need to analyze the trace backwards, thus making them unusable in most practical situations. To circumvent this problem, two rewriting-based practical algorithms are further investigated, one using rewriting directly as a means for online monitoring, and the other using rewriting to generate automata-like monitors, called binary transition tree finite state machines (and abbreviated BTT-FSMs). Both rewriting algorithms are implemented in Maude, an executable specification language based on a very efficient implementation of term rewriting. The first rewriting algorithm essentially consists of a set of equations establishing an executable semantics of LTL, using a simple formula transforming approach. This algorithm is further improved to build automata on-the-fly via caching and reuse of rewrites (called memoization), resulting in a very efficient and small Maude program that can be used to monitor program executions. The second rewriting algorithm builds on the first one and synthesizes provably minimal BTT-FSMs from LTL formulae, which can then be used to analyze execution traces online without the need for a rewriting system. 
The presented work is part of an ambitious runtime verification and monitoring project at NASA Ames, called PathExplorer, and demonstrates that rewriting can be a tractable and attractive means for experimenting and implementing logics for program monitoring. Supported in part by joint NSF/NASA grant CCR-0234524.
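The formula-transforming idea in the abstract above can be illustrated with a small sketch. It is written in Python rather than Maude, and every name in it (`progress`, `monitor`, the tuple encoding of formulae) is a hypothetical choice for illustration, not the paper's code. It assumes the finite-trace convention that the last state repeats forever, so at the final state "next", "always", and "eventually" all reduce to evaluating their argument there. Memoization and BTT-FSM synthesis are not shown.

```python
# Formulae are Python booleans or tuples:
#   ("ap", name), ("not", f), ("and", f, g), ("or", f, g),
#   ("next", f), ("always", f), ("eventually", f), ("until", f, g)

def neg(f):
    return (not f) if isinstance(f, bool) else ("not", f)

def conj(f, g):
    if f is False or g is False:
        return False
    if f is True:
        return g
    if g is True:
        return f
    return ("and", f, g)

def disj(f, g):
    if f is True or g is True:
        return True
    if f is False:
        return g
    if g is False:
        return f
    return ("or", f, g)

def progress(f, state, last):
    """Rewrite f after observing `state` (a set of atomic propositions).

    The result is the obligation that the rest of the trace must meet.
    When `last` is true the result is always a plain boolean.
    """
    if isinstance(f, bool):
        return f
    op = f[0]
    if op == "ap":
        return f[1] in state
    if op == "not":
        return neg(progress(f[1], state, last))
    if op == "and":
        return conj(progress(f[1], state, last), progress(f[2], state, last))
    if op == "or":
        return disj(progress(f[1], state, last), progress(f[2], state, last))
    if op == "next":
        return progress(f[1], state, True) if last else f[1]
    if op == "always":
        return conj(progress(f[1], state, last), True if last else f)
    if op == "eventually":
        return disj(progress(f[1], state, last), False if last else f)
    if op == "until":
        rest = False if last else f
        return disj(progress(f[2], state, last),
                    conj(progress(f[1], state, last), rest))
    raise ValueError(f"unknown operator: {op!r}")

def monitor(formula, trace):
    """Consume a non-empty finite trace online; return the boolean verdict."""
    if not trace:
        raise ValueError("trace must be non-empty")
    for i, state in enumerate(trace):
        formula = progress(formula, state, last=(i == len(trace) - 1))
        if isinstance(formula, bool):
            break  # verdict reached, possibly before the trace ends
    return formula

# "Every request is eventually granted."
RESPONSE = ("always", ("or", ("not", ("ap", "req")),
                       ("eventually", ("ap", "grant"))))
```

The monitor only ever looks at the current state and the current formula, which is what makes the approach usable online, in contrast to the backward dynamic-programming monitors the abstract mentions.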

5.

Randomized competitive algorithms for the list update problem

Nick Reingold Jeffery Westbrook Daniel D. Sleator 《Algorithmica》1994,11(1):15-32

We prove upper and lower bounds on the competitiveness of randomized algorithms for the list update problem of Sleator and Tarjan. We give a simple and elegant randomized algorithm that is more competitive than the best previous randomized algorithm due to Irani. Our algorithm uses randomness only during an initialization phase, and from then on runs completely deterministically. It is the first randomized competitive algorithm with this property to beat the deterministic lower bound. We generalize our approach to a model in which access costs are fixed but update costs are scaled by an arbitrary constant d. We prove lower bounds for deterministic list update algorithms and for randomized algorithms against oblivious and adaptive on-line adversaries. In particular, we show that for this problem adaptive on-line and adaptive off-line adversaries are equally powerful. A preliminary version of these results appeared in a joint paper with S. Irani in the Proceedings of the 2nd Symposium on Discrete Algorithms, 1991 [17]. This research was partially supported by NSF Grants CCR-8808949 and CCR-8958528. This research was partially supported by NSF Grant CCR-9009753. This research was supported in part by the National Science Foundation under Grant CCR-8658139, by DIMACS, a National Science Foundation Science and Technology center, Grant No. NSF-STC88-09648.
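The abstract does not spell out the algorithm, so the following is only a sketch of one rule with the stated property (random bits drawn once at initialization, deterministic behavior afterwards). This rule is usually called BIT in the list-update literature; that it matches the paper's algorithm in every detail is an assumption. The class name `BitList` and the cost model (accessing the item at position i costs i, counting from 1) are illustrative choices.

```python
import random

class BitList:
    """Self-organizing list using a BIT-style rule (illustrative sketch)."""

    def __init__(self, items, rng=None):
        rng = rng or random.Random()
        self.items = list(items)
        # The only use of randomness: one independent bit per item.
        self.bits = {x: rng.randrange(2) for x in self.items}

    def access(self, x):
        """Access x, return the access cost, and maybe move x to the front."""
        pos = self.items.index(x)
        cost = pos + 1
        self.bits[x] ^= 1              # complement the item's bit
        if self.bits[x] == 1:          # bit just became 1: move to front
            self.items.insert(0, self.items.pop(pos))
        return cost
```

Under this rule each item moves to the front on every second access to it, and the initial bit decides whether that happens on its odd-numbered or even-numbered accesses.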

6.

An application of program unification to priority queue vectorization

Ling-Yu Chuang Vernon Rego Aditya Mathur 《International journal of parallel programming》1992,21(3):193-224

In this experimental study, we apply the technique of program unification to priority queues. We examine the performance of a variety of unified priority queue implementations on a Cray Y-MP. The scope of the study is restricted to determining if different implementations of priority queues exhibit markedly different performance characteristics under program unification. We found this to be true. In a larger view, this result has interesting consequences in the application of program unification to discrete event simulations on vector or SIMD machines. We find the heap to be a promising data structure in the program unification paradigm. This research was supported by the National Science Foundation under Grant No. ASC-9002225, and by NATO under Grant CRG 900108. Also supported in part by the Mathematical Sciences Section of Oak Ridge National Laboratory under contract DE-AC05-84OR21400 with Marietta Energy Systems, Inc.

7.

Optimal parallel algorithms for point-set and polygon problems

Richard Cole Michael T. Goodrich 《Algorithmica》1992,7(1):3-23

In this paper we give parallel algorithms for a number of problems defined on point sets and polygons. All our algorithms have optimal T(n) * P(n) products, where T(n) is the time complexity and P(n) is the number of processors used, and are for the EREW PRAM or CREW PRAM models. Our algorithms provide parallel analogues to well-known phenomena from sequential computational geometry, such as the fact that problems for polygons can oftentimes be solved more efficiently than point-set problems, and that nearest-neighbor problems can be solved without explicitly constructing a Voronoi diagram. The research of R. Cole was supported in part by NSF Grants CCR-8702271, CCR-8902221, and CCR-8906949, by ONR Grant N00014-85-K-0046, and by a John Simon Guggenheim Memorial Foundation fellowship. M. T. Goodrich's research was supported by the National Science Foundation under Grant CCR-8810568 and by the National Science Foundation and DARPA under Grant CCR-8908092.

8.

Lie and Morse theory for periodic orbits of vector fields and matrix Riccati equations,II

Robert Hermann Clyde Martin 《Theory of Computing Systems》1983,16(1):297-306

In this paper, elementary techniques from linear algebra and elementary properties of the Grassmann manifolds are used to prove the existence of periodic orbits and to study the equilibrium structure of Riccati differential equations. Supported in part by NASA Grants #2384 and NAG-82 and DOE Contract #DE-AC01-80RA-5256. Supported in part by NASA Grant #NSG-2402, ARMY Grant #ILIG1102RHN7-05 and the National Science Foundation.

9.

Initial failures in distributed computations

Gadi Taubenfeld Shmuel Katz Shlomo Moran 《International journal of parallel programming》1989,18(4):255-276

We investigate the possibility of solving problems in completely asynchronous message passing systems where a number of processes may fail prior to execution. By using game-theoretical notions, necessary and sufficient conditions are provided for solving problems in such a model with and without a termination requirement. An upper bound on the message complexity for solving any problem in the model is given, as well as a simple design concept for constructing a solution to any solvable problem. Supported in part by the Guttwirth Fellowship, by the National Science Foundation under grant CCR-8405478, and by the Hebrew Technical Institute scholarship. Supported in part by Technion V.P.R. fund - C. Wellner Research fund.

10.

The complexity of reachability in distributed communicating processes

John H. Reif Scott A. Smolka 《Acta Informatica》1988,25(4):333-354

Summary A crucial problem in the analysis of communicating processes is the detection of program statements that are unreachable due to communication deadlocks. In this paper, we consider the computational complexity of the reachability problem for various models of communicating processes. We obtain these models by making simplifying assumptions about the behavior of message queues and program control, with the hope that reachability may become easier to decide. Depending on the assumptions made, we show that reachability is undecidable, requires nearly exponential space infinitely often, or is NP-complete. In obtaining these results, we demonstrate a very close relationship between the decidable models and Petri nets and Habermann’s path expressions, respectively. A preliminary version of this paper appeared in the proceedings of the Sixth Annual ACM Symposium on Principles of Programming Languages, pp. 257–268, June 1979. Supported by National Science Foundation Grant NSF MCS 82-00269 and the Office of Naval Research Contract N00014-80-C-0647. Supported by National Science Foundation Grants NSF DCR-8505873 and CCR-8704309.

11.

The complexity of reachability in distributed communicating processes

John H. Reif Scott A. Smolka 《Acta Informatica》1988,25(3):333-354

Summary A crucial problem in the analysis of communicating processes is the detection of program statements that are unreachable due to communication deadlocks. In this paper, we consider the computational complexity of the reachability problem for various models of communicating processes. We obtain these models by making simplifying assumptions about the behavior of message queues and program control, with the hope that reachability may become easier to decide. Depending on the assumptions made, we show that reachability is undecidable, requires nearly exponential space infinitely often, or is NP-complete. In obtaining these results, we demonstrate a very close relationship between the decidable models and Petri nets and Habermann's path expressions, respectively. A preliminary version of this paper appeared in the proceedings of the Sixth Annual ACM Symposium on Principles of Programming Languages, pp. 257–268, June 1979. Supported by National Science Foundation Grant NSF MCS 82-00269 and the Office of Naval Research Contract N00014-80-C-0647. Supported by National Science Foundation Grants NSF DCR-8505873 and CCR-8704309. ACM = The Association for Computing Machinery, Inc. IEEE = The Institute of Electrical and Electronics Engineers, Inc.

12.

Data flow analysis of distributed communicating processes (total citations: 1; self-citations: 0; citations by others: 1)

John H. Reif Scott A. Smolka 《International journal of parallel programming》1990,19(1):1-30

Data flow analysis is a technique essential to the compile-time optimization of computer programs, wherein facts relevant to program optimizations are discovered by the global propagation of facts obvious locally. This paper extends several known techniques for data flow analysis of sequential programs to the static analysis of distributed communicating processes. In particular, we present iterative algorithms for detecting unreachable program statements, and for determining the values of program expressions. The latter information can be used to place bounds on the size of variables and messages. Our main innovation is the event spanning graph, which serves as a heuristic for ordering the nodes through which data flow information is propagated. We consider both static communication, where all channel arguments are constants, and the more difficult dynamic communication, where channel arguments may be variables and channels may be passed as messages. A preliminary version of this paper appeared in the proceedings of the Sixth Annual ACM Symposium on Principles of Programming Languages, pp. 257–268, June 1979. Supported by National Science Foundation Grant NSF MCS82-00269 and the Office of Naval Research Contract N00014-80-C-0647. Supported by National Science Foundation Grants NSF DCR-8505873 and NSF CCR-8704309.

13.

Blocking for external graph searching

M. H. Nodine M. T. Goodrich J. S. Vitter 《Algorithmica》1996,16(2):181-214

In this paper we consider the problem of using disk blocks efficiently in searching graphs that are too large to fit in internal memory. Our model allows a vertex to be represented any number of times on the disk in order to take advantage of redundancy. We give matching upper and lower bounds for complete d-ary trees and d-dimensional grid graphs, as well as for classes of general graphs that intuitively speaking have a close to uniform number of neighbors around each vertex. We also show that, for the special case of grid graphs blocked with isothetic hypercubes, there is a provably better speed-up if even a small amount of redundancy is permitted. Support was provided in part by an IBM Graduate Fellowship, by NSF Research Grants CCR-9007851 and IRI-9116451, and by Army Research Office Grant DAAL03-91-G-0035. Support was provided in part by NSF Grants CCR-9003299, CCR-9300079, and IRI-9116843, and by NSF/DARPA Grant CCR-8908092. Support was provided in part by a National Science Foundation Presidential Young Investigator Award CCR-9047466 with matching funds from IBM, by NSF Research Grant CCR-9007851, and by Army Research Office Grant DAAL03-91-G-0035.

14.

Delay point schedules for irregular parallel computations

David M. Nicol Joel H. Saltz James C. Townsend 《International journal of parallel programming》1989,18(1):69-90

In irregular scientific computational problems one is periodically forced to choose a delay point where some overhead cost is suffered to ensure correctness, or to improve subsequent performance. Examples of delay points are problem remappings, and global synchronizations. One sometimes has considerable latitude in choosing the placement and frequency of delay points; we consider the problem of scheduling delay points so as to minimize the overall execution time. We illustrate the problem with two examples, a regridding method which changes the problem discretization during the course of the computation, and a method for solving sparse triangular systems of linear equations. We show that one can optimally choose delay points in polynomial time using dynamic programming. However, the cost models underlying this approach are often unknown. We consequently examine a scheduling heuristic based on maximizing performance locally, and empirically show it to be nearly optimal on both problems. We explain this phenomenon analytically by identifying underlying assumptions which imply that overall performance is maximized asymptotically if local performance is maximized. This research was supported in part by the National Aeronautics and Space Administration under NASA contract NAS1-18107 while the author consulted at ICASE, Mail Stop 132C, NASA Langley Research Center, Hampton, Virginia 23665. Supported in part by NASA contract NAS1-18107, the Office of Naval Research under Contract No. N00014-86-K-0654, and NSF Grant DCR 8106181.
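The dynamic-programming claim in the abstract above can be made concrete with a generic sketch. The cost model here is an assumption made purely for illustration and is not taken from the paper: a computation has `n` steps, the k-th step since the most recent delay point costs `step_cost(k)`, and each delay point adds a fixed overhead `delay_cost`. The function name and parameters are hypothetical.

```python
def schedule_delay_points(n, step_cost, delay_cost):
    """Choose delay points minimizing total cost under the assumed model.

    A delay point at index i means the overhead is paid just before step i
    (0 < i < n). Returns (minimum total cost, sorted list of delay points).
    Runs in O(n^2) time.
    """
    # seg[m] = cost of running m consecutive steps with no delay point.
    seg = [0.0] * (n + 1)
    for m in range(1, n + 1):
        seg[m] = seg[m - 1] + step_cost(m - 1)

    # best[j] = cheapest way to finish the first j steps.
    best = [0.0] + [float("inf")] * n
    prev = [0] * (n + 1)  # start index of the last segment in that optimum
    for j in range(1, n + 1):
        for i in range(j):
            overhead = delay_cost if i > 0 else 0.0  # no overhead at the start
            cost = best[i] + overhead + seg[j - i]
            if cost < best[j]:
                best[j], prev[j] = cost, i

    # Walk the segment starts backwards to recover the delay points.
    points = []
    j = n
    while j > 0:
        i = prev[j]
        if i > 0:
            points.append(i)
        j = i
    points.reverse()
    return best[n], points
```

With four steps whose cost grows as 1, 2, 3, 4 since the last delay point and an overhead of 2, a single delay point before step 2 is cheapest (total 8 instead of 10 with none). The sketch needs the cost functions as inputs, which is exactly the limitation the abstract notes when it turns to a local heuristic.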

15.

Lazy structure sharing for query optimization

Adam L. Buchsbaum Rajamani Sundar Robert E. Tarjan 《Acta Informatica》1995,32(3):255-270

We study lazy structure sharing as a tool for optimizing equivalence testing on complex data types. We investigate a number of strategies for implementing lazy structure sharing and provide upper and lower bounds on their performance (how quickly they effect ideal configurations of our data structure). In most cases when the strategies are applied to a restricted case of the problem, the bounds provide nontrivial improvements over the naïve linear-time equivalence-testing strategy that employs no optimization. Only one strategy, however, which employs path compression, seems promising for the most general case of the problem. Work completed while at Princeton University and supported by a Fannie and John Hertz Foundation Fellowship, National Science Foundation Grant No. CCR-8920505, and the Center for Discrete Mathematics and Theoretical Computer Science (DIMACS) under NSF-STC-91-19999. Work completed while at Princeton University and DIMACS and supported by DIMACS under NSF-STC-91-19999. Research at Princeton University partially supported by the National Science Foundation, Grant No. CCR-8920505, the Office of Naval Research, Contract No. N00014-91-J-1463, and by DIMACS under NSF-STC-91-19999.

16.

Fine-grain compilation for pipelined machines

Alexandru Nicolau Keshav Pingali Alexander Aiken 《The Journal of supercomputing》1988,2(3):279-295

Computer architecture design requires careful attention to the balance between the complexity of code scheduling problems and the cost and feasibility of building a machine. In this paper, we show that recently developed software pipelining algorithms produce optimal or near-optimal code for a large class of loops when the target architecture is a clean pipelined parallel machine. The important feature of these machines is the absence of structural hazards. We argue that the robustness of the scheduling algorithms and relatively simple hardware make these machines realistic and cost-effective. To illustrate the delicate balance between architecture and scheduling complexity, we show that scheduling with structural hazards is NP-hard, and that there are machines with simple structural hazards for which vectorization and the software pipelining techniques generate poor code. Supported in part by NSF Grants DCR-8502884, CCR-8704367, ONR Grant N00014-86-K-0215, and the Cornell NSF Supercomputing Center. Supported by NSF Grant CCR-8702668 and an IBM Faculty Development Award.

17.

Algorithms for bichromatic line-segment problems and polyhedral terrains

Bernard Chazelle Herbert Edelsbrunner Leonidas J. Guibas Micha Sharir 《Algorithmica》1994,11(2):116-132

We consider a variety of problems on the interaction between two sets of line segments in two and three dimensions. These problems range from counting the number of intersecting pairs between m blue segments and n red segments in the plane (assuming that two line segments are disjoint if they have the same color) to finding the smallest vertical distance between two nonintersecting polyhedral terrains in three-dimensional space. We solve these problems efficiently by using a variant of the segment tree. For the three-dimensional problems we also apply a variety of recent combinatorial and algorithmic techniques involving arrangements of lines in three-dimensional space, as developed in a companion paper. Work on this paper by the first author has been supported in part by the National Science Foundation under Grant CCR-9002352. Work by the second author was supported in part by the National Science Foundation under Grant CCR-8714565. The fourth author has been supported in part by the Office of Naval Research under Grant N0014-87-K-0129, by the National Science Foundation under Grant NSF-DCR-83-20085, by grants from the Digital Equipment Corporation and the IBM Corporation, and by a grant from the US-Israeli Binational Science Foundation.

18.

Near-optimal message routing and broadcasting in faulty hypercubes

Douglas M. Blough Nader Bagherzadeh 《International journal of parallel programming》1990,19(5):405-423

A distributed routing algorithm for faulty hypercubes is described. This algorithm uses a directed depth-first approach to find a path between the sender and receiver of a message whenever at least one non-faulty path exists. We show that, when an arbitrary number of elements of the hypercube can be faulty, the algorithm always routes messages using fewer than 2N hops, where N is the number of nodes in the hypercube. This performance is shown to be within a factor of two of the optimal worst-case routing efficiency. Through fault simulations, we show that, even when up to half of the elements in the cube are faulty, complete the analysis, we prove that our algorithm is deadlock-free. Finally, we present two extensions of the algorithm. The first uses local storage to reduce the overhead of the algorithm while the second allows reliable broadcasting in the presence of an arbitrary number of faults. Supported in part by the National Science Foundation under Grant CCR-9010547. Supported in part by the National Science Foundation Instrumentation Grant CDA-8820627.
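As a rough illustration of what a directed depth-first search in a faulty hypercube can look like, here is a centralized sketch. It is not the paper's distributed algorithm: it models node faults only (the abstract speaks of faulty elements in general), keeps the visited set in one place instead of carrying it with the message, and the function name `route` and its parameters are hypothetical.

```python
def route(src, dst, dim, faulty):
    """Depth-first routing in a `dim`-dimensional hypercube with faulty nodes.

    Nodes are integers in [0, 2**dim); neighbors differ in exactly one bit.
    Returns (path, hops): `path` is the final node sequence from src to dst
    (None if dst is unreachable), and `hops` counts every link traversal,
    backtracking included.
    """
    if src in faulty or dst in faulty:
        return None, 0
    visited = {src}
    path = [src]
    hops = 0
    while path:
        cur = path[-1]
        if cur == dst:
            return path, hops
        diff = cur ^ dst
        # "Directed": try dimensions that bring the message closer first.
        order = sorted(range(dim), key=lambda d: not (diff >> d) & 1)
        for d in order:
            nxt = cur ^ (1 << d)
            if nxt not in faulty and nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                hops += 1
                break
        else:
            # Dead end: return the message to the previous node.
            path.pop()
            if path:
                hops += 1
    return None, hops
```

In this sketch every node is entered at most once and left backwards at most once, so the hop count stays below 2N for N nodes, in line with the bound quoted in the abstract.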

19.

System reduction via truncated Hankel matrices

Charles K. Chui Xin Li Joseph D. Ward 《Mathematics of Control, Signals, and Systems (MCSS)》1991,4(2):161-175

The problem of approximating Hankel operators of finite or infinite rank by lower-rank Hankel operators is considered. For efficiency, truncated Hankel matrices are used as the intermediate step before other existing algorithms such as the CF algorithms are applied to yield the desirable approximants. If the Hankel operator to be approximated is of finite rank, the order of approximation by truncated Hankel operators is obtained. It is also shown that when the mth s-number is simple, then rational symbols of the best rank-m Hankel approximants of the nth truncated Hankel matrices converge uniformly to the corresponding rational symbol of the best rank-m Hankel approximant of the original Hankel operator as n tends to infinity. Supported by SDIO/IST managed by the U.S. Army under Contract No. DAAL03-87-K-0025 and also supported by the National Science Foundation under Grant No. DMS 8602337. Supported by SDIO/IST managed by the U.S. Army under Contract No. DAAL03-87-K-0025. Supported by the National Science Foundation under Grant No. DMS 8602337.

20.

On the power of parity polynomial time

Jin-yi Cai Lane A. Hemachandra 《Theory of Computing Systems》1990,23(1):95-106

This paper proves that the complexity class ⊕P, parity polynomial time [PZ], contains the class of languages accepted by NP machines with few accepting paths. Indeed, ⊕P contains a broad class of languages accepted by path-restricted nondeterministic machines. In particular, ⊕P contains the polynomial accepting path versions of NP, of the counting hierarchy, and of Mod_m NP for m > 1. We further prove that the class of nondeterministic path-restricted languages is closed under bounded truth-table reductions. These results were announced at the 6th Symposium on Theoretical Aspects of Computer Science [CH3]. Jin-yi Cai was supported in part by NSF Grant CCR-8709818 and the work was done while he was at Yale University. Lane A. Hemachandra was supported in part by a Hewlett-Packard Corporation equipment grant and the National Science Foundation under Grant CCR-8809174/CCR-8996198 and a Presidential Young Investigator Award. His work was done in part while at Columbia University.