期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Solution of finite element systems on concurrent processing computers

Charbel Farhat Edward Wilson Graham Powell 《Engineering with Computers》1987,2(3):157-165

A new computer program architecture for the solution of finite element systems using concurrent processing is presented. The basic approach involves the automatic creation of substructures. A host provides control over a set of processors, each of which is assigned initially to one substructure, then dynamically reassigned to the common interface for the solution of the complete system of substructures. Algorithm details are presented fo each phase of the analysis.Results of analysis of large plate bending problems on a hypercube multicomputer are reported. For a system with 2,000 equations, an efficiency of 80 percent of the maximum theoretical value was obtained using 16 processors. 相似文献

2.

Multicomputers: message-passing concurrent computers 总被引：1，自引：0，他引：1

Athas W.C. Seitz C.L. 《Computer》1988,21(8):9-24

A status report is provided on the architecture and programming of a family of concurrent computers that are organized as ensembles of small programmable computers called nodes, connected by a message-passing network, each with its own private memory. The architecture of the multicomputer is described and contrasted with that of the shared-memory multiprocessor, and the concept of grain size (which depends on the size of the individual memories) is explained. Medium-grain and fine-grain multicomputers, with nodes containing megabytes and tens of kilobytes of memory, respectively, are examined, and their programming is discussed 相似文献

3.

Parallel computers for region-level image processing

Azriel Rosenfeld Angela Y. Wu 《Pattern recognition》1982,15(1):41-50

It is well known that parallel computers can be used very effectively for image processing at the pixel level, by assigning a processor to each pixel or block of pixels, and passing information as necessary between processors whose blocks are adjacent. This paper discusses the use of parallel computers for processing images at the region level, assigning a processor to each region and passing information between processors whose regions are related. The basic difference between the pixel and region levels is that the regions (e.g. obtained by segmenting the given image) and relationships differ from image to image, and even for a given image, they do not remain fixed during processing. Thus, one cannot use the standard type of cellular parallelism, in which the set of processors and interprocessor connections remain fixed, for processing at the region level. Reconfigurable cellular computers, in which the set of processors that each processor can communicate with can change during a computation, are more appropriate. A class of such computers is described, and general examples are given illustrating how such a computer could initially configure itself to represent a given decomposition of an image into regions, and dynamically reconfigure itself, in parallel, as regions merge or split. 相似文献

4.

Parallel computers for advanced information processing

America P.H.M. Hulshof B.J.A. Odijk E.A.M. Sijstermans F. van Twist R.A.H. Wester R.h.H. 《Micro, IEEE》1990,10(6)

An overview is given of ESPRIT project 415, which involved the study of object-oriented, functional, and logic programming styles in six subprojects. The parallel languages and architectures designed to implement them are described, and the technology of the object-oriented approach pursued by the authors' team is examined. Their design includes a novel language, decentralized memory architecture, and system software 相似文献

5.

Optical processing paradigms for electronic computers

Mitkas P.A. Betzos G.A. Irakliotis L. 《Computer》1998,31(2):45-51

Even if electronics were to reach some natural law, the authors state that photonics is not ready to replace electronics as the new platform for digital data processing. On the other hand, they say, photonics can greatly enhance the performance of electronic computers. Photonics already contributes to the fields of data storage and data communication. The missing link, however, between storage and communication is data processing. The authors describe a special optoelectronic architecture that can support data processing. In the context of their research, the authors discuss three computing paradigms: analog optical processing, which entails analog operations on sets of analog data obtained by a camera lens, for example; digital optical processing, which involves the use of light to perform digital logic; and analog-digital hybrid optical processing, which involves hybrid techniques and systems that can process both types of data 相似文献

6.

Decomposition of knowledge for concurrent processing 总被引：1，自引：0，他引：1

Babin G. Cheng Hsu 《Knowledge and Data Engineering, IEEE Transactions on》1996,8(5):758-772

In some environments, it is more difficult for distributed systems to cooperate. In fact, some distributed systems are highly heterogeneous and might not readily cooperate. In order to alleviate these problems, we have developed an environment that preserves the autonomy of the local systems, while enabling distributed processing. This is achieved by: modeling the different application systems into a central knowledge base (called a Metadatabase); providing each application system with a local knowledge processor; and distributing the knowledge within these local shells. This paper is concerned with describing the knowledge decomposition process used for its distribution. The decomposition process is used to minimize the needed cooperation among the local knowledge processors, and is accomplished by “serializing” the rule execution process. A rule is decomposed into an ordered set of subrules, each of which is executed in sequence and located in a specific local knowledge processor. The goals of the decomposition algorithm are to minimize the number of subrules produced, hence reducing the time spent in communication, and to assure that the sequential execution of the subrules is “equivalent” to the execution of the original rule 相似文献

7.

Decoherence dynamics estimation for superconducting gate-model quantum computers

Gyongyosi Laszlo 《Quantum Information Processing》2020,19(10):1-9

Quantum Information Processing - In this note, I attempt to explore the quantum Colonel Blotto game and contrast it with the classical Colonel Blotto game; in particular, I will focus on an... 相似文献

8.

A concurrent test architecture for massively parallel computers andits error detection capability

Hancu M.V.A. Iwasaki K. Sato Y. Sugie M. 《Parallel and Distributed Systems, IEEE Transactions on》1994,5(11):1169-1184

Presents new principles for online monitoring in the context of multiprocessors (especially massively parallel processors) and then focuses on the effect of the aliasing probability on the error detection process. In the proposed test architecture, concurrent testing (or online monitoring) at the system level is accomplished by enforcing the run-time testing of the data and control dependences of the algorithm currently being executed on the parallel computer. In order to help in this process, each message contains both source and destination addresses. At each message source, the sequence of destination addresses of the outgoing messages is compressed on a block basis. At the same time, at each destination, the sequence of source addresses of all incoming messages is compressed, also on a block basis. Concurrent compression of the instructions executed by the PEs is also possible. As a result of this procedure, an image of the data dependences and of the control flow of the currently running algorithm is created. This image is compared, at the end of each computational block, with a reference image created at compilation time. The main results of this work are in proposing new principles for the online system-level testing of multiprocessor systems, based on signaturing and monitoring the data dependences together with the control dependences, and in providing an analytical model and analysis for the address compression process used for monitoring the data routing process 相似文献

9.

A survey of synchronization methods for parallel computers

Dinning A. 《Computer》1989,22(7):66-77

An examination is given of how traditional synchronization methods influence the design of MIMD (multiple-instruction multiple-data-stream) multiprocessors. She provides an overview of MIMD multiprocessing and goes on to discuss semaphore-based implementations (Ultracomputers, Cedar, and the Sequent Balance/21000), monitor-based implementations (the HM²p) and implementations based on message-passing (HEP, the BBN Butterfly and the Transputer) 相似文献

10.

Non-numerical methods on parallel computers

P.M. Flanders 《Computer Physics Communications》1982,26(3-4):363-371

相似文献

11.

Evaluating deadlock detection methods for concurrent software 总被引：1，自引：0，他引：1

Corbett J.C. 《IEEE transactions on pattern analysis and machine intelligence》1996,22(3):161-180

Static analysis of concurrent programs has been hindered by the well-known state explosion problem. Although many different techniques have been proposed to combat this state explosion, there is little empirical data comparing the performance of the methods. This information is essential for assessing the practical value of a technique and for choosing the best method for a particular problem. In this paper, we carry out an evaluation of three techniques for combating the state explosion problem in deadlock detection: reachability searching with a partial-order state-space reduction, symbolic model checking and inequality-necessary conditions. We justify the method used for the comparison, and carefully analyze several sources of potential bias. The results of our evaluation provide valuable data on the kinds of programs to which each technique might best be applied. Furthermore, we believe that the methodological issues we discuss are of general significance in comparison of analysis techniques 相似文献

12.

Design,implementation and evaluation of a deadlock-free routing algorithm for concurrent computers

M. Cannataro G. Spezzano D. Talia E. Gallizzi 《Concurrency and Computation》1992,4(2):143-161

This paper describes the design, the implementation, and the performance results of a routing algorithm which provides deadlock-free communication in a tightly coupled message-passing concurrent computer. The algorithm is adaptive, isolated and uses the store-and-forward technique. It allows message communication between two processes regardless of where they are physically located on the network. The routing algorithm has many positive characteristics including provable deadlock freedom, guaranteed message arrival, and automatic local congestion reduction. It can be used as a basis for the design of high-level communication primitives. An Occam implementation on a network of inmos Transputers is discussed. The experimental results show that the routing algorithm is effective to support process to process communication on a concurrent computer. 相似文献

13.

Interrupt processing in concurrent processors

Walker W. Cragon H.G. 《Computer》1995,28(6):36-46

Systems architects are faced with many possibilities for designing interrupt processing strategies that optimize computer resources and performance. This framework of hardware implementation techniques highlights choices for consideration. The approach we've developed broadly classifies interrupt processing techniques and implementations into six phases. In preparing this taxonomy, we've examined the strategies used in 15 modern concurrent processors (those that can process more than one instruction at a time), such as the MIPS R4000 and Intel Pentium. We extend our findings, as applicable, to interrupt processing design decisions in general and survey the different hardware techniques available to designers. We concentrate on concurrent processors because their interrupt processing systems are more complex than those of nonconcurrent processors, and because the level of concurrency in modern processors is steadily increasing 相似文献

14.

A new parallel matrix multiplication algorithm on distributed-memory concurrent computers

Jaeyoung Choi 《Concurrency and Computation》1998,10(8):655-670

We present a new fast and scalable matrix multiplication algorithm called DIMMA (distribution-independent matrix multiplication algorithm) for block cyclic data distribution on distributed-memory concurrent computers. The algorithm is based on two new ideas; it uses a modified pipelined communication scheme to overlap computation and communication effectively, and exploits the LCM block concept to obtain the maximum performance of the sequential BLAS (basic linear algebra subprograms) routine in each processor even when the block size is very small or very large. The algorithm is implemented and compared with SUMMA on the Intel Paragon computer. © 1998 John Wiley & Sons, Ltd. 相似文献

15.

Design and optimization of parallel computers for processing mass data

O. L. Perevozchikova V. G. Tulchinsky R. A. Yushchenko 《Cybernetics and Systems Analysis》2006,42(4):559-569

Problems of high-performance processing of mass cluster data are considered. Estimates of execution times of parallel data processing programs and a heuristic algorithm of optimization of cluster architectures for such problems are proposed. __________ Translated from Kibernetika i Sistemnyi Analiz, No. 4, pp. 117–129, July–August 2006. 相似文献

16.

Structural testing of concurrent programs 总被引：1，自引：0，他引：1

Taylor R.N. Levine D.L. Kelly C.D. 《IEEE transactions on pattern analysis and machine intelligence》1992,18(3):206-215

Although structural testing techniques are among the weakest available with regard to developing confidence in sequential programs, they are not without merit. The authors extend the notion of structural testing criteria to concurrent programs and propose a hierarchy of supporting structural testing techniques. Coverage criteria described include concurrency state coverage, state transition coverage and synchronization coverage. Requisite support tools include a static concurrency analyzer and either a program transformation system or a powerful run-time monitor. Also helpful is a controllable run-time scheduler. The techniques proposed are suitable for Ada or CSP-like languages. Best results are obtained for programs having only static naming of tasking objects 相似文献

17.

Pseudospectral methods on massively parallel computers

Richard B. Pelz 《Computer Methods in Applied Mechanics and Engineering》1990,80(1-3):493-503

Parallel algorithms are presented for the Fourier pseudospectral method and parts of the Chebyshev pseudospectral method. Performance of these schemes is reported as implemented on the NCUBE hypercube. The problem to which these methods are applied is the time integration of the incompressible Navier-Stokes equations. Despite serious communication requirements, the efficiencies are high; e.g., 92% for a 128³ mesh on 1024 processors. Benchmark timings rival those of optimized codes on supercomputers. 相似文献

18.

Tapetool: a software tool for importing image data from image acquisition computers to image processing computers

V R Mandava R J Maciunas J M Fitzpatrick 《Computer methods and programs in biomedicine》1990,33(4):213-219

Currently, the image distribution gap between image acquisition computers and image processing computers is bridged through magnetic tapes. The tape formats used by the manufacturers of the image acquisition computers are idiosyncratic and fairly complex. A general purpose window based software tool is herein described, which frees the clinical and research sectors of the responsibility of understanding and decoding these complex formats in order to import image databases from acquisition computers to image processing computers. This software tool provides a cornerstone for developing image processing software for diagnostic, therapeutic and surgical planning purposes. 相似文献

19.

A high performance engine for concurrent complex event processing

Bill Karakostas 《Concurrency and Computation》2014,26(2):491-499

This paper describes the architecture, prototype implementation and performance analysis of a complex event processing engine that can scale up to very large numbers of concurrent events while keeping the requirements on system resources predictable and low. The main innovation of this approach is that each instantiated event pattern is handled by a dedicated Erlang process, instead of a single or shared operating system thread. This in turn, reduces the latency in processing the event processing as it avoids the overheads associated with resource contention. We demonstrate how this approach can achieve linear event processing times under high event loads, using modest computing resources. Copyright © 2013 John Wiley & Sons, Ltd. 相似文献

20.

Two packages for meteorological data processing and runoff simulation for personal computers

《Environmental Software》1987,2(4):192-198

This paper presents two packages designed to solve some of the most common problems involved in the analysis of a river catchment on a personal computer. The main operations performed concern pre-processing of raw meteorological data (to obtain time series of rainfall, snowfall and daily mean temperatures at different elevations), and simulation of the daily runoff from the catchment. Both packages have been developed at the Laboratorio di Informatica Ambientale e Territoriale (LITA) of the Politecnico di Milano, Italy. 相似文献