1.

Efficient mapping of backpropagation algorithm onto a network of workstations

Sudhakar V. Siva Ram Murthy C. 《IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics》1998,28(6):841-848

In this paper, we present an efficient technique for mapping a backpropagation (BP) learning algorithm for multilayered neural networks onto a network of workstations (NOW). We adopt a vertical partitioning scheme, where each layer in the neural network is divided into p disjoint partitions, and map each partition onto an independent workstation in a network of p workstations. We present a fully distributed version of the BP algorithm and also its speedup analysis. We compare the performance of our algorithm with a recent work involving the vertical partitioning approach for mapping the BP algorithm onto a distributed memory multiprocessor. Our results on a NOW of SUN 3/50 machines show that we are able to achieve better speedups by using only two communication sets and also by avoiding some redundancy in the weights computation for one training cycle of the algorithm.
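The vertical partitioning idea in this abstract can be illustrated with a minimal sketch. It covers only the forward pass of one layer and runs the p "workstations" sequentially in a single process; the function names (`split_neurons`, `partitioned_forward`) and the contiguous split are assumptions made for illustration, not the authors' algorithm or communication scheme.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward_layer(weights, inputs):
    """weights[j] is the list of weights feeding neuron j of the layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def split_neurons(weights, p):
    """Divide a layer's neurons into p disjoint, contiguous partitions."""
    n = len(weights)
    bounds = [(k * n) // p for k in range(p + 1)]
    return [weights[bounds[k]:bounds[k + 1]] for k in range(p)]

def partitioned_forward(weights, inputs, p):
    # Each (hypothetical) workstation computes only the neurons it owns.
    partial = [forward_layer(part, inputs) for part in split_neurons(weights, p)]
    # The slices are then exchanged so every node holds the full activation vector.
    return [a for part in partial for a in part]

random.seed(0)
W = [[random.uniform(-1, 1) for _ in range(5)] for _ in range(8)]  # 8 neurons, 5 inputs
x = [random.uniform(-1, 1) for _ in range(5)]
full = forward_layer(W, x)
parts = partitioned_forward(W, x, p=3)
```

Because each partition evaluates exactly the same per-neuron sums, the partitioned result matches the unpartitioned one; in a real NOW the concatenation step would be a network exchange, which is where the communication cost analysed in the paper arises.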

2.

Queueing models for a single server LAN with processor sharing disciplines

B. M. Rao S. Ramakrishnan 《Computing》1996,57(3):225-244

A local area network (LAN) is a collection of autonomous workstations interconnected by a communication network. A key component of a local network is the file server, which stores programs and data and makes them available to the workstations as needed. In practice, a workstation requests a large portion of the file (type 1 request) when a new application is launched. Following this, the workstation requests additional portions of the file as needed (type 2 requests). Clearly, the response time to these requests will depend strongly on the file server scheduling policy used to service the two types of incoming requests. In an earlier paper [12] we studied this system when type 2 requests have preemptive and non-preemptive priority over type 1 requests. From a practical point of view, it is more realistic to describe the file transfer by a round-robin scheduling discipline. In this paper we study LANs where the file server follows the processor sharing discipline, which is a limiting case of the round-robin discipline. Assuming that all relevant intervals are exponentially distributed, we develop algorithms to analyze the performance of the system. Illustrative examples are presented to study the system behavior under various load conditions. The computational approach presented in this paper helps in resolving some of the analytical difficulties associated with the analysis of processor sharing disciplines.
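The statement that processor sharing is a limiting case of round-robin can be checked numerically with a small deterministic sketch. It assumes all jobs are present at time 0 with known sizes, which is a simplification and not the paper's model (the paper assumes exponentially distributed intervals); the function names are hypothetical.

```python
def processor_sharing_finish_times(sizes):
    """Finish times when all jobs start at t=0 and share the server equally."""
    order = sorted(range(len(sizes)), key=lambda i: sizes[i])
    finish = [0.0] * len(sizes)
    t, prev, remaining = 0.0, 0.0, len(sizes)
    for i in order:
        # While `remaining` jobs are active, each receives 1/remaining of the server.
        t += remaining * (sizes[i] - prev)
        finish[i] = t
        prev = sizes[i]
        remaining -= 1
    return finish

def round_robin_finish_times(sizes, quantum):
    """Finish times under round-robin with a fixed time quantum."""
    left = list(sizes)
    finish = [0.0] * len(sizes)
    active = list(range(len(sizes)))
    t = 0.0
    while active:
        still_active = []
        for i in active:
            run = min(quantum, left[i])
            t += run
            left[i] -= run
            if left[i] > 1e-12:
                still_active.append(i)
            else:
                finish[i] = t
        active = still_active
    return finish

sizes = [1.0, 2.0, 4.0]
ps = processor_sharing_finish_times(sizes)
rr_coarse = round_robin_finish_times(sizes, quantum=1.0)
rr_fine = round_robin_finish_times(sizes, quantum=1.0 / 128)
```

With a coarse quantum the round-robin finish times differ visibly from the processor sharing ones; shrinking the quantum brings every finish time to within a few quanta of the processor sharing value.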

3.

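Distributed parallel finite element mesh generation on a local area network  Total citations: 2, self-citations: 0, citations by others: 2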

李水乡 王云鹏 陈永强 《计算机工程与设计》2005,26(12):3165-3166,3193

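In a common PC + Windows + LAN environment, distributed parallel finite element mesh generation over a local area network is implemented using the Winsock API network communication interface. On the server, the meshing domain is decomposed into a number of subdomains according to the number of workstations, and these subdomains, together with the mesh control parameters, are sent to the workstations over the LAN. Each subdomain is meshed on a workstation into a sub-mesh, which is sent back over the LAN to the server and merged to form the final mesh. Numerical examples show that, given enough compute nodes, the distributed parallel technique can substantially speed up mesh generation, while the proportion of time spent on network communication remains essentially constant.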
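The decompose, mesh, and merge pipeline described in this abstract can be sketched in one dimension. This is only an illustration under strong assumptions: the "workstations" are plain function calls rather than Winsock peers, the domain is an interval meshed with uniform spacing `h`, and all function names are hypothetical.

```python
def decompose(n_elements, n_workers):
    """Server side: split the element index range into one subdomain per worker."""
    bounds = [(k * n_elements) // n_workers for k in range(n_workers + 1)]
    return [(bounds[k], bounds[k + 1]) for k in range(n_workers)]

def mesh_subdomain(first, last, a, h):
    """Worker side: node coordinates of elements first..last-1, end nodes included."""
    return [a + i * h for i in range(first, last + 1)]

def merge(submeshes):
    """Server side: join sub-meshes, keeping each shared interface node once."""
    merged = list(submeshes[0])
    for sub in submeshes[1:]:
        merged.extend(sub[1:])  # the first node duplicates the previous sub-mesh's last node
    return merged

a, h, n_elements = 0.0, 0.25, 10          # mesh control parameters (assumed)
subdomains = decompose(n_elements, n_workers=3)
submeshes = [mesh_subdomain(lo, hi, a, h) for lo, hi in subdomains]
final_mesh = merge(submeshes)
```

In the setting the abstract describes, the subdomains and control parameters would travel to the workstations over the LAN and the sub-meshes would travel back, so only the two list comprehensions above correspond to network traffic.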

4.

Project Athena as a distributed computer system

Champine G.A. Geer D.E. Jr. Ruh W.N. 《Computer》1990,23(9):40-51

Project Athena, established in 1983 to improve the quality of education at MIT (Massachusetts Institute of Technology) by providing campuswide, high-quality computing based on a large network of workstations, is discussed, focusing on the design of Athena's distributed workstation system. The requirements of the system are outlined, distributed system models are reviewed, other distributed operating systems are described, and issues in distributed systems are examined. The distributed-system model for Athena is discussed. Athena has three major components: workstations, a network, and servers. The approach taken by the Athena developers was to implement a set of network services to replace equivalent time-sharing services, in essence converting the time-sharing Unix model into a distributed operating system.

5.

Issues and experiences in implementing a distributed tuplespace

James B. Fenwick Lori L. Pollock 《Software》1997,27(10):1199-1232

6.

A survey of recent advances in SAT-based formal verification  Total citations: 2, self-citations: 0, citations by others: 2

Mukul R. Prasad Armin Biere Aarti Gupta 《International Journal on Software Tools for Technology Transfer (STTT)》2005,7(2):156-173

Dramatic improvements in SAT solver technology over the last decade and the growing need for more efficient and scalable verification solutions have fueled research in verification methods based on SAT solvers. This paper presents a survey of the latest developments in SAT-based formal verification, including incomplete methods such as bounded model checking and complete methods for model checking. We focus on how the surveyed techniques formulate the verification problem as a SAT problem and how they exploit crucial aspects of a SAT solver, such as application-specific heuristics and conflict-driven learning. Finally, we summarize the noteworthy achievements in this area so far and note the major challenges in making this technology more pervasive in industrial design verification flows.

7.

Interpolant Learning and Reuse in SAT-Based Model Checking  Total citations: 1, self-citations: 0, citations by others: 1

Joao Marques-Silva 《Electronic Notes in Theoretical Computer Science》2007,174(3):31

Bounded Model Checking (BMC) is one of the most paradigmatic practical applications of Boolean Satisfiability (SAT). The use of SAT in model checking has allowed significant performance gains and, as a consequence, a large number of commercial verification tools now include SAT-based model checkers. Recent work has provided SAT-based BMC with completeness conditions, and this is generally referred to as unbounded model checking (UMC). Among the existing approaches for SAT-based UMC, the use of interpolants is one of the most effective. Despite their success, interpolants have only been used for identifying a fixed point of the set of reachable states. This paper extends the use of interpolants in SAT-based model checking. This is achieved by observing that, under reasonable assumptions, interpolants can be reused, i.e. computed interpolants can be reused at later stages of the model checking process. The paper develops conditions for validity of interpolant reuse. In addition, the paper outlines a new fixed point condition, alternative to the existing interpolant-based fixed point condition. Preliminary practical experience on interpolant learning and reuse is reported.

8.

Lightweight axiom pinpointing via replicated driver and customized SAT-solving

Dantong OUYANG Mengting LIAO Yuxin YE 《Frontiers of Computer Science》2023,17(2):172315

9.

Modeling and distributed simulation of a broadband-ISDN network

Chai A. Ghosh S. 《Computer》1993,26(9):37-51

A distributed approach to communication network simulation using a network of workstations configured as a loosely coupled parallel processor to model and simulate the broadband integrated services digital network (B-ISDN) is proposed. In a loosely coupled parallel processor system, a number of concurrently executable processors communicate asynchronously using explicit messages over high-speed links. Since this architecture is similar to that of B-ISDN networks, it constitutes a realistic testbed for their modeling and simulation. The authors describe an implementation of this approach on 50 Sun workstations at Brown University. Performance results, based on representative B-ISDN networks and realistic traffic models, indicate that the distributed approach is efficient and accurate.

10.

Device sharing in the ZGL distributed operating system

薛行 孙钟秀 《计算机学报》1991,14(2):100-105

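This paper describes the device sharing system in ZGL, a heterogeneous distributed operating system designed and implemented at Nanjing University. In ZGL, some processors and peripheral devices are designated as dedicated servers. In addition, any workstation can temporarily make itself a compute server while it is idle. Users can access device servers at both the command level and the program level. In both cases, the system automatically selects a server from the pool of available servers to carry out the specified task. A workstation can act as a remote control terminal to interact with tasks executing on a compute server.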

11.

Accelerating Bounded Model Checking of Safety Properties  Total citations: 4, self-citations: 0, citations by others: 4

Ofer Strichman 《Formal Methods in System Design》2004,24(1):5-24

Bounded Model Checking based on SAT methods has recently been introduced as a complementary technique to BDD-based Symbolic Model Checking. The basic idea is to search for a counterexample in executions whose length is bounded by some integer k. The BMC problem can be efficiently reduced to a propositional satisfiability problem, and can therefore be solved by SAT methods rather than BDDs. SAT procedures are based on general-purpose heuristics that are designed for any propositional formula. We show how the unique characteristics of BMC invariant formulas (G p) can be exploited for a variety of optimizations in the SAT checking procedure. Experiments with these optimizations on real designs prove their efficiency in many of the hard test cases, in comparison to both the standard SAT procedure and a BDD-based model checker.
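The basic idea of searching for a counterexample in executions of length at most k can be sketched as follows. Note the substitution: brute-force enumeration of paths stands in for the SAT solver here, so this shows the shape of the unrolled formula for an invariant G p, not the SAT encoding or the optimizations the paper proposes. The function name `bmc` and the 2-bit counter example are assumptions for illustration.

```python
from itertools import product

def bmc(states, init, trans, prop, k):
    """Return a path of length <= k that violates the invariant G prop, or None."""
    for bound in range(k + 1):
        # Unrolled condition for this bound:
        #   init(s0) and trans(s0, s1) and ... and trans(s_{b-1}, s_b) and not prop(s_b)
        for path in product(states, repeat=bound + 1):
            if (init(path[0])
                    and all(trans(a, b) for a, b in zip(path, path[1:]))
                    and not prop(path[-1])):
                return list(path)
    return None

# Hypothetical system: a 2-bit counter that starts at 0 and wraps around.
states = range(4)
init = lambda s: s == 0
trans = lambda a, b: b == (a + 1) % 4
prop = lambda s: s != 3          # invariant to check: the counter never reaches 3

short_search = bmc(states, init, trans, prop, k=2)
long_search = bmc(states, init, trans, prop, k=3)
```

With k = 2 no violation is reachable, which says nothing about longer executions; with k = 3 the search returns the violating path. A SAT-based checker would hand the same unrolled condition to a solver as a propositional formula instead of enumerating paths.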

12.

Model partitioning and the performance of distributed timewarp simulation of logic circuits

《Simulation Practice and Theory》1997,5(1):83-99

Simulation of complex digital electronic systems requires powerful machines and algorithms. Distributed simulation could improve both the execution time and the availability of a large distributed memory for complex models. Model partitioning onto the available processors has a major impact on simulation efficiency. We report on how various partitioning algorithms affect timewarp-based distributed simulation of combinational and synchronous sequential logic circuits, and try to determine the relationship between circuit parameters (the number of gates, topological levels and the degree of activity in the circuit) and the structure of the partition having the fastest simulation on a heterogeneous network of Sun workstations.

13.

Accelerating incompressible flow computations with a Pthreads-CUDA implementation on small-footprint multi-GPU platforms

Julien C. Thibault Inanc Senocak 《The Journal of supercomputing》2012,59(2):693-719

Graphics processing units (GPUs), originally designed for graphics rendering, have emerged as massively-parallel “co-processors” to the central processing unit (CPU). Small-footprint multi-GPU workstations with hundreds of processing elements can accelerate compute-intensive simulation science applications substantially. In this study, we describe the implementation of an incompressible flow Navier–Stokes solver for multi-GPU workstation platforms. A shared-memory parallel code with identical numerical methods is also developed for multi-core CPUs to provide a fair comparison between CPUs and GPUs. Specifically, we adopt NVIDIA’s Compute Unified Device Architecture (CUDA) programming model to implement the discretized form of the governing equations on a single GPU. Pthreads are then used to enable communication across multiple GPUs on a workstation. We use separate CUDA kernels to implement the projection algorithm to solve the incompressible fluid flow equations. Kernels are implemented on different memory spaces on the GPU depending on their arithmetic intensity. The memory-hierarchy-specific implementation produces significantly faster performance. We present a systematic analysis of speedup and scaling using two generations of NVIDIA GPU architectures and provide a comparison of single and double precision computational performance on the GPU. Using a quad-GPU platform for single precision computations, we observe two orders of magnitude speedup relative to a serial CPU implementation. Our results demonstrate that multi-GPU workstations can serve as a cost-effective small-footprint parallel computing platform to accelerate computational fluid dynamics (CFD) simulations substantially.

14.

A SAT-based method for finding errors in arithmetic circuits

陈云霁 张健 沈海华 胡伟武 《计算机学报》2007,30(12):2082-2089

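The SAT-based error-finding method for arithmetic circuits converts the question of whether the system specification holds in the system under verification into E-CNF, a hybrid form of Boolean formulas and arithmetic formulas, which is then solved by an E-SAT solver that employs a flag-clause technique. Experiments show that the method is highly automated, can handle large-scale arithmetic circuits, and has strong error-finding capability.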

15.

Adaptive and Reliable Paging to Remote Main Memory

George Dramitinos Evangelos P. Markatos 《Journal of Parallel and Distributed Computing》1999,58(3):505

Workstation clusters provide significant aggregate amounts of resources, including processing power and main memory. In this paper we explore the collective use of main memory in a workstation cluster to boost the performance of applications that require more memory than a single workstation can provide. We describe the design, simulation, implementation, and evaluation of a pager that uses main memory of remote workstations in a workstation cluster as a faster-than-disk paging device and provides reliability in case of single workstation failures and adaptivity to network and disk load variations. Our pager has been implemented as a block device driver linked to the Digital UNIX operating system, without any modifications to the kernel code. Using several test applications we measure the performance of remote memory paging over an Ethernet interconnection network and find it to be up to twice as fast as traditional disk paging. We also evaluate the performance of various reliability policies and demonstrate their feasibility even over low-bandwidth networks such as Ethernet. We conclude that the benefits of reliable remote memory paging in workstation clusters are significant today and are likely to increase in the near future.

16.

Limitations of Cycle Stealing for Parallel Processing on a Network of Homogeneous Workstations

Scott T. Leutenegger Xian-He Sun 《Journal of Parallel and Distributed Computing》1997,43(2):733

The low cost and availability of clusters of workstations have led researchers to re-explore distributed computing using independent workstations. This approach may provide better cost/performance than tightly coupled multiprocessors. In practice, this approach often utilizes wasted cycles to run parallel jobs. In this paper we address the feasibility and limitation of such a nondedicated parallel processing environment assuming workstation processes have priority over parallel tasks. We develop a simple analytical model to predict parallel job response times. Our model provides insight into how significantly workstation owner interference degrades parallel program performance. It forms a foundation for task partitioning and scheduling in a nondedicated network environment. A new term, task ratio, which relates the parallel task demand to the mean service demand of nonparallel workstation processes, is introduced. We propose that task ratio is a useful metric for determining how a parallel application should be partitioned and scheduled in order to make efficient use of a nondedicated distributed system.

17.

A template-based approach to the generation of distributed applications using a network of workstations

Singh A. Schaeffer J. Green M. 《Parallel and Distributed Systems, IEEE Transactions on》1991,2(1):52-67

A computational model and system for the generation of distributed applications in a workstation environment are presented. The well-known RPC model is modified by a novel concept known as template attachment. A computation consists of a network of sequential procedures which have been encapsulated in templates. A small selection of templates is available from which a distributed application with the desired communication behavior can be rapidly built. The system generates all the required low-level code for correct synchronization, communication, and scheduling. This results in a system that is easy to use and flexible and can provide a programmer with the desired amount of control in using idle processing power over a network of workstations. The practical feasibility of the model has been demonstrated by implementing it for Unix-based workstation environments.

18.

On the performance of distributed objects

《Journal of Systems Architecture》2000,46(5):411-428

Early distributed shared memory systems used the shared virtual memory approach with fixed-size pages, usually 1–8 KB. As this does not match the variable granularity of sharing of most programs, recently the emphasis has shifted to distributed object-oriented systems. With small object sizes, the overhead of inter-process communication could be large enough to make a distributed program too inefficient for practical use. To support research in this area, we have implemented a user-level distributed programming testbed, DIPC, that provides shared memory, semaphores and barriers. We develop a computationally-efficient model of distributed shared memory using approximate queueing network techniques. The model can accommodate several algorithms including central server, migration and read-replication. These models have been carefully validated against measurements on our distributed shared memory testbed. Results indicate that for large granularities of sharing and small access bursts, central server performs better than both migration and read-replication algorithms. Read-replication performs better than migration for small and moderate object sizes for applications with high degree of read-sharing and migration performs better than read-replication for large object sizes for applications having moderate degree of read-sharing.

19.

The performance of the Amoeba distributed operating system

Robbert Van Renesse Hans Van Staveren Andrew S. Tanenbaum 《Software》1989,19(3):223-234

Amoeba is a capability-based distributed operating system designed for high-performance interactions between clients and servers using the well-known RPC model. The paper starts out by describing the architecture of the Amoeba system, which is typified by specialized components such as workstations, several services, a processor pool, and gateways that connect other Amoeba systems transparently over wide-area networks. Next the RPC interface is described. The paper presents performance measurements of the Amoeba RPC on unloaded and loaded systems. The time to perform the simplest RPC between two user processes has been measured to be 1-4 ms. Compared to SUN 3/50's RPC, Amoeba has one ninth of the delay, and over three times the throughput. Finally we describe the Amoeba file server. The Amoeba file server is so fast that it is limited by the communication bandwidth. To the best of our knowledge this is the fastest file server yet reported in the literature for this class of hardware.

20.

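Research on a GIS-based distributed real-time collaborative mapping system  Total citations: 4, self-citations: 1, citations by others: 3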

周树语 许小艳 刘然 《计算机工程与设计》2005,26(1):55-57,60

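Owing to Internet bandwidth limitations and many other factors, a GIS-based distributed real-time collaborative mapping system is difficult to implement. A GIS-based distributed real-time collaborative mapping system is built using a replicated model, with a dynamic data format used for the data exchanged between sites. This greatly reduces the volume of data transmitted over the network, improves the system's response speed and stability, and meets the real-time requirements of collaborative mapping.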