首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 437 毫秒
1.
A critical component of any large-scale parallel processing system is the interconnection network that provides a means for communication along the system's processors and memories. Attributes of the multistage cube topology that have made it an effective basis for interconnection networks and the subject of much ongoing research are reviewed. These attributes include O(N log2N) cost for an N-input/output network, decentralized control, a variety of implementation options, good data-permuting capability to support single-instruction-stream/multiple-data-stream (SIMD) parallelism, good throughput to support multiple-instruction-stream/multiple-data-stream (MIMD) parallelism, and ability to be partitioned into independent subnetworks to support reconfigurable systems. Examples of existing systems that use multistage cube networks are considered. The multistage cube topology can be converted into a single-stage network by associating with each switch in the network a processor (and a memory). Properties of systems that use the multistage cube network in this way are examined  相似文献   

2.
The authors consider linear lightwave networks with a single waveband that have N inputs, each with a transmitter, and N outputs, each with a receiver, interconnected by optical links, broadcast stars, and wavelength-independent 2×2 switches. The transmitters and receivers can tune to C different wavelengths. The authors describe a rearrangeably nonblocking network that is a modification of the Benes network and uses transmitters that are fixed tuned and switches with two states. The network uses [1+o(1)] N/log2(N/C) switches, which is shown to be nearly the minimum number. It is also shown that, if C =o(log N), then a rearrangeably nonblocking network requires [1+o(1)]Nlog2N switches even if the switches have more than two states  相似文献   

3.
The performance of nonblocking packet switches such as the knockout switch and Batcher banyan switch for high-speed communication networks can be improved as the switching capacity L per output increases; the switching capacity per output refers to the maximum number of packets transferred to an output during a slot. The N×N switch with L=N was shown to attain the best possible performance by M.J. Karol et al. (1987). Here a N×N nonblocking packet switch with input and output buffers is analyzed for an arbitrary number of L such that 1⩽LN. The maximum throughput and packet loss probability at input are obtained when N=∞  相似文献   

4.
It is shown that a frame of N time slots can be arbitrarily permuted with 2log2N-1 controlled exchange switches with associated delay elements. This is an improvement over previously known interconnection networks that require O( N) exchange elements. The proof utilizes the recursive algorithm of V.E. Benes (1965) and the time interchange properties of a particular configuration of a single exchange element. The architecture is especially applicable in optical systems, since optical exchange switches are among the simplest optical logic devices to build, are inherently very fast, and are the best developed, although expensive  相似文献   

5.
A low-pass and a bandpass additive white Gaussian noise channel with a peak-power constraint imposed on otherwise arbitrary input signals are considered. Upper bounds on the capacity of such channels are derived. They are strictly less than the capacity of the channel when the peak-power constrain is removed and replaced by the average-power constraint, for which the Gaussian inputs are optimum. This provides the answer to an often-posed question: peak-power limiting in the case of bandlimited channels does reduce capacity, whereas in infinite bandwidth channels it does not, as is well known. For an ideal low-pass filter of bandwidth B, the upper bound is Blog 0.934P/(N0B) for P/( N0B)≫1, where P is the peak power of the input signal and N0/2 is the double-sided power spectral density of the additive white Gaussian noise  相似文献   

6.
A novel single-mode-single-slip-structure S3 optical switch using carrier-induced refractive index change is proposed as a unit cell for a small polarization-independent nonblocking N×N optical switch array. Sixteen S3 optical switches have been integrated into a nonblocking 4×4 optical switch array on an InP substrate. The 8-mm-length InGaAsP/InP 4×4 optical array has shown satisfactory switching characteristics and is suitable for larger scale integration of optical switch arrays and also for integration with other active optical devices such as laser diodes  相似文献   

7.
Asynchronous transfer mode (ATM) is the transport technique for the broadband ISDN recommended by CCITT (I.121). Many switches have been proposed to accommodate the ATM that requires fast packet switching capability.1-8 The proposed switches for the broadband ISDN can be classified as being of input queueing or output queueing type. Those of the input queueing type have a throughput performance which is approximately 58 per cent that of the output queueing type. However, output queueing networks require larger amounts of hardware than input queueing networks. In this paper, we propose a new multistage switch with internal buffering that approaches a maximum throughput of 100 per cent as the buffering is increased. The switch is capable of broadcasting and self-routeing. It consists of two switching planes which consist of packet processors, 2 x 2 switching elements, distributors and buffers located between stages and in the output ports. The internal data rate of the proposed switch is the same as that of the arriving information stream. In this sense, the switch does not require speed-up. The switch has log2 N stages that forward packets in a store-and-forward fashion, thus incurring a latency of log2 N time periods. Performance analysis shows that the additional delay is small.  相似文献   

8.
A routing architecture applying the concept of multichannel transmission groups (MCTGs) for ATM systems is proposed. A queuing analysis of an internally nonblocking ATM switch employing this MCTG concept with partially shared output buffers is presented. The analysis is based on the discrete-time DA///D/c /B queuing model. Both bulk input traffic bulk-size distribution (A) and deterministic traffic (D1 +. . .+DN) are considered. The impact of switch speedup on the performance is also taken into account. It is shown that the MCTG architecture yields better performance in terms of delay and cell loss probability than its single channel counterpart. It is also found that the switch speedup required to closely approximate the optimal performance obtained by having the switch fabric run N times as fast as the input and output channels, where N is the size of the switch, is rather small compared to N. This makes the practical realization of the proposed switch architecture feasible  相似文献   

9.
Queueing in high-performance packet switching   总被引:14,自引:0,他引:14  
The authors study the performance of four different approaches for providing the queuing necessary to smooth fluctuations in packet arrivals to a high-performance packet switch. They are (1) input queuing, where a separate buffer is provided at each input to the switch; (2) input smoothing, where a frame of b packets is stored at each of the input line to the switch and simultaneously launched into a switch fabric of size Nb×Nb; (3) output queuing, where packets are queued in a separate first-in first-out (FIFO) buffer located at each output of the switch; and (4) completely shared buffering, where all queuing is done at the outputs and all buffers are completely shared among all the output lines. Input queues saturate at an offered load that depends on the service policy and the number of inputs N, but is approximately 0.586 with FIFO buffers when N is large. Output queuing and completely shared buffering both achieve the optimal throughput-delay performance for any packet switch. However, compared to output queuing, completely shared buffering requires less buffer memory at the expense of an increase in switch fabric size  相似文献   

10.
Many time-multiplex switching systems require that the incoming traffic be scheduled to avoid conflict at the switch output (two or more users converging simultaneously upon a single output). Optimal scheduling provides a means to assign traffic on demand such that either blocking probability is minimized (unbuffered system) or packet waiting time is minimized (buffered system). However, computation of an optimal schedule for switches of a reasonable size (i.e. N=100) may require many seconds or even minutes, whereas the traffic demand may vary much more rapidly. Since the computation time varies as O(N2), the problem becomes readily intractable for large N. This computational bottleneck is overcome by using a scheduling algorithm which is run on a simple special-purpose parallel computer (cellular automaton). A schedule is produced in O(N) time if signal propagation time in the automaton is considered negligible, and therefore increases in computation speed by several orders of magnitude should be possible; the time to compute a schedule for a 1000-input switch would be measured in milliseconds rather than minutes  相似文献   

11.
A fast block-matching algorithm for motion estimation is presented. It is based on a logarithmic step where, in each search step, only four locations are tested. For a motion displacement of w pels/frame, this technique requires 5+4 log2w computations to locate the best match. Using sequences of CIF standard pictures, the interframe motion compensated prediction error with this technique is compared to the other fast methods. The computational complexity of this algorithm is also compared against those methods  相似文献   

12.
One-to-many connection (i.e., multicast) is an important communication primitive used in parallel processing and high-speed switching in order to simultaneously send data from an input to more than one output. We prove that for even (respectively, odd) n, a multi-log2N network is strictly nonblocking for a one-to-many connection traffic if it is designed by vertically stacking at least (δn)/4+1((δ/2)(n-1)+1) planes of a log2N network together, where N=2n, δ=2[n/2], and [x] denotes the greatest integer less than or equal to x. We thus give answer to the open problem and introduce yet another strictly nonblocking multicast network. The characterized network has self-routing capability, regular topology, O(2log2N+2log2(log2N)) stages, and fewer crosspoints than the Clos network for N⩾512. We then extend multi log2N multicast networks to the fanout restricted nonblocking networks. It turns out that the multi-log2N network nonblocking in a strict-sense for a one-to-one connection traffic is also wide-sense nonblocking for a multicast traffic in which the fanout of any connection does not exceed δ, provided that for even (respectively, odd) n, the fanout capability of each log2N network is restricted to stage (n/2)(((n-1)/2)+1) through n-1  相似文献   

13.
A multibus train (ordered demand assignment) communication architecture, using the AMTRAC protocol (for efficient utilization of fiber-optic-based very-high-speed networks) is presented. Taking advantage of the emerging WDM (wavelength-division multiplexing) and FDM (frequency-division multiplexing) technologies, the proposed solution introduces a coordinated multichannel control combining the performance advantages of two known approaches for high-speed communication: multichannel and train protocols. As a result an AMTRAC-based high-speed network achieves channel utilization significantly higher than previous approaches. For a network consisting of N stations, with propagation delay to packet transmission time ratio given by a, the AMTRAC architecture reaches a capacity of 1/(1+a/N 2)  相似文献   

14.
Lee  H.C. Kyung  C.M. 《Electronics letters》1996,32(25):2301-2302
A highly regular switching network consisting of several switching stages for output buffering is proposed. Each switching element performs 3×3 switching and has a tail-spared buffer for each input port. According to the performance evaluation of the proposed switching network based on computer simulation, a packet loss ratio of 10-8 was obtained for a 1024×1024 switching network consisting of 15 stages with the Bernoulli traffic source when the size of tail-spared buffer is 8 and the input traffic load is 0.9  相似文献   

15.
In the state-parallel implementation of the Viterbi algorithm, one add-compare-select (ACS) unit is devoted to each state in the treillis. A systematic approach to partitioning, scheduling, and mapping N trellis states to P ACSs, where N>P , is presented here. The area saving of this architecture comes from the reduced number of ACSs and interconnection wires. The design of the ACS, path metric storage, and routing network is discussed in detail. The proposed architecture creates internal parallelism due to the ACS sharing, which can be exploited to increase the throughput rate by pipelining. Consequently, the architecture offers a favorable (smaller) area-time product, compared to the state-parallel implementation  相似文献   

16.
A procedure for synthesizing multilayered radar absorbing coatings is presented. Given a predefined set of Nm available materials with frequency-dependent permittivities ∈i(f) and permeabilities μi(f ) (i=1,. . ., Nm), the technique determines simultaneously the optimal material choice for each layer and its thickness. This optimal choice results in a screen which maximally absorbs TM and TE incident plane waves for a prescribed range of frequencies {f1,f2,. . ., f Nf} and incident angles {&thetas;1, &thetas;2,. . .,&thetas;N&thetas;}. The synthesis technique is based on a genetic algorithm. The technique automatically places an upper bound on the total thickness of the coating, as well as the number of layers contained in it, which greatly simplifies manufacturing. In addition, the thickness or surface mass of the coating can be minimized simultaneously with the reflection coefficient. The algorithm was successfully applied to the synthesis of wideband absorbing coatings in the frequency ranges of 0.2-2 GHz and 2-8 GHz  相似文献   

17.
A packet having a crossbar architecture with M inputs and N outputs is considered. Following earlier work, the theoretical capacity of such a switch is found in terms of the throughput of a closed queueing network. A state-dependent server model that approximates the rate at which the switch is transferring packets as a function of the work backlog (the number of packets queued at the switch inputs) is then developed. In this way, the model reflects the fact that such a switch tends to function more efficiently as the work backlog increases. This model yields an accurate method for approximating the entire distribution of the work backlog, as well as mean queue lengths and waiting times  相似文献   

18.
To maintain consistency in a distributed database environment, the transactions must be executed atomically. The standard algorithm for ensuring an atomic execution is called the distributed commit protocol. The two-phase commit protocol and its variations, the well-known protocols used for this purpose, are characterized by successive rounds of message exchange, among all the sites of the database, at the time a transaction enters into a completion phase. The performance of these protocols is given by a complexity measure that depends on the communication structure of the protocol. Given N sites, the worst-case complexity of a commit protocol is O(N2). A communication structure called maximal binomial structure (MBS) is presented, for which the complexity of the protocol is O(N×log3 N). A lower bound for this complexity is also given, which is O(N×log2 N). Protocols using the MBS remain symmetric. A scheme for an arbitrary expansion of the MBS to allow communication among a large number of sites is proposed. For the expanded system, the protocol complexity is also shown to be O(N×log3 N ). These structures are shown to be superior to other known structures  相似文献   

19.
A self-pruning binary tree (SPBT) interconnection network architecture that tolerate faults in a wafer scale integration (WSI) environment is proposed. The goal of the SPBT network is to provide a reliable and a quickly reconfigured interconnection network architecture for linear WSI arrays. The proposed architecture uses a bottom-up approach to reconfigure a linear pipelined array on a potentially defective WSI array using a binary tree interconnection scheme. The binary tree is generated by successive formation of hierarchical modules. For N processing elements (PEs) on the wafer, reconfiguration time is O(log N). The propagation delay is bounded by Θ(log N) and is independent of the number of faulty PEs. Faults in the switching network as well as faulty processing elements are tolerated  相似文献   

20.
Hybrid automatic-repeat-request (ARQ) error control schemes make use of both error detection and error correction in order to achieve high throughput and low undetected error probabilities on two way channels. Two hybrid ARQ schemes, termed hybrid go-back-N (HGB- N) and hybrid selective-repeat (HSR), are proposed for point-to-multipoint communications over broadcast channels. Both schemes incorporate a concatenated code for error correction and error detection. The performance study of the hybrid schemes is based on a two-state Markov model of a burst noise channel. An analytic solution is derived for the throughput efficiency of the HSR scheme, while approximations and computer simulation are used to evaluate the throughput efficiency of the HGB-N scheme. It is shown that the schemes perform considerably better than the corresponding pure ARQ schemes in which a block code is used for error detection only, especially in environments with a large number of receivers and large channel roundtrip delays, such as satellite broadcast links  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号