首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.
《Parallel Computing》1997,23(10):1429-1460
Based on the Cayley graph framework for the generation and evaluation of multiprocessor interconnection topologies, an application dependent task mapping problem is addressed. We start with interconnection topologies considered attractive from a graph theoretical point of view. Given such a topology together with an application dependent task communication profile, the problem addressed in this paper is to find an optimal task-to-processor assignment. Such a mapping yields for the given interconnection topology and the given profile a minimal expected communication path length and therefore a minimal number of data transfer steps between the physical processing elements of a multiprocessor machine. At first, the Cayley graph approach is briefly outlined. We demonstrate the potential of Cayley graphs for a systematic generation framework aiming at node-symmetric interconnection topologies for a given number of processing elements each equipped with a constant number of communication channels. Cayley graphs found most attractive within a large set of generated graphs are compared to prominent interconnection topologies like, for example, hypercubes. Later on, it is shown that the problem of the optimal task-to-processor assignment, being a special case of the well-known quadratic assignment problem, is still NP-hard. Consequently, any practically relevant mapping algorithm can be expected to produce at best near-optimal solutions for reasonable problem sizes beyond approximately 10 nodes. We present two fundamentally different mapping approaches, namely a straightforward greedy mapping and a more sophisticated algorithm using simulated annealing techniques as known from artificial intelligence applications. For both approaches, we elaborate on their relative performance as well as, where feasible, on the question of suboptimality.  相似文献   

2.
The communication overhead is a major bottleneck for the execution of a process graph on a parallel computer system. In the case of two processors, the minimization of the communication can be modeled using the graph bisection problem. The spectral lower bound of λ2|V|/4 for the bisection width of a graph is widely known. The bisection width is equal to λ2|V|/4 iff all vertices are incident to λ2/2 cut edges in every optimal bisection.

We present a new method of obtaining tighter lower bounds on the bisection width. This method makes use of the level structure defined by the bisection. We define some global expansion properties and we show that the spectral lower bound increases with this global expansion. Under certain conditions we obtain a lower bound depending on λ2β|V| with . We also present examples of graphs for which our new bounds are tight up to a constant factor. As a by-product, we derive new lower bounds for the bisection widths of 3- and 4-regular Ramanujan graphs.  相似文献   


3.
We introduce and analyze a new interconnection topology, called the k-dimensional folded Petersen (FPk) network, which is constructed by iteratively applying the Cartesian product operation on the well-known Petersen graph. Since the number of nodes in FPk is restricted to a power of ten, for better scalability we propose a generalization, the folded Petersen cube network FPQn,k =Qn×FPk, which is a product of the n-dimensional binary hypercube (Qn) and FPk. The FPQn,k topology provides regularity, node- and edge-symmetry, optimal connectivity (and therefore maximal fault-tolerance), logarithmic diameter, modularity, and permits simple self-routing and broadcasting algorithms. With the same node-degree and connectivity, FPQ n,k has smaller diameter and accommodates more nodes than Q n+3k, and its packing density is higher compared to several other product networks. This paper also emphasizes the versatility of the folded Petersen cube networks as a multicomputer interconnection topology by providing embeddings of many computationally important structures such as rings, multi-dimensional meshes, hypercubes, complete binary trees, tree machines, meshes of trees, and pyramids. The dilation and edge-congestion of all such embeddings are at most two  相似文献   

4.
This paper addresses the problem of creating a fault-tolerant interconnection network for a parallel computer. Three topologies, namely, the base-2 de Bruijn graph, the base-m de Bruijn graph, and the shuffle-exchange, are studied. For each topology an N+k node fault-tolerant graph is defined. These fault-tolerant graphs have the property that given any set of k node faults, the remaining N nodes contain the desired topology as a subgraph. All of the constructions given are the best known in terms of the degree of the fault-tolerant graph. We also investigate the use of buses to reduce the degrees of the fault-tolerant graphs still further  相似文献   

5.
SKY: efficient peer-to-peer networks based on distributed Kautz graphs   总被引:1,自引:0,他引:1  
Many proposed P2P networks are based on traditional interconnection topologies. Given a static topology, the maintenance mechanism for node join/departure is critical to designing an efficient P2P network. Kautz graphs have many good properties such as constant degree, low congestion and optimal diameter. Due to the complexity in topology maintenance, however, to date there have been no effective P2P networks that are proposed based on Kautz graphs with base > 2. To address this problem, this paper presents the “distributed Kautz (D-Kautz) graphs”, which adapt Kautz graphs to the characteristics of P2P networks. Using the D-Kautz graphs we further propose SKY, the first effective P2P network based on Kautz graphs with arbitrary base. The effectiveness of SKY is demonstrated through analysis and simulations. Supported partially by the National Natural Science Foundation of China (Grant Nos. 60673167 and 60703072), the Hunan Provincial Natural Science Foundation of China (Grant No. 08JJ3125), and the National Basic Research Program of China (973) (Grant No. 2005CB321801)  相似文献   

6.
The recently introduced interconnection networks, the Mobius cubes, are hypercube variants that have some better properties than hypercubes. The n-dimensional Mobius cube Mn is a regular graph with 2n nodes and n2n-1 edges. The diameter of Mn is about one half that of the n-dimensional hypercube Q n and the average number of steps between nodes for Mn is about two-thirds of the average for Qn, and 1-Mn has dynamic performance superior to that of Qn. Of course, the symmetry of Mn is not superior to that of Qn, i.e., Qn is both node symmetric and edge symmetric , whereas Mn is, in general, neither node symmetric (n⩾4) nor edge symmetric (n⩾3). In this paper, we study the diagnosability of Mn. We use two diagnosis strategies, both based on the so-called PMC diagnostic model-the precise (one-step) diagnosis strategy proposed by Preparata et al. (1967) and the pessimistic diagnosis strategy proposed by Friedman (1975). We show that the diagnosability of Mn is the same as that of Qn , i.e., Mn is n-diagnosable under the precise diagnosis strategy and (2n-2)/(2n-2)-diagnosable under the pessimistic diagnosis strategy  相似文献   

7.
Extended Fibonacci Cubes   总被引:1,自引:0,他引:1  
The Fibonacci Cube is an interconnection network that possesses many desirable properties that are important in network design and application. The Fibonacci Cube can efficiently emulate many hypercube algorithms and uses fewer links than the comparable hypercube, while its size does not increase as fast as the hypercube. However, most Fibonacci Cubes (more than 2/3 of all) are not Hamiltonian. In this paper, we propose an Extended Fibonacci Cube (EFC1) with an even number of nodes. It is defined based on the same sequence F(i)=F(i-1)+F(i-2) as the regular Fibonacci sequence; however, its initial conditions are different. We show that the Extended Fibonacci Cube includes the Fibonacci Cube as a subgraph and maintains its sparsity property. In addition, it is Hamiltonian and is better in emulating other topologies. Specifically, the Extended Fibonacci Cube can embed binary trees more efficiently than the regular Fibonacci Cube and is almost as efficient as the hypercube, even though the Extended Fibonacci Cube is a much sparser network than the hypercube. We also propose a series of Extended Fibonacci Cubes with even number of nodes. Any Extended Fibonacci Cube (EFCk, with k⩾) in the series contains the node set of any other cube that precedes EFCk in the series. We show that any Extended Fibonacci Cube maintains virtually all the desirable properties of the Fibonacci Cube. The EFCks can be considered as flexible versions of incomplete hypercubes, which eliminates their restriction on the number of nodes, and, thus, makes it possible to construct parallel machines with arbitrary sizes  相似文献   

8.
A number of low degree and, thus, low complexity, Cayley-graph interconnection structures, such as honeycomb and diamond networks, are known to be derivable by systematic pruning of 2D or 3D tori. In this paper, we extend these known pruning schemes via a general algebraic construction based on commutative groups. We show that, under certain conditions, Cayley graphs based on the constructed groups are pruned networks when Cayley graphs of the original commutative groups are kD tori. Thus, our results offer a general mathematical framework for synthesizing and exploring pruned interconnection networks that offer lower node degrees and, thus, smaller VLSI layout and simpler physical packaging. Our constructions also lead to new insights, as well as new concrete results, for previously known interconnection schemes such as honeycomb and diamond networks  相似文献   

9.
We compare the complexities of two fundamental distributed coordination problems, distributed counting and distributed queuing, in a concurrent setting. In both distributed counting and queuing, processors in a distributed system issue operations which are organized into a total order. In counting, each participating processor receives the rank of its operation in the total order, where as in queuing, a processor receives the identity of its predecessor in the total order. Many coordination applications can be solved using either distributed counting or queuing, and it is useful to know which of counting or queuing is the easier problem.Our results show that concurrent counting is harder than concurrent queuing on a variety of processor interconnection topologies, including high and low diameter graphs. For all these topologies, we show that the concurrent delay complexity of a particular solution to queuing, the arrow protocol, is asymptotically smaller than a lower bound on the complexity of any solution to counting.  相似文献   

10.
Previous works on edge-fault tolerance with respect to hypercubes Qn are mainly focused on 1-edge fault and 2- or 3-edge fault with limited size of n. We give a construction scheme for 2-EFT(Qn ) graphs and 3-EFT(Qn) graphs, where n is arbitrarily large. In our constructions, approximately log n extra degree is added to the vertices of Qn for 2-edge-fault tolerance, and one more degree for 3-edge-fault tolerance  相似文献   

11.
Star networks were proposed recently as an attractive alternative to the well-known hypercube models for interconnection networks. Extensive research has been performed that shows that star networks are as versatile as hypercubes. This paper is an effort in the same direction. Based on the well-known paradigms, we study the one-to-many parallel routing problem on star networks and develop an improved routing algorithm that finds n-1 node-disjoint paths between one node and a set of other n-1 nodes in the n-star network. These parallel paths are proven of minimum length within a small additive constant, and the running time of our algorithm is bounded by O(n2). More specifically, given a node s and n-1 other nodes {t1, t2 , …, tn-1} in the n-star network, our algorithm constructs n-1 node-disjoint paths P1, P2, …, Pn-1, where Pi is a path from s to tj of length at most dist(s, tj)+6 and dist(s, t j) is the distance, i.e., the length of a shortest path, from s to tj, for i=1, 2, …, n-1.The best bound on the path length by previously known algorithms for the same problem is 5(n-2)≈10Δn/3, where Δn=max{dist(s, t)} is the diameter of the n-star network  相似文献   

12.
In this paper we propose a limit characterization of the behaviour of classes of graphs with respect to their number of spanning trees. Let {Gn} be a sequence of graphs G0,G1,G2,… that belong to a particular class. We consider graphs of the form KnGn that result from the complete graph Kn after removing a set of edges that span Gn. We study the spanning tree behaviour of the sequence {KnGn} when n→∞ and the number of edges of Gn scales according to n. More specifically, we define the spanning tree indicator ({Gn}), a quantity that characterizes the spanning tree behaviour of {KnGn}. We derive closed formulas for the spanning tree indicators for certain well-known classes of graphs. Finally, we demonstrate that the indicator can be used to compare the spanning tree behaviour of different classes of graphs (even when their members never happen to have the same number of edges).  相似文献   

13.
The interesting hydrogen sensing characteristics of two transistors with an Al0.24Ga0.76As (device A) and In0.49Ga0.51P (device B) Schottky layer are demonstrated and studied. Experimentally, device A shows a lower hydrogen detection limit of 4.3 ppm H2/air, a higher current variation of 7.79 mA and a shorter adsorption time of 10.95 s in a 9970 ppm H2/air at room temperature. On the other hand, device B exhibits more stable hydrogen-sensing characteristics at high temperatures. Even at a low concentration of 14 ppm H2/air the hydrogen sensing properties of device B can be obtained as the temperature increases from 30 to 160 °C. Because the Al0.24Ga0.76As and In0.49Ga0.51P materials are lattice-matched to the GaAs substrate, the studied devices can be integrated as sensor arrays to obtain superior hydrogen sensing characteristics including higher sensing signals, lower detection limit, shorter response time, and widespread detection and temperature regimes.  相似文献   

14.
Many parallel algorithms use hypercubes as the communication topology among their processes. When such algorithms are executed on hypercube multicomputers the communication cost is kept minimum since processes can be allocated to processors in such a way that only communication between neighbor processors is required. However, the scalability of hypercube multicomputers is constrained by the fact that the interconnection cost-per-node increases with the total number of nodes. From scalability point of view, meshes and toruses are more interesting classes of interconnection topologies. This paper focuses on the execution of algorithms with hypercube communication topology on multicomputers with mesh or torus interconnection topologies. The proposed approach is based on looking at different embeddings of hypercube graphs onto mesh or torus graphs. The paper concentrates on toruses since an already known embedding, which is called standard embedding, is optimal for meshes. In this paper, an embedding of hypercubes onto toruses of any given dimension is proposed. This novel embedding is called xor embedding. The paper presents a set of performance figures for both the standard and the xor embeddings and shows that the latter outperforms the former for any torus. In addition, it is proven that for a one-dimensional torus (a ring) the xor embedding is optimal in the sense that it minimizes the execution time of a class of parallel algorithms with hypercube topology. This class of algorithms is frequently found in real applications, such as FFT and some class of sorting algorithms  相似文献   

15.
The effect of CdO doping on microstructure, conductance and gas-sensing properties of SnO2-based sensors has been presented in this study. Precursor powders with Cd/Sn molar ratios ranging from 0 to 0.5 were prepared by chemical coprecipitation. X-ray diffraction (XRD) analysis indicates that the solid-state reaction in the CdO–SnO2 system occurs and -CdSnO3 with pervoskite structure is formed between 600 and 650°C. CdO doping suppresses SnO2 crystallite growth effectively which has been confirmed by means of XRD, transmission electron microscopy (TEM) and BET method. The 10 mol% Cd-doped SnO2-based sensor shows an excellent ethanol-sensing performance, such as high sensitivity (275 for 100 ppm C2H5OH), rapid response rate (12 s for 90% response time) and high selectivity over CO, H2 and i-C4H10. On the other hand, this sensor has good H2-sensing properties in the absence of ethanol vapor. The sensor operates at 300°C, the sensitivity to 1000 ppm H2 is up to 98, but only 16 and 7 for 1000 ppm CO and i-C4H10, respectively.  相似文献   

16.
A graph G was defined in [16] as P4-reducible, if no vertex in G belongs to more than one chordless path on four vertices or P4. A graph G is defined in [15] as P4-sparse if no set of five vertices induces more than one P4, in G. P4-sparse graphs generalize both P4-reducible and the well known class of p4-free graphs or cographs. In an extended abstract in [11] the first author introduced a method using the modular decomposition tree of a graph as the framework for the resolution of algorithmic problems. This method was applied to the study of P4-sparse and extended P4-sparse graphs.

In this paper, we begin by presenting the complete information about the method used in [11]. We propose a unique tree representation of P4-sparse and a unique tree representation of P4-reducible graphs leading to a simple linear recognition algorithm for both classes of graphs. In this way we simplify and unify the solutions for these problems, presented in [16–19]. The tree representation of an n-vertex P4-sparse or a P4-reducible graph is the key for obtaining O(n) time algorithms for the weighted version of classical optimization problems solved in [20]. These problems are NP-complete on general graphs.

Finally, by relaxing the restriction concerning the exclusion of the C5 cycles from P4-sparse and P4-reducible graphs, we introduce the class of the extended P4-sparse and the class of the extendedP4-reducible graphs. We then show that a minimal amount of additional work suffices for extending most of our algorithms to these new classes of graphs.  相似文献   


17.
Optimal broadcasting on the star graph   总被引:2,自引:0,他引:2  
The star graph has been show to be an attractive alternative to the widely used n-cube. Like the n-cube, the star graph possesses rich structure and symmetry as well as fault tolerant capabilities, but has a smaller diameter and degree. However, very few algorithms exists to show its potential as a multiprocessor interconnection network. Many fast and efficient parallel algorithms require broadcasting as a basic step. An optimal algorithm for one-to-all broadcasting in the star graph is proposed. The algorithm can broadcast a message to N processors in O(log2 N) time. The algorithm exploits the rich structure of the star graph and works by recursively partitioning the original star graph into smaller star graphs. In addition, an optimal all-to-all broadcasting algorithm is developed  相似文献   

18.
《Performance Evaluation》1999,35(1-2):49-74
Multicast network traffic is information with one source node, but many destination nodes. Rather than setting up individual connections between the source node and each destination node, or broadcasting the information to the entire network, multicasting efficiently exploits link capacity by allowing the source node to transmit a small number of copies of the information to mutually-exclusive groups of destination nodes. Multicasting is an important topic in the fields of networking (video and audio conferencing, video on demand, local-area network interconnection) and computer architecture (cache coherency, multiprocessor message passing). In this paper, we derive approximate expressions for the minimum cost (in terms of link utilization) of shortest-path multicast traffic in arbitrary tree networks. Our results provide a theoretical best-case scenario for link utilization of multicast distribution in tree topologies overlaid onto arbitrary graphs. In real networks such as the Internet MBONE, multicast distribution paths are often tree-like, but contain some cycles for purposes of fault tolerance. We find that even for richly-connected graphs such as the shufflenet and the hypercube, our expression provides a good prediction of the cost (in terms of link utilization) of multicast communication. Thus, this theoretical result has two applications: (1) a lower bound on the link capacity required for multicasting in random tree topologies, and (2) an approximation of the cost of multicasting in regular LAN and MAN topologies.  相似文献   

19.
20.
We consider the open problem of whether there exists finite L2 gain in each channel of a double integrator in feedback interconnection with a saturation nonlinearity. It has been shown by Liu-Chitour-Sontag (1996) that the L2 gain from input to the output of the first integrator is infinite. In this paper, by constructing a special excitation, we show that the L2 gain from input to the output of the second integrator is also infinite  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号