Similar literature
 20 similar documents found (search time: 15 ms)
1.
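For networks-on-chip (NoCs) built from router nodes with a virtual output queue structure, this paper proposes a method for customizing the buffer size of each virtual channel in a router node in order to improve the overall communication performance of the NoC. Under a limited on-chip buffer budget, it analyzes how the buffer size of each virtual input queue affects the average latency of data traversing the NoC, and on that basis proposes a buffer allocation method that assigns buffer resources to the communication bottlenecks of the NoC, improving communication performance without increasing buffer overhead. Finally, simulation verifies the feasibility of the optimized router design for improving NoC performance, and the results are compared with those of an NoC built from unoptimized routers.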

2.
This paper proposes and evaluates a shuffle-exchange topology as an efficient alternative to the popular mesh topology for Networks-on-Chip (NoCs). Although the proposed topology incurs a cost equal to that of the mesh topology, it (1) provides a lower network diameter, (2) offers better performance under uniform, hotspot, and matrix-transpose traffic patterns, and (3) consumes less energy for packet delivery. To speed up the evaluation of the proposed topology, the paper also proposes an analytical performance model that predicts the performance of NoCs. The model represents the channels of the NoC as a network of M/G/1 queues. In this way, the model accurately estimates the average message latency, a widely used indicator of network performance. Results obtained from the analytical model are in good agreement with those of simulations for a wide range of working conditions (e.g. various network sizes, different message lengths, and different traffic patterns). The proposed analytical model provides a minimum of 400% speed-up in the evaluation process of the proposed topology.
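
The abstract above models each NoC channel as an M/G/1 queue. As a minimal sketch of the kind of building block such a model relies on, and not the paper's actual model, the standard Pollaczek-Khinchine formula gives the mean queueing delay of a single M/G/1 queue. The function and parameter names are hypothetical and chosen only for illustration.

```python
def mg1_mean_wait(arrival_rate, mean_service, second_moment_service):
    """Mean time spent waiting in queue for an M/G/1 queue.

    Pollaczek-Khinchine: W_q = lambda * E[S^2] / (2 * (1 - rho)),
    where rho = lambda * E[S] is the utilization of the server (channel).
    """
    rho = arrival_rate * mean_service
    if not 0.0 <= rho < 1.0:
        raise ValueError("queue is unstable: utilization must be in [0, 1)")
    return arrival_rate * second_moment_service / (2.0 * (1.0 - rho))


def mg1_mean_latency(arrival_rate, mean_service, second_moment_service):
    """Mean total time in the system: queueing delay plus service time."""
    wait = mg1_mean_wait(arrival_rate, mean_service, second_moment_service)
    return wait + mean_service
```

For exponentially distributed service with mean 1 (so E[S^2] = 2) and an arrival rate of 0.5, the sketch yields a mean wait of 1.0, which matches the classical M/M/1 result. A full NoC model would combine many such queues along each routing path, which this sketch does not attempt.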

3.
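To address the low buffer utilization of typical NoC routers and the constraints on large buffer designs, this paper proposes a new router design that targets buffer contention in NoCs without adding buffers or virtual channels. In this router, when an input port is busy and resource contention occurs, blocked packets are assigned to other input ports that have idle buffer resources, which resolves the buffer contention and improves overall network performance. SystemC simulation results show that, compared with a baseline router, the proposed router achieves a higher network saturation rate and throughput under both hotspot and balanced traffic patterns, with the saturation rate improving by about 11.4% under the hotspot pattern. FPGA implementation results show that the router has a small area overhead and meets the needs of NoC applications well.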

4.
Network-on-chip (NoC) is a promising communication architecture for next-generation SoCs. The size of the buffers used in on-chip routers has a dominant impact on the silicon area and power consumption of the NoC. For an efficient NoC design, it is important to plan both the total buffer size and the buffer allocation of each router carefully. In this paper, we propose two buffer planning algorithms for application-specific NoC design. More precisely, given the traffic parameters and performance constraints of the target application, the proposed algorithms automatically determine the minimal buffer budget and assign the buffer depth for each input channel in the different routers. The experimental results show that the proposed algorithms can significantly reduce total buffer usage while guaranteeing the performance requirements. Supported by the National Natural Science Foundation of China (Grant No. 60803018)

5.
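An NoC performance analysis method based on an analytical router model   Total citations: 2 (self-citations: 1, citations by others: 1)
Establishing an efficient performance analysis method for networks-on-chip (NoCs) provides important guidance for system design analysis in the early stages of NoC development. Starting from the operating principle of the NoC router, this paper first analyzes the various blocking phenomena in packet transmission and builds a router model based on the M/G/1/N queueing system. It then proposes an NoC performance analysis algorithm and gives analytical expressions for parameters such as transmission latency and saturation throughput. Comparison with cycle-accurate simulation shows that the analysis error of the method is about 6.9%, while analysis efficiency improves by a factor of about 200. The method is suitable for guiding the mapping of applications onto NoC topologies, and it can effectively uncover network communication bottlenecks while obtaining the optimal mapping.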

6.
In this paper, we develop an analytical stochastic communication technique for inter- and intra-Network-on-Chip (NoC) communication. It not only separates computation from communication in Networks-in-Package (NiP) but also predicts communication performance. Moreover, it helps track lost data packets and their exact location during communication. Further, the proposed technique helps build the Closed Donor Controlled Based Compartmental Model, which in turn helps build a stochastic model of NoC and NiP. This model helps compute the transition probabilities, latency, and data flow from one IP to another in a NoC and among NoCs in a NiP. The simulation results show that the transient and steady-state responses of the transition probabilities give the state of data-flow latencies among the different IPs in a NoC and among the compartments of NoCs in a NiP. Furthermore, the proposed technique produces lower latency than existing topologies.

7.
The Spidergon Network-on-Chip (NoC) was proposed to address the demand for a fixed and optimized communication infrastructure for cost-effective multi-processor Systems-on-Chip (MPSoC) development. To deal with the increasing diversity in quality of service requirements of SoC applications, the performance of this architecture needs to be improved. Virtual channels have traditionally been employed to enhance the performance of interconnect networks. In this paper, we present analytical models to evaluate the message latency and network throughput in the Spidergon NoC and investigate the effect of employing virtual channels. Results obtained through simulation experiments show that the model exhibits a good degree of accuracy in predicting average message latency under various working conditions. Moreover, an FPGA implementation of the Spidergon has been developed to provide an accurate analysis of the cost of employing virtual channels in this architecture.

8.
Reliability is an important design concern for modern many-core embedded systems. Specifically, on-chip interconnects are vulnerable to permanent channel faults and transient data transmission faults, which may significantly impact overall system performance. In this work, a Unified Link-layer Fault-tolerant NoC (ULF-NoC) architecture is proposed. ULF-NoC is developed for NoCs equipped with bidirectional channels and features wormhole switching (instead of store-and-forward switching) and packet-based retransmission. An intelligent buffer controller is developed that does not require separate, dedicated buffer space to support packet retransmissions. Extensive simulations using both synthetic and real-world traffic demonstrate the marked performance of the proposed ULF-NoC solution.

9.
Providing highly flexible connectivity is a major architectural challenge for hardware implementation of reconfigurable neural networks. We perform an analytical evaluation and comparison of different configurable interconnect architectures (mesh NoC, tree, shared bus and point-to-point) emulating variants of two neural network topologies (having full and random configurable connectivity). We derive analytical expressions and asymptotic limits for the performance (in terms of bandwidth) and cost (in terms of area and power) of the interconnect architectures, considering three communication methods (unicast, multicast and broadcast). It is shown that the multicast mesh NoC provides the highest performance/cost ratio and is consequently the most suitable interconnect architecture for configurable neural network implementation. Routing table size requirements and their impact on scalability are analyzed. A modular hierarchical architecture based on the multicast mesh NoC is proposed to allow large-scale neural network emulation. Simulation results successfully validate the analytical models and the asymptotic behavior of the network as a function of its size.

10.
TCP-Cherry is a novel TCP congestion control scheme that we devised to ensure high performance over satellite IP networks and similar networks characterized by long propagation delays and high link error rates. In TCP-Cherry, two new algorithms, Fast-Forward Start and First-Aid Recovery, are proposed for congestion control. Our algorithms use supplement segments, i.e., low-priority segments, to probe the available bandwidth in the network for the TCP connections while also carrying new data blocks. In this paper, we present our new congestion control scheme, TCP-Cherry, and devise an analytical model for it. Our major contributions in this paper include the analytical model and equations for performance evaluation, validation of the analytical model through comparison between analytical and simulation results, and a guideline for tuning the buffer-related parameters at both the sender and the receiver for optimum throughput. Experiments show that the simulation results and the throughput calculated from our analytical model match quite closely, thereby verifying the appropriateness of the model. In addition, from analysis of the simulation results, we find that a receiver buffer size, rwnd, of around four times maxcwnd, the maximum congestion window at the sender side, is likely to maintain high throughput over a wide range of operating conditions.

11.
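To use buffer resources effectively, this paper proposes a method for dynamically allocating the port buffers of NoC routers. Data received at an input port is divided into groups according to its transmission direction, with each group corresponding to one output port, and the data is stored group by group; the control unit dynamically allocates buffer resources to each group according to the amount of data in it. Compared with virtual-channel-based dynamic buffer allocation, the method reduces the complexity of control and arbitration. Simulation results show that, for equal performance, the method effectively reduces the buffer requirement.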

12.
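NoC mapping and communication parameter design are very important parts of the NoC design process, and their results directly affect the performance, area, and power consumption of the NoC. This paper considers the NoC mapping problem and the communication parameter design problem together. It first gives a formal definition of the NoC mapping problem and then proposes a latency analysis method for wormhole-switched NoCs; based on the communication latency constraints of the application, it maps the application model onto the NoC topology and automatically designs the NoC communication parameters. Experiments show that the proposed latency analysis method is 7% to 13% more accurate than previous methods, and that the mapping results and communication parameter designs are better.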

13.
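NoC mapping and communication parameter design are very important parts of the NoC design process, and their results directly affect the performance, area, and power consumption of the NoC. This paper considers the NoC mapping problem and the communication parameter design problem together. It first gives a formal definition of the NoC mapping problem and then proposes a latency analysis method for wormhole-switched NoCs; based on the communication latency constraints of the application, it maps the application model onto the NoC topology and automatically designs the NoC communication parameters. Experiments show that the proposed latency analysis method is 7% to 13% more accurate than previous methods, and that the mapping results and communication parameter designs are better.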

14.
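To achieve efficient NoC (network-on-chip) performance evaluation and shorten the development cycle of system chips, this paper studies cycle-accurate NoC simulation methods and proposes a new high-level, high-efficiency simulation platform. Compared with traditional simulators that support only the mesh topology, it supports performance evaluation of both mesh and ring topologies, as well as router designs extended with virtual channels, and it quickly produces results such as network latency, throughput, and power consumption. Experimental results show that the platform accurately models the functional behavior of the NoC and quickly obtains its simulated performance, providing an efficient method for NoC design verification.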

15.
Networks-on-chip (NoCs) are used in a growing number of SoCs and multi-core processors. Because messages compete for the NoC’s shared resources, quality of service and resource allocation are major concerns for system designers. In particular, a model for the properties of packet delivery through the network is desirable. We present a methodology for packet-level static timing analysis in NoCs. Our methodology quickly and accurately gauges the performance parameters of a virtual-channel wormhole NoC without simulation. The network model can handle any topology, link capacities, and buffer sizes. It provides per-flow delay analysis that is orders-of-magnitude faster than simulation while being significantly more accurate than prior static modeling techniques. Using a carefully derived and reduced Markov chain, the model can statically represent the dynamic network state. Usage of the model in a placement optimization problem is shown as an example application.

16.
In this paper we present a methodology for developing efficient and deadlock-free routing algorithms for Network-on-Chip (NoC) platforms that are specialized for an application or a set of concurrent applications. The proposed methodology, called Application Specific Routing Algorithm (APSRA), exploits application-specific information about which pairs of cores communicate and which pairs never communicate in the NoC platform to maximize communication adaptivity and performance. The methodology also exploits known information about the concurrency or non-concurrency of communication transactions among cores for the same purpose. We demonstrate, through analysis of adaptivity as well as simulation-based evaluation of latency and throughput, that algorithms produced by the proposed methodology give significantly higher performance than other deadlock-free algorithms for both homogeneous and heterogeneous 2D mesh topology NoC systems. For example, for a homogeneous mesh NoC, APSRA results in approximately 30% less average delay than the Odd-Even algorithm just below saturation load. Similarly, the saturation load point for APSRA is significantly higher than that of other adaptive routing algorithms for both homogeneous and non-homogeneous mesh networks.

17.
This paper proposes a power-efficient flow-control method to tackle the problem of crosstalk faults in Networks-on-Chip (NoCs). The method, called FRR (Flit Reordering/Rotation), combines three coding mechanisms to entirely eliminate opposite direction transitions (OD transitions) as the source of crosstalk faults in NoC communication channels. The first mechanism, called flit-reordering, reorders the flits of every packet to find a flit sequence that produces the lowest number of OD transitions on NoC channels. The second mechanism, called flit-rotation, logically rotates the content of every flit of the packet with respect to the previously sent flit to further reduce the number of OD transitions. Finally, the third mechanism, called flit-insertion, examines the flits of the packet to find the OD transitions that are not removed by the first and second mechanisms. This mechanism inserts null-flits between the required flits to completely eliminate OD transitions on NoC channels. The FRR method is evaluated in two ways: (1) VHDL-based simulations are carried out for 16- and 32-bit channels when the maximum reorderings and maximum rotations in the first and second mechanisms are limited to 2, 4, and 8. (2) An analytical model is developed to calculate and compare the expected number of OD transitions in an unprotected NoC and in an FRR-enabled NoC. Both simulation and analytical results confirm that the FRR method completely removes crosstalk faults from NoC channels. In addition, VHDL simulations show that the FRR method provides a remarkable power saving, since the method reduces the number of transitions in NoC channels by at least 32.8%.
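
To make the quantity in the abstract above concrete, the following is a minimal sketch of counting OD transitions between two consecutive flits and of searching for a rotation that reduces them. It assumes that an OD transition means two adjacent wires switching in opposite directions in the same cycle, and that rotation is a left bit rotation of the flit; both are illustrative assumptions rather than details confirmed by the abstract, and all function names are hypothetical.

```python
def count_od_transitions(prev_flit, next_flit, width):
    """Count adjacent wire pairs that switch in opposite directions.

    Flits are integers; bit i is the value on wire i of a `width`-bit channel.
    """
    count = 0
    for i in range(width - 1):
        a_old, a_new = (prev_flit >> i) & 1, (next_flit >> i) & 1
        b_old, b_new = (prev_flit >> (i + 1)) & 1, (next_flit >> (i + 1)) & 1
        # Both wires toggle, and they end at different values,
        # so one rose while the other fell.
        if a_old != a_new and b_old != b_new and a_new != b_new:
            count += 1
    return count


def rotate_left(flit, amount, width):
    """Rotate a `width`-bit flit left by `amount` bit positions."""
    amount %= width
    mask = (1 << width) - 1
    return ((flit << amount) | (flit >> (width - amount))) & mask


def best_rotation(prev_flit, flit, width, max_rotations):
    """Return (rotation, od_count) with the fewest OD transitions.

    Tries rotations 0..max_rotations and prefers the smallest on ties.
    """
    candidates = [
        (count_od_transitions(prev_flit, rotate_left(flit, r, width), width), r)
        for r in range(max_rotations + 1)
    ]
    od_count, rotation = min(candidates)
    return rotation, od_count
```

For example, sending 0b1010 after 0b0101 on a 4-bit channel produces three OD transitions, while rotating the second flit by one position makes it equal to the first and removes them all. A receiver would need to know the rotation amount to recover the data, which this sketch leaves out.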

18.
Shared-buffer switches have many advantages such as relatively low cell loss rate and good buffer utilization, and they are increasingly favoured in recent VLSI switch designs for ATM. However, their performance degrades dramatically under nonuniform traffic due to the monopolization of the buffer by some favoured cells. To overcome this, restricted types of sharing and hot-spot pushout (HSPO) have been proposed, and the latter has been shown by simulation to perform better in all situations. In this paper we develop an analytical model for performance evaluation of a shared-buffer asynchronous transfer mode (ATM) switch with HSPO under bursty traffic. This analytical model is an improved version of the first model ever developed for this purpose. We balance the relative queues to approximate the effects of pushout, while keeping only four state-variables, and our model gives a good agreement with simulation, for calculating throughput and cell loss.

19.
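袁景凌, 刘华, 谢威, 蒋幸. 《计算机应用》 (Journal of Computer Applications), 2011, 31(10): 2630-2633
To meet the increasingly diverse application requirements of networks-on-chip (NoCs), multicast routing has been applied to NoCs to make up for the shortcomings of traditional unicast communication. Taking Mesh- and Torus-type NoCs as examples, this paper analyzes three path-based multicast routing algorithms (XY routing, UpDown routing, and SubPartition routing) and studies the corresponding congestion control strategies. Simulation experiments show that, compared with unicast, multicast communication has a smaller average transmission latency and higher network throughput, with evenly distributed load; the benefit of the SubPartition routing algorithm in particular becomes more pronounced as the network grows. The proposed multicast congestion control mechanism makes more effective use of multicast communication and improves NoC performance.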
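
As background for the XY routing named in the abstract above, here is a minimal sketch of plain unicast XY (dimension-order) routing on a 2D mesh: a packet first travels along the X dimension until its column matches the destination, then along Y. This is only the well-known unicast scheme, not the path-based multicast variant studied in the paper, and the function name and coordinate convention are assumptions made for illustration.

```python
def xy_route(src, dst):
    """Return the list of (x, y) mesh nodes visited from src to dst.

    The packet moves one hop at a time, fully resolving the X offset
    before making any move in the Y dimension.
    """
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path
```

Because every packet resolves X before Y, the route between any two nodes is deterministic, and this fixed ordering of dimensions is what makes the scheme deadlock-free on a mesh.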

20.
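王雷, 凌翔, 胡剑浩. 《计算机科学》 (Computer Science), 2011, 38(9): 298-303
For the task mapping problem of heterogeneous multi-core networks-on-chip (NoCs), this paper proposes a coarse estimation model and an accurate calculation model for energy consumption and latency, matched to the different characteristics of the two stages of the problem: selecting IP cores, and mapping the IP cores to positions on the NoC platform. To keep the discrete-space search from falling into local optima, a chaotic perturbation mechanism is designed. An improved discrete particle swarm optimization algorithm with chaotic perturbation is then proposed to search for multi-objective NoC mapping solutions that optimize energy consumption and latency; compared with traditional optimization algorithms, it achieves significant improvements in both.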
