Similar literature
Found 20 similar documents (search time: 31 ms)
1.
Network-on-chip (NoC) has rapidly become a promising alternative for complex system-on-chip architectures, including recent multicore architectures. Optimizing NoC architectures for the design objectives of a particular application domain is crucial for achieving high-performance, energy-efficient customized solutions. Although many studies have proposed solutions for individual aspects of NoC design, a comprehensive NoC system solution has not yet emerged. This paper presents a novel methodology for complex on-chip communication problems that reduces power, latency, and area overhead. The proposed NoC communication architecture sets up virtual source-destination paths between selected pairs of NoC cores so that packets traveling between distant nodes can bypass intermediate routers along these virtual paths. The paths are constructed for an application at design time from its task graph. At run time, a scheduling mechanism then improves the buffer management, virtual channel, and switch allocation schemes, so the constructed paths are optimized dynamically. The design also reduces router complexity and its overheads. The proposed router has been implemented on the Xilinx Virtex-5 FPGA family. Evaluation with the SPLASH-2 benchmark suite shows that, compared with a conventional NoC router, the proposed router reduces latency by 25% and energy by 53% at the cost of a 3.5% area overhead. Overall, the experimental results demonstrate a significant reduction in average packet latency and total power consumption with negligible area overhead.

2.
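Strict limits on chip area and energy consumption are the biggest difference between on-chip networks and macro-scale networks. The buffers in network-on-chip (NoC) routers account for a large share of chip area and power, so bufferless NoCs have attracted wide attention. A bufferless NoC removes the buffers inside the router entirely and resolves contention for output ports by deflecting flits that fail to obtain a productive port. This paper studies the principles and key techniques of bufferless deflection networks, including topology and arbitration policies. Finally, it summarizes the advantages and disadvantages of bufferless networks through a comparison with buffered networks.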
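The deflection idea in this abstract can be illustrated with a minimal Python sketch. It is not the arbitration scheme studied in the paper: the oldest-first priority, the `deflect_route` name, and the choice of the lowest-numbered free port for deflected flits are all assumptions made for illustration.

```python
def deflect_route(flits, num_ports):
    """Assign every incoming flit to an output port without buffering.

    flits: list of (age, preferred_port) tuples, one per incoming flit.
    Returns a dict mapping flit index -> assigned output port.
    Assumption: older flits win arbitration (oldest-first policy).
    """
    if len(flits) > num_ports:
        raise ValueError("a bufferless router needs one output per input flit")

    free_ports = set(range(num_ports))
    assignment = {}
    deflected = []

    # Arbitrate in order of decreasing age; ties keep arrival order.
    order = sorted(range(len(flits)), key=lambda i: -flits[i][0])
    for i in order:
        _, preferred = flits[i]
        if preferred in free_ports:
            assignment[i] = preferred      # productive port granted
            free_ports.remove(preferred)
        else:
            deflected.append(i)            # lost arbitration

    # Losers are deflected to any remaining free port instead of waiting.
    for i in deflected:
        port = min(free_ports)
        assignment[i] = port
        free_ports.remove(port)

    return assignment


if __name__ == "__main__":
    # Two flits compete for port 0; the younger one is deflected.
    print(deflect_route([(5, 0), (9, 0), (1, 2)], num_ports=4))
```

Because the number of incoming flits never exceeds the number of output ports, every flit leaves the router in the same cycle, which is what allows the buffers to be removed.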

3.
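Ma Liwei (马立伟), Sun Yihe (孙义和). Acta Electronica Sinica (《电子学报》), 2007, 35(5): 906-911
As systems-on-chip (SoCs) have evolved, exponential growth in integration density and rapid expansion of die area have increased the delay and reduced the reliability of global wires, which has become a bottleneck in integrated circuit design. The network-on-chip (NoC) is one architecture for moving data efficiently across the whole chip; an SoC that uses an NoC as its communication infrastructure is called a System-on-Network-on-Chip (SoNoC). Communication within an SoC is both random and deterministic, so the NoC should be designed around the communication characteristics of the specific application. Building on a defined SoNoC design flow and on the communication characteristics of SoNoCs, this paper selects suitable discrete plane structures, places the computation and control modules of the SoNoC, routes the communication dependencies between modules, and develops the FRoD (Floor-plan and Routing on Discrete Plane) algorithm to generate the NoC topology automatically. The algorithm defines a general representation of discrete planes, and a series of experiments was carried out on four typical discrete planes using random systems of different sizes. To handle the coupling between the system and the network, a point-by-point splitting placement algorithm progressively learns and adapts to the system's communication demands while optimizing both execution time and communication energy. On a simulated system running random task flow graphs, it saves about 30% of communication energy and about 20% of system communication time compared with random placement. Serial, parallel, and hybrid serial-parallel routing algorithms use shortest paths to distribute communication relations over the channels of the discrete plane so that different communication relations reuse network channels as much as possible, saving 10% to 30% of the area cost compared with a fully connected network.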

4.
Network-on-chip (NoC) has been proposed as an appropriate solution for today's on-chip communication challenges. Power dissipation has become a key factor in NoCs because of their shrinking sizes. In this paper, we propose a new encoding approach that reduces power by decreasing the switching activity on the buses. The approach assigns symbols to data words so that more frequent words are sent with lower power consumption: it dedicates the symbols with fewer ones to high-probability data and uses transition signaling to transmit them. Unlike existing low-power encodings, the proposed method does not rely on spatial redundancy and keeps the bus width constant. Experimental evaluations show that our approach reduces power dissipation in NoCs by up to 46%, with 2.70%, 0.51%, and 15.43% overhead in power, critical path, and area, respectively.
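The encoding described above can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's algorithm: the function names, the tie-breaking rules, and the way frequencies are estimated from the data itself are assumptions. With transition signaling, a 1 in the symbol toggles the corresponding wire, so the number of ones in a symbol equals the number of wire transitions it causes.

```python
from collections import Counter


def build_codebook(words, width):
    """Map the most frequent words to the symbols with the fewest ones."""
    freq = Counter(words)
    # All symbols of the given width, ordered by number of ones.
    symbols = sorted(range(1 << width), key=lambda s: (bin(s).count("1"), s))
    # Words ordered by decreasing frequency (ties broken by value).
    ranked = sorted(freq, key=lambda w: (-freq[w], w))
    return {w: symbols[i] for i, w in enumerate(ranked)}


def transmit(words, codebook, state=0):
    """Transition signaling: XOR each symbol onto the current bus state."""
    states = []
    for w in words:
        state ^= codebook[w]
        states.append(state)
    return states


def receive(states, codebook, state=0):
    """Recover each symbol as the XOR of consecutive bus states."""
    inverse = {s: w for w, s in codebook.items()}
    words = []
    for s in states:
        words.append(inverse[s ^ state])
        state = s
    return words


def count_transitions(states, state=0):
    """Total number of wire toggles for a sequence of bus states."""
    total = 0
    for s in states:
        total += bin(s ^ state).count("1")
        state = s
    return total


if __name__ == "__main__":
    data = [3, 3, 3, 1, 3, 2, 3, 3]          # word 3 dominates the traffic
    book = build_codebook(data, width=2)
    coded = transmit(data, book)
    print(count_transitions(data), count_transitions(coded))
```

The bus width stays at `width` bits because the codebook is a permutation of the existing symbol space; no extra wires are added.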

5.
Continuing advances in processing technology, along with significant decreases in the feature size of integrated circuits, increase susceptibility to transient errors and permanent faults. Networks-on-chip (NoCs) have emerged to address the demand for high-bandwidth communication among processing elements. The structural redundancy inherent in NoC-based designs can be exploited to improve reliability and compensate for the effects of failures. In this paper, we propose an enhanced fault-tolerant microarchitecture with deadlock-free routing for hierarchical NoCs. The proposed router provides dynamic virtual channel (VC) allocation and employs a high-performance fault-tolerant flow control that handles both transient and permanent faults in hierarchical networks without requiring extra retransmission buffers. Experimental results show a significant improvement in reliability as well as decreases in average latency and energy consumption.

6.
Benefiting from the development of semiconductor technology, many-core systems-on-chip (SoCs) have been widely used in electronic devices. Networks-on-chip (NoCs) can relieve the heavy load of on-chip communication thanks to their high bandwidth, low latency, and good flexibility. Since the deep sub-micron era, reliability has become a critical constraint for integrated circuits. To provide correct data transmission, fault-tolerant NoCs have been widely researched in recent decades, and many valuable designs have been proposed. This work introduces and summarizes state-of-the-art techniques for fault diagnosis and fault recovery in fault-tolerant NoCs and outlines prospects for future research.

7.
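Optical interconnect has become the most promising solution for on-chip multicore interconnection because many of its characteristics are superior to those of electrical interconnect. To improve the performance of on-chip optical interconnect network architectures, this work builds on optical device modules and proposes a new 4×4 optical router switch based on microrings. A topology built from only 7 microrings achieves non-blocking switching among 4 bidirectional ports, reducing power consumption and area; the number of waveguide crossings is reduced to 6, which improves insertion loss. The results show that, compared with the classic structure, the proposed structure saves about 8% of optical device power and reduces the insertion loss of the optical interconnect layer by about 7%.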

8.
In this paper, we explore the designs of a circuit-switched router, a wormhole router, a quality-of-service (QoS) supporting virtual channel router, and a speculative virtual channel router, and accurately evaluate the energy-performance tradeoffs they offer. Power results from the designs placed and routed in a 90-nm CMOS process show that all the architectures dissipate significant idle-state power. The additional energy required to route a packet through the router is then shown to be dominated by the data path. This leads to the key result that, if this trend continues, the use of more elaborate control can be justified and will not be immediately limited by the energy budget. A performance analysis also shows that dynamic resource allocation leads to the lowest network latencies, while static allocation may be used to meet QoS goals. Combining the power and performance figures then allows an energy-latency product to be calculated to judge the efficiency of each of the networks. The speculative virtual channel router is shown to have an efficiency very similar to that of the wormhole router while providing better performance, supporting its use for general-purpose designs. Finally, area metrics are presented to allow a comparison of implementation costs.

9.
This paper presents several on-chip antenna structures that may be fabricated with standard CMOS technology for use at millimeter wave frequencies. On-chip antennas for wireless personal area networks (WPANs) promise to reduce interconnection losses and greatly reduce wireless transceiver costs, while providing unprecedented flexibility for device manufacturers. This paper presents the current state of research in on-chip integrated antennas, highlights several pitfalls and challenges for on-chip design, modeling, and measurement, and proposes several antenna structures that derive from the microwave microstrip and amateur radio art. This paper also describes an experimental test apparatus for performing measurements on RFIC systems with on-chip antennas developed at The University of Texas at Austin.

10.
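Tian Ying (田颖), Wang Shuang (王爽), Ren Ke (任科). Semiconductor Optoelectronics (《半导体光电》), 2017, 38(3): 330-333, 368
A 10-bit on-chip time-to-digital converter (TDC) based on a delay-locked loop (DLL) and a synchronous counter is designed. It uses a two-step hierarchical approach: the synchronous counter performs coarse quantization, measuring whole clock periods and producing a 6-bit binary code; a high-performance differential DLL then outputs 16 clock signals with fixed phase shifts for sampling, finely quantizing the remaining fraction of a clock period and producing a 4-bit thermometer code. This structure offers good precision, dynamic range, and conversion speed. Compared with a conventional sub-gate-delay TDC, it occupies less chip area, converts faster, and is less sensitive to process, voltage, and temperature variation. Simulation results show that the TDC has a dynamic range with an LSB of 62.5 ps and an MSB of 64 ns, which meets the needs of typical time-correlated single-photon counting.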
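The figures in this abstract fit together arithmetically, as the short Python sketch below shows. It assumes that the 16 DLL phases divide one clock period evenly, which implies a clock period of 16 × 62.5 ps = 1 ns; that period is derived here and is not stated in the abstract. The function names are hypothetical, and the fine stage is represented simply by the index of the phase step (0 to 15) rather than by the raw sampled code.

```python
LSB_PS = 62.5        # fine resolution stated in the abstract
FINE_STEPS = 16      # number of DLL phases per clock period
COARSE_BITS = 6      # width of the synchronous counter


def tdc_code(coarse_count, fine_step):
    """Combine the coarse counter value and the fine phase index into a 10-bit code."""
    if not 0 <= coarse_count < (1 << COARSE_BITS):
        raise ValueError("coarse count must fit in 6 bits")
    if not 0 <= fine_step < FINE_STEPS:
        raise ValueError("fine step must be in the range 0..15")
    return coarse_count * FINE_STEPS + fine_step


def code_to_ps(code):
    """Convert an output code to a time interval in picoseconds."""
    return code * LSB_PS


if __name__ == "__main__":
    print(code_to_ps(FINE_STEPS))             # one clock period, in ps
    print(code_to_ps(tdc_code(63, 15)))       # largest representable interval
    print(code_to_ps(1 << 10) / 1000.0)       # full-scale range, in ns
```

Under this assumption, 64 coarse counts of 1 ns each give the 64 ns range, and 2^10 codes of 62.5 ps give the same value, so the two stated figures are consistent with a 10-bit output.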

11.
Congestion is an important issue in networks and significantly affects network performance. Various congestion control mechanisms have been proposed for the Internet, interconnection networks, and other settings, but they are not suitable for network-on-chip systems. Based on the requirements of chip designs, we propose a new distributed congestion control mechanism for network-on-chip systems. The mechanism uses a new detection metric, the length of the occupied source buffer, to detect network congestion. The new metric is more accurate than others. With it, congestion information can be obtained directly inside a node, which allows the mechanism to be fully distributed without requiring any global information. In addition, the presence of real-time traffic is considered: throttling is not required for such traffic in order to provide QoS. An asymmetric router architecture with an additional congestion control unit is also proposed to facilitate the implementation of the new mechanism. The overhead is relatively low: about 1.79% in area and 2.06 mW in power consumption. The simulations are carried out in OPNET. The results show that our congestion control mechanism alleviates performance degradation for loads beyond saturation and maintains adequate levels of throughput at higher loads. The new mechanism achieves better network performance than others under different traffic patterns and network sizes.
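A minimal Python sketch of the local detection idea follows. It only illustrates the principle that a node can decide to throttle from its own source buffer occupancy; the `SourceNode` class, the fixed `threshold`, and the drop-on-throttle behavior are assumptions for illustration and are not taken from the paper.

```python
from collections import deque


class SourceNode:
    """A traffic source that throttles itself using only local information."""

    def __init__(self, threshold):
        self.threshold = threshold          # hypothetical occupancy limit
        self.source_buffer = deque()

    def congested(self):
        # Detection metric: length of the occupied source buffer.
        return len(self.source_buffer) >= self.threshold

    def offer(self, packet, real_time=False):
        """Try to queue a packet; return True if it was accepted."""
        # Real-time traffic is never throttled; other traffic is
        # throttled while the node considers the network congested.
        if real_time or not self.congested():
            self.source_buffer.append(packet)
            return True
        return False

    def inject(self):
        """Send the oldest queued packet into the network, if any."""
        return self.source_buffer.popleft() if self.source_buffer else None


if __name__ == "__main__":
    node = SourceNode(threshold=2)
    print([node.offer(p) for p in ("a", "b", "c")])
    print(node.offer("rt", real_time=True))
```

No global state is consulted anywhere in the class, which mirrors the fully distributed property claimed in the abstract.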

12.
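Zhang Liang (张亮), Yuan Yongqiong (袁永琼), Chi Kai (迟凯). Modern Navigation (《现代导航》), 2020, 11(4): 299-304
Existing routing protocols for wireless ad hoc networks incur high routing overhead or long delay and are therefore unsuitable for direct use in data link networks. To address this problem, this paper proposes a low-overhead dynamic routing protocol for data link networks. By embedding routing messages in the messages that the data link network already sends during normal operation, the protocol provides real-time dynamic routing service to network members at low network overhead. Experimental results verify the effectiveness of the routing protocol.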

13.
A compact on-chip error correcting circuit (ECC) for low-cost flash memories has been developed. The total increase in chip area is 2%, including all cells, sense amplifiers, logic, and wiring associated with the ECC. The proposed on-chip ECC, employing 10 check bits for 512 data bits, has been implemented on an experimental 64M-bit NAND flash memory. The cumulative sector error rate has been improved from 10^-4 to 10^-10. By transferring read data from the sense amplifiers to the ECC twice, the 522-Byte temporary buffers that are required for the conventional ECC and occupy a large part of the ECC area have been eliminated. As a result, the area of the circuit has been reduced by a factor of 25. The proposed on-chip ECC has been optimized for the balance between reliability improvement and cell area overhead. The power increase has been suppressed to less than 1 mA.
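The ratio of 10 check bits to 512 data bits matches the minimum for a single-error-correcting Hamming code, as the Python sketch below verifies. The abstract does not say which code the circuit uses, so treating it as a Hamming code is an assumption, and `min_check_bits` is a hypothetical helper. The condition checked is the standard one: r check bits can correct one error in k data bits when 2^r >= k + r + 1.

```python
def min_check_bits(data_bits):
    """Smallest r such that 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 1
    while (1 << r) < data_bits + r + 1:
        r += 1
    return r


if __name__ == "__main__":
    for k in (4, 11, 512):
        print(k, min_check_bits(k))
```

For 512 data bits, 9 check bits are not enough (2^9 = 512 < 522), while 10 are (2^10 = 1024 >= 523).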

14.
Constructing on-chip or inter-silicon (inter-die/inter-chip) networks to connect multiple processors extends system capability and scalability. A key issue is implementing a flexible router that can fit various application scenarios. This paper proposes a multi-mode adaptable router that supports both circuit and wormhole switching and supplies flexible working strategies for specific traffic patterns in diverse applications. The limitation of mono-mode switched routers is shown first, followed by an exploration of algorithms in the proposed router for choosing the proper working strategy in a specific network. We then present the performance improvement obtained by applying the mixed circuit/wormhole switching mode to different applications, and analyze image decoding as a case study. The multi-mode router has been implemented with different configurations in a 65 nm CMOS technology. The configuration with 8-bit flit width is demonstrated together with a multi-core processor to show feasibility. Working at 350 MHz, the whole system has an average power consumption of 22 mW.

15.

Aggressively scaled CMOS technology increasingly threatens the dependability of network-on-chip (NoC) architectures. In a mesh-based NoC, a faulty router or broken link may isolate a fully functional processing element (PE). A set of faulty routers may also form isolated regions, which can degrade the design. In this paper, we propose a router-level redundancy (RLR) fault-tolerant scheme that differs from the traditional microarchitecture-level redundancy (MLR) approach and relieves the problem of isolated PEs and isolated regions. By simply adding one spare router within each router set in a mesh, RLR can be created and the connection paths between adjacent routers can be diversified. To exploit this extra resource, two reconfiguration algorithms are demonstrated to detour around observed faulty routers and links. The proposed RLR fault-tolerant scheme can tolerate at most one faulty router within a router set. After reconfiguration, the original mesh topology is maintained. As a result, the proposed architecture does not need any support from network-layer routing algorithms. The scheme has been evaluated on three fault-tolerance metrics: reliability, mean time to failure (MTTF), and yield. The experimental results show that the performance of RLR increases as the size of the NoC grows, while the relative connection cost decreases at the same time. This characteristic makes our architecture suitable for large-scale NoC designs.


16.
Based on multiple-slice turbo codes, a novel semi-iterative analog turbo decoding algorithm and its corresponding decoder architecture are presented. This work paves the way for integrating flexible analog decoders that handle frame lengths of thousands of bits. The algorithm benefits from a partially continuous exchange of extrinsic information to improve decoding speed and correction performance. The proposed algorithm and architecture are applied to the design of an analog decoder for double-binary codes. Taking full advantage of multiple-slice codes, the on-chip area is shown to be reduced by a factor of ten compared with a conventional fully parallelized analog slice turbo decoder. The reconfigurable analog core area for frames of 40 bits up to 2432 bits is 37 mm² in a 0.25-μm BiCMOS process.

17.
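The rapid development of large-scale and very-large-scale integrated circuits has made network-on-chip systems a reality. It has also pushed the power consumption of a chip of just over ten square centimeters to hundreds of watts, and power continues to rise as integration scale grows. Research in the deep sub-micron regime keeps shrinking the area of network-on-chip chips, so latency and energy consumption in the interconnect communication between IP cores have become the main concerns of modern network-on-chip systems. This paper analyzes the average latency of network-on-chip systems, the structure and power consumption of the routers that carry the main communication load, and methods for reducing that power.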

18.
Multicast on-chip communication is encountered in various cache-coherence protocols targeting multi-core processors, and its pervasiveness is increasing due to the proliferation of machine learning accelerators. In-network handling of multicast traffic imposes additional switching-level restrictions to guarantee deadlock freedom, while it stresses the allocation efficiency of Network-on-Chip (NoC) routers. In this work, we propose a novel partitioned NoC router microarchitecture, called SmartFork, which employs a versatile and cost-efficient multicast packet replication scheme that allows the design of high-throughput and low-cost NoCs. The design is adapted to the average branch splitting observed in real-world multicast routing algorithms. Compared to state-of-the-art NoC multicast approaches, SmartFork is demonstrated to yield high performance in terms of latency and throughput, while still offering a cost-effective implementation.

19.
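To reduce the performance loss in networks-on-chip (NoCs) caused by head-of-line (HoL) blocking in wormhole buffer structures, and to eliminate the low buffer utilization that virtual-channel buffer structures show for variable-length packets, this paper uses virtual channel techniques to propose an on-chip wormhole router architecture with dynamically allocated input queues (DAIQ). The architecture uses a token table to support dynamic allocation of the depth and number of virtual queues. To allow the flits of one packet to be scheduled back to back, the paper also proposes a novel switch allocation mechanism, SRRM, which further improves switch latency and throughput under high load. Simulation and synthesis results show that, compared with a conventional on-chip router using virtual-channel flow control, the DAIQ router achieves similar performance with 50% of the buffer area and, in a 0.13 μm CMOS process, saves 30.18% of standard-cell area and 384% of power.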

20.
Configuring a network is a tedious and error-prone task. In particular, configuring routing policies for a network is complex because it involves subtle dependencies among multiple routers across the network. Misconfigurations are common, and certain misconfigurations can bring the Internet down. In 2005, a misconfigured router in AS 9121 blackholed traffic for tens of thousands of networks in the Internet. This paper describes NetPiler, a system that detects router misconfigurations. NetPiler consists of a routing policy configuration model and a misconfiguration detection algorithm. The model is applicable to routing policies configured on a single router as well as to network-wide configuration. Using the model, NetPiler detects configuration commands that do not influence the behavior of the network; we call these ineffective commands. Although ineffective commands can be benign, commands that are mistakenly configured to be ineffective can cause the network to deviate from its intended behavior. We have implemented NetPiler in approximately 128,000 lines of C++ code and evaluated it on the configurations of four production networks. NetPiler discovers nearly a hundred ineffective commands. Some of these misconfigurations can result in loss of connectivity, access to protected networks, and financial implications from providing free transit services. We believe NetPiler can help networks significantly reduce misconfigurations.
