1.

High-level customization framework for application-specific NoC architectures

Iraklis Anagnostopoulos Alexandros Bartzas Iason Filippopoulos Dimitrios Soudris 《Design Automation for Embedded Systems》2012,16(4):339-361

Network-on-Chip (NoC) has been recognized as the new paradigm to interconnect and organize a high number of cores. NoCs address global communication issues in Systems-on-Chip (SoCs) through communication-centric design and the implementation of scalable communication structures, making application-specific NoC design a key challenge in modern SoC design. In this paper we present a SystemC customization framework and methodology for automatic design and evaluation of regular and irregular NoC architectures. The presented framework also supports application-specific optimization techniques such as priority assignment, node clustering and buffer sizing. Experimental results show that the generated regular NoC architectures achieve on average 5.5% lower communication cost than other regular NoC designs, while irregular NoCs achieve on average 4.5× higher throughput and a 40% reduction in network delay compared to regular mesh topologies. In addition, by employing a buffer sizing algorithm we reduce the network's power consumption by an average of 45% for both the regular and irregular NoC design flows.

2.

Polaris: A System-Level Roadmapping Toolchain for On-Chip Interconnection Networks

Soteriou V. Eisley N. Hangsheng Wang Bin Li Li-Shiuan Peh 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(8):855-868

Technology trends are driving parallel on-chip architectures in the form of multiprocessor systems-on-a-chip (MPSoCs) and chip multiprocessors (CMPs). In these systems, the increasing on-chip communication demand among the computation elements necessitates the use of scalable, high-bandwidth network-on-chip (NoC) fabrics instead of dedicated interconnects and shared buses. As transistor feature sizes are further miniaturized, more complicated NoC architectures become feasible that can support more demanding applications. Given the myriad emerging software-hardware combinations, for cost-effectiveness, a system designer critically needs to prune this widening NoC design-space to predict the interconnect fabric(s) that best balance(s) cost/performance, before the actual design process begins. This prompted us to develop Polaris, a system-level roadmapping toolchain for on-chip interconnection networks that helps designers predict the most suitable interconnection network design(s) tailored to their performance needs and power/silicon area constraints with respect to a range of applications that the system will run. Polaris explores the plethora of NoC designs based on projections of network traffic, architectures, and process characteristics. While Polaris's toolchain is extensible so new traffic, network designs, and technology processes can be added, the current version already incorporates 7872 NoC design points. Polaris is rapid, efficiently iterating over thousands of NoC design points, while maintaining high relative and absolute accuracies when validated against detailed NoC synthesis results.

3.

Feature - NoC emulation: a tool and design flow for MPSoC

Genko N. Atienza D. De Micheli G. Benini L. 《Circuits and Systems Magazine, IEEE》2007,7(4):42-51

Current systems-on-chip (SoCs) execute applications that demand extensive parallel processing; thus, the number of processors, memories and application-specific signal processing cores is rapidly increasing. In these new multi-processor SoCs (MPSoCs), one of the most critical elements regarding overall efficiency is the on-chip interconnect. Network-on-chip (NoC) provides a structured way of realizing interconnections on silicon, and obviates the limitations of bus-based solutions. NoCs can have regular or ad hoc topologies and can be tuned by a large set of parameters. Simulation and functional validation are essential to assess the correctness and performance of MPSoC architectures. We present a flexible hardware-software emulation framework implemented on an FPGA that is specially designed to explore, evaluate and compare a wide range of NoC solutions with very limited effort. Our experimental results show a speed-up of four orders of magnitude with respect to cycle-accurate HDL simulation, while retaining the cycle accuracy and flexibility of software simulators. Finally, we propose a validation flow for MPSoCs based on our flexible NoC emulation framework, which allows designers to explore and optimize a range of solutions, as well as quickly characterize performance figures and identify possible limitations in their on-chip interconnection architectures.

4.

Design automation for application-specific on-chip interconnects: A survey

《Integration, the VLSI Journal》2016

5.

System level modeling methodology of NoC design from UML-MARTE to VHDL

Majdi Elhaji Pierre Boulet Abdelkrim Zitouni Samy Meftali Jean-Luc Dekeyser Rached Tourki 《Design Automation for Embedded Systems》2012,16(4):161-187

6.

Synthesis of Predictable Networks-on-Chip-Based Interconnect Architectures for Chip Multiprocessors

Murali S. Atienza D. Meloni P. Carta S. Benini L. De Micheli G. Raffo L. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(8):869-880

Today, chip multiprocessors (CMPs) that accommodate multiple processor cores on the same chip have become a reality. As the communication complexity of such multicore systems is rapidly increasing, designing an interconnect architecture with predictable behavior is essential for proper system operation. In CMPs, general-purpose processor cores are used to run software tasks of different applications and the communication between the cores cannot be precharacterized. Designing an efficient network-on-chip (NoC)-based interconnect with predictable performance is thus a challenging task. In this paper, we address the important design issue of synthesizing the most power efficient NoC interconnect for CMPs, providing guaranteed optimum throughput and predictable performance for any application to be executed on the CMP. In our synthesis approach, we use accurate delay and power models for the network components (switches and links) that are obtained from layouts of the components using industry standard tools. The synthesis approach utilizes the floorplan knowledge of the NoC to detect timing violations on the NoC links early in the design cycle. This leads to a faster design cycle and quicker design convergence across the high-level synthesis approach and the physical implementation of the design. We validate the design flow predictability of our proposed approach by performing a layout of the NoC synthesized for a 25-core CMP. Our approach maintains the regular and predictable structure of the NoC and is applicable in practice to existing NoC architectures.

7.

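Network-on-Chip: A New Generation of System-on-Chip Architecture

Liu Yanhua (刘炎华) Liu Jing (刘静) Lai Zongsheng (赖宗声) 《电子与封装》(Electronics & Packaging) 2011,11(5):23-27

Systems-on-chip use shared or dedicated buses as the chip's communication resources. Because these buses have inherent limitations, they scale poorly and cannot meet future demands. Under these conditions, current on-chip interconnect structures will become the bottleneck in the development of multicore chips. This article introduces a new on-chip architecture, the network-on-chip, to address the shortcomings of buses in future systems-on-chip. As a new on-chip architecture, the network-on-chip can address the various challenges arising in system-on-chip design...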

8.

Thermal-aware test scheduling using network-on-chip under multiple clock rates

Hassan Salamy 《International Journal of Electronics》2013,100(3):408-424

The growing number of cores on a single chip has led to scalability and bandwidth issues in bus-based communication. Network-on-chip (NoC) techniques have emerged as a solution that provides much-needed flexibility and scalability in the multi-core era. This article presents an optimal integer linear programming (ILP) formulation and a simulated annealing (SA) solution to thermal and power-aware test scheduling of cores in an NoC-based SoC using multiple clock rates. The methods have been implemented and results on various benchmarks are presented.

9.

An interconnection architecture for network-on-chip systems

S. Suboh M. Bakhouya J. Gaber T. El-Ghazawi 《Telecommunication Systems》2008,37(1-3):137-144

Network on Chip (NoC) is a research discipline that primarily addresses global communication in Systems on Chip (SoCs). It is inspired by, and uses, the same routing and switching techniques as multi-computer networks. Current shared-bus on-chip communication architectures generally have limited scalability due to the nature of buses, especially when complex on-chip communication is needed. The main goal is to have a dedicated communication infrastructure in the system that can scale up while minimizing area and power. The selected interconnect topology plays a prime role in the performance of the NoC architecture, as do the routing and switching techniques that can be used. In this paper, we introduce a new NoC architecture by adapting a recursive topology structure. An experimental study is performed to compare this structure with basic NoC topologies represented by 2D mesh and Spidergon. The analysis illustrates the main features of this topology and its unique benefits. The simulation results show that the recursive network outperforms 2D mesh and Spidergon in the main performance metrics.

10.

Safe and efficient power management of hard real-time networks-on-chip

《Integration, the VLSI Journal》2019

The power overhead of Networks-on-Chip (NoCs) becomes tremendous in high-density Multiprocessor Systems-on-Chip (MPSoCs). Especially in hard real-time and safety-critical systems, power management mechanisms must be developed that efficiently adhere to real-time requirements. However, state-of-the-art solutions typically either induce a high timing overhead, thus challenging safety, or have limited power saving capabilities. Additionally, current power-gating mechanisms do not provide an upper bound on the latency overhead, and thus no timing guarantees. We propose a safe and enhanced approach for power-gating that allows global and dynamic power management under timing guarantees, i.e., all deadlines of critical tasks are met. It introduces a control layer to save power on the NoC data layer using multiple Power-Aware Traffic-Monitor (PATM) units, which apply knowledge of the global state of the system to efficiently save power on NoC routers even at high NoC utilization. To safely apply the PATMs in hard real-time systems while meeting the deadlines, we provide a formal worst-case timing analysis to derive an upper bound on the PATMs' latency overhead. Experimental results show that our approach efficiently reduces static power consumption and scales well, inducing very small area overhead.

11.

Test Scheduling of NoC‐Based SoCs Using Multiple Test Clocks

Jin‐Ho Ahn Sungho Kang 《ETRI Journal》2006,28(4):475-485

Network‐on‐chip (NoC) is an emerging design paradigm intended to cope with future systems‐on‐chips (SoCs) containing numerous built‐in cores. Since NoCs have some outstanding features regarding design complexity, timing, scalability, power dissipation and so on, widespread interest in this novel paradigm is likely to grow. The test strategy is a significant factor in the practicality and feasibility of NoC‐based SoCs. Among the existing test issues for NoC‐based SoCs, test access mechanism architecture and test scheduling particularly dominate the overall test performance. In this paper, we propose an efficient NoC‐based SoC test scheduling algorithm based on a rectangle packing approach used for current SoC tests. In order to adopt the rectangle packing solution, we designed specific methods and configurations for testing NoC‐based SoCs, such as test packet routing, test pattern generation, and absorption. Furthermore, we extended and improved the proposed algorithm using multiple test clocks. Experimental results using some ITC’02 benchmark circuits show that the proposed algorithm can reduce the overall test time by up to 55%, and 20% on average compared with previous works. In addition, the computation time of the algorithm is less than one second in most cases. Consequently, we expect the proposed scheduling algorithm to be a promising and competitive method for testing NoC‐based SoCs.

12.

Efficient routing in network-on-chip for 3D topologies

Luneque Silva Junior Nadia Nedjah Luiza De Macedo Mourelle 《International Journal of Electronics》2013,100(10):1695-1712

With increasing on-chip integration capability, numerous integrated systems now employ a set of processing elements, as in multicore processors. An efficient interconnection of those elements can be obtained via the use of a Network on Chip (NoC). This approach is similar to traditional computer networks in that, not being restricted to multiprocessors, it can interconnect several dedicated devices. Like other networks, NoCs can be arranged in different topologies, such as ring, mesh and torus. They have shared links that can be used to transmit packets from different nodes. Thus, network congestion is an issue and must be treated to reduce delays. Algorithms based on ant colony optimisation have proven to be effective in static routing in systems designed to perform a fixed set of tasks, or where the communication pattern is known. This article introduces 3D ant colony routing (3D-ACR) and applies it as the routing policy of NoCs with three different 3D topologies: mesh, torus and hypercube. Experimental results show that 3D ant colony routing performs consistently better than previously proposed routing strategies.

13.

Application-aware virtual paths insertion for NOCs

Majed ValadBeigi Farshad Safaei Bahareh Pourshirazi 《Microelectronics Journal》2014

Network-on-chip (NoC) has rapidly become a promising alternative for complex system-on-chip architectures, including recent multicore architectures. Additionally, optimizing NoC architectures with respect to the design objectives of a particular application domain is crucial for achieving high-performance and energy-efficient customized solutions. Although many studies have provided solutions for different aspects of NoC design, a comprehensive NoC system solution has not yet emerged. This paper presents a novel methodology for solving complex on-chip communication problems while reducing power, latency and area overhead. Our proposed NoC communication architecture is based on setting up virtual source–destination paths between selected pairs of NoC cores, so that packets travelling between distant nodes in the network can bypass intermediate routers along these virtual paths. In this scheme, the paths are constructed for an application at design time, based on its task graph. After that, a run-time scheduling mechanism is applied to improve the buffer management, virtual channel and switch allocation schemes, so that the constructed paths are optimized dynamically. Moreover, in our design the router complexity and its overheads are reduced. Additionally, the suggested router has been implemented on the Xilinx Virtex-5 FPGA family. The evaluation results obtained with the SPLASH-2 benchmark suite reveal that, in comparison with a conventional NoC router, the proposed router achieves 25% and 53% reductions in latency and energy, respectively, at the cost of a 3.5% area overhead. Indeed, our experimental results demonstrate a significant reduction in the average packet latency and total power consumption with negligible area overhead.

14.

"It's a small world after all": NoC performance optimization via long-range link insertion

Ogras U.Y. Marculescu R. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(7):693-706

Networks-on-chip (NoCs) represent a promising solution to complex on-chip communication problems. The NoC communication architectures considered so far are based on either completely regular or fully customized topologies. In this paper, we present a methodology to automatically synthesize an architecture which is neither regular nor fully customized. Instead, the communication architecture we propose is a superposition of a few long-range links and a standard mesh network. The few application-specific long-range links we insert significantly increase the critical traffic workload at which the network transitions from a free to a congested state. This way, we can exploit the benefits offered by both complete regularity and partial topology customization. Indeed, our experimental results demonstrate a significant reduction in the average packet latency and a major improvement in the achievable network throughput with minimal impact on network topology.

15.

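Research on System-on-Chip Testing Based on Network-on-Chip

Jing Yuanli (荆元利) Fan Xiaoya (樊晓桠) Zhang Shengbing (张盛兵) Gao Deyuan (高德远) Zhou Xiping (周昔平) 《微电子学与计算机》(Microelectronics & Computer) 2004,21(6):154-159

This article presents the principles and examples of testing systems-on-chip by means of a network-on-chip, a new design approach. It first discusses the various test challenges facing future systems-on-chip and proposes a solution based on the network-on-chip structure. Next, building on the OSI network stack reference model, it proposes a test-oriented network-on-chip protocol stack and the corresponding test services. Finally, it introduces a modular test method based on the network-on-chip.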

16.

Timing analysis of network on chip architectures for MP-SoC platforms (Total citations: 1; self-citations: 0; citations by others: 1)

Cristian Grecu André Ivanov 《Microelectronics Journal》2005,36(9):833-845

Recently, the use of multiprocessor system-on-chip (MP-SoC) platforms has emerged as an important integrated circuit design trend for high-performance computing applications. As the number of reusable intellectual property (IP) blocks on such platforms continues to increase, many have argued that monolithic bus-based interconnect architectures will not be able to support the clock cycle requirements of these leading-edge SoCs. While hierarchical system integration using multiple smaller buses connected through repeaters or bridges offers possible solutions, such approaches tend to be ad hoc in nature, and therefore lack generality and scalability. Instead, many different forms of network on chip (NoC) architectures have been proposed in the past few years to specifically address this problem. We believe that the NoC approach will ultimately be the preferred communication fabric for next generation designs. To support this conjecture, this paper demonstrates, through detailed circuit design and timing analysis, that different proposed NoC architectures to date are guaranteed to achieve the minimum possible clock cycle times in a given CMOS technology, usually specified in normalized units as 10-15 FO4 delays. This is contrasted with the bus-based approach, which may require several design iterations to deliver the same performance when the number of IP blocks connected to the bus exceeds certain limits.

17.

Optimization of Driver Preemphasis for On-Chip Interconnects

Yun Bai Wong S.S. 《IEEE transactions on circuits and systems. I, Regular papers》2009,56(9):2033-2041

In modern digital systems, on-chip interconnects have become the system bottleneck, limiting the performance of high-speed clock distributions and data communications in terms of speed and power dissipation. An inverse signaling analysis is developed to optimize the driving signal waveforms for lossy interconnects. By specifying the performance parameters, i.e., the signal swing and edge rate of the interconnect output signal, the corresponding input signals can be derived analytically. The result can be used to guide and optimize the design of interconnect preemphasis drivers. Numerical examples are shown for both lossy RC and RLC distributed lines. Analysis shows that optimized driving voltage and current can increase the interconnect bandwidth without voltage overshoot at the output. The significance of an interconnect inductance is also evaluated with this technique.

18.

Throughput-Oriented NoC Topology Generation and Analysis for High Performance SoCs

Dumitriu V. Khan G.N. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2009,17(10):1433-1446

This paper presents a new approach to the design and analysis of NoC topologies which is based on the transaction-oriented communication methods of on-chip components. We propose two algorithms that attempt to meet the communication requirement of an on-chip application using a minimum number of network resources for the task, by generating application-specific topologies. In addition, to aid the design process of complex systems, the design method incorporates a form of predictive analysis which can estimate the degree of contention in a given system without performing detailed simulation. This predictive analysis method is used to determine the minimum frequency of operation for generated topologies, and is incorporated into the topology generation process. The proposed design method was tested using real-world applications, including an MPEG4 decoder and a multi-window display application. The generated topologies were found to offer similar or better performance when compared with regular topologies. However, the topologies generated by our method were more economical, using, on average, half the network resources of regular topologies.

19.

Layout-driven architecture synthesis for high-speed digital filters

Dongku Kang Choo H. Muhammad K. Roy K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(2):203-207

We propose a floorplan-aware complexity reduction methodology for digital filters. Conventional methodologies for complexity reduction use logic-centric approaches focusing on the total number of adders. Therefore, there is a need to consider interconnects to reduce communication costs while synthesizing reduced-complexity filters. In this paper, we integrate high-level synthesis and floorplan to obtain improvement in both computational complexity and interconnect delay. In our experiments, we could achieve 15% improvement in critical-path delay over conventional methodologies.

20.

Design of an Interconnect Architecture and Signaling Technology for Parallelism in Communication

Jongsun Kim Verbauwhede I. Chang M.-C.F. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(8):881-894

The need for efficient interconnect architectures beyond the conventional time-division multiplexing (TDM) protocol-based interconnects has been brought on by the continued increase of required communication bandwidth and concurrency of small-scale digital systems. To improve the overall system performance without increasing communication resources and complexity, this paper presents a cost-effective interconnect architecture, communication protocol, and signaling technology that exploits parallelism in board-level communication, resulting in shorter latency and higher concurrency on a shared bus or link: the proposed source synchronous CDMA interconnect (SSCDMA-I) enables dual concurrent transactions on a single wire line as well as flexible input/output (I/O) reconfiguration. The SSCDMA-I utilizes 2-bit orthogonal CDMA coding and a variation of source synchronous clocking for multilevel superposition; a single 3-level SSCDMA-I line operates as if it consists of dual virtual time-multiplexed interconnects, which exploits communication parallelism with a reduced number of pins, wires, and complexity. The unique multiple access capability of the SSCDMA-I improves real-time communication between multiple semiconductor intellectual property (IP) blocks on a shared link or bus by reducing the bus contention interference from simultaneous traffic requests and by taking advantage of shorter request latency. The prototype transceiver chip is implemented in 0.18-μm CMOS and the 10-cm test PC board system achieves an aggregate data rate of 2.5 Gb/s/pin between four off-chip (2Tx-to-2Rx) I/Os.