1.

Placement and routing tools for the Triptych FPGA

Ebeling C. McMurchie L. Hauck S.A. Burns S. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1995,3(4):473-482

Field-programmable gate arrays (FPGAs) are becoming an increasingly important implementation medium for digital logic. One of the most important keys to using FPGAs effectively is a complete, automated software system for mapping onto the FPGA architecture. Unfortunately, many of the tools necessary require different techniques than traditional circuit implementation options, and these techniques are often developed specifically for only a single FPGA architecture. In this paper we describe automatic mapping tools for Triptych, an FPGA architecture with improved logic density and performance over commercial FPGAs. These tools include a simulated-annealing placement algorithm that handles the routability issues of fine-grained FPGAs, and an architecture-adaptive routing algorithm that can easily be retargeted to other FPGAs. We also describe extensions to these algorithms for mapping asynchronous circuits to Montage, the first FPGA architecture to completely support asynchronous and synchronous interface applications.
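
As a rough illustration of the simulated-annealing placement idea mentioned in this abstract, the sketch below places cells on a rectangular grid and minimizes half-perimeter wirelength. It is not the Triptych tool: the cost function (wirelength only, with no routability term), the swap-only move set, the cooling schedule, and the names `hpwl` and `anneal_placement` are all assumptions chosen for brevity.

```python
import math
import random


def hpwl(nets, pos):
    """Half-perimeter wirelength: sum of bounding-box half-perimeters over all nets."""
    total = 0
    for net in nets:
        xs = [pos[c][0] for c in net]
        ys = [pos[c][1] for c in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def anneal_placement(cells, nets, width, height, seed=0,
                     t_start=10.0, t_end=0.01, alpha=0.9, moves_per_temp=200):
    """Place cells on a width x height grid by simulated annealing (hypothetical helper).

    cells is a list of at least two cell names; nets is a list of tuples of cell names.
    Returns the best placement found and its wirelength.
    """
    rng = random.Random(seed)
    sites = [(x, y) for x in range(width) for y in range(height)]
    assert len(cells) <= len(sites), "grid too small for the cell list"
    rng.shuffle(sites)
    pos = {c: sites[i] for i, c in enumerate(cells)}  # random initial placement
    cost = hpwl(nets, pos)
    best_pos, best_cost = dict(pos), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            a, b = rng.sample(cells, 2)               # propose swapping two cells
            pos[a], pos[b] = pos[b], pos[a]
            new_cost = hpwl(nets, pos)
            delta = new_cost - cost
            # Always accept improvements; accept uphill moves with probability exp(-delta/t).
            if delta <= 0 or rng.random() < math.exp(-delta / t):
                cost = new_cost
                if cost < best_cost:
                    best_pos, best_cost = dict(pos), cost
            else:
                pos[a], pos[b] = pos[b], pos[a]       # reject: undo the swap
        t *= alpha                                    # geometric cooling
    return best_pos, best_cost
```

A call such as `anneal_placement(["a", "b", "c", "d"], [("a", "b"), ("b", "c"), ("c", "d")], 2, 2)` returns a placement dictionary and its wirelength. A placer aimed at fine-grained FPGAs would presumably add a routability or congestion term to the cost, which this sketch omits.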

2.

Architecture of field-programmable gate arrays (total citations: 8; self-citations: 0; citations by others: 8)

Rose J. El Gamal A. Sangiovanni-Vincentelli A. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1993,81(7):1013-1029

A survey of field-programmable gate array (FPGA) architectures and the programming technologies used to customize them is presented. Programming technologies are compared on the basis of their volatility, size, parasitic capacitance, resistance, and process technology complexity. FPGA architectures are divided into two constituents: logic block architectures and routing architectures. A classification of logic blocks based on their granularity is proposed, and several logic blocks used in commercially available FPGAs are described. A brief review of recent results on the effect of logic block granularity on logic density and performance of an FPGA is then presented. Several commercial routing architectures are described in the context of a general routing architecture model. Finally, recent results on the tradeoff between the flexibility of an FPGA routing architecture, its routability, and its density are reviewed.

3.

The effect of logic block architecture on FPGA performance

Singh S. Rose J. Chow P. Lewis D. 《Solid-State Circuits, IEEE Journal of》1992,27(3):281-287

The authors explore the effect of logic block architecture on the speed of a field-programmable gate array (FPGA). Four classes of logic block architecture are investigated: NAND gates, multiplexer configurations, lookup tables, and wide-input AND-OR gates. An experimental approach is taken, in which each of a set of benchmark logic circuits is synthesized into FPGAs that use different logic blocks. The speed of the resulting FPGA implementations using each logic block is measured. While the results depend on the delay of the programmable routing, experiments indicate that five- and six-input lookup tables and certain multiplexer configurations produce the lowest total delay over realistic values of routing delay. The fine-grain blocks, such as the two-input NAND gate, exhibit poor performance because these gates require many levels of logic block to implement the circuits and hence incur a large routing delay.

4.

Using bus-based connections to improve field-programmable gate-array density for implementing datapath circuits

Ye A. Rose J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2006,14(5):462-473

As the logic capacity of field-programmable gate arrays (FPGAs) increases, they are increasingly being used to implement large arithmetic-intensive applications, which often contain a large proportion of datapath circuits. Since datapath circuits usually consist of regularly structured components (called bit-slices) which are connected together by regularly structured signals (called buses), it is possible to utilize datapath regularity in order to achieve significant area savings through FPGA architectural innovations. This paper describes such an FPGA routing architecture, called the multibit routing architecture, which employs bus-based connections in order to exploit datapath regularity. It is experimentally shown that, compared to conventional FPGA routing architectures, the multibit routing architecture can achieve 14% routing area reduction for implementing datapath circuits, which represents an overall FPGA area savings of 10%. This paper also empirically determines the best values of several important architectural parameters for the new routing architecture including the most area efficient granularity values and the most area efficient proportion of bus-based connections.
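
The two percentages quoted in this abstract can be related by a short back-of-the-envelope calculation. It assumes, as a simplification not stated in the abstract, that the savings come only from routing area and that logic area is unchanged; here $f$ denotes the fraction of total FPGA area occupied by routing in the baseline architecture.

```latex
\[
  \underbrace{0.14\,f}_{\text{overall savings}} = 0.10
  \quad\Longrightarrow\quad
  f = \frac{0.10}{0.14} \approx 0.71
\]
```

Under that assumption, the reported figures would correspond to routing occupying roughly 71% of the baseline FPGA area.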

5.

Routability of Network Topologies in FPGAs

Saldana M. Shannon L. Jia Shuo Yue Sikang Bian Craig J. Chow P. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(8):948-951

A fundamental difference between application-specific integrated circuits (ASICs) and field-programmable gate arrays (FPGAs) is that the wires in ASICs are designed to match the requirements of a particular design. Conversely, in an FPGA, the area is fixed and the routing resources exist whether or not they are used. In this paper, we investigate how well several common network topologies map onto a modern FPGA routing fabric. Different multiprocessor network topologies with between 8 and 64 nodes are mapped to a single large FPGA. Except for the fully-connected networks, it is observed that the difference in logic resources used and routing overhead among these topologies is insignificant for the systems tested. Fully-connected networks up to about 22 nodes are also feasible on the same FPGA although the logic and routing utilization clearly grows much faster. The conclusion is that a modern FPGA fabric is very rich in resources and capable of supporting highly interconnected topologies. For systems with a modest number of nodes implemented on current large FPGAs, it is not necessary to use the connectivity-limited topologies typically used for networks-on-chip. Rather, direct point-to-point connections between all communicating nodes can be considered.
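
To see why the fully-connected case grows much faster than the others, it can help to count point-to-point links as a function of node count. The sketch below does this for three textbook topologies; the abstract names only the fully-connected network, so the ring and star entries, and the helper name `link_counts`, are assumptions added for comparison.

```python
def link_counts(n):
    """Bidirectional link counts for n nodes (n >= 3) under three simple topologies."""
    return {
        "ring": n,                            # each node linked to its successor
        "star": n - 1,                        # every node linked to one hub
        "fully_connected": n * (n - 1) // 2,  # one link per unordered node pair
    }


if __name__ == "__main__":
    for n in (8, 22, 64):
        print(n, link_counts(n))
```

The ring and star counts grow linearly with the number of nodes, while the fully-connected count grows quadratically (28 links at 8 nodes, 231 at 22, and 2016 at 64).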

6.

The design of an SRAM-based field-programmable gate array. I. Architecture

Chow P. Soon Ong Seo Rose J. Chung K. Paez-Monzon G. Rahardja I. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1999,7(2):191-197

7.

Stateful-NOR based reconfigurable architecture for logic implementation

《Microelectronics Journal》2015,46(6):551-562

Most commercial Field Programmable Gate Arrays (FPGAs) have limitations in terms of density, speed, configuration overhead and power consumption, mostly due to the use of SRAM cells in Look-Up Tables (LUTs), configuration memory and programmable interconnects. Also, hardwired Application Specific Integrated Circuit (ASIC) blocks designed for high performance arithmetic circuits in FPGAs reduce the area available for reconfiguration. In this paper, we propose a novel generalized hybrid CMOS-memristor based architecture using stateful-NOR gates as basic building blocks for implementation of logic functions. These logic functions are implemented on memristor nanocrossbar layers, while the CMOS layer is used for selection and connection of memristors. The proposed pipelined architecture combines the features of ASIC, FPGA and microprocessor based designs. It has high density due to the use of the nanocrossbar layer and high throughput, especially for arithmetic circuits. The proposed architecture for a three-input, one-output logic block is compared with a conventional LUT-based Configurable Logic Block (CLB) having the same number of inputs and outputs; the comparison shows a 1.82× area saving, a 1.57× speedup and 3.63× less power consumption. The automation algorithm to implement any logic function using the proposed architecture is also presented.

8.

Hybrid Multi-FPGA Board Evaluation by Permitting Limited Multi-Hop Routing

Sushil Chandra Jain Anshul Kumar Shashi Kumar 《Design Automation for Embedded Systems》2003,8(4):309-326

Multi-FPGA Boards (MFBs) have been in use for more than a decade for implementing systems requiring high performance and for emulation/prototyping of multimillion gate chips. It is important to develop an MFB architecture which can be used for emulation or prototyping of a large number of circuits. A key feature of an MFB is its routing architecture defined by its inter-Field-Programmable Gate Array (FPGA) connections. There are two types of inter-FPGA connections, namely fixed connections (FCs), which connect a pair of FPGAs through dedicated wires, and programmable connections (PCs), which connect a pair of FPGAs through a programmable switch. An architecture which has a mix of both these types of connections is called a hybrid routing architecture. It has been shown in the literature [7] that a hybrid MFB architecture is more efficient for emulation than an architecture with only one type of connection. The cost of an MFB and the delay of the emulated circuit on it depend on the number of PCs used for emulation. An objective of a designer of an MFB for circuit emulation is to minimize the required number of PCs. In this paper, we describe algorithms to evaluate the requirement of PCs for many hybrid routing architectures. The requirement of PCs can be reduced if some programmable connections are replaced by a connection using only FCs by routing through FPGAs. Such a routing is called multi-hop routing. We present an optimal and a heuristic algorithm for estimation of PCs when a limited number of hops through FPGAs are permitted. The unique feature of our evaluation scheme is that it is generic and treats the routing architecture as a parameter. We have used benchmark circuits as well as synthetic cloned circuits for testing our algorithms. Our heuristic algorithm is very fast and gives optimal results most of the time. Our algorithms can be used for actual routing during circuit emulation.

9.

A 12-Gb/s DEMUX Implemented With SiGe High-Speed FPGA Circuits

Chao You Jong-Ru Guo Kraft R.P. Chu M. Goda B. McDonald J.F. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(9):1051-1054

A 7-12-Gb/s demultiplexer implemented with circuits for a high-speed field-programmable gate array (FPGA) is introduced in this paper. Since the first FPGA was released by Xilinx in 1985, FPGAs have become denser and more powerful. The first FPGA that operates in the microwave range was designed in 2000. Various methods, such as a new basic cell structure and multimode routing, are used to make that design faster and less power consuming. Sequential logic functions are analyzed and tested in this paper with a DEMUX implementation using these high-speed FPGA circuits. A chip measurement has shown that the FPGA can operate at a 12-GHz system clock when configured to perform sequential logic. A DEMUX that operates at 12 Gb/s is used here to demonstrate the potential for high-performance and low-power FPGA features.

10.

The memory/logic interface in FPGAs with large embedded memory arrays

Wilton S.J.E. Rose J. Vranesic Z.G. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1999,7(1):80-91

As the capacities of field-programmable gate arrays (FPGAs) grow, they will be used to implement much larger circuits than ever before. These larger circuits often require significant amounts of storage. In order to address these storage requirements, FPGAs with large embedded memory arrays are now being developed by several vendors. One of the crucial components of an FPGA with on-chip memory is the routing structure between the memory arrays and logic resources. If this memory/logic interface is not flexible enough, many circuits will be unroutable, while if it is too flexible, it will be slower and consume more chip area than is necessary. In this paper, we show that an interconnect in which each memory pin can connect to between four and seven logic routing tracks is best in terms of both area and speed. We also show that by adding switches to support nets that connect multiple memory arrays, we can reduce the memory access time by up to 25% and improve the routability slightly.

11.

Application of nanojunction-based RRAM to reconfigurable IC

Liu M. Wang W. 《Micro & Nano Letters, IET》2008,3(3):101-105

A novel reconfigurable architecture, rFPGA, is developed by utilising high-density resistive memory (RRAM) circuits as FPGA components. Different from the existing CMOS-nano hybrid FPGAs that use crossbars, the rFPGA mainly consists of 1T1R RRAM structures (one CMOS transistor is integrated with a two-terminal resistive nanojunction) that can be fabricated using an efficient CMOS-compatible process. These 1T1R structures can significantly improve the FPGA memory and routing circuits, and enable the rFPGA to achieve at least a 2x density enhancement along with a 10% reduction of delay and power, compared with the corresponding CMOS FPGA.

12.

A routing algorithm for FPGAs with time-multiplexed interconnects

Ruiqi Luo Xiaolei Chen Yajun Ha 《Journal of Semiconductors (半导体学报)》2020,(2):73-82

Previous studies show that interconnects occupy a large portion of the timing budget and area in FPGAs. In this work, we propose a time-multiplexing technique on FPGA interconnects. In order to fully exploit this interconnect architecture, we propose a time-multiplexed routing algorithm that can actively identify qualified nets and schedule them to multiplexable wires. We validate the algorithm by using the router to implement 20 benchmark circuits on time-multiplexed FPGAs. We achieve a 38% smaller minimum channel width and a 3.8% smaller circuit critical path delay compared with the state-of-the-art architecture router when a wire can be time-multiplexed six times in a cycle.

13.

An efficient logic emulation system

Varghese J. Butts M. Batcheller J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》1993,1(2):171-174

The Realizer, a logic emulation system that automatically configures a network of field-programmable gate arrays (FPGAs) to implement large digital logic designs, is presented. Logic and interconnect are separated to achieve optimum FPGA utilization. Its interconnection architecture, called the partial crossbar, greatly reduces system-level placement and routing complexity, achieves bounded interconnect delay, scales linearly with pin count, and allows hierarchical expansion to systems with hundreds of thousands of FPGA devices in a fast and uniform way. An actual multiboard system has been built, using 42 Xilinx XC3090 FPGAs for logic. Several designs, including a 32-b CPU datapath, have been automatically realized and operated at speed. They demonstrate very good FPGA utilization. The Realizer has applications in logic verification and prototyping, simulation, architecture development, and special-purpose execution.

14.

Optimised bit serial modular multiplier for implementation on field-programmable gate arrays

Marnane W.P. 《Electronics letters》1998,34(8):738-739

A high-speed architecture for bit serial modular multiplication is presented. The design of this array is highly regular, allowing the specific logic and routing resources available in field programmable gate arrays (FPGAs) to be exploited. Furthermore, an optimised array is presented which exploits the reprogrammability of the FPGA, such that a longer bit length can be implemented on the same FPGA.

15.

Sharing of SRAM Tables Among NPN-Equivalent LUTs in SRAM-Based FPGAs

Meyer J. Kocan F. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(2):182-195

This article introduces a novel lookup table (LUT) and its usage in the configurable logic block (CLB) architectures for SRAM-based field-programmable gate array (FPGA) architectures. The proposed CLB allows sharing of SRAM tables of LUTs among NPN-equivalent functions to reduce the size of memories used for storing the functions and also reduces the number of configuration bits required. We measured many different characteristics of FPGAs using our new CLB architecture, including area, delay, routing, and power requirements. We experimentally found that for many different FPGA architectures, CLBs can share one-fourth of their SRAM tables between two basic logic elements (BLEs), which reduced both power consumption and area without negatively affecting routing or wirelength, and there was only a negligible increase in critical path delay of 0.27%. Specifically, we find that FPGAs consisting of CLBs with 16 BLEs and 34 inputs can be implemented with eight normal SRAMs and four SRAMs shared between two BLEs, for an overall reduction of four out of sixteen SRAM tables per CLB. With this new CLB architecture, we measured an approximate reduction in overall power consumption of 2% and an estimated reduction in area of 3%.
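
Two Boolean functions are NPN-equivalent when one can be obtained from the other by negating inputs (N), permuting inputs (P), and negating the output (N). The brute-force sketch below, which is only an illustration and not the method of the article, computes a canonical representative for small LUT truth tables so that two functions can be compared for NPN equivalence. The encoding (bit `m` of the integer `tt` holds the output for input assignment `m`) and the name `npn_canonical` are assumptions.

```python
from itertools import permutations


def npn_canonical(tt, k):
    """Smallest truth table (as an int) reachable from tt by permuting inputs,
    negating inputs, and negating the output. Exhaustive, so only for small k."""
    size = 1 << k                      # number of input assignments
    full = (1 << size) - 1             # mask used to complement the output column
    best = None
    for perm in permutations(range(k)):
        for neg in range(size):        # bit i of neg set => input i is inverted
            out = 0
            for m in range(size):      # m is an input assignment
                x = m ^ neg            # apply the input negations
                src = 0
                for i in range(k):     # apply the input permutation
                    if (x >> i) & 1:
                        src |= 1 << perm[i]
                if (tt >> src) & 1:
                    out |= 1 << m
            for cand in (out, out ^ full):   # with and without output negation
                if best is None or cand < best:
                    best = cand
    return best


def npn_equivalent(tt_a, tt_b, k):
    """True when the two k-input truth tables fall in the same NPN class."""
    return npn_canonical(tt_a, k) == npn_canonical(tt_b, k)
```

For example, with two inputs, AND (`0b1000`), OR (`0b1110`), NAND (`0b0111`) and NOR (`0b0001`) all share one canonical form, while XOR (`0b0110`) falls in a different class. Functions in the same class are the ones that could, in principle, share a stored table given suitable inversion and permutation logic around it.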

16.

A novel and efficient routing architecture for multi-FPGA systems

Khalid M.A.S. Rose J. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2000,8(1):30-39

Multi-FPGA systems (MFSs) are used as custom computing machines, logic emulators and rapid prototyping vehicles. A key aspect of these systems is their programmable routing architecture, which is the manner in which wires, FPGAs and field-programmable interconnect devices (FPIDs) are connected. Several routing architectures for MFSs have been proposed, and previous research has shown that the partial crossbar is one of the best existing architectures. In this paper, we propose a new routing architecture, called the hybrid complete-graph and partial-crossbar (HCGP), which has superior speed and cost compared to a partial crossbar. The new architecture uses both hard-wired and programmable connections between the FPGAs. We compare the performance and cost of the HCGP and partial crossbar architectures experimentally, by mapping a set of 15 large benchmark circuits into each architecture. A customized set of partitioning and interchip routing tools was developed, with particular attention paid to architecture-appropriate interchip routing algorithms. We show that the cost of the partial crossbar (as measured by the number of pins on all FPGAs and FPIDs required to fit a design) is on average 20% more than the new HCGP architecture and as much as 25% more. Furthermore, the critical path delay for designs implemented on the partial crossbar was on average 20% more than on the HCGP architecture and up to 43% more. Using our experimental approach, we also explore a key architecture parameter associated with the HCGP architecture, the proportion of hard-wired connections versus programmable connections, to determine its best value.

17.

PITIA: an FPGA for throughput-intensive applications

Singh A. Mukherjee A. Macchiarulo L. Marek-Sadowska M. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2003,11(3):354-363

In this paper, we present a novel, high-throughput field-programmable gate array (FPGA) architecture, PITIA, which combines the high performance of application-specific integrated circuits (ASICs) and the flexibility afforded by the reconfigurability of FPGAs. The new architecture, which targets datapath circuits, uses the concepts of wave steering and pipelined interconnects. We discuss the FPGA architecture and show results for performance, power consumption, clock network performance, and routability. Results for some commonly used datapath designs are encouraging, with throughputs in the neighborhood of 625 MHz in 0.25-μm 2.5-V CMOS technology. Results for random benchmark circuits are also shown. We characterize designs according to their Rent's exponents and argue that designs with predominantly local interconnects are the best fit in PITIA. We also show that as technology scales down toward deep submicron, PITIA shows an increasing throughput performance.

18.

Rapid Synthesis and Simulation of Computational Circuits in an MPPA

David Grant Graeme Smecher Guy G. F. Lemieux Rosemary Francis 《Journal of Signal Processing Systems》2012,67(1):47-63

A computational circuit is custom-designed hardware which promises to offer maximum speedup of computationally intensive software algorithms. However, the practical need to manage development cost and many low-level physical design details erodes much of the potential speedup by distracting attention away from high-level architectural design. Instead, designers need an inexpensive, processor-like platform where computational circuits can be rapidly synthesized and simulated. This enables rapid architectural evolution and mitigates the risk of producing custom hardware. In this paper we present a tool flow (RVETool) for compiling computational circuits into a massively parallel processor array (MPPA). We demonstrate that the CAD runtime is on average 70× faster than FPGA tools, with a circuit speed 5.8× slower than FPGA devices. Unlike the fixed logic capacity of FPGAs, RVETool can trade area for simulation performance by targeting a wide range in the number of processor cores. We also demonstrate tool scalability to very large circuits, synthesizing, placing, and routing a ≈1.6 million gate random circuit in 54 min.

19.

A Low-Power Field-Programmable Gate Array Routing Fabric

Mingjie Lin El Gamal A. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2009,17(10):1481-1494

This paper describes a new programmable routing fabric for field-programmable gate arrays (FPGAs). Our results show that an FPGA using this fabric can achieve 1.57 times lower dynamic power consumption and 1.35 times lower average net delays with only 9% reduction in logic density over a baseline island-style FPGA implemented in the same 65-nm CMOS technology. These improvements in power and delay are achieved by 1) using only short interconnect segments to reduce routed net lengths, and 2) reducing interconnect segment loading due to programming overhead relative to the baseline FPGA without compromising routability. The new routing fabric is also well-suited to monolithically stacked 3-D-IC implementation. It is shown that a 3-D-FPGA using this fabric can achieve a 3.3 times improvement in logic density, a 2.51 times improvement in delay, and a 2.93 times improvement in dynamic power consumption over the same baseline 2-D-FPGA.

20.

Review of advanced FPGA architectures and technologies

Yang Haigang Zhang Jia Sun Jiabin Yu Le 《Journal of Electronics (China) (电子科学学刊(英文版))》2014,31(5):371-393

The Field Programmable Gate Array (FPGA) is an efficient reconfigurable integrated circuit platform and has become a core signal processing microchip device of digital systems over the last decade. With the rapid development of semiconductor technology, the performance and system integration of FPGA devices have progressed significantly, and at the same time new challenges arise. The design of FPGA architecture is required to evolve to meet these challenges, while also taking advantage of ever-increasing microchip density. This survey reviews the recent development of advanced FPGA architectures, including improvement of the programming technologies, logic blocks, interconnects, and embedded resources. Moreover, some important emerging design issues of FPGA architectures, such as novel memory-based FPGAs and 3D FPGAs, are also presented to provide an outlook for future FPGA development.