首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 515 毫秒
1.
2.
Future mobile and wireless communication networks require flexible modem architectures to support seamless services between different network standards. Hence, a common hardware platform that can support multiple protocols implemented or controlled by software, generally referred to as software defined radio (SDR), is essential. This paper presents a family of dynamically reconfigurable application-specific instruction-set processors (ASIPs) for channel coding in wireless communication systems. As a weakly programmable intellectual property (IP) core, it can implement trellis-based channel decoding in a SDR environment. It features binary convolutional decoding, and turbo decoding for binary as well as duobinary turbo codes for all current and upcoming standards. The ASIP consists of a specialized pipeline with 15 stages and a dedicated communication and memory infrastructure. Logic synthesis revealed a maximum clock frequency of 400 MHz and an area of 0.11 mm$^{2}$ for the processor's logic using a low power 65-nm technology. Memories require another 0.31 mm$^{2}$ . Simulation results for Viterbi and turbo decoding demonstrate maximum throughput of 196 and 34 Mb/s, respectively. The ASIP hence outperforms state-of-the-art decoder architectures targeting software defined radio by at least a factor of three while consuming only 60% or less of the logic area.   相似文献   

3.
Application-specific processors offer an attractive option in the design of embedded systems by providing high performance for a specific application domain. In this work, we describe the use of a reconfigurable processor core based on an RISC architecture as starting point for application-specific processor design. By using a common base instruction set, development cost can be reduced and design space exploration is focused on the application-specific aspects of performance. An important aspect of deploying any new architecture is verification which usually requires lengthy software simulation of a design model. We show how hardware emulation based on programmable logic can be integrated into the hardware/software codesign flow. While previously hardware emulation required massive investment in design effort and special purpose emulators, an emulation approach based on high-density field-programmable gate array (FPGA) devices now makes hardware emulation practical and cost effective for embedded processor designs. To reduce development cost and avoid duplication of design effort, FPGA prototypes and ASIC implementations are derived from a common source: We show how to perform targeted optimizations to fully exploit the capabilities of the target technology while maintaining a common source base  相似文献   

4.
Reconfigurable Computing for Digital Signal Processing: A Survey   总被引:6,自引:0,他引:6  
Steady advances in VLSI technology and design tools have extensively expanded the application domain of digital signal processing over the past decade. While application-specific integrated circuits (ASICs) and programmable digital signal processors (PDSPs) remain the implementation mechanisms of choice for many DSP applications, increasingly new system implementations based on reconfigurable computing are being considered. These flexible platforms, which offer the functional efficiency of hardware and the programmability of software, are quickly maturing as the logic capacity of programmable devices follows Moore's Law and advanced automated design techniques become available. As initial reconfigurable technologies have emerged, new academic and commercial efforts have been initiated to support power optimization, cost reduction, and enhanced run-time performance.This paper presents a survey of academic research and commercial development in reconfigurable computing for DSP systems over the past fifteen years. This work is placed in the context of other available DSP implementation media including ASICs and PDSPs to fully document the range of design choices available to system engineers. It is shown that while contemporary reconfigurable computing can be applied to a variety of DSP applications including video, audio, speech, and control, much work remains to realize its full potential. While individual implementations of PDSP, ASIC, and reconfigurable resources each offer distinct advantages, it is likely that integrated combinations of these technologies will provide more complete solutions.  相似文献   

5.
In this paper, dual antenna receiver architectures are studied including RAKE, chip-level linear equalizer, and their combination. The arithmetic complexity of single and dual antenna receiver methods is analyzed. Cost of such receivers when implemented with customized hardware or software on application-specific instruction set processors (ASIP) is estimated. The study shows that feasible dual antenna detection can be obtained with less than 70% additional costs. More flexible implementation supporting several standards can be obtained with software but it requires higher power consumption due to additional memory.  相似文献   

6.
7.
8.
Scientific application kernels mapped to reconfigurable hardware have been reported to have 10times to 100times speedup over equivalent software. These promising results suggest that reconfigurable logic might offer significant speedup on applications in science and engineering. To accurately assess the benefit of hardware acceleration on scientific applications, however, it is necessary to consider the entire application including software components as well as the accelerated kernels. Aspects to be considered include alternative methods of hardware/software partitioning, communications costs, and opportunities for concurrent computation between software and hardware. Analysis of these factors is beyond the scope of current automatic parallelizing compilers. In this paper, a case study is presented in which a simulation of metropolitan road traffic networks is mapped onto a reconfigurable supercomputer, the Cray XD1. Five different methods are presented for mapping the application onto the combined hardware/software system. An approach for approximating the performance of each method is derived through analytic equations. Our results, both analytically and empirically, show that key predictors of performance (which are often not considered in reported speedup of kernel operations) are not necessarily maximum parallelism, but must account for the fraction of the problem that runs on the reconfigurable logic and the amount data flow between software and hardware.  相似文献   

9.
In this paper, we propose a methodology for accelerating application segments by partitioning them between reconfigurable hardware blocks of different granularity. Critical parts are speeded-up on the coarse-grain reconfigurable hardware for meeting the timing requirements of application code mapped on the reconfigurable logic. The reconfigurable processing units are embedded in a generic hybrid system architecture which can model a large number of existing heterogeneous reconfigurable platforms. The fine-grain reconfigurable logic is realized by an FPGA unit, while the coarse-grain reconfigurable hardware by our developed high-performance data-path. The methodology mainly consists of three stages; the analysis, the mapping of the application parts onto fine and coarse-grain reconfigurable hardware, and the partitioning engine. A prototype software framework realizes the partitioning flow. In this work, the methodology is validated using five real-life applications. Analytical partitioning experiments show that the speedup relative to the all-FPGA mapping solution ranges from 1.5 to 4.0, while the specified timing constraints are satisfied for all the applications.  相似文献   

10.
Coarse-grained reconfigurable arrays (CGRAs) have shown potential for application in embedded systems in recent years. Numerous reconfigurable processing elements (PEs) in CGRAs provide flexibility while maintaining high performance by exploring different levels of parallelism. However, a difference remains between the CGRA and the application-specific integrated circuit (ASIC). Some application domains, such as software-defined radios (SDRs), require flexibility with performance demand increases. More effective CGRA architectures are expected to be developed. Customisation of a CGRA according to its application can improve performance and efficiency. This study proposes an application-specific CGRA architecture template composed of generic PEs (GPEs) and special PEs (SPEs). The hardware of the SPE can be customised to accelerate specific computational patterns. An automatic design methodology that includes pattern identification and application-specific function unit generation is also presented. A mapping algorithm based on ant colony optimisation is provided. Experimental results on the SDR target domain show that compared with other ordinary and application-specific reconfigurable architectures, the CGRA generated by the proposed method performs more efficiently for given applications.  相似文献   

11.
High-performance, reliable, and robust products with a short development schedule are general design aims. FACE was developed to achieve these goals, including the organization of a design flow, a frequency-driven information analyzer, compiler techniques (code generator and instruction optimization), and a hierarchical object design library. This paper explores the design space of a retargetable compiler and a reconfigurable hardware, which combine both software and hardware reprogrammability. The environment, FACE, we have developed allows us to quickly move the functions between software and hardware in a state of flux. Finally, it generates the application specific integrated processor (ASIP) and a compiler for the new ASIP architecture. The case study is considered which demonstrates the efficiency in ASIP design of FACE.  相似文献   

12.
13.
14.
15.
16.
A synthesis environment that targets software programmable architectures such as digital signal processors (DSPs) is presented. These processors are well suited for implementation of real-time signal processing systems with medium throughput requirements. Techniques that tightly couple the synthesis environment to an existing communication system simulator are also presented. This enables a seamless transition between the simulation and implementation design level of communication systems. Special focus is on optimization techniques for mapping data flow oriented block diagrams onto DSPs. The combination of different mapping and optimization strategies allows comfortable synthesis of real-time code that is highly adapted to application-specific needs imposed by constraints on memory space, sampling rate, or latency. Thus, tradeoff analysis is supported by efficient interactive or automatic exploration of the design space. All presented concepts are illustrated by the design of a phase synchronizer with automatic gain control on a floating-point DSP  相似文献   

17.
18.
Run-time reconfigurable (RTR) systems are FPGA-based systems that reconfigure FPGAs during execution to alter hardware organization and composition to meet the varying needs of applications as they execute. These systems are difficult to describe with conventional tools (schematic capture, VHDL synthesis, etc.) because most tools assume that the underlying hardware organization is static. JHDL is a Java-based design environment capable of describing, netlisting, simulating and executing complex, dynamic RTR systems. Using conventional Java syntax, users describe hardware structures as objects; as these hardware-object constructors are invoked, JHDL automatically configures hardware circuits onto FPGA hardware, thus directly supporting the dynamic nature of RTR systems with standard language constructs. JHDL also supports codesign of the software and hardware parts of the system; in other words, the entire application can be described in a single piece of Java code that can be co-simulated/co-executed with the FPGA hardware. To date, RTR design with JHDL has focused on the development of automated target recognition (ATR) systems, and working systems described in JHDL have been demonstrated.  相似文献   

19.
An increasing number of standards in wireless communications have encouraged to study programmable processors as platforms for flexible receivers. A multiple-input multiple-output (MIMO) antenna system combined with orthogonal frequency division multiplexing (OFDM) technique has been introduced in many wireless communications standards, such as in the third generation long term evolution (3G LTE). The MIMO-OFDM system requires an efficient detector and a platform support for parallel processing of multiple subcarriers. A K-best list sphere detector (LSD) provides for near optimal decoding performance and a fixed throughput making it an interesting algorithm from the point of view of practical implementations.In this paper, we compare the implementations of the K-best LSD on four processor platforms: a digital signal processor (DSP), software defined radio (SDR), application-specific processor (ASP) and application-specific instruction-set processor (ASIP). The DSP is a popular very long instruction word (VLIW) device (TMS320C6455), the SDR processor employs multithreading and multiple cores (SB3500 core processor), the ASP is based on transport triggered architecture (TTA), while the ASIP is the SDR processor enhanced with a special instruction-set extension for sorting.A 2×2 MIMO antenna system with 64-quadrature amplitude modulation (64-QAM) is assumed. The chosen list sizes K=8 and 16 are based on simulation results carried out in MATLAB environment with the third generation long term evolution (3G LTE) parameters. The proposed ASIP achieved a promising throughput of 32.0 Mbps, where the software defined radio (SDR) implementation on the SB3500 core processor suffers from an inefficient software sorter. The ASP, in which the minimized hardware complexity has been the goal, achieves a throughput of 7.6 Mbps. However, more essential examination is related to the symbol time, which sets strict parallel processing requirements to the programmable processors.  相似文献   

20.
Although simulation remains an important part of application-specific integrated circuit (ASIC) validation, hardware-assisted parallel verification is becoming a larger part of the overall ASIC verification flow. In this paper, we describe and analyze a set of incremental compilation steps that can be directly applied to a range of parallel logic verification hardware, including logic emulators. Important aspects of this work include the formulation and analysis of two incremental design mapping steps: the partitioning of newly added design logic onto multiple logic processors and the communication scheduling of newly added design signals between logic processors. To validate our incremental compilation techniques, the developed mapping heuristics have been integrated into the compilation flow for a field-programmable gate-array-based Ikos VirtuaLogic emulator . The modified compiler has been applied to five large benchmark circuits that have been synthesized from register-transfer level and mapped to the emulator. It is shown that our incremental approach reduces verification compile time for modified designs by up to a factor of five versus complete design recompilation for benchmarks of over 100 000 gates. In most cases, verification run-time following incremental compilation of a modified design matches the performance achieved with complete design recompilation.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号