Similar Literature
Found 20 similar documents; search took 93 ms.
1.
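This paper describes hardware and software design techniques for track processing implemented on a single DSP chip. The design makes full use of the DSP chip's resources and reduces peripheral circuitry, which shrinks the system, lowers cost and power consumption, and improves reliability.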

2.
《电子与电脑》2010,(5):87-87
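Tensilica has released the third-generation ConnX 545CK, an 8-MAC (multiplier-accumulator) VLIW (very long instruction word) DSP (digital signal processor) core for system-on-chip (SoC) designs. The improved third-generation data processor (DPU) core runs 20% faster, occupies 11% less chip area, and consumes 30% less power.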

3.
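This paper proposes a processor architecture that unifies a DSP and a general-purpose CPU, and reports the design and tape-out verification of a homogeneous quad-core processor based on that architecture. The processor is built on a VLIW structure, supports a custom-defined DSP instruction set, is compatible with the existing general-purpose MIPS 4KC instruction set, and supports parallel issue on up to eight instruction lanes. Without changing the CPU's instruction encoding or execution order, it unifies DSP and CPU execution at the chip-structure level, which suits embedded applications that perform both signal and protocol processing for broadband communication and multimedia on a single platform. The processor core issues DSP and CPU instructions in parallel by means of leading and trailing parallel flag bits in the custom DSP instruction word together with a dedicated leading paralink instruction. On the homogeneous quad-core architecture, a global-read, local-write strategy is adopted for on-chip data storage across cores, enabling on-chip data sharing while keeping hardware overhead under control. Simulation and tape-out verification results show that the proposed unified DSP-CPU architecture is feasible and offers advantages in embedded applications such as broadband communication and multimedia.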

4.
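The address generation unit (AGU) is an important part of a DSP chip; by supporting multiple addressing modes, it improves instruction execution efficiency. This paper details the addressing modes and instruction encoding structure of an embedded DSP and, on that basis, designs an AGU for the DSP that supports several special addressing modes as well as three addressing operations in a single cycle. Finally, the AGU is optimized. Results show that the optimized AGU improves performance and power consumption while effectively reducing the execution cycles of digital signal processing algorithms.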

5.
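As communication products continue to be upgraded, the demands placed on DSP technology keep rising. Developers need DSPs with lower power consumption and higher performance for their designs. TI's TMS320C5510 is a DSP introduced to meet exactly this need. The C5510 is the first product to use the TMS320C55x DSP core. It has a dual-MAC structure with one 32-bit instruction bus, three 16-bit data read buses, two 16-bit data write buses, and five 24-bit address buses. Its two internal MAC units operate in parallel, and each can complete a 17x17-bit multiplication in a single cycle. The C5510 can therefore execute instructions faster and return quickly to standby or power-down state, which improves performance and lowers overall chip power consumption. The C5510 carries over the C54x DSP's high…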

6.
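Implementation of a Wireless Multimedia DSP Chip for Mobile Communications   Total citations: 2, self-citations: 2, citations by others: 0
This paper describes the implementation of a wireless multimedia DSP chip for mobile communications. The resulting WM DSP chip supports communication instructions for Viterbi, time synchronization, and similar tasks, as well as multimedia instructions. The DSP can process variable-length data and can execute four MACs in one cycle. It uses parallel processing techniques such as SIMD, vector processing, and DSP architecture, and incorporates low-power features for wireless applications. The complete DSP chip includes test circuitry and various peripherals such as DMA, bus arbitration, and timers. Excluding memory, it contains about 170,000 gates in total and reaches a clock frequency of 100 MHz.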

7.
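Using a hardware-based simulation method, in which the CPU and cache controller are modeled at the RTL level and the cache array is modeled at the circuit level, this work studies cache performance and power consumption and gives a fairly accurate design space showing how miss rate and power vary with structural parameters. Finally, a CAM-based high-associativity cache is designed; compared with a RAM-based high-associativity cache, the average energy consumption of its instruction cache and data cache is reduced by 35.16% and 30.68%, respectively.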

8.
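浙大数芯 is a brand-new processor with a hybrid RISC/DSP architecture that implements general-purpose RISC instructions, DSP data processing instructions, and SIMD multimedia extension instructions on a single-core pipelined architecture. It is an innovative integrated-circuit exercise in multimedia digital processing and computer architecture research, and its accompanying integrated development platform further provides a sound hardware and software environment for developing embedded application systems.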

9.
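To reduce the power consumption of a DSP's external SDRAM memory system, a read/write merging scheme based on dynamic monitoring of bus utilization is proposed, targeting the sources of power consumption when the DSP accesses off-chip SDRAM. The scheme dynamically monitors the utilization of the external memory interface (EMIF) bus and, depending on that utilization, selects an open-page policy, a closed-page policy, or sleep mode. A simplified instruction cache (I-Cache) is designed that fetches instructions by block reads, and a post-write data buffer is designed so that the EMIF merges reads and writes to the same row. Calculations indicate that, for EMIF bus utilization between 10% and 40%, the scheme reduces power consumption by roughly 5% to 20% compared with using the open-page policy alone.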
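The utilization-driven policy choice in this abstract can be sketched roughly as follows. This is only an illustration: the abstract does not say which utilization range maps to which policy, so the threshold values and the mapping (low utilization to sleep, high utilization to open page) are assumptions, as are all names in the code.

```python
from enum import Enum


class PagePolicy(Enum):
    OPEN_PAGE = "open page"
    CLOSED_PAGE = "closed page"
    SLEEP = "sleep"


# Hypothetical thresholds; the abstract gives no switching points.
SLEEP_BELOW = 0.05
OPEN_PAGE_FROM = 0.25


def bus_utilization(busy_cycles: int, window_cycles: int) -> float:
    """Fraction of cycles in the monitoring window during which the bus was busy."""
    if window_cycles <= 0:
        raise ValueError("window_cycles must be positive")
    return busy_cycles / window_cycles


def select_policy(utilization: float) -> PagePolicy:
    """Pick an SDRAM policy from the measured bus utilization (assumed mapping)."""
    if utilization < SLEEP_BELOW:
        return PagePolicy.SLEEP        # bus nearly idle: let the SDRAM sleep
    if utilization >= OPEN_PAGE_FROM:
        return PagePolicy.OPEN_PAGE    # frequent accesses: keep rows open
    return PagePolicy.CLOSED_PAGE      # sparse accesses: close rows after use
```

A controller built this way would recompute `bus_utilization` once per monitoring window and call `select_policy` on the result.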

10.
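This paper proposes an instruction prefetch scheme for a VLIW processor and an optimization strategy for loop instructions. It focuses on the methods for prefetching ordinary instructions and handling loop instructions, and on how to switch between the two prefetch modes, ordinary prefetch and loop prefetch. The design and optimization effectively reduce the power consumed by instruction fetch. Experiments show power reductions ranging from 40% to 90% across different applications, improving the performance of this VLIW multi-cluster DSP processor.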

11.
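As the bridge between the processor and the system bus, the cache is a major source of chip power consumption, so low-power cache design is important in embedded chip design. Traditional cache designs generally depend on a specific architecture, are hard to integrate into different systems, and generalize poorly. This paper proposes a low-power, high-efficiency unified cache IP design with a dual AHB-AXI bus structure. Experimental results show that the design significantly reduces cache power consumption and improves system performance.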

12.
A 1-kbit GaAs static RAM with E/D DCFL was designed and successfully fabricated by SAINT. A bit-line pull-up was introduced into the design to raise operating speed by 25 percent and reduce cell-array power consumption by 50 percent. The RAM circuit was optimized for speed, power, and operating margin. A minimum address access time of 1.5 ns was measured for a total power dissipation of 369 mW. This performance is the best achieved so far for practical application in cache or buffer memories.

13.
In this paper, we present the characterization and design of energy-efficient, on-chip cache memories. The characterization of power dissipation in on-chip cache memories reveals that the memory peripheral interface circuits and bit array dissipate comparable power. To optimize performance and power in a processor's cache, a multidivided module (MDM) cache architecture is proposed to conserve energy in the bit array as well as the memory peripheral circuits. Compared to a conventional, nondivided, 16-kB cache, the latency and power of the MDM cache are reduced by factors of 1.9 and 4.6, respectively. Based on the MDM cache architecture, the energy efficiency of the complete memory hierarchy is analyzed with respect to cache parameters in a multilevel processor cache design. This analysis was conducted by executing the SPECint92 benchmark programs with the miss ratios for reduced instruction set computer (RISC) and complex instruction set computer (CISC) machines.

14.
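A cache speeds up a DSP processor's accesses to external memory and improves DSP performance, so designing a high-performance, low-power cache is of great significance for the overall performance of a DSP chip. This paper describes a high-performance, low-power data cache for a DSP chip. By adding a line buffer with reload capability, the cache reduces how often the processor accesses the cache itself, thereby lowering cache power consumption. Tests with three benchmark programs (FFT, AC3, and FIR) show that the line buffer reduces cache access frequency by 35%, markedly lowering data cache power consumption.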
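The effect of a line buffer on cache access counts can be illustrated with a small model. This is a sketch under simplifying assumptions, not the design from the abstract: it keeps a single buffered line, reloads the buffer on every access that misses it, and uses a hypothetical 32-byte line size; the function name is also made up for illustration.

```python
def count_cache_accesses(addresses, line_size=32):
    """Return (cache_array_accesses, line_buffer_hits) for a byte-address trace.

    An access whose cache line matches the buffered line is served by the
    line buffer; any other access goes to the cache array and reloads the buffer.
    """
    buffered_line = None
    cache_accesses = 0
    buffer_hits = 0
    for addr in addresses:
        line = addr // line_size
        if line == buffered_line:
            buffer_hits += 1          # served from the line buffer
        else:
            cache_accesses += 1       # cache array is accessed
            buffered_line = line      # reload the buffer with the new line
    return cache_accesses, buffer_hits
```

For the trace `[0, 4, 8, 64, 68, 0]` the model reports three cache-array accesses and three line-buffer hits, since consecutive accesses to the same line never reach the cache array.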

15.
This paper presents a forward body-biasing (FBB) technique for active and standby leakage power reduction in cache memories. Unlike previous low-leakage SRAM approaches, we include device-level optimization in the design. We utilize super high Vt (threshold voltage) devices to suppress the cache leakage power, while dynamically applying FBB to only the selected SRAM cells for fast operation. In order to build a super high Vt device, the two-dimensional (2-D) halo doping profile was optimized considering various nanoscale leakage mechanisms. The transition latency and energy overhead associated with FBB were minimized by waking up the SRAM cells ahead of the access and exploiting the general cache access pattern. The combined device-circuit-architecture level techniques offer 64% total leakage reduction and 7.3% improvement in bit line delay compared to a previous state-of-the-art low-leakage SRAM technique. Static noise margin of the proposed SRAM cell is comparable to conventional SRAM cells.

16.
This paper presents a new data cache design, cache-processor coupling, which tightly binds an on-chip data cache with a microprocessor. Parallel architectures and high-speed circuit techniques are developed to speed up the address handling process associated with accessing the data cache. The address handling time has been reduced by 51% by these architectures and circuit techniques. In addition, newly proposed instructions increase data cache bandwidth by eight times. Excessive power consumption due to the wide-bandwidth data transfer is carefully avoided by newly developed circuit techniques, which reduce dissipation power per bit to 1/26. Simulation study of the proposed architecture and circuit techniques yields a 1.8 ns delay each for address handling, cache access, and register access for a 16-kilobyte direct-mapped cache with a 0.4 μm CMOS design rule.

17.
In this paper, we propose a novel integrated circuit and architectural level technique to reduce leakage power consumption in high-performance cache memories using a single V_t (transistor threshold voltage) process. We utilize the concept of gated-ground (nMOS transistor inserted between ground line and SRAM cell) to achieve a reduction in leakage energy without significantly affecting performance. Experimental results on gated-ground caches show that data is retained (DRG-Cache) even if the memory is put in the standby mode of operation. Data is restored when the gated-ground transistor is turned on. Turning off the gated-ground transistor in turn gives a large reduction in leakage power. This technique requires no extra circuitry; the row decoder itself can be used to control the gated-ground transistor. The technique is applicable to data and instruction caches as well as different levels of cache hierarchy, such as the L1, L2, or L3 caches. We fabricated a test chip in TSMC 0.25-μm technology to show the data retention capability and the cell stability of the DRG-Cache. Our simulation results on 100-nm and 70-nm processes (Berkeley Predictive Technology Model) show 16.5% and 27% reduction in consumed energy in L1 cache and 50% and 47% reduction in L2 cache, respectively, with less than 5% impact on execution time and within 4% increase in area overhead.

18.
Design Space Exploration for 3-D Cache   Total citations: 1, self-citations: 0, citations by others: 1
As technology scales, interconnects have become a major performance bottleneck and a major source of power consumption for submicron integrated circuit (IC) chips. One promising option to mitigate the interconnect challenges is 3D ICs, in which a stack of multiple device layers is put together on the same chip. In this paper, we explore the architectural design of cache memories using 3D circuits. We present a delay and energy model, the 3D cache delay-energy estimation tool (3D-Cacti), to explore different 3D design options for partitioning a cache. The tool allows partitioning of a cache across different device layers at various levels of granularity. The tool has been validated by comparing its results with those obtained from circuit simulation of custom 3D layouts. We also explore the effects of various cache partitioning parameters and 3D technology parameters on delay and energy to demonstrate the utility of the tool.

19.
An 8-bit fully decoded RAM test circuit has been designed and fabricated using enhancement-mode GaAs-MESFET's with the LPFL circuit approach. Correct operation of the circuit has been observed for a supply voltage varying from 3.5 to 7 V. An access time of 0.6 ns was measured for a total power consumption of 85 mW under nominal operating conditions. This circuit was used to develop and validate both a design strategy and computer-aided design (CAD) tools oriented towards cache or buffer memories of realistic complexity. It is shown that a performance-optimized 1-kbit RAM exhibiting an access time of 1.1 ns for a power dissipation of 850 mW would be feasible with the present fabrication technology.

20.
An 8-bit fully decoded RAM test circuit has been designed and fabricated using enhancement-mode GaAs-MESFET's with the LPFL circuit approach. Correct operation of the circuit has been observed for a supply voltage varying from 3.5 to 7 V. An access time of 0.6 ns was measured for a total power consumption of 85 mW under nominal operating conditions. This circuit was used to develop and validate both a design strategy and computer-aided design (CAD) tools oriented towards cache or buffer memories of realistic complexity. It is shown that a performance-optimized 1-kbit RAM exhibiting an access time of 1.1 ns for a power dissipation of 850 mW would be feasible with the present fabrication technology.

