1.

《A＆S》2007,(2):134-134

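The DV-109C series DVR compresses each video channel on its own dedicated chip. It contains a high-performance Motorola dual-core 32-bit communications processor and a low-power ADI DSP. It runs on the VxWorks embedded real-time operating system, and both component selection and architecture were designed to ensure the stability and reliability of the device.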

2.

TI Multicore DSPs Combine High Performance with Ultra-Low Power

TI公司《电子技术应用》2012,38(5):6

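Texas Instruments (TI) recently announced three new devices built on the KeyStone multicore architecture and using the TMS320C66x digital signal processor (DSP) family, which it describes as the industry's lowest-power solutions that do not compromise performance or ease of use. The TMS320C665x DSPs combine fixed-point and floating-point capability and deliver real-time high performance at low power in a smaller form factor, helping developers meet the key requirements of high-performance and portable applications more efficiently.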

3.

Research and Design of a Low-Power, High-Performance RISC-V Processor

唐俊龙袁攀吴圳羲卢英龙邹望辉《单片机与嵌入式系统应用》2021,21(9):6-9,13

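To meet the demand of embedded IoT devices for processors with small area, low power, and high performance, this paper proposes a three-stage pipeline with in-order issue, out-of-order execution, and out-of-order write-back, and designs a 32-bit low-power, high-performance processor based on the open-source RISC-V instruction set. The processor supports the RISC-V base integer and multiply/divide instruction sets, and implements a sleep mode using the WFI instruction and clock gating. Its logic was verified in VCS, and logic synthesis was completed in DC with an SMIC 110 nm process library, giving a power consumption of 0.21 mW and an area of 20.5k logic gates. Running the CoreMark benchmark gave a score of 2.54 CoreMark/MHz. The results show that the design balances power and performance and is well suited to embedded scenarios that require small area, low power, and high performance.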

4.

TI KeyStone II Architecture Advances Multicore Technology

《单片机与嵌入式系统应用》2012,12(4)

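Texas Instruments (TI) announced a major upgrade to its award-winning KeyStone multicore architecture, paving the way for a new generation of high-performance 28 nm devices that integrate signal processing, networking, security, and control functions. The scalable KeyStone II architecture supports TMS320C66x digital signal processor (DSP) cores and multiple cache-coherent quad-core ARM Cortex-A15 clusters, with up to 32 DSP and RISC cores, making it a strong fit for applications that need both high performance and low power.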

5.

Multicore DSP Implementation of Normalized Cross-Correlation Gray-Scale Image Matching

刘毅飞张旭明丁明跃《计算机应用》2011,31(12):3334-3336

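To meet the high processor-performance demands of image processing, this paper takes the gray-scale normalized cross-correlation (NCC) matching algorithm as an example and implements it on a high-performance, low-power multicore digital signal processor (DSP) system. Because NCC slides the template over the source image pixel by pixel and computes the correlation at each position, the search area is divided into six parts that the six cores of the TMS320C6472 search in parallel. The multicore NCC matching algorithm was implemented with different image and template sizes and with images stored in different memory locations. Experimental results show that the multicore DSP retains a DSP's strengths in high-speed signal and image processing while supporting multicore parallel processing through algorithm-specific task allocation among cores. For NCC gray-scale image matching, the six-core TMS320C6472 achieves nearly six times the performance of a single DSP core, and for larger images it also shows some speedup over a PC.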
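The six-way split of the NCC search described in this abstract can be sketched in Python. This is only an illustration under stated assumptions, not the authors' DSP code: the names `ncc`, `search_rows`, and `match` are hypothetical, the zero-mean form of NCC is assumed, the search area is cut into horizontal bands, and Python threads merely stand in for the six TMS320C6472 cores.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def ncc(image, template, top, left):
    """Zero-mean NCC between the template and the image window at (top, left)."""
    h, w = len(template), len(template[0])
    window = [image[top + r][left + c] for r in range(h) for c in range(w)]
    tmpl = [v for row in template for v in row]
    mean_w = sum(window) / len(window)
    mean_t = sum(tmpl) / len(tmpl)
    num = sum((a - mean_w) * (b - mean_t) for a, b in zip(window, tmpl))
    den = math.sqrt(sum((a - mean_w) ** 2 for a in window)
                    * sum((b - mean_t) ** 2 for b in tmpl))
    # A flat window or flat template has no defined correlation; score it 0.
    return num / den if den else 0.0


def search_rows(image, template, row_start, row_stop):
    """Best (score, top, left) over window origins with row_start <= top < row_stop."""
    cols = len(image[0]) - len(template[0]) + 1
    best = (-2.0, -1, -1)  # NCC is always >= -1, so any real score replaces this
    for top in range(row_start, row_stop):
        for left in range(cols):
            score = ncc(image, template, top, left)
            if score > best[0]:
                best = (score, top, left)
    return best


def match(image, template, workers=6):
    """Split the search rows into up to `workers` bands and search them concurrently."""
    rows = len(image) - len(template) + 1
    step = -(-rows // workers)  # ceiling division
    bands = [(s, min(s + step, rows)) for s in range(0, rows, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda b: search_rows(image, template, *b), bands)
        return max(results, key=lambda r: r[0])
```

Each band is independent, so the only cross-worker step is the final reduction that picks the best score, which is consistent with the near-linear scaling the abstract reports.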

6.

Application of a Dual-Core Architecture in Wireless Sensor Network Node Design

王韧郭晓春郭航《传感技术学报》2008,21(2):353-356

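This paper studies the implementation of a wireless sensor network node with a dual-core architecture that pairs an ultra-low-power supervisory microcontroller with a high-performance microprocessor. With suitably chosen chips, a wireless sensor network of dual-core nodes was built in hardware. Based on the APTEEN routing protocol, the hierarchical structure was simplified to a flat structure to suit the experimental environment, and performance tests were run and compared with an existing wireless sensor network of single-core nodes. The results show that the dual-core architecture strikes a balance between low power and high performance, achieving higher performance at lower power than the single-core design.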

7.

Support and Optimization of Complex-Number Types on the 魂芯 (Hunxin) DSP

王玉林郑启龙赵高义《计算机系统应用》2017,26(9):40-45

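The 魂芯 (Hunxin) DSP is a 32-bit static scalar digital signal processor with a VLIW and SIMD architecture, designed for high-performance computing. To meet the performance demands of numerical high-performance computing, it provides a rich set of complex-number instructions, but the compiler cannot use them directly to improve the performance of compiled code. This work therefore builds on the compilation framework of the open-source compiler Open64, adding complex numbers as a basic compiler type along with support for complex arithmetic operations. It also recognizes specific patterns of complex operations and uses the DSP's complex-number instructions to optimize the compiled program. Experiments show that, on the 魂芯 DSP compiler, the optimization yields an average speedup of 5.28 on complex-number programs.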
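The kind of pattern such a compiler pass looks for can be illustrated with complex multiplication. A minimal sketch that assumes nothing about the real instruction set: `complex_mul_scalar` shows the scalar expansion (four multiplies, one subtract, one add), and `complex_mul_fused` is a hypothetical stand-in for a single complex-multiply instruction, modeled here with Python's built-in `complex` type.

```python
def complex_mul_scalar(ar, ai, br, bi):
    """(ar + ai*j) * (br + bi*j) written out as real-valued scalar operations."""
    real = ar * br - ai * bi
    imag = ar * bi + ai * br
    return real, imag


def complex_mul_fused(a, b):
    """Hypothetical single complex-multiply operation, modeled with complex()."""
    return a * b
```

When complex numbers are a basic type, the compiler sees one multiply on a complex value and does not have to rediscover the six-operation scalar pattern.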

8.

Analysis of a New Fused Architecture for General-Purpose Processors and Graphics Processors

邹治海沈祥黄田祝永新《计算机应用》2011,31(Z1):168-171

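The CPU and the graphics processing unit (GPU), the two main kinds of general-purpose processors, suffer from high power consumption, bulk that is hard to reduce, and slow data transfer when they work together, so fusing them has become a trend. After analyzing the technical characteristics of both and measuring their performance with high-performance benchmarks, this paper proposes a new fused architecture. It uses a low-power processor for task allocation, balancing task scheduling and utilization between serial and parallel processing cores according to task type and computational load, while the two kinds of cores focus on data processing and are combined differently for different tasks. Performance evaluation shows that the new architecture substantially improves both computing capability and power consumption.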

9.

How to Use the Cache to Optimize Streaming-Media Programs on a DSP

汪国有谢励《计算机测量与控制》2007,15(3):402-404

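Caches are an essential component of modern processor architectures. The characteristics of embedded and streaming-media applications show that the architecture of the data and instruction caches in an MCU, together with compiler optimization, can greatly improve processor performance. Based on an analysis of embedded and streaming-media programs, and using the cache architecture of ADI's Blackfin DSP (BF533) as the platform, this paper discusses how to configure the instruction and data caches to optimize streaming-media programs. Sensible optimization of the instruction and data caches significantly improved the performance of streaming-media programs on the BF533.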

10.

Design of a Highly Reliable Embedded Processor Chip

朱英田增陈叶蒋毅飞李彦哲刘晓强《计算机工程与科学》2023,(3):390-397

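This paper presents a high-reliability, high-performance embedded processor chip based on the domestically developed 申威 (Sunway) instruction set. The processor uses SoC technology and an AMBA bus architecture, and integrates on chip the in-house third-generation 64-bit high-performance Sunway core, Core3, along with standard I/O interfaces such as PCIe 2.0 and USB 2.0. Built on a mature domestic process, it integrates 250 million transistors, reaches a core frequency of 800 MHz across a wide temperature range of -55℃ to 125℃, delivers a peak double-precision floating-point performance of 3.2 GFlops, and has a full-chip peak power below 3.2 W. The paper details the techniques used in chip architecture, reliability design, low-power design, and physical implementation to meet the goals of high reliability, low power, and high performance, and reports test results for the main technical indicators, including frequency, power, and yield. The processor has been deployed in several kinds of information equipment and has delivered good social benefits.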

11.

Scalable, vector processors for embedded systems

Kozyrakis C.E. Patterson D.A. 《Micro, IEEE》2003,23(6):36-45

For embedded applications with data-level parallelism, a vector processor offers high performance at low power consumption and low design complexity. Unlike superscalar and VLIW designs, a vector processor is scalable and can optimally match specific application requirements. To demonstrate that vector architectures meet the requirements of embedded media processing, we evaluate the Vector IRAM, or VIRAM (pronounced "V-IRAM"), architecture developed at UC Berkeley, using benchmarks from the Embedded Microprocessor Benchmark Consortium (EEMBC). Our evaluation covers all three components of the VIRAM architecture: the instruction set, the vectorizing compiler, and the processor microarchitecture. We show that a compiler can vectorize embedded tasks automatically without compromising code density. We also describe a prototype vector processor that outperforms high-end superscalar and VLIW designs by 1.5x to 100x for media tasks, without compromising power consumption. Finally, we demonstrate that clustering and modular design techniques let a vector processor scale to tens of arithmetic data paths before wide instruction-issue capabilities become necessary.

12.

YHFT-QDSP: High-Performance Heterogeneous Multi-Core DSP

陈书明万江华鲁建壮刘仲孙海燕孙永节刘衡竹刘祥远李振涛徐毅陈小文《计算机科学技术学报》2010,25(2):214-224

Multi-core architectures are widely used to enhance the microprocessor performance within a limited increase in time-to-market and power consumption of the chips. Toward the application of high-density data signal processing, this paper presents a novel heterogeneous multi-core architecture digital signal processor (DSP), YHFT-QDSP, with one RISC CPU core and 4 VLIW DSP cores. By three kinds of interconnection, YHFT-QDSP provides high efficiency message communication for inner-chip RISC core and DSP cores, inne...

13.

Introducing the FR500 embedded microprocessor

Suga A. Matsunami K. 《Micro, IEEE》2000,20(4):21-27

Because conventional RISC processors have insufficient processing power to support the continuing development of digital consumer products, we need a new high performance processor for multimedia applications. Processing multimedia video images requires more than 10 times the currently available performance. At Fujitsu, we provide this higher performance in software to attain a high degree of flexibility. We developed the FR500 microprocessor with a novel embedded VLIW (very long instruction word) architecture for use in such digital consumer products. The FR500 is the first product in the FR-V line, Fujitsu's generic name for VLIW architecture microprocessors. The FR-V line offers the flexibility to develop new products optimized for a wide variety of digital consumer products. In this paper, we describe the FR-V architecture, which includes our variable-length VLIW and instruction set architectures, speculative execution control, and conditional execution control. We also evaluate its performance

14.

Evaluation and choice of various branch predictors for low-power embedded processor

Fan DongRui, Yang HongBo, Gao GuangRong, Zhao RongCai 《计算机科学技术学报》2003,18(6):833-838

Power is an important design constraint in embedded computing systems. To meet the power constraint, microarchitecture and hardware designed to achieve high performance need to be revisited, from both performance and power angles. This paper studies one of them: the branch predictor. As is well known, branch prediction is critical to exploiting instruction-level parallelism effectively, but may incur additional power consumption due to the hardware resources dedicated to branch prediction and the extra power consumed on mispredicted branches. This paper explores the design space of branch prediction mechanisms and tries to find the most beneficial one for realizing a low-power embedded processor. The sample processor studied is a Godson-like processor, which is a dual-issue, out-of-order processor with a deep pipeline, supporting the MIPS instruction set.
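For readers unfamiliar with the design space this abstract refers to, one classic point in it is the table of 2-bit saturating counters. The sketch below is a textbook scheme, not necessarily one of the predictors the paper evaluates; the class name `TwoBitPredictor`, the table size, and the helper `mispredictions` are all hypothetical choices made for illustration.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by the low bits of the branch PC."""

    def __init__(self, entries=16):
        self.entries = entries
        # Counter states: 0-1 predict not-taken, 2-3 predict taken.
        self.counters = [1] * entries  # start in "weakly not-taken"

    def predict(self, pc):
        return self.counters[pc % self.entries] >= 2

    def update(self, pc, taken):
        i = pc % self.entries
        if taken:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)


def mispredictions(predictor, trace):
    """Count wrong predictions over a trace of (pc, taken) pairs."""
    misses = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
    return misses
```

A larger table reduces aliasing between branches but costs more storage and therefore more power, which is the kind of trade-off a low-power design has to weigh; every misprediction counted here also corresponds to wasted work in the pipeline.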

15.

Evaluation and Choice of Various Branch Predictors for Low-Power Embedded Processor (Total citations: 2; self-citations: 0; citations by others: 2)

范东睿杨洪波高光荣赵荣彩《计算机科学技术学报》2003,18(6):0-0

Power is an important design constraint in embedded computing systems. To meet the power constraint, microarchitecture and hardware designed to achieve high performance need to be revisited, from both performance and power angles. This paper studies one of them: branch predictor. As well known, branch prediction is critical to exploit instruction level parallelism effectively, but may incur additional power consumption due to the hardware resource dedicated for branch prediction and the extra power consumed on mispredicted branches. This paper explores the design space of branch prediction mechanisms and tries to find the most beneficial one to realize low-power embedded processor. The sample processor studied is Godson-like processor, which is a dual-issue, out-of-order processor with deep pipeline, supporting MIPS instruction set.

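16.

Research Advances in the 银河飞腾 (YHFT) High-Performance Digital Signal Processors (Total citations: 19; self-citations: 5; citations by others: 19)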

陈书明李振涛万江华胡定磊郭阳汪东扈啸孙书为《计算机研究与发展》2006,43(6):993-1000

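YHFT-DSP/700, successfully developed in 2004, is a high-performance floating-point DSP with a very long instruction word (VLIW) architecture in the 银河飞腾 (YHFT) series. It runs at 238 MHz, with peak performance of 1.4 billion floating-point operations and 1.9 billion instructions per second. The paper describes key technologies of YHFT-DSP/700, including its architecture, design methodology, and compiler; introduces the architecture of the simultaneous multithreading YHFT-DSP/SMT, which can raise DSP performance by 40%; and analyzes the architectures and development trends of mainstream high-performance DSPs worldwide.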
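The peak figures quoted in this abstract can be related to the clock rate with a line of arithmetic. The sketch below only divides the rounded numbers from the abstract; the helper name `per_cycle` is hypothetical, and the per-cycle values it suggests (about 8 instructions and about 6 floating-point operations per cycle) are an inference from those rounded figures, not something the abstract states.

```python
def per_cycle(peak_per_second, clock_hz):
    """Peak operations per clock cycle implied by a peak rate and a clock frequency."""
    return peak_per_second / clock_hz


CLOCK_HZ = 238e6          # 238 MHz, from the abstract
PEAK_INSTRUCTIONS = 1.9e9 # 1.9 billion instructions per second (rounded)
PEAK_FLOPS = 1.4e9        # 1.4 billion floating-point operations per second (rounded)

instructions_per_cycle = per_cycle(PEAK_INSTRUCTIONS, CLOCK_HZ)  # roughly 8
flops_per_cycle = per_cycle(PEAK_FLOPS, CLOCK_HZ)                # roughly 6
```

Read the other way, 8 x 238 MHz = 1.904 billion and 6 x 238 MHz = 1.428 billion, which round to the quoted 1.9 and 1.4 billion.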

17.

Design and Optimization of the Instruction Control Pipeline of the YHFT-DX High-Performance DSP

郭阳甄体智李勇《计算机工程与应用》2010,46(7):69-71

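YHFT-DX is a high-performance fixed-point DSP designed at the National University of Defense Technology. This paper designs and implements the YHFT-DX instruction control pipeline and proposes methods for dispatching across fetch-packet boundaries and for instruction prefetching in the YHFT-DX very long instruction word (VLIW) architecture, which effectively improve pipeline performance. High-frequency structural optimization of the instruction pipeline reduced the critical-path delay of the dispatch unit by 40%, meeting the 600 MHz design target.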

18.

Flexible VLIW processor based on FPGA for efficient embedded real-time image processing

Vincent Brost Fan Yang Charles Meunier 《Journal of Real-Time Image Processing》2014,9(1):47-59

Modern field programmable gate array (FPGA) chips, with their larger memory capacity and reconfigurability potential, are opening new frontiers in rapid prototyping of embedded systems. With the advent of high-density FPGAs, it is now possible to implement a high-performance VLIW (very long instruction word) processor core in an FPGA. With VLIW architecture, the processor effectiveness depends on the ability of compilers to provide sufficient ILP (instruction-level parallelism) from program code. This paper describes research results on enabling the VLIW processor model for real-time processing applications by exploiting FPGA technology. Our goals are to keep the flexibility of processors to shorten the development cycle, and to use the powerful FPGA resources to increase real-time performance. We present a flexible VLIW VHDL processor model with a variable instruction set and a customizable architecture which allows exploiting intrinsic parallelism of a target application using advanced compiler technology and implementing it in an optimal manner on FPGA. Some common algorithms of image processing were tested and validated using the proposed development cycle. We also realized the rapid prototyping of embedded contactless palmprint extraction on an FPGA Virtex-6 based board for a biometric application and obtained a processing time of 145.6 ms per image. Our approach applies some criteria for co-design tools: flexibility, modularity, performance, and reusability.

19.

A Tree-Matching Vectorization Algorithm for Multi-Cluster DSPs

郭连伟郑启龙黄胜兵徐华叶《计算机系统应用》2015,24(10):142-147

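BWDSP is a new processor designed for high-performance computing. It combines a multi-cluster very long instruction word (VLIW) architecture with SIMD and has a rich instruction set. To make full use of the vectorization resources BWDSP provides, a vectorization algorithm is needed. Building on Open64, this paper studies and implements a SIMD compiler optimization algorithm for multi-cluster VLIW DSPs. The algorithm operates on WHIRL, the intermediate language of Open64, and makes full use of BWDSP's rich hardware resources and vector instructions. Experimental results show that, for loops that can be combined into double-word and single-word operations, the algorithm achieves average speedups of 6x and 4x, respectively.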

20.

Design and Implementation of the YHFT-D4 Assembler

陈惠斌刘春林胡定磊陈书明《电脑与信息技术》2005,13(1):27-29,47

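YHFT-D4 is a DSP with a clustered VLIW architecture. It has multiple functional units and can execute several instructions in parallel in a single clock cycle. Which functional unit executes an instruction, and which instructions execute in parallel, are decided statically by the compiler or the programmer. This paper presents the design and implementation of the YHFT-D4 assembler.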