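Design and Implementation of a Delay-Optimized Multiply-Accumulate Unit for High-Speed DSPs
Cite this article:Sheraz Anjum, CHEN Jie, LI Hai-jun. Design and Implementation of a Delay-Optimized Multiply-Accumulate Unit for High-Speed DSPs[J]. Journal of Electron Devices (电子器件), 2007, 30(4):1375-1379.
Authors:Sheraz Anjum  CHEN Jie  LI Hai-jun
Affiliation:Communication and Multimedia Laboratory, Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China
Funding:COMSATS Institute of Information Technology, Pakistan
Abstract:The multiply-accumulate (MAC) unit is a key part of the data path of any digital signal processor (DSP), and hardware engineers have worked on optimizing and improving it for many years. This paper describes the design and implementation of a speed-optimized MAC unit. The unit is designed for a high-speed VLIW DSP core and performs 16×16+40 operations on unsigned and signed two's complement operands. In terms of critical-path delay, the proposed MAC unit is superior to any other MAC unit implemented with the same or different arithmetic techniques. The unit has been implemented and synthesized with Synopsys tools and compared with MAC units of the same bit width from the Synopsys DesignWare library. The comparison shows that the proposed MAC unit is faster than every other implementation in the DesignWare library, which makes it suitable for DSP cores that require high throughput. Note: the comparison was performed in Design Compiler with the same attributes and switches.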

Keywords:MAC (Multiply and Accumulate)  Modified Booth's Encoder  PPs (Partial Products)  CV (Correction Vector)  Wallace Tree Compressor  CSA (Carry Save Adder)  CPA (Carry Propagate Adder)
Article ID:1005-9490(2007)04-1375-05
Revision date:2006-10-21

Design and Implementation of a Delay Optimized Multiply-Accumulate Unit for High Speed DSPs
Sheraz Anjum,CHEN Jie,LI Hai-jun.Design and Implementation of a Delay Optimized Multiply-Accumulate Unit for High Speed DSPs[J].Journal of Electron Devices,2007,30(4):1375-1379.
Authors:Sheraz Anjum  CHEN Jie  LI Hai-jun
Affiliation:Communication and Multimedia SoC Lab, Institute of Microelectronics, Chinese Academy of Sciences, No. 3 Bei Tu Cheng West Road, Chaoyang District, Beijing 100029, China
Abstract:The Multiply-Accumulate (MAC) unit is a critical element in the data path of any DSP processor and has been a major focus of optimization by hardware engineers in the last few years. This paper describes the design and implementation of a speed-optimized MAC unit that performs 16×16+40 operations on unsigned and signed two's complement operands and is intended for use in a high-speed VLIW DSP core. In terms of critical-path delay, the proposed MAC is superior to other MAC units implemented with the same or different arithmetic techniques. The MAC has been implemented, synthesized using Synopsys tools, and compared with the MAC units of the same data width from the Synopsys DesignWare library. The comparison shows that the proposed architecture is faster than all the other implementations in the Synopsys DesignWare IP library and is suitable for use in any DSP core, especially those requiring high throughput. Note: The comparison was made under the same attributes and compile options.
Keywords:MAC(Multiply and Accumulate)  Modified Booth's Encoder  PPs(Partial Products)  CV(Correction Vector)  Wallace Tree Compressor  CSA(Carry Save Adder)  CPA(Carry Propagate Adder)
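The 16×16+40 operation named in the abstract can be illustrated with a small behavioral model. The Python sketch below is not the authors' hardware design. The function names are hypothetical, wraparound on 40-bit overflow is an assumption (the abstract does not say whether the unit wraps or saturates), and the radix-4 recoding is the textbook form of modified Booth encoding, included only to show how a 16-bit signed multiplier yields eight partial products.

```python
OP_BITS = 16   # operand width stated in the abstract
ACC_BITS = 40  # accumulator width stated in the abstract


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mac_16x16_40(a, b, acc, signed=True):
    """Return a * b + acc reduced to 40 bits (wraparound is an assumption)."""
    a &= (1 << OP_BITS) - 1
    b &= (1 << OP_BITS) - 1
    if signed:
        a = to_signed(a, OP_BITS)
        b = to_signed(b, OP_BITS)
    result = (a * b + acc) & ((1 << ACC_BITS) - 1)
    return to_signed(result, ACC_BITS) if signed else result


def booth_radix4_digits(b):
    """Recode a signed 16-bit multiplier into 8 digits in {-2..2}, LSB first."""
    bits = b & ((1 << OP_BITS) - 1)
    digits = []
    prev = 0  # implicit bit to the right of bit 0
    for i in range(0, OP_BITS, 2):
        b0 = (bits >> i) & 1
        b1 = (bits >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(a, b):
    """Signed 16x16 product formed as the sum of eight Booth partial products."""
    a = to_signed(a, OP_BITS)
    return sum(d * a * 4 ** k for k, d in enumerate(booth_radix4_digits(b)))


if __name__ == "__main__":
    print(mac_16x16_40(3, 4, 5))                     # signed mode
    print(mac_16x16_40(0xFFFF, 2, 10, signed=False)) # unsigned mode
    print(booth_radix4_digits(567))
    print(booth_multiply(-1234, 567) == -1234 * 567)
```

In hardware the eight partial products would be reduced together with the 40-bit accumulator input rather than summed sequentially as here; the model only fixes the arithmetic result that such a data path is expected to produce.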
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.