Design and Implementation of a Delay Optimized Multiply-Accumulate Unit for High Speed DSPs

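Cite this article:	Sheraz Anjum, CHEN Jie, LI Hai-jun. Design and Implementation of a Delay Optimized Multiply-Accumulate Unit for High Speed DSPs[J]. Journal of Electron Devices (电子器件), 2007, 30(4): 1375-1379.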

Authors:	Sheraz Anjum, CHEN Jie, LI Hai-jun

Affiliation:	Communication and Multimedia Laboratory, Institute of Microelectronics, Chinese Academy of Sciences, Beijing 100029, China

Funding:	COMSATS Institute of Information Technology, Pakistan

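Abstract:	The multiply-accumulate (MAC) unit is a key part of the data path of any digital signal processor (DSP), and hardware engineers have worked on optimizing and improving it for many years. This paper describes the design and implementation of a speed-optimized MAC unit. The unit is designed for a high-speed VLIW DSP core and performs 16×16+40 operations on unsigned and signed two's complement operands. In terms of critical-path delay, the paper reports that the unit outperforms other MAC units implemented with the same or different arithmetic techniques. The unit was implemented with Synopsys tools and compared with MAC units of the same bit width from the Synopsys DesignWare library. The comparison shows that the proposed unit is faster than every other implementation in the DesignWare library, which makes it suitable for DSP cores that require high throughput. Note: the comparison was carried out in Design Compiler with the same attributes and switches.
Keywords:	MAC (Multiply and Accumulate); Modified Booth's Encoder; PPs (Partial Products); CV (Correction Vector); Wallace Tree Compressor; CSA (Carry Save Adder); CPA (Carry Propagate Adder)
Article ID:	1005-9490(2007)04-1375-05
Revised:	2006-10-21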
Design and Implementation of a Delay Optimized Multiply-Accumulate Unit for High Speed DSPs

Sheraz Anjum,CHEN Jie,LI Hai-jun.Design and Implementation of a Delay Optimized Multiply-Accumulate Unit for High Speed DSPs[J].Journal of Electron Devices,2007,30(4):1375-1379.

Authors:	Sheraz Anjum, CHEN Jie, LI Hai-jun

Affiliation:	Communication and Multimedia SoC Lab, Institute of Microelectronics, Chinese Academy of Sciences, No. 3 Bei Tu Cheng West Road, Chaoyang District, Beijing 100029, China

Abstract:	The Multiply-Accumulate (MAC) unit is a critical element in the data path of any DSP processor and has been a major focus of optimization by hardware engineers in recent years. This paper describes the design and implementation of a speed-optimized MAC unit that performs 16×16+40 operations on unsigned and signed two's complement operands and is intended for use in a high-speed VLIW DSP core. In terms of critical-path delay, the proposed MAC is reported to be superior to other MAC units implemented with the same or different algorithmic techniques. The MAC has been implemented and synthesized with Synopsys tools and compared with MAC units of the same data width from the Synopsys DesignWare library. The comparison showed that the proposed architecture is faster than all the other implementations in the Synopsys DesignWare IP library and is suitable for use in any DSP core, especially those requiring high throughput. Note: the comparison was made under the same attributes and compile options.
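As an illustration of what a 16×16+40 operation computes, the following is a minimal behavioural sketch in Python, not the authors' hardware design. The names `mac` and `to_signed` are hypothetical, and the wrap-around behaviour on overflow of the 40-bit accumulator is an assumption; the abstract does not say whether the unit wraps or saturates.

```python
# Behavioural reference model of a 16x16+40 multiply-accumulate step.
# Assumption (not stated in the abstract): the 40-bit accumulator wraps
# around on overflow instead of saturating.

OP_BITS = 16    # operand width, from "16x16+40"
ACC_BITS = 40   # accumulator width, from "16x16+40"


def to_signed(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def mac(acc, a, b, signed=True):
    """Return acc + a*b wrapped to ACC_BITS.

    `a` and `b` are 16-bit patterns. With signed=True they are read as
    two's complement values, otherwise as unsigned values.
    """
    op_mask = (1 << OP_BITS) - 1
    a &= op_mask
    b &= op_mask
    if signed:
        a = to_signed(a, OP_BITS)
        b = to_signed(b, OP_BITS)
    result = (acc + a * b) & ((1 << ACC_BITS) - 1)
    return to_signed(result, ACC_BITS) if signed else result
```

For example, the bit pattern `0xFFFF` is read as -1 in signed mode and as 65535 in unsigned mode, so the same operand bits yield different accumulated values depending on the mode. A model of this kind is only useful as a golden reference for checking results; it says nothing about the Booth encoding, Wallace tree, or adder structure that determine the critical-path delay.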

Keywords:	MAC (Multiply and Accumulate); Modified Booth's Encoder; PPs (Partial Products); CV (Correction Vector); Wallace Tree Compressor; CSA (Carry Save Adder); CPA (Carry Propagate Adder)
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.