
1.

Area-Delay and Energy-Efficient Throughput-Scalable VLSI Architecture for SDR Channelizer

Basant Kumar Mohanty Subodh Kumar Singhal 《Circuits, Systems, and Signal Processing》2016,35(8):2958-2971

A software-defined radio (SDR) channelizer extracts narrowband channels from the wideband signal. The impulse response of the channelizer filter must change with the channel to be extracted from the wideband input. A reconfigurable filter is used instead of fixed filters to implement the channelizer in a resource-constrained environment. In this paper, we present a throughput-scalable reconfigurable architecture for an SDR channelizer. The proposed structure processes a block of L input samples and produces one block of L outputs in every clock cycle. The register complexity of the proposed structure is independent of throughput, whereas multiplier and adder complexity increases proportionately. A significant number of registers are saved when the proposed structure is implemented for larger filter lengths and larger block sizes. Theoretical estimates show that the proposed structure for block-size 8 and filter-length 32 involves 256 extra multipliers and 105 extra adders against 6912 MUXes, 8 fewer registers than those of the existing similar structure, and it offers 8 times higher throughput. ASIC synthesis results show that the proposed structure of block-size 8 and filter-length 32 involves 41% less area-delay product and 22% less energy per sample than the existing structure and offers nearly 6 times higher sampling rate. At the normalized sampling rate, the proposed structure for filter-length 16 consumes 18% and 22% less power than the existing structure for block-sizes 4 and 8, respectively.
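The block-processing idea in this abstract (L inputs in, L outputs out per clock cycle) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' architecture: the function names `fir_direct` and `fir_block` are hypothetical, and each loop iteration merely stands in for one clock cycle. It does show the same qualitative trade-off: the stored history is N-1 samples regardless of L, while the number of multiply-adds per cycle grows with L.

```python
def fir_direct(x, h):
    """Reference FIR: y[n] = sum_k h[k] * x[n-k], one output per step."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


def fir_block(x, h, L):
    """Block FIR model: every loop iteration stands for one clock cycle
    that consumes up to L input samples and emits as many outputs."""
    N = len(h)
    history = [0] * (N - 1)          # N-1 past samples, independent of L
    y = []
    for start in range(0, len(x), L):
        block = x[start:start + L]
        window = history + block     # all samples visible in this cycle
        for i in range(len(block)):  # in hardware these sums run in parallel
            pos = (N - 1) + i
            y.append(sum(h[k] * window[pos - k] for k in range(N)))
        history = window[len(window) - (N - 1):]
    return y


if __name__ == "__main__":
    x = list(range(1, 20))
    h = [1, 2, 3, 4]
    print(fir_block(x, h, 8) == fir_direct(x, h))
```

Changing `L` changes how many outputs are produced per iteration but not the length of `history`, which is the software analogue of the register count staying constant as throughput scales.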

2.

An Efficient Look-up Table-based Approach for Multiplication over GF(2^m) Generated by Trinomials

Bimal K. Meher Pramod K. Meher 《Circuits, Systems, and Signal Processing》2013,32(6):2623-2638

In this paper, we present an efficient look-up table (LUT)-based approach to design multipliers for GF(2^m) generated by irreducible trinomials. A straightforward LUT-based multiplication requires a table of size (m×2^m) bits for the Galois field of degree m. The LUT size therefore becomes quite large for the fields of large degree recommended by the National Institute of Standards and Technology (NIST). Keeping that in view, we have proposed a digit-serial LUT-based design, where operand bits are grouped into digits of fixed width and multiplication is performed in a serial/parallel manner. We restrict the digit size to 4 so that only 16 words are stored in the LUT, which keeps the area-delay complexity low. We have also proposed a digit-parallel LUT-based design for high-speed applications, using the same LUT as the digit-serial design, at the cost of some additional multiplexers and combinational logic for parallel modular reductions and additions. We have presented a simple circuit for the initialization of the LUT content, which can be used to update the LUT in three cycles whenever required. The proposed digit-serial design involves less area complexity and less time complexity than the existing LUT-based designs. The proposed digit-parallel design offers nearly 28% improvement in area-delay product over the best of the existing LUT-based designs. NIST has recommended five binary finite fields for elliptic curve cryptography, two of which are generated by the trinomials Q(x) = x^233 + x^74 + 1 and Q(x) = x^409 + x^87 + 1. In this paper, we have designed a reconfigurable multiplier that can be used for both these fields. The proposed reconfigurable multiplier is shown to have a negligible reconfiguration overhead and would be useful for cryptographic applications.
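The digit-serial scheme described here (4-bit digits, a 16-word LUT, reduction modulo a trinomial) can be modelled in software as follows. This is a minimal sketch that assumes a most-significant-digit-first Horner schedule; the abstract does not state the schedule, and the function names are hypothetical. Field elements are Python integers whose bits are polynomial coefficients.

```python
def reduce_trinomial(v, m, k):
    """Reduce polynomial v modulo Q(x) = x^m + x^k + 1 (0 < k < m)."""
    while v.bit_length() > m:
        top = v.bit_length() - 1
        # x^top = x^(top-m) * x^m, and x^m is congruent to x^k + 1
        v ^= (1 << top) ^ (1 << (top - m + k)) ^ (1 << (top - m))
    return v


def build_lut(b, m, k, w=4):
    """2^w-word LUT holding d * b mod Q(x) for every w-bit digit d."""
    lut = [0] * (1 << w)
    for d in range(1, 1 << w):
        acc = 0
        for i in range(w):
            if (d >> i) & 1:
                acc ^= b << i
        lut[d] = reduce_trinomial(acc, m, k)
    return lut


def gf_mul_digit_serial(a, b, m, k, w=4):
    """Multiply a * b in GF(2^m), consuming one w-bit digit of a per step."""
    lut = build_lut(b, m, k, w)
    n_digits = -(-m // w)                        # ceil(m / w)
    mask = (1 << w) - 1
    acc = 0
    for j in reversed(range(n_digits)):          # most significant digit first
        acc = reduce_trinomial(acc << w, m, k)   # multiply running sum by x^w
        acc ^= lut[(a >> (w * j)) & mask]        # add the looked-up partial product
    return acc


def gf_mul_bitwise(a, b, m, k):
    """Reference shift-and-add multiplication used only for checking."""
    acc = 0
    for i in range(m):
        if (a >> i) & 1:
            acc ^= b << i
    return reduce_trinomial(acc, m, k)


if __name__ == "__main__":
    a = (1 << 232) | 0x1234567
    b = (1 << 200) | 0xABCDEF
    print(gf_mul_digit_serial(a, b, 233, 74) == gf_mul_bitwise(a, b, 233, 74))
```

With `w=4` the table has 16 words of m bits each, compared with the 2^m words a direct product table would need.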

3.

An FIR processor with programmable dynamic data ranges

Chen O.T.-C. Wei-Lung Liu 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2000,8(4):440-446

This work developed a modified direct form based on the radix-4 Booth algorithm to realize a finite impulse response (FIR) architecture with programmable dynamic ranges of input data and filter coefficients. This architecture comprises a preprocessing unit, data latches, configurable connection units, double Booth decoders, coefficient registers, a path control unit, and a postprocessing unit. The programmable dynamic ranges of input data and filter coefficients can be any positive even number or any multiple of the word length of the coefficient registers, using configurable connection units or a path control unit, respectively. In particular, the proposed architecture employs only data-path controls to accomplish programmable operations, without changing the word lengths and components of the data latches and filter taps. A practical 8-bit and 16-bit FIR processor has also been implemented using TSMC 5 V 0.6 μm CMOS technology. It is suitable for asymmetric, symmetric, and anti-symmetric filters with 64, 63, 32, 31, and 16 taps, and its functional units have been thoroughly optimized. The proposed processor has throughput rates of 50 M and 25 M samples/s for 8-bit and 16-bit input data, respectively, in various filter applications.
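For readers unfamiliar with the radix-4 Booth algorithm that this architecture builds on, the recoding step can be sketched as below. This illustrates only the standard recoding rule, not the paper's double Booth decoder; the function names are hypothetical. The requirement that the word length be even echoes the abstract's "positive even numbers" constraint on the input dynamic range.

```python
def booth_radix4_digits(value, bits):
    """Recode a `bits`-bit two's-complement integer into radix-4 Booth
    digits in {-2, -1, 0, 1, 2}, least significant digit first."""
    if bits % 2:
        raise ValueError("bits must be even")
    u = value & ((1 << bits) - 1)   # two's-complement bit pattern
    u <<= 1                         # append the implicit bit y[-1] = 0
    digits = []
    for i in range(bits // 2):
        triplet = (u >> (2 * i)) & 0b111      # bits y[2i+1], y[2i], y[2i-1]
        y_hi = (triplet >> 2) & 1
        y_mid = (triplet >> 1) & 1
        y_lo = triplet & 1
        digits.append(-2 * y_hi + y_mid + y_lo)
    return digits


def booth_multiply(x, coeff, bits):
    """Multiply x by coeff using bits/2 partial products of 0, +-x, or +-2x."""
    return sum(d * x * 4 ** i
               for i, d in enumerate(booth_radix4_digits(coeff, bits)))


if __name__ == "__main__":
    print(booth_radix4_digits(-93, 8))
    print(booth_multiply(7, -93, 8) == 7 * -93)
```

Each digit selects one of five easily generated multiples of `x`, so a `bits`-bit coefficient needs only `bits / 2` partial products.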

4.

Area-efficient pulse-shaping 1:4 interpolated FIR filter based on LUT partitioning

Jong-Kwan Choi Sun-Young Hwang 《Electronics letters》1999,35(18):1504-1505

The design of an area-efficient pulse-shaping 1:4 interpolation FIR filter with a partitioned LUT structure is described. Since the LUT block of the FIR filter occupies a large area relative to the other blocks, the proposed design reduces the LUT size by partitioning, by exploiting coefficient symmetry, and by sharing the partitioned LUTs through multiplexing of the input data streams. Experiments show that the area of the proposed filter is more than 40% smaller than that of the popular single-architecture dual-channel filter. The proposed FIR filter produces optimised results in the QPSK modulator.

5.

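Design of a Low-Complexity Reconfigurable LDPC Encoder for Satellite-to-Ground High-Speed Data Transmission Systems

Kang Jing, An Junshe, Wang Bingbing 《电子与信息学报》 (Journal of Electronics & Information Technology) 2021,43(12):3727-3734

To meet the demand of high-speed satellite-to-ground data transmission systems on low Earth orbit (LEO) satellites for high-throughput, low-complexity, and highly reliable channel coding, this paper proposes a design and implementation scheme for a low-complexity reconfigurable encoder based on the low-density parity-check (LDPC) codes of the Consultative Committee for Space Data Systems (CCSDS) near-Earth satellite communication standard. By zero-padding the input information bits and splitting the circulant matrices, and by analyzing the structural characteristics of encoding at different degrees of parallelism, a reconfigurable encoding scheme is realized that improves the flexibility and the encoding throughput of the encoder. An optimized shift-register accumulation unit is adopted to reduce the overall hardware resource usage of the encoder. The proposed encoder was implemented on a Xilinx FPGA. The results show that, with a 125 MHz system clock, the encoding throughput reaches up to 1 Gbps; the normalized encoding throughput is 17.1% higher than that of encoders with similar parallelism reported in other literature, and the register and look-up table resources are 13.7% and 14.8% lower, respectively, than those of existing schemes on the same platform.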

6.

FPGA realization of FIR filters for high-speed and medium-speed by using modified distributed arithmetic architectures

Jiafeng Xie Jianjun He Guanzheng Tan 《Microelectronics Journal》2010,41(6):365-350

This paper presents the design optimization of fully pipelined architectures for area-time-power-efficient implementation of finite impulse response (FIR) filters. The architectures are designed to obtain a suitable area-time tradeoff. Analysis of the performance for different filter orders and different address lengths of the partial tables indicates that four-input partial tables give the most area-time-power-efficient realization of the FIR filter, compared with the existing LUT-less DA-based implementations, in both high-speed and medium-speed designs. Moreover, further experiments show that the pipeline registers not only significantly influence the maximum frequency of the FIR filters but also affect area usage. A final experiment shows that, with pipeline registers, the 4-bits-per-clock (4BPC) architecture for word-length N=8 with four-input partial tables is the most cost-effective of the cases compared, in both high-speed and medium-speed implementations.

7.

High-Throughput Memory-Based Architecture for DHT Using a New Convolutional Formulation

Meher P.K. Patra J.C. Swamy M.N.S. 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2007,54(7):606-610

A new formulation is presented for the computation of an N-point discrete Hartley transform (DHT) from two pairs of [(N/2-1)/2]-point cyclic convolutions, and is further used to obtain modular structures consisting of simple and regular memory-based systolic arrays for concurrent pipelined realization of the DHT. The proposed structures for direct-memory-based implementation are found to involve nearly the same hardware complexity as the existing structures, but offer two to four times more throughput and two to four times less latency. The distributed-arithmetic (DA)-based implementation is also found to offer much lower memory complexity and considerably lower area-delay complexity than the existing DA-based structures.

8.

Novel Area-Efficient FPGA Architectures for FIR Filtering With Symmetric Signal Extension

Benkrid Abd.S. Benkrid K. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2009,17(5):709-722

This paper presents four novel area-efficient field-programmable gate-array (FPGA) bit-parallel architectures of finite impulse response (FIR) filters that support the technique of symmetric signal extension while processing finite-length signals at their boundaries. The key to this is the use of variable-depth shift registers, which are efficiently implemented in Xilinx FPGAs in the form of shift register logic (SRL) components. Comparisons with the conventional architecture of an FIR filter with symmetric boundary processing show considerable area savings, especially with long-tap filters. For instance, our implementation of the 8-tap low Daubechies-8 FIR filter achieves about a 30% reduction in the area requirement (in terms of slices) compared to the conventional architecture while maintaining the same throughput. Two of the above-cited novel architectures are dedicated to the special case of symmetric FIR filters. The first architecture is highly area-efficient but requires a clock frequency doubler. While this reduces the overall processing speed (to a maximum of 2), it does maintain a high throughput. Moreover, this speed penalty is cancelled in bi-phase filters, which are widely used in multirate architectures (e.g., wavelets). Our second symmetric FIR filter architecture saves less logic than the first architecture (e.g., 10% with the 9-tap low Biorthogonal 9&7 symmetric filter instead of 37% with the first architecture) but overcomes its speed penalty, as it matches the throughput of the conventional architecture.

9.

Fast variable-size block motion estimation for efficient H.264/AVC encoding

《Signal Processing: Image Communication》2005,20(7):595-623

In this paper, an efficient algorithm is proposed to reduce the computational complexity of variable-size block-matching motion estimation. We first investigate features of multiple candidate search centers, adaptive initial block-sizes, search patterns, and search step-sizes, to match different motion characteristics and block-sizes. To avoid being trapped in local minima, the proposed algorithm uses multiple candidate motion vectors, which are obtained from different block-sizes. To further reduce the computation cost, a threshold-based early-stop strategy that depends on the quantization parameter is suggested. With adaptive initial block-sizes, a merge-or-skip strategy is also proposed to reduce the computation for the final block-size decision. For the H.264/AVC encoder, simulations show that the proposed algorithms are about 2.6–3.9 times faster than the original JM v6.1d encoder, which uses fast full-search for all block-sizes, and still maintain a comparable rate-distortion performance.

10.

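Multiple FPGA-Oriented Implementation Methods for High-Order FIR Filters

Liu Zaishuang, Lu Yingying 《中国有线电视》 (China Digital Cable TV) 2008,(2):164-168

Starting from the most basic direct-form structure of the FIR filter, this paper systematically analyzes several FPGA-oriented implementation methods for high-order FIR filters, focusing on the analysis and comparison of logic resource usage, the most sensitive factor in FPGA implementation. The advantages and disadvantages of each method are discussed in detail, and an optimization algorithm for the direct-form structure is proposed. Finally, a design example of a 256-order SRRC filter with a roll-off factor of 0.05 is used to verify the resource analysis of each method.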

11.

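An FPGA-Based Distributed FIR Digital Filter Design

Li Heng 《电声技术》 (Audio Engineering) 2012,36(10):28-32

In wideband intermediate-frequency (IF) software-defined radio transceiver systems, FIR filters are commonly used as the shaping low-pass filters in digital up- and down-conversion because of their good linear-phase characteristics and implementation flexibility. This design uses the EP2C20Q240C8 chip from Altera's Cyclone II family and takes an 8th-order FIR low-pass digital filter circuit based on the distributed algorithm as an example, implemented mainly with LUTs, adders, and shift registers. Finally, the distributed algorithm is verified by simulation. The results show that the optimized structure uses FPGA hardware resources efficiently and reasonably and can be effectively applied in the signal processing modules of high-performance IF digital radios.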
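The distributed-arithmetic (DA) principle behind this design, which replaces multipliers with a LUT, adders, and shifts, can be sketched in software. This is a generic bit-serial DA model and not the circuit from the paper: integer coefficients, 8 coefficients, and 8-bit two's-complement inputs are assumptions chosen for the illustration, and all function names are hypothetical.

```python
def da_build_lut(h):
    """LUT[addr] = sum of h[k] over every k whose bit is set in addr."""
    K = len(h)
    return [sum(h[k] for k in range(K) if (addr >> k) & 1)
            for addr in range(1 << K)]


def da_fir_output(samples, lut, bits):
    """One filter output. samples[k] holds x[n-k] as a `bits`-bit
    two's-complement integer; no multiplications are used."""
    acc = 0
    for b in range(bits):
        addr = 0
        for k, s in enumerate(samples):
            addr |= ((s >> b) & 1) << k   # bit b of every delayed sample
        term = lut[addr] << b             # weight the lookup by 2^b
        acc += -term if b == bits - 1 else term   # sign bit weighs -2^(bits-1)
    return acc


def da_fir(x, h, bits=8):
    """Run the DA filter over a whole input sequence."""
    lut = da_build_lut(h)
    delay = [0] * len(h)                  # shift register of past inputs
    y = []
    for s in x:
        delay = [s] + delay[:-1]
        y.append(da_fir_output(delay, lut, bits))
    return y


if __name__ == "__main__":
    h = [3, -1, 4, 1, -5, 9, 2, -6]
    print(da_fir([5, -7, 100, -128, 127], h))
```

With 8 coefficients the table has 2^8 = 256 entries, and each output takes one lookup and one add per input bit. Because the table size doubles with every added coefficient, longer filters are usually split into several smaller tables.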

12.

High Performance Reconfigurable FIR Filter Architecture Using Optimized Multiplier

J. L. Mazher Iqbal S. Varadarajan 《Circuits, Systems, and Signal Processing》2013,32(2):663-682

In mobile communication systems and multimedia applications, the need for efficient reconfigurable digital finite impulse response (FIR) filters has been increasing tremendously because of their advantages of small area, low cost, low power, and high speed of operation. This article presents a near-optimum low-complexity reconfigurable digital FIR filter architecture based on computation sharing multipliers (CSHM), the constant shift method (CSM), and a modified binary-based common sub-expression elimination (BCSE) method for filter coefficients of different word lengths. The CSHM identifies common computation steps and reuses them for different multiplications. The proposed reconfigurable FIR filter architecture reduces the adder cost and operates at high speed for low-complexity reconfigurable filtering applications such as channelization, channel equalization, matched filtering, pulse shaping, video convolution functions, signal preconditioning, and various other communication applications. The proposed architecture has been implemented and tested on a Virtex 2 xc2vp2-6fg256 field-programmable gate array (FPGA) with 8-bit, 12-bit, and 16-bit filter coefficients. The proposed reconfigurable FIR filter architecture, which uses a dynamically reconfigurable multiplier block, offers good area and speed improvement compared to existing reconfigurable FIR filter implementations.

13.

Power-efficient FIR filter architecture design for wireless embedded system

Shyh-Feng Lin Sheng-Chieh Huang Feng-Sung Yang Chung-Wei Ku Liang-Gee Chen 《Circuits and Systems II: Express Briefs, IEEE Transactions on》2004,51(1):21-25

This paper presents a novel approach for implementing power-efficient finite-impulse response (FIR) filters that require less power than traditional FIR filter implementations in wireless embedded systems. The proposed schemes can be adopted in the direct-form FIR filter and achieve a large reduction in power consumption. By combining the proposed methods, namely balanced-modular techniques with retiming and a separated processing data-flow scheme with modified canonical signed digit (CSD) representation, experimental results show that the proposed scheme reduces the power consumption of the original direct-form structure by 76% with a slight area overhead.
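Canonical signed digit (CSD) representation, mentioned above, lowers the number of add/subtract operations in a constant multiplier by allowing digits of -1. The sketch below shows plain CSD only; the paper's "modified" variant is not described in the abstract, and the function names are hypothetical.

```python
def to_csd(n):
    """CSD digits of integer n, least significant first, each in
    {-1, 0, 1}, with no two adjacent nonzero digits."""
    digits = []
    while n != 0:
        if n & 1:
            d = 2 - (n & 3)   # n mod 4 == 1 gives +1, n mod 4 == 3 gives -1
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits


def csd_multiply(x, coeff):
    """Multiply x by a constant using only shifts, adds, and subtracts."""
    acc = 0
    for i, d in enumerate(to_csd(coeff)):
        if d == 1:
            acc += x << i
        elif d == -1:
            acc -= x << i
    return acc


if __name__ == "__main__":
    print(to_csd(7))            # 7 = 8 - 1, two nonzero digits instead of three
    print(csd_multiply(13, 7))
```

Fewer nonzero digits means fewer adders and less switching activity per coefficient, which is why CSD coding is common in low-power FIR designs.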

14.

A 200-MHz CMOS x/sin(x) digital filter for compensating D/A converter frequency response distortion

Lin T. Samueli H. 《Solid-State Circuits, IEEE Journal of》1991,26(9):1278-1285

A 200-MHz 11-tap finite-impulse-response (FIR) digital filter for compensating the sin(x)/x spectrum distortion introduced by digital-to-analog (D/A) converters was designed and fabricated in a 1-μm CMOS technology. The chip core area is 1.91×3.28 mm² and its complexity is approximately 14,000 transistors. A fully parallel bit-level pipelined transpose-form carry-save architecture using simple powers-of-2 coefficients was used to achieve high throughput and low complexity. The various tradeoffs involving architecture selection, circuit design, and timing issues are presented, and the difficulties in realizing the speed potential of bit-level pipelined circuits are discussed.

15.

Efficient computation of time-varying and adaptive filters

Jones D.L. 《Signal Processing, IEEE Transactions on》1993,41(3):1077-1086

Two techniques for efficient computation of filters that support time-varying coefficients are developed. These methods are forms of distributed arithmetic that encode the data, rather than the filter coefficients. The first approach efficiently computes scalar-vector products, with which a digital filter is easily implemented in a transpose-form structure. This method, based on digital coding, supports time-varying coefficients with no additional overhead. Alternatively, distributed-arithmetic schemes that encode the data stream in sliding blocks support efficient direct-form filter computation with time-varying coefficients. A combination of both of these techniques greatly reduces the computation required to implement LMS adaptive filters.

16.

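A New Systolic Implementation Structure for FIR Filters (cited 6 times in total; self-citations: 0; citations by others: 6)

Shang Yong, Wu Shunjun 《电子学报》 (Acta Electronica Sinica) 2000,28(1):57-59

A principal means of increasing the processing speed of FIR filters is parallel processing. Besides increasing computation speed, parallel processing can also raise the data throughput of an FIR filter and lower system power consumption. This paper first derives a parallel structure for FIR filters from the viewpoint of polynomial decomposition. Through analysis of this parallel FIR filter, a new systolic implementation structure for FIR filters is proposed. Compared with conventional systolic FIR structures, it is smaller and can accommodate higher processing speeds.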

17.

Design of Hilbert transformers by multiple use of same subfilter

Tai Y.-L. Lin T.-P. 《Electronics letters》1989,25(19):1288-1290

A method is proposed for the design of linear-phase FIR Hilbert transformers as a tapped cascaded interconnection of identical FIR subfilters. With the proposed method, the design of FIR Hilbert transformers can be determined by the prototype filter and subfilter through transformation; thus it has potential for VLSI realisation. It is shown that the number of distinct multipliers required by the proposed method is much less than that designed by direct-form minimax methods.

18.

A Look-Up-Table-Based Implementation of an FIR Pulse-Shaping Filter

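Liang Yao, Zhou Zhixun, He Li 《太赫兹科学与电子信息学报》 (Journal of Terahertz Science and Electronic Information Technology) 2015,13(1):101-105

The digital pulse-shaping filter is a type of finite impulse response (FIR) filter. It is conventionally implemented with multiply-accumulate (MAC) operations, that is, by linearly convolving the input signal with the unit impulse response. However, as the number of shaping-filter coefficients grows, this convolution inevitably consumes a large number of MAC units and delay units, which strains field-programmable gate array (FPGA) hardware resources, increases system latency, and raises equipment cost. By combining the group-delay characteristics of the FIR shaping filter with the properties of baseband digital modulation symbols, this paper proposes a new FIR filtering method with a look-up table (LUT) structure and implements it on an FPGA. Software and hardware simulation results show that the method has certain advantages in both accuracy and resource utilization.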

19.

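FPGA-Based Implementation of High-Order FIR Filters (cited 1 time in total; self-citations: 1; citations by others: 0)

Dai Yaoze, Wang Chunlei, Zhu Zhiqiang 《现代电子技术》 (Modern Electronics Technique) 2012,35(8):110-113

Starting from the basic structural model of the FIR digital filter, this paper analyzes the design approach and concrete implementation methods of FIR filters and describes in detail the distributed arithmetic (DA) structure of FIR filters. Analysis and calculation show that implementing a high-order filter with the ordinary DA structure consumes a large amount of look-up table resources, to the point that the hardware cost becomes unacceptable. To address this shortcoming, improved DA structures are proposed. Two improved DA structures for a 64-order FIR band-pass filter are simulated with FPGA simulation software, and the results show that the improved structures consume far fewer resources, which confirms their advantages in reducing computational resources and improving performance.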

20.

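Design of a High-Throughput Arbitrary-Factor Interpolation Filter

Shen Ruilong, Lü Daxin, Zhang Jianfeng 《现代雷达》 (Modern Radar) 2018,40(10):23-26

A high-throughput parallel Farrow filter for arbitrary-factor interpolation is proposed. Starting from the serial Farrow interpolation filter, a parallel Farrow interpolation filter based on polyphase decomposition is derived mathematically. The filter consists of parallel FIR filters, a multiple-input multiple-output selector, accumulators, and multiply-adders, and the FPGA implementation of these modules is discussed in detail. Simulation experiments show that the parallel filter can provide high-throughput interpolation by arbitrary fractional or integer factors at a low clock rate, enabling flexible sample-rate conversion.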
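As background for the Farrow structure named in this abstract, the sketch below shows a serial Farrow interpolator, the starting point the abstract mentions, and not the parallel polyphase version the paper proposes. Cubic Lagrange coefficients are assumed because the abstract does not give the filter order or coefficients; the function names are hypothetical.

```python
def farrow_cubic(x, n, mu):
    """Interpolate at time n + mu (0 <= mu < 1) from the four samples
    x[n-1], x[n], x[n+1], x[n+2] with a cubic Lagrange Farrow structure."""
    xm1, x0, x1, x2 = x[n - 1], x[n], x[n + 1], x[n + 2]
    # Four fixed FIR branches, one per power of mu.
    c0 = x0
    c1 = -xm1 / 3 - x0 / 2 + x1 - x2 / 6
    c2 = xm1 / 2 - x0 + x1 / 2
    c3 = -xm1 / 6 + x0 / 2 - x1 / 2 + x2 / 6
    # Only the Horner evaluation below depends on the fractional delay mu.
    return ((c3 * mu + c2) * mu + c1) * mu + c0


def resample(x, ratio):
    """Produce output samples spaced 1/ratio input periods apart,
    starting at input time 1 so that x[n-1] always exists."""
    out = []
    step = 1.0 / ratio
    i = 0
    while 1.0 + i * step < len(x) - 2:
        t = 1.0 + i * step
        n = int(t)
        out.append(farrow_cubic(x, n, t - n))
        i += 1
    return out


if __name__ == "__main__":
    samples = [t ** 3 - 2 * t + 1 for t in range(10)]
    print(farrow_cubic(samples, 2, 0.3))
    print(len(resample(samples, 2.5)))
```

Because the branch filters are fixed and only `mu` changes from output to output, the ratio can be any positive real number, which is what makes the structure suitable for arbitrary-factor sample-rate conversion.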