首页 | 官方网站   微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 62 毫秒
1.
This paper presents a method of using a parity prediction scheme for detecting erroneous outputs in bit-parallel, sequential, and digit-serial Gaussian normal basis (GNB) multipliers over GF(2m). Although all-type NB multipliers have different time and space complexities, our analytical results indicate that all-type GNB multipliers have the same structure if they use parity prediction function. For example, in the field GF(2233), we have estimated that the error detection rate for a sequential multiplier is nearly 100% if a comparison is made as per clock cycle. Our analytical results also show that the area overhead of the proposed digit-serial multiplier with concurrent error detection does not exceed 5%. Several efficient parity prediction techniques will be shown in this work to provide a low overhead solution to concurrent error detection particularly when the cryptography implementations using GF(2m) multiplier require higher reliability and the protection against adversarial attacks.  相似文献   

2.
Fan  H. Dai  Y. 《Electronics letters》2004,40(1):24-26
Based on the divide-and-conquer technique, three bit-parallel normal bases multipliers are presented for GF(2/sup n/). The space complexity of one multiplier is about 3/4 of the smallest known normal bases multiplier, although it needs at most one more XOR gate delay.  相似文献   

3.
The detection of errors in arithmetic operations is an important issue. This paper discusses the detection of multiple-bit errors due to faults in bit-serial and bit-parallel polynomial basis (PB) multipliers over binary extension fields. Our approach is based on multiple parity bits. Experimental results presented here show that due to an increase in the number of parity bits, the area overhead tends to increase linearly, but the probability of error detection approaches unity fairly quickly, e.g., for eight parity bits. In bit-serial implementation of a GF(2163) PB multiplier using eight parity bits, the area overhead and the probability of error detection are 10.29% and 0.996, respectively. This is achieved without any increase in the computation time of the GF(2163) PB multiplier  相似文献   

4.
In this paper, we present efficient algorithms for modular reduction to derive novel systolic and non-systolic architectures for polynomial basis finite field multipliers over $GF(2^{m})$ to be used in Reed–Solomon (RS) codec. Using the proposed algorithm for unit degree reduction and optimization of implementation of the logic functions in the processing elements (PEs), we have derived an efficient bit-parallel systolic design for finite field multiplier which involves nearly two-thirds of the area-complexity of the existing design having the same time-complexity. The proposed modular reduction algorithms are also used to derive efficient non-systolic serial/parallel designs of field multipliers over $GF(2^{8})$ with different digit-sizes, where the critical path and the hardware-complexity are further reduced by optimizing the implementation of modular reduction operations and finite field accumulations. The proposed bit-serial design involves nearly 55% of the minimum of area, and half the minimum of area-time complexity of the existing bit-serial designs. Similarly, the proposed digit-serial/parallel designs involve significantly less area, and less area-time complexities compared with the existing designs of the same digit-size. By parallel modular reduction through multiple degrees followed by appropriate logic-level sub-expression sharing; a hardware-efficient regular and modular form of a balanced-tree bit-parallel non-systolic multiplier is also derived. The proposed bit-parallel non-systolic pipelined design involves less than 65% of the area and nearly two-thirds of the area-time complexity of the existing bit-parallel design for a RS codec, while the non-pipelined form offers nearly 25% saving of area with less time-complexity.   相似文献   

5.
This paper presents new time-dependent and time-independent multiplication algorithms over finite fields GF(2m) by employing an interleaved conventional multiplication and a folded technique. The proposed algorithm allows efficient realization of the bit-parallel systolic multipliers. The results show that the proposed time-independent multiplier saves about 54% space complexity as compared to other related multipliers for polynomial and dual bases of GF(2m). The proposed architectures include the features of regularity, modularity and local interconnection. Accordingly, it is well suited for VLSI implementation.  相似文献   

6.
A new division scheme for GF(2m) is presented. This scheme is based on the recursive division algorithm and composite fields of the form GF(22n) (m=2n). The new division scheme offers reduced time complexity of approximately O(2n) when compared to traditional bit-serial architectures with O(22n). The scheme also offers lower hardware requirements when compared to bit-parallel architectures. The circuit architecture presented supports implementation in VLSI systems due to its regular and hardware efficient structures and is therefore suited to the implementation of Reed-Solomon codecs  相似文献   

7.
A new bit-parallel systolic multiplier over GF(2m) under the polynomial basis and normal basis is proposed. This new circuit is constructed by m 2 identical cells, each of which consists of one two-input AND gate, one three-input XOR gate and five 1-bit latches. Especially, the proposed architecture is without the basis conversion as compared to the well-known multipliers with the redundant representation. With this proposed multiplier, a parallel-in parallel-out systolic array has also been developed for computing inversion and division over GF(2m). The proposed architectures are well suited to VLSI systems due to their regular interconnection pattern and modular structure.
Che Wun ChiouEmail:
  相似文献   

8.
This paper presents two bit-serial modular multipliers based on the linear feedback shift register using an irreducible all one polynomial (AOP) over GF(2m). First, a new multiplication algorithm and its architecture are proposed for the modular AB multiplication. Then a new algorithm and architecture for the modular AB2 multiplication are derived based on the first multiplier. They have significantly smaller hardware complexity than the previous multipliers because of using the property of AOP. It simplifies the modular reduction compared with the case of using the generalized irreducible polynomial. Since the proposed multipliers have low hardware requirements and regular structures, they are suitable for VLSI implementation. The proposed multipliers can be used as the kernel architecture for the operations of exponentiation, inversion, and division.  相似文献   

9.
This paper presents a new heuristic, concurrent, iterative loop-based scheduling and allocation algorithm for high-level synthesis of digital signal processing (DSP) architectures using heterogeneous functional units. In a heterogeneous architecture, functional units could be either bit-serial or digit-serial or bit-parallel. We assume that a library of functional units based on heterogeneous implementation style is available. Experiments show that this new heuristic synthesis approach generates optimal and near-optimal area solutions. Although optimum synthesis of such architectures were proposed recently using an integer linear programming (ILP) model, our method can produce similar solutions in one to two orders of magnitude less time, at the expense of sacrificing the cost optimality. We compare the solutions generated by the proposed algorithm with the optimal solutions generated by the ILP approach and other recent techniques. We have incorporated this new algorithm into the Minnesota ARchitecture Synthesis (MARS-II) system.  相似文献   

10.
Wavelength division multiplexing (WDM) is emerging as a viable solution to reduce the electronic processing bottleneck in very high-speed optical networks. A set of parallel and independent channels are created on a single fiber using this technique. Parallel communication utilizing the WDM channels may be accomplished in two ways: (i) bit serial, where each source-destination pair communicates using one wavelength and data are sent serially on this wavelength; and (ii) bit parallel, where each source-destination pair communicates using a subset of channels and data are sent in multiple-bit words. Three architectures are studied in the paper: single-hop bit-serial star, single-hop bit-parallel star, and multi-hop bit-parallel shufflenet. The objective of this paper is to evaluate these architectures with respect to average packet delay, network utilization, and link throughput. It is shown that the Shufflenet offers the lowest latency but suffers from high cost and low link throughput. The star topology with bit-parallel access offers lower latency than the bit-serial star, but is more expensive to implement.  相似文献   

11.
Low-Energy Digit-Serial/Parallel Finite Field Multipliers   总被引:5,自引:0,他引:5  
Digit-serial architectures are best suited for systems requiring moderate sample rate and where area and power consumption are critical. This paper presents a new approach for designing digit-serial/parallel finite field multipliers. This approach combines both array-type and parallel multiplication algorithms, where the digit-level array-type algorithm minimizes the latency for one multiplication operation and the parallel architecture inside of each digit cell reduces both the cycle-time as well as the switching activities, hence power consumption. By appropriately constraining the feasible primitive polynomials, the mod p(x) operation involved in finite field multiplication can be performed in a more efficient way. As a result, the computation delay and energy consumption of one finite field multiplication using the proposed digit-serial/parallel architectures are significantly less than of those obtained by folding the parallel semi-systolic multipliers. Furthermore, their energy-delay products are reduced by a even larger percentage. Therefore, the proposed digit-serial/parallel architectures are attractive for both low-energy and high-performance applications.  相似文献   

12.
Novel fault-tolerant architectures for bit-parallel polynomial basis multiplier over GF(2m), which can correct the erroneous outputs using linear code, are presented. A parity prediction circuit based on the code generator polynomial that leads lower space overhead has been designed. For bit-parallel architectures, the space overhead is about 11%. Moreover, there is only marginal time overhead due to incorporation of error-correction capability that amounts to 3.5% in case of the bit-parallel multiplier. Unlike the existing concurrent error correction (CEC) multipliers or triple modular redundancy (TMR) techniques for single error correction, the proposed architectures have multiple error-correcting capabilities.  相似文献   

13.
Fault-Tolerant Bit-Parallel Multiplier for Polynomial Basis of GF(2^m)   总被引:1,自引:0,他引:1  
Novel fault-tolerant architectures for bit-parallel polynomial basis multiplier over GF(2^m), which can correct the erroneous outputs using linear code, are presented. A parity prediction circuit based on the code generator polynomial that leads lower space overhead has been designed. For bit-parallel architectures, the Moreover, there is incorporation of space overhead only marginal time error-correction is about 11%. overhead due to capability that amounts to 3.5% in case of the bit-parallel multiplier. Unlike the existing concurrent error correction (CEC) multipliers or triple modular redundancy (TMR) techniques for single error correction, the proposed architectures have multiple error-correcting capabilities.  相似文献   

14.
《Microelectronics Journal》2002,33(5-6):501-508
This paper proposes the FPGA implementation of the digit-serial Canonical Signed-Digit (CSD) coefficient FIR filters which can be used as format conversion filters in place of the ones employed for the MPEG2 TM 5 (test model 5). Canonical representation of a signed digit (CSD) is a method used to reduce cost by representing a signed number using the least amount of non-zero digits, thereby reducing the number of multiply operations. As Field Programmable Gate Arrays (FPGAs) have grown in capacity, improved in performance, and decreased in cost, they are becoming a viable solution for performing computationally intensive tasks, with the ability to tackle applications formerly reserved for custom chips and programmable digital signal processing (DSP) devices. A digit-serial CSD FIR filter design is realized and practical design guidelines are provided using FPGAs. An analysis of the performance comparison of bit-serial, serial distributed arithmetic, and digit-serial CSD FIR filters on a Xilinx XC4000XL-series FPGA is described. The results show that the proposed digit-serial CSD FIR filter is compact and an efficient implementation of real-time DSP applications on FPGAs.  相似文献   

15.
16.
In this paper, an efficient digit-serial systolic array is proposed for multiplication in finite field GF(2/sup m/) using the standard basis representation. From the least significant bit first multiplication algorithm, we obtain a new dependence graph and design an efficient digit-serial systolic multiplier. If input data come in continuously, the proposed array can produce multiplication results at a rate of one every /spl lceil/m/L/spl rceil/ clock cycles, where L is the selected digit size. Analysis shows that the computational delay time of the proposed architecture is significantly less than the previously proposed digit-serial systolic multiplier. Furthermore, since the new architecture has the features of regularity, modularity, and unidirectional data flow, it is well suited to VLSI implementation.  相似文献   

17.
This paper addresses the design of power efficient dedicated structures of Radix-2 Decimation in Time (DIT) pipelined butterflies, aiming the implementation of low power Fast Fourier Transform (FFT), using adder compressors, with a new XOR gate topology. In the FFT computation, the butterflies play a central role, since they allow calculation of complex terms. In this calculation, involving multiplications of input data with appropriate coefficients, the optimization of the butterfly can contribute for the reduction of power consumption of FFT architectures. In this paper, different and dedicated structures for the 16 bit-width pipelined Radix-2 DIT butterfly, running at 100 MHz, are implemented, where the main goal is to minimize both the number of real multipliers and the critical path of the structures. This is done by changing the structure of the complex multipliers and applying them into the butterflies. For logic synthesis of the implemented butterflies it was used Cadence Encounter RTL Compiler tool with XFAB MOSLP 0.18 μm library. Area and power consumption results are presented for the synthesized butterflies. Regarding power consumption, switching activity analysis is performed using 10,000 inputs vectors at inputs of the butterflies. The main results show that when combining the use of pipeline approach and the use of efficient adder compressors, with a new XOR gate topology, the power consumption of the butterflies is significantly reduced.  相似文献   

18.
《Signal processing》1998,68(1):73-86
A novel architecture for high performance two's complement digit-serial IIR filters is presented. The application of the digit-serial computation to the design of IIR filters introduces delay elements in the feedback loop of the IIR filter. This offers the possibility of pipelining the feedback loop inherent in the IIR filters. To fully explore the advantages offered by the use of digit-serial computation, the digit serial structure is based on the feed forward of the carry digit, which allows subdigit pipelining to increase the throughput rate of the IIR filters. A systematic design methodology is presented to derive a wide range of digit-serial IIR filter architectures which can be pipelined to the subdigit level. This will give designers greater flexibility in finding the best trade off between hardware cost and throughput rate. It is shown that the application of digit-serial computations for the realisation of IIR filters combined with the possibility of subdigit pipelining, results in an increase in the computation speed with a considerable reduction in silicon area consumption when compared to an equivalent bit-parallel IIR filter realisations.  相似文献   

19.
High throughput is a crucial factor in bit-serial GF(2m) fields multiplication for a variety of different applications including cryptography, error coding detection and computer algebra. The throughput of a multiplier is dependent on the required number of clock cycles to reach a result and its critical path delay. However, most bit-serial GF(2m) multipliers do not manage to reduce the required number of clock cycles below the threshold of m clock cycles without increasing dramatically their critical path delay. This increase is more evident if a multiplier is designed to be versatile. In this article, a new versatile bit-serial MSB multiplier for GF(2m) fields is proposed that achieves a 50% increase on average in throughput when compared to other designs, with a very small increase in its critical path delay. This is achieved by an average 33.4% reduction in the required number of clock cycles below m. The proposed design can handle arbitrary bit-lengths upper bounded by m and is suitable for applications where the field order may vary.  相似文献   

20.
In this paper, we present a novel scheme for performing fixed-point arithmetic efficiently on fine-grain, massively parallel, programmable architectures including both custom and FPGA-based systems. We achieve anO(n) speedup, wheren is the operand precision, over the bit-serial methods of existing fine-grain systems such as the DAP, the MPP and the CM2, within the constraints of regular, near neighbor communication and only a small amount of on-chip memory. This is possible by means of digit pipelined algorithms which avoid broadcast and which operate in a fully systolic manner by pipelining at the digit level. A base 4, signed-digit, fully redundant number system and on-line techniques are used to limit carry propagation and minimize communication costs. p ]Although our algorithms are digit-serial, we are able to match the performance of the bit-parallel methods, while retaining low communication complexity. Reconfigurable hardware systems built using field programmable gate arrays (FPGA's) can share in the speed benefits of these algorithms. By using the organization of logic blocks suggested in this paper, problems of placement and routing that exist in such systems can be avoided. Since the algorithms are amenable to pipelining, very high throughput can be obtained.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号