Found 20 similar documents; search took 62 ms
1.
Chiou-Yng Lee 《Integration, the VLSI Journal》2010,43(1):113-123
This paper presents a method of using a parity prediction scheme for detecting erroneous outputs in bit-parallel, sequential, and digit-serial Gaussian normal basis (GNB) multipliers over GF(2^m). Although the different types of NB multipliers have different time and space complexities, our analytical results indicate that all types of GNB multipliers have the same structure if they use a parity prediction function. For example, in the field GF(2^233), we have estimated that the error detection rate for a sequential multiplier is nearly 100% if a comparison is made every clock cycle. Our analytical results also show that the area overhead of the proposed digit-serial multiplier with concurrent error detection does not exceed 5%. Several efficient parity prediction techniques are shown in this work to provide a low-overhead solution to concurrent error detection, particularly when cryptographic implementations using GF(2^m) multipliers require higher reliability and protection against adversarial attacks.
2.
Based on the divide-and-conquer technique, three bit-parallel normal basis multipliers are presented for GF(2^n). The space complexity of one multiplier is about 3/4 of that of the smallest known normal basis multiplier, although it needs at most one more XOR gate delay.
3.
Bayat-Sarmadi S. Hasan M. A. 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2007,15(4):413-426
The detection of errors in arithmetic operations is an important issue. This paper discusses the detection of multiple-bit errors due to faults in bit-serial and bit-parallel polynomial basis (PB) multipliers over binary extension fields. Our approach is based on multiple parity bits. Experimental results presented here show that as the number of parity bits increases, the area overhead tends to increase linearly, but the probability of error detection approaches unity fairly quickly, e.g., for eight parity bits. In a bit-serial implementation of a GF(2^163) PB multiplier using eight parity bits, the area overhead and the probability of error detection are 10.29% and 0.996, respectively. This is achieved without any increase in the computation time of the GF(2^163) PB multiplier.
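As a rough illustration of parity-based concurrent error detection in a bit-serial PB multiplier, the sketch below uses a single predicted parity bit, which is the simplest case and not the multiple-parity-bit scheme of the paper. The function names, the `fault_at` test hook, and the choice of GF(2^8) with the polynomial x^8 + x^4 + x^3 + x + 1 are assumptions made for the example only.

```python
def parity(x: int) -> int:
    """Return the XOR of all bits of x (0 or 1)."""
    return bin(x).count("1") & 1


def pb_multiply_with_parity_check(a, b, poly, m, fault_at=None):
    """Bit-serial, LSB-first multiplication in GF(2^m) with one parity bit.

    `poly` is the field polynomial including its x^m term.
    `fault_at` is a hypothetical test hook: when set to an iteration index,
    bit 0 of the accumulator is flipped in that iteration.
    Returns (product, error_detected).
    """
    c = 0                 # accumulator holding the partial product
    predicted = 0         # predicted parity of the accumulator
    pa = parity(a)        # running parity of a * x^i mod poly
    poly_parity = parity(poly)
    for i in range(m):
        if (b >> i) & 1:
            c ^= a
            predicted ^= pa
        if fault_at == i:
            c ^= 1        # injected single-bit fault
        # Multiply a by x and reduce; the parity of the result changes
        # only when a reduction occurs and poly has odd weight.
        top = (a >> (m - 1)) & 1
        a <<= 1
        if top:
            a ^= poly
        pa ^= top & poly_parity
    return c, parity(c) != predicted


product, error = pb_multiply_with_parity_check(0x57, 0x83, 0x11B, 8)
faulty, detected = pb_multiply_with_parity_check(0x57, 0x83, 0x11B, 8, fault_at=3)
```

A single parity bit catches any odd number of flipped output bits but misses even-weight errors, which is one reason a scheme might use several parity bits.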
4.
《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2009,17(6):747-757
5.
Chiou-Yng Lee 《Integration, the VLSI Journal》2008,41(1):106-112
This paper presents new time-dependent and time-independent multiplication algorithms over finite fields GF(2^m) by employing an interleaved conventional multiplication and a folded technique. The proposed algorithms allow efficient realization of bit-parallel systolic multipliers. The results show that the proposed time-independent multiplier saves about 54% space complexity as compared to other related multipliers for polynomial and dual bases of GF(2^m). The proposed architectures include the features of regularity, modularity and local interconnection. Accordingly, they are well suited for VLSI implementation.
6.
A new division scheme for GF(2^m) is presented. This scheme is based on the recursive division algorithm and composite fields of the form GF(2^(2n)) (m=2n). The new division scheme offers reduced time complexity of approximately O(2n) when compared to traditional bit-serial architectures with O(22n). The scheme also offers lower hardware requirements when compared to bit-parallel architectures. The circuit architecture presented supports implementation in VLSI systems due to its regular and hardware efficient structures and is therefore suited to the implementation of Reed-Solomon codecs.
7.
A new bit-parallel systolic multiplier over GF(2^m) under the polynomial basis and normal basis is proposed. This new circuit is constructed from m^2 identical cells, each of which consists of one two-input AND gate, one three-input XOR gate and five 1-bit latches. Notably, the proposed architecture requires no basis conversion, unlike the well-known multipliers that use the redundant representation. With this proposed multiplier, a parallel-in parallel-out systolic array has also been developed for computing inversion and division over GF(2^m). The proposed architectures are well suited to VLSI systems due to their regular interconnection pattern and modular structure.
Che Wun Chiou
8.
This paper presents two bit-serial modular multipliers based on the linear feedback shift register using an irreducible all-one polynomial (AOP) over GF(2^m). First, a new multiplication algorithm and its architecture are proposed for the modular AB multiplication. Then a new algorithm and architecture for the modular AB^2 multiplication are derived based on the first multiplier. They have significantly smaller hardware complexity than the previous multipliers because they exploit the property of the AOP, which simplifies the modular reduction compared with the case of using a generalized irreducible polynomial. Since the proposed multipliers have low hardware requirements and regular structures, they are suitable for VLSI implementation. The proposed multipliers can be used as the kernel architecture for the operations of exponentiation, inversion, and division.
9.
Heuristic Loop-Based Scheduling and Allocation for DSP Synthesis with Heterogeneous Functional Units
Yun-Nan Chang Ching-Yi Wang Keshab K. Parhi 《The Journal of VLSI Signal Processing》1998,19(3):243-256
This paper presents a new heuristic, concurrent, iterative loop-based scheduling and allocation algorithm for high-level synthesis of digital signal processing (DSP) architectures using heterogeneous functional units. In a heterogeneous architecture, functional units could be either bit-serial or digit-serial or bit-parallel. We assume that a library of functional units based on heterogeneous implementation style is available. Experiments show that this new heuristic synthesis approach generates optimal and near-optimal area solutions. Although optimum synthesis of such architectures was proposed recently using an integer linear programming (ILP) model, our method can produce similar solutions in one to two orders of magnitude less time, at the expense of sacrificing the cost optimality. We compare the solutions generated by the proposed algorithm with the optimal solutions generated by the ILP approach and other recent techniques. We have incorporated this new algorithm into the Minnesota ARchitecture Synthesis (MARS-II) system.
10.
Wavelength division multiplexing (WDM) is emerging as a viable solution to reduce the electronic processing bottleneck in very high-speed optical networks. A set of parallel and independent channels are created on a single fiber using this technique. Parallel communication utilizing the WDM channels may be accomplished in two ways: (i) bit serial, where each source-destination pair communicates using one wavelength and data are sent serially on this wavelength; and (ii) bit parallel, where each source-destination pair communicates using a subset of channels and data are sent in multiple-bit words. Three architectures are studied in the paper: single-hop bit-serial star, single-hop bit-parallel star, and multi-hop bit-parallel shufflenet. The objective of this paper is to evaluate these architectures with respect to average packet delay, network utilization, and link throughput. It is shown that the Shufflenet offers the lowest latency but suffers from high cost and low link throughput. The star topology with bit-parallel access offers lower latency than the bit-serial star, but is more expensive to implement.
11.
Low-Energy Digit-Serial/Parallel Finite Field Multipliers (cited 5 times: 0 self-citations, 5 by others)
Digit-serial architectures are best suited for systems requiring moderate sample rate and where area and power consumption are critical. This paper presents a new approach for designing digit-serial/parallel finite field multipliers. This approach combines both array-type and parallel multiplication algorithms, where the digit-level array-type algorithm minimizes the latency for one multiplication operation and the parallel architecture inside each digit cell reduces both the cycle time and the switching activity, hence power consumption. By appropriately constraining the feasible primitive polynomials, the mod p(x) operation involved in finite field multiplication can be performed in a more efficient way. As a result, the computation delay and energy consumption of one finite field multiplication using the proposed digit-serial/parallel architectures are significantly less than those obtained by folding the parallel semi-systolic multipliers. Furthermore, their energy-delay products are reduced by an even larger percentage. Therefore, the proposed digit-serial/parallel architectures are attractive for both low-energy and high-performance applications.
12.
Novel fault-tolerant architectures for bit-parallel polynomial basis multipliers over GF(2^m), which can correct erroneous outputs using a linear code, are presented. A parity prediction circuit based on the code generator polynomial that leads to lower space overhead has been designed. For bit-parallel architectures, the space overhead is about 11%. Moreover, there is only marginal time overhead due to incorporation of error-correction capability, which amounts to 3.5% in the case of the bit-parallel multiplier. Unlike the existing concurrent error correction (CEC) multipliers or triple modular redundancy (TMR) techniques for single error correction, the proposed architectures have multiple error-correcting capabilities.
13.
Novel fault-tolerant architectures for bit-parallel polynomial basis multipliers over GF(2^m), which can correct erroneous outputs using a linear code, are presented. A parity prediction circuit based on the code generator polynomial that leads to lower space overhead has been designed. For bit-parallel architectures, the space overhead is about 11%. Moreover, there is only marginal time overhead due to incorporation of error-correction capability, which amounts to 3.5% in the case of the bit-parallel multiplier. Unlike the existing concurrent error correction (CEC) multipliers or triple modular redundancy (TMR) techniques for single error correction, the proposed architectures have multiple error-correcting capabilities.
14.
《Microelectronics Journal》2002,33(5-6):501-508
This paper proposes the FPGA implementation of digit-serial Canonical Signed-Digit (CSD) coefficient FIR filters which can be used as format conversion filters in place of the ones employed for the MPEG2 TM 5 (test model 5). Canonical representation of a signed digit (CSD) is a method used to reduce cost by representing a signed number using the fewest non-zero digits, thereby reducing the number of multiply operations. As Field Programmable Gate Arrays (FPGAs) have grown in capacity, improved in performance, and decreased in cost, they are becoming a viable solution for performing computationally intensive tasks, with the ability to tackle applications formerly reserved for custom chips and programmable digital signal processing (DSP) devices. A digit-serial CSD FIR filter design is realized and practical design guidelines are provided using FPGAs. An analysis of the performance comparison of bit-serial, serial distributed arithmetic, and digit-serial CSD FIR filters on a Xilinx XC4000XL-series FPGA is described. The results show that the proposed digit-serial CSD FIR filter is a compact and efficient implementation of real-time DSP applications on FPGAs.
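To make the CSD idea concrete, here is a minimal sketch that recodes an integer into signed digits from {-1, 0, 1} with no two adjacent non-zero digits. It uses the standard non-adjacent-form recoding, which is an assumption for illustration and not necessarily the procedure used in the paper; the function name `to_csd` is hypothetical.

```python
def to_csd(n: int) -> list[int]:
    """Recode integer n into canonical signed digits, least significant first.

    Each digit is -1, 0 or 1, and no two adjacent digits are both non-zero.
    """
    digits = []
    while n != 0:
        if n & 1:
            # Pick +1 when n = 1 (mod 4) and -1 when n = 3 (mod 4), so that
            # the next digit is forced to zero.
            d = 2 - (n % 4)
            n -= d
        else:
            d = 0
        digits.append(d)
        n //= 2
    return digits


def from_csd(digits: list[int]) -> int:
    """Rebuild the integer from its least-significant-first digit list."""
    return sum(d << i for i, d in enumerate(digits))


csd_of_7 = to_csd(7)   # 7 = 8 - 1, so two non-zero digits instead of three
```

In a shift-and-add coefficient multiplier, each non-zero digit corresponds to one shifted addition or subtraction, so fewer non-zero digits means less arithmetic hardware per coefficient.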
15.
16.
Chang Hoon Kim Chun Pyo Hong Soonhak Kwon 《Very Large Scale Integration (VLSI) Systems, IEEE Transactions on》2005,13(4):476-483
In this paper, an efficient digit-serial systolic array is proposed for multiplication in finite field GF(2^m) using the standard basis representation. From the least significant bit first multiplication algorithm, we obtain a new dependence graph and design an efficient digit-serial systolic multiplier. If input data come in continuously, the proposed array can produce multiplication results at a rate of one every ⌈m/L⌉ clock cycles, where L is the selected digit size. Analysis shows that the computational delay time of the proposed architecture is significantly less than the previously proposed digit-serial systolic multiplier. Furthermore, since the new architecture has the features of regularity, modularity, and unidirectional data flow, it is well suited to VLSI implementation.
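The least-significant-bit-first algorithm and the ⌈m/L⌉ cycle count can be sketched in software as follows. This is a behavioural model only, not the systolic array of the paper; the function name and the GF(2^8) example polynomial x^8 + x^4 + x^3 + x + 1 are assumptions chosen for illustration.

```python
def gf2m_mul_digit_serial(a, b, poly, m, L):
    """LSB-first multiplication in GF(2^m), consuming L bits of b per cycle.

    `poly` is the field polynomial including its x^m term, and a, b < 2^m.
    Returns (product, cycles), where cycles = ceil(m / L).
    """
    cycles = -(-m // L)                   # integer ceiling of m / L
    c = 0
    for j in range(cycles):
        digit = (b >> (j * L)) & ((1 << L) - 1)
        for k in range(L):                # work done within one "clock cycle"
            if (digit >> k) & 1:
                c ^= a                    # accumulate a * x^(j*L + k)
            a <<= 1                       # multiply a by x
            if (a >> m) & 1:
                a ^= poly                 # reduce modulo the field polynomial
    return c, cycles


product_l4, cycles_l4 = gf2m_mul_digit_serial(0x57, 0x83, 0x11B, 8, 4)
product_l3, cycles_l3 = gf2m_mul_digit_serial(0x57, 0x83, 0x11B, 8, 3)
```

Setting L = 1 gives the plain bit-serial case with m cycles, while larger L trades more logic per cycle for fewer cycles.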
17.
Mateus Beck Fonseca Eduardo A. César da Costa João B. S. Martins 《Analog Integrated Circuits and Signal Processing》2012,73(3):945-954
This paper addresses the design of power-efficient dedicated structures of Radix-2 Decimation in Time (DIT) pipelined butterflies, aiming at the implementation of low-power Fast Fourier Transform (FFT), using adder compressors with a new XOR gate topology. In the FFT computation, the butterflies play a central role, since they allow calculation of complex terms. In this calculation, involving multiplications of input data with appropriate coefficients, the optimization of the butterfly can contribute to the reduction of power consumption of FFT architectures. In this paper, different dedicated structures for the 16 bit-width pipelined Radix-2 DIT butterfly, running at 100 MHz, are implemented, where the main goal is to minimize both the number of real multipliers and the critical path of the structures. This is done by changing the structure of the complex multipliers and applying them in the butterflies. For logic synthesis of the implemented butterflies, the Cadence Encounter RTL Compiler tool was used with the XFAB MOSLP 0.18 μm library. Area and power consumption results are presented for the synthesized butterflies. Regarding power consumption, switching activity analysis is performed using 10,000 input vectors at the inputs of the butterflies. The main results show that when combining the pipeline approach with efficient adder compressors that use a new XOR gate topology, the power consumption of the butterflies is significantly reduced.
18.
《Signal processing》1998,68(1):73-86
A novel architecture for high performance two's complement digit-serial IIR filters is presented. The application of digit-serial computation to the design of IIR filters introduces delay elements in the feedback loop of the IIR filter. This offers the possibility of pipelining the feedback loop inherent in IIR filters. To fully explore the advantages offered by the use of digit-serial computation, the digit-serial structure is based on the feed forward of the carry digit, which allows subdigit pipelining to increase the throughput rate of the IIR filters. A systematic design methodology is presented to derive a wide range of digit-serial IIR filter architectures which can be pipelined to the subdigit level. This will give designers greater flexibility in finding the best trade-off between hardware cost and throughput rate. It is shown that the application of digit-serial computations for the realisation of IIR filters, combined with the possibility of subdigit pipelining, results in an increase in the computation speed with a considerable reduction in silicon area consumption when compared to equivalent bit-parallel IIR filter realisations.
19.
George N. Selimis Apostolos P. Fournaris Odysseas Koufopavlou 《Integration, the VLSI Journal》2009,42(2):217-226
High throughput is a crucial factor in bit-serial multiplication over GF(2^m) fields for a variety of different applications including cryptography, error coding detection and computer algebra. The throughput of a multiplier is dependent on the required number of clock cycles to reach a result and its critical path delay. However, most bit-serial GF(2^m) multipliers do not manage to reduce the required number of clock cycles below the threshold of m clock cycles without dramatically increasing their critical path delay. This increase is more evident if a multiplier is designed to be versatile. In this article, a new versatile bit-serial MSB multiplier for GF(2^m) fields is proposed that achieves a 50% increase on average in throughput when compared to other designs, with a very small increase in its critical path delay. This is achieved by an average 33.4% reduction in the required number of clock cycles below m. The proposed design can handle arbitrary bit-lengths upper bounded by m and is suitable for applications where the field order may vary.
20.
Chetana Nagendra Robert Michael Owens Mary Jane Irwin 《The Journal of VLSI Signal Processing》1995,9(3):193-209
In this paper, we present a novel scheme for performing fixed-point arithmetic efficiently on fine-grain, massively parallel, programmable architectures including both custom and FPGA-based systems. We achieve an O(n) speedup, where n is the operand precision, over the bit-serial methods of existing fine-grain systems such as the DAP, the MPP and the CM2, within the constraints of regular, near neighbor communication and only a small amount of on-chip memory. This is possible by means of digit pipelined algorithms which avoid broadcast and which operate in a fully systolic manner by pipelining at the digit level. A base 4, signed-digit, fully redundant number system and on-line techniques are used to limit carry propagation and minimize communication costs. Although our algorithms are digit-serial, we are able to match the performance of the bit-parallel methods, while retaining low communication complexity. Reconfigurable hardware systems built using field programmable gate arrays (FPGAs) can share in the speed benefits of these algorithms. By using the organization of logic blocks suggested in this paper, problems of placement and routing that exist in such systems can be avoided. Since the algorithms are amenable to pipelining, very high throughput can be obtained.