Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
This paper presents an architecture for computing the atan(Y/X) operation, suitable for broadband communication applications where a throughput of 20 MHz is required. The architecture takes advantage of the embedded hard cores of the FPGA device to achieve lower power consumption than an atan(Y/X) operator based on the CORDIC algorithm or on conventional LUT-based methods. The proposed architecture computes atan(Y/X) with a latency of two clock cycles, and its power consumption is 49% lower than that of a CORDIC-based operator and 46% lower than that of a multipartite approach.
J. Valls
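For orientation, the CORDIC baseline that the abstract compares against can be sketched in a few lines. This is a floating-point illustration of CORDIC in vectoring mode, not the proposed architecture and not a hardware description; the function name `cordic_atan`, the restriction to x > 0, and the iteration count are assumptions made for this example.

```python
import math


def cordic_atan(y: float, x: float, iterations: int = 16) -> float:
    """Approximate atan(y/x) for x > 0 with CORDIC in vectoring mode."""
    angle = 0.0
    for i in range(iterations):
        scale = 2.0 ** -i          # a right shift by i bits in fixed-point hardware
        step = math.atan(scale)    # hardware would read this from a small constant table
        if y > 0:
            # Rotate clockwise to drive y towards zero.
            x, y = x + y * scale, y - x * scale
            angle += step
        else:
            # Rotate counter-clockwise to drive y towards zero.
            x, y = x - y * scale, y + x * scale
            angle -= step
    return angle
```

Each iteration needs only shifts, additions and one table lookup, but the result is available only after all iterations have run, which is the kind of latency a two-cycle architecture avoids.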

2.
In this paper we present a low-complexity discrete cosine transform (DCT) architecture based on computation re-use in vector-scalar products. The 1-D DCT operation is expressed as additions of vector-scalar products, and basic common computations are identified and shared to reduce its computational complexity. Compared to a general distributed-arithmetic-based DCT architecture, the proposed DCT shows area savings of 38% and power savings of 18% with little performance degradation. We also propose an efficient method to trade off image quality for computational complexity. The approach is based on modifying the DCT bases in a bit-wise manner, and different computational complexity/image quality trade-off levels are suggested. Finally, based on the above approaches, we propose a low-complexity DCT architecture that can dynamically reconfigure from one trade-off level to another. The reconfigurable DCT architecture can achieve power savings ranging from 28% to 56% for 3 different trade-off levels.
Kaushik Roy

3.
High-speed and low-area hardware architectures of the Whirlpool hash function are presented in this paper. A full Look-up Table (LUT) based design is shown to be the fastest method by which to implement the non-linear layer of the algorithm in terms of logic. An unrolled Whirlpool architecture implemented on the Virtex XC4VLX100 device achieves a throughput of 4.9 Gbps. This is faster than a SHA-512 design implemented on the same device and than other previously reported hash function architectures. A low-area iterative architecture, which utilises 64-bit operations as opposed to full 512-bit operations, is also described. It runs at 430 Mbps and occupies 709 slices on a Virtex XC4VLX15. This proves to be one of the smallest 512-bit hash function architectures currently available.
Ciaran McIvor

4.
H.264/AVC is the latest video coding standard, adopting variable block size motion estimation (VBS-ME), quarter-pixel accuracy, motion vector prediction and multi-reference frames for motion estimation. These new features result in much higher computation requirements than previous coding standards. In this paper we propose a novel most significant bit (MSB) first bit-serial architecture for full-search block matching VBS-ME, and compare it with systolic implementations. Since MSB-first processing enables early termination of the sum of absolute differences (SAD) calculation, the average hardware performance can be enhanced. Five different designs (one- and two-dimensional systolic and tree implementations, along with the bit-serial design) are compared in terms of performance, pixel memory bandwidth, occupied area and power consumption.
Philip H. W. Leong (Corresponding author)
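The early-termination idea can be illustrated in software. The sketch below is a simplified analogue under stated assumptions, not the bit-serial hardware of the paper: it accumulates the absolute differences one bit plane at a time, most significant first, and abandons a candidate as soon as the partial sum can no longer beat the best SAD found so far. The function name `sad_msb_first` and the 8-bit pixel width are assumptions made for this example.

```python
def sad_msb_first(block, candidate, best_so_far, bits=8):
    """Return the SAD of two equal-length pixel lists, or None if the
    candidate provably cannot beat best_so_far."""
    diffs = [abs(a - b) for a, b in zip(block, candidate)]
    partial = 0
    for bit in range(bits - 1, -1, -1):
        # Add the contribution of one bit plane, most significant first.
        partial += sum((d >> bit) & 1 for d in diffs) << bit
        # Lower bit planes can only add to the sum, so partial is a lower bound.
        if partial >= best_so_far:
            return None
    return partial
```

A poor candidate is usually rejected after the first one or two bit planes, which is why the average cost drops even though the worst case still processes every plane.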

5.
Genomic sequence comparison algorithms represent the basic toolbox for processing large volumes of DNA or protein sequences. They are involved both in the systematic scan of databases, mostly for detecting similarities with an unknown sequence, and in preliminary processing before advanced bioinformatics analysis. Due to the exponential growth of genomic data, new solutions are required to keep the computation time reasonable. This paper presents a specific hardware architecture to speed up seed-based algorithms, which are currently the most popular heuristics for detecting alignments. The architecture groups FLASH and FPGA technologies on a common support, allowing a large amount of data to be rapidly accessed and quickly processed. Experiments on database search and intensive sequence comparison demonstrate a good cost/performance ratio compared to standard approaches.
D. Lavenier

6.
Elliptic curve cryptography (ECC) is recognized as a fast cryptography system and has many applications in security systems. In this paper, a novel sharing scheme is proposed to significantly reduce the number of field multiplications and the usage of lookup tables, providing high speed operations for both hardware and software realizations.
Brian King

7.
We propose a novel area/time efficient elliptic curve cryptography (ECC) processor architecture which performs all finite field arithmetic operations in the discrete Fourier domain. The proposed architecture utilizes a class of optimal extension fields (OEF) $GF(q^m)$ where the field characteristic is a Mersenne prime $q = 2^n - 1$ and $m = n$. The main advantage of our architecture is that it achieves extension field modular multiplication in the discrete Fourier domain with only a linear number of base field $GF(q)$ multiplications in addition to a quadratic number of simpler operations such as addition and bitwise rotation. We achieve an area between 25k and 50k equivalent gates for the implementations over OEFs of size 169, 289 and 361 bits. With its low area and high speed, the proposed architecture is well suited for ECC in small device environments such as sensor networks. The work at hand presents the first hardware implementation of a frequency domain multiplier suitable for ECC and the first hardware implementation of ECC in the frequency domain.
Berk Sunar
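A short sketch may help show why a Mersenne-prime characteristic makes the "simpler operations" cheap. Because 2^n ≡ 1 modulo q = 2^n − 1, reduction needs only shifts and additions, and multiplication by a power of two is an n-bit rotation. The code below is a software illustration under those standard facts, not the authors' processor design; the function names and the choice n = 13 in the usage note are assumptions made for this example.

```python
def mersenne_reduce(x: int, n: int) -> int:
    """Reduce a non-negative integer x modulo q = 2**n - 1 using shifts and adds."""
    q = (1 << n) - 1
    while x > q:
        # 2**n is congruent to 1 mod q, so the high part folds onto the low part.
        x = (x & q) + (x >> n)
    return 0 if x == q else x


def mul_pow2(a: int, k: int, n: int) -> int:
    """Multiply a (0 <= a < q) by 2**k modulo q = 2**n - 1 as an n-bit left rotation."""
    q = (1 << n) - 1
    k %= n
    rotated = ((a << k) | (a >> (n - k))) & q
    return 0 if rotated == q else rotated
```

With n = 13 (q = 8191), `mul_pow2(a, 5, 13)` agrees with `(a * 32) % 8191` for every a below q, and no multiplier is involved. The OEF sizes quoted in the abstract (169, 289 and 361 bits) equal 13², 17² and 19², which is consistent with m = n for n = 13, 17 and 19.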

8.
A novel optical buffering architecture for Optical Packet Switching (OPS) networks is proposed in this article. The architecture, which adopts a fiber-sharing mechanism, aims to reduce the large number of fiber delay lines needed to resolve resource contention in the core nodes of OPS networks. The new architecture employs fewer fiber delay lines than other simple architectures but achieves the same performance. Simulation results and analysis show that the new architecture can decrease packet loss probability effectively and achieve reasonable performance in average packet delay.
Fang Guo

9.
Parallelization of operations is of utmost importance for efficient implementation of Public Key Cryptography algorithms. Starting with a classification of parallelization methods at different abstraction levels of public key algorithms, we propose a novel memory architecture for elliptic curve implementations with multiple modular multiplier units. This architecture is well-suited for different point addition and doubling algorithms over to be implemented on FPGAs. It allows the execution time to scale with the number of modular multipliers and exhibits nearly no overhead compared to the mere runtime of the multipliers. The advantages of this distributed memory architecture are demonstrated by means of two different point addition and doubling algorithms.
Sorin A. Huss

10.
Localization and mapping are two of the most important capabilities for autonomous mobile robots and have been receiving considerable attention from the scientific computing community over the last 10 years. One of the most efficient methods to address these problems is based on the Extended Kalman Filter (EKF). The EKF simultaneously estimates a model of the environment (map) and the position of the robot based on odometric and exteroceptive sensor information. As this algorithm demands a considerable amount of computation, it is usually executed on high-end PCs coupled to the robot. In this work we present an FPGA-based architecture for the EKF algorithm that is capable of processing two-dimensional maps containing up to 1.8k features in real time (14 Hz), a three-fold improvement over a Pentium M 1.6 GHz and a 13-fold improvement over an ARM920T 200 MHz. The proposed architecture also consumes only 1.3% of the Pentium's and 12.3% of the ARM's energy per feature.
Vanderlei Bonato

11.
This paper presents a propagator-based algorithm for underwater acoustic 2-D direction-of-arrival (DOA) estimation with arbitrarily spaced vector hydrophones at unknown locations. The proposed algorithm requires only linear operations and no eigen-decomposition or singular value decomposition into the signal and noise subspaces. Compared with its ESPRIT counterpart (Wong and Zoltowski, IEEE J Oceanic Eng 22:566–575, 1997a), the proposed propagator algorithm has its computational complexity reduced by the ratio of the number of sources to four times the number of vector hydrophones. Simulation results show that at high and medium signal-to-noise ratios, the proposed propagator algorithm's estimation accuracy is similar to that of its ESPRIT counterpart.
Zhong Liu

12.
This article presents a performance comparison of TDCS- and OFDM-based cognitive radio (CR) for a MIMO system that uses the V-BLAST receiver architecture to reconstruct the transmitted data. The interference avoidance performance in terms of BER and bit rate is improved by adding multiple antennas to the system and using the V-BLAST technique at the receiver. The results identify the most promising interference avoidance technique to combine with the MIMO V-BLAST architecture in a CR system.
L. P. Ligthart

13.
Alternative representations based on order statistics are derived for the probability of error for orthogonal, biorthogonal, and transorthogonal signaling. Short programs in are developed for the computation of these representations and to furnish evidence to show that their performance is superior to the traditional Monte Carlo approach.
Saralees Nadarajah

14.
Constant multiplications can be efficiently implemented in hardware by converting them into a sequence of nested additions and shift operations. They can be optimized further by finding common subexpressions among these operations. In this work, we present algebraic methods for eliminating common subexpressions. Algebraic techniques are established in multi-level logic synthesis for minimizing the number of literals, and hence gates, needed to implement Boolean logic. We use the concepts of two of these methods, namely rectangle covering and fast extract (FX), and adapt them to the problem of optimizing linear arithmetic expressions. The main advantage of using such methods is that we can optimize systems consisting of multiple variables, which is not possible using conventional optimization techniques. Our optimizations are aimed at reducing the area and power consumption of the hardware, and experimental results show up to 30.3% improvement in the number of operations over conventional techniques. Synthesis and simulation results show up to 30% area reduction and up to 27% power reduction. We also modified our algorithm to perform delay-aware optimization, in which common subexpression elimination is performed such that the delay does not exceed a specified value.
Ryan Kastner
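A small hand-worked case shows what sharing a common subexpression buys. The constant 45 is 101101 in binary, so a direct shift-and-add form needs three additions, while reusing the repeated bit pattern 101 (the value 5) needs only two. This sketch was derived by hand for illustration; it does not implement the rectangle covering or FX algorithms described in the abstract, and the function names are assumptions.

```python
def times_45_naive(x: int) -> int:
    # 45 = 0b101101 = 32 + 8 + 4 + 1: four shifted terms, three additions.
    return (x << 5) + (x << 3) + (x << 2) + x


def times_45_shared(x: int) -> int:
    # The bit pattern 101 occurs twice in 101101, so compute 5*x once and reuse it.
    t = (x << 2) + x      # 5*x, one addition
    return (t << 3) + t   # 40*x + 5*x, one more addition
```

In hardware each addition is an adder, so the shared form saves one adder out of three for this constant. The saving grows when several constants or several variables share the same pattern.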

15.
Wireless ATM (W-ATM) microcellular networks encounter severe problems during handovers. Microcellular solutions in W-ATM networks increase network control traffic as a result of frequent handover requests. This paper presents a two-layer microcellular ATM architecture which optimizes the handoff blocking probability performance of priority subscribers (PS) in a congested urban area. The lower layer of the proposed architecture is based on a microcellular ATM solution for normal subscribers (NS), while the higher layer is based on a high altitude stratospheric platform (HASP) overlay solution for absorbing the traffic load of the existing handoff calls of PS. Analysis is performed using Markov state diagrams in order to optimize the performance of W-ATM networks.
S. Louvros

16.
17.
In this article we describe a feedback-based OBS network architecture in which core nodes send messages to source nodes requesting them to reduce their transmission rate on congested links. Within this framework, we introduce a new congestion control mechanism called congestion control with explicit reduction request (CCERQ). Through feedback signals, CCERQ proactively attempts to prevent the network from entering the congestion state. Basic building blocks and performance tradeoffs of CCERQ are the main focus of this article.
Farid Farahmand

18.
This paper presents an Application-Specific Signal Processor (ASSP) for Orthogonal Frequency Division Multiplexing (OFDM) Communication Systems, called SPOCS. The instruction set and its architecture are specially designed for OFDM operations such as Fast Fourier Transform (FFT), scrambling/descrambling, puncturing, convolutional encoding, interleaving/deinterleaving, etc. SPOCS employs an optimized Data Processing Unit (DPU) to support the proposed instructions and an FFT Address Generation Unit (FAGU) to automatically calculate input/output data addresses. In addition, the proposed Bit Manipulation Unit (BMU) supports efficient bit manipulation operations. SPOCS has been synthesized using the SEC 0.18 μm standard cell library and has a much smaller area than commercial DSP chips. Compared with existing DSP chips, SPOCS can reduce the number of clock cycles by 8% to 53% for FFT and by about 48% to 84% for scrambling, convolutional encoding and interleaving. SPOCS can support various OFDM communication standards, such as Wireless Local Area Network (WLAN), Digital Audio Broadcasting (DAB), Digital Video Broadcasting-Terrestrial (DVB-T), etc.
Myung H. Sunwoo

19.
This paper presents a compact hardware architecture of a Context-Based Adaptive Binary Arithmetic Coding (CABAC) codec for H.264/AVC. The similarities between the encoding and decoding algorithms are exploited to achieve substantial hardware reuse. System-level hardware/software partitioning is conducted to improve overall performance. Meanwhile, the characteristics of the CABAC algorithm are utilized to implement a dynamic pipeline scheme, which increases the processing throughput with very small hardware overhead. The proposed architecture is implemented in 0.18 μm technology. Results show that the core area of the proposed design is 0.496 mm² at a maximum clock frequency of 230 MHz. It is estimated that the proposed architecture can support CABAC encoding or decoding for HD1080i resolution at a speed of 30 frames/s.
Lingfeng Li

20.
Expressions are given for the moment generating functions of the Rayleigh and generalized Rayleigh distributions.
Saralees Nadarajah
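For reference, the standard textbook result for the ordinary Rayleigh case is given below. It is not taken from the paper, whose own expressions (including those for the generalized Rayleigh distribution) are not reproduced in this listing. The scale parameter $\sigma$ and the density $f(x) = (x/\sigma^2)\, e^{-x^2/(2\sigma^2)}$ for $x \ge 0$ are the usual parameterization, assumed here.

$$
M(t) = \int_0^\infty e^{tx}\, \frac{x}{\sigma^2}\, e^{-x^2/(2\sigma^2)}\, dx
     = 1 + \sigma t\, e^{\sigma^2 t^2/2} \sqrt{\frac{\pi}{2}} \left[ \operatorname{erf}\!\left( \frac{\sigma t}{\sqrt{2}} \right) + 1 \right]
$$

As a quick check, differentiating at $t = 0$ gives $M'(0) = \sigma \sqrt{\pi/2}$, which is the Rayleigh mean.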
