Similar literature
 Found 18 similar documents (search time: 153 ms)
1.
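 The MQ (Multiple Quantization) encoder has become the performance bottleneck of JPEG2000 because of its low efficiency. This paper extracts the context relationships in the MQ coding algorithm, separates the start-up states from the non-transient states in the index table, and proposes a method for predicting index values. It also separates the high-probability and low-probability events that occur during renormalization, so that two contexts can be encoded in parallel. Based on this algorithm, the paper proposes a VLSI architecture for an MQ encoder with multi-context parallel processing. Experimental results show that the proposed MQ encoder operates at 286.80 MHz with a throughput of 573.60 Msymbols/sec. Compared with the Brute Force with Modified Byteout architecture proposed by Dyer, throughput improves by about 35% and area is reduced by 78%.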
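The two reported figures are consistent with two symbols being encoded per clock cycle, which matches the statement that two contexts are handled in parallel. A quick check follows; the two-symbols-per-cycle rate is an inference from the abstract rather than a figure it states, and the function name is illustrative.

```python
# Sanity check of the throughput reported in the abstract.
# ASSUMPTION: the encoder consumes 2 symbols per clock cycle
# (inferred from "two contexts in parallel"; not stated as a rate).
CLOCK_MHZ = 286.80         # operating frequency from the abstract
SYMBOLS_PER_CYCLE = 2      # assumed


def throughput_msymbols(clock_mhz: float, symbols_per_cycle: int) -> float:
    """Throughput in Msymbols/s for a given clock and per-cycle symbol rate."""
    return clock_mhz * symbols_per_cycle


print(f"{throughput_msymbols(CLOCK_MHZ, SYMBOLS_PER_CYCLE):.2f} Msymbols/s")
# prints "573.60 Msymbols/s"
```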

2.
Algorithm optimization and VLSI design of the JPEG2000 arithmetic encoder   Cited 1 time in total (self-citations: 1, citations by others: 0)
刘文松  朱恩  王健  徐龙涛  林叶 《电子学报》2011,39(11):2486-2491
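This paper studies the algorithm and circuit implementation of the JPEG2000 arithmetic encoder. It proposes a new sequential structure for the renormalization procedure: by adding an independent procedure that predicts the total shift count, the coding algorithm can complete the processing of the current context sequentially in a single pass. On this basis, a three-stage pipelined circuit with a slave pipeline is designed. The main pipeline handles the normal case in which no coded byte is output, and the slave pipeline separately handles the output of coded bytes, which effectively shortens the critical path delay of each stage...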
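To illustrate why a total shift count can be computed ahead of time, the sketch below compares bit-by-bit renormalization of the interval register with a direct leading-zero count. It is not the procedure from the paper, whose details the truncated abstract does not give. It assumes the usual MQ coder convention of a 16-bit interval register A that is shifted left until bit 15 (0x8000) is set, and both function names are hypothetical.

```python
# Illustrative sketch only, not the paper's prediction procedure.
# ASSUMPTION: A is a 16-bit interval register, renormalized by left
# shifts until bit 15 (0x8000) is set.


def renorm_shifts_iterative(a: int) -> int:
    """Count shifts by looping one bit at a time, as a serial design would."""
    shifts = 0
    while (a & 0x8000) == 0:
        a <<= 1
        shifts += 1
    return shifts


def renorm_shifts_predicted(a: int) -> int:
    """Compute the same count in one step from the leading zeros of A."""
    return 16 - a.bit_length()


# The two agree for every valid non-zero 16-bit value of A.
assert all(
    renorm_shifts_iterative(a) == renorm_shifts_predicted(a)
    for a in range(1, 0x10000)
)
```

In hardware the one-step version corresponds to a leading-zero detector followed by a barrel shifter, which removes the data-dependent loop from the critical path.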

3.
VLSI implementation of a high-speed MQ encoder for JPEG2000   Cited 6 times in total (self-citations: 0, citations by others: 6)
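The MQ encoder is a very effective method for lossless data compression and has been adopted by the JPEG2000 standard. However, the coding algorithm is complex and slow to execute. This paper proposes a VLSI architecture for a high-performance MQ encoder based on dynamic pipelining. To achieve high-speed processing, the software flow of the MQ coding algorithm in the JPEG2000 standard is first analyzed and modified to suit hardware implementation. A "dynamic pipeline" technique is then applied, which schedules pipeline operations in real time according to the varying computational load. The MQ encoder architecture is implemented on a Xilinx FPGA and reaches a processing speed of about 0.625 bit/cycle (32.83 Mbit/sec).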

4.
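The MQ encoder is an important lossless compression algorithm in the JPEG 2000 standard and achieves high compression efficiency. However, its high algorithmic complexity and slow execution greatly limit its application. To achieve high-speed processing, a VLSI architecture for a high-speed MQ encoder is designed. It uses a three-stage pipeline, optimizes the algorithm, and modifies the contents of the probability estimation table. The design is written in Verilog and simulated with Modelsim 6.1. Experimental results show that the design greatly increases coding speed. The work is significant for the practical application of JPEG 2000.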

5.
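To achieve simultaneous image compression and encryption, the embedded zerotree wavelet compression algorithm is improved using an MQ encoder: a hybrid chaotic sequence serves as a stream key to modify the contexts and decisions generated by bit-plane coding, which are then fed to the MQ encoder for entropy coding. Simulation results show that, compared with the original compression algorithm, the proposed algorithm raises the PSNR of the reconstructed image by at least 1 dB, resists attacks well, and encrypts and decrypts quickly. The algorithm achieves resolution-selective encryption and performs arithmetic encryption during data compression.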

6.
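To meet the hardware implementation requirements of a JPEG2000 encoder, an efficient hardware architecture is proposed for the Tier-1 encoder, its most complex and time-consuming part. The architecture uses a pass-parallel bit-plane coder and, within each pass, a column-based sample-skipping algorithm, which speeds up bit-plane coding. The MQ encoder works together with the bit-plane coder and introduces a five-stage dynamic pipeline to further improve coding efficiency. FPGA verification shows that a Tier-1 encoder using this architecture improves coding efficiency by 70% while increasing hardware cost by only 18.2%, a satisfactory result.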

7.
周赟  支琤  王峰  陈磊 《信息技术》2007,(10):49-52
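A VLSI architecture for a high-speed MQ arithmetic encoder based on pipelining is proposed. It uses table expansion and ping-pong buffer output, and it optimizes and adjusts the standard coding flow to suit high-speed VLSI implementation. The architecture is divided into three pipeline stages, which greatly increases processing speed. Verified on a Xilinx FPGA, it reaches a processing speed of 1 bit/cycle (47.292 Mbit/sec).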

8.
乔世杰  张益民  高勇   《电子器件》2007,30(6):2229-2232
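Bit-plane coding encodes the code-block data of the quantized discrete wavelet transform. Based on an analysis of the bit-plane coding algorithm and verification in C, VLSI structures are given for the four basic coding operations and the three coding passes of bit-plane coding. The VLSI structure of the bit-plane coder was simulated and synthesized, and measurements taken with a logic analyzer on an image verification system agree with the simulation results. The bit-plane coder can encode 32×32 code-block data at a clock frequency of 50 MHz. It has been used as a standalone IP core in a JPEG2000 image coding chip currently under development.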

9.
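Block-matching motion estimation is the most computation-intensive and memory-access-intensive module in a video encoder, and it is commonly implemented as a VLSI architecture to meet real-time encoding requirements. This paper systematically surveys VLSI architectures for block-matching motion estimation and suggests directions for improvement.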

10.
许磊  王胜利  谢慧  王丽丽 《电子科技》2010,23(9):77-79,82
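Because MQ coding in EBCOT consumes a large share of coding time and resources, an adaptive MQ rate control algorithm based on bit-rate feedback is proposed. Coding passes are selected adaptively, according to wavelet subband characteristics, to enter the MQ arithmetic encoder. The coding passes that have already entered the codestream provide feedback control over those that have not yet entered MQ: the truncation point is located, and code segments of coding passes that contribute nothing to the final codestream are discarded. The algorithm has almost no effect on overall image compression quality while substantially improving the coding efficiency of EBCOT as a whole. Experimental results show that the algorithm effectively reduces the computation and storage required by MQ in EBCOT and is easy to implement in hardware.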

11.
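This paper proposes a VLSI implementation of an adaptive arithmetic encoder for the H.264 standard. It improves the structure of arithmetic coding, replaces multiplication with table lookup, and uses a pipelined structure to achieve high throughput. The coding module is described in Verilog and verified by simulation on a field-programmable gate array (FPGA) from Altera. Experiments show that the pipelined arithmetic encoder achieves high coding speed.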

12.
In this paper, we describe a fully pipelined single chip VLSI architecture for implementing the JPEG baseline image compression standard. The architecture exploits the principles of pipelining and parallelism to the maximum extent in order to obtain high speed and throughput. The architecture for discrete cosine transform and the entropy encoder are based on efficient algorithms designed for high speed VLSI implementation. The entire architecture can be implemented on a single VLSI chip to yield a clock rate of about 100 MHz which would allow an input rate of 30 frames per second for 1024×1024 color images.

13.
Standard VLSI implementations of turbo decoding require substantial memory and incur a long latency, which cannot be tolerated in some applications. A parallel VLSI architecture for low-latency turbo decoding, comprising multiple single-input single-output (SISO) elements, operating jointly on one turbo-coded block, is presented and compared to sequential architectures. A parallel interleaver is essential to process multiple concurrent SISO outputs. A novel parallel interleaver and an algorithm for its design are presented, achieving the same error correction performance as the standard architecture. Latency is reduced up to 20 times and throughput for large blocks is increased up to six-fold relative to sequential decoders, using the same silicon area, and achieving a very high coding gain. The parallel architecture scales favorably: latency and throughput are improved with increased block size and chip area.

14.
A novel high performance bit parallel architecture to perform square root and division is proposed. Relevant VLSI design issues have been addressed. By employing redundant arithmetic and a semisystolic schedule, the throughput has been made independent of the size of the array.

15.
The computer-aided design of a VLSI PCM-FDM transmultiplexer is presented. The entire design process, from system specifications to integrated circuit layout, is carried out with the aid of specialized computer programs for the analysis, synthesis, and optimization at each design level: the filter network, the architecture, and the circuit layout. These CAD tools support a top-down custom design methodology based on bit-serial architectures and standard cells. A customized architecture is constructed which is integrated using a 5-μm CMOS cell library. The results are compared with a fully manual design and demonstrate the power of architecture based computer-aided design methodologies for VLSI filtering. By combining both synthesis and optimization aids at each design level it is possible to achieve a high degree of automation while retaining an efficient use of silicon area, high throughput, and moderate power consumption.

16.
《Microelectronics Journal》2002,33(1-2):77-89
Despite further refinements of the CORDIC algorithm with the introduction of redundant arithmetic and higher radix CORDIC techniques, in terms of circuit latency and performance, the iterative nature remains the major bottleneck for further optimization. A technique known as flat CORDIC, in which the conventional X and Y recurrences are successively substituted to express the final vectors in terms of the initial vectors, can be used to eliminate the iterative process. In this paper, the techniques devised for the VLSI efficient implementation of a pipelined 16-bit flat CORDIC based sine–cosine generator are presented. Three possible schemes to pipeline the 16-bit flat CORDIC design have been presented to demonstrate the suitability of the proposed method to realize high throughput implementations. The 16-bit architecture has been synthesized with a 0.35 μm CMOS process library using Synopsys. Finally, a detailed comparison with other major contributions shows that the flat CORDIC based sine–cosine generators are, on average, 30% faster and occupy some 30% less silicon area.
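For context, the conventional iterative X and Y recurrences that flat CORDIC unrolls can be sketched as follows. This is a floating-point illustration of standard rotation-mode CORDIC, not the paper's flat 16-bit fixed-point design; the function name and the default of 16 iterations are illustrative choices.

```python
import math


def cordic_sin_cos(theta: float, iterations: int = 16) -> tuple:
    """Return (sin, cos) of theta using iterative rotation-mode CORDIC.

    theta is expected in [-pi/2, pi/2], inside the convergence range.
    """
    # Elementary rotation angles atan(2^-i) and the aggregate scale factor.
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    gain = 1.0
    for i in range(iterations):
        gain /= math.sqrt(1.0 + 2.0 ** (-2 * i))

    # Pre-scale x so that no final multiplication is needed.
    x, y, z = gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0 else -1.0          # rotation direction
        # Each step depends on the previous x and y: this is the serial
        # dependency that the flat formulation removes by substitution.
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x


s, c = cordic_sin_cos(0.5)
print(abs(s - math.sin(0.5)) < 1e-3, abs(c - math.cos(0.5)) < 1e-3)
```

In hardware the multiplications by 2^-i become shifts, so each iteration costs only adders and shifters, but the iterations still have to run one after another.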

17.
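An ultra-compact processing element architecture is proposed. The processing element is based on an operate-and-branch single-instruction processor architecture. Using instruction optimization and accelerators on the internal bus, it can perform efficient bit operations, which conventional arithmetic-based single-instruction processors find difficult, as well as data transfer operations, which they execute inefficiently. A large-scale on-chip parallel computing array built from these processing elements can serve computing tasks with strong locality and strict real-time requirements, such as image processing. A 16×16 prototype array containing this processing element architecture has been implemented on an FPGA, reaching 30.7 GOPS at 120 MHz with an average power consumption of 39.5 mW.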
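The reported performance is consistent with a 16×16 array in which every processing element completes one operation per clock cycle. The check below rests on two assumptions that the abstract does not state explicitly: that the array is 16×16 (256 elements) and that each element contributes one operation per cycle. The function name is illustrative.

```python
# Back-of-the-envelope check of the reported GOPS figure.
# ASSUMPTIONS: 16 x 16 = 256 processing elements, each completing
# one operation per clock cycle (neither is stated as such in the abstract).


def peak_gops(rows: int, cols: int, clock_mhz: float, ops_per_cycle: int = 1) -> float:
    """Peak throughput in GOPS for a rows x cols array at clock_mhz."""
    return rows * cols * ops_per_cycle * clock_mhz / 1000.0


print(f"{peak_gops(16, 16, 120.0):.2f} GOPS")  # prints "30.72 GOPS"
```

Under these assumptions the result, 30.72 GOPS, rounds to the 30.7 GOPS quoted in the abstract.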

18.
The Block Decoder (BD), which is an indispensable component of the JPEG 2000 image compression standard, has the highest computational complexity and determines the speed of the overall decoder system. This paper proposes a high throughput pass parallel BD architecture, which can decode more than one bit per clock cycle. In BD, the dependency between the context generation and arithmetic decoding units causes stalling and reduces the throughput of the decoding process. The proposed selective byte input and synchronous sample skipping techniques are used to prevent stalling in the decoding process. The proposed architecture achieves 86% more throughput, with a 50% increase in hardware cost, than the best available serial BD architecture. In comparison with the best available pass parallel architecture, throughput improves almost 8.2 times with a 61% increase in hardware cost. Incorporating the speed-up techniques in the design is the main reason for the higher hardware consumption. The Figure of Merit of the proposed design, which is the ratio of throughput to hardware cost, is higher than that of the available BD architectures for a typical code block (CB) size of 32 × 32. The ASIC implementation of the proposed design consumes 66 mW power at maximum operating frequency.
