Similar literature
1.
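A novel transparent-quality coding method for digital audio signals is proposed that combines optimized selection of an adaptive wavelet basis with a psychoacoustic model. At a fixed distortion level, it minimizes the number of bits dynamically allocated to the transform coefficients of each frame. It also uses a dynamic codebook to remove the statistical redundancy of the audio signal and further reduce the bit rate. Monophonic CD music signals sampled at 44.1 kHz, with each sample represented by a 16-bit linear code, can be compressed to about 64 kb/s.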

2.
This paper presents a technique to incorporate psychoacoustic models into an adaptive wavelet packet scheme to achieve perceptually transparent compression of high-quality (44.1 kHz) audio signals at about 45 kb/s. The filter bank structure adapts according to psychoacoustic criteria and according to the computational complexity that is available at the decoder. This permits software implementations that can perform according to the computational power available in order to achieve real-time coding/decoding. The bit allocation scheme is an adapted zero-tree algorithm that also takes input from the psychoacoustic model. The measure of performance is a quantity called subband perceptual rate, which the filter bank structure adapts to approach the perceptual entropy (PE) as closely as possible. In addition, this method is also amenable to progressive transmission, that is, it can achieve the best quality of reconstruction possible considering the size of the bit stream available at the encoder. The result is a variable-rate compression scheme for high-quality audio that takes into account the allowed computational complexity, the available bit budget, and the psychoacoustic criteria for transparent coding. This paper thus provides a novel scheme to marry the results in wavelet packets and perceptual coding to construct an algorithm that is well suited to high-quality audio transfer for Internet and storage applications.

3.
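Research on an audio coding algorithm based on the wavelet transform and a psychoacoustic model   Total citations: 3, self-citations: 0, citations by others: 3
The problem audio coding must solve is representing an audio signal at a low bit rate with minimal perceptual distortion. This paper designs an adaptive bit allocation audio coding algorithm based on the orthogonal wavelet transform and a psychoacoustic model. It can compress a two-channel stereo high-fidelity audio signal from 1411.2 kbit/s to rates as low as 32 kbit/s while maintaining good audio quality.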

4.
Advances in speech and audio compression   Total citations: 4, self-citations: 0, citations by others: 4
Speech and audio compression has advanced rapidly in recent years, spurred on by cost-effective digital technology and diverse commercial applications. Recent activity in speech compression is dominated by research and development of a family of techniques commonly described as code-excited linear prediction (CELP) coding. These algorithms exploit models of speech production and auditory perception and offer a quality versus bit rate tradeoff that significantly exceeds most prior compression techniques for rates in the range of 4 to 16 kb/s. Techniques have also been emerging in recent years that offer enhanced quality in the neighborhood of 2.4 kb/s over traditional vocoder methods. Wideband audio compression is generally aimed at a quality that is nearly indistinguishable from consumer compact-disc audio. Subband and transform coding methods combined with sophisticated perceptual coding techniques dominate in this arena, with nearly transparent quality achieved at bit rates in the neighborhood of 128 kb/s per channel.

5.
A Hi-Fi audio codec with an improved adaptive transform coding (ATC) algorithm, implemented on digital signal processors (DSPs), is presented. An audio signal with a 20 kHz bandwidth sampled at 48 kHz is coded at a rate of 128 kb/s. The algorithm utilizes adaptive block size selection, which is effective for pre-echo suppression. A modified discrete cosine transform (MDCT) with a simple window set is employed to reduce block boundary noise without decreasing the performance of transform coding. In addition, a fast MDCT calculation algorithm, based on a fast Fourier transform, is adopted. Weighted bit allocation is employed to quantize the transformed coefficients. The codec was realized by a multiprocessor system composed of newly developed DSP boards. Subjective tests with the codec show that the coding quality is comparable to that of compact disc signals.

6.
Transform coding of audio signals using perceptual noise criteria   Total citations: 15, self-citations: 0, citations by others: 15
A 4-b/sample transform coder is designed using a psychoacoustically derived noise-masking threshold that is based on the short-term spectrum of the signal. The coder has been tested in a formal subjective test involving a wide selection of monophonic audio inputs. The signals used in the test were of 15-kHz bandwidth, sampled at 32 kHz. The bit rate of the resulting coder was 128 kb/s. The subjective test shows that the coded signal could not be distinguished from the original at that bit rate. Subsequent informal work suggests that a bit rate of 96 kb/s may maintain transparency for the set of inputs used in the test.
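
The figures in this abstract are mutually consistent: 4 bits per sample at a 32 kHz sampling rate gives 128 kb/s, and 96 kb/s corresponds to 3 bits per sample. A minimal sketch of that arithmetic follows; the function names are illustrative and do not come from the paper.

```python
def bit_rate_kbps(bits_per_sample: float, sample_rate_hz: float) -> float:
    """Coded bit rate in kb/s for a given average number of bits per sample."""
    return bits_per_sample * sample_rate_hz / 1000.0


def bits_per_sample(rate_kbps: float, sample_rate_hz: float) -> float:
    """Average bits per sample implied by a coded bit rate in kb/s."""
    return rate_kbps * 1000.0 / sample_rate_hz


# Values quoted in the abstract: 4 b/sample, 32 kHz sampling, 96 kb/s.
print(bit_rate_kbps(4, 32_000))    # 128.0
print(bits_per_sample(96, 32_000))  # 3.0
```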

7.
This paper describes a new audio coding scheme based on adaptive wavelet analysis that provides transparent audio coding for CD-audio signals at low bit rates (≈1.4 bits/sample per channel). A new perceptual cost function is defined to obtain the best wavelet-packet basis for each audio frame. The sharp variations in quantization noise that appear at the border of the frames are minimized by a novel approach that avoids overlapping. The proposed coder guarantees high perceptual quality using filters that generate wavelets of any compact support, because a bit-allocation algorithm that takes into account the equivalent filter frequency responses of the synthesis filter bank branches is used.

8.
Algorithm of Adaptive Bit Allocation Wavelet Transform Audio Coding   Total citations: 2, self-citations: 0, citations by others: 2
Algorithm of Adaptive Bit Allocation Wavelet Transform Audio Coding. Ma Hongfei, Fan Changxin, Song Guoxiang (Xidian University, Xi'an 71...

9.
A perceptual audio coder has been constructed in which each audio segment is adaptively analyzed using either a sinusoidal or an optimum wavelet basis, according to the time-varying characteristics of the audio signal. The basis optimization is achieved by a novel switched filter bank scheme, which switches between a uniform filter bank structure (discrete cosine transform) and a non-uniform filter bank structure (discrete wavelet transform). Pre-echo distortion, a major artifact of the ISO/Moving Picture Experts Group (MPEG) audio coding standard (MPEG-1 layers 1 and 2), which uses a uniform filter bank structure for audio signal analysis, is almost eliminated in the proposed coder. A perceptual masking model implemented using a high-resolution wavelet packet filter bank with 27 subbands, closely mimicking the critical bands of the human auditory system, is employed in this audio coder. The resulting scheme is a variable bit-rate audio coder, which provides compression ratios comparable to MPEG-1 layers 1 and 2 with almost transparent quality.

10.
This paper presents a transform coding algorithm devoted to high-quality audio coding at a bit rate of 64 kbps per monophonic channel. It enables the transmission of high-quality stereo sound through the basic access (2B channels) of ISDN. Although a complete system including framing, synchronization and error correction has been developed, only the bit rate compression algorithm is described here. A detailed analysis of the signal processing techniques, such as the time/frequency transformation, the pre-echo reduction by adaptive filtering, and the fast algorithm computations, is provided. The use of psychoacoustic properties is also reported in detail. Finally, some subjective evaluation results and one real-time implementation of the coder using the AT&T DSP32C digital signal processor are presented.

11.
This paper presents new wideband speech coding and integrated speech coding-enhancement systems based on frame-synchronized fast wavelet packet transform algorithms. It also formulates temporal and spectral psychoacoustic models of masking adapted to wavelet packet analysis. The algorithm of the proposed FFT-like overlapped block orthogonal wavelet packet transform permits us to efficiently approximate the auditory critical band decomposition in the time and frequency domains. This allows us to make use of the temporal and spectral masking properties of the human auditory system to decrease the average bit rate of the encoder while perceptually hiding the quantization error. The same wavelet packet representation is used to merge speech enhancement and coding in the context of auditory modeling. The advantage of the method presented in this paper over previous approaches is that perceptual enhancement and coding, which are usually implemented as a cascade of two separate systems, are combined. This leads to a decreased computational load. Experiments show that the proposed wideband coding procedure by itself can achieve transparent coding of speech signals sampled at 16 kHz at an average bit rate of 39.4 kbit/s. The combined speech coding-enhancement procedure achieves higher bit rate values that depend on the residual noise characteristics at the output of the enhancement process.

12.
This article is an overview of the standardization, architecture, and performance of the new ITU-T Recommendation G.718. G.718 is an embedded variable bit rate codec providing a scalable solution for compression of 8 and 16 kHz sampled speech and audio signals at rates between 8 kb/s and 32 kb/s. It comprises five layers where higher-layer bitstreams can be discarded without affecting the lower layers' decoding. The codec also has an optional core layer interoperable with ITU-T G.722.2 (3GPP AMR-WB) at 12.65 kb/s. G.718 was designed to provide high speech quality at low bit rates and to be robust to significant rates of frame erasures or packet losses. It is also targeting good quality for generic audio at higher rates.

13.
Multi-domain speech compression based on wavelet packet transform   Total citations: 3, self-citations: 0, citations by others: 3
The authors present a multi-domain speech compression method based on a wavelet packet transform. The signals are compressed in domains with different time-frequency resolutions according to their energy distribution in these domains. It is shown that this method is simple to implement and is effective at compressing speech and audio signals, even at bit rates as low as 2 kbit/s.

14.
The paper presents an efficient method for speech encoding which is based on the well-known idea of sub-band coding. Typically, the frequency range from 0.3 kc/s to 3.4 kc/s is split into four sub-bands, and the sub-band signals are encoded separately with different accuracies by means of familiar PCM techniques. An adaptive bit allocation scheme is introduced here, in order to replace the usual form of a fixed distribution of the bit rate among the sub-bands. Listening tests have shown that by these means the bit rate can be reduced by more than 2.5 kb/s without degrading speech quality. Accordingly, highly intelligible reproduction of speech is possible at bit rates below 7 kb/s.

15.
Boland, S.; Deriche, M. Electronics Letters, 1997, 33(4): 262-263
A new audio coding system is proposed using an M-band multiresolution filter bank technique. The filter bank consists of a cascade of 4-band and 8-band filter banks. Experiments with a complete audio coding system were carried out with the proposed filter bank, masking model, bit allocation algorithm, scalar quantisation and Huffman coding. For the broadband signals tested, the proposed system resulted in near-transparent quality at bit rates of 78-91 kbit/s with low computational load. It also achieved similar performance to the MPEG layer 2 coder at 128 kbit/s.

16.
Yao, S.; Clarke, R.J. Electronics Letters, 1992, 28(17): 1566-1568
A new scheme for image sequence coding using the wavelet transform and adaptive vector quantisation is proposed. The transform is used to decompose an image into multiresolution and multiband sub-images. Adaptive vector quantisation is then applied to achieve image data compression with good quality and low bit rate. Experimental results are presented.

17.
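Audio compression coding based on adaptive wavelet packet decomposition   Total citations: 1, self-citations: 0, citations by others: 1
This paper presents an effective scheme for applying wavelets to audio compression coding. The scheme combines adaptive wavelet packet decomposition with a psychoacoustic model of human hearing, makes full use of the masking properties of the ear to construct the filter bank adaptively, and uses a zero-tree algorithm for bit allocation. Audio sampled at 44.1 kHz and coded with 16 bits is thereby compressed into an audio bit stream of about 45 kb/s at close to CD quality.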

18.
We describe a spatially scalable video coding framework in which motion correspondences between successive video frames are exploited in the wavelet transform domain. The basic motivation for our coder is that motion fields are typically smooth and, therefore, can be efficiently captured through a multiresolutional framework. A wavelet decomposition is applied to each video frame and the coefficients at each level are predicted from the coarser level through backward motion compensation. To remove the aliasing effects caused by downsampling in the transform, a special interpolation filter is designed with the weighted aliasing energy as part of the optimization goal, and motion estimation is carried out with low-pass filtering and interpolation in the estimation loop. Further, to achieve robust motion estimation against quantization noise, we propose a novel backward/forward hybrid motion compensation scheme, and a tree-structured dynamic programming algorithm to optimize the backward/forward mode choices. A novel adaptive quantization scheme is applied to code the motion-predicted residue wavelet coefficients. Experimental results reveal a 0.3 to 2 dB increase in coded PSNR at low bit rates over the state-of-the-art H.263 standard with all enhancement modes enabled, and similar improvements over MPEG-2 at high bit rates, with a considerable improvement in subjective reconstruction quality, while simultaneously supporting a scalable representation.

19.
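Any finite-energy signal can be expanded or decomposed on a compactly supported orthogonal wavelet basis, which is very important for research on fast, efficient audio coding algorithms. This paper designs a high-fidelity audio coding algorithm based on the orthogonal wavelet transform. The algorithm can compress a high-fidelity audio signal at 705.6 kbit/s to 192, 160, 128, 96, and 64 kbit/s while keeping the reconstructed audio signal at high quality.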
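
The 705.6 kbit/s source rate in this abstract matches uncompressed single-channel PCM at 44.1 kHz with 16 bits per sample. The abstract does not state the sampling parameters, so that reading is an assumption. Under it, the compression ratio for each target rate can be sketched as follows, with illustrative function names:

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int = 1) -> float:
    """Uncompressed PCM bit rate in kbit/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def compression_ratio(source_kbps: float, target_kbps: float) -> float:
    """Ratio of the source bit rate to the coded bit rate."""
    return source_kbps / target_kbps


# Assumed source format: 44.1 kHz, 16-bit, one channel (not stated in the abstract).
source = pcm_bit_rate_kbps(44_100, 16, channels=1)
print(f"source rate: {source} kbit/s")

# Target rates listed in the abstract.
for target in (192, 160, 128, 96, 64):
    print(f"{target} kbit/s -> {compression_ratio(source, target):.3f}:1")
```

With two channels, the same function gives the 1411.2 kbit/s stereo CD rate.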

20.
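A quality-scalable image coding method based on bit planes is proposed. Exploiting the good space-frequency properties of the wavelet transform, it applies bit-plane coding and arithmetic coding to still images and, at decoding, reconstructs the bit planes according to the given decoding rate to achieve quality scalability. Experimental results show that the method is simple to implement and fast in encoding and decoding, and that it achieves quality-scalable image coding while maintaining good coding performance.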
