1.

Perceptual coding of narrow-band audio signals at low rates

Najaf-Zadeh H., Kabal P. 《IEEE Transactions on Audio, Speech, and Language Processing》2006,14(2):609-622

This paper describes a coding paradigm that uses tools based on the characteristics of the human hearing system to accommodate a wide range of narrow-band audio inputs without annoying artifacts at low rates (down to 8 kb/s). The narrow-band perceptual audio coder (NPAC) employs a variety of algorithms to account for the perceptually irrelevant parts of the input signal in addition to statistical redundancies. The new algorithms used in the NPAC coder include a perceptual error measure, used in training the codebooks and selecting the best codewords, which takes into account the audible parts of the quantization noise; a perception-based bit-allocation algorithm; and a new predictive scheme to vector quantize the scale factors. The NPAC coder delivers acceptable quality without annoying artifacts for most narrow-band audio signals at around 1 bit/sample. Informal subjective tests have shown that the NPAC coder outperforms a commercial low-rate music coder operating at 8 kb/s.

2.

On Integer MDCT for Perceptual Audio Coding

Te Li, Rahardja S., Rongshan Yu, Soo Ngee Koh 《IEEE Transactions on Audio, Speech, and Language Processing》2007,15(8):2236-2248

In MPEG-4 scalable lossless coding (SLS), which was published as an ISO standard in June 2006, the integer modified discrete cosine transform (IntMDCT) was adopted to enable efficient lossless reconstruction. In addition, there is an MDCT filterbank inherent to the advanced audio coding (AAC) core that is present in the SLS codec. The presence of two filterbanks increases the complexity of the implementation, and for this reason the MDCT is disabled and the IntMDCT is the only filterbank employed in SLS for both lossy and lossless operation. Because of the rounding operations in the IntMDCT, there is a concern that using the IntMDCT for perceptual audio coding will degrade the fidelity of the audio codec. This paper addresses this concern by analyzing the performance of the IntMDCT in a lossy coding scenario. It is found that noise introduced by the IntMDCT does not affect the perceptual quality of the coded audio under standard playback circumstances. The paper therefore concludes that the MDCT and IntMDCT filterbanks are interchangeable at lossy bit rates, and that using only the IntMDCT filterbank in scalable audio coding is justified.

3.

A fine granular scalable to lossless audio coder

Rongshan Yu, Rahardja S., Lin Xiao, Chi Chung Ko 《IEEE Transactions on Audio, Speech, and Language Processing》2006,14(4):1352-1363

This paper presents Advanced Audio Zip (AAZ), a fine-grained scalable to lossless (SLS) audio coder that has recently been adopted as the reference model for MPEG-4 audio SLS work. AAZ integrates the functionalities of high-compression perceptual audio coding, fine granular scalable audio coding, and lossless audio coding in a single framework, and simultaneously provides backward compatibility with MPEG-4 Advanced Audio Coding (AAC). AAZ provides fine granular bit-rate scalability from lossy to lossless coding, and this scalability is achieved in a perceptually meaningful way, i.e., better perceptual quality at higher bit rates. Despite its abundant functionalities, AAZ introduces only negligible overhead in lossless compression performance compared with a nonscalable, lossless-only audio coder. As a result, AAZ provides a universal yet efficient solution for digital audio applications such as audio archiving, network audio streaming, portable audio playing, and music downloading, which were previously served by several different audio coding technologies, and eliminates the need for any transcoding system to share digital audio content across these application domains.

4.

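Performance Comparison of AVS-P2 and AVS-P7 for Mobile Video Applications

CHEN Yong-hua, WANG Yi, LIU Dong-hua, YANG Li-zhi 《Microcomputer Development》2008,(7)

AVS is short for the standard series "Information Technology - Advanced Coding of Audio and Video", an audio and video coding standard developed independently in China and aimed mainly at high-definition television, high-density optical storage, and mobile media applications. It is a complete standard system covering systems, video, audio, and digital rights management. The video standard has two parts: AVS-P2 for digital television applications and AVS-P7 for mobile applications. The paper compares the key technologies of the two video standards in the mobile video domain, analyzes them using experimental data, and discusses the prospects of both standards for mobile video applications.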

5.

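An Audio Coding Algorithm Based on Wavelet Packets and a Psychoacoustic Model (total citations: 6; self-citations: 1; citations by others: 5)

He Dongmei (何冬梅), Gao Wen (高文) 《Journal of Computer Research and Development》2000,37(3):329-335

This paper proposes a new audio coding algorithm suited to real-time multimedia applications. The algorithm first decomposes the audio signal with wavelet packets, then computes masking thresholds in the wavelet domain, and finally uses the signal-to-mask ratios obtained from the psychoacoustic model to dynamically allocate bits to, quantize, and encode the wavelet coefficients of each subband. Experimental results show that when CD audio is compressed to 64 kbps, the segmental signal-to-noise ratio of the reconstructed signal is 32.32 dB, with no subjectively perceptible distortion. The algorithm is computationally simple and can encode audio in real time on a 133 MHz Pentium personal computer without any additional hardware.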
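
The segmental SNR figure quoted in this abstract is a frame-wise average of per-frame SNR values in dB. The following is a minimal sketch of that metric, not the authors' evaluation code: the function name `segmental_snr_db`, the frame length of 256 samples, and the rule of skipping silent frames are all assumptions made for illustration.

```python
import math

def segmental_snr_db(reference, decoded, frame_len=256, eps=1e-12):
    """Average the per-frame SNR (in dB) over all complete frames.

    frame_len and the silent-frame rule are illustrative assumptions;
    the abstract does not state the settings used in the paper.
    """
    frame_snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(r * r for r in ref)
        noise_energy = sum((r - d) ** 2 for r, d in zip(ref, dec))
        if signal_energy <= eps:
            continue  # skip silent frames, which would distort the average
        frame_snrs.append(10 * math.log10(signal_energy / max(noise_energy, eps)))
    return sum(frame_snrs) / len(frame_snrs) if frame_snrs else float("nan")

# Usage: a decoded signal that is the reference scaled by 0.9 has an
# error energy of 1% of the signal energy in every frame.
reference = [math.sin(0.1 * n) for n in range(1024)]
decoded = [0.9 * r for r in reference]
seg_snr = segmental_snr_db(reference, decoded)
```

Averaging in the dB domain weights quiet and loud frames equally, which is why segmental SNR is often preferred over a single global SNR when judging coded audio.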

6.

Frequency Region-Based Prioritized Bit-Plane Coding for Scalable Audio

Te Li, Rahardja S., Soo Ngee Koh 《IEEE Transactions on Audio, Speech, and Language Processing》2008,16(1):94-105

A perceptually enhanced prioritized bit-plane audio coding algorithm is presented in this paper. According to the energy distribution in different frequency regions, the bit-planes are prioritized with optimized parameters. Based on statistical modeling of the frequency spectrum, a much simplified implementation of prioritized bit-plane coding is integrated with the recently released MPEG-4 scalable lossless (SLS) audio coding structure by replacing the sequential bit-plane coding in the enhancement layer. With zero extra side information and only trivial added complexity and modification to the original SLS structure, extensive experimental results show that the perceptual quality of SLS with no core or a very low core bit rate is improved significantly over a wide range of bit-rate combinations. Fully scalable audio coding up to lossless with much enhanced perceptual quality is thus achieved.
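
For readers unfamiliar with bit-plane coding, the sketch below shows only the sequential baseline that the abstract says is being replaced: integer coefficients are split into sign bits and magnitude bit-planes, sent from most to least significant, so a truncated stream still yields a coarse reconstruction. The function names and the fixed plane count are assumptions for illustration; the paper's frequency-region prioritization and the entropy coding used in SLS are not modeled.

```python
def to_bit_planes(coeffs, num_planes):
    """Split integers into sign flags and magnitude bit-planes (MSB plane first)."""
    signs = [c < 0 for c in coeffs]
    mags = [abs(c) for c in coeffs]
    planes = [[(m >> p) & 1 for m in mags] for p in range(num_planes - 1, -1, -1)]
    return signs, planes

def from_bit_planes(signs, planes, num_planes):
    """Rebuild integers from however many leading bit-planes were received."""
    mags = [0] * len(signs)
    for k, plane in enumerate(planes):
        p = num_planes - 1 - k  # bit position carried by the k-th received plane
        for i, bit in enumerate(plane):
            mags[i] |= bit << p
    return [-m if s else m for s, m in zip(signs, mags)]

# Usage: all four planes give a lossless result; the top two give a coarse one.
coeffs = [5, -3, 0, 12]
signs, planes = to_bit_planes(coeffs, 4)
lossless = from_bit_planes(signs, planes, 4)
coarse = from_bit_planes(signs, planes[:2], 4)
```

A prioritized scheme would change the order in which plane segments are emitted so that perceptually important regions are refined first; the decoder logic stays the same in spirit.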

7.

Adaptive Signal Modeling Based on Sparse Approximations for Scalable Parametric Audio Coding

Ruiz Reyes N., Vera Candeas P. 《IEEE Transactions on Audio, Speech, and Language Processing》2010,18(3):447-460

This paper deals with the application of adaptive signal models to parametric audio coding. A fully parametric audio coder, which decomposes the audio signal into sinusoids, transients, and noise, is proposed. Adaptive signal models for sinusoidal, transient, and noise modeling are included in the parametric scheme in order to achieve high-quality, low bit-rate audio coding. A new sinusoidal modeling method based on a perceptual distortion measure is proposed. For transient modeling, a fast and effective method based on matching pursuit with a mixed dictionary is chosen. The residue of the previous models is analyzed as a noise-like signal. The proposed parametric audio coder allows high-quality coding of one-channel audio signals at 16 kbits/s (average bit rate). A bit-rate scalable version of the parametric audio coder is also proposed. Bit-rate scalability is intended for audio streaming applications, which are in high demand. The performance of the proposed parametric audio coders (nonscalable and scalable) is assessed in comparison with widely used audio coders operating at similar bit rates.

8.

A Backward-Compatible Multichannel Audio Codec

Hotho G., Villemoes L.F., Breebaart J. 《IEEE Transactions on Audio, Speech, and Language Processing》2008,16(1):83-93

We propose in this paper a backward-compatible multichannel audio codec. This codec represents a multichannel audio input signal by a down mix and parametric data. To enable backward compatibility, it must be possible to exert control over the down-mixing procedure. At the same time, to achieve high coding efficiency, both signal and perceptual redundancies should be exploited. In this paper, we describe a codec that satisfies both conditions: backward compatibility and exploitation of both signal and perceptual redundancies. The codec combines high audio quality with a low parameter bit rate. Moreover, its design is flexible; examples are the scalability of the audio quality up to (in principle) transparency and the possibility of preserving the correlation structure of the original input signals by using synthetic signals. A stereo backward-compatible version of the proposed codec is used as a component of the recently standardized MPEG Surround multichannel audio codec.

9.

Optimized Design and Implementation of the AVS-P10 Stereo Coding Algorithm

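Li Shiqing (李诗晴), Tu Weiping (涂卫平) 《Computer Engineering and Applications》2016,52(8):141-147

AVS-P10 is China's first mobile audio codec standard with fully independent intellectual property rights, but its stereo coding suffers from an unstable reconstructed sound image and high encoding complexity. Based on the principle of parametric stereo coding, an efficient stereo codec scheme is designed and implemented on top of the AVS-P10 core coder, using channel downmixing together with stereo parameter extraction and synthesis. Experimental results show that, at the same bit rate, the optimized algorithm improves subjective quality over the AVS-P10 stereo coding algorithm by about 10 MUSHRA points, reduces encoding complexity by 40% to 60%, and slightly reduces decoding complexity.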
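
The downmix and stereo parameter extraction steps mentioned in this abstract can be illustrated generically. The sketch below is a textbook-style parametric stereo analysis for a single frame, not the scheme from the paper: the function name `downmix_and_params`, the plain average downmix, and the choice of level difference and correlation as the two parameters are assumptions, and real coders compute such parameters per frequency band.

```python
import math

def downmix_and_params(left, right, eps=1e-12):
    """Return a mono downmix plus two common stereo cues for one frame.

    ild_db: inter-channel level difference in dB (positive if left is louder).
    icc:    normalized inter-channel correlation in [-1, 1].
    """
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    energy_left = sum(l * l for l in left)
    energy_right = sum(r * r for r in right)
    cross = sum(l * r for l, r in zip(left, right))
    ild_db = 10 * math.log10((energy_left + eps) / (energy_right + eps))
    icc = cross / math.sqrt((energy_left + eps) * (energy_right + eps))
    return mono, ild_db, icc

# Usage: the right channel is the left channel at half amplitude, so the
# channels are fully correlated and differ in level by about 6 dB.
left = [1.0, -1.0, 0.5, -0.5]
right = [0.5 * x for x in left]
mono, ild_db, icc = downmix_and_params(left, right)
```

A decoder would use cues like these to re-spread the decoded mono signal across two channels, which is where the bit-rate saving of parametric stereo comes from.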

10.

Data reduction of audio by exploiting musical repetition

Stuart Cunningham, Vic Grout 《Multimedia Tools and Applications》2014,72(3):2299-2320

This paper presents and evaluates a method of audio compression specifically designed to exploit the natural repetition that occurs within musical audio. The system is called Audio Compression Exploiting Repetition (ACER). ACER is a perceptual technique, but rather than exploiting masking it applies the principles of Lempel-Ziv and run-length encoding, substituting audio sequences for numeric or character strings. The ACER procedure applies a pseudo-exhaustive search process and spectral difference grading. Since ACER exploits musical structure, the amount of data reduction achieved varies from piece to piece. The system is described before results on a corpus of material are presented. The analysis shows that moderate data reduction takes place while the system operates within parameters designed to maintain high levels of perceptual audio quality, and that lower levels of perceptual quality yield greater data reduction. Objective quality evaluations reveal a degradation in fidelity that depends on the compression parameters.
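
The idea of substituting repeated audio sequences with references can be sketched in a few lines. The code below is only an illustration of the dictionary-style principle and is not ACER: block boundaries are fixed, the match test is a time-domain mean absolute difference standing in for the paper's spectral difference grading, and the names `encode_repeats`, `decode_repeats`, `block`, and `tol` are hypothetical.

```python
def encode_repeats(samples, block, tol):
    """Greedy block matcher emitting ('raw', samples) or ('ref', token_index)."""
    tokens = []
    stored = []  # (token index, samples) for every block kept verbatim
    for start in range(0, len(samples), block):
        cur = samples[start:start + block]
        match = None
        for idx, ref in stored:
            if len(ref) != len(cur):
                continue
            diff = sum(abs(a - b) for a, b in zip(ref, cur)) / len(cur)
            if diff <= tol:  # close enough to reuse the earlier block
                match = idx
                break
        if match is None:
            stored.append((len(tokens), cur))
            tokens.append(("raw", cur))
        else:
            tokens.append(("ref", match))
    return tokens

def decode_repeats(tokens):
    """Expand references back into samples (exact only when tol was 0)."""
    blocks = []
    for kind, payload in tokens:
        blocks.append(payload if kind == "raw" else blocks[payload])
    return [s for b in blocks for s in b]

# Usage: a four-sample motif repeated three times, then a different block.
samples = [1, 2, 3, 4] * 3 + [9, 9, 9, 9]
tokens = encode_repeats(samples, block=4, tol=0)
restored = decode_repeats(tokens)
```

Raising `tol` trades fidelity for more references, which mirrors the quality versus data-reduction trade-off the abstract reports.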

11.

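Application of the BS.1387 Acoustic Model in Audio Coding Systems

Hu Xiaopeng (胡小鹏), Li Xun (李迅), He Guiming (贺贵明), Zhou Xiaoping (周小平) 《Computer Engineering and Applications》2006,42(11):6-9

The basic version of the acoustic model used for audio quality assessment in ITU-R BS.1387 is combined with a practical audio coding system. The characteristics of the model are analyzed theoretically, and corresponding improvements are proposed so that it can be applied in practical audio coding systems. On the reference encoder of China's recently established AVS audio coding standard, both this acoustic model and psychoacoustic reference model 2 of the MPEG-2 AAC audio standard are implemented, and the masking parameters output by the models and the results of subjective listening tests are compared for verification. The test results show that the new acoustic model designed in the paper for audio encoders is reasonable and feasible.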

12.

Introduction to AVS Audio (total citations: 1; self-citations: 0; citations by others: 1)

Hao-Jun Ai, Shui-Xian Chen, and Rui-Min Hu 《Journal of Computer Science and Technology》2006,21(3):360-365

This paper describes a general audio coding algorithm that has recently been standardized by AVS, China. The algorithm is based on a perceptual coding technique. The codec delivers near CD-quality audio at 128 kb/s. The paper describes the coder structure in detail and discusses the reasons for specific design methods. A summary of the subjective test results is presented for the prototype codec. A Comparison Mean Opinion Score (CMOS) test indicates that the quality of the AVS audio coder is comparable with that of the MPEG Layer-3 audio coder. A real-time decoder based on a 16-bit fixed-point DSP was used for the characterization test. The performance of the DSP solution is demonstrated, including its computational complexity and storage characteristics.

13.

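Software Implementation of Encoding and Decoding for the RS(255,223) Code (total citations: 2; self-citations: 0; citations by others: 2)

Liu Yue (刘悦), Liu Mingye (刘明业), Shang Zhenhong (尚振宏) 《Computer Applications and Software》2006,23(11):46-47,116

To implement software encoding and decoding of the RS(255,223) code, encoding and decoding algorithms for the code were designed in a high-level language on the basis of a study of error-correction techniques. Experiments show that the software implementation of the RS error-correcting encoding and decoding algorithms is efficient.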
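
An RS(255,223) code over GF(2^8) carries 223 data symbols and 255 - 223 = 32 parity symbols per codeword, so it can correct up to 16 symbol errors. The sketch below shows one way a systematic software encoder for such a code can look; it is not the implementation from the paper. The abstract does not give the field polynomial or the generator roots, so the primitive polynomial 0x11D and the roots alpha^0 to alpha^31 are assumptions, and decoding is omitted (only a syndrome check is shown).

```python
PRIM = 0x11D          # assumed primitive polynomial x^8 + x^4 + x^3 + x^2 + 1
N, K = 255, 223       # codeword length and message length in symbols
NSYM = N - K          # 32 parity symbols, so up to 16 correctable symbol errors

# Exponent and logarithm tables for GF(2^8).
EXP = [0] * 512
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]

def gf_mul(a, b):
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

def poly_mul(p, q):
    """Multiply two polynomials (highest-degree coefficient first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] ^= gf_mul(a, b)
    return out

def generator_poly(nsym):
    """Product of (x + alpha^i) for i in 0..nsym-1 (assumed root choice)."""
    g = [1]
    for i in range(nsym):
        g = poly_mul(g, [1, EXP[i]])
    return g

def rs_encode(msg):
    """Systematic encoding: message followed by the division remainder."""
    assert len(msg) == K
    gen = generator_poly(NSYM)
    rem = list(msg) + [0] * NSYM
    for i in range(K):
        coef = rem[i]
        if coef:
            for j in range(1, len(gen)):
                rem[i + j] ^= gf_mul(gen[j], coef)
    return list(msg) + rem[K:]

def syndromes(codeword):
    """Evaluate the codeword at each generator root; all zero means no detected error."""
    result = []
    for i in range(NSYM):
        y = 0
        for c in codeword:
            y = gf_mul(y, EXP[i]) ^ c
        result.append(y)
    return result

# Usage: encode a test message, then flip one bit to see the syndromes react.
message = list(range(K))
codeword = rs_encode(message)
corrupted = list(codeword)
corrupted[10] ^= 0x01
```

A full decoder would go on from the nonzero syndromes to locate and correct the errors, for example with the Berlekamp-Massey algorithm followed by a Chien search.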

14.

Scalable Audio Compression at Low Bitrates

Kandadai S., Creusere C.D. 《IEEE Transactions on Audio, Speech, and Language Processing》2008,16(5):969-979

A perceptually scalable audio coder generates a bit stream that contains layers of audio fidelity and is encoded in such a way that adding one of these layers enhances the reconstructed audio by an amount that is just noticeable to the listener. Such algorithms have applications like music on demand at variable levels of fidelity, for instance over 3G and 4G cellular radio systems operating at different bit rates. While the MPEG-4 natural audio coder can create finely scalable bit streams using bit-sliced arithmetic coding (BSAC), its perceptual quality at low bit rates is poor. On the other hand, the nonscalable transform-domain weighted interleaved vector quantization (TWIN-VQ) performs well at low bit rates. In this paper, we present a modified version of the TWIN-VQ algorithm that generates a perceptually scalable bit stream with many fine layers of audio fidelity. Using TWIN-VQ as the base ensures the best possible perceptual quality at low bit rates. Specifically, the proposed scalable algorithm performs as well as TWIN-VQ at rates of 8 to 16 kb/s and outperforms scalable BSAC by between 64% and 172% at rates of less than 24 kb/s.

15.

Design of a Covert Communication Algorithm Based on Watermarking

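Liang Qiang (梁强), Qiu Zhihong (邱志宏), Zhang Aike (张爱科) 《Computer Engineering & Science》2010,32(8):32-35

This paper examines the difficulties and value of applying watermarking to audio, analyzes and compares algorithms for hiding information in audio signals, discusses coding strategies for covert communication, and proposes a hybrid-mode information-hiding coding algorithm for speech signals. It first gives definitions of covering radius, covering codes, and related concepts used in the coding process, along with the relevant theorems of information-hiding coding, and analyzes theoretically the feasibility and steps of the covert communication coding strategy. It then details the basic principle, construction method, and implementation of the coding algorithm. Next, using a segment of audio as an example, it describes how the covert communication algorithm is implemented and the coding steps involved. Finally, the algorithm's performance is tested in three respects: spectral analysis, hidden-information embedding capacity, and auditory effect. The test results show that the embedding capacity can reach 2.1×10^3 bps.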

16.

A neural network approach to audio data hiding based on perceptual masking model of the human auditory system

Hossein L. Najafi 《Applied Intelligence》2007,27(3):269-275

A new system that employs artificial neural networks to identify perceptually masked transmission opportunities within an audio stream is presented. The neural network is trained to automatically extract the perceptual map of the human auditory system. The network is then used at the encoding end to identify opportunities for transmitting inaudible data within the voice stream. At the decoding end, the network is used to monitor the audio channel for the presence of masked data. Increased data transmission rates, resistance to compression algorithms, and increased processing gains are among the advantages of the proposed solution.

17.

A source coding scheme for authenticating audio signal with capability of self-recovery and anti-synchronization counterfeiting attack

Fan MingQuan 《Multimedia Tools and Applications》2020,79(1-2):1037-1055

Authenticating the veracity and integrity of digital media content is the most important application of fragile watermarking. Recently, fragile watermarking schemes for digital audio signals have been developed that not only detect malicious falsification but also recover the tampered audio content. However, they are vulnerable to synchronization counterfeiting attacks, which greatly narrows the applicability of audio watermarking schemes. In this paper, a novel source coding scheme for authenticating audio signals is proposed, based on set partitioning in hierarchical trees (SPIHT) encoding and a chaotic dynamical system, with self-recovery capability and resistance to synchronization counterfeiting attacks. For self-recovery, a compressed version of the audio signal, generated by SPIHT source coding and protected against malicious tampering by repetition coding, is embedded into the original audio signal. For robustness against synchronization counterfeiting attacks, check bits are generated from the position and content of each audio section by a hash algorithm and a chaotic sequence, and are taken as part of the fragile watermark. Simulation results show that the self-embedding audio authentication scheme can recover tampered content with adequate audio quality and that it resists synchronization counterfeiting attacks.

18.

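A Fast Sub-Pixel Motion Estimation Algorithm Based on AVS

Song Xuehua (宋雪桦), Bao Xiang (包祥), Wu Wenyun (吴问云) 《Computer Engineering and Design》2012,33(7):2716-2720

The digital audio and video coding standard (AVS) uses variable block sizes and quarter-pixel motion estimation to improve coding efficiency. To address the heavy computational load of motion estimation in video coding, an improved fast sub-pixel search algorithm is proposed; it uses a quadrant-based prediction strategy and terminates the search early through threshold tests. Compared with the full sub-pixel search in the AVS reference software, the algorithm reduces the number of sub-pixel search points by 50% to 81.25% and saves 25.36% to 34.51% of encoding time while maintaining image quality and coding efficiency.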

19.

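A High-Quality, Low-Complexity, Software-Only Real-Time MPEG Audio Encoder and Decoder

He Dongmei (何冬梅), Gao Wen (高文) 《Computer Engineering and Applications》1999,35(12):7-10

MPEG audio is the international standard for high-fidelity stereo audio compression coding and decoding. The standard uses a subband coding scheme combined with a psychoacoustic model; the algorithm is computationally heavy and hard to use in real-time applications. The paper analyzes the basic principles of the encoding and decoding algorithms, proposes fast algorithms, and designs and implements a software-only MPEG audio encoder and decoder. The software can encode and decode stereo audio signals in real time on a 166 MHz Pentium computer without any additional hardware.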

20.

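Design of VB-Based CAI Software for Advanced Computer Control

Tao Wenhua (陶文华) 《Computer Simulation》2004,21(3):164-166

Advanced computer control is a computer-based discipline covering a range of new computer control schemes, chiefly predictive control, fuzzy control, neural networks, and other new control methods, and its content is highly theoretical. The paper develops computer-aided instruction (CAI) software for advanced computer control on the Windows platform using Visual Basic 6.0 in combination with Matlab. The software is powerful, simple and convenient to operate, flexible, and makes use of multimedia technology. The paper describes the overall structural design process in detail and also explains the development of the sound, animation, text, and other multimedia components. By using ActiveX Automation in VB to access Matlab, an interface for calling Matlab from VB is implemented, enabling simulation of various advanced control schemes for control systems, which makes the teaching content vivid and concrete and improves teaching quality.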