Similar literature
1.
李晓明  鲍长春  贾懋 《电子学报》2015,43(7):1286-1293
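Based on the inherent periodicity of speech and audio signals, this paper builds a unified analysis/synthesis model suited to both, and implements high-quality layered coding of wideband speech and audio signals at 24 kbps and 32 kbps. First, the input signal, whose period varies over time, is warped into a signal with a fixed period, and a warping matrix is built from the warped periodic signal. Next, the modulated lapped transform (MLT) and the discrete cosine transform (DCT) are applied to the rows and columns of the warping matrix, respectively, to make it sparse. Finally, the elements of the sparse matrix are quantized and coded using band-wise quantization and vector Huffman coding. Subjective and objective tests show that the proposed method codes speech, audio, and mixed signals with better quality than the ITU-T G.722.1 and AMR-WB coders at the same bit rates.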
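
The sparsifying effect of a transform across warped pitch cycles can be illustrated with a minimal sketch. It is not the authors' implementation: the toy matrix, the layout (one fixed-length cycle per row), and the helper name `dct2` are assumptions made for illustration, and the MLT stage is omitted.

```python
import math

def dct2(x):
    """Unnormalized DCT-II of a sequence x."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

# Hypothetical warped matrix: each row holds one fixed-length pitch cycle.
# Here the four cycles are identical, i.e. the signal is perfectly periodic.
period = [0.0, 1.0, 0.5, -0.5, -1.0, -0.25]
matrix = [period[:] for _ in range(4)]

# Apply the DCT down each column, i.e. across successive cycles.
columns = list(zip(*matrix))
coeffs = [dct2(col) for col in columns]

# Count the coefficients that are not numerically zero.
nonzero = sum(1 for col in coeffs for c in col if abs(c) > 1e-9)
print(nonzero, "of", len(matrix) * len(period), "coefficients are nonzero")
```

For a perfectly periodic input, only the lowest-order coefficient of each column survives, so most of the 24 entries vanish. Real signals are only approximately periodic, so the energy would be concentrated rather than fully confined.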

2.
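Low-bit-rate waveform interpolation speech coding algorithm based on singular value decomposition   Total citations: 8, self-citations: 7, citations by others: 1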
王贵平  鲍长春  张鹏 《电子学报》2006,34(1):135-140
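The waveform interpolation (WI) speech coding model is one of the most promising low-bit-rate speech coding schemes today and is attracting growing attention for its good performance. Using a characteristic-waveform decomposition based on singular value decomposition (SVD) and the perceptual properties of speech, this paper splits the magnitude spectrum of the two-dimensional characteristic waveform into a basic matrix, a transition matrix, and a supplementary matrix, and quantizes each with a different method, which effectively reduces computational complexity. In addition, according to the time-varying nature of speech, the three matrices are combined in three modes to represent the characteristic-waveform magnitude spectrum, and a periodicity factor and energy entropy are introduced to measure how periodic the matrix is. This resolves the difficulty of quantizing the parameters after SVD and improves coding efficiency. Subjective A/B tests show that the reconstructed speech quality of the proposed 2.4 kbps SVD-WI coder is slightly better than that of the 2.4 kbps MELP coder.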

3.
贾懋珅  鲍长春 《电子学报》2009,37(10):2291-2297
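Based on the G.729.1 coding standard of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T), this paper proposes an embedded variable-bit-rate stereo speech and audio coding method. The algorithm uses G.729.1 and an improved modulated lapped transform (MLT) coding technique to code the mid and side information of the input signal in layers, producing an embedded bitstream. The coder handles wideband and super-wideband stereo signals, with a maximum bit rate of 48 kb/s for wideband stereo and 64 kb/s for super-wideband stereo. Implementation results show that the coding quality meets the ITU-T requirements for G.EV-VBR stereo coding.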
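
The mid and side signals mentioned above can be sketched as follows. This shows only the common sum/difference convention with a factor of 1/2, which is an assumption here; the abstract does not state the exact scaling, and the function names are illustrative.

```python
def to_mid_side(left, right):
    """Convert a left/right pair to mid (average) and side (half difference)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Invert to_mid_side: left = mid + side, right = mid - side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right

# Toy stereo frame (values chosen so the arithmetic is exact in binary).
left = [0.5, 0.25, -0.75, 1.0]
right = [0.5, 0.0, -0.25, 0.5]

mid, side = to_mid_side(left, right)
rec_left, rec_right = from_mid_side(mid, side)
```

Where the two channels agree, the side signal is zero. That is why a layered coder can plausibly spend most of its bits on the mid signal and code the side signal in enhancement layers.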

4.
王坤赤  蒋华 《现代电子技术》2007,30(21):168-170
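The formant vocoder has long been a focus of parametric speech coding research because it has, in theory, the lowest bit rate. The key algorithms of a formant coder are the extraction of speech parameters such as the pitch (fundamental frequency) and the formants. Working from a high-resolution spectrogram, a simple and effective pitch and formant extraction algorithm is designed using the frequency-domain characteristics of speech. The accuracy of the parameter extraction is demonstrated by evaluating the quality of the reconstructed speech. Speech experiments determined that the coded parameters comprise the pitch and the first four formants, and quantization specifications for each parameter were set so as to preserve speech quality. The algorithm was tested on real speech signals, and the results show good speech quality at a bit rate of 1,400 b/s.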

5.
High-quality robust 600 bps very-low-bit-rate speech coding algorithm   Total citations: 3, self-citations: 0, citations by others: 3
邹霞  陈亮  张雄伟 《信号处理》2003,19(Z1):109-112
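This paper proposes a high-quality, robust 600 bps speech coding algorithm. It lowers the bit rate using multi-frame classified joint vector quantization of parameters, dynamic bit allocation, parameter interpolation, and prediction based on parameter correlation. To improve resistance to channel errors, the algorithm uses robust vector quantization. Informal subjective listening shows that the synthesized speech quality is better than that of the conventional 2.4 kbps linear prediction (LPC-10e) speech coding algorithm and close to that of 2.4 kbps MELP, and that the algorithm retains good intelligibility on a channel with 1% random bit errors.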

6.
王晶  匡镜明  赵胜辉 《信号处理》2007,23(5):755-758
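This paper introduces adaptive postfiltering into a 3 kbps characteristic waveform interpolation speech coding algorithm, cascading four modules at the decoder: a short-term postfilter, spectral tilt compensation, a long-term postfilter, and automatic gain control. The filter coefficients are set through theoretical analysis and subjective listening tests so that they adapt to the characteristics of each speech frame. After postfiltering, the spectrum of the output speech is strengthened at the formants and harmonics while noise in the spectral valleys is attenuated. The signal energy stays essentially unchanged across the filter, and no spectral tilt is introduced. Experimental results show that, after adaptive postfiltering, the speech synthesized by the 3 kbps waveform interpolation coder has markedly less quantization noise and improved quality.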
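
The energy-preserving role of the final automatic gain control stage can be sketched as below. This is a simplified, frame-wise version written for illustration: practical postfilters usually smooth the gain sample by sample, and the sample values and function names are hypothetical.

```python
import math

def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)

def agc(reference, filtered):
    """Scale `filtered` so that its energy matches that of `reference`."""
    e_out = energy(filtered)
    if e_out == 0.0:
        return list(filtered)  # nothing to scale
    gain = math.sqrt(energy(reference) / e_out)
    return [gain * v for v in filtered]

decoded = [0.2, -0.4, 0.6, -0.1]          # frame before postfiltering
postfiltered = [0.3, -0.7, 0.9, -0.05]    # hypothetical postfilter output
balanced = agc(decoded, postfiltered)
```

After scaling, `energy(balanced)` equals `energy(decoded)` up to rounding, which corresponds to the abstract's statement that the energy before and after filtering stays essentially unchanged.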

7.
Improved embedded wideband speech coder conforming to the EV-VBR standard   Total citations: 3, self-citations: 0, citations by others: 3
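Based on the ITU-T embedded variable bit rate (EV-VBR) coding standard proposal, and building on a candidate coder developed in the authors' laboratory, an improved embedded variable-bit-rate wideband speech coding method is proposed. In the first two layers, the algorithm uses algebraic code-excited linear prediction (ACELP) coding, additionally computes and quantizes the spectral parameters of the middle subframe, and implements a three-pulse depth-first tree search. In the last three coding layers, it rebuilds the embedded transform-domain coding (TCX) by accumulating vectors of frequency-domain coefficients. The improved coder also implements voice activity detection (VAD) and discontinuous transmission (DTX). Tests show that, compared with the original coder, the improved coder has clearly better speech quality and significantly lower coding complexity. Its coding quality and efficiency are close to those of the recent G.718 standard, and it retains the advantage of low delay.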

8.
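Based on a sinusoidal model of harmonics plus individual spectral lines extracted by matching pursuit, this paper proposes an implementation scheme for parametric audio coding. The sinusoidal components of the input audio signal are represented jointly by harmonics and individual spectral lines. Analysis and synthesis use 50% overlap-add (OLA) to remove discontinuities between frames, and the matching pursuit (MP) algorithm extracts the model parameters (amplitude, frequency, and phase) in the frequency domain, which greatly reduces computational complexity. The fundamental frequency of the harmonic lines is obtained with the harmonic product spectrum method (SHS). The exact frequency of each harmonic is derived from the MP iterations and refined by quadratic curve fitting, and the corresponding harmonic amplitudes are approximated by the LPC spectral envelope. Extracting individual spectral lines effectively compensates for the shortcomings of harmonic extraction. Experiments show that the proposed sinusoidal model implementation represents the stationary components of audio signals well and offers a useful reference for low-bit-rate parametric audio coding.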
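
Why 50% overlap-add can remove frame discontinuities is easiest to see with a window whose shifted copies sum to one. The sketch below assumes a periodic Hann window, which the abstract does not specify; the frame length and names are illustrative only.

```python
import math

def hann(n):
    """Periodic Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

frame_len = 8
hop = frame_len // 2          # 50% overlap
window = hann(frame_len)

# In the steady state each output sample is covered by two frames:
# the second half of one window and the first half of the next.
overlap_sum = [window[i] + window[i + hop] for i in range(hop)]
```

Every entry of `overlap_sum` is 1 up to rounding, so overlapping frames cross-fade into each other without changing the overall signal level.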

9.
王贵平  鲍长春 《信号处理》2005,21(Z1):156-159
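The waveform interpolation speech coding model is one of the most promising low-bit-rate speech coding schemes today and is attracting growing attention for its good performance. Building on the waveform interpolation speech coding algorithm, this paper proposes a method based on singular value decomposition (SVD) for decomposing and quantizing the LP residual signal, which reduces algorithmic delay and improves decomposition accuracy. In the decomposition model, the characteristic waveform (CW) is split into a basic matrix, a transition matrix, and a supplementary matrix, each quantized with a different method, which effectively reduces computational complexity. In quantization, a periodicity factor and energy entropy are introduced to measure how periodic the CW is, which resolves the difficulty of quantizing the parameters after SVD and improves coding efficiency. Subjective A/B tests show that the reconstructed speech quality of the proposed 2.4 kbps SVD-WI coder is slightly better than that of the 2.4 kbps MELP coder.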

10.
王嵩  鲍长春  李晓明 《信号处理》2011,27(4):575-586
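Audio coding relies on two main classes of techniques: waveform coding and parametric coding. The former suits high-bit-rate, high-quality applications, and the latter suits applications or environments with limited bandwidth or storage. Parametric audio coding represents the signal with a source model and uses parameter estimation and quantization based on psychoacoustic principles to extract and quantize perceptually important parameters, which effectively lowers the bit rate while preserving the quality of the reconstructed signal. In recent years, researchers have brought new techniques such as adaptive time segmentation, joint parameter quantization, and parametric stereo into parametric audio coding, optimizing the algorithms and markedly improving reconstructed signal quality. Some of these techniques have become international standards and are used commercially. This paper reviews the major advances in parametric audio coding over the past decade or so, discusses open problems and research difficulties, presents listening-test data for two typical parametric audio coding systems to show their performance quantitatively, and finally outlines future directions for parametric audio coding.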

11.
In this paper, we present a new method for high-quality audio coding at low delay and low bit rate for telecommunications applications such as audio conferencing or video conferencing. The developed coder is adapted to code generic audio signals at a bit rate of 64 kbit/s with a delay close to 5 ms in the 20-15000 Hz bandwidth. The method is based on speech coding as well as audio coding concepts. The coder combines subband decomposition of the input signal with LD-CELP techniques. We introduce into this coding structure a psychoacoustic model that allocates an optimal bit rate to each subband according to the perceptual properties of human hearing. In order to satisfy the bit rate requirement of the psychoacoustic model and to reduce the complexity of such a coding algorithm, we propose a new method of vector quantization based on lattice quantization. This method quantizes the residual signal in the LD-CELP coder and avoids the complexity of a full search. Objective and subjective tests have been made on a test set of audio signals that is a critical subset used by ISO. Formal tests showed that the quality of the proposed coder is comparable to the best implementation of MPEG-1 Layer II, but our solution has the advantage of reaching a very low delay (5 ms).

12.
A new segment quantization method using the Lempel-Ziv algorithm is proposed and applied to quantize line spectral frequency parameters in a speech codec. The proposed segment quantizer can save four bits per frame, compared with the ITU-T G.729 speech codec (18 bits/frame), without degradation of subjective or objective speech quality.

13.
Predictive Coding of Speech at Low Bit Rates   总被引:1,自引:0,他引:1  
Predictive coding is a promising approach for speech coding. In this paper, we review the recent work on adaptive predictive coding of speech signals, with particular emphasis on achieving high speech quality at low bit rates (less than 10 kbits/s). Efficient prediction of the redundant structure in speech signals is obviously important for proper functioning of a predictive coder. It is equally important to ensure that the distortion in the coded speech signal be perceptually small. The subjective loudness of quantization noise depends both on the short-time spectrum of the noise and its relation to the short-time spectrum of the speech signal. The noise in the formant regions is partially masked by the speech signal itself. This masking of quantization noise by the speech signal allows one to use low bit rates while maintaining high speech quality. This paper will present generalizations of predictive coding for minimizing subjective distortion in the reconstructed speech signal at the receiver. The quantizer in predictive coders quantizes its input on a sample-by-sample basis. Such sample-by-sample (instantaneous) quantization creates difficulty in realizing an arbitrary noise spectrum, particularly at low bit rates. We will describe a new class of speech coders in this paper which could be considered to be a generalization of the predictive coder. These new coders not only allow one to realize the precise optimum noise spectrum which is crucial to achieving very low bit rates, but also represent the important first step in bridging the gap between waveform coders and vocoders without suffering from their limitations.
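
The sample-by-sample quantization that the abstract refers to can be illustrated with a minimal closed-loop predictive coder (DPCM). This is a sketch under stated assumptions, not the coder reviewed in the paper: the first-order predictor coefficient `a`, the uniform step size, and the function names are all hypothetical, and no noise shaping is included.

```python
def dpcm_encode(samples, step, a=0.9):
    """Quantize the prediction error of each sample with a uniform step."""
    indices, prediction = [], 0.0
    for x in samples:
        index = round((x - prediction) / step)   # instantaneous quantizer
        indices.append(index)
        reconstructed = prediction + index * step
        prediction = a * reconstructed           # predict from decoded output
    return indices

def dpcm_decode(indices, step, a=0.9):
    """Mirror the encoder's predictor to rebuild the signal."""
    output, prediction = [], 0.0
    for index in indices:
        reconstructed = prediction + index * step
        output.append(reconstructed)
        prediction = a * reconstructed
    return output

signal = [0.0, 0.3, 0.55, 0.7, 0.72, 0.6, 0.35, 0.05]
step = 0.1
decoded = dpcm_decode(dpcm_encode(signal, step), step)
max_error = max(abs(x - y) for x, y in zip(signal, decoded))
```

Because the encoder predicts from its own reconstructed output, the error in each decoded sample is just the quantization error of that sample, bounded by half a step. The noise is therefore roughly white, which is the limitation the abstract says its generalized coders are meant to overcome.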

14.
A new video coding algorithm called the first-order-residual/second-order-residual (FOR/SOR) codec is proposed for high definition (HD) video coding in this work. Several advanced coding techniques are adopted in the proposed FOR/SOR codec. For the FOR codec, the well-known block-based motion-compensated predictive codec is used to exploit temporal and spatial correlations in input image frames. However, it is observed that a structured residual signal still exists after the FOR coding, and a SOR coder is developed to encode residual image frames efficiently. To improve the coding performance further, we consider bit allocation between the FOR and SOR coders at the same block and determine their optimal quantization parameters systematically. It is shown by experimental results that the proposed FOR/SOR codec outperforms H.264/AVC significantly in HD video coding.

15.
The aim of this paper is to improve the G.711 standard, which is widely used, especially in the public switched telephone network (PSTN). Two solutions are proposed. The first solution uses only a lossless coder, achieving a bit-rate decrease of 0.82 bits/sample compared to the G.711 codec. The second solution uses forward adaptation and a lossless coder, further decreasing the bit rate (by 1.25 bits/sample) and achieving a higher average signal-to-quantization noise ratio (SQNR) than the G.711 codec. The second solution is also more robust than the G.711 codec, meaning that it has a nearly constant SQNR over a wide range of input signal power. That is very important for signals whose input power varies with time, such as speech and video signals. Our solutions are compatible with the G.711 codec and add little complexity and delay, so they can be applied in real-time systems such as PSTN or VoIP. They can also be used in many other systems, such as WiMax and OFDM, as a replacement for or improvement of the G.711 codec. Standardization of G.711.1 (a wideband extension of the G.711 standard) is under way. Our solutions fulfill all the requirements for that new standard, so they can be implemented in its low-frequency part.
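
What a nearly constant SQNR over input power means can be sketched by comparing a companded quantizer with a plain uniform one. The code below uses the continuous μ-law formula with μ = 255 as a stand-in; it is neither the segmented G.711 table nor the adaptive scheme proposed in the paper, and the test tone, sample count, and function names are assumptions.

```python
import math

MU = 255.0

def compress(x):
    """Continuous mu-law compressor for x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def expand(y):
    """Inverse of compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize_uniform(v, bits=8):
    """Mid-rise uniform quantizer on [-1, 1]."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(levels - 1, max(0, math.floor((v + 1.0) / step)))
    return -1.0 + (index + 0.5) * step

def sqnr_db(amplitude, companded, n=2000):
    """SQNR in dB for a sine test tone of the given amplitude."""
    signal_power = noise_power = 0.0
    for i in range(n):
        x = amplitude * math.sin(2 * math.pi * 7.3 * i / n)
        if companded:
            y = expand(quantize_uniform(compress(x)))
        else:
            y = quantize_uniform(x)
        signal_power += x * x
        noise_power += (x - y) ** 2
    return 10 * math.log10(signal_power / noise_power)

# Change in SQNR when the input level drops by 40 dB.
mu_spread = abs(sqnr_db(1.0, True) - sqnr_db(0.01, True))
uniform_spread = abs(sqnr_db(1.0, False) - sqnr_db(0.01, False))
```

In this sketch the companded quantizer loses far less SQNR than the uniform one when the input level falls, which is the kind of robustness the abstract claims to improve on further with forward adaptation.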

16.
刘鑫  鲍长春 《电子学报》2015,43(4):816-821
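The limited bandwidth of wideband audio lowers its subjective quality and naturalness. This paper proposes a wideband-to-super-wideband audio bandwidth extension method based on a similarity-correlation neural network. The method reconstructs the fine spectrum of the wideband audio into a multi-dimensional phase space and builds a similarity-correlation neural network to recover the fine spectrum of the high-frequency components. A Gaussian mixture model is used to estimate the high-frequency spectral envelope, and bandwidth extension of the audio signal is implemented on the G.722.1 coder as a platform. Test results show that the proposed method outperforms the reference methods and that its subjective quality is close to that of the G.722.1C super-wideband coder.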

17.
The MPEG‐D unified speech and audio coding (USAC) standardization process was initiated by MPEG to develop an audio codec that is able to provide consistent quality for mixed speech and music content. The current USAC reference model structure consists of frequency domain (FD) and linear prediction domain (LPD) core modules and is controlled using a signal classifier tool. In this letter, we propose an LPD single‐mode USAC structure using an adaptive windowing‐based transform‐coded excitation module. We tested our system using official test items for all mono‐evaluation modes. The results of the experiment show that the objective and subjective performances of the proposed single‐mode USAC system are better than those of the FD/LPD dual‐mode USAC system.

18.
A medium-band speech coder is proposed that uses a weighted vector quantization scheme in the transformed domain. The linear prediction residue is transformed and vector-quantized. In order to control the quantization errors in the transformed domain, adaptively weighted matching is used instead of conventional adaptive bit allocation. Therefore, the residual signal can be reconstructed by the decoder, even if the spectral envelope parameters are destroyed due to transmission errors. This coder is also capable of maintaining higher SNR (signal-to-noise ratio) performance than time-domain vector quantization coders for a wide range of computation complexities and bit rates. Coded speech is natural and unaffected by background noise. The mean opinion score for this coder at 7.2 kb/s is comparable to that of 5.5-bit log PCM coded speech sampled at 6.4 kHz.
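
Weighted matching in vector quantization can be sketched as a nearest-codeword search under a weighted squared error. The toy codebook, the weights, and the function name are assumptions for illustration. In the coder described above the weights would adapt to the signal, but the abstract does not say how they are derived.

```python
def weighted_match(vector, codebook, weights):
    """Return the index of the codeword with the smallest weighted squared error."""
    def distortion(codeword):
        return sum(w * (v - c) ** 2 for w, v, c in zip(weights, vector, codeword))
    return min(range(len(codebook)), key=lambda i: distortion(codebook[i]))

# Toy two-dimensional codebook and a target vector of transform coefficients.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
target = [0.6, 0.7]

flat = weighted_match(target, codebook, [1.0, 1.0])              # plain nearest neighbor
emphasize_first = weighted_match(target, codebook, [10.0, 1.0])  # first coefficient matters more
```

With equal weights the search picks codeword 2, while weighting the first coefficient more heavily moves the choice to codeword 1. Changing the weights steers where quantization error is allowed to fall without changing the number of bits spent per vector.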
