1.
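A 2.4 kbit/s waveform interpolation speech coding algorithm based on the wavelet transform (cited by: 1 total, 0 self-citations, 1 by others)
Using a biorthogonal wavelet filter bank to perform multi-level decomposition and reconstruction of the characteristic waveforms extracted in waveform interpolation coding, this paper proposes a 2.4 kbit/s characteristic waveform interpolation (CWI) speech coding algorithm based on the wavelet transform (WT). The encoder omits the characteristic-waveform alignment operation and decomposes the magnitude spectrum over multiple levels; the phase spectrum is not transmitted. Given the signal-compression property of the wavelet transform, only the final-level characteristic-waveform magnitude spectrum, which contributes most to human auditory perception, is transmitted. The decoder reconstructs each scale space separately; phase information is combined with the magnitude spectrum at the final stage of reconstruction, and a voicing flag selects either fixed or random phase. In addition, to follow the time-varying nature of speech, a subframe-based voicing flag selects which magnitude spectra are transmitted and which quantization mode is used. Subjective R-A/B tests show that the synthesized speech quality of this wavelet-based 2.4 kbit/s algorithm is clearly better than that of the standard 2.4 kbit/s MELP coder and the 4.8 kbit/s FS1016 CELP coder, and also better than that of a 3.8 kbit/s coder built on the conventional CWI framework.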
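The idea of decomposing a magnitude spectrum over several levels and transmitting only the final-level coefficients can be illustrated with a small sketch. It makes two assumptions that are not in the abstract: it uses the Haar wavelet as a stand-in for the biorthogonal filter bank, and the function names (`haar_step`, `encode`, `decode`) and the toy spectrum are hypothetical.

```python
import math

def haar_step(x):
    """One analysis level: split x (even length) into approximation and detail."""
    s = 1 / math.sqrt(2)
    approx = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse_step(approx, detail):
    """One synthesis level: merge approximation and detail back into a signal."""
    s = 1 / math.sqrt(2)
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) * s, (a - d) * s]
    return out

def encode(magnitudes, levels):
    """Keep only the final-level approximation coefficients (Haar stand-in)."""
    approx = list(magnitudes)
    for _ in range(levels):
        approx, _detail = haar_step(approx)   # detail coefficients are discarded
    return approx

def decode(approx, levels):
    """Rebuild the spectrum with every discarded detail coefficient set to zero."""
    x = list(approx)
    for _ in range(levels):
        x = haar_inverse_step(x, [0.0] * len(x))
    return x

magnitudes = [4.0, 4.0, 3.0, 3.0, 2.0, 2.0, 1.0, 1.0]  # toy magnitude spectrum
coeffs = encode(magnitudes, levels=2)   # 8 values reduced to 2 coefficients
rebuilt = decode(coeffs, levels=2)      # smoothed spectrum of length 8
```

With two levels, eight magnitudes shrink to two coefficients, and the decoder recovers only a smoothed envelope. This is the trade-off the abstract relies on when it sends the final level alone.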
2.
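This paper presents improved 800/920/1200 bps speech coders based on the multi-band excitation (MBE) speech model. The coders apply a two-stage discrete cosine transform (DCT) coding scheme to adaptively and dynamically quantize the frame-by-frame spectral amplitude parameters, which greatly reduces the coding bit rate and makes MBE speech coders at 800 to 1200 bps feasible. The paper focuses on the two-stage DCT coding scheme and the DSP hardware implementation.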
3.
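Research on 4-6 bit pitch quantization algorithms for low-rate WI coders (cited by: 1 total, 0 self-citations, 1 by others)
In speech coding, pitch is usually quantized with a 7-bit lossless uniform quantizer. Because the pitch of voiced speech segments generally changes slowly and gradually, and in order to remove the correlation between the pitch values of adjacent frames more effectively, this paper builds on the 4-bit pitch quantization algorithm proposed by Eriksson and Kang, studies it for Mandarin Chinese speech, and implements a set of 4-6 bit pitch quantization algorithms. The algorithm is computationally simple and requires no codebook storage. When the scheme is applied to the WI model and a WI coder, subjective A/B listening tests show that it quantizes pitch efficiently with almost no loss in synthesized speech quality, fully meeting the pitch quantization requirements of low-rate WI coders.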
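To see why slowly varying pitch allows fewer than 7 bits, consider a generic differential quantizer. This is a minimal sketch and not the Eriksson and Kang algorithm, whose details the abstract does not give. The lag range of 20 to 147 samples (128 values, hence 7 bits), the clamping rule, and all function names are assumptions made for illustration.

```python
PITCH_MIN, PITCH_MAX = 20, 147   # assumed lag range: 128 integer values -> 7 bits

def encode_absolute(pitch):
    """7-bit lossless uniform index for an integer pitch lag."""
    return pitch - PITCH_MIN

def encode_delta(pitch, prev_decoded, bits=4):
    """Signed difference to the previous decoded pitch, clamped to `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, pitch - prev_decoded))

def decode_delta(delta, prev_decoded):
    """Add the received difference and keep the result inside the lag range."""
    return max(PITCH_MIN, min(PITCH_MAX, prev_decoded + delta))

track = [60, 62, 63, 61, 58, 80]   # toy voiced pitch track, in samples
decoded = [track[0]]                # first frame sent with the 7-bit index
for p in track[1:]:
    d = encode_delta(p, decoded[-1])
    decoded.append(decode_delta(d, decoded[-1]))
# decoded == [60, 62, 63, 61, 58, 65]: small changes are exact, the jump to 80 is clipped
```

Small frame-to-frame changes are reproduced exactly with 4 bits, while the abrupt jump at the end is clipped. A practical scheme therefore needs a mechanism for such transitions, which this sketch leaves out.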
4.
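An improved embedded wideband speech coder conforming to the EV-VBR standard (cited by: 3 total, 0 self-citations, 3 by others)
Based on the embedded variable bit rate (EV-VBR) coding standard proposal of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T), and building on the candidate coder developed in the authors' laboratory, an improved embedded variable-rate wideband speech coding method is proposed. In the first two layers the algorithm uses algebraic code-excited linear prediction (ACELP) coding, additionally computes and quantizes the spectral parameters of the middle subframe, and implements a three-pulse depth-first tree search algorithm. In the last three coding layers, it rebuilds the embedded transform-domain coding (TCX) by accumulating frequency-domain coefficient vectors. The improved coder also implements voice activity detection (VAD) and discontinuous transmission (DTX). Tests show that, compared with the original coder, the improved coder has clearly better speech quality and significantly lower coding complexity; its coding quality and efficiency are close to those of the recent G.718 standard, and it retains the advantage of low delay.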
5.
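Based on the ITU-T coding standard G.729.1 and an improved modulated lapped transform (MLT) coding technique, a super-wideband embedded variable-rate speech and audio coding method with bit rates from 8 to 64 kbit/s is proposed. The bitstream for the 8 to 32 kbit/s rates is generated by the G.729.1 coding algorithm and encodes the 0 to 7 kHz band. The bitstreams for the 36, 40, and 48 kbit/s layers and for the 56 and 64 kbit/s layers are generated by MLT transform coding and encode, respectively, the 7 to 14 kHz band and the MDCT information of the G.729.1 coding residual. Objective and subjective listening tests show that the coder meets the reference performance requirements set by ITU-T.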
6.
7.
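This paper studies the waveform interpolation (WI) speech coding model and parameter quantization techniques, and proposes a 1 kb/s waveform interpolation speech coding algorithm based on two-dimensional non-negative matrix factorization (2DNMF-WI). Two-dimensional non-negative matrix factorization (2D-NMF) is used to decompose the speech characteristic waveforms (CWs); it reduces the dimension of the CW magnitude-spectrum matrix along both rows and columns simultaneously, so the resulting coding matrix is small and easy to quantize. In very-low-rate speech coding, there are too few bits to describe the coding parameters, so high-quality synthesized speech is often hard to obtain. The algorithm therefore uses joint coding of two frames, inter-frame backward-predictive three-stage vector quantization, the discrete cosine transform (DCT), and split matrix quantization to lower the bit rate and improve quality. Informal subjective listening tests show that speech synthesized by the 1 kb/s 2DNMF-WI coder is slightly worse in quality than that of the 2 kb/s NMF-WI speech coding algorithm.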
8.
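To meet the requirements of very-low-rate speech communication, a 0.6 kb/s speech coding algorithm based on MELP (Mixed-Excitation Linear Prediction) is proposed. Three consecutive MELP speech frames are grouped into a superframe, and the inter-frame correlation of the parameters is fully exploited through joint quantization, yielding high-quality synthesized speech. The techniques used include: two-frame joint quantization and bidirectional predictive vector quantization of the line spectral pair frequencies; mode-dependent quantization of the pitch period according to the voiced/unvoiced state; a statistically constructed codebook for quantizing the subband voicing parameters; mean-removed vector quantization of the energy parameters; and a better-performing interpolation algorithm for the energy parameters at the decoder.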
9.
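CE-LPC stands for code-excited linear predictive coding and belongs to the vocoder class. Coders of this class extract the important features from the time waveform and are best suited to low-bit-rate coding. This paper describes the characteristics, system structure, and coding principle of CE-LPC coding to show that a civil aviation voice switching system using CE-LPC coding can transmit high-quality voice signals at 4.8 kbit/s.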
10.
11.
12.
13.
14.
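Based on an enhanced mixed excitation linear prediction model, a high-quality 300 bit/s vocoder algorithm is proposed. Only a small number of parameters are extracted from each speech frame; to improve quantization efficiency, every 8 speech frames are grouped into a superframe and the superframe parameters are vector quantized. The algorithm uses mode-transition-based codebook mapping to estimate the bandpass voicing parameters, improving their quantization accuracy. The codebook sizes for pitch quantization under the different bandpass voicing modes are jointly optimized to improve quantization efficiency. In addition, multi-stage vector quantization with inter-stage prediction is applied to the line spectral frequency parameters to reduce spectral distortion. Subjective listening tests show that the vocoder has high intelligibility and a degree of naturalness, with a Diagnostic Rhyme Test (DRT) score of 84.2%.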
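The bit budget implied by grouping frames into a superframe follows from simple arithmetic. The sketch below assumes a 22.5 ms frame, which is the frame length of the standard 2.4 kbit/s MELP coder; the abstract itself does not state the frame length, and `superframe_bits` is a hypothetical helper name.

```python
from fractions import Fraction

def superframe_bits(rate_bps, frames_per_superframe, frame_ms):
    """Bits available per superframe at a fixed bit rate (exact arithmetic)."""
    duration_s = Fraction(frames_per_superframe) * Fraction(frame_ms) / 1000
    return rate_bps * duration_s

# frame_ms = 22.5 is an assumption (standard MELP frame length), not a value
# taken from the abstract.
budget = superframe_bits(300, 8, Fraction(45, 2))   # 54 bits per 180 ms superframe
per_frame = budget / 8                               # 27/4 = 6.75 bits per frame
```

Under that assumption the whole 8-frame superframe gets the same 54 bits that standard MELP spends on a single frame, which is why the parameters must be quantized jointly across frames.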
15.
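A variable-rate low-delay code-excited linear prediction (LD-CELP) scheme is designed; it is realized by modifying the codebook. The scheme operates at 11.2 kbit/s. It was simulated on a computer and compared with the 16 kbit/s LD-CELP algorithm in terms of signal-to-noise ratio (SNR), waveform, and other aspects; the simulation results show good performance.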
16.
This paper presents several strategies to improve the performance of very low bit rate speech coders and describes a speech codec that incorporates these strategies and operates at an average bit rate of 1.2 kb/s. The encoding algorithm is based on several improvements in a mixed multiband excitation (MMBE) linear predictive coding (LPC) structure. A switched-predictive vector quantiser technique that outperforms previously reported schemes is adopted to encode the LSF parameters. Spectral and sound-specific low-rate models are used in order to achieve high-quality speech at low rates. An MMBE approach with three sub-bands is employed to encode voiced frames, while fricative and stop modelling and synthesis techniques are used for unvoiced frames. This strategy is shown to provide good-quality synthesised speech at a bit rate of only 0.4 kb/s for unvoiced frames. To reduce coding noise and improve decoded speech, a spectral envelope restoration combined with noise reduction (SERNR) postfilter is used. The contributions of the techniques described in this paper are separately assessed and then combined in the design of a low bit rate codec that is evaluated against the North American Mixed Excitation Linear Prediction (MELP) coder. The performance assessment is carried out in terms of the spectral distortion of LSF quantisation, mean opinion score (MOS), A/B comparison tests and the ITU-T P.862 perceptual evaluation of speech quality (PESQ) standard. Assessment results show that the improved methods for LSF quantisation, sound-specific modelling and synthesis, and the new postfiltering approach can significantly outperform previously reported techniques. Further results also indicate that a system combining the proposed improvements and operating at 1.2 kb/s is comparable to, and slightly outperforms, a MELP coder operating at 2.4 kb/s. For tandem connection situations, the proposed system is clearly superior to the MELP coder.
17.
18.
The transform approach to speech coding has been established for some time, and has been shown to be very efficient in controlling the bit allocation and the shape of the noise spectrum. Various transform coders have been reported which produce high-quality digital speech at around 16 kbit/s. Although these coders can maintain good quality down to about 9.6 kbit/s, they perform poorly at lower bit rates. Here we discuss how vector quantisation (VQ) can be used to improve the quality of transform coders. We describe one specific design of vector-quantised transform coder (VQTC) which follows on from earlier work, and which is capable of producing good-quality speech at as low as 4.8 kbit/s.
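As a rough illustration of how vector quantisation spends bits on transform coefficients, the following sketch encodes coefficient pairs with a nearest-codeword search. The 4-entry codebook, the block size of 2, and the function names are assumptions for illustration only and do not reproduce the VQTC design described above.

```python
def nearest_codeword(vector, codebook):
    """Return the index of the codeword with the smallest squared error."""
    def sq_err(c):
        return sum((v - x) ** 2 for v, x in zip(vector, c))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))

def vq_encode(coeffs, codebook):
    """Split coefficients into blocks and replace each block by a codebook index."""
    dim = len(codebook[0])
    blocks = [coeffs[i:i + dim] for i in range(0, len(coeffs), dim)]
    return [nearest_codeword(b, codebook) for b in blocks]

def vq_decode(indices, codebook):
    """Look each index up in the codebook and concatenate the codewords."""
    return [x for i in indices for x in codebook[i]]

# Hypothetical codebook: 4 entries, so each index costs 2 bits.
codebook = [(0.0, 0.0), (1.0, 1.0), (1.0, -1.0), (-1.0, 1.0)]
coeffs = [0.9, 1.2, 0.1, -0.2, 0.8, -1.1, -0.7, 0.6]   # toy transform coefficients
indices = vq_encode(coeffs, codebook)    # [1, 0, 2, 3]
rebuilt = vq_decode(indices, codebook)
```

Each 2-bit index covers two coefficients, so the average cost is 1 bit per coefficient. Rates this low are hard to reach by quantising each coefficient on its own.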
19.
Byung Lee Chong Un Hyeong Lee Byoung Shin Hwang Lee 《Communications, IEEE Transactions on》1983,31(6):775-783
In this paper, the implementation of a compact and efficient multirate speech digitizer with variable transmission rates of 2.4, 4.8, 9.6, and 14.96 kbit/s is presented. The multirate algorithm is based on the residual-excited linear prediction (RELP) vocoder with a transmission rate of 9.6 kbit/s. The residual encoder employed in the RELP vocoder uses hybrid companding delta modulation (HCDM). This HCDM is also used as a 14.96 kbit/s coder. If the residual in the RELP system is down-sampled before encoding, a 4.8 kbit/s coder can be realized. If the residual encoder is not used, a 2.4 kbit/s linear predictive coder (LPC) can be realized by incorporating a pitch extractor. In the 4.8 and 9.6 kbit/s coders the pitch-implanted residual excitation method has been used to generate the excitation signal to the synthesis filter. The multirate speech digitizer algorithm has been implemented using 2900 series bit-slice microprocessors. The external memory is composed of 2K RAMs and 2K ROMs. The system design is a two-bus structure with a 204 ns cycle time. With efficient hardware and software design, the multirate speech digitizer requires almost the same hardware complexity as the conventional 2.4 kbit/s LPC vocoder.