Found 19 similar documents; search took 187 ms
1.
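With the successful application of the multi-band excitation (MBE) model, MBE speech coding algorithms have been advancing rapidly. This paper describes how the MBE spectral amplitude parameters and the voiced/unvoiced (V/U) decision parameters are extracted, and presents the scheme for encoding and decoding these parameters: the spectral amplitudes are first transformed with a discrete cosine transform (DCT) and then vector quantized (VQ). Finally, it describes the synthesis of the speech signal. Experiments show that the synthesized speech is almost identical to the original speech in frequency and amplitude, which indicates that the method gives fairly good synthesized speech quality.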
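The DCT-then-VQ step mentioned in this abstract can be illustrated with a minimal sketch. The amplitude values, the three-entry codebook, and all function names below are hypothetical and chosen only for illustration; the abstract does not give the paper's vector dimensions or codebook design.

```python
import math

def dct_ii(x):
    """Orthonormal DCT-II of a real sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * s)
    return out

def idct_ii(c):
    """Inverse of dct_ii (the transpose of the orthonormal DCT-II matrix)."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            s += scale * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out

def nearest_codeword(vec, codebook):
    """Index of the codeword with the smallest squared error to vec."""
    def sq_err(cw):
        return sum((a - b) ** 2 for a, b in zip(vec, cw))
    return min(range(len(codebook)), key=lambda j: sq_err(codebook[j]))

# Hypothetical spectral amplitudes for one frame (not taken from the paper).
amplitudes = [4.0, 3.5, 3.0, 2.0, 1.5, 1.0, 0.8, 0.5]

# Toy codebook stored in the DCT domain (assumed, for illustration only).
codebook = [
    dct_ii([1.0] * 8),
    dct_ii([4.0, 3.4, 2.9, 2.1, 1.4, 1.1, 0.7, 0.6]),
    dct_ii([0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]),
]

coeffs = dct_ii(amplitudes)                 # encoder: transform
index = nearest_codeword(coeffs, codebook)  # encoder: only this index is sent
decoded = idct_ii(codebook[index])          # decoder: look up and invert
```

Because the transform is orthonormal, the squared error measured on the DCT coefficients equals the squared error on the amplitudes themselves, so the codebook search can run in either domain.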
2.
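The improved MBE speech algorithm, adopted as a voice communication standard by the international satellite communications organization, is an important part of low-rate speech coding. This article describes the algorithmic principles of the coder, together with the software and hardware structure and the key techniques used to implement it on a single TMS320VC5409 chip.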
3.
4.
5.
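Research on a low-bit variable-rate speech coding algorithm based on the local cosine transform Total citations: 1, self-citations: 0, citations by others: 1
This paper proposes applying the local cosine transform (LCT) to speech coding and designs a low-bit variable-rate speech coder with an average bit rate of about 1.6 kbit/s. The variable-rate coder uses an SVM algorithm for voice activity detection (VAD). Active speech frames are classified using the mode partition of GSM half-rate coding, except that the strongly voiced and moderately voiced modes are merged into a single moderately-to-strongly voiced mode. The LCT coefficients of each speech mode and of silent frames (background noise) are quantized by split-dimension vector quantization, with codebooks designed by the LBG algorithm and searched at encoding time with a fast tree-structured search. Informal subjective listening tests show that the reconstructed speech has a MOS of about 3.15, comparable to that of the 2.4 kbit/s U.S. federal standard vocoder MELP. The coder is robust and is suited to coding speech in the presence of various kinds of environmental noise.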
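Codebook design with the LBG (Linde-Buzo-Gray) algorithm, as named in this abstract, can be sketched as follows. This is a minimal illustration of the standard splitting procedure, not the paper's implementation; the training vectors, the perturbation `eps`, and the fixed iteration count are assumptions.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def lbg(training, size, eps=0.01, iters=20):
    """Train a codebook with `size` codewords (assumed to be a power of two)."""
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split every codeword into a slightly perturbed pair.
        codebook = [[c + s for c in cw] for cw in codebook for s in (eps, -eps)]
        for _ in range(iters):
            # Nearest-neighbour partition of the training set.
            cells = [[] for _ in codebook]
            for v in training:
                j = min(range(len(codebook)), key=lambda k: sq_dist(v, codebook[k]))
                cells[j].append(v)
            # Centroid update; an empty cell keeps its previous codeword.
            codebook = [centroid(cell) if cell else cw
                        for cell, cw in zip(cells, codebook)]
    return codebook

# Hypothetical 2-D training vectors forming two obvious clusters.
training = [[0, 0], [0, 1], [1, 0], [1, 1],
            [10, 10], [10, 11], [11, 10], [11, 11]]
codebook = lbg(training, 2)
```

A production design would normally stop each refinement loop when the average distortion stops decreasing rather than after a fixed number of passes.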
6.
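This article introduces a digital television system developed by the Italian company Teletera that is carried over third-order-group digital microwave links (34 Mb/s). The scheme applies a discrete cosine transform to 8×8 pixel blocks; the discrete cosine transform is one kind of orthogonal transform. Before presenting the specific scheme, and to help readers understand it in depth, the author starts from transform coding and explains some basic principles in accessible terms.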
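The statement that the DCT is an orthogonal transform, applied here to 8×8 pixel blocks, can be checked with a small sketch. The test block (a horizontal ramp) and all function names are assumptions made for illustration; nothing below is taken from the system described in the article.

```python
import math

N = 8  # block size used by the scheme described above

def dct_matrix(n=N):
    """Rows are the orthonormal DCT-II basis vectors of length n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

C = dct_matrix()

def dct2(block):
    """Separable 2-D DCT: transform the columns, then the rows."""
    return matmul(matmul(C, block), transpose(C))

def idct2(coeffs):
    """Inverse 2-D DCT; the inverse of an orthogonal matrix is its transpose."""
    return matmul(matmul(transpose(C), coeffs), C)

# Hypothetical smooth block: brightness ramps from left to right.
block = [[16.0 * j for j in range(N)] for _ in range(N)]
coeffs = dct2(block)
restored = idct2(coeffs)
```

For this ramp, every coefficient with a non-zero vertical frequency index is zero, so all the energy sits in the first row of `coeffs`. That concentration of energy into few coefficients is what makes transform coding efficient for smooth image regions.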
7.
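This paper studies the waveform interpolation (WI) speech coding model and parameter quantization techniques, and proposes a 1 kb/s waveform interpolation speech coding algorithm based on two-dimensional non-negative matrix factorization (2DNMF-WI). Two-dimensional non-negative matrix factorization (2D-NMF) is used to decompose the speech characteristic waveforms (CWs). It reduces the dimensions of the CW magnitude spectrum matrix along both rows and columns at once, so the resulting coding matrix is small and easy to quantize. In very low rate speech coding there are too few bits to describe the coding parameters, so high-quality synthesized speech is often hard to obtain. The algorithm therefore uses joint coding of two frames, three-stage vector quantization with inter-frame backward prediction, the discrete cosine transform (DCT), and split matrix quantization to lower the coding rate and improve quality. Informal subjective listening tests show that speech synthesized by the 1 kb/s 2DNMF-WI coder is slightly lower in quality than that of the 2 kb/s NMF-WI speech coding algorithm.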
8.
9.
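Research on mixed excitation linear prediction low-rate speech coding Total citations: 1, self-citations: 0, citations by others: 1
To meet the needs of digital communications and other commercial applications, speech compression coding has developed rapidly. In recent years the mainstream low-rate speech coding schemes have been based mainly on LPC-10, mixed excitation linear prediction (MELP), multi-band excitation (MBE) coding, sinusoidal transform coding (STC), and waveform interpolation (WI) coding, most of them operating at 2.4 kb/s. As an important low-rate speech coding algorithm, MELP makes substantial improvements to the LPC-10 scheme by introducing five features: mixed excitation, aperiodic pulses, Fourier magnitudes of the residual, pulse dispersion, and adaptive spectral enhancement filtering. Experimental results show that the MELP coder produces better synthesized speech at 2.4 kb/s and that the synthesized speech fits natural speech more closely.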
10.
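The article first introduces the error-correction coding techniques used in the speech traffic channels of the TETRA digital trunking system, then analyzes the coding flow of the speech traffic channel, and finally presents an FPGA implementation of the TETRA channel encoder.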
11.
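Real-time implementation of a 1200/2400 bps improved multi-band excitation vocoder Total citations: 1, self-citations: 0, citations by others: 1
This paper presents an improved full-duplex 1200/2400 bps vocoder based on the multi-band excitation (MBE) speech model. The vocoder has been used in a variety of communication systems. Its speech intelligibility (DRT score) is 91.75% at 1200 bps and 92.67% at 2400 bps. The paper focuses on the hardware structure and the implementation of the algorithm.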
12.
An approach to the implementation of a discrete cosine transform (DCT) for application to coding speech is described. The approach is oriented toward single speech channel encoding. In addition, a detailed computer simulation of an adaptive transform coder is described. The purpose of the computer simulation is to determine the internal precision at various points in the implementation required to avoid subjective degradation. Specific recommendations are made on the required internal precision in the implementation of the discrete cosine transform. A breadboard implementation of the DCT using SSI and MSI TTL logic based on the results of the computer simulation is reported.
13.
14.
This paper presents a rate-distortion derived transform trellis coding (TTC) scheme with applications to Gaussian AR sources and speech data. The optimal encoder consists of a Karhunen-Loeve transform (KLT) on the source output, followed by a search on a trellis structured random code, where the decoder is a time-variant nonlinear filter. The scheme is implementable and applicable to stationary Gaussian sources with a bounded and continuous power spectrum and the squared error distortion measure. The code construction is based on the power or eigenvalue spectrum of the source with no restriction on the coding rate. The TTC scheme is first applied to encode a Gaussian AR source often used to model speech. Simulations were conducted at several rates, using an optimal KLT and the suboptimal discrete cosine transform (DCT). Results demonstrate that the DCT performs as well as the KLT, and both yield average distortions very close to the distortion-rate function. For speech data, an adaptive version of the DCT TTC scheme is applied to encode two speech sentences at several coding rates. The adaptation is controlled by an estimate of the short-term eigenvalue spectrum which is transmitted as side information to the receiver. The proposed scheme is a very efficient speech waveform coder that provides reconstructed speech with very high signal-to-noise ratio values and very good perceptual quality at low bit rates.
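One way to see why a suboptimal DCT can come close to the KLT on a Gaussian AR source is the classical high-resolution transform coding gain, the ratio of the arithmetic to the geometric mean of the coefficient variances. The sketch below is not the paper's trellis scheme. It assumes a first-order AR source with unit variance and correlation `rho = 0.95` and a block length of 8, both chosen only for illustration. For that source the covariance matrix has determinant `(1 - rho**2)**(n - 1)`, which gives the KLT gain in closed form without an eigen-decomposition.

```python
import math

def dct_matrix(n):
    """Rows are the orthonormal DCT-II basis vectors of length n."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def ar1_covariance(n, rho):
    """Covariance of a unit-variance AR(1) source: R[i][j] = rho**|i - j|."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def dct_coefficient_variances(n, rho):
    """Diagonal of C R C^T, the variance of each DCT coefficient."""
    c = dct_matrix(n)
    r = ar1_covariance(n, rho)
    return [sum(c[k][i] * r[i][j] * c[k][j]
                for i in range(n) for j in range(n)) for k in range(n)]

def coding_gain(variances):
    """Arithmetic mean over geometric mean of the coefficient variances."""
    n = len(variances)
    arithmetic = sum(variances) / n
    geometric = math.exp(sum(math.log(v) for v in variances) / n)
    return arithmetic / geometric

def klt_coding_gain_ar1(n, rho):
    """KLT gain for AR(1): trace(R)/n = 1 and det(R) = (1 - rho^2)^(n - 1)."""
    return (1.0 - rho ** 2) ** (-(n - 1) / n)

n, rho = 8, 0.95  # assumed block length and correlation coefficient
variances = dct_coefficient_variances(n, rho)
gain_dct = coding_gain(variances)
gain_klt = klt_coding_gain_ar1(n, rho)
```

The KLT gain is an upper bound for any orthonormal transform, so `gain_dct` cannot exceed `gain_klt`; for a strongly correlated source such as this one the two are close.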
15.
Adaptive image coding with perceptual distortion control Total citations: 6, self-citations: 0, citations by others: 6
This paper presents a discrete cosine transform (DCT)-based locally adaptive perceptual image coder, which discriminates between image components based on their perceptual relevance for achieving increased performance in terms of quality and bit rate. The new coder uses a locally adaptive perceptual quantization scheme based on a tractable perceptual distortion metric. Our strategy is to exploit human visual masking properties by deriving visual masking thresholds in a locally adaptive fashion. The derived masking thresholds are used in controlling the quantization stage by adapting the quantizer reconstruction levels in order to meet the desired target perceptual distortion. The proposed coding scheme is flexible in that it can be easily extended to work with any subband-based decomposition in addition to block-based transform methods. Compared to existing perceptual coding methods, the proposed perceptual coding method exhibits superior performance in terms of bit rate and distortion control. Coding results are presented to illustrate the performance of the presented coding scheme.
16.
A perceptual audio coder has been constructed in which each audio segment is adaptively analyzed using either a sinusoidal or an optimum wavelet basis, according to the time-varying characteristics of the audio signal. The basis optimization is achieved by a novel switched filter bank scheme, which switches between a uniform filter bank structure (discrete cosine transform) and a non-uniform filter bank structure (discrete wavelet transform). Pre-echo distortion, a major artifact of the ISO/Moving Picture Experts Group (MPEG) audio coding standard (MPEG-I layers 1 and 2), which uses a uniform filter bank structure for audio signal analysis, is almost eliminated in the proposed coder. A perceptual masking model implemented using a high-resolution wavelet packet filter bank with 27 subbands, closely mimicking the critical bands of the human auditory system, is employed in this audio coder. The resulting scheme is a variable bit-rate audio coder, which provides compression ratios comparable to MPEG-I layers 1 and 2 with almost transparent quality.
17.
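For the characteristic waveform decomposition problem in waveform interpolation (WI) speech coding, this paper first proposes a decomposition method based on the discrete cosine transform (DCT), which avoids the complex characteristic waveform alignment operation. Second, for the WI phase reconstruction problem, it proposes voiced/unvoiced phase decision and voiced phase classification methods, which improve the quality of the reconstructed speech. Finally, DCT-WI vocoders at 2.0 kbps and 1.6 kbps are constructed. Subjective MOS scores show that the 2.0 kbps DCT-WI vocoder outperforms the 2.4 kbps MELP vocoder, and that the 1.6 kbps DCT-WI vocoder also achieves good perceptual quality.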
18.
19.