1.
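Effectively detecting the singing segments in popular music is valuable for retrieval, browsing, and classification in large music databases, as well as for melody extraction and singer identification. This paper uses Mel-frequency cepstral coefficients (MFCC), widely used in speech signal processing, as features for analyzing the music signal, and uses Gaussian mixture models (GMM) to model the accompaniment (non-vocal) and singing (vocal) parts of the music separately, thereby detecting the singing segments automatically. The conventional method trains one GMM per class from a single set of training data hand-labeled as vocal and non-vocal. Building on it, this paper additionally trains one GMM per class from a set of pure-singing data and a set of pure-accompaniment data, and then linearly combines the two resulting vocal GMMs, and likewise the two non-vocal GMMs, to obtain the probability model for each class. A likelihood classifier serves as the system's decision function. Experimental results show that the method effectively improves recognition performance.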
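The linear combination of two GMMs per class and the likelihood decision described in entry 1 can be sketched as follows. This is a minimal illustration, not the paper's implementation: the diagonal covariances, the mixing weight `alpha`, the per-frame decision, and all function names are assumptions, and the GMM parameters are taken as already trained.

```python
import numpy as np

def gmm_loglik(x, weights, means, variances):
    """Per-frame log-likelihood under a diagonal-covariance GMM.

    x: (n_frames, dim); weights: (k,); means, variances: (k, dim).
    """
    x = np.atleast_2d(x)
    diff = x[:, None, :] - means[None, :, :]                 # (n, k, dim)
    log_comp = -0.5 * (np.sum(diff**2 / variances, axis=2)
                       + np.sum(np.log(2 * np.pi * variances), axis=1))
    # log sum_k w_k N(x | mu_k, var_k), computed stably
    return np.logaddexp.reduce(np.log(weights) + log_comp, axis=1)

def combined_loglik(x, gmm_a, gmm_b, alpha=0.5):
    """log(alpha * p_a(x) + (1 - alpha) * p_b(x)); alpha is an assumed weight."""
    la = gmm_loglik(x, *gmm_a) + np.log(alpha)
    lb = gmm_loglik(x, *gmm_b) + np.log(1.0 - alpha)
    return np.logaddexp(la, lb)

def classify(x, vocal_gmms, nonvocal_gmms, alpha=0.5):
    """Return True for frames whose vocal likelihood exceeds the non-vocal one."""
    lv = combined_loglik(x, *vocal_gmms, alpha=alpha)
    ln = combined_loglik(x, *nonvocal_gmms, alpha=alpha)
    return lv > ln
```

Here each class is passed as a pair of `(weights, means, variances)` tuples, one for the GMM trained on hand-labeled data and one for the GMM trained on pure data; in practice `x` would hold MFCC frames.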
2.
3.
4.
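When ensemble empirical mode decomposition (EEMD) is used to remove noise from ECG signals, the noise intrinsic mode function (IMF) components are hard to select, and discarding noise components outright distorts the signal. To address these problems, an EEMD-based adaptive threshold algorithm is proposed. The noisy electrocardiogram (ECG) data are first decomposed by EEMD into IMFs, and the Mahalanobis distance is used to classify each IMF as a signal or noise component. The fruit fly optimization algorithm then determines the thresholds for the noise IMFs, and the thresholded components are recombined with the remaining components to reconstruct the denoised ECG. Experiments on ECG data from the MIT-BIH database show that the method preserves signal detail well while denoising.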
5.
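To address the shortcomings of existing impulse noise removal algorithms in denoising performance and computational efficiency, a dual-tree complex wavelet image denoising algorithm based on neighborhood statistical detection is proposed. Noise is detected from statistical properties such as the gray-level characteristics of the noise, the majority rule among neighboring pixels, and gray-level deviation. Exploiting the favorable properties of the dual-tree complex wavelet transform, the noisy image is denoised in the dual-tree complex wavelet domain with a smooth, differentiable threshold function and adaptive thresholds. Finally, pixels from the denoised image replace the corresponding noisy pixels in the noisy image to produce the final result. Experimental data show that the proposed method outperforms some recently proposed algorithms, with good denoising performance and high computational efficiency.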
6.
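To address the shortcomings of conventional ECG (electrocardiogram) denoising algorithms, an adaptive combined denoising algorithm based on morphology and the wavelet transform is proposed. The algorithm removes baseline drift with a morphological filter and high-frequency interference with a wavelet filter, uses the ECG noise components obtained from these two stages as the reference input of an adaptive filter, and adaptively filters the ECG signal to obtain the denoised ECG. Experiments show that the algorithm is effective.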
7.
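For speech denoising, a method based on the cycle-consistent generative adversarial network (CycleGAN) is proposed for denoising speech in acoustic scenes. The method combines and optimizes the CycleGAN model with cross-domain voice conversion techniques: it extracts spectral envelope features of the speech and encodes and decodes the speech, aiming to achieve end-to-end speech denoising with advanced generative techniques, thereby simplifying the high-order difference problem that arises in speech denoising while generalizing the application scenarios. The model is trained and tested on non-parallel and parallel datasets, and its denoising results are compared mainly with those of a conventional CycleGAN speech denoising method. The PESQ, NR, and SSNR metrics improve by 8.49%, 6.53%, and 23.30% in relative terms, respectively, effectively solving the non-parallel speech denoising problem in real scenarios.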
8.
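The ultra-high-frequency (UHF) method is one of the most commonly used methods for condition monitoring of GIS equipment. In practice, however, the UHF signals collected by the sensors usually carry considerable noise, which degrades detection accuracy, so effective noise suppression is a key technique in partial discharge detection. Based on wavelet packet denoising theory, this paper proposes an improved threshold function that jointly considers smoothness and edge features, in order to denoise simulated UHF signals of GIS partial discharge. The denoising effect is evaluated with three measures: signal-to-noise ratio (SNR), mean square error (MSE), and the waveform similarity coefficient (NCC). The results show that the improved wavelet packet threshold denoising method suppresses noise well and can improve the initial data quality of acquired GIS partial discharge signals.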
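The three evaluation measures named in entry 8 can be sketched as below. The abstract does not give their formulas, so the definitions here (SNR in dB relative to the residual, MSE as the mean squared residual, NCC as the normalized inner product of the two waveforms) are commonly used forms assumed for illustration, and the function name is hypothetical.

```python
import numpy as np

def denoising_metrics(clean, denoised):
    """Return (SNR in dB, MSE, NCC) for a denoised signal against a clean reference.

    Assumed definitions; the residual must be non-zero for the SNR to be finite.
    """
    clean = np.asarray(clean, dtype=float)
    denoised = np.asarray(denoised, dtype=float)
    err = clean - denoised                                   # residual after denoising
    mse = np.mean(err**2)
    snr = 10.0 * np.log10(np.sum(clean**2) / np.sum(err**2))
    ncc = np.sum(clean * denoised) / np.sqrt(np.sum(clean**2) * np.sum(denoised**2))
    return snr, mse, ncc
```

A higher SNR, a lower MSE, and an NCC closer to 1 all indicate that the denoised waveform is closer to the reference.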
9.
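In sensor network research, removing noise from sensed data is an important topic. Existing denoising algorithms do not account for non-uniform node density or information congestion, and therefore consume excessive energy. Taking these two factors into account and using weighting along the time dimension, an in-network adaptive denoising algorithm based on node density, DHA (density-based hybrid approach), is proposed. DHA makes algorithmic decisions according to node density and applies weighting in the time dimension, so it responds quickly to data changes and improves data accuracy. Experimental results show that DHA saves more energy than WMA (weighted moving average-based), currently the best denoising algorithm, while maintaining good denoising quality and fast response time.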
10.
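To make better use of image priors and to protect detail such as edges and textures, an improved algorithm is proposed that combines trained nonlinear reaction diffusion (TNRD) with patch group prior based denoising (PGPD). First, the image denoised by PGPD is decomposed by wavelets into three orthogonal subbands; theoretical analysis shows that the image is the sum of the subbands. Reaction diffusion is then applied to the subband portions whose high-frequency coefficients exceed a threshold, and the results replace the original portions to give the final denoised image. Experimental results show that the improved algorithm achieves considerable gains in peak signal-to-noise ratio and in detail preservation.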
11.
《Multimedia, IEEE Transactions on》2009,11(1):68-76
In the past, similarity search for audio data has largely been focused on music. Recent digitization efforts in some of the larger animal sound archives bring other types of audio recordings into the focus of interest. Although recordings in animal sound archives are usually very well annotated by metadata, it is almost impossible to manually annotate all sounds made by animals in each recording. Complementary to classical text-based querying of databases that exploit available annotations, algorithms capable of automatically finding sections of recordings similar to a given query fragment provide a promising approach for content-based navigation. In our work, we present algorithms for feature extraction, as well as indexing and retrieval of animal sound recordings. Making use of a concept from image processing, the structure tensor, our feature extraction algorithm is adapted to the typical curve-like spectral features that are characteristic of many types of animal sounds. We propose a method for similarity search in animal sound databases which is obtained by adding a novel ranking scheme to an existing inverted file based approach for multimedia retrieval. Evaluation of our methods is based on recordings from the Animal Sound Archive, Berlin.
12.
Wichern G. Xue J. Thornburg H. Mechtley B. Spanias A. 《IEEE transactions on audio, speech, and language processing》2010,18(3):688-707
13.
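In cough recognition, speech is the main factor affecting recognition accuracy. Analysis of the spectral similarity between adjacent frames of cough and of speech shows that the frequency-band cross-correlation coefficient between adjacent frames is markedly smaller for cough than for speech, so it can serve as a dynamic feature for distinguishing cough from speech. Under identical experimental conditions, with MFCC as the static feature, cough recognition performance was compared using the frequency-band cross-correlation coefficient and first-order MFCC as dynamic feature parameters. Cough recognition experiments on multiple sets of recordings show that using the frequency-band cross-correlation coefficient as the dynamic feature gives an average accuracy of 90.27%, better than first-order MFCC.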
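One plausible reading of the frequency-band cross-correlation coefficient in entry 13 is sketched below. The abstract does not define the feature precisely, so the Hann window, the equal-width band split, the use of the Pearson correlation on magnitude spectra, and the name `band_xcorr` are all assumptions made for illustration.

```python
import numpy as np

def band_xcorr(frames, n_bands=4):
    """Correlation between the same frequency band of adjacent frames.

    frames: (n_frames, frame_len) array of time-domain frames.
    Returns an (n_frames - 1, n_bands) array of coefficients in [-1, 1].
    """
    frames = np.asarray(frames, dtype=float)
    window = np.hanning(frames.shape[1])
    spec = np.abs(np.fft.rfft(frames * window, axis=1))      # magnitude spectra
    out = np.empty((frames.shape[0] - 1, n_bands))
    for b, band in enumerate(np.array_split(spec, n_bands, axis=1)):
        prev, curr = band[:-1], band[1:]                     # adjacent frame pairs
        prev = prev - prev.mean(axis=1, keepdims=True)
        curr = curr - curr.mean(axis=1, keepdims=True)
        denom = np.sqrt((prev**2).sum(axis=1) * (curr**2).sum(axis=1))
        out[:, b] = (prev * curr).sum(axis=1) / np.maximum(denom, 1e-12)
    return out
```

Under this reading, frames whose spectra change little from one frame to the next give coefficients near 1, while rapidly changing spectra give smaller values, which matches the abstract's observation that cough scores lower than speech.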
14.
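Coughs carry rich pathological information and can provide important support for clinical diagnosis. Automatic cough detection helps improve the reliability of detection results and reduces manual workload. In naturally recorded speech signals, however, non-cough signals far outnumber coughs, so automatic detection of cough signals in a speech stream is a typical class-imbalance problem. To address it, a cough detection model based on partial least squares classification, APLSCX, is proposed. It exploits the ability of the asymmetric partial least squares classifier to handle class-imbalanced data, performs feature extraction on normalized feature vectors, and adjusts the classification plane based on the variance of the low-dimensional data. Experimental results show that, compared with mainstream models such as LCM and SVM, APLSCX balances recall and precision on the minority class, achieves a higher detection rate and a lower false alarm rate, and is better suited to detecting cough signals in natural speech streams.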
15.
Brdiczka O. Langet M. Maisonnasse J. Crowley J.L. 《Automation Science and Engineering, IEEE Transactions on》2009,6(4):588-597
This paper addresses learning and recognition of human behavior models from multimodal observation in a smart home environment. The proposed approach is part of a framework for acquiring a high-level contextual model for human behavior in an augmented environment. A 3-D video tracking system creates and tracks entities (persons) in the scene. Further, a speech activity detector analyzes audio streams coming from head set microphones and determines for each entity, whether the entity speaks or not. An ambient sound detector detects noises in the environment. An individual role detector derives basic activity like "walking" or "interacting with table" from the extracted entity properties of the 3-D tracker. From the derived multimodal observations, different situations like "aperitif" or "presentation" are learned and detected using statistical models (HMMs). The objective of the proposed general framework is two-fold: the automatic offline analysis of human behavior recordings and the online detection of learned human behavior models. To evaluate the proposed approach, several multimodal recordings showing different situations have been conducted. The obtained results, in particular for offline analysis, are very good, showing that multimodality as well as multiperson observation generation are beneficial for situation recognition.
16.
17.
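Speech endpoint detection in environments with both stationary noise and burst noise is challenging. Building on a selection of noise-robust features, an endpoint detection method is proposed that uses an adaptive decision threshold and a multilayer perceptron to discriminate speech from noise. Experimental results show that at a signal-to-noise ratio of 0 dB, the selected speech parameters yield a correct endpoint detection rate 27% higher than the traditional frame energy and zero-crossing rate, and that under normal conditions the multilayer perceptron detects 94.47% of isolated burst noises such as door opening and closing, coughing, page turning, and breathing.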
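The traditional baseline features that entry 17 compares against, frame energy and zero-crossing rate, can be computed as in the sketch below. The frame length, hop size, and function names are assumptions for illustration; the abstract's own noise-robust features and adaptive threshold are not specified and are not reproduced here.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames of shape (n_frames, frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def energy_and_zcr(x, frame_len=256, hop=128):
    """Per-frame short-time energy and zero-crossing rate."""
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    energy = np.sum(frames**2, axis=1)
    signs = np.signbit(frames)
    # fraction of adjacent sample pairs whose sign differs
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    return energy, zcr
```

A simple detector would mark a frame as speech when its energy exceeds a threshold, using the zero-crossing rate to catch low-energy unvoiced sounds; this is the kind of baseline that degrades at low signal-to-noise ratios.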
18.
19.
Donatas Trapenskas, Örjan Johansson 《International Journal of Industrial Ergonomics》2001,27(6):405-410
One of the problems associated with listening to binaurally recorded sound events is localization confusions. The main objective of this investigation was to find out whether a short training session prior to listening to binaural recordings through headphones would facilitate correct spatial perception of the sound field. Focus was on the localization of the sound stimuli in the median plane. Sound signals were recorded with an artificial head in three different conditions, namely anechoic, highly reverberant and moderately reverberant. Fourteen subjects participated in the listening tests. All subjects were required to localize all virtual sound stimuli under two different conditions. The first condition had a short training session binaurally recorded in the same environments as preceding sound stimuli, and only sound stimuli recorded in the same environment were presented. The second condition did not have a training session, and sound stimuli recorded in different environments were presented. Results showed that a short training session prior to listening to binaurally recorded sounds through headphones was useful as it facilitated localization performance. The biggest effect was in a reduced number of sounds perceived inside the head. It was most pronounced for sound stimuli recorded in an anechoic environment.
20.
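Research on spatial hearing shows that the measurement and study of head-related transfer functions (HRTF) have attracted considerable attention. With a sound source in a free field, the measurement consists of signals recorded at the left and right eardrums, and the variation of the measured HRTF magnitude response is a function of source position. This paper reviews and analyzes some measurement methods and measurement data from recent years.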