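Effectively detecting the singing segments in popular music is of considerable value for music retrieval, browsing, and classification in large databases, as well as for melody extraction and singer identification. This paper uses Mel-frequency cepstral coefficients (MFCC), widely used in speech signal processing, as the features for analyzing the music signal, and uses Gaussian mixture models (GMM) to model the accompaniment (non-vocal) and singing (vocal) parts of the music separately, thereby detecting the singing segments automatically. The conventional approach trains one GMM per class from a single set of training data hand-labeled as vocal and non-vocal. Building on this, the paper additionally trains one GMM per class from a set of pure singing data and a set of pure accompaniment data, and then linearly combines the resulting two vocal GMMs and two non-vocal GMMs to obtain the probability model for each class. A likelihood classifier serves as the decision function. Experimental results show that the method effectively improves recognition performance.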
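
The linear combination of two GMMs per class and the likelihood decision described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes one-dimensional features instead of MFCC vectors, and the model parameters, the combination weight `alpha`, and all function names are hypothetical.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def gmm_pdf(x, gmm):
    """Density of a 1-D GMM given as a list of (weight, mean, variance)."""
    return sum(w * gaussian_pdf(x, m, v) for w, m, v in gmm)

def combine(gmm_a, gmm_b, alpha):
    """Linear combination alpha*gmm_a + (1 - alpha)*gmm_b, which is again a GMM."""
    return ([(alpha * w, m, v) for w, m, v in gmm_a]
            + [((1.0 - alpha) * w, m, v) for w, m, v in gmm_b])

def log_likelihood(frames, gmm):
    """Sum of per-frame log densities (frames treated as independent)."""
    return sum(math.log(gmm_pdf(x, gmm)) for x in frames)

def classify(frames, vocal_gmm, nonvocal_gmm):
    """Pick the class whose model gives the higher log-likelihood."""
    if log_likelihood(frames, vocal_gmm) > log_likelihood(frames, nonvocal_gmm):
        return "vocal"
    return "non-vocal"

# Hypothetical toy models: one GMM from hand-labeled data and one from
# pure singing / pure accompaniment data for each class.
vocal_labeled = [(1.0, 2.0, 1.0)]
vocal_pure = [(0.5, 2.5, 0.5), (0.5, 1.5, 0.5)]
nonvocal_labeled = [(1.0, -2.0, 1.0)]
nonvocal_pure = [(1.0, -2.5, 0.5)]

vocal_model = combine(vocal_labeled, vocal_pure, alpha=0.5)
nonvocal_model = combine(nonvocal_labeled, nonvocal_pure, alpha=0.5)
```

Because the weights of each input GMM sum to one, the combined weights also sum to one for any `alpha` between 0 and 1, so the result can be evaluated like any other GMM.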

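Application of the wavelet transform to pulsed eddy current testing signals   Total citations: 3, self-citations: 1, citations by others: 2
Pulsed eddy current testing is an emerging branch of eddy current testing. Noisy defect signals were acquired with an experimental setup. The paper introduces the basic principle of wavelet denoising, studies the denoising of pulsed eddy current testing signals, and processes those signals by wavelet coefficient denoising. Experimental results show that wavelet coefficient denoising significantly improves the signal-to-noise ratio of the defect signal.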
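
The basic principle of wavelet coefficient denoising (transform, shrink the detail coefficients, invert) can be sketched as below. The abstract does not name the wavelet, the decomposition depth, or the threshold rule, so the one-level Haar transform, the soft threshold, and the function names here are assumptions chosen for brevity.

```python
import math

def haar_forward(signal):
    """One-level orthonormal Haar transform of an even-length signal."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out

def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; small ones become zero."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

def denoise(signal, t):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, t))
```

With a threshold of zero the signal is reconstructed unchanged; with a very large threshold every pair of samples is replaced by its mean. A practical system would typically use several decomposition levels and a data-driven threshold.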

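This paper studies part position recognition on automated industrial production lines, using image processing to recognize part positions automatically. Part images in the dataset are first captured with a two-dimensional monocular setup. The images then undergo a series of preprocessing steps, including equalization, grayscale conversion, and wavelet denoising. An image segmentation model based on a region-based algorithm and the Canny edge detector, an optimal operator, separates two different parts in the same space. The two resulting part images are fed to a Blob-based part position recognition model, which identifies the position coordinates of the two different parts in the attachment.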

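When ensemble empirical mode decomposition (EEMD) is used to remove noise from ECG signals, the noise intrinsic mode function (IMF) components are hard to select, and discarding the noise components outright distorts the signal. To address these problems, an adaptive thresholding algorithm based on EEMD is proposed. The noisy electrocardiogram (ECG) data are first decomposed by EEMD into IMFs, and the Mahalanobis distance is used to decide which IMF components are signal and which are noise. The fruit fly optimization algorithm then determines the thresholds for the noise IMFs, and the thresholded components are reconstructed together with the remaining components to obtain the denoised ECG. Finally, experiments on ECG data from the MIT-BIH database show that the method preserves signal detail well while removing noise.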

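To address the shortcomings of existing impulse noise removal algorithms in denoising performance and computational efficiency, a dual-tree complex wavelet image denoising algorithm based on neighborhood statistical detection is proposed. Noise is detected from statistical properties such as the gray-level characteristics of the noise, the majority principle among neighboring pixels, and gray-level deviation. Exploiting the favorable properties of the dual-tree complex wavelet transform, the noisy image is denoised in the dual-tree complex wavelet domain with a smooth, differentiable threshold function and an adaptive threshold. Finally, pixels from the denoised image replace the corresponding noisy pixels in the noisy image to produce the final denoised image. Experimental data show that the proposed method outperforms some recently proposed algorithms, with good denoising performance and fast computation.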

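To address the shortcomings of conventional ECG (electrocardiogram) denoising algorithms, an adaptive combined denoising algorithm based on morphology and the wavelet transform is proposed. The algorithm uses a morphological filter to remove baseline drift and a wavelet filter to remove high-frequency interference, and takes the ECG noise components obtained from these two stages as the reference input of an adaptive filter, which then filters the ECG signal adaptively to yield the denoised ECG signal. Experiments show that the algorithm is an effective denoising algorithm.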

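For speech denoising, a method based on the cycle-consistent generative adversarial network (CycleGAN) is proposed to denoise speech in acoustic scenes. The method combines and optimizes the CycleGAN network model with cross-domain voice conversion techniques. By extracting spectral envelope features and encoding and decoding the speech, it aims to achieve end-to-end speech denoising with advanced generative techniques, thereby simplifying the high-order discrepancy problem that arises in speech denoising and broadening the range of application scenarios. The method is trained and tested on non-parallel and parallel datasets and compared mainly with the conventional CycleGAN speech denoising method. The experimental results show relative improvements of 8.49%, 6.53%, and 23.30% in PESQ, NR, and SSNR respectively, effectively solving the non-parallel speech denoising problem in real scenes.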

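The ultra-high-frequency (UHF) method is one of the most commonly used methods for condition monitoring of GIS equipment. In practice, however, the UHF signals acquired by the sensors usually carry considerable noise, which impairs detection accuracy, so effective noise suppression is a key technology in partial discharge detection. Based on wavelet packet denoising theory, this paper proposes an improved threshold function that jointly considers smoothness and edge features, in order to denoise simulated UHF signals of GIS partial discharge. The denoising effect is evaluated with three measures: signal-to-noise ratio (SNR), mean squared error (MSE), and waveform similarity coefficient (NCC). The results show that the improved wavelet packet threshold method suppresses noise well and can improve the initial data quality of acquired GIS partial discharge signals.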
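
The three evaluation measures named in this abstract can be sketched as below. The abstract gives no formulas, so these are the commonly used definitions and should be read as assumptions: SNR as the ratio of clean-signal energy to residual energy in decibels, MSE as the mean squared residual, and NCC as the normalized cross-correlation at zero lag. The function names are hypothetical.

```python
import math

def mse(clean, denoised):
    """Mean squared error between the clean and denoised signals."""
    return sum((c - d) ** 2 for c, d in zip(clean, denoised)) / len(clean)

def snr_db(clean, denoised):
    """Signal-to-noise ratio in dB, treating (clean - denoised) as noise."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    return 10.0 * math.log10(signal_power / noise_power)

def ncc(clean, denoised):
    """Normalized cross-correlation at zero lag; 1.0 means identical shape."""
    num = sum(c * d for c, d in zip(clean, denoised))
    den = math.sqrt(sum(c * c for c in clean) * sum(d * d for d in denoised))
    return num / den
```

A higher SNR, a lower MSE, and an NCC closer to 1 all indicate that the denoised waveform is closer to the clean reference. These measures require a known clean signal, which is why they suit simulated signals such as those in the abstract.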

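In sensor network research, removing noise from sensed data is an important topic. Existing denoising algorithms do not account for uneven node density or information congestion and therefore consume excessive energy. Taking these two factors into account and using time-dimension weighting, an in-network adaptive denoising algorithm based on node density, DHA (density-based hybrid approach), is proposed. DHA makes algorithmic decisions according to node density and applies weighting in the time dimension, so it responds quickly to data changes and improves data accuracy. Experimental results show that DHA saves more energy than WMA (weighted moving average-based), currently the best denoising algorithm, while maintaining good denoising quality and fast response time.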

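To make better use of image priors and protect detail such as edges and textures, an improved algorithm combining Trained Nonlinear Reaction Diffusion (TNRD) with Patch Group Prior based Denoising (PGPD) is proposed. First, the image denoised by PGPD is wavelet-decomposed into three orthogonal subbands; theoretical analysis shows that the image equals the sum of the subbands. Reaction diffusion is then applied to the subband portions whose high-frequency coefficients exceed a threshold, and the results replace the original portions to yield the final denoised image. Experimental results show that the improved algorithm achieves considerable gains in peak signal-to-noise ratio and detail preservation.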

In the past, similarity search for audio data has largely been focused on music. Recent digitization efforts in some of the larger animal sound archives bring other types of audio recordings into the focus of interest. Although recordings in animal sound archives are usually very well annotated by metadata, it is almost impossible to manually annotate all sounds made by animals in each recording. Complementary to classical text-based querying of databases that exploit available annotations, algorithms capable of automatically finding sections of recordings similar to a given query fragment provide a promising approach for content-based navigation. In our work, we present algorithms for feature extraction, as well as indexing and retrieval of animal sound recordings. Making use of a concept from image processing, the structure tensor, our feature extraction algorithm is adapted to the typical curve-like spectral features that are characteristic of many types of animal sounds. We propose a method for similarity search in animal sound databases which is obtained by adding a novel ranking scheme to an existing inverted-file-based approach for multimedia retrieval. Evaluation of our methods is based on recordings from the Animal Sound Archive, Berlin.

We propose a method for characterizing sound activity in fixed spaces through segmentation, indexing, and retrieval of continuous audio recordings. Regarding segmentation, we present a dynamic Bayesian network (DBN) that jointly infers onsets and end times of the most prominent sound events in the space, along with an extension of the algorithm for covering large spaces with distributed microphone arrays. Each segmented sound event is indexed with a hidden Markov model (HMM) that models the distribution of example-based queries that a user would employ to retrieve the event (or similar events). In order to increase the efficiency of the retrieval search, we recursively apply a modified spectral clustering algorithm to group similar sound events based on the distance between their corresponding HMMs. We then conduct a formal user study to obtain the relevancy decisions necessary for evaluation of our retrieval algorithm on both automatically and manually segmented sound clips. Furthermore, our segmentation and retrieval algorithms are shown to be effective in both quiet indoor and noisy outdoor recording conditions.

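In cough recognition, speech is the main factor affecting recognition accuracy. An analysis of the spectral similarity between adjacent frames of cough and speech shows that the frequency-band cross-correlation coefficient between adjacent cough frames is markedly smaller than that for speech, so this coefficient can serve as a dynamic feature for distinguishing cough from speech. Under identical experimental conditions, with MFCC as the static feature, cough recognition performance is compared using the band cross-correlation coefficient and first-order MFCC as the dynamic feature. Cough recognition experiments on multiple sets of recordings show that using the band cross-correlation coefficient as the dynamic feature gives an average accuracy of 90.27%, outperforming first-order MFCC.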
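
A dynamic feature of this kind, measuring how similar the band spectra of adjacent frames are, can be sketched as below. The abstract does not define the band cross-correlation coefficient, so this sketch assumes a Pearson correlation between the band-energy vectors of consecutive frames; the input layout (one list of band energies per frame) and the function names are hypothetical.

```python
import math

def correlation(u, v):
    """Pearson correlation between two equal-length band-energy vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mean_u) ** 2 for a in u)
                    * sum((b - mean_v) ** 2 for b in v))
    # A flat spectrum has no variance; report zero correlation in that case.
    return cov / den if den > 0.0 else 0.0

def adjacent_frame_correlation(band_spectra):
    """One coefficient per pair of consecutive frames."""
    return [correlation(band_spectra[k], band_spectra[k + 1])
            for k in range(len(band_spectra) - 1)]
```

Under this assumed definition, a spectrum that keeps its shape from frame to frame yields values near 1, while a rapidly changing spectrum yields lower values, which matches the abstract's observation that cough frames correlate less than speech frames.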

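Coughs carry rich pathological information and can provide important support for clinical diagnosis. Automatic cough detection helps improve the reliability of detection results and reduces manual workload. In naturally recorded speech signals, however, non-cough signals far outnumber coughs, so automatic cough detection in a speech stream is a typical class-imbalance problem. To address it, a cough detection model based on partial least squares classification, APLSCX, is proposed. It exploits the ability of the asymmetric partial least squares classifier to handle class-imbalanced data, performs feature extraction on normalized feature vectors, and adjusts the classification plane based on the variance of the low-dimensional data. Experimental results show that, compared with mainstream models such as LCM and SVM, APLSCX balances recall and precision for the minority class, achieves a higher detection rate and a lower false alarm rate, and is better suited to detecting coughs in natural speech streams.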

This paper addresses learning and recognition of human behavior models from multimodal observation in a smart home environment. The proposed approach is part of a framework for acquiring a high-level contextual model for human behavior in an augmented environment. A 3-D video tracking system creates and tracks entities (persons) in the scene. Further, a speech activity detector analyzes audio streams coming from headset microphones and determines for each entity whether the entity speaks or not. An ambient sound detector detects noises in the environment. An individual role detector derives basic activities like "walking" or "interacting with table" from the extracted entity properties of the 3-D tracker. From the derived multimodal observations, different situations like "aperitif" or "presentation" are learned and detected using statistical models (HMMs). The objective of the proposed general framework is two-fold: the automatic offline analysis of human behavior recordings and the online detection of learned human behavior models. To evaluate the proposed approach, several multimodal recordings showing different situations have been conducted. The obtained results, in particular for offline analysis, are very good, showing that multimodality as well as multiperson observation generation are beneficial for situation recognition.

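Cough sound is one of the common and important symptoms of many diseases and carries extremely important clinical information. Because current multi-parameter wireless home monitoring systems neglect this important physiological parameter, a cough sound detection system based on a wireless sensor network is designed. The SPCE061A microcontroller and the CC2430 wireless transceiver module together form the hardware and software platform of the key nodes, and a double-threshold algorithm using short-time energy and zero-crossing rate detects the signal. Test results show that the system can...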
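
A double-threshold detector based on short-time energy and zero-crossing rate can be sketched as below. This is a generic textbook-style variant, not the firmware described in the abstract: a high energy threshold confirms an event, and the boundaries are then extended while either the energy stays above a low threshold or the zero-crossing rate stays above its threshold. Frame sizes, threshold values, and function names are hypothetical.

```python
def split_frames(signal, size, hop):
    """Cut the signal into frames of `size` samples, advancing by `hop`."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def short_time_energy(frame):
    """Sum of squared samples in the frame."""
    return sum(x * x for x in frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def detect_events(signal, size, hop, e_high, e_low, z_thresh):
    """Return inclusive (start_frame, end_frame) pairs of detected events."""
    frames = split_frames(signal, size, hop)
    energy = [short_time_energy(f) for f in frames]
    zcr = [zero_crossing_rate(f) for f in frames]

    def active(k):
        # A frame may extend an event if it passes either low-level test.
        return energy[k] > e_low or zcr[k] > z_thresh

    events = []
    i = 0
    while i < len(frames):
        if energy[i] > e_high:
            # Confirmed by the high threshold; grow the event in both directions.
            start = i
            floor = events[-1][1] + 1 if events else 0
            while start > floor and active(start - 1):
                start -= 1
            end = i
            while end + 1 < len(frames) and active(end + 1):
                end += 1
            events.append((start, end))
            i = end + 1
        else:
            i += 1
    return events

# Toy signal: silence, a weak onset, a loud burst, then silence again.
toy = [0.0] * 40 + [0.1, -0.1] * 5 + [1.0, -1.0] * 10 + [0.0] * 30
toy_events = detect_events(toy, size=10, hop=10, e_high=5.0, e_low=0.05, z_thresh=0.5)
```

Using two energy thresholds keeps weak onsets and tails attached to a confirmed event without letting low-level noise trigger a detection on its own.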

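汤霖 (Tang Lin), 姜世芬 (Jiang Shifen). 《计算机工程与应用》 (Computer Engineering and Applications), 2012, 48(29): 114-118, 156
Speech endpoint detection in environments with both stationary and burst noise is challenging. Building on a selection of noise-robust features, an endpoint detection method is proposed that uses an adaptive decision threshold and a multilayer perceptron to discriminate speech from noise. Experimental results show that, at a signal-to-noise ratio of 0 dB, the selected speech parameters give a correct endpoint detection rate 27% higher than conventional frame energy and zero-crossing rate, and that the multilayer perceptron detects 94.47% of isolated burst noises such as door opening and closing, coughing, page turning, and breathing in a normal environment.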

One of the problems associated with listening to binaurally recorded sound events is localization confusion. The main objective of this investigation was to find out whether a short training session prior to listening to binaural recordings through headphones would facilitate correct spatial perception of the sound field. Focus was on the localization of the sound stimuli in the median plane. Sound signals were recorded with an artificial head in three different conditions, namely anechoic, highly reverberant, and moderately reverberant. Fourteen subjects participated in the listening tests. All subjects were required to localize all virtual sound stimuli under two different conditions. The first condition had a short training session binaurally recorded in the same environments as preceding sound stimuli, and only sound stimuli recorded in the same environment were presented. The second condition did not have a training session, and sound stimuli recorded in different environments were presented. Results showed that a short training session prior to listening to binaurally recorded sounds through headphones was useful as it facilitated localization performance. The biggest effect was a reduced number of sounds perceived inside the head. It was most pronounced for sound stimuli recorded in an anechoic environment.

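Research on spatial hearing shows that the measurement and study of the head-related transfer function (HRTF) has attracted considerable attention. With a sound source in a free field, the measurement consists of signals recorded at the left and right eardrums, and the variation in the measured HRTF magnitude response is a function of source position. This paper studies and analyzes some measurement methods and measurement data from recent years.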
