Similar Documents
Found 19 similar documents; search took 203 ms
1.
李强  于凤芹 《计算机应用》2018,38(8):2411-2415
In polyphonic music, mutual interference between sources breaks the continuity of a single source's pitch sequence and degrades pitch-estimation accuracy. To address this, a melody-extraction algorithm with improved pitch-contour creation and selection is proposed. The algorithm first computes the pitch salience at every point of the time-frequency spectrum and creates pitch contours from auditory-stream cues and the continuity of pitch salience. To further select the melody contours, non-melody contours are then removed by exploiting the repetitive nature of the accompaniment, using dynamic time warping (DTW) to measure the similarity between melody and non-melody pitch contours. Finally, octave errors in the melody contours are detected from the long-term relationship between neighboring contours, and the melody contours are smoothed into a melody pitch line. Simulations on the ORCHSET dataset show that the improved algorithm raises pitch-estimation accuracy by 2.86% and overall accuracy by 3.32% over the baseline, effectively addressing the pitch-estimation problem.
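The DTW similarity step between pitch contours can be illustrated with a minimal, textbook dynamic-time-warping distance. This is a generic sketch, not the authors' implementation; the local cost here is a plain absolute pitch difference:

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two pitch contours
    (sequences of pitch values, e.g. in cents or MIDI numbers)."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j]: minimal cumulative cost aligning a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])          # local pitch distance
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]
```

Because the warping path can stretch either contour in time, a repeated note absorbs into its neighbor at zero cost, which is what makes DTW suitable for comparing contours of unequal length.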

2.
In power systems, the harmonic impedance varies with the current amplitude, and the resulting power-fluctuation interference degrades estimation accuracy. A harmonic impedance estimation method based on linearity checking is proposed. The sinusoidal, linear, and random fluctuations of the current amplitude over time are analyzed to identify how power fluctuations interfere with harmonic impedance estimation. Harmonic impedance data are collected and a maximum-likelihood model of the system harmonic impedance is built: independent samples of the complex-normal linearity variables are drawn, the probability density values of the linearity variables are computed, the maximum-likelihood function is assembled, and the model is solved to obtain the harmonic impedance estimate for the power system. Simulation results show improved estimation accuracy and stability.

3.
To improve speaker-tracking accuracy in noisy, reverberant environments, a hybrid particle-filter-based source-tracking algorithm is proposed. Because the SNR of the received signal varies widely, the algorithm computes the particle-state observations of each frame with a phase-transform-weighted steered response power (SRP-PHAT) localization function and uses the variance of these values to classify frames as high- or low-SNR. For high-SNR frames, particle weights are evaluated with a likelihood built from that localization function; for low-SNR frames, a likelihood built from a conventional steered-beamforming localization function is used instead. Simulations show that at high average SNR the tracking performance is close to that of conventional algorithms, while at average SNR below 10 dB and reverberation times above 200 ms the tracking error is 20%-30% lower than that of conventional algorithms.
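The frame-classification idea can be sketched as a simple variance test on the localization-function values over the particle set. This is a hypothetical sketch: the threshold value, and the assumption that a large spread indicates a distinct peak (a high-SNR frame), are illustrative choices, not taken from the paper:

```python
def choose_likelihood(srp_values, var_threshold):
    """Pick which likelihood to use for a frame from the spread of the
    localization-function values evaluated at the particle positions.
    Assumption (illustrative): a large variance means the response
    surface has a distinct peak (high-SNR frame), a small variance a
    flat response (low-SNR frame). `var_threshold` would be tuned."""
    mean = sum(srp_values) / len(srp_values)
    var = sum((v - mean) ** 2 for v in srp_values) / len(srp_values)
    return "srp_phat" if var >= var_threshold else "srp"
```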

4.
To improve the accuracy of timbre-feature extraction for traditional Chinese instruments, a timbre classification method combining harmonic feature extraction with a support vector machine (SVM) is proposed. Given the importance of harmonic structure to timbre, a discrete harmonic transform is introduced to extract the timbral harmonics and build a timbre spectrum; this is then fused with LPCC, MFCC, and other features as the SVM input to classify the different instruments. Results show that the fused features yield markedly higher recognition accuracy than any single feature. On single-note recognition the method reaches about 96%, above KNN and other baselines; on music clips, the recognition accuracy over all instruments and the average measure on the clip database improve by 7.3% and 2.23%, respectively.

5.
Maximum-likelihood network topology estimation attains a globally optimal result, outperforming local-optimization and node-pair-merging methods, but its computational cost is high for large networks. This paper first proves that the topology-estimation likelihood function is unimodal (it has a single extremum) and that this extremum is the maximum. Exploiting unimodality, the existing maximum-likelihood method is then improved: the maximum-likelihood tree search never needs to return to states of lower likelihood, which reduces the computational cost. Matlab and ns-2 simulations show that the improved algorithm cuts computation by 30%-46% without reducing topology-estimation accuracy.
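The pruning that unimodality licenses can be shown on a toy discrete search: once the likelihood decreases, the maximum cannot lie ahead, so the search never needs to backtrack to lower-likelihood states. This is a generic illustration of the principle, not the paper's tree-search procedure:

```python
def unimodal_argmax(values):
    """Find the peak of a unimodal sequence by scanning until the
    first decrease. Because there is a single extremum and it is the
    maximum, the first drop proves the peak is behind us, so no
    return to lower-valued states is ever needed."""
    best = 0
    for i in range(1, len(values)):
        if values[i] < values[i - 1]:
            break  # past the single peak: stop, do not backtrack
        best = i
    return best
```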

6.
Annealed particle filter target tracking with multi-feature fusion   (Cited by 1)
To overcome the drawback that the proposal distribution of a conventional particle filter ignores the current observation, a particle-filter tracking method is proposed in which an annealing procedure driven by multi-feature fusion improves the proposal distribution. The method alleviates the heavy computation and particle impoverishment of high-dimensional state spaces. Annealing produces a better proposal distribution within the Monte Carlo importance-sampling framework, replacing simple sampling from the prior with annealed likelihood sampling. In the likelihood approximation, fused color and edge image features are weighted at the different annealing layers to form the weight function. Experiments on moving targets against cluttered backgrounds and under occlusion show that the method achieves high tracking accuracy and strong stability.

7.
To address the low accuracy and weak robustness of sound-source localization under reverberation and noise, a multi-feature adaptive IMM particle-filter algorithm is proposed. The algorithm takes multiple features of the microphone signals as observations, builds a time-delay selection mechanism from spatio-temporal correlation and a beam-output energy optimization mechanism from iterative filtering, and constructs the likelihood function on both to obtain reliable source-position information. To handle the randomness of speaker motion, an adaptive interacting multiple model (IMM) scheme is given: particle sets are generated online and models with different process variances interact to fit the speaker's different motion patterns, improving the robustness of the tracking system. Simulations and real recordings show that the algorithm exploits the complementarity of the multi-feature localization cues, reduces the effect of observation uncertainty on source-position estimates, strengthens robustness when tracking randomly moving sources, and improves localization accuracy.

8.
This paper studies performance optimization for near-field source localization. Because the accuracy of the conventional maximum-likelihood method degrades under spatially non-uniform Gaussian noise, a near-field source-signal model is built on a planar array and a maximum-likelihood estimator of source bearing and range under non-uniform noise is derived. A gravitational search algorithm is then introduced to tame the high cost of the maximum-likelihood search over the multidimensional parameter space. Simulations confirm the feasibility and effectiveness of the improved method and its high estimation accuracy: at low SNR the mean square errors of bearing and range are both smaller than those of the conventional maximum-likelihood method, and at high SNR both approach the Cramér-Rao bound.

9.
Energy-based maximum-likelihood localization in wireless sensor networks (WSNs) resists noise well, achieves high accuracy, and handles multiple targets, but its heavy computation rules out real-time use. To remedy this, a maximum-likelihood localization algorithm based on adaptive iteration is proposed. Taking the cost function as the objective, it searches for the target position adaptively within a given gradient-error bound. To speed convergence and raise accuracy, a variable-step-size search based on the Sigmoid function is introduced. Simulations show that, compared with plain maximum-likelihood localization, the adaptive iterative algorithm needs far less computation while achieving high accuracy, making it practical where both localization accuracy and speed matter.
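A Sigmoid-shaped variable step size can be sketched as follows: the step grows smoothly with the gradient magnitude, so the search moves fast far from the optimum and fine-tunes near it. All numeric parameters here are illustrative choices, not taken from the paper, and a 1-D quadratic stands in for the multi-sensor energy cost:

```python
import math

def adaptive_step(grad_norm, mu_min=0.01, mu_max=0.3, beta=1.0):
    """Sigmoid-shaped step size: near mu_max for large gradients,
    near mu_min for small ones. mu_min, mu_max, beta are illustrative."""
    return mu_min + (mu_max - mu_min) * (2.0 / (1.0 + math.exp(-beta * grad_norm)) - 1.0)

def minimize_cost(grad, x0, tol=1e-6, max_iter=10000):
    """Adaptive-step descent on a 1-D cost function; stops when the
    gradient magnitude falls within the given error bound."""
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            break
        x -= adaptive_step(abs(g)) * g
    return x
```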

10.
《计算机工程》2018,(3):315-321
Conventional pitch-shifting methods ignore timbre and edge tones, which degrades quality and causes distortion. An improved linear-prediction pitch-shifting method is therefore proposed. Because musical audio contains edge tones, the signal is first split into harmonic and impulsive components by harmonic-percussive separation. Based on a linear prediction model, the harmonic component is decomposed into a vocal-tract transfer function and a glottal pulse excitation; the excitation is pitch-shifted by resampling, overlap-add synthesis of the frame signals improves continuity at the joins, and the signal is reconstructed by recombining the harmonic and impulsive components in the frequency domain. Experiments show that the method changes the pitch while keeping the timbre stable and undistorted, with markedly better audio quality than the conventional linear-prediction pitch-shifting method.
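The resampling step that shifts the excitation's pitch can be sketched with plain linear interpolation. Reading the waveform `factor` times faster raises its pitch by that factor (and shortens it); this is a generic sketch of the operation, which the paper applies to the pulse excitation before restoring timbre through the vocal-tract filter:

```python
def resample(signal, factor):
    """Resample by linear interpolation: output sample i reads the
    input at position i * factor. factor > 1 raises pitch (shorter
    output), factor < 1 lowers it (longer output)."""
    n_out = int(len(signal) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor
        k = int(pos)
        frac = pos - k
        nxt = signal[k + 1] if k + 1 < len(signal) else signal[k]
        out.append((1 - frac) * signal[k] + frac * nxt)
    return out
```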

11.
Timbre distance and similarity are expressions of the phenomenon that some music appears similar while other songs sound very different to us. The notion of genre is often used to categorize music, but songs from a single genre do not necessarily sound similar and vice versa. In this work, we analyze and compare a large number of different audio features and psychoacoustic variants thereof for the purpose of modeling timbre distance. The sound of polyphonic music is commonly described by extracting audio features on short time windows during which the sound is assumed to be stationary. The resulting downsampled time series are aggregated to form a high-level feature vector describing the music. We generated high-level features by systematically applying static and temporal statistics for aggregation. The temporal structure of features in particular has previously been largely neglected. A novel supervised feature selection method is applied to the huge set of possible features. The distances of the selected features correspond to timbre differences in music. The features show few redundancies and have high potential for explaining possible clusters. They outperform seven other previously proposed feature sets on several datasets with respect to the separation of the known groups of timbrally different music.
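The static-plus-temporal aggregation can be sketched minimally as follows: static statistics of a short-time feature series combined with the same statistics of its frame-to-frame differences. The paper explores a much larger, systematic set of such aggregations; this shows only the idea:

```python
import math

def aggregate(series):
    """Aggregate a short-time feature series into a fixed-length
    high-level vector: static statistics (mean, std) plus temporal
    statistics (mean and std of the frame-to-frame differences)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    diffs = [b - a for a, b in zip(series, series[1:])]
    dmean = sum(diffs) / len(diffs)
    dstd = math.sqrt(sum((d - dmean) ** 2 for d in diffs) / len(diffs))
    return [mean, std, dmean, dstd]
```

The temporal half is what distinguishes, say, a steady feature from one that drifts: both may share a mean, but only the drifting one has a nonzero mean difference.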

12.
《Ergonomics》2012,55(11):1471-1484
Abstract

The current study applied Structural Equation Modelling to analyse the relationship among pitch, loudness, tempo and timbre and their relationship with perceived sound quality. Twenty-eight auditory signals of horn, indicator, door open warning and parking sensor were collected from 11 car brands. Twenty-one experienced drivers were recruited to evaluate all sound signals with 11 semantic differential scales. The results indicate that for the continuous sounds, pitch, loudness and timbre each had a direct impact on the perceived quality. Besides the direct impacts, pitch also had an impact on loudness perception. For the intermittent sounds, tempo and timbre each had a direct impact on the perceived quality. These results can help to identify the psychoacoustic attributes affecting the consumers’ quality perception and help to design preferable sounds for vehicles. In the end, a design guideline is proposed for the development of auditory signals that adopts the current study’s research findings as well as those of other relevant research.

Practitioner Summary: This study applied Structural Equation Modelling to analyse the relationship among pitch, loudness, tempo and timbre and their relationship with perceived sound quality. The result can help to identify psychoacoustic attributes affecting the consumers’ quality perception and help to design preferable sounds for vehicles.

13.
Musical expressivity can be defined as the deviation from a musical standard when a score is performed by a musician. This deviation is made in terms of intrinsic note attributes like pitch, timbre, timing and dynamics. The advances in computational power and digital sound synthesis have allowed real-time control of synthesized sounds. Expressive control then becomes an area of great interest in the sound and music computing field. Musical expressivity can be approached from different perspectives. One approach is the musicological analysis of music and the study of the different stylistic schools. This approach provides a valuable understanding of musical expressivity. Another perspective is the computational modelling of music performance by means of automatic analysis of recordings. It is known that music performance is a complex activity that involves complementary aspects from other disciplines such as psychology and acoustics. It requires creativity and, possibly, some manual skills, and is a hard task even for humans. Therefore, using machines appears as a very interesting and fascinating issue. In this paper, we present an overall view of the works many researchers have done so far in the field of expressive music performance, with special attention to the computational approach.

14.
Analogy-based generation is a key approach for computers to produce natural and creative music: it transfers high-level musical features from one piece to another. To control the musical attributes while performing efficient analogy, a novel encoder-decoder model with explicit feature disentanglement is proposed: the encoder disentangles the pitch and rhythm representations of a chord-conditioned music fragment, and the decoder restores the original music. During analogy-based generation, the model lets one piece borrow the form of another, composing with different pitch contours and rhythm patterns. Moreover, thanks to the visualizable feature encoding, the model offers intuitive control over the individual attributes.

15.
The pitch is a crucial parameter in speech and music signals. However, under severe noise, missing harmonics, or irregular physical vibration, determining the pitch accurately is a great challenge. In this paper, we propose a method for pitch estimation of speech and music sounds. Our method is based on the fast Fourier transform (FFT) of the multi-scale product (MP) provided by an auditory model of the sound signals. The auditory model simulates the spectral behaviour of the cochlea by a gammachirp filter-bank, and the outer/middle-ear filtering by a low-pass filter. For each output channel, the FFT of the MP is computed frame by frame. The MP is formed by multiplying the wavelet transform coefficients of the signal at three scales. The experimental results show that our method estimates the pitch with high accuracy. Besides, our proposed method outperforms several other pitch detection algorithms in clean and noisy environments.
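The multi-scale-product idea can be sketched numerically: multiply detail coefficients at three dyadic scales pointwise, then take the FFT peak as the pitch. In this sketch a first difference at spacing `s` stands in for the wavelet transform at scale `s`, and the gammachirp filter-bank front end is omitted entirely; it only illustrates why the product keeps a dominant spectral line at the fundamental:

```python
import numpy as np

def pitch_from_mp(signal, fs, scales=(1, 2, 4)):
    """Estimate pitch as the dominant FFT bin of the multi-scale
    product (MP). A circular first difference at spacing s stands in
    for the wavelet transform at scale s (an illustrative shortcut)."""
    mp = np.ones(len(signal))
    for s in scales:
        # crude 'detail coefficients' at scale s (circular shift)
        mp *= np.roll(signal, -s) - signal
    spec = np.abs(np.fft.rfft(mp))
    spec[0] = 0.0                      # ignore the DC term
    return int(np.argmax(spec)) * fs / len(signal)
```

For a clean sinusoid, each difference is again a sinusoid at the same frequency, so their product contains lines only at the fundamental and at three times it, with the fundamental dominant.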

16.
A heuristic hierarchical method for query-by-humming music retrieval   (Cited by 11)
Query by humming is a user-friendly form of content-based music retrieval and has attracted wide research interest. Based on a statistical analysis of the music database, several heuristic rules are summarized to help with pitch detection and note segmentation of the hummed input, which is then represented as a pitch contour and a rhythm. The music in the database is partitioned into regions by rhythm type, and a melody contour and rhythm information are extracted from each piece. Melody contours are memorized by a recurrent neural network, whose weight matrix serves as the database index; matching a hummed query against the database then amounts to computing the network's output. Experimental results show the effectiveness of the proposed method.
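A pitch contour of the kind matched here can be sketched as a coarse up/down/same reduction of the note sequence (a Parsons-code-style illustration; the paper's contour and rhythm representations are richer than this):

```python
def contour_code(pitches):
    """Reduce a note pitch sequence to an up/down/same contour string:
    one symbol per interval between consecutive notes."""
    code = []
    for prev, cur in zip(pitches, pitches[1:]):
        code.append("U" if cur > prev else "D" if cur < prev else "S")
    return "".join(code)
```

Such a reduction is deliberately forgiving: a hummed query sung in the wrong key, or slightly out of tune, still produces the same symbol string as the stored melody.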

17.
Feature analysis for automatic speech/music classification   (Cited by 16)
The discriminative features of speech and music are analyzed comprehensively, including perceptual features such as pitch, brightness, and harmonicity, together with MFCC (Mel-Frequency Cepstral Coefficients). A left-right DHMM (Discrete Hidden Markov Model) classifier with a maximum-likelihood decision rule is proposed for classifying speech, music, and their mixtures, and the classification performance of the above feature set in this classifier is examined. Experiments show that the proposed audio features are effective and reasonable, and the classifier performs well.

18.
Automatic mood detection and tracking of music audio signals   (Cited by 2)
Music mood describes the inherent emotional expression of a music clip. It is helpful in music understanding, music retrieval, and some other music-related applications. In this paper, a hierarchical framework is presented to automate the task of mood detection from acoustic music data, by following some music psychological theories in western cultures. The hierarchical framework has the advantage of emphasizing the most suitable features in different detection tasks. Three feature sets, covering intensity, timbre, and rhythm, are extracted to represent the characteristics of a music clip. The intensity feature set is represented by the energy in each subband; the timbre feature set is composed of the spectral shape features and spectral contrast features; and the rhythm feature set indicates three aspects that are closely related to an individual's mood response: rhythm strength, rhythm regularity, and tempo. Furthermore, since mood usually changes over an entire piece of classical music, the approach to mood detection is extended to mood tracking for a music piece, by dividing the music into several independent segments, each of which contains a homogeneous emotional expression. Preliminary evaluations indicate that the proposed algorithms produce satisfactory results. On our testing database of 800 representative music clips, the average accuracy of mood detection reaches 86.3%. On nine test music pieces, 84.1% of the mood boundaries are recalled on average.

19.
Chroma-based audio features are a well-established tool for analyzing and comparing harmony-based Western music that is based on the equal-tempered scale. By identifying spectral components that differ by a musical octave, chroma features possess a considerable amount of robustness to changes in timbre and instrumentation. In this paper, we describe a novel procedure that further enhances chroma features by significantly boosting the degree of timbre invariance without degrading the features' discriminative power. Our idea is based on the generally accepted observation that the lower mel-frequency cepstral coefficients (MFCCs) are closely related to timbre. Now, instead of keeping the lower coefficients, we discard them and only keep the upper coefficients. Furthermore, using a pitch scale instead of a mel scale allows us to project the remaining coefficients onto the 12 chroma bins. We present a series of experiments to demonstrate that the resulting chroma features outperform various state-of-the-art features in the context of music matching and retrieval applications. As a final contribution, we give a detailed analysis of our enhancement procedure, revealing the musical meaning of certain pitch-frequency cepstral coefficients.
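The discard-the-lower-coefficients idea can be sketched end to end: take log pitch-band energies (one per MIDI pitch), move to a pitch-frequency cepstrum with a DCT, zero the lower coefficients, invert, and fold the result onto 12 chroma bins. The number of discarded coefficients here is an illustrative choice, not the paper's setting, and a plain DCT-II matrix is used for transparency:

```python
import numpy as np

def timbre_reduced_chroma(pitch_energy, n_discard=15):
    """Chroma with boosted timbre invariance: the lower cepstral
    coefficients (which, like low MFCCs, relate to timbre) are
    discarded before folding the pitch axis onto 12 chroma bins.
    n_discard=15 is illustrative."""
    log_e = np.log(np.asarray(pitch_energy, dtype=float) + 1e-9)
    n = len(log_e)
    p = np.arange(n)[:, None]
    k = np.arange(n)[None, :]
    basis = np.cos(np.pi / n * (p + 0.5) * k)   # DCT-II basis, (n, n)
    cep = log_e @ basis                          # pitch-frequency cepstrum
    cep[:n_discard] = 0.0                        # drop timbre-related coeffs
    detail = (2.0 / n) * (basis @ cep)           # inverse (k = 0 was dropped)
    chroma = np.zeros(12)
    for pitch in range(n):                       # fold octaves together
        chroma[pitch % 12] += detail[pitch]
    return chroma
```

Discarding the lower coefficients removes the smooth spectral envelope (the timbre) while the fine pitch structure, which carries the harmony, survives the inversion and the octave folding.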
