Similar literature
Found 20 similar documents (search time: 31 ms)
1.
Today, digital audio applications are part of our everyday lives. Audio classification can provide powerful tools for content management. If an audio clip can be classified automatically, it can be stored in an organised database, which can improve the management of audio dramatically. In this paper, we propose effective algorithms to automatically classify audio clips into one of six classes: music, news, sports, advertisement, cartoon and movie. For these categories, a number of acoustic features, including linear predictive coefficients, linear predictive cepstral coefficients and mel-frequency cepstral coefficients, are extracted to characterize the audio content. The autoassociative neural network (AANN) model is used to capture the distribution of the acoustic feature vectors of each class, and the backpropagation learning algorithm is used to adjust the weights of the network to minimize the mean square error for each feature vector. The performance of the AANN is also compared with that of a Gaussian mixture model (GMM), in which the feature vectors from each class are used to train the GMM for that class. During testing, the likelihood of a test sample belonging to each model is computed and the sample is assigned to the class whose model produces the highest likelihood.
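The decision rule in the last sentence (assign a test sample to the class whose model gives the highest likelihood) can be sketched as follows. This is a minimal illustration, not the authors' implementation: each class is modelled by a single diagonal-covariance Gaussian instead of a full GMM, and the two-dimensional toy vectors stand in for real LPC/LPCC/MFCC features. All function names are hypothetical.

```python
import math

def fit_diag_gaussian(vectors):
    """Estimate per-dimension mean and variance (assumption: one Gaussian per class)."""
    n = len(vectors)
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # A small floor keeps the variance strictly positive.
    var = [max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, 1e-6)
           for d in range(dim)]
    return mean, var

def log_likelihood(x, model):
    """Log-density of x under a diagonal-covariance Gaussian."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * s) + (xi - m) ** 2 / s)
               for xi, m, s in zip(x, mean, var))

def classify(x, models):
    """Return the label whose model assigns x the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(x, models[label]))

# Toy training data; real inputs would be acoustic feature vectors.
training = {
    "music": [[0.9, 0.1], [1.1, 0.2], [1.0, 0.0]],
    "news": [[-1.0, 2.0], [-0.8, 2.2], [-1.2, 1.9]],
}
models = {label: fit_diag_gaussian(vs) for label, vs in training.items()}
predicted = classify([1.0, 0.1], models)
```

Working in the log domain avoids numerical underflow when many feature dimensions are multiplied together.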

2.
Multigroup classification of audio signals using time-frequency parameters   Total citations: 1, self-citations: 0, citations by others: 1
The ongoing advancements in multimedia technologies drive the need for efficient classification of audio signals, to make content-based retrieval from huge databases more accurate and much easier. The challenge of this task lies in accurately extracting signal characteristics so as to derive strongly discriminatory features suitable for classification. In this paper, a time-frequency (TF) approach for audio classification is proposed. Audio signals are nonstationary in nature, and the TF approach is the best way to analyze them. The audio signals were decomposed using an adaptive TF decomposition algorithm, and the signal decomposition parameter based on octave (scaling) was used to generate a set of 42 features over three frequency bands within the auditory range. These features were analyzed using linear discriminant functions and classified into six music groups (rock, classical, country, jazz, folk and pop). An overall classification accuracy as high as 97.6% was achieved by linear discriminant analysis of 170 audio signals.

3.
This paper deals with the problem of blind separation of audio signals from noisy mixtures. It proposes applying a blind separation algorithm to the discrete cosine transform (DCT) or the discrete sine transform (DST) of the mixed signals, instead of performing the separation on the mixtures in the time domain. Wavelet denoising of the noisy mixtures is recommended as a preprocessing step for noise reduction. Both the DCT and the DST have an energy compaction property, which concentrates most of the signal energy in a few coefficients in the transform domain, leaving most of the transform-domain coefficients close to zero. As a result, the separation is performed on a few coefficients in the transform domain. Another advantage of signal separation in transform domains is that the effect of noise on the signals is smaller than in the time domain, owing to the averaging effect of the transform equations, especially when the separation algorithm is preceded by a wavelet denoising step. The simulation results confirm the superiority of transform-domain separation over time-domain separation and the importance of the wavelet denoising step.
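The energy compaction property mentioned above can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a direct O(N^2) orthonormal DCT-II written with the standard library, a smooth synthetic signal instead of audio mixtures, and hypothetical helper names. It compares how much of the total energy the four largest values hold in the time domain and in the DCT domain.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a real sequence (direct O(N^2) form, for illustration)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def energy_fraction(values, keep):
    """Fraction of total energy held by the `keep` largest-magnitude values."""
    energies = sorted((v * v for v in values), reverse=True)
    return sum(energies[:keep]) / sum(energies)

# Smooth toy signal (assumption): a slow ramp plus one low-frequency cosine.
n = 64
signal = [0.05 * i + math.cos(2 * math.pi * i / n) for i in range(n)]
coeffs = dct2(signal)

time_fraction = energy_fraction(signal, 4)  # energy in the 4 largest samples
dct_fraction = energy_fraction(coeffs, 4)   # energy in the 4 largest coefficients
```

Because the transform is orthonormal, total energy is the same in both domains; only its distribution across coefficients changes, which is what makes separation on a few coefficients plausible for smooth signals.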

4.
This paper deals with the problem of blind separation of audio signals from noisy mixtures. It proposes the application of a blind separation algorithm on the Discrete Cosine Transform (DCT) or the Discrete Sine Transform (DST) of the mixed signals, instead of performing the separation on the mixtures in the time domain. Kalman Filtering of the noisy separated signals is recommended in this paper as a post-processing step for noise reduction. Both the DCT and the DST have an energy compaction property, which concentrates most of the signal energy in a few coefficients in the transform domain, leaving the rest of the transform-domain coefficients close to zero. As a result, the separation is performed on a few coefficients in the transform domain. Another advantage of signal separation in transform domains is that the effect of noise on the signals in the transform domains is smaller than that in the time domain due to the averaging effect of the transform equations. The simulation results confirm the effectiveness of transform-domain signal separation and the feasibility of the post-processing Kalman filtering step.

5.
The electromyography (EMG) signal is a bioelectrical signal generated in muscles during voluntary or involuntary muscle activity. Muscle activities such as contraction or relaxation are always controlled by the nervous system. The EMG signal is a complicated biomedical signal because of the anatomical and physiological properties of the muscles and its noisy environment. In this paper, a technique is proposed to classify the surface EMG signals required for successful arm prosthesis control. This work uses recorded EMG signals generated by the biceps and triceps muscles for four different movements. Each signal has one single pattern, and it is essential to separate and classify these patterns properly. Discriminant analysis and a support vector machine (SVM) classifier have been used to classify the four arm movement signals. Prior to classification, feature vectors are derived from the signal using the mean absolute value (MAV). These feature vectors are provided as inputs to the identification/classification system. Discriminant analysis using five different approaches achieved classification accuracy rates ranging from very good (98%) to poor (96%) under 10-fold cross validation. The SVM classifier gives a very good average accuracy rate (99%) for the four movements, with a classification error rate of 1%. The correct classification rates of the applied techniques are very high, so they can be used to classify EMG signals in arm prosthesis control studies.
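The mean absolute value (MAV) feature is the average of the rectified samples. A minimal sketch follows; the non-overlapping windowing, the window length, and the toy sample values are assumptions for illustration, since the abstract does not state how the signals were segmented.

```python
def mean_absolute_value(window):
    """MAV of one window: average of the absolute sample values."""
    return sum(abs(s) for s in window) / len(window)

def mav_features(signal, window_size):
    """MAV of each non-overlapping window; a trailing partial window is dropped."""
    return [mean_absolute_value(signal[i:i + window_size])
            for i in range(0, len(signal) - window_size + 1, window_size)]

# Toy two-channel recording (hypothetical values, not real EMG data).
biceps = [0.1, -0.2, 0.3, -0.4, 1.0, -1.0, 1.0, -1.0]
triceps = [0.0, 0.0, 0.1, -0.1, 0.2, -0.2, 0.2, -0.2]

# Concatenating the per-channel MAVs gives one feature vector for a classifier.
feature_vector = mav_features(biceps, 4) + mav_features(triceps, 4)
```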

6.
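Dong Shuqin, Xie Hong. 《微型机与应用》, 2011, 30(16): 82-84, 88
Fatigued driving is one of the major causes of traffic casualties, so taking preventive measures is necessary. For two levels of vigilance (awake and asleep), the Common Spatial Pattern (CSP) algorithm is used to extract features from the recorded EEG data, a support vector machine (SVM) with a radial basis function (RBF) kernel classifies the extracted features, and the optimal parameters are obtained by grid search. Compared with existing methods that use band energy as features, the algorithm achieves higher test accuracy and good recognition performance.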

7.
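Multi-class classification methods for support vector machines   Total citations: 30, self-citations: 0, citations by others: 30
The support vector machine is inherently a discriminant method for two-class problems and cannot be applied directly to multi-class problems. There are currently five main SVM-based approaches to multi-class classification: one-versus-rest (OVR), one-versus-one (OVO), binary tree (BT), error-correcting output codes, and directed acyclic graph. This paper briefly introduces these methods. By analyzing their principles and implementations, it summarizes their strengths and weaknesses in terms of speed and accuracy, offers comparative remarks, verifies them experimentally, and finally proposes some suggestions for improvement.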
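As an illustration of the one-versus-one (OVO) scheme, the sketch below trains one binary classifier per pair of classes, K(K-1)/2 in total, and predicts by majority vote. To stay self-contained it uses a nearest-centroid rule as a stand-in for the binary SVM; that substitution, the toy data, and all function names are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

def train_binary(points_a, points_b):
    """Stand-in binary classifier (nearest class centroid, NOT an SVM)."""
    def centroid(points):
        return [sum(c) / len(points) for c in zip(*points)]
    ca, cb = centroid(points_a), centroid(points_b)

    def predict_is_a(x):
        da = sum((xi - ci) ** 2 for xi, ci in zip(x, ca))
        db = sum((xi - ci) ** 2 for xi, ci in zip(x, cb))
        return da <= db
    return predict_is_a

def train_ovo(data):
    """Train one binary classifier for every unordered pair of class labels."""
    return {(a, b): train_binary(data[a], data[b])
            for a, b in combinations(sorted(data), 2)}

def predict_ovo(x, classifiers):
    """Each pairwise classifier casts one vote; the most-voted label wins."""
    votes = Counter(a if is_a(x) else b for (a, b), is_a in classifiers.items())
    return votes.most_common(1)[0][0]

# Toy three-class data set (hypothetical).
data = {
    "A": [[0, 0], [0, 1]],
    "B": [[5, 5], [6, 5]],
    "C": [[0, 9], [1, 10]],
}
classifiers = train_ovo(data)
prediction = predict_ovo([0.2, 0.3], classifiers)
```

OVR would instead train K classifiers, each separating one class from all the others, which trades fewer models for larger and more imbalanced training sets per model.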

8.
Because of the rapid growth of radio communication technology in recent years, the importance of automatic classification of digital signal types is increasing. This paper presents an advanced technique that identifies a variety of digital signal types. The method is a hybrid heuristic formed by a radial basis function neural network (as a classifier) and a particle swarm optimization technique. A suitable combination of higher-order statistics up to the eighth order is proposed as the prominent characteristics of the considered signals. In conjunction with the neural network, we have used a cross-validation technique to improve the generalization ability. Experimental results indicate that the proposed technique achieves a high percentage of correct classification in discriminating different types of digital signals, even at low SNRs.
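The abstract does not say which higher-order statistics were combined. As a hedged illustration of why such statistics can separate modulation types, the sketch below estimates mixed moments M_pq = E[x^(p-q) conj(x)^q] and the fourth-order cumulant C40 = M40 - 3*M20^2 for ideal, noise-free, unit-power BPSK and QPSK constellations. The constellations and function names are assumptions for this example only.

```python
import cmath

def moment(samples, p, q):
    """Sample estimate of the mixed moment M_pq = E[x^(p-q) * conj(x)^q]."""
    return sum(x ** (p - q) * x.conjugate() ** q for x in samples) / len(samples)

def c40(samples):
    """Fourth-order cumulant C40 = M40 - 3*M20^2 (zero-mean signal assumed)."""
    return moment(samples, 4, 0) - 3 * moment(samples, 2, 0) ** 2

# Ideal unit-power constellations, each symbol equally likely (assumption).
bpsk = [1 + 0j, -1 + 0j]
qpsk = [cmath.exp(1j * (cmath.pi / 4 + k * cmath.pi / 2)) for k in range(4)]

c40_bpsk = c40(bpsk)           # real part close to -2
c40_qpsk = c40(qpsk)           # real part close to -1
m80_qpsk = moment(qpsk, 8, 0)  # an eighth-order moment, close to 1
```

With noisy received samples these values are only approached on average, so a classifier would be fed several such statistics rather than thresholding a single one.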

9.
Content-based classification of audio signals into broad categories such as speech, music, or speech with noise is the first step before any further processing such as speech recognition, content-based indexing, or surveillance. In this paper, we propose an efficient content-based approach that classifies audio signals into broad genres using a fuzzy c-means (FCM) algorithm. We analyze different characteristic features of audio signals in the time, frequency, and coefficient domains and select the optimal feature vector by applying a novel analytical scoring method to each feature. We then apply an FCM-based classification scheme to the extracted, normalized optimal feature vector. Experimental results demonstrate that the proposed approach outperforms existing state-of-the-art audio classification systems by more than 11% in classification performance.
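For readers unfamiliar with fuzzy c-means, the sketch below shows the two alternating updates of the standard algorithm (memberships from distances, then centers from membership-weighted means) on one-dimensional toy data. The fuzzifier m = 2, the fixed iteration count, the initial centers, and the data are assumptions for illustration and are not taken from the paper.

```python
def memberships_for(points, centers, m):
    """Membership of each point in each cluster: u_ij = 1 / sum_k (d_ij/d_ik)^(2/(m-1))."""
    rows = []
    for x in points:
        # Clamp distances so a point sitting exactly on a center does not divide by zero.
        dists = [max(abs(x - c), 1e-12) for c in centers]
        rows.append([1.0 / sum((di / dk) ** (2.0 / (m - 1.0)) for dk in dists)
                     for di in dists])
    return rows

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Alternate membership and center updates for a fixed number of iterations."""
    for _ in range(iterations):
        u = memberships_for(points, centers, m)
        centers = [
            sum(row[j] ** m * x for row, x in zip(u, points))
            / sum(row[j] ** m for row in u)
            for j in range(len(centers))
        ]
    return centers, memberships_for(points, centers, m)

# Toy 1-D feature values forming two well-separated groups (hypothetical).
points = [0.0, 0.2, 0.1, 5.0, 5.1, 4.9]
centers, u = fuzzy_c_means(points, centers=[1.0, 4.0])

# Harden the fuzzy memberships into class labels by taking the largest one.
labels = [row.index(max(row)) for row in u]
```

Unlike k-means, every point keeps a graded membership in every cluster, which can be useful for ambiguous clips such as speech over music.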

10.
Bipolar Mood Disorder (BMD) and Attention Deficit Hyperactivity Disorder (ADHD) share many clinical signs and symptoms in children; therefore, accurately distinguishing these two mental disorders is a challenging issue in the psychiatric community. In this study, 43 subjects participated, including 21 patients with ADHD and 22 subjects with BMD. Their electroencephalogram (EEG) signals were recorded by 22 electrodes in two resting conditions, eyes open and eyes closed. After a preprocessing step, several features such as band power, fractal dimension, AR model coefficients and wavelet coefficients were extracted from the recorded signals. This paper aims to achieve a high classification rate between ADHD and BMD patients by applying a suitable classifier to their EEG features. To this end, we consider a piecewise linear classifier designed on the basis of XCSF. Experimental results of XCSF-LDA showed a significant improvement (86.44% accuracy) compared to standard XCSF (78.55%). For a fair comparison, other state-of-the-art classifiers such as LDA, Direct LDA, boosted JD-LDA (BJDLDA), and XCSF were assessed with the same feature set, and the proposed method provided better results than these rival classifiers. To show the robustness of our method, additive white noise with different amplitudes was added to the raw signals; the results achieved by the proposed classifier empirically confirmed a higher robustness against noise compared to the other classifiers. Consequently, the proposed classifier can be considered an effective method for classifying EEG features of BMD and ADHD patients.

11.
Epileptic seizures are manifestations of epilepsy. Careful analyses of the electroencephalograph (EEG) records can provide valuable insight and improved understanding of the mechanisms causing epileptic disorders. The detection of epileptiform discharges in the EEG is an important component in the diagnosis of epilepsy. As EEG signals are non-stationary, the conventional method of frequency analysis is not highly successful in diagnostic classification. This paper deals with a novel method of analyzing EEG signals using the wavelet transform and classifying them using an artificial neural network (ANN) and logistic regression (LR). The wavelet transform is particularly effective for representing various aspects of non-stationary signals, such as trends, discontinuities and repeated patterns, where other signal processing approaches fail or are not as effective. Through wavelet decomposition of the EEG records, transient features are accurately captured and localized in both time and frequency. In epileptic seizure classification, we used the lifting-based discrete wavelet transform (LBDWT) as a preprocessing method to increase the computational speed. The proposed algorithm reduces the computational load of algorithms based on the classical wavelet transform (CWT). In this study, we introduce two fundamentally different approaches for designing classification models (classifiers): the traditional statistical method based on logistic regression and the emerging, computationally powerful techniques based on ANN. Logistic regression and multilayer perceptron neural network (MLPNN) based classifiers were developed and compared in terms of their accuracy in classifying EEG signals. In these methods, we used the LBDWT coefficients of EEG signals as the input to a classification system with two discrete outputs: epileptic seizure or non-epileptic seizure. By identifying features in the signal, we want to provide an automatic system that will support a physician in the diagnostic process. By applying LBDWT in connection with MLPNN, we obtained a novel and reliable classifier architecture. The comparisons between the developed classifiers were primarily based on analysis of the receiver operating characteristic (ROC) curves as well as a number of scalar performance measures pertaining to the classification. The MLPNN-based classifier outperformed the LR-based counterpart. Within the same group, the MLPNN-based classifier was more accurate than the LR-based classifier.

12.
Classification of noisy signals using fuzzy ARTMAP neural networks   Total citations: 5, self-citations: 0, citations by others: 5
This paper describes an approach to the classification of noisy signals using a technique based on the fuzzy ARTMAP neural network (FAMNN). The proposed method is a modification of the testing phase of the fuzzy ARTMAP that generalizes better than the standard fuzzy ARTMAP in the presence of noise. An application to textured gray-scale image segmentation is presented. The superiority of the proposed modification over the standard fuzzy ARTMAP is established by a number of experiments using various texture sets, feature vectors and noise types. The texture sets include various aerial photos and also samples obtained from the Brodatz album. Furthermore, the classification performance of the standard and the modified fuzzy ARTMAP is compared for different network sizes. Classification results that illustrate the performance of the modified algorithm and the FAMNN are presented.

13.
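In underwater acoustic signal classification, samples are often insufficient because of confidentiality or constraints on data collection, so deep learning frameworks achieve low classification accuracy. To address the low accuracy of few-sample underwater acoustic signal classification, a method combining spectral transformation with a deep learning framework is proposed. Tests of various spectral transformations show that the LOFAR spectral transformation significantly enhances the feature representation of the acoustic signals. A GAN is used to augment the spectrally transformed samples, and an improved CNN classifies the spectrograms. Experimental results show that this framework can generate high-quality samples and significantly improves the classification accuracy of underwater acoustic signals.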

14.
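To address the problems of traditional optimization methods in improving the classification ability of radial basis function neural networks (RBFNN), a CO-RBFNN learning algorithm based on parallel search with cooperative coevolutionary populations is proposed. The algorithm first uses K-means to cluster the initial hidden nodes determined by the nearest-neighbor method, then uses the clustered groups of hidden nodes as subpopulations for coevolution, and finally obtains the optimal network structure. It adopts a matrix-form hybrid encoding that contains the hidden-node structure of the whole network and a control vector; the connection weights between the hidden layer and the output layer are determined by the pseudo-inverse method. Simulation experiments on 8 UCI data sets verify the effectiveness and feasibility of the algorithm.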

15.
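Classification based on support vector machine ensembles   Total citations: 6, self-citations: 0, citations by others: 6
Wei Ling, Zhang Wenxiu. 《计算机工程》, 2004, 30(13): 1-2, 17
The support vector machine is a classification technique based on the structural risk minimization principle. This paper proposes classification by an ensemble of support vector machine classifiers. First, sub-SVMs are formed on the basis of the original samples to obtain sub-predictions for the samples to be tested; the sub-predictions are then suitably combined to determine the final class prediction for each sample. Simulation results show that the method achieves clearly higher classification accuracy than a single support vector machine.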

16.
Automatic mood detection and tracking of music audio signals   Total citations: 2, self-citations: 0, citations by others: 2
Music mood describes the inherent emotional expression of a music clip. It is helpful in music understanding, music retrieval, and some other music-related applications. In this paper, a hierarchical framework is presented to automate the task of mood detection from acoustic music data, following some music psychological theories in western cultures. The hierarchical framework has the advantage of emphasizing the most suitable features in different detection tasks. Three feature sets, covering intensity, timbre, and rhythm, are extracted to represent the characteristics of a music clip. The intensity feature set is represented by the energy in each subband, the timbre feature set is composed of the spectral shape features and spectral contrast features, and the rhythm feature set indicates three aspects that are closely related to an individual's mood response: rhythm strength, rhythm regularity, and tempo. Furthermore, since mood usually changes over an entire piece of classical music, the approach to mood detection is extended to mood tracking for a music piece by dividing the music into several independent segments, each of which contains a homogeneous emotional expression. Preliminary evaluations indicate that the proposed algorithms produce satisfactory results. On our testing database of 800 representative music clips, the average accuracy of mood detection reaches 86.3%. On average, we can also recall 84.1% of the mood boundaries from nine testing music pieces.

17.
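Image classification based on the SVM algorithm   Total citations: 1, self-citations: 0, citations by others: 1
This paper introduces the principle of the SVM algorithm and some of its applications in image classification, applies the algorithm to the classification of aircraft images, and compares it with traditional neural network classification algorithms. Compared with traditional neural-network-based image classification, it shows good noise robustness, a higher recognition rate and good scalability, and it is well suited to the aircraft image classification problem.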

18.
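Face classification based on support vector machines   Total citations: 11, self-citations: 2, citations by others: 11
Zhang Mingui, Pan Quan, Zhang Hongcai, Jiang Rui. 《计算机工程》, 2004, 30(11): 110-112
A face classification method based on support vector machines is proposed. A two-dimensional discrete cosine transform is first applied to the face image and the DCT coefficients are taken as features, which are then classified with a support vector machine. Gender classification on the Essex face image database achieved very good results.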

19.
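Image classification based on SVM   Total citations: 2, self-citations: 0, citations by others: 2
Most existing image retrieval systems work on low-level features, whereas humans tend to judge similarity at the semantic level. How to bridge the "gap" between low-level features and high-level semantics has become a research focus in content-based retrieval. This paper proposes using an SVM to extract high-level features of images and then classifying the images at the semantic level. Experimental results show that the method bridges the "semantic gap" to some extent.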

20.
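In content-based image retrieval, there is a large semantic gap between the low-level visual features of an image and high-level semantic concepts. Using machine learning to learn image features and automatically build models of image classes has become an effective approach. This paper proposes a method for automatic semantic categorization of natural images using a support vector machine (SVM): feature vectors obtained by block-partition clustering serve as SVM training samples to build a semantic classifier. Because the features of all blocks of the images in a class take part in the clustering, the extracted features better reflect the characteristics of that class. Experiments show that the method is effective.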
