
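Environment sound recognition based on lightweight deep neural network
Cite this article: YANG Lei, ZHAO Hongdong. Environment sound recognition based on lightweight deep neural network[J]. Journal of Computer Applications, 2020, 40(11): 3172-3177.
Authors: YANG Lei  ZHAO Hongdong
Affiliation: School of Electronic and Information Engineering, Hebei University of Technology, Tianjin 300300
Funding: Supported by the Fund of the Key Laboratory of Optoelectronic Information Control and Security Technology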
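Abstract: To address the large number of redundant parameters in traditional Convolutional Neural Network (CNN) models, two lightweight network models, Fnet1 and Fnet2, are proposed, both built on the Fire module, the core structure of SqueezeNet. Then, considering the distributed data collection and processing typical of mobile devices, Fnet2 is fused with a Deep Neural Network (DNN) according to Dempster-Shafer (D-S) evidence theory, yielding a new network model, FnetDNN. First, a neural network with four convolutional layers, Cnet, is built as the baseline, and Mel Frequency Cepstral Coefficients (MFCC) are used as the input feature to compare Fnet1, Fnet2 and Cnet in terms of network structure, computational cost, number of convolution kernel parameters and recognition accuracy; Fnet1 reaches a classification accuracy of 86.7% using only 10.3% of Cnet's parameters. Next, MFCC and a global feature vector are fed into FnetDNN, raising the recognition accuracy to 94.4%. The experimental results show that the Fnet models not only compress redundant parameters but can also be fused with other networks, so the model can be extended.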
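As a rough illustration of why a Fire module needs fewer parameters than an ordinary convolution, the sketch below counts weights and biases for one Fire module (a 1x1 squeeze layer feeding parallel 1x1 and 3x3 expand layers) and for a plain 3x3 convolution with the same input and output channels. The channel sizes are those of the fire2 module in the original SqueezeNet design and are used here only as an assumption; the abstract does not give the layer sizes of Fnet1, Fnet2 or Cnet, and the function names are hypothetical.

```python
def conv_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights plus biases of a square 2-D convolution layer."""
    return kernel * kernel * c_in * c_out + c_out


def fire_params(c_in: int, squeeze: int, expand1x1: int, expand3x3: int) -> int:
    """Parameters of a Fire module: 1x1 squeeze, then 1x1 and 3x3 expand branches."""
    squeeze_layer = conv_params(c_in, squeeze, 1)
    expand_1x1 = conv_params(squeeze, expand1x1, 1)
    expand_3x3 = conv_params(squeeze, expand3x3, 3)
    return squeeze_layer + expand_1x1 + expand_3x3


# Assumed sizes (SqueezeNet fire2), not taken from the paper.
C_IN, SQUEEZE, E1, E3 = 96, 16, 64, 64

fire_total = fire_params(C_IN, SQUEEZE, E1, E3)
# A plain 3x3 convolution producing the same number of output channels.
plain_total = conv_params(C_IN, E1 + E3, 3)
ratio = fire_total / plain_total

print(fire_total, plain_total, round(ratio, 3))
```

With these assumed sizes the Fire module has 11,920 parameters against 110,720 for the plain convolution, about 10.8%. This figure only illustrates the mechanism and is unrelated to the 10.3% reported for Fnet1 relative to Cnet.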

Keywords: environment sound recognition  deep neural network  D-S evidence theory  Mel Frequency Cepstral Coefficient
Received: 2020-04-08
Revised: 2020-07-09

Environment sound recognition based on lightweight deep neural network
YANG Lei, ZHAO Hongdong. Environment sound recognition based on lightweight deep neural network[J]. Journal of Computer Applications, 2020, 40(11): 3172-3177.
Authors:YANG Lei  ZHAO Hongdong
Affiliation:School of Electronic and Information Engineering, Hebei University of Technology, Tianjin 300300, China
Abstract: Existing Convolutional Neural Network (CNN) models contain a large number of redundant parameters. To address this problem, two lightweight network models, Fnet1 and Fnet2, were proposed, both based on the Fire module, the core structure of SqueezeNet. Then, in view of the distributed data collection and processing characteristic of mobile terminals, a new network model named FnetDNN was proposed, in which Fnet2 is fused with a Deep Neural Network (DNN) according to Dempster-Shafer (D-S) evidence theory. Firstly, a neural network with four convolutional layers, named Cnet, was built as the benchmark, with Mel Frequency Cepstral Coefficients (MFCC) as the input feature. Fnet1, Fnet2 and Cnet were then compared in terms of network structure, computational cost, number of convolution kernel parameters and recognition accuracy. The results showed that Fnet1 used only 10.3% of the parameters of Cnet while achieving a recognition accuracy of 86.7%. Secondly, MFCC and the global feature vector were input into the FnetDNN model, which raised the recognition accuracy to 94.4%. Experimental results indicate that the proposed Fnet models can compress redundant parameters and can also be integrated with other networks, so the model can be extended.
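The abstract does not spell out how the outputs of Fnet2 and the DNN are fused, so the following is only a minimal sketch of the standard Dempster rule of combination from D-S evidence theory, applied to two classifiers whose class probabilities are treated as mass functions. The class names, the mass values and the function name `dempster_combine` are hypothetical and do not come from the paper.

```python
from itertools import product


def dempster_combine(m1: dict, m2: dict) -> dict:
    """Combine two mass functions whose keys are frozensets of hypotheses."""
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence supports the intersection of the two sets.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += mass_a * mass_b
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalise by 1 - K so the combined masses sum to one.
    return {hypothesis: mass / norm for hypothesis, mass in combined.items()}


# Hypothetical per-class outputs of a CNN branch and a DNN branch.
m_cnn = {frozenset({"dog"}): 0.6, frozenset({"rain"}): 0.3, frozenset({"siren"}): 0.1}
m_dnn = {frozenset({"dog"}): 0.5, frozenset({"rain"}): 0.2, frozenset({"siren"}): 0.3}

fused = dempster_combine(m_cnn, m_dnn)
best = max(fused, key=fused.get)
print(sorted(best), round(fused[best], 3))
```

When every focal element is a single class, as in this example, the rule reduces to multiplying the two probabilities for each class and renormalising, so classes on which both branches agree are reinforced.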
Keywords:environment sound recognition  Deep Neural Network (DNN)  Dempster-Shafer (D-S) evidence theory  Mel Frequency Cepstral Coefficient (MFCC)  
This article is indexed in Wanfang Data and other databases.