Similar Documents
20 similar documents found (search time: 31 ms)
1.
Biometric speech recognition systems are often subject to various spoofing attacks, the most common of which are speech synthesis and speech conversion attacks. These attacks can cause the biometric speech recognition system to incorrectly accept spoofed speech, compromising the security of the system. Researchers have made many efforts to address this problem, and existing studies have used the physical features of speech to identify spoofing attacks. However, recent studies have shown that speech contains a large number of physiological features related to the human face. For example, we can determine the speaker's gender, age, mouth shape, and other information by voice. Inspired by the above research, we propose a spoofing attack recognition method based on physiological-physical feature fusion. This method involves feature extraction, a densely connected convolutional neural network with squeeze-and-excitation blocks (SE-DenseNet), and feature fusion strategies. We first extract physiological features in audio from a pre-trained convolutional network. Then we use SE-DenseNet to extract physical features. Such a dense connection pattern has high parameter efficiency, and squeeze-and-excitation blocks enhance feature propagation. Finally, we integrate the two features into the classification network to identify spoofing attacks. Experimental results on the ASVspoof 2019 dataset show that our model is effective for voice spoofing detection. In the logical access scenario, our model improves the tandem decision cost function and equal error rate scores by 5% and 7%, respectively, compared to existing methods.
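A minimal PyTorch-style sketch of the kind of two-branch feature fusion this abstract describes: two embeddings (physiological and physical) are concatenated, recalibrated with a simple squeeze-and-excitation-style gate, and classified. The layer sizes, names, and the SE gate applied to a flat vector are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation style gate: recalibrate feature channels."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):              # x: (batch, channels)
        return x * self.fc(x)          # channel-wise gating

class FusionClassifier(nn.Module):
    """Concatenate physiological and physical embeddings, then classify."""
    def __init__(self, physio_dim=512, physical_dim=256, n_classes=2):
        super().__init__()
        self.se = SEBlock(physio_dim + physical_dim)
        self.head = nn.Sequential(
            nn.Linear(physio_dim + physical_dim, 128), nn.ReLU(),
            nn.Linear(128, n_classes))

    def forward(self, physio_feat, physical_feat):
        fused = torch.cat([physio_feat, physical_feat], dim=1)
        return self.head(self.se(fused))     # logits: bona fide vs. spoof

# usage with dummy embeddings (dimensions are assumed)
model = FusionClassifier()
logits = model(torch.randn(4, 512), torch.randn(4, 256))
```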

2.
Human activity recognition with deep learning models can automatically extract deep features from data, whereas traditional machine learning algorithms depend on handcrafted feature extraction and generalize poorly. A deep learning model based on spatio-temporal feature fusion (CLT-net) is proposed for human activity recognition. A convolutional neural network (CNN) automatically extracts deep latent features from human activity data, and a long short-term memory (LSTM) network builds a temporal model that learns the long-term dependencies of activity features over time. On this basis, a softmax classifier distinguishes the different activities. Experimental results on the DaLiAc dataset show that, compared with CNN, LSTM, and BP models, CLT-net achieves an overall recognition rate of 97.6% on 13 human activities, demonstrating superior classification performance for human activity recognition.
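A minimal sketch of a CNN-plus-LSTM pipeline of the kind CLT-net describes: a 1D CNN extracts features from each sensor window and an LSTM models their temporal order before a softmax classifier. The channel counts, window length, and sensor count are illustrative assumptions drawn from the abstract, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class CNNLSTMHAR(nn.Module):
    """1D CNN extracts per-window features; LSTM models their temporal order."""
    def __init__(self, n_sensors=6, n_classes=13, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(n_sensors, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 128, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2))
        self.lstm = nn.LSTM(input_size=128, hidden_size=hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)   # softmax is applied in the loss

    def forward(self, x):                 # x: (batch, n_sensors, time)
        feats = self.cnn(x)               # (batch, 128, time/4)
        feats = feats.transpose(1, 2)     # (batch, time/4, 128) for the LSTM
        out, _ = self.lstm(feats)
        return self.fc(out[:, -1])        # classify from the last time step

logits = CNNLSTMHAR()(torch.randn(8, 6, 128))   # 8 windows of 128 samples
```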

3.
王倩  赵希梅 《计算机工程》2021,47(8):308-314
To address the problems that convolutional neural networks learn feature information incompletely and achieve limited recognition accuracy and classification precision, an scSE_MVGG network employing spatial-channel squeeze-and-excitation (scSE) modules is proposed and applied to liver cirrhosis recognition. Data augmentation is applied to the cirrhosis images to avoid overfitting during deep learning training, the VGG network is modified to accommodate experimental samples of different sizes, and the scSE module is fused with the improved MVGG network so that more targeted feature extraction enhances cirrhosis recognition. Experimental results show that the network reaches a recognition rate of 98.78% on cirrhosis images, outperforming networks such as scSE_VGG and scSE_AlexNet.

4.
Automatic Target Recognition (ATR) based on Synthetic Aperture Radar (SAR) images plays a key role in military applications. However, the traditional recognition approach has difficulties; principally, it is a challenge to design robust features and classifiers for different SAR images. Although Convolutional Neural Networks (CNNs) are very successful in many image classification tasks, building a deep network with limited labeled data remains a problem, and CNN topologies such as the fully connected structure lead to redundant parameters and neglect channel-wise information flow. A novel CNN approach, called Group Squeeze Excitation Sparsely Connected Convolutional Networks (GSESCNNs), is therefore proposed as a solution. The group squeeze excitation performs dynamic channel-wise feature recalibration with fewer parameters than squeeze excitation. Sparsely connected convolutional networks are a more efficient way to concatenate feature maps from different layers. Experimental results on Moving and Stationary Target Acquisition and Recognition (MSTAR) SAR images demonstrate that this approach achieves the best prediction accuracy, 99.79%, outperforming the most common skip connection models, such as Residual Networks and Densely Connected Convolutional Networks, as well as other methods reported on the MSTAR dataset.

5.
Human activity recognition is an effective approach for identifying the characteristics of historical data. In the past decades, different shallow classifiers and handcrafted features were used to identify activities from sensor data. These approaches are configured for offline processing and are not suitable for sequential data. This article proposes an adaptive framework for human activity recognition using a deep learning mechanism. The deep learning approach forms a deep belief network (DBN), which contains a visible layer and hidden layers. These layers process the raw sensor data, and the activity is identified at the topmost layers. The DBN is tested in a real-time environment using mobile devices that contain an accelerometer, a magnetometer, and a gyroscope. The results are analyzed with the metrics of precision, recall, and the F1-score. The results show that the proposed method achieves a higher F1-score than the existing approaches.

6.
With the rapid growth of the Internet of Things (IoT), smart systems and applications are equipped with an increasing number of wearable sensors and mobile devices. These sensors are used not only to collect data but, more importantly, to assist in tracking and analyzing daily human activities. Sensor-based human activity recognition is a research hotspot that has started to employ deep learning approaches to supersede traditional shallow learning methods that rely on hand-crafted features. Although many successful methods have been proposed, three challenges remain: (1) a deep model's performance depends heavily on the data size; (2) a deep model cannot explicitly capture abundant sample distribution characteristics; (3) a deep model cannot jointly consider sample features, sample distribution characteristics, and the relationship between the two. To address these issues, we propose a meta-learning-based graph prototypical model with a priority attention mechanism for sensor-based human activity recognition. This approach learns not only sample features and sample distribution characteristics via the meta-learning-based graph prototypical model, but also embeddings derived from the priority attention mechanism, which mines and utilizes relations between sample features and sample distribution characteristics. Moreover, the knowledge learned through our approach can be seen as a prior applicable to improving performance on other general reasoning tasks. Experimental results on fourteen datasets demonstrate that the proposed approach significantly outperforms other state-of-the-art methods. In addition, experiments applying our model to two other tasks show that it effectively supports other recognition tasks related to human activity and improves performance on the datasets of these tasks.
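A minimal sketch of the prototypical-network idea at the core of such a model: class prototypes are the mean embeddings of the support samples, and a query is classified by its distance to each prototype. The embedding size, class count, and the omission of the graph and priority-attention components are simplifications for illustration, not the authors' full method.

```python
import torch
import torch.nn.functional as F

def prototypical_logits(support, support_labels, query, n_classes):
    """support: (N, D) embeddings; query: (M, D); returns (M, n_classes) scores."""
    prototypes = torch.stack(
        [support[support_labels == c].mean(dim=0) for c in range(n_classes)])
    # negative squared Euclidean distance to each prototype acts as the class score
    return -torch.cdist(query, prototypes) ** 2

support = torch.randn(30, 64)              # 30 embedded support samples
labels = torch.arange(5).repeat(6)         # 6 samples for each of 5 activities
query = torch.randn(8, 64)
probs = F.softmax(prototypical_logits(support, labels, query, 5), dim=1)
```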

7.
Automated, real-time, and reliable equipment activity recognition on construction sites can help to minimize idle time, improve operational efficiency, and reduce emissions. Previous efforts in activity recognition of construction equipment have explored different classification algorithms using data from accelerometers and gyroscopes. These studies utilized pattern recognition approaches such as statistical models (e.g., hidden Markov models), shallow neural networks (e.g., Artificial Neural Networks), and distance algorithms (e.g., K-nearest neighbor) to classify the time-series data collected from sensors mounted on the equipment. Such methods necessitate the segmentation of continuous operational data with fixed or dynamic windows to extract statistical features. This heuristic and manual feature extraction process is limited by human knowledge and can only extract human-specified shallow features. However, recent developments in deep neural networks, specifically recurrent neural networks (RNNs), present new opportunities to classify sequential time-series data with recurrent lateral connections. An RNN can automatically learn high-level representative features through the network instead of relying on manually designed features, making it more suitable for complex activity recognition. However, applying an RNN requires a large training dataset, which is practically challenging to obtain from real construction sites. Thus, this study presents a data-augmentation framework for generating synthetic time-series training data for an RNN-based deep learning network to accurately and reliably recognize equipment activities. The proposed methodology is validated by generating synthetic data from sample datasets collected from two real-world earthmoving operations. The synthetic data along with the collected data were used to train a long short-term memory (LSTM)-based RNN. The trained model was evaluated by comparing its performance with classification algorithms traditionally used for construction equipment activity recognition. The deep learning framework presented in this study outperformed the traditionally used machine learning classification algorithms in terms of model accuracy and generalization.
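A minimal sketch of common time-series augmentation transforms (jittering, scaling, and window slicing) of the kind such a data-augmentation framework might use to synthesize extra sensor windows. The specific transforms and parameter values are illustrative assumptions, not the paper's framework.

```python
import numpy as np

def jitter(x, sigma=0.05):
    """Add Gaussian noise to each sample of a (time, channels) window."""
    return x + np.random.normal(0.0, sigma, x.shape)

def scale(x, sigma=0.1):
    """Multiply each channel by a random factor close to 1."""
    return x * np.random.normal(1.0, sigma, (1, x.shape[1]))

def window_slice(x, ratio=0.9):
    """Crop a random sub-window and stretch it back to the original length."""
    n = x.shape[0]
    m = int(n * ratio)
    start = np.random.randint(0, n - m + 1)
    idx = np.linspace(start, start + m - 1, n)
    return np.stack([np.interp(idx, np.arange(n), x[:, c])
                     for c in range(x.shape[1])], axis=1)

window = np.random.randn(200, 3)            # 200 samples, 3-axis accelerometer
augmented = window_slice(scale(jitter(window)))
```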

8.
A challenge in building pervasive and smart spaces is to learn and recognize human activities of daily living (ADLs). In this paper, we address this problem and argue that in dealing with ADLs, it is beneficial to exploit both their typical duration patterns and inherent hierarchical structures. We exploit efficient duration modeling using the novel Coxian distribution to form the Coxian hidden semi-Markov model (CxHSMM) and apply it to the problem of learning and recognizing ADLs with complex temporal dependencies. The Coxian duration model has several advantages over existing duration parameterizations using multinomial or exponential family distributions, including its denseness in the space of nonnegative distributions, low number of parameters, computational efficiency, and the existence of closed-form estimation solutions. Further, we combine both hierarchical and duration extensions of the hidden Markov model (HMM) to form the novel switching hidden semi-Markov model (SHSMM), and empirically compare its performance with existing models. The model can learn what an occupant normally does during the day from unsegmented training data and then perform online activity classification, segmentation, and abnormality detection. Experimental results show that Coxian modeling outperforms a range of baseline models for the task of activity segmentation. We also achieve a recognition accuracy competitive with the current state-of-the-art multinomial duration model, while gaining a significant reduction in computation. Furthermore, cross-validation model selection on the number of phases K in the Coxian indicates that only a small K is required to achieve optimal performance. Finally, our models are further tested in a more challenging setting in which tracking is often lost and the activities considerably overlap. With a small number of labels supplied during training in a partially supervised learning mode, our models again deliver reliable performance with a small number of phases, making the proposed framework an attractive choice for activity modeling.
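For reference, a standard textbook parameterization of the Coxian phase-type duration model mentioned above (an assumed general form, not a formula quoted from the paper): a duration starts in phase 1; in phase $i$ it stays an exponentially distributed time with rate $\mu_i$, after which it is absorbed with probability $p_i$ or moves to phase $i+1$ with probability $1-p_i$, with $p_K = 1$.

```latex
% Coxian phase-type duration with K phases (textbook form, assumed):
% rates \mu_1,\dots,\mu_K > 0, absorption probabilities p_1,\dots,p_{K-1}\in[0,1], p_K = 1
\begin{aligned}
X &= \sum_{i=1}^{N} T_i, \qquad T_i \sim \mathrm{Exp}(\mu_i), \\
\Pr(N = k) &= p_k \prod_{i=1}^{k-1} (1 - p_i), \qquad k = 1,\dots,K.
\end{aligned}
```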

9.
郭玉慧  梁循 《计算机学报》2022,45(1):98-114
Recognizing different structural manifestations of the same object is a relatively difficult recognition task for a machine. Taking easily deformed banknotes as an example, this paper proposes a method for recognizing locally viewed, warped banknotes based on heterogeneous feature aggregation. The gray-gradient co-occurrence matrix, the Haishoku algorithm, and circular LBP are first used to obtain the texture style, color-spectrum style, and texture, respectively; these features describe the local banknote image from different perspectives. Then, through VGG-16, ResN...

10.
Objective: To address deep learning's heavy dependence on large samples, a two-stream deep transfer learning method with multi-source domain confusion is proposed, which improves the applicability of transferred features in traditional deep transfer learning. Method: A multi-source transfer strategy is adopted to increase the coverage of the target domain by features transferred from the source domains. A two-stage adaptation learning method is proposed to obtain domain-invariant deep feature representations and similar recognition results from the cross-domain classifiers; 2D features of natural-light images and 3D features of depth images are fused, raising the feature dimensionality of small-sample data while suppressing interference from complex backgrounds on target recognition. In addition, to improve classifier performance in small-sample machine learning, a center loss is introduced into the traditional softmax loss to strengthen the penalizing supervision of the classification loss. Results: Comparative experiments on a public small-sample gesture dataset show that the proposed model achieves higher recognition accuracy than traditional recognition and transfer models; with DenseNet-169 as the pre-trained network, the recognition rate reaches 97.17%. Conclusion: Using multi-source datasets, two-stage adaptation learning, two-stream convolutional fusion, and a composite loss function, a two-stream deep transfer learning model with multi-source domain confusion is constructed. The proposed model increases the matching of data distributions between source and target domains, enriches the feature dimensionality of target samples, improves the supervision of the loss function, and improves the applicability of transferred features in arbitrary small-sample scenarios.
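A minimal sketch of combining softmax cross-entropy with a center loss, the composite loss this abstract describes: the center loss pulls each sample's embedding toward a learned center of its class. The class count, feature size, and weighting factor are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CenterLoss(nn.Module):
    """Penalize the distance between each feature and its class center."""
    def __init__(self, n_classes=10, feat_dim=256):
        super().__init__()
        self.centers = nn.Parameter(torch.randn(n_classes, feat_dim))

    def forward(self, features, labels):
        return ((features - self.centers[labels]) ** 2).sum(dim=1).mean()

ce = nn.CrossEntropyLoss()
center = CenterLoss()
lam = 0.01                                   # weight of the center loss (assumed)

features = torch.randn(32, 256)              # embeddings from the backbone
logits = torch.randn(32, 10)                 # classifier outputs
labels = torch.randint(0, 10, (32,))
loss = ce(logits, labels) + lam * center(features, labels)
```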

11.
Soybean has many cultivars whose leaf image patterns differ only subtly, so it is difficult to distinguish soybean cultivars by leaf features. Although great progress has been made in plant species recognition from leaf image patterns, the recognition and classification of soybean cultivars, a very fine-grained pattern recognition problem, has not yet received enough attention. Traditional handcrafted leaf image analysis methods generally cannot capture the subtle differences in leaf features between soybean cultivars, so their recognition rates are low. This paper attempts to use deep learning to extract highly discriminative leaf features to solve the soybean cultivar recognition problem. We propose a novel deep learning model called the Transformation Attention Network (TAN). The method first extracts fine-grained leaf image features through an attention mechanism and then corrects leaf pose with an affine transformation. We built a soybean leaf cultivar image database consisting of 240 soybean cultivars with 10 samples per cultivar, and used it to verify the availability of cultivar information in leaf image patterns and the effectiveness of the proposed deep learning model for soybean cultivar recognition. Encouragingly, the experimental results confirm the effectiveness of leaf image patterns in distinguishing cultivated soybean cultivars and show that the proposed method outperforms popular handcrafted leaf feature extraction methods and deep learning methods.

12.
Owing to factors such as air pollution and smoking, pneumonia has become one of the diseases with the highest human mortality. The application of machine learning and deep learning to medical image detection has helped clinical experts diagnose various diseases. However, the lack of effective paired chest X-ray datasets, and the fact that existing pneumonia detection methods use general classification models not tailored to the pneumonia task, make it hard to detect the subtle differences between pneumonia and normal images, leading to recognition failures. To this end, this paper augments the normal images in the dataset via cropping, rotation, and similar operations; a 50-layer deep residual network then learns shallow pneumonia features from the chest X-rays; next, a two-layer dictionary further abstracts and learns the pneumonia features learned by the residual network to discover subtle differences between lung images; finally, the multi-level pneumonia features extracted by the residual network and dictionary learning are fused to build a pneumonia detection model. To verify the effectiveness of the algorithm, the detection model is evaluated on the Chest X-ray pneumonia dataset. According to the test results, the proposed model achieves a detection accuracy of 97.12% and a score of 97.73% on the harmonic mean of precision and recall. Compared with existing methods, it obtains higher recognition accuracy.

13.
In the general machine learning domain, solutions based on the integration of deep learning models with knowledge-based approaches are emerging. Such hybrid systems have the advantage of improving the recognition rate and the model's interpretability, while requiring a significantly reduced amount of labeled data to reliably train the model. However, these techniques have been poorly explored in the sensor-based Human Activity Recognition (HAR) domain, where common-sense knowledge about activity execution can potentially improve purely data-driven approaches. While a few knowledge infusion approaches have been proposed for HAR, they rely on rigid logic formalisms that do not take uncertainty into account. In this paper, we propose P-NIMBUS, a novel knowledge infusion approach for sensor-based HAR that relies on probabilistic reasoning. A probabilistic ontology is in charge of computing symbolic features that are combined with the features automatically extracted by a CNN model from raw sensor data and high-level context data. In particular, the symbolic features encode probabilistic common-sense knowledge about the activities consistent with the user's surrounding context. These features are infused within the model before the classification layer. We experimentally evaluated P-NIMBUS on a HAR dataset of mobile device sensor data that includes 14 different activities performed by 25 users. Our results show that P-NIMBUS outperforms state-of-the-art neuro-symbolic approaches, with the advantage of requiring a limited amount of training data to reach satisfying recognition rates (i.e., more than 80% F1-score with only 20% of labeled data).

14.
Extensive research has been carried out in the past on face recognition, face detection, and age estimation. However, age-invariant face recognition (AIFR) has not been explored as thoroughly. A person's facial appearance changes considerably over time, which introduces significant intraclass variations and makes AIFR a very challenging task. Most face recognition studies that have addressed the ageing problem in the past have employed complex models and handcrafted features with strong parametric assumptions. In this work, we propose a novel deep learning framework that extracts age-invariant and generalized features from facial images of the subjects. The proposed model, trained on facial images from only a minor part (20–30%) of the subjects' lifespans, correctly identifies them throughout their lifespans. A variety of pretrained 2D convolutional neural networks are compared in terms of accuracy, time, and computational complexity to select the most suitable network for AIFR. Extensive experiments are carried out on the popular and challenging Face and Gesture Recognition Network ageing dataset. The proposed method achieves promising results and outperforms state-of-the-art AIFR models with an accuracy of 99%, which demonstrates the effectiveness of deep learning in facial ageing research.
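A minimal sketch of the standard way a pretrained 2D CNN is repurposed for such a recognition task: replace its classification head and fine-tune only the later layers. The choice of ResNet-50, the identity count, and the frozen-layer split are illustrative assumptions, not the paper's selected network.

```python
import torch
import torch.nn as nn
from torchvision import models

n_subjects = 500                                   # assumed number of identities

# torchvision >= 0.13 weights API; downloads ImageNet weights on first use
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
backbone.fc = nn.Linear(backbone.fc.in_features, n_subjects)   # new head

# freeze early layers; fine-tune only the last block and the new head
for name, param in backbone.named_parameters():
    param.requires_grad = name.startswith(("layer4", "fc"))

logits = backbone(torch.randn(2, 3, 224, 224))     # (2, n_subjects)
```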

15.
Electrocardiogram (ECG) biometric recognition has emerged as a hot research topic in the past decade. Although some promising results have been reported, especially using sparse representation learning (SRL) and deep neural networks, robust identification for small-scale data is still a challenge. To address this issue, we integrate SRL into a deep cascade model, and propose a multi-scale deep cascade bi-forest (MDCBF) model for ECG biometric recognition. We design the bi-forest based feature generator by fusing L1-norm sparsity and L2-norm collaborative representation to efficiently deal with noise. Then we propose a deep cascade framework, which includes multi-scale signal coding and deep cascade coding. In the former, we design an adaptive weighted pooling operation, which can fully explore the discriminative information of segments with low noise. In deep cascade coding, we propose level-wise class coding without backpropagation to mine more discriminative features. Extensive experiments are conducted on four small-scale ECG databases, and the results demonstrate that the proposed method performs competitively with state-of-the-art methods.

16.
Automatic recognition of crop leaf diseases is an important application of computer vision in agriculture. In recent years, deep learning has made progress in crop leaf disease recognition, but these methods all adopt deep feature representations based on a single deep convolutional neural network model. The useful property that different deep convolutional neural network models represent images in complementary ways has not yet received attention or study. This paper proposes MDFF-Net, a network model for fusing different deep features. MDFF-Net places two pre-trained deep convolutional neural network models in parallel, attaches to each model a fully connected layer with the same number of neurons so that the deep features output by the different models are transformed into features of the same dimensionality, and then further improves the feature fusion through the nonlinear transformations of two fully connected layers. We select VGG-16 and ResNet-50 as the parallel backbone networks of MDFF-Net and experiment on a public dataset containing five kinds of apple leaf diseases. The experimental results show that MDFF-Net achieves a recognition accuracy of 96.59%, better than the single VGG-16 and ResNet-50 networks, demonstrating the effectiveness of this deep feature fusion method.
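A minimal sketch of the parallel two-backbone fusion this abstract describes: VGG-16 and ResNet-50 features are projected to the same width and then fused by a small fully connected head. The projection width, head sizes, and the use of untrained backbones here are illustrative assumptions (pre-trained weights would be loaded in practice), not the authors' exact configuration.

```python
import torch
import torch.nn as nn
from torchvision import models

class DualBackboneFusion(nn.Module):
    def __init__(self, n_classes=5, width=512):
        super().__init__()
        vgg = models.vgg16(weights=None)        # pre-trained weights assumed in practice
        resnet = models.resnet50(weights=None)
        self.vgg_features = nn.Sequential(vgg.features,
                                          nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.resnet_features = nn.Sequential(*list(resnet.children())[:-1], nn.Flatten())
        self.proj_vgg = nn.Linear(512, width)    # VGG-16 conv output: 512 channels
        self.proj_resnet = nn.Linear(2048, width)  # ResNet-50 output: 2048 channels
        self.head = nn.Sequential(
            nn.Linear(2 * width, 256), nn.ReLU(),
            nn.Linear(256, n_classes))

    def forward(self, x):
        a = self.proj_vgg(self.vgg_features(x))
        b = self.proj_resnet(self.resnet_features(x))
        return self.head(torch.cat([a, b], dim=1))   # fuse equal-width features

logits = DualBackboneFusion()(torch.randn(2, 3, 224, 224))   # (2, 5)
```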

17.
To address the large parameter counts and overly deep, heavy networks of existing deep-learning-based human action recognition models, a lightweight two-stream fusion deep neural network model is proposed and applied to human action recognition. The model combines a shallow multi-scale network with a deep network, greatly reducing the number of parameters and avoiding an overly deep network. Experiments on the UCF101 and HMDB51 datasets show that, with ImageNet pre-training, the model achieves recognition accuracies of 94.0% and 69.4%, respectively. The experiments show that, compared with most existing deep-learning-based human action recognition models, the model greatly reduces the number of parameters while still achieving high action recognition accuracy.

18.
As their scale and complexity grow rapidly, software systems inevitably contain defects. In recent years, deep-learning-based defect prediction has become a research hotspot in software engineering. Such techniques can uncover latent defects without running the code and have therefore attracted broad attention in both industry and academia. However, most existing methods only consider whether a defect exists in method-level source code and cannot precisely identify the specific defect category, reducing the efficiency with which developers locate and fix defects. Moreover, in real software development practice, new projects usually lack enough defect data to train high-accuracy deep learning models, while models trained on the historical data of existing projects often generalize poorly to new projects. Therefore, this paper first formulates the traditional binary defect prediction task as a multi-label classification problem, using the defect categories described in the CWE (Common Weakness Enumeration) as fine-grained prediction labels. To improve model performance in cross-project scenarios, a multi-source domain adaptation framework that combines adversarial training and an attention mechanism is proposed. Specifically, the framework reduces domain (i.e., software project) differences through adversarial training and further uses domain-invariant features to obtain the feature correlations between each source domain and the target domain. Meanwhile, the framework uses weighted maximum mean discrepancy as an attention mechanism to minimize the representation distance between source- and target-domain features, so that the model can learn more domain-independent features. Finally, extensive comparative experiments against state-of-the-art baselines on eight real-world open-source projects verify the effectiveness of the proposed method.

19.
Traffic sign recognition devices have low power budgets and limited hardware performance, while existing convolutional neural network models have high memory footprints, slow training, and large computational cost, so they cannot be deployed on such devices. To address this problem and reduce model storage while speeding up training, depthwise separable convolution and shuffled group convolution are introduced and combined with the extreme learning machine (ELM), yielding two lightweight convolutional neural network models: DSC-ELM and SGC-ELM. After the lightweight convolutional neural network extracts features, the features are fed to an extreme learning machine for classification, which avoids the slow training of the parameters in a CNN's fully connected layers. The new models combine the low memory footprint and high-quality feature extraction of lightweight CNNs with the good generalization and fast training of the ELM. Experimental results show that, compared with other models, these hybrid models complete the traffic sign recognition task faster and more accurately.
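A minimal sketch of the two building blocks this abstract combines: a depthwise separable convolution as the lightweight feature extractor and an extreme learning machine head whose output weights are solved in closed form instead of by backpropagation. The layer sizes, hidden-unit count, and 43-class output are illustrative assumptions, not the authors' DSC-ELM configuration.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise conv (per-channel) followed by a 1x1 pointwise conv."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        return torch.relu(self.pointwise(self.depthwise(x)))

def elm_fit(features, labels, n_hidden=1000, n_classes=43):
    """ELM head: random hidden layer, output weights via pseudo-inverse."""
    w = torch.randn(features.shape[1], n_hidden)
    h = torch.tanh(features @ w)                      # random hidden activations
    targets = torch.eye(n_classes)[labels]            # one-hot targets
    beta = torch.linalg.pinv(h) @ targets             # closed-form output weights
    return w, beta

extractor = nn.Sequential(DepthwiseSeparableConv(3, 32),
                          nn.AdaptiveAvgPool2d(1), nn.Flatten())
feats = extractor(torch.randn(64, 3, 48, 48))         # 64 traffic-sign crops
w, beta = elm_fit(feats.detach(), torch.randint(0, 43, (64,)))
pred = torch.tanh(feats.detach() @ w) @ beta          # class scores
```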

20.
A complex network is a graph with non-trivial topological features, often occurring in real systems such as video monitoring networks, social networks, and sensor networks. While research on complex networks is growing, the main focus has been on the analysis and modeling of large networks with static topology. Prediction and control of temporal complex networks with evolving patterns are urgently needed but have rarely been studied. In view of these research gaps, we propose a novel end-to-end deep-learning-based network model, called temporal graph convolution and attention (T-GAN), for the prediction of temporal complex networks. To jointly extract both spatial and temporal features of complex networks, we design a new adaptive graph convolution and integrate it with Long Short-Term Memory (LSTM) cells. An encoder-decoder framework is applied to achieve the objectives of predicting properties and trends of complex networks, and we propose a dual attention block to improve the sensitivity of the model to different time slices. Our proposed T-GAN architecture is general and scalable, and can be used for a wide range of real applications. We demonstrate the application of T-GAN to three prediction tasks for evolving complex networks, namely node classification, feature forecasting, and topology prediction, over 6 open datasets. Our T-GAN based approach significantly outperforms the existing models, achieving improvements of more than 4.7% in recall and 25.1% in precision. Additional experiments are also conducted to show the generalization of the proposed model in learning the characteristics of time-series images. Extensive experiments demonstrate the effectiveness of T-GAN in learning spatial and temporal features and predicting properties of complex networks.
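A minimal sketch of the general pattern of combining a graph convolution over each snapshot with an LSTM over time, as this abstract describes. The simple normalized-adjacency propagation, layer sizes, and per-node output are illustrative assumptions, not T-GAN's adaptive graph convolution or dual attention block.

```python
import torch
import torch.nn as nn

class GraphConvLSTM(nn.Module):
    """Graph convolution on each snapshot, then an LSTM over the time axis."""
    def __init__(self, in_dim=16, gc_dim=32, hidden=64, out_dim=1):
        super().__init__()
        self.gc = nn.Linear(in_dim, gc_dim)
        self.lstm = nn.LSTM(gc_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, x, adj):
        # x: (batch, time, nodes, in_dim), adj: (nodes, nodes) normalized adjacency
        b, t, n, d = x.shape
        h = torch.relu(adj @ self.gc(x))                   # propagate features along edges
        h = h.permute(0, 2, 1, 3).reshape(b * n, t, -1)    # one sequence per node
        seq, _ = self.lstm(h)
        return self.out(seq[:, -1]).reshape(b, n, -1)      # per-node prediction

nodes = 10
adj = torch.rand(nodes, nodes)
adj = adj / adj.sum(dim=1, keepdim=True)                   # row-normalize (assumed scheme)
pred = GraphConvLSTM()(torch.randn(2, 5, nodes, 16), adj)  # (2, 10, 1)
```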
