Similar literature
1.
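At present, the main problem facing face recognition is the loss of recognition accuracy caused by factors such as illumination, pose, occlusion, and expression; these factors are the main reason face recognition systems fall short. Deep learning is a new approach that can address these problems effectively. First, a deep learning algorithm is introduced to perform multi-level learning; then high-level features are extracted to describe the face; finally, the maximum margin criterion is applied to reduce the reconstruction error produced by least-squares estimation, achieving effective face recognition and classification. The algorithm was evaluated in simulation experiments on the ORL, CAS-PEAL, and Extended Yale-B face databases under varying illumination, pose, occlusion, expression, and appearance. The results show that the proposed algorithm is more efficient and more accurate than traditional linear classification algorithms.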

2.
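With the development of deep learning, face recognition has made major breakthroughs in recent years. However, in existing deep-learning-based face recognition frameworks, the individual tasks (face identification, verification, attribute classification, and so on) are designed and run independently, which makes the overall algorithm inefficient and time-consuming. To address these problems, a deep convolutional network based on a multi-task framework is proposed. By using face identification, verification, and attribute classification jointly as the network's objective function, the whole deep convolutional network is trained end to end, giving a simple and efficient algorithm. The network completes all three tasks at once, with no extra steps. Experimental results show that, even with limited data, the method still performs well, reaching 97.3% accuracy on the authoritative face recognition dataset LFW.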

3.
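Parallel convolutional neural network yak face recognition algorithm based on transfer learning   Cited by: 1 (self-citations: 0, by others: 1)
Chen Zhengtao, Huang Can, Yang Bo, Zhao Li, Liao Yong. Journal of Computer Applications (《计算机应用》), 2021, 41(5): 1332-1336
To manage yaks precisely during breeding, each yak's identity must be recognized, and yak face recognition is a feasible way to do so. However, existing neural-network-based yak face recognition algorithms suffer from yak face datasets with many features and long network training times. Therefore, drawing on transfer learning and combining the Visual Geometry Group network (VGG) with a convolutional neural network (CNN), a parallel CNN (Parallel-C...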

4.
This paper presents an online learning approach to video-based face recognition that does not make any assumptions about the pose, expressions, or prior localization of facial landmarks. Learning is performed online while the subject is imaged and gives near real-time feedback on the learning status. Face images are automatically clustered based on the similarity of their local features. The learning process continues until the clusters have a required minimum number of faces and the distance of the farthest face from its cluster mean is below a threshold. A voting algorithm is employed to pick the representative features of each cluster. Local features are extracted from arbitrary keypoints on faces as opposed to pre-defined landmarks, and the algorithm is inherently robust to large-scale pose variations and occlusions. During recognition, video frames of a probe are sequentially matched to the clusters of all individuals in the gallery, and its identity is decided on the basis of the best temporally cohesive cluster matches. Online experiments (using live video) were performed on a database of 50 enrolled subjects and another 22 unseen impostors. The proposed algorithm achieved a recognition rate of 97.8% and a verification rate of 100% at a false accept rate of 0.0014. For comparison, experiments were also performed using the Honda/UCSD database, and a 99.5% recognition rate was achieved.

5.
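Gong Rui, Ding Sheng, Zhang Chaohua, Su Hao. Journal of Computer Applications (《计算机应用》), 2020, 40(3): 704-709
Current deep-learning-based face recognition methods have large models and slow feature extraction, and existing face datasets cover only a single pose, so they do not perform well in practical face recognition tasks. To address this, a multi-pose face dataset is built and a lightweight multi-pose face recognition method is proposed. First, the multi-task cascaded convolutional neural network (MTCNN) algorithm is used for face detection, and the high-level features from the final stage of MTCNN are used for face tracking. Then the face pose is determined from the positions of the detected facial keypoints, a neural network trained with the ArcFace loss extracts the features of the current face, and these features are compared with the face features stored in the face database for the corresponding pose to obtain the recognition result. Experimental results show that the proposed method reaches 96.25% accuracy on the multi-pose face dataset, an improvement of 2.67% over the single-pose face dataset, so the proposed method effectively improves recognition accuracy.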
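The abstract above names ArcFace as the loss used to train the feature extractor. As a rough illustration only, the sketch below computes the additive angular margin loss for one sample in plain Python. The function names, the toy embedding, the class weight vectors, and the scale `s` and margin `m` values are all assumptions made for this example; they are not taken from the paper, and the sketch ignores the case where the angle plus the margin exceeds pi.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def arcface_loss(embedding, class_weights, target, s=64.0, m=0.5):
    """Additive angular margin loss for a single sample.

    s (scale) and m (margin, in radians) are illustrative defaults,
    not values reported in the abstract.
    """
    x = l2_normalize(embedding)
    logits = []
    for k, w in enumerate(class_weights):
        # Cosine of the angle between the embedding and the class centre.
        cos_theta = sum(a * b for a, b in zip(x, l2_normalize(w)))
        cos_theta = max(-1.0, min(1.0, cos_theta))  # guard acos against rounding
        if k == target:
            # Add the margin to the angle of the true class only.
            cos_theta = math.cos(math.acos(cos_theta) + m)
        logits.append(s * cos_theta)
    # Numerically stable softmax cross-entropy on the scaled cosines.
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]

# Hypothetical 2-D embedding and two class centres, for illustration only.
embedding = [1.0, 0.2]
class_weights = [[1.0, 0.0], [0.0, 1.0]]
loss_with_margin = arcface_loss(embedding, class_weights, target=0, s=8.0, m=0.5)
loss_without_margin = arcface_loss(embedding, class_weights, target=0, s=8.0, m=0.0)
```

With the margin switched on, the true class has to win by a wider angular gap, so the same sample incurs a larger loss than with `m=0.0`.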

6.
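This survey studies deep-learning-based expression recognition from static face images. It first introduces the principles of deep learning and summarizes the public, commonly used facial expression datasets. It then describes the three steps of deep-learning-based expression recognition, summarizes the main methods for image preprocessing and expression classification, and reviews in particular the best-performing deep learning frameworks currently used for feature extraction, together with their basic principles and a comparison of their strengths and weaknesses. Finally, it points out open problems in facial expression recognition and likely future trends.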

7.
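Considering the complexity of real-world face images in angle, lighting, and resolution, the Inception-ResNet-V1 network structure is improved, along with related work such as dataset construction and hyperparameter tuning, and experiments are carried out on a home service robot platform. The results show that the improved network reaches 99.22% accuracy on the LFW test set, above the original network's 99.05%; 99.20% accuracy on an Asian face dataset, above the original network's 97.10%; and a false recognition rate of 3.43% on a self-built non-matching face dataset, below the original network's 12.28%. Compared with the original network structure, the improved structure therefore raises face recognition accuracy and lowers the false recognition rate.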

8.
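In real environments, occlusion is one of the main obstacles to accurately analyzing and recognizing facial expressions. In recent years, researchers have used deep learning to tackle the high expression misrecognition rate under occlusion. This paper summarizes deep learning algorithms for occluded expression recognition and occlusion-related problems. First, it outlines the state of expression recognition under partial occlusion, the ways expressions are represented, and the datasets used to study occluded expressions. Second, it reviews recent progress in deep learning methods for occluded expression recognition and analyzes the effect of occlusion on expressions. Finally, it summarizes the main technical challenges, research difficulties, and possible strategies for addressing them. The aim is to provide a more useful reference and baseline for future research on occluded expression recognition.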

9.
Gait recognition is regarded as an emerging biometric technology for identifying people by the way they walk. The major challenge addressed in this article is that significant variation caused by covariate factors such as clothing, carrying conditions, and view angle undesirably affects gait recognition performance. In recent years, deep learning has produced remarkable accuracy on various challenging classification problems. Given the enormous amount of data in the real world, a convolutional neural network can approximate complex nonlinear functions, which makes it possible to develop a generalized deep convolutional neural network (DCNN) architecture for gait recognition. A DCNN can handle relatively large multiview datasets with or without data augmentation and fine-tuning techniques. This article proposes a color-mapped contour gait image as the gait feature for addressing the variations caused by the cofactors and for gait recognition across views. We have also compared various edge detection algorithms for gait template generation and chosen the best among them. The databases considered in our work include the widely used CASIA-B dataset and the OULP database. Our experiments show significant improvement in gait recognition for fixed-view, crossview, and multiview settings compared with recent methodologies.

10.
With the development of deep learning, numerous models have been proposed for human activity recognition to achieve state-of-the-art recognition on wearable sensor data. Despite the improved accuracy achieved by previous deep learning models, activity recognition remains a challenge. This challenge is often attributed to the complexity of some specific activity patterns. Existing deep learning models proposed to address this have often recorded high overall recognition accuracy, while low recall and precision are often recorded on some individual activities due to the complexity of their patterns. Some existing models that have focused on tackling these issues are always bulky and complex. Since most embedded systems have resource constraints in terms of their processor, memory, and battery capacity, it is paramount to propose efficient lightweight activity recognition models that consume limited resources and are still capable of achieving state-of-the-art recognition of activities, with high individual recall and precision. This research proposes a high-performance, low-footprint deep learning model with a squeeze and excitation block to address this challenge. The squeeze and excitation block consists of a global average-pooling layer and two fully connected layers, which were placed to extract the flattened features in the model, with best-fit reduction ratios in the squeeze and excitation block. The squeeze and excitation block served as channel-wise attention, which adjusted the weight of each channel to build more robust representations and enabled our network to become more responsive to essential features while suppressing less important ones. By using the best-fit reduction ratio in the squeeze and excitation block, the parameters of the fully connected layer were reduced, which helped the model increase responsiveness to essential features. Experiments on three publicly available datasets (PAMAP2, WISDM, and UCI-HAR) showed that the proposed model outperformed the existing state of the art with fewer parameters and increased the recall and precision of some individual activities compared to the baseline and the existing models.
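The squeeze and excitation block described in the abstract above (global average pooling followed by two fully connected layers that produce per-channel weights) can be sketched in a few lines of plain Python. This is a minimal illustration under assumptions, not the authors' model: the function name, the toy 4-channel input, the reduction ratio of 2, and all weight values are made up for the example, and the ReLU and sigmoid activations follow the usual squeeze-and-excitation design rather than anything stated in the abstract.

```python
import math

def squeeze_excite(features, w1, b1, w2, b2):
    """Apply channel-wise attention to `features` (C channels x T samples)."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in features]
    # Excitation, layer 1: project C descriptors down to C // r units (ReLU).
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)) + b)
              for row, b in zip(w1, b1)]
    # Excitation, layer 2: project back to C gates in (0, 1) (sigmoid).
    gates = [1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
             for row, b in zip(w2, b2)]
    # Scale: reweight every channel by its gate.
    scaled = [[g * x for x in ch] for g, ch in zip(gates, features)]
    return scaled, gates

# Toy input (assumed): C = 4 channels, T = 3 samples, reduction ratio r = 2.
features = [[1.0, 2.0, 3.0], [0.0, 0.0, 0.0], [4.0, 4.0, 4.0], [-1.0, 0.0, 1.0]]
w1 = [[0.5, 0.0, 0.5, 0.0], [0.0, 0.5, 0.0, 0.5]]        # (C // r) x C
b1 = [0.0, 0.0]
w2 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]  # C x (C // r)
b2 = [0.0, 0.0, 0.0, 0.0]
scaled, gates = squeeze_excite(features, w1, b1, w2, b2)
```

The reduction ratio sets the width of the hidden layer: a larger ratio means fewer parameters in the two fully connected layers, which is the trade-off the abstract refers to when it mentions a best-fit ratio.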

11.
Spontaneous facial expression recognition is significantly more challenging than recognizing posed ones. We focus on two issues that are still under-addressed in this area. First, due to their inherent subtlety, the geometric and appearance features of spontaneous expressions tend to overlap with each other, making it hard for classifiers to find effective separation boundaries. Second, the training set usually contains dubious class labels, which can hurt recognition performance if no countermeasure is taken. In this paper, we propose a spontaneous expression recognition method based on robust metric learning with the aim of alleviating these two problems. In particular, to increase the discrimination of different facial expressions, we learn a new metric space in which spatially close data points have a higher probability of being in the same class. In addition, instead of using the noisy labels directly for metric learning, we define sensitivity and specificity to characterize the annotation reliability of each annotator. Then the distance metric and the annotators' reliability are jointly estimated by maximizing the likelihood of the observed class labels. With the introduction of latent variables representing the true class labels, the distance metric and the annotators' reliability can be iteratively solved under the Expectation Maximization framework. Comparative experiments show that our method achieves better recognition accuracy on spontaneous expression recognition, and the learned metric can be reliably transferred to recognize posed expressions.

12.
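Feng Shu. Journal of Computer Applications (《计算机应用》), 2017, 37(2): 512-516
Feature representation is a key problem in face recognition. Because face images are affected by illumination, occlusion, pose, and other factors during capture, extracting robust image features has become a research focus. Inspired by the convolutional network framework, and exploiting the stable results and fast convergence of the K-means algorithm in learning convolution filters, a simple and effective face recognition method is proposed. It consists of three parts: convolution filter learning, nonlinear processing, and spatial average pooling. Specifically, local image patches are first extracted from the training images and preprocessed, K-means is used to learn the filters quickly, and each filter is convolved with the image; the convolved images are then passed through a nonlinear transform using the hyperbolic tangent function; finally, spatial average pooling denoises the image features and reduces their dimensionality. The classification stage uses only a simple linear regression classifier. Evaluation on the AR and ExtendedYaleB datasets shows that the proposed method, although simple, is very effective and is highly robust to illumination and occlusion.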
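The feature extraction stages named in the abstract above (convolution with a filter, a hyperbolic tangent nonlinearity, then spatial average pooling) can be sketched as follows. This is only an illustration under assumptions: the filter here is a hand-picked vertical-edge kernel standing in for a K-means centroid, the K-means filter learning and the linear regression classifier are omitted, and the function names, the 5 x 5 image, and the pooling size are hypothetical.

```python
import math

def conv2d_valid(image, kernel):
    """Valid-mode sliding-window filtering (cross-correlation, no kernel flip)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def tanh_map(fmap):
    """Element-wise hyperbolic tangent nonlinearity."""
    return [[math.tanh(v) for v in row] for row in fmap]

def avg_pool(fmap, size):
    """Non-overlapping size x size average pooling; ragged edges are dropped."""
    rows, cols = len(fmap) // size, len(fmap[0]) // size
    return [[sum(fmap[i * size + a][j * size + b]
                 for a in range(size) for b in range(size)) / (size * size)
             for j in range(cols)]
            for i in range(rows)]

# Toy 5 x 5 image with a bright vertical stripe (assumed data).
image = [[0.0, 0.0, 1.0, 1.0, 0.0] for _ in range(5)]
# Hand-picked stand-in for one learned filter; not a real K-means centroid.
kernel = [[-1.0, 1.0], [-1.0, 1.0]]

pooled = avg_pool(tanh_map(conv2d_valid(image, kernel)), 2)
```

In the toy example the left half of the pooled map is positive (the rising edge of the stripe) and the right half is negative (the falling edge), and pooling has reduced a 4 x 4 response map to 2 x 2.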

13.
Recently, transforming Windows files into images and analyzing them with machine learning and deep learning has been considered a state-of-the-art approach to malware detection and classification. This is mainly because image-based malware detection and classification is platform independent, and because of the recent surge in the performance of deep learning models in image classification. A literature survey shows that convolutional neural network (CNN) deep learning methods have been successfully employed for image-based Windows malware classification. However, the malware is embedded in a tiny portion of the overall image representation. Identifying and locating these tiny affected portions is important for achieving good malware classification accuracy. In this work, a multi-headed attention-based approach is integrated into a CNN to locate and identify the tiny infected regions in the overall image. A detailed investigation and analysis of the proposed method was done on a malware image dataset. The performance of the proposed multi-headed attention-based CNN approach was compared with various non-attention CNN-based approaches on various training and testing splits of a malware image benchmark dataset. In all the data splits, the attention-based CNN method outperformed the non-attention-based CNN methods while ensuring computational efficiency. Most importantly, most of the methods show consistent performance across all the training and testing data splits, which illustrates the generalizability of the multi-headed attention CNN model to diverse datasets. With fewer trainable parameters, the proposed method achieved an accuracy of 99% in classifying the 25 malware families and performed better than the existing non-attention-based methods. The proposed method can be applied on any operating system, and it can detect packed malware, metamorphic malware, obfuscated malware, malware family variants, and polymorphic malware. In addition, the proposed method is malware file agnostic and avoids the usual methods such as disassembly, decompiling, de-obfuscation, or execution of the malware binary in a virtual environment when detecting malware and classifying it into its malware family.

14.
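To overcome interference from noise such as illumination, pose, and color in face recognition, an improved CNN structure is proposed that combines the advantages of convolutional neural networks and Siamese neural networks. The structure consists of two convolutional neural networks that share weights, and it is trained with the discriminative deep metric learning (DDML) algorithm. The convolutional structure effectively removes external noise, the weight-sharing structure automatically extracts common features during nonlinear dimensionality reduction, and the DDML algorithm makes the extracted features more effective. Experiments on the ORL, YaleB, and AR face databases show that, compared with algorithms such as PCA and CNN, the method is more stable and improves the recognition rate by nearly 5 percentage points.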

15.
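Face recognition is widely used, but it is vulnerable to spoofing attacks with forged faces, which undermines security. Designing liveness detection methods with high detection accuracy, strong generalization, and real-time performance is the current research focus. Existing face liveness detection methods are divided into methods based on hand-crafted feature representations, methods based on deep learning, and methods based on fusion strategies; for each category, the basic ideas, implementation steps, and strengths and weaknesses of its typical algorithms are described. Finally, publicly available face liveness detection databases are organized and described, and the development trends of face liveness detection and the problems still to be solved are reviewed, providing a reference for future research.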

16.
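Multi-font Chinese character recognition with convolutional neural networks   Cited by: 1 (self-citations: 0, by others: 1)
Objective: Multi-font Chinese character recognition has broad application prospects in automatic Chinese text processing and intelligent input, and is an important topic in pattern recognition. In recent years, with the emergence of new deep learning techniques, Chinese character recognition based on deep convolutional neural networks has made breakthroughs in both methods and performance. However, existing methods need large numbers of samples, take a long time to train, and are hard to tune, so they rarely achieve the best results for Chinese character recognition with many classes. Method: For unoccluded printed and handwritten Chinese character images, an end-to-end deep convolutional neural network model is proposed. Leaving aside auxiliary layers, the network consists mainly of 3 convolutional layers, 2 pooling layers, 1 fully connected layer, and 1 Softmax regression layer. To address the shortage of samples, a data augmentation method combining ripple distortion, translation, rotation, and scaling is proposed. To address the difficulty of tuning deep network parameters and the long training time, strategies are proposed that include batch normalization of the samples and fine-tuning the network with a combination of several optimization methods. Results: The deep model was used to recognize the 3,755 classes of level-1 Chinese characters in the national standard, reaching a final recognition accuracy of 98.336%. Several groups of comparative experiments verified the contribution of each proposed method to the model's final performance: data augmentation, the hybrid optimization method, and batch normalization raised the recognition rate on test samples by 8.0%, 0.3%, and 1.4%, respectively. Conclusion: Compared with methods in the literature that combine hand-crafted features with convolutional neural networks, the approach reduces the manual feature extraction workload; compared with classical convolutional neural networks, the network has stronger feature extraction ability, a higher recognition rate, and a shorter training time.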

17.
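Cheng Guangtao, Gong Jiachang, Li Jian. Journal of Computer Applications (《计算机应用》), 2020, 40(5): 1465-1469
To address the poor robustness of the image features extracted by traditional smoke detection methods, a smoke recognition method based on the dense convolutional neural network (DenseNet) is proposed. First, dense network blocks are built using convolution operations and feature map fusion, and a dense connection mechanism is designed between convolutional layers to strengthen information flow and feature reuse within each block. Then the dense blocks are stacked into a dense convolutional neural network for smoke recognition, which saves computing resources while improving the representation of smoke image features. Finally, because smoke image data is scarce, data augmentation is used to further improve the recognition ability of the trained model. The method was validated on public smoke datasets; the results show that the model is only 0.44 MB in size and reaches accuracies of 96.20% and 96.81% on two test sets.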

18.
Pedestrian detection is essential for improving pedestrian safety in an intelligent traffic system. The efficiency of the system is affected by real-time processing and the error rate of detection. These concerns have not been completely addressed in previous studies. Therefore, this study proposes a real-time pedestrian recognition system that ensures high accuracy by using a deep learning classifier and zebra-crossing recognition techniques. The proposed system was designed to improve pedestrian safety and reduce accidents at intersections. Environmental feature vectors were first used to detect zebra crossings and to determine crossing areas. An adaptive mapping technique was then used to map the pedestrian waiting area based on the crossing area. A dual-camera mechanism was used to maintain detection accuracy and improve system fault tolerance. Finally, the you-only-look-once model was used to recognize pedestrians at intersections. A system prototype was implemented to verify the feasibility of the proposed system. The results revealed that the proposed scheme outperforms the conventional histogram of oriented gradients and Haar cascade schemes.

19.
Facial expression recognition (FER) in the wild is an active and challenging field of research. A system for automatic FER finds use in a wide range of applications related to advanced human–computer interaction (HCI), human–robot interaction (HRI), human behavioral analysis, gaming and entertainment, etc. Since their inception, convolutional neural networks (CNNs) have attained state-of-the-art accuracy in the facial analysis task. However, recognizing facial expressions in the wild with high confidence running on a low-cost embedded device remains challenging. To this end, this study presents an efficient dual-channel ensembled deep CNN (DCE-DCNN) for FER in the wild. Initially, two DCNNs, namely $\mathrm{DCNN}_G$ and $\mathrm{DCNN}_S$, are trained separately on the grayscale and Scharr-convolved vertical gradient facial images, respectively. The proposed network later integrates the two pre-trained DCNNs to obtain the dual-channel integrated DCNN (DCI-DCNN). Finally, all three neural networks, namely $\mathrm{DCNN}_G$, $\mathrm{DCNN}_S$, and DCI-DCNN, are jointly fine-tuned to get a single dual-channel-multi-output model. The multi-output model produces three prediction scores for the given input facial image. The prediction scores are thus fused using the max-voting ensemble scheme to obtain the DCE-DCNN with the final classification label. On the FER2013, RAF-DB, NCAER-S, AffectNet, and CKPlus benchmark FER datasets, the proposed DCE-DCNN consistently outperforms the two individual DCNNs and numerous state-of-the-art CNNs. Moreover, the network achieves competitive recognition accuracy on all four FER in the wild datasets with reduced memory storage size and parameters. The proposed DCE-DCNN model with high throughput on resource-limited embedded devices is suitable for applications that seek real-time classification of facial expressions in the wild with high confidence.
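The fusion step in the abstract above takes three prediction score vectors and combines them by max voting. A minimal sketch of that idea is shown below, assuming max voting means a majority vote over each output's top-scoring label. The function name, the example scores, and the tie-breaking rule (falling back to the highest summed score among tied labels) are assumptions for illustration; the abstract does not say how ties are resolved.

```python
from collections import Counter

def max_voting(score_vectors):
    """Fuse several per-class score vectors by majority vote on their argmax."""
    # Each model votes for its highest-scoring class index.
    votes = [max(range(len(s)), key=s.__getitem__) for s in score_vectors]
    counts = Counter(votes)
    top = max(counts.values())
    tied = [label for label, c in counts.items() if c == top]
    if len(tied) == 1:
        return tied[0]
    # Tie-break (an assumption, not from the abstract): pick the tied label
    # with the highest summed score across all models.
    return max(tied, key=lambda k: sum(s[k] for s in score_vectors))

# Hypothetical scores from three outputs over three expression classes.
clear_majority = [[0.1, 0.7, 0.2], [0.2, 0.5, 0.3], [0.6, 0.3, 0.1]]
three_way_tie = [[0.9, 0.1, 0.0], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]

label_majority = max_voting(clear_majority)
label_tie = max_voting(three_way_tie)
```

In the first example two of the three outputs vote for class 1, so it wins outright; in the second each output votes for a different class, and the assumed tie-break selects class 0 because its summed score is highest.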

20.
Automated, real-time, and reliable equipment activity recognition on construction sites can help to minimize idle time, improve operational efficiency, and reduce emissions. Previous efforts in activity recognition of construction equipment have explored different classification algorithms and sensors such as accelerometers and gyroscopes. These studies utilized pattern recognition approaches such as statistical models (e.g., hidden Markov models), shallow neural networks (e.g., artificial neural networks), and distance algorithms (e.g., K-nearest neighbor) to classify the time-series data collected from sensors mounted on the equipment. Such methods necessitate the segmentation of continuous operational data with fixed or dynamic windows to extract statistical features. This heuristic and manual feature extraction process is limited by human knowledge and can only extract human-specified shallow features. However, recent developments in deep neural networks, specifically recurrent neural networks (RNNs), present new opportunities to classify sequential time-series data with recurrent lateral connections. An RNN can automatically learn high-level representative features through the network instead of relying on manually designed ones, making it more suitable for complex activity recognition. However, the application of RNNs requires a large training dataset, which is a practical challenge to obtain from real construction sites. Thus, this study presents a data-augmentation framework for generating synthetic time-series training data for an RNN-based deep learning network to accurately and reliably recognize equipment activities. The proposed methodology is validated by generating synthetic data from sample datasets that were collected from two real-world earthmoving operations. The synthetic data along with the collected data were used to train a long short-term memory (LSTM)-based RNN. The trained model was evaluated by comparing its performance with traditionally used classification algorithms for construction equipment activity recognition. The deep learning framework presented in this study outperformed the traditionally used machine learning classification algorithms for activity recognition in terms of model accuracy and generalization.
