Similar Documents
Found 20 similar documents (search time: 15 ms)
1.
Pedestrian detection has become a core technology for security, intelligent video surveillance, and visitor-flow statistics in scenic areas. The latest object-detection methods, including the fast region-based convolutional network Fast RCNN, the single-shot multibox detector SSD, and the deformable part model DPM, all detect the pedestrian as a whole. In large scenes, pedestrians take varied poses and occlusions between objects are frequent, so accurate localization is only achievable by modeling the positions of body parts and capturing local features of the person. Building on the Faster RCNN deep network, this work establishes a detection model for pedestrian heads, extracts head features from different orientations, and adds a spatial pyramid pooling layer to maintain detection speed. The method effectively handles partial occlusion of pedestrians in large scenes and clearly shows the general direction of crowd flow; compared with ordinary head counting, it is better suited to pedestrian-flow statistics.
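The spatial pyramid pooling layer mentioned above can be illustrated with a minimal sketch (pure Python on a hypothetical feature map; not the paper's implementation): the feature map is max-pooled over grids of several sizes and the results are concatenated into a vector whose length is fixed regardless of the input size.

```python
def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map over pyramid grids and concatenate.

    feature_map: list of lists of floats (H x W).
    levels: grid sizes; an n x n grid contributes n*n values.
    Output length is fixed (sum of n*n) regardless of H and W.
    """
    h, w = len(feature_map), len(feature_map[0])
    out = []
    for n in levels:
        for gi in range(n):
            for gj in range(n):
                # Cell bounds, covering the map even when n does not divide H or W.
                r0, r1 = gi * h // n, max((gi + 1) * h // n, gi * h // n + 1)
                c0, c1 = gj * w // n, max((gj + 1) * w // n, gj * w // n + 1)
                out.append(max(feature_map[r][c]
                               for r in range(r0, min(r1, h))
                               for c in range(c0, min(c1, w))))
    return out
```

Feature maps of different sizes yield vectors of identical length (1 + 4 + 16 = 21 here), which is what lets a detector keep a fixed-size head while accepting regions of varying size.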

2.
A Survey of Deep Convolutional Neural Network Models for Image Classification    Cited by: 3 (0 self-citations, 3 by others)
Image classification is an important task in computer vision, and traditional image-classification methods have certain limitations. With the development of artificial intelligence, deep learning has matured, and classifying images with deep convolutional neural networks (CNNs) has become a research hotspot: network structures have grown increasingly diverse, and their performance far exceeds that of traditional methods. Focusing on the structure of deep CNN models for image classification, and following the course of model development and optimization, this survey divides deep CNNs into four categories: classical deep CNN models, attention-based deep CNN models, lightweight deep CNN models, and neural architecture search models. It comprehensively reviews the construction methods and characteristics of each category and compares and analyzes the performance of the classification models. Although network structures are designed ever more elaborately and optimization methods grow more powerful, with classification accuracy continually improving while parameter counts fall and training and inference accelerate, deep CNN models still have limitations. The survey closes with open problems and possible future directions: deep CNN models mainly classify images by supervised learning and are constrained by dataset quality and scale, so unsupervised and semi-supervised deep CNN models will be a key research direction; model speed and resource consumption remain unsatisfactory, making deployment on mobile devices challenging; model-optimization methods and metrics for judging model quality need further study; and because hand-designing deep CNN structures is time-consuming and labor-intensive, neural architecture search will be the future direction of deep CNN model design.

3.
Detecting subjectivity expressed toward concerned targets is an interesting problem and has received intensive study. Previous work often treated each target independently, ignoring the potential (sometimes very strong) dependency that could exist among targets (e.g., the subjectivity expressed toward two products, or two political candidates in an election). In this paper, we relax this independence assumption in order to jointly model the subjectivity expressed toward multiple targets. We propose and show that an attention-based encoder-decoder framework is very effective for this problem, outperforming several alternatives that jointly learn dependent subjectivity through cascading classification or multitask learning, as well as models that independently predict subjectivity toward individual targets.
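As a rough illustration of the attention mechanism at the core of such an encoder-decoder (a generic dot-product-attention sketch, not the authors' model), the decoder scores each encoder state against its current state, normalizes the scores with a softmax, and reads out a weighted context vector:

```python
import math

def attention_context(decoder_state, encoder_states):
    """Dot-product attention: weight encoder states by softmax'd scores.

    decoder_state: list of floats; encoder_states: list of such vectors.
    Returns (weights, context) where context is the weighted sum.
    """
    scores = [sum(d * e for d, e in zip(decoder_state, h)) for h in encoder_states]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    context = [sum(w * h[k] for w, h in zip(weights, encoder_states))
               for k in range(len(decoder_state))]
    return weights, context
```

Encoder states that align with the decoder state receive higher weights, which is how the decoder can attend to the target-relevant parts of the input when predicting subjectivity for each target in turn.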

4.
Hyperspectral image analysis has been gaining research attention thanks to the current advances in sensor design which have made acquiring such imagery much more affordable. Although there exist various approaches for segmenting hyperspectral images, deep learning has become the mainstream. However, such large-capacity learners are characterized by significant memory footprints. This is a serious obstacle in employing deep neural networks on board a satellite for Earth observation. In this paper, we introduce resource-frugal quantized convolutional neural networks, and greatly reduce their size without adversely affecting the classification capability. Our experiments performed over two hyperspectral benchmarks showed that the quantization process can be seamlessly applied during the training, and it leads to much smaller and still well-generalizing deep models.
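The idea of quantizing weights to shrink a model can be sketched as simple uniform affine quantization (an illustrative sketch, not the paper's scheme): floats are mapped to 8-bit integer levels with a scale and zero point, and dequantized back when needed.

```python
def quantize_uint8(weights):
    """Uniform affine quantization of a float list to 8-bit levels.

    Returns (q, scale, zero_point); dequantize with scale * (q - zero_point).
    Storage drops from 32 bits to 8 bits per weight plus two constants.
    """
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255.0 or 1.0  # avoid a zero scale for constant tensors
    zero_point = round(-lo / scale)
    q = [max(0, min(255, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map 8-bit levels back to approximate float weights."""
    return [scale * (v - zero_point) for v in q]
```

The round trip loses at most one quantization step per weight, which is why such models can stay "well-generalizing" while becoming roughly four times smaller.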

5.
Smile or happiness is one of the most universal facial expressions in our daily life. Smile detection in the wild is an important and challenging problem, which has attracted growing attention from the affective computing community. In this paper, we present an efficient approach for smile detection in the wild with deep learning. Different from some previous work which extracted hand-crafted features from face images and trained a classifier to perform smile recognition in a two-step approach, deep learning can effectively combine feature learning and classification into a single model. In this study, we apply the deep convolutional network, a popular deep learning model, to handle this problem. We construct a deep convolutional network called Smile-CNN to perform feature learning and smile detection simultaneously. Experimental results demonstrate that although a deep learning model is generally developed for tackling “big data,” the model can also effectively deal with “small data.” We further investigate the discriminative power of the learned features, which are taken from the neuron activations of the last hidden layer of our Smile-CNN. By using the learned features to train an SVM or AdaBoost classifier, we show that the learned features have impressive discriminative ability. Experiments conducted on the GENKI4K database demonstrate that our approach can achieve a promising performance in smile detection.

6.
In this article, we propose a novel difference image (DI) creation method for unsupervised change detection in multi-temporal multi-spectral remote-sensing images based on deep learning theory. First, we apply a deep belief network to learn local and high-level features from the local neighbourhood of a given pixel in an unsupervised manner. Second, a back-propagation algorithm is improved to build a DI based on selected training samples, which can highlight the difference in changed regions and suppress false changes in unchanged regions. Finally, we obtain the change trajectory map using simple clustering analysis. The proposed scheme is tested on three remote-sensing data sets. Qualitative and quantitative evaluations show its superior performance compared to traditional pixel-level and texture-level-based approaches.
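The final clustering step can be illustrated with a minimal sketch (hypothetical 1D data, not the authors' pipeline): once a difference image is built, a simple 2-means split of its magnitudes separates changed from unchanged pixels.

```python
def two_means_split(values, iters=20):
    """Cluster scalar difference magnitudes into changed/unchanged by 2-means.

    Returns a list of 0/1 labels (1 = closer to the higher centroid).
    """
    c0, c1 = min(values), max(values)  # initialize centroids at the extremes
    for _ in range(iters):
        g0 = [v for v in values if abs(v - c0) <= abs(v - c1)]
        g1 = [v for v in values if abs(v - c0) > abs(v - c1)]
        if g0:
            c0 = sum(g0) / len(g0)
        if g1:
            c1 = sum(g1) / len(g1)
    return [0 if abs(v - c0) <= abs(v - c1) else 1 for v in values]
```

Run over a flattened difference image, the label-1 pixels form the changed-region mask; the paper's point is that a DI built from learned features makes this trivially simple split far more reliable than one built from raw pixel differences.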

7.
Multimedia Tools and Applications - Today, image-editing software has greatly evolved, and thanks to it, the semantic manipulation of images has become easier. On the other hand, the...

8.
Objective: Pulmonary nodules are an early form of lung cancer. Low-dose CT (computed tomography) scanning, an important examination for lung cancer screening, has been widely applied in routine health checks, but the enormous volume of CT data creates a heavy workload. With the rapid development of artificial intelligence, computer-aided pulmonary nodule detection based on deep learning has attracted attention. Because nodule sizes vary widely, representing features at multiple scales is crucial for the detection task. To address the detection difficulty caused by large variation in nodule size, this paper proposes a 3D multi-scale pulmonary nodule detection method for chest CT image sequences based on deep convolutional neural networks. Method: The method has two stages: 1) a nodule candidate-detection network that maximizes sensitivity; and 2) a false-positive reduction network that minimizes the number of false-positive nodules. The candidate-detection network uses a Res2Net backbone combined with squeeze-and-excitation units, so that convolutions in the same layer have multiple receptive fields and extract multi-scale nodule features; a region proposal network augmented with a context-enhancement module and a spatial-attention module determines the candidate regions. A false-positive reduction network composed of Res2Net modules and squeeze-and-excitation units then further classifies the candidates to reduce false positives and produce the final result. Results: Experiments on the public LUNA16 (lung nodule analysis 16) dataset show that, for the candidate-detection stage, sensitivity reaches 0.983 at an average of 22 false positives per scan, improving average and best sensitivity by 2.6% and 0.8%, respectively, over the baseline ResNet + FPN (feature pyramid network) method; for the full 3D multi-scale detection network, sensitivity is 0.924 at one false positive per scan. Conclusion: Compared with mainstream approaches, the method improves nodule-detection sensitivity while effectively controlling false positives, achieving better overall performance.
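The squeeze-and-excitation unit used in both stages can be sketched in pure Python (a generic SE gate with the unit's two fully connected layers replaced by an identity map for brevity; this is an illustration, not the paper's trained network): each channel is squeezed to a scalar by global average pooling, passed through a sigmoid, and used to rescale that channel.

```python
import math

def se_gate(channels):
    """Squeeze-and-excitation channel gating (FC layers omitted for brevity).

    channels: list of 2D feature maps (one list-of-lists per channel).
    Returns the channel maps rescaled by sigmoid(global average).
    """
    out = []
    for fmap in channels:
        n = sum(len(row) for row in fmap)
        squeeze = sum(sum(row) for row in fmap) / n  # global average pool
        gate = 1.0 / (1.0 + math.exp(-squeeze))      # sigmoid excitation
        out.append([[gate * v for v in row] for row in fmap])
    return out
```

The effect is a learned (here, fixed) per-channel reweighting: strongly responding channels pass almost unchanged while weak ones are suppressed, which helps the detector emphasize scale-relevant features.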

9.
Li Lin, Fan Mingyu, Liu Defu. Multimedia Tools and Applications, 2021, 80(17): 25539-25555
Multimedia Tools and Applications - Steganalysers based on deep learning achieve state-of-the-art performance. However, due to the difficulty of capturing the distribution of the high-dimensional...

10.

Scene flow is the 3D motion field between consecutive dynamic scenes, widely used in robotics and autonomous driving. Existing methods ignore the correlation between points in a point cloud and focus only on the point-wise matching between the source and target clouds; because this matching depends entirely on the feature information of the point-cloud data, accurately estimating scene flow at points with insufficient local features remains challenging. Exploiting the correlation between neighboring points in the source cloud, this paper proposes NCPUM (neighborhood consistency propagation update method), which propagates scene flow from high-confidence points to low-confidence points within a neighborhood, thereby refining the flow at points with insufficient local features. Specifically, NCPUM comprises two modules: a confidence prediction module, which predicts a point-wise confidence for the source cloud from a prior distribution map of scene flow; and a scene-flow propagation module, which updates the flow of the low-confidence point set under a local-consistency constraint. NCPUM is evaluated on the synthetic Flyingthings3D dataset and the real driving dataset KITTI, reaching internationally competitive accuracy. Because neighborhood consistency better matches the priors of real LiDAR scenes, the improvement on KITTI is more pronounced.

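The propagation step can be illustrated with a minimal sketch (hypothetical points, flows, and a simple distance threshold; not the authors' network): points below a confidence threshold take over the confidence-weighted average flow of nearby high-confidence points.

```python
def propagate_flow(points, flows, conf, radius=1.5, thresh=0.5):
    """Update flow at low-confidence points from high-confidence neighbours.

    points: list of (x, y, z); flows: list of (fx, fy, fz);
    conf: per-point confidence in [0, 1].
    Points below `thresh` receive the confidence-weighted mean flow of
    high-confidence points within `radius`; isolated points keep their flow.
    """
    out = list(flows)
    for i, (p, c) in enumerate(zip(points, conf)):
        if c >= thresh:
            continue  # high-confidence points are left untouched
        acc, wsum = [0.0, 0.0, 0.0], 0.0
        for j, (q, cj) in enumerate(zip(points, conf)):
            if cj < thresh or i == j:
                continue
            if sum((a - b) ** 2 for a, b in zip(p, q)) <= radius ** 2:
                for k in range(3):
                    acc[k] += cj * flows[j][k]
                wsum += cj
        if wsum > 0:
            out[i] = tuple(a / wsum for a in acc)
    return out
```

This captures the neighborhood-consistency prior: rigid surfaces in a LiDAR scene move coherently, so a point whose own features are weak can borrow the flow of its well-estimated neighbours.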

11.
Multimedia Tools and Applications - Image retargeting is the task of making images capable of being displayed on screens with different sizes. This work should be done so that high-level visual...

12.
Many parts of the world experience severe episodes of flooding every year. In addition to the high cost of mitigation and damage to property, floods make roads impassable and hamper community evacuation, movement of goods and services, and rescue missions. Knowing the depth of floodwater is critical to the success of response and recovery operations that follow. However, flood mapping, especially in urban areas, using traditional methods such as remote sensing and digital elevation models (DEMs) yields large errors due to reshaped surface topography and microtopographic variations combined with vegetation bias. This paper presents a deep neural network approach to detect submerged stop signs in photos taken from flooded roads and intersections, coupled with Canny edge detection and probabilistic Hough transform to calculate pole length and estimate floodwater depth. Additionally, a tilt correction technique is implemented to address the problem of sideways tilt in visual analysis of submerged stop signs. An in-house dataset, named BluPix 2020.1, consisting of paired web-mined photos of submerged stop signs across 10 FEMA regions (for U.S. locations) and Canada is used to evaluate the models. Overall, pole length is estimated with an RMSE of 17.43 and 8.61 in. in pre- and post-flood photos, respectively, leading to a mean absolute error of 12.63 in. in floodwater depth estimation. The findings of this research can equip jurisdictions, local governments, and citizens in flood-prone regions with a simple, reliable, and scalable solution that can provide (near-) real-time estimation of floodwater depth in their surroundings.
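The depth arithmetic behind this approach can be sketched simply (illustrative numbers and assumed reference sizes; the paper's actual pipeline first locates the sign and pole with Canny edges and the Hough transform): the known full height of a stop-sign pole minus the visible pole length above the waterline gives the water depth, with pixels converted to inches using the sign face as a size reference.

```python
def floodwater_depth(sign_height_px, visible_pole_px,
                     sign_height_in=30.0, full_pole_in=84.0):
    """Estimate floodwater depth (inches) from a photo of a submerged stop sign.

    sign_height_px: pixel height of the sign face in the photo (its real
        size is known; 30 in is assumed here for illustration).
    visible_pole_px: pixel length of the pole above the waterline.
    full_pole_in: assumed full pole length (84 in = 7 ft, an assumption).
    """
    inches_per_px = sign_height_in / sign_height_px  # scale from the sign face
    visible_in = visible_pole_px * inches_per_px
    return max(0.0, full_pole_in - visible_in)  # submerged portion, if any
```

A pre-flood photo of the same sign can replace the assumed full pole length, which is how the paired BluPix photos reduce the estimation error.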

13.
Classification-oriented Machine Learning methods are a precious tool, in modern Intrusion Detection Systems (IDSs), for discriminating between suspected intrusion attacks and normal behaviors. Many recent proposals in this field leveraged Deep Neural Network (DNN) methods, capable of learning effective hierarchical data representations automatically. However, many of these solutions were validated on data featuring stationary distributions and/or large amounts of training examples. By contrast, in real IDS applications different kinds of attack tend to occur over time, and only a small fraction of the data instances is labeled (usually with far fewer examples of attacks than of normal behavior). A novel ensemble-based Deep Learning framework is proposed here that tries to face the challenging issues above. Basically, the non-stationary nature of IDS log data is faced by maintaining an ensemble consisting of a number of specialized base DNN classifiers, trained on disjoint chunks of the data instances’ stream, plus a combiner model (reasoning on both the base classifiers’ predictions and original instance features). In order to learn deep base classifiers effectively from small training samples, an ad-hoc shared DNN architecture is adopted, featuring a combination of dropout, skip connections, and a cost-sensitive loss (for dealing with unbalanced data). Test results, conducted on two benchmark IDS datasets and involving several competitors, confirmed the effectiveness of our proposal (in terms of both classification accuracy and robustness to data scarcity), and allowed us to evaluate different ensemble combination schemes.
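The cost-sensitive loss mentioned above can be sketched as class-weighted cross-entropy (a generic formulation, not the authors' exact loss): rare attack examples receive a larger weight so the classifier is not dominated by the normal class.

```python
import math

def weighted_cross_entropy(probs, labels, class_weights):
    """Mean class-weighted cross-entropy for binary labels.

    probs: predicted probability of class 1 per example;
    labels: 0/1 ground truth; class_weights: {0: w0, 1: w1}.
    Up-weighting the rare class (e.g. attacks) counters class imbalance.
    """
    eps = 1e-12
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)  # clip to keep log() finite
        nll = -math.log(p) if y == 1 else -math.log(1 - p)
        total += class_weights[y] * nll
    return total / len(labels)
```

With a weight ratio matching the inverse class frequencies, a misclassified attack costs as much as many misclassified normal records, pushing the learned decision boundary toward the minority class.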

14.
Multispectral pedestrian detection is an important functionality in various computer vision applications such as robot sensing, security surveillance, and autonomous driving. In this paper, our motivation is to automatically adapt a generic pedestrian detector trained in a visible source domain to a new multispectral target domain without any manual annotation efforts. For this purpose, we present an auto-annotation framework to iteratively label pedestrian instances in visible and thermal channels by leveraging the complementary information of multispectral data. A distinct target is temporally tracked through image sequences to generate more confident labels. The predicted pedestrians in two individual channels are merged through a label fusion scheme to generate multispectral pedestrian annotations. The obtained annotations are then fed to a two-stream region proposal network (TS-RPN) to learn the multispectral features on both visible and thermal images for robust pedestrian detection. Experimental results on the KAIST multispectral dataset show that our proposed unsupervised approach using auto-annotated training data can achieve performance comparable to state-of-the-art deep neural network (DNN) based pedestrian detectors trained using manual labels.

15.
ColorCheckers are reference standards that professional photographers and filmmakers use to ensure predictable results under every lighting condition. The objective of this work is to propose a new fast and robust method for automatic ColorChecker detection. The process is divided into two steps: (1) ColorChecker localization and (2) ColorChecker patch recognition. For ColorChecker localization, we trained a detection convolutional neural network using synthetic images. The synthetic images are created with 3D models of the ColorChecker and different background images. The output of the neural network is the bounding box of each possible ColorChecker candidate in the input image. Each bounding box defines a cropped image which is evaluated by a recognition system, and each image is canonized with regard to color and dimensions. Subsequently, all possible color patches are extracted and grouped with respect to the distance between their centers. Each group is evaluated as a candidate for a ColorChecker part, and its position in the scene is estimated. Finally, a cost function is applied to evaluate the accuracy of the estimation. The method is tested using real and synthetic images. The proposed method is fast, robust to overlaps and invariant to affine projections. The algorithm also performs well in the case of multiple ColorChecker detections.

16.
Multimedia Tools and Applications - In recent years, we have witnessed the great success of deep learning on various problems in both low- and high-level computer vision. The low-level vision...

17.
Multispectral pedestrian detection has received extensive attention in recent years as a promising solution to facilitate robust human target detection for around-the-clock applications (e.g., security surveillance and autonomous driving). In this paper, we demonstrate that illumination information encoded in multispectral images can be utilized to significantly boost the performance of pedestrian detection. A novel illumination-aware weighting mechanism is presented to depict the illumination condition of a scene accurately. Such illumination information is incorporated into two-stream deep convolutional neural networks to learn multispectral human-related features under different illumination conditions (daytime and nighttime). Moreover, we utilize illumination information together with multispectral data to generate more accurate semantic segmentation, which is used to supervise the training of the pedestrian detector. Putting all of the pieces together, we present an effective framework for multispectral pedestrian detection based on multi-task learning of illumination-aware pedestrian detection and semantic segmentation. Our proposed method is trained end-to-end using a well-designed multi-task loss function and outperforms state-of-the-art approaches on the KAIST multispectral pedestrian dataset.
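The illumination-aware weighting can be illustrated with a minimal sketch (gating by mean image brightness as a hypothetical stand-in for the paper's learned illumination network): detection scores from the visible and thermal streams are fused with a weight that follows the illumination estimate.

```python
def fuse_scores(visible_score, thermal_score, brightness):
    """Illumination-gated fusion of two detection streams.

    brightness: mean image intensity in [0, 1], used here as a soft
    day/night weight (the paper learns this weight with a network).
    Bright scenes trust the visible stream; dark scenes the thermal one.
    """
    w = min(max(brightness, 0.0), 1.0)
    return w * visible_score + (1.0 - w) * thermal_score
```

The same gating idea generalizes to feature maps rather than scalar scores, which is closer to how a two-stream network would combine its branches.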

18.
Detection of building objects in airborne LiDAR data is an essential task in many types of geospatial data applications such as urban reconstruction and damage assessment. Traditional approaches used in building detection often rely on shape primitives that can be detected by 2D/3D computer vision techniques. These approaches require carefully engineered features which tend to be specific to building types. Furthermore, these approaches are often computationally expensive as data size increases. In this paper, we propose a novel approach that employs a deep neural network to recognize and extract residential building objects in airborne LiDAR data. This proposed approach does not require any pre-defined geometric or texture features, and it is applicable to airborne LiDAR data sets with varied point densities and with damaged building objects. The latter makes our approach particularly useful in damage assessment applications. The research results show that the proposed approach is capable of achieving state-of-the-art accuracy in building detection in these different types of point cloud data sets.

19.
The rapid development of Internet technology has turned sensitive-content images from largely covert content exchanges into massive data sharing, and traditional detection methods based on image feature extraction no longer apply. To address this difficulty, a sensitive-content detection method combining sparse semantics with a two-layer deep convolutional neural network is proposed. The upper network first preprocesses the training samples and constructs a sparse semantic representation of the image as the neural network's input; the lower network further considers third-party supervision mechanisms (e.g., government agencies) and proposes a sensitive-image detection method targeted at specific groups. Compared with commonly used sensitive-image detection methods, the approach effectively reduces the number of required training samples, and its detection accuracy exceeds that of traditional image-detection methods (e.g., bag-of-visual-words) by more than 7%.

20.
Ye Fajie, Li Xiongfei, Zhang Xiaoli. Multimedia Tools and Applications, 2019, 78(11): 14683-14703
Multimedia Tools and Applications - In the field of remote-sensing image fusion, traditional algorithms based on hand-crafted fusion rules are highly sensitive to the source images. In this paper, we...
