1.

林成创单纯赵淦森杨志荣彭璟陈少洁黄润桦李壮伟易序晟杜嘉华李双印罗浩宇樊小毛陈冰川《计算机科学与探索》2021,15(4):583-611

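Deep learning is currently the leading approach to machine vision, and large, high-quality training datasets are the basic guarantee for solving machine vision problems with deep learning. Collecting and accurately labeling image datasets is an extremely time-consuming and expensive process, and the problem will become more prominent as machine vision is applied more widely. Image augmentation is an effective technique for training deep learning models on small or low-quality training datasets, and it has evolved continuously alongside deep learning and machine vision. This paper systematically reviews current research on image augmentation and analyzes its research paradigms from the perspectives of augmentation object, augmentation space, label handling, and augmentation policy generation. Based on these paradigms, it proposes a taxonomy of existing image augmentation techniques and highlights representative work in each category. Finally, it summarizes existing image augmentation research and points out open problems and future trends.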

2.

A survey of automated data augmentation algorithms for deep learning-based image classification tasks

Yang Zihan Sinnott Richard O. Bailey James Ke Qiuhong 《Knowledge and Information Systems》2023,65(7):2805-2861

In recent years, deep learning has been one of the most popular techniques in the computer vision community. As a data-driven technique, a deep model requires enormous amounts of accurately labelled training data, which are often inaccessible in many real-world applications. A data-space solution is Data Augmentation (DA), which can artificially generate new images from original samples. Image augmentation strategies can vary by dataset, as different data types might require different augmentations to facilitate model training. However, the design of DA policies has largely been decided by human experts with domain knowledge, which is considered highly subjective and error-prone. To mitigate this problem, a novel direction is to automatically learn image augmentation policies from the given dataset using Automated Data Augmentation (AutoDA) techniques. The goal of AutoDA models is to find the optimal DA policies that maximize model performance gains. This survey discusses the underlying reasons for the emergence of AutoDA technology from the perspective of image classification. We identify three key components of a standard AutoDA model: a search space, a search algorithm and an evaluation function. Based on their architecture, we provide a systematic taxonomy of existing image AutoDA approaches. This paper presents the major works in the AutoDA field, discusses their pros and cons, and proposes several potential directions for future improvements.
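The three components named in this abstract (a search space, a search algorithm, and an evaluation function) can be illustrated with a toy sketch. Everything here is hypothetical and chosen only for illustration: the operation names, the magnitude grid, the use of plain random search, and the stand-in `evaluate` function are assumptions, not details taken from the survey. A real system would train a model under each candidate policy and score it on validation data.

```python
import random

# Search space (assumed): a policy is a list of (operation, magnitude) pairs.
OPERATIONS = ["rotate", "shear", "brightness", "cutout"]
MAGNITUDES = [0.1, 0.3, 0.5, 0.7, 0.9]


def sample_policy(rng, length=2):
    """Draw one candidate policy uniformly from the search space."""
    return [(rng.choice(OPERATIONS), rng.choice(MAGNITUDES)) for _ in range(length)]


def evaluate(policy):
    """Stand-in evaluation function (hypothetical).

    A real AutoDA system would train a model with the policy and return
    validation accuracy; this toy score simply prefers moderate magnitudes.
    """
    return sum(1.0 - abs(m - 0.5) for _, m in policy) / len(policy)


def random_search(n_trials=50, seed=0):
    """Search algorithm (assumed): plain random search over policies."""
    rng = random.Random(seed)
    best_policy, best_score = None, float("-inf")
    for _ in range(n_trials):
        policy = sample_policy(rng)
        score = evaluate(policy)
        if score > best_score:
            best_policy, best_score = policy, score
    return best_policy, best_score


best_policy, best_score = random_search()
```

Swapping `random_search` for a different search algorithm, or `evaluate` for a cheaper proxy, changes one component while leaving the other two untouched, which is the modularity the three-part decomposition suggests.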

3.

A survey of image data augmentation methods in deep learning. Total citations: 1; self-citations: 0; citations by others: 1

马岽奡唐娉赵理君张正《中国图象图形学报》2021,26(3):487-502

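As the driving force of deep learning, data is crucial to model training. Sufficient training data not only alleviates overfitting during training but also enlarges the parameter search space, helping the model move further toward the global optimum. In many fields and tasks, however, obtaining enough training samples is difficult and costly, so data augmentation has become a common way to increase the number of training samples. This paper surveys image data augmentation methods in deep learning, reviewing the methods proposed to alleviate model overfitting. According to their underlying principles, the methods are divided into four categories: single-data transformation, multi-data mixing, learning the data distribution, and learning augmentation policies. With image data as the main subject, each category is further subdivided by core idea, and the principles, applicable scenarios, strengths, and weaknesses of the methods are compared and analyzed, to help researchers choose suitable augmentation methods for their data and to provide a basis for further application and development of data augmentation. For image data, single-data transformation methods fall into five types: geometric transformation, color-space transformation, sharpness transformation, noise injection, and local erasing. Multi-data mixing can be divided into mixing in image space and mixing in feature space. Methods that learn the data distribution are divided mainly into applications of generative adversarial networks and of image style transfer. Typical methods for learning augmentation policies can be classified as meta-learning-based or reinforcement-learning-based. Data augmentation has become an important technique for advancing deep learning applications in many fields: it effectively alleviates the overfitting caused by insufficient training data and further improves model accuracy. In practice, the most suitable methods can be selected and combined according to the characteristics of the data and the task to form an effective augmentation scheme. Promising future directions include using reinforcement learning to explore optimal combination policies for a given dataset and task, using meta-learning to adaptively learn optimal transformation and mixing schemes, using generative adversarial networks to better fit the true data distribution and sample high-quality unseen data, and using style transfer to explore conversion between multimodal data.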
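Three of the single-image transformation types listed in this abstract (geometric transformation, noise injection, and local erasing) can be sketched in a few lines. This is a minimal illustration under stated assumptions: images are 2-D lists of grayscale values in [0, 255], and the function names, the Gaussian noise model, and the rectangular erase patch are hypothetical choices rather than details from the paper.

```python
import random


def horizontal_flip(image):
    """Geometric transform: mirror each row of a 2-D grayscale image."""
    return [row[::-1] for row in image]


def add_gaussian_noise(image, sigma, rng):
    """Noise injection: add zero-mean Gaussian noise, clipped to [0, 255]."""
    return [[min(255.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
            for row in image]


def random_erase(image, height, width, rng, fill=0):
    """Local erasing: overwrite one randomly placed height x width patch."""
    rows, cols = len(image), len(image[0])
    top = rng.randrange(rows - height + 1)
    left = rng.randrange(cols - width + 1)
    out = [row[:] for row in image]  # copy so the input stays unchanged
    for r in range(top, top + height):
        for c in range(left, left + width):
            out[r][c] = fill
    return out


rng = random.Random(0)
# A 4 x 4 test image with strictly positive pixel values.
image = [[float(10 * r + c + 1) for c in range(4)] for r in range(4)]
flipped = horizontal_flip(image)
noisy = add_gaussian_noise(image, sigma=5.0, rng=rng)
erased = random_erase(image, height=2, width=2, rng=rng)
```

Each function returns a new image, so the transformations can be chained or combined freely when assembling an augmentation scheme.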

4.

Deep reinforcement learning in computer vision: a comprehensive survey

Le Ngan Rathour Vidhiwar Singh Yamazaki Kashu Luu Khoa Savvides Marios 《Artificial Intelligence Review》2022,55(4):2733-2819

Deep reinforcement learning augments the reinforcement learning framework and utilizes the powerful representation of deep neural networks. Recent works have demonstrated the remarkable successes of deep reinforcement learning in various domains including finance, medicine, healthcare, video games, robotics, and computer vision. In this work, we provide a detailed review of recent and state-of-the-art research advances of deep reinforcement learning in computer vision. We start with the theories of deep learning, reinforcement learning, and deep reinforcement learning. We then propose a categorization of deep reinforcement learning methodologies and discuss their advantages and limitations. In particular, we divide deep reinforcement learning into seven main categories according to their applications in computer vision, i.e. (i) landmark localization; (ii) object detection; (iii) object tracking; (iv) registration on both 2D image and 3D image volumetric data; (v) image segmentation; (vi) video analysis; and (vii) other applications. Each of these categories is further analyzed with reinforcement learning techniques, network design, and performance. Moreover, we provide a comprehensive analysis of the existing publicly available datasets and examine source code availability. Finally, we present some open issues and discuss future research directions on deep reinforcement learning in computer vision.

5.

Performance improvement of Deep Learning Models using image augmentation techniques

Nagaraju M. Chawla Priyanka Kumar Neeraj 《Multimedia Tools and Applications》2022,81(7):9177-9200

The major barrier to using deep learning models is the lack of a large number of images in the training dataset. In fact, thousands of images may be needed in each image category, depending on the complexity of the problem. Prior studies have shown that image augmentation techniques can be used to artificially increase the number of images in a training dataset. These techniques can help improve the overall learning process and the performance of a deep learning model. To address this problem, we propose three algorithms. First, two image acquisition algorithms are proposed to systematically obtain real field images for testing and images from public datasets for training a model. Second, an algorithm is proposed that describes how the augmentations can be applied to enhance the datasets. In this study, we investigated 52 augmentations that increase the size of the input dataset. To classify four maize crop diseases, a new convolutional neural network model was developed, and several experiments were performed to demonstrate its effectiveness. First, two tests were carried out using the original dataset from the Kaggle public repository and the augmented dataset. Compared with the original dataset, the model improved by 5.14% with the augmented dataset. Second, three experiments were carried out to evaluate the performance of the proposed augmentation method. Experimental results demonstrated that the proposed approach outperforms three existing approaches by 27.38%, 3.14%, and 1.34% in classification.
The proposed IPA augmentation method was also compared with six existing methods: Full Stage Data Augmentation Framework, LeafGAN, Novel Augmentation method based on GAN, Wasserstein Generative Adversarial Network (WGAN), Activation Reconstruction-GAN, and Step-by-Step Data Augmentation Method. Experimental results show that it performs better than these methods by 28.31%, 19.76%, 20.18%, 13.75%, 2.42%, and 12.68%, respectively.

6.

Advances in deep learning reconstruction methods for quantitative magnetic resonance imaging

叶慧慧陈昱婷胡大琨李世卓刘华锋《中国图象图形学报》2023,28(6):1698-1708

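Magnetic resonance imaging (MRI) is a widely used noninvasive medical imaging method. Thanks to its rich soft-tissue contrast, it can image almost all internal structures of the human body, including organs, bones, muscles, and blood vessels, and it has become a powerful tool for clinical diagnostic imaging. MRI nevertheless has two recognized bottlenecks: slow imaging speed and cumbersome scanning procedures. Deep learning offers MRI a major opportunity and opens new possibilities for traditional accelerated MRI. Given how quickly the field is developing, this paper summarizes the many methods reported in the literature that combine deep learning with MR image reconstruction, in order to better understand the field. It briefly introduces deep learning reconstruction methods under parallel acceleration with multi-channel coil reception and under compressed-sensing acceleration. Single-contrast images can be accelerated using the redundancy provided by multi-channel receiver coils, multi-contrast images can additionally exploit the contrast dimension, and dynamic images, similarly to multi-contrast images, can additionally exploit the time dimension; the development of deep learning in these areas is also covered. As MRI has moved in recent years from qualitative multi-contrast imaging toward quantitative multi-parameter imaging, where quantitative images can implicitly contain multi-contrast images, using deep learning to map qualitative multi-contrast images to parameter maps remains a difficult problem. This direction has also developed rapidly in recent years and is surveyed here. For each of these topics, the state of research internationally and in China is presented separately to better summarize the similarities, differences, and trends. Finally, the outlook for deep learning in quantitative MRI is discussed.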

7.

Interpretable, structured, and multimodal deep neural networks

熊红凯高星李劭辉徐宇辉王涌壮余豪阳刘昕张云飞《模式识别与人工智能》2018,31(1):1-11

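Deep learning methods rely on large-scale labeled data and, through end-to-end supervised training, have achieved excellent performance in computer vision and natural language processing. However, existing methods usually target single-modality data, ignore the intrinsic structure of the data, and lack theoretical support. To address these problems, this paper discusses potential ways of combining deep learning with wavelet theory and structured prediction, and feasible mechanisms for extending them to multimodal data, from three perspectives: deep filter bank network design based on wavelet kernel learning, deep learning based on structured learning, and deep learning based on multimodal learning.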

8.

A survey of deep learning applications in medical imaging

施俊汪琳琳王珊珊陈艳霞王乾魏冬铭梁淑君彭佳林易佳锦刘盛锋倪东王明亮张道强沈定刚《中国图象图形学报》2020,25(10):1953-1981

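Deep learning can automatically learn good feature representations from large-sample data and effectively improve performance on a wide range of machine learning tasks; it has been widely applied in signal processing, computer vision, natural language processing, and many other fields. Deep learning-based intelligent computing for medical imaging is a current research hotspot in smart healthcare, and deep learning methods are now used throughout the medical image processing and analysis pipeline. Because of the inherent particularity and complexity of medical images, and especially the small-sample problem that is common in medical imaging, the relevant learning tasks and application scenarios place new demands on deep learning methods. Taking four clinically common modalities (X-ray, ultrasound, computed tomography, and magnetic resonance imaging) as examples, this paper reviews the current state of deep learning applications in medical imaging, introduces progress in the main deep learning methods for five tasks (image reconstruction, lesion detection, image segmentation, image registration, and computer-aided diagnosis), and discusses development trends.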

9.

Random linear interpolation data augmentation for person re-identification

Li Zhi Guo Jun Jiao Wenli Xu Pengfei Liu Baoying Zhao Xiaowei 《Multimedia Tools and Applications》2020,79(7-8):4931-4947

Person Re-Identification (person re-ID) is an image retrieval task that identifies the same person across different camera views. Generally, a good person re-ID model requires a large dataset containing over 100,000 images to reduce the risk of over-fitting. Most current handcrafted person re-ID datasets, however, are insufficient for training a learning model with high generalization ability. In addition, most existing datasets still lack images with various levels of occlusion. Motivated by these two problems, this paper proposes a new data augmentation method called Random Linear Interpolation that can enlarge person re-ID datasets and improve the generalization ability of the learning model. The key enabler of our approach is generating fused images by interpolating pairs of original images. In other words, the innovation of the proposed approach is to perform data augmentation between two random samples. Extensive experimental results demonstrate that the proposed method effectively improves baseline models. On the Market1501 and DukeMTMC-reID datasets, our approach achieves 92.71% and 82.19% rank-1 accuracy, respectively.
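The idea of generating fused images by interpolating pairs of original images can be sketched generically as follows. This is not the paper's implementation: the abstract does not say how the interpolation weight is drawn or how identity labels are handled, so the uniform weight on [0, 1], the function names, and the pixels-only fusion are assumptions made for illustration.

```python
import random


def interpolate_pair(image_a, image_b, lam):
    """Pixel-wise linear interpolation: lam * a + (1 - lam) * b."""
    return [[lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(image_a, image_b)]


def augment_by_interpolation(images, n_new, rng):
    """Create n_new fused images from randomly chosen pairs of originals."""
    fused = []
    for _ in range(n_new):
        image_a, image_b = rng.sample(images, 2)  # two distinct originals
        lam = rng.random()  # assumed: weight drawn uniformly from [0, 1)
        fused.append(interpolate_pair(image_a, image_b, lam))
    return fused


rng = random.Random(0)
images = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[255.0, 255.0], [255.0, 255.0]],
    [[100.0, 50.0], [25.0, 0.0]],
]
fused = augment_by_interpolation(images, n_new=3, rng=rng)
```

Because every fused pixel is a convex combination of two original pixels, the output stays within the value range of the inputs.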

10.

A survey of deep learning applications in image recognition. Total citations: 5; self-citations: 0; citations by others: 5

郑远攀李广阳李晔《计算机工程与应用》2019,55(12):20-36

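As an important technique in image recognition, deep learning has broad application prospects, and research on image recognition has significant theoretical value and practical importance for advancing computer vision and artificial intelligence. This paper surveys the application of deep learning in image recognition. It introduces the origins of deep learning, analyzes deep learning models including deep belief networks, convolutional neural networks, recurrent neural networks, generative adversarial networks, and capsule networks, and compares the improved variants of each model. It summarizes recent results in application areas such as face recognition, medical image recognition, and remote sensing image classification, and discusses questionable aspects of existing studies. It also examines development trends and identifies further directions for the field: using transfer learning effectively to recognize small-sample data, using unsupervised and semi-supervised learning for image recognition, recognizing video images effectively, and strengthening the theoretical grounding of models.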

11.

Research on Point-wise Gated Deep Networks

《Applied Soft Computing》2017

Stacking Restricted Boltzmann Machines (RBM) to create deep networks, such as Deep Belief Networks (DBN) and Deep Boltzmann Machines (DBM), has become one of the most important research fields in deep learning. DBM and DBN provide state-of-the-art results in many fields such as image recognition, but they do not show better learning abilities than RBM when dealing with data containing irrelevant patterns. Point-wise Gated Restricted Boltzmann Machines (pgRBM) can effectively find the task-relevant patterns in data containing irrelevant patterns and thus achieve satisfactory classification results. To address the limitations of the DBN and the DBM in processing data containing irrelevant patterns, we introduce the pgRBM into the DBN and the DBM and present Point-wise Gated Deep Belief Networks (pgDBN) and Point-wise Gated Deep Boltzmann Machines (pgDBM). The pgDBN and the pgDBM both utilize the pgRBM instead of the RBM to pre-train the weights connecting the networks' visible layer and hidden layer, and apply the pgRBM learning task-relevant data subset for traditional networks. Then, this paper discusses the validity of the dropout and weight uncertainty methods developed to prevent overfitting in pgRBM, pgDBN, and pgDBM networks. Experimental results on MNIST variation datasets show that the pgDBN and the pgDBM are effective deep neural networks learning

12.

Analyzing the potential of active learning for document image classification

Saifullah Saifullah Agne Stefan Dengel Andreas Ahmed Sheraz 《International Journal on Document Analysis and Recognition》2023,26(3):187-209

Deep learning has been extensively researched in the field of document analysis and has shown excellent performance across a wide range of document-related tasks. As a result, a great deal of emphasis is now being placed on its practical deployment and integration into modern industrial document processing pipelines. It is well known, however, that deep learning models are data-hungry and often require huge volumes of annotated data in order to achieve competitive performance. Since data annotation is a costly and labor-intensive process, it remains one of the major hurdles to their practical deployment. This study investigates the possibility of using active learning to reduce the costs of data annotation in the context of document image classification, which is one of the core components of modern document processing pipelines. The results of this study demonstrate that by utilizing active learning (AL), deep document classification models can achieve performance competitive with models trained on fully annotated datasets and, in some cases, even surpass them while annotating only 15–40% of the total training dataset. Furthermore, this study demonstrates that modern AL strategies significantly outperform random querying and in many cases achieve performance comparable to models trained on fully annotated datasets, even in the presence of practical deployment issues such as data imbalance and annotation noise, and thus offer tremendous benefits in real-world deployment of deep document classification models. The code to reproduce our experiments is publicly available at https://github.com/saifullah3396/doc_al.
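The query step at the heart of pool-based active learning can be sketched with a simple uncertainty criterion. The abstract does not name the AL strategies it evaluates, so least-confidence sampling is an assumption chosen here only because it is a common baseline; the function names and the example probabilities are hypothetical and unrelated to the linked repository.

```python
def least_confidence(probabilities):
    """Uncertainty score: 1 minus the highest predicted class probability."""
    return 1.0 - max(probabilities)


def select_queries(pool_probabilities, budget):
    """Return indices of the `budget` most uncertain unlabeled samples."""
    ranked = sorted(
        range(len(pool_probabilities)),
        key=lambda i: least_confidence(pool_probabilities[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical class probabilities predicted for four unlabeled documents.
pool_probabilities = [
    [0.98, 0.01, 0.01],
    [0.40, 0.35, 0.25],
    [0.70, 0.20, 0.10],
    [0.34, 0.33, 0.33],
]
queries = select_queries(pool_probabilities, budget=2)
```

In a full loop, the selected documents would be sent for annotation, added to the labeled set, and the model retrained before the next query round. Random querying, the baseline mentioned in the abstract, would replace the ranking with a uniform draw from the pool.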

13.

A review and discussion of deep learning. Total citations: 2; self-citations: 0; citations by others: 2

胡越罗东阳花奎路海明张学工《智能系统学报》2019,14(1):1-19

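Machine learning is the study of learning patterns from data through computational models and algorithms. It is applied in many fields that require mining patterns from complex data and has become one of the core technologies of artificial intelligence in the broad sense. In recent years, various deep neural networks have achieved remarkable results on a large number of machine learning problems, forming the most prominent new branch of machine learning, deep learning, and setting off a new wave of research on machine learning theory, methods, and applications. This paper reviews the core principles and typical optimization algorithms of representative deep learning methods, revisits and discusses the connections and differences between deep learning and earlier machine learning methods, and offers a preliminary discussion of open problems in deep learning that require further study.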

14.

Applications and challenges of deep learning in oral medical imaging

赵阳李俊诚成博栋牛娜君王龙光高广谓施俊《中国图象图形学报》2024,29(3):586-607

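Oral medical imaging is an important tool for the clinical detection, screening, diagnosis, and treatment evaluation of oral diseases, and accurate analysis of oral images is essential for subsequent treatment planning. Conventional oral image analysis depends on the skill and experience of the clinician and suffers from low reading efficiency, low reproducibility, and a lack of quantitative analysis. Deep learning can automatically learn good feature representations from large-sample data and improve the efficiency and performance of machine learning tasks, and it is now widely used across medical image analysis and processing tasks. Deep learning-based oral medical image processing is a current research hotspot, but the inherent particularity and complexity of oral medicine, together with the typically small sample sizes of oral imaging data, pose new challenges for applying deep learning to the relevant tasks and scenarios. Starting from three modalities commonly used in oral medical imaging (2D X-ray images, 3D point cloud/mesh images, and cone-beam computed tomography images), this paper introduces the approaches to and current state of deep learning in oral medical image processing and analysis, analyzes the strengths and weaknesses of the algorithms and the problems and challenges facing the field, and discusses future research directions and potential clinical applications, with the aim of supporting smart oral healthcare.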

15.

A method for generating image adversarial examples based on local perturbation

王辛晨苏秋旸杨邓奇陈本辉李晓伟《信息安全学报》2022,7(6):94-104

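In recent years, with the research and development of artificial intelligence, deep learning has been widely applied and has performed well in natural language processing, computer vision, and other fields. In computer vision in particular, deep learning achieves very high accuracy in image recognition and image classification. However, a growing body of research shows that deep neural networks have security vulnerabilities, including adversarial example attacks. An adversarial example is a data sample to which a specific perturbation has been deliberately added; when it is passed to a trained model, the neural network outputs a result different from the expected one. In scenarios with high security requirements, adversarial examples clearly threaten applications that use deep neural networks. Research on adversarial examples has so far focused mainly on images: an image adversarial example is an image with special information added so that a neural-network-based image classifier misclassifies it. Existing image adversarial example methods mainly use global perturbation, that is, the perturbation is added to the entire image. Compared with global perturbation, local perturbation adds the generated perturbation to the non-key regions of the image, making the adversarial example stealthier and harder for the human eye to detect. This paper proposes a method for generating locally perturbed image adversarial examples. The method first uses the Yolo object detection method to identify the key regions in the image; then, building on the MIFGSM method and incorporating the idea from the Curls method of gradient descent followed by gradient ascent, it adds perturbation to the non-key regions to generate locally perturbed adversarial examples. Experimental results show that, with a reduced perturbation region, the method achieves the same attack success rate as global perturbation.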
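The core mechanism described in this abstract, a momentum-based iterative sign-gradient attack whose updates are confined to a chosen region, can be sketched on a toy problem. This is a simplified illustration, not the paper's method: the mask is supplied by hand rather than derived from a Yolo detector, the Curls-style descent-then-ascent schedule is omitted, and the "model" is a hypothetical linear loss whose gradient is known in closed form. All names are assumptions.

```python
def sign(value):
    """Return -1, 0, or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def masked_mifgsm(x0, grad_fn, mask, eps, steps=10, mu=1.0):
    """Momentum iterative FGSM restricted to positions where mask == 1."""
    alpha = eps / steps  # per-step size so the total stays within eps
    x = list(x0)
    momentum = [0.0] * len(x0)
    for _ in range(steps):
        grad = grad_fn(x)
        l1 = sum(abs(g) for g in grad) or 1.0  # avoid division by zero
        # Accumulate the L1-normalized gradient into the momentum term.
        momentum = [mu * m + g / l1 for m, g in zip(momentum, grad)]
        for i in range(len(x)):
            step = alpha * sign(momentum[i]) * mask[i]
            # Keep the total perturbation inside the eps-ball around x0.
            x[i] = min(x0[i] + eps, max(x0[i] - eps, x[i] + step))
    return x


# Toy setup (hypothetical): a linear loss, so its gradient is just the weights.
weights = [0.5, -1.0, 2.0, -0.25]


def loss(x):
    return sum(w * v for w, v in zip(weights, x))


def grad_fn(x):
    return list(weights)


x0 = [0.2, 0.4, 0.6, 0.8]
mask = [0, 1, 0, 1]  # 1 marks positions that may be perturbed
x_adv = masked_mifgsm(x0, grad_fn, mask, eps=0.1)
```

Positions with `mask == 0` are never modified, which mirrors the idea of leaving the key regions of the image untouched while the loss is still pushed upward through the remaining positions.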

16.

A lesion detection method for CT images based on data augmentation

马国祥严传波张志豪森干《计算机系统应用》2021,30(10):187-194

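Medical image-based computer-aided diagnosis is developing rapidly, but the limited volume of medical imaging data prevents deep learning-based modeling from exploring more complex models. Starting from data augmentation methods for medical CT images, this paper outlines the imaging characteristics of lesion images, categorizes and summarizes existing methods for lesion detection and segmentation tasks, and describes the current difficulties in medical image detection and segmentation. It summarizes lesion detection techniques, image data augmentation methods, and lesion detection methods based on generative adversarial networks (GANs). Finally, it compares deep learning-based data augmentation methods used in the medical field, namely the standard GAN, pix2pixGAN, and CycleGAN models, and discusses future trends in medical image analysis.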

17.

A survey of few-shot learning. Total citations: 1; self-citations: 0; citations by others: 1

赵凯琳靳小龙王元卓《软件学报》2021,32(2):349-369

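Few-shot learning aims to learn a model that solves a problem from a small number of samples. In recent years, with the trend toward training models on big data, machine learning and deep learning have succeeded in many fields. In many real-world application scenarios, however, samples or labeled samples are scarce, and labeling large numbers of unlabeled samples requires a great deal of manual effort. How to learn from a small number of samples has therefore become a problem that needs attention. This paper systematically reviews current few-shot learning...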

18.

A survey of visual defect detection for transmission line components

赵振兵蒋志钢李延旭戚银城翟永杰赵文清张珂《中国图象图形学报》2021,26(11):2545-2560

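As China's power grid continues to develop, the burden of front-line inspection work keeps growing and operation and maintenance costs keep rising, so intelligent detection of defects in transmission line components is increasingly important. At the same time, following the national "New Generation Artificial Intelligence Development Plan" and State Grid's rollout of "new digital infrastructure", technologies that apply artificial intelligence to power equipment operation and maintenance have developed rapidly, and accurate visual defect detection for transmission line components has become one of the key problems to be solved. Early component defect detection methods based on image processing and feature engineering place high demands on image quality and cannot really be applied in the complex real-world environments of transmission line work. With the rise of deep learning, deep learning-based detection models can effectively extract component targets and their defects from complex aerial images of transmission lines, saving the time needed to hand-design features while markedly improving performance, and they have gradually become the mainstream research approach. This paper first describes visual defect detection techniques for key transmission line components based on traditional algorithms, reviews the development of deep learning, and analyzes the strengths and weaknesses of deep learning in defect detection. Focusing on three important transmission line components (insulators, fittings, and bolts), it presents the state of research on their localization and defect detection. It then analyzes several key problems in transmission line component defect detection, including sample imbalance, small-object detection, and fine-grained detection, and discusses future trends for detection techniques that meet the complex-scene requirements of grid inspection tasks and fault diagnosis standards.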

19.

Research progress in deep facial expression recognition

李珊邓伟洪《中国图象图形学报》2020,25(11):2306-2320

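As facial expression recognition moves from controlled laboratory settings to challenging real-world environments, and with the rapid development of deep learning, deep neural networks that can learn discriminative features are increasingly applied to automatic facial expression recognition. Current deep facial expression recognition systems aim to solve two problems: (1) overfitting caused by a lack of sufficient training data; and (2) interference from expression-unrelated variables in real-world environments, such as illumination, head pose, and identity. This paper first summarizes the state of research on deep facial expression recognition methods over the past decade and the development of related facial expression databases. It then divides current deep learning-based methods into two categories, static and dynamic facial expression recognition, and introduces and reviews each. For state-of-the-art deep expression recognition algorithms, it compares their performance on common expression databases and analyzes the strengths and weaknesses of each type of algorithm in detail. Finally, it summarizes future research directions, opportunities, and challenges: because expressions are essentially dynamic activities of facial muscle movement, deep expression recognition networks based on dynamic sequences often achieve better results than static ones. In addition, combining other expression models, such as the facial action unit model, and other multimedia modalities, such as audio and human physiological signals, can extend expression recognition to scenarios of greater practical value.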

20.

An image data expansion method based on generative adversarial networks

王海文邱晓晖《计算机技术与发展》2020,(3):51-56

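Convolutional neural networks (CNNs) are prone to overfitting when the dataset (training set) is small. To address this, a deep convolutional generative adversarial network that introduces the Selu activation function and combines it with a Dropout method with parameter normalization is proposed and implemented for image augmentation. It generates images to expand the dataset, thereby addressing the poor model representation capability and tendency to overfit during training that result from insufficient image data in deep learning image classification research. Traditional image augmentation methods that expand an image set by cropping, rotation, interpolation, and distortion transformation can often produce only images that are uniform in style or even have a low signal-to-noise ratio. In contrast, images generated by the generative adversarial network differ clearly from the originals: the method yields more images with richer content and higher quality, and it also improves the efficiency of dataset expansion. Simulation experiments show that the generative adversarial network produces images of relatively high quality and effectively expands the dataset.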