Similar articles
 20 similar articles found; search time 265 ms
1.
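A survey of semantic segmentation of UAV aerial images   Total citations: 1 (self-citations: 0, citations by others: 1)
With the rapid development of unmanned aerial vehicle (UAV) technology, UAVs have attracted wide attention in both research and industrial applications. Images and video are an important way for a UAV to perceive its surroundings. Image semantic segmentation is a research hotspot in computer vision and is widely used in scenarios such as autonomous driving and intelligent robots. Semantic segmentation of UAV aerial images applies semantic segmentation techniques to UAV aerial imagery so that the UAV gains intelligent perception of scene targets. This paper introduces the development of semantic segmentation techniques and UAV applications, relevant UAV aerial datasets, the characteristics of UAV aerial images, and commonly used evaluation metrics for semantic segmentation. It then presents semantic segmentation methods tailored to the characteristics of UAV aerial imagery, covering small objects, real-time model performance, and multi-scale integration. Finally, it surveys applications of UAV semantic segmentation, including line detection, agriculture, and building extraction, and analyzes future trends and challenges.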
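The abstract mentions commonly used evaluation metrics for semantic segmentation without naming them. One metric widely used in the field is mean intersection-over-union (mIoU); the sketch below assumes that metric purely for illustration, and the class names and toy label maps are hypothetical.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes that occur in pred or target."""
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    cm = np.bincount(target * num_classes + pred, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)
    intersection = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - intersection
    present = union > 0  # skip classes absent from both maps
    return float((intersection[present] / union[present]).mean())

# Toy 2x3 label maps with hypothetical classes 0 = background, 1 = road, 2 = vehicle.
target = np.array([[0, 0, 1], [1, 2, 2]])
pred = np.array([[0, 1, 1], [1, 2, 0]])
score = mean_iou(pred, target, num_classes=3)  # per-class IoU: 1/3, 2/3, 1/2
```

Averaging per class rather than per pixel keeps small classes, such as the small objects typical of aerial views, from being swamped by large background regions.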

2.
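Supported by artificial intelligence, UAVs have acquired preliminary intelligent perception and have shown efficient, flexible data collection in practical applications. As a key core technology, object detection from the UAV perspective plays an irreplaceable role in many fields and is of significant research value. To further present research progress in this area, this paper comprehensively summarizes object detection algorithms for the UAV perspective and classifies, analyzes, and compares existing algorithms. 1) It introduces the concept of object detection from the UAV perspective and summarizes the five imbalance challenges it faces: object scale, spatial distribution, sample quantity, class semantics, and optimization objectives. Building on an introduction of existing methods, it also collects and describes applications of these detection algorithms in practical scenarios such as traffic monitoring, power inspection, crop analysis, and disaster rescue. 2) It focuses on methods that improve detection performance through data augmentation strategies, multi-scale feature fusion, region-focusing strategies, multi-task learning, and model lightweighting, summarizes their strengths and weaknesses, and analyzes how they relate to the existing challenges. 3) It comprehensively introduces UAV-perspective object detection datasets and reports the performance of existing algorithms on two commonly used public datasets. 4) It looks ahead to future directions for object detection from the UAV perspective.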

3.
For construction safety and health, continuous monitoring of unsafe conditions and actions is essential in order to eliminate potential hazards in a timely manner. As a robust and automated means of field observation, computer vision techniques have been applied to extract safety-related information from site images and videos, and are regarded as effective solutions complementary to current time-consuming and unreliable manual observational practices. Although some research efforts have been directed toward computer vision-based safety and health monitoring, its application in real practice remains premature due to a number of technical issues and research challenges in terms of reliability, accuracy, and applicability. This paper thus reviews previous attempts in construction applications from both technical and practical perspectives in order to understand the current status of computer vision techniques, which in turn suggests the direction of future research in the field of computer vision-based safety and health monitoring. Specifically, this paper categorizes previous studies into three groups (object detection, object tracking, and action recognition) based on the types of information required to evaluate unsafe conditions and acts. The results demonstrate that major research challenges include comprehensive scene understanding, tracking accuracy that varies with camera position, and action recognition of multiple pieces of equipment and workers. In addition, we identified several practical issues, including a lack of task-specific and quantifiable metrics to evaluate the extracted information in a safety context, technical obstacles due to dynamic conditions at construction sites, and privacy issues. These challenges indicate a need for further research in these areas. Accordingly, this paper provides researchers with insights into advancing knowledge and techniques for computer vision-based safety and health monitoring, and offers fresh opportunities and considerations to practitioners in understanding and adopting the techniques.

4.
In recent years, the computer graphics and computer vision communities have devoted significant attention to research based on Internet visual media resources. The huge number of images and videos continually uploaded by millions of people has stimulated a variety of visual media creation and editing applications, while also posing serious challenges for retrieval, organization, and utilization. This article surveys recent research on processing large collections of images and video, including work on analysis, manipulation, and synthesis. It discusses the problems involved and suggests possible future directions in this emerging research area.

5.
Several papers have addressed ellipse detection as a first step for computer vision applications, but most of the proposed solutions are too slow to be applied in real time on large images or with limited hardware resources. This paper presents a novel algorithm for fast and effective ellipse detection and demonstrates its superior speed on large and challenging datasets. The proposed algorithm relies on an innovative strategy for selecting arcs that are candidates to form ellipses and on the use of the Hough transform to estimate parameters in a decomposed space. The final aim of this solution is to serve as a building block for a new generation of smartphone applications that need fast and accurate ellipse detection even with limited computational resources.

6.
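A survey of audio-visual deepfake detection techniques   Total citations: 1 (self-citations: 0, citations by others: 1)
Deep learning is widely applied in fields such as natural language processing, computer vision, and autonomous driving, and has led a new wave of artificial intelligence. However, deep learning has also been used to build technologies that pose potential threats to national security, social stability, and personal privacy. One example is deepfake technology, which has recently drawn worldwide attention and can generate realistic fake images, audio, and video. This paper introduces the background of deepfakes and the principles by which deepfake content is generated, summarizes and analyzes detection methods and datasets for different types of forged content (images, video, audio, and so on), and finally looks ahead to future research directions and challenges in deepfake detection and defense.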

7.
This work addresses the development of a computational model of visual attention to perform the automatic summarization of digital videos from television archives. Although the television system represents one of the most fascinating media phenomena ever created, we still observe the absence of effective solutions for content-based information retrieval from video recordings of programs produced by this media universe. This fact relates to the high complexity of the content-based video retrieval problem, which involves several challenges, among which we may highlight the usual demand for video summaries to facilitate indexing, browsing, and retrieval operations. To achieve this goal, we propose a new computational visual attention model, inspired by the human visual system and based on computer vision methods (face detection, motion estimation, and saliency map computation), to estimate static video abstracts, that is, collections of salient images or key frames extracted from the original videos. Experimental results with videos from the Open Video Project show that our approach represents an effective solution to the problem of automatic video summarization, producing video summaries of similar quality to the ground truth manually created by a group of 50 users.

8.
9.
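UAV video, captured by aerial photography from unmanned aerial vehicles, is an important class of video resource that is widely used for monitoring ground targets. However, UAV video has a wide field of view and is not shot with specific targets in mind, so it contains a large amount of spatio-temporal redundancy, and traditional video interaction methods are very inefficient. To address this, a multi-scale spiral summary for UAV video is proposed. First, based on the YOLOv3 algorithm, a model is trained to detect pedestrians, vehicles, and other targets from the UAV perspective. Then, a keyframe-based video object detection algorithm is proposed: an improved color-feature-based keyframe extraction algorithm extracts keyframes that cover the key information of the video, and the detection model is applied to these keyframes to efficiently obtain detection results for the whole video. Next, the corresponding key regions are extracted from the keyframes to serve as the presentation units of the summary, and the units are presented one by one in a spiral from the inside outward, supplemented by keyframe-based video positioning and scale zooming. Finally, novel interaction tools such as sketch annotation, a target-distribution spiral, and double-spiral playback are developed to meet users' potential needs and jointly enable efficient interaction with UAV video.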
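The abstract does not describe how its improved color-feature keyframe extraction works, so the following is only a generic sketch of the underlying idea: keep a frame as a keyframe when its color histogram differs sufficiently from the last keyframe. The function names, the bin count, and the threshold are assumptions made for illustration, and the YOLOv3 detection step is not included.

```python
import numpy as np

def color_histogram(frame, bins=8):
    """Normalized per-channel histogram of an H x W x 3 uint8 frame."""
    hists = [np.histogram(frame[..., c], bins=bins, range=(0, 256))[0] for c in range(3)]
    hist = np.concatenate(hists).astype(np.float64)
    return hist / hist.sum()

def extract_keyframes(frames, threshold=0.3):
    """Return indices of frames whose histogram differs enough from the last keyframe."""
    keyframes = []
    reference = None
    for index, frame in enumerate(frames):
        hist = color_histogram(frame)
        # L1 distance between normalized histograms lies in [0, 2].
        if reference is None or np.abs(hist - reference).sum() > threshold:
            keyframes.append(index)
            reference = hist
    return keyframes

# Synthetic clip: three dark frames, then three bright frames.
dark = np.full((4, 4, 3), 10, dtype=np.uint8)
bright = np.full((4, 4, 3), 240, dtype=np.uint8)
frames = [dark, dark, dark, bright, bright, bright]
keyframe_indices = extract_keyframes(frames)  # → [0, 3]
```

In a pipeline like the one described, a detector would then run only on the frames at `keyframe_indices` rather than on every frame, which is where the efficiency gain comes from.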

10.
11.
Images captured in underwater environments usually exhibit complex illumination and severe water turbidity, and often show objects with large variations in pose and spatial location, all of which pose challenges to underwater vision research. In this paper, an extended underwater image database for salient-object detection or saliency detection is introduced. This database is called the Marine Underwater Environment Database (MUED), and it contains 8600 underwater images of 430 individual groups of conspicuous objects with complex backgrounds, multiple salient objects, and complicated variations in pose, spatial location, illumination, turbidity of water, etc. The publicly available MUED provides researchers in relevant industrial and academic fields with underwater images under different types of variations. Manually labeled ground-truth information is also included in the database, so as to facilitate research on more applicable and robust methods for both underwater image processing and underwater computer vision. Owing to its scale, accuracy, diversity, and background structure, MUED can not only be widely used to assess and evaluate the performance of state-of-the-art salient-object detection and saliency-detection algorithms for general images, but can also particularly benefit the development of underwater vision technology and offer opportunities to researchers in the underwater vision community and beyond.

12.
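Research progress and prospects of image texture classification methods   Total citations: 4 (self-citations: 0, citations by others: 4)
Texture classification is an important fundamental problem in computer vision and pattern recognition, and it underlies other vision tasks such as image segmentation, object recognition, and scene understanding. Starting from the basic definition of the texture classification problem, this paper first describes the difficulties and challenges in texture classification research. It then comprehensively reviews and summarizes the typical databases for texture classification. Next, it categorizes and summarizes the development and current state of recent texture feature extraction methods, and describes and comments on the mainstream methods in detail. Finally, it reflects on and discusses directions for the development of texture classification.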

13.
This paper presents a state-of-the-art review of feature extraction for soccer video summarization research. All existing approaches to event detection, video summarization based on the video stream, and the application of text sources in event detection have been surveyed. With regard to the current challenges for automatic and real-time provision of summary videos, different computer vision approaches are discussed and compared. Audio and video feature extraction methods and their combination with textual methods have been investigated. Available commercial products are presented to better clarify the boundaries in this domain, and future directions for improving existing systems are suggested.

14.
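A batch 3D reconstruction method for UAV images based on auxiliary information   Total citations: 6 (self-citations: 0, citations by others: 6)
Guo Fusheng (郭复胜), Gao Wei (高伟). Acta Automatica Sinica (《自动化学报》), 2013, 39(6): 834-845
With the opening of China's low-altitude airspace to civilian use, applications of unmanned aerial vehicles (UAVs) represent a huge potential market. At present, fully automatic processing of images acquired by lightweight UAVs is a bottleneck technology that urgently needs to be solved. This paper explores how to apply 3D reconstruction techniques, which have achieved great success in the video and image domains in recent years, to UAV image processing, in order to perform fully automatic large-scene 3D reconstruction from UAV images. The paper first describes the problems that Bundler, a classic incremental 3D reconstruction method, encounters in UAV image processing. Then, by analyzing the characteristics of the auxiliary information of UAV images, it proposes a robust 3D reconstruction method for UAV images within a batch reconstruction framework. Experiments on multiple sets of UAV images show that the proposed method performs well in terms of algorithm robustness, reconstruction efficiency, and reconstruction accuracy.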

15.
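Diagnosis from medical images is the basis of many clinical decisions, and intelligent analysis of medical images is an important component of medical artificial intelligence. At the same time, with the rise and spread of more and more 3D spatial sensors, 3D computer vision is becoming increasingly important. This paper focuses on the intersection of medical image analysis and 3D computer vision, namely medical 3D computer vision, or medical 3D vision. It divides medical 3D computer vision systems into three levels (tasks, data, and representations) and presents research progress at each level with reference to the latest literature. At the task level, it introduces classification, segmentation, detection, registration, and imaging reconstruction in medical 3D computer vision, along with the roles and characteristics of these tasks in clinical diagnosis and medical image analysis. At the data level, it briefly introduces the most important modalities of medical 3D data, including computed tomography (CT), magnetic resonance imaging (MRI), and positron emission tomography (PET), as well as other data formats proposed in emerging research. On this basis, it compiles important research datasets in medical 3D computer vision and annotates their data modalities and main vision tasks. At the representation level, it introduces and discusses the advantages and disadvantages of 2D networks, 3D networks, and hybrid networks for representation learning on medical 3D data. In addition, given the small-data problem that is pervasive in medical imaging, it discusses in particular the pre-training problem in representation learning for medical 3D data. Finally, it summarizes the current state of research in medical 3D computer vision and points out the research challenges, problems, and directions that remain open.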

16.
Geo-tagging is a fast-emerging trend in digital photography and community photo sharing. The presence of geographically relevant metadata with images and videos has opened up interesting research avenues within the multimedia and computer vision domains. In this paper, we survey geo-tagging related research within the context of multimedia and along three dimensions: (1) modalities from which geographical information can be extracted, (2) applications that can benefit from the use of geographical information, and (3) the interplay between modalities and applications. Our survey introduces research problems and discusses significant approaches. We discuss the nature of different modalities and lay out factors that are expected to govern the choices with respect to multimedia and vision applications. Finally, we discuss future research directions in this field.

17.
To ensure the safety and serviceability of civil infrastructure, it is essential to visually inspect and assess its physical and functional condition. This review paper presents the current state of practice in assessing the visual condition of vertical and horizontal civil infrastructure, in particular reinforced concrete bridges, precast concrete tunnels, underground concrete pipes, and asphalt pavements. Since the rate of creation and deployment of computer vision methods for civil engineering applications has been increasing exponentially, the main part of the paper presents a comprehensive synthesis of the state of the art in computer vision-based defect detection and condition assessment related to concrete and asphalt civil infrastructure. Finally, the current achievements and limitations of existing methods, as well as open research challenges, are outlined to assist both the civil engineering and the computer science research communities in setting an agenda for future research.

18.
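Video quality assessment (VQA) uses algorithmic models to evaluate distorted videos, taking the subjective quality judgments of human viewers as the reference. Traditional assessment methods struggle to make objective results agree with subjective results. Deep-learning-based video quality assessment methods require no handcrafted features and can perform assessment through autonomous model learning. They are of great significance for monitoring and evaluating video quality and have become one of the research hotspots in computer vision. This paper first introduces the research background and main research methods of video quality assessment. Second, it introduces deep-learning-based objective quality assessment methods of both the full-reference and no-reference types, and classifies and compares no-reference methods according to the convolutional neural network models they use. It then introduces the databases and performance metrics relevant to video quality assessment algorithms and compares algorithm performance. Finally, it summarizes the problems in current video quality assessment research and looks ahead to the challenges and future directions of the field.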

19.
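Deep learning has achieved great success in computer vision, surpassing many traditional methods. In recent years, however, deep learning techniques have been abused to produce fake videos, so that forged videos, typified by Deepfakes, have flooded the Internet. This deepfake technology tampers with or replaces the facial information in an original video and synthesizes fake speech to produce pornographic films, fake news, political rumors, and the like. To eliminate the negative effects of such forgery techniques, many researchers have studied the identification of fake videos in depth and proposed a series of detection methods to help institutions and communities identify such forged videos. Nevertheless, current detection techniques still have many limitations, such as dependence on data from a specific distribution or on a specific compression rate, and they lag far behind fake video generation techniques. Moreover, different researchers approach the problem from different angles, and the datasets and evaluation metrics they use are not unified. To date, the academic community still lacks a unified understanding of deepfake generation and detection, and the architecture of research on deepfake generation and detection remains unclear. In this survey, we review the development of deepfake generation and detection techniques, and systematically summarize and scientifically categorize existing research. Finally, we discuss the social risks brought by the spread of deepfake technology, analyze the many limitations of detection techniques, and explore the challenges and potential research directions for detection, aiming to guide future researchers in further advancing the development and deployment of deepfake detection.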

20.
Background modeling and subtraction is a natural technique for object detection in videos captured by a static camera, and also a critical preprocessing step in various high-level computer vision applications. However, there have not been many studies concerning useful features and binary segmentation algorithms for this problem. We propose a pixelwise background modeling and subtraction technique using multiple features, where generative and discriminative techniques are combined for classification. In our algorithm, color, gradient, and Haar-like features are integrated to handle spatio-temporal variations for each pixel. A pixelwise generative background model is obtained for each feature efficiently and effectively by Kernel Density Approximation (KDA). Background subtraction is performed in a discriminative manner using a Support Vector Machine (SVM) over background likelihood vectors for a set of features. The proposed algorithm is robust to shadows, illumination changes, and spatial variations of the background. We compare the performance of the algorithm with other density-based methods using several different feature combinations and modeling techniques, both quantitatively and qualitatively.
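To make the idea of a pixelwise generative background model concrete, here is a deliberately simplified sketch. It is not the method of the abstract: it swaps KDA for a plain Gaussian kernel density estimate over a single grayscale feature, and it replaces the SVM with a fixed likelihood threshold. The function names, the bandwidth, and the threshold are all assumptions.

```python
import numpy as np

def background_likelihood(history, frame, bandwidth=10.0):
    """Per-pixel Gaussian kernel density estimate of the current intensity.

    history: (T, H, W) array of past grayscale frames; frame: (H, W) array.
    """
    history = np.asarray(history, dtype=np.float64)
    frame = np.asarray(frame, dtype=np.float64)
    diff = (frame[None, :, :] - history) / bandwidth
    kernels = np.exp(-0.5 * diff ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=0)

def foreground_mask(history, frame, min_likelihood=1e-3):
    """Mark pixels whose background likelihood falls below a fixed threshold."""
    return background_likelihood(history, frame) < min_likelihood

# Synthetic example: a static background near intensity 100, one changed pixel.
rng = np.random.default_rng(0)
history = 100.0 + rng.normal(0.0, 2.0, size=(20, 3, 3))
frame = np.full((3, 3), 100.0)
frame[1, 1] = 200.0
mask = foreground_mask(history, frame)  # True only at the changed pixel
```

In the approach the abstract describes, one such likelihood would be computed per feature (color, gradient, Haar-like), and the resulting likelihood vector would be classified by an SVM instead of being compared against a single threshold.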
