Similar Literature
 A total of 20 similar articles were found (search time: 15 ms).
1.
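Scene depth estimation is a classic problem in computer vision and an important step in applications such as 3D reconstruction and image synthesis. Monocular depth estimation based on deep learning has developed rapidly, and a variety of network architectures have been proposed. This paper surveys the latest progress in deep-learning-based monocular depth estimation and reviews the development of supervised and unsupervised methods. It focuses on the optimization strategies for monocular depth estimation and how they are realized in deep network architectures. Supervised methods are divided into five categories: multi-scale feature fusion methods, methods combined with conditional random fields (CRF), methods based on ordinal relations, methods that incorporate multiple kinds of image information, and other methods. Unsupervised methods are likewise divided into five categories: stereo-vision-based methods, structure-from-motion (SfM)-based methods, methods combined with adversarial networks, methods based on ordinal relations, and methods that incorporate uncertainty. The paper also introduces the datasets and evaluation metrics commonly used for monocular depth estimation, and discusses the current state and open challenges of deep-learning-based monocular depth estimation in terms of accuracy, generalization, application scenarios, and uncertainty in unsupervised networks, providing a fairly comprehensive reference for researchers in related fields.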

2.
This paper concerns the exploration of a natural environment by a mobile robot equipped with both a video color camera and a stereo-vision system. We focus on the interest of such a multi-sensory system to deal with the navigation of a robot in an a priori unknown environment, including (1) the incremental construction of a landmark-based model, and the use of these landmarks for (2) the 3-D localization of the mobile robot and for (3) a sensor-based navigation mode. For robot localization, a slow process and a fast one are simultaneously executed during the robot motions. In the modeling process (currently 0.1 Hz), the global landmark-based model is incrementally built and the robot situation can be estimated from discriminant landmarks selected amongst the detected objects in the range data. In the tracking process (currently 4 Hz), selected landmarks are tracked in the visual data; the tracking results are used to simplify the matching between landmarks in the modeling process. Finally, a sensor-based visual navigation mode, based on the same landmark selection and tracking, is also presented; in order to navigate during a long robot motion, different landmarks (targets) can be selected as a sequence of sub-goals that the robot must successively reach.

3.
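Objective: Remote photoplethysmography (rPPG) is a video-based, non-contact method of measuring heart rate that tracks facial skin regions and estimates the heart rate from the weak periodic color variations extracted from them. The Dlib library, whose facial landmark detector is trained with cascaded regression trees, locates the face contour quickly and accurately and is increasingly used by researchers to track the skin region of interest (ROI). In practice, however, the landmarks jitter irregularly, and existing studies do not account for subject motion, so the color signal is extracted inaccurately and heart rate estimation suffers. To overcome these shortcomings, a Dlib-based tracking method that resists landmark jitter and motion-induced shaking is proposed. Method: The method consists of two main steps. First, the difference between the landmarks of two frames is compared against a threshold: if the landmarks are similar, the existing landmarks are kept; otherwise the current frame's landmarks are used, which suppresses jitter. Second, to handle motion, a rotation angle is computed from the midpoints of the left-eye and right-eye landmarks and used to rectify the tilted face, so that the ROI stays consistent during motion. Results: Tracking performance in rPPG was evaluated with the signal-to-noise ratio (SNR), mean absolute error (MAE), and root mean squared error (RMSE). In tests on the UBFC-RPPG (Univ. Bourgogne Franche-Comté Remote PhotoPlethysmoGraphy) and PURE (Pulse Rate Detection Dataset) datasets, compared with Dlib, on UBFC-RPPG the SNR increased by about 0.425 dB, the MAE increased by 0.2915 bpm, and the RMSE decreased by 0.6453 bpm; on PURE the SNR decreased by 0.0411 dB, the MAE decreased by 0.0652 bpm, and the RMSE decreased by 0.2718 bpm. Conclusion: Compared with Dlib, the proposed method effectively improves the stability of the tracking box, tracks the same ROI both at rest and in motion, and is suitable for rPPG applications.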
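
The two tracking steps described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the use of mean Euclidean displacement as the difference measure, and the default threshold of 2 pixels are all assumptions made for illustration.

```python
import math


def stabilize_landmarks(prev, curr, threshold=2.0):
    """Keep the earlier landmarks when the frame-to-frame change is small.

    prev, curr: lists of (x, y) landmark coordinates for two frames.
    threshold: assumed jitter tolerance in pixels (hypothetical value).
    """
    if prev is None or len(prev) != len(curr):
        return list(curr)
    # Assumed difference measure: mean Euclidean displacement per landmark.
    mean_shift = sum(
        math.hypot(cx - px, cy - py) for (px, py), (cx, cy) in zip(prev, curr)
    ) / len(curr)
    return list(prev) if mean_shift < threshold else list(curr)


def roll_angle_deg(left_eye_pts, right_eye_pts):
    """Angle (degrees) of the line joining the two eye midpoints."""

    def midpoint(pts):
        return (
            sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts),
        )

    lx, ly = midpoint(left_eye_pts)
    rx, ry = midpoint(right_eye_pts)
    return math.degrees(math.atan2(ry - ly, rx - lx))
```

Rotating the frame by the negative of `roll_angle_deg` about the face centre would level the eyes before the ROI is cropped; that rotation step is left out here.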

4.
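A positioning method is proposed in which monocular vision with artificial landmarks aids an inertial navigation system (INS). Artificial landmarks are first designed, each pre-placed landmark is photographed with a camera, and the camera position and attitude at each shot are recorded to build a visual landmark library. During INS-based positioning, landmarks are extracted from the images captured by the monocular camera and matched against the corresponding landmarks in the library to estimate the current camera position and attitude; a Kalman filter then fuses the position estimated from visual matching with the INS output. Experimental results show that the mean error of the conventional dead-reckoning method is 0.715 m, whereas that of the proposed integrated navigation method is 0.154 m, so the method effectively improves the accuracy of inertial navigation positioning.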
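
The fusion step can be sketched with a scalar Kalman filter. This is a simplified illustration under stated assumptions, not the filter used in the paper: the state is a single position coordinate, and the function names and noise parameters (`q`, `r`) are hypothetical.

```python
def dead_reckon(x, p, velocity, dt, q):
    """Prediction: propagate position with the INS velocity.

    x, p: current position estimate and its variance.
    q: assumed process-noise variance added per step (models INS drift).
    """
    return x + velocity * dt, p + q


def kalman_fuse(x_pred, p_pred, z, r):
    """Update: correct the INS prediction with a vision-based position fix.

    z, r: position from landmark matching and its assumed variance.
    """
    k = p_pred / (p_pred + r)          # Kalman gain
    x = x_pred + k * (z - x_pred)      # corrected position
    p = (1.0 - k) * p_pred             # reduced uncertainty
    return x, p
```

When no landmark is matched in a frame, only `dead_reckon` would run and the variance would grow until the next visual fix.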

5.
We describe a pipeline for structure-from-motion (SfM) with mixed camera types, namely omnidirectional and perspective cameras. For the steps of this pipeline, we propose new approaches or adapt the existing perspective camera methods to make the pipeline effective and automatic. We model our cameras of different types with the sphere camera model. To match feature points, we describe a preprocessing algorithm which significantly increases scale invariant feature transform (SIFT) matching performance for hybrid image pairs. With this approach, automatic point matching between omnidirectional and perspective images is achieved. We robustly estimate the hybrid fundamental matrix with the obtained point correspondences. We introduce the normalization matrices for lifted coordinates so that normalization and denormalization can be performed linearly for omnidirectional images. We evaluate the alternatives of estimating camera poses in hybrid pairs. A weighting strategy is proposed for iterative linear triangulation which improves the structure estimation accuracy. Following the addition of multiple perspective and omnidirectional images to the structure, we perform sparse bundle adjustment on the estimated structure by adapting it to use the sphere camera model. Demonstrations of the end-to-end multi-view SfM pipeline with the real images of mixed camera types are presented.

6.
This article describes a landmark-based navigation technique for a mobile robot. Robot position estimation is achieved by using a camera and a navigational landmark pattern, which consists of simple geometrical patterns. The Modified Elliptical Hough Transform (MEHT) is developed for detecting and measuring the projection of the landmark in the camera's image space. Robustness of this approach is demonstrated by studying the cases of noisy image data and partial occlusion of the landmark pattern. Error analysis of MEHT is performed to provide more understanding of the effects of applying elliptical approximation to the projection of a circular pattern. © 1997 John Wiley & Sons, Inc.

7.
An algorithm for accurate localization of facial landmarks coupled with a head pose estimation from a single monocular image is proposed. The algorithm is formulated as an optimization problem where the sum of individual landmark scoring functions is maximized with respect to the camera pose by fitting a parametric 3D shape model. The landmark scoring functions are trained by a structured output SVM classifier that takes a distance to the true landmark position into account when learning. The optimization criterion is non-convex and we propose a robust initialization scheme which employs a global method to detect a raw but reliable initial landmark position. Self-occlusions causing landmark invisibility are handled explicitly by excluding the corresponding contributions from the data term. This allows the algorithm to operate correctly for a large range of viewing angles. Experiments on standard “in-the-wild” datasets demonstrate that the proposed algorithm outperforms several state-of-the-art landmark detectors, especially for non-frontal face images. The algorithm achieves an average relative landmark localization error below 10% of the interocular distance in 98.3% of the 300-W dataset test images.

8.
This paper is centered around landmark detection, tracking, and matching for visual simultaneous localization and mapping using a monocular vision system with active gaze control. We present a system that specializes in creating and maintaining a sparse set of landmarks based on a biologically motivated feature-selection strategy. A visual attention system detects salient features that are highly discriminative and ideal candidates for visual landmarks that are easy to redetect. Features are tracked over several frames to determine stable landmarks and to estimate their 3-D position in the environment. Matching of current landmarks to database entries enables loop closing. Active gaze control allows us to overcome some of the limitations of using a monocular vision system with a relatively small field of view. It supports 1) the tracking of landmarks that enable a better pose estimation, 2) the exploration of regions without landmarks to obtain a better distribution of landmarks in the environment, and 3) the active redetection of landmarks to enable loop closing in situations in which a fixed camera fails to close the loop. Several real-world experiments show that accurate pose estimation is obtained with the presented system and that active camera control outperforms the passive approach.

9.
In the autonomous unmanned helicopter landing problem, the position of the unmanned helicopter relative to the landmark is very important. A camera carried on the unmanned helicopter can capture an image of the landmark. In earlier research, it was reported that the camera position could be estimated by features extracted from the landmark image. However, it is necessary that the landmark image should be complete, or with only slight deficiencies, in order for this estimation process to be possible. In this article, we report on an innovative design for an estimation made from a camera position giving an incomplete single image of the landmark. An adaptive neuro-fuzzy inference system (ANFIS) is used to construct the mapping relation between the features of complete and incomplete landmark images. It will be verified that it is possible to estimate the camera position from a landmark image more than half of which is defective via the proposed method.

10.
Localisation and mapping with an omnidirectional camera becomes more difficult as the landmark appearances change dramatically in the omnidirectional image. With conventional techniques, it is difficult to match the features of the landmark with the template. We present a novel robot simultaneous localisation and mapping (SLAM) algorithm with an omnidirectional camera, which uses incremental landmark appearance learning to provide posterior probability distribution for estimating the robot pose under a particle filtering framework. The major contribution of our work is to represent the posterior estimation of the robot pose by incremental probabilistic principal component analysis, which can be naturally incorporated into the particle filtering algorithm for robot SLAM. Moreover, the innovative method of this article allows the adoption of the severely distorted landmark appearances viewed with an omnidirectional camera for robot SLAM. The experimental results demonstrate that the localisation error is less than 1 cm in an indoor environment using five landmarks, and the location of the landmark appearances can be estimated within 5 pixels deviation from the ground truth in the omnidirectional image at a fairly fast speed.

11.
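Objective: A growing number of applications, such as robot navigation and the design and modeling of virtual scenes in films and games, rely on accurate and fast observation and analysis of scene depth images. Direct depth measurement devices such as time-of-flight depth cameras can acquire scene depth images in real time, but because of hardware limitations the captured depth images have low resolution and cannot meet practical needs. Obtaining disparity, and hence a depth image, by matching a left-right stereo image pair with a stereo matching algorithm is a classic computer vision approach; however, occlusion between the left and right images and textureless regions prevent stereo matching from finding correct disparities in those regions, which limits stereo matching in practice. Method: Combining the strengths of direct depth measurement devices such as time-of-flight depth cameras and of stereo matching, a new depth image reconstruction method is proposed. First, the depth image captured by the direct depth measurement device is used to construct adaptive local matching weights that constrain local-window stereo matching between the left and right images, yielding a stereo-matching-based depth image. Next, the captured depth image and the matched depth image are fused on the basis of the left-right check principle. A local weighted filtering algorithm is then proposed to further improve the quality of the reconstructed depth image. Results: Experiments show that, in both objective metrics and visual quality, the proposed depth image reconstruction algorithm outperforms other stereo matching algorithms. In the error-rate comparison, the proposed algorithm improves on conventional stereo matching by about 10% in depth reconstruction error rate; in the peak signal-to-noise ratio (PSNR) experiments it achieves a gain of about 10 dB. Conclusion: By combining a high-resolution left-right stereo pair with an initial low-resolution depth image, the proposed method effectively reconstructs high-quality, high-resolution depth images.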
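
The left-right check that underlies the fusion step can be sketched on a single scanline. The fusion rule below (keep the stereo disparity where the check passes, otherwise fall back to the value from the depth sensor) is an assumption for illustration, as are the function names and the tolerance parameter; the abstract does not specify the actual rule.

```python
def left_right_consistent(disp_left, disp_right, x, tol=0):
    """Left-right check for pixel x on one scanline.

    A left-image pixel x with disparity d corresponds to right-image
    pixel x - d; the match is accepted if the right disparity map
    reports (almost) the same d there.
    """
    d = disp_left[x]
    xr = x - d
    if xr < 0 or xr >= len(disp_right):
        return False
    return abs(disp_right[xr] - d) <= tol


def fuse_scanline(disp_left, disp_right, sensor_disp, tol=0):
    """Assumed fusion rule: stereo where consistent, sensor value elsewhere."""
    return [
        disp_left[x]
        if left_right_consistent(disp_left, disp_right, x, tol)
        else sensor_disp[x]
        for x in range(len(disp_left))
    ]
```

Here `sensor_disp` stands for the time-of-flight depth already converted to disparity and upsampled to the stereo resolution, which is a preprocessing step this sketch takes for granted.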

12.
Many insects and animals exploit their own navigation systems to navigate in space. Biologically-inspired methods have been introduced for landmark-based navigation algorithms of a mobile robot. The methods determine the movement direction based on a home snapshot image and another snapshot from the current position. In this paper, we suggest a new landmark-based matching method for robotic homing navigation that first computes the distance to each landmark based on ego-motion and estimates the landmark arrangement in the snapshot image. Then, landmark vectors are used to localize the robotic agent in the environment and to choose the appropriate direction to return home. As a result, this method has a higher success rate for returning home from an arbitrary position than do the conventional image-matching algorithms.

13.
Matching two-dimensional electrophoresis (2-DE) gel images is typically a bottleneck in automated protein analysis, and image distortion and experimental variation reduce the matching accuracy. However, conventional matching schemes only compare two complete images, and landmark selection and registration procedures are rather time-consuming. This work presents a novel and robust Maximum Relation Spanning Tree (MaxRST) algorithm, in which an autonomous sub-image matching method does not require registering or manual selection of landmarks. The 2D gel images are represented graphically. Image features are then quantitatively extracted regardless of image size. Similarity between a sub-image and a large image is then determined based on a Gaussian similarity measurement inspired by the fuzzy method, thereby increasing the accuracy of fractional matching. The proposed autonomous matching algorithm achieves an accuracy of up to 97.29% when matching 627 2-DE gel test images. In addition to accommodating image rotation, reversals, shape deformation and intensity changes, the proposed algorithm effectively addresses the sub-image mapping problem and was analyzed thoroughly using a large dataset containing 4629 images. The contributions of this work are twofold. First, this work presents a novel MaxRST strategy and autonomous matching method that does not require manual landmark selection. Second, the proposed method, which extends 2-DE gel matching to a query sub-image and a database containing large sets of images, can be adopted for mapping and locating, and to compare small gel images with large gel images with robustness and efficiency.

14.
Learning to select distinctive landmarks for mobile robot navigation (cited 1 time in total; self-citations: 0; citations by others: 1)
In landmark-based navigation systems for mobile robots, sensory perceptions (e.g., laser or sonar scans) are used to identify the robot’s current location or to construct internal representations, maps, of the robot’s environment. Being based on an external frame of reference (which is not subject to incorrigible drift errors such as those occurring in odometry-based systems), landmark-based robot navigation systems are now widely used in mobile robot applications. The problem that has attracted most attention to date in landmark-based navigation research is the question of how to deal with perceptual aliasing, i.e., perceptual ambiguities. In contrast, what constitutes a good landmark, or how to select landmarks for mapping, is still an open research topic. The usual method of landmark selection is to map perceptions at regular intervals, which has the drawback of being inefficient and possibly missing ‘good’ landmarks that lie between sampling points. In this paper, we present an automatic landmark selection algorithm that allows a mobile robot to select conspicuous landmarks from a continuous stream of sensory perceptions, without any pre-installed knowledge or human intervention during the selection process. This algorithm can be used to make mapping mechanisms more efficient and reliable. Experimental results obtained with two different mobile robots in a range of environments are presented and analysed.

15.
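Light field cameras are now widely used in consumer and industrial applications, and depth reconstruction of target objects with a light field camera has become an important research topic. In practice, the Lytro camera multiplexes spatial and angular information on the same sensor, which lowers image resolution and makes reconstruction results unsatisfactory. To address this problem, a light field depth estimation method with sub-pixel accuracy is proposed. Sub-aperture images are shifted by sub-pixel amounts in the frequency domain under multiple labels, and a pixel matching cost is established with the central-view image as reference; guided filtering suppresses noise while preserving image edges; and the multi-label matching cost is optimized to obtain an accurate depth estimate. The resulting depth map is then processed with surface rendering, texture mapping, and other reconstruction steps to obtain a fairly detailed reconstruction. Experimental results show that the algorithm performs well when reconstructing objects of high complexity and resolves problems such as blurred reconstruction.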
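
The frequency-domain sub-pixel shift relies on the Fourier shift theorem: multiplying the spectrum by a linear phase ramp translates the signal by a possibly fractional amount. The sketch below shows this on a 1-D signal with a plain-Python DFT so that it needs no external libraries; it is an illustration only, the function names are hypothetical, and a practical implementation would apply a 2-D FFT to each sub-aperture image.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform, O(n^2)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def subpixel_shift(signal, delta):
    """Circularly shift `signal` by `delta` samples (may be fractional)."""
    n = len(signal)
    shifted = []
    for k, coeff in enumerate(dft(signal)):
        # Signed frequency index, so negative frequencies get the
        # conjugate phase and the output stays real.
        f = k if k <= n // 2 else k - n
        shifted.append(coeff * cmath.exp(-2j * cmath.pi * f * delta / n))
    return [value.real for value in idft(shifted)]
```

For an even-length signal the Nyquist bin cannot be shifted by a fractional amount without leaving a small imaginary residue, which this sketch simply discards by taking the real part.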

16.
Graphical Models, 2008, 70(4): 57-75
This paper studies the inside-looking-out camera pose estimation for the virtual studio. The camera pose estimation process, the process of estimating a camera’s extrinsic parameters, is based on closed-form geometrical approaches which use the benefit of simple corner detection of 3D cubic-like virtual studio landmarks. We first look at the effective parameters of the camera pose estimation process for the virtual studio. Our studies include all characteristic landmark parameters like landmark lengths, landmark corner angles and their installation position errors and some camera parameters like lens focal length and CCD resolution. Through computer simulation we investigate and analyze all these parameters’ efficiency in camera extrinsic parameters, including camera rotation and position matrices. Based on this work, we found that the camera translation vector is affected more than other camera extrinsic parameters because of the noise of effective camera pose estimation parameters. Therefore, we present a novel iterative geometrical noise cancellation method for the closed-form camera pose estimation process. This is based on the collinearity theory that reduces the estimation error of the camera translation vector, which plays a major role in camera extrinsic parameters estimation errors. To validate our method, we test it in a complete virtual studio simulation. Our simulation results show that they are in the same order as those of some commercial systems, such as the BBC and InterSense IS-1200 VisTracker.

17.
In this paper, a landmark selection and tracking approach is presented for mobile robot navigation in natural environments, using textural distinctiveness-based saliency detection and spatial information acquired from stereo data. The presented method focuses on achieving high robustness of tracking rather than self-positioning accuracy. The landmark selection method is designed to select a small number of the most salient feature points in a wide variety of sparse unknown environments to ensure successful matching. Landmarks are selected by an iterative algorithm from a textural distinctiveness-based saliency map extended with spatial information, where a repulsive potential field is created around the position of each already selected landmark for better distribution in order to increase robustness. The template matching of landmarks is aided with visual odometry-based motion estimation. Other robustness-increasing strategies include estimating landmark positions by unscented Kalman filters as well as from surrounding landmarks. Experimental results show that the introduced method is robust and suitable for natural environments.
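
The iterative selection with a repulsive potential field can be sketched as a greedy loop over candidate points. The Gaussian shape of the potential, the parameters `strength` and `sigma`, and the function name are assumptions for illustration; the abstract does not give the actual potential function.

```python
import math


def select_salient_landmarks(candidates, k, strength=1.0, sigma=50.0):
    """Greedily pick k landmarks, penalising points near earlier picks.

    candidates: list of (x, y, saliency) tuples.
    strength, sigma: assumed height and width of the repulsive potential.
    """
    scores = [s for _, _, s in candidates]
    chosen = []
    for _ in range(min(k, len(candidates))):
        best = max(
            (i for i in range(len(candidates)) if i not in chosen),
            key=lambda i: scores[i],
        )
        chosen.append(best)
        bx, by, _ = candidates[best]
        # Lower the score of every candidate according to its distance
        # from the landmark just selected.
        for i, (x, y, _) in enumerate(candidates):
            d2 = (x - bx) ** 2 + (y - by) ** 2
            scores[i] -= strength * math.exp(-d2 / (2.0 * sigma ** 2))
    return [candidates[i] for i in chosen]
```

With this penalty, a slightly less salient point far from the first pick can outrank a more salient point right next to it, which spreads the landmarks across the image.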

18.
19.
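Objective: To meet the demand for real-time, accurate, and robust human motion analysis, and starting from the problems of feature extraction and motion modeling, this paper proposes an example-based learning method for human motion analysis. Method: After a library of human pose examples is built, a motion detection method first extracts the human silhouette in every video frame; next, shape-context silhouette matching retrieves a set of candidate poses for each frame from the example library; finally, human motion analysis is carried out through statistical modeling and transition probability modeling. Results: In experiments on test videos of walking, running, jumping, and other actions, the silhouette-based shape context representation and matching method showed good expressive power. The mean joint angle error of the proposed method is about 5°, an improvement in motion analysis accuracy over other algorithms. Conclusion: The proposed example-based learning method effectively analyzes human motion in monocular video, overcomes the depth ambiguity of the mapping, is robust to changes in viewing angle, and offers good computational efficiency and accuracy.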

20.
Spectral clustering methods have various real-world applications, such as face recognition, community detection, and protein sequence clustering. Although spectral clustering methods can detect arbitrarily shaped clusters, thus achieving high clustering accuracy, the heavy computational cost limits their scalability. In this paper, we propose an accelerated spectral clustering method based on landmark selection. According to the Weighted PageRank algorithm, the most important nodes of the data affinity graph are selected as landmarks. Furthermore, the selected landmarks are provided to a landmark spectral clustering technique to achieve scalable and accurate clustering. In our experiments, by using two benchmark face and shape image data sets, we examine several landmark selection strategies for scalable spectral clustering that either ignore or consider the topological properties of the data in the affinity graph. Also, we show that the proposed method outperforms baseline and accelerated spectral clustering methods, in terms of computational cost and clustering accuracy, respectively. Finally, we provide future directions in spectral clustering.
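
Ranking the nodes of an affinity graph and keeping the top ones as landmarks can be sketched with a plain power iteration. This is a simplified edge-weighted PageRank, not necessarily the Weighted PageRank variant the paper uses; the function names, damping factor, and iteration count are assumptions.

```python
def weighted_pagerank(affinity, damping=0.85, iters=100):
    """Edge-weighted PageRank by power iteration.

    affinity: square list-of-lists; affinity[i][j] is the weight i -> j.
    """
    n = len(affinity)
    out_weight = [sum(row) for row in affinity]
    rank = [1.0 / n] * n
    for _ in range(iters):
        new_rank = [(1.0 - damping) / n] * n
        for i in range(n):
            if out_weight[i] == 0:
                # Dangling node: spread its rank uniformly.
                for j in range(n):
                    new_rank[j] += damping * rank[i] / n
            else:
                for j in range(n):
                    new_rank[j] += damping * rank[i] * affinity[i][j] / out_weight[i]
        rank = new_rank
    return rank


def top_pagerank_nodes(affinity, m):
    """Indices of the m highest-ranked nodes, used here as landmarks."""
    rank = weighted_pagerank(affinity)
    return sorted(range(len(affinity)), key=lambda i: rank[i], reverse=True)[:m]
```

The selected indices would then be passed to a landmark-based spectral clustering routine, which is outside the scope of this sketch.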
