期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

席志红韩双全王洪旭《计算机应用》2019,39(10):2847-2851

针对动态物体在室内同步定位与地图构建（SLAM）系统中影响位姿估计的问题，提出一种动态场景下基于语义分割的SLAM系统。在相机捕获图像后，首先用PSPNet（Pyramid Scene Parsing Network）对图像进行语义分割；之后提取图像特征点，剔除分布在动态物体内的特征点，并用静态的特征点进行相机位姿估计；最后完成语义点云图和语义八叉树地图的构建。在公开数据集上的五个动态序列进行多次对比测试的结果表明，相对于使用SegNet网络的SLAM系统，所提系统的绝对轨迹误差的标准偏差有6.9%~89.8%的下降，平移和旋转漂移的标准偏差在高动态场景中的最佳效果也能分别提升73.61%和72.90%。结果表明，改进的系统能够显著减小动态场景下位姿估计的误差，准确地在动态场景中进行相机位姿估计。相似文献

2.

基于视觉与激光融合的井下灾后救援无人机自主位姿估计

何怡静杨维《工矿自动化》2024,(4):94-102

无人机在灾后矿井的自主导航能力是其胜任抢险救灾任务的前提,而在未知三维空间的自主位姿估计技术是无人机自主导航的关键技术之一。目前基于视觉的位姿估计算法由于单目相机无法直接获取三维空间的深度信息且易受井下昏暗光线影响,导致位姿估计尺度模糊和定位性能较差,而基于激光的位姿估计算法由于激光雷达存在视角小、扫描图案不均匀及受限于矿井场景结构特征,导致位姿估计出现错误。针对上述问题,提出了一种基于视觉与激光融合的井下灾后救援无人机自主位姿估计算法。首先,通过井下无人机搭载的单目相机和激光雷达分别获取井下的图像数据和激光点云数据,对每帧矿井图像数据均匀提取ORB特征点,使用激光点云的深度信息对ORB特征点进行深度恢复,通过特征点的帧间匹配实现基于视觉的无人机位姿估计。其次,对每帧井下激光点云数据分别提取特征角点和特征平面点,通过特征点的帧间匹配实现基于激光的无人机位姿估计。然后,将视觉匹配误差函数和激光匹配误差函数置于同一位姿优化函数下,基于视觉与激光融合来估计井下无人机位姿。最后,通过视觉滑动窗口和激光局部地图引入历史帧数据,构建历史帧数据和最新估计位姿之间的误差函数,通过对误差函数的非线性优化... 相似文献

3.

基于点线面特征的无漂移旋转视觉里程计

李用杰秦广健武利明熊军林《计算机应用研究》2023,40(12)

现有基于点特征的视觉SLAM（simultaneous localization and mapping）算法在弱纹理环境中表现不佳,为此提出了一种基于点线面特征融合的视觉里程计算法,能够在弱纹理环境中实现精准定位。首先基于曼哈顿世界假设下,使用线特征与面特征提取曼哈顿世界坐标系,并将线特征与面特征与坐标系联合;其次为了提升系统定位的准确性,使用了一种无漂移旋转的位姿估计算法,将位姿的旋转与平移分开求解;最后利用结构化的线特征与面特征对位姿与曼哈顿轴进行优化,综合考虑图像中的点线面特征,使得位姿估计的结果更加精确。实验表明,所提算法在TUM与ICL-NUIM数据集中的表现优于目前的其他方法。相似文献

4.

基于共面圆的距离传感器与相机的相对位姿标定

王硕祝海江李和平吴毅红《自动化学报》2020,46(6):1154-1165

近年来, 距离传感器与摄像机的组合系统标定在无人车环境感知中得到了广泛的研究与应用, 其中基于平面特征的方法简单易行而被广泛采用. 然而, 目前多数方法基于点匹配进行, 易错且鲁棒性较低. 本文提出了一种基于共面圆的距离传感器与相机的组合系统相对位姿估计方法. 该方法使用含有两个共面圆的标定板, 可以获取相机与标定板间的位姿, 以及距离传感器与标定板间的位姿. 此外, 移动标定板获取多组数据, 根据计算得到两个共面圆的圆心在距离传感器和相机下的坐标, 优化重投影误差与3D对应点之间的误差, 得到距离传感器与相机之间的位姿关系. 该方法不需要进行特征点的匹配, 利用射影不变性来获取相机与三维距离传感器的位姿. 仿真实验与真实数据实验结果表明, 本方法对噪声有较强的鲁棒性, 得到了精确的结果. 相似文献

5.

融合双目视觉与惯导信息的高效视觉里程计算法

潘林豪《计算机应用研究》2021,38(6):1739-1743,1769

为提高视觉里程计(VO)在大尺度环境下运行的实时性,提出一种融合双目视觉与惯导信息的视觉里程计算法,主要由前端位姿跟踪和后端局部地图优化两个线程组成.位姿跟踪线程首先使用惯导信息辅助光流法进行帧间特征点跟踪并估计相机初始位姿;接着通过最小化图像光度误差获取当前帧像素点与局部地图点的对应关系;而后最小化当前帧上局部地图点的重投影误差和惯性测量单元(IMU)预积分误差,得到当前帧准确的位姿估计.后端局部地图优化线程对滑动窗口内的关键帧提取特征点并三角化新地图点,使用光束平差法(BA)对逆深度参数化表示的地图点位置、关键帧位姿、速度以及陀螺仪和加速度计零偏进行滑窗优化,为前端提供更加精确的局部地图相机位姿和环境信息.在EuRoC数据集上的实验表明,相比于ORB-SLAM2、ORB-SLAM3算法,该融合双目视觉与惯导信息的视觉里程计算法的定位精度略有下降,但可以较大程度地提高位姿跟踪的实时性. 相似文献

6.

基于线特征与内区域特征的视觉伺服解耦控制

徐德刚周雷沈添天《信息与控制》2019,48(4):401-412

针对多自由度机械臂快速趋近任意四边形态目标的视觉伺服控制难题，提出了结合线特征与内区域特征的机器人视觉伺服解耦控制方法.构建了目标内区域特征以指导相机的平移运动速率，利用目标的线特征给出相机的旋转角速率，并通过引入内区域特征的矢量补偿和质心坐标的位置补偿，实现了平移和旋转控制的部分解耦.最后，对机器人视觉伺服控制系统进行了稳定性分析.仿真验证结果表明所提方法能控制相机以较快而平滑的动作收敛到期望位姿，且在相机光轴与目标平面近似垂直的条件下能较好地克服深度估计造成的不确定性问题. 相似文献

7.

基于深度学习的相机位姿估计方法综述

王静金玉楚郭苹胡少毅《计算机工程与应用》2023,(7):1-14

相机位姿估计是指在已知环境下精确地估计相机在世界坐标系中六自由度位姿的技术,该技术是机器人技术和自动驾驶中的关键技术。随着深度学习的飞速发展,使用深度学习来优化相机位姿估计算法已经成为了当前的研究热点之一。为了掌握目前相机位姿估计算法的研究现状与趋势,对基于深度学习的相机位姿估计的主流算法进行了综述。简单介绍了传统的基于特征点的相机位姿估计方法。重点介绍了基于深度学习的方法：根据核心算法的不同,从端到端的相机位姿估计、场景坐标回归、基于检索的相机位姿估计、层级结构、多信息融合和跨场景的相机位姿估计六个方面进行了详细的阐述和分析。对研究现状进行了总结,并基于深入的性能分析指出了相机位姿估计领域面临的挑战,展望了其发展动向。相似文献

8.

基于深度学习的刚体位姿估计方法综述

郭楠李婧源任曦《计算机科学》2023,(2):178-189

刚体位姿估计旨在获取刚体在相机坐标系下的3D平移信息和3D旋转信息，在自动驾驶、机器人、增强现实等快速发展的领域起着重要作用。现对2017-2021年间的基于深度学习的刚体位姿估计方向具有代表性的研究进行汇总与分析。将刚体位姿估计的方法分为基于坐标、基于关键点和基于模板的方法。将刚体位姿估计任务划分为图像预处理、空间映射或特征匹配、位姿恢复和位姿优化4项子任务，详细介绍每一类方法的子任务实现及其优势和存在的问题。分析刚体位姿估计任务面临的挑战，总结现有解决方案及其优缺点。介绍刚体位姿估计常用的数据集和性能评价指标，并对比分析现有方法在常用数据集上的表现。最后从位姿跟踪、类别级位姿估计等多个角度对未来研究方向进行了展望。相似文献

9.

基于半直接法SLAM的大场景稠密三维重建系统

徐浩楠余雷费树岷《模式识别与人工智能》2018,31(5):477-484

当前三维重建系统大多基于特征点法和直接法的同时定位与地图重建(SLAM)系统,特征点法SLAM难以在特征点缺失的地方具有较好的重建结果,直接法SLAM在相机运动过快时难以进行位姿估计,从而造成重建效果不理想.针对上述问题,文中提出基于半直接法SLAM的大场景稠密三维重建系统.通过深度相机(RGB-D相机)扫描,在特征点丰富的区域使用特征点法进行相机位姿估计,在特征点缺失区域使用直接法进行位姿估计,减小光度误差,优化相机位姿.然后使用优化后较准确的相机位姿进行地图构建,采用面元模型,应用构建变形图的方法进行点云的位姿估计和融合,最终获得较理想的三维重建模型.实验表明,文中系统可适用于各个场合的三维重建,得到较理想的三维重建模型. 相似文献

10.

低质量渲染图像的目标物体6D姿态估计

左国玉张成威刘洪星龚道雄《控制与决策》2022,37(1):135-141

从图像中获取目标物体的6D位姿信息在机器人操作和虚拟现实等领域有着广泛的应用,然而,基于深度学习的位姿估计方法在训练模型时通常需要大量的训练数据集来提高模型的泛化能力,一般的数据采集方法存在收集成本高同时缺乏3D空间位置信息等问题.鉴于此,提出一种低质量渲染图像的目标物体6D姿态估计网络框架.该网络中,特征提取部分以单张RGB图像作为输入,用残差网络提取输入图像特征;位姿估计部分的目标物体分类流用于预测目标物体的类别,姿态回归流在3D空间中回归目标物体的旋转角度和平移矢量.另外,采用域随机化方法以低收集成本方式构建大规模低质量渲染、带有物体3D空间位置信息的图像数据集Pose6DDR.在所建立的Pose6DDR数据集和LineMod公共数据集上的测试结果表明了所提出位姿估计方法的优越性以及大规模数据集域随机化生成数据方法的有效性. 相似文献

11.

How to localize humanoids with a single camera?

Pablo F. Alcantarilla Olivier Stasse Sebastien Druon Luis M. Bergasa Frank Dellaert 《Autonomous Robots》2013,34(1-2):47-71

In this paper, we propose a real-time vision-based localization approach for humanoid robots using a single camera as the only sensor. In order to obtain an accurate localization of the robot, we first build an accurate 3D map of the environment. In the map computation process, we use stereo visual SLAM techniques based on non-linear least squares optimization methods (bundle adjustment). Once we have computed a 3D reconstruction of the environment, which comprises of a set of camera poses (keyframes) and a list of 3D points, we learn the visibility of the 3D points by exploiting all the geometric relationships between the camera poses and 3D map points involved in the reconstruction. Finally, we use the prior 3D map and the learned visibility prediction for monocular vision-based localization. Our algorithm is very efficient, easy to implement and more robust and accurate than existing approaches. By means of visibility prediction we predict for a query pose only the highly visible 3D points, thus, speeding up tremendously the data association between 3D map points and perceived 2D features in the image. In this way, we can solve very efficiently the Perspective-n-Point (PnP) problem providing robust and fast vision-based localization. We demonstrate the robustness and accuracy of our approach by showing several vision-based localization experiments with the HRP-2 humanoid robot. 相似文献

12.

Enhanced Geometric Map: a 2D&3D Hybrid City Model of Large Scale Urban Environment for Robot Navigation

《机器人》2016,(3)

To facilitate scene understanding and robot navigation in large scale urban environment, a two-layer enhanced geometric map(EGMap) is designed using videos from a monocular onboard camera. The 2D layer of EGMap consists of a 2D building boundary map from top-down view and a 2D road map, which can support localization and advanced map-matching when compared with standard polyline-based maps. The 3D layer includes features such as 3D road model,and building facades with coplanar 3D vertical and horizontal line segments, which can provide the 3D metric features to localize the vehicles and flying-robots in 3D space. Starting from the 2D building boundary and road map, EGMap is initially constructed using feature fusion with geometric constraints under a line feature-based simultaneous localization and mapping(SLAM) framework iteratively and progressively. Then, a local bundle adjustment algorithm is proposed to jointly refine the camera localizations and EGMap features. Furthermore, the issues of uncertainty, memory use, time efficiency and obstacle effect in EGMap construction are discussed and analyzed. Physical experiments show that EGMap can be successfully constructed in large scale urban environment and the construction method is demonstrated to be very accurate and robust. 相似文献

13.

SLAM in Indoor Environments using Omni-directional Vertical and Horizontal Line Features

Sunhyo Kim Se-Young Oh 《Journal of Intelligent and Robotic Systems》2008,51(1):31-43

An autonomous mobile robot must have the ability to navigate in an unknown environment. The simultaneous localization and map building (SLAM) problem have relation to this autonomous ability. Vision sensors are attractive equipment for an autonomous mobile robot because they are information-rich and rarely have restrictions on various applications. However, many vision based SLAM methods using a general pin-hole camera suffer from variation in illumination and occlusion, because they mostly extract corner points for the feature map. Moreover, due to the narrow field of view of the pin-hole camera, they are not adequate for a high speed camera motion. To solve these problems, this paper presents a new SLAM method which uses vertical lines extracted from an omni-directional camera image and horizontal lines from the range sensor data. Due to the large field of view of the omni-directional camera, features remain in the image for enough time to estimate the pose of the robot and the features more accurately. Furthermore, since the proposed SLAM does not use corner points but the lines as the features, it reduces the effect of illumination and partial occlusion. Moreover, we use not only the lines at corners of wall but also many other vertical lines at doors, columns and the information panels on the wall which cannot be extracted by a range sensor. Finally, since we use the horizontal lines to estimate the positions of the vertical line features, we do not require any camera calibration. Experimental work based on MORIS, our mobile robot test bed, moving at a human’s pace in the real indoor environment verifies the efficacy of this approach. 相似文献

14.

Real-time vision-based tracking and reconstruction

Gabriele Bleser Mario Becker Didier Stricker 《Journal of Real-Time Image Processing》2007,2(2-3):161-175

Many of the recent real-time markerless camera tracking systems assume the existence of a complete 3D model of the target scene. Also the system developed in the MATRIS project assumes that a scene model is available. This can be a freeform surface model generated automatically from an image sequence using structure from motion techniques or a textured CAD model built manually using a commercial software. The offline model provides 3D anchors to the tracking. These are stable natural landmarks, which are not updated and thus prevent an accumulating error (drift) in the camera registration by giving an absolute reference. However, sometimes it is not feasible to model the entire target scene in advance, e.g. parts, which are not static, or one would like to employ existing CAD models, which are not complete. In order to allow camera movements beyond the parts of the environment modelled in advance it is desired to derive additional 3D information online. Therefore, a markerless camera tracking system for calibrated perspective cameras has been developed, which employs 3D information about the target scene and complements this knowledge online by reconstruction of 3D points. The proposed algorithm is robust and reduces drift, the most dominant problem of simultaneous localisation and mapping (SLAM), in real-time by a combination of the following crucial points: (1) stable tracking of longterm features on the 2D level; (2) use of robust methods like the well-known Random Sampling Consensus (RANSAC) for all 3D estimation processes; (3) consequent propagation of errors and uncertainties; (4) careful feature selection and map management; (5) incorporation of epipolar constraints into the pose estimation. Validation results on the operation of the system on synthetic and real data are presented. 相似文献

15.

Pose estimation from multiple cameras based on Sylvester’s equation

《Computer Vision and Image Understanding》2010,114(6):652-666

In this paper, we introduce a method to estimate the object’s pose from multiple cameras. We focus on direct estimation of the 3D object pose from 2D image sequences. Scale-Invariant Feature Transform (SIFT) is used to extract corresponding feature points from adjacent images in the video sequence. We first demonstrate that centralized pose estimation from the collection of corresponding feature points in the 2D images from all cameras can be obtained as a solution to a generalized Sylvester’s equation. We subsequently derive a distributed solution to pose estimation from multiple cameras and show that it is equivalent to the solution of the centralized pose estimation based on Sylvester’s equation. Specifically, we rely on collaboration among the multiple cameras to provide an iterative refinement of the independent solution to pose estimation obtained for each camera based on Sylvester’s equation. The proposed approach to pose estimation from multiple cameras relies on all of the information available from all cameras to obtain an estimate at each camera even when the image features are not visible to some of the cameras. The resulting pose estimation technique is therefore robust to occlusion and sensor errors from specific camera views. Moreover, the proposed approach does not require matching feature points among images from different camera views nor does it demand reconstruction of 3D points. Furthermore, the computational complexity of the proposed solution grows linearly with the number of cameras. Finally, computer simulation experiments demonstrate the accuracy and speed of our approach to pose estimation from multiple cameras. 相似文献

16.

面向半稠密三维重建的改进单目ORB-SLAM

下载免费PDF全文

周彦旷鸿章牟金震王冬丽刘宗明《计算机工程与应用》2021,57(8):180-184

构建更详细的地图以及估计更精准的相机位姿一直都是同时定位与地图构建（Simultaneous Localization And Mapping,SLAM）技术所追求的目标,但是以上目标与实时性要求、较低的计算代价和受限的计算资源条件是相矛盾的。提出一种在单目ORB-SLAM（Oriented FAST and Rotated BRIEF-SLAM）方法的基础上利用关键帧中提取到的直线特征进行半稠密三维重建的方法。由ORB-SLAM实时提供一组关键帧及其对应的相机位姿信息和一系列地图点,提出一种关键帧再剔除算法进一步减少冗余帧数目,使用直线段提取方法提取各帧中的直线段,使用纯几何约束方法对以上检测得到的直线段进行匹配,生成一个由直线段构成的半稠密三维场景模型。实验结果表明新方法持续稳定的运行,能在低计算代价条件下快速地在线三维重建。相似文献

17.

基于RGB-D相机的室内环境3D地图创建

王亚龙《计算机应用研究》2015,32(8)

RGB-D相机（如微软的Kinect）能够在获取彩色图像的同时得到每个像素的深度信息,在移动机器人三维地图创建方向具有广泛应用。本文设计了一种利用RGB-D相机进行机器人自定位及创建室内场景三维模型的方法,该方法首先由RGB-D相机获取周围环境的连续帧信息;其次提取并匹配连续帧间的SURF特征点,通过特征点的位置变化计算机器人的位姿并结合非线性最小二乘优化算法最小化对应点的双向投影误差;最后结合关键帧技术及观察中心法将相机观测到的三维点云依据当前位姿投影到全局地图。本文选择三个不同的场景试验了该方法,并对比了不同特征点下该方法的效果,试验中本文方法在轨迹长度为5.88m情况下误差仅为0.023,能够准确地创建周围环境的三维模型。相似文献

18.

3D Scene Reconstruction from Multiple Spherical Stereo Pairs

Hansung Kim Adrian Hilton 《International Journal of Computer Vision》2013,104(1):94-116

We propose a 3D environment modelling method using multiple pairs of high-resolution spherical images. Spherical images of a scene are captured using a rotating line scan camera. Reconstruction is based on stereo image pairs with a vertical displacement between camera views. A 3D mesh model for each pair of spherical images is reconstructed by stereo matching. For accurate surface reconstruction, we propose a PDE-based disparity estimation method which produces continuous depth fields with sharp depth discontinuities even in occluded and highly textured regions. A full environment model is constructed by fusion of partial reconstruction from spherical stereo pairs at multiple widely spaced locations. To avoid camera calibration steps for all camera locations, we calculate 3D rigid transforms between capture points using feature matching and register all meshes into a unified coordinate system. Finally a complete 3D model of the environment is generated by selecting the most reliable observations among overlapped surface measurements considering surface visibility, orientation and distance from the camera. We analyse the characteristics and behaviour of errors for spherical stereo imaging. Performance of the proposed algorithm is evaluated against ground-truth from the Middlebury stereo test bed and LIDAR scans. Results are also compared with conventional structure-from-motion algorithms. The final composite model is rendered from a wide range of viewpoints with high quality textures. 相似文献