Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
This paper addresses the problem of self-calibration from one unknown motion of an uncalibrated stereo rig. Unlike existing methods for stereo rig self-calibration, which have focused on applying the autocalibration paradigm using both motion and stereo correspondences, our method does not require the recovery of stereo correspondences. Our method combines purely algebraic constraints with implicit geometric constraints. Assuming that the rotational part of the stereo geometry has two unknown degrees of freedom (i.e., the third DOF is roughly known), and that the principal point of each camera is known, we first show that the intrinsic and extrinsic parameters of the stereo rig can be recovered from the motion correspondences only, i.e., the monocular fundamental matrices. We then provide an initialization procedure for the proposed non-linear method. We provide an extensive performance study for the method in the presence of image noise. In addition, we study some of the aspects related to the 3D motion that govern the accuracy of the proposed self-calibration method. Experiments conducted on synthetic and real data/images demonstrate the effectiveness and efficiency of the proposed method.

2.
A Maximum Likelihood Stereo Algorithm   Total citations: 8, self-citations: 0, citations by others: 8
A stereo algorithm is presented that optimizes a maximum likelihood cost function. The maximum likelihood cost function assumes that corresponding features in the left and right images are normally distributed about a common true value and consists of a weighted squared error term if two features are matched or a (fixed) cost if a feature is determined to be occluded. The stereo algorithm finds the set of correspondences that minimize the cost function subject to ordering and uniqueness constraints. The stereo algorithm is independent of the matching primitives. However, for the experiments described in this paper, matching is performed on the individual pixel intensities. Contrary to popular belief, the pixel-based stereo appears to be robust for a variety of images. It also has the advantages of (i) providing a dense disparity map, (ii) requiring no feature extraction, and (iii) avoiding the adaptive windowing problem of area-based correlation methods. Because feature extraction and windowing are unnecessary, a very fast implementation is possible. Experimental results reveal that good stereo correspondences can be found using only ordering and uniqueness constraints, i.e., without local smoothness constraints. However, it is shown that the original maximum likelihood stereo algorithm exhibits multiple global minima. The dynamic programming algorithm is guaranteed to find one, but not necessarily the same one for each epipolar scanline, causing erroneous correspondences which are visible as small local differences between neighboring scanlines. Traditionally, regularization, which modifies the original cost function, has been applied to the problem of multiple global minima. We developed several variants of the algorithm that avoid classical regularization while imposing several global cohesiveness constraints.
We believe this is a novel approach that has the advantage of guaranteeing that solutions minimize the original cost function and preserve discontinuities. The constraints are based on minimizing the total number of horizontal and/or vertical discontinuities along and/or between adjacent epipolar lines, and local smoothing is avoided. Experiments reveal that minimizing the sum of the horizontal and vertical discontinuities provides the most accurate results. A high percentage of correct matches and very little smearing of depth discontinuities are obtained. An alternative to imposing cohesiveness constraints to reduce the correspondence ambiguities is to use more than two cameras. We therefore extend the two-camera maximum likelihood algorithm to N cameras. The N-camera stereo algorithm determines the “best” set of correspondences between a given pair of cameras, referred to as the principal cameras. Knowledge of the relative positions of the cameras allows the 3D point hypothesized by an assumed correspondence of two features in the principal pair to be projected onto the image plane of the remaining N − 2 cameras. These N − 2 points are then used to verify proposed matches. Not only does the algorithm explicitly model occlusion between features of the principal pair, but the possibility of occlusions in the N − 2 additional views is also modeled. Previous work did not model this occlusion process, the benefits and importance of which are experimentally verified. Like other multiframe stereo algorithms, the computational and memory costs of this approach increase linearly with each additional view. Experimental results are shown for two outdoor scenes. It is clearly demonstrated that the number of correspondence errors is significantly reduced as the number of views/cameras is increased.
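The cost structure described in this abstract (a weighted squared error for a match, a fixed cost for an occlusion, optimized by dynamic programming under ordering and uniqueness constraints) can be sketched for a single pair of scanlines as follows. This is a minimal illustration, not the authors' implementation: the function name `scanline_match`, the noise parameter `sigma`, and the value of `occlusion_cost` are assumptions made for the example, and the cohesiveness constraints and N-camera extension are omitted.

```python
import numpy as np

def scanline_match(left, right, occlusion_cost=20.0, sigma=4.0):
    """Match two 1D intensity rows by dynamic programming.

    Each pixel is either matched (squared difference scaled by an assumed
    noise variance) or declared occluded (fixed cost). Walking both rows
    monotonically enforces the ordering and uniqueness constraints.
    """
    left = np.asarray(left, dtype=float)
    right = np.asarray(right, dtype=float)
    n, m = len(left), len(right)
    cost = np.zeros((n + 1, m + 1))
    cost[:, 0] = occlusion_cost * np.arange(n + 1)
    cost[0, :] = occlusion_cost * np.arange(m + 1)
    # move codes: 0 = match, 1 = left pixel occluded, 2 = right pixel occluded
    move = np.zeros((n + 1, m + 1), dtype=int)
    move[1:, 0] = 1
    move[0, 1:] = 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = left[i - 1] - right[j - 1]
            options = (
                cost[i - 1, j - 1] + diff ** 2 / (2.0 * sigma ** 2),
                cost[i - 1, j] + occlusion_cost,
                cost[i, j - 1] + occlusion_cost,
            )
            move[i, j] = int(np.argmin(options))
            cost[i, j] = options[move[i, j]]
    # Backtrack from the end of both rows to recover the matched pairs.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if move[i, j] == 0:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif move[i, j] == 1:
            i -= 1
        else:
            j -= 1
    return pairs[::-1], float(cost[n, m])
```

Because each scanline is solved independently, ties between equally cheap paths can be broken differently on neighboring rows, which is the source of the inter-scanline inconsistencies the abstract mentions.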

3.
The paper presents an approach to interactively cutting out the same target object from a pair of stereo images. With this approach, a user labels parts of the object and background in either of the images with strokes. The approach generates a segmentation result immediately. If the result is not satisfactory, it can be improved by interactively drawing more strokes, or by using an alternative interaction called adding corresponding points, which is first presented in this paper. The proposed segmentation approach is capable of providing fast feedback after each interaction. The fast computation is performed in the framework of graph cut. First, the labeled parts are used to learn foreground and background color models. Next, an energy function is built by formulating the similarities between unlabeled pixels and the foreground/background color models, the color difference between neighboring pixels, and stereo correspondences obtained by SIFT feature matching. Finally, graph cut is utilized to find the optimum of the energy function and obtain a segmentation result. Unlike state-of-the-art methods, our segmentation approach formulates sparse correspondences rather than dense matches as stereo constraints in the energy function. Experimental results demonstrate that our method is faster in computation. Meanwhile, it generates results comparable to those of state-of-the-art methods.

4.
In this paper, we address the problem of recovering 3-D models from sequences of partly calibrated images with unknown correspondence. To that end, we integrate tracking and structure from motion with geometric constraints (specifically in the form of linear class models) in a single framework. The key to making the proposed approach work is the use of appearance-based model matching and refinement which updates the estimated correspondences on each iteration of the algorithm. Another key feature is the matching of a 3-D model directly with the input images without the conventional 2-step approach of stereo data recovery and 3-D model fitting. Initialization of the linear class model to one of the input images (the reference image) is currently partly manual. This synthesize-and-refine approach, or appearance-based constrained structure from motion (AbCSfm), is especially useful in recovering shapes of objects whose general structure is known but which may have little discernible texture in significant parts of their surfaces. We applied the proposed approach to 3-D face modeling from multiple images to create new 3-D faces for DECface, a synthetic talking head developed at Cambridge Research Laboratory, Digital Equipment Corporation. The DECface model comprises a collection of 3-D triangular and rectangular facets, with nodes as vertices. In recovering the DECface model, we assume that the sequence of images is taken with a camera with unknown focal length and pose. The geometric constraints used take the form of a linear combination of prototypes of 3-D faces of real people. Results of this approach show its good convergence properties and its robustness against cluttered backgrounds.

5.
6.
Face recognition across pose is a problem of fundamental importance in computer vision. We propose to address this problem by using stereo matching to judge the similarity of two 2D images of faces seen from different poses. Stereo matching allows for arbitrary, physically valid, continuous correspondences. We show that the stereo matching cost provides a very robust measure of similarity of faces that is insensitive to pose variations. To enable this, we show that, for conditions common in face recognition, the epipolar geometry of face images can be computed using either four or three feature points. We also provide a straightforward adaptation of a stereo matching algorithm to compute the similarity between faces. The proposed approach has been tested on the CMU PIE data set and demonstrates superior performance compared to existing methods in the presence of pose variation. It also shows robustness to lighting variation.

7.
Extracting 3D facial animation parameters from multiview video clips   Total citations: 1, self-citations: 0, citations by others: 1
We propose an accurate and inexpensive procedure that estimates 3D facial motion parameters from mirror-reflected multiview video clips. We place two planar mirrors near a subject's cheeks and use a single camera to simultaneously capture a marker's front and side view images. We also propose a novel closed-form linear algorithm to reconstruct 3D positions from real versus mirrored point correspondences in an uncalibrated environment. Our computer simulations reveal that exploiting mirrors' various reflective properties yields a more robust, accurate, and simpler 3D position estimation approach than general-purpose stereo vision methods that use a linear approach or maximum-likelihood optimization. Our experiments show a root mean square (RMS) error of less than 2 mm in 3D space with only 20 point correspondences. For semiautomatic 3D motion tracking, we use an adaptive Kalman predictor and filter to improve stability and infer the occluded markers' position. Our approach tracks more than 50 markers on a subject's face and lips from 30-frame-per-second video clips. We've applied the facial motion parameters estimated from the proposed method to our facial animation system.

8.
F. Dornaika 《Pattern recognition》2002,35(10):2003-2012
Structure from motion and structure from stereo are two vision cues for achieving 3D reconstruction. The two cues have complementary strengths: in the stereo cue, 3D reconstruction is accurate but establishing correspondences is difficult, while the reverse is true in the motion cue. This paper addresses how to combine the two cues when a stereo pair of cameras is available to capture image data for 3D reconstruction. The work is distinct from previous ones in that it does not exploit the redundancy in the image data to boost reconstruction accuracy, but instead makes the two vision cues complementary, preserving their strengths and avoiding their weaknesses. A mechanism is introduced that allows dense motion correspondences in the two separate image streams to be transferred to dense binocular correspondences across the image streams, so that 3D can be reconstructed from the latter and accurate reconstruction is possible even with short motions of the stereo rig. Both the stereo correspondences and the motion of the stereo rig are assumed to be unknown in this work. Experiments involving real image data are presented to indicate the feasibility and robustness of the approach.

9.
This paper presents a new approach for combining stereo vision and dynamic vision with the objective of retaining their advantages and removing their disadvantages. It is shown that, by assuming affine cameras, the stereo correspondences and motion correspondences, if organized in a particular way in a matrix, can be decomposed into: the 3D structure of the scene, the camera parameters, the motion parameters, and the stereo geometry. With this, the approach can infer stereo correspondences from motion correspondences, requiring only a time linear with respect to the size of the available image data. The approach offers the advantages of simpler correspondence, as in dynamic vision, and accurate reconstruction, as in stereo vision, even with short image sequences.
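The decomposition mentioned in this abstract rests on a general property of affine cameras: a matrix of stacked image coordinates, once centered, has rank at most 3 and can be split into a motion factor and a shape factor. The sketch below shows only that generic rank-3 factorization via SVD; the specific arrangement of stereo and motion correspondences used in the paper is not reproduced, and the function name `affine_factorize` is an assumption made for the example.

```python
import numpy as np

def affine_factorize(W):
    """Factor a 2F x P measurement matrix into motion (2F x 3) and shape (3 x P).

    Rows hold the image x/y coordinates for each view; columns are tracked
    points. Under an affine camera the row-centered matrix has rank <= 3,
    so the three largest singular components reproduce it.
    """
    W = np.asarray(W, dtype=float)
    centroid = W.mean(axis=1, keepdims=True)   # per-row translation term
    U, s, Vt = np.linalg.svd(W - centroid, full_matrices=False)
    root = np.sqrt(s[:3])
    motion = U[:, :3] * root                   # camera/motion factor
    shape = root[:, None] * Vt[:3]             # 3D structure factor
    return motion, shape, centroid
```

The two factors are determined only up to an invertible 3 x 3 transformation, so additional constraints are needed to upgrade the result to a metric reconstruction.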

10.
Sparse optic flow maps are general enough to obtain useful information about camera motion. Usually, correspondences among features over an image sequence are estimated by radiometric similarity. When the camera moves under known conditions, global geometrical constraints can be introduced in order to obtain a more robust estimation of the optic flow. In this paper, a method is proposed for the computation of a robust sparse optic flow (OF) which integrates the geometrical constraints induced by camera motion to verify the correspondences obtained by radiometric-similarity-based techniques. A raw OF map is estimated by matching features by correlation. The verification of the resulting correspondences is formulated as an optimization problem that is implemented on a Hopfield neural network (HNN). Additional constraints imposed in the energy function permit us to achieve subpixel accuracy in the image locations of matched features. Convergence of the HNN is reached in a small enough number of iterations to make the proposed method suitable for real-time processing. It is shown that the proposed method is also suitable for identifying independently moving objects in front of a moving vehicle. Received: 26 December 1995 / Accepted: 20 February 1997

11.
This paper addresses the finite-time formation tracking control problem for multiple rigid bodies whose dynamics are defined on matrix Lie groups, including the special Euclidean groups SE(2) and SE(3) as specific cases. The reference trajectory, in the form of rotational and translational motions, is generated offline in advance as the virtual leader for tracking. Moreover, the formation time is specified according to the task, and the desired formation shape is given a priori with respect to the virtual leader. By virtue of the system decomposition approach, intrinsic formation tracking laws are derived for arbitrary initial velocities of the rigid bodies. The tracking controllers are intrinsic, meaning that the dynamics and controllers are developed in the body-fixed frame without a global reference frame. Moreover, based on the geometric convex combination on SE(3), the result is extended to the distributed case where only the neighboring agents' states are used. Two numerical simulations, on SE(2) and SE(3) respectively, are given to illustrate the validity and robustness of the proposed controllers. Copyright © 2017 John Wiley & Sons, Ltd.

12.
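A Deformable Hand Gesture Tracking Method for Real-Time Interaction   Total citations: 5, self-citations: 0, citations by others: 5
Wang Xiying (王西颖)  Zhang Xiwen (张习文)  Dai Guozhong (戴国忠) 《软件学报 (Journal of Software)》2007,18(10):2423-2433
Tracking deformable hand gestures is an important topic in vision-based human-computer interaction research. A novel real-time method for tracking deformable hand gestures with a single camera is proposed. A set of 2D gesture models replaces the high-dimensional 3D hand model. First, a Bayesian classifier is used to recognize static gestures; fingers and fingertips are then located in the image, and the tracking process is initialized automatically by matching the image features against the recognition result. The K-means clustering algorithm is combined with particle filtering to resolve the mutual interference between fingers in multi-finger tracking. The tracking state is monitored during tracking, which enables automatic recovery of tracking and updating of the gesture model. Experimental results show that the method achieves fast, accurate, continuous tracking of deformable hand gestures and meets the requirements of vision-based real-time human-computer interaction.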

13.
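Di Nan (邸男)  Zhu Ming (朱明)  Han Guangliang (韩广良) 《软件学报 (Journal of Software)》2015,26(1):52-61
Tracking and measuring low-contrast targets against complex backgrounds is an important topic in computer vision. Non-ideal conditions such as low contrast, low signal-to-noise ratio, and target rotation, scaling, and occlusion make tracking algorithms difficult to design: the algorithm must adapt to complex changes in the target and background while keeping the computational load small enough to meet real-time engineering requirements. A low-contrast target tracking method based on a likelihood similarity function is proposed. In the model-building stage, the unimodal property of the pyramid surface equation is used to emphasize the target gray-level information in the model, making the target and background gray-level information easier to distinguish. In the model-matching stage, inspired by maximum likelihood estimation in statistics, a new likelihood similarity function is constructed; compared with traditional similarity measures, its values are more discriminative, which greatly reduces repeated patterns in the matching region. Finally, target tracking is transformed into maximum likelihood estimation of the target position. The algorithm has been successfully embedded in the TMS320C6416 hardware platform. Extensive experiments show that the lowest target contrast (LSCR) the algorithm can detect is about 3. As an example, experimental results are given for an aircraft in flight against a complex background at a low contrast of LSCR = 4.9.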

14.
Motion of points and lines in the uncalibrated case   Total citations: 4, self-citations: 4, citations by others: 0
In the present paper we address the problem of computing structure and motion, given a set of point and/or line correspondences, in a monocular image sequence, when the camera is not calibrated. Considering point correspondences first, we analyse how to parameterize the retinal correspondences as a function of the chosen geometry: Euclidean, affine or projective geometry. The simplest of these parameterizations is called the FQs-representation and is a composite projective representation. The main result is that, considering N+1 views in such a monocular image sequence, the retinal correspondences are parameterized by 11N − 4 parameters in the general projective case. Moreover, 3 other parameters are required to work in the affine case and 5 additional parameters in the Euclidean case. These 8 parameters are calibration parameters and must be calculated considering at least 8 external pieces of information or constraints. The method being constructive, all these representations are made explicit. Then, considering line correspondences, we show how the same parameterizations can be used when we analyse the motion of lines in the uncalibrated case. The case of three views is extensively studied and a geometrical interpretation is proposed, introducing the notion of trifocal geometry, which generalizes the well-known epipolar geometry. It is also discussed how to introduce line correspondences, in a framework based on point correspondences, using the same equations. Finally, considering the FQs-representation, one implementation is proposed as a motion module, taking retinal correspondences as input and providing an estimation of the 11N − 4 retinal motion parameters. As discussed in this paper, this module can also estimate the 3D depth of the points up to an affine and projective transformation, defined by the 8 parameters identified in the first section. Experimental results are provided.

15.
Locating the 3D positions of the points on the human back is an essential issue in stereo-based interactive robotic back massage machines. In stereoscopic 3D localization, the 3D positions are determined from the corresponding image points captured by calibrated stereo cameras. However, detecting these corresponding points on the human back is highly challenging due to the smooth and texture-less characteristics of human skin. In the present study, this problem is resolved by means of a novel correspondence detection scheme designated as Correspondences from Epipolar geometry and Contours via Triangle barycentric coordinates (CECT). In the proposed approach, reliable correspondences are extracted from the edge contours of the human back by applying epipolar geometry, and these correspondences are then used to compute the correspondences of the featureless points within the edge contour using a triangle barycentric coordinate approach. The accuracy and robustness of the estimated correspondences are ensured by applying three geometric constraints, namely a similarity constraint, a shape constraint and an epipolar constraint. The performance of the proposed approach is demonstrated by means of a series of experiments involving 28 subjects and four different testing conditions. In addition, the accuracy of the proposed localization scheme is evaluated by comparing the estimated 3D positions with those obtained using the cun-based measurement method in Traditional Chinese Medicine (TCM).
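The barycentric step in this abstract can be illustrated with a small sketch: a featureless point inside a triangle of matched contour points in one image is expressed by its barycentric weights, and the same weights are applied to the corresponding triangle in the other image. This is a generic illustration under the assumption that the two triangles are related approximately affinely; the function names are hypothetical, and the similarity, shape, and epipolar constraints of CECT are not modeled.

```python
import numpy as np

def barycentric_coords(p, tri):
    """Barycentric weights of 2D point p with respect to triangle tri (3 x 2)."""
    a, b, c = np.asarray(tri, dtype=float)
    # Solve p - a = u * (b - a) + v * (c - a) for (u, v).
    T = np.column_stack((b - a, c - a))
    u, v = np.linalg.solve(T, np.asarray(p, dtype=float) - a)
    return np.array([1.0 - u - v, u, v])

def transfer_point(p, tri_left, tri_right):
    """Map p from the left-image triangle to the right-image triangle."""
    weights = barycentric_coords(p, tri_left)
    return weights @ np.asarray(tri_right, dtype=float)
```

For example, with `tri_left = [(0, 0), (4, 0), (0, 4)]` and a right triangle shifted by (10, 10), the point (1, 1) maps to (11, 11). On a curved surface such as a back the transfer is only approximate, which is presumably why the method adds further geometric constraints.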

16.
Stereo image analysis is based on establishing correspondences between a pair of images by determining similarity measures for potentially corresponding image parts. Such similarity criteria are only strictly valid for surfaces with Lambertian (diffuse) reflectance characteristics. Specular reflections are viewpoint dependent and may thus cause large intensity differences at corresponding image points. In the presence of specular reflections, traditional stereo approaches are often unable to establish correspondences at all, or the inferred disparity values tend to be inaccurate, or the established correspondences do not belong to the same physical surface point. The stereo image analysis framework for non-Lambertian surfaces presented in this contribution combines geometric cues with photometric and polarimetric information into an iterative scheme that makes it possible to establish stereo correspondences in accordance with the specular reflectance behaviour and, at the same time, to determine the surface gradient field based on the known photometric and polarimetric reflectance properties. The described approach yields a dense 3D reconstruction of the surface which is consistent with all observed geometric and photopolarimetric data. Initially, a sparse 3D point cloud of the surface is computed by traditional block-matching stereo. Subsequently, a dense 3D profile of the surface is determined in the coordinate system of camera 1 based on the shape from photopolarimetric reflectance and depth technique. A synthetic image of the surface is rendered in the coordinate system of camera 2 using the illumination direction and reflectance properties of the surface material. Point correspondences between the rendered image and the observed image of camera 2 are established with the block-matching technique. This procedure yields an increased number of 3D points of higher accuracy, compared to the initial 3D point cloud.
The improved 3D point cloud is used to compute a refined dense 3D surface profile. These steps are iterated until convergence of the 3D reconstruction. An experimental evaluation of our method is provided for areas of several square centimetres of forged and cast iron objects with rough surfaces displaying both diffuse and significant specular reflectance components, where traditional stereo image analysis largely fails. A comparison to independently measured ground truth data reveals that the root-mean-square error of the 3D reconstruction results is typically of the order of 30–100 μm at a lateral pixel resolution of 86 μm. For two example surfaces, the number of stereo correspondences established by the specular stereo algorithm is several orders of magnitude higher than the initial number of 3D points. For one example surface, the number of stereo correspondences decreases by a factor of about two, but the 3D point cloud obtained with the specular stereo method is less noisy, contains a negligible number of outliers, and shows significantly more surface detail than the initial 3D point cloud. For poorly known reflectance parameters we observe a graceful degradation of the accuracy of 3D reconstruction.

17.
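A Fast Local Stereo Matching Method Insensitive to Illumination Conditions   Total citations: 3, self-citations: 0, citations by others: 3
Lai Xiaobo (赖小波)  Zhu Shiqiang (朱世强)  Ma Xuan (马璇) 《机器人 (Robot)》2011,33(3):292-298
To address the problem that the similarity measures of most stereo matching algorithms depend too heavily on the gray-level statistics of the images, a stereo matching algorithm that is insensitive to illumination changes is proposed. First, the Census non-parametric transform is studied and its limitations are analyzed. Second, so that the spatial position of pixels can be taken into account during stereo matching, the gray value of each neighborhood pixel in the transform window whose position relative to the center pixel exceeds one unit is obtained by interpolating the gray values of its four surrounding pixels; ...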
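For orientation, the standard Census transform that this abstract takes as its starting point can be sketched as below: each pixel is encoded by comparing its neighbors with the center, and two codes are compared by Hamming distance. This is the textbook transform only, written under the assumption of a square window with edge padding; the interpolated-neighbor modification described in the abstract is truncated in the source and is not reproduced. The function names are chosen for the example.

```python
import numpy as np

def census_transform(img, radius=1):
    """Encode each pixel as a bit string: 1 where a neighbor is darker than the center."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    padded = np.pad(img, radius, mode="edge")
    code = np.zeros((h, w), dtype=np.uint64)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            if dy == 0 and dx == 0:
                continue  # the center pixel is not compared with itself
            neighbor = padded[radius + dy: radius + dy + h,
                              radius + dx: radius + dx + w]
            code = (code << np.uint64(1)) | (neighbor < img).astype(np.uint64)
    return code

def hamming(a, b):
    """Count the differing bits between two arrays of census codes."""
    diff = np.bitwise_xor(a, b)
    count = np.zeros(diff.shape, dtype=np.uint64)
    while diff.any():
        count += diff & np.uint64(1)
        diff = diff >> np.uint64(1)
    return count
```

Because the code depends only on the ordering of intensities, any gain or offset change that preserves that ordering leaves the codes unchanged, which is the source of the insensitivity to illumination.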

18.
Visual tracking techniques based on stereo endoscopes are developed to measure tissue motion in robot-assisted minimally invasive surgery. However, accurate 3D tracking of tissue surfaces remains challenging due to complicated deformation, poor imaging conditions, specular reflections and other dynamic effects during surgery. This study employs a robust and efficient 3D tracking scheme with two independent recursive processes, namely kernel-based inter-frame motion estimation and model-based intra-frame 3D matching. In the first process, the target region is represented in a joint spatial-color space for robust estimation. By defining a probabilistic similarity measure, a mean-shift-based iterative algorithm is derived for locating the target region in a new image. In the second process, the thin-plate spline model is used to fit the 3D shape of tissue surfaces around the target region. An iterative algorithm based on an efficient second-order minimization technique is derived to compute optimal model parameters. The two processes can be computed in parallel. Their outputs are combined to recover 3D information about the target region. The performance of the proposed method is validated using phantom heart videos and in vivo videos acquired by the daVinci® surgical robotic platform and a synthesized data set with known ground truth.

19.
This paper presents a novel stereo visual odometry (VO) framework based on structure from motion, where robust keypoint tracking and matching are combined with an effective keyframe selection strategy. In order to track and find correct feature correspondences, a robust loop chain matching scheme on two consecutive stereo pairs is introduced. Keyframe selection is based on the proportion of features with high temporal disparity. This criterion relies on the observation that the error in the pose estimation propagates from the uncertainty of 3D points, which is higher for distant points that have low 2D motion. Comparative results based on three VO datasets show that the proposed solution is remarkably effective and robust even for very long path lengths.
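The keyframe criterion stated in this abstract (the proportion of features with high temporal disparity) can be sketched as a simple test on tracked feature positions. The thresholds `disparity_px` and `min_fraction`, and the function name `is_keyframe`, are hypothetical values chosen for illustration; the abstract does not give the values or the exact rule used in the paper.

```python
import numpy as np

def is_keyframe(prev_pts, curr_pts, disparity_px=10.0, min_fraction=0.5):
    """Decide whether the current frame should become a keyframe.

    prev_pts and curr_pts are N x 2 arrays holding the image positions of
    the same tracked features in the last keyframe and the current frame.
    """
    prev_pts = np.asarray(prev_pts, dtype=float)
    curr_pts = np.asarray(curr_pts, dtype=float)
    # Temporal disparity: how far each feature moved in the image (pixels).
    motion = np.linalg.norm(curr_pts - prev_pts, axis=1)
    fraction = float(np.mean(motion > disparity_px))
    return fraction >= min_fraction, fraction
```

Requiring enough features with large image motion favors frames in which many tracked points are well constrained, in line with the abstract's remark that distant points with low 2D motion carry more 3D uncertainty.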

20.
Kernel-based object tracking   Total citations: 47, self-citations: 0, citations by others: 47
A new approach toward target representation and localization, the central component in visual tracking of nonrigid objects, is proposed. The feature histogram-based target representations are regularized by spatial masking with an isotropic kernel. The masking induces spatially-smooth similarity functions suitable for gradient-based optimization; hence, the target localization problem can be formulated using the basin of attraction of the local maxima. We employ a metric derived from the Bhattacharyya coefficient as similarity measure, and use the mean shift procedure to perform the optimization. In the presented tracking examples, the new method successfully coped with camera motion, partial occlusions, clutter, and target scale variations. Integration with motion filters and data association techniques is also discussed. We describe only a few of the potential applications: exploitation of background information, Kalman tracking using motion models, and face tracking.
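The similarity measure named in this abstract can be illustrated with plain histograms. The sketch below computes the Bhattacharyya coefficient between two normalized histograms and the distance sqrt(1 − coefficient) derived from it; the kernel weighting of pixels and the mean shift optimization are left out, and the function names are chosen for the example.

```python
import numpy as np

def bhattacharyya_coefficient(p, q):
    """Similarity in [0, 1] between two histograms (normalized internally)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))

def bhattacharyya_distance(p, q):
    """Distance derived from the coefficient: 0 for identical histograms."""
    return float(np.sqrt(max(0.0, 1.0 - bhattacharyya_coefficient(p, q))))
```

In a tracker, `p` would be the target model histogram and `q` the histogram of a candidate region, with the candidate location adjusted to reduce the distance.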
