Similar Documents
 20 similar documents found (search time: 377 ms)
1.
Multi-frame estimation of planar motion   Cited by: 4 (self-citations: 0, citations by others: 4)
Traditional plane alignment techniques are typically performed between pairs of frames. We present a method for extending existing two-frame planar motion estimation techniques into a simultaneous multi-frame estimation, by exploiting multi-frame subspace constraints of planar surfaces. The paper has three main contributions: 1) we show that when the camera calibration does not change, the collection of all parametric image motions of a planar surface in the scene across multiple frames is embedded in a low dimensional linear subspace; 2) we show that the relative image motion of multiple planar surfaces across multiple frames is embedded in a yet lower dimensional linear subspace, even with varying camera calibration; and 3) we show how these multi-frame constraints can be incorporated into simultaneous multi-frame estimation of planar motion, without explicitly recovering any 3D information or camera calibration. The resulting multi-frame estimation process is more constrained than the individual two-frame estimations, leading to more accurate alignment, even when applied to small image regions.

2.
Traditional motion-parameter estimation for rotational motion relies on two-frame image alignment; this paper proposes a multi-frame motion-parameter estimation method based on multi-frame subspace constraints. It is shown that when the camera parameters do not change, the set of motion parameters across multiple frames can be embedded in a low-dimensional linear subspace; singular value decomposition is used to reduce the rank of the linear subspace, and least squares is used to solve for the motion parameters of all frames. The method does not need to recover any 3D information; because multi-frame estimation is more constrained than two-frame estimation, it achieves more accurate image alignment. The method can also estimate parameters from small image regions.
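The rank-reduction step described in this abstract can be sketched in a few lines of numpy (an illustrative toy, not the authors' code): per-frame motion parameter vectors are stacked into a matrix, and a truncated SVD projects them onto a low-dimensional subspace, which is the least-squares solution under the subspace constraint.

```python
import numpy as np

# Synthetic stand-in for per-frame planar motion parameters: F frames,
# P parameters each, generated to lie in an r-dimensional subspace.
rng = np.random.default_rng(0)
F, P, r = 20, 8, 3
basis = rng.standard_normal((r, P))
coeffs = rng.standard_normal((F, r))
M = coeffs @ basis                                # noise-free motion matrix
M_noisy = M + 0.01 * rng.standard_normal((F, P))  # measured motions

# Truncated SVD = best rank-r approximation in the least-squares sense.
U, s, Vt = np.linalg.svd(M_noisy, full_matrices=False)
M_hat = (U[:, :r] * s[:r]) @ Vt[:r]

print(np.linalg.matrix_rank(M, tol=1e-8))   # rank of the clean motion matrix
print(float(np.abs(M_hat - M).max()))       # worst-case recovery error
```

Under the subspace constraint, the denoised parameters `M_hat` are closer to the true motions than the raw per-frame measurements, mirroring the accuracy gain the abstract reports.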

3.
Dynamic analysis of video sequences often relies on the segmentation of the sequence into regions of consistent motions. Approaching this problem requires a definition of which motions are regarded as consistent. Common approaches to motion segmentation usually group together points or image regions that have the same motion between successive frames (where the same motion can be 2D, 3D, or non-rigid). In this paper we define a new type of motion consistency, which is based on temporal consistency of behaviors across multiple frames in the video sequence. Our definition of consistent “temporal behavior” is expressed in terms of multi-frame linear subspace constraints. This definition applies to 2D, 3D, and some non-rigid motions without requiring prior model selection. We further show that our definition of motion consistency extends to data with directional uncertainty, thus leading to a dense segmentation of the entire image. Such segmentation is obtained by applying the new motion consistency constraints directly to covariance-weighted image brightness measurements. This is done without requiring prior correspondence estimation or feature tracking.

4.
The aim of this work is the recovery of 3D structure and camera projection matrices for each frame of an uncalibrated image sequence. In order to achieve this, correspondences are required throughout the sequence. A significant and successful mechanism for automatically establishing these correspondences is the use of geometric constraints arising from scene rigidity. However, problems arise with such geometry guided matching if general viewpoint and general structure are assumed whilst frames in the sequence and/or scene structure do not conform to these assumptions. Such cases are termed degenerate. In this paper we describe two important cases of degeneracy and their effects on geometry guided matching. The cases are a motion degeneracy where the camera does not translate between frames, and a structure degeneracy where the viewed scene structure is planar. The effects include the loss of correspondences due to under or over fitting of geometric models estimated from image data, leading to the failure of the tracking method. These degeneracies are not a theoretical curiosity, but commonly occur in real sequences where models are statistically estimated from image points with measurement error. We investigate two strategies for tackling such degeneracies: the first uses a statistical model selection test to identify when degeneracies occur; the second uses multiple motion models to overcome the degeneracies. The strategies are evaluated on real sequences varying in motion, scene type, and length from 13 to 120 frames.

5.
Camera networks have gained increased importance in recent years. Existing approaches mostly use point correspondences between different camera views to calibrate such systems. However, it is often difficult or even impossible to establish such correspondences. But even without feature point correspondences between different camera views, if the cameras are temporally synchronized then the data from the cameras are strongly linked together by the motion correspondence: all the cameras observe the same motion. The present article therefore develops the necessary theory to use this motion correspondence for general rigid as well as planar rigid motions. Given multiple static affine cameras which observe a rigidly moving object and track feature points located on this object, what can be said about the resulting point trajectories? Are there any useful algebraic constraints hidden in the data? Is a 3D reconstruction of the scene possible even if there are no point correspondences between the different cameras? And if so, how many points are sufficient? Is there an algorithm which guarantees finding the correct solution to this highly non-convex problem? This article addresses these questions and thereby introduces the concept of low-dimensional motion subspaces. The constraints provided by these motion subspaces enable an algorithm which ensures finding the correct solution to this non-convex reconstruction problem. The algorithm is based on multilinear analysis, matrix and tensor factorizations. Our new approach can handle extreme configurations, e.g. a camera in a camera network tracking only one single point. Results on synthetic as well as on real data sequences act as a proof of concept for the presented insights.

6.
Artificial neural networks for 3-D motion analysis. I. Rigid motion   Cited by: 1 (self-citations: 0, citations by others: 1)
Proposes an approach applying artificial neural net techniques to 3D rigid motion analysis based on sequential multiple time frames. The approach consists of two phases: (1) matching between every two consecutive frames and (2) estimating motion parameters based on the correspondences established. Phase 1 specifies the matching constraints to ensure a stable and coherent feature correspondence establishment between two sequential time frames and configures a 2D Hopfield neural net to enforce these constraints. Phase 2 constructs a 3-layer net to estimate parameters through supervised learning. The method performs motion analysis based on sequential multiple time frames. It represents an effective way to achieve optimal matching between two frames using neural net techniques. The energy function of the Hopfield net is designed to reflect the matching constraints, and the minimization of this function leads to the optimal feature correspondence establishment. The approach introduces the learning concept to motion estimation. The structure of the net provides flexibility in estimating motion parameters based on information from multiple frames.

7.
8.
Objective: To construct correspondences between 3D isometric models more accurately, this paper proposes a method for computing correspondences between 3D isometric models based on a fused feature descriptor combining the heat kernel signature and the wave kernel signature. Method: First, the Laplace operator of the 3D model is computed to obtain the model's eigenvectors and eigenvalues; the eigenvalues and eigenvectors are then used as basis parameters to compute the heat kernel signatures and wave kernel signatures of the source and target models, and the two signatures are fused into a new feature descriptor. The fused descriptor serves as a constraint on uniformly sampled random points on the models, and the correspondence between the source and target models is obtained via a minimum-value matching algorithm. Results: Experiments show that the correct-match rate of correspondences computed under the fused-descriptor constraint is on average 19.429% higher than under the heat-kernel-signature constraint and 4.857% higher than under the wave-kernel-signature constraint. Conclusion: The proposed fused feature descriptor is suitable for computing correspondences between isometric or near-isometric 3D models and yields more accurate correspondences than using the heat kernel signature or wave kernel signature alone.
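The fusion described above can be sketched from a Laplacian eigendecomposition (an illustrative toy, not the paper's code; the spectrum here is synthetic, and the scale/energy sampling choices are assumptions): HKS evaluates heat diffusion at several time scales, WKS evaluates band-pass filters at several energy levels, and the two per-vertex signatures are normalized and concatenated.

```python
import numpy as np

# Synthetic stand-ins for a mesh Laplacian's first k eigenpairs.
rng = np.random.default_rng(1)
n_verts, k = 100, 20
evals = np.sort(rng.uniform(0.1, 10.0, k))   # eigenvalues (ascending)
evecs = rng.standard_normal((n_verts, k))    # eigenfunctions at vertices

# Heat kernel signature: sum_i exp(-lambda_i * t) * phi_i(x)^2 at 8 scales.
ts = np.geomspace(0.01, 1.0, 8)
hks = (evecs**2) @ np.exp(-np.outer(evals, ts))            # (n_verts, 8)

# Wave kernel signature: log-normal energy bands over log-eigenvalues.
es = np.linspace(np.log(evals[0]), np.log(evals[-1]), 8)
sigma = (es[1] - es[0]) * 2.0
w = np.exp(-(es[None, :] - np.log(evals)[:, None])**2 / (2 * sigma**2))
wks = (evecs**2) @ (w / w.sum(axis=0, keepdims=True))      # (n_verts, 8)

# Fused descriptor: normalize each signature, then concatenate per vertex.
fused = np.hstack([hks / np.linalg.norm(hks), wks / np.linalg.norm(wks)])
print(fused.shape)
```

On a real mesh, `evals`/`evecs` would come from a sparse eigensolver on the cotangent Laplacian; matching then compares rows of `fused` between source and target models.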

9.
This paper deals with the estimation of motion and structure with an absolute scale factor from stereo image sequences without stereo correspondence. We show that the absolute motion and structure can be determined using only motion correspondences. This property is very useful in two aspects: first, motion correspondence is easier to solve than stereo correspondence because sequences of images can be taken at short time intervals; second, it is not necessary that the rigid scene be included in the intersection of the field of view of the two cameras. It is also shown that the degenerate cases reported in this paper constitute all of the degenerate cases for the scheme and can be easily avoided.

10.
An interactive image-sequence-based rapid modeling system   Cited by: 1 (self-citations: 1, citations by others: 0)
This paper presents an interactive 3D modeling system based on image sequences. Given an uncalibrated image or video sequence as input, the system automatically recovers the camera parameters; the user then only needs to roughly sketch the structure of an object on a few frames, and the system automatically resolves the correspondences of these user interactions across frames, quickly reconstructing a realistic 3D model of the scene. The system supports reconstruction of points and line segments, straight lines and planes, and curves and surfaces, meeting the requirements for fast, high-precision reconstruction of complex real-world scenes. Modeling experiments on several real captured image sequences show that the system is efficient and practical, and satisfies real modeling needs.

11.
Motion of points and lines in the uncalibrated case   Cited by: 4 (self-citations: 4, citations by others: 0)
In the present paper we address the problem of computing structure and motion, given a set of point and/or line correspondences in a monocular image sequence, when the camera is not calibrated. Considering point correspondences first, we analyse how to parameterize the retinal correspondences as a function of the chosen geometry: Euclidean, affine or projective. The simplest of these parameterizations, called the FQs-representation, is a composite projective representation. The main result is that, considering N+1 views of such a monocular image sequence, the retinal correspondences are parameterized by 11N − 4 parameters in the general projective case. Moreover, 3 other parameters are required to work in the affine case and 5 additional parameters in the Euclidean case. These 8 parameters are calibration parameters and must be calculated from at least 8 external pieces of information or constraints. The method being constructive, all these representations are made explicit. Then, considering line correspondences, we show how the same parameterizations can be used to analyse the motion of lines in the uncalibrated case. The case of three views is studied extensively and a geometrical interpretation is proposed, introducing the notion of trifocal geometry, which generalizes the well-known epipolar geometry. We also discuss how to introduce line correspondences into a framework based on point correspondences, using the same equations. Finally, considering the FQs-representation, one implementation is proposed as a motion module that takes retinal correspondences as input and provides an estimate of the 11N − 4 retinal motion parameters. As discussed in this paper, this module can also estimate the 3D depth of the points up to an affine or projective transformation, defined by the 8 parameters identified in the first section. Experimental results are provided.

12.
Camera geometries for image matching in 3-D machine vision   Cited by: 2 (self-citations: 0, citations by others: 2)
The location of a scene element can be determined from the disparity of two of its depicted entities (each in a different image). Prior to establishing disparity, however, the correspondence problem must be solved. It is shown that for the axial-motion stereo camera model the probability of determining unambiguous correspondence assignments is significantly greater than that for other stereo camera models. However, the mere geometry of the stereo camera system does not provide sufficient information for uniquely identifying correct correspondences. Therefore, additional constraints derived from justifiable assumptions about the scene domain and from the scene radiance model are utilized to reduce the number of potential matches. The measure for establishing the correct correspondence is shown to be a function of the geometrical constraints, scene constraints, and scene radiance model.

13.
Images of cellular structures in growing plant roots acquired using confocal laser scanning microscopy have some unusual properties that make motion estimation challenging. These include multiple motions, non-Gaussian noise and large regions with little spatial structure. In this paper, a method for motion estimation is described that uses a robust multi-frame likelihood model and a technique for estimating uncertainty. An efficient region-based matching approach was used followed by a forward projection method. Over small timescales the dynamics are simple (approximately locally constant) and the change in appearance small. Therefore, a constant local velocity model is used and the MAP estimate of the joint probability over a set of frames is recovered. Occurrences of multiple modes in the posterior are detected, and in the case of a single dominant mode, motion is inferred using Laplace's method. The method was applied to several Arabidopsis thaliana root growth sequences with varying levels of success. In addition, comparative results are given for three alternative motion estimation approaches: the Kanade–Lucas–Tomasi tracker, Black and Anandan's robust smoothing method, and Markov random field based methods.

14.
We propose a depth and image scene flow estimation method taking the input of a binocular video. The key component is motion-depth temporal consistency preservation, making computation in long sequences reliable. We tackle a number of fundamental technical issues, including connection establishment between motion and depth, structure consistency preservation in multiple frames, and long-range temporal constraint employment for error correction. We address all of them in a unified depth and scene flow estimation framework. Our main contributions include development of motion trajectories, which robustly link frame correspondences in a voting manner, rejection of depth/motion outliers through temporal robust regression, novel edge occurrence map estimation, and introduction of anisotropic smoothing priors for proper regularization.

15.
Effects of Errors in the Viewing Geometry on Shape Estimation   Cited by: 2 (self-citations: 0, citations by others: 2)
A sequence of images acquired by a moving sensor contains information about the three-dimensional motion of the sensor and the shape of the imaged scene. Interesting research during the past few years has attempted to characterize the errors that arise in computing 3D motion (egomotion estimation) as well as the errors that result in the estimation of the scene's structure (structure from motion). Previous research is characterized by the use of optic flow or correspondence of features in the analysis as well as by the employment of particular algorithms and models of the scene in recovering expressions for the resulting errors. This paper presents a geometric framework that characterizes the relationship between 3D motion and shape in the presence of errors. We examine how the three-dimensional space recovered by a moving monocular observer, whose 3D motion is estimated with some error, is distorted. We characterize the space of distortions by its level sets, that is, we characterize the systematic distortion via a family of iso-distortion surfaces, which describes the locus over which the depths of points in the scene in view are distorted by the same multiplicative factor. The framework introduced in this way has a number of applications: Since the visible surfaces have positive depth (visibility constraint), by analyzing the geometry of the regions where the distortion factor is negative, that is, where the visibility constraint is violated, we make explicit situations which are likely to give rise to ambiguities in motion estimation, independent of the algorithm used. We provide a uniqueness analysis for 3D motion analysis from normal flow. We study the constraints on egomotion, object motion, and depth for an independently moving object to be detectable by a moving observer, and we offer a quantitative account of the precision needed in an inertial sensor for accurate estimation of 3D motion.

16.
SoftPOSIT: Simultaneous Pose and Correspondence Determination   Cited by: 3 (self-citations: 0, citations by others: 3)
The problem of pose estimation arises in many areas of computer vision, including object recognition, object tracking, site inspection and updating, and autonomous navigation when scene models are available. We present a new algorithm, called SoftPOSIT, for determining the pose of a 3D object from a single 2D image when correspondences between object points and image points are not known. The algorithm combines the iterative softassign algorithm (Gold and Rangarajan, 1996; Gold et al., 1998) for computing correspondences and the iterative POSIT algorithm (DeMenthon and Davis, 1995) for computing object pose under a full-perspective camera model. Our algorithm, unlike most previous algorithms for pose determination, does not have to hypothesize small sets of matches and then verify the remaining image points. Instead, all possible matches are treated identically throughout the search for an optimal pose. The performance of the algorithm is extensively evaluated in Monte Carlo simulations on synthetic data under a variety of levels of clutter, occlusion, and image noise. These tests show that the algorithm performs well in a variety of difficult scenarios, and empirical evidence suggests that the algorithm has an asymptotic run-time complexity that is better than previous methods by a factor of the number of image points. The algorithm is being applied to a number of practical autonomous vehicle navigation problems including the registration of 3D architectural models of a city to images, and the docking of small robots onto larger robots.
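The softassign correspondence step that SoftPOSIT builds on can be sketched as alternating row/column (Sinkhorn) normalization of a positive match matrix, driving it toward a doubly stochastic assignment. This is an illustrative toy, not the SoftPOSIT implementation: the cost matrix here is random, whereas the real algorithm derives costs from reprojection error and interleaves this step with pose updates.

```python
import numpy as np

# Random stand-in for a match cost matrix (rows: image points, cols: model points).
rng = np.random.default_rng(2)
cost = rng.uniform(size=(5, 5))

# Soft assignment: lower cost -> larger entry; beta controls "softness".
beta = 10.0
M = np.exp(-beta * cost)

# Sinkhorn iterations: alternately normalize rows and columns so M
# approaches a doubly stochastic correspondence matrix.
for _ in range(200):
    M = M / M.sum(axis=1, keepdims=True)
    M = M / M.sum(axis=0, keepdims=True)

print(M.sum(axis=0))   # column sums (normalized last, so ~1 exactly)
print(M.sum(axis=1))   # row sums (converged to ~1)
```

In the full algorithm, beta is annealed upward so the soft assignment gradually hardens toward a one-to-one matching as the pose estimate improves.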

17.
Crowded motions refer to multiple objects moving around and interacting, such as crowds of pedestrians. We capture crowded scenes using a depth scanner at video frame rates, so our input is a set of depth frames which sample the scene over time. Processing such data is challenging as it is highly unorganized, with large spatio-temporal holes due to many occlusions. As no correspondence is given, locally tracking 3D points across frames is hard due to noise and missing regions. Furthermore, global segmentation and motion completion in the presence of large occlusions is ambiguous and hard to predict. Our algorithm utilizes the Gestalt principles of common fate and good continuity to compute motion tracking and completion, respectively. Our technique does not assume any pre-given markers or motion-template priors. Our key idea is to reduce the motion completion problem to a 1D curve fitting and matching problem which can be solved efficiently using a global optimization scheme. We demonstrate our segmentation and completion method on a variety of synthetic and real-world crowded scanned scenes.

18.
Scalable video quality enhancement refers to the process of enhancing low quality frames using high quality ones in scalable video bitstreams with time-varying qualities. A key problem in the enhancement is how to search for correspondence between high quality and low quality frames. Previous algorithms usually use block-based motion estimation to search for correspondences. Such an approach can hardly estimate scale and rotation transforms and always introduces outliers to the motion estimation results. In this paper, we propose a pixel-based outlier-free motion estimation algorithm to solve this problem. In our algorithm, the motion vector for each pixel is calculated so as to estimate translation, scale, and rotation transforms. The motion relationships between neighboring pixels are considered via the Markov random field model to improve the motion estimation accuracy. Outliers are detected and avoided by taking both blocking effects and matching percentage in the scale-invariant feature transform field into consideration. Experiments are conducted in two scenarios that exhibit spatial scalability and quality scalability, respectively. Experimental results demonstrate that, in comparison with previous algorithms, the proposed algorithm achieves better correspondence and avoids the simultaneous introduction of outliers, especially for videos with scale and rotation transforms.

19.
Sung-In Choi, Advanced Robotics, 2013, 27(15): 1005–1013
Several pose estimation algorithms, such as the n-point and perspective-n-point (PnP) algorithms, have been introduced over the last few decades to solve the relative and absolute pose estimation problems in robotics research. Since the n-point algorithms cannot determine the real scale of robot motion, the PnP algorithms are often used to find the absolute scale of motion. This paper introduces a new PnP algorithm which uses only two 3D–2D correspondences by considering only planar motion. Experimental results show that the proposed algorithm recovers the absolute motion in real scale with high accuracy and less computational time than previous algorithms.

20.
Objective: To address the difficulty of acquiring motion data for quadrupeds and to establish a fast, easy-to-use pipeline for quadruped motion reconstruction and authoring, this paper proposes a real-time low-dimensional motion generation method for quadrupeds. Method: First, a low-dimensional physics solver based on point masses, rigid bodies and springs is built, abstracting the quadruped skeleton into a low-dimensional physical model; next, footprint constraints are established according to the gait pattern, and the motion of the whole-body physical point masses is solved limb by limb from the feet upward; finally, from the point-mass positions corrected by general constraints, the whole-body animation skeleton joints are back-computed to generate the target motion. Results: In multiple experiments on quadrupeds with different gaits, body shapes and styles, the method reaches a generation speed of 330 frames/s with good visual quality and generality. Conclusion: The method's input data are easy to learn and acquire, its computation is stable in real time, and it can quickly generate visually plausible motion data in multiple styles.
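The point-mass/spring abstraction at the core of such a low-dimensional solver can be sketched in a few lines. The following toy (parameters are illustrative, not from the paper) integrates two unit point masses joined by a damped spring with semi-implicit Euler, the kind of cheap per-step solve that makes real-time generation feasible.

```python
import numpy as np

# Spring constant, rest length, time step, number of steps (all illustrative).
k, rest, dt, steps = 50.0, 1.0, 0.002, 5000
pos = np.array([0.0, 1.5])   # 1D positions of the two unit masses
vel = np.zeros(2)

for _ in range(steps):
    d = pos[1] - pos[0]
    f = k * (d - rest)               # Hooke's law: force on mass 0
    acc = np.array([f, -f])          # equal and opposite (unit masses)
    vel = 0.995 * (vel + acc * dt)   # damped semi-implicit Euler
    pos = pos + vel * dt

print(float(pos[1] - pos[0]))   # separation settles near the rest length
```

A full skeleton adds many such masses and springs plus footprint constraints per gait phase, but each frame remains a handful of vector operations, consistent with the frame rates the abstract reports.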
