Similar Documents
20 similar documents were found (search time: 375 ms).
1.
2.
《Real》1997,3(6):415-432
Real-time motion capture plays a very important role in various applications, such as 3D interfaces for virtual reality systems, digital puppetry, and real-time character animation. In this paper we address the problem of estimating and recognizing the motion of articulated objects using the optical motion capture technique. In addition, we present an effective method to control an articulated human figure in real time. The heart of this problem is the estimation of the 3D motion and posture of an articulated, volumetric object using feature points from a sequence of multiple perspective views. Under some moderate assumptions, such as smooth motion and known initial posture, we develop a model-based technique for the recovery of the 3D location and motion of a rigid object using a variation of the Kalman filter. The posture of the 3D volumetric model is updated by the 2D image flow of the feature points across all views. Two novel concepts – the hierarchical Kalman filter (HKF) and the adaptive hierarchical structure (AHS) incorporating the kinematic properties of the articulated object – are proposed to extend our formulation from rigid objects to articulated ones. Our formulation also allows us to avoid two classic problems in 3D tracking: the multi-view correspondence problem and the occlusion problem. By adding more cameras and placing them appropriately, our approach can deal with motion over a very wide area. Furthermore, multiple objects can be handled by managing multiple AHSs and processing multiple HKFs. We show the validity of our approach using synthetic data acquired simultaneously from multiple virtual cameras in a virtual environment (VE) and real data derived from a moving light display of walking motion. The results confirm that the model-based algorithm works well for tracking multiple rigid objects.
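The rigid-object recovery described above is built on a Kalman predict/update cycle. As a minimal sketch, assuming a 1D constant-velocity state in place of the paper's full 3D posture model (the HKF itself is not reproduced here):

```python
import numpy as np

def kalman_track(measurements, dt=1.0, q=1e-3, r=1e-2):
    """Track 1D position with a constant-velocity Kalman filter.

    State x = [position, velocity]; only position is measured.
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])   # state transition
    H = np.array([[1.0, 0.0]])              # measurement model
    Q = q * np.eye(2)                       # process noise covariance
    R = np.array([[r]])                     # measurement noise covariance

    x = np.array([measurements[0], 0.0])    # initial state
    P = np.eye(2)                           # initial covariance
    estimates = []
    for z in measurements:
        # predict
        x = F @ x
        P = F @ P @ F.T + Q
        # update
        y = z - H @ x                       # innovation
        S = H @ P @ H.T + R
        K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
        x = x + (K @ y).ravel()
        P = (np.eye(2) - K @ H) @ P
        estimates.append(x.copy())
    return np.array(estimates)
```

For a target moving at constant speed, the velocity estimate converges even though only positions are observed; the hierarchical variant in the abstract applies this idea per body part under kinematic constraints.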

3.
Markerless tracking of complex human motions from multiple views
We present a method for markerless tracking of complex human motions from multiple camera views. In the absence of markers, the task of recovering the pose of a person during such motions is challenging and requires strong image features and robust tracking. We propose a solution which integrates multiple image cues such as edges, color information and volumetric reconstruction. We show that a combination of multiple image cues helps the tracker to overcome ambiguous situations such as limbs touching or strong occlusions of body parts. Following a model-based approach, we match an articulated body model built from superellipsoids against these image cues. Stochastic Meta Descent (SMD) optimization is used to find the pose which best matches the images. Stochastic sampling makes SMD robust against local minima and lowers the computational cost, as a small set of predicted image features is sufficient for optimization. The power of SMD is demonstrated by comparing it to the commonly used Levenberg–Marquardt method. Results are shown for several challenging sequences with complex motions and full articulation, tracking 24 degrees of freedom at ≈1 frame per second.

4.
Recovering articulated shape and motion, especially human body motion, from video is a challenging problem with a wide range of applications in medical studies, sports analysis, animation, and beyond. Previous work on articulated motion recovery generally requires prior knowledge of the kinematic chain and usually does not address the recovery of the articulated shape. The non-rigidity of some articulated parts, e.g. human body motion with non-rigid facial motion, is completely ignored. We propose a factorization-based approach to recover the shape, motion and kinematic chain of an articulated object with non-rigid parts, all directly from video sequences under a unified framework. The proposed approach is based on our modeling of articulated non-rigid motion as a set of intersecting motion subspaces. A motion subspace is the linear subspace spanned by the trajectories of an object; it can model a rigid or non-rigid motion. The intersection of the motion subspaces of two linked parts models the motion of an articulated joint or axis. Our approach consists of algorithms for motion segmentation, kinematic chain building, and shape recovery. It handles outliers and can be automated. We test our approach through synthetic and real experiments and demonstrate how to recover articulated structure with non-rigid parts from a single-view camera without prior knowledge of its kinematic chain.
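The factorization view rests on a rank constraint: under an affine/orthographic camera, the image trajectories of one rigid part span a subspace of dimension at most four. A synthetic check of that claim (illustrative only; the rotation axis, point count, and frame count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
P3 = rng.standard_normal((3, 20))           # 20 random points on one rigid part

rows = []
for f in range(15):                         # 15 frames: rotate about y, translate
    a = 0.1 * f
    R = np.array([[np.cos(a), 0.0, np.sin(a)],
                  [0.0,       1.0, 0.0],
                  [-np.sin(a), 0.0, np.cos(a)]])
    t = np.array([[0.3 * f], [0.1 * f]])
    W2 = R[:2] @ P3 + t                     # orthographic projection of frame f
    rows.append(W2)
W = np.vstack(rows)                         # 30 x 20 trajectory matrix

s = np.linalg.svd(W, compute_uv=False)
print(s[:5] / s[0])                         # singular values beyond the 4th vanish
```

Because W factors as a 30x4 motion matrix times a 4x20 structure matrix, its numerical rank is 4; the intersection of two such subspaces for linked parts is what encodes a joint in the abstract's framework.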

5.
Objects can exhibit different dynamics at different spatio-temporal scales, a property that is often exploited by visual tracking algorithms. A local dynamic model is typically used to extract image features that are then used as inputs to a system for tracking the object using a global dynamic model. Approximate local dynamics may be brittle—point trackers drift due to image noise and adaptive background models adapt to foreground objects that become stationary—and constraints from the global model can make them more robust. We propose a probabilistic framework for incorporating knowledge about global dynamics into the local feature extraction processes. A global tracking algorithm can be formulated as a generative model and used to predict feature values, thereby influencing the observation process of the feature extractor, which in turn produces feature values that are used in high-level inference. We combine such models utilizing a multichain graphical model framework. We show the utility of our framework for improving feature tracking as well as shape and motion estimates in a batch factorization algorithm. We also propose an approximate filtering algorithm appropriate for online applications and demonstrate its application to tasks in background subtraction, structure from motion and articulated body tracking.

6.
Robust detection and tracking of pedestrians in image sequences are essential for many vision applications. In this paper, we propose a method to detect and track multiple pedestrians using motion, color information and the AdaBoost algorithm. Our approach detects pedestrians in a walking pose from a single camera on a mobile or stationary system. In the case of mobile systems, ego-motion of the camera is compensated for using corresponding feature sets. The region of interest is calculated from the difference image between two consecutive images using the compensated image. The pedestrian detector is learned by boosting a number of weak classifiers based on Histogram of Oriented Gradients (HOG) features. Pedestrians are tracked by a block-matching method using color information. Our tracking system can track pedestrians through partial occlusions and, using information stored in advance, can resume tracking without misses after an occlusion ends. The proposed approach has been tested on a number of image sequences and was shown to detect and track multiple pedestrians very well.
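The boosting stage can be illustrated independently of the HOG features. A toy AdaBoost over one-dimensional threshold stumps, standing in for the paper's HOG-based weak classifiers (a sketch, not the authors' implementation):

```python
import numpy as np

def train_adaboost(X, y, n_rounds=5):
    """Boost threshold stumps on labeled data (y in {-1, +1})."""
    n = len(y)
    w = np.full(n, 1.0 / n)                 # sample weights
    stumps = []                             # (feature, threshold, polarity, alpha)
    for _ in range(n_rounds):
        best = None
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = pol * np.where(X[:, j] > thr, 1, -1)
                    err = np.sum(w[pred != y])
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, pred)
        err, j, thr, pol, pred = best
        err = max(err, 1e-10)               # avoid log(0) on a perfect stump
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)      # misclassified samples gain weight
        w /= w.sum()
        stumps.append((j, thr, pol, alpha))
    return stumps

def predict(stumps, X):
    score = np.zeros(len(X))
    for j, thr, pol, alpha in stumps:
        score += alpha * pol * np.where(X[:, j] > thr, 1, -1)
    return np.sign(score)
```

In the detector described above, the same reweighting loop runs over HOG responses instead of raw scalars, concentrating later rounds on hard pedestrian/background examples.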

7.
Spline-Based Image Registration

8.
We present a novel variational approach for segmenting the image plane into a set of regions of parametric motion on the basis of two consecutive frames from an image sequence. Our model is based on a conditional probability for the spatio-temporal image gradient, given a particular velocity model, and on a geometric prior on the estimated motion field favoring motion boundaries of minimal length. Exploiting the Bayesian framework, we derive a cost functional which depends on parametric motion models for each of a set of regions and on the boundary separating these regions. The resulting functional can be interpreted as an extension of the Mumford-Shah functional from intensity segmentation to motion segmentation. In contrast to most alternative approaches, the problems of segmentation and motion estimation are jointly solved by continuous minimization of a single functional. Minimizing this functional with respect to its dynamic variables results in an eigenvalue problem for the motion parameters and in a gradient descent evolution for the motion discontinuity set. We propose two different representations of this motion boundary: an explicit spline-based implementation which can be applied to the motion-based tracking of a single moving object, and an implicit multiphase level set implementation which allows for the segmentation of an arbitrary number of multiply connected moving objects. Numerical results both for simulated ground truth experiments and for real-world sequences demonstrate the capacity of our approach to segment objects based exclusively on their relative motion.

9.
Objective: In visual tracking, motion information can predict the target's position; ignoring it, or modeling the motion in a way that deviates substantially from reality, may cause tracking failure. Considering that visual saliency rapidly directs attention to objects of interest, we introduce it into tracking and propose a tracking algorithm based on spatio-temporal motion saliency. Method: Following the hierarchical processing of motion information in the visual cortex, we build a bottom-up spatio-temporal motion saliency model: 3D spatio-temporal filters perform low-level encoding of the motion signal, and a max-pooling operator performs local encoding of motion features. Exploiting the temporal correlation between consecutive frames, motion saliency is measured by differencing the spatio-temporal motion features, yielding a spatio-temporal motion saliency map. Within a particle filter framework, this map is combined with a color histogram to measure the correlation between predicted and observed states, determine the target state, and achieve tracking. Results: Compared with other trackers, the proposed method improves center location error, precision, and success rate, and tracks targets stably under illumination change, background clutter, motion blur, partial occlusion, and deformation. Moreover, incorporating spatio-temporal motion saliency into other trackers improves their results, further validating its effectiveness for moving-target tracking. Conclusion: Spatio-temporal motion saliency effectively measures a target's motion, enhances motion-salient target regions, and suppresses distracting regions, thereby improving tracking performance.
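A heavily simplified version of the motion-saliency measurement can be sketched with plain frame differencing, using box smoothing as a stand-in for the model's 3D spatio-temporal filters and pooling (illustrative assumptions throughout):

```python
import numpy as np

def motion_saliency(prev, curr):
    """Difference of locally pooled frames as a crude motion saliency map."""
    def pool(img):
        # 3x3 box smoothing, a stand-in for the model's spatio-temporal filters
        out = img.astype(float)
        for ax in (0, 1):
            out = (np.roll(out, 1, ax) + out + np.roll(out, -1, ax)) / 3.0
        return out
    s = np.abs(pool(curr) - pool(prev))
    return s / (s.max() + 1e-12)            # normalize to [0, 1]
```

In the full algorithm this map would weight particle likelihoods alongside the color histogram; here it merely highlights regions whose appearance changed between frames.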

10.
Automatic acquisition and initialization of articulated models
Tracking, classification and visual analysis of articulated motion is challenging because of the difficulties involved in separating noise and variability caused by appearance, size and viewpoint fluctuations from task-relevant variations. By incorporating powerful domain knowledge, model-based approaches are able to overcome these problems to a great extent and are actively explored by many researchers. However, model acquisition, initialization and adaptation are still relatively under-investigated problems, especially for the case of single-camera systems. In this paper, we address the problem of automatic acquisition and initialization of articulated models from monocular video without any prior knowledge of shape and kinematic structure. The framework is applied in a human-computer interaction context where articulated shape models have to be acquired from unknown users for subsequent limb tracking. Bayesian motion segmentation is used to extract and initialize articulated models from visual data. Image sequences are decomposed into rigid components that can undergo parametric motion. The relative motion of these components is used to obtain joint information. The resulting components are assembled into an articulated kinematic model which is then used for visual tracking, eliminating the need for manual initialization or adaptation. The efficacy of the method is demonstrated on synthetic as well as natural image sequences. The accuracy of the joint estimation stage is verified on ground truth data. (Correspondence to: N. Krahnstoever)
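The step of obtaining joint information from the relative motion of rigid components has a compact closed form in 2D: a revolute joint is the point that both parts move identically. A sketch assuming the per-part rigid transforms are already known (the paper's Bayesian segmentation is not reproduced):

```python
import numpy as np

def rot2d(a):
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

def joint_from_motions(Ra, ta, Rb, tb):
    """A revolute joint p satisfies Ra p + ta = Rb p + tb, i.e. it moves
    identically with both parts, so (Ra - Rb) p = tb - ta."""
    return np.linalg.solve(Ra - Rb, tb - ta)
```

The system is solvable whenever the two parts rotate by different angles; degenerate (equal-rotation) frame pairs carry no joint information, which is why such methods accumulate evidence over a sequence.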

11.
We propose a model-based tracking method for articulated objects in monocular video sequences under varying illumination conditions. The tracking method uses estimates of optical flows constructed by projecting model textures into the camera images and comparing the projected textures with the recorded information. An articulated body is modelled in terms of 3D primitives, each possessing a specified texture on its surface. An important step in model-based tracking of 3D objects is the estimation of the pose of the object during the tracking process. The optimal pose is estimated by minimizing errors between the computed optical flow and the projected 2D velocities of the model textures. This estimation uses a least-squares method with kinematic constraints for the articulated object and a perspective camera model. We test our framework with an articulated robot and show results.

12.
Computing occluding and transparent motions
Computing the motions of several moving objects in image sequences involves simultaneous motion analysis and segmentation. This task can become complicated when image motion changes significantly between frames, as with camera vibrations. Such vibrations make tracking in longer sequences harder, as temporal motion constancy cannot be assumed. The problem becomes even more difficult in the case of transparent motions. A method is presented for detecting and tracking occluding and transparent moving objects, which uses temporal integration without assuming motion constancy. Each new frame in the sequence is compared to a dynamic internal representation image of the tracked object. The internal representation image is constructed by temporally integrating frames after registration based on the motion computation. The temporal integration maintains sharpness of the tracked object, while blurring objects that have other motions. Comparing new frames to the internal representation image causes the motion analysis algorithm to continue tracking the same object in subsequent frames, and to improve the segmentation.
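The internal-representation update can be sketched as a running average of motion-registered frames. The sketch below assumes integer pixel shifts and a known motion estimate, which is a strong simplification of the registration step:

```python
import numpy as np

def temporal_integrate(frames, shifts, alpha=0.2):
    """Build an internal representation by averaging frames after registration.

    shifts[t] is the (dy, dx) motion of the tracked object in frame t, as
    produced by a motion-computation step (assumed known here).  Registering
    by the tracked motion keeps that object sharp while other motions blur.
    """
    rep = None
    for frame, (dy, dx) in zip(frames, shifts):
        aligned = np.roll(frame, (-dy, -dx), axis=(0, 1))  # undo object motion
        rep = aligned if rep is None else (1 - alpha) * rep + alpha * aligned
    return rep
```

Pixels belonging to the tracked object reinforce each other across frames, while a distractor moving differently is smeared and fades, which is exactly the sharpen/blur behavior the abstract describes.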

13.
Motion segmentation in moving-camera videos is a very challenging task because of the motion dependence between the camera and moving objects. Camera motion compensation is recognized as an effective approach. However, existing work depends on prior knowledge of the camera motion and scene structure for model selection, which is not always available in practice. Moreover, the image-plane motion suffers from depth variations, which leads to depth-dependent motion segmentation in 3D scenes. To solve these problems, this paper develops a prior-free dependent motion segmentation algorithm by introducing a modified Helmholtz-Hodge decomposition (HHD) based object-motion oriented map (OOM). By decomposing the image motion (optical flow) into a curl-free and a divergence-free component, all kinds of camera-induced image motion can be represented by these two components in an invariant way. HHD identifies the camera-induced image motion as one segment irrespective of depth variations with the help of OOM. To segment object motions from the scene, we deploy a novel spatio-temporally constrained quadtree labeling. Extensive experimental results on benchmarks demonstrate that our method improves on the performance of the state of the art by 10-20%, even on challenging scenes with complex backgrounds.
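On a periodic grid, the Helmholtz-Hodge decomposition admits a compact spectral implementation; the sketch below is an illustrative simplification and does not reproduce the paper's modified HHD or the OOM construction:

```python
import numpy as np

def hhd(u, v):
    """Split a 2*pi-periodic flow (u, v) into curl-free + divergence-free
    parts by solving lap(phi) = div(u, v) in the Fourier domain."""
    n = u.shape[0]
    k = np.fft.fftfreq(n, d=1.0 / n)           # integer wavenumbers
    kx, ky = np.meshgrid(k, k, indexing="xy")
    U, V = np.fft.fft2(u), np.fft.fft2(v)
    div = 1j * kx * U + 1j * ky * V            # spectral divergence
    k2 = kx ** 2 + ky ** 2
    k2[0, 0] = 1.0                             # mean mode has zero divergence
    Phi = -div / k2                            # Poisson solve: -|k|^2 Phi = Div
    uc = np.real(np.fft.ifft2(1j * kx * Phi))  # curl-free part: grad(phi)
    vc = np.real(np.fft.ifft2(1j * ky * Phi))
    return (uc, vc), (u - uc, v - vc)
```

Feeding a purely gradient (curl-free) flow through the decomposition should return it unchanged, with a vanishing divergence-free remainder; in the paper's setting the two components jointly capture camera-induced motion regardless of scene depth.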

14.
The aim of this paper is to investigate whether it is possible to construct invariants for articulated objects—objects composed of different rigid parts which are allowed to perform a restricted motion with respect to each other—which use only partial information from each component. To this end, the transformation group describing the deformations of the image of an articulated object due to relative motions of the components, and/or changes in the position of the camera, is identified. It turns out that for a planar articulated object with two rigid components that are allowed to move within the object plane, this transformation group is (anti-isomorphic to) the semi-direct product of the group one would obtain if the object were rigid, and its smallest normal subgroup containing the transformations due to the relative motions of the components. Depending on the projection model, different answers to the question above emerge. For instance, when using perspective projection no other invariants exist than those obtained by considering each part separately as a rigid object, whereas in the pseudo-orthographic case simpler invariants (using only partial information from each component) do exist. Examples of such invariants are given. (The author is a Postdoctoral Research Fellow of the Belgian National Fund for Scientific Research (N.F.W.O.).)

15.
We introduce a robust framework for learning and fusing of orientation appearance models based on both texture and depth information for rigid object tracking. Our framework fuses data obtained from a standard visual camera and dense depth maps obtained by low-cost consumer depth cameras such as the Kinect. To combine these two completely different modalities, we propose to use features that do not depend on the data representation: angles. More specifically, our framework combines image gradient orientations as extracted from intensity images with the directions of surface normals computed from dense depth fields. We propose to capture the correlations between the obtained orientation appearance models using a fusion approach motivated by the original Active Appearance Models (AAMs). To incorporate these features in a learning framework, we use a robust kernel based on the Euler representation of angles which does not require off-line training, and can be efficiently implemented online. The robustness of learning from orientation appearance models is presented both theoretically and experimentally in this work. This kernel enables us to cope with gross measurement errors, missing data as well as other typical problems such as illumination changes and occlusions. By combining the proposed models with a particle filter, the proposed framework was used for performing 2D plus 3D rigid object tracking, achieving robust performance in very difficult tracking scenarios including extreme pose variations.

16.
We present an algorithm for identifying and tracking independently moving rigid objects from optical flow. Some previous attempts at segmentation via optical flow have focused on finding discontinuities in the flow field. While discontinuities do indicate a change in scene depth, they do not in general signal a boundary between two separate objects. The proposed method uses the fact that each independently moving object has a unique epipolar constraint associated with its motion. Thus motion discontinuities based on self-occlusion can be distinguished from those due to separate objects. The use of epipolar geometry allows for the determination of individual motion parameters for each object as well as the recovery of relative depth for each point on the object. The algorithm assumes an affine camera where perspective effects are limited to changes in overall scale. No camera calibration parameters are required. A Kalman filter based approach is used for tracking motion parameters over time.
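The epipolar test at the heart of such segmentation is easy to demonstrate in a simplified setting. Below, each object undergoes a pure image translation, so its fundamental matrix reduces to the skew matrix of its epipole (a hypothetical construction for illustration; the paper estimates full motion parameters):

```python
import numpy as np

def skew(e):
    """Cross-product matrix: skew(e) @ x == np.cross(e, x)."""
    return np.array([[0.0, -e[2], e[1]],
                     [e[2], 0.0, -e[0]],
                     [-e[1], e[0], 0.0]])

def segment_by_epipolar(x1, x2, fundamentals):
    """Assign each correspondence (columns of 3xN homogeneous arrays) to the
    candidate motion with the smallest algebraic residual |x2^T F x1|."""
    res = np.array([np.abs(np.sum(x2 * (F @ x1), axis=0)) for F in fundamentals])
    return np.argmin(res, axis=0)
```

Points consistent with a motion's epipolar geometry have (near-)zero residual under its fundamental matrix and a nonzero residual under the others, which is what lets motion boundaries be separated from mere depth discontinuities.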

17.
We address the problem of simultaneous two-view epipolar geometry estimation and motion segmentation from non-static scenes. Given a set of noisy image pairs containing matches of n objects, we propose an unconventional, efficient, and robust method, 4D tensor voting, for estimating the unknown n epipolar geometries and segmenting the static and moving matching pairs into n independent motions. By considering the 4D isotropic and orthogonal joint image space, only two tensor voting passes are needed, and a very high noise-to-signal ratio (up to five) can be tolerated. Epipolar geometries corresponding to multiple, rigid motions are extracted in succession. Only two uncalibrated frames are needed, and no simplifying assumption (such as an affine camera model or a homographic model between images) other than the pin-hole camera model is made. Our novel approach consists of propagating a local geometric smoothness constraint in the 4D joint image space, followed by global consistency enforcement for extracting the fundamental matrices corresponding to independent motions. We have performed extensive experiments to compare our method with some representative algorithms and show that better performance on non-static scenes is achieved. Results on challenging data sets are presented.

18.
Template tracking is widely used in computer vision. Using the homography of the image, a motion model of the target is constructed, the relationship between the observed data and the target's motion parameters is identified, and the motion parameters are estimated to achieve tracking. This paper proposes a template tracking method based on a mixture of linear models: the target's motion parameters and appearance features are extracted to build a data set, and a fully supervised learning method computes the mapping between the two, enabling effective tracking. This approach both overcomes the nonlinear error introduced by a single linear model and reduces the computational cost of fitting a nonlinear model, improving tracking accuracy. In addition, a fast learning method is proposed that overcomes the susceptibility of each subspace in the mixture to noise when training samples are scarce, increasing robustness while reducing computation. Experimental results show that the algorithm tracks well.
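The fully supervised mapping from appearance features to motion parameters described above can be sketched as per-cluster least squares (a minimal illustration; the mixture construction and the paper's fast learning scheme go further):

```python
import numpy as np

def fit_mixture(X, Y, labels):
    """Fit one affine map per cluster: appearance features -> motion params."""
    maps = {}
    for c in np.unique(labels):
        idx = labels == c
        A = np.hstack([X[idx], np.ones((idx.sum(), 1))])   # affine bias column
        W, *_ = np.linalg.lstsq(A, Y[idx], rcond=None)
        maps[c] = W
    return maps

def predict_motion(maps, x, c):
    """Predict motion parameters for feature x under cluster c's linear map."""
    return np.append(x, 1.0) @ maps[c]
```

Each local linear map is cheap to fit and evaluate; covering the feature space with several of them is what reduces the nonlinear error of a single global linear model.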

19.
Inferring scene geometry from a sequence of camera images is one of the central problems in computer vision. While the overwhelming majority of related research focuses on diffuse surface models, there are cases when this is not a viable assumption: in many industrial applications, one has to deal with metal or coated surfaces exhibiting strongly specular behavior. We propose a novel, generalized constrained gradient descent method to determine the shape of a purely specular object from the reflection of a calibrated scene and the additional data required to find a unique solution. As an example, this data is provided by optical flow measurements obtained from small-scale motion of the specular object, with camera and scene remaining stationary. We present a non-approximative general forward model to predict the optical flow of specular surfaces, covering rigid body motion as well as elastic deformation, and allowing for a characterization of problematic points. We demonstrate the applicability of our method by numerical experiments on synthetic and real data.

20.
Autonomous operation of a vehicle on a road calls for understanding of various events involving the motions of the vehicles in its vicinity. In this paper we show how a moving vehicle which is carrying a camera can estimate the relative motions of nearby vehicles. We show how to “smooth” the motion of the observing vehicle, i.e. to correct the image sequence so that transient motions (primarily rotations) resulting from bumps, etc. are removed and the sequence corresponds more closely to the sequence that would have been collected if the motion had been smooth. We also show how to detect the motions of nearby vehicles relative to the observing vehicle. We present results for several road image sequences which demonstrate the effectiveness of our approach.
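Estimating the frame-to-frame global shift is the first ingredient of such motion smoothing. As a hedged sketch, phase correlation recovers an integer translation under periodic boundary conditions (the paper additionally handles rotations and works from real road imagery):

```python
import numpy as np

def phase_correlation(a, b):
    """Estimate the integer shift (dy, dx) with b == np.roll(a, (dy, dx))."""
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    R = np.conj(A) * B
    R /= np.maximum(np.abs(R), 1e-12)        # normalized cross-power spectrum
    corr = np.real(np.fft.ifft2(R))          # a delta at the shift
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    n, m = a.shape
    return (dy - n if dy > n // 2 else dy,   # unwrap to signed shifts
            dx - m if dx > m // 2 else dx)
```

Once the inter-frame shift (or, more generally, rotation) is known, subtracting its transient component from the sequence yields the "smoothed" video against which nearby vehicles' relative motions can be detected.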
