Similar Literature
20 similar documents found (search time: 536 ms)
1.
We introduce a new GPGPU-based real-time dense stereo matching algorithm. The algorithm is based on a progressive multi-resolution pipeline that includes background modeling and dense matching with adaptive windows. For applications in which only moving objects are of interest, this approach significantly reduces the overall computation cost while preserving high-definition details. Running on an off-the-shelf commodity graphics card, our implementation achieves 36 fps stereo matching on 1024 × 768 stereo video with a fine 256-pixel disparity range, effectively the same as 7200 million disparity evaluations per second. For scenes where the static-background assumption holds, our approach outperforms all published alternative algorithms in speed by a large margin. We envision a number of potential applications such as real-time motion capture, as well as tracking, recognition and identification of moving objects in multi-camera networks.
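The throughput figure quoted above follows directly from the frame size, disparity range, and frame rate. A quick sketch (the function name is ours, not the paper's):

```python
def mde_per_second(width, height, disparity_range, fps):
    """Disparity evaluations per second, in millions, for dense stereo:
    every pixel is tested against every candidate disparity each frame."""
    return width * height * disparity_range * fps / 1e6

# 1024 x 768 video, 256-pixel disparity range, 36 fps
print(round(mde_per_second(1024, 768, 256, 36)))  # ~7248, i.e. roughly the quoted 7200 M
```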

2.
We present an algorithm for identifying and tracking independently moving rigid objects from optical flow. Some previous attempts at segmentation via optical flow have focused on finding discontinuities in the flow field. While discontinuities do indicate a change in scene depth, they do not in general signal a boundary between two separate objects. The proposed method uses the fact that each independently moving object has a unique epipolar constraint associated with its motion, so motion discontinuities due to self-occlusion can be distinguished from those due to separate objects. The use of epipolar geometry allows the determination of individual motion parameters for each object as well as the recovery of relative depth for each point on the object. The algorithm assumes an affine camera in which perspective effects are limited to changes in overall scale; no camera calibration parameters are required. A Kalman filter based approach is used for tracking motion parameters over time.
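The paper's exact Kalman formulation is not given in the abstract; as a generic illustration of filtering a motion parameter over time, here is a minimal constant-velocity Kalman filter (noise values `q` and `r` are assumptions):

```python
import numpy as np

def kalman_track(measurements, q=1e-3, r=1e-1):
    """Track a scalar motion parameter with a constant-velocity
    Kalman filter; returns the filtered value at each time step."""
    x = np.array([measurements[0], 0.0])       # state: [value, velocity]
    P = np.eye(2)                              # state covariance
    F = np.array([[1.0, 1.0], [0.0, 1.0]])     # constant-velocity transition
    H = np.array([[1.0, 0.0]])                 # we observe the value only
    Q = q * np.eye(2)
    R = np.array([[r]])
    estimates = []
    for z in measurements:
        x = F @ x                              # predict
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R                    # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)         # Kalman gain
        x = x + K @ (np.array([z]) - H @ x)    # update
        P = (np.eye(2) - K @ H) @ P
        estimates.append(x[0])
    return estimates
```

On a clean linear ramp the constant-velocity model is exact, so the estimate converges to the true value.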

3.
An approach based on fuzzy logic is proposed for matching both articulated and non-articulated objects across multiple non-overlapping fields of view (FoVs) from multiple cameras. We call it the fuzzy logic matching algorithm (FLMA). The approach uses information about object motion, shape and camera topology to match objects across camera views. The motion and shape information of targets is obtained by tracking them with a combination of the Condensation and CAMShift tracking algorithms. The camera topology information is obtained by computing the projective transformation of each view onto a common ground plane. The algorithm is suitable for tracking non-rigid objects with both linear and non-linear motion. We show videos of objects tracked across multiple cameras using FLMA. In our experiments, the system correctly matches targets across views with high accuracy.

4.
In recent years there has been much interest in incorporating semantics into simultaneous localization and mapping (SLAM) systems. This paper presents an approach for generating an outdoor large-scale 3D dense semantic map based on binocular stereo vision. The inputs to the system are stereo color images from a moving vehicle. First, the dense 3D space around the vehicle is reconstructed and the camera motion is estimated by visual odometry. Meanwhile, semantic segmentation is performed online using deep learning, and the semantic labels are also used to verify feature matching in the visual odometry. These three processes yield the motion, depth and semantic label of every pixel in the input views. A voxel conditional random field (CRF) inference step then fuses the semantic labels into voxels. After that, we present a method for removing moving objects by incorporating the semantic labels, which improves motion segmentation accuracy. Finally, a dense 3D semantic map of an urban environment is generated from an arbitrarily long image sequence. We evaluate our approach on the KITTI vision benchmark, and the results show that the proposed method is effective.

5.
Active object tracking based on SAD and UKF-MeanShift   Total citations: 1 (self-citations: 0, other citations: 1)
To address the difficulty of accurately segmenting and locating dynamic objects in complex scenes, a hybrid autonomous tracking method is proposed that combines the sum of absolute differences (SAD), the unscented Kalman filter (UKF), and the Mean Shift algorithm. First, SAD is used to obtain disparity information between two adjacent frames, from which the moving object is detected; a kernel-histogram appearance model and a state-space model of the target are then built. The UKF filters and estimates the state space, and finally the Mean Shift algorithm locates the target precisely. Experimental results show that the method not only detects moving objects in the scene effectively but also recovers their motion information. Compared with related algorithms, the proposed UKF-Mean Shift tracking strategy achieves better tracking quality and time performance.
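The SAD disparity step described above is classic block matching: for each pixel, pick the disparity whose block-wise sum of absolute differences against the other frame is smallest. A naive didactic sketch (block size and disparity range are assumptions, and this is not the paper's implementation):

```python
import numpy as np

def sad_disparity(left, right, block=5, max_disp=16):
    """Naive SAD block matching over a rectified stereo pair.
    Returns an integer disparity per interior pixel."""
    h, w = left.shape
    r = block // 2
    disp = np.zeros((h, w), dtype=np.int32)
    for y in range(r, h - r):
        for x in range(r + max_disp, w - r):
            patch = left[y-r:y+r+1, x-r:x+r+1].astype(np.int32)
            # cost of each candidate disparity d: SAD against the
            # right-image block shifted left by d
            costs = [np.abs(patch -
                            right[y-r:y+r+1, x-d-r:x-d+r+1].astype(np.int32)).sum()
                     for d in range(max_disp)]
            disp[y, x] = int(np.argmin(costs))
    return disp
```

Shifting an image horizontally by 3 pixels and matching it against the original recovers disparity 3 at interior pixels.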

6.
Objective: To address the degraded tracking performance and accuracy of multiple moving objects under a moving background, this paper proposes a method based on OPTICS clustering and an object-region probability model. Methods: Harris-SIFT feature detection is first introduced to match feature points between adjacent frames, improving the precision and robustness of feature tracking. Then, exploiting the fact that each moving object's motion vector differs from the background's, an improved OPTICS clustering algorithm is applied to the constructed optical-flow map, accurately separating out the background and yielding an estimated region for each moving object. An independent object-region probability model (OPM) is built for each moving object and iteratively updated over successive frames to obtain the object's precise region. Results: The method resolves the degraded tracking performance and accuracy under a moving background; Harris-SIFT feature extraction and matching takes only 17% of the time of SIFT features. In complex outdoor environments, the average accuracy of the method is 14% higher than that of traditional background-compensation methods, and it accurately separates moving objects from the moving background. Conclusion: Experimental results show that the algorithm meets real-time requirements, accurately separates moving-object regions from background regions, and is robust to camera motion and rotation, scene illumination changes, and other disturbances.

7.
Detecting and tracking moving objects within a scene is an essential step for high-level machine vision applications such as video content analysis. In this paper, we propose a fast and accurate method for tracking an object of interest in a dynamic environment (active camera model). First, we manually select the region of the object of interest and extract three statistical features, namely the mean, the variance and the range of intensity values of the feature points lying inside the selected region. Then, using the motion information of the background's feature points and the k-means clustering algorithm, we calculate the camera motion transformation matrix. Based on this matrix, the previous frame is transformed into the current frame's coordinate system to compensate for the impact of camera motion. Afterwards, we detect the regions of moving objects within the scene using our frame-difference algorithm. Subsequently, using the DBSCAN clustering algorithm, we cluster the feature points of the extracted regions to find the distinct moving objects. Finally, we use the same statistical features (the mean, the variance and the range of intensity values) as a template to identify and track the moving object of interest among the detected moving objects. Our approach is simple and straightforward, yet robust, accurate and time-efficient. Experimental results on various videos show an acceptable performance of our tracker compared to more complex competitors.
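After camera-motion compensation, the detection step above reduces to thresholded frame differencing. A generic sketch of that step (the threshold value is an assumption, and the paper's own difference algorithm may differ):

```python
import numpy as np

def frame_difference_mask(prev, curr, thresh=25):
    """Binary motion mask: mark pixels whose absolute intensity
    change between the compensated previous frame and the current
    frame exceeds a threshold."""
    diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
    return (diff > thresh).astype(np.uint8)
```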

8.
A combined 2D, 3D approach is presented that allows for robust tracking of moving people and recognition of actions. It is assumed that the system observes multiple moving objects via a single, uncalibrated video camera. Low-level features are often insufficient for detection, segmentation, and tracking of non-rigid moving objects. Therefore, an improved mechanism is proposed that integrates low-level (image processing), mid-level (recursive 3D trajectory estimation), and high-level (action recognition) processes. A novel extended Kalman filter formulation is used in estimating the relative 3D motion trajectories up to a scale factor. The recursive estimation process provides a prediction and error measure that is exploited in higher-level stages of action recognition. Conversely, higher-level mechanisms provide feedback that allows the system to reliably segment and maintain the tracking of moving objects before, during, and after occlusion. Heading-guided recognition (HGR) is proposed as an efficient method for adaptive classification of activity. The HGR approach is demonstrated using “motion history images” that are then recognized via a mixture-of-Gaussians classifier. The system is tested in recognizing various dynamic human outdoor activities: running, walking, roller blading, and cycling. In addition, experiments with real and synthetic data sets are used to evaluate stability of the trajectory estimator with respect to noise.

9.
Computing occluding and transparent motions   Total citations: 13 (self-citations: 6, other citations: 7)
Computing the motions of several moving objects in image sequences involves simultaneous motion analysis and segmentation. This task can become complicated when image motion changes significantly between frames, as with camera vibrations. Such vibrations make tracking in longer sequences harder, as temporal motion constancy cannot be assumed. The problem becomes even more difficult in the case of transparent motions. A method is presented for detecting and tracking occluding and transparent moving objects, which uses temporal integration without assuming motion constancy. Each new frame in the sequence is compared to a dynamic internal representation image of the tracked object. The internal representation image is constructed by temporally integrating frames after registration based on the motion computation. The temporal integration maintains sharpness of the tracked object, while blurring objects that have other motions. Comparing new frames to the internal representation image causes the motion analysis algorithm to continue tracking the same object in subsequent frames, and to improve the segmentation.
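The temporal-integration idea above can be reduced to an exponential running average of registered frames: content that moves with the tracked object reinforces itself, while differently-moving content is averaged away. A minimal sketch (the blending weight `alpha` is an assumption; the paper does not specify its integration rule in the abstract):

```python
import numpy as np

def integrate(internal, new_frame, alpha=0.9):
    """Blend a newly registered frame into the internal representation
    image. High alpha keeps the tracked object sharp over time."""
    return alpha * internal + (1 - alpha) * new_frame
```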

10.
We propose a novel method to model and learn the scene activity, observed by a static camera. The proposed model is very general and can be applied for solution of a variety of problems. The motion patterns of objects in the scene are modeled in the form of a multivariate nonparametric probability density function of spatiotemporal variables (object locations and transition times between them). Kernel Density Estimation is used to learn this model in a completely unsupervised fashion. Learning is accomplished by observing the trajectories of objects by a static camera over extended periods of time. It encodes the probabilistic nature of the behavior of moving objects in the scene and is useful for activity analysis applications, such as persistent tracking and anomalous motion detection. In addition, the model also captures salient scene features, such as the areas of occlusion and most likely paths. Once the model is learned, we use a unified Markov Chain Monte Carlo (MCMC)-based framework for generating the most likely paths in the scene, improving foreground detection, persistent labeling of objects during tracking, and deciding whether a given trajectory represents an anomaly to the observed motion patterns. Experiments with real-world videos are reported which validate the proposed approach.
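Kernel Density Estimation, as used above, estimates a density directly from observed samples without assuming a parametric form. A minimal 1-D Gaussian-kernel stand-in for the paper's multivariate spatiotemporal KDE (bandwidth is an assumption):

```python
import numpy as np

def kde(samples, query, bandwidth=1.0):
    """Gaussian kernel density estimate at a query point, e.g. over
    observed transition times between two scene locations."""
    samples = np.asarray(samples, dtype=float)
    z = (query - samples) / bandwidth
    return np.exp(-0.5 * z ** 2).sum() / (len(samples) * bandwidth * np.sqrt(2 * np.pi))
```

Density is high near frequently observed values and low elsewhere, which is what makes the model usable for anomaly detection.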

11.
In this work a method is presented to track and estimate the pose of articulated objects using the motion of a sparse set of moving features. This is achieved by using a bottom-up generative approach based on the Pictorial Structures representation [1]. However, unlike previous approaches that rely on appearance, our method is entirely dependent on motion. Initial low-level part detection is based on how a region moves as opposed to its appearance. This work is best described as Pictorial Structures using motion. A standard feature tracker is used to automatically extract a sparse set of features. These features typically contain many tracking errors; however, the presented approach is able to overcome both this and their sparsity. The proposed method is applied to two problems: 2D pose estimation of articulated objects walking side-on to the camera, and 3D pose estimation of humans walking and jogging at arbitrary orientations to the camera. In each domain, quantitative results are reported that improve on the state of the art. The motivation of this work is to illustrate the information present in low-level motion that can be exploited for the task of pose estimation.

12.
To address the complex background modeling and heavy computation involved in detecting moving objects under a moving camera, a motion-saliency-based detection method is proposed that achieves accurate detection while avoiding complex background modeling. By simulating the attention mechanism of the human visual system, the method analyzes the motion characteristics of background and foreground during camera translation and computes the saliency of the video scene, detecting moving objects in dynamic scenes. First, optical flow is used to extract the object's motion features, and 2-D Gaussian convolution suppresses the background's motion texture; then histogram statistics measure the global saliency of the motion features, and foreground and background color information is extracted from the resulting motion saliency map; finally, a Bayesian method processes the motion saliency map to obtain the salient moving objects. Experimental results on public benchmark videos show that the method suppresses background motion noise while highlighting and accurately detecting the moving objects in the scene.
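The histogram-statistics step above amounts to a rarity measure: motion magnitudes that occur rarely in the frame are globally salient. A simplified sketch of that step only (bin count is an assumption; the optical flow, Gaussian convolution, and Bayesian stages are omitted):

```python
import numpy as np

def motion_saliency(flow_mag, bins=16):
    """Global rarity saliency over a flow-magnitude map: pixels whose
    motion magnitude falls in a rarely populated histogram bin score
    close to 1, common magnitudes score close to 0."""
    hist, edges = np.histogram(flow_mag, bins=bins)
    probs = hist / hist.sum()
    # assign each pixel its bin index, then invert the bin's frequency
    idx = np.clip(np.digitize(flow_mag, edges[1:-1]), 0, bins - 1)
    return 1.0 - probs[idx]
```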

13.
We propose an approach for modeling, measurement and tracking of rigid and articulated motion as viewed from a stationary or moving camera. We first propose an approach for learning temporal-flow models from exemplar image sequences. The temporal-flow models are represented as a set of orthogonal temporal-flow bases that are learned using principal component analysis of instantaneous flow measurements. Spatial constraints on the temporal-flow are then incorporated to model the movement of regions of rigid or articulated objects. These spatio-temporal flow models are subsequently used as the basis for simultaneous measurement and tracking of brightness motion in image sequences. Then we address the problem of estimating composite independent object and camera image motions. We employ the spatio-temporal flow models learned through observing typical movements of the object from a stationary camera to decompose image motion into independent object and camera motions. The performance of the algorithms is demonstrated on several long image sequences of rigid and articulated bodies in motion.
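Learning orthogonal flow bases via principal component analysis, as outlined above, reduces to an SVD of stacked, mean-centered flow measurements. A minimal sketch (function name and the choice of `k` are ours):

```python
import numpy as np

def flow_bases(flows, k=2):
    """Learn k orthogonal temporal-flow bases by PCA: stack the
    instantaneous flow measurements as rows, center them, and keep
    the top-k right singular vectors."""
    X = np.asarray(flows, dtype=float)
    X = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k]
```

The returned rows are orthonormal by construction, which is the orthogonality property the abstract mentions.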

14.
《Real》1997,3(6):415-432
Real-time motion capture plays a very important role in various applications, such as 3D interfaces for virtual reality systems, digital puppetry, and real-time character animation. In this paper we address the problem of estimating and recognizing the motion of articulated objects using the optical motion capture technique, and we present an effective method to control an articulated human figure in real time. The heart of this problem is the estimation of the 3D motion and posture of an articulated, volumetric object using feature points from a sequence of multiple perspective views. Under some moderate assumptions, such as smooth motion and known initial posture, we develop a model-based technique for recovering the 3D location and motion of a rigid object using a variation of the Kalman filter. The posture of the 3D volumetric model is updated by the 2D image flow of the feature points across all views. Two novel concepts, the hierarchical Kalman filter (HKF) and the adaptive hierarchical structure (AHS) incorporating the kinematic properties of the articulated object, are proposed to extend our formulation from rigid objects to articulated ones. Our formulation also allows us to avoid two classic problems in 3D tracking: the multi-view correspondence problem and the occlusion problem. By adding more cameras and placing them appropriately, our approach can deal with motion over a very wide area, and multiple objects can be handled by managing multiple AHSs and processing multiple HKFs. We show the validity of our approach using synthetic data acquired simultaneously from multiple virtual cameras in a virtual environment (VE), and real data derived from a moving light display of walking motion. The results confirm that the model-based algorithm works well for tracking multiple rigid objects.

15.
Intelligent visual surveillance — A survey   Total citations: 3 (self-citations: 0, other citations: 3)
Detection, tracking, and understanding of moving objects of interest in dynamic scenes have been active research areas in computer vision over the past decades. Intelligent visual surveillance (IVS) refers to an automated visual monitoring process that involves analysis and interpretation of object behaviors, as well as object detection and tracking, to understand the visual events of the scene. The main tasks of IVS include scene interpretation and wide-area surveillance control. Scene interpretation aims at detecting and tracking moving objects in an image sequence and understanding their behaviors. In the wide-area surveillance control task, multiple cameras or agents are controlled cooperatively to monitor tagged objects in motion. This paper reviews recent advances and future research directions for these tasks. The article consists of two parts: the first part surveys image enhancement, moving object detection and tracking, and motion behavior understanding; the second part reviews wide-area surveillance techniques based on the fusion of multiple visual sensors, camera calibration and cooperative camera systems.

16.
Adaptive pyramid mean shift for global real-time visual tracking   Total citations: 2 (self-citations: 0, other citations: 2)
Tracking objects in videos using the mean shift technique has attracted considerable attention. In this work, a novel approach for global target tracking based on the mean shift technique is proposed. The proposed method represents the model and the candidate in terms of a background-weighted histogram and a color-weighted histogram, respectively, which can obtain precise object size adaptively with low computational complexity. To track targets whose displacements between two successive frames are relatively large, we implement the mean shift procedure in a coarse-to-fine way for global maximum seeking. This procedure is termed adaptive pyramid mean shift, because it uses the pyramid analysis technique and can determine the pyramid level adaptively to decrease the number of iterations required to achieve convergence. Experimental results on various tracking videos and its application to a tracking and pointing subsystem show that the proposed method can successfully cope with different situations such as camera motion, camera vibration, camera zoom and focus, high-speed moving object tracking, partial occlusions, target scale variations, etc.
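At its core, mean shift iterates toward the nearest mode of a kernel-weighted density. A plain 1-D mode-seeking sketch (the paper's histogram back-projection and coarse-to-fine pyramid scheduling are omitted; bandwidth is an assumption):

```python
import numpy as np

def mean_shift_mode(points, start, bandwidth=1.0, iters=50):
    """Iterate the Gaussian-kernel mean shift update from `start`
    until it converges on a local density mode of `points`."""
    x = float(start)
    for _ in range(iters):
        w = np.exp(-0.5 * ((points - x) / bandwidth) ** 2)  # kernel weights
        x_new = (w * points).sum() / w.sum()                # weighted mean
        if abs(x_new - x) < 1e-6:
            break
        x = x_new
    return x
```

Starting near a dense cluster, the iteration converges to that cluster's center, which is exactly the "maximum seeking" the abstract refers to.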

17.
Object tracking is an important task in computer vision that is essential for higher-level vision applications such as surveillance systems, human-computer interaction, industrial control, smart compression of video, and robotics. Tracking, however, cannot be easily accomplished due to challenges such as real-time processing, occlusions, changes in intensity, abrupt motions, variety of objects, and mobile platforms. In this paper, we propose a new method to estimate and eliminate camera motion on mobile platforms and, accordingly, a set of optimal feature points for accurate tracking. Experimental results on different videos show that the proposed method estimates camera motion very well and eliminates its effect on tracking moving objects, and that the use of optimal feature points yields promising tracking. In terms of accuracy and processing time, the proposed method compares favorably with state-of-the-art methods.

18.
《Real》1998,4(1):21-40
This paper describes how image sequences taken by a moving video camera may be processed to detect and track moving objects against a moving background in real time. The motion segmentation and shape tracking system is known as A Scene Segmenter Establishing Tracking, version 2 (ASSET-2). Motion is found by tracking image features, and segmentation is based on first-order (i.e. six-parameter) flow fields. Shape tracking is performed using two-dimensional radial map representations. The system runs in real time, and is accurate and reliable. It requires no camera calibration and no knowledge of the camera's motion.

19.
《Real》2001,7(6):529-544
This paper presents a multiresolution approach to visual motion tracking. In the approach, the foveation mechanism of the human visual system is used to model the multiresolution information-perception algorithms of a Transputer-based pyramid visual tracking system. The video images of a moving target are transformed into pyramidal data structures, each consisting of multiple image layers at different resolutions generated by a Gaussian pyramid algorithm. Tracking a moving target over an image sequence is accomplished by a foveal search based on iterative intensity-pattern correlation along the multiple resolution levels of the Gaussian pyramids of two successive images. Analyses of the efficiency and accuracy of the tracking algorithm show that it is over 160 times faster than conventional mono-resolution tracking methods, with tracking error within one pixel. To demonstrate the suitability of the multiresolution tracking algorithm for parallel computation, a scheme for mapping it onto a Transputer-based pyramidal parallel computing structure is proposed. Experimental results demonstrate good performance of the proposed approach.
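The Gaussian pyramid structure described above is built by repeatedly blurring and subsampling. A standard blur-and-subsample sketch using the common 5-tap binomial kernel (this is the textbook construction, not necessarily the paper's exact generation algorithm):

```python
import numpy as np

def gaussian_pyramid(img, levels=3):
    """Build a Gaussian pyramid: each level is the previous level
    blurred with a separable 5-tap kernel, then subsampled by 2."""
    kernel = np.array([1, 4, 6, 4, 1], dtype=float) / 16
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        g = pyr[-1]
        # separable blur with edge padding, rows then columns
        blur = lambda v: np.convolve(np.pad(v, 2, mode='edge'), kernel, 'valid')
        g = np.apply_along_axis(blur, 1, g)
        g = np.apply_along_axis(blur, 0, g)
        pyr.append(g[::2, ::2])
    return pyr
```

Coarse-to-fine correlation then starts at the smallest level and refines the match position down the pyramid.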

20.
From depth sensors to thermal cameras, the increased availability of camera sensors beyond the visible spectrum has created many exciting applications. Most of these applications require combining information from these hyperspectral cameras with a regular RGB camera. Information fusion from multiple heterogeneous cameras can be a very complex problem: information can be fused at different levels, from pixels to voxels or even semantic objects, with large variations in accuracy, communication, and computation costs. In this paper, we propose a system for robust segmentation of human figures in video sequences by fusing visible-light and thermal imagery. Our system focuses on the geometric transformation between visual blobs corresponding to human figures observed by both cameras. This approach provides the most reliable fusion at the expense of high computation and communication costs. To reduce the computational complexity of the geometric fusion, an efficient calibration procedure is first applied to rectify the two camera views without the complex procedure of estimating the intrinsic parameters of the cameras. To geometrically register different blobs at the pixel level, a blob-to-blob homography in the rectified domain is then computed in real time by estimating the disparity for each blob pair. Precise segmentation is finally achieved using a two-tier tracking algorithm and a unified background model. Our experimental results show that the proposed system provides significant improvements over existing schemes under various conditions.
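Once a blob-to-blob homography is estimated, registering pixels between the two views reduces to mapping points through a 3 × 3 matrix with a homogeneous divide. A minimal sketch of that mapping (estimation of `H` itself is not shown):

```python
import numpy as np

def apply_homography(H, pts):
    """Map Nx2 points through a 3x3 homography H: lift to homogeneous
    coordinates, multiply, and divide by the third coordinate."""
    pts = np.asarray(pts, dtype=float)
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]
```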


