Similar Documents
20 similar documents found.
1.
This paper integrates fully automatic video object segmentation and tracking, including detection and assignment of uncovered regions, in a 2-D mesh-based framework. The particular contributions of this work are (i) a novel video object segmentation method posed as a constrained maximum-contrast path search along the edges of a 2-D triangular mesh, and (ii) a 2-D mesh-based uncovered-region detection method that operates along the object boundary as well as within the object. At the first frame, an optimal number of feature points are selected as nodes of a 2-D content-based mesh. These points are classified as moving (foreground) and stationary nodes based on multi-frame node motion analysis, yielding a coarse estimate of the foreground object boundary. Color differences across triangles near the coarse boundary drive a maximum-contrast path search along the edges of the 2-D mesh to refine the boundary of the video object. Next, we propagate the refined boundary to the subsequent frame using the motion vectors of the node points to form the coarse boundary at the next frame. We detect occluded regions using motion-compensated frame differences and range-filtered edge maps. The boundaries of detected uncovered regions are then refined by the same search procedure. These regions are either appended to the foreground object or tracked as new objects. The segmentation procedure is re-initialized when the number of unreliable motion vectors exceeds a threshold. The proposed scheme is demonstrated on several video sequences.
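As a rough illustration of the node-classification step, the following sketch tracks feature points over several frames with pyramidal Lucas-Kanade optical flow and labels points with large average displacement as moving. The OpenCV feature detector, the tracking method, and the threshold are assumptions for illustration; the paper works with 2-D content-based mesh nodes.

```python
# Classify feature points as moving (foreground) or stationary (background)
# from their average displacement over a few frames; a minimal sketch.
import cv2
import numpy as np

def classify_nodes(frames, max_nodes=200, motion_thresh=1.0):
    prev = frames[0]
    pts = cv2.goodFeaturesToTrack(prev, maxCorners=max_nodes,
                                  qualityLevel=0.01, minDistance=7)
    disp = np.zeros(len(pts))
    for nxt in frames[1:]:
        # failed tracks are ignored here for brevity
        new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev, nxt, pts, None)
        disp += np.linalg.norm((new_pts - pts).reshape(-1, 2), axis=1)
        pts, prev = new_pts, nxt
    disp /= (len(frames) - 1)
    moving = disp > motion_thresh          # coarse foreground nodes
    return pts.reshape(-1, 2), moving
```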

2.
A novel method for visual object tracking in stereo videos is proposed, which fuses an appearance-based representation of the object built on Local Steering Kernel features with 2D color-disparity histogram information. The algorithm employs Kalman filtering for object position prediction and a sampling technique for selecting the candidate object regions of interest in the left and right channels. Disparity information is exploited for matching corresponding regions in the left and right video frames. As tracking evolves, any significant changes in object appearance due to scale, rotation, or deformation are identified and embodied in the object model. The object appearance changes are identified simultaneously in the left- and right-channel video frames, ensuring a correct 3D representation of the resulting bounding box on a 3D display monitor. The proposed framework performs stereo object tracking and is suitable for 3D movies, 3D TV content, and 3D video captured by consumer stereo cameras. Experimental results demonstrate the effectiveness of the proposed method in tracking objects under geometric transformations, zooming, and partial occlusion, as well as in tracking slowly deforming articulated 3D objects in stereo video.
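A minimal constant-velocity Kalman predictor for the object's 2D position, as one plausible realization of the prediction step described above; the state model and noise covariances are illustrative assumptions.

```python
import numpy as np

class KalmanCV:
    """Constant-velocity Kalman filter over state [x, y, vx, vy]."""
    def __init__(self, x0, y0, dt=1.0):
        self.x = np.array([x0, y0, 0.0, 0.0])
        self.P = np.eye(4) * 10.0
        self.F = np.array([[1, 0, dt, 0], [0, 1, 0, dt],
                           [0, 0, 1, 0], [0, 0, 0, 1]], float)
        self.H = np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float)
        self.Q = np.eye(4) * 0.01      # process noise (illustrative)
        self.R = np.eye(2) * 1.0       # measurement noise (illustrative)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]              # predicted object position

    def update(self, zx, zy):
        z = np.array([zx, zy])
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ (z - self.H @ self.x)
        self.P = (np.eye(4) - K @ self.H) @ self.P
```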

3.
Moving-object detection based on color information is easily affected by illumination and shadows, while detection based on depth information suffers from heavy noise along object edges and cannot detect objects close to the background. To address these problems, this paper uses the color information acquired by a CCD camera and the depth information acquired by a TOF camera to build, for every pixel, separate color and depth classifiers. According to each pixel's depth characteristics and the detection result of the previous frame, different weights are adaptively assigned to the output of each classifier to detect moving objects. Experiments on several captured video sequences show that the method effectively resolves the problems that arise when color or depth information is used alone.
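The following sketch shows one plausible form of the adaptive per-pixel fusion rule, assuming each modality already yields a foreground probability map; the heuristic of down-weighting depth where it is close to the background depth, and all thresholds, are illustrative assumptions.

```python
import numpy as np

def fuse_color_depth(p_color, p_depth, depth, bg_depth,
                     near_bg_margin=0.1, thresh=0.5):
    """p_color, p_depth: HxW foreground probabilities from the two
    per-pixel classifiers; depth, bg_depth: current and background depth."""
    near_bg = np.abs(depth - bg_depth) < near_bg_margin  # depth unreliable here
    w_depth = np.where(near_bg, 0.2, 0.8)                # adaptive weight (assumed)
    p = w_depth * p_depth + (1.0 - w_depth) * p_color
    return p > thresh                                    # foreground mask
```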

4.
The performance of Motion Compensated Discrete Cosine Transform (MC-DCT) video coding is improved by using region-adaptive subband image coding [18]. On the assumption that the video is acquired from a camera on a moving platform and that the distance between the camera and the scene is large enough, both the camera motion and the motion of moving objects in a frame are compensated. For the compensation of camera motion, a feature-matching algorithm is employed: several feature points extracted with a Sobel operator are used to compensate camera translation, rotation, and zoom. The illumination change between frames is also compensated. Motion-compensated frame differences are divided into three arbitrarily shaped regions, called stationary background, moving objects, and newly emerging areas, and different quantizers are used for the different regions. Compared to conventional MC-DCT video coding using a block-matching algorithm, our video coding scheme shows about a 1.0-dB improvement on average for the experimental video samples.
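The paper does not specify its illumination compensation rule; the sketch below shows one simple gain/offset model that matches the previous frame's mean and standard deviation to the current frame's.

```python
import numpy as np

def compensate_illumination(prev, cur):
    """Gain/offset illumination compensation (an assumed model):
    rescale prev so its mean and std match those of cur."""
    prev = prev.astype(np.float32)
    cur = cur.astype(np.float32)
    gain = cur.std() / max(prev.std(), 1e-6)
    offset = cur.mean() - gain * prev.mean()
    return np.clip(gain * prev + offset, 0, 255).astype(np.uint8)
```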

5.
In this paper, we present an automatic foreground object detection method for videos captured by freely moving cameras. While we focus on extracting a single foreground object of interest throughout a video sequence, our approach requires neither training data nor user interaction. Based on SIFT correspondences across video frames, we construct robust SIFT trajectories in terms of a calculated foreground feature point probability. This probability determines candidate foreground feature points in each frame, without user interaction such as parameter or threshold tuning. Furthermore, we propose a probabilistic consensus foreground object template (CFOT), which is applied directly to the input video for moving object detection via template matching. Our CFOT can detect the foreground object in videos captured by a fast-moving camera, even when the contrast between the foreground and background regions is low. Moreover, the proposed method generalizes to foreground object detection in dynamic backgrounds and is robust to viewpoint changes across video frames. The contribution of this paper is threefold: (1) we provide a robust decision process to detect the foreground object of interest in videos with contrast and viewpoint variations; (2) our proposed method builds longer SIFT trajectories, which is shown to be robust and effective for object detection tasks; and (3) the construction of our CFOT is not sensitive to the initial estimate of the foreground region of interest, while its use achieves excellent foreground object detection results on real-world video data.
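A minimal sketch of the raw material for SIFT trajectories: SIFT correspondences between consecutive frames with OpenCV and Lowe's ratio test. The ratio threshold is a common default, not necessarily the paper's.

```python
import cv2

def sift_matches(frame_a, frame_b, ratio=0.75):
    """Return point correspondences between two grayscale frames."""
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(frame_a, None)
    kp_b, des_b = sift.detectAndCompute(frame_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(des_a, des_b, k=2):
        # Lowe's ratio test rejects ambiguous matches
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            m = pair[0]
            good.append((kp_a[m.queryIdx].pt, kp_b[m.trainIdx].pt))
    return good   # list of (point in A, point in B)
```

Chaining such matches frame to frame yields the trajectories on which the foreground feature point probability is computed.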

6.
This paper presents a real-time surveillance system for detecting and tracking people under a stationary monocular camera, which takes full advantage of local texture patterns. A novel center-symmetric scale-invariant local ternary pattern feature is proposed and combined with pattern kernel density estimation to build a pixel-level background model. The background model is then used to detect moving foreground objects in every newly captured frame. A variant of a fast human detector that utilizes local texture patterns is adopted to look for human objects in the foreground regions; it is assisted by a head detector, which first finds candidate human locations to reduce computational costs. Each human object is given a unique identity and is represented by a spatio-color-texture object model. Real-time tracking performance is achieved by a fast mean-shift algorithm coupled with several efficient occlusion-handling techniques. Experiments on challenging video sequences show that the proposed surveillance system runs in real time and is quite robust in segmenting and tracking people in complex environments with appearance changes, abrupt motion, occlusions, illumination variations, and clutter.
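A minimal sketch of a center-symmetric local ternary pattern over a 3x3 neighborhood, in the spirit of the feature described above; scaling the ternary threshold by the local intensity (to approximate scale invariance) is an assumption, not the paper's exact formulation.

```python
import numpy as np

def cs_ltp(img, t=0.05):
    """Per-pixel base-3 code from the 4 center-symmetric pixel pairs."""
    img = img.astype(np.float32)
    c = img[1:-1, 1:-1]                        # interior (center) pixels
    pairs = [(img[:-2, :-2], img[2:, 2:]),     # NW vs SE
             (img[:-2, 1:-1], img[2:, 1:-1]),  # N  vs S
             (img[:-2, 2:], img[2:, :-2]),     # NE vs SW
             (img[1:-1, 2:], img[1:-1, :-2])]  # E  vs W
    code = np.zeros_like(c, dtype=np.int32)
    for a, b in pairs:
        d = a - b
        tau = t * (c + 1.0)                    # intensity-scaled threshold (assumed)
        digit = np.where(d > tau, 2, np.where(d < -tau, 0, 1))
        code = code * 3 + digit                # accumulate base-3 pattern
    return code                                # values in [0, 80]
```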

7.
Video inpainting under constrained camera motion.
A framework for inpainting missing parts of a video sequence recorded with a moving or stationary camera is presented in this work. The region to be inpainted is general: it may be still or moving, in the background or in the foreground; it may occlude one object and be occluded by some other object. The algorithm consists of a simple preprocessing stage and two steps of video inpainting. In the preprocessing stage, we roughly segment each frame into foreground and background and use this segmentation to build three image mosaics that help produce time-consistent results and improve performance by reducing the search space. In the first video inpainting step, we reconstruct moving objects in the foreground that are "occluded" by the region to be inpainted. To this end, we fill the gap as much as possible by copying information from the moving foreground in other frames, using a priority-based scheme. In the second step, we inpaint the remaining hole with the background: we first align the frames and directly copy where possible, then fill the remaining pixels by extending spatial texture synthesis techniques to the spatiotemporal domain. The proposed framework has several advantages over state-of-the-art algorithms that deal with similar types of data and constraints. It permits some camera motion, is simple to implement and fast, requires no statistical models of the background or foreground, works well in the presence of rich and cluttered backgrounds, and produces results with no visible blurring or motion artifacts. A number of real examples taken with a consumer hand-held camera support these findings.

8.
A difference-based scheme using object structures and color analysis is proposed for video object segmentation in rainy situations. Since shadows and color reflections on the wet ground pose problems for conventional video object segmentation, the proposed method combines background construction-based and foreground extraction-based video object segmentation: pixels of a video sequence are separated into foreground and background using histogram-based change detection, from which the background is constructed; initial moving-object masks, built from a frame-difference mask and a background-subtraction mask, then yield coarse object regions. Shadow and color-reflection regions on the wet ground are removed from the initial moving-object masks via a diamond window mask and color analysis of the moving object. Finally, the boundary of the moving object is refined using connected-component labeling and morphological operations. Experimental results show that the proposed method performs well for video object segmentation in rainy situations.
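A minimal sketch of obtaining the initial moving-object mask from a frame-difference mask and a background-subtraction mask; combining the two by intersection, the thresholds, and the morphological cleanup are illustrative assumptions.

```python
import cv2
import numpy as np

def coarse_object_mask(prev, cur, background, t_diff=20, t_bg=25):
    """prev, cur, background: grayscale uint8 frames of equal size."""
    fd = cv2.absdiff(cur, prev)          # frame-difference mask source
    bs = cv2.absdiff(cur, background)    # background-subtraction mask source
    # intersection of the two masks (an assumed combination rule)
    mask = ((fd > t_diff) & (bs > t_bg)).astype(np.uint8) * 255
    kernel = np.ones((5, 5), np.uint8)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # remove speckle
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)  # fill small holes
    return mask
```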

9.
Moving-shadow detection and removal from the foreground regions extracted from video frames aims to limit the risk of moving shadows being mistaken for parts of moving objects, thereby improving the accuracy of moving-object detection and classification. With this motivation, the present paper proposes an efficient method for discriminating moving-object and moving-shadow regions in a video sequence without human intervention. The method has a low computational burden and works effectively under dynamic traffic conditions on highways and street ways, with and without marking lines. Further, scale-invariant feature transform based features are used to classify moving vehicles (with and without shadow regions), which enhances the effectiveness of the proposed method. The potential of the method is tested with various datasets collected from different road-traffic scenarios, and its superiority is compared with existing methods.
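The paper's discrimination rule is not reproduced here; the sketch below shows a common HSV chromaticity baseline for flagging shadow pixels (darker than the background but with similar hue and saturation), the kind of test such methods refine. All thresholds are illustrative.

```python
import cv2
import numpy as np

def shadow_mask(frame_bgr, bg_bgr, v_lo=0.4, v_hi=0.9, ts=30, th=20):
    """Flag pixels as shadow: brightness drops, chromaticity stays similar."""
    f = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV).astype(np.int32)
    b = cv2.cvtColor(bg_bgr, cv2.COLOR_BGR2HSV).astype(np.int32)
    ratio = (f[..., 2] + 1) / (b[..., 2] + 1)       # value (brightness) ratio
    return ((ratio > v_lo) & (ratio < v_hi) &       # darker, but not black
            (np.abs(f[..., 1] - b[..., 1]) < ts) &  # similar saturation
            (np.abs(f[..., 0] - b[..., 0]) < th))   # similar hue
```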

10.
This paper presents a framework for object-oriented scene segmentation in video, which uses motion as the major characteristic to distinguish different moving objects and then segments the scene into object regions. From feature block (FB) correspondences through at least two frames, obtained via a tracking algorithm, the reference feature measurement matrix and the feature displacement matrix are formed. We propose a technique for initial motion clustering of the FBs in which the principal components (PCs) of the two matrices are adopted as motion features. These motion features have several advantages: (1) they are low-dimensional (2-D); (2) they preserve well both the spatial closeness and the motion similarity of their corresponding FBs; and (3) they tend to form distinctive clusters in the feature space, allowing simple clustering schemes to be applied. The Expectation-Maximization (EM) algorithm is applied to cluster the motion features. For scenes involving mainly camera motion, the PC-based motion features exhibit nearly parallel lines in the feature space, which facilitates a simple yet effective layer extraction scheme. The final motion-based segmentation labels all blocks in the frame: the EM algorithm is again applied to minimize an energy function that takes motion consistency and neighborhood sensitivity into account. The proposed algorithm has been applied to several test sequences, and the simulation results suggest a promising potential for video applications.
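A minimal sketch of the initial motion clustering: project the per-block measurements onto their first two principal components and cluster with an EM-fitted Gaussian mixture. The mixture size and use of scikit-learn are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def cluster_motion_features(measurements, n_motions=3):
    """measurements: (n_blocks, d) rows built from the reference feature
    measurement and feature displacement matrices."""
    # 2-D motion features from the first two principal components
    feats = PCA(n_components=2).fit_transform(measurements)
    # EM clustering of the motion features
    gmm = GaussianMixture(n_components=n_motions, covariance_type='full')
    labels = gmm.fit_predict(feats)
    return feats, labels
```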

11.
Automatic Segmentation and Tracking of Moving Objects
This paper proposes an algorithm for automatically segmenting moving objects in video sequences. The algorithm analyzes local variation of the image in the LUV color space and uses motion information to separate objects from the background. First, a graph-based method segments the image into regions according to local image variation. Moving regions are then detected by measuring the deviation between the synthesized global motion and the estimated local motion, and are tracked into the next frame with a region-based affine motion model. To improve the spatio-temporal consistency of the extracted objects, a Hausdorff tracker is applied to the binary object model. Evaluations on typical MPEG-4 test sequences demonstrate the algorithm's good performance.
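A minimal sketch of the matching measure behind a Hausdorff tracker: the symmetric Hausdorff distance between the binary object model and a candidate region, using SciPy; the search strategy around it is omitted.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def hausdorff_score(model_mask, candidate_mask):
    """Both inputs are binary masks; a lower score means a better match."""
    m = np.argwhere(model_mask)       # (row, col) points of the object model
    c = np.argwhere(candidate_mask)   # points of the candidate region
    # symmetric Hausdorff distance between the two point sets
    return max(directed_hausdorff(m, c)[0], directed_hausdorff(c, m)[0])
```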

12.
To extract complete and consistent moving video objects from video sequences, this paper uses a fuzzy-clustering-based segmentation algorithm to obtain the pixels that make up object boundaries and thereby extract the objects. The algorithm first uses the image information of the current frame and several preceding frames to compute motion features in different subbands of the wavelet domain, and from these features constructs a motion feature vector set for the low-resolution image. The fuzzy C-means clustering algorithm then separates the pixels with significant change, which replace the inter-frame difference image, and a conventional change-detection method yields the object change-detection model from which objects are extracted. Meanwhile, the mean absolute difference between consecutive frames determines how many frames are used to compute the current frame's motion features, ensuring accurate object extraction. Experimental results show that the method is effective for segmenting video objects in a variety of image sequences.
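A minimal NumPy fuzzy C-means, matching the clustering step described above; the fuzzifier m = 2 and the iteration budget are standard defaults, not the paper's settings.

```python
import numpy as np

def fuzzy_cmeans(X, c=2, m=2.0, iters=100, eps=1e-5, seed=0):
    """X: (n_samples, n_features) motion feature vectors.
    Returns (centers, memberships) with memberships of shape (n, c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)              # memberships sum to 1
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # distances of every sample to every center
        d = np.linalg.norm(X[:, None, :] - centers[None], axis=2) + 1e-10
        U_new = 1.0 / (d ** (2.0 / (m - 1.0)))     # standard FCM update
        U_new /= U_new.sum(axis=1, keepdims=True)
        if np.abs(U_new - U).max() < eps:
            U = U_new
            break
        U = U_new
    return centers, U
```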

13.
This paper proposes a video stabilization algorithm based on selected feature trajectories. First, an improved Harris corner detector extracts feature points, and K-means clustering removes foreground feature points. Then, the spatial motion consistency of feature points between frames is used to reduce mismatches, and temporal motion similarity enables long-term tracking, yielding valid feature trajectories. Finally, an objective function that accounts for both the smoothness of the feature trajectories and the degree of video quality degradation is built to compute a set of geometric transforms for the video sequence, smoothing the trajectories and producing a stabilized video. Blank regions caused by image warping are eroded under the guidance of the optical flow between the defined region of the current frame and the reference frame, and pixels still belonging to the blank region are filled by image stitching. Simulations show that for videos stabilized by this method the blank area is only about 33% of that of Matsushita's method; the method remains effective for dynamic, complex scenes and multiple large moving foreground objects, and generates full-content videos, improving visual quality while reducing the time-consuming boundary repair task.
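A minimal sketch of the feature-selection stage: Harris corners tracked to the next frame, with K-means on the point motions used to keep the dominant (background) cluster; the cluster count and detector settings are illustrative, and the paper's improved Harris detector is not reproduced.

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def background_tracks(prev, cur, k=2):
    """prev, cur: grayscale frames. Returns matched background points."""
    pts = cv2.goodFeaturesToTrack(prev, maxCorners=500, qualityLevel=0.01,
                                  minDistance=7, useHarrisDetector=True)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, cur, pts, None)
    ok = status.ravel() == 1
    p0, p1 = pts.reshape(-1, 2)[ok], nxt.reshape(-1, 2)[ok]
    motion = p1 - p0                              # per-point displacement
    labels = KMeans(n_clusters=k, n_init=10).fit_predict(motion)
    bg = np.bincount(labels).argmax()             # largest cluster = background
    return p0[labels == bg], p1[labels == bg]     # foreground points discarded
```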

14.
This paper addresses the issue of tracking a single visual object through crowded scenarios, where a target object may be intersected or partially occluded by other objects for a long duration, experience severe deformation and pose changes, and move at varying speeds against a cluttered background. A robust visual object tracking scheme is proposed that exploits the dynamics of object shape and appearance similarity. The method uses a particle filter in which a multi-mode anisotropic mean shift is embedded to improve the initial particles. Compared with the conventional particle filter and mean shift-based tracking (Shan et al. 2004), our method offers the following novelties. We employ a fully tunable rectangular bounding box described by five parameters (2D central location, width, height, and orientation), all of which are exploited in the joint tracking scheme. We derive the equations for the multi-mode version of the anisotropic mean shift, where the rectangular bounding box is partitioned into concentric areas, allowing better tracking of objects with multiple modes. The bounding box parameters are then computed using eigen-decomposition of the mean shift estimates and weighted averaging. This enables a more efficient re-distribution of initial particles towards locations associated with large weights, and hence an efficient particle filter using a very small number of particles (N = 15 is used). Experiments have been conducted on video containing a range of complex scenarios, with tracking results further evaluated by two objective criteria and compared with two existing tracking methods. Our results show that the proposed method is robust in terms of tracking drift and the tightness and accuracy of the tracked bounding boxes, especially in scenarios where the target object undergoes long-term partial occlusions, intersections, severe deformation, pose changes, or a cluttered background with similar color distributions.

15.
This paper addresses the problem of side information extraction for distributed coding of videos captured by a camera moving in a 3-D static environment. Examples of targeted applications are augmented reality, remote-controlled robots operating in hazardous environments, or remote exploration by drones. It explores the benefits of the structure-from-motion paradigm for distributed coding of this type of video content. Two interpolation methods constrained by the scene geometry, based either on block matching along epipolar lines or on 3-D mesh fitting, are first developed. These techniques are based on a robust algorithm for sub-pel matching of feature points, which leads to semi-dense correspondences between key frames. However, their rate-distortion (RD) performances are limited by misalignments between the side information and the actual Wyner-Ziv (WZ) frames due to the assumption of linear motion between key frames. To cope with this problem, two feature point tracking techniques are introduced, which recover the camera parameters of the WZ frames. A first technique, in which the frames remain encoded separately, performs tracking at the decoder and leads to significant RD performance gains. A second technique further improves the RD performances by allowing a limited tracking at the encoder. As an additional benefit, statistics on tracks allow the encoder to adapt the key frame frequency to the video motion content.

16.
We present an unsupervised motion-based object segmentation algorithm for video sequences with a moving camera, employing bidirectional inter-frame change detection. For every frame, two error frames are generated using motion compensation; they are combined, and a thresholding-based segmentation algorithm is applied. We employ a simple and effective error fusion scheme and consider spatial error localization in the thresholding step. We find the optimal weights for the weighted-mean thresholding algorithm that enable unsupervised, robust moving object segmentation. Further, a post-processing step improves the temporal consistency of the segmentation masks, achieving improved performance compared to previously proposed methods. The experimental evaluation and comparison with other methods demonstrate the validity of the proposed method.
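A minimal sketch of the bidirectional change detection: two motion-compensated error frames are fused and thresholded by a weighted mean. The min-fusion rule and the weight are illustrative stand-ins for the optimized values derived in the paper.

```python
import numpy as np

def segment_bidirectional(err_fwd, err_bwd, alpha=1.2):
    """err_fwd, err_bwd: motion-compensated error frames from the
    previous and next frames, respectively."""
    e = np.minimum(np.abs(err_fwd), np.abs(err_bwd))  # simple error fusion
    thresh = alpha * e.mean()                         # weighted-mean threshold
    return e > thresh                                 # moving-object mask
```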

17.
This paper presents a video context enhancement method for night surveillance. The basic idea is to extract and fuse the meaningful information of video sequences captured by a fixed camera under different illuminations. A distinctive characteristic of the algorithm is that it separates the image context into two classes and estimates them in different ways. One class contains the basic surrounding-scene information and the scene model, obtained via background modeling and object tracking in the daytime video sequence. The other class is extracted from the nighttime video and includes frequently moving regions, high-illumination regions, and high-gradient regions; the scene model and a pixel-wise difference method are used to segment these three regions. A shift-invariant discrete wavelet based image fusion technique integrates all of this context information into the final result. Experimental results demonstrate that the proposed approach provides far more detail and meaningful information for nighttime video.
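A minimal sketch of shift-invariant wavelet fusion with PyWavelets' stationary wavelet transform: approximation coefficients are averaged and detail coefficients taken by maximum magnitude. The Haar wavelet, single decomposition level, and fusion rules are assumptions; image sides must be even for level-1 SWT.

```python
import numpy as np
import pywt

def swt_fuse(day, night):
    """Fuse a daytime and a nighttime image (grayscale, same even size)."""
    ca_d, (ch_d, cv_d, cd_d) = pywt.swt2(day.astype(float), 'haar', level=1)[0]
    ca_n, (ch_n, cv_n, cd_n) = pywt.swt2(night.astype(float), 'haar', level=1)[0]
    # details: keep the coefficient with the larger magnitude
    fuse = lambda a, b: np.where(np.abs(a) >= np.abs(b), a, b)
    fused = [((ca_d + ca_n) / 2.0,                       # average approximations
              (fuse(ch_d, ch_n), fuse(cv_d, cv_n), fuse(cd_d, cd_n)))]
    return pywt.iswt2(fused, 'haar')                     # back to image domain
```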

18.
Real-Time Detection and Tracking of Moving Objects in Airspace Video Scene Surveillance
王东升  李在铭 《信号处理》2005,21(2):195-198
This paper analyzes a system model for real-time detection and tracking of moving objects in airspace video scenes and proposes a technique for real-time detection and tracking of moving video targets against a moving background. The method first estimates the global motion parameters of the background and applies compensation and correction, then differences the two compensated adjacent frames. Hypothesis testing extracts moving regions from the difference image, a genetic algorithm determines the optimal segmentation threshold within the designated region, and the moving video objects and their features are extracted. Finally, a linear predictor matches and tracks the targets. Experiments on a high-speed DSP-based system platform show that the method achieves good results.
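A minimal sketch of the global motion compensation step: estimate an affine background model from tracked feature points, warp the previous frame, and difference it against the current one. RANSAC-based affine estimation on grayscale frames is an assumption; the paper estimates its own global motion parameters.

```python
import cv2

def compensated_difference(prev, cur):
    """prev, cur: grayscale frames. Returns the compensated difference image."""
    pts = cv2.goodFeaturesToTrack(prev, maxCorners=300, qualityLevel=0.01,
                                  minDistance=7)
    nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, cur, pts, None)
    ok = status.ravel() == 1
    # robust affine model of the dominant (background) motion
    M, _ = cv2.estimateAffine2D(pts.reshape(-1, 2)[ok],
                                nxt.reshape(-1, 2)[ok], method=cv2.RANSAC)
    h, w = cur.shape[:2]
    warped = cv2.warpAffine(prev, M, (w, h))   # compensate camera motion
    return cv2.absdiff(cur, warped)            # residual highlights moving targets
```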

19.
This paper proposes a fast video object segmentation algorithm operating in the MPEG compressed domain. Taking features partially decoded from an MPEG-1/2 stream as input, the algorithm extracts moving objects from P-frames. To address the low boundary precision typical of compressed-domain algorithms, it uses the DC DCT coefficient and three AC DCT coefficients of each block in I- and P-frames, together with motion compensation information, to reconstruct a sub-image of a P-frame at 1/16 the size of the original image, and applies fast mean-shift clustering to obtain luminance-uniform regions with relatively high boundary precision. Because motion vector noise easily causes false detections, the algorithm combines the clustering results with the distribution of moving blocks and classifies object and background regions with a Markov random field based statistical labeling method, yielding an object mask for every P-frame. The algorithm achieves a boundary precision of 4x4 sub-blocks and, for CIF-format streams, reaches a processing speed of 40 frames per second on a Pentium IV 2 GHz platform.
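A minimal sketch of rebuilding the 1/16-size sub-image from each 8x8 block's DC and three low-frequency AC DCT coefficients via a 2x2 inverse DCT; dividing the coefficients by 4 to move from 8-point to 2-point orthonormal DCT scale is an approximation.

```python
import numpy as np

# 2-point orthonormal DCT matrix
T2 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)

def lowres_from_dct(coeffs):
    """coeffs: (rows, cols, 8, 8) per-block DCT coefficients of a frame.
    Returns a (2*rows, 2*cols) sub-image (1/16 of the original area)."""
    rows, cols = coeffs.shape[:2]
    out = np.zeros((2 * rows, 2 * cols))
    for r in range(rows):
        for c in range(cols):
            C = coeffs[r, c, :2, :2] / 4.0   # DC + 3 ACs, rescaled (approx.)
            out[2*r:2*r+2, 2*c:2*c+2] = T2.T @ C @ T2   # 2x2 inverse DCT
    return out
```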

20.
Moving object detection is a highly challenging problem in computer vision. This paper proposes a superpixel-based moving object detection method that fuses multiple spatio-temporal cues. First, the simple linear iterative clustering (SLIC) algorithm segments the current frame into a set of superpixels, and pixel-level temporal cues between frames identify the foreground superpixels in the current frame that contain motion information. Then, based on the consistency of the moving object, a model of the object in the previous frame is built, and spatial cues of the object further determine the detection window containing the moving object, turning the detection problem into a segmentation problem; dense corner detection then segments the object out of the window. Comparative experiments against several popular detection algorithms on multiple challenging public video sequences demonstrate the effectiveness and superiority of the proposed algorithm.
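A minimal sketch of the first stage: SLIC superpixels on the current frame, with a superpixel flagged as foreground when a sufficient fraction of its pixels changed between frames. The segment count and both thresholds are illustrative assumptions.

```python
import numpy as np
from skimage.segmentation import slic

def foreground_superpixels(prev, cur, n_segments=300, t_pix=25, t_frac=0.3):
    """prev, cur: HxWx3 RGB frames. Returns the label map and the ids of
    superpixels containing motion information."""
    labels = slic(cur, n_segments=n_segments, compactness=10)
    # pixel-level temporal cue: mean absolute change across channels
    changed = np.abs(cur.astype(float) - prev.astype(float)).mean(axis=-1) > t_pix
    fg = [s for s in np.unique(labels) if changed[labels == s].mean() > t_frac]
    return labels, fg
```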
