Similar Documents
20 similar documents found (search time: 31 ms)
1.
The exploitation of video data requires methods able to extract high-level information from the images. Video summarization, video retrieval, and video surveillance are examples of such applications. In this paper, we tackle the challenging problem of recognizing dynamic video content from low-level motion features. We adopt a statistical approach involving modeling, (supervised) learning, and classification issues. Because of the diversity of video content (even for a given class of events), we have to design appropriate models of visual motion and learn them from videos. We have defined original parsimonious global probabilistic motion models, both for the dominant image motion (assumed to be due to the camera motion) and the residual image motion (related to scene motion). Motion measurements include affine motion models to capture the camera motion and low-level local motion features to account for scene motion. Motion learning and recognition are solved using maximum likelihood criteria. To demonstrate the value of the proposed motion modeling and recognition framework, we report dynamic content recognition results on sports videos.
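As a minimal sketch (not the authors' implementation) of the two-layer motion model described above, the dominant (camera) motion can be fitted as a 6-parameter affine model by least squares over sparse optical-flow measurements, and whatever flow the model cannot explain is treated as residual scene motion; `points` and `flow` are hypothetical (N, 2) arrays.

```python
import numpy as np

def fit_affine_motion(points, flow):
    """Least-squares fit of u = a1 + a2*x + a3*y, v = a4 + a5*x + a6*y."""
    x, y = points[:, 0], points[:, 1]
    A = np.stack([np.ones_like(x), x, y], axis=1)          # (N, 3) design matrix
    params_u, *_ = np.linalg.lstsq(A, flow[:, 0], rcond=None)
    params_v, *_ = np.linalg.lstsq(A, flow[:, 1], rcond=None)
    return params_u, params_v

def residual_motion(points, flow, params_u, params_v):
    """Flow left over once the dominant (camera) motion is removed."""
    x, y = points[:, 0], points[:, 1]
    A = np.stack([np.ones_like(x), x, y], axis=1)
    predicted = np.stack([A @ params_u, A @ params_v], axis=1)
    return flow - predicted
```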

2.
We present a two-dimensional (2-D) mesh-based mosaic representation, consisting of an object mesh and a mosaic mesh for each frame and a final mosaic image, for video objects with mildly deformable motion in the presence of self and/or object-to-object (external) occlusion. Unlike classical mosaic representations, where successive frames are registered using global motion models, we map the uncovered regions in the successive frames onto the mosaic reference frame using local affine models, i.e., those of the neighboring mesh patches. The proposed method to compute this mosaic representation is tightly coupled with an occlusion-adaptive 2-D mesh tracking procedure, which consists of propagating the object mesh from frame to frame and updating both the object and mosaic meshes to optimize texture mapping from the mosaic to each instance of the object. The proposed representation has been applied to video object rendering and editing, including self transfiguration, synthetic transfiguration, and 2-D augmented reality in the presence of self and/or external occlusion. We also provide an algorithm to determine the minimum number of still views needed to reconstruct a replacement mosaic, which is required for synthetic transfiguration. Experimental results demonstrate both the 2-D mesh-based mosaic synthesis and two different video object editing applications on real video sequences.
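For illustration only, the local affine map of one mesh patch can be recovered from its three node correspondences, as sketched below; the node coordinates are made-up values, and the paper's mesh tracking and texture-mapping optimization are not reproduced.

```python
import numpy as np

def affine_from_triangle(src_tri, dst_tri):
    """Solve [x', y'] = M @ [x, y, 1] from three point correspondences."""
    A = np.hstack([src_tri, np.ones((3, 1))])   # (3, 3): rows are [x, y, 1]
    M = np.linalg.solve(A, dst_tri)             # (3, 2): one column per output axis
    return M.T                                  # (2, 3) affine matrix

# Example: nodes of one patch in the mosaic and in the current frame.
src = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
dst = np.array([[2.0, 1.0], [13.0, 2.0], [1.0, 12.0]])
M = affine_from_triangle(src, dst)              # apply as M @ [x, y, 1]
```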

3.
Segmentation of moving objects in video sequences is a basic task in many applications. However, it remains challenging because of the semantic gap between low-level visual features and the high-level human interpretation of video semantics. Compared with fast moving objects, accurate and perceptually consistent segmentation of slowly moving objects is more difficult. In this paper, a hybrid algorithm is proposed for segmenting slowly moving objects in video sequences with the aim of producing perceptually consistent results. First, the temporal information in the differences among multiple frames is used to detect initial moving regions. Then, a Gaussian mixture model (GMM) with an improved expectation maximization (EM) algorithm is employed to segment a spatial image into homogeneous regions. Finally, the results of motion detection and spatial segmentation are fused to extract the final moving objects. Experimental results support the effectiveness of the approach.
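A rough sketch of the three stages the abstract describes, under simplifying assumptions: grayscale frames, scikit-learn's EM-fitted GaussianMixture standing in for the paper's improved EM, and a made-up majority-overlap fusion rule.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def motion_mask(frames, thresh=15):
    """Accumulate absolute differences over several grayscale frames."""
    diffs = [np.abs(frames[i + 1].astype(int) - frames[i].astype(int))
             for i in range(len(frames) - 1)]
    return np.maximum.reduce(diffs) > thresh

def spatial_segmentation(frame, n_regions=5):
    """Cluster pixel intensities into homogeneous regions with an EM-fitted GMM."""
    pixels = frame.reshape(-1, 1).astype(float)
    gmm = GaussianMixture(n_components=n_regions, covariance_type="full")
    return gmm.fit_predict(pixels).reshape(frame.shape)

def moving_objects(frames, n_regions=5):
    """Fuse motion detection and spatial segmentation (majority-overlap rule)."""
    mask = motion_mask(frames)
    regions = spatial_segmentation(frames[-1], n_regions)
    keep = [r for r in range(n_regions) if mask[regions == r].mean() > 0.5]
    return np.isin(regions, keep)
```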

4.
5.
For a variety of applications such as video surveillance and event annotation, the spatial–temporal boundaries between video objects are required for annotating visual content with high-level semantics. In this paper, we define spatial–temporal sampling as a unified process of extracting video objects and computing their spatial–temporal boundaries using a learnt video object model. We first provide a computational approach for learning an optimal key-object codebook sequence from a set of training video clips to characterize the semantics of the detected video objects. Then, dynamic programming with the learnt codebook sequence is used to locate the video objects with spatial–temporal boundaries in a test video clip. To verify the performance of the proposed method, a human action detection and recognition system is constructed. Experimental results show that the proposed method gives good performance on several publicly available datasets in terms of detection accuracy and recognition rate.

6.
7.
Automatic Segmentation and Tracking of Moving Objects
This paper proposes an algorithm for automatically segmenting moving objects in video sequences. The algorithm analyzes local variations of the image in the LUV color space and uses motion information to separate objects from the background. First, based on local image variation, a graph-based method partitions the image into regions. Moving regions are then detected by measuring the deviation between the synthesized global motion and the estimated local motion, and each moving region is tracked into the next frame using a region-based affine motion model. To improve the spatio-temporal consistency of the extracted objects, a Hausdorff tracker is used to track a binary model of the object. Evaluations on typical MPEG-4 test sequences demonstrate the good performance of the algorithm.
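A hedged sketch of the Hausdorff tracking step: the binary object model, represented by its edge-point coordinates, is matched to the next frame by searching for the translation that minimises the symmetric Hausdorff distance. `model_pts` and `frame_edge_pts` are hypothetical (N, 2) coordinate arrays, the search range is illustrative, and practical trackers usually use a partial (rank-based) Hausdorff distance instead.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def best_translation(model_pts, frame_edge_pts, search_range=8):
    """Search small translations of the model's edge points for the best match."""
    best, best_d = (0, 0), np.inf
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            shifted = model_pts + np.array([dy, dx])
            d = max(directed_hausdorff(shifted, frame_edge_pts)[0],
                    directed_hausdorff(frame_edge_pts, shifted)[0])
            if d < best_d:
                best, best_d = (dy, dx), d
    return best, best_d
```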

8.
A Method for 3D Scene Description and Segmentation in an Object Record. Chen Tingbiao (Department of Radio Engineering, Nanjing University of Posts...

9.
To enable content-based functionalities in video coding, a decomposition of the scene into physical objects is required. Such objects are normally not characterised by homogeneous colour, intensity, or optical flow; therefore, conventional techniques based on these low-level features cannot perform the desired segmentation. The authors address segmentation and tracking of moving objects and present a new video object plane (VOP) segmentation algorithm that extracts semantically meaningful objects. A morphological motion filter detects physical objects by identifying areas that are moving differently from the background. A new filter criterion is introduced that measures the deviation of the estimated local motion from the synthesised global motion. A two-dimensional binary model is derived for the object of interest and tracked throughout the sequence by a Hausdorff object tracker. To accommodate rotations and changes in shape, the model is updated every frame by a two-stage method that accounts for rigid and non-rigid moving parts of the object. The binary model then guides the actual VOP extraction, whereby a novel boundary post-processor ensures high boundary accuracy. Experimental results demonstrate the performance of the proposed algorithm.
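Purely illustrative, the filter criterion can be approximated by thresholding the per-pixel deviation between the estimated local flow and the synthesised global flow and then cleaning the result morphologically; `local_flow` and `global_flow` are hypothetical (H, W, 2) arrays and the threshold is arbitrary.

```python
import numpy as np
from scipy.ndimage import binary_opening, label

def object_regions(local_flow, global_flow, thresh=1.5):
    """Connected regions whose motion deviates from the background motion."""
    deviation = np.linalg.norm(local_flow - global_flow, axis=-1)   # (H, W)
    mask = binary_opening(deviation > thresh, structure=np.ones((3, 3)))
    labels, n = label(mask)
    return labels, n
```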

10.
In the MPEG-4 paradigm, the sequence must be described in terms of meaningful objects. This meaningful, high-level representation should emerge from low-level primitives such as optical flow and prediction error, which are the basic elements of previous-generation video coders. The accuracy of the high-level models strongly depends on the robustness of the primitives used. It is shown how perceptual weighting in optical flow computation gives rise to better motion estimates, which consistently improve motion-based segmentation compared to equivalent unweighted motion estimates.
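To make the weighting idea concrete, here is a generic weighted least-squares optical-flow step (a Lucas-Kanade-style local solve); the per-pixel weights `w` would come from a perceptual measure, which is not reproduced here.

```python
import numpy as np

def weighted_flow(Ix, Iy, It, w):
    """Solve (A^T W A) v = -A^T W b for one local window.

    Ix, Iy, It are spatial/temporal derivatives over the window; w holds the weights.
    """
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = It.ravel()
    W = np.diag(w.ravel())
    return np.linalg.solve(A.T @ W @ A, -(A.T @ W @ b))   # (u, v)
```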

11.
Recovering high-resolution moving objects from video sequences is of practical importance in many research fields. For rigid or quasi-rigid objects undergoing global motion in dynamic video, this paper proposes an object-based super-resolution restoration scheme. An object tracking and matching algorithm based on a six-parameter affine model is first introduced for automatic tracking and matching of moving objects in the video. This motion model is then combined with a maximum a posteriori (MAP) algorithm to achieve super-resolution restoration of the tracked object. Experimental results on both simulated and real sequences show that this object-based approach achieves more accurate motion estimation and therefore better restoration quality.
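A much-simplified sketch of the MAP restoration idea, assuming known integer pixel shifts, block-average downsampling as the observation model, and a Tikhonov smoothness prior; the paper's sub-pixel affine warps and its actual prior are not reproduced.

```python
import numpy as np

def downsample(img, f=2):
    """Observation operator D: block-average decimation by factor f."""
    h, w = img.shape
    return img.reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def upsample_adjoint(img, f=2):
    """Adjoint D^T of block-average decimation."""
    return np.kron(img, np.ones((f, f))) / (f * f)

def map_super_resolve(frames, shifts, f=2, lam=0.05, lr=0.5, steps=200):
    """Gradient descent on mean_i ||D S_i x - y_i||^2 + lam ||grad x||^2."""
    x = np.kron(frames[0].astype(float), np.ones((f, f)))    # initial high-res guess
    for _ in range(steps):
        grad = np.zeros_like(x)
        for y, (dy, dx) in zip(frames, shifts):
            shifted = np.roll(x, (dy, dx), axis=(0, 1))       # warp S_i (integer shift)
            r = downsample(shifted, f) - y                    # data residual
            grad += np.roll(upsample_adjoint(r, f), (-dy, -dx), axis=(0, 1))
        grad /= len(frames)
        lap = (-4 * x + np.roll(x, 1, 0) + np.roll(x, -1, 0)
               + np.roll(x, 1, 1) + np.roll(x, -1, 1))
        grad -= lam * lap                                     # smoothness-prior gradient
        x -= lr * grad
    return x
```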

12.
13.
Semantic high-level event recognition in videos is one of the most interesting issues for multimedia searching and indexing. Since low-level features are semantically distinct from high-level events, a hierarchical video analysis framework is needed, i.e., one that uses mid-level features to provide clear linkages between low-level audio-visual features and high-level semantics. Therefore, this paper presents a framework for video event classification using the temporal context of mid-level interval-based multimodal features. In the framework, a co-occurrence symbol transformation method is proposed to explore the full temporal relations among multiple modalities in probabilistic HMM event classification. The results of our experiments on baseball video event classification demonstrate the superiority of the proposed approach.
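A compact sketch of the final classification stage: each event class is modelled by a discrete HMM, and a mid-level symbol sequence is assigned to the class whose model scores it highest under the forward algorithm. The model parameters here are placeholders, not the paper's.

```python
import numpy as np

def forward_log_likelihood(obs, start_p, trans_p, emit_p):
    """log P(obs | HMM), computed with the forward algorithm in the log domain."""
    alpha = np.log(start_p) + np.log(emit_p[:, obs[0]])
    for o in obs[1:]:
        alpha = (np.logaddexp.reduce(alpha[:, None] + np.log(trans_p), axis=0)
                 + np.log(emit_p[:, o]))
    return np.logaddexp.reduce(alpha)

def classify_event(obs, models):
    """models: {event_name: (start_p, trans_p, emit_p)}; returns the best class."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))
```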

14.
A Petri-Net-Based Method for Event Extraction from Surveillance Video
Dai Kexue, Li Guohui. 《电视技术》 (Video Engineering), 2006(1): 83-85, 89
By extending the definition of Petri nets, this paper proposes a method for describing the spatio-temporal and logical relations among events in surveillance video. Semantic-level query events are mapped onto Petri nets; during Petri-net reasoning, the interpretations produced by computer vision algorithms for the behavior of moving objects in the scene are incorporated, so that events concerning object behavior are extracted and the corresponding surveillance video segments are located.
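To make the mechanism concrete, here is a toy Petri-net sketch; the place names, the "loitering" event, and the firing rule are invented for illustration and are not taken from the paper. Vision detectors deposit tokens in places, and an event is extracted once all input places of its transition are marked.

```python
class PetriNet:
    def __init__(self, transitions):
        # transitions: {event_name: (input_places, output_places)}
        self.transitions = transitions
        self.marking = {}                        # place -> token count

    def add_token(self, place):
        self.marking[place] = self.marking.get(place, 0) + 1

    def fire_enabled(self):
        """Fire every transition whose input places are all marked; return the events."""
        fired = []
        for event, (inputs, outputs) in self.transitions.items():
            if all(self.marking.get(p, 0) > 0 for p in inputs):
                for p in inputs:
                    self.marking[p] -= 1
                for p in outputs:
                    self.add_token(p)
                fired.append(event)
        return fired

# Toy usage: "loitering" is extracted once both low-level observations hold.
net = PetriNet({"loitering": (["person_in_zone", "stationary_30s"], ["alarm"])})
net.add_token("person_in_zone")
net.add_token("stationary_30s")
events = net.fire_enabled()                      # -> ["loitering"]
```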

15.
IEEE Spectrum, 2007, 44(4): 21-26
This paper discusses a theory of the neocortical algorithm called the hierarchical temporal memory (HTM). Hierarchical temporal memories are built around a hierarchy of nodes. The hierarchy and how it works are the most important features of HTM theory. In an HTM, knowledge is distributed across many nodes up and down the hierarchy. As an HTM is trained, the low-level nodes learn first. Representations in high-level nodes then share what was previously learned in low-level nodes.

16.
A generic definition of video objects as groups of pixels with temporal motion coherence is considered. The generic video object (GVO) is a superset of the conventional video objects considered in the object segmentation literature. Because of its motion coherence, the GVO can easily be recognised by the human visual system. However, due to its arbitrary spatial distribution, the GVO cannot easily be detected by existing algorithms, which often assume spatial homogeneity of the video objects. The concept of extended optical flow is introduced and a dynamic programming framework for GVO detection and segmentation is developed, whose solution is given by the Viterbi algorithm. Using this dynamic programming formulation, the proposed object detection algorithm is able to discover the motion path of the GVO automatically and refine its spatial region of support progressively. In addition to object segmentation, the proposed algorithm can also be applied to video pre-processing, removing the so-called 'video mask' noise in digital videos. Experimental results show that this type of vision-assisted video pre-processing significantly improves compression efficiency.
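An illustrative Viterbi dynamic programme for the path-finding formulation above: each frame offers candidate regions with unary scores, consecutive frames are linked by pairwise motion-coherence scores, and the highest-scoring region sequence is recovered. The scores are assumed inputs; the paper's extended-optical-flow costs are not reproduced.

```python
import numpy as np

def viterbi_path(unary, pairwise):
    """unary[t]: (K_t,) scores; pairwise[t]: (K_t, K_{t+1}) transition scores."""
    score = unary[0].astype(float)
    back = []
    for t in range(1, len(unary)):
        total = score[:, None] + pairwise[t - 1] + unary[t][None, :]
        back.append(np.argmax(total, axis=0))    # best predecessor for each state
        score = np.max(total, axis=0)
    path = [int(np.argmax(score))]
    for bp in reversed(back):
        path.append(int(bp[path[-1]]))
    return list(reversed(path))                  # best region index per frame
```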

17.
Dominant sets based movie scene detection
Multimedia indexing and retrieval has become a challenging topic in organizing huge amounts of multimedia data. The problem is not trivial for large visual databases; hence, segmentation into low- and high-level temporal video segments can make the task more tractable. In this paper, we introduce a weighted undirected graph-based movie scene detection approach to detect semantically meaningful temporal video segments. The method is based on the idea of finding the dominant scene of the video according to a selected low-level feature. The proposed method obtains the most reliable solution first and exploits each solution recursively in the subsequent steps. The dominant movie scene boundary, which has the highest probability of being correct, is determined first, and this boundary information is exploited in the subsequent steps. We consider two partitioning strategies for determining the boundaries of the remaining scenes: a tree-based strategy and an order-based strategy. The proposed dominant-sets-based movie scene detection method is compared with graph-based video scene detection methods in the literature.
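A compact sketch of the dominant-set machinery the method builds on: given a weighted, undirected shot-similarity graph (a symmetric matrix `A` with zero diagonal, assumed available), replicator dynamics converge to a weight vector whose support is the dominant set.

```python
import numpy as np

def dominant_set(A, iters=1000, tol=1e-8):
    """Replicator dynamics x_i <- x_i (Ax)_i / (x^T A x); returns the support."""
    x = np.full(A.shape[0], 1.0 / A.shape[0])    # start at the barycentre
    for _ in range(iters):
        y = x * (A @ x)
        s = y.sum()                              # equals x^T A x
        if s == 0:
            break
        y /= s
        if np.abs(y - x).sum() < tol:
            x = y
            break
        x = y
    return np.where(x > 1e-5)[0]                 # indices forming the dominant set
```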

18.
This paper describes an object-based video coding system with new ideas in both the motion analysis and source encoding procedures. The moving objects in a video are extracted by means of a joint motion estimation and segmentation algorithm based on the Markov random field (MRF) model. The two important features of the presented technique are the temporal linking of the objects and the guidance of the motion segmentation with spatial color information. This facilitates several aspects of an object-based coder. First, a new temporal updating scheme greatly reduces the bit rate needed to code the object boundaries without resorting to crude lossy approximations. Next, the uncovered regions can be extracted and encoded efficiently by observing their revealed contents. The objects are classified adaptively as P objects or I objects and encoded accordingly. Subband/wavelet coding is applied to encode the object interiors. Simulations at very low bit rates yielded performance comparable to the H.263 coder in terms of reconstructed PSNR. The object-based coder produced visually more pleasing video with less blurriness and no block artifacts, confirming the advantages of object-based coding at very low bit rates.
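As a loose illustration of the subband/wavelet coding of object interiors (not the paper's codec), an object's texture patch can be wavelet-decomposed and its small detail coefficients zeroed before entropy coding; the wavelet choice and threshold below are arbitrary.

```python
import numpy as np
import pywt

def wavelet_code_interior(texture, wavelet="db2", level=2, thresh=10.0):
    """Zero small detail coefficients and reconstruct the coded interior."""
    coeffs = pywt.wavedec2(texture.astype(float), wavelet, level=level)
    coded = [coeffs[0]] + [
        tuple(np.where(np.abs(c) > thresh, c, 0.0) for c in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(coded, wavelet)
```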

19.
This paper proposes a low-power VLSI architecture for motion tracking that can be used in online video applications such as MPEG and VRML. The architecture is based on a hierarchical adaptive structured mesh (HASM) concept that generates a content-based video representation, and it achieves the significant reduction in power consumption that is inherent in the HASM concept. The architecture consists of two units: a motion estimation unit and a motion compensation unit. The motion estimation (ME) architecture generates a progressive mesh code that represents the mesh topology and its motion vectors. ME reduces power consumption because it (1) implements a successive splitting strategy to generate the mesh topology, which allows a pipelined implementation of the processing elements; (2) approximates the motion vectors of the mesh nodes using the three-step search algorithm; and (3) uses parallel units that reduce power consumption at a fixed throughput. The motion compensation (MC) architecture processes a reference frame, mesh nodes, and motion vectors to predict a video frame, using an affine transformation to warp the texture of the different mesh patches. MC reduces power consumption because it (1) uses a multiplication-free algorithm for the affine transformation and (2) uses parallel threads, each of which implements a pipelined chain of scalable affine units to compute the affine transformation of each patch. The architecture has been prototyped using a top-down low-power design methodology, and its performance has been analyzed in terms of video reconstruction quality, power, and delay.
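A small sketch of the three-step search the ME unit uses for the mesh-node motion vectors: the candidate step size halves at each of three stages while the best sum-of-absolute-differences (SAD) match is kept. The block size and starting step are the conventional values, used here only for illustration.

```python
import numpy as np

def sad(block, frame, y, x, size):
    """Sum of absolute differences, with out-of-bounds candidates rejected."""
    h, w = frame.shape
    if y < 0 or x < 0 or y + size > h or x + size > w:
        return np.inf
    return np.abs(block.astype(int) - frame[y:y + size, x:x + size].astype(int)).sum()

def three_step_search(block, ref_frame, y0, x0, size=16):
    best = (0, 0)                                # motion vector (dy, dx)
    step = 4
    while step >= 1:
        candidates = [(best[0] + dy, best[1] + dx)
                      for dy in (-step, 0, step) for dx in (-step, 0, step)]
        best = min(candidates,
                   key=lambda mv: sad(block, ref_frame, y0 + mv[0], x0 + mv[1], size))
        step //= 2
    return best
```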

20.
This paper presents the online estimation of temporal frequency to simultaneously detect and identify the quasiperiodic motion of an object. We introduce color to increase the discriminative power for a recurring object and to provide robustness to appearance changes caused by illumination changes. Spatial contextual information is incorporated by considering the object motion at different scales. We combine spatiospectral Gaussian filters and a temporally reparameterized Gabor filter to construct the online temporal frequency filter. We demonstrate that the online filter responds faster and decays faster than offline Gabor filters. Further, we show that the online filter is more selective to the tuned frequency than Gabor filters. We contribute to temporal frequency analysis in that we both identify ("what") and detect ("when") the frequency. In color video, we demonstrate that the filter detects and identifies the periodicity of natural motion. The velocity of moving gratings is determined in a real-world example. We consider periodic and quasiperiodic motion of both stationary and nonstationary objects.
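To illustrate the frequency-detection idea (though not the paper's reparameterised online filter), a 1-D temporal Gabor kernel tuned to a given frequency can be applied to a pixel- or feature-trajectory signal; a strong response magnitude indicates (quasi)periodic motion near that frequency. The sampling rate and kernel width below are assumptions.

```python
import numpy as np

def gabor_response(signal, freq, sigma, fs=25.0):
    """Filter the signal with a complex Gabor kernel tuned to `freq` Hz."""
    t = np.arange(-3 * sigma, 3 * sigma + 1.0 / fs, 1.0 / fs)
    kernel = np.exp(-t**2 / (2 * sigma**2)) * np.exp(2j * np.pi * freq * t)
    kernel /= np.abs(kernel).sum()
    return np.abs(np.convolve(signal, kernel, mode="valid"))

# Toy check: a 2 Hz oscillation responds most strongly to the 2 Hz kernel.
fs = 25.0
signal = np.sin(2 * np.pi * 2.0 * np.arange(0, 4, 1 / fs))
responses = {f: gabor_response(signal, f, sigma=0.5, fs=fs).mean()
             for f in (1.0, 2.0, 4.0)}
```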
