Sort order: 6 results found (search time: 171 ms)
1.
Multi-frame estimation of planar motion  (cited 4 times: 0 self-citations, 4 by others)
Zelnik-Manor L, Irani M. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000, 22(10): 1105-1116
Traditional plane alignment techniques are typically performed between pairs of frames. We present a method for extending existing two-frame planar motion estimation techniques into a simultaneous multi-frame estimation, by exploiting multi-frame subspace constraints of planar surfaces. The paper has three main contributions: 1) we show that when the camera calibration does not change, the collection of all parametric image motions of a planar surface in the scene across multiple frames is embedded in a low dimensional linear subspace; 2) we show that the relative image motion of multiple planar surfaces across multiple frames is embedded in a yet lower dimensional linear subspace, even with varying camera calibration; and 3) we show how these multi-frame constraints can be incorporated into simultaneous multi-frame estimation of planar motion, without explicitly recovering any 3D information, or camera calibration. The resulting multi-frame estimation process is more constrained than the individual two-frame estimations, leading to more accurate alignment, even when applied to small image regions.
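The low-dimensional embedding claimed in contribution 1 can be illustrated with a small synthetic sketch (hypothetical data, not the paper's estimator): stack the per-frame parametric motion vectors of a plane as rows of a matrix, observe its low effective rank via SVD, and project noisy two-frame estimates onto the recovered subspace.

```python
import numpy as np

# Hypothetical illustration of the multi-frame subspace constraint: the
# 8-parameter motions of one planar surface across F frames, stacked as
# rows of an F x 8 matrix M, span a low-dimensional linear subspace.
rng = np.random.default_rng(0)
F, d, k = 30, 8, 3                       # frames, params per frame, subspace dim

basis = rng.standard_normal((k, d))      # k basis motions (assumed synthetic)
coeffs = rng.standard_normal((F, k))     # per-frame mixing coefficients
M = coeffs @ basis                       # stacked parametric motions, rank k

# Singular values of noisy two-frame estimates reveal the rank; a
# multi-frame estimator can denoise by projecting onto that subspace.
noisy = M + 0.01 * rng.standard_normal(M.shape)
U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
rank = int(np.sum(s > 0.1 * s[0]))       # effective rank, approximately k

denoised = (U[:, :rank] * s[:rank]) @ Vt[:rank]   # rank-k projection
print(rank)
```

The projection discards the noise component outside the motion subspace, which is the intuition behind the multi-frame estimation being better constrained than independent two-frame estimations.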
2.
Lihi Zelnik-Manor, Moshe Machline, Michal Irani. International Journal of Computer Vision, 2006, 68(1): 27-41
Dynamic analysis of video sequences often relies on the segmentation of the sequence into regions of consistent motions. Approaching this problem requires a definition of which motions are regarded as consistent. Common approaches to motion segmentation usually group together points or image regions that have the same motion between successive frames (where the same motion can be 2D, 3D, or non-rigid). In this paper we define a new type of motion consistency, which is based on temporal consistency of behaviors across multiple frames in the video sequence. Our definition of consistent “temporal behavior” is expressed in terms of multi-frame linear subspace constraints. This definition applies to 2D, 3D, and some non-rigid motions without requiring prior model selection. We further show that our definition of motion consistency extends to data with directional uncertainty, thus leading to a dense segmentation of the entire image. Such segmentation is obtained by applying the new motion consistency constraints directly to covariance-weighted image brightness measurements. This is done without requiring prior correspondence estimation or feature tracking.
3.
Statistical analysis of dynamic actions  (cited 4 times: 0 self-citations, 4 by others)
Zelnik-Manor L, Irani M. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006, 28(9): 1530-1535
Real-world action recognition applications require the development of systems which are fast, can handle a large variety of actions without a priori knowledge of the type of actions, need a minimal number of parameters, and require as short a learning stage as possible. In this paper, we suggest such an approach. We regard dynamic activities as long-term temporal objects, which are characterized by spatio-temporal features at multiple temporal scales. Based on this, we design a simple statistical distance measure between video sequences which captures the similarities in their behavioral content. This measure is nonparametric and can thus handle a wide range of complex dynamic actions. Having a behavior-based distance measure between sequences, we use it for a variety of tasks, including: video indexing, temporal segmentation, and action-based video clustering. These tasks are performed without prior knowledge of the types of actions, their models, or their temporal extents.
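The abstract does not spell out the distance measure; as a rough, hypothetical stand-in for the paper's multi-scale spatio-temporal features, one can compare empirical distributions of spatio-temporal gradient magnitudes between clips with a symmetric chi-square distance:

```python
import numpy as np

def st_gradient_histogram(clip, bins=32):
    """Empirical distribution of absolute spatio-temporal gradients of a
    video clip (T x H x W array in [0, 1]). A crude stand-in for the
    paper's multi-scale spatio-temporal features."""
    gt = np.abs(np.diff(clip, axis=0)).ravel()   # temporal gradients
    gy = np.abs(np.diff(clip, axis=1)).ravel()   # vertical gradients
    gx = np.abs(np.diff(clip, axis=2)).ravel()   # horizontal gradients
    g = np.concatenate([gt, gy, gx])
    hist, _ = np.histogram(g, bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()

def chi_square(p, q, eps=1e-12):
    """Symmetric chi-square distance between two normalized histograms."""
    return 0.5 * np.sum((p - q) ** 2 / (p + q + eps))

# Synthetic clips: two with rapid dynamics, one with slow dynamics.
rng = np.random.default_rng(1)
fast = rng.random((16, 32, 32))
fast2 = rng.random((16, 32, 32))
slow = np.cumsum(rng.random((16, 32, 32)) * 0.01, axis=0)

d_same = chi_square(st_gradient_histogram(fast), st_gradient_histogram(fast2))
d_diff = chi_square(st_gradient_histogram(fast), st_gradient_histogram(slow))
print(d_same < d_diff)   # similar dynamics are closer under the measure
```

Because it compares distributions rather than fitting a model, such a measure is nonparametric in the sense the abstract describes, at the cost of ignoring spatial layout.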
4.
In many scenarios a dynamic scene is filmed by multiple video cameras located at different viewing positions. Visualizing such multi-view data on a single display raises an immediate question—which cameras capture better views of the scene? Typically (e.g. in TV broadcasts) a human producer manually selects the best view. In this paper we wish to automate this process by evaluating the quality of the view captured by every single camera. We regard human actions as three-dimensional shapes induced by their silhouettes in the space-time volume. The quality of a view is then evaluated based on features of the space-time shape, which correspond to limb visibility. Building on these features, two view quality approaches are proposed. One is generic while the other can be trained to fit any preferred action recognition method. Our experiments show that the proposed view selection provides intuitive results which match common conventions. We further show that it improves action recognition results.
5.
Every picture tells a story. In photography, the story is portrayed by a composition of objects, commonly referred to as the subjects of the piece. Were we to remove these objects, the story would be lost. When manipulating images, either for artistic rendering or cropping, it is crucial that the story of the piece remains intact. As a result, the knowledge of the location of these prominent objects is essential. We propose an approach for saliency detection that combines previously suggested patch distinctness with an object probability map. The object probability map infers the most probable locations of the subjects of the photograph according to highly distinct salient cues. The benefits of the proposed approach are demonstrated through state-of-the-art results on common data sets. We further show the benefit of our method in various manipulations of real-world photographs while preserving their meaning.
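The combination described in the abstract can be sketched on toy data (this is a hypothetical simplification, not the authors' algorithm): compute a crude per-pixel distinctness map, center an object probability map on the most distinct pixels, and multiply the two.

```python
import numpy as np

# Toy image: dim background with one bright "object" patch.
rng = np.random.default_rng(2)
H, W = 64, 64
img = rng.random((H, W)) * 0.1
img[20:36, 24:40] += 0.8

# Crude per-pixel distinctness (stand-in for patch distinctness).
distinct = np.abs(img - img.mean())
distinct /= distinct.max()

# Object probability map: a Gaussian centered on the centroid of the
# top-1% most distinct pixels (the "highly distinct salient cues").
ys, xs = np.where(distinct >= np.quantile(distinct, 0.99))
cy, cx = ys.mean(), xs.mean()
yy, xx = np.mgrid[0:H, 0:W]
obj_prob = np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * 12.0 ** 2))

# Combined saliency: distinctness weighted by object probability.
saliency = distinct * obj_prob
peak = np.unravel_index(saliency.argmax(), saliency.shape)
print(peak)   # falls inside the object region
```

The weighting suppresses distinct but isolated background pixels, which is the role the object probability map plays in the abstract.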
6.
Subspace based factorization methods are commonly used for a variety of applications, such as 3D reconstruction, multi-body segmentation and optical flow estimation. These are usually applied to a single video sequence. In this paper we present an analysis of the multi-sequence case and place it under a single framework with the single sequence case. In particular, we start by analyzing the characteristics of subspace based spatial and temporal segmentation. We show that in many cases objects moving with different 3D motions will be captured as a single object using multi-body (spatial) factorization approaches. Similarly, frames viewing different shapes might be grouped as displaying the same shape in the temporal factorization framework. Temporal factorization provides temporal grouping of frames by employing a subspace based approach to capture non-rigid shape changes (Zelnik-Manor and Irani, 2004). We analyze what causes these degeneracies and show that in the case of multiple sequences these can be made useful and provide information for both temporal synchronization of sequences and spatial matching of points across sequences.
A preliminary version of this paper appeared in Zelnik-Manor and Irani (2003).
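The multi-body (spatial) factorization that this abstract builds on can be sketched in the classical Costeira–Kanade style on synthetic trajectories (the data and thresholds below are illustrative assumptions): stack 2D point trajectories into a measurement matrix, and read off the grouping of points from the shape interaction matrix, which is block diagonal when the objects' motion subspaces are independent.

```python
import numpy as np

rng = np.random.default_rng(3)
F, Pa, Pb = 20, 8, 8    # frames, points on object A, points on object B

def trajectories(points, rng):
    """Stack the 2D trajectories of one rigidly moving point set (2 x P)
    into a 2F x P block: random per-frame rotation plus translation."""
    blocks = []
    for _ in range(F):
        th = rng.uniform(0, 2 * np.pi)
        R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
        t = rng.uniform(-1, 1, size=(2, 1))
        blocks.append(R @ points + t)
    return np.vstack(blocks)

A = trajectories(rng.uniform(-1, 1, (2, Pa)), rng)   # object 1
B = trajectories(rng.uniform(-1, 1, (2, Pb)), rng)   # object 2
Wm = np.hstack([A, B])                               # 2F x (Pa + Pb)

# Shape interaction matrix: Q[i, j] vanishes for points on independently
# moving objects, exposing the point-to-object segmentation.
U, s, Vt = np.linalg.svd(Wm, full_matrices=False)
r = int(np.sum(s > 1e-8 * s[0]))                     # total motion rank
Q = np.abs(Vt[:r].T @ Vt[:r])

cross = Q[:Pa, Pa:].max()                            # between-object entries
within = min(Q[:Pa, :Pa].max(), Q[Pa:, Pa:].max())   # within-object entries
print(cross < within)
```

The degeneracies the paper analyzes arise exactly when the two motion subspaces are *not* independent (e.g. partly shared motions), in which case the off-diagonal blocks of Q no longer vanish and the two objects are captured as one.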