Similar Documents
20 similar documents found (search time: 31 ms)
1.
Images/videos captured by portable devices (e.g., cellphones, DV cameras) often have limited fields of view. Image stitching, also referred to as mosaicing or panorama creation, can produce a wide-angle image by compositing several photographs together. Although various methods have been developed for image stitching in recent years, few works address the video stitching problem. In this paper, we present the first system to stitch videos captured by hand-held cameras. We first recover the 3D camera paths and a sparse set of 3D scene points using the CoSLAM system, and densely reconstruct the 3D scene in the overlapping regions. Then, we generate a smooth virtual camera path that stays in the middle of the original paths. Finally, the stitched video is synthesized along the virtual path as if it were taken from this new trajectory. The warping required for the stitching is obtained by optimizing over both temporal stability and alignment quality, while leveraging the 3D information at our disposal. Experiments show that our method produces high-quality stitching results for various challenging scenarios.
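The "middle, smoothed virtual path" idea lends itself to a compact illustration. The sketch below is a minimal Python version under the assumption that each recovered camera path is an (N, 3) array of per-frame positions; the function name, Gaussian smoothing, and sigma value are illustrative choices, not the paper's joint optimization over stability and alignment quality.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def virtual_camera_path(path_a: np.ndarray, path_b: np.ndarray,
                        sigma: float = 5.0) -> np.ndarray:
    """Midpoint of two recovered camera trajectories, smoothed over time.

    path_a, path_b: (N, 3) arrays of camera positions sampled per frame.
    """
    mid = 0.5 * (path_a + path_b)                        # stay between the originals
    return gaussian_filter1d(mid, sigma=sigma, axis=0)   # temporal smoothing
```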

2.
Previous video matting approaches mostly adopt the "binary segmentation + matting" strategy: first segment each frame into foreground and background regions, then extract the fine details of the foreground boundary using matting techniques. This framework has several limitations that stem from the use of binary segmentation. In this paper, we propose a new supervised video matting approach. Instead of applying binary segmentation, we explicitly model segmentation uncertainty in a novel tri-level segmentation procedure. The segmentation is done progressively, enabling us to handle difficult cases such as large topology changes, which are challenging for previous approaches. The tri-level segmentation results can be naturally fed into matting techniques to generate the final alpha mattes. Experimental results show that our system generates high-quality results with less user input than state-of-the-art methods.

3.
This paper presents a novel content-based method for transferring colour patterns between images. Unlike previous methods that rely on image colour statistics, our method puts an emphasis on high-level scene content analysis. We first automatically extract the foreground subject areas and the background scene layout from the scene, and establish semantic correspondences between regions of the source and target images. In the second step, the source image is re-coloured in a novel optimization framework that incorporates the extracted content information and the spatial distributions of the target colour styles. A new progressive transfer scheme integrates the advantages of both global and local transfer algorithms while avoiding over-segmentation artefacts in the result. Experiments show that, with a better understanding of the scene contents, our method preserves the spatial layout, the colour distribution and the visual coherence well during the transfer. As an interesting extension, our method can also be used to re-colour video clips with spatially-varied colour effects.
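For context, the classic global statistics-matching transfer that such content-based methods build on can be sketched in a few lines. This is the well-known Reinhard-style per-channel mean/std matching in Lab space, shown only as a baseline; it is not the paper's content-aware optimization.

```python
import cv2
import numpy as np

def global_color_transfer(source_bgr: np.ndarray, target_bgr: np.ndarray) -> np.ndarray:
    """Match per-channel mean/std of the source to the target in Lab space."""
    src = cv2.cvtColor(source_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    tgt = cv2.cvtColor(target_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    s_mean, s_std = src.mean(axis=(0, 1)), src.std(axis=(0, 1))
    t_mean, t_std = tgt.mean(axis=(0, 1)), tgt.std(axis=(0, 1))
    out = (src - s_mean) * (t_std / (s_std + 1e-6)) + t_mean   # shift and scale stats
    return cv2.cvtColor(np.clip(out, 0, 255).astype(np.uint8), cv2.COLOR_LAB2BGR)
```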

4.
In virtual reality (VR) applications, content is usually generated by creating a 360° video panorama of a real-world scene. Although many capture devices are being released, obtaining high-resolution panoramas and displaying a virtual world in real time remain challenging because of the computational demands. In this paper, we propose a real-time 360° video foveated stitching framework that renders the scene at different levels of detail, aiming to create a high-resolution panoramic video in real time that can be streamed directly to the client. Our foveated stitching algorithm takes videos from multiple cameras as input and, combined with measurements of human visual attention (i.e., the acuity map and the saliency map), greatly reduces the number of pixels to be processed. We further parallelize the algorithm on the GPU to achieve a responsive interface and validate our results via a user study. Our system accelerates graphics computation by a factor of 6 on a Google Cardboard display.
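As a rough illustration of the acuity-map idea, the sketch below assigns each output pixel a level-of-detail index that grows with distance from the gaze point. The linear falloff, the parameter names, and the normalization are our own assumptions for illustration; the paper's acuity and saliency models may differ.

```python
import numpy as np

def acuity_lod(height: int, width: int, gaze_xy, levels: int = 4,
               falloff: float = 0.35) -> np.ndarray:
    """Per-pixel level-of-detail index: 0 (full resolution) at the gaze
    point, coarser levels with increasing eccentricity."""
    ys, xs = np.mgrid[0:height, 0:width]
    gx, gy = gaze_xy
    # normalized radial distance from the gaze point, a stand-in for eccentricity
    r = np.hypot(xs - gx, ys - gy) / np.hypot(width, height)
    return np.minimum((r / falloff * levels).astype(int), levels - 1)
```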

5.
In this paper, we propose an interactive technique for constructing a 3D scene from sparse user inputs. We represent the 3D scene as a Layered Depth Image (LDI) composed of a foreground layer and a background layer, each with a corresponding texture and depth map. Given user-specified sparse depth inputs, depth maps are computed over superpixels using interpolation with geodesic-distance weighting and an optimization framework. This computation is immediate, which allows the user to edit the LDI interactively. Additionally, our technique automatically estimates depth and texture in occluded regions using the depth discontinuity. In our interface, the user paints strokes directly on the 3D model. The drawn strokes serve as 3D handles with which the user can pull out or push in the 3D surface easily and intuitively, with real-time feedback. We show that our technique enables efficient modeling of LDIs that produce satisfactory 3D effects.

6.
Interactive digital matting, the process of extracting a foreground object from an image based on limited user input, is an important task in image and video editing. From a computer vision perspective, this task is extremely challenging because it is massively ill-posed: at each pixel we must estimate the foreground and background colors, as well as the foreground opacity ("alpha matte"), from a single color measurement. Current approaches either restrict the estimation to a small part of the image, estimating foreground and background colors based on nearby pixels where they are known, or perform iterative nonlinear estimation by alternating foreground and background color estimation with alpha estimation. In this paper we present a closed-form solution to natural image matting. We derive a cost function from local smoothness assumptions on foreground and background colors, and show that in the resulting expression it is possible to analytically eliminate the foreground and background colors, yielding a quadratic cost function in alpha. This allows us to find the globally optimal alpha matte by solving a sparse linear system of equations. Furthermore, the closed-form formula allows us to predict the properties of the solution by analyzing the eigenvectors of a sparse matrix closely related to the matrices used in spectral image segmentation algorithms. We show that high-quality mattes for natural images may be obtained from a small amount of user input.
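The "solve a sparse linear system for alpha" step can be illustrated structurally. In the sketch below a simple 4-neighbour colour-affinity Laplacian stands in for the paper's matting Laplacian (which is derived from local linear colour models), so this shows the solver pattern only, not the paper's exact cost function; all names and the sigma/lambda values are illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spsolve

def solve_alpha(image: np.ndarray, scribbles: np.ndarray,
                lam: float = 100.0, sigma: float = 0.1) -> np.ndarray:
    """image: HxWx3 float array in [0, 1]; scribbles: HxW with +1 fg, -1 bg, 0 unknown."""
    h, w, _ = image.shape
    n = h * w
    idx = np.arange(n).reshape(h, w)
    rows, cols, vals = [], [], []
    for di, dj in ((0, 1), (1, 0)):                       # right and down neighbours
        a = idx[:h - di, :w - dj].ravel()
        b = idx[di:, dj:].ravel()
        diff = image[:h - di, :w - dj] - image[di:, dj:]
        wgt = np.exp(-(diff ** 2).sum(-1).ravel() / (2 * sigma ** 2))
        rows += [a, b]; cols += [b, a]; vals += [wgt, wgt]
    W = sp.coo_matrix((np.concatenate(vals),
                       (np.concatenate(rows), np.concatenate(cols))), shape=(n, n))
    L = sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W.tocsr()   # graph Laplacian
    known = (scribbles != 0).ravel().astype(float)                # scribbled pixels
    b_vec = (scribbles > 0).ravel().astype(float)                 # their 0/1 labels
    # globally optimal alpha of the quadratic cost: (L + lam*D) alpha = lam*D*b
    alpha = spsolve((L + lam * sp.diags(known)).tocsc(), lam * known * b_vec)
    return np.clip(alpha, 0.0, 1.0).reshape(h, w)
```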

7.
Traditional fire detection systems rely mainly on sensors; because the hardware is exposed to dust and other contaminants for long periods, it is quite likely to fail or malfunction. Given these shortcomings, applying an image stitching system to wildfire detection in the field promises better results. Images captured by cameras usually exhibit some parallax, so the stitched panorama is prone to ghosting, and when multiple images are stitched, the resulting panorama does not form a regular rectangle. To address these problems, an image warping method is used to achieve accurate alignment, and the Seam Carving algorithm is used to rectangularize the irregular panorama into a regular one. Finally, the regular panorama is compared against a background image of the original scene to detect fire.

8.
Stitching motions from multiple videos into a single video scene is a challenging task in current video fusion and mosaicing research and in film production. In this paper, we present a novel method for video motion stitching based on the similarities of trajectory and position of foreground objects. First, multiple video sequences are registered in a common reference frame, where we estimate the static and dynamic backgrounds; the former distinguishes the foreground from the background and the static regions from the dynamic ones, and the latter is used to mosaic the warped input video sequences into a panoramic video. Motion similarity is then calculated from trajectory and position similarity, and the corresponding motion parts are extracted from the multiple video sequences. Finally, using these corresponding motion parts, the foregrounds of the different videos and the dynamic backgrounds are fused into a single video scene through Poisson editing, with the motions stitched together. Our major contributions are a framework for multiple-video mosaicing based on motion similarity and a method for calculating motion similarity from trajectory and position similarity. Experiments on everyday videos show that the agreement of trajectory and position similarities with the true motion similarity plays a decisive role in determining whether two motions can be stitched. We obtain satisfactory results for motion stitching and video mosaicing.
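A minimal sketch of combining trajectory similarity with position similarity might look as follows. The cosine-of-velocities shape term, the distance squashing, and the equal weights are our own assumptions, since the abstract does not fix these formulas.

```python
import numpy as np

def motion_similarity(traj_a: np.ndarray, traj_b: np.ndarray,
                      w_traj: float = 0.5, w_pos: float = 0.5) -> float:
    """Combine trajectory-shape and position similarity, both in [0, 1].

    traj_a, traj_b: (N, 2) object positions sampled at the same N frames.
    """
    va, vb = np.diff(traj_a, axis=0), np.diff(traj_b, axis=0)
    # shape term: mean cosine similarity of per-frame velocities, mapped to [0, 1]
    num = (va * vb).sum(axis=1)
    den = np.linalg.norm(va, axis=1) * np.linalg.norm(vb, axis=1) + 1e-9
    s_traj = float(np.clip((num / den).mean() * 0.5 + 0.5, 0.0, 1.0))
    # position term: mean inter-trajectory distance, squashed to [0, 1]
    d = np.linalg.norm(traj_a - traj_b, axis=1).mean()
    s_pos = 1.0 / (1.0 + d)
    return w_traj * s_traj + w_pos * s_pos
```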

9.
10.
This paper introduces a modeling and implementation approach for panoramic images. The influence of image rotation on the stitching process is taken into account, which helps improve stitching quality. Cylindrical panoramic images were successfully stitched onto cube surfaces, and a browser was developed for viewing these cube-surface cylindrical panoramas.

11.
A Detection Algorithm for Small Moving Objects in Panoramic Video with Complex Backgrounds
To address the low accuracy of small moving object detection in panoramic video with complex backgrounds, this paper proposes a detection algorithm for such scenes. First, to reduce interference from complex background information and improve detection accuracy, the fast robust principal component analysis (Fast RPCA) algorithm separates the foreground and background of the panoramic video frames, and the foreground is extracted as an effective image feature. Then, the anchor scales of the region proposal network (RPN) in Faster R-CNN are adapted to the object sizes found in panoramic images, and the network is trained on the foreground feature maps. Finally, the RPN and Fast R-CNN share convolutional layers to produce the detection model, enabling accurate detection of small objects in panoramic video. Experimental results show that the proposed algorithm effectively suppresses the influence of complex background information on detection accuracy and achieves high accuracy for small moving objects in panoramic video.
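The RPCA decomposition behind the first step can be sketched with a plain principal-component-pursuit loop (inexact augmented Lagrangian); this is a generic implementation for illustration, not the accelerated Fast RPCA variant the paper uses, and the lambda/mu initializations are common heuristics.

```python
import numpy as np

def rpca(D: np.ndarray, lam=None, mu=None, iters: int = 100, tol: float = 1e-7):
    """Split D into low-rank L (background) plus sparse S (foreground).

    D: data matrix with one vectorized grayscale frame per column.
    """
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    if mu is None:
        mu = 0.25 * m * n / (np.abs(D).sum() + 1e-9)
    L = np.zeros_like(D); S = np.zeros_like(D); Y = np.zeros_like(D)
    for _ in range(iters):
        # low-rank update: singular-value thresholding
        U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # sparse update: entrywise soft-thresholding
        T = D - L + Y / mu
        S = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        Y += mu * (D - L - S)                      # dual ascent
        if np.linalg.norm(D - L - S) <= tol * np.linalg.norm(D):
            break
    return L, S
```

Stacking each frame as a column of D, the columns of L reconstruct the static background while the nonzero entries of S mark moving-foreground pixels.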

12.
Guo Yuanhao, Zhao Rongkai, Wu Song, Wang Chao. Multimedia Tools and Applications, 2018, 77(17): 22299-22318.

Panoramic photography requires intensive image stitching. A large number of images makes stitching expensive, while overly sparse imaging can yield a poor-quality panorama due to insufficient correlation between adjacent images, so studying the balance between image quantity and image correlation can improve both the efficiency and the quality of panoramic photography. In this work, we present a novel approach to estimating optimal image capture patterns for panoramic photography, aiming to minimize the number of images while preserving sufficient image correlation. We represent image correlation as the overlap between the view ranges observed from adjacent images. Moreover, a time-consuming panoramic imaging process results in considerable illumination variation of the scene across images, which makes stitching more challenging. To solve this problem, we design a series of imaging routines for our capture patterns that preserve content consistency, ensuring that our method generalizes to various cameras. Experimental results show that the proposed method obtains the optimal image capture pattern very efficiently; with these patterns we use a balanced number of images and still achieve good panoramic photography results.
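For a single-row rotational capture pattern, the overlap-versus-quantity trade-off has a simple closed form, sketched below under the assumption of a camera rotating about its optical centre with a fixed horizontal field of view; this is illustrative geometry, not the paper's estimation method.

```python
import numpy as np

def horizontal_overlap(fov_deg: float, step_deg: float) -> float:
    """Fraction of horizontal view shared by two adjacent shots:
    overlap = 1 - step / fov, clamped to [0, 1]."""
    return float(np.clip(1.0 - step_deg / fov_deg, 0.0, 1.0))

def min_shots(fov_deg: float, min_overlap: float = 0.25) -> int:
    """Smallest number of shots covering 360 degrees while keeping at
    least `min_overlap` overlap between neighbouring images."""
    step = fov_deg * (1.0 - min_overlap)       # largest admissible rotation step
    return int(np.ceil(360.0 / step))

# Example: a 60-degree lens with 25% required overlap needs ceil(360/45) = 8 shots.
```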


13.
A 360-Degree Cylindrical Panoramic Image Generation Algorithm and Its Implementation
The 360-degree cylindrical panorama is one of the basic elements of panoramic video systems. As long as the viewpoint remains fixed, a view of the scene in any direction can be obtained by reprojecting the panorama onto the corresponding view plane. This paper presents a new set of cylindrical projection and seamless stitching algorithms that generate such panoramas effectively and quickly, as has been demonstrated in practice.
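The cylindrical projection underlying such panoramas is standard and easy to sketch. The code below backward-maps each cylinder pixel to the source image (f is the focal length in pixels); it illustrates the projection only, not the paper's specific stitching algorithm.

```python
import cv2
import numpy as np

def cylindrical_warp(img: np.ndarray, f: float) -> np.ndarray:
    """Project a pinhole image onto a cylinder of radius f (in pixels).

    For each output pixel (x', y'), centred at the principal point, sample
    the input at x = f * tan(x'/f), y = y' * sqrt(x^2 + f^2) / f.
    """
    h, w = img.shape[:2]
    cx, cy = w / 2.0, h / 2.0
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    theta = (xs - cx) / f                       # angle along the cylinder
    x = f * np.tan(theta)                       # back onto the image plane
    y = (ys - cy) * np.sqrt(x ** 2 + f ** 2) / f
    map_x = (x + cx).astype(np.float32)
    map_y = (y + cy).astype(np.float32)
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT)
```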

14.
The panorama is one of the more effective image-based rendering techniques in use today. This paper analyzes and compares several commonly used image stitching algorithms and proposes an improved panorama generation algorithm. By preprocessing the input images, improving the feature point selection method, and optimizing the feature point matching process, the algorithm improves both the efficiency and the accuracy of panoramic virtual scene generation.
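A typical feature-point matching and alignment step of the kind such algorithms refine can be sketched with OpenCV. ORB features and RANSAC homography estimation are stand-ins here, since the abstract does not commit to a specific detector or matcher.

```python
import cv2
import numpy as np

def pairwise_homography(img1: np.ndarray, img2: np.ndarray):
    """Estimate the homography mapping img1 onto img2 from ORB matches,
    with RANSAC filtering of outlier correspondences."""
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(img1, None)
    k2, d2 = orb.detectAndCompute(img2, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:500]
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, mask = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H, int(mask.sum())        # homography and inlier count
```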

15.
Segmenting a moving foreground (fg) from its background (bg) is a fundamental step in many machine vision and computer graphics applications. Nevertheless, hardly any attempts have been made to tackle this problem in dynamic 3D scanned scenes, which are typically challenging due to noise and large missing parts. Here, we present a novel approach for motion segmentation in dynamic point-cloud scenes designed to cater to the unique properties of such data. Our key idea is to augment fg/bg classification with an active learning framework that refines the segmentation in an adaptive manner. Our method initially classifies the scene points as either fg or bg in an unsupervised manner, by training discriminative RBF-SVM classifiers on automatically labeled, high-certainty fg/bg points. Next, we adaptively detect unreliable classification regions (i.e., where fg/bg separation is uncertain), locally add more training examples to better capture the motion in these areas, and re-train the classifiers to fine-tune the segmentation. This not only improves segmentation accuracy, but also allows our method to operate in a coarse-to-fine manner and thereby efficiently process high-density point clouds. Additionally, we present a unique interactive paradigm for enhancing this learning process using a manual editing tool: the user explicitly edits the RBF-SVM decision borders in unreliable regions to refine and correct the classification. We provide extensive qualitative and quantitative experiments on both real (scanned) and synthetic dynamic scenes.
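A toy version of the train-then-refine loop, using scikit-learn's RBF-SVM: here the points near the decision border are auto-labeled by the current model as a stand-in for the paper's local re-sampling and interactive border editing, so this illustrates the loop structure only.

```python
import numpy as np
from sklearn.svm import SVC

def refine_fg_bg(features: np.ndarray, labels: np.ndarray,
                 unlabeled: np.ndarray, rounds: int = 3,
                 margin: float = 0.25) -> SVC:
    """Train an RBF-SVM on high-certainty fg/bg points, then iteratively
    absorb uncertain points near the decision border into the training set.

    features: (N, d) labeled point features; labels: (N,) in {0, 1}.
    """
    X, y = features.copy(), labels.copy()
    clf = SVC(kernel="rbf", probability=True)
    for _ in range(rounds):
        clf.fit(X, y)
        p = clf.predict_proba(unlabeled)[:, 1]
        uncertain = np.abs(p - 0.5) < margin        # near the fg/bg border
        if not uncertain.any():
            break
        X = np.vstack([X, unlabeled[uncertain]])    # add examples in those areas
        y = np.concatenate([y, (p[uncertain] > 0.5).astype(int)])
        unlabeled = unlabeled[~uncertain]
    return clf
```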

16.
Repeated scene elements are copious and ubiquitous in natural images. Cutting out those repeated elements usually involves tedious and laborious user interaction with previous image segmentation methods. In this paper, we present RepSnapping, a novel method oriented toward cutout of repeated scene elements with much less user interaction. By exploiting the inherent similarity between repeated elements, a new optimization model is introduced to thread correlated elements through the segmentation procedure. The proposed model enables an efficient solution using max-flow/min-cut on an extended graph. Experiments indicate that RepSnapping facilitates cutout of repeated elements better than state-of-the-art interactive image segmentation and repetition detection methods.
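The max-flow/min-cut solution step can be illustrated on a generic graph. The sketch below uses networkx and the standard s-t construction in which nodes left on the source side of the minimum cut are labeled foreground; the paper's extended graph additionally threads correlated repeated elements, which is not reproduced here.

```python
import networkx as nx

def binary_min_cut(unary_fg: dict, unary_bg: dict, edges) -> set:
    """Standard s-t construction: cutting SRC->p costs unary_bg[p] (paid
    when p ends up background), p->SNK costs unary_fg[p] (paid when p is
    foreground); pairwise capacities discourage cutting similar neighbours.

    edges: iterable of (u, v, weight) smoothness terms.
    Returns the set of nodes labeled foreground (source side of the cut).
    """
    g = nx.DiGraph()
    for u, v, w in edges:                       # symmetric pairwise terms
        g.add_edge(u, v, capacity=w)
        g.add_edge(v, u, capacity=w)
    for p in set(unary_fg) | set(unary_bg):
        g.add_edge("SRC", p, capacity=unary_bg.get(p, 0.0))
        g.add_edge(p, "SNK", capacity=unary_fg.get(p, 0.0))
    _, (src_side, _) = nx.minimum_cut(g, "SRC", "SNK")
    return src_side - {"SRC"}
```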

17.
We propose an iterative energy minimization framework for interactive image matting. Our approach is easy to use in the sense that it is fast and requires only a few user-specified strokes marking the foreground and background. Beginning with the known region, we model the unknown region as a Markov Random Field (MRF) and formulate its energy in each iteration as the combination of a data term and a smoothness term. By automatically adjusting the weights of both terms during the iterations, a first-order continuous and feature-preserving result is obtained rapidly within several iterations. The energy optimization can further be performed in selected local regions for refined results. We demonstrate that our energy-driven scheme extends to video matting, where spatio-temporal smoothness is faithfully preserved. The proposed approach outperforms previous methods in both quality and performance on quite challenging examples.

18.
Video remains the method of choice for capturing temporal events. However, without access to the underlying 3D scene models, it remains difficult to make object-level edits in a single video or across multiple videos. While it may be possible to explicitly reconstruct the 3D geometry to facilitate these edits, such a workflow is cumbersome, expensive, and tedious. In this work, we present a much simpler workflow for creating plausible edits and mixes of raw video footage using only sparse structure points (SSP) recovered directly from the raw sequences. First, we utilize user scribbles to structure the point representations obtained by running structure-from-motion on the input videos. The resulting structure points, even when noisy and sparse, then enable various video edits in 3D, including view perturbation, keyframe animation, object duplication, and transfer across videos. Specifically, we describe how to synthesize object images from new views using a novel image-based rendering technique with the SSPs as a proxy for the missing 3D scene information. We propose structure-preserving image warping on multiple input frames adaptively selected from the object video, followed by spatio-temporally coherent image stitching to compose the final object image. Simple planar shadows and depth maps are synthesized for objects to generate plausible video sequences mimicking real-world interactions. We demonstrate our system on a variety of input videos, producing complex edits that are otherwise difficult to achieve.

19.
Image segmentation, the extraction of meaningful regions from an image, is a key technique in image processing and computer vision. Automatic segmentation methods do not handle images with complex foregrounds well, so this paper proposes an interactive foreground extraction algorithm based on region centers. Because a complex foreground is hard to describe with a single homogeneous region, multiple region centers are used to characterize the target region. To improve the stability of segmentation, a similarity measure based on superpixel color, spatial position, and texture is given; to ensure the connectivity and accuracy of the segmented regions, a geodesic distance defined over superpixels is introduced. The local density of superpixels under this geodesic distance is used to identify candidate region centers, and user interaction determines which centers belong to the foreground, yielding the final foreground. Simulations on a large set of color images show that a small amount of user interaction during segmentation effectively improves the stability and accuracy of the result.
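The geodesic distance over a superpixel graph can be computed with Dijkstra's algorithm. The sketch below assumes the superpixel adjacency pairs and their feature (color/texture) dissimilarities have already been extracted; it illustrates the distance computation, not the paper's full density-based center selection.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import dijkstra

def superpixel_geodesics(n_sp: int, adjacency, feature_dist, seeds):
    """Geodesic distance from seed superpixels over the adjacency graph.

    adjacency: list of (i, j) neighbouring superpixel index pairs;
    feature_dist: matching non-negative dissimilarities (edge weights);
    seeds: list of seed superpixel indices.
    """
    i, j = zip(*adjacency)
    w = np.asarray(feature_dist, dtype=float)
    graph = coo_matrix((np.r_[w, w], (np.r_[i, j], np.r_[j, i])),
                       shape=(n_sp, n_sp))       # symmetric weighted graph
    d = dijkstra(graph.tocsr(), directed=False, indices=seeds)
    return d.min(axis=0)                         # distance to the nearest seed
```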

20.
Extracting foreground objects from videos captured by a handheld camera has emerged as a new challenge. While existing approaches aim to exploit several cues such as depth and motion to extract the foreground layer, they have limitations in handling partial movement and cast shadows. In this paper, we bring a novel perspective to these two issues by utilizing occlusion maps induced by object and camera motion and taking advantage of interactive image segmentation methods. For partial movement, we treat each video frame as an image and synthesize "seeding" user interactions (i.e., the user manually marking foreground and background) from both forward and backward occlusion maps, leveraging advances in high-quality interactive image segmentation. For cast shadows, we use a paired-region shadow detection method to further refine the initial segmentation by removing detected shadow regions. Qualitative and quantitative evaluations on the Hopkins dataset demonstrate both the effectiveness and the efficiency of the proposed approach.
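An occlusion map can be sketched from forward/backward optical-flow consistency: pixels whose forward flow is not undone by the backward flow are likely occluded or disoccluded. The Farnebäck flow, its parameters, and the round-trip threshold below are our assumptions; the paper's occlusion maps may be computed differently.

```python
import cv2
import numpy as np

def occlusion_map(frame0: np.ndarray, frame1: np.ndarray,
                  thresh: float = 1.0) -> np.ndarray:
    """Boolean map of pixels failing the forward-backward flow check."""
    g0 = cv2.cvtColor(frame0, cv2.COLOR_BGR2GRAY)
    g1 = cv2.cvtColor(frame1, cv2.COLOR_BGR2GRAY)
    fwd = cv2.calcOpticalFlowFarneback(g0, g1, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    bwd = cv2.calcOpticalFlowFarneback(g1, g0, None, 0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = g0.shape
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    # follow the forward flow, then sample the backward flow where we land
    map_x = (xs + fwd[..., 0]).astype(np.float32)
    map_y = (ys + fwd[..., 1]).astype(np.float32)
    bwd_at = cv2.remap(bwd, map_x, map_y, cv2.INTER_LINEAR)
    err = np.linalg.norm(fwd + bwd_at, axis=2)   # round trip should return home
    return err > thresh                          # True where likely occluded
```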
