Similar literature
20 similar documents found (search time: 28 ms)
1.
This paper integrates fully automatic video object segmentation and tracking including detection and assignment of uncovered regions in a 2-D mesh-based framework. Particular contributions of this work are (i) a novel video object segmentation method that is posed as a constrained maximum contrast path search problem along the edges of a 2-D triangular mesh, and (ii) a 2-D mesh-based uncovered region detection method along the object boundary as well as within the object. At the first frame, an optimal number of feature points are selected as nodes of a 2-D content-based mesh. These points are classified as moving (foreground) and stationary nodes based on multi-frame node motion analysis, yielding a coarse estimate of the foreground object boundary. Color differences across triangles near the coarse boundary are employed for a maximum contrast path search along the edges of the 2-D mesh to refine the boundary of the video object. Next, we propagate the refined boundary to the subsequent frame by using motion vectors of the node points to form the coarse boundary at the next frame. We detect occluded regions by using motion-compensated frame differences and range filtered edge maps. The boundaries of detected uncovered regions are then refined by using the search procedure. These regions are either appended to the foreground object or tracked as new objects. The segmentation procedure is re-initialized when unreliable motion vectors exceed a certain number. The proposed scheme is demonstrated on several video sequences.

2.
We present a two-dimensional (2-D) mesh-based mosaic representation, consisting of an object mesh and a mosaic mesh for each frame and a final mosaic image, for video objects with mildly deformable motion in the presence of self and/or object-to-object (external) occlusion. Unlike classical mosaic representations where successive frames are registered using global motion models, we map the uncovered regions in the successive frames onto the mosaic reference frame using local affine models, i.e., those of the neighboring mesh patches. The proposed method to compute this mosaic representation is tightly coupled with an occlusion-adaptive 2-D mesh tracking procedure, which consists of propagating the object mesh from frame to frame and updating both the object and mosaic meshes to optimize texture mapping from the mosaic to each instance of the object. The proposed representation has been applied to video object rendering and editing, including self transfiguration, synthetic transfiguration, and 2-D augmented reality in the presence of self and/or external occlusion. We also provide an algorithm to determine the minimum number of still views needed to reconstruct a replacement mosaic, which is needed for synthetic transfiguration. Experimental results are provided to demonstrate both the 2-D mesh-based mosaic synthesis and two different video object editing applications on real video sequences.

3.
Accurate and fast localization of a predefined target region inside the patient is an important component of many image-guided therapy procedures. This problem is commonly solved by registration of intraoperative 2-D projection images to 3-D preoperative images. If the patient is not fixed during the intervention, the 2-D image acquisition is repeated several times during the procedure, and the registration problem can be cast instead as a 3-D tracking problem. To solve the 3-D problem, we propose in this paper to apply 2-D region tracking to first recover the components of the transformation that are in-plane to the projections. The 2-D motion estimates of all projections are backprojected into 3-D space, where they are then combined into a consistent estimate of the 3-D motion. We compare this method to intensity-based 2-D to 3-D registration and a combination of 2-D motion backprojection followed by a 2-D to 3-D registration stage. Using clinical data with a fiducial marker-based gold-standard transformation, we show that our method is capable of accurately tracking vertebral targets in 3-D from 2-D motion measured in X-ray projection images. Using a standard tracking algorithm (hyperplane tracking), tracking is achieved at video frame rates but fails relatively often (32% of all frames tracked with target registration error (TRE) better than 1.2 mm, 82% of all frames tracked with TRE better than 2.4 mm). With intensity-based 2-D to 2-D image registration using normalized mutual information (NMI) and pattern intensity (PI), accuracy and robustness are substantially improved. NMI tracked 82% of all frames in our data with TRE better than 1.2 mm and 96% of all frames with TRE better than 2.4 mm. This comes at the cost of a reduced frame rate, 1.7 s average processing time per frame and projection device. Results using PI were slightly more accurate, but required on average 5.4 s time per frame. These results are still substantially faster than 2-D to 3-D registration. We conclude that motion backprojection from 2-D motion tracking is an accurate and efficient method for tracking 3-D target motion, but tracking 2-D motion accurately and robustly remains a challenge.

4.
The purpose of this study is to investigate a variational method for joint multiregion three-dimensional (3-D) motion segmentation and 3-D interpretation of temporal sequences of monocular images. Interpretation consists of dense recovery of 3-D structure and motion from the image sequence spatiotemporal variations due to short-range image motion. The method is direct insomuch as it does not require prior computation of image motion. It allows movement of both the viewing system and multiple independently moving objects. The problem is formulated following a variational statement with a functional containing three terms. The first term measures the conformity of the interpretation within each region of 3-D motion segmentation to the image sequence spatiotemporal variations. The second term regularizes depth. The assumption that environmental objects are rigid accounts automatically for the regularity of 3-D motion within each region of segmentation. The third and last term enforces the regularity of segmentation boundaries. Minimization of the functional follows the corresponding Euler-Lagrange equations. This results in iterated concurrent computation of 3-D motion segmentation by curve evolution, depth by gradient descent, and 3-D motion by least squares within each region of segmentation. Curve evolution is implemented via level sets for topology independence and numerical stability. This algorithm and its implementation are verified on synthetic and real image sequences. Viewers presented with anaglyphs of stereoscopic images constructed from the algorithm's output reported a strong perception of depth.

5.
This paper proposes a low-power VLSI architecture for motion tracking that can be used in online video applications such as MPEG and VRML. The proposed architecture uses a hierarchical adaptive structured mesh (HASM) concept that generates a content-based video representation. The developed architecture shows a significant reduction in power consumption, which is inherent in the HASM concept. The proposed architecture consists of two units: a motion estimation unit and a motion compensation unit. The motion estimation (ME) architecture generates a progressive mesh code that represents a mesh topology and its motion vectors. ME reduces the power consumption since it (1) implements a successive splitting strategy to generate the mesh topology, which allows a pipelined implementation of the processing elements; (2) approximates the mesh node motion vectors using the three-step search algorithm; and (3) uses parallel units that reduce the power consumption at a fixed throughput. The motion compensation (MC) architecture processes a reference frame, mesh nodes, and motion vectors to predict a video frame, using affine transformations to warp the texture of the different mesh patches. MC reduces the power consumption since it (1) uses a multiplication-free algorithm for the affine transformation and (2) uses parallel threads, each of which implements a pipelined chain of scalable affine units to compute the affine transformation of each patch. The architecture has been prototyped using a top-down low-power design methodology. The performance of the architecture has been analyzed in terms of video construction quality, power, and delay.
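The three-step search named in this abstract can be illustrated with a short software sketch. It is not the paper's hardware design: the block size, the sum-of-absolute-differences cost, the frame layout (lists of rows), and the function names `sad` and `three_step_search` are all assumptions made for illustration.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at (bx, by)
    and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def three_step_search(cur, ref, bx, by, n=4, step=4):
    """Return the (dx, dy) motion vector found by a three-step search.

    Each pass tests the 3 x 3 grid of candidates around the current best
    vector, then halves the step size (4, 2, 1 by default)."""
    h, w = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    while step >= 1:
        cx, cy = best  # centre of this pass is the best vector so far
        for oy in (-step, 0, step):
            for ox in (-step, 0, step):
                dx, dy = cx + ox, cy + oy
                # Skip candidates whose block would leave the reference frame.
                if not (0 <= bx + dx and bx + dx + n <= w
                        and 0 <= by + dy and by + dy + n <= h):
                    continue
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if cost < best_cost:
                    best, best_cost = (dx, dy), cost
        step //= 2
    return best
```

The search evaluates at most 27 candidates instead of every position in the window, which is why it is attractive for low-power designs; as a greedy method it can settle on a local minimum of the cost.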

6.
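Video coder combining the 3-D wavelet transform with motion compensation
俞静, 覃团发, 区骋. 《电讯技术》 (Telecommunication Engineering), 2006, 46(3): 66-69
An improved scheme is proposed for video compression algorithms that combine the 3-D wavelet transform with motion compensation. To address the weaknesses of the motion-compensated lifting (MCLIFT) framework, and drawing on the characteristics of MPEG, a new frame structure is used to filter the video sequence between frames and remove temporal redundancy. Each frame is then decomposed spatially with a wavelet transform, and the wavelet coefficients are coded with the SPIHT algorithm. Experiments show that the method inherits the advantages of the MCLIFT framework while reducing the delay and the required frame buffer, and that the MPEG-like frame structure further lowers the bit rate and raises the compression ratio.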
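For readers unfamiliar with lifting, the temporal filtering step can be sketched in its simplest form: a Haar lifting step on a pair of frames. This is only a sketch under assumptions. It omits the motion compensation that MCLIFT adds to the predict and update steps and does not reproduce the paper's frame structure; frames are flat lists of pixel values, and the function names are hypothetical.

```python
def haar_lift_forward(frame_a, frame_b):
    """One temporal Haar lifting step on two frames (flat lists of pixels).

    Predict: high = b - a      (temporal detail, small when frames are similar)
    Update:  low  = a + high/2 (temporal average of the two frames)"""
    high = [b - a for a, b in zip(frame_a, frame_b)]
    low = [a + h / 2 for a, h in zip(frame_a, high)]
    return low, high


def haar_lift_inverse(low, high):
    """Undo the lifting steps in reverse order to recover both frames exactly."""
    frame_a = [l - h / 2 for l, h in zip(low, high)]
    frame_b = [a + h for a, h in zip(frame_a, high)]
    return frame_a, frame_b
```

Because each lifting step is undone by subtracting what was added, the transform is invertible by construction; a motion-compensated variant would predict `frame_b` from a warped version of `frame_a` instead of from `frame_a` itself.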

7.
Fluoroscopic overlay images rendered from preoperative volumetric data can provide additional anatomical details to guide physicians during catheter ablation procedures for treatment of atrial fibrillation (AFib). As these overlay images are often compromised by cardiac and respiratory motion, motion compensation methods are needed to keep the overlay images in sync with the fluoroscopic images. So far, these approaches have either required simultaneous biplane imaging for 3-D motion compensation, or in case of monoplane X-ray imaging, provided only a limited 2-D functionality. To overcome the downsides of the previously suggested methods, we propose an approach that facilitates a full 3-D motion compensation even if only monoplane X-ray images are available. To this end, we use a training phase that employs a biplane sequence to establish a patient specific motion model. Afterwards, a constrained model-based 2-D/3-D registration method is used to track a circumferential mapping catheter. This device is commonly used for AFib catheter ablation procedures. Based on the experiments on real patient data, we found that our constrained monoplane 2-D/3-D registration outperformed the unconstrained counterpart and yielded an average 2-D tracking error of 0.6 mm and an average 3-D tracking error of 1.6 mm. The unconstrained 2-D/3-D registration technique yielded a similar 2-D performance, but the 3-D tracking error increased to 3.2 mm mostly due to wrongly estimated 3-D motion components in X-ray view direction. Compared to the conventional 2-D monoplane method, the proposed method provides a more seamless workflow by removing the need for catheter model re-initialization otherwise required when the C-arm view orientation changes. In addition, the proposed method can be straightforwardly combined with the previously introduced biplane motion compensation technique to obtain a good trade-off between accuracy and radiation dose reduction.

8.
Although it has been observed that motion-compensated frame differences increase toward block boundaries and overlapped block motion compensation (OBMC) has been shown to provide reduced blocking artifacts as well as improved prediction accuracy, there is almost no satisfactory theoretical basis that clearly interprets the space-dependent characteristics of motion-compensated frame differences, nor have the theoretical aspects of OBMC been investigated thoroughly. We first interpret the space-dependent characteristics of motion-compensated frame differences based on a novel statistical motion distribution model. We then apply the statistical motion distribution model to the analysis of prediction efficiency of OBMC. Through the analysis, we prove theoretically that OBMC can reduce and equalize the motion-compensated frame differences across a block. The analytical results are justified by empirical experiments with typical image sequences.

9.
The authors propose a new approach for tracking the deformation of the left-ventricular (LV) myocardium from two-dimensional (2-D) magnetic resonance (MR) phase contrast velocity fields. The use of phase contrast MR velocity data in cardiac motion problems has been introduced by others (N.J. Pelc et al., 1991) and shown to be potentially useful for tracking discrete tissue elements, and therefore, characterizing LV motion. However, the authors show here that these velocity data: 1) are extremely noisy near the LV borders; and 2) cannot alone be used to estimate the motion and the deformation of the entire myocardium due to noise in the velocity fields. In this new approach, the authors use the natural spatial constraints of the endocardial and epicardial contours, detected semiautomatically in each image frame, to help remove noisy velocity vectors at the LV contours. The information from both the boundaries and the phase contrast velocity data is then integrated into a deforming mesh that is placed over the myocardium at one time frame and then tracked over the entire cardiac cycle. The deformation is guided by a Kalman filter that provides a compromise between 1) believing the dense field velocity and the contour data when it is crisp and coherent in a local spatial and temporal sense and 2) employing a temporally smooth cyclic model of cardiac motion when contour and velocity data are not trustworthy. The Kalman filter is particularly well suited to this task as it produces an optimal estimate of the left ventricle's kinematics (in the sense that the error is statistically minimized) given incomplete and noise corrupted data, and given a basic dynamical model of the left ventricle. The method has been evaluated with simulated data; the average error between tracked nodes and theoretical position was 1.8% of the total path length. The algorithm has also been evaluated with phantom data; the average error was 4.4% of the total path length. The authors show that, in their initial tests with phantoms, the new approach yields small but concrete improvements over previous techniques that relied primarily on phase contrast velocity data alone. They feel that these improvements will be amplified greatly as they move to direct comparisons in in vivo and three-dimensional (3-D) datasets.
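The compromise between trusting the data and trusting the motion model can be made concrete with a scalar Kalman measurement update. This is a minimal sketch, not the authors' filter, which operates on a mesh with a cyclic dynamical model; the function name and the one-dimensional state are assumptions for illustration.

```python
def kalman_update(x_pred, p_pred, z, r):
    """Fuse a model prediction with a measurement (scalar case).

    x_pred, p_pred: predicted state and its variance (from the motion model)
    z, r:           measurement and its noise variance
    Returns the updated state, its variance, and the Kalman gain."""
    k = p_pred / (p_pred + r)       # gain near 1: trust the data; near 0: trust the model
    x = x_pred + k * (z - x_pred)   # move the prediction toward the measurement
    p = (1 - k) * p_pred            # uncertainty shrinks after the update
    return x, p, k
```

With equal variances the estimate lands halfway between prediction and measurement; as the measurement noise `r` grows, the gain falls and the estimate stays close to the model prediction, which mirrors the behaviour the abstract describes near noisy LV borders.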

10.
This paper describes a semi-automatic method for moving object segmentation and tracking. This method is suitable when a few objects have to be tracked, while the camera moves and fixates on them. The user delineates approximately the initial locations in a selected frame and specifies the depth ordering of the objects to be tracked. First, motion-based segmentation is obtained through an initial application of a region growing algorithm. The partition map is sequentially tracked from frame to frame using motion compensation and location prediction. The segmentation map is obtained by the region growing algorithm. Translational motion is assumed for the moving objects, and local intensity or color average may be used as additional features. A post-processing procedure regularizes the object boundaries over time.

11.
This paper addresses the problem of side information extraction for distributed coding of videos captured by a camera moving in a 3-D static environment. Examples of targeted applications are augmented reality, remote-controlled robots operating in hazardous environments, or remote exploration by drones. It explores the benefits of the structure-from-motion paradigm for distributed coding of this type of video content. Two interpolation methods constrained by the scene geometry, based either on block matching along epipolar lines or on 3-D mesh fitting, are first developed. These techniques are based on a robust algorithm for sub-pel matching of feature points, which leads to semi-dense correspondences between key frames. However, their rate-distortion (RD) performances are limited by misalignments between the side information and the actual Wyner-Ziv (WZ) frames due to the assumption of linear motion between key frames. To cope with this problem, two feature point tracking techniques are introduced, which recover the camera parameters of the WZ frames. A first technique, in which the frames remain encoded separately, performs tracking at the decoder and leads to significant RD performance gains. A second technique further improves the RD performances by allowing a limited tracking at the encoder. As an additional benefit, statistics on tracks allow the encoder to adapt the key frame frequency to the video motion content.

12.
Motion compensation using two-dimensional (2-D) mesh models requires computation of the parameters of a spatial transformation for each mesh element (patch). It is well known that the parameters of an affine (bilinear or perspective) mapping can be uniquely estimated from three (four) point correspondences (at the vertices of a triangular or quadrilateral mesh element). On the other hand, overdetermined solutions using more than the required minimum number of point correspondences provide increased robustness against correspondence-estimation errors; however, this necessitates special consideration to preserve mesh connectivity. This paper presents closed-form, overdetermined solutions for least squares estimation of affine motion parameters for a triangular mesh, which preserve mesh connectivity using patch-based or node-based connectivity constraints. In particular, four new algorithms are presented: patch-constrained methods using point correspondences or spatio-temporal intensity gradients, and node-constrained methods using point correspondences or spatio-temporal intensity gradients. The methods using point correspondences can be viewed as postprocessing of a dense motion field for best representation in terms of a set of irregularly spaced samples. The methods that are based on spatio-temporal intensity gradients offer closed-form solutions for direct estimation of the best node-point motion vectors (equivalently the best transformation parameters). We show that the performance of the proposed closed-form solutions is comparable to that of the alternative search-based solutions at a fraction of the computational cost.
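The starting point of this abstract, that three point correspondences fix an affine mapping and that more correspondences give an overdetermined least squares problem, can be sketched for a single patch. The sketch below is unconstrained: it does not implement the paper's patch-based or node-based connectivity constraints, and the names `solve3` and `fit_affine` are hypothetical.

```python
def solve3(m, v):
    """Solve the 3 x 3 linear system m @ p = v by Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("degenerate (collinear) point configuration")
    out = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = v[r]   # replace one column with the right-hand side
        out.append(det(mc) / d)
    return out


def fit_affine(src, dst):
    """Least squares affine map from src points to dst points (>= 3 pairs).

    Returns (a, b, c, d, e, f) with x' = a*x + b*y + c and y' = d*x + e*y + f."""
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations (A^T A) p = A^T t; the matrix is shared by both coordinates.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    atx = [sum(r[i] * t[0] for r, t in zip(rows, dst)) for i in range(3)]
    aty = [sum(r[i] * t[1] for r, t in zip(rows, dst)) for i in range(3)]
    return tuple(solve3(ata, atx) + solve3(ata, aty))
```

With exactly three non-collinear correspondences the fit reproduces them exactly; with more, residual errors are spread across all points, which is the robustness benefit the abstract refers to. Fitting each triangle independently in this way would generally break mesh connectivity, which is the problem the paper's constrained solutions address.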

13.
This paper investigates analytically the effects of motion compensation in a coder based on the observed properties of motion-compensated frame difference (MCFD) signals. The AR(1) processes with a given pixel-to-pixel autocorrelation coefficient will be used to model the intraframe images. For interframe motion, each image is allowed to have both moving and nonmoving parts, with the moving parts executing a range of translational motion. From this mathematical model, the statistical characteristics of MCFD signals are derived. Motion compensation gain, 2-D intraframe transform gain, and hybrid gain are next evaluated to ascertain their suitability for the coding process. Experimental results on the Trevor sequence seem to confirm the model and conform with its analytical results.

14.
Fast tracking of cardiac motion using 3D-HARP
Magnetic resonance (MR) tagging is capable of accurate, noninvasive quantification of regional myocardial function. Routine clinical use, however, is hindered by cumbersome and time-consuming postprocessing procedures. We propose a fast, semiautomatic method for tracking three-dimensional (3-D) cardiac motion from a temporal sequence of short- and long-axis tagged MR images. The new method, called 3-D-HARmonic Phase (3D-HARP), extends the HARP approach, previously described for two-dimensional (2-D) tag analysis, to 3-D. A 3-D material mesh model is built to represent a collection of material points inside the left ventricle (LV) wall at a reference time. Harmonic phase, a material property that is time-invariant, is used to track the motion of the mesh through a cardiac cycle. Various motion-related functional properties of the myocardium, such as circumferential strain and left ventricular twist, are computed from the tracked mesh. The correlation analysis of 3D-HARP and FINDTAGS + Tag Strain(E) Analysis (TEA), which are well-established tag analysis techniques, shows that the regression coefficients of circumferential strain (E(CC)) and twist angle are r² = 0.8605 and r² = 0.8645, respectively. The total time required for tracking 3-D cardiac motion is approximately 10 min in a 9 timeframe tagged MRI dataset and has the potential to be much faster.

15.
We propose and evaluate a number of novel improvements to the mesh-based coding scheme for 3-D brain magnetic resonance images. This includes: 1) elimination of the clinically irrelevant background leading to meshing of only the brain part of the image; 2) content-based (adaptive) mesh generation using spatial edges and optical flow between two consecutive slices; 3) a simple solution for the aperture problem at the edges, where an accurate estimation of motion vectors is not possible; and 4) context-based entropy coding of the residues after motion compensation using affine transformations. We address only lossless coding of the images, and compare the performance of uniform and adaptive mesh-based schemes. The bit rates achieved (about 2 bits per voxel) by these schemes are comparable to those of the state-of-the-art three-dimensional (3-D) wavelet-based schemes. The mesh-based schemes have been shown to be effective for the compression of 3-D brain computed tomography data also. Adaptive mesh-based schemes perform marginally better than the uniform mesh-based methods, at the expense of increased complexity.

16.
A three-dimensional (3-D) method for tracking the coronary arteries through a temporal sequence of biplane X-ray angiography images is presented. A 3-D centerline model of the coronary vasculature is reconstructed from a biplane image pair at one time frame, and its motion is tracked using a coarse-to-fine hierarchy of motion models. Three-dimensional constraints on the length of the arteries and on the spatial regularity of the motion field are used to overcome limitations of classical two-dimensional vessel tracking methods, such as tracking vessels through projective occlusions. This algorithm was clinically validated in five patients by tracking the motion of the left coronary tree over one cardiac cycle. The root mean square reprojection errors were found to be submillimeter in 93% (54/58) of the image pairs. The performance of the tracking algorithm was quantified in three dimensions using a deforming vascular phantom. RMS 3-D distance errors were computed between centerline models tracked in the X-ray images and gold-standard centerline models of the phantom generated from a gated 3-D magnetic resonance image acquisition. The mean error was 0.69 (+/- 0.06) mm over eight temporal phases and four different biplane orientations.

17.
A content-based approach to the design of a triangular mesh is presented, and its application to affine motion compensation is investigated. An image is first segmented into moving objects, which are then approximated with polygons. Then, a triangular mesh is generated within each polygon, thus ensuring that no triangle straddles multiple regions. Translation and affine motion parameters are determined for each triangle, using bidirectional motion estimation. Results for three test sequences demonstrate the advantages offered by the proposed mesh design method, and by the use of affine motion compensation.

18.
A technique for global-motion estimation and compensation in image sequences of 3-D scenes is described in this paper. Each frame is segmented into regions whose motion can be described by a single set of parameters and a set of motion parameters is estimated for each segment. This is done using an iterative block-based image segmentation combined with the estimation of the parameters describing the global motion of each segment. The segmentation is done using a Gibbs-Markov model-based iterative technique for finding a local optimum solution to a maximum a posteriori probability (MAP) segmentation problem. The initial condition for this process is obtained by applying a Hough transform to the motion vectors of each block in the frame obtained by block matching. In each iteration, given a segmentation, the motion parameters are estimated using the least-squares (LS) technique. To obtain the final segmentation and the more appropriate higher-order motion model for each segment, a final stage of splitting/merging of segments is needed. This step is performed on the basis of maximum-likelihood decisions combined with the determination of the higher-order model parameters by LS. The incorporation of the proposed global-motion estimation technique in an image-sequence coder was found to bring about a substantial reduction in bit-rate without degrading the perceived quality or the PSNR.

19.
In this paper, we propose a new bi-directional 2-D mesh representation of video objects, which utilizes forward and backward reference frames (keyframes). This framework extends the previous uni-directional mesh representation to enable efficient rendering, editing, and superresolution of video objects in the presence of occlusion by allowing bi-directional texture mapping as in MPEG B-frames. The video object of interest is tracked between two successive keyframes (which can be automatically or interactively selected) both in forward and backward directions. Keyframes provide the texture of the video object, whereas its motion is modeled by forward and backward 2-D meshes. In addition, we employ “validity maps”, associated with each 2-D mesh, which allow selective texture mapping from the keyframes. Experimental results for efficient video object editing and object-based video resolution enhancement in the presence of self-occlusion are presented to demonstrate the effectiveness of the proposed representation.

20.
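Motion-compensated frame interpolation is currently the main approach to frame rate up-conversion. To reduce blocking artifacts in the interpolated frames and to lower the computational load enough for real-time high-definition video applications, this paper proposes a frame rate up-conversion algorithm based on multi-level block-matching motion estimation with 3-D recursive search (3-D RS). The algorithm combines 3-D RS with bidirectional motion estimation: it first performs a three-level, coarse-to-fine motion estimation on adjacent frames of the sequence, then smooths the motion vector field with a simplified median filter, and finally obtains the interpolated frame by linear interpolation compensation. Experimental results show that, compared with existing motion-compensated frame interpolation algorithms, the proposed algorithm improves both the subjective and objective quality of the interpolated frames, has low complexity, and is highly practical.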
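The motion vector smoothing step can be illustrated with a plain component-wise median filter. The abstract speaks of a simplified median filter without giving details, so the 3 x 3 window, the border handling, and the name `median_smooth` below are assumptions rather than the paper's design.

```python
from statistics import median


def median_smooth(mv_field):
    """Component-wise 3 x 3 median filter over a grid of (dx, dy) block motion vectors.

    Border blocks use whichever neighbours exist inside the grid."""
    rows, cols = len(mv_field), len(mv_field[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            neigh = [mv_field[rr][cc]
                     for rr in range(max(0, r - 1), min(rows, r + 2))
                     for cc in range(max(0, c - 1), min(cols, c + 2))]
            out_row.append((median(v[0] for v in neigh),
                            median(v[1] for v in neigh)))
        out.append(out_row)
    return out
```

A single outlier vector surrounded by consistent neighbours is replaced by the neighbourhood median, which removes isolated estimation errors before the interpolated frame is built. Note that `statistics.median` averages the two middle values for an even number of neighbours, so border blocks may receive fractional vectors.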
