Similar literature
 Found 20 similar documents (search time: 15 ms)
1.
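Objective: Current feature-trajectory video stabilization algorithms cannot balance trajectory length, robustness, and trajectory utilization, so their stabilized output tends to be distorted or locally unstable. To address this problem, a feature-trajectory stabilization algorithm based on trifocal tensor reprojection is proposed. Method: The trifocal tensor is used to construct long virtual trajectories, and stabilized views are defined by smoothing these virtual trajectories. The real feature points are then reprojected into the stabilized views with the trifocal tensor, which smooths the real feature trajectories. Finally, stabilized frames are generated by mesh warping. Results: Stabilization quality was tested on a large number of videos of different types and compared with typical feature-trajectory stabilization algorithms and commercial software, including a stabilization algorithm based on trajectory growth, a stabilization algorithm based on epipolar-geometry point transfer, and the commercial software Warp Stabilizer. The proposed algorithm has low trajectory-length requirements, high trajectory utilization, and good robustness. For 92% of the severely shaky videos, it outperforms the trajectory-growth algorithm; for 93% of the videos lacking long trajectories and 71.4% of the videos with rolling-shutter distortion, it outperforms Warp Stabilizer; and compared with the epipolar point-transfer algorithm it has fewer degenerate cases, avoiding the failures caused by situations such as an intermittently stationary camera or pure camera rotation. Conclusion: The proposed algorithm places few restrictions on camera motion patterns and scene depth. It is suited to common video stabilization problems such as lack of parallax, non-planar scene structure, and rolling-shutter distortion, and it still achieves good results when long trajectories are scarce, as with camera panning, motion blur, and severe shake. Its running time, however, still needs improvement.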

2.
The precise positioning of robotic systems is of great interest, particularly in mobile robots. In this context, the use of omnidirectional vision provides many advantages thanks to its wide field of view. This paper presents an image-based visual control scheme to drive a mobile robot to a desired location, which is specified by a previously acquired target image. It exploits the property of omnidirectional images of preserving bearing information by using a 1D trifocal tensor. The main contribution of the paper is that the elements of the tensor are introduced directly in the control law, and neither a priori knowledge of the scene nor any auxiliary image is required. Our approach can be applied with any visual sensor that approximately obeys a central projection model, is robust to image noise, and avoids the short-baseline problem by exploiting the information of three views. A sliding mode control law in a square system ensures stability and robustness for the closed loop. The good performance of the control system is demonstrated via simulations and real-world experiments with a hypercatadioptric imaging system.

3.
Human faces are remarkably similar in global properties, including size, aspect ratio, and location of main features, but can vary considerably in details across individuals, gender, race, or due to facial expression. We propose a novel method for 3D shape recovery of faces that exploits the similarity of faces. Our method takes a single image as input and uses only a single 3D reference model of a different person's face. Classical methods for reconstruction from single images, e.g., shape-from-shading, require knowledge of the reflectance properties and lighting as well as depth values for boundary conditions. Recent methods circumvent these requirements by representing input faces as combinations (of hundreds) of stored 3D models. We propose instead to use the input image as a guide to "mold" a single reference model to reach a reconstruction of the sought 3D shape. Our method assumes Lambertian reflectance and uses harmonic representations of lighting. It has been tested on images taken under controlled viewing conditions as well as on uncontrolled images downloaded from the Internet, demonstrating its accuracy and robustness under a variety of imaging conditions and overcoming significant differences in shape between the input and reference individuals, including differences in facial expression, gender, and race.

4.
In this paper, an innovative extended Kalman filter (EKF) algorithm for pose tracking using the trifocal tensor is proposed. In the EKF, a constant-velocity motion model is used as the dynamic system, and the trifocal-tensor constraint is incorporated into the measurement model. The proposed method has the advantage of structure-and-motion-based approaches in that the pose sequence can be computed with no prior information on the scene structure. It also has the strength of model-based algorithms in that no updating of the three-dimensional (3-D) structure is necessary in the computation. This results in a stable, accurate, and efficient algorithm. Experimental results show that the proposed approach outperformed other existing EKFs that tackle the same problem. The pose-tracking algorithm has also been extended to demonstrate the application of the trifocal constraint to fast recursive 3-D structure recovery.
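As a rough illustration of the constant-velocity motion model mentioned in the abstract above, the sketch below shows only the EKF prediction step for a single scalar coordinate with state `[position, velocity]`. The function names, the diagonal process-noise term `q`, and the reduction to one coordinate are assumptions made here for brevity; the paper's state is a full pose, and its trifocal-tensor measurement update is not reproduced.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    """Transpose a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def predict_constant_velocity(x, P, dt, q=0.0):
    """Prediction step for state x = [position, velocity] with covariance P.

    F = [[1, dt], [0, 1]] encodes the constant-velocity assumption.
    q is a hypothetical process-noise variance added to the diagonal of P.
    """
    F = [[1.0, dt], [0.0, 1.0]]
    x_pred = [x[0] + dt * x[1], x[1]]         # x' = F x
    P_pred = matmul(matmul(F, P), transpose(F))  # P' = F P F^T
    P_pred[0][0] += q                          # simplified Q = q * I
    P_pred[1][1] += q
    return x_pred, P_pred


x_pred, P_pred = predict_constant_velocity(
    [0.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], dt=0.5)
```

The position uncertainty grows during prediction because the velocity uncertainty leaks into it through the `dt` term; a measurement update (here, one built on the trifocal constraint) would then shrink it again.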

5.
We present a new vision-based control approach which autonomously drives a nonholonomic vehicle to a target location. The vision system is a camera fixed on the vehicle, and the target location is defined by an image taken previously at that location. The control scheme is based on the trifocal tensor model, which is computed from feature correspondences in calibrated retina across three views: initial, current and target images. The contribution is a trifocal-based control law defined by an exact input–output linearization of the trifocal tensor model. The desired evolution of the system towards the target is directly defined in terms of the trifocal tensor elements by means of sinusoidal functions, without needing metric or additional information from the environment. The trifocal tensor presents important advantages for visual control purposes, because it is more robust than two-view geometry as it includes the information of a third view and, contrary to the epipolar geometry, short baseline is not a problem. Simulations show the performance of the approach, which has been tested with image noise and calibration errors.

6.
A novel corrupted region detection technique based on tensor voting is proposed to automatically improve the image quality. This method is suitable for restoring degraded images and enhancing binary images. First, the input images are converted into layered images in which each layer contains objects having similar characteristics. By encoding the pixels in the layered images with second-order tensors and performing voting among them, the corrupted regions are automatically detected using the resulting tensors. These corrupted regions are then restored to improve the image quality. The experimental results obtained from automatic image restoration and binary image enhancement applications show that our method can successfully detect and correct the corrupted regions.

7.
This paper proposes a fast image sequence-based navigation approach for a flat route represented by sparse waypoints. Instead of purely optimizing the length of the path, this paper aims to speed up the navigation by lengthening the distance between consecutive waypoints. When local visual homing at a variable velocity is applied for robot navigation between two waypoints, the robot's speed changes according to the distance between waypoints. Because a long distance implies a large scale difference between the robot's view and the waypoint image, a log-polar transform is introduced to find a correspondence between images and infer a less accurate motion vector. In order to maintain the navigation accuracy, our prior work on local visual homing with SIFT feature matching is adopted when the robot is relatively close to the waypoint. Experiments support the proposed navigation approach on a multiple-waypoint route. Compared to other prior work on visual homing with SIFT feature matching, the proposed navigation approach requires fewer waypoints, and the navigation speed is improved without compromising navigation accuracy.
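The reason a log-polar transform helps with large scale differences can be shown with a small sketch: scaling a point about the transform center changes only its log-radius, by an additive constant, and leaves its angle unchanged. The function name and the choice of center are illustrative assumptions; this is not the paper's correspondence pipeline.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map a point (x, y) to (log-radius, angle) about the center (cx, cy)."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0.0:
        raise ValueError("log-radius is undefined at the center")
    return math.log(r), math.atan2(dy, dx)


# The same point seen at scale 1 and at scale 2 about the origin.
rho_a, theta_a = to_log_polar(3.0, 4.0)
rho_b, theta_b = to_log_polar(6.0, 8.0)

# A scale factor s becomes a shift of log(s) along the log-radius axis.
scale_shift = rho_b - rho_a
```

Because a uniform scale change turns into a translation in the log-radius coordinate, two views taken far apart along the viewing direction can be compared with shift-based matching rather than a search over scales.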

8.
Millions of smart phones and GPS-equipped digital cameras sold each year, as well as photo-sharing websites such as Picasa and Panoramio, have enabled personal photos to be associated with geographic information. Recent research has shown that the additional global positioning system (GPS) information helps visual recognition for geotagged photos by providing valuable location context. However, current GPS data only identifies the camera location, leaving the camera viewing direction uncertain anywhere within the full 360°. To produce more precise photo location information, i.e. the viewing direction for geotagged photos, we utilize both Google Street View and Google Earth satellite images. Our proposed system is two-pronged: (1) visual matching between a user photo and any available street views in the vicinity can determine the viewing direction, and (2) near-orthogonal view matching between a user photo taken on the ground and the overhead satellite view at the user geo-location can compute the viewing direction when only the satellite view is available. Experimental results have shown the effectiveness of the proposed framework.

9.
10.
Over the last decade, 3D face models have been extensively used in many applications such as face recognition, facial animation and facial expression analysis. 3D Morphable Models (MMs) have become a popular tool to build and fit 3D face models to images. Critical to the success of MMs is the ability to build a generic 3D face model. Major limitations in the MM building process are: (1) collecting 3D data usually involves the use of expensive laser scans and complex capture setups, (2) the number of available 3D databases is limited, and typically there is a lack of expression variability, and (3) finding correspondences and registering the 3D model is a labor-intensive and error-prone process.

11.
A novel color image segmentation method using tensor voting based color clustering is proposed. By using tensor voting, the number of dominant colors in a color image can be estimated efficiently. Furthermore, the centroids and structures of the color clusters in the color feature space can be extracted. In this method, the color feature vectors are first encoded by second-order, symmetric, non-negative definite tensors. These tensors then communicate with each other by a voting process. The resulting tensors are used to determine the number of clusters, locations of the centroids, and structures of the clusters used for performing color clustering. Our method is based on tensor voting, a non-iterative method, and requires only the voting range as its input parameter. The experimental results show that the proposed method can estimate the dominant colors and generate good segmented images in which regions having the same color are not split up into small parts and the objects are separated well. Therefore, the proposed method is suitable for many applications, such as dominant color estimation and multi-color text image segmentation.

12.
In this paper, we propose a fast 3-D facial shape recovery algorithm from a single image with general, unknown lighting. In order to derive the algorithm, we formulate a nonlinear least-squares problem with two parameter vectors which are related to personal identity and light conditions. We then combine the spherical harmonics for the surface normals of a human face with tensor algebra and show that, under a certain condition, the dimensionality of the least-squares problem can be further reduced to one-tenth of that of the regular subspace-based model by using tensor decomposition (N-mode SVD), which greatly speeds up the computations. In order to enhance the shape recovery performance, we have incorporated prior information in updating the parameters. In the experiment, the proposed algorithm takes less than 0.4 s to reconstruct a face and shows a significant performance improvement over other reported schemes.

13.
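Yang Su, Yang Zhaozhong. 《计算机应用》 (Journal of Computer Applications), 2014, 34(6): 1724-1726
Traditional image inpainting relies only on information from the damaged image itself. When the damaged area is large and its structure is complex, the damaged image cannot provide enough information, and the inpainting result is unsatisfactory. To address this problem, an inpainting algorithm based on the texture of a reference image and the color of the damaged image itself is proposed. The algorithm uses image retrieval to select similar reference images from an image library and chooses the optimal region to fill the damaged area. It uses the texture information of the reference image patches and of the undamaged region to keep the inpainting boundary smooth, and then combines color transfer and extension algorithms so that the colors of the repaired region are consistent with those of the intact region. Experimental results show that the proposed algorithm makes the transition at the repaired region more natural and gives visually better results.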

14.
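To reduce noise in 3D seismic images and enhance their lateral resolution, a structure-tensor-based texture enhancement method for seismic images is proposed. For the structure tensor of 3D seismic images with complex texture, nonlinear anisotropic diffusion edge-enhancing filtering is combined with optimally rotation-invariant derivative filters to detect texture edges and identify continuous regions. By measuring the degree of variance change in the structure tensor of the 3D seismic image, the method enhances texture edges while avoiding the image blurring caused by algorithmic error. The analysis shows that the method is suitable for seismic image enhancement, effectively improves the quality of 3D seismic images, and clearly delineates the structural features in the image.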

15.
This paper describes a method of geo-registering a sequence of panoramic images to a digital map by matching pixel information from the images with information on the building footprint contained in a digital map. Recently, images captured at the ground level using a Mobile Mapping System (MMS), such as the panoramic images displayed by Google Street View, have been considered a valuable resource for three-dimensional (3D) building modeling. However, the wide intervals between these panoramic images, as well as locational and directional error from the related sensors, make it difficult to analyze the image data. This paper demonstrates a formulation method for connecting pixels in panoramic images with information on footprint vertices and building lines contained in a digital map. To make pixel and footprint information consistent in 3D space, each panoramic image is tilt-corrected in pre-processing: the image is made upright by using the estimated pitch and roll of the vehicle to remove the pitch and roll effects from the panoramic image pixels. Through the proposed formulation, a single panoramic image can be easily geo-registered with simple user-provided constraints, and adjacent sequential images can then be automatically geo-registered using point feature matching. Experimental results showed a significant reduction in the locational and directional error of sequential panoramic images, and the proposed vanishing point (VP) based validation process was found to successfully detect failure cases.

16.
Balance control of a biped robot using camera image of reference object (cited by: 1 in total; self-citations: 0; citations by others: 1)
This paper presents a new balance control scheme for a biped robot. Instead of using dynamic sensors to measure the pose of a biped robot, this paper uses only the visual information of a specific reference object in the workspace. The zero moment point (ZMP) of the biped robot can be calculated from the robot's pose, which is measured from the reference object image acquired by a CCD camera on the robot's head. For balance control of the biped robot, a servo controller uses the error between the reference ZMP and the current ZMP, which is estimated by a Kalman filter. The efficiency of the proposed algorithm has been demonstrated by experiments performed on both flat and uneven floors with unknown thin obstacles.
Recommended by Editorial Board member Dong Hwan Kim under the direction of Editor Jae-Bok Song. This work was supported by the Korea Research Foundation Grant funded by the Korean Government (MOEHRD). This research was supported by the MKE (The Ministry of Knowledge Economy), Korea, under the ITRC (Information Technology Research Center) support program supervised by the IITA (Institute for Information Technology Advancement) (IITA-2008-C1090-0803-0006).
Sangbum Park received the B.S. and M.S. degrees in Electronic Engineering from Soongsil University, Seoul, Korea, in 2004 and 2006, respectively. He has been with the School of Electronic Engineering, Soongsil University, since 2006, where he is currently pursuing a Ph.D. His current research interests include biped walking robots and robot vision.
Youngjoon Han received the B.S., M.S. and Ph.D. degrees in Electronic Engineering from Soongsil University, Seoul, Korea, in 1996, 1998, and 2003, respectively. He is currently an Assistant Professor in the School of Electronic Engineering at Soongsil University. His research interests include robot vision systems and visual servo control.
Hernsoo Hahn received the B.S. and M.S. degrees in Electronic Engineering at Soongsil University and Yonsei University, Korea, in 1982 and 1983, respectively. He received the Ph.D. degree in Computer Engineering from the University of Southern California in 1991, and became an Assistant Professor at the School of Electronics Engineering at Soongsil University in 1992. Currently, he is a Professor. His research interests include the application of vision sensors to mobile robots and measurement systems.

17.
In this paper, we introduce a novel image signature effective in both image retrieval and image classification. Our approach is based on the aggregation of tensor products of discriminant local features, named VLATs (vector of locally aggregated tensors). We also introduce techniques for the packing and the fast comparison of VLATs. We present connections between VLAT and methods like kernel on bags and Fisher vectors. Finally, we show the ability of our method to be effective for two different retrieval problems, thanks to experiments carried out on similarity search and classification datasets.

18.
Recent studies have demonstrated that high-level semantics in data can be captured using sparse representation. In this paper, we propose an approach to human body pose estimation in static images based on sparse representation. Given a visual input, the objective is to estimate 3D human body pose using feature space information and geometrical information of the pose space. On the assumption that each data point and its neighbors are likely to reside on a locally linear patch of the underlying manifold, our method learns the sparse representation of the new input using both feature and pose space information and then estimates the corresponding 3D pose by a linear combination of the bases of the pose dictionary. Two strategies for dictionary construction are presented: (i) constructing the dictionary by randomly selecting the frames of a sequence and (ii) selecting specific frames of a sequence as dictionary atoms. We analyzed the effect of each strategy on the accuracy of pose estimation. Extensive experiments on datasets of various human activities show that our proposed method outperforms state-of-the-art methods.

19.
In this paper, a novel reduced-reference stereoscopic image quality assessment (RR-SIQA) algorithm is proposed by means of an unconventional use of watermarking. Watermarking techniques are usually employed for authenticity verification and copyright protection. Here, watermarking is adopted to provide a new approach for RR-SIQA. First, image features are extracted in the reorganized discrete cosine transform domain and then embedded into the stereoscopic image as invisible hidden information. In order to improve the reliability of the watermarking, channel coding techniques are applied before the watermark is embedded. At the receiver, the watermark can be decoded and used to measure the quality of the distorted stereoscopic image. The proposed algorithm overcomes the limitation of existing methods that require an auxiliary channel. Experimental results illustrate that the proposed algorithm is consistent with subjective quality scores and can effectively reflect the visual perception of stereoscopic images.

20.
In this paper, we present a novel approach to recover a 3D human pose in real-time from a single depth image using principal direction analysis (PDA). Human body parts are first recognized from a human depth silhouette via trained random forests (RFs). PDA is applied to each recognized body part, which is represented as a set of points in 3D, to estimate its principal direction. Finally, a 3D human pose is recovered by mapping the principal direction to each body part of a 3D synthetic human model. We perform both quantitative and qualitative evaluations of our proposed 3D human pose recovery methodology. We show that our proposed approach has a low average reconstruction error of 7.07 degrees for four key joint angles and performs more reliably on a sequence of unconstrained poses than conventional methods. In addition, our methodology runs at a speed of 20 FPS on a standard PC, indicating that our system is suitable for real-time applications. Our 3D pose recovery methodology is applicable to applications ranging from human-computer interaction to human activity recognition.
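One common way to estimate a dominant direction for a set of 3D points is to take the leading eigenvector of their covariance matrix. The sketch below does this with power iteration in plain Python. It is offered only as an illustration of the idea of a "principal direction" for a body part's point set; whether the paper's PDA is computed this way is an assumption, and the function name and sample points are hypothetical.

```python
import math


def principal_direction(points, iterations=100):
    """Estimate the dominant direction of 3D points by power iteration.

    Returns a unit vector along the leading eigenvector of the covariance
    matrix. Assumes the starting vector is not orthogonal to that eigenvector.
    """
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    centered = [[p[i] - mean[i] for i in range(3)] for p in points]
    cov = [[sum(c[i] * c[j] for c in centered) / n for j in range(3)]
           for i in range(3)]

    s = 1.0 / math.sqrt(3.0)
    v = [s, s, s]  # unit-length starting guess
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(3)) for i in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # all points coincide; no direction to find
            break
        v = [x / norm for x in w]
    return v


# Hypothetical points lying roughly along the x axis, like a forearm segment.
limb = [(0.0, 0.0, 0.0), (1.0, 0.1, 0.0), (2.0, -0.1, 0.0),
        (3.0, 0.05, 0.0), (4.0, 0.0, 0.0)]
direction = principal_direction(limb)
```

The sign of the returned vector is arbitrary in general, so a caller mapping it onto a skeleton model would need its own convention for orienting each limb.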
