Similar literature
 Found 20 similar documents (search time: 46 ms)
1.
2.
This paper describes an efficient approach to pose-invariant pictorial object recognition that employs spectral signatures of image patches corresponding to roughly planar object surfaces. Based on singular value decomposition (SVD), the affine transform is decomposed into slant, tilt, swing, scale, and 2D translation. Unlike previous log-polar representations, which were not invariant to slant, the proposed log-log sampling configuration in the frequency domain yields complete affine invariance. The images are preprocessed by a novel model-based segmentation scheme that detects and segments objects that are affine-similar to members of a model set of basic geometric shapes. The segmented objects are then recognized by their signatures using multidimensional indexing in a pictorial dataset represented in the frequency domain. Experimental results with a dataset of 26 models show 100 percent recognition rates over a wide range of 3D pose parameters and imaging degradations: 0-360° swing and tilt, 0-82° of slant, more than three octaves in scale change, window-limited translation, high noise levels (0 dB), and significantly reduced resolution (1:5).
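The SVD step mentioned in this abstract can be illustrated with a minimal sketch. It factors the 2x2 linear part of an affine map into two rotations and two axis scales. The function names, the mapping of the larger singular value to "scale", and the foreshortening model cos(slant) = s2 / s1 are assumptions made for illustration, not the paper's own definitions; which rotation corresponds to tilt and which to swing is likewise left open.

```python
import numpy as np

def rot(angle):
    """2x2 rotation matrix for an angle in radians."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def decompose_affine(A):
    """Factor a 2x2 matrix with det > 0 as rot(alpha) @ diag(s1, s2) @ rot(beta)."""
    U, s, Vt = np.linalg.svd(A)
    # SVD may return two reflections; flip one axis in each to get rotations.
    if np.linalg.det(U) < 0:
        U[:, 1] *= -1
        Vt[1, :] *= -1
    alpha = np.arctan2(U[1, 0], U[0, 0])
    beta = np.arctan2(Vt[1, 0], Vt[0, 0])
    scale = s[0]
    # Assumed foreshortening model (hypothetical here): s2 / s1 = cos(slant).
    slant = np.arccos(np.clip(s[1] / s[0], -1.0, 1.0))
    return alpha, beta, scale, slant

# Build a test matrix from known parameters and decompose it again.
A = rot(0.3) @ np.diag([2.0, 2.0 * np.cos(0.7)]) @ rot(-0.5)
alpha, beta, scale, slant = decompose_affine(A)
```

The two angles are only determined jointly up to a shift of pi, so a round-trip check should compare the recomposed matrix rather than the individual angles.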

3.
4.
Tracking is a very important research subject in a real-time augmented reality context. The main requirements for trackers are high accuracy and low latency at a reasonable cost. To address these issues, a real-time, robust, and efficient 3D model-based tracking algorithm is proposed for a "video see through" monocular vision system. Tracking objects in the scene amounts to calculating the pose between the camera and the objects. Virtual objects can then be projected into the scene using the pose. In this paper, nonlinear pose estimation is formulated by means of a virtual visual servoing approach. In this context, the derivation of point-to-curve interaction matrices is given for different 3D geometrical primitives including straight lines, circles, cylinders, and spheres. A local moving edges tracker is used to provide real-time tracking of points normal to the object contours. Robustness is obtained by integrating an M-estimator into the visual control law via an iteratively reweighted least squares implementation. This approach is then extended to address the 3D model-free augmented reality problem. The method presented in this paper has been validated on several complex image sequences including outdoor environments. Results show the method to be robust to occlusion, changes in illumination, and mistracking.
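The idea of an M-estimator solved by iteratively reweighted least squares can be sketched on a plain linear regression problem, which is much simpler than the visual control law in the abstract. The Tukey biweight function, the tuning constant 4.6851, and the median-based scale estimate are assumptions chosen for illustration; the abstract does not say which M-estimator is used.

```python
import numpy as np

def irls(X, y, c=4.6851, iters=20):
    """Robust linear least squares: Tukey biweight M-estimator solved by IRLS."""
    w = np.ones(len(y))
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        # Weighted least squares: scale rows by the square root of the weights.
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        r = y - X @ beta
        # Robust scale from the median absolute residual (small offset avoids 0).
        sigma = 1.4826 * np.median(np.abs(r)) + 1e-12
        u = r / (c * sigma)
        # Tukey biweight: residuals beyond c * sigma receive zero weight.
        w = np.where(np.abs(u) < 1, (1 - u**2) ** 2, 0.0)
    return beta
```

Measurements with large residuals are progressively down-weighted to zero, so a few gross outliers stop influencing the estimate after the first iterations.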

5.
This paper presents a robust framework for tracking complex objects in video sequences. The multiple hypothesis tracking (MHT) algorithm reported in (IEEE Trans. Pattern Anal. Mach. Intell. 18(2) (1996)) is modified to accommodate high-level representations (2D edge map, 3D models) of objects for tracking. The framework exploits the ability of the MHT algorithm to resolve data association uncertainty and integrates it with object matching techniques to provide robust behavior while tracking complex objects. To track objects in 2D, edge/line segments are represented by a 4D feature and tracked using MHT. In many practical applications, 3D models provide more information about the object's pose (i.e., rotation information in the transformation space), which cannot be recovered using 2D edge information. Hence, a 3D model-based object tracking algorithm is also presented. A probabilistic Hausdorff image matching algorithm is incorporated into the framework to determine the geometric transformation that best maps the model features onto their corresponding ones in the image plane. A 3D model of the object is used to constrain the tracker to operate in a consistent manner. Experimental results on real and synthetic image sequences are presented to demonstrate the efficacy of the proposed framework.
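As background for the Hausdorff matching step, the sketch below computes the classical rank-based (partial) directed Hausdorff distance between two 2D point sets. This is the standard deterministic variant, swapped in for illustration; it is not the probabilistic formulation the abstract refers to, and the function name and `frac` parameter are hypothetical.

```python
import numpy as np

def directed_hausdorff(A, B, frac=1.0):
    """Rank-based directed Hausdorff distance from point set A to point set B."""
    # Pairwise Euclidean distances, shape (len(A), len(B)).
    d = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=2)
    # Distance from each point of A to its nearest neighbour in B, ascending.
    nearest = np.sort(d.min(axis=1))
    # frac = 1.0 gives the classical maximum; frac < 1 ignores the worst points.
    k = max(int(np.ceil(frac * len(nearest))) - 1, 0)
    return nearest[k]
```

A matcher built on this would evaluate the distance for each candidate transformation of the model points and keep the transformation with the smallest value; using `frac` below 1 tolerates a share of occluded or spurious features.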

6.
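张龙媛, 陈莹. 《计算机工程》 (Computer Engineering), 2012, 38(12): 125-128
Considering the effect of pose and expression variation on face recognition, SIFT descriptors, which are invariant to image rotation and scale change, are adopted as face features. A similarity measure is built for each facial sub-region, and Gaussian mixture models are used to build probabilistic similarity models for same-subject and different-subject samples under different deformation conditions. On this basis, a probability weight is obtained for each sub-region from its own discriminative ability, and the recognition result is determined within a probabilistic framework based on Bayes' formula. Experimental results show that, compared with face recognition that uses SIFT descriptors directly, the method clearly improves the recognition rate under large pose variation and large expression variation.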

7.
Pose refinement is an essential task for computer vision systems that require the calibration and verification of model and camera parameters. Typical domains include the real-time tracking of objects and verification in model-based recognition systems. A technique is presented for recovering model and camera parameters of 3D objects from a single two-dimensional image. This basic problem is further complicated by the incorporation of simple bounds on the model and camera parameters and linear constraints restricting some subset of object parameters to a specific relationship. Using numerical analysis techniques, including active set methods and Lagrange multiplier analysis, this paper demonstrates that the constrained pose refinement formulation is no more difficult than the original problem. A number of bounded and linearly constrained parametric models are tested, and convergence to proper values occurs from a wide range of initial error, utilizing minimal matching information (relative to the number of parameters and components). The ability to recover model parameters in a constrained search space will thus simplify associated object recognition problems.

8.
3D Free-Form Object Recognition Using Indexing by Contour Features   Cited by: 1 (self-citations: 0, citations by others: 1)
We address the problem of recognizing free-form 3D objects from a single 2D intensity image. A model-based solution within the alignment paradigm is presented which involves three major schemes: modeling, matching, and indexing. The modeling scheme constructs a set of model aspects which can predict the object contour as seen from any viewpoint. The matching scheme aligns the edgemap of a candidate model to the observed edgemap using an initial approximate pose. The major contribution of this paper involves the indexing scheme and its integration with modeling and matching to perform recognition. Indexing generates hypotheses specifying both candidate model aspects and approximate pose and scale. Hypotheses are ordered by likelihood based on prior knowledge of pre-stored models and the visual evidence from the observed objects. A prototype implementation has been tested in recognition and localization experiments with a database containing 658 model aspects from twenty 3D objects and eighty 2D objects. Bench tests and simulations show that many kinds of objects can be handled accurately and efficiently even in cluttered scenes. We conclude that the proposed recognition-by-alignment paradigm is a viable approach to many 3D object recognition problems.

9.
Distinctive Image Features from Scale-Invariant Keypoints   Cited by: 517 (self-citations: 6, citations by others: 517)
This paper presents a method for extracting distinctive invariant features from images that can be used to perform reliable matching between different views of an object or scene. The features are invariant to image scale and rotation, and are shown to provide robust matching across a substantial range of affine distortion, change in 3D viewpoint, addition of noise, and change in illumination. The features are highly distinctive, in the sense that a single feature can be correctly matched with high probability against a large database of features from many images. This paper also describes an approach to using these features for object recognition. The recognition proceeds by matching individual features to a database of features from known objects using a fast nearest-neighbor algorithm, followed by a Hough transform to identify clusters belonging to a single object, and finally performing verification through least-squares solution for consistent pose parameters. This approach to recognition can robustly identify objects among clutter and occlusion while achieving near real-time performance.
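The first recognition stage, matching individual features against a database by nearest neighbor, can be sketched as follows. This is a brute-force search rather than the fast approximate algorithm the abstract mentions, and the distance-ratio check with a threshold of 0.8 is an assumption added here to reject ambiguous matches; the function name and array layout are hypothetical.

```python
import numpy as np

def match_descriptors(query, database, ratio=0.8):
    """Return (query_index, database_index) pairs passing a distance-ratio check."""
    # Pairwise Euclidean distances, shape (len(query), len(database)).
    d = np.linalg.norm(query[:, None, :] - database[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    matches = []
    for i, (best, second) in enumerate(order[:, :2]):
        # Keep a match only if it is clearly closer than the runner-up.
        if d[i, best] < ratio * d[i, second]:
            matches.append((i, int(best)))
    return matches
```

The surviving pairs would then feed the later stages described in the abstract, namely Hough clustering and least-squares pose verification, which this sketch does not cover.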

10.
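3D invariants, as feature parameters that do not change with imaging conditions such as pose and viewpoint, can be widely applied in many areas of computer vision. By analyzing the various ways of solving for the 2D projective transformation matrix, the approach based purely on point-set correspondences is extended to other methods that use point sets, line sets, and point-line combinations, which broadens the conditions under which a correspondence between two projective planes can be established. On this basis, a method is proposed that constructs virtual elements from multiple point-line combinations; combining real and virtual elements allows a variety of 3D invariants to be extracted from complex spatial structures for use in object recognition and description. Experimental results verify the effectiveness of the method.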
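The baseline that this abstract extends, solving for the 2D projective transformation matrix from point correspondences alone, can be sketched with the direct linear transform (DLT). The function names are hypothetical, coordinate normalization is omitted for brevity, and the point-line and virtual-element constructions of the paper are not reproduced here.

```python
import numpy as np

def homography_from_points(src, dst):
    """Estimate the 3x3 projective map H with dst ~ H @ src (DLT, >= 4 points)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence gives two linear equations in the entries of H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = Vt[-1].reshape(3, 3)
    # Fix the free scale; this assumes H[2, 2] is not zero.
    return H / H[2, 2]

def apply_homography(H, pts):
    """Map Nx2 points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]
```

At least four correspondences with no three points collinear are needed; with more, the SVD returns the algebraic least-squares solution.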

11.
12.
This paper presents a novel vision-based global localization method that uses hybrid maps of objects and spatial layouts. We model indoor environments with a stereo camera using the following visual cues: local invariant features for object recognition and their 3D positions for object pose estimation. We also use the depth information at the horizontal centerline of the image, through which the optical axis passes, which is similar to the data from a 2D laser range finder. This allows us to build a topological node that is composed of a horizontal depth map and an object location map. The horizontal depth map describes the explicit spatial layout of each local space and provides metric information to compute the spatial relationships between adjacent spaces, while the object location map contains the pose information of objects found in each local space and the visual features for object recognition. Based on this map representation, we suggest a coarse-to-fine strategy for global localization. The coarse pose is estimated by means of object recognition and SVD-based point cloud fitting, and is then refined by stochastic scan matching. Experimental results show that our approaches can be used for an effective vision-based map representation as well as for global localization methods.
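SVD-based point cloud fitting, as used for the coarse pose estimate, is commonly done with the least-squares rigid alignment sketched below (often called the Kabsch method). The abstract does not spell out its exact formulation, so this is an assumed standard version with known point pairings; the function name is hypothetical.

```python
import numpy as np

def fit_rigid_transform(P, Q):
    """Least-squares R, t with Q ~ P @ R.T + t for paired Nx3 point sets."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    # Cross-covariance of the centred point sets.
    H = (P - cp).T @ (Q - cq)
    U, _, Vt = np.linalg.svd(H)
    # Correct an improper solution (reflection) so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cq - R @ cp
    return R, t
```

Here P would hold the 3D feature positions stored in the map and Q the positions of the same features observed by the stereo camera, so that (R, t) gives a coarse pose for later refinement.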

13.
Detecting objects, estimating their pose, and recovering their 3D shape are critical problems in many vision and robotics applications. This paper addresses the above needs using a two-stage approach. In the first stage, we propose a new method called Depth-Encoded Hough Voting (DEHV). DEHV jointly detects objects, infers their categories, estimates their pose, and infers/decodes object depth maps from either a single image (when no depth maps are available in testing) or a single image augmented with a depth map (when this is available in testing). Inspired by the Hough voting scheme introduced in [1], DEHV incorporates depth information into the process of learning distributions of image features (patches) representing an object category. DEHV takes advantage of the interplay between the scale of each object patch in the image and its distance (depth) from the corresponding physical patch attached to the 3D object. Once the depth map is given, a full reconstruction is achieved in a second (3D modelling) stage, where modified or state-of-the-art 3D shape and texture completion techniques are used to recover the complete 3D model. Extensive quantitative and qualitative experimental analysis on existing datasets [2], [3], [4] and a newly proposed 3D table-top object category dataset shows that our DEHV scheme obtains competitive detection and pose estimation results. Finally, the quality of 3D modelling in terms of both shape completion and texture completion is evaluated on a 3D modelling dataset containing both indoor and outdoor object categories. We demonstrate that our overall algorithm can obtain convincing 3D shape reconstruction from just one single uncalibrated image.

14.
The Lucas–Kanade tracker (LKT) is a commonly used method to track target objects over 2D images. The key principle behind object tracking with an LKT is to warp the object appearance so as to minimize the difference between the warped object’s appearance and a pre-stored template. Accordingly, the 2D pose of the tracked object in terms of translation, rotation, and scaling can be recovered from the warping. To extend the LKT to 3D pose estimation, a model-based 3D LKT assumes a 3D geometric model for the target object in 3D space and tries to infer the 3D object motion by minimizing the difference between the projected 2D image of the 3D object and the pre-stored 2D image template. In this paper, we propose an extended model-based 3D LKT for estimating 3D head poses by tracking human heads in video sequences. In contrast to the original model-based 3D LKT, which uses a template with each pixel represented by a single intensity value, the proposed model-based 3D LKT exploits an adaptive template with each template pixel modeled by a continuously updated Gaussian distribution during head tracking. This probabilistic template modeling improves the tracker’s ability to handle temporal fluctuation of pixels caused by continuous environmental changes such as varying illumination and dynamic backgrounds. Because of the new probabilistic template modeling, we reformulate head pose estimation as a maximum likelihood estimation problem, rather than the original difference minimization procedure. Based on the new formulation, an algorithm to estimate the best head pose is derived. The experimental results show that the proposed extended model-based 3D LKT achieves higher accuracy and reliability than the conventional one. In particular, the proposed LKT is very effective in handling varying illumination, which the original LKT handles poorly.
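The adaptive template idea, one continuously updated Gaussian per template pixel scored by likelihood instead of raw intensity difference, can be sketched as below. The class name, the exponential-forgetting update rule, and the default parameter values are assumptions for illustration; the abstract does not give the paper's update equations, and the 3D warping step is not modelled here.

```python
import numpy as np

class GaussianTemplate:
    """Per-pixel Gaussian appearance model with exponential forgetting."""

    def __init__(self, first_patch, init_var=25.0, rate=0.05, min_var=1e-6):
        self.mean = first_patch.astype(float)
        self.var = np.full(self.mean.shape, init_var)
        self.rate = rate        # hypothetical learning rate
        self.min_var = min_var  # floor that keeps the likelihood well defined

    def neg_log_likelihood(self, patch):
        """Negative log-likelihood of a warped patch under independent Gaussians."""
        diff = patch - self.mean
        return 0.5 * np.sum(diff**2 / self.var + np.log(2 * np.pi * self.var))

    def update(self, patch):
        """Blend the newly tracked patch into the running mean and variance."""
        diff = patch - self.mean
        self.mean += self.rate * diff
        self.var = np.maximum(
            (1 - self.rate) * self.var + self.rate * diff**2, self.min_var
        )
```

A maximum likelihood tracker built on this would warp the image under each candidate pose, pick the pose whose patch minimizes `neg_log_likelihood`, and then call `update` with that patch. Pixels that fluctuate a lot acquire a large variance and therefore contribute less to the score.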

15.
This paper addresses the problems associated with processing arrays of depth data in order to achieve automatic inspection of mechanical parts, i.e., developing general model-based inspection strategies that can be applied to a range of objects. The main problems in processing this data are segmenting out reliable primitives from the data and matching these primitives to those in a stored geometric model of an object. Until recently, a 3D vision system could provide depth data accurate enough for automatic inspection tasks only at short range from an object, typically a few centimetres. However, it is now possible to produce dense data from a vision system situated further from the object, typically half a metre to a metre. Such a system is outlined. Some current model-based matching techniques are assessed for their suitability for inspection-type tasks. One approach is adopted, and modifications that improve the efficiency and accuracy of the method for inspection purposes are presented. Finally, an inspection strategy is outlined and its performance assessed. Results are presented on both artificial and real depth data.

16.
The current work addresses the problem of 3D model tracking in the context of monocular and stereo omnidirectional vision in order to estimate the camera pose. To this end, we track 3D objects modeled by line segments because the straight line feature is often used to model the environment. Indeed, we are interested in mobile robot navigation using omnidirectional vision in structured environments. In the case of omnidirectional vision, 3D straight lines are projected as conics in omnidirectional images. Under certain conditions, these conics may have singularities. In this paper, we present two contributions. First, we propose a new spherical formulation of the pose estimation that removes these singularities, using an object model composed of lines. The theoretical formulation and the validation on synthetic images show that the new formulation clearly outperforms the former image-plane one. The second contribution is the extension of the spherical representation to the stereovision case. We consider in the paper a sensor which combines a camera and four mirrors. Results in various situations show robustness to illumination changes and local mistracking. As a final result, the proposed stereo spherical formulation allows us to localize a robot online both indoors and outdoors, whereas the classical formulation fails.

17.
18.
19.
20.
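阎冲. 《传感器世界》 (Sensor World), 2012, 18(9): 22-26
This paper verifies a highly reliable method for matching the same object across different images, known as the SIFT (scale-invariant feature transform) algorithm. SIFT can handle scale change, rotation, affine deformation over a wide range, viewpoint change, noise, and illumination change between images. It is powerful enough that, from just a single simple object feature, it can search for the corresponding target among many high-quality images in a large database...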
