Similar Documents
20 similar documents found (search time: 15 ms)
1.
2.
Viewpoint-independent recognition of free-form objects, and their segmentation in the presence of clutter and occlusions, is a challenging task. We present a novel 3D model-based algorithm that performs this task automatically and efficiently. A 3D model of an object is constructed automatically offline from multiple unordered range images (views). These views are converted into multidimensional table representations, which we refer to as tensors. Correspondences are established automatically between the views by simultaneously matching the tensors of one view with those of the remaining views using a hash table-based voting scheme. This yields a graph of relative transformations used to register the views before they are integrated into a seamless 3D model. These models and their tensor representations constitute the model library. During online recognition, a tensor from the scene is simultaneously matched against those in the library by casting votes. Similarity measures are calculated for the model tensors that receive the most votes. The model with the highest similarity is transformed to the scene and, if it aligns accurately with an object in the scene, that object is declared recognized and is segmented. This process is repeated until the scene is completely segmented. Experiments on real and synthetic data comprising 55 models and 610 scenes achieved an overall recognition rate of 95 percent. Comparison with spin images showed that our algorithm is superior in both recognition rate and efficiency.
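The offline indexing and online voting described above can be sketched as a quantized hash-table lookup. This is only a minimal illustration, not the paper's tensor construction: the 2D descriptors, the `bin_size` quantization, and the model names below are hypothetical.

```python
from collections import defaultdict

def build_hash_table(model_descriptors, bin_size=0.5):
    """Offline: index each model's local descriptors under a quantized key."""
    table = defaultdict(list)
    for model_id, descriptors in model_descriptors.items():
        for d in descriptors:
            key = tuple(int(round(x / bin_size)) for x in d)
            table[key].append(model_id)
    return table

def vote(table, scene_descriptors, bin_size=0.5):
    """Online: each scene descriptor casts a vote for every model sharing its key."""
    votes = defaultdict(int)
    for d in scene_descriptors:
        key = tuple(int(round(x / bin_size)) for x in d)
        for model_id in table[key]:
            votes[model_id] += 1
    return dict(votes)
```

The model receiving the most votes would then be verified by the similarity and alignment steps the abstract describes.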

3.
4.
Trivedi, M.M.; Chen, C.; Marapane, S.B. Computer, 1989, 22(6): 91-97
A model-based approach has been proposed to make object recognition computationally tractable. In this approach, models associated with objects expected to appear in the scene are recorded in the system's knowledge base. The system extracts various features from the input images using robust, low-level, general-purpose operators. Finally, matching is performed between the image-derived features and the scene-domain models to recognize objects. Factors affecting the successful design and implementation of model-based vision systems include the ability to derive suitable object models, the nature of the image features extracted by the operators, a computationally effective matching approach, knowledge representation schemes, and effective control mechanisms for guiding the system's overall operation. The vision system the authors describe uses gray-scale images and can successfully handle complex scenes containing multiple object types.

5.
This paper presents a computational model that recovers the most likely interpretation of 3D scene structure from a planar image in which some objects may occlude others. The estimated scene interpretation, obtained by integrating global and local cues, provides both the complete, disoccluded objects that form the scene and their depth ordering. Our method first computes several distal scenes compatible with the proximal planar image. To compute these hypothesized scenes, we propose a perceptually inspired object disocclusion method that minimizes Euler's elastica while incorporating the relatability of partially occluded contours and the convexity of the disoccluded objects. To estimate the preferred scene, we then rely on a Bayesian model, defining probabilities that take into account the global complexity of the objects in each hypothesized scene as well as the effort of bringing these objects into their relative positions in the planar image, which is also measured by an elastica-based quantity. The model is illustrated with numerical experiments on both synthetic and real images, showing its ability to reconstruct the occluded objects and the preferred perceptual order among them. We also present results on images of the Berkeley dataset with provided figure-ground ground-truth labeling.
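The Euler's elastica functional invoked above is standard: for a completion curve $\Gamma$ with arclength element $ds$ and curvature $\kappa$, the energy penalizes both length and bending,

```latex
E(\Gamma) = \int_{\Gamma} \left( \alpha + \beta\,\kappa^{2} \right) ds,
\qquad \alpha, \beta > 0 .
```

Minimizers therefore prefer short, smoothly bending continuations of the occluded contours, which is why the same quantity can serve both for disocclusion and for measuring the "effort" of object placement.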

6.
7.
8.
9.
Recognition of grouped targets in remote-sensing images combining texture and distribution features
A target-recognition method combining texture and distribution features is proposed for grouped targets in aerial images. The method has two steps: first, a set of texture features is selected and a maximum-likelihood classification algorithm segments candidate sub-target regions; then, grouped targets are rapidly located and recognized within these regions on the basis of their distribution features. Experiments show that the selected texture features effectively distinguish protective bunkers from a variety of natural backgrounds, and that the proposed pruning algorithm for locating and recognizing targets by distribution features is considerably faster than comparable algorithms. Recognition experiments on multiple aerial images all gave satisfactory results, indicating that the method can quickly and effectively recognize grouped targets of interest in complex natural scenes.

10.
Existing deep-learning-based saliency detection algorithms are designed mainly for two-dimensional RGB images and fail to exploit the three-dimensional visual information of the scene, while current light-field saliency detection methods are mostly hand-crafted and lack sufficient feature representation capacity, so neither performs well on challenging natural scene images. We propose a convolutional-neural-network-based multi-modal, multi-level feature refinement and fusion network that exploits the rich visual information in light-field images to achieve accurate saliency detection on four-dimensional light-field images. To fully mine the 3D visual information, two parallel sub-networks process the all-in-focus image and the depth map, respectively. On this basis, a cross-modal feature aggregation module aggregates multi-level visual features across three modalities (all-in-focus image, focal stack, and depth map) to highlight salient objects in the scene more effectively. Experiments on the DUTLF-FS and HFUT-Lytro light-field benchmark datasets show that the algorithm outperforms mainstream salient object detection algorithms such as MOLF, AFNet, and DMRA on five authoritative evaluation metrics.

11.
Geometric hashing (GH) and partial pose clustering are well-known algorithms for pattern recognition. However, the performance of both algorithms degrades rapidly as scene clutter and measurement uncertainty in the detected features increase. The primary contribution of this paper is a framework that unifies the GH and partial pose clustering paradigms for pattern recognition in cluttered scenes. The proposed scheme has better discrimination capability than the GH algorithm, improving recognition accuracy. The scheme is incorporated in a Bayesian MLE framework to make it robust to sensor noise. It handles partial occlusions and is robust to measurement uncertainty in the data features and to spurious scene features (scene clutter). An efficient hash-table representation of 3D features extracted from range images is also proposed. Simulations with real and synthetic 2D/3D objects show that the scheme performs better than the GH algorithm in scenes with a large amount of clutter.
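Partial pose clustering, one of the two paradigms unified above, can be illustrated by quantizing candidate 2D poses (rotation plus translation) into bins and voting; the dominant bin is the hypothesized object pose. This is a bare sketch under assumed bin sizes, without the paper's Bayesian MLE weighting; all numbers are hypothetical.

```python
from collections import Counter

def cluster_poses(candidate_poses, angle_bin=5.0, trans_bin=2.0):
    """Quantize candidate (theta, tx, ty) poses and count votes per bin;
    return the center of the dominant bin and its vote count."""
    bins = Counter()
    for theta, tx, ty in candidate_poses:
        key = (round(theta / angle_bin), round(tx / trans_bin), round(ty / trans_bin))
        bins[key] += 1
    key, n_votes = bins.most_common(1)[0]
    pose = (key[0] * angle_bin, key[1] * trans_bin, key[2] * trans_bin)
    return pose, n_votes
```

Feature correspondences from clutter scatter their votes across many bins, while correspondences from a real object instance concentrate in one bin, which is what makes the vote count a useful detection score.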

12.
This article presents a system for texture-based probabilistic classification and localisation of three-dimensional objects in two-dimensional digital images and discusses selected applications. In contrast to shape-based approaches, our texture-based method does not rely on object features extracted using image segmentation techniques. Rather, the objects are described by local feature vectors computed directly from image pixel values using the wavelet transform. Both gray level and colour images can be processed. In the training phase, object features are statistically modelled as normal density functions. In the recognition phase, the system classifies and localises objects in scenes with real heterogeneous backgrounds. Feature vectors are calculated and a maximisation algorithm compares the learned density functions with the extracted feature vectors and yields the classes and poses of objects found in the scene. Experiments carried out on a real dataset of over 40,000 images demonstrate the robustness of the system in terms of classification and localisation accuracy. Finally, two important real application scenarios are discussed, namely recognising museum exhibits from visitors’ own photographs and classification of metallography images.
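The train-then-maximize scheme above, with per-class normal densities, can be sketched in one dimension as a maximum-likelihood classifier. This is a stand-in, not the paper's wavelet pipeline: real feature vectors are multidimensional, and the class names and samples here are hypothetical.

```python
import math

def fit_gaussian(samples):
    """Training phase: model a class's feature values as a normal density."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, var

def classify(x, class_params):
    """Recognition phase: pick the class whose learned density
    assigns the feature value the highest log-likelihood."""
    def loglik(x, mu, var):
        return -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return max(class_params, key=lambda c: loglik(x, *class_params[c]))
```

Localisation would repeat this maximization over candidate poses as well as classes, as the abstract describes.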

13.
Since indoor scenes change frequently in daily life, for example when furniture is re-arranged, their 3D reconstructions should be flexible and easy to update. We present an automatic 3D scene update algorithm for indoor scenes that captures scene variation with RGBD cameras. We assume an initial scene has been reconstructed in advance, manually or semi-automatically, before the change, and we automatically update the reconstruction according to newly captured RGBD images of the changed real scene. The method starts with an automatic segmentation process that requires no manual interaction and benefits from accurate labels trained on the initial 3D scene. After segmentation, objects captured by the RGBD camera are extracted to form a local updated scene. We formulate an optimization problem that compares this local scene to the initial scene to locate moved objects. The moved objects are then integrated with the static objects of the initial scene to generate a new 3D scene. We demonstrate the efficiency and robustness of our approach by updating the 3D reconstructions of several real-world scenes.

14.
We present a data-driven method for synthesizing 3D indoor scenes by inserting objects progressively into an initial, possibly empty, scene. Instead of relying on a few hundred hand-crafted 3D scenes, we exploit existing large-scale annotated RGB-D datasets, in particular the SUN RGB-D database of 10,000+ depth images of real scenes, to form the prior knowledge for our synthesis task. Our object insertion scheme follows a co-occurrence model and an arrangement model, both learned from the SUN dataset. The former selects a highly probable combination of object categories along with the number of instances per category, while the latter defines a plausible placement. Compared to previous work on probabilistic learning for object placement, we make two contributions. First, we learn various classes of higher-order object-object relations, including symmetry, distinct orientation, and proximity, from the database. These relations make it possible to treat objects in semantically formed groups rather than individually. Second, while our algorithm inserts objects one at a time, it attains holistic plausibility of the whole current scene while offering controllability through progressive synthesis. We conducted several user studies comparing our scene synthesis performance to results obtained by manual synthesis, state-of-the-art object placement schemes, and variations of the arrangement model's parameter settings.

15.
A spherical representation for recognition of free-form surfaces
Introduces a new surface representation for recognizing curved objects. The authors' approach begins by representing an object with a discrete mesh of points built from range data or from a geometric model of the object. The mesh is computed from the data by deforming a standard-shaped mesh, for example an ellipsoid, until it fits the surface of the object. The authors define local regularity constraints that the mesh must satisfy, then define a canonical mapping between the mesh describing the object and a standard spherical mesh. A pose-invariant surface curvature index is stored at every node of the mesh. The authors use this representation for recognition by comparing the spherical model of a reference object with the model extracted from a newly observed scene. They show how the similarity between the reference model and the observed data can be evaluated, and how the pose of the reference object in the observed scene can be easily computed using this representation. Results on real range images show that this approach to modelling and recognizing 3D objects has three main advantages: (1) it is applicable to complex curved surfaces that cannot be handled by conventional techniques; (2) it reduces the recognition problem to computing similarity between spherical distributions, so the recognition algorithm requires no combinatorial search; and (3) even though it is based on a spherical mapping, the approach can handle occlusions and partial views.

16.
In this paper, a contour-based focus-of-attention approach is presented. Contour-based features, which are fast to compute, are extracted from 3D scenes and matched to model parts of objects. Local reference frames associated with the features induce a translation and rotation, resulting in a vote being cast for the presence of the object at a certain position within the scene. At these positions, HoG features are extracted and SVM classification is applied. Detection results and computation times are compared with those of a sliding-window approach.

17.
A framework for mining moving objects from image sequences is presented. Scenes are first clustered and labeled using a two-stage SOM, modified so that images containing similar moving objects fall into the same cluster and so that scenes containing prominent objects are recognized well. After images containing prominent objects are extracted on the basis of the clustering result, the positions and shapes of the objects are approximated with a Gaussian mixture model fitted by the EM algorithm, using an adequate or slightly larger number of components. By adopting the averages of the data points in smaller blocks as the initial parameters, the solutions are stabilized, and identifying components across time-series images and tracking a specific object become easier. This framework is applied to a four-year (1997-2000) dataset of cloud images taken by the Japanese weather satellite GMS-5 to evaluate its performance. The modified SOM method classifies scenes containing prominent moving objects well, and a seasonal variation tendency is detected in the cluster-ID sequence. Object detection via the EM algorithm on summer-type images containing clear cloud masses, such as typhoons, shows that this approach approximates the distribution of cloud masses well in many cases. Objects with very irregular shapes are also well represented as mixtures of Gaussians. The extracted object information, together with the scene clustering result, offers a rich source for knowledge discovery in video datasets. This approach is an effective way of mining video images whose characteristics are unknown in advance, and is thus applicable to various types of applications.
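The EM fitting step above can be sketched for a two-component mixture in one dimension. This is only an illustrative reduction: the paper fits 2D mixtures to pixel positions and initializes from block averages, whereas here, as a stand-in, the means are initialized at the data extremes.

```python
import math

def em_gmm_1d(data, iters=50):
    """Fit a two-component 1D Gaussian mixture with EM.
    Returns component means, variances, and mixing weights."""
    k = 2
    mus = [min(data), max(data)]          # hypothetical initialization
    vars_ = [1.0] * k
    pis = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            w = [pis[j] * math.exp(-(x - mus[j]) ** 2 / (2 * vars_[j]))
                 / math.sqrt(2 * math.pi * vars_[j]) for j in range(k)]
            s = sum(w)
            resp.append([wj / s for wj in w])
        # M-step: re-estimate means, variances (floored), and weights
        for j in range(k):
            nj = sum(r[j] for r in resp)
            mus[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            vars_[j] = max(sum(r[j] * (x - mus[j]) ** 2
                               for r, x in zip(resp, data)) / nj, 1e-6)
            pis[j] = nj / len(data)
    return mus, vars_, pis
```

With more components than distinct cloud masses, as the abstract suggests, extra components simply share the same mass, which is why a slightly larger component count is tolerable.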

18.
Evidence-based recognition of 3-D objects
An evidence-based recognition technique is defined that identifies 3-D objects by looking for their notable features. The technique uses an evidence rule base: a set of salient or evidence conditions, with corresponding evidence weights, for the various objects in the database. A measure of similarity between the set of observed features and the set of evidence conditions for a given object in the database is used to determine the identity of an object in the scene or to reject the object(s) in the scene as unknown. The procedure has polynomial time complexity and correctly identifies a variety of objects in both synthetic and real range images. A technique for automatically deriving the evidence rule base from training views of objects is shown to generate evidence conditions that successfully identify new views of those objects.
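The rule-base matching above can be sketched as a weighted overlap score with a rejection threshold. The score definition, the feature names, and the threshold are all hypothetical; the paper defines its own similarity measure over evidence conditions.

```python
def evidence_score(observed, rules):
    """Fraction of an object's total evidence weight supported by
    the observed feature set (a hypothetical similarity measure)."""
    total = sum(w for _, w in rules)
    hit = sum(w for f, w in rules if f in observed)
    return hit / total

def recognize(observed, rule_base, threshold=0.6):
    """Identify the best-scoring object, or reject the scene object as unknown."""
    best = max(rule_base, key=lambda o: evidence_score(observed, rule_base[o]))
    if evidence_score(observed, rule_base[best]) >= threshold:
        return best
    return None  # rejected as unknown
```

Scoring every object is linear in the number of rules, consistent with the polynomial time complexity the abstract claims.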

19.
In this paper, we propose a computational model of the recognition of real-world scenes that bypasses segmentation and the processing of individual objects or regions. The procedure is based on a very low-dimensional representation of the scene that we term the Spatial Envelope. We propose a set of perceptual dimensions (naturalness, openness, roughness, expansion, ruggedness) that represent the dominant spatial structure of a scene. We then show that these dimensions can be reliably estimated using spectral and coarsely localized information. The model generates a multidimensional space in which scenes sharing membership in semantic categories (e.g., streets, highways, coasts) are projected close together. The performance of the spatial envelope model shows that specific information about object shape or identity is not required for scene categorization, and that modeling a holistic representation of the scene indicates its probable semantic category.
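The "spectral information" above can be illustrated by a hypothetical 1D analogue: summarizing a signal by its DFT energy in a coarse low band and a coarse high band. This is only an analogy; the paper pools 2D spectral energy over a coarse spatial grid of the image.

```python
import cmath

def band_energies(row):
    """Naive DFT of a 1D signal, split into low-band and high-band energy
    (DC excluded). A toy analogue of coarse spectral scene features."""
    n = len(row)
    amp2 = [abs(sum(row[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))) ** 2 for k in range(n // 2 + 1)]
    low = sum(amp2[1:n // 4 + 1])    # coarse structure
    high = sum(amp2[n // 4 + 1:])    # fine texture
    return low, high
```

A smooth "open" profile concentrates energy in the low band, while a cluttered, textured one shifts energy to the high band; the spatial-envelope dimensions are estimated from regressions on such band energies.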

20.
In this paper, we introduce an interactive method suitable for retargeting both 3D objects and scenes. Initially, the input object or scene is decomposed into a collection of constituent components enclosed by control bounding volumes, which capture the intra-structure of the object or the semantic grouping of objects in the 3D scene. The overall retargeting is accomplished through a constrained optimization that manipulates the control bounding volumes. Without inferring the intricate dependencies between components, we define a minimal set of constraints that maintain the spatial arrangement of, and connectivity between, the components to regularize valid retargeting results. The default retargeting behavior can then be easily altered by additional semantic constraints imposed by users. This strategy makes the proposed method flexible enough to process a wide variety of 3D objects and scenes in a unified framework. In addition, the proposed method achieves more general structure-preserving pattern synthesis at both the object and scene levels. We demonstrate the effectiveness of our method by applying it to several complicated 3D objects and scenes.
