Similar Documents
20 similar documents found (search time: 343 ms)
1.
In this paper, we propose a keyword retrieval system for locating words in historical Mongolian document images. Based on word spotting technology, a collection of historical Mongolian document images is converted into a collection of word images by word segmentation, and a number of profile-based features are extracted to represent the word images. For each word image, a fixed-length feature vector is formed from an appropriate number of complex coefficients of the discrete Fourier transform of each profile feature. The system supports online image-to-image matching by calculating similarities between a query word image and each word image in the collection, and returns results ranked in descending order of similarity. The query word image can be generated at retrieval time by synthesizing a sequence of glyphs. Experimental evaluations confirm the performance of the system.
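As a rough illustration of the profile-plus-DFT representation described above (a sketch, not the authors' implementation: the projection profile, the number of coefficients, and the Euclidean distance are all illustrative assumptions), a fixed-length descriptor and a ranked search might look like:

```python
import numpy as np

def profile_feature(word_img, n_coeffs=8):
    """Vertical projection profile of a binary word image, compressed to a
    fixed-length vector via its first few DFT coefficients.
    n_coeffs=8 is an illustrative choice, not the paper's setting."""
    profile = word_img.sum(axis=0).astype(float)   # column-wise ink counts
    coeffs = np.fft.fft(profile)[:n_coeffs]        # low-frequency coefficients
    # Stack real and imaginary parts into a fixed-length real vector.
    return np.concatenate([coeffs.real, coeffs.imag])

def rank_collection(query_img, collection):
    """Indices of collection images sorted by ascending distance to the
    query, i.e. descending similarity."""
    q = profile_feature(query_img)
    dists = [np.linalg.norm(q - profile_feature(img)) for img in collection]
    return np.argsort(dists)
```

Because every word image maps to a vector of the same length regardless of word width, matching reduces to simple vector distances.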

2.
3.
Camera calibration is a necessary step in 3D reconstruction. Traditional calibration methods demand specialized equipment and are cumbersome to operate, while self-calibration methods, though convenient, lack accuracy, which seriously degrades the quality of the reconstruction. A self-calibration method that is both easy to operate and highly accurate is therefore increasingly needed. Using the SIFT feature point matching algorithm, the relationships between corresponding points across multi-view image sequences, and bundle adjustment, we propose an iterative optimization method based on hybrid local-global optimization. To address the heavy cost of image matching, we also propose a mutual matching scheme among neighboring images that reduces the time overhead. Experiments show that the proposed multi-camera self-calibration method is an effective, high-precision method, and that the neighborhood-based mutual matching technique substantially reduces image matching time. The method exploits the correspondences among multi-view images and the local-global optimization idea to obtain the camera parameters by hybrid optimization; compared with existing self-calibration algorithms, it achieves higher accuracy and robustness.

4.
In this paper, we propose a method to jointly transfer the color and detail of multiple source images to a target video or image. Our method is based on a probabilistic segmentation scheme that uses a Gaussian mixture model (GMM) to divide each source image, as well as the target video frames or image, into soft regions and determine the relevant source regions for each target region. For detail transfer, we first decompose each source image and the target into base and detail components. Histogram matching is then performed on the detail components to transfer the detail of matching regions from the source images to the target. We propose a unified framework that performs both color and detail transfer in an integrated manner. We also propose a method to maintain consistency for video targets by enforcing consistent region segmentations across consecutive frames, using GMM-based parameter propagation and adaptive scene change detection. Experimental results demonstrate that our method automatically produces consistent color- and detail-transferred videos and images from a set of source images.
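The per-region histogram matching step mentioned above can be sketched in one dimension (a generic quantile-mapping version on flat float arrays; the paper's actual region weighting and decomposition are not reproduced here):

```python
import numpy as np

def match_histogram(source, reference):
    """Map the values of `source` so its empirical distribution matches
    that of `reference`: each source value is replaced by the reference
    value at the same quantile. Generic sketch, not the paper's exact
    region-weighted formulation."""
    s_sorted = np.sort(source.ravel())
    r_sorted = np.sort(reference.ravel())
    # Quantile of each source value, mapped through the reference quantiles.
    ranks = np.searchsorted(s_sorted, source.ravel(), side="left")
    quantiles = ranks / max(len(s_sorted) - 1, 1)
    matched = np.interp(quantiles, np.linspace(0, 1, len(r_sorted)), r_sorted)
    return matched.reshape(source.shape)
```

Applied to detail components region by region, this transfers the "texture energy" of a source region onto the matching target region.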

5.
This paper presents an integrated approach to spotting spoken keywords in digitized Tamil documents by combining word image matching and spoken word recognition techniques. The work involves segmenting document images into words, creating an index of keywords, and constructing a word image hidden Markov model (HMM) and a speech HMM for each keyword. The word image HMMs are constructed using seven-dimensional profile and statistical moment features and are used to recognize a segmented word image for possible inclusion of the keyword in the index. The spoken query word is recognized by maximum likelihood over the speech HMMs, using 39-dimensional mel-frequency cepstral coefficients derived from speech samples of the keywords. The positional details of the search keyword, obtained from the automatically updated index, retrieve the relevant portion of text from the document during word spotting. Recall, precision, and F-measure are calculated for 40 test words from four groups of literary documents to illustrate the ability of the proposed scheme and highlight its worth in the emerging multilingual information retrieval scenario.

6.
Sampling the disparity space image
A central issue in stereo algorithm design is the choice of matching cost. Many algorithms simply use squared or absolute intensity differences based on integer disparity steps. In this paper, we address potential problems with such approaches. We begin with a careful analysis of the properties of the continuous disparity space image (DSI) and propose several new matching cost variants based on symmetrically matching interpolated image signals. Using stereo images with ground truth, we empirically evaluate the performance of the different cost variants and show that proper sampling can yield improved matching performance.
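A well-known sampling-insensitive cost in this family is the Birchfield-Tomasi dissimilarity, which compares each pixel against the linearly interpolated neighborhood of the other image. The following 1-D sketch illustrates the idea (it is not necessarily one of the paper's exact symmetric variants, and it assumes in-range indices):

```python
def bt_cost(left, right, x, d):
    """Sampling-insensitive dissimilarity between left[x] and right[x-d]
    on two 1-D scanlines (lists of floats). Each pixel is compared against
    the half-sample interpolated range around the other pixel, and the
    smaller of the two one-sided costs is kept (symmetric form)."""
    def half_range(a, i):
        # Values at i-1/2 and i+1/2 by linear interpolation, clamped at ends.
        lo = 0.5 * (a[i] + a[max(i - 1, 0)])
        hi = 0.5 * (a[i] + a[min(i + 1, len(a) - 1)])
        return min(lo, hi, a[i]), max(lo, hi, a[i])
    xl, xr = x, x - d
    lmin, lmax = half_range(left, xl)
    rmin, rmax = half_range(right, xr)
    c1 = max(0.0, left[xl] - rmax, rmin - left[xl])
    c2 = max(0.0, right[xr] - lmax, lmin - right[xr])
    return min(c1, c2)
```

The cost is zero whenever the two samples could have come from the same continuous signal sampled half a pixel apart, which removes much of the aliasing penalty of integer-disparity differencing.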

7.
与普通场景图像相比,无人机影像中纹理信息较丰富,局部特征与目标对象“一对多”的对应问题更加严重,经典SURF算法不适用于无人机影像的特征点匹配.为此,提出一种辅以空间约束的SURF特征点匹配方法,并应用于无人机影像拼接.该方法对基准影像整体提取SURF特征点,对目标影像分块提取SURF特征点,在特征点双向匹配过程中使用两特征点对进行空间约束,实现目标影像子图像与基准影像的特征点匹配;根据特征点对计算目标影像初始变换参数,估计目标影像特征点的匹配点在基准影像上的点位,对匹配点搜索空间进行约束,提高匹配速度与精度;利用点疏密度空间约束,得到均匀分布的特征点对.最后,利用所获取的特征点对实现无人机影像的配准与拼接,通过人工选取均匀分布的特征点对验证拼接精度.实验结果表明,采用本文方法提取的特征点能够得到较好的无人机影像拼接效果.  相似文献   

8.
In this paper, we present an approach for 3D face recognition from frontal range data based on the ridge lines on the surface of the face. We use the principal curvature, kmax, to represent the face image as a 3D binary image called a ridge image. The ridge image shows the locations of ridge points around the important facial regions (the eyes, the nose, and the mouth). We use the robust Hausdorff distance and the iterative closest point (ICP) algorithm to match the ridge image of a given probe against the ridge images of the facial images in the gallery. To evaluate the performance of our approach, we performed experiments on the GavabDB face database (a small database) and Face Recognition Grand Challenge V2.0 (a large database). The results show that ridge lines have great capability for 3D face recognition. In addition, we found that while the database is small, the performance of ICP-based matching and robust Hausdorff matching is comparable, but as the database grows, ICP-based matching outperforms the robust Hausdorff matching technique.

9.
Efficient and compact representation of images is a fundamental problem in computer vision. In this paper, we propose methods that use Haar-like binary box functions to represent a single image or a set of images. A desirable property of these box functions is that their inner product with an image can be computed very efficiently. We propose two closely related novel subspace methods to model images: the non-orthogonal binary subspace (NBS) method and the binary principal component analysis (B-PCA) algorithm. NBS is spanned directly by binary box functions and can be used for image representation, fast template matching, and many other vision applications. B-PCA is a structured subspace that inherits the merits of both NBS (fast computation) and PCA (modeling data structure information). B-PCA base vectors are obtained by a novel PCA-guided NBS method. We also show that B-PCA base vectors are nearly orthogonal to each other. As a result, in the non-orthogonal vector decomposition process, the computationally intensive pseudo-inverse projection operator can be approximated by the direct dot product without causing significant distance distortion. Experiments on real image datasets show promising performance in image matching, reconstruction, and recognition tasks with significant speed improvement.
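The efficiency claim above rests on a standard trick: with a summed-area (integral) image, the inner product of an image with any binary box function costs only four lookups, independent of box size. A minimal sketch:

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero-padded first row and column, so any
    rectangle sum needs only four lookups."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, height, width):
    """Inner product of the image with a binary box function supported on
    the given rectangle: O(1) per box, which is what makes Haar-like
    binary bases attractive for fast template matching."""
    b, r = top + height, left + width
    return ii[b, r] - ii[top, r] - ii[b, left] + ii[top, left]
```

A Haar-like base vector built from a few such boxes therefore projects an image in a handful of additions, versus a full dot product for a dense PCA base vector.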

10.
We present a fast and efficient homing algorithm based on Fourier transformed panoramic images. By continuously comparing Fourier coefficients calculated from the current view with coefficients representing the goal location, a mobile robot is able to find its way back to known locations. No prior knowledge about the orientation with respect to the goal location is required, since the Fourier phase is used for a fast sub-pixel orientation estimation. We present homing runs performed by an autonomous mobile robot in an office environment. In a more comprehensive investigation the algorithm is tested on an image data base recorded by a small mobile robot in a toy house arena. Catchment areas for the proposed algorithm are calculated and compared to results of a homing scheme described in [M. Franz, B. Schölkopf, H. Mallot, H. Bülthoff, Where did I take that snapshot? Scene based homing by image matching, Biological Cybernetics 79 (1998) 191–202] and a simple homing strategy using neighbouring views. The results show that a small number of coefficients is sufficient to achieve a good homing performance. Also, a coarse-to-fine homing strategy is proposed in order to achieve both a large catchment area and a high homing accuracy: the number of Fourier coefficients used is increased during the homing run.
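The two properties the abstract relies on, rotation-invariant comparison and phase-based orientation estimation, can be sketched on a 1-D panoramic row (an illustrative simplification; the paper's coefficient count and estimation details differ):

```python
import numpy as np

def signature(panorama_row, k=8):
    """Low-frequency Fourier-magnitude signature of a 1-D panoramic view.
    Magnitudes are invariant to circular shifts (rotations), so views can
    be compared without knowing the heading. k=8 is illustrative."""
    return np.abs(np.fft.fft(panorama_row))[:k]

def heading_offset(current, goal):
    """Estimate the relative rotation (in pixels of the panorama) from the
    phase of the first Fourier coefficient: a circular shift by s
    multiplies coefficient 1 by exp(-2*pi*1j*s/N). Valid modulo N and
    assumes the phase difference does not wrap."""
    fc = np.fft.fft(current)[1]
    fg = np.fft.fft(goal)[1]
    n = len(current)
    return -(np.angle(fc) - np.angle(fg)) * n / (2 * np.pi)
```

Because the signature is a handful of magnitudes, comparing the current view against a stored goal view is cheap enough to run continuously during a homing run.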

11.
In this paper, we present a scheme based on feature mining and pattern classification to detect LSB matching steganography in grayscale images, which is a very challenging problem in steganalysis. Five types of features are proposed. In comparison with other well-known feature sets, the set of proposed features performs the best. We compare different learning classifiers and address the issue of feature selection, which is rarely mentioned in steganalysis. In our experiments, the combination of a dynamic evolving neural fuzzy inference system (DENFIS) with feature selection by support vector machine recursive feature elimination (SVM-RFE) achieves the best detection performance. Results also show that image complexity is an important reference for evaluating steganalysis performance.

12.
In this paper, we present a parallel search scheme for model-based interpretation of aerial images, following a focus-of-attention paradigm. Interpretation is performed using the gray-level image of an aerial scene and its segmentation into connected components of nearly constant gray level. Candidate objects are generated from the window as connected combinations of its components. Each candidate is matched against the model by checking whether the model constraints are satisfied by the parameters computed from the region. Candidate generation and matching are posed as a search in the space of combinations of connected components in the image, with finding an (optimally) successful region as the goal. Our implementation exploits parallelism at multiple levels by parallelizing the management of the open list and other control tasks as well as model matching. We present an implementation of the interpretation system on a Connection Machine CM-2, which reported a successful match, whenever one existed, within a few hundred milliseconds.

13.
The across-track illumination variation in Earth Observing-1 (EO-1) Hyperion images results from wavelength shift and full-width-at-half-maximum (FWHM) shift in the cross-track direction. Correcting across-track illumination variation is necessary for accurate spectral matching and classification. This contribution reviews the available methods for correcting across-track illumination variation and evaluates them on a Hyperion image of a study area around the city of Udaipur in western India. We also describe and demonstrate a new technique for correcting these artefacts. For each band, the spatial trends of (a) nonlinear shifts in the nominal centre wavelengths across the image columns and (b) nonlinear changes in the nominal FWHM across the image columns are modelled using quadratic regression and compensated using a radiance correction factor estimated from the columns with minimum illumination variation in a spectrally flat area of the image. A series of statistical measures, spectral matching, minimum noise fraction (MNF) transform images, and post-correction classification results were used to evaluate the proposed algorithm against some of the previous methods on the Hyperion image of the study area. The results indicate that the proposed method effectively corrects the across-track illumination effects, and performs better than the previous methods in lithological mapping as well as land-use and land-cover mapping.
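The quadratic-regression-plus-correction-factor step can be sketched as follows, fitting the column-wise brightness trend of one band and normalizing it to reference columns (the mean-based trend, the `ref_cols` argument, and the multiplicative factor are illustrative simplifications of the paper's procedure):

```python
import numpy as np

def column_correction_factors(band, ref_cols):
    """Model the across-track (column-wise) brightness trend of one band
    with a quadratic fit, and derive a per-column multiplicative radiance
    correction factor relative to reference columns assumed free of
    illumination variation. Assumes a positive-valued trend."""
    cols = np.arange(band.shape[1])
    col_means = band.mean(axis=0)                 # spatial trend per column
    coeffs = np.polyfit(cols, col_means, deg=2)   # quadratic regression
    trend = np.polyval(coeffs, cols)
    reference_level = trend[ref_cols].mean()      # level of the "clean" columns
    return reference_level / trend                # multiply each column by this
```

Multiplying each column of the band by its factor flattens the smooth across-track trend while leaving within-column spectral contrast intact.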

14.
Motivated by the need for automatic indexing and analysis of the huge number of documents in Ottoman divan poetry, and for discovering new knowledge to preserve and keep this heritage alive, in this study we propose a novel method for segmenting and retrieving words in Ottoman divans. Documents in Ottoman are difficult to segment into words without prior knowledge of the words. Using the fact that divans have multiple copies (versions) by different writers in different writing styles, and that word segmentation in some versions may be considerably easier than in others, we segment the difficult versions (which may be impossible to segment with traditional techniques) using information carried over from a simpler version. One version of a document is used as the source dataset and another version of the same document as the target dataset. Words in the source dataset are automatically extracted and used as queries to be spotted in the target dataset for detecting word boundaries. We present the idea of cross-document word matching for the novel task of segmenting historical documents into words, propose a matching scheme based on possible combinations of sequences of sub-words, and improve the performance of simple features by considering the words in context. The method is applied to two versions of the Layla and Majnun divan by Fuzuli. The results show that the proposed word-matching-based segmentation method is promising for finding word boundaries and retrieving words across documents.

15.
16.
We introduce a two-step iterative segmentation and registration method to find coplanar surfaces among stereo images of a polyhedral environment. The novelties of this paper are: (i) to propose a user-defined initialization easing the image matching and segmentation, (ii) to incorporate color appearance and planar projection information into a Bayesian segmentation scheme, and (iii) to add consistency to the projective transformations related to the polyhedral structure of the scenes. The method utilizes an assisted Bayesian color segmentation scheme. The initial user-assisted segmentation is used to define search regions for planar homography image registration. The two reliable methods cooperate to obtain probabilities for coplanar regions with similar color information that are used to get a new segmentation by means of quadratic Markov measure fields (QMMF). We search for the best regions by iterating both steps: registration and segmentation.

17.
We describe a general probabilistic framework for matching patterns that experience in-plane nonlinear deformations, such as iris patterns. Given a pair of images, we derive a maximum a posteriori probability (MAP) estimate of the parameters of the relative deformation between them. Our estimation process accomplishes two things simultaneously: it normalizes for pattern warping and it returns a distortion-tolerant similarity metric which can be used for matching two nonlinearly deformed image patterns. The prior probability of the deformation parameters is specific to the pattern type and, therefore, should result in more accurate matching than an arbitrary general distribution. We show that the proposed method is very well suited for handling iris biometrics, applying it to two databases of iris images which contain real instances of warped patterns. We demonstrate a significant improvement in matching accuracy using the proposed deformed Bayesian matching methodology. We also show that the additional computation required to estimate the deformation is relatively inexpensive, making it suitable for real-time applications.

18.
This paper describes a finite state machine approach to string matching for an intrusion detection system. To obtain high performance, we typically need to operate on input data several bytes wide. However, finite state machine designs become more complex when operating on large input words, partly because the start or end of a string may occur part way through an input data word. Here we use finite state machines that each operate on only a single byte-wide data input, providing a separate machine for each byte-wide path of a multi-byte input word. By splitting the search strings into multiple interleaved substrings and combining the outputs of the individual machines in an appropriate way, string matching is performed in parallel across multiple finite state machines. A hardware design for a parallel string matching engine has been generated, built for implementation in a Xilinx field-programmable gate array, and tested by simulation. The design is capable of a search rate of 4.7 Gbps with a 32-bit input word size.
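The single byte-wide machine that the paper replicates across lanes can be modeled in software as a KMP-style deterministic automaton, one state per number of pattern bytes matched, one transition per input byte (a software sketch only; the lane interleaving and output combination logic of the hardware design are not reproduced here):

```python
def build_dfa(pattern):
    """KMP-style automaton for a byte string: state = number of pattern
    bytes matched so far. Returns a step function and the accept state."""
    m = len(pattern)
    fail = [0] * (m + 1)          # classic KMP failure function
    k = 0
    for i in range(1, m):
        while k and pattern[i] != pattern[k]:
            k = fail[k]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i + 1] = k

    def step(state, byte):
        while state and byte != pattern[state]:
            state = fail[state]
        if byte == pattern[state]:
            state += 1
        return state

    return step, m

def search(pattern, data):
    """Run the automaton over data one byte at a time; return the end
    offsets of (non-overlapping) matches."""
    step, m = build_dfa(pattern)
    state, hits = 0, []
    for i, b in enumerate(data):
        state = step(state, b)
        if state == m:
            hits.append(i + 1)
            state = 0  # restart; overlapping matches are not reported
    return hits
```

In the hardware scheme, several such machines consume the interleaved byte lanes of a wide input word in the same clock cycle, which is where the multi-gigabit search rate comes from.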

19.
Based on local keypoints extracted as salient image patches, an image can be described as a "bag of visual words" (BoW), and this representation has proved promising for object and scene classification. The performance of BoW features in semantic concept detection for large-scale multimedia databases depends on various representation choices. In this paper, we conduct a comprehensive study of the representation choices of BoW, including vocabulary size, weighting scheme, stop word removal, feature selection, spatial information, and visual bi-grams. We offer practical insights into how to optimize the performance of BoW by choosing appropriate representation choices. For the weighting scheme, we elaborate a soft-weighting method to assess the significance of a visual word to an image, and show experimentally that soft-weighting outperforms popular weighting schemes such as TF-IDF by a large margin. Our extensive experiments on TRECVID data sets also indicate that the BoW feature alone, with appropriate representation choices, already produces highly competitive concept detection performance. Based on these empirical findings, we further apply our method to detect a large set of 374 semantic concepts. The detectors, as well as the features and detection scores on several recent benchmark data sets, are released to the multimedia community.
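A common form of the soft-weighting idea is to let each descriptor vote for its few nearest visual words with rank-decayed weights instead of hard-assigning it to one word. The sketch below uses a simple halving decay and k=4 as illustrative choices, not necessarily the paper's exact formula:

```python
import numpy as np

def soft_weight_bow(descriptors, vocab, k=4):
    """Soft-weighted bag-of-visual-words histogram: each descriptor
    contributes to its k nearest vocabulary words with weight halved at
    each further rank (1, 1/2, 1/4, ...). `vocab` is an (n_words, dim)
    array of visual word centers; decay and k are illustrative."""
    hist = np.zeros(len(vocab))
    for d in descriptors:
        dists = np.linalg.norm(vocab - d, axis=1)
        for rank, idx in enumerate(np.argsort(dists)[:k]):
            hist[idx] += 0.5 ** rank
    return hist
```

Compared with hard assignment, descriptors near the boundary between two visual words contribute to both, which smooths the histogram and reduces quantization noise.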

20.
Binarization plays an important role in document image processing, especially for degraded documents. For degraded document images, adaptive binarization methods often incorporate local information to determine the binarization threshold for each individual pixel. We propose a two-stage, parameter-free, window-based method to binarize degraded document images. In the first stage, an incremental scheme determines a proper window size beyond which no substantial increase in the local variation of pixel intensities is observed. In the second stage, based on the determined window size, a noise-suppressing scheme delivers the final binarized image by contrasting two binarized images produced by two adaptive thresholding schemes that incorporate the local mean gray and gradient values. Empirical results demonstrate that the proposed method is competitive with existing adaptive binarization methods and achieves better precision, accuracy, and F-measure.
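The core of such window-based schemes, thresholding each pixel against its local window mean, can be sketched with an integral image for efficiency (the fixed `window` and `bias` parameters here are illustrative; the paper determines the window size automatically and additionally uses gradient information):

```python
import numpy as np

def local_mean_binarize(img, window=15, bias=0.95):
    """Window-based adaptive thresholding: a pixel is marked as ink (1) if
    it is darker than `bias` times the mean intensity of its local window,
    computed via an integral image over an edge-padded copy."""
    h, w = img.shape
    pad = window // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    ii = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    ii[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    # Local window sum for every pixel via four integral-image lookups.
    s = (ii[window:, window:] - ii[:-window, window:]
         - ii[window:, :-window] + ii[:-window, :-window])
    mean = s / (window * window)
    return (img < bias * mean).astype(np.uint8)
```

Because the threshold follows the local background level, faint strokes on stained or unevenly lit pages survive where a single global threshold would lose them.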


Copyright © Beijing Qinyun Technology Development Co., Ltd.    京ICP备09084417号-23
