期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Clutter noise removal in binary document images

Mudit Agrawal David Doermann 《International Journal on Document Analysis and Recognition》2013,16(4):351-369

The paper presents a clutter detection and removal algorithm for complex document images. This distance transform based technique aims to remove irregular and independent unwanted clutter while preserving the text content. The novelty of this approach is in its approximation to the clutter–content boundary when the clutter is attached to the content in irregular ways. As an intermediate step, a residual image is created, which forms the basis for clutter detection and removal. Clutter detection and removal are independent of clutter’s position, size, shape, and connectivity with text. The method is tested on a collection of highly degraded and noisy, machine-printed and handwritten Arabic and English documents, and results show pixel-level accuracies of 99.18 and 98.67 % for clutter detection and removal, respectively. This approach is also extended to documents having a mix of clutter and salt-and-pepper noise. 相似文献

2.

Document representation and its application to page decomposition 总被引：6，自引：0，他引：6

Jain A.K. Bin Yu 《IEEE transactions on pattern analysis and machine intelligence》1998,20(3):294-308

Transforming a paper document to its electronic version in a form suitable for efficient storage, retrieval, and interpretation continues to be a challenging problem. An efficient representation scheme for document images is necessary to solve this problem. Document representation involves techniques of thresholding, skew detection, geometric layout analysis, and logical layout analysis. The derived representation can then be used in document storage and retrieval. Page segmentation is an important stage in representing document images obtained by scanning journal pages. The performance of a document understanding system greatly depends on the correctness of page segmentation and labeling of different regions such as text, tables, images, drawings, and rulers. We use the traditional bottom-up approach based on the connected component extraction to efficiently implement page segmentation and region identification. A new document model which preserves top-down generation information is proposed based on which a document is logically represented for interactive editing, storage, retrieval, transfer, and logical analysis. Our algorithm has a high accuracy and takes approximately 1.4 seconds on a SGI Indy workstation for model creation, including orientation estimation, segmentation, and labeling (text, table, image, drawing, and ruler) for a 2550×3300 image of a typical journal page scanned at 300 dpi. This method is applicable to documents from various technical journals and can accommodate moderate amounts of skew and noise 相似文献

3.

A multi-plane approach for text segmentation of complex document images

Yen-Lin Chen Author Vitae 《Pattern recognition》2009,42(7):1419-1444

This study presents a new method, namely the multi-plane segmentation approach, for segmenting and extracting textual objects from various real-life complex document images. The proposed multi-plane segmentation approach first decomposes the document image into distinct object planes to extract and separate homogeneous objects including textual regions of interest, non-text objects such as graphics and pictures, and background textures. This process consists of two stages—localized histogram multilevel thresholding and multi-plane region matching and assembling. Then a text extraction procedure is applied on the resultant planes to detect and extract textual objects with different characteristics in the respective planes. The proposed approach processes document images regionally and adaptively according to their respective local features. Hence detailed characteristics of the extracted textual objects, particularly small characters with thin strokes, as well as gradational illuminations of characters, can be well-preserved. Moreover, this way also allows background objects with uneven, gradational, and sharp variations in contrast, illumination, and texture to be handled easily and well. Experimental results on real-life complex document images demonstrate that the proposed approach is effective in extracting textual objects with various illuminations, sizes, and font styles from various types of complex document images. 相似文献

4.

Text segmentation using gabor filters for automatic document processing 总被引：24，自引：0，他引：24

Anil K. Jain Sushil Bhattacharjee 《Machine Vision and Applications》1992,5(3):169-184

There is a considerable interest in designing automatic systems that will scan a given paper document and store it on electronic media for easier storage, manipulation, and access. Most documents contain graphics and images in addition to text. Thus, the document image has to be segmented to identify the text regions, so that OCR techniques may be applied only to those regions. In this paper, we present a simple method for document image segmentation in which text regions in a given document image are automatically identified. The proposed segmentation method for document images is based on a multichannel filtering approach to texture segmentation. The text in the document is considered as a textured region. Nontext contents in the document, such as blank spaces, graphics, and pictures, are considered as regions with different textures. Thus, the problem of segmenting document images into text and nontext regions can be posed as a texture segmentation problem. Two-dimensional Gabor filters are used to extract texture features for each of these regions. These filters have been extensively used earlier for a variety of texture segmentation tasks. Here we apply the same filters to the document image segmentation problem. Our segmentation method does not assume any a priori knowledge about the content or font styles of the document, and is shown to work even for skewed images and handwritten text. Results of the proposed segmentation method are presented for several test images which demonstrate the robustness of this technique. This work was supported by the National Science Foundation under NSF grant CDA-88-06599 and by a grant from E. 1. Du Pont De Nemours & Company. 相似文献

5.

Off-line handwritten signature detection by analysis of evidence accumulation

José L. Esteban José F. Vélez ángel Sánchez 《International Journal on Document Analysis and Recognition》2012,15(4):359-368

One fundamental step in off-line handwritten signature verification is the detection of the signature position within the document image. This paper introduces an original approach for signature position detection. The method is based on an accumulative evidence technique, searching the region that maximizes some measure of correspondence with a given reference signature. This measure is based on the similarity of the slope marked out by each of the strokes in the signature. Experiments have shown that the method can be used on real documents, such as bank checks, where images have a high noise level due to background interferences (i.e. machine or handwritten texts, stamps, and lines). The proposed method is robust to variability in the size of the signatures and has the advantage of using only one reference signature per person. 相似文献

6.

Document cleanup using page frame detection

Faisal Shafait Joost van Beusekom Daniel Keysers Thomas M. Breuel 《International Journal on Document Analysis and Recognition》2008,11(2):81-96

When a page of a book is scanned or photocopied, textual noise (extraneous symbols from the neighboring page) and/or non-textual noise (black borders, speckles, ...) appear along the border of the document. Existing document analysis methods can handle non-textual noise reasonably well, whereas textual noise still presents a major issue for document analysis systems. Textual noise may result in undesired text in optical character recognition (OCR) output that needs to be removed afterwards. Existing document cleanup methods try to explicitly detect and remove marginal noise. This paper presents a new perspective for document image cleanup by detecting the page frame of the document. The goal of page frame detection is to find the actual page contents area, ignoring marginal noise along the page border. We use a geometric matching algorithm to find the optimal page frame of structured documents (journal articles, books, magazines) by exploiting their text alignment property. We evaluate the algorithm on the UW-III database. The results show that the error rates are below 4% each of the performance measures used. Further tests were run on a dataset of magazine pages and on a set of camera captured document images. To demonstrate the benefits of using page frame detection in practical applications, we choose OCR and layout-based document image retrieval as sample applications. Experiments using a commercial OCR system show that by removing characters outside the computed page frame, the OCR error rate is reduced from 4.3 to 1.7% on the UW-III dataset. The use of page frame detection in layout-based document image retrieval application decreases the retrieval error rates by 30%. 相似文献

7.

Page segmentation using minimum homogeneity algorithm and adaptive mathematical morphology

Tuan Anh Tran In Seop Na Soo Hyung Kim 《International Journal on Document Analysis and Recognition》2016,19(3):191-209

Document layout analysis or page segmentation is the task of decomposing document images into many different regions such as texts, images, separators, and tables. It is still a challenging problem due to the variety of document layouts. In this paper, we propose a novel hybrid method, which includes three main stages to deal with this problem. In the first stage, the text and non-text elements are classified by using minimum homogeneity algorithm. This method is the combination of connected component analysis and multilevel homogeneity structure. Then, in the second stage, a new homogeneity structure is combined with an adaptive mathematical morphology in the text document to get a set of text regions. Besides, on the non-text document, further classification of non-text elements is applied to get separator regions, table regions, image regions, etc. The final stage, in refinement region and noise detection process, all regions both in the text document and non-text document are refined to eliminate noises and get the geometric layout of each region. The proposed method has been tested with the dataset of ICDAR2009 page segmentation competition and many other databases with different languages. The results of these tests showed that our proposed method achieves a higher accuracy compared to other methods. This proves the effectiveness and superiority of our method. 相似文献

8.

An end-to-end framework for the detection of mathematical expressions in scientific document images

Bui Hai Phong Thang Manh Hoang Thi-Lan Le 《Expert Systems》2022,39(1):e12800

The detection of mathematical expressions is a prerequisite step for the digitisation of scientific documents. Many different multistage approaches have been proposed for the detection of expressions in document images, that is, page segmentation and expression detection. However, the detection accuracy of such methods still needs improvement owing to errors in the page segmentation of complex documents. This paper presents an end-to-end framework for mathematical expression detection in scientific document images without requiring optical character recognition (OCR) or document analysis techniques applied in conventional methods. The novelty of this paper is twofold. First, because document images are usually in binary form, the direct use of these images, which lack texture information as input for detection networks, may lead to an incorrect detection. Therefore, we propose the application of a distance transform to obtain a discriminating and meaningful representation of mathematical expressions in document images. Second, the transformed images are fed into the faster region with a convolutional neural network (Faster R-CNN) optimized to improve the accuracy of the detection. The proposed framework was tested on two benchmark data sets (Marmot and GTDB). Compared with the original Faster R-CNN, the proposed network improves the accuracies of detection of isolated and inline expressions by 5.09% and 3.40%, respectfully, on the Marmot data set, whereas those on the GTDB data set are improved by 4.04% and 4.55%. A performance comparison with conventional methods shows the effectiveness of the proposed method. 相似文献

9.

Skew detection in document images based on rectangular active contour

Huijie Fan Linlin Zhu Yandong Tang 《International Journal on Document Analysis and Recognition》2010,13(4):261-269

The digitalization processes of documents produce frequently images with small rotation angles. The skew angles in document images degrade the performance of optical character recognition (OCR) tools. Therefore, skew detection of document images plays an important role in automatic document analysis systems. In this paper, we propose a Rectangular Active Contour Model (RAC Model) for content region detection and skew angle calculation by imposing a rectangular shape constraint on the zero-level set in Chan–Vese Model (C-V Model) according to the rectangular feature of content regions in document images. Our algorithm differs from other skew detection methods in that it does not rely on local image features. Instead, it uses global image features and shape constraint to obtain a strong robustness in detecting skew angles of document images. We experimented on different types of document images. Comparing the results with other skew detection algorithms, our algorithm is more accurate in detecting the skews of the complex document images with different fonts, tables, illustrations, and layouts. We do not need to pre-process the original image, even if it is noisy, and at the same time the rectangular content region of a document image is also detected. 相似文献

10.

Adaptive degraded document image binarization

B. Gatos Author Vitae I. Pratikakis Author Vitae Author Vitae 《Pattern recognition》2006,39(3):317-327

This paper presents a new adaptive approach for the binarization and enhancement of degraded documents. The proposed method does not require any parameter tuning by the user and can deal with degradations which occur due to shadows, non-uniform illumination, low contrast, large signal-dependent noise, smear and strain. We follow several distinct steps: a pre-processing procedure using a low-pass Wiener filter, a rough estimation of foreground regions, a background surface calculation by interpolating neighboring background intensities, a thresholding by combining the calculated background surface with the original image while incorporating image up-sampling and finally a post-processing step in order to improve the quality of text regions and preserve stroke connectivity. After extensive experiments, our method demonstrated superior performance against four (4) well-known techniques on numerous degraded document images. 相似文献

11.

各向同性同质区域选取的高光谱遥感图像噪声估计方法

孙鑫傅鹏孙权森《数据采集与处理》2018,33(5):809-817

在现有高光谱遥感图像噪声估计方法中,同质区域的选取通常是最关键的步骤,有效的同质区域选取方法能够提高图像的噪声估计精度。本文充分利用了高光谱遥感图像中丰富的空间信息和光谱信息,提出了一种各向同性同质区域选取算法,其中,为了更好地区分同质区域内像元相似度,构造了一种新的兰氏-光谱角度量;结合基于多元线性回归的去相关法,通过最优区域评估高光谱遥感图像噪声水平。利用不同结构及信噪比的模拟图像和真实高光谱遥感图像进行实验,通过与现有的多种噪声估计方法比较,验证了本文方法在针对不同噪声水平、不同复杂程度的图像时更加准确和稳定。相似文献

12.

结合相位一致与全变差模型的高分辨率遥感图像边缘检测 总被引：1，自引：0，他引：1

下载免费PDF全文

黄秋燕肖鹏峰冯学智王珂《中国图象图形学报》2014,19(3):439-446

目的边缘检测是有效利用遥感数据开展地物目标自动识别的重要步骤。高分辨率遥感图像地物类型复杂,细节信息过于丰富,使得基于相位一致的边缘检测结果中存在过多的噪声与伪边缘。为此提出了一种结合相位一致与全变差模型的高分辨率遥感图像边缘检测方法。方法根据相位一致原理,应用Log Gabor构造的2维相位一致模型,引入全变差去噪模型对基于相位一致的边缘强度图进行改进。结果借助有界变差空间对图像光滑性的约束,实现了高分辨率遥感图像噪声去除与伪边缘抑制,利用改进后的相位一致边缘强度图可有效检测高分辨率遥感图像的边缘。结论实验结果表明,与相位一致模型、Canny算法相比,该方法能消除了高分辨率遥感图像中同类地物内部细节特征形成的噪声,抑制相位一致边缘检测结果中的伪边缘,突出地物的真实边缘,并能正确地提取地物目标的整体轮廓信息,有助于后续地物目标的自动识别。相似文献

13.

Object-based image labeling through learning by example and multi-level segmentation

Y. XuAuthor Vitae A.M. TekalpAuthor Vitae F.T. Yarman-Vural^{Author Vitae} 《Pattern recognition》2003,36(6):1407-1423

We propose a method for automatic extraction and labeling of semantically meaningful image objects using “learning by example” and threshold-free multi-level image segmentation. The proposed method scans through images, each of which is pre-segmented into a hierarchical uniformity tree, to seek and label objects that are similar to an example object presented by the user. By representing images with stacks of multi-level segmentation maps, objects can be extracted in the segmentation map level with adequate detail. Experiments have shown that the proposed multi-level image segmentation results in significant reduction in computation complexity for object extraction and labeling (compared to a single fine-level segmentation) by avoiding unnecessary tests of combinations in finer levels. The multi-level segmentation-based approach also achieves better accuracy in detection and labeling of small objects. 相似文献

14.

Text extraction in complex color documents

C. StrouthopoulosN. Papamarkos A.E. Atsalakis 《Pattern recognition》2002,35(8):1743-1758

Text extraction in mixed-type documents is a pre-processing and necessary stage for many document applications. In mixed-type color documents, text, drawings and graphics appear with millions of different colors. In many cases, text regions are overlaid onto drawings or graphics. In this paper, a new method to automatically detect and extract text in mixed-type color documents is presented. The proposed method is based on a combination of an adaptive color reduction (ACR) technique and a page layout analysis (PLA) approach. The ACR technique is used to obtain the optimal number of colors and to convert the document into the principal of them. Then, using the principal colors, the document image is split into the separable color plains. Thus, binary images are obtained, each one corresponding to a principal color. The PLA technique is applied independently to each of the color plains and identifies the text regions. A merging procedure is applied in the final stage to merge the text regions derived from the color plains and to produce the final document. Several experimental and comparative results, exhibiting the performance of the proposed technique, are also presented. 相似文献

15.

Automatic generation of structured hyperdocuments from document images

Ji-Yeon Lee Jeong-Seon Park Hyeran Byun Jongsub Moon Seong-Whan Lee 《Pattern recognition》2002,35(2):485-503

As sharing documents through the World Wide Web has been recently and constantly increasing, the need for creating hyperdocuments to make them accessible and retrievable via the internet, in formats such as HTML and SGML/XML, has also been rapidly rising. Nevertheless, only a few works have been done on the conversion of paper documents into hyperdocuments. Moreover, most of these studies have concentrated on the direct conversion of single-column document images that include only text and image objects. In this paper, we propose two methods for converting complex multi-column document images into HTML documents, and a method for generating a structured table of contents page based on the logical structure analysis of the document image. Experiments with various kinds of multi-column document images show that, by using the proposed methods, their corresponding HTML documents can be generated in the same visual layout as that of the document images, and their structured table of contents page can be also produced with the hierarchically ordered section titles hyperlinked to the contents. 相似文献

16.

Hand radiograph image segmentation using a coarse-to-fine strategy

Chin-Chuan Han Author Vitae Chang-Hsing Lee Author Vitae 《Pattern recognition》2007,40(11):2994-3004

Image segmentation techniques have been widely applied in diagnosis systems with medical image support. Information about metaphyseal and epiphyseal regions is crucial in bone age assessment. In this study, hand radiograph images have been segmented using a coarse-to-fine strategy. Watershed transform is first done to get metaphyseal regions at a coarse level. Some image processing algorithms, such as noise removal, labeling, and ellipse region fitting, are performed to find the epiphyseal regions of interest (ROIs). The epiphyseal regions are extracted using an active contour model approach based on GVF (gradient vector flow) at a fine level. Some hand radiograph images are processed to show the validity of the proposed approach. 相似文献

17.

A forgery detection algorithm for exemplar-based inpainting images using multi-region relation 总被引：1，自引：0，他引：1

I-Cheng Chang J. Cloud YuChih-Chuan Chang 《Image and vision computing》2013

The identification of image authenticity has received much attention because of the increasing power of image editing methods. This paper proposes a novel forgery detection algorithm to recognize tampered inpainting images, which is one of the effective approaches for image manipulation. The proposed algorithm contains two major processes: suspicious region detection and forged region identification. Suspicious region detection searches the similarity blocks in an image to find the suspicious regions and uses a similarity vector field to remove the false positives caused by uniform area. Forged region identification applies a new method, multi-region relation (MRR), to identify the forged regions from the suspicious regions. The proposed approach can effectively recognize if an image is a forged one and identify the forged regions, even for the images containing the uniform background. Moreover, we propose a two-stage searching algorithm based on weight transformation to speed up the computation speed. The experimental results show that the proposed approach has good performance with fast speed under different kinds of inpainting images. 相似文献

18.

The effect of border noise on the performance of projection-based page segmentation methods

Shafait F Breuel TM 《IEEE transactions on pattern analysis and machine intelligence》2011,33(4):846-851

Projection methods have been used in the analysis of bitonal document images for different tasks such as page segmentation and skew correction for more than two decades. However, these algorithms are sensitive to the presence of border noise in document images. Border noise can appear along the page border due to scanning or photocopying. Over the years, several page segmentation algorithms have been proposed in the literature. Some of these algorithms have come into widespread use due to their high accuracy and robustness with respect to border noise. This paper addresses two important questions in this context: 1) Can existing border noise removal algorithms clean up document images to a degree required by projection methods to achieve competitive performance? 2) Can projection methods reach the performance of other state-of-the-art page segmentation algorithms (e.g., Docstrum or Voronoi) for documents where border noise has successfully been removed? We perform extensive experiments on the University of Washington (UW-III) data set with six border noise removal methods. Our results show that although projection methods can achieve the accuracy of other state-of-the-art algorithms on the cleaned document images, existing border noise removal techniques cannot clean up documents captured under a variety of scanning conditions to the degree required to achieve that accuracy. 相似文献

19.

Automatically improving image quality using tensor voting

Toan Dinh Nguyen Jonghyun Park Soohyung Kim Hyukro Park Gueesang Lee 《Neural computing & applications》2011,20(7):1017-1026

A novel corrupted region detection technique based on tensor voting is proposed to automatically improve the image quality. This method is suitable for restoring degraded images and enhancing binary images. First, the input images are converted into layered images in which each layer contains objects having similar characteristics. By encoding the pixels in the layered images with second-order tensors and performing voting among them, the corrupted regions are automatically detected using the resulting tensors. These corrupted regions are then restored to improve the image quality. The experimental results obtained from automatic image restoration and binary image enhancement applications show that our method can successfully detect and correct the corrupted regions. 相似文献

20.

Segmentation and classification of brain images using firefly and hybrid kernel-based support vector machine

K. Selva Bhuvaneswari P. Geetha 《人工智能实验与理论杂志》2013,25(3):663-678

Abstract

Magnetic resonance imaging segmentation refers to a process of assigning labels to set of pixels or multiple regions. It plays a major role in the field of biomedical applications as it is widely used by the radiologists to segment the medical images input into meaningful regions. In recent years, various brain tumour detection techniques are presented in the literature. The entire segmentation process of our proposed work comprises three phases: threshold generation with dynamic modified region growing phase, texture feature generation phase and region merging phase. by dynamically changing two thresholds in the modified region growing approach, the first phase of the given input image can be performed as dynamic modified region growing process, in which the optimisation algorithm, firefly algorithm help to optimise the two thresholds in modified region growing. After obtaining the region growth segmented image using modified region growing, the edges can be detected with edge detection algorithm. In the second phase, the texture feature can be extracted using entropy-based operation from the input image. In region merging phase, the results obtained from the texture feature-generation phase are combined with the results of dynamic modified region growing phase and similar regions are merged using a distance comparison between regions. After identifying the abnormal tissues, the classification can be done by hybrid kernel-based SVM (Support Vector Machine). The performance analysis of the proposed method will be carried by K-cross fold validation method. The proposed method will be implemented in MATLAB with various images. 相似文献