期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

A multi-plane approach for text segmentation of complex document images

Yen-Lin Chen Author Vitae 《Pattern recognition》2009,42(7):1419-1444

This study presents a new method, namely the multi-plane segmentation approach, for segmenting and extracting textual objects from various real-life complex document images. The proposed multi-plane segmentation approach first decomposes the document image into distinct object planes to extract and separate homogeneous objects including textual regions of interest, non-text objects such as graphics and pictures, and background textures. This process consists of two stages—localized histogram multilevel thresholding and multi-plane region matching and assembling. Then a text extraction procedure is applied on the resultant planes to detect and extract textual objects with different characteristics in the respective planes. The proposed approach processes document images regionally and adaptively according to their respective local features. Hence detailed characteristics of the extracted textual objects, particularly small characters with thin strokes, as well as gradational illuminations of characters, can be well-preserved. Moreover, this way also allows background objects with uneven, gradational, and sharp variations in contrast, illumination, and texture to be handled easily and well. Experimental results on real-life complex document images demonstrate that the proposed approach is effective in extracting textual objects with various illuminations, sizes, and font styles from various types of complex document images. 相似文献

2.

Adaptive thresholding algorithm: Efficient computation technique based on intelligent block detection for degraded document images

Yu-Ting Pai Author VitaeAuthor Vitae Shanq-Jang Ruan^{Author Vitae} 《Pattern recognition》2010,43(9):3177-3187

Document image binarization involves converting gray level images into binary images, which is a feature that has significantly impacted many portable devices in recent years, including PDAs and mobile camera phones. Given the limited memory space and the computational power of portable devices, reducing the computational complexity of an embedded system is of priority concern. This work presents an efficient document image binarization algorithm with low computational complexity and high performance. Integrating the advantages of global and local methods allows the proposed algorithm to divide the document image into several regions. A threshold surface is then constructed based on the diversity and the intensity of each region to derive the binary image. Experimental results demonstrate the effectiveness of the proposed method in providing a promising binarization outcome and low computational cost. 相似文献

3.

一种去除二值文本图像边缘噪声的新方法 总被引：1，自引：0，他引：1

杨博戚飞虎郝峻晟《计算机工程》2006,32(5):186-188,191

提出了一种去除二值文本图像边缘噪声的新厅法。该方法包括3个步骤：边缘噪声预处理,边缘噪声检测和边缘噪声去除。将该方法和其它方法作了比较。实验结果表明,该新方法可以更有效地去除边缘噪声,并在实际的文本图像处理系统中取得了较理想的效果。相似文献

4.

On foreground — background separation in low quality document images 总被引：1，自引：0，他引：1

Utpal Garain Thierry Paquet Laurent Heutte 《International Journal on Document Analysis and Recognition》2006,8(1):47-63

相似文献

5.

Document cleanup using page frame detection

Faisal Shafait Joost van Beusekom Daniel Keysers Thomas M. Breuel 《International Journal on Document Analysis and Recognition》2008,11(2):81-96

When a page of a book is scanned or photocopied, textual noise (extraneous symbols from the neighboring page) and/or non-textual noise (black borders, speckles, ...) appear along the border of the document. Existing document analysis methods can handle non-textual noise reasonably well, whereas textual noise still presents a major issue for document analysis systems. Textual noise may result in undesired text in optical character recognition (OCR) output that needs to be removed afterwards. Existing document cleanup methods try to explicitly detect and remove marginal noise. This paper presents a new perspective for document image cleanup by detecting the page frame of the document. The goal of page frame detection is to find the actual page contents area, ignoring marginal noise along the page border. We use a geometric matching algorithm to find the optimal page frame of structured documents (journal articles, books, magazines) by exploiting their text alignment property. We evaluate the algorithm on the UW-III database. The results show that the error rates are below 4% each of the performance measures used. Further tests were run on a dataset of magazine pages and on a set of camera captured document images. To demonstrate the benefits of using page frame detection in practical applications, we choose OCR and layout-based document image retrieval as sample applications. Experiments using a commercial OCR system show that by removing characters outside the computed page frame, the OCR error rate is reduced from 4.3 to 1.7% on the UW-III dataset. The use of page frame detection in layout-based document image retrieval application decreases the retrieval error rates by 30%. 相似文献

6.

Shape based local thresholding for binarization of document images

Jichuan ShiNilanjan Ray Hong Zhang 《Pattern recognition letters》2012,33(1):24-32

This paper presents a novel local threshold algorithm for the binarization of document images. Stroke width of handwritten and printed characters in documents is utilized as the shape feature. As a result, in addition to the intensity analysis, the proposed algorithm introduces the stroke width as shape information into local thresholding. Experimental results for both synthetic and practical document images show that the proposed local threshold algorithm is superior in terms of segmentation quality to the threshold approaches that solely use intensity information. 相似文献

7.

A multi-scale framework for adaptive binarization of degraded document images

Reza Farrahi Moghaddam^{Author Vitae} 《Pattern recognition》2010,43(6):2186-2198

In this work, a multi-scale binarization framework is introduced, which can be used along with any adaptive threshold-based binarization method. This framework is able to improve the binarization results and to restore weak connections and strokes, especially in the case of degraded historical documents. This is achieved thanks to localized nature of the framework on the spatial domain. The framework requires several binarizations on different scales, which is addressed by introduction of fast grid-based models. This enables us to explore high scales which are usually unreachable to the traditional approaches. In order to expand our set of adaptive methods, an adaptive modification of Otsu's method, called AdOtsu, is introduced. In addition, in order to restore document images suffering from bleed-through degradation, we combine the framework with recursive adaptive methods. The framework shows promising performance in subjective and objective evaluations performed on available datasets. 相似文献

8.

A simple and effective table detection system from document images

S. Mandal S. P. Chowdhury A. K. Das Bhabatosh Chanda 《International Journal on Document Analysis and Recognition》2006,8(2-3):172-182

The requirement of detection and identification of tables from document images is crucial to any document image analysis and digital library system. In this paper we report a very simple but extremely powerful approach to detect tables present in document pages. The algorithm relies on the observation that the tables have distinct columns which implies that gaps between the fields are substantially larger than the gaps between the words in text lines. This deceptively simple observation has led to the design of a simple but powerful table detection system with low computation cost. Moreover, mathematical foundation of the approach is also established including formation of a regular expression for ease of implementation. 相似文献

9.

复杂版面的文本图像图文分割算法

杨洋平西建《微计算机信息》2006,22(13):224-225

为了满足办公自动化的实时性要求,本文提出了一种改进的自顶向下的图文分割算法。该方法利用文本行基线之间的距离自适应的确定结构元素的大小,克服自顶向下算法要求对页面有先验知识的缺点。实验表明,本文提出的算法分割准确,速度快。相似文献

10.

A method for combining complementary techniques for document image segmentation

Nikolaos Stamatopoulos Basilis Gatos Stavros J. PerantonisAuthor vitae 《Pattern recognition》2009,42(12):3158-3168

Image segmentation is a major task of handwritten document image processing. Many of the proposed techniques for image segmentation are complementary in the sense that each of them using a different approach can solve different difficult problems such as overlapping, touching components, influence of author or font style etc. In this paper, a combination method of different segmentation techniques is presented. Our goal is to exploit the segmentation results of complementary techniques and specific features of the initial image so as to generate improved segmentation results. Experimental results on line segmentation methods for handwritten documents demonstrate the effectiveness of the proposed combination method. 相似文献

11.

Correcting show-through effects on scanned color document images by multiscale analysis

Hirobumi Takeshi 《Pattern recognition》2003,36(12):2835-2847

This paper describes a new approach to restoring scanned color document images where the backside image shows through the paper sheet. A new framework is presented for correcting show-through components using digital image processing techniques. First, the foreground components on the front side are separated from the background and backside components through locally adaptive binarization for each color component and edge magnitude thresholding. Background colors are estimated locally through color thresholding to generate a restored image, and then corrected adaptively through multi-scale analysis along with comparison of edge distributions between the original and the restored image. The proposed method does not require specific input devices or the backside to be input; it is able to correct unneeded image components through analysis of the front side image alone. Experimental results are given to verify effectiveness of the proposed method. 相似文献

12.

An improved binarization algorithm based on a water flow model for document image with inhomogeneous backgrounds

Hyun-Hwa Oh Kil-Taek Lim Sung-Il Chien 《Pattern recognition》2005,38(12):2612-2625

A segmentation algorithm using a water flow model [Kim et al., Pattern Recognition 35 (2002) 265–277] has already been presented where a document image can be efficiently divided into two regions, characters and background, due to the property of locally adaptive thresholding. However, this method has not decided when to stop the iterative process and required long processing time. Plus, characters on poor contrast backgrounds often fail to be separated successfully. Accordingly, to overcome the above drawbacks to the existing method, the current paper presents an improved approach that includes extraction of regions of interest (ROIs), an automatic stopping criterion, and hierarchical thresholding. Experimental results show that the proposed method can achieve a satisfactory binarization quality, especially for document images with a poor contrast background, and is significantly faster than the existing method. 相似文献

13.

Automatic generation of structured hyperdocuments from document images

Ji-Yeon Lee Jeong-Seon Park Hyeran Byun Jongsub Moon Seong-Whan Lee 《Pattern recognition》2002,35(2):485-503

As sharing documents through the World Wide Web has been recently and constantly increasing, the need for creating hyperdocuments to make them accessible and retrievable via the internet, in formats such as HTML and SGML/XML, has also been rapidly rising. Nevertheless, only a few works have been done on the conversion of paper documents into hyperdocuments. Moreover, most of these studies have concentrated on the direct conversion of single-column document images that include only text and image objects. In this paper, we propose two methods for converting complex multi-column document images into HTML documents, and a method for generating a structured table of contents page based on the logical structure analysis of the document image. Experiments with various kinds of multi-column document images show that, by using the proposed methods, their corresponding HTML documents can be generated in the same visual layout as that of the document images, and their structured table of contents page can be also produced with the hierarchically ordered section titles hyperlinked to the contents. 相似文献

14.

Fourier–Mellin registration of line-delineated tabular document images

Luke A. D. Hutchison William A. Barrett 《International Journal on Document Analysis and Recognition》2006,8(2-3):87-110

Image registration (or alignment) is a useful preprocessing tool for assisting in manual data extraction from handwritten forms, as well as for preparing documents for batch OCR of specific page regions. A new technique is presented for fast registration of lined tabular document images in the presence of a global affine transformation, using the Discrete Fourier--Mellin Transform (DFMT). Each component of the affine transform is handled separately, which dramatically reduces the total parameter space of the problem. This method is robust and deals with all components of the affine transform in a uniform way by working in the frequency domain. The DFMT is extended to handle shear, which can approximate a small amount of perspective distortion. In order to limit registration to foreground pixels only, and to eliminate Fourier edge effects, a novel, locally adaptive foreground-background segmentation algorithm is introduced, based on the median filter, which eliminates the need for Blackman windowing as usually required by DFMT image registration. A novel information-theoretic optimization of the median filter is presented. An original method is demonstrated for automatically obtaining blank document templates from a set of registered document images. 相似文献

15.

Text region extraction in a document image based on the Delaunay tessellation

Yi XiaoAuthor Vitae Hong YanAuthor Vitae 《Pattern recognition》2003,36(3):799-809

In this paper, Delaunay triangulation is applied for the extraction of text areas in a document image. By representing the location of connected components in a document image with their centroids, the page structure is described as a set of points in two-dimensional space. When imposing Delaunay triangulation on these points, the text regions in the Delaunay triangulation will have distinguishing triangular features from image and drawing regions. For analysis, the Delaunay triangles are divided into four classes. The study reveals that specific triangles in text areas can be clustered together and identified as text body. Using this method, text regions in a document image containing fragments can also be recognized accurately. Experiments show the method is also very efficient. 相似文献

16.

A spatially adaptive statistical method for the binarization of historical manuscripts and degraded document images

Rachid Hedjam^{Author Vitae} Reza Farrahi Moghaddam Author Vitae Author Vitae 《Pattern recognition》2011,44(9):2184-2196

相似文献

17.

A knowledge-based system for extracting text-lines from mixed and overlapping text/graphics compound document images

Yen-Lin ChenZeng-Wei Hong Cheng-Hung Chuang 《Expert systems with applications》2012,39(1):494-507

This paper presents a new knowledge-based system for extracting and identifying text-lines from various real-life mixed text/graphics compound document images. The proposed system first decomposes the document image into distinct object planes to separate homogeneous objects, including textual regions of interest, non-text objects such as graphics and pictures, and background textures. A knowledge-based text extraction and identification method obtains the text-lines with different characteristics in each plane. The proposed system offers high flexibility and expandability by merely updating new rules to cope with various types of real-life complex document images. Experimental and comparative results prove the effectiveness of the proposed knowledge-based system and its advantages in extracting text-lines with a large variety of illumination levels, sizes, and font styles from various types of mixed and overlapping text/graphics complex compound document images. 相似文献

18.

Automated segmentation of brain MR images 总被引：5，自引：0，他引：5

C. B.S. bioR. 《Pattern recognition》1995,28(12):1825-1837

A simple, robust and efficient image segmentation algorithm for classifying brain tissues from dual echo Magnetic Resonance (MR) images is presented. The algorithm consists of a sequence of adaptive histogram analysis, morphological operations and knowledge based rules to accurately classify various regions such as the brain matter and the cerebrospinal fluid, and detect if there are any abnormal regions. It can be completely automated and has been tested on over hundred images from several patient studies. Experimental results are provided. 相似文献

19.

Self-adaptive algorithm of impulsive noise reduction in color images

B. Smolka K.N. Plataniotis A. Chydzinski M. Szczepanski A.N. Venetsanopoulos K. Wojciechowski 《Pattern recognition》2002,35(8):1771-1784

In this paper a new approach to the problem of impulsive noise reduction in color images is presented. The basic idea behind the new image filtering technique is the maximization of the similarities between pixels in a predefined filtering window. The improvement introduced to this technique lies in the adaptive establishing of parameters of the similarity function and causes that the new filter adapts itself to the fraction of corrupted image pixels. The new method preserves edges, corners and fine image details, is relatively fast and easy to implement. The results show that the proposed method outperforms most of the basic algorithms for the reduction of impulsive noise in color images. 相似文献

20.

Adaptive-hierarchical filtering approach for noise removal

Tsung-Nan Lin Kai-Jie Chan 《Displays》2008,29(3):209-213

In this paper, our contribution is to propose an adaptive-hierarchical filter that can remove the impulse noise while preserving the details in an image. The global image structure, which is estimated from a set of pyramid images, can be used as prior information in order to apply different filters adaptively. The proposed filter outperforms other methods in that it can make use of both local and global information efficiently. Experimental results show that the proposed method produces better performance than many other well-known methods do. 相似文献