An empirical measure of the performance of a document image segmentation algorithm |
| |
Authors: | Amit Kumar Das, Sanjoy Kumar Saha, Bhabatosh Chanda |
| |
Affiliation: | (1) Computer Science & Technology Department, Bengal Engineering College (DU), Sibpore, Howrah 711 103, India; e-mail: {amit,sks}@becs.ac.in; (2) Electronics and Communication Sciences Unit, Indian Statistical Institute, Calcutta 700 035, India; e-mail: chanda@isical.ac.in |
| |
Abstract: | Document image segmentation is the first step in document image analysis and understanding. One major problem is
evaluating the performance of segmentation algorithms as they continue to evolve. Standard document databases maintained at
universities and research laboratories provide authentic data sources and related information, but a methodology is still
needed to analyse the performance of the segmentation itself. We describe a new document model based on a bounding box
representation of the document's constituent parts, and we propose an empirical measure of the performance of a segmentation
algorithm based on this new graph-like model of the document. Besides global error measures, the proposed method also
produces segment-wise details of common segmentation problems such as horizontal and vertical splits and merges, as well as
invalid and mismatched regions.
Received July 14, 2000 / Revised June 12, 2001 |
| |
Keywords: | Document image analysis – Segmentation – Document image database – Document model – Performance analysis |
This article is indexed in SpringerLink and other databases. |
|
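The abstract only names the ingredients of the evaluation (bounding boxes for document parts, and segment-wise reports of splits, merges and invalid regions), so the following is a minimal illustrative sketch and not the authors' measure. Everything in it is an assumption made for illustration: the `Box` type, the `compare` function, the `min_frac` overlap threshold, the "missed" category, and the convention that a split or merge is called "horizontal" when the pieces involved lie side by side and "vertical" when they are stacked. Mismatched regions and the graph-like document model are not modelled here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box; (x0, y0) is top-left, (x1, y1) is bottom-right."""
    x0: int
    y0: int
    x1: int
    y1: int

    def area(self) -> int:
        return max(0, self.x1 - self.x0) * max(0, self.y1 - self.y0)


def overlap_area(a: Box, b: Box) -> int:
    """Area of the intersection of two boxes (0 if they do not overlap)."""
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    return w * h if w > 0 and h > 0 else 0


def arrangement(boxes: List[Box]) -> str:
    """Assumed convention: 'horizontal' if the box centres spread mainly
    along x (side by side), 'vertical' if mainly along y (stacked)."""
    xs = [(b.x0 + b.x1) / 2 for b in boxes]
    ys = [(b.y0 + b.y1) / 2 for b in boxes]
    return "horizontal" if max(xs) - min(xs) >= max(ys) - min(ys) else "vertical"


def compare(truth: List[Box], found: List[Box], min_frac: float = 0.1) -> Dict[str, list]:
    """Classify segment-wise errors by matching ground-truth and detected boxes.

    Two boxes are considered matched when their overlap covers at least
    min_frac of the smaller box (a hypothetical threshold).
    """
    t_hits: Dict[int, List[int]] = {i: [] for i in range(len(truth))}
    f_hits: Dict[int, List[int]] = {j: [] for j in range(len(found))}
    for i, t in enumerate(truth):
        for j, f in enumerate(found):
            ov = overlap_area(t, f)
            if ov and ov >= min_frac * min(t.area(), f.area()):
                t_hits[i].append(j)
                f_hits[j].append(i)

    report: Dict[str, list] = {"split": [], "merge": [], "missed": [], "invalid": []}
    for i, js in t_hits.items():
        if not js:
            report["missed"].append(i)  # true region with no detected counterpart
        elif len(js) > 1:
            report["split"].append((i, arrangement([found[j] for j in js])))
    for j, matched in f_hits.items():
        if not matched:
            report["invalid"].append(j)  # detected region with no true counterpart
        elif len(matched) > 1:
            report["merge"].append((j, arrangement([truth[i] for i in matched])))
    return report
```

Calling `compare(truth, found)` with two lists of `Box` values returns, per category, the indices of the affected regions. Split and merge entries are `(index, direction)` pairs, where the index refers to `truth` for splits and to `found` for merges.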