Similar literature
Found 20 similar articles; search took 31 ms
1.
Correct segmentation of handwritten Chinese characters is crucial to their successful recognition. However, because of the many difficulties involved, little work has been reported in this area. In this paper, a two-stage approach is presented to segment unconstrained handwritten Chinese characters. After proper image preprocessing, a handwritten Chinese character string is first coarsely segmented according to the background skeleton and vertical projection. Using several geometric features, all possible segmentation paths are evaluated with fuzzy decision rules learned from examples, and unsuitable segmentation paths are discarded. In the fine segmentation stage that follows, the strokes that may contain segmentation points are first identified. Feature points are then extracted from the candidate strokes and taken as segmentation point candidates, through each of which a segmentation path may be formed. Geometric features similar to those of the coarse segmentation stage are used, and corresponding fuzzy decision rules are generated to evaluate the fine segmentation paths. Experimental results on 1000 Chinese character strings from postal mail show that the approach achieves reasonably good overall accuracy in segmenting unconstrained handwritten Chinese characters.
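The coarse stage described above relies partly on vertical projection. As a minimal sketch of that single ingredient (not the authors' implementation, which also uses the background skeleton and fuzzy decision rules), the Python below counts ink pixels per column of a binary image and proposes one cut in the middle of each empty gap. The function names and the list-of-lists image format are assumptions made for illustration.

```python
def vertical_projection(image):
    """Count foreground pixels (value 1) in each column of a binary image."""
    if not image:
        return []
    width = len(image[0])
    return [sum(row[x] for row in image) for x in range(width)]


def coarse_cut_columns(image):
    """Return one candidate cut column per empty gap between runs of ink.

    A gap is a maximal run of zero-projection columns with ink on both
    sides; the cut is placed at the middle column of the gap.
    """
    profile = vertical_projection(image)
    cuts = []
    gap_start = None   # first column of the gap currently being scanned
    seen_ink = False   # leading blank columns are not treated as gaps
    for x, count in enumerate(profile):
        if count == 0:
            if seen_ink and gap_start is None:
                gap_start = x
        else:
            if gap_start is not None:
                # the gap spans columns gap_start .. x - 1
                cuts.append((gap_start + x - 1) // 2)
                gap_start = None
            seen_ink = True
    return cuts
```

Touching characters leave no empty column between them, which is why a projection-only cut is at best a coarse first pass and a finer stage is still needed.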

2.
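A segmentation method for unconstrained handwritten numeral strings   Total citations: 11, self-citations: 1, citations by others: 11
To handle touching characters in unconstrained handwritten numeral strings, this paper proposes a numeral string segmentation method that relies mainly on recognition-based segmentation and combines it with several other segmentation strategies, including dissection methods and holistic recognition. The method targets numeral strings directly but can also be applied to non-numeral character strings, and its segmentation ideas offer some guidance for segmenting touching Chinese characters as well.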

3.
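Luo Jia (罗佳), Wang Ling (王玲). 《微计算机信息》 (Microcomputer Information), 2007, 23(25): 275-276, 284
To address the structural complexity and the high time and space cost of existing segmentation algorithms, a new method based on concavity and convexity features is proposed for segmenting unconstrained touching handwritten numeral strings. The method first computes the value-assigned background of the numeral string image, then extracts concavity and convexity features from it to locate the segmentation region, and finally extracts the segmentation line within that region. The method is simple and fast, and it improves segmentation accuracy while reducing complexity. In experiments on samples collected from NIST SD19, the accuracy reached 97.5% and the segmentation time was also greatly reduced.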

4.
In this paper, we develop a new method to separate single-touching handwritten numeral strings with two numerals using structural features. A binary image of a single-touching handwritten numeral string is preprocessed with an efficient algorithm for smoothing, linearization and detection of structural points of image contours. The touching region of the string is determined based on the distribution of the structural points in the string. A candidate touching point is preselected based on the geometrical information of a special structural point in the touching region. In some cases, the left or right numeral of a single-touching handwritten numeral string can be recognized, and the recognition information can be used to correct the position of the candidate touching point. We have tested our method on image samples taken from the U.S. National Institute of Standards and Technology (NIST) database. We used 500 sample images for training and obtained a correct separation rate of 99.1%. For 3287 test samples not used for training, the correct separation rate was 97.2%.

5.
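Handwritten numeral string segmentation is an indispensable component of handwritten-numeral OCR systems. In practical applications, boxes are usually used to constrain where digits are written, which makes segmentation fairly easy; without such boxes, segmenting handwritten numeral strings becomes a hard problem. To address the difficulties of segmenting unconstrained handwritten numeral strings, a new method for segmenting touching numeral strings is proposed. The method first uses principal curves to extract the strokes of character templates, then processes the strokes according to their fuzzy features, and finally completes the segmentation based on the confidence supplied by a character recognizer. To validate the new method, segmentation experiments were carried out on 3,000 real checks collected on site from banks, of which 363 contained touching digits; the segmentation accuracy was 89.68%. The results show that the algorithm can effectively segment handwritten numeral strings in which multiple digits touch.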

6.
For the first time, a genetic framework using contextual knowledge is proposed for segmentation and recognition of unconstrained handwritten numeral strings. New algorithms have been developed to locate feature points on the string image, and to generate possible segmentation hypotheses. A genetic representation scheme is utilized to show the space of all segmentation hypotheses (chromosomes). For the evaluation of segmentation hypotheses, a novel evaluation scheme is introduced, in order to improve the outlier resistance of the system. Our genetic algorithm tries to search and evolve the population of segmentation hypotheses, and to find the one with the highest segmentation/recognition confidence. The NIST NSTRING SD19 and CENPARMI databases were used to evaluate the performance of our proposed method. Our experiments showed that proper use of contextual knowledge in segmentation, evaluation and search greatly improves the overall performance of the system. On average, our system was able to obtain correct recognition rates of 95.28% and 96.42% on handwritten numeral strings using neural network and support vector classifiers, respectively. These results compare favorably with the ones reported in the literature.

7.
8.
In integrated segmentation and recognition of character strings, the underlying classifier is trained to be resistant to noncharacters. We evaluate the performance of state-of-the-art pattern classifiers of this kind. First, we build a baseline numeral string recognition system with simple but effective presegmentation. The classification scores of the candidate patterns generated by presegmentation are combined to evaluate the segmentation paths, and the optimal path is found using the beam search strategy. Three neural classifiers, two discriminative density models, and two support vector classifiers are evaluated. Each classifier has some variations depending on the training strategy: maximum likelihood, and discriminative learning both with and without noncharacter samples. The string recognition performance is evaluated on the numeral string images of the NIST Special Database 19 and the zipcode images of the CEDAR CDROM-1. The results show that noncharacter training is crucial for neural classifiers and support vector classifiers, whereas for the discriminative density models the regularization of parameters is important. The string recognition results compare favorably with the best ones reported in the literature, even though the geometric context was ignored entirely. The best results were obtained using a support vector classifier, but the neural classifiers and discriminative density models show a better trade-off between accuracy and computational overhead.
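The path evaluation described above can be sketched as a small beam search over presegmented primitives. This is an illustrative sketch only: the abstract does not say how the classification scores are combined, so the code assumes additive, log-like scores, and `classify`, `max_merge` and `beam_width` are hypothetical names introduced here.

```python
import heapq


def beam_search_segmentation(num_primitives, classify, max_merge=3, beam_width=5):
    """Find a high-scoring segmentation path over primitives 0..num_primitives-1.

    classify(start, end) is a hypothetical callback that returns
    (label, score) for the candidate pattern made of primitives
    start..end-1. Scores are assumed to be additive along a path.
    """
    # beams[p] holds partial paths that end exactly at primitive boundary p
    beams = [[] for _ in range(num_primitives + 1)]
    beams[0] = [(0.0, ())]
    for start in range(num_primitives):
        # prune: keep only the best partial paths that reach this boundary
        survivors = heapq.nlargest(beam_width, beams[start], key=lambda h: h[0])
        for score, path in survivors:
            last = min(start + max_merge, num_primitives)
            for end in range(start + 1, last + 1):
                label, pattern_score = classify(start, end)
                beams[end].append(
                    (score + pattern_score, path + ((start, end, label),))
                )
    if not beams[num_primitives]:
        return None
    return max(beams[num_primitives], key=lambda h: h[0])
```

A classifier trained to resist noncharacters matters here because most candidate patterns on most paths are fragments or merged digits; if those received high scores, wrong paths would survive the pruning step.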

9.
The segmentation of handwritten digit strings into isolated digits remains a challenging task. The difficulty of recognizing handwritten digit strings is related to several factors, such as slant, overlapping and connected digits, and the unknown length of the digit string. This paper therefore proposes a segmentation and recognition system for unknown-length handwritten digit strings that combines several explicit segmentation methods depending on the configuration of the link between digits. Three segmentation methods are combined, based on the histogram of the vertical projection, contour analysis and the sliding-window Radon transform. A recognition and verification module based on support vector machine classifiers analyzes each segmented digit image and decides whether to accept or reject it. Moreover, various submodules are included to enhance the robustness of the proposed system. Experimental results on the benchmark dataset show that, compared with the state of the art, the proposed system is effective at segmenting handwritten digit strings without prior knowledge of their length.

10.
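Handwritten numeral string recognition is commonly used in automatic mail sorting and in data entry for bank notes and financial statements. To address the high complexity and low accuracy of segmentation-based recognition algorithms, a segmentation-free handwritten numeral string recognition algorithm using multiple classifiers is proposed. At its core, the algorithm uses four classifiers to recognize touching strings without segmentation; applies residual structures to the LeNet-5 network to increase network depth, improve recognition accuracy and speed up convergence; and uses a dynamic selection strategy to keep misclassifications by the length classifier from affecting the recognition result. Experimental results show that, with the networks trained on NIST SD19 single digits and a Synthetic dataset and validated on NIST SD19 strings of length 2, 3, 4, 5 and 6, the recognition accuracies were 99.3%, 98.5%, 98.1%, 96.6% and 97.2%, respectively.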

11.
The recognition of connected handwritten digit strings is a challenging task due mainly to two problems: poor character segmentation and unreliable isolated character recognition. The authors first present a rational B-spline representation of digit templates based on Pixel-to-Boundary Distance (PBD) maps. We then present a neural network approach to extract B-spline PBD templates and an evolutionary algorithm to optimize these templates. In total, 1000 templates (100 templates for each of 10 classes) were extracted from and optimized on 10426 training samples from the NIST Special Database 3. By using these templates, a nearest neighbor classifier can successfully reject 90.7 percent of nondigit patterns while achieving a 96.4 percent correct classification of isolated test digits. When our classifier is applied to the recognition of 4958 connected handwritten digit strings (4555 2-digit, 355 3-digit, and 48 4-digit strings) from the NIST Special Database 3 with a dynamic programming approach, it has a correct classification rate of 82.4 percent with a rejection rate as low as 0.85 percent. Our classifier compares favorably in terms of correct classification rate and robustness with the other classifiers that were tested.

12.
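A motion segmentation algorithm based on regions of interest   Total citations: 1, self-citations: 0, citations by others: 1
A motion segmentation algorithm is proposed that applies morphological processing within regions of interest (ROI). The binary image obtained by change detection is first segmented (coarse segmentation); morphological processing is then applied inside each region produced by the coarse segmentation; finally, the processed binary image is segmented again (fine segmentation). The algorithm handles motion segmentation well under heavy noise and prevents large areas of noise from affecting the segmentation of foreground objects.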
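The three steps of this abstract (coarse segmentation, morphological processing inside each region, fine segmentation) can be sketched in plain Python as follows. The abstract does not name the morphological operator, so the opening with a plus-shaped structuring element used here is an assumption, as are the function names and the use of 4-connected component labeling for both segmentation passes.

```python
from collections import deque

NEIGHBORS4 = ((1, 0), (-1, 0), (0, 1), (0, -1))


def connected_regions(mask):
    """Label 4-connected foreground regions; each region is a set of (row, col)."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = set()
    regions = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or (r, c) in seen:
                continue
            region = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                region.add((y, x))
                for dy, dx in NEIGHBORS4:
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def open_region(region):
    """Morphological opening (erosion, then dilation) with a plus-shaped element."""
    element = NEIGHBORS4 + ((0, 0),)
    eroded = {
        (y, x) for (y, x) in region
        if all((y + dy, x + dx) in region for dy, dx in element)
    }
    return {(y + dy, x + dx) for (y, x) in eroded for dy, dx in element}


def segment_change_mask(mask):
    """Coarse segmentation, per-region opening, then fine segmentation."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    cleaned = [[0] * cols for _ in range(rows)]
    for region in connected_regions(mask):      # coarse segmentation
        for y, x in open_region(region):        # morphology inside the region
            cleaned[y][x] = 1
    return connected_regions(cleaned)           # fine segmentation
```

Processing each coarse region separately means the opening never merges pixels from two different regions, which is one plausible reading of why the approach limits the influence of large noisy areas on the foreground objects.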

13.
Image thresholding is a common segmentation technique with applications in various fields, such as computer vision, pattern recognition, microscopy, remote sensing, and biology. Threshold values for separating pixels into foreground and background regions are usually chosen from subjective assumptions, from user judgment under empirical rules, or manually. This work describes and evaluates six effective threshold selection strategies for image segmentation based on global optimization methods: genetic algorithms, particle swarm, simulated annealing, and pattern search. Experiments are conducted on several images to demonstrate the effectiveness of the proposed methodology.
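To make the idea of threshold selection by search concrete, here is a minimal sketch of one of the listed optimizers, a one-dimensional pattern search. The abstract does not state the objective function, so the between-class variance (Otsu's criterion) used below is an assumption, and the function and parameter names are hypothetical.

```python
def between_class_variance(hist, t):
    """Otsu criterion for splitting gray levels into [0, t) and [t, len(hist))."""
    total = sum(hist)
    w0 = sum(hist[:t])
    w1 = total - w0
    if w0 == 0 or w1 == 0:
        return 0.0
    mean0 = sum(i * hist[i] for i in range(t)) / w0
    mean1 = sum(i * hist[i] for i in range(t, len(hist))) / w1
    return (w0 / total) * (w1 / total) * (mean0 - mean1) ** 2


def pattern_search_threshold(hist, start=None, step=None):
    """Search for a threshold by polling t - step and t + step.

    The current threshold moves to the first polled point that improves
    the objective; if neither improves it, the step is halved. The search
    stops when the step reaches zero.
    """
    levels = len(hist)
    t = levels // 2 if start is None else start
    step = max(1, levels // 4) if step is None else step
    best = between_class_variance(hist, t)
    while step >= 1:
        improved = False
        for candidate in (t - step, t + step):
            if 0 < candidate < levels:
                value = between_class_variance(hist, candidate)
                if value > best:
                    t, best, improved = candidate, value, True
                    break
        if not improved:
            step //= 2
    return t
```

On a multimodal histogram this simple poll can stop at a local optimum, which is one reason population-based or stochastic methods such as genetic algorithms, particle swarm and simulated annealing are natural alternatives to compare against.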

14.
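Segmentation of touching objects based on morphological reconstruction   Total citations: 1, self-citations: 0, citations by others: 1
An image segmentation method based on morphological reconstruction is proposed. The method first preprocesses the image to be segmented so that boundary points have locally maximal gray values. It then uses grayscale morphological reconstruction to extract domes and, based on their properties, binarizes the domes with a threshold to obtain a set of candidate boundary points. Binary morphological reconstruction is then used to determine the boundary points within the candidate set, which gives the segmentation boundary. Experimental results show that the resulting boundaries have good continuity and few false boundaries; the method is only slightly affected by noise and by gray-level variation inside objects, and is suitable for segmenting images that contain touching objects.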

15.
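To accurately segment hand contours in complex environments, an improved watershed algorithm is proposed. A codebook is used to model the background and extract the foreground; the skeletons of the foreground and background are extracted and used as markers for the watershed transform; and Freeman chain codes are used to smooth the contour, yielding the hand contour that best matches visual perception. The sample images are 1,280 × 720 pixels. Segmentation accuracy is evaluated with a distance-based measure and a region-based measure: the mean absolute deviation is within 5 pixels and the misclassification error is within 0.9%. Experimental results show that the algorithm effectively solves the over-segmentation problem of the watershed transform, accurately extracts highly variable hand contours, and is robust to complex backgrounds and illumination changes.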

16.
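Segmenting and recognizing lines of touching and broken characters is one of the main difficulties in many practical OCR applications. For lines of touching and broken printed digits, this paper proposes a segmentation and recognition scheme based on the Viterbi algorithm, using a hierarchical structure with two passes of segmentation and recognition. In the second pass, non-straight segmentation paths are first searched with the Viterbi algorithm in the candidate segmentation-point regions, combining the grayscale image with binary contour information, to obtain valid segmentation paths. Then, using the confidence output by the classifier, the Viterbi algorithm is applied to merge the candidate image blocks obtained earlier, performing dynamic segmentation and recognition. Experiments with a real financial document recognition system show that the proposed method copes well with touching and broken characters in digit lines and improves the recognition rate and robustness of the system.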

17.
Variations in inter-line gaps and skewed or curled text-lines are some of the challenging issues in segmentation of handwritten text-lines. Moreover, overlapping and touching text-lines that frequently appear in unconstrained handwritten text documents significantly increase segmentation complexities. In this paper, we propose a novel approach for unconstrained handwritten text-line segmentation. A new painting technique is employed to smear the foreground portion of the document image. The painting technique enhances the separability between the foreground and background portions enabling easy detection of text-lines. A dilation operation is employed on the foreground portion of the painted image to obtain a single component for each text-line. Thinning of the background portion of the dilated image and subsequently some trimming operations are performed to obtain a number of separating lines, called candidate line separators. By using the starting and ending points of the candidate line separators and analyzing the distances among them, related candidate line separators are connected to obtain segmented text-lines. Furthermore, the problems of overlapping and touching components are addressed using some novel techniques. We tested the proposed scheme on text-pages of English, French, German, Greek, Persian, Oriya and Bangla and remarkable results were obtained.

18.
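Off-line handwritten digit recognition based on principal component analysis   Total citations: 1, self-citations: 0, citations by others: 1
Zhang Guohua (张国华), Wan Junli (万钧力). 《计算机工程》 (Computer Engineering), 2007, 33(18): 219-221
To address the difficulty of fusing statistical and structural features in handwritten digit recognition, principal component analysis is used to extract statistical information about the structural features of digit characters, reconstruct the digit model and estimate the reconstruction error. The height-to-width ratio feature and the Euler feature of each digit are also extracted. Digits are recognized by combining the outputs of Bayesian classifiers corresponding to the three kinds of features. When tested on samples from the sample database, the method achieved a correct recognition rate of 90.73%.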

19.
In this paper, a two-stage HMM-based recognition method allows us to compensate for the possible loss in recognition performance caused by the necessary trade-off between segmentation and recognition in an implicit segmentation-based strategy. The first stage consists of an implicit segmentation process that takes into account some contextual information to provide multiple segmentation-recognition hypotheses for a given preprocessed string. These hypotheses are verified and re-ranked in a second stage by using an isolated digit classifier. This method enables the use of two sets of features and numeral models: one taking into account both the segmentation and recognition aspects in an implicit segmentation-based strategy, and the other considering just the recognition aspects of isolated digits. These two stages have been shown to be complementary, in the sense that the verification stage compensates for the loss in recognition performance brought about by the necessary trade-off between segmentation and recognition carried out in the first stage. The experiments on 12,802 handwritten numeral strings of different lengths have shown that the use of a two-stage recognition strategy is a promising idea. The verification stage brought about an average improvement of 9.9% on the string recognition rates. On touching digit pairs, the method achieved a recognition rate of 89.6%. Received June 28, 2002 / Revised July 03, 2002

20.
The touching character segmentation problem becomes complex when touching strings are multi-oriented. Moreover, in graphical documents, the characters in a single touching string sometimes have different orientations, and segmenting such complex touching strings is more challenging. In this paper, we present a scheme for segmenting English multi-oriented touching strings into individual characters. When two or more characters touch, they generate a large cavity region in the background portion. Based on convex hull information, we first use this background information to find some initial points for segmenting a touching string into possible primitives (a primitive consists of a single character or part of a character). Next, the primitives are merged to obtain the optimum segmentation. A dynamic programming algorithm is applied for this purpose, using the total likelihood of characters as the objective function. An SVM classifier is used to find the likelihood of a character. To handle multi-oriented touching strings, the features used in the SVM are invariant to character orientation. Experiments were performed on different databases of real and synthetic touching characters, and the results show that the method is efficient in segmenting touching characters of arbitrary orientations and sizes.
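The merging step can be illustrated with a small dynamic program over consecutive primitives. This is a sketch under stated assumptions rather than the authors' algorithm: it assumes the total likelihood is a sum of per-character log-likelihoods, that only adjacent primitives are merged, and that at most `max_merge` primitives form one character. The `log_likelihood` callback stands in for the SVM output and is hypothetical.

```python
def best_merge(num_primitives, log_likelihood, max_merge=3):
    """Group consecutive primitives into characters with maximal total score.

    log_likelihood(start, end) is a hypothetical callback that scores the
    character formed by primitives start..end-1. Returns (total, groups),
    where groups is a list of (start, end) pairs covering all primitives.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (num_primitives + 1)   # best[p]: best score for primitives 0..p-1
    back = [None] * (num_primitives + 1)      # back[p]: start of the last group
    best[0] = 0.0
    for end in range(1, num_primitives + 1):
        for start in range(max(0, end - max_merge), end):
            if best[start] == neg_inf:
                continue
            score = best[start] + log_likelihood(start, end)
            if score > best[end]:
                best[end], back[end] = score, start
    if best[num_primitives] == neg_inf:
        return neg_inf, []
    # walk the back-pointers from the end to recover the grouping
    groups = []
    end = num_primitives
    while end > 0:
        start = back[end]
        groups.append((start, end))
        end = start
    groups.reverse()
    return best[num_primitives], groups
```

Unlike a greedy left-to-right merge, the table keeps the best score for every prefix, so a locally attractive split cannot block a better grouping later in the string.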
