Similar literature
20 similar documents found.
1.
A lexicon-based, handwritten word recognition system combining segmentation-free and segmentation-based techniques is described. The segmentation-free technique constructs a continuous density hidden Markov model for each lexicon string. The segmentation-based technique uses dynamic programming to match word images and strings. The combination module uses differences in classifier capabilities to achieve significantly better performance.

2.
Two hybrid fuzzy neural systems are developed and applied to handwritten word recognition. The word recognition system requires a module that assigns character class membership values to segments of images of handwritten words. The module must accurately represent ambiguities between character classes and assign low membership values to a wide variety of noncharacter segments resulting from erroneous segmentations. Each hybrid is a cascaded system. The first stage of both is a self-organizing feature map (SOFM). The second stages map distances into membership values. The third stage of one system is a multilayer perceptron (MLP). The third stage of the other is a bank of Choquet fuzzy integrals (FI). The two systems are compared individually and as a combination to the baseline system. The new systems each perform better than the baseline system. The MLP system slightly outperforms the FI system, but the combination of the two outperforms the individual systems with a small increase in computational cost over the MLP system. Recognition rates of over 92% are achieved with a lexicon set having an average size of 100. Experiments were performed on a standard test set from the SUNY/USPS CD-ROM database.
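
As a rough illustration of the operator named in the abstract above, the following sketch computes a discrete Choquet fuzzy integral of a few evidence values with respect to a fuzzy measure. The function name `choquet_integral`, the example measures, and the input values are assumptions made for illustration; the abstract does not say how the fuzzy measures are defined or trained.

```python
from typing import Callable, FrozenSet, Sequence


def choquet_integral(values: Sequence[float],
                     measure: Callable[[FrozenSet[int]], float]) -> float:
    """Discrete Choquet integral of `values` with respect to `measure`.

    `measure` maps a set of source indices to [0, 1]; it is assumed to be
    monotone, with measure(empty set) = 0 and measure(all indices) = 1.
    """
    # Visit the sources from the largest value to the smallest.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    total = 0.0
    for k, idx in enumerate(order):
        top_k = frozenset(order[:k + 1])  # the k+1 sources with largest values
        next_val = values[order[k + 1]] if k + 1 < len(order) else 0.0
        total += (values[idx] - next_val) * measure(top_k)
    return total


# Hypothetical evidence values from three sources.
evidence = [0.9, 0.5, 0.2]

# An additive measure turns the integral into a weighted average.
weights = [0.5, 0.3, 0.2]
additive = lambda subset: sum(weights[i] for i in subset)
weighted_avg = choquet_integral(evidence, additive)

# A measure that is 1 on every non-empty set yields the maximum.
as_max = choquet_integral(evidence, lambda subset: 1.0 if subset else 0.0)
```

With a non-additive measure the integral can weight coalitions of sources differently from their individual weights, which is what distinguishes it from a plain weighted average.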

3.
For a segmentation and dynamic programming-based handwritten word recognition system, outlier rejection at the character level can improve word recognition performance because it reduces the chances that erroneous combinations of segments result in high word confidence values. We studied the multilayer perceptron (MLP) and a variant of the radial basis function network (RBF) with the goal of using them as character-level classifiers that have enhanced outlier rejection ability. The variant of the RBF uses principal component analysis (PCA) on the clusters defined by the nodes in the hidden layer. It was also trained with and without a regularization term that was aimed at minimizing the variances of the nodes in the hidden layer. Our experiments on handwritten word recognition showed: (1) in the case of MLPs, using more hidden nodes than required for classification and including outliers in the training data can improve outlier rejection performance; (2) in the case of PCA-RBFs, training with the regularization term and no outliers can achieve performance very close to training with outliers. These results are both interesting. Result (1) is of interest because it is well known that minimizing the number of parameters, and therefore keeping the number of hidden units low, should increase the generalization capability. On the other hand, using more hidden units increases the chances of creating closed decision regions, as predicted by the theory in Gori and Scarselli (IEEE Trans. PAMI 20 (11) (1998) 1121). Result (2) is a strong statement in support of the use of regularization terms for the training of RBF-type neural networks in problems such as handwriting recognition for which outlier rejection is important. Additional tests on combining MLPs and PCA-RBF networks showed the potential to improve word recognition performance by exploiting the complementarity of these two kinds of neural networks.

4.
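Handwritten text recognition is mainly applied in text input technology and plays a key role in the development of human-computer interaction. To address the problem that most online input methods cannot recognize mixed Chinese-English handwriting, an online recognition method for mixed Chinese-English handwritten text is proposed. Text strokes are merged according to rules based on horizontal relative position, vertical overlap ratio, and area overlap ratio, and connected strokes are split, yielding a series of character fragments. The fragments are then classified as Chinese or English using geometric features such as stroke count, aspect ratio, center deviation, and smoothness, together with recognition confidence. On this basis, according to the classification results and combined with path evaluation under a natural language model and a dynamic programming search algorithm, the candidate Chinese and English character fragments are merged separately to obtain the Chinese and English character sequences to be recognized. These sequences are fed into the Chinese and English recognition models of a convolutional neural network to obtain the handwritten text recognition result. Experimental results show that the recognition accuracy for online handwritten mixed Chinese-English text reaches 93.67%. The method can segment online handwritten Chinese text lines and also segments online handwritten mixed Chinese-English text lines containing connected strokes well.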

5.
Previous handwritten numeral recognition algorithms applied structural classification to extract geometric primitives that characterize each image, and then utilized artificial intelligence methods, such as neural networks or fuzzy memberships, to classify the images. We propose a handwritten numeral recognition methodology based on simplified structural classification, using a much smaller set of primitive types, and fuzzy memberships. More specifically, based on three kinds of feature points, we first extract five kinds of primitive segments for each image. A fuzzy membership function is then used to estimate the likelihood of these primitives being close to the two vertical boundaries of the image. Finally, a tree-like classifier based on the extracted feature points, primitives and fuzzy memberships is applied to classify the numerals. With our system, handwritten numerals in NIST Special Database 19 are recognized with a correct rate between 87.33% and 88.72%.

6.
Experiments comparing neural networks trained with crisp and fuzzy desired outputs are described. A handwritten word recognition algorithm using the neural networks for character level confidence assignment was tested on images of words taken from the United States Postal Service mailstream. The fuzzy outputs were defined using a fuzzy k-nearest neighbor algorithm. The crisp networks slightly outperformed the fuzzy networks at the character level but the fuzzy networks outperformed the crisp networks at the word level. This empirical result is interpreted as an example of the principle of least commitment.

7.
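A major challenge currently facing keyword spotting is the out-of-vocabulary (OOV) word problem. Because the pronunciation of OOV words is uncertain, their detection performance is much worse than that of in-vocabulary words. To address this, this paper proposes a method that fuses query expansion and dynamic matching to improve OOV word detection. Query expansion based on a joint multigram model is first compared with dynamic matching based on minimum edit distance. Considering the potential complementarity of the two, two fusion methods are adopted. The first is result fusion, which applies query expansion and dynamic matching in parallel to detect OOV words and then merges the detection results. The second is confidence fusion, which fuses the minimum edit distance and the pronunciation score into a hybrid confidence measure for detecting and verifying OOV words. Experimental results show that the second fusion method works better, with a relative improvement of 19.8% in the system's figure of merit.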
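
The minimum edit distance that the dynamic matching step above relies on can be sketched as follows. This is the textbook Levenshtein recurrence with unit costs for insertion, deletion, and substitution; the function name `edit_distance` and the phoneme strings are illustrative assumptions, and the paper may use different costs or units.

```python
from typing import Hashable, Sequence


def edit_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Levenshtein distance between two sequences, using unit costs."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        cur = [i]  # distance from a[:i] to the empty prefix of b
        for j, y in enumerate(b, start=1):
            cur.append(min(
                prev[j] + 1,             # delete x
                cur[j - 1] + 1,          # insert y
                prev[j - 1] + (x != y),  # substitute, or match at zero cost
            ))
        prev = cur
    return prev[-1]


# Hypothetical phoneme sequences for a query and a recognized hypothesis.
query = ["k", "ae", "t"]
hypothesis = ["k", "ah", "t", "s"]
distance = edit_distance(query, hypothesis)  # one substitution + one insertion
```

Because the function accepts any sequence of hashable items, it works on phoneme lists as well as on plain character strings.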

8.
Handprinted word recognition on a NIST data set
An approach to handprinted word recognition is described. The approach is based on generating multiple possible segmentations of a word image into characters and matching these segmentations to a lexicon of candidate strings. The segmentation process uses a combination of connected component analysis and distance transform-based, connected character splitting. Neural networks are used to assign character confidence values to potential characters within word images. Experimental results are provided for both character and word recognition modules on data extracted from the NIST handprinted character database.

9.
Variability in handwriting styles suggests that many letter recognition engines cannot correctly identify some handwritten letters of poor quality at reasonable computational cost. Methods that are capable of searching the resulting sparse graph of letter candidates are therefore required. The method presented here employs ‘wildcards’ to represent missing letter candidates. Multiple experts are used to represent different aspects of handwriting. Each expert evaluates closeness of match and indicates its confidence. Explanation experts determine the degree to which the word alternative under consideration explains extraneous letter candidates. Schemata for normalisation and combination of scores are investigated and their performance compared. Hill climbing yields near-optimal combination weights that outperform comparable methods on identical dynamic handwriting data.

10.
An efficient unambiguous stereo matching technique is presented in this paper. Our main contribution is to introduce a new reliability measure to dynamic programming approaches in general. For stereo vision applications, the reliability of a proposed match on a scanline is defined as the cost difference between the globally best disparity assignment that includes the match and the globally best assignment that does not include the match. A reliability-based dynamic programming algorithm is derived accordingly, which can selectively assign disparities to pixels when the corresponding reliabilities exceed a given threshold. The experimental results show that the new approach can produce dense (>70 percent of the unoccluded pixels) and reliable (error rate < 0.5 percent) matches efficiently (<0.2 sec on a 2 GHz P4) for the four Middlebury stereo data sets.
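
The reliability measure described above can be sketched on a deliberately simplified scanline model. The sketch assumes that every pixel receives exactly one disparity, that the smoothness penalty is `lam * |d - d'|` between neighbouring pixels, and that occlusions are ignored; none of these choices come from the abstract, and `reliable_disparities` is a hypothetical name. Under these assumptions, the best assignment that excludes a match (x, d) is the best assignment passing through some other disparity at x, so a forward and a backward pass suffice.

```python
from typing import List, Optional, Sequence


def reliable_disparities(cost: Sequence[Sequence[float]],
                         lam: float,
                         threshold: float) -> List[Optional[int]]:
    """Per-pixel disparity, or None where the best match is not reliable.

    cost[x][d] is the matching cost of disparity d at pixel x (>= 2 disparities).
    """
    width, n_disp = len(cost), len(cost[0])
    # fwd[x][d]: cheapest cost of pixels 0..x when pixel x takes disparity d.
    fwd = [[0.0] * n_disp for _ in range(width)]
    # bwd[x][d]: cheapest cost of pixels x+1..end given pixel x takes d.
    bwd = [[0.0] * n_disp for _ in range(width)]
    fwd[0] = list(cost[0])
    for x in range(1, width):
        for d in range(n_disp):
            fwd[x][d] = cost[x][d] + min(
                fwd[x - 1][p] + lam * abs(d - p) for p in range(n_disp))
    for x in range(width - 2, -1, -1):
        for d in range(n_disp):
            bwd[x][d] = min(
                lam * abs(d - q) + cost[x + 1][q] + bwd[x + 1][q]
                for q in range(n_disp))

    result: List[Optional[int]] = []
    for x in range(width):
        # Cost of the globally best assignment forced through (x, d).
        through = [fwd[x][d] + bwd[x][d] for d in range(n_disp)]
        best = min(range(n_disp), key=through.__getitem__)
        excluded = min(through[d] for d in range(n_disp) if d != best)
        reliability = excluded - through[best]
        result.append(best if reliability > threshold else None)
    return result


# Toy scanline: 3 pixels, 2 disparities; the last pixel is ambiguous.
toy_cost = [[0, 4], [0, 4], [2, 1.5]]
assigned = reliable_disparities(toy_cost, lam=1.0, threshold=1.0)
```

In the toy example the first two pixels have reliabilities 5 and 4.5 and keep disparity 0, while the last pixel has reliability 0.5 and is left unassigned.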

11.
A recognition system for general isolated off-line handwritten words using an approximate segment-string matching algorithm is described. The fundamental paradigm employed is a character-based segment-then-recognize/match strategy. Additional user-supplied contextual information in the form of a lexicon guides a graph search to estimate the most likely word image identity. This system is designed to operate robustly in the presence of document noise, poor handwriting, and lexicon errors. A pre-processing step is initially applied to the image to remove noise artifacts and normalize the handwriting. An oversegmentation approach is used to improve the likelihood of capturing the individual characters embedded in the word. A directed graph is constructed that contains many possible interpretations of the word image, many of them implausible. The most likely graph path and associated confidence are computed for each lexicon word to produce a final lexicon ranking. Experiments highlighting the characteristics of this algorithm are given.
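
A minimal sketch of the segment-then-match idea described above: an oversegmented word image is treated as `n_segments` primitive pieces, and dynamic programming finds the best way to group consecutive pieces into the characters of each lexicon word. The names `match_word`, `rank_lexicon`, and `toy_score`, the cap of three pieces per character, and the averaging of character confidences are all assumptions for illustration, not details taken from the abstract.

```python
import math
from typing import Callable, List


def match_word(n_segments: int, word: str,
               char_score: Callable[[int, int, str], float],
               max_span: int = 3) -> float:
    """Best average character confidence for matching `word` to the segments.

    char_score(i, j, ch) is the confidence that segments i..j-1 together
    form the character ch. Returns -inf if no grouping is possible.
    """
    neg = -math.inf
    m = len(word)
    # best[k][j]: best total score matching word[:k] to segments 0..j-1.
    best = [[neg] * (n_segments + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for k in range(1, m + 1):
        for j in range(1, n_segments + 1):
            for span in range(1, min(max_span, j) + 1):
                i = j - span
                if best[k - 1][i] == neg:
                    continue
                cand = best[k - 1][i] + char_score(i, j, word[k - 1])
                best[k][j] = max(best[k][j], cand)
    if m == 0 or best[m][n_segments] == neg:
        return neg
    return best[m][n_segments] / m


def rank_lexicon(n_segments: int, lexicon: List[str],
                 char_score: Callable[[int, int, str], float]) -> List[str]:
    """Lexicon words sorted from the most to the least likely match."""
    return sorted(lexicon,
                  key=lambda w: match_word(n_segments, w, char_score),
                  reverse=True)


# Hypothetical character classifier output for a 4-piece image of "cat".
_table = {(0, 1, "c"): 0.9, (1, 3, "a"): 0.8, (3, 4, "t"): 0.7}


def toy_score(i: int, j: int, ch: str) -> float:
    return _table.get((i, j, ch), 0.1)


ranking = rank_lexicon(4, ["cot", "cat"], toy_score)
```

Here "cat" is ranked above "cot" because the grouping c = piece 0, a = pieces 1 and 2, t = piece 3 yields the higher average confidence.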

12.
13.
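Because handwritten Arabic words are written cursively and many of them look alike, this paper proposes a new offline handwritten text recognition algorithm. The algorithm decomposes Arabic words into fixed components and builds a weighted Bayesian inference model from component features to word classes. Combining word component segmentation, multi-level hybrid component recognition, and estimation of component weighting coefficients, the algorithm computes the posterior probability of each word class and obtains the word recognition result. Experiments on the IFN/ENIT database achieved a word recognition rate of 90.03%, confirming that component decomposition is robust to connected strokes and that component recognition improves discrimination between similar words. The algorithm also requires few training classes and extends easily to large-vocabulary recognition.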

14.
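The segmentation of handwritten numeral strings is closely tied to character recognition. A recognition-based segmentation method is adopted: a recognition mechanism is introduced into the segmentation process to recognize the segmented fragments, the recognition results are passed through a difference operation and used as the recognition confidence of each object, and dynamic programming is used to find the best segmentation path. When training the classifier, negative samples are used to estimate the classifier parameters, yielding a classifier with good performance. Experimental data show that a classifier trained on both positive and negative samples achieves a much higher recognition rate than one trained on positive samples only.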

15.
In this paper we propose a novel character recognition method for Bangla compound characters. Accurate recognition of compound characters is a difficult problem due to their complex shapes. Our strategy is to decompose a compound character into skeletal segments. The compound character is then recognized by extracting the convex shape primitives and using a template matching scheme. The novelty of our approach lies in the formulation of appropriate rules of character decomposition for segmenting the character skeleton into stroke segments and then grouping them for extraction of meaningful shape components. Our technique is applicable to both printed and handwritten characters. The proposed method performs well for complex-shaped compound characters, which were confusing to the existing methods.

16.
17.
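To obtain stroke information from offline handwritten Chinese characters effectively, a process neural network is used to extract the basic stroke segments of handwritten Chinese characters, and the topological properties among the types of stroke segments are analyzed. The handwritten character image is converted into a geometric figure composed of six Chinese stroke types at different positions, in a fault-tolerant representation. Imitating human shape-code input methods for Chinese characters, the numbers and positions of stroke types with redundant, fault-tolerant shapes and of junction and intersection points are counted to build a multidimensional feature knowledge table for handwritten Chinese characters, and characters are recognized in a human-like, fault-tolerant manner through comparison and judgment. Simulation experiments were carried out on characters from the SCUT-IRAC handwritten Chinese character database, and the method shows a strong ability to "recognize" handwritten Chinese characters.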

18.
In the standard segmentation-based approach to handwritten word recognition, individual character-class confidence scores are combined via averaging to estimate confidences in the hypothesized identities for a word. We describe a methodology for generating optimal linear combinations of order statistics operators for combining character class confidence scores. Experimental results are provided on over 1000 word images.
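
To make the contrast with plain averaging concrete, the sketch below applies a linear combination of order statistics to a list of character confidences: the scores are sorted, and the weights attach to rank positions instead of to particular characters. The function name `los_combine`, the weight vectors, and the scores are illustrative assumptions; the abstract does not describe how the optimal weights are generated.

```python
from typing import Sequence


def los_combine(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the scores after sorting them in descending order."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per rank position")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


# Hypothetical character confidences for a three-letter word hypothesis.
char_scores = [0.2, 0.9, 0.4]

average = los_combine(char_scores, [1 / 3, 1 / 3, 1 / 3])  # plain averaging
weakest = los_combine(char_scores, [0.0, 0.0, 1.0])        # minimum
strongest = los_combine(char_scores, [1.0, 0.0, 0.0])      # maximum
```

Uniform weights reproduce the averaging baseline, while weights concentrated on the last rank make the word confidence depend on its weakest character.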

19.
20.
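Writing order recovery is the process of extracting dynamic character writing-order information from a static text image, converting the two-dimensional image into a one-dimensional time series of writing positions. To extract the writing order of handwritten Chinese characters, a writing order recovery model for offline handwritten Chinese characters is proposed. The model first divides a Chinese character into four levels: whole character, component, sub-component, and stroke. It then uses four splitting operations to split the whole character into components and the components into sub-components. Finally, it obtains a total order over the sub-components by defining a set of correspondence rules between splitting relations and the partial-order relations among sub-components. The sub-component is the most basic recovery unit, and its writing order can be obtained by classifying strokes and pairs of crossing strokes. Experiments show that the recovery results of the proposed method have high accuracy, with a processing speed of 6.9 characters/s.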
