Similar literature
 Found 20 similar documents; search took 484 ms
1.
Analysing online handwritten notes is a challenging problem because of the content heterogeneity and the lack of prior knowledge, as users are free to compose documents that mix text, drawings, tables or diagrams. Separating text from non-text strokes is crucial for the automated interpretation and indexing of these documents, but solving this problem requires careful modelling of contextual information, such as the spatial and temporal relationships between strokes. In this work, we present a comprehensive study of contextual information modelling for text/non-text stroke classification in online handwritten documents. Formulating the problem with a conditional random field makes it possible to integrate and combine multiple sources of context, such as several types of spatial and temporal interactions. Experimental results on a publicly available database of freely hand-drawn documents demonstrate the superiority of our approach and the benefit of combining contextual information for text/non-text classification.

2.
3.
Automatic character recognition and image understanding of a given paper document are the main objectives of the computer vision field. For these problems, a basic step is to isolate characters and group words from these isolated characters. In this paper, we propose a new method for extracting characters from a mixed text/graphic machine-printed document and an algorithm for distinguishing words from the isolated characters. For extracting characters, we exploit several features (size, elongation, and density) of characters and propose a characteristic value for classification using the run-length frequency of the image component. In the context of word grouping, previous works have largely been concerned with words that are placed on a horizontal or vertical line. Our word grouping algorithm can group words that lie on inclined lines, intersecting lines, and even curved lines. To do this, we introduce the 3D neighborhood graph model, which is useful and efficient for character classification and word grouping. In the 3D neighborhood graph model, each connected component of a text image segment is mapped onto 3D space according to the area of the bounding box and positional information from the document. We conducted tests with more than 20 English documents and more than ten oriental documents scanned from books, brochures, and magazines. Experimental results show that more than 95% of words are successfully extracted from general documents, even in very complicated oriental documents. Received August 3, 2001 / Accepted August 8, 2001
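
The abstract names three per-component features (size, elongation, density) without defining them. The following is a minimal sketch under stated assumptions: size is taken as the bounding-box area, elongation as the ratio of the shorter to the longer bounding-box side, and density as the fraction of the bounding box covered by foreground pixels. These definitions and the function name `component_features` are illustrative, not the paper's.

```python
def component_features(pixels):
    """Compute illustrative shape features for one connected component.

    pixels: list of (row, col) coordinates of the component's foreground pixels.
    The three definitions below are assumptions, not the paper's formulas.
    """
    if not pixels:
        raise ValueError("component has no pixels")
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    box_area = height * width
    return {
        "size": box_area,                                       # bounding-box area
        "elongation": min(height, width) / max(height, width),  # 1.0 means square
        "density": len(pixels) / box_area,                      # ink coverage of the box
    }


# An L-shaped component with four pixels inside a 2 x 3 bounding box.
features = component_features([(0, 0), (0, 1), (0, 2), (1, 0)])
```

A classifier could then threshold or learn over such features to separate character-like components from graphics; how the paper combines them with its run-length characteristic value is not stated in the abstract.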

4.
The paper provides a practical solution to a real-time text/shape differentiation problem for online handwriting input. The proposed classification system comprises stroke grouping and stroke classification blocks. A new set of features with low computational complexity is derived. The method achieves 98.5% text/shape classification accuracy on a benchmark dataset. The proposed machine-learning approach to stroke grouping improves classification robustness across different input styles. In contrast to threshold-based techniques, this grouping adaptation enhances the overall discriminating accuracy of the text/shape recognition system by 11.3%. The solution improves the system's response on a touch-screen device.

5.
Offline handwritten Amharic word recognition   Total citations: 1, self-citations: 0, citations by others: 1
This paper describes two approaches to Amharic word recognition in unconstrained handwritten text using HMMs. The first approach builds word models from the concatenated features of constituent characters; in the second, HMMs of constituent characters are concatenated to form the word model. In both cases, the features used for training and recognition are a set of primitive strokes and their spatial relationships. The recognition system does not require character segmentation, but it does require text line detection and extraction of structural features, which is done using the direction field tensor. The system is tested on a dataset of unconstrained handwritten documents collected from various sources, and promising results are obtained.

6.
Chinese text location under complex background using Gabor filter and SVM   Total citations: 1, self-citations: 0, citations by others: 1
For Chinese text location under complex backgrounds, this paper presents a novel method that combines Gabor filters and a support vector machine (SVM). It is based on the fact that Chinese characters are composed of four kinds of strokes. By extracting four kinds of stroke features with Gabor filters, the Chinese text location problem can be transformed into a texture classification problem, which an SVM classifier can solve. The proposed method has two phases. First, Gabor filters with different scales and orientations are employed to obtain four texture images representing the strokes of Chinese text in the horizontal line, top-down vertical line, left-downward slope line and short pausing stroke directions. Then the text regions and background regions in the four texture images are used to train four SVM classifiers, one per direction. These are integrated into an SVM classification network, and the sum of the weights determines whether a block is a text region. Experiments are conducted on a large number of typical images with different texts and fonts. Compared with some existing methods, the proposed approach achieves better results for Chinese text location.

7.
This paper describes a new method for recognizing overtraced strokes as 2D geometric primitives, which are further interpreted as 2D line drawings. The method supports rapid grouping and fitting of overtraced polylines or conic curves based on the characteristics assigned to each stroke during preprocessing. The orientation and endpoints of a classified stroke are used in the stroke grouping process. The grouped strokes are then fitted with 2D geometry. The method can deal with overtraced sketch strokes in both solid and dashed line styles, fit grouped polylines as a whole polyline, and fit conic strokes without computing the direction of a stroke. It avoids losing joint information through segmentation of a polyline into line segments. The proposed method has been tested with our freehand sketch recognition system (FSR), which is robust and easier to use because it removes some limitations of most existing sketching systems, which accept only non-overtraced strokes. The test results showed that the proposed method can support freehand-sketching-based conceptual design with no limitations on drawing sequence, direction or overtracing, while achieving a satisfactory interpretation rate.

8.
In this paper, we present a new text line detection method for handwritten documents. The proposed technique consists of three distinct steps. The first step includes image binarization and enhancement, connected component extraction, partitioning of the connected component domain into three spatial sub-domains, and average character height estimation. In the second step, a block-based Hough transform is used to detect potential text lines. The third step corrects possible splitting, detects text lines that the previous step did not reveal and, finally, separates vertically connected characters and assigns them to text lines. The performance evaluation of the proposed approach is based on a consistent and concrete evaluation methodology.

9.
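The split-and-merge algorithm is a region-based serial image segmentation algorithm. In the splitting stage, Morton codes are introduced for the representation of image regions, which reduces the space complexity of the algorithm. In the stage that merges adjacent regions, a new merging criterion is proposed that increases the matching rate of adjacent regions satisfying the merge condition, reduces the number of iterations, and improves execution efficiency. Finally, experimental data are presented and analyzed, demonstrating the effectiveness of the algorithm.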
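
The abstract does not say how Morton codes are built or stored. As an illustration only, the sketch below shows the standard definition of a 2D Morton (Z-order) code, which interleaves the bits of a block's column and row indices so that a quadtree block can be identified by a single integer instead of an explicit tree node. The function names and the choice of placing `x` bits in the even positions are assumptions.

```python
def morton_encode(x, y, bits=16):
    """Interleave the low `bits` bits of x and y into one integer.

    Bit i of x goes to position 2*i, bit i of y to position 2*i + 1.
    """
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def morton_decode(code, bits=16):
    """Recover (x, y) from a code produced by morton_encode."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y


# Block at column 3, row 5: x = 0b011, y = 0b101 -> interleaved 0b100111 = 39.
code = morton_encode(3, 5)
```

Because the four children of a quadtree block share the same high-order code bits, blocks that are siblings in the split stage sit next to each other in code order, which is one common reason such codes save space in split-and-merge implementations.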

10.
Text extraction in mixed-type documents is a necessary pre-processing stage for many document applications. In mixed-type color documents, text, drawings and graphics appear in millions of different colors. In many cases, text regions are overlaid onto drawings or graphics. In this paper, a new method to automatically detect and extract text in mixed-type color documents is presented. The proposed method is based on a combination of an adaptive color reduction (ACR) technique and a page layout analysis (PLA) approach. The ACR technique is used to obtain the optimal number of colors and to convert the document to its principal colors. Then, using the principal colors, the document image is split into separate color planes. Thus, binary images are obtained, each one corresponding to a principal color. The PLA technique is applied independently to each of the color planes and identifies the text regions. A merging procedure is applied in the final stage to merge the text regions derived from the color planes and to produce the final document. Several experimental and comparative results, exhibiting the performance of the proposed technique, are also presented.

11.
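To address missed and false detections in natural scene text detection caused by uneven illumination and complex backgrounds, a natural scene text detection method based on stroke angle transitions and stroke width features is proposed. Analysis shows that, compared with non-text, text has a more stable number of stroke angle transitions and a more stable stroke width. Two features are proposed for these two properties: the number of reflex/minor angle transitions along the outer stroke boundary, and the enhanced stroke support pixel area ratio. The former counts angle transitions along the outer stroke contour segment by segment; the latter computes the proportion of the total stroke area occupied by regions of stable stroke width. They reflect the stability of stroke angle and of stroke width, respectively. To reduce the miss rate, multi-channel maximally stable extremal regions (MSER) detection is used and all candidate regions are merged; stroke features and texture features are extracted from the candidate regions, and a support vector machine classifies text and non-text regions. On the ICDAR2015 dataset, the algorithm achieves a precision of 79.3% and a recall of 72.8%, and it alleviates, to some extent, the problems of uneven illumination and complex backgrounds.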
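
The abstract reports precision and recall but no combined score. As a reading aid, the balanced F-measure (the harmonic mean of the two) can be computed from the reported values. The resulting figure, roughly 0.759, is derived here and is not a number reported by the authors; the function name `f_measure` is illustrative.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta score)."""
    if precision <= 0.0 or recall <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Precision and recall as stated in the abstract for ICDAR2015.
f1 = f_measure(0.793, 0.728)
```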

12.
The extraction of component line segments and circular arcs from freehand strokes, along with their relations, is a prerequisite for sketch understanding. Existing approaches usually take three stages to segment a stroke: first identifying segmentation points, then classifying the substroke between each pair of adjacent segmentation points, and, finally, obtaining graphical representations of substrokes by fitting graphical primitives to them. Since a stroke inevitably contains noise, the first stage may produce wrong or inaccurate segmentation points, resulting in wrong substroke classification in the second stage and inaccurately fitted parameters in the third stage. To overcome the noise sensitivity of the three-stage method, the segmental homogeneity feature is emphasized in this paper. We propose a novel approach, which first extracts graphical primitives from a stroke by growing a connected segment from a seed segment and then utilizes relationships between the primitives to refine their control parameters. We have conducted experiments using real-life strokes and compared the proposed approach with others. Experimental results demonstrate that the proposed approach is effective and robust.

13.
14.
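Perception-based extraction of multi-directional text lines from online handwriting   Total citations: 1, self-citations: 0, citations by others: 1
A method for extracting multi-directional handwritten text lines is proposed. Grounded in visual perception theory, the method takes a bottom-up strategy: strokes are first grouped into character-like stroke blocks, a link model is then built over these stroke blocks, and finally a branch-and-bound search finds the optimal line arrangement in the link model. Experimental results show that the method effectively extracts multi-directional text line structure and is also applicable to curved text lines.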

15.
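To address line segmentation in handwritten Uyghur text, the characters in the image are divided into three classes by connected-component size, and an adaptive smearing-and-thinning algorithm is proposed to locate the main text lines. Touching characters between two adjacent text lines in the third class of connected components are then cut apart. In addition, a neighborhood search within the range of the centroid assigns the remaining strokes to text lines. Experimental results show that the method segments better than the common horizontal projection, piecewise projection, and smearing methods.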
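
For context, the horizontal projection method that the abstract uses as a baseline can be sketched in a few lines. This is the baseline only, not the proposed adaptive smearing-and-thinning algorithm; the binary image layout (a list of rows, 1 for ink) and the `min_ink` parameter are assumptions made for illustration.

```python
def horizontal_projection_lines(image, min_ink=1):
    """Split a binary image into text lines using its horizontal projection.

    image: list of rows, each a list of 0/1 values (1 = ink).
    Returns a list of (first_row, last_row) spans, inclusive, covering
    consecutive rows whose ink count is at least min_ink.
    """
    lines = []
    start = None
    for r, row in enumerate(image):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = r                      # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, r - 1))   # a blank row closes the line
            start = None
    if start is not None:                  # line running to the bottom edge
        lines.append((start, len(image) - 1))
    return lines


page = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
]
spans = horizontal_projection_lines(page)
```

The baseline fails as soon as adjacent lines touch or overlap vertically, since no blank row separates them; that is the situation the abstract's cutting of touching characters targets.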

16.
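Column segmentation and character segmentation methods suited to ancient Chinese documents are proposed, based on the characteristics of such documents. The column segmentation method analyzes the stroke projection of the document directly, using a recursive segmentation algorithm based on hierarchical projection filtering and a variable-length gap threshold. The algorithm extracts columns accurately even when the column spacing is small, columns touch ruling lines, or the document is somewhat skewed, and it performs especially well on short columns. The character segmentation method has two steps: a coarse segmentation determines approximate cut positions, and a fine segmentation based on connected-component analysis and touching-point detection follows. Even for columns with many touching and overlapping characters, the algorithm segments complete individual characters well. Experimental results show that the proposed methods achieve good results on the segmentation of ancient Chinese documents.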

17.
This paper proposes a new two-phase approach to robust text detection that integrates visual appearance and geometric reasoning rules. In the first phase, geometric rules are used to achieve a higher recall rate. Specifically, a robust stroke width transform (RSWT) feature is proposed to better recover the stroke width by additionally considering the crossing of two strokes and the continuity of the letter border. In the second phase, a classification scheme based on visual appearance features is used to reject false alarms while maintaining the recall rate. To learn a better classifier from multiple visual appearance features, a novel classification method called double soft multiple kernel learning (DS-MKL) is proposed. DS-MKL is motivated by a novel kernel margin perspective for multiple kernel learning and can effectively suppress the influence of noisy base kernels. Comprehensive experiments on the benchmark ICDAR2005 competition dataset demonstrate the effectiveness of the proposed two-phase text detection approach over state-of-the-art approaches, with a performance gain of up to 4.4% in terms of F-measure.

18.
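Objective: To meet diverse needs, such as helping badminton coaches analyze player actions in singles videos and letting users watch highlight reels of each type of stroke, a method is proposed for temporally localizing and classifying the actions of the player in control of the shuttle in extracted badminton video clips. Method: In each clip, the player's racket arm is detected with a pose estimation method, the temporal extent of each stroke is localized from the variation in arm swing amplitude, and meta-videos are generated from the localization results. A channel-spatial attention mechanism is introduced into the temporal segment network, which is trained to classify badminton actions into four common types: forehand, backhand, overhead, and lift. Overhead strokes are further identified as clears or smashes using morphological image processing. Results: Experiments show an intersection over union (IoU) of 82.6% for temporal localization of actions in badminton video clips, an AUC (area under curve) above 0.98 for every action class, and an average recall and average precision of 91.2% and 91.6%, respectively. The method effectively localizes and classifies strokes in badminton video clips and recognizes badminton actions well. Conclusion: The proposed action recognition method for badminton video clips handles both temporal localization and action classification, makes the recognition process more intelligent, and offers significant application value for sports video analysis.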
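
The abstract evaluates temporal localization with intersection over union (IoU). A minimal sketch of the usual one-dimensional definition follows, assuming each action is a `(start, end)` interval in seconds or frames; the abstract does not spell out its exact evaluation protocol, so the interval format and the function name `temporal_iou` are assumptions.

```python
def temporal_iou(predicted, truth):
    """IoU of two time intervals, each given as (start, end) with start <= end."""
    overlap = min(predicted[1], truth[1]) - max(predicted[0], truth[0])
    intersection = max(0.0, overlap)        # disjoint intervals overlap by 0
    union = (predicted[1] - predicted[0]) + (truth[1] - truth[0]) - intersection
    return intersection / union if union > 0 else 0.0


# A predicted stroke from 2 s to 6 s against a labeled stroke from 4 s to 8 s:
# they share 2 s out of a combined 6 s.
score = temporal_iou((2.0, 6.0), (4.0, 8.0))
```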

19.
Searching for similar documents plays an important role in text mining and document management. In similar document search, as in other text mining applications, the focus is generally on document classification: determining the class or category to which a document belongs. The present study investigates the case in which documents belong to more than one category. The system used is a similar document search system based on fuzzy clustering, which accounts for documents that belong to more than one category. The proposed approach solves the multi-category problem in two stages. The first stage finds the documents that belong to more than one category. The second stage determines the categories to which these documents belong. For these two stages, the -threshold Fuzzy Similarity Classification Method (-FSCM) and the Multiple Categories Vector Method (MCVM) are proposed, respectively. Experimental results show that the proposed system efficiently distinguishes documents that belong to more than one category. In determining which documents belong to which classes, the proposed system outperforms the traditional approach.

20.
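Many current text classification methods rarely control for confounding variables, and their classification accuracy is not robust to changes in the data distribution. To address this, a text classification method based on covariate adjustment is proposed. First, the confounder (variable) in text classification is assumed to be observable during training but not during testing. Then, the model is conditioned on the confounder during training, and the confounder is summed out at prediction time. Finally, based on Pearl's covariate adjustment, the confounder is controlled in order to observe how text features and the class variable affect classifier accuracy. The method is validated on a Weibo dataset and the IMDB dataset. Experimental results show that, compared with other methods, the proposed method achieves higher classification accuracy when handling confounded relationships and is robust to confounding variables.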
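
The "condition during training, sum out at prediction" step corresponds to Pearl's back-door adjustment, p(y | do(x)) = Σ_z p(y | x, z) p(z). The sketch below shows that sum for a discrete confounder, assuming the conditional probabilities come from a classifier trained with the confounder as an extra input and that p(z) is estimated from the training set. The dictionary layout, the toy numbers, and the function name `backdoor_adjust` are hypothetical, not taken from the paper.

```python
def backdoor_adjust(p_y_given_xz, p_z, x):
    """Back-door adjusted probability of the positive class for input x.

    p_y_given_xz: dict mapping (x, z) to p(y = 1 | x, z), e.g. outputs of a
                  classifier trained with the confounder z as a feature.
    p_z:          dict mapping each confounder value z to its marginal p(z).
    """
    return sum(p_y_given_xz[(x, z)] * weight for z, weight in p_z.items())


# Toy numbers: a binary confounder with p(z=0) = 0.7 and p(z=1) = 0.3.
p_z = {0: 0.7, 1: 0.3}
p_y_given_xz = {("doc", 0): 0.9, ("doc", 1): 0.2}
adjusted = backdoor_adjust(p_y_given_xz, p_z, "doc")  # 0.9*0.7 + 0.2*0.3
```

Weighting by the marginal p(z) rather than by p(z | x) is what removes the dependence on how the confounder happens to correlate with the text in a given dataset.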
