
Scene Text Detection and Recognition Based on Deep Learning (基于深度学习的场景文本检测与识别)
Cite this article: GONG Fa-Ming, LIU Fang-Hua, LI Jue-Jin, GONG Wen-Juan. Scene Text Detection and Recognition Based on Deep Learning[J]. Computer Systems & Applications (计算机系统应用), 2021, 30(8): 179-185.
Authors: GONG Fa-Ming (宫法明)  LIU Fang-Hua (刘芳华)  LI Jue-Jin (李厥瑾)  GONG Wen-Juan (宫文娟)
Affiliation: College of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580, China; Academic Affairs Office, Shandong College of Electronic Technology, Jinan 250200, China
Funding: Special Project on Innovation Methods, Ministry of Science and Technology of China (2015IM010300)
Received: 2020/11/19
Revised: 2020/12/21
Abstract: This study proposes a new method for text detection and recognition in complex scenes, addressing the complicated pipelines, poor adaptability, and low accuracy of existing approaches. The method consists of a text region detection network and a text recognition network. The detection network is an improved PSENet: its backbone is replaced with ResNeXt-101, and a differentiable binarization operation is added during feature extraction to optimize the segmentation network, which both simplifies post-processing and improves detection performance. The recognition network combines a convolutional neural network with a long short-term memory network trained with an aggregate cross-entropy loss, and introducing this loss improves recognition accuracy. Experiments on two datasets show that the two networks combined reach an accuracy of up to 95.6%, which is better than the methods before the improvement. The method can effectively detect and recognize arbitrary text instances and is practical to deploy.
Keywords: differentiable binarization  aggregate cross-entropy  text detection  text recognition
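
The abstract names differentiable binarization and aggregate cross-entropy but does not give their formulas. The sketch below follows the definitions commonly used for these two techniques and is not taken from the paper itself: the steepness factor `k = 50`, the blank index `0`, the small `eps` term, and both function names are assumptions made for illustration. Differentiable binarization replaces a hard threshold with a steep sigmoid of the difference between a probability map P and a threshold map T. Aggregate cross-entropy compares per-class prediction mass, summed over time steps, against the character counts of the label, so no per-step alignment is needed.

```python
import math
from collections import Counter


def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Approximate binary map B = 1 / (1 + exp(-k * (P - T))), element-wise.

    prob_map and thresh_map are equally sized 2-D lists with values in [0, 1].
    k controls the steepness of the sigmoid (assumed value, not from the paper).
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


def ace_loss(step_probs, label, blank=0, eps=1e-10):
    """Aggregate cross-entropy between predicted and annotated class counts.

    step_probs: T x C list, one probability distribution per time step.
    label: list of class indices for the characters in the text (no blanks).
    The blank class is assumed to fill the T - len(label) remaining steps.
    """
    num_steps = len(step_probs)
    num_classes = len(step_probs[0])
    counts = Counter(label)
    counts[blank] = num_steps - len(label)

    loss = 0.0
    for cls in range(num_classes):
        # Normalized predicted mass and normalized target count for this class.
        predicted = sum(step[cls] for step in step_probs) / num_steps
        target = counts.get(cls, 0) / num_steps
        if target > 0:
            loss -= target * math.log(predicted + eps)
    return loss


if __name__ == "__main__":
    binary = differentiable_binarization([[0.9, 0.1]], [[0.5, 0.5]])
    print(binary)  # first value close to 1, second close to 0

    good = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]  # predicts classes 1, 2, blank
    bad = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]   # predicts blank at every step
    print(ace_loss(good, [1, 2]), ace_loss(bad, [1, 2]))
```

Because the loss only aggregates over time steps, reordering the rows of `step_probs` leaves it unchanged. This is the property that lets it supervise a sequence recognizer from character counts alone. The hyperparameters and exact variants the authors used are not stated in this record.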
This article is indexed in Wanfang Data (万方数据) and other databases.