
Resume Information Extraction Based on Cascaded Double-layer Text Classification

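Cite this article:	YU Kun, GUAN Gang, ZHOU Ming, WANG Xu-fa, CAI Qing-sheng. Resume Information Extraction Based on Cascaded Double-layer Text Classification [J]. Journal of Chinese Information Processing (中文信息学报), 2006, 20(1): 61-68.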

Authors:	YU Kun (于琨), GUAN Gang (管刚), ZHOU Ming (周明), WANG Xu-fa (王煦法), CAI Qing-sheng (蔡庆生)

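Affiliations:	1. Department of Computer Science and Technology, University of Science and Technology of China; 2. Microsoft Research Asia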

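Abstract:	This paper proposes a method based on cascaded double-layer text classification for the automatic extraction of resume information. The method decomposes resume text into text blocks and text strings, and divides the information contained in a resume into general information and detailed information. It first segments and classifies the text blocks in the resume to extract general information. It then selects the blocks that may contain detailed information, segments them into strings, and extracts the detailed information by classifying those strings. Experimental results on 1200 Chinese resumes show that the method is suitable for the automatic extraction and management of resume information.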
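Keywords:	computer application; Chinese information processing; information extraction; text classification; resume management
Article ID:	1003-0077(2006)01-0059-08
Received:	2004-09-21
Revised:	2005-09-28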
Resume Information Extraction Based on Cascaded Double-layer Classification

YU Kun, GUAN Gang, ZHOU Ming, WANG Xu-fa, CAI Qing-sheng. Resume Information Extraction Based on Cascaded Double-layer Classification [J]. Journal of Chinese Information Processing, 2006, 20(1): 61-68.

Authors:	YU Kun, GUAN Gang, ZHOU Ming, WANG Xu-fa, CAI Qing-sheng

Affiliation:	1. Dept. of Computer Science and Technology, USTC; 2. Microsoft Research Asia

Abstract:	This paper presents an approach based on cascaded double-layer text classification for resume information extraction. The approach divides a resume into blocks and strings, and divides the target information into general information and detailed information. It first extracts general information by block segmentation and classification. It then uses a fuzzy strategy to select the blocks that may contain predefined detailed information. Finally, it segments these blocks into strings and labels the strings with detailed information classes. Experimental results on 1200 Chinese resumes show that the approach is suitable for the information extraction and management of resumes.

Keywords:	computer application; Chinese information processing; information extraction; text classification; resume management
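
The two-layer cascade described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption for illustration only: the label names, the keyword cues in `BLOCK_CUES`, the regular expressions in `DETAIL_PATTERNS`, and the blank-line and comma segmentation rules are hypothetical stand-ins, since the abstract does not specify the classifiers, the class sets, or the fuzzy block-selection strategy. The sketch simply passes a block to the second layer when its top-scoring class has detail patterns defined.

```python
import re

# Hypothetical general-information classes and keyword cues (not from the paper).
BLOCK_CUES = {
    "personal": ("name", "phone", "email"),
    "education": ("university", "degree", "bachelor", "master"),
    "experience": ("company", "engineer", "worked"),
}

# Hypothetical detailed-information classes, keyed by the block class
# whose blocks are passed on to the second layer.
DETAIL_PATTERNS = {
    "personal": {
        "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
        "phone": re.compile(r"\+?\d[\d -]{6,}\d"),
    },
    "education": {
        "degree": re.compile(r"\b(?:bachelor|master|phd)\b", re.I),
    },
}


def segment_blocks(text):
    """Layer 1 segmentation: split the resume into blocks on blank lines."""
    return [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]


def classify_block(block):
    """Layer 1 classification: pick the class with the most keyword hits."""
    lowered = block.lower()
    scores = {
        label: sum(cue in lowered for cue in cues)
        for label, cues in BLOCK_CUES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


def segment_strings(block):
    """Layer 2 segmentation: split a block into strings on newlines, commas, semicolons."""
    return [s.strip() for s in re.split(r"[\n,;]", block) if s.strip()]


def classify_string(string, block_label):
    """Layer 2 classification: first matching detail pattern, or None."""
    for label, pattern in DETAIL_PATTERNS.get(block_label, {}).items():
        if pattern.search(string):
            return label
    return None


def extract(text):
    """Run both layers; return (general, detailed) lists of (label, text) pairs."""
    general, detailed = [], []
    for block in segment_blocks(text):
        block_label = classify_block(block)
        general.append((block_label, block))
        # Only blocks whose class may carry detailed information go on.
        if block_label in DETAIL_PATTERNS:
            for string in segment_strings(block):
                detail_label = classify_string(string, block_label)
                if detail_label is not None:
                    detailed.append((detail_label, string))
    return general, detailed
```

Calling `extract(resume_text)` returns the block-level labels alongside the string-level labels, so the second layer only ever sees strings from blocks that the first layer selected. A real system would replace the keyword and regex rules with trained classifiers.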
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.