Multi-Strategy Biomedical Named Entity Recognition Based on CRFs |
| |
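Cite this article: | MA Rui-min, MA Min-yan. Multi-strategy biomedical named entity recognition based on CRFs [J]. 齐齐哈尔轻工业学院学报 (Journal of Qiqihar Institute of Light Industry), 2011(1): 39-42. |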
| |
Authors: | MA Rui-min, MA Min-yan |
| |
Affiliation: | Database Theory and Technology Lab, Northeast Petroleum University, Daqing, Heilongjiang 163318, China |
| |
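Abstract: | Biomedical named entity recognition is a fundamental task in biomedical text mining. Machine learning is the mainstream approach to biomedical named entity recognition, and choosing an effective learning algorithm and effective recognition strategies is the key to improving recognition performance. Given the strengths of conditional random fields (CRFs) in natural language processing, this paper adopts CRFs and combines them with multiple recognition strategies to study biomedical named entity recognition. The experiments achieved good results: the F-measure reached 70.52%, a clear improvement in recognition performance over other related systems. |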
Keywords: | biomedical named entity recognition; feature extraction; abbreviation recognition; conditional random fields |
Bio-entity recognition based on CRFs of Multi-strategy |
| |
Authors: | MA Rui-min MA Min-yan |
| |
Affiliation: | Database Theory and Technology Lab, Northeast Petroleum University, Daqing, Heilongjiang 163318, China |
| |
Abstract: | Bio-entity recognition is a basic task in biomedical text mining. Machine learning is an important approach to bio-entity recognition, and selecting an effective machine learning algorithm and effective post-processing strategies is the key to improving recognition performance. A key advantage of CRFs is their great flexibility in integrating a wide variety of arbitrary, non-independent features of the input. In this paper, we present a multi-strategy bio-entity recognition system based on CRFs. Evaluation of the system shows that the feature selection and post-processing we explored contribute substantially to its performance. |
| |
Keywords: | bio-entity recognition; feature extraction; abbreviation recognition; conditional random fields |
This article is indexed in 维普 (VIP) and other databases. |