
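A Term Extraction Method Based on TValue Fused with Fieldhood
Cite this article: YANG Yana, LIU Shengqi. A Term Extraction Method Based on TValue Fused with Fieldhood [J]. 情报工程, 2015, 1(5): 025-031.
Authors: YANG Yana, LIU Shengqi
Affiliations: Postal Savings Bank of China; China Patent Information Center
Abstract: This paper proposes the ATValue (Advanced TValue and Fieldhood Integration) term extraction method. To improve the quality of term extraction, it adds a fieldhood measure to the five attributes of TValue. Correlation analysis yields AValue, the combined value of the six attributes, and candidate terms are selected by identifying the strings whose AValue exceeds the term confidence threshold. Experimental results on the energy industry show that the F-score of ATValue is about 2 percentage points higher than that of TValue, because the fieldhood in ATValue measures how much each kind of word in a string contributes to the domain.

Keywords: term extraction, term recognition, data mining, fieldhood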

Automatic Term Extraction Based on Advanced TValue and Fieldhood Integration
Authors:YANG Yana and LIU Shengqi
Affiliation:Postal Savings Bank of China and China Patent Information Center
Abstract: This paper proposes ATValue (Advanced TValue and Fieldhood Integration), an automatic term extraction method. To improve extraction quality, it adds a fieldhood measure to the five attributes of TValue. After the correlations among the six attributes of a string are analyzed, the attributes are combined into a single value, AValue, by multiplication of probabilities. Strings whose AValue exceeds a predefined confidence threshold are selected as candidate terms. Experimental results for term extraction in the energy industry show that the F-score of ATValue is about 2 percentage points higher than that of TValue, because the fieldhood measure in ATValue scores how much the words in a compound string contribute to the domain.
Keywords:Term Extraction  Term Recognition  Data Mining  Fieldhood
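
The selection step described in the abstract can be illustrated with a minimal sketch. It assumes each string already has six attribute scores in [0, 1] (the five TValue attributes followed by fieldhood) and that AValue is their plain product. The abstract does not name the five TValue attributes or say how the correlation analysis changes the combination, so the function names, the score layout, and the threshold value below are all hypothetical.

```python
from math import prod


def avalue(scores):
    """Combine six attribute scores by multiplication (assumed combination rule).

    `scores` is a sequence of six numbers in [0, 1]: five TValue attributes
    followed by fieldhood. This layout is an assumption, not the paper's.
    """
    if len(scores) != 6:
        raise ValueError("expected six attribute scores")
    if any(not 0.0 <= s <= 1.0 for s in scores):
        raise ValueError("each score must lie in [0, 1]")
    return prod(scores)


def select_candidates(strings, threshold):
    """Return the strings whose AValue exceeds the confidence threshold."""
    return [s for s, scores in strings.items() if avalue(scores) > threshold]


def f_score(precision, recall):
    """Balanced F-score (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative, made-up scores: the second string has low fieldhood.
example = {
    "wind turbine": [0.9, 0.9, 0.9, 0.9, 0.9, 0.9],
    "next page": [0.9, 0.9, 0.9, 0.9, 0.9, 0.1],
}
candidates = select_candidates(example, threshold=0.3)
```

Because the scores are multiplied, one low attribute such as fieldhood is enough to pull a string below the threshold, which matches the abstract's explanation of why adding fieldhood helps.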