Fact distribution in Information Extraction |
| |
Authors: | Mark Stevenson |
| |
Affiliation: | (1) Department of Computer Science, University of Sheffield, Regent Court, 211 Portobello Street, S1 4DP Sheffield, UK |
| |
Abstract: | Several recent Information Extraction (IE) systems have been restricted to the identification of facts which are described within
a single sentence. It is not clear what effect this has on the difficulty of the extraction task or how the performance of
systems which consider only single sentences should be compared with those which consider multiple sentences. This paper compares
three IE evaluation corpora, from the Message Understanding Conferences, and finds that a significant proportion of the facts
mentioned therein are not described within a single sentence. Therefore, systems which are evaluated only on facts described
within single sentences are being tested against a limited portion of the relevant information in the text, and it is difficult
to compare their performance with that of other systems. Further analysis demonstrates that anaphora resolution and world knowledge
are required to combine information described across multiple sentences. This result has implications for the development
and evaluation of IE systems.
|
| |
Keywords: | Information Extraction; Evaluation; Message Understanding Conferences |
This article is indexed in SpringerLink and other databases. |
|