The Automatic Content Extraction (ACE) Program - Tasks, Data, and Evaluation.

LREC (2004)


Abstract

The objective of the ACE program is to develop technology to automatically infer from human language data the entities being mentioned, the relations among these entities that are directly expressed, and the events in which these entities participate. Data sources include audio and image data in addition to pure text, and Arabic and Chine...

Introduction
  • Today’s global web of electronic information, most notably the World Wide Web, provides a resource of unbounded information-bearing potential.
  • The program's research tasks were identified in general as the extraction of the entities, relations and events being discussed in the language.
  • Whereas the MUC named-entity task is to identify the words in the text that name an entity, the corresponding ACE task is to identify the entity so named.
  • The ACE research targets, namely entities, relations, and events, are represented in terms of their underlying attributes and constituents.
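
As a rough illustration of that representation, the sketch below models the three target objects as plain Python data classes. The attribute sets follow the ones named on this page (entity type, subtype and class; relation type and subtype; event type and modality; participants with roles). All class and field names, the character offsets, the two-argument relation, and the example strings are assumptions made for illustration, not the program's actual schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mention:
    text: str   # surface string as it appears in the document
    start: int  # hypothetical character offsets
    end: int


@dataclass
class Entity:
    entity_id: str
    type: str
    subtype: str
    entity_class: str
    mentions: List[Mention] = field(default_factory=list)


@dataclass
class Relation:
    type: str
    subtype: str
    arg1: Entity  # assumption: a relation links two entities
    arg2: Entity


@dataclass
class Participant:
    entity: Entity
    role: str  # the role this entity plays in the event


@dataclass
class Event:
    type: str
    modality: str
    participants: List[Participant] = field(default_factory=list)


# Illustrative values only; "GPE" is the one label taken from this page.
place = Entity("E1", "GPE", "example-subtype", "example-class",
               [Mention("the city", 0, 8)])
event = Event("example-event-type", "example-modality",
              [Participant(place, "example-role")])
```

Keeping mentions inside the entity, rather than the other way around, mirrors the idea that the target is the entity itself and the words in the text are only evidence for it.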
Highlights
  • The Automatic Content Extraction program is a “technocentric” research effort, meaning that the emphasis is on developing core enabling technologies rather than solving the application needs that motivate the research
  • The Automatic Content Extraction program attempts to take the task “off the page” in the sense that the research objectives are defined in terms of the target objects rather than in terms of the words in the text
  • There are three primary Automatic Content Extraction annotation tasks corresponding to the three research tasks: Entity Detection and Tracking (EDT), Relation Detection and Characterization (RDC), and Event Detection and Characterization (VDC)
  • Beyond the multiple passes made over all Automatic Content Extraction data, a further 5% to 10% of the data is completely re-annotated from scratch by different annotators
  • Rates of interannotator agreement for Automatic Content Extraction named entities are comparable to rates shown in previous programs like MUC (NIST 1999)
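
The re-annotation step can be pictured as follows. This is a minimal sketch, assuming documents are drawn at random for the second annotation pass and that agreement is summarised as an F1 score over the two annotators' sets of tagged items. Neither the sampling procedure nor the agreement measure is stated on this page, and the function names are hypothetical.

```python
import random
from typing import Hashable, List, Sequence, Set


def sample_for_reannotation(doc_ids: Sequence[str],
                            fraction: float = 0.05,
                            seed: int = 0) -> List[str]:
    """Pick a random 5% to 10% of documents for independent re-annotation."""
    if not 0.05 <= fraction <= 0.10:
        raise ValueError("fraction should lie between 0.05 and 0.10")
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    k = max(1, round(len(doc_ids) * fraction))
    return sorted(rng.sample(list(doc_ids), k))


def agreement_f1(first: Set[Hashable], second: Set[Hashable]) -> float:
    """F1 between two annotators' item sets (an assumed agreement measure)."""
    if not first and not second:
        return 1.0
    shared = len(first & second)
    if shared == 0:
        return 0.0
    precision = shared / len(second)
    recall = shared / len(first)
    return 2 * precision * recall / (precision + recall)
```

Each item in the sets could be, for example, a tuple of document id, span and entity type, so that two annotators agree on an item only when all three match.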
Results
  • Under the ACE (NIST 2003) and DARPA TIDES (TIDES 2004) Programs, the Linguistic Data Consortium at the University of Pennsylvania develops annotation guidelines, corpora and other linguistic resources to support information extraction research (LDC 2004).
  • LDC's ACE annotators tag broadcast transcripts, newswire and newspaper data in English, Chinese and Arabic, producing both training and test data for common research task evaluations.
  • There are three primary ACE annotation tasks corresponding to the three research tasks: Entity Detection and Tracking (EDT), Relation Detection and Characterization (RDC), and Event Detection and Characterization (VDC).
  • During RDC tagging, annotators identify relations that exist between the entities tagged during the EDT task.
  • In VDC, annotators identify and characterize five types of events in which EDT entities participate.
  • In future phases of ACE, annotators will identify additional event types and characterize relations between events.
  • Particular challenges to annotators include the coreference of generic entities and the use of metonymy, characterization of GPEs, distinguishing certain relation types, and identifying implicit vs explicit relations.
  • ACE evaluation requires meaningful and helpful scoring of entities, relations and events.
  • If the output entity is mapped, the minimum value for the system (sys) entity and its corresponding reference (ref) entity is used.
  • Entity_Value is discounted for errors in entity type, subtype and class.
  • If the output relation is mapped, the minimum value for the system (sys) relation and its corresponding reference (ref) relation is used.
  • Relation_Value is discounted for errors in relation type and subtype.
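
The entity scoring rule described above can be sketched as follows for a single mapped pair. This is an illustration under stated assumptions, not the official scorer: the page gives no numeric weights, so the discount factors are hypothetical, as is the choice to apply them multiplicatively and to store each entity as a dictionary with a precomputed "value".

```python
from typing import Dict

# Hypothetical multiplicative discounts, applied when an attribute is wrong.
ATTRIBUTE_DISCOUNT: Dict[str, float] = {
    "type": 0.5,
    "subtype": 0.75,
    "class": 0.75,
}


def entity_value(sys_entity: dict, ref_entity: dict) -> float:
    """Value of a mapped system entity against its reference entity."""
    # Start from the smaller of the two values, as the rule above states.
    value = min(sys_entity["value"], ref_entity["value"])
    # Discount once for each attribute the system got wrong.
    for attribute, discount in ATTRIBUTE_DISCOUNT.items():
        if sys_entity[attribute] != ref_entity[attribute]:
            value *= discount
    return value


ref = {"value": 1.0, "type": "GPE", "subtype": "a", "class": "x"}
sys_exact = {"value": 2.0, "type": "GPE", "subtype": "a", "class": "x"}
sys_wrong_type = {"value": 2.0, "type": "PER", "subtype": "a", "class": "x"}

print(entity_value(sys_exact, ref))       # 1.0
print(entity_value(sys_wrong_type, ref))  # 0.5
```

The same shape would apply to relations, with the discount table restricted to relation type and subtype.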
Conclusion
  • For a system output argument to be reasonably considered to represent its corresponding reference argument, it must exhibit a reasonable overlap with the reference, in terms of Entity_Value.
  • If the output event is mapped, the minimum value for the system (sys) event and its corresponding reference (ref) event is used.
  • Event_Value is discounted for errors in event type and modality.
  • Those event entity mentions that appear in these documents are used to compute Participant_Value.
Tables
  • Table 1: List of Corpora developed for and used to support ACE research
Funding
  • While the ACE program is directed toward extraction of information from audio and image sources in addition to pure text, the research effort is restricted to information extraction from text
  • An ACE event can have a number of participants, and each participant is characterized by a role that it plays in the event
  • The performance measure for all three tasks is formulated in terms of a synthetic application value, where value is accrued by correctly detecting the target objects and correctly recognizing their attributes, and where value is lost by falsely detecting target objects or incorrectly determining attributes of the target objects
  • The mapping of system output mentions to reference mentions is chosen so as to maximize the total value of the mentions
  • If a system output entity is itself unmapped, all of its mentions are unmapped as well
  • The coreference discount is intended to reduce the penalty for mentions that are valid mentions of an entity but that are incorrectly associated at the entity level
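
Choosing the mapping that maximizes total value is an assignment problem. The sketch below solves it by brute force over every one-to-one mapping, which is only practical for a handful of items; a real scorer would more plausibly use an assignment algorithm such as the Hungarian method, though this page does not say which is used. The function names and the character-overlap value function are hypothetical stand-ins.

```python
from itertools import permutations
from typing import Callable, List, Optional, Sequence, Tuple

Span = Tuple[int, int]


def best_mapping(sys_items: Sequence, ref_items: Sequence,
                 value: Callable) -> Tuple[float, List[Optional[int]]]:
    """Return (total value, mapping); mapping[i] is a ref index or None."""
    # Pad with None so that any system item may remain unmapped.
    slots: List[Optional[int]] = list(range(len(ref_items)))
    slots += [None] * len(sys_items)
    best_total, best = float("-inf"), ()
    for candidate in set(permutations(slots, len(sys_items))):
        total = sum(value(sys_items[i], ref_items[j])
                    for i, j in enumerate(candidate) if j is not None)
        if total > best_total:
            best_total, best = total, candidate
    return best_total, list(best)


def overlap(a: Span, b: Span) -> int:
    """Hypothetical mention value: number of shared characters."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


sys_mentions = [(0, 5), (10, 15)]
ref_mentions = [(0, 4), (11, 15)]
total, mapping = best_mapping(sys_mentions, ref_mentions, overlap)
print(total, mapping)  # 8 [0, 1]
```

Because each reference index appears only once among the slots, no two system mentions can claim the same reference mention, which keeps the mapping one-to-one.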
Reference
  • 3. Event extraction. Though not in any previous ACE evaluation, event detection and characterization is planned for the 2004 evaluation (August-September, 2004). Details of the task definition, annotation guidelines, and scoring are being worked out at the time of writing this paper.