Text Analysis Conference
Knowledge Base Population 2014
Evaluation: February-November, 2014
Workshop: November 17-18, 2014
Follow us on Twitter: @tackbp
U.S. National Institute of Standards and Technology (NIST)
With support from:
U.S. Department of Defense
The Text Analysis Conference (TAC) is a series of evaluations and workshops organized to promote research in Natural Language Processing and related applications, by providing a large test collection, common evaluation procedures, and a forum for organizations to share their results.
The goal of TAC Knowledge Base Population (KBP) is to develop and evaluate technologies for building and populating knowledge bases (KBs) from unstructured text. KBP systems must ultimately build a KB from scratch, but must also be able to populate an existing reference KB that has incomplete or unknown provenance.
You are invited to participate in TAC KBP 2014. Organizations may choose to participate in any or all of the TAC KBP 2014 tracks. NIST provides test data for each KBP task, and participants run their NLP systems on the data and return their results to NIST for evaluation. TAC KBP culminates in a November workshop at NIST in Gaithersburg, Maryland, USA.
All results submitted to NIST are archived on the TAC web site, and all evaluations of submitted results are included in the workshop proceedings. Dissemination of TAC work and results other than in the workshop proceedings is welcomed, but the conditions of participation specifically preclude any advertising claims based on TAC results.
1) Cold Start KBP
The Cold Start track builds a knowledge base from scratch.
2) Entity Linking
The entity linking task is to discover and link names in a document collection to entities in a reference KB, or to new named entities discovered in the document collection.
3) Slot Filling
The slot filling task is to search a document collection to fill in values for predefined slots (attributes) for a given entity in a reference KB.
4) Slot Filler Validation
The Slot Filler Validation track focuses on the refinement of output from slot filling systems by either combining information from multiple slot filling systems, or applying more intensive linguistic processing to validate individual candidate slot fillers.
5) Sentiment
The goal of the Sentiment track is to assess the quality of detectors for scoped and attributed sentiment.
6) Event
The goal of the Event track is to extract information about events such that the information would be suitable as input to a knowledge base.
1) Event track for identifying events from a predefined ontology and extracting their arguments from text
2) English entity DISCOVERY and linking task
3) Cross-lingual Spanish and Chinese entity linking over discussion forums
4) Multi-document provenance and inference for slot filling and Cold Start KBP
5) Cold Start task variant providing evaluation queries in advance (similar to slot filling)
Organizations wishing to participate in any of the TAC KBP 2014 tracks are invited to register online by June 15, 2014. Participants are advised to register and submit all required agreement forms as soon as possible to receive timely access to evaluation resources, including any sample and training data. Registering for a track does not commit you to participating in it, but it helps the organizers plan. Late registration will be permitted only if resources allow. Any questions about conference participation may be sent to the TAC project manager: email@example.com.
Track registration: http://www.nist.gov/tac/2014/KBP/registration.html
The TAC 2014 workshop will be held November 17-18, 2014, in Gaithersburg, Maryland, USA. The workshop is a forum both for presentation of results (including failure analyses and system comparisons), and for more lengthy system presentations describing techniques used, experiments run on the data, and other issues of interest to NLP researchers. KBP track participants who wish to give a presentation during the workshop will submit a short abstract in September describing the experiments they performed. As there is a limited amount of time for oral presentations, the abstracts will be used to determine which participants are asked to speak and which will present in a poster session.
March: Initial track guidelines posted
April: Distribution of document collections
June 15: Deadline for registration for track participation
July-September: Track evaluation windows (varies by track)
September 30: Deadline for short system descriptions
September 30: Deadline for workshop presentation proposals
By October: Release of individual evaluated results to participants (varies by track)
Mid-October: Notification of acceptance of presentation proposals
November 1: Deadline for system reports (workshop notebook version)
November 17-18: TAC 2014 workshop in Gaithersburg, Maryland, USA
February 15, 2015: Deadline for system reports (final proceedings version)
ORGANIZING COMMITTEE (Partial)
Claire Cardie (Cornell University)
Hoa Trang Dang (U.S. National Institute of Standards and Technology)
Jason Duncan (U.S. Department of Defense)
Joe Ellis (Linguistic Data Consortium)
Marjorie Freedman (BBN Technologies)
Kira Griffitt (Linguistic Data Consortium)
Ralph Grishman (New York University)
Yasaman Haghpanah (U.S. National Institute of Standards and Technology)
Heng Ji (Rensselaer Polytechnic Institute)
James Mayfield (Johns Hopkins University)
Boyan Onyshkevych (U.S. Department of Defense)
Stephanie Strassel (Linguistic Data Consortium)
Mihai Surdeanu (University of Arizona)