The 3rd Computational Linguistics Scientific Document Summarization Shared Task

Event Notification Type: 
Call for Papers
Abbreviated Title: 
CL-SciSumm 2017
Friday, 11 August 2017
Country: 
Japan
City: 
Tokyo
Contact: 
Kokil Jaidka
Muthu Kumar Chandrasekaran
Min-Yen Kan
Submission Deadline: 
Wednesday, 31 May 2017

Call for Participation
-----------------------------
The 3rd CL-SciSumm 2017 Shared Task
at SIGIR 2017 on Friday, August 11, 2017

To be held as a part of the
2nd Joint Workshop of Bibliometric-enhanced IR and NLP for Digital Libraries (BIRNDL)
Sponsored by Microsoft Research Asia

Introduction
-------------------
We invite you to participate in our Shared Task on the relationship mining and scientific summarization of computational linguistics research papers. Scientific summarization can play an important role in developing methods to index, represent, retrieve, browse and visualize information in large scholarly databases.

The proceedings of our previous workshops (BIRNDL and BIR) are being published as a special issue on “Bibliometrics, Information Retrieval and Natural Language Processing in Digital Libraries” in the International Journal on Digital Libraries, and as a special issue on “Bibliometric-enhanced IR” in Scientometrics. At SIGIR 2017, we will once again invite the authors of selected CL-SciSumm Shared Task system papers to submit extended versions to a special issue of a highly visible and prestigious journal.

Objective
-------------------
The 3rd CL-SciSumm Shared Task provides resources to encourage research in entity mining, relationship extraction, question answering and other NLP tasks for scientific papers. It comprises annotated citation information connecting research papers with citing papers. Citations are embedded with meta-commentary, which offers a contextual, interpretative layer to the cited text and emphasizes certain kinds of information over others.

The Task
------------------
The task comprises a set of topics, each consisting of a research paper (RP) in CL, and ten or more papers which cite it (citing papers, CP). The text spans (citances) which relate the citing paper to the reference paper have already been identified.

Task 1a: For each citance, identify the cited text span in the RP that most accurately reflects the citance.
Task 1b: For each cited text span, identify what facet of the paper it belongs to, from a predefined set of facets.
Evaluation: Task 1 will be scored by the overlap of text spans between the system output and the gold standard created by human annotators.
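
As an illustration only, overlap-based scoring for Task 1a could be sketched as below. This is not the official scorer: it assumes that cited text spans are represented as sets of RP sentence IDs, and the function name and example IDs are hypothetical.

```python
def span_overlap_scores(system_ids, gold_ids):
    """Return (precision, recall, F1) for two collections of sentence IDs.

    Assumption: a cited text span is a set of sentence IDs in the
    reference paper. The official evaluation may measure overlap differently.
    """
    system, gold = set(system_ids), set(gold_ids)
    overlap = len(system & gold)  # sentences found in both spans
    precision = overlap / len(system) if system else 0.0
    recall = overlap / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1


# Hypothetical citance: the system picks sentences 12, 13 and 40,
# while the annotators marked sentences 13 and 14.
precision, recall, f1 = span_overlap_scores([12, 13, 40], [13, 14])
print(f"P={precision:.2f} R={recall:.2f} F1={f1:.2f}")
```

Under this assumption, a system is rewarded for choosing the same sentences as the annotators and penalized both for extra sentences and for missed ones.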

Task 2 (optional bonus task): Generate a structured summary of the RP from the cited text spans of the RP. The length of the summary should not exceed 250 words.
Evaluation: Task 2 will be scored using the ROUGE evaluation metric to compare automatic summaries against paper abstracts, human-written summaries and community summaries constructed using the output of Task 1a.
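
To give a rough sense of what a ROUGE-style comparison measures, the sketch below computes a simplified unigram overlap (in the spirit of ROUGE-1) after truncating a summary to the 250-word limit. It is not the official ROUGE toolkit, and the ROUGE variants and settings used for the Shared Task are not stated here; the tokenizer, function names and example texts are assumptions.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def truncate_to_word_limit(summary, limit=250):
    """Keep at most `limit` whitespace-separated words."""
    return " ".join(summary.split()[:limit])


def unigram_overlap_scores(candidate, reference):
    """Return (precision, recall, F1) over clipped unigram counts."""
    cand, ref = Counter(tokenize(candidate)), Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # clipped count of shared unigrams
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return precision, recall, f1


# Hypothetical system summary and reference summary.
system_summary = truncate_to_word_limit("The parser uses a chart")
reference_summary = "The chart parser uses dynamic programming"
print(unigram_overlap_scores(system_summary, reference_summary))
```

Because the score depends on the reference text, the same system summary can score differently against the abstract, the human-written summary and the community summary.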

How To Participate
---------------------------
1. Register for the CL-SciSumm Shared Task at <https://easychair.org/conferences/?conf=birndl2017> by May 31.
2. Browse our git repository at <https://github.com/WING-NUS/scisumm-corpus> and download the training set.
3. Develop and train your system to solve Task 1a, 1b and/or Task 2 on the training set.
4. Meanwhile, submit a tentative system description by May 20.
5. Evaluate your system on the test set, to be released on July 1, and upload your results to our CodaLab portal (to be announced later) to self-evaluate your performance.
6. Tell us about your approach in a paper; submit it by July 30, 2017.
7. Attend the BIRNDL workshop at SIGIR on August 11, and present your work.

Important Dates
------------------------
Registration opens: April 20, 2017
Training set posted: May 1, 2017
Short system description due: May 31, 2017
Test Set posted and evaluation period begins: July 1, 2017
Evaluation period ends: July 15, 2017
System reports (papers) due: July 30, 2017
Presentation at 2nd BIRNDL 2017 workshop, SIGIR: Aug 11, 2017
Camera ready contributions due for CEUR proceedings: TBD

Organizers
-------------------
Kokil Jaidka, University of Pennsylvania (jaidka at sas.upenn.edu)
Muthu Kumar Chandrasekaran, National University of Singapore (muthu.chandra at comp.nus.edu.sg)
Min-Yen Kan, National University of Singapore (kanmy at comp.nus.edu.sg)