Fourth Workshop on Scholarly Document Processing

Event Notification Type: Call for Papers
Abbreviated Title: SDP 2024
Location: Hybrid with ACL 2024
Date: Friday, 16 August 2024
City: Bangkok
Country: Thailand
Contact: SDP 2024 Organizers
Submission Deadline: Friday, 17 May 2024

*** Call for Research Papers ***

Scholarly literature is the chief means by which scientists and academics document and communicate their results and is therefore critical to the advancement of knowledge and improvement of human well-being. At the same time, this literature poses challenges to NLP uncommon in other genres, such as specialized language and high background knowledge requirements, long documents and strong structural conventions, multimodal presentation, citation relationships among documents, an emphasis on rational argumentation, and the frequent availability of detailed metadata. These challenges necessitate the development of NLP methods and resources optimized for this domain. The Scholarly Document Processing (SDP) workshop provides a venue for discussing these challenges, bringing together stakeholders from different communities including computational linguistics, machine learning, text mining, information retrieval, digital libraries, scientometrics and others, to develop methods, tasks, and resources in support of these goals.

This workshop builds on the success of prior workshops: the 1st, 2nd, and 3rd SDP workshops held at EMNLP 2020, NAACL 2021, and COLING 2022, and the 1st and 2nd SciNLP workshops held at AKBC 2020 and 2021. In addition to having broad appeal within the NLP community, we hope the SDP workshop will attract researchers from other relevant fields including meta-science, scientometrics, data mining, information retrieval, and digital libraries, bringing together these disparate communities within ACL.

Website: https://sdproc.org/2024/
X (Twitter): https://twitter.com/sdpworkshop

** Topics of Interest **

We invite submissions from all communities demonstrating usage of and challenges associated with natural language processing, information retrieval, and data mining of scholarly and scientific documents. Relevant topics include (but are not limited to):

Large Language Models (LLMs) for Science
Representation learning and language modeling
Information extraction and NER
Document understanding
Summarization and generation
Question-answering
Discourse modeling/argumentation mining
Network analysis
Bibliometrics, scientometrics, and altmetrics
Reproducibility and research integrity, including new challenges posed by generative AI
Peer review tools, principles and technology
Metadata and indexing
Inclusion of datasets and computational resources
Research infrastructures and digital libraries
Increasing the representation in scholarly work of disadvantaged populations
LLM-based interfaces to consume/produce scholarly documents

** Submission Information **

Authors are invited to submit long and short papers presenting unpublished, original work. Submissions will be subject to a double-blind peer-review process. Accepted papers will be presented by the authors at the workshop as either a talk or a poster. All accepted papers will be published in the workshop proceedings (proceedings from previous years can be found here: https://aclanthology.org/venues/sdp/).

Submissions must be in PDF format and anonymized for review. All submissions must be written in English and follow the ACL 2024 formatting requirements:

Long paper submissions: up to 8 pages of content, plus unlimited references.
Short paper submissions: up to 4 pages of content, plus unlimited references.

Submission Website: Papers must be submitted through OpenReview.

Final versions of accepted papers will be allowed 1 additional page of content so that reviewer comments can be taken into account.

** Important Dates (Main Research Track) **

Paper submission deadline: May 17 (Friday), 2024
Notification of acceptance: June 17 (Monday), 2024
Camera-ready paper due: July 1 (Monday), 2024
Workshop date: August 16 (Friday), 2024

** SDP 2024 Keynote Speakers **

We are excited to have several keynote speakers at SDP 2024.
Iryna Gurevych, Professor at Technical University Darmstadt and head of the UKP Lab, Germany.
Anna Rogers, Assistant Professor, University of Copenhagen, Denmark.
Heng Ji, Professor, University of Illinois at Urbana-Champaign, USA.
Doug Downey, Associate Professor at Northwestern University and Research Manager at Allen Institute for AI, USA.

** SDP 2024 Shared Tasks **

SDP 2024 will host two shared tasks. More information about both is provided on the workshop website: https://sdproc.org/2024/sharedtasks.html

DAGPap24: Detecting automatically generated scientific papers

The ubiquity of generative AI has made it very easy to generate fake scientific papers. This can erode public trust in science and attack the foundations of science: are we standing on the shoulders of robots? The Detecting Automatically Generated Papers (DAGPap24) competition aims to encourage the development of robust, reliable systems for detecting AI-generated scientific text, utilizing a diverse dataset and varied machine learning models across a number of scientific domains.

Organizers: Savvas Chamezopoulos, Yury Kashnitsky, Drahomira Herrmannova, Anita de Waard (Elsevier), Domenic Rosati (Scite)

Context24: Contextualizing Scientific Figures and Tables

When making sense of results across many research papers on a topic, figures or tables of key results from the papers can serve as effective, information-dense summaries that can be compared/contrasted and synthesized with other results. However, to understand the results, key elements (e.g., measures, sample) need to be contextualized with associated methodological details, which are typically dispersed throughout the text, often far from the figure/table and from each other. In this shared task, we are interested in contextualizing scientific figures and tables, i.e., automatically retrieving and ranking snippets from the paper that are most needed to interpret their results, with the goal of making figures/tables more self-contained.

Organizers: Joel Chan, Matthew Akamatsu

** Organizing Committee **

Tirthankar Ghosal, Oak Ridge National Laboratory, USA
Philipp Mayr, GESIS – Leibniz Institute for the Social Sciences, Germany
Aakanksha Naik, Allen Institute for AI, USA
Shannon Shen, Massachusetts Institute of Technology, USA
Amanpreet Singh, Allen Institute for AI, USA
Anita de Waard, Elsevier, Netherlands
Orion Weller, Johns Hopkins University, USA
Yanxia Qin, National University of Singapore, Singapore
Yoonjoo Lee, Korea Advanced Institute of Science & Technology, South Korea