SUKI: Structured and Unstructured Knowledge Integration Workshop at NAACL 2022

Event Notification Type: 
Call for Papers
Abbreviated Title: 
SUKI 2022
Location: 
NAACL 2022
Thursday, 14 July 2022
State: 
WA
Country: 
USA
Contact Email: 
City: 
Seattle
Contact: 
Rui Zhang
Submission Deadline: 
Friday, 8 April 2022

Call for Papers
World knowledge is distributed across diverse resources in either structured forms (tables, lists, graphs, and databases) or unstructured forms (texts, large pretrained language models). Recently, there have been extensive efforts to represent, inject, and ground knowledge in various NLP tasks. Because many downstream applications require the integration of structured and unstructured knowledge, it is essential to design more general models that handle multiple sources of knowledge input. However, recent NLP progress has mostly focused on homogeneous external knowledge resources in a single form. This workshop aims to bring together researchers from different backgrounds to discuss challenges and promote solutions in NLP techniques for jointly handling structured and unstructured knowledge. The topic draws wide attention from multiple NLP areas such as Information Extraction, Question Answering, Semantic Parsing, Information Retrieval, Fact Verification, Summarization, Data-to-Text Generation, and Conversational AI.

We seek submissions in two tracks:

Research Track. We welcome submissions on research that broadly relates to combining Structured and Unstructured Knowledge on topics including but not limited to:

  • Datasets combining Structured and Unstructured Data
  • Data Augmentation for Structured and Unstructured Data
  • Joint PreTraining for Structured and Unstructured Knowledge
  • Joint Modeling with Structured and Unstructured Knowledge
  • Conversational AI over Structured and Unstructured Knowledge
  • Summarization over Structured and Unstructured Knowledge
  • Language Generation over Structured and Unstructured Knowledge
  • Multilingual Data and Modeling for Structured and Unstructured Knowledge
  • Fairness and Bias in Structured and Unstructured Knowledge
  • Transfer Learning over Structured and Unstructured Knowledge
  • Multitask Learning over Structured and Unstructured Knowledge
  • Human-in-the-loop Learning for Structured and Unstructured Knowledge
  • Human-in-the-loop Evaluation for Models over Structured and Unstructured Knowledge
  • Interpretability for Models over Structured and Unstructured Knowledge

Shared Task Track. We plan to host two shared tasks: UnifiedSKG and FinQA. Please check the shared task page for more information. We accept system descriptions of our shared tasks as non-archival workshop submissions.

Important Dates for Research Track

  • April 8, 2022: Research Track Submission deadline
  • April 9-April 29, 2022: Review period
  • May 6, 2022: Notification of acceptance
  • May 20, 2022: Camera-ready version deadline

Important Dates for Non-Archival Shared Task Track

  • Feb 15, 2022: Shared Task Launch
  • June 8, 2022: Result Submission Deadline
  • June 15, 2022: System Description Submission Deadline

All deadlines are 11:59 PM UTC -12h (Anywhere on Earth).
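
For authors who want to check a deadline against their own clock, the conversion from "Anywhere on Earth" (UTC-12) to UTC can be sketched as follows. This is only an illustrative sketch: the helper name `aoe_deadline_to_utc` is hypothetical and not part of any workshop tooling, and it assumes every deadline falls at 11:59 PM AoE on the listed date.

```python
from datetime import datetime, timedelta, timezone

# "Anywhere on Earth" (AoE) corresponds to a fixed UTC-12 offset.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_to_utc(year: int, month: int, day: int) -> datetime:
    """Return the UTC instant of 11:59 PM AoE on the given date.

    Hypothetical helper for illustration only.
    """
    deadline_aoe = datetime(year, month, day, 23, 59, tzinfo=AOE)
    return deadline_aoe.astimezone(timezone.utc)


if __name__ == "__main__":
    # Research Track submission deadline: April 8, 2022 (AoE).
    print(aoe_deadline_to_utc(2022, 4, 8).isoformat())
    # prints 2022-04-09T11:59:00+00:00
```

In other words, a deadline listed for a given date closes at 11:59 AM UTC on the following day.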

Submission Guidelines
Submissions should have at least 4 pages and at most 8 pages of content, plus unlimited pages for references and appendices.
Accepted papers will be given 1 additional content page to address reviewers’ comments.
Please use the official ARR style files available as an Overleaf template to format your papers.
Our reviewing policy is double-blind, and the submissions should be fully anonymized.
Please submit through our OpenReview submission site.

Dual Submission: We also allow submissions that are under review at other venues or have preprint versions, but please make sure to follow the dual submission policies of those venues.

Archival vs. Non-Archival: During submission, please indicate whether you want your paper to be archival or non-archival. Archival papers, if accepted, will be included in the workshop proceedings, while non-archival papers will not. Both archival and non-archival papers, if accepted, will be presented at the workshop.

Anonymity Period: We do not enforce an anonymity period. We allow posting to preprint servers such as arXiv at any time.

Call for Shared Task

We will have cash awards for shared task winners!

We plan to host two shared tasks: UnifiedSKG and FinQA. We accept system descriptions for our shared tasks as workshop submissions. Please check the Call for Papers for paper submission details.

Both shared tasks have the same timelines:

  • Feb 15, 2022: Shared Task Launch
  • June 8, 2022: Result Submission Deadline
  • June 15, 2022: System Description Submission Deadline

All deadlines are 11:59 PM UTC -12h (Anywhere on Earth).

UnifiedSKG
UnifiedSKG is a recently proposed framework that unifies and multi-tasks 21 structured knowledge grounding tasks, including table/database/knowledge-base/API semantic parsing, question answering, data-to-text generation, and fact verification. For this shared task, we consider the following datasets:

  • Spider for database semantic parsing
  • GrailQA for knowledge graph semantic parsing/question answering
  • WikiTableQuestions for table question answering
  • SParC for multi-turn database semantic parsing
  • TabFact for table fact verification

Models are allowed to train on the training sets of these datasets, and we will evaluate them on the corresponding test sets. We provide unified data formats and strong but simple SOTA/baseline models in this GitHub repo.

Our UnifiedSKG shared task has two subtasks focusing on two aspects: Generalization and Multi-Tasking. We will report a joint score across the tasks, also taking into account the size of your submitted models.

  • Generalization: The goal of this subtask is to propose general structured knowledge encodings for tables, databases, knowledge bases, and APIs; general methods for integrating structured and unstructured (e.g., user NL requests) inputs; and effective structured knowledge retrieval methods. For this subtask, models are allowed to train on the training set of each individual task separately, and we will evaluate them on the corresponding test sets. For more details, please follow the setting used in Table 2 of the paper.
  • Multi-Tasking: The goal of this subtask is to propose effective multi-tasking methods that jointly learn structured knowledge and integrate structured & unstructured inputs. For this subtask, models are allowed to use all the training sets together in a multi-task learning fashion, and we will evaluate the model on the test sets. Please follow the setting used in Table 4 of the paper.

FinQA
FinQA is a large-scale dataset for answering deep questions over financial data, aiming to automate the analysis of a large corpus of financial documents. Please check our FinQA paper for more details. The dataset, code, and instructions can be found on GitHub.

The leaderboard is hosted on CodaLab. Please first submit your results on the FinQA test set to the leaderboard, and then submit your paper following the Call for Papers. To be eligible for result archives and consideration for awards, please send the following information to zhiyuchen@cs.ucsb.edu from your main contact email:

  • Team name.
  • Team members.
  • The username used in CodaLab submissions.
  • The forum link to your paper submission on OpenReview.

Please check the FinQA challenge website for more details.