Artem Abzaliev


2023

DELPHI: Data for Evaluating LLMs’ Performance in Handling Controversial Issues
David Sun | Artem Abzaliev | Hadas Kotek | Christopher Klein | Zidi Xiu | Jason Williams
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track

Controversy is a reflection of our zeitgeist, and an important aspect of any discourse. The rise of large language models (LLMs) as conversational systems has increased public reliance on these systems for answers to their various questions. Consequently, it is crucial to systematically examine how these models respond to questions pertaining to ongoing debates. However, few datasets exist that provide human-annotated labels reflecting contemporary discussions. To foster research in this area, we propose a novel construction of a controversial questions dataset, expanding upon the publicly released Quora Question Pairs Dataset. This dataset presents challenges concerning knowledge recency, safety, fairness, and bias. We evaluate different LLMs using a subset of this dataset, illuminating how they handle controversial issues and the stances they adopt. This research ultimately contributes to our understanding of LLMs’ interaction with controversial issues, paving the way for improvements in their comprehension and handling of complex societal debates.

Query Rewriting for Effective Misinformation Discovery
Ashkan Kazemi | Artem Abzaliev | Naihao Deng | Rui Hou | Scott Hale | Veronica Perez-Rosas | Rada Mihalcea
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics (Volume 1: Long Papers)

2022

Towards Understanding the Relation between Gestures and Language
Artem Abzaliev | Andrew Owens | Rada Mihalcea
Proceedings of the 29th International Conference on Computational Linguistics

In this paper, we explore the relation between gestures and language. Using a multimodal dataset consisting of TED talks, where the language is aligned with the gestures made by the speakers, we adapt a semi-supervised multimodal model to learn gesture embeddings. We show that gestures are predictive of the speaker’s native language, and that gesture embeddings further improve language prediction results. In addition, gesture embeddings might contain some linguistic information, as we show by probing the embeddings for psycholinguistic categories. Finally, we analyze the words that lead to the most expressive gestures and find that function words drive the expressiveness of gestures.
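
The probing step mentioned in the abstract can be illustrated with a minimal sketch: a linear classifier is trained on frozen gesture embeddings to test whether a psycholinguistic category can be recovered from them. The synthetic arrays and the choice of logistic regression below are illustrative assumptions, not the paper’s actual pipeline.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def probe_category(embeddings, labels):
    """Mean cross-validated accuracy of a linear probe.

    embeddings: (n_samples, dim) frozen gesture embeddings
    labels:     (n_samples,) binary psycholinguistic-category indicator
    """
    probe = LogisticRegression(max_iter=1000)
    return cross_val_score(probe, embeddings, labels, cv=5).mean()

# Synthetic stand-ins for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 128))    # gesture embeddings (hypothetical)
y = rng.integers(0, 2, size=500)   # e.g. "the aligned word is a function word"
print(f"probe accuracy: {probe_category(X, y):.3f}")

An accuracy clearly above chance on held-out folds would suggest the embeddings encode the probed category; the sketch does not reproduce the paper’s data or results.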

2019

On GAP Coreference Resolution Shared Task: Insights from the 3rd Place Solution
Artem Abzaliev
Proceedings of the First Workshop on Gender Bias in Natural Language Processing

This paper presents the 3rd-place-winning solution to the GAP coreference resolution shared task. The approach consists of two key components: fine-tuning the BERT language representation model (Devlin et al., 2018) and using external datasets during training. The model uses hidden states from intermediate BERT layers rather than from the last layer. The resulting system almost eliminates the per-gender difference in log loss during cross-validation while providing high performance.
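
A minimal sketch of the intermediate-layer idea described above, using the HuggingFace transformers library: request all hidden states from BERT and pool a few middle layers instead of taking only the last one. The model name, layer range, and mean pooling are assumptions for illustration, not the paper’s exact configuration.

import torch
from transformers import BertModel, BertTokenizer

# Model choice is an assumption; the shared-task system may have used a different checkpoint.
tokenizer = BertTokenizer.from_pretrained("bert-base-cased")
model = BertModel.from_pretrained("bert-base-cased", output_hidden_states=True)
model.eval()

inputs = tokenizer("Kathleen met Mary and she gave her a book.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple of (num_layers + 1) tensors of shape
# (batch, seq_len, hidden_size); index 0 is the embedding-layer output.
hidden_states = outputs.hidden_states

# Pool a few intermediate layers instead of taking only the last one
# (the layer range 6..9 is an illustrative assumption).
intermediate = torch.stack(hidden_states[6:10]).mean(dim=0)
print(intermediate.shape)   # torch.Size([1, seq_len, 768])

The pooled representation could then feed a mention-pair scorer for coreference; that downstream head is not shown here.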