Svitlana Galeshchuk


2023

Abstractive Summarization for the Ukrainian Language: Multi-Task Learning with Hromadske.ua News Dataset
Svitlana Galeshchuk
Proceedings of the Second Ukrainian Natural Language Processing Workshop (UNLP)

Despite recent NLP developments, abstractive summarization remains a challenging task, especially for low-resource languages such as Ukrainian. The paper aims to improve the quality of summaries produced by mT5 for Ukrainian news by fine-tuning the model on a mixture of summarization and text similarity tasks, using summary-article and title-article training pairs, respectively. The proposed training setup yields higher-quality summaries with the small, base, and large mT5 models. In addition, we present a new Ukrainian dataset for abstractive summarization consisting of circa 36.5K articles collected from Hromadske.ua until June 2021.

2019

Sentiment Analysis for Multilingual Corpora
Svitlana Galeshchuk | Ju Qiu | Julien Jourdan
Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing

The paper presents a generic approach to supervised sentiment analysis of social media content in Slavic languages. The method translates the documents from the original language into English with Google’s Neural Translation Model. The resulting texts are then converted to vectors by averaging the vector representations of their words, taken from a pre-trained English Word2Vec model. Testing the approach with several machine learning methods on Polish, Slovenian and Croatian Twitter datasets yields classification accuracy of up to 86% on out-of-sample data.
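
The averaging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `average_embedding`, the toy three-dimensional vectors, and the choice to skip out-of-vocabulary words (returning a zero vector when no word is known) are all assumptions made for illustration. A real setup would look the vectors up in a pre-trained Word2Vec model instead of a hand-written dictionary.

```python
def average_embedding(tokens, vectors, dim):
    """Return the element-wise mean of the vectors of known tokens.

    Tokens missing from `vectors` are skipped (an assumption; the abstract
    does not say how out-of-vocabulary words are handled). If no token is
    known, a zero vector of length `dim` is returned.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) groups the i-th component of every vector together.
    return [sum(component) / len(known) for component in zip(*known)]


# Hypothetical toy embeddings standing in for a pre-trained English model.
toy_vectors = {
    "good": [1.0, 0.0, 2.0],
    "movie": [3.0, 2.0, 0.0],
}

# A translated document, already tokenized; "tonight" is out of vocabulary.
doc_tokens = ["good", "movie", "tonight"]
doc_vector = average_embedding(doc_tokens, toy_vectors, dim=3)
```

The resulting fixed-length `doc_vector` is what a downstream classifier would receive as its feature vector, regardless of the document's length.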