FiNLP at FinCausal 2020 Task 1: Mixture of BERTs for Causal Sentence Identification in Financial Texts

Sarthak Gupta


Abstract
This paper describes our system for Sub-task 1 of the FinCausal shared task at the FNP-FNS workshop, held in conjunction with COLING-2020. The system classifies whether a financial news text segment contains causality. To address this task, we fine-tune and ensemble generic BERT language models together with domain-specific ones pre-trained on financial text corpora. The task data is highly imbalanced, with non-causal segments forming the majority class; we therefore train the models using under-sampling, cost-sensitive learning, and data augmentation. Our best system achieves a weighted F1-score of 96.98, securing 4th position on the evaluation leaderboard. The code is available at https://github.com/sarthakTUM/fincausal
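The imbalance-handling and evaluation steps named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not taken from the linked repository: the function names, the 1:1 under-sampling ratio, the inverse-frequency weighting formula, and the use of soft voting with a 0.5 threshold for the ensemble are all assumptions made for illustration, and the probabilities stand in for the outputs of fine-tuned BERT classifiers.

```python
import random
from collections import Counter


def undersample(labels, ratio=1.0, seed=0):
    """Keep every minority example and at most ratio * n_minority others.

    Returns sorted indices into `labels`. The ratio is an assumed knob.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    keep = [i for i, y in enumerate(labels) if y == minority]
    others = [i for i, y in enumerate(labels) if y != minority]
    n_keep = min(len(others), int(ratio * len(keep)))
    return sorted(keep + rng.sample(others, n_keep))


def class_weights(labels):
    """Inverse-frequency weights for a cost-sensitive loss:
    n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    return {c: n / (len(counts) * k) for c, k in counts.items()}


def soft_vote(prob_lists, threshold=0.5):
    """Average each model's causal-class probability, then threshold.

    Soft voting is an assumed way of combining the models.
    """
    n_models = len(prob_lists)
    averaged = [sum(p) / n_models for p in zip(*prob_lists)]
    return [1 if a >= threshold else 0 for a in averaged]


def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights proportional to class support."""
    total = 0.0
    for c in set(y_true):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        support = sum(1 for t in y_true if t == c)
        total += f1 * support
    return total / len(y_true)


if __name__ == "__main__":
    # Toy labels: 0 = non-causal (majority), 1 = causal (minority).
    labels = [0] * 8 + [1] * 2
    print(class_weights(labels))
    print(undersample(labels))
    # Hypothetical causal-class probabilities from two fine-tuned models.
    print(soft_vote([[0.9, 0.2], [0.7, 0.4]]))
    print(weighted_f1([0, 0, 1, 1], [0, 1, 1, 1]))
```

In a real training loop the weights from `class_weights` would typically be passed to the classification loss, so that errors on the rarer causal class cost more than errors on the non-causal class.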
Anthology ID:
2020.fnp-1.12
Volume:
Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation
Month:
December
Year:
2020
Address:
Barcelona, Spain (Online)
Editors:
Dr Mahmoud El-Haj, Dr Vasiliki Athanasakou, Dr Sira Ferradans, Dr Catherine Salzedo, Dr Ans Elhag, Dr Houda Bouamor, Dr Marina Litvak, Dr Paul Rayson, Dr George Giannakopoulos, Nikiforos Pittaras
Venue:
FNP
Publisher:
COLING
Pages:
74–79
URL:
https://aclanthology.org/2020.fnp-1.12
Cite (ACL):
Sarthak Gupta. 2020. FiNLP at FinCausal 2020 Task 1: Mixture of BERTs for Causal Sentence Identification in Financial Texts. In Proceedings of the 1st Joint Workshop on Financial Narrative Processing and MultiLing Financial Summarisation, pages 74–79, Barcelona, Spain (Online). COLING.
Cite (Informal):
FiNLP at FinCausal 2020 Task 1: Mixture of BERTs for Causal Sentence Identification in Financial Texts (Gupta, FNP 2020)
PDF:
https://aclanthology.org/2020.fnp-1.12.pdf
Code:
sarthaktum/fincausal
Data:
SemEval-2010 Task-8