Mark Buckley


2022

Named Entity Recognition in Industrial Tables using Tabular Language Models
Aneta Koleva | Martin Ringsquandl | Mark Buckley | Rakeb Hasan | Volker Tresp
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Industry Track

Specialized transformer-based models for encoding tabular data have gained interest in academia. Although tabular data is omnipresent in industry, applications of table transformers are still missing. In this paper, we study how these models can be applied to an industrial Named Entity Recognition (NER) problem where the entities are mentioned in tabular-structured spreadsheets. The highly technical nature of spreadsheets as well as the lack of labeled data present major challenges for fine-tuning transformer-based models. Therefore, we develop a dedicated table data augmentation strategy based on available domain-specific knowledge graphs. We show that this boosts performance in our low-resource scenario considerably. Further, we investigate the benefits of tabular structure as inductive bias compared to tables as linearized sequences. Our experiments confirm that a table transformer outperforms other baselines and that its tabular inductive bias is vital for convergence of transformer-based models.

2019

News Article Teaser Tweets and How to Generate Them
Sanjeev Kumar Karn | Mark Buckley | Ulli Waltinger | Hinrich Schütze
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

In this work, we define the task of teaser generation and provide an evaluation benchmark and baseline systems for it. A teaser is a short reading suggestion for an article that is illustrative and includes curiosity-arousing elements to entice potential readers to read particular news items. Teasers are one of the main vehicles for transmitting news to social media users. We compile a novel dataset of teasers by systematically accumulating tweets and selecting those that conform to the teaser definition. We compare a number of neural abstractive architectures on the task of teaser generation; the best-performing system overall is the seq2seq model with pointer network of See et al.

2008

A Classification of Dialogue Actions in Tutorial Dialogue
Mark Buckley | Magdalena Wolska
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)