Christoph Tillmann

Also published as: C. Tillmann


2023

Muted: Multilingual Targeted Offensive Speech Identification and Visualization
Christoph Tillmann | Aashka Trivedi | Sara Rosenthal | Santosh Borse | Rong Zhang | Avirup Sil | Bishwaranjan Bhattacharjee
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Offensive language such as hate, abuse, and profanity (HAP) occurs in various content on the web. While previous work has mostly dealt with sentence-level annotations, there have been a few recent attempts to identify offensive spans as well. We build upon this work and introduce MUTED, a system to identify multilingual HAP content by displaying offensive arguments and their targets using heatmaps to indicate their intensity. MUTED can leverage any transformer-based HAP-classification model and its attention mechanism out-of-the-box to identify toxic spans, without further fine-tuning. In addition, we use the spaCy library to identify the specific targets and arguments for the words predicted by the attention heatmaps. We present the model’s performance on identifying offensive spans and their targets in existing datasets and present new annotations on German text. Finally, we demonstrate our proposed visualization tool on multilingual inputs.

2014

Improved Sentence-Level Arabic Dialect Classification
Christoph Tillmann | Saab Mansour | Yaser Al-Onaizan
Proceedings of the First Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects

Automatic dialect classification for statistical machine translation
Saab Mansour | Yaser Al-Onaizan | Graeme Blackwood | Christoph Tillmann
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track

The training data for statistical machine translation are gathered from various sources representing a mixture of domains. In this work, we argue that when translating dialects representing varieties of the same language, a manually assigned data source is not a reliable indicator of the dialect. We resort to automatic dialect classification to refine the training corpora according to the different dialects and build improved dialect-specific systems. A fairly standard classifier for Arabic developed within this work achieves state-of-the-art performance, with classification precision above 90%, making it usefully accurate for our application. The classification of the data is then used to distinguish between the different dialects, split the data accordingly, and utilize the new splits for several adaptation techniques. Performing translation experiments on a large-scale dialectal Arabic-to-English translation task, our results show that the classifier generates better contrast between the dialects and achieves better translation quality than the original manual corpus splits.

2009

A Simple Sentence-Level Extraction Algorithm for Comparable Data
Christoph Tillmann | Jian-ming Xu
Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Companion Volume: Short Papers

A Beam-Search Extraction Algorithm for Comparable Data
Christoph Tillmann
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2008

A Rule-Driven Dynamic Programming Decoder for Statistical MT
Christoph Tillmann
Proceedings of the ACL-08: HLT Second Workshop on Syntax and Structure in Statistical Translation (SSST-2)

2006

Efficient Dynamic Programming Search Algorithms for Phrase-Based SMT
Christoph Tillmann
Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing

A Discriminative Global Training Algorithm for Statistical MT
Christoph Tillmann | Tong Zhang
Proceedings of the 21st International Conference on Computational Linguistics and 44th Annual Meeting of the Association for Computational Linguistics

2005

A Localized Prediction Model for Statistical Machine Translation
Christoph Tillmann | Tong Zhang
Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05)

2004

A Unigram Orientation Model for Statistical Machine Translation
Christoph Tillmann
Proceedings of HLT-NAACL 2004: Short Papers

2003

A Projection Extension Algorithm for Statistical Machine Translation
Christoph Tillmann
Proceedings of the 2003 Conference on Empirical Methods in Natural Language Processing

Word Reordering and a Dynamic Programming Beam Search Algorithm for Statistical Machine Translation
Christoph Tillmann | Hermann Ney
Computational Linguistics, Volume 29, Number 1, March 2003

A Phrase-based Unigram Model for Statistical Machine Translation
Christoph Tillmann | Fei Xia
Companion Volume of the Proceedings of HLT-NAACL 2003 - Short Papers

TIPS: A Translingual Information Processing System
Yaser Al-Onaizan | Radu Florian | Martin Franz | Hany Hassan | Young-Suk Lee | J. Scott McCarley | Kishore Papineni | Salim Roukos | Jeffrey Sorensen | Christoph Tillmann | Todd Ward | Fei Xia
Companion Volume of the Proceedings of HLT-NAACL 2003 - Demonstrations

2000

Word Re-ordering and DP-based Search in Statistical Machine Translation
Christoph Tillmann | Hermann Ney
COLING 2000 Volume 2: The 18th International Conference on Computational Linguistics

1999

Improved Alignment Models for Statistical Machine Translation
Franz Josef Och | Christoph Tillmann | Hermann Ney
1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora

A Statistical Parser for Czech
Michael Collins | Jan Hajic | Lance Ramshaw | Christoph Tillmann
Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics

1998

A DP based Search Algorithm for Statistical Machine Translation
S. Nießen | S. Vogel | H. Ney | C. Tillmann
COLING 1998 Volume 2: The 17th International Conference on Computational Linguistics

A DP based Search Algorithm for Statistical Machine Translation
S. Nießen | S. Vogel | H. Ney | C. Tillmann
36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics, Volume 2

1997

Word Triggers and the EM Algorithm
Christoph Tillmann | Hermann Ney
CoNLL97: Computational Natural Language Learning

A DP-based Search Using Monotone Alignments in Statistical Translation
Christoph Tillmann | Stephan Vogel | Hermann Ney | Alex Zubiaga
35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics

1996

HMM-Based Word Alignment in Statistical Translation
Stephan Vogel | Hermann Ney | Christoph Tillmann
COLING 1996 Volume 2: The 16th International Conference on Computational Linguistics