Paweł Skórzewski


2023

Back Transcription as a Method for Evaluating Robustness of Natural Language Understanding Models to Speech Recognition Errors
Marek Kubis | Paweł Skórzewski | Marcin Sowański | Tomasz Ziętkiewicz
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing

In a spoken dialogue system, an NLU model is preceded by a speech recognition system whose errors can degrade the performance of natural language understanding. This paper proposes a method for investigating the impact of speech recognition errors on the performance of natural language understanding models. The proposed method combines the back transcription procedure with a fine-grained technique for categorizing the errors that affect the performance of NLU models. The method relies on synthesized speech for NLU evaluation. We show that using synthesized speech in place of audio recordings does not significantly change the outcomes of the presented technique.
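As an illustration only, the back transcription idea described in the abstract can be sketched as follows: each reference text is synthesized, recognized again, and the NLU output on the reference is compared with the NLU output on the back-transcribed text. The function names (`back_transcribe`, `categorize`, `evaluate`), the four category labels, and the toy TTS, ASR, and NLU stand-ins are all assumptions made for this sketch; they are not the paper's implementation or its error taxonomy.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple

# Hypothetical component types: any TTS, ASR, and NLU system could be plugged in.
TTS = Callable[[str], bytes]
ASR = Callable[[bytes], str]
NLU = Callable[[str], str]


def back_transcribe(text: str, tts: TTS, asr: ASR) -> str:
    """Synthesize the text to audio, then recognize it back to text."""
    return asr(tts(text))


def categorize(gold: str, pred_ref: str, pred_bt: str) -> str:
    """Simplified four-way categorization (an assumption, not the paper's scheme)."""
    ref_ok = pred_ref == gold
    bt_ok = pred_bt == gold
    if ref_ok and bt_ok:
        return "unaffected_correct"
    if ref_ok:
        return "degraded"
    if bt_ok:
        return "improved"
    return "unaffected_incorrect"


def evaluate(samples: Iterable[Tuple[str, str]], tts: TTS, asr: ASR, nlu: NLU) -> Counter:
    """Count categories over (reference text, gold intent) pairs."""
    counts: Counter = Counter()
    for text, gold in samples:
        pred_ref = nlu(text)
        pred_bt = nlu(back_transcribe(text, tts, asr))
        counts[categorize(gold, pred_ref, pred_bt)] += 1
    return counts


# Toy stand-ins so the sketch runs without real speech components.
def toy_tts(text: str) -> bytes:
    return text.encode("utf-8")


def toy_asr(audio: bytes) -> str:
    # Simulates one systematic recognition error.
    return audio.decode("utf-8").replace("lights", "likes")


def toy_nlu(text: str) -> str:
    if "lights" in text:
        return "lights_on"
    if "music" in text:
        return "play_music"
    return "unknown"


if __name__ == "__main__":
    data = [("turn on the lights", "lights_on"), ("play music", "play_music")]
    print(evaluate(data, toy_tts, toy_asr, toy_nlu))
```

Passing the components in as callables keeps the sketch independent of any particular speech or NLU toolkit.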

2022

Named Entity Recognition to Detect Criminal Texts on the Web
Paweł Skórzewski | Mikołaj Pieniowski | Grażyna Demenko
Proceedings of the Thirteenth Language Resources and Evaluation Conference

This paper presents a toolkit that applies named-entity extraction techniques to identify information related to criminal activity in texts from the Polish Internet. The methodological and technical assumptions were established following the requirements of our application users from the Border Guard. Due to the specificity of the users’ needs and the specificity of web texts, we used original methodologies related to the search for desired texts, the creation of domain lexicons, the annotation of the collected text resources, and the combination of rule-based and machine-learning techniques for extracting the information desired by the user. The performance of our tools has been evaluated on 6240 manually annotated text fragments collected from Internet sources. Evaluation results and user feedback show that our approach is feasible and has potential value for real-life applications in the daily work of border guards. Lexical lookup combined with hand-crafted rules and regular expressions, supported by text statistics, can make a decent specialized entity recognition system in the absence of large data sets required for training a good neural network.
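To make the combination of lexical lookup and regular expressions mentioned in the abstract more concrete, here is a minimal sketch. The lexicon entries, the entity labels, the phone-number pattern, and the `extract_entities` function are hypothetical and chosen only for illustration; the toolkit's actual lexicons, rules, and statistical components are not reproduced here.

```python
import re
from typing import List, Tuple

# Hypothetical domain lexicon: lowercase surface form -> entity label.
LEXICON = {
    "cigarettes": "CONTRABAND",
    "tobacco": "CONTRABAND",
}

# Hypothetical hand-crafted patterns: (label, compiled regular expression).
PATTERNS = [
    ("PHONE", re.compile(r"\b\d{3}[- ]\d{3}[- ]\d{3}\b")),
]

Entity = Tuple[int, int, str, str]  # (start, end, label, matched text)


def extract_entities(text: str) -> List[Entity]:
    """Collect lexicon matches and regex matches, sorted by position."""
    entities: List[Entity] = []
    # Lexical lookup over word tokens, case-insensitive.
    for match in re.finditer(r"\w+", text):
        label = LEXICON.get(match.group().lower())
        if label is not None:
            entities.append((match.start(), match.end(), label, match.group()))
    # Rule-based matches from the regular expressions.
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            entities.append((match.start(), match.end(), label, match.group()))
    return sorted(entities)


if __name__ == "__main__":
    for entity in extract_entities("Selling cigarettes, call 600 700 800"):
        print(entity)
```

A real system along these lines would also need to resolve overlapping matches and handle Polish inflection, which this sketch leaves out.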

2017

EUDAMU at SemEval-2017 Task 11: Action Ranking and Type Matching for End-User Development
Marek Kubis | Paweł Skórzewski | Tomasz Ziętkiewicz
Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017)

The paper describes a system for end-user development using natural language. Our approach uses a ranking model to identify the actions to be executed, followed by reference and parameter matching models that select the parameter values to be set for the given commands. We discuss the evaluation results and possible improvements for future work.