Deception Detection in News Reports in the Russian Language: Lexics and Discourse

Dina Pisarevskaya


Abstract
News verification and automated fact checking are important issues today. This paper presents initial research on deception detection for Russian. We collected a corpus of 174 Russian news reports, both truthful and fake, and ran two experiments. In both, we applied support vector machines (SVMs, with linear and RBF kernels) and Random Forest to classify the news reports into two classes: truthful and deceptive. In the first experiment, we used 18 lexical-level markers, mostly frequencies of POS tags in the texts. In the second experiment, at the discourse level, we used frequencies of rhetorical relation types in the texts. The classification task in the first experiment is solved best by the SVM with the RBF kernel (F-measure 0.65). The model based on Rhetorical Structure Theory (RST) features performs best with the Random Forest classifier (F-measure 0.54) and needs to be modified. Future research should combine different deception detection markers for Russian in order to build a better predictive model.
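As a rough illustration of the lexical features mentioned in the abstract, the following minimal sketch turns one POS-tagged text into a vector of relative tag frequencies. The tag inventory `TAGS`, the function name `pos_frequency_features`, and the toy tagged sentence are assumptions made for this example; the abstract does not list the paper's actual 18 markers or name the tagger that was used.

```python
from collections import Counter

# Hypothetical tag inventory for illustration only; the paper's actual
# 18 lexical markers are not listed in the abstract.
TAGS = ["NOUN", "VERB", "ADJ", "ADV", "PRON"]


def pos_frequency_features(tagged_tokens, tags=TAGS):
    """Return the relative frequency of each tag in `tags` for one text.

    `tagged_tokens` is a list of (token, tag) pairs produced by any POS tagger.
    """
    counts = Counter(tag for _, tag in tagged_tokens)
    total = len(tagged_tokens)
    if total == 0:
        # An empty text yields an all-zero feature vector.
        return [0.0] * len(tags)
    return [counts[tag] / total for tag in tags]


# Toy, hand-tagged example (transliterated tokens; tags are illustrative).
example = [
    ("my", "PRON"),
    ("chitaem", "VERB"),
    ("svezhie", "ADJ"),
    ("novosti", "NOUN"),
]
features = pos_frequency_features(example)
```

One such vector per news report could then serve as input to a classifier such as an SVM or Random Forest, which is the setup the abstract describes in general terms.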
Anthology ID:
W17-4213
Volume:
Proceedings of the 2017 EMNLP Workshop: Natural Language Processing meets Journalism
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Editors:
Octavian Popescu, Carlo Strapparava
Venue:
WS
Publisher:
Association for Computational Linguistics
Pages:
74–79
URL:
https://aclanthology.org/W17-4213
DOI:
10.18653/v1/W17-4213
Cite (ACL):
Dina Pisarevskaya. 2017. Deception Detection in News Reports in the Russian Language: Lexics and Discourse. In Proceedings of the 2017 EMNLP Workshop: Natural Language Processing meets Journalism, pages 74–79, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Deception Detection in News Reports in the Russian Language: Lexics and Discourse (Pisarevskaya, 2017)
PDF:
https://aclanthology.org/W17-4213.pdf