Research Replication Prediction Using Weakly Supervised Learning

Tianyi Luo, Xingyu Li, Hainan Wang, Yang Liu


Abstract
Knowing whether a published research result can be replicated is important, but carrying out direct replication of published research incurs a high cost. There have been efforts to use machine-learning-aided methods to predict the replicability of scientific claims. However, existing approaches use only hand-extracted statistical features such as p-value and sample size, without utilizing the text of the research papers, and they train on only a very small amount of annotated data, without exploiting the large number of unlabeled articles. It is therefore desirable to develop effective machine-learning-aided methods that automatically extract text information as features, so that replication prediction can benefit from Natural Language Processing techniques, and that learn from both labeled data and the large amount of unlabeled data. In this paper, we propose two weakly supervised learning approaches that use automatically extracted text information from research papers to improve the prediction accuracy of research replication using both labeled and unlabeled datasets. Our experiments on real-world datasets show that our approaches obtain much better prediction performance than supervised models that use only statistical features and a small labeled dataset. Further, we are able to achieve an accuracy of 75.76% for predicting the replicability of research.
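The abstract does not say which two weakly supervised approaches the paper uses, so the following is only a generic illustration of the underlying idea: learning from paper text with a few labeled examples plus unlabeled ones. It is a minimal sketch of self-training (pseudo-labeling) with a bag-of-words naive Bayes classifier, not the authors' method. The function names, the 0/1 label encoding (1 = replicable), the confidence threshold, and the toy strings are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenization; a real pipeline would use a proper tokenizer.
    return text.lower().split()


def train_nb(docs, labels):
    """Fit a two-class multinomial naive Bayes model (labels are 0 or 1)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for doc, y in zip(docs, labels):
        word_counts[y].update(tokenize(doc))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict_proba(model, doc):
    """Return the model's estimate of P(label = 1 | doc)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_scores = {}
    for y in (0, 1):
        # Add-one smoothing for both the class prior and the word likelihoods.
        denom = sum(word_counts[y].values()) + len(vocab)
        score = math.log((class_counts[y] + 1) / (total + 2))
        for w in tokenize(doc):
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[y][w] + 1) / denom)
        log_scores[y] = score
    m = max(log_scores.values())
    e0 = math.exp(log_scores[0] - m)
    e1 = math.exp(log_scores[1] - m)
    return e1 / (e0 + e1)


def self_train(labeled_docs, labels, unlabeled_docs, threshold=0.75, rounds=5):
    """Grow the training set with confidently pseudo-labeled documents."""
    docs, ys = list(labeled_docs), list(labels)
    pool = list(unlabeled_docs)
    model = train_nb(docs, ys)
    for _ in range(rounds):
        remaining, added = [], False
        for doc in pool:
            p = predict_proba(model, doc)
            if p >= threshold:
                docs.append(doc)
                ys.append(1)
                added = True
            elif p <= 1 - threshold:
                docs.append(doc)
                ys.append(0)
                added = True
            else:
                remaining.append(doc)  # not confident enough; keep unlabeled
        pool = remaining
        if not added:
            break
        model = train_nb(docs, ys)  # retrain on labeled + pseudo-labeled data
    return model


# Toy usage with made-up strings standing in for paper text.
model = self_train(
    ["large sample preregistered robust effect",
     "small sample marginal significance surprising"],
    [1, 0],
    ["preregistered large sample", "marginal surprising small"],
)
score = predict_proba(model, "robust preregistered effect")
```

Self-training is only one of several weak-supervision strategies. It is used here because it fits in a few lines and shows how unlabeled text can enlarge a small labeled set.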
Anthology ID:
2020.findings-emnlp.132
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2020
Month:
November
Year:
2020
Address:
Online
Editors:
Trevor Cohn, Yulan He, Yang Liu
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
1464–1474
URL:
https://aclanthology.org/2020.findings-emnlp.132
DOI:
10.18653/v1/2020.findings-emnlp.132
Cite (ACL):
Tianyi Luo, Xingyu Li, Hainan Wang, and Yang Liu. 2020. Research Replication Prediction Using Weakly Supervised Learning. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1464–1474, Online. Association for Computational Linguistics.
Cite (Informal):
Research Replication Prediction Using Weakly Supervised Learning (Luo et al., Findings 2020)
PDF:
https://aclanthology.org/2020.findings-emnlp.132.pdf