Investigating Robustness and Interpretability of Link Prediction via Adversarial Modifications

Pouya Pezeshkpour, Yifan Tian, Sameer Singh


Abstract
Representing entities and relations in an embedding space is a well-studied approach for machine learning on relational data. Existing approaches, however, primarily focus on improving accuracy and overlook other aspects such as robustness and interpretability. In this paper, we propose adversarial modifications for link prediction models: identifying the fact to add into or remove from the knowledge graph that changes the prediction for a target fact after the model is retrained. Using these single modifications of the graph, we identify the most influential fact for a predicted link and evaluate the sensitivity of the model to the addition of fake facts. We introduce an efficient approach to estimate the effect of such modifications by approximating the change in the embeddings when the knowledge graph changes. To avoid the combinatorial search over all possible facts, we train a network to decode embeddings to their corresponding graph components, allowing the use of gradient-based optimization to identify the adversarial modification. We use these techniques to evaluate the robustness of link prediction models (by measuring sensitivity to additional facts), study interpretability through the facts most responsible for predictions (by identifying the most influential neighbors), and detect incorrect facts in the knowledge base.
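The paper's core idea — estimating how adding or removing a single fact would change a target prediction, without retraining — can be illustrated with a toy first-order approximation. The sketch below is not the authors' CRIAGE implementation (see the linked pouyapez/criage repository for that); it assumes a simple DistMult-style scorer and approximates the effect of removing a neighbor fact by taking one gradient step that "un-learns" it from the subject embedding, then re-scoring the target fact. All names (`score`, `influence_of_removal`, the toy embeddings) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, lr = 8, 0.1  # embedding dimension, step size (both arbitrary)

# Toy DistMult scorer: score(s, r, o) = <e_s, w_r, e_o>
def score(e_s, w_r, e_o):
    return float(np.sum(e_s * w_r * e_o))

# Random embeddings: one subject entity, three (relation, object) neighbor facts.
e_s = rng.normal(size=d)
rels = rng.normal(size=(3, d))
objs = rng.normal(size=(3, d))
target_r, target_o = rels[0], objs[0]  # the target fact we want to explain

def influence_of_removal(w_r, e_o):
    """First-order estimate of how removing fact (s, r, o) changes the
    target score: take one gradient step on e_s that un-learns the fact,
    then re-score the target without retraining the whole model."""
    grad = w_r * e_o               # d score(s, r, o) / d e_s
    e_s_new = e_s - lr * grad      # step away from the removed fact
    return score(e_s_new, target_r, target_o) - score(e_s, target_r, target_o)

# Rank neighbor facts by estimated influence on the target prediction;
# the most influential neighbor serves as an explanation for the link.
influences = [influence_of_removal(rels[i], objs[i]) for i in range(3)]
most_influential = int(np.argmax(np.abs(influences)))
```

Removing the target fact itself always lowers its own score under this approximation, and the neighbor with the largest absolute influence plays the role of the "most influential fact" the abstract describes. The paper's actual method additionally trains an inverter network to map perturbed embeddings back to discrete graph facts, avoiding the combinatorial search this toy version sidesteps by enumerating only three neighbors.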
Anthology ID: N19-1337
Volume: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
Month: June
Year: 2019
Address: Minneapolis, Minnesota
Editors: Jill Burstein, Christy Doran, Thamar Solorio
Venue: NAACL
Publisher: Association for Computational Linguistics
Pages: 3336–3347
URL: https://aclanthology.org/N19-1337
DOI: 10.18653/v1/N19-1337
Cite (ACL): Pouya Pezeshkpour, Yifan Tian, and Sameer Singh. 2019. Investigating Robustness and Interpretability of Link Prediction via Adversarial Modifications. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3336–3347, Minneapolis, Minnesota. Association for Computational Linguistics.
Cite (Informal): Investigating Robustness and Interpretability of Link Prediction via Adversarial Modifications (Pezeshkpour et al., NAACL 2019)
PDF: https://aclanthology.org/N19-1337.pdf
Poster: N19-1337.Poster.pdf
Code: pouyapez/criage