ADAPT at SemEval-2018 Task 9: Skip-Gram Word Embeddings for Unsupervised Hypernym Discovery in Specialised Corpora

Alfredo Maldonado, Filip Klubička


Abstract
This paper describes a simple but competitive unsupervised system for hypernym discovery. The system uses skip-gram word embeddings with negative sampling, trained on specialised corpora. Candidate hypernyms for an input word are predicted based on cosine similarity scores. Two sets of word embedding models were trained separately on two specialised corpora: a medical corpus and a music industry corpus. Our system scored highest in the medical domain among the competing unsupervised systems but performed poorly on the music industry domain. Our system does not depend on any external data other than raw specialised corpora.
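The cosine-similarity ranking step mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the vectors in `toy_embeddings` are invented, the helper names `cosine` and `rank_candidates` are hypothetical, and a real system would load skip-gram vectors trained on the specialised corpus instead. The abstract does not describe any further candidate filtering, so none is assumed here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def rank_candidates(query, embeddings, top_k=3):
    """Return the top_k vocabulary words most similar to `query`."""
    query_vec = embeddings[query]
    scored = [
        (word, cosine(query_vec, vec))
        for word, vec in embeddings.items()
        if word != query  # the query word is not its own candidate
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Invented 3-dimensional vectors for illustration only (assumption);
# real skip-gram embeddings typically have hundreds of dimensions.
toy_embeddings = {
    "aspirin": [0.9, 0.1, 0.0],
    "drug": [0.8, 0.2, 0.1],
    "medication": [0.7, 0.3, 0.0],
    "guitar": [0.0, 0.1, 0.9],
}

candidates = rank_candidates("aspirin", toy_embeddings, top_k=2)
for word, score in candidates:
    print(f"{word}\t{score:.3f}")
```

Note that cosine similarity is symmetric, so a ranking like this retrieves words that are distributionally close to the input rather than hypernyms specifically.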
Anthology ID:
S18-1151
Volume:
Proceedings of the 12th International Workshop on Semantic Evaluation
Month:
June
Year:
2018
Address:
New Orleans, Louisiana
Editors:
Marianna Apidianaki, Saif M. Mohammad, Jonathan May, Ekaterina Shutova, Steven Bethard, Marine Carpuat
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
924–927
URL:
https://aclanthology.org/S18-1151
DOI:
10.18653/v1/S18-1151
Cite (ACL):
Alfredo Maldonado and Filip Klubička. 2018. ADAPT at SemEval-2018 Task 9: Skip-Gram Word Embeddings for Unsupervised Hypernym Discovery in Specialised Corpora. In Proceedings of the 12th International Workshop on Semantic Evaluation, pages 924–927, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal):
ADAPT at SemEval-2018 Task 9: Skip-Gram Word Embeddings for Unsupervised Hypernym Discovery in Specialised Corpora (Maldonado & Klubička, SemEval 2018)
PDF:
https://aclanthology.org/S18-1151.pdf
Data
SemEval-2018 Task-9