Toward a deep dialectological representation of Indo-Aryan

Chundra Cathcart


Abstract
This paper presents a new approach to disentangling inter-dialectal and intra-dialectal relationships within the Indo-Aryan subgroup of Indo-European. We draw upon admixture models and deep generative models to tease apart historic language contact and language-specific behavior in the overall patterns of sound change displayed by Indo-Aryan languages. We show that a “deep” model of Indo-Aryan dialectology sheds some light on questions regarding inter-relationships among the Indo-Aryan languages and performs better than a “shallow” model on certain properties of the posterior distribution (e.g., entropy of posterior distributions). We also outline future pathways for model development.
Anthology ID:
W19-1411
Volume:
Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects
Month:
June
Year:
2019
Address:
Ann Arbor, Michigan
Editors:
Marcos Zampieri, Preslav Nakov, Shervin Malmasi, Nikola Ljubešić, Jörg Tiedemann, Ahmed Ali
Venue:
VarDial
Publisher:
Association for Computational Linguistics
Pages:
110–119
URL:
https://aclanthology.org/W19-1411
DOI:
10.18653/v1/W19-1411
Cite (ACL):
Chundra Cathcart. 2019. Toward a deep dialectological representation of Indo-Aryan. In Proceedings of the Sixth Workshop on NLP for Similar Languages, Varieties and Dialects, pages 110–119, Ann Arbor, Michigan. Association for Computational Linguistics.
Cite (Informal):
Toward a deep dialectological representation of Indo-Aryan (Cathcart, VarDial 2019)
PDF:
https://aclanthology.org/W19-1411.pdf