Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input

Cory Shain, William Bryce, Lifeng Jin, Victoria Krakovna, Finale Doshi-Velez, Timothy Miller, William Schuler, Lane Schwartz


Abstract
This paper presents a new memory-bounded left-corner parsing model for unsupervised raw-text syntax induction, using unsupervised hierarchical hidden Markov models (UHHMM). We deploy this algorithm to shed light on the extent to which human language learners can discover hierarchical syntax through distributional statistics alone, by modeling two widely accepted features of human language acquisition and sentence processing that have not been simultaneously modeled by any existing grammar induction algorithm: (1) a left-corner parsing strategy and (2) limited working memory capacity. To model realistic input to human language learners, we evaluate our system on a corpus of child-directed speech rather than typical newswire corpora. Results beat or closely match those of three competing systems.
Anthology ID: C16-1092
Volume: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month: December
Year: 2016
Address: Osaka, Japan
Editors: Yuji Matsumoto, Rashmi Prasad
Venue: COLING
Publisher: The COLING 2016 Organizing Committee
Pages: 964–975
URL: https://aclanthology.org/C16-1092
Cite (ACL): Cory Shain, William Bryce, Lifeng Jin, Victoria Krakovna, Finale Doshi-Velez, Timothy Miller, William Schuler, and Lane Schwartz. 2016. Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 964–975, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal): Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input (Shain et al., COLING 2016)
PDF: https://aclanthology.org/C16-1092.pdf