Mapping natural language commands to web elements

Panupong Pasupat, Tian-Shun Jiang, Evan Liu, Kelvin Guu, Percy Liang


Abstract
The web provides a rich, open-domain environment with textual, structural, and spatial properties. We propose a new task for grounding language in this environment: given a natural language command (e.g., “click on the second article”), choose the correct element on the web page (e.g., a hyperlink or text box). We collected a dataset of over 50,000 commands that capture various phenomena such as functional references (e.g., “find who made this site”), relational reasoning (e.g., “article by john”), and visual reasoning (e.g., “top-most article”). We also implemented and analyzed three baseline models that capture different phenomena present in the dataset.
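
To make the task's input and output concrete, the following is a minimal sketch, not one of the paper's three baseline models. It assumes a simplified, hypothetical representation of a web element (a tag name plus its visible text) and scores each candidate by word overlap with the command. The names Element, overlap_score, and choose_element are illustrative and do not come from the paper or its released code.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Element:
    """Hypothetical, simplified stand-in for a web element."""
    tag: str   # e.g. "a", "input", "button"
    text: str  # visible text or placeholder text


def tokenize(s: str) -> Set[str]:
    """Lowercase the string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def overlap_score(command: str, element: Element) -> float:
    """Jaccard overlap between command tokens and element text tokens."""
    cmd = tokenize(command)
    elem = tokenize(element.text)
    if not cmd or not elem:
        return 0.0
    return len(cmd & elem) / len(cmd | elem)


def choose_element(command: str, elements: List[Element]) -> int:
    """Return the index of the highest-scoring candidate element."""
    return max(range(len(elements)),
               key=lambda i: overlap_score(command, elements[i]))


# Illustrative page with three candidate elements (made-up data).
page = [
    Element("a", "Home"),
    Element("a", "Contact us"),
    Element("input", "Search articles"),
]
best = choose_element("search for articles about cats", page)
```

A purely lexical scorer like this can only succeed when the command shares words with the element's text. Commands such as “top-most article” or “find who made this site” need spatial or functional information that this sketch does not model, which is the kind of phenomenon the dataset is meant to capture.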
Anthology ID:
D18-1540
Volume:
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
Month:
October-November
Year:
2018
Address:
Brussels, Belgium
Editors:
Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
4970–4976
URL:
https://aclanthology.org/D18-1540
DOI:
10.18653/v1/D18-1540
Cite (ACL):
Panupong Pasupat, Tian-Shun Jiang, Evan Liu, Kelvin Guu, and Percy Liang. 2018. Mapping natural language commands to web elements. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 4970–4976, Brussels, Belgium. Association for Computational Linguistics.
Cite (Informal):
Mapping natural language commands to web elements (Pasupat et al., EMNLP 2018)
PDF:
https://aclanthology.org/D18-1540.pdf
Code:
stanfordnlp/phrasenode