Charles Sutton


2023

Natural Language to Code Generation in Interactive Data Science Notebooks
Pengcheng Yin | Wen-Ding Li | Kefan Xiao | Abhishek Rao | Yeming Wen | Kensen Shi | Joshua Howland | Paige Bailey | Michele Catasta | Henryk Michalewski | Oleksandr Polozov | Charles Sutton
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Computational notebooks, such as Jupyter notebooks, are interactive computing environments widely used by data scientists for data wrangling and analysis tasks. To measure the performance of AI pair programmers that automatically synthesize programs for those tasks given natural language (NL) intents from users, we build ARCADE, a benchmark of 1078 code generation problems using the pandas data analysis framework in data science notebooks. ARCADE features multiple rounds of NL-to-code problems from the same notebook. It requires a model to understand rich multi-modal contexts, such as existing notebook cells and their execution states, as well as previous turns of interaction. To establish a strong baseline on this challenging task, we develop PaChiNCo, a 62B code language model (LM) for Python computational notebooks, which significantly outperforms public code LMs. Finally, we explore few-shot prompting strategies to elicit better code with step-by-step decomposition and NL explanation, showing the potential to improve the diversity and explainability of model predictions. ARCADE is publicly available at https://github.com/google-research/arcade-nl2code/.
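As a rough illustration of the kind of NL-to-pandas problem the abstract describes, a minimal sketch follows. The table and intent here are invented for illustration and are not drawn from the ARCADE benchmark itself:

```python
import pandas as pd

# Hypothetical notebook context: a small sales table defined in an earlier cell.
df = pd.DataFrame({
    "region": ["East", "West", "East", "West"],
    "units": [10, 7, 3, 12],
})

# NL intent: "What is the total number of units sold per region, highest first?"
# The model would be asked to produce code like the line below.
answer = df.groupby("region")["units"].sum().sort_values(ascending=False)
```

In the benchmark setting, the model sees the preceding cells (and, per the abstract, their execution states) as context rather than just the intent in isolation.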

2018

Deep Dungeons and Dragons: Learning Character-Action Interactions from Role-Playing Game Transcripts
Annie Louis | Charles Sutton
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 2 (Short Papers)

An essential aspect of understanding narratives is to grasp the interaction between characters in a story and the actions they take. We examine whether computational models can capture this interaction when both character attributes and actions are expressed as complex natural language descriptions. We propose role-playing games as a testbed for this problem, and introduce a large corpus of game transcripts collected from online discussion forums. Using neural language models that combine character and action descriptions from these stories, we show that the latent ties between characters and their actions can be learned: action sequences are better predicted when the character performing the action is also taken into account, and likewise character attributes are better predicted given the character's actions.

2006

Reducing Weight Undertraining in Structured Discriminative Learning
Charles Sutton | Michael Sindelar | Andrew McCallum
Proceedings of the Human Language Technology Conference of the NAACL, Main Conference

Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing
Ryan McDonald | Charles Sutton | Hal Daumé III | Andrew McCallum | Fernando Pereira | Jeff Bilmes
Proceedings of the Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing

2005

Joint Parsing and Semantic Role Labeling
Charles Sutton | Andrew McCallum
Proceedings of the Ninth Conference on Computational Natural Language Learning (CoNLL-2005)

Composition of Conditional Random Fields for Transfer Learning
Charles Sutton | Andrew McCallum
Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing