Content Aware Source Code Change Description Generation

Pablo Loyola, Edison Marrese-Taylor, Jorge Balazs, Yutaka Matsuo, Fumiko Satoh


Abstract
We propose to study the generation of descriptions from source code changes by integrating the messages included in code commits with the intra-code documentation found in the source in the form of docstrings. Our hypothesis is that although the two types of description are not directly aligned in semantic terms (one explains a change, the other the actual functionality of the code being modified), there could be some common ground that is useful for generation. To this end, we propose an architecture that uses the source code-docstring relationship to guide the description generation. We discuss the results of the approach in comparison with a baseline based on a sequence-to-sequence model, using standard automatic natural language generation metrics as well as a human study, thus offering a comprehensive view of the feasibility of the approach.
Anthology ID: W18-6513
Volume: Proceedings of the 11th International Conference on Natural Language Generation
Month: November
Year: 2018
Address: Tilburg University, The Netherlands
Editors: Emiel Krahmer, Albert Gatt, Martijn Goudbeek
Venue: INLG
SIG: SIGGEN
Publisher: Association for Computational Linguistics
Pages: 119–128
URL: https://aclanthology.org/W18-6513
DOI: 10.18653/v1/W18-6513
Cite (ACL): Pablo Loyola, Edison Marrese-Taylor, Jorge Balazs, Yutaka Matsuo, and Fumiko Satoh. 2018. Content Aware Source Code Change Description Generation. In Proceedings of the 11th International Conference on Natural Language Generation, pages 119–128, Tilburg University, The Netherlands. Association for Computational Linguistics.
Cite (Informal): Content Aware Source Code Change Description Generation (Loyola et al., INLG 2018)
PDF: https://aclanthology.org/W18-6513.pdf