Analysis of Equation Structure using Least Cost Parsing

R. Nigel Horspool, John Aycock


Abstract
Mathematical equations in LaTeX are composed with tags that express formatting rather than structure. For conversion from LaTeX to other word-processing systems, the structure of each equation must be inferred. We show how a form of least cost parsing, applied with a very general and ambiguous grammar, can select an appropriate structure for a LaTeX equation. MathML provides another application for the same technology; it has two alternative tagging schemes: presentation tags to specify formatting and content tags to specify structure. While conversion from content tagging to presentation tagging is straightforward, the converse is not. Our implementation of least cost parsing is based on Earley’s algorithm.
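To make the idea in the abstract concrete, the following is a minimal sketch of least cost parsing over an ambiguous grammar. It is not the authors' implementation: the abstract says theirs is based on Earley's algorithm, whereas this sketch swaps in a simpler CKY-style dynamic program over token spans. The grammar, the rule costs, the symbol names `E` and `A`, and the function `least_cost_parse` are all hypothetical choices made for illustration.

```python
from functools import lru_cache

# Hypothetical toy grammar (an assumption, not taken from the paper).
# Each entry is (right-hand side, cost). "A" stands for a single atom token.
RULES = {
    "E": [
        (("A",), 0),            # a lone symbol
        (("E", "E"), 1),        # juxtaposition, read as an implied product
        (("E", "^", "A"), 1),   # superscript that is a single atom: cheap
        (("E", "^", "E"), 3),   # superscript that is a compound: expensive
    ],
}
TERMINALS = {"^"}


def least_cost_parse(tokens, start="E"):
    """Return (cost, tree) for the cheapest parse of tokens, or None."""
    tokens = tuple(tokens)

    def match(rhs, i, j):
        # Yield (cost, child trees) for each way rhs can cover tokens[i:j],
        # with every right-hand-side symbol covering at least one token.
        if len(rhs) == 1:
            sub = best(rhs[0], i, j)
            if sub is not None:
                yield sub[0], (sub[1],)
            return
        for k in range(i + 1, j - len(rhs) + 2):
            head = best(rhs[0], i, k)
            if head is None:
                continue
            for total, kids in match(rhs[1:], k, j):
                yield head[0] + total, (head[1],) + kids

    @lru_cache(maxsize=None)
    def best(symbol, i, j):
        # Cheapest (cost, tree) for symbol spanning tokens[i:j], or None.
        if symbol == "A":
            if j - i == 1 and tokens[i] not in TERMINALS:
                return (0, tokens[i])
            return None
        if symbol in TERMINALS:
            return (0, symbol) if j - i == 1 and tokens[i] == symbol else None
        candidates = [
            (cost + total, (symbol,) + kids)
            for rhs, cost in RULES[symbol]
            for total, kids in match(rhs, i, j)
        ]
        return min(candidates, key=lambda c: c[0]) if candidates else None

    return best(start, 0, len(tokens))


if __name__ == "__main__":
    # Tokens of the LaTeX fragment x^2y; both (x^2)y and x^(2y) are derivable.
    print(least_cost_parse(["x", "^", "2", "y"]))
```

Under these assumed costs, the token sequence for `x^2y` has two derivations: grouping it as `(x^2)y` costs 2, while `x^(2y)` costs 4, so the sketch selects the first. Changing the costs changes which structure is selected, which is how the cost assignment steers an otherwise ambiguous grammar.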
Anthology ID:
2000.iwpt-1.36
Volume:
Proceedings of the Sixth International Workshop on Parsing Technologies
Month:
February 23-25
Year:
2000
Address:
Trento, Italy
Editors:
Alberto Lavelli, John Carroll, Robert C. Berwick, Harry C. Bunt, Bob Carpenter, Ken Church, Mark Johnson, Aravind Joshi, Ronald Kaplan, Martin Kay, Bernard Lang, Alon Lavie, Anton Nijholt, Christer Samuelsson, Mark Steedman, Oliviero Stock, Hozumi Tanaka, Masaru Tomita, Hans Uszkoreit, K. Vijay-Shanker, David Weir, Mats Wiren
Venue:
IWPT
SIG:
SIGPARSE
Publisher:
Association for Computational Linguistics
Pages:
307–308
URL:
https://aclanthology.org/2000.iwpt-1.36
Cite (ACL):
R. Nigel Horspool and John Aycock. 2000. Analysis of Equation Structure using Least Cost Parsing. In Proceedings of the Sixth International Workshop on Parsing Technologies, pages 307–308, Trento, Italy. Association for Computational Linguistics.
Cite (Informal):
Analysis of Equation Structure using Least Cost Parsing (Horspool & Aycock, IWPT 2000)
PDF:
https://aclanthology.org/2000.iwpt-1.36.pdf