Eleni Miltsakaki


2023

Enhancing Human Summaries for Question-Answer Generation in Education
Hannah Gonzalez | Liam Dugan | Eleni Miltsakaki | Zhiqi Cui | Jiaxuan Ren | Bryan Li | Shriyash Upadhyay | Etan Ginsberg | Chris Callison-Burch
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

We address the problem of generating high-quality question-answer pairs for educational materials. Previous work on this problem showed that using summaries as input improves the quality of question generation (QG) over original textbook text and that human-written summaries result in higher quality QG than automatic summaries. In this paper, a) we show that advances in Large Language Models (LLMs) are not yet sufficient to generate quality summaries for QG and b) we introduce a new methodology for enhancing bullet point student notes into fully fledged summaries and find that our methodology yields higher quality QG. We conducted a large-scale human annotation study of generated question-answer pairs to evaluate our methodology. To aid future research, we release a new dataset of 9.2K human annotations of generated questions.

Automatically Generated Summaries of Video Lectures May Enhance Students’ Learning Experience
Hannah Gonzalez | Jiening Li | Helen Jin | Jiaxuan Ren | Hongyu Zhang | Ayotomiwa Akinyele | Adrian Wang | Eleni Miltsakaki | Ryan Baker | Chris Callison-Burch
Proceedings of the 18th Workshop on Innovative Use of NLP for Building Educational Applications (BEA 2023)

We introduce a novel technique for automatically summarizing lecture videos using large language models such as GPT-3 and we present a user study investigating the effects on the studying experience when automatic summaries are added to lecture videos. We test students under different conditions and find that the students who are shown a summary next to a lecture video perform better on quizzes designed to test the course materials than the students who have access only to the video or the summary. Our findings suggest that adding automatic summaries to lecture videos enhances the learning experience. Qualitatively, students preferred summaries when studying under time constraints.

2022

A Feasibility Study of Answer-Agnostic Question Generation for Education
Liam Dugan | Eleni Miltsakaki | Shriyash Upadhyay | Etan Ginsberg | Hannah Gonzalez | DaHyeon Choi | Chuning Yuan | Chris Callison-Burch
Findings of the Association for Computational Linguistics: ACL 2022

We conduct a feasibility study into the applicability of answer-agnostic question generation models to textbook passages. We show that a significant portion of errors in such systems arise from asking irrelevant or un-interpretable questions and that such errors can be ameliorated by providing summarized input. We find that giving these models human-written summaries instead of the original text results in a significant increase in acceptability of generated questions (33% → 83%) as determined by expert annotators. We also find that, in the absence of human-written summaries, automatic summarization can serve as a good middle ground.

2019

Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification
Reno Kriz | João Sedoc | Marianna Apidianaki | Carolina Zheng | Gaurav Kumar | Eleni Miltsakaki | Chris Callison-Burch
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Sentence simplification is the task of rewriting texts so they are easier to understand. Recent research has applied sequence-to-sequence (Seq2Seq) models to this task, focusing largely on training-time improvements via reinforcement learning and memory augmentation. One of the main problems with applying generic Seq2Seq models for simplification is that these models tend to copy directly from the original sentence, resulting in outputs that are relatively long and complex. We aim to alleviate this issue through the use of two main techniques. First, we incorporate content word complexities, as predicted with a leveled word complexity model, into our loss function during training. Second, we generate a large set of diverse candidate simplifications at test time, and rerank these to promote fluency, adequacy, and simplicity. Here, we measure simplicity through a novel sentence complexity model. These extensions allow our models to perform competitively with state-of-the-art systems while generating simpler sentences. We report standard automatic and human evaluation metrics.

2018

Simplification Using Paraphrases and Context-Based Lexical Substitution
Reno Kriz | Eleni Miltsakaki | Marianna Apidianaki | Chris Callison-Burch
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)

Lexical simplification involves identifying complex words or phrases that need to be simplified, and recommending simpler meaning-preserving substitutes that can be more easily understood. We propose a complex word identification (CWI) model that exploits both lexical and contextual features, and a simplification mechanism which relies on a word-embedding lexical substitution model to replace the detected complex words with simpler paraphrases. We compare our CWI and lexical simplification models to several baselines, and evaluate the performance of our simplification system against human judgments. The results show that our models are able to detect complex words with higher accuracy than other commonly used methods, and propose good simplification substitutes in context. They also highlight the limited contribution of context features for CWI, which nonetheless improve simplification compared to context-unaware models.

2012

Do NLP and machine learning improve traditional readability formulas?
Thomas François | Eleni Miltsakaki
Proceedings of the First Workshop on Predicting and Improving Text Readability for target reader populations

2010

Corpus-based Semantics of Concession: Where do Expectations Come from?
Livio Robaldo | Eleni Miltsakaki | Alessia Bianchini
Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)

In this paper, we discuss our analysis and resulting new annotations of Penn Discourse Treebank (PDTB) data tagged as Concession. Concession arises whenever one of the two arguments creates an expectation and the other one denies it. In natural languages, typical discourse connectives conveying Concession are 'but', 'although', 'nevertheless', etc. Extending previous theoretical accounts, our corpus analysis reveals that concessive interpretations are due to different sources of expectation, each giving rise to critical inferences about the relationship of the involved eventualities. We identify four different sources of expectation: Causality, Implication, Correlation, and Implicature. The reliability of these categories is supported by a high inter-annotator agreement score, computed over a sample of one thousand tokens of explicit connectives annotated as Concession in PDTB. Following the earlier work of Hobbs (1998) and Davidson's (1967) notion of reification, we extend the logical account of Concession originally proposed by Robaldo et al. (2008) to provide refined formal descriptions for the first three sources of expectation in Concessive relations.

Antelogue: Pronoun Resolution for Text and Dialogue
Eleni Miltsakaki
Coling 2010: Demonstrations

2009

Matching Readers’ Preferences and Reading Skills with Appropriate Web Texts
Eleni Miltsakaki
Proceedings of the Demonstrations Session at EACL 2009

2008

The Penn Discourse TreeBank 2.0.
Rashmi Prasad | Nikhil Dinesh | Alan Lee | Eleni Miltsakaki | Livio Robaldo | Aravind Joshi | Bonnie Webber
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We present the second version of the Penn Discourse Treebank, PDTB-2.0, describing its lexically-grounded annotations of discourse relations and their two abstract object arguments over the 1 million word Wall Street Journal corpus. We describe all aspects of the annotation, including (a) the argument structure of discourse relations, (b) the sense annotation of the relations, and (c) the attribution of discourse relations and each of their arguments. We list the differences between PDTB-1.0 and PDTB-2.0. We present representative statistics for several aspects of the annotation in the corpus.

Real Time Web Text Classification and Analysis of Reading Difficulty
Eleni Miltsakaki | Audrey Troutt
Proceedings of the Third Workshop on Innovative Use of NLP for Building Educational Applications

Refining the Meaning of Sense Labels in PDTB: “Concession”
Livio Robaldo | Eleni Miltsakaki | Jerry R. Hobbs
Semantics in Text Processing. STEP 2008 Conference Proceedings

2005

Attribution and the (Non-)Alignment of Syntactic and Discourse Arguments of Connectives
Nikhil Dinesh | Alan Lee | Eleni Miltsakaki | Rashmi Prasad | Aravind Joshi | Bonnie Webber
Proceedings of the Workshop on Frontiers in Corpus Annotations II: Pie in the Sky

2004

The Penn Discourse Treebank
Eleni Miltsakaki | Rashmi Prasad | Aravind Joshi | Bonnie Webber
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Annotation and Data Mining of the Penn Discourse TreeBank
Rashmi Prasad | Eleni Miltsakaki | Aravind Joshi | Bonnie Webber
Proceedings of the Workshop on Discourse Annotation

Annotating Discourse Connectives and Their Arguments
Eleni Miltsakaki | Aravind Joshi | Rashmi Prasad | Bonnie Webber
Proceedings of the Workshop Frontiers in Corpus Annotation at HLT-NAACL 2004

2003

Anaphoric arguments of discourse connectives: Semantic properties of antecedents versus non-antecedents
Eleni Miltsakaki | Cassandre Creswell | Katherine Forbes | Aravind Joshi | Bonnie Webber
Proceedings of the 2003 EACL Workshop on The Computational Treatment of Anaphora

2002

Toward an Aposynthesis of Topic Continuity and Intrasentential Anaphora
Eleni Miltsakaki
Computational Linguistics, Volume 28, Number 3, September 2002

2000

The Role of Centering Theory’s Rough-Shift in the Teaching and Evaluation of Writing Skills
Eleni Miltsakaki | Karen Kukich
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics