<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://www.aclweb.org/aclwiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dimitra</id>
	<title>ACL Wiki - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://www.aclweb.org/aclwiki/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dimitra"/>
	<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/Special:Contributions/Dimitra"/>
	<updated>2026-04-09T00:12:01Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.43.6</generator>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12561</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12561"/>
		<updated>2019-06-17T14:22:14Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system. &lt;br /&gt;
&lt;br /&gt;
== Chimera ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/AmitMY/chimera&lt;br /&gt;
&lt;br /&gt;
Chimera is a component-based step-by-step pipeline for data-to-text generation based on https://arxiv.org/abs/1904.03396&lt;br /&gt;
It handles the necessary pre-processing for text-planning and surface realization which use neural networks, and does referring-expressions generation.&lt;br /&gt;
It can automatically evaluate datasets with a train-dev-test split, with both BLEU and data coverage.&lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
http://code.google.com/p/crisp-nlg/&lt;br /&gt;
&lt;br /&gt;
CRISP is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a programming language for the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== CODA Tools software Release 1.1 ==&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/tools_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains 1) software for converting text parsed with RST relations into dialogue and 2) an annotation tool for annotating dialogue and translating it into monologue (used for creating the CODA corpus).&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
https://www.cs.bgu.ac.il/~elhadad/install-fuf.html&lt;br /&gt;
&lt;br /&gt;
FUF/SURGE is a surface realisation system, based on functional unification grammar.&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
GenI is a surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html&lt;br /&gt;
&lt;br /&gt;
The Grammar Explorer provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
== GoPhi : an AMR to ENGLISH VERBALIZER == &lt;br /&gt;
&lt;br /&gt;
https://github.com/rali-udem/gophi&lt;br /&gt;
&lt;br /&gt;
GoPhi (Generation Of Parenthesized Human Input) is a system for generating a literal reading of Abstract Meaning Representation (AMR) structures. The system, written in SWI-Prolog, uses a symbolic approach to transform the original rooted graph into a tree of constituents that is transformed into an English sentence by jsRealB.&lt;br /&gt;
&lt;br /&gt;
== jsRealB ==&lt;br /&gt;
&lt;br /&gt;
http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser&lt;br /&gt;
&lt;br /&gt;
jsRealB is a text realizer designed specifically for the web, easy to learn and to use. This realizer allows its user to build a variety of French and English expressions and sentences, to add HTML tags to them and to easily integrate them into web pages. jsRealB can also be used in JavaScript applications by means of a Node.js module.&lt;br /&gt;
Sources for the programs, linguistic resources and demonstrations are available on the RALI GitHub [https://github.com/rali-udem/jsRealB].&lt;br /&gt;
&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is particularly targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars are under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
http://wiki.delph-in.net/moin/LkbTop&lt;br /&gt;
&lt;br /&gt;
LKB (Linguistic Knowledge Builder) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes as input Minimal Recursion Semantics (MRS). LKB is implemented in Common Lisp, and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live-CD, with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
== NLGen and NLGen2 ==&lt;br /&gt;
https://launchpad.net/nlgen&lt;br /&gt;
&lt;br /&gt;
https://launchpad.net/nlgen2&lt;br /&gt;
&lt;br /&gt;
The NLGen natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
NLGen2 uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
http://openccg.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
OpenCCG is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
== rLDCP: Text Generation from Data ==&lt;br /&gt;
https://cran.r-project.org/web/packages/rLDCP/index.html&lt;br /&gt;
&lt;br /&gt;
R package for text generation from data.&lt;br /&gt;
&lt;br /&gt;
== RNNLG ==&lt;br /&gt;
https://github.com/shawnwun/RNNLG&lt;br /&gt;
&lt;br /&gt;
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/simplenlg/simplenlg   (English)&lt;br /&gt;
&lt;br /&gt;
https://github.com/rali-udem/SimpleNLG-EnFr  (English and French)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-GL    (Galician)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-ES    (Spanish)&lt;br /&gt;
&lt;br /&gt;
SimpleNLG is a simple Java-based realiser.  Its grammatical coverage and syntactic knowledge are small compared to KPML or FUF/SURGE. However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use it.  It has a Java API, and can be used from other languages via an XML interface.  There are &amp;quot;unofficial&amp;quot; ports to other programming languages such as Python and Ruby. Versions for other human languages are being worked on, including [https://aclweb.org/anthology/W18-6508 Dutch], [https://github.com/alexmazzei/SimpleNLG-IT Italian], and [https://aclweb.org/anthology/papers/W/W18/W18-6506/ Mandarin].&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
http://www.cs.rutgers.edu/~mdstone/nlg.html&lt;br /&gt;
&lt;br /&gt;
SPUD (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/standup-315.php&lt;br /&gt;
&lt;br /&gt;
STANDUP (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
http://www.suregen.de/00023.html&lt;br /&gt;
&lt;br /&gt;
Suregen is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
== TGen == &lt;br /&gt;
A statistical generator generating sentences from dialogue acts or similar representations, based on the sequence-to-sequence (seq2seq) neural network architecture. Beams generated using seq2seq are reranked based on whether they conform to the input meaning representation. The system is written in Python and uses Tensorflow.&lt;br /&gt;
&lt;br /&gt;
Link: https://github.com/UFAL-DSG/tgen&lt;br /&gt;
&lt;br /&gt;
Paper: https://aclweb.org/anthology/P16-2008&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12560</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12560"/>
		<updated>2019-06-17T14:19:59Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== Personage Stylistic Variation for NLG ===&lt;br /&gt;
https://nlds.soe.ucsc.edu/stylistic-variation-nlg&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions in different Big-Five personality styles.&lt;br /&gt;
&lt;br /&gt;
=== Personage Sentence Planning for NLG === &lt;br /&gt;
https://nlds.soe.ucsc.edu/sentence-planning-NLG&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions using sentence planning operations of various kinds.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations, short descriptions, and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
==Surface Realisation ==&lt;br /&gt;
&lt;br /&gt;
=== Surface Realization Shared Task 2018 (SR&#039;18) dataset ===&lt;br /&gt;
http://taln.upf.edu/pages/msr2018-ws/SRST.html#data&lt;br /&gt;
&lt;br /&gt;
Description: A multilingual dataset automatically converted from the Universal Dependencies v2.0, comprising unordered syntactic structures (10 languages) and predicate-argument structures (3 languages).&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Alex Context NLG Dataset===&lt;br /&gt;
https://github.com/UFAL-DSG/alex_context_nlg_dataset&lt;br /&gt;
&lt;br /&gt;
A dataset for NLG in dialogue systems in the public transport information domain. It includes preceding context along with each data instance, which should allow NLG systems trained on this data to adapt to the user&#039;s way of speaking and thereby improve perceived naturalness. Papers: http://workshop.colips.org/re-wochat/documents/02_Paper_6.pdf, https://www.aclweb.org/anthology/W16-3622&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a unique domain. Each data point contains a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated etc.), as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
=== CODA corpus Release 1.0 === &lt;br /&gt;
http://computing.open.ac.uk/coda/resources/code_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains approximately 700 turns of human-authored expository dialogue (by Mark Twain and George Berkeley) which has been aligned with monologue that expresses the same information as the dialogue. The monologue side is annotated with Coherence Relations (RST), and the dialogue side with Dialogue Act tags.&lt;br /&gt;
&lt;br /&gt;
=== Hotel Dialogs for NLG === &lt;br /&gt;
https://nlds.soe.ucsc.edu/hotels&lt;br /&gt;
&lt;br /&gt;
This set of hotel corpora includes a set of paraphrases, room and property descriptions, and full hotel dialogues aimed at exploring different ways of eliciting dialogic, conversational descriptions about hotels.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
Dataset for abstractive summarization constructed using Reddit posts. It is the largest corpus (approximately 3 million posts) of informal text such as social media text, and can be used to train neural networks for summarization.&lt;br /&gt;
&lt;br /&gt;
== Image description ==&lt;br /&gt;
&lt;br /&gt;
===Chinese===&lt;br /&gt;
* Flickr8k-CN: http://lixirong.net/datasets/flickr8kcn&lt;br /&gt;
&lt;br /&gt;
===Dutch===&lt;br /&gt;
&lt;br /&gt;
* DIDEC: http://didec.uvt.nl&lt;br /&gt;
* Flickr30K https://github.com/cltl/DutchDescriptions&lt;br /&gt;
&lt;br /&gt;
===German===&lt;br /&gt;
* Multi30K: http://www.statmt.org/wmt16/multimodal-task.html&lt;br /&gt;
&lt;br /&gt;
== Question Generation ==&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC 2010 Generating Questions from Sentences Corpus ===&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/qg_form.html&lt;br /&gt;
&lt;br /&gt;
A corpus of over 1000 questions (both human and machine generated). The automatically generated questions have been rated by several raters according to five criteria (relevance, question type, syntactic correctness and fluency, ambiguity, and variety).&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC+ ===&lt;br /&gt;
https://github.com/Keith-Godwin/QG-STEC-plus &lt;br /&gt;
&lt;br /&gt;
Improved annotations for the QGSTEC corpus (with higher inter-rater reliability) as described in [http://oro.open.ac.uk/47284/ Godwin and Piwek (2016)].&lt;br /&gt;
&lt;br /&gt;
==Challenge Data Repository ==&lt;br /&gt;
&lt;br /&gt;
https://sites.google.com/site/genchalrepository/ &lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology.&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=NLG_research_groups&amp;diff=12559</id>
		<title>NLG research groups</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=NLG_research_groups&amp;diff=12559"/>
		<updated>2019-06-17T12:43:01Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;* [http://mcs.open.ac.uk/nlg/ The Open University Natural Language Generation Group]&lt;br /&gt;
* [http://www.siggen.org/ ACL Special Interest Group on Natural Language Generation (SIGGEN)]&lt;br /&gt;
* [https://www.abdn.ac.uk/ncs/departments/computing-science/natural-language-generation-187.php Natural Language Generation, School of Natural and Computing Sciences, The University of Aberdeen]&lt;br /&gt;
* [https://www.isi.edu/research_groups/nlg/home Information Sciences Institute, University of Southern California]&lt;br /&gt;
* [http://web.inf.ed.ac.uk/ilcc Institute for Language, Cognition and Computation, School of Informatics, The University of Edinburgh]&lt;br /&gt;
* [http://nlp.seas.harvard.edu/ Harvard NLP, Harvard University]&lt;br /&gt;
* [https://sites.google.com/site/hwinteractionlab/ Interaction Lab, School of Mathematical and Computer Sciences, Heriot-Watt University]&lt;br /&gt;
* [https://synalp.loria.fr/ SyNaLP, LORIA]&lt;br /&gt;
* [https://www.sheffield.ac.uk/dcs/research/groups/nlp Natural Language Processing Group, The University of Sheffield]&lt;br /&gt;
* [https://www.cs.washington.edu/ Paul G. Allen School of Computer Science and Engineering, University of Washington]&lt;br /&gt;
* [https://www.upf.edu/web/taln TALN Natural Language Processing Research Group, Pompeu Fabra University]&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12558</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12558"/>
		<updated>2019-06-17T12:41:57Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== Personage Stylistic Variation for NLG ===&lt;br /&gt;
https://nlds.soe.ucsc.edu/stylistic-variation-nlg&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions in different Big-Five personality styles.&lt;br /&gt;
&lt;br /&gt;
=== Personage Sentence Planning for NLG === &lt;br /&gt;
https://nlds.soe.ucsc.edu/sentence-planning-NLG&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions using sentence planning operations of various kinds.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations and short and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
==Surface Realisation ==&lt;br /&gt;
&lt;br /&gt;
=== Surface Realization Shared Task 2018 (SR&#039;18) dataset ===&lt;br /&gt;
http://taln.upf.edu/pages/msr2018-ws/SRST.html#data&lt;br /&gt;
&lt;br /&gt;
Description: A multilingual dataset automatically converted from the Universal Dependencies v2.0, comprising unordered syntactic structures (10 languages) and predicate-argument structures (3 languages).&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Alex Context NLG Dataset===&lt;br /&gt;
https://github.com/UFAL-DSG/alex_context_nlg_dataset&lt;br /&gt;
&lt;br /&gt;
A dataset for NLG in dialogue systems in the public transport information domain. It includes the preceding context along with each data instance, which should allow NLG systems trained on this data to adapt to the user&#039;s way of speaking and thereby improve perceived naturalness. Papers: http://workshop.colips.org/re-wochat/documents/02_Paper_6.pdf, https://www.aclweb.org/anthology/W16-3622&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a different domain. Each data point contains a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices of Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
=== CODA corpus Release 1.0 === &lt;br /&gt;
http://computing.open.ac.uk/coda/resources/code_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains approximately 700 turns of human-authored expository dialogue (by Mark Twain and George Berkeley) which has been aligned with monologue that expresses the same information as the dialogue. The monologue side is annotated with Coherence Relations (RST), and the dialogue side with Dialogue Act tags.&lt;br /&gt;
&lt;br /&gt;
=== Hotel Dialogs for NLG === &lt;br /&gt;
https://nlds.soe.ucsc.edu/hotels&lt;br /&gt;
&lt;br /&gt;
This set of hotel corpora includes a set of paraphrases, room and property descriptions, and full hotel dialogues aimed at exploring different ways of eliciting dialogic, conversational descriptions about hotels.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
A dataset for abstractive summarization constructed from Reddit posts. It is the largest corpus (approximately 3 million posts) of informal text such as social media text, and can be used to train neural networks for summarization.&lt;br /&gt;
&lt;br /&gt;
== Image description ==&lt;br /&gt;
&lt;br /&gt;
===Chinese===&lt;br /&gt;
* Flickr8k-CN: http://lixirong.net/datasets/flickr8kcn&lt;br /&gt;
&lt;br /&gt;
===Dutch===&lt;br /&gt;
&lt;br /&gt;
* DIDEC: http://didec.uvt.nl&lt;br /&gt;
* Flickr30K: https://github.com/cltl/DutchDescriptions&lt;br /&gt;
&lt;br /&gt;
===German===&lt;br /&gt;
* Multi30K: http://www.statmt.org/wmt16/multimodal-task.html&lt;br /&gt;
&lt;br /&gt;
== Question Generation ==&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC 2010 Generating Questions from Sentences Corpus ===&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/qg_form.html&lt;br /&gt;
&lt;br /&gt;
A corpus of over 1000 questions (both human and machine generated). The automatically generated questions have been rated by several raters according to five criteria (relevance, question type, syntactic correctness and fluency, ambiguity, and variety).&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC+ ===&lt;br /&gt;
https://github.com/Keith-Godwin/QG-STEC-plus &lt;br /&gt;
&lt;br /&gt;
Improved annotations for the QGSTEC corpus (with higher inter-rater reliability) as described in [http://oro.open.ac.uk/47284/ Godwin and Piwek (2016)].&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology.&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12557</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12557"/>
		<updated>2019-06-17T12:39:46Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Dialogue */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== Personage Stylistic Variation for NLG ===&lt;br /&gt;
https://nlds.soe.ucsc.edu/stylistic-variation-nlg&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions in different Big-Five personality styles.&lt;br /&gt;
&lt;br /&gt;
=== Personage Sentence Planning for NLG === &lt;br /&gt;
https://nlds.soe.ucsc.edu/sentence-planning-NLG&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions using sentence planning operations of various kinds.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations and short and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Alex Context NLG Dataset===&lt;br /&gt;
https://github.com/UFAL-DSG/alex_context_nlg_dataset&lt;br /&gt;
&lt;br /&gt;
A dataset for NLG in dialogue systems in the public transport information domain. It includes the preceding context along with each data instance, which should allow NLG systems trained on this data to adapt to the user&#039;s way of speaking and thereby improve perceived naturalness. Papers: http://workshop.colips.org/re-wochat/documents/02_Paper_6.pdf, https://www.aclweb.org/anthology/W16-3622&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a different domain. Each data point contains a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices of Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
=== CODA corpus Release 1.0 === &lt;br /&gt;
http://computing.open.ac.uk/coda/resources/code_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains approximately 700 turns of human-authored expository dialogue (by Mark Twain and George Berkeley) which has been aligned with monologue that expresses the same information as the dialogue. The monologue side is annotated with Coherence Relations (RST), and the dialogue side with Dialogue Act tags.&lt;br /&gt;
&lt;br /&gt;
=== Hotel Dialogs for NLG === &lt;br /&gt;
https://nlds.soe.ucsc.edu/hotels&lt;br /&gt;
&lt;br /&gt;
This set of hotel corpora includes a set of paraphrases, room and property descriptions, and full hotel dialogues aimed at exploring different ways of eliciting dialogic, conversational descriptions about hotels.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
A dataset for abstractive summarization constructed from Reddit posts. It is the largest corpus (approximately 3 million posts) of informal text such as social media text, and can be used to train neural networks for summarization.&lt;br /&gt;
&lt;br /&gt;
== Image description ==&lt;br /&gt;
&lt;br /&gt;
===Chinese===&lt;br /&gt;
* Flickr8k-CN: http://lixirong.net/datasets/flickr8kcn&lt;br /&gt;
&lt;br /&gt;
===Dutch===&lt;br /&gt;
&lt;br /&gt;
* DIDEC: http://didec.uvt.nl&lt;br /&gt;
* Flickr30K: https://github.com/cltl/DutchDescriptions&lt;br /&gt;
&lt;br /&gt;
===German===&lt;br /&gt;
* Multi30K: http://www.statmt.org/wmt16/multimodal-task.html&lt;br /&gt;
&lt;br /&gt;
== Question Generation ==&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC 2010 Generating Questions from Sentences Corpus ===&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/qg_form.html&lt;br /&gt;
&lt;br /&gt;
A corpus of over 1000 questions (both human and machine generated). The automatically generated questions have been rated by several raters according to five criteria (relevance, question type, syntactic correctness and fluency, ambiguity, and variety).&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC+ ===&lt;br /&gt;
https://github.com/Keith-Godwin/QG-STEC-plus &lt;br /&gt;
&lt;br /&gt;
Improved annotations for the QGSTEC corpus (with higher inter-rater reliability) as described in [http://oro.open.ac.uk/47284/ Godwin and Piwek (2016)].&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology.&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12556</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12556"/>
		<updated>2019-06-17T12:37:57Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* TGen */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system. &lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
http://code.google.com/p/crisp-nlg/&lt;br /&gt;
&lt;br /&gt;
CRISP is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a programming language for the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== CODA Tools software Release 1.1 ==&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/tools_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains 1) software for converting text parsed with RST relations into dialogue and 2) an annotation tool for annotating dialogue and translating it into monologue (used for creating the CODA corpus).&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
https://www.cs.bgu.ac.il/~elhadad/install-fuf.html&lt;br /&gt;
&lt;br /&gt;
FUF/SURGE is a surface realisation system, based on functional unification grammar.&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
GenI is a surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html&lt;br /&gt;
&lt;br /&gt;
The Grammar Explorer provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
== GoPhi : an AMR to ENGLISH VERBALIZER == &lt;br /&gt;
&lt;br /&gt;
https://github.com/rali-udem/gophi&lt;br /&gt;
&lt;br /&gt;
GoPhi (Generation Of Parenthesized Human Input) is a system for generating a literal reading of Abstract Meaning Representation (AMR) structures. The system, written in SWI-Prolog, uses a symbolic approach to transform the original rooted graph into a tree of constituents that is transformed into an English sentence by jsRealB.&lt;br /&gt;
&lt;br /&gt;
== jsRealB ==&lt;br /&gt;
&lt;br /&gt;
http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser&lt;br /&gt;
&lt;br /&gt;
jsRealB is a text realizer designed specifically for the web, easy to learn and to use. This realizer allows its user to build a variety of French and English expressions and sentences, to add HTML tags to them and to easily integrate them into web pages. jsRealB can also be used in JavaScript applications by means of a Node.js module.&lt;br /&gt;
Sources for the programs, linguistic resources and demonstrations are available on the RALI GitHub [https://github.com/rali-udem/jsRealB].&lt;br /&gt;
&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is particularly targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars are under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
http://wiki.delph-in.net/moin/LkbTop&lt;br /&gt;
&lt;br /&gt;
LKB (Linguistic Knowledge Builder) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes as input Minimal Recursion Semantics (MRS). LKB is implemented in Common Lisp, and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live-CD, with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
== NLGen and NLGen2 ==&lt;br /&gt;
https://launchpad.net/nlgen&lt;br /&gt;
&lt;br /&gt;
https://launchpad.net/nlgen2&lt;br /&gt;
&lt;br /&gt;
The NLGen natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
NLGen2 uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
http://openccg.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
OpenCCG is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
== rLDCP: Text Generation from Data ==&lt;br /&gt;
https://cran.r-project.org/web/packages/rLDCP/index.html&lt;br /&gt;
&lt;br /&gt;
R package for text generation from data&lt;br /&gt;
&lt;br /&gt;
== RNNLG ==&lt;br /&gt;
https://github.com/shawnwun/RNNLG&lt;br /&gt;
&lt;br /&gt;
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/simplenlg/simplenlg   (English)&lt;br /&gt;
&lt;br /&gt;
https://github.com/rali-udem/SimpleNLG-EnFr  (English and French)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-GL    (Galician)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-ES    (Spanish)&lt;br /&gt;
&lt;br /&gt;
SimpleNLG is a simple Java-based realiser.  Its grammatical coverage and syntactic knowledge are small compared to KPML or FUF/SURGE. However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use it.  It has a Java API, and can be used from other languages via an XML interface.  There are &amp;quot;unofficial&amp;quot; ports to other programming languages such as Python and Ruby. Versions for other human languages are being worked on, including [https://aclweb.org/anthology/W18-6508 Dutch], [https://github.com/alexmazzei/SimpleNLG-IT Italian], and [https://aclweb.org/anthology/papers/W/W18/W18-6506/ Mandarin].&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
http://www.cs.rutgers.edu/~mdstone/nlg.html&lt;br /&gt;
&lt;br /&gt;
SPUD (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/standup-315.php&lt;br /&gt;
&lt;br /&gt;
STANDUP (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
http://www.suregen.de/00023.html&lt;br /&gt;
&lt;br /&gt;
Suregen is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
== TGen == &lt;br /&gt;
------&lt;br /&gt;
A statistical generator that produces sentences from dialogue acts or similar representations, based on the sequence-to-sequence (seq2seq) neural network architecture. Beams generated using seq2seq are reranked based on whether they conform to the input meaning representation. The system is written in Python and uses TensorFlow.&lt;br /&gt;
&lt;br /&gt;
Link: https://github.com/UFAL-DSG/tgen&lt;br /&gt;
&lt;br /&gt;
Paper: https://aclweb.org/anthology/P16-2008&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12555</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12555"/>
		<updated>2019-06-17T12:37:35Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system. &lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
http://code.google.com/p/crisp-nlg/&lt;br /&gt;
&lt;br /&gt;
CRISP is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a programming language for the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== CODA Tools software Release 1.1 ==&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/tools_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains 1) software for converting text parsed with RST relations into dialogue and 2) an annotation tool for annotating dialogue and translating it into monologue (used for creating the CODA corpus).&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
https://www.cs.bgu.ac.il/~elhadad/install-fuf.html&lt;br /&gt;
&lt;br /&gt;
FUF/SURGE is a surface realisation system, based on functional unification grammar.&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
GenI is a surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html&lt;br /&gt;
&lt;br /&gt;
The Grammar Explorer provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
== GoPhi : an AMR to ENGLISH VERBALIZER == &lt;br /&gt;
&lt;br /&gt;
https://github.com/rali-udem/gophi&lt;br /&gt;
&lt;br /&gt;
GoPhi (Generation Of Parenthesized Human Input) is a system for generating a literal reading of Abstract Meaning Representation (AMR) structures. The system, written in SWI-Prolog, uses a symbolic approach to transform the original rooted graph into a tree of constituents that is transformed into an English sentence by jsRealB.&lt;br /&gt;
&lt;br /&gt;
== jsRealB ==&lt;br /&gt;
&lt;br /&gt;
http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser&lt;br /&gt;
&lt;br /&gt;
jsRealB is a text realizer designed specifically for the web, easy to learn and to use. This realizer allows its user to build a variety of French and English expressions and sentences, to add HTML tags to them and to easily integrate them into web pages. jsRealB can also be used in JavaScript applications by means of a Node.js module.&lt;br /&gt;
Sources for the programs, linguistic resources and demonstrations are available on the RALI GitHub [https://github.com/rali-udem/jsRealB].&lt;br /&gt;
&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is particularly targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue—for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars is under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
http://wiki.delph-in.net/moin/LkbTop&lt;br /&gt;
&lt;br /&gt;
LKB (Linguistic Knowledge Builder) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes as input Minimal Recursion Semantics (MRS). LKB is implemented in Common Lisp, and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live CD with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
== NLGen and NLGen2 ==&lt;br /&gt;
https://launchpad.net/nlgen&lt;br /&gt;
&lt;br /&gt;
https://launchpad.net/nlgen2&lt;br /&gt;
&lt;br /&gt;
The NLGen natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
NLGen2 uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
http://openccg.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
OpenCCG is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
== rLDCP: Text Generation from Data ==&lt;br /&gt;
https://cran.r-project.org/web/packages/rLDCP/index.html&lt;br /&gt;
&lt;br /&gt;
R package for text generation from data&lt;br /&gt;
&lt;br /&gt;
== RNNLG ==&lt;br /&gt;
https://github.com/shawnwun/RNNLG&lt;br /&gt;
&lt;br /&gt;
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/simplenlg/simplenlg   (English)&lt;br /&gt;
&lt;br /&gt;
https://github.com/rali-udem/SimpleNLG-EnFr  (English and French)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-GL    (Galician)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-ES    (Spanish)&lt;br /&gt;
&lt;br /&gt;
SimpleNLG is a simple Java-based realiser.  Its grammatical coverage and syntactic knowledge are small compared to KPML or FUF/SURGE. However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use it.  It has a Java API, and can be used from other languages via an XML interface.  There are &amp;quot;unofficial&amp;quot; ports to other programming languages such as Python and Ruby. Versions for other human languages are being worked on, including [https://aclweb.org/anthology/W18-6508 Dutch], [https://github.com/alexmazzei/SimpleNLG-IT Italian], and [https://aclweb.org/anthology/papers/W/W18/W18-6506/ Mandarin].&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
http://www.cs.rutgers.edu/~mdstone/nlg.html&lt;br /&gt;
&lt;br /&gt;
SPUD (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/standup-315.php&lt;br /&gt;
&lt;br /&gt;
STANDUP (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
http://www.suregen.de/00023.html&lt;br /&gt;
&lt;br /&gt;
Suregen is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
== TGen == &lt;br /&gt;
------&lt;br /&gt;
A statistical generator that produces sentences from dialogue acts or similar representations, based on the sequence-to-sequence (seq2seq) neural network architecture. Beams generated using seq2seq are reranked based on whether they conform to the input meaning representation. The system is written in Python and uses TensorFlow.&lt;br /&gt;
Link: https://github.com/UFAL-DSG/tgen&lt;br /&gt;
Paper: https://aclweb.org/anthology/P16-2008&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12486</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12486"/>
		<updated>2019-04-14T10:07:11Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Dialogue */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== Personage Stylistic Variation for NLG ===&lt;br /&gt;
https://nlds.soe.ucsc.edu/stylistic-variation-nlg&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions in different Big-Five personality styles.&lt;br /&gt;
&lt;br /&gt;
=== Personage Sentence Planning for NLG === &lt;br /&gt;
https://nlds.soe.ucsc.edu/sentence-planning-NLG&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions using sentence planning operations of various kinds.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations and short and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains four NLG datasets for dialogue system development, each in a different domain. Each data point contains a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices of Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
=== CODA corpus Release 1.0 === &lt;br /&gt;
http://computing.open.ac.uk/coda/resources/code_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains approximately 700 turns of human-authored expository dialogue (by Mark Twain and George Berkeley) which has been aligned with monologue that expresses the same information as the dialogue. The monologue side is annotated with Coherence Relations (RST), and the dialogue side with Dialogue Act tags.&lt;br /&gt;
&lt;br /&gt;
=== Hotel Dialogs for NLG === &lt;br /&gt;
https://nlds.soe.ucsc.edu/hotels&lt;br /&gt;
&lt;br /&gt;
This set of hotel corpora includes a set of paraphrases, room and property descriptions, and full hotel dialogues aimed at exploring different ways of eliciting dialogic, conversational descriptions about hotels.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
Dataset for abstractive summarization constructed from Reddit posts. With approximately 3 million posts, it is the largest corpus of informal text such as social media text, and can be used to train neural networks for summarization.&lt;br /&gt;
&lt;br /&gt;
== Image description ==&lt;br /&gt;
&lt;br /&gt;
===Chinese===&lt;br /&gt;
* Flickr8k-CN: http://lixirong.net/datasets/flickr8kcn&lt;br /&gt;
&lt;br /&gt;
===Dutch===&lt;br /&gt;
&lt;br /&gt;
* DIDEC: http://didec.uvt.nl&lt;br /&gt;
* Flickr30K: https://github.com/cltl/DutchDescriptions&lt;br /&gt;
&lt;br /&gt;
===German===&lt;br /&gt;
* Multi30K: http://www.statmt.org/wmt16/multimodal-task.html&lt;br /&gt;
&lt;br /&gt;
== Question Generation ==&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC 2010 Generating Questions from Sentences Corpus ===&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/qg_form.html&lt;br /&gt;
&lt;br /&gt;
A corpus of over 1000 questions (both human and machine generated). The automatically generated questions have been rated by several raters according to five criteria (relevance, question type, syntactic correctness and fluency, ambiguity, and variety).&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC+ ===&lt;br /&gt;
https://github.com/Keith-Godwin/QG-STEC-plus &lt;br /&gt;
&lt;br /&gt;
Improved annotations for the QGSTEC corpus (with higher inter-rater reliability) as described in [http://oro.open.ac.uk/47284/ Godwin and Piwek (2016)].&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology.&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12485</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12485"/>
		<updated>2019-04-14T10:06:35Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== Personage Stylistic Variation for NLG ===&lt;br /&gt;
https://nlds.soe.ucsc.edu/stylistic-variation-nlg&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions in different Big-Five personality styles.&lt;br /&gt;
&lt;br /&gt;
=== Personage Sentence Planning for NLG === &lt;br /&gt;
https://nlds.soe.ucsc.edu/sentence-planning-NLG&lt;br /&gt;
&lt;br /&gt;
This dataset provides training data for natural language generation of restaurant descriptions using sentence planning operations of various kinds.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG === &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations as well as short and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation ==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a different domain. Each data point contains a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated etc.), as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
=== CODA corpus Release 1.0 === &lt;br /&gt;
http://computing.open.ac.uk/coda/resources/code_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains approximately 700 turns of human-authored expository dialogue (by Mark Twain and George Berkeley) which has been aligned with monologue that expresses the same information as the dialogue. The monologue side is annotated with Coherence Relations (RST), and the dialogue side with Dialogue Act tags.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
Dataset for abstractive summarization constructed using Reddit posts. It is the largest corpus (approximately 3 million posts) for informal text such as social media text, and can be used to train neural networks for summarization.&lt;br /&gt;
&lt;br /&gt;
== Image description ==&lt;br /&gt;
&lt;br /&gt;
===Chinese===&lt;br /&gt;
* Flickr8k-CN: http://lixirong.net/datasets/flickr8kcn&lt;br /&gt;
&lt;br /&gt;
===Dutch===&lt;br /&gt;
&lt;br /&gt;
* DIDEC: http://didec.uvt.nl&lt;br /&gt;
* Flickr30K: https://github.com/cltl/DutchDescriptions&lt;br /&gt;
&lt;br /&gt;
===German===&lt;br /&gt;
* Multi30K: http://www.statmt.org/wmt16/multimodal-task.html&lt;br /&gt;
&lt;br /&gt;
== Question Generation ==&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC 2010 Generating Questions from Sentences Corpus ===&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/qg_form.html&lt;br /&gt;
&lt;br /&gt;
A corpus of over 1000 questions (both human and machine generated). The automatically generated questions have been rated by several raters according to five criteria (relevance, question type, syntactic correctness and fluency, ambiguity, and variety).&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC+ ===&lt;br /&gt;
https://github.com/Keith-Godwin/QG-STEC-plus &lt;br /&gt;
&lt;br /&gt;
Improved annotations for the QGSTEC corpus (with higher inter-rater reliability) as described in [http://oro.open.ac.uk/47284/ Godwin and Piwek (2016)].&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology.&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12484</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12484"/>
		<updated>2019-04-14T10:01:37Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system. &lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
http://code.google.com/p/crisp-nlg/&lt;br /&gt;
&lt;br /&gt;
CRISP is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a programming language for the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== CODA Tools software Release 1.1 ==&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/tools_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains 1) software for converting text parsed with RST relations into dialogue and 2) an annotation tool for annotating dialogue and translating it into monologue (used for creating the CODA corpus).&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
https://www.cs.bgu.ac.il/~elhadad/install-fuf.html&lt;br /&gt;
&lt;br /&gt;
FUF/SURGE is a surface realisation system, based on functional unification grammar.&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
GenI is a surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html&lt;br /&gt;
&lt;br /&gt;
The Grammar Explorer provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
== GoPhi : an AMR to ENGLISH VERBALIZER == &lt;br /&gt;
&lt;br /&gt;
https://github.com/rali-udem/gophi&lt;br /&gt;
&lt;br /&gt;
GoPhi (Generation Of Parenthesized Human Input) is a system for generating a literal reading of Abstract Meaning Representation (AMR) structures. The system, written in SWI-Prolog, uses a symbolic approach to transform the original rooted graph into a tree of constituents that is transformed into an English sentence by jsRealB.&lt;br /&gt;
&lt;br /&gt;
== jsRealB ==&lt;br /&gt;
&lt;br /&gt;
http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser&lt;br /&gt;
&lt;br /&gt;
jsRealB is a text realizer designed specifically for the web, easy to learn and to use. This realizer allows its user to build a variety of French and English expressions and sentences, to add HTML tags to them and to easily integrate them into web pages. jsRealB can also be used in JavaScript applications by means of a Node.js module.&lt;br /&gt;
Sources for the programs, linguistic resources and demonstrations are available on the RALI GitHub [https://github.com/rali-udem/jsRealB].&lt;br /&gt;
&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is particularly targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars is under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
http://wiki.delph-in.net/moin/LkbTop&lt;br /&gt;
&lt;br /&gt;
LKB (Linguistic Knowledge Builder) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes as input Minimal Recursion Semantics (MRS). LKB is implemented in Common Lisp, and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live CD with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
== NLGen and NLGen2 ==&lt;br /&gt;
https://launchpad.net/nlgen&lt;br /&gt;
&lt;br /&gt;
https://launchpad.net/nlgen2&lt;br /&gt;
&lt;br /&gt;
The NLGen natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
NLGen2 uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
http://openccg.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
OpenCCG is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
== rLDCP: Text Generation from Data ==&lt;br /&gt;
https://cran.r-project.org/web/packages/rLDCP/index.html&lt;br /&gt;
&lt;br /&gt;
R package for text generation from data&lt;br /&gt;
&lt;br /&gt;
== RNNLG ==&lt;br /&gt;
https://github.com/shawnwun/RNNLG&lt;br /&gt;
&lt;br /&gt;
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/simplenlg/simplenlg   (English)&lt;br /&gt;
&lt;br /&gt;
https://github.com/rali-udem/SimpleNLG-EnFr  (English and French)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-GL    (Galician)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-ES    (Spanish)&lt;br /&gt;
&lt;br /&gt;
SimpleNLG is a simple Java-based realiser.  Its grammatical coverage and syntactic knowledge are small compared to KPML or FUF/SURGE. However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use it.  It has a Java API, and can be used from other languages via an XML interface.  There are &amp;quot;unofficial&amp;quot; ports to other programming languages such as Python and Ruby. Versions for other human languages are being worked on, including [https://aclweb.org/anthology/W18-6508 Dutch], [https://github.com/alexmazzei/SimpleNLG-IT Italian], and [https://aclweb.org/anthology/papers/W/W18/W18-6506/ Mandarin].&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
http://www.cs.rutgers.edu/~mdstone/nlg.html&lt;br /&gt;
&lt;br /&gt;
SPUD (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/standup-315.php&lt;br /&gt;
&lt;br /&gt;
STANDUP (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
http://www.suregen.de/00023.html&lt;br /&gt;
&lt;br /&gt;
Suregen is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12483</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12483"/>
		<updated>2019-04-14T10:00:26Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* SimpleNLG */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system. &lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
http://code.google.com/p/crisp-nlg/&lt;br /&gt;
&lt;br /&gt;
CRISP is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a programming language for the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== CODA Tools software Release 1.1 ==&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/tools_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains 1) software for converting text parsed with RST relations into dialogue and 2) an annotation tool for annotating dialogue and translating it into monologue (used for creating the CODA corpus).&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
https://www.cs.bgu.ac.il/~elhadad/install-fuf.html&lt;br /&gt;
&lt;br /&gt;
FUF/SURGE is a surface realisation system, based on functional unification grammar.&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
GenI is a surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html&lt;br /&gt;
&lt;br /&gt;
The Grammar Explorer provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
== jsRealB ==&lt;br /&gt;
&lt;br /&gt;
http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser&lt;br /&gt;
&lt;br /&gt;
jsRealB is a text realizer designed specifically for the web, easy to learn and to use. This realizer allows its user to build a variety of French and English expressions and sentences, to add HTML tags to them and to easily integrate them into web pages. jsRealB can also be used in JavaScript applications by means of a Node.js module.&lt;br /&gt;
Sources for the programs, linguistic resources and demonstrations are available on the RALI GitHub [https://github.com/rali-udem/jsRealB].&lt;br /&gt;
&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is particularly targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars is under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
http://wiki.delph-in.net/moin/LkbTop&lt;br /&gt;
&lt;br /&gt;
LKB (Linguistic Knowledge Builder) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes as input Minimal Recursion Semantics (MRS). LKB is implemented in Common Lisp, and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live CD with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
== NLGen and NLGen2 ==&lt;br /&gt;
https://launchpad.net/nlgen&lt;br /&gt;
&lt;br /&gt;
https://launchpad.net/nlgen2&lt;br /&gt;
&lt;br /&gt;
The NLGen natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
NLGen2 uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
http://openccg.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
OpenCCG is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
== rLDCP: Text Generation from Data ==&lt;br /&gt;
https://cran.r-project.org/web/packages/rLDCP/index.html&lt;br /&gt;
&lt;br /&gt;
R package for text generation from data&lt;br /&gt;
&lt;br /&gt;
== RNNLG ==&lt;br /&gt;
https://github.com/shawnwun/RNNLG&lt;br /&gt;
&lt;br /&gt;
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/simplenlg/simplenlg   (English)&lt;br /&gt;
&lt;br /&gt;
https://github.com/rali-udem/SimpleNLG-EnFr  (English and French)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-GL    (Galician)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-ES    (Spanish)&lt;br /&gt;
&lt;br /&gt;
SimpleNLG is a simple Java-based realiser.  Its grammatical coverage and syntactic knowledge are small compared to KPML or FUF/SURGE. However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use it.  It has a Java API, and can be used from other languages via an XML interface.  There are &amp;quot;unofficial&amp;quot; ports to other programming languages such as Python and Ruby. Versions for other human languages are being worked on, including [https://aclweb.org/anthology/W18-6508 Dutch], [https://github.com/alexmazzei/SimpleNLG-IT Italian], and [https://aclweb.org/anthology/papers/W/W18/W18-6506/ Mandarin].&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
http://www.cs.rutgers.edu/~mdstone/nlg.html&lt;br /&gt;
&lt;br /&gt;
SPUD (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/standup-315.php&lt;br /&gt;
&lt;br /&gt;
STANDUP (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
http://www.suregen.de/00023.html&lt;br /&gt;
&lt;br /&gt;
Suregen is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12482</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12482"/>
		<updated>2019-04-14T09:59:02Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* jsRealB */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system. &lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
http://code.google.com/p/crisp-nlg/&lt;br /&gt;
&lt;br /&gt;
CRISP is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a scripting language for the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== CODA Tools software Release 1.1 ==&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/tools_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains 1) software for converting text parsed with RST relations into dialogue and 2) an annotation tool for annotating dialogue and translating it into monologue (used for creating the CODA corpus).&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
https://www.cs.bgu.ac.il/~elhadad/install-fuf.html&lt;br /&gt;
&lt;br /&gt;
FUF/SURGE is a surface realisation system, based on functional unification grammar.&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
GenI is a surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html&lt;br /&gt;
&lt;br /&gt;
The Grammar Explorer provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
== jsRealB ==&lt;br /&gt;
&lt;br /&gt;
http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser&lt;br /&gt;
&lt;br /&gt;
jsRealB is a text realizer designed specifically for the web, easy to learn and to use. This realizer allows its user to build a variety of French and English expressions and sentences, to add HTML tags to them and to easily integrate them into web pages. jsRealB can also be used in JavaScript applications by means of a Node.js module.&lt;br /&gt;
Sources for the programs, linguistic resources and demonstrations are available on the RALI GitHub [https://github.com/rali-udem/jsRealB].&lt;br /&gt;
&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is particularly targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue—for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars are under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
http://wiki.delph-in.net/moin/LkbTop&lt;br /&gt;
&lt;br /&gt;
LKB (Linguistic Knowledge Builder) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes as input Minimal Recursion Semantics (MRS). LKB is implemented in Common Lisp, and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live-CD, with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
== NLGen and NLGen2 ==&lt;br /&gt;
https://launchpad.net/nlgen&lt;br /&gt;
&lt;br /&gt;
https://launchpad.net/nlgen2&lt;br /&gt;
&lt;br /&gt;
The NLGen natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
NLGen2 uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
http://openccg.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
OpenCCG is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
== rLDCP: Text Generation from Data ==&lt;br /&gt;
https://cran.r-project.org/web/packages/rLDCP/index.html&lt;br /&gt;
&lt;br /&gt;
R package for text generation from data&lt;br /&gt;
&lt;br /&gt;
== RNNLG ==&lt;br /&gt;
https://github.com/shawnwun/RNNLG&lt;br /&gt;
&lt;br /&gt;
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/simplenlg/simplenlg   (English)&lt;br /&gt;
&lt;br /&gt;
http://www-etud.iro.umontreal.ca/~vaudrypl/snlgbil/snlgEnFr_english.html   (French)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-GL    (Galician)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-ES    (Spanish)&lt;br /&gt;
&lt;br /&gt;
SimpleNLG is a simple Java-based realiser.  Its grammatical coverage and syntactic knowledge are small compared to KPML or FUF/SURGE. However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use it.  It has a Java API, and can be used from other languages via an XML interface.  There are &amp;quot;unofficial&amp;quot; ports to other programming languages such as Python and Ruby. Versions for other human languages are being worked on, including [https://aclweb.org/anthology/W18-6508 Dutch], [https://github.com/alexmazzei/SimpleNLG-IT Italian], and [https://aclweb.org/anthology/papers/W/W18/W18-6506/ Mandarin].&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
http://www.cs.rutgers.edu/~mdstone/nlg.html&lt;br /&gt;
&lt;br /&gt;
SPUD (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/standup-315.php&lt;br /&gt;
&lt;br /&gt;
STANDUP (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
http://www.suregen.de/00023.html&lt;br /&gt;
&lt;br /&gt;
Suregen is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12481</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12481"/>
		<updated>2019-04-14T09:53:11Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations, short descriptions, and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowd sourced scene descriptions and an annotated REG corpus, both labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a unique domain. Each data point contains a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on InformationPresentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend , or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated etc.), as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
=== CODA corpus Release 1.0 === &lt;br /&gt;
http://computing.open.ac.uk/coda/resources/code_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains approximately 700 turns of human-authored expository dialogue (by Mark Twain and George Berkeley) which has been aligned with monologue that expresses the same information as the dialogue. The monologue side is annotated with Coherence Relations (RST), and the dialogue side with Dialogue Act tags.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
Dataset for abstractive summarization constructed using Reddit posts. It is the largest corpus (approximately 3 million posts) for informal text such as social media text, and can be used to train neural networks for summarization.&lt;br /&gt;
&lt;br /&gt;
== Image description ==&lt;br /&gt;
&lt;br /&gt;
===Chinese===&lt;br /&gt;
* Flickr8k-CN: http://lixirong.net/datasets/flickr8kcn&lt;br /&gt;
&lt;br /&gt;
===Dutch===&lt;br /&gt;
&lt;br /&gt;
* DIDEC: http://didec.uvt.nl&lt;br /&gt;
* Flickr30K https://github.com/cltl/DutchDescriptions&lt;br /&gt;
&lt;br /&gt;
===German===&lt;br /&gt;
* Multi30K: http://www.statmt.org/wmt16/multimodal-task.html&lt;br /&gt;
&lt;br /&gt;
== Question Generation ==&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC 2010 Generating Questions from Sentences Corpus ===&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/qg_form.html&lt;br /&gt;
&lt;br /&gt;
A corpus of over 1000 questions (both human and machine generated). The automatically generated questions have been rated by several raters according to five criteria (relevance, question type, syntactic correctness and fluency, ambiguity, and variety).&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC+ ===&lt;br /&gt;
https://github.com/Keith-Godwin/QG-STEC-plus &lt;br /&gt;
&lt;br /&gt;
Improved annotations for the QGSTEC corpus (with higher inter-rater reliability) as described in [http://oro.open.ac.uk/47284/ Godwin and Piwek (2016)].&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12480</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12480"/>
		<updated>2019-04-14T09:52:33Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations, short descriptions, and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a unique domain. Each data point contains a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated, etc.) as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
=== CODA corpus Release 1.0 === &lt;br /&gt;
http://computing.open.ac.uk/coda/resources/code_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains approximately 700 turns of human-authored expository dialogue (by Mark Twain and George Berkeley) which has been aligned with monologue that expresses the same information as the dialogue. The monologue side is annotated with Coherence Relations (RST), and the dialogue side with Dialogue Act tags.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
Dataset for abstractive summarization constructed using Reddit posts. It is the largest corpus (approximately 3 Million posts) for informal text such as Social Media text,  which can be used to train neural networks for summarization technology.&lt;br /&gt;
&lt;br /&gt;
== Image description ==&lt;br /&gt;
&lt;br /&gt;
===Chinese===&lt;br /&gt;
* Flickr8k-CN: http://lixirong.net/datasets/flickr8kcn&lt;br /&gt;
&lt;br /&gt;
===Dutch===&lt;br /&gt;
&lt;br /&gt;
* DIDEC: http://didec.uvt.nl&lt;br /&gt;
* Flickr30K: https://github.com/cltl/DutchDescriptions&lt;br /&gt;
&lt;br /&gt;
===German===&lt;br /&gt;
* Multi30K: http://www.statmt.org/wmt16/multimodal-task.html&lt;br /&gt;
&lt;br /&gt;
== Question Generation ==&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC 2010 Generating Questions from Sentences Corpus ===&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/qg_form.html&lt;br /&gt;
&lt;br /&gt;
A corpus of over 1000 questions (both human and machine generated). The automatically generated questions have been rated by several raters according to five criteria (relevance, question type, syntactic correctness and fluency, ambiguity, and variety).&lt;br /&gt;
&lt;br /&gt;
=== QGSTEC+ === &lt;br /&gt;
https://github.com/Keith-Godwin/QG-STEC-plus &lt;br /&gt;
&lt;br /&gt;
Improved annotations for the QGSTEC corpus (with higher inter-rater reliability) as described in [http://oro.open.ac.uk/47284/ Godwin and Piwek (2016)].&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12479</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12479"/>
		<updated>2019-04-14T09:48:56Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* = CODA corpus Release 1.0 */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations as well as short and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a unique domain. Each data point contains a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated, etc.) as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
=== CODA corpus Release 1.0 === &lt;br /&gt;
http://computing.open.ac.uk/coda/resources/code_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains approximately 700 turns of human-authored expository dialogue (by Mark Twain and George Berkeley) which has been aligned with monologue that expresses the same information as the dialogue. The monologue side is annotated with Coherence Relations (RST), and the dialogue side with Dialogue Act tags.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
Dataset for abstractive summarization constructed using Reddit posts. It is the largest corpus (approximately 3 Million posts) for informal text such as Social Media text,  which can be used to train neural networks for summarization technology.&lt;br /&gt;
&lt;br /&gt;
== Image description ==&lt;br /&gt;
&lt;br /&gt;
===Chinese===&lt;br /&gt;
* Flickr8k-CN: http://lixirong.net/datasets/flickr8kcn&lt;br /&gt;
&lt;br /&gt;
===Dutch===&lt;br /&gt;
&lt;br /&gt;
* DIDEC: http://didec.uvt.nl&lt;br /&gt;
* Flickr30K: https://github.com/cltl/DutchDescriptions&lt;br /&gt;
&lt;br /&gt;
===German===&lt;br /&gt;
* Multi30K: http://www.statmt.org/wmt16/multimodal-task.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12478</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12478"/>
		<updated>2019-04-14T09:48:23Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Dialogue */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations as well as short and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a unique domain. Each data point contains a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated, etc.) as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
=== CODA corpus Release 1.0 === &lt;br /&gt;
http://computing.open.ac.uk/coda/resources/code_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains approximately 700 turns of human-authored expository dialogue (by Mark Twain and George Berkeley) which has been aligned with monologue that expresses the same information as the dialogue. The monologue side is annotated with Coherence Relations (RST), and the dialogue side with Dialogue Act tags.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
Dataset for abstractive summarization constructed using Reddit posts. It is the largest corpus (approximately 3 Million posts) for informal text such as Social Media text,  which can be used to train neural networks for summarization technology.&lt;br /&gt;
&lt;br /&gt;
== Image description ==&lt;br /&gt;
&lt;br /&gt;
===Chinese===&lt;br /&gt;
* Flickr8k-CN: http://lixirong.net/datasets/flickr8kcn&lt;br /&gt;
&lt;br /&gt;
===Dutch===&lt;br /&gt;
&lt;br /&gt;
* DIDEC: http://didec.uvt.nl&lt;br /&gt;
* Flickr30K: https://github.com/cltl/DutchDescriptions&lt;br /&gt;
&lt;br /&gt;
===German===&lt;br /&gt;
* Multi30K: http://www.statmt.org/wmt16/multimodal-task.html&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12477</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12477"/>
		<updated>2019-04-14T09:44:57Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system. &lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
http://code.google.com/p/crisp-nlg/&lt;br /&gt;
&lt;br /&gt;
CRISP is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a scripting language for the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== CODA Tools software Release 1.1 ==&lt;br /&gt;
http://computing.open.ac.uk/coda/resources/tools_form.html&lt;br /&gt;
&lt;br /&gt;
This release contains 1) software for converting text parsed with RST relations into dialogue and 2) an annotation tool for annotating dialogue and translating it into monologue (used for creating CODA corpus).&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
https://www.cs.bgu.ac.il/~elhadad/install-fuf.html&lt;br /&gt;
&lt;br /&gt;
FUF/SURGE is a surface realisation system, based on functional unification grammar.&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
GenI is a surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html&lt;br /&gt;
&lt;br /&gt;
The Grammar Explorer provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
== jsRealB ==&lt;br /&gt;
&lt;br /&gt;
http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser&lt;br /&gt;
&lt;br /&gt;
jsRealB is a bilingual (French and English) text realiser for web programming.&lt;br /&gt;
&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is targeted at providing resources for realistic but broad-coverage generation applications where both flexibility of expression and speed of generation matter, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars is under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
http://wiki.delph-in.net/moin/LkbTop&lt;br /&gt;
&lt;br /&gt;
LKB (Linguistic Knowledge Builder) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes Minimal Recursion Semantics (MRS) as input. LKB is implemented in Common Lisp, and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live CD with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
== NLGen and NLGen2 ==&lt;br /&gt;
https://launchpad.net/nlgen&lt;br /&gt;
&lt;br /&gt;
https://launchpad.net/nlgen2&lt;br /&gt;
&lt;br /&gt;
The NLGen natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
NLGen2 uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
http://openccg.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
OpenCCG is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
== rLDCP: Text Generation from Data ==&lt;br /&gt;
https://cran.r-project.org/web/packages/rLDCP/index.html&lt;br /&gt;
&lt;br /&gt;
R package for text generation from data&lt;br /&gt;
&lt;br /&gt;
== RNNLG ==&lt;br /&gt;
https://github.com/shawnwun/RNNLG&lt;br /&gt;
&lt;br /&gt;
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/simplenlg/simplenlg   (English)&lt;br /&gt;
&lt;br /&gt;
http://www-etud.iro.umontreal.ca/~vaudrypl/snlgbil/snlgEnFr_english.html   (French)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-GL    (Galician)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-ES    (Spanish)&lt;br /&gt;
&lt;br /&gt;
SimpleNLG is a simple Java-based realiser.  Its grammatical coverage and syntactic knowledge are small compared to KPML or FUF/SURGE. However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use it.  It has a Java API, and can be used from other languages via an XML interface.  There are &amp;quot;unofficial&amp;quot; ports to other programming languages such as Python and Ruby. Versions for other human languages are being worked on, including [https://aclweb.org/anthology/W18-6508 Dutch], [https://github.com/alexmazzei/SimpleNLG-IT Italian], and [https://aclweb.org/anthology/papers/W/W18/W18-6506/ Mandarin].&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
http://www.cs.rutgers.edu/~mdstone/nlg.html&lt;br /&gt;
&lt;br /&gt;
SPUD (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/standup-315.php&lt;br /&gt;
&lt;br /&gt;
STANDUP (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
http://www.suregen.de/00023.html&lt;br /&gt;
&lt;br /&gt;
Suregen is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12471</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12471"/>
		<updated>2019-04-12T16:26:57Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations, short descriptions, and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a unique domain. Each data point is a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
Dataset for abstractive summarization constructed from Reddit posts. It is the largest corpus (approximately 3 million posts) of informal text such as social media text, and can be used to train neural networks for summarization.&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12470</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12470"/>
		<updated>2019-04-12T14:36:09Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a unique domain. Each data point is a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations, short descriptions, and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
=== TL;DR ===&lt;br /&gt;
https://toolbox.google.com/datasetsearch/search?query=Webis-TLDR-17%20Corpus&amp;amp;docid=kzcwbWD9z3B4Ah3wAAAAAA%3D%3D&lt;br /&gt;
&lt;br /&gt;
Dataset for abstractive summarization constructed from Reddit posts. It is the largest corpus (approximately 3 million posts) of informal text such as social media text, and can be used to train neural networks for summarization.&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12469</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12469"/>
		<updated>2019-04-12T14:33:34Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system. &lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
http://code.google.com/p/crisp-nlg/&lt;br /&gt;
&lt;br /&gt;
CRISP is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a scripting language for the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
https://www.cs.bgu.ac.il/~elhadad/install-fuf.html&lt;br /&gt;
&lt;br /&gt;
FUF/SURGE is a surface realisation system, based on functional unification grammar.&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
GenI is a surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html&lt;br /&gt;
&lt;br /&gt;
The Grammar Explorer provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
== jsRealB ==&lt;br /&gt;
&lt;br /&gt;
http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser&lt;br /&gt;
&lt;br /&gt;
jsRealB is a bilingual (French and English) text realiser for web programming.&lt;br /&gt;
&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars is under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
http://wiki.delph-in.net/moin/LkbTop&lt;br /&gt;
&lt;br /&gt;
LKB (Linguistic Knowledge Builder) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes as input Minimal Recursion Semantics (MRS). LKB is implemented in Common Lisp, and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live-CD with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
== NLGen and NLGen2 ==&lt;br /&gt;
https://launchpad.net/nlgen&lt;br /&gt;
&lt;br /&gt;
https://launchpad.net/nlgen2&lt;br /&gt;
&lt;br /&gt;
The NLGen natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
NLGen2 uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
http://openccg.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
OpenCCG is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
== rLDCP: Text Generation from Data ==&lt;br /&gt;
https://cran.r-project.org/web/packages/rLDCP/index.html&lt;br /&gt;
&lt;br /&gt;
R package for text generation from data&lt;br /&gt;
&lt;br /&gt;
== RNNLG ==&lt;br /&gt;
https://github.com/shawnwun/RNNLG&lt;br /&gt;
&lt;br /&gt;
RNNLG is an open source benchmark toolkit for Natural Language Generation (NLG) in spoken dialogue system application domains. It is released by Tsung-Hsien (Shawn) Wen from Cambridge Dialogue Systems Group under Apache License 2.0.&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/simplenlg/simplenlg   (English)&lt;br /&gt;
&lt;br /&gt;
http://www-etud.iro.umontreal.ca/~vaudrypl/snlgbil/snlgEnFr_english.html   (French)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-GL    (Galician)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-ES    (Spanish)&lt;br /&gt;
&lt;br /&gt;
SimpleNLG is a simple Java-based realiser.  Its grammatical coverage and syntactic knowledge are small compared to KPML or FUF/SURGE. However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use it.  It has a Java API, and can be used from other languages via an XML interface.  There are &amp;quot;unofficial&amp;quot; ports to other programming languages such as Python and Ruby. Versions for other human languages are being worked on, including [https://aclweb.org/anthology/W18-6508 Dutch], [https://github.com/alexmazzei/SimpleNLG-IT Italian], and [https://aclweb.org/anthology/papers/W/W18/W18-6506/ Mandarin].&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
http://www.cs.rutgers.edu/~mdstone/nlg.html&lt;br /&gt;
&lt;br /&gt;
SPUD (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/standup-315.php&lt;br /&gt;
&lt;br /&gt;
STANDUP (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
http://www.suregen.de/00023.html&lt;br /&gt;
&lt;br /&gt;
Suregen is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12468</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12468"/>
		<updated>2019-04-12T14:32:02Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Dialogue Systems */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue ==&lt;br /&gt;
&lt;br /&gt;
=== Cam4NLG === &lt;br /&gt;
https://github.com/shawnwun/RNNLG/tree/master/data&lt;br /&gt;
&lt;br /&gt;
Cam4NLG contains 4 NLG datasets for dialogue system development, each in a unique domain. Each data point contains a (dialogue act, ground truth, handcrafted baseline) tuple. &lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on InformationPresentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated, etc.) as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations as well as short and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12467</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12467"/>
		<updated>2019-04-12T14:29:30Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Data-to-text/Concept-to-text Generation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== boxscore-data ===&lt;br /&gt;
https://github.com/harvardnlp/boxscore-data/&lt;br /&gt;
&lt;br /&gt;
This dataset consists of (human-written) NBA basketball game summaries aligned with their corresponding box- and line-scores.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue Systems ==&lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on InformationPresentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated, etc.) as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations as well as short and long descriptions for 51K companies in English.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12466</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12466"/>
		<updated>2019-04-12T14:25:56Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system. &lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
http://code.google.com/p/crisp-nlg/&lt;br /&gt;
&lt;br /&gt;
CRISP is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a scripting language for the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
https://www.cs.bgu.ac.il/~elhadad/install-fuf.html&lt;br /&gt;
&lt;br /&gt;
FUF/SURGE is a surface realisation system, based on functional unification grammar.&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
GenI is a surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html&lt;br /&gt;
&lt;br /&gt;
The Grammar Explorer provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
== jsRealB ==&lt;br /&gt;
&lt;br /&gt;
http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser&lt;br /&gt;
&lt;br /&gt;
jsRealB is a bilingual (French and English) text realiser for web programming.&lt;br /&gt;
&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars is under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
http://wiki.delph-in.net/moin/LkbTop&lt;br /&gt;
&lt;br /&gt;
LKB (Linguistic Knowledge Builder) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes as input Minimal Recursion Semantics (MRS). LKB is implemented in Common Lisp, and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live-CD with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
== NLGen and NLGen2 ==&lt;br /&gt;
https://launchpad.net/nlgen&lt;br /&gt;
&lt;br /&gt;
https://launchpad.net/nlgen2&lt;br /&gt;
&lt;br /&gt;
The NLGen natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
NLGen2 uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
http://openccg.sourceforge.net/&lt;br /&gt;
&lt;br /&gt;
OpenCCG is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
== rLDCP: Text Generation from Data ==&lt;br /&gt;
https://cran.r-project.org/web/packages/rLDCP/index.html&lt;br /&gt;
&lt;br /&gt;
R package for text generation from data&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/simplenlg/simplenlg   (English)&lt;br /&gt;
&lt;br /&gt;
http://www-etud.iro.umontreal.ca/~vaudrypl/snlgbil/snlgEnFr_english.html   (French)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-GL    (Galician)&lt;br /&gt;
&lt;br /&gt;
https://github.com/citiususc/SimpleNLG-ES    (Spanish)&lt;br /&gt;
&lt;br /&gt;
SimpleNLG is a simple Java-based realiser.  Its grammatical coverage and syntactic knowledge are small compared to KPML or FUF/SURGE. However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use it.  It has a Java API, and can be used from other languages via an XML interface.  There are &amp;quot;unofficial&amp;quot; ports to other programming languages such as Python and Ruby. Versions for other human languages are being worked on, including [https://aclweb.org/anthology/W18-6508 Dutch], [https://github.com/alexmazzei/SimpleNLG-IT Italian], and [https://aclweb.org/anthology/papers/W/W18/W18-6506/ Mandarin].&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
http://www.cs.rutgers.edu/~mdstone/nlg.html&lt;br /&gt;
&lt;br /&gt;
SPUD (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/standup-315.php&lt;br /&gt;
&lt;br /&gt;
STANDUP (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
http://www.suregen.de/00023.html&lt;br /&gt;
&lt;br /&gt;
Suregen is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12465</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12465"/>
		<updated>2019-04-12T14:20:27Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for real-world objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue Systems ==&lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices of Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated etc.), as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
== Summarisation ==&lt;br /&gt;
=== The Wikipedia company corpus ===&lt;br /&gt;
https://gricad-gitlab.univ-grenoble-alpes.fr/getalp/wikipediacompanycorpus&lt;br /&gt;
&lt;br /&gt;
Company descriptions collected from Wikipedia. The dataset contains semantic representations together with short and long descriptions for 51K companies, in English.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12464</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12464"/>
		<updated>2019-04-12T14:16:16Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Referring Expressions Generation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists data sets and corpora used for research in natural language generation.  They are available for download over the web. If you know of a dataset which is not listed here, you can email siggen-board@aclweb.org, or just click on Edit in the upper left corner of this page and add the dataset yourself.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
These datasets contain data and corresponding texts based on this data.&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
Crowdsourced restaurant descriptions with corresponding restaurant data.  English.&lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
Weather forecasts written by human forecasters, with corresponding forecast data, for UK North Sea marine forecasts.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
Computer-generated weather forecasts from weather.gov (US public forecast), along with corresponding weather data.&lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
Crowdsourced descriptions of semantic web entities, with corresponding RDF triples.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
http://jetteviethen.net/research/spatial.html&lt;br /&gt;
&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720  and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects.&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
Referring expressions for objects in images, and the corresponding images.&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
Geographical descriptions (e.g., &amp;quot;Norte de Galicia&amp;quot;) and corresponding regions on a map.&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
https://www.abdn.ac.uk/ncs/departments/computing-science/corpus-496.php&lt;br /&gt;
&lt;br /&gt;
https://www.abdn.ac.uk/ncs/documents/corpus.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
The TUNA Reference Corpus is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. &lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
http://www.pitt.edu/~coconut/coconut-corpus.html&lt;br /&gt;
&lt;br /&gt;
http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz    [direct download]&lt;br /&gt;
&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The COCONUT corpus is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme].&lt;br /&gt;
&lt;br /&gt;
=== Stars2 corpus of referring expressions === &lt;br /&gt;
A collection of 884 annotated definite descriptions produced by 56 subjects in collaborative communication involving speaker-hearer pairs in situations designed so as to challenge existing REG algorithms, with a particular focus on the issue of attribute choice in referential overspecification. &lt;br /&gt;
Link: https://drive.google.com/file/d/0B-KyU7T8S8bLZ1lEQmJRdUc1V28/view?usp=sharing&lt;br /&gt;
Cite: https://link.springer.com/article/10.1007/s10579-016-9350-y &lt;br /&gt;
&lt;br /&gt;
=== b5 corpus of text and referring expressions labelled with personality information ===&lt;br /&gt;
A collection of crowdsourced scene descriptions and an annotated REG corpus, both of which are labelled with the Big Five personality scores of their authors. Suitable for studies in personality-dependent text generation and referring expression generation.&lt;br /&gt;
Link: https://drive.google.com/open?id=0B-KyU7T8S8bLTHpaMnh2U2NWZzQ&lt;br /&gt;
Cite: https://www.aclweb.org/anthology/L18-1183&lt;br /&gt;
&lt;br /&gt;
== Dialogue Systems ==&lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
http://www.classic-project.org/corpora&lt;br /&gt;
&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The Wizard-of-Oz corpus on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices of Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated etc.), as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Other ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/&lt;br /&gt;
&lt;br /&gt;
http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz     [direct download]&lt;br /&gt;
&lt;br /&gt;
The Patient Information Leaflet (PIL) corpus is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton.&lt;br /&gt;
&lt;br /&gt;
=== Validity of BLEU Evaluation Metric ===&lt;br /&gt;
https://abdn.pure.elsevier.com/en/datasets/data-for-structured-review-of-the-validity-of-bleu&lt;br /&gt;
&lt;br /&gt;
https://abdn.pure.elsevier.com/files/125166547/bleu_survey_data.zip    [direct download]&lt;br /&gt;
&lt;br /&gt;
Correlations between BLEU and human evaluations (for MT as well as NLG), extracted from papers in the ACL Anthology&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Natural_Language_Generation_Portal&amp;diff=12450</id>
		<title>Natural Language Generation Portal</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Natural_Language_Generation_Portal&amp;diff=12450"/>
		<updated>2019-04-11T09:07:42Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;[[File:Siggen_logo_small.JPG|right]]&lt;br /&gt;
&#039;&#039;This portal is based on and replaces the [http://www.siggen.org/resources/moin.html NLG Resources Wiki] run by [http://www.siggen.org/ ACL SIGGEN] from November 2005 to February 2009.&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Resources for Natural Language Generation ==&lt;br /&gt;
&lt;br /&gt;
* [[Downloadable NLG systems]]&lt;br /&gt;
* [[Data sets for NLG]]&lt;br /&gt;
* [[Generation grammars]]&lt;br /&gt;
&amp;lt;!-- &lt;br /&gt;
* [[Online NLG demos]] --&amp;gt;&lt;br /&gt;
* [[NLG publications list]]&lt;br /&gt;
* [[NLG research groups]]&lt;br /&gt;
&lt;br /&gt;
Additionally, a large [http://www.nlg-wiki.org/systems/ database of implemented NLG systems with references] has been &lt;br /&gt;
put together by Michael Zock and John Bateman, and is now maintained in a Semantic Wiki at http://nlg-wiki.org.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Natural Language Generation|*]]&lt;br /&gt;
[[Category:Imported from the SIGGEN Resources Wiki]]&lt;br /&gt;
[[Category:Resources]]&lt;br /&gt;
[[Category:Natural Language Generation Portal]]&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12449</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12449"/>
		<updated>2019-04-11T09:05:22Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;[[Tools and Software for English]] - Downloadable NLG systems&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
For languages other than English, see [[List of resources by language]].&lt;br /&gt;
&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, please click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system.&lt;br /&gt;
&lt;br /&gt;
== CLINT ==&lt;br /&gt;
http://www.cs.bgu.ac.il/~elhadad/clint.html &lt;br /&gt;
&lt;br /&gt;
CLINT is a hybrid template / word-based generation system with an example application of &lt;br /&gt;
business letter generation. The system is written in C++ and runs under Microsoft Windows. &lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
[http://code.google.com/p/crisp-nlg/ CRISP] is Alexander Koller&#039;s NLG system that casts both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a programming language for the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
[http://www.cs.bgu.ac.il/~elhadad/research.html FUF] is available as the [ftp://ftp.cs.bgu.ac.il/pub/fuf/fuf5.3.tar.gz original Common Lisp implementation] and as a C++ port called [http://www.cs.bgu.ac.il/~elhadad/cfuf.zip CFUF] which has an embedded Scheme interpreter.&lt;br /&gt;
&lt;br /&gt;
For more information, see [[#SURGE]], [[#SURGE_2.3]], [[#SURG-SP]], [[#SURG-IT]].&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
A surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars are provided for English and French.  A largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html ([http://www.stir.ac.uk/crcl/Computational-tools/Grexplorer/grexplorer.html old site] unavailable as of April 2011)&lt;br /&gt;
&lt;br /&gt;
Grammar Explorer provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
== jsRealB ==&lt;br /&gt;
&lt;br /&gt;
http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars is under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
[http://wiki.delph-in.net/moin/LkbTop LKB] ([[Linguistic Knowledge Builder]]) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes Minimal Recursion Semantics (MRS) as input. LKB is implemented in Common Lisp and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live CD with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
==NLGen==&lt;br /&gt;
The [https://launchpad.net/nlgen NLGen] natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output. Not to be confused with NLGen2, below, which uses a different sentence generation theory.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
== NLGen2 ==&lt;br /&gt;
The [https://launchpad.net/nlgen2 NLGen2] natural language generation system uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Not to be confused with NLGen, above, which uses a different sentence generation theory. Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
[http://openccg.sourceforge.net/ OpenCCG], the OpenNLP CCG Library (formerly Grok), is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
== RAGS (Reference Architecture for Generation Systems) software ==&lt;br /&gt;
http://www.csd.abdn.ac.uk/~cmellish/rags/deliverables/&lt;br /&gt;
&lt;br /&gt;
Deliverables from the RAGS project - RAGSOCKS software for interfacing modules using RAGS data representations,&lt;br /&gt;
example RAGS module (genetic algorithm based text planner) and RAGS wrapper for FUF/SURGE.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/simplenlg/simplenlg &lt;br /&gt;
&lt;br /&gt;
is an ultra-simple Java-based realiser.  Its&lt;br /&gt;
grammatical coverage and syntactic knowledge are&lt;br /&gt;
minuscule compared to KPML or FUF/SURGE.&lt;br /&gt;
However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use it.  It has&lt;br /&gt;
been used by many people in Aberdeen, and also&lt;br /&gt;
for teaching.  It is set up as a Java package,&lt;br /&gt;
so it can only be used by Java programs.&lt;br /&gt;
&lt;br /&gt;
SimpleNLG in French: http://www-etud.iro.umontreal.ca/~vaudrypl/snlgbil/snlgEnFr_english.html&lt;br /&gt;
&lt;br /&gt;
SimpleNLG in Spanish: https://github.com/citiususc/SimpleNLG-ES &lt;br /&gt;
&lt;br /&gt;
SimpleNLG in Galician: https://github.com/citiususc/SimpleNLG-GL&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
[http://www.cs.rutgers.edu/~mdstone/nlg.html SPUD] (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML; later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/research/standup/ STANDUP project] (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
[http://www.suregen.de/00023.html Suregen] is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
== SURGE ==&lt;br /&gt;
http://www.cs.bgu.ac.il/surge/&lt;br /&gt;
&lt;br /&gt;
Syntactic realization package. (A Common Lisp package providing an interpreter for a functional&lt;br /&gt;
unification formalism called FUF, and SURGE, a large grammar of English written in FUF.) Offers download of SURGE 2.2.&lt;br /&gt;
&lt;br /&gt;
SURGE 2.3: The latest version of Surge, including support for written dialogue, and expanded&lt;br /&gt;
syntactic coverage based on the Penn TreeBank. http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
SURG-SP: Systemic Unification Reusable Grammar for Spanish is a large-scale&lt;br /&gt;
Spanish grammar allowing systems which already use FUF/SURGE for English NLG&lt;br /&gt;
to generate syntactically (and often semantically) equivalent text in Spanish when&lt;br /&gt;
new lexical items are introduced.  SURG-SP makes use of inputs almost identical to the&lt;br /&gt;
English version Surge 2.3.  http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
SURG-IT: The Italian version of Surge 2.3. http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12448</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12448"/>
		<updated>2019-04-11T09:04:07Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;[[Tools and Software for English]] - Downloadable NLG systems&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
For languages other than English, see [[List of resources by language]].&lt;br /&gt;
&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, please click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system.&lt;br /&gt;
&lt;br /&gt;
== CLINT ==&lt;br /&gt;
http://www.cs.bgu.ac.il/~elhadad/clint.html &lt;br /&gt;
&lt;br /&gt;
CLINT is a hybrid template / word-based generation system with an example application of &lt;br /&gt;
business letter generation. The system is written in C++ and runs under Microsoft Windows. &lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
[http://code.google.com/p/crisp-nlg/ CRISP] is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a language that runs on the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== DAYDREAMER ==&lt;br /&gt;
&lt;br /&gt;
[ftp://ftp.cs.cmu.edu/user/ai/new/daydreamer/0new.html DAYDREAMER] is a computer model of the stream of thought developed at UCLA by Erik T. Mueller from 1983 to 1988. The generator is located in the file dd_gen.cl. Common Lisp source code is available under GPL v2.&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
[http://www.cs.bgu.ac.il/~elhadad/research.html FUF] is available as the [ftp://ftp.cs.bgu.ac.il/pub/fuf/fuf5.3.tar.gz original Common Lisp implementation] and as a C++ port called [http://www.cs.bgu.ac.il/~elhadad/cfuf.zip CFUF] which has an embedded Scheme interpreter.&lt;br /&gt;
&lt;br /&gt;
For more information, see [[#SURGE]], [[#SURGE_2.3]], [[#SURG-SP]], [[#SURG-IT]].&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html ([http://www.stir.ac.uk/crcl/Computational-tools/Grexplorer/grexplorer.html old site] unavailable as of April 2011)&lt;br /&gt;
&lt;br /&gt;
provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
== jsRealB ==&lt;br /&gt;
&lt;br /&gt;
http://rali.iro.umontreal.ca/rali/?q=en/jsrealb-bilingual-text-realiser&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is targeted at providing resources for realistic but broad-coverage generation applications where both flexibility of expression and speed of generation are at issue, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars is under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
[http://wiki.delph-in.net/moin/LkbTop LKB] ([[Linguistic Knowledge Builder]]) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes Minimal Recursion Semantics (MRS) as input. LKB is implemented in Common Lisp and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live CD with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
==NLGen==&lt;br /&gt;
The [https://launchpad.net/nlgen NLGen] natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output. Not to be confused with NLGen2, below, which uses a different sentence generation theory.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
== NLGen2 ==&lt;br /&gt;
The [https://launchpad.net/nlgen2 NLGen2] natural language generation system uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Not to be confused with NLGen, above, which uses a different sentence generation theory. Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
[http://openccg.sourceforge.net/ OpenCCG], the OpenNLP CCG Library (formerly Grok), is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
== RAGS (Reference Architecture for Generation Systems) software ==&lt;br /&gt;
http://www.csd.abdn.ac.uk/~cmellish/rags/deliverables/&lt;br /&gt;
&lt;br /&gt;
Deliverables from the RAGS project - RAGSOCKS software for interfacing modules using RAGS data representations,&lt;br /&gt;
example RAGS module (genetic algorithm based text planner) and RAGS wrapper for FUF/SURGE.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
https://github.com/simplenlg/simplenlg &lt;br /&gt;
&lt;br /&gt;
is an ultra-simple Java-based realiser.  Its&lt;br /&gt;
grammatical coverage and syntactic knowledge are&lt;br /&gt;
minuscule compared to KPML or FUF/SURGE.&lt;br /&gt;
However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use it.  It has&lt;br /&gt;
been used by many people in Aberdeen, and also&lt;br /&gt;
for teaching.  It is set up as a Java package,&lt;br /&gt;
so it can only be used by Java programs.&lt;br /&gt;
&lt;br /&gt;
SimpleNLG in French: http://www-etud.iro.umontreal.ca/~vaudrypl/snlgbil/snlgEnFr_english.html&lt;br /&gt;
&lt;br /&gt;
SimpleNLG in Spanish: https://github.com/citiususc/SimpleNLG-ES &lt;br /&gt;
&lt;br /&gt;
SimpleNLG in Galician: https://github.com/citiususc/SimpleNLG-GL&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
[http://www.cs.rutgers.edu/~mdstone/nlg.html SPUD] (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML; later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/research/standup/ STANDUP project] (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
[http://www.suregen.de/00023.html Suregen] is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
== SURGE ==&lt;br /&gt;
http://www.cs.bgu.ac.il/surge/&lt;br /&gt;
&lt;br /&gt;
Syntactic realization package. (A Common Lisp package providing an interpreter for a functional&lt;br /&gt;
unification formalism called FUF, and SURGE, a large grammar of English written in FUF.) Offers download of SURGE 2.2.&lt;br /&gt;
&lt;br /&gt;
SURGE 2.3: The latest version of Surge, including support for written dialogue, and expanded&lt;br /&gt;
syntactic coverage based on the Penn TreeBank. http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
SURG-SP: Systemic Unification Reusable Grammar for Spanish is a large-scale&lt;br /&gt;
Spanish grammar allowing systems which already use FUF/SURGE for English NLG&lt;br /&gt;
to generate syntactically (and often semantically) equivalent text in Spanish when&lt;br /&gt;
new lexical items are introduced.  SURG-SP makes use of inputs almost identical to the&lt;br /&gt;
English version Surge 2.3.  http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
SURG-IT: The Italian version of Surge 2.3. http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12447</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12447"/>
		<updated>2019-04-11T08:56:10Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* SimpleNLG */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;[[Tools and Software for English]] - Downloadable NLG systems&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
For languages other than English, see [[List of resources by language]].&lt;br /&gt;
&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, please click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system.&lt;br /&gt;
&lt;br /&gt;
== CLINT ==&lt;br /&gt;
http://www.cs.bgu.ac.il/~elhadad/clint.html &lt;br /&gt;
&lt;br /&gt;
CLINT is a hybrid template / word-based generation system with an example application of &lt;br /&gt;
business letter generation. The system is written in C++ and runs under Microsoft Windows. &lt;br /&gt;
&amp;lt;!-- THIS IS NOT NLG:&lt;br /&gt;
== Concordance ==&lt;br /&gt;
http://www.concordancesoftware.co.uk/&lt;br /&gt;
&lt;br /&gt;
Concordance is a sophisticated text analysis software for making concordances, wordlists, &lt;br /&gt;
and Web Concordances.  &lt;br /&gt;
Supports many different Western languages.  Turn a concordance into HTML. &lt;br /&gt;
Fully functional version available for download with a time limit.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
[http://code.google.com/p/crisp-nlg/ CRISP] is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a language that runs on the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== DAYDREAMER ==&lt;br /&gt;
&lt;br /&gt;
[ftp://ftp.cs.cmu.edu/user/ai/new/daydreamer/0new.html DAYDREAMER] is a computer model of the stream of thought developed at UCLA by Erik T. Mueller from 1983 to 1988. The generator is located in the file dd_gen.cl. Common Lisp source code is available under GPL v2.&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
[http://www.cs.bgu.ac.il/~elhadad/research.html FUF] is available as the [ftp://ftp.cs.bgu.ac.il/pub/fuf/fuf5.3.tar.gz original Common Lisp implementation] and as a C++ port called [http://www.cs.bgu.ac.il/~elhadad/cfuf.zip CFUF] which has an embedded Scheme interpreter.&lt;br /&gt;
&lt;br /&gt;
For more information, see [[#SURGE]], [[#SURGE_2.3]], [[#SURG-SP]], [[#SURG-IT]].&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html ([http://www.stir.ac.uk/crcl/Computational-tools/Grexplorer/grexplorer.html old site] unavailable as of April 2011)&lt;br /&gt;
&lt;br /&gt;
provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- no longer available as of April, 2011&lt;br /&gt;
== HALogen ==&lt;br /&gt;
http://www.isi.edu/licensed-sw/halogen/&lt;br /&gt;
&lt;br /&gt;
HALogen is a general-purpose natural language generation system developed by Irene Langkilde-Geary and  Kevin Knight at the USC Information Sciences Institute. &lt;br /&gt;
The download package consists of the symbolic generator, the forest ranker, and some sample inputs. The symbolic generator includes the  Sensus Ontology dictionary (which is based on WordNet). The forest ranker includes a 250-million word ngram language model (unigram, bigram, and trigram) trained on WSJ newspaper text. The symbolic generator is written in LISP and requires a CommonLisp interpreter.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- NOT AN NLG SYSTEM:&lt;br /&gt;
== kfNgram ==&lt;br /&gt;
http://www.kwicfinder.com/kfNgram/&lt;br /&gt;
&lt;br /&gt;
kfNgram is a free stand-alone Windows program for linguistic research which generates lists of n-grams in text and HTML files.  Here n-gram is understood as a sequence of either n words, where n can be any positive integer, also known as lexical bundles, chains, wordgrams, and, in WordSmith, clusters, or else of n characters, also known as chargrams.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is targeted at providing resources for realistic but broad-coverage generation applications where both flexibility of expression and speed of generation are at issue, for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
The KPML system was a direct descendant of the Penman text generation system, as developed further &lt;br /&gt;
multilingually in cooperative work between &lt;br /&gt;
the Komet (http://www.darmstadt.gmd.de/publish/komet/index.html)&lt;br /&gt;
project in Darmstadt and the Systemic Modelling Group&lt;br /&gt;
at Macquarie University. Downloadable standalone executables of the system are available for &lt;br /&gt;
PCs running Windows. The source code is written in ANSI Common Lisp and uses the &lt;br /&gt;
Common Lisp Interface Manager (CLIM). &lt;br /&gt;
The system has been compiled and tested&lt;br /&gt;
under Franz Allegro Common Lisp (4.2, 4.3, 4.3.1, 5.0, 6.0, 7.0)&lt;br /&gt;
for Unix and Franz Allegro Common Lisp 3.0 &lt;br /&gt;
and Harlequin Lispworks 4.0, 4.1, 4.2 for Windows. &lt;br /&gt;
It is possible to use the system without the window interface as a generator serving requests for generation across sockets or via files.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars is under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
[http://wiki.delph-in.net/moin/LkbTop LKB] ([[Linguistic Knowledge Builder]]) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes Minimal Recursion Semantics (MRS) as input. LKB is implemented in Common Lisp and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live CD with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
==NLGen==&lt;br /&gt;
The [https://launchpad.net/nlgen NLGen] natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output. Not to be confused with NLGen2, below, which uses a different sentence generation theory.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
== NLGen2 ==&lt;br /&gt;
The [https://launchpad.net/nlgen2 NLGen2] natural language generation system uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Not to be confused with NLGen, above, which uses a different sentence generation theory. Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
[http://openccg.sourceforge.net/ OpenCCG], the OpenNLP CCG Library (formerly Grok), is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- as of April, 2011, a 30-day trial of project reporter is no longer offered&lt;br /&gt;
== Project Reporter ==&lt;br /&gt;
http://www.cogentex.com/products/reporter&lt;br /&gt;
&lt;br /&gt;
Project Reporter generates dynamic web-based project status reports from files created with Microsoft Project or &lt;br /&gt;
other compatible project management software. Reports feature hyperlinked textual descriptions of &lt;br /&gt;
project elements, as well as coordinated multimodal display with an interactive Gantt chart applet. &lt;br /&gt;
Commercial product. Implemented in Java. Free 30-day evaluation; on-line demo on website.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== RAGS (Reference Architecture for Generation Systems) software ==&lt;br /&gt;
http://www.csd.abdn.ac.uk/~cmellish/rags/deliverables/&lt;br /&gt;
&lt;br /&gt;
Deliverables from the RAGS project - RAGSOCKS software for interfacing modules using RAGS data representations,&lt;br /&gt;
example RAGS module (genetic algorithm based text planner) and RAGS wrapper for FUF/SURGE.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- no longer available, nor an NLG system&lt;br /&gt;
== RSTTool ==&lt;br /&gt;
http://www.dai.ed.ac.uk/staff/personal_pages/micko/RSTTool/&lt;br /&gt;
&lt;br /&gt;
is a tool which allows you to graphically annotate the &lt;br /&gt;
rhetorical structure of your text. The structure can be saved in an XML format, or exported as &lt;br /&gt;
EPS versions of the structure diagram for inclusion in LaTeX, etc. Written in Tcl/Tk. &lt;br /&gt;
Runs on any machine.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
http://simplenlg.googlecode.com/&lt;br /&gt;
&lt;br /&gt;
is an ultra-simple Java-based realiser.  Its&lt;br /&gt;
grammatical coverage and syntactic knowledge are&lt;br /&gt;
minuscule compared to KPML or FUF/SURGE.&lt;br /&gt;
However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use.  It has&lt;br /&gt;
been used by many people in Aberdeen, and also&lt;br /&gt;
for teaching.  It is set up as a Java package,&lt;br /&gt;
so it can only be used by Java programs.&lt;br /&gt;
&lt;br /&gt;
SimpleNLG in French: http://www-etud.iro.umontreal.ca/~vaudrypl/snlgbil/snlgEnFr_english.html&lt;br /&gt;
&lt;br /&gt;
SimpleNLG in Spanish: https://github.com/citiususc/SimpleNLG-ES &lt;br /&gt;
&lt;br /&gt;
SimpleNLG in Galician: https://github.com/citiususc/SimpleNLG-GL&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
[http://www.cs.rutgers.edu/~mdstone/nlg.html SPUD] (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/research/standup/ STANDUP project] (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
[http://www.suregen.de/00023.html Suregen] is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
== SURGE ==&lt;br /&gt;
http://www.cs.bgu.ac.il/surge/&lt;br /&gt;
&lt;br /&gt;
Syntactic realization package. (A CommonLisp package providing an interpreter for a functional&lt;br /&gt;
unification formalism called FUF and SURGE, a large grammar of English written in FUF.) Offers download of SURGE 2.2.&lt;br /&gt;
&lt;br /&gt;
== SURGE 2.3 ==&lt;br /&gt;
http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
The latest version of Surge, including support for written dialogue, and expanded&lt;br /&gt;
syntactic coverage based on the Penn TreeBank.&lt;br /&gt;
&lt;br /&gt;
== SURG-SP ==&lt;br /&gt;
http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
Systemic Unification Reusable Grammar for Spanish is a large scale&lt;br /&gt;
Spanish grammar allowing systems which already use FUF/SURGE for English NLG to be able&lt;br /&gt;
to generate syntactically (and many times semantically) equivalent text in Spanish when&lt;br /&gt;
new lexical items are introduced.  SURG-SP makes use of inputs almost identical to the&lt;br /&gt;
English version Surge 2.3.&lt;br /&gt;
&lt;br /&gt;
== SURG-IT ==&lt;br /&gt;
http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
The Italian version of Surge 2.3.&lt;br /&gt;
&lt;br /&gt;
== TG/2 ==&lt;br /&gt;
http://www.dfki.de/pas/f2w.cgi?lts/tg2-e&lt;br /&gt;
&lt;br /&gt;
is a shallow verbalizer that can be quickly adapted to new domains and tasks. &lt;br /&gt;
It combines context-free grammars with templates and canned &lt;br /&gt;
text in a single formalism. Thus the granularity of the language model may depend on the application&lt;br /&gt;
needs. The system currently runs under Solaris 2.5. It is available freely under a research license.&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12446</id>
		<title>Downloadable NLG systems</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems&amp;diff=12446"/>
		<updated>2019-04-11T08:55:49Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* SimpleNLG */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&#039;&#039;&#039;[[Tools and Software for English]] - Downloadable NLG systems&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
For languages other than English, see [[List of resources by language]].&lt;br /&gt;
&amp;lt;!-- MoinMoin name:  DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:        changed simplenlg URL --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DownloadableSystems --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000012 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Tue Jan 23 16:22:14 2007 (1169569334000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The natural language generation systems listed below are available for download over the web.  &lt;br /&gt;
If you know of a system which is not listed here, please click on Edit in the upper left corner of this page and add the system yourself.&lt;br /&gt;
&lt;br /&gt;
== ASTROGEN ==&lt;br /&gt;
http://www.dsv.su.se/~hercules/ASTROGEN/ASTROGEN.html&lt;br /&gt;
&lt;br /&gt;
Aggregated deep and Surface naTuRal language GENerator - Prolog based system.&lt;br /&gt;
&lt;br /&gt;
== CLINT ==&lt;br /&gt;
http://www.cs.bgu.ac.il/~elhadad/clint.html &lt;br /&gt;
&lt;br /&gt;
CLINT is a hybrid template / word-based generation system with an example application of &lt;br /&gt;
business letter generation. The system is written in C++ and runs under Microsoft Windows. &lt;br /&gt;
&amp;lt;!-- THIS IS NOT NLG:&lt;br /&gt;
== Concordance ==&lt;br /&gt;
http://www.concordancesoftware.co.uk/&lt;br /&gt;
&lt;br /&gt;
Concordance is a sophisticated text analysis software for making concordances, wordlists, &lt;br /&gt;
and Web Concordances.  &lt;br /&gt;
Supports many different Western languages.  Turn a concordance into HTML. &lt;br /&gt;
Fully functional version available for download with a time limit.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== CRISP ==&lt;br /&gt;
[http://code.google.com/p/crisp-nlg/ CRISP] is Alexander Koller&#039;s NLG system that tries to cast both microplanning and sentence realisation as an AI planning problem. The code is a mixture of Java and Scala, a programming language for the Java virtual machine. CRISP comes with its own implementation of GraphPlan, but it can also output plans in PDDL (“Planning Domain Definition Language”, a successor to STRIPS) for use with other AI planners. License: LGPL.&lt;br /&gt;
&lt;br /&gt;
== DAYDREAMER ==&lt;br /&gt;
&lt;br /&gt;
[ftp://ftp.cs.cmu.edu/user/ai/new/daydreamer/0new.html DAYDREAMER] is a computer model of the stream of thought developed at UCLA by Erik T. Mueller from 1983 to 1988. The generator is located in the file dd_gen.cl. Common Lisp source code is available under GPL v2.&lt;br /&gt;
&lt;br /&gt;
== FUF/SURGE ==&lt;br /&gt;
[http://www.cs.bgu.ac.il/~elhadad/research.html FUF] is available as the [ftp://ftp.cs.bgu.ac.il/pub/fuf/fuf5.3.tar.gz original Common Lisp implementation] and as a C++ port called [http://www.cs.bgu.ac.il/~elhadad/cfuf.zip CFUF] which has an embedded Scheme interpreter.&lt;br /&gt;
&lt;br /&gt;
For more information, see [[#SURGE]], [[#SURGE_2.3]], [[#SURG-SP]], [[#SURG-IT]].&lt;br /&gt;
&lt;br /&gt;
== GenI ==&lt;br /&gt;
http://kowey.github.io/GenI&lt;br /&gt;
&lt;br /&gt;
surface realiser for (Feature-Based Lexicalised) Tree Adjoining Grammar and a flat MRS-like semantics (sans top handle and underspecification).  Toy example grammars provided for English and French.  Largish core grammar for French is under development (contact us for details).  GPL (commercial dual licensing available upon request). Known to work under Linux and Mac OS X (potential for making it work on Windows as well).  Written in Haskell. Source code available via [http://hackage.haskell.org/package/GenI hackage], [https://github.com/kowey/GenI GitHub], or [http://hub.darcs.net/kowey/GenI hub.darcs.net].&lt;br /&gt;
&lt;br /&gt;
== Grammar Explorer ==&lt;br /&gt;
http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/tutorials/Grexplorer/grexplorer.html ([http://www.stir.ac.uk/crcl/Computational-tools/Grexplorer/grexplorer.html old site] unavailable as of April 2011)&lt;br /&gt;
&lt;br /&gt;
provides a means of exploring large-scale systemic-functional grammars in order to see how they are &lt;br /&gt;
organized and what kinds of things they cover. It can be used to explore the KPML resources. &lt;br /&gt;
Downloadable standalone executables of the grammar explorer are available for&amp;amp;nbsp;Windows 95/98/NT.&lt;br /&gt;
These already include a version of the Nigel grammar of English and pre-installed examples.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- no longer available as of April, 2011&lt;br /&gt;
== HALogen ==&lt;br /&gt;
http://www.isi.edu/licensed-sw/halogen/&lt;br /&gt;
&lt;br /&gt;
HALogen is a general-purpose natural language generation system developed by Irene Langkilde-Geary and  Kevin Knight at the USC Information Sciences Institute. &lt;br /&gt;
The download package consists of the symbolic generator, the forest ranker, and some sample inputs. The symbolic generator includes the  Sensus Ontology dictionary (which is based on WordNet). The forest ranker includes a 250-million word ngram language model (unigram, bigram, and trigram) trained on WSJ newspaper text. The symbolic generator is written in LISP and requires a CommonLisp interpreter.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- NOT AN NLG SYSTEM:&lt;br /&gt;
== kfNgram ==&lt;br /&gt;
http://www.kwicfinder.com/kfNgram/&lt;br /&gt;
&lt;br /&gt;
kfNgram is a free stand-alone Windows program for linguistic research which generates lists of n-grams in text and HTML files.  Here n-gram is understood as a sequence of either n words, where n can be any positive integer, also known as lexical bundles, chains, wordgrams, and, in WordSmith, clusters, or else of n characters, also known as chargrams.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
== KPML ==&lt;br /&gt;
&lt;br /&gt;
http://www.purl.org/net/kpml&lt;br /&gt;
&lt;br /&gt;
The KPML system offers a robust, mature platform for large-scale grammar engineering that is particularly oriented to multilingual grammar development and generation. It is particularly targeted at providing resources for realistic but broad-coverage generation applications, where both flexibility of expression and speed of generation are at issue—for example in online webpage generation or spoken dialogue. KPML is also used extensively in multilingual text generation research and for teaching. It is based on systemic functional linguistics.&lt;br /&gt;
&lt;br /&gt;
The KPML system is a direct descendant of the Penman text generation system, as developed further &lt;br /&gt;
multilingually in cooperative work between &lt;br /&gt;
the Komet (http://www.darmstadt.gmd.de/publish/komet/index.html)&lt;br /&gt;
project in Darmstadt and the Systemic Modelling Group&lt;br /&gt;
at Macquarie University. Downloadable standalone executables of the system are available for &lt;br /&gt;
PCs running Windows. The source code is written in ANSI Common Lisp and uses the &lt;br /&gt;
Common Lisp Interface Manager (CLIM). &lt;br /&gt;
The system has been compiled and tested&lt;br /&gt;
under Franz Allegro Common Lisp (4.2, 4.3, 4.3.1, 5.0, 6.0, 7.0)&lt;br /&gt;
for Unix and Franz Allegro Common Lisp 3.0 &lt;br /&gt;
and Harlequin Lispworks 4.0, 4.1, 4.2 for Windows. &lt;br /&gt;
It is possible to use the system without the window interface as a generator serving requests for generation across sockets or via files.&lt;br /&gt;
&lt;br /&gt;
A growing set of generation grammars is under development for a variety of languages, including English, Spanish, Dutch, Chinese, German, Czech, and more. See the &lt;br /&gt;
Generation Bank (http://www.fb10.uni-bremen.de/anglistik/langpro/kpml/genbank/generation-bank.html )&lt;br /&gt;
for current examples. The development of further languages and of extensions to existing resources is very welcome!&lt;br /&gt;
&lt;br /&gt;
== LKB ==&lt;br /&gt;
[http://wiki.delph-in.net/moin/LkbTop LKB] ([[Linguistic Knowledge Builder]]) is a grammar engineering environment for unification-based formalisms, typically HPSG.&lt;br /&gt;
It includes a [http://wiki.delph-in.net/moin/LkbGeneration realiser] that takes as input Minimal Recursion Semantics (MRS). LKB is implemented in Common Lisp, and is freely available under an open source license. It also includes a KNOPPIX-based GNU/Linux live CD with the whole system installed and ready to use.&lt;br /&gt;
&lt;br /&gt;
== Multimodal Unification Grammar ==&lt;br /&gt;
http://www.david-reitter.com/compling/mug/&lt;br /&gt;
&lt;br /&gt;
MUG Workbench is a development and debugging tool for Multimodal NLG.  The grammar formalism supported is&lt;br /&gt;
Multimodal Functional Unification  Grammar (MUG).  The MUG system runs MUG grammars with fixed (test cases)&lt;br /&gt;
and  arbitrary input specifications to produce output in a natural  language, graphical user interface and&lt;br /&gt;
possibly in other modes. It is  designed to do three things:&lt;br /&gt;
- Multimodal Fission (distributing output to interaction/communication  modes)&lt;br /&gt;
- Some sentence planning (choosing information to include in the utterance)&lt;br /&gt;
- Natural Language and graphical user interface realization (producing  some form of output)&lt;br /&gt;
The MUG system does these three jobs in parallel. MUG Workbench can  serve to inspect the data-structures&lt;br /&gt;
used during generation. It  should help you to learn more about the nature of unification  grammars used&lt;br /&gt;
for parsing or natural language generation.  Furthermore, the MUG Workbench is helpful in debugging your grammars.&lt;br /&gt;
&lt;br /&gt;
== NaturalOWL ==&lt;br /&gt;
http://www.aueb.gr/users/ion/software/NaturalOWL1.1.tar.gz NaturalOWL (version 1.1)&lt;br /&gt;
&lt;br /&gt;
Generates descriptions of entities and classes from OWL ontologies that have been annotated with linguistic and user modeling resources expressed in RDF. Currently supports English and Greek. Extensions for other languages welcome. NaturalOWL can also be used as a [http://protege.stanford.edu/ Protégé] plug-in. See [http://www.aueb.gr/users/ion/publications.html here] for publications describing NaturalOWL. (GPL)&lt;br /&gt;
&lt;br /&gt;
==NLGen==&lt;br /&gt;
The [https://launchpad.net/nlgen NLGen] natural language generation system applies the [http://www.opencog.org/wiki/SegSim SegSim strategy] for generating English sentences. Probabilistic inference for sentence construction is based on a statistical analysis of [http://opencog.org/wiki/RelEx RelEx] output. Not to be confused with NLGen2, below, which uses a different sentence generation theory.  Java, Apache license. See demo: [http://novamente.net/example/nlp.html Demo of AI Virtual Pet Answering Simple Questions].&lt;br /&gt;
&lt;br /&gt;
== NLGen2 ==&lt;br /&gt;
The [https://launchpad.net/nlgen2 NLGen2] natural language generation system uses [http://opencog.org/wiki/RelEx RelEx] dependency parses, together with [http://www.abisource.com/projects/link-grammar/ Link Grammar] linkage analysis to generate English-language output.  Not to be confused with NLGen, above, which uses a different sentence generation theory. Java, Apache license. Reference: Blake Lemoine, &amp;quot;[http://www.louisiana.edu/~bal2277/NLGen2.doc NLGen2: A Linguistically Plausible, General Purpose Natural Language Generation System]&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
== OpenCCG ==&lt;br /&gt;
[http://openccg.sourceforge.net/ OpenCCG], the OpenNLP CCG Library (formerly Grok), is both a parser and a realizer for [[Combinatory Categorial Grammar]]. It has been used in several dialog systems. The realizer has been enhanced with n-gram models and a supertagging approach called hypertagging. OpenCCG is implemented in Java, and is freely available under the LGPL.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- as of April, 2011, a 30-day trial of project reporter is no longer offered&lt;br /&gt;
== Project Reporter ==&lt;br /&gt;
http://www.cogentex.com/products/reporter&lt;br /&gt;
&lt;br /&gt;
Project Reporter generates dynamic web-based project status reports from files created with Microsoft Project or &lt;br /&gt;
other compatible project management software. Reports feature hyperlinked textual descriptions of &lt;br /&gt;
project elements, as well as coordinated multimodal display with an interactive Gantt chart applet. &lt;br /&gt;
Commercial product. Implemented in Java. Free 30-day evaluation; on-line demo on website.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== RAGS (Reference Architecture for Generation Systems) software ==&lt;br /&gt;
http://www.csd.abdn.ac.uk/~cmellish/rags/deliverables/&lt;br /&gt;
&lt;br /&gt;
Deliverables from the RAGS project - RAGSOCKS software for interfacing modules using RAGS data representations,&lt;br /&gt;
example RAGS module (genetic algorithm based text planner) and RAGS wrapper for FUF/SURGE.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;!-- no longer available, nor an NLG system&lt;br /&gt;
== RSTTool ==&lt;br /&gt;
http://www.dai.ed.ac.uk/staff/personal_pages/micko/RSTTool/&lt;br /&gt;
&lt;br /&gt;
is a tool which allows you to graphically annotate the &lt;br /&gt;
rhetorical structure of your text. The structure can be saved in an XML format, or exported as &lt;br /&gt;
EPS versions of the structure diagram for inclusion in LaTeX, etc. Written in Tcl/Tk. &lt;br /&gt;
Runs on any machine.&lt;br /&gt;
--&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== SimpleNLG ==&lt;br /&gt;
&lt;br /&gt;
http://simplenlg.googlecode.com/&lt;br /&gt;
&lt;br /&gt;
is an ultra-simple Java-based realiser.  Its&lt;br /&gt;
grammatical coverage and syntactic knowledge are&lt;br /&gt;
minuscule compared to KPML or FUF/SURGE.&lt;br /&gt;
However, because it is so simple, it is relatively&lt;br /&gt;
easy for people to learn how to use.  It has&lt;br /&gt;
been used by many people in Aberdeen, and also&lt;br /&gt;
for teaching.  It is set up as a Java package,&lt;br /&gt;
so it can only be used by Java programs.&lt;br /&gt;
&lt;br /&gt;
SimpleNLG in French: http://www-etud.iro.umontreal.ca/~vaudrypl/snlgbil/snlgEnFr_english.html&lt;br /&gt;
SimpleNLG in Spanish: https://github.com/citiususc/SimpleNLG-ES &lt;br /&gt;
SimpleNLG in Galician: https://github.com/citiususc/SimpleNLG-GL&lt;br /&gt;
&lt;br /&gt;
== SPUD ==&lt;br /&gt;
[http://www.cs.rutgers.edu/~mdstone/nlg.html SPUD] (Sentence Planner Using Descriptions) is Matthew Stone&#039;s LTAG-based NLG system. There are two versions: SPUD version 0.01 was written in SML. Later versions, known as SPUD lite, are written in Prolog. The small codebase of SPUD lite makes it ideal for teaching, but it is also used in dialog system prototypes.&lt;br /&gt;
&lt;br /&gt;
== STANDUP ==&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/research/standup/ STANDUP project] (System To Augment Non-speakers&#039; Dialogue Using Puns) is a collaborative project on generating simple jokes from a graphical user interface appropriate for non-speaking children. The project began in October 2003 and ran until March 2007. The software was written in Java and is available for Windows and Linux, including source code and database files.&lt;br /&gt;
&lt;br /&gt;
== Suregen-2 ==&lt;br /&gt;
[http://www.suregen.de/00023.html Suregen] is “a hybrid, multilingual (German, English) ontology based and NLG-oriented formalism for generating text for documents in clinical medicine.”&lt;br /&gt;
The system Suregen-2 is written in (Allegro) Common Lisp. A [http://www.suregen.de/ftp/standalone1.zip demo system] which runs under Windows is available for download. A [http://www.suregen.de/ftp/selfrunningdemo.zip screencast video] shows data being entered into computer forms using mouse and keyboard while a feedback text is continually updated and shown below. (Try playing the AVI file in [http://www.videolan.org/vlc/ VLC] if you run into problems.) Perhaps this system could be considered an instance of the [http://en.wikipedia.org/wiki/WYSIWYM_(Meant) WYSIWYM] approach.&lt;br /&gt;
&lt;br /&gt;
== SURGE ==&lt;br /&gt;
http://www.cs.bgu.ac.il/surge/&lt;br /&gt;
&lt;br /&gt;
Syntactic realization package. (A CommonLisp package providing an interpreter for a functional&lt;br /&gt;
unification formalism called FUF and SURGE, a large grammar of English written in FUF.) Offers download of SURGE 2.2.&lt;br /&gt;
&lt;br /&gt;
== SURGE 2.3 ==&lt;br /&gt;
http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
The latest version of Surge, including support for written dialogue, and expanded&lt;br /&gt;
syntactic coverage based on the Penn TreeBank.&lt;br /&gt;
&lt;br /&gt;
== SURG-SP ==&lt;br /&gt;
http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
Systemic Unification Reusable Grammar for Spanish is a large scale&lt;br /&gt;
Spanish grammar allowing systems which already use FUF/SURGE for English NLG to be able&lt;br /&gt;
to generate syntactically (and many times semantically) equivalent text in Spanish when&lt;br /&gt;
new lexical items are introduced.  SURG-SP makes use of inputs almost identical to the&lt;br /&gt;
English version Surge 2.3.&lt;br /&gt;
&lt;br /&gt;
== SURG-IT ==&lt;br /&gt;
http://homepages.inf.ed.ac.uk/ccallawa/resources.html&lt;br /&gt;
&lt;br /&gt;
The Italian version of Surge 2.3.&lt;br /&gt;
&lt;br /&gt;
== TG/2 ==&lt;br /&gt;
http://www.dfki.de/pas/f2w.cgi?lts/tg2-e&lt;br /&gt;
&lt;br /&gt;
is a shallow verbalizer that can be quickly adapted to new domains and tasks. &lt;br /&gt;
It combines context-free grammars with templates and canned &lt;br /&gt;
text in a single formalism. Thus the granularity of the language model may depend on the application&lt;br /&gt;
needs. The system currently runs under Solaris 2.5. It is available freely under a research license.&lt;br /&gt;
&lt;br /&gt;
[[Category:Software]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12445</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12445"/>
		<updated>2019-04-11T08:52:53Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists sets of structured data to be used as input for natural language generation tasks, or to inform research on NLG.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
These [https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip data] contain predictions for meteorological parameters such as precipitation, temperature, wind speed, and cloud cover at various altitudes, at regular intervals for various points in the area of interest.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 630 and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects. [http://jetteviethen.net/research/spatial.html The corpora and stimulus scenes are available here.]&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/ TUNA Reference Corpus] is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. ([http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/corpus.zip direct download link])&lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The [http://www.pitt.edu/~coconut/coconut-corpus.html COCONUT corpus] is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme]. ([http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
== Dialogue Systems ==&lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The [http://www.classic-project.org/corpora Wizard-of-Oz corpus] on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated, etc.), as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Focus on studying the generation target ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
The [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ Patient Information Leaflet (PIL) corpus] is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton. ([http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12444</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12444"/>
		<updated>2019-04-11T08:51:45Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* SUMTIME */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists sets of structured data to be used as input for natural language generation tasks, or to inform research on NLG.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
These [https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip data] contain predictions for meteorological parameters such as precipitation, temperature, wind speed, and cloud cover at various altitudes, at regular intervals for various points in the area of interest.&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
=== WebNLG=== &lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
== Dialogue Systems ==&lt;br /&gt;
&lt;br /&gt;
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The [http://www.classic-project.org/corpora Wizard-of-Oz corpus] on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated, etc.), as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation ==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora, GRE3D3 and GRE3D7, contain 720 and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects. [http://jetteviethen.net/research/spatial.html The corpora and stimulus scenes are available here.]&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/ TUNA Reference Corpus] is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. ([http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/corpus.zip direct download link])&lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The [http://www.pitt.edu/~coconut/coconut-corpus.html COCONUT corpus] is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme]. ([http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
== Focus on studying the generation target ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
The [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ Patient Information Leaflet (PIL) corpus] is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton. ([http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12443</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12443"/>
		<updated>2019-04-11T08:50:41Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* SUMTIME */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists sets of structured data to be used as input for natural language generation tasks, or to inform research on NLG.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
These data contain predictions for meteorological parameters such as precipitation, temperature, wind speed, and cloud cover at various altitudes, at regular intervals for various points in the area of interest.&lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip&lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
=== WebNLG ===&lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
== Dialogue Systems ==&lt;br /&gt;
&lt;br /&gt;
=== CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems ===&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The [http://www.classic-project.org/corpora Wizard-of-Oz corpus] on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices of Information Presentation strategy (summary, compare, recommend, or a combination of these) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) and subjective measures (user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation ==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora, GRE3D3 and GRE3D7, contain 720 and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects. [http://jetteviethen.net/research/spatial.html The corpora and stimulus scenes are available here.]&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/ TUNA Reference Corpus] is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. ([http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/corpus.zip direct download link])&lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The [http://www.pitt.edu/~coconut/coconut-corpus.html COCONUT corpus] is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme]. ([http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
== Focus on studying the generation target ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
The [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ Patient Information Leaflet (PIL) corpus] is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton. ([http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12442</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12442"/>
		<updated>2019-04-11T08:50:19Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Focus on content selection, aggregation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists sets of structured data to be used as input for natural language generation tasks, or to inform research on NLG.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip &lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
=== WebNLG ===&lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
== Dialogue Systems ==&lt;br /&gt;
&lt;br /&gt;
=== CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems ===&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The [http://www.classic-project.org/corpora Wizard-of-Oz corpus] on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices of Information Presentation strategy (summary, compare, recommend, or a combination of these) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) and subjective measures (user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation ==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora, GRE3D3 and GRE3D7, contain 720 and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects. [http://jetteviethen.net/research/spatial.html The corpora and stimulus scenes are available here.]&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/ TUNA Reference Corpus] is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. ([http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/corpus.zip direct download link])&lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The [http://www.pitt.edu/~coconut/coconut-corpus.html COCONUT corpus] is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme]. ([http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
== Focus on studying the generation target ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
The [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ Patient Information Leaflet (PIL) corpus] is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton. ([http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12441</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12441"/>
		<updated>2019-04-11T08:49:26Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists sets of structured data to be used as input for natural language generation tasks, or to inform research on NLG.&lt;br /&gt;
&lt;br /&gt;
==Data-to-text/Concept-to-text Generation==&lt;br /&gt;
&lt;br /&gt;
=== E2E === &lt;br /&gt;
http://www.macs.hw.ac.uk/InteractionLab/E2E/#data &lt;br /&gt;
&lt;br /&gt;
=== SUMTIME === &lt;br /&gt;
https://ehudreiter.files.wordpress.com/2016/12/sumtime.zip &lt;br /&gt;
&lt;br /&gt;
=== WeatherGov ===&lt;br /&gt;
https://cs.stanford.edu/~pliang/data/weather-data.zip &lt;br /&gt;
&lt;br /&gt;
=== WebNLG ===&lt;br /&gt;
https://github.com/ThiagoCF05/webnlg &lt;br /&gt;
&lt;br /&gt;
== Focus on content selection, aggregation ==&lt;br /&gt;
=== SumTime Meteo ===&lt;br /&gt;
&lt;br /&gt;
These data contain predictions for meteorological parameters such as precipitation, temperature, wind speed, and cloud cover at various altitudes, at regular intervals for various points in the area of interest.&lt;br /&gt;
&lt;br /&gt;
The weather corpus is currently available as an Access database and, alternatively, as CSV (ASCII) files.&lt;br /&gt;
&lt;br /&gt;
Download and Info: [[SumTime-Meteo]]&lt;br /&gt;
&lt;br /&gt;
Project link: http://www.csd.abdn.ac.uk/research/sumtime/&lt;br /&gt;
&lt;br /&gt;
=== CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems ===&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The [http://www.classic-project.org/corpora Wizard-of-Oz corpus] on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices of Information Presentation strategy (summary, compare, recommend, or a combination of these) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) and subjective measures (user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation ==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora, GRE3D3 and GRE3D7, contain 720 and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects. [http://jetteviethen.net/research/spatial.html The corpora and stimulus scenes are available here.]&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/ TUNA Reference Corpus] is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. ([http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/corpus.zip direct download link])&lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The [http://www.pitt.edu/~coconut/coconut-corpus.html COCONUT corpus] is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme]. ([http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
== Focus on studying the generation target ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
The [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ Patient Information Leaflet (PIL) corpus] is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton. ([http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12440</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12440"/>
		<updated>2019-04-11T08:46:16Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Referring Expressions Generation */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists sets of structured data to be used as input for natural language generation tasks, or to inform research on NLG.&lt;br /&gt;
&lt;br /&gt;
== Focus on studying the generation target ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
The [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ Patient Information Leaflet (PIL) corpus] is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton. ([http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
== Focus on content selection, aggregation ==&lt;br /&gt;
=== SumTime Meteo ===&lt;br /&gt;
&lt;br /&gt;
These data contain predictions for meteorological parameters such as precipitation, temperature, wind speed, and cloud cover at various altitudes, at regular intervals for various points in the area of interest.&lt;br /&gt;
&lt;br /&gt;
The weather corpus is currently available as an Access database and, alternatively, as CSV (ASCII) files.&lt;br /&gt;
&lt;br /&gt;
Download and Info: [[SumTime-Meteo]]&lt;br /&gt;
&lt;br /&gt;
Project link: http://www.csd.abdn.ac.uk/research/sumtime/&lt;br /&gt;
&lt;br /&gt;
=== CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems ===&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The [http://www.classic-project.org/corpora Wizard-of-Oz corpus] on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices of Information Presentation strategy (summary, compare, recommend, or a combination of these) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) and subjective measures (user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation ==&lt;br /&gt;
Referring expression generation is a sub-task of NLG that focuses only on the generation of referring expressions (descriptions) that identify specific entities called targets.&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora, GRE3D3 and GRE3D7, contain 720 and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects. [http://jetteviethen.net/research/spatial.html The corpora and stimulus scenes are available here.]&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
=== The REAL dataset ===&lt;br /&gt;
https://datastorre.stir.ac.uk/handle/11667/82&lt;br /&gt;
&lt;br /&gt;
=== GeoDescriptors ===&lt;br /&gt;
https://gitlab.citius.usc.es/alejandro.ramos/geodescriptors &lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/ TUNA Reference Corpus] is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. ([http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/corpus.zip direct download link])&lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The [http://www.pitt.edu/~coconut/coconut-corpus.html COCONUT corpus] is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme]. ([http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
== Focus on lexicalization ==&lt;br /&gt;
...&lt;br /&gt;
&lt;br /&gt;
== Focus on syntax, realization ==&lt;br /&gt;
...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12439</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12439"/>
		<updated>2019-04-11T08:43:19Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Refcoco */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists sets of structured data to be used as input for natural language generation tasks, or to inform research on NLG.&lt;br /&gt;
&lt;br /&gt;
== Focus on studying the generation target ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
The [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ Patient Information Leaflet (PIL) corpus] is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton. ([http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
== Focus on content selection, aggregation ==&lt;br /&gt;
=== SumTime Meteo ===&lt;br /&gt;
&lt;br /&gt;
These data contain predictions for meteorological parameters such as precipitation, temperature, wind speed, and cloud cover at various altitudes, at regular intervals for various points in the area of interest.&lt;br /&gt;
&lt;br /&gt;
The weather corpus is currently available as an Access database and, alternatively, as CSV (ASCII) files.&lt;br /&gt;
&lt;br /&gt;
Download and Info: [[SumTime-Meteo]]&lt;br /&gt;
&lt;br /&gt;
Project link: http://www.csd.abdn.ac.uk/research/sumtime/&lt;br /&gt;
&lt;br /&gt;
=== CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems ===&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The [http://www.classic-project.org/corpora Wizard-of-Oz corpus] on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices of Information Presentation strategy (summary, compare, recommend, or a combination of these) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) and subjective measures (user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation ==&lt;br /&gt;
Referring expression generation is a sub-task of NLG with an active research community.&lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The [http://www.pitt.edu/~coconut/coconut-corpus.html COCONUT corpus] is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme]. ([http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora, GRE3D3 and GRE3D7, contain 720 and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects. [http://jetteviethen.net/research/spatial.html The corpora and stimulus scenes are available here.]&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/ TUNA Reference Corpus] is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment, and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. ([http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/corpus.zip direct download link])&lt;br /&gt;
&lt;br /&gt;
=== RefClef, RefCOCO, RefCOCO+ and RefCOCOg ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
== Focus on lexicalization ==&lt;br /&gt;
...&lt;br /&gt;
&lt;br /&gt;
== Focus on syntax, realization ==&lt;br /&gt;
...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12438</id>
		<title>Data sets for NLG</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&amp;diff=12438"/>
		<updated>2019-04-11T08:41:43Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Focus on generating referring expressions */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;&amp;lt;!-- MoinMoin name:  DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Comment:         --&amp;gt;&lt;br /&gt;
&amp;lt;!-- WikiMedia name: DataSets --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Page revision:  00000001 --&amp;gt;&lt;br /&gt;
&amp;lt;!-- Original date:  Fri Nov 11 09:00:35 2005 (1131699635000000) --&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This page lists sets of structured data to be used as input for natural language generation tasks, or to inform research on NLG.&lt;br /&gt;
&lt;br /&gt;
== Focus on studying the generation target ==&lt;br /&gt;
=== PIL: Patient Information Leaflet corpus ===&lt;br /&gt;
The [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ Patient Information Leaflet (PIL) corpus] is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton. ([http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
== Focus on content selection, aggregation ==&lt;br /&gt;
=== SumTime Meteo ===&lt;br /&gt;
&lt;br /&gt;
These data contain predictions for meteorological parameters such as precipitation, temperature, wind speed, and cloud cover at various altitudes, at regular intervals for various points in the area of interest.&lt;br /&gt;
&lt;br /&gt;
The weather corpus currently exists as an Access database and, alternatively, in the form of CSV (ASCII) files.&lt;br /&gt;
&lt;br /&gt;
Download and Info: [[SumTime-Meteo]]&lt;br /&gt;
&lt;br /&gt;
Project link: http://www.csd.abdn.ac.uk/research/sumtime/&lt;br /&gt;
&lt;br /&gt;
=== CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems ===&lt;br /&gt;
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The [http://www.classic-project.org/corpora Wizard-of-Oz corpus] on Information Presentation in Spoken Dialogue Systems contains the wizards&#039; choices of Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, number of sentences generated, etc.) as well as subjective measures (the user scores) were logged.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Referring Expressions Generation ==&lt;br /&gt;
Referring expression generation is a sub-task of NLG with an active research community.&lt;br /&gt;
&lt;br /&gt;
=== COCONUT Corpus ===&lt;br /&gt;
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The [http://www.pitt.edu/~coconut/coconut-corpus.html COCONUT corpus] is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme]. ([http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz direct download link])&lt;br /&gt;
&lt;br /&gt;
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===&lt;br /&gt;
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.&lt;br /&gt;
The resulting corpora GRE3D3 and GRE3D7 contain 720 and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects. [http://jetteviethen.net/research/spatial.html The corpora and stimulus scenes are available here.]&lt;br /&gt;
&lt;br /&gt;
=== TUNA Reference Corpus ===&lt;br /&gt;
The [http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/ TUNA Reference Corpus] is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment, and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. ([http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/corpus.zip direct download link])&lt;br /&gt;
&lt;br /&gt;
=== Refcoco ===&lt;br /&gt;
https://github.com/lichengunc/refer&lt;br /&gt;
&lt;br /&gt;
== Focus on lexicalization ==&lt;br /&gt;
...&lt;br /&gt;
&lt;br /&gt;
== Focus on syntax, realization ==&lt;br /&gt;
...&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
[[Category:Knowledge Collections and Datasets]]&lt;br /&gt;
{{SIGGEN Wiki}}&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12373</id>
		<title>SIGGEN</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12373"/>
		<updated>2019-01-28T14:09:28Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Board */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h1&amp;gt;ACL Special Interest Group on Natural Language Generation &amp;lt;/h1&amp;gt; &lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
|-&lt;br /&gt;
|[[File:Siggen_logo_small.JPG|left]]||&amp;lt;h4 style=&amp;quot;width:95%;margin:0;background-color:#cedff2;font-size:120%;font-weight:bold;border:1px solid #a3b0bf;text-align:justify;color:#000;padding:0.2em 0.4em;&amp;quot;&amp;gt;Welcome to the home page of the Association for Computational Linguistics Special Interest Group on Natural Language Generation. SIGGEN [ˈsɪɡ.ʤɛn] is a special interest group of the Association for Computational Linguistics (ACL). It provides a forum for the discussion, dissemination and archiving of research topics and results in the field of text generation. &amp;lt;/h4&amp;gt;&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Active topics of interest include:&lt;br /&gt;
&lt;br /&gt;
*Discourse models, content planning.&lt;br /&gt;
*Syntactic realization: formalisms and models of grammars for sentence production.&lt;br /&gt;
*Architecture of generators.&lt;br /&gt;
*Lexical choice.&lt;br /&gt;
*Psychological modelling of discourse production.&lt;br /&gt;
*Pragmatic influences on lexical choice, syntax and content selection.&lt;br /&gt;
*Multilingual or multi-modal generation.&lt;br /&gt;
*Applications of generation technology (report generation, explanation for knowledge-based systems, automatic translation...).&lt;br /&gt;
*Learning methods.&lt;br /&gt;
*Evaluation of generation results.&lt;br /&gt;
&lt;br /&gt;
Relevant aspects of the following areas relate to problems of natural language generation:&lt;br /&gt;
&lt;br /&gt;
*Grammar theory&lt;br /&gt;
*Statistical methods&lt;br /&gt;
*Speech synthesis&lt;br /&gt;
*Psycholinguistics&lt;br /&gt;
*Neuroscience&lt;br /&gt;
*Philosophy&lt;br /&gt;
&lt;br /&gt;
== Upcoming Events ==&lt;br /&gt;
&lt;br /&gt;
INLG 2019 will be announced soon!&lt;br /&gt;
&lt;br /&gt;
== Recent Events ==&lt;br /&gt;
&lt;br /&gt;
[https://inlg2018.uvt.nl/ INLG 2018]&lt;br /&gt;
&lt;br /&gt;
Tilburg, Netherlands, 5-8 November 2018&lt;br /&gt;
&lt;br /&gt;
== Mailing List ==&lt;br /&gt;
=== Joining the mailing list: ===&lt;br /&gt;
&lt;br /&gt;
:The SIGGEN mailing list is currently going through a transition. &lt;br /&gt;
:To sign up, view preferences, change preferences, or unsubscribe, go to: &lt;br /&gt;
&lt;br /&gt;
::&#039;&#039;&#039;[http://www.jiscmail.ac.uk/SIGGEN http://www.jiscmail.ac.uk/SIGGEN]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:If there are any issues, e-mail: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-webmaster (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Posting messages to the mailing list ===&lt;br /&gt;
&lt;br /&gt;
:Please join the mailing list first (see above). Then you may use the email alias &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-list (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt; to post e-mails to the list.&lt;br /&gt;
&lt;br /&gt;
== Board ==&lt;br /&gt;
The SIGGEN board is made up of the following people:&lt;br /&gt;
&lt;br /&gt;
*[https://ehudreiter.com/ Ehud Reiter] ([mailto:e.reiter@abdn.ac.uk mail]) [https://www.abdn.ac.uk/ncs/profiles/e.reiter/ Professor/Chair in Computer Science at University of Aberdeen] ([mailto:siggen-chair(ta)aclweb(dot)org chair])&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[https://dimitragkatzia.wordpress.com Dimitra Gkatzia] ([mailto:d.gkatzia@napier.ac.uk mail]) [http://www.napier.ac.uk/about-us/our-schools/school-of-computing/staff School of Computing, Edinburgh Napier University], Edinburgh.&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[http://amandastent.com// Amanda Stent] ([mailto:amanda.stent@gmail.com mail]), Bloomberg LP ([mailto:siggen-treasurer(ta)aclweb(dot)org treasurer])&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[https://citius.usc.es/equipo/investigadores-postdoutorais/jose-maria-alonso-moral Jose M. Alonso] ([mailto:josemaria.alonso.moral@usc.es mail]) [https://citius.usc.es/equipo/investigadores-postdoutorais/jose-maria-alonso-moral University of Santiago de Compostela], Spain (secretary)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[https://www.edinburgh-robotics.org/students/amanda-cercas-curry Amanda Curry] ([mailto:ac293@hw.ac.uk  mail]) [https://www.hw.ac.uk/ School of Mathematical and Computer Sciences, Heriot-Watt University] (student member)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2020&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To contact the entire board, please use the email alias: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-board (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== [http://www.aclweb.org/anthology/siggen.html  Workshop Proceedings ] ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== [[SIGGEN: Archive|Archive]] ==&lt;br /&gt;
== [[SIGGEN: Newsletter Archive|Newsletter Archive]] ==&lt;br /&gt;
== [[SIGGEN: Constitution|Constitution]] ==&lt;br /&gt;
== [[SIGGEN: Who&#039;s Who in NLG|Who&#039;s Who in NLG]] ==&lt;br /&gt;
== [[SIGGEN: What&#039;s Where in NLG|What&#039;s Where in NLG]] ==&lt;br /&gt;
&lt;br /&gt;
== Resources ==&lt;br /&gt;
[[Natural_Language_Generation_Portal|Natural Language Generation Portal]]&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12372</id>
		<title>SIGGEN</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12372"/>
		<updated>2019-01-28T14:09:00Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Board */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h1&amp;gt;ACL Special Interest Group on Natural Language Generation &amp;lt;/h1&amp;gt; &lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
|-&lt;br /&gt;
|[[File:Siggen_logo_small.JPG|left]]||&amp;lt;h4 style=&amp;quot;width:95%;margin:0;background-color:#cedff2;font-size:120%;font-weight:bold;border:1px solid #a3b0bf;text-align:justify;color:#000;padding:0.2em 0.4em;&amp;quot;&amp;gt;Welcome to the home page of the Association for Computational Linguistics Special Interest Group on Natural Language Generation. SIGGEN [ˈsɪɡ.ʤɛn] is a special interest group of the Association for Computational Linguistics (ACL). It provides a forum for the discussion, dissemination and archiving of research topics and results in the field of text generation. &amp;lt;/h4&amp;gt;&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Active topics of interest include:&lt;br /&gt;
&lt;br /&gt;
*Discourse models, content planning.&lt;br /&gt;
*Syntactic realization: formalisms and models of grammars for sentence production.&lt;br /&gt;
*Architecture of generators.&lt;br /&gt;
*Lexical choice.&lt;br /&gt;
*Psychological modelling of discourse production.&lt;br /&gt;
*Pragmatic influences on lexical choice, syntax and content selection.&lt;br /&gt;
*Multilingual or multi-modal generation.&lt;br /&gt;
*Applications of generation technology (report generation, explanation for knowledge-based systems, automatic translation...).&lt;br /&gt;
*Learning methods.&lt;br /&gt;
*Evaluation of generation results.&lt;br /&gt;
&lt;br /&gt;
Relevant aspects of the following areas relate to problems of natural language generation:&lt;br /&gt;
&lt;br /&gt;
*Grammar theory&lt;br /&gt;
*Statistical methods&lt;br /&gt;
*Speech synthesis&lt;br /&gt;
*Psycholinguistics&lt;br /&gt;
*Neuroscience&lt;br /&gt;
*Philosophy&lt;br /&gt;
&lt;br /&gt;
== Upcoming Events ==&lt;br /&gt;
&lt;br /&gt;
INLG 2019 will be announced soon!&lt;br /&gt;
&lt;br /&gt;
== Recent Events ==&lt;br /&gt;
&lt;br /&gt;
[https://inlg2018.uvt.nl/ INLG 2018]&lt;br /&gt;
&lt;br /&gt;
Tilburg, Netherlands, 5-8 November 2018&lt;br /&gt;
&lt;br /&gt;
== Mailing List ==&lt;br /&gt;
=== Joining the mailing list: ===&lt;br /&gt;
&lt;br /&gt;
:The SIGGEN mailing list is currently going through a transition. &lt;br /&gt;
:To sign up, view preferences, change preferences, or unsubscribe, go to: &lt;br /&gt;
&lt;br /&gt;
::&#039;&#039;&#039;[http://www.jiscmail.ac.uk/SIGGEN http://www.jiscmail.ac.uk/SIGGEN]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:If there are any issues, e-mail: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-webmaster (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Posting messages to the mailing list ===&lt;br /&gt;
&lt;br /&gt;
:Please join the mailing list first (see above). Then you may use the email alias &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-list (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt; to post e-mails to the list.&lt;br /&gt;
&lt;br /&gt;
== Board ==&lt;br /&gt;
The SIGGEN board is made up of the following people:&lt;br /&gt;
&lt;br /&gt;
*[https://ehudreiter.com/ Ehud Reiter] ([mailto:e.reiter@abdn.ac.uk mail]) [https://www.abdn.ac.uk/ncs/profiles/e.reiter/ Professor/Chair in Computer Science at University of Aberdeen] ([mailto:siggen-chair(ta)aclweb(dot)org chair])&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[https://dimitragkatzia.wordpress.com Dimitra Gkatzia] ([mailto:d.gkatzia@napier.ac.uk mail]) [http://www.napier.ac.uk/about-us/our-schools/school-of-computing/staff School of Computing, Edinburgh Napier University], Edinburgh.&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[http://amandastent.com// Amanda Stent] ([mailto:amanda.stent@gmail.com mail]), Bloomberg LP ([mailto:siggen-treasurer(ta)aclweb(dot)org treasurer])&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[https://citius.usc.es/equipo/investigadores-postdoutorais/jose-maria-alonso-moral Jose M. Alonso] ([mailto:josemaria.alonso.moral@usc.es mail]) [https://citius.usc.es/equipo/investigadores-postdoutorais/jose-maria-alonso-moral University of Santiago de Compostela], Spain (secretary)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[https://www.edinburgh-robotics.org/students/amanda-cercas-curry Amanda Curry] ([mailto:ac293@hw.ac.uk  mail]) [https://www.hw.ac.uk/ School of Mathematical and Computer Sciences, Heriot-Watt University] (student member)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2020&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To contact the entire board, please use the email alias: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-board (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
For questions regarding this website, please email: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-webmaster (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== [http://www.aclweb.org/anthology/siggen.html  Workshop Proceedings ] ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== [[SIGGEN: Archive|Archive]] ==&lt;br /&gt;
== [[SIGGEN: Newsletter Archive|Newsletter Archive]] ==&lt;br /&gt;
== [[SIGGEN: Constitution|Constitution]] ==&lt;br /&gt;
== [[SIGGEN: Who&#039;s Who in NLG|Who&#039;s Who in NLG]] ==&lt;br /&gt;
== [[SIGGEN: What&#039;s Where in NLG|What&#039;s Where in NLG]] ==&lt;br /&gt;
&lt;br /&gt;
== Resources ==&lt;br /&gt;
[[Natural_Language_Generation_Portal|Natural Language Generation Portal]]&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12371</id>
		<title>SIGGEN</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12371"/>
		<updated>2019-01-28T14:07:09Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Board */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h1&amp;gt;ACL Special Interest Group on Natural Language Generation &amp;lt;/h1&amp;gt; &lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
|-&lt;br /&gt;
|[[File:Siggen_logo_small.JPG|left]]||&amp;lt;h4 style=&amp;quot;width:95%;margin:0;background-color:#cedff2;font-size:120%;font-weight:bold;border:1px solid #a3b0bf;text-align:justify;color:#000;padding:0.2em 0.4em;&amp;quot;&amp;gt;Welcome to the home page of the Association for Computational Linguistics Special Interest Group on Natural Language Generation. SIGGEN [ˈsɪɡ.ʤɛn] is a special interest group of the Association for Computational Linguistics (ACL). It provides a forum for the discussion, dissemination and archiving of research topics and results in the field of text generation. &amp;lt;/h4&amp;gt;&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Active topics of interest include:&lt;br /&gt;
&lt;br /&gt;
*Discourse models, content planning.&lt;br /&gt;
*Syntactic realization: formalisms and models of grammars for sentence production.&lt;br /&gt;
*Architecture of generators.&lt;br /&gt;
*Lexical choice.&lt;br /&gt;
*Psychological modelling of discourse production.&lt;br /&gt;
*Pragmatic influences on lexical choice, syntax and content selection.&lt;br /&gt;
*Multilingual or multi-modal generation.&lt;br /&gt;
*Applications of generation technology (report generation, explanation for knowledge-based systems, automatic translation...).&lt;br /&gt;
*Learning methods.&lt;br /&gt;
*Evaluation of generation results.&lt;br /&gt;
&lt;br /&gt;
Relevant aspects of the following areas relate to problems of natural language generation:&lt;br /&gt;
&lt;br /&gt;
*Grammar theory&lt;br /&gt;
*Statistical methods&lt;br /&gt;
*Speech synthesis&lt;br /&gt;
*Psycholinguistics&lt;br /&gt;
*Neuroscience&lt;br /&gt;
*Philosophy&lt;br /&gt;
&lt;br /&gt;
== Upcoming Events ==&lt;br /&gt;
&lt;br /&gt;
INLG 2019 will be announced soon!&lt;br /&gt;
&lt;br /&gt;
== Recent Events ==&lt;br /&gt;
&lt;br /&gt;
[https://inlg2018.uvt.nl/ INLG 2018]&lt;br /&gt;
&lt;br /&gt;
Tilburg, Netherlands, 5-8 November 2018&lt;br /&gt;
&lt;br /&gt;
== Mailing List ==&lt;br /&gt;
=== Joining the mailing list: ===&lt;br /&gt;
&lt;br /&gt;
:The SIGGEN mailing list is currently going through a transition. &lt;br /&gt;
:To sign up, view preferences, change preferences, or unsubscribe, go to: &lt;br /&gt;
&lt;br /&gt;
::&#039;&#039;&#039;[http://www.jiscmail.ac.uk/SIGGEN http://www.jiscmail.ac.uk/SIGGEN]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:If there are any issues, e-mail: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-webmaster (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Posting messages to the mailing list ===&lt;br /&gt;
&lt;br /&gt;
:Please join the mailing list first (see above). Then you may use the email alias &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-list (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt; to post e-mails to the list.&lt;br /&gt;
&lt;br /&gt;
== Board ==&lt;br /&gt;
The SIGGEN board is made up of the following people:&lt;br /&gt;
&lt;br /&gt;
*[https://ehudreiter.com/ Ehud Reiter] ([mailto:e.reiter@abdn.ac.uk mail]) Professor/Chair in Computer Science at [https://www.abdn.ac.uk/ncs/profiles/e.reiter/ University of Aberdeen] ([mailto:siggen-chair(ta)aclweb(dot)org chair])&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[https://dimitragkatzia.wordpress.com Dimitra Gkatzia] ([mailto:d.gkatzia@napier.ac.uk mail]) [http://www.napier.ac.uk/about-us/our-schools/school-of-computing/staff School of Computing, Edinburgh Napier University], Edinburgh.&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[http://amandastent.com// Amanda Stent] ([mailto:amanda.stent@gmail.com mail]), Bloomberg LP ([mailto:siggen-treasurer(ta)aclweb(dot)org treasurer])&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[https://citius.usc.es/equipo/investigadores-postdoutorais/jose-maria-alonso-moral Jose M. Alonso] ([mailto:josemaria.alonso.moral@usc.es mail]) [https://citius.usc.es/equipo/investigadores-postdoutorais/jose-maria-alonso-moral University of Santiago de Compostela], Spain (secretary)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[https://www.edinburgh-robotics.org/students/amanda-cercas-curry Amanda Curry] ([mailto:ac293@hw.ac.uk  mail]) [https://www.hw.ac.uk/ School of Mathematical and Computer Sciences, Heriot-Watt University] (student member)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2020&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To contact the entire board, please use the email alias: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-board (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
For questions regarding this website, please email: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-webmaster (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== [http://www.aclweb.org/anthology/siggen.html  Workshop Proceedings ] ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== [[SIGGEN: Archive|Archive]] ==&lt;br /&gt;
== [[SIGGEN: Newsletter Archive|Newsletter Archive]] ==&lt;br /&gt;
== [[SIGGEN: Constitution|Constitution]] ==&lt;br /&gt;
== [[SIGGEN: Who&#039;s Who in NLG|Who&#039;s Who in NLG]] ==&lt;br /&gt;
== [[SIGGEN: What&#039;s Where in NLG|What&#039;s Where in NLG]] ==&lt;br /&gt;
&lt;br /&gt;
== Resources ==&lt;br /&gt;
[[Natural_Language_Generation_Portal|Natural Language Generation Portal]]&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12370</id>
		<title>SIGGEN</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12370"/>
		<updated>2019-01-28T14:03:47Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Board */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h1&amp;gt;ACL Special Interest Group on Natural Language Generation &amp;lt;/h1&amp;gt; &lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
|-&lt;br /&gt;
|[[File:Siggen_logo_small.JPG|left]]||&amp;lt;h4 style=&amp;quot;width:95%;margin:0;background-color:#cedff2;font-size:120%;font-weight:bold;border:1px solid #a3b0bf;text-align:justify;color:#000;padding:0.2em 0.4em;&amp;quot;&amp;gt;Welcome to the home page of the Association for Computational Linguistics Special Interest Group on Natural Language Generation. SIGGEN [ˈsɪɡ.ʤɛn] is a special interest group of the Association for Computational Linguistics (ACL). It provides a forum for the discussion, dissemination and archiving of research topics and results in the field of text generation. &amp;lt;/h4&amp;gt;&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Active topics of interest include:&lt;br /&gt;
&lt;br /&gt;
*Discourse models, content planning.&lt;br /&gt;
*Syntactic realization: formalisms and models of grammars for sentence production.&lt;br /&gt;
*Architecture of generators.&lt;br /&gt;
*Lexical choice.&lt;br /&gt;
*Psychological modelling of discourse production.&lt;br /&gt;
*Pragmatic influences on lexical choice, syntax and content selection.&lt;br /&gt;
*Multilingual or multi-modal generation.&lt;br /&gt;
*Applications of generation technology (report generation, explanation for knowledge-based systems, automatic translation...).&lt;br /&gt;
*Learning methods.&lt;br /&gt;
*Evaluation of generation results.&lt;br /&gt;
&lt;br /&gt;
Relevant aspects of the following areas relate to problems of natural language generation:&lt;br /&gt;
&lt;br /&gt;
*Grammar theory&lt;br /&gt;
*Statistical methods&lt;br /&gt;
*Speech synthesis&lt;br /&gt;
*Psycholinguistics&lt;br /&gt;
*Neuroscience&lt;br /&gt;
*Philosophy&lt;br /&gt;
&lt;br /&gt;
== Upcoming Events ==&lt;br /&gt;
&lt;br /&gt;
INLG 2019 will be announced soon!&lt;br /&gt;
&lt;br /&gt;
== Recent Events ==&lt;br /&gt;
&lt;br /&gt;
[https://inlg2018.uvt.nl/ INLG 2018]&lt;br /&gt;
&lt;br /&gt;
Tilburg, Netherlands, 5-8 November 2018&lt;br /&gt;
&lt;br /&gt;
== Mailing List ==&lt;br /&gt;
=== Joining the mailing list: ===&lt;br /&gt;
&lt;br /&gt;
:The SIGGEN mailing list is currently going through a transition. &lt;br /&gt;
:To sign up, view preferences, change preferences, or unsubscribe, go to: &lt;br /&gt;
&lt;br /&gt;
::&#039;&#039;&#039;[http://www.jiscmail.ac.uk/SIGGEN http://www.jiscmail.ac.uk/SIGGEN]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:If there are any issues, e-mail: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-webmaster (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Posting messages to the mailing list ===&lt;br /&gt;
&lt;br /&gt;
:Please join the mailing list first (see above). Then you may use the email alias &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-list (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt; to post e-mails to the list.&lt;br /&gt;
&lt;br /&gt;
== Board ==&lt;br /&gt;
The SIGGEN board is made up of the following people:&lt;br /&gt;
&lt;br /&gt;
*[https://ehudreiter.com/ Ehud Reiter] ([mailto:e.reiter@abdn.ac.uk mail]) Professor/Chair in Computer Science at [https://www.abdn.ac.uk/ncs/profiles/e.reiter/ University of Aberdeen] ([mailto:siggen-chair(ta)aclweb(dot)org chair])&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[https://dimitragkatzia.wordpress.com Dimitra Gkatzia] ([mailto:d.gkatzia@napier.ac.uk mail]) [http://www.napier.ac.uk/about-us/our-schools/school-of-computing/staff School of Computing, Edinburgh Napier University], Edinburgh.&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[http://amandastent.com// Amanda Stent] ([mailto:amanda.stent@gmail.com mail]), Bloomberg LP ([mailto:siggen-treasurer(ta)aclweb(dot)org treasurer])&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[https://citius.usc.es/equipo/investigadores-postdoutorais/jose-maria-alonso-moral Jose M. Alonso] ([mailto:josemaria.alonso.moral@usc.es mail]) [https://citius.usc.es/equipo/investigadores-postdoutorais/jose-maria-alonso-moral University of Santiago de Compostela], Spain (secretary)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[http://homepages.inf.ed.ac.uk/amyi Amy Isard] ([mailto:amy.isard@ed.ac.uk mail]) [http://www.inf.ed.ac.uk School of Informatics, University of Edinburgh] (student member)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2020&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To contact the entire board, please use the email alias: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-board (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
For questions regarding this website, please email: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-webmaster (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== [http://www.aclweb.org/anthology/siggen.html  Workshop Proceedings ] ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== [[SIGGEN: Archive|Archive]] ==&lt;br /&gt;
== [[SIGGEN: Newsletter Archive|Newsletter Archive]] ==&lt;br /&gt;
== [[SIGGEN: Constitution|Constitution]] ==&lt;br /&gt;
== [[SIGGEN: Who&#039;s Who in NLG|Who&#039;s Who in NLG]] ==&lt;br /&gt;
== [[SIGGEN: What&#039;s Where in NLG|What&#039;s Where in NLG]] ==&lt;br /&gt;
&lt;br /&gt;
== Resources ==&lt;br /&gt;
[[Natural_Language_Generation_Portal|Natural Language Generation Portal]]&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12369</id>
		<title>SIGGEN</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12369"/>
		<updated>2019-01-28T14:02:50Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Board */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h1&amp;gt;ACL Special Interest Group on Natural Language Generation &amp;lt;/h1&amp;gt; &lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
|-&lt;br /&gt;
|[[File:Siggen_logo_small.JPG|left]]||&amp;lt;h4 style=&amp;quot;width:95%;margin:0;background-color:#cedff2;font-size:120%;font-weight:bold;border:1px solid #a3b0bf;text-align:justify;color:#000;padding:0.2em 0.4em;&amp;quot;&amp;gt;Welcome to the home page of the Association for Computational Linguistics Special Interest Group on Natural Language Generation. SIGGEN [ˈsɪɡ.ʤɛn] is a special interest group of the Association for Computational Linguistics (ACL). It provides a forum for the discussion, dissemination and archiving of research topics and results in the field of text generation. &amp;lt;/h4&amp;gt;&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Active topics of interest include:&lt;br /&gt;
&lt;br /&gt;
*Discourse models, content planning.&lt;br /&gt;
*Syntactic realization: formalisms and models of grammars for sentence production.&lt;br /&gt;
*Architecture of generators.&lt;br /&gt;
*Lexical choice.&lt;br /&gt;
*Psychological modelling of discourse production.&lt;br /&gt;
*Pragmatic influences on lexical choice, syntax and content selection.&lt;br /&gt;
*Multilingual or multi-modal generation.&lt;br /&gt;
*Applications of generation technology (report generation, explanation for knowledge-based systems, automatic translation...).&lt;br /&gt;
*Learning methods.&lt;br /&gt;
*Evaluation of generation results.&lt;br /&gt;
&lt;br /&gt;
Relevant aspects of the following areas relate to problems of natural language generation:&lt;br /&gt;
&lt;br /&gt;
*Grammar theory&lt;br /&gt;
*Statistical methods&lt;br /&gt;
*Speech synthesis&lt;br /&gt;
*Psycholinguistics&lt;br /&gt;
*Neuroscience&lt;br /&gt;
*Philosophy&lt;br /&gt;
&lt;br /&gt;
== Upcoming Events ==&lt;br /&gt;
&lt;br /&gt;
INLG 2019 will be announced soon!&lt;br /&gt;
&lt;br /&gt;
== Recent Events ==&lt;br /&gt;
&lt;br /&gt;
[https://inlg2018.uvt.nl/ INLG 2018]&lt;br /&gt;
&lt;br /&gt;
Tilburg, Netherlands, 5-8 November 2018&lt;br /&gt;
&lt;br /&gt;
== Mailing List ==&lt;br /&gt;
=== Joining the mailing list: ===&lt;br /&gt;
&lt;br /&gt;
:The SIGGEN mailing list is currently going through a transition. &lt;br /&gt;
:To sign up, view preferences, change preferences, or unsubscribe, go to: &lt;br /&gt;
&lt;br /&gt;
::&#039;&#039;&#039;[http://www.jiscmail.ac.uk/SIGGEN http://www.jiscmail.ac.uk/SIGGEN]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:If there are any issues, e-mail: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-webmaster (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Posting messages to the mailing list ===&lt;br /&gt;
&lt;br /&gt;
:Please join the mailing list first (see above). Then you may use the email alias &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-list (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt; to post e-mails to the list.&lt;br /&gt;
&lt;br /&gt;
== Board ==&lt;br /&gt;
The SIGGEN board is made up of the following people:&lt;br /&gt;
&lt;br /&gt;
*[https://ehudreiter.com/ Ehud Reiter] ([mailto:e.reiter@abdn.ac.uk mail]) Professor/Chair in Computer Science at [https://www.abdn.ac.uk/ncs/profiles/e.reiter/ University of Aberdeen] ([mailto:siggen-chair(ta)aclweb(dot)org chair])&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[https://dimitragkatzia.wordpress.com Dimitra Gkatzia] ([mailto:d.gkatzia@napier.ac.uk mail]) [http://www.napier.ac.uk/about-us/our-schools/school-of-computing/staff School of Computing, Edinburgh Napier University], Edinburgh.&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[http://amandastent.com// Amanda Stent] ([mailto:amanda.stent@gmail.com mail]), Bloomberg LP ([mailto:siggen-treasurer(ta)aclweb(dot)org treasurer])&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[https://citius.usc.es/equipo/investigadores-postdoutorais/jose-maria-alonso-moral Jose M. Alonso] ([mailto:josemaria.alonso.moral@usc.es mail]) University of Santiago de Compostela, Spain (secretary)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[http://homepages.inf.ed.ac.uk/amyi Amy Isard] ([mailto:amy.isard@ed.ac.uk mail]) [http://www.inf.ed.ac.uk School of Informatics, University of Edinburgh] (student member)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2020&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To contact the entire board, please use the email alias: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-board (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
For questions regarding this website, please email: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-webmaster (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== [http://www.aclweb.org/anthology/siggen.html  Workshop Proceedings ] ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== [[SIGGEN: Archive|Archive]] ==&lt;br /&gt;
== [[SIGGEN: Newsletter Archive|Newsletter Archive]] ==&lt;br /&gt;
== [[SIGGEN: Constitution|Constitution]] ==&lt;br /&gt;
== [[SIGGEN: Who&#039;s Who in NLG|Who&#039;s Who in NLG]] ==&lt;br /&gt;
== [[SIGGEN: What&#039;s Where in NLG|What&#039;s Where in NLG]] ==&lt;br /&gt;
&lt;br /&gt;
== Resources ==&lt;br /&gt;
[[Natural_Language_Generation_Portal|Natural Language Generation Portal]]&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
	<entry>
		<id>https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12368</id>
		<title>SIGGEN</title>
		<link rel="alternate" type="text/html" href="https://www.aclweb.org/aclwiki/index.php?title=SIGGEN&amp;diff=12368"/>
		<updated>2019-01-28T13:57:05Z</updated>

		<summary type="html">&lt;p&gt;Dimitra: /* Board */&lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;__NOTOC__&lt;br /&gt;
&lt;br /&gt;
&amp;lt;h1&amp;gt;ACL Special Interest Group on Natural Language Generation &amp;lt;/h1&amp;gt; &lt;br /&gt;
&lt;br /&gt;
{|&lt;br /&gt;
|-&lt;br /&gt;
|[[File:Siggen_logo_small.JPG|left]]||&amp;lt;h4 style=&amp;quot;width:95%;margin:0;background-color:#cedff2;font-size:120%;font-weight:bold;border:1px solid #a3b0bf;text-align:justify;color:#000;padding:0.2em 0.4em;&amp;quot;&amp;gt;Welcome to the home page of the Association for Computational Linguistics Special Interest Group on Natural Language Generation. SIGGEN [ˈsɪɡ.ʤɛn] is a special interest group of the Association for Computational Linguistics (ACL). It provides a forum for the discussion, dissemination and archiving of research topics and results in the field of text generation. &amp;lt;/h4&amp;gt;&lt;br /&gt;
&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
Active topics of interest include:&lt;br /&gt;
&lt;br /&gt;
*Discourse models, content planning.&lt;br /&gt;
*Syntactic realization: formalisms and models of grammars for sentence production.&lt;br /&gt;
*Architecture of generators.&lt;br /&gt;
*Lexical choice.&lt;br /&gt;
*Psychological modelling of discourse production.&lt;br /&gt;
*Pragmatic influences on lexical choice, syntax and content selection.&lt;br /&gt;
*Multilingual or multi-modal generation.&lt;br /&gt;
*Applications of generation technology (report generation, explanation for knowledge-based systems, automatic translation...).&lt;br /&gt;
*Learning methods.&lt;br /&gt;
*Evaluation of generation results.&lt;br /&gt;
&lt;br /&gt;
Relevant aspects of the following areas relate to problems of natural language generation:&lt;br /&gt;
&lt;br /&gt;
*Grammar theory&lt;br /&gt;
*Statistical methods&lt;br /&gt;
*Speech synthesis&lt;br /&gt;
*Psycholinguistics&lt;br /&gt;
*Neuroscience&lt;br /&gt;
*Philosophy&lt;br /&gt;
&lt;br /&gt;
== Upcoming Events ==&lt;br /&gt;
&lt;br /&gt;
INLG 2019 will be announced soon!&lt;br /&gt;
&lt;br /&gt;
== Recent Events ==&lt;br /&gt;
&lt;br /&gt;
[https://inlg2018.uvt.nl/ INLG 2018]&lt;br /&gt;
&lt;br /&gt;
Tilburg, Netherlands, 5-8 November 2018&lt;br /&gt;
&lt;br /&gt;
== Mailing List ==&lt;br /&gt;
=== Joining the mailing list: ===&lt;br /&gt;
&lt;br /&gt;
:The SIGGEN mailing list is currently going through a transition. &lt;br /&gt;
:To sign up, view preferences, change preferences, or unsubscribe, go to: &lt;br /&gt;
&lt;br /&gt;
::&#039;&#039;&#039;[http://www.jiscmail.ac.uk/SIGGEN http://www.jiscmail.ac.uk/SIGGEN]&#039;&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
:If there are any issues, e-mail: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-webmaster (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
=== Posting messages to the mailing list ===&lt;br /&gt;
&lt;br /&gt;
:Please join the mailing list first (see above). Then you may use the email alias &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-list (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt; to post e-mails to the list.&lt;br /&gt;
&lt;br /&gt;
== Board ==&lt;br /&gt;
The SIGGEN board is made up of the following people:&lt;br /&gt;
&lt;br /&gt;
*[https://ehudreiter.com/ Ehud Reiter] ([mailto:e.reiter@abdn.ac.uk mail]) Professor/Chair in Computer Science at [https://www.abdn.ac.uk/ncs/profiles/e.reiter/ University of Aberdeen] ([mailto:siggen-chair(ta)aclweb(dot)org chair])&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[https://dimitragkatzia.wordpress.com Dimitra Gkatzia] ([mailto:d.gkatzia@napier.ac.uk mail]) [http://www.napier.ac.uk/about-us/our-schools/school-of-computing/staff School of Computing, Edinburgh Napier University], Edinburgh.&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[http://amandastent.com// Amanda Stent] ([mailto:amanda.stent@gmail.com mail]), Bloomberg LP ([mailto:siggen-treasurer(ta)aclweb(dot)org treasurer])&lt;br /&gt;
:elected in December 2016 for the period from 1st January 2017 to 31st December 2020&lt;br /&gt;
*[https://dimitragkatzia.wordpress.com Dimitra Gkatzia] ([mailto:d.gkatzia@napier.ac.uk mail]) [http://www.napier.ac.uk/about-us/our-schools/school-of-computing/staff School of Computing, Edinburgh Napier University], Edinburgh (secretary)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2022&lt;br /&gt;
*[http://homepages.inf.ed.ac.uk/amyi Amy Isard] ([mailto:amy.isard@ed.ac.uk mail]) [http://www.inf.ed.ac.uk School of Informatics, University of Edinburgh] (student member)&lt;br /&gt;
:elected in December 2018 for the period from 1st January 2019 to 31st December 2020&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
To contact the entire board, please use the email alias: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-board (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
For questions regarding this website, please email: &amp;lt;u&amp;gt;&#039;&#039;&#039;siggen-webmaster (ta) aclweb (dot) org&#039;&#039;&#039;&amp;lt;/u&amp;gt;.&lt;br /&gt;
&lt;br /&gt;
== [http://www.aclweb.org/anthology/siggen.html  Workshop Proceedings ] ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== [[SIGGEN: Archive|Archive]] ==&lt;br /&gt;
== [[SIGGEN: Newsletter Archive|Newsletter Archive]] ==&lt;br /&gt;
== [[SIGGEN: Constitution|Constitution]] ==&lt;br /&gt;
== [[SIGGEN: Who&#039;s Who in NLG|Who&#039;s Who in NLG]] ==&lt;br /&gt;
== [[SIGGEN: What&#039;s Where in NLG|What&#039;s Where in NLG]] ==&lt;br /&gt;
&lt;br /&gt;
== Resources ==&lt;br /&gt;
[[Natural_Language_Generation_Portal|Natural Language Generation Portal]]&lt;/div&gt;</summary>
		<author><name>Dimitra</name></author>
	</entry>
</feed>