https://aclweb.org/aclwiki/api.php?action=feedcontributions&user=Jettte&feedformat=atomACL Wiki - User contributions [en]2024-03-28T09:36:44ZUser contributionsMediaWiki 1.35.2https://aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&diff=9662Data sets for NLG2012-08-20T12:19:28Z<p>Jettte: /* GRE3D3: Spatial Relations in Referring Expressions */</p>
<hr />
<div><!-- MoinMoin name: DataSets --><br />
<!-- Comment: --><br />
<!-- WikiMedia name: DataSets --><br />
<!-- Page revision: 00000001 --><br />
<!-- Original date: Fri Nov 11 09:00:35 2005 (1131699635000000) --><br />
<br />
This page lists sets of structured data to be used as input for natural language generation tasks, or to inform research on NLG.<br />
<br />
== Focus on studying the generation target ==<br />
=== PIL: Patient Information Leaflet corpus ===<br />
The [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ Patient Information Leaflet (PIL) corpus] is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton. ([http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz direct download link])<br />
<br />
== Focus on content selection, aggregation ==<br />
=== SumTime Meteo ===<br />
<br />
These data contain predictions for meteorological parameters such as precipitation, temperature, wind speed, and cloud cover at various altitudes, at regular intervals for various points in the area of interest.<br />
<br />
The weather corpus currently exists as an Access database and, alternatively, in the form of CSV (ASCII) files.<br />
<br />
Download and Info: [[SumTime-Meteo]]<br />
<br />
Project link: http://www.csd.abdn.ac.uk/research/sumtime/<br />
<br />
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===<br />
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The [http://www.classic-project.org/corpora Wizard-of-Oz corpus] on Information Presentation in Spoken Dialogue Systems contains the wizards' choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) as well as subjective measures (the user scores) were logged. <br />
<br />
<br />
== Focus on generating referring expressions ==<br />
Referring expression generation is a sub-task of NLG with an active research community.<br />
<br />
=== COCONUT Corpus ===<br />
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The [http://www.pitt.edu/~coconut/coconut-corpus.html COCONUT corpus] is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme]. ([http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz direct download link])<br />
<br />
=== GRE3D3 and GRE3D7: Spatial Relations in Referring Expressions ===<br />
Two web-based production experiments were conducted by Jette Viethen under the supervision of Robert Dale.<br />
The resulting corpora GRE3D3 and GRE3D7 contain 720 and 4480 referring expressions, respectively. Each referring expression describes a simple object in a simple 3D scene. GRE3D3 scenes contain 3 objects and GRE3D7 scenes contain 7 objects. [http://jetteviethen.net/research/spatial.html The corpora and stimulus scenes are available here.]<br />
<br />
=== TUNA Reference Corpus ===<br />
The [http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/ TUNA Reference Corpus] is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment, and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. ([http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/corpus.zip direct download link])<br />
<br />
== Focus on lexicalization ==<br />
...<br />
<br />
== Focus on syntax, realization ==<br />
...<br />
<br />
<br />
[[Category:Knowledge Collections and Datasets]]<br />
{{SIGGEN Wiki}}</div>Jetttehttps://aclweb.org/aclwiki/index.php?title=Data_sets_for_NLG&diff=9661Data sets for NLG2012-08-20T12:16:51Z<p>Jettte: /* GRE3D3: Spatial Relations in Referring Expressions */</p>
<hr />
<div><!-- MoinMoin name: DataSets --><br />
<!-- Comment: --><br />
<!-- WikiMedia name: DataSets --><br />
<!-- Page revision: 00000001 --><br />
<!-- Original date: Fri Nov 11 09:00:35 2005 (1131699635000000) --><br />
<br />
This page lists sets of structured data to be used as input for natural language generation tasks, or to inform research on NLG.<br />
<br />
== Focus on studying the generation target ==<br />
=== PIL: Patient Information Leaflet corpus ===<br />
The [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ Patient Information Leaflet (PIL) corpus] is a [http://www.itri.brighton.ac.uk/projects/pills/corpus/PIL/searchtool/search.html searchable] and [http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL/ browsable] collection of patient information leaflets available in various document formats as well as structurally annotated SGML. The PIL corpus was initially developed as part of the ICONOCLAST project at ITRI, Brighton. ([http://mcs.open.ac.uk/nlg/old_projects/pills/corpus/PIL-corpus-2.0.tar.gz direct download link])<br />
<br />
== Focus on content selection, aggregation ==<br />
=== SumTime Meteo ===<br />
<br />
These data contain predictions for meteorological parameters such as precipitation, temperature, wind speed, and cloud cover at various altitudes, at regular intervals for various points in the area of interest.<br />
<br />
The weather corpus currently exists as an Access database and, alternatively, in the form of CSV (ASCII) files.<br />
<br />
Download and Info: [[SumTime-Meteo]]<br />
<br />
Project link: http://www.csd.abdn.ac.uk/research/sumtime/<br />
<br />
===CLASSiC WOZ corpus on Information Presentation in Spoken Dialogue Systems===<br />
CLASSiC is a project on [http://www.classic-project.org/ Computational Learning in Adaptive Systems for Spoken Conversation]. The [http://www.classic-project.org/corpora Wizard-of-Oz corpus] on Information Presentation in Spoken Dialogue Systems contains the wizards' choices on Information Presentation strategy (summary, compare, recommend, or a combination of those) and attribute selection. The domain is restaurant search in Edinburgh. Objective measures (such as dialogue length, number of database hits, and number of sentences generated) as well as subjective measures (the user scores) were logged. <br />
<br />
<br />
== Focus on generating referring expressions ==<br />
Referring expression generation is a sub-task of NLG with an active research community.<br />
<br />
=== COCONUT Corpus ===<br />
COCONUT was a project on “Cooperative, coordinated natural language utterances”. The [http://www.pitt.edu/~coconut/coconut-corpus.html COCONUT corpus] is a collection of computer-mediated dialogues in which two subjects collaborate on a simple task, namely buying furniture. SGML annotations were added according to the [http://www.pitt.edu/%7Epjordan/papers/coconut-manual.pdf COCONUT-DRI coding scheme]. ([http://www.pitt.edu/%7Ecoconut/corpora/corpus.tar.gz direct download link])<br />
<br />
=== GRE3D3: Spatial Relations in Referring Expressions ===<br />
A Web-based production experiment was conducted by Jette Viethen under the supervision of Robert Dale.<br />
The resulting GRE3D3 corpus contains 720 referring expressions for simple objects in simple 3D scenes. [http://jetteviethen.net/research/spatial.html It is available here].<br />
<br />
=== TUNA Reference Corpus ===<br />
The [http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/ TUNA Reference Corpus] is a semantically and pragmatically transparent corpus of identifying references to objects in visual domains. It was constructed via an online experiment, and has since been used in a number of evaluation studies on Referring Expressions Generation, as well as in two Shared Tasks: the Attribute Selection for Referring Expressions Generation task (2007), and the Referring Expression Generation task (2008). Main authors: Kees van Deemter, Albert Gatt, Ielka van der Sluis. ([http://www.csd.abdn.ac.uk/~agatt/tuna/corpus/corpus.zip direct download link])<br />
<br />
== Focus on lexicalization ==<br />
...<br />
<br />
== Focus on syntax, realization ==<br />
...<br />
<br />
<br />
[[Category:Knowledge Collections and Datasets]]<br />
{{SIGGEN Wiki}}</div>Jetttehttps://aclweb.org/aclwiki/index.php?title=Online_NLG_demos&diff=6324Online NLG demos2009-02-11T22:51:12Z<p>Jettte: </p>
<hr />
<div><!-- MoinMoin name: OnlineDemos --><br />
<!-- Comment: added pollen forecast demo to SumTime --><br />
<!-- WikiMedia name: OnlineDemos --><br />
<!-- Page revision: 00000003 --><br />
<!-- Original date: Tue Sep 11 07:50:07 2007 (1189497007000000) --><br />
<br />
This page lists demos of NLG systems available online.<br />
<br />
== ILEX ==<br />
http://www.hcrc.ed.ac.uk/ilex/demos/museum.cgi<br />
is a virtual museum system; it automatically produces<br />
descriptions of museum items, taking into account the educational importance <br />
of particular aspects of the objects. <small>(NLG functionality currently unavailable?)</small><br />
<br />
== Peba-II ==<br />
http://www.dynamicmultimedia.com.au/peba/<br />
is an on-line animal encyclopedia<br />
that produces descriptions and comparisons of animals as web pages.<br />
<br />
== Pollen Forecast for Scotland ==<br />
[http://www.csd.abdn.ac.uk/~rturner/ Ross Turner's] [http://www.csd.abdn.ac.uk/~rturner/cgi_bin/pollen.html Pollen Forecast for Scotland] NLG demo takes as input a pollen value for each of 6 areas of Scotland, and generates a textual pollen forecast.<br />
<br />
== Project Reporter ==<br />
http://www.cogentex.com/products/reporter.html<br />
generates dynamic web-based project status reports from files created with Microsoft Project <br />
or other compatible project management software. Reports feature hyperlinked textual <br />
descriptions of project elements, as well as coordinated multimodal display with <br />
an interactive Gantt chart applet.<br />
<br />
== RISK ==<br />
[http://www.csd.abdn.ac.uk/~chvenour/nlg/demos/risk.html RISK &ndash; 10 Year Coronary Heart Disease Predictor] aims to show various ways of presenting data. Input: medical data values. Output: tabular overview, textual summary, visual (comic-like), and chart diagram. Java applet.<br />
<br />
== Spatial descriptions for Kirklees in West Yorkshire ==<br />
[http://www.csd.abdn.ac.uk/~rturner/ Ross Turner's] [http://www.csd.abdn.ac.uk/~rturner/cgi_bin/spatial.html Spatial descriptions for Kirklees in West Yorkshire] NLG demo takes as input a latitude and a longitude value, and generates a short spatial description relative to other places in the area.<br />
<br />
== STOP ==<br />
http://www.csd.abdn.ac.uk/research/stop/ <br />
produces personalised smoking-cessation<br />
leaflets, based on responses to a smoking questionnaire. The online version of STOP is<br />
a simplified version of the main STOP system, which is based on paper input and output.<br />
<br />
== SUMTIME ==<br />
http://www.csd.abdn.ac.uk/research/sumtime/<br />
<br />
SumTime-Mousam generates textual weather forecasts from numerical<br />
weather data. This demo shows how textual descriptions of changes<br />
in wind speed and direction are generated from wind data. <br />
<br />
There is a corresponding [[Data sets for NLG|dataset]] called [[SumTime-Meteo]] available.<br />
<br />
A related demo generates [[#Pollen_Forecast_for_Scotland|pollen forecasts for Scotland]].<br />
This may be easier for non-meteorologists to<br />
understand, and the demo shows the human corpus<br />
texts as well as the computer-generated texts.<br />
<br />
== TEMSIS ==<br />
http://www.dfki.de/service/nlg-demo/<br />
automatically generates air quality reports for the Franco-German border area <br />
in the Moselle-Saar region.<br />
<br />
== XIG - CStar Italian Generator ==<br />
http://ecate.itc.it:1024/projects/cstar/cstar.html <br />
is a system for generating Italian sentences from the interlingua content representation <br />
(Interchange Format) adopted inside the C-STAR II project, <br />
whose target is to build a speech-to-speech<br />
translation system able to handle spontaneous speech. The application domain is tourist information.<br />
<br />
[[Category:Software]]<br />
{{SIGGEN Wiki}}</div>Jetttehttps://aclweb.org/aclwiki/index.php?title=Online_NLG_demos&diff=6323Online NLG demos2009-02-11T22:50:06Z<p>Jettte: </p>
<hr />
<div><!-- MoinMoin name: OnlineDemos --><br />
<!-- Comment: added pollen forecast demo to SumTime --><br />
<!-- WikiMedia name: OnlineDemos --><br />
<!-- Page revision: 00000003 --><br />
<!-- Original date: Tue Sep 11 07:50:07 2007 (1189497007000000) --><br />
<br />
This page lists demos of NLG systems available online.<br />
<br />
== ILEX ==<br />
http://www.hcrc.ed.ac.uk/ilex/demos/museum.cgi<br />
is a virtual museum system; it automatically produces<br />
descriptions of museum items, taking into account the educational importance <br />
of particular aspects of the objects. <small>(NLG functionality currently unavailable?)</small><br />
<br />
== Peba-II ==<br />
http://www.dynamicmultimedia.com.au/peba/<br />
is an on-line animal encyclopedia<br />
that produces descriptions and comparisons of animals as web pages.<br />
<br />
== Pollen Forecast for Scotland ==<br />
[http://www.csd.abdn.ac.uk/~rturner/ Ross Turner's] [http://www.csd.abdn.ac.uk/~rturner/cgi_bin/pollen.html Pollen Forecast for Scotland] NLG demo takes as input a pollen value for each of 6 areas of Scotland, and generates a textual pollen forecast.<br />
<br />
<br />
<br />
== Project Reporter ==<br />
http://www.cogentex.com/products/reporter.html<br />
generates dynamic web-based project status reports from files created with Microsoft Project <br />
or other compatible project management software. Reports feature hyperlinked textual <br />
descriptions of project elements, as well as coordinated multimodal display with <br />
an interactive Gantt chart applet.<br />
<br />
== RISK ==<br />
[http://www.csd.abdn.ac.uk/~chvenour/nlg/demos/risk.html RISK &ndash; 10 Year Coronary Heart Disease Predictor] aims to show various ways of presenting data. Input: medical data values. Output: tabular overview, textual summary, visual (comic-like), and chart diagram. Java applet.<br />
<br />
== Spatial descriptions for Kirklees in West Yorkshire ==<br />
[http://www.csd.abdn.ac.uk/~rturner/ Ross Turner's] [http://www.csd.abdn.ac.uk/~rturner/cgi_bin/spatial.html Spatial descriptions for Kirklees in West Yorkshire] NLG demo takes as input a latitude and a longitude value, and generates a short spatial description relative to other places in the area.<br />
<br />
== StockReporter ==<br />
http://www.mri.mq.edu.au/stockreporter/<br />
is a system which produces descriptions of stock market data.<br />
<br />
== STOP ==<br />
http://www.csd.abdn.ac.uk/research/stop/ <br />
produces personalised smoking-cessation<br />
leaflets, based on responses to a smoking questionnaire. The online version of STOP is<br />
a simplified version of the main STOP system, which is based on paper input and output.<br />
<br />
== SUMTIME ==<br />
http://www.csd.abdn.ac.uk/research/sumtime/<br />
<br />
SumTime-Mousam generates textual weather forecasts from numerical<br />
weather data. This demo shows how textual descriptions of changes<br />
in wind speed and direction are generated from wind data. <br />
<br />
There is a corresponding [[Data sets for NLG|dataset]] called [[SumTime-Meteo]] available.<br />
<br />
A related demo generates [[#Pollen_Forecast_for_Scotland|pollen forecasts for Scotland]].<br />
This may be easier for non-meteorologists to<br />
understand, and the demo shows the human corpus<br />
texts as well as the computer-generated texts.<br />
<br />
== TEMSIS ==<br />
http://www.dfki.de/service/nlg-demo/<br />
automatically generates air quality reports for the Franco-German border area <br />
in the Moselle-Saar region.<br />
<br />
== XIG - CStar Italian Generator ==<br />
http://ecate.itc.it:1024/projects/cstar/cstar.html <br />
is a system for generating Italian sentences from the interlingua content representation <br />
(Interchange Format) adopted inside the C-STAR II project, <br />
whose target is to build a speech-to-speech<br />
translation system able to handle spontaneous speech. The application domain is tourist information.<br />
<br />
[[Category:Software]]<br />
{{SIGGEN Wiki}}</div>Jettte