Difference between revisions of "MSF2 The Portuguese/Spanish corpus of Multi-Sentence Fusion (Repository)"

From ACL Wiki
(Created page with " * '''ADCR ID:''' ADCR2013T001 * '''Name of Dataset:''' TempEval-3 Platinum TimeML annotations * '''Contributor:''' Leon Derczynski ([http://derczynski.com/sheffield/]), Un...")
 
 
(5 intermediate revisions by the same user not shown)
Line 1: Line 1:
  
  
* '''ADCR ID:''' ADCR2013T001
+
* '''ADCR ID:''' ADCR2020T001
  
* '''Name of Dataset:''' TempEval-3 Platinum TimeML annotations
+
* '''Name of Dataset:''' MSF2 Corpus
  
* '''Contributor:''' Leon Derczynski ([http://derczynski.com/sheffield/]), University of Sheffield, April 23 2013.
+
* '''Contributor:''' [http://juanmanueltorres.free.fr/ Juan-Manuel Torres-Moreno], Université d'Avignon, May 4, 2020.
  
 
* '''Copyright:''' Annotations © 2020, MSF2 Université d'Avignon. Deposited in the [[ACL Data and Code Repository]] by Juan-Manuel Torres.
 
* '''Copyright:''' Annotations © 2020, MSF2 Université d'Avignon. Deposited in the [[ACL Data and Code Repository]] by Juan-Manuel Torres.
Line 13: Line 13:
 
* '''Citation:''' If you use the MSF2 corpus in your research, please include the following citation in any resulting papers:  
 
* '''Citation:''' If you use the MSF2 corpus in your research, please include the following citation in any resulting papers:  
  
:: Elvys Linhares Pontes, Juan-Manuel Torres-Moreno, Stéphane Huet, Andréa Linhares. A New Annotated Portuguese/Spanish Corpus for the Multi-Sentence Compression Task. ''Proceedings of the 11th edition of the Language Resources and Evaluation Conference'', May 2018, Miyazaki, Japan.
+
:: Elvys Linhares Pontes, Juan-Manuel Torres-Moreno, Stéphane Huet, Andréa Linhares. [https://www.aclweb.org/anthology/L18-1504/ A New Annotated Portuguese/Spanish Corpus for the Multi-Sentence Compression Task]. ''Proceedings of the 11th edition of the Language Resources and Evaluation Conference'', May 2018, Miyazaki, Japan.
 +
 
 +
:: Elvys Linhares Pontes, Juan-Manuel Torres-Moreno, Stéphane Huet, Andréa Linhares. [https://hal.archives-ouvertes.fr/hal-01722130 hal-01722130 ArXiv]
  
 
* '''Description:''' The MSF2 corpus consists of three directories : src : sentence clusters in raw and tokenized formats
 
* '''Description:''' The MSF2 corpus consists of three directories : src : sentence clusters in raw and tokenized formats
 
ref : manual compressions to be used for ROUGE/BLEU automatic evaluation; pos : tokenized and Part-Of-Speech tagged sentences (using TreeTagger Pos-tagger). For more information, please see the documentation file that is included in the package.
 
ref : manual compressions to be used for ROUGE/BLEU automatic evaluation; pos : tokenized and Part-Of-Speech tagged sentences (using TreeTagger Pos-tagger). For more information, please see the documentation file that is included in the package.
  
* '''Download''': The package for this item is: [http://aclweb.org/aclwiki/code/5/51/ADCR2020T001.tar.gz ADCR2020T001.tar.gz]. As extra supplemental material, the original source repository is at [https://dev.termwatch.es/~fresa/CORPUS/MSF2/corpus.html https://dev.termwatch.es/~fresa/CORPUS/MSF2/corpus.html].
+
* '''Download''': The original source repository is at [https://dev.termwatch.es/~fresa/CORPUS/MSF2/corpus.html https://dev.termwatch.es/~fresa/CORPUS/MSF2/corpus.html].
  
  
 
[[Category:Data and code repository|Template for data]]
 
[[Category:Data and code repository|Template for data]]

Latest revision as of 05:01, 4 May 2020


  • ADCR ID: ADCR2020T001
  • Name of Dataset: MSF2 Corpus
  • Citation: If you use the MSF2 corpus in your research, please include the following citation in any resulting papers:
Elvys Linhares Pontes, Juan-Manuel Torres-Moreno, Stéphane Huet, Andréa Linhares. A New Annotated Portuguese/Spanish Corpus for the Multi-Sentence Compression Task. Proceedings of the 11th edition of the Language Resources and Evaluation Conference, May 2018, Miyazaki, Japan.
Elvys Linhares Pontes, Juan-Manuel Torres-Moreno, Stéphane Huet, Andréa Linhares. hal-01722130 ArXiv
  • Description: The MSF2 corpus consists of three directories: src (sentence clusters in raw and tokenized formats), ref (manual compressions to be used for ROUGE/BLEU automatic evaluation), and pos (tokenized and part-of-speech tagged sentences, produced with the TreeTagger POS tagger). For more information, please see the documentation file that is included in the package.
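
As a rough illustration of the three-directory layout described above, the following Python sketch indexes the files under src, ref and pos. Only the three directory names come from the description; the function name `index_corpus`, the root path, and the assumption that each directory simply contains files (possibly in subdirectories) are hypothetical, so check the documentation file in the package for the actual naming scheme.

```python
from pathlib import Path

# Directory names are taken from the corpus description. Everything else
# (root location, how files are named or nested inside each directory)
# is an assumption made for this sketch.
CORPUS_DIRS = ("src", "ref", "pos")


def index_corpus(root):
    """Map each of src, ref and pos to a sorted list of the files it contains."""
    root = Path(root)
    index = {}
    for name in CORPUS_DIRS:
        directory = root / name
        if not directory.is_dir():
            raise FileNotFoundError(f"expected directory missing: {directory}")
        # Walk recursively, since the nesting depth is not stated in the description.
        index[name] = sorted(p for p in directory.rglob("*") if p.is_file())
    return index
```

A call such as `index_corpus("MSF2")` (with "MSF2" standing in for wherever the package was unpacked) would return a dictionary with the keys "src", "ref" and "pos", which can then be used to pair sentence clusters with their reference compressions once the file naming is known.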