MSF2 The Portuguese/Spanish corpus of Multi-Sentence Fusion (Repository)
Revision as of 04:58, 4 May 2020
- ADCR ID: ADCR2020T001
- Name of Dataset: MSF2 Corpus
- Contributor: Juan-Manuel Torres-Moreno, Université d'Avignon, May 4, 2020.
- Copyright: Annotations © 2020, MSF2 Université d'Avignon. Deposited in the ACL Data and Code Repository by Juan-Manuel Torres.
- Licensing: This work is licensed under the Creative Commons Attribution 3.0 Unported License.
- Citation: If you use the MSF2 corpus in your research, please include the following citation in any resulting papers:
- Elvys Linhares Pontes, Juan-Manuel Torres-Moreno, Stéphane Huet, Andréa Linhares. A New Annotated Portuguese/Spanish Corpus for the Multi-Sentence Compression Task (https://www.aclweb.org/anthology/L18-1504/). Proceedings of the 11th edition of the Language Resources and Evaluation Conference, May 2018, Miyazaki, Japan.
- Description: The MSF2 corpus consists of three directories: src (sentence clusters in raw and tokenized formats); ref (manual compressions to be used for ROUGE/BLEU automatic evaluation); and pos (tokenized and part-of-speech tagged sentences, tagged with TreeTagger). For more information, please see the documentation file included in the package.
- Download: The original source repository is at https://dev.termwatch.es/~fresa/CORPUS/MSF2/corpus.html.
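
The directory layout given in the description can be checked after unpacking the package. The following is a minimal sketch, not part of the corpus distribution: the directory names src, ref, and pos come from the description, while the root path "MSF2" and the helper name missing_corpus_dirs are assumptions made for illustration.

```python
from pathlib import Path

# Directory names taken from the corpus description.
EXPECTED_DIRS = ("src", "ref", "pos")


def missing_corpus_dirs(root):
    """Return the expected top-level directories that are absent under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # "MSF2" is a hypothetical location of the unpacked package.
    missing = missing_corpus_dirs("MSF2")
    print("missing directories:", missing or "none")
```

An empty list means all three directories are present. The file names and formats inside each directory are not assumed here; the documentation file in the package describes them.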