Resources for Slovenian: Difference between revisions

From ACL Wiki
Kiwibird (talk | contribs)
Corpora: +Europarl corpus; reorg
==Corpora==
===Free license===
* [http://www.statmt.org/europarl Europarl corpus], sentence aligned with English
* [http://nl.ijs.si/elan/ IJS - ELAN] Slovene-English Parallel Corpus
** License: "freely available for downloading, but please acknowledge in any publications"
* [http://langtech.jrc.it/JRC-Acquis.html JRC Acquis] parallel texts.  Languages involved: Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovene and Swedish.
 
===Non-free license===
* [http://nl.ijs.si/ME/ Multext EAST] lexica, annotated "1984" corpus, parallel and comparable text and speech corpora. Languages involved: Bulgarian, Croatian, Czech, English, Estonian, Hungarian, Lithuanian, Macedonian, Persian, Polish, Resian, Romanian, Russian, Serbian, Slovak, Slovene, and Ukrainian.


* [http://nl.ijs.si/ME/ Multext EAST] lexica, annotated "1984" corpus, parallel and comparable text and speech corpora.
** License: "research use only"
** Languages involved: Bulgarian, Croatian, Czech, English, Estonian, Hungarian, Lithuanian, Macedonian, Persian, Polish, Resian, Romanian, Russian, Serbian, Slovak, Slovene, and Ukrainian


* [http://langtech.jrc.it/JRC-Acquis.html JRC Acquis] parallel texts.
** License: Public domain.
** Languages involved: Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovene and Swedish.


[[Category:Resources by language|Slovenian]]

Revision as of 17:27, 12 October 2013

Corpora

Free license

  • Europarl corpus, sentence aligned with English
  • IJS - ELAN Slovene-English Parallel Corpus
  • JRC Acquis parallel texts. Languages involved: Bulgarian, Czech, Danish, German, Greek, English, Spanish, Estonian, Finnish, French, Hungarian, Italian, Lithuanian, Latvian, Maltese, Dutch, Polish, Portuguese, Romanian, Slovak, Slovene and Swedish.

Non-free license

  • Multext EAST lexica, annotated "1984" corpus, parallel and comparable text and speech corpora. Languages involved: Bulgarian, Croatian, Czech, English, Estonian, Hungarian, Lithuanian, Macedonian, Persian, Polish, Resian, Romanian, Russian, Serbian, Slovak, Slovene, and Ukrainian.