Resources for Portugese: Difference between revisions

From ACL Wiki
Zeman (talk | contribs)
HamleDT
mNo edit summary
Line 1: Line 1:


==Corpora==
* [http://corporavm.uni-koeln.de/colonia/ Colonia], corpus of historical Portuguese.
* [http://www.statmt.org/europarl Europarl corpus], sentence-aligned with English.
* [http://ufal.mff.cuni.cz/hamledt HamleDT], harmonized dependency treebanks for many languages in a common annotation style.
Line 8: Line 9:
* [http://www.linguateca.pt/corpografo Corpógrafo] - a Web-based environment for corpora research


==Wordlists==
* [http://www.uni-koeln.de/~mzampier/resources/pawl.txt P-AWL] - the Portuguese academic wordlist compiled as described in [http://link.springer.com/chapter/10.1007/978-3-642-12320-7_15#page-1 Baptista et al. (2010)]


[[Category:Resources by language|Portugese]]

Revision as of 16:40, 25 February 2015

Corpora

  • Colonia, corpus of historical Portuguese.
  • Europarl corpus, sentence-aligned with English.
  • HamleDT, harmonized dependency treebanks for many languages in a common annotation style.

Software

  • CEPRIL - Portuguese Segmenter
  • Corpógrafo - a Web-based environment for corpora research

Wordlists