Resources for Dutch: Difference between revisions
Edit summary (older revision): HamleDT
Edit summary (newer revision): Added: Araneum

Line 1:
  == Corpora ==
+ * [http://ucts.uniba.sk/aranea_about/ Araneum Nederlandicum], Gigaword Dutch web corpus
  * [http://corpora.informatik.uni-leipzig.de/ Dutch Plain text and Co-occurrences at LCC]
  * [http://www.statmt.org/europarl Europarl corpus] - sentence-aligned with English
Revision as of 19:12, 8 March 2015
Corpora
- Araneum Nederlandicum, Gigaword Dutch web corpus
- Dutch Plain text and Co-occurrences at LCC
- Europarl corpus - sentence-aligned with English
- CLiPS Stylometry Investigation (CSI) corpus - multi-purpose text corpus, main use in stylometry
- HamleDT - harmonized dependency treebanks for many languages in a common annotation style
Tools
- Dutch HPSG-based parser - includes the Alpino treebank (7137 sentences, newspaper text, manually corrected)