Resources for Indonesian
Latest revision as of 05:28, 15 November 2018
Corpora
- Kompas and Tempo Online Collection for evaluation purposes.
- 500,000 Word Bahasa Indonesia Corpus and Parallel English Translation (CC BY-NC-SA 3.0 licence)
- 500,000 Word Bahasa Indonesia Parallel Corpus with Penn Treebank (CC BY-NC-SA 3.0 licence)
- One Million POS Tagged Corpus of Bahasa Indonesia (CC BY-NC-SA 3.0 licence)
Tools
- Part of Speech Tagger for Bahasa Indonesia (GPL licence)
- Rule-based Indonesian-Malay Machine Translation by Septina Dian Larasati. It can also be used for morphological tagging.
- Link Grammar Parser, which includes prototype Indonesian dictionaries.
Grammars
- Broad-coverage Indonesian Resource Grammar (INDRA), based on HPSG and built on the DELPH-IN infrastructure.
Lexicons
- Wordnet Bahasa: a semantic lexicon for Indonesian and Malay, linked to the Open Multilingual Wordnet.