Resources for Arabic: Difference between revisions

From ACL Wiki
Mikahama (talk | contribs)
Line 28:
===Proprietary===
*[http://www.ldc.upenn.edu/Catalog/LDC2001T55.html Arabic Newswire Part 1], 76 million tokens, annotation: paragraphs
==Diacritization==
===Free software===
*[https://github.com/mikahama/haracat hAraCat] a free tool for predicting vowels and other diacritics.

===Free/open licence===

Latest revision as of 11:36, 29 June 2020

Morphology

Free software

  • AraMorph - Perl - An Arabic morphological analyzer and part-of-speech tagger written in Perl (originally by Tim Buckwalter)
  • AraMorph - Java - An Arabic morphological analyzer and part-of-speech tagger rewritten in Java for Lucene
  • AraComLex - An open-source finite-state morphology for Modern Standard Arabic. The source files can be compiled with the open-source compiler foma or with Xerox xfst.
  • UralicNLP - A Python library that provides morphological tagging, generation, lemmatization and disambiguation for many languages, including Arabic

Proprietary

WordNets

Free software

Proprietary

Parsers

Free software

Corpora

Proprietary

Diacritization

Free software

  • hAraCat - A free tool for predicting vowels and other diacritics.
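
To make "vowels and other diacritics" concrete, here is a minimal Python sketch of the inverse of diacritization: stripping the marks from a fully vowelled word. It does not use hAraCat and says nothing about its API. The names `ARABIC_DIACRITICS`, `strip_diacritics` and `count_diacritics` are hypothetical, chosen for this illustration, and the mark set covers only the common Unicode harakat (U+064B to U+0652) plus the superscript alef (U+0670).

```python
# Illustrative only: this is NOT hAraCat's API. It removes the marks that a
# diacritization tool would have to predict.

# Common Arabic diacritics: tanwin, short vowels, shadda, sukun
# (U+064B..U+0652), plus the superscript alef (U+0670).
ARABIC_DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)} | {"\u0670"}


def strip_diacritics(text: str) -> str:
    """Return `text` with the diacritic marks listed above removed."""
    return "".join(ch for ch in text if ch not in ARABIC_DIACRITICS)


def count_diacritics(text: str) -> int:
    """Count how many characters in `text` are diacritic marks."""
    return sum(ch in ARABIC_DIACRITICS for ch in text)


# "kataba" (he wrote): kaf + fatha, ta + fatha, ba + fatha.
diacritized = "\u0643\u064E\u062A\u064E\u0628\u064E"
plain = strip_diacritics(diacritized)

print(len(diacritized), len(plain))   # 6 code points before, 3 after
print(count_diacritics(diacritized))  # 3 marks removed
```

Pairs of stripped and fully vowelled text like this are the usual input and target of a diacritization tool: it receives the bare consonant skeleton and has to restore the marks.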

Free/open licence

Bibliography

External links