https://aclweb.org/aclwiki/api.php?action=feedcontributions&user=Makrai&feedformat=atom
ACL Wiki - User contributions [en]
2024-03-29T02:25:09Z
User contributions
MediaWiki 1.35.2
https://aclweb.org/aclwiki/index.php?title=Resources_for_Arabic&diff=9017
Resources for Arabic
2011-10-05T21:06:53Z
<p>Makrai: /* Free/open licence */</p>
<hr />
<div>==Morphology==<br />
<br />
===Free software===<br />
*[https://sourceforge.net/projects/aramorph/ AraMorph - Perl] - An Arabic morphological analyzer and part-of-speech tagger written in Perl (originally by Tim Buckwalter)<br />
*[http://www.nongnu.org/aramorph/ AraMorph - Java] - An Arabic morphological analyzer and part-of-speech tagger rewritten in Java for [http://lucene.apache.org/ Lucene]<br />
<br />
===Proprietary===<br />
*[http://www.arabic-morphology.com Xerox Arabic Morphological Analyzer and Generator]<br />
<br />
==Parsers==<br />
===Free software===<br />
* [http://www.cis.upenn.edu/~dbikel/software.html#stat-parser Bikel's implementation of the Collins parser] by [http://www.cis.upenn.edu/~dbikel/ Dan Bikel].<br />
* [http://www.ling.ohio-state.edu/~jonsafari/arabiclg/arabiclg.20060829.tar.bz2 Arabic dictionaries], by [http://www.ling.ohio-state.edu/~jonsafari/ Jon Dehdari], for the [http://www.abisource.com/projects/link-grammar/ Link-Grammar parser]. These require the AraMorph stemming package listed above.<br />
* [https://sourceforge.net/apps/trac/elixir-fm/wiki ElixirFM] ([http://quest.ms.mff.cuni.cz/cgi-bin/elixir/index.fcgi online interface here]) is an implementation of Functional Arabic Morphology written in Haskell and Perl; its lexicon is a "re-processed" version of the Buckwalter analyzer's lexicon.<br />
* [http://sourceforge.net/projects/sarf Sarf] - Arabic Morphology System (all in Java)<br />
<br />
==Corpora==<br />
===Proprietary===<br />
*[http://www.ldc.upenn.edu/Catalog/LDC2001T55.html Arabic Newswire Part 1], 76 million tokens, annotation: paragraphs<br />
<br />
===Free/open licence===<br />
* [http://github.com/anastaw/Meedan-Memory Meedan-Memory], Arabic-English TMX (sentence-aligned), ~467,000 words on the English side, [http://www.opendatacommons.org/licenses/odbl/ Open Database Licence]<br />
* [http://quran.uk.net/ Quranic Arabic Corpus], 77,430 words of Quranic Arabic, with manually verified contextual POS, inflection, derivation; [[dependency grammar]] annotation is planned.<br />
* [http://www1.ccls.columbia.edu/~ybenajiba/downloads.html Arabic NER corpora] by [http://www1.ccls.columbia.edu/~ybenajiba/ Yassine Benajiba], 150,000+ words.<br />
<br />
==Bibliography==<br />
<br />
==External links==<br />
*[http://www.elsnet.org/acl2001-arabic.html ACL/EACL 2001 Workshop on Arabic NLP]<br />
*[http://www1.cs.columbia.edu/~mdiab/software/ASVMTools_2.0.tar.gz Basic Arabic Processing Tools]<br />
*[http://acl.ldc.upenn.edu/coling2004/W5/index.html COLING 2004 Workshop on computational approaches to Arabic script-based languages]<br />
<br />
<br />
[[Category:Resources by language|Arabic]]</div>
Makrai
https://aclweb.org/aclwiki/index.php?title=Resources_for_Arabic&diff=9016
Resources for Arabic
2011-10-05T20:55:35Z
<p>Makrai: /* Proprietary */ some details added</p>
<hr />
<div>==Morphology==<br />
<br />
===Free software===<br />
*[https://sourceforge.net/projects/aramorph/ AraMorph - Perl] - An Arabic morphological analyzer and part-of-speech tagger written in Perl (originally by Tim Buckwalter)<br />
*[http://www.nongnu.org/aramorph/ AraMorph - Java] - An Arabic morphological analyzer and part-of-speech tagger rewritten in Java for [http://lucene.apache.org/ Lucene]<br />
<br />
===Proprietary===<br />
*[http://www.arabic-morphology.com Xerox Arabic Morphological Analyzer and Generator]<br />
<br />
==Parsers==<br />
===Free software===<br />
* [http://www.cis.upenn.edu/~dbikel/software.html#stat-parser Bikel's implementation of the Collins parser] by [http://www.cis.upenn.edu/~dbikel/ Dan Bikel].<br />
* [http://www.ling.ohio-state.edu/~jonsafari/arabiclg/arabiclg.20060829.tar.bz2 Arabic dictionaries], by [http://www.ling.ohio-state.edu/~jonsafari/ Jon Dehdari], for the [http://www.abisource.com/projects/link-grammar/ Link-Grammar parser]. These require the AraMorph stemming package listed above.<br />
* [https://sourceforge.net/apps/trac/elixir-fm/wiki ElixirFM] ([http://quest.ms.mff.cuni.cz/cgi-bin/elixir/index.fcgi online interface here]) is an implementation of Functional Arabic Morphology written in Haskell and Perl; its lexicon is a "re-processed" version of the Buckwalter analyzer's lexicon.<br />
* [http://sourceforge.net/projects/sarf Sarf] - Arabic Morphology System (all in Java)<br />
<br />
==Corpora==<br />
===Proprietary===<br />
*[http://www.ldc.upenn.edu/Catalog/LDC2001T55.html Arabic Newswire Part 1], 76 million tokens, annotation: paragraphs<br />
<br />
===Free/open licence===<br />
* [http://github.com/anastaw/Meedan-Memory Meedan-Memory], Arabic-English TMX (sentence-aligned), ~467,000 words on the English side, [http://www.opendatacommons.org/licenses/odbl/ Open Database Licence]<br />
* [http://quran.uk.net/ Quranic Arabic Corpus], 77,430 words of Quranic Arabic, with manually verified contextual POS, inflection, derivation; [[dependency grammar]] annotation is planned.<br />
* [http://www1.ccls.columbia.edu/~ybenajiba/downloads.html Arabic NER corpora] by [http://www1.ccls.columbia.edu/~ybenajiba/ Yassine Benajiba].<br />
<br />
==Bibliography==<br />
<br />
==External links==<br />
*[http://www.elsnet.org/acl2001-arabic.html ACL/EACL 2001 Workshop on Arabic NLP]<br />
*[http://www1.cs.columbia.edu/~mdiab/software/ASVMTools_2.0.tar.gz Basic Arabic Processing Tools]<br />
*[http://acl.ldc.upenn.edu/coling2004/W5/index.html COLING 2004 Workshop on computational approaches to Arabic script-based languages]<br />
<br />
<br />
[[Category:Resources by language|Arabic]]</div>
Makrai