Difference between revisions of "Resources for Italian"

From ACL Wiki
(→‎Corpora: +Europarl corpus)
(24 intermediate revisions by 9 users not shown)
Line 1: Line 1:
- == Tools ==
+ == Tools for Italian ==
+
+ === Tokenisers ===
+ * [http://tcc.itc.it/projects/textpro/index.php TextPro]
+
+ === POS taggers ===
+ * [http://tcc.itc.it/projects/textpro/index.php TextPro]
+ * [http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html TreeTagger]
+
+ ===Morphology===
+ ====Free software====
+ * [http://sslmitdev-online.sslmit.unibo.it/linguistics/morph-it.php Morph-It! version 0.47] - a free morphological resource for the Italian language, includes [[SFST]] sources. [[LGPL]] license.
+
+ ====Unknown license====
+ * [http://archivium.biz/ dic_it: il Verbiario] - a morphological analyser and verb conjugator for Italian verbs (web interface only?)
+
+ === Named Entity Recognisers ===
+ * [http://tcc.itc.it/projects/ontotext/entitypro.html EntityPro]
+
+ === Temporal Expressions ===
+ * [http://tcc.itc.it/projects/ontotext/ita-chronos.html ITA-Chronos]

  === Parsers ===
+ * [http://ai-nlp.info.uniroma2.it/external/chaosproject/ Chaos] - Robust syntactic parser for Italian and for English
+
+ === Generators ===
+ * [http://tcc.itc.it/projects/xig/index.html XIG] - Interchange to Italian Generator

+ == Resources for Italian ==

+ === Corpora ===
+ <!-- Please keep this list in alphabetical order -->

- == Resources ==
+ * [http://www.istc.cnr.it/material/database/colfis/ ColFIS Corpus e Lessico di Frequenza dell'Italiano Scritto]
+ * [http://corpus.cilta.unibo.it:8080/coris_ita.html Corpus di Italiano Scritto contemporaneo (CORIS/CODIS)]
+ * [http://www.statmt.org/europarl Europarl corpus], sentence aligned with English
+ * [http://corpora.informatik.uni-leipzig.de/ Italian plain text and Co-occurrences at LCC]
+ * [http://languageserver.uni-graz.at/badip/badip/20_corpusLip.php LIP - Lessico di frequenza dell'Italiano Parlato - Access via BADIP]
+ * [http://multisemcor.itc.it/ MultiSemCor] - English/Italian parallel corpus
+ * [http://www.uni-duisburg.de/Fak2/FremdPhil/Romanistik/Personal/Burr/humcomp/ Oxford Text Archive Corpus of Italian Newspapers]
+ * [http://tlio.ovi.cnr.it/TLIO/ Tesoro della lingua italiana delle origini (TLIO)]
+
+ === Tagsets ===
+ * [http://tcc.itc.it/projects/textpro/index.php LemmaPro] - Italian POS tagset for LemmaPro

  === Treebanks ===
Line 11: Line 48:
  * [http://www.di.unito.it/~tutreeb/ TUT] - Turin University Treebank
  * [http://157.138.41.87/HTMLipar/indexparsing_a.htm VIT] - Venice Italian Treebank
+
+ === WordNets ===
+ * [http://www.elda.fr/ EuroWordNet]
+ * [http://multiwordnet.itc.it/english/home.php MultiWordNet] - a multilingual lexical database in which the Italian WordNet is strictly aligned with Princeton WordNet 1.6
+
+ === Lexicons ===
+ * [http://www.ilc.cnr.it/clips/PSC_decription.htm PAROLE-SIMPLE-CLIPS] - a four-layered, general purpose computational lexicon
+
+ == Links ==
+ * [http://evalita.itc.it/ Evalita] - Evaluation of NLP tools for Italian
+
+ [[Category:Resources by language|Italian]]

Revision as of 11:18, 12 October 2013

Tools for Italian

Tokenisers

  • TextPro

POS taggers

  • TextPro
  • TreeTagger

Morphology

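Free software

  • Morph-It! version 0.47 - a free morphological resource for the Italian language, includes SFST sources. LGPL license.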

Unknown license

  • dic_it: il Verbiario - a morphological analyser and verb conjugator for Italian verbs (web interface only?)

Named Entity Recognisers

  • EntityPro

Temporal Expressions

  • ITA-Chronos

Parsers

  • Chaos - Robust syntactic parser for Italian and for English

Generators

  • XIG - Interchange to Italian Generator

Resources for Italian

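Corpora

  • ColFIS Corpus e Lessico di Frequenza dell'Italiano Scritto
  • Corpus di Italiano Scritto contemporaneo (CORIS/CODIS)
  • Europarl corpus, sentence aligned with English
  • Italian plain text and Co-occurrences at LCC
  • LIP - Lessico di frequenza dell'Italiano Parlato - Access via BADIP
  • MultiSemCor - English/Italian parallel corpus
  • Oxford Text Archive Corpus of Italian Newspapers
  • Tesoro della lingua italiana delle origini (TLIO)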

Tagsets

  • LemmaPro - Italian POS tagset for LemmaPro

Treebanks

  • ISST - Italian Syntactic-Semantic Treebank
  • TUT - Turin University Treebank
  • VIT - Venice Italian Treebank

WordNets

  • EuroWordNet
  • MultiWordNet - a multilingual lexical database in which the Italian WordNet is strictly aligned with Princeton WordNet 1.6

Lexicons

  • PAROLE-SIMPLE-CLIPS - a four-layered, general purpose computational lexicon

Links

  • Evalita - Evaluation of NLP tools for Italian