Difference between revisions of "Morphology software for English"

m (Reverted edits by Creek (talk) to last revision by Koskenni)
Line 12: Line 12:
 
*[http://en.wikipedia.org/wiki/Foma_%28software%29 FOMA] - finite-state toolkit (similar to Xerox XFST), created and maintained by Måns Huldén (GPL)
 
*[[lttoolbox]] -- lexical processing tools for building morphological analysers/generators with XML specification files. Includes data for English (both analysis and disambiguation). (GPL)
 
− *[http://www.thai-sbobet.com sbobet] - a Two-level Processor for Morphological Analysis, including KGEN, KTEXT, and Englex
+ *[http://www.sil.org/pckimmo/about_pc-kimmo.html PC-KIMMO] - a Two-level Processor for Morphological Analysis, including KGEN, KTEXT, and Englex
 
*[[SFST]] - Stuttgart Finite State Transducer Tools (GPL)
 
** Where is the data for English?
 
Line 45: Line 45:
 
*[http://opencog.org/wiki/RelEx RelEx] - provides English-language part-of-speech tagging, entity tagging, as well as other types of tags (gender, date, money ...), after performing a deep parse, so that tags agree with parse. Also provides resulting stems. Apache 2.0 License.
 
*[http://nlp.ipipan.waw.pl/Spejd/ Spejd - Shallow Parsing and Disambiguation Engine] a GPL tool for simultaneous rule-based morphosyntactic disambiguation and partial parsing
 
− *[http://www.thai-sbobet.com sbo] on the Apertium Wiki (HMM + constraint based)
+ *[http://wiki.apertium.org/wiki/Tagger_training Tagger training] on the Apertium Wiki (HMM + constraint based)
 
* [http://beta.visl.sdu.dk/cg3.html VISL Constraint Grammar] rule based disambiguation (GPL)
 
** Is there a Free set of rules for English?
 

Revision as of 05:15, 25 June 2012

Software - Morphology and part of speech tagging

For languages other than English, see List of resources by language.

Morphology

Free software

  • Catvar 2.0 - The Categorial Variation Database for English (OSL)
  • HFST - Helsinki Finite-State Transducer Technology - FST library, command line tools, hfst-twolc (a rule compiler for two-level rules), and several spellers and morphological analyzers (GPL)
  • FOMA - finite-state toolkit (similar to Xerox XFST), created and maintained by Måns Huldén (GPL)
  • lttoolbox - lexical processing tools for building morphological analysers/generators with XML specification files. Includes data for English (both analysis and disambiguation). (GPL)
  • PC-KIMMO - a Two-level Processor for Morphological Analysis, including KGEN, KTEXT, and Englex
  • SFST - Stuttgart Finite State Transducer Tools (GPL)
    • Where is the data for English?
  • MULTEXT mmorph - (unmaintained) two-level morphology; the package includes some data for English and German (GPL2 or later)

Unknown license

  • MAP - Cambridge/Edinburgh Morphological Analyzer and Dictionary System (gratis download, no license information)

Proprietary software

Part of speech tagging

Free software

  • ACOPOST - A Collection Of PoS Taggers: Maximum Entropy Tagger, Trigram Tagger, Transformation-based Tagger, Example-based Tagger
  • LBJ POS Tagger - Uses an averaged-perceptron-based sequential model. Java API; free, open source license.
  • GENiA - part-of-speech tagging, shallow parsing, and named entity recognition for biomedical text. C++, BSD license.
  • NLTK - Natural Language Toolkit: Regexp Tagger, N-Gram Tagger, Brill Tagger, HMM Tagger, plus a freely downloadable book with a chapter on tagging
  • RelEx - provides English-language part-of-speech tagging, entity tagging, and other types of tags (gender, date, money ...) after performing a deep parse, so that the tags agree with the parse. Also provides the resulting stems. Apache 2.0 License.
  • Spejd - Shallow Parsing and Disambiguation Engine, a GPL tool for simultaneous rule-based morphosyntactic disambiguation and partial parsing
  • Tagger training on the Apertium Wiki (HMM + constraint based)
  • VISL Constraint Grammar - rule-based disambiguation (GPL)
    • Is there a Free set of rules for English?

Proprietary software

Combined morphology and tagging

Free software

Proprietary software

See also