Resources for Chinese: Difference between revisions

From ACL Wiki
Kiwibird (talk | contribs)
Line 8:


==Data==
===Free software===
* [http://corpora.heliohost.org/ HC Corpora] 1,606,811 lines of [http://en.wikipedia.org/wiki/Fair_use Fair Use] excerpts from news, blogs, and Twitter
===Unknown license===
* [http://www.chinesecomputing.com Chinese Computing]  

Revision as of 16:05, 6 December 2012

Tools

Free software

  • rseg word segmentation; written in Ruby (no compilation, no hard dependencies apart from Ruby), comes with a model (MIT license)
  • ctbparser word segmentation, POS tagging, NER, dependency parsing, all using Conditional Random Fields; written in C++ (LGPL license)
  • ZPar word segmentation, POS tagging, CFG/dependency/CCG parsing of Chinese and English; written in C++ (GPL3 license)
  • DuDuPlus: a graph-based dependency parser for English and Chinese ("Other Open Source" license?)
    • where is the source code?

Data

Free software

Unknown license