Resources for Chinese
Tools
Free software
- rseg: word segmentation; written in Ruby (no compilation, no hard dependencies apart from Ruby), comes with a model (MIT license)
- ctbparser: word segmentation, POS tagging, NER, and dependency parsing, all using Conditional Random Fields; written in C++ (LGPL license)
- ZPar: word segmentation, POS tagging, and CFG/dependency/CCG parsing of Chinese and English; written in C++ (GPL3 license)
- DuDuPlus: a graph-based dependency parser for English and Chinese ("Other Open Source" license?)
- where is the source code?
Data
Free data
- HC Corpora: 1,606,811 lines of fair-use excerpts from news, blogs, and Twitter