Difference between revisions of "Resources for Turkish"

Revision as of 09:59, 26 May 2014

TRMorph "is a relatively complete morphological analyzer for Turkish. It is implemented using SFST, and uses a lexicon based on (but heavily modified) the wordlist of Zemberek spell checker. The morphological analyzer is distributed under the GPL."

Southeast European Times (sentence aligned corpus, Albanian, Bulgarian, English, Greek, Macedonian, Romanian, Serbo-Croatian, Turkish — approximately 4.5 million words per language)
TS Corpus (PoS-tagged Turkish corpus. The corpus also provides morphological and lemma tags for the data. Consists of 491 million tokens)

HamleDT, harmonized dependency treebanks of many languages in a common annotation style.
METU-Sabanci Turkish treebank
Turkish plain text and co-occurrences at LCC

K. Oflazer, "Two-level Description of Turkish Morphology," Literary and Linguistic Computing, vol. 9, pp. 137-148, 1995. Backwards PDF

Revision as of 09:56, 26 May 2014, Zeman (HamleDT)		Revision as of 09:59, 26 May 2014, Zeman m (Typo.)
Line 33:
	* [http://nooj4nlp.net/pages/turkish.html NooJ_TR by Mersin University Turkish National Corpus Project Team]

−	[[Category:Resources by language|Tajik]]
+	[[Category:Resources by language|Turkish]]