Difference between revisions of "Resources for English"

Most of the early additions have been moved here from the [http://www.aclweb.org/universe ACL NLP/CL Universe].
For other languages, see [[List of resources by language]].
See also [[Multilingual resources]].

<!-- Please keep this list in alphabetical order -->
* [[Corpora for English|Corpora]]
* [[Dictionaries (English)|Dictionaries]]
* [[Generation grammars]]
* [[Geographical words (English)|Geographical words]]
* [[Knowledge collections and datasets (English)|Knowledge collections and datasets]]
* [[Lexicons (English)|Lexicons]]
* [[Subject specific resources (English)|Subject specific resources]]
* [[Tools and Software for English|Tools and Software]]
* [[Uncategorized resources]] - ''please help in categorizing''

==Other resource lists==
* [[Lists of resources|Other lists of resources]]

==Additional information==
<!-- Please keep this list in alphabetical order -->
* [[Anthology Statistics]]
* [[Bibliographies]]
* [[Blogs]]
* [[Books]]
* [[Conferences]]
* [[Corpora]]
* [[Courses]]
* [[Dictionaries]]
* [[Journals]]
* [[Lists of resources]]
* [[Newsgroups, mailing lists|Newsgroups and mailing lists]]
* [[Papers]]

[[Category:Resources by language|English]]

==LANGUAGE==
 
*[http://www1.cs.columbia.edu/~mdiab/software/ASVMTools_2.0.tar.gz Basic Arabic Processing Tools]
 
 
 
*[http://www.dunglish.nl/ Dunglish]
 
 
 
*[http://www.up.univ-mrs.fr/veronis/donnees/index.html French Stopword List]
 
 
 
*[http://www.ivrix.org.il/projects/spell-checker/ Hebrew Spellchecker]
 
 
 
*[http://www.up.univ-mrs.fr/tresoc/ Le Trésor de la Langue d'Oc]
 
 
 
*[http://actarus.atilf.fr/morphalou/ Lexique Morphalou]
 
 
 
*[http://online.anu.edu.au/asianstudies/ahcen/proudfoot/MCP/ Malay Concordance Project]
 
 
 
*[http://earth-info.nga.mil/gns/html/cntry_files.html Names Files of Selected Countries]
 
 
 
*[http://orleans.lti.cs.cmu.edu/Reap/ REAP Project: Reader-Specific Lexical Practice for Improved Reading Comprehension]
 
 
 
*[http://people.cs.uchicago.edu/~dinoj/icsi97syl.disc.gz Syllable-Level Conversational English Transcriptions]
 
 
 
*[http://spraakbanken.gu.se/lb/ The Bank of Swedish - A Linguistic Reference Database of Göteborg University]
 
 
 
*[http://www.sil.org/mexico/pub/vimsa.htm The Mariano Silva y Aceves Series]
 
 
 
*[http://www.unine.ch/info/clef/ UniNE stopword list for Portuguese]
 
 
 
*[http://geonames.usgs.gov/domestic/download_data.htm United States Geographic Names]
 
 
 
*[http://www.valencianlanguage.com/ Valencianlanguage.com]
 
 
 
 
 
==MAILING==
 
*[http://ling.ohio-state.edu/HPSG/Majordomo.html HPSG Mailing List]
 
 
 
*[http://www.eamt.org/mt-list.html MT List]
 
 
 
*[https://mailman.rice.edu/pipermail/funknet/2002-September/002331.html Natural Semantic Metalanguage List]
 
 
 
*[http://www.sigir.org/sigirlist/issues/ SIG-IRList Archives]
 
 
 
*[http://www.hd.uib.no/ The CORPORA list]
 
 
 
==ONLINE==
 
*[http://www.ldc.upenn.edu/exploration/survey.html A Survey of Open Language Archives]
 
 
 
*[http://www.siggen.org/resources/ ACL SIGGEN Resources Wiki]
 
 
 
*[http://www.cs.kun.nl/agfl/ AGFL Parser Generator]
 
 
 
*[http://www.let.rug.nl/~vannoord/alp/ Algorithms for Linguistic Processing]
 
 
 
*[http://www.a-i.com/ Artificial Intelligence NV (Ai)]
 
 
 
*[http://www.eprints.org/ Author/Institution Self-Archiving]
 
 
 
*[http://www.chinesecomputing.com Chinese Computing]
 
 
 
*[http://www.copernic.com/ COPERNIC 2000]
 
 
 
*[http://www.linguateca.pt/corpografo Corpógrafo]
 
 
 
*[http://www.cis.upenn.edu/~dbikel/#stat-parser Dan Bikel's Parser]
 
 
 
*[http://java.sun.com/docs/books/tutorial/i18n/text/boundaryintro.html Detecting Text Boundaries]
 
 
 
*[http://www.catchword.com/era Educational Research Abstracts]
 
 
 
*[http://emotion-research.net/wiki/Databases Emotional Databases]
 
 
 
*[http://lingo.stanford.edu/erg.html English Resource Grammar]
 
 
 
*[http://www.cjk.org/cjk/samples/chincomc.htm English-Chinese Chinese-English Dictionary of Computer Terms]
 
 
 
*[http://www.freelangonline.com/ Freelangonline - many on-line dictionaries + more]
 
 
 
*[http://dmoz.org/Computers/Software/Information_Retrieval/ Information Retrieval]
 
 
 
*[http://ir.dcs.gla.ac.uk/resources.html IR resources]
 
 
 
*[http://odin.prohosting.com/hkkim/cgi-bin/kaeps/ Korean Accented English Pronunciation Simulator]
 
 
 
*[http://www.kwicfinder.com/KWiCFinder.html KWiCFinder Web Concordancer and Online Research Tool]
 
 
 
*[http://www.academiaisla.com/acadi/gen/0_en.html LANGUAGE LINKS]
 
 
 
*[http://www.link.cs.cmu.edu/lexfn/ Lexical FreeNet]
 
 
 
*[http://www-lfg.stanford.edu/lfg/ilfga LFG Database: List of Names]
 
 
 
*[http://lse.umiacs.umd.edu/ Linguist's Search Engine]
 
 
 
*[http://www.ims.uni-stuttgart.de/projekte/TIGER/ Linguistic Interpretation of a German Corpus]
 
 
 
*[http://www.link.cs.cmu.edu/ LINK GRAMMAR PARSER]
 
 
 
*[http://www.eturner.net/linkgrammar-wn/ LinkGrammar-WN project]
 
 
 
*[http://www.psy.uwa.edu.au/MRCDataBase/uwa_mrc.htm MRC Psycholinguistic Database]
 
 
 
*[http://nl.ijs.si/ME/V3/ Multext East Resources, Version 3]
 
 
 
*[http://multiwordnet.itc.it/english/home.php MultiWordNet]
 
 
 
*[http://www.comp.nus.edu.sg/~rpnlpir/ Natural Language Processing / Information Retrieval Software Repository]
 
 
 
*[http://nlsh.sourceforge.net/ NLSH: Natural Language Shell]
 
 
 
*[http://www.irisa.fr/Omphalos/ Omphalos Context-Free Language Learning Competition]
 
 
 
*[http://ysomeya.hp.infoseek.co.jp/ Online Business Letter Corpus KWIC Concordancer]
 
 
 
*[http://www.grsampson.net/RLeafAnc.html Parse Evaluation]
 
 
 
*[http://pie.usna.edu Phrases in English and the British National Corpus]
 
 
 
*[http://pygoogle.sourceforge.net/ PyGoogle: A Python Interface to the Google API]
 
 
 
*[http://www.ai.uga.edu/mc/PythonForNewbieLinguists.html Python Programming Tutorial]
 
 
 
*[http://corpus.leeds.ac.uk/internet.html Query to Internet Corpora]
 
 
 
*[http://www.lexmasterclass.com/exercises/regex/index.html Regular Expression Exercises]
 
 
 
*[http://www.sims.berkeley.edu/~hjiang1/eng_chi_resources.html Resources for English-Chinese CLIR]
 
 
 
*[http://www.cs.technion.ac.il/~gabr/resources/resources.html Resources for Text, Speech and Language Processing]
 
 
 
*[http://www.sfu.ca/rst/ Rhetorical Structure Theory (RST)]
 
 
 
*[http://www.philol.msu.ru/rus/galya-1 Russian Phonetics on the Web]
 
 
 
*[http://www.clres.com/SensSemRoles.html Senseval-3 Task: Automatic Labeling of Semantic Roles]
 
 
 
*[http://www.clres.com/SensWNDisamb.html Senseval-3 Task: Word-Sense Disambiguation of WordNet Glosses]
 
 
 
*[http://www.cs.unt.edu/~rada/wa/ Sentence Alignment and Word Alignment: Projects, Papers, Evaluation, Etc.]
 
 
 
*[http://ontoweb-lt.dfki.de/knowledge_index.htm SIG5 OntoWeb]
 
 
 
*[http://sara.natcorp.ox.ac.uk/lookup.html Simple Search of BNC-World]
 
 
 
*[http://www.sigsem.org/ Special Interest Group on Computational Semantics]
 
 
 
*[http://www.grsampson.net/RSue.html SUSANNE Analytic Scheme]
 
 
 
*[http://www.telemakus.net/ Telemakus: Mining and Mapping Research Findings to Promote Knowledge Discovery]
 
 
 
*[http://www.clef-campaign.org/ The Cross-Language Evaluation Forum]
 
 
 
*[http://www.robotwisdom.com/web/biography.html The Internet Timelines Project]
 
 
 
*[http://www.fb10.uni-bremen.de/anglistik/langpro/NLG-table/NLG-table-root.htm John Bateman and Michael Zock's list of Natural Language Generation Systems]
 
 
 
*[http://www.RosettaProject.org/ The Rosetta Project]
 
 
 
*[http://www.cis.upenn.edu/~xtag The XTAG Project]
 
 
 
*[http://www.TSrali.com/ TransSearch]
 
 
 
*[http://www-nlpir.nist.gov/projects/trecvid/ TREC Video Retrieval Evaluation Page]
 
 
 
*[http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html TreeTagger - a language independent part-of-speech tagger]
 
 
 
*[http://www.biomedcentral.com/info/about/datamining= Using BioMed Central's Open Access Full Text Corpus for Data Mining Research]
 
 
 
*[http://view.byu.edu/ VIEW: Variation in English Words and Phrases]
 
 
 
*[http://beta.visl.sdu.dk VISL Tagger and Parser]
 
 
 
*[http://www.webir.org/ Web IR & IE]
 
 
 
*[http://elib.cs.berkeley.edu/docfreq/ Web Term Document Frequency Form (Berkeley)]
 
 
 
*[http://www.niederlandistik.fu-berlin.de/cgi-bin/web-conc.cgi WEB-CONC]
 
 
 
*[http://www.webcorp.org.uk/ WEBCORP]
 
 
 
*[http://www.webexp.info/ WebExp2 Experimental Software]
 
 
 
*[http://www.comp.lancs.ac.uk/ucrel/bncfreq/ Word Frequencies in Written and Spoken English (Based on the British National Corpus)]
 
 
 
*[http://tcc.itc.it/research/textec/topics/disambiguation/wordnetdomains.html WordNet Domains]
 
 
 
*[http://www.ldc.upenn.edu/exploration/expl2000/papers/ Workshop on Web-Based Language Documentation and Description Papers]
 
 
 
*[http://www.gmi.org/wlms/ World Language Mapping System]
 
 
 
==PAPERS==
 
*[http://www.comp.leeds.ac.uk/eric/iwcs.ps A Domain-Independent Semantic Tagger for the Study of Meaning Associations in English Text]
 
 
 
*[http://www3.interscience.wiley.com/cgi-bin/abstract/104525215/ABSTRACT Automatic Construction of English/Chinese Parallel Corpora]
 
 
 
*[ftp://ftp.icsi.berkeley.edu/pub/techreports/ Berkeley - Technical Reports]
 
 
 
*[http://sslmit.unibo.it/~baroni/publications/lrec2004/bootcat_lrec_2004.pdf BootCaT: Bootstrapping Corpora and Terms from the Web]
 
 
 
*[http://acl.ldc.upenn.edu/I/I05/I05-2015.pdf Building an Annotated Japanese-Chinese Parallel Corpus]
 
 
 
*[http://www.ucl.ac.uk/english-usage/diachronic/index.htm DCPSE: Creating a Parsed and Searchable Diachronic Corpus of Present-Day Spoken English]
 
 
 
*[http://www.ims.uni-stuttgart.de/info/EPapers.html Electronically available papers (list at Univ. of Stuttgart)]
 
 
 
*[http://www.cis.upenn.edu/~cliff-group/94/reports.html Linc Lab (U. Penn) technical reports (not on-line)]
 
 
 
*[http://www.let.rug.nl/~tanja/ Linguistic Knowledge and Word Sense Disambiguation]
 
 
 
*[http://ir.shef.ac.uk/cloughie/papers.html Measuring Text Reuse]
 
 
 
*[http://www.linguistics.rub.de/~kiss/publications/publications.html#boundaries Paper on Sentence Boundary Disambiguation]
 
 
 
*[http://www.dfki.de/lt/papers/cl-abstracts.html Papers of the DFKI CL Department]
 
 
 
*[http://www1.cs.columbia.edu/nlp/theses.html PhD Theses (Columbia Natural Language Processing Group)]
 
 
 
*[http://www.itri.brighton.ac.uk/ucnlg/Proceedings/index.html Proceedings of the Corpus Linguistics 2005 Workshop on Using Corpora for Natural Language Generation]
 
 
 
==SOFTWARE==
 
*[http://www.answerbus.com/index.shtml Answerbus -- Automatic Language Detection Software]
 
 
 
*[http://homepage.mac.com/bncweb/home.html BNCweb: A Web-Based Interface to the British National Corpus]
 
 
 
*[http://nlg18.csie.ntu.edu.tw:8080/opinion/index.html Chinese sentiment dictionary NTUSD]
 
 
 
*[http://www.athel.com/colloc.html Collocate]
 
 
 
*[http://lingo.stanford.edu/ CSLI LinGO Lab (Stanford)]
 
 
 
*[http://www.lsi.upc.es/~nlp/freeling/ FreeLing 1.1]
 
 
 
*[http://www.webir.org/resources.html IR and IE on the web]
 
 
 
*[http://www.comp.nus.edu.sg/~qiul/NLPTools/JavaRAP.html JavaRAP]
 
 
 
*[https://sourceforge.net/projects/jwordnet/ JWNL (Java WordNet Library)]
 
 
 
*[http://xlex.uni-muenster.de/ MTP Xlex/www]
 
 
 
*[http://www.langsoft.ch Natural Language Processing software]
 
 
 
*[http://www.dfki.de/lt/registry/draft.html Natural Language Software Registry (at DFKI)]
 
 
 
*[http://www.nzdl.org/ELKB/ Roget's Thesaurus as an Electronic Lexical Knowledge Base]
 
 
 
*[http://www.chass.utoronto.ca/tact/ Text Analysis Computing Tools (TACT)]
 
 
 
*[http://odur.let.rug.nl/~vannoord/TextCat/ TextCat]
 
 
 
*[http://igm.univ-mlv.fr/~unitex/ Unitex]
 
 
 
*[http://www.mith2.umd.edu/products/ver-mach/ Versioning Machine 2.0]
 
===SOFTWARE - APPLICATIONS===
 
*[http://www.d.umn.edu/~tpederse/code.html Bigram Statistics Package]
 
 
 
*[http://webdeptos.uma.es/filifa/personal/amoreno/indexer/ BNC Indexer]
 
 
 
*[http://www.brainhat.com/ Brainhat Natural Language Processing]
 
 
 
*[http://www.chilibot.net/ Chilibot: NLP based miner for gene/protein/keyword relationships]
 
 
 
*[http://www.bultreebank.org/clark CLaRK System]
 
 
 
*[http://delphesintl.com/ Delphes Technologies International]
 
 
 
*[http://www.dtreg.com DTREG 2.0 decision trees with TreeBoost]
 
 
 
*[http://www.dfki.de/lt/registry/apps/korek21.html KOREKTOR 2.0 (at the DFKI NLP archive)]
 
 
 
*[http://www.xs4all.nl/~bsarempt/linguistics/index.html KURA 1.0]
 
 
 
*[http://www.ccs.neu.edu/home/futrelle/bionlp/commercial/opus.html Opus, a commercial biology text mining system]
 
 
 
*[http://sourceforge.net/projects/pytalk/ Project: Pytalk]
 
 
 
*[http://www.wagsoft.com/RSTTool/ Release of RSTTool: RSTTool 2.7]
 
 
 
*[http://www.softissimo.com/ SOFTISSIMO]
 
 
 
*[http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html TreeTagger]
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
===SOFTWARE - KNREP===
 
*[http://www.linguateca.pt/corpografo/ Corpografo 2]
 
 
 
*[http://www.conceptnet.org The ConceptNet Project V2.1]
 
 
 
===SOFTWARE - MISC===
 
*[http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/bookcode/allen/0.html Code from James Allen's "Natural Language Understanding" (code at CMU)]
 
 
 
*[http://www.cs.cmu.edu/afs/cs/project/ai-repository/ai/areas/nlp/bookcode/nlp_pp/0.html Code from Michael Covington's "NLP for Prolog Programmers" (code at CMU)]
 
 
 
*[http://www.dtreg.com/ DTREG decision tree generator]
 
 
 
*[http://www.sics.se/ps/sicstus.html SICStus - a Prolog environment]
 
 
 
 
 
 
 
===SOFTWARE - MT===
 
*[http://www.travlang.com/Ergane/ Ergane]
 
 
 
*[http://www.lsi.upc.edu/%7Enlp/IQMT/ IQMT Framework for MT Evaluation]
 
 
 
*[http://www.isi.edu/natural-language/people/germann/software/ReWrite-Decoder/index.html ISI rewrite decoder]
 
 
 
*[http://www.scitechint.com/slp Resource for professional-quality language translation tools.]
 
 
 
===SOFTWARE - MULTI===
 
*[http://bach.arts.kuleuven.ac.be/~piet/fr_nlp.html Natural Language Processing for French]
 
 
 
*[http://www.ims.uni-stuttgart.de/projekte/TIGER/TIGERSearch/ TIGERSearch - tools for linguistic text exploration]
 
 
 
===SOFTWARE - MULTILINGUAL===
 
*[http://fmg-www.cs.ucla.edu/geoff/ispell-dictionaries.html#Spanish-dicts Dictionaries for International Ispell]
 
 
 
*[http://let.dfki.uni-sb.de/~heinz/liste.php3 list of systems in multiple languages]
 
 
 
*[http://www.lpl.univ-aix.fr/projects/multext/MtRecode/ MtRecode - Character conversion program]
 
 
 
*[http://www.lpl.univ-aix.fr/projects/multext/MtScript/ MtScript - The Multext multi-lingual text editor]
 
 
 
*[http://www.knowledge.co.uk/xxx/ Multilingual PC software]
 
 
 
*[http://www.scitechint.com/slp Resource for high-quality tools supporting multi-lingual communication]
 
 
 
===SOFTWARE - PHONOLOGY===
 
*[http://ling.ucsd.edu/~barker/Syllables/index.txt How Many Syllables Does English Have?]
 
 
 
*[http://mypage.siu.edu/lhartman/ Phono- Sound Change Model Software]
 
===SOFTWARE - SEMANTICS===
 
*[http://www.intellexer.com/ Intellexer - Natural Language Searching Technologies]
 
 
 
*[http://www.clres.com/prepositions.html Preposition Project]
 
 
 
*[http://senseclusters.sourceforge.net/ SenseClusters]
 
 
 
*[http://www.comp.lancs.ac.uk/ucrel/usas/ UCREL Semantic Analysis System]
 
 
 
===SOFTWARE - SPEECH===
 
 
 
*[http://www.speech.cs.cmu.edu/sphinx/ CMU Sphinx Group: Open Source Speech Recognition Engines]
 
 
 
*[http://www.cs.ubc.ca/~murphyk/Software/HMM/hmm.html Hidden Markov Model (HMM) Toolbox for Matlab]
 
 
 
*[http://nextens.uvt.nl/ NeXTeNS - Dutch Extension for Text to Speech]
 
 
 
*[http://bach.arts.kuleuven.ac.be/pmertens/prosogram/ Prosogram]
 
 
 
*[http://ilk.uvt.nl/g2p-www-demo.html TreeTalk: Memory-Based Grapheme-Phoneme Conversion Demo]
 
 
 
*[http://download.com.com/3000-2130-10277576.html?tag=lst-0-1 Verbot preview 4.0]
 
 
 
===SOFTWARE - SYNTAX===
 
*[http://www.comp.leeds.ac.uk/amalgam/amalgam/amalghome.htm AMALGAM project]
 
 
 
*[http://lael.pucsp.br/corpora/segmentador/ CEPRIL - Portuguese Segmenter]
 
 
 
*[http://www.collectivelanguage.com/demo.html Collective (Chaotic - Emergent) Language]
 
 
 
*[http://search.cpan.org/dist/Lingua-EN-Sentence/ CPAN Lingua EN Sentence Splitter]
 
 
 
*[http://search.cpan.org/dist/Lingua-HE-Sentence/ CPAN Lingua HE Sentence Splitter]
 
 
 
*[http://search.cpan.org/~tgrose/HTML-Summary-0.017/ CPAN Sentence Splitter]
 
 
 
*[http://gate.ac.uk GATE, A General Architecture for Text Engineering]
 
 
 
*[http://www.infogistics.com/posdemo.htm Infogistics - NLProcessor Interactive Demo]
 
 
 
*[http://www.andy-roberts.net/software/jTokeniser jTokeniser]
 
 
 
*[http://openccg.sourceforge.net/ OpenCCG]
 
 
 
*[http://www.coli.uni-sb.de/~thorsten/tnt/ Saarland University, Computational Linguistics]
 
 
 
*[http://www.ling.helsinki.fi/~tapanain/dg/ Syntactic dependency parser for English]
 
 
 
*[http://www.ece.ubc.ca/~donaldd/treeform.htm TreeForm Syntax Tree Drawing Software]
 
 
 
*[http://bach.arts.kuleuven.ac.be/~piet/vertex/index.html VERTEX - A chart parser for unification grammars (French)]
 
 
 
*[http://visl.hum.sdu.dk/visl/ VISL - Visual Interactive Syntax Learning]
 
 
 
===SOFTWARE - TOOLS===
 
*[http://www.ldc.upenn.edu/Projects/ACE/Tools/ Automatic Content Extraction (ACE): Annotation Tools]
 
 
 
*[http://www.norvig.com/paip/grammar.lisp a simple grammar of English]
 
 
 
*[http://sourceforge.net/projects/acopost/ ACOPOST]
 
 
 
*[http://acdc.linguateca.pt/example_alignment.html Alignment of bilingual corpora performed with EasyAlign]
 
 
 
*[http://www.lsi.upc.es/~lambert/software/AlignmentSet.html Alignment Set Toolkit]
 
 
 
*[http://www.clres.com/WordNet.html alphabetic version of WordNet 2.0]
 
 
 
*[http://lucene.apache.org/java/docs/ Apache Lucene]
 
 
 
*[http://www.arabeyes.org/ Arabeyes Project]
 
 
 
*[http://members.aol.com/gnhbos/ocr.htm Aramedia]
 
 
 
*[http://www.umiacs.umd.edu/~jimmylin/downloads/index.html Aranea Question Answering System]
 
 
 
*[http://www.ltg.ed.ac.uk/~jo/interarbora/ Arbora Tree Delivery Service]
 
 
 
*[http://misshoover.si.umich.edu/~zzheng/sentence/ Automatic English Sentence Segmentation]
 
 
 
*[http://www.clg.wlv.ac.uk/projects/CAST/demos.php Automatic Summarization Demos]
 
 
 
*[http://www.r.dl.itc.u-tokyo.ac.jp/~nakagawa/resource/termext/atr-e.html Automatic Term Extraction System]
 
 
 
*[http://lael.pucsp.br/corpora/ Bancos de dados e Ferramentas de análise]
 
 
 
*[http://www.ai.mit.edu/~murphyk/Software/BNT/bnt.html Bayes Net Toolbox for Matlab]
 
 
 
*[http://bndev.sourceforge.net/ Bayesian Network tools in Java (BNJ)]
 
 
 
*[http://sslmit.unibo.it/~baroni/bootcat.html BootCaT: Simple Utilities to Bootstrap Corpora and Terms from the Web]
 
 
 
*[http://callisto.mitre.org/ Callisto Annotation Tool]
 
 
 
*[http://www.ldc.upenn.edu/Catalog/CatalogEntry.jsp?catalogId=LDC2005T13 CCGBank]
 
 
 
*[http://lael.pucsp.br/corpora/alinhador/ CEPRIL aligner]
 
 
 
*[http://pie.usna.edu/explorec.html Chargrams Database from British National Corpus]
 
 
 
*[http://www.bultreebank.org/clark/index.html CLaRK System]
 
 
 
*[http://dlt4.mit.edu/~dr/COALS/ COALS: Correlated Occurrence Analogue to Lexical Semantics]
 
 
 
*[ftp://cs.nyu.edu/pub/html/comlex.html/ Comlex]
 
 
 
*[http://www.ai.mit.edu/projects/iiip/doc/cl-http/home-page.html Common Lisp Hypermedia Server]
 
 
 
*[http://www.cpan.org/ Comprehensive Perl Archive Network]
 
 
 
*[http://clg.wlv.ac.uk/projects/CAST/ Computer Aided Summarisation Tool (CAST)]
 
 
 
*[http://infomap.stanford.edu/webdemo Concept Search Engine Information Mapping Demo (Center for the Study of Language and Information, Stanford University)]
 
 
 
*[https://sourceforge.net/projects/concollate/ Concollate]
 
 
 
*[http://borel.slu.edu/crubadan/ Corpus building for minority languages]
 
 
 
*[http://montev.isi.edu:8000/align-tool/?CORPUS=de-news-morphix&AFILE=full-model1-50-50.gz Corpus De-News-Morphix Alignment Tool]
 
 
 
*[http://search.cpan.org/dist/SuffixTree/ CPAN Suffix Tree Module]
 
 
 
*[http://www.ucl.ac.uk/english-usage/diachronic/index.htm Creating a Parsed and Searchable Diachronic Corpus of Present-Day Spoken English]
 
 
 
*[http://www.cis.upenn.edu/~dbikel/software.html#wn Dan Bikel's Java WordNet Library]
 
 
 
*[http://www.dataharmony.com/ Data Harmony, Document Management Software]
 
 
 
*[http://www.cs.ualberta.ca/~lindek/demos.htm Demos of dependency database, parser, and other tools]
 
 
 
*[http://fuzzy.cs.uni-magdeburg.de/~borgelt/dtree.html Dtree - Decision and Regression Tree Induction]
 
 
 
*[http://www.foreignword.com/dictionary/truespel/transpel.htm English-Truespel (USA Accent) Text Conversion Tool]
 
 
 
*[http://www.cs.jhu.edu/~brill/ Eric Brill's Part of Speech Tagger]
 
 
 
*[http://odur.let.rug.nl/~vannoord/Fsa/Manual/node1.html Finite State Automata Utilities v6]
 
 
 
*[http://www.jaist.ac.jp/~hieuxuan/flexcrfs/flexcrfs.html FlexCRFs: Flexible Conditional Random Fields]
 
 
 
*[http://garraf.epsevg.upc.es/freeling/ FreeLing 1.2]
 
 
 
*[http://grid.let.rug.nl/~vannoord/Fsa/fsa.html FSA6.2xx: Finite State Automata Utilities]
 
 
 
*[http://gate.ac.uk/ GATE (General Architecture for Text Engineering)]
 
 
 
*[http://www.clsp.jhu.edu/ws2005/groups/statistical/GenPar.html GenPar Toolkit for Generalized Parsing]
 
 
 
*[http://www.parc.xerox.com/istl/groups/nltt/medley/ Grammar Writer's Workbench for Lexical Functional Grammar]
 
 
 
*[http://htk.eng.cam.ac.uk Hidden Markov Model Toolkit]
 
 
 
*[http://www.ida.liu.se/~nlplab/ILink/ I*Link]
 
 
 
*[http://www.kbsim.com/ifind.html iFind KBSim.com - Knowledge-Based Simulations, Inc.]
 
 
 
*[http://www.infogistics.com/posdemo.htm Infogistics: NLProcessor Interactive Demo]
 
 
 
*[http://www.isi.edu/~marcu/software.html ISI's version of the RSTTool]
 
 
 
*[http://www-2.cs.cmu.edu/~javabayes/Home/ JavaBayes - v0.346]
 
 
 
*[http://www.comp.leeds.ac.uk/andyr/software/jTokeniser/ jTokeniser]
 
 
 
*[http://sourceforge.net/projects/jwordnet/ JWNL (Java WordNet Library)]
 
 
 
*[http://sslmit.unibo.it/%7ebaroni/welcome_to_knorpora.html Knorpora 1.0]
 
 
 
*[http://miniappolis.com/KWiCFinder/KWiCFinderHome.html KWiCFinder]
 
 
 
*[http://www.kwicfinder.com/KWiCFinder.html KWiCFinder]
 
 
 
*[http://odur.let.rug.nl/~vannoord/TextCat/competitors.html Language Identification Tools]
 
 
 
*[http://www-2.cs.cmu.edu/~lemur/download.html Lemur Toolkit Download]
 
 
 
*[http://www.lemurproject.org/ Lemur Toolkit Website]
 
 
 
*[http://www.leximancer.com/ Leximancer]
 
 
 
*[http://www.csie.ntu.edu.tw/~cjlin/libsvm/ LIBSVM: A Library for Support Vector Machines]
 
 
 
*[http://www.alias-i.com/lingpipe/ LingPipe]
 
 
 
*[http://search.cpan.org/~lgoddard/Lingua-Syllable-0.03/Syllable.pm Lingua-Syllable]
 
 
 
*[http://listserv.linguistlist.org/cgi-bin/wa?A2=ind0109&L=corpora&P=R729 list of POS taggers]
 
 
 
*[http://ucrel.lancs.ac.uk/llwizard.html Log-likelihood calculator]
 
 
 
*[ftp://ftp.ncbi.nlm.nih.gov/pub/lsmith/MedPost/medpost.tar.gz MedPost: A Part-of-Speech Tagger for BioMedical text]
 
 
 
*[http://www.lexically.net/wordsmith/version4/index.htm Mike Scott's Web - Wordsmith Tools]
 
 
 
*[http://mmax.eml-research.de MMAX Annotation Tool]
 
 
 
*[http://www.dcs.shef.ac.uk/research/ilash/Moby/ Moby Database]
 
 
 
*[http://search.cpan.org/author/SHLOMOY/Lingua-EN-Sentence-0.25/lib/Lingua/EN/Sentence.pm Module for splitting text into sentences]
 
 
 
*[http://www.cs.berkeley.edu/~aiken/moss.html Moss: A System for Detecting Software Plagiarism]
 
 
 
*[http://www.clsp.jhu.edu/ws2005/groups/statistical/mtv.html Multitree Viewer (MTV)]
 
 
 
*[http://www.natlantech.com/lingbench_ide.html Natlanco]
 
 
 
*[http://www.cs.jhu.edu/~brill/code.html Natural Language Processing Systems]
 
 
 
*[http://www.ltg.ed.ac.uk/NITE/ NITE XML Toolkit]
 
 
 
*[http://nltk.sourceforge.net NLTK - Natural Language Toolkit]
 
 
 
*[http://crl.nmsu.edu/Tools/Software/ NMSU Natural Language Processing Tools]
 
 
 
*[http://annotation.semanticweb.org/ontomat/index.html Ontomat Homepage]
 
 
 
*[http://teach-computers.org/word-expert.html Open Mind]
 
 
 
*[http://davinci.cs.ucdavis.edu/ OpenRCT Home]
 
 
 
*[http://www.oriel.org/homonym.htm ORIEL -- Online Research Information Environment for the Life Sciences]
 
 
 
*[http://clg.wlv.ac.uk/projects/PALinkA/ PALinkA: A Resource Annotation Tool]
 
 
 
*[http://www.sil.org/ PC-KIMMO, Englex, PC-PATR, and PC-PARSE]
 
 
 
*[http://www.bluem.net/downloads/pdftotext_en/ PDF to Text]
 
 
 
*[http://wall.jussieu.fr/dyn/Context2 perl concordancer]
 
 
 
*[http://www.ai.mit.edu/~jrennie/WordNet/ Perl interface to WordNet]
 
 
 
*[http://www.tartarus.org/~martin/PorterStemmer/index.html Porter Stemming Algorithm]
 
 
 
*[http://www.sciences.univ-nantes.fr/info/perso/permanents/enguehard/recherche/CoRRecT/CoRRecT_gb.htm Project CoRRecT: Reference Corpus for the Recognition of Terms]
 
 
 
*[http://protege.stanford.edu/ Protege Project]
 
 
 
*[http://www.lingsoft.fi/cgi-pub/engcg Publicly available POS tagger]
 
 
 
*[http://www.openchannelsoftware.org/projects/Qanda Qanda: Open source question answering system]
 
 
 
*[http://corpus.leeds.ac.uk/query-zh.html Query to Chinese Corpora]
 
 
 
*[http://www-rali.iro.umontreal.ca/Reacc/ Réacc - reaccenting software]
 
 
 
*[http://rdues.uce.ac.uk/acronym.shtml RDUES ACRONYM (Automatic Collocational Retrieval of NYMs) Project]
 
 
 
*[http://www.comp.nus.edu.sg/~rpnlpir/daemonCollins/ README for the daemonized version of Collins' Parser]
 
 
 
*[http://www.research-lab.com/ Research-lab.com]
 
 
 
*[http://hacks.oreilly.com/pub/h/1011 Robot Karaoke]
 
 
 
*[http://www.reitter-it-media.de/compling/index.html RST LaTeX (Reitter IT and Media)]
 
 
 
*[http://herzberg.ca.sandia.gov/jess/index.shtml Rule Engine for the Java Platform]
 
 
 
*[http://elib.cs.berkeley.edu/src/satz/ SATZ--Adaptive Sentence Boundary Detector]
 
 
 
*[http://ixa.si.ehu.es/Ixa/resources/selprefs Selectional Preferences Extracted from Semcor for WordNet 1.6 Synsets (v 1.0)]
 
 
 
*[http://ilk.uvt.nl/~sabine/chunklink/ Software - The chunklink script, by Sabine Buchholz]
 
 
 
*[http://people.csail.mit.edu/people/mcollins/code.html Software and Data Sets for Collins Natural Language Parser]
 
 
 
*[http://senta.di.ubi.pt Software for the Extraction of N-ary Textual Associations (SENTA)]
 
 
 
*[http://www-a2k.is.tokushima-u.ac.jp/member/kita/NLP/nlp_tools.html Software Tools for NLP]
 
 
 
*[http://www-nlp.stanford.edu/software/lex-parser.shtml Stanford Parser]
 
 
 
*[http://start.csail.mit.edu/ START Natural Language Question Answering System]
 
 
 
*[http://www.lsi.upc.edu/%7Enlp/SVMTool/ SVMTool]
 
 
 
*[http://swesum.nada.kth.se/index-eng.html SweSum - Automatic Text Summarizer (with PRM)]
 
 
 
*[http://www.wagsoft.com/Coder/ Systemic Coder -- a Text Markup Tool (Version 4.5)]
 
 
 
*[http://www-2.cs.cmu.edu/~lenzo/t2p/ t2p: Text-to-Phoneme Converter Builder]
 
 
 
*[http://www.d.umn.edu/~tpederse/parallel.html Ted Pedersen - Tools for Parallel Text]
 
 
 
*[http://lsi.research.telcordia.com/ Telcordia Latent Semantic Indexing Demo Machine]
 
 
 
*[http://www.tei-c.org/Software/index.html Text Encoding Initiative -- Tools]
 
 
 
*[http://www.comp.lancs.ac.uk/computing/research/ucrel/claws/tagservice.html The CLAWS tagging service]
 
 
 
*[http://www.clsp.jhu.edu/ws99/projects/mt/toolkit/ The EGYPT Statistical Machine Translation Toolkit]
 
 
 
*[http://www.ims.uni-stuttgart.de/CorpusToolbox/ The IMS Corpus Toolbox Webpage]
 
 
 
*[http://jazzy.sourceforge.net/ The Java Open Source Spell Checker]
 
 
 
*[http://www.findingnames.net/ The Naming Company]
 
 
 
*[http://stp.ling.uu.se/~corpora/plug/pwa/ The PLUG Word Aligner - PWA]
 
 
 
*[http://timex2.mitre.org/taggers/timex2_taggers.html TIMEX2 Taggers]
 
 
 
*[http://www.coli.uni-sb.de/~thorsten/tnt/ TnT - Statistical Part-of-Speech Tagger]
 
 
 
*[ftp://ftp.ids-mannheim.de/kt/CSSCCb-4.0.tar.bz2 Tool for Identifying Duplicates in a Large Document Collection]
 
 
 
*[http://www.cs.columbia.edu/nlp/tools.html Tools developed at Columbia University (FUF, Surge, Crep, Segmenter, Verber, Xtract)]
 
 
 
*[http://www.torch.ch Torch3]
 
 
 
*[http://main.amu.edu.pl/~sipkadan/lingo.htm Turbo Lingo]
 
 
 
*[http://stp.ling.uu.se/cgi-bin/joerg/Uplug Uplug]
 
 
 
*[http://wordlist.sourceforge.net/varcon-readme VarCon (Variant Conversion Info)]
 
 
 
*[http://www.edict.com.hk/concordance/ Virtual Language Centre's Web Concordancer]
 
 
 
*[http://www.textanalysis.com/help/help.htm Visual Text - reference documentation]
 
 
 
*[http://www.textanalysis.com/ VisualText]
 
 
 
*[http://webglimpse.net/ Webglimpse]
 
 
 
*[http://search.cpan.org/dist/WordNet-SenseRelate WordNet-SenseRelate]
 
 
 
*[http://search.cpan.org/dist/WordNet-Similarity WordNet-Similarity]
 
 
 
*[http://sourceforge.net/projects/wordfreak Wordfreak]
 
 
 
*[http://www.cogsci.princeton.edu/~wn/ WordNet]
 
 
 
*[http://www.d.umn.edu/~tpederse/similarity.html WordNet-Similarity PERL Module]
 
 
 
*[http://www.d.umn.edu/~tpederse/wsdshell.html WSD Shell]
 
 
 
*[http://www.xml-ces.org/ XCES: Corpus Encoding Standard for XML]
 
 
 
==TOOLS==
 
*[http://sslmit.unibo.it/~baroni/bootcat.html BootCaT Toolkit: Simple Utilities to Bootstrap Corpora and Terms from the Web]
 
 
 
*[http://chasen.aist-nara.ac.jp/hiki/ChaSen/ ChaSen]
 
 
 
*[http://nlp.cs.jhu.edu/~gsm/pd_demo Dendrogram Demo]
 
 
 
*[http://www.olst.umontreal.ca/dicoeng.html DiCo Lexical Database OLST]
 
 
 
*[http://www.wolfson.ox.ac.uk/~peet/eatshow.htm Edinburgh Associative Thesaurus]
 
 
 
*[http://www.smi.ucd.ie/hyppia/ HYPPIA]
 
 
 
*[http://xlex.uni-muenster.de/ Münster Tagging Project]
 
 
 
*[http://nltk.sourceforge.net Natural Language Toolkit (NLTK)]
 
 
 
*[http://perso.wanadoo.fr/rosavram/ NooJ]
 
 
 
*[http://www.informatics.susx.ac.uk/research/nlp/rasp/ Robust Accurate Statistical Parsing (RASP)]
 
 
 
*[http://www.searchtools.com/ Search Tools for Web Sites and Intranets]
 
 
 
*[http://www.lsi.upc.edu/~surdeanu/swirl.html SwiRL Semantic Role Labeler]
 
 
 
*[http://view.byu.edu VIEW (Variation in English Words and Phrases)]
 
 
 
==UNCATEGORIZED==
 
*[http://www40.brinkster.com/dictionarium/index.html dictionarium]
 
 
 
*[http://www.mat.upm.es/~aries ARIES Natural Language Tools]
 
 
 
*[http://www.york.ac.uk/services/library/subjects/langint.htm Language and Linguistic Science information sources]
 
 
 
*[http://www.de.elra.research.ec.org/ The RELATOR language resources server]
 
 
 
*[ftp://parcftp.xerox.com/pub/ Xerox PARC FTP site.]
 

Latest revision as of 09:45, 17 June 2015