Awesome Community-Curated NLP List

To contribute: This list is community-curated. Anyone can open a pull request to add to the list, and it will be merged once 5 people have verified that the PR is not spam.

Speech NLP

Text NLP Suites

Language Specific Text NLP Suites

  • Arabic

    • SAFAR: Software Architecture For Arabic language pRocessing
  • Cantonese

  • Chinese

    • SnowNLP: Simplified Chinese Text Processing
  • Persian

    • Hazm: Python library for digesting Persian text.
  • Dutch

    • Frog: An advanced NLP suite for Dutch
  • Italian

    • Tint: Lend color to your Italian texts!
  • Korean

    • KoNLPy: Korean NLP in Python

Pre-processing (Tokenization / Stemming / POS Tagging / etc.)

Deep Linguistic Processing

The deep here isn't "deep learning" deep ;P, see https://en.wikipedia.org/wiki/Deep_linguistic_processing

  • Head-driven Phrase Structure Grammar (HPSG)

  • Combinatory Categorial Grammar (CCG)

    • CCG2PST: A tool for converting CCG derivations into PTB-style phrase structure trees

Word Embeddings

Twitter

Task Specific

Machine Translation

Language Modelling

Annotation Related

Others

NLP Related Machine Learning Tools

  • Timbl: Memory-based machine learning
  • KeLP: Kernel-based Learning Platform

List of Lists of NLP Resources/Tools

Dataset Lists

See Also

  • Corpora List: Your source for all things computational linguistics / NLP / corpora
  • LT World: Language Technology World
  • META Net
  • LDC: Linguistic Data Consortium
  • OLAC: Open Language Archives Community
  • NLSR: Natural Language Software Registry