/corpus_toolkit

Python toolkit for corpus analysis: tokenization, lexical diversity, vocabulary growth prediction, entropy measures, and Zipf/Heaps visualizations.
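
As a rough illustration of the kinds of measures listed above, here is a minimal sketch that uses only the Python standard library. It does not use or reproduce the toolkit's own API: the function names (`tokenize`, `type_token_ratio`, `shannon_entropy`, `vocabulary_growth`, `rank_frequency`) are hypothetical, and the regex tokenizer is a simplifying assumption.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens (assumed scheme)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def type_token_ratio(tokens):
    """Lexical diversity as distinct types divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def shannon_entropy(tokens):
    """Shannon entropy (in bits) of the unigram token distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def vocabulary_growth(tokens):
    """Number of distinct types seen after each token (the curve Heaps' law describes)."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth


def rank_frequency(tokens):
    """(rank, frequency) pairs, most frequent first (the data behind a Zipf plot)."""
    ranked = Counter(tokens).most_common()
    return [(rank, freq) for rank, (_, freq) in enumerate(ranked, start=1)]


if __name__ == "__main__":
    toks = tokenize("The cat and the dog")
    print(type_token_ratio(toks))
    print(shannon_entropy(toks))
    print(vocabulary_growth(toks))
    print(rank_frequency(toks))
```

Plotting `vocabulary_growth` against token position, or `rank_frequency` on log-log axes, would give the Heaps and Zipf views respectively.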

Primary language: Python
License: GNU General Public License v3.0 (GPL-3.0)
