
Toolkit for topic modelling

Primary language: Python. License: GNU Affero General Public License v3.0 (AGPL-3.0).

toolpic: Topic Modelling

toolpic is a toolkit for topic modelling. Written in Python, it relies on unsupervised learning, specifically latent Dirichlet allocation (LDA), as implemented in gensim (https://radimrehurek.com/gensim/index.html). Four open-source scripts cover the full process. The first (import.py) extracts and imports text from the OpenEdition database, filtered by publication year and language. The second (topic.py) cleans and stems the text, which can then be used either to tune the training parameters or to train the topic model directly. The third (graph.py) simply plots how the results depend on a parameter. The last (topic_visualisation.py) visualizes each topic; two output formats are available.
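To give a feel for the clean/stem and training steps described above, here is a minimal sketch of an LDA workflow with gensim. It is not the code of topic.py: the function names `clean` and `train_lda`, the tiny stop-word list, and the default parameter values are all assumptions chosen for illustration. Only the gensim calls (`corpora.Dictionary`, `doc2bow`, `models.LdaModel`, `PorterStemmer`) are the library's own.

```python
import re

# Illustrative stop-word list (an assumption); a real run would use a
# fuller list for each language being processed.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "is", "are", "for", "on"}


def clean(text, stem=None):
    """Lowercase, tokenise, drop stop words and short tokens, optionally stem.

    `stem` is any callable mapping a token to its stem (hypothetical hook).
    """
    # [^\W\d_]+ matches runs of Unicode letters, so accented words survive.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    tokens = [t for t in tokens if t not in STOPWORDS and len(t) > 2]
    return [stem(t) for t in tokens] if stem else tokens


def train_lda(raw_docs, num_topics=2, passes=10, seed=0):
    """Train a gensim LDA model on a list of raw text strings."""
    # Imported here so that clean() stays usable without gensim installed.
    from gensim import corpora, models
    from gensim.parsing.porter import PorterStemmer

    stemmer = PorterStemmer()  # English Porter stemmer shipped with gensim
    texts = [clean(doc, stem=stemmer.stem) for doc in raw_docs]

    dictionary = corpora.Dictionary(texts)           # token <-> integer id
    corpus = [dictionary.doc2bow(t) for t in texts]  # (token_id, count) pairs

    lda = models.LdaModel(
        corpus=corpus,
        id2word=dictionary,
        num_topics=num_topics,  # the main parameter one would tune
        passes=passes,          # number of sweeps over the corpus
        random_state=seed,      # fixed seed for repeatable topics
    )
    return lda, dictionary, corpus
```

With a list of strings `docs`, calling `lda, dictionary, corpus = train_lda(docs, num_topics=2)` and then `lda.show_topic(0, topn=5)` returns the five highest-weighted (word, probability) pairs of the first topic. The number of topics is left as an argument because it is typically the parameter whose effect one wants to inspect before settling on a final model.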

Contributors

Mathieu Orban.

Installation

See INSTALL.txt

Usage

See README.txt

Licence

toolpic is released under the terms of the GNU Affero General Public License v3.0 (AGPL-3.0).

Documentation