OutbreakTopics

LDA topic modelling of WHO Disease Outbreak News


LDA topic modelling of outbreak reports

Quick Build

python3 __main__.py --corpus '../Documents/documents_1996-2019.txt'

N.B. The input corpus file is expected to contain one document per line.
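If the source reports span several lines, they need to be flattened before they can be used as a corpus. The following is a minimal sketch of one way to do that, not code from this repository; the helper names `flatten_document`, `write_corpus` and `read_corpus` are hypothetical.

```python
from pathlib import Path


def flatten_document(text):
    """Collapse every run of whitespace (including newlines) into one space."""
    return " ".join(text.split())


def write_corpus(documents, path):
    """Write one non-empty document per line; return how many were written."""
    lines = [flatten_document(doc) for doc in documents]
    lines = [line for line in lines if line]  # drop documents that were blank
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(lines)


def read_corpus(path):
    """Read a corpus file back as a list of documents, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

A file written this way can then be passed to `--corpus`.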

Optimisation

python3 __main__.py --corpus '../Documents/documents_1996-2019.txt' --run_types OPTIMISE

Exploration

Word Clouds

python3 __main__.py --corpus 'training_documents.txt' --explorations WORDCLOUDS

Word Bars

python3 __main__.py --corpus 'training_documents.txt' --explorations WORD_BARS

Clustering

python3 __main__.py --corpus 'training_documents.txt' --explorations PCA TSNE

Topics by Document

python3 __main__.py --corpus 'training_documents.txt' --explorations DOC_TOPICS
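One common way to summarise topics by document is to report the dominant topic, i.e. the topic with the highest probability for each document. The sketch below illustrates the idea only. It assumes each document's topic distribution is a list of `(topic_id, probability)` pairs, which may not match what `DOC_TOPICS` actually produces, and the numbers are made-up toy values.

```python
def dominant_topic(distribution):
    """Return the (topic_id, probability) pair with the highest probability."""
    if not distribution:
        raise ValueError("empty topic distribution")
    return max(distribution, key=lambda pair: pair[1])


# Toy distributions for two documents (illustrative values, not real output).
doc_topics = [
    [(0, 0.10), (1, 0.75), (2, 0.15)],
    [(0, 0.60), (2, 0.40)],
]

for index, distribution in enumerate(doc_topics):
    topic_id, probability = dominant_topic(distribution)
    print(f"document {index}: topic {topic_id} (p={probability:.2f})")
```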

pyLDAvis

python3 __main__.py --corpus 'training_documents.txt' --explorations PYLDAVIS

Representative Documents

python3 __main__.py --corpus 'training_documents.txt' --explorations REP_DOCS
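The exploration commands above differ only in the value passed to `--explorations`, and the clustering example passes two values at once. Assuming the flag accepts any combination of these values (the commands above only show `PCA TSNE` together), a small helper could assemble a single command for several explorations. `build_command` is a hypothetical name and is not part of this repository.

```python
import shlex
import sys

# Exploration names exactly as they appear in the commands above.
EXPLORATIONS = [
    "WORDCLOUDS", "WORD_BARS", "PCA", "TSNE",
    "DOC_TOPICS", "PYLDAVIS", "REP_DOCS",
]


def build_command(corpus, explorations=EXPLORATIONS):
    """Build the argument list for one run covering several explorations."""
    unknown = [name for name in explorations if name not in EXPLORATIONS]
    if unknown:
        raise ValueError(f"unknown exploration(s): {unknown}")
    return [sys.executable, "__main__.py", "--corpus", corpus,
            "--explorations", *explorations]


# Print the command instead of running it, so it can be checked first.
print(shlex.join(build_command("training_documents.txt", ["PCA", "TSNE"])))
```

The returned list can be handed to `subprocess.run(..., check=True)` from the directory that contains `__main__.py`.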

Topic Prediction

python3 __main__.py --corpus 'training_documents.txt' --predict 'document_for_prediction.txt'

N.B. Input files are expected to contain one document per line.

More info can be found in an accompanying blog post.