
Demo of BERTopic for topic modelling


BERTopic is a topic modeling technique that leverages 🤗 transformers and c-TF-IDF to create dense clusters, allowing for easily interpretable topics whilst keeping important words in the topic descriptions.

Documentation: https://maartengr.github.io/BERTopic/index.html
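To make the c-TF-IDF idea more concrete, here is a minimal pure-Python sketch of class-based TF-IDF weighting, following the formula given in the BERTopic documentation (term frequency within a cluster, multiplied by log(1 + A / f), where A is the average number of words per cluster and f is the term's frequency across all clusters). The function names `c_tf_idf` and `top_terms` are hypothetical and exist only for this illustration; this is not the library's implementation, and it assumes the documents have already been clustered and tokenized.

```python
import math
from collections import Counter


def c_tf_idf(classes):
    """Compute class-based TF-IDF weights.

    classes: dict mapping a cluster label to the list of tokens from all
    documents in that cluster (assumed non-empty).
    Returns: dict mapping label -> {term: weight}.
    """
    # Term counts inside each cluster
    counts = {label: Counter(tokens) for label, tokens in classes.items()}

    # Frequency of each term across all clusters
    total = Counter()
    for cluster_counts in counts.values():
        total.update(cluster_counts)

    # Average number of words per cluster
    avg_words = sum(total.values()) / len(counts)

    weights = {}
    for label, cluster_counts in counts.items():
        n_words = sum(cluster_counts.values())
        weights[label] = {
            # Normalised term frequency times the class-based IDF term
            term: (freq / n_words) * math.log(1 + avg_words / total[term])
            for term, freq in cluster_counts.items()
        }
    return weights


def top_terms(weights, label, k=3):
    """Return the k highest-weighted terms for one cluster."""
    ranked = sorted(weights[label].items(), key=lambda kv: kv[1], reverse=True)
    return [term for term, _ in ranked[:k]]


# Toy example with two hand-made "clusters" (illustrative data only)
toy = {
    "a": ["topic", "model", "topic", "the"],
    "b": ["galaxy", "star", "the", "the"],
}
toy_weights = c_tf_idf(toy)
print(top_terms(toy_weights, "a", k=2))
```

Terms that appear in many clusters receive a smaller IDF factor, so a term that is both frequent within a cluster and rare elsewhere tends to rank highest in that cluster's description.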

Requires the arXiv dataset from Kaggle: https://www.kaggle.com/datasets/Cornell-University/arxiv
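As a rough sketch of how the dataset might be read once downloaded: the Kaggle archive is assumed here to contain a JSON-lines file named `arxiv-metadata-oai-snapshot.json`, with one paper per line and an `abstract` field. The helper name `load_abstracts` and the `limit` parameter are hypothetical; adjust the path and field name if your copy differs.

```python
import json


def load_abstracts(path, limit=None):
    """Stream a JSON-lines file and return whitespace-normalised abstracts.

    path:  location of the metadata file (assumed JSON lines, one record per line)
    limit: optional maximum number of abstracts to return
    """
    docs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            record = json.loads(line)
            abstract = record.get("abstract")
            if abstract:
                # Collapse newlines and repeated spaces inside the abstract
                docs.append(" ".join(abstract.split()))
            if limit is not None and len(docs) >= limit:
                break
    return docs


# Example usage (path is an assumption about where the file was unpacked):
# docs = load_abstracts("arxiv-metadata-oai-snapshot.json", limit=10000)
```

Reading line by line avoids loading the whole file into memory, and `limit` keeps a first experiment small.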

Pre-requisites:

1. Create a Python 3.9.7 virtual environment.
2. Then install the dependencies:

    pip install numba==0.53.* numpy==1.22.0
    pip install bertopic
    pip install jupyter

Installed package versions:

    bertopic      0.9.4     pypi_0
    hdbscan       0.8.27    pypi_0
    numba         0.53.0    pypi_0
    numpy         1.22.0    pypi_0
    pip           21.2.4    py39hecd8cb5_0
    python        3.9.7     h88f2d9e_1
    pyyaml        5.4.1     pypi_0
    toml          0.10.2    pypi_0
    umap-learn    0.5.2     pypi_0
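If the notebook misbehaves, it may help to confirm that the environment matches the versions listed above. The following is a small sketch using the standard-library `importlib.metadata`; the `EXPECTED` mapping copies a subset of the versions from the list, and the helper name `find_mismatches` is hypothetical.

```python
from importlib import metadata

# Versions taken from the package list above (subset most relevant to BERTopic)
EXPECTED = {
    "bertopic": "0.9.4",
    "hdbscan": "0.8.27",
    "numba": "0.53.0",
    "numpy": "1.22.0",
    "umap-learn": "0.5.2",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (wanted, found)} for packages that differ or are missing.

    get_version is injectable so the check can be exercised without
    the packages being installed.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

An empty result means every listed package is installed at the expected version.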