Topic models and text embeddings
This repo contains all files for my bachelor's thesis on the utility of clustering algorithms, written at the Technische Universität Berlin, Germany.
Content
Topic_Models_and_Embeddings.ipynb
: A copy of the final Google Colab pipeline
gensim_lda.ipynb
: The LDA pipeline
get_embeddings.py
: A Python script to create text embeddings from a CSV file
text_embeddings_final_pipeline
: The final text embeddings pipeline
text_embeddings_it{n}
: The n-th iteration of my experiments with the text embeddings pipeline