Semi-Supervised-Topic-model-for-auto-labelling

Objective:

Use a topic model to map and label transaction data into categories.

Our dataset contains only 176 rows. To get a better topic mapping, we would need a larger corpus than we currently have.

We therefore use just three categories for this project, and will produce more topics as more data becomes available.

Selected topics:

  • Online transactions
  • Charges
  • Others
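
Guided topic models are usually steered with a short list of seed words per category (BERTopic, for example, accepts these through its `seed_topic_list` argument). The sketch below shows what such a seed list could look like for the three categories above, and how keyword overlap could give each row a provisional label. The seed words, the `SEED_WORDS` name, and the `provisional_label` helper are illustrative assumptions, not values taken from the project's data.

```python
import re

# Hypothetical seed words per category (assumed for illustration only).
# "Others" has no seeds: it is the fallback when nothing matches.
SEED_WORDS = {
    "Online transactions": {"online", "web", "pos", "transfer", "payment"},
    "Charges": {"charge", "fee", "commission", "vat", "levy"},
}


def provisional_label(narration: str) -> str:
    """Return the category whose seed words overlap the text most, else 'Others'."""
    tokens = set(re.findall(r"[a-z]+", narration.lower()))
    best_label, best_hits = "Others", 0
    for label, seeds in SEED_WORDS.items():
        hits = len(tokens & seeds)
        if hits > best_hits:
            best_label, best_hits = label, hits
    return best_label
```

With only 176 rows, a fallback category like this keeps unmatched transactions from being forced into one of the two named topics.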

Steps:

  • Data cleaning
  • Data preprocessing
  • BERTopic model
  • Guided BERTopic model
  • Guided LDA model
  • Evaluation
  • Save model
  • Topic mapping for auto labelling
  • Vectorization for Baseline Classifier
  • Baseline Classifier training and export model
  • Baseline model sample prediction
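
As a rough illustration of the cleaning and topic-mapping steps above, here is a minimal sketch using only the Python standard library. It assumes that a topic model has already assigned a topic id to each row and that each row has a provisional category label. The function names `clean_text` and `map_topics_to_labels` are hypothetical and not part of the project code.

```python
import re
from collections import Counter


def clean_text(text: str) -> str:
    """Lowercase, replace digits and punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def map_topics_to_labels(topic_ids, provisional_labels):
    """Map each topic id to the most common provisional label among its rows."""
    votes = {}
    for topic_id, label in zip(topic_ids, provisional_labels):
        votes.setdefault(topic_id, Counter())[label] += 1
    return {
        topic_id: counts.most_common(1)[0][0]
        for topic_id, counts in votes.items()
    }
```

The resulting topic-to-label mapping could then be applied to every row to produce the labels on which the baseline classifier is trained.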