Semantic_Similarity

Given a text and a reason, predict whether the text satisfies the reason. Use the train file for training and report metrics on the evaluation file.
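The README does not say which metrics are reported. As a minimal sketch, assuming binary labels where 1 means "the text satisfies the reason", the usual accuracy, precision, recall, and F1 can be computed as below. The function name `binary_metrics` is hypothetical and not taken from the repository.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four confusion-matrix cells.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Because the train file contains only positive samples, accuracy alone can be misleading on the evaluation file; precision and recall show whether a model simply predicts "satisfies" for everything.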

Dataset information

Note: The train dataset is intentionally small and contains only positive samples.

The Python scripts in this repository address the issues below. They were run on Google Colab; the script can be found here.

Required packages
Label class imbalance
- Data insights:
  - Baseline approach (use only transformer models)
  - Training approach (use only transformer models)
  - Artificial negative generation techniques
Metrics
Ablation study table (tabular comparison of results across different model architectures)
Fine-tuned the learning rate.
Used a learning rate scheduler.
Used a pre-trained model specifically designed for semantic similarity, such as sentence-transformers/bert-base-nli-mean-tokens.
Insufficient data, as indicated by the data insights analysis
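Since the train file has only positive samples, a classifier needs artificial negatives. The README does not state which generation techniques were used. One common option, sketched here as an assumption, is to re-pair each text with a reason taken from a different sample. The names `make_negative_pairs` and `build_training_set` and the `(text, reason)` tuple layout are hypothetical.

```python
import random


def make_negative_pairs(positives, seed=0):
    """Create label-0 pairs by matching each text with a reason from another sample.

    `positives` is a list of (text, reason) tuples assumed to be true matches.
    """
    rng = random.Random(seed)
    known = set(positives)
    # Sorted so that sampling is reproducible for a given seed.
    all_reasons = sorted({reason for _, reason in positives})

    negatives = []
    for text, _ in positives:
        # Only keep reasons that are not already a known positive for this text.
        candidates = [r for r in all_reasons if (text, r) not in known]
        if not candidates:
            continue  # every reason is a known positive for this text
        negatives.append((text, rng.choice(candidates), 0))
    return negatives


def build_training_set(positives, seed=0):
    """Combine labeled positives with sampled negatives and shuffle them."""
    data = [(text, reason, 1) for text, reason in positives]
    data += make_negative_pairs(positives, seed=seed)
    random.Random(seed).shuffle(data)
    return data
```

Randomly re-paired reasons can occasionally still be true for a text, so negatives built this way are noisy. Sampling one negative per positive keeps the two classes roughly balanced, which bears on the label class imbalance item above.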