/Spoiler-detection-and-classification-using-NLP

Spoiler detection and classification using LLMs. Achieved 73% accuracy using RoBERTa, DistilBERT, and Sentence Transformers to detect and classify clickbait spoilers in various text formats.


Spoiler detection and classification

NLP and Text Mining - final project

Team name: Natural Language Processors

Kindly view the report for this project here.

Steps to execute the code:

  1. Clone the repository.
  2. Run the notebooks in the preprocessing folder first. They clean the train and validation data and generate the dataset files in CSV format.
  3. Next, run milestone 2: upload each .ipynb file to Colab and run every cell.
  4. Then run milestone 3: upload each file to Colab and run every cell to see the results.
  5. Please contact us in case of issues.
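
To illustrate what the preprocessing step in item 2 produces, here is a minimal sketch of turning JSON Lines records into a cleaned CSV file. It is not the code from the preprocessing folder: the helper names (`clean_text`, `jsonl_to_csv`), the field names (`postText`, `tags`, `spoiler`), and the sample record are assumptions made for this example.

```python
import csv
import io
import json


def clean_text(value):
    """Join list-valued fields and collapse runs of whitespace."""
    if isinstance(value, list):
        value = " ".join(str(item) for item in value)
    return " ".join(str(value).split())


def jsonl_to_csv(jsonl_lines, out_file, fields):
    """Write one CSV row per JSON Lines record, keeping only `fields`.

    Returns the number of rows written (header excluded).
    """
    writer = csv.DictWriter(out_file, fieldnames=fields)
    writer.writeheader()
    count = 0
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the input
        record = json.loads(line)
        # Missing fields become empty strings rather than raising an error.
        writer.writerow({name: clean_text(record.get(name, "")) for name in fields})
        count += 1
    return count


# Hypothetical record; the field names are assumed, not taken from the repository.
sample = ['{"postText": ["Wait  until you see this"], "tags": ["phrase"], "spoiler": ["a cat"]}']
buffer = io.StringIO()
rows_written = jsonl_to_csv(sample, buffer, ["postText", "tags", "spoiler"])
csv_text = buffer.getvalue()
```

With real files, the same function can be given open file handles instead of the in-memory list and buffer; open the output with `newline=""` so the `csv` module controls line endings.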

References

https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1
https://discuss.huggingface.co/t/trainer-only-doing-3-epochs-no-matter-the-trainingarguments/19347/5
https://huggingface.co/docs/transformers/tasks/question_answering#preprocess
https://towardsdatascience.com/fine-tune-transformer-models-for-question-answering-on-custom-data-513eaac37a80
https://huggingface.co/transformers/v3.3.1/custom_datasets.html#qa-squad
https://towardsdatascience.com/question-answering-with-pretrained-transformers-using-pytorch-c3e7a44b4012
https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfVectorizer.html
https://zenodo.org/record/6362726#.YsbdSTVBzrk