Primary language: Jupyter Notebook. License: Apache License 2.0 (Apache-2.0).

Multi-Label-Classification-of-Pubmed-Articles

This work was selected for the November 2022 Kaggle ML Research Spotlight 🎉🎉

Read the announcements here and here.

Live on Hugging Face Spaces here.

Traditional machine learning models are painful to work with when we lack sufficient labelled data for the specific task or domain we care about to train a reliable model. Transfer learning lets us handle these scenarios by leveraging existing labelled data from a related task or domain: we store the knowledge gained in solving the source task in the source domain and apply it to our problem of interest. In this work, I apply transfer learning by fine-tuning the BioBERT model on the PubMed multi-label classification dataset.

I also tried fine-tuning RobertaForSequenceClassification and XLNetForSequenceClassification models on the PubMed multi-label dataset.
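To make "multi-label" concrete: each article can carry several labels at once, so every label gets its own sigmoid probability and the loss is a binary cross-entropy averaged over labels (the idea behind the BCE-with-logits loss listed in the references). The sketch below is a minimal pure-Python illustration, not the notebook's training code; the label names, logits, and the 0.5 threshold are hypothetical values chosen for the example.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the stable form max(x, 0) - x*y + log(1 + exp(-|x|)),
    which equals -[y*log(p) + (1-y)*log(1-p)] with p = sigmoid(x).
    """
    losses = [
        max(x, 0.0) - x * y + math.log1p(math.exp(-abs(x)))
        for x, y in zip(logits, targets)
    ]
    return sum(losses) / len(losses)


def predict_labels(logits, label_names, threshold=0.5):
    """Keep every label whose independent probability reaches the threshold."""
    return [name for x, name in zip(logits, label_names) if sigmoid(x) >= threshold]


# Hypothetical labels and model outputs for a single article.
LABELS = ["label_a", "label_b", "label_c", "label_d"]
logits = [2.0, -1.0, -0.5, 3.5]    # one raw score per label
targets = [1.0, 0.0, 0.0, 1.0]     # multi-hot ground truth

print(predict_labels(logits, LABELS))
print(bce_with_logits(logits, targets))
```

Unlike a softmax classifier, nothing forces the probabilities to sum to one, so zero, one, or many labels can be predicted for the same article.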

I have integrated Weights & Biases for visualizations, for logging artifacts, and for comparing the different models.

Weights & Biases training logs for the different models: Multi-Label Classification of PubMed Articles

  • To get an API key, create an account on the Weights & Biases website.
  • Use Kaggle Secrets to handle the API key more securely inside Kaggle.
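The two steps above can be sketched as follows. This is a hedged example rather than the notebook's exact code: the secret label `WANDB_API_KEY` is a hypothetical name (use whatever label you gave the secret in Kaggle), and the fallback to an environment variable is my own addition so the snippet also runs outside Kaggle.

```python
import os


def get_wandb_api_key(secret_name: str = "WANDB_API_KEY"):
    """Return the API key from Kaggle Secrets, or from the environment otherwise.

    `secret_name` is a hypothetical label; it must match the label of the
    secret you created in the Kaggle notebook's secrets settings.
    """
    try:
        # kaggle_secrets is only importable inside Kaggle notebooks.
        from kaggle_secrets import UserSecretsClient
    except ImportError:
        # Outside Kaggle: fall back to an environment variable (assumption).
        return os.environ.get(secret_name)
    return UserSecretsClient().get_secret(secret_name)


api_key = get_wandb_api_key()
if api_key:
    # Next step in a notebook: `import wandb; wandb.login(key=api_key)`.
    print("W&B API key found")
else:
    print("No W&B API key found; add it as a Kaggle secret first")
```

Keeping the key in a secret means it never appears in the notebook source or its saved output.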

For more information on the attributes, see the Kaggle dataset description here.

To get a full grasp of the steps needed to use this dataset, take a full look at the dataset and the information in the Kaggle Notebook Link and the Kaggle Dataset Link.

References

  1. Attention Is All You Need
  2. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  3. https://github.com/google-research/bert
  4. https://github.com/huggingface/transformers
  5. BCEWithLogitsLoss (PyTorch documentation)
  6. Transformers for Multi-Label Classification made simple by Ronak Patel