ssfinetuning

A package for fine tuning of pretrained NLP transformers using Semi Supervised Learning

Primary language: Python. License: MIT.

Semi-supervised fine-tuning of pretrained Transformer models (NLP) for sequence classification

Getting started.

  1. Example use cases:

  2. Important points to consider before using any of the models:

    • To prepare datasets for PiModel, TemporalEnsemble, and MeanTeacher, unlabeled datapoints should be given negative labels ( < 0 ), since 0 is a valid class label. Also, a batch should not contain a mix of labeled and unlabeled datapoints. For example, for a batch size of 2:
     >>> from datasets import Dataset
     >>> labeled = Dataset.from_dict({'sentence': ['moon can be red.', 'There are no people on moon.'], 'label': [1, 0]})

     >>> unlabeled = Dataset.from_dict({'sentence': ['moon what??.', 'I am people'], 'label': [-1, -1]})  # correct: every unlabeled example has a negative label
     >>> unlabeled_wrong = Dataset.from_dict({'sentence': ['moon what??.', 'I am people'], 'label': [0, -1]})  # wrong: 0 is not a negative label
     >>> # Labeled examples come first, so with a batch size of 2 (order preserved) no batch mixes labeled and unlabeled examples.
     >>> dataset_training = Dataset.from_dict({'sentence': labeled['sentence'] + unlabeled['sentence'], 'label': labeled['label'] + unlabeled['label']})
    • If you use a Trainer from the trainer_util module directly, pair it with the matching model according to the following mapping:
     >>> from collections import OrderedDict

     >>> from trainer_util import (
     TrainerWithUWScheduler, 
     TrainerForCoTraining, 
     TrainerForTriTraining,
     TrainerForNoisyStudent)
     
     >>> from models import (
     PiModel,
     TemporalEnsembleModel,
     CoTrain,
     TriTrain,
     MeanTeacher,
     NoisyStudent)
    
     >>> # Each model class is paired with the trainer class that should be used to train it.
     >>> MAPPING_BETWEEN_TRAINER_AND_MODEL = OrderedDict(
     [
         (PiModel, TrainerWithUWScheduler),
         (TemporalEnsembleModel, TrainerWithUWScheduler),
         (CoTrain, TrainerForCoTraining),
         (TriTrain, TrainerForTriTraining),
         (MeanTeacher, TrainerWithUWScheduler),
         (NoisyStudent, TrainerForNoisyStudent),
     ])
  3. Two ways to train with semi-supervised learning:

    • Use training_args.train_with_ssl, which takes care of the above mapping in a couple of lines of code.
    • Use the appropriate Trainer from trainer_util with the corresponding model from models, as shown in the above mapping.
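
The batching rule from point 2 (no batch may mix labeled and unlabeled datapoints) can be checked before training. The following is a minimal sketch, assuming batches are taken as consecutive slices of the dataset without shuffling; `is_unlabeled` and `find_mixed_batches` are hypothetical helpers written for illustration and are not part of this package.

```python
def is_unlabeled(label):
    """Unlabeled datapoints carry a negative label; 0 and above are real classes."""
    return label < 0


def find_mixed_batches(labels, batch_size):
    """Return the indices of batches that mix labeled and unlabeled datapoints.

    Assumption: batches are consecutive slices of `labels` (no shuffling).
    """
    mixed = []
    for start in range(0, len(labels), batch_size):
        batch = labels[start:start + batch_size]
        # A clean batch yields a single flag value (all True or all False).
        flags = {is_unlabeled(label) for label in batch}
        if len(flags) > 1:
            mixed.append(start // batch_size)
    return mixed


good_labels = [1, 0, -1, -1]  # one labeled batch followed by one unlabeled batch
bad_labels = [1, 0, 0, -1]    # second batch pairs a labeled 0 with an unlabeled -1

print(find_mixed_batches(good_labels, batch_size=2))
print(find_mixed_batches(bad_labels, batch_size=2))
```

An empty result means every batch is either fully labeled or fully unlabeled. If the data loader shuffles examples, a check on label order alone is not sufficient.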

This package applies various semi-supervised learning (SSL) approaches, commonly known from computer vision, to NLP, and only at the fine-tuning stage of the models. The repo was created to explore how far one can get by applying SSL only at the classifier layer(s).

ssfinetuning is implemented using class composition with the Auto classes of HuggingFace's Transformers library, so any pretrained transformer model available on HuggingFace should run here for sequence classification.
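
As a rough illustration of what composition with an Auto class can look like, the sketch below wraps a sequence classification model instead of subclassing it. `ComposedClassifier`, its `loader` parameter, and `stand_in_loader` are hypothetical names chosen for this example and do not reflect the package's actual classes; the only real library call is `AutoModelForSequenceClassification.from_pretrained`, which is imported lazily so the demo runs without downloading a checkpoint.

```python
class ComposedClassifier:
    """Holds a pretrained classifier as an attribute (composition, not inheritance)."""

    def __init__(self, checkpoint, num_labels=2, loader=None):
        if loader is None:
            # Default: let the Auto class pick the architecture for the checkpoint.
            from transformers import AutoModelForSequenceClassification
            loader = AutoModelForSequenceClassification.from_pretrained
        self.backbone = loader(checkpoint, num_labels=num_labels)

    def __call__(self, **inputs):
        # Delegate the forward pass; SSL-specific logic would wrap this call.
        return self.backbone(**inputs)


def stand_in_loader(checkpoint, num_labels):
    """Hypothetical stand-in for from_pretrained, used only for this demo."""
    def forward(**inputs):
        return {"checkpoint": checkpoint, "num_labels": num_labels, "inputs": inputs}
    return forward


demo = ComposedClassifier("any-checkpoint", num_labels=3, loader=stand_in_loader)
print(demo(input_ids=[101, 102]))
```

With the default loader, `ComposedClassifier("bert-base-uncased", num_labels=2)` would load a real model from the HuggingFace Hub. Because the wrapper only relies on the Auto class interface, swapping the checkpoint name is enough to change the underlying architecture.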

The trainers for the SSL models are derived from transformers.Trainer.

SSL Models Used:

  1. PiModel as introduced in the paper Temporal Ensembling for Semi-Supervised Learning by Samuli Laine and Timo Aila.

  2. TemporalEnsemble also introduced in the above paper.

  3. CoTrain as introduced in the paper Combining Labeled and Unlabeled Data with Co-Training by Avrim Blum and Tom Mitchell.

  4. TriTrain was first introduced in the paper Tri-training: exploiting unlabeled data using three classifiers by Zhi-Hua Zhou and Ming Li. However, the implementation in this project is closer to the ones in Strong Baselines for Neural Semi-supervised Learning under Domain Shift and Asymmetric Tri-training for Unsupervised Domain Adaptation.

  5. MeanTeacher as introduced in the paper Mean teachers are better role models: Weight-averaged consistency targets improve semi-supervised deep learning results by Antti Tarvainen and Harri Valpola.

  6. NoisyStudent as introduced in the paper Self-training with Noisy Student improves ImageNet classification by Qizhe Xie, Minh-Thang Luong, Eduard Hovy, and Quoc V. Le.
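
Two ideas from the papers above recur across these methods: a consistency loss between two predictions for the same input (Pi-Model, Mean Teacher) and a teacher whose weights are an exponential moving average of the student's weights (Mean Teacher). The sketch below illustrates both on plain Python lists. It is a simplified reading of the papers, not this package's implementation, and the function names and the `alpha` default are assumptions made for the example.

```python
import math


def softmax(logits):
    """Convert logits to probabilities (shifted by the max for numerical stability)."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def consistency_loss(logits_a, logits_b):
    """Mean squared difference between two predicted class distributions.

    For the Pi-Model, the two logit vectors come from two stochastic passes
    over the same input; for Mean Teacher, from the student and the teacher.
    """
    probs_a, probs_b = softmax(logits_a), softmax(logits_b)
    return sum((a - b) ** 2 for a, b in zip(probs_a, probs_b)) / len(probs_a)


def ema_update(teacher_weights, student_weights, alpha=0.99):
    """Mean Teacher update: teacher = alpha * teacher + (1 - alpha) * student."""
    return [
        alpha * t + (1.0 - alpha) * s
        for t, s in zip(teacher_weights, student_weights)
    ]


# Identical predictions incur no consistency penalty; diverging ones do.
print(consistency_loss([2.0, 0.5], [2.0, 0.5]))
print(consistency_loss([2.0, 0.5], [0.5, 2.0]))

# The teacher moves slowly toward the student after each training step.
print(ema_update([1.0, 1.0], [0.0, 2.0], alpha=0.9))
```

The consistency term needs no labels, which is why it can be computed on the unlabeled batches, while the usual classification loss is computed on the labeled ones.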