Xiang-Pan/NYU_DL_Sys_Project

Unsupervised Domain Adaptation for Same Task / Same Domain

jk7599 opened this issue · 0 comments

Domain Adaptation

PDF: https://arxiv.org/pdf/2004.10964.pdf
Implementation: https://github.com/allenai/dont-stop-pretraining

Things to Consider

  • Are we going to use RoBERTa, as in the paper, or BERT, or some other model?
  • How are we going to store the trained model? Is there free storage that we can use?

Dataset

  • Same Task
  1. TREC-8 QA Dataset
    https://trec.nist.gov/data/qa/t8_qadata.html
  2. Yahoo Answers Dataset
    https://www.kaggle.com/datasets/jarupula/yahoo-answers-dataset
  • Same Domain
  1. COVID-19 Dataset
    https://github.com/davidcampos/covid19-corpus

Plan

- [ ] Implement training of the model with domain adaptation
- [ ] Store the model for later use by active learning
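
For the first item, the linked paper adapts the model by continuing masked-language-model pretraining on unlabeled domain or task text. Below is a rough sketch of how one masked training example could be built, just to make the objective concrete. It is not the code from the allenai repo: `mask_tokens`, `MASK_TOKEN`, the whitespace tokenization, and the toy vocabulary are all made up for illustration, and the 15% rate with an 80/10/10 split is the standard BERT-style recipe, assumed here rather than taken from our setup.

```python
import random

MASK_TOKEN = "<mask>"  # RoBERTa-style mask symbol; an assumption for this sketch


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Build one masked-LM example from a list of tokens.

    Returns (masked_tokens, labels). labels[i] is the original token at
    positions the model must predict, and None everywhere else.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # this position contributes to the loss
            roll = rng.random()
            if roll < 0.8:
                masked.append(MASK_TOKEN)          # 80%: replace with the mask symbol
            elif roll < 0.9:
                masked.append(rng.choice(vocab))   # 10%: replace with a random token
            else:
                masked.append(tok)                 # 10%: keep the original token
        else:
            labels.append(None)  # position is ignored by the loss
            masked.append(tok)
    return masked, labels


if __name__ == "__main__":
    # Toy sentence standing in for unlabeled in-domain text.
    sentence = "what is the incubation period of the virus".split()
    vocab = sorted(set(sentence))
    masked, labels = mask_tokens(sentence, vocab, seed=0)
    print(masked)
    print(labels)
```

In a real run we would presumably rely on the tokenizer and masking utilities of whichever model library we pick instead of hand-rolling this; the sketch only shows what the training signal looks like for unlabeled text such as the COVID-19 corpus.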