Xiang-Pan/NYU_DL_Sys_Project

Unsupervised Domain Adaptation for Same Task / Same Domain

jk7599 opened this issue 3 years ago · 0 comments

Domain Adaptation

Paper (PDF): https://arxiv.org/pdf/2004.10964.pdf
Implementation: https://github.com/allenai/dont-stop-pretraining

Things to Consider

Are we going to use RoBERTa, as in the paper, or BERT or some other model?
How are we going to store the trained model? Is there free storage that we can use?
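
For the storage question, a rough size estimate may help when comparing options. The sketch below assumes approximate, publicly reported parameter counts (about 125M for RoBERTa-base, about 110M for BERT-base) and counts only the model weights. The helper name `checkpoint_size_mb` is hypothetical, and real checkpoint files will differ somewhat, especially if optimizer state is saved too.

```python
# Rough checkpoint-size estimate (weights only).
# Parameter counts are approximate public figures, not measured in this project.
APPROX_PARAMS = {
    "roberta-base": 125_000_000,
    "bert-base-uncased": 110_000_000,
}

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def checkpoint_size_mb(model_name: str, dtype: str = "fp32") -> float:
    """Return the approximate weight size in megabytes (10**6 bytes)."""
    return APPROX_PARAMS[model_name] * BYTES_PER_PARAM[dtype] / 1e6


if __name__ == "__main__":
    for name in APPROX_PARAMS:
        print(f"{name}: ~{checkpoint_size_mb(name):.0f} MB in fp32")
```

Under these assumptions, each saved base-size model needs roughly half a gigabyte in fp32, so storing one checkpoint per dataset stays within a few gigabytes.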

Dataset

Same Task

TREC-8 QA Dataset
https://trec.nist.gov/data/qa/t8_qadata.html
Yahoo Answers Dataset
https://www.kaggle.com/datasets/jarupula/yahoo-answers-dataset

Same Domain

COVID-19 Dataset
https://github.com/davidcampos/covid19-corpus

Plan

- [ ] Implement model training with domain adaptation
- [ ] Store the trained model for later use by active learning
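
As a starting point for the first item: the linked paper adapts the model by continuing masked-language-model pretraining on unlabeled in-domain or task text. Below is a minimal, library-free sketch of the masking step only, assuming the standard scheme (select about 15% of tokens; of those, 80% become the mask token, 10% a random token, 10% stay unchanged). The function `mask_tokens`, the toy vocabulary, and whitespace tokenization are hypothetical illustrations; an actual run would rely on the tokenizer and training scripts from the linked implementation.

```python
import random

MASK_TOKEN = "<mask>"  # RoBERTa-style mask token; BERT uses "[MASK]"


def mask_tokens(tokens, vocab, mlm_prob=0.15, rng=None):
    """Return (inputs, labels) for masked language modeling.

    labels[i] holds the original token where a prediction is required,
    and None where the position is ignored by the loss.
    """
    if rng is None:
        rng = random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mlm_prob:
            labels.append(tok)  # the model must recover this token
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK_TOKEN)          # 80%: replace with mask
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))   # 10%: random token
            else:
                inputs.append(tok)                 # 10%: keep unchanged
        else:
            labels.append(None)  # not selected, ignored by the loss
            inputs.append(tok)
    return inputs, labels


if __name__ == "__main__":
    # Toy example: whitespace tokens stand in for real subword tokens.
    sentence = "the patient tested positive for the virus".split()
    toy_vocab = sorted(set(sentence))
    masked, targets = mask_tokens(sentence, toy_vocab, rng=random.Random(0))
    print(masked)
    print(targets)
```

Because masking is re-sampled on every call, the same sentence gets different masks across epochs, which matches the dynamic masking used by RoBERTa.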