BlueBrain/Search

Fine-tune Question-Answering model on our own data

Context

  • Pre-trained QA models seem to give decent accuracy on our BBP internal QA samples (see #612).
  • Hopefully, we can get better accuracy by fine-tuning the best-performing models on our own data.
  • However, to do so, we need enough samples in our dataset to perform a train/validation split, and we also need to double-check the quality of our training data.

Actions

  • Fine-tune the best-performing QA model(s) on our own QA dataset using k-fold cross-validation (first sketch after this list).
  • Also investigate the results when the holdout (validation) split is built by removing all samples from a single source (e.g. WvG, PS, HM, ...) and training on the others (second sketch below).
  • If our results are better than the baseline, also compute training curves, i.e. increase the training set size step by step and check how accuracy evolves (third sketch below).
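
A minimal sketch of the k-fold loop, assuming a SQuAD-style list of samples; `qa_samples` and the tokenization helper `build_dataset` are hypothetical placeholders, and the model name is just an example, not necessarily the best performer from #612:

```python
# Hedged sketch of k-fold fine-tuning with Hugging Face Transformers.
# `qa_samples` (list of SQuAD-style dicts) and `build_dataset` (tokenization
# into a torch Dataset with start/end positions) are hypothetical helpers.
from sklearn.model_selection import KFold
from transformers import (AutoModelForQuestionAnswering, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "deepset/roberta-base-squad2"  # example; use the winner of #612
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

kfold = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, valid_idx) in enumerate(kfold.split(qa_samples)):
    # Re-initialize the model for every fold so the folds stay independent.
    model = AutoModelForQuestionAnswering.from_pretrained(MODEL_NAME)
    train_ds = build_dataset([qa_samples[i] for i in train_idx], tokenizer)
    valid_ds = build_dataset([qa_samples[i] for i in valid_idx], tokenizer)
    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir=f"qa_finetuned_fold_{fold}",
            num_train_epochs=2,
            per_device_train_batch_size=8,
            evaluation_strategy="epoch",
        ),
        train_dataset=train_ds,
        eval_dataset=valid_ds,
    )
    trainer.train()
    print(f"fold {fold}: {trainer.evaluate()}")
```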
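
For the source-based holdout splits, scikit-learn's `LeaveOneGroupOut` gives exactly this behaviour; the sketch below assumes each sample carries a hypothetical `source` field:

```python
# Hold out one source at a time (e.g. WvG, PS, HM) and train on the rest.
# Assumes each sample in `qa_samples` has a hypothetical "source" field.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

sources = np.array([sample["source"] for sample in qa_samples])
for train_idx, valid_idx in LeaveOneGroupOut().split(qa_samples, groups=sources):
    held_out = sources[valid_idx][0]  # all validation samples share one source
    # Fine-tune on `train_idx` and evaluate on `valid_idx` exactly as in the
    # k-fold sketch above; a large drop for one held-out source would flag
    # distribution shift or quality issues in that source.
    print(f"holding out source {held_out!r}: "
          f"{len(train_idx)} train / {len(valid_idx)} valid samples")
```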
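
And for the training curves, a sketch that grows the training set over a fixed validation split; `finetune_and_score` stands in for the fine-tune-and-evaluate step from the first sketch and is hypothetical:

```python
# Training curve: fine-tune on increasingly large subsets of the training
# pool and score each run on the same fixed validation split.
# `finetune_and_score(train, valid)` is a hypothetical wrapper around the
# Trainer loop above that returns a single accuracy-like metric.
import random

random.seed(42)
shuffled = random.sample(qa_samples, k=len(qa_samples))
valid_split, train_pool = shuffled[:40], shuffled[40:]

for size in (25, 50, 100, len(train_pool)):
    score = finetune_and_score(train_pool[:size], valid_split)
    print(f"train size = {size:4d} -> accuracy = {score:.3f}")
```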

Dependencies

  • #616
  • #617
  • In general, we should have at least ~200 samples to be able to run this experiment.