BlueBrain/Search

Fine-tune Question-Answering model on our own data

Context

  • Pre-trained QA models seem to give decent accuracy on our BBP internal QA samples (see #612).
  • Hopefully, we can get better accuracy by fine-tuning the best-performing models on our own data.
  • However, to do so, we need enough samples in our dataset to perform a train/validation split, and we also need to double-check the quality of our training data.

Actions

  • Fine-tune the best-performing QA model(s) on our own QA dataset using k-fold cross-validation (first sketch after this list).
  • Also investigate the results when the holdout (validation) split is built by removing all samples from a single source (e.g. WvG, PS, HM, ...) and training on the others (second sketch below).
  • If our results are better than the baseline, also compute training curves, i.e. increase the training set size step by step and check how accuracy evolves (third sketch below).
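
A minimal sketch of the k-fold loop, assuming a SQuAD-style list of samples; `qa_samples` and the tokenization helper `build_dataset` are hypothetical placeholders, and the model name is just an example, not necessarily the best performer from #612:

```python
# Hedged sketch of k-fold fine-tuning with Hugging Face Transformers.
# `qa_samples` (list of SQuAD-style dicts) and `build_dataset` (tokenization
# into a torch Dataset with start/end positions) are hypothetical helpers.
from sklearn.model_selection import KFold
from transformers import (AutoModelForQuestionAnswering, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL_NAME = "deepset/roberta-base-squad2"  # example; use the winner of #612
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

kfold = KFold(n_splits=5, shuffle=True, random_state=42)
for fold, (train_idx, valid_idx) in enumerate(kfold.split(qa_samples)):
    # Re-initialize the model for every fold so the folds stay independent.
    model = AutoModelForQuestionAnswering.from_pretrained(MODEL_NAME)
    train_ds = build_dataset([qa_samples[i] for i in train_idx], tokenizer)
    valid_ds = build_dataset([qa_samples[i] for i in valid_idx], tokenizer)
    trainer = Trainer(
        model=model,
        args=TrainingArguments(
            output_dir=f"qa_finetuned_fold_{fold}",
            num_train_epochs=2,
            per_device_train_batch_size=8,
            evaluation_strategy="epoch",
        ),
        train_dataset=train_ds,
        eval_dataset=valid_ds,
    )
    trainer.train()
    print(f"fold {fold}: {trainer.evaluate()}")
```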
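
For the source-based holdout splits, scikit-learn's `LeaveOneGroupOut` gives exactly this behaviour; the sketch below assumes each sample carries a hypothetical `source` field:

```python
# Hold out one source at a time (e.g. WvG, PS, HM) and train on the rest.
# Assumes each sample in `qa_samples` has a hypothetical "source" field.
import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

sources = np.array([sample["source"] for sample in qa_samples])
for train_idx, valid_idx in LeaveOneGroupOut().split(qa_samples, groups=sources):
    held_out = sources[valid_idx][0]  # all validation samples share one source
    # Fine-tune on `train_idx` and evaluate on `valid_idx` exactly as in the
    # k-fold sketch above; a large drop for one held-out source would flag
    # distribution shift or quality issues in that source.
    print(f"holding out source {held_out!r}: "
          f"{len(train_idx)} train / {len(valid_idx)} valid samples")
```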
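
And for the training curves, a sketch that grows the training set over a fixed validation split; `finetune_and_score` stands in for the fine-tune-and-evaluate step from the first sketch and is hypothetical:

```python
# Training curve: fine-tune on increasingly large subsets of the training
# pool and score each run on the same fixed validation split.
# `finetune_and_score(train, valid)` is a hypothetical wrapper around the
# Trainer loop above that returns a single accuracy-like metric.
import random

random.seed(42)
shuffled = random.sample(qa_samples, k=len(qa_samples))
valid_split, train_pool = shuffled[:40], shuffled[40:]

for size in (25, 50, 100, len(train_pool)):
    score = finetune_and_score(train_pool[:size], valid_split)
    print(f"train size = {size:4d} -> accuracy = {score:.3f}")
```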

Dependencies

  • #616
  • #617
  • In general, we should have at least ~200 samples to be able to run this experiment.