google-research/task_adaptation

Dataset split ids?

cjrd opened this issue · 4 comments

cjrd commented

Would it be possible to provide the dataset split ids you used for the paper, i.e. for train/val/test?

Splits are uniquely defined in our data folder through the tfds subsplit API: https://www.tensorflow.org/datasets/splits.

The easiest solution would be to use our code to load the data (which will produce the exact splits from the paper).
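For readers unfamiliar with TFDS splits, here is a minimal sketch of how slice-style split strings can be built and passed to `tfds.load`. The helper names (`slice_spec`, `small_splits`, `load_split`) are hypothetical, and the 800/200 sizes are an assumption based on the train800/val200 convention described for VTAB-1000; the authoritative definitions are the ones in the repository's data folder, which may differ per dataset.

```python
from typing import Dict


def slice_spec(split: str, start: int, stop: int) -> str:
    """Build a TFDS slicing string such as 'train[0:800]'."""
    if start < 0 or stop <= start:
        raise ValueError("expected 0 <= start < stop")
    return f"{split}[{start}:{stop}]"


def small_splits(n_train: int = 800, n_val: int = 200,
                 base: str = "train") -> Dict[str, str]:
    """Hypothetical helper: carve train/val/trainval out of one base split.

    The sizes are assumptions for illustration, not read from the repo.
    """
    return {
        "train": slice_spec(base, 0, n_train),
        "val": slice_spec(base, n_train, n_train + n_val),
        "trainval": slice_spec(base, 0, n_train + n_val),
    }


def load_split(dataset_name: str, spec: str):
    """Load one slice with TFDS (imported lazily; requires tensorflow_datasets)."""
    import tensorflow_datasets as tfds
    return tfds.load(dataset_name, split=spec)


if __name__ == "__main__":
    for name, spec in small_splits().items():
        print(name, "->", spec)
```

Slices of this form are deterministic as long as the dataset version and file order stay the same, which is why loading through the repository code is the safest way to reproduce the paper's splits.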

cjrd commented

Thanks for your response: I've been able to load the data and output the train/val/test splits.
Is there a particular way to output the train splits for the 1000 example training?
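One possible way to write out which examples land in a small training subset is sketched below. It is not the repository's own mechanism: it relies on the TFDS option `tfds.ReadConfig(add_tfds_id=True)`, which attaches a `tfds_id` field to each example. Note that `tfds_id` is a TFDS-internal identifier, not the file name or index from the original dataset. The function names and the JSON layout are hypothetical.

```python
import json
from typing import Any, Iterable, List, Mapping


def collect_ids(examples: Iterable[Mapping[str, Any]],
                key: str = "tfds_id") -> List[str]:
    """Collect the id field of every example, decoding bytes to str."""
    ids = []
    for example in examples:
        value = example[key]
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        ids.append(str(value))
    return ids


def save_ids(path: str, ids_by_split: Mapping[str, List[str]]) -> None:
    """Write a {split_name: [ids...]} mapping to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(ids_by_split, f, indent=2)


def load_examples_with_ids(dataset_name: str, split: str):
    """Iterate over a split with tfds_id attached (requires tensorflow_datasets)."""
    import tensorflow_datasets as tfds
    ds = tfds.load(
        dataset_name,
        split=split,
        shuffle_files=False,  # keep file order fixed so slices are stable
        read_config=tfds.ReadConfig(add_tfds_id=True),
    )
    return tfds.as_numpy(ds)
```

A call such as `save_ids("split_ids.json", {"train": collect_ids(load_examples_with_ids(name, spec))})` would then produce one file per dataset, where `name` and `spec` are whatever dataset name and split string the loading code uses.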

frkl commented

Dear VTAB team,

I’m Xiao Lin from SRI. We’ve been working on cross-domain few-shot learning solutions and find your VTAB-1000 benchmark very exciting. It’s the large-scale fixed-split benchmark we need, compared to the existing small 5-way k-shot problems and the random-way random-shot Meta-Dataset, so we hope to try it out.

But I ran into some difficulties downloading the dataset. After installing the pip requirements and trying to run the dataset preparation scripts, TF1.5 tells me that “the version of dataset you want to download requires TF2”, and when I try installing TF2 instead of TF1.5, other errors pop up:
“Exporting/importing meta graphs is not supported when eager execution is enabled. No graph exists when eager execution is enabled”, which looks like a code compatibility issue. I see that you are still actively making commits to add TF2 support, so keep up the good work.

On the other hand, I mainly use PyTorch and I’m not very familiar with TensorFlow. I think a good common ground might be to share the images, image names, and your custom labels in addition to the benchmarking code. Your train/val/test protocol sounds very clear, so people would be able to reproduce results across platforms. The exceptions are the Res50v2 model architecture/weights and the fine-tuning procedures, both of which are actively being improved in your BigTransfer work. In case there’s a follow-up challenge, would it be possible for the benchmark side to run Docker containers for some cross-platform love?

Best,
Xiao Lin

Hi,

Would it be possible to upload a split_ids file giving the ids of the original dataset samples?