visipedia/inat_comp

Broken links to all 2017 data

justinkay opened this issue · 5 comments

Unfortunately none of the 2017 dataset links seem to be working.

For the one hosted by Caltech, I get a "Connection refused" error:

~$ wget http://www.vision.caltech.edu/~gvanhorn/datasets/inaturalist/fgvc4_competition/train_val_images.tar.gz
--2020-07-28 18:35:19--  http://www.vision.caltech.edu/~gvanhorn/datasets/inaturalist/fgvc4_competition/train_val_images.tar.gz
Resolving www.vision.caltech.edu (www.vision.caltech.edu)... 34.208.54.77
Connecting to www.vision.caltech.edu (www.vision.caltech.edu)|34.208.54.77|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://people.vision.caltech.edu/~gvanhorn/datasets/inaturalist/fgvc4_competition/train_val_images.tar.gz [following]
--2020-07-28 18:35:20--  http://people.vision.caltech.edu/~gvanhorn/datasets/inaturalist/fgvc4_competition/train_val_images.tar.gz
Resolving people.vision.caltech.edu (people.vision.caltech.edu)... 131.215.133.185
Connecting to people.vision.caltech.edu (people.vision.caltech.edu)|131.215.133.185|:80... failed: Connection refused.

Each of the Google API mirrors returns a NoSuchBucket error:

<Error>
<Code>NoSuchBucket</Code>
<Message>The specified bucket does not exist.</Message>
</Error>
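
For anyone wanting to check the remaining links in bulk, the two failure modes above (a refused connection versus an XML `NoSuchBucket` response) can be told apart with a short script. This is only a rough sketch using the Python standard library; the function names `parse_storage_error` and `probe` are made up for illustration, and it assumes the error body has the same shape as the one pasted above.

```python
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET


def parse_storage_error(body: str):
    """Return (code, message) from an XML <Error> body, or None if it is not one."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return None
    if root.tag != "Error":
        return None
    return root.findtext("Code"), root.findtext("Message")


def probe(url: str, timeout: float = 10.0) -> str:
    """Classify a download link as ok, an HTTP error, or unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"ok (HTTP {resp.status})"
    except urllib.error.HTTPError as err:
        # HTTPError must be caught before URLError, which is its parent class.
        detail = parse_storage_error(err.read().decode("utf-8", "replace"))
        if detail:
            return f"HTTP {err.code}: {detail[0]}"
        return f"HTTP {err.code}"
    except urllib.error.URLError as err:
        # Covers failures such as "Connection refused" or DNS errors.
        return f"unreachable: {err.reason}"


# The error body reported above, parsed locally (no network access needed).
body = """<Error>
<Code>NoSuchBucket</Code>
<Message>The specified bucket does not exist.</Message>
</Error>"""
print(parse_storage_error(body))
```

Calling `probe(...)` on each dataset URL would then report which links fail at the connection level and which fail because the bucket is gone.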

Has this data been permanently removed? Or is there a mirror or some other way to access it?

Thanks for any help.

Thanks for the heads up! I'm trying to work with Caltech and Google to see if we can get the links re-established.

The link to the un-obfuscated category names also needs to be updated.
Thanks in advance!

Update: I'm currently uploading the datasets to AWS S3. As a temporary solution I'll put them in a "requester pays" bucket so that folks can at least access them. I'll close this issue once the S3 links are ready to go (hopefully by end of day today or tomorrow). Still going back and forth with Google and Kaggle to see if we can get the original links reactivated.

That’s great, thank you!

We've uploaded the dataset files to a "requester pays" bucket on AWS S3 (see here). While this is not as convenient as a free download, it at least makes the dataset available. We'll continue the conversation with Google to see if we can get the original links working again. Sorry for the delay!
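
A note for anyone unfamiliar with "requester pays" buckets: requests must be authenticated with your own AWS credentials and must explicitly acknowledge that you accept the transfer charges, so a plain `wget` on the object URL will not work. Below is a minimal sketch of a download using `boto3`, assuming it is installed and credentials are configured. The helper name `download_requester_pays` and the bucket/key placeholders are hypothetical; take the real bucket and key names from the linked instructions.

```python
def download_requester_pays(bucket, key, dest, client=None):
    """Download one object from a requester-pays S3 bucket to a local path.

    `client` may be injected for testing; by default a boto3 S3 client is
    created, which requires boto3 and configured AWS credentials.
    """
    if client is None:
        import boto3  # third-party dependency, imported lazily

        client = boto3.client("s3")
    # RequestPayer="requester" acknowledges that the caller's AWS account
    # is billed for the request and the data transfer.
    client.download_file(bucket, key, dest, ExtraArgs={"RequestPayer": "requester"})
    return dest


# Example call (placeholders are hypothetical, not the real bucket or key):
# download_requester_pays("<bucket-name>", "<path/to/file.tar.gz>", "file.tar.gz")
```

The optional `client` argument is only there so the function can be exercised without touching AWS; in normal use it can be left out.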