MNIST dataset server is down
Jeffwan opened this issue · 5 comments
The e2e test is failing. The reason is straightforward: the MNIST server returns a 503 error. I checked, and this is already being tracked in the PyTorch community.
Since the patch is only available on master and there's no way to specify the download path, I can either disable that single test case and wait for a stable release, or build a nightly image, which takes extra effort.
Using distributed PyTorch with gloo backend
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Traceback (most recent call last):
File "/var/mnist.py", line 150, in <module>
main()
File "/var/mnist.py", line 123, in main
transforms.Normalize((0.1307,), (0.3081,))
File "/opt/conda/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/datasets/mnist.py", line 46, in __init__
epoch, batch_idx * len(data), len(train_loader.dataset),
File "/opt/conda/lib/python3.6/site-packages/torchvision-0.2.1-py3.6.egg/torchvision/datasets/mnist.py", line 114, in download
if should_distribute():
File "/opt/conda/lib/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/opt/conda/lib/python3.6/urllib/request.py", line 532, in open
response = meth(req, response)
File "/opt/conda/lib/python3.6/urllib/request.py", line 642, in http_response
'http', request, response, code, msg, hdrs)
File "/opt/conda/lib/python3.6/urllib/request.py", line 570, in error
return self._call_chain(*args)
File "/opt/conda/lib/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/opt/conda/lib/python3.6/urllib/request.py", line 650, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 503: Service Unavailable
Confirmed this is a server side issue.
https://discuss.pytorch.org/t/mnist-server-down/114433
pytorch/vision#3554
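Until a release with the upstream fix is available, one possible workaround is to fetch the files ourselves and fall back to another source when the primary one returns 503. The following is only a minimal sketch: `fetch_with_fallback` is a hypothetical helper, and the second base URL is a placeholder, not a real mirror.

```python
import urllib.error
import urllib.request

# The first base URL is the one from the traceback above. The second is a
# hypothetical placeholder and must be replaced with a mirror you trust.
BASE_URLS = [
    "http://yann.lecun.com/exdb/mnist/",
    "https://mirror.example.com/mnist/",
]


def fetch_with_fallback(filename, base_urls=BASE_URLS,
                        opener=urllib.request.urlopen, timeout=30):
    """Return the bytes of `filename` from the first base URL that responds."""
    errors = []
    for base in base_urls:
        url = base + filename
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except urllib.error.URLError as exc:
            # HTTPError (for example the 503 above) is a subclass of URLError.
            errors.append("{}: {}".format(url, exc))
    raise RuntimeError("all download sources failed: " + "; ".join(errors))
```

The `opener` parameter exists only so the function can be exercised without network access; in normal use the default `urllib.request.urlopen` is enough.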
@Jeffwan We faced the same problem in Katib.
We are currently using FashionMNIST instead of MNIST: https://github.com/kubeflow/katib/blob/master/examples/v1beta1/pytorch-mnist/mnist.py#L137.
I believe it is hosted on PyTorch servers.
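For illustration, the swap to FashionMNIST could look roughly like this. It is a sketch, not the code from the Katib example: `make_train_dataset` is a hypothetical helper, and the normalization constants are simply the MNIST values visible in the traceback, reused unchanged.

```python
def make_train_dataset(root, use_fashion=True, download=True):
    """Build the training dataset, choosing FashionMNIST or MNIST."""
    # Imported lazily so this module can be loaded without torchvision.
    from torchvision import datasets, transforms

    # FashionMNIST subclasses MNIST in torchvision, so the constructor
    # arguments are the same for both.
    dataset_cls = datasets.FashionMNIST if use_fashion else datasets.MNIST

    transform = transforms.Compose([
        transforms.ToTensor(),
        # MNIST statistics taken from the traceback; kept as-is here, although
        # they are not tuned for FashionMNIST.
        transforms.Normalize((0.1307,), (0.3081,)),
    ])
    return dataset_cls(root, train=True, download=download, transform=transform)
```

Because both datasets contain 28x28 grayscale images in 10 classes, the model definition should not need to change, but that is worth verifying against the actual example.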
@andreyvelich this sounds like a good solution.
Another way would be to pre-download the dataset in the image.
The problem is how to make a new image for the example. The current one is from the GCP registry, which is no longer available.
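If the dataset does get baked into a new image, the training script could check at startup that the files are present and skip the download entirely. A minimal sketch follows; `missing_dataset_files` is a hypothetical helper, and the directory passed to it is an assumption, since the layout torchvision expects under its root depends on the torchvision version.

```python
import os

# The four archives that make up MNIST; the first name matches the URL in
# the traceback above.
MNIST_FILES = [
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
]


def missing_dataset_files(data_dir, filenames=MNIST_FILES):
    """Return the expected files that are not present in `data_dir`."""
    return [
        name for name in filenames
        if not os.path.isfile(os.path.join(data_dir, name))
    ]
```

An empty result means the image already carries the data, so the dataset can be constructed with `download=False` and the test no longer depends on the remote server.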
Sounds good. Let me double-check whether the code is compatible with the FashionMNIST dataset. If it is and the data server is reliable, we can quickly switch to it.