Shen-Lab/GraphCL

Dataset Download Fail

dongzizhu opened this issue · 2 comments

Hi, thanks for sharing the great work!

I'm trying to run the code for unsupervised TU with NCI1, and the data download has failed.

Downloading http://ls11-www.cs.tu-dortmund.de/people/morris/graphkerneldatasets/NCI1.zip
Traceback (most recent call last):
  File "/fs/ess/PCON0023/dzz2023/DD/GraphCL/unsupervised_TU/gsimclr.py", line 163, in <module>
    dataset = TUDataset(path, name=DS, aug=args.aug).shuffle()
  File "/fs/ess/PCON0023/dzz2023/DD/GraphCL/unsupervised_TU/aug.py", line 71, in __init__
    super(TUDataset_aug, self).__init__(root, transform, pre_transform,
  File "/fs/ess/PCON0023/dzz2023/new_conda/envs/pyg/lib/python3.9/site-packages/torch_geometric/data/in_memory_dataset.py", line 81, in __init__
    super().__init__(root, transform, pre_transform, pre_filter, log,
  File "/fs/ess/PCON0023/dzz2023/new_conda/envs/pyg/lib/python3.9/site-packages/torch_geometric/data/dataset.py", line 112, in __init__
    self._download()
  File "/fs/ess/PCON0023/dzz2023/new_conda/envs/pyg/lib/python3.9/site-packages/torch_geometric/data/dataset.py", line 229, in _download
    self.download()
  File "/fs/ess/PCON0023/dzz2023/DD/GraphCL/unsupervised_TU/aug.py", line 158, in download
    path = download_url('{}/{}.zip'.format(url, self.name), folder)
  File "/fs/ess/PCON0023/dzz2023/new_conda/envs/pyg/lib/python3.9/site-packages/torch_geometric/data/download.py", line 47, in download_url
    data = urllib.request.urlopen(url, context=context)
  File "/fs/ess/PCON0023/dzz2023/new_conda/envs/pyg/lib/python3.9/urllib/request.py", line 214, in urlopen
    return opener.open(url, data, timeout)
  File "/fs/ess/PCON0023/dzz2023/new_conda/envs/pyg/lib/python3.9/urllib/request.py", line 523, in open
    response = meth(req, response)
  File "/fs/ess/PCON0023/dzz2023/new_conda/envs/pyg/lib/python3.9/urllib/request.py", line 632, in http_response
    response = self.parent.error(
  File "/fs/ess/PCON0023/dzz2023/new_conda/envs/pyg/lib/python3.9/urllib/request.py", line 561, in error
    return self._call_chain(*args)
  File "/fs/ess/PCON0023/dzz2023/new_conda/envs/pyg/lib/python3.9/urllib/request.py", line 494, in _call_chain
    result = func(*args)
  File "/fs/ess/PCON0023/dzz2023/new_conda/envs/pyg/lib/python3.9/urllib/request.py", line 641, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

Could you please share the dataset?

@dongzizhu Thanks for pointing out the obsolete links to the datasets!

@yyou1996 Could you please revise the links in the source code for the TU data download? The latest seems to be at https://www.chrsmrrs.com/graphkerneldatasets/NAME.zip

The link does indeed seem to be obsolete. According to https://pytorch-geometric.readthedocs.io/en/2.4.0/_modules/torch_geometric/datasets/tu_dataset.html#TUDataset, the latest one is https://www.chrsmrrs.com/graphkerneldatasets/NAME.zip (replace NAME with the dataset name).
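As a stopgap, the archive can also be fetched by hand. Here is a minimal sketch using only the standard library; the helper names `tu_zip_url` and `fetch_tu_zip` are made up for illustration, and where the extracted files must be placed depends on the dataset class, so please check its raw directory layout before relying on this.

```python
import os
import urllib.request

# Base location taken from the link above (no trailing slash).
TU_BASE_URL = "https://www.chrsmrrs.com/graphkerneldatasets"


def tu_zip_url(name, base_url=TU_BASE_URL):
    """Build the archive URL by substituting the dataset name for NAME."""
    return "{}/{}.zip".format(base_url.rstrip("/"), name)


def fetch_tu_zip(name, folder):
    """Download NAME.zip into `folder` and return the local file path.

    Hypothetical helper: it only downloads the archive and does not
    extract it or arrange the files for any particular dataset class.
    """
    os.makedirs(folder, exist_ok=True)
    target = os.path.join(folder, name + ".zip")
    urllib.request.urlretrieve(tu_zip_url(name), target)
    return target
```

For NCI1 this would be called as `fetch_tu_zip("NCI1", "data/NCI1")`, where the folder is just an example path.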

Unfortunately, the dataset downloading and processing function is part of the torch_geometric package, which I am not able to modify from this source code. The recommended solution is to write a subclass of the dataset class in torch_geometric.datasets.tu_dataset that only replaces the link.
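A minimal sketch of that idea, assuming the dataset class keeps its download location in a `url` class attribute (as `TUDataset` does in the 2.4.0 source linked above). The helper name `with_updated_url` is hypothetical. It takes the class as an argument so it can be applied to whichever TUDataset class the script actually imports; I have not checked that every version reads `url` the same way.

```python
# New base location from the torch_geometric 2.4.0 source.
NEW_TU_URL = "https://www.chrsmrrs.com/graphkerneldatasets"


def with_updated_url(dataset_cls, base_url=NEW_TU_URL):
    """Return a subclass of `dataset_cls` that only replaces the link.

    Assumption: `dataset_cls.download()` builds the archive address from
    the `url` class attribute, so overriding that attribute is enough.
    """

    class PatchedDataset(dataset_cls):
        url = base_url  # the only change relative to the parent class

    PatchedDataset.__name__ = dataset_cls.__name__ + "Patched"
    return PatchedDataset


# Intended usage (requires torch_geometric, so left as a comment):
# from torch_geometric.datasets import TUDataset
# dataset = with_updated_url(TUDataset)("data/NCI1", name="NCI1")
```

The parent class is left untouched, so other code that imports it keeps its original behavior.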