Crash when loading ogbn_proteins
JonasDeSchouwer opened this issue · 1 comments
I try to execute the following line:
ogb_dataset = NodePropPredDataset(name="ogbn-proteins", root="datasets/data/ogb")
This starts off doing what it is supposed to:
- it downloads proteins.zip from the correct URL
- it extracts this zip into the directory datasets/data/ogb/ogbn_proteins with subdirectories mapping, raw, processed, split
- it loads the graph and labels, and preprocesses them.
However, as soon as it gets to the line
torch.save({'graph': self.graph, 'labels': self.labels}, pre_processed_file_path, pickle_protocol=4)
in ogb/nodeproppred/dataset.py (= line 135 in the version I am running), the program crashes without any error message, and only an empty file is saved to datasets/data/ogb/ogbn_proteins/processed/data_processed.
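Because the crash leaves no Python traceback, one way to learn more is to run the failing step in a separate interpreter and inspect how that process ended. The following is only a diagnostic sketch: the helper names `describe_returncode` and `run_isolated` are hypothetical, and the interpretation of negative return codes applies to POSIX systems such as the Ubuntu machine described here.

```python
import signal
import subprocess
import sys


def describe_returncode(returncode):
    """Translate a subprocess return code into a readable description."""
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"killed by {name}"
    return f"exited with code {returncode}"


def run_isolated(code):
    """Run a Python snippet in a fresh interpreter and report how it ended."""
    result = subprocess.run(
        # faulthandler prints a traceback to stderr on crashes such as segfaults.
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
    )
    return describe_returncode(result.returncode), result.stderr
```

Passing a snippet that loads the graph and calls torch.save would show whether the process was killed by a signal. A result such as "killed by SIGKILL" would be consistent with the operating system's out-of-memory killer, while "killed by SIGSEGV" would point to a crash in native code; neither has been confirmed for this issue.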
I have been able to reproduce this by just loading self.graph and self.labels in a notebook by executing the following code:
graph = read_csv_graph_raw(raw_dir, add_inverse_edge=True, additional_node_files=['node_species'], additional_edge_files=[])[0]
labels = pd.read_csv(osp.join(raw_dir, 'node-label.csv.gz'), compression='gzip', header=None).values
Then, I can save labels and graph["node_species"] to a file without problem, but as soon as I try to save anything containing graph["edge_index"] or graph["edge_feat"] to a file, the kernel crashes. Note that these have large sizes: (2, 79122504) for graph["edge_index"] and (79122504, 8) for graph["edge_feat"]. All matrices look pretty normal to me, so my guess is that this is a problem with torch.save not being able to handle large files (yet the matrices are smaller than the max size reported in this issue). Still, I thought it would be useful to let you know about this and perhaps find a workaround.
--- DETAILS ABOUT MY ENVIRONMENT ---
- Ubuntu 20.04.6 LTS
- Python 3.12.3
- torch 2.2.2+cu121
Output from conda:
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
absl-py 2.1.0 pypi_0 pypi
aiohttp 3.9.5 pypi_0 pypi
aiosignal 1.3.1 pypi_0 pypi
asttokens 2.4.1 pyhd8ed1ab_0 conda-forge
attrs 23.2.0 pypi_0 pypi
bzip2 1.0.8 h5eee18b_6
ca-certificates 2024.3.11 h06a4308_0
certifi 2024.2.2 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
comm 0.2.2 pyhd8ed1ab_0 conda-forge
debugpy 1.6.7 py312h6a678d5_0
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
exceptiongroup 1.2.0 pyhd8ed1ab_2 conda-forge
executing 2.0.1 pyhd8ed1ab_0 conda-forge
expat 2.6.2 h6a678d5_0
filelock 3.13.1 pypi_0 pypi
frozenlist 1.4.1 pypi_0 pypi
fsspec 2024.2.0 pypi_0 pypi
googledrivedownloader 0.4 pypi_0 pypi
grpcio 1.64.0 pypi_0 pypi
idna 3.7 pypi_0 pypi
importlib-metadata 7.1.0 pyha770c72_0 conda-forge
importlib_metadata 7.1.0 hd8ed1ab_0 conda-forge
ipykernel 6.29.3 pyhd33586a_0 conda-forge
ipython 8.24.0 pyh707e725_0 conda-forge
jedi 0.19.1 pyhd8ed1ab_0 conda-forge
jinja2 3.1.3 pypi_0 pypi
joblib 1.4.2 pypi_0 pypi
jupyter_client 8.6.2 pyhd8ed1ab_0 conda-forge
jupyter_core 5.5.0 py312h06a4308_0
ld_impl_linux-64 2.38 h1181459_1
libffi 3.4.4 h6a678d5_1
libgcc-ng 13.2.0 h77fa898_7 conda-forge
libgomp 13.2.0 h77fa898_7 conda-forge
libsodium 1.0.18 h36c2ea0_1 conda-forge
libstdcxx-ng 11.2.0 h1234567_1
libuuid 1.41.5 h5eee18b_0
lightning-utilities 0.11.2 pypi_0 pypi
littleutils 0.2.2 pypi_0 pypi
markdown 3.6 pypi_0 pypi
markupsafe 2.1.5 pypi_0 pypi
matplotlib-inline 0.1.7 pyhd8ed1ab_0 conda-forge
mpmath 1.3.0 pypi_0 pypi
multidict 6.0.5 pypi_0 pypi
ncurses 6.4 h6a678d5_0
nest-asyncio 1.6.0 pyhd8ed1ab_0 conda-forge
networkx 3.2.1 pypi_0 pypi
numpy 1.26.3 pypi_0 pypi
nvidia-cublas-cu12 12.1.3.1 pypi_0 pypi
nvidia-cuda-cupti-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-nvrtc-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-runtime-cu12 12.1.105 pypi_0 pypi
nvidia-cudnn-cu12 8.9.2.26 pypi_0 pypi
nvidia-cufft-cu12 11.0.2.54 pypi_0 pypi
nvidia-curand-cu12 10.3.2.106 pypi_0 pypi
nvidia-cusolver-cu12 11.4.5.107 pypi_0 pypi
nvidia-cusparse-cu12 12.1.0.106 pypi_0 pypi
nvidia-nccl-cu12 2.19.3 pypi_0 pypi
nvidia-nvjitlink-cu12 12.1.105 pypi_0 pypi
nvidia-nvtx-cu12 12.1.105 pypi_0 pypi
ogb 1.3.6 pypi_0 pypi
openssl 3.3.0 h4ab18f5_3 conda-forge
outdated 0.2.2 pypi_0 pypi
packaging 24.0 pyhd8ed1ab_0 conda-forge
pandas 2.2.2 pypi_0 pypi
parso 0.8.4 pyhd8ed1ab_0 conda-forge
pexpect 4.9.0 pyhd8ed1ab_0 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 10.2.0 pypi_0 pypi
pip 24.0 py312h06a4308_0
platformdirs 4.2.2 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.42 pyha770c72_0 conda-forge
protobuf 5.26.1 pypi_0 pypi
psutil 5.9.8 pypi_0 pypi
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge
pyg-lib 0.4.0+pt22cu121 pypi_0 pypi
pygments 2.18.0 pyhd8ed1ab_0 conda-forge
pyparsing 3.1.2 pypi_0 pypi
python 3.12.3 h996f2a0_1
python-dateutil 2.9.0 pyhd8ed1ab_0 conda-forge
pytorch-lightning 2.2.0 pypi_0 pypi
pytz 2024.1 pypi_0 pypi
pyyaml 6.0.1 pypi_0 pypi
pyzmq 25.1.2 py312h6a678d5_0
readline 8.2 h5eee18b_0
requests 2.32.2 pypi_0 pypi
scikit-learn 1.5.0 pypi_0 pypi
scipy 1.13.1 pypi_0 pypi
setuptools 69.5.1 py312h06a4308_0
six 1.16.0 pyh6c4a22f_0 conda-forge
sqlite 3.45.3 h5eee18b_0
stack_data 0.6.2 pyhd8ed1ab_0 conda-forge
sympy 1.12 pypi_0 pypi
tensorboard 2.16.2 pypi_0 pypi
tensorboard-data-server 0.7.2 pypi_0 pypi
tensorboard-reducer 0.3.1 pypi_0 pypi
threadpoolctl 3.5.0 pypi_0 pypi
tk 8.6.14 h39e8969_0
torch 2.2.2+cu121 pypi_0 pypi
torch-cluster 1.6.3+pt22cu121 pypi_0 pypi
torch-geometric 2.5.3 pypi_0 pypi
torch-scatter 2.1.2+pt22cu121 pypi_0 pypi
torch-sparse 0.6.18+pt22cu121 pypi_0 pypi
torch-spline-conv 1.2.2+pt22cu121 pypi_0 pypi
torch-tb-profiler 0.4.3 pypi_0 pypi
torchaudio 2.2.2+cu121 pypi_0 pypi
torchmetrics 1.4.0.post0 pypi_0 pypi
torchvision 0.17.2+cu121 pypi_0 pypi
tornado 6.3.3 py312h5eee18b_0
tqdm 4.66.4 pypi_0 pypi
traitlets 5.14.3 pyhd8ed1ab_0 conda-forge
typing-extensions 4.9.0 pypi_0 pypi
typing_extensions 4.11.0 pyha770c72_0 conda-forge
tzdata 2024.1 pypi_0 pypi
urllib3 2.2.1 pypi_0 pypi
wcwidth 0.2.13 pyhd8ed1ab_0 conda-forge
werkzeug 3.0.3 pypi_0 pypi
wheel 0.43.0 py312h06a4308_0
xz 5.4.6 h5eee18b_1
yarl 1.9.4 pypi_0 pypi
zeromq 4.3.5 h6a678d5_0
zipp 3.17.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.13 h5eee18b_1
To reproduce this issue:
In the terminal:
conda create -n test_save_env
conda activate test_save_env
conda install python=3.12
pip install ogb==1.3.6
Note that ogb has torch as a dependency, so in my case it installs torch 2.3.1. But I observed the same behaviour with torch 2.2.2+cu121.
Then run the following Python code:
from ogb.io.read_graph_raw import read_csv_graph_raw
import pandas as pd
import os.path as osp
import torch
raw_dir = "datasets/data/ogb/ogbn_proteins/raw"
graph = read_csv_graph_raw(raw_dir, add_inverse_edge=True, additional_node_files=['node_species'], additional_edge_files=[])[0]
labels = pd.read_csv(osp.join(raw_dir, 'node-label.csv.gz'), compression='gzip', header=None).values
In my case, this gives the following error (in a notebook):
The Kernel crashed while executing code in the current cell or a previous cell.
Please review the code in the cell(s) to identify a possible cause of the failure.
Click [here](https://aka.ms/vscodeJupyterKernelCrash) for more info.
View Jupyter [log](command:jupyter.viewOutput) for further details.
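As for a workaround, one option that may be worth trying is to write each large array to its own .npy file with np.save instead of pickling the whole dictionary in one torch.save call. This is a sketch only: it assumes the crash is related to serializing the large arrays, which has not been confirmed, and the helper names `save_graph_arrays` and `load_graph_arrays` are hypothetical. It also does not produce the data_processed file that NodePropPredDataset expects, so it only helps when working with the arrays directly.

```python
import os
import os.path as osp

import numpy as np


def save_graph_arrays(graph, out_dir):
    """Write every ndarray in the graph dict to its own .npy file.

    Non-array entries are returned so the caller can store them separately.
    """
    os.makedirs(out_dir, exist_ok=True)
    other = {}
    for key, value in graph.items():
        if isinstance(value, np.ndarray):
            np.save(osp.join(out_dir, f"{key}.npy"), value)
        else:
            other[key] = value
    return other


def load_graph_arrays(out_dir, mmap_mode=None):
    """Load all .npy files in out_dir back into a dict keyed by file stem."""
    graph = {}
    for fname in sorted(os.listdir(out_dir)):
        if fname.endswith(".npy"):
            path = osp.join(out_dir, fname)
            # mmap_mode="r" maps the file instead of reading it all into RAM.
            graph[fname[:-4]] = np.load(path, mmap_mode=mmap_mode)
    return graph
```

If saving graph["edge_index"] this way also crashes, that would suggest the problem is not specific to torch.save.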