jaxlib issue on midway3
ndtrung81 opened this issue · 5 comments
Hi all,
I have been trying to keep the pysages module up to date on midway2 and midway3, following the steps described in https://hackmd.io/Jbpc1E2kRbKPLUnmLKwINA
While the abf example script with openmm runs normally, I get the following error with the example scripts under examples/hoomd-blue:
jaxlib.xla_extension.XlaRuntimeError: No matching device found for local hardware
To reproduce this error, on a GPU compute node on midway3 (midway3-0279 in this case) I did
module load python/anaconda-2021.05 cuda/11.2 openmpi/4.1.2+gcc-7.4.0
source activate pysages3
and then went to examples/hoomd-blue/unbiased and ran
python3 gen_gsd.py
to get start.gsd, and then ran
python3 unbiased.py
(screenshot attached)
Maybe I missed something important here. Any suggestions would be appreciated.
(hoomd v2.9.7 installed in this pysages3 environment runs normally.)
Thanks,
-Trung
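A quick way to narrow down a "No matching device found" error is to ask JAX directly which versions are installed and which devices it can see, independently of pysages and hoomd. The sketch below is only a diagnostic under stated assumptions: the helper name describe_jax_backend is hypothetical, and it assumes the jax and jaxlib packages are installed under those distribution names in the active environment.

```python
from importlib import metadata


def describe_jax_backend():
    """Collect jax/jaxlib versions and the devices JAX can see.

    Hypothetical helper for diagnosis only; it never raises, so it can be
    run on a node where the JAX backend fails to initialize.
    """
    info = {}
    # Installed distribution versions (None if the package is missing).
    for pkg in ("jax", "jaxlib"):
        try:
            info[pkg] = metadata.version(pkg)
        except metadata.PackageNotFoundError:
            info[pkg] = None
    try:
        import jax

        # Backend name, typically "cpu" or "gpu".
        info["backend"] = jax.default_backend()
        info["devices"] = [str(d) for d in jax.devices()]
    except Exception as exc:  # import or backend initialization failed
        info["error"] = f"{type(exc).__name__}: {exc}"
    return info


if __name__ == "__main__":
    for key, value in describe_jax_backend().items():
        print(f"{key}: {value}")
```

If this reports only a CPU backend or an error on a GPU node, the problem is likely in the jaxlib build or the loaded CUDA module rather than in the example scripts.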
Hey Trung,
So, I had to reinstall and test with HOOMD-blue on midway3 over the last couple of days.
I load the following modules on midway3:
module purge
module load rcc
module load slurm
module load cuda
module load cmake
module load openmpi
conda activate ls-hoomd
where the ls-hoomd conda environment looks like this:
#
elwood /home/ludwigschneider/.conda/envs/elwood
ls-hoomd * /home/ludwigschneider/.conda/envs/ls-hoomd
mypysages /home/ludwigschneider/.conda/envs/mypysages
base /software/python-anaconda-2020.11-el8-x86_64
anvio-7.1 /software/python-anaconda-2020.11-el8-x86_64/envs/anvio-7.1
arcgis /software/python-anaconda-2020.11-el8-x86_64/envs/arcgis
automl_prediction /software/python-anaconda-2020.11-el8-x86_64/envs/automl_prediction
dask /software/python-anaconda-2020.11-el8-x86_64/envs/dask
env_deeplabcut /software/python-anaconda-2020.11-el8-x86_64/envs/env_deeplabcut
fflip /software/python-anaconda-2020.11-el8-x86_64/envs/fflip
geo_jpg /software/python-anaconda-2020.11-el8-x86_64/envs/geo_jpg
geospatial /software/python-anaconda-2020.11-el8-x86_64/envs/geospatial
hoomd /software/python-anaconda-2020.11-el8-x86_64/envs/hoomd
img_conversion /software/python-anaconda-2020.11-el8-x86_64/envs/img_conversion
meep /software/python-anaconda-2020.11-el8-x86_64/envs/meep
mkdocs /software/python-anaconda-2020.11-el8-x86_64/envs/mkdocs
mpi4py /software/python-anaconda-2020.11-el8-x86_64/envs/mpi4py
mrsid /software/python-anaconda-2020.11-el8-x86_64/envs/mrsid
openmm /software/python-anaconda-2020.11-el8-x86_64/envs/openmm
pmeep /software/python-anaconda-2020.11-el8-x86_64/envs/pmeep
pysages /software/python-anaconda-2020.11-el8-x86_64/envs/pysages
pytorch-gpu-1.2-cuda-10.0 /software/python-anaconda-2020.11-el8-x86_64/envs/pytorch-gpu-1.2-cuda-10.0
qgis_stable /software/python-anaconda-2020.11-el8-x86_64/envs/qgis_stable
rstudio /software/python-anaconda-2020.11-el8-x86_64/envs/rstudio
test_python_env /software/python-anaconda-2020.11-el8-x86_64/envs/test_python_env
tf_keras /software/python-anaconda-2020.11-el8-x86_64/envs/tf_keras
vertexai /software/python-anaconda-2020.11-el8-x86_64/envs/vertexai
I then manually installed jax for CUDA:
pip install --upgrade "jax[cuda]" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
I also installed hoomd-blue and hoomd-dlext from source.
Hope this helps.
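Since hoomd-blue and hoomd-dlext are built from source here, it can help to confirm that the loaded modules actually put the build tools on PATH before compiling. The following is a minimal sketch; the helper name find_tools is hypothetical, and the executable names (nvcc for cuda, cmake, mpirun for openmpi) are assumptions about what the modules provide, so adjust them if the site exposes different names.

```python
import shutil


def find_tools(names=("nvcc", "cmake", "mpirun")):
    """Map each executable name to its full path, or None if not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'NOT FOUND'}")
```

Running it after the module load and conda activate steps shows which toolchain a source build would pick up.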
@InnocentBug thanks for sharing. It seems that enforcing python=3.8 when installing packages makes hoomd-blue examples work again. Will give more updates soon.
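To double-check that the activated environment really runs the pinned interpreter, a short check like the one below can be run inside it. This is a sketch only: the helper name python_matches is hypothetical, and the default of 3.8 simply mirrors the version mentioned above.

```python
import sys


def python_matches(expected=(3, 8), actual=None):
    """Return True if the (major, minor) Python version equals `expected`.

    `actual` defaults to the running interpreter's version.
    """
    actual = tuple(sys.version_info[:2]) if actual is None else tuple(actual)
    return actual == tuple(expected)


if __name__ == "__main__":
    print("running Python", ".".join(map(str, sys.version_info[:3])))
    print("matches 3.8:", python_matches())
```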
Is this still an issue?
Not an issue at this point. Let's mark this issue as resolved. The current env pysages3 under python/anaconda-2021.05 on midway3 works fine with the examples as far as my tests go. It needs module load python/anaconda-2021.05 openmpi/4.1.2+gcc-7.4.0 cuda/11.2