MDIL-SNU/SIMPLE-NN

No GPU usage in generate_NNP

I'm trying to run the example in SiO2/generate_NNP using TensorFlow with GPU support on a workstation with four Tesla P100 GPUs managed by Slurm.

To use SIMPLE-NN, I installed mpi4py and then ran python setup.py install in a clean conda environment with Python 3.7.
This procedure installed TensorFlow 1.15.* automatically through pip.

I had an error with ase, which I solved by downgrading ase to version 3.18.2 with pip.

To run the example, I loaded the cuda-10.1.243 and cudnn-7.6.5.32-10.1 modules, activated my conda environment, requested 2 GPUs with Slurm, and ran python run.py.
However, when I run nvidia-smi to check the GPU usage, I see that the job is not using the GPUs at all.

Is there anything else I need to do to enable the GPU?

Thanks!

Hello, @hmcezar. Here are a few things to check to track down your problem.

  1. Check your TensorFlow installation. TensorFlow comes in a CPU-only build and a GPU build. If your TensorFlow is the CPU-only build, the GPU is not used.
  2. Check which stage SIMPLE-NN is in. Training an NNP with SIMPLE-NN has two steps: generating the training dataset and training the neural network. The first step does not use the GPU. So check your log file and run nvidia-smi only after neural network training has started (that is, after a line of the form 'Iteration: ~~~' has appeared).
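
Both checks can be scripted. The following is only a minimal sketch, not part of SIMPLE-NN: the function names are hypothetical, and the log file name LOG in the usage part is an assumption, so replace it with whatever file your run writes. It uses tf.test.is_built_with_cuda() and device_lib.list_local_devices() from TensorFlow 1.x, and the 'Iteration:' marker mentioned in item 2. Run it inside the same Slurm allocation as the job, since the GPUs visible outside the allocation may differ.

```python
import importlib
import os


def tensorflow_gpu_report():
    """Describe whether the installed TensorFlow can see a GPU.

    Returns None if TensorFlow cannot be imported in this environment.
    """
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return None
    # device_lib enumerates the devices TensorFlow can actually use.
    from tensorflow.python.client import device_lib

    devices = device_lib.list_local_devices()
    return {
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [d.name for d in devices if d.device_type == "GPU"],
    }


def training_started(log_text):
    """Return True once a line containing 'Iteration:' is in the log text."""
    return any("Iteration:" in line for line in log_text.splitlines())


if __name__ == "__main__":
    print("TensorFlow report:", tensorflow_gpu_report())

    log_path = "LOG"  # assumed log file name; adjust to your run
    if os.path.exists(log_path):
        with open(log_path) as handle:
            print("Training started:", training_started(handle.read()))
    else:
        print("No log file found at", log_path)
```

If the report shows built_with_cuda as False or an empty gpus list, the problem is in item 1. If the GPUs are listed but training_started is still False, the run is most likely still in the dataset generation step, where no GPU usage is expected.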

If you have checked 1. and 2. and the GPU is still not used, please let me know.