kundajelab/chrombpnet

Training chromBPNet takes too much time

lzj1769 opened this issue · 4 comments

Hi,

I am currently training a chromBPNet model on an ATAC-seq sample with ~200K peaks. However, one epoch takes ~12 hours.
See the screenshot below:
[Screenshot 2023-12-08 at 20 38 24]

So I want to ask how to make the training process faster.

Any ideas are appreciated.

Thanks,
Zhijian

Hi Anshul,

I think I found the problem.
For some reason, TensorFlow wasn't using the GPU properly. After I fixed this, everything looks great!

Thanks,
Zhijian

Hi @lzj1769, I am running into a similar performance problem. Training the model from the tutorial is very slow: there is still no output after more than 1 hour of execution time. How did you discover that TensorFlow wasn't using the GPU properly, and how did you fix it?

Hi @hermandebeukelaer

You can check whether TensorFlow is using the GPU by running nvidia-smi while training is in progress and looking at the GPU utilization and memory usage.
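
A complementary check from inside Python, as a minimal sketch assuming a TensorFlow 2.x install (the helper name `visible_gpus` is only for illustration). An empty list would suggest that TensorFlow cannot see any GPU and is falling back to the CPU:

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see, or None if TensorFlow is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # list_physical_devices("GPU") returns an empty list when no GPU is usable.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    print("GPUs visible to TensorFlow:", visible_gpus())
```

Run this in the same environment used for training, since a different environment may have a different TensorFlow build.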

To fix it, make sure the installed versions of CUDA, cuDNN, and TensorFlow are compatible with each other.
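
To see which versions are in play, something like the following may help. It is a rough sketch, assuming TensorFlow 2.3 or newer (where `tf.sysconfig.get_build_info()` is available). The helper name `tf_build_versions` is hypothetical. It reports the CUDA and cuDNN versions the TensorFlow binary was built against, which can then be compared with what is installed on the machine:

```python
def tf_build_versions():
    """Return TensorFlow's version and the CUDA/cuDNN versions it was built against.

    Returns None if TensorFlow is not installed.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None
    # Older TensorFlow releases may not provide get_build_info; fall back to an empty dict.
    get_info = getattr(tf.sysconfig, "get_build_info", None)
    info = dict(get_info()) if get_info is not None else {}
    return {
        "tensorflow": tf.__version__,
        "is_cuda_build": info.get("is_cuda_build"),
        "cuda": info.get("cuda_version"),    # None on CPU-only builds
        "cudnn": info.get("cudnn_version"),  # None on CPU-only builds
    }


if __name__ == "__main__":
    print(tf_build_versions())
```

If `is_cuda_build` is false or missing, the installed TensorFlow package likely has no GPU support at all, regardless of the CUDA version on the system.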