nanoporetech/rerio

crf model with a CPU device

lbal-biomat opened this issue · 5 comments

Hi,
I'm re-basecalling some data on a CPU-only server with Guppy 4.4.1, using the res_dna_r941_min_crf_v031 model provided here and 32 threads, but it's taking a very long time. Is a GPU needed for this model? If so, could you recommend the best model to use in a CPU-only environment?

Thanks

The res_dna_r941_min_crf_v031 model is currently the most compute-intensive of our models, and we would highly recommend using a high-end GPU until future releases of Guppy improve its performance.

There is a trade-off between basecalling speed and read accuracy: the "hac" model distributed with Guppy is slower but more accurate than the "fast" model. For severely compute-limited environments, reads could be filtered based on their calls with the fast model, and those of interest re-called with a higher-accuracy model.
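
As a rough illustration of the filtering step, here is a minimal Python sketch that picks read IDs out of the summary table written by a fast-model run. It assumes a tab-separated file with the columns `read_id`, `sequence_length_template` and `mean_qscore_template`; the function name and the thresholds are hypothetical and only there for illustration, so adjust them to whatever makes a read "of interest" for your project.

```python
import csv
import io


def select_read_ids(summary_file, min_length=1000, min_qscore=7.0):
    """Return the IDs of reads whose fast-model call passes both thresholds.

    summary_file is an open text file (or file-like object) holding a
    tab-separated table. The column names are assumptions about the
    summary layout, not something confirmed in this thread.
    """
    reader = csv.DictReader(summary_file, delimiter="\t")
    selected = []
    for row in reader:
        length = int(row["sequence_length_template"])
        qscore = float(row["mean_qscore_template"])
        # Keep only reads that are long enough and of sufficient mean quality.
        if length >= min_length and qscore >= min_qscore:
            selected.append(row["read_id"])
    return selected


if __name__ == "__main__":
    # Tiny made-up table standing in for a real summary file.
    demo = io.StringIO(
        "read_id\tsequence_length_template\tmean_qscore_template\n"
        "read-a\t5200\t9.1\n"
        "read-b\t300\t10.4\n"
        "read-c\t8100\t5.2\n"
    )
    for read_id in select_read_ids(demo):
        print(read_id)
```

Writing the selected IDs to a text file, one per line, gives a list that can drive the second, higher-accuracy pass. If your Guppy release offers a read ID list option (check `guppy_basecaller --help`), that file can be passed to it; otherwise the matching reads would need to be copied into a separate input directory first.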

Hi,
Thanks for your help; it was very useful. We are trying to push the accuracy even further, so we want to invest in a GPU, but I can't find any resource detailing the recommended GPU devices. Could you point me in the right direction? Is the Nvidia GeForce 30 series compatible with Guppy 4.4.X?

Thanks

ONT officially supports only the cards supplied with our hardware (Nvidia V100 and derivatives) and can't recommend anything else. Guppy can run on consumer graphics hardware with good performance -- perhaps your question is best directed to the Nanopore Community, who may be able to share success reports?

I tested this model on the free Google Colab GPU with the latest Guppy and it worked great. It saved me so much time. Here is a guide: https://gist.github.com/sirselim/13f70ae69f2a512e7d9e1f00f9704f53

Dorado basecalling can be done in CPU mode; however, we would recommend using a GPU for maximum performance.