Noble-Lab/casanovo

Version 4 is more accurate but much slower than version 3?

irleader opened this issue · 4 comments

Hi,

I benchmarked on the same data with v4.1.0 and v3.5.0.

I can see some improvement in peptide recall with the same MassIVE-KB trained model, while inference is much slower (same machine, GPU, and beam number). Does the same happen on your side?

If not, is it because I did not configure the environment for Casanovo v4.1.0 correctly? I see a message like this:
"You are using a CUDA device ('NVIDIA GeForce RTX 3080 Ti Laptop GPU') that has Tensor Cores. To properly utilize them, you should set torch.set_float32_matmul_precision('medium' | 'high') which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision"

Best regards

Normally there should not be a significant performance reduction between v4 and v3. There was a non-negligible slowdown when beam search was introduced, but that already dates from v3.2.0.

Nevertheless, performance is indeed an important point of attention. We are optimizing the beam search decoding code, which is the most time-consuming step, but this is currently still work in progress (#269).
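To give an intuition for why the beam width matters so much, here is a purely conceptual toy sketch in Python (this is not Casanovo's actual decoder, just an illustration): at every decoding step, each live beam is extended with every candidate token and scored, so the work per step grows roughly linearly with the number of beams.

import heapq

def toy_beam_search(score_next, vocab, max_len, n_beams):
    # Each beam is a (cumulative score, token sequence) pair.
    beams = [(0.0, ())]
    for _ in range(max_len):
        candidates = []
        for cum_score, seq in beams:
            # n_beams * len(vocab) scoring calls per decoding step.
            for token in vocab:
                candidates.append((cum_score + score_next(seq, token), seq + (token,)))
        # Keep only the n_beams best hypotheses for the next step.
        beams = heapq.nlargest(n_beams, candidates, key=lambda c: c[0])
    return beams

# Toy usage: 5 beams over a 20-token vocabulary and 10 decoding steps.
best = toy_beam_search(lambda seq, tok: -abs(tok - 3), vocab=range(20), max_len=10, n_beams=5)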

The notification is just for informational purposes and now occurs because of newer versions of the PyTorch and PyTorch Lightning dependencies, but it doesn't have an impact on performance.
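For reference, the setting that the message refers to is a single PyTorch call. As far as I'm aware it isn't exposed as a Casanovo configuration option, so the sketch below is only relevant if you run the model from your own Python code; it trades a bit of float32 matmul precision for Tensor Core speed.

import torch

# Allow faster TensorFloat-32-style float32 matmuls on GPUs with Tensor Cores.
# Valid options are "highest" (the default), "high", or "medium".
torch.set_float32_matmul_precision("high")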

Can you give an estimate of the slowdown you're experiencing? Is this for inference or for training? How many spectra are you processing, and how long does it take?

With the same 743 spectra, a beam number of 5, a prediction batch size of 512, and the same machine, GPU, and pretrained model:

v4.1.0:
Predicting DataLoader 0: 100%|███████████████████████████████████████████████████████████| 2/2 [05:24<00:00,  0.01it/s]

v3.5.0:
Predicting DataLoader 0: 100%|██████████████████████████████████████████████████████████| 2/2 [04:02<00:00, 121.11s/it]

Best regards

I didn't observe any significant difference in speed when I ran v4.1.0 and v3.5.0 with the same configurations (5 beams) on the same set of 14,257 spectra on this Colab GPU runtime.

v4.1.0:
Predicting DataLoader 0: 100% 14/14 [29:52<00:00, 128.04s/it]

v3.5.0:
Predicting DataLoader 0: 100% 14/14 [29:47<00:00, 127.67s/it]

So, because of the small number of spectra in the first test, fluctuations in the start-up time might dominate the total runtime. There doesn't seem to be a regression leading to a significant slowdown in v4. Nevertheless, computational efficiency is something we're actively investigating, and hopefully we'll be able to release some speed-ups soon.
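To make that concrete, here is a quick back-of-the-envelope calculation using only the timings reported above (rough numbers; the two machines differ, so only the within-machine comparison is meaningful):

# Wall-clock totals from the progress bars above, in seconds.
small = {"v4.1.0": 5 * 60 + 24, "v3.5.0": 4 * 60 + 2}      # 743 spectra, 2 batches
large = {"v4.1.0": 29 * 60 + 52, "v3.5.0": 29 * 60 + 47}   # 14,257 spectra, 14 batches

# The 82 s gap in the small run is spread over only 2 batches, so a
# fluctuation in fixed start-up cost (model loading, CUDA init) can easily
# explain it; the 14-batch runs agree to within a few seconds.
print("small-run gap:", small["v4.1.0"] - small["v3.5.0"], "s")
print("large-run gap:", large["v4.1.0"] - large["v3.5.0"], "s")
for label, runs, n in [("743 spectra", small, 743), ("14,257 spectra", large, 14_257)]:
    for version, seconds in runs.items():
        print(f"{label}, {version}: {n / seconds:.2f} spectra/s")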