facebookresearch/FFCV-SSL

FFCV accrues massive overhead with heavy transform pipelines

akashc1 opened this issue · 1 comments

Hi,

I've noticed that FFCV can deliver nice speedups for individual steps. For context:

  • I have lots of images of shape 256 x 256 x 3
  • I use many augmentations, some applied more than once, since I'm using an SSL method that relies on them for performance
  • My baseline __getitem__ call does the following:
    • Read a JPEG image from an NFS mount
    • Decode the JPEG
    • Perform augmentations

What I'm noticing with FFCV:

  • Dataloading on its own is 3-4x faster (reading from disk + JPEG decoding)
  • Transforms on their own are 3-4x faster after optimization on my end (due to using numba or other tricks)
  • End-to-end (E2E) dataloading with FFCV, however, is 2-3x slower than my baseline.
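
A minimal sketch of how a per-stage versus end-to-end comparison like the one above could be measured for a plain Python loading path. The names `StageTimer` and `time_pipeline`, and the `read`, `decode`, and `augment` callables, are hypothetical placeholders and not part of FFCV or of my actual code; the idea is only that the gap between the end-to-end total and the sum of the stage totals gives a rough estimate of the overhead outside the stages themselves.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates total wall-clock time and call counts per named stage."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record elapsed time even if the stage raises.
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def report(self):
        """Return total seconds spent in each stage."""
        return dict(self.totals)


def time_pipeline(samples, read, decode, augment, timer):
    """Run every sample through read -> decode -> augment, timing each stage
    as well as the whole loop (hypothetical stand-ins for the real steps)."""
    with timer.stage("end_to_end"):
        for sample in samples:
            with timer.stage("read"):
                raw = read(sample)
            with timer.stage("decode"):
                image = decode(raw)
            with timer.stage("augment"):
                augment(image)
    return timer.report()
```

Usage would look like `time_pipeline(paths, read_fn, decode_fn, augment_fn, StageTimer())`, where the three functions are whatever the baseline `__getitem__` already calls. This only instruments a plain Python path; it does not see inside FFCV's own pipeline.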

I profiled it quite a bit, and the slowdown seems to be due to FFCV overhead. I can see in htop that even when using something like 16 dataloader workers, most CPU cores are actually idle.
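
One thing that may be worth ruling out when cores sit idle is whether the process is allowed to run on fewer cores than the machine has (for example, because of a job scheduler or container limit). This is only an assumption about a possible cause, not a diagnosis of the issue above. The sketch below uses the standard library only; `usable_cpus` and `report_cpu_budget` are hypothetical helper names.

```python
import os


def usable_cpus():
    """Return how many CPU cores this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):
        # Available on Linux; reflects affinity masks set by schedulers/containers.
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without affinity support.
    return os.cpu_count() or 1


def report_cpu_budget(requested_workers):
    """Compare the requested worker count against the cores actually usable."""
    usable = usable_cpus()
    total = os.cpu_count() or usable
    return {
        "total": total,
        "usable": usable,
        "requested_workers": requested_workers,
        "oversubscribed": requested_workers > usable,
    }


if __name__ == "__main__":
    print(report_cpu_budget(16))
```

If `usable` is much smaller than `total`, the workers may be sharing a handful of cores while the rest stay idle.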

Has anybody else experienced something similar, or does anyone have tips on debugging or addressing this?

Thank you!

Thanks for opening this issue. Can you define what you mean by "lots of images" and "heavy transform pipelines"? Is it 1B images and 20 transforms, or just 1M images with 4 transforms? In addition, it would help if you could share a snippet of your code so we can see whether this is reproducible on our side.