Parskatt/DKM

Can DKM run on CPU only?

ducha-aiki opened this issue · 9 comments


I think this was brought up previously. The code currently has a bunch of `.cuda()` calls, but in principle (at a low enough resolution) it should work.

Like LoFTR and other works, DKM is quite computationally heavy at high resolution, but if you use (384, 512) or similar with fp16 you could probably get a reasonable inference time on CPU.
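A minimal sketch of what low-resolution CPU inference could look like. `model` here is just a stand-in module, not DKM's actual API, and bfloat16 autocast is used instead of fp16 because float16 kernel coverage on CPU is spotty:

```python
import torch

# Stand-in for the matcher; substitute the actual DKM model.
model = torch.nn.Conv2d(3, 8, 3, padding=1).eval()

# Low resolution keeps CPU inference time reasonable.
img = torch.rand(1, 3, 384, 512)

# bfloat16 autocast is the practical reduced-precision option on CPU.
with torch.inference_mode(), torch.autocast("cpu", dtype=torch.bfloat16):
    out = model(img)
```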

@Parskatt Thanks. My concern is not the `.cuda()` calls but the usage of cupy. Is there a fallback implementation of local correlation?

Aha, the cupy calls are actually there so that we can run PDCNet internally in our framework, and I forgot to remove that import. Our local-correlation implementation uses only native PyTorch operations.
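For reference, local correlation can be written with plain PyTorch ops along these lines. This is an unfold-based sketch; the function name and the normalization are my own choices, and DKM's actual implementation may differ:

```python
import torch
import torch.nn.functional as F

def local_correlation(feat0, feat1, radius=4):
    """Correlate each position of feat0 with a (2r+1)x(2r+1)
    neighborhood of feat1. Inputs are (B, C, H, W); the output
    is (B, (2r+1)**2, H, W). Out-of-bounds neighbors are zero-padded.
    """
    B, C, H, W = feat0.shape
    k = 2 * radius + 1
    # Extract the k*k neighborhood around every position of feat1.
    neigh = F.unfold(feat1, kernel_size=k, padding=radius)   # (B, C*k*k, H*W)
    neigh = neigh.view(B, C, k * k, H, W)
    # Dot product of the center feature with each neighbor.
    corr = (feat0.unsqueeze(2) * neigh).sum(dim=1)           # (B, k*k, H, W)
    return corr / C**0.5
```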

I should probably just push a fix removing that dependency to cause less confusion...

@ducha-aiki I've added a pull request for it; however, I'm very busy at the moment, so I'm not sure whether I broke something. I'll look into cleaning up the codebase in the coming weeks. Sorry for the mess.

@Parskatt thank you, that's great news! In particular, I am interested in integrating DKM into kornia, alongside LoFTR :)
https://github.com/kornia/kornia

And going back to the CPU/CUDA question: I will probably open a PR soon that allows running on the Apple M1 GPU (`torch.device('mps')`), as long as the cupy local correlation is not required.
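A device-selection sketch along those lines (the helper name is illustrative): prefer CUDA, fall back to Apple-silicon MPS, and finally CPU.

```python
import torch

def pick_device():
    """Return the best available torch device: CUDA > MPS > CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.backends.mps exists on recent PyTorch builds only.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
```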

Sounds great. I think we can also provide a "speedy" model running with AMP at a lower resolution; that should be nice for more real-time applications.

I have cleaned up the device handling here: #26

After the merge, we can do the kornia integration.