Can DKM run on CPU only?
ducha-aiki opened this issue · 9 comments
I think this issue was brought up previously. The code currently has a bunch of .cuda() calls, but in principle (at a low enough resolution) it should work.
Like LoFTR and other works, DKM is quite computationally heavy at high resolution, but if you use (384, 512) or similar with fp16 you could probably get a reasonable inference time on CPU.
@Parskatt Thanks. My concern is not the .cuda() calls, but the usage of cupy. Is there a fallback implementation of local correlation?
Aha, the cupy calls are actually there so we can run PDCNet internally in our framework, and I kind of forgot to remove that import. Our implementation just uses native PyTorch operations.
I should probably push a fix removing that dependency to cause less confusion...
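For anyone curious what a purely native-PyTorch local correlation looks like: here is a minimal sketch (not DKM's actual implementation; `local_correlation` and the normalization are my own illustration) that correlates each position in one feature map with a local window in the other using only `F.unfold` and broadcasting, so it runs on any device PyTorch supports.

```python
import torch
import torch.nn.functional as F

def local_correlation(feat_a, feat_b, radius=4):
    """Correlate each position of feat_a with a (2*radius+1)^2
    neighborhood of feat_b, using only native PyTorch ops.

    feat_a, feat_b: (B, C, H, W) feature maps.
    Returns: (B, (2*radius+1)**2, H, W) correlation volume.
    """
    b, c, h, w = feat_a.shape
    k = 2 * radius + 1
    # Extract the k*k neighborhood of feat_b around every position.
    neighbors = F.unfold(feat_b, kernel_size=k, padding=radius)  # (B, C*k*k, H*W)
    neighbors = neighbors.view(b, c, k * k, h, w)
    # Dot product over the channel dimension, scaled by sqrt(C).
    corr = (feat_a.unsqueeze(2) * neighbors).sum(dim=1)          # (B, k*k, H, W)
    return corr / c ** 0.5

feats_a = torch.randn(1, 64, 48, 64)
feats_b = torch.randn(1, 64, 48, 64)
corr = local_correlation(feats_a, feats_b, radius=4)
print(corr.shape)  # torch.Size([1, 81, 48, 64])
```

Note that `unfold` materializes the full neighborhood tensor, so memory grows with the window size; that is the usual trade-off versus a custom CUDA/cupy kernel.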
@ducha-aiki I've added a pull request for it; however, I'm very busy at the moment, so I can't rule out that I broke something. I'll look into cleaning up the codebase in the weeks to come. Sorry for the mess.
@Parskatt thank you, that's great news! In particular, I am interested in integrating DKM into kornia, alongside LoFTR :)
https://github.com/kornia/kornia
And going back to the CPU/CUDA question: I would probably do a PR soon allowing DKM to run on the Apple M1 GPU (torch.device('mps')), if local correlation is not required.
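Once the hard-coded .cuda() calls are gone, picking the backend can be a small helper like the sketch below (a hypothetical `pick_device` for illustration, not part of DKM), falling back from CUDA to MPS to CPU.

```python
import torch

def pick_device() -> torch.device:
    """Return the best available backend: CUDA, then Apple MPS, then CPU."""
    if torch.cuda.is_available():
        return torch.device("cuda")
    # torch.backends.mps only exists in PyTorch builds with MPS support.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")

device = pick_device()
print(device)
```

A model moved with `model.to(pick_device())` then runs unchanged on whichever backend is present, as long as every op it uses (e.g. the native local correlation) is supported there.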
Sounds great. I think we can provide a "speedy" model as well, running with AMP at a lower resolution; that should be nice for more real-time applications.
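The AMP-at-low-resolution idea sketched above amounts to wrapping the forward pass in an autocast context. A minimal illustration (using a stand-in `Conv2d` module rather than the real DKM model, and bfloat16 so it also runs on CPU):

```python
import torch

# Stand-in for the matcher; the real model takes a pair of images.
model = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1).eval()
img = torch.randn(1, 3, 384, 512)  # the lower resolution discussed above

# Autocast runs eligible ops (e.g. convolutions) in reduced precision;
# inference_mode skips autograd bookkeeping for extra speed.
with torch.inference_mode(), torch.autocast("cpu", dtype=torch.bfloat16):
    out = model(img)

print(out.shape, out.dtype)
```

On a CUDA device you would use `torch.autocast("cuda")` (fp16) instead; the call sites stay the same.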
I have cleaned up the devices here: #26
After the merge, we can do the kornia integration.