Turoad/CLRNet

Multi-GPU training question

Hi, thank you for your contributions. I have a question about multi-GPU training.

I encountered an error when training on multiple GPUs.

Here is an example of the input passed to the Gather function:
input = [tensor(1.7459, device='cuda:0'), tensor(1.3624, device='cuda:1'), 0. , tensor(1.6084, device='cuda:3'), tensor(2.6911, device='cuda:4'), tensor(2.0529, device='cuda:5'), tensor(1.7303, device='cuda:6'), tensor(1.6768, device='cuda:7')]

The problem is that the third entry, 0., is a plain Python float rather than a CUDA tensor. The Gather function contains this assert statement:

assert all(i.device.type != 'cpu' for i in inputs), (
    'Gather function not implemented for CPU tensors'
)

The float 0. cannot pass this check, so the gather step always fails with an error.
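To see which replica returned a non-tensor value, a small diagnostic along these lines might help. It is only a sketch: `find_non_tensor_outputs` and `fake_tensor` are hypothetical names, not part of CLRNet or PyTorch, and the stand-in objects exist only so the snippet runs without GPUs. The check relies on the fact that a plain Python float has no `.device` attribute, while a tensor does.

```python
from types import SimpleNamespace


def find_non_tensor_outputs(outputs):
    """Return (index, value) pairs for entries that have no `.device` attribute.

    Hypothetical helper: a tensor exposes `.device`, a plain float does not.
    """
    return [(i, v) for i, v in enumerate(outputs) if not hasattr(v, "device")]


def fake_tensor(value, index):
    # Stand-in for a per-GPU loss tensor (assumption: real code would hold
    # torch.Tensor objects here, which also expose `.device.type`).
    return SimpleNamespace(value=value, device=SimpleNamespace(type="cuda", index=index))


# Shortened version of the list from the report: the third entry is a float.
outputs = [fake_tensor(1.7459, 0), fake_tensor(1.3624, 1), 0.0, fake_tensor(1.6084, 3)]

bad = find_non_tensor_outputs(outputs)
print(bad)  # [(2, 0.0)]
```

If an entry shows up here, one possible fix (an assumption about the cause, not confirmed by the repo) is to make the loss code on that replica return a zero-valued tensor on its own device, for example via `torch.zeros((), device=...)`, instead of the literal `0.`.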

Does this repo support multi-GPU training? If so, how should I fix this error? Thanks.

Hello, I think this repo does not support multi-GPU training; the author stated this in #18.

Yes, but that answer was posted in May. I'm wondering if there has been any progress on multi-GPU training since then.