AILab-CVC/UniRepLKNet

unireplknet_s is slower than convnext_b

Closed this issue · 2 comments

Thank you very much for your work!
When I used the provided pre-trained model unireplknet_s_in22k_to_1k_384-acc86.44, I found that its inference speed was not faster than convnext_base.fb_in22k_ft_in1k_384. Although the speed improved after calling reparameterize_unireplknet(), the gain was still limited. I also used the faster large-kernel convolution implementation.
My input image size is 384×384 and I ran on an RTX 4090 GPU, but the result does not match what is described in the paper.
I would greatly appreciate it if you could respond to my question!
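For reproducibility, it may help to share how the latency was measured. Below is a minimal timing sketch; the `benchmark` helper is illustrative and not part of the UniRepLKNet repo, and the commented GPU usage assumes a PyTorch model. Note that CUDA kernels launch asynchronously, so a `torch.cuda.synchronize()` is needed before reading the clock.

```python
import time

def benchmark(fn, warmup=10, iters=50):
    """Time a callable: run warmup iterations first, then return
    the mean latency in milliseconds over the timed iterations."""
    for _ in range(warmup):
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters * 1000.0

# Hypothetical usage with a GPU model (requires torch, not run here):
#   model = model.cuda().eval()
#   model.reparameterize_unireplknet()  # fuse branches before timing
#   x = torch.randn(1, 3, 384, 384, device="cuda")
#   with torch.no_grad():
#       ms = benchmark(lambda: (model(x), torch.cuda.synchronize()))
```

Measuring both models this way, with the same batch size and after reparameterization, makes the comparison easier to interpret.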

Thanks for your interest! We measured inference speed on an NVIDIA A100 GPU, and we do not have an NVIDIA RTX 4090 to double-check these results. The relative inference speed seems to depend on the hardware; for example, the H100 GPU can further speed up Transformer workloads, which may make them faster than ConvNets.

Thanks for your reply! I will make more attempts.