JonnyS1226/ego4d_asl

How many V100 GPUs did you use? And how long was the training?

krkrkrrk opened this issue · 2 comments

Hello.

Thank you for publishing your project.

As the title says, I would like to know how many V100 GPUs you used and how long training took.

Since our method is based on pre-extracted features, it does not require many GPUs. In our experiments, we used only one V100 and trained for several hours.

Thank you for your response.

I am surprised that the difference in R@1 (tIoU=0.3) is only 1.54%, despite the significant difference in training time between GroundNLQ and ASL.

I suspect that the ensemble with NaQ greatly influences the accuracy. Could you please share the test results for ASL alone?