lavis-nlp/spert

High GPU usage during evaluation

avipartho opened this issue · 2 comments

Hi there, thanks a lot for sharing this awesome repo. I have one question about GPU usage. I have noticed that evaluation takes almost twice as much GPU memory as training, even when I use the same batch size for both. I understand that evaluation additionally performs span selection, but is that the only thing responsible for such a large memory footprint? Any comments on this would be really helpful. Thanks.

Update: it's because evaluation enumerates all possible spans, whereas training considers only the gold (positive) spans plus a fixed number of negative spans. This creates much bigger matrices, which occupy a lot more memory.
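
To get a rough feel for the size difference, here is a small back-of-the-envelope sketch. The sentence length, maximum span size, and the gold/negative counts are made-up numbers for illustration, not values read from the repo's config, and `count_candidate_spans` is a hypothetical helper.

```python
def count_candidate_spans(num_tokens: int, max_span_size: int) -> int:
    """Count contiguous spans of length 1..max_span_size in one sentence."""
    largest = min(max_span_size, num_tokens)
    return sum(num_tokens - size + 1 for size in range(1, largest + 1))


# Assumed values, for illustration only.
num_tokens = 50          # tokens in one sentence
max_span_size = 10       # longest span that is enumerated
gold_spans = 5           # positive spans in the sentence
negative_samples = 100   # fixed number of sampled negative spans

eval_spans = count_candidate_spans(num_tokens, max_span_size)  # 455
train_spans = gold_spans + negative_samples                    # 105

print(f"evaluation: {eval_spans} spans, training: {train_spans} spans")
```

With these assumed numbers, evaluation handles several times more spans per sentence than training, and the gap grows with sentence length, since the span count scales roughly with `num_tokens * max_span_size`.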

If GPU memory usage is an issue, you could probably optimize the inference code in some places. For example, instead of creating one big matrix containing every span, the spans could be processed in chunks. The spans classified as entity mentions in each chunk could then be collected and paired for relation extraction.
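
A minimal, framework-free sketch of that chunking idea is shown below. It is not the repo's inference code: `enumerate_spans`, `chunked`, `predict_entities_in_chunks`, `candidate_relation_pairs`, and the `classify_chunk` callback are all hypothetical names, and in practice `classify_chunk` would wrap the model's span classifier on the GPU.

```python
from itertools import permutations
from typing import Callable, Iterable, Iterator, List, Tuple

Span = Tuple[int, int]  # (start, end) token offsets, end exclusive


def enumerate_spans(num_tokens: int, max_span_size: int) -> Iterator[Span]:
    """Lazily yield every contiguous span up to max_span_size tokens."""
    for size in range(1, max_span_size + 1):
        for start in range(0, num_tokens - size + 1):
            yield (start, start + size)


def chunked(items: Iterable[Span], chunk_size: int) -> Iterator[List[Span]]:
    """Group an iterable into lists of at most chunk_size items."""
    chunk: List[Span] = []
    for item in items:
        chunk.append(item)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


def predict_entities_in_chunks(
    num_tokens: int,
    max_span_size: int,
    chunk_size: int,
    classify_chunk: Callable[[List[Span]], List[bool]],
) -> List[Span]:
    """Classify spans chunk by chunk and keep only predicted entity mentions."""
    kept: List[Span] = []
    for chunk in chunked(enumerate_spans(num_tokens, max_span_size), chunk_size):
        # Only chunk_size spans are materialized at a time.
        is_entity = classify_chunk(chunk)
        kept.extend(span for span, keep in zip(chunk, is_entity) if keep)
    return kept


def candidate_relation_pairs(entities: List[Span]) -> List[Tuple[Span, Span]]:
    """Pair the surviving entity spans (ordered pairs) for relation classification."""
    return list(permutations(entities, 2))


# Toy stand-in for the model: treat 2-token spans starting at an even index as entities.
def toy_classifier(chunk: List[Span]) -> List[bool]:
    return [(end - start) == 2 and start % 2 == 0 for start, end in chunk]


entities = predict_entities_in_chunks(6, 3, chunk_size=4, classify_chunk=toy_classifier)
pairs = candidate_relation_pairs(entities)
```

Peak memory for the entity step is then bounded by `chunk_size` rather than by the total number of spans. The relation step only sees the spans that survived entity classification, which is usually a much smaller set.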