JJGO/hyperlight

Batch processing when the main architecture is an RNN

Opened this issue · 2 comments

Hello,
Thank you for open-sourcing this repository!
If the main architecture is an RNN, how should I implement batch processing?

JJGO commented

Could you elaborate on what you mean by batch processing? By default, in hyperlight, every sample in a batch shares the same hypernetwork input and therefore the same generated weights. To use multiple sets of weights within the same batch, the recommended approach is gradient accumulation: run one forward/backward pass per hypernetwork input and step the optimizer once all gradients have been accumulated.
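For illustration, here is a rough sketch of what gradient accumulation over several hypernetwork inputs could look like in plain PyTorch. It does not use the hyperlight API. `HyperNet`, `rnn_forward`, and `accumulate_step` are hypothetical names made up for this example, and the hand-written Elman RNN stands in for whatever main architecture you actually use. Each group has its own conditioning vector, so it gets its own generated weights, and the optimizer steps only once after all groups have contributed gradients.

```python
import torch
import torch.nn as nn


def rnn_forward(x, w_ih, w_hh, w_out):
    """Elman RNN over x of shape (batch, time, in_dim); returns (batch, out_dim)."""
    h = x.new_zeros(x.shape[0], w_hh.shape[0])
    for t in range(x.shape[1]):
        h = torch.tanh(x[:, t] @ w_ih.T + h @ w_hh.T)
    return h @ w_out.T


class HyperNet(nn.Module):
    """Hypothetical hypernetwork: maps one conditioning vector to all RNN weights."""

    def __init__(self, cond_dim, in_dim, hidden, out_dim):
        super().__init__()
        self.shapes = [(hidden, in_dim), (hidden, hidden), (out_dim, hidden)]
        total = sum(r * c for r, c in self.shapes)
        self.net = nn.Sequential(
            nn.Linear(cond_dim, 16), nn.ReLU(), nn.Linear(16, total)
        )

    def forward(self, cond):
        flat = self.net(cond)
        weights, i = [], 0
        for r, c in self.shapes:
            # Slice the flat output into one weight matrix per RNN parameter.
            weights.append(flat[i:i + r * c].view(r, c))
            i += r * c
        return weights


def accumulate_step(hyper, optimizer, groups):
    """One optimizer step over several (cond, x, y) groups, each with its own weights."""
    optimizer.zero_grad()
    total = 0.0
    for cond, x, y in groups:
        w_ih, w_hh, w_out = hyper(cond)  # weights specific to this group
        loss = nn.functional.mse_loss(rnn_forward(x, w_ih, w_hh, w_out), y)
        # Scale so the accumulated gradient matches the mean loss over groups.
        (loss / len(groups)).backward()
        total += loss.item()
    optimizer.step()  # single update after all groups have been processed
    return total / len(groups)


torch.manual_seed(0)
hyper = HyperNet(cond_dim=2, in_dim=3, hidden=4, out_dim=1)
opt = torch.optim.SGD(hyper.parameters(), lr=0.1)
# Three groups, each with its own conditioning vector and a batch of 5 sequences.
groups = [
    (torch.randn(2), torch.randn(5, 6, 3), torch.randn(5, 1)) for _ in range(3)
]
mean_loss = accumulate_step(hyper, opt, groups)
```

Within each group the samples are still processed as an ordinary batch; only the groups themselves are looped over. Dividing each loss by the number of groups keeps the update equivalent to averaging the loss over all groups in one pass.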

Thank you for your reply! I have since resolved my question about batch processing.