Shared activation diff
jyhjinghwang opened this issue · 0 comments
jyhjinghwang commented
Implement a shared data holder for activation gradients so that they reuse the same GPU memory and the overall footprint is reduced.
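The issue does not describe a design, so the following is only a minimal sketch of one way such a holder could work. It assumes a purely sequential network, uses NumPy arrays on the CPU as a stand-in for GPU memory, and the name `SharedDiffPool` and its methods are hypothetical, not part of any existing API. Two buffers sized for the largest activation are alternated between adjacent layers, so a layer's input and output gradients never overlap while it runs its backward step.

```python
import numpy as np


class SharedDiffPool:
    """Hypothetical holder: two flat buffers reused for every activation gradient."""

    def __init__(self, shapes, dtype=np.float32):
        # Size both buffers for the largest activation in the network.
        self.capacity = max(int(np.prod(s)) for s in shapes)
        self._buffers = [np.empty(self.capacity, dtype=dtype) for _ in range(2)]

    def view(self, layer_index, shape):
        """Return an array of `shape` backed by the buffer assigned to this layer."""
        size = int(np.prod(shape))
        if size > self.capacity:
            raise ValueError("shape exceeds pool capacity")
        # Adjacent layers alternate buffers, so the gradient being read and
        # the gradient being written never alias each other.
        return self._buffers[layer_index % 2][:size].reshape(shape)

    def nbytes(self):
        """Total memory held by the pool, in bytes."""
        return sum(b.nbytes for b in self._buffers)


# Assumed example activation shapes for a small sequential network.
shapes = [(8, 64), (8, 128), (8, 128), (8, 64), (8, 32), (8, 10)]
pool = SharedDiffPool(shapes)

# Memory needed if every activation kept its own float32 gradient buffer.
unshared_bytes = sum(int(np.prod(s)) * 4 for s in shapes)
```

With these assumed shapes the pool holds less memory than one gradient buffer per activation would. The trade-off is that a gradient is overwritten two layers later, so anything that needs it afterwards (for example, a branch or skip connection) would require a different allocation scheme.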