microsoft/fastseq

EL-attention GPT-2

Opened this issue · 2 comments

Hi,
The EL-Attention paper mentions a GPT-2 implementation with a 1.8x speedup. Will it be released publicly?

Same question.