microsoft/fastseq

Where to read EL-Attention source code for huggingface-transformers

ADaBenxiong opened this issue · 4 comments

We are very interested in your work, and thank you for it. We have read your paper "EL-Attention". Where can more comprehensive examples for huggingface-transformers be found? In huggingface-transformers, self-attention caches the key and value, not only the hidden_states, while EL-Attention shows that caching only the hidden_states can cut the memory in half.
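
To make the memory argument concrete, here is a rough back-of-the-envelope sketch; the sizes, dtype, and variable names below are illustrative assumptions, not values taken from either repository.

```python
# A minimal sketch of why caching only hidden_states is cheaper than a key/value cache.
import torch

batch, src_len, d_model, num_layers = 8, 1024, 1024, 12
hidden_states = torch.randn(batch, src_len, d_model)
bytes_per_tensor = hidden_states.numel() * hidden_states.element_size()

# Self-attention: each layer normally caches its own key and value (2 tensors);
# caching that layer's hidden_states instead needs 1 tensor, i.e. half the memory.
self_attn_kv = 2 * bytes_per_tensor
self_attn_el = 1 * bytes_per_tensor

# Cross-attention: every decoder layer caches key and value projected from the same
# encoder output, while EL-Attention keeps just one shared hidden_states tensor.
cross_attn_kv = num_layers * 2 * bytes_per_tensor
cross_attn_el = 1 * bytes_per_tensor

print(f"self-attention, per layer : {self_attn_kv / 2**20:.0f} MiB -> {self_attn_el / 2**20:.0f} MiB")
print(f"cross-attention, all layers: {cross_attn_kv / 2**20:.0f} MiB -> {cross_attn_el / 2**20:.0f} MiB")
```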

Hello, thanks for releasing the source code. I have read your implementation for fairseq (https://github.com/microsoft/fastseq/blob/main/fastseq/optimizer/fairseq/el_attention_optimizer.py).
I find that EL-Attention is implemented for cross-attention, while self-attention is not changed much; I am not sure whether I understand this correctly. From the paper, I see that GPT-2, which has no cross-attention, can be sped up as well.
Thanks a lot.

The code I pasted here is EL-Attention for cross-attention; a similar change could be applied to self-attention.
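
For readers landing here, this is a self-contained sketch of the EL-Attention trick for cross-attention, written against plain tensors rather than the actual fastseq/fairseq classes; the function name, single-sequence layout, and weight convention (`x @ w`) are assumptions made for illustration.

```python
# Sketch: cross-attention computed from hidden_states only, without building K/V.
import math
import torch

def el_cross_attention(query, hidden_states, w_q, w_k, w_v, w_o, num_heads):
    """query: [tgt_len, d_model]; hidden_states: [src_len, d_model] (the only cached tensor).
    w_q / w_k / w_v / w_o: [d_model, d_model] projections shared with standard attention."""
    tgt_len, d_model = query.shape
    d_head = d_model // num_heads

    # Per-head query and per-head slices of the key/value projection matrices.
    q = (query @ w_q).view(tgt_len, num_heads, d_head).transpose(0, 1)   # [h, tgt, d_head]
    w_k_heads = w_k.view(d_model, num_heads, d_head).permute(1, 0, 2)    # [h, d_model, d_head]
    w_v_heads = w_v.view(d_model, num_heads, d_head).permute(1, 0, 2)    # [h, d_model, d_head]

    # Fold the key projection into the query ("expanded query"):
    #   q_i @ (H @ W_k_i)^T == (q_i @ W_k_i^T) @ H^T, so keys are never materialized.
    q_expanded = torch.einsum("htd,hmd->htm", q, w_k_heads)              # [h, tgt, d_model]
    scores = q_expanded @ hidden_states.t() / math.sqrt(d_head)          # [h, tgt, src]
    probs = scores.softmax(dim=-1)

    # Attend over hidden_states directly, then apply the value projection afterwards:
    #   softmax(...) @ (H @ W_v_i) == (softmax(...) @ H) @ W_v_i, so values are never cached.
    context = probs @ hidden_states                                      # [h, tgt, d_model]
    heads = torch.einsum("htm,hmd->htd", context, w_v_heads)             # [h, tgt, d_head]
    return heads.transpose(0, 1).reshape(tgt_len, d_model) @ w_o         # [tgt, d_model]
```

Because only hidden_states appear on the key/value side, the encoder output can be cached once and reused by every decoder layer and every beam, which is where the cross-attention saving comes from; the same algebra would apply to incremental self-attention by caching a layer's past hidden_states instead of its keys and values, which is how a decoder-only model such as GPT-2 could benefit.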

After reading your code, I now understand the specific operation. Thank you for your work and your careful explanation.