Currently the attention over the heads runs serially:
fastGPT/gpt2.f90, line 101 (commit 01eb84b)
We should try to parallelize it and see if we get any speedups.
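
Since each head reads and writes a disjoint slice of the arrays, the loop iterations are independent, which makes the loop a natural candidate for an `!$omp parallel do`. Below is a minimal self-contained sketch of the idea; all names here (`n_head`, `head_dim`, `attention_head`, etc.) are hypothetical stand-ins, not the actual identifiers in `gpt2.f90`, and causal masking is omitted for brevity:

```fortran
! Sketch only: parallelize a per-head attention-style loop with OpenMP.
! Names are hypothetical stand-ins for the loop at gpt2.f90:101.
program parallel_heads
implicit none
integer, parameter :: dp = kind(0.d0)
integer, parameter :: n_head = 12, head_dim = 64, n_seq = 8
real(dp) :: q(n_head*head_dim, n_seq), k(n_head*head_dim, n_seq), &
            v(n_head*head_dim, n_seq), y(n_head*head_dim, n_seq)
integer :: i, lo, hi
call random_number(q); call random_number(k); call random_number(v)
! Each head touches only its own row-slice [lo:hi], so the iterations
! carry no dependencies and can safely run in parallel.
!$omp parallel do default(none) shared(q, k, v, y) private(i, lo, hi)
do i = 1, n_head
    lo = (i-1)*head_dim + 1
    hi = i*head_dim
    y(lo:hi, :) = attention_head(q(lo:hi, :), k(lo:hi, :), v(lo:hi, :))
end do
!$omp end parallel do
print *, "y(1,1) =", y(1, 1)
contains
    ! Placeholder for the real per-head attention; it must be PURE
    ! (no shared mutable state) for the parallel loop to be correct.
    pure function attention_head(q, k, v) result(r)
    real(dp), intent(in) :: q(:,:), k(:,:), v(:,:)
    real(dp) :: r(size(q,1), size(q,2))
    ! Columns are tokens: scores = softmax(K^T Q / sqrt(d)), r = V * scores.
    r = matmul(v, softmax_cols(matmul(transpose(k), q) / sqrt(real(size(q,1), dp))))
    end function
    pure function softmax_cols(a) result(s)
    real(dp), intent(in) :: a(:,:)
    real(dp) :: s(size(a,1), size(a,2))
    integer :: j
    do j = 1, size(a, 2)
        ! Subtract the column max for numerical stability.
        s(:,j) = exp(a(:,j) - maxval(a(:,j)))
        s(:,j) = s(:,j) / sum(s(:,j))
    end do
    end function
end program
```

Compile with OpenMP enabled (e.g. `gfortran -fopenmp`) and control the thread count via `OMP_NUM_THREADS`. A `do concurrent` variant would be another option to benchmark, since the iterations are independent. Whether this actually wins likely depends on `n_head` relative to the core count and on whether the underlying matmuls are already threaded by a BLAS backend.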