Opened this issue 5 years ago · 0 comments
You could reimplement the QKV / dense logic in terms of einsum for faster computation. An example layer here and the use here. This is how it is now implemented in the TF2 version of BERT / Transformer.
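To make the suggestion concrete, here is a minimal sketch of the idea, not the code from the linked layer. It uses NumPy in place of TensorFlow so it runs standalone, and the function names (`qkv_reshape`, `qkv_einsum`) and the shapes are assumptions chosen for illustration. It only shows that a single einsum gives the same per-head projection as the usual matmul + reshape + transpose; any speedup would have to be measured on the real model.

```python
import numpy as np


def qkv_reshape(x, w, num_heads):
    """Baseline (hypothetical): 2-D matmul, then reshape/transpose to split heads."""
    batch, seq, hidden = x.shape
    head_dim = w.shape[1] // num_heads
    y = x.reshape(batch * seq, hidden) @ w           # [B*S, N*H]
    y = y.reshape(batch, seq, num_heads, head_dim)   # [B, S, N, H]
    return y.transpose(0, 2, 1, 3)                   # [B, N, S, H]


def qkv_einsum(x, w, num_heads):
    """Same projection as one einsum over a 3-D kernel, with no explicit transpose."""
    hidden = x.shape[-1]
    head_dim = w.shape[1] // num_heads
    w3 = w.reshape(hidden, num_heads, head_dim)      # [D, N, H]
    # b=batch, s=sequence, d=hidden, n=heads, h=per-head size
    return np.einsum("bsd,dnh->bnsh", x, w3)         # [B, N, S, H]


# Small random inputs (assumed sizes): batch=2, seq=5, hidden=8, heads=2.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 5, 8))
w = rng.standard_normal((8, 8))

a = qkv_reshape(x, w, num_heads=2)
b = qkv_einsum(x, w, num_heads=2)
```

In TensorFlow the same subscript string should carry over to `tf.einsum`, with the kernel stored directly in its 3-D `[hidden, heads, head_dim]` shape so the reshape disappears as well.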