alexisrozhkov/dilated-self-attention

Q: CPU Multicore?


I see no allowance for CPU multicore execution in either the current implementation or the roadmap.

Did I miss something?

PS: I recognize this may be viewed as low priority given the current state of The Hardware Lottery. Nevertheless, the algorithm is memory intensive, CPU memory is 10x cheaper than TPU memory, and the paper seems to point toward the potential for grassroots, distributed, screensaver-style training of models competitive with the largest LLMs.