Mesh TensorFlow CPU Inference
pablogranolabar opened this issue · 2 comments
pablogranolabar commented
If your implementation is based on Mesh TensorFlow which natively supports CPU inference, why wouldn't a multi-CPU mesh work for GPT-Neo inference if enough memory is available per CPU node (say 10GB)?
StellaAthena commented
It probably would, but we have had no need to use it and therefore no motivation to test or implement it. If you open a PR with this feature I'll review it.
pablogranolabar commented
Hi Stella,
My thought is that if inference can be parallelized on CPU via Mesh TensorFlow, GPT-Neo would be an ideal candidate for low-cost microservice inference endpoints, which could be substantially cheaper than GPU inference. I'll open a PR with the details.
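To make the cost argument concrete, here is a back-of-the-envelope comparison. Every number below (instance prices, throughputs) is an illustrative assumption, not a measurement of GPT-Neo:

```python
# Back-of-the-envelope cost comparison. All inputs are assumed
# placeholder values, not benchmarks.
gpu_hourly_cost = 3.00   # assumed: one datacenter GPU instance, $/hr
cpu_hourly_cost = 0.40   # assumed: one ~10 GB-RAM CPU node, $/hr

gpu_tokens_per_hour = 1_000_000  # assumed GPU inference throughput
cpu_tokens_per_hour = 200_000    # assumed single-CPU-node throughput

# Cost per million generated tokens on each platform.
gpu_cost_per_mtok = gpu_hourly_cost / (gpu_tokens_per_hour / 1e6)
cpu_cost_per_mtok = cpu_hourly_cost / (cpu_tokens_per_hour / 1e6)

print(f"GPU: ${gpu_cost_per_mtok:.2f} per 1M tokens")
print(f"CPU: ${cpu_cost_per_mtok:.2f} per 1M tokens")
```

Note the savings scale linearly with the price/throughput ratio of the two platforms, so whether CPU wins depends entirely on the actual numbers measured.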