bigscience-workshop/petals

Question about overlapped serving blocks

jeremyzhangsq opened this issue · 0 comments

Consider a case where a pre-trained model is hosted on only three servers: the first hosts blocks 1-4, the second hosts blocks 2-64, and the third hosts blocks 32-128.

I want to know whether the overlap between the served block ranges (blocks 2-4 and 32-64 are each hosted by two servers) will affect a client's fine-tuning or inference.
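To make the scenario concrete, here is a minimal sketch that counts how many servers host each block. It is only bookkeeping for the layout described above, not how Petals itself picks servers. The server names, the `coverage` helper, and the total of 128 blocks (taken from the highest block mentioned) are assumptions for illustration.

```python
# Hypothetical layout from the question: server name -> inclusive (first, last) block range.
servers = {
    "server_1": (1, 4),
    "server_2": (2, 64),
    "server_3": (32, 128),
}
TOTAL_BLOCKS = 128  # assumed total, inferred from the highest block mentioned


def coverage(servers, total_blocks):
    """Return {block: number of servers hosting it} for blocks 1..total_blocks."""
    counts = {block: 0 for block in range(1, total_blocks + 1)}
    for first, last in servers.values():
        for block in range(first, last + 1):
            counts[block] += 1
    return counts


counts = coverage(servers, TOTAL_BLOCKS)
missing = [b for b, n in counts.items() if n == 0]     # blocks no server hosts
overlapped = [b for b, n in counts.items() if n > 1]   # blocks hosted by 2+ servers
single = [b for b, n in counts.items() if n == 1]      # blocks with no redundancy

print("missing:", missing)
print("overlapped count:", len(overlapped))
print("single-host count:", len(single))
```

In this layout every block is hosted by at least one server, so the question is only about the blocks that have more than one host.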

Thanks.