awslabs/multi-model-server

How to achieve autoscaling when running MMS on Fargate?

sunilkumarmohanty opened this issue · 1 comment

Hi,

I would like to autoscale my model workers based on the number of requests they receive, but I cannot find any documentation on this. Could somebody please help me configure autoscaling?

I am running MMS on Fargate with autoscaling enabled at the task level, based on CPU. However, I don't know how to manage the scaling of model workers inside a task.

Br,
Sunil

I'm facing the same issue!
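
For anyone landing here: as far as I can tell, MMS does not scale workers on request load by itself, but its management API lets you change the worker count of a registered model with a PUT to /models/{model_name} and a min_worker (optionally max_worker) query parameter. Below is a minimal sketch of calling it from Python. The helper names, the model name "squeezenet" and the base URL are my own assumptions for illustration. I am also assuming the management API is on its default port 8081 and that the script runs inside the same task (for example in a sidecar container), since by default that port is only reachable on localhost.

```python
import urllib.parse
import urllib.request


def scale_workers_url(base_url, model_name, min_worker, max_worker=None):
    """Build the management API URL that sets a model's worker count.

    Hypothetical helper; only the PUT /models/{model_name} endpoint and the
    min_worker / max_worker parameters come from the MMS management API.
    """
    params = {"min_worker": min_worker}
    if max_worker is not None:
        params["max_worker"] = max_worker
    query = urllib.parse.urlencode(params)
    name = urllib.parse.quote(model_name, safe="")
    return f"{base_url.rstrip('/')}/models/{name}?{query}"


def scale_workers(base_url, model_name, min_worker, max_worker=None, timeout=10):
    """Send the PUT request and return the HTTP status code."""
    url = scale_workers_url(base_url, model_name, min_worker, max_worker)
    request = urllib.request.Request(url, method="PUT")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Example call (assumed base URL and model name, needs a running MMS):
# scale_workers("http://localhost:8081", "squeezenet", min_worker=2, max_worker=4)
```

This only sets the worker count. Deciding when to call it (for example from a small loop that watches request metrics) is still up to you, and task-level CPU autoscaling on Fargate keeps working independently of it.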