jonathanreadshaw/ServingMLFastCelery

How to instantiate a single model instance shared by all threads

Emerald01 opened this issue · 0 comments

Thank you for this insightful example of hosting an ML model via Celery and FastAPI. In your blog, you mentioned that "There will be a PredictTask object for each worker process. Therefore if a worker has four threads then the model will be loaded four times (each is then stored in memory as well)". This is exactly my concern. Each thread would try to instantiate its own model instance, but if the model is very expensive to load, or some state needs to be shared by the calling threads, then we want a single model instance shared by all threads. How could we do this properly? Any suggestions?
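One common way to get the behavior asked about here is a lazily initialized, process-level singleton guarded by a lock, so the first thread to need the model loads it and every later thread reuses the same object. The sketch below is a hypothetical, generic version of the pattern (the class name `SharedModelTask` and the `loader` callable are illustrative, not from the repo); the same idea could be folded into the repo's `PredictTask` base class by storing the model on the class rather than the instance.

```python
import threading


class SharedModelTask:
    """Sketch of a per-process model singleton shared across threads.

    The model is stored as a class attribute, so all task instances
    (and all threads in the worker process) see the same object.
    Double-checked locking keeps the common path lock-free once the
    model is loaded, while guaranteeing the expensive load runs once.
    """

    _model = None
    _lock = threading.Lock()

    @classmethod
    def get_model(cls, loader):
        # Fast path: model already loaded, no locking needed.
        if cls._model is None:
            with cls._lock:
                # Re-check inside the lock: another thread may have
                # finished loading while we waited to acquire it.
                if cls._model is None:
                    cls._model = loader()  # expensive load, runs once
        return cls._model
```

Usage would look like `model = SharedModelTask.get_model(load_my_model)` at the top of each task invocation. Note this shares the model within one worker process only; separate worker processes (e.g. with the prefork pool) still each load their own copy, since they do not share memory.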