SaschaDittmann/MLOps-Lab

Understand maxConcurrentRequestsPerContainer in deployment config


Hey,

I would like to understand why 'maxConcurrentRequestsPerContainer' is set to 2. In this Microsoft documentation, it is recommended to use 1. We are also implementing AML for a high volume of requests and are planning to increase this value along with 'maxQueueWaitMs'. Please let me know your thoughts on this, or whether you have any documentation on implementing high-volume applications or multi-container pods.
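To make the question concrete, here is a rough sizing sketch of how I think this setting interacts with the replica count. It assumes the relationship I understand from the autoscaling guidance (concurrent requests = target RPS × request time / target utilization, then divided by the per-container limit). The function name, the 0.7 utilization default, and the example numbers are my own placeholders, not values from this repo.

```python
import math


def replicas_needed(target_rps, request_time_s,
                    max_concurrent_per_container,
                    target_utilization=0.7):
    """Estimate the replica count for a given load (hypothetical helper).

    Assumed relationship, not taken from this repo:
      concurrent = target_rps * request_time_s / target_utilization
      replicas   = ceil(concurrent / max_concurrent_per_container)
    """
    if max_concurrent_per_container < 1:
        raise ValueError("max_concurrent_per_container must be >= 1")
    if not 0 < target_utilization <= 1:
        raise ValueError("target_utilization must be in (0, 1]")

    # Requests in flight at once, padded for headroom.
    concurrent = target_rps * request_time_s / target_utilization
    return math.ceil(concurrent / max_concurrent_per_container)


if __name__ == "__main__":
    # Placeholder load: 100 requests/s, 100 ms per request.
    for limit in (1, 2):
        print(limit, replicas_needed(100, 0.1, limit))
```

Under these assumptions, raising the limit from 1 to 2 roughly halves the replica estimate, but only if a single container can really serve two requests in parallel without slowing each one down. That is the part I am unsure about.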

Note: Please feel free to close this issue once you have read my question. Thanks.