bentoml/Yatai

builder OOM Killed

herunyu opened this issue · 7 comments

Hi there, I am trying to deploy a bento through yatai. However, it keeps giving me OOMKilled for the builder.

I have set the BentoML configuration as follows:
[screenshot: BentoML configuration]

But the resources of the image builder pod are not what I configured:
[screenshot: image builder pod resources]

It leads to the following error. Please take a look. Thank you!

[screenshot: OOMKilled error]

You can use the JSON editor to add resource limits to the image builder containers. If that is not possible, consider switching to a different image build engine, such as buildkit.

How to switch image build engine:

helm -n yatai-image-builder get values yatai-image-builder > ./values.yaml
helm -n yatai-image-builder upgrade yatai-image-builder bentoml/yatai-image-builder --values ./values.yaml --set bentoImageBuildEngine=buildkit

Somehow the JSON editor is not showing anything. Is that because our environment has no external internet access?
Anyway, we have switched the image build engine to buildkit and still hit the same OOM issue.

Also, if we delete the deployment through the web UI and then try to deploy the same model again, the image build fails immediately. I checked the log, and it shows the error from the previous deployment. Not sure whether this is a bug. We have to delete the BentoRequest for the bento before we can create a new deployment for the same model version.
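For anyone hitting the same stale build failure, here is a rough sketch of the workaround described above. It assumes the BentoRequest custom resource can be addressed with kubectl as `bentorequest`, and that you know the namespace and resource name; `reset_bento_request` and both arguments are hypothetical names used only for illustration.

```shell
# Sketch only: inspect and then delete a stale BentoRequest so that a new
# deployment of the same bento triggers a fresh image build.
# Assumptions: the resource name "bentorequest" resolves in your cluster, and
# the namespace/name arguments are supplied by you.
reset_bento_request() {
  namespace="$1"   # namespace that holds the BentoRequest
  name="$2"        # name of the BentoRequest to remove

  # Print the stored resource first, so the recorded failure can be reviewed.
  kubectl -n "$namespace" get bentorequest "$name" -o yaml

  # Remove it; the next deployment should create a new BentoRequest.
  kubectl -n "$namespace" delete bentorequest "$name"
}

# Usage (names are placeholders):
#   reset_bento_request my-namespace my-bento-request
```

Listing the resources first with `kubectl -n my-namespace get bentorequests` is a safe way to confirm the exact name before deleting anything.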

After changing the default memory LimitRange, we can control the resources of the image builder container. However, the next step was blocked by an image named "quay.io/bentoml/bentoml-proxy:0.0.1". Since our development environment has no external internet access, we pulled the image outside of it. But the repository of this bentoml-proxy image seems to be hard-coded, and we are not sure how to point it at our internal repository. It would be helpful if you could show us how. Thank you!
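For reference, a minimal sketch of the LimitRange workaround mentioned above. A Kubernetes LimitRange with `default` and `defaultRequest` entries sets memory limits and requests for containers that do not declare their own. The object name `image-builder-memory`, the helper name `write_builder_limitrange`, and the sizes are illustrative assumptions, and the namespace must be whichever one your image builder pods run in.

```shell
# Sketch only: print a LimitRange manifest that sets default memory
# request/limit values for containers without their own resource settings.
# The object name and the sizes passed in are illustrative assumptions.
write_builder_limitrange() {
  request_mem="$1"   # default memory request, e.g. 2Gi
  limit_mem="$2"     # default memory limit, e.g. 4Gi
  cat <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: image-builder-memory
spec:
  limits:
    - type: Container
      defaultRequest:
        memory: ${request_mem}
      default:
        memory: ${limit_mem}
EOF
}

# Usage (namespace is a placeholder for where the builder pods run):
#   write_builder_limitrange 2Gi 4Gi | kubectl -n my-builder-namespace apply -f -
```

Note that these defaults apply to every container in that namespace that has no explicit resources, not only to the image builder.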

@herunyu Thanks for your feedback. I just updated and released yatai-deployment and its Helm chart, so you can now specify a custom proxy image with the value linked below. Update the Helm repo, then upgrade the yatai-deployment Helm release to set this image:

https://github.com/bentoml/yatai-deployment/blob/6cdba8c036e1ff4a33086efe913485593d7bf2a0/helm/yatai-deployment/values.yaml#L114

@herunyu I can demonstrate how to do this update.

First, update the helm repo:

helm repo update bentoml

Then save the current values:

helm get values yatai-deployment -n yatai-deployment > /tmp/yatai-deployment-values.yaml

Finally, upgrade the release:

helm -n yatai-deployment upgrade yatai-deployment bentoml/yatai-deployment --values /tmp/yatai-deployment-values.yaml --set internalImages.proxy=${your proxy image here}
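In an environment without external internet access, the value passed to internalImages.proxy would be the mirrored copy of the proxy image. A small sketch of deriving that reference follows; `registry.internal.example` and the helper name `mirror_ref` are hypothetical, and the docker commands in the comments assume a machine that can reach both quay.io and your internal registry.

```shell
# Sketch only: rewrite an image reference so it points at an internal registry.
# mirror_ref is a hypothetical helper; it swaps the registry host (everything
# before the first "/") and keeps the repository path and tag unchanged.
mirror_ref() {
  src="$1"        # e.g. quay.io/bentoml/bentoml-proxy:0.0.1
  registry="$2"   # e.g. registry.internal.example (placeholder host)
  printf '%s/%s\n' "$registry" "${src#*/}"
}

# Usage (run the docker steps where both registries are reachable):
#   SRC=quay.io/bentoml/bentoml-proxy:0.0.1
#   DST="$(mirror_ref "$SRC" registry.internal.example)"
#   docker pull "$SRC"
#   docker tag "$SRC" "$DST"
#   docker push "$DST"
#   # then pass --set internalImages.proxy="$DST" to the helm upgrade
```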

@yetone Thank you! We will try this update and see if the problem is solved.

Hello!
I am running into a similar issue, and I could not find the JSON editor in the web UI.
Could you please enlighten me? Thank you very much.

It's actually quite easy to hit OOM with this setup, especially with Transformers models.
I spent several hours in the documentation but could not find anything about builder memory other than this page.