Image builder failed due to MountVolume.SetUp failed for volume "yatai-regcred" and "kube-api-access"
tamle511 opened this issue · 4 comments
Hello,
I'm trying to deploy a model to our K8S cluster. I've followed the official installation guide and set up yatai, yatai-image-builder and yatai-deployment successfully.
So far I've created a model and pushed it to Yatai with bentoml. But now, when trying to create a deployment (using the Yatai UI), I'm stuck at the image builder step because the builder pod cannot be created.
Logs from Yatai:
[2023-01-16 16:41:08] [BentoDeployment] [test-onnx] [Reconciling] Starting to reconcile BentoDeployment
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [CheckingImage] Checking image exists: x.x.x.x:5000/yatai-bentos:yatai.test-onnx.0.0.1
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [CheckingImage] Image not exists: x.x.x.x:5000/yatai-bentos:yatai.test-onnx.0.0.1
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [GenerateImageBuilderPod] Making sure docker config secret yatai-regcred in namespace yatai
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [GenerateImageBuilderPod] Docker config secret yatai-regcred in namespace yatai is ready
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [GenerateImageBuilderPod] Generating image builder pod: yatai-bento-image-builder-test-onnx--0-0-1
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [GenerateImageBuilderPod] Getting bento test-onnx:0.0.1 from yatai service
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [GenerateImageBuilderPod] Got bento test-onnx:0.0.1 from yatai service
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [GenerateImageBuilderPod] Getting secret yatai-api-token in namespace yatai
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [GenerateImageBuilderPod] Secret yatai-api-token is found in namespace yatai, so updating it
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [GenerateImageBuilderPod] Secret yatai-api-token is updated in namespace yatai
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [GenerateImageBuilderPod] Getting model test-onnx:mj2hs6ut7w6udjex from yatai service
[2023-01-16 16:41:08] [BentoRequest] [test-onnx--0-0-1] [GenerateImageBuilderPod] (combined from similar events): Created image builder pod: yatai-bento-image-builder-test-onnx--0-0-1
[2023-01-16 16:41:15] [BentoRequest] [test-onnx--0-0-1] [ReconcileError] Failed to reconcile BentoRequest: image builder pod yatai-bento-image-builder-test-onnx--0-0-1 status is Failed
Pod status:
xxx@xxx:~/bentoml/yatai/helm$ kubectl get po -n yatai
NAME                                         READY   STATUS       RESTARTS   AGE
yatai-bento-image-builder-test-onnx--0-0-1   0/1     Init:Error   0          90s
Describe pod:
Events:
Type     Reason       Age                From               Message
----     ------       ----               ----               -------
Normal   Scheduled    21s                default-scheduler  Successfully assigned yatai/yatai-bento-image-builder-test-onnx--0-0-1 to node-01
Normal   Pulled       20s                kubelet            Container image "quay.io/bentoml/bento-downloader:0.0.1" already present on machine
Normal   Created      20s                kubelet            Created container bento-downloader
Normal   Started      20s                kubelet            Started container bento-downloader
Warning  FailedMount  19s (x2 over 20s)  kubelet            MountVolume.SetUp failed for volume "kube-api-access-xdddb" : object "yatai"/"kube-root-ca.crt" not registered
Warning  FailedMount  19s (x2 over 20s)  kubelet            MountVolume.SetUp failed for volume "yatai-regcred" : object "yatai"/"yatai-regcred" not registered
I've confirmed that kube-api-access-xdddb and yatai-regcred indeed exist, so I am not sure why it says the objects were not registered.
xxx@xxx:~$ kubectl get secret -n yatai
NAME                  TYPE                                  DATA   AGE
default-token-tmczp   kubernetes.io/service-account-token   3      47h
yatai-api-token       Opaque                                1      102m
yatai-regcred         kubernetes.io/dockerconfigjson        1      142m
xxx@xxx:~$ kubectl get cm -n yatai
NAME               DATA   AGE
kube-root-ca.crt   1      47h
Kubernetes version: 1.22.8.
Could somebody please help? Thank you!
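For anyone reading the events above: the name in `MountVolume.SetUp failed for volume "..."` is the pod volume, while the `object "namespace"/"name"` part names the Secret or ConfigMap behind it. A minimal sketch for pulling those object names out of the event text, assuming the output of `kubectl describe pod` is piped in on stdin (the function name `not_registered_objects` is made up for illustration):

```shell
# Extract the namespace/object pairs named in kubelet "not registered" events.
# Reads `kubectl describe pod` output (or any saved event text) on stdin.
not_registered_objects() {
  grep -o 'object "[^"]*"/"[^"]*" not registered' \
    | sed -E 's/^object "([^"]*)"\/"([^"]*)" not registered$/\1\/\2/' \
    | sort -u
}
```

Usage would look like `kubectl describe pod -n yatai yatai-bento-image-builder-test-onnx--0-0-1 | not_registered_objects`. For the two FailedMount events above it prints `yatai/kube-root-ca.crt` and `yatai/yatai-regcred`, which are the objects to look for with `kubectl get cm` and `kubectl get secret`.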
Thanks for the report! I found a related issue in the official k8s repo, and your k8s version is within the range of versions affected by that issue.
Thanks @yetone. I'm not sure yet whether our k8s version is indeed the issue, since I do not have the authority to upgrade our k8s cluster; perhaps I will try that later as a last resort.
Anyway, after further debugging I've found the following error in the bento-downloader container:
xxx@xxx:~/bentoml/yatai/helm$ kubectl logs -f -n yatai yatai-bento-image-builder-test-onnx--0-0-1 bento-downloader
Downloading bento test-onnx:0.0.1 tar file from http://yatai.yatai-system.svc.cluster.local/api/v1/bento_repositories/test-onnx/bentos/0.0.1/download to /tmp/downloaded.tar...
curl: (22) The requested URL returned error: 500
However, the logs from the yatai pod didn't really show anything related to the error. There were some warnings, but I assume they come from periodic checks. I also tried calling the API manually from another pod, but it still didn't trigger any log messages.
xxx@xxx:~$ kubectl logs -f -n yatai-system yatai-6c564d66f5-q44pt yatai --since 5m
INFO[236524] listing unsynced deployments cron="sync env"
INFO[236524] updating unsynced deployments syncing_at cron="sync env"
INFO[236524] updated unsynced deployments syncing_at cron="sync env"
INFO[236524] syncing unsynced app deployment deployments... cron="sync env"
INFO[236524] synced unsynced app deployment deployments... cron="sync env"
W0117 04:23:35.090872 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:23:35.090920 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
W0117 04:23:43.873038 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:23:43.873078 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
ERRO[236558] ws read failed: "websocket: close 1005 (no status)"
ERRO[236558] ws read failed: "websocket: close 1005 (no status)"
ERRO[236558] ws read failed: "websocket: close 1005 (no status)"
ERRO[236576] ws read failed: "websocket: close 1005 (no status)"
W0117 04:24:22.328051 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:24:22.328089 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
W0117 04:24:22.833320 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:24:22.833370 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
ERRO[236589] ws read failed: "websocket: close 1005 (no status)"
INFO[236614] listing unsynced deployments cron="sync env"
INFO[236614] updating unsynced deployments syncing_at cron="sync env"
INFO[236614] updated unsynced deployments syncing_at cron="sync env"
INFO[236614] syncing unsynced app deployment deployments... cron="sync env"
INFO[236614] synced unsynced app deployment deployments... cron="sync env"
W0117 04:25:02.089642 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:25:02.089689 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
W0117 04:25:14.048729 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:25:14.048766 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
W0117 04:25:49.155607 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:25:49.155654 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
W0117 04:26:01.329572 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:26:01.329626 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
INFO[236704] listing unsynced deployments cron="sync env"
INFO[236704] updating unsynced deployments syncing_at cron="sync env"
INFO[236704] updated unsynced deployments syncing_at cron="sync env"
INFO[236704] syncing unsynced app deployment deployments... cron="sync env"
INFO[236704] synced unsynced app deployment deployments... cron="sync env"
W0117 04:26:38.024209 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:26:38.024250 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
W0117 04:26:38.326888 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:26:38.326928 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
W0117 04:27:11.523173 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:27:11.523229 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
W0117 04:27:35.613764 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:27:35.613817 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
W0117 04:27:42.165695 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:27:42.165739 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
INFO[236794] listing unsynced deployments cron="sync env"
INFO[236794] updating unsynced deployments syncing_at cron="sync env"
INFO[236794] updated unsynced deployments syncing_at cron="sync env"
INFO[236794] syncing unsynced app deployment deployments... cron="sync env"
INFO[236794] synced unsynced app deployment deployments... cron="sync env"
W0117 04:28:07.787353 1 reflector.go:424] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
E0117 04:28:07.787393 1 reflector.go:140] pkg/mod/k8s.io/client-go@v0.25.4/tools/cache/reflector.go:169: Failed to watch *v1.Pod: failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:yatai-system:yatai" cannot list resource "pods" in API group "" in the namespace "yatai-builders"
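Since the log above is dominated by repeated reflector warnings, it can help to collapse them into distinct messages with a count. A minimal sketch, assuming the log has been piped in or saved to a file (the function name `summarize_forbidden` is made up for illustration):

```shell
# Collapse repeated RBAC "forbidden" messages from client-go reflector
# log lines into one line per distinct message, prefixed with a count.
summarize_forbidden() {
  grep -o 'User "[^"]*" cannot [a-z]* resource "[^"]*" in API group "[^"]*" in the namespace "[^"]*"' \
    | sort | uniq -c \
    | sed -E 's/^ *([0-9]+) /\1x /'
}
```

For example, `kubectl logs -n yatai-system yatai-6c564d66f5-q44pt yatai --since 5m | summarize_forbidden` would reduce the warnings above to a single line about listing pods in the yatai-builders namespace. Whether that permission is really missing could then be checked with `kubectl auth can-i list pods --as=system:serviceaccount:yatai-system:yatai -n yatai-builders`.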
Can you check the output of this command?
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: test
  namespace: yatai
spec:
  containers:
  - command:
    - sh
    - -c
    - 'curl -H "X-YATAI-API-TOKEN: yatai-image-builder:default:\$(YATAI_API_TOKEN)" "http://yatai.yatai-system.svc.cluster.local/api/v1/bento_repositories/test-onnx/bentos/0.0.1/download"'
    envFrom:
    - secretRef:
        name: yatai-api-token
    image: curlimages/curl
    name: bento-downloader
EOF
sleep 5
kubectl -n yatai logs -f test
Thank you. It turns out the minio endpoint was incorrect, so the bento could not be downloaded. I fixed the endpoint, re-pushed the bento, and it works now. I'm not sure why the first push still succeeded even though the endpoint was wrong. Anyway, thank you for your support!
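For anyone hitting the same 500 on download, a quick reachability check of the object store endpoint may save some time. The following is only a sketch: it assumes a MinIO server, whose liveness path is `/minio/health/live`, and the function name and the endpoint URL in the usage line are hypothetical placeholders, not values from this setup.

```shell
# Probe a MinIO endpoint's liveness path; returns 0 only on a 2xx response.
# The endpoint URL is supplied by the caller.
check_minio_endpoint() {
  endpoint="${1%/}"   # strip one trailing slash, if any
  if curl -fsS --max-time 5 -o /dev/null "${endpoint}/minio/health/live"; then
    echo "reachable: ${endpoint}"
  else
    echo "NOT reachable: ${endpoint}" >&2
    return 1
  fi
}
```

Example call: `check_minio_endpoint "http://minio.example.internal:9000"`. It is probably best run from a pod inside the cluster, since that is where Yatai resolves the endpoint. A passing check only shows the server is up; it does not validate credentials or the bucket.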