Default values not working
jackchuong opened this issue · 3 comments
Checks
- I have checked for existing issues.
- This report is about the User-Community Airflow Helm Chart.
Chart Version
latest
Kubernetes Version
Client Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.3", GitCommit:"9e644106593f3f4aa98f8a84b23db5fa378900bd", GitTreeState:"clean", BuildDate:"2023-03-15T13:40:17Z", GoVersion:"go1.19.7", Compiler:"gc", Platform:"linux/amd64"}
Kustomize Version: v4.5.7
Server Version: version.Info{Major:"1", Minor:"26", GitVersion:"v1.26.4", GitCommit:"f89670c3aa4059d6999cb42e23ccb4f0b9a03979", GitTreeState:"clean", BuildDate:"2023-04-12T12:05:35Z", GoVersion:"go1.19.8", Compiler:"gc", Platform:"linux/amd64"}
Helm Version
version.BuildInfo{Version:"v3.11.3", GitCommit:"323249351482b3bbfc9f5004f65d400aa70f9ae7", GitTreeState:"clean", GoVersion:"go1.20.3"}
Description
I tried installing the chart with:
helm install airflow airflow-stable/airflow -f values.yml
I also tried these sample values files:
- sample-values-CeleryExecutor.yaml
- sample-values-KubernetesExecutor.yaml
- sample-values-CeleryKubernetesExecutor.yaml
None of them work. Some pods keep restarting or are stuck with init errors, as shown below:
airflow-db-migrations-798dd7d674-kdxp7 0/1 Init:0/1 2 (41s ago) 110s
airflow-pgbouncer-5764575d7d-wjsrv 1/1 Running 0 110s
airflow-postgresql-0 0/1 Pending 0 110s
airflow-scheduler-64877fbb44-76nb9 0/2 Init:0/2 2 (41s ago) 110s
airflow-sync-users-648f4dfb96-vmcbp 0/1 Init:CrashLoopBackOff 2 (26s ago) 110s
airflow-triggerer-85b85c4d64-wl7ks 0/1 Init:0/2 2 (42s ago) 110s
airflow-web-645fccd75b-khcqq 0/1 Init:0/2 2 (41s ago) 111s
Please give me some advice, thank you very much.
Relevant Logs
Describe pod
Normal Scheduled 104s default-scheduler Successfully assigned namespace/airflow-db-migrations-84b7f87494-4hc42 to k8s-worker3
Warning FailedMount 102s kubelet MountVolume.SetUp failed for volume "scripts" : failed to sync secret cache: timed out waiting for the condition
Normal Pulled 73s (x3 over 101s) kubelet Container image "apache/airflow:2.5.3-python3.8" already present on machine
Normal Created 73s (x3 over 101s) kubelet Created container check-db
Normal Started 73s (x3 over 101s) kubelet Started container check-db
Warning BackOff 1s (x3 over 86s) kubelet Back-off restarting failed container check-db in pod airflow-db-migrations-84b7f87494-4hc42_namespace(a5363d98-30be-417c-8f81-671b8acb41b7)
Custom Helm Values
No response
@jackchuong it looks like your postgresql Pod is not starting. I bet it's because your Kubernetes cluster has no default StorageClass, so no volume can be provisioned for its PVC.
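One way to test that guess is to look for a StorageClass carrying the default-class annotation. The sketch below is only an illustration: the function names are made up for this example, and it assumes kubectl is on the PATH and pointed at the cluster in question.

```python
import json
import subprocess
from typing import List

# Annotation keys Kubernetes uses to mark a default StorageClass
# (the second one is the older beta form).
DEFAULT_ANNOTATIONS = (
    "storageclass.kubernetes.io/is-default-class",
    "storageclass.beta.kubernetes.io/is-default-class",
)


def default_storage_classes(storageclass_list: dict) -> List[str]:
    """Return names of StorageClasses annotated as default.

    `storageclass_list` is the parsed output of
    `kubectl get storageclass -o json`.
    """
    names = []
    for item in storageclass_list.get("items", []):
        meta = item.get("metadata", {})
        annotations = meta.get("annotations") or {}
        if any(annotations.get(key) == "true" for key in DEFAULT_ANNOTATIONS):
            names.append(meta.get("name", ""))
    return names


def fetch_storage_classes() -> dict:
    """Hypothetical helper: ask kubectl for all StorageClasses as JSON."""
    result = subprocess.run(
        ["kubectl", "get", "storageclass", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)
```

Calling `default_storage_classes(fetch_storage_classes())` against the cluster returns an empty list when no class is marked as default, which would be consistent with airflow-postgresql-0 staying in Pending.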
@thesuperzapper thanks for your reply. I checked again, and the reason is this:
My first install of Airflow with Helm failed, and the command helm uninstall airflow didn't clean everything up completely. Some PVCs (Pending) and a secret remained. After I deleted them manually, the chart's values work fine with the 2 PVCs I created for postgresql and logs.
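To spot leftovers like these before reinstalling, the PVCs and Secrets in the namespace can be listed and filtered by release name. This is a rough sketch, not part of the chart: the function name is hypothetical, matching on the release name as a substring is a simplifying assumption, and the input is assumed to be the parsed output of `kubectl get pvc,secret -o json`.

```python
from typing import List, Tuple


def leftover_resources(resource_list: dict, release: str) -> List[Tuple[str, str, str]]:
    """List (kind, name, phase) for objects whose name contains `release`.

    Secrets have no status phase, so their phase is reported as "".
    This only lists candidates; it never deletes anything.
    """
    found = []
    for item in resource_list.get("items", []):
        name = item.get("metadata", {}).get("name", "")
        if release in name:
            phase = item.get("status", {}).get("phase", "")
            found.append((item.get("kind", ""), name, phase))
    return found
```

The result is only a list of candidates to review by hand. A substring match on "airflow" would also pick up objects created manually, such as a Git SSH secret, so nothing should be deleted without checking each entry first.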
However, I hit another problem when trying to configure the git-sync sidecar for DAGs in sample-values-CeleryKubernetesExecutor.yaml:
dags:
  ## the airflow dags folder
  path: /opt/airflow/dags
  ## configs for the dags PVC
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/dags/load-dag-definitions.md
  persistence:
    enabled: false
  ## configs for the git-sync sidecar
  ## [FAQ] https://github.com/airflow-helm/charts/blob/main/charts/airflow/docs/faq/dags/load-dag-definitions.md
  gitSync:
    enabled: true
    repo: "git@gitlab.mydomain.com/myuser/airflow-dags.git"
    repoSubPath: ""
    branch: main
    revision: HEAD
    depth: 1
    syncWait: 60
    syncTimeout: 120
    submodules: recursive
    sshSecret: "airflow-ssh-git-secret"
    sshSecretKey: "id_rsa"
    sshKnownHosts: ""
    maxFailures: 0
gitlab.mydomain.com is an on-premise GitLab CE instance, and it's working normally. I created the repo myuser/airflow-dags and added the SSH public key to the myuser profile.
I also created the secret airflow-ssh-git-secret, which contains the SSH private key:
kubectl describe secret/airflow-ssh-git-secret
Name: airflow-ssh-git-secret
Namespace:
Labels: <none>
Annotations: <none>
Type: Opaque
Data
====
id_rsa: 1834 bytes
The status of the pods after helm install:
kubectl get pod
NAME READY STATUS RESTARTS AGE
airflow-db-migrations-5bc7cbf67b-vtlpx 0/2 Init:CrashLoopBackOff 7 (3m13s ago) 14m
airflow-flower-85cfcf5c47-f9clm 0/2 Init:CrashLoopBackOff 7 (3m18s ago) 14m
airflow-pgbouncer-5f5944d598-cdlk9 1/1 Running 0 14m
airflow-postgresql-0 1/1 Running 0 14m
airflow-redis-master-0 1/1 Running 0 14m
airflow-scheduler-d8fdbf78d-hqt8k 0/2 Init:CrashLoopBackOff 7 (3m10s ago) 14m
airflow-sync-users-857b457984-ntsc2 0/2 Init:CrashLoopBackOff 7 (3m18s ago) 14m
airflow-triggerer-bcb998dd7-8shhl 0/2 Init:CrashLoopBackOff 7 (3m17s ago) 14m
airflow-web-6f8cf6c875-fnh8w 0/2 Init:CrashLoopBackOff 7 (3m13s ago) 14m
airflow-worker-0 0/2 Init:CrashLoopBackOff 7 (3m16s ago) 14m
spark-master-0 1/1 Running 0 27h
spark-worker-0 1/1 Running 0 27h
spark-worker-1 1/1 Running 0 27h
So I guess something is wrong with pod airflow-sync-users-857b457984-ntsc2?
kubectl describe pod airflow-sync-users-857b457984-ntsc2
...
Init Containers:
dags-git-clone:
Container ID: docker://0101410f3b17e17bf7490aacc5e5ccebd5a571def1f270912d156567b4b64a1a
Image: registry.k8s.io/git-sync/git-sync:v3.6.5
Image ID: docker-pullable://registry.k8s.io/git-sync/git-sync@sha256:7231f6c2284758b91caed71e4e596413df31ac4467de9b596dc6b386b82f624f
Port: <none>
Host Port: <none>
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Sat, 01 Jul 2023 17:57:14 +0700
Finished: Sat, 01 Jul 2023 17:57:14 +0700
Ready: False
Restart Count: 3
Environment Variables from:
airflow-config-envs Secret Optional: false
Environment:
GIT_SYNC_ONE_TIME: true
GIT_SYNC_ROOT: /dags
GIT_SYNC_DEST: repo
GIT_SYNC_REPO: git@gitlab.mydomain.com/myuser/airflow-dags.git
GIT_SYNC_BRANCH: main
GIT_SYNC_REV: HEAD
GIT_SYNC_DEPTH: 1
GIT_SYNC_WAIT: 60
GIT_SYNC_TIMEOUT: 120
GIT_SYNC_ADD_USER: true
GIT_SYNC_MAX_SYNC_FAILURES: 0
GIT_SYNC_SUBMODULES: recursive
GIT_SYNC_SSH: true
GIT_SSH_KEY_FILE: /etc/git-secret/id_rsa
GIT_KNOWN_HOSTS: false
DATABASE_USER: postgres
DATABASE_PASSWORD: <set to the key 'postgresql-password' in secret 'airflow-postgresql'> Optional: false
REDIS_PASSWORD: <set to the key 'redis-password' in secret 'airflow-redis'> Optional: false
CONNECTION_CHECK_MAX_COUNT: 0
Mounts:
/dags from dags-data (rw)
/etc/git-secret/id_rsa from git-secret (ro,path="id_rsa")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-s2wtx (ro)
check-db:
Container ID:
Image: apache/airflow:2.5.3-python3.8
Image ID:
Port: <none>
Host Port: <none>
Command:
/usr/bin/dumb-init
--
/entrypoint
Args:
bash
-c
exec timeout 60s airflow db check
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment Variables from:
airflow-config-envs Secret Optional: false
Environment:
DATABASE_USER: postgres
DATABASE_PASSWORD: <set to the key 'postgresql-password' in secret 'airflow-postgresql'> Optional: false
REDIS_PASSWORD: <set to the key 'redis-password' in secret 'airflow-redis'> Optional: false
CONNECTION_CHECK_MAX_COUNT: 0
Mounts:
/opt/airflow/dags from dags-data (rw)
/opt/airflow/logs from logs-data (rw,path="airflow-logs")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-s2wtx (ro)
wait-for-db-migrations:
Container ID:
Image: apache/airflow:2.5.3-python3.8
Image ID:
Port: <none>
Host Port: <none>
Command:
/usr/bin/dumb-init
--
/entrypoint
Args:
bash
-c
exec airflow db check-migrations -t 60
State: Waiting
Reason: PodInitializing
Ready: False
Restart Count: 0
Environment Variables from:
airflow-config-envs Secret Optional: false
Environment:
DATABASE_USER: postgres
DATABASE_PASSWORD: <set to the key 'postgresql-password' in secret 'airflow-postgresql'> Optional: false
REDIS_PASSWORD: <set to the key 'redis-password' in secret 'airflow-redis'> Optional: false
CONNECTION_CHECK_MAX_COUNT: 0
Mounts:
/opt/airflow/dags from dags-data (rw)
/opt/airflow/logs from logs-data (rw,path="airflow-logs")
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-s2wtx (ro)
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 74s default-scheduler Successfully assigned namespace/airflow-sync-users-857b457984-ntsc2 to k8s-worker2
Normal Pulled 27s (x4 over 71s) kubelet Container image "registry.k8s.io/git-sync/git-sync:v3.6.5" already present on machine
Normal Created 27s (x4 over 71s) kubelet Created container dags-git-clone
Normal Started 26s (x4 over 71s) kubelet Started container dags-git-clone
Warning BackOff 14s (x6 over 70s) kubelet Back-off restarting failed container dags-git-clone in pod airflow-sync-users-857b457984-ntsc2_namespace(f515be0b
Does it not work with an on-premise Git repo? Or did I do something wrong?
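One detail that can be checked without cluster access is the form of the repo value. Git's scp-like SSH syntax is user@host:path, with a colon between host and path, while the value above has a slash in that position. Nothing in this thread confirms that this is why dags-git-clone exits, so the sketch below is only a sanity check. The function name is hypothetical and the regular expression is a simplified approximation of how Git classifies remotes.

```python
import re

# Simplified approximation: optional "user@", a host with no slash,
# then ":" followed by a path that does not start with "//".
SCP_LIKE = re.compile(r"^(?:[^@/:\s]+@)?[^@/:\s]+:(?!//)\S+$")
URL_SCHEME = re.compile(r"^(ssh|git|https?|file)://")


def classify_git_remote(repo: str) -> str:
    """Roughly classify a remote string as 'url', 'scp-like' or 'local-path'."""
    if URL_SCHEME.match(repo):
        return "url"
    if SCP_LIKE.match(repo):
        return "scp-like"
    # No scheme and no "host:path" form, so Git would treat it as a path.
    return "local-path"


if __name__ == "__main__":
    for candidate in (
        "git@gitlab.mydomain.com/myuser/airflow-dags.git",
        "git@gitlab.mydomain.com:myuser/airflow-dags.git",
        "ssh://git@gitlab.mydomain.com/myuser/airflow-dags.git",
    ):
        print(candidate, "->", classify_git_remote(candidate))
```

Under this check, the value from the values file is classified as a local path, while the colon form and the ssh:// form are recognised as SSH remotes. The init container logs remain the authoritative source for the actual failure.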
@jackchuong you need to look at the logs for one of those failing pods; it is not possible to know what's failing otherwise.
BTW, I highly recommend k9s, a CLI tool for managing Kubernetes clusters. It makes it very easy to view things like pod logs (you just press "L" while highlighting a pod to see its logs).
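With plain kubectl, the same logs can be reached by naming the failing init container explicitly. The following is a minimal sketch with made-up helper names: it assumes the pod JSON comes from `kubectl get pod <name> -o json`, picks out init containers that are crash-looping or exited non-zero, and builds the matching `kubectl logs` command.

```python
from typing import List


def failing_init_containers(pod: dict) -> List[str]:
    """Return names of init containers that are crash-looping or exited non-zero."""
    failing = []
    for status in pod.get("status", {}).get("initContainerStatuses", []):
        state = status.get("state", {})
        last_state = status.get("lastState", {})
        waiting_reason = state.get("waiting", {}).get("reason", "")
        exit_codes = (
            state.get("terminated", {}).get("exitCode", 0),
            last_state.get("terminated", {}).get("exitCode", 0),
        )
        if waiting_reason == "CrashLoopBackOff" or any(code != 0 for code in exit_codes):
            failing.append(status.get("name", ""))
    return failing


def logs_command(pod_name: str, container: str, previous: bool = True) -> List[str]:
    """Build a `kubectl logs` command for one container of a pod.

    `--previous` asks for the logs of the last terminated run, which is
    usually what is wanted while the container sits in back-off.
    """
    cmd = ["kubectl", "logs", pod_name, "-c", container]
    if previous:
        cmd.append("--previous")
    return cmd
```

For the pod described above, this would presumably single out dags-git-clone and lead to `kubectl logs airflow-sync-users-857b457984-ntsc2 -c dags-git-clone --previous`. The other init containers are still in PodInitializing and have not run yet, so they have no logs to show.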