Helm Release Job NotReady Status
sp71 opened this issue · 1 comment
sp71 commented
Preflight checklist
- I could not find a solution in the existing issues, docs, nor discussions.
- I agree to follow this project's Code of Conduct.
- I have read and am following this repository's Contribution Guidelines.
- I have joined the Ory Community Slack.
- I am signed up to the Ory Security Patch Newsletter.
Ory Network Project
No response
Describe the bug
When bringing up Keto via Terraform using the helm_release resource with autoMigration enabled, the job's pod is always set to NotReady, even though the logs from the job's pod indicate that the migrations were applied correctly. I verified that the database had all the changes committed to it. Any ideas why the job's pod is always set to NotReady? I am using the Cloud SQL proxy as a sidecar container.
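For reference, this is roughly how I check which container is holding the job's pod out of Ready (a minimal sketch; the `app.kubernetes.io/name=keto` label selector and the default namespace are assumptions based on the chart's defaults, so adjust them to your release):

```bash
# List pods created by the release, including the migration job's pod
kubectl get pods -l app.kubernetes.io/name=keto

# Print each container's name and ready flag for the job's pod, which shows
# whether the keto container or the cloud-sql-proxy sidecar is the one
# reported as not ready
kubectl get pod <job-pod-name> \
  -o jsonpath='{range .status.containerStatuses[*]}{.name}{" ready="}{.ready}{"\n"}{end}'

# Full status and events for the pod
kubectl describe pod <job-pod-name>
```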
Reproducing the bug
Steps to reproduce the behavior:
- Apply terraform
- See keto's job pod status set to NotReady
Relevant log output
Job pod logs
time=2023-09-10T12:12:40Z level=error msg=Unable to ping the database connection, retrying. audience=application error=map[message:failed to connect to `host=127.0.0.1 user=postgres database=`: dial error (dial tcp 127.0.0.1:5432: connect: connection refused)] service_name=Ory Keto service_version=v0.11.1-alpha.0
[POP] 2023/09/10 12:12:47 warn - One or more of connection details are specified in database.yml. Override them with values in URL.
time=2023-09-10T12:12:47Z level=info msg=No tracer configured - skipping tracing setup audience=application service_name=Ory Keto service_version=v0.11.1-alpha.0
Current status:
Version Name Status
20150100000001000000 networks Pending
20201110175414000000 relationtuple Pending
20201110175414000001 relationtuple Pending
20210623162417000000 relationtuple Pending
20210623162417000001 relationtuple Pending
20210623162417000002 relationtuple Pending
20210623162417000003 relationtuple Pending
20210914134624000000 legacy-cleanup Pending
20220217152313000000 nid_fk Pending
20220512151000000000 indices Pending
20220513200300000000 create-intermediary-uuid-table Pending
20220513200400000000 create-uuid-mapping-table Pending
20220513200400000001 uuid-mapping-remove-check Pending
20220513200500000000 migrate-strings-to-uuids Pending
20220513200600000000 drop-old-non-uuid-table Pending
20220513200600000001 drop-old-non-uuid-table Pending
20230228091200000000 add-on-delete-cascade-to-relationship Pending
Applying migrations...
Successfully applied all migrations:
Version Name Status
20150100000001000000 networks Applied
20201110175414000000 relationtuple Applied
20201110175414000001 relationtuple Applied
20210623162417000000 relationtuple Applied
20210623162417000001 relationtuple Applied
20210623162417000002 relationtuple Applied
20210623162417000003 relationtuple Applied
20210914134624000000 legacy-cleanup Applied
20220217152313000000 nid_fk Applied
20220512151000000000 indices Applied
20220513200300000000 create-intermediary-uuid-table Applied
20220513200400000000 create-uuid-mapping-table Applied
20220513200400000001 uuid-mapping-remove-check Applied
20220513200500000000 migrate-strings-to-uuids Applied
20220513200600000000 drop-old-non-uuid-table Applied
20220513200600000001 drop-old-non-uuid-table Applied
20230228091200000000 add-on-delete-cascade-to-relationship Applied
Relevant configuration
```hcl
resource "helm_release" "keto" {
  name       = "ory"
  repository = "https://k8s.ory.sh/helm/charts"
  chart      = "keto"

  values = [
    <<EOT
serviceAccount:
  create: false
  name: ${module.service_account.value.id}
job:
  serviceAccount:
    create: false
    name: ${module.service_account.value.id}
  extraContainers: |
    - name: cloud-sql-proxy
      image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.6.1
      imagePullPolicy: Always
      args:
        - "--structured-logs"
        - "--health-check"
        - "--http-address=0.0.0.0"
        - "--port=${local.sql_port}"
        - "--private-ip"
        - ${var.project_id}:${var.default_region}:${module.sql_db.name}
      securityContext:
        runAsNonRoot: true
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
      livenessProbe:
        httpGet:
          path: /liveness
          port: 9090
        initialDelaySeconds: 0
        periodSeconds: 10
        timeoutSeconds: 5
        failureThreshold: 2
      readinessProbe:
        httpGet:
          path: /readiness
          port: 9090
        initialDelaySeconds: 0
        periodSeconds: 10
        timeoutSeconds: 5
        successThreshold: 1
        failureThreshold: 2
      startupProbe:
        httpGet:
          path: /startup
          port: 9090
        periodSeconds: 1
        timeoutSeconds: 5
        failureThreshold: 20
      resources:
        requests:
          memory: 128Mi
          cpu: 50m
        limits:
          memory: 512Mi
          cpu: 250m
keto:
  automigration:
    enabled: true
  config:
    dsn: postgres://${local.db_username}:${random_password.password.result}@127.0.0.1:${local.sql_port}
deployment:
  extraContainers: |
    - name: cloud-sql-proxy
      image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.6.1
      imagePullPolicy: Always
      args:
        - "--structured-logs"
        - "--health-check"
        - "--http-address=0.0.0.0"
        - "--port=${local.sql_port}"
        - "--private-ip"
        - ${var.project_id}:${var.default_region}:${module.sql_db.name}
      securityContext:
        runAsNonRoot: true
        readOnlyRootFilesystem: true
        allowPrivilegeEscalation: false
        capabilities:
          drop:
            - ALL
      livenessProbe:
        httpGet:
          path: /liveness
          port: 9090
        initialDelaySeconds: 0
        periodSeconds: 10
        timeoutSeconds: 5
        failureThreshold: 2
      readinessProbe:
        httpGet:
          path: /readiness
          port: 9090
        initialDelaySeconds: 0
        periodSeconds: 10
        timeoutSeconds: 5
        successThreshold: 1
        failureThreshold: 2
      startupProbe:
        httpGet:
          path: /startup
          port: 9090
        periodSeconds: 1
        timeoutSeconds: 5
        failureThreshold: 20
      resources:
        requests:
          memory: 128Mi
          cpu: 50m
        limits:
          memory: 512Mi
          cpu: 250m
EOT
  ]
}
```
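As a sanity check, the chart can also be rendered locally with the same values to see exactly what Job spec the release produces (a sketch; `values.yaml` stands in for the heredoc above with the Terraform interpolations filled in, and `ory`/`keto` are the release and chart names from the config):

```bash
# Render the keto chart locally with the same values
helm template ory keto \
  --repo https://k8s.ory.sh/helm/charts \
  --values values.yaml > rendered.yaml

# Inspect the generated migration Job, including the injected
# cloud-sql-proxy extra container and its probes
grep -n -A 40 'kind: Job' rendered.yaml
```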
Version
v0.11.1
On which operating system are you observing this issue?
None
In which environment are you deploying?
Kubernetes with Helm
Additional Context
- Cloud SQL PostgreSQL database
- GCP
sp71 commented
Closing due to inactivity