Adding self-signed-certificates and removing the relation breaks the charm
dparv opened this issue · 5 comments
Bug Description
istio-pilot/0* error idle 10.244.2.10 hook failed: "certificates-relation-broken"
and can't access the Kubeflow dashboard
To Reproduce
juju deploy self-signed-certificates --channel edge
juju relate istio-pilot:certificates self-signed-certificates:certificates
and
juju remove-relation istio-pilot:certificates self-signed-certificates:certificates
Environment
juju 3.4.3
istio-pilot 1.17/stable 965
self-signed-certificates latest/edge 145
Relevant Log Output
unit-istio-pilot-0: 13:52:26 WARNING unit.istio-pilot/0.juju-log certificates:56: 'app' expected but not received.
unit-istio-pilot-0: 13:52:26 WARNING unit.istio-pilot/0.juju-log certificates:56: 'app_name' expected in snapshot but not found.
unit-istio-pilot-0: 13:52:26 INFO unit.istio-pilot/0.juju-log certificates:56: Creating CSR for 57.152.89.25 with DNS ['istio-pilot-0.istio-pilot-endpoints.kubeflow.svc.cluster.local'] and IPs []
unit-istio-pilot-0: 13:52:26 ERROR unit.istio-pilot/0.juju-log certificates:56: Uncaught exception while in charm code:
Traceback (most recent call last):
File "./src/charm.py", line 1203, in <module>
main(Operator)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 544, in main
manager.run()
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 520, in run
self._emit()
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 509, in _emit
_emit_charm_event(self.charm, self.dispatcher.event_name)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 143, in _emit_charm_event
event_to_emit.emit(*args, **kwargs)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 350, in emit
framework._emit(event)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 849, in _emit
self._reemit(event_path)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 939, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/tls_certificates_interface/v2/tls_certificates.py", line 1582, in _on_relation_broken
self.on.all_certificates_invalidated.emit()
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 350, in emit
framework._emit(event)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 849, in _emit
self._reemit(event_path)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 939, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/observability_libs/v0/cert_handler.py", line 420, in _on_all_certificates_invalidated
self._generate_csr(overwrite=True, clear_cert=True)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/observability_libs/v0/cert_handler.py", line 272, in _generate_csr
self.certificates.request_certificate_creation(certificate_signing_request=csr)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/tls_certificates_interface/v2/tls_certificates.py", line 1421, in request_certificate_creation
raise RuntimeError(
RuntimeError: Relation certificates does not exist - The certificate request can't be completed
### Additional Context
_No response_
Thank you for reporting your feedback to us!
The internal ticket has been created: https://warthogs.atlassian.net/browse/KF-5876.
This message was autogenerated
Reported issue
I was able to reproduce the issue.
My model:
Model      Controller  Cloud/Region        Version  SLA          Timestamp
istio-441  uk8s-343    microk8s/localhost  3.4.3    unsupported  18:54:00Z
App                       Version  Status   Scale  Charm                     Channel      Rev   Address         Exposed  Message
istio-ingressgateway               active       1  istio-gateway             1.17/stable  1000  10.152.183.212  no
istio-pilot                        waiting      1  istio-pilot               1.17/stable   965  10.152.183.210  no       installing agent
self-signed-certificates           active       1  self-signed-certificates  latest/edge   147  10.152.183.76   no
Unit                         Workload  Agent  Address      Ports  Message
istio-ingressgateway/0*      active    idle   10.1.60.140
istio-pilot/0*               error     idle   10.1.60.137         hook failed: "certificates-relation-broken" for self-signed-certificates:certificates
self-signed-certificates/0*  active    idle   10.1.60.138
Integration provider                   Requirer                          Interface          Type     Message
istio-pilot:istio-pilot                istio-ingressgateway:istio-pilot  k8s-service        regular
istio-pilot:peers                      istio-pilot:peers                 istio_pilot_peers  peer
self-signed-certificates:certificates  istio-pilot:certificates          tls-certificates   regular
juju debug-log output:
unit-istio-pilot-0: 18:53:15 INFO unit.istio-pilot/0.juju-log certificates:1: Creating CSR for 10.64.140.43 with DNS ['istio-pilot-0.istio-pilot-endpoints.istio-441.svc.cluster.local'] and IPs []
unit-istio-pilot-0: 18:53:15 ERROR unit.istio-pilot/0.juju-log certificates:1: Uncaught exception while in charm code:
Traceback (most recent call last):
File "./src/charm.py", line 1203, in <module>
main(Operator)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 544, in main
manager.run()
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 520, in run
self._emit()
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 509, in _emit
_emit_charm_event(self.charm, self.dispatcher.event_name)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/main.py", line 143, in _emit_charm_event
event_to_emit.emit(*args, **kwargs)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 350, in emit
framework._emit(event)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 849, in _emit
self._reemit(event_path)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 939, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/tls_certificates_interface/v2/tls_certificates.py", line 1582, in _on_relation_broken
self.on.all_certificates_invalidated.emit()
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 350, in emit
framework._emit(event)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 849, in _emit
self._reemit(event_path)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/venv/ops/framework.py", line 939, in _reemit
custom_handler(event)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/observability_libs/v0/cert_handler.py", line 420, in _on_all_certificates_invalidated
self._generate_csr(overwrite=True, clear_cert=True)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/observability_libs/v0/cert_handler.py", line 272, in _generate_csr
self.certificates.request_certificate_creation(certificate_signing_request=csr)
File "/var/lib/juju/agents/unit-istio-pilot-0/charm/lib/charms/tls_certificates_interface/v2/tls_certificates.py", line 1421, in request_certificate_creation
raise RuntimeError(
RuntimeError: Relation certificates does not exist - The certificate request can't be completed
Potential cause
The error is triggered from the cert_handler library. At first glance it looks like, on a relation_broken event, an all_certificates_invalidated event is emitted by the tls_certificates library (which the cert_handler lib uses under the hood). The cert_handler lib then calls _on_all_certificates_invalidated, which tries to generate a CSR and request a new certificate. Since the relation no longer exists at that point, the request fails with the RuntimeError shown above.
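The failure path described above can be sketched in plain Python. This is only an illustration under stated assumptions: `FakeCertificatesRequirer` and both handler functions are hypothetical stand-ins, not the real classes from the tls_certificates or cert_handler libraries, and the guard shown is one possible mitigation rather than the actual upstream fix.

```python
class FakeCertificatesRequirer:
    """Hypothetical stand-in for the TLS requirer object (not the real library class)."""

    def __init__(self, relation_exists: bool):
        self.relation_exists = relation_exists
        self.requested = []

    def request_certificate_creation(self, certificate_signing_request: bytes) -> None:
        # Mirrors the behaviour seen in the traceback: without a relation,
        # the request cannot be completed.
        if not self.relation_exists:
            raise RuntimeError(
                "Relation certificates does not exist - "
                "The certificate request can't be completed"
            )
        self.requested.append(certificate_signing_request)


def on_all_certificates_invalidated_unguarded(certs: FakeCertificatesRequirer) -> None:
    # Always requests a new certificate, even when the event was caused
    # by relation-broken. This is the path that raises.
    certs.request_certificate_creation(certificate_signing_request=b"csr")


def on_all_certificates_invalidated_guarded(certs: FakeCertificatesRequirer) -> bool:
    # Assumed mitigation: skip the request when the relation is already gone.
    if not certs.relation_exists:
        return False
    certs.request_certificate_creation(certificate_signing_request=b"csr")
    return True
```

With `relation_exists=False`, the unguarded handler raises the same `RuntimeError` as in the log, while the guarded one returns without requesting anything.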
I have pinged the maintainers of the library I'm referring to, will come back with an update.
State of TLS certificates integration
Just as a quick check, I did the following to ensure the TLS certificates were in fact passed and rendered correctly in the Gateway and Secret objects:
- Deploy istio-operators 1.17/stable
- Deploy self-signed-certificates latest/edge
- Add relations
- Check the Gateway object and the Secret it references
My model:
Model      Controller  Cloud/Region        Version  SLA          Timestamp
istio-441  uk8s-343    microk8s/localhost  3.4.3    unsupported  18:49:30Z
App                       Version  Status  Scale  Charm                     Channel      Rev   Address         Exposed  Message
istio-ingressgateway               active      1  istio-gateway             1.17/stable  1000  10.152.183.212  no
istio-pilot                        active      1  istio-pilot               1.17/stable   965  10.152.183.210  no
self-signed-certificates           active      1  self-signed-certificates  latest/edge   147  10.152.183.76   no
Unit                         Workload  Agent  Address      Ports  Message
istio-ingressgateway/0*      active    idle   10.1.60.140
istio-pilot/0*               active    idle   10.1.60.137
self-signed-certificates/0*  active    idle   10.1.60.138
Integration provider                   Requirer                          Interface          Type     Message
istio-pilot:istio-pilot                istio-ingressgateway:istio-pilot  k8s-service        regular
istio-pilot:peers                      istio-pilot:peers                 istio_pilot_peers  peer
self-signed-certificates:certificates  istio-pilot:certificates          tls-certificates   regular
The Gateway and Secret objects:
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  creationTimestamp: "2024-06-17T18:44:40Z"
  generation: 2
  labels:
    app.juju.is/created-by: istio-pilot
    app.kubernetes.io/instance: istio-pilot-istio-441
    kubernetes-resource-handler-scope: gateway
  name: istio-gateway
  namespace: istio-441
  resourceVersion: "1420"
  uid: f56edf34-5dd7-4d5a-b50c-1e6b7f977e89
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: https
      number: 8443
      protocol: HTTPS
    tls: # <--- it is configured for TLS
      credentialName: istio-gateway-gateway-secret # <--- it references this secret
      mode: SIMPLE
apiVersion: v1
data:
tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURyakNDQXBhZ0F3SUJBZ0lVWlV3L0x0aWs2MjNic3FSWWNyRlBjMUJwYjhNd0RRWUpLb1pJaHZjTkFRRUwKQlFBd09URUxNQWtHQTFVRUJoTUNWVk14S2pBb0JnTlZCQU1NSVhObGJHWXRjMmxuYm1Wa0xXTmxjblJwWm1sagpZWFJsY3kxdmNHVnlZWFJ2Y2pBZUZ3MHlOREEyTVRjeE9EUTFNRFJhRncweU5UQTJNVGN4T0RRMU1EUmFNRWN4CkZqQVVCZ05WQkFNTURXbHpkR2x2TFhCcGJHOTBMVEF4TFRBckJnTlZCQzBNSkRGaU4yRXlaRGhsTFdVNU4yVXQKTkRBd09TMDRPVGN5TFRRMU1HRXdZbVUxT1RnME1EQ0NBU0l3RFFZSktvWklodmNOQVFFQkJRQURnZ0VQQURDQwpBUW9DZ2dFQkFNOU1yS1VkZXRJOGJMeFo0Mi9VY2FXaGtKVEpzT0IwRVRxTzlENUxNSUdtZXI1d3ZLc1dmc2Q4CmMxOHV2bUtnc2pCM2tZVVV0bDNIa0xxdHlwU1ZXNkZyOUVPaWI2TGVadFFSTmFYZm11RFN1UjBqMk9jRTJzem8KdDVwRDM3MFJOTVB2eG9BT0szN3U3dkM2VjRaL2ZudnFPaWlaVDZjaU5UQjJSWmpzYTVoWjdSUHZSOW5WaXRhLwpoODZhQmkxdThaNDFpUlhTZkxlTUxDNFdYcEhwL2x2a0JRVVNwWUIyRGs0VDF0Mm90cjNhbjEzbGdMYWtmdk5XCmJYWmpzRWxYWVFCeEhHQmYzN0oraUhjOU9YM25ybnVVY3o2SGgzbG9WekpROGIwTktvMTlDbFZWbTdtbThWVjIKMnIvcVR0VXQ2dDViWE12QUQwRUFtQWhHNFEvL3dXOENBd0VBQWFPQm56Q0JuREFoQmdOVkhTTUVHakFZZ0JZRQpGSVo1ZkVqeUowNEM5U1IrbW80WWRvRWlnRkFzTUIwR0ExVWREZ1FXQkJUL0RFZmxvbTdUNGF6VUpmNXl6L09GCkpNS05KVEFNQmdOVkhSTUJBZjhFQWpBQU1Fb0dBMVVkRVFSRE1FR0NQMmx6ZEdsdkxYQnBiRzkwTFRBdWFYTjAKYVc4dGNHbHNiM1F0Wlc1a2NHOXBiblJ6TG1semRHbHZMVFEwTVM1emRtTXVZMngxYzNSbGNpNXNiMk5oYkRBTgpCZ2txaGtpRzl3MEJBUXNGQUFPQ0FRRUFkdzc5UWhJN0pVcUV3MzRwZysrRkJDSitKVUU3OFpZVURrVHNIQWVZClZEUlpWcUwyaENnL0poU2k3RHFrYUcwRjh6UkdadGxzcUFCdEdEYmhPZC9WM3BiOUtYVTRUSHl6UWhPYmlEWHkKYXRwQ2REUnEwUDVUeGpBT2l6YnJIZHlyOXc2c0FFd1VEcldKclQ2NjFOVjFNazE3YUluTVZZdFlNMExsS0h5YgpkTGZ4NmZjcGVCeXJXVjQ2cjZLTVlKQWoyd2lORjhlSXdpK0NMd2tiUGwwR1FHd3lVK3NSV1EwVmtuWk5ESVlyCjhzT0wrbUV5VVNBNmJ3RmF3dFVxUHJPRDI5RXJ5VlF0RkVtYit6cWVuN2VUNXBLN2FRRDE2NDR2TEdEajJEYzkKNkVIM1pRZ2VjUHBFcHJWTW04NTdEWC9XTTdDcDQxUThURDF5SUVGREZCSThSdz09Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0=
tls.key: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcFFJQkFBS0NBUUVBejB5c3BSMTYwanhzdkZuamI5UnhwYUdRbE1tdzRIUVJPbzcwUGtzd2dhWjZ2bkM4CnF4Wit4M3h6WHk2K1lxQ3lNSGVSaFJTMlhjZVF1cTNLbEpWYm9XdjBRNkp2b3Q1bTFCRTFwZCthNE5LNUhTUFkKNXdUYXpPaTNta1BmdlJFMHcrL0dnQTRyZnU3dThMcFhobjkrZStvNktKbFBweUkxTUhaRm1PeHJtRm50RSs5SAoyZFdLMXIrSHpwb0dMVzd4bmpXSkZkSjh0NHdzTGhaZWtlbitXK1FGQlJLbGdIWU9UaFBXM2FpMnZkcWZYZVdBCnRxUis4MVp0ZG1Pd1NWZGhBSEVjWUYvZnNuNklkejA1ZmVldWU1UnpQb2VIZVdoWE1sRHh2UTBxalgwS1ZWV2IKdWFieFZYYmF2K3BPMVMzcTNsdGN5OEFQUVFDWUNFYmhELy9CYndJREFRQUJBb0lCQVFDZ0RXc2U4T3ZyZG92ZAp3T2xCWnAxNGJJM2Mwdnlsei9lZFp0SmRabUJGT2V4N0xULytPSmdhSFpSV1lSak52WlRXcHZyTDdYb0FYaHo0CmhVWnNBZ1dGVkh4NzIrYWxzV0ZqU3daSTA2UVpBWm03VGZvaUpEVnJFQ0x5RUlXbXpLb1l2Z0JjenBQMnBUUUcKMlZqS2w1Vm94eWV3UU82bTlGcHMyR1JUOWZYODRjMS90bUkraTZOOVN1ak5wcDloVzhoMS81cjBaRnZha2VJSgpxaTgyb2IyMFhWQmhOeGg0Z0x1YWw2aGtiTE9WUWVmNWZSTThxOFRVWlVGTjRycEpvbi9XaHNBTnMxRkpaVE5kCngza2JHZWh0TEc0OXNzSzdqVG5KK0ZFeEZSM0pjZjBieHJKV1JEbkJFN0JCMGxzSmR4anVUTFhCVlZ1dGxHazMKVDNENWtQckJBb0dCQVBDelNiWDVuTnl1NjZaVmNLcTB4a3plNVhiaUR4Ri95aW1yaUtybDRHU1czSWx5bnNBdQprVG5rVGtUVDB1YithSzJGUlJiQ3hIM3dCMW9HQjlhUG1RaFBtY1BDMGJ1TlJLN3M2MGhnWVV1Y0o3czNPdE5KCk5ubm1JN0laTVpQTURweHVoekFhNHU2NGFBZ21QNWdkNWpVSjJkNGZtZkJrU1YyYzlwRG9xM29OQW9HQkFOeDUKNGhHck5mVk9BTWJCNVpQdTNrTkdzdkhzaURRalA3TWZOVTB2MjRUYW9KS3d1dXR1L1NKUE5FQ3FGbkx1S2tkWgo2cDZ1akN5UC9LNlBvQUlqSjZjMHJGUkRIbHMxQk1PTU5VQlBmdGNVUW5KSHo0Ty93ZUt2U2FWRGsvU0FXWTFICnpkS2toVEdvaG1yZ2tJcnhvaDRsUTViT0YxSnp1WlR3L3ozYXFUWnJBb0dCQU14ZW5qNXBjenVaTmJKa0p5WjYKR1VrWmxHR05iVmZoVmVodG9idmhOTmFUbFNzSzdDbW5JRjIwTUpTVitpTnhiYld2UzBzWkVqY1AvMTM3Y3RwRgowSnpTNFc3cTBxTlpQakQ4TG9Xa2Q5ZjMvWEFqWThvVUJySVhxc1ZFU09rQndJSW9BcGJnclVBZHlRN3FVdUs0CnVFYmVWMk1YRitDWmRnV0xDWHRlWW9KZEFvR0JBTkpVY0VlODFzLzdKeUIxNzRjdUZObUhnOFRwaXBKNm9oVkcKaTNua1V2NHQ5NHVaaitoMFRJYkRtcXlwMXBxei9KOXU5eldFZlBNeU5iTnVEdzZhN1FSRmFyVkVCcHlxT3E0Mgpmc0tvVSsvcFF1NTA5VkhSeUt4eDNzY0xiZ1dOd0dEWWhGRVVaSUNZTGd1ZHlpYlRGMzY4dS9zTkJ4REFsK1d2Cjl6L1I3eVdiQW9HQUx6Qk5WSTh0U2RkbTYwWEhpVFJ
VL012dzdFOHNXbU1Gdlp2VGRXdmZMcVY3N0V2cE1FdEEKdTJzQmhNdUJHbkVvVjV0ZDR6aVVpb2xDQms2c2UxNi8xWlpydUtvaGJIdFZobXlTTG8xalFtSXlicjB4RFBOYgpOY3k0UHNkNTVhVEZNSVd3RWhwbWovMjB0UmlQaGNadlBEWlVleHBIWDRHYi9mRmZ4QTJlTGc4PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
kind: Secret
metadata:
  creationTimestamp: "2024-06-17T18:45:05Z"
  labels:
    app.juju.is/created-by: istio-pilot
    app.kubernetes.io/instance: istio-pilot-istio-441
    kubernetes-resource-handler-scope: gateway
  name: istio-gateway-gateway-secret
  namespace: istio-441
  resourceVersion: "1419"
  uid: 0dfe5ae8-ed60-4f54-8e3f-b3b6abed5e4e
type: kubernetes.io/tls
Based on this we can confirm that the relation and the reconciler in the istio-operators seem to be working just fine.
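The manual check above (the Gateway references a Secret, and that Secret holds a TLS certificate) could be automated roughly as follows. This is a minimal sketch, assuming the two objects have already been loaded as Python dictionaries; the helper names and the placeholder certificate are hypothetical and only illustrate the idea.

```python
import base64

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def gateway_secret_names(gateway: dict) -> list:
    """Return the credentialName of every server that has a tls block."""
    servers = gateway.get("spec", {}).get("servers", [])
    return [s["tls"]["credentialName"] for s in servers if "tls" in s]


def secret_holds_certificate(secret: dict) -> bool:
    """Check the Secret type and that tls.crt decodes to PEM text."""
    if secret.get("type") != "kubernetes.io/tls":
        return False
    encoded = secret.get("data", {}).get("tls.crt", "")
    try:
        decoded = base64.b64decode(encoded).decode("ascii")
    except (ValueError, UnicodeDecodeError):
        return False
    return decoded.startswith(PEM_HEADER)


# Trimmed-down example objects; the certificate body is a placeholder,
# not a real certificate.
gateway = {
    "spec": {
        "servers": [
            {
                "port": {"name": "https", "number": 8443, "protocol": "HTTPS"},
                "tls": {
                    "credentialName": "istio-gateway-gateway-secret",
                    "mode": "SIMPLE",
                },
            }
        ]
    }
}
placeholder_pem = PEM_HEADER + "\nPLACEHOLDER\n-----END CERTIFICATE-----\n"
secret = {
    "kind": "Secret",
    "type": "kubernetes.io/tls",
    "metadata": {"name": "istio-gateway-gateway-secret"},
    "data": {"tls.crt": base64.b64encode(placeholder_pem.encode()).decode()},
}

referenced = secret["metadata"]["name"] in gateway_secret_names(gateway)
valid = secret_holds_certificate(secret)
```

Both `referenced` and `valid` being true corresponds to the state observed in the objects above. The check only looks at the PEM header, so it does not validate the certificate itself.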
I have confirmed with @sed-i that this issue is caused by the cert_handler library not handling relation-broken events correctly. I have tested the fix in canonical/observability-libs#99 and it seems to be working for v0 of the library.
To fix the issue @dparv reported, we'll have to:
- Wait for canonical/observability-libs#99 to be merged
- Bump the library to bring in all the changes for cert_handler v0
For more recent versions of the istio-operators we'd ideally use cert_handler v1, which we'll also have to pull once the mentioned PR is merged.
The fix has been released to 1.17/stable. Closing this issue, but feel free to re-open or file a new one should you find any other error. Thanks!