Error when running in AKS
HectorMeneses333 opened this issue · 11 comments
I'm trying to run the system under AKS.
Here's the error I'm getting from the finecollectionservice:
- containerID: containerd://3bb5ed001e3b75716e00a3f10b6959ff27c5bfa84025a495f44927a403070983
  image: docker.io/daprio/daprd:1.2.2
  imageID: docker.io/daprio/daprd@sha256:264d268d06e26525c93c58bc62ad4c0bdcddf205d125f11fc4dbe5d31dd4646d
  lastState:
    terminated:
      containerID: containerd://3bb5ed001e3b75716e00a3f10b6959ff27c5bfa84025a495f44927a403070983
      exitCode: 1
      finishedAt: "2021-07-02T19:16:48Z"
      reason: Error
      startedAt: "2021-07-02T19:16:48Z"
  name: daprd
  ready: false
  restartCount: 4
  started: false
  state:
    waiting:
      message: back-off 1m20s restarting failed container=daprd pod=finecollectionservice-7dd9f66584-ngqzc_dapr-trafficcontrol(f19a93dd-46ff-49b1-b2e8-de54a69e8452)
      reason: CrashLoopBackOff
- image: {MY ACR}/dapr-trafficcontrol/finecollectionservice:1.0
  imageID: ""
  lastState: {}
  name: finecollectionservice
  ready: false
  restartCount: 0
  started: false
  state:
    waiting:
      message: Back-off pulling image "{MY ACR}/dapr-trafficcontrol/finecollectionservice:1.0"
      reason: ImagePullBackOff
I have configured my Azure Container Registry to have a credential to enable my AKS to pull images from it (role acrpull).
Then I added
imagePullSecrets:
  - name: acr-auth
to the YAML files.
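For readers following along, here is a minimal sketch of where that entry sits in a Deployment manifest. The Deployment name, labels, and container name are assumptions based on the pod name in the status output above; only the image reference, the namespace, and the acr-auth secret name come from this post.

```yaml
# Sketch only: names and labels are assumed, not copied from the repo manifests.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: finecollectionservice
  namespace: dapr-trafficcontrol
spec:
  replicas: 1
  selector:
    matchLabels:
      app: finecollectionservice        # assumed label
  template:
    metadata:
      labels:
        app: finecollectionservice      # must match the selector above
    spec:
      # imagePullSecrets belongs to the pod spec, next to "containers"
      imagePullSecrets:
        - name: acr-auth
      containers:
        - name: finecollectionservice
          # quoted because the redacted placeholder starts with "{"
          image: "{MY ACR}/dapr-trafficcontrol/finecollectionservice:1.0"
```

Note that a pull secret is only found if it exists in the same namespace as the pod, here dapr-trafficcontrol.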
Have you seen this type of issue before?
Do you have some extra documentation on how to run the system under AKS?
Thanks!!
The app is not tested on AKS.
But looking at the logging, there might be an issue with the container image name: {MY ACR}/dapr-trafficcontrol/finecollectionservice:1.0. The {MY ACR} part looks like a placeholder that should be replaced with the name of your container registry.
But I'm guessing here based on the provided info.
Thanks Edwin, yeah... I left out my container registry's name for privacy reasons, but when I test I have the correct name in it.
I'll keep looking. I'm trying to learn how to use dapr in AKS as it is one of the recommended production environments.
Who on the Dapr team has good experience working with Dapr in AKS?
Thanks!!
The issue occurs with the FineCollection service. Are the other traffic control services running correctly?
Nope... none of them work.
Does my AKS cluster need any configuration to be able to pull the daprd image from docker.io/daprio/daprd?
I've just deployed the Traffic Control app into an AKS cluster. Everything runs successfully except the FineCollection service's Dapr sidecar (which looks like the same situation you're experiencing).
These are the steps I took to get this to work:
- I created an AKS cluster and an Azure Container Registry.
- I tagged all the traffic-control docker images using the name of my ACR:
*****.azurecr.io/mosquitto:1.0
*****.azurecr.io/simulation:1.0
*****.azurecr.io/vehicleregistrationservice:1.0
*****.azurecr.io/finecollectionservice:1.0
*****.azurecr.io/trafficcontrolservice:1.0
- I logged into my ACR and pushed all the images.
- I changed the environment variable MQTT_HOST as defined in the Simulation k8s manifest file src/k8s/simulation.yaml from mosquitto.dapr-trafficcontrol.svc.cluster.local to mosquitto. This makes sure that the Simulation can reach the Mosquitto MQTT broker running in the AKS cluster.
- I connected to the AKS cluster context so I can manage it from my machine using kubectl.
- I installed Dapr into the AKS cluster (dapr init -k).
- I ran src/k8s/start.ps1 to deploy everything.
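The MQTT_HOST change from the steps above might look roughly like this in src/k8s/simulation.yaml. This is a sketch: only the variable name and the two values come from the steps, while the container name and the surrounding fields are assumptions about how the manifest is laid out.

```yaml
# Fragment of the Simulation Deployment; surrounding structure is assumed.
spec:
  template:
    spec:
      containers:
        - name: simulation              # assumed container name
          env:
            - name: MQTT_HOST
              # before: mosquitto.dapr-trafficcontrol.svc.cluster.local
              value: mosquitto
```

The short name resolves through cluster DNS as long as the Simulation pod and the mosquitto Service live in the same namespace.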
I did not configure anything specific in order for the AKS cluster to pull the Dapr images. This worked out of the box (as was expected).
So what's wrong with the FineCollection service? Well, by inspecting the logging of the sidecar container, I found that reading Kubernetes secrets is not allowed by default. This is the error I receive:
time="2021-07-03T18:48:52.641961131Z" level=error msg="error getting secret: secrets \"trafficcontrol-secrets\" is forbidden: User \"system:serviceaccount:dapr-trafficcontrol:default\" cannot get resource \"secrets\" in API group \"\" in the namespace \"dapr-trafficcontrol\"" app_id=finecollectionservice instance=finecollectionservice-5ff47f89b4-fkpnk scope=dapr.runtime type=log ver=1.2.2
The configuration for the SMTP output binding component (in dapr/components/email.yaml) and the k8s manifest file for the SMTP mailserver (in src/k8s/email.yaml) also reference the secrets. This is also not allowed:
time="2021-07-03T19:01:01.758583136Z" level=error msg="failed to init output binding sendmail (bindings.smtp/v1): smtp binding error: host, port, user and password fields are required in metadata" app_id=finecollectionservice instance=finecollectionservice-5ff47f89b4-t7xvd scope=dapr.runtime type=log ver=1.2.2
I expect this to also be the issue with your deployment (again, guessing based on the information I've collected so far).
The preferred way of working with secrets in AKS is using an Azure Key Vault, so that could be a solution.
I hope this helps!
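To illustrate how the SMTP binding depends on the secret, here is a hedged sketch of a Dapr component that resolves its credentials through secretKeyRef. The component name sendmail, the type bindings.smtp, and the secret name trafficcontrol-secrets appear in the log lines above; the host, port, key names, and scope are assumptions and may differ from the actual dapr/components/email.yaml.

```yaml
# Sketch only: host, port, secret keys, and scopes are assumed values.
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: sendmail
  namespace: dapr-trafficcontrol
spec:
  type: bindings.smtp
  version: v1
  metadata:
    - name: host
      value: "mailserver"               # assumed service name
    - name: port
      value: "25"                       # assumed port
    - name: user
      secretKeyRef:
        name: trafficcontrol-secrets    # Kubernetes secret named in the error log
        key: smtp.user                  # hypothetical key
    - name: password
      secretKeyRef:
        name: trafficcontrol-secrets
        key: smtp.password              # hypothetical key
auth:
  secretStore: kubernetes               # resolve secretKeyRef via Kubernetes secrets
scopes:
  - finecollectionservice
```

If the sidecar's service account cannot read the secret, the referenced fields stay empty, which would be consistent with the "fields are required in metadata" error shown above.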
Awesome... I'll make those changes you suggested this coming Tuesday... (Monday is a holiday).
I really appreciate your help with this Edwin!
I'll report back my findings as well so other people in the community can benefit from it.
Hi Edwin,
I'm making good progress setting up Azure Key Vault for my AKS cluster.
However, I'm now stuck at this step:
Step 9. Configure the Azure Identity and AzureIdentityBinding yaml
from
https://docs.dapr.io/reference/components-reference/supported-secret-stores/azure-keyvault-managed-identity/
What is "selector"? Where do I get it from, and what does it mean?
spec:
  azureIdentity: [your managed identity name]
  selector: [your managed identity selector]
I logged a doc bug for that page, but I'm hoping that you may have more info about it.
You can probably use the value of [your managed identity name] as selector. The binding needs to reference the AzureIdentity defined in the same file. See these docs.
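As a sketch of how the pieces relate in AAD Pod Identity: the binding's azureIdentity field points at the AzureIdentity by name, and the selector is a label value that pods opt into through the aadpodidbinding label. All names and IDs below are hypothetical placeholders.

```yaml
# Sketch only: identity name, resource ID, and client ID are placeholders.
apiVersion: aadpodidentity.k8s.io/v1
kind: AzureIdentity
metadata:
  name: trafficcontrol-identity         # hypothetical name
spec:
  type: 0                               # 0 = user-assigned managed identity
  resourceID: "[resource ID of your managed identity]"
  clientID: "[client ID of your managed identity]"
---
apiVersion: aadpodidentity.k8s.io/v1
kind: AzureIdentityBinding
metadata:
  name: trafficcontrol-identity-binding # hypothetical name
spec:
  azureIdentity: trafficcontrol-identity  # must match the AzureIdentity name above
  selector: trafficcontrol-identity       # free-form value; reused as a pod label below
---
# Fragment of a pod template: the label value must equal the selector.
metadata:
  labels:
    aadpodidbinding: trafficcontrol-identity
```

Reusing the identity name as the selector, as suggested above, works because the selector only has to match the label on the pods that should receive the identity.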
I haven't tried this in AKS, but I was having the same issue on my local/private k8s cluster and was able to resolve it by creating a Role and RoleBinding that allow the default service account (in the dapr-trafficcontrol namespace) to read secrets.
E.g.:
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: secret-reader
  namespace: dapr-trafficcontrol
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list"]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: dapr-secret-reader
  namespace: dapr-trafficcontrol
subjects:
  - kind: ServiceAccount
    name: default
roleRef:
  kind: Role
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
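A narrower variant of the Role above is sketched here, assuming the sidecar only needs to get the single secret named in the error log. The Role name is hypothetical, and whether "get" alone is enough for every component in this app is an assumption that has not been verified in this thread.

```yaml
# Sketch only: limits access to one named secret instead of all secrets in the namespace.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: trafficcontrol-secret-reader    # hypothetical name
  namespace: dapr-trafficcontrol
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["trafficcontrol-secrets"]  # secret named in the sidecar error
    verbs: ["get"]
```

If this Role is used, the roleRef name in the RoleBinding has to be changed to match. Should anything in the deployment need to list secrets, the broader Role shown above is the simpler choice.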
Thanks for the update @grumpydumpty!