Setting up Cloud Run Events on a local cluster fails: unable to reach the webhook validation service
maurerbot opened this issue · 7 comments
Describe the bug
Following the directions to install knative-gcp on my local microk8s cluster, the install cannot connect to the validation webhook at `webhook.cloud-run-events.svc:443/config-validation`.
Full error:
Error from server (InternalError): error when creating "https://github.com/google/knative-gcp/releases/download/v0.19.0/cloud-run-events.yaml": Internal error occurred: failed calling webhook "config.webhook.events.cloud.google.com": Post "https://webhook.cloud-run-events.svc:443/config-validation?timeout=30s": dial tcp 10.152.183.252:443: connect: connection refused
What I found in the cloud-run-events.yaml
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingWebhookConfiguration
metadata:
  name: config.webhook.events.cloud.google.com
  labels:
    events.cloud.google.com/release: "v0.19.0"
webhooks:
- admissionReviewVersions:
  - v1beta1
  clientConfig:
    service:
      name: webhook
      namespace: cloud-run-events
  failurePolicy: Fail
  sideEffects: None
  name: config.webhook.events.cloud.google.com
  namespaceSelector:
    matchExpressions:
    - key: events.cloud.google.com/release
      operator: Exists
Everything in the namespace
$ mk get all -n cloud-run-events
NAME                                                      READY   STATUS              RESTARTS   AGE
pod/storage-version-migration-knative-gcp-v0-19-0-tcrwx   0/1     Completed           0          11m
pod/controller-65bbf98864-kt6cl                           0/1     ContainerCreating   0          11m
pod/webhook-d887d49fb-cjjbb                               0/1     CrashLoopBackOff    7          11m

NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/controller   ClusterIP   10.152.183.47    <none>        9090/TCP   11m
service/webhook      ClusterIP   10.152.183.252   <none>        443/TCP    11m

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/controller   0/1     1            0           11m
deployment.apps/webhook      0/1     1            0           11m

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/controller-65bbf98864   1         1         0       11m
replicaset.apps/webhook-d887d49fb       1         1         0       11m

NAME                                                      COMPLETIONS   DURATION   AGE
job.batch/storage-version-migration-knative-gcp-v0-19-0   1/1           4s         11m
Logs from failing resources
$ mk logs -n cloud-run-events pod/storage-version-migration-knative-gcp-v0-19-0-tcrw
Error from server (NotFound): pods "storage-version-migration-knative-gcp-v0-19-0-tcrw" not found
$ mk logs -n cloud-run-events pod/controller-65bbf98864-kt6cl
Error from server (BadRequest): container "controller" in pod "controller-65bbf98864-kt6cl" is waiting to start: ContainerCreatin
$ mk describe -n cloud-run-events pod/controller-65bbf98864-kt6c
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 25m default-scheduler Successfully assigned cloud-run-events/controller-65bbf98864-kt6cl to desktop-ko4t9m8
Warning FailedMount 14m (x2 over 17m) kubelet Unable to attach or mount volumes: unmounted volumes=[config-logging], unattached volumes=[config-logging controller-token-8bttq google-cloud-key]: timed out waiting for the condition
Warning FailedMount 10m (x4 over 23m) kubelet Unable to attach or mount volumes: unmounted volumes=[config-logging], unattached volumes=[google-cloud-key config-logging controller-token-8bttq]: timed out waiting for the condition
Warning FailedMount 3m33s (x4 over 19m) kubelet Unable to attach or mount volumes: unmounted volumes=[config-logging], unattached volumes=[controller-token-8bttq google-cloud-key config-logging]: timed out waiting for the condition
Warning FailedMount 3m23s (x19 over 25m) kubelet MountVolume.SetUp failed for volume "config-logging" : configmap "config-logging" not found
$ mk logs -n cloud-run-events pod/webhook-d887d49fb-cjjbb
2021/03/17 13:40:28 Registering 2 clients
2021/03/17 13:40:28 Registering 3 informer factories
2021/03/17 13:40:28 Registering 4 informers
2021/03/17 13:40:28 Registering 5 controllers
{"level":"info","ts":"2021-03-17T13:40:29.107Z","caller":"logging/config.go:110","msg":"Successfully created the logger."}
{"level":"info","ts":"2021-03-17T13:40:29.107Z","caller":"logging/config.go:111","msg":"Logging level set to: info"}
{"level":"info","ts":"2021-03-17T13:40:29.107Z","logger":"webhook","caller":"profiling/server.go:59","msg":"Profiling enabled: false","commit":"58158f3"}
{"level":"info","ts":"2021-03-17T13:40:29.539Z","logger":"webhook","caller":"leaderelection/context.go:46","msg":"Running with Standard leader election","commit":"58158f3"}
{"level":"info","ts":"2021-03-17T13:40:30.157Z","logger":"webhook","caller":"sharedmain/main.go:209","msg":"Starting configuration manager...","commit":"58158f3"}
{"level":"fatal","ts":"2021-03-17T13:40:30.257Z","logger":"webhook","caller":"sharedmain/main.go:211","msg":"Failed to start configuration manager","commit":"58158f3","error":"configmap \"config-br-delivery\" not found","stacktrace":"knative.dev/pkg/injection/sharedmain.MainWithConfig\n\tknative.dev/pkg@v0.0.0-20201103163404-5514ab0c1fdf/injection/sharedmain/main.go:211\nknative.dev/pkg/injection/sharedmain.MainWithContext\n\tknative.dev/pkg@v0.0.0-20201103163404-5514ab0c1fdf/injection/sharedmain/main.go:142\nmain.main\n\tgithub.com/google/knative-gcp/cmd/webhook/main.go:273\nruntime.main\n\truntime/proc.go:203"}
Does the above show that the controller is having trouble mounting volumes?
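The fatal line in the webhook log above already names a likely cause: the webhook exits because the ConfigMap `config-br-delivery` is not found, and a webhook pod that keeps crashing would explain the refused connection on port 443. As a minimal sketch, assuming the one-JSON-object-per-line log format shown above, the `error` field of fatal lines can be pulled out with grep and sed. The function name `fatal_errors` is made up for illustration.

```shell
# Print the "error" field of every fatal-level line in JSON-per-line logs.
# Reads log text on stdin; relies only on grep and sed (no jq required).
fatal_errors() {
  grep '"level":"fatal"' \
    | sed -nE 's/.*"error":"(([^"\\]|\\.)*)".*/\1/p' \
    | sed 's/\\"/"/g'   # turn escaped quotes back into plain quotes
}
```

It could be used as `mk logs -n cloud-run-events pod/webhook-d887d49fb-cjjbb | fatal_errors`, assuming `mk` is an alias for `microk8s kubectl`.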
I've also run the local pub/sub emulator following https://cloud.google.com/pubsub/docs/emulator
$ gcloud beta emulators pubsub start --project=notebook-ninja
Executing: /usr/lib/google-cloud-sdk/platform/pubsub-emulator/bin/cloud-pubsub-emulator --host=localhost --port=8085
[pubsub] This is the Google Pub/Sub fake.
[pubsub] Implementation may be incomplete or differ from the real system.
[pubsub] Mar 17, 2021 9:27:17 AM com.google.cloud.pubsub.testing.v1.Main main
[pubsub] INFO: IAM integration is disabled. IAM policy methods and ACL checks are not supported
[pubsub] SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
[pubsub] SLF4J: Defaulting to no-operation (NOP) logger implementation
[pubsub] SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
[pubsub] Mar 17, 2021 9:27:17 AM com.google.cloud.pubsub.testing.v1.Main main
[pubsub] INFO: Server started, listening on 8085
Expected behavior
I expected the webhook service to come up and be reachable in my local cluster.
To Reproduce
Follow option 2 in https://github.com/google/knative-gcp/blob/main/docs/install/install-knative-gcp.md and apply it to a local microk8s cluster.
Knative-GCP release version
Additional context
N/A
Hi @zhongduo. I've reapplied it several times and the error persists.
Error from server (InternalError): error when creating "https://github.com/google/knative-gcp/releases/download/v0.19.0/cloud-run-events.yaml": Internal error occurred: failed calling webhook "config.webhook.events.cloud.google.com": Post "https://webhook.cloud-run-events.svc:443/config-validation?timeout=30s": dial tcp 10.152.183.252:443: connect: connection refused
Can you try deleting the webhook pod: kubectl delete -n cloud-run-events pod/webhook-d887d49fb-cjjbb. If that still doesn't work, you will have to check the logs of the webhook and the controller. I notice that your controller is stuck in ContainerCreating, so it is not ready either.
Yes, the controller's description says it is unable to mount a couple of volumes. I'm not sure what I'm missing.
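Judging by the events and logs earlier in this thread, the missing pieces appear to be ConfigMaps: the controller cannot mount `config-logging`, and the webhook reports `config-br-delivery` as not found. Below is a small sketch for checking which expected ConfigMaps are absent. The helper name `missing_configmaps` is hypothetical, and it assumes input in the `configmap/<name>` form printed by `kubectl get configmap -o name`.

```shell
# Print each expected ConfigMap name that is absent from the listing on stdin.
# stdin: lines such as "configmap/config-logging"
# args:  the ConfigMap names you expect to exist
missing_configmaps() {
  existing=$(sed 's|^configmap/||')   # strip the resource-type prefix
  for name in "$@"; do
    # -x: whole-line match, -F: literal string, -q: quiet
    printf '%s\n' "$existing" | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}
```

For example: `kubectl get configmap -n cloud-run-events -o name | missing_configmaps config-logging config-br-delivery`. Any name it prints was not created by the install, which suggests the manifest was only partially applied.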
I think I've gotten past the webhook issue. I believe the microk8s host-access add-on was not enabled, and that was causing the issue.
However, the controller is spitting out an error:
$ mk get all -n cloud-run-events
NAME                                                      READY   STATUS      RESTARTS   AGE
pod/storage-version-migration-knative-gcp-v0-19-0-t46bv   0/1     Completed   0          5m14s
pod/webhook-d887d49fb-b94st                               1/1     Running     0          3m23s
pod/controller-65bbf98864-lcvbf                           0/1     Error       5          3m25s

NAME                 TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
service/controller   ClusterIP   10.152.183.107   <none>        9090/TCP   3m27s
service/webhook      ClusterIP   10.152.183.238   <none>        443/TCP    3m27s

NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/webhook      1/1     1            1           3m24s
deployment.apps/controller   0/1     1            0           3m27s

NAME                                    DESIRED   CURRENT   READY   AGE
replicaset.apps/webhook-d887d49fb       1         1         1       3m24s
replicaset.apps/controller-65bbf98864   1         1         0       3m27s
$ mk describe -n cloud-run-events pod/controller-65bbf98864-ndrcn
Name: controller-65bbf98864-ndrcn
Namespace: cloud-run-events
Priority: 0
Node: desktop-ko4t9m8/172.21.29.183
Start Time: Wed, 17 Mar 2021 14:45:46 -0400
Labels: app=cloud-run-events
pod-template-hash=65bbf98864
role=controller
Annotations: cni.projectcalico.org/podIP: 10.1.97.94/32
cni.projectcalico.org/podIPs: 10.1.97.94/32
sidecar.istio.io/inject: false
Status: Running
IP: 10.1.97.94
IPs:
IP: 10.1.97.94
Controlled By: ReplicaSet/controller-65bbf98864
Containers:
controller:
Container ID: containerd://fd9601ad7b2bc1eeac98ca8487bc8b82f51bc6e18abc0c059a59e3acd3fc6f61
Image: gcr.io/knative-releases/github.com/google/knative-gcp/cmd/controller@sha256:f7e7f123f3d1f649de4de461286f3cb3de9de63c75f87630a98767e8a0f1cf0d
Image ID: gcr.io/knative-releases/github.com/google/knative-gcp/cmd/controller@sha256:f7e7f123f3d1f649de4de461286f3cb3de9de63c75f87630a98767e8a0f1cf0d
Port: 9090/TCP
Host Port: 0/TCP
Args:
--disable-ha
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Wed, 17 Mar 2021 15:02:02 -0400
Finished: Wed, 17 Mar 2021 15:02:02 -0400
Ready: False
Restart Count: 8
Limits:
cpu: 1
memory: 1000Mi
Requests:
cpu: 100m
memory: 100Mi
Environment:
GOOGLE_APPLICATION_CREDENTIALS: /var/secrets/google/key.json
PUBSUB_RA_IMAGE: gcr.io/knative-releases/github.com/google/knative-gcp/cmd/pubsub/receive_adapter@sha256:892f0b8ec11d53639f49c7c8bbdfcf0316477f66eb61c7fc84ddaf8738032539
PUBSUB_PUBLISHER_IMAGE: gcr.io/knative-releases/github.com/google/knative-gcp/cmd/pubsub/publisher@sha256:de2ceffdab7bbddd570c6c1fdbd9ddf1bc071c23d0f2db6499e2256127184625
SYSTEM_NAMESPACE: cloud-run-events (v1:metadata.namespace)
CONFIG_LOGGING_NAME: config-logging
CONFIG_OBSERVABILITY_NAME: config-observability
CONFIG_LEADERELECTION_NAME: config-leader-election
METRICS_DOMAIN: cloud.google.com/events
BROKER_CELL_INGRESS_IMAGE: gcr.io/knative-releases/github.com/google/knative-gcp/cmd/broker/ingress@sha256:0a713a4615f85f751e3c230d622de8057e63d791f6840ec76f7883c606dd7694
BROKER_CELL_FANOUT_IMAGE: gcr.io/knative-releases/github.com/google/knative-gcp/cmd/broker/fanout@sha256:6d6b5b4841439e3444520e6776929d9fda657283cba0a30e535813e8b4f4f7c4
BROKER_CELL_RETRY_IMAGE: gcr.io/knative-releases/github.com/google/knative-gcp/cmd/broker/retry@sha256:557d9e3024d57acdf8149515f74375e5a1ec0f7e18ae5f93243a8e9840d96c06
INTERNAL_METRICS_ENABLED: false
Mounts:
/etc/config-logging from config-logging (rw)
/var/run/secrets/kubernetes.io/serviceaccount from controller-token-4vzmw (ro)
/var/secrets/google from google-cloud-key (rw)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
config-logging:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: config-logging
Optional: false
google-cloud-key:
Type: Secret (a volume populated by a Secret)
SecretName: google-cloud-key
Optional: true
controller-token-4vzmw:
Type: Secret (a volume populated by a Secret)
SecretName: controller-token-4vzmw
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 16m default-scheduler Successfully assigned cloud-run-events/controller-65bbf98864-ndrcn to desktop-ko4t9m8
Warning FailedMount 16m (x4 over 16m) kubelet MountVolume.SetUp failed for volume "config-logging" : configmap "config-logging" not found
Normal Pulled 16m kubelet Successfully pulled image "gcr.io/knative-releases/github.com/google/knative-gcp/cmd/controller@sha256:f7e7f123f3d1f649de4de461286f3cb3de9de63c75f87630a98767e8a0f1cf0d" in 407.7942ms
Normal Pulled 16m kubelet Successfully pulled image "gcr.io/knative-releases/github.com/google/knative-gcp/cmd/controller@sha256:f7e7f123f3d1f649de4de461286f3cb3de9de63c75f87630a98767e8a0f1cf0d" in 401.0451ms
Normal Pulled 16m kubelet Successfully pulled image "gcr.io/knative-releases/github.com/google/knative-gcp/cmd/controller@sha256:f7e7f123f3d1f649de4de461286f3cb3de9de63c75f87630a98767e8a0f1cf0d" in 424.0637ms
Normal Pulling 15m (x4 over 16m) kubelet Pulling image "gcr.io/knative-releases/github.com/google/knative-gcp/cmd/controller@sha256:f7e7f123f3d1f649de4de461286f3cb3de9de63c75f87630a98767e8a0f1cf0d"
Normal Pulled 15m kubelet Successfully pulled image "gcr.io/knative-releases/github.com/google/knative-gcp/cmd/controller@sha256:f7e7f123f3d1f649de4de461286f3cb3de9de63c75f87630a98767e8a0f1cf0d" in 430.7369ms
Normal Created 15m (x4 over 16m) kubelet Created container controller
Normal Started 15m (x4 over 16m) kubelet Started container controller
Warning BackOff 97s (x71 over 16m) kubelet Back-off restarting failed container
$ mk logs -n cloud-run-events controller-65bbf98864-lcvbf
2021/03/17 18:41:23 google: could not find default credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.
I've run `gcloud auth application-default login`. Is there something else I'm missing?
I believe you didn't set up authentication: https://github.com/google/knative-gcp/blob/main/docs/install/install-knative-gcp.md#configure-the-authentication-mechanism-for-gcp-the-control-plane
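The pod description above shows the controller reading `GOOGLE_APPLICATION_CREDENTIALS` from `/var/secrets/google/key.json`, which is backed by an optional Secret named `google-cloud-key`. Credentials created on the host with `gcloud auth application-default login` are not visible inside the pod. Here is a hedged sketch of creating that Secret from a service account key file. The function name and the `KUBECTL` override are illustrative assumptions, the key file path is a placeholder for wherever your key lives, and the linked install guide remains the authoritative procedure.

```shell
# Create the Secret that the controller mounts at /var/secrets/google.
# $1: path to a GCP service account key file (placeholder, supply your own)
# KUBECTL may be overridden, e.g. KUBECTL="microk8s kubectl" on microk8s.
create_gcp_key_secret() {
  key_file=$1
  # The data key must be "key.json" to match GOOGLE_APPLICATION_CREDENTIALS.
  ${KUBECTL:-kubectl} create secret generic google-cloud-key \
    --namespace cloud-run-events \
    --from-file=key.json="$key_file"
}
```

After the Secret exists, deleting the controller pod lets its Deployment start a fresh one that mounts the key.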
This issue is stale because it has been open for 90 days with no activity. It will automatically close after 30 more days of inactivity. Reopen the issue with /reopen. Mark the issue as fresh by adding the comment /remove-lifecycle stale.