kubernetes-sigs/aws-encryption-provider

minikube kubernetes api-server crashloop with aws-encryption-provider configured for AWS KMS.

tapanhalani opened this issue · 7 comments

I am trying to use https://github.com/kubernetes-sigs/aws-encryption-provider on my local minikube cluster (Kubernetes 1.16.2). After following the steps listed at that link, I get the following error when restarting the minikube cluster:

==> kube-apiserver ["0d328d261494"] <==
I1125 13:30:12.775356 1 cache.go:32] Waiting for caches to sync for AvailableConditionController controller
I1125 13:30:12.775406 1 autoregister_controller.go:140] Starting autoregister controller
I1125 13:30:12.775411 1 cache.go:32] Waiting for caches to sync for autoregister controller
I1125 13:30:12.785198 1 controller.go:85] Starting OpenAPI controller
I1125 13:30:12.785241 1 customresource_discovery_controller.go:208] Starting DiscoveryController
I1125 13:30:12.785311 1 naming_controller.go:288] Starting NamingConditionController
I1125 13:30:12.785507 1 establishing_controller.go:73] Starting EstablishingController
I1125 13:30:12.785794 1 nonstructuralschema_controller.go:191] Starting NonStructuralSchemaConditionController
I1125 13:30:12.785891 1 apiapproval_controller.go:185] Starting KubernetesAPIApprovalPolicyConformantConditionController
E1125 13:30:12.797101 1 controller.go:154] Unable to remove old endpoints from kubernetes service: StorageError: key not found, Code: 1, Key: /registry/masterleases/192.168.99.102, ResourceVersion: 0, AdditionalErrorMsg:
I1125 13:30:13.010620 1 crdregistration_controller.go:111] Starting crd-autoregister controller
I1125 13:30:13.010914 1 shared_informer.go:197] Waiting for caches to sync for crd-autoregister
I1125 13:30:13.010995 1 shared_informer.go:204] Caches are synced for crd-autoregister
I1125 13:30:13.025833 1 controller.go:606] quota admission added evaluator for: leases.coordination.k8s.io
I1125 13:30:13.081635 1 cache.go:39] Caches are synced for autoregister controller
I1125 13:30:13.082063 1 cache.go:39] Caches are synced for APIServiceRegistrationController controller
I1125 13:30:13.098277 1 cache.go:39] Caches are synced for AvailableConditionController controller
I1125 13:30:13.772769 1 controller.go:107] OpenAPI AggregationController: Processing item
I1125 13:30:13.772793 1 controller.go:130] OpenAPI AggregationController: action for item : Nothing (removed from the queue).
I1125 13:30:13.772803 1 controller.go:130] OpenAPI AggregationController: action for item k8s_internal_local_delegation_chain_0000000000: Nothing (removed from the queue).
I1125 13:30:13.783102 1 storage_scheduling.go:148] all system priority classes are created successfully or already exist.
E1125 13:30:13.808370 1 grpc_service.go:71] failed to create connection to unix socket: /var/run/kmsplugin/socket.sock, error: dial unix /var/run/kmsplugin/socket.sock: connect: no such file or directory
W1125 13:30:13.808653 1 clientconn.go:1120] grpc: addrConn.createTransport failed to connect to {/var/run/kmsplugin/socket.sock 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial unix /var/run/kmsplugin/socket.sock: connect: no such file or directory". Reconnecting...
E1125 13:30:13.995827 1 grpc_service.go:71] failed to create connection to unix socket: /var/run/kmsplugin/socket.sock, error: dial unix /var/run/kmsplugin/socket.sock: connect: no such file or directory

Because of this, the api-server crashes continuously, causing the cluster to fail. However, I have already deployed the kms-provider pod and verified that /var/run/kmsplugin/socket.sock exists on the minikube host:

```
ls -la /var/run/kmsplugin/

total 0
drwxr-xr-x 2 root root 60 Nov 25 13:15 .
drwxr-xr-x 18 root root 600 Nov 25 13:15 ..
srwxr-xr-x 1 root root 0 Nov 25 13:15 socket.sock
```

It would be extremely helpful to understand what I might be doing wrong. Any help is highly appreciated. Thanks.

I believe you might also need to mount the socket directory into the api-server with extraVolumes:

apiVersion: kubeadm.k8s.io/v1beta1
kind: ClusterConfiguration
clusterName: "{{ cluster_name }}"
kubernetesVersion: {{ kube_version | regex_replace('-.*$', '') }}
apiServer:
  extraVolumes:
  - name: config
    hostPath: /etc/kubernetes/enc-config.yaml
    mountPath: /etc/kubernetes/enc-config.yaml
  - name: kmsplugin
    hostPath: /var/run/kmsplugin
    mountPath: /var/run/kmsplugin
  extraArgs:
    encryption-provider-config: /etc/kubernetes/enc-config.yaml

With this configuration I don't get the following error:

grpc_service.go:71] failed to create connection to unix socket: /var/run/kmsplugin/socket.sock, error: dial unix /var/run/kmsplugin/socket.sock: connect: no such file or directory
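
The error above ends in "no such file or directory", which is what a client sees when the socket path does not exist in its own filesystem view (for example, inside the api-server container when the host directory is not mounted). A different failure, "connection refused", means the path exists but nothing is listening on it. As a rough illustration only, the following Python sketch tells the two cases apart; the function name `probe_unix_socket` and its return strings are hypothetical and are not part of aws-encryption-provider or Kubernetes.

```python
import errno
import socket


def probe_unix_socket(path: str, timeout: float = 1.0) -> str:
    """Try to connect to a Unix domain socket and classify the outcome.

    Hypothetical helper for illustration; the return values are made up here.
    """
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(path)
    except FileNotFoundError:
        # ENOENT: the path is not visible from where this process runs.
        return "missing"
    except ConnectionRefusedError:
        # ECONNREFUSED: the socket file exists, but no process is listening.
        return "refused"
    except OSError as exc:
        # Anything else (permissions, timeout, ...) is reported by errno name.
        return f"error: {errno.errorcode.get(exc.errno, exc.errno)}"
    finally:
        client.close()
    return "ok"
```

Running `probe_unix_socket("/var/run/kmsplugin/socket.sock")` from the same environment as the api-server (not just on the host) would show which of the two situations applies.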

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@fejta-bot: Closing this issue.

Hi,
I am trying to configure aws-encryption-provider with K3s (single-node cluster) on my Ubuntu machine. I've provided the encryption configuration to kube-apiserver, but the k3s server itself does not start and keeps failing with the error below.

"failed to create connection to unix socket: /var/run/kmsplugin/socket.sock, error: dial unix /var/run/kmsplugin/socket.sock: connect: connection refused"

I'm just trying to understand the sequence:

Does kube-apiserver expect aws-encryption-provider (pod/service) to be up and listening on the Unix domain socket? And if it is not running, will kube-apiserver fail to start?

Since I am using a single-node cluster and kube-apiserver depends on the aws-encryption-provider pod, I am trying to understand how to fix this: until the k3s single-node cluster is up, I can't run aws-encryption-provider. The only other option I can think of is running aws-encryption-provider as a Docker container?

Please let me know how you fixed it on minikube.
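
One way to make the ordering question concrete: if the provider is run outside the cluster (for example as a standalone container, as suggested above), a wrapper could wait until the socket accepts connections before starting the server. The sketch below is only an illustration under that assumption; `wait_for_unix_socket` is a hypothetical helper, and it says nothing about how kube-apiserver or K3s behave internally.

```python
import socket
import time


def wait_for_unix_socket(path: str, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until something accepts connections on `path`, or give up.

    Hypothetical helper for illustration. Returns True once a connection
    succeeds, False if `timeout` seconds pass without one.
    """
    deadline = time.monotonic() + timeout
    while True:
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.settimeout(interval)
        try:
            client.connect(path)
            return True
        except OSError:
            # Covers "no such file or directory" and "connection refused".
            pass
        finally:
            client.close()
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A startup script could call `wait_for_unix_socket("/var/run/kmsplugin/socket.sock")` and launch the server only when it returns True.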

@amitkatyal - Were you able to fix this issue?