NetApp/trident

FATA Install pre-checks failed; could not initialize Kubernetes client; unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined. Resolve the issue and try again

rabin-io opened this issue · 3 comments

Describe the bug

Seems to be the same as reported in #627.
In my case, I'm experiencing this issue when running:

./tridentctl install -n trident --kubeconfig /cluster/auth/kubeconfig --debug

DEBU Initialized logging.                          logLevel=debug
DEBU Trident image: netapp/trident:23.10.0        
DEBU Autosupport image: docker.io/netapp/trident-autosupport:23.10 
DEBU Creating in-cluster Kubernetes clients.       requestID=f93e4f49-34e0-4ea4-947c-d7e550604506 requestSource=Unknown workflow="k8s_client=trace_factory"
FATA Install pre-checks failed; could not initialize Kubernetes client; unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined. Resolve the issue and try again.

I run this from inside a pod that serves as a Jenkins worker. At first, I thought it was because we have all the KUBERNETES_* variables set to empty strings, but even when I unset them I get this error.

When I copy the kubeconfig locally and run it from my machine, it works without any extra effort.

Can this issue be reopened, or should I create a new one?

Environment
Provide accurate information about the environment to help us reproduce the issue.

  • Trident version: [e.g. 19.10]
  • Trident installation flags used: [e.g. -d -n trident --use-custom-yaml]
  • Container runtime: [e.g. Docker 19.03.1-CE]
  • Kubernetes version: [e.g. 1.15.1]
  • Kubernetes orchestrator: [e.g. OpenShift v3.11, Rancher v2.3.3]
  • Kubernetes enabled feature gates: [e.g. CSINodeInfo]
  • OS: [e.g. RHEL 7.6, Ubuntu 16.04]
  • NetApp backend types: [e.g. CVS for AWS, ONTAP AFF 9.5, HCI 1.7]
  • Other:

To Reproduce
In my case, just run it from our CI, which runs it inside an OpenShift pod.

Expected behavior
Should install Trident.

Additional context
Works from my local machine :)

After exploring the code a bit, it seems to be related to this section of the code: https://github.com/NetApp/trident/blob/master/cli/k8s_client/client_factory.go#L70-L82

	inK8SPod := true
	if _, err := os.Stat(config.NamespaceFile); os.IsNotExist(err) {
		inK8SPod = false
	}

	// Get the API config based on whether we are running in or out of cluster
	if !inK8SPod {
		Logc(ctx).Debug("Creating ex-cluster Kubernetes clients.")
		clients, err = createK8SClientsExCluster(ctx, masterURL, kubeConfigPath, overrideNamespace)
	} else {
		Logc(ctx).Debug("Creating in-cluster Kubernetes clients.")
		clients, err = createK8SClientsInCluster(ctx, overrideNamespace)
	}

And I don't see anything that allows bypassing this check. We were able to mitigate the issue on our side by running the pods with automountServiceAccountToken: false (without the mounted service account, the namespace file does not exist, so the ex-cluster path is taken).
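
For context, the error text in the FATA line is client-go's rest.ErrNotInCluster, which rest.InClusterConfig() returns whenever KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT are empty or unset. Below is a minimal sketch using plain client-go (not Trident code; the namespace-file path is just the standard service-account mount) that mirrors the branch logic above and shows why the in-cluster path fails even though a kubeconfig is available:

	// Sketch only: mirrors the detection above. The presence of the mounted
	// service-account namespace file selects the in-cluster path, and
	// rest.InClusterConfig() then fails with rest.ErrNotInCluster when
	// KUBERNETES_SERVICE_HOST/PORT are not set.
	package main

	import (
		"fmt"
		"os"

		"k8s.io/client-go/rest"
		"k8s.io/client-go/tools/clientcmd"
	)

	func main() {
		// Standard location of the service-account namespace file in a pod.
		const namespaceFile = "/var/run/secrets/kubernetes.io/serviceaccount/namespace"

		if _, err := os.Stat(namespaceFile); err == nil {
			// In-cluster path: a --kubeconfig flag is never consulted here.
			if _, err := rest.InClusterConfig(); err != nil {
				// With empty KUBERNETES_* vars this prints the same
				// "unable to load in-cluster configuration ..." message.
				fmt.Println("in-cluster config failed:", err)
			}
			return
		}

		// Ex-cluster path: build the config from a kubeconfig file instead.
		if _, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG")); err != nil {
			fmt.Println("ex-cluster config failed:", err)
		}
	}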

I don't know the exact details of that Jenkins pipeline, but wouldn't it be easier to deploy Trident without tridentctl, e.g. with Helm or plain YAML manifests?

Maybe, but why does the tool look for the KUBERNETES_* vars when I manually set the kubeconfig on the CLI, or even when KUBECONFIG is set as an environment variable?
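
To make the point concrete, something along these lines would do it. This is a hypothetical variant of the fragment quoted above, not the actual Trident code: take the ex-cluster path whenever a kubeconfig is supplied explicitly, even when the namespace file exists.

	// Hypothetical sketch, not the actual Trident code: prefer an explicitly
	// supplied kubeconfig (flag or KUBECONFIG env var) over the in-cluster
	// service-account detection.
	inK8SPod := true
	if _, err := os.Stat(config.NamespaceFile); os.IsNotExist(err) {
		inK8SPod = false
	}

	explicitKubeconfig := kubeConfigPath != "" || os.Getenv("KUBECONFIG") != ""

	// Get the API config based on whether we are running in or out of cluster
	if !inK8SPod || explicitKubeconfig {
		Logc(ctx).Debug("Creating ex-cluster Kubernetes clients.")
		clients, err = createK8SClientsExCluster(ctx, masterURL, kubeConfigPath, overrideNamespace)
	} else {
		Logc(ctx).Debug("Creating in-cluster Kubernetes clients.")
		clients, err = createK8SClientsInCluster(ctx, overrideNamespace)
	}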