NetApp/trident

FATA Install pre-checks failed; could not initialize Kubernetes client; unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined. Resolve the issue and try again

rabin-io opened this issue · 3 comments

Describe the bug

Seems to be the same as reported in #627.
In my case, I'm experiencing this issue when running:

./tridentctl install -n trident --kubeconfig /cluster/auth/kubeconfig --debug

DEBU Initialized logging.                          logLevel=debug
DEBU Trident image: netapp/trident:23.10.0        
DEBU Autosupport image: docker.io/netapp/trident-autosupport:23.10 
DEBU Creating in-cluster Kubernetes clients.       requestID=f93e4f49-34e0-4ea4-947c-d7e550604506 requestSource=Unknown workflow="k8s_client=trace_factory"
FATA Install pre-checks failed; could not initialize Kubernetes client; unable to load in-cluster configuration, KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT must be defined. Resolve the issue and try again.

I run this from inside a pod that serves as a Jenkins worker. At first, I thought it was because we have all the KUBERNETES_* variables set to empty strings, but even when I unset them I get this error.

When I copy the kubeconfig locally and run it from my machine, it works without any extra effort.

Can this issue be reopened, or should I create a new one?

Environment
Provide accurate information about the environment to help us reproduce the issue.

  • Trident version: [e.g. 19.10]
  • Trident installation flags used: [e.g. -d -n trident --use-custom-yaml]
  • Container runtime: [e.g. Docker 19.03.1-CE]
  • Kubernetes version: [e.g. 1.15.1]
  • Kubernetes orchestrator: [e.g. OpenShift v3.11, Rancher v2.3.3]
  • Kubernetes enabled feature gates: [e.g. CSINodeInfo]
  • OS: [e.g. RHEL 7.6, Ubuntu 16.04]
  • NetApp backend types: [e.g. CVS for AWS, ONTAP AFF 9.5, HCI 1.7]
  • Other:

To Reproduce
In my case, just run it from our CI, which runs it inside an OpenShift pod.

Expected behavior
Should install Trident.

Additional context
Works from my local machine :)

After exploring the code a bit, it seems to be related to this section of the code: https://github.com/NetApp/trident/blob/master/cli/k8s_client/client_factory.go#L70-L82

	inK8SPod := true
	if _, err := os.Stat(config.NamespaceFile); os.IsNotExist(err) {
		inK8SPod = false
	}

	// Get the API config based on whether we are running in or out of cluster
	if !inK8SPod {
		Logc(ctx).Debug("Creating ex-cluster Kubernetes clients.")
		clients, err = createK8SClientsExCluster(ctx, masterURL, kubeConfigPath, overrideNamespace)
	} else {
		Logc(ctx).Debug("Creating in-cluster Kubernetes clients.")
		clients, err = createK8SClientsInCluster(ctx, overrideNamespace)
	}

And I don't see anything that allows bypassing this check. We were able to mitigate the issue on our side by running the pods with automountServiceAccountToken: false (without the mounted service account, the namespace file does not exist, so the ex-cluster path is taken).
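
For context, the error text in the FATA line is client-go's rest.ErrNotInCluster, which rest.InClusterConfig() returns whenever KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT are empty or unset. Below is a minimal sketch using plain client-go (not Trident code; the namespace-file path is just the standard service-account mount) that mirrors the branch logic above and shows why the in-cluster path fails even though a kubeconfig is available:

	// Sketch only: mirrors the detection above. The presence of the mounted
	// service-account namespace file selects the in-cluster path, and
	// rest.InClusterConfig() then fails with rest.ErrNotInCluster when
	// KUBERNETES_SERVICE_HOST/PORT are not set.
	package main

	import (
		"fmt"
		"os"

		"k8s.io/client-go/rest"
		"k8s.io/client-go/tools/clientcmd"
	)

	func main() {
		// Standard location of the service-account namespace file in a pod.
		const namespaceFile = "/var/run/secrets/kubernetes.io/serviceaccount/namespace"

		if _, err := os.Stat(namespaceFile); err == nil {
			// In-cluster path: a --kubeconfig flag is never consulted here.
			if _, err := rest.InClusterConfig(); err != nil {
				// With empty KUBERNETES_* vars this prints the same
				// "unable to load in-cluster configuration ..." message.
				fmt.Println("in-cluster config failed:", err)
			}
			return
		}

		// Ex-cluster path: build the config from a kubeconfig file instead.
		if _, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG")); err != nil {
			fmt.Println("ex-cluster config failed:", err)
		}
	}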

I don't know the exact details of that Jenkins pipeline, but wouldn't it be easier to deploy Trident without tridentctl, e.g. with Helm or plain YAML manifests?

Maybe, but why does the tool look for the KUBERNETES_* vars when I manually set the kubeconfig on the CLI, or even when KUBECONFIG is set as an environment variable?
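
To make the point concrete, something along these lines would do it. This is a hypothetical variant of the fragment quoted above, not the actual Trident code: take the ex-cluster path whenever a kubeconfig is supplied explicitly, even when the namespace file exists.

	// Hypothetical sketch, not the actual Trident code: prefer an explicitly
	// supplied kubeconfig (flag or KUBECONFIG env var) over the in-cluster
	// service-account detection.
	inK8SPod := true
	if _, err := os.Stat(config.NamespaceFile); os.IsNotExist(err) {
		inK8SPod = false
	}

	explicitKubeconfig := kubeConfigPath != "" || os.Getenv("KUBECONFIG") != ""

	// Get the API config based on whether we are running in or out of cluster
	if !inK8SPod || explicitKubeconfig {
		Logc(ctx).Debug("Creating ex-cluster Kubernetes clients.")
		clients, err = createK8SClientsExCluster(ctx, masterURL, kubeConfigPath, overrideNamespace)
	} else {
		Logc(ctx).Debug("Creating in-cluster Kubernetes clients.")
		clients, err = createK8SClientsInCluster(ctx, overrideNamespace)
	}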