apache-spark-on-k8s/spark

spark-submit ignoring : spark.kubernetes.authenticate.driver.serviceAccountName

purpletech77 opened this issue · 7 comments

~/spark-2.3.0-bin-hadoop2.7 # bin/spark-submit \
>     --master k8s://kubecluster:443 \
>     --deploy-mode cluster \
>     --name spark-pi \
>     --conf spark.kubernetes.test.serviceAccountName=default:spark \
>     --class org.apache.spark.examples.SparkPi \
>     --conf spark.executor.instances=5 \
>     --conf spark.kubernetes.driver.container.image=registry/spark-driver:latest \
>     --conf spark.kubernetes.executor.container.image=registry/spark-executor:latest \
>     local:///home/user/spark-2.2.0-k8s-0.5.0-bin-2.7.3/examples/jars/spark-examples_2.11-2.2.0-k8s-0.5.0.jar

2018-09-08 09:09:06 WARN  WatchConnectionManager:185 - Exec Failure: HTTP 403, Status: 403 - pods "spark-pi-c75ce20fff8539b3969926697eb6a78c-driver" is forbidden: User "system:anonymous" cannot watch pods in the namespace "default"
java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
	at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:216)
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:183)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: pods "spark-pi-c75ce20fff8539b3969926697eb6a78c-driver" is forbidden: User "system:anonymous" cannot watch pods in the namespace "default"
	at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$2.onFailure(WatchConnectionManager.java:188)
	at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:543)
	at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:185)
	at okhttp3.RealCall$AsyncCall.execute(RealCall.java:141)
	at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:748)
2018-09-08 09:09:06 INFO  ShutdownHookManager:54 - Shutdown hook called
2018-09-08 09:09:06 INFO  ShutdownHookManager:54 - Deleting directory /tmp/spark-fa5576a8-d883-4371-98de-69c2640991e5

FYI - Spark on K8S has been merged upstream and is now being maintained as part of Apache Spark so issues should be reported on https://issues.apache.org/jira/


This is expected behaviour, though I don't believe it is well documented.

The service account is only used for the driver and executor pods. However, the submission client, i.e. the local code running where you invoke spark-submit, uses your own K8S config to monitor the ongoing progress of the driver, and therefore needs sufficient permissions to do so.
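To illustrate the first half of that split: the driver needs a service account with permission to create and watch executor pods. A minimal sketch, assuming the namespace `default` and service account name `spark` from the command above (the binding name `spark-role` is an arbitrary choice, not something from this thread):

```shell
# Create the service account the driver pod will run as
# (matches the "default:spark" value used in the spark-submit command above).
kubectl create serviceaccount spark --namespace=default

# Grant it the built-in "edit" ClusterRole in that namespace so the driver
# can create, watch, and delete executor pods.
kubectl create clusterrolebinding spark-role \
  --clusterrole=edit \
  --serviceaccount=default:spark \
  --namespace=default
```

Note this only covers the in-cluster side; the 403 in the log above comes from the submission client side, which uses your local kubeconfig instead.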

@purpletech77 ,
How did you get this issue solved in the end? I came across the same problem and have no clue.

@waynegj As I tried to explain, some monitoring of the ongoing progress of the driver pod happens on the submission client, i.e. the machine where you run spark-submit. This uses your personal K8S config (typically ~/.kube/config or the file specified by the KUBECONFIG environment variable), so if the configured context there doesn't have the correct permissions, the job monitoring will fail.

So the solution is to ensure that your local K8S config has appropriate credentials to launch and monitor pods. How you obtain these credentials is a detail of your specific K8S cluster.
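A quick way to verify this before resubmitting is `kubectl auth can-i`, which evaluates whether the currently configured kubeconfig context is allowed to perform an action. A sketch, assuming the `default` namespace from the error message above:

```shell
# Does the local kubeconfig context have the permissions the submission
# client needs? The 403 above is a failed "watch pods" call, so check that
# verb in particular.
kubectl auth can-i watch pods --namespace=default
kubectl auth can-i create pods --namespace=default

# Confirm which context and user these checks ran against.
kubectl config current-context
```

If either check prints `no`, the submission client will hit exactly the `User "system:anonymous" cannot watch pods` failure shown in the log, regardless of how the driver's service account is configured.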

@rvesse ,
Thanks for sharing. After some digging I found this might be related to support for aws-iam-authenticator in io.fabric8.kubernetes-client (as addressed in fabric8io/kubernetes-client#1224). The same error occurred on both Spark 2.3.1 and 2.3.2 even when I had configured everything correctly. Can you shed some light on how to determine which fabric8 kubernetes-client version is used by spark-submit?

@waynegj Well, that PR is very new, and neither Spark 2.3.1 nor 2.3.2 has a version of the fabric8 client remotely new enough to incorporate that change.
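For reference, one way to answer the version question yourself: Spark distributions built with Kubernetes support bundle the fabric8 client jar under `jars/`, so its version is visible in the filename. A sketch, assuming `SPARK_HOME` points at an extracted release:

```shell
# The fabric8 kubernetes-client jar shipped with the distribution carries
# its version in the filename, e.g. kubernetes-client-3.0.0.jar.
ls "$SPARK_HOME/jars" | grep kubernetes-client
```

You can then compare that version against the release the upstream PR landed in to see whether a given Spark build could possibly include the fix.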

This whole spark-submit Kubernetes integration is horribly busted. Ten thousand lines of code to deploy a single driver container, and it can't even log what went wrong.

This repository is no longer used for tracking issues related to running Spark on Kubernetes. Please use the official Apache Spark JIRA project to report issues. Also, this project is no longer the way to use this feature - the official Spark releases from upstream are.

Please move discussions to the official Apache channels. Thanks!