The given volume is not mounted to the driver and executor
JWDobken commented
I have installed the spark-operator with:
helm upgrade --install spark-operator \
  spark-operator/spark-operator \
  --namespace spark-operator \
  --values ./spark-operator/values.yaml \
  --create-namespace
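As a quick sanity check that the operator came up (a sketch; the deployment name is assumed to follow the release name):

kubectl get pods -n spark-operator
helm status spark-operator --namespace spark-operator
kubectl logs -n spark-operator deploy/spark-operator --tail=20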
Where the values file is:
serviceAccounts:
  spark:
    create: true
    name: "spark"
  sparkoperator:
    create: true
    name: "spark-operator-spark"
webhook:
  enabled: true
sparkJobNamespace: default
- webhook enabled
- job namespace set to default
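To double-check the key names the chart actually expects, something like the following can dump the chart's default values (the grep is just a convenience for locating the webhook block):

helm show values spark-operator/spark-operator | grep -n -A 1 webhook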
Then, following this procedure, I created the following Job and Service:
apiVersion: batch/v1
kind: Job
metadata:
  name: sparkoperator-init
  namespace: spark-operator
  labels:
    app.kubernetes.io/name: sparkoperator
    app.kubernetes.io/version: v1beta2-1.3.0-3.1.1
spec:
  backoffLimit: 3
  template:
    metadata:
      labels:
        app.kubernetes.io/name: sparkoperator
        app.kubernetes.io/version: v1beta2-1.3.0-3.1.1
    spec:
      serviceAccountName: spark-operator-spark
      restartPolicy: Never
      containers:
        - name: main
          image: ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.8-3.1.1
          imagePullPolicy: IfNotPresent
          command: ["/usr/bin/gencerts.sh", "-p"]
---
kind: Service
apiVersion: v1
metadata:
  name: spark-webhook
  namespace: spark-operator
spec:
  ports:
    - port: 443
      targetPort: 8080
      name: webhook
  selector:
    app.kubernetes.io/name: sparkoperator
    app.kubernetes.io/version: v1beta2-1.3.0-3.1.1
The webhook Service is created and the Job completes successfully.
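For reference, a few checks along the way (the secret name is an assumption about what gencerts.sh creates by default):

kubectl get job sparkoperator-init -n spark-operator
kubectl get secret spark-webhook-certs -n spark-operator
kubectl get svc spark-webhook -n spark-operator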
Here is the SparkApplication with the mounted volume:
apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
name: pyspark-job
namespace: default
spec:
type: Python
pythonVersion: "3"
mode: cluster
image: "ghcr.io/apache/spark-docker/spark:3.5.0-python3"
imagePullPolicy: IfNotPresent
mainApplicationFile: local:///mnt/mydata/spark/applications/jobs/pyspark_job.py
sparkVersion: "3.5.0"
restartPolicy:
type: Never
volumes:
- name: my-data
persistentVolumeClaim:
claimName: pvc-smb
driver:
cores: 1
coreLimit: "1200m"
memory: "512m"
labels:
version: 3.1.1
serviceAccount: spark
volumeMounts:
- name: my-data
mountPath: /mnt
executor:
cores: 1
instances: 1
memory: "512m"
labels:
version: 3.1.1
volumeMounts:
- name: my-data
mountPath: /mnt
The SparkApplication is created and a Job is launched, but the volume is not mounted and the job fails.
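Because the volumes and volumeMounts fields are injected into the pods by the operator's mutating webhook, a useful debugging sketch is to check whether the driver pod spec ever received the volume and whether the webhook registered at all (the pod and deployment names below are assumptions based on the app and release names):

kubectl describe sparkapplication pyspark-job -n default
kubectl get pod pyspark-job-driver -n default -o jsonpath='{.spec.volumes}'
kubectl get mutatingwebhookconfigurations
kubectl logs -n spark-operator deploy/spark-operator | grep -i webhook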
JWDobken commented
ok... my bad. The webhook key in the values file should be enable, not enabled:

webhook:
  enable: true
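For anyone hitting the same thing: after correcting that key and re-running the same helm upgrade, re-submitting the application should get the volume mounted (the manifest filename below is just a placeholder):

helm upgrade --install spark-operator \
  spark-operator/spark-operator \
  --namespace spark-operator \
  --values ./spark-operator/values.yaml

kubectl delete sparkapplication pyspark-job -n default
kubectl apply -f pyspark-job.yaml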