kubeflow/spark-operator

The given volume is not mounted on the driver and executor pods


I have installed the spark-operator with:

helm upgrade --install spark-operator \
    spark-operator/spark-operator \
    --namespace spark-operator \
    --values ./spark-operator/values.yaml \
    --create-namespace
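
As a sanity check, it helps to confirm the release actually rolled out before going further; for example (using the namespace from the command above):

kubectl -n spark-operator get pods
helm status spark-operator -n spark-operator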

Where the values file is:

serviceAccounts:
  spark:
    create: true
    name: "spark"
  sparkoperator:
    create: true
    name: "spark-operator-spark"
webhook:
  enabled: true
sparkJobNamespace: default

That is:
  • webhook enabled
  • job namespace set to default
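
To double-check which values the chart actually received, the user-supplied values of the release can be inspected, e.g.:

helm get values spark-operator -n spark-operator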

Then, following this procedure, I created the following Job and Service:

apiVersion: batch/v1
kind: Job
metadata:
  name: sparkoperator-init
  namespace: spark-operator
  labels:
    app.kubernetes.io/name: sparkoperator
    app.kubernetes.io/version: v1beta2-1.3.0-3.1.1
spec:
  backoffLimit: 3
  template:
    metadata:
      labels:
        app.kubernetes.io/name: sparkoperator
        app.kubernetes.io/version: v1beta2-1.3.0-3.1.1
    spec:
      serviceAccountName: spark-operator-spark
      restartPolicy: Never
      containers:
        - name: main
          image: ghcr.io/googlecloudplatform/spark-operator:v1beta2-1.3.8-3.1.1
          imagePullPolicy: IfNotPresent
          command: ["/usr/bin/gencerts.sh", "-p"]
---
kind: Service
apiVersion: v1
metadata:
  name: spark-webhook
  namespace: spark-operator
spec:
  ports:
    - port: 443
      targetPort: 8080
      name: webhook
  selector:
    app.kubernetes.io/name: sparkoperator
    app.kubernetes.io/version: v1beta2-1.3.0-3.1.1

The webhook Service is created and the Job completes successfully.
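
For reference, this is roughly how I verified both of them (names as in the manifests above):

kubectl -n spark-operator get job sparkoperator-init
kubectl -n spark-operator get svc spark-webhook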

Here is the SparkApplication with the mounted volume:

apiVersion: "sparkoperator.k8s.io/v1beta2"
kind: SparkApplication
metadata:
  name: pyspark-job
  namespace: default
spec:
  type: Python
  pythonVersion: "3"
  mode: cluster
  image: "ghcr.io/apache/spark-docker/spark:3.5.0-python3"
  imagePullPolicy: IfNotPresent
  mainApplicationFile: local:///mnt/mydata/spark/applications/jobs/pyspark_job.py
  sparkVersion: "3.5.0"
  restartPolicy:
    type: Never
  volumes:
    - name: my-data
      persistentVolumeClaim:
        claimName: pvc-smb
  driver:
    cores: 1
    coreLimit: "1200m"
    memory: "512m"
    labels:
      version: 3.1.1
    serviceAccount: spark
    volumeMounts:
      - name: my-data
        mountPath: /mnt
  executor:
    cores: 1
    instances: 1
    memory: "512m"
    labels:
      version: 3.1.1
    volumeMounts:
      - name: my-data
        mountPath: /mnt

The SparkApplication is created and a Job is launched, but the volume is not mounted and the job fails.
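
One way to confirm that nothing was injected is to look at the driver pod spec directly; assuming the operator names the driver pod <app-name>-driver, something like:

kubectl -n default get pod pyspark-job-driver -o jsonpath='{.spec.volumes}'
kubectl -n default describe sparkapplication pyspark-job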

Ok... my bad. The chart key is enable, not enabled, so with my original values.yaml the webhook was never actually enabled and the volumes were silently ignored. The correct setting is:

webhook:
  enable: true
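
After correcting the key, re-running the same helm upgrade --install command and checking that a mutating webhook configuration now exists should be enough; with the webhook active, the volumeMounts above get injected into the driver and executor pods. A rough sketch:

helm upgrade --install spark-operator spark-operator/spark-operator --namespace spark-operator --values ./spark-operator/values.yaml
kubectl get mutatingwebhookconfigurations | grep -i spark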