Serverless Spark on GCP

This repository contains a hands-on lab, organized into multiple modules, that covers serverless Spark on GCP powered by Cloud Dataproc.

Audience

The intended audience is Google Customer Engineers, but anyone with access to GCP can try the lab modules.

Prerequisites

The lab covers setup in Argolis, so Argolis enablement is a prerequisite.

Goal

By the end of the lab, you should have (a) just enough knowledge of serverless Spark on GCP powered by Cloud Dataproc to field customer conversations and questions, (b) a completed setup in Argolis for serverless Spark, (c) basic quickstart demos and the knowledge of how to run them, and (d) knowledge of resources for serverless Spark on GCP.

What is covered?

Currently, the labs are very basic and are meant as a quick start to serverless Spark on GCP. More advanced labs will be added over time.

| # | Sub-Modules |
| --- | --- |
| 1 | Foundational setup in Argolis |
| 2 | Serverless Spark in BigQuery UI |
| 3 | Serverless Spark Batch jobs |
| 4 | Serverless Spark in Dataplex |
| 5 | Serverless Spark in Vertex AI |
| 6 | Resources for Serverless Spark |

Get started

Start with module 1, the foundational setup in Argolis.

Don't forget to

Shut down/delete resources as needed.

Credits

This is a community effort by Google Cloud Data Analytics Specialist Engineers. Contributions are welcome.

| # | Contributor | Contribution | Team |
| --- | --- | --- | --- |
| 1 | Anagha Khanolkar | Author | North America Technology Team |
| 2 | Jay O' Leary | Testing | Sub-regional Technology Team |