Join this introductory session on using Snowpark ML to build an end-to-end prediction pipeline, from data ingestion through to model deployment and inference. We will walk through each step of ML development and demonstrate the capabilities Snowflake provides with Snowpark ML and its supporting MLOps features. The model aims to predict the winner of Euro 2024, a polarizing subject for Matteo (Italy supporter) and Simon (England supporter)!
- setup: Contains prerequisites for the session.
- dataset: Data to be uploaded to the Snowflake account before the session.
- images: Images displayed in the readme and setup scripts.
- notebooks_snowflake: 5 notebooks, compatible with Snowflake Notebooks, that cover the various steps and should be run sequentially.
- notebooks_hex: 5 notebooks, compatible with Hex Notebooks, that cover the various steps and should be run sequentially (with some minor edits they will also work with standard Jupyter notebooks).
This guide will help you complete all the prerequisites needed to follow the HOL session.
Estimated HOL Preparation Time: 10 mins
Before you begin, ensure you have the following:
- Access to the HOL GitHub Repository: Summit 2024 HOL Repository.
- An active Snowflake Trial Account in AWS US West 2
- Depending on which notebook environment you wish to use, either:
  - The above Snowflake Account with Snowflake Notebooks, or
  - A Hex Account.
- NOTE - We do not recommend having multiple users running this in the same Snowflake account.
- Clone / download the whole GitHub Repo locally. During the Setup, you'll need the dataset folder and the notebooks folder to finalize the HOL pre-work.
Execute the setup.sql script available in this repository on your account.
Make sure to push the dataset into the Snowflake stage before the on-site session, as the internet connection might be limited at the venue.
Below are the steps to push data via the UI (alternatively, you can use the PUT command via the command line):
- Navigate to Data -> DB: EURO2024 -> Schema: PUBLIC -> Stage: DATA.
- Use the warehouse created to list files on the stage.
- Click on the top right button to add new files.
- Select the dataset files you downloaded from this GitHub Repo and add them to the stage folder. Click the Upload button.
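For the command-line alternative, the sketch below generates one PUT statement per file in the local dataset folder, which could then be pasted into a SnowSQL session. It is only an illustration: the helper name `build_put_statements` is hypothetical, the stage path `@EURO2024.PUBLIC.DATA` is derived from the database, schema and stage named in the UI steps, and `AUTO_COMPRESS=FALSE` is an assumption intended to mirror a UI upload (files kept under their original names).

```python
from pathlib import Path

# Stage path assumed from the UI steps: database EURO2024, schema PUBLIC, stage DATA.
STAGE = "@EURO2024.PUBLIC.DATA"


def build_put_statements(dataset_dir, stage=STAGE):
    """Return one PUT statement per file found directly inside dataset_dir."""
    files = sorted(p for p in Path(dataset_dir).iterdir() if p.is_file())
    return [
        # The file URI is single-quoted so paths containing spaces still work.
        # AUTO_COMPRESS=FALSE (an assumption) keeps the original file names on the stage.
        f"PUT 'file://{p.resolve().as_posix()}' {stage} "
        "AUTO_COMPRESS=FALSE OVERWRITE=TRUE;"
        for p in files
    ]


if __name__ == "__main__":
    # "dataset" is the folder from the cloned repository; adjust the path if needed.
    if Path("dataset").is_dir():
        for statement in build_put_statements("dataset"):
            print(statement)
```

After running the generated statements, `LIST @EURO2024.PUBLIC.DATA;` should show the uploaded files.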
You can follow the steps described in the official docs.
For Hex Notebooks:
- Please follow these steps for importing notebooks.
For Snowflake Notebooks:
- Click on Projects -> Notebook.
- Import the notebooks located in the /notebooks folder by using the import button on the top right.
- As you import, select the database EURO2024, the schema PUBLIC and the warehouse EURO2024_WH created by the setup.sql script.
- For notebooks (3) and (5), once they are imported, add these packages by using the "Packages" dropdown on the top right:
  - snowflake-ml-python 1.5.0
  - fastparquet 2023.8.0
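As an optional sanity check, a notebook cell along the following lines could confirm that the pinned packages are present at the expected versions. This is a minimal sketch, not part of the HOL notebooks: the helper name `find_version_mismatches` is hypothetical, and it assumes the notebook environment exposes package metadata through Python's standard `importlib.metadata` module.

```python
from importlib import metadata

# Versions pinned in the instructions above.
REQUIRED = {"snowflake-ml-python": "1.5.0", "fastparquet": "2023.8.0"}


def find_version_mismatches(required=REQUIRED):
    """Map each package whose installed version differs from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in required.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            found = None  # the package is not installed in this environment
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    # Prints nothing when every pinned package matches.
    for name, (expected, found) in find_version_mismatches().items():
        print(f"{name}: expected {expected}, found {found}")
```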
Run Notebooks 1 through 5 in order and see who the predicted winner of Euro 2024 is ⚽🏆