Let's explore the Vertex AI Workbench as an alternative to Compute Engine for model training.
Vertex AI Workbench provides managed virtual machines that let you run ML code without having to configure the environment yourself:
- User-managed notebooks provide a customizable environment and allow you to specify package versions
- Managed notebooks use custom containers, can be extended to read or write to BigQuery or Cloud Storage, and can be scheduled to run at set times
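If you prefer the command line, a user-managed notebook instance can also be created with `gcloud` (a sketch; the image family, machine type, and zone below are assumptions, adjust them to your project):

```shell
# Sketch: create a user-managed notebook instance from the CLI
# (tf-ent-2-10-cpu, n1-standard-4 and europe-west1-b are illustrative values)
gcloud notebooks instances create cloud-training-recap \
  --vm-image-project=deeplearning-platform-release \
  --vm-image-family=tf-ent-2-10-cpu \
  --machine-type=n1-standard-4 \
  --location=europe-west1-b
```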
Create a workbench instance:
- Access the Vertex AI Workbench page
- At the top, select the USER-MANAGED NOTEBOOKS tab and click on the blue CREATE NOTEBOOK button further below
- Give your notebook the following name: cloud-training-recap
- In the Environment section, set the operating system to Ubuntu 20.04
- Still in this section, select TensorFlow Enterprise 2.10 (without GPU) as the environment
- Scroll down and click on CREATE
👉 The workbench should be ready in a couple of minutes
Open the virtual machine
- Click on OPEN JUPYTERLAB
- Install gh (the GitHub CLI) for Ubuntu
Install zsh:
sudo apt-get install zsh
Install oh-my-zsh:
sh -c "$(curl -fsSL https://raw.githubusercontent.com/ohmyzsh/ohmyzsh/master/tools/install.sh)"
Go to the workbench instance and open the Terminal.
Run the gh auth login command and answer the prompts:
- Account: GitHub.com
- Protocol: HTTPS
- Authenticate Git with your GitHub credentials: Yes
- Authentication method: Paste an authentication token
Create a GitHub token to allow the workbench to access your account:
- Access GitHub Tokens
- Click on Generate new token
- Fill in the Note field with a meaningful name, such as Vertex AI Workbench token
- Check that these scopes are enabled: 'repo', 'read:org', 'workflow'
- Click on Generate token
- Copy the token (you will not be able to retrieve it later)
Paste the token in the Vertex AI instance's Terminal
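If you would rather skip the interactive prompts entirely, gh can also read the token from a file (a sketch; `~/.github_token` is a hypothetical location where you have pasted your token):

```shell
# Hypothetical token file; paste your GitHub token into it first
chmod 600 ~/.github_token        # keep the token readable only by you
gh auth login --with-token < ~/.github_token
gh auth status                   # verify the authentication succeeded
```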
Clone your recap repo inside your Workbench VM using the Kitt-generated token provided at the top of the Kitt webpage.
# Create challenge folder
mkdir -p ~/code/lewagon/data-recap-cloud-training && cd $_
# Download challenge
curl -s -H "Authorization: Token <REPLACE_BY_KITT_MYRIAD_TOKEN>, User=$(gh api user --jq '.login')" "https://kitt.lewagon.com/camps/1002/challenges/setup_script?path=07-ML-Ops%2F02-Cloud-training%2FRecap" | bash
cp .env.sample .env
nano .env
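A note on the `cd $_` used above: in bash, `$_` expands to the last argument of the previous command, so the one-liner creates the folder and immediately enters it:

```shell
# $_ holds the last argument of the previous command (the new folder path)
mkdir -p /tmp/demo/nested && cd $_
pwd    # → /tmp/demo/nested
```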
Install direnv:
curl -sfL https://direnv.net/install.sh | bash
eval "$(direnv hook zsh)"
direnv allow .
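For reference, direnv reads a `.envrc` file at the root of the repo; if the repo loads `.env` through direnv, the `.envrc` is typically just the `dotenv` helper (a sketch, assuming the repo follows this convention):

```shell
# .envrc — picked up by direnv after `direnv allow`
dotenv   # direnv stdlib helper: exports every KEY=value pair from .env
```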
Install the package:
pip install -e .
make reset_local_files
Run the preprocessing and the training:
make run_preprocess run_train
Manually hook direnv:
eval "$(direnv hook zsh)"
Inside the notebook itself, direnv does not run (it hooks into shells, not Jupyter kernels), so the easiest solution is to manually define the environment variables from Python:
import os
os.environ["DATA_SIZE"] = "1k"
os.environ["MODEL_TARGET"] = "local"
os.environ["GCP_PROJECT_WAGON"] = wagon-public-datasets
os.environ["GCP_PROJECT"] = ... # Your personal GCP project for this bootcamp
os.environ["GCP_REGION"] = "europe-west1"
os.environ["CLOUD_STORAGE"] = "europe-west1"
os.environ["BQ_REGION"] = "EU"
os.environ["BQ_DATASET"] = "taxifare"
...
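The equivalent configuration from a terminal session (where plain `export` statements work) mirrors the Python snippet above; GCP_PROJECT is left for you to fill in with your own project ID:

```shell
export DATA_SIZE="1k"
export MODEL_TARGET="local"
export GCP_PROJECT_WAGON="wagon-public-datasets"
# export GCP_PROJECT="<your-gcp-project-id>"   # fill in your own project
export GCP_REGION="europe-west1"
export CLOUD_STORAGE="europe-west1"
export BQ_REGION="EU"
export BQ_DATASET="taxifare"
```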
In the Compute Engine console, you can see that Vertex AI Workbench uses a Compute Engine instance behind the scenes.
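You can confirm this from the terminal as well (assuming the instance name used earlier in this recap):

```shell
# The workbench VM shows up as a regular Compute Engine instance
gcloud compute instances list --filter="name~cloud-training-recap"
```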