/semantic-segmentation-ml-pipeline

Semantic Segmentation model within an ML pipeline

This repository shows how to build a Machine Learning pipeline for a semantic segmentation task with TensorFlow Extended (TFX) and various GCP products such as Vertex Pipelines, Vertex Training, and Vertex Endpoint. The ML pipeline also contains a few custom components integrated with the 🤗 Hub: HFModelPusher pushes the trained model to a 🤗 Model Repository, and HFSpacePusher creates a Gradio application from the latest model out of the box.
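
For a quick mental model of the pipeline, here is a minimal sketch of how the standard components could be wired together with TFX. It is not the exact definition used in training_pipeline; the create_pipeline signature is illustrative, and the constructor arguments of the custom HF pushers are assumptions (shown commented out).

from tfx import v1 as tfx

def create_pipeline(pipeline_name, pipeline_root, data_root, module_file, serving_dir):
    # Ingest pre-built TFRecord files holding (image, segmentation mask) pairs.
    example_gen = tfx.components.ImportExampleGen(input_base=data_root)

    # Train the U-NET based model defined in the module file.
    trainer = tfx.components.Trainer(
        module_file=module_file,
        examples=example_gen.outputs["examples"],
        train_args=tfx.proto.TrainArgs(num_steps=1000),
        eval_args=tfx.proto.EvalArgs(num_steps=100),
    )

    # Push the trained SavedModel to a filesystem/GCS destination.
    pusher = tfx.components.Pusher(
        model=trainer.outputs["model"],
        push_destination=tfx.proto.PushDestination(
            filesystem=tfx.proto.PushDestination.Filesystem(base_directory=serving_dir)
        ),
    )

    # Custom components (hypothetical signatures): push the model to the 🤗 Model
    # Repository and spin up a Gradio Space from the latest model.
    # hf_model_pusher = HFModelPusher(model=trainer.outputs["model"], ...)
    # hf_space_pusher = HFSpacePusher(model=trainer.outputs["model"], ...)

    return tfx.dsl.Pipeline(
        pipeline_name=pipeline_name,
        pipeline_root=pipeline_root,
        components=[example_gen, trainer, pusher],
    )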

NOTE: We use the U-NET based TensorFlow model from the official tutorial. Since our goal is to implement an ML pipeline, a U-NET-like model is a good starting point. Other SOTA models such as SegFormer from 🤗 Transformers will be explored later.
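
As a rough picture of what a U-NET-like model means here, the sketch below builds a small encoder-decoder with skip connections in Keras. It is a simplified stand-in, not the exact architecture from the tutorial or from this repository (layer widths, depth, and input shape are illustrative).

import tensorflow as tf

def build_unet_like(input_shape=(128, 128, 3), num_classes=3):
    # Encoder: downsample while widening channels, keeping skip connections.
    inputs = tf.keras.Input(shape=input_shape)
    x, skips = inputs, []
    for filters in (32, 64, 128):
        x = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        skips.append(x)
        x = tf.keras.layers.MaxPooling2D()(x)

    # Bottleneck.
    x = tf.keras.layers.Conv2D(256, 3, padding="same", activation="relu")(x)

    # Decoder: upsample and concatenate with the matching encoder feature map.
    for filters, skip in zip((128, 64, 32), reversed(skips)):
        x = tf.keras.layers.Conv2DTranspose(filters, 3, strides=2, padding="same")(x)
        x = tf.keras.layers.Concatenate()([x, skip])
        x = tf.keras.layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

    # One logit per class for every pixel.
    outputs = tf.keras.layers.Conv2D(num_classes, 1, padding="same")(x)
    return tf.keras.Model(inputs, outputs)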

NOTE: The aim of this project is not to serve a fully fine-tuned model. Our main focus is to demonstrate how to build an ML pipeline for a semantic segmentation task.

Instructions

The TFX pipeline is designed to run on both local and GCP environments.

On local environment

$ cd training_pipeline
$ tfx pipeline create --pipeline-path=local_runner.py \
                      --engine=local
$ tfx pipeline compile --pipeline-path=local_runner.py \
                       --engine=local
$ tfx run create --pipeline-name=segformer-training-pipeline \
                 --engine=local
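
Under the hood, local_runner.py essentially hands the pipeline definition to TFX's local orchestrator. The sketch below is only a hedged outline; the actual module layout and create_pipeline arguments in this repository may differ.

from tfx import v1 as tfx
# `create_pipeline` as sketched earlier; the import path is an assumption.
from pipeline.pipeline import create_pipeline

if __name__ == "__main__":
    tfx.orchestration.LocalDagRunner().run(
        create_pipeline(
            pipeline_name="segformer-training-pipeline",
            pipeline_root="./tfx_pipeline_output",
            data_root="./data",
            module_file="./models/model.py",
            serving_dir="./serving_model",
        )
    )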

On Vertex AI environment

There are two ways to run the TFX pipeline on the GCP environment (Vertex AI).

First, you can run it manually with the following CLIs. In this case, you should replace GOOGLE_CLOUD_PROJECT with your GCP project ID in training_pipeline/pipeline/configs.py beforehand.

$ cd training_pipeline
$ tfx pipeline create --pipeline-path=kubeflow_runner.py \
                      --engine=vertex
$ tfx pipeline compile --pipeline-path=kubeflow_runner.py \
                       --engine=vertex
$ tfx run create --pipeline-name=segformer-training-pipeline \
                 --engine=vertex \
                 --project=$GCP_PROJECT_ID \
                 --region=$GCP_REGION
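
The values referenced by these CLIs live in training_pipeline/pipeline/configs.py. The excerpt below is illustrative only: GOOGLE_CLOUD_PROJECT is the constant named in this README, while the other names and values are assumptions.

# training_pipeline/pipeline/configs.py (illustrative excerpt)
GOOGLE_CLOUD_PROJECT = "your-gcp-project-id"        # replaced manually or by the GitHub Action
GOOGLE_CLOUD_REGION = "us-central1"                 # keep in sync with --region in `tfx run create`
GCS_BUCKET_NAME = GOOGLE_CLOUD_PROJECT + "-bucket"  # assumed bucket naming scheme
PIPELINE_NAME = "segformer-training-pipeline"
PIPELINE_ROOT = f"gs://{GCS_BUCKET_NAME}/pipeline_root/{PIPELINE_NAME}"
DATA_PATH = f"gs://{GCS_BUCKET_NAME}/data/"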

Alternatively, you can use the workflow_dispatch feature of GitHub Actions. In this case, go to the Actions tab, select Trigger Training Pipeline in the left pane, then click Run workflow on the branch of your choice. The GCP project ID in the input parameters will automatically replace GOOGLE_CLOUD_PROJECT in training_pipeline/pipeline/configs.py, and it will also be injected into the tfx run create CLI.

To-do

  • Notebook to prepare the input dataset in TFRecord format (see the sketch after this list)
  • Upload the input dataset into the GCS bucket
  • (Optional) Add an additional Service Account Key as a GitHub Action Secret if collaborators want to run the ML pipeline on GCP with their own GCP account. Each word of the secret name should be separated with an underscore. For example, GCP-ML-172005 should be GCP_ML_172005.
  • Modify the modeling part to train a TensorFlow-based U-NET model.
  • Modify the Gradio app part. The initial version is copied from the segformer-tf-transformers repository.
  • Modify the pipeline part. Some optional components such as StatisticsGen, SchemaGen, Transform, Tuner, and Evaluator may need to be removed.
  • Modify configs.py to reflect the changes.
  • (Optional) Add a custom TFX component to dynamically inject hyperparameters to search with Tuner.
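
Regarding the first item above (preparing the input dataset in TFRecord format), a minimal sketch might serialize (image, mask) pairs like the following. The feature keys are assumptions and must match whatever the Trainer module file expects to parse.

import tensorflow as tf

def _bytes_feature(value: bytes) -> tf.train.Feature:
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def write_tfrecord(image_paths, mask_paths, output_path):
    # Serialize each (image, segmentation mask) pair as one tf.train.Example.
    with tf.io.TFRecordWriter(output_path) as writer:
        for image_path, mask_path in zip(image_paths, mask_paths):
            example = tf.train.Example(
                features=tf.train.Features(
                    feature={
                        # Feature keys are assumptions; keep them in sync with the Trainer module.
                        "image": _bytes_feature(tf.io.read_file(image_path).numpy()),
                        "label": _bytes_feature(tf.io.read_file(mask_path).numpy()),
                    }
                )
            )
            writer.write(example.SerializeToString())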

Acknowledgements

We are thankful to the ML Developer Programs team at Google for providing GCP support.