qwak-ai/deploy-action

A custom GitHub Action that wraps the Qwak CLI 'models deploy' command.

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

Qwak Model DEPLOY Action (v1)

This GitHub Action triggers a Qwak Cloud Deployment for a machine learning model Build. It provides a seamless integration with Qwak's platform, allowing you to deploy and monitor your models directly from your GitHub repository.



Action Flow

  1. Initialize Deployment: Trigger the deployment using the qwak models deploy CLI command.
  2. Extract IDs: Retrieve the Deployment ID and Build ID from the command output.
  3. Monitor Status: Poll the deployment status in Qwak Cloud every 10 seconds while it is in the PENDING or INITIALIZING state.
  4. Output Results: Once the deployment completes or times out, the Action outputs the deployment ID and status.
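
The monitoring step above can be sketched as a simple polling loop. This is a minimal illustration, not the Action's actual source: get_status is a hypothetical callable standing in for whatever query the Action runs against Qwak Cloud, and the sleep and clock parameters are added here only so the loop can be exercised without waiting.

```python
import time
from typing import Callable

# States during which polling continues, as listed in the Action Flow.
IN_PROGRESS_STATES = {"PENDING", "INITIALIZING"}


def wait_for_deployment(
    get_status: Callable[[], str],
    timeout_minutes: float = 30,
    poll_seconds: float = 10,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the status leaves the in-progress states or time runs out.

    get_status is a hypothetical stand-in for the real status query.
    Returns the last status that was observed.
    """
    deadline = clock() + timeout_minutes * 60
    status = get_status()
    while status in IN_PROGRESS_STATES and clock() < deadline:
        sleep(poll_seconds)
        status = get_status()
    return status
```

On timeout, this sketch returns the last in-progress status it saw. How the real Action reports a timeout is not stated in this document.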

Inputs

  • qwak-api-key: Your Qwak API key. We recommend storing it as a repository secret.
  • sdk-version: Specifies the Qwak-SDK version required to trigger this deploy. Default is latest.
  • deploy-type: (Required) Type of deployment. Supported types are realtime, stream, and batch.
  • model-id: (Required) The ID of the model to be deployed.
  • build-id: The Build ID to be deployed. If not specified, the latest successful build will be deployed.
  • tags: One or more tags, separated by commas. If build-id is specified, tags are ignored; otherwise the action deploys the latest successful build that carries the given tags.
  • param-list: A list of key-value pairs representing deployment parameters. These are specified in the format NAME=VALUE and separated by commas. For a complete list of available parameters for each deployment type, refer to Deployment Types.
  • env-vars: Environment variables for the deployment, specified in the format NAME=VALUE and separated by commas. These can be used to set or override environment settings within the deployment process.
  • instance: Specifies the hardware type to deploy the model on. The instance defines the allocated CPU/GPU and memory resources. See the Qwak instances list for the available types. Default is small.
  • replicas: The number of selected instances to provision for this deployment. Default is 1.
  • iam-role-arn: Custom IAM Role ARN that Qwak should assume in order to access external resources during the deployment.
  • environment: Specifies the Qwak environment to use, such as dev, staging, or production. If not specified, the default environment will be used.
  • timeout-after: Specifies how many minutes to wait for the deployment to complete before timing out. Default is 30.
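
As an illustration of the NAME=VALUE format used by param-list and env-vars, and of the precedence between build-id and tags, here is a minimal sketch. Both function names are hypothetical and are not part of the Action. The parser assumes values contain no commas, which this document does not confirm.

```python
from typing import Dict, List, Optional, Tuple, Union


def parse_pairs(raw: str) -> Dict[str, str]:
    """Parse 'NAME=VALUE,NAME=VALUE' into a dict; values stay strings."""
    pairs: Dict[str, str] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty input and trailing commas
        name, sep, value = item.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"expected NAME=VALUE, got {item!r}")
        pairs[name.strip()] = value.strip()
    return pairs


def select_build(
    build_id: Optional[str], tags: Optional[str]
) -> Tuple[str, Union[str, List[str], None]]:
    """Mirror the documented precedence: build-id first, then tags, then latest."""
    if build_id:
        return ("build-id", build_id)  # tags are ignored when build-id is set
    if tags:
        return ("tags", [t.strip() for t in tags.split(",") if t.strip()])
    return ("latest", None)  # fall back to the latest successful build
```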

Outputs

  • deploy-id: The ID of the deployment.
  • deploy-status: The status of the deployment once it has finished or timed out.

Output Example

deploy-id=bc3ceeca-e4ed-48b9-8ff1-80427923f1cf
deploy-status=SUCCESSFUL_DEPLOYMENT
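
A downstream script could read these name=value lines and gate on the status. The sketch below is only an illustration: parse_outputs is a hypothetical helper, and SUCCESSFUL_DEPLOYMENT is the only status value shown in this document, so other values are not handled specifically.

```python
from typing import Dict


def parse_outputs(text: str) -> Dict[str, str]:
    """Turn 'name=value' lines into a dict, skipping lines without '='."""
    outputs: Dict[str, str] = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:
            outputs[name.strip()] = value.strip()
    return outputs


example = """deploy-id=bc3ceeca-e4ed-48b9-8ff1-80427923f1cf
deploy-status=SUCCESSFUL_DEPLOYMENT"""

outputs = parse_outputs(example)
# Treat anything other than the documented success value as a failure.
succeeded = outputs.get("deploy-status") == "SUCCESSFUL_DEPLOYMENT"
```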

Deployment Types

Realtime

Qwak realtime deployments serve your ML models behind a lightweight, simple, and scalable REST API wrapper. Qwak sets up the network requirements and deploys your model on a managed Kubernetes cluster, allowing you to leverage auto-scaling and security.

Parameters

| Parameter | Type | Default Value | Description |
|---|---|---|---|
| timeout | INT | | Inference request timeout in ms. |
| server-workers | INT | | Number of workers running the HTTP server. |
| daemon-mode | BOOLEAN | true | Configure Gunicorn daemon mode. |
| max-batch-size | INT | 0 | Max batch size in prediction. A value of 0 means it's dynamic. |
| variation-name | TEXT | default | The model variation name. |
| deployment-timeout | INT | 1800 | The number of seconds the deployment can be in progress before it is considered failed. |
| protected | BOOLEAN | false | Whether the deployment variation is protected. |

param-list example

timeout=3000,server-workers=4,variation-name=default,daemon-mode=false
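
To show how the types in the table apply to a param-list string, here is a minimal sketch that converts each value to the listed type. The function name and the type map are illustrative assumptions, not the Action's code. Names missing from the table are passed through as strings, since this document does not say how unknown parameters are handled.

```python
from typing import Dict, Union

# Types copied from the realtime parameter table.
REALTIME_PARAM_TYPES = {
    "timeout": "INT",
    "server-workers": "INT",
    "daemon-mode": "BOOLEAN",
    "max-batch-size": "INT",
    "variation-name": "TEXT",
    "deployment-timeout": "INT",
    "protected": "BOOLEAN",
}


def coerce_realtime_params(raw: str) -> Dict[str, Union[int, bool, str]]:
    """Parse a realtime param-list and convert values to their table types."""
    result: Dict[str, Union[int, bool, str]] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected NAME=VALUE, got {item!r}")
        kind = REALTIME_PARAM_TYPES.get(name)
        if kind == "INT":
            result[name] = int(value)  # raises ValueError on non-numeric input
        elif kind == "BOOLEAN":
            if value.lower() not in ("true", "false"):
                raise ValueError(f"{name} must be true or false, got {value!r}")
            result[name] = value.lower() == "true"
        else:
            result[name] = value  # TEXT, or a name not listed in the table
    return result
```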

Stream

Streaming deployments let you easily connect Kafka streams with your models to perform real-time inference.

Streaming deployments are useful for processing large amounts of distributed data and for avoiding complex triggering and scheduling architectures as fresh data arrives. A streaming deployment consumes messages from a Kafka topic and produces predictions into a Kafka topic of your choice.

Parameters

| Parameter | Type | Description | Default | Possible Values |
|---|---|---|---|---|
| bootstrap-server | TEXT | Kafka consumer/producer bootstrap server. | | |
| consumer-bootstrap-server | TEXT | Kafka consumer bootstrap server. | | |
| consumer-topic | TEXT | Kafka consumer topic. | | |
| consumer-group | TEXT | Kafka consumer group. | | |
| consumer-auto-offset-reset | ENUM | Kafka consumer auto offset reset. | unset | unset, latest, earliest |
| consumer-timeout | INT | Kafka consumer polling timeout. Should be in the range of the Kafka admin configuration group.min.session.timeout.ms and group.max.session.timeout.ms. | | |
| consumer-max-batch-size | INT | The maximum number of records returned in a single call to poll(). | | |
| consumer-max-poll-latency | FLOAT | The maximum delay between invocations of poll() when using consumer group management. | | |
| producer-bootstrap-server | TEXT | Kafka producer bootstrap server. | | |
| producer-topic | TEXT | Kafka producer topic. | | |
| producer-compression-type | ENUM | Kafka producer compression type. | uncompressed | uncompressed, gzip, snappy, lz4, zstd |

param-list example

consumer-bootstrap-server="10.0.0.8",consumer-topic="model-input-topic",producer-bootstrap-server="10.0.0.9",producer-topic="model-output-topic"
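
The example above quotes its values. The sketch below shows one way such a string could be split into consumer-side and producer-side settings. Both helpers are hypothetical, and stripping the surrounding double quotes is an assumption about how the values are interpreted, not documented behavior.

```python
from typing import Dict


def parse_stream_params(raw: str) -> Dict[str, str]:
    """Parse a stream param-list, removing surrounding double quotes from values."""
    params: Dict[str, str] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        name, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected NAME=VALUE, got {item!r}")
        params[name] = value.strip().strip('"')
    return params


def group_by_side(params: Dict[str, str]) -> Dict[str, Dict[str, str]]:
    """Group parameters by their consumer- or producer- prefix."""
    grouped: Dict[str, Dict[str, str]] = {"consumer": {}, "producer": {}, "shared": {}}
    for name, value in params.items():
        side, _, rest = name.partition("-")
        if side in ("consumer", "producer"):
            grouped[side][rest] = value
        else:
            grouped["shared"][name] = value  # e.g. bootstrap-server
    return grouped


raw = ('consumer-bootstrap-server="10.0.0.8",consumer-topic="model-input-topic",'
       'producer-bootstrap-server="10.0.0.9",producer-topic="model-output-topic"')
grouped = group_by_side(parse_stream_params(raw))
```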

Batch

This deployment type allows you to run batch inference executions in the system, and handle data files from an online cloud storage provider.

No additional parameters are required for batch deployments.


Example Usage

Basic Example

- name: Deploy Qwak Model
  uses: qwak-ai/deploy-action@v1
  with:
    qwak-api-key: <your qwak key>
    model-id: <your-model-id>
    tags: ${{ github.head_ref }}                # Deploy the latest successful build with this branch as TAG
    deploy-type: realtime
    param-list: 'timeout=3000,server-workers=4'

Example with GPU configuration

- name: Deploy Qwak Model with GPU
  uses: qwak-ai/deploy-action@v1
  with:
    qwak-api-key: <your qwak key>
    model-id: <your-model-id>                   # Deploy the latest successful build for this Model
    instance: 'gpu.t4.xl'
    deploy-type: batch

Example with Timeout Configuration

- name: Deploy Qwak Model with Timeout
  uses: qwak-ai/deploy-action@v1
  with:
    qwak-api-key: <your qwak key>
    model-id: 'your-model-id'
    deploy-type: realtime
    param-list: 'timeout=6000'                # Prediction REST endpoint times out after 6 seconds

Example with Shadow Variation

- name: Deploy Qwak Model with Shadow Variation
  uses: qwak-ai/deploy-action@v1
  with:
    qwak-api-key: <your qwak key>
    model-id: 'your-model-id'
    deploy-type: realtime
    param-list: 'variation-name=shadow,from-file=config.yaml'                # config.yaml should be in the runner's current directory

Trigger a Streaming Deployment after a Successful Model Build

name: Deploy ML Model after successful Build

on:
  pull_request:
    types: [opened, reopened, synchronize]
    branches:
        - 'main'

jobs:
  build:
    outputs:
      build_id: </...>
      build_status: </...>

    steps:
      </...>

  deploy:
    if: needs.build.outputs.build_status == 'SUCCESSFUL' 
    runs-on: ubuntu-latest
    needs: build
    steps:

    - name: Deploy Qwak Build
      uses: qwak-ai/deploy-action@v1
      with:
        model-id: <your-model-id>
        build-id: ${{ needs.build.outputs.build_id }}
        deploy-type: stream
        sdk-version: '0.5.18'
        instance: 'medium'
        iam-role-arn: 'arn:aws:iam::<account-id>:role/<role-name>'
        param-list: 'consumer-bootstrap-server="10.0.0.8",consumer-topic="model-input-topic",producer-bootstrap-server="10.0.0.9",producer-topic="model-output-topic"'
        # other inputs as needed

Support

For support or any questions related to this action, please contact the Qwak team.