nos: A Python repository from spillai

⚡️ What is NOS?

NOS (torch-nos) is a fast and flexible PyTorch inference server, specifically designed for optimizing and running inference of popular foundational AI models.

👩‍💻 Easy-to-use: Built for PyTorch and designed to optimize, serve and auto-scale PyTorch models in production without compromising on developer experience.
🥷 Flexible: Run and serve several foundational AI models (Stable Diffusion, CLIP, Whisper) in a single place.
🔌 Pluggable: Plug your front-end to NOS with out-of-the-box high-performance gRPC/REST APIs, avoiding all kinds of ML model deployment hassles.
🚀 Scalable: Optimize and scale models easily for maximum HW performance without a PhD in ML, distributed systems or infrastructure.
📦 Extensible: Easily hack and add custom models, optimizations, and HW-support in a Python-first environment.
⚙️ HW-accelerated: Take full advantage of your underlying HW (GPUs, ASICs) without compromise.
☁️ Cloud-agnostic: Run on any cloud HW (AWS, GCP, Azure, Lambda Labs, On-Prem) with our ready-to-use inference server containers.

NOS inherits its name from Nitrous Oxide System, the performance-enhancing system typically used in racing cars. NOS is designed to be modular and easy to extend.

🚀 Getting Started

Get started with the full NOS server by creating a conda environment and installing via pip:

$ conda create -n nos-py38 python=3.8
$ conda activate nos-py38
$ conda install "pytorch>=2.0.1" torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
$ pip install "torch-nos[server]"

If you simply want to use a lightweight NOS client and run inference on your local machine (via docker), you can install the client-only package:

$ conda create -n nos-py38 python=3.8
$ conda activate nos-py38
$ pip install torch-nos
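
After either install path, it can be useful to confirm that the package actually landed in the active environment. The snippet below is a minimal sketch that uses only the standard library; the helper name installed_version is made up for illustration and is not part of NOS.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "torch-nos") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is a hypothetical helper, not a NOS API.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("torch-nos is not installed in this environment")
    else:
        print(f"torch-nos {version} is installed")
```

If the check reports that the package is missing, the usual cause is that pip ran outside the activated nos-py38 environment.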

🔥 Quickstart / Show me the code

Image Generation as-a-Service

gRPC API ⚡

from nos.client import Client

client = Client("[::]:50051")

sdxl = client.Module("stabilityai/stable-diffusion-xl-base-1-0")
image, = sdxl(prompts=["fox jumped over the moon"],
              width=1024, height=1024, num_images=1)

REST API

curl \
-X POST http://localhost:8000/infer \
-H 'Content-Type: application/json' \
-d '{
      "model_id": "stabilityai/stable-diffusion-xl-base-1-0",
      "inputs": {
          "prompts": ["fox jumped over the moon"],
          "width": 1024,
          "height": 1024,
          "num_images": 1
      }
    }'
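
The same REST call can be issued from Python without extra dependencies. The sketch below only builds the request, mirroring the URL, headers, and JSON body of the curl command above; the helper name build_infer_request is hypothetical, and the shape of the server's response is not documented here, so the sending step is left commented out and unparsed.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_infer_request(
    model_id: str,
    inputs: Dict[str, Any],
    method: Optional[str] = None,
    url: str = "http://localhost:8000/infer",
) -> urllib.request.Request:
    """Build a POST request matching the curl examples in this README.

    `build_infer_request` is a hypothetical helper, not part of NOS.
    """
    payload: Dict[str, Any] = {"model_id": model_id, "inputs": inputs}
    if method is not None:
        # Only include "method" when a specific model method is requested.
        payload["method"] = method
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_infer_request(
    "stabilityai/stable-diffusion-xl-base-1-0",
    {
        "prompts": ["fox jumped over the moon"],
        "width": 1024,
        "height": 1024,
        "num_images": 1,
    },
)

# Sending the request requires a running NOS server on localhost:8000.
# with urllib.request.urlopen(request) as response:
#     body = response.read()  # raw bytes; the response format is not assumed here
```

Passing method="encode_text" produces the extra "method" field used by requests that target a specific model method.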

Text & Image Embedding-as-a-Service (CLIP-as-a-Service)

gRPC API ⚡

from nos.client import Client

client = Client("[::]:50051")

clip = client.Module("openai/clip-vit-base-patch32")
txt_vec = clip.encode_text(text=["fox jumped over the moon"])

REST API


curl \
-X POST http://localhost:8000/infer \
-H 'Content-Type: application/json' \
-d '{
      "model_id": "openai/clip-vit-base-patch32",
      "method": "encode_text",
      "inputs": {
          "texts": ["fox jumped over the moon"]
      }
    }'
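
Embeddings such as the ones returned above are typically compared with cosine similarity. The sketch below is a generic, dependency-free illustration using toy vectors. It assumes each embedding can be treated as a flat sequence of floats, which this README does not specify, and cosine_similarity is a helper written for this example rather than a NOS function.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real text or image embeddings.
query = [1.0, 0.0, 1.0]
same_direction = [2.0, 0.0, 2.0]  # parallel to `query`
orthogonal = [0.0, 3.0, 0.0]      # shares no component with `query`

print(cosine_similarity(query, same_direction))
print(cosine_similarity(query, orthogonal))
```

A score near 1 indicates vectors pointing in nearly the same direction, while a score near 0 indicates unrelated directions.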

📂 Directory Structure

├── docker         # Dockerfile for CPU/GPU servers
├── docs           # mkdocs documentation
├── examples       # example guides, jupyter notebooks, demos
├── makefiles      # makefiles for building/testing
├── nos
│   ├── cli        # CLI (hub, system)
│   ├── client     # gRPC / REST client
│   ├── common     # common utilities
│   ├── executors  # runtime executor (i.e. Ray)
│   ├── hub        # hub utilities
│   ├── managers   # model manager / multiplexer
│   ├── models     # model zoo
│   ├── proto      # protobuf defs for NOS gRPC service
│   ├── server     # server backend (gRPC)
│   └── test       # pytest utilities
├── requirements   # requirement extras (server, docs, tests)
├── scripts        # basic scripts
└── tests          # pytests (client, server, benchmark)

📚 Documentation

Quickstart
Models
Concepts: Architecture Overview, ModelSpec, ModelManager, Runtime Environments
Demos: Building a Discord Image Generation Bot, Video Search Demo

🛣 Roadmap

HW / Cloud Support

📄 License

This project is licensed under the Apache-2.0 License.

📡 Telemetry

NOS collects anonymous usage data using Sentry. This data helps us understand how the community uses NOS and prioritize features. You can opt out of telemetry by setting NOS_TELEMETRY_ENABLED=0.
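
When NOS is driven from a Python script rather than a shell, the opt-out flag can be set in the process environment. This is a minimal sketch: the variable name comes from the paragraph above, but the helper disable_nos_telemetry is hypothetical, and setting the flag before importing nos is an assumption about when the variable is read.

```python
import os


def disable_nos_telemetry() -> None:
    """Set the documented opt-out flag for the current process and its children.

    Hypothetical helper; only the variable name comes from the NOS README.
    """
    os.environ["NOS_TELEMETRY_ENABLED"] = "0"


# Assumption: the flag should be in place before NOS is imported or started.
disable_nos_telemetry()

# import nos  # import NOS only after the flag has been set
```

Exporting the variable in the shell that launches the server has the same effect for that process.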

🤝 Contributing

We welcome contributions! Please see our contributing guide for more information.

🔗 Quick Links

💬 Send us an email at support@autonomi.ai or join our Discord for help.
📣 Follow us on Twitter and LinkedIn to keep up-to-date on our products.
