vLLM client with minimal dependencies

Primary language: Python. License: Apache-2.0.

vLLM Client

Overview

Client for the vLLM API with minimal dependencies.

Installation

pip install vllm-client

Examples

See example.py for the following:

  • Single generation
  • Streaming
  • Batch inference

The examples should work out of the box against a running vLLM API server.
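For orientation, here is a minimal sketch of what a single generation request looks like on the wire. It does not use this package's own API (see example.py for that). It assumes the vLLM demo API server is reachable at http://localhost:8000 and exposes a /generate endpoint that accepts a JSON body with a prompt, a stream flag, and sampling parameters as extra keys. The helper names `build_payload` and `generate` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request


def build_payload(prompt, stream=False, **sampling_params):
    """Build the JSON body for a generation request.

    Extra keyword arguments are passed through unchanged as sampling
    parameters (for example max_tokens or temperature).
    """
    payload = {"prompt": prompt, "stream": stream}
    payload.update(sampling_params)
    return payload


def generate(prompt, url="http://localhost:8000/generate", **sampling_params):
    """POST a single, non-streaming request and return the decoded JSON reply.

    The URL is an assumption; adjust it to wherever the server listens.
    """
    body = json.dumps(build_payload(prompt, **sampling_params)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires a running server at the assumed URL):
# reply = generate("Hello, my name is", max_tokens=32, temperature=0.8)
```

Separating payload construction from the network call keeps the request shape easy to inspect and test without a server.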

Notes

  • sampling_params.py needs to be kept in sync with vLLM. It is a simplified version of vLLM's sampling parameters class, containing only the code required on the client side.
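To illustrate what "only the code required on the client side" can mean, the sketch below shows a trimmed-down parameter container that validates a few values and serializes them for a request. It is not the contents of sampling_params.py: the class name `ClientSamplingParams`, the selection of fields, and the default values are assumptions made for this example and should be checked against the vLLM version in use.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class ClientSamplingParams:
    """Hypothetical client-side subset of sampling options."""

    n: int = 1                        # number of completions per prompt
    temperature: float = 1.0          # 0.0 means greedy decoding
    top_p: float = 1.0                # nucleus sampling threshold
    max_tokens: Optional[int] = 16    # cap on generated tokens
    stop: Optional[List[str]] = None  # strings that end generation

    def __post_init__(self):
        # Basic range checks so obvious mistakes fail before a request is sent.
        if self.n < 1:
            raise ValueError(f"n must be at least 1, got {self.n}")
        if self.temperature < 0.0:
            raise ValueError(f"temperature must be non-negative, got {self.temperature}")
        if not 0.0 < self.top_p <= 1.0:
            raise ValueError(f"top_p must be in (0, 1], got {self.top_p}")

    def to_dict(self):
        """Return the fields as a dict, omitting those left as None."""
        return {key: value for key, value in asdict(self).items() if value is not None}
```

Because the server constructs its own parameter object from the request body, any field name that drifts from the server's definition is likely to be rejected there, which is why the two files have to stay in sync.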

Other programming languages