The easiest way to create a server for running inference on MLX models.
For documentation and guides, visit mlxserver.com.
View all of the supported models here.
Install mlxserver via pip to get started. This will also install mlx.

```shell
pip install mlxserver
```
To install from PyPI you must meet the following requirements:
- Using an M series chip (Apple silicon)
- Using a native Python >= 3.8
- macOS >= 13.5
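The requirements above can be verified programmatically before installing. A minimal sketch using only the Python standard library (the function name and returned keys are illustrative, not part of mlxserver):

```python
import platform
import sys

def check_install_requirements() -> dict:
    """Gather the facts the PyPI install depends on: Apple silicon,
    native Python >= 3.8, and macOS >= 13.5."""
    return {
        "machine": platform.machine(),            # "arm64" on Apple silicon
        "python_ok": sys.version_info >= (3, 8),  # True on Python 3.8+
        "macos_version": platform.mac_ver()[0],   # "" when not running on macOS
    }

print(check_install_requirements())
```

On an Apple silicon Mac meeting the requirements, `machine` should report `arm64` and `macos_version` should be at least `13.5`.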
The following is an example of using Mistral 7B Nous Hermes 2 to generate text:

```python
from mlxserver import MLXServer

server = MLXServer(model="mlx-community/Nous-Hermes-2-Mistral-7B-DPO-4bit-MLX")
```
```shell
curl -X GET 'http://127.0.0.1:5000/generate?prompt=write%20me%20a%20poem%20about%20the%20ocean&stream=true'
```
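The same endpoint can be called from Python. A minimal standard-library sketch, assuming a server started as above is listening on 127.0.0.1:5000; the helper names are illustrative, and the line-by-line framing of the streamed response is an assumption:

```python
from urllib.parse import urlencode, quote
from urllib.request import urlopen

# Base URL from the curl example above.
BASE_URL = "http://127.0.0.1:5000/generate"

def build_generate_url(prompt: str, stream: bool = True) -> str:
    """Percent-encode the prompt (spaces become %20, as in the curl example)."""
    query = urlencode({"prompt": prompt, "stream": str(stream).lower()},
                      quote_via=quote)
    return f"{BASE_URL}?{query}"

def stream_generation(prompt: str) -> None:
    """Print generated text from a running mlxserver as it streams back."""
    with urlopen(build_generate_url(prompt)) as resp:
        for raw_line in resp:  # assumes chunks arrive as newline-delimited text
            print(raw_line.decode("utf-8", errors="replace"), end="", flush=True)
```

Calling `stream_generation("write me a poem about the ocean")` while the server from the Python snippet above is running reproduces the curl request.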
This library runs only on Apple Metal, as MLX is built around Apple Metal acceleration.
This library was made by Mustafa Aljadery & Siddharth Sharma.