The easiest way to create a server for running inference on MLX models.
For documentation and guides, visit mlxserver.com.
View all of the supported models here.
Install mlxserver via pip to get started. This will also install mlx.

```shell
pip install mlxserver
```
To install from PyPI you must meet the following requirements:
- Using an M series chip (Apple silicon)
- Using a native Python >= 3.8
- macOS >= 13.5
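The requirements above can be verified programmatically before installing. A minimal sketch using only the Python standard library (the function name and returned keys are illustrative, not part of mlxserver):

```python
import platform
import sys

def check_install_requirements() -> dict:
    """Gather the facts the PyPI install depends on: Apple silicon,
    native Python >= 3.8, and macOS >= 13.5."""
    return {
        "machine": platform.machine(),            # "arm64" on Apple silicon
        "python_ok": sys.version_info >= (3, 8),  # True on Python 3.8+
        "macos_version": platform.mac_ver()[0],   # "" when not running on macOS
    }

print(check_install_requirements())
```

On an Apple silicon Mac meeting the requirements, `machine` should report `arm64` and `macos_version` should be at least `13.5`.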
The following is an example of using Mistral 7B Nous Hermes 2 to generate text:

```python
from mlxserver import MLXServer

server = MLXServer(model="mlx-community/Nous-Hermes-2-Mistral-7B-DPO-4bit-MLX")
```
```shell
curl -X GET 'http://127.0.0.1:5000/generate?prompt=write%20me%20a%20poem%20about%20the%20ocean&stream=true'
```
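The same endpoint can be called from Python. A minimal standard-library sketch, assuming a server started as above is listening on 127.0.0.1:5000; the helper names are illustrative, and the line-by-line framing of the streamed response is an assumption:

```python
from urllib.parse import urlencode, quote
from urllib.request import urlopen

# Base URL from the curl example above.
BASE_URL = "http://127.0.0.1:5000/generate"

def build_generate_url(prompt: str, stream: bool = True) -> str:
    """Percent-encode the prompt (spaces become %20, as in the curl example)."""
    query = urlencode({"prompt": prompt, "stream": str(stream).lower()},
                      quote_via=quote)
    return f"{BASE_URL}?{query}"

def stream_generation(prompt: str) -> None:
    """Print generated text from a running mlxserver as it streams back."""
    with urlopen(build_generate_url(prompt)) as resp:
        for raw_line in resp:  # assumes chunks arrive as newline-delimited text
            print(raw_line.decode("utf-8", errors="replace"), end="", flush=True)
```

Calling `stream_generation("write me a poem about the ocean")` while the server from the Python snippet above is running reproduces the curl request.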
This library runs only on Apple Metal, as MLX is built around Apple Metal acceleration.
This library was made by Mustafa Aljadery & Siddharth Sharma.