TorchServe is a flexible and easy-to-use tool for serving and scaling PyTorch models in production.
Requires Python >= 3.8
curl http://127.0.0.1:8080/predictions/bert -T input.txt
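For clients that call the Inference API from code rather than the shell, the curl command above can be approximated in Python. This is a minimal sketch using only the standard library; the helper name `build_prediction_request` is hypothetical, and the address, model name `bert`, and file `input.txt` are taken from the curl example.

```python
from urllib.request import Request, urlopen

# Inference address taken from the curl example above.
INFERENCE_ADDRESS = "http://127.0.0.1:8080"


def build_prediction_request(model_name, payload, base_url=INFERENCE_ADDRESS):
    """Build (but do not send) an upload request to /predictions/<model_name>.

    Hypothetical helper for illustration; it is not part of TorchServe.
    """
    url = f"{base_url}/predictions/{model_name}"
    # `curl -T` uploads the file body with HTTP PUT, so the same method is used here.
    return Request(url, data=payload, method="PUT")


if __name__ == "__main__":
    # Requires a running TorchServe instance with a model registered as "bert".
    with open("input.txt", "rb") as f:
        request = build_prediction_request("bert", f.read())
    with urlopen(request) as response:
        print(response.read().decode("utf-8"))
```

Building the request separately from sending it keeps the URL and method easy to inspect without a running server.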
# Install dependencies
# cuda is optional
python ./ts_scripts/install_dependencies.py --cuda=cu111
# Latest release
pip install torchserve torch-model-archiver torch-workflow-archiver
# Nightly build
pip install torchserve-nightly torch-model-archiver-nightly torch-workflow-archiver-nightly
docker pull pytorch/torchserve
Refer to torchserve docker for details.
- Model Management API: multi-model management with optimized worker-to-model allocation
- Inference API: REST and gRPC support for batched inference
- TorchServe Workflows: deploy complex DAGs with multiple interdependent models
- Default way to serve PyTorch models in
- Export your model for optimized inference: TorchScript out of the box, ORT, IPEX, TensorRT, FasterTransformer
- Performance Guide: built-in support to optimize, benchmark, and profile PyTorch and TorchServe performance
- Expressive handlers: an expressive handler architecture that makes it trivial to support inference for your use case, with many handlers supported out of the box
- Metrics API: out-of-the-box support for system-level metrics with Prometheus exports, custom metrics, and PyTorch profiler support
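To make the Model Management API item above more concrete, the sketch below builds two management requests: one that registers a model archive and one that scales its workers. It assumes the management endpoint listens on port 8081 (the default in a stock configuration, to the best of our knowledge) and uses a hypothetical archive name `bert.mar`; the helper functions are illustrative and not part of TorchServe.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: default management address; adjust to match your configuration.
MANAGEMENT_ADDRESS = "http://127.0.0.1:8081"


def register_model_request(archive_url, initial_workers=1, base_url=MANAGEMENT_ADDRESS):
    """Build a POST /models request that registers a model archive."""
    query = urlencode({"url": archive_url, "initial_workers": initial_workers})
    return Request(f"{base_url}/models?{query}", method="POST")


def scale_workers_request(model_name, min_worker, base_url=MANAGEMENT_ADDRESS):
    """Build a PUT /models/<model_name> request that sets the minimum worker count."""
    query = urlencode({"min_worker": min_worker})
    return Request(f"{base_url}/models/{model_name}?{query}", method="PUT")


if __name__ == "__main__":
    # "bert.mar" is a hypothetical archive name used only for illustration.
    for request in (
        register_model_request("bert.mar"),
        scale_workers_request("bert", min_worker=2),
    ):
        print(request.get_method(), request.full_url)
```

Passing either request to `urllib.request.urlopen` would send it to a running server; the sketch only prints the method and URL so it can be run without one.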
- Model Server for PyTorch Documentation: Full documentation
- TorchServe internals: How TorchServe was built
- Contributing guide: How to contribute to TorchServe
- 🤗 HuggingFace Transformers
- MultiModal models with MMF combining text, audio and video
- Dual Neural Machine Translation for a complex workflow DAG
For more examples
We welcome all contributions!
To learn more about how to contribute, see the contributor guide here.
To file a bug or request a feature, please file a GitHub issue. For filing pull requests, please use the template here.
- Announcing TorchServe
- How to deploy PyTorch models on Vertex AI
- How to Serve PyTorch Models with TorchServe
- Model Serving in PyTorch
- Explain Like I’m 5: TorchServe
This repository is jointly operated and maintained by Amazon, Meta and a number of individual contributors listed in the CONTRIBUTORS file. For questions directed at Meta, please send an email to opensource@fb.com. For questions directed at Amazon, please send an email to torchserve@amazon.com. For all other questions, please open up an issue in this repository here.
TorchServe acknowledges the Multi Model Server (MMS) project, from which it was derived.