triton-inference-server

There are 90 repositories under the triton-inference-server topic.

NVIDIA/GenerativeAIExamples
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
Language:Python2.5k 58 52531
CoinCheung/BiSeNet
Add bisenetv2. My implementation of BiSeNet
Language:Python1.4k 17 308311
isarsoft/yolov4-triton-tensorrt
This repository deploys YOLOv4 as an optimized TensorRT engine to Triton Inference Server
Language:C++279 15 6364
npuichigo/openai_trtllm
OpenAI compatible API for TensorRT LLM triton backend
Language:Rust177 8 2227
NetEase-Media/grps
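[Deep learning model deployment framework] Supports tf/torch/trt/trtllm/vllm and other NN frameworks, supports dynamic batching and streaming modes, supports both Python and C++, and is rate-limitable, extensible, and high-performance. Helps users quickly deploy models online and serve them through HTTP/RPC interfaces.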
Language:C++165 11 313
torchpipe/torchpipe
Serving Inside Pytorch
Language:C++145 6 713
allegroai/clearml-serving
ClearML - Model-Serving Orchestration and Repository Solution
Language:Python137 11 5740
triton-inference-server/onnxruntime_backend
The Triton backend for the ONNX Runtime.
Language:C++132 16 10757
kamalkraj/stable-diffusion-tritonserver
Deploy stable diffusion model with onnx/tensorrt + tritonserver
Language:Jupyter Notebook123 6 1119
NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference
NVIDIA-accelerated DNN model inference ROS 2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU
Language:C++104 4 3117
notAI-tech/fastDeploy
Deploy DL/ ML inference pipelines with minimal extra code.
Language:Python97 8 617
Koldim2001/TrafficAnalyzer
Traffic analysis at a roundabout using computer vision
Language:Python53 4 04
bug-developer021/YOLOV5_optimization_on_triton
Compare multiple optimization methods on Triton to improve model service performance
Language:Jupyter Notebook46 1 312
akiragy/recsys_pipeline
Build Recommender System with PyTorch + Redis + Elasticsearch + Feast + Triton + Flask. Vector Recall, DeepFM Ranking and Web Application.
Language:Python44 1 111
chiehpower/Setup-deeplearning-tools
Set up CI in DL/ cuda/ cudnn/ TensorRT/ onnx2trt/ onnxruntime/ onnxsim/ Pytorch/ Triton-Inference-Server/ Bazel/ Tesseract/ PaddleOCR/ NVIDIA-docker/ minIO/ Supervisord on AGX or PC from scratch.
Language:Python43 2 27
rtzr/tritony
Tiny configuration for Triton Inference Server
Language:Python43 2 41
trinhtuanvubk/Diff-VC
Diffusion Model for Voice Conversion
Language:Jupyter Notebook38 3 37
k9ele7en/Triton-TensorRT-Inference-CRAFT-pytorch
Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch), including a converter from PyTorch -> ONNX -> TensorRT and inference pipelines (TensorRT, Triton server - multi-format). Supported model formats for Triton inference: TensorRT engine, TorchScript, ONNX
Language:Python32 2 27
omarabid59/yolov8-triton
Provides an ensemble model to deploy a YoloV8 ONNX model to Triton
Language:Python32 1 58
Bobo-y/triton_ensemble_model_demo
triton server ensemble model demo
Language:Python30 0 58
Curt-Park/serving-codegen-gptj-triton
Serving Example of CodeGen-350M-Mono-GPTJ on Triton Inference Server with Docker and Kubernetes
Language:Python20 2 00
openhackathons-org/End-to-End-Computer-Vision
This repository is AI bootcamp material that consists of a workflow for computer vision
Language:Jupyter Notebook20 2 112
olibartfast/computer-vision-triton-cpp-client
C++ application to perform computer vision tasks using Nvidia Triton Server for model inference
Language:C++19 1 12
Biano-AI/serving-compare-middleware
FastAPI middleware for comparing different ML model serving approaches
Language:Python15 4 11
inferless/triton-co-pilot
Generate Glue Code in seconds to simplify your Nvidia Triton Inference Server Deployments
Language:Python15 3 02
redis-applied-ai/redis-feast-gcp
A demo of Redis Enterprise as the Online Feature Store deployed on GCP with Feast and NVIDIA Triton Inference Server.
Language:Jupyter Notebook15
tonhathuy/tensorrt-triton-magface
MagFace Triton Inference Server using TensorRT
Language:Jupyter Notebook15 3 02
detail-novelist/novelist-triton-server
Deploy KoGPT with Triton Inference Server
Language:Shell14 1 00
YeonwooSung/MLOps
Miscellaneous codes and writings for MLOps
Language:Jupyter Notebook11 4 51
dpressel/reserve
FastAPI + WebSockets + SSE service to interface with Triton/Riva ASR
Language:Python10 2 02
duydvu/triton-inference-server-web-ui
Triton Inference Server Web UI
Language:TypeScript10 1 01
ybai789/yolov8-triton-tensorrt
Provides an ensemble model to deploy a YOLOv8 TensorRT model to Triton
Language:Python10 1 04
yas-sim/openvino-model-server-wrapper
Python wrapper class for OpenVINO Model Server. User can submit inference request to OVMS with just a few lines of code.
Language:Python9 1 31
levipereira/deepstream-yolo-triton-server-rtsp-out
The purpose of this repository is to create a DeepStream/Triton-Server sample application that uses yolov7, yolov7-qat, and yolov9 models to perform inference on video files or RTSP streams.
Language:Python8 1 02
hiennguyen9874/triton-face-recognition
Triton face detection & recognition
Language:Jupyter Notebook7 2 05
smarter-project/armnn_tflite_backend
TensorFlow Lite backend with ArmNN delegate support for Nvidia Triton
Language:C++7 1 22
