triton-inference-server
There are 90 repositories under the triton-inference-server topic.
NVIDIA/GenerativeAIExamples
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
CoinCheung/BiSeNet
Add bisenetv2. My implementation of BiSeNet
isarsoft/yolov4-triton-tensorrt
This repository deploys YOLOv4 as an optimized TensorRT engine to Triton Inference Server
npuichigo/openai_trtllm
OpenAI compatible API for TensorRT LLM triton backend
NetEase-Media/grps
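[Deep learning model deployment framework] Supports tf/torch/trt/trtllm/vllm and other NN frameworks, supports dynamic batching and streaming modes, supports both Python and C++, and is limitable, extensible, and high-performance. Helps users quickly deploy models online and serve them through HTTP/RPC interfaces.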
torchpipe/torchpipe
Serving Inside Pytorch
allegroai/clearml-serving
ClearML - Model-Serving Orchestration and Repository Solution
triton-inference-server/onnxruntime_backend
The Triton backend for the ONNX Runtime.
kamalkraj/stable-diffusion-tritonserver
Deploy stable diffusion model with onnx/tensorrt + tritonserver
NVIDIA-ISAAC-ROS/isaac_ros_dnn_inference
NVIDIA-accelerated DNN model inference ROS 2 packages using NVIDIA Triton/TensorRT for both Jetson and x86_64 with CUDA-capable GPU
notAI-tech/fastDeploy
Deploy DL/ ML inference pipelines with minimal extra code.
Koldim2001/TrafficAnalyzer
Traffic analysis at a roundabout using computer vision
bug-developer021/YOLOV5_optimization_on_triton
Compare multiple optimization methods on Triton to improve model service performance
akiragy/recsys_pipeline
Build Recommender System with PyTorch + Redis + Elasticsearch + Feast + Triton + Flask. Vector Recall, DeepFM Ranking and Web Application.
chiehpower/Setup-deeplearning-tools
Set up CI in DL/ cuda/ cudnn/ TensorRT/ onnx2trt/ onnxruntime/ onnxsim/ Pytorch/ Triton-Inference-Server/ Bazel/ Tesseract/ PaddleOCR/ NVIDIA-docker/ minIO/ Supervisord on AGX or PC from scratch.
rtzr/tritony
Tiny configuration for Triton Inference Server
trinhtuanvubk/Diff-VC
Diffusion Model for Voice Conversion
k9ele7en/Triton-TensorRT-Inference-CRAFT-pytorch
Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT text detection (PyTorch). Includes a converter from PyTorch -> ONNX -> TensorRT and inference pipelines (TensorRT, Triton server - multi-format). Supported model formats for Triton inference: TensorRT engine, TorchScript, ONNX
omarabid59/yolov8-triton
Provides an ensemble model to deploy a YoloV8 ONNX model to Triton
Bobo-y/triton_ensemble_model_demo
triton server ensemble model demo
Curt-Park/serving-codegen-gptj-triton
Serving Example of CodeGen-350M-Mono-GPTJ on Triton Inference Server with Docker and Kubernetes
openhackathons-org/End-to-End-Computer-Vision
This repository contains AI bootcamp material consisting of a workflow for computer vision
olibartfast/computer-vision-triton-cpp-client
C++ application to perform computer vision tasks using Nvidia Triton Server for model inference
Biano-AI/serving-compare-middleware
FastAPI middleware for comparing different ML model serving approaches
inferless/triton-co-pilot
Generate Glue Code in seconds to simplify your Nvidia Triton Inference Server Deployments
redis-applied-ai/redis-feast-gcp
A demo of Redis Enterprise as the Online Feature Store deployed on GCP with Feast and NVIDIA Triton Inference Server.
tonhathuy/tensorrt-triton-magface
Magface Triton Inference Server Using Tensorrt
detail-novelist/novelist-triton-server
Deploy KoGPT with Triton Inference Server
YeonwooSung/MLOps
Miscellaneous code and writings for MLOps
dpressel/reserve
FastAPI + WebSockets + SSE service to interface with Triton/Riva ASR
duydvu/triton-inference-server-web-ui
Triton Inference Server Web UI
ybai789/yolov8-triton-tensorrt
Provides an ensemble model to deploy a YOLOv8 TensorRT model to Triton
yas-sim/openvino-model-server-wrapper
Python wrapper class for OpenVINO Model Server. Users can submit inference requests to OVMS with just a few lines of code.
levipereira/deepstream-yolo-triton-server-rtsp-out
The purpose of this repository is to create a DeepStream/Triton-Server sample application that uses yolov7, yolov7-qat, and yolov9 models to perform inference on video files or RTSP streams.
hiennguyen9874/triton-face-recognition
Triton face detection & recognition
smarter-project/armnn_tflite_backend
TensorFlow Lite backend with ArmNN delegate support for Nvidia Triton