inference

There are 1,648 repositories under the inference topic.

  • vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Language: Python · ★ 58.2k
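
vLLM's throughput comes largely from PagedAttention, which stores the KV cache in fixed-size blocks and maps each sequence's token positions to blocks on demand, instead of reserving one contiguous max-length buffer per request. The bookkeeping can be sketched in a few lines (block size and class names are illustrative, not vLLM's actual API):

```python
class PagedKVCache:
    """Toy paged KV-cache bookkeeping: logical token positions map to
    fixed-size physical blocks, allocated only when needed."""

    def __init__(self, num_blocks, block_size=16):
        self.block_size = block_size
        self.free = list(range(num_blocks))  # pool of physical block ids
        self.tables = {}                     # seq_id -> list of block ids
        self.lengths = {}                    # seq_id -> tokens stored

    def append(self, seq_id):
        """Record one new token; return (physical block, offset in block)."""
        n = self.lengths.get(seq_id, 0)
        table = self.tables.setdefault(seq_id, [])
        if n % self.block_size == 0:         # current block full (or first token)
            table.append(self.free.pop())    # grab a fresh block on demand
        self.lengths[seq_id] = n + 1
        return table[n // self.block_size], n % self.block_size

    def release(self, seq_id):
        """Sequence finished: return its blocks to the pool."""
        self.free.extend(self.tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

cache = PagedKVCache(num_blocks=8, block_size=4)
for _ in range(6):
    cache.append("seq0")
print(len(cache.tables["seq0"]))  # 6 tokens at block size 4 -> 2 blocks
```

Because blocks are returned to a shared pool the moment a sequence finishes, memory freed by short requests is immediately reusable by long ones, which is what lets vLLM batch many more concurrent sequences than contiguous allocation would.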
  • ggml-org/whisper.cpp

    Port of OpenAI's Whisper model in C/C++

    Language: C++ · ★ 43.3k
  • hpcaitech/ColossalAI

    Making large AI models cheaper, faster and more accessible

    Language: Python · ★ 41.2k
  • deepspeedai/DeepSpeed

    DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

    Language: Python · ★ 40.1k
  • google-ai-edge/mediapipe

    Cross-platform, customizable ML solutions for live and streaming media.

    Language: C++ · ★ 31.4k
  • Tencent/ncnn

    ncnn is a high-performance neural network inference framework optimized for the mobile platform

    Language: C++ · ★ 22.1k
  • SYSTRAN/faster-whisper

    Faster Whisper transcription with CTranslate2

    Language: Python · ★ 18.1k
  • sgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Language: Python · ★ 18k
  • stas00/ml-engineering

    Machine Learning Engineering Open Book

    Language: Python · ★ 15.1k
  • gvergnaud/ts-pattern

    🎨 The exhaustive Pattern Matching library for TypeScript, with smart type inference.

    Language: TypeScript · ★ 14.2k
  • NVIDIA/TensorRT

    NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

    Language: C++ · ★ 12.1k
  • aws/amazon-sagemaker-examples

    Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.

    Language: Jupyter Notebook · ★ 10.7k
  • huggingface/text-generation-inference

    Large Language Model Text Generation Inference

    Language: Python · ★ 10.5k
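
Under any text-generation server, the work bottoms out in a loop of "score the vocabulary, pick a token, append, repeat". The sampling step in miniature, with toy logits (a sketch of the general technique, not TGI's code):

```python
import math
import random

def sample_token(logits, temperature=1.0, seed=None):
    """Temperature-scaled softmax over logits, then draw one token id."""
    rng = random.Random(seed)
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]   # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):              # inverse-CDF draw
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1

logits = [2.0, 0.5, -1.0]
counts = [0, 0, 0]
for s in range(1000):
    counts[sample_token(logits, seed=s)] += 1
print(counts)  # token 0 dominates, matching its softmax mass (~0.79)
```

Lowering the temperature sharpens the distribution toward the argmax; raising it flattens the distribution toward uniform sampling.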
  • triton-inference-server/server

    The Triton Inference Server provides an optimized cloud and edge inferencing solution.

    Language: Python · ★ 9.8k
  • openvinotoolkit/openvino

    OpenVINO™ is an open source toolkit for optimizing and deploying AI inference

    Language: C++ · ★ 8.8k
  • xorbitsai/inference

    Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

    Language: Python · ★ 8.5k
  • dusty-nv/jetson-inference

    Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

    Language: C++ · ★ 8.5k
  • oumi-ai/oumi

    Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!

    Language: Python · ★ 8.5k
  • Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB

    💎 A 1MB lightweight face detection model

    Language: Python · ★ 7.4k
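
Lightweight detectors like this emit many overlapping candidate boxes per face; non-maximum suppression (NMS) keeps the highest-scoring box and drops neighbors whose IoU with a kept box exceeds a threshold. A stdlib sketch of the standard algorithm (threshold and boxes chosen for illustration):

```python
def iou(a, b):
    """Intersection-over-union of boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, thresh=0.5):
    """Greedy NMS: visit boxes best-first, keep each box only if it does
    not overlap an already-kept box above the threshold."""
    order = sorted(range(len(boxes)), key=scores.__getitem__, reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]: the near-duplicate box 1 is suppressed
```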
  • gcanti/io-ts

    Runtime type system for IO decoding/encoding

    Language: TypeScript · ★ 6.8k
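
io-ts's core move is validating unknown runtime input against a codec, so that data crossing an IO boundary actually has the shape the static types claim. A minimal Python analog of what a codec's decode does (field names and the (ok, result) convention are illustrative, not io-ts's API):

```python
def decode_user(value):
    """Validate untyped input against an expected {name: str, age: int}
    shape; return (True, parsed) on success or (False, error) on failure."""
    if not isinstance(value, dict):
        return False, "expected object"
    name = value.get("name")
    age = value.get("age")
    if not isinstance(name, str):
        return False, "name: expected string"
    if not isinstance(age, int):
        return False, "age: expected integer"
    return True, {"name": name, "age": age}

print(decode_user({"name": "Ada", "age": 36}))    # (True, {...})
print(decode_user({"name": "Ada", "age": "36"}))  # (False, 'age: expected integer')
```

io-ts composes such checks from small codecs and returns an Either carrying structured errors; the sketch above collapses that to a tuple.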
  • GeeeekExplorer/nano-vllm

    Nano vLLM

    Language: Python · ★ 6.6k
  • Trusted-AI/adversarial-robustness-toolbox

    Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams

    Language: Python · ★ 5.5k
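
Among the evasion attacks ART implements is the fast gradient sign method (FGSM), which perturbs each input feature by a small step in the direction of the loss gradient's sign. For a linear score w·x the gradient with respect to x is just w, so the attack reduces to a one-liner (toy numbers, not ART's API):

```python
def fgsm_linear(x, w, eps):
    """FGSM against a linear score sum(w_i * x_i): since d(score)/dx_i = w_i,
    pushing the score down means stepping against the sign of each weight."""
    sign = lambda v: (v > 0) - (v < 0)
    return [xi - eps * sign(wi) for xi, wi in zip(x, w)]

w = [0.5, -1.0, 0.25]
x = [1.0, 1.0, 1.0]
score = lambda v: sum(wi * vi for wi, vi in zip(w, v))
x_adv = fgsm_linear(x, w, eps=0.3)
print(score(x), score(x_adv))  # the perturbed input scores strictly lower
```

The same idea applied to a deep network uses backpropagated gradients instead of w, which is where a toolbox like ART earns its keep.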
  • LMCache/LMCache

    Supercharge Your LLM with the Fastest KV Cache Layer

    Language: Python · ★ 5.3k
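
The point of a KV-cache layer like LMCache is that two requests sharing a prompt prefix can reuse the attention states already computed for that prefix, so only the suffix needs prefill. The lookup can be sketched as a longest-prefix match over hashed token runs (the hashing scheme and class are illustrative, not LMCache's design):

```python
import hashlib

class PrefixKVStore:
    """Toy store mapping hashed token prefixes to cached KV state."""

    def __init__(self):
        self.store = {}

    @staticmethod
    def _key(tokens):
        return hashlib.sha256(" ".join(map(str, tokens)).encode()).hexdigest()

    def put(self, tokens, kv_state):
        self.store[self._key(tokens)] = kv_state

    def longest_prefix(self, tokens):
        """Return (cached KV for the longest stored prefix, remaining suffix)."""
        for end in range(len(tokens), 0, -1):
            hit = self.store.get(self._key(tokens[:end]))
            if hit is not None:
                return hit, tokens[end:]
        return None, tokens

store = PrefixKVStore()
store.put([1, 2, 3], "kv(1,2,3)")
hit, todo = store.longest_prefix([1, 2, 3, 4, 5])
print(hit, todo)  # only tokens 4 and 5 still need prefill
```

A production cache additionally evicts under memory pressure and moves KV state across GPU, CPU, and disk tiers; the sketch only shows the reuse logic.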
  • superduper-io/superduper

    Superduper: End-to-end framework for building custom AI applications and agents.

    Language: Python · ★ 5.2k
  • argmaxinc/WhisperKit

    On-device Speech Recognition for Apple Silicon

    Language: Swift · ★ 5k
  • AutoGPTQ/AutoGPTQ

    An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.

    Language: Python · ★ 4.9k
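
GPTQ stores weights as low-bit integers plus a per-group scale. The baseline it improves on, simple round-to-nearest quantization, already shows the storage/accuracy trade and fits in a few lines (4-bit symmetric, one scale per row; a sketch of the baseline, not AutoGPTQ's algorithm):

```python
def quantize_rtn(row, bits=4):
    """Symmetric round-to-nearest: map floats to ints in
    [-2^(b-1), 2^(b-1)-1] with one shared scale per row."""
    qmax = 2 ** (bits - 1) - 1
    scale = max(abs(v) for v in row) / qmax
    q = [max(-qmax - 1, min(qmax, round(v / scale))) for v in row]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

row = [0.12, -0.53, 0.77, -0.02]
q, s = quantize_rtn(row)
approx = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(row, approx))
print(q, round(err, 3))  # reconstruction error stays within half a step
```

GPTQ's contribution over this baseline is choosing the rounding per weight so that the *layer output* error (measured against calibration data) is minimized, not just the per-weight error.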
  • NVIDIA-AI-IOT/torch2trt

    An easy to use PyTorch to TensorRT converter

    Language: Python · ★ 4.8k
  • Tencent/TNN

    TNN: a uniform deep learning inference framework for mobile, desktop, and server, developed by Tencent Youtu Lab and Guangying Lab. TNN is distinguished by its cross-platform capability, high performance, model compression, and code pruning. Building on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, while drawing on the extensibility and performance of existing open-source efforts. TNN has been deployed in multiple Tencent apps, such as Mobile QQ, Weishi, and Pitu. Contributions are welcome; work with us to make TNN a better framework.

    Language: C++ · ★ 4.6k
  • tencentmusic/cube-studio

    cube studio is an open-source, cloud-native, one-stop machine learning / deep learning / LLM platform: end-to-end MLOps pipelines, compute rental, online notebook development, drag-and-drop pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference serving with vGPU virtualization, edge computing, an annotation platform with automated labeling, SFT fine-tuning / reward modeling / reinforcement learning for large models such as DeepSeek, multi-node LLM inference via vLLM/Ollama/MindIE, private knowledge bases, an AI model marketplace, support for domestic Chinese CPUs/GPUs/NPUs and the Ascend ecosystem, RDMA support, and distributed frameworks including PyTorch/TensorFlow/MXNet/DeepSpeed/Paddle/ColossalAI/Horovod/Ray/Volcano.

    Language: Python · ★ 4.6k
  • openvinotoolkit/open_model_zoo

    Pre-trained Deep Learning models and demos (high quality and extremely fast)

    Language: Python · ★ 4.3k
  • typedb/typedb

    TypeDB: the power of programming, in your database

    Language: Rust · ★ 4k
  • OpenNMT/CTranslate2

    Fast inference engine for Transformer models

    Language: C++ · ★ 4k
  • OpenCSGs/csghub

    CSGHub is a brand-new open-source platform for managing LLMs, developed by the OpenCSG team. It offers both open-source and on-premise/SaaS solutions, with features comparable to Hugging Face. Gain full control over the lifecycle of LLMs, datasets, and agents, with Python SDK compatibility with Hugging Face. Join us! ⭐️

    Language: Vue · ★ 4k
  • kvcache-ai/Mooncake

    Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

    Language: C++ · ★ 3.9k
  • gpustack/gpustack

    Simple, scalable AI model deployment on GPU clusters

    Language: Python · ★ 3.7k
  • PaddlePaddle/FastDeploy

    High-performance Inference and Deployment Toolkit for LLMs and VLMs based on PaddlePaddle

    Language: Python · ★ 3.5k