inference
There are 1,340 repositories under the inference topic.
DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
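A minimal sketch of MII's non-persistent pipeline, following the project's README; the model ID is a placeholder and a CUDA-capable GPU is assumed:

```python
# Hedged sketch: mii.pipeline is MII's documented non-persistent entry point.
# The model ID below is a placeholder; a CUDA GPU is assumed.
import mii

pipe = mii.pipeline("mistralai/Mistral-7B-v0.1")
response = pipe(["DeepSpeed is"], max_new_tokens=64)
print(response)
```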
XNNPACK
High-efficiency floating-point neural network inference operators for mobile, server, and Web
tensorflow_template_application
TensorFlow template application for deep learning
zml
Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild
ao
PyTorch native quantization and sparsity for training and inference
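A minimal sketch of weight-only int8 post-training quantization with torchao; the toy model and tensor sizes are illustrative, and the `quantize_` / `int8_weight_only` names follow torchao's documented API:

```python
import torch
from torchao.quantization import quantize_, int8_weight_only

# Toy stand-in for a real network (sizes are arbitrary).
model = torch.nn.Sequential(
    torch.nn.Linear(1024, 1024),
    torch.nn.ReLU(),
    torch.nn.Linear(1024, 1024),
).eval()

quantize_(model, int8_weight_only())  # swaps linear weights to int8 in place

with torch.inference_mode():
    out = model(torch.randn(1, 1024))
```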
NNPACK
Acceleration package for neural networks on multi-core CPUs
transformer-deploy
Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀
delta
DELTA is a deep learning based natural language and speech processing platform. LF AI & DATA Projects: https://lfaidata.foundation/projects/delta/
TurboTransformers
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.
huggingface.js
Utilities to use the Hugging Face Hub API
inference
A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.
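A minimal sketch assuming the package's `get_model` entry point; the model alias and image path are placeholders:

```python
from inference import get_model

# "yolov8n-640" is an illustrative model alias; "image.jpg" is a placeholder.
model = get_model(model_id="yolov8n-640")
results = model.infer("image.jpg")
print(results)
```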
budgetml
Deploy an ML inference service on a budget in fewer than 10 lines of code.
agibot_x1_infer
The inference module for AgiBot X1.
BERT-NER
PyTorch named-entity recognition with BERT.
awesome-ml-demos-with-ios
Challenge projects for running machine learning model inference on iOS
CausalDiscoveryToolbox
Package for causal inference in graphs and in pairwise settings. Tools for graph structure recovery and dependency analysis are included.
multi-model-server
Multi Model Server is a tool for serving neural net models for inference
ort
Fast ML inference & training for ONNX models in Rust
neuropod
A uniform interface to run deep learning models from multiple frameworks
bolt
Bolt is a deep learning library with high performance and heterogeneous flexibility.
ims
📚 Introduction to Modern Statistics - A college-level open-source textbook with a modern approach highlighting multivariable relationships and simulation-based inference. For v1, see https://openintro-ims.netlify.app.
Adlik
Adlik: Toolkit for Accelerating Deep Learning Inference
cppflow
Run TensorFlow models in C++ without installation and without Bazel
AI-Engineering.academy
Mastering Applied AI, One Concept at a Time
pytriton
PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
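A minimal sketch of the bind/serve flow PyTriton documents; the model name, tensor names, and doubling function are illustrative only:

```python
import numpy as np
from pytriton.decorators import batch
from pytriton.model_config import Tensor
from pytriton.triton import Triton

@batch
def infer_fn(data):
    # Placeholder compute: double the batch and return it under "result".
    return {"result": data * 2}

with Triton() as triton:
    triton.bind(
        model_name="double",
        infer_func=infer_fn,
        inputs=[Tensor(name="data", dtype=np.float32, shape=(-1,))],
        outputs=[Tensor(name="result", dtype=np.float32, shape=(-1,))],
    )
    triton.serve()  # blocks, exposing Triton's HTTP/gRPC endpoints
```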
GenossGPT
One API for all LLMs, private or public (Anthropic, Llama V2, GPT 3.5/4, Vertex, GPT4All, Hugging Face ...) 🌈🐂 Replace OpenAI GPT with any LLM in your app with one line.
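In practice the "one line" is the client's base URL. A hedged sketch, assuming a locally running Genoss server that exposes an OpenAI-compatible endpoint; the URL, path, and model name are assumptions to adjust for your deployment:

```python
from openai import OpenAI

# Assumption: an OpenAI-compatible gateway on localhost:8000; the URL and
# model name below are placeholders, not Genoss-confirmed values.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")
resp = client.chat.completions.create(
    model="gpt4all-j",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```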
bark.cpp
Suno AI's Bark model in C/C++ for fast text-to-speech generation
awesome-emdl
Embedded and mobile deep learning research resources
pipeless
An open-source computer vision framework to build and deploy apps in minutes
yolort
yolort is a runtime stack for YOLOv5 on specialized accelerators such as TensorRT, LibTorch, ONNX Runtime, TVM, and NCNN.
onepanel
The open source, end-to-end computer vision platform. Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises.
InferLLM
A lightweight LLM inference framework
ai-reference-models
Intel® AI Reference Models: contains Intel optimizations for running deep learning workloads on Intel® Xeon® Scalable processors and Intel® Data Center GPUs
model_server
A scalable inference server for models optimized with OpenVINO™
tf_trt_models
TensorFlow models accelerated with NVIDIA TensorRT
filetype.py
Small, dependency-free, fast Python package to infer binary file types by checking magic number signatures
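A minimal sketch using the package's `guess` helper; the file path is a placeholder:

```python
import filetype

kind = filetype.guess("sample.png")  # inspects only the leading magic bytes
if kind is None:
    print("Cannot infer file type")
else:
    print(f"extension: {kind.extension}, MIME: {kind.mime}")
```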