inference

There are 1,340 repositories under the inference topic.

  • DeepSpeed-MII

    MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

    Language: Python · 1.9k
  • XNNPACK

    High-efficiency floating-point neural network inference operators for mobile, server, and Web

    Language: C · 1.9k
  • tensorflow_template_application

    TensorFlow template application for deep learning

    Language: Python · 1.9k
  • zml

    Any model. Any hardware. Zero compromise. Built with @ziglang / @openxla / MLIR / @bazelbuild

    Language: Zig · 1.7k
  • ao

    PyTorch native quantization and sparsity for training and inference

    Language: Python · 1.7k
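The quantization that ao applies to PyTorch weights reduces them to low-precision integers plus a scale factor. The following is a minimal pure-Python sketch of symmetric int8 post-training quantization — an illustration of the general technique, not torchao's actual API:

```python
# Sketch of symmetric int8 post-training quantization: map floats to int8
# codes with a single scale, then reconstruct approximate floats.
# Hypothetical helper names; not torchao's API.

def quantize_int8(values):
    """Map floats to int8 codes using one symmetric scale."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate floats from int8 codes and the scale."""
    return [c * scale for c in codes]

weights = [0.5, -1.27, 0.02, 1.27]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)
print(codes)     # int8 codes in [-128, 127]
print(restored)  # close to the original weights
```

Real libraries add per-channel scales, zero points for asymmetric ranges, and fused low-precision kernels, but the storage saving (8 bits per weight instead of 32) comes from exactly this transform.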
  • NNPACK

    Acceleration package for neural networks on multi-core CPUs

    Language: C · 1.7k
  • transformer-deploy

    Efficient, scalable and enterprise-grade CPU/GPU inference server for 🤗 Hugging Face transformer models 🚀

    Language: Python · 1.7k
  • delta

    DELTA is a deep-learning-based natural language and speech processing platform. An LF AI & Data Foundation project: https://lfaidata.foundation/projects/delta/

    Language: Python · 1.6k
  • TurboTransformers

    A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.

    Language: C++ · 1.5k
  • huggingface.js

    Utilities to use the Hugging Face Hub API

    Language: TypeScript · 1.4k
  • inference

    A fast, easy-to-use, production-ready inference server for computer vision supporting deployment of many popular model architectures and fine-tuned models.

    Language: Python · 1.4k
  • budgetml

    Deploy a ML inference service on a budget in less than 10 lines of code.

    Language: Python · 1.3k
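The "inference service in a few lines" idea that budgetml packages up reduces to a model behind an HTTP endpoint. Here is a stdlib-only sketch of that pattern; the scoring function and route are hypothetical stand-ins, not budgetml's API:

```python
# Minimal HTTP inference endpoint using only the standard library.
# predict() is a hypothetical stand-in for a real model.
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    # Stand-in "model": sum the features into a score.
    return {"score": sum(features)}

class InferenceHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps(predict(payload["features"])).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep the demo quiet
        pass

# Bind to an ephemeral port and serve in the background.
server = HTTPServer(("127.0.0.1", 0), InferenceHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Exercise the endpoint once, then shut down.
req = urllib.request.Request(
    f"http://127.0.0.1:{server.server_port}/predict",
    data=json.dumps({"features": [1.0, 2.5]}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.loads(resp.read())
print(result)  # {'score': 3.5}
server.shutdown()
```

Production servers in this list add what the sketch omits: batching, model loading, GPU placement, health checks, and metrics.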
  • agibot_x1_infer

    The inference module for AgiBot X1.

    Language: C++ · 1.3k
  • BERT-NER

    PyTorch named-entity recognition with BERT

    Language: Python · 1.2k
  • awesome-ml-demos-with-ios

    Challenge projects for running machine learning model inference on iOS

    Language: Python · 1.2k
  • CausalDiscoveryToolbox

    Package for causal inference in graphs and in the pairwise settings. Tools for graph structure recovery and dependencies are included.

    Language: Python · 1.1k
  • multi-model-server

    Multi Model Server is a tool for serving neural net models for inference

    Language: Java · 1k
  • ort

    Fast ML inference & training for ONNX models in Rust

    Language: Rust · 989
  • neuropod

    A uniform interface to run deep learning models from multiple frameworks

    Language: C++ · 937
  • bolt

    Bolt is a deep learning library with high performance and heterogeneous flexibility.

    Language: C++ · 927
  • ims

    📚 Introduction to Modern Statistics - A college-level open-source textbook with a modern approach highlighting multivariable relationships and simulation-based inference. For v1, see https://openintro-ims.netlify.app.

    Language: JavaScript · 874
  • Adlik

    Adlik: Toolkit for Accelerating Deep Learning Inference

    Language: C++ · 794
  • cppflow

    Run TensorFlow models in C++ without installation and without Bazel

    Language: C++ · 787
  • AI-Engineering.academy

    Mastering Applied AI, One Concept at a Time

    Language: Jupyter Notebook · 771
  • pytriton

    PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.

    Language: Python · 754
  • GenossGPT

    One API for all LLMs, private or public (Anthropic, Llama 2, GPT-3.5/4, Vertex, GPT4All, Hugging Face, ...) 🌈🐂 Replace OpenAI GPT with any LLM in your app with one line.

    Language: Python · 751
  • bark.cpp

    Suno AI's Bark model in C/C++ for fast text-to-speech generation

    Language: C++ · 750
  • awesome-emdl

    Embedded and mobile deep learning research resources

  • pipeless

    An open-source computer vision framework to build and deploy apps in minutes

    Language: Rust · 732
  • yolort

    yolort is a runtime stack for YOLOv5 on specialized accelerators such as TensorRT, LibTorch, ONNX Runtime, TVM, and NCNN.

    Language: Python · 724
  • onepanel

    The open source, end-to-end computer vision platform. Label, build, train, tune, deploy and automate in a unified platform that runs on any cloud and on-premises.

    Language: Go · 722
  • InferLLM

    A lightweight LLM inference framework

    Language: C++ · 708
  • ai-reference-models

    Intel® AI Reference Models: contains Intel optimizations for running deep learning workloads on Intel® Xeon® Scalable processors and Intel® Data Center GPUs

    Language: Python · 687
  • model_server

    A scalable inference server for models optimized with OpenVINO™

    Language: C++ · 687
  • tf_trt_models

    TensorFlow models accelerated with NVIDIA TensorRT

    Language: Python · 686
  • filetype.py

    Small, dependency-free, fast Python package to infer binary file types by checking magic number signatures

    Language: Python · 675
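The magic-number technique behind filetype.py is simple enough to sketch: compare a file's leading bytes against known signatures. This is a minimal illustration of the approach, not filetype.py's API — real libraries check many more signatures, some at non-zero offsets:

```python
# Sketch of magic-number file-type inference: match a file's leading bytes
# against well-known signatures. Hypothetical helper, not filetype.py's API.

SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "png"),  # PNG 8-byte signature
    (b"\xff\xd8\xff", "jpg"),       # JPEG SOI marker
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"%PDF-", "pdf"),
    (b"PK\x03\x04", "zip"),         # ZIP local file header
]

def guess_type(data: bytes):
    """Return a file-type label for the leading bytes, or None if unknown."""
    for magic, label in SIGNATURES:
        if data.startswith(magic):
            return label
    return None

print(guess_type(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16))  # png
print(guess_type(b"%PDF-1.7 ..."))                      # pdf
print(guess_type(b"plain text"))                        # None
```

Because only the first few bytes are inspected, this stays fast and dependency-free regardless of file size.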