ONNX Runtime

A cross-platform, high-performance scoring engine for ML models



ONNX Runtime is an open-source scoring engine for Open Neural Network Exchange (ONNX) models.

ONNX is an open format for machine learning (ML) models that is supported by various ML and DNN frameworks and tools. This format makes it easier to interoperate between frameworks and to maximize the reach of your hardware optimization investments. Learn more about ONNX at https://onnx.ai or view the GitHub repo.

Why use ONNX Runtime

ONNX Runtime has an open architecture that is continually evolving to address the newest developments and challenges in AI and Deep Learning. ONNX Runtime stays up to date with the ONNX standard, supporting all ONNX releases and maintaining backward compatibility with prior releases.

ONNX Runtime continuously strives to provide top performance for a broad and growing number of usage scenarios in Machine Learning. Our investments focus on:

  1. Run any ONNX model
  2. High performance
  3. Cross platform

Run any ONNX model

Alignment with ONNX Releases

ONNX Runtime provides comprehensive support of the ONNX spec and can be used to run all models based on ONNX v1.2.1 and higher. See ONNX version release details here.

As of May 2019, ONNX Runtime supports ONNX 1.5 (opset 10). See this table for details on ONNX Runtime and ONNX versioning compatibility.
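To check which opset a given model targets, you can inspect it with the onnx Python package (a minimal sketch; "model.onnx" is an illustrative local file name):

    import onnx

    # Load the model and print the opset(s) it was exported against.
    # An empty domain string denotes the default "ai.onnx" domain.
    model = onnx.load("model.onnx")
    for opset in model.opset_import:
        print(opset.domain or "ai.onnx", opset.version)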

Traditional ML support

ONNX Runtime fully supports the ONNX-ML profile of the ONNX spec for traditional ML scenarios.
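As an illustration, a scikit-learn classifier can be converted to ONNX-ML with a converter such as skl2onnx and then scored with ONNX Runtime. This is a sketch under stated assumptions (skl2onnx and scikit-learn installed; the input name and file path are illustrative):

    import numpy as np
    import onnxruntime
    from skl2onnx import convert_sklearn
    from skl2onnx.common.data_types import FloatTensorType
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    # Train a small scikit-learn model.
    X, y = load_iris(return_X_y=True)
    clf = LogisticRegression(max_iter=200).fit(X, y)

    # Convert it to ONNX-ML and serialize to disk.
    onnx_model = convert_sklearn(
        clf, initial_types=[("input", FloatTensorType([None, 4]))])
    with open("iris.onnx", "wb") as f:
        f.write(onnx_model.SerializeToString())

    # Score the converted model with ONNX Runtime.
    sess = onnxruntime.InferenceSession("iris.onnx")
    labels = sess.run(None, {"input": X[:3].astype(np.float32)})[0]
    print(labels)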

High Performance

ONNX Runtime supports both CPU and GPU hardware through a variety of execution providers. With a variety of graph optimizations and accelerators, ONNX Runtime often provides lower latency and higher efficiency compared to other runtimes. This provides faster end-to-end customer experiences and lower costs from improved machine utilization.
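Graph optimizations are applied when an inference session is created. On recent Python packages, the optimization level can be tuned through SessionOptions (a minimal sketch; the enum and attribute names here match current releases but may differ in older versions):

    import onnxruntime

    # Request the full set of graph optimizations before loading the model.
    opts = onnxruntime.SessionOptions()
    opts.graph_optimization_level = onnxruntime.GraphOptimizationLevel.ORT_ENABLE_ALL
    sess = onnxruntime.InferenceSession("model.onnx", sess_options=opts)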

Currently ONNX Runtime supports CUDA, TensorRT, MLAS (Microsoft Linear Algebra Subprograms), MKL-DNN, MKL-ML, and nGraph for computation acceleration. See more details on available build options here.
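From Python, you can check which hardware the installed package was built for (a minimal sketch; get_available_providers is only present on newer versions):

    import onnxruntime

    # Reports the device the installed package targets, e.g. "CPU" or "GPU".
    print(onnxruntime.get_device())

    # On newer versions, the registered execution providers can be listed too:
    # print(onnxruntime.get_available_providers())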

We are continuously working to integrate new execution providers to provide improvements in latency and efficiency. If you are interested in contributing a new execution provider, please see this page.

Cross Platform

ONNX Runtime offers:

  • APIs for Python, C#, and C
  • Available for Linux, Windows, and Mac OS X

See API documentation and package installation instructions below.

We have ongoing investments to make ONNX Runtime compatible with more platforms and architectures. If you have specific scenarios that are not currently supported, please share your suggestions via GitHub Issues.

Getting Started

ONNX models:

  • Check out the ONNX Model Zoo for ready-to-use pre-trained models.
  • To get an ONNX model by exporting from various frameworks, see ONNX Tutorials; a minimal PyTorch export is sketched below.
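For instance, exporting a pretrained torchvision model looks roughly like this (a sketch, assuming torch and torchvision are installed; the output file name is illustrative):

    import torch
    import torchvision

    # Any torch.nn.Module plus a sample input can be exported the same way.
    model = torchvision.models.resnet18(pretrained=True).eval()
    dummy_input = torch.randn(1, 3, 224, 224)
    torch.onnx.export(model, dummy_input, "resnet18.onnx")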

Once you have an ONNX model, you can install the runtime for your machine to try it out. There is also an ONNX-Ecosystem Docker container available and ready for use with the Python API.
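With the Python package installed (pip install onnxruntime, or onnxruntime-gpu for the GPU build), running a model takes only a few lines. This sketch assumes a model with a single float input; the file name is illustrative:

    import numpy as np
    import onnxruntime

    # Create a session and discover the model's input signature.
    sess = onnxruntime.InferenceSession("model.onnx")
    input_meta = sess.get_inputs()[0]

    # Build a random tensor of the right shape, treating symbolic
    # (dynamic) dimensions as size 1. Replace with real data.
    shape = [d if isinstance(d, int) else 1 for d in input_meta.shape]
    x = np.random.rand(*shape).astype(np.float32)

    # Passing None as the output list retrieves all model outputs.
    outputs = sess.run(None, {input_meta.name: x})
    print(outputs[0])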

One easy way to deploy the model to the cloud is by using Azure Machine Learning. See detailed instructions and sample notebooks.

Installation

System Requirements

  • ONNX Runtime binaries in the CPU packages use OpenMP and depend on the library being available at runtime on the system.
    • For Windows, OpenMP support comes as part of the Visual C++ runtime. It is also available as redistributable packages: vc_redist.x64.exe and vc_redist.x86.exe
    • For Linux, the system must have libgomp.so.1, which can be installed using apt-get install libgomp1.
  • The official GPU builds require the CUDA 9.1 and cuDNN 7.1 runtime libraries to be installed on the system.
  • Python binaries are compatible with Python 3.5-3.7.
  • Certain operators make use of system locales. At a minimum, you will need to install the English language package and configure the en_US.UTF-8 locale.
    • For Ubuntu, install the language-pack-en package
    • Run the following commands:
      • locale-gen en_US.UTF-8
      • update-locale LANG=en_US.UTF-8
    • Follow a similar procedure to configure other locales on other platforms.

APIs and Official Builds

Python
  • CPU package: available on PyPI
    • Windows: x64
    • Linux: x64
    • Mac OS X: x64
  • GPU package: available on PyPI
    • Windows: x64
    • Linux: x64

C#
  • CPU package: available on NuGet
    • MLAS+Eigen: Windows x64, x86; Linux x64, x86; Mac OS X x64
    • MKL-ML: Windows x64; Linux x64; Mac OS X x64
  • GPU package: available on NuGet
    • Windows: x64
    • Linux: x64

C
  • CPU package: available on NuGet
    • MLAS+Eigen: Windows x64, x86; Linux x64, x86; Mac OS X x64
    • MKL-ML: Windows x64; Linux x64; Mac OS X x64
  • CPU binaries (.zip, .tgz): Windows x64, x86; Linux x64, x86; Mac OS X x64
  • GPU package: available on NuGet
    • Windows: x64
    • Linux: x64
  • GPU binaries (.zip, .tgz): Windows x64; Linux x64

C++
  • CPU and GPU: build from source

For builds using other execution providers, see Build Details below.

Build Details

For details on the build configurations and information on how to create a build, see Build ONNX Runtime.

Versioning

See more details on API and ABI Versioning and ONNX Compatibility in Versioning.

Design and Key Features

For an overview of the high level architecture and key decisions in the technical design of ONNX Runtime, see Engineering Design.

ONNX Runtime is built with an extensible design that allows it to support a wide array of models with high performance.

Contribute

We welcome your contributions! Please see the contribution guidelines.

Feedback

For any feedback or to report a bug, please file a GitHub Issue.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

License

MIT License