/neuropod

A uniform interface to run deep learning models from multiple frameworks

Primary LanguageC++Apache License 2.0Apache-2.0

Neuropod

What is Neuropod?

Neuropod is a library that provides a uniform interface to run deep learning models from multiple frameworks in C++ and Python. Neuropod makes it easy for researchers to build models in a framework of their choosing while also simplifying productionization of these models.

It currently supports TensorFlow, PyTorch, TorchScript, Keras and Ludwig.

Why use Neuropod?

Run models from any supported framework using one API

Running a TensorFlow model looks exactly like running a PyTorch model.

x = np.array([1, 2, 3, 4])
y = np.array([5, 6, 7, 8])

for model_path in [TF_ADDITION_MODEL_PATH, PYTORCH_ADDITION_MODEL_PATH]:
    # Load the model
    neuropod = load_neuropod(model_path)

    # Run inference
    results = neuropod.infer({"x": x, "y": y})

    # array([6, 8, 10, 12])
    print results["out"]

See the tutorial, Python guide, or C++ guide for more examples.

Some benefits of this include:

  • All of your inference code is framework agnostic.
  • You can easily switch between deep learning frameworks if necessary without changing runtime code.
  • Avoid the learning curve of using the C++ libtorch API and the C/C++ TF API

Any Neuropod model can be run from both C++ and Python (even PyTorch models that have not been converted to TorchScript).

Define a Problem API

This lets you focus more on the problem you're solving rather than the framework you're using to solve it.

For example, if you define a problem API for 2d object detection, any model that implements it can reuse all the existing inference code and infrastructure for that problem.

INPUT_SPEC = [
    # BGR image
    {"name": "image", "dtype": "uint8", "shape": (1200, 1920, 3)},
]

OUTPUT_SPEC = [
    # shape: (num_detections, 4): (xmin, ymin, xmax, ymax)
    # These values are in units of pixels. The origin is the top left corner
    # with positive X to the right and positive Y towards the bottom of the image
    {"name": "boxes", "dtype": "float32", "shape": ("num_detections", 4)},

    # The list of classes that the network can output
    # This must be some subset of ['vehicle', 'person', 'motorcycle', 'bicycle']
    {"name": "supported_object_classes", "dtype": "string", "shape": ("num_classes",)},

    # The probability of each class for each detection
    # These should all be floats between 0 and 1
    {"name": "object_class_probability", "dtype": "float32", "shape": ("num_detections", "num_classes")},
]

This lets you

  • Build a single metrics pipeline for a problem
  • Easily compare models solving the same problem (even if they're in different frameworks)
  • Build optimized inference code that can run any model that solves a particular problem
  • Swap out models that solve the same problem at runtime with no code change (even if the models are from different frameworks)
  • Run fast experiments

See the tutorial for more details.

Build generic tools and pipelines

If you have several models that take in a similar set of inputs, you can build and optimize one framework-agnostic input generation pipeline and share it across models.

Other benefits

  • Fully self-contained models (including custom ops)

  • Efficient zero-copy operations

  • Tested on platforms including

    • Mac, Linux, Linux (GPU)
    • Four or five versions of each supported framework
    • Five versions of Python
  • Model isolation with out-of-process execution

    • Use multiple different versions of frameworks in the same application
      • Ex: Experimental models using Torch nightly along with models using Torch 1.1.0
  • Switch from running in-process to running out-of-process with one line of code

Getting started

See the basic introduction tutorial for an overview of how to get started with Neuropod.

The Python guide and C++ guide go into more detail on running Neuropod models.