NVIDIA Deep Learning Examples for Tensor Cores
Introduction
This repository provides up-to-date deep learning example networks for training. The examples are tuned to achieve the best performance and convergence on NVIDIA Volta Tensor Cores.
NVIDIA GPU Cloud (NGC) Container Registry
These examples, along with our NVIDIA deep learning software stack, are provided in Docker containers on the NGC container registry (https://ngc.nvidia.com), which are updated monthly. These containers include:
- The latest NVIDIA examples from this repository
- The latest NVIDIA contributions shared upstream to the respective framework
- The latest NVIDIA Deep Learning software libraries, such as cuDNN, NCCL, and cuBLAS, all of which have been through a rigorous monthly quality assurance process to ensure that they provide the best possible performance
- Monthly release notes for each of the NVIDIA optimized containers
Directory structure
The examples are organized first by framework (such as TensorFlow or PyTorch) and second by use case (such as computer vision or natural language processing). This structure is intended to help you quickly locate the example networks that best suit your needs. Here are the currently supported models:
Computer Vision
- ResNet-50 [MXNet] [PyTorch] [TensorFlow]
- ResNeXt [PyTorch]
- SE-ResNeXt [PyTorch]
- SSD [PyTorch] [TensorFlow]
- Mask R-CNN [PyTorch] [TensorFlow] [TensorFlow 2]
- U-Net (industrial) [TensorFlow]
- U-Net (medical) [TensorFlow] [TensorFlow 2]
- V-Net [TensorFlow]
Natural Language Processing
- GNMT [PyTorch] [TensorFlow]
- Transformer [PyTorch]
- BERT [PyTorch] [TensorFlow]
- Transformer-XL [PyTorch]
Recommender Systems
- NCF [PyTorch] [TensorFlow]
- VAE-CF [TensorFlow]
- WideAndDeep [TensorFlow]
Text to Speech
- Tacotron2 & WaveGlow [PyTorch]
Speech Recognition
- Jasper [PyTorch]
CUDA Accelerated Applications
- Kaldi [TRTIS]
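As a rough illustration of the framework-then-use-case layout described above, the following sketch walks a local checkout and groups second-level directories under their top-level directory. The helper name `list_examples` is hypothetical, and the code assumes that every non-hidden top-level directory is a framework and that each of its subdirectories is a use case; a real checkout may contain other directories that do not fit this pattern.

```python
from pathlib import Path


def list_examples(repo_root):
    """Map each top-level directory (assumed to be a framework) to the
    sorted names of its subdirectories (assumed to be use cases)."""
    layout = {}
    for framework_dir in sorted(Path(repo_root).iterdir()):
        # Skip plain files and hidden directories such as .git.
        if not framework_dir.is_dir() or framework_dir.name.startswith("."):
            continue
        layout[framework_dir.name] = sorted(
            child.name for child in framework_dir.iterdir() if child.is_dir()
        )
    return layout
```

Calling `list_examples(".")` from the root of a checkout returns a dictionary keyed by directory name, which can then be printed or filtered as needed.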
Jupyter Notebooks
Models | TensorFlow | PyTorch | TensorRT | TRTIS |
---|---|---|---|---|
SSD | Inference | Inference | - | - |
MaskRCNN | - | Training & Inference | - | - |
Jasper | - | - | PyTorch Inference TensorRT Colab, PyTorch Inference TensorRT | PyTorch Inference TRTIS |
Tacotron2 & WaveGlow | - | Training & Inference | - | PyTorch Inference TRTIS |
BERT | Inference Movie Review Sentiment, Fine-Tuning SQuAD, Inference Colab, Inference | - | - | - |
BioBERT | Inference | - | - | - |
UNet Industrial | Export and Inference Colab, Inference | - | - | - |
Automatic Mixed Precision | AMP Training | - | - | - |
Feature Matrix
Models | Framework | DALI | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | TRTIS | TF-TRT |
---|---|---|---|---|---|---|---|---|---|
ResNet50 v1.5 | PyTorch | Yes | Yes | Yes | - | - | - | - | - |
ResNeXt101-32x4d | PyTorch | Yes | Yes | Yes | - | - | - | - | - |
SE-ResNeXt101-32x4d | PyTorch | Yes | Yes | Yes | - | - | - | - | - |
SSD300 v1.1 | PyTorch | Yes | Yes | Yes | - | - | - | - | - |
BERT | PyTorch | N/A | Yes | Yes | Yes | - | - | - | - |
Transformer-XL | PyTorch | N/A | Yes | Yes | Yes | - | - | - | - |
Neural Collaborative Filtering | PyTorch | N/A | Yes | Yes | - | - | - | - | - |
Mask R-CNN | PyTorch | N/A | Yes | Yes | - | - | - | - | - |
Jasper | PyTorch | N/A | Yes | Yes | - | Yes | Yes | Yes | - |
Tacotron 2 and WaveGlow v1.10 | PyTorch | N/A | Yes | Yes | - | Yes | Yes | Yes | - |
GNMT v2 | PyTorch | N/A | Yes | Yes | - | - | - | - | - |
Transformer | PyTorch | N/A | Yes | Yes | - | - | - | - | - |
ResNet-50 v1.5 | TensorFlow | Yes | Yes | Yes | - | - | - | - | - |
SSD320 v1.2 | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
BERT | TensorFlow | N/A | Yes | Yes | Yes | Yes | - | Yes | Yes |
BioBERT | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
Neural Collaborative Filtering | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
Variational Autoencoder Collaborative Filtering | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
WideAndDeep | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
U-Net Industrial | TensorFlow | N/A | Yes | Yes | - | Yes | - | - | Yes |
U-Net Medical | TensorFlow | N/A | Yes | Yes | - | Yes | - | - | Yes |
V-Net Medical | TensorFlow | N/A | Yes | Yes | - | Yes | Yes | - | Yes |
Mask R-CNN | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
GNMT v2 | TensorFlow | N/A | Yes | Yes | - | - | - | - | - |
Faster Transformer | TensorFlow | N/A | - | - | - | Yes | - | - | - |
U-Net Medical | TensorFlow-2 | N/A | Yes | Yes | - | Yes | - | - | Yes |
Mask R-CNN | TensorFlow-2 | N/A | Yes | Yes | - | - | - | - | - |
ResNet50 v1.5 | MXNet | Yes | Yes | Yes | - | - | - | - | - |
HMM | Kaldi | N/A | - | Yes | - | - | - | Yes | - |
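The matrix above can also be queried programmatically. The sketch below parses a few rows copied from the table and returns the models that list `Yes` for a given feature. The helper names `split_row`, `parse_matrix`, and `models_with` are hypothetical, and the parsing assumes the pipe-delimited layout used in this README (a trailing pipe on every row and no pipes inside cells).

```python
HEADER = "Models | Framework | DALI | AMP | Multi-GPU | Multi-Node | TensorRT | ONNX | TRTIS | TF-TRT |"

# A small sample of rows from the feature matrix, not the full table.
ROWS = """\
ResNet50 v1.5 | PyTorch | Yes | Yes | Yes | - | - | - | - | - |
BERT | PyTorch | N/A | Yes | Yes | Yes | - | - | - | - |
Jasper | PyTorch | N/A | Yes | Yes | - | Yes | Yes | Yes | - |
BERT | TensorFlow | N/A | Yes | Yes | Yes | Yes | - | Yes | Yes |
"""


def split_row(line):
    # Drop the trailing pipe, then split on the remaining pipes.
    return [cell.strip() for cell in line.strip().rstrip("|").split("|")]


def parse_matrix(header, rows):
    # Turn each non-empty row into a dict keyed by column name.
    columns = split_row(header)
    return [
        dict(zip(columns, split_row(line)))
        for line in rows.splitlines()
        if line.strip()
    ]


def models_with(matrix, feature):
    # Return (model, framework) pairs whose cell for `feature` is "Yes".
    return [
        (row["Models"], row["Framework"])
        for row in matrix
        if row[feature] == "Yes"
    ]


matrix = parse_matrix(HEADER, ROWS)
print(models_with(matrix, "Multi-Node"))
# → [('BERT', 'PyTorch'), ('BERT', 'TensorFlow')]
```

Cells marked `-` or `N/A` are treated the same way here: both simply fail the `== "Yes"` check.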
NVIDIA support
Each network README states the level of support that will be provided. Support ranges from ongoing updates and improvements to a point-in-time release published for thought leadership.
Feedback / Contributions
We're posting these examples on GitHub to better support the community, facilitate feedback, and collect and implement contributions through GitHub Issues and pull requests. We welcome all contributions!
Known issues
In each of the network READMEs, we indicate any known issues and encourage the community to provide feedback.