/ReZero-examples

PyTorch Examples repo for "ReZero is All You Need: Fast Convergence at Large Depth"

Primary LanguageJupyter NotebookMIT LicenseMIT

ReZero-examples

This repo contains examples demonstrating the power of the ReZero architecture, see the paper.

The official ReZero repo is here.

ResNet56_error

Final valid errors: Vanilla - 7.74. FixUp - 7.5. ReZero - 6.38, see .

Contribute

If you find ReZero or a similar architecture improves the performance of your application, you are invited to share a demonstration here.

Install

To install ReZero via pip use pip install rezero

Usage

We provide custom ReZero Transformer layers (RZTX).

For example, this will create a Transformer encoder:

import torch
import torch.nn as nn
from rezero.transformer import RZTXEncoderLayer

encoder_layer = RZTXEncoderLayer(d_model=512, nhead=8)
transformer_encoder = nn.TransformerEncoder(encoder_layer, num_layers=6)
src = torch.rand(10, 32, 512)
out = transformer_encoder(src)

Citation

If you find rezero useful for your research, please cite our paper:

@inproceedings{BacMajMaoCotMcA20,
    title = "ReZero is All You Need: Fast Convergence at Large Depth",
    author = "Bachlechner, Thomas  and
      Majumder, Bodhisattwa Prasad
      Mao, Huanru Henry and
      Cottrell, Garrison W. and
      McAuley, Julian",
    booktitle = "arXiv",
    year = "2020",
    url = "https://arxiv.org/abs/2003.04887"
}