InvertibleUT

Combining the invertibility of normalizing flows with the strength of the Universal Transformer


Combining the Universal Transformer and flow-based models

Experiment 1: Invertible Universal Transformer

This experiment tests combining invertible neural networks (i-RevNet, Reversible ResNet) with the Universal Transformer (UT). Because an invertible network can reconstruct its activations during the backward pass instead of storing them, the goal is memory-efficient backpropagation that lets the UT train on GPUs with less memory.
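The memory saving rests on additive coupling as used in Reversible ResNet: the input is split into two streams, and each output can be inverted exactly, so activations are recomputed rather than cached. A minimal NumPy sketch (the sub-functions `F` and `G` are stand-ins for a UT layer's attention/feed-forward blocks, not the actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sub-functions standing in for attention / feed-forward;
# any deterministic functions work, they never need to be inverted.
W_f = rng.standard_normal((4, 4))
W_g = rng.standard_normal((4, 4))
F = lambda x: np.tanh(x @ W_f)
G = lambda x: np.tanh(x @ W_g)

def rev_forward(x1, x2):
    """Additive coupling: y1 = x1 + F(x2), y2 = x2 + G(y1)."""
    y1 = x1 + F(x2)
    y2 = x2 + G(y1)
    return y1, y2

def rev_inverse(y1, y2):
    """Reconstruct the inputs from the outputs, so the backward pass
    can recompute activations on the fly instead of storing them."""
    x2 = y2 - G(y1)
    x1 = y1 - F(x2)
    return x1, x2

x1, x2 = rng.standard_normal((2, 4)), rng.standard_normal((2, 4))
y1, y2 = rev_forward(x1, x2)
r1, r2 = rev_inverse(y1, y2)
assert np.allclose(x1, r1) and np.allclose(x2, r2)
```

Since the coupling only adds and subtracts the outputs of `F` and `G`, the reconstruction is exact regardless of what those sub-functions compute.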

  • First-version implementation
  • Translation from UT parameters to invertible-UT parameters (using half of the hidden size per stream)
  • Verify the current implementation
  • Check the parameter settings for attention (might also reduce the channel size)
  • Add a parameter for deciding whether or not to share layers
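The "half of hidden size" item refers to the split the coupling requires: a UT hidden state of size d becomes two reversible streams of size d/2 each. A hypothetical sketch of that translation (helper names are illustrative, not from the codebase):

```python
import numpy as np

def split_hidden(h):
    """Split a hidden state of size d into two streams of size d/2,
    as additive coupling requires (d must be even)."""
    d = h.shape[-1]
    assert d % 2 == 0, "hidden size must be even to split into two streams"
    return h[..., : d // 2], h[..., d // 2 :]

def merge_hidden(x1, x2):
    """Inverse of split_hidden: concatenate the two streams back."""
    return np.concatenate([x1, x2], axis=-1)

h = np.arange(8.0).reshape(1, 8)      # batch of 1, hidden size 8
s1, s2 = split_hidden(h)
assert s1.shape == (1, 4) and s2.shape == (1, 4)
assert np.array_equal(merge_hidden(s1, s2), h)
```

The same halving would apply to the attention channels, which is why the attention parameter settings may need adjusting as well.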