Re-implementation of the Shortcut Transformer (SCT) in fairseq, together with the accompanying experimental code. This codebase is provided solely for completeness, as supplementary material for the PhD thesis. The following files have been added to or modified within the fairseq repository and should be integrated into your local fairseq checkout in order to replicate the experiments discussed in the thesis:
./fairseq/fairseq/models/transformer/__init__.py
./fairseq/fairseq/models/transformer/transformer_base.py
./fairseq/fairseq/models/transformer/transformer_decoder.py
./fairseq/fairseq/models/transformer/transformer_encoder.py
./fairseq/fairseq/models/transformer/transformer_legacy.py
./fairseq/fairseq/modules/__init__.py
./fairseq/fairseq/modules/shortcut_multihead_attention.py
./fairseq/fairseq/modules/shortcut_multihead_attention_with_fusion.py
./fairseq/fairseq/modules/transformer_layer.py
./fairseq/fairseq/criterions/label_smoothed_cross_entropy_with_probs.py
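
For convenience, a minimal sketch of one way to copy the listed files into a local fairseq checkout is shown below. The script itself is not part of the original codebase, and the destination path `/path/to/local/fairseq` is a placeholder for your own clone; plain `cp` or manual merging works equally well. Note that several of these files (e.g. the `__init__.py` files and `transformer_layer.py`) overwrite their upstream counterparts, so use a fairseq revision compatible with this codebase.

```python
# Sketch only: copy the added/modified files into a local fairseq checkout.
import shutil
from pathlib import Path

SRC = Path("./fairseq")               # this repository's fairseq tree
DST = Path("/path/to/local/fairseq")  # placeholder: your fairseq clone

FILES = [
    "fairseq/models/transformer/__init__.py",
    "fairseq/models/transformer/transformer_base.py",
    "fairseq/models/transformer/transformer_decoder.py",
    "fairseq/models/transformer/transformer_encoder.py",
    "fairseq/models/transformer/transformer_legacy.py",
    "fairseq/modules/__init__.py",
    "fairseq/modules/shortcut_multihead_attention.py",
    "fairseq/modules/shortcut_multihead_attention_with_fusion.py",
    "fairseq/modules/transformer_layer.py",
    "fairseq/criterions/label_smoothed_cross_entropy_with_probs.py",
]

for rel in FILES:
    target = DST / rel
    target.parent.mkdir(parents=True, exist_ok=True)  # create missing dirs
    shutil.copy2(SRC / rel, target)                   # preserve metadata
    print(f"copied {rel}")
```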