Re-implementation of the Shortcut Transformer in fairseq and the accompanying experimental code.

Primary language: Python. License: MIT.

sct_fairseq

Re-implementation of the Shortcut Transformer (SCT) in fairseq and the accompanying experimental code. This codebase is provided solely for completeness, as supplementary material for the PhD thesis. The following files were added to or modified within the fairseq repository and should be integrated into your local fairseq repository in order to replicate the experiments discussed in the thesis:

SCT reimplementation

./fairseq/fairseq/models/transformer/__init__.py
./fairseq/fairseq/models/transformer/transformer_base.py
./fairseq/fairseq/models/transformer/transformer_decoder.py
./fairseq/fairseq/models/transformer/transformer_encoder.py
./fairseq/fairseq/models/transformer/transformer_legacy.py

./fairseq/fairseq/modules/__init__.py
./fairseq/fairseq/modules/shortcut_multihead_attention.py
./fairseq/fairseq/modules/shortcut_multihead_attention_with_fusion.py
./fairseq/fairseq/modules/transformer_layer.py

Supporting experimental code

./fairseq/fairseq/criterions/label_smoothed_cross_entropy_with_probs.py
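Integrating the files listed above into a local fairseq checkout could be scripted roughly as follows. This is a minimal sketch, not part of the original codebase: the helper name `integrate_files`, the `SCT_FILES` list (an abbreviated subset of the paths above), and the `.orig` backup convention are all assumptions made for illustration. It also assumes your local checkout uses the same directory layout as the `./fairseq` directory in this repository.

```python
import shutil
from pathlib import Path

# Abbreviated, illustrative subset of the files listed above, given relative
# to the fairseq repository root. Extend it with the remaining paths.
SCT_FILES = [
    "fairseq/models/transformer/transformer_base.py",
    "fairseq/modules/shortcut_multihead_attention.py",
    "fairseq/modules/transformer_layer.py",
    "fairseq/criterions/label_smoothed_cross_entropy_with_probs.py",
]


def integrate_files(src_root, dst_root, rel_paths=SCT_FILES, backup=True):
    """Copy each file in rel_paths from src_root into dst_root.

    src_root: the ./fairseq directory of this repository.
    dst_root: the root of your local fairseq checkout (hypothetical path).
    If backup is True, an existing target file is first saved next to
    itself with an added ".orig" suffix.
    Returns the list of destination paths that were written.
    """
    src_root, dst_root = Path(src_root), Path(dst_root)
    copied = []
    for rel in rel_paths:
        src = src_root / rel
        dst = dst_root / rel
        if not src.is_file():
            raise FileNotFoundError(f"missing source file: {src}")
        # Create the target directory if the local checkout lacks it.
        dst.parent.mkdir(parents=True, exist_ok=True)
        if backup and dst.exists():
            shutil.copy2(dst, dst.with_name(dst.name + ".orig"))
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```

A call such as `integrate_files("./fairseq", "/path/to/your/fairseq")` (the second path is a placeholder) would overwrite the listed files in the local checkout while keeping the previous versions as `.orig` backups. Because several of the files replace existing fairseq modules, it is probably safest to run this against a checkout of the fairseq version the thesis experiments were based on.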