Code to implement a decoder-only transformer model that predicts the next sentence from a given input phrase
This code implements a decoder-only transformer model with a multi-head attention mechanism. The model is trained on the TinyStories dataset and produces a 20-word generative output that follows on semantically and syntactically from a given input phrase.
- tokens.py - creates and trains the SentencePiece tokeniser (sketch below)
- dataset.py - implements the tokenised dataset (sketch below)
- positional_encoding.py - implements the positional encoding from the decoder architecture (sketch below)
- multi_head_attention.py - implements the multi-head attention mechanism from the decoder architecture (sketch below)
- position_wise_feed_forward.py - implements the position-wise feed-forward layers from the decoder architecture (sketch below)
- decoder_layer.py - implements the decoder layer from the previously defined building blocks (sketch below)
- transformer.py - implements the decoder-only transformer from the decoder layer and positional encoding building blocks (sketch below)
- train.py - trains the transformer on the TinyStories dataset (sketch below)
- sentence_completer.py - generates output text from an input phrase (sketch below)
- server.py - runs a server so the model can be accessed and interacted with from a website (sketch below)
- constants.py - contains constants for the project
- utilities.py - contains simple functions for accessing and loading the latest transformer models
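A minimal sketch of how tokens.py might train the SentencePiece tokeniser; the corpus path, model prefix, vocabulary size, and model type are illustrative assumptions rather than the project's actual settings:

```python
import sentencepiece as spm

# Train a SentencePiece model on a plain-text corpus.
# "corpus.txt", vocab_size=8000 and model_type="bpe" are assumed values.
spm.SentencePieceTrainer.train(
    input="corpus.txt",        # one sentence per line
    model_prefix="tokeniser",  # writes tokeniser.model / tokeniser.vocab
    vocab_size=8000,
    model_type="bpe",
)

# Load the trained model and round-trip a sample phrase.
sp = spm.SentencePieceProcessor(model_file="tokeniser.model")
ids = sp.encode("Once upon a time", out_type=int)
print(ids, sp.decode(ids))
```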
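One plausible shape for the tokenised dataset in dataset.py: fixed-length token windows whose targets are the inputs shifted one position right. The class name and sequence length are assumptions:

```python
import torch
from torch.utils.data import Dataset

class TokenisedDataset(Dataset):
    """Serves (input, target) pairs for next-token prediction.

    `token_ids` is one long sequence of token ids; each item is a window
    of seq_len + 1 tokens, split into the input (first seq_len tokens)
    and the target (the same window shifted one position right).
    """

    def __init__(self, token_ids, seq_len=128):
        self.token_ids = token_ids
        self.seq_len = seq_len

    def __len__(self):
        return max(0, len(self.token_ids) - self.seq_len - 1)

    def __getitem__(self, idx):
        chunk = torch.tensor(self.token_ids[idx : idx + self.seq_len + 1])
        return chunk[:-1], chunk[1:]
```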
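positional_encoding.py presumably implements the standard sinusoidal encoding from the original transformer decoder; a sketch under that assumption:

```python
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Adds fixed sinusoidal position information to token embeddings."""

    def __init__(self, d_model, max_len=512):
        super().__init__()
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(max_len).unsqueeze(1).float()
        div_term = torch.exp(
            torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model)
        )
        pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
        self.register_buffer("pe", pe.unsqueeze(0))   # (1, max_len, d_model)

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        return x + self.pe[:, : x.size(1)]
```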
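A sketch of the masked multi-head self-attention that multi_head_attention.py likely implements; class and parameter names are assumptions:

```python
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Scaled dot-product self-attention split across several heads."""

    def __init__(self, d_model, num_heads):
        super().__init__()
        assert d_model % num_heads == 0
        self.d_k = d_model // num_heads
        self.num_heads = num_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, x, mask=None):
        batch, seq_len, _ = x.shape
        # Project, then reshape to (batch, heads, seq_len, d_k).
        q = self.w_q(x).view(batch, seq_len, self.num_heads, self.d_k).transpose(1, 2)
        k = self.w_k(x).view(batch, seq_len, self.num_heads, self.d_k).transpose(1, 2)
        v = self.w_v(x).view(batch, seq_len, self.num_heads, self.d_k).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)
        if mask is not None:
            # Block attention to future positions (causal masking).
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(batch, seq_len, -1)
        return self.w_o(out)
```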
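position_wise_feed_forward.py most likely holds the usual two-layer MLP applied independently at every position; a sketch with an assumed ReLU activation:

```python
import torch.nn as nn

class PositionWiseFeedForward(nn.Module):
    """Two linear layers with a nonlinearity, applied per position."""

    def __init__(self, d_model, d_ff):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x):
        return self.net(x)
```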
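decoder_layer.py combines the two blocks above; this sketch assumes a post-norm layout with residual connections and dropout, and reuses the class names from the sketches above:

```python
import torch.nn as nn
# Assumed module/class names, matching the sketches above.
from multi_head_attention import MultiHeadAttention
from position_wise_feed_forward import PositionWiseFeedForward

class DecoderLayer(nn.Module):
    """Masked self-attention then feed-forward, each wrapped in a
    residual connection followed by layer normalisation."""

    def __init__(self, d_model, num_heads, d_ff, dropout=0.1):
        super().__init__()
        self.attn = MultiHeadAttention(d_model, num_heads)
        self.ff = PositionWiseFeedForward(d_model, d_ff)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, mask):
        x = self.norm1(x + self.dropout(self.attn(x, mask)))
        x = self.norm2(x + self.dropout(self.ff(x)))
        return x
```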
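transformer.py assembles the full model: token embedding, positional encoding, a stack of decoder layers, and a projection back to the vocabulary. Hyperparameter defaults here are illustrative assumptions:

```python
import torch
import torch.nn as nn
# Assumed module/class names, matching the sketches above.
from positional_encoding import PositionalEncoding
from decoder_layer import DecoderLayer

class DecoderOnlyTransformer(nn.Module):
    """Embedding + positional encoding + N decoder layers + vocab projection."""

    def __init__(self, vocab_size, d_model=256, num_heads=8, d_ff=1024, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = PositionalEncoding(d_model)
        self.layers = nn.ModuleList(
            DecoderLayer(d_model, num_heads, d_ff) for _ in range(num_layers)
        )
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, ids):
        # Causal mask: position i may only attend to positions <= i.
        seq_len = ids.size(1)
        mask = torch.tril(torch.ones(seq_len, seq_len, device=ids.device)).bool()
        x = self.pos(self.embed(ids))
        for layer in self.layers:
            x = layer(x, mask)
        return self.out(x)  # logits over the vocabulary at every position
```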
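train.py presumably runs a standard next-token cross-entropy loop over TinyStories; a sketch with assumed hyperparameters and the module names from above:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
# Assumed module/class names, matching the sketches above.
from dataset import TokenisedDataset
from transformer import DecoderOnlyTransformer

def train(token_ids, vocab_size, epochs=1, lr=3e-4, batch_size=32):
    """Next-token prediction training loop (hyperparameters illustrative)."""
    model = DecoderOnlyTransformer(vocab_size)
    loader = DataLoader(TokenisedDataset(token_ids), batch_size=batch_size, shuffle=True)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for epoch in range(epochs):
        for inputs, targets in loader:
            logits = model(inputs)  # (batch, seq, vocab)
            # Flatten so every position contributes one classification term.
            loss = loss_fn(logits.flatten(0, 1), targets.flatten())
            opt.zero_grad()
            loss.backward()
            opt.step()
        print(f"epoch {epoch}: loss {loss.item():.3f}")
    return model
```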
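sentence_completer.py generates the continuation token by token. This sketch uses greedy decoding and caps generation at 20 new tokens as a stand-in for the 20-word output; the project's actual sampling strategy and stopping rule may differ:

```python
import torch

@torch.no_grad()
def complete(model, sp, phrase, max_new_tokens=20):
    """Greedily extend an input phrase; `sp` is a SentencePieceProcessor."""
    model.eval()
    ids = torch.tensor([sp.encode(phrase, out_type=int)])
    for _ in range(max_new_tokens):
        logits = model(ids)                      # (1, seq, vocab)
        next_id = logits[0, -1].argmax().item()  # greedy choice at the last position
        ids = torch.cat([ids, torch.tensor([[next_id]])], dim=1)
    return sp.decode(ids[0].tolist())
```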
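server.py exposes the model to the website; the README does not say which framework is used, so this sketch assumes Flask, and the checkpoint and tokeniser paths are hypothetical:

```python
import sentencepiece as spm
import torch
from flask import Flask, jsonify, request
# Assumed module/class names, matching the sketches above.
from sentence_completer import complete
from transformer import DecoderOnlyTransformer

app = Flask(__name__)

# Load the tokeniser and model once at startup (paths are assumptions).
sp = spm.SentencePieceProcessor(model_file="tokeniser.model")
model = DecoderOnlyTransformer(sp.vocab_size())
model.load_state_dict(torch.load("model.pt"))
model.eval()

@app.route("/complete", methods=["POST"])
def complete_endpoint():
    # Expects JSON like {"phrase": "Once upon a time"}.
    phrase = request.get_json()["phrase"]
    return jsonify({"completion": complete(model, sp, phrase)})

if __name__ == "__main__":
    app.run(port=5000)
```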
Louis Chapo-Saunders