Code to implement a decoder-only transformer model that predicts the next sentence from a given input phrase
This code implements a decoder-only transformer model with a multi-head attention mechanism. The model is trained on the TinyStories dataset and produces a 20-word generative output that follows on semantically and syntactically from a given input phrase.
- tokens.py - creates and trains the SentencePiece tokeniser (sketch below)
- dataset.py - implements the tokenised dataset (sketch below)
- positional_encoding.py - implements the positional encoding from the decoder architecture (sketch below)
- multi_head_attention.py - implements the multi-head attention mechanism from the decoder architecture (sketch below)
- position_wise_feed_forward.py - implements the position-wise feed-forward layers from the decoder architecture (sketch below)
- decoder_layer.py - implements the decoder layer from the previously defined building blocks (sketch below)
- transformer.py - implements the decoder-only transformer from the decoder layer and positional encoding building blocks (sketch below)
- train.py - trains the transformer on the TinyStories dataset (sketch below)
- sentence_completer.py - generates output text from an input phrase (sketch below)
- server.py - runs a server so the model can be accessed and interacted with from a website (sketch below)
- constants.py - contains constants for the project
- utilities.py - contains simple functions for accessing and loading the latest transformer models
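A minimal sketch of how tokens.py might train the SentencePiece tokeniser; the corpus path, model prefix, vocabulary size, and model type are illustrative assumptions rather than the project's actual settings:

```python
import sentencepiece as spm

# Train a SentencePiece model on a plain-text corpus.
# "corpus.txt", vocab_size=8000 and model_type="bpe" are assumed values.
spm.SentencePieceTrainer.train(
    input="corpus.txt",        # one sentence per line
    model_prefix="tokeniser",  # writes tokeniser.model / tokeniser.vocab
    vocab_size=8000,
    model_type="bpe",
)

# Load the trained model and round-trip a sample phrase.
sp = spm.SentencePieceProcessor(model_file="tokeniser.model")
ids = sp.encode("Once upon a time", out_type=int)
print(ids, sp.decode(ids))
```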
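One plausible shape for the tokenised dataset in dataset.py: fixed-length token windows whose targets are the inputs shifted one position right. The class name and sequence length are assumptions:

```python
import torch
from torch.utils.data import Dataset

class TokenisedDataset(Dataset):
    """Serves (input, target) pairs for next-token prediction.

    `token_ids` is one long sequence of token ids; each item is a window
    of seq_len + 1 tokens, split into the input (first seq_len tokens)
    and the target (the same window shifted one position right).
    """

    def __init__(self, token_ids, seq_len=128):
        self.token_ids = token_ids
        self.seq_len = seq_len

    def __len__(self):
        return max(0, len(self.token_ids) - self.seq_len - 1)

    def __getitem__(self, idx):
        chunk = torch.tensor(self.token_ids[idx : idx + self.seq_len + 1])
        return chunk[:-1], chunk[1:]
```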
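positional_encoding.py presumably implements the standard sinusoidal encoding from the original transformer decoder; a sketch under that assumption:

```python
import math
import torch
import torch.nn as nn

class PositionalEncoding(nn.Module):
    """Adds fixed sinusoidal position information to token embeddings."""

    def __init__(self, d_model, max_len=512):
        super().__init__()
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(max_len).unsqueeze(1).float()
        div_term = torch.exp(
            torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model)
        )
        pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
        self.register_buffer("pe", pe.unsqueeze(0))   # (1, max_len, d_model)

    def forward(self, x):
        # x: (batch, seq_len, d_model)
        return x + self.pe[:, : x.size(1)]
```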
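A sketch of the masked multi-head self-attention that multi_head_attention.py likely implements; class and parameter names are assumptions:

```python
import math
import torch
import torch.nn as nn

class MultiHeadAttention(nn.Module):
    """Scaled dot-product self-attention split across several heads."""

    def __init__(self, d_model, num_heads):
        super().__init__()
        assert d_model % num_heads == 0
        self.d_k = d_model // num_heads
        self.num_heads = num_heads
        self.w_q = nn.Linear(d_model, d_model)
        self.w_k = nn.Linear(d_model, d_model)
        self.w_v = nn.Linear(d_model, d_model)
        self.w_o = nn.Linear(d_model, d_model)

    def forward(self, x, mask=None):
        batch, seq_len, _ = x.shape
        # Project, then reshape to (batch, heads, seq_len, d_k).
        q = self.w_q(x).view(batch, seq_len, self.num_heads, self.d_k).transpose(1, 2)
        k = self.w_k(x).view(batch, seq_len, self.num_heads, self.d_k).transpose(1, 2)
        v = self.w_v(x).view(batch, seq_len, self.num_heads, self.d_k).transpose(1, 2)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_k)
        if mask is not None:
            # Block attention to future positions (causal masking).
            scores = scores.masked_fill(mask == 0, float("-inf"))
        attn = torch.softmax(scores, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(batch, seq_len, -1)
        return self.w_o(out)
```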
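position_wise_feed_forward.py most likely holds the usual two-layer MLP applied independently at every position; a sketch with an assumed ReLU activation:

```python
import torch.nn as nn

class PositionWiseFeedForward(nn.Module):
    """Two linear layers with a nonlinearity, applied per position."""

    def __init__(self, d_model, d_ff):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
        )

    def forward(self, x):
        return self.net(x)
```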
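decoder_layer.py combines the two blocks above; this sketch assumes a post-norm layout with residual connections and dropout, and reuses the class names from the sketches above:

```python
import torch.nn as nn
# Assumed module/class names, matching the sketches above.
from multi_head_attention import MultiHeadAttention
from position_wise_feed_forward import PositionWiseFeedForward

class DecoderLayer(nn.Module):
    """Masked self-attention then feed-forward, each wrapped in a
    residual connection followed by layer normalisation."""

    def __init__(self, d_model, num_heads, d_ff, dropout=0.1):
        super().__init__()
        self.attn = MultiHeadAttention(d_model, num_heads)
        self.ff = PositionWiseFeedForward(d_model, d_ff)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.dropout = nn.Dropout(dropout)

    def forward(self, x, mask):
        x = self.norm1(x + self.dropout(self.attn(x, mask)))
        x = self.norm2(x + self.dropout(self.ff(x)))
        return x
```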
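transformer.py assembles the full model: token embedding, positional encoding, a stack of decoder layers, and a projection back to the vocabulary. Hyperparameter defaults here are illustrative assumptions:

```python
import torch
import torch.nn as nn
# Assumed module/class names, matching the sketches above.
from positional_encoding import PositionalEncoding
from decoder_layer import DecoderLayer

class DecoderOnlyTransformer(nn.Module):
    """Embedding + positional encoding + N decoder layers + vocab projection."""

    def __init__(self, vocab_size, d_model=256, num_heads=8, d_ff=1024, num_layers=6):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.pos = PositionalEncoding(d_model)
        self.layers = nn.ModuleList(
            DecoderLayer(d_model, num_heads, d_ff) for _ in range(num_layers)
        )
        self.out = nn.Linear(d_model, vocab_size)

    def forward(self, ids):
        # Causal mask: position i may only attend to positions <= i.
        seq_len = ids.size(1)
        mask = torch.tril(torch.ones(seq_len, seq_len, device=ids.device)).bool()
        x = self.pos(self.embed(ids))
        for layer in self.layers:
            x = layer(x, mask)
        return self.out(x)  # logits over the vocabulary at every position
```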
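train.py presumably runs a standard next-token cross-entropy loop over TinyStories; a sketch with assumed hyperparameters and the module names from above:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
# Assumed module/class names, matching the sketches above.
from dataset import TokenisedDataset
from transformer import DecoderOnlyTransformer

def train(token_ids, vocab_size, epochs=1, lr=3e-4, batch_size=32):
    """Next-token prediction training loop (hyperparameters illustrative)."""
    model = DecoderOnlyTransformer(vocab_size)
    loader = DataLoader(TokenisedDataset(token_ids), batch_size=batch_size, shuffle=True)
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for epoch in range(epochs):
        for inputs, targets in loader:
            logits = model(inputs)  # (batch, seq, vocab)
            # Flatten so every position contributes one classification term.
            loss = loss_fn(logits.flatten(0, 1), targets.flatten())
            opt.zero_grad()
            loss.backward()
            opt.step()
        print(f"epoch {epoch}: loss {loss.item():.3f}")
    return model
```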
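sentence_completer.py generates the continuation token by token. This sketch uses greedy decoding and caps generation at 20 new tokens as a stand-in for the 20-word output; the project's actual sampling strategy and stopping rule may differ:

```python
import torch

@torch.no_grad()
def complete(model, sp, phrase, max_new_tokens=20):
    """Greedily extend an input phrase; `sp` is a SentencePieceProcessor."""
    model.eval()
    ids = torch.tensor([sp.encode(phrase, out_type=int)])
    for _ in range(max_new_tokens):
        logits = model(ids)                      # (1, seq, vocab)
        next_id = logits[0, -1].argmax().item()  # greedy choice at the last position
        ids = torch.cat([ids, torch.tensor([[next_id]])], dim=1)
    return sp.decode(ids[0].tolist())
```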
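server.py exposes the model to the website; the README does not say which framework is used, so this sketch assumes Flask, and the checkpoint and tokeniser paths are hypothetical:

```python
import sentencepiece as spm
import torch
from flask import Flask, jsonify, request
# Assumed module/class names, matching the sketches above.
from sentence_completer import complete
from transformer import DecoderOnlyTransformer

app = Flask(__name__)

# Load the tokeniser and model once at startup (paths are assumptions).
sp = spm.SentencePieceProcessor(model_file="tokeniser.model")
model = DecoderOnlyTransformer(sp.vocab_size())
model.load_state_dict(torch.load("model.pt"))
model.eval()

@app.route("/complete", methods=["POST"])
def complete_endpoint():
    # Expects JSON like {"phrase": "Once upon a time"}.
    phrase = request.get_json()["phrase"]
    return jsonify({"completion": complete(model, sp, phrase)})

if __name__ == "__main__":
    app.run(port=5000)
```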
Louis Chapo-Saunders