This project showcases the process of fine-tuning a pretrained GPT model so that it generates text in the style of a chosen text corpus. As a demonstration, a pretrained GPT-2 model was fine-tuned on the lyrics of Taylor Swift songs to generate text in the same style as her songs.
Dataset : https://www.kaggle.com/datasets/thespacefreak/taylor-swift-song-lyrics-all-albums
Example usage (fine-tuning) :
python3 finetune.py --epochs 5 --output_dir './models/ts2' --data_dir './notebooks/taylor_swift_lyrics.csv'
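The contents of finetune.py are not reproduced here. As a rough sketch of the kind of data preparation a fine-tuning script like this might do, the snippet below groups lyric rows into one training text per song and joins the songs with GPT-2's end-of-text token. The column names `track_title` and `lyric`, the helper names `songs_from_csv` and `build_corpus`, and the placeholder rows are assumptions for illustration and may not match the actual CSV or script.

```python
import csv
import io

END_OF_TEXT = "<|endoftext|>"  # GPT-2's end-of-text token


def songs_from_csv(handle, title_col="track_title", lyric_col="lyric"):
    """Group lyric rows into one newline-joined string per song.

    The column names are assumptions; adjust them to the real CSV header.
    """
    songs = {}  # dicts keep insertion order, so songs stay in file order
    for row in csv.DictReader(handle):
        songs.setdefault(row[title_col], []).append(row[lyric_col].strip())
    return ["\n".join(lines) for lines in songs.values()]


def build_corpus(songs):
    """Join songs into one training string, marking the end of each song."""
    return "".join(song + END_OF_TEXT for song in songs)


# Placeholder rows (not real lyrics), standing in for the lyrics CSV.
sample = io.StringIO(
    "track_title,lyric\n"
    "Song A,first line\n"
    "Song A,second line\n"
    "Song B,only line\n"
)
songs = songs_from_csv(sample)
corpus = build_corpus(songs)
print(corpus)
```

With a real file, the `io.StringIO` object would be replaced by `open(data_dir, newline="")`, where `data_dir` is the path passed via --data_dir.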
Example usage (generation) :
python3 generate.py --model_dir './models/ts2' --prompt 'hello' --max_length 100
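The contents of generate.py are not reproduced here. A minimal sketch of a script accepting the same flags, assuming the Hugging Face transformers library with PyTorch and assuming the tokenizer was saved alongside the model in --model_dir, might look like the following. The sampling settings (`top_k`, `top_p`) and the function names are illustrative choices, not the project's confirmed configuration.

```python
import argparse


def build_parser():
    """Command-line flags matching the example usage above."""
    parser = argparse.ArgumentParser(
        description="Generate text from a fine-tuned GPT-2 model."
    )
    parser.add_argument("--model_dir", required=True)
    parser.add_argument("--prompt", required=True)
    parser.add_argument("--max_length", type=int, default=100)
    return parser


def generate(model_dir, prompt, max_length):
    """Sample one continuation of `prompt` from the model in `model_dir`."""
    # Imported lazily so the flags can be parsed without transformers installed.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    # Assumption: the tokenizer files were saved next to the model weights.
    tokenizer = GPT2Tokenizer.from_pretrained(model_dir)
    model = GPT2LMHeadModel.from_pretrained(model_dir)

    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    output_ids = model.generate(
        input_ids,
        max_length=max_length,  # total length in tokens, prompt included
        do_sample=True,         # sample instead of greedy decoding
        top_k=50,               # illustrative sampling settings
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


def main(argv=None):
    args = build_parser().parse_args(argv)
    print(generate(args.model_dir, args.prompt, args.max_length))
```

Calling `main()` from an `if __name__ == "__main__":` guard would turn this into a command-line script. Because sampling is enabled, repeated runs with the same prompt generally produce different text.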
Sample output:
prompt : 'beautiful day'
max_length : 100
beautiful day
The stars shine through the shaded sky
The song's playing and it seems like we're at home
All we know is that we're dreaming
But we don't know what
We're doing
And we're getting tired of it
You're so small, like
Love in a bottle
Hold on tight, baby
Even though we're so far away
You've got your eyes like diamonds