This word-level LSTM model generates clickbait headlines like the following:
- we know your zodiac sign based on your zodiac sign
- the 17 most important canadian celebrity moments of 2015
- here's how to make a vampire
- can you guess your favorite '90s movie based on your favorite kitten
- are you more a canadian or taylor swift or oprah
To run the notebook, activate the pipenv environment and launch Jupyter:

    pipenv shell
    jupyter notebook clickbait.ipynb

This model is trained on a collection of 17,000 clickbait headlines scraped from the following esteemed publications:
- BuzzFeed
- Upworthy
- ViralNova
- Thatscoop
- Scoopwhoop
- ViralStories
Data source: the paper "Stop Clickbait: Detecting and Preventing Clickbaits in Online News Media".
This model trains its own 10-dimensional embeddings.
The model's current architecture is a two-layer LSTM with 256 units and a 20% dropout rate.
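
To get a feel for the size of this architecture, the sketch below counts trainable parameters using only the standard library. It is an illustration, not the notebook's code. It assumes 256 units in each of the two LSTM layers, the common LSTM formulation with one bias vector per gate, and a softmax output layer over the vocabulary. The function names and the vocabulary size of 5,000 are hypothetical; the real vocabulary size depends on how the headlines are tokenized.

```python
def lstm_params(input_dim: int, units: int) -> int:
    """Parameters in one LSTM layer.

    Each of the 4 gates has input weights (units x input_dim),
    recurrent weights (units x units), and one bias vector (units).
    """
    return 4 * (units * (input_dim + units) + units)


def model_params(vocab_size: int, embed_dim: int = 10,
                 units: int = 256, layers: int = 2) -> dict:
    """Rough parameter count for embedding -> stacked LSTM -> softmax.

    The softmax output layer is an assumption, not taken from the README.
    Dropout adds no trainable parameters, so the 20% rate does not appear.
    """
    counts = {"embedding": vocab_size * embed_dim}
    input_dim = embed_dim
    for i in range(layers):
        counts[f"lstm_{i + 1}"] = lstm_params(input_dim, units)
        input_dim = units  # later layers read the previous layer's output
    counts["output"] = units * vocab_size + vocab_size
    counts["total"] = sum(counts.values())
    return counts


if __name__ == "__main__":
    # vocab_size=5000 is a hypothetical value chosen for illustration.
    for name, n in model_params(vocab_size=5000).items():
        print(f"{name:>10}: {n:,}")
```

Under these assumptions the two LSTM layers contribute 273,408 and 525,312 parameters. The 10-dimensional embedding keeps the first layer small, while the output layer scales directly with vocabulary size.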
- Get more data
- Replace all this with a transformer
Inspired by Lars Eidnes' blog post
"Stop Clickbait: Detecting and Preventing Clickbaits in Online News Media"
Excellent RNN intro by Andrej Karpathy