An analysis of two years of daily blogging on matt-rickard.com.
-
I embedded all my posts using BERT (a transformer model pre-trained on a large corpus of English text). BERT-base represents each input as a 768-dimensional vector.
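Roughly, the embedding step looks like this. This is a minimal sketch assuming the HuggingFace transformers library, the bert-base-uncased checkpoint, and mean pooling over the final hidden states; the post doesn't specify the exact pipeline or pooling strategy.

```python
# Sketch: embed post titles with BERT (assumes transformers + torch installed).
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(titles: list[str]) -> torch.Tensor:
    inputs = tokenizer(titles, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Mask out padding tokens, then average the remaining token vectors
    # to get one 768-dimensional embedding per title.
    mask = inputs["attention_mask"].unsqueeze(-1)
    summed = (outputs.last_hidden_state * mask).sum(dim=1)
    return summed / mask.sum(dim=1)  # shape: (len(titles), 768)
```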
-
Then I ran the embeddings through t-SNE (t-distributed stochastic neighbor embedding), a technique for visualizing high-dimensional data by projecting it down to two dimensions.
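With scikit-learn, the projection is a one-liner. A sketch; the implementation and hyperparameters (perplexity, random seed) are assumptions, not something the original specifies.

```python
# Sketch: project 768-dim embeddings down to 2-D with scikit-learn's TSNE.
from sklearn.manifold import TSNE

embeddings = embed(titles).numpy()  # (n_posts, 768), from the step above
points_2d = TSNE(n_components=2, perplexity=30, random_state=42).fit_transform(embeddings)
# points_2d has shape (n_posts, 2): one (x, y) per post title.
```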
-
Finally, I divided the two-dimensional space into equally sized bins and asked GPT-3.5 to come up with a category name for the post titles in each bin.
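Sketched out, the binning and labeling might look like the following. The 8×8 grid size and the prompt wording are assumptions (the post only says "equally sized bins" and GPT-3.5); it uses the OpenAI Python SDK's chat completions API.

```python
# Sketch: bin the 2-D points into an equally sized grid, then have
# GPT-3.5 name each non-empty bin from its post titles.
import numpy as np
from collections import defaultdict
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
GRID = 8  # assumed grid size

# Map each 2-D point to a (row, col) cell of an equally sized grid.
x_edges = np.linspace(points_2d[:, 0].min(), points_2d[:, 0].max(), GRID + 1)
y_edges = np.linspace(points_2d[:, 1].min(), points_2d[:, 1].max(), GRID + 1)
bins = defaultdict(list)
for title, (x, y) in zip(titles, points_2d):
    col = min(np.searchsorted(x_edges, x, side="right") - 1, GRID - 1)
    row = min(np.searchsorted(y_edges, y, side="right") - 1, GRID - 1)
    bins[(row, col)].append(title)

# Ask GPT-3.5 for a short category name per bin (prompt wording assumed).
labels = {}
for cell, cell_titles in bins.items():
    prompt = (
        "Give a short category name (2-4 words) for these blog post titles:\n"
        + "\n".join(cell_titles)
    )
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    labels[cell] = resp.choices[0].message.content.strip()
```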