I created a simple bigram LLM from scratch and trained it on Shakespeare text. It is inspired by https://arxiv.org/abs/1706.03762 (after all, attention is all you need).
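To give a feel for what "bigram" means here, the snippet below is a minimal character-level sketch, not the code in this repo. It assumes a plain count-based model (no neural network, no attention) and a tiny hard-coded sample string in place of the real Shakespeare corpus. The names `train_bigram`, `next_char_probs`, and `generate` are made up for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigram(text):
    """Count how often each character is followed by each other character."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(text, text[1:]):
        counts[prev][nxt] += 1
    return counts


def next_char_probs(counts, prev):
    """Estimate P(next | prev) from the raw counts."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {ch: n / total for ch, n in followers.items()}


def generate(counts, start, length, seed=0):
    """Sample up to `length` characters, each conditioned only on the previous one."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length):
        followers = counts.get(out[-1])
        if not followers:
            break  # the last character was never followed by anything in training
        chars = list(followers.keys())
        weights = list(followers.values())
        out.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(out)


# Stand-in for the real training text (assumption: any plain string works).
sample_text = "to be, or not to be, that is the question"
model = train_bigram(sample_text)
print(generate(model, "t", 40))
```

Because each character depends only on the one before it, the output looks locally plausible but has no longer-range structure. That limitation is what attention-based models are meant to address.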