Pinned Repositories
EsperBERTo
A test of the Attention Is Off By One hypothesis
Flash-Attention-Softmax-N
CUDA and Triton implementations of Flash Attention with SoftmaxN.
llama2.c-tinystories
Inference Llama 2 in one file of pure C
MosaicBERT-Softmax1
nanoGPT_softmax1
An experiment comparing baseline nanoGPT with a softmax1 variant to see how it affects perplexity
nanoGPT_softmax1_reddit
The simplest, fastest repository for training/finetuning medium-sized GPTs.
quietGPT
A scaled down empirical study of "Attention is Off by One" on nanoGPT
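These repositories all test the "Attention Is Off by One" proposal, which replaces softmax in attention with a variant that adds 1 (or more generally n) to the denominator, so a head can assign near-zero weight to every position. A minimal NumPy sketch of that idea (the function name `softmax_n` is mine, not taken from these repos):

```python
import numpy as np

def softmax_n(x, n=1.0):
    """Softmax with an extra `n` in the denominator ("softmax1" when n=1).

    softmax_n(x)_i = exp(x_i) / (n + sum_j exp(x_j))

    Unlike standard softmax, the outputs can sum to less than 1,
    letting an attention head emit (near) zero everywhere.
    """
    m = np.max(x)                        # shift by the max for numerical stability
    e = np.exp(x - m)
    return e / (n * np.exp(-m) + e.sum())  # the n term is rescaled by the same shift

scores = np.array([-10.0, -10.0, -10.0])
print(softmax_n(scores, n=0).sum())  # n=0 recovers standard softmax: sums to 1
print(softmax_n(scores, n=1).sum())  # softmax1: sums to nearly 0 here
```

With `n=0` this reduces exactly to ordinary softmax; with `n=1` and uniformly negative scores, the attention weights all collapse toward zero instead of being forced to sum to one.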