MRL.ipynb
is a replication of matryoshka representation learning from this paper: a hierarchical representation scheme that lets one model produce embedding vectors of varying sizes simultaneously, with the smaller vectors nesting inside the larger ones like russian nesting dolls. see my youtube explanation below:
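a quick sketch of the nesting property (my own illustration, not the paper's training code): one model emits a single full-size embedding, and every prefix of it is itself a usable lower-dimensional embedding.

```python
import numpy as np

# stand-in for a learned 512-d embedding from a single model
rng = np.random.default_rng(0)
full = rng.standard_normal(512)

def nested_embedding(full_vec, d):
    """Take the first d dims and re-normalize -- the smaller 'doll'."""
    sub = full_vec[:d]
    return sub / np.linalg.norm(sub)

small = nested_embedding(full, 64)    # 64-d doll
medium = nested_embedding(full, 256)  # 256-d doll; its first 64 dims are the 64-d doll (up to scale)
```

the point is that no separate model is trained per size: every size is a prefix of the same vector.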
in MatFormer+.ipynb
I made the entire model exhibit the same splicing behavior as above throughout the inner workings of the GPT, for example performing the attention qkv multiplications with these smaller d lengths and correspondingly smaller head sizes. this had already been done by MATFORMER, except they only implemented it on the feedforward network, not the MHA, whereas I've done it with literally every part of the model
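here's a rough sketch of what splicing the attention means (names, shapes, and the single-head simplification are mine, not the notebook's code): run attention at a reduced width d_sub by slicing the residual stream and the leading block of each projection matrix.

```python
import numpy as np

def spliced_attention(x, W_q, W_k, W_v, d_sub):
    """Single-head attention at reduced width d_sub, computed by
    slicing the top-left block of each projection matrix -- an
    illustrative sketch of the splicing idea."""
    xs = x[:, :d_sub]                      # truncate the residual stream
    q = xs @ W_q[:d_sub, :d_sub]           # spliced projections
    k = xs @ W_k[:d_sub, :d_sub]
    v = xs @ W_v[:d_sub, :d_sub]
    scores = q @ k.T / np.sqrt(d_sub)      # scaled dot-product attention
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w = w / w.sum(axis=-1, keepdims=True)  # softmax over keys
    return w @ v                           # (seq_len, d_sub) output

# the same weights serve every width: full model at d=64, sub-model at d=16
rng = np.random.default_rng(1)
x = rng.standard_normal((8, 64))
W_q, W_k, W_v = (rng.standard_normal((64, 64)) for _ in range(3))
out_full = spliced_attention(x, W_q, W_k, W_v, 64)
out_small = spliced_attention(x, W_q, W_k, W_v, 16)
```

the sub-model never needs its own weight matrices; it just reads a prefix block of the full model's.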
in MatryoskhaGPT.ipynb
i'm incorporating the ideas from this paper to make MatFormer+ not only subsettable in all weight matrices but also across layers. basically, if you don't want to use all of the layers, you can run only a subset of them
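a minimal sketch of depth subsetting (my own illustration, assuming each layer is a callable block): the sub-model just runs a prefix of the layer stack.

```python
def run_depth_subset(layers, x, n_layers):
    """Apply only the first n_layers blocks -- depth-subsetting sketch."""
    for block in layers[:n_layers]:
        x = block(x)
    return x

# toy "layers": each just adds 1, so the depth used is directly observable
layers = [lambda v: v + 1 for _ in range(12)]
full = run_depth_subset(layers, 0, 12)  # all 12 layers
half = run_depth_subset(layers, 0, 6)   # nested 6-layer sub-model
```

combined with the weight-matrix splicing above, this gives sub-models that shrink in both width and depth.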
tangents.ipynb
is a rant I was going on at one point that's somehow related. if you're looking for the models pertaining to imposed & emergent hierarchical embeddings, they have been moved to this repo
- p.s. the code in this repo is based on andrej karpathy's minGPT