Pinned Repositories
DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
modded-nanogpt
NanoGPT (124M) quality in 2.67B tokens
nanodo
Minimal decoder-only Transformer language model in JAX, from Google DeepMind
jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
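The jax description is the whole pitch: the transformations compose. A minimal sketch (the loss function and shapes are made up for illustration):

```python
import jax
import jax.numpy as jnp

# A scalar loss; jax.grad differentiates it, jax.vmap maps it over a
# batch, and jax.jit compiles the composite for GPU/TPU.
def loss(w, x):
    return jnp.sum((x @ w) ** 2)

grad_fn = jax.jit(jax.vmap(jax.grad(loss), in_axes=(None, 0)))

w = jnp.ones((3,))
xs = jnp.ones((8, 3))               # batch of 8 inputs
per_example_grads = grad_fn(w, xs)  # per-example gradients, shape (8, 3)
print(per_example_grads.shape)
```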
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
nGPT-pytorch
Quick implementation of nGPT, learning entirely on the hypersphere, from NVIDIA
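The core nGPT idea is that hidden states and weight rows live on the unit hypersphere, so normalization replaces LayerNorm and updates become rotations on the sphere. A loose sketch of that idea, not the reference implementation; the update form and `alpha` are simplified for illustration:

```python
import torch
import torch.nn.functional as F

def to_hypersphere(x: torch.Tensor) -> torch.Tensor:
    # Project onto the unit hypersphere (L2-normalize the last dim).
    return F.normalize(x, dim=-1)

h = to_hypersphere(torch.randn(4, 64))  # hidden states, each row unit-norm
block_out = torch.randn(4, 64)          # e.g. an attention/MLP block output
alpha = 0.1                             # step size (learnable in the paper)
# Renormalize after every residual update, so learning stays on the sphere.
h = to_hypersphere(h + alpha * (to_hypersphere(block_out) - h))
```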
x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
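Typical usage, following the constructor names in the repo's README; treat the exact arguments as an assumption rather than a pinned API:

```python
import torch
from x_transformers import TransformerWrapper, Decoder

# A decoder-only transformer; experimental features are toggled via
# keyword arguments on Decoder in the real library.
model = TransformerWrapper(
    num_tokens=20000,
    max_seq_len=1024,
    attn_layers=Decoder(dim=512, depth=6, heads=8),
)

tokens = torch.randint(0, 20000, (1, 1024))
logits = model(tokens)  # (1, 1024, 20000)
```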
equinox
Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/
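Equinox models are plain PyTrees, so they pass straight through JAX transformations. A small sketch using the library's `eqx.nn.MLP` and `eqx.filter_jit`; the sizes are arbitrary:

```python
import jax
import equinox as eqx

key = jax.random.PRNGKey(0)
model = eqx.nn.MLP(in_size=2, out_size=1, width_size=32, depth=2, key=key)

@eqx.filter_jit  # jit that handles the non-array leaves of the model PyTree
def predict(model, x):
    return model(x)

x = jax.numpy.ones((2,))
print(predict(model, x))  # shape (1,)
```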
OpenDiloco
OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training
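The DiLoCo recipe behind this framework: each worker takes many cheap local optimizer steps, then a rare communication round averages the "pseudo-gradient" (pre-round weights minus post-round weights) and feeds it to an outer optimizer. A single-process toy sketch under the assumptions of the published algorithm; all names and hyperparameters are illustrative:

```python
import torch

model = torch.nn.Linear(10, 1)
inner_opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
# Outer optimizer: Nesterov momentum on the pseudo-gradient, as in the paper.
outer_opt = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9, nesterov=True)

H = 100  # local steps between (simulated) communication rounds
for _ in range(10):
    snapshot = [p.detach().clone() for p in model.parameters()]
    for _ in range(H):  # inner phase: local steps, no communication
        x = torch.randn(32, 10)
        loss = model(x).pow(2).mean()
        inner_opt.zero_grad()
        loss.backward()
        inner_opt.step()
    with torch.no_grad():
        for p, old in zip(model.parameters(), snapshot):
            p.grad = old - p.detach()  # pseudo-gradient (all-reduced across workers in real DiLoCo)
            p.copy_(old)               # outer step starts from the pre-round weights
    outer_opt.step()
```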
flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
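The math these kernels accelerate: dropping the softmax turns causal attention into a running state update, S_t = S_{t-1} + k_t v_t^T with output o_t = q_t S_t, computable in O(n). A plain reference loop (no Triton), normalizer omitted for brevity:

```python
import torch

def linear_attention(q, k, v):
    # q, k, v: (seq_len, dim). Causal linear attention as a recurrence:
    # S_t = S_{t-1} + outer(k_t, v_t);  o_t = q_t @ S_t
    T, d = q.shape
    S = torch.zeros(d, v.shape[-1])
    out = []
    for t in range(T):
        S = S + torch.outer(k[t], v[t])
        out.append(q[t] @ S)
    return torch.stack(out)

o = linear_attention(torch.randn(8, 16), torch.randn(8, 16), torch.randn(8, 16))
print(o.shape)  # (8, 16)
```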
faresobeid's Repositories
faresobeid/nanoRWKV
RWKV in nanoGPT style