Pinned Repositories
DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
modded-nanogpt
NanoGPT (124M) quality in 2.67B tokens
nanodo
Minimal decoder-only Transformer language model in JAX, from Google DeepMind
jax
Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more
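The jax description is the whole pitch: the transformations compose. A minimal sketch (the loss function and shapes are made up for illustration):

```python
import jax
import jax.numpy as jnp

# A scalar loss; jax.grad differentiates it, jax.vmap maps it over a
# batch, and jax.jit compiles the composite for GPU/TPU.
def loss(w, x):
    return jnp.sum((x @ w) ** 2)

grad_fn = jax.jit(jax.vmap(jax.grad(loss), in_axes=(None, 0)))

w = jnp.ones((3,))
xs = jnp.ones((8, 3))               # batch of 8 inputs
per_example_grads = grad_fn(w, xs)  # per-example gradients, shape (8, 3)
print(per_example_grads.shape)
```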
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
nGPT-pytorch
Quick implementation of nGPT, learning entirely on the hypersphere, from NVIDIA
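The core nGPT idea is that hidden states and weight rows live on the unit hypersphere, so normalization replaces LayerNorm and updates become rotations on the sphere. A loose sketch of that idea, not the reference implementation; the update form and `alpha` are simplified for illustration:

```python
import torch
import torch.nn.functional as F

def to_hypersphere(x: torch.Tensor) -> torch.Tensor:
    # Project onto the unit hypersphere (L2-normalize the last dim).
    return F.normalize(x, dim=-1)

h = to_hypersphere(torch.randn(4, 64))  # hidden states, each row unit-norm
block_out = torch.randn(4, 64)          # e.g. an attention/MLP block output
alpha = 0.1                             # step size (learnable in the paper)
# Renormalize after every residual update, so learning stays on the sphere.
h = to_hypersphere(h + alpha * (to_hypersphere(block_out) - h))
```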
x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
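Typical usage, following the constructor names in the repo's README; treat the exact arguments as an assumption rather than a pinned API:

```python
import torch
from x_transformers import TransformerWrapper, Decoder

# A decoder-only transformer; experimental features are toggled via
# keyword arguments on Decoder in the real library.
model = TransformerWrapper(
    num_tokens=20000,
    max_seq_len=1024,
    attn_layers=Decoder(dim=512, depth=6, heads=8),
)

tokens = torch.randint(0, 20000, (1, 1024))
logits = model(tokens)  # (1, 1024, 20000)
```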
equinox
Elegant easy-to-use neural networks + scientific computing in JAX. https://docs.kidger.site/equinox/
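Equinox models are plain PyTrees, so they pass straight through JAX transformations. A small sketch using the library's `eqx.nn.MLP` and `eqx.filter_jit`; the sizes are arbitrary:

```python
import jax
import equinox as eqx

key = jax.random.PRNGKey(0)
model = eqx.nn.MLP(in_size=2, out_size=1, width_size=32, depth=2, key=key)

@eqx.filter_jit  # jit that handles the non-array leaves of the model PyTree
def predict(model, x):
    return model(x)

x = jax.numpy.ones((2,))
print(predict(model, x))  # shape (1,)
```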
OpenDiloco
OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training
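The DiLoCo recipe behind this framework: each worker takes many cheap local optimizer steps, then a rare communication round averages the "pseudo-gradient" (pre-round weights minus post-round weights) and feeds it to an outer optimizer. A single-process toy sketch under the assumptions of the published algorithm; all names and hyperparameters are illustrative:

```python
import torch

model = torch.nn.Linear(10, 1)
inner_opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
# Outer optimizer: Nesterov momentum on the pseudo-gradient, as in the paper.
outer_opt = torch.optim.SGD(model.parameters(), lr=0.7, momentum=0.9, nesterov=True)

H = 100  # local steps between (simulated) communication rounds
for _ in range(10):
    snapshot = [p.detach().clone() for p in model.parameters()]
    for _ in range(H):  # inner phase: local steps, no communication
        x = torch.randn(32, 10)
        loss = model(x).pow(2).mean()
        inner_opt.zero_grad()
        loss.backward()
        inner_opt.step()
    with torch.no_grad():
        for p, old in zip(model.parameters(), snapshot):
            p.grad = old - p.detach()  # pseudo-gradient (all-reduced across workers in real DiLoCo)
            p.copy_(old)               # outer step starts from the pre-round weights
    outer_opt.step()
```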
flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
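The math these kernels accelerate: dropping the softmax turns causal attention into a running state update, S_t = S_{t-1} + k_t v_t^T with output o_t = q_t S_t, computable in O(n). A plain reference loop (no Triton), normalizer omitted for brevity:

```python
import torch

def linear_attention(q, k, v):
    # q, k, v: (seq_len, dim). Causal linear attention as a recurrence:
    # S_t = S_{t-1} + outer(k_t, v_t);  o_t = q_t @ S_t
    T, d = q.shape
    S = torch.zeros(d, v.shape[-1])
    out = []
    for t in range(T):
        S = S + torch.outer(k[t], v[t])
        out.append(q[t] @ S)
    return torch.stack(out)

o = linear_attention(torch.randn(8, 16), torch.randn(8, 16), torch.randn(8, 16))
print(o.shape)  # (8, 16)
```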
faresobeid's Repositories
faresobeid/nanoRWKV
RWKV in nanoGPT style