xrsrke

Research Engineer @huggingface

@huggingface | Earth

Pinned Repositories

nanotron
Minimalistic large language model 3D-parallelism training
Language: Python | 1.3k 42 83133
ai-notebooks
AI notebooks
Language: Jupyter Notebook | 5 3 01
fastgoose
A PyTorch implementation of Model Parallelism and ZeRO Optimizer
Language: Jupyter Notebook | 3 1 00
instructGOOSE
Implementation of Reinforcement Learning from Human Feedback (RLHF)
Language: Jupyter Notebook | 171 5 521
pipegoose
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
Language: Python | 81 4 3217
prodgpt
A production-ready training, evaluation and data pipeline
Language: Python | 4 1 00
progen
Generating new proteins using language models
Language: Jupyter Notebook | 4 1 00
reinforcement-learning
Language: Jupyter Notebook | 9 1 11
stable-diffusion-from-scratch
Implementation of Stable Diffusion from scratch [WORK IN PROGRESS]
Language: Jupyter Notebook | 21 3 11
toolformer
Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools
Language: Jupyter Notebook | 136 4 614

xrsrke's Repositories

xrsrke/instructGOOSE
Implementation of Reinforcement Learning from Human Feedback (RLHF)
Language: Jupyter Notebook | 171 5 521
xrsrke/toolformer
Implementation of Toolformer: Language Models Can Teach Themselves to Use Tools
Language: Jupyter Notebook | 136 4 614
xrsrke/pipegoose
Large scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
Language: Python | 81 4 3217
xrsrke/prodgpt
A production-ready training, evaluation and data pipeline
Language: Python | 4 1 00
xrsrke/fastgoose
A PyTorch implementation of Model Parallelism and ZeRO Optimizer
Language: Jupyter Notebook | 3 1 00
xrsrke/homework
Notes
Language: Jupyter Notebook | 2 1 00
xrsrke/nanoGPT-msamp
Integrate MS-AMP into nanoGPT (https://github.com/karpathy/nanoGPT)
Language: Python | 2 0 01
xrsrke/snippets
snippets
Language: Python | 2 1 01
xrsrke/elasticgoose
A fault-tolerant elastic training framework for PyTorch
Language: Python | 1 2 00
xrsrke/fsdl-megatron
Code for FSDL Breaking down parallelism in Megatron-LM
Language: Python | 1 1 0
xrsrke/fsdl-website
Source for https://fullstackdeeplearning.com
Language: Jupyter Notebook | 1 0 0
xrsrke/hf-notebooks
Language: Jupyter Notebook | 1
xrsrke/Jetfire-INT8Training
Language: Jupyter Notebook | 1
xrsrke/megatron-tp
for debugging pipegoose
Language: Python | 1 1 0
xrsrke/minitron
A mini Megatron 3D parallelism library for FSDL blog
Language: Python | 1 1 0
xrsrke/mousai
PyTorch Implementation of Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion
Language: Python | 1 2 0
xrsrke/pipegoose-training
Language: Jupyter Notebook | 1 1 0
xrsrke/transformers-starcoder
Language: Python | 1 0 0
xrsrke/xrsrke
1 2 0
xrsrke/hf-blog
Public repo for HF blog posts
Language: Jupyter Notebook | 0 0
xrsrke/internal
Mechanistic Interpretability's Tools
1 0
xrsrke/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Language: Python | 0 0
xrsrke/nmmo-baselines
Baselines for Neural MMO -- new users should treat this repo as a starter project
Language: Python | 0 0
xrsrke/nmmo-environment
Neural MMO - A Massively Multiagent Environment for Artificial Intelligence Research
Language: Python | 0 0
xrsrke/perceiver
Implementation of Perceiver: General Perception with Iterative Attention from DeepMind
1 0
xrsrke/prodgpt-data
Data Versioning for ProdGPT
Language: Python | 1 0
xrsrke/prodgpt-dbt
1 0
xrsrke/vision-transformer
PyTorch implementation of An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Language: Python | 2 0
xrsrke/xrsrke.github.io
Language: Jupyter Notebook | 1 0
xrsrke/xrswtf
xrs.wtf
Language: HTML | 1 0