muellerzr's Stars
huggingface/transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
huggingface/datasets
🤗 The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools
stas00/ml-engineering
Machine Learning Engineering Open Book
huggingface/accelerate
🚀 A simple way to launch, train, and use PyTorch models on almost any device and distributed configuration, automatic mixed precision (including fp8), and easy-to-configure FSDP and DeepSpeed support
NVIDIA/ChatRTX
A developer reference project for creating Retrieval Augmented Generation (RAG) chatbots on Windows using TensorRT-LLM
facebookresearch/schedule_free
Schedule-Free Optimization in PyTorch
pytorch/torchtitan
A native PyTorch Library for large model training
microsoft/mup
maximal update parametrization (µP)
jiaweizzhao/GaLore
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
anibali/docker-pytorch
A Docker image for PyTorch
tinygrad/open-gpu-kernel-modules
NVIDIA Linux open GPU with P2P support
stas00/the-art-of-debugging
The Art of Debugging
pytorch/PiPPy
Pipeline Parallelism for PyTorch
pacman100/LLM-Workshop
LLM Workshop by Sourab Mangrulkar
bigcode-project/starcoder2-self-align
StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation
muellerzr/minimal-trainer-zoo
Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines
lucidrains/recurrent-interface-network-pytorch
Implementation of Recurrent Interface Network (RIN), for highly efficient generation of images and video without cascading networks, in PyTorch
SergioMEV/slurm-for-dummies
A dummy's guide to setting up (and using) HPC clusters on Ubuntu 22.04 LTS using Slurm and Munge. Created by the Quant Club @ UIowa.
LukasHedegaard/pytorch-benchmark
Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory, and energy consumption
xrsrke/pipegoose
Large-scale 4D parallelism pre-training for 🤗 transformers in Mixture of Experts *(still work in progress)*
fchollet/namex
Clean up the public namespace of your package!
Youhe-Jiang/IJCAI2023-OptimalShardedDataParallel
[IJCAI2023] An automated parallel training system that combines the advantages of both data and model parallelism. If interested, please visit/star/fork https://github.com/Youhe-Jiang/OptimalShardedDataParallel
muellerzr/nbquarto
Small Python library solely for quick Quarto extensions
gnovack/distributed-training-and-deepspeed
muellerzr/RAG-Experiments
My learnings (publicly) on RAG systems
TJ-Solergibert/transformers-in-supercomputers
Transformers training on a supercomputer with the 🤗 Stack and Slurm
BenjaminBossan/pytest-guide
Pytest guide for unittest users
muellerzr/llama-3-8b-self-align
StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation applied to llama 3 8b
muellerzr/swe-study-group
Code for the SWE study group
lessw2020/hyper_efficient_optimizers
Development of hyper efficient optimizers that can match/exceed AdamW, while using reduced memory