Pinned Repositories
1.5-Pints
A compact LLM pretrained in 9 days using high-quality data
Advanced-Python
Awesome-LLM-Synthetic-Data
A reading list on LLM-based Synthetic Data Generation 🔥
awesome-open-source-lms
Friends of OLMo and their links.
Book-Mathematical-Foundation-of-Reinforcement-Learning
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
carbs
Cost-aware hyperparameter tuning algorithm
chat_templates
Chat Templates for 🤗 HuggingFace Large Language Models
cluster-health
cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
crawl4ai
🔥🕷️ Crawl4AI: Open-source LLM-Friendly Web Crawler & Scraper
musram's Repositories
musram/unsloth
Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
musram/GPU-Puzzles
Solve puzzles. Learn CUDA.
musram/ring-attention-pytorch
Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in PyTorch
musram/flash-attention
Fast and memory-efficient exact attention
musram/rotary-embedding-torch
Implementation of Rotary Embeddings, from the RoFormer paper, in PyTorch
musram/carbs
Cost-aware hyperparameter tuning algorithm
musram/cluster-health
musram/local-attention
An implementation of local windowed attention for language modeling
musram/pytorch-custom-utils
Just some miscellaneous utility functions / decorators / modules related to PyTorch and Accelerate to help speed up implementation of new AI research
musram/k2-train
musram/k2-data-prep
musram/text-dedup
All-in-one text de-duplication
musram/notebooks
musram/SPIN
The official implementation of Self-Play Fine-Tuning (SPIN)
musram/minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
musram/llm-perf-bench
musram/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
musram/pretraining-data-packing
[ACL 2024] Analysing The Impact of Sequence Composition on Language Model Pre-Training
musram/hf-codegen
A repository of Python scripts to scrape code contents of the public repositories of `huggingface`.
musram/cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
musram/LLM-Training-Puzzles
What would you do with 1000 H100s...
musram/ipyexperiments
Automatic GPU+CPU memory profiling, reuse, and memory leak detection using Jupyter/IPython experiment containers
musram/seamless_communication
Streaming text and audio demo using Gradio
musram/einops
Deep learning operations reinvented (for PyTorch, TensorFlow, JAX and others)
musram/memory-efficient-attention-pytorch
Implementation of memory-efficient multi-head attention as proposed in the paper "Self-attention Does Not Need O(n²) Memory"
musram/mlc-training
Relax Training APIs Tutorial and Examples
musram/gotraining
Go Training Class Material
musram/Advanced-Python
musram/musram
Config files for my GitHub profile.
musram/scala-app