carmocca's Stars
syncthing/syncthing
Open Source Continuous File Synchronization
mckaywrigley/chatbot-ui
AI chat for every model.
refined-github/refined-github
:octocat: Browser extension that simplifies the GitHub interface and adds useful features
karpathy/llm.c
LLM training in simple, raw C/CUDA
benfred/py-spy
Sampling profiler for Python programs
state-spaces/mamba
Mamba SSM architecture
stas00/ml-engineering
Machine Learning Engineering Open Book
NVIDIA/Megatron-LM
Ongoing research training transformer models at scale
cleanlab/cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
Lightning-AI/litgpt
Load, pretrain, finetune, deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit, LoRA, and more.
skypilot-org/skypilot
SkyPilot: Run LLMs, AI, and Batch jobs on any cloud. Get maximum savings, highest GPU availability, and managed execution—all with a simple interface.
Lightning-AI/lit-llama
Implementation of the LLaMA language model based on nanoGPT. Supports flash attention, Int8 and GPTQ 4bit quantization, LoRA and LLaMA-Adapter fine-tuning, pre-training. Apache 2.0-licensed.
pythonprofilers/memory_profiler
Monitor memory usage of Python code
JoePenna/Dreambooth-Stable-Diffusion
Implementation of Dreambooth (https://arxiv.org/abs/2208.12242) by way of Textual Inversion (https://arxiv.org/abs/2208.01618) for Stable Diffusion (https://arxiv.org/abs/2112.10752). Tweaks focused on training faces, objects, and styles.
eric-mitchell/direct-preference-optimization
Reference implementation for DPO (Direct Preference Optimization)
Lightning-AI/lightning-thunder
Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.
xl0/lovely-tensors
Tensors, ready for human consumption
mosaicml/streaming
A Data Streaming Library for Efficient Neural Network Training
bigscience-workshop/bigscience
Central place for the engineering/scaling WG: documentation, SLURM scripts and logs, compute environment and data.
repository-settings/app
Pull Requests for GitHub repository settings
pytorch/PiPPy
Pipeline Parallelism for PyTorch
penghao-wu/vstar
PyTorch implementation of "V*: Guided Visual Search as a Core Mechanism in Multimodal LLMs"
Lightning-AI/litdata
Streamline data pipelines for AI. Process datasets across thousands of machines, and optimize data for blazing-fast model training.
llm-efficiency-challenge/neurips_llm_efficiency_challenge
NeurIPS Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day
EleutherAI/cookbook
Deep learning for dummies. All the practical details and useful utilities that go into working with real models.
pytorch/torchsnapshot
A performant, memory-efficient checkpointing library for PyTorch applications, designed with large, complex distributed workloads in mind.
pytorch/torchdistx
Torch Distributed Experimental
graphcore-research/unit-scaling
A library for unit scaling in PyTorch
rom1504/gpu-tester
GPU tester that detects broken and slow GPUs in a cluster
graphcore-research/out-of-the-box-fp8-training
Demo of the unit_scaling library, showing how a model can easily be adapted to train in FP8.