Pinned Repositories
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
nanotron
Minimalistic large language model 3D-parallelism training
ambivalent
Minimal, beautiful (+ highly-customizable) styles for Matplotlib.
awesome-stars
A curated list of my GitHub stars!
chunkwm
Tiling window manager for macOS based on a plugin architecture
ezpz
Train across all your devices, ezpz 🍋
l2hmc-qcd
Application of the L2HMC algorithm to simulations in lattice QCD.
parallel-training-slides
Modern parallelism techniques for training LLMs
personal_site
My personal website
saforem2's Repositories
saforem2/ezpz
Train across all your devices, ezpz 🍋
saforem2/ambivalent
Minimal, beautiful (+ highly-customizable) styles for Matplotlib.
saforem2/awesome-stars
A curated list of my GitHub stars!
saforem2/personal_site
My personal website
saforem2/mmm
Multi-Modal Modeling
saforem2/Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
saforem2/saforem2
Profile README
saforem2/wordplay
Playing with words
saforem2/github-stats
GitHub Stats
saforem2/mccl
Collective communications using mpi4py
saforem2/pbs-tui
TUI for PBS Pro Scheduler
saforem2/user-guides
ALCF Systems User Documentation
saforem2/ArcticTraining
ArcticTraining is a framework designed to simplify and accelerate the post-training process for large language models (LLMs)
saforem2/blendcorpus
BlendCorpus is a modular and scalable data preprocessing and loading framework for large language model (LLM) training.
saforem2/cfg
Dotfiles from `$HOME/.cfg`
saforem2/checkpoint_restart
This repo provides instructions on how to do checkpoint/restart for large-scale simulations on exascale machines
saforem2/diffusion-trainer
saforem2/ezpz-ai
PyPI alias for https://github.com/saforem2/ezpz
saforem2/intro-hpc-bootcamp-2025
Intro to HPC Bootcamp 2025
saforem2/kitty-config
Configuration for Kitty
saforem2/m
monorepo
saforem2/mynorfolk-dash
MyNorfolk Dashboard: A Quarto Dashboards Demo
saforem2/nanosft
Minimal implementation of SFT with gpt2-124M
saforem2/oumi
Easily fine-tune, evaluate and deploy gpt-oss, Qwen3, DeepSeek-R1, or any open source LLM / VLM!
saforem2/quarto-codespaces
Quarto codespaces
saforem2/skills-communicate-using-markdown
My clone repository
saforem2/starter
Starter template for LazyVim
saforem2/swift
An Autoregressive Consistency Model for Efficient Weather Forecasting
saforem2/torchtitan
A native PyTorch Library for large model training
saforem2/verl
verl: Volcano Engine Reinforcement Learning for LLMs