SeunghyunSEO

deep learning researcher

real.seunghyun.seo@navercorp.com

South Korea

SeunghyunSEO's Stars

microsoft/generative-ai-for-beginners
21 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
Language: Jupyter Notebook 65.2k 560 12933.3k
meta-llama/llama3
The official Meta Llama 3 GitHub site
Language: Python 27.2k 226 2643.1k
unslothai/unsloth
Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
Language: Python 18.4k 129 1k 1.3k
openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
Language: Python 12.4k 168 240857
plasma-umass/scalene
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
Language: Python 12.2k 91 476399
facebookresearch/DiT
Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"
Language: Python 6.4k 44 81570
mosaicml/llm-foundry
LLM training code for Databricks foundation models
Language: Python 4.1k 48 383530
openai/transformer-debugger
Language: Python 4k 25 14238
pytorch/torchtitan
A native PyTorch Library for large model training
Language: Python 2.6k 42 178205
AnswerDotAI/fsdp_qlora
Training LLMs with QLoRA + FSDP
Language: Jupyter Notebook 1.4k 23 38188
microsoft/mup
maximal update parametrization (µP)
Language: Jupyter Notebook 1.4k 29 6295
databricks/megablocks
Language: Python 1.2k 16 61175
cuda-mode/resource-stream
CUDA related news and material links
1.1k 37 267
lilacai/lilac
Curate better data for LLMs
Language: Python 946 13 29389
mistralai/megablocks-public
Language: Python 860 9 055
jzhang38/EasyContext
Memory optimization and training recipes to extrapolate language models' context length to 1 million tokens, with minimal hardware.
Language: Python 647 8 4646
alibaba/Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
Language: Python 628 6 6351
zhuzilin/ring-flash-attention
Ring attention implementation with flash attention
Language: Python 588 10 3546
cloneofsimo/minRF
Minimal implementation of scalable rectified flow transformers, based on SD3's approach
Language: Jupyter Notebook 445 6 934
lucidrains/triton-transformer
Implementation of a Transformer, but completely in Triton
Language: Python 248 15 915
foundation-model-stack/fms-fsdp
🚀 Efficiently (pre)training foundation models with native PyTorch features, including FSDP for training and SDPA implementation of Flash attention v2.
Language: Python 193 12 3432
cloneofsimo/d3pm
Minimal Implementation of a D3PM in pytorch
Language: Jupyter Notebook 183 4 815
cloneofsimo/min-max-gpt
Minimal (400 LOC) implementation Maximum (multi-node, FSDP) GPT training
Language: Python 113 4 25
mgmalek/efficient_cross_entropy
Language: Python 77 3 47
qtli/GSM-Plus
GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.
Language: Python 46 1 34
cloneofsimo/auto_llm_codebase_analysis
Language: Python 26 2 01
cloneofsimo/project_RF
Language: Python 24 2 01
likenneth/persona_drift
Measuring and Controlling Persona Drift in Language Model Dialogs
Language: Python 12 2 23
cloneofsimo/reverse_eng_deepspeed_study
DeepSpeed Study, focused on reverse engineering and enhancing documentation
Language: Python 5 2 01
mgmalek/ring-attention
Language: Python 4 3 00
