megatron-lm
There are 12 repositories under the megatron-lm topic.
alibaba/Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
openpsi-project/ReaLHF
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
shreyansh26/Annotated-ML-Papers
Annotations of the interesting ML papers I read
xrsrke/pipegoose
Large-scale 4D parallelism pre-training for 🤗 transformers with Mixture of Experts *(work in progress)*
feifeibear/Odysseus-Transformer
Odysseus: Playground of LLM Sequence Parallelism
MoFHeka/LLaMA-Megatron
A LLaMA1/LLaMA2 Megatron implementation.
GoogleCloudPlatform/nvidia-nemo-on-gke
Training NVIDIA NeMo Megatron Large Language Model (LLM) using NeMo Framework on Google Kubernetes Engine
janelu9/EasyLLM
Running Large Language Models easily.
Beomi/megatronlm_dataset_autotokenizer
Megatron-LM/GPT-NeoX compatible Text Encoder with 🤗Transformers AutoTokenizer.
SulRash/minLLMTrain
Minimal yet high-performance code for pretraining LLMs. Attempts to implement some SOTA features. Supports training via DeepSpeed, Megatron-LM, and FSDP. WIP
GJ98/Megatron-LM
Megatron-LM implemented in PyTorch
0-1CxH/megatron-wrap
Wrapped Megatron: As User-Friendly as HuggingFace, As Powerful as Megatron-LM