megatron-lm
There are 12 repositories under the megatron-lm topic.
alibaba/Megatron-LLaMA
Best practice for training LLaMA models in Megatron-LM
openpsi-project/ReaLHF
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
shreyansh26/Annotated-ML-Papers
Annotations of the interesting ML papers I read
xrsrke/pipegoose
Large-scale 4D parallelism pre-training for 🤗 transformers with Mixture of Experts *(work in progress)*
feifeibear/Odysseus-Transformer
Odysseus: Playground of LLM Sequence Parallelism
MoFHeka/LLaMA-Megatron
A LLaMA1/LLaMA2 Megatron implementation.
GoogleCloudPlatform/nvidia-nemo-on-gke
Training NVIDIA NeMo Megatron Large Language Model (LLM) using NeMo Framework on Google Kubernetes Engine
janelu9/EasyLLM
Running Large Language Models easily.
Beomi/megatronlm_dataset_autotokenizer
Megatron-LM/GPT-NeoX compatible Text Encoder with 🤗Transformers AutoTokenizer.
SulRash/minLLMTrain
Minimal yet high-performance code for pretraining LLMs. Attempts to implement some SOTA features. Supports training via DeepSpeed, Megatron-LM, and FSDP. WIP
GJ98/Megatron-LM
Megatron-LM implemented in PyTorch
0-1CxH/megatron-wrap
Wrapped Megatron: As User-Friendly as HuggingFace, As Powerful as Megatron-LM