Pinned Repositories
apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
babyagi
FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
flash-attention
Fast and memory-efficient exact attention
langchain-nvidia
LLMTest_NeedleInAHaystack
Doing simple retrieval from LLMs at various context lengths to measure accuracy
LWM
Megatron-LM
Ongoing research training transformer models at scale
moondream
Tiny vision language model
NeMo
NeMo: a toolkit for conversational AI
zhenghax's Repositories
zhenghax/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
zhenghax/babyagi
zhenghax/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
zhenghax/flash-attention
Fast and memory-efficient exact attention
zhenghax/langchain-nvidia
zhenghax/LLMTest_NeedleInAHaystack
Doing simple retrieval from LLMs at various context lengths to measure accuracy
zhenghax/LWM
zhenghax/Megatron-LM
Ongoing research training transformer models at scale
zhenghax/moondream
Tiny vision language model
zhenghax/NeMo
NeMo: a toolkit for conversational AI
zhenghax/NeMo-Megatron-Launcher
NeMo Megatron launcher and tools
zhenghax/pytorch-lightning
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
zhenghax/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.
zhenghax/triton
Development repository for the Triton language and compiler