zhenghax

NVIDIA, ex-Amazon, ex-AMD
San Jose

Pinned Repositories

apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
Language: Python
babyagi
Language: Python
FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Language: Python
flash-attention
Fast and memory-efficient exact attention
Language: Python
langchain-nvidia
Language: Python
LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
Language: Jupyter Notebook
LWM
Language: Python
Megatron-LM
Ongoing research training transformer models at scale
Language: Python
moondream
tiny vision language model
Language: Python
NeMo
NeMo: a toolkit for conversational AI
Language: Python

zhenghax/apex
A PyTorch Extension: Tools for easy mixed precision and distributed training in PyTorch
Language: Python
zhenghax/babyagi
Language: Python
zhenghax/FastChat
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Language: Python
zhenghax/flash-attention
Fast and memory-efficient exact attention
Language: Python
zhenghax/langchain-nvidia
Language: Python
zhenghax/LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
Language: Jupyter Notebook
zhenghax/LWM
Language: Python
zhenghax/Megatron-LM
Ongoing research training transformer models at scale
Language: Python
zhenghax/moondream
tiny vision language model
Language: Python
zhenghax/NeMo
NeMo: a toolkit for conversational AI
Language: Python
zhenghax/NeMo-Megatron-Launcher
NeMo Megatron launcher and tools
Language: Python
zhenghax/pytorch-lightning
Pretrain, finetune and deploy AI models on multiple GPUs, TPUs with zero code changes.
Language: Python
zhenghax/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper GPUs, to provide better performance with lower memory utilization in both training and inference.
Language: Python
zhenghax/triton
Development repository for the Triton language and compiler