Pinned Repositories
emdr2
Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering"
logging
MLPerf™ logging library
Megatron-LM
Ongoing research training transformer models at scale
NeMo
NeMo: a toolkit for conversational AI
policies
General policies for MLPerf™ including submission rules, coding standards, etc.
training
Reference implementations of MLPerf™ training benchmarks
training_policies
Issues related to MLPerf™ training policies, including rules and suggested changes
training_results_v3.0
This repository contains the results and code for the MLPerf™ Training v3.0 benchmark.
training_results_v3.1
This repository contains the results and code for the MLPerf™ Training v3.1 benchmark.
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper GPUs, to deliver better performance with lower memory utilization in both training and inference.
ShriyaPalsamudram's Repositories
ShriyaPalsamudram/training
Reference implementations of MLPerf™ training benchmarks
ShriyaPalsamudram/emdr2
Code and Models for the paper "End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering"
ShriyaPalsamudram/logging
MLPerf™ logging library
ShriyaPalsamudram/Megatron-LM
Ongoing research training transformer models at scale
ShriyaPalsamudram/NeMo
NeMo: a toolkit for conversational AI
ShriyaPalsamudram/policies
General policies for MLPerf™ including submission rules, coding standards, etc.
ShriyaPalsamudram/training_policies
Issues related to MLPerf™ training policies, including rules and suggested changes
ShriyaPalsamudram/training_results_v3.0
This repository contains the results and code for the MLPerf™ Training v3.0 benchmark.
ShriyaPalsamudram/training_results_v3.1
This repository contains the results and code for the MLPerf™ Training v3.1 benchmark.
ShriyaPalsamudram/TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper GPUs, to deliver better performance with lower memory utilization in both training and inference.
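As a rough illustration of what the TransformerEngine repository provides, the sketch below runs a forward and backward pass through a te.Linear layer under FP8 autocast. It is a minimal sketch assuming the transformer_engine PyTorch package is installed and an FP8-capable GPU (e.g. Hopper) is available; the layer size and the DelayedScaling recipe settings are illustrative choices, not taken from any repository listed above.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Illustrative size only; FP8 GEMMs expect dimensions divisible by 16.
hidden_size = 768
layer = te.Linear(hidden_size, hidden_size, bias=True).cuda()
inp = torch.randn(16, hidden_size, device="cuda")

# FP8 scaling recipe (assumed settings; library defaults also work).
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.HYBRID)

# The forward pass runs FP8 GEMMs when the hardware supports them.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    out = layer(inp)

# Backward also uses FP8 where applicable.
out.sum().backward()
```

In practice te.Linear is used as a drop-in replacement for torch.nn.Linear, so existing training loops only need the fp8_autocast context added around the forward pass.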