Pinned Repositories
horovod
Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.
cuDecomp
An Adaptive Pencil Decomposition Library for NVIDIA GPUs
TorchFort
An Online Deep Learning Interface for HPC programs on NVIDIA GPUs
apex
A PyTorch Extension
aws-ofi-nccl
This is a plugin which lets EC2 developers use libfabric as a network provider while running NCCL applications.
benchy
Prototype benchmarking dataloader for DL
cineca-openacc-tutorial
CUDA-Fortran-Tutorial
PGInsider
Source files for PGInsider blog post on Cholesky factorization and reduction of generalized eigenproblem
qe-gpu-benchmarks
Benchmark repository for qe-gpu (https://github.com/fspiga/qe-gpu)
romerojosh's Repositories
romerojosh/benchy
Prototype benchmarking dataloader for DL
romerojosh/cineca-openacc-tutorial
romerojosh/CUDA-Fortran-Tutorial
romerojosh/PGInsider
Source files for PGInsider blog post on Cholesky factorization and reduction of generalized eigenproblem
romerojosh/qe-gpu-benchmarks
Benchmark repository for qe-gpu (https://github.com/fspiga/qe-gpu)
romerojosh/apex
A PyTorch Extension
romerojosh/aws-ofi-nccl
This is a plugin which lets EC2 developers use libfabric as a network provider while running NCCL applications.
romerojosh/DALI
A library containing both highly optimized building blocks and an execution engine for data pre-processing in deep learning applications
romerojosh/cupy
NumPy & SciPy for GPU
romerojosh/GiMMiK
romerojosh/horovod
Distributed training framework for TensorFlow, Keras, and PyTorch.
romerojosh/models
Models and examples built with TensorFlow
romerojosh/nccl
Optimized primitives for collective multi-GPU communication
romerojosh/spack
A flexible package manager that supports multiple versions, configurations, platforms, and compilers.