PPGrad

C++ & OpenMPI & OpenMP toy framework for Neural Nets (a la Micrograd), created in approx. 7 days of scattered work as a semester project.

A simple end-of-semester evaluation can be found in PPGrad.pdf. The goal was to teach myself how autograd works (in principle) and to try building a simple NN library on top of it. Along the way I also played with GitHub Actions, Doxygen, Google Test, and a few other little things.

Initial Plan

The goal is to create a distributed (OpenMPI) and parallelized (OpenMP) C++ framework (called "PPGrad") that provides modular building blocks for neural networks (such as a Conv2D layer, a Dense layer, an AdamW optimizer, or a ReLU activation) and allows training to run in a data-distributed fashion.
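
As a rough sketch of what such building blocks could look like (the `Layer` interface and class names below are illustrative assumptions, not PPGrad's actual API):

```cpp
#include <Eigen/Dense>

// A minimal sketch of a modular building block, assuming a shared `Layer`
// interface; these names are illustrative, not PPGrad's actual API.
struct Layer
{
    virtual Eigen::MatrixXd forward(const Eigen::MatrixXd &input) = 0;
    virtual ~Layer() = default;
};

// Dense (fully connected) layer: output = W * input + b (b broadcast per column).
struct Dense : Layer
{
    Eigen::MatrixXd W;
    Eigen::VectorXd b;

    Dense(int in, int out)
        : W(Eigen::MatrixXd::Random(out, in) * 0.1), b(Eigen::VectorXd::Zero(out)) {}

    Eigen::MatrixXd forward(const Eigen::MatrixXd &input) override
    {
        return (W * input).colwise() + b;
    }
};

// ReLU activation: element-wise max(0, x).
struct ReLU : Layer
{
    Eigen::MatrixXd forward(const Eigen::MatrixXd &input) override
    {
        return input.cwiseMax(0.0);
    }
};
```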

We will use Eigen for its optimized matrix multiplications, and OpenMP to process as many samples from each microbatch in parallel as possible.
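
A minimal sketch of that per-sample OpenMP parallelism, assuming a hypothetical `forward_backward` routine that returns one sample's gradient:

```cpp
#include <Eigen/Dense>
#include <vector>

// Hypothetical stand-in for a full forward + backward pass on one sample;
// the dummy body just returns something gradient-shaped for illustration.
Eigen::MatrixXd forward_backward(const Eigen::VectorXd &sample)
{
    return sample * sample.transpose();
}

// Average the per-sample gradients of one microbatch, with OpenMP threads
// splitting the samples among themselves.
Eigen::MatrixXd microbatch_gradient(const std::vector<Eigen::VectorXd> &microbatch)
{
    const int n = static_cast<int>(microbatch.size());
    const int dim = static_cast<int>(microbatch[0].size());
    Eigen::MatrixXd grad = Eigen::MatrixXd::Zero(dim, dim);

#pragma omp parallel
    {
        // Each thread accumulates into its own private copy...
        Eigen::MatrixXd local = Eigen::MatrixXd::Zero(dim, dim);

#pragma omp for
        for (int i = 0; i < n; ++i)
            local += forward_backward(microbatch[i]);

        // ...and the copies are merged one thread at a time.
#pragma omp critical
        grad += local;
    }

    return grad / static_cast<double>(n);
}
```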

OpenMPI will then be used to provide the data-parallel training.

The training loop provided by the framework should look something like:

  1. Distribute the model to all the MPI machines.
  2. Scatter the batch so that each machine holds part of it.
  3. Run the gradient calculations on each machine, using OpenMP to parallelize over the samples in its microbatch.
  4. Synchronize the gradients using Allreduce so that all the machines can update the model.
  5. Repeat.
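
In MPI terms, the skeleton of that loop could look roughly like the following (a sketch only: the model is a flat weight vector, the gradient computation is stubbed out, and the scatter of step 2 is assumed to have happened when each rank loaded its data shard):

```cpp
#include <mpi.h>
#include <cstddef>
#include <vector>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int world = 1;
    MPI_Comm_size(MPI_COMM_WORLD, &world);

    std::vector<double> weights(1024, 0.0); // flattened model parameters
    const int count = static_cast<int>(weights.size());

    for (int step = 0; step < 100; ++step)
    {
        // 1. Distribute the model: every rank starts the step with rank 0's weights.
        MPI_Bcast(weights.data(), count, MPI_DOUBLE, 0, MPI_COMM_WORLD);

        // 2. + 3. Each rank computes gradients on its shard of the batch
        // (per-sample work parallelized with OpenMP); zeros stand in here.
        std::vector<double> grads(weights.size(), 0.0);

        // 4. Sum the gradients across all ranks; every rank receives the result.
        MPI_Allreduce(MPI_IN_PLACE, grads.data(), count,
                      MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

        // 5. Every rank applies the same averaged update, keeping the models in sync.
        for (std::size_t i = 0; i < weights.size(); ++i)
            weights[i] -= 0.01 * (grads[i] / world);
    }

    MPI_Finalize();
    return 0;
}
```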

We'd also like the framework to use operator overloading to provide a simple autograd feature, like PyTorch or TensorFlow (or Karpathy's "famous" Micrograd).
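
To make that concrete, here is a minimal scalar sketch of operator-overloading autograd in the spirit of Micrograd. It is illustrative only: a real implementation would work on tensors and topologically sort the graph instead of recursing.

```cpp
#include <functional>
#include <iostream>
#include <memory>

// One node of the computation graph: a value, its gradient, and a closure
// that knows how to push the gradient back to the node's inputs.
struct Node
{
    double value = 0.0;
    double grad = 0.0;
    std::function<void()> backward = [] {};
};

using Var = std::shared_ptr<Node>;

Var make_var(double v)
{
    Var n = std::make_shared<Node>();
    n->value = v;
    return n;
}

// Overloaded '+': the gradient flows to both inputs unchanged.
Var operator+(const Var &a, const Var &b)
{
    Var out = make_var(a->value + b->value);
    out->backward = [a, b, out] {
        a->grad += out->grad;
        b->grad += out->grad;
        a->backward();
        b->backward();
    };
    return out;
}

// Overloaded '*': the product rule.
Var operator*(const Var &a, const Var &b)
{
    Var out = make_var(a->value * b->value);
    out->backward = [a, b, out] {
        a->grad += b->value * out->grad;
        b->grad += a->value * out->grad;
        a->backward();
        b->backward();
    };
    return out;
}

int main()
{
    Var x = make_var(2.0), y = make_var(3.0);
    Var z = x * y + x;

    z->grad = 1.0; // seed dz/dz = 1
    z->backward();

    std::cout << "dz/dx = " << x->grad << "\n"; // 4 (= y + 1)
    std::cout << "dz/dy = " << y->grad << "\n"; // 2 (= x)
}
```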

A stretch goal is to also support running the matrix multiplications on the GPU using custom CUDA kernels (to learn CUDA development and have some fun with it).