On your Raspberry Pi:
- Install poetry package manager
- Run
poetry install
in the project root, to install dependencies - Using
poetry shell
will activate the venv - Run script using
python SCRIPT
WARNING: Some dependencies might need shared libraries such as libssl
This has only been tested on linux
We tried multiple versions of a pytorch geometric GCN impementation using the example reddit dataset. In following we will describe those versions in a short manner.
The default reddit dataset GCN implementation
Using DistributedDataParallel from torch.nn.parallel this version is an implementation of data parallel model training. It is suitable for throughput/performance gain on multiple machines and less suitable for low spec hardware with little memory, since the whole model has to be hold by every node
Using OpenMPI and Horovod this version distributes the model using horovod, which is a bit more memory efficient. Since the NeighbourSampler have to be hold by every node, this solution was still not efficient enough, because of the big sampler size in memory.
Ray is an actor framework built on Redis so nodes can communicate via message passing. We were not able to get backwards propagation running, because that has to be deeply integrated into pytorch. While calculating the forward propagation, the tensor graph has to be built and remembered. This is not trivial using actors and therefore was no solution suitable for us.
The solution we chose at the end, as it gave us the best results regarding memory efficiency. Every layer and NeigboughrSampling could be run on different machines and therefore the pis did not run out of memory. Only problem is, that this solution requires the Gloo framework for communication, which is only compatible with 64bit systems. Fir that we installed 64bit ubuntu on our pi cluster and got it running.