Mini Raspberry Pi cluster with the Slurm workload manager
Base compute node with Raspberry Pi 4
2 x Raspberry Pi 4 8GB(RAM)
4 x Raspberry Pi 4 4GB(RAM)
1 x NVIDIA Jetson Nano 4GB(RAM) (Optional)
1 x SSD 120 GB
1 x Gigabit Ethernet Switch
1 x Router
                                 |--(CAT6)->[Control-Service] RPI4 - 8GB(RAM)
             ________________    |             |-(USB)->[NFS-SSD] - 120GB
            | Gigabit Switch |-->|--(CAT6)->[Compute-Node1] RPI4 - 8GB(RAM)
             ------^---------    |--(CAT6)->[Compute-Node2] RPI4 - 4GB(RAM)
             ______|____         |--(CAT6)->[Compute-Node3] RPI4 - 4GB(RAM)
            | Router MT |        |--(CAT6)->[Compute-Node4] RPI4 - 4GB(RAM)
Internet -->| DHCP & GW |        |--(CAT6)->[Compute-Node5] RPI4 - 4GB(RAM)
             -----------         |--(CAT6)->(Option)[GPU-Node1] NVD nano - 4GB(RAM)
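Slurm and NFS are easier to configure when every node in the diagram resolves by name. The sketch below prints `/etc/hosts`-style entries for the nodes shown above. The subnet `192.168.1`, the starting host number `10`, and the lowercase hostnames are assumptions for illustration only; it also assumes the router hands out fixed DHCP reservations so the addresses stay stable.

```shell
# Print one "address<TAB>hostname" line per node.
# The subnet, first host number and hostnames are hypothetical values,
# not taken from the real cluster.
hosts_entries() {
  subnet="$1"      # first three octets, e.g. 192.168.1
  first_host="$2"  # host number given to the first node
  shift 2
  n="$first_host"
  for name in "$@"; do
    printf '%s.%s\t%s\n' "$subnet" "$n" "$name"
    n=$((n + 1))
  done
}

hosts_entries 192.168.1 10 control-service compute-node1 compute-node2 \
  compute-node3 compute-node4 compute-node5 gpu-node1
```

The output can be reviewed and then appended to `/etc/hosts` on each node by hand.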
Still under test: I am currently trying to compile DPDK and SoftRDMA to replace TCP.
DPDK
SoftRDMA
MUNGE
SLURM
Lmod
Spack
EasyBuild
MPI (MPICH, Open MPI)
ATLAS
OpenBLAS
UCX
HPL
sudo apt install build-essential libssl-dev ntp
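Before building from source it can help to confirm that the toolchain from the packages above is actually on `PATH`. The helper below is a minimal sketch; the name `missing_tools` is made up for this example, and the `xz` check is an assumption based on the `.tar.xz` archive used later (it is not part of the package list above).

```shell
# Print each command name that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# gcc and make come from build-essential; xz is needed to unpack .tar.xz.
missing="$(missing_tools gcc make xz)"
if [ -n "$missing" ]; then
  echo "missing build tools: $missing"
else
  echo "build tools present"
fi
```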
- MUNGE is installed on every node in the cluster (both service and compute).
- Download MUNGE from source
wget https://github.com/dun/munge/releases/download/munge-0.5.14/munge-0.5.14.tar.xz
- Install MUNGE
tar -xvJf munge-0.5.14.tar.xz \
&& cd munge-0.5.14 \
&& ./configure \
--prefix=/usr \
--sysconfdir=/etc \
--localstatedir=/var \
--runstatedir=/run \
&& make \
&& sudo make install
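Since the build has to be repeated on every node, a quick check that the programs landed under the configured prefix can save time. The sketch below assumes the `--prefix=/usr` used above; the function name `check_munge_install` is hypothetical.

```shell
# Report whether the main MUNGE programs exist under a given prefix.
# Returns non-zero if any of them is missing.
check_munge_install() {
  prefix="${1:-/usr}"
  rc=0
  for path in bin/munge bin/unmunge sbin/munged; do
    if [ -x "$prefix/$path" ]; then
      echo "ok      $prefix/$path"
    else
      echo "missing $prefix/$path"
      rc=1
    fi
  done
  return "$rc"
}

check_munge_install /usr || echo "MUNGE is not fully installed under /usr"
```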
- Create the munge user and group
sudo adduser munge --uid 1001 --system --no-create-home --group
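The command above pins the UID to 1001, presumably so the account is identical on every node. A minimal sketch for confirming that on each node is shown below; `account_ids` is a hypothetical helper name, and it only reads account data.

```shell
# Print "name uid gid" for an account, or return non-zero if it does not exist.
account_ids() {
  id -u "$1" >/dev/null 2>&1 || return 1
  echo "$1 $(id -u "$1") $(id -g "$1")"
}

account_ids munge || echo "munge account not found"
```

Running it on each node and comparing the printed numbers shows whether the UID and GID match across the cluster.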