/gol

High-performance Computing (90535) final project at UniGe

Primary language: C. License: Apache-2.0.

Game of Life

Comparative analysis of parallel implementations of Conway's Game of Life (GoL) using a GPU-based toolkit (CUDA) and CPU-based toolkits (OpenMP and MPI) on INFN's Ocapie HPC cluster.

Authors: F. Minutoli, M. Ghirardelli, and D. Surpanu.

Useful information

Standard

  • The threshold separating a small GoL grid from a big one is set to 50x50 because of terminal visualization constraints: a larger grid cannot be displayed properly in a terminal, so its evolution could not be followed in its entirety.

  • Any custom C preprocessor guard that forces a specific behaviour in the code is marked with a GoL_ prefix.
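The two conventions above can be illustrated with a minimal sketch. It is not taken from the repository: SMALL_GRID_LIMIT, is_small_grid, report_grid and the guard name GoL_VERBOSE are all hypothetical, and treating a grid as small only when neither dimension exceeds 50 is an assumption about how the 50x50 threshold is applied.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical name; the README only states the 50x50 limit. */
#define SMALL_GRID_LIMIT 50

/* Assumption: a grid is "small" when neither dimension exceeds the limit. */
static bool is_small_grid(int nrows, int ncols) {
    return nrows <= SMALL_GRID_LIMIT && ncols <= SMALL_GRID_LIMIT;
}

/* GoL_VERBOSE is a hypothetical guard that follows the GoL_ prefix
   convention; compile with -DGoL_VERBOSE to enable the message. */
static void report_grid(int nrows, int ncols) {
#ifdef GoL_VERBOSE
    printf("grid %dx%d is %s\n", nrows, ncols,
           is_small_grid(nrows, ncols) ? "small" : "big");
#else
    (void)nrows;  /* nothing to report when the guard is off */
    (void)ncols;
#endif
}
```

Under these assumptions, report_grid(4, 4) prints a message only in a build compiled with -DGoL_VERBOSE, so the default build stays silent.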

Display file format

Both the input and output files use the full-matrix (FM) format, that is:

  • A single header row holding the number of rows and columns in the grid, as two space-separated numbers.

  • One line for each row in the grid holding all of its values, with an X character for each ALIVE cell and a space (or any other non-X character) for each DEAD cell.

Sample input files can be found in the example folder; one is also given here:

4 4
X  X
    
  XX
 X X

Please note: if the (0, 0) cell is DEAD, so that the first row starts with a space, replace that character with any non-X character of your choice (e.g., A) before reading the GoL matrix from file. This works around an issue encountered with the getline() function in C, whereby leading whitespace is skipped.
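As an illustration of the FM format, here is a minimal reader sketch in C. It is not the repository's parser: the function name fm_read and the 1/0 byte encoding are assumptions made for this example. It reads the header with fgets() and sscanf() rather than a scanf-style read on the stream, so the leading spaces of the first row are never consumed. A whitespace directive in a scanf format matches any amount of following whitespace, which is one possible cause of the skipped-whitespace issue, though that is a guess and not a diagnosis of the actual code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads an FM grid from fp. Returns a malloc'd array of nrows*ncols
   bytes (1 = ALIVE, 0 = DEAD), or NULL on error. Hypothetical helper. */
static unsigned char *fm_read(FILE *fp, int *nrows, int *ncols) {
    char header[64];
    if (!fgets(header, sizeof header, fp)) return NULL;
    if (sscanf(header, "%d %d", nrows, ncols) != 2) return NULL;
    if (*nrows <= 0 || *ncols <= 0) return NULL;

    size_t cap = (size_t)*ncols + 3;  /* cells + "\r\n" + NUL */
    char *line = malloc(cap);
    unsigned char *grid = calloc((size_t)*nrows * (size_t)*ncols, 1);
    if (!line || !grid) { free(line); free(grid); return NULL; }

    for (int r = 0; r < *nrows; r++) {
        if (!fgets(line, (int)cap, fp)) break;  /* missing rows stay DEAD */
        size_t len = strcspn(line, "\r\n");     /* cells before the newline */
        for (size_t c = 0; c < len && c < (size_t)*ncols; c++)
            grid[(size_t)r * (size_t)*ncols + c] = (line[c] == 'X');
        if (line[len] == '\0') {                /* overlong line: drop the rest */
            int ch;
            while ((ch = fgetc(fp)) != '\n' && ch != EOF) { }
        }
    }
    free(line);
    return grid;
}
```

A caller would open the file with fopen(), pass the stream to fm_read(), and free() the returned grid when done. Rows shorter than the declared width are padded with DEAD cells, which tolerates editors that trim trailing spaces.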

Folder structure

This repository contains the source code for both a GPU-based implementation of Conway's Game of Life, in the src\gpu folder, and a CPU-based one, in the src\cpu folder. The include folder contains header files shared by both implementations, i.e., the base structs life_t and chunk_t, with a few specific C guards wherever their functionality has to differ.

The bin folder contains the binaries that the make command generates for both implementations. Tags in each binary's name describe how it was compiled and, hence, its scope:

  • vec: binaries optimized with vectorization at compile time;
  • omp: binaries in which OpenMP support has been enabled;
  • mpi: binaries in which MPI support has been enabled, which should therefore be launched with the standard MPI launchers, i.e., mpirun or mpiexec;
  • hybrid: binaries in which hybrid MPI+OpenMP support has been enabled;
  • cuda: binaries that should be run on a GPU-capable machine.

Finally, the experiment folder contains all the experiments that both implementations were run through.

Although the repo contains both CPU and GPU code, running the whole codebase requires a GPU-capable machine with OpenMP and MPI support. Otherwise, separate machines providing either CPU or GPU capabilities should be employed to test the two implementations separately.

Sample usage

Run any binary with the -h flag to learn its expected usage.