This project demonstrates parallel matrix multiplication using the MPI (Message Passing Interface) framework and applies a convolution filter to a grayscale image with OpenMP. The MPI part distributes the rows of matrix A among the processes, has each process multiply its rows by matrix B, and gathers the partial results into matrix C.
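The communication pattern looks roughly like the following minimal sketch. This is illustrative only, not the project's exact code: the fixed 4×3 and 3×2 sizes, the variable names, and the use of `MPI_Bcast`/`MPI_Scatter`/`MPI_Gather` are assumptions, and it presumes the number of rows divides evenly by the process count.

```c
/* Minimal sketch of the row-distribution pattern (illustrative, not the
 * project's exact code). Assumes M is divisible by the number of processes. */
#include <mpi.h>
#include <stdio.h>

#define M 4   /* rows of A (and C) */
#define K 3   /* columns of A, rows of B */
#define N 2   /* columns of B (and C) */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double A[M][K], B[K][N], C[M][N];
    if (rank == 0) {                      /* root initialises the inputs */
        for (int i = 0; i < M; i++)
            for (int j = 0; j < K; j++) A[i][j] = i + j;
        for (int i = 0; i < K; i++)
            for (int j = 0; j < N; j++) B[i][j] = i * N + j + 1;
    }

    int rows = M / size;                  /* rows handled by each process */
    double localA[M][K], localC[M][N];

    /* Every process needs all of B; each receives its own block of A's rows. */
    MPI_Bcast(B, K * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Scatter(A, rows * K, MPI_DOUBLE, localA, rows * K, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    /* Each process multiplies its rows of A by B. */
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < N; j++) {
            localC[i][j] = 0.0;
            for (int k = 0; k < K; k++) localC[i][j] += localA[i][k] * B[k][j];
        }

    /* Root collects the partial results back into C. */
    MPI_Gather(localC, rows * N, MPI_DOUBLE, C, rows * N, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == 0)
        for (int i = 0; i < M; i++) {
            for (int j = 0; j < N; j++) printf("%6.1f ", C[i][j]);
            printf("\n");
        }

    MPI_Finalize();
    return 0;
}
```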
To run this project locally, ensure that the following tools are installed on your machine:
- MPI Library: This project uses MPI for parallel processing. You can use OpenMPI or MPICH.
- C Compiler: To compile the C code, you will need `mpicc` (MPI's C compiler).
MacOS (using Homebrew):
brew install open-mpi llvm libomp
Ubuntu:
sudo apt update
sudo apt install openmpi-bin openmpi-common libopenmpi-dev libomp-dev
Windows:
For Windows, you can install OpenMPI through MSYS2 or use WSL (Windows Subsystem for Linux) to get a Linux environment.
- `src/openmp-optimise`: Main source code for applying a convolution filter to a grayscale image with OpenMP.
- `src/openmpi-optimise`: Main source code implementing matrix multiplication with MPI and saving the result to a file.
- `src/openmpi-uneven-optimise`: Example scenario where the number of rows in A is not evenly divisible by the number of processes (see the sketch after this list).
- `output/test.txt`: The file containing the resultant matrix C.
- `output/uneven-test.txt`: The file containing the resultant matrix C for the uneven-rows edge case.
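When the rows of A do not divide evenly among the processes, a common approach is `MPI_Scatterv` with per-process counts and displacements. The sketch below shows how those counts could be computed; it is an illustrative assumption, not the repository's exact code.

```c
/* Sketch of distributing rows when M is NOT divisible by the process count,
 * using MPI_Scatterv (illustrative, not the project's exact code). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define M 5   /* rows of A: 5 does not split evenly over e.g. 2 or 3 processes */
#define K 3   /* columns of A */

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double A[M][K];
    if (rank == 0)
        for (int i = 0; i < M; i++)
            for (int j = 0; j < K; j++) A[i][j] = i * K + j;

    /* The first (M % size) processes take one extra row. */
    int *counts = malloc(size * sizeof(int));
    int *displs = malloc(size * sizeof(int));
    int offset = 0;
    for (int p = 0; p < size; p++) {
        int rows = M / size + (p < M % size ? 1 : 0);
        counts[p] = rows * K;          /* counts are in elements, not rows */
        displs[p] = offset;
        offset += rows * K;
    }

    double localA[M][K];               /* big enough for any process's share */
    MPI_Scatterv(A, counts, displs, MPI_DOUBLE,
                 localA, counts[rank], MPI_DOUBLE, 0, MPI_COMM_WORLD);

    printf("rank %d received %d rows\n", rank, counts[rank] / K);

    free(counts);
    free(displs);
    MPI_Finalize();
    return 0;
}
```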
First, clone this repository to your local machine:
git clone https://github.com/restartdk/computational-fluid-dynamics-hpi-openmp.git
cd computational-fluid-dynamics-hpi-openmp/
Use `mpicc` to compile the C program:
mpicc -o matrix_multiply_file matrix_multiply_file.c
This will produce an executable named `matrix_multiply_file`.
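The OpenMP convolution program is built separately with an OpenMP-enabled compiler. A typical invocation might look like the following; the source file name `openmp-optimise.c` is assumed from the directory naming and may need adjusting, and on macOS you may need Homebrew's LLVM clang for `-fopenmp` support.
gcc -fopenmp -o openmp_optimise openmp-optimise.c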
Run the program using `mpirun` or `mpiexec` with a specified number of processes:
mpirun -np <NUMBER_OF_PROCESSES> ./matrix_multiply_file
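For example, to run with 4 processes:
mpirun -np 4 ./matrix_multiply_file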
The resultant matrix C will be written to `test.txt`. You can view it using any text editor:
cat test.txt
14 32
32 77
50 122
This output shows the result of multiplying matrix A by matrix B, saved to `test.txt`.
The matrix sizes are defined by the constants `M`, `K`, and `N` in the source code: A is M×K, B is K×N, and the resulting C is M×N. You can modify these values as needed in the `openmpi-optimise.c` file:
#define M 5 // Number of rows in A
#define K 3 // Number of columns in A and rows in B
#define N 2 // Number of columns in B
I tested the convolution operation with different kernel sizes and OpenMP scheduling strategies to optimise performance, measuring the execution time of each combination to compare their efficiency.
The results showed that the guided scheduling strategy with a chunk size of 3 was the most efficient for both kernel sizes 3 and 7. With a kernel size of 3 it achieved a convolution time of 0.088422 seconds, faster than every other scheduling strategy tested. With a kernel size of 7 it took 0.563970 seconds for the normal image, 0.452003 seconds for the serial image, and 0.439804 seconds for the blurred image, again the fastest of the strategies tested.
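The scheduling strategy is selected with OpenMP's `schedule` clause on the convolution loop. Below is a minimal sketch of the pattern that was benchmarked; the image size, kernel contents, and variable names are assumptions for illustration, not the project's exact code.

```c
/* Sketch of the benchmarked pattern: a 2D convolution parallelised with
 * OpenMP using a guided schedule with chunk size 3 (illustrative only). */
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define W 1024          /* image width  (assumed for the sketch) */
#define H 1024          /* image height (assumed for the sketch) */
#define KSIZE 3         /* kernel size: 3 or 7 in the experiments */

int main(void) {
    float *in  = malloc((size_t)W * H * sizeof(float));
    float *out = malloc((size_t)W * H * sizeof(float));
    float kernel[KSIZE][KSIZE];

    for (int i = 0; i < W * H; i++) in[i] = (float)(i % 256);       /* fake grayscale image */
    for (int i = 0; i < KSIZE; i++)
        for (int j = 0; j < KSIZE; j++)
            kernel[i][j] = 1.0f / (KSIZE * KSIZE);                  /* simple box-blur kernel */

    double start = omp_get_wtime();

    /* Rows are handed out in shrinking chunks of at least 3 (guided schedule). */
    #pragma omp parallel for schedule(guided, 3)
    for (int y = 0; y < H; y++) {
        for (int x = 0; x < W; x++) {
            float acc = 0.0f;
            for (int ky = 0; ky < KSIZE; ky++)
                for (int kx = 0; kx < KSIZE; kx++) {
                    int iy = y + ky - KSIZE / 2;
                    int ix = x + kx - KSIZE / 2;
                    if (iy < 0) iy = 0; else if (iy >= H) iy = H - 1;   /* clamp at borders */
                    if (ix < 0) ix = 0; else if (ix >= W) ix = W - 1;
                    acc += in[iy * W + ix] * kernel[ky][kx];
                }
            out[y * W + x] = acc;
        }
    }

    printf("convolution time: %f s\n", omp_get_wtime() - start);

    free(in);
    free(out);
    return 0;
}
```

Compile with `-fopenmp`; switching the clause to `schedule(static)` or `schedule(dynamic, chunk)` reproduces the other strategies compared above.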
I encountered some challenges, such as installing OpenMP on my computer and configuring the development environment to support it. However, I was able to overcome these challenges by following the appropriate installation instructions and properly configuring the compiler and development environment.