This project demonstrates the use of SIMD (Single Instruction, Multiple Data) instructions to optimize convolution calculations between two matrices of floating-point numbers. It contains three main C programs, `init.c`, `simd.c`, and `simd_t.c`, each showcasing a different method of computation with a different level of optimization.
- `init.c` (Baseline Convolution Calculation):
  - Implements a basic non-SIMD convolution operation between two 400x198 matrices stored in a file (`data.txt`).
  - The result is stored in a 1D array `ans[200]`.
  - The program measures and prints the elapsed computation time.
  - Outputs the result to `output.txt`.
- `simd.c` (SIMD Convolution without Timing):
  - Uses SIMD instructions (`_mm_mul_ps` and `_mm_add_ps`) to perform vectorized convolution calculations.
  - The arrays are declared with `__attribute__((aligned(16)))` to ensure proper memory alignment for SIMD operations.
  - Each SIMD operation processes 4 floating-point values simultaneously, which speeds up the computation.
  - This version does not include time measurement.
  - Outputs the result to `output_simd_without_time.txt`.
- `simd_t.c` (SIMD Convolution with Timing):
  - Similar to `simd.c`, but includes time measurement using `clock_gettime()`.
  - Measures the time taken to perform the convolution with SIMD instructions.
  - Outputs the computation time and the result to `output_simd.txt`.
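The vectorized multiply-accumulate described for `simd.c` can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name `dot_sse`, the calls to `_mm_load_ps`, `_mm_setzero_ps`, and `_mm_store_ps`, and the scalar tail loop are assumptions. Only `_mm_mul_ps`, `_mm_add_ps`, and the `aligned(16)` attribute come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Hypothetical helper: sum of element-wise products of a[0..n) and b[0..n).
 * ASSUMPTION: a and b both start on a 16-byte boundary. */
float dot_sse(const float *a, const float *b, size_t n)
{
    assert(((uintptr_t)a % 16) == 0 && ((uintptr_t)b % 16) == 0);

    __m128 acc = _mm_setzero_ps();          /* four running partial sums */
    size_t vec_end = n - (n % 4);           /* largest multiple of 4 <= n */

    for (size_t i = 0; i < vec_end; i += 4) {
        __m128 va = _mm_load_ps(a + i);     /* aligned load of 4 floats */
        __m128 vb = _mm_load_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }

    /* Spill the four lanes and add them together. */
    float lanes[4] __attribute__((aligned(16)));
    _mm_store_ps(lanes, acc);
    float sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];

    /* Scalar tail for the elements that do not fill a whole vector. */
    for (size_t i = vec_end; i < n; i++)
        sum += a[i] * b[i];

    return sum;
}
```

Note that 198 is not a multiple of 4, so a row of 198 floats leaves two elements for the scalar tail. A row also occupies 792 bytes, which is not a multiple of 16, so in a tightly packed `float[400][198]` array not every row starts on a 16-byte boundary. Aligned loads per row would need padded rows (for example 200 floats) or the unaligned `_mm_loadu_ps` instead.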
- `data.txt`: Input file containing the floating-point numbers to be processed. The file is expected to have 400 rows and 198 columns of float data.
- `output.txt`: Output file from `init.c` containing the convolution results without SIMD optimization.
- `output_simd_without_time.txt`: Output file from `simd.c` containing the results of the SIMD-optimized convolution without time measurement.
- `output_simd.txt`: Output file from `simd_t.c` containing the SIMD-optimized convolution results and the elapsed time for the computation.
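Reading an input file of this shape could look like the sketch below. It assumes the values in `data.txt` are separated by whitespace, which the description does not state. The names `load_matrix`, `ROWS`, and `COLS` are hypothetical and not taken from the project.

```c
#include <stdio.h>

#define ROWS 400   /* expected number of rows in the input file */
#define COLS 198   /* expected number of columns in the input file */

/* Hypothetical loader: fills m with ROWS x COLS floats read from path.
 * ASSUMPTION: values are whitespace-separated.
 * Returns 0 on success, -1 if the file cannot be opened or is too short. */
int load_matrix(const char *path, float m[ROWS][COLS])
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
        return -1;

    for (int r = 0; r < ROWS; r++) {
        for (int c = 0; c < COLS; c++) {
            if (fscanf(fp, "%f", &m[r][c]) != 1) {
                fclose(fp);
                return -1;   /* malformed or truncated input */
            }
        }
    }

    fclose(fp);
    return 0;
}
```

Checking the return value of `fscanf` for every element makes a file with fewer than 400x198 values fail loudly instead of leaving part of the matrix uninitialized.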
- Compiling: You can compile the programs using the provided `Makefile`. It generates three executable files: `init.exe`, `simd.exe`, and `simd_t.exe`.
  - `make all`
- Running the programs: After compilation, run each program using the following commands:
  - `./init.exe`: runs the basic (non-SIMD) convolution
  - `./simd.exe`: runs the SIMD-optimized convolution (without timing)
  - `./simd_t.exe`: runs the SIMD-optimized convolution with timing
- Cleaning: To remove the compiled executables, run:
  - `make clean`
- `init.c`: Baseline implementation using standard loops for convolution.
- `simd.c` and `simd_t.c`: Use SIMD instructions for improved performance, with `simd_t.c` additionally measuring the elapsed time.
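For comparison, a baseline built from standard loops might look like the sketch below. The exact indexing used by `init.c` is not given here, so this rests on a loudly stated assumption: each output value is the sum of the element-wise products of one row from each input matrix. The function name `rowwise_products` and its parameters are hypothetical.

```c
#include <stddef.h>

/* Hypothetical scalar baseline.
 * ASSUMPTION: ans[r] is the sum over c of a[r][c] * b[r][c]; the real
 * program may combine the matrices differently.
 * a and b are row-major arrays holding rows * cols floats each. */
void rowwise_products(const float *a, const float *b, float *ans,
                      size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; r++) {
        float sum = 0.0f;
        for (size_t c = 0; c < cols; c++)
            sum += a[r * cols + c] * b[r * cols + c];  /* one multiply-add per element */
        ans[r] = sum;
    }
}
```

The inner loop performs one multiplication and one addition per element. This is the work that a SIMD version performs four elements at a time.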
- GCC: Ensure you have GCC installed with support for SSE (SIMD) instructions.
- C Standard Library: The programs rely on standard headers (`stdio.h`, `stdlib.h`, `string.h`, `time.h`) and a SIMD-specific header (`xmmintrin.h`).
- The `data.txt` file must contain floating-point numbers laid out to fit the 400x198 matrix size.
- Ensure that the system supports SIMD (SSE) instructions so that `simd.c` and `simd_t.c` execute correctly.
- Expand the program to support larger matrices.
- Explore AVX or AVX-512 instructions for further optimization.
- Compare SIMD performance across different platforms.
- Non-SIMD version: Elapsed time: 0.025616 seconds
- SIMD version: Elapsed time: 0.011092 seconds
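Elapsed times like the ones above can be collected with `clock_gettime()`. The sketch below shows one way to do it. The helper names `elapsed_seconds` and `time_call` are hypothetical, and the choice of `CLOCK_MONOTONIC` is an assumption, since the clock ID used by the programs is not stated.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* expose clock_gettime() under strict -std modes */
#endif
#include <time.h>

/* Difference between two timestamps, in seconds. */
double elapsed_seconds(const struct timespec *start, const struct timespec *end)
{
    return (double)(end->tv_sec - start->tv_sec)
         + (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Hypothetical wrapper: runs fn once and returns how long it took.
 * ASSUMPTION: CLOCK_MONOTONIC is used as the time source. */
double time_call(void (*fn)(void))
{
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    fn();
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_seconds(&start, &end);
}
```

A monotonic clock is not affected by changes to the system time, which makes it a common choice for short benchmarks. A single run in the millisecond range can still vary noticeably, so averaging several runs gives a steadier comparison.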