/matrix-multiply-optimization

Used cache blocking, parallelization, loop unrolling, register blocking, loop reordering, and SSE instructions to optimize the multiplication of large matrices, reaching 55 GFLOPS.
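As a rough illustration of two of the techniques listed above (cache blocking and loop reordering), here is a minimal sketch in C. It is not the repository's code: the function names `matmul_naive` and `matmul_blocked`, the tile size `BLOCK`, and the row-major square `float` layout are all assumptions made for the example. Parallelization, unrolling, register blocking, and SSE intrinsics are left out to keep it portable.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tile size; in practice it is tuned to the target cache. */
#define BLOCK 32

/* Reference triple loop: C = A * B for row-major n x n matrices. */
void matmul_naive(size_t n, const float *A, const float *B, float *C)
{
    for (size_t i = 0; i < n; i++) {
        for (size_t j = 0; j < n; j++) {
            float sum = 0.0f;
            for (size_t k = 0; k < n; k++)
                sum += A[i * n + k] * B[k * n + j];
            C[i * n + j] = sum;
        }
    }
}

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* Cache-blocked variant (assumed design, not the author's implementation).
 * The matrices are processed in BLOCK x BLOCK tiles so that each tile of
 * A, B, and C is reused while it is still in cache. Inside a tile the
 * loops run in i-k-j order, so the innermost loop walks B and C with
 * unit stride. */
void matmul_blocked(size_t n, const float *A, const float *B, float *C)
{
    assert(A != C && B != C); /* output must not alias the inputs */

    for (size_t i = 0; i < n * n; i++)
        C[i] = 0.0f;

    for (size_t ii = 0; ii < n; ii += BLOCK) {
        size_t i_end = min_sz(ii + BLOCK, n);
        for (size_t kk = 0; kk < n; kk += BLOCK) {
            size_t k_end = min_sz(kk + BLOCK, n);
            for (size_t jj = 0; jj < n; jj += BLOCK) {
                size_t j_end = min_sz(jj + BLOCK, n);
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        float a = A[i * n + k]; /* reused across the j loop */
                        for (size_t j = jj; j < j_end; j++)
                            C[i * n + j] += a * B[k * n + j];
                    }
                }
            }
        }
    }
}
```

Both functions compute the same product; the blocked version only changes the order in which memory is visited. The `min_sz` bounds handle matrix sizes that are not a multiple of `BLOCK`.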

Primary Language: C

This repository is not active