mn416/QPULib

How many gigaflops?

Closed this issue · 2 comments

Looking at the manual, there are 4 slices x 4 QPUs / slice, giving 16 QPUs per GPU.
Each QPU has 2 ALUs, which operate in parallel. With full pipelines, each QPU executes an instruction on every cycle, giving 2 32-bit floating-point operations per cycle per QPU.
At a clock speed of 250 MHz, that is 250 x 10^6 x 16 x 2 = 8 x 10^9 flops, or 8 gigaflops as a GPGPU.
At 16-bit precision, this presumably gives 16 gigaflops.
I do not understand where the 12 gigaflops figure comes from.

mn416 commented

Hi @Systems-Analyst,

> Looking at the manual, there are 4 slices x 4 QPUs / slice, giving 16 QPUs

The diagram in the manual does suggest 16, but I read here that there are only 12 in the Pi.

> giving 2 32-bit floating-point operations per cycle, per QPU

A QPU is 4-way physical SIMD, so it's 4 x 32-bit FP ops per cycle (or 8 x 32-bit FP ops if you count both ALUs in each vector lane).
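
As a rough check of the numbers in this thread, the peak figure is clock rate x QPUs x SIMD lanes per QPU x ALUs per lane. The Python sketch below plugs in the values quoted above. The function name `peak_gflops` and its parameters are made up for illustration, and reading the 12 gigaflops figure as 12 QPUs x 4 lanes at 250 MHz is an assumption, not something confirmed here.

```python
# Back-of-the-envelope peak throughput, using only figures quoted in this thread.
# peak_gflops is a hypothetical helper, not part of QPULib.

def peak_gflops(clock_hz, qpus, lanes_per_qpu, alus_per_lane):
    """Peak 32-bit FP operations per second, in units of 10^9."""
    return clock_hz * qpus * lanes_per_qpu * alus_per_lane / 1e9

CLOCK_HZ = 250_000_000  # 250 MHz, as stated in the question

# The question's estimate: 16 QPUs, 2 ALUs each, no SIMD lanes counted.
question_estimate = peak_gflops(CLOCK_HZ, qpus=16, lanes_per_qpu=1, alus_per_lane=2)

# The reply's figures: 12 QPUs, 4-way SIMD, counting one ALU per lane.
one_alu_per_lane = peak_gflops(CLOCK_HZ, qpus=12, lanes_per_qpu=4, alus_per_lane=1)

# Same, counting both ALUs in each vector lane.
both_alus_per_lane = peak_gflops(CLOCK_HZ, qpus=12, lanes_per_qpu=4, alus_per_lane=2)

print(question_estimate, one_alu_per_lane, both_alus_per_lane)
```

Under these assumptions, 12 QPUs with 4 lanes each give 12 gigaflops when one ALU per lane is counted, and twice that when both are counted.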

Thanks for clearing that up; it makes sense now.