4ms/metamodule

Optimize EnOsc with vectorization

EnOsc::Oscillator::Process<twist_mode, warp_mode, block_size> calculates a block of samples for one oscillator. The generated assembly contains no vector (NEON) instructions, only scalar floating-point ones.
Measurements show that this function takes the majority of the execution time.

Since the block size is always a multiple of 4, we could try rewriting this so that the compiler generates NEON instructions, or use the CMSIS-DSP library (or try Ne10 or SIMDe).
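As a rough illustration of the first option, here is a minimal sketch of a block loop that handles four samples per iteration with NEON intrinsics. The kernel (`mul_accumulate`, computing `out[i] += in[i] * gain`) is a hypothetical stand-in and not the actual EnOsc code; it only shows how a compile-time block size that is a multiple of 4 removes the need for a scalar tail loop. A scalar fallback is included so the sketch also builds on targets without NEON.

```cpp
#include <cstddef>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

// Hypothetical kernel for illustration only (not the EnOsc code):
// out[i] += in[i] * gain for one block of samples.
template <std::size_t block_size>
void mul_accumulate(const float* in, float gain, float* out) {
    // The issue states the block size is always a multiple of 4,
    // so every iteration can process one full 4-lane vector.
    static_assert(block_size % 4 == 0, "block_size must be a multiple of 4");

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const float32x4_t g = vdupq_n_f32(gain);      // broadcast gain to 4 lanes
    for (std::size_t i = 0; i < block_size; i += 4) {
        float32x4_t x   = vld1q_f32(in + i);      // load 4 input samples
        float32x4_t acc = vld1q_f32(out + i);     // load 4 output samples
        acc = vmlaq_f32(acc, x, g);               // acc + x * g, lane-wise
        vst1q_f32(out + i, acc);                  // store 4 results
    }
#else
    // Scalar fallback for targets without NEON.
    for (std::size_t i = 0; i < block_size; ++i) {
        out[i] += in[i] * gain;
    }
#endif
}
```

Calling `mul_accumulate<8>(in, 0.5f, out)` on two 8-element float arrays updates `out` in place. Whether hand-written intrinsics beat what the compiler auto-vectorizes from a restructured scalar loop would need to be measured on the target.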