PoC-Consortium/scavenger

NEON architecture for AArch64 uses 32 × 128-bit registers, twice as many as ARMv7

Closed this issue · 1 comments

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/CEGDJGGC.html
http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/CJHECGIH.html

The NEON architecture for AArch64 uses 32 × 128-bit registers, twice as many as ARMv7.

If I understand correctly, this could improve performance on aarch64?
If so, I would be grateful for an implementation; I think it would take me a long time to do it myself.
I am ready to test.

Hi,
I had a look into this. Our current NEON code is not written at machine level (assembly) but in C. This means that we don't specify which registers to use; register allocation is done by the compiler. If the compiler is aware of the extra registers, it can use them, and no code changes are necessary. Indeed, this might help to boost performance, as more values can stay in registers and fewer have to be spilled to memory (cache).
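To illustrate what "in C, not assembler" means here, below is a minimal sketch. It is not scavenger's actual code: the function name `add_words` and the operation it performs are made up for illustration. It assumes the standard `arm_neon.h` intrinsics and falls back to plain C when NEON is not available. Note that no register is named anywhere. When built for ARMv7, the compiler assigns the vector variables to q0–q15; when built for AArch64, it can choose from v0–v31.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper (not from scavenger): dst[i] = a[i] + b[i]. */
void add_words(uint32_t *dst, const uint32_t *a, const uint32_t *b, size_t n)
{
    size_t i = 0;
#if defined(__ARM_NEON)
    /* Process four 32-bit words per iteration in 128-bit vectors.
     * va and vb are ordinary C variables; the compiler decides which
     * NEON registers hold them. */
    for (; i + 4 <= n; i += 4) {
        uint32x4_t va = vld1q_u32(a + i);
        uint32x4_t vb = vld1q_u32(b + i);
        vst1q_u32(dst + i, vaddq_u32(va, vb));
    }
#endif
    /* Scalar tail, and the full loop on targets without NEON. */
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

The same source compiles unchanged for both targets. Whether the larger register file gives a measurable gain depends on how many vector values are live at once in the real code, so it would have to be benchmarked.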

Best,

Johnny