below/HelloSilicon

Chapter 13, NEON: Invalid operand for instruction MUL

below opened this issue · 2 comments

below commented

MUL \ccol\().4H, V0.4H, \bcol\().4H[0]

The distance code builds and runs, but the matrixmultneon code does not. Building it fails on the line above with the following error:

<instantiation>:1:24: error: invalid operand for instruction
MUL V6.4H, V0.4H, V3.4H[0]
                        ^

The book says:

MUL V6.4H, V0.4H, V3.4H[0]
This is multiplying each lane in V0 by the scalar contained in a specific lane of V3. This shows how we typically access a value in a specific lane by appending [lane number] to the end of the register specifier—counting lanes from zero.
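For anyone unsure what the by-element form is supposed to compute, here is a rough scalar model in plain C (no NEON needed). It is only a sketch of the semantics described in the quote above; `mul_by_lane_4h` is a made-up helper name, not something from the book or this repo, and it treats the lanes as unsigned 16-bit values for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (an assumption for illustration): scalar model of a
   4-lane, 16-bit multiply-by-element. Every lane of src is multiplied by
   the single value found in scalars[lane]. */
void mul_by_lane_4h(uint16_t dst[4], const uint16_t src[4],
                    const uint16_t scalars[4], size_t lane)
{
    /* Read the scalar first so the result is correct even if dst aliases scalars. */
    const uint32_t s = scalars[lane];

    for (size_t i = 0; i < 4; i++) {
        /* Widen before multiplying, then keep only the low 16 bits of the product. */
        dst[i] = (uint16_t)((uint32_t)src[i] * s);
    }
}
```

With V0 = {1, 2, 3, 4} and V3 = {10, 20, 30, 40}, lane 0 gives {10, 20, 30, 40} and lane 2 gives {30, 60, 90, 120}.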

Adding -arch arm64e does not change anything.

Maybe @steipete has an idea?

steipete commented

You're digging into this deeper than I did; I'm not sure what's up here. The NEON code we use is still mostly C, just using low-level intrinsics like vcombine_s64.

below commented

> not sure what's up here.

Once you find the issue, it looks trivial: the Clang assembler uses a slightly different syntax, with the arrangement specifier (.4H) attached to the mnemonic instead of to each register:

MUL.4H	V6, V0, V3[0]

Thanks for having a look!

MUL.4H \ccol\(), V0, \bcol\()[0]