Bad liveness check for PBLENDVB and similar instructions
sneves opened this issue · 2 comments
PBLENDVB
is one of a handful of instructions that has an implicit register argument at xmm0
. Consider this trivial example:
from peachpy import *
from peachpy.x86_64 import *
with Function("test", (), None, target=uarch.haswell):
x = XMMRegister()
y = XMMRegister()
PBLENDVB(x, y, xmm0)
RETURN()
This fails with the exception Instruction PBLENDVB requires operand 3 to be allocated to xmm0 register, but xmm0 is a live register
. Looking at the check in question, we have
if instruction.name in {"BLENDVPD", "BLENDVPS", "PBLENDVB", "SHA256RNDS2"}:
assert len(instruction.operands) == 3, \
"expected 3 operands, got %d (%s)" % \
(len(instruction.operands), ", ".join(map(str, instruction.operands)))
xmm0_operand = instruction.operands[2]
assert isinstance(xmm0_operand, XMMRegister), \
"expected xmm registers in the 3rd operand, got %s" % str(xmm0_operand)
# Check that xmm0 is not live at this instruction
if instruction._live_registers.get(xmm0._internal_id, 0) & XMMRegister._mask != 0:
raise RegisterAllocationError(
"Instruction %s requires operand 3 to be allocated to xmm0 register, " +
"but xmm0 is a live register" % instruction.name)
if xmm0_operand.is_virtual:
xmm0_binded_registers.add(xmm0_operand._internal_id)
I have found no way to actually use these instructions, as xmm0
will apparently always be live at the instruction in question.
Curiously, the ROL
and related instructions work as expected when using cl
as an argument. The difference suggests the possible issue: ROL
et al check first that the rotation count operand is a virtual register, whereas PBLENDVB
et al do not and always make the liveness check.
Now, if we get past this issue by letting every register be virtual, e.g.,
from peachpy import *
from peachpy.x86_64 import *
with Function("test", (), None, target=uarch.haswell):
x = XMMRegister()
y = XMMRegister()
z = XMMRegister()
PBLENDVB(x, y, z)
RETURN()
we run into a different issue: the comparison here
other_xmm0_registers = filter(operator.methodcaller("__ne__", xmm0_register), xmm0_binded_registers)
does not work, because earlier we are pushing integers into xmm0_binded_registers
:
xmm0_binded_registers.add(xmm0_operand._internal_id)
Furthermore, in the next line self._conflicting_registers[3][xmm0_register]
simply does not exist. Some investigation leads me to believe that self._register_allocators[XMMRegister._kind].conflicting_registers[xmm0_register]
is what is intended here, but I'm not entirely sure. Similarly, a few lines below, self._register_allocations[XMMRegister._kind][xmm0_register] = xmm0.physical_id
should probably be self._register_allocators[XMMRegister._kind].register_allocations[xmm0_register] = xmm0.physical_id
.
This is partially fixed by 5e6def5. PeachPy now accepts fixed registers for these instructions, but doesn't yet allow virtual registers as fixed operands.