tensorflow/mlir-hlo

Support i4 type in MHLO

GMNGeoffrey opened this issue · 4 comments

MHLO currently limits operations to 8/16/32/64-bit types. We would like support for i4 to enable quantization work.

Hi Geoffrey! This is a very reasonable idea.

What is the extent of the i4 support that you have in mind? Would you like to have i4 enabled everywhere (i.e. here:

def HLO_SInt : SignlessIntOfWidths<[8, 16, 32, 64]>;
def HLO_UInt : UnsignedIntOfWidths<[8, 16, 32, 64]>;
def HLO_Int : AnyTypeOf<[HLO_SInt, HLO_UInt]>;
), or does your use case involve only particular ops operating on i4 values?

Basically everywhere, yes. ABI boundaries get tricky, but I think those can be punted on. We are especially interested in doing matmuls with i4 (particularly with i4 constant weights), but manipulating those constants would likely become necessary soon after. So yes, I want to add "4" to those lists, though I'm not sure whether some pre-work is needed to make that safe. There may well be lowerings or verifications that assume those are the only types (I think I've raised this concern before: issues are difficult to track down when we rely on invariants from verifiers). One potential migration approach would be to enable the smaller types op by op while auditing that nothing (within MHLO, at least) relies on that property.
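
To make the op-by-op idea concrete, here is a rough sketch of what it might look like in TableGen. The `HLO_*WithI4` names are hypothetical (they are not taken from MHLO), and the include path assumes the upstream MLIR `OpBase.td` that provides `SignlessIntOfWidths`, `UnsignedIntOfWidths`, and `AnyTypeOf`. This is only an illustration of the approach, not necessarily what the eventual change does.

```tablegen
// Assumption: the constraint classes below come from upstream MLIR's OpBase.td.
include "mlir/IR/OpBase.td"

// Existing constraints, left unchanged so that unaudited ops keep
// rejecting i4 operands in their verifiers.
def HLO_SInt : SignlessIntOfWidths<[8, 16, 32, 64]>;
def HLO_UInt : UnsignedIntOfWidths<[8, 16, 32, 64]>;
def HLO_Int : AnyTypeOf<[HLO_SInt, HLO_UInt]>;

// Hypothetical wider constraints that also accept 4-bit integers.
// An op would switch to these only after its lowerings and verifiers
// have been audited for hidden assumptions about the width list.
def HLO_SIntWithI4 : SignlessIntOfWidths<[4, 8, 16, 32, 64]>;
def HLO_UIntWithI4 : UnsignedIntOfWidths<[4, 8, 16, 32, 64]>;
def HLO_IntWithI4 : AnyTypeOf<[HLO_SIntWithI4, HLO_UIntWithI4]>;
```

Once every op has been audited, the two sets could presumably be collapsed by adding 4 directly to the `HLO_SInt` and `HLO_UInt` lists and dropping the temporary names.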

This has landed in a622ca2.