Support i4 type in MHLO
GMNGeoffrey opened this issue · 4 comments
MHLO currently limits operations to 8/16/32/64-bit types. We would like support for i4 to enable quantization work.
Hi Geoffrey! This is a very reasonable idea.
What is the extent of the i4 support that you have in mind? Would you like to have i4 enabled everywhere (i.e. here: mlir-hlo/include/mlir-hlo/Dialect/mhlo/IR/hlo_ops_base.td, lines 34 to 36 in d1aa065)?
I think basically everywhere, yes. ABI boundaries get tricky, though, and I think they can be punted on. We are especially interested in being able to do matmuls with i4 (especially from i4 constant weights), but manipulation of those constants would likely become necessary quickly. So I think I basically want to add "4" to those lists, yes, though I'm not sure whether some pre-work is necessary to make that safe. There may very well be lowerings or verifications that assume those are the only types (I think I've raised this concern before: issues are difficult to track down when we rely on invariants from verifiers). One potential migration approach would be to enable the smaller types op by op while auditing that nothing (within MHLO, at least) relies on that property.
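To make the op-by-op migration idea concrete, here is a rough sketch in Python of a per-op allowlist. It is only a toy model, not MHLO's actual verifier or TableGen definitions: the op names, `I4_AUDITED_OPS`, `allowed_int_widths`, and `verify_int_operand` are all hypothetical names made up for illustration.

```python
# Toy model of an op-by-op i4 rollout. Everything here is hypothetical;
# it does not mirror MHLO's real type constraints or verifier code.

# Widths every op accepts today, per the issue description.
DEFAULT_INT_WIDTHS = frozenset({8, 16, 32, 64})

# Ops that have been audited and opted in to 4-bit integers
# (illustrative names only).
I4_AUDITED_OPS = {"dot", "constant"}


def allowed_int_widths(op_name):
    """Return the integer bit widths the given op may use."""
    if op_name in I4_AUDITED_OPS:
        return DEFAULT_INT_WIDTHS | {4}
    return DEFAULT_INT_WIDTHS


def verify_int_operand(op_name, width):
    """Raise if the op has not been opted in to this integer width."""
    if width not in allowed_int_widths(op_name):
        raise ValueError(f"{op_name}: i{width} operands are not enabled")
    return True
```

With something shaped like this, an unaudited op keeps rejecting i4 until someone has checked its lowerings and explicitly added it to the audited set, so the existing invariant is relaxed one op at a time instead of all at once.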