plaidml/tpp-mlir

BF16 BRGEMM is not matched on mlir-gen input when using prepacked weights and tile size and tensor size are identical


How to reproduce:
./mlir-gen --kernel=model --float-width=16 --batch=256 --layers=64,64,64,64 --tiles=64,64,64 --vnni=2 > gemm_bf16.mlir
In this case the weight is in VNNI layout:
%cst = arith.constant dense<1.000000e+00> : tensor<1x1x32x64x2xbf16>
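For reference, the tensor<1x1x32x64x2xbf16> shape is the VNNI-2 packing of a single 64x64 weight tile: pairs of rows along the K dimension are interleaved into the innermost dimension of size 2. A minimal NumPy sketch of this layout (float32 as a stand-in for bf16; the packing helper is hypothetical, not tpp-mlir code):

```python
import numpy as np

# VNNI-2 packing of one 64x64 weight tile, matching the
# tensor<1x1x32x64x2xbf16> shape above (the two leading unit
# dims correspond to the single 64x64 tile and are omitted here).
K, N, VNNI = 64, 64, 2

w = np.arange(K * N, dtype=np.float32).reshape(K, N)  # stand-in for bf16 data

# Pack pairs of K-rows into the innermost dim: W[k, n] -> P[k // 2, n, k % 2]
packed = w.reshape(K // VNNI, VNNI, N).transpose(0, 2, 1)

print(packed.shape)  # (32, 64, 2)
assert packed[5, 7, 1] == w[11, 7]
```

Because the tile size equals the tensor size here, the outer tile dimensions are all 1, which is the case the matcher apparently fails to handle.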
However, when running ./tpp-opt -default-tpp-passes gemm_bf16.mlir, the linalg.generic is matched neither to a BRGEMM with batch count 1 nor to a regular GEMM.

This issue only exists for BF16. In the analogous FP32 case,
./mlir-gen --kernel=model --float-width=32 --batch=256 --layers=64,64,64,64 --tiles=64,64,64 > gemm_fp32.mlir
everything works as expected.

@chelini