ROCm/Tensile

Why is MT not equal to WG*TT?

lingjiew93 opened this issue · 2 comments

Hi,
I've been trying some GEMM kernels and found that the macro tile (MT) is not equal to workgroup (WG) * thread tile (TT) in some cases.
For example, SGEMM with (M, N, K) = (8012, 8012, 8012):
Cijk_Ailk_Bljk_SB_MT128x144x16_MI16x16x4x1_SE_1LDSB1_APM1_ABV0_ACED0_AF0EM1_AF1EM1_AMAS0_ASAE01_ASCE01_ASEM1_AAC0_BL1_DTL0_DVO0_EPS1_FL0_GRVW1_GSU1_GSUAMB_GLS0_ISA908_IU1_K1_KLA_LBSPP128_LPA0_LPB2_LDL1_LRVW2_MAC_MIAV0_MDA2_NTC0_NTD0_NEPBS0_NLCA2_NLCB1_ONLL1_OPLV0_PK0_PAP0_PGR1_PLR5_RK0_SIA3_SS0_SU0_SUM0_SUS0_SCIUI1_SPO0_SRVW2_SSO0_SVW4_SNLL0_TT2_144_TLDS1_USFGROn1_VAW1_VSn1_VW1_WSGRA1_WSGRB1_WS64_WG64_4_1_WGM1
In this case, according to the docs, MT should be (2*64, 144*4) = (128, 576), but the actual MT is (128, 144). It seems the workgroup size in the N dimension has no effect here.
Is there an explanation for that?

@lingjiew93
Thanks for submitting the issue.

Before the introduction of MatrixInstructions (MI), MT = WG x TT always held. This still holds if you only have WG and TT in your config file. However, if you use MI in your config file, WG and TT may be adjusted internally to meet the MI requirements, or in some cases ignored entirely. We still allow WG/TT in the config file for legacy reasons.

The kernel name that you are referring to has MI16x16x4x1, and as a result, MT is not necessarily equal to WGxTT.
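
To check this for other kernel names, a minimal Python sketch along these lines may help. It is not part of Tensile: the helper `tile_fields` and the regular expressions are assumptions based only on the field layout visible in the name above (`MT<m>x<n>x<k>`, `TT<a>_<b>`, `WG<a>_<b>_<c>`, `MI<...>`), and the name is abbreviated to the fields that matter.

```python
import re

# Abbreviated form of the kernel name from the question; only the fields
# used below are kept (assumption: field order does not matter here).
KERNEL = "Cijk_Ailk_Bljk_SB_MT128x144x16_MI16x16x4x1_SE_TT2_144_WS64_WG64_4_1_WGM1"


def tile_fields(name):
    """Hypothetical helper: pull MT, TT, WG and the MI flag out of a kernel name."""
    mt = re.search(r"_MT(\d+)x(\d+)x\d+", name)
    tt = re.search(r"_TT(\d+)_(\d+)", name)
    wg = re.search(r"_WG(\d+)_(\d+)_\d+", name)  # does not match "_WGM1"
    if not (mt and tt and wg):
        raise ValueError("kernel name lacks an MT, TT or WG field")
    return {
        "MT": (int(mt.group(1)), int(mt.group(2))),
        "TT": (int(tt.group(1)), int(tt.group(2))),
        "WG": (int(wg.group(1)), int(wg.group(2))),
        "uses_mi": re.search(r"_MI\d+x\d+x\d+x\d+", name) is not None,
    }


fields = tile_fields(KERNEL)

# Legacy rule: MT = WG x TT, dimension by dimension.
legacy_mt = (
    fields["WG"][0] * fields["TT"][0],
    fields["WG"][1] * fields["TT"][1],
)

print("MT from name:", fields["MT"])
print("WG x TT     :", legacy_mt)
print("uses MI     :", fields["uses_mi"])
if fields["uses_mi"] and legacy_mt != fields["MT"]:
    print("MI kernel: WG/TT were adjusted or ignored, so MT != WG x TT")
```

For the name in this issue, the sketch reports MT = (128, 144) against WG x TT = (128, 576), and flags the kernel as an MI kernel, which is the case where the legacy rule is not guaranteed.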

Please let me know if that answers your questions.

Thanks for your reply.
It's clear now.