NVIDIA/cutlass

NOTICE: Upcoming change to 3.x kernel argument ordering

mnicely opened this issue · 1 comment

Discussed in #889

Originally posted by mnicely March 23, 2023
The CUTLASS 3.1 release will include a slight modification to the ordering of arguments passed to 3.x kernels.
Existing CUTLASS 3.x kernel invocations will need to be updated to reflect this change.

The change involves two things: grouping the parameters for operands A and B into a single braced argument, and moving the thread-level epilogue parameters to the beginning of the arguments list for the collective-level epilogue.

The following diff illustrates an example of the changes that will be required in example 48:

-    block_A.get(),
-    stride_A,
-    block_B.get(),
-    stride_B,
-    {block_C.get(), stride_C, block_D.get(), stride_D, {options.alpha, options.beta}}
+    {block_A.get(), stride_A, block_B.get(), stride_B},
+    {{options.alpha, options.beta}, block_C.get(), stride_C, block_D.get(), stride_D}
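
To make the new nesting easier to see, here is a minimal, self-contained sketch of how the two braced groups in the diff map onto nested aggregates. All type and function names below (`Stride`, `ThreadEpilogueParams`, `MainloopArgs`, `EpilogueArgs`, `KernelArgs`, `make_args`) are hypothetical stand-ins invented for illustration; they are not the actual CUTLASS types, and the sketch omits any other fields the real arguments structure carries. Only the ordering shown in the diff is taken from the announcement.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; NOT the real CUTLASS types.
struct Stride { std::int64_t row; std::int64_t col; };

// Thread-level epilogue parameters (alpha, beta).
struct ThreadEpilogueParams { float alpha; float beta; };

// Operands A and B are now grouped into one braced argument.
struct MainloopArgs {
  const float* ptr_A; Stride stride_A;
  const float* ptr_B; Stride stride_B;
};

// Thread-level parameters now come first, followed by C and D.
struct EpilogueArgs {
  ThreadEpilogueParams thread;
  const float* ptr_C; Stride stride_C;
  float*       ptr_D; Stride stride_D;
};

struct KernelArgs {
  MainloopArgs mainloop;
  EpilogueArgs epilogue;
};

// Builds the arguments using the same brace nesting as the "+" lines of the diff.
inline KernelArgs make_args(const float* A, Stride sA,
                            const float* B, Stride sB,
                            const float* C, Stride sC,
                            float* D, Stride sD,
                            float alpha, float beta) {
  assert(A && B && C && D);  // debug-only sanity check on the pointers
  return KernelArgs{
    {A, sA, B, sB},                 // grouped A/B operands
    {{alpha, beta}, C, sC, D, sD}   // thread-level params first, then C and D
  };
}
```

With the old ordering, the same brace-initialized call would bind `alpha` and `beta` to the wrong members or fail to compile, which is why existing invocations need to be updated rather than left as-is.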

Existing 3.x examples (48, 49, and 50) will be updated to reflect this change.

No CUTLASS 2.x kernels will be impacted by this change.

Please respond to this discussion if you have any questions about this upcoming change.
