ad8e opened this issue 5 months ago · 1 comment
torchtitan/torchtitan/parallelisms/__init__.py
Line 46 in d442743
The order here should be PP, DP, TP. This matters for NUMA across nodes.
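To illustrate why the order could matter, here is a rough sketch in plain Python (no torch). It assumes the referenced line sets the order of the device mesh dimensions, that ranks are laid out row-major over the mesh (last dimension varies fastest), and that consecutive ranks share a node. The helper names `mesh_groups` and `crosses_nodes`, the 2x2x2 shape, and the 2 GPUs per node are all made up for the example and are not torchtitan code.

```python
from itertools import product


def mesh_groups(dims, names, dim_name):
    """Return the rank groups along `dim_name` for a row-major mesh.

    Hypothetical helper for illustration only.
    """
    sizes = dict(zip(names, dims))
    # Row-major strides: the last dimension has stride 1.
    strides = {}
    stride = 1
    for name, size in zip(reversed(names), reversed(dims)):
        strides[name] = stride
        stride *= size
    others = [n for n in names if n != dim_name]
    groups = []
    # Fix every other coordinate, then walk along `dim_name`.
    for coords in product(*(range(sizes[n]) for n in others)):
        base = sum(c * strides[n] for c, n in zip(coords, others))
        groups.append(
            [base + i * strides[dim_name] for i in range(sizes[dim_name])]
        )
    return groups


def crosses_nodes(group, gpus_per_node):
    """True if the ranks in `group` live on more than one node."""
    return len({rank // gpus_per_node for rank in group}) > 1


# Assumed toy setup: 8 ranks, 2 GPUs per node, pp = dp = tp = 2.
GPUS_PER_NODE = 2

# PP outermost, TP innermost: each TP group is a pair of adjacent ranks.
tp_inner = mesh_groups((2, 2, 2), ("pp", "dp", "tp"), "tp")
print(tp_inner, [crosses_nodes(g, GPUS_PER_NODE) for g in tp_inner])

# TP in the middle: each TP group is strided and spans two nodes.
tp_middle = mesh_groups((2, 2, 2), ("dp", "tp", "pp"), "tp")
print(tp_middle, [crosses_nodes(g, GPUS_PER_NODE) for g in tp_middle])
```

Under those assumptions, putting TP last keeps each TP group on one node, while PP (outermost) is the dimension that ends up spread across nodes.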
good catch! submitting a fix soon