NVIDIA/modulus-launch

๐Ÿ›[BUG]: The training for test 'bloodflow_1d_mgn' needs performance improvements

lucapegolotti opened this issue · 0 comments

Version

Modulus 0.3.0

On which installation method(s) does this occur?

No response

Describe the issue

We tested training for this example on A40, A100, and RTX3090 GPUs. Performance on the A40 is reasonable (150 s per epoch), but on the other GPUs it is too slow (600 s per epoch on the RTX3090 with a smaller version of the GNN). Further testing is needed to determine whether the custom DataLoader for this example should be improved or whether the bottleneck lies elsewhere.
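One way to narrow this down might be to split each epoch's wall time into time spent waiting for the next batch and time spent in the training step. The sketch below is only an illustration and is not taken from the example's code: `time_epoch`, `step`, and `sync` are hypothetical names, `loader` stands for any iterable of batches, and `step` stands for whatever callable runs the forward/backward pass.

```python
import time
from typing import Any, Callable, Dict, Iterable


def time_epoch(
    loader: Iterable[Any],
    step: Callable[[Any], None],
    sync: Callable[[], None] = lambda: None,
) -> Dict[str, float]:
    """Split one epoch into data-wait time and step time (hypothetical helper)."""
    data_s = 0.0   # seconds spent waiting for the loader to yield a batch
    step_s = 0.0   # seconds spent inside the training step
    batches = 0
    it = iter(loader)
    while True:
        t0 = time.perf_counter()
        try:
            batch = next(it)
        except StopIteration:
            break
        t1 = time.perf_counter()
        step(batch)
        sync()  # wait for asynchronous device work so it is attributed to the step
        t2 = time.perf_counter()
        data_s += t1 - t0
        step_s += t2 - t1
        batches += 1
    return {"batches": batches, "data_s": data_s, "step_s": step_s}
```

With PyTorch on a GPU, `torch.cuda.synchronize` could be passed as `sync`, since CUDA kernels launch asynchronously and would otherwise be undercounted. If `data_s` dominates on the slower machines, the DataLoader is the likely suspect; if `step_s` dominates, the bottleneck is probably elsewhere.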

Minimum reproducible example

No response

Relevant log output

No response

Environment details

+ A40, A100, RTX3090 GPUs