aws/sagemaker-training-toolkit

Custom MPI options don't override the default flags

ChaiBapchya opened this issue · 3 comments

Describe the bug
The custom_mpi_options setting in the SageMaker training toolkit does not override flags in the MPI command; it only appends the custom flags to it.

Logic

overridden_known_options, additional_options = _parse_custom_mpi_options(
self._custom_mpi_options
)

To reproduce

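from sagemaker.mxnet import MXNet

# role, image, and sagemaker_session are assumed to be defined elsewhere.
mpi_options = '-verbose -x orte_base_help_aggregate=0  -map-by socket -rank-by core'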
estimator = MXNet(
    entry_point='hvd_resnet_mx.py',
    role=role,
    train_instance_type='ml.p3.8xlarge',
    train_instance_count=2,
    image_name=image,
    framework_version='1.6.0',
    py_version='py3',
    hyperparameters={'sagemaker_mpi_enabled': True,
                     'sagemaker_mpi_custom_mpi_options': mpi_options,
                     'sagemaker_mpi_num_of_processes_per_host': 4},
    sagemaker_session=sagemaker_session)

Running this does not override the default MPI flags. The generated mpirun command is:

mpirun --host algo-1:4,algo-2:4 -np 8 --allow-run-as-root --display-map 
--tag-output -mca btl_tcp_if_include eth0 -mca oob_tcp_if_include eth0 -mca plm_rsh_no_tree_spawn 1 -bind-to socket 
-map-by slot -mca pml ob1 -mca btl ^openib 
-mca orte_abort_on_non_zero_status 1 -x NCCL_MIN_NRINGS=4 -x NCCL_SOCKET_IFNAME=eth0 -x NCCL_DEBUG=INFO -x LD_LIBRARY_PATH -x PATH -x LD_PRELOAD=/usr/local/lib/python3.6/site-packages/gethostname.cpython-36m-x86_64-linux-gnu.so -verbose -x orte_base_help_aggregate=0 
-map-by socket -rank-by core -x SM_HOSTS -x SM_NETWORK_INTERFACE_NAME -x SM_HPS -x SM_USER_ENTRY_POINT -x SM_FRAMEWORK_PARAMS -x SM_RESOURCE_CONFIG -x SM_INPUT_DATA_CONFIG -x SM_OUTPUT_DATA_DIR -x SM_CHANNELS -x SM_CURRENT_HOST -x SM_MODULE_NAME -x SM_LOG_LEVEL -x SM_FRAMEWORK_MODULE -x SM_INPUT_DIR -x SM_INPUT_CONFIG_DIR -x SM_OUTPUT_DIR -x SM_NUM_CPUS -x SM_NUM_GPUS -x SM_MODEL_DIR -x SM_MODULE_DIR -x SM_TRAINING_ENV -x SM_USER_ARGS -x SM_OUTPUT_INTERMEDIATE_DIR -x PYTHONPATH /usr/local/bin/python3.6 -m mpi4py hvd_resnet_mx.py

Expected behavior
Expected -map-by to appear only once, with the custom value replacing the default. Instead, the command contains both:

-map-by slot
-map-by socket
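
The expected behavior could be sketched roughly as follows. This is a minimal illustration, not the toolkit's implementation: merge_mpi_options and SINGLE_VALUE_FLAGS are hypothetical names, and the set of flags treated as overridable is an assumption made for this example. The idea is to drop a default flag (and its value) whenever the custom options supply the same flag.

```python
import shlex

# Hypothetical set of mpirun flags that take exactly one value and should
# appear at most once. Chosen for illustration, not taken from the toolkit.
SINGLE_VALUE_FLAGS = {"-map-by", "-bind-to", "-rank-by"}


def merge_mpi_options(default_args, custom_options):
    """Append custom options, dropping any default flag that they override."""
    custom_args = shlex.split(custom_options)
    overridden = {arg for arg in custom_args if arg in SINGLE_VALUE_FLAGS}

    merged = []
    i = 0
    while i < len(default_args):
        arg = default_args[i]
        if arg in overridden:
            i += 2  # skip the default flag and its value
            continue
        merged.append(arg)
        i += 1
    return merged + custom_args


# A shortened version of the defaults seen in the command above.
defaults = ["-bind-to", "socket", "-map-by", "slot", "-mca", "pml", "ob1"]
custom = "-verbose -x orte_base_help_aggregate=0  -map-by socket -rank-by core"

merged = merge_mpi_options(defaults, custom)
print(" ".join(merged))
```

With this sketch, -map-by slot is removed from the defaults and only -map-by socket remains in the merged argument list.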

System information
Latest MX Dockerfile

Thank you for reporting that!

Moreover, since this is training-toolkit specific, it would be an issue regardless of the framework.

You can override an env var with -x key=value, but I'm not sure you can override MPI parameters like -rank-by core. In detail:
I tested this: if you set an env var (-x) through custom_mpi_options, it overrides the default value of that env var, because the custom options are concatenated at the end of the mpirun command.

Estimator distribution dict:
distribution = {"mpi": {"enabled": True, "custom_mpi_options": "-x NCCL_MIN_NRINGS=1"}}

Runtime output when printing env var from within the MPI worker:
rv81uo39sp-algo-1-wr5k6 | [1,mpirank:0,algo-1]<stdout>:NCCL_MIN_NRINGS=1
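
The override observed above can be modeled with a small sketch. It assumes "last -x occurrence wins", which matches the runtime output in this comment but is not a statement about how Open MPI parses its arguments; effective_exports, the shortened command, and train.py are hypothetical and used only for illustration.

```python
import shlex


def effective_exports(command):
    """Map each `-x NAME=value` to its last value; a bare `-x NAME` maps to None."""
    args = shlex.split(command)
    exports = {}
    for flag, value in zip(args, args[1:]):
        if flag == "-x":
            name, _, val = value.partition("=")
            # Later occurrences overwrite earlier ones (assumed behavior).
            exports[name] = val if "=" in value else None
    return exports


# Default exports first, custom option appended at the end.
cmd = (
    "mpirun -np 8 -x NCCL_MIN_NRINGS=4 -x NCCL_DEBUG=INFO -x PATH "
    "-x NCCL_MIN_NRINGS=1 python train.py"
)
exports = effective_exports(cmd)
print(exports["NCCL_MIN_NRINGS"])
```

Under that assumption, the value appended through custom_mpi_options replaces the default NCCL_MIN_NRINGS=4.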