aws-samples/amazon-sagemaker-build-train-deploy

Exception when running 01_build_and_train.ipynb with pinned sagemaker version

szamarin opened this issue · 1 comments

When running the 01_build_and_train.ipynb notebook with the pinned version sagemaker==2.199.0, I get the exception below when executing the remote function:

File /opt/conda/lib/python3.10/site-packages/sagemaker/remote_function/job.py:513, in _JobSettings.__init__(self, dependencies, pre_execution_commands, pre_execution_script, environment_variables, image_uri, include_local_workdir, custom_file_filter, instance_count, instance_type, job_conda_env, job_name_prefix, keep_alive_period_in_seconds, max_retry_attempts, max_runtime_in_seconds, role, s3_kms_key, s3_root_uri, sagemaker_session, security_group_ids, subnets, tags, volume_kms_key, volume_size, encrypt_inter_container_traffic, spark_config, use_spot_instances, max_wait_time_in_seconds)
511 self.role = self.sagemaker_session.expand_role(_role)
512 else:
--> 513 self.role = get_execution_role(self.sagemaker_session)
515 self.s3_root_uri = resolve_value_from_config(
516 direct_input=s3_root_uri,
517 config_path=REMOTE_FUNCTION_S3_ROOT_URI,
(...)
523 sagemaker_session=self.sagemaker_session,
524 )
526 self.s3_kms_key = resolve_value_from_config(
527 direct_input=s3_kms_key,
528 config_path=REMOTE_FUNCTION_S3_KMS_KEY_ID,
529 sagemaker_session=self.sagemaker_session,
530 )

File /opt/conda/lib/python3.10/site-packages/sagemaker/session.py:6789, in get_execution_role(sagemaker_session)
6787 if not sagemaker_session:
6788 sagemaker_session = Session()
-> 6789 arn = sagemaker_session.get_caller_identity_arn()
6791 if ":role/" in arn:
6792 return arn

File /opt/conda/lib/python3.10/site-packages/sagemaker/session.py:5371, in Session.get_caller_identity_arn(self)
5369 if space_name is not None:
5370 domain_desc = self.sagemaker_client.describe_domain(DomainId=domain_id)
-> 5371 return domain_desc["DefaultSpaceSettings"]["ExecutionRole"]
5373 user_profile_desc = self.sagemaker_client.describe_user_profile(
5374 DomainId=domain_id, UserProfileName=user_profile_name
5375 )
5377 # First, try to find role in userSettings

KeyError: 'DefaultSpaceSettings'
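
The last frame indexes `domain_desc["DefaultSpaceSettings"]` directly, so a `describe_domain` response without that key raises the `KeyError` shown. Below is a minimal sketch of a more tolerant lookup, assuming the response is a plain dict. The helper name `resolve_space_execution_role` and the fallback to `DefaultUserSettings` are illustrative assumptions, not the SDK's actual code or its fix.

```python
from typing import Any, Mapping, Optional


def resolve_space_execution_role(domain_desc: Mapping[str, Any]) -> Optional[str]:
    """Return an execution role ARN from a describe_domain-style dict, or None.

    Hypothetical helper for illustration only; it is not part of the sagemaker SDK.
    """
    # Prefer the space-level default, but tolerate the key being absent.
    space_settings = domain_desc.get("DefaultSpaceSettings") or {}
    role = space_settings.get("ExecutionRole")
    if role:
        return role

    # Assumed fallback: the domain-level user defaults.
    user_settings = domain_desc.get("DefaultUserSettings") or {}
    return user_settings.get("ExecutionRole")
```

Returning `None` instead of raising lets the caller decide whether to fall back to another lookup or to fail with a clearer message.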

If I bump the sagemaker version to 2.219.0, the issue goes away and I can run the lab. It seems that something in the sagemaker session implementation changed between these two versions.
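
To check which side of this an environment is on before running the notebook, a small version guard might look like the sketch below. The helper names are hypothetical, and treating every release from 2.219.0 onward as working is an assumption: only 2.199.0 (failing) and 2.219.0 (working) were actually tried here, so the releases in between are untested.

```python
from importlib import metadata
from typing import Optional, Tuple

# The version reported as working in this issue.
REPORTED_WORKING = (2, 219, 0)


def parse_version(text: str) -> Tuple[int, int, int]:
    """Turn '2.199.0' into (2, 199, 0), ignoring suffixes such as 'rc1'."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    # Pad short versions such as '2.219' to three components.
    while len(parts) < 3:
        parts.append(0)
    return (parts[0], parts[1], parts[2])


def at_least_reported_working(version_text: str) -> bool:
    """True if the version is 2.219.0 or newer (assumed, not verified, to be fine)."""
    return parse_version(version_text) >= REPORTED_WORKING


def installed_sagemaker_version() -> Optional[str]:
    """Return the installed sagemaker version string, or None if it is not installed."""
    try:
        return metadata.version("sagemaker")
    except metadata.PackageNotFoundError:
        return None
```

In a notebook cell, `installed_sagemaker_version()` can be passed to `at_least_reported_working` (after checking for `None`) to decide whether to upgrade before calling the remote function.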

Thank you, @szamarin. I will update the dependency version.