aws/amazon-sagemaker-feedback

Troubleshooting permissions while connecting SageMaker to EMR Cluster

Closed this issue · 3 comments

aiqc commented

Question

How can I authorize my SageMaker Studio notebook to connect to my EMR Cluster?

Other Details

https://stackoverflow.com/questions/78962340/sagemaker-emr-cluster-select-emr-runtime-role-for-cluster

aiqc commented

Within domain configuration, I found where the SageMaker user's EMR Assumable Role and EMR Execution Role attributes can be defined.

However, it is not clear which ARN values I should use, and I cannot get the Spark context working in either kernel (Glue PySpark or SparkMagic PySpark).
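For anyone landing here, this is a rough sketch of how those two attributes might be set outside the console. It assumes they map to the `EmrSettings` block (`AssumableRoleArns`, `ExecutionRoleArns`) under `JupyterLabAppSettings` in the SageMaker user settings. The helper name, the account ID, and the role names are hypothetical placeholders, not values from my setup.

```python
import re

# Shape of an IAM role ARN: arn:<partition>:iam::<12-digit account>:role/<name>
ROLE_ARN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/.+$")


def emr_user_settings(assumable_role_arns, execution_role_arns):
    """Build a UserSettings fragment carrying the EMR role ARNs.

    Hypothetical helper: it only checks that each value looks like an
    IAM role ARN; it does not verify that the roles exist or are trusted.
    """
    for arn in [*assumable_role_arns, *execution_role_arns]:
        if not ROLE_ARN.match(arn):
            raise ValueError(f"not an IAM role ARN: {arn}")
    return {
        "JupyterLabAppSettings": {
            "EmrSettings": {
                "AssumableRoleArns": list(assumable_role_arns),
                "ExecutionRoleArns": list(execution_role_arns),
            }
        }
    }


# Placeholder ARNs for illustration only.
settings = emr_user_settings(
    ["arn:aws:iam::111122223333:role/HypotheticalEmrAssumableRole"],
    ["arn:aws:iam::111122223333:role/HypotheticalEmrRuntimeRole"],
)
```

The resulting dictionary could presumably be passed as `UserSettings` to the boto3 SageMaker client's `update_user_profile` call, along with the domain ID and user profile name.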

Screenshot from 2024-09-08 15-40-15

Screenshot from 2024-09-08 15-37-32

aiqc commented

I added glue.amazonaws.com to the list of service principals in the runtime role trust policy example from the documentation (https://docs.aws.amazon.com/emr/latest/EMR-Serverless-UserGuide/getting-started.html#gs-runtime-role), and now the SparkMagic PySpark connection in the notebook works as expected. In the documentation's example, emr-serverless.amazonaws.com is the only entry.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "EMRServerlessTrustPolicy",
            "Effect": "Allow",
            "Principal": {
                "Service": [
                    "emr-serverless.amazonaws.com",
                    "glue.amazonaws.com"
                ]
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
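As a minimal sketch, the same trust policy can be generated in Python so the service list is easy to change. The function name `trust_policy` is my own; the policy content mirrors the JSON above.

```python
import json


def trust_policy(service_principals):
    """Return a trust policy letting the given services assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "EMRServerlessTrustPolicy",
                "Effect": "Allow",
                "Principal": {"Service": list(service_principals)},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# The documented principal plus the Glue principal added in this comment.
policy = trust_policy(["emr-serverless.amazonaws.com", "glue.amazonaws.com"])
policy_json = json.dumps(policy, indent=4)
print(policy_json)
```

The serialized `policy_json` string could then be applied to an existing role with the boto3 IAM client's `update_assume_role_policy(RoleName=..., PolicyDocument=policy_json)`, where the role name is whatever runtime role you created.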

Screenshot from 2024-09-08 16-44-29

Maybe that fixed it? I don't know. This was supposed to be a fun thing to explore on Friday morning, but now it's Sunday night.

aiqc commented

Closing this because I don't need help, but it is a pain point for sure.