[BUG] Nondescript error when EMR roles are misconfigured
What is the bug?
In AWS, when setting up a cluster and attaching an improperly configured IAM role to a new data source, Spark queries fail with the message:
Failed to verify existing mapping: Failed to get OpenSearch index mapping for query_execution_result_[data source]
A quick code search reveals the exception comes from getIndexMetadata. The solution is to carefully configure a new role with the correct permissions, as described in https://docs.aws.amazon.com/opensearch-service/latest/developerguide/direct-query-s3-creating.html.
It would be helpful to expand this error message so that it points to the likely cause.
How can one reproduce the bug?
Steps to reproduce the behavior:
- In AWS, create a new OpenSearch cluster
- Create an IAM role with insufficient permissions for modifying the request index, such as only giving it S3FullAccess.
- Create a data source with this role (e.g. example).
- Attempt to query the data source. The error is:
Failed to verify existing mapping: Failed to get OpenSearch index mapping for query_execution_result_example
What is the expected behavior?
An error message that hints that access to the index is misconfigured. The current error does describe the problem, but at too low a level to be useful without some familiarity with OS-Spark's internal implementation. Strictly speaking, this message is probably raised on top of a more specific underlying exception, and a permissions hint may not apply to every occurrence of it, so some form of pattern matching on the underlying exception may be the solution.
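To make the pattern-matching idea concrete, here is a minimal sketch in Java. It is not the project's actual code: the class MappingErrorHint, its methods, and the marker strings are all hypothetical, and the real text of the underlying exception would need to be checked before relying on any of them. The sketch walks the cause chain of the failure and appends a permissions hint only when a cause looks like an authorization rejection.

```java
import java.util.Locale;

// Hypothetical helper, not part of OS-Spark. It only illustrates matching on the cause chain.
final class MappingErrorHint {

    // Assumed markers of an authorization failure; the real cause text may differ.
    private static final String[] ACCESS_MARKERS = {
        "forbidden", "security_exception", "no permissions"
    };

    private MappingErrorHint() {}

    /** Returns true if any exception in the cause chain mentions an access marker. */
    static boolean looksLikeAccessProblem(Throwable error) {
        int depth = 0;
        // Cap the walk so a cyclic cause chain cannot loop forever.
        for (Throwable t = error; t != null && depth < 20; t = t.getCause(), depth++) {
            String message = t.getMessage();
            if (message == null) {
                continue;
            }
            String lower = message.toLowerCase(Locale.ROOT);
            for (String marker : ACCESS_MARKERS) {
                if (lower.contains(marker)) {
                    return true;
                }
            }
        }
        return false;
    }

    /** Builds the user-facing message, adding a hint only for access-like failures. */
    static String describe(String indexName, Throwable error) {
        String base = "Failed to get OpenSearch index mapping for " + indexName;
        if (looksLikeAccessProblem(error)) {
            return base + ". The request appears to have been rejected as unauthorized; "
                + "check that the IAM role attached to the data source can access this index.";
        }
        // Other failures keep the original message so the hint is never misleading.
        return base;
    }
}
```

With this sketch, a failure whose cause mentions security_exception would produce the longer message, while a timeout or a missing index would keep the current wording. Matching on message text is fragile, so matching on a status code or exception type would be preferable if the underlying exception exposes one.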
What is your host/environment?
- OS: Amazon OS 2.13
Do you have any additional context?
N/A