IBM/ibm-cos-sdk-python-core

IAM-Token refresh fails


Copied from IBM/ibm-cos-sdk-python#59, as the code is actually located in this repository.

Hi team!

I'm having an issue with the IAM-Token refresh.
I am using the DataEngine python SDK, which internally uses this SDK.
Queries normally time out after 60 minutes, or earlier if the remaining lifespan of the IAM token is shorter than that.
So if a token has already been in use for some time, the maximum query duration shrinks accordingly.
As a result, some queries already time out after 24/30 minutes, which is too short for bigger loads.
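
To illustrate the arithmetic above, here is a minimal sketch of how the remaining token lifetime caps the query duration. It assumes the IAM access token is a JWT carrying a standard `exp` claim; the helper names `seconds_until_expiry` and `effective_query_budget` are made up for this example and are not part of any SDK.

```python
import base64
import json
import time


def seconds_until_expiry(jwt_token, now=None):
    """Return the seconds left before the token's `exp` claim is reached."""
    # Assumption: the token is a JWT of the form header.payload.signature.
    payload_b64 = jwt_token.split(".")[1]
    # JWT segments are base64url-encoded without padding; restore it.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    claims = json.loads(base64.urlsafe_b64decode(payload_b64))
    current = time.time() if now is None else now
    return claims["exp"] - current


def effective_query_budget(jwt_token, query_limit_s=3600, now=None):
    """Upper bound on query runtime if the token is never refreshed."""
    remaining = max(0, seconds_until_expiry(jwt_token, now))
    return min(query_limit_s, remaining)
```

With a token that has 25 minutes left, `effective_query_budget` returns 1500 seconds instead of the full 3600, which matches the shortened timeouts described above. The signature is not verified here, so this is only suitable for inspecting a token, not for trusting it.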

I am opening the issue here because my logs are being flooded with the following error message:

 Refreshing temporary credentials failed during the mandatory refresh period.
Traceback (most recent call last):

  File "/opt/app-root/lib64/python3.11/site-packages/ibm_botocore/credentials.py", line 2773, in _protected_refresh
    metadata = self.auth_function()
               ^^^^^^^^^^^^^^^^^^^^

  File "/opt/app-root/lib64/python3.11/site-packages/ibm_botocore/credentials.py", line 2685, in _default_auth_function
    raise CredentialRetrievalError(provider=self._get_token_url(), error_msg=_msg)

ibm_botocore.exceptions.CredentialRetrievalError: Error when retrieving credentials from https://iam.cloud.ibm.com/identity/token: HttpCode(400) - Retrieval of tokens from server failed.

This error/warning appears every 10 seconds and keeps repeating for a very long time.

Extra:

  • pathname: /opt/app-root/lib64/python3.11/site-packages/ibm_botocore/credentials.py
  • threadName: 'Thread-2 (_background_refresher)'
  • lineno: 2776
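
Until the root cause is resolved, the log flooding itself could be reduced with a standard `logging.Filter`. The following is only a sketch: `ThrottleFilter` is a hypothetical helper, and the logger name `ibm_botocore.credentials` is an assumption derived from the `pathname` above (it presumes the module uses `logging.getLogger(__name__)`).

```python
import logging
import time


class ThrottleFilter(logging.Filter):
    """Let a given message through at most once per `interval_s` seconds."""

    def __init__(self, interval_s=300.0, clock=time.monotonic):
        super().__init__()
        self.interval_s = interval_s
        self._clock = clock
        self._last_emitted = {}  # (logger, level, message) -> last emit time

    def filter(self, record):
        key = (record.name, record.levelno, record.msg)
        now = self._clock()
        last = self._last_emitted.get(key)
        if last is not None and now - last < self.interval_s:
            return False  # drop the repeated record
        self._last_emitted[key] = now
        return True


# Assumption: the refresh warning is emitted on the module-level logger of
# ibm_botocore/credentials.py, i.e. the logger named "ibm_botocore.credentials".
logging.getLogger("ibm_botocore.credentials").addFilter(
    ThrottleFilter(interval_s=300)
)
```

This only hides the repeats; the refresh keeps failing in the background, so it does not help with the query timeouts.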

I'm not specifying any special IAM router, just using the default init for the SQLQuery of DataEngine:

return SQLQuery(
    api_key=self.__de_cos_api_key,
    instance_crn=self.__de_instance_crn,
    max_concurrent_jobs=self.__max_concurrent_jobs,
    max_tries=1,  # this needs to be 1, as we do our own restart with increasing timers
    iam_max_tries=3,  # increase in case of iam timeouts
    thread_safe=True,  # enable, unsure about the effects...
)

These parameters are forwarded one-to-one to the COSClient inside SQLQuery.
(not using staging_env)

COSClient.__init__(
    self,
    cloud_apikey=api_key,
    token=token,
    cos_url=target_cos_url,
    client_info=client_info,
    iam_max_tries=iam_max_tries,
    thread_safe=thread_safe,
    staging=staging_env,
)

Based on the error message, I assume that the failing token refresh is what prevents me from running longer queries.
Could anyone please take a look at why the refresh is failing and how it could be fixed or worked around?
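
One way to narrow this down would be to send the token request by hand and read the response body, since the SDK only reports `HttpCode(400)`. The sketch below builds an API-key token request for the URL from the error message using only the standard library. `build_token_request` is a hypothetical helper, and the request that the SDK sends internally may differ from this one.

```python
import urllib.parse
import urllib.request

# URL taken from the error message above.
IAM_TOKEN_URL = "https://iam.cloud.ibm.com/identity/token"


def build_token_request(api_key, url=IAM_TOKEN_URL):
    """Build (but do not send) an API-key token request for the IAM endpoint."""
    body = urllib.parse.urlencode({
        "grant_type": "urn:ibm:params:oauth:grant-type:apikey",
        "apikey": api_key,
    }).encode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` and catching `urllib.error.HTTPError` gives access to the error body via `err.read()`, which may say more about why the server answers with 400 than the SDK log does.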

Thanks a lot!

This is due to a Cloud account security setting. Closing.