oracle/oci-typescript-sdk

Intermittent NotAuthenticated Error with OKE Provider

nemohuang-new opened this issue · 1 comments

Description:

I am encountering an intermittent NotAuthenticated issue while using the OKE Workload Identity Authentication Details Provider in my project. The setup generally works fine, but occasionally, I see authentication failures. Here's a detailed description:

Environment:
Client Library: OkeWorkloadIdentityAuthenticationDetailsProviderBuilder (OCI SDK)
Application Framework: NestJS
OCI SDK Version: Oracle-TypeScriptSDK/2.95.1
Service Affected: ObjectStorage
Operation: updateTags/getObject etc.
Issue:
Problem: Sometimes the client throws a NotAuthenticated error. The error persists for a while (up to an hour) and then resolves on its own (I have retry logic that reinitializes the service).
Workaround Attempts: I added retry logic to reinitialize the client when the error occurs, but it didn’t resolve the issue reliably.
Observation: Restarting the server resolves the issue immediately.
Error Details:
Here’s the full error message:

Operation 'updateTags' failed after 3 attempts: {
  "statusCode": 401,
  "serviceCode": "NotAuthenticated",
  "opcRequestId": "",
  "targetService": "ObjectStorage",
  "operationName": "getObject",
  "timestamp": "2025-01-12T01:01:40.177Z",
  "clientVersion": "Oracle-TypeScriptSDK/2.95.1",
  "loggingTips": "To get more info on the failing request, refer to https://docs.oracle.com/en-us/iaas/Content/API/SDKDocs/typescriptsdkconcepts.htm#typescriptsdkconcepts_topic_Logging for ways to log the request/response details.",
  "troubleshootingTips": "See https://docs.oracle.com/iaas/Content/API/References/apierrors.htm#apierrors_401__401_notauthenticated for more information about resolving this error."
}
Below is the relevant code for initializing the client:

constructor(private moduleRef: ModuleRef) {
  this.client = this.initializeClient();
  this.setRegion();
}
private initializeClient() {
  let provider;
  try {
    const kubeServiceHostEnvVar =
      common.OkeWorkloadIdentityAuthenticationDetailsProvider
        .KUBERNETES_SERVICE_HOST_ENV_VAR_NAME;

    if (process.env[kubeServiceHostEnvVar]) {
      provider = new common.OkeWorkloadIdentityAuthenticationDetailsProvider.OkeWorkloadIdentityAuthenticationDetailsProviderBuilder().build();
    } else {
      Logger.log("Initializing with local config");
      provider = new common.ConfigFileAuthenticationDetailsProvider();
    }
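  } catch (error) {
    // JSON.stringify on an Error yields "{}", so log the message instead.
    const message = error instanceof Error ? error.message : String(error);
    Logger.error(`Error occurred when creating auth provider: ${message}`);
    throw new HttpException(
      `Authentication provider initialization failed: ${message}`,
      500
    );
  }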

  return new objectStorage.ObjectStorageClient({
    authenticationDetailsProvider: provider,
  });
}

Observations:

Intermittent Issue: The error occurs sporadically but resolves automatically after a while (usually within an hour).
Immediate Resolution: Restarting the server instantly fixes the issue.
Retried Initialization Fails: Attempting to reinitialize the client programmatically does not resolve the error.
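
For reference, the retry-and-reinitialize workaround mentioned above follows roughly this pattern. This is a minimal, SDK-independent sketch rather than the exact project code: `ReinitializingClient`, `isNotAuthenticated`, `ServiceError`, and `maxRebuilds` are hypothetical names introduced for illustration, and the only error fields it assumes are `statusCode` and `serviceCode`, as they appear in the logged error.

```typescript
// Hypothetical shape: only the two fields this sketch inspects,
// taken from the logged error payload.
interface ServiceError {
  statusCode?: number;
  serviceCode?: string;
}

// True when the thrown value looks like the 401 NotAuthenticated error.
function isNotAuthenticated(err: unknown): boolean {
  const e = err as ServiceError | null | undefined;
  return e != null && e.statusCode === 401 && e.serviceCode === "NotAuthenticated";
}

// Wraps a client factory and rebuilds the client (and with it the auth
// provider) when an operation fails with NotAuthenticated.
class ReinitializingClient<C> {
  private client: C;
  private readonly createClient: () => C;

  constructor(createClient: () => C) {
    this.createClient = createClient;
    this.client = createClient();
  }

  // Runs the operation; on NotAuthenticated, rebuilds the client and
  // retries, at most `maxRebuilds` times. Other errors are rethrown as-is.
  async run<T>(operation: (client: C) => Promise<T>, maxRebuilds = 1): Promise<T> {
    for (let attempt = 0; ; attempt++) {
      try {
        return await operation(this.client);
      } catch (err) {
        if (!isNotAuthenticated(err) || attempt >= maxRebuilds) {
          throw err;
        }
        this.client = this.createClient();
      }
    }
  }
}
```

In the service above, the factory would be something like `() => this.initializeClient()`, and each Object Storage call would go through `run`. As noted, rebuilding the client this way has not reliably cleared the error in my case, so this shows the workaround that was tried, not a fix.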

Questions:
What might cause the NotAuthenticated error to occur intermittently?
Are there specific token expiry/refresh requirements for OkeWorkloadIdentityAuthenticationDetailsProvider that might not be handled correctly?
Could the issue be related to the Kubernetes environment, such as temporary unavailability of the workload identity service?
Is there a recommended way to handle such intermittent issues without needing a server restart?
Any suggestions, insights, or workarounds would be greatly appreciated. Thank you!

@NiviPari - Can you take a look?