Intermittent "ORA-12154: TNS:could not resolve the connect identifier specified"

Question

studds opened this issue 6 months ago · 4 comments

What versions are you using?

Describe the problem

We're running oracledb on AWS Lambda, using the thick client deployed as a layer. For the most part, it works fine. We call initOracleClient({}) once per Lambda cold start. We call getConnection and cache the result only if the connection succeeds.
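
For reference, the "cache only on success" pattern described above might look roughly like the following. This is a minimal sketch and not the poster's actual code: `makeConnectionCache` and the injected `connect` function are hypothetical names, with `connect` standing in for something like a call to `oracledb.getConnection(...)`.

```javascript
// Sketch: cache a connection per container, but only after a successful connect.
// `connect` is a hypothetical stand-in for something like
// () => oracledb.getConnection({ user, password, connectString }).
function makeConnectionCache(connect) {
  let cached = null;  // successful connection, reused across warm invocations
  let pending = null; // in-flight attempt, so concurrent callers share one connect

  return async function getCachedConnection() {
    if (cached) return cached;
    if (!pending) {
      pending = Promise.resolve()
        .then(() => connect())
        .then((conn) => {
          cached = conn; // only a successful connect is ever cached
          return conn;
        })
        .finally(() => {
          pending = null; // a failed attempt leaves nothing behind
        });
    }
    return pending;
  };
}
```

With this shape, a failed attempt rejects for the current caller and the next invocation starts a fresh connect, so a cached failure in the application code can be ruled out when diagnosing errors that persist across invocations.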

But every now and then, we get this error: ORA-12154: TNS:could not resolve the connect identifier specified.

When this error occurs, it seems to happen on the first invocation of that Lambda container and on every subsequent invocation, persisting until the next cold start, even though we call getConnection afresh on each invocation.

Forcing the lambda to use a new container (via a redeploy with no functional changes) seems to resolve the issue.

I suspect this might be caused by a failure in initOracleClient(). Does this sound plausible? Or are there other potential causes for this type of intermittent problem?

Is there some verbose logging we can enable to attempt to diagnose the underlying problem?

Is there any way of forcing a complete refresh of the Oracle client without restarting the Node process?

Include a runnable Node.js script that shows the problem.

We're not immediately able to provide an independent reproduction of this issue, as the code is proprietary.

Answer 1 · 2024-07-19T07:01:19.000Z

Are you using a tnsnames.ora alias to resolve the connect descriptor? If so, can you check that the tnsnames.ora file exists in the proper directory and is accessible? If you are using the TNS_ADMIN environment variable, can you check that it is set?

Additional tracing can be enabled by setting trace_level_client=16 in sqlnet.ora. The traces should be available under $HOME/oradiag_<username>.

Answer 2 · 2024-07-19T07:17:59.000Z

No, we're not using tnsnames.ora - we're using a connection string with the hostname, port, and database name. And most of the time it works.

We're not using a TNS_ADMIN environment variable.

I'll have a look at further tracing next week.

Answer 3 · 2024-07-19T07:29:59.000Z

Thanks for the information. The most likely cause is that the hostname could not be resolved to an IP address. Can you check whether the hostname is resolvable via nslookup on the container?
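
Since running nslookup inside a Lambda container is not always convenient, a similar check can be done from Node.js itself. The sketch below is only an illustration: `checkResolvable`, its `attempts` and `delayMs` options, and the injectable `lookup` parameter are hypothetical names. It uses Node's built-in `dns.promises.lookup`, which goes through the operating system resolver; that is assumed, not confirmed, to be close to what the Oracle client libraries see.

```javascript
const dns = require('node:dns');

// Sketch: check whether `hostname` resolves, retrying a few times and logging
// each failure. `attempts`, `delayMs`, and the injectable `lookup` are
// illustrative options, not part of any library API.
async function checkResolvable(hostname, options = {}) {
  const { attempts = 3, delayMs = 200, lookup = dns.promises.lookup } = options;
  let lastError;
  for (let i = 1; i <= attempts; i += 1) {
    try {
      const { address } = await lookup(hostname);
      return { ok: true, address, attempts: i };
    } catch (err) {
      lastError = err;
      console.warn(
        `DNS lookup ${i}/${attempts} for ${hostname} failed: ${err.code || err.message}`
      );
      if (i < attempts) {
        // Wait briefly before the next attempt.
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  return { ok: false, error: lastError, attempts };
}
```

Calling something like `await checkResolvable('db.example.com')` (a placeholder hostname) just before getConnection and logging the result would show whether ORA-12154 errors line up with failed lookups.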

Answer 4 · 2024-08-19T04:49:10.000Z

Thanks @sreguna, it was an intermittent DNS failure causing our problem. Really appreciate the quick response and helpful guidance.