NJS-040: connection request timeout
todevmilen opened this issue · 15 comments
I receive this error NJS-040: connection request timeout
while the server is running on production. Already check all closed issue regarding this error also read the documentation, but still facing a problem to solve it. Please assist me.
Versions
- Database: 11g
- process.platform: linux
- process.version: v16.14.0
- process.arch: 'x64'
- require('oracledb').versionString: '5.1.0'
- require('oracledb').oracleClientVersionString: 21.5.0.0.0'
Here is the configuration from which i make a pool. Also before I start the index.js file i set the UV_THREADPOOL_SIZE to 25 (because i have two databases with 10 poolMax set.
CMD ["sh", "-c", "export UV_THREADPOOL_SIZE=25 && node index.js"]
module.exports = {
hrPool: {
user: process.env.USER_2,
password: process.env.PASSWORD_2,
connectString: "local",
poolMin: 10,
poolMax: 10,
poolIncrement: 0,
},
hrPoolTest: {
user: process.env.TEST_NAME,
password: process.env.TEST_PASS,
connectString: "test",
poolMin: 10,
poolMax: 10,
poolIncrement: 0,
},
}
I can see in the logs that just before this error occurs the incoming request doesn't close the connections.
Here is example code of the way I create and close connections.
const oracledb = require("oracledb");
async function find(selectItemsParams) {
let conn;
try {
conn = await oracledb.getConnection();
oracledb.outFormat = oracledb.OBJECT;
const result = await conn.execute(
`BEGIN example.slc_items (
:p_item_id,
:p_item_code,
:info_slc_items
); END;`,
{
p_item_id: selectItemsParams.p_item_id,
p_item_code: selectItemsParams.p_item_code,
info_slc_items: {
dir: oracledb.BIND_OUT,
type: oracledb.DB_TYPE_CURSOR,
},
},
{ resultSet: true }
);
const resultSet = result.outBinds.info_slc_items;
let row;
let rows = [];
while ((row = await resultSet.getRow())) {
rows.push(row);
}
await resultSet.close();
return rows;
} catch (err) {
console.error(err);
throw err;
} finally {
if (conn) {
// conn assignment worked, need to close
try {
await conn.close();
console.log("connection is closed;");
} catch (err) {
console.error(err);
}
}
}
}
module.exports.find = find;
I also can see that the number of active handlers/requests is continuously increasing.
Hi @todevmilen , as you mentioned, the incoming requests are not getting closed might be the cause for pool treating all the connections as busy and hence cannot provide a new connection . You have more information on which phase/line number, they are stuck like in conn.execute, rs.getRow, ... I see you have 2 pools created inside application, the incoming requests for both the pools are returning NJS-040 ?
If any one of the queries are stuck, you can enable the callTimeout ,
check firewall settings if idle connections are not dropped out and hence delaying new connection creations ,
other timeout settings to limit the time..
@sudarshan12s ,
I'm not sure exactly on which phase they are stuck, but the requests are coming, no data is returned, the connections are not closing and then the error occurs. Normally these connections are closing without a problem.
I have two pools created, but the default one (first one created) is used a lot more frequent than the second one that is used very rare, so that's why i'm not sure if the problem occurs also for the second pool.
Can you try callTimeout setting (conn.callTimeout = t(in msec) )
before executing the query. It could be due to specific long running queries which are delayed.. Since you have older database version 11g , you may need to add DISABLE_OOB=ON to a sqlnet.ora file.
The problem is that i can't catch exactly on which query it happens, more also we have reports that need different time based on period and is hard to set a fixed timeout. Is there anything else that i can check to resolve this issue ?
ok understood, I mean its just a max time for any query, if it can be derived.
enabling logs (export DPI_DEBUG_LEVEL=92 ) would help to see the API details like establishing a connection (dpiConn_create
) , running a query (dpiStmt_execute
) but since its on production, you might not be able to do it. We might otherwise need to check on server side to see what active sqls are run from active sessions (from username, machine ..) at that time..
From what I saw for the most of the time there is only one active session that switch from active to unactive and from time to time there is two max three active sessions.
From what I saw for the most of the time there is only one active session that switch from active to unactive and from time to time there is two max three active sessions.
I suspect few previous connections/sockets have not got terminated gracefully. timeout configurations should cleanup such connections..
This could be due to a firewall dropping packets for idle connections. You can enable TCP Keepalives by setting expireTime(https://node-oracledb.readthedocs.io/en/latest/api_manual/oracledb.html)
A good value would be to set it to 1(1 minute).
I can't find where to place the expireTime
. I'm using Oracle 11g and all the information i see is for greater version.
timeout
I think the 11g server does have support. Since you have 21.5 instant client, It can be placed in connect string as below or client sqlnet.ora ( SQLNET.EXPIRE_TIME=1 ).
Easy Connect: host/service?expire_time=n.
ex: "localhost:1521/testServ?expire_time=1
With this setting probe packets can be seen through tcpdump at regular intervals , ... Is this not working?
From doc:
Setting EXPIRE_TIME in C Clients:
- With 18c+ client libraries it can be added as (EXPIRE_TIME=n) to the DESCRIPTION section of a connect descriptor
- With 19c+ client libraries it can be used via Easy Connect: host/service?expire_time=n.
- With 21c client libraries it can be used in a client-side sqlnet.ora.
I'm using tnsnames.ora
file. Is that a correct place where EXPIRE_TIME=1
should stay?
XE_test =
(DESCRIPTION =
(EXPIRE_TIME=1)
(ADDRESS = (PROTOCOL = TCP)(HOST = xxx.xxx.x.xxx )(PORT = 1521))
(CONNECT_DATA =
(SERVER = DEDICATED)
(SERVICE_NAME = XE.local)
)
)
I'm using
tnsnames.ora
file. Is that a correct place whereEXPIRE_TIME=1
should stay?XE_test = (DESCRIPTION = (EXPIRE_TIME=1) (ADDRESS = (PROTOCOL = TCP)(HOST = xxx.xxx.x.xxx )(PORT = 1521)) (CONNECT_DATA = (SERVER = DEDICATED) (SERVICE_NAME = XE.local) ) )
Yes its correct
Ok, great. I will place it there to check if this will fix the problem.
I've made the changes, but today the error shows again :(.
Hi @todevmilen , You have more information on reproducing scenario . There are connect times which can be tuned for getting a connection, callTimeout for query executions ,..
If you can share more traces (DPI_LOG_LEVEL=92) , It would be easier to find the exact function call and debug further..