@cap-js/postgres - cds returns ResourceRequest timed out response.
BartekCapgemini opened this issue · 3 comments
Description of erroneous behaviour
Hello!
While developing our project, I ran into a problem that seems to be related to the acquireTimeoutMillis and destroyTimeoutMillis values.
We use a fairly large view that takes some time to process on the server side. Unfortunately, it takes so long that it causes timeouts, which break the application flow. The frontend usually sends 4-7 requests per batch to the server. The first requests are typically handled without issues, but as soon as we select something from this larger view, which has to process a lot of data, all following requests time out. The issue occurs mostly when requests are sent in batches; reading the big entity on its own works without any problems.
The timeouts appeared exactly one second after a request was sent, so we quickly suspected that it might be related to the Postgres plugin.
Here is the error stack trace from the application logs:
TimeoutError: ResourceRequest timed out
at ResourceRequest._fireTimeout (/home/vcap/app/node_modules/generic-pool/lib/ResourceRequest.js:62:17)
at Timeout.bound (/home/vcap/app/node_modules/generic-pool/lib/ResourceRequest.js:8:15)
at listOnTimeout (node:internal/timers:559:17)
at processTimers (node:internal/timers:502:7)
Active connections:1
Error: begin called from:
at Pool.acquire (/home/vcap/app/node_modules/@cap-js/db-service/lib/common/generic-pool.js:18:44)
at runMicrotasks (<anonymous>)
at processTicksAndRejections (node:internal/process/task_queues:96:5)
at async PostgresService.acquire (/home/vcap/app/node_modules/@cap-js/db-service/lib/common/DatabaseService.js:105:12)
at async PostgresService.begin (/home/vcap/app/node_modules/@cap-js/db-service/lib/common/DatabaseService.js:65:16)
at async PostgresService._begin (/home/vcap/app/node_modules/@sap/cds/lib/srv/srv-tx.js:205:3)
at async CatService.<anonymous> (/home/vcap/app/node_modules/@sap/cds/libx/_runtime/common/generic/crud.js:73:16)
at async next (/home/vcap/app/node_modules/@sap/cds/lib/srv/srv-dispatch.js:79:17)
at async CatService.handle (/home/vcap/app/node_modules/@sap/cds/lib/srv/srv-dispatch.js:77:10)
at async _getCount (/home/vcap/app/node_modules/@sap/cds/libx/_runtime/cds-services/adapter/odata-v4/handlers/read.js:104:20
For now, we have managed to work around the issue temporarily by increasing the timeout values directly in the plugin's source code. So my question is whether this timeout could be increased, or somehow exposed for configuration in package.json.
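Something along these lines is what I have in mind, assuming the pool options from package.json would simply be passed through to generic-pool (acquireTimeoutMillis and destroyTimeoutMillis are the generic-pool option names; I have not confirmed that they are currently honored):

```jsonc
{
  "cds": {
    "db": {
      "impl": "@cap-js/postgres",
      "pool": {
        // generic-pool option names, purely illustrative;
        // not confirmed to be passed through in the current plugin version
        "acquireTimeoutMillis": 10000,
        "destroyTimeoutMillis": 10000
      }
    }
  }
}
```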
Thanks in advance!
Details about project
| Package | Version |
|---|---|
| OData version | v4 / v2 |
| Node.js version | v16.16 |
| @sap/cds | 7.5.2 |
| @sap/cds-dk | 7.3.0 |
| @cap-js/postgres | 1.4.1 |
@BartekCapgemini thanks for the report.
There does not seem to be an issue with the default acquire timeout. The problem is most likely that the pool implementation has a default max size of 1, which you can increase through the service configuration by adding it to the pool property, as you can see in the source code you linked. The configuration should look something like this:
```jsonc
{
  "cds": {
    "db": {
      "impl": "@cap-js/postgres",
      "pool": {
        "max": 10 // configured according to your predicted service load
      }
    }
  }
}
```
Additionally, I would like to add that, due to the nature of the Postgres network protocol, all queries on a single connection are executed in sequence. So if you have a large $batch request with a changeset that executes many different queries inside the same transaction, they will run one after another, as the TCP connection can only process a single query at a time. Requests inside a $batch that are outside of any changeset, on the other hand, execute in parallel and will greatly benefit from a larger max being configured.
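To illustrate the difference, here is a minimal sketch using standard CAP Node.js APIs; the Books and Authors entities are placeholders, not taken from this project:

```js
const cds = require('@sap/cds')
const { SELECT } = cds.ql

async function demo () {
  // Inside one transaction (comparable to a $batch changeset):
  // both queries share a single pooled connection and run one after another.
  await cds.tx(async tx => {
    await tx.run(SELECT.from('Books'))   // placeholder entity names
    await tx.run(SELECT.from('Authors'))
  })

  // Independent requests (comparable to $batch requests outside any changeset):
  // each transaction acquires its own connection, so with "pool": { "max": 10 }
  // they can be processed in parallel.
  await Promise.all([
    cds.tx(tx => tx.run(SELECT.from('Books'))),
    cds.tx(tx => tx.run(SELECT.from('Authors')))
  ])
}

module.exports = { demo }
```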
@BobdenOs thank you for the reply.
Luckily, we were able to solve most of the timeout-related performance issues by optimizing the underlying requests to both the service and the database.
For testing purposes, I restored the default timeout values (1000 ms) and set the pool max to 10 to see whether we would still face the same issue after all the optimizations. At first glance everything worked properly, but the ResourceRequest timed out error still appears.
The issue happens when the user requests a column (in the smart table, not as part of a larger batch, since all other data is loaded beforehand) that is dynamically calculated on the Postgres side. This particular column relies on some regular expressions and array calculations, but it runs just fine when queried in pgAdmin, for example. We still get timeouts after 30 s, so we had to once again increase the default values in the source code.