telefonicaid/fiware-sth-comet

Sth-Comet does not reconnect with MongoDB.

Closed this issue · 17 comments

Issue Description:
When MongoDB is restarted, STH-Comet and MongoDB are disconnected and do not reconnect. The API responds normally even though the connection with MongoDB is lost, so the user doesn't know that an error has occurred. Shouldn't the API respond with an error?
Also, it is not possible to tell from the log that the connection with MongoDB has been lost. Shouldn't the loss of connection be reported in the error log?
The connection is not restored even after MongoDB is back up. Shouldn't it reconnect?

Reproduction Steps:
1. Store the history data in MongoDB.
2. Start STH-Comet and MongoDB.
3. Execute the API on STH-Comet and confirm that you can get the history data.
4. Stop only MongoDB and restart it after 5 minutes.
5. Execute the API on STH-Comet and confirm that you cannot get the history data (status: 200 OK, but the values field is empty).

API used for getting history data:
curl -X GET 'localhost:8666/STH/v1/contextEntities/type/Room/id/Room1/attributes/temperature?lastN=1' -H 'fiware-service: test' -H 'fiware-servicePath: /'

Output:
{"contextResponses":[{"contextElement":{"attributes":[{"name":"temperature","values":[]}],"id":"Room1","isPattern":false,"type":"Room"},"statusCode":{"code":"200","reasonPhrase":"OK"}}]}

Expected output:
{"contextResponses":[{"contextElement":{"attributes":[{"name":"temperature","values":[{"recvTime":"2021-09-17T08:06:25.046Z","attrType":"string","attrValue":"190"}]}],"id":"Room1","isPattern":false,"type":"Room"},"statusCode":{"code":"200","reasonPhrase":"OK"}}]}
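Since the status code is 200 OK in both outputs, a script can only tell them apart by looking at the values array. Below is a minimal Python sketch of such a check, using the two response bodies above. The helper name has_history is hypothetical and not part of STH-Comet, and an empty values array on its own does not prove that the connection is lost.

```python
import json


def has_history(sth_response_body: str) -> bool:
    """Return True if any attribute in an STH raw-data response has at least one value.

    has_history is a hypothetical helper for illustration, not an STH-Comet API.
    """
    body = json.loads(sth_response_body)
    for context_response in body.get("contextResponses", []):
        attributes = context_response.get("contextElement", {}).get("attributes", [])
        for attribute in attributes:
            if attribute.get("values"):
                return True
    return False


# Response bodies copied from the "Output" and "Expected output" above.
observed = '{"contextResponses":[{"contextElement":{"attributes":[{"name":"temperature","values":[]}],"id":"Room1","isPattern":false,"type":"Room"},"statusCode":{"code":"200","reasonPhrase":"OK"}}]}'
expected = '{"contextResponses":[{"contextElement":{"attributes":[{"name":"temperature","values":[{"recvTime":"2021-09-17T08:06:25.046Z","attrType":"string","attrValue":"190"}]}],"id":"Room1","isPattern":false,"type":"Room"},"statusCode":{"code":"200","reasonPhrase":"OK"}}]}'

print(has_history(observed))  # False: 200 OK but no history returned
print(has_history(expected))  # True: history data present
```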

Hi @fgalan

As per our understanding, this seems to be a bug. Please share your opinion.

Could anyone please confirm whether this is a bug or the intended behaviour of STH-Comet?
I have updated the issue description.

We'll have a look. Thanks for the feedback!

I confirm that I could reproduce the problem @SwatiNEC reported.

Additional information:

  • STH Version: Latest (2.8.0)

Procedure:

  1. Bring up the containers
  2. Create the subscription
  3. Create the entity
  4. Modify the entity several times
  5. Get historical data from STH - Everything works well
  6. Stop MongoDB container for a while
  7. Resume the same container
  8. Get historical data - Empty result but a response code 200

The log is printing this:

fiware-sth-comet | time=2021-09-28T13:14:32.347Z | lvl=WARN | corr=0616e74d-914d-447e-ba52-747fa3dbe9bc | trans=0616e74d-914d-447e-ba52-747fa3dbe9bc | op=OPER_STH_GET | from=n/a | srv=test | subsrv=/ | comp=STH | msg=Error when getting the raw data collection for retrieval (the collection 'undefined' may not exist)

Complete log after rebooting the DB container here

Requests used:
Create subscription

curl -iX POST \
  --url 'http://localhost:1026/v2/subscriptions' \
  --header 'content-type: application/json' \
  -H 'fiware-service: test' -H 'fiware-servicePath: /' \
  --data '{
  "description": "HTTP sub",
  "subject": {
    "entities": [
      {
        "id": "E",
        "type": "T"
      }
    ]
  },
  "notification": {
    "http": {
      "url": "http://sth-comet:8666/notify"
    },
    "attrsFormat": "legacy"
  }
}'

Create entity

curl -iX POST \
  --url 'http://localhost:1026/v2/entities' \
  --header 'content-type: application/json' \
  -H 'fiware-service: test' -H 'fiware-servicePath: /' \
  --data '{
  "id": "E",
  "type": "T",
  "A": {
    "value": 1,
    "type": "Float"
  }
}'

Update entity

curl -iX POST --url 'http://localhost:1026/v2/entities/E/attrs' --header 'content-type: application/json' -H 'fiware-service: test' -H 'fiware-servicePath: /' --data '{"A":{"value": 2,"type": "Float"}}'

Query STH

curl -X GET 'localhost:8666/STH/v1/contextEntities/type/T/id/E/attributes/A?lastN=1' -H 'fiware-service: test' -H 'fiware-servicePath: /'

docker-compose.yml

version: "3.5"
services:
  orion:
    image: telefonicaiot/fiware-orion:3.2.0
    hostname: orion
    platform: linux/amd64
    container_name: fiware-orion
    depends_on:
      - mongo-db
    expose:
      - "1026"
    ports:
      - "1026:1026" 
    command: -dbhost mongo-db -logLevel DEBUG -mqttMaxAge 5

  mongo-db:
    image: mongo:4.4
    hostname: mongo-db
    platform: linux/amd64
    container_name: db-mongo
    expose:
      - "27017"
    ports:
      - "27017:27017"

  sth-comet:
      image: fiware/sth-comet
      hostname: sth-comet
      container_name: fiware-sth-comet
      depends_on:
          - mongo-db
      ports:
          - "8666:8666"
      environment:
          - STH_HOST=0.0.0.0
          - STH_PORT=8666
          - DB_PREFIX=sth_
          - DB_URI=mongo-db:27017
          - LOGOPS_LEVEL=DEBUG

Hi @fgalan @mapedraza

The same error log is also printed when querying unregistered data, even when STH-Comet and MongoDB are connected.

Reproduction Steps:

  1. Start STH-Comet and MongoDB. (STH-Comet and MongoDB are connected.)
  2. Execute the STH-Comet API to query the data. (Specify an unregistered ID in the ID parameter.)

comet | time=2021-09-29T04:43:29.794Z | lvl=WARN | corr=12ef225a-9e1a-4ef4-8180-265d8725c702 | trans=12ef225a-9e1a-4ef4-8180-265d8725c702 | op=OPER_STH_GET | from=n/a | srv=test | subsrv=/ | comp=STH | msg=Error when getting the raw data collection for retrieval (the collection may not exist)

Hi @fgalan @mapedraza

I have fixed the issue of reconnection with MongoDB. Now I am working on error handling for the case where the MongoDB connection fails.

Great!

Did you solve it by modifying the STH code? Could you contribute your changes with a pull request in this repository, please? Thanks!

Hi @fgalan

Yes, I have modified the STH code. Currently, I am working on its error handling.
As per my understanding, if STH is not connected to MongoDB, the error message "Database is not connected" will be shown in the logs instead of the message "Error when getting the raw data collection for retrieval (the collection 'undefined' may not exist)".
And "500 Internal server error" will be returned in the API response.

We are using STH v2.3.0. So, is it possible to raise a PR for v2.3.0 as well?
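To make the proposed error handling concrete, here is a language-neutral sketch in Python. It is not the STH-Comet code (which is Node.js) and not the code of any PR. The function handle_query, the db_connected flag and the run_query callback are all hypothetical names, and the shape of the 500 response body is an assumption. Only the log text and the 500 status come from the proposal above.

```python
import logging
from typing import Any, Callable, Tuple

logger = logging.getLogger("sth-sketch")  # hypothetical logger name


def handle_query(db_connected: bool, run_query: Callable[[], Any]) -> Tuple[int, Any]:
    """Return (status_code, body) for a history query.

    Hypothetical sketch: when the database is not connected, log the
    proposed message and answer 500 instead of 200 with empty values.
    """
    if not db_connected:
        logger.error("Database is not connected")
        # Body shape is an assumption made for this sketch only.
        return 500, {"error": "Internal server error"}
    return 200, run_query()


# Usage: a disconnected database yields 500, a connected one runs the query.
print(handle_query(False, lambda: {"values": []}))
print(handle_query(True, lambda: {"values": [1]}))
```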

I agree with your understanding.

It's possible, of course (using release/2.3.0 as base branch), but we would need to freeze a 2.3.1 after all. Maybe it's better to do the PR on master, then freeze STH 2.9.0 version and you upgrade STH to 2.9.0?

Hi @fgalan

What is your opinion regarding this error?

Thanks for your quick response. We are working on it.

Not sure but I'd say that this error is fine in that situation (i.e. STH-Comet and MongoDB are connected). However, if you have some idea to improve the text, please go ahead with the suggestion.

@SwatiNEC, after the merge of PR #565, could you please retry your test?

Hi @AlvaroVega, @fgalan

Initially, it was agreed that the MongoDB driver needs to be upgraded first (#560 (comment)), and that the fix would be done afterwards.
But we observed that the MongoDB driver was not upgraded in PR #565. Could you please explain?

The fix is working fine, but we had discussed that if the database is not connected and any API is requested, it should return "500 internal server error" and the logs should display "database is not connected". Please find the reference for that discussion: #559 (comment)

Currently, if STH-Comet receives any API request, no response is returned and no error logs are printed.

The work has been divided into pieces. First, a simple fix with the existing MongoDB driver version has been developed (in PR #565). The second step would be to upgrade the MongoDB driver. In some sense they are the same steps described in the comment you cite, but in reverse order.

With regard to the second step (MongoDB driver version upgrade), there are two PRs on the table for that (an automatic one and the one from @Gauravp-NEC), but both show failing tests in the GitHub Actions pass, so they probably need some extra work.

Fix is working fine...

Nice! That validates the fix done in PR #565

...but we had discussed that if the database is not connected and any API is requested, it should return "500 internal server error" and the logs should display "database is not connected". Please find the reference for that discussion: #559 (comment)

Currently, if STH-Comet receives any API request, no response is returned and no error logs are printed.

From my point of view, implementing such behaviour (which would be nice) could be considered a third step. Did your PR #560 address that?

Thanks for the feedback! I hope this clarifies things.

Hi @fgalan

Yes, my PR addressed that issue in its implementation. Also, I would like to know your target for step 3.

Migration to the MongoDB 3.3.6 driver was finally accomplished in PR #568. I have created a follow-up issue for the pending item about 500 responses: #570. Let's close this one (which is about reconnections and has been fixed) and continue the discussion there.