FreeRTOS/iot-reference-esp32

[BUG] <mqtt reconnects the question>

KeysPAN0114 opened this issue · 7 comments

Hi, I have a question. In normal use, unstable Wi-Fi causes the MQTT keep-alive to time out, which occasionally triggers an MQTT reconnection. How can I handle this situation? For example, I am in China and the customer requires the device to connect to AWS in South Korea. The network environment in China is poor and access to external networks is sometimes blocked. As a result, the module's own network connection is fine, but the module may not receive the response from the server after sending mqtt ping. After that, the module sometimes fails to reconnect. MQTT reconnection only resumes once the module's Wi-Fi connection has been disconnected and re-established.

aggarg commented

but the module may not receive the response from the server after sending mqtt ping.

If a PINGRESP is not received within the specified timeout, the connection is assumed to be dead and a re-connection attempt is made. Is that not the case?
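The timeout rule described above can be sketched in plain C. This is only an illustration of the idea, not the coreMQTT implementation: the `keepalive_t` type, the `keepalive_*` function names, and the millisecond clock passed in by the caller are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical keep-alive bookkeeping; these are NOT coreMQTT structures. */
typedef struct {
    bool     ping_outstanding;     /* PINGREQ sent, PINGRESP not yet seen */
    uint32_t ping_sent_ms;         /* time at which the PINGREQ was sent  */
    uint32_t pingresp_timeout_ms;  /* how long to wait for the PINGRESP   */
} keepalive_t;

static void keepalive_init(keepalive_t *ka, uint32_t timeout_ms)
{
    assert(ka != NULL && timeout_ms > 0u);
    ka->ping_outstanding = false;
    ka->ping_sent_ms = 0u;
    ka->pingresp_timeout_ms = timeout_ms;
}

/* Record that a PINGREQ has just been sent. */
static void keepalive_on_ping_sent(keepalive_t *ka, uint32_t now_ms)
{
    ka->ping_outstanding = true;
    ka->ping_sent_ms = now_ms;
}

/* Record that the matching PINGRESP arrived. */
static void keepalive_on_pingresp(keepalive_t *ka)
{
    ka->ping_outstanding = false;
}

/* Returns true when the connection should be treated as dead. */
static bool keepalive_is_dead(const keepalive_t *ka, uint32_t now_ms)
{
    if (!ka->ping_outstanding) {
        return false;
    }
    /* Unsigned subtraction stays correct when the tick counter wraps. */
    return (uint32_t)(now_ms - ka->ping_sent_ms) >= ka->pingresp_timeout_ms;
}
```

A caller would poll `keepalive_is_dead()` from its receive loop and start the disconnect and reconnect path as soon as it returns true.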

KeysPAN0114 commented

Yes, but sometimes the module won't trigger reconnection. My current workaround: if the module has not reconnected to MQTT within a specified time, I disconnect and reconnect the Wi-Fi, which triggers the MQTT connection again.

aggarg commented

Yes, but sometimes the module won't trigger reconnection.

This should not happen. How do you determine that the connection is dead and needs re-connecting?

KeysPAN0114 commented

I am testing this now. If it happens again I will post log screenshots. Please do not close this discussion yet.

KeysPAN0114 commented

It looks like this: coreMQTT did not reconnect. I set up a detection function with a 15-second check: if MQTT has not reconnected within 15 seconds, it disconnects the Wi-Fi and reconnects it, and then coreMQTT reconnects.
[Attached image: log screenshot]
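The workaround described in this comment could be structured roughly as follows. This is a sketch under stated assumptions, not the poster's actual code: the `reconnect_watchdog_t` type and the `watchdog_*` functions are hypothetical, and the only value taken from the thread is the 15-second deadline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RECONNECT_DEADLINE_MS 15000u  /* 15 s, as stated in the comment */

/* Hypothetical watchdog state; not part of the reference project. */
typedef struct {
    bool     disconnected;        /* MQTT is currently down         */
    uint32_t disconnected_at_ms;  /* start of the current wait window */
} reconnect_watchdog_t;

static void watchdog_init(reconnect_watchdog_t *wd)
{
    assert(wd != NULL);
    wd->disconnected = false;
    wd->disconnected_at_ms = 0u;
}

/* Call when the MQTT connection is lost. Repeated calls keep the first timestamp. */
static void watchdog_on_mqtt_disconnected(reconnect_watchdog_t *wd, uint32_t now_ms)
{
    if (!wd->disconnected) {
        wd->disconnected = true;
        wd->disconnected_at_ms = now_ms;
    }
}

/* Call when the MQTT connection is established again. */
static void watchdog_on_mqtt_connected(reconnect_watchdog_t *wd)
{
    wd->disconnected = false;
}

/* Call periodically. Returns true when the Wi-Fi link should be restarted.
 * The window is re-armed so that a restart is requested at most once per deadline. */
static bool watchdog_should_reset_wifi(reconnect_watchdog_t *wd, uint32_t now_ms)
{
    if (!wd->disconnected) {
        return false;
    }
    if ((uint32_t)(now_ms - wd->disconnected_at_ms) < RECONNECT_DEADLINE_MS) {
        return false;
    }
    wd->disconnected_at_ms = now_ms;
    return true;
}
```

On ESP-IDF, the caller might respond to a true result by calling `esp_wifi_disconnect()` followed by `esp_wifi_connect()`. A watchdog like this only masks the symptom, so it is still worth finding out why the connection task does not retry on its own.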

aggarg commented

This log is present in your logs - https://github.com/FreeRTOS/iot-reference-esp32/blob/main/main/networking/mqtt/core_mqtt_agent_manager.c#L890.

After that CORE_MQTT_AGENT_DISCONNECTED_BIT should get set in the xNetworkEventGroup here and this should break this loop and the connection should be retried here. Can you check why prvCoreMqttAgentConnectionTask is not re-trying the connection? You can try increasing its priority.
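The flow described above (a disconnect sets a bit, the connection task sees the bit and retries) can be modelled without FreeRTOS as a rough sketch. Everything here except the name `CORE_MQTT_AGENT_DISCONNECTED_BIT` is an assumption: the bit value, the `event_bits_t` stand-in for the event group, and the helper functions are hypothetical and do not mirror the code in `core_mqtt_agent_manager.c`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit position is an assumption for this sketch, not the repository's value. */
#define CORE_MQTT_AGENT_DISCONNECTED_BIT (1u << 1)

typedef uint32_t event_bits_t;            /* plain stand-in for an event group */
typedef bool (*connect_fn_t)(void *ctx);  /* hypothetical connect callback     */

/* Disconnect path: mark the link as down so the connection task wakes up. */
static void on_agent_disconnected(event_bits_t *bits)
{
    *bits |= CORE_MQTT_AGENT_DISCONNECTED_BIT;
}

/* One pass of a connection task: if the disconnected bit is set, retry the
 * connection up to max_attempts times and clear the bit on success.
 * Returns the number of attempts made. */
static unsigned connection_task_step(event_bits_t *bits, connect_fn_t connect,
                                     void *ctx, unsigned max_attempts)
{
    assert(bits != NULL && connect != NULL);
    unsigned attempts = 0u;

    if ((*bits & CORE_MQTT_AGENT_DISCONNECTED_BIT) == 0u) {
        return 0u;  /* still connected, nothing to do */
    }
    while (attempts < max_attempts) {
        attempts++;
        if (connect(ctx)) {
            *bits &= (event_bits_t)~CORE_MQTT_AGENT_DISCONNECTED_BIT;
            break;
        }
    }
    return attempts;
}

/* Demo connect function: fails until the counter behind ctx reaches zero. */
static bool demo_connect(void *ctx)
{
    unsigned *failures_left = ctx;
    if (*failures_left > 0u) {
        (*failures_left)--;
        return false;
    }
    return true;
}
```

In real firmware the bit would be set with `xEventGroupSetBits()` and the task would block in `xEventGroupWaitBits()`. If the task never runs after the bit is set, for example because higher-priority tasks keep the CPU busy, no retry happens, which is consistent with the suggestion to raise its priority.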

KeysPAN0114 commented

I'll give it a try.