Slow to publish message
law-ko opened this issue · 20 comments
Describe the bug
We have realized the time to get the message published to AWS IoT takes around 2 seconds. Any method to reduce this time?
System information
Running on ESP32
Screenshots or console output
I (59746) fleet_prov_by_claim_demo: Sending publish request to coreMQTT-Agent with message "{"state":{"reported":{"accessories":{"outlet":0}},"desired":null}}" on topic "$aws/things/XXX/shadow/name/XXX/update" with ID 11.
I (59776) aws_demo: Waiting for publish 11 to complete.
I (60476) coreMQTT: Publishing message to $aws/things/XXX/update.
I (61696) coreMQTT: Ack packet deserialized with result: MQTTSuccess.
I (61696) coreMQTT: State record updated. New state=MQTTPublishDone.
From sending publish request (59746) to MQTTPublishDone (61696): 1950
The code to publish to AWS IoT
ESP_LOGI( TAG,
"Sending publish request to coreMQTT-Agent with message \"%s\" on topic \"%s\" with ID %ld.",
pcPayload,
pcTopicName,
ulPublishMessageId );
/* To ensure ulNotification doesn't accidentally hold the expected value
* as it is to be checked against the value sent from the callback.. */
ulNotifiedValue = ~ulPublishMessageId;
xCommandAcknowledged = pdFALSE;
xCommandAdded = MQTTAgent_Publish( &xGlobalMqttAgentContext,
&xPublishInfo,
&xCommandParams );
if( xCommandAdded == MQTTSuccess )
{
/* For QoS 1 and 2, wait for the publish acknowledgment. For QoS0,
* wait for the publish to be sent. */
ESP_LOGI( TAG,
"Waiting for publish %ld to complete.",
ulPublishMessageId );
xCommandAcknowledged = prvWaitForNotification( &ulNotifiedValue );
}
else
{
ESP_LOGE( TAG,
"Failed to enqueue publish command. Error code=%s",
MQTT_Status_strerror( xCommandAdded ) );
}
/* Check all ways the status was passed back just for demonstration
* purposes. */
if( ( xCommandAcknowledged != pdTRUE ) ||
( xCommandContext.xReturnStatus != MQTTSuccess ) ||
( ulNotifiedValue != ulPublishMessageId ) )
{
ESP_LOGW( TAG,
"Error or timed out waiting for ack for publish message %ld. Re-attempting publish.",
ulPublishMessageId );
}
else
{
ESP_LOGI( TAG,
"Publish %ld succeeded.",
ulPublishMessageId );
}
Hey @law-ko, thanks for reaching out about this issue, and sorry for the delay in getting a response.
Is this long delay an issue that you are still facing? Or have you found a root cause of this? And is the problem that it is taking 2 seconds for the message to reach the AWS IoT endpoint? Or two seconds for the return traffic as well?
I would be curious to know the value of this macro MQTT_AGENT_MAX_EVENT_QUEUE_WAIT_TIME.
If you want the MQTT Agent to poll the responses from the broker faster, you should reduce this value to 50 (in milliseconds) or something which better suits your needs.
Let us know if that solved your issue.
Thanks,
Aniruddha
Hi! Im facing the exact same issue. Have tried to modify MQTT_AGENT_MAX_EVENT_QUEUE_WAIT_TIME to 10 and still takes about 2-3 seconds to publish :(
Digging deeper on coreMQTT-Agent found that the publish message waits for a "publish completition" from the agent. The blocking behavior is due to the xTaskNotifyWait on the program on prvWaitForNotification( &ulNotifiedValue ), that should be notified by prvPublishCommandCallback when a publish action has been completed by agent. Btw this is all QoS_0 messages so no server ACK is expected.
I can't seem to find where to speed up this behavior.
Hey @aidiaz, you could try increasing the priority of the MQTT task to ensure it's the highest priority? Or maybe reducing the the send block time might also help with that?
I'm just thinking that conceptually either of these could be causing the delay in when messages get published. If modifying these don't work I can look further into the cause of the delay.
- Hello @moninom1, unfortunately it did not help. Set up the task to maximum freertos priority and cut all wait times or timeouts to 10ms, still getting publishes between 3 seconds on IoT Core.
Hi @aidiaz, wondering if you have tried any other network activtiy on the HW setup (other than MQTT) you have, if so, what latency are you getting?
By HW do you mean my esp32 board? Or Networks gateway/router. I have tried several boards, devkits, wifi networks, sharing from cellphone. No differences at all.
Hi @law-ko, @aidiaz,
Have you ever checked the CPU loading while publishing messages?
Refer to Run Time Statistics for FreeRTOS API to query CPU loading on every task.
If CPU is blocking on some other tasks, it doesn't have any performance to execute publishing until unblock.
Thanks.
Hi @ActoryOu, will be checking the statistics, but Im pretty sure nothing is blocking since already tried eliminating every other task in our program
@ActoryOu so I took some time to do test with all the suggestions here but still can't publish messages faster than 3 seconds between them. I blocked some naming on the statistics below.
aws_pub_task enqueues the data into the DataMQTT task where the queue is read in the loop. Both task have high priority respect the rest.
Found this https://www.esp32.com/viewtopic.php?t=32253 where a TLS network context modification is suggested. Still not working as expected.
EDITED
This seems to cover up some of the same topics too https://forums.freertos.org/t/mqttagent-publish-is-slow-as-compare-to-freertos-mqtt/15559/23
EDITED This seems to cover up some of the same topics too https://forums.freertos.org/t/mqttagent-publish-is-slow-as-compare-to-freertos-mqtt/15559/23
I only just saw this now. I should point out that some of those changes have already been integrated into this repo.
@txf-, thank you for reporting back on the forum post and on this issue.
@aidiaz did the solution provided in the above forum post work for you?
If not, I suggest you step through the code to see where it blocks. Since you mentioned that all of the messages are QoS0, the only place where I think that the code would block is at the socket_recv function. Because otherwise the Agent is supposed to return immediately - see here that the addWaitingOperation function is only called if the publish is >= QoS1.
@AniruddhaKanhere, I have already checked where the program blocks, if you read the previous replies (#44 (comment)) No solution found so far, please help us!
I can confirm (without a single doubt) that there is no way to send faster than 3 seconds with this repo as is.
Hello @aidiaz, very sorry to hear that your issue has not been solved yet.
Have you tried to make the socket being used by the MQTT agent non-blocking? Since you are using lwIP, a call to setsockopt should do the trick. Can you try setting the timeout to something very small like 1ms?
Let us know if that helps.
This would allow the Agent not to block when calling MQTT_ProcessLoop after each publish and would allow your code to progress faster.
@AniruddhaKanhere unfortunately I'm currently working on a esp32 board. Any suggestion? I would check if any similar shows up anyway.
Hello @aidiaz,
Yes, since you are working on esp32, you most probably are using lwIP. So if you could follow my previous advice, it should work for you. Can you inform us how it goes?
Closing this issue due to lack of response.
Please feel free to reopen this issue or create a new one.