LabVIEW-Open-Source/MQTT-Client

Memory leak in SendPacket.vi

kjkirkegaard opened this issue · 6 comments

Hi

While searching for an explanation for why it takes on average 1.4 ms to publish to a topic, I think I found a memory leak in SendPacket.vi.
We only use QoS = 0 in this application because we need speed, and we can survive losing packets. As I understand it, when a packet is sent with QoS = 0 the MQTT broker will not reply with an ACK, and this is where the problem starts, because every time we send a packet the PacketID is stored in the PacketIDs DVR,
[image]
but it will never be removed from the PacketIDs array, because no ACK is returned to the MQTT client, and therefore the array will keep growing.

One solution could be to use the Wait for ack? input to skip saving the PacketID when we don't need to wait for an ACK (see the sketch after the screenshots below).
But I'm not 100% sure that the PacketID is only used for verifying ACKs; if it's not, then this is not the correct solution.
[image]
[image]
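To make the pattern concrete outside LabVIEW, here is a minimal Python sketch of what I believe is happening; `packet_ids`, `send_packet`, and `on_ack` are hypothetical names for illustration, not the toolkit's actual API:

```python
import itertools

# Hypothetical sketch of the pattern (not the toolkit's real API):
# packet_ids plays the role of the PacketIDs DVR in SendPacket.vi.
packet_ids = set()
_next_id = itertools.count(1)

def send_packet(payload: bytes, qos: int) -> int:
    """Current behavior: the ID is stored unconditionally."""
    packet_id = next(_next_id)
    packet_ids.add(packet_id)
    # For QoS = 0 the broker never replies with a PUBACK, so
    # on_ack() is never called and packet_ids grows forever.
    return packet_id

def on_ack(packet_id: int) -> None:
    """Only ever reached for QoS > 0 publishes."""
    packet_ids.discard(packet_id)

def send_packet_fixed(payload: bytes, qos: int, wait_for_ack: bool) -> int:
    """Proposed fix: only track the ID when an ACK is expected."""
    packet_id = next(_next_id)
    if qos > 0 and wait_for_ack:
        packet_ids.add(packet_id)
    return packet_id
```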

But I did not find an explanation for why it takes on average 1.4 ms to publish to a topic.

Regards
Kaare Juul Kirkegaard

That's a good observation. This mechanism was put in place to handle QoS, and I agree with you that it should be skipped for QoS = 0.
As for the 1.4 ms latency, can you open another issue describing how you decoupled the part of the latency due to the connection from the internal packet handling? I'd like to create an official benchmark for the toolkit, so we can leverage the unit test framework in assessing this important parameter.

Yes, I will start a new issue and explain how I measured the publish time.

Memory Leak

The memory leak will be fixed in the upcoming release.
@kjkirkegaard Thanks much for the report.

Latency

I just tested the following pathway, which publishes a raw payload at QoS = 0:
[image]
(image taken from https://github.com/LabVIEW-Open-Source/LV-MQTT-Broker/wiki/Publish-message-with-QoS-=-1)

The payload used was 4 bytes, for a packet size of 15 bytes.
To decouple the effect of a server, I ran a parallel loop with a TCP Read, completely ignoring the time to unpack the 15 bytes and extract the topic name. This setup also removed the effect of the broker receiving and relaying the incoming bytes to the session/subscription engine.

With this simple setup, the client was dispatching the Publish packet to a session loop in ~48 µs.
I did not try to measure the session process alone, as it runs in a parallel thread, but considering that the TCP read is probably negligible, the second leg (the session handling the bytes, sending by TCP, and receiving in a second loop) ran in about 35 µs when my processor was not doing something else in the background. (Each point is the result of averaging the latency measured over 1000 packets, or 15,000 bytes.) Repeating with 10,000 packets didn't change the result significantly.
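For reference, here is a rough Python equivalent of the setup described above: a parallel TCP sink standing in for the broker, and a hand-built 15-byte QoS 0 PUBLISH packet, which is consistent with a 7-byte topic name plus the 4-byte payload. The topic name, port, and packet count are arbitrary choices, not taken from the toolkit:

```python
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 51883  # arbitrary local port for the sink

def tcp_sink():
    """Parallel loop standing in for the broker: read and discard bytes."""
    with socket.socket() as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((HOST, PORT))
        srv.listen(1)
        conn, _ = srv.accept()
        with conn:
            while conn.recv(4096):
                pass  # ignore unpacking and topic extraction entirely

threading.Thread(target=tcp_sink, daemon=True).start()
time.sleep(0.1)  # give the sink time to start listening

# Hand-built MQTT 3.1.1 PUBLISH, QoS 0 (no packet ID in the variable header):
topic = b"a/topic"                             # 7 bytes (arbitrary name)
payload = b"\x01\x02\x03\x04"                  # 4-byte payload, as in the test
packet = (
    b"\x30"                                    # fixed header: PUBLISH, QoS 0
    + bytes([2 + len(topic) + len(payload)])   # remaining length = 13
    + len(topic).to_bytes(2, "big") + topic    # topic length + topic
    + payload
)                                              # 15 bytes total

N = 1000
with socket.socket() as cli:
    cli.connect((HOST, PORT))
    t0 = time.perf_counter()
    for _ in range(N):
        cli.sendall(packet)
    dt = time.perf_counter() - t0

print(f"average per publish: {dt / N * 1e6:.1f} us")
```

Note this skips the CONNECT handshake entirely, since the sink never interprets the bytes; like the LabVIEW test, it only measures the dispatch side, not broker processing.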

This is by no means a perfect test, because I am not knowledgeable about the inner workings of a local TCP connection (does it implement the Nagle algorithm, for example?), but the point was to avoid measuring the connection latency. This is still much faster than the 1.4 ms being reported. Of course, I used a small payload... Could it be that the serialization is very inefficient? I didn't use any serialization in my test.
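On the Nagle question: in BSD-style socket APIs, Nagle is enabled by default and can be disabled per-socket with TCP_NODELAY. I don't know what LabVIEW's TCP primitives do internally, but in the Python sketch above it would look like this (same arbitrary host/port as before):

```python
import socket

cli = socket.socket()
cli.connect(("127.0.0.1", 51883))
# Disable Nagle's algorithm so small writes go out immediately
# instead of being coalesced with later ones (worth ruling out
# as a confounder in a small-packet latency test).
cli.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
```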

[image]

That does not mean there are no efficiency gains to be found!
And I'd be grateful for any benchmark suggestion that I can add to the test suite...

I'll be adding the VI to the repo. It's called "Test Publish QoS 0 Latency.vi".
[image]

Fixed and released (3.1.5). Coming soon on VIPM Community.

Thank you for the update.

Regarding the latency, I would just add that I also measured the publish time, and when I run my test on my PC I get comparable numbers.
But I need this to run in LabVIEW RT (PharLap), and there the same measurement shows an average of 1.4 ms. The target is a PXI controller with 8 CPU cores.

Benchmarking can be followed in the new topic: #7