Apollon77/smartmeter-obis

High CPU Usage with big data package D0Protocol

nooxnet opened this issue · 8 comments

Is your feature request related to a problem? Please describe.
I have a 2WR5 heat meter which does not send some important values with the standard SignOnMessage (?). So I use '#'. The returned data is about 10,000 chars long and the readout takes about 51 seconds. This causes 97-100% CPU usage on one core of my Raspberry Pi 3. I have three heat meters, so up to three cores are stressed with nearly 100% usage.
I mentioned this already in the ioBroker forum quite some time ago but now I had the time to look into this more thoroughly.

When data arrives, checkMessage in D0protocol.js is called. It checks whether all data has arrived by using the regDataMsg regex. In my case, with the 10,000 chars of data, this is done about 950 times. At first the regex match takes a couple of milliseconds, but later, with more data, it takes up to 450 ms. Interestingly, the last and successful regex match takes only 0 ms.
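The timing pattern described above can be reproduced with a small standalone script. This is only a sketch: the payload lines are made up, and the helper names buildPayload and timeMatch are hypothetical. The regex is the one quoted later in this thread. A plausible explanation (an assumption, not confirmed here) is that without a '!' in the buffer the engine retries from many start positions and backtracks through the rest of the buffer each time, while a complete message matches on the first attempt.

```javascript
// Regex as quoted later in this thread from D0protocol.js.
var regDataMsg = new RegExp('\u0002?([A-Za-z0-9][^!]+)![A-Za-z0-9]*\r\n\u0003?[^]?');

// Build a synthetic D0-like payload of roughly `size` characters without '!'.
// The line content is invented purely for illustration.
function buildPayload(size) {
    var line = '6.8(0012345*kWh)\r\n';
    var out = '';
    while (out.length < size) {
        out += line;
    }
    return out;
}

// Time a single match attempt in milliseconds.
function timeMatch(message) {
    var start = Date.now();
    var result = message.match(regDataMsg);
    return { matched: result !== null, ms: Date.now() - start };
}

var incomplete = buildPayload(10000);   // no '!' yet, so the match must fail
var complete = incomplete + '!\r\n';    // terminator has arrived

console.log('incomplete:', timeMatch(incomplete));
console.log('complete:  ', timeMatch(complete));
```

The absolute numbers depend heavily on the hardware and Node.js version, so they are not comparable to the Raspberry Pi figures above.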

Describe the solution you'd like
The regex does some basic checking but mainly it just waits for the first '!' to arrive. Then the message is considered as completed.
So I added the following:

at the top:
var dataMsgCheckString = '!';

and instead of regMessage = message.match(regDataMsg);

	if(message.indexOf(dataMsgCheckString) < 0) {
		regMessage = null;
	} else {
		regMessage = message.match(regDataMsg);
	}

Now the CPU usage during the readout is only about 5 - 10%.
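For reference, here is the same change as a self-contained sketch that can be run outside the library. The function name matchDataMessage and the sample data are assumptions for illustration; only dataMsgCheckString and the regex (quoted later in this thread) come from the actual code.

```javascript
// Regex as quoted later in this thread from D0protocol.js.
var regDataMsg = new RegExp('\u0002?([A-Za-z0-9][^!]+)![A-Za-z0-9]*\r\n\u0003?[^]?');
var dataMsgCheckString = '!';

// Hypothetical helper (not part of the library): returns the regex match
// for a complete data message, or null while the buffer is still incomplete.
function matchDataMessage(message) {
    // Cheap scan first: without a '!' the regex could not match anyway.
    if (message.indexOf(dataMsgCheckString) < 0) {
        return null;
    }
    // A '!' is present, so let the regex decide whether the message is valid.
    return message.match(regDataMsg);
}

// Made-up sample data: one partial and one complete message.
var partial = '\u00021-0:1.8.0(000123*kWh)\r\n';
var full = partial + '!\r\n\u0003';

console.log(matchDataMessage(partial));        // null, regex was skipped
console.log(matchDataMessage(full) !== null);  // true, regex ran and matched
```

The result is the same as with the regex alone, because the regex still has the final say once a '!' is present.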
The downside is that it's not that elegant any more with just a single regex.

Describe alternatives you've considered
First I added a variable to track the length of the received data and started to check for the '!' from the previous position. But it seems like that does not really make a difference.
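The position-tracking variant could look roughly like this. It is a sketch only: createTerminatorScanner and its methods are hypothetical names, and the chunks are invented. It uses the second argument of indexOf to resume the search where the previous one stopped.

```javascript
// Hypothetical helper: accumulates chunks and searches for the terminator
// only in the part of the buffer that has not been scanned yet.
function createTerminatorScanner(terminator) {
    var buffer = '';
    var searchFrom = 0;

    return {
        // Append a chunk and return the terminator position, or -1.
        push: function (chunk) {
            buffer += chunk;
            var pos = buffer.indexOf(terminator, searchFrom);
            if (pos < 0) {
                // Nothing found: skip the scanned part next time, keeping a
                // small overlap in case a multi-character terminator was
                // split across two chunks.
                searchFrom = Math.max(0, buffer.length - terminator.length + 1);
            }
            return pos;
        },
        getBuffer: function () {
            return buffer;
        }
    };
}

var scanner = createTerminatorScanner('!');
var first = scanner.push('/ABC5\r\n1.8.0(1)\r\n');   // no terminator yet
var second = scanner.push('2.8.0(2)\r\n!\r\n');      // terminator arrives
console.log(first, second);
```

Since a plain indexOf over about 10,000 chars is already cheap compared to the regex, it is plausible that this extra bookkeeping makes no measurable difference.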

It's interesting that when I use this new method, checkMessage is called much more often (2860 times), so the data comes in smaller chunks. I wonder if there is a possibility to delay the new data for some time so that we do not have to check for completeness that often.

Additional context
Statistics on Raspberry Pi 3 with 10,000 chars:

Regex only:
51 seconds
950x checkMessage called
99% CPU usage for most of the time
CPU Temperature: 60° avg (3 smartmeter read quite frequently)

indexOf & Regex when '!' received:
51 seconds
2860x checkMessage called
5-10% CPU usage (15-25% with debug logging)
CPU Temperature: 46° (3 smartmeter read quite frequently)

What are your thoughts about this?

My thoughts are that it is difficult to fix ... I need to think about this.

The issue is the following: Serial data comes in chunk-wise and the module does not know when a message starts and when it ends, so it could be that reading starts in the middle of a package. This means that after each data chunk we need to analyze the message (D0 and also SML) to see if it is "now" complete and where it starts and where it ends.

So to fix this issue I would need to find a better idea for the detection of "is the message complete".

Yes, for sure I could collect data for e.g. 1s and so delay the data analysis - but this also means that results are delayed by that time ... so it's a tradeoff and could be a configurable feature. Another idea could be to wait until no new data has arrived for e.g. 100ms and then do the check, but this works only if the smartmeter is doing a "pause" ... No idea what's best :-(
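The second idea (check only after a quiet period) is essentially a debounce. A minimal sketch, assuming a hypothetical helper createIdleBuffer that is not part of the module, and a callback that returns the unconsumed rest of the buffer:

```javascript
// Hypothetical helper: buffers incoming chunks and calls onIdle(buffer)
// once no new data has arrived for idleMs milliseconds.
// onIdle returns whatever part of the buffer it did not consume.
function createIdleBuffer(idleMs, onIdle) {
    var buffer = '';
    var timer = null;

    return function onData(chunk) {
        buffer += chunk;
        if (timer !== null) {
            clearTimeout(timer);   // new data arrived, restart the quiet period
        }
        timer = setTimeout(function () {
            timer = null;
            buffer = onIdle(buffer);
        }, idleMs);
    };
}

// Usage: run the expensive completeness check only after 100 ms of silence.
var received = [];
var onData = createIdleBuffer(100, function (buffer) {
    received.push(buffer);   // a real handler would run the message check here
    return '';               // pretend the whole buffer was consumed
});
onData('1.8.0(1)\r\n');
onData('!\r\n');
```

As said above, this only helps if the meter actually pauses between messages, and every result is delayed by at least idleMs.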

What do you think?

I just wondered why my approach seems to work.

But then I realized: If the message does not contain an '!' the regex match would fail in any case. So I check for the '!' with a simple indexOf. If not found I set the result to null - like it would happen with the regex. If the '!' is found I execute the regex match. So if the message is not complete the regex would still return null.

So I assume the code is OK but my explanation was wrong or at least misleading.

If this approach works then I think there is no need to add a delayed analysis.
And it could be that a delayed analysis would lead to problems with different devices. Also, if the regex match already takes >400 ms, a small delay would not lead to a significantly lower CPU usage.

And when I think of my approach where the check is executed 2800 times it seemed really a lot and I thought about optimizing that. But when I think about it again I'm pretty sure that it would not significantly reduce the CPU usage as it is already down at usually 5%.

Update:
If the read regularly started somewhere other than at the start of the data package, my approach would not reduce the high CPU usage (or only partially): the indexOf would find the '!', and then the regex match would be executed and fail because the start of the data package is missing. But in my case it seems like that does not happen. Maybe the communication gets in sync because of previous initializations and messages.

As you can see in the code, the regex is a bit more complex than "just" checking for a ! ... there can be characters in front of it

var regDataMsg = new RegExp('\u0002?([A-Za-z0-9][^!]+)![A-Za-z0-9]*\r\n\u0003?[^]?');

So when just checking for ! it could be the very first character in the message and then the message would be invalid in the end :-(

I do not just check for '!'. If I do not find a '!' I omit the regex match because it would fail anyways. If I find the '!' I check the regex.

As long as the data is not complete there is no '!' in the data. So in my case I can get rid of 950 regex matches. When I finally find a '!' I do the regex match. Only if this is OK the message is considered as complete. - just like it is now.

This approach would not help if the first data of the chunk is the last part of a message with the '!' but the beginning missing. The regex.match would be executed every time like it happens now. So in this case the code would still not fail but would only sometimes result in lower CPU usage. But that is not what I experience.

You are right that it can optimize a bit. In fact, when you start "in the middle of a message", the ! is at the end of the incomplete message and so the regex is still executed until the second "full message" is there, but yes, it can help reduce it a bit
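To illustrate the difference between the two cases, the following sketch counts how often the regex actually runs when chunks are fed through the guarded check. The helper countRegexRuns and the chunk contents are made up for illustration; the regex is the one quoted above.

```javascript
// Regex as quoted above from D0protocol.js.
var regDataMsg = new RegExp('\u0002?([A-Za-z0-9][^!]+)![A-Za-z0-9]*\r\n\u0003?[^]?');

// Hypothetical helper: feeds chunks one by one through the guarded check
// and counts how often the regex is executed before a message matches.
function countRegexRuns(chunks) {
    var buffer = '';
    var regexRuns = 0;
    var match = null;
    for (var i = 0; i < chunks.length && match === null; i++) {
        buffer += chunks[i];
        if (buffer.indexOf('!') >= 0) {   // guard: only run the regex with a '!'
            regexRuns++;
            match = buffer.match(regDataMsg);
        }
    }
    return { regexRuns: regexRuns, complete: match !== null };
}

// Reading starts at the beginning of a message.
var aligned = countRegexRuns(['/ABC5\r\n', '1.8.0(1)\r\n', '!\r\n']);

// Reading starts with the leftover '!' of the previous message.
var misaligned = countRegexRuns(['!\r\n', '/ABC5\r\n', '1.8.0(1)\r\n', '!\r\n']);

console.log(aligned, misaligned);
```

In the aligned stream the regex runs once, on the final chunk. In the misaligned stream the leftover '!' lets the guard pass on every chunk, so the regex runs four times, which is the same behaviour as without the guard.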

oeiber commented

Reducing the CPU load would be great.
I'm reading two smartmeters (SML and D0) with an RPi 4, and the system always has a load of about 80 percent.
This makes a lot of heat in my electrical enclosure ;-)

@oeiber Can you provide more details about your settings? Is D0 or SML the issue or both?
System load of 80% because of these two instances running?

oeiber commented

I'm sorry for that. The high cpu usage was related to my own script's loop. So everything is fine ;-)