jbuehl/solaredge

Checksum errors

Closed this issue · 11 comments

Hi,

I'm seeing a lot of messages like the one below and it seems not all data is captured because of it.
Does anybody have an idea what could be wrong? :)

Sep 21 14:10:29 icarus semonitor.py: dataLen: 0211
Sep 21 14:10:29 icarus semonitor.py: dataLenInv: fdee
Sep 21 14:10:29 icarus semonitor.py: sequence: 0058
Sep 21 14:10:29 icarus semonitor.py: source: 731281f7
Sep 21 14:10:29 icarus semonitor.py: dest: fffffffe
Sep 21 14:10:29 icarus semonitor.py: function: 003d
Sep 21 14:10:29 icarus semonitor.py: Discarding 12 extra bytes
Sep 21 14:10:29 icarus semonitor.py: data: 9e 89 09 bb 2b 82 00 00 20 20 20 20
Sep 21 14:10:29 icarus semonitor.py: Checksum error. Expected 0x4f65, got 0x1a93
Sep 21 14:10:29 icarus semonitor.py: data: 11 02 ee fd 58 00 f7 81 12 73 fe ff ff ff 3d 00
Sep 21 14:10:29 icarus semonitor.py: data: ba 27 0c 91 78 11 b6 90 1f f1 83 0e 94 af 8c ea
Sep 21 14:10:29 icarus semonitor.py: data: 01 e8 c4 4a 79 d7 9d 4a 58 3e 92 26 8e b4 df 71
Sep 21 14:10:29 icarus semonitor.py: data: 01 ec c7 09 e5 61 bc b7 b0 9c 32 1e f9 df 0a c9
Sep 21 14:10:29 icarus semonitor.py: data: 83 29 fc b9 98 00 ba f9 70 46 fc a3 6e 3e fb d1
Sep 21 14:10:29 icarus semonitor.py: data: a3 93 c1 e1 44 7c 89 c3 2c 49 34 66 7a 33 d9 67
Sep 21 14:10:29 icarus semonitor.py: data: 70 b2 2d 9b 56 11 e5 7f 86 8a 11 bc 93 1a 2c 24
Sep 21 14:10:29 icarus semonitor.py: data: eb 19 48 22 29 bf bd 50 dc a7 5d 53 3e b0 d9 17
Sep 21 14:10:29 icarus semonitor.py: data: 09 05 da 8b 84 61 ba 3b 12 43 17 02 34 ee 83 66
Sep 21 14:10:29 icarus semonitor.py: data: 4e f6 f6 d8 c1 f0 f2 a8 ca e4 78 2e cc a1 75 09
Sep 21 14:10:29 icarus semonitor.py: data: 41 5e 52 05 ef b6 14 0d 35 d6 85 fd d5 dc 82 96
Sep 21 14:10:29 icarus semonitor.py: data: d2 01 31 47 bc b6 b8 8e 88 55 36 9b 32 7a aa 01
Sep 21 14:10:29 icarus semonitor.py: data: 3b c4 42 5f 74 94 6f d9 f3 47 f6 b8 15 61 1d 81
Sep 21 14:10:29 icarus semonitor.py: data: 37 0b e0 02 46 c9 ec 19 ed 44 dd 14 65 ae d2 39
Sep 21 14:10:29 icarus semonitor.py: data: 69 40 53 c3 3e cb cb 99 96 b6 18 73 eb b8 80 ec
Sep 21 14:10:29 icarus semonitor.py: data: 77 a1 73 a0 c2 e2 58 59 fc b5 46 1e c6 b1 ab fd
Sep 21 14:10:29 icarus semonitor.py: data: 8e 7e 6a 29 8a 92 ed 60 30 f1 fe a1 2b a6 14 a3
Sep 21 14:10:29 icarus semonitor.py: data: 48 63 d9 ee ee 1c a7 ae 25 aa 62 59 55 9e f5 e5
Sep 21 14:10:29 icarus semonitor.py: data: 7d 2b 3b e7 25 bb 9d 83 4d 2a e7 e2 17 cf 7b 17
Sep 21 14:10:29 icarus semonitor.py: data: 63 de 8c 24 e6 66 2e ea 57 32 21 dd 52 e0 ee 5c
Sep 21 14:10:29 icarus semonitor.py: data: eb ca f8 33 67 b2 c8 fc 80 16 04 f6 8f 5e 8e 5f
Sep 21 14:10:29 icarus semonitor.py: data: 17 c2 90 d7 7e 4e 5f fa 83 5b d9 2b bc a6 d8 e1
Sep 21 14:10:29 icarus semonitor.py: data: ea fb cf cd 2a 26 a0 8a e3 af cb f1 e4 01 ee 48
Sep 21 14:10:29 icarus semonitor.py: data: 00 8d a8 e5 a5 1b 67 68 a2 45 6b 0c 74 09 e6 75
Sep 21 14:10:29 icarus semonitor.py: data: 68 9d 0f 01 80 a6 3b 17 18 c3 a7 23 e0 8b dc 5b
Sep 21 14:10:29 icarus semonitor.py: data: 6a 52 23 86 01 b2 86 0b dc 57 bd 5e 3d 0e 85 36
Sep 21 14:10:29 icarus semonitor.py: data: b9 64 6f bd 9d 38 d9 f9 f7 b2 95 af 54 6d b2 60
Sep 21 14:10:29 icarus semonitor.py: data: 55 0d 34 56 b5 1f 9a 04 f7 d9 10 58 7f 47 04 34
Sep 21 14:10:29 icarus semonitor.py: data: 20 81 66 2f 0b 1c 29 23 c7 6c 9e 37 ad 25 27 28
Sep 21 14:10:29 icarus semonitor.py: data: 22 e7 d9 02 7d 78 ca 63 d7 04 32 7d 58 58 f1 a2
Sep 21 14:10:29 icarus semonitor.py: data: d2 3b ee d6 83 47 78 b5 64 8e 43 b0 2f 26 58 48
Sep 21 14:10:29 icarus semonitor.py: data: e0 ac 18 4d df ee 2d 08 b5 40 ac 94 78 eb 76 6b
Sep 21 14:10:29 icarus semonitor.py: data: c0 e4 94 af b2 80 b8 71 66 f6 ca ac 5c ab 33 98
Sep 21 14:10:29 icarus semonitor.py: data: 75 d1 47 f6 31 90 00 00 20 20 20 20 67 88 a6 69
Sep 21 14:10:29 icarus semonitor.py: data: a6 65 4f 9e 89 09 bb 2b 82 00 00 20 20 20 20
Sep 21 14:10:29 icarus semonitor.py: Ignoring this message
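
For anyone reading the dump above, the numbers in it are self-consistent, which suggests the framing is intact and only the payload or checksum is off. The sketch below decodes the first 16 dumped bytes as little-endian fields and redoes the length arithmetic. The field layout is inferred from the log labels, and the 2-byte trailing checksum is an assumption; this is not the actual semonitor.py code.

```python
import struct

# First and last rows of the hex dump in the log above.
HEADER_HEX = "11 02 ee fd 58 00 f7 81 12 73 fe ff ff ff 3d 00"
LAST_ROW_HEX = "a6 65 4f 9e 89 09 bb 2b 82 00 00 20 20 20 20"

HEADER_LEN = 16
CHECKSUM_LEN = 2           # assumption: 16-bit checksum after the data
FULL_ROWS = 34             # 16-byte rows before the final, shorter row
LAST_ROW_OFFSET = FULL_ROWS * 16


def parse_header(raw: bytes) -> dict:
    """Decode the six header fields named in the log (little-endian)."""
    names = ("dataLen", "dataLenInv", "sequence", "source", "dest", "function")
    return dict(zip(names, struct.unpack("<HHHLLH", raw[:HEADER_LEN])))


header = parse_header(bytes.fromhex(HEADER_HEX))
last_row = bytes.fromhex(LAST_ROW_HEX)

# dataLenInv should be the bitwise inverse of dataLen.
length_ok = header["dataLenInv"] == (~header["dataLen"] & 0xFFFF)

# Where the message should end inside the dump, and what is left over.
message_len = HEADER_LEN + header["dataLen"] + CHECKSUM_LEN
dumped_len = LAST_ROW_OFFSET + len(last_row)
extra_bytes = dumped_len - message_len

# The checksum carried in the message sits just before the extra bytes.
start = message_len - CHECKSUM_LEN - LAST_ROW_OFFSET
(carried_checksum,) = struct.unpack("<H", last_row[start:start + CHECKSUM_LEN])

print({k: hex(v) for k, v in header.items()})
print("length check ok:", length_ok)
print("extra bytes:", extra_bytes)
print("carried checksum:", hex(carried_checksum))
```

Under these assumptions the leftover count matches the "Discarding 12 extra bytes" line and the carried checksum matches the "Expected" value in the log, so the mismatch would come from the bytes between header and checksum rather than from a misread length.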

It seems that processing pcap files captured through a bridge (Raspberry Pi, 2 Ethernet ports in bridge mode) does not give the same result as capturing through NAT (DHCP on eth0, NAT via eth1). With NAT the data seems OK; in bridge mode I see more checksum errors.

I also have the occasional CRC error over RS-485, so maybe sometimes some of the data is corrupt?

As for bridge vs NAT, it might be that the hardware cannot keep up, though NAT should be harder on the CPU. Then again, the Pi has no native Ethernet; everything goes via USB, so you'll surely run into bandwidth problems etc. that could cause CRC errors, and then bridge mode might actually be harder (NAT could be throttled).

Since I've switched to NAT I don't see any checksum errors any more, at all.
Also, since all packets are dumped to a file, I don't think CPU is an issue here.

I have seen checksum errors with a Raspberry Pi because of hardware limitations, but only with RS485. It could also be the case that NAT is filtering out other IP messages that may confuse semonitor.py. It expects to get a clean TCP stream of the data that is being sent to the portal, and if other packets are mixed in, there will be errors.

I was thinking the same thing (other TCP (in this case) packets interfering). I did filter on the specific IP of the inverter, but this doesn't help. Could it be that, even though I filter just TCP with tcpdump, other IP packets interfere anyway?

You should run it through seextract.py or some other utility such as tshark that extracts a single TCP stream.
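
To illustrate what "a single TCP stream" means here, the sketch below keeps only the segments of one direction of one connection, orders them by sequence number, and drops retransmitted bytes. The `Segment` type and `reassemble` function are hypothetical names made up for this example; this is not how seextract.py or tshark are implemented, and it ignores sequence-number wraparound.

```python
from typing import Iterable, NamedTuple


class Segment(NamedTuple):
    src: str        # sender as "ip:port" (hypothetical representation)
    dst: str        # receiver as "ip:port"
    seq: int        # TCP sequence number of the first payload byte
    payload: bytes


def reassemble(segments: Iterable[Segment], src: str, dst: str) -> bytes:
    """Return the in-order payload of one direction of one connection."""
    wanted = sorted(
        (s for s in segments if s.src == src and s.dst == dst and s.payload),
        key=lambda s: s.seq,
    )
    out = bytearray()
    next_seq = None
    for seg in wanted:
        if next_seq is None:
            next_seq = seg.seq
        if seg.seq > next_seq:
            # Bytes are missing from the capture; a parser would lose sync.
            raise ValueError(f"gap of {seg.seq - next_seq} bytes at seq {next_seq}")
        skip = next_seq - seg.seq          # bytes already seen (retransmission)
        if skip >= len(seg.payload):
            continue                       # pure duplicate, nothing new
        out += seg.payload[skip:]
        next_seq = seg.seq + len(seg.payload)
    return bytes(out)


capture = [
    Segment("inverter:1024", "portal:22222", 100, b"AB"),
    Segment("other:5000", "portal:22222", 7, b"zz"),       # unrelated traffic
    Segment("inverter:1024", "portal:22222", 104, b"EF"),  # arrives early
    Segment("inverter:1024", "portal:22222", 102, b"CD"),
    Segment("inverter:1024", "portal:22222", 102, b"CD"),  # retransmission
]
stream = reassemble(capture, "inverter:1024", "portal:22222")
print(stream)
```

If a capture fed to a message parser still contained the unrelated segment or the retransmitted one, the parser would see extra bytes in the middle of a message, which is one way a checksum mismatch could arise.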

I did use seextract to create the data file. Still got the checksum errors when doing bridge.
This is what I run:

/home/pi/solaredge/seextract.py -a -f /data/pcap/ | tee -a $DATAFILE >/dev/null | /home/pi/solaredge/semonitor.py -v -f -k $KEYFILE $DATAFILE | /home/pi/solaredge/pickle2graphite.py -b solaredge -h 192.168.1.12

I get checksum errors on RS485 pretty much every time I start semonitor.py (I run it once per day at 22:00). The first time it fails, the second time it generally works (sometimes I get timeouts), and the third time is generally the charm. I'll debug it once I have a bit more time.

I can confirm the findings by vandalon. I got checksum errors quite often when using a bridge on my Raspberry Pi 1 Model B, even when using a TCP filter on tcpdump and piping it through seextract. After switching over to NAT I get no checksum errors at all, even without seextract. But at the moment I am trying to figure out why I get inverter data only once in a while (every 10-90 minutes).

With the current master, I do still get the initial checksum error, but it has now managed to run for 3 or 4 days without timeouts or major checksum errors. So I think this issue may actually be closed? @jbuehl

@oliv3r I think this issue was raised for the case where semonitor.py is monitoring ethernet traffic, which is unrelated to the RS-485 case. That is what was apparently fixed in issue #136 and seems relevant to your situation. I agree this issue should be closed.