jbuehl/solaredge

semonitor stops after: Checksum error. Expected 0x123d, got 0x16c6

Closed this issue · 31 comments

Thanks again for helping!

After some time semonitor receives a garbled packet, and reports a mismatch in checksum.

The problem: after that, the program seems to be stopped, and no new data is received until a restart.

@DatafoxChris - Please run this again with the -vvvv option now that issue #135 is fixed and post the debug output showing the last few messages before it stops.

This command is running now:
python3 semonitor.py -o 2019-07-19.json -m -s 73123D96 -t 4 -b 115200 -r raw.log -vvvv /dev/ttyUSB2 > vvvv.log 2>&1

When it stops i will post vvvv.log in this thread

  • In the next few weeks my solar-panels and optimizers will be installed

semonitor has been running the whole weekend without the checksum error. As soon as it stops running i will post the log.

When using -vvvv option, it seems that no data is written to -o file.json

Is anything written to vvvv.log?

Yes, everything is written into vvvv.log and i am viewing it live using tail -f

Using -vvvv option there is an error message every time semonitor generates json output, and nothing is written into the json file:

/dev/ttyUSB2 <-- message: 77191 length: 22
data: 12 34 56 79 00 00 ff ff 9a 73 fe ff ff ff 96 3d
data: 12 73 02 03 9e dd
/dev/ttyUSB2 --> message: 83058 length: 88
data: 12 34 56 79 42 00 bd ff 8b 02 96 3d 12 73 fe ff
data: ff ff 00 05 22 00 96 3d 12 73 3a 00 b2 7d 30 5d
data: 09 01 00 00 00 00 00 00 00 80 00 00 00 00 00 00
data: 00 80 00 00 00 00 00 00 00 80 00 00 00 00 00 00
data: 00 80 2c 01 00 00 00 00 00 00 00 00 00 00 00 00
data: 00 00 ff ff 7f ff 23 b7
dataLen: 0042
dataLenInv: ffbd
sequence: 028b
source: 73123d96
dest: fffffffe
function: 0500
meters_0x0022: 73123D96 type: 0022 len: 003a
meters_0x0022 : {'73123D96': {'9_PVProduction': {'dateTime': 1563458994, 'devType': 'meters_0x0022', 'onlyIntervalData': 1, 'EfromX': 0, 'AlwaysZero_off26_int2': 0, 'Flag_off20_hex': '00 80', 'AlwaysZero_off34_int2': 0, 'E2X': 0, 'PfromX': -3.4028234663852886e+38, 'AlwaysZero_off18_int2': 0, 'TotalE2Grid': 0, 'Interval': 300, 'Time': '16:09:54', 'Totaloff22_int4': 0, 'devLen': 58, 'Date': '2019-07-18', 'Totaloff30_int4': 0, 'TotalEfromGrid': 0, 'P2X': 0.0, 'Flag_off36_hex': '00 80', 'Flag_off12_hex': '00 80', 'Flag_off28_hex': '00 80', 'AlwaysZero_off10_int2': 0, 'seId': '73123D96', 'recType': 9, 'seType': '0x0022'}}}
2019-07-19.json <-- message: 5881 length: 677
Failed to parse message: Unknown format code 'x' for object of type 'str'
data: 42 00 bd ff 8b 02 96 3d 12 73 fe ff ff ff 00 05
data: 22 00 96 3d 12 73 3a 00 b2 7d 30 5d 09 01 00 00
data: 00 00 00 00 00 80 00 00 00 00 00 00 00 80 00 00
data: 00 00 00 00 00 80 00 00 00 00 00 00 00 80 2c 01
data: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff ff
data: 7f ff 23 b7
/dev/ttyUSB2 --> message: 83059 length: 22
data: 12 34 56 79 00 00 ff ff 9a 73 96 3d 12 73 fe ff
data: ff ff 9a 03 a2 c4
dataLen: 0000
dataLenInv: ffff
sequence: 739a
source: 73123d96
dest: fffffffe
function: 039a
dataLen: 0000
dataLenInv: ffff
sequence: 739b
source: fffffffe
dest: 73123d96
function: 0302

@DatafoxChris it looks like you might still be using an old version. Can you repull master and confirm your checkout is at the latest revision?

Pulled new version, and the error at the moment of saving the json is gone now.

Now every request leads to this kind of response:

/dev/ttyUSB2 <-- message: 51 length: 22
data: 12 34 56 79 00 00 ff ff 82 ab fe ff ff ff 96 3d
data: 12 73 02 03 c3 1c
/dev/ttyUSB2 --> message: 51 length: 88
data: 12 34 56 79 42 00 bd ff c7 00 96 3d 12 73 fe ff
data: ff ff 00 05 22 00 96 3d 12 73 3a 00 fb 9a 30 5d
data: 09 01 00 00 00 00 00 00 00 80 00 00 00 00 00 00
data: 00 80 00 00 00 00 00 00 00 80 00 00 00 00 00 00
data: 00 80 2c 01 00 00 00 00 00 00 00 00 00 00 00 00
data: 00 00 ff ff 7f ff c1 67
dataLen: 0042
dataLenInv: ffbd
sequence: 00c7
source: 73123d96
dest: fffffffe
function: 0500
meters_0x0022: 73123D96 type: 0022 len: 003a
meters_0x0022 : {'73123D96': {'9_PVProduction': {'Date': '2019-07-18', 'TotalE2Grid': 0, 'AlwaysZero_off26_int2': 0, 'seId': '73123D96', 'recType': 9, 'Totaloff30_int4': 0, 'PfromX': -3.4028234663852886e+38, 'devType': 'meters_0x0022', 'Flag_off36_hex': '00 80', 'onlyIntervalData': 1, 'Time': '18:14:51', 'AlwaysZero_off18_int2': 0, 'devLen': 58, 'E2X': 0, 'AlwaysZero_off34_int2': 0, 'Totaloff22_int4': 0, 'seType': '0x0022', 'P2X': 0.0, 'dateTime': 1563466491, 'Flag_off12_hex': '00 80', 'EfromX': 0, 'Interval': 300, 'Flag_off20_hex': '00 80', 'AlwaysZero_off10_int2': 0, 'TotalEfromGrid': 0, 'Flag_off28_hex': '00 80'}}}
{"events": {}, "inverters": {}, "meters_0x0022": {"73123D96": {"9_PVProduction": {"AlwaysZero_off10_int2": 0, "AlwaysZero_off18_int2": 0, "AlwaysZero_off26_int2": 0, "AlwaysZero_off34_int2": 0, "Date": "2019-07-18", "E2X": 0, "EfromX": 0, "Flag_off12_hex": "00 80", "Flag_off20_hex": "00 80", "Flag_off28_hex": "00 80", "Flag_off36_hex": "00 80", "Interval": 300, "P2X": 0.0, "PfromX": -3.4028234663852886e+38, "Time": "18:14:51", "TotalE2Grid": 0, "TotalEfromGrid": 0, "Totaloff22_int4": 0, "Totaloff30_int4": 0, "dateTime": 1563466491, "devLen": 58, "devType": "meters_0x0022", "onlyIntervalData": 1, "recType": 9, "seId": "73123D96", "seType": "0x0022"}}}, "optimizers": {}}
dataLen: 0000
dataLenInv: ffff
sequence: 00c7
source: fffffffe
dest: 73123d96
function: 0080
/dev/ttyUSB2 <-- message: 52 length: 22
data: 12 34 56 79 00 00 ff ff c7 00 fe ff ff ff 96 3d
data: 12 73 80 00 61 9a
/dev/ttyUSB2 --> message: 52 length: 22
data: 12 34 56 79 00 00 ff ff 82 ab 96 3d 12 73 fe ff
data: ff ff 9a 03 ff 05
dataLen: 0000
dataLenInv: ffff
sequence: ab82
source: 73123d96
dest: fffffffe
function: 039a

What seems strange to me is the speed at which 'Time' value is changing. The minutes seem to count upwards like they are seconds...

Wow, it delivers very usable data now:

"inverters": {"73123D96": {"Date": "2019-07-18", "Eac": 0.0, "Eday": 0.0, "Etot": 0.0, "Freq": 50.011592864990234, "ID": "73123D96", "Iac": 0.0, "Interval": 103, "Pac": 0.0, "Pmax": -3.4028234663852886e+38, "Temp": 27.81756591796875, "Time": "22:49:57", "Uptime": 108, "Vac": 230.40524291992188, "Vdc": 0.02406148612499237}}

Finally there was a checksum error :)

Checksum error. Expected 0x12bd, got 0x23ab
Ignoring this message

After this message, no new JSON is written into the json file.

I have included the log and the json-outputfile to this post:
vvvv.zip

Can you also post the record file raw.log from this session? This problem has happened before as described in issues #18 and #20, but it shouldn't be stalling.

The reason you were seeing the time value change so fast is that the inverter had a backlog of several days of data that it wasn't able to send before. Once that is cleared out you should see a data record every 5 minutes.

Here is the raw.log file:

raw.zip

Hi @jbuehl , @ericbuehl

Trivial decode error / regression problem due to change from Python2.7 -> Python3.x

I doubt it is anything at all to do with the checksum error / semonitor stalling, but I notice that some of the Power values reported in the comments above are ridiculous / magic values of -3.4028234663852886e+38.

See for example the "PfromX" values for the 0x0022 message type, or the "Pmax" value from the inverters device.

"meters_0x0022": {"73123D96": {"9_PVProduction": {"AlwaysZero_off10_int2": 0, "AlwaysZero_off18_int2": 0, "AlwaysZero_off26_int2": 0, "AlwaysZero_off34_int2": 0, "Date": "2019-07-18", "E2X": 0, "EfromX": 0, "Flag_off12_hex": "00 80", "Flag_off20_hex": "00 80", "Flag_off28_hex": "00 80", "Flag_off36_hex": "00 80", "Interval": 300, "P2X": 0.0, "PfromX": -3.4028234663852886e+38, "Time": "18:14:51", "TotalE2Grid": 0, "TotalEfromGrid": 0, "Totaloff22_int4": 0, "Totaloff30_int4": 0, "dateTime": 1563466491, "devLen": 58, "devType": "meters_0x0022", "onlyIntervalData": 1, "recType": 9, "seId": "73123D96", "seType": "0x0022"}
"inverters": {"73123D96": {"Date": "2019-07-18", "Eac": 0.0, "Eday": 0.0, "Etot": 0.0, "Freq": 50.011592864990234, "ID": "73123D96", "Iac": 0.0, "Interval": 103, "Pac": 0.0, "Pmax": -3.4028234663852886e+38, "Temp": 27.81756591796875, "Time": "22:49:57", "Uptime": 108, "Vac": 230.40524291992188, "Vdc": 0.02406148612499237}}

I came across this while working on the ParseDevice code and eventually tracked it down (to what I believe is a bug in the SolarEdge data). The correct value is NaN ; which is what you obtain from Python if you reverse the order of the 4 bytes in the magic floating point number -3.4028234663852886e+38

For the 0x0022 message (ie the meters_0x0022 device) line 229 to 231 of datadevices.py contains a 1-off fix for this. It used to work in Python 2.7, but I think it needs a trivial update to succeed in Python3.x

Line 230 is currently

data[dataPtr:dataPtr + paramLen] == '\xff\xff\x7f\xff'):

The constant string '\xff\xff\x7f\xff' needs to be converted to an explicit byte string , viz changed to b'\xff\xff\x7f\xff' for the test to recognise the magic value correctly in Python3.x.

This change would fix the meters_0x0022 "PfromX" field value; I haven't yet read through the code that decodes the inverter's data message to figure out where an equivalent fix for the Pmax field would fit.

Hope this helps - I remain amazed and very grateful for all your work on this enormously useful project.

Regards

Geoff

@DatafoxChris I just made a change that may fix the problem. Please pull the master and try it again. I have no way of duplicating and testing the problem since I don't have the same hardware as you. I also added some additional debugging messages, so if it hangs again you should terminate the program with ctrl-C which will post additional information to the log file. Ideally when it encounters a checksum error it should recover and continue processing data.

Pulled latest version and tested it. After only a few messages the checksum-error was detected and all messages after that result in a timeout:

2019-07-29.zip

Sorry, my mistake. Line 116 of se/msg.py should be

if not mode.passiveMode and (mode.serialType != "4"):

I just pushed another change. Please pull it again and try it.

Thanks :) Pulled it, and it is running OK now. Will keep you informed.

So far: no problems :)

Still no problems

@DatafoxChris, are you seeing any checksum errors in the debug log? I would expect that you would still see them occasionally, but the program is now able to recover. If you see one, can you please post the portion of the log that shows the error and recovery? I don't need the whole log file, just a few messages before and after the checksum error.

Thanks.

Thanks @Geoff99. Are you planning to submit a change?

Hi @jbuehl .

I could submit a change if you like, but equally happy if you want to just put the 1 line change in datadevices.py (as per my earlier comment) through yourself to fix the meters_0x0022 "PfromX" case.

I've had a quick look at how I might fix the PMax value returned by the inverters device decoder, as implemented in :

def parseInvData(seId, invItems, devData):

and while the actual change is not that complex, I'm struggling to see how I can make it elegant rather than hackish.

An alternative spot to catch the magic (ie should be Nan but isn't) value and repair it could be bit later in the execution flow, just after :

invDict[seId] = parseInvData(seId, invItems,

but that would be equally inelegant.

It may be that inelegant is the only way, since it is essentially trapping and repairing a solardedge bug, where they have apparently encoded a float Nan value with the wrong (ie opposite to every other value) endianness.

Let me know what you think, and whether you'd like me to prepare a pull request.

Geoff

No checksum errors have been reported so far, i am using command "cat vvvv.log | grep checksum" to check.

@DatafoxChris there should still be checksum errors from time to time, but the program should handle them. If you typed the grep command exactly as you showed it won't find any in the log file because grep is case sensitive and Checksum is capitalized in that message. Be sure to enter "grep Checksum".

@Geoff99 - You can make the change since that's not an area of the program that I am very familiar with nor particularly interested in since I don't have that equipment.

Sure @jbuehl, happy to.

Still getting up to speed with Python 3, but the changes in this pull request seem to work in my environment, and for my hardware

#138

Hi @jbuehl, there still don't seem to be any Checksum errors... using command "cat vvvv.log | grep Checksum"

The checksum error would have been caused by a hardware issue or possibly load on the server so it's difficult to cause them to occur in order to thoroughly test this. I'm going to assume the problem is fixed and close this issue.