jbuehl/solaredge

Change in data?

JustApu opened this issue · 10 comments

I have three SE10000 inverters, with daily pcaps captured back to August 2017. I've been successfully processing these files using jbuehl's code all this time. However, on April 4-5, 2019, I suddenly went from ~720 lines of data in the JSON files to ~25 lines, and I'm no longer seeing the Etot values from the inverters or TotalE2Grid from the meters.

The SolarEdge monitoring portal is still showing system data and the daily pcap files are roughly the same size as I've had on other days (2.6-2.8 MB/day for all three inverters combined). However, there are occasional (approx. once a month) jumps in the size of the pcap files, with the latest jump to 4.1 MB coinciding with this change.

Is anyone else seeing changes or having problems monitoring their systems?


P.S. My workflow... tcpdump runs on a Linux VM that acts as a router between the inverters and the Internet, capturing any data to/from the inverters. tcpdump then separates the one daily pcap into three pcap files, one per inverter. Those are processed, on a different Linux box, by tshark, piped to unhexlify and then sent to semonitor to create the JSON.

I'm running this "from scratch" each day - the only data that persists from one day to the next is the saved key that semonitor uses to process the data. Is it possible the key changed? Would I still get any data with the old key? I have all the pcap files if it's a matter of extracting a new key.

@jbuehl - Any thoughts?

@JohnOmernik - You were working on dealing with the network key recently; is it safe to say that, if I'm getting any JSON, it's likely not that the key changed, since we can understand a few messages between the inverter and the SolarEdge servers? If it were a bad key, it would all be gibberish, correct?

7D112E73.json.txt

Hi @JustApu ,

Not sure there's much I can do to help (except to reassure you that someone is listening) because I have a different model inverter, and possibly different version of the embedded SolarEdge software as well.

As you already figured out yourself, the key must be OK, because you are getting JSON, not gibberish. I had a quick look at your attached JSON and I note that there are some undeciphered messages being reported:

{"Unknown_device_0x0042": {"7D112E73": {"devLen": 48, "Undeciphered_data": "02 01 00 34 30 35 37 33 33 36 00 57 61 74 74 4e | 6f 64 65 00 52 57 4e 44 2d 33 44 2d 32 34 30 2d | 4d 42 00 32 34 00 09 01 00 00 00 00", "Time": "21:49:12", "devType": "Unknown_device_0x0042", "dateTime": 1555811352, "seType": "0x0042", "seId": "7D112E73", "Date": "2019-04-20"}},

 "Unknown_device_0x0041": {"7D112E73": {"devLen": 40, "Undeciphered_data": "00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ff | ff 7f ff 01 00 00 00 00 ff ff 7f ff 00 00 00 00 | 00 00 00 00", "Time": "21:49:12", "devType": "Unknown_device_0x0041", "dateTime": 1555811352, "seType": "0x0041", "seId": "7D112E73", "Date": "2019-04-20"}}, "Unknown_device_0x0040": {"7D112E73": {"devLen": 5, "Undeciphered_data": "00", "Time": "21:49:12", "devType": "Unknown_device_0x0040", "dateTime": 1555811352, "seType": "0x0040", "seId": "7D112E73", "Date": "2019-04-20"}}, "inverters": {}, 

"Unknown_device_0x0017": {"7D112E73": {"devLen": 102, "Undeciphered_data": "53 6f 6c 61 72 45 64 67 65 00 53 45 31 30 30 30 | 30 00 37 44 31 31 32 45 37 33 00 01 00 06 00 03 | 00 73 2e 11 7d 00 00 00 03 03 00 e2 07 00 00 73 | 2e 91 7d 01 00 00 02 01 00 d2 00 2c 04 73 2e d1 | 7d 02 00 00 0d 02 00 34 00 9a 01 01 00 41 5c 12 | 00 00 2a 00 00 00 00 00 00 00 00 00 00 03 00 00 | 00 00", "Time": "21:49:12", "devType": "Unknown_device_0x0017", "dateTime": 1555811352, "seType": "0x0017", "seId": "7D112E73", "Date": "2019-04-20"}}, "Unknown_device_0x0018": {"7D112E73": {"devLen": 10, "Undeciphered_data": "0a 00 55 53 41 00", "Time": "21:49:12", "devType": "Unknown_device_0x0018", "dateTime": 1555811352, "seType": "0x0018", "seId": "7D112E73", "Date": "2019-04-20"}}, "events": {"7D112E73": {"Type": 114, "Event3": "Sun Apr  7 10:43:02 1974", "Time": "21:49:12", "Date": "2019-04-20", "Event2": 109263277, "ID": "7D112E73", "Event1": "Tue Oct 20 19:48:25 1970"}}, "optimizers": {}}


{"Unknown_device_0x0043": {"7D112E73": {"devLen": 222, "Undeciphered_data": "01 00 01 00 05 00 2a 00 fe dd d4 10 00 48 33 d4 | 10 00 60 2b d5 10 00 47 4f d5 10 00 e2 31 d4 10 | 00 79 c7 d4 10 00 b4 dd d4 10 00 38 2b d5 10 00 | a5 af d4 10 00 5c c7 d4 10 00 5b f5 d3 10 00 ce | 4c d5 10 00 05 35 d4 10 00 ba 28 d5 10 00 61 7f | d5 10 00 df df d4 10 00 5f db d4 10 00 89 d3 d3 | 10 00 5f 7f d5 10 00 41 db d4 10 00 57 39 d4 10 | 00 49 76 d4 10 00 09 36 d4 10 00 ab 4d d5 10 00 | 4a d3 d4 10 00 bb 57 d5 10 00 51 de d4 10 00 48 | d5 d3 10 00 0c 26 d3 10 00 39 de d4 10 00 a8 2c | d3 10 00 70 7c d5 10 00 74 2b d5 10 00 da e2 cc | 10 00 0b ba d4 10 00 99 2b d5 10 00 30 f7 d4 10 | 00 a5 50 d5 10 00 d4 7e d4 10 00 b1 c8 d4 10 00 | b8 89 d4 10 00 be da d4 10 00", "Time": "21:49:12", "devType": "Unknown_device_0x0043", "dateTime": 1555811352, "seType": "0x0043", "seId": "7D112E73", "Date": "2019-04-20"}}, "Unknown_device_0x0044": {"7D112E73": {"devLen": 5, "Undeciphered_data": "00", "Time": "21:49:12", "devType": "Unknown_device_0x0044", "dateTime": 1555811352, "seType": "0x0044", "seId": "7D112E73", "Date": "2019-04-20"}}, "inverters": {}, "meters_0x0022": {"7D112E73": {"9_PVProduction": {"TotalE2Grid": 0, "AlwaysZero_off34_int2": 0, "seId": "7D112E73", "TotalEfromGrid": 0, "Flag_off20_hex": "00 80", "devLen": 58, "Flag_off28_hex": "00 80", "E2X": 0, "PfromX": "NaN", "P2X": 0.0, "Date": "2019-04-20", "AlwaysZero_off18_int2": 0, "Totaloff22_int4": 0, "Totaloff30_int4": 0, "Interval": 300, "EfromX": 0, "seType": "0x0022", "Time": "21:49:55", "onlyIntervalData": 1, "dateTime": 1555811395, "AlwaysZero_off26_int2": 0, "AlwaysZero_off10_int2": 0, "devType": "meters_0x0022", "Flag_off36_hex": "00 80", "Flag_off12_hex": "00 80", "recType": 9}, "1_UnrecognisedRecType": {"TotalE2Grid": 16986587, "AlwaysZero_off34_int2": 0, "seId": "7D112E73", "TotalEfromGrid": 4032, "Flag_off20_hex": "00 00", "devLen": 58, "Flag_off28_hex": "00 00", "E2X": 0, "PfromX": 0.0, "P2X": 0.0, "Date": 
"2019-04-20", "AlwaysZero_off18_int2": 0, "Totaloff22_int4": 260258, "Totaloff30_int4": 17624214, "Interval": 300, "EfromX": 0, "seType": "0x0022", "Time": "21:49:55", "onlyIntervalData": 0, "dateTime": 1555811395, "AlwaysZero_off26_int2": 0, "AlwaysZero_off10_int2": 0, "devType": "meters_0x0022", "Flag_off36_hex": "00 00", "Flag_off12_hex": "00 00", "recType": 1}}}, "events": {}, "optimizers": {}}

(Apologies for the formatting above, just trying to illustrate what to look for).

It has been a while since I deciphered the meters_0x0022 group of messages from my own inverter, so my memory of the details has faded, but those Unknown_device_0x____ and Undeciphered_data messages are coming from the code I drafted at that time as a sort of backstop (not very useful except as input to further work). When the code that was deciphering the 0x0022 message came across a different message type (in the examples above 0x0040, 0x0041, etc.) that had not yet been deciphered by anyone, it reported the message type and effectively did a hex dump of the undeciphered message.
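As a first step in reading such a dump, here is a minimal sketch (not part of the solaredge codebase; the helper name `ascii_runs` is made up for illustration) that turns an Undeciphered_data string back into bytes and pulls out runs of printable ASCII. Applied to the 0x0042 dump above, it suggests that message carries identification strings rather than energy totals:

```python
import re

def ascii_runs(hex_dump, min_len=2):
    """Return printable-ASCII runs found in an 'Undeciphered_data' hex dump."""
    # The dump separates 16-byte groups with '|'; treat it as whitespace.
    raw = bytes(int(tok, 16) for tok in hex_dump.replace("|", " ").split())
    # Match min_len or more consecutive printable characters (space to '~').
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.group().decode("ascii") for m in re.finditer(pattern, raw)]

dump_0x0042 = (
    "02 01 00 34 30 35 37 33 33 36 00 57 61 74 74 4e | "
    "6f 64 65 00 52 57 4e 44 2d 33 44 2d 32 34 30 2d | "
    "4d 42 00 32 34 00 09 01 00 00 00 00"
)
print(ascii_runs(dump_0x0042))
# ['4057336', 'WattNode', 'RWND-3D-240-MB', '24']
```

Messages that decode to nothing but strings like these are probably poor candidates for where missing numeric data such as Etot has gone.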

It is a possibility (but not a certainty) that some update to the software on your inverters now emits, in a new message type, the information that you used to get from messages which have already been decoded. If so, someone would need to decipher the new message formats. That's tricky, to say the least, which is why the work already done by the main authors is so impressive! It requires capturing known values (from your inverter screens, or the SolarEdge website), matching them against the messages sent (with the very same timestamp!), and then figuring out how those values are encoded in the Undeciphered_data hex string.

Of course I could be entirely off the track as well.

Anyway, hope that helps a tiny bit with understanding what's going on.

Regards

Geoff

@Geoff99 - Thanks for the reply. I didn't know you were behind the Unknown_device_0x____ and Undeciphered_data messages but do appreciate your work in adding those messages to help decipher the unknown, your reply today and, as you note, the very impressive work by @jbuehl and others. And your confirmation that it isn't the key is reassuring as well.

Any recommendations on what data I can gather and steps I can take to try to help decode these new messages? Would the raw pcaps be of any use in helping decipher the change? I still have all of those.

I see the protocol changed in Issue #8 but that pre-dates my install so I'm still reading to see if there is anything historical I can use to gather more useful data about what changed this time.

Hi @JustApu,

Having the raw pcaps is helpful, but to decipher a message (or to find a message that contains the data you want), what you really need is to match individual messages with the corresponding "true and known" values from the "same" time point.

"True and known" values can be found locally, by toggling through the inverter menu and writing down the various values reported there (I spent hours doing that), or in some cases by logging in to your account on the SolarEdge website and looking at what is currently reported there. My memory is a bit hazy, but you may also be able to download a CSV of historical values from the website for some data fields.

My inverter transmits to the SolarEdge website every 5 minutes (when everything is working well), so "same" time point is really to the closest 5 minutes.
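A minimal sketch of that matching step, assuming you have written your known values down as `(epoch_seconds, value)` pairs. The helper name, the sample readings and the 150-second tolerance (half of a 5-minute interval) are all illustrative assumptions, not anything taken from the project:

```python
def nearest_reading(message_time, readings, tolerance=150):
    """Return the (epoch, value) reading closest to message_time, or None.

    readings  -- iterable of (epoch_seconds, value) pairs noted by hand
    tolerance -- maximum allowed gap in seconds (assumed: half of 5 minutes)
    """
    best = min(readings, key=lambda r: abs(r[0] - message_time), default=None)
    if best is None or abs(best[0] - message_time) > tolerance:
        return None
    return best

# 1555811352 is the "dateTime" of the messages quoted earlier in this thread;
# the readings themselves are made-up placeholders.
known = [(1555811100, 101.5), (1555811400, 102.0)]
print(nearest_reading(1555811352, known))
```

A message with no reading inside the tolerance returns `None`, which is a hint to skip it rather than compare it against a value from a different interval.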

If you have that matched data, and want to have a go at finding where and how the known true values are encoded in the undeciphered messages, you also need to do some detective work, (and have a fair bit of patience).

If you are OK with Python (or willing to have a go anyway), ParseDevice can help. Have a look at [README.ParseDevice.md](https://github.com/jbuehl/solaredge/blob/master/README.ParseDevice.md), which is included in the top level directory of the GitHub repository. Rereading it in hindsight, it is quite lengthy (sorry), and jumps in at the deep end (it assumes you have deciphered a new message type, and now want to add that to the SolarEdge project codebase). But right at the end it talks about a strategy for actually deciphering a new message type, and explains how a debug (ParseDevice_Explorer) option in the ParseDevice codebase can help you. The option is turned off in production mode, but you could make a branch (or a copy some other way) and switch it on.

Somewhere around line 171 in se/data.py you will find the following code.

            # In production would usually set explorer to False, to prevent excessively long (and mostly useless) parse
            # results for unknown device types.
            parsedDevice = ParseDevice(
                data[dataPtr - devHdrLen:dataPtr + devLen], explorer=False)

If you change that to read explorer=True, and then run that version of semonitor.py over one of your pcap files, you'll get MUCH larger json output. With explorer=True, whenever an Unknown_device is encountered, as well as just putting out a nice hex version of the Undeciphered_data, it also works through the data, 2 bytes at a time, trying out multiple parsings of the data (byte, integer, float, …). Most of them will be absolute rubbish, but if you are lucky, some will look plausible, and if you are really lucky, some will match with the known and true values that you have acquired. In which case you are on your way to deciphering a new message type!
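To give a feel for the idea, here is a small standalone sketch of that kind of exploration. It is not the actual ParseDevice code: the function name `explore`, the output layout and the choice of candidate types are assumptions for illustration. Little-endian byte order is assumed because the 0x0017 dump above contains `73 2e 11 7d`, which is the inverter ID 7D112E73 read little-endian.

```python
import struct

def explore(data, step=2):
    """Try several interpretations of `data` at every `step`-byte offset."""
    rows = []
    for off in range(0, len(data), step):
        row = {"offset": off, "byte": data[off]}
        if off + 2 <= len(data):
            row["uint16_le"] = struct.unpack_from("<H", data, off)[0]
        if off + 4 <= len(data):
            row["uint32_le"] = struct.unpack_from("<I", data, off)[0]
            row["float32_le"] = struct.unpack_from("<f", data, off)[0]
        rows.append(row)
    return rows

# Six bytes taken from the 0x0017 dump quoted earlier in this thread.
for row in explore(bytes.fromhex("732e117d0000")):
    print(row)
```

At offset 0 the `uint32_le` candidate equals 0x7D112E73, the inverter ID, while the other candidates at that offset are noise. Spotting that sort of match against a value you already know is the whole game.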

That's about the best I can do to help, hope enough of it makes sense to help you.

Good luck,

Geoff

PS Final thought (but should have been first!) Before you go to all that trouble, you might try running the standard semonitor.py over some pcap files from before and after the change, and cross checking the Unknown_device messages that appear. If the Unknown_device messages were already there - then :-( I'm probably leading you in the wrong direction. But if (at least some of them) are new, then those new ones would be my prime suspects for where some of your missing data has been moved to.
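That cross-check could be scripted along these lines. It is only a sketch, and it assumes that each line of the semonitor output is one JSON object whose top-level keys include the `Unknown_device_0x____` names, as in the excerpts quoted earlier; the function names and file names are made up.

```python
import json

def unknown_types(lines):
    """Collect the Unknown_device_* keys seen in an iterable of JSON lines."""
    seen = set()
    for line in lines:
        try:
            record = json.loads(line)
        except ValueError:
            continue  # skip blank or malformed lines
        if isinstance(record, dict):
            seen.update(k for k in record if k.startswith("Unknown_device_"))
    return seen

def new_unknown_types(before_lines, after_lines):
    """Unknown device types present after the change but not before it."""
    return sorted(unknown_types(after_lines) - unknown_types(before_lines))

# Example usage (hypothetical file names):
# with open("before.json") as b, open("after.json") as a:
#     print(new_unknown_types(b, a))
```

An empty result would mean that no new unknown message types appeared, which would count against the "data moved to a new message type" theory.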

@JustApu

PS One of the trickier things about decoding the meters_0x0022 ("meters" is just the human friendly name I gave the 0x0022 message type) is that one of the earlier fields in the 0x0022 message is a record type field. The interpretation of the later fields in the 0x0022 message depends on the value of the record type field. So, another possibility is that a new record type has been added to the 0x0022 message, and your missing data now appears there. In short, having a close look through the detailed components of the json from 0x0022 messages from before and after the change could be another useful avenue for investigation.
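One rough way to do that comparison is to count how often each record type appears in the 0x0022 messages of a given file. This sketch assumes one JSON object per line and the nesting visible in the excerpt earlier in this thread (`meters_0x0022` → inverter ID → record name → fields including `recType`); the function name is made up.

```python
import json
from collections import Counter

def rectype_counts(lines):
    """Count recType values found under 'meters_0x0022' in JSON lines."""
    counts = Counter()
    for line in lines:
        try:
            record = json.loads(line)
        except ValueError:
            continue  # skip blank or malformed lines
        if not isinstance(record, dict):
            continue
        for per_device in record.get("meters_0x0022", {}).values():
            for fields in per_device.values():
                if isinstance(fields, dict) and "recType" in fields:
                    counts[fields["recType"]] += 1
    return counts

# Example usage (hypothetical file name):
# with open("after.json") as f:
#     print(rectype_counts(f))
```

Running it over a file from before the change and one from after, then comparing the two counters, should show quickly whether a record type has appeared or disappeared.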

Once again,

Good luck

Hi @JustApu ,

Not sure whether it will help much (more likely not), but I refreshed my memory of where my json files are backed up. (I've had SolarEdge running on a dedicated RPi for a couple of years now, and it's been a set and forget situation.) I had a quick look through them and:

  • I'm still getting Etot values (up to yesterday anyway)

  • I sampled a few randomly chosen files as far back as Dec 2018, and even then I was getting some Unknown_device_0x____ messages appearing (viz 0041, 0042, 0043, 0044, 0017, 0018).

  • The Undeciphered_data entries all seem to belong to Unknown_device_ entries (only a cursory look at this, may have overlooked some entries)

Overall that has me scratching my head a bit. It kind of reduces the likelihood of the hypotheses I posted above. I haven't looked into the sizes etc of the assorted messages over time - they may have got bigger or smaller - which could fit within the conjectures above, but less likely I guess. Unless of course you find some other Unknown_device_ messages that aren't being sent by my inverter … ?

You've got me curious now though, so if you do make any progress, pls let us know.

Regards

Geoff

@Geoff99,

Just wanted to write back with some thanks.

After multiple days of bad data, I had disabled my nightly scripts that process the PCAP files and was just working on figuring out what had changed. That's when I opened this issue. Based on your suggestions above, I grabbed one of the latest PCAP files and started comparing it to one from a month prior. They looked very similar and I was puzzled. Eventually, I scratched my head enough and brought over all of the unprocessed PCAP files and ran my normal nightly scripts against them, one by one. Apparently, two or three days after I started having the problem, the problem went away. Or, at least, my normal nightly scripts seem to have started finding the data I care about again.

Thank you for all your advice and help.

Apu

P.S. I double-checked the SolarEdge portal and they do have data for my inverters for the missing days. So there is still some mystery but it will have to wait for some other day to be solved.

Hi @JustApu ,

Thanks for the feedback, and glad you got it sorted out.

In terms of the "mystery" of how the SolarEdge portal has data for the missing days, I am pretty sure that the inverters keep a reasonable amount of "back" data stored locally. At least I have observed that when my net connection has been down (for non-inverter-related reasons), the "missing" data appears in the pcap files and on the SolarEdge portal within 12-24 hours of the connection being restored. Whether the inverter is clever enough to know how long it has been offline, or whether the portal sends a request specifying how much data it is missing, I don't know, but one way or the other things get brought up to date.

What went wrong in the first place with your set up may well have to remain a mystery.

Regards

Geoff

The inverter has a fixed amount of storage available for caching performance data, and how many days this equates to will vary depending on factors such as how many panels you have and how much energy is being produced. I have been offline for days and have seen it catch up. Some people have seen them store up to a month of data (https://forums.moneysavingexpert.com/showthread.php?t=5034684). The way it works is that when the inverter sends a 0500 message and the message is acknowledged with a 0080 message, it assumes the data has been received and will not send it again. If it doesn't receive an acknowledgement, it will save that data and keep trying to send it until it succeeds. While messages can't be sent, the data accumulates until the storage is full, and then the oldest data is discarded to make room for new data.
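The behaviour described above can be modelled roughly as a bounded queue. This is only a toy model, not inverter firmware: it assumes a fixed record capacity (the real limit is a storage size, not a record count) and a hypothetical `send` callback that returns True when the 0080 acknowledgement arrives.

```python
from collections import deque

class StoreAndForward:
    """Toy model of the inverter's cache of unacknowledged data."""

    def __init__(self, capacity):
        # A deque with maxlen drops the oldest item when a new one is
        # appended to a full queue, like the oldest data being discarded.
        self.pending = deque(maxlen=capacity)

    def record(self, sample):
        self.pending.append(sample)

    def flush(self, send):
        """Send oldest-first; stop at the first unacknowledged message."""
        sent = 0
        while self.pending:
            if not send(self.pending[0]):  # no 0080 ack: keep it and retry later
                break
            self.pending.popleft()         # acked: never sent again
            sent += 1
        return sent

cache = StoreAndForward(capacity=3)
for sample in range(1, 6):        # five intervals pass while offline
    cache.record(sample)
print(list(cache.pending))        # [3, 4, 5]
print(cache.flush(lambda s: True))  # 3
```

In this model, samples 1 and 2 are lost because the cache filled up while offline, which would match data eventually going missing after a long enough outage.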