Occasional Full-Sync actually causes Sync-Loss with 2/4 slots (Chroma29)
xsrf opened this issue · 15 comments
Hey,
I just noticed that the occasional full-sync (which is a Wake-Up phase followed by longer sync pulses on the configured channel) actually causes a loss of sync for the Chroma29 (and others, I guess).
The full-sync switches to the wake-up channel for 16s. If only 2 or 4 slots are used, labels in sync will actually miss enough sync pulses to fall back to the wake-up channel. However, they will only start listening to the wake-up channel 15s after the fall-back. When they do, the full-sync is already back on the configured channel, so the label won't wake up again.
I've also noticed that labels go back in sync on their own after ~15-30min. I guess they are listening to their configured channel periodically and looking for syncs.
So the occasional full-sync may not really be necessary.
With 8/16/32 slots it's not really a problem because they won't miss 3 syncs in 16s and thus stay in sync.
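The timing argument above can be checked with a little arithmetic. The following is only a sketch: the 1.1 s slot duration is an assumption (roughly consistent with the 106 s figure for 0x06 periods at 16 slots mentioned elsewhere in this thread), the label is assumed to start missing syncs right when the wake-up phase begins, and all names are hypothetical rather than taken from the access point code.

```cpp
// Rough timing model of the occasional full sync (all names hypothetical).
// Assumption: one slot lasts about 1100 ms, so a sync period is num_slots * 1100 ms.
constexpr int kSlotMs = 1100;          // assumed slot duration
constexpr int kWakeupMs = 16000;       // time the AP spends on the wake-up channel
constexpr int kDropThreshold = 3;      // Chroma29 drops out after 3 missed syncs
constexpr int kListenDelayMs = 15000;  // label listens on wu 15 s after fall-back

// Number of syncs a label misses while the AP is away on the wake-up channel.
constexpr int missed_syncs(int num_slots) {
    return kWakeupMs / (num_slots * kSlotMs);
}

// True if the label misses enough syncs to fall back to the wake-up channel.
constexpr bool falls_back(int num_slots) {
    return missed_syncs(num_slots) >= kDropThreshold;
}

// Time (ms after the wake-up phase started) at which a fallen-back label
// begins listening on the wake-up channel.
constexpr int listen_start_ms(int num_slots) {
    return kDropThreshold * num_slots * kSlotMs + kListenDelayMs;
}

// True if the label is already listening while the wake-up phase is still running.
constexpr bool catches_wakeup(int num_slots) {
    return falls_back(num_slots) && listen_start_ms(num_slots) < kWakeupMs;
}
```

Under these assumptions, 2 and 4 slots fall back (7 and 3 missed syncs) but only start listening about 21.6 s and 28.2 s after the wake-up phase began, which is after it has already ended. With 8 or more slots at most one sync is missed, so the label stays in sync.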
Thanks for that,
that sounds very plausible, and the occasional full sync could actually be removed.
Another solution is to calculate the max missed syncs better.
Here for the new method:
https://github.com/atc1441/E-Paper_Pricetags/blob/main/Custom_PriceTag_AccesPoint/ESP32_Async_PlatformIO/RFV3/mode_wun_activation.h#L125
And here for the old one:
https://github.com/atc1441/E-Paper_Pricetags/blob/main/Custom_PriceTag_AccesPoint/ESP32_Async_PlatformIO/RFV3/mode_activation.h#L179
I've set int rounds_to_resync = INT_MAX; for now to basically disable the automatic sync.
I also found that running the longer sync pulses increases the chance that a label recovers from the wake-up channel.
Interestingly, while a longer sync pulse shows up in the power draw of the label, the label won't always recover to sync.
I just checked the calculation of the max missed syncs; it should not interrupt the sync.
Let's say we have 2 slots, so max missed is 60/2 = 30, which would be equivalent to about 60 seconds of missed syncs.
The full sync will be handled differently, as the displays know that it's a full sync from the Periods per slot setting.
It's just a question of how they handle it differently.
The automatic sync is also used in the stock software, and it needs to be possible in case some display is not contactable.
Only the Chroma29 has problems with losing the sync;
on the other displays the sync is still there after a full sync/wake-up.
The Sync is lost because the occasional wake-up stops all syncs on the regular channel for 15s.
The Chroma29 drops out when it misses 3 Syncs in a row.
The full sync is fine for displays already in sync.
I understand, but it should only drop out when not receiving syncs for about 60 seconds,
and this is done the same way in the stock firmware.
There is definitely a bug that makes it lose the sync; the question is just what it is.
The stock firmware sets the max sync periods to 0x06 with 16 slots, so 106 seconds.
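The two timeout figures quoted above can be reproduced with a small sketch. The 1.1 s slot duration is an assumption chosen because it makes 0x06 periods at 16 slots come out at roughly 106 s; the function name is hypothetical and not part of the firmware or the access point code.

```cpp
// Sketch of the sync-loss timeout (names hypothetical, slot duration assumed).
constexpr int kSlotMs = 1100;  // assumed duration of one slot in milliseconds

// Time without syncs, in ms, before a label should give up:
// max_sync_periods missed periods of num_slots slots each.
constexpr long timeout_ms(int max_sync_periods, int num_slots) {
    return static_cast<long>(max_sync_periods) * num_slots * kSlotMs;
}
```

With these numbers, `timeout_ms(0x06, 16)` is 105600 ms (about 106 s), and `timeout_ms(30, 2)` is 66000 ms, in the same range as the "about 60 seconds" estimated for 2 slots.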
I've got only the current draw as indicator, but the Chroma29 switches to 15s wake-up interval when it misses 3 syncs, regardless of being 2 slots or 16 slots configuration.
When I then start the long/full sync (not wu - I've changed the button so it skips wu) it sometimes comes back:
(2 slots configuration)
But sometimes it won't come back in sync, even though it clearly received something...
Found the issue ;)
get_num_slots() returns always 16 in the activation process and thus calculates to 3. temp_num_slots should be what you expected here.
Ohh yes!
Will upload a fixed version just now
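The effect of the bug can be illustrated as follows, assuming the limit is computed as 60 divided by the slot count, as described earlier in the thread. The function name is hypothetical; only get_num_slots() and temp_num_slots are named in the discussion, and the real code is in the files linked above.

```cpp
// Hypothetical reconstruction of the max-missed-syncs calculation.
// Integer division: 60 / 16 truncates to 3.
constexpr int max_missed_syncs(int num_slots) {
    return 60 / num_slots;
}
```

If the slot count passed in is always 16, the result is 3 no matter what was configured, which matches the observed drop-out after 3 missed syncs. With the configured value (for example 2 or 4 slots) the limit becomes 30 or 15.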
I played around with the new activation a bit today.
As it is basically a fully synced communication, I added a full sync before trying to activate the display. It looks like it works way more often now.
I also changed the activation to use only 4 slots and a higher frequency, as most people will already be using freq 0.
I still have a problem getting the Chroma29 to sync in normal mode, but we'll get there :D
I actually didn't quite get the real difference (and benefit) between the two activation methods...
The "old" activation tells a specific device to go to a configured channel. This message is broadcast for 15s. The device will receive it and start listening on the requested channel 15s later. The broadcast contains no sync information, so the device doesn't know if it caught the start or the end of the 15s wake-up call. It needs to sync later and then you send it all the configuration data. Correct?
The "new" activation tells all devices in range (listening to wu) to go to a configured channel. This message is broadcast for 15s. The devices will receive it and start listening on the requested channel 15s later. They will listen for 1.2s in order to acquire sync. Once they have it, you can push configuration by addressing them with a display id derived from the last two bytes of the serial.
Both wake the devices and establish an intermediate communication channel. There you're able to persist the configuration.
The real difference is that the "new" one wakes all devices in range while the "old" one addresses a specific device. 🤔
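As a purely illustrative sketch of the addressing step in the "new" method, a display id could be formed from the last two bytes of the serial like this. The byte order (second-to-last byte as the high byte) and the function name are assumptions; the thread only says the id is derived from those two bytes, not how.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical: build a 16-bit display id from the last two bytes of a serial.
// Assumption: the second-to-last byte is the high byte of the id.
template <std::size_t N>
constexpr std::uint16_t display_id_from_serial(const std::uint8_t (&serial)[N]) {
    static_assert(N >= 2, "serial needs at least two bytes");
    return static_cast<std::uint16_t>((serial[N - 2] << 8) | serial[N - 1]);
}
```

For a made-up serial ending in the bytes 0x12 0x34, this yields the id 0x1234.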
You got that right.
The new method wakes up all displays that are not yet activated, so a lot of them can be configured at once; the old method needs to wake up every display on its own. Now think about 20k displays that need to be activated: the software will handle it both ways, but the new method is way faster.
Well, I have to correct myself... When the EPOP50 wakes up from a wake-up call it continues to listen until the wake-up call is finished. The Chroma29 however goes back to sleep and wakes up later. This means the EPOP50 actually does kind of sync during the wake-up call because it knows when it finished...
Correct
That is also the reason why the wake-up call cannot simply be 50 seconds long to have better chances: the displays would then stop listening due to a timeout.
Should this be fixed?
I'm observing a similar behavior (or could also be #15) with my Chroma29s:
- configured with 4 slots, just one display in use
- after activation everything works
- the next day, nothing works until I reboot the esp32
- I also don't see the occasional sync happening anymore, so maybe that's the culprit?