XKNX/xknx

'TunnellingRequest with sequence number not equal to expected' error every few hours

Closed this issue · 10 comments

Every few hours 'Received TunnellingRequest with sequence number not equal to expected' error is stuck while the sequence finally loops around. During that time messages are sent but not read.

  • using xknx standalone
  • using Home-Assistant knx integration

Version information:

  • xknx / Home-Assistant release with the issue: 1.2.1
  • last working xknx / Home-Assistant release (if known): unknown

KNX installation:
configuration.yaml
knx.yaml.txt

Diagnostic data of the config entry (only when Home Assistant is used)
config_entry-knx-ed093e5771cbc5d18e4307a64fd3a504.json.txt

Traceback (if applicable):
log.txt

Hi 👋!

It seems like there is some package drop going on on your network - maybe you have some gear to monitor that?

What KNX interface are you using?

If there is a frame with unexpected sequence number it shall not be ACKed - which it is not according to the logs - and the server device shall repeat that frame. If that repeated frame is not ACKed it shall close the connection.

In your logs you can see

  • frame 122 was fine
  • 123 was never received, and thus never acked - but not repeated by the server
  • next frame was 124 - which was also not acked and thus repeated after 1 second
  • the repeated 124 was also not acked - which should result in a disconnect from the server - which never happened 😬
2022-11-22 12:12:00.054 DEBUG (KNX Interface) [xknx.raw_socket] Received from ('192.168.1.155', 3671): 06100420001504487a002e00bce011002100010080
2022-11-22 12:12:00.055 DEBUG (KNX Interface) [xknx.knx] Received from 192.168.1.155:3671 at 1669111920.055147:
 <KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="TUNNELLING_REQUEST" Reserve="0" TotalLength="21" />
 body="<TunnellingRequest communication_channel_id="72" sequence_counter="122" cemi="<CEMIFrame code="L_DATA_CON" src_addr="IndividualAddress("1.1.0")" dst_addr="GroupAddress("4/1/0")" flags="1011110011100000" tpci="TDataGroup()" payload="<GroupValueWrite value="<DPTBinary value="0" />" />" />" />" />
2022-11-22 12:12:00.055 DEBUG (KNX Interface) [xknx.knx] Sending to 192.168.1.155:3671 at 1669111920.0555031:
 <KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="TUNNELLING_ACK" Reserve="0" TotalLength="10" />
 body="<TunnellingAck communication_channel_id="72" sequence_counter="122" status_code="ErrorCode.E_NO_ERROR" />" />

2022-11-22 12:12:07.076 DEBUG (KNX Interface) [xknx.raw_socket] Received from ('192.168.1.155', 3671): 06100420001704487c002900bce01115190503008007e4
2022-11-22 12:12:07.077 DEBUG (KNX Interface) [xknx.knx] Received from 192.168.1.155:3671 at 1669111927.077707:
 <KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="TUNNELLING_REQUEST" Reserve="0" TotalLength="23" />
 body="<TunnellingRequest communication_channel_id="72" sequence_counter="124" cemi="<CEMIFrame code="L_DATA_IND" src_addr="IndividualAddress("1.1.21")" dst_addr="GroupAddress("3/1/5")" flags="1011110011100000" tpci="TDataGroup()" payload="<GroupValueWrite value="<DPTArray value="[0x7,0xe4]" />" />" />" />" />
2022-11-22 12:12:07.078 WARNING (KNX Interface) [xknx.log] Received TunnellingRequest with sequence number not equal to expected: 123. Discarding frame: <TunnellingRequest communication_channel_id="72" sequence_counter="124" cemi="<CEMIFrame code="L_DATA_IND" src_addr="IndividualAddress("1.1.21")" dst_addr="GroupAddress("3/1/5")" flags="1011110011100000" tpci="TDataGroup()" payload="<GroupValueWrite value="<DPTArray value="[0x7,0xe4]" />" />" />" />

2022-11-22 12:12:08.076 DEBUG (KNX Interface) [xknx.raw_socket] Received from ('192.168.1.155', 3671): 06100420001704487c002900bce01115190503008007e4
2022-11-22 12:12:08.077 DEBUG (KNX Interface) [xknx.knx] Received from 192.168.1.155:3671 at 1669111928.0779228:
 <KNXIPFrame <KNXIPHeader HeaderLength="6" ProtocolVersion="16" KNXIPServiceType="TUNNELLING_REQUEST" Reserve="0" TotalLength="23" />
 body="<TunnellingRequest communication_channel_id="72" sequence_counter="124" cemi="<CEMIFrame code="L_DATA_IND" src_addr="IndividualAddress("1.1.21")" dst_addr="GroupAddress("3/1/5")" flags="1011110011100000" tpci="TDataGroup()" payload="<GroupValueWrite value="<DPTArray value="[0x7,0xe4]" />" />" />" />" />
2022-11-22 12:12:08.078 WARNING (KNX Interface) [xknx.log] Received TunnellingRequest with sequence number not equal to expected: 123. Discarding frame: <TunnellingRequest communication_channel_id="72" sequence_counter="124" cemi="<CEMIFrame code="L_DATA_IND" src_addr="IndividualAddress("1.1.21")" dst_addr="GroupAddress("3/1/5")" flags="1011110011100000" tpci="TDataGroup()" payload="<GroupValueWrite value="<DPTArray value="[0x7,0xe4]" />" />" />" />

Thanks for response.

Never looked into network package monitoring. Both devices are connected straight to a good quality switch. I would suspect something else first like KNX interface.

Im using a second-hand JUNG KNX IP-Router IPR 100 REG. Theres a reasonable chance that my config (ETS or HASS) could be wrong since this is my first setup riddled by a lot of trial and error experiments.

I've seen similar errors from bad Lan cables, so never say never.

On the other hand it's unlikely that wrong ETS / HA config results in such an error.

Maybe you can intercept traffic at the Knx interface with Wireshark?
You can also check if the interface provides logs on its own (web interface or telnet etc.).

Never used Wireshark and didnt manage to get it working correctly in reasonable amount of time.

What i did instead was connect to Group Monitor via ETS and wait for the issue to appear with KNX integration. I doubt its enough for you to go on but ETS had a disconnect at the same time - so i guess this points to an issue with my KNX IP Router.

group_monitor.txt
hass_knx.txt

ETS handled it by simply recconnecting. Could the same approach be used by KNX integration?

I suppose we can implement a reconnect for that.

Nevertheless, I recommend checking your cabling and network gear. I guess something is dropping packets there (and the interface doesn't handle that properly).
Keep in mind that reconnecting HA results in requesting all states again to be sure we are in sync. This implies additional bus / network load which could theoretically result in more package loss... I mean it's better than having a defunct connection, but still you should find the root of the problem instead of curating symptoms.

Reconnection traffic might not be as bad as completelly out of sync values while counter fully loops.

Will try to move KNX interface further from power lines.
Thanks for help.

It would still be interesting how ETS does its reconnection in that case. Unfortunately I'd need a tcpdump to see that - an ETS log doesn't show that. And I have no idea how to reproduce that with my own setup.
You can install Wireshark next to ETS and just let it listen to the used network interface (LAN or Wifi). If you need help, you may also contact me via Discord 👾

For a really quick look at how to use Wireshark: https://youtu.be/QjR57iTPzMc?t=60

Just found out about this as I was struggling with my KNX interface as well having exactly the same issue (perhaps the interface is not dealing UDP retransmissions properly?). I solved the issue completely by switching to TCP Tunneling mode.

I understand that this might not suit specific use-cases, but it definitely helps when TCP overhead is not an issue.

@Mikepicker what interface manufacturer / model are you using? Also JUNG KNX IPR 100 REG?

If your interface supports it, TCP is a better option regardless. There are no sequence numbers and TunnelingAck packets needed on TCP so I guess overhead is more or less equal.

I'm using a Zennio AllInBOX 1612 as the KNX Gateway