jaysoffian/eap_proxy

Re-authenticating every 20-180 seconds?

tobiasmcnulty opened this issue · 17 comments

Previously I could go weeks at a time without seeing anything in my eap_proxy log file. Recently, I started seeing the remote link send a Failure packet followed by a Request that initiated the authentication sequence all over again (sometimes even as soon as 10 seconds after completing the authentication sequence).

The logs for a single Failure and re-auth look like this (eth0 being my WAN, eth2 the AT&T device):

[2019-11-10 19:14:16,350]: eth0: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v1, len 4, Failure (4) id 0, len 4 [0] > eth2
[2019-11-10 19:14:16,350]: eth0: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v1, len 15, Request (1) id 1, len 15 [11] > eth2
[2019-11-10 19:14:16,377]: eth2: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v2, len 22, Response (2) id 1, len 22 [18] > eth0
<snip>
[2019-11-10 19:14:24,079]: eth2: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v2, len 6, Response (2) id 8, len 6 [2] > eth0
[2019-11-10 19:14:24,108]: eth0: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v1, len 4, Success (3) id 8, len 4 [0] > eth2

In the ~8 minutes that it took me to write this post, all these re-auths (7 in total) led to about ~23% packet loss:

--- 8.8.8.8 ping statistics ---
462 packets transmitted, 355 packets received, 23.2% packet loss
round-trip min/avg/max/stddev = 11.484/12.799/35.356/1.823 ms

The time between re-auths is not consistent:

$ grep Success eap_proxy-stdout---supervisor-qDpAVS.log
[2019-11-10 19:19:01,349]: eth0: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v1, len 4, Success (3) id 8, len 4 [0] > eth2
[2019-11-10 19:19:30,860]: eth0: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v1, len 4, Success (3) id 8, len 4 [0] > eth2
[2019-11-10 19:20:10,860]: eth0: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v1, len 4, Success (3) id 8, len 4 [0] > eth2
[2019-11-10 19:20:51,340]: eth0: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v1, len 4, Success (3) id 8, len 4 [0] > eth2
[2019-11-10 19:21:46,839]: eth0: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v1, len 4, Success (3) id 8, len 4 [0] > eth2
[2019-11-10 19:22:05,878]: eth0: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v1, len 4, Success (3) id 8, len 4 [0] > eth2
[2019-11-10 19:25:25,072]: eth0: xx:xx:xx:xx:xx:xx > xx:xx:xx:xx:xx:xx, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2

Has anyone else seen similar issues, or have ideas for debugging this? Is there some non-EAP traffic that AT&T is trying and failing to send to the modem, perhaps? I will try plugging it directly into the ONT converter for a day and then switch back again to see if that helps (and report back).

I don't know why the ONT would be sending an unprompted Failure packet. Seems like the Failure packet should be in response to something from the RG. Let's see all of the log lines matching "eap_proxy" not just the Success lines.

My RG sends an EAPOL start every 60 seconds which the proxy ignores as long as the connection is up.

It's possible AT&T was having issues. I ran the modem connected directly to the ONT for a few days, and it had some trouble staying connected at first too but in the past few days became more reliable. I started using eap_proxy again last night, and these are the only packets in the log since I switched back:

[2019-11-15 20:08:23,134]: starting proxy_loop
[2019-11-15 20:08:35,863]: eth0: xx:xx:xx:xx:xx:xx > 01:80:c2:00:00:03, EAP packet (0) v1, len 15, Request (1) id 1, len 15 [11] > eth2
[2019-11-15 20:08:35,889]: eth2: yy:yy:yy:yy:yy:yy > 01:80:c2:00:00:03, EAP packet (0) v2, len 22, Response (2) id 1, len 22 [18] > eth0
[2019-11-15 20:08:35,891]: eth0: xx:xx:xx:xx:xx:xx > 01:80:c2:00:00:03, EAP packet (0) v1, len 4, Failure (4) id 1, len 4 [0] > eth2
[2019-11-15 20:08:35,891]: eth0: xx:xx:xx:xx:xx:xx > 01:80:c2:00:00:03, EAP packet (0) v1, len 15, Request (1) id 2, len 15 [11] > eth2
[2019-11-15 20:08:36,930]: eth2: yy:yy:yy:yy:yy:yy > 01:80:c2:00:00:03, EAP packet (0) v2, len 22, Response (2) id 2, len 22 [18] > eth0
[2019-11-15 20:08:36,950]: eth0: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 6, Request (1) id 3, len 6 [2] > eth2
[2019-11-15 20:08:38,121]: eth2: yy:yy:yy:yy:yy:yy > 01:80:c2:00:00:03, EAP packet (0) v2, len 206, Response (2) id 3, len 206 [202] > eth0
[2019-11-15 20:08:38,227]: eth0: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 1020, Request (1) id 4, len 1020 [1016] > eth2
[2019-11-15 20:08:39,146]: eth2: yy:yy:yy:yy:yy:yy > 01:80:c2:00:00:03, EAP packet (0) v2, len 6, Response (2) id 4, len 6 [2] > eth0
[2019-11-15 20:08:39,165]: eth0: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 1020, Request (1) id 5, len 1020 [1016] > eth2
[2019-11-15 20:08:40,169]: eth2: yy:yy:yy:yy:yy:yy > 01:80:c2:00:00:03, EAP packet (0) v2, len 6, Response (2) id 5, len 6 [2] > eth0
[2019-11-15 20:08:40,189]: eth0: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 805, Request (1) id 6, len 805 [801] > eth2
[2019-11-15 20:08:41,484]: eth2: yy:yy:yy:yy:yy:yy > 01:80:c2:00:00:03, EAP packet (0) v2, len 1408, Response (2) id 6, len 1408 [1404] > eth0
[2019-11-15 20:08:41,511]: eth0: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 6, Request (1) id 7, len 6 [2] > eth2
[2019-11-15 20:08:42,510]: eth2: yy:yy:yy:yy:yy:yy > 01:80:c2:00:00:03, EAP packet (0) v2, len 1404, Response (2) id 7, len 1404 [1400] > eth0
[2019-11-15 20:08:42,537]: eth0: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 6, Request (1) id 8, len 6 [2] > eth2
[2019-11-15 20:08:43,534]: eth2: yy:yy:yy:yy:yy:yy > 01:80:c2:00:00:03, EAP packet (0) v2, len 893, Response (2) id 8, len 893 [889] > eth0
[2019-11-15 20:08:43,570]: eth0: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 69, Request (1) id 9, len 69 [65] > eth2
[2019-11-15 20:08:44,560]: eth2: yy:yy:yy:yy:yy:yy > 01:80:c2:00:00:03, EAP packet (0) v2, len 6, Response (2) id 9, len 6 [2] > eth0
[2019-11-15 20:08:44,603]: eth0: xx:xx:xx:xx:xx:xx > 01:80:c2:00:00:03, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2

The command I'm running is:

/usr/bin/python3 /path/to/eap_proxy.py --ignore-start --ignore-logoff --ignore-when-wan-up --ping-ip=8.8.8.8 --run-as nobody eth0 eth2

I'm going to close this for now since it seems like it was a short-term issue.

Did you ever run into this again? I'm seeing this on a much smaller scale - failures 2-5 times per day. Similarly, there's no consistency with how often this happens, but it's super frustrating. A single failure comes in from the router and then everything is reset, so my entire network goes offline for 30 seconds or so.

It did happen a few more times; I actually switched back to routing everything through the AT&T box for now, but it too goes down from time to time and I need to unplug everything and plug it back in.

So, I would hazard a guess that it's an AT&T issue. I haven't had the time yet to determine if there's any difference in behavior when packets are routed through the AT&T box or not. As you say there is no consistency, so it's a real pain to debug. 😖

Let me know if you discover anything new and I'll do the same!

An AT&T tech came out and replaced all the fiber connections, the ONT, and provided a shiny new BGW210 as well (had a Pace 5268 before). They said it was evident in their tests that the fiber connect was not good before, and has improved now.

I haven't tried eap_proxy again yet but I'll report back when I do.

@tobiasmcnulty That's good to hear. Hopefully it works out from here on out - I'll have to get them out here to take a look I suppose.

@tobiasmcnulty AT&T tech came out today replaced the ONT, but not my BGW-210. Unfortunately, I'm still seeing regular disconnects - every 2-6 hours just as before. In hindsight I probably should have asked him to replace the BGW as well as a "just in case" sort of thing... oops.

As of recently, the broadband LED on the RG blinks red continuously. This started in the past week, so I don't think it's necessarily the same issue that's causing disconnects but it's hard to imagine that they aren't related. Previously, once eap_proxy had done its job and the connection was auth'd, the broadband light would turn off. I've seen it blink green a few times - only while authenticating. Once that's done, it goes back to blinking red. Were you seeing this this behavior as well? I feel like I must have misconfigured something at some point while trying to troubleshoot this mess, but I can't for the life of me figure it out :/

I have no idea if this is possible so maybe @jaysoffian can chime in with some details on how this actually works, but -- I'm wondering if my problem is that the RG is doing something wrong which triggers a reauth. So the WAN connection is fine and everything is working, but something wonky happens in the RG that the ONT responds to with a failure. The WAN connection is still up, but the ONT and RG do the authentication dance because the RG is stupid and then eap_proxy resets the network interfaces once the reauth is successful. This causes all my devices to go offline for ~5-10 seconds.

edit: sadly, this didn't pan out either. Overnight I left the ethernet cable from the RG to the USG unplugged and everything worked for a few hours, but woke up to no internet and this in the logs: Feb 11 05:45:35 USGLivingRoom eap_proxy[3471]: eth0: 00:xx:xx:xx:ff:01 > 01:xx:xx:xx:00:03, EAP packet (0) v1, len 4, Failure (4) id 0, len 4 [0]. I plugged it back in this morning and it has failed again since then.

Here's the full set of logs from a failure this afternoon (with --debug set):

Feb 11 16:21:23 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > UNKNOWN_MAC_2, EAP packet (0) v1, len 4, Failure (4) id 0, len 4 [0]
Feb 11 16:21:23 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > UNKNOWN_MAC_2, EAP packet (0) v1, len 4, Failure (4) id 0, len 4 [0] > eth2
Feb 11 16:21:23 USGLivingRoom eap_proxy[3471]: eth2: sent 64 bytes
Feb 11 16:21:23 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > UNKNOWN_MAC_2, EAP packet (0) v1, len 15, Request (1) id 1, len 15 [11]
Feb 11 16:21:23 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > UNKNOWN_MAC_2, EAP packet (0) v1, len 15, Request (1) id 1, len 15 [11] > eth2
Feb 11 16:21:23 USGLivingRoom eap_proxy[3471]: eth2: sent 64 bytes
Feb 11 16:21:24 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 22, Response (2) id 1, len 22 [18]
Feb 11 16:21:24 USGLivingRoom eap_proxy[3471]: eth0.0: MY.IP.ADDR
Feb 11 16:21:25 USGLivingRoom eap_proxy[3471]: ping: ATT.WAN.GATEWAY.IP failed
Feb 11 16:21:25 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 22, Response (2) id 1, len 22 [18] > eth0
Feb 11 16:21:25 USGLivingRoom eap_proxy[3471]: eth0: sent 60 bytes
Feb 11 16:21:25 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 6, Request (1) id 2, len 6 [2]
Feb 11 16:21:25 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 6, Request (1) id 2, len 6 [2] > eth2
Feb 11 16:21:25 USGLivingRoom eap_proxy[3471]: eth2: sent 64 bytes
Feb 11 16:21:25 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 92, Response (2) id 2, len 92 [88]
Feb 11 16:21:25 USGLivingRoom eap_proxy[3471]: eth0.0: MY.IP.ADDR
Feb 11 16:21:26 USGLivingRoom eap_proxy[3471]: ping: ATT.WAN.GATEWAY.IP failed
Feb 11 16:21:26 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 92, Response (2) id 2, len 92 [88] > eth0
Feb 11 16:21:26 USGLivingRoom eap_proxy[3471]: eth0: sent 110 bytes
Feb 11 16:21:26 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 1020, Request (1) id 3, len 1020 [1016]
Feb 11 16:21:26 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 1020, Request (1) id 3, len 1020 [1016] > eth2
Feb 11 16:21:26 USGLivingRoom eap_proxy[3471]: eth2: sent 1038 bytes
Feb 11 16:21:26 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 6, Response (2) id 3, len 6 [2]
Feb 11 16:21:26 USGLivingRoom eap_proxy[3471]: eth0.0: MY.IP.ADDR
Feb 11 16:21:27 USGLivingRoom eap_proxy[3471]: ping: ATT.WAN.GATEWAY.IP failed
Feb 11 16:21:27 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 6, Response (2) id 3, len 6 [2] > eth0
Feb 11 16:21:27 USGLivingRoom eap_proxy[3471]: eth0: sent 60 bytes
Feb 11 16:21:27 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 1020, Request (1) id 4, len 1020 [1016]
Feb 11 16:21:27 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 1020, Request (1) id 4, len 1020 [1016] > eth2
Feb 11 16:21:27 USGLivingRoom eap_proxy[3471]: eth2: sent 1038 bytes
Feb 11 16:21:27 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 6, Response (2) id 4, len 6 [2]
Feb 11 16:21:27 USGLivingRoom eap_proxy[3471]: eth0.0: MY.IP.ADDR
Feb 11 16:21:28 USGLivingRoom eap_proxy[3471]: ping: ATT.WAN.GATEWAY.IP failed
Feb 11 16:21:28 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 6, Response (2) id 4, len 6 [2] > eth0
Feb 11 16:21:28 USGLivingRoom eap_proxy[3471]: eth0: sent 60 bytes
Feb 11 16:21:28 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 273, Request (1) id 5, len 273 [269]
Feb 11 16:21:28 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 273, Request (1) id 5, len 273 [269] > eth2
Feb 11 16:21:28 USGLivingRoom eap_proxy[3471]: eth2: sent 291 bytes
Feb 11 16:21:28 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 906, Response (2) id 5, len 906 [902]
Feb 11 16:21:28 USGLivingRoom eap_proxy[3471]: eth0.0: MY.IP.ADDR
Feb 11 16:21:29 USGLivingRoom eap_proxy[3471]: ping: ATT.WAN.GATEWAY.IP failed
Feb 11 16:21:29 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 906, Response (2) id 5, len 906 [902] > eth0
Feb 11 16:21:29 USGLivingRoom eap_proxy[3471]: eth0: sent 924 bytes
Feb 11 16:21:29 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 6, Request (1) id 6, len 6 [2]
Feb 11 16:21:29 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 6, Request (1) id 6, len 6 [2] > eth2
Feb 11 16:21:29 USGLivingRoom eap_proxy[3471]: eth2: sent 64 bytes
Feb 11 16:21:29 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 902, Response (2) id 6, len 902 [898]
Feb 11 16:21:29 USGLivingRoom eap_proxy[3471]: eth0.0: MY.IP.ADDR
Feb 11 16:21:30 USGLivingRoom eap_proxy[3471]: ping: ATT.WAN.GATEWAY.IP failed
Feb 11 16:21:30 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 902, Response (2) id 6, len 902 [898] > eth0
Feb 11 16:21:30 USGLivingRoom eap_proxy[3471]: eth0: sent 920 bytes
Feb 11 16:21:30 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 6, Request (1) id 7, len 6 [2]
Feb 11 16:21:30 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 6, Request (1) id 7, len 6 [2] > eth2
Feb 11 16:21:30 USGLivingRoom eap_proxy[3471]: eth2: sent 64 bytes
Feb 11 16:21:30 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 902, Response (2) id 7, len 902 [898]
Feb 11 16:21:30 USGLivingRoom eap_proxy[3471]: eth0.0: MY.IP.ADDR
Feb 11 16:21:31 USGLivingRoom eap_proxy[3471]: ping: ATT.WAN.GATEWAY.IP failed
Feb 11 16:21:31 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 902, Response (2) id 7, len 902 [898] > eth0
Feb 11 16:21:31 USGLivingRoom eap_proxy[3471]: eth0: sent 920 bytes
Feb 11 16:21:31 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 6, Request (1) id 8, len 6 [2]
Feb 11 16:21:31 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 6, Request (1) id 8, len 6 [2] > eth2
Feb 11 16:21:31 USGLivingRoom eap_proxy[3471]: eth2: sent 64 bytes
Feb 11 16:21:31 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 484, Response (2) id 8, len 484 [480]
Feb 11 16:21:31 USGLivingRoom eap_proxy[3471]: eth0.0: MY.IP.ADDR
Feb 11 16:21:32 USGLivingRoom eap_proxy[3471]: ping: ATT.WAN.GATEWAY.IP failed
Feb 11 16:21:32 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 484, Response (2) id 8, len 484 [480] > eth0
Feb 11 16:21:32 USGLivingRoom eap_proxy[3471]: eth0: sent 502 bytes
Feb 11 16:21:32 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 69, Request (1) id 9, len 69 [65]
Feb 11 16:21:32 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > RG_MAC, EAP packet (0) v1, len 69, Request (1) id 9, len 69 [65] > eth2
Feb 11 16:21:32 USGLivingRoom eap_proxy[3471]: eth2: sent 87 bytes
Feb 11 16:21:32 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 6, Response (2) id 9, len 6 [2]
Feb 11 16:21:32 USGLivingRoom eap_proxy[3471]: eth0.0: MY.IP.ADDR
Feb 11 16:21:33 USGLivingRoom eap_proxy[3471]: ping: ATT.WAN.GATEWAY.IP failed
Feb 11 16:21:33 USGLivingRoom eap_proxy[3471]: eth2: RG_MAC > UNKNOWN_MAC_2, EAP packet (0) v2, len 6, Response (2) id 9, len 6 [2] > eth0
Feb 11 16:21:33 USGLivingRoom eap_proxy[3471]: eth0: sent 60 bytes
Feb 11 16:21:33 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > UNKNOWN_MAC_2, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0]
Feb 11 16:21:33 USGLivingRoom eap_proxy[3471]: eth0.0: MY.IP.ADDR
Feb 11 16:21:34 USGLivingRoom eap_proxy[3471]: ping: ATT.WAN.GATEWAY.IP failed
Feb 11 16:21:34 USGLivingRoom eap_proxy[3471]: eth0.0: restarting dhclient
Feb 11 16:21:35 USGLivingRoom miniupnpd[3481]: ioctl(s, SIOCGIFADDR, ...): Cannot assign requested address
Feb 11 16:21:35 USGLivingRoom miniupnpd[3481]: Failed to get IP for interface eth0.0
Feb 11 16:21:35 USGLivingRoom miniupnpd[3481]: SendNATPMPPublicAddressChangeNotification: cannot get public IP address, stopping
Feb 11 16:21:38 USGLivingRoom ipsec-dhclient-hook: DHCP address updated to from MY.IP.ADDR: Updating ipsec configuration.
Feb 11 16:21:39 USGLivingRoom xl2tpd[17235]: death_handler: Fatal signal 15 received
Feb 11 16:21:39 USGLivingRoom xl2tpd[14691]: parse_config: line 3: set_ip: host '' not found
Feb 11 16:21:39 USGLivingRoom xl2tpd[14691]: init: Unable to load config file
Feb 11 16:21:39 USGLivingRoom eap_proxy[3471]: eth0: UNKNOWN_MAC_1 > UNKNOWN_MAC_2, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
Feb 11 16:21:39 USGLivingRoom eap_proxy[3471]: eth2: sent 64 bytes
Feb 11 16:21:44 USGLivingRoom ipsec-dhclient-hook: DHCP address updated to MY.IP.ADDR from MY.IP.ADDR: Updating ipsec configuration.
Feb 11 16:21:45 USGLivingRoom xl2tpd[14784]: setsockopt recvref[30]: Protocol not available

I have I have no idea what the MAC addresses are referencing there - I assume one of them is the ONT, but what in the world would the others be? @jaysoffian any idea? Could something as trivial as a bad ethernet cable or some sort of interference be causing this?

I'm seeing similar things again 😦. It was working fine for a few days, and then yesterday almost all day there was EAP traffic back and forth. Internet wasn't working at all at the end of it (not sure about earlier), so I power cycled the ONT and BGW210 and things have been fine with just a single re-auth since. I attached the full log below (nearly 6000 lines!).

BTW, I think the unknown MACs are standard bridge and EAP MAC addresses (which I guess is how some bridges/switches know not to pass this traffic?). I left them in my logs so you can compare (but replaced my BGW210 MAC with aa:tt:tt:mm:aa:cc).

eap_proxy_logs_2020-02-15.txt

It does feel like a physical link issue still. If I briefly unplug the ethernet or fiber on the ONT, I get the same re-auth sequence, sometimes starting with a Failure packet and sometimes without.

That said, when unplugging EITHER the fiber or ethernet, I can see the ethernet link go down briefly on the router, which I DON'T see when it's re-authenticating on its own every minute or two. 🤷‍♂ E.g.:

Feb 15 15:43:24 vyos kernel: [400645.911366] igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Down
Feb 15 15:43:24 vyos kernel: [400645.913328] br0: port 1(eth0.0) entered disabled state
Feb 15 15:43:32 vyos kernel: [400653.920030] igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
Feb 15 15:43:32 vyos kernel: [400653.921074] br0: port 1(eth0.0) entered blocking state
Feb 15 15:43:32 vyos kernel: [400653.921087] br0: port 1(eth0.0) entered forwarding state
vyos@vyos:/var/log$ grep "eth0 NIC Link is Down" messages.1
Feb 15 00:23:13 vyos kernel: [345434.343866] igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Down
Feb 15 00:26:36 vyos kernel: [345637.487856] igb 0000:01:00.0 eth0: igb: eth0 NIC Link is Down
vyos@vyos:/var/log$ grep "eth0 NIC Link is Down" messages.2
vyos@vyos:/var/log$ grep "eth0 NIC Link is Down" messages.3
vyos@vyos:/var/log$ grep "eth0 NIC Link is Down" messages.4
vyos@vyos:/var/log$ grep "eth0 NIC Link is Down" messages.5

Hmm. That’s odd. One thing I’ve thought about that I haven’t tried yet is just swapping out all the Ethernet cables between the USG, RG, and ONT with fresh ones... have you tried that? Seems as plausible as anything else... :/

@zcramos I've had bad cables before so it's certainly worth a try. FWIW, my ONT and RG are connected through a wall port and patch panel in the closet, so the cable run is probably longer than most. But it tests fine and I have tried swapping out the cables that I can replace, so 🤷‍♂. And to be clear, the ethernet link DOESN'T seem to be going down for me when it gets stuck in those re-auth cycles.

I wish the ONT had a private IP one could access to see the noise level on the fiber side; that was invaluable for diagnosing RF noise back when we had cable, and it turns out that's what the AT&T tech saw that caused them to come replace all those connections this time. He showed it to me on their tablet but said there probably isn't a way for us to see it ourselves.

@zcramos Another finding: If I unplug the RGW after it authenticates successfully once, the connection appears to stay up for a week or more.

Question: Are you passing --ignore-start to eap_proxy.py? I wasn't (recently at least, I don't recall if I was when I first opened this ticket), but I've enabled it now and am going to leave the RGW plugged in for awhile to see if that helps...

--ignore-start doesn't seem to have helped. It happened again yesterday:

vyos@vyos:/var/log$ grep Success messages.1|wc -l
68
vyos@vyos:/var/log$ grep Success messages|wc -l
34

I got it back online by rebooting the ONT.

This could be a complete coincidence, but I switched my ONT to a different power supply and I haven't noticed the issue since...

Changing the power supply did NOT fix the issue entirely for me. Whenever this happens now I simply reboot the ONT, and things go back to normal for a little while. In the log below, you can see:

  • I last rebooted my router on 5/24 (when the log messages start, this is OpenWRT)
  • The cycling started at 13:22 and continued until about 14:40 when I rebooted the ONT
  • I'm posting this around 15:02 and the connection has been completely stable since then
# grep Success messages 
May 24 17:15:55 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:22:23 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:27:39 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:31:00 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:31:29 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:33:22 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:36:20 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:38:38 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:40:05 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:41:50 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:43:07 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:45:02 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:45:35 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:46:57 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:48:09 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:49:10 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:49:34 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:50:22 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:50:56 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:52:26 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:53:06 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:56:35 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:57:17 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:58:23 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:59:05 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 13:59:34 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:00:48 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:01:05 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:03:48 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:07:04 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:09:47 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:11:48 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:12:45 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:13:20 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:14:28 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:15:05 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:15:50 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:16:27 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:16:54 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:17:31 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:18:26 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:18:45 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:19:03 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:19:36 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:20:44 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:21:13 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:22:43 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:23:05 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:23:40 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:23:55 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:24:32 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:25:19 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:26:40 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:28:35 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:29:07 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:30:31 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:30:54 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:31:57 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:32:31 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:33:18 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:33:51 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:38:24 OpenWrt eap_proxy[2011]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2
May 31 14:41:23 OpenWrt eap_proxy[4422]: eth1: xx:xx:xx:xx:xx:xx > yy:yy:yy:yy:yy:yy, EAP packet (0) v1, len 4, Success (3) id 9, len 4 [0] > eth2

I'm going to close this since I don't think there's anything eap_proxy can do to fix this, but hopefully this thread will at least be useful to someone else. I don't think it's specific to eap_proxy either, I think eap_proxy just gives us a way to see a specific issue with some AT&T fiber connections. 🤷

Should have updated a while ago, but I believe you’re correct as well. I moved to a new house last month, in another area served by AT&T Fiber. I haven’t had this issue at all since the move. It is absolutely an AT&T problem, and with enough phone calls and bitching I suspect you could get them to send techs out to test and replace the shared infrastructure further up the network.