gabe565/CastSponsorSkip

CastSponsorSkip panics after TV turns off and can't reconnect until it is restarted

patmead opened this issue · 4 comments

CastSponsorSkip works perfectly the first time the TV is turned on, but after the TV is turned off CastSponsorSkip is unable to reconnect again until the systemd service has been restarted,

  • Version: CastSponsorSkip v0.6.0 (afeb347)
  • OS: Raspbian 11 (bullseye)
  • Package: gabe565 apt repository
  • Init: systemd
  • Target device: Android TV with Youtube app

Config:

[Service]
Environment="CSS_PAUSED_INTERVAL=20s"
Environment="CSS_CATEGORIES=sponsor,selfpromo,interaction"
Environment="CSS_YOUTUBE_API_KEY=[REMOVED]"
Environment="CSS_LOG_LEVEL=debug"

Logs:

Nov 14 12:19:26 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:19:26.818Z level=DEBUG msg="Requesting update." device=TV
Nov 14 12:19:46 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:19:46.819Z level=DEBUG msg="Requesting update." device=TV
Nov 14 12:20:06 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:20:06.818Z level=DEBUG msg="Requesting update." device=TV
Nov 14 12:20:41 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:20:41.829Z level=DEBUG msg="Failed to update device. Reconnecting..." device=TV error="context deadline exceeded"
Nov 14 12:20:44 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:20:44.832Z level=DEBUG msg="Failed to connect to device. Retrying..." device=TV try=0 error="unable to connect to chromecast at '192.168.1.30:8009': dial tcp 192.168.1.30:8009: i/o timeout"
Nov 14 12:20:50 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:20:50.345Z level=DEBUG msg="Failed to connect to device. Retrying..." device=TV try=1 error="unable to connect to chromecast at '[<nil>]:0': dial tcp: lookup <nil>: no such host"
Nov 14 12:20:56 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:20:56.355Z level=DEBUG msg="Failed to connect to device. Retrying..." device=TV try=2 error="unable to connect to chromecast at '[<nil>]:0': dial tcp: lookup <nil>: no such host"
Nov 14 12:21:03 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:21:03.378Z level=DEBUG msg="Failed to connect to device. Retrying..." device=TV try=3 error="unable to connect to chromecast at '[<nil>]:0': dial tcp: lookup <nil>: no such host"
Nov 14 12:21:12 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:21:12.387Z level=DEBUG msg="Failed to connect to device. Retrying..." device=TV try=4 error="unable to connect to chromecast at '[<nil>]:0': dial tcp: lookup <nil>: no such host"
Nov 14 12:21:25 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:21:25.401Z level=DEBUG msg="Failed to connect to device. Retrying..." device=TV try=5 error="unable to connect to chromecast at '[<nil>]:0': dial tcp: lookup <nil>: no such host"
Nov 14 12:21:46 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:21:46.411Z level=DEBUG msg="Failed to reconnect." device=TV error="device not found: "
Nov 14 12:21:46 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:21:46.417Z level=ERROR msg="Lost connection to device." device=TV error="context deadline exceeded"
Nov 14 12:21:46 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:21:46.428Z level=DEBUG msg="Panic during close" device=TV error="runtime error: invalid memory address or nil pointer dereference" stack="goroutine 18 [running]:\nruntime/debug.Stack()\n\truntime/debug/stack.go:24 +0x78\ngithub.com/gabe565/castsponsorskip/internal/device.(*Device).closeApp.func1()\n\tgithub.com/gabe565/castsponsorskip/internal/device/watch.go:118 +0x48\npanic({0x7f1fc8, 0xe17ec8})\n\truntime/panic.go:914 +0x244\ncrypto/tls.(*Conn).Close(0x0)\n\tcrypto/tls/conn.go:1403 +0x14\ngithub.com/vishen/go-chromecast/cast.(*Connection).Close(0x2e9e020)\n\tgithub.com/vishen/go-chromecast@v0.3.2-0.20231028022757-c2af7da8e4b9/cast/connection.go:82 +0xbc\ngithub.com/vishen/go-chromecast/application.(*Application).Close(0x3050090, 0x0)\n\tgithub.com/vishen/go-chromecast@v0.3.2-0.20231028022757-c2af7da8e4b9/application/application.go:431 +0x15c\ngithub.com/gabe565/castsponsorskip/internal/device.(*Device).closeApp(0x2ca5200)\n\tgithub.com/gabe565/castsponsorskip/internal/device/watch.go:122 +0x7c\ngithub.com/gabe565/castsponsorskip/internal/device.(*Device).Close(0x2ca5200)\n\tgithub.com/gabe565/castsponsorskip/internal/device/watch.go:111 +0xa8\ngithub.com/gabe565/castsponsorskip/cmd.run.func1.1()\n\tgithub.com/gabe565/castsponsorskip/cmd/cmd.go:104 +0xb0\ncreated by github.com/gabe565/castsponsorskip/cmd.run.func1 in goroutine 11\n\tgithub.com/gabe565/castsponsorskip/cmd/cmd.go:101 +0x48\n"
Nov 14 12:21:52 raspberrypi castsponsorskip[12350]: time=2023-11-14T12:21:52.479Z level=DEBUG msg="Ignoring device." device=TV reason="Already connected"

@patmead Interesting. I'll need to troubleshoot this one a little bit. The go-chromecast library that I use to communicate with devices sometimes panics when closing connections, which should be fine because CastSponsorSkip has a logic to recover from these. For some reason, it seems like the panic is causing CastSponsorSkip to fail to remove the device from the list of active connections, so it never tries to reconnect. This logic shouldn't ever be skipped, so I'm not sure what's causing the issue yet.

I'll keep you posted on a fix!

Ah I believe I see the issue. When it tries to reconnect, some of the device identifiers get reset, causing it to never be removed.

@patmead Just released v0.6.1 which includes a fix for this! Reconnections are working again on my end. Would you mind verifying?

Looks good on my end too - thanks @gabe565!