netbrain/zwift

[SOLVED] keeps crashing after 20 minutes

Closed this issue · 15 comments

SOLVED:

The "Video Screenshots" feature in the game settings was causing the game to use more and more memory until it crashed. Disabling this feature fixes the crash.

Thanks everyone who helped with this


Hello,

Firstly, thanks for all your work on this. I've been a Zwift user since beta and it's a dream come true to finally be able to run it on Linux.

I've run the game 3 times now, and each time it crashes around 25 minutes into the ride. Any ideas on how to troubleshoot this issue?

here's a chunk of my journal log just before the crash

Oct 02 19:16:32 [redacted] systemd[1]: Starting fwupd-refresh.service - Refresh fwupd metadata and update motd...
Oct 02 19:10:03 [redacted] angry_jepsen[698494]: 11e0:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706ba
Oct 02 19:10:03 [redacted] angry_jepsen[698494]: 11e0:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706ba
Oct 02 19:10:02 [redacted] angry_jepsen[698494]: 0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:10:02 [redacted] angry_jepsen[698494]: 0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:10:02 [redacted] angry_jepsen[698494]: 11e0:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:10:01 [redacted] drkonqi-coredump-launcher[747490]: Nothing handled the dump :O
Oct 02 19:10:01 [redacted] drkonqi-coredump-launcher[747490]: QFile::remove: Empty or null file name
Oct 02 19:10:01 [redacted] drkonqi-coredump-launcher[747490]: Unable to find file for pid 747338 expected at "kcrash-metadata/747338.ini"
Oct 02 19:10:01 [redacted] systemd[1]: drkonqi-coredump-processor@115-747356-0.service: Deactivated successfully.
Oct 02 19:10:01 [redacted] systemd[1209]: Started drkonqi-coredump-launcher@115-747364-0.service - Launch DrKonqi for a systemd-coredump crash (PID 747364/UID 0).
Oct 02 19:10:01 [redacted] drkonqi-coredump-processor[747364]: "/opt/wine-stable/bin/wine64-preloader" 747338 "/var/lib/systemd/coredump/core.mscorsvw\\x2eexe.1000.a8dc208b362b4c4fac36dd23fd3193c3.747338.1696288200000000.zst"
Oct 02 19:10:01 [redacted] angry_jepsen[698494]: 0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:10:01 [redacted] systemd[1]: systemd-coredump@115-747356-0.service: Deactivated successfully.
Oct 02 19:10:01 [redacted] systemd-coredump[747359]: [🡕] Process 747338 (mscorsvw.exe) of user 1000 dumped core.
                                                    
                                                    Stack trace of thread 44505:
                                                    #0  0x00007f56aaf1e107 n/a (/opt/wine-stable/lib64/wine/x86_64-unix/ntdll.so + 0x3f107)
                                                    #1  0x0000000000000000 n/a (n/a + 0x0)
                                                    ELF object binary architecture: AMD x86-64
Oct 02 19:10:01 [redacted] angry_jepsen[698494]: 0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706bf
Oct 02 19:10:01 [redacted] angry_jepsen[698494]: 0584:err:rpc:RpcAssoc_BindConnection rejected bind for reason 0
Oct 02 19:10:01 [redacted] angry_jepsen[698494]: 110c:err:rpc:RpcAssoc_BindConnection rejected bind for reason 0
Oct 02 19:10:01 [redacted] angry_jepsen[698494]: 11e0:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:10:00 [redacted] angry_jepsen[698494]: 0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:10:00 [redacted] systemd[1]: Started drkonqi-coredump-processor@115-747356-0.service - Pass systemd-coredump journal entries to relevant user for potential DrKonqi handling.
Oct 02 19:10:00 [redacted] systemd[1]: Started systemd-coredump@115-747356-0.service - Process Core Dump (PID 747356/UID 0).
Oct 02 19:10:00 [redacted] angry_jepsen[698494]: 11e0:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:09:59 [redacted] angry_jepsen[698494]: 0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:09:59 [redacted] angry_jepsen[698494]: 0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:09:58 [redacted] angry_jepsen[698494]: 11e0:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:09:58 [redacted] angry_jepsen[698494]: 0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:09:58 [redacted] angry_jepsen[698494]: 0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:09:57 [redacted] angry_jepsen[698494]: 0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:09:57 [redacted] angry_jepsen[698494]: 0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
Oct 02 19:09:57 [redacted] drkonqi-coredump-launcher[747095]: Nothing handled the dump :O
Oct 02 19:09:57 [redacted] drkonqi-coredump-launcher[747095]: QFile::remove: Empty or null file name
Oct 02 19:09:57 [redacted] drkonqi-coredump-launcher[747095]: Unable to find file for pid 746930 expected at "kcrash-metadata/746930.ini"
Oct 02 19:09:57 [redacted] systemd[1]: drkonqi-coredump-processor@114-746954-0.service: Deactivated successfully.
Oct 02 19:09:57 [redacted] systemd[1209]: Started drkonqi-coredump-launcher@114-746964-0.service - Launch DrKonqi for a systemd-coredump crash (PID 746964/UID 0).
Oct 02 19:09:57 [redacted] drkonqi-coredump-processor[746964]: "/opt/wine-stable/bin/wine64-preloader" 746930 "/var/lib/systemd/coredump/core.mscorsvw\\x2eexe.1000.a8dc208b362b4c4fac36dd23fd3193c3.746930.1696288195000000.zst"
Oct 02 19:09:57 [redacted] systemd[1]: systemd-coredump@114-746954-0.service: Consumed 1.145s CPU time.
Oct 02 19:09:57 [redacted] systemd[1]: systemd-coredump@114-746954-0.service: Deactivated successfully.
Oct 02 19:09:57 [redacted] systemd-coredump[746957]: [🡕] Process 746930 (mscorsvw.exe) of user 1000 dumped core.
                                                    
                                                    Stack trace of thread 44137:
                                                    #0  0x00007fb20f1be107 n/a (/opt/wine-stable/lib64/wine/x86_64-unix/ntdll.so + 0x3f107)
                                                    #1  0x0000000000000000 n/a (n/a + 0x0)
                                                    ELF object binary architecture: AMD x86-64
Oct 02 19:09:57 [redacted] angry_jepsen[698494]: 11e0:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be
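
For anyone reading the log above: the repeated `err:ole` lines end in HRESULT values. A minimal Python sketch for tallying and naming them follows. It assumes the codes are ordinary Win32-facility HRESULTs (failure bit set, facility 7), in which case the low 16 bits are a Win32 error number from winerror.h. The helper names `describe` and `tally` are made up for illustration, and the table only covers the three codes that appear in this log.

```python
import re
from collections import Counter

# Win32 RPC error codes as defined in winerror.h (only those seen in the log).
RPC_ERRORS = {
    1722: "RPC_S_SERVER_UNAVAILABLE",
    1726: "RPC_S_CALL_FAILED",
    1727: "RPC_S_CALL_FAILED_DNE",
}
FACILITY_WIN32 = 7

ERROR_RE = re.compile(r"failed with error (0x[0-9a-fA-F]{8})")


def describe(hresult: int) -> str:
    """Name an HRESULT that wraps a known Win32 RPC error, else return it as hex."""
    facility = (hresult >> 16) & 0x7FF  # bits 16-26 hold the facility
    code = hresult & 0xFFFF             # low 16 bits hold the error code
    if hresult & 0x80000000 and facility == FACILITY_WIN32 and code in RPC_ERRORS:
        return RPC_ERRORS[code]
    return f"0x{hresult:08x}"


def tally(lines):
    """Count 'failed with error' lines per decoded error name."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            counts[describe(int(match.group(1), 16))] += 1
    return counts


# Message parts copied from the journal excerpt above.
sample = [
    "11e0:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706ba",
    "0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706be",
    "0584:err:ole:ifproxy_release_public_refs IRemUnknown_RemRelease failed with error 0x800706bf",
    "0584:err:rpc:RpcAssoc_BindConnection rejected bind for reason 0",
]
for name, count in tally(sample).items():
    print(name, count)
```

All three codes are RPC failures (server unavailable, call failed), which would be consistent with a helper process such as mscorsvw.exe dying while other processes still hold references to it. That reading is a guess; the thread does not confirm it.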

OS: Debian sid
CPU: AMD ryzen 5600
RAM: 32GB
GPU: AMD RX580

thanks in advance

I've had similar consistent crashes back on 1.47.0 but it hasn't happened to me in a while.
Is your image up to date, 1.49.0?

Maybe there's something in the Zwift logs, too.

> I've had similar consistent crashes back on 1.47.0 but it hasn't happened to me in a while. Is your image up to date, 1.49.0?

yes, it's up to date. i saw that it downloaded a new image today when i ran the script. but i was also having the same problem last week on the older version of zwift

> Maybe there's something in the Zwift logs, too.

i don't see anything strange in the zwift logs... the only thing weird is that the timestamps are in a different time zone. 4 hours ahead of me...

I would monitor resource consumption while running the application; maybe there is a memory leak in system RAM or GPU memory? Especially since you state that it's fairly reproducible at 25 minutes. Something is probably hitting the roof.

I would also try tweaking the graphics settings, maybe setting resolution and quality way down, to see if it has any effect on when it crashes.

I've seen my system crash when running too many instances of Zwift at once (I once had 4 running simultaneously), but that was due to running out of memory on the graphics card.

And are you running docker or podman?

> And are you running docker or podman?

podman.

> I would monitor resource consumption while running the application; maybe there is a memory leak in system RAM or GPU memory? Especially since you state that it's fairly reproducible at 25 minutes. Something is probably hitting the roof.

i will try to watch this. but i used to run zwift on a laptop with integrated graphics... so i assume my system is well above the minimum requirements

thanks

If everything still fails with podman, I'd try docker as a last resort.

fandan commented

Same here. I suspect it has something to do with resolution: the higher it is, the faster the container crashes. I couldn't find any clues in the journal. I get error messages from the wine64-preloader at irregular intervals while Zwift is running. Here are the last entries in the container log:

[17:44:25] INFO LEVEL: VIDEO_CAPTURE : Allocating pixel buffer, w=1920, h=1200

[17:44:25] INFO LEVEL: VIDEO_CAPTURE : Allocating pixel buffer, w=1920, h=1200

[17:44:25] ERROR LEVEL: VIDEO_CAPTURE : WindowsFailure, 31442 of 10, reasons=["EncoderClip: CreateSinkWriter failed, hr=-2147467263", "Encoder: clip failed, clipIndex=0", "VCW_StateMachine: Encoder failed"]

[17:44:25] INFO LEVEL: VIDEO_CAPTURE : Allocating pixel buffer, w=1920, h=1200

[17:44:25] INFO LEVEL: VIDEO_CAPTURE : Allocating pixel buffer, w=1920, h=1200

[17:44:25] INFO LEVEL: VIDEO_CAPTURE : Allocating pixel buffer, w=1920, h=1200

[17:44:25] ERROR LEVEL: VIDEO_CAPTURE : WindowsFailure, 31443 of 10, reasons=["EncoderClip: CreateSinkWriter failed, hr=-2147467263", "Encoder: clip failed, clipIndex=0", "VCW_StateMachine: Encoder failed"]

[17:44:25] INFO LEVEL: VIDEO_CAPTURE : Allocating pixel buffer, w=1920, h=1200

[17:44:25] ERROR: ZwiftApp Crashed
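
A note on the `hr=-2147467263` in the log above: it is an HRESULT printed as a signed 32-bit integer. A small Python sketch for turning it into the usual hex form follows. The function names are hypothetical, and the lookup table lists only a few well-known values from winerror.h.

```python
# A few well-known HRESULT values from winerror.h.
KNOWN_HRESULTS = {
    0x80004001: "E_NOTIMPL",
    0x80004005: "E_FAIL",
    0x8007000E: "E_OUTOFMEMORY",
}


def hresult_to_unsigned(signed_hr: int) -> int:
    """Reinterpret a signed 32-bit HRESULT as its unsigned bit pattern."""
    return signed_hr & 0xFFFFFFFF


def format_hresult(signed_hr: int) -> str:
    """Return the hex form, plus a symbolic name when one is known."""
    value = hresult_to_unsigned(signed_hr)
    name = KNOWN_HRESULTS.get(value, "unknown")
    return f"0x{value:08X} ({name})"


print(format_hresult(-2147467263))
```

The value decodes to 0x80004001, which is E_NOTIMPL ("not implemented"). One plausible reading is that the sink writer Zwift requests for video clips is not available under Wine, so every clip attempt fails while pixel buffers keep being allocated. The thread does not confirm this; treat it as a hypothesis.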

making progress... i did 2 tests today:

  1. i opened the game and just let it sit while i did something else. i noticed that the VIRT memory column in top kept going up... it was at 21G at one point. after around 25 mins, it crashed as usual
  2. i opened up the game and disabled the "video screenshots" option in settings. all of the "Allocating pixel buffer... Encoder failed" messages disappeared from the log, and the VIRT memory use in top seems to have stabilized at around 6G. i left the game open for over an hour and it didn't crash.
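
For anyone who wants to repeat the two tests above without staring at top, here is a rough Python sketch that samples a process's memory from `/proc/<pid>/status` on Linux. `VmSize` corresponds to the VIRT column in top and `VmRSS` to RES. The function names, the 60-second interval, and the sample count are arbitrary choices for illustration, and the PID is whatever top shows for the Zwift process on your machine.

```python
import time
from pathlib import Path


def parse_status(text: str) -> dict[str, int]:
    """Extract VmSize and VmRSS (both in kB) from /proc/<pid>/status contents."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])  # value looks like "  6291456 kB"
    return fields


def watch(pid: int, interval: float = 60.0, samples: int = 30) -> list[dict[str, int]]:
    """Print virtual and resident memory of `pid` every `interval` seconds."""
    history = []
    for _ in range(samples):
        usage = parse_status(Path(f"/proc/{pid}/status").read_text())
        history.append(usage)
        virt_gib = usage.get("VmSize", 0) / 1024**2  # kB -> GiB
        res_gib = usage.get("VmRSS", 0) / 1024**2
        print(f"{time.strftime('%H:%M:%S')}  VIRT={virt_gib:.1f} GiB  RES={res_gib:.1f} GiB")
        time.sleep(interval)
    return history
```

Calling `watch(pid)` with the default values covers roughly 30 minutes, enough to see whether VIRT keeps climbing or levels off. If the process exits during the run, reading the status file raises an error, which at least marks the time of the crash.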

next test is to actually do a long ride and see if it crashes... but seems like i made some progress. thanks to everyone who chimed into this thread with ideas.

CC: @netbrain

Not the first time I read that video screenshots cause crashes, even on the Zwift forums there are mentions of it.
I've always had it disabled personally.

fandan commented
> 1. i opened the game and just let it sit while i did something else. i noticed that the VIRT memory column in top kept going up... it was at 21G at one point. after around 25 mins, it crashed as usual
> 2. i opened up the game and disabled the "video screenshots" option in settings. all of the "Allocating pixel buffer... Encoder failed" messages disappeared from the log, and the VIRT memory use in top seems to have stabilized at around 6G. i left the game open for over an hour and it didn't crash.

@S74HK9hV I was able to reproduce your experience. After the container had consumed around 50GB of virtual memory, it was terminated. Turning off the “Video Screenshot” function also solves the problem for me. The virtual memory consumption has remained constant for 2.5 hours now and Zwift is running without any problems. It is a perfect workaround :)

Would be interesting to see at which version this started to happen.

looks like the "Video Screenshot" feature was released on PC July 26th, 2023
https://forums.zwift.com/t/video-screenshots-windows-release-july-2023/609842

it was a server side change. dev claims that there was no special client version update needed:
https://forums.zwift.com/t/video-screenshots-windows-release-july-2023/609842/4