GeyserMC/Geyser

Encryption-related regression/connection issues with 10th/11th gen Intel + new AMD CPUs

Camotoy opened this issue · 32 comments

Describe the bug

There is some encryption-related bug where players are only able to play for a short time before communication essentially ceases and no actions go through to Geyser if the server is running a 10th or 11th generation Intel processor. It appears not all 10th generation Intel processors are affected, but potentially all 11th generation processors are (if the processor has AVX-512 support, then it is affected).

Fix

This encryption bug is fixed in Java 17.0.6 and later, Java 19.0.2 and later, and OpenJDK 20 build 17 and later.

If you cannot update to one of these versions, add the following JVM parameters to your startup arguments (they must come before -jar): -XX:+UnlockDiagnosticVMOptions -XX:-UseAESCTRIntrinsics
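If you are unsure whether your host actually passes the workaround flags to the JVM, one way to check from inside the process is to read the JVM input arguments. The following is a minimal sketch, not part of Geyser: the class name WorkaroundFlagCheck and the method hasWorkaround are made up for illustration, and it only checks that both flags are present, not their order.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper (not part of Geyser): reports whether the two
// workaround flags from this issue were passed to the running JVM.
final class WorkaroundFlagCheck {

    // Returns true only if both flags appear in the given JVM argument list.
    static boolean hasWorkaround(List<String> jvmArgs) {
        return jvmArgs.contains("-XX:+UnlockDiagnosticVMOptions")
                && jvmArgs.contains("-XX:-UseAESCTRIntrinsics");
    }

    public static void main(String[] args) {
        // getInputArguments() lists the options given to the JVM itself,
        // i.e. everything placed before -jar on the command line.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("Workaround flags present: " + hasWorkaround(jvmArgs));
    }
}
```

Running this class with the same startup arguments as the server shows whether the flags reached the JVM; if they did not, the host is probably overriding or dropping them.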

Geyser Version

As of 7bd5b59

Additional Context

:(

Any update on this ticket?

I have recently been able to test on a Gen 11 machine - it may actually be a JDK bug. For anyone affected, try using Java 18.0.1+ - using Adoptium's JDK 18.0.1+10 Windows release, I was unable to replicate the bug.


Just tested on 10th gen Intel (Surface Laptop 3), I have this bug. Good to know I'm not the only one :P

Posting for those interested.

Yes, it is an OpenJDK bug that only affects processors with AVX-512. The bug seems to occur when encrypting/decrypting fewer than 16 bytes at a time. The relevant code is here: https://github.com/openjdk/jdk/blob/9bff3b76f2e5d0ecede6c0d4042f65d377a28325/src/hotspot/cpu/x86/macroAssembler_x86_aes.cpp#L783-L814

It looks like the preloop assumes there are at least 16 bytes of input, which leads to corruption of the cipher state, such as the counter variable.

A bug report and a potential fix have been sent. Waiting on their response.
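To illustrate the kind of input described above, here is a rough sketch that encrypts the same data once in a single call and once in chunks smaller than 16 bytes, then compares the results. It assumes the affected mode is AES/CTR/NoPadding (inferred from the UseAESCTRIntrinsics flag name); the class and method names are made up, and this is not the reproducer attached to the OpenJDK bug report. On a correct JVM the two outputs always match. On an affected JVM a mismatch may only appear after the JIT has compiled the intrinsic, so the loop in main is a guess, not a guaranteed trigger.

```java
import java.io.ByteArrayOutputStream;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical consistency check: chunked AES-CTR output must equal one-shot output.
final class CtrChunkCheck {

    private static Cipher newCipher(byte[] key, byte[] iv) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        return cipher;
    }

    // Encrypts data by feeding the cipher chunkSize bytes at a time.
    static byte[] encryptInChunks(byte[] key, byte[] iv, byte[] data, int chunkSize) {
        try {
            Cipher cipher = newCipher(key, iv);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (int off = 0; off < data.length; off += chunkSize) {
                int len = Math.min(chunkSize, data.length - off);
                byte[] part = cipher.update(data, off, len);
                if (part != null) {
                    out.write(part, 0, part.length);
                }
            }
            byte[] tail = cipher.doFinal();
            if (tail != null) {
                out.write(tail, 0, tail.length);
            }
            return out.toByteArray();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Compares one-shot encryption with chunked encryption of the same test data.
    static boolean chunkedMatchesOneShot(int chunkSize) {
        byte[] key = new byte[16];   // all-zero test key, illustration only
        byte[] iv = new byte[16];    // all-zero initial counter block
        byte[] data = new byte[1024];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        try {
            byte[] whole = newCipher(key, iv).doFinal(data);
            return Arrays.equals(whole, encryptInChunks(key, iv, data, chunkSize));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Repeat with chunk sizes 1..15 so the JIT has a chance to compile the hot path.
        for (int i = 0; i < 20_000; i++) {
            int chunkSize = 1 + (i % 15);
            if (!chunkedMatchesOneShot(chunkSize)) {
                System.out.println("Mismatch at iteration " + i + " with chunk size " + chunkSize);
                return;
            }
        }
        System.out.println("No mismatch observed");
    }
}
```

A clean run does not prove a JVM is unaffected; checking the Java version remains the more reliable test.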

Thank you so much for the temp fix 🙏 It worked perfectly. I was stressing so hard over this.

Is this problem only with 10th/11th gen, or is it also affecting older ones? And is it affecting 12th gen?

Older ones are definitely not affected. Newer ones very likely are, but we haven't confirmed whether it's to the same degree or worse.

This breaks on Intel Xeon E5-2670 v1 with these specs: https://www.cpu-world.com/CPUs/Xeon/Intel-Xeon%20E5-2670.html

Including the startup options -XX:+UnlockDiagnosticVMOptions -XX:-UseAESCTRIntrinsics does not fix it.

ETA: Updating to latest build of Geyser-Spigot fixed it for me (ended up being a 2.0.5 -> 2.1.0-snapshot update).

Same on a Raspberry Pi.

Unless a new Raspberry Pi model was introduced with AVX-512 support, it is very likely not affected. Same with a Xeon processor from 2012. Make sure you're using the latest Geyser version, and if the issue persists, please open a new issue.

@Camotoy I'm pretty sure this is fixed on OpenJDK 20 rev 17+ and OpenJDK 21

We're aware that this issue is fixed in an upcoming version. If I remember correctly they aren't released yet, but they will be in the coming month.

It's fixed in 11.0.18, 15.0.10, 17.0.6 and 19.0.2. These versions will all release on 2023-01-17.
And you're right, it's fixed in build 17 and higher of OpenJDK 20.

After January the 17th it'll be up to the hosting providers to update their Java versions.
Link to the bug tracker: https://bugs.openjdk.org/browse/JDK-8292158
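For anyone who wants to check a runtime against the versions listed above, the comparison could be sketched roughly as follows. The class and method names are made up for illustration. The minimum versions are the ones quoted in this thread (11.0.18, 15.0.10, 17.0.6, 19.0.2, OpenJDK 20 build 17+); treating Java 21 and later as fixed follows the comment above about OpenJDK 21. Release lines not listed in the thread (for example 18) are reported as not confirmed, which does not necessarily mean they are affected.

```java
import java.util.Map;

// Hypothetical helper: compares a Java version against the fixed versions quoted in this thread.
final class AesCtrFixCheck {

    // Minimum fixed release per feature line, as listed in the thread.
    private static final Map<Integer, Runtime.Version> FIXED = Map.of(
            11, Runtime.Version.parse("11.0.18"),
            15, Runtime.Version.parse("15.0.10"),
            17, Runtime.Version.parse("17.0.6"),
            19, Runtime.Version.parse("19.0.2"));

    // Returns true if the version is at or above a release the thread lists as fixed.
    static boolean isListedAsFixed(Runtime.Version v) {
        int feature = v.feature();
        if (feature >= 21) {
            return true; // assumption: every release after 20 carries the fix
        }
        if (feature == 20) {
            // Fixed from build 17 of OpenJDK 20; any later 20.x update is assumed to include it.
            // A 20 version string without a build number is reported as not confirmed.
            return v.interim() > 0 || v.update() > 0 || v.patch() > 0
                    || v.build().orElse(0) >= 17;
        }
        Runtime.Version minimum = FIXED.get(feature);
        // Lines not listed in the thread (e.g. 18) are reported as not confirmed.
        return minimum != null && v.compareToIgnoreOptional(minimum) >= 0;
    }

    public static void main(String[] args) {
        Runtime.Version current = Runtime.version();
        System.out.println("Running on Java " + current);
        System.out.println("Listed as fixed: " + isListedAsFixed(current));
    }
}
```

Vendors may backport fixes differently, so a result of "not listed" is a prompt to check the vendor's release notes, not proof that the bug is present.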

Did it get fixed in Java 19.0.2? (I tested and it is not fixed.)
I'm still getting lag on new chunks and getting disconnected.

20230118_085525.mp4

Hmm, I was on Windows 10 Edition (1.19.51) and I got FPS lag when I got close to new chunks.

Same as on JDK 17.0.6.

So there is no full support for AVX-512 in JDK 19?

AVX-512 support was implemented in Java 11, I believe.
This was simply a bug in their implementation.

hw2007 commented

I had a similar thing happen, but the bedrock client experiencing it was iOS. The first time it happened, all chunks just stopped loading until relog, and the second time, all players (not sure if other entities were affected) were frozen on the client-side.

This also seems to affect the new EPYC Genoa 9654 CPUs (the flags do seem to fix it).

Yes, it will happen on any CPU with AVX-512 support, so as more new CPUs are released, more are affected by this issue. That's why the best solution at this point is to update Java rather than rely on the flags.


Which Java version fixes this? I'm currently running Java 17.

Never mind, I just saw the Java 20 thing above.

riqvip commented

On Bedrock, I can load in and join fine, but no chunks load and I fall into the void. It was working earlier too, which is just strange. This seems to be a recurring issue.

I'm having issues on Nodecraft. I cannot figure out how to add the fix to the jar, or whatever is going on there, and it doesn't seem like I can update. Help, please.

If you're aware that your Java runtime is outdated, and you're unable to add the startup flags that work around the issue, then all you can do is ask your hosting provider.

It's more like I don't know how to do it.


Try contacting your host. Some hosts allow changing startup flags; some don't.

Hey, so we updated our JVMs today, can you see if the issue persists?

Closing this given the JVM fixes have been out for some time now. If you are still experiencing this, please ensure you are on the latest OpenJDK build available from your OpenJDK vendor.

I tried Java 22 for testing on Purpur 1.16.5, but some chunks are still not loading.

2024-04-08.18-01-34.mp4

This is happening with Java 17.0.10 as well: when I load new chunks, the FPS drops below 10 and the ping goes to 1k+. I tried adding the two flags above too, but it is still happening.