`threads` example displays a black window on X11
hecrj opened this issue · 14 comments
The `threads` example in master displays a black window on X11. I'm using Arch Linux with a GTX 2080 Ti and the NVIDIA proprietary drivers.
The `offscreen` example works correctly.
Let me know if you need more details.
I get an IncompatibleWinitWindow crash on:
- KDE, X11, Debian Testing, Intel Iris 6100 (Broadwell GT3)
- Ubuntu 20.04, Nvidia GTX 1060, proprietary + Mesa/Nouveau drivers
Servo seems to be affected by this as well: servo/servo#26353, servo/servo#26400
@pcwalton doesn't watch this repo.
KDE, X11, Debian Testing, Intel Iris 6100 (Broadwell GT3)
I ran `git bisect` with `cd surfman; cargo +nightly run --example threads; cd ..`:
Before 2925168 I got:
libEGL warning: FIXME: egl/x11 doesn't support front buffer rendering.
Failed to compile shader:
0:1(10): error: GLSL 3.30 is not supported. Supported versions are: 1.10, 1.20, 1.30, 1.00 ES, 3.00 ES, 3.10 ES, and 3.20 ES
thread 'main' panicked at 'Shader compilation failed!', surfman/examples/common/mod.rs:75:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Since then it is:
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: IncompatibleWinitWindow', surfman/examples/threads.rs:98:22
With `cargo +nightly run --example threads --features sm-x11` I get the old error back:
libEGL warning: FIXME: egl/x11 doesn't support front buffer rendering.
Failed to compile shader:
0:1(10): error: GLSL 3.30 is not supported. Supported versions are: 1.10, 1.20, 1.30, 1.00 ES, 3.00 ES, 3.10 ES, and 3.20 ES
thread 'main' panicked at 'Shader compilation failed!', surfman/examples/common/mod.rs:75:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
surfman/surfman/examples/common/mod.rs, line 46 in 625c4ff
surfman/surfman/examples/threads.rs, line 104 in 625c4ff
It works if I run `MESA_GL_VERSION_OVERRIDE=3.2 cargo +nightly run --example threads --features sm-x11`. If I want to run it differently, I have to clean `~/.cache/mesa_shader_cache`, otherwise it doesn't seem to recompile shaders. So Nvidia might just be silently failing (?) while Mesa told us what the problem is.
> So Nvidia might just be silently failing (?) while Mesa told us what the problem is.
Is there any way we can test this? I had the black window issue even when using `surfman` without any shaders. I am not completely familiar with the API yet, so there is a chance I may be missing something.
Just so we are on the same page, I have created an SSCCE in my fork that reproduces the issue I am describing here (enable the `sm-x11` feature flag). I believe this should display a blue window, but I get a black one instead.
FYI I tried doing `glReadPixels` on each of the FBOs prior to presentation and prior to submitting the completed surface to the main thread. The worker thread's `glReadPixels` yields the ball image. The main thread's surface always returns transparent black for the whole buffer, even if I do another `glClear` (or, if I remove the alpha flag from the context creation, it returns solid black for everything).
So this seems to only happen to the surface corresponding to the native widget. @hecrj I'll try playing with your minimal repro next (apitrace is unhappy with multithreaded rendering so I have a much uglier hack that I've been playing with to try to understand what's going on).
@iamralpht I also played with `glReadPixels` during my experiments and got similar results.
The first read, after a clear and before presenting for the first time, got me pixel data from the region of the screen behind the window (!). After presenting, the data turned to transparent black / solid black as you described.
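The readback checks described in the last few comments can be made mechanical. The following is only a minimal sketch: `classify_rgba8` and the `Readback` enum are hypothetical names, not part of surfman, and the function assumes the buffer was already filled by a `glReadPixels` call with `GL_RGBA` / `GL_UNSIGNED_BYTE`.

```rust
/// Coarse classification of an RGBA8 buffer, such as one filled by `glReadPixels`.
/// Hypothetical helper for debugging; not part of surfman.
#[derive(Debug, PartialEq, Eq)]
pub enum Readback {
    /// Every pixel is (0, 0, 0, 0).
    TransparentBlack,
    /// Every pixel is (0, 0, 0, 255).
    SolidBlack,
    /// Anything else, for example rendered content or a mix of the two blacks.
    Other,
    /// The buffer is empty or its length is not a multiple of 4.
    Invalid,
}

/// Classifies a tightly packed RGBA8 pixel buffer.
pub fn classify_rgba8(pixels: &[u8]) -> Readback {
    if pixels.is_empty() || pixels.len() % 4 != 0 {
        return Readback::Invalid;
    }
    // True when every 4-byte pixel equals `expected`.
    let all_equal =
        |expected: [u8; 4]| pixels.chunks_exact(4).all(|px| px == expected.as_slice());

    if all_equal([0, 0, 0, 0]) {
        Readback::TransparentBlack
    } else if all_equal([0, 0, 0, 255]) {
        Readback::SolidBlack
    } else {
        Readback::Other
    }
}
```

Running this on the worker thread's buffer and on the main thread's buffer would distinguish the "ball image" case from the two black cases reported above.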
@hecrj neat, glad we're seeing the same things. If I run apitrace on your test case, then the replay actually shows a blue window, so maybe this is something funny with context creation.
So `eglCreatePlatformWindowSurface` fails on NVidia, but not on Intel. Now to figure out where it's called from and what to do about it ;).
That's .
OK, I was wrong about `eglCreatePlatformWindowSurface`. It sets the error flag, but it appears to be benign, because I have a minimal C program which successfully gets a `glClear` to appear in a window and that triggers the same errors.
I tried matching the X11 visual of the window with the desired X11 visual of the `EGLConfig`, since technically you should (though I guess it was only really important when there were still PseudoColor visuals). The window depth also matches. This didn't fix the problem.
Next I ensured that the `EGLConfig` has `EGL_SURFACE_TYPE` containing `EGL_WINDOW_BIT`. The selected config ID is the same in the working C test and the failing surfman test. This also did not seem to change anything.
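The `EGL_WINDOW_BIT` check mentioned above is a plain bitmask test. Below is a minimal sketch of it; `supports_window_surfaces` is a hypothetical helper, and it assumes `surface_type` is the value that `eglGetConfigAttrib` returned for the `EGL_SURFACE_TYPE` attribute of the chosen config. The bit constants are the values from the EGL headers.

```rust
/// Bit values for the `EGL_SURFACE_TYPE` config attribute, as defined in the EGL headers.
pub const EGL_PBUFFER_BIT: i32 = 0x0001;
pub const EGL_PIXMAP_BIT: i32 = 0x0002;
pub const EGL_WINDOW_BIT: i32 = 0x0004;

/// Returns true if a config's `EGL_SURFACE_TYPE` value allows window surfaces.
/// Hypothetical helper; `surface_type` is assumed to come from `eglGetConfigAttrib`.
pub fn supports_window_surfaces(surface_type: i32) -> bool {
    (surface_type & EGL_WINDOW_BIT) != 0
}
```

As the comment above notes, passing this check did not change the outcome here, so it only rules out one possible cause.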
At this point, there's not much difference between the working and non-working programs, at least from what I can see in apitrace. I made the C program unbind the window surface and re-bind it prior to calling swap buffers, and its output is still visible. `glReadPixels` still returns all black pixels for the surfman context. I'll keep poking, but I'm not sure what I'm missing.
Edit: the other odd thing is that everything works OK in `apitrace`'s replay (which was what made me suspicious of the X visual mismatch). I'm not sure how `eglretrace` works for the window system bits, so maybe this doesn't mean much.
Edit 2: I have a Rust program, based on @hecrj's minimal sample, which uses `winit` to create the window and establish the X connection, and then calls EGL manually using surfman's bindings (and uses surfman's GL binding method too). This works fine. I don't even need to match visuals. So this tells me the problem isn't due to some quirk in window creation, and isn't due to some linkage or `_init` magic in NVidia's EGL implementation. My next step is to make the apitrace from my working Rust program match the apitrace from surfman more precisely and hopefully repro the failure. (But if anyone else can think of other obvious problems that this could be, then I'm all ears!)
Ha! Got it! If I call `eglMakeCurrent` with null read/draw surfaces prior to creating the window surface, then I get the black window problem.
If I don't make the context current in `ContextDescriptor.from_egl_context` then things seem to work--hooray!
To make a generic fix I'll need to either defer poking the GL context for version and the compatibility bit, or move the window surface creation up. @jdm any guidance on what you'd be likely to approve?
Sorry, what exactly does "move the window surface creation up" mean? It might help to see branches demonstrating each option.
@jdm sorry for not being very clear; I mean either I create the window surface before `from_egl_context` is called and just hold onto it somewhere, or I somehow avoid fetching the context information until after a window surface has been created if one is desired. This is my first expedition into surfman, so I don't know which is more likely to fit into the library. I'll see where I get to and post a PR if I hit upon something that's not too disgusting ;).
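To make the second option more concrete, here is a rough sketch of deferring the context query until it is first needed. This is not surfman's API: `ContextInfo` and `LazyContextInfo` are hypothetical types, and the closure stands in for whatever code would make the context current and read the GL version and compatibility bit. The only point illustrated is that the query runs lazily, so the caller can create the window surface first.

```rust
use std::cell::OnceCell;

/// Hypothetical stand-in for the data read from a GL context
/// (version and compatibility-profile bit). Not a surfman type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ContextInfo {
    pub major: u8,
    pub minor: u8,
    pub compatibility_profile: bool,
}

/// Hypothetical wrapper that postpones the context query until first use.
pub struct LazyContextInfo<F: Fn() -> ContextInfo> {
    // In a real implementation this closure would make the context current
    // and query it; here it is supplied by the caller.
    query: F,
    cached: OnceCell<ContextInfo>,
}

impl<F: Fn() -> ContextInfo> LazyContextInfo<F> {
    /// Stores the query without running it.
    pub fn new(query: F) -> Self {
        Self { query, cached: OnceCell::new() }
    }

    /// Runs the query on the first call only; later calls return the cached value.
    pub fn get(&self) -> ContextInfo {
        *self.cached.get_or_init(|| (self.query)())
    }

    /// Reports whether the query has already run.
    pub fn is_resolved(&self) -> bool {
        self.cached.get().is_some()
    }
}
```

With a wrapper like this, constructing the descriptor would not touch the GL context, and the first call to `get` could be placed after window surface creation. Whether that fits surfman's design better than creating the surface earlier is exactly the open question in this thread.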