Clemapfel/Mousetrap.jl

`gdk_x11_surface_get_xid` Segmentation Fault when initializing Mousetrap

ufechner7 opened this issue · 19 comments

julia> using Mousetrap
[ Info: Precompiling Mousetrap [5deeb4b9-6e04-4da7-8b7f-c77fb1eae65e]

[25147] signal (11.1): Segmentation fault
in expression starting at /home/ufechner/.julia/packages/Mousetrap/k0X9u/src/Mousetrap.jl:39
gdk_x11_surface_get_xid at /home/ufechner/.julia/artifacts/5498f875c31a1b3422ce1b64ef770407109eff30/lib/libgtk-4.so (unknown line)
Allocations: 815336 (Pool: 814877; Big: 459); GC: 1
ERROR: Failed to precompile Mousetrap [5deeb4b9-6e04-4da7-8b7f-c77fb1eae65e] to "/home/ufechner/.julia/compiled/v1.9/Mousetrap/jl_9p08JE".
Stacktrace:
 [1] error(s::String)
   @ Base ./error.jl:35
 [2] compilecache(pkg::Base.PkgId, path::String, internal_stderr::IO, internal_stdout::IO, keep_loaded_modules::Bool)
   @ Base ./loading.jl:2300
 [3] compilecache
   @ ./loading.jl:2167 [inlined]
 [4] _require(pkg::Base.PkgId, env::String)
   @ Base ./loading.jl:1805
 [5] _require_prelocked(uuidkey::Base.PkgId, env::String)
   @ Base ./loading.jl:1660
 [6] macro expansion
   @ ./loading.jl:1648 [inlined]
 [7] macro expansion
   @ ./lock.jl:267 [inlined]
 [8] require(into::Module, mod::Symbol)
   @ Base ./loading.jl:1611

What I did:
a. rename .julia to .julia.bak
b. create a new project and install Mousetrap as described in the README.md

(Mouse) pkg> st
Status `~/repos/Mouse/Project.toml`
  [5deeb4b9] Mousetrap v0.2.0 `https://github.com/Clemapfel/mousetrap.jl#main`
  [7f2654e2] mousetrap_apple_jll v0.2.0+0 `https://github.com/Clemapfel/mousetrap_apple_jll#main`
  [b6bb2801] mousetrap_linux_jll v0.2.0+0 `https://github.com/Clemapfel/mousetrap_linux_jll#main`
  [1923bf96] mousetrap_windows_jll v0.2.0+0 `https://github.com/Clemapfel/mousetrap_windows_jll#main`
julia> versioninfo()
Julia Version 1.9.3
Commit bed2cd540a1 (2023-08-24 14:43 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 32 × AMD Ryzen 9 7950X 16-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-14.0.6 (ORCJIT, znver3)
  Threads: 1 on 32 virtual cores
Environment:
  LD_LIBRARY_PATH = /lib:/usr/lib:/usr/local/lib
ufechner@ufryzen:~$ uname -a
Linux ufryzen 6.2.0-32-generic #32~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Aug 18 10:40:13 UTC 2 x86_64 x86_64 x86_64 GNU/Linux

I tried mousetrap on a freshly installed virtual machine, and that worked. I could also install it successfully on Ubuntu 20.04.

Any idea how to debug this issue?

This time it shouldn't be a mousetrap issue; the problem is in the GTK backend and has something to do with your windowing system. I can still try to help you get it working, since the GNOME forums will assume you're using C or Vala.

I can't reproduce this on my 22.04 machine with either Xorg or Wayland. Which windowing system are you using? Could you try switching to another one and see if the error persists? You can switch windowing systems by choosing "log out" from the shutdown menu; on the screen where you enter your password, there's a cog in the bottom right that lets you switch.
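If you are not sure which windowing system the current session uses, a quick check from Julia might look like the sketch below. It only reads the standard desktop environment variables XDG_SESSION_TYPE, WAYLAND_DISPLAY and DISPLAY; `session_info` is a made-up helper name, not something Mousetrap provides.

```julia
# Report which display server the current login session appears to use.
# `session_info` is a hypothetical helper; it only inspects environment
# variables and does not talk to GTK or the X server.
function session_info(env = ENV)
    return (
        session = get(env, "XDG_SESSION_TYPE", "unknown"),  # usually "x11" or "wayland"
        wayland_display = get(env, "WAYLAND_DISPLAY", ""),  # empty when not set
        x11_display = get(env, "DISPLAY", ""),              # empty when not set
    )
end

println(session_info())
```

Passing a `Dict` instead of `ENV` makes it easy to try the helper with made-up values.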

Well, I am using X11, but I cannot easily switch the windowing system because I use an nvidia graphics card with the proprietary driver, and it does not support Wayland, so the selection of Wayland is disabled by default.

OK, I tried different graphics drivers. Not working:
Nvidia-535 (tested by Ubuntu)
Nvidia-525

Working:
nouveau

This is not the final solution, because there is no OpenGL acceleration with the nouveau driver, but at least we now know better what works and what doesn't.

Might this be related:

I am now using gdk_x11_surface_get_xid(gtk_native_get_surface(GTK_NATIVE(window))) to get an XID from a GTK window widget. – [Ank i zle](https://stackoverflow.com/users/10696946/ank-i-zle), [Mar 15, 2022 at 2:10](https://stackoverflow.com/questions/71461608/gtk-window-to-gdk-surface-in-gtk4#comment126333979_71461608)

fwiw: I'm getting the same error as @ufechner7 on Ubuntu 23.04 with X.

Is this new in mousetrap v0.2.0 or did you have the same issue in 0.1.0?

I don't have access to an nvidia graphics card and unlike with an OS I can't just spin up a VM to test it, so this might be hard for me to fix.

Just to isolate whether it's a mousetrap issue, can you install libadwaita_jll from the Julia package registry, then:

import libadwaita_jll
ccall((:adw_init, libadwaita_jll.libadwaita), Cint, ())

If that works, exit the Julia runtime, install GTK4_jll, then:

import GTK4_jll
ccall((:gtk_init, GTK4_jll.libgtk4), Cint, ())

If both work, please run:

import GTK4_jll, libadwaita_jll
begin
    ccall((:adw_init, libadwaita_jll.libadwaita), Cvoid, ())
    local window = ccall((:adw_window_new, libadwaita_jll.libadwaita), Ptr{Cvoid}, ())
    local area = ccall((:gtk_gl_area_new, GTK4_jll.libgtk4), Ptr{Cvoid}, ())
    ccall((:adw_window_set_content, libadwaita_jll.libadwaita), Cvoid, (Ptr{Cvoid}, Ptr{Cvoid}), window, area)
    ccall((:gtk_window_present, GTK4_jll.libgtk4), Cvoid, (Ptr{Cvoid},), window)
    println(ccall((:gtk_widget_get_realized, GTK4_jll.libgtk4), Cint, (Ptr{Cvoid},), area) == true)
    exit()
end

It should print true to the console, with no other console output.
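Since any of these calls may segfault and take the REPL down with it, it can be convenient to run each one in a separate Julia process. The following is only a sketch: `run_isolated` is a hypothetical helper, and it assumes the child should use the same project as the current session.

```julia
# Run a snippet in a fresh Julia process, so that a segmentation fault
# only kills the child and not the session you are working in.
# `run_isolated` is a hypothetical helper, not part of Mousetrap.
function run_isolated(code::AbstractString)
    proj = Base.active_project()
    # Reuse the active project (if any) so the child sees the same packages.
    flags = proj === nothing ? String[] : ["--project=" * dirname(proj)]
    cmd = `$(Base.julia_cmd()) --startup-file=no $flags -e $code`
    proc = run(ignorestatus(cmd))  # do not throw on a non-zero exit
    return (
        exitcode = proc.exitcode,
        termsignal = proc.termsignal,  # non-zero if the child was killed by a signal
        ok = proc.exitcode == 0 && proc.termsignal == 0,
    )
end

println(run_isolated("println(1 + 1)"))
```

To use it for the tests above, pass the import and ccall lines as the string, for example the adw_init call, and look at `ok` and `termsignal` in the result.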

Is this new in mousetrap v0.2.0 or did you have the same issue in 0.1.0?

First time I tried to install mousetrap.

I tried the three tests, and all of them work fine, even when using the nvidia driver...
Only loading Mousetrap itself fails...

But I did not try the suggestion from Tim Holy yet: https://docs.julialang.org/en/v1/devdocs/backtraces/

I'm sorry you're still having trouble, I'm kinda forced to make you do the debugging since I can't test it myself, I apologize.

Could you try the following?

import libadwaita_jll, mousetrap_linux_jll
ccall((:adw_init, libadwaita_jll.libadwaita), Cvoid, ())
ccall((:_ZN9mousetrap6detail17initialize_openglEv, mousetrap_linux_jll.mousetrap_julia_binding), Cvoid, ())

Does that still crash?


If it says _ZN9mousetrap6detail17initialize_openglEv is not defined, run the following in Julia:

println(mousetrap_linux_jll.mousetrap_julia_binding)

Which will return a path, for me it's /home/clem/.julia/artifacts/2670af70bf210b40d1e852d7b5d644c76c0e7d01/lib/libmousetrap_julia_binding.so

Then, in your bash console, not in julia, run:

nm /home/clem/.julia/artifacts/2670af70bf210b40d1e852d7b5d644c76c0e7d01/lib/libmousetrap_julia_binding.so | grep initialize_opengl

Use the path you got from mousetrap_linux_jll instead of mine. For me the output contains _ZN9mousetrap6detail17initialize_openglEv; use whatever symbol name you get in the ccall above. You may need to install nm first, with sudo apt-get install binutils.

We're calling a mangled C++ name as if it were a C function, so we can invoke mousetrap::detail::initialize_opengl directly; it's the best candidate for what could be crashing. If I knew which line in the C++ shared library called gdk_x11_surface_get_xid that would help a lot, but your stacktrace doesn't say.
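As an alternative to nm, the Libdl standard library can tell you whether a given symbol is exported before you try to ccall it. This is a rough sketch; `has_symbol` is a hypothetical helper name, and it assumes the jll exposes the library path as a string, as the println output above suggests.

```julia
using Libdl

# Check whether a shared library exports a symbol, without calling it.
# `has_symbol` is a hypothetical helper, not part of Mousetrap or the jll.
function has_symbol(libpath::AbstractString, name::Symbol)
    handle = Libdl.dlopen(libpath; throw_error = false)
    handle === nothing && return false  # library could not be opened
    try
        # With throw_error = false, dlsym returns `nothing` for a missing symbol.
        return Libdl.dlsym(handle, name; throw_error = false) !== nothing
    finally
        Libdl.dlclose(handle)
    end
end

# Usage with the names from this thread (requires mousetrap_linux_jll):
#   import mousetrap_linux_jll
#   has_symbol(mousetrap_linux_jll.mousetrap_julia_binding,
#              :_ZN9mousetrap6detail17initialize_openglEv)
println(has_symbol("/nonexistent/libexample.so", :some_symbol))
```

Note that opening the library this way also loads its dependencies, so it is not entirely free of side effects.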

Does that still crash?

Yes, it does:

julia> import libadwaita_jll, mousetrap_linux_jll

julia> ccall((:adw_init, libadwaita_jll.libadwaita), Cvoid, ())

julia> ccall((:_ZN9mousetrap6detail17initialize_openglEv, mousetrap_linux_jll.mousetrap_julia_binding), Cvoid, ())

[19474] signal (11.1): Segmentation fault
in expression starting at REPL[3]:1
gdk_x11_surface_get_xid at /home/ufechner/.julia/artifacts/5498f875c31a1b3422ce1b64ef770407109eff30/lib/libgtk-4.so (unknown line)
Allocations: 2998 (Pool: 2986; Big: 12); GC: 0
Segmentation fault (core dumped)

Ok, perfect, I know how to work on a fix for this then. I'll try to get a hotfix out soon.

Hi, please run:

ENV["MOUSETRAP_DISABLE_OPENGL_COMPONENT"] = "TRUE"
import Pkg; Pkg.update("mousetrap_linux_jll")
using Mousetrap

Does that still crash?

Looks better:

julia> ENV["MOUSETRAP_DISABLE_OPENGL_COMPONENT"] = "TRUE"
"TRUE"

julia> import Pkg; Pkg.update("mousetrap_linux_jll")
    Updating registry at `~/.julia/registries/General.toml`
    Updating git-repo `https://github.com/Clemapfel/mousetrap_linux_jll`
  No Changes to `~/repos/Mouse/Project.toml`
  No Changes to `~/repos/Mouse/Manifest.toml`

julia> using Mousetrap

julia> 

But the nvidia driver should support OpenGL, shouldn't it?

It's not disabling OpenGL, it's just turning off everything related to the RenderArea widget. Mousetrap will still use the OpenGL backend for rendering and you still get the same performance.

The issue isn't that OpenGL isn't supported. During creation of the global OpenGL state, which is only used to store all the data of Shape and Texture, either GTK4 or GLEW fails to correctly access the window's surface, for some reason related to X11 and your driver. This hotfix just turns that part off so you can still use mousetrap.

If you want you can unset MOUSETRAP_DISABLE_OPENGL_COMPONENT and see if you get a new error, I made the context creation function a little bit more resilient. If yes, then we just have to wait until I can get my hands on an nvidia card so I can reproduce this and reach out to GNOME.
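One way to try this without touching your shell configuration is to launch a child Julia process with the variable removed. The sketch below uses `withenv`, which temporarily unsets a variable when given `nothing`; `run_without_flag` is a hypothetical helper, and `--startup-file=no` is passed on the assumption that your startup.jl might set the variable again.

```julia
const FLAG = "MOUSETRAP_DISABLE_OPENGL_COMPONENT"

# Run `code` in a child Julia process in which FLAG is not set.
# `run_without_flag` is a hypothetical helper, not part of Mousetrap.
function run_without_flag(code::AbstractString)
    proj = Base.active_project()
    flags = proj === nothing ? String[] : ["--project=" * dirname(proj)]
    withenv(FLAG => nothing) do  # `nothing` unsets the variable temporarily
        cmd = `$(Base.julia_cmd()) --startup-file=no $flags -e $code`
        proc = run(ignorestatus(cmd))
        return (exitcode = proc.exitcode, termsignal = proc.termsignal)
    end
end

# Show that the child really does not see the variable.
println(run_without_flag("""println(haskey(ENV, "MOUSETRAP_DISABLE_OPENGL_COMPONENT"))"""))
```

Passing "using Mousetrap" as the code would then show whether loading still crashes; a non-zero `termsignal` means the child was killed by a signal.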

I'll keep this open for anyone else running into this.

Solution

TL;DR: Certain NVIDIA drivers are causing an error in GTK4 when trying to initialize everything needed for the RenderArea widget to work. The current fix for this is to just turn off RenderArea, which we do by setting an environment variable.

1. Set MOUSETRAP_DISABLE_OPENGL_COMPONENT

Add the following line to your startup.jl:

ENV["MOUSETRAP_DISABLE_OPENGL_COMPONENT"] = "TRUE"

If you are unsure of where startup.jl is located, you can access the path by running the following in the Julia REPL:

something(Base._global_julia_startup_file(), Base._local_julia_startup_file())

If you are unwilling or unable to modify startup.jl, you can also set the environment variable using your IDE, or, on unix, by appending the following line to your ~/.bashrc file:

export MOUSETRAP_DISABLE_OPENGL_COMPONENT=TRUE
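To confirm that the variable actually reaches Julia, whichever way you set it, a small check like the following may help. It is a sketch under one assumption: this thread only ever shows the value "TRUE", so the helper compares against exactly that; how Mousetrap treats other values is not covered here. `opengl_component_disabled` is a hypothetical name.

```julia
# Assumption: only the value "TRUE" is shown in this thread, so compare exactly.
# `opengl_component_disabled` is a hypothetical helper, not a Mousetrap function.
opengl_component_disabled(env = ENV) =
    get(env, "MOUSETRAP_DISABLE_OPENGL_COMPONENT", "") == "TRUE"

println(opengl_component_disabled())
```

Run it in a fresh session, before `using Mousetrap`, since the variable has to be set by the time the package is loaded.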

2. Update mousetrap jlls

Open a fresh terminal session, enter the REPL, then update the mousetrap jll:

begin
   import Pkg
   Pkg.update("mousetrap_jll")
   Pkg.update("Mousetrap")
end

Afterwards, both the jll and Mousetrap should be at version 0.3.0 or newer.
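To verify the versions without reading the Manifest by hand, something along these lines should work. It is a sketch that relies on `Pkg.dependencies()`; `installed_version` and `meets_minimum` are hypothetical helper names, and the package names are the ones used in the update step above.

```julia
import Pkg

# Look up the version of a package in the active environment, or `nothing`
# if it is not installed. Both helpers here are hypothetical names.
function installed_version(name::AbstractString)
    for (_, info) in Pkg.dependencies()
        info.name == name && return info.version
    end
    return nothing
end

# `version` may be `nothing` when the package is missing.
meets_minimum(version, minimum::VersionNumber) =
    version !== nothing && version >= minimum

for name in ("Mousetrap", "mousetrap_jll")
    ver = installed_version(name)
    status = meets_minimum(ver, v"0.3.0") ? "ok" : "missing or too old"
    println(name, ": ", ver, " (", status, ")")
end
```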

You should now be able to use mousetrap, with the exception of the RenderArea widget, which will print a log message when instantiated, but should not crash or otherwise affect the rest of the application:

using Mousetrap
area = RenderArea()
(process:185604): mousetrap-CRITICAL **: 14:32:36.464: In RenderArea(): trying to instantiate RenderArea, but the OpenGL component is disabled.

I am also running into this error on Ubuntu 22.04 LTS with an Nvidia graphics card and the 535 driver. I am using Mousetrap version 0.3.0. Here is the entry from the manifest:

  [5deeb4b9] Mousetrap v0.3.0 `https://github.com/clemapfel/mousetrap.jl#main`

When I set ENV["MOUSETRAP_DISABLE_OPENGL_COMPONENT"] = "TRUE" at the prompt, the basic Mousetrap GUI comes up. When I try area = RenderArea(), I get the message mousetrap-CRITICAL **: 21:48:43.254: In RenderArea(): trying to instantiate RenderArea, but the OpenGL component is disabled.

Is there a way to test the OpenGL piece? How would I test that?

UPDATE: This code worked. So a user can test whether OpenGL works with the complete example below. The code was taken from the Mousetrap test cases.

ENV["MOUSETRAP_DISABLE_OPENGL_COMPONENT"] = "TRUE"

module MousetrapMakie

    export GLMakieArea, create_glmakie_screen

    using Mousetrap
    using ModernGL, GLMakie, Colors, GeometryBasics, ShaderAbstractions
    using GLMakie: empty_postprocessor, ssao_postprocessor, fxaa_postprocessor, OIT_postprocessor, to_screen_postprocessor
    using GLMakie.GLAbstraction
    using GLMakie.Makie

    """
    ## GLMakieArea <: Widget
    `GLArea` wrapper that automatically connects all necessary callbacks in order for it to be used as a GLMakie render target. 

    Use `create_glmakie_screen` to initialize a screen you can render to using Makie from this widget. Note that `create_glmakie_screen` needs to be 
    called **after** `GLMakieArea` has been realized, as only then will the internal OpenGL context be available. See the example below.

    ## Constructors
    `GLMakieArea()`

    ## Signals
    (no unique signals)

    ## Fields
    (no public fields)

    ## Example
    ```
    using Mousetrap, MousetrapMakie
    main() do app::Application
        window = Window(app)
        canvas = GLMakieArea()
        set_size_request!(canvas, Vector2f(200, 200))
        set_child!(window, canvas)
    
        # use optional ref to delay screen allocation after `realize`
        screen = Ref{Union{Nothing, GLMakie.Screen{GLMakieArea}}}(nothing)
        connect_signal_realize!(canvas) do self
            screen[] = create_glmakie_screen(canvas)
            display(screen[], scatter(1:4))
            return nothing
        end
        present!(window)
    end
    ```
    """
    mutable struct GLMakieArea <: Widget

        glarea::GLArea              # wrapped native widget
        framebuffer_id::Ref{Int}    # set by render callback, used in MousetrapMakie.create_glmakie_screen
        framebuffer_size::Vector2i  # set by resize callback, used in GLMakie.framebuffer_size

        function GLMakieArea()
            glarea = GLArea()
            set_auto_render!(glarea, false) # do not emit `render` automatically every time the widget is drawn
            connect_signal_render!(on_makie_area_render, glarea)
            connect_signal_resize!(on_makie_area_resize, glarea)
            return new(glarea, Ref{Int}(0), Vector2i(0, 0))
        end
    end
    Mousetrap.get_top_level_widget(x::GLMakieArea) = x.glarea

    # maps hash(GLMakieArea) to GLMakie.Screen
    const screens = Dict{UInt64, GLMakie.Screen}()

    # maps hash(GLMakieArea) to Scene, used in `on_makie_area_resize`
    const scenes = Dict{UInt64, GLMakie.Scene}()

    # render callback: if screen is open, render frame to `GLMakieArea`s OpenGL context
    function on_makie_area_render(self, context)
        key = Base.hash(self)
        if haskey(screens, key)
            screen = screens[key]
            if !isopen(screen) return false end
            screen.render_tick[] = nothing
            glarea = screen.glscreen
            glarea.framebuffer_id[] = glGetIntegerv(GL_FRAMEBUFFER_BINDING)
            GLMakie.render_frame(screen) 
        end
        return true
    end

    # resize callback: update framebuffer size, necessary for `GLMakie.framebuffer_size`
    function on_makie_area_resize(self, w, h)
        key = Base.hash(self)
        if haskey(screens, key)
            screen = screens[key]
            glarea = screen.glscreen
            glarea.framebuffer_size.x = w
            glarea.framebuffer_size.y = h
            queue_render(glarea.glarea)
        end

        if haskey(scenes, key)
            scene = scenes[key]
            scene.events.window_area[] = Recti(0, 0, glarea.framebuffer_size.x, glarea.framebuffer_size.y)
            scene.events.window_dpi[] = Mousetrap.calculate_monitor_dpi(glarea)
        end
        return nothing
    end

    # resolution of `GLMakieArea` OpenGL framebuffer
    GLMakie.framebuffer_size(self::GLMakieArea) = (self.framebuffer_size.x, self.framebuffer_size.y)

    # forward retina scale factor from GTK4 back-end
    GLMakie.retina_scaling_factor(w::GLMakieArea) = Mousetrap.get_scale_factor(w)

    # resolution of `GLMakieArea` widget itself
    function GLMakie.window_size(w::GLMakieArea)
        size = get_natural_size(w)
        size.x = size.x * GLMakie.retina_scaling_factor(w)
        size.y = size.y * GLMakie.retina_scaling_factor(w)
        return (size.x, size.y)
    end

    # calculate screen size and dpi
    function Makie.window_area(scene::Scene, screen::GLMakie.Screen{GLMakieArea})
        glarea = screen.glscreen
        scenes[hash(glarea)] = scene
    end

    # resize request by makie will be ignored
    function GLMakie.resize_native!(native::GLMakieArea, resolution...)
        # noop
    end

    # bind `GLMakieArea` OpenGL context
    ShaderAbstractions.native_switch_context!(a::GLMakieArea) = make_current(a.glarea)

    # check if `GLMakieArea` OpenGL context is still valid, it is while `GLMakieArea` widget stays realized
    ShaderAbstractions.native_context_alive(x::GLMakieArea) = get_is_realized(x)

    # destruction callback ignored, lifetime is managed by mousetrap instead
    function GLMakie.destroy!(w::GLMakieArea)
        # noop
    end

    # check if canvas is still realized
    GLMakie.was_destroyed(window::GLMakieArea) = !get_is_realized(window)

    # check if canvas should signal it is open
    Base.isopen(w::GLMakieArea) = !GLMakie.was_destroyed(w)

    # react to makie screen visibility request
    GLMakie.set_screen_visibility!(screen::GLMakieArea, bool) = bool ? show!(screen.glarea) : hide!(screen.glarea)

    # apply glmakie config
    function GLMakie.apply_config!(screen::GLMakie.Screen{GLMakieArea}, config::GLMakie.ScreenConfig; start_renderloop=true) 
        @warn "In MousetrapMakie: GLMakie.apply_config!: This feature is not yet implemented, ignoring config"
        # cf https://github.com/JuliaGtk/Gtk4Makie.jl/blob/main/src/screen.jl#L111
        return screen
    end

    # screenshot framebuffer
    function Makie.colorbuffer(screen::GLMakie.Screen{GLMakieArea}, format::Makie.ImageStorageFormat = Makie.JuliaNative)
        @warn "In MousetrapMakie: GLMakie.colorbuffer: This feature is not yet implemented, returning framecache"
        # cf https://github.com/JuliaGtk/Gtk4Makie.jl/blob/main/src/screen.jl#L147
        return screen.framecache
    end

    # ignore makie event model, use the mousetrap event controllers instead
    Makie.window_open(::Scene, ::GLMakieArea) = nothing
    Makie.disconnect!(::GLMakieArea, f) = nothing
    GLMakie.pollevents(::GLMakie.Screen{GLMakieArea}) = nothing
    Makie.mouse_buttons(::Scene, ::GLMakieArea) = nothing
    Makie.keyboard_buttons(::Scene, ::GLMakieArea) = nothing
    Makie.dropped_files(::Scene, ::GLMakieArea) = nothing
    Makie.unicode_input(::Scene, ::GLMakieArea) = nothing
    Makie.mouse_position(::Scene, ::GLMakie.Screen{GLMakieArea}) = nothing
    Makie.scroll(::Scene, ::GLMakieArea) = nothing
    Makie.hasfocus(::Scene, ::GLMakieArea) = nothing
    Makie.entered_window(::Scene, ::GLMakieArea) = nothing

    """
    ```
    create_glmakie_screen(::GLMakieArea; screen_config...) -> GLMakie.Screen{GLMakieArea}
    ```
    For a `GLMakieArea`, create a `GLMakie.Screen` that can be used to display makie graphics
    """
    function create_glmakie_screen(area::GLMakieArea; screen_config...)

        if !get_is_realized(area) 
            log_critical("MousetrapMakie", "In MousetrapMakie.create_glmakie_screen: GLMakieArea is not yet realized, its internal OpenGL context cannot yet be accessed")
        end

        config = Makie.merge_screen_config(GLMakie.ScreenConfig, screen_config)

        set_is_visible!(area, config.visible)
        set_expand!(area, true)

        # quote from https://github.com/JuliaGtk/Gtk4Makie.jl/blob/main/src/screen.jl#L342
        shader_cache = GLAbstraction.ShaderCache(area)
        ShaderAbstractions.switch_context!(area)
        fb = GLMakie.GLFramebuffer((1, 1)) # resized on GLMakieArea realization later

        postprocessors = [
            config.ssao ? ssao_postprocessor(fb, shader_cache) : empty_postprocessor(),
            OIT_postprocessor(fb, shader_cache),
            config.fxaa ? fxaa_postprocessor(fb, shader_cache) : empty_postprocessor(),
            to_screen_postprocessor(fb, shader_cache, area.framebuffer_id)
        ]

        screen = GLMakie.Screen(
            area, shader_cache, fb,
            config, false,
            nothing,
            Dict{WeakRef, GLMakie.ScreenID}(),
            GLMakie.ScreenArea[],
            Tuple{GLMakie.ZIndex, GLMakie.ScreenID, GLMakie.RenderObject}[],
            postprocessors,
            Dict{UInt64, GLMakie.RenderObject}(),
            Dict{UInt32, Makie.AbstractPlot}(),
            false,
        )
        # end quote

        hash = Base.hash(area.glarea)
        screens[hash] = screen
        
        set_tick_callback!(area.glarea) do clock::FrameClock
            if GLMakie.requires_update(screen)
                queue_render(area.glarea)
            end

            if GLMakie.was_destroyed(area)
                return TICK_CALLBACK_RESULT_DISCONTINUE
            else
                return TICK_CALLBACK_RESULT_CONTINUE
            end
        end
        return screen
    end
end

# test
using Mousetrap, .MousetrapMakie, GLMakie
main() do app::Application
    window = Window(app)
    set_title!(window, "Mousetrap x Makie")
    canvas = GLMakieArea()
    set_size_request!(canvas, Vector2f(200, 200))
    set_child!(window, canvas)

    # use optional ref to delay screen allocation after `realize`
    screen = Ref{Union{Nothing, GLMakie.Screen{GLMakieArea}}}(nothing)
    connect_signal_realize!(canvas) do self
        screen[] = create_glmakie_screen(canvas)
        display(screen[], scatter(rand(123)))
        return nothing
    end
    present!(window)
end