jeremyletang/rgtk

Closures fundamentally broken

mathijshenquet opened this issue · 27 comments

Closures right now are fundamentally broken:

fn main(){
    make_window();
    gtk::main();
}

fn make_window(){

    let mut window = gtk::Window::new(gtk::window_type::TopLevel).unwrap();
    let drawing_area = DrawingArea::new().unwrap();

    drawing_area.connect(signals::Draw::new(|cr| 
        drawing_area.get_allocated_area() //`drawing_area` is garbage
    ));

    window.add(drawing_area);
    window.show_all();
}

This is because rust only has stack closures for now. When make_window is finished and it's stack goes out of scope all referenced variables inside the callback become garbage.

I'll be able to fix this when rust-lang/rfcs#114 gets merged.

this should come soon: rust-lang/rust#14539

I don't think so, Unboxed closures seems to have had a major setback so now rfcs 151 is implemented, taking some uncontroversial/worked out ideas from Unboxed closures. At least, if i read https://github.com/rust-lang/meeting-minutes/blob/master/weekly-meetings/2014-07-08.md correctly ;)

That is not true. Unboxed closures have not had a "major setback". I'm working on them as we speak.

What's the status of this now that rust-lang/rfcs#151 is fixed?

gkoz commented

Here's my take on closures/signals: https://gist.github.com/gkoz/52ae8de5370ba99d9795/9227b2a60e15cd3c00b824197cfc65d819bbeae4
GTK owns the closure after we box it and give the pointer to g_signal_connect_data as data.

            let closure: Box<Box<S::F>> = Box::new(closure);
            let closure_ptr     = transmute(closure);

            println!("connect closure {:?}", closure_ptr);

            glib::ffi::g_signal_connect_data(
                self.get_gobject(),
                c_signal_name.as_ptr(),
                Some(trampoline),
                closure_ptr,
                Some(transmute(destroy_closure)),
                0
            );

We also specify a destroy_data function that will take the ownership back and run the destructors.

extern fn destroy_closure(closure: *mut (), _: usize) {
    unsafe {
        println!("destroy closure {:?}", closure);
        let closure: Box<Box<FnMut()>> = transmute(closure);
        drop(closure);
    }
}

The weak link here is

        unsafe {
            let closure: Box<&mut FnMut($($arg_type),*) -> $ret_type> = transmute(closure);
            (*closure)($($arg_name),*)
        }

where we conjure a &mut reference to the closure in order to invoke it. I doubt it's safe...

gkoz commented

Actually we can just be conservative and use Fn closures: https://gist.github.com/gkoz/52ae8de5370ba99d9795
The example uses RefCell anyway, so it didn't even have to change. Transmuting a pointer to an aliased reference to Fn closure should be safe.

I think so too, casting should be safe. I did on other binding and it worked well. I don't see any reason why it couldn't here. It's a long time issue, it would be nice to fix it once and for all.

Do you want to do it ? I don't really have time recently so i won't do it before at least a week but maybe @jeremyletang will have it ?

But before that, we need to fix rgtk build ! I'll take a look this week-end.

gkoz commented

Maybe it should brew for a while more. There's another issue to tackle: the library has to translate between native GTK types and user-facing ones, possibly casting GdkEvent to an appropriate specific variant, converting between Gboolean and either bool or some event-specific enums (e.g. enum Propagate { Yes, No }).

Is this still a work in progress? I just ran into the problem today.

gkoz commented

My plan was to do the signals after the widgets are converted to the new model. We could probably change that. Today's draw signal debugging session has once again proved that a translation layer is required. Some of it is already in the tree.

gkoz commented

There're other challenges to consider. We need a way to disconnect signals and/or let them disconnect themselves. This probably calls for storing and managing them ourselves and not just passing pointers to GLib.

Is there a fundamental problem with unboxed closures that requires boxed closures to be used instead? I don't know enough about Rust internals to understand what the problem is.

gkoz commented

Don't take it as a gospel but here's my understanding:

A closures needs a place to store its environment (or at least the function pointer). An 'unboxed' closure, like any automatic variable, is stored on the stack and can't outlive the function it's defined in. Putting anything in a box means allocating on the heap and passing around a pointer that is freed when the box is destroyed. This allows a closure to live longer than the function where it was defined.

Here's some info from a more authoritative source: http://smallcultfollowing.com/babysteps/blog/2014/11/26/purging-proc/

But you don't strictly need closures to make signal handlers, you should be able to use references to any functions to avoid the lifetime problem.

fn f() { println!("Hello!"); }

fn main() {
    let fun = &f;
    fun();
}

Except I've just tried it and got a segfault. So need to investigate further.

That makes sense; I'm just surprised it ends in a segfault, but I suppose that's the nature of FFI. I don't think top-level functions would work for me, because my signal handlers rely on things from the lexical scope.

gkoz commented

Are you able to move those things into the closure i.e. move || { do_stuff; }? For the same reasons an unboxed closure doesn't work, any references to the automatic variables would be prohibited too.

I need to create several signal handlers, all of which having access to the same values, so I don't think I can use move.

gkoz commented

How are you going you ensure those values live long enough for the references to be valid?
The example linked above uses Rc<RefCell<_>>.

I can wrap the values I intend to use in reference-counted cells, but it doesn't solve the primary issue, which is that I have several signal handlers that need them, so I can't use move. If I wrap them reference-counted cells without using move, I still get a segfault. I admit this is not familiar territory for me.

gkoz commented

Rc is clonable, so you move a clone:

    let ctr: Rc<RefCell<i32>> = Rc::new(RefCell::new(0));
    let ctr1 = ctr.clone();

    add_key_press_handler(&window, ctr.clone());

    NGConnect::connect::<DeleteEvent>(&window, Box::new(move |_| {
        let mut r = ctr1.borrow_mut();
        ...

Any widget is clonable too so no problem sharing them either.

Thanks for the tip, I am currently trying it out. I made a handler creation function just like yours, and cloned the Rc objects, but on the very first borrow_mut call, I get a panic: RefCell<T> already borrowed. Have you experienced that before? Note that I am not using boxed closures -- I assume you had to fork rgtk to do that.

gkoz commented

Not sure about your problem, from the panic message it would seem that the RefMut from the last borrow hadn't gone out of scope before you tried to borrow again.

I didn't so much fork, just rolled my own signals in a binary (it required adding a single extern fn to the FFI though, the patch inserted at the top).
https://gist.github.com/gkoz/52ae8de5370ba99d9795

That's the puzzling thing -- it's the very first time borrow_mut is called. Perhaps it's because I'm not using boxed closures, though; I'm doing button.connect(gtk::signals::Clicked::new(&mut move || { ... }));. I appreciate your help.

gkoz commented

I guess I'd need to see more of the code.

Sure, here's the part I'm working with right now. It's using my fork of rgtk to create a Box containing a VteTerminal with several buttons corresponding to cargo commands.

gkoz commented

Well... One possible reason could be precisely the closures brokenness: we can't be sure what's inside of the build_term captured variable because the closure is destroyed by the time it's called. This reason could probably be eliminated if you managed to box the closure and put it in some long-living struct, possibly a global static.
Also note that you shouldn't have to put a widget in any refcounted box (or cell) because it kinda is one already, cloning it should suffice.

Yeah I figured that would be a problem. I may have to revisit this later, as it is proving really difficult.