tiling->unsplit from the context menu crashes notion
wilhelmy opened this issue · 9 comments
Apr 14 19:15:11 fukushima systemd-coredump[2607]: Process 1905 (notion) of user 1000 dumped core.
Stack trace of thread 1905:
#0 0x000000000043c9d7 lookup_dynfun (notion + 0x3c9d7)
#1 0x000000000042465e region_rescue_clientwins (notion + 0x2465e)
#2 0x0000000000424c6d region_rescue_needed (notion + 0x24c6d)
#3 0x00007f3dbc8da150 mod_tiling_untile (mod_tiling.so + 0x12150)
#4 0x00007f3dbc8da21c l2chnd_b_o__WTiling (mod_tiling.so + 0x1221c)
#5 0x000000000043aa2a extl_l1_call_handler2 (notion + 0x3aa2a)
#6 0x00007f3dbcccd573 n/a (liblua5.4.so.5 + 0xf573)
#7 0x00007f3dbccce129 n/a (liblua5.4.so.5 + 0x10129)
#8 0x00007f3dbcccb6cb n/a (liblua5.4.so.5 + 0xd6cb)
#9 0x00007f3dbccd039e n/a (liblua5.4.so.5 + 0x1239e)
#10 0x00007f3dbccd0486 lua_pcallk (liblua5.4.so.5 + 0x12486)
#11 0x000000000043b505 extl_l1_call_handler (notion + 0x3b505)
#12 0x00007f3dbcccd573 n/a (liblua5.4.so.5 + 0xf573)
#13 0x00007f3dbccdceb9 n/a (liblua5.4.so.5 + 0x1eeb9)
#14 0x00007f3dbccce142 n/a (liblua5.4.so.5 + 0x10142)
#15 0x00007f3dbcccb6cb n/a (liblua5.4.so.5 + 0xd6cb)
#16 0x00007f3dbccd039e n/a (liblua5.4.so.5 + 0x1239e)
#17 0x00007f3dbccd0486 lua_pcallk (liblua5.4.so.5 + 0x12486)
#18 0x000000000043af8c extl_dodo_call_vararg (notion + 0x3af8c)
#19 0x000000000043a17c extl_docpcall (notion + 0x3a17c)
#20 0x00007f3dbcccd573 n/a (liblua5.4.so.5 + 0xf573)
#21 0x00007f3dbccce129 n/a (liblua5.4.so.5 + 0x10129)
#22 0x00007f3dbcccb6cb n/a (liblua5.4.so.5 + 0xd6cb)
#23 0x00007f3dbccd039e n/a (liblua5.4.so.5 + 0x1239e)
#24 0x00007f3dbccd0486 lua_pcallk (liblua5.4.so.5 + 0x12486)
#25 0x0000000000439341 extl_cpcall (notion + 0x39341)
#26 0x000000000043bc81 extl_cpcall_call (notion + 0x3bc81)
#27 0x000000000043bd66 extl_call (notion + 0x3bd66)
#28 0x00007f3dbc8fffbf menu_do_finish (mod_menu.so + 0x4fbf)
#29 0x000000000043765e do_execute (notion + 0x3765e)
#30 0x000000000041a200 ioncore_mainloop (notion + 0x1a200)
#31 0x00000000004183dd main (notion + 0x183dd)
#32 0x00007f3dbc9cab25 __libc_start_main (libc.so.6 + 0x27b25)
#33 0x000000000041850e _start (notion + 0x1850e)
Edit: can anyone reproduce?
I didn't catch the stacktrace but I confirm it crashed :(
Btw: the menuitem is called Untile
This is super reproducible and I've been looking into it for a few days (I'm pretty lazy). The "bug" is in ops.c
FOR_ALL_MANAGED_BY_TILING(reg, tiling, tmp)
That for loop gets corrupted by this line:
reg2=group_do_attach(grp, ¶m, &data);
Which frees the tiling->managed_list object from underneath it and thus corrupts the pointers.
Here's a really reproducible version.
- do rm -fr ~/.notion
- start notion in the default tiled mode with 1 application window (such as xterm)
- right click on the title bar, go to "Tiling" => "Untile"
- enjoy your core.
Somehow the reparenting has to be done without the rug pull here. It has been there since at least 2014. I didn't follow the full path down though. I could bisect it if we really care.
ops.c may not be the fix point, it's just the first common parent between the defect and crash.
Here's some more details:
(gdb) b ops.c:135
(gdb) c
(gdb) watch tiling->managed_list
(gdb) c
Notice how you'll get something like this:
#0 0x000055cd3c8e4e0a in free_node (ptrlist=0x55cd3dfdb2d0, node=0x55cd3e055210) at ptrlist.c:19
#1 0x000055cd3c8e53a0 in ptrlist_remove (ptrlist=0x55cd3dfdb2d0, ptr=0x55cd3df85180) at ptrlist.c:121
#2 0x00007f2656a257e1 in tiling_do_managed_remove (ws=0x55cd3dfdb210, reg=0x55cd3df85180) at tiling.c:673
#3 0x00007f2656a258f8 in tiling_managed_remove (ws=0x55cd3dfdb210, reg=0x55cd3df85180) at tiling.c:697
#4 0x000055cd3c8b33fe in region_managed_remove (mgr=0x55cd3dfdb210, reg=0x55cd3df85180) at region.c:234
#5 0x000055cd3c8b3b39 in region_detach_manager (reg=0x55cd3df85180) at region.c:576
#6 0x000055cd3c8b5641 in doit_reparent (mgr=0x55cd3e056ba0, par=0x55cd3df455a0, fp=0x7ffe1f188240, cont=0x55cd3c8ce580 <group_do_attach_final>, cont_param=0x7ffe1f1882b0, reg=0x55cd3df85180) at attach.c:86
#7 0x000055cd3c8b58f2 in region_attach_helper (mgr=0x55cd3e056ba0, par=0x55cd3df455a0, fp=0x7ffe1f188240, fn=0x55cd3c8ce580 <group_do_attach_final>, fn_param=0x7ffe1f1882b0, data=0x7ffe1f188290) at attach.c:171
#8 0x000055cd3c8ce96f in group_do_attach (ws=0x55cd3e056ba0, param=0x7ffe1f1882b0, data=0x7ffe1f188290) at group.c:729
#9 0x00007f2656a31cfb in mod_tiling_untile (tiling=0x55cd3dfdb210) at ops.c:148
That's where your memory corruption occurs. I don't know enough about the mechanics here. Maybe a simple copy of the ptr list and then a freeing at the end of the loop would do it?
Ouch, sorry about that. Before 8d3f262 that splittree_remove was already there, but perhaps triggered under different/fewer conditions. That if(!reused)
that got removed looks potentially relevant I guess...
I'll look into it more
Thanks!
it's calling ptrlist_remove(&(ws->managed_list), reg);
in tiling_do_managed_remove
which we identified from above as the issue here.
Here's a really cheap solution
a/mod_tiling/ops.c b/mod_tiling/ops.c
index 0bd21912..bc5059d9 100644
--- a/mod_tiling/ops.c
+++ b/mod_tiling/ops.c
@@ -147,13 +147,20 @@ bool mod_tiling_untile(WTiling *tiling)
reg2=group_do_attach(grp, ¶m, &data);
+ // See #334: tiling->unsplit from the context menu crashes notion
+ if(tiling->managed_list == NULL) {
+ break;
+ }
+
if(reg2==NULL)
warn(TR("Unable to move a region from tiling to group."));
}
tiling->batchop=FALSE;
- region_dispose((WRegion*)tiling);
+ if(tiling->managed_list != NULL) {
+ region_dispose((WRegion*)tiling);
+ }
return TRUE;
}
we can probably do better than that
I'm also open to just using this and then opening up a more minor memory leak bug which I'm sure is sitting around somewhere in this hack and then just move on.
I merged #357 - do you want to keep this issue open to keep looking for a clearer solution or rather close it?
no ... let me handle that
apparently only you have the privilege of closing. Feel free to do that now that we have documented and opened up the more major and now less visible issue.