lxde/menu-cache

menu-cached utilize 100% CPU

Closed this issue · 58 comments

Sometimes, mostly after resuming from hibernation, menu-cached uses 100% CPU. Sending SIGTERM helps as a workaround.

I can provide more info, but I don't know how.

Additional info: the bug is triggered after hibernation if I enter the name of a program in a launcher:

  1. Press Alt+F2.
  2. Enter the name of a program that exists on the computer.

I've had this same thing happen when I try to run a python script that doesn't have a .desktop file registered from the "Open With" option in the right click menu of files in PCManFM. The script runs fine from terminal and as soon as a .desktop file is generated it doesn't happen any more.

Is this the actual development branch here? The developers don't seem to react.

I haven't tested whether this commit actually solves the problem, but with pcmanfm 1.2.4, if I do "Open With" with a custom command line, set it as the default application, and don't give it an application name, it starts eating CPU resources.

I've had this issue twice. I don't know what triggers it, but I did use the "Open with" option in pcmanfm shortly before I noticed the issue.
I couldn't reproduce it. I've caught this problem twice and killed the offending process each time.

I also had this occur. I did try to "Open With" a custom command line while pcmanfm was running. The menu-cached process remained after closing pcmanfm.

pcmanfm 1.2.4 & menu-cache 1.0.1 on Arch Linux

The problem is sometimes triggered by Alt+F2 (Run command) in LXDE.

So, I had this occur a few more times. It seems to have only gotten more frequent. I can reproduce it now with the "Open with" menu: on a half-downloaded video with mpv, on a JPEG file with Firefox. It's weird.

  • on dwm.

I remember that a similar bug was once fixed by geektime in SDE, his fork of LXDE. The problem was that menu-cached would hang if there were any menu entries.

It would be pretty good if someone could catch where that 100% core usage happens, using gdb backtrace. I never could reproduce this issue so have no idea how to fix it. :(

I'm having the same issue. I hardly use "Alt-F2" but maybe I did by mistake or something.
Killall menu-cached got rid of the CPU usage. What is this menu-cached? Why does it sit here and use 100% if it does nothing useful at all (killing it has no consequences!). I was hoping to understand.

menu-cached is a daemon that monitors changes in the applications list and
updates the cached system menu after you install or uninstall an application,
edit some application information, or reorder the system menu. It should do
nothing most of the time, just sitting there and sleeping. After you kill it,
the application that uses the cached system menu (via libmenu-cache) silently
restarts it, which is why killing it has no consequences. Anyway, the 100% CPU
usage should be investigated and fixed, which is why I need your help: a
gdb stack trace would show what happens.

I am sorry, I missed your message where you request that. I just skimmed over the replies in this thread and most were just complaining about the same thing. I hope to help.
Thanks.

gdb isn't very useful here, sadly:

(gdb) thread apply all bt
Thread 3 (Thread 0x7f78b1b1d700 (LWP 8697)):
#0  0x00007f78b2f3768d in poll () from /usr/lib/libc.so.6
#1  0x00007f78b3243fd6 in ?? () from /usr/lib/libglib-2.0.so.0
#2  0x00007f78b32440ec in g_main_context_iteration () from /usr/lib/libglib-2.0.so.0
#3  0x00007f78b3244131 in ?? () from /usr/lib/libglib-2.0.so.0
#4  0x00007f78b326a2b5 in ?? () from /usr/lib/libglib-2.0.so.0
#5  0x00007f78b2199474 in start_thread () from /usr/lib/libpthread.so.0
#6  0x00007f78b2f4069d in clone () from /usr/lib/libc.so.6

Thread 2 (Thread 0x7f78abfff700 (LWP 8698)):
#0  0x00007f78b2f3768d in poll () from /usr/lib/libc.so.6
#1  0x00007f78b3243fd6 in ?? () from /usr/lib/libglib-2.0.so.0
#2  0x00007f78b3244362 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#3  0x00007f78b383f726 in ?? () from /usr/lib/libgio-2.0.so.0
#4  0x00007f78b326a2b5 in ?? () from /usr/lib/libglib-2.0.so.0
#5  0x00007f78b2199474 in start_thread () from /usr/lib/libpthread.so.0
#6  0x00007f78b2f4069d in clone () from /usr/lib/libc.so.6

Thread 1 (Thread 0x7f78b3ca9700 (LWP 8694)):
#0  0x00007f78b2f3768d in poll () from /usr/lib/libc.so.6
#1  0x00007f78b3243fd6 in ?? () from /usr/lib/libglib-2.0.so.0
#2  0x00007f78b3244362 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#3  0x00000000004024db in ?? ()
#4  0x00007f78b2e79741 in __libc_start_main () from /usr/lib/libc.so.6
#5  0x0000000000402679 in ?? ()

strace shows constant calls to this:

poll([{fd=3, events=POLLIN|POLLPRI}, {fd=4, events=POLLIN}, {fd=8, events=POLLIN|POLLPRI}], 3, -1) = 1 ([{fd=8, revents=POLLNVAL}])

The return value is ([{fd=8, revents=POLLNVAL}]), and fd 8 isn't open:

$ ls /proc/8694/fd -l
total 0
lr-x------ 1 dx users 64 Jun 30 02:07 0 -> /dev/null
l-wx------ 1 dx users 64 Jun 30 02:07 1 -> /dev/null
lrwx------ 1 dx users 64 Jun 30 02:07 10 -> 'anon_inode:[eventfd]'
lrwx------ 1 dx users 64 Jun 30 02:07 11 -> 'anon_inode:[eventfd]'
lrwx------ 1 dx users 64 Jun 30 02:07 12 -> 'socket:[75244389]'
lr-x------ 1 dx users 64 Jun 30 02:07 13 -> anon_inode:inotify
l-wx------ 1 dx users 64 Jun 30 02:07 2 -> /dev/null
lrwx------ 1 dx users 64 Jun 30 02:07 3 -> 'socket:[75276290]'
lrwx------ 1 dx users 64 Jun 30 02:07 4 -> 'anon_inode:[eventfd]'
lrwx------ 1 dx users 64 Jun 30 02:07 5 -> 'socket:[75247030]'
lr-x------ 1 dx users 64 Jun 30 02:07 6 -> 'pipe:[75247034]'
l-wx------ 1 dx users 64 Jun 30 02:07 7 -> 'pipe:[75247034]'
lrwx------ 1 dx users 64 Jun 30 02:07 9 -> 'socket:[75276292]'
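The strace line and the fd listing above would explain the spin: poll() is asked to wait forever (timeout -1) on a descriptor that is no longer open, so it returns immediately with POLLNVAL every time it is called. A minimal sketch of that behaviour in plain POSIX C follows. It is not menu-cached code, and the function name `poll_closed_fd` is made up for illustration.

```c
#include <poll.h>
#include <unistd.h>

/* Illustration only: polls a descriptor that has already been closed,
 * using an infinite timeout, and returns the revents reported for it
 * (or -1 if the setup or the poll() call itself failed). */
int poll_closed_fd(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    close(fds[0]);
    close(fds[1]);

    /* fds[0] is now a stale descriptor number, like fd 8 in the strace. */
    struct pollfd pfd = { .fd = fds[0], .events = POLLIN | POLLPRI };

    /* A timeout of -1 means "block until something happens", yet the
     * call returns at once because the descriptor is invalid. */
    int n = poll(&pfd, 1, -1);
    return n == 1 ? pfd.revents : -1;
}
```

A main loop that keeps such a descriptor in its poll set never sleeps, which matches the constant poll() calls seen in strace.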

I have no idea how this started happening. I don't suspend/hibernate, and it never happened before. The process was started 4 hours ago and its CPU time is 20 minutes lower (because it has been paused in gdb), but 4 hours ago is roughly when I started pcmanfm (which is the only package I have that depends on menu-cache).

I don't have debug symbols or an easy way to install them without rebuilding the package, but some attempts to set breakpoints show that g_main_context_dispatch is called but g_main_dispatch isn't (because pending_dispatches->len == 0, so it never reaches the application code?)

I see the same behaviour with menu-cached. It could be related to mixing LXDE applications into other environments; personally I am using openbox, tint2 and pcmanfm. Again, it happens when I select an application in the Open With window in pcmanfm. As soon as the window closes, the 100% CPU usage appears in top.

@diegocrzt and @ALL - really nice prose, but really useful information would include the distribution used, program versions and so on. Steps to reproduce this bug are welcome too.

@agaida, I am using Arch Linux (64-bit), fully updated. The menu-cache version is 1.0.1-2 and, as I mentioned, I am not using a full LXDE desktop environment; instead I am using openbox 3.6.1(-3).
To reproduce the behaviour on my computer, these are the steps, starting from an idle system (no process at 100% CPU utilization in top/htop):

  • Open pcmanfm
  • Right click on a file and select Open With...
  • In the Choose application window Select the tab Custom Command Line
  • Write a valid (installed) program and click OK
  • The selected program opens the selected file without problem
  • But in top/htop I can see 100% of cpu utilization for menu-cached

I am new to these LXDE utilities (pcmanfm/menu-cache), but if there is any further information I can gather or any task I can do, I am open to trying.

@diegocrzt Nice, I can reproduce that.

Thread 1 received signal SIGPIPE, Broken pipe.
0x0000000070000005 in ?? ()
(rr) bt
[...]
#6  0x00007f58ae598750 in write () from /usr/lib/libc.so.6
#7  0x000055c7d261e449 in do_reload (cache=0x55c7d29a8800) at menu-cached.c:313
#8  0x000055c7d261e614 in on_file_changed (mon=<optimized out>, gf=<optimized out>, other=<optimized out>, evt=G_FILE_MONITOR_EVENT_CREATED, cache=0x55c7d29a8800) at menu-cached.c:389
#9  0x00007f58ad8691f0 in ffi_call_unix64 () from /usr/lib/libffi.so.6
#10 0x00007f58ad868c58 in ffi_call () from /usr/lib/libffi.so.6
#11 0x00007f58aeb7dea6 in g_cclosure_marshal_generic_va (closure=0x55c7d29cc4c0, return_value=0x0, instance=<optimized out>, args_list=<optimized out>, marshal_data=<optimized out>, n_params=3, param_types=0x55c7d29cc080)
    at gclosure.c:1604
#12 0x00007f58aeb7cfea in _g_closure_invoke_va (closure=closure@entry=0x55c7d29cc4c0, return_value=return_value@entry=0x0, instance=instance@entry=0x55c7d29c1da0, args=args@entry=0x7ffce9ef4740, n_params=3, param_types=0x55c7d29cc080)
    at gclosure.c:867
#13 0x00007f58aeb9721a in g_signal_emit_valist (instance=0x55c7d29c1da0, signal_id=<optimized out>, detail=0, var_args=var_args@entry=0x7ffce9ef4740) at gsignal.c:3294
#14 0x00007f58aeb97fb7 in g_signal_emit (instance=instance@entry=0x55c7d29c1da0, signal_id=<optimized out>, detail=detail@entry=0) at gsignal.c:3441
#15 0x00007f58aee17f6d in g_file_monitor_emit_event (monitor=0x55c7d29c1da0, child=0x7f58a4001760, other_file=0x0, event_type=G_FILE_MONITOR_EVENT_CREATED) at gfilemonitor.c:290
#16 0x00007f58aeec4b18 in g_file_monitor_source_dispatch (source=source@entry=0x55c7d29cc710, callback=<optimized out>, user_data=<optimized out>) at glocalfilemonitor.c:546
#17 0x00007f58ae8a9394 in g_main_dispatch (context=0x55c7d29a6760) at gmain.c:3154
#18 g_main_context_dispatch (context=context@entry=0x55c7d29a6760) at gmain.c:3769
#19 0x00007f58ae8a976c in g_main_context_iterate (context=0x55c7d29a6760, block=block@entry=1, dispatch=dispatch@entry=1, self=self@entry=0x55c7d29a6a00) at gmain.c:3840
#20 0x00007f58ae8a9b13 in g_main_loop_run (loop=0x55c7d29a6970) at gmain.c:4034
#21 0x000055c7d261eaf1 in main (argc=<optimized out>, argv=<optimized out>) at menu-cached.c:823
(rr) frame 7
#7  0x000055c7d261e449 in do_reload (cache=0x55c7d29a8800) at menu-cached.c:313
313             if(write(g_io_channel_unix_get_fd(ch), buf, 37) < 37)

(rr) p g_io_channel_unix_get_fd(ch)
$1 = 8

(rr) s
314                 g_io_channel_shutdown(ch, FALSE, NULL);
(rr) l
309         /* notify the clients that reload is needed. */
310         for( l = cache->clients; l; l = l->next )
311         {
312             GIOChannel* ch = (GIOChannel*)l->data;
313             if(write(g_io_channel_unix_get_fd(ch), buf, 37) < 37)
314                 g_io_channel_shutdown(ch, FALSE, NULL);
315         }
316         cache->need_reload = FALSE;
317     }
318

(rr) s
g_io_channel_shutdown (channel=0x55c7d29a7290, flush=0, err=0x0) at giochannel.c:487
487     {

That channel is created in on_new_conn_incoming

702         child = g_io_channel_unix_new(client);
703         g_io_channel_set_close_on_unref( child, TRUE );
704         g_io_add_watch_full(child, G_PRIORITY_DEFAULT, G_IO_PRI|G_IO_IN|G_IO_HUP|G_IO_ERR,
705                             on_client_data_in, child, on_client_closed);
706         g_io_channel_unref(child);

on_client_data_in is called once, with a line of data. on_client_closed is never called (btw, that function has a FIXME with a couple of red flags). The return value of g_io_add_watch_full is not saved anywhere (should probably be removed with g_source_remove), and that g_io_channel_set_close_on_unref( child, TRUE ); probably isn't helping. Getting to SIGPIPE already means the ownership of the fd isn't clear. This looks messy, not sure what to do.

Actually, it seems that here we must remember the source ID returned by g_io_add_watch (rather than g_io_add_watch_full, since the GDestroyNotify function doesn't seem useful here) and explicitly call g_source_remove() when closing (destroying) the channel; otherwise we see 100% CPU usage.
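The idea of removing the watch at the moment the channel is closed can be illustrated without GLib. The sketch below is a plain poll() analogue, not the actual patch: `watch_set`, `watch_add`, `watch_close` and `watch_poll` are hypothetical names, and the slot index stands in for the GLib source ID. It relies on the documented poll() rule that entries with a negative fd are ignored.

```c
#include <poll.h>
#include <unistd.h>

#define MAX_WATCHES 8

/* Hypothetical stand-in for the main loop's set of watched descriptors. */
struct watch_set {
    struct pollfd fds[MAX_WATCHES];
    nfds_t count;
};

/* Registers fd for input events and returns its slot index, which plays
 * the role of the source ID here. Returns -1 if the set is full. */
int watch_add(struct watch_set *ws, int fd)
{
    if (ws->count == MAX_WATCHES)
        return -1;
    ws->fds[ws->count].fd = fd;
    ws->fds[ws->count].events = POLLIN;
    ws->fds[ws->count].revents = 0;
    return (int)ws->count++;
}

/* Closes the descriptor and drops its watch in the same step, so the
 * loop never polls a stale descriptor number. */
void watch_close(struct watch_set *ws, int id)
{
    if (id < 0 || (nfds_t)id >= ws->count || ws->fds[id].fd < 0)
        return;
    close(ws->fds[id].fd);
    ws->fds[id].fd = -1;  /* poll() skips entries with a negative fd */
}

/* One loop iteration: returns the number of ready descriptors. */
int watch_poll(struct watch_set *ws, int timeout_ms)
{
    return poll(ws->fds, ws->count, timeout_ms);
}
```

With the watch dropped, an iteration over the set simply times out instead of returning immediately with POLLNVAL.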

Also, it seems that even when the client side of the socket has already closed, the server does not receive POLLHUP. Only when the server tries to write to the socket does it receive SIGPIPE and see write() fail. That is the first point at which we know the client is gone, so we have to call the "shutdown" function explicitly there.
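The write-side detection described here can be reproduced in isolation. The following is a minimal sketch in plain POSIX C, assuming SIGPIPE is ignored so that the failure shows up as an error code; it uses a socketpair() rather than menu-cached's listening socket, and `write_to_closed_peer` is a made-up name. The 37-byte buffer only mirrors the size used in do_reload.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

/* Illustration only: writes to a stream socket whose peer has closed.
 * Returns the errno of the failed write, 0 if the write unexpectedly
 * succeeded, or -1 if the socket pair could not be created. */
int write_to_closed_peer(void)
{
    int sv[2];

    /* Assumption: the signal is ignored. With the default action the
     * process would be terminated by SIGPIPE instead. */
    signal(SIGPIPE, SIG_IGN);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    close(sv[1]);  /* the "client" end goes away */

    char buf[37] = {0};
    ssize_t n = write(sv[0], buf, sizeof buf);
    int err = (n < 0) ? errno : 0;

    close(sv[0]);
    return err;
}
```

On Linux the write fails with EPIPE, which is the point where the server side learns that the client is gone and has to clean up the channel and its watch.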

I added one commit to my branch, which hopefully fixes this issue.

When the CPU spikes, log out and back in.

Brand new installation of Lubuntu 16.10 on my laptop and the issue is still happening.

I just don't know if it was because I returned from the suspended mode or it's because I tried to run a custom command with "Open With".

CPU usage was 101% and the CPU temperature was 96 degrees! 😱

Thank you very much, @mtasaka. I've merged your changes; they seem sane. Let's hope the issue is fixed!

Hi LStranger, I just returned to my Arch Linux Openbox setup, using menu-cache with openbox-menu, and I ran into this same runaway CPU issue with the official 1.0.1 version of menu-cache. After reading about your patch here, I installed menu-cache-git from the AUR, modifying the PKGBUILD to make the version match your new 1.0.2.5e50156 version. Now menu-cache-git is installed, and after I restarted X the CPU is quiet, absolutely quiet!

I'm very happy and I'm heading to the Arch forums to tell them about it. Many thanks for your good work!

Dear Melodie, please thank Mamoru Tasaka; he fixed the issue, not me.

Yes ok sure! Mtasaka hi, thanks for the patch which solved the issue!

just a thumbs up from me:
I am - or was - in this exact situation (pcmanfm on archlinux, "open with" with custom command), and did this to fix it.
thanks.

I can only hope this will be merged into the next release (arch's menu-cache is currently on 1.0.1-2).

0x85C commented

@ohnonot Arch Linux just upgraded menu-cache to 1.0.2-1 today :) yay

@brandonsandoval: actually, this deserves more than an emoji.
important info, i can remove an aur package before i forget about it. thanks for telling.

cefn commented

Great work! I experienced the same problem on Lubuntu Xenial, which reports...

cefn@cefn-xenial-toshiba:~$ dpkg --list | grep menu-cache
ii libmenu-cache-bin 1.0.1-1build1 amd64 LXDE implementation of the freedesktop Menu's cache (libexec)
ii libmenu-cache3:amd64 1.0.1-1build1 amd64 LXDE implementation of the freedesktop Menu's cache

...and currently top looks like...

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND     
 1399 cefn      20   0  196276      0      0 R  99.3  0.0 242:00.36 menu-cached 

Any idea if I can simply install and run the zesty versions or would this be a very bad idea? They are downloadable from...
https://launchpad.net/ubuntu/zesty/amd64/libmenu-cache-bin/1.0.2-1
https://launchpad.net/ubuntu/zesty/amd64/libmenu-cache3/1.0.2-1

cefn, if you installed them, you could easily switch back to the xenial version.
I assume it might not work because of the versions of the libraries it uses:
$ ldd /usr/lib/menu-cache/menu-cached
If you get problems uninstalling a package because apt thinks some packages circularly depend on each other (apt install -f doesn't work), you can remove the bare package with dpkg and simply reinstall the official one with apt, e.g.
$ sudo dpkg -r menu-cached --ignore-depends=menu-cached

Why is the fixed menu-cached not distributed in the yakkety and xenial lubuntu versions?

cefn commented

@HybridDog I hadn't tried installing them, on the grounds that the version bump might create fundamental incompatibilities, but I have now assumed this isn't the case. I downloaded the files to a directory, changed into it, and ran the following...

$ sudo dpkg --install libmenu-cache*.deb
(Reading database ... 242534 files and directories currently installed.)
Preparing to unpack libmenu-cache3_1.0.2-1_amd64.deb ...
Unpacking libmenu-cache3:amd64 (1.0.2-1) over (1.0.1-1build1) ...
Preparing to unpack libmenu-cache-bin_1.0.2-1_amd64.deb ...
Unpacking libmenu-cache-bin (1.0.2-1) over (1.0.1-1build1) ...
Setting up libmenu-cache-bin (1.0.2-1) ...
Setting up libmenu-cache3:amd64 (1.0.2-1) ...
Processing triggers for libc-bin (2.23-0ubuntu5) ...

...so far without incident.

Closing this as it appears to be actually fixed. Thank you everyone.

Still happening on Arch with libfm 1.2.5-1 + menu-cache 1.0.2-2.

This is still happening on Ubuntu 16.04 with libfm 1.2.4 and menu-cache 1.0.2

I've updated both those packages to 1.0.2-1 and still seeing the high CPU. Would something have changed between -1 and -3 to be worth trying?

Thanks. Will do.

OK, but how does this help with my original post?
Arch is still having the issue with menu-cache 1.0.2-2?

I hope the last commit finally fixed this issue; please check it. The PPA will get its next update in about 22 hours unless @gilir triggers a build before then. Thank you.

The Bodhi Linux people report that this issue is finally fixed by the latest commits, which should appear in the PPA in 23 hours. Be patient and you'll get to test it. If no more issues are found in the next 10 days, release 1.1.0 will follow, with this fix included. Thank you everyone.

Would like to confirm the latest source appears to resolve this issue.

Now 1.1.0 is to be released?

Which specific commits fix this? If I know that, I can cherry pick the commits and upload to all stable Ubuntu releases (and Artful (to be 17.10)).

I'm sorry for the delay; my computer at home crashed, and some private issues came up as well. I will do the release as soon as I can, and will release the Debian/Ubuntu package to unstable as well. For Ubuntu stable I would appreciate your help: the commits from September 11...14 fix three important issues, so it would be nice to have them added as a patch. Thank you.

Hey @LStranger:

I'll take responsibility for getting fixes in Ubuntu, as long as you could tell me which commits I should cherry pick. I don't want to pick the wrong ones.

It would also help if you could tell me how far back I should cherry pick them.

Thanks!

gilir commented

@tsimonq2
I just checked the commits since 1.0.2, most are bug fixes. I think it's probably safer to upload it as 1.1.0 than trying to cherry-pick commits. If you are unsure, you can revert the API addition.
However, looking at the schedule, it's probably too late for the release, but maybe as a 0-day SRU ? I can do the packaging if you do the paperwork :-)

I am running Lubuntu 16.04.5 LTS (Xenial Xerus) and I am still having this issue.

Today I right-clicked an .avi file (in PCManFM 1.2.4), chose "Open with", clicked the Custom command tab, then typed the command: mpv --no-keepaspect
That failed, so I tried again, entering the command: mpv --no-keepaspect "media file name.avi"
A few minutes later I realized that the menu-cached process was eating more than 40% of my CPU.

Linux * 4.4.0-135-generic #161-Ubuntu SMP Mon Aug 27 10:46:32 UTC 2018 i686 i686 i686 GNU/Linux

dpkg --list | grep menu-cache

ii  libmenu-cache-bin                                           1.0.1-1ubuntu0.2                           i386         LXDE implementation of the freedesktop Menu's cache (libexec)
ii  libmenu-cache3:i386                                         1.0.1-1ubuntu0.2                           i386         LXDE

I believe this is an issue because I am using apt-get update and apt-get upgrade on the LTS release of Ubuntu (16.04). The active repos in my sources.list file:
deb http://fr.archive.ubuntu.com/ubuntu/ xenial main restricted universe multiverse
deb http://fr.archive.ubuntu.com/ubuntu/ xenial-security main restricted universe multiverse
deb http://fr.archive.ubuntu.com/ubuntu/ xenial-proposed main restricted universe multiverse
deb http://archive.canonical.com/ubuntu xenial partner
deb-src http://archive.canonical.com/ubuntu xenial partner
deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main
deb http://deb.torproject.org/torproject.org xenial main
deb http://ppa.launchpad.net/ubuntu-wine/ppa/ubuntu xenial main
deb-src http://deb.torproject.org/torproject.org xenial main
deb-src http://ppa.launchpad.net/ubuntu-wine/ppa/ubuntu xenial main

@slrslr

this exact issue was believed to be fixed in 1.0.2-1

yet your version is

libmenu-cache-bin 1.0.1-1ubuntu0.2

generally speaking, the *buntu 16.04 LTS series is beginning to show its age.
normal upgrading won't help if the maintainers do not deem this important enough.
you raised the issue in the wrong place. it has been long fixed here, you need to take it to your distro maintainers.

ib commented

We believe that the issue is fixed.