irqbalance-ui: aborts with coredump if irqbalance is started with IRQBALANCE_BANNED_CPUS
IRQBALANCE_BANNED_CPULIST=32-47,48-63 ./irqbalance
irqbalance-ui
free(): invalid pointer
Aborted (core dumped)
ll /var/lib/systemd/coredump/
total 128
-rw-r-----. 1 root root 61291 Jul 15 11:31 core.irqbalance-ui.0.d97e90f059414c1b8e0d0eb1d00c8c14.4067.1657864890000000.zst
-rw-r-----. 1 root root 61223 Jul 15 11:36 core.irqbalance-ui.0.d97e90f059414c1b8e0d0eb1d00c8c14.4101.1657865160000000.zst
Jul 15 11:36:00 localhost.localdomain systemd[1]: Started Process Core Dump (PID 4102/UID 0).
Jul 15 11:36:00 localhost.localdomain systemd-coredump[4103]: [🡕] Process 4101 (irqbalance-ui) of user 0 dumped core.
Module /root/irqbalance/irqbalance-ui with build-id 88bdf0081453ff408adb5d6b74d7cfacb91e4a7b
Module linux-vdso.so.1 with build-id 7c3e210917108833f13aaa2b8470456d7118448e
Module ld-linux-x86-64.so.2 with build-id d66f437b27ec0a0a70d480f7731f9c9aafd98bad
Module libpcre.so.1 with build-id cffb947bcc416dca3cd249cdb0a1c6f614549c30
Module libc.so.6 with build-id 79ee25245bb9d11d30e095e7ee2629aa4fe4dbf6
Module libtinfo.so.6 with build-id 7745adf36f8d068cdf99dc45bab9352ade38b6eb
Module libncursesw.so.6 with build-id 25554c31777f891c014487b5dd91b2d198aa1941
Module libm.so.6 with build-id 07bcee7dd6b3c9dda6a73fd434e2560632e3241e
Module libglib-2.0.so.0 with build-id addb8fcb7df102ae4897fec40e395bcfb4f4ca59
Stack trace of thread 4101:
#0 0x00007f4bbd28642c __pthread_kill_implementation (libc.so.6 + 0xa642c)
#1 0x00007f4bbd239d06 raise (libc.so.6 + 0x59d06)
#2 0x00007f4bbd20c7d3 abort (libc.so.6 + 0x2c7d3)
#3 0x00007f4bbd27a567 __libc_message (libc.so.6 + 0x9a567)
#4 0x00007f4bbd29043c malloc_printerr (libc.so.6 + 0xb043c)
#5 0x00007f4bbd291d4c _int_free (libc.so.6 + 0xb1d4c)
#6 0x00007f4bbd2947d5 free (libc.so.6 + 0xb47d5)
#7 0x0000000000402c25 n/a (/root/irqbalance/irqbalance-ui + 0x2c25)
ELF object binary architecture: AMD x86-64
The problem also happens if the system is booted with the isolcpus and nohz_full parameters.
On another system, I got the following output:
munmap_chunk(): invalid pointer
Aborted (core dumped)
That system was booted with:
isolcpus=72-75,90-99,108-115,126-140 nohz_full=72-75,90-99,108-115,126-140
Jul 15 02:38:33 localhost.localdomain systemd[1]: Started Process Core Dump (PID 38659/UID 0).
Jul 15 02:38:34 localhost.localdomain systemd-coredump[38660]: Process 38658 (irqbalance-ui) of user 0 dumped core.
Stack trace of thread 38658:
#0 0x00007f5f4c705a4f raise (libc.so.6)
#1 0x00007f5f4c6d8db5 abort (libc.so.6)
#2 0x00007f5f4c748057 __libc_message (libc.so.6)
#3 0x00007f5f4c74f1bc malloc_printerr (libc.so.6)
#4 0x00007f5f4c74f46c munmap_chunk (libc.so.6)
#5 0x00000000004020f5 n/a (/root/irqbalance/irqbalance-ui)
Thanks,
@liuchao173 this is almost certainly related to one of your recent UI changes, please investigate asap
@vishal14051992 if you could run the UI utility under gdb, and provide a line-accurate backtrace, it would help identify the problem.
@vishal14051992 I can't reproduce it in my environment. Can you run the UI utility under gdb and provide a line-accurate backtrace?
Hello,
I compiled the latest code from the irqbalance GitHub repository. Here are the detailed steps I performed:
# git clone https://github.com/Irqbalance/irqbalance.git
# git describe
v1.6.0-189-g56a9a0f
# ./autogen.sh
# ./configure
# make
Then I started irqbalance with a banned CPU and ran irqbalance-ui under gdb:
# IRQBALANCE_BANNED_CPULIST=65 ./irqbalance
# gdb ./irqbalance-ui
(gdb) run
free(): invalid pointer
Program received signal SIGABRT, Aborted.
__pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0;
(gdb) bt
#0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007ffff7bcc493 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78
#2 0x00007ffff7b7fd06 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3 0x00007ffff7b527d3 in __GI_abort () at abort.c:79
#4 0x00007ffff7bc0567 in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7ce659a "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#5 0x00007ffff7bd643c in malloc_printerr (str=str@entry=0x7ffff7ce41e7 "free(): invalid pointer") at malloc.c:5536
#6 0x00007ffff7bd7d4c in _int_free (av=<optimized out>, p=<optimized out>, have_lock=0) at malloc.c:4327
#7 0x00007ffff7bda7d5 in __GI___libc_free (mem=mem@entry=0x406010) at malloc.c:3279
#8 0x0000000000402c25 in parse_setup (setup_data=<optimized out>) at ui/irqbalance-ui.c:191
#9 0x0000000000403965 in parse_setup (setup_data=setup_data@entry=0x52aa00 "SLEEP 10 BANNED 00000002,00000000,00000000") at ui/irqbalance-ui.c:207
#10 0x0000000000405958 in display_tree () at ui/ui.c:797
#11 0x0000000000405a6e in init () at ui/ui.c:682
#12 0x00000000004024a7 in main (argc=<optimized out>, argv=<optimized out>) at ui/irqbalance-ui.c:533
(gdb)
I hope this helps. Let me know if anything else is required.
I see. My environment doesn't have enough CPUs to reproduce it. When hex_to_bitmap processes the ',' in the banned mask, it returns the string literal "0000\0" directly. parse_setup then frees the map, but that memory was never allocated with malloc, so free() aborts. I'll fix this bug as soon as possible.
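
To illustrate the failure mode described above, here is a minimal sketch of a hex-digit-to-bitmap helper in which every return path hands back heap memory, so the caller can always call free() on the result. This is not the project's actual code or the eventual fix: the name nibble_to_bits, the most-significant-bit-first ordering, and the choice to copy the "0000" placeholder for non-hex characters such as ',' are all assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not taken from irqbalance): converts one hex digit
 * into a 4-character string of '0'/'1', most significant bit first.
 * Every successful return is malloc'd memory, so the caller may free() it. */
char *nibble_to_bits(char hex_digit)
{
    int digit;

    if (hex_digit >= '0' && hex_digit <= '9')
        digit = hex_digit - '0';
    else if (hex_digit >= 'a' && hex_digit <= 'f')
        digit = hex_digit - 'a' + 10;
    else if (hex_digit >= 'A' && hex_digit <= 'F')
        digit = hex_digit - 'A' + 10;
    else
        digit = -1; /* not a hex digit, e.g. the ',' between mask groups */

    char *bits = malloc(5);
    if (bits == NULL)
        return NULL;

    if (digit < 0) {
        /* Copy the placeholder into the heap buffer instead of returning
         * the literal "0000": passing a string literal to free() is
         * undefined behaviour and typically aborts in glibc. */
        memcpy(bits, "0000", 5);
        return bits;
    }

    for (int i = 0; i < 4; i++)
        bits[i] = (digit & (1 << (3 - i))) ? '1' : '0';
    bits[4] = '\0';
    return bits;
}
```

With a helper shaped like this, a caller that walks a mask such as "00000002,00000000,00000000" character by character can free each returned string unconditionally. An alternative design would be to have the caller skip separators before calling the helper, which avoids the extra allocation entirely.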