100% CPU usage and not doing anything
The symptoms differ from #501 / #457.
rtorrent seems to stop doing anything useful: the UI remains responsive, but two threads are pinned at 100% CPU.
System:
archlinux, rtorrent 0.9.6, libcurl 7.51.0, openssl 1.0.2j
top dump:
top - 04:16:00 up 39 min, 3 users, load average: 1.96, 1.98, 1.96
Tasks: 143 total, 2 running, 141 sleeping, 0 stopped, 0 zombie
%Cpu0 : 47.3/52.7 100[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||]
%Cpu1 : 0.0/0.0 0[ ]
%Cpu2 : 0.0/0.0 0[ ]
%Cpu3 : 29.1/70.9 100[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||]
%Cpu4 : 0.0/0.0 0[ ]
%Cpu5 : 0.0/0.0 0[ ]
%Cpu6 : 0.0/0.0 0[ ]
%Cpu7 : 0.0/0.0 0[ ]
GiB Mem : 32.8/3.862 [ ]
perf top dump:
PerfTop: 14492 irqs/sec kernel:80.4% exact: 0.0% [4000Hz cpu-clock], (all, 8 CPUs)
-------------------------------------------------------------------------------
12.80% libtorrent.so.19.0.0 [.] torrent::thread_base::event_loop
8.73% [kernel] [k] _raw_spin_lock
6.21% [kernel] [k] _raw_spin_unlock_irqrestore
5.17% [kernel] [k] entry_SYSCALL_64_fastpath
4.40% [kernel] [k] pvclock_clocksource_read
3.42% [kernel] [k] __call_rcu.constprop.62
2.61% [vdso] [.] 0x0000000000000981
2.41% [kernel] [k] __fget
2.31% libtorrent.so.19.0.0 [.] torrent::signal_bitfield::work
1.93% [vdso] [.] __vdso_gettimeofday
1.77% libpthread-2.24.so [.] __GI___libc_accept
1.76% libtorrent.so.19.0.0 [.] torrent::PollEPoll::do_poll
1.76% [kernel] [k] __slab_free
1.47% [kernel] [k] inode_init_always
1.44% libc-2.24.so [.] 0x00000000000e8dd3
1.28% [kernel] [k] select_estimate_accuracy.part.2
1.27% [kernel] [k] ep_send_events_proc
1.25% [kernel] [k] fput
1.18% [kernel] [k] _raw_spin_unlock_irq
1.17% [kernel] [k] ktime_get_ts64
I have about 4k torrents added, but this seems to happen when about 150 torrents are started.
All of the torrents that are hashing get closed.
Any ideas?
UPDATE:
I've generated a flamegraph of the system with around 150 torrents started; both threads are spending most of their CPU time in system calls. It seems to be making a lot of accept and epoll calls.
Hmm, I got the same problem with curl-7.52.1, and no problems with curl-7.51.0.
I can confirm that. I was wondering why it couldn't connect to HTTP trackers anymore (I didn't notice the CPU usage at first), and downgrading curl to 7.51.0 fixed it.
Have you reported this to the curl devs? They would know better which changes might be causing this.
No, I did not. Moreover, rtorrent (feature-bind) does not download from some private trackers.
UPDATE: Updating to the new build of curl on archlinuxarm seems to have fixed the issue (HTTPS tracker timeouts):
https://git.archlinux.org/svntogit/packages.git/commit/trunk?h=packages/curl&id=bae58314ffe16b0642811fdb0187a9b8eba8d362
I also started encountering this recently on a Debian testing amd64 box. I don't know if it coincided with an update to libcurl, but I know it coincided with an update from kernel 4.11 to 4.12.
Checking strace on rtorrent during times of spiked CPU use, the system is pounded with epoll_wait(... {EPOLLOUT, ...}, ...) calls; the return value (the number of ready fds) seems to always equal the number of entries filled into the struct epoll_event *events array, which in turn appears to correspond to the number of 'handshake' sockets shown in the UI ([S 7/34/768]). (There have been a few occasions where handshakes were zero, and CPU use also dropped.)
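For reference, the prototype is int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout). A minimal sketch of the pattern the strace shows (illustrative only, not libtorrent's actual code):

#include <sys/epoll.h>

/* Sketch of the traced call pattern. epoll here is level-triggered (the
 * default), so a socket whose send buffer has room is reported EPOLLOUT
 * on *every* call -- the same handshake sockets come back again and again. */
void poll_step(int epfd)
{
    struct epoll_event events[1024];  /* maxevents == 1024, as in the strace */

    /* Returns the number of ready fds -- 15 in the strace below, matching
     * the number of EPOLLOUT entries filled into events[]. */
    int n = epoll_wait(epfd, events, 1024, 781);

    for (int i = 0; i < n; i++) {
        if (events[i].events & EPOLLOUT) {
            /* If the handler makes no progress (e.g. a handshake stalled in
             * curl/TLS) and the fd stays armed for EPOLLOUT, the next
             * epoll_wait() reports it again immediately -- a busy loop. */
        }
    }
}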
I'm not a kernel hacker, but I'm slightly suspicious of work on epoll that went into 4.12, like "epoll: Add busy poll support to epoll with socket fds".
Edit: I should also note that the epoll_wait(...) calls appear to be happening roughly 1-5 times per millisecond, based on the timeout value, e.g.:
epoll_wait(7, [{EPOLLOUT, {u32=152, u64=152}}, ...], 1024, 781) = 15
epoll_wait(7, [{EPOLLOUT, {u32=152, u64=152}}, ...], 1024, 780) = 15
epoll_wait(7, [{EPOLLOUT, {u32=152, u64=152}}, ...], 1024, 779) = 15
epoll_wait(7, [{EPOLLOUT, {u32=152, u64=152}}, ...], 1024, 779) = 15
epoll_wait(7, [{EPOLLOUT, {u32=152, u64=152}}, ...], 1024, 779) = 15
epoll_wait(7, [{EPOLLOUT, {u32=152, u64=152}}, ...], 1024, 779) = 15
epoll_wait(7, [{EPOLLOUT, {u32=152, u64=152}}, ...], 1024, 779) = 15
epoll_wait(7, [{EPOLLOUT, {u32=152, u64=152}}, ...], 1024, 778) = 15
epoll_wait(7, [{EPOLLOUT, {u32=152, u64=152}}, ...], 1024, 778) = 15
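To illustrate the suspected mechanism (an assumption on my part, not a trace of rtorrent itself): with level-triggered epoll, a connected socket with free send-buffer space is permanently EPOLLOUT-ready, so a waiter that never clears or disarms it returns immediately every time, regardless of the timeout. A self-contained demo:

#include <stdio.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <time.h>

int main(void)
{
    int sv[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        perror("socketpair");
        return 1;
    }

    int epfd = epoll_create1(0);
    struct epoll_event ev = { .events = EPOLLOUT, .data.u32 = 152 };
    epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev);   /* armed for EPOLLOUT, never disarmed */

    struct epoll_event out[1024];
    long calls = 0;
    time_t start = time(NULL);
    while (time(NULL) - start < 1) {
        /* The send buffer of sv[0] is never full, so despite the ~780 ms
         * timeout this returns immediately every time -- mirroring the
         * strace output above. */
        epoll_wait(epfd, out, 1024, 781);
        calls++;
    }
    printf("epoll_wait() calls in ~1 second: %ld\n", calls);
    return 0;
}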