rakshasa/rtorrent

repeated temporary freezes on 9.x

mikisvaz opened this issue · 42 comments

Rtorrent freezes a few seconds after startup, and then again every now and then. Each freeze lasts several minutes, and at times it becomes permanently unresponsive. The freeze affects both the ncurses UI and the SCGI interface. There is no unusual CPU usage from rtorrent during that time.

I've tried several versions of rtorrent on Arch, including the extended and pyro packages. I even compiled several versions against a build of libcurl with c-ares support (as suggested in http://filesharefreak.com/tutorials/rtorrent-libtorrent-installing-on-linux/). Nothing helped.

Any suggestions?

# .rtorrent.rc

min_peers = 40
max_peers = 100

min_peers_seed = 10
max_peers_seed = 50

max_uploads = 15

download_rate = 800
upload_rate = 20

session = ~/session

directory = ~/downloads/

group.insert_persistent_view=sick
group.sick.ratio.enable=
group.sick.ratio.max.set=200

group.insert_persistent_view=manual
group.manual.ratio.enable=
group.manual.ratio.max.set=200

group.insert_persistent_view=couch
group.couch.ratio.enable=
group.couch.ratio.max.set=200

schedule = watch_directory_1,5,5,"load_start=~/blackhole/*.torrent"
schedule = watch_directory_2,5,5,"load_start=~/blackhole/sickbeard/*.torrent, d.set_custom1=sick, \"d.set_custom=isauto,~/temp-process/sickbeard/\",d.set_custom2=sick"
schedule = watch_directory_3,5,5,"load_start=~/blackhole/couchpotato/*.torrent, d.set_custom1=couch, \"d.set_custom=isauto,~/temp-process/couchpotato/\",d.set_custom3=couch"

system.method.set_key = event.download.inserted_new,set_ratio,"branch=d.get_custom2=,view.set_visible=sick"
system.method.set_key = event.download.inserted_new,set_ratio,"branch=d.get_custom3=,view.set_visible=couch"

system.method.set_key = event.download.inserted_new,set_autodir,"d.set_custom=autodir,\"$cat=$d.get_custom=isauto,$d.get_name=\""

system.method.set_key = event.download.inserted_new,set_manualdir,"d.set_custom=manualdir,\"$cat=~/downloads/complete/,$d.get_custom1=\""

system.method.set_key = event.download.inserted_new,set_movedir,"branch=d.get_custom=isauto,\"d.set_custom=movedir,$d.get_custom=autodir\",\"d.set_custom=movedir,$d.get_custom=manualdir\""

system.method.set_key = event.download.inserted_new,del_tor,"execute={rm,-rf,--,$d.get_loaded_file=}"

system.method.set_key = event.download.finished,move_complete,"d.close=;execute=mkdir,-p,$d.get_custom=movedir;execute=mv,-u,$d.get_base_path=,$d.get_custom=movedir;d.erase="

system.method.set = group.sick.ratio.command, d.close=, "execute={rm,-rf,--,$d.get_base_path=}", d.erase=
system.method.set = group.couch.ratio.command, d.close=, "execute={rm,-rf,--,$d.get_base_path=}", d.erase=

dht = auto
dht_port = 6881

peer_exchange = yes

port_range = 10001-10010

check_hash = no

scgi_port = localhost:5000

schedule = throttle_1,20:00:00,24:00:00,upload_rate=30
schedule = throttle_2,10:00:00,24:00:00,upload_rate=10

log.open_file = "rtorrent.log", (cat,/tmp/rtorrent.log.,(system.pid))

I solved the problem.

I turned on debugging and saw that the freezes coincided with these messages:

1394722090 I 146D112F85F21D449E16C502DAD44641B17203CA->tracker_list: Failed to connect to tracker url:'udp://fr33domtracker.h33t.com:3310/announce' msg:'Could not resolve hostname.'.

This reinforced the idea of a DNS problem. As I mentioned, compiling against a c-ares version of libcurl did not help.
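
The log line above has a regular shape, so failing trackers can be pulled out of a debug log mechanically. A minimal Python sketch, assuming the log format matches the single line quoted here; the regex and the function name `failed_trackers` are illustrative and not part of rtorrent:

```python
import re
from collections import Counter

# Pattern inferred from the one log line quoted in this thread; other
# rtorrent versions may format tracker errors differently.
TRACKER_FAIL = re.compile(
    r"Failed to connect to tracker url:'(?P<url>[^']*)' msg:'(?P<msg>[^']*)'"
)


def failed_trackers(lines):
    """Count (url, message) pairs for every tracker failure in the log lines."""
    counts = Counter()
    for line in lines:
        match = TRACKER_FAIL.search(line)
        if match:
            counts[(match.group("url"), match.group("msg"))] += 1
    return counts


if __name__ == "__main__":
    sample = (
        "1394722090 I 146D112F85F21D449E16C502DAD44641B17203CA->tracker_list: "
        "Failed to connect to tracker "
        "url:'udp://fr33domtracker.h33t.com:3310/announce' "
        "msg:'Could not resolve hostname.'."
    )
    for (url, msg), count in failed_trackers([sample]).items():
        print(count, url, msg)
```

Grouping by message makes it easy to separate "Could not resolve hostname." failures, which point at DNS, from plain timeouts.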

What helped was installing bind as a local name server and pointing my resolv.conf there instead of to the DNS in my router. It is now working smoothly, at least for much longer than before.

This is actually an issue: DNS lookups for UDP trackers need to be in a separate thread.
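
To illustrate what moving lookups off the main thread buys: a blocking resolver call can be handed to a worker so the caller waits only up to a deadline. This is a Python sketch of the general idea, not how libtorrent implements or plans to implement it; `resolve_with_deadline` and its parameters are hypothetical names.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# One shared pool: a lookup that never returns ties up a worker, not the caller.
_POOL = ThreadPoolExecutor(max_workers=4)


def resolve_with_deadline(host, port, deadline=2.0, resolver=socket.getaddrinfo):
    """Return the resolver's result, or None if it fails or misses the deadline.

    The blocking call runs in a worker thread, so the calling thread
    (think: UI or event loop) is held for at most `deadline` seconds.
    """
    future = _POOL.submit(resolver, host, port)
    try:
        return future.result(timeout=deadline)
    except FutureTimeout:
        return None  # the lookup is still running in the background
    except OSError:
        return None  # e.g. socket.gaierror: the name does not resolve
```

The `resolver` parameter exists only so the behaviour can be exercised with a deliberately slow stand-in instead of a real broken DNS server.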

I'm having this too with the current revision. It seems to happen when the session directory is being updated. The UI freezes and all network activity drops. With nearly 4K torrents seeding, it can freeze for a minute. With version 0.9.3 the UI did freeze, but I didn't notice network activity dropping drastically and I think the freezes were shorter.

Downgraded to 0.9.3 and yes, freezes are much shorter: about four seconds versus nearly one minute with the current revision.

Is there a chance this is related to https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=688816 ?

According to Damyan it happens when his network goes down. If it's not related, let me know and I will open a new issue here in the tracker.

I believe I'm experiencing the same problem (also running arch linux), and it does not seem to be related to the debian bug. I turned on debug logging in my rtorrent configuration and did not notice any useful/relevant messages. It seems to be a threading issue. rtorrent does not crash/segfault and there are no CPU spikes, but it does become very unresponsive, as if a thread is waiting for some disk or network IO.

I upgraded recently and just noticed the problem for the first time today and haven't had a chance to dig in further.

Tested again with the latest commit in libtorrent. I still get freezes that stop the UI and network activity every 20 minutes, when the *.torrent.rtorrent and *.torrent.libtorrent.resume files in the session directory are being updated.

Currently working on c-ares support for UDP tracker requests.

Has there been any progress on this lately? A tracker I use recently went offline and their domain won't even respond to dns lookups (nyaa.se), which led me to this bug report. Originally the logs could be interpreted as network or firewall issues, which was somewhat confusing.

I was able to enable logging, which rtorrent did just fine and I got similar errors to @mikisvaz. However neither the client nor the SCGI interface responded so I had to manually delete the (r)torrent files in the .session directory.

I'm running v0.9.2, but if #234 counts for anything it doesn't seem to be better in newer versions of libtorrent/rtorrent.

I also have the issue mentioned by ratpenat and have had it for over a year. With a standard 0.9.3/0.13.3, every 20 minutes without fail, the disk I/O goes insane for up to 5 minutes. During this period the UI freezes, the xmlrpc interface is unresponsive, uploads and downloads stop, and ps shows the rtorrent process constantly in disk wait state. I will note that I am not using any web front end; I just use the xmlrpc interface via a command line client for maintenance tasks and queries.

I have used iotop to watch and ensure that no other processes were causing disk interference and that rtorrent was the only process with disk access and then used fatrace by Martin Pitt to watch what access was occurring during the rtorrent disk wait state. I noticed that the only access was the session save function updating and writing a bunch of $hash.torrent.libtorrent_resume.new and $hash.torrent.rtorrent.new files. I have a strong suspicion that this is the issue that people are complaining about when their seedbox hoster contacts them and advises that their virtual machine is thrashing the disk. I also think that it might be the same as issue #233 although Rid the OP from #233 still has access with the UI even though the SCGI access is frozen.

I have found a work around that seems to help until the issue can be resolved, and that is to recompile from source after you edit the second 1200 entry in the src/main.cc file line

"schedule2 = session_save,1200,1200,((session.save))\n"

like this:

"schedule2 = session_save,1200,14400,((session.save))\n".

I have tried 3600 (1 hr), 14400 (4 hrs), 43200 (12 hrs) and 86400 (24 hrs). The greater the number, the less frequently the session data is saved, so rtorrent thrashes your disk less often. The trade-off is that you may lose resume and statistical data if rtorrent terminates before the next session save; I don't really care about that, but you should be aware of it in case you do. I have found the 12 hr setting quite acceptable on my seedbox, seeding over 40,000 torrents with an average size of 4 MB. It just goes to show how insanely good this software is that it can handle that kind of load, if not for the session saving. The share of time my rtorrent process spends frozen in disk wait has dropped from approximately 10% (on a good day) to 17% (on a bad day) down to less than 1%.
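
These numbers are consistent with a simple duty-cycle estimate: if each session save blocks for roughly the same time no matter how often it runs, the frozen fraction is the freeze duration divided by the save interval. A worked sketch in Python; the constant-freeze-duration assumption is an illustration and was not measured in this thread:

```python
def frozen_fraction(freeze_seconds, interval_seconds):
    """Fraction of wall-clock time spent frozen, with one freeze per interval."""
    return freeze_seconds / interval_seconds


# Freeze durations implied by "10% to 17%" at the default 1200 s interval.
LOW_FREEZE = 120   # seconds, 10% of 1200
HIGH_FREEZE = 204  # seconds, 17% of 1200

if __name__ == "__main__":
    for interval in (1200, 3600, 14400, 43200, 86400):
        low = frozen_fraction(LOW_FREEZE, interval)
        high = frozen_fraction(HIGH_FREEZE, interval)
        print(f"{interval:>6} s: {low:.2%} to {high:.2%}")
```

At 43200 s this gives roughly 0.28% to 0.47%, in line with the "less than 1%" reported above.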

I have had this issue for over 12 months since I migrated to rtorrent from utorrent, when I had less than half the number of torrents I have now (my best estimate is 12,000). Obviously mo torrents, mo problems as the disk waits (that happen every 20 minutes) last longer as the torrent numbers increase.


I also did what mikisvaz did: installing bind and pointing resolv.conf at it, which works wonders for some reason.

In my case the tracker timing out was mainly an HTTP one (a udp one also helped)

Double check that all entries in resolv.conf are reachable. After a recent network update (192.168.0.1/24->192.168.1.1/24) I had a non-existent server of 192.168.0.1 above the existing 192.168.1.1 in my file. Removing the non-existent server fixed my freezing issues.
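
A quick way to spot a dead entry like that is to send one small query to each configured nameserver and see which ones answer. The following Python sketch assumes a plain /etc/resolv.conf with `nameserver` lines holding IPv4 or IPv6 addresses; the function names and the probe hostname `example.com` are illustrative.

```python
import socket
import struct


def parse_nameservers(text):
    """Return the addresses from 'nameserver' lines, in file order."""
    servers = []
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def build_query(name, query_id=0x1234):
    """Build a minimal DNS query packet for an A record, recursion desired."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.split(".")
        if part
    )
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def answers(server, name="example.com", timeout=2.0):
    """True if the server sends any reply to one UDP query within the timeout."""
    family = socket.AF_INET6 if ":" in server else socket.AF_INET
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            sock.recvfrom(512)
            return True
        except OSError:  # includes timeouts and unreachable networks
            return False


if __name__ == "__main__":
    with open("/etc/resolv.conf") as handle:
        for server in parse_nameservers(handle.read()):
            print(server, "ok" if answers(server) else "NO ANSWER")
```

Any server reported as NO ANSWER that sits above a working one in the file is a candidate for the kind of stall described here.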

I think it works for you because you are querying a DNS server on your LAN, which is fast enough to make the freeze unnoticeable, like with a local bind server. I suspect the underlying problem is still there.

Just wanted to add that I see the same freeze every 20 minutes when the session files are being saved. I'm running 0.9.2 with pyroscope's extensions. I even went so far as to put the session folder on a tmpfs mount. Nothing I've tried has worked so far.

The comments regarding freezes during session directory updates seem to describe a different issue from the one in this ticket; please open a separate ticket to discuss that further.

I have experienced the same problem with what seem to be DNS queries not receiving responses, hanging the UI and RPC interface in the process (not listening to kill -2 either, though kill -5 brings the program down). I just experienced it again as tracker.openbittorrent.com stopped resolving correctly. This tracker was only contacted for about 10 torrents, but it completely froze the rTorrent UI immediately on startup.

My "fix" was to add a manual host entry in /etc/hosts:

127.0.0.1 tracker.openbittorrent.com

which at least gives a quick answer to rTorrent when the host is queried, making the UI responsive again. Note that the problem is not that the tracker is down (since I am certainly not running a tracker on 127.0.0.1, so there are no valid responses given), but it seems to occur when the DNS lookup for a particular domain for any reason is really slow, never yielding an answer.

I isolated the issue to the lookup of that particular host by launching with strace rtorrent and noting that when the garbled output on the terminal screen intermittently stopped, there had always just been a library call to find that particular host. Not a pretty method, though.

I am running version 0.9.2/0.13.2 (Debian Sid).

tracker.openbittorrent.com may be affected by the current dnsimple outage. I've been following this here all day: http://dnsimplestatus.com/. Removing that host from magnets seems to fix the issue for me.

kfei commented

Hi everyone, I also encountered this issue. In my case it's a DNS problem with my ISP; after running echo nameserver 8.8.8.8 > /etc/resolv.conf it does not freeze anymore. By the way, I've also tried what @mikisvaz suggested, and it helped too.

@rakshasa Any news on when this might be resolved? :)

@zmpeg: See https://torrentfreak.com/worlds-largest-bittorrent-tracker-goes-down-141205/ for the problem with openbittorrent.com. There are no valid DNS entries, which apparently leads to a block of the rTorrent thread in this case.

I encourage everyone to keep this issue on topic; i.e. focused on rTorrent operation. I just mentioned the link above to perhaps help in pinpointing how the DNS lookup locking issue can occur to begin with.

I have this same problem. It's always on resolving DNS for down trackers. Freezes are anywhere from 30 seconds to 20 minutes, mean seems to be 4 minutes. It happens every 4th time I check on the tracker, which is rarely, so I have no idea how often it's happening but it has to be bad. During this time, network traffic drops to nothing, and clients report the rtorrent server to be down.

This has been seen only since 0.9.4. I have downgraded to 0.9.3 and the problem still exists but only for 5-10 seconds and does not seem to happen nearly as often.

Quick & dirty fix: log all tracker errors, then disable the failing trackers by redirecting them to localhost.
First enable logs in .rtorrent.rc

log.open_file = "tracker.log", "/var/log/rtorrent/tracker.log"
log.add_output = "tracker_debug", "tracker.log"

Then parse these logs

# find all "Timed out" errors
grep Fail /var/log/rtorrent/tracker.log | grep Timed | sed "s,.*url:'\(.*\)' msg:'Timed out'.,\1," | sort -u > trackers-to-blacklist
# check if tracker is still down (2 second limit per URL)
while read -r url; do echo -n "$url"; curl -s -m 2 "$url" > /dev/null && echo " OK" || echo " FAIL"; done < trackers-to-blacklist | tee trackers-to-blacklist.tests
# extract hostnames (any scheme, so udp:// and https:// URLs are handled too)
grep "FAIL$" trackers-to-blacklist.tests | sed 's: FAIL$::' | sed 's,^[a-z]*://\([^:/]*\)[:/].*,\1,' | sort -u > trackers-to-blacklist.hosts.list
# disable invalid trackers (run as root; everything goes on a single line)
xargs echo "127.0.0.1 " < trackers-to-blacklist.hosts.list >> /etc/hosts

To update the blacklisting just remove the last line from /etc/hosts, and redo the script.
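
The hostname-extraction step can also be done with a URL parser, which copes with udp:// and https:// tracker URLs as well as http://. A small Python sketch, assuming one tracker URL per input line as in the trackers-to-blacklist file above; `hosts_line` is a hypothetical helper, and its output still has to be appended to /etc/hosts by hand:

```python
import sys
from urllib.parse import urlsplit


def hosts_line(urls, address="127.0.0.1"):
    """Build one /etc/hosts line mapping every tracker hostname to `address`."""
    names = {urlsplit(url.strip()).hostname for url in urls if url.strip()}
    names.discard(None)  # lines without a scheme://host part are skipped
    if not names:
        return ""
    return f"{address} {' '.join(sorted(names))}"


if __name__ == "__main__":
    line = hosts_line(sys.stdin)
    if line:
        print(line)
```

Because everything is emitted as a single line, the advice about removing the last line of /etc/hosts before regenerating still applies.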

I also tried recompiling curl with c-ares, but perhaps I didn't do it correctly. Regardless, the suggestion by dandersson helped.

I had the same issue. I tested on a freshly installed Debian VM and it had the same problem.
I compiled curl with c-ares, recompiled libtorrent and rtorrent, and everything was smooth as butter again, thanks :)

As a workaround, you can disable udp trackers by adding below contents to rtorrent.rc
schedule = disableudp, 0, 1, trackers.use_udp.set=no
trackers.use_udp.set = no

Forgive me for not rereading all of this, but I'm sitting here reloading my seedbox and I noticed that callouts to trackers all go out at the same time. I have 45 torrents and it's making 45 connections to those trackers at once. That may be an issue here as well. I'm sure some of those trackers don't like that and may be blacklisting me.

Echoing what other people have said: I can confirm that running a caching DNS server on localhost is an excellent workaround for this issue. dnsmasq may be a bit easier to configure than bind for this use case. This guide is helpful: http://www.georgestarcher.com/splunk-dns-lookup-performance-and-caching-with-dnsmasq/

Written by ituser694:

I have found a work around that seems to help until the issue can be resolved, and that is to recompile from source after you edit the second 1200 entry in the src/main.cc file line
"schedule2 = session_save,1200,1200,((session.save))\n"
like this:
"schedule2 = session_save,1200,14400,((session.save))\n".
I have tried 3600 (1hr), 14400 (4hrs), 43200 (12hrs) and 86400 (24hrs). The greater the number, the less frequently the session data will be saved (meaning you may lose resume and statistical data if you terminate prior to a session save - which I don't really care about, but you should be aware of this in case you do care) the result is that rtorrent less frequently thrashes your disk. I have found that the 12hr time setting is quite acceptable on my seedbox, seeding over 40,000 torrents with an average size of 4mb.

If someone wants to add this workaround (along with a dns cache in the local network), you don't have to recompile rtorrent, just add this into your config (it will overwrite the default scheduling):

# Save all the session in every 12 hours instead of the default 20 minutes.
schedule2 = session_save, 1200, 43200, ((session.save))

Bad response from server: (500 [error,list]) Link to XMLRPC failed. May be, rTorrent is down?

It freezes, and I need to kill rtorrent and start it again.

Jolar commented

Wonderful to get a hint that faulty trackers were causing the problems! For some time rtorrent had been becoming unresponsive every 4-5 minutes, for 10-30 seconds. All connections seemed to reset and restart every time, and mean speeds were low. Very, very frustrating. Over the last few days I tried tweaking memory, file and network settings, both system-wide and in rtorrent, but nothing helped. Manually going through all torrents and disabling faulty trackers with no updates did the trick. Back to normal! Finally! Using 0.9.6 (and now I see that there is a 0.9.7 release since earlier today).

Chaz6 commented

I would just like to add that the suggestion by mikisvaz also resolved my issue. I tried compiling curl with c-ares but running my own dns resolver has completely fixed the problem of the rtorrent tui hanging.

I've had the same issues, so have been considering moving everything that might cause dns lookups into separate threads.

For the record, I am still using rakshasa/libtorrent#134 in my personal fork; it is an effective mitigation and I'm still willing to work to have it merged.

Any word on this? or is running a local DNS for caching the expected solution?

I see there are commits in branch 'slingamn-udns.10' that look relevant to this. @rakshasa, assuming this fixes the issue, is there any timescale for when these will get merged into the 'master' branch?

I made a script for Ubuntu/Debian to automatically install rtorrent with a few extra things to resolve this problem. https://github.com/stickz/rtinst

  1. It installs curl with c-ares support for asynchronous http tracker requests.
  2. It installs libtorrent with udns support for asynchronous udp tracker requests.
  3. It installs dnsmasq as a local dns caching solution for faster tracker updates.

To run the script, you just need to input 2 commands. The recommended operating system is Ubuntu 20.04 LTS.

This will install the script onto your operating system:
sudo bash -c "$(wget --no-check-certificate -qO - https://raw.githubusercontent.com/stickz/rtinst/master/rtsetup)"

This will run the script:
sudo rtinst

aki-k commented

As a workaround, you can disable udp trackers by adding below contents to rtorrent.rc
schedule = disableudp, 0, 1, trackers.use_udp.set=no
trackers.use_udp.set = no

This helped me with rtorrent UI freezes with UDP trackers.

stickz commented

Swizzin install script supports UDNS if you want to use UDP trackers. https://github.com/swizzin/swizzin

The problem is with DNS blocking on the main thread.

aki-k commented

That looks like a lot of script to fix the problem for rtorrent, which is just one of the programs I run on my server (laptop). Fedora is not even on the supported distro list.

stickz commented

Yes it is. The problem is with libtorrent. A custom software patch has to be compiled in using GCC. There are also two crash issues with UDP trackers that are fixed in Swizzin, plus a custom compiled version of curl with c-ares support to fix TCP trackers.

You can install Dnsmasq to cover up the issue temporarily. But eventually it won't be enough to keep up.

If you do decide to switch to Ubuntu 22.04 for Swizzin, the install script is modular. It doesn't do a lot of OS reconfiguration. You can run whatever else you want.

aki-k commented

@stickz I mean this page doesn't mention Fedora at all, just Debian and Ubuntu.

stickz commented

I know. You have to switch your OS. Ubuntu 22.04 LTS is recommended if you decide to do this.