Canop/dysk

dysk very slow when an automounted external HDD is not present

Dialga opened this issue · 9 comments

Dialga commented

dysk, like its previous incarnation, always takes about 1m30s to load on my system, whereas the alternative dfrs shows disk usage immediately.

$ time dysk
┌──────────────┬───────┬────┬────┬─────────┬────┬────┬───────────────────────────┐
│  filesystem  │ type  │disk│used│   use   │free│size│mount point                │
├──────────────┼───────┼────┼────┼─────────┼────┼────┼───────────────────────────┤
│/dev/sda2     │ f2fs  │SSD │ 21G│ 1%      │4.0T│4.0T│/mnt/XXX                   │
│/dev/sdc2     │fuseblk│HDD │3.2T│79% ████ │820G│4.0T│/mnt/XXX                   │
│/dev/sdb2     │ f2fs  │HDD │897G│44% ██▎  │1.2T│2.1T│/mnt/XXX                   │
│/dev/nvme0n1p2│ f2fs  │SSD │299G│15% ▊    │1.7T│2.0T│/                          │
│/dev/loop0    │ ext4  │SSD │146G│27% █▍   │394G│540G│/media/XXX                 │
│/dev/zram1    │ ext2  │RAM │ 25K│ 0%      │6.6G│6.6G│/run/compressed-mount-point│
│/dev/nvme0n1p1│ vfat  │SSD │259M│12% ▋    │1.9G│2.1G│/boot                      │
└──────────────┴───────┴────┴────┴─────────┴────┴────┴───────────────────────────┘

real    1m27.212s
user    0m0.000s
sys     0m0.006s
$ time dfrs
Filesystem     Type                          Used%  Avail   Used   Size Mounted on
/dev/nvme0n1p2 f2fs    ▇▇▇╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍  15.0%   1.5T 278.6G   1.8T /
/dev/nvme0n1p1 vfat    ▇▇▇╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍  12.1%   1.8G 246.8M   2.0G /boot
/dev/loop0     ext4    ▇▇▇▇▇▇╍╍╍╍╍╍╍╍╍╍╍╍╍╍  27.1% 366.6G 136.3G 502.9G /media/XXX
/dev/sda2      f2fs    ▇╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍   0.5%   3.6T  19.2G   3.6T /mnt/XXX
/dev/sdb2      f2fs    ▇▇▇▇▇▇▇▇▇╍╍╍╍╍╍╍╍╍╍╍  43.8%   1.0T 835.4G   1.9T /mnt/XXX
/dev/sdc2      fuseblk ▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇▇╍╍╍╍  79.5% 763.9G   2.9T   3.6T /mnt/XXX
/dev/zram1     ext2    ▇╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍╍   0.0%   6.1G  24.0k   6.1G /run/compressed-mount-point

real    0m0.002s
user    0m0.001s
sys     0m0.000s
Canop commented

Hmm, that's curious. Any idea which filesystem is causing the problem?

Can you try dysk --remote-stats no?

Dialga commented

Hi, I've just tried that and got the same result.

Canop commented

I don't know which filesystems you can easily unmount, but trying without some of them would probably tell us where the problem is.
Otherwise I'll have to make a special version with timing analysis.
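
In the meantime, a rough way to find the slow mount point without unmounting anything is to time a statfs-style query on each one. The following is only a sketch: probe_mount and probe_all are hypothetical helper names, it assumes the coreutils timeout and stat commands are available, and it assumes the blocked query can be interrupted by a signal. Note that probing a path may itself trigger an automount.

```shell
# Hypothetical helper: time a filesystem query on one mount point and
# give up after a limit, so a hanging mount cannot block the whole loop.
probe_mount() {
    mp=$1
    limit=${2:-5}                      # seconds before giving up
    start=$(date +%s)
    if timeout "$limit" stat -f "$mp" >/dev/null 2>&1 </dev/null; then
        status=ok
    else
        status=slow-or-failed          # timed out, or stat reported an error
    fi
    end=$(date +%s)
    printf '%s\t%ss\t%s\n' "$mp" "$((end - start))" "$status"
}

# Probe every mount point listed in /proc/self/mounts (second field).
# Paths containing spaces are escaped there (\040) and will show as failed.
probe_all() {
    while read -r _dev mp _rest; do
        probe_mount "$mp" "${1:-5}"
    done < /proc/self/mounts
}
```

Running probe_all 5 prints one line per mount point; any entry marked slow-or-failed with an elapsed time close to the limit is a candidate for the culprit.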

Dialga commented

Interestingly, I've just tried dfrs --total and that also took about 1m25s to process, so disks that are not being displayed are also being calculated in dysk. I will continue to narrow it down.

Canop commented

> disks that are not being displayed are also being calculated in dysk

This makes it easier to decouple the filtering logic from the data-gathering logic (they're in different crates, btw), and until now the only cases of slow "computing" I had seen were remote filesystems. I'll reconsider this part of the architecture depending on your findings.

Dialga commented

I've found the offending line causing trouble in my fstab, which I use to automount an external hdd that's not currently connected:
UUID=XXX /mnt/XXX ntfs3 rw,relatime,iocharset=utf8,prealloc,group,owner,user,uid=1000,gid=1000,auto,nofail,acl,X-mount.mkdir,X-mount.owner=me,X-mount.group=me,x-systemd.automount 0 0

It appears that running dysk triggers a systemd job, whereas running dfrs does not.

$ systemctl list-jobs
No jobs running.
$ dysk
^C
$ systemctl list-jobs
JOB  UNIT                                        TYPE  STATE
1000 mnt-XXX.mount                               start waiting
1000 dev-disk-by\x2duuid-XXX.device              start running
$ dfrs
...
$ systemctl list-jobs
No jobs running.

A further search brought me to the x-systemd.device-timeout= option (https://www.freedesktop.org/software/systemd/man/systemd.mount.html#x-systemd.device-timeout=). I set it to 1 second, and now dysk runs in 1s. So that's the root cause of this issue.
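
For anyone checking their own system, here is a minimal sketch that lists fstab entries which use x-systemd.automount but set no x-systemd.device-timeout. The function name automounts_without_timeout is hypothetical, and the sketch assumes the usual whitespace-separated fstab layout with the options in the fourth field. The workaround itself is to append something like x-systemd.device-timeout=1s to the options of each reported entry.

```shell
# Hypothetical helper: print the mount point of every fstab entry that
# has x-systemd.automount but no x-systemd.device-timeout= option.
automounts_without_timeout() {
    awk '
        /^[[:space:]]*#/ || NF < 4 { next }   # skip comments and short lines
        $4 ~ /(^|,)x-systemd\.automount(,|$)/ &&
        $4 !~ /(^|,)x-systemd\.device-timeout=/ { print $2 }
    ' "${1:-/etc/fstab}"
}
```

Called with no argument it reads /etc/fstab; pass another path to check a copy instead.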

Canop commented

Thanks for the investigation.

With this, I should be able to try to reproduce the problem and look for workarounds.

Canop commented

I changed the issue title to ease management. Feel free to adjust it if you want but try to keep it as specific as possible.

I have the same issue:

% time /bin/dysk > /dev/null
/bin/dysk > /dev/null  0.01s user 0.00s system 0% cpu 5.044 total
% time /bin/df > /dev/null
/bin/df > /dev/null  0.00s user 0.00s system 80% cpu 0.007 total

Maybe dysk is doing things in series rather than in parallel... it's a big jump from 0.00 to 5.

I have 5 x-systemd.device-timeout entries in /etc/fstab, each =200ms.

So I'm not sure where the 5 seconds is coming from.

(Thanks @Dialga for clueing me in to dfrs)
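
On the question of where the 5 seconds comes from: five entries at 200ms each add up to only 1s even if they are handled strictly one after another, so serial processing alone would not explain a 5s total. One way to narrow it down is to list the automount points that are currently registered, since accessing any of them can trigger a mount attempt. This is a sketch under the assumption that systemd automount points appear with type autofs in the mount table; autofs_points is a hypothetical helper name.

```shell
# Hypothetical helper: print every mount point whose filesystem type is
# autofs, reading a mounts-style table (defaults to /proc/self/mounts).
autofs_points() {
    awk '$3 == "autofs" { print $2 }' "${1:-/proc/self/mounts}"
}
```

Each path printed can then be checked against the fstab entries to see which ones belong to devices that are not connected.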