shundhammer/qdirstat

File type statistics for packages sometimes allowed, sometimes not

Lithopsian opened this issue · 9 comments

When a packages query is first done, with no selections made, it is possible to do a File type statistics query and get sensible results. If either the top level Pkg: or any individual package is selected, it is no longer possible: the actions are disabled. If a directory inside a package is selected, it is then possible to do the query again.

updateActions() in MainWindow appears to try to prevent the three file statistics actions from being enabled on packages, but doesn't quite manage it (it unconditionally enables the actions if there is no selection). I'm not sure why it even tries. Is there some issue? It looks like these actions work OK when done on the whole Pkg: tree.

I don't think I ever tested that. Maybe it works, maybe it doesn't.

I enabled those views for sel->isDirInfo() which includes DotEntry, PkgInfo, Attic (Ignored directories in the UnPkg view) and did some minimal testing. It's a bit unexpected, but as far as I can tell, the file size / type / age views work well on them. Sometimes they might even be useful.

The initial logic was a bit short-sighted and had only ordinary directory trees in mind: If nothing is selected, start the alternate view from the tree's root. Otherwise, if one directory is selected, start it from there. When multiple directories are selected, disable the alternate views.
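A minimal sketch of that initial selection rule, for illustration only. The `Node` struct and `statsStartNode()` are hypothetical stand-ins, not QDirStat's actual classes or functions, and treating a single selected plain file as "disabled" is an assumption that the description above does not spell out.

```cpp
#include <vector>

// Hypothetical stand-in for a tree node; not QDirStat's real FileInfo class.
struct Node
{
    bool dirLike;   // true for anything that can hold children
};

// Returns the node the statistics view should start from,
// or nullptr if the action should be disabled.
const Node * statsStartNode( const Node * root,
                             const std::vector<const Node *> & selection )
{
    if ( selection.empty() )
        return root;                 // nothing selected: use the tree's root

    if ( selection.size() == 1 && selection.front()->dirLike )
        return selection.front();    // exactly one directory: start there

    return nullptr;                  // multiple items (or a plain file): disable
}
```

With this rule, whether the action is available depends entirely on what `dirLike` means for a node, which is where the Pkg view ran into trouble.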

The Pkg view introduced confusion here, and yes, it should be either always or not at all in that case. But if it works, why not simply always enable it?

I am sure somebody will find a fringe case where this leads to weird results. ;-)

It turned out there were quite a number of follow-on problems with this:

The Pkg view has a PkgInfo node as its visible tree root, and one level of PkgInfo nodes below that, one for each package in the tree. PkgInfo inherits DirInfo which inherits FileInfo. The FileInfo base class provides empty stubs for children management, and DirInfo implements the functionality, and derived classes like PkgInfo inherit that, too; so the tree can be traversed by only caring about DirInfo and FileInfo.

But there is a subtle difference: DirInfo directly corresponds to a directory on disk, so the checks for file attributes like FileInfo::isDir() (using S_ISDIR( mode ) internally) or FileInfo::isSymLink() work only on that level because the mode_t file mode (read from disk with the stat() syscall) is checked.

But that doesn't work with a PkgInfo because it does not correspond to an on-disk directory; myPkgInfo->isDir() always returns false. myPkgInfo->isDirInfo(), however, always returns true because that refers to the inheritance hierarchy.

During normal operation on a normal DirTree read from a disk subtree, that doesn't matter, because there are only DirInfo and FileInfo nodes in the tree plus DotEntry nodes (which also do not correspond directly to an on-disk directory). But in the Pkg view, the tree's first two levels are PkgInfo nodes, and that needs to be taken into account.
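The difference between the two checks can be sketched roughly like this. The classes below are heavily simplified stand-ins that only borrow the names; the constructors and the choice to give PkgInfo a file mode of 0 are assumptions for illustration, not the real QDirStat code.

```cpp
#include <sys/stat.h>   // S_ISDIR, S_IFDIR, mode_t (POSIX)

// Simplified stand-in: only the two checks discussed here.
class FileInfo
{
public:
    explicit FileInfo( mode_t mode = 0 ) : _mode( mode ) {}
    virtual ~FileInfo() = default;

    // Looks at the on-disk file mode as reported by stat().
    bool isDir() const { return S_ISDIR( _mode ); }

    // Looks at the class hierarchy, not at the file mode.
    virtual bool isDirInfo() const { return false; }

private:
    mode_t _mode;
};

class DirInfo : public FileInfo
{
public:
    explicit DirInfo( mode_t mode = S_IFDIR | 0755 ) : FileInfo( mode ) {}
    bool isDirInfo() const override { return true; }
};

class PkgInfo : public DirInfo
{
public:
    // Assumption: no on-disk directory behind it, so no S_IFDIR bit.
    PkgInfo() : DirInfo( 0 ) {}
};
```

In this sketch a PkgInfo answers false to isDir() but true to isDirInfo(), so code that asks "can this node have children?" has to use the latter.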

The commits above should take care of that. The differences are subtle, but important: Use isDirInfo() instead of isDir() at strategic places.

I also removed two convenience methods from QDirStatApp to avoid the same problem in the future: QDirStatApp::selectedDir() and QDirStatApp::selectedDirInfo(). Using them for any new view or filter would very likely reintroduce the same problem.

There are still some discrepancies in some places, such as between the number of files shown below a package in the Pkg view and the number of files (n) in the file size statistics. Some of that may be explained by differences between the files listed in a package's file list as reported by the package manager and the files actually on disk; the user might choose to delete some, or some others might not get installed for whatever reason.

This is something to be investigated.

I tried a handful of different combinations of package trees and they all added up to the exact number of files in the statistics window. This is with dpkg. There are only a couple of reported missing files in the whole package tree and I avoided those in my tests.

I was chasing a ghost: I was looking at the 'Sigma n' in the histogram window and ignoring the fact that this is supposed to be less than the number of files (the data size) whenever the histogram is cut off at a certain percentile.

[Screenshot: file-size-stats-01]
Not the supposed 730 files, just 643?!

Moving the slider all the way to the right to include all percentiles in the histogram shows the total number of files every time, though:

[Screenshot: file-size-stats-02]

Here they are. Duh.
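The effect is easy to reproduce in isolation. The function below is a rough sketch, not the histogram code from QDirStat: `countInPercentileRange()` is a hypothetical helper, and the simple index-based percentile lookup is an assumption chosen for brevity. It counts how many values fall inside the displayed percentile range, which is what a sum over the visible buckets would amount to.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Number of values inside the displayed range when the histogram is cut off
// at the given percentiles (0..100). Simple index-based percentile lookup;
// an assumption for illustration, not QDirStat's actual method.
std::size_t countInPercentileRange( std::vector<long long> sizes,
                                    int startPercentile,
                                    int endPercentile )
{
    if ( sizes.empty() )
        return 0;

    std::sort( sizes.begin(), sizes.end() );

    const std::size_t last = sizes.size() - 1;
    const long long   low  = sizes[ last * startPercentile / 100 ];
    const long long   high = sizes[ last * endPercentile   / 100 ];

    // Count only the values that would land in a visible bucket.
    return static_cast<std::size_t>(
        std::count_if( sizes.begin(), sizes.end(),
                       [low, high]( long long s ) { return s >= low && s <= high; } ) );
}
```

For ten values with one large outlier, cutting off at the 90th percentile counts nine of them; extending the range to the 100th percentile counts all ten, just like moving the slider all the way to the right.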

On the plus side, I went through that code and verified that indeed it does the right thing.

One edge case is that all three statistics views are enabled with no DirTree. File age statistics actually opens an empty window. Not harmful so far as I can see, but strange.

Well, whatever. It still shows you the true facts: Everything is empty.