File Age Statistics
shundhammer opened this issue · 5 comments
Background
This was inspired by issue #165 : File histogram view. It's not exactly the same, but it's in the same spirit.
General Idea
Show the age of files in a selected directory tree; not with the exact time stamp, but more roughly, to give a better idea of the approximate age of files. In the current implementation, it is year-based; that gives a good overview when deciding what to move to archive media, what to delete, and what to keep.
Implementation
Show a list of years and how many files were last modified in that year.
For each year, show the absolute number of files, the percentage of files in that directory tree in that year, as well as the total size of all those files and their percentage.
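To make the per-year columns concrete, here is a minimal Python sketch of that aggregation. It is not QDirStat's actual code (QDirStat is a C++/Qt application); the function name `year_stats` and the row keys are made up for illustration, and it assumes the year is taken from the mtime in local time.

```python
from collections import defaultdict
from datetime import datetime


def year_stats(files):
    """Aggregate (mtime_seconds, size_bytes) pairs into one row per year.

    Hypothetical helper: returns a list of dicts, newest year first, with
    the file count, the size sum and both percentages of the grand totals.
    """
    count = defaultdict(int)
    size = defaultdict(int)

    for mtime, nbytes in files:
        # Assumption: the year is derived from the mtime in local time.
        year = datetime.fromtimestamp(mtime).year
        count[year] += 1
        size[year] += nbytes

    total_count = sum(count.values())
    total_size = sum(size.values())

    rows = []
    for year in sorted(count, reverse=True):
        rows.append({
            "year": year,
            "files": count[year],
            "files_percent": 100.0 * count[year] / total_count,
            "size": size[year],
            # Guard against a tree that contains only empty files.
            "size_percent": 100.0 * size[year] / total_size if total_size else 0.0,
        })
    return rows
```

Each returned row corresponds to one line of the table: the year, the number of files, their share of all files in the tree, their total size, and the share of that size.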
Example
This is my /work/photos directory where I store the photos that I shot over the years. Beside the main window that shows all the total values, the File Age Statistics window is opened (menu View -> File Age Statistics or F4) that breaks down all those files into all the years.
Notice that this is strictly about the file modification time (mtime). It only uses the year part of that mtime time stamp (that is really a time_t, i.e. seconds since 1970-01-01 00:00:00 like all Linux/Unix time stamps).
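As an illustration of "only the year part of the mtime", this small Python sketch reads a file's mtime and reduces it to a year. The helper names are hypothetical, and the use of local time (rather than UTC) for the conversion is an assumption, not something stated above.

```python
import os
import time


def year_of(timestamp):
    """Reduce a time_t style value (seconds since 1970-01-01) to its year.

    Assumption: the conversion uses the local time zone.
    """
    return time.localtime(timestamp).tm_year


def mtime_year(path):
    """Return the year of a file's last modification time (mtime)."""
    # lstat() does not follow symlinks, so a link reports its own mtime.
    return year_of(os.lstat(path).st_mtime)
```

Everything finer than the year (month, day, time of day) is discarded at this point, which is what keeps the overview so coarse.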
Drilling Down
Issue #165 started a discussion about how to get further from that very rough overview information. Knowing that some files go back as early as 2003 may be interesting, but there should be an easy way to find out where they are.
Oldest and Newest Files: The new Discover Actions
For the oldest and the newest files, there is now an easy way: Use Discover -> Oldest Files or Discover -> Newest Files. But what about files in some other year range?
Just Leave the Window Open and Click Around
This new File Age Statistics window can simply remain there while you get busy in the main window. As you select a different directory, it is automatically updated with the year statistics from that different directory.
It's Persistent
This is the default behaviour; you can change that with the Sync with Main Window check box. Just uncheck it, and it will retain its current content, even if you click on another directory in the main window. That setting is saved to the config file and will be as you left it when you open that window the next time after restarting QDirStat.
More Detailed Example
We already saw this:
Selecting a subdirectory in the main window (with the mouse or with the cursor keys):
Selecting one directory deeper down:
Notice how the years from the current year on are displayed even if no file in that subtree was modified in any of them. Those gaps are displayed dimmed out. But they still give you an important piece of information: It's been a while since anything was changed there. It's old stuff.
Even though that behaviour can be configured to only start at the active years, i.e. from the first year on that has a changed file in that subtree, I found it a lot more intuitive to see even the gap at the start: Since the relevant part starts further down, i.e. further in the past, you instantly know that it's in the past even without reading the year numbers. It's location-encoding, i.e. the location of the information (further down) is already a piece of information that your brain can instantly process even without reading.
Gaps between active years are always displayed; in the beginning I found it very confusing to not see at a glance if there were any years in a long list without activity. Empty space helps: You see instantly that there was a period with no activity.
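The gap handling described above can be sketched like this in Python. This is only an illustration under assumptions: the function name, the `start_at_first_active` parameter and the row layout are invented here and are not QDirStat's real option or data structure.

```python
def rows_with_gaps(active, current_year, start_at_first_active=False):
    """Build the displayed year list, newest first, including empty years.

    active: dict mapping year -> file count, only for years with files.
    Returns (year, file_count, is_gap) tuples; gap rows would be shown
    dimmed. All names here are hypothetical.
    """
    if not active:
        return []

    oldest = min(active)
    if start_at_first_active:
        # Configured variant: skip the leading gap before the first active year.
        newest = max(active)
    else:
        # Default: start at the current year, even if nothing changed lately.
        newest = max(current_year, max(active))

    # Gaps between active years are always filled in.
    return [(year, active.get(year, 0), year not in active)
            for year in range(newest, oldest - 1, -1)]
```

For example, with files only in 2019 and 2015 and the current year 2021, the default yields seven rows from 2021 down to 2015, five of them gaps; with `start_at_first_active=True` the list starts at 2019 but still shows the gap from 2018 to 2016.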
Let's switch to the photo directory where I keep photos about beer hikes over the years:
Again, even without studying all the numbers, you see that it's been two years since the last one (Corona got in the way in 2020 and 2021), then there were some years without one, after that one some more years without, then some years with fairly regular activity.
You can do some rough data analysis by just glancing at the table. That is useful when you move around in the main window with the cursor keys rapidly; it's quite easy to get a feeling about a lot of data that way.
Finally, business trips to Finland:
I didn't have much time for taking photos on those trips, so there are not many of them; but again, you can see a pattern: Two occasions, in 2009 and again in 2015.
Future Development
This is just a first shot at those File Age Statistics. So far, it still has some rough edges; some refinement will follow in the near future.
But it's already usable, and it should not have broken any existing features. Hopefully. ;-)
Monthly Statistics for the Last Months
Monthly file age statistics are now also available for the current year and for the last year. By default, this is collapsed:
Clicking on the little arrow near the current year (2021) shows the months up to the current month:
With the last year also expanded:
Notice that this is limited to the current and the last year, no matter when activity in this directory tree begins (i.e. even if there is no entry for 2021 and 2020, only for 2012 and earlier).
The rationale is that it may really be important to know when anything changed during the last few months, but the further back any activity was, the less important is the exact month (at least in this context).
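A rough Python sketch of that restriction: months are only counted for the current and the last year, and the current year stops at the current month. The function name, the row layout and the ascending month order are assumptions made for this example; the real window may order and store things differently.

```python
from collections import defaultdict
from datetime import datetime


def month_stats(mtimes, now):
    """Count files per month, only for now.year and now.year - 1.

    mtimes: iterable of mtime values in seconds since the epoch.
    now:    a datetime giving the current date (passed in for testability).
    Returns (year, month, file_count) tuples. Hypothetical helper.
    """
    counts = defaultdict(int)
    for ts in mtimes:
        t = datetime.fromtimestamp(ts)
        # Older files only show up in the yearly rows, not here.
        if t.year in (now.year, now.year - 1):
            counts[(t.year, t.month)] += 1

    rows = []
    for year in (now.year, now.year - 1):
        # The current year only has months up to the current one.
        last_month = now.month if year == now.year else 12
        for month in range(1, last_month + 1):
            rows.append((year, month, counts[(year, month)]))
    return rows
```

Months without any file still get a row with a count of zero, in the same spirit as the year gaps: an empty month is information, too.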
Discussions are welcome. Feel free to discuss this here.