skx/puppet-summary

Add option to delete reports if status is unchanged

matejzero opened this issue · 13 comments

I've been testing your tool for the last 24 hours and, as I already said, it's really nice, but the problem is that it's consuming a lot of space.

In my case, after a bit more than 24h of it running, it's already using 14GB of space for reports (puppet is running every 30min on nodes) with 10k+ reports generated daily. Most of the reports are "unchanged", which I think could easily be removed since they don't contain any useful data (at least for me).

A solution could be an option to delete reports when the status is unchanged. When puppet uploads a report, puppet-summary would pull out the data it needs (status, timestamp, execution time, ...) for the SQLite DB and then remove the report file. If the status is failed/changed, it would keep the file.
This also means there would be no link to the report in the node details if the status is unchanged.
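
As a rough illustration of the proposal above, the decision at upload time could look like the following minimal sketch. The `ReportSummary` type, its fields, and `keepReportFile` are hypothetical names chosen for this example, not part of puppet-summary.

```go
package main

import "fmt"

// ReportSummary holds the fields that would be extracted for the database.
// The type and its field names are hypothetical.
type ReportSummary struct {
	Node    string
	Status  string  // "changed", "unchanged" or "failed"
	Runtime float64 // execution time in seconds
}

// keepReportFile reports whether the full report file should stay on disk.
// Only unchanged runs are discarded; changed and failed runs are kept.
func keepReportFile(s ReportSummary) bool {
	return s.Status != "unchanged"
}

func main() {
	runs := []ReportSummary{
		{Node: "web1", Status: "unchanged", Runtime: 12.5},
		{Node: "web2", Status: "changed", Runtime: 40.1},
		{Node: "db1", Status: "failed", Runtime: 3.2},
	}
	for _, s := range runs {
		fmt.Printf("%s %s keep=%v\n", s.Node, s.Status, keepReportFile(s))
	}
}
```

The summary row is stored in every case; only the on-disk file is conditional.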

What do you think?

skx commented

With that volume of reports I can see that this would be a useful enhancement.

I would probably add a new flag to the prune command:

  • puppet-summary prune -days=3
    • Delete all reports which are older than 3 days.
    • This means remove from the various views and delete the report from disk.
  • puppet-summary prune -unchanged
    • Delete all reports from disc where the status is "unchanged".
    • But keep the reports in the various views.
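
The difference between the two modes can be sketched as follows. This is an illustration under stated assumptions, not the code from the repository: the `Report` type, the `AgeDays` field, and the function names are hypothetical, and the in-memory slice stands in for the database. The `pruned` marker is borrowed from a later comment in this thread.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// Report is a hypothetical stand-in for one row in the reports table.
type Report struct {
	ID      int
	Status  string // "changed", "unchanged" or "failed"
	Path    string // report file on disk, or "pruned" once reaped
	AgeDays int    // precomputed for brevity; a real row would store a timestamp
}

// removeFile deletes a report file, ignoring files that are already gone.
func removeFile(path string) {
	if path == "pruned" {
		return
	}
	if err := os.Remove(path); err != nil && !os.IsNotExist(err) {
		fmt.Fprintln(os.Stderr, "prune:", err)
	}
}

// pruneDays drops every record older than days and deletes its file.
func pruneDays(reports []Report, days int) []Report {
	var kept []Report
	for _, r := range reports {
		if r.AgeDays > days {
			removeFile(r.Path)
			continue
		}
		kept = append(kept, r)
	}
	return kept
}

// pruneUnchanged deletes the files of unchanged reports but keeps the records.
func pruneUnchanged(reports []Report) {
	for i := range reports {
		if reports[i].Status == "unchanged" {
			removeFile(reports[i].Path)
			reports[i].Path = "pruned"
		}
	}
}

func main() {
	dir, err := os.MkdirTemp("", "reports")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	reports := []Report{
		{ID: 1, Status: "unchanged", Path: filepath.Join(dir, "1.yaml"), AgeDays: 1},
		{ID: 2, Status: "changed", Path: filepath.Join(dir, "2.yaml"), AgeDays: 1},
		{ID: 3, Status: "failed", Path: filepath.Join(dir, "3.yaml"), AgeDays: 5},
	}
	for _, r := range reports {
		if err := os.WriteFile(r.Path, []byte("report"), 0o644); err != nil {
			panic(err)
		}
	}

	reports = pruneDays(reports, 3) // removes record 3 and its file
	pruneUnchanged(reports)         // reaps the file of record 1, keeps the record
	for _, r := range reports {
		fmt.Println(r.ID, r.Status, r.Path)
	}
}
```

After both passes, record 1 is still listed but its path reads `pruned`, record 2 is untouched, and record 3 is gone entirely.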

I'm just a little unsure whether to keep the references to the "unchanged" reports in the views; on the one hand it shows you that puppet did run, on the other hand if you can't click it for the details maybe it is not worth showing?

If we remove the entries, not just the files, then I suspect the end result is that over time nodes would become "orphaned", as there might be days where every run was "unchanged" - so if we remove them all the node will have no local state.

That makes me think we just reap the report-files, and leave the records (until they're removed via the -days mechanism). Does that seem sane?

I would definitely keep the references to the "unchanged" states, for the exact reason you gave in the other part of your reply. I want to see whether puppet was run or not, because this way I can spot "orphaned" hosts, which comes in handy from time to time (catching users who forget to start puppet after testing:P)

As far as your commands go, they seem sane and simple to use. Would one then have to run puppet-summary prune -unchanged via crontab?

As a test, I erased all unchanged reports and disk usage dropped from 12GB to 28MB.

skx commented

The commit e0487ee should have given you what you want.

Update via:

    go get -u github.com/skx/puppet-summary

Then run this to test it:

    puppet-summary prune -unchanged [-prefix ./blah/blah]

The downside, as I realized too late, is that the links to the reports will still exist, because the renderer doesn't read/touch the path. So you'll get a 404.

Wow, that was fast:)

Compiled and ran it, it cleaned up my reports folder nicely. prune added to hourly crontab.

And, as you already realized, I get a 404 when accessing a nonexistent report:

open reports/pruned: no such file or directory

One solution would be to check here whether the report ID is pruned and, in that case, not create a link. I'm not sure if this is a good solution or not, but this one came to mind:)

skx commented

Yeah it needs a bit more work, as the report ID is numeric - it's the yaml_file column in the database which is set to pruned - and that isn't part of the struct/record that the node-view has access to.

Shouldn't be hard to fix, but it'll take me a day or two I guess.
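
One way the fix described above could look, as a minimal sketch: expose the `yaml_file` value on the record the node view receives, and let the template emit a link only when it is not `pruned`. The `Run` type, the `Pruned` method, `renderRows`, and the `/report/<id>` URL shape are all assumptions made for this example, not the project's actual code.

```go
package main

import (
	"fmt"
	"html/template"
	"strings"
)

// Run is a hypothetical record for one puppet run as seen by the node view.
// YamlFile mirrors the yaml_file column, which is set to "pruned" once the
// report file has been reaped.
type Run struct {
	ID       int
	State    string
	YamlFile string
}

// Pruned reports whether the report file for this run has been removed.
func (r Run) Pruned() bool { return r.YamlFile == "pruned" }

// rowTmpl prints plain text for pruned runs and a link for everything else.
// The /report/<id> URL shape is an assumption.
const rowTmpl = `{{range .}}{{if .Pruned}}{{.State}}{{else}}<a href="/report/{{.ID}}">{{.State}}</a>{{end}}
{{end}}`

// renderRows renders one line per run using rowTmpl.
func renderRows(runs []Run) (string, error) {
	t, err := template.New("rows").Parse(rowTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, runs); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := renderRows([]Run{
		{ID: 7, State: "changed", YamlFile: "reports/7.yaml"},
		{ID: 8, State: "unchanged", YamlFile: "pruned"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Keeping the check in a method on the record means the template stays free of string comparisons against the marker value.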

Ooo ok:) I would help you, but Go is not my thing:/

Yea, no need to hurry, current solution works for me. Not linking unchanged reports is just a nice touch:)

skx commented

Got there in the end :)

Great! Will recompile today.

This patch is really helping me keep disk usage low. I'm keeping 14 days of reports and disk usage is around 27GB for circa 16k reports/day.

skx commented

Nice to hear it works at that kind of scale!

Let me know if you do/don't see any problem, and then I'll tag a new release.

It looks like it's working as it should:)

Unpurged/deleted reports still have a link, which might come in handy sometime, but purged ones are not linked. From my side it works as expected.

skx commented

I assume you meant "Unpurged/undeleted", i.e. reports of changes are still linked.

But thanks for the speedy update. I'll tag a new release later in the day, which will show up on the download-page.

Yea yea, typo on my end:)

I meant unchanged reports that are still on disk and have not been removed by prune -unchanged.