gohugoio/hugo

'hugo clean'

Closed this issue · 13 comments

In summary:

  1. Some web site creators (and maintainers) seem uncomfortable with the command line (in general);
  2. Some Hugo tutorials recommend purging the ./public/ directory by means of the command line, in order to aid development;
  3. Purging manually involves the (arguably) somewhat scary command: rm -rf public;
  4. Some Hugo users are reluctant to use rm -rf {argument};
  5. For Hugo old hands, this reluctance is no problem, but for newish Hugo users, it does IMO seem to be an issue;
  6. A subcommand clean is familiar (at least to some people), thus easily explainable: see (for instance) make clean;
  7. Currently Hugo is unlike most other programs in how its users do the necessary cleanup.

To help Hugo (by slightly aiding its evangelism), the sweet spot IMO would be these goals:

  1. Help those web site creators relatively uncomfortable with the command line;
  2. Provide a Hugo command to clean the configured output directory (this isn't always ./public/);
  3. Keep a clean command separate. (Don't complicate the hugo rendering process);
  4. Add only a tiny amount of code.

Based on all this, IMO Hugo should provide a hugo clean command.

For some users, hugo clean would "feel" IMO safer than rm -rf {argument}, for two reasons:

  1. It would be an official Hugo command;
  2. It would avoid some users' uncomfortable feeling (observable above) regarding possibly deleting the wrong files through their use of the "powerful and dangerous" command rm -rf {argument}.

Later, IMO Hugo could provide an option --keep-static to speed cleaning (and rendering) when users know their static files haven't changed. This option might apply to hugo clean and hugo.

tl;dr

Hugo's evangelistic interest would be served by helping some users with their command line rm -rf {argument} discomfort, if this help is merely a tiny addition to the Hugo codebase, by means of adding IMO a hugo clean command.

bep commented

tiny addition to the Hugo codebase, by means of adding IMO a hugo clean command.

This is not tiny. There is nothing as hard to do in an open source project as a rm -rf <something>. Just think about it. Would you love to be the person responsible for wiping out thousands of people's disks?

Would you love to be the person responsible for wiping out thousands of people's disks?

Point taken!

Since extra carefulness is desirable, then:

Whenever Hugo writes (or copies) each file into ./public, it could:

  1. Log that file's (project-relative) path into ./.hugo-output-files;
  2. (Later at hugo clean time), uniquely sort the log and delete only that file; and then
  3. (After Hugo has processed all the files in the log), empty the log.

And Hugo's documentation would:

  1. Mention that Hugo might not catch every file; and
  2. Recommend ./.hugo-output-files be kept outside version control.

As a way to be a bit friendlier to users who are not command-line oriented, this still seems fairly simple (perhaps).

Alternatively, Hugo's documentation could tell people ./public is ephemeral, and unsafe to keep things in.

bep commented

You are missing my point. ANY recursive delete in an open source project, where pull requests arrive from anyone and mistakes are bound to happen, is a bad idea.

That is, if I make a perfectly fine rm -rf public now, it may do weird stuff on some platforms, or some person will remove or add a slash in the future and create havoc.

This is much better handled by the user him or herself outside of Hugo. Then any mistake isn't on me.

ANY recursive delete in an open source project, where pull requests arrive from anyone and mistakes are bound to happen, is a bad idea.

I take your point. You've convinced me! I certainly believe that it's a valid one: that it's a good idea to avoid recursive deletes. (This is especially true, it occurs to me, because symbolic links might escape any filesystem tree.) :)

Whenever Hugo writes (or copies) each file
Log that file's (project-relative) path
delete only that file

Regarding the words, "recursive delete": I suppose that you were speaking metaphorically (perhaps).

I suppose so, because the suggestion here contains no recursion. (By this, I mean that it contains no recursive filesystem traversal.) Isn't this statement a true one?

For additional safety as well, it contains no file basename globbing. Instead, from an explicit list of files, each file would be deleted individually. Previously, into the list, each file's path would have been added carefully, individually, by Hugo. So, it involves only simple iteration, over a list of individual files.

This is why the suggestion here is IMO helpful. :)

So, in a (hypothetical) implementation involving Hugo deleting individual, logged files, recursion and globbing would not be a part of Hugo's source code base.

Now, in an open-source project, I grant that someone could accept an alteration to that code, which added recursion to the deletion.

However, also, by the same token, I suppose that anywhere, at any time, someone else could add recursive deletion code (to any project). Therefore, arguably, the above suggestion seems to pose no significant, additional danger (practically speaking). :)

snapo commented

How about just doing a mv and letting the user decide whether he wants to delete it? In this case no data is deleted (unless you move it to /dev/null).
The user still gets the newest data rendered, without the old stuff. With this in mind, it would also be possible to misuse it as a form of versioning.

I completely agree rm is too dangerous (see what happened with the Steam client :-) )

Just my 2 cents...

I'm against these rm and mv ideas. A safer solution would be to add a subcommand to hugo list. For example:

hugo list output               List all files Hugo shall write in outputDir
hugo list output --orphans     List all files in outputDir not part of Hugo

The hugo list command currently only deals with listing content. The output subcommand would expand the scope of list a bit, but I think list is the right place to put this functionality.

snapo commented

@moorereason, that also sounds good.

Just now, I realized that the "list of files" suggestion for a hugo clean command is analogous to how uninstallers work.

Basically, a (hypothetical) hugo clean command, which removed only the individual files it's sure Hugo created, would work like apt-get remove on Debian or Ubuntu, brew uninstall on OS X, or even a typical uninstall program (on Windows).

Typically, we run these kinds of programs quite often, and trust them. :)

bep commented

I said I had a Hugo pause, but I still get alerts on changes on this issue. This will not be implemented on my watch, so I might as well close it. Take discussions to http://discuss.gohugo.io/

Understood—continuing the discussion here.

@bep How can you be comfortable with telling people "just rm -rf it" but then not be comfortable with adding a delete flag to a build tool?

If deleting previous build data is not a trivial problem and requires, as you say, more effort than just adding an rm -rf line to Hugo's code because it won't work consistently across systems, maybe you shouldn't be recommending it to everyone?

Don't really understand how you can be of both minds.

@acnebs The point is, as I understand it, that the writing of the rm -rf command itself is handled by the user, rather than Hugo. This means that if the command breaks, for whatever reason, and deletes more than it should, it's obviously 100% the fault of the user, and not a bug in Hugo which is deleting people's sites, or worse yet, their entire disk. There are plenty of valid codepaths which would look completely fine, pass code review, be shipped, and then result in someone nuking their entire OS.

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.