'hugo clean'
Closed this issue · 13 comments
- I've noticed, as I move things around, change names, and bring things in and out of the static folder, that Hugo doesn't clear the public folder of this "old stuff." I've resorted to deleting it, on occasion, just to "start anew." (@dkebler)
- If I render my site, then change the name of a content file and re-render, the old rendered HTML files stay in the output directory. This can easily leave me with duplicate, or just plain "old", data in the output directory. (@natefinch)
- I forgot to clear out the public directory; I'm not entirely satisfied with its nomenclature either. However, these are largely self-inflicted pains.
  I'm not satisfied with nuking the directory via the command line approach.
  I also think that this is something Hugo should be responsible for, as it is the one putting the files in its own directories. It should be capable of managing those resources better or, at a minimum, clean up after itself.
  I'm for having some solution to cleaning up the public folder, even if it's just a delete process. (@mohae)
- Files that were created in the wrong directory or with the wrong title will remain. If you leave them, you might get confused by them later. [and] Purge the `public/` directory (optional, but useful if you want to start with a clean slate). (@mdhender)
- Maybe: something like a `.hugoignore` file, to white-list files you want to keep. (@natefinch)
- Simpler just to let the user define one or more commands to run. For me, `rm -rf *` would work fine. (@natefinch)
- I'm fine with adding an extra `clean` step to my deploy script. (@bep)
- Behavior 1): Static change -> sync static files.
  Behavior 2): Other file changes (template, content, etc.) -> render of complete site.
  I think behavior 2) could improve to: first do a complete filesync (including deleting all non-static files), then do the re-render.
  This would have the advantage of not copying over all the images, etc., that are in the static directory, while providing a clean directory for the rendering to happen in. (@spf13)
- Hugo's big selling point is speed. If this "complete filesync" makes the live-reload less snappy, I'm less happy. (@bep)
- I don't want to implement/review "delete folders" code in an open-source project. There are plenty of effective ways to handle this:
  - Do a `rm -rf [public/]` in a shell script;
  - Separate development and production build targets.

  (#379 (comment))
In summary:
- Some web site creators (and maintainers) seem uncomfortable with the command line (in general);
- Some Hugo tutorials recommend purging the `./public/` directory by means of the command line, in order to aid development;
- Purging manually involves the (arguably) somewhat scary command `rm -rf public`;
- Some Hugo users are reluctant to use `rm -rf {argument}`;
- For Hugo old hands, this reluctance is no problem, but for newish Hugo users, it does IMO seem to be an issue;
- A subcommand `clean` is familiar (at least to some people), thus easily explainable: see (for instance) `make clean`;
- Currently Hugo is unlike most other programs in how its users do the necessary cleanup.
To help Hugo (by slightly aiding its evangelism), the sweet spot IMO would be these goals:
- Help those web site creators relatively uncomfortable with the command line;
- Provide a Hugo command to `clean` the configured output directory (this isn't always `./public/`);
- Keep a `clean` command separate. (Don't complicate the `hugo` rendering process);
- Add only a tiny amount of code.
Based on all this, Hugo should provide IMO a command `hugo clean`.
For some users, `hugo clean` would "feel" IMO safer than `rm -rf {argument}`, for two reasons:
- It would be an official Hugo command;
- It would avoid some users' uncomfortable feeling (observable above) regarding possibly deleting the wrong files through their use of the "powerful and dangerous" command `rm -rf {argument}`.
Later, IMO Hugo could provide an option `--keep-static` to speed cleaning (and rendering) when users know their static files haven't changed. This option might apply to `hugo clean` and `hugo`.
tl;dr
Hugo's evangelistic interest would be served by helping some users with their command line `rm -rf {argument}` discomfort, if this help is merely a tiny addition to the Hugo codebase, by means of adding IMO a `hugo clean` command.
> tiny addition to the Hugo codebase, by means of adding IMO a `hugo clean` command.
This is not tiny. There is nothing as hard to do in an open source project as a `rm -rf <something>`. Just think about it. Would you love to be the person responsible for wiping out thousands of people's disks?
> Would you love to be the person responsible for wiping out thousands of people's disks?
Point taken!
Since extra carefulness is desirable, then:
Whenever Hugo writes (or copies) each file into `./public`, it could:
- Log that file's (project-relative) path into `./.hugo-output-files`;
- (Later at `hugo clean` time), uniquely sort the log and delete only that file; and then
- (After Hugo has processed all the files in the log), empty the log.
And Hugo's documentation would:
- Mention that Hugo might not catch every file; and
- Recommend `./.hugo-output-files` be kept outside version control.
To achieve this bit of further friendliness to non-command line oriented users, this still seems fairly simple (perhaps).
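The logging idea above can be sketched in a few lines. This is only an illustration in Python, not Hugo code: the function names `record_output` and `clean_from_manifest` are hypothetical, and the only name taken from the proposal is the `.hugo-output-files` log. It assumes that entries are project-relative paths, one per line, and it skips any entry that resolves outside the project.

```python
from pathlib import Path

MANIFEST_NAME = ".hugo-output-files"  # log file name taken from the proposal


def record_output(project_root: Path, written_file: Path) -> None:
    """Append the project-relative path of a file the build just wrote."""
    relative = written_file.resolve().relative_to(project_root.resolve())
    with open(project_root / MANIFEST_NAME, "a", encoding="utf-8") as log:
        log.write(relative.as_posix() + "\n")


def clean_from_manifest(project_root: Path) -> list:
    """Delete only the files listed in the log, then empty the log."""
    manifest = project_root / MANIFEST_NAME
    if not manifest.exists():
        return []
    root = project_root.resolve()
    removed = []
    # "Uniquely sort" the log: de-duplicate entries, then process in order.
    for entry in sorted(set(manifest.read_text(encoding="utf-8").splitlines())):
        if not entry:
            continue
        target = project_root / entry
        # Skip anything that resolves outside the project, e.g. "../" entries
        # or paths that pass through a symlink pointing elsewhere.
        if root not in target.resolve().parents:
            continue
        if target.is_file():
            target.unlink()  # one explicit file at a time: no recursion, no globbing
            removed.append(entry)
    manifest.write_text("", encoding="utf-8")
    return removed
```

Files that were never logged (hand-placed files, for example) are left alone, which is also why such a scheme "might not catch every file."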
Alternatively, Hugo's documentation could tell people `./public` is ephemeral, and unsafe to keep things in.
You are missing my point. ANY recursive delete in an open source project, where pull requests arrive from anyone and mistakes are bound to happen, is a bad idea.
That is, if I make a perfectly fine `rm -rf public` now, it may do weird stuff on some platforms, or some person will remove or add a slash in the future and create havoc.
This is much better handled by the user himself or herself outside of Hugo. Then any mistake isn't on me.
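To make the "remove or add a slash" worry concrete, here is a small sketch of how an ordinary-looking path computation can end up pointing far outside the output folder. It is illustrative Python, not Hugo's code: `resolve_output_dir` is a hypothetical helper, and the paths are made-up examples. It only computes paths and deletes nothing.

```python
import posixpath


def resolve_output_dir(project_root: str, configured: str) -> str:
    """Join a configured output folder onto the project root (POSIX rules)."""
    return posixpath.normpath(posixpath.join(project_root, configured))


# The intended case: a folder inside the project.
inside = resolve_output_dir("/home/me/site", "public")

# One stray leading slash: join() discards the project root entirely.
root_of_disk = resolve_output_dir("/home/me/site", "/")

# An empty setting: the "output folder" is now the whole project.
whole_project = resolve_output_dir("/home/me/site", "")

# A relative path that climbs out of the project.
above_project = resolve_output_dir("/home/me/site", "public/../..")
```

Each of these calls looks harmless in review, yet a recursive delete of the result would remove `/`, the whole project, or its parent directory in the last three cases.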
> ANY recursive delete in an open source project, where pull requests arrive from anyone and mistakes are bound to happen, is a bad idea.
I take your point. You've convinced me! I certainly believe that it's a valid one: that it's a good idea to avoid recursive deletes. (This is especially true, it occurs to me, because symbolic links might escape any filesystem tree.) :)
> Whenever Hugo writes (or copies) each file
> Log that file's (project-relative) path
> delete only that file
Regarding the words, "recursive delete": I suppose that you were speaking metaphorically (perhaps).
I suppose so, because the suggestion here contains no recursion. (By this, I mean that it contains no recursive filesystem traversal.) Isn't this statement a true one?
For additional safety as well, it contains no file basename globbing. Instead, from an explicit list of files, each file would be deleted individually. Previously, into the list, each file's path would have been added carefully, individually, by Hugo. So, it involves only simple iteration, over a list of individual files.
This is why the suggestion here is IMO helpful. :)
So, in a (hypothetical) implementation involving Hugo deleting individual, logged files, recursion and globbing would not be a part of Hugo's source code base.
Now, in an open-source project, I grant that someone could accept an alteration to that code, which added recursion to the deletion.
However, also, by the same token, I suppose that anywhere, at any time, someone else could add recursive deletion code (to any project). Therefore, arguably, the above suggestion seems to pose no significant, additional danger (practically speaking). :)
How about just doing a `mv` and letting the user decide whether to delete it? In this case no data is deleted (unless you move it to /dev/null).
The user still gets the newest data rendered, without the old stuff. With this in mind, it would also be possible to misuse it as a form of versioning.
I completely agree that `rm` is too dangerous (see what happened with the Steam client :-) ).
Just my 2 cents...
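A minimal sketch of the "move instead of delete" idea, written in Python purely for illustration. The helper name `set_aside_output` and the timestamp suffix format are assumptions, not anything Hugo provides. It renames the old output folder to a timestamped sibling and leaves an empty folder for the next render.

```python
import shutil
import time
from pathlib import Path
from typing import Optional


def set_aside_output(output_dir: Path, stamp: Optional[str] = None) -> Optional[Path]:
    """Rename output_dir to '<name>.<stamp>' and recreate it empty.

    Returns the backup path, or None when there is nothing to move.
    """
    if not output_dir.is_dir():
        return None
    stamp = stamp or time.strftime("%Y%m%d-%H%M%S")
    backup = output_dir.with_name(f"{output_dir.name}.{stamp}")
    if backup.exists():
        # Never overwrite an earlier backup; let the user sort it out.
        raise FileExistsError(f"refusing to overwrite {backup}")
    shutil.move(str(output_dir), str(backup))
    output_dir.mkdir()
    return backup
```

Nothing is removed here, so a mistake costs disk space rather than data. The old copies accumulate until the user deletes them, which is the "versioning" side effect mentioned above.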
I'm against these `rm` and `mv` ideas. A safer solution would be to add a subcommand to `hugo list`. For example:

    hugo list output            List all files Hugo shall write in outputDir
    hugo list output --orphans  List all files in outputDir not part of Hugo
The `hugo list` command currently only deals with listing `content`. The `output` subcommand would expand the scope of `list` a bit, but I think `list` is the right place to put this functionality.
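What the proposed `--orphans` listing would compute can be sketched as a set difference. This is an illustrative Python sketch only: `hugo list output` is a proposal in this thread, and `list_orphans` and its `expected` argument (the files the build intends to write) are hypothetical names. The function is read-only, leaving any deletion to the user.

```python
from pathlib import Path
from typing import Iterable, List


def list_orphans(output_dir: Path, expected: Iterable[str]) -> List[str]:
    """Return files under output_dir that the build would not write.

    Paths are relative to output_dir and use forward slashes.
    """
    wanted = set(expected)
    found = (
        p.relative_to(output_dir).as_posix()
        for p in output_dir.rglob("*")
        if p.is_file()
    )
    return sorted(path for path in found if path not in wanted)
```

For example, after renaming a content file, the page rendered under the old name would show up in this list while the page under the new name would not.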
@moorereason, that also sounds good.
Just now, I realized that the "list of files" suggestion for a `hugo clean` command is analogous to how uninstallers work.
Basically, a (hypothetical) `hugo clean` command, which removed only the individual files it's sure Hugo created, would work like `apt-get remove` on Debian or Ubuntu, `brew uninstall` on OS X, or even a typical uninstall program (on Windows).
Typically, we run these kinds of programs quite often, and trust them. :)
I said I had a Hugo pause, but I still get alerts on changes on this issue. This will not be implemented on my watch, so I might as well close it. Take discussions to http://discuss.gohugo.io/
Understood—continuing the discussion here.
@bep How can you be comfortable with telling people "just `rm -rf` it" but then not be comfortable with adding a delete flag to a build tool?
If deleting previous build data is not a trivial problem and requires, as you say, more effort than just adding an `rm -rf` line to Hugo's code because it won't work consistently across systems, maybe you shouldn't be recommending it to everyone?
Don't really understand how you can be of both minds.
@acnebs The point is, as I understand it, that the writing of the `rm -rf` command itself is handled by the user, rather than Hugo. This means that if the command breaks, for whatever reason, and deletes more than it should, it's obviously 100% the fault of the user, and not a bug in Hugo which is deleting people's sites, or worse yet, their entire disk. There are plenty of valid codepaths which would look completely fine, pass code review, be shipped, and then result in someone nuking their entire OS.