The problem: spurious git diffs when whitespace changes happen.
When text and code do not have a canonical form, it's possible for someone to change from one form to another without intending to or noticing. For example, someone can add or remove whitespace after the text at the end of a line without changing its meaning.
A common solution to this is to automatically "clean" whitespace on every save. But this can introduce its own spurious diffs when working on a file that isn't already clean.
The solution: ethan-wspace.
When ethan-wspace is activated in a buffer, it examines the whitespace in that buffer and does different things depending on whether that whitespace is clean or dirty.
- If the whitespace is dirty, then ethan-wspace will highlight the "errors", so you can be cognizant of where the whitespace is already dirty. This can help you preserve the whitespace as-is, although it does not prevent you from introducing new errors.
- If the whitespace is already clean, then ethan-wspace will insert hooks to clean this whitespace on save. This will ensure that the whitespace remains clean, even if you introduce errors yourself. Because the whitespace will be automatically cleaned, there is no need to display whitespace specially, and no highlighting is added.
ethan-wspace does this at a granular level for different kinds of whitespace problems: tabs, newlines at end of file, trailing whitespace. The categories of whitespace that are clean will be maintained by cleaning on save, and the ones that are dirty will be highlit.
When you open files (N.B. but not non-file buffers), bad whitespace will be highlit and clean whitespace will be maintained by cleaning just before files are saved. There's also a mode line "lighter" for ethan-wspace which looks something like ew:mnLt. Each letter corresponds to a kind of whitespace (see "Errors", below). Lower case letters indicate categories of whitespace which will be cleaned before save; upper case letters indicate categories which are being highlit.
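To make the lighter convention concrete, here is a small sketch that decodes a lighter string into per-category actions. The function name my/decode-ew-lighter is hypothetical and not part of ethan-wspace, and the sketch assumes the lighter is exactly the prefix "ew:" followed by one letter per category, as in the example above.

```elisp
;; Hypothetical helper for illustration; not part of ethan-wspace.
;; Assumes LIGHTER looks like "ew:mnLt": the prefix "ew:" followed by
;; one letter per whitespace category.
(defun my/decode-ew-lighter (lighter)
  "Decode LIGHTER into an alist of (CATEGORY . ACTION) pairs.
ACTION is `clean' for a lower case letter and `highlight' for an
upper case one."
  (let ((names '((?l . eol)
                 (?n . no-nl-eof)
                 (?m . many-nls-eof)
                 (?t . tabs)))
        (letters (substring lighter (length "ew:"))))
    (mapcar (lambda (ch)
              (cons (cdr (assq (downcase ch) names))
                    (if (eq ch (downcase ch)) 'clean 'highlight)))
            (string-to-list letters))))

(my/decode-ew-lighter "ew:mnLt")
;; => ((many-nls-eof . clean) (no-nl-eof . clean) (eol . highlight) (tabs . clean))
```

In this reading, only trailing whitespace at end of line is dirty in the buffer; the other three categories are being kept clean on save.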
You can switch from one to the other using M-x ethan-wspace-highlight-FOO-mode or M-x ethan-wspace-clean-FOO-mode (each mode disables the other). You can also click on the corresponding letter in the modeline lighter, which will switch from cleaning to highlighting or back.
If you want to clean all kinds of whitespace, you can use M-x ethan-wspace-clean-all, which immediately cleans everything and switches to clean-before-save on all whitespace types.
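If you find yourself running that command often, one option is to bind it to a key. This is only a sketch: the key C-c w is an arbitrary choice made for this example, not something ethan-wspace sets up for you.

```elisp
;; Bind the clean-everything command to a key of your choosing.
;; "C-c w" is an arbitrary example binding; pick whatever is free
;; in your own configuration.
(global-set-key (kbd "C-c w") #'ethan-wspace-clean-all)
```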
ethan-wspace recognizes the following categories of whitespace errors:
- trailing whitespace at end of line (eol, modeline letter L for "end of Line").
- no trailing newline (no-nl-eof, modeline letter N for "No trailing newline").
- more than one trailing newline (many-nls-eof, modeline letter M for "Many trailing newlines").
- tabs, at all (tabs, modeline letter T).
It recognizes these categories independently, and treats each category as clean or not-clean.
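As a rough illustration of what "independently" means, the sketch below classifies a string of text against each category on its own. It is not how ethan-wspace is implemented: the function name my/whitespace-error-categories and the specific checks are assumptions made for this example, using only the category names listed above.

```elisp
;; Hypothetical illustration; not ethan-wspace's actual implementation.
(defun my/whitespace-error-categories (text)
  "Return a list of the whitespace categories that are dirty in TEXT.
Each category is checked on its own, so any combination can occur."
  (let ((found '()))
    ;; tabs: any tab character anywhere counts.
    (when (string-match-p "\t" text)
      (push 'tabs found))
    ;; many-nls-eof: the text ends with more than one newline.
    (when (string-suffix-p "\n\n" text)
      (push 'many-nls-eof found))
    ;; no-nl-eof: non-empty text that does not end with a newline.
    (when (and (> (length text) 0)
               (not (string-suffix-p "\n" text)))
      (push 'no-nl-eof found))
    ;; eol: spaces or tabs immediately before the end of some line.
    (when (string-match-p "[ \t]+$" text)
      (push 'eol found))
    found))

(my/whitespace-error-categories "foo  \nbar\n")  ; => (eol)
(my/whitespace-error-categories "foo\n\n")       ; => (many-nls-eof)
```

A text can be dirty in one category and clean in the others, which is why cleaning and highlighting are decided per category rather than per buffer.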
If you are editing some line, and are writing something at the end of it, and have added some spaces, but your cursor is just after those spaces, the spaces aren't considered "trailing" yet. They won't be highlit if you are in highlight-eol-mode. If you are in clean-eol-mode and you save the buffer, the spaces will be cleaned, the buffer will be saved, and the spaces will be re-added for your convenience[1]. Similar behavior exists for newlines-at-end-of-file.
[1] This may have the surprising behavior that your file appears "clean" even though its contents are not exactly what is on disk.
Some file formats (notably Makefiles) treat tabs as syntactically significant. Tabs in these files are not errors but are actually required. To try to accommodate these files, ethan-wspace will check the value of the variable indent-tabs-mode. If set, tabs will not be considered errors (so they will neither be highlit nor converted to spaces on saves). However, this means you are on your own if some lines happen to indent using spaces.
You can override this behavior (if you desire) by customizing ethan-wspace-errors-in-buffer-hook, using something like:

    (defun i-still-really-hate-tabs ()
      (if (not (member 'tabs ethan-wspace-errors))
          (setq ethan-wspace-errors (cons 'tabs ethan-wspace-errors))))
    (add-hook 'ethan-wspace-errors-in-buffer-hook 'i-still-really-hate-tabs)
You should also remove any customizations you have made to turn on either show-trailing-whitespace or require-final-newline; we handle those for you. (But note that require-final-newline is turned on by some modes based on the value of mode-require-final-newline, so you may have to turn that off.)
    (custom-set-variables '(mode-require-final-newline nil))
ethan-wspace is in MELPA and can be installed using package-install. If you use use-package, a sample config might be:

    (use-package ethan-wspace
      :ensure t
      :config
      (setq mode-require-final-newline nil)
      (global-ethan-wspace-mode 1))
Otherwise, you can manually add the lisp directory to your load-path, and then (require 'ethan-wspace). In other words, add to your init.el something like the following:

    (add-to-list 'load-path (expand-file-name "~/.emacs.d/upstream/ethan-wspace.git/lisp"))
    (require 'ethan-wspace)
    (global-ethan-wspace-mode 1)
You might also want to customize the face used to highlight erroneous whitespace. This is configurable by ethan-wspace-face. A default face is computed based on the background of your frame at the time ethan-wspace is loaded with require (so you might want to make your calls to color-theme first).
Most other emacs whitespace customizations (and there are many: see ShowWhiteSpace on the EmacsWiki) focus on showing problematic whitespace. There are also some customizations out there focused on Deleting Whitespace. But there are many and they all have extremely similar names. (ethan-wspace aims to be the most egotistically-named package.) ethan-wspace subsumes most of them, except for whitespace.el to show all whitespace in non-programming contexts, and ws-trim.el which I had never heard of before just now.
- whitespace.el and the family of related code that includes visws.el, whitespace-mode.el, show-whitespace-mode.el, and blank-mode.el has many options for making whitespace characters visible, both by faces and by changing their representations in the display table. That seems very useful for editing binary files or other circumstances where you care exactly what whitespace you're looking at, but it isn't really useful for editing source code, where you typically want whitespace to be as clean as possible. I have no idea which of those files is most recent or "best", as I have never used them.
- ws-trim.el automatically trims whitespace on edited lines. With a low ws-trim-level it is complementary to ethan-wspace, and may be useful to encourage you to delete whitespace organically. I'd never heard about this package and hopefully ethan-wspace will grow similar functionality soon.
- ws-butler automatically trims whitespace on edited lines too, like an improved ws-trim.
- Putting delete-trailing-whitespace or nuke-trailing-whitespace in your before-save-hook is now obsolete; these functions are too aggressive and will cause you many spurious whitespace commits.
- Standard emacs variables show-trailing-whitespace and require-final-newline are "subsumed" by this mode -- require-final-newline is reimplemented in a more general way, and show-trailing-whitespace is triggered per-buffer by this mode. (show-trailing-whitespace is built into emacs core and seems to be the fastest/most elegant way to highlight trailing whitespace.) next-line-add-newlines, to add newlines when you move forward lines, still exists and is unchanged. I recommend you set this to nil (if it isn't already -- I think it is nil in all versions since 21.1), but ethan-wspace will still trim unnecessary newlines on each save if there were fewer than two when the buffer was opened.
- redspace.el is a small library meant only to highlight trailing whitespace. This is already done by the variable show-trailing-whitespace, which is used internally by ethan-wspace. show-trailing-whitespace has the nice effect that it doesn't highlight trailing whitespace when your cursor is after it -- so you don't see little blinking lights as you type a line of text.
- show-wspace.el is a library that has lots of faces to show tabs, trailing whitespace, and "hard spaces". ethan-wspace obsoletes this mode too.
Honestly, you're right. I sincerely doubt using these customizations will make your life as a programmer even 1% more productive. 1% is nothing. You'd do better to buy a bigger monitor.
I just hate spurious git diffs so much. And when I was working on a
codebase with dirty files, I couldn't just clean everything without
making my subsequent PRs dirty too. If I accidentally cleaned
something, I'd have to carefully undo the cleaning so my commits
didn't include it. A nightmare! ethan-wspace is the result.
Listen. You may have some opinions about whitespace in your source code. They may even amount to preferences. However, it takes a seriously twisted person to think about whitespace obsessively. I have.
The fact is that I simply have more opinions about whitespace than you do. That makes mine more correct.
It is my opinion (and remember, my opinions are right) that you should never, ever have tabs in your source code, at all. If you disagree, please see Tabs Are Evil on the EmacsWiki. This was once a holy war, and then for a time it was settled, but these days, the idea that tabs are acceptable is making a resurgence due to gofmt.
Perhaps you are one of those bizarre creatures who uses Smart Tabs. In that case, you are even more OCD about whitespace than I am, and in a twisted way I salute you. However, ethan-wspace by default treats tabs as errors, which you might find distracting. In that case, I recommend something like the following:
(set-default 'ethan-wspace-errors (remove 'tabs ethan-wspace-errors))
We don't have an error type yet for smart tabs, but patches to add one would be welcome.
Required reading for this discussion is JWZ's "famous" tabs versus spaces post. He sets out three categories of effect that tabs have, and how to defuse the whole situation.
I have encountered people who prefer tabs because they prefer being able to press backspace and go exactly one level of indentation back. These people are obviously wrong because if you're using a halfway decent editor, it should be capable of indenting CORRECTLY for you automatically (e.g. emacs's TAB behavior), as well as backspacing a whole level in languages where that's useful (e.g. emacs's python-backspace). So this argument just boils down to "I have a crappy text editor."
You may encounter people who say things like, "Tabs are better because they let everybody set their own indentation width." And this is true to a point. If you are one of those people, pop quiz: let's say you use tabs, and prefer them to be four spaces wide. How do you indent the last line of this code?
    if __name__ == '__main__':
        main.Application(config, sys.argv, time.time(),
                         docutils.parsers.rst.directives.images.Image)
If you said "five tabs, one space" -- you lose. Because then when you move to Jean's machine, where tabs are two spaces, you find:
    if __name__ == '__main__':
      main.Application(config, sys.argv, time.time(),
               docutils.parsers.rst.directives.images.Image)
And on Johann's machine, where tabs are eight spaces, you see:
    if __name__ == '__main__':
            main.Application(config, sys.argv, time.time(),
                                             docutils.parsers.rst.directives.images.Image)
Your beautifully-indented source code has been scattered to the winds. You've just demonstrated that you aren't crazy enough to think about whitespace issues obsessively enough. Rejoice! There is a place for you in normal society.
It's because of code like the above that truly demented people will suggest using tabs for blocks only and spaces within blocks. This is the "Smart Tabs" approach mentioned above. In the above code, that gives you "one tab, seventeen spaces". I've never seen a project with this as the coding standard, and I'll never suggest it for a real project, for the simple fact that people are lazy and source-code editors are imperfect, and somewhere, somehow, I am certain to come across spaces where there should be tabs, or tabs where there should be spaces. And then I will be furious.
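If you'd like to check the arithmetic, here is a small sketch. The helper names are made up for this example. It assumes the continuation line is meant to start in the column just after the open paren of "main.Application(", which sits on a line indented by one tab.

```elisp
;; Illustrative arithmetic only; these helpers are hypothetical.
(defun my/indent-column (tabs spaces width)
  "Column reached by TABS tabs then SPACES spaces when a tab is WIDTH wide."
  (+ (* tabs width) spaces))

(defun my/paren-column (width)
  "Column just after \"main.Application(\" on a line indented by one tab."
  (+ width (length "main.Application(")))

(defun my/aligned-p (tabs spaces width)
  "Non-nil if the continuation line lines up with the open paren."
  (= (my/indent-column tabs spaces width) (my/paren-column width)))

;; For each tab width, check both indentation strategies.
(mapcar (lambda (width)
          (list width
                (my/aligned-p 5 1 width)    ; five tabs, one space
                (my/aligned-p 1 17 width))) ; one tab, seventeen spaces
        '(2 4 8))
;; => ((2 nil t) (4 t t) (8 nil t))
```

"Five tabs, one space" only lines up at the tab width it was written for, while "one tab, seventeen spaces" lines up at every width, which is exactly why it demands such discipline about which characters go where.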
Rather than try to ensure complete compliance with this extremely complicated rule for source code formatting, I have set my sights on the simpler expedient of just outlawing tabs in source code entirely and consigning them to the dustbin of history.
ethan-wspace is released under a BSD license (see COPYING).