-
A format for embedding code reviews into unified diff files.
-
An Emacs mode for creating / reading / responding to reviews in that format, with git integration.
-
gen-diffscuss.py, a tool for generating code reviews from git repos.
There are also plans for some command line tools, and tools to allow diffscuss files to serve as a code-review interchange format.
If you like:
-
Plain text formats that play nicely with the *nix toolchain
-
Lightweight, flexible code-review processes
-
The idea of being able to work with / keep your reviews close to your source, even in the same repository
Then you might want to give Diffscuss a look.
Diffscuss adds a single beginning-of-line character to the unified diff format: %, which marks the beginning of a comment line. Lines beginning with %* are comment headers, and lines beginning with %- are comment bodies. The number of * or - characters indicates reply threading.
Since one example is worth a lot of explaining, here's an example diffscuss file:
%*
%* author: Bob Jones
%* email: bjones@example.com
%* date: 2013-03-15T19:00:21-0400
%*
%- This change set accomplishes two purposes:
%-
%- * Add a version of the world greeting for our Spanish-speaking users.
%-
%- * Indicate that it's only a test.
%-
%**
%** author: Edmund Jorgensen
%** email: edmund@example.com
%** date: 2013-03-16T19:00:45-0400
%**
%-- Looks good, one question about internationalization.
%--
diff --git a/example.txt b/example.txt
index f75ba05..01632d8 100644
--- a/example.txt
+++ b/example.txt
@@ -1 +1,7 @@
Hello, world.
+
+Hola, mundo.
%*
%* author: Edmund Jorgensen
%* email: edmund@example.com
%* date: 2013-03-16T19:00:37-0400
%*
%- I'm not sure we want to go international just yet.
%-
%**
%** author: Bob Jones
%** email: bjones@example.com
%** date: 2013-03-17T18:12:49-0400
%**
%-- What are your specific concerns?
%--
+
+This is a test.
+
+It is only a test.
The comments at the top of the file apply to the entire review in general.
The other two comments address the +Hola, mundo.
line (that is,
the line directly above the first comment in the thread).
Take a look at the "Format Definition" section for much more detail.
The Emacs mode colorizes diffscuss files to make for easier reading. In addition it helps with:
The main command you need to know is C-c C-c
, which generally
"does the right thing" based on the position of the cursor. To wit:
-
If the cursor is at the very top of the buffer, it will insert a new review-level comment (this function is available anywhere in the buffer with
C-c C-f
). -
If the cursor is inside another comment, it will create a reply to that comment (this can also be invoked with
C-c C-r
). -
If the cursor is inside the diff index information for a file / hunk, it will insert a comment after the "range line" (the line beginning with @@).
-
If the cursor is on a diff line, it will create a comment directly below that line (this is also available with
C-c C-i
).
If you position the cursor on one of the diff lines in a Diffscuss file, then:
-
C-c -
will open up a temporary buffer containing the old version of the source (if it's available locally), with the cursor positioned on the same line. -
C-c +
will open up a temporary buffer containing the new version of the source (if it's available locally), with the cursor positioned on the same line.
Both of these require that the Diffscuss file you're visiting is located inside the git repo directory.
C-c s
will attempt to find the local source file / line in that file that's the best candidate to match up with the diffscuss line the cursor is currently on.
This requires that the diffscuss file you are visiting is somewhere
under a git checkout of the repo against which the diffscuss file was
generated, and that you have set the diffscuss-dir
Emacs
variable to root directory of your diffscuss installation (where
find-local-source.py is located).
You can move around the diffscussion quickly using:
-
C-c f
to move forward one comment. -
C-c b
to move back one comment. -
C-c n
to move to the next thread. -
C-c p
to move to the previous thread. -
C-c a
to move to the beginning of the current thread. -
C-c e
to move to the end of the current thread.
Like the Emacs mode, the Vim plugin offers syntax highlight, comment insertion, commands for jumping to source files, and motions for comments and threads. The Vim plugin is implemented primarily in Python, so you'll need a version of Vim compiled with Python support in order to use it.
The Vim plugin requires you to set g:diffscuss_config
in your .vimrc
,
which it uses to pre-fill comment headers and locate scripts.
let g:diffscuss_config = {
\'author': 'Your Name',
\'email': 'your.email@example.com',
\'diffscuss_dir': '/path/to/diffscuss'
\}
<leader>dd
: insert a new comment contextually (EmacsC-c C-c
)<leader>df
: insert a new review-level comment (EmacsC-c C-f
)<leader>dr
: insert a new reply (EmacsC-c C-r
)<leader>di
: insert a new comment (EmacsC-c C-i
)
<leader>do
: show the old source version (EmacsC-c -
)<leader>dn
: show the new source version (EmacsC-c +
)<leader>ds
: show the local source version (EmacsC-c s
)
]d
: jump to the start of the next comment[d
: jump to the start of the previous comment]D
: jump to the end of the next comment[D
: jump to the end of the previous comment]t
: jump to the start of the next thread[t
: jump to the start of the previous thread]T
: jump to the end of the next thread[T
: jump to the end of the previous thread
gen-diffscuss.py is a helper script for creating a new Diffscuss file with diffs and log messages from a set of changes in git.
For example,
gen-diffscuss.py HEAD~3..HEAD > some_code_review.diffscuss
Makes a Diffscuss file containing the diffs from the last three commits in a local git repo, introduced with a comment at the top containing all the log messages for those three commits.
All Diffscuss lines begin with one '%'.
There's no limit to the length of a Diffscuss line, but keeping them 80 chars or less when possible is probably good citizenship, since they're meant to be read and edited inside editors.
A Diffscuss comment begins with an author line, and possibly other header lines, followed by at least one body line.
A header line one '%' followed by at least one '*', followed by a space, followed by a header of the format 'field-name: value'.
Header lines may also be empty, containing no 'field-name: value', in which case the space after the * is also optional.
A header line must always begin a comment or follow another header line.
For example, all three of these are valid header lines:
%* author: Bob Jones
%**
%*** x-github-version: 1
The field name cannot contain whitespace. The value cannot contain a newline.
The current Diffscuss headers are:
-
author
-
email
-
date
Non-standard headers should begin with an 'x-', to keep compatible with future additions to the Diffscuss format.
An author line is a standard header line with a field of 'author' and a value indicating who authored the comment. For example:
%* author: ejorgensen
Or
%** author: bsmith
Every comment must begin with an author line. All other headers are optional.
Dates should be specified in the full ISO 8601 standard, including time zone offset, e.g. 2013-11-22T23:11:21-0400.
A body line is one '%' character followed by at least one -, followed by a space, followed by arbitrary text.
Exception: for an empty body line, the space is optional.
For example:
%- This is a body line.
%-- And so is *this*.
%- and this next body line
%-
%- is empty, so doesn't need a space, but could have one.
A body line must always follow a header line or another body line.
A thread is one or more adjacent comments, properly threaded.
For example, this is a thread:
%* author: ejorgensen
%- I'm a one comment thread.
And so is this:
%* author: ejorgensen
%- I'm a two comment thread.
%* author: bsmith
%- With no replies, just two top level comments.
And so is this:
%* author: ejorgensen
%- I'm a two comment thread.
%** author: bsmith
%-- With a reply.
To be explicit: the nesting / reply level of a thread is determined by the number of '*' characters in each header line / '-' characters in each body line, which should remain constant for a given comment.
A comment that is a reply to a previous comment should have one more '*' at the beginning of each header line and one more '-' at the beginning of each body line than its parent comment.
The parent comment for any reply can always be determined by finding the closest previous comment with one less level of nesting. For example:
%* author: ejorgensen
%- I'm a top-level comment.
%** author: bsmith
%-- And I'm a reply.
%*** author: sjones
%--- And I'm a reply to the reply.
%** author: jkidd
%-- And I'm a second reply to the original top-level comment.
%* author: mdarks
%- And I'm another top-level comment.
%** author: lbutterman
%-- And I'm a reply to the second top-level comment.
Diffscuss threads are generally taken to apply to the line immediately above them, so for example in this snippet:
+It's only just a test
%* author: ejorgensen
%- I have grave doubts about this change. To me it appears foolhardy
%- and dangerous.
The comment applies to the line
+It's only just a test
A Diffscuss thread can also appear directly after the range information in a hunk, in which case the target of the comment is assumed to be the entire hunk, for example:
--- 1.txt 2013-03-07 20:18:10.000000000 -0500
+++ 2.txt 2013-03-07 20:18:35.000000000 -0500
@@ -1,5 +1,7 @@
%* author: ejorgensen
%- I love this hunk.
This is a test.
-It's just a test
+It's only just a test
Finally, if a thread appears at the very top of the Diffscuss file, the thread applies to the whole changeset (where it should generally act as a thread discussing the review as a whole--for example, introductory remarks about what the changes are attempting to achieve, "ship it" remarks, etc.).
Right now, if you use Github (for example), you can stop using Github at any moment without losing any of your code and history--it's all in your git repo.
Github has done a fantastic job of making sure you can even take your wiki information with you--that's just another git repo.
But the comments on your pull requests--all that amazing and rich history of problems being hammered out and nits picked--is harder to reconstruct. Github has--in amazing non-lockin-y fashion, and mad props to them for it--provided an API to get all the information about your pull requests, but it's non-trivial to reconstruct it all, requiring a bunch of calls and a bunch of reconstructed json.
Now imagine you could hit the pull request url with a .diffscuss extension (as you can already do with a .diff extension) and pull in the diff--with the contextualized comments, right inline, for viewing in your editor or whatever. That would be pretty cool, no? Especially if you could then import into another code review system with relatively little effort.
Or say that, just like you can with your Github wiki, you could clone a git repo of all your project's pull requests, in a simple textual format, for offline reading, and potentially--with some push magic--even offline code reviews.
This is an aspirational in the extreme goal--after all, Github would have to create the .diffscuss endpoint etc.--but a simple, standard format helps pave the way.