Literate Diffing


The other day I found myself wanting to add commentary to a diff. Code review tools such as Review Board and Gerrit make commenting on diffs pretty easy, and GitHub lets you comment on pull requests and individual commits.

These are all fantastic tools for commenting on diffs, but I kind of wanted something different, something a little more self-contained. I wanted to write about the individual changes, what motivated them, and what the non-code implications of each change might be. At that point my mind wandered to the world of lightweight literate programming using tools like docco, rocco, and pycco.

A literate diff might look something like this (using Python/Bash style single-line comments):

# Extend Pygments' DiffLexer using a non-standard comment (#) for literate diffing using pycco.
diff -r cfa0f44daad1 pygments/lexers/text.py
--- a/pygments/lexers/text.py	Fri Apr 29 14:03:50 2011 +0200
+++ b/pygments/lexers/text.py	Sat Apr 30 20:28:56 2011 -0500
@@ -231,6 +231,7 @@
             (r'@.*\n', Generic.Subheading),
             (r'([Ii]ndex|diff).*\n', Generic.Heading),
             (r'=.*\n', Generic.Heading),
# Add non-standard diff comments.  This has to go above the Text capture below
# in order to be active.
+            (r'#.*\n', Comment),
             (r'.*\n', Text),
         ]
     }

It turns out that a literate diff is pretty easy to process with patch, but it comes with a catch. The patch command blows up quite spectacularly when it encounters one of these comment lines, so the comments have to be removed from a literate diff before it is passed to patch. This is easily done using awk:

awk '!/^#/' literate.diff | patch -p0

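If the diff comes from a DVCS such as Mercurial or Git, the file paths carry a/ and b/ prefixes (as in the example above), so you’ll need -p1 instead.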
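The comment-stripping step can also be sketched in Python. This is only an illustration: it assumes that commentary lines always start with # in the first column, and the function name strip_literate_comments is made up for this example. Every line inside a unified diff hunk starts with a space, +, - or \, so a # that belongs to the patched code never sits in the first column and is left alone.

```python
def strip_literate_comments(lines):
    """Yield only the lines that do not start with '#' in column one.

    Assumption (not part of any diff standard): literate commentary
    always begins at the very start of the line.
    """
    for line in lines:
        if not line.startswith("#"):
            yield line


# A tiny in-memory literate diff; the '#' inside the hunk lines must survive.
literate = [
    "# Explain why the default changes.\n",
    "@@ -1,2 +1,3 @@\n",
    " x = 1  # existing comment in the code\n",
    "+# a new shell-style comment added by the patch\n",
    " y = 2\n",
]
plain = list(strip_literate_comments(literate))
```

Only the first line of the sample is dropped; the two hunk lines that merely contain # pass through untouched, so the result is something patch can consume.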

Since I’m using a non-standard extension to diffs, tools such as Pygments won’t know how to highlight the comments appropriately. If the comments aren’t marked up correctly, pycco won’t be able to put them in the correct spot. This requires a patch to Pygments and a patch to pycco. I’m kind of abusing diff syntax here and haven’t submitted these patches upstream, but you can download and apply them if you’d like to play along at home.
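To give a feel for what "putting comments in the correct spot" involves, here is a rough, standard-library-only sketch of pairing each run of commentary with the diff lines that follow it. It is not pycco's actual implementation; the function name split_sections and the rule that any line starting with # is commentary are assumptions made for this illustration.

```python
def split_sections(diff_text):
    """Split a literate diff into (commentary, diff_lines) pairs.

    Assumption: a line starting with '#' is commentary, and each run of
    commentary describes the diff lines that follow it.
    """
    sections = []
    docs, code = [], []
    for line in diff_text.splitlines():
        if line.startswith("#"):
            # Commentary that follows diff lines opens a new section.
            if code:
                sections.append((" ".join(docs), code))
                docs, code = [], []
            docs.append(line.lstrip("#").strip())
        else:
            code.append(line)
    # Flush whatever is left at the end of the input.
    if docs or code:
        sections.append((" ".join(docs), code))
    return sections
```

Each resulting pair could then be rendered side by side, with the commentary in one column and the highlighted diff lines in the other, which is the layout the literate tools mentioned earlier produce.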

I still think tools like GitHub, Review Board, and Gerrit are much more powerful for commenting on diffs, but I was able to make pycco output literate diffs quickly enough that I thought I’d share the process. Literate diffs are no substitute for clearly commenting changes and their implications within the code itself, but I do like having a place to put the underlying motivations. Here’s an example of a literate diff for one of my commits to phalanges, a finger daemon written in Scala. It’s still a pretty contrived example, but it is exactly what I was envisioning when my mind drifted from diffs to literate programming.

Comments

2 responses to “Literate Diffing”

  1. Eric Moritz

    Can’t you just add a comment in your code so it’ll show up as a new line in the diff?

  2. Me

    /\#/ this means “have #” not “starts with #” so it’ll break diff of bash comments etc…