26 Feb 2020

literate programming

I got started with literate programming many years ago, out of admiration for almost everything else i knew done by Donald Knuth, and tried my hand at it in some toyish projects in OCaml and Scheme. So it was not without enthusiasm that i plunged into the literate world.

For reasons i've forgotten (probably better Emacs integration, perhaps simpler syntax) i chose noweb over funnelweb (i would perhaps recommend the reverse these days, if it weren't for org's babel) and wrote a set of make functions to help create multi-file noweb projects. If memory serves, i was writing an MMIX interpreter in OCaml and a distributed job scheduler in Scheme, and i tried in both cases to use a literate style.

On paper, literate programming is great. Just take a look at Knuth's MMIXware book, Hanson's C Interfaces and Implementations or Jay McCarthy's blog: that's definitely how i want to read and understand programs for fun and profit.

As i soon learned, however, that's not the way i wanted to write programs.

Programming is for me mostly a dialectic process of understanding problems and their solutions, always evolving and always provisional (and most often buggy!), and i never seemed able to freeze the solution at hand in a structure that was not in flux. When changing the structure of your solution means rewriting your essay, there's too much friction.

In addition, i tend to build my programs in a bottom-up fashion. Reorganisation comes later, so it's again a chore to start by trying to write down an essay: it's only when i've reached the top that i can see the landscape and properly describe it.

And of course there's the REPL: interactive programming using a live image is, or seemed to be for a long time, at odds with a literate approach to writing your programs.

So, like many others, i just shelved LP for the day i would write a book, and there it's been, untouched, until recently.

It all began when i found myself, more and more often, writing solutions to customer problems that needed the collaboration of several programming languages, with a sprinkle of shell scripting. That need is even more pressing when one also routinely uses DSLs (as we do at BigML with Flatline or WhizzML). Together with all the programming bits, one needs of course to deliver decent documentation explaining them. So it's the perfect storm for an org babel document, especially when one combines it with poly-org. This package allows a truly poly-mode experience in the org buffer itself: when one enters a src block, emacs switches to its language's major mode, and one can do things like sending expressions to the corresponding REPL, or indenting properly, without having to pop up a new buffer as in babel's default modus operandi. It might seem a little thing, but to me it's made a great difference. Add to that the excellent export capabilities of org, the ease with which one can tangle out different files from the same org file, and the immense gamut of functionality that org mode puts at your fingertips, and it's very difficult not to become a fan of this form of literate programming.

Another setting in which babel is lately winning my heart is as a tool for writing emacs lisp packages. It all began with writing my init.el using org, at first tangling it semi-automatically on emacs startup. But then i discovered literate-elisp, a package that teaches emacs to load org files directly, reading the elisp source blocks within, and, most importantly, retaining the source code positions for them. That means that, when one jumps to a function or variable definition¹, one lands directly in the org file: no intermediate tangled elisp is needed. That's, again, incredibly handy, especially when combined with poly-org. I can now write packages like signel in an org file that is directly exportable to a blog post, and i find myself writing at the very same time and place both the emacs lisp program and the blog post explaining how it works.
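
The first of those two approaches, tangling the configuration from an org file at startup, can be sketched with stock org functions. This is only a minimal sketch under stated assumptions, not the configuration described above: the file name config.org is hypothetical, and it is deliberately not called init.org, because org-babel-load-file tangles to a sibling file with an .el extension and would otherwise overwrite init.el itself.

```elisp
;; Minimal sketch of an init.el that delegates to a literate org file.
;; Assumption: the actual configuration lives in emacs-lisp source blocks
;; inside a (hypothetical) config.org in the user's emacs directory.
(require 'org)

(let ((literate-config (expand-file-name "config.org" user-emacs-directory)))
  (when (file-exists-p literate-config)
    ;; Tangle the emacs-lisp blocks of config.org into config.el,
    ;; next to it, and then load the tangled file.
    (org-babel-load-file literate-config)))
```

With this scheme, jumping to a definition lands in the tangled config.el rather than in the org source, which is the intermediate step that a loader reading the org file directly makes unnecessary.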

All that said, this new penchant for LP has its obvious limits: i'm aware that it is not going to be comfortable for projects that aren't as self-contained and small as the ones i mention above. For instance, i don't think we could write our clojure backend as a bunch of org files, much as i wish i were wrong.

Footnotes:

¹ i've had since forever (it's been so long that i've forgotten where i stole it) a little utility function that lets me conveniently jump, a la geiser, using M-. to any emacs lisp definition:

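(require 'xref) ; provides xref-push-marker-stack

(defun elisp-find-definition (name)
  "Jump to the definition of the function (or variable) at point."
  (interactive (list (thing-at-point 'symbol)))
  (cond (name
         (let ((symbol (intern-soft name))
               ;; Call FUN (one of the find-*-noselect functions) on SYM
               ;; and visit the (BUFFER . POINT) pair it returns.
               (search (lambda (fun sym)
                         (let* ((r (save-excursion (funcall fun sym)))
                                (buffer (car r))
                                (point (cdr r)))
                           (cond ((not point)
                                  (error "Found no definition for %s in %s"
                                         name buffer))
                                 (t
                                  (switch-to-buffer buffer)
                                  (goto-char point)
                                  (recenter 1)))))))
           ;; Assumption: the helper that saves the current position is
           ;; not shown in this post, so the standard xref marker stack
           ;; is used here, which lets M-, jump back.
           (cond ((null symbol)
                  ;; intern-soft found no such symbol (boundp of nil is t).
                  (message "Symbol not bound: %s" name))
                 ((fboundp symbol)
                  (xref-push-marker-stack)
                  (funcall search 'find-function-noselect symbol))
                 ((boundp symbol)
                  (xref-push-marker-stack)
                  (funcall search 'find-variable-noselect symbol))
                 (t
                  (message "Symbol not bound: %S" symbol)))))
        (t (message "No symbol at point"))))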