# Posts tagged "auld":

# sussmaniana

I'm back from ILC 09, slowly digesting everything i experienced there. This was my first Lisp conference and my first visit to MIT, a place marked in big red letters in the atlas of my private mythology. And it wasn't only about places: suddenly realizing that you're sitting next to Richard Greenblatt, or enjoying Gerry Sussman's talks in the flesh, was quite an experience, with an almost eerie feeling attached to it.

There's a problem in meeting your myths: real life is almost never up to the task of meeting one's idealizations. But this was not the case; i enjoyed every minute, not a tinge of disappointment to be felt. Lisp has been around for quite a while and its history is an important part of computer science's history. That history comes to life at the ILC, and you get a chance to talk with the people who were there back in the day, the people you read about in books… i've wished many times i was there, and these days in Cambridge i've been as close to those halcyon days as i can expect to ever be. Living history, what a feeling.

Had i to single out just one speaker, that'd have to be Gerry
Sussman. I just kept finding myself resonating in a deep way with his
thoughts. For instance, during the panel on the future of Lisp,
conversation revolved around how to keep the language popular and apt
for commercial applications. Gerry stepped in to point out that that was
all very well, but that we shouldn't forget that one of the key
aspects of programming languages is to what extent they allow us to
extend our problem-solving abilities by providing new ways of
expressing and talking about problems (as Dijkstra once said: /Lisp has
assisted a number of our most gifted fellow humans in thinking
previously impossible thoughts/). Dead on, if you ask me, although
unfortunately nobody seemed to have anything else to add, and the
debate returned to the far less interesting popularity issues (well,
yes, the conference wasn't *perfect* after all).

Next day we had a kind of tongue-in-cheek debate provocatively
entitled *Are macros a menace?*. Richard Gabriel was on the wrong side,
arguing that macros are akin to language design
and that he'd rather not suffer the consequences of letting your
average software engineer undertake such a complex task. Gerry's
intervention at this point made me nod again: if we cannot trust our
software engineers to proficiently use the tools of our trade, there
must be something wrong in the way we educate them; only those able to
judiciously use them should get a diploma, to begin with. That's
exactly how i felt during my period as a CS teacher, as i tried,
rather clumsily, to explain in one of my first rants in this blog. It
feels great to be in such good company.

Then there was this unplanned mini-talk on why MIT has replaced the
SICP-based introduction to programming with something about robots
using Python. You can read a nice synopsis of the reasons Sussman gave
in Andy's blog (together with summaries and comments of many other
talks). It was nice in a kind of sad way: at the end, while answering
a question, Gerry mentioned that this new computing world was not his,
and it wasn't one that he liked^{1}. 'But', he said, 'that's because
we're old farts'. Although i'm younger than him, i much prefer
the kind of world that gave rise to SICP, and spent the rest of the
evening feeling like a sad, old fart. I must say, however, that the
ideas SICP grew upon, that simple world where you built up complexity
out of small pieces and systems that you understood completely, have
much to offer to new generations. We should not dismiss them in the
name of modernity.

Besides those cameos, Gerry had two *official* appearances in the
regular conference programme, namely, one invited speech on robust
systems design and a lightning talk. (The latter were 5-minute
presentations, with Guy Steele and his chronometer strictly ensuring
that that was actually the case, followed by a 2-minute Q&A
while the next speaker was setting up her laptop. This formula worked
great most of the time, forcing the presenters to squeeze their brains
in order to capture as much content, sense and, in most cases, fun as
possible in such a short time. We had a living confirmation that
working under severe constraints is a great creativity boost.)

In the invited speech, we had the opportunity of hearing more about Gerry's ideas on what makes a system robust. I say 'more' because he made public a paper on the subject some time ago, and, actually, his ideas on these issues can be traced back to, for instance, the SICP video lectures, where he already lays out the general strategy to tackle the problem: in order to make a system robust, you don't want to solve a strict, narrowly specified problem, but a whole family of them (or, if you have a very crisp specification, a class of problems in its neighbourhood). That way, when the problem to be solved varies in small ways, your whole solution accommodates the new situation by small variations. The solution is not brittle. To attain such flexible behaviour, we need to build our system upon components that are lenient on the inputs they accept and as sharp as possible in the outputs they produce. Ways to attain the above goals include metalinguistic abstraction (creating a language tailored to the problem domain), use of generic interfaces, degeneracy, or non-deterministic search in the solution space.

Generic functions were nicely demonstrated with examples from the library used in SICM (and the delicious Functional Differential Geometry). We got the warning that using generics this way is dangerous. But they're powerful, and one needs powerful methods to solve challenging problems; one needs to know what one's doing, but that's part of what we're supposed to be good at, right? Sussman kept smiling and saying 'beware, this is a very dangerous thing to do'. There was also an almost off-the-record comment to the effect that one of the missed opportunities in Lisp's design was its not defining all its primitive forms as generics (definitely turning it into the most dangerous language around).
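Just to fix ideas, here is the flavour of such a generic operation in a quick Python sketch of my own (the names, and the language, are mine, not the Scheme generics of the SICM library): one operator, open to extension to new argument types after the fact, danger included.

```python
# A sketch of generic dispatch: Python's functools.singledispatch standing
# in for the Scheme generics demoed in the talk; 'square' is a made-up example.
from functools import singledispatch

@singledispatch
def square(x):
    # default behaviour: plain numbers
    return x * x

@square.register(list)
def _(v):
    # extend 'square' to vectors, elementwise, without touching the default
    return [square(e) for e in v]

@square.register(str)
def _(s):
    # ...and, more dangerously, to whatever type we please
    return s + s

print(square(3))       # 9
print(square([1, 2]))  # [1, 4]
print(square("ab"))    # abab
```

The danger Sussman kept smiling about is visible in the last case: nothing stops an extension from giving the operator a meaning the rest of the system never expected.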

Degeneracy (using completely different ways for computing the same result) was illustrated, as is much of Sussman's thinking on robust programming, by examples drawn from biological systems (in this talk, it was frogs most of the time). You can find many other examples of this sort of parallelism between computing and biological systems in the paper linked above, a line of thought that i find mixes well with the handful of metaphors i use to reason about my job as a programmer. In particular, it sort of connects with my bias towards living systems such as Lisp's or Smalltalk's, where one is evolving the program more than designing and implementing it; which in turn mixes well with the ideas of latent typing and late binding present in those highly dynamic environments (Self, Factor, and APL (as demonstrated in this game of life video (unintended pun intended)) are in the same league). Much more than with beautiful but extremely rigid ones based on static typing, such as Haskell's. (That's why i keep coming back to Lisp after my Haskell excursions, or why i find R6RS so disappointing. Or, if you'll pardon my keeping on mixing metaphors, why i prefer healing to practising autopsies.)

Another avenue to flexibility mentioned in the talk is constraint propagation networks, in which multiple sources contribute to defining the values of each state variable. That way you get the possibility of partially defined values, which can nonetheless be useful by themselves, depending on the computation you're performing. Propagator networks also work as additive computation machines able to refine coarse inputs into correct solutions for problems specified as a set of constraints. One of Sussman's students, Alexey Radul, is actively working on propagators, building on work initiated back in the day by Guy Steele. You can find an extensive report and nice, working Scheme code here.
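To give a taste of the idea, here is a minimal sketch of my own in Python (nothing to do with Radul's actual Scheme code; all the names are invented): cells accumulate content, propagators fire when their inputs become available, and a constraint can therefore run in whichever direction the data arrives.

```python
# Minimal propagator network: cells hold (possibly absent) values and notify
# attached propagators whenever they gain content.

class Cell:
    def __init__(self, name):
        self.name, self.content, self.watchers = name, None, []

    def add_content(self, value):
        if self.content is None:
            self.content = value
            for watcher in self.watchers:
                watcher()          # wake up every propagator reading this cell
        elif self.content != value:
            raise ValueError(f"contradiction in cell {self.name}")

def propagator(inputs, output, fn):
    def run():
        if all(cell.content is not None for cell in inputs):
            output.add_content(fn(*(cell.content for cell in inputs)))
    for cell in inputs:
        cell.watchers.append(run)
    run()

# A bidirectional Fahrenheit/Celsius constraint: fill in either cell and the
# network computes the other (and checks consistency on the way back).
fahrenheit, celsius = Cell("F"), Cell("C")
propagator([fahrenheit], celsius, lambda f: (f - 32) * 5 / 9)
propagator([celsius], fahrenheit, lambda c: c * 9 / 5 + 32)

celsius.add_content(100)
print(fahrenheit.content)  # 212.0
```

Note how nothing in the network says which cell is the input: the computation is additive, driven by whatever information shows up.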

Finally, Gerry gave a lightning talk with yet another piece of food for thought. The gist of it was drawing our attention to the possibility of exploiting a posited parallelism between the theory and methods used to solve differential equations on the one hand, and programs on the other. There's a way of approaching the solution of a differential equation that is, if you will, algebraic in nature: one manipulates expressions algebraically to simplify them and eventually obtain a closed form solution, or, if that's not possible, creates numerical approximations to evolve the boundary conditions in the state space as a function of discrete time steps. You end up that way with something that works as a solution, but, most of the time, without a deep understanding of the traits that make it a solution: in the spirit of the robust design ideas sketched before, we should probably be asking for more qualitative information about how solutions behave as we change the boundary or initial conditions of our problem. As it happens, mathematicians have a way of analyzing the behaviour of solutions to differential equations by studying their Poincaré maps and sections, which are views into the orbits followed by the solutions in their state space. Many times, you don't have to solve the differential equation exactly to predict its qualitative behaviour (e.g., is it bounded? is it periodic?) in state space, and to get insight into how it changes in the presence of perturbations. The analogy with computing processes is clear: most of the time, we confine our efforts to finding, so to speak, algebraic solutions to our computing problems: the program-as-text is the analog of the process of finding an exact formula for the solution of a differential equation. What we're missing, according to Sussman, is a way of reasoning about the qualitative features of our programs à la Poincaré, i.e., a way of reasoning about programs in the state space of their outputs, beyond the mechanistic algorithmic details.
Gerry admitted that he didn't know how or in what form this kind of reasoning would proceed, only that his hunch is that it must be possible, and exhorted us to find the way.
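A toy example of the kind of qualitative, state-space reasoning he seems to be after (my reading of the talk, not an example he gave): for the harmonic oscillator one can read off boundedness and periodicity without ever writing down the solution, just by noticing a conserved quantity:

```latex
\ddot x = -x
\quad\Longrightarrow\quad
\frac{d}{dt}\Big(\tfrac{1}{2}\dot x^2 + \tfrac{1}{2}x^2\Big)
  = \dot x\,\ddot x + x\,\dot x
  = \dot x\,(\ddot x + x) = 0,
```

so every orbit lies on a circle $\dot x^2 + x^2 = 2E$ in the $(x, \dot x)$ state space: bounded and periodic, and a small change in the initial conditions just moves us to a nearby circle. That is the sort of conclusion a Poincaré-style analysis delivers without an explicit formula for $x(t)$.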

ILC09 would have been worth it for those talks alone, but there was much more: don't miss the next one!

## Footnotes:

^{1}

As rightly pointed out by Dan Weinreb in his comment below, Sussman endorses the changes in the new curriculum, though. His post on this issue is worth reading too.

This is a comment by Mark Miller that started the discussion:

I read a few articles on Sussman’s speech on why MIT switched to Python, and the sense I get is, as you said, that he understands the reasoning behind the switch, but he doesn’t endorse it. What’s sad to me is he takes himself out of the picture (“I’m just an old fart”), rather than as an authority with something valuable to say about it that people should hear.

I agree with you about how your myths never measure up to reality. I had the opportunity to meet Alan Kay at the Rebooting Computing summit held in January. I expected to be all giddy to meet him. Instead I felt very humble, but gratified. When he was talking with others about CS education he said something similar to Sussman with regard to who should be given diplomas, even more so who should be allowed to teach. I forget if it was at the Summit, but I’ve heard him quote Alan Turing who said that, “If you’re not satisfied with the computer you have, you can create one to your liking,” or something like that. Kay considers it essential that every CS graduate be able to write their own programming language, because he considered this to be like creating your own computer in a virtual sense. He also said that if any CS professor can’t do this they should be fired.

I understand what Sussman said about the reasons for the decision to make the switch. If I were in the CS department I’d feel uncomfortable about it. The reason being that it feels like MIT is slouching towards the undergrad CS education I had from a lesser known university, which was largely about algorithm analysis and optimization. This sort of education is perfect for doing what he describes: analyzing what already exists, and then piecing the best parts together into a cohesive whole in a performant manner. The problem is it doesn’t get at the lower level ideas. Who’s going to create the new APIs, operating systems, and languages? Certainly not the kind of people who had the CS education I had. It seems like we’re voluntarily taking ourselves out of the realm of computing innovation. “That’ll be done elsewhere.” Gee, I hope not!

and Daniel's answer:

@Mark: Gerry Sussman was quite clear that he endorses the new curriculum, as does Hal Abelson. What he said (the “we’re old farts” comment) was that we may not like the fact that the way engineers work has changed in this way, but it has in fact changed, and it would be a big mistake to hide our heads in the sand and pretend it hasn’t, no matter what we like.

Also, please keep in mind that the change is only the freshman core courses. The rest of the CS curriculum is the same as it has been for a long time, and it’s in those classes that we all learned about operating systems and new languages.

My son’s best friend’s older sister is a senior in CS at MIT now, and I’ve been hearing all about her experiences. It’s as if she’s following in my footsteps, which I find very entertaining. Of course they’re not my very own footsteps, but those of all the CS majors in the last thirty years. Of course the details have been updated; the new reading list for 6.033 is modern and very relevant.

6.033 is probably the deepest undergraduate CS course, dealing with issues that are leading-edge and often not entirely resolved yet. She sent me email when she was given a 6.033 paper of which I was one of the co-authors, namely our CACM paper about ObjectStore. Wow, I myself am on the 6.033 reading list — what an honor! Brag, brag. (It’s nothing to do with Lisp, by the way.)

# the ghost in the lisp machine

A friend of mine likes to say that Emacs fills our yearning for a Lisp Machine. I tend to agree with him: Emacs is not just an editor, but a fully integrated environment where you can perform virtually any imaginable task; and, most importantly, the inner workings of the system are open for you to explore and extend. Using, for extra fun, Lisp. No, i don't think that Elisp is the nicest Lisp incarnation around, but it is far better than, say, C, and i still prefer it to other scripting languages. Moreover, the awesome range of libraries at your disposal makes up for many of the deficiencies in the language.

Living in Emacs is addictive. Imagine an operating system where you
can switch from writing code to browsing the web or chatting without
leaving a consistent environment, with the same set of commands and
shortcuts. Imagine a set of integrated applications where data is
seamlessly shared, where any single functionality can be tweaked,
extended and adapted to your particular needs. Where everything is
*easily* scriptable. Imagine, in addition, that the environment provides
powerful and complete interactive self-documentation facilities with
which the user can find out what is available. I have yet to find an
operating system providing such an integrated environment. Not even
Mac OS X, where AppleScript support is often lacking and system
services are underused.

Of course, the key ingredient here is Emacs' extensibility. Far from being an afterthought or simply one of its features, extensibility is the central aspect of Emacs' architecture. Actually, the whole point of this post is to recommend that you read Richard Stallman's 1981 essay EMACS: The Extensible, Customizable Display Editor, which explains much better than i could the strong points of Emacs' design, i.e., those traits that make Emacs more, much more, than just an editor. From the horse's mouth:

Extensibility means that the user can add new editing commands or change old ones to fit his editing needs, while he is editing. EMACS is written in a modular fashion, composed of many separate and independent functions. The user extends EMACS by adding or replacing functions, writing their definitions in the same language that was used to write the original EMACS system. We will explain below why this is the only method of extension which is practical in use: others are theoretically equally good but discourage use, or discourage nontrivial use. […] User customization helps in another, subtler way, by making the whole user community into a breeding and testing ground for new ideas. Users think of small changes, try them, and give them to other users–if an idea becomes popular, it can be incorporated into the core system. When we poll users on suggested changes, they can respond on the basis of actual experience rather than thought experiments.

The article goes on explaining the organization of the Emacs system, how it depends on its interpreter, Elisp's main features and how built-in self-documentation is provided. Also interesting is the list of related systems at the end of the essay: Lisp machines, LOGO, MacLisp and Smalltalk. We're definitely in good company!

# geometrically speaking

While i was a full-time physics and maths student, i seldom, if ever,
thought of proving anything using a diagram, or any kind of
non-algebraic method, for that matter. One could make a couple of
drawings every now and then to help understanding, but that was
all. Not even after learning differential geometry did my view
change. As a matter of fact, with the emphasis on (and the beauty of)
abstract representations (as in abstract tensor notations), using
drawings of surfaces embedded in Euclidean space felt like
cheating. To make things even worse, my first serious physics book had
been Landau and Lifshitz's *Classical Field Theory*, where even *words*
are scarce, let alone drawings or diagrammatic reasoning ^{1}. In a
nutshell, i would have felt at home reading Lagrange's introduction to
his *Mécanique analytique*^{2}:

No figures will be found in this work. The methods i set forth require neither constructions nor geometrical or mechanical arguments, but only algebraic operations, subject to a regular and uniform procedure.

## Counting squares

I'm stealing the quote above from a talk entitled
*Proofs and Pictures*^{3}, which started me re-thinking about diagrams
in physics (and maths) in the first place. It was given at the
Perimeter Institute by James Brown, a professor of Philosophy of
Science at the University of Toronto. In this fun talk, professor
Brown explores the use of geometrical reasoning in maths and physics
as a means of actually *proving* results. Some simple but instructive
(and, to me, somewhat surprising and definitely amusing) examples of
such "proving by diagrams" are given in the figure on the left (click
to enlarge), which shows how getting general formulas for arithmetic
and geometric sums may be as easy as counting squares. I'm giving away
just two of them, so that you can try your hand with the other two and
have a little fun (you can also try to invent your own, maybe going to
3- or even n-dimensional cubes, in which case, please, don't forget to
post your discoveries below! :)). Although elementary, these *proofs*
are intriguing: would you accept them as such? Brown argues that we
should, since they can be used to show the validity of the induction step
in the usual algebraic proofs. I'm not sure i buy the argument, but
it's a very interesting one.
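Spelling out the algebra behind one classic picture of the genre (which may or may not be among the ones in the figure): two unit-square staircases of height $n$, one of them flipped, fit together exactly into an $n \times (n+1)$ rectangle, so counting squares in two ways gives

```latex
2\,(1 + 2 + \cdots + n) = n\,(n+1)
\quad\Longrightarrow\quad
\sum_{k=1}^{n} k = \frac{n(n+1)}{2}.
```

The picture carries the whole proof; the formula just transcribes it.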

## Penguins and lollypops

Turning our attention to physics, probably the most famous diagrams in
the field are Feynman's. As i'm sure you know, they offer a convenient
notation for manipulating terms in QED's perturbative
expansions. Taken at face value, or, one might say, analytically, they
represent just algebraic combinations of functions (propagators)
entering a power series expansion in a small parameter (the
interaction coupling constant, alpha). But they're usually *interpreted*
as providing the actual physical mechanism for the interaction of real
particles by means of exchanges of *virtual*, unobservable
photons. Albeit intuitive and appealing, this interpretation has
always bothered me. After reading about it in popular science books, i
expected QED to be somehow based on photon exchanges from
scratch. Instead, what one has is a principle of least action which
leads to differential equations unsolvable in exact analytical form.
Then, when calculating an approximate solution to a scattering problem
using a power series, one obtains (the analytical equivalent) of
Feynman diagrams and interprets them, so to speak, after the fact. I
would somehow feel more comfortable if the process were the other way
around: start with the (supposedly) physical underlying process (the
photon exchange) and derive the scattering amplitude. Each Feynman
diagram would then represent an actually possible scenario, in the
same sense that an electron choosing one slit in the two-slit
experiment is possible: one can break the superposition and observe
the electron in its way through the slit. But this is of course
impossible: virtual photons are unobservable, if only because they
travel faster than light and violate energy conservation. To add to my
uneasiness, a plain Feynman series leads to divergences to be cured,
non-diagrammatically, by renormalisation. Yet, everyone since Feynman
discusses this spooky photon ping-pong as the right interpretation^{4},
so probably i'm just showing off my lack of understanding! And,
besides, one could arguably point to measurable vacuum polarisation
effects like Casimir's as an experimental proof of the existence of
virtual particles (see for instance this recent, accessible account at
PR Focus). Or one could even see the situation as a derivation of the
interaction's underlying mechanism from first principles, a stunning
testament to their power^{5}. At any rate, and especially if one
accepts the mainstream interpretation, Feynman diagrams appear as a
good example of how diagrammatic tools can be more than just a
picture, and not only in mathematics. For more on Feynman diagrams and
pointers to further reading, see their Wikipedia entry, or get
Diagrammar, a CERN report by 't Hooft and Veltman with all the gory
details and a deliciously retro (as in written in 1973 using a
typewriter) flavour.

Before leaving the subject of Feynman diagrams, let me mention two
bits of diagrammatic folklore stolen from Peter Woit's latest
book. Naturally enough, recurring diagrams have got pet names over the
years. The first one seems to have been the *tadpole* (for a diagram
shaped, well, like a tadpole), coined by Sidney Coleman and resignedly
accepted by the Physical Review editors after he proposed lollypop and
*spermion* as alternatives. The second anecdote involves a diagram
(depicted above) known as *penguin* since Melissa Franklin won a darts
match against John Ellis: Tommaso Dorigo has recently recounted the story
in his blog.

## Tensors and birds

Roger Penrose's thinking is nothing if not geometrical, and it comes as no
surprise that he has made many a contribution to the *physics by
drawing* camp. Every decent course on General Relativity touches on
conformal diagrams^{6}, a nifty method envisioned by Penrose and Brandon
Carter (back in the sixties) to bring infinity back into your drawing
board. The trick consists in scaling your metric by a global function
that vanishes quickly enough as your original coordinates go to
infinity. Such a scaling is known as a conformal transformation, and has
the virtue of preserving angles; in particular, null geodesics are
mapped into null geodesics and, therefore, the causal structure
(represented by null cones) is untouched. While beautiful and handy, i
think that conformal diagrams do not add anything really new from a
computational standpoint (as Feynman diagrams do), let alone serving as
the basis for actual proofs.
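In symbols, the whole trick is a single rescaling of the metric:

```latex
\tilde g_{ab} = \Omega^2\, g_{ab} \qquad (\Omega > 0),
\qquad\text{so}\qquad
\tilde g_{ab}\, v^a v^b = \Omega^2\, g_{ab}\, v^a v^b,
```

and a vector is null for $\tilde g$ exactly when it is null for $g$: the light cones, and with them the causal structure, survive the rescaling. Choosing $\Omega \to 0$ fast enough at infinity then squeezes infinity into a finite coordinate range of the new metric.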

More interesting for our current musings is Penrose's graphical tensor
notation. Tensor indexes (especially in their abstract flavour, also
introduced by Penrose) are a quite convenient housekeeping device,
ensuring almost automatically the consistency of your equations and
even (once one has a bit of practice with them) suggesting their form^{7}.
But, convenient as they are, indexes seem to be confusing for
geometrical minds like Penrose's, who some fifty years ago devised a
pictorial representation for tensor equations^{8}.

As you can see in the figure, the idea is simple: choose a closed
polygon to represent the kernel letter of each tensor, and add an
upwards leg for each contravariant index, and a downwards one for each
covariant index. Index contraction is represented by joining the
respective legs. A wiggly horizontal line represents symmetrisation; a
straight one anti-symmetrisation. One can cross legs to indicate index
shuffling. The metric gets no kernel figure (it's just an arch), so
that contractions of indexes in the same tensor are easily depicted,
and raising and lowering indexes amounts to twisting the requisite leg up
or down. To indicate covariant differentiation, circle the tensor
being differentiated and add the corresponding downwards (covariant)
leg. And so on and so forth. Note also that the commutative and
associative laws of tensor multiplication allow you to use any
two-dimensional arrangement of symbols that suits you, which helps in
compactifying expressions. Penrose explains the many details and
twists of the notation in *The Road to Reality* and in his (and
Rindler's) Spinors and Space-time I, where you'll find extensions to
deal graphically also with spinors and twistors. According to the
latter,

The notation has been found very useful in practice as it greatly simplifies the appearance of complicated tensor or spinor equations, the various interrelations expressed being discernable at a glance. Unfortunately the notation seems to be of value mainly for private calculations because it cannot be printed in the normal way.

Besides the (not so obvious nowadays) difficulty mentioned above, i
guess that the main hurdle in adopting Penrose's notation is habit.
After many years using indexes, my algebraic mind seldom finds
equations confusing because of their indexes. But after a little
practice it becomes easier, and i'd say that people who *see* equations
will find it quite natural after a little while^{9}. I don't know
how popular Penrose graphics are among physicists for private use, but
there's many an example of their application and extension to related
fields. A few years after its introduction, the notation was
rediscovered by Predrag Cvitanovic, who used a variation of it in an
article on group theory and Feynman diagrams. More concretely,
Cvitanovic uses diagrams similar to Penrose's to represent the
structure constants of simple groups in the context of
non-abelian gauge theories, interestingly linking them with Feynman
diagrams (and closing a loop in this article!). Later on, he would use
the notation very extensively in his on-line book on Group Theory,
where the diagrams go by the name of *bird-tracks*. In a nutshell, the
book is devoted to answering, in Cvitanovic's words, a *simple* question:

"On planet Z, mesons consist of quarks and antiquarks, but baryons contain 3 quarks in a symmetric color combination. What is the color group?" If you find the particle physics jargon distracting, here is another way to posing the same question: "Classical Lie groups preserve bilinear vector norms. What Lie groups preserve trilinear, quadrilinear, and higher order invariants?"

From here, an amazing journey through the theory of Lie groups and
algebras ensues, a journey conducted almost exclusively by diagrams.
For, notably, Cvitanovic uses his bird-tracks (as mentioned, a very
evolved kind of Feynman diagrams) to actually *derive* his results. We
have here physics (and maths) by diagrams for real, actually replacing
algebraic reasoning (and, incidentally, a proof that Penrose's
reservations about his notation not being apt for publication are
unfounded nowadays; i wonder how Cvitanovic draws his diagrams).

Before leaving the subject, let me mention a couple more works inspired by Penrose's diagrammatic notation. Yves Lafont has greatly extended it and carefully analysed its application to mathematical problems in the context of category theory and term rewriting systems. If you're versed in the field, or simply curious, take a look at his articles Algebra and Geometry of Rewriting (PS) and Equational Reasoning With 2-Dimensional Diagrams, where Yves explores two-dimensional diagrams à la Penrose with an eye to (possibly automatic and computer-aided) derivations much in the spirit of Cvitanovic. And, turning back to physics, if there's a theory prone to diagrammatic reasoning it must be Loop Quantum Gravity, where the basic constituents are graphs and their transformations. Arguably, LQG is the most fundamental example discussed so far of graphical reasoning applied to physics, for here graphs (and their combinations in spin foams, an evolution of another Penrose invention, spin networks) do stand for themselves, as opposed to representing some underlying algebraic mathematical entity. Wandering into the marvels of LQG would carry us too far afield, so i'll just point out that Rovelli, Smolin and friends use not only Penrose's spin networks but, on occasion, also the graphical tensor notation we've been reviewing; see for instance their seminal paper Spin Networks and Quantum Gravity, where Rovelli and Smolin presented their famous derivation of exact solutions to the Wheeler-DeWitt equation. The notable thing is, again, the fact that graphical notation is key in many a derivation, and cannot be seen as just an aid to represent some calculations.

## Kindergarten categories

Our final example of physics by diagrams comes from the category theory-inspired view of Quantum Mechanics invented by Samson Abramsky, who has managed to do "quantum mechanics using only pictures of lines, squares, triangles and diamonds". This beautiful notation (or picture language, as its authors call it) is nicely explained in Bob Coecke's Kindergarten Quantum Mechanics, a very pedagogical set of lecture notes where it is applied to the problem of quantum teleportation. Bob's thesis is that teleportation was not discovered until the 90's (despite its being a relatively straightforward result in QM) due to the inadequacy of the low-level mathematical language commonly used to describe Hilbert spaces. Had lines, squares, triangles and diamonds been used from the beginning, teleportation would have followed almost immediately. Or so thinks Bob: go take a look at his article and see what your take is. In any case, its more than sixty full-color diagrams, used instead of boring algebraic formulae, make for a fun reading (or, should i say, viewing). By the way, don't let the mention of category theory put you off: only very basic ideas (explained in the lecture notes) are needed, if at all, and actually the author's enthusiasm goes as far as making the bold claim that this new graphical formalism could be taught in kindergarten! Maybe that's the gist, since i, for one, find the notation hard to follow, undoubtedly due to my old-school, algebraic upbringing. Just to give you an idea of what this preschool notation looks like, and to close this long post as it deserves (i.e., with a diagram), here is what the teleportation protocol (including a correctness proof) looks like:

## Footnotes:

^{1}

My copy (Spanish translation) of the fifth edition of L&L's book has 500 pages and just 22 figures!

^{2}

The link above points to Volume 11 of the collection at Oeuvres de Lagrange, a site that contains what seems to be the complete Lagrange corpus, conveniently scanned and downloadable too.

^{3}

I would give you a direct link, did it exist. Unfortunately, PI's website is not up to the quality of their other activities. You'll find it by browsing to their Public Lectures Series and from there to page 2 (or search for James Brown). Another very unfortunate circumstance is that the videos are only available to those of you not using *weird as in freedom* operating systems :-(.

^{4}

That's at least my impression. Penrose, for instance, advocates
for their reality in his *road*. The subject is however controversial
enough to warrant the existence of monographs like the recent Drawing
theories apart, by David Kaiser (which i cannot comment on, since i've
just added it to my wish list).

^{5}

But i find this argument hard to swallow. Think, for instance, of the interpretation of antiparticles as particles travelling backwards in time: it also follows naturally (for some definition of natural) from perturbative series and/or their diagrams, but it is not as easily accepted as the existence of virtual photons. One wonders, where's the limit?

^{6}

If you don't have your favourite textbook at hand (Hawking and Ellis being mine when it comes to anything related to causal structure), you can find a pretty good introduction on-line in this chapter of Sean Carroll's lecture notes.

^{7}

There are only so many ways of combining indexes, and if you know which are the free ones in, say, your LHS, and the tensors entering the RHS and their general properties (e.g. symmetries), it's often easy to see how their indexes should be combined. It reminds me, in a way, of dimensional reasoning, where knowing the target units and the ingredients often gives a quite accurate clue of how to combine them.

^{8}

It was introduced in a chapter of the book Combinatorial
Mathematics and its Applications (Academic Press, London, 1971),
entitled *Application of Negative Dimensional Tensors*. But Penrose had
been using it (according to this letter to Cvitanovic (PDF))
since 1952.

^{9}

An interesting (and not too far-fetched) software project would be to write a Penrose diagram editor, possibly with support for tablet input devices. Such a tool would also probably solve the publication issue. In an ideal world, one would use a stylus to draw equations, which would get automatically imported as nice diagrams, regular tensor equations with indexes, or both. Any takers? ;-)

# programmers go bananas

## Introduction: lists galore

I learned programming backwards, plunging right on into C and, shortly after, C++ and Java from the very beginning. I was knee deep in complex data structures, pointers and abstruse template syntax in no time. And the more complex it all felt, the more i thought i was learning. Of course, i was clueless.

Reading SICP and learning about functional programming changed it
all. There were many occasions for revelation and awe, but one of my
most vivid recollections of that time is my gradual discovery of the
power of simplicity. At about half way into SICP i realised in wonder
that that beautiful and powerful world was apparently being
constructed out of extremely simple pieces. Basically, everything was
a list. Of course there were other important ingredients, like
procedures as first-class objects, but the humble list was about the
only data structure to be seen. After mulling on it for a little bit,
i saw where lists draw their power from: recursion. As you know, lists
are recursively defined data types: a list is either the empty list or
an element (its *head*) followed by another list (its *tail*):

list = []

list = a : list

where i'm borrowing Haskell's notation for the empty list ([]) and the list constructor (:), also known by lispers as () and cons. So that was the trick, i thought: lists have recursion built-in, so to speak, and once you've read a little bit about functional programming you don't need to be sold on the power and beauty of recursive programs.
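The same recursive shape translates directly into recursive procedures in just about any language; here is a minimal sketch in Python (an illustration of mine, not taken from any of the texts cited), with the empty list spelled `None` and cons cells as `(head, tail)` tuples:

```python
# A list is either the empty list (here None) or a pair (head, tail),
# mirroring Haskell's [] and (:) constructors.
def cons(head, tail):
    return (head, tail)

def length(lst):
    # The recursion follows the shape of the data: one case per constructor.
    if lst is None:
        return 0
    return 1 + length(lst[1])

def to_python_list(lst):
    # Flatten our cons cells into a native Python list, again by recursion.
    return [] if lst is None else [lst[0]] + to_python_list(lst[1])

one_two_three = cons(1, cons(2, cons(3, None)))
```

Functions over such lists need no loops: the structure of the data dictates the structure of the code.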

It is often the case that powerful and beautiful yet simple constructs have a solid mathematical foundation, and only when you grasp it do you really realize how powerful, beautiful and amazingly simple that innocent-looking construct is. Lists, and recursive operations on them, are an excellent case in point. But the path connecting them to their mathematical underpinnings is a long and winding one, which lies in the realm of Category Theory.

I first became acquainted with the relationship between categories and recursive programming reading Functional Programming with Bananas, Lenses, Envelopes and Barbed Wire, by Erik Meijer, Maarten Fokkinga and Ross Paterson. Albeit very enjoyable, this paper presupposes a high degree of mathematical sophistication on the reader's side. I will try in this article to give you a simplified overview of the concepts involved, including Category Theory, its application to programming languages and what funny names like catamorphism, anamorphism or lambda-lifting have to do with your everyday list manipulations. Of course, i'll be only scratching the surface: interspersed links and the Further reading section provide pointers to more in-depth explorations of this wonderland.

## Categorical interlude

Categories are (relatively) simple constructs. A category consists of
a set **O** of objects, and a set **A** of arrows between elements of
**O**. Arrows are composable: if there's an arrow from *a* to *b*, and another
one from *b* to *c*, there must be an arrow from *a* to *c* (where *a*, *b* and *c*
are elements of **O**). Besides, composition is *associative*: if you have
arrows from *a* to *b*, *b* to *c*, and *c* to *d*, you can go from *a* to *d* via two
different paths (namely, first from *a* to *c* and then from *c* to *d*, or
first from *a* to *b* and then from *b* to *d*), and both paths yield the
same arrow. Finally, for each element *a* in **O** there's an *identity*
arrow which goes from *a* to itself, such that following this arrow
changes nothing. These properties are better visualized with a diagram
(or a bit of mathematical notation), as shown in the image on the right.

A category captures a *mathematical world* of objects and their
relationships. The canonical example of a category is **Set**, which
contains, as objects, (finite) sets and, as arrows, (total)
functions between them. But categories go far beyond modeling
sets. For instance, one can define a category whose objects are
natural numbers, and the 'arrows' are provided by the relation
"less or equal" (that is, we say that there is an arrow joining
two numbers *a* and *b* if *a* is less than or equal to *b*). What we are
trying to do with such a definition is to somehow capture the
essence of ordered sets: not only integers are ordered, but also
dates, lemmings on a row, a rock's trajectory or the types of the
Smalltalk class hierarchy. In order to abstract what all those
categories have in common we need a way to go from one category to
another preserving the shared structure in the process. We need
what the mathematicians call an *isomorphism*, which is the
technically precise manner of stating that two systems are, in a
deep sense, analogous; this searching for commonality amounts to
looking for concepts or abstractions, which is what mathematics
and (good) programming are all about (and, arguably, intelligence
itself, if you are to believe, for instance, Douglas Hofstadter's
ideas).
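The "less or equal" example can even be spelled out in a few lines of executable code. In this sketch (a Python improvisation of mine), an arrow from *a* to *b* is simply the pair `(a, b)`, composition is transitivity, and identities come from reflexivity:

```python
# Arrows of the "less or equal" category: the pair (a, b) exists iff a <= b.
def arrow(a, b):
    assert a <= b, "no arrow unless a <= b"
    return (a, b)

def compose(g, f):
    # g after f: if f goes a -> b and g goes b -> c, the composite goes a -> c.
    assert f[1] == g[0], "arrows not composable"
    return (f[0], g[1])

def identity(a):
    # Reflexivity (a <= a) provides the identity arrow.
    return (a, a)

f = arrow(1, 3)
g = arrow(3, 7)
h = arrow(7, 10)
```

One can check by hand (or with the assertions below) that composition is associative and that identities change nothing, so the category laws hold.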

To boot, our definition of a category already contains the concept
of isomorphic objects. Think of an arrow from *a* to *b* as an
operation that transforms *a* into *b*. An arrow from *b* to *a* will make
the inverse transformation. If composing both transformations
gives you the identity, you are back to the very same object *a*,
and we say that *a* and *b* are isomorphic: you can transform one into
the other and back at will. In a deep sense, this concept captures
a generic way of expressing equality that pervades all maths: if
you're not afraid of a little bit of maths, Barry Mazur's essay
When is a thing equal to some other thing? is an excellent
introduction to Category Theory with an emphasis on the concept of
equality. Among many other things, you will learn how the familiar
natural numbers can be understood as a category, or how an object
is completely defined by the set of its transformations (and,
therefore, how to actually get rid of objects and talk only of
transformations; i know this is stretching and mixing metaphors
(if not plain silly), but this stress on arrows, as opposed to
objects, reminded me of Alan Kay's insistence on focusing on
messages rather than objects). Another introductory article with
emphasis on categories as a means to capture sameness is R. Brown
and T. Porter's Category Theory: an abstract setting for analogy
and comparison.

It is not only objects inside a category that can be transformed into each other.
We reveal the common structure of two disjoint categories by means of
a functor mapping across two categories. A functor consists of two
functions: one that maps each object of the first category to an
object in the second, and another one putting in correspondence arrows
in one category with arrows in the second. Besides, these functions
must preserve arrow composition. Let me spell this
mathematically. Consider two categories, C and C', with object sets **O**
and **O'** and arrow sets **A** and **A'**. A functor F mapping C to C' will
consist then of two functions (Fo, Fa); the first one taking elements
of **O** to elements of **O'**:

Fo: **O** -> **O'**

Fo(a) in **O'** for every a in **O**

and the second one taking arrows from **A** to arrows in **A'**:

Fa: **A** -> **A'**

Fa(**f**) in **A'** for every **f** in **A**

and such that, if **f** is an arrow from a to b in C, Fa(**f**) is an arrow
from Fo(a) to Fo(b) in C'. Moreover, we want following arrows in C
to be 'analogous' to following them in C', i.e., we demand that

Fa(**fg**) = Fa(**f**)Fa(**g**)

In the left hand side above, we are composing two arrows in C and then
going to C', while in the right hand side we first take each arrow to
C' and, afterwards, compose them in there. If C and C' have the same
structure, these two operations must be equivalent. Finally, F must
preserve identities: if **i** is the identity arrow for an element a in **O**,
Fa(**i**) must be the identity arrow for Fo(a) in **O'**. The diagram on the
left shows a partial graph (i'm not drawing the identity arrows and
their mappings) of a simple functor between two categories, and ways
of going from an object a in the first category to an object x in the
second one which are equivalent thanks to the functor's properties.
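A concrete instance of these conditions, sticking to the "less or equal" category introduced earlier: any monotone function gives a functor from that category to itself. The sketch below (with arrows represented as pairs; the doubling map and all names are my own choice) checks that arrows, identities and composites are all preserved:

```python
# In the "less or equal" category, an arrow a -> b is just the pair (a, b).
def compose(g, f):
    assert f[1] == g[0], "arrows not composable"
    return (f[0], g[1])

# Object part of the functor: a monotone map on numbers.
def Fo(a):
    return 2 * a

# Arrow part: send (a, b) to (Fo(a), Fo(b)). Monotonicity guarantees the
# image is again a valid arrow, i.e. Fo(a) <= Fo(b).
def Fa(f):
    a, b = f
    assert Fo(a) <= Fo(b)
    return (Fo(a), Fo(b))

f = (1, 3)
g = (3, 7)
```

The assertions below verify Fa(**fg**) = Fa(**f**)Fa(**g**) and the preservation of identities for these sample arrows.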

As you can see in this simple example, the functor gives us the ability of seeing the first category as a part of the second one. You get a category isomorphism in the same way as between objects, i.e., by demanding the existence of a second functor from C' to C (you can convince yourself that such a functor does not exist in our example, and, therefore, that the two categories in the diagram are not isomorphic).

You have probably guessed by now one nifty property of functors: they let us go meta and define a category whose objects are categories and whose arrows are functors. Actually, Eilenberg and MacLane's seminal paper General theory of natural equivalences used functors and categories of categories to introduce categories for the first time (natural transformations are structure-preserving maps between functors: this Wikipedia article gives an excellent overview of them).

But enough maths for now: it is high time to show you how these rather abstract concepts find their place in our main interest, programming languages.

## Categories and programming languages

About the only similarity between C and Haskell programming is that one spends a lot of time typing ASCII arrows. But of course, Haskell's are much more interesting: you use them to declare the type of a function, as in

floor:: Real -> Int

The above stanza declares a function that takes an argument of type Real and returns an integer. In general, a function taking a single argument is declared in Haskell following the pattern

fun:: a -> b

where *a* and *b* are types. Does this ring a bell? Sure it does: if we
identify Haskell's arrows with categorical ones, the language types
could be the objects of a category. As we have seen, we need identities

id:: a -> a

id x = x

and arrow composition, which in Haskell is denoted by a dot

f:: b -> c

g:: a -> b

fg:: a -> c

fg = f . g

Besides, associativity of arrow composition is ensured by Haskell's referential transparency (no side-effects: if you preserve referential transparency by writing side-effect free functions, the order in which you evaluate them won't matter): we've got our category. Of course, you don't need Haskell, or a statically typed language for that matter: any strongly typed programming language can be modelled as a category, using as objects its types and as arrows its functions of arity one. It just happens that Haskell's syntax is particularly convenient, but one can define function composition easily in any decent language; for instance, in Scheme one would have

(define (compose f g) (lambda (x) (f (g x))))
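A Python rendition of the same definition, together with a quick check that composition is associative (the property our category needs), might look like this (a sketch of mine, not canonical code):

```python
def compose(f, g):
    # (compose f g) applies g first, then f, just like the Scheme version.
    return lambda x: f(g(x))

inc = lambda x: x + 1
double = lambda x: 2 * x
square = lambda x: x * x

# Associativity: f . (g . h) and (f . g) . h denote the same arrow.
lhs = compose(inc, compose(double, square))
rhs = compose(compose(inc, double), square)
```

Both `lhs` and `rhs` square their argument, double the result and add one, regardless of how the parentheses fall.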

Functions with more than one argument can be taken into the picture by means of currying: instead of writing a function of, say, 2 arguments:

(define (add x y) (+ x y))

(add 3 4)

you define a function which takes one argument (x) and returns a function which, in turn, takes one argument (y) and returns the final result:

(define (add x) (lambda (y) (+ x y)))

((add 3) 4)

Again, Haskell offers a pretty convenient syntax. In Haskell, you can define `add` as:

add x y = x + y

which gets assigned, when applied to integers, the following type:

add:: Int -> (Int -> Int)

that is, `add` is not a function from pairs of integers to integers, but a function that takes an integer and returns a function of type `Int -> Int`.

Finally, we can also deal with functions taking no arguments and constant values by introducing a special type, called unit or 1 (or void in C-ish), which has a unique value (spelled `()` in Haskell). Constants of our language (as, e.g., `True` or 43.23) are then represented by arrows from 1 to the constant's type; for instance, `True` is a `1 -> Boolean` arrow. The unit type is an example of what in category theory is known as a terminal object.
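This view of constants can be mimicked in code by representing each constant as a one-argument function from a dummy unit value; a tiny Python sketch (the names `unit`, `true_` and `pi_ish` are all mine):

```python
# The unit type has a single inhabitant; we borrow Python's None for it.
unit = None

# A "constant" is an arrow from unit to the constant's type:
true_ = lambda _: True      # plays the role of a 1 -> Boolean arrow
pi_ish = lambda _: 43.23    # plays the role of a 1 -> Float arrow

# Applying the arrow to the unique value of unit recovers the constant.
value = true_(unit)
```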

Now that we have successfully modelled our (functional) programming language as a category (call it C), we can use the tools of the theory to explore and reason about the language constructs and properties. For instance, functors will let me recover the original motivation of this post and explore lists and functions on them from the point of view of category theory. If our language provides the ability to create lists, its category will contain objects (types) of the 'list of' kind; e.g. `[Int]` for lists of integers, `[Boolean]` for lists of Booleans and so on. In fact, we can construct a new sub-category CL by considering list types as its objects and functions taking and returning lists as its arrows. For each type a we have a way of constructing a type, [a], in the sub-category, i.e., we have a map from objects in C to objects in CL. That's already half a functor: to complete it we need a map from functions in C to functions in CL. In other words, we need a way to transform a function acting on values of a given type into a function acting on lists of values of the same type. Using the notation of the previous section:

Fo(a) = [a]

Fa(f: a -> b) = f': [a] -> [b]

`Fa` is better known as `map` in most programming languages. We call the process of going from `f` to `f'` lifting (not to be confused with a related, but not identical, process known as lambda lifting), and it's usually pretty easy to write an operator that lifts a function to a new one in CL; for instance, in Scheme we would write:

(define (lift f) (lambda (lst) (map f lst)))

and for `lift` to truly define a functor, it must behave well with respect to function composition:

(lift (compose f g)) = (compose (lift f) (lift g))
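This equation is easy to test mechanically. The following Python sketch (a direct port of the `compose` and `lift` definitions above, with sample functions of my own choosing) checks it on concrete data:

```python
def compose(f, g):
    # Apply g first, then f, as in the Scheme definition.
    return lambda x: f(g(x))

def lift(f):
    # Lift a function on values to a function on lists of values.
    return lambda lst: [f(x) for x in lst]

inc = lambda x: x + 1
double = lambda x: 2 * x

xs = [1, 2, 3]
lhs = lift(compose(inc, double))(xs)         # lift of the composition
rhs = compose(lift(inc), lift(double))(xs)   # composition of the lifts
```

Both sides double each element and then add one to it, so the two lists coincide.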

We can convince ourselves that this property actually holds by means of a simple example. Consider the function `next`, which takes an integer to its successor; its lifting `(lift next)` will map a list of integers to a list of their successors. We can also define `prev` and `(lift prev)`, mapping (lists of) integers to (lists of) their predecessors. `(compose next prev)` is just the identity, and, therefore, `(lift (compose next prev))` is the identity too (with lifted signature). But we obtain the same function if we compose `(lift next)` and `(lift prev)` in CL, right? As before, there's nothing specific to Scheme in this discussion. Haskell even has a Functor type class capturing these ideas. The class defines a generic lift operation, called `fmap`, that actually generalizes our list lifting to arbitrary type constructors:

fmap :: (a -> b) -> (f a -> f b)

where `f a` is the new type constructed from `a`. In our previous discussion, `f a = [a]`, but if your language gives you a way of constructing, say, tuples, you can lift functions on given types to functions on tuples of those types, and repeat the process with any other type constructor at your disposal. The only conditions for it to be called a functor are that identities are mapped to identities and composition is preserved:

fmap id = id

fmap (p . q) = (fmap p) . (fmap q)
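As an example of a type constructor other than lists, consider homogeneous pairs: the corresponding `fmap` applies a function to both components. The Python sketch below (an improvisation of mine, not standard library code) checks both functor laws on sample values:

```python
def compose(f, g):
    return lambda x: f(g(x))

def fmap_pair(f):
    # The "pair functor": lift f to act componentwise on 2-tuples.
    return lambda p: (f(p[0]), f(p[1]))

identity = lambda x: x
p = lambda x: x + 1
q = lambda x: 3 * x

pair = (4, 5)
law1 = fmap_pair(identity)(pair)                      # fmap id = id
law2_lhs = fmap_pair(compose(p, q))(pair)             # fmap (p . q)
law2_rhs = compose(fmap_pair(p), fmap_pair(q))(pair)  # (fmap p) . (fmap q)
```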

I won't cover usage of type constructors (and their associated functors) other than lists, but will just mention a couple of them: monads, another paradigmatic one, beautifully (and categorically) discussed by Stefan Klinger in his Programmer's Guide to the IO Monad - Don't Panic (also discussed at LtU), and the creation of a dance and music library, for those of you looking for practical applications.

## To be continued…

Returning to lists, what the lifting and categorical description above buys us is a way to formalize our intuitions about list operations, and to transfer procedures and patterns on simple types to lists. In SICP's section on Sequences as conventional interfaces, you will find a hands-on, non-mathematical dissection of the basic building blocks into which any list operation can be decomposed: enumerations, accumulators, filters and maps. Our next step, following the bananas article i mentioned at the beginning, will be to use the language of category theory to provide a similar decomposition, but this time we will talk about catamorphisms, anamorphisms and similarly funny-named mathematical beasts. What the new approach will buy us is the ability to generalize our findings beyond the list domain and onto the so-called algebraic types. But this will be the theme of a forthcoming post. Stay tuned.

## Further reading

The best introductory text on Category Theory i've read is Conceptual Mathematics: A First Introduction to Categories by F. William Lawvere (one of the fathers of Category Theory) and Stephen Hoel Schanuel. It assumes no sophisticated mathematical background, yet it covers lots of ground. If you feel at home with maths, the best option is to learn from the horse's mouth and get a copy of Categories for the Working Mathematician, by Mac Lane.

The books above do not deal with applications to Computer Science, though. For that, the canonical reference is Benjamin Pierce's Basic Category Theory for Computer Scientists, but i find it too short and boring: a far better choice is, in my opinion, Barr and Wells' Category Theory and Computer Science. A reduced version of Barr and Wells' book is available online on the site for their course Introduction to Category Theory. They are also the authors of the freely available Toposes, Triples and Theories, which will teach you everything about monads, and then more. Maarten Fokkinga is the author of this 80-page Gentle Introduction to Category Theory, with a stress on the calculational and algorithmical aspects of the theory. Unless you have a good mathematical background, you should probably take *gentle* with a grain of salt.

Let me close by mentioning a couple of fun applications of Category Theory, for those of you that know a bit about it. Haskell programmers will like this implementation of fold and unfold (as a literate program) using F-(co)algebras and applied to automata creation, while those of you with a soft spot for Physics may be interested in John Baez's musings on Quantum Mechanics and Categories.

# beyond mainstream object-oriented programming

## Introduction

After a few scheming years, i had come to view objects as little more than poor man's closures. Rolling a simple (or not so simple) object system in Scheme is almost a textbook exercise. Once you've got statically scoped, first-class procedures, you don't need no built-in objects. That said, it is not that object-oriented programming is not useful; at least in my case, i find myself often implementing applications in terms of a collection of procedures acting on requisite data structures. But, if we restrict ourselves to single-dispatch object-oriented languages, i saw little reason to use any of them instead of my beloved Scheme.

Things started to change recently due to my discovering the pleasures of Smalltalk. First and foremost, it offers a truly empowering integrated environment to live and develop in. Second, if you're going to use objects, using the simplest, cleanest syntax will not hurt. Add to that some reading on the beautiful design principles underlying Smalltalk, and one begins to wonder if closures aren't, in fact, poor man's objects–or at least i do, whenever i fall in an object-oriented mood (i guess i'm not yet ready to reach satori).

But Scheme is not precisely an ugly or badly designed language, so i
needed some other reason to switch language gears for my OO
programming. I knew there's more than encapsulation or subtype
polymorphism in object-land from my readings on CLOS (the Common
Lisp Object System), or on Haskell's type classes (and its built-in
parametric polymorphism), but i was after something retaining
Smalltalk's elegance. And then i remembered that, when i was a
regular lurker in the Tunes project's mailing lists and IRC channel,
a couple of smart guys were implementing an OO language whose syntax
was smalltalkish. That language (which, if memory serves, started
life with the fun name *who me?*) has evolved during the last few
years into a quite usable programming environment named Slate,
started by Lee Salzman and currently developed and maintained by
Brian Rice.

I've been reading about Slate during the last few days, and decided to learn it. What motivated me was discovering how Slate goes beyond mainstream object-oriented programming by incorporating well-known (but hardly used) and really powerful paradigms. In short, Slate improves Smalltalk's single-dispatch model by introducing and combining two apparently incompatible technologies: multiple dispatch and prototype-based programming. To understand the whys and hows of Slate, there's hardly a better way than reading Lee Salzman's Prototypes with Multiple Dispatch. The following discussion is, basically, an elaboration of Lee's explanation on the limitations of mainstream OO languages, and how to avoid them with the aid of PMD.

*Note*: the images are thumbnails of this PDF file, with clickable
links inside.

## Fishes and sharks

Let's start by showing why on earth would you need anything beyond
Smalltalk's object system (or any of its *modern*
copycats). Consider a simple oceanographic ecosystem analyser,
which deals with (aquatic) Animals, Fishes and Sharks. These are
excellent candidates for class definitions, related by
inheritance. Moreover, we are after modeling those beasts'
behaviours and, in particular, their reactions when they encounter
each other: each time a Shark meets a Fish of another species, the
Shark will swallow the other Fish, while when a Shark meets another
Shark they will fight. As a result of such fighting, Sharks get
unhealthy, which regrettably complicates matters: wounded sharks
won't try to eat other fishes, and will swim away from other sharks
instead of fighting them. The image on the left provides a sketchy
representation of the code we need to model our zoo. Waters are
quickly getting muddled implementation-wise.

On the one hand, subtype polymorphism is not enough, since dispatch is based just on the object receiving the encounter message: we need, in addition, to take into account the argument's concrete type to implement the desired behaviour. This is a well-known issue in single-dispatch languages, whose cure is, of course, moving to multiple dispatch (see below). In particular, we want to avoid the need to modify existing classes whenever our hierarchy is extended.

On the other hand, varying state (exemplified here by the Shark's isHealthy instance variable) complicates the implementation logic. As we will see, prototype-based languages offer a way to factor out this additional complexity.

## Beyond single-dispatch

The need to adjust behaviour on the basis of the type of both a message receiver and its arguments arises frequently in practice. So frequently that a standard way of dealing with it has been christened the Visitor design pattern. The technique, also known as double-dispatch, is well known: you can see, for instance, how it's applied to arithmetic expressions in Smalltalk, or read about a generic implementation of multimethods in Python (which also includes a basically language-independent discussion of the issues at hand). If you happen to be a C++ programmer, you may be tempted to think that global functions and overloading solve the problem in that language. Well, think twice: a proper implementation of multiple dispatch in C++ needs RTTI and templates, as shown in this article.
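For concreteness, here is how the double-dispatch trick might look for our marine example, in a Python sketch of mine (the class and method names are invented for illustration): the receiver bounces the message back to its argument, so both concrete types end up having a say:

```python
class Fish:
    def encounter(self, other):
        # First dispatch picked this method from the receiver's class;
        # bouncing back to `other` performs the second dispatch.
        return other.encountered_by_fish(self)
    def encountered_by_fish(self, fish):
        return "nothing happens"
    def encountered_by_shark(self, shark):
        return "shark eats fish"

class Shark(Fish):
    def encounter(self, other):
        return other.encountered_by_shark(self)
    def encountered_by_fish(self, fish):
        return "shark eats fish"
    def encountered_by_shark(self, shark):
        return "sharks fight"
```

Note the drawback discussed in the text: adding a new species means touching every existing class to add an `encountered_by_...` method.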

CLOS and Dylan are two examples of languages solving the issue from
the onset by including support for multi-methods. The idea is to
separate methods from classes (which only contain data slots). As
shown in the pseudo-code of the accompanying figure, methods are
defined as independent functions with the same name, but differing
in their arguments' types (in CLOS, a set of such methods is called
a *generic function*). When a generic function is called, the system
selects the actual method to be invoked using the types of *all the
arguments* used in the invocation. The encounter generic function in
our running example is a typical case, as shown in the
figure on the right. The benefits of having multi-methods at our
disposal are apparent: the code is simpler and, notably, adding new
behaviours and classes to the system does not need modifications of
existing code. For instance, we can introduce a Piranha, which eats
unhealthy sharks instead of swimming away from them, by defining the
requisite class and methods, without any modification whatsoever to
the already defined ones.
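A toy rendition of this idea in Python: a generic function is just a table from argument-type pairs to implementations (the `defmethod` helper and every name below are improvised for illustration, and a real multimethod system would also walk the inheritance hierarchy when selecting methods). Note how Piranha is added without touching existing code:

```python
class Animal: pass
class Fish(Animal): pass
class Shark(Fish): pass
class Piranha(Fish): pass

# A generic function: a table from argument-type pairs to implementations.
_methods = {}

def defmethod(t1, t2, fn):
    _methods[(t1, t2)] = fn

def encounter(a, b):
    # Select the method using the concrete types of *both* arguments
    # (exact match only, to keep the sketch short).
    return _methods[(type(a), type(b))](a, b)

defmethod(Fish, Fish, lambda a, b: "nothing happens")
defmethod(Shark, Fish, lambda s, f: "shark eats fish")
defmethod(Shark, Shark, lambda a, b: "sharks fight")

# Extending the system: a new class and its methods, no edits to old code.
defmethod(Piranha, Shark, lambda p, s: "piranha eats unhealthy shark")
```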

On the downside, we have still to deal with the complications associated with internal state. Enter the magic world of prototype-based systems.

## The ultimate dynamic

If you like dynamic languages, chances are you'll find prototype-based systems an almost perfect development environment. Prototype-based languages emerged as an evolution of Smalltalk with the invention of Self by David Ungar and Randall B. Smith during the late eighties. The key idea behind Self is noticing that, most of the time, class definitions needlessly coerce and complicate your object model.

A class definition becomes a contract to be satisfied by any instance, and it is all too easy to miss future or particular needs of your objects (class-based inheritance is just a partial solution to this problem, as shown, for instance, by the so-called fragile base class problem). But, if you look around you, objects change in internal behaviour and data content continuously, and our attempts at distilling their Platonic nature are often in vain.

In prototype-based programming, instead of providing a *plan* for
constructing objects, you simply clone existing instances and modify
their behaviour by directly changing the new instance's slots (which
provide uniform access to methods and state). New clones contain a
pointer to their parent, from which they inherit non-modified slots:
there is no way to access state other than via messages sent to
instances, which simplifies dealing with state.
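The slot-and-clone model fits in a few lines of code. In this minimal Python sketch (all names are mine), an object is just a bag of slots plus a parent pointer, and lookup climbs the delegation chain:

```python
class Proto:
    def __init__(self, parent=None, **slots):
        self.parent = parent
        self.slots = dict(slots)   # uniform storage for state and methods

    def clone(self, **slots):
        # A clone starts with its own (possibly empty) slots and delegates
        # everything else to its parent.
        return Proto(parent=self, **slots)

    def lookup(self, name):
        # Climb the delegation chain until the slot is found.
        if name in self.slots:
            return self.slots[name]
        if self.parent is not None:
            return self.parent.lookup(name)
        raise AttributeError(name)

fish = Proto(swims=True)
shark = fish.clone(teeth="many")   # inherits swims from its parent
```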

Class-based languages oblige you to keep two relationships in mind
to characterize object instances: the "is-a" relationship of the
object with its class, and the "kind-of" relationship of that class
with its parent. In Self, inheritance (or behaviour **delegation**) is
the only one needed. As you can see, Self is all about making
working with objects as simple as possible. No wonder Ungar and
Smith's seminal paper was titled Self: The Power of
Simplicity. Needless to say, a must read.

The figure above shows how our running example would look in selfish pseudo-code. As promised, state is no longer surfacing in our method implementation's logic. Unfortunately, we have lost the benefits of multi-methods in the process. But fear not, for, as we will see, you can eat your cake and have it too. Instead of pseudo-code, you can use Self itself, provided you are the happy owner of a Mac or a Sun workstation. Or you can spend 20 fun minutes seeing the Self video, which features the graphical environment accompanying the system. Like Smalltalk, Self provides you with a computing environment where objects are created, by cloning, and interact with you. The system is as organic and incremental as one can possibly get.

Of course, you're not limited to Self. For instance, Ken Dickey fleshed out Norman Adams' saying that objects are a poor man's closures by offering a prototype-based object system in Scheme, and, more recently, Neil Van Dyke has released Protobj. And you have probably already used a very popular language in the family: Javascript. The list goes on, albeit, unfortunately, many of these languages lack either Self's nice integrated environment or a portable, up-to-date implementation. Slate to the rescue.

## The best of both worlds

Prototyping and multiple dispatch are, at first sight, at odds. After all, method dispatching based on arguments' type needs, well, a type for each argument, doesn't it? As it happens, Lee Salzman and Brian Rice have envisioned a way of combining the power of both paradigms into Slate. In fact, proving how this is possible is the crux of Lee's article. In addition, Slate aims at providing a complete development environment in the vein of Smalltalk or Self. Too good to be true? In future installments of this blog category, we'll see how and why it's true, but, if you cannot wait, just run-not-walk to Slate's site. You'll have a great time.

# as simple as possible...

*… but not simpler*.

Einstein's (attributed) quotation has become an aphorism, taken for granted by every mathematician or physicist i've ever met (to mention two kinds of people i've been frequently involved with). One would expect the same attitude from a community that invented the term 'no silver bullet', and yet, since i got into computer science, first for fun and later on for a living, i've found lots of people with, er, a different viewpoint. Take for instance this excerpt from Lisp is sin, a widely cited and commented article by Sriram Krishnan:

In Visual Studio, we look at 3 distinct categories of programmers. We call them Mort, Elvis and Einstein - the 3 personas we take into consideration whenever we create any developer technology. What is difficult for some of us geeks to comprehend sometimes is - all 3 of them are equally important. When you look at technology like Windows Workflow Foundation, one of the driving forces was to let non-geeks build software. Not everyone needs to be a Raymond Chen or a Dave Cutler. Not everyone needs to understand the difference between the various GC algorithms. However, everyone needs the ability to be productive. And everyone needs to be able to get software working without needing a CS degree.

We cannot afford to restrict software development only to those who know Windows kernel internals or those who can understand what a continuation is. It's not that other people are not smart - they just have better things to do. That's the key piece of understanding I find missing sometimes.

Nonsense, if you ask me. And yet, i've been hearing this same argument, in different dressings, once and again since i got into computer science. Let's apply the same line of reasoning to other disciplines, and see how well it fares:

Hey Albert, your General Relativity is awesome but, you know, with all that jazz about differential geometry and curved spacetimes, it's too hard; we're not as smart as you, pal, so we'd better use Newtonian or Aristotelian mechanics to calculate those GPS satellite orbits and get going with other important things we need to do. Hope you understand, Albert.

Well Santiago, your ideas about neurons and surgery sound pretty deep and mystifying, but please, think of the Galens among us: we don't have the time to investigate every new fad, and, anyway, we wouldn't understand it if we did. Know what? We'll keep using our good old cures and stay away from untrodden paths. Our healing parchments are a bit of a hack, but they get the job done… most of the time, that is.

Does it make *any* sense? Now, maybe you think that i am exaggerating,
and that the comparisons above are stretching the point a bit too far.
If so, take a second to look back at the people that made your nice
computing environment possible. Take a look at Charles Babbage's
visions; read about Alan Turing and Alonzo Church or John von Neumann;
admire the elegance of McCarthy's original LISP (1960); prepare to be
surprised by the things the people in Doug Engelbart's Augmentation
Research Center were doing during the sixties; try to find a modern
drawing program that matches Sketchpad's algorithms (or see it in
action in this presentation by Alan Kay); follow the fascinating
development of the overlapping windows interface, hand in hand with
Smalltalk's history back at Xerox PARC, and do it from the horse's
mouth; feel the thrill of the people that went beyond Xerox's
bigwigs' shortsightedness and on to making a dent in the universe: it
was 1984, the same year the Lisp machine wars culminated in the
creation of the GNU project, which was all about ideals, about
empowering people, about freedom. When you're done, tell me whether
i'm going overboard in drawing parallels between computer science and
physics or medicine!

All those people had a vision, a dream, and pursued it with an amazing display of hard work, stubbornness and intelligence. They took no prisoners, and by the late eighties had pushed that new thing called, for want of a better name, Computer Science to its modern standards.

Then winter came. Not just the AI winter. Compare the swift pace of CS
developments during the 1960-80 period with the subsequent
*advancements* in the field. We're using the same metaphors, the same
kind of systems that we inherited from those guys and gals. Why, we
even envy the power of Lisp Machines these days. It's been a long,
cold winter for CS. And the main reason was the appearance of the
mentality that i'm criticising in this post, what Alan Kay aptly
calls, in a recent interview, a *pop culture* of computers:

Perhaps it was commercialization in the 1980s that killed off the next expected new thing […] But a variety of different things conspired together, and that next generation actually didn't show up. One could actually argue—as I sometimes do—that the success of commercial personal computing and operating systems has actually led to a considerable retrogression in many, many respects. You could think of it as putting a low-pass filter on some of the good ideas from the '60s and '70s, as computing spread out much, much faster than educating unsophisticated people can happen. In the last 25 years or so, we actually got something like a pop culture, similar to what happened when television came on the scene and some of its inventors thought it would be a way of getting Shakespeare to the masses. But they forgot that you have to be more sophisticated and have more perspective to understand Shakespeare. What television was able to do was to capture people as they were. So I think the lack of a real computer science today, and the lack of real software engineering today, is partly due to this pop culture.

Dead on, i say. People advocating making programming simpler than
possible are the hallmark of this pop culture. And when corporate and
economic interests enter the picture, things get even worse. The *Lisp
is sin* essay goes on to say:

I frequently see on Slashdot "Windows is designed for stupid users". That is quite insulting to the millions of moms and dads, teachers and laywers and people from other walks of life who use Windows or even the Mac. If we mandated that every new user understand Windows' command line syntax or Emacs, we would have failed as an industry - we would have locked out the rest of the world.

In my opinion, this totally misses the point. There's nothing wrong
with making computers simpler for users. On the contrary, that's
probably what this endeavour is all about. Alan Kay saw it; Apple took
heed with its *computer for the rest of us* mantra. But it does not
follow that there must be a *CS for the rest of us*. Making all this
amazing technology possible takes effort, and needs a high level of
sophistication. Alan didn't try to create systems usable by children
by inventing PHP. He created Smalltalk striving to improve Lisp, he
studied Piaget and Papert, he has degrees in maths and biology. And he
needed all that, and then some.

The (trivial) point i'm trying to make is that not everybody has what
it takes to be a programmer, just as not everybody can be a singer or
a painter (as an aside, i tend to agree with the opinions that link
programming and art). As a matter of fact, good programmers are rare,
and need a quite peculiar combination of skills and talents. Donald
Knuth has put it far better than i could in the essay *Theory and
Practice, II* (from his Selected Papers on Computer Science):

The most important lesson [after developing TeX], for me, was that software is hard; and it takes a long time. From now on I shall have significantly greater respect for every successful software tool that I encounter. […] Software creation not only takes time, it's also much more difficult than I thought it would be. Why is this so? I think the main reason is that a longer attention span is needed when working on a large computer program than when doing other intellectual tasks. A great deal of technical information must be kept in one's head, all at once, in high-speed random-access memory somewhere in the brain.

We don't solve the painter's *problem* by complaining that perspective
is hard to grasp and that people should better use flat icons. In the
same way, we shouldn't be calling for a trivialisation of CS, both in
academia and in industry. The *we would have failed as an industry*
bit in the Sriram quote above is really sad: we're sacrificing an
admirable legacy in the name of industry and corporate profit. The
most remarkable feat of our current industry leaders is to have
convinced the rest of the world that having software systems that eat
incredible amounts of resources and explode without reason every now
and then is part of an acceptable, even top-notch,
technology. Fortunately, other disciplines show far more respect for
the people that, ultimately, are paying their wages.

If you've got this far, you already have one of the qualities needed to become a programmer: stamina. You'll need more. Be prepared to study hard, to learn maths, to live in abstract worlds. If you feel that you have "more important things to do", well, that's all very well, but don't ask the rest of us to dumb down the subject so that everybody can be a programmer. Lisp is not a sin. The sin would be to betray the dreams, ideals and hard work of the people that have taken us this far. We owe that to them, and to ourselves.

To end this never-ending diatribe, let me add a couple of things: first, i should apologize for taking Sriram as the scapegoat for a long-honed rage: his essay contains many good points worth reading and mulling over; second, i hope you're not thinking this is just an arrogant rant by an old fart: i'm not that old.

# the joy of repl

Back in the old days i was a macho C++ programmer, one of those sneering at Java or any other language but C, willing to manage my memory and pointers and mystified by the complexity of the template syntax (it was difficult and cumbersome, ergo it had to be good). Everyone has a past.

Things began to change when i decided to add Guile extensibility to
GNU MDK. I was using the project as an excuse to learn everything one
has to learn to write free software, from parsers with flex and bison,
to documentation generators like texinfo and doxygen or localisation
via gettext. Next came scriptability, and in those days Scheme was
still *the way* to extend your GNU programs (last time i checked, GNOME
was full of XML, a Windows-like registry and, oh my, C#… no Scheme
(or good taste) to be seen).

So, when i first encountered Scheme i was high on static type checking, object-oriented programming in its narrow C++ flavour, and all that jazz. I didn't understand immediately what was so cool about having an interpreter, and defining functions without the compiler checking their signature at every single call made me feel uneasy. I was told that i still had strong type checking in Lisp, but deferred to run time instead of the apparently safer compile phase. I didn't get it. Thank god, SICP was so much fun that i kept on learning, and i kept wondering for a while what was so great about interpreters and dynamic typing.

Problem was, i was writing *C programs* in Scheme. In a compiled
language (a la C) and, to some degree, in any statically typed one,
your code is dead. You write pages and pages of inert code. You
compile it. Still dead. Only when you launch that binary does it come
to life, but it lives elsewhere, beyond your reach. Admittedly,
i'm exaggerating: you can reach it in a convoluted way via a
debugger. But still. A debugger is an awkward beast, and it will only
work with the whole lot: all your program compiled, linked and
whatnot.

Enter a dynamic language. Enter its REPL. When you have a, say, Lisp
interpreter at your disposal *you don't write your code first and load
it later* (that's what i was doing at first). You enter your code
piecewise, function by function, variable by variable at that innocent
looking prompt. You develop *incrementally*, and, at every single
moment, your objects and functions are alive: you can access them,
inspect them, *modify* them. Your code becomes an organic creature,
plastic. It's almost not programming, but experimenting.
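The same live-redefinition game can be played in any dynamic language. Here's the idea translated into JavaScript terms (imagine typing these lines one by one at a node prompt; the function names are invented for the example):

```javascript
// A live binding, entered at the prompt:
let greet = (name) => `hello, ${name}`;

// Some code already using it:
const greetAll = (names) => names.map((n) => greet(n));

greetAll(['ada', 'alan']);   // → ["hello, ada", "hello, alan"]

// Now *redefine* greet while the "program" is alive:
greet = (name) => `HELLO, ${name.toUpperCase()}!`;

// greetAll picks up the new definition immediately,
// with no recompiling, relinking or restarting:
greetAll(['ada']);           // → ["HELLO, ADA!"]
```

Because `greetAll` looks up `greet` each time it runs, redefining one function changes the behaviour of everything referring to it, on the spot.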

Maybe you're raising a skeptical eyebrow. Maybe you have one of those
modern visual-something debuggers that let you modify your compiled
code on the fly and continue running it with the new definitions, and
you think that's what i'm talking about… Well, no, sorry, that's only
*part* of what i'm talking about. With such a debugger, all you can do
is *continue* executing your program; at a REPL, i can do whatever i
want. And that's not all. We are talking about a dynamically typed
language. That means that me and my little REPL have much more leeway
to modify the living code, and thus much more room to grow and evolve
it.

At the end of the day, dynamically typed languages give me freedom. Programming is a creative process and greatly benefits from that freedom. At first, abandoning the safety net provided by static typing was a little bit scary, but as i grew up as a programmer i felt more and more confident, and gradually the initially uneasy feeling morphed into joy. The joy of REPL.

Richard P. Gabriel has done a far better job of beautifully conveying
what i'm trying to express, both in his excellent introduction to
David Lamkins' book Successful Lisp, entitled *The Art of Lisp and
Writing* (unfortunately, i haven't found it online – you can read the
first few pages in amazon.com's "look inside this book" section for
this book), and in his essay Do Programmers Need Seat Belts?. Paul
Graham has famously argued in favour of bottom-up development in many
of his essays, and especially in Programming Bottom-Up:

It's worth emphasizing that bottom-up design doesn't mean just writing the same program in a different order. When you work bottom-up, you usually end up with a different program. Instead of a single, monolithic program, you will get a larger language with more abstract operators, and a smaller program written in it. Instead of a lintel, you'll get an arch.
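Graham's arch can be sketched in a handful of lines. In JavaScript, say (the little operators below are invented for the example): first you grow a tiny "language" of abstract operators, and then the actual program is a one-liner written in that language.

```javascript
// Bottom-up: first build a small vocabulary of list operators...
const sum = (xs) => xs.reduce((a, b) => a + b, 0);
const squares = (xs) => xs.map((x) => x * x);
const upto = (n) => Array.from({ length: n }, (_, i) => i + 1);

// ...then the program proper is a short sentence in that vocabulary:
const sumOfSquares = (n) => sum(squares(upto(n)));

sumOfSquares(4);   // → 30  (1 + 4 + 9 + 16)
```

Instead of one monolithic loop accumulating squares, we end up with three reusable operators and a trivially small program composed out of them: the arch instead of the lintel.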

Finally, please note that i'm well aware that the static vs. dynamic typing debate is still open, and that decent type systems like those in Haskell and ML have, arguably, much to offer in the way of solid software engineering. Type theory also has a powerful and beautiful mathematical foundation. The above is just my gut feeling and current position on these issues, and i don't pretend to have backed it with solid technical argumentation. Nor was that my goal. I'm more interested here in programming as a creative activity than as engineering.