Posts tagged "scheme":
((I m a g i n e) (firstname.lastname@example.org) (((Imagine there's no FORTRAN) (It's easy if you try) (No SML below us) (Above us only Y) (Imagine all the people) (Living for their Chez)) ((Imagine there's no memory leaks) (It isn't hard to do) (Nothing to malloc(3) or free(3) for) (And no (void *) too) (Imagine all the people) (Living in parentheses)) ((You may say I'm a Schemer) (But I'm not the only one) (I hope someday you'll join us) (And the world will be as (lambda (f) (lambda (x) (f x))))) ((Imagine those continuations) (I wonder if you can) (No need for C or pointers) (A brotherhood of Dan) (Imagine all the people) (GCing all the world)) ((You may say I'm a Schemer) (But I'm not the only one) (I hope someday you'll join us) (And the world will be as (lambda (f) (lambda (x) (f x)))))))
Posted to comp.lang.scheme on January 17, 1996, for Scheme's twentieth birthday.
Recently i bought a second-hand copy of Simon Peyton Jones' classic The implementation of functional programming languages, and i've been having some very pleasant reading hours during the last week.
For instance, i've enjoyed Simon's clear and to-the-point introduction to the lambda calculus and, more concretely, his really lucid explanation of the workings of recursion and the Y combinator. As a matter of fact, i've enjoyed it so much that i just wanted to briefly reproduce it here, paraphrased in my surely poorer style, just for the fun of it.
In the pure lambda calculus, there's no direct way of defining a function that refers to itself: all you have is the ability to define anonymous lambda abstractions. Using Scheme's syntax, that means that you can use lambda as much as you want, as in

((lambda (x) (+ 1 x)) 3)

but define (recursive or not) is not available. For instance, you cannot define a recursive list-length function like this:
(define list-length (lambda (lst) (if (null? lst) 0 (+ 1 (list-length (cdr lst))))))
the main problem being the appearance of list-length in the lambda expression's body. So the general issue we want to solve is how to define a function f of the form

f = (... f ...)

without referring to f in the body. When all you have is lambda calculus, everything looks like a function: we can rewrite the right hand side of the above equation as a lambda application:

f = ((lambda (x) (... x ...)) f) := (h f)
where we've defined h := (lambda (x) (... x ...)). For instance, in our length example, h is the function
h = (lambda (x) (lambda (lst) (if (null? lst) 0 (+ 1 (x (cdr lst))))))
and we're trying to solve the equation (h list-length) = list-length. In other words, you give me h, a function taking a function as argument and returning another function, and i need to give you back an f that is its fixed point, i.e., an f such that f = (h f).

So all i need to do is find a function (let's call it Y) that takes h as its argument and returns f; that is, by definition of Y,

f = (Y h)

but, as we've seen, f = (h f), which in turn is

(h (Y h))

which gives us the defining equation of Y:

(Y h) = (h (Y h))

If i had that magic Y function (or combinator, since it won't use any global variables), i could for instance immediately compute list-length as

(Y (lambda (x) (lambda (lst) (if (null? lst) 0 (+ 1 (x (cdr lst)))))))
The nicest kind of wishful thinking is the one that comes true: we actually know of a lambda expression that satisfies the Y combinator's defining equation (Y h) = (h (Y h)), namely:

(lambda (h) ((lambda (x) (h (x x))) (lambda (y) (h (y y)))))

You can check it by yourself: (Y h) is just the body of the expression above:
(Y h) => ((lambda (x) (h (x x))) (lambda (y) (h (y y))))
and now apply the first lambda to its argument (the second lambda) to obtain:
=> (h ((lambda (y) (h (y y))) (lambda (y) (h (y y))))) => (h (Y h))
as desired! The only wrinkle left is that, as defined, our Y combinator cannot be directly coded in Scheme, because Scheme is a strict language and those pesky (x x) and (y y) applications cause an infinite loop when evaluated eagerly.
But that's just an accidental difficulty, due to Scheme's evaluation strategy being too eager, and it has a standard solution: a beta abstraction. If you have a function F in Scheme, you can define a totally equivalent function G as (define G (lambda (x) (F x))). We say that G is a beta abstraction of F, or that F is a beta reduction of G. And the usual reason you would beta abstract a function in Scheme is in order to delay the evaluation of its body: just what the doctor ordered. In our case, h is a function taking another function as an argument, so (x x) and (y y), which are used as h's arguments, must be functions; that makes them equivalent to (lambda (a) ((x x) a)) and (lambda (a) ((y y) a)), and we can define Y, if we're so inclined, as:
(define Y (lambda (h) ((lambda (x) (h (lambda (a) ((x x) a)))) (lambda (x) (h (lambda (a) ((x x) a)))))))
(define list-length (Y (lambda (f) (lambda (lst) (if (null? lst) 0 (+ 1 (f (cdr lst))))))))
As you see, list-length's body doesn't mention itself and, as you can check by yourself with your favourite interpreter, the last two definitions are valid and working Scheme code.
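In fact, here's the whole thing as a short REPL session (just a sanity check; any R5RS-ish Scheme should accept it):

```scheme
;; the eta-expanded Y combinator, safe under Scheme's strict evaluation
(define Y
  (lambda (h)
    ((lambda (x) (h (lambda (a) ((x x) a))))
     (lambda (x) (h (lambda (a) ((x x) a)))))))

;; list-length, defined without ever mentioning itself
(define list-length
  (Y (lambda (f)
       (lambda (lst)
         (if (null? lst) 0 (+ 1 (f (cdr lst))))))))

(list-length '())      ; => 0
(list-length '(a b c)) ; => 3
```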
In case you still feel a bit unsure as to how our Y is pulling its trick, you can dispel any misgivings just by seeing how the combinator transforms the given h step by step, always reproducing h in the process (it's only the branch in h that doesn't call its argument, the recursion's base case, that breaks the implicit infinite loop induced by Y). Let's do it with our list-length example, reducing its application to, say, the list '(1 2), step by step:
(define W (lambda (f) (lambda (a) ((f f) a))))

(list-length '(1 2))
;; definition of list-length as (Y h)
=> ((Y h) '(1 2))
;; definition of Y and W
=> (((lambda (f) (h (W f))) (lambda (f) (h (W f)))) '(1 2))
;; beta-reduction of the first lambda applied to the second
=> ((h (W (lambda (f) (h (W f))))) '(1 2))
;; definition of h for the case of list-length
=> ((lambda (lst) (if (null? lst) 0 (+ 1 ((W (lambda (f) (h (W f)))) (cdr lst))))) '(1 2))
;; function application
=> (+ 1 ((W (lambda (f) (h (W f)))) '(2)))
;; definition of W
=> (+ 1 (((lambda (f) (h (W f))) (lambda (f) (h (W f)))) '(2)))
and notice what just happened: we started with
(((lambda (f) (h (W f))) (lambda (f) (h (W f)))) '(1 2))
and, after one iteration of the implicit recursion, reduced it to:
(+ 1 (((lambda (f) (h (W f))) (lambda (f) (h (W f)))) '(2)))
i.e., 1 plus the same form we started with, but applied to the list (2) instead of the list (1 2). By following the exact same reduction steps,
we can simplify the above to:
(+ 1 1 (((lambda (f) (h (W f))) (lambda (f) (h (W f)))) '()))
and now, when expanding the same form to the right applied to the empty list, the base case of the recursion (the true branch of the (null? lst) test in h) will kick in and reduce the above to

(+ 1 1 0) => 2

preventing the self-reproducing Y forms from reappearing ad infinitum.
If you have a little Scheme under your belt, just plunge into the challenge right now:
We want to find three integers, x, y and z, such that all of them are in the range [2,9] and they are the lengths of the sides of a right triangle (x^2 = y^2 + z^2). The challenge is to write two procedures, in-range and fail, such that the solution to the problem looks like this:

(let ((x (in-range 2 9))
      (y (in-range 2 9))
      (z (in-range 2 9)))
  (if (= (* x x) (+ (* y y) (* z z)))
      (list x y z)
      (fail "no solution")))
In other words, we are after an elegant way of implementing search backtracking: the possible values of the (x, y, z) triplets must be checked in turn, until a solution is found or we exhaust the search space. This is the kind of problem that declarative languages like Prolog solve in a breeze. Scheme does not fare badly, either: the right solution is just a screenful long. Keep on reading if you feel like getting some hints.
I've stolen this little exercise from an excellent talk by Marc Feeley on writing a Scheme to C compiler. The example pops up when Marc is discussing what makes such a compiler tricky (that is, what Scheme features are missing in C): tail calls, garbage collection, closures and first class continuations. So here goes the first hint: the solution uses call/cc.
I couldn't put a finger on it, but there's something pesky about continuations: in my experience, they're hard to grasp at first. And yet, the concept is quite simple: whenever you are evaluating an expression E, there's someone waiting to do something with the value of E; for instance, in the process of evaluating:
(+ 3 (* 5 4))
if you take E to be (* 5 4), the interpreter is waiting for your result to add 3 to it. This thing to be done to the result of evaluating a (sub)expression can be, quite naturally, represented by a procedure that takes a single argument; in our example, this procedure could be:
(lambda (v) (+ 3 v))
or, if you are evaluating the whole thing on a REPL toplevel that will print the result, maybe something roughly equivalent to:
(lambda (v) (print (+ 3 v)))
It is this (usually invisible) procedure that we call the current continuation. Every expression has a continuation, but in many languages it is only implicit. Scheme gives you the possibility of accessing it. If you write:
(+ 3 (call/cc (lambda (cont) (* 5 4))))
that could be translated as
(+ 3 (let ((cont (lambda (v) (+ 3 v)))) (* 5 4)))
Of course, things get only interesting when you use cont; for instance, it is hopefully clear that
(+ 3 (call/cc (lambda (cont) (* 5 4) (cont 18))))
evaluates to 21. Besides writing silly examples and boring your readers to tears, you can put this stuff to good use in a variety of ways. The most immediate is an early escape from a long computation. This procedure multiplies all the elements in a given list:
(define (mult lon)
  (cond ((null? lon) 1)
        ((zero? (car lon)) 0)
        (else (* (car lon) (mult (cdr lon))))))
but it will do a lot of useless work when lon contains a 0. First class continuations allow a better way:
(define (mult-k lon cont)
  (cond ((null? lon) 1)
        ((zero? (car lon)) (cont 0))
        (else (* (car lon) (mult-k (cdr lon) cont)))))

(define (mult lon)
  (call/cc (lambda (cont) (mult-k lon cont))))
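To see the early exit at work, here's an instrumented variant (the counter and the times helper are made up just for this illustration) showing that once the 0 is found, none of the pending multiplications is ever performed:

```scheme
(define mults 0) ;; counts how many multiplications actually run

(define (times a b) ;; a * that records each use
  (set! mults (+ mults 1))
  (* a b))

(define (mult-k lon cont)
  (cond ((null? lon) 1)
        ((zero? (car lon)) (cont 0))
        (else (times (car lon) (mult-k (cdr lon) cont)))))

(define (mult lon)
  (call/cc (lambda (cont) (mult-k lon cont))))

(mult '(1 2 0 3 4)) ; => 0
mults               ; => 0: (cont 0) jumped straight out, discarding every pending times call
```

The plain recursive version, by contrast, would have performed the multiplications pending on the stack on its way back up.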
If you understand why this is better than the previous version, you're on your way to understanding continuations as non-local exits. And if by now you're thinking that continuations are not, after all, a big deal, quick tell me what
(((call/cc (lambda (k) k)) (lambda (x) x)) "HEY!")
evaluates to, and why. By the way, this nice micro-kata is from TSPL's section on continuations, which provides a nice, if brief, introduction to call/cc, including a description of another typical use case: the implementation of threading via coroutines. A more extended (but still not too long) discussion of the very same issues can be found in Dan Friedman's beautiful paper Applications of Continuations.
Finally, if you have a little time in your hands, read through this interesting ll1 mail thread, where guys like Matthias Felleisen, Avi Bryant or Paul Graham have their say on continuations and their uses.
Since coroutines are not needed to solve our original kata, let me gloss over them and jump to the next typical use of continuations: backtracking. A key thing to remember about call/cc is that the continuation passed to the lambda form is a first class value. Among other things, that means that you can store it and use it in a totally unrelated context. Or, in other words, Scheme introduces time-travel in your toolset, with its associated wonders (as in Schelog, a Prolog implemented in Scheme) and headaches.
Let's see how backtracking can be implemented with the aid of persistent continuations using an example adapted from Jacques Chazarain's book Programmer avec Scheme (which, by the way, makes for an excellent second book on Scheme, provided you read French). Writing a procedure that looks for the first occurrence of an element in a list that satisfies a given predicate is a piece of cake:
(define (search lst p?)
  (cond ((null? lst) #f)
        ((pair? lst) (or (search (car lst) p?) (search (cdr lst) p?)))
        ((p? lst) lst)
        (else #f)))
where we accept lists of lists too. But what if i want to get all the findings one by one? I'd like to have a solution generator that returns a procedure returning, in successive calls, each of the occurrences of a solution in the given list. That is, we want to be able to write code like this one:
(define solutions
  (search-generator '(0 ((1 a) 2) b (b c) (((6)))) number?))

(solutions) => 0
(solutions) => 1
(solutions) => 2
(solutions) => 6
(solutions) => 'done
Persistent continuations allow a very clean implementation of search-generator. The strategy is to start the search and, each time we find an element in the list satisfying the predicate, store the current continuation (which will keep on searching for more solutions) for later invocations. We then return a procedure that calls the continuation and stores a new one, which will resume the search with the rest of the list. You can have a little fun trying to find the solution before reading it below. Or, if you get stuck, read Ferguson and Deugo's excellent Call with Current Continuation Patterns, where you'll find examples of call/cc in action. Got your answer? Well, let's compare:
(define (search-generator lst p?)
  (let ((success '?)) ;; placeholder for the current continuation
    (letrec ((cont-success ;; next continuation
              (lambda (x) (search lst)))
             (search
              (lambda (elem)
                (cond ((null? elem) 'done)
                      ((pair? elem) (search (car elem)) (search (cdr elem)))
                      ((p? elem) (call/cc
                                  (lambda (k) ;; next search will continue from here
                                    (set! cont-success k)
                                    (success elem))))
                      (else 'done)))))
      (lambda ()
        (call/cc
         (lambda (k)
           (set! success k)
           (cont-success 'done)))))))
Once you grasp how this works, you have all the ingredients to solve our original kata.
Not only Scheme
You don't need call/cc to play some of the tricks discussed above. For instance, one can implement backtracking in Python (the linked article also contains an intro to continuations) or Standard ML (ditto, also with examples in Pascal and, of course, SML), using a technique known as Continuation Passing Style, or CPS, which consists in passing the continuation explicitly to each of your functions. Explaining CPS would take another article, so, for now, i will let you explore it by yourself, but i'll mention that, armed with CPS, you can, essentially, play all the call/cc tricks. A few years ago, type theorist extraordinaire Benjamin C. Pierce asked in the Caml mailing list about cool CPS usages, and took the time to summarize the responses he got. It contains pointers to many mind-bending readings.
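Sticking to Scheme for the illustration, here's the earlier mult example rewritten by hand in CPS (a sketch; the names mult-cps and abort are made up for this example). Every call takes an extra continuation argument k and "returns" by invoking it; the early exit is obtained by invoking the outermost continuation directly:

```scheme
;; mult in continuation-passing style: k is "what to do with the result",
;; abort is the top-level continuation, kept around for the early exit
(define (mult-cps lon k abort)
  (cond ((null? lon) (k 1))
        ((zero? (car lon)) (abort 0)) ;; skip all the pending work
        (else (mult-cps (cdr lon)
                        (lambda (r) (k (* (car lon) r)))
                        abort))))

(define (mult lon)
  (mult-cps lon (lambda (r) r) (lambda (r) r)))

(mult '(1 2 3 4)) ; => 24
(mult '(1 2 0 4)) ; => 0
```

Note that no call/cc is needed: since every pending computation has been reified into k, skipping it is just a matter of not calling it.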
I almost forgot: if you give up finding our kata's solution or want to check it out, you'll find it in Marc Feeley's talk for the Montreal Scheme/Lisp Users Group. As i've mentioned, it deals with the implementation of a Scheme to C compiler (Marc is the lead developer of Gambit-C), which is based on CPS. The site contains links to the talk slides and videos, and a beautiful pure-scheme implementation of the compiler. Enjoy.
After a few scheming years, i had come to view objects as little more than poor man's closures. Rolling a simple (or not so simple) object system in scheme is almost a textbook exercise. Once you've got statically scoped, first-class procedures, you don't need no built-in objects. That said, it is not that object-oriented programming is not useful; at least in my case, i find myself often implementing applications in terms of a collection of procedures acting on requisite data structures. But, if we restrict ourselves to single-dispatch object-oriented languages, i saw little reason to use any of them instead of my beloved Scheme.
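That textbook exercise goes, in its simplest form, something like this (a minimal sketch: an object is a closure over its state, dispatching on a message symbol; the account example is the classic one, not anything specific to a real object system):

```scheme
;; a "class" is just a procedure returning a dispatching closure
(define (make-account balance)
  (lambda (msg . args)
    (case msg
      ((balance) balance)
      ((deposit) (set! balance (+ balance (car args))) balance)
      ((withdraw) (if (> (car args) balance)
                      'insufficient-funds
                      (begin (set! balance (- balance (car args))) balance)))
      (else 'unknown-message))))

(define acc (make-account 100))
(acc 'deposit 50)  ; => 150
(acc 'withdraw 30) ; => 120
(acc 'balance)     ; => 120
```

Encapsulation comes for free: balance is only reachable through the messages the closure chooses to answer.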
Things started to change recently due to my discovering the pleasures of Smalltalk. First and foremost, it offers a truly empowering integrated environment to live and develop in. Second, if you're going to use objects, using the simplest, cleanest syntax will not hurt. Add to that some reading on the beautiful design principles underlying Smalltalk, and one begins to wonder if closures aren't, in fact, poor man's objects–or at least i do, whenever i fall into an object-oriented mood (i guess i'm not yet ready to reach satori).
But Scheme is not precisely an ugly or badly designed language, so i needed some other reason to switch language gears for my OO programming. I knew there's more than encapsulation or subtype polymorphism in object-land from my readings on CLOS (the Common Lisp Object System), or on Haskell's type classes (and its built-in parametric polymorphism), but i was after something retaining Smalltalk's elegance. And then i remembered that, when i was a regular lurker in the Tunes project's mailing lists and IRC channel, a couple of smart guys were implementing an OO language whose syntax was smalltalkish. That language (which, if memory serves, started life with the fun name who me?) has evolved during the last few years into a quite usable programming environment named Slate, started by Lee Salzman and currently developed and maintained by Brian Rice.
I've been reading about Slate during the last few days, and decided to learn it. What motivated me was discovering how Slate goes beyond mainstream object-oriented programming by incorporating well-known (but hardly used) and really powerful paradigms. In short, Slate improves Smalltalk's single-dispatch model by introducing and combining two apparently incompatible technologies: multiple dispatch and prototype-based programming. To understand the whys and hows of Slate, there's hardly a better way than reading Lee Salzman's Prototypes with Multiple Dispatch. The following discussion is, basically, an elaboration of Lee's explanation on the limitations of mainstream OO languages, and how to avoid them with the aid of PMD.
Note: the images are thumbnails of this PDF file, with clickable links inside.
Fishes and sharks
Let's start by showing why on earth would you need anything beyond Smalltalk's object system (or any of its modern copycats). Consider a simple oceanographic ecosystem analyser, which deals with (aquatic) Animals, Fishes and Sharks. These are excellent candidates for class definitions, related by inheritance. Moreover, we are after modeling those beasts' behaviours and, in particular, their reactions when they encounter each other: each time a Shark meets a Fish of another species, the Shark will swallow the other Fish, while when a Shark meets another Shark they will fight. As a result of such fighting, Sharks get unhealthy, which regrettably complicates matters: wounded sharks won't try to eat other fishes, and will swim away from other sharks instead of fighting them. The image on the left provides a sketchy representation of the code we need to model our zoo. Waters are quickly getting muddled implementation-wise.
On the one hand, subtype polymorphism based just on the object receiving the encounter message is not enough: we need, in addition, to take into account the argument's concrete type to implement the desired behaviour. This is a well-known issue in single-dispatch languages, whose cure is, of course, going to multiple dispatching (see below). In particular, we want to avoid the need to modify existing classes whenever our hierarchy is extended.
On the other hand, varying state (exemplified here by the Shark's isHealthy instance variable) complicates the implementation logic. As we will see, prototype-based languages offer a way to factor out this additional complexity.
The need to adjust behaviour on the basis of the type of both a message receiver and its arguments arises frequently in practice. So frequently that a standard way of dealing with it has been christened as the Visitor design pattern. The technique, also known as double-dispatch, is well known: you can see, for instance, how it's applied to arithmetic expressions in Smalltalk, or read about a generic implementation of multimethods in Python (which also includes a basically language-independent discussion on the issues at hand). If you happen to be a C++ programmer, you may be tempted to think that global functions and overloading solve the problem in that language. Well, think twice: a proper implementation of multiple dispatch in C++ needs RTTI and templates, as shown in this article.
CLOS and Dylan are two examples of languages solving the issue from the onset by including support for multi-methods. The idea is to separate methods from classes (which only contain data slots). As shown in the pseudo-code of the accompanying figure, methods are defined as independent functions with the same name, but differing in their arguments' types (in CLOS, a set of such methods is called a generic function). When a generic function is called, the system selects the actual method to be invoked using the types of all the arguments used in the invocation. The encounter generic function in our running example provides a typical example, as shown in the figure on the right. The benefits of having multi-methods at our disposal are apparent: the code is simpler and, notably, adding new behaviours and classes to the system does not need modifications of existing code. For instance, we can introduce a Piranha, which eats unhealthy sharks instead of swimming away from them, by defining the requisite class and methods, without any modification whatsoever to the already defined ones.
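To make the mechanics concrete in Scheme terms, here's a toy sketch (names and representation are made up for this illustration: types are plain symbols, methods live in a table keyed on the pair of argument types, and there's no inheritance, which real multi-method systems of course handle):

```scheme
;; a toy multi-method table: methods are keyed on the (type . type) pair
(define encounter-methods '())

(define (define-encounter! type1 type2 method)
  (set! encounter-methods
        (cons (list (cons type1 type2) method) encounter-methods)))

;; animals are just tagged values here; classes hold only data
(define (make-animal type) (list 'animal type))
(define (type-of a) (cadr a))

;; dispatch looks at the types of BOTH arguments
(define (encounter a b)
  (let ((m (assoc (cons (type-of a) (type-of b)) encounter-methods)))
    (if m ((cadr m) a b) 'ignore)))

;; behaviours live outside the "classes", as in CLOS
(define-encounter! 'shark 'fish  (lambda (a b) 'swallow))
(define-encounter! 'shark 'shark (lambda (a b) 'fight))

(encounter (make-animal 'shark) (make-animal 'fish)) ; => swallow
(encounter (make-animal 'fish)  (make-animal 'fish)) ; => ignore
```

Adding a Piranha then amounts to a couple more define-encounter! calls, with no edits whatsoever to the already defined methods.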
On the downside, we have still to deal with the complications associated with internal state. Enter the magic world of prototype-based systems.
The ultimate dynamic
If you like dynamic languages, chances are you'll find prototype-based systems an almost perfect development environment. Prototype-based languages emerged as an evolution of Smalltalk with the invention of Self by David Ungar and Randall B. Smith during the late eighties. The key idea behind Self is noticing that, most of the time, class definitions needlessly coerce and complicate your object model.
A class definition becomes a contract to be satisfied by any instance, and it is all too easy to miss future or particular needs of your objects (class-based inheritance is just a partial solution to this problem, as shown, for instance, by the so-called fragile base class problem). But, if you look around you, objects change in internal behaviour and data content continuously, and our attempts at distilling their Platonic nature are often in vain.
In prototype-based programming, instead of providing a plan for constructing objects, you simply clone existing instances and modify their behaviour by directly changing the new instance's slots (which provide uniform access to methods and state). New clones contain a pointer to their parent, from which they inherit non-modified slots: there is no way to access state other than via messages sent to instances, which simplifies dealing with state.
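Again in Scheme terms, a toy sketch (illustrative names only, with no claim to match Self's actual semantics): a prototype is a closure holding a slot list and a parent to delegate to, and cloning just creates a new object that overrides some slots and inherits the rest:

```scheme
;; a prototype: a list of slots plus a parent to delegate unknown messages to
(define (make-proto parent slots)
  (lambda (msg . args)
    (let ((slot (assq msg slots)))
      (cond (slot (if (procedure? (cdr slot))
                      (apply (cdr slot) args)
                      (cdr slot)))
            (parent (apply parent msg args))
            (else 'message-not-understood)))))

;; cloning = overriding some slots, inheriting the rest from the parent
(define (clone obj new-slots) (make-proto obj new-slots))

(define animal (make-proto #f (list (cons 'swims? #t))))
(define shark  (clone animal (list (cons 'dangerous? #t))))

(shark 'dangerous?)  ; => #t, its own slot
(shark 'swims?)      ; => #t, delegated to its parent
(animal 'dangerous?) ; => message-not-understood
```

There's no class anywhere: shark is characterized only by its own slots plus whatever it delegates to animal.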
Class-based languages oblige you to keep two relationships in mind to characterize object instances: the "is-a" relationship of the object with its class, and the "kind-of" relationship of that class with its parent. In Self, inheritance (or behaviour delegation) is the only one needed. As you can see, Self is all about making working with objects as simple as possible. No wonder Ungar and Smith's seminal paper was titled Self: The Power of Simplicity. Needless to say, a must read.
The figure above shows how our running example would look in selfish pseudo-code. As promised, state is no longer surfacing in our method implementation's logic. Unfortunately, we have lost the benefits of multi-methods in the process. But fear not, for, as we will see, you can eat your cake and have it too. Instead of pseudo-code, you can use Self itself, provided you are the happy owner of a Mac or a Sun workstation. Or you can spend 20 fun minutes seeing the Self video, which features the graphical environment accompanying the system. Like Smalltalk, Self provides you with a computing environment where objects are created, by cloning, and interact with you. The system is as organic and incremental as one can possibly get.
The best of both worlds
Prototyping and multiple dispatch are, at first sight, at odds. After all, method dispatching based on arguments' type needs, well, a type for each argument, doesn't it? As it happens, Lee Salzman and Brian Rice have envisioned a way of combining the power of both paradigms into Slate. In fact, proving how this is possible is the crux of Lee's article. In addition, Slate aims at providing a complete development environment in the vein of Smalltalk or Self. Too good to be true? In future installments of this blog category, we'll see how and why it's true, but, if you cannot wait, just run-not-walk to Slate's site. You'll have a great time.
Back in the old days i was a macho C++ programmer, one of those sneering at Java or any other language but C, willing to manage my memory and pointers and mystified by the complexity of the template syntax (it was difficult and cumbersome, ergo it had to be good). Everyone has a past.
Things began to change when i decided to add Guile extensibility to GNU MDK. I was using the project as an excuse to learn everything one has to learn to write free software, from parsers with flex and bison, to documentation generators like texinfo and doxygen or localisation via gettext. Next thing was scriptability, and those days Scheme was still the way to extend your GNU programs (last time i checked, GNOME was full of XML, a windows-like registry and, oh my, C#… no Scheme (or good taste) to be seen).
So, when i first encountered Scheme i was high on static type checking, object oriented programming in its narrow C++ flavour, and all that jazz. I didn't understand immediately what was so cool about having an interpreter, and defining functions without the compiler checking their signature at every single call made me feel uneasy. I was told that i still had strong type checking in Lisp, but that it is deferred to run time, instead of happening at the apparently safer compile phase. I didn't get it. Thank god, SICP was so fun that i kept on learning, and i kept wondering for a while what was so great about interpreters and dynamic typing.
Problem was, i was writing C programs in Scheme. In a compiled language (a la C) and, to some degree, in any statically typed one, your code is dead. You write pages and pages of inert code. You compile it. Still dead. Only when you launch that binary does it come to life, only that it lives elsewhere, beyond your reach. Admittedly, i'm exaggerating: you can reach it in a convoluted way via a debugger. But still. A debugger is an awkward beast, and it will only work with the whole lot: all your program compiled, linked and whatnot.
Enter a dynamic language. Enter its REPL. When you have a, say, Lisp interpreter at your disposal you don't write your code first and load it later (that's what i was doing at first). You enter your code piecewise, function by function, variable by variable at that innocent looking prompt. You develop incrementally, and, at every single moment, your objects and functions are alive: you can access them, inspect them, modify them. Your code becomes an organic creature, plastic. It's almost not programming, but experimenting.
Maybe you're raising a skeptical eyebrow. Maybe you have one of those modern visual-something debuggers that lets you modify your compiled code on the fly and continue running it using the new definitions, and you think that's what i'm talking about… Well, no, sorry, that's only part of what i'm talking about. To begin with, there you can only continue executing your program, while at the REPL i can do whatever i want. But that's not all. We are talking about a dynamically typed language. That means that me and my little REPL have much more leeway to modify the living code, and thus much more margin to grow and evolve the code.
At the end of the day, dynamically typed languages give me freedom. Programming is a creative process and greatly benefits from that freedom. At first, abandoning the safety net provided by static typing was a little bit scary, but as i grew up as a programmer i felt more and more confident, and gradually the initially uneasy feeling morphed into joy. The joy of REPL.
Richard P. Gabriel has done a far better job of beautifully conveying what i'm trying to express in his excellent introduction to David Lamkins' book Successful Lisp, entitled The Art of Lisp and Writing. Unfortunately, i haven't found it online – you can read the first few pages in amazon.com's "look inside this book" section for this book. And also in his essay Do Programmers Need Seat Belts?. Paul Graham has famously argued in favour of bottom-up development in many of his essays, especially in Programming Bottom-Up:
It's worth emphasizing that bottom-up design doesn't mean just writing the same program in a different order. When you work bottom-up, you usually end up with a different program. Instead of a single, monolithic program, you will get a larger language with more abstract operators, and a smaller program written in it. Instead of a lintel, you'll get an arch.
Finally, please note that i'm well aware that the static vs. dynamic typing debate is still open, and that decent type systems like those in Haskell and ML have, arguably, much to offer in the way to solid software engineering. Type theory also has a powerful and beautiful mathematical foundation. The above is just my gut feeling and current position on these issues, and i don't pretend to have backed it with solid technical argumentation. Nor was it my goal. I'm more interested here in programming as a creative activity than as engineering.