Sonja: the code

I found it. You can download it here.

“Sonja” was my PhD thesis program, described in my book Vision, Instruction, and Action. I wrote about it on this site in “I seem to be a fiction,” and in passing in The Eggplant and elsewhere.

Sonja was a follow-on to Pengi, a program I wrote with Phil Agre. Pengi is better known, and people occasionally ask me if the code is available. It’s not, for an unfortunate reason.

In the 1980s, version control wasn't yet in general use, and backup was rudimentary at best. After we published the paper, I began a major rewrite, deleting the bulk of the code and starting over. Then I abandoned it to work on Sonja instead.

This caused a serious, unfixable problem for Phil. A description of Pengi was a centerpiece of his thesis, but he didn’t have access to the code, because I had deleted most of it. Pengi did some clever things he couldn’t write about, for that reason. I still feel really bad about this.

So whenever anyone asks me “is the Pengi code available,” I answer “no” (and sometimes retell this story). I had assumed that the Sonja code was also permanently lost—without bothering to seriously look for it. I’ve found it now, and uploaded a zip file, so you can look at it if you want to.

I can’t imagine why you would want to; I don’t think you’ll learn anything. But people ask for the Pengi code often enough that maybe this will satisfy some curiosity. Sonja has a somewhat fancier version of essentially the same architecture.

The rest of this post is in two parts: first, brief reflections on the code; second, musings about what might be done now to follow up the project.

The code

I’ve spent a couple hours reading through it. I was pleasantly surprised, not having looked at it in almost thirty years. There is more of it, it’s more sophisticated, and it’s cleaner than I was expecting.

You won’t be able to run it. It has extensive external dependencies that aren’t available. That includes third-party packages, internals of the Symbolics Genera operating system as of 1990, and even hardware aspects of the Symbolics 3600 architecture.

The 3600 ran at about three million instructions per second, with floating point operations much slower—maybe around 1 megaflops. (The laptop I’m writing this on can do more than ten billion instructions per second, with a GPU that runs at about 1 teraflops—a million times faster.)

In retrospect, trying to do AI on a machine like that seems crazy. (Maybe thirty years from now, 2020 attempts to do AI on a petaflops supercomputer will seem crazy in retrospect too.)

But it’s worse than that, even. Many of the code comments are complaints about how slow the operating system’s graphics operations were. A significant fraction of the Sonja development effort was spent figuring out how to bypass them so that even just the video game Sonja played could run in real time. As an extreme example, the window system used the hardware BITBLT instruction to write to the memory-mapped display frame buffer. The microcoded 3600 BITBLT did a lot of array bounds checking and other overhead that made it unusably slow, so I had to write a stripped-down version (the function my-bitblt in amazon.lisp). When the system developers learned that I’d made a video game run on the machine, they considered it a somewhat improbable feat.
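For flavor, here is a toy sketch, in portable Common Lisp rather than Genera internals, of what a stripped-down bit-block transfer amounts to: copy a rectangle of bits with no bounds checking and no mode dispatch. The function and the example are invented for illustration; the real my-bitblt wrote machine words directly into the memory-mapped frame buffer, rather than looping over individual bits as this does.

    ;; Toy illustration only -- not the actual my-bitblt from amazon.lisp.
    ;; Copy a WIDTH x HEIGHT block of bits from SRC at (SX,SY) to DST at (DX,DY),
    ;; skipping the bounds checks and generality that made the system BITBLT slow.
    (defun naive-bitblt (src sx sy dst dx dy width height)
      (declare (type (simple-array bit (* *)) src dst)
               (type fixnum sx sy dx dy width height)
               (optimize (speed 3) (safety 0)))  ; trust the caller, as the stripped-down version did
      (dotimes (y height dst)
        (dotimes (x width)
          (setf (aref dst (+ dy y) (+ dx x))
                (aref src (+ sy y) (+ sx x))))))

    ;; Example: stamp a solid 16x16 "sprite" onto a 200x320 "screen" at (10,10).
    (let ((screen (make-array '(200 320) :element-type 'bit :initial-element 0))
          (sprite (make-array '(16 16)  :element-type 'bit :initial-element 1)))
      (naive-bitblt sprite 0 0 screen 10 10 16 16))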

So anyway, reading the code gave me warm fuzzies, but I can’t imagine that it will do you any good. If you find it interesting, though, leave a comment—I’d like to hear why!

Then what?

Sometimes I hear from AI researchers who want to incorporate the insights of Pengi/Sonja, or even to pick up where we left off. I’m flattered by this, but mostly I don’t think it’s a good idea. I left AI in 1992 largely because I felt that this work had reached a dead end, and couldn’t see any other way forward.

On the other hand, it’s an approach to AI that hasn’t been much explored since, whereas most others have been exhaustively tried and found unworkable. So there may still be productive research possible, along lines I missed. Also, large enough quantitative differences can be effectively qualitative. Access to a million or more times more computer power than I had around 1990 might blow holes in what looked like solid walls.

Vision

Sonja tried to illustrate three different ideas. The first was a model of intermediate vision that enabled task-specific visual routines using general-purpose, neuroscientifically motivated mechanisms. I think this is an important idea that could be followed up now, and that has been almost entirely neglected, for no good reason.

This wasn’t my idea; it was Shimon Ullman’s, drawing heavily on Anne Treisman’s research on visual attention. There’s been nearly no follow-up to Ullman’s work. Visual psychophysicists have continued working in Treisman’s paradigm, but mostly on fiddly details of her theory, rather than on the broader issues Ullman raised.
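To make the flavor concrete, here is a toy sketch (not Sonja's code; all the names and representations are invented) of the basic structure: a handful of general-purpose primitive operations over an early-vision buffer, composed on demand into a task-specific routine.

    ;; Toy sketch of the visual-routines idea. The primitives are stubs;
    ;; in a real system they would operate on early-vision output.
    (defstruct visual-buffer
      (image   nil)    ; 2D array produced by early vision
      (markers nil))   ; locations this routine has marked for later reference

    (defun index-salient-location (vb)
      "General-purpose primitive: pick out a salient location (stub)."
      (declare (ignore vb))
      (cons 0 0))

    (defun mark-location (vb loc)
      "General-purpose primitive: remember LOC so later steps can refer back to it."
      (push loc (visual-buffer-markers vb))
      loc)

    (defun trace-from (vb loc)
      "General-purpose primitive: follow structure (a contour, a path) from LOC (stub)."
      (declare (ignore vb))
      (list loc))

    (defun locate-and-trace (vb)
      "A task-specific routine: nothing new in it except the composition."
      (trace-from vb (mark-location vb (index-salient-location vb))))

The point is only the division of labor: the primitives are fixed and general, while the routines that sequence them are assembled per task.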

Revisiting these ideas now might be productive. One could bring to bear not only GPU power, but also some insights from the past few years’ work in “deep learning” image processing. My take-away from that (grossly overhyped) effort is that surface texture information is much more discriminating than most vision researchers had expected. An intermediate vision system that took input from a DL-derived texture representation might be startlingly effective.

I’m not sure that would be a good thing. I left AI partly because I didn’t see a way forward, but also because I was increasingly concerned that progress would be misused. I was working on machine vision, and the possibility of using facial recognition for pervasive surveillance worried me. It’s too late to halt research on that now. However, I’d ask vision researchers to think seriously about other possible abuses of their work, and to consider the possibility of a voluntary moratorium.

(Inter-)Action

Pengi and Sonja illustrated an interactionist approach to effective activity. We intended this as a critique of, and alternative to, the then-dominant representationalist approach.

The critique was effective, in the sense that our work (and that of a few other researchers who had come to similar conclusions) mostly killed off the representationalist program in AI and cognitive “science.” That was one cause of an “AI winter” of greatly reduced funding.

So the alternative wasn’t pursued as thoroughly as it might have been. Work in robotics has broadly followed this approach, though.

One of the conclusions of my thesis—which surprised me at the time—was that perception is the hard problem. Deciding what to do, which rationalism takes as the hard problem, is easy by comparison, once you can see what’s going on around you.

Self-driving car researchers have re-learned this lesson recently. Mostly the errors self-driving cars make come from failing to see what they should, or from misperceiving it, rather than from making bad decisions given what their visual systems tell them.

I continue to believe interactionism is a critically important way of rethinking our unthought assumptions about human being. This is a major theme of In The Cells Of The Eggplant. I think it’s probably not worth pursuing in AI research unless and until vision systems improve dramatically.

Instruction

My main goal in writing Sonja was to illustrate ethnomethodological ideas about instructed action. This was intended to be its main advance over Pengi. I mostly didn’t get to that, because building its visual system (much more complex than Pengi’s) took most of the time. Sonja does incorporate some key ethnomethodological insights (indexical reference, conversational repair), but only very crudely.

Again, these ideas are a major theme in The Eggplant. I don’t think there’s much point in pursuing them in AI. There have been attempts to incorporate them into conversational interfaces for software; but I think conversational interfaces are probably just a bad idea. (Clippy must die.)