Stephen Wolfram: Multiparadigm Man

Michael Swaine, Dr. Dobb’s Journal 18 (January 1993) 109–112.

Stephen Wolfram is very bright. The kind of bright that impresses Nobel laureates like Richard Feynman. The 33-year-old full professor at the University of Illinois has been making original contributions in physics since his first published paper at age 16. His pioneering work in cellular automata opened up the study of complex phenomena.

He’s also impatient. Deciding that he needed a better tool for doing mathematics than what then existed, he built one, much as Donald Knuth built TeX so his published papers would be more attractive. Unlike Knuth, however, Wolfram erected his no-compromises system in an impressively short time while simultaneously putting together a company to support and market it. In four years, Wolfram Research has grown to over 100 employees, and Mathematica, Wolfram’s “system for doing mathematics,” is in use on Sun workstations, NeXT machines, Macintoshes, and PCs in over 70 countries.

This two-part column chronicles the conversation Ray Valdés and I had with Wolfram in the DDJ offices on a variety of topics, from programming paradigms to the thought processes of mathematicians.

DDJ: It’s hardly a new idea to incorporate a programming language into an application program. Lotus 1-2-3 has its macro language and dBASE its database language. But the language aspect of Mathematica strikes us as considerably more ambitious. It’s even taught as a first language in some college programs. Did you set out to create an application program or a programming language?

SW: I viewed the intellectually most significant [part] of the enterprise as being the creation of the elements of a programming language.

DDJ: And yet you sell it as an application.

SW: It has to do with the practical problem of introducing programming languages. Programming languages are a surprisingly slow-moving field. Fortran was invented before I was born and C is more than 20 years old now. It’s kind of strange, in a world where computer hardware and the uses that computers are put to have advanced so rapidly, that programming languages have advanced so slowly. If you have some ideas about how programming languages should be set up, and you want people to actually try using them, there’s a question of how you get [them] to do that. Once people have gotten used to using a programming language, you have to do an awful lot to convince them that they should switch to something else. We were lucky. People started off using Mathematica like an extremely enhanced calculator. And if you get a few hundred thousand people using your thing for whatever reason, then you have a reasonable community to work on in developing the language for its own sake.

DDJ: What does “for its own sake” mean?

SW: If you ask yourself, “What are the languages that have a chance in the next century?” there aren’t very many of them. And I think that Mathematica has more than a chance. That means that we have an example of a language that has pretty modern ideas—it is certainly a big step beyond C and Fortran—and that is already widely used today. One of the things that I consider an exciting direction is to what extent we can expand the use of the language itself, independent of the application side of Mathematica. We’ve considered making a thing that will probably be called M, that is essentially Mathematica without the mathematics.

DDJ: You’ve considered it. But how seriously?

SW: We’ve built little Ms. There is no doubt that Mathematica without the mathematics will exist one day. The main issue for us is to figure out how it makes sense to distribute the thing. Right now there are particular application areas where people have written programs in Mathematica that don’t use the mathematical side of Mathematica, and those are the places where you start. But I believe that every application program should have a language underneath it, and it would be great if that language was a modern, highly capable language, not an imitation of Basic or some specially crafted language that just does things for databases, for example. That’s the niche I’m interested in seeing the Mathematica language go into in the future.

DDJ: You mean extension languages, like macro or scripting languages?

SW: Exactly. The issue is, if you’re programming a spreadsheet, would you prefer to be programming in a sort of Mathematica language or in Lotus macros? And the answer is not too hard to figure out. Lotus macros are fine if you’re just doing a few simple things, but if you actually want to write a serious program, they’re far from fine. My interest in this direction is to see if one can use the Mathematica language as a basis for a wide range of different kinds of application programs. And the symbolic nature of the Mathematica language is crucial in that.

DDJ: How so?

SW: If you have a word processor and you want to represent a paragraph, that’s an easy thing to do in a symbolic language. It’s a pretty hard thing to do if you’re stuck with a language like Basic. The other point is that it’s only in the next couple of years that it becomes realistic to have a language as sophisticated as Mathematica underneath applications. By the time the M language is likely to be out, it will be a small fraction of the size of the typical application. Mathematica itself is a pretty huge program, but the language part of it is not so big.

DDJ: Could we go a bit more into the virtues of symbolic languages like Mathematica vs. procedurally based languages like Basic?

SW: When you’re working with a procedurally based numerical language, there’s a lot of mysterious hidden state associated with what’s happening. For example, you have a standard program written in C, and you have various data structures, and you have subroutines that call each other and pass pointers to these data structures. If you want to look at one subroutine on its own and see what it’s doing, [to] feed this kind of input in and see what comes out, that’s pretty difficult to do in C. But in a symbolic language there’s no [problem], because whatever input might be given, you can always explicitly write it down; whatever output might come out, you can always explicitly see it. It’s always the same kind of object, always a symbolic data structure that you can explicitly see. There’s no idea that it’s some sort of mysterious pointer encoded in such and such a way.

DDJ: Don’t symbolic languages have a name-of operator, a reference operator?

SW: If you’re using that stuff when you’re programming in Mathematica, you’re almost certainly doing something wrong. What’s great about programming in Mathematica is that you don’t have to think about any of that stuff. Everything you pass around is explicitly right there. It’s essentially passed by value.

DDJ: But there is a reference construct.

SW: Not really. There’s no need. Now in Mathematica there are ways of passing structures unevaluated. And there are some purposes for which—for example, when you do assignments—you need that. The left-hand side has to remain unevaluated; otherwise the wrong thing will happen. I would love to figure out a way to avoid having to do that. I haven’t succeeded. But in doing general programming in Mathematica you shouldn’t ever have to keep things unevaluated.

DDJ: We’ve worked some with Mathematica, but many of our readers haven’t. We really should step back and talk about what sort of language Mathematica is: the ideas and paradigms it embodies and where it came from. Maybe you could tell us about the intellectual roots of Mathematica.

SW: I got to do a test run of some of the ideas in Mathematica in a system called SMP that I built in the late ’70s or early ’80s. It was more oriented toward computer algebra; it wasn’t as ambitious a system as Mathematica. What I did there was a very educational experience. I tried to impose on people what I thought to be a good but rather unusual model of programming.

DDJ: What was that?

SW: The model of programming was that of pattern matching and transformation rules. Pretty much everything in that system was done with pattern matching and transformation rules. If you were going to write programs in SMP they pretty much had to be in that paradigm.
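
As a rough illustration of the pattern-matching and transformation-rule model described above, here is a minimal Python sketch. It is not SMP or Mathematica code: the nested-tuple representation, the three rules, and the names `rewrite_once` and `rewrite` are all assumptions made for this example. Each rule inspects an expression and either returns a replacement or declines, and the rules are applied repeatedly until nothing changes.

```python
# Hypothetical representation: an expression is a nested tuple
# (head, arg1, arg2, ...); atoms are plain strings or integers.

def rule_add_zero(expr):
    """Pattern Plus[x, 0] -> x."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "Plus" and expr[2] == 0:
        return expr[1]
    return None  # pattern did not match

def rule_mul_one(expr):
    """Pattern Times[x, 1] -> x."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "Times" and expr[2] == 1:
        return expr[1]
    return None

def rule_mul_zero(expr):
    """Pattern Times[x, 0] -> 0."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "Times" and expr[2] == 0:
        return 0
    return None

RULES = [rule_add_zero, rule_mul_one, rule_mul_zero]

def rewrite_once(expr):
    """Rewrite the arguments first, then try each rule at this node."""
    if isinstance(expr, tuple):
        expr = (expr[0],) + tuple(rewrite_once(arg) for arg in expr[1:])
    for rule in RULES:
        result = rule(expr)
        if result is not None:
            return result
    return expr

def rewrite(expr):
    """Apply the rules repeatedly until the expression stops changing."""
    while True:
        new_expr = rewrite_once(expr)
        if new_expr == expr:
            return expr
        expr = new_expr

# x*1 + y*0 simplifies to x
simplified = rewrite(("Plus", ("Times", "x", 1), ("Times", "y", 0)))
```

The whole program is a list of rules plus a generic engine that applies them. That is the style SW says programs in SMP "pretty much had to be in."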

DDJ: The late ’70s and early ’80s would have been about the time Clocksin and Mellish were bringing Prolog to a wider audience with their book. Were you influenced by Prolog at that time?

SW: No, actually I wasn’t. I had never written a program in Prolog. I’d read the manual. The main thing that I was trying to do was to imitate what seemed to be what happens when you do mathematical calculations; that is, that you are continually applying rules of mathematics. The transformation-rule model has not been widely adopted. Prolog was an attempt to adopt it.

DDJ: An attempt? You consider it a failure?

SW: Prolog [has a] fatal flaw. A language where fundamental operations give you no clue as to how long they might take or what’s going on isn’t going to cut it. You have to give the user a reasonable conceptual model of what the computer is doing. It doesn’t matter if they’re a factor of ten wrong in knowing how many instructions it’s going to take, but it does matter if they can’t estimate whether this is an exponential time algorithm or something else.

DDJ: How did SMP influence Mathematica?

SW: One of the ideas I had in SMP was, “Figure out a good programming paradigm and just stick to it.” This was a mistake. I think it’s not a trivial mistake. You might think, “If there is a natural way to specify how programs should work, that maybe hooks into some way that has to do with how the brain processes ideas about things, then you should just figure out that way and stick to it.” But it turns out that while there are some kinds of programs that can be written very nicely using this [transformation rule] paradigm, there are others that are horrendous to write using it, but that are straightforward to write using, say, procedural programming or functional programming.

DDJ: So you built multiple paradigms into Mathematica?

SW: What I decided to do in building Mathematica, and have been very happy with, is to admit that there is going to be more than one paradigm for writing programs. Then the trick is to put in those paradigms in such a way that the edges fit together properly, so that you can move easily from one paradigm to another. So you can have pure functions and have them interact with transformation rules and interactive procedural programming and so on, and have a fairly seamless interface.

DDJ: In the development of Mathematica, were you explicitly thinking in those terms?

SW: Oh yeah. The development of Mathematica was in some ways boring because it was extremely deliberate. I knew what I was trying to do and what the steps were. It hasn’t been as educational as it might have been because it’s gone pretty much as expected. It has been interesting, by the way, to look at the programs people actually write in Mathematica. The idea of multiple paradigms really works out, because if you look at the programs, there are some that are 20,000 lines of transformation rules and that work just great in that form, and there are others that are a bunch of functional programming constructs and again work just great in that form, and then there are still other things that people end up writing as procedural programs, though it’s rarely a good idea.

DDJ: What kinds of design ideas went into the writing of Mathematica?

SW: One way I tried to design Mathematica was the following: Think about computations that one wants to do, and think about well-defined chunks of those computations that one could give a definite name to and do lots of times. A very simple one might be Nest, a function in Mathematica that is sort of an iteration construct. There are a lot of programs one writes where one wants to do that, so it makes sense to give that thing a definite name, and say, “This is a chunk of computation that this language provides a primitive for doing.” In a sense it’s like [making] up the instruction set for a RISC machine. So [in developing] Mathematica I wrote a lot of sample programs in Mathematica, and my principle was: if I keep on having to use an idiom, it should have a name.
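
To make the "idiom with a name" point concrete, here is a small Python sketch of the iteration construct mentioned above. In Mathematica, `Nest[f, x, n]` applies `f` to `x` a total of `n` times, and `NestList` also keeps the intermediate values. The Python functions `nest` and `nest_list` below are illustrative stand-ins written for this example, not Mathematica's implementation.

```python
def nest(f, x, n):
    """Apply f to x repeatedly, n times, and return the final value."""
    for _ in range(n):
        x = f(x)
    return x

def nest_list(f, x, n):
    """Like nest, but return the starting value and every intermediate result."""
    results = [x]
    for _ in range(n):
        x = f(x)
        results.append(x)
    return results

# Doubling 1 ten times gives 2**10.
doubled = nest(lambda v: 2 * v, 1, 10)

# Squaring 2 three times, keeping each step.
squares = nest_list(lambda v: v * v, 2, 3)
```

Without the named primitive, each caller would write out the same loop by hand. Naming it turns a recurring idiom into a single word of the language.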

DDJ: That’s one design principle. Were there others?

SW: One principle is to keep the number of fundamentally different ideas fairly small, and then with each of those ideas to pin a lot of actual elements of the system on top of [it], because if you pin enough stuff on top of an idea, people are going to have to understand that idea to use the system. One of the mistakes that one has to fight in designing is to say, “For this particular thing we want to do, maybe there’s a nice mechanism we can make up, a special mechanism, say, for the way Poisson series work.” This will be a big mistake, because nobody will understand this mechanism. But if you have that mechanism be the mechanism that’s used for all list-like objects, say, then anybody who can use the system is going to understand the basic mechanism. Moreover, their understanding of the mechanism is going to grow if they see it used in a whole variety of different places.

DDJ: Is there anything you’d do differently if you were writing Mathematica today?

SW: Were I to build Mathematica again I would probably have 5 percent less stuff in it.

DDJ: What would you leave out?

SW: I’ll give you an example of something that I put into Mathematica that I thought was a good idea but that turned out not to be. It was this function called Short. It just has to do with printing out expressions…

DDJ: With the head and tail kind of thing?

SW: Yeah, it’s actually a bit cleverer than that. It goes through the expression [as] a tree and it has a certain amount of energy that starts off at the top of the tree, and it allocates the energy in different ways as it goes down the branches of the tree. It does a fairly nice job of showing you the structure of the expression with some little ellipses. As I say, it seemed like a good idea. The only catch is, nobody uses it. I haven’t used it in eons. Why do people not use it? I don’t know. But that’s an example of a “Designers Beware.”
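
The interview gives only the outline of this scheme: a budget of "energy" starts at the root of the expression tree and is divided among the branches. The Python sketch below is one guess at how such an allocation could work, not a reconstruction of Mathematica's actual algorithm. The nested-tuple representation, the cost of one unit per compound node, the even split among arguments, and the name `shorten` are all assumptions made for illustration.

```python
def shorten(expr, energy):
    """Render a nested-tuple expression, eliding subtrees once energy runs out.

    Assumed policy (hypothetical): atoms are always printed; a compound node
    costs one unit of energy, and whatever remains is split evenly among its
    arguments. A compound node that receives less than one unit is shown as
    an ellipsis.
    """
    if not isinstance(expr, tuple):
        return str(expr)  # atom: always shown
    head, args = expr[0], expr[1:]
    if not args:
        return f"{head}[]"
    if energy < 1:
        return "..."  # not enough energy left to open this subtree
    share = (energy - 1) / len(args)
    return f"{head}[" + ", ".join(shorten(arg, share) for arg in args) + "]"

# a + b * c^2 as a tree
tree = ("Plus", "a", ("Times", "b", ("Power", "c", 2)))

full = shorten(tree, 10)   # enough energy to show everything
medium = shorten(tree, 3)  # the innermost Power is elided
small = shorten(tree, 1)   # only the top level survives
```

With this policy, the top of the tree is always favored over deep branches, so the printed form keeps the overall structure while the detail is replaced by ellipses.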

DDJ: Can you ask users for their feedback about design features like that during the design process?

SW: If you ask a user, “What do you think of the design of such and such an aspect of Mathematica?” the chances are that you won’t get a sensible answer. If the person actually uses it, they’ll say, “Yeah, I can get my work done with it.” And they will have adapted to the language to make it work for what they want done. If you talk to people who work on the theory of programming-language design, they have all kinds of things to say, but I don’t believe their theories, so I’m not interested in them.

DDJ: You’ve spent a significant amount of time doing language design. What does a language designer really spend the bulk of the time doing?

SW: Almost all the time is spent trying to simplify the construct one comes up with. You start off with this idea about what capability you want it to have. Then the trick is, find the simplest, most transparent way to represent that.