An Interview with Stephen Wolfram

Paul Wellin, Mathematica in Education 2 (1993) 11–16.

Stephen Wolfram is certainly no newcomer to our readers. As founder and president of Wolfram Research, Inc., he has been involved in almost every aspect of the phenomenal growth of this five-year-old company. Wolfram began his scientific career as a research physicist and quickly moved into areas of mathematics and computer science. He is the recipient of numerous awards and honors, among them a MacArthur Prize Fellowship in 1981.

In this interview, conducted in a small cafe in Northern California, Wolfram discusses his educational background, some of the history of Mathematica, and his view of its role in education. His opinions on the current state of mathematics and computer science education will be sure to disturb some, while others will find confirmation of what they already knew or suspected.

PW: Let’s start out talking about your own education. What originally got you involved in science and math?

SW: Well, I went to fine English schools, and learned lots of useful things there about how to write good English prose and so on. But my interests in science had pretty much nothing to do with my official school education. As it happens, I got interested in science very young—by the time I was ten, I was already reading lots of physics books. Mostly what happened was that I would get excited about particular areas of science, then I would try to read everything I could about those areas. All along, the thing that got me really excited was looking at problems that hadn’t been solved before. It turns out that when you are fourteen years old and thinking about physics, it isn’t too hard to find problems that—at least as far as you know—haven’t been solved before. I suppose the approach actually worked out fairly well—I started publishing physics papers when I was fifteen.

PW: You went to Oxford for college, but never finished there?

SW: That’s right.

PW: So what led from there to Caltech?

SW: I was pretty much on the track of doing particle physics research, and being a physics undergraduate at Oxford wasn’t a particularly useful environment in which to do particle physics research. Since I had the opportunity fairly easily to go to graduate school in the U.S., I decided to do that and I chose to go to Caltech.

The first year that I was at Caltech was the year that I had the highest rate of publishing papers of any time in my life. I actually think that on average, I was turning out a particle physics paper every few weeks. My main conclusion was that I did in fact know how to do particle physics research, so I collected together some of those papers and made a Ph.D. thesis out of it. I ended up getting my Ph.D. when I was just 20. In later years, I’ve realized that it was a big mistake not to make the effort to get my Ph.D. a few weeks earlier: it would be so amusing to say that one got one’s Ph.D. when one was a teenager!

Anyway, after I got my Ph.D. I started thinking about doing things other than particle physics. At the time, I was involved in doing various particle physics calculations which involved very complicated algebraic expressions. I ended up trying to use Macsyma® to do these things. I had already been using Macsyma for several years for a variety of purposes. But my big disappointment was that after having written an incredibly ugly giant piece of code to do particle physics calculations, it in the end didn’t work properly because of various limitations in Macsyma.

As a result of that experience, I decided that it should be possible to do something better than Macsyma. So my first step was to talk to the folks who had originally written Macsyma and try to persuade them that it was time to build a second generation system. What ended up happening, though, was that the older people who were involved in the project said, “Well, you’re probably right that we could do a lot better if we started again, but we’re too old to consider doing that.” The younger people said, “No, no, Macsyma is the best thing you could possibly do along these lines; you could never do better.” I didn’t really believe that, and so I embarked on what became the SMP project, which was an effort to build a really powerful algebraic computation system. One of the very important things that happened in the course of building SMP was that I realized that there was a much richer style of programming that could be used when doing symbolic computations, rather than the Pascal-Algol-Fortran-like programming that was, for example, built into Macsyma.

PW: Was your primary interest at this point to use SMP for particle physics or did you have broader goals?

SW: Well, it was pretty clear that just to do particle physics, one was going to have to have a very broad range of capabilities. And if one was going to go to all the effort to build such a system, the system had better be as general as possible.

PW: The early computer algebra systems—Macsyma, SMP, etc.—were they used purely as research tools or was there some notion that they might be used in the classroom also?

SW: At that time it was not really practical to think of using these things in the classroom. SMP was built to run on the then-emerging class of mini-computer systems such as VAXes, and VAXes were in the multi-hundred thousand dollar price range, so it wasn’t realistic to think of those things being used in a serious way in the classroom.

PW: How did the jump from SMP to Mathematica come about?

SW: After I finished being involved in the SMP project, I got interested in trying to solve what I think is one of the more important fundamental problems in science—how complexity arises in nature. I worked for a number of years on that, and made rather good progress, using what one can think of as an experimental mathematics approach: taking simple computational systems and seeing what they do, and trying to develop theories on the basis of those observations. One of the things that happened in doing that, was that I realized that the main limiting factor in the science I was doing was the time it took to prepare each experiment. I think I’m a fairly good C programmer—by now I’ve written a significant fraction of a million lines of C code—but it was still taking me many hours to set up a particular experiment by creating a new C program. One of the things that I realized was necessary to make more progress in that kind of research was to develop a system that could allow one to interactively do high-level programming to specify one’s problems in a way that was as close as possible to how one thought about them, and not to have to go through this rather painful step of writing fairly low-level C programs. So that was one issue I was confronted with.

The other thing was that around 1986, I realized that there were going to be personal computers powerful enough to run a fairly general, fairly sophisticated computational system; I thought that was a very interesting intellectual and business opportunity. So I decided to build Mathematica. As it turned out, the timing was very good. At the time when Mathematica came out, we were just on the part of the curve where Macintoshes, and soon PCs, were powerful enough to run that kind of thing.

There were a lot of issues that came up in thinking about how Mathematica could be applied to education, and whether it would be applied to education. In the computer industry, it was believed at that time that selling programs for the educational market was a waste of time, that this was not a business proposition. In fact, that belief even extended to selling programs to the research part of the university community. Much of the early thinking about Mathematica, as it was presented to the computer industry, was how this should be used by engineers, and not what could happen with it in academia. In fact, I consider it one of the business achievements of Mathematica to have shown that it is a meaningful business proposition to make programs where significant effort is put into meeting the needs of academia.

I didn’t know how effectively Mathematica would catch on in education. In fact, I was at first pessimistic about the number of years it would take before there was really significant usage of Mathematica in education. What I thought in the beginning was that in five to ten years, there would be significant stuff going on with Mathematica in education. I was very pleasantly surprised that within one or two years the early adopters were already doing very interesting things with Mathematica.

PW: Were there more institutions involved in the early development processes, other than the University of Illinois?

SW: There were certainly many research users, many of whom do teaching as well. I think that there was a surprisingly quick realization that Mathematica was relevant to the whole calculus reform movement. We were very lucky, though, that Horacio Porta and Jerry Uhl at Illinois jumped into this whole thing as quickly and as effectively as they did.

PW: The state of science and math in this country is really quite perplexing. On the one hand, we have some of the finest research institutions in the world. On the other hand, it is widely recognized that the teaching of these subjects is sorely in need of repair. Our students often can’t apply their knowledge to relevant problems, let alone use a computer to do any significant work in science.

SW: I remember a couple of early experiences interacting with the “computers and education” crowd at conferences. The experiences varied—from me being very pleasantly surprised at how quickly people seemed to be catching on to the potential for this kind of thing, to complete horror at the fact that people like the ones I was seeing were actually teaching young Americans about science or mathematics.

I continue to be amazed that, in the math educational process, there is such an emphasis on, as I see it, highly esoteric issues of pure mathematics. The notion of proof is an interesting one, but very few people in adult life, so to speak, “do proofs.” That kind of thinking is, I think, most prevalent among mathematicians and lawyers. I think the emphasis on that kind of thing in mathematics education is a consequence of some kind of trickle-down effect from the influence of mathematics research in this century.

PW: As a mathematician, I might argue that you can rest assured that the mathematics you use in your science is secure because the foundations have been tidied up by the diligence of the mathematical community.

SW: I guess I have a stronger belief in “truth” than I do in “proof.” As experimental mathematics becomes more widespread, the divergence between truth and proof, in mathematics, will become larger. If you look at any area of science, there are far more experimentalists than theoreticians. Mathematics is the unique exception to this trend. It is my very strong suspicion that within a few decades, things will have switched around—there will be, as there are in physics, more experimental mathematicians than theoretical mathematicians. With luck, that change will reflect itself in parts of the educational process of mathematics—and I think that will be very healthy, because in my opinion, the fraction of people who are in a position to appreciate pure mathematics is very small. I don’t think I’m one of them, for example, even though I am certainly a fairly serious user of mathematics. The idea of presenting mathematics in education as being about proofs is really the wrong thing.

PW: Is this why computer science and mathematics departments have diverged so strongly in the recent past?

SW: One of the biggest mistakes of research mathematics in America in the last 50 years has been to let computer science get away. If you look at what was done when computing was young, there was a strong and definite strand of computing that was essentially part of mathematics. The mathematicians rejected it: this was a big mistake. While there is a certain track of computer science which is basically computer engineering, the fact that computing and mathematics ended up being adversaries rather than being close intellectually was a big mistake of the mathematics community.

When Mathematica was quite young, and I talked to people in the computer industry about doing mathematics on computers, they said to me, “Why would anybody want to do that?” It’s quite ironic, considering that in the early days of computing with von Neumann, Turing, and others, one of the original conceptions (at least one of the major tracks) was that one was building these machines to automate mathematics. There was another track saying that one was building these machines to automate what the census bureau does, which is a separate bookkeeping area. But by the time personal computers had come out in the late ’70s and early ’80s, the computer industry had this idea that what computers were used for was word processing, spreadsheets, etc., and the notion that one could use computers to do mathematics was bizarre.

PW: Even with the early computational number theorists such as D.H. Lehmer at Berkeley?

SW: I’m afraid that none of the leaders of the computer industry have probably ever heard of D.H. Lehmer. But there were certainly a small number of mathematicians who had used computers to do essentially experimental mathematics for some time. Even in academic mathematics, though, these people were a tiny corner of the mathematics community.

Recently, I happened to be studying the history of computing, and I’ve really been struck by the fact that in the early writings about computing, particularly in accounts to the general public of computing, it was always about these machines that are capable of automating mathematics. One of the achievements of Mathematica has been to demonstrate that indeed, computers are useful for mathematics. I think it was an unfortunate fact that the mathematics community itself had been so much against computers.

PW: There has been discussion (and hope by some) that mathematics and the rest of the sciences will tend to become less distinct. They are becoming more and more involved in similar computational tasks, albeit on different problems.

SW: With the current system of science in America, I don’t see any mechanism to reduce the rigidity of it. I think it’s a hell of a pity, because more good science and more useful science could be done if there was less rigidity. Over the 15 years or so that I’ve been doing science in America, I’ve just seen increasing rigidification in the funding agencies and the universities. Everything has to fit into a mathematics department or a physics department or whatever else. The early hope that computing would cut across these things and develop more interdisciplinary approaches really hasn’t panned out.

There’s a question that I really don’t know the answer to: “Is ‘computational science’ something that there should be departments of?” Or, “Is ‘computation’ really a tool that should get mentioned in the educational process of all these different areas?” That’s sort of a similar question to how calculus should be dealt with, because calculus can either be taught in a mathematics department as the domain of mathematics, or it can be distributed among the engineering and physics departments. That’s worked differently at different places.

I think that in the case of learning about Mathematica, for instance, that question again comes up. Should Mathematica be taught as a course unto itself, perhaps in the computer science department, perhaps in the mathematics department? Or should it be the case that if you are trying to teach about Mathematica, you spend the first two weeks of the class talking about that, and specialize your discussion to the particular physics course or whatever you are going to give? My guess is that the way things will evolve (or should evolve) is that there will be one central place where people learn Mathematica, just as there is one central place where people learn calculus, and then they can go out and apply it.

PW: It would certainly be a more efficient way to do things.

SW: Yes, but in terms of the rigidity of present-day science, there are two places where there are issues. One is in the research area, the other is in the educational area. To be honest, I see more chance for change in the educational area than in the research area. The research area is so dependent on the structure of funding and things like this, that I don’t see that being something that will change quickly.

In the educational area, I think it is much more plausible that computational science courses will develop that do cut across the very rigid boundaries that exist right now. That seems to be a very encouraging thing.

PW: At present, what is the breakdown of educational vs. research users of Mathematica?

SW: That’s a bit of a difficult question to answer, because when you have a class that uses Mathematica, how do you count the individual students that are going through there? I think that about 40% of the copies of Mathematica that are out there are in the educational sector. About 23% of the revenue that comes in from sales of Mathematica comes from the educational sector.

When I say educational sector, I mean colleges, high schools, and universities, which includes much research usage of Mathematica. It’s hard to come up with exact figures.

PW: I noticed a rather long debate on the nets recently about the current “role” of Mathematica. Some people were arguing that presentation features should not be focused on—that all work should go into algorithm improvement. I am sure that a similar argument could be put forth about the Mathematica language itself as well. In light of your previous statement about who is using Mathematica, what is your view of its present role?

SW: In terms of algorithm development, I am really very satisfied with the point we’re at and the rate at which things are progressing. My big test for these things in terms of, for example, algebraic algorithms, is to be able to clearly say that if there is an integral you can think about doing, then Mathematica will be able to do it better than any person, or any other computer system. This is the same kind of issue as has arisen in playing chess. There’s a point at which eventually the computers are actually just better than people at doing it. And we’re pretty close to that point with many kinds of integrals.

One area in which you will see some significant development is in the area of Mathematica interactive documents. People have talked for quite a few years about “hypertext” and “multi-media” and electronic books, and so on. But there really isn’t a hell of a lot out there that actually makes any sense—except for Mathematica Notebooks. The fact is that for all the hype that has gone into the idea of electronic books in the publishing community and the computer industry, the one example of this that actually seems to be working is Mathematica Notebooks.

There are some things you’d like to be able to do with Mathematica Notebooks that you can’t do now. For example, including beautiful typeset mathematical equations. That is something we are going to make work, and I think in a very nice way.

PW: I’d like to turn to Mathematica as a programming language entity. From an educational point of view, would you put the Mathematica language on a par with Fortran or Pascal?

SW: People might attack me for immodesty, but I think in the present day and age, if you’re teaching general people about programming computers, Mathematica is far and away the best programming language to use, and I’ll tell you why. There is a certain set of people who, when they are grown up, will write things like compilers. Those people need to know C and they need to know how to build parsers. But in the world right now, there are probably only 50 people who write compilers. And probably most of them learned what they needed outside of school, anyway.

What one should be trying to teach when one teaches people about programming is two things. First of all, one should teach them the practicalities of actually doing programming that they might use later on in life. Second of all, one should teach them concepts about what it means to program a computer, and what ways of thinking programming involves.

Taking the second of those things first, teaching the concepts of programming in a language like C or Pascal is crazy. You can only teach a very small subset of what is known today about the ways it makes sense to do programming.

PW: Well, most computer science departments get around this by requiring their students to learn half a dozen different languages.

SW: People who just take a CS 101 type course, they’ll typically learn C or Pascal. The pity about that is, knowing about symbolic computation, functional programming, transformation rules, what it means to do graphics programming—they don’t get any of that stuff in C or Pascal. I think that’s a real pity. Knowing the details of how to do pointer manipulation in C and how to do memory management is absolutely beside the point in the modern world.

It’s a strange anomaly in the history of programming languages that over the last 20 years, there’s been an incredible transformation in the computer hardware systems that exist, and in the kinds of people who use computers and interact with them on a daily basis. Yet in that period of time, the world of programming languages has changed almost not at all. I think it will increasingly change. Already, there are many application programs that have little programming languages attached to them. Some of them are BASIC-like, some of them are like other kinds of things.

Even if you are not actually going to program in Mathematica later, using Mathematica as the language to learn about the ideas of programming is the right thing to do, because it is the broadest of the programming languages. So you can actually get familiar with all these different concepts in this one environment.

Another thing that is very significant is that in Mathematica—because there is no really clear line between its programming side and its computational side—you can immediately do things where you can see the results. You can gently go into programming. The idea that students have to learn #include<stdio.h>, etc.—this kind of strange incantation of having to write at least 20 lines of code to get their first C program working—is really unfortunate. It gives them the idea that programming is much more of a black-magic kind of thing than it actually is. That’s a pity.

In a sense, BASIC was, from that point of view, a much better kind of language. But from the point of view of understanding the concepts of programming, it was fairly weak.

If you learn C, it’s a significant distance between knowing C and being able to simulate a pendulum, for example. Whereas, in Mathematica, if you know the concepts, it is very easy to go and apply that to your physics course right away.

One of the areas that I am most enthusiastic about in terms of the educational development of Mathematica is this area of using Mathematica as an educational programming language. I am particularly hopeful that over the next couple of years, there will be a number of books which will help present Mathematica as an educational programming language. My own take on how these things are taught right now is that, increasingly, courses are starting to be taught in scientific computing. Mathematica is definitely a language that should be used for that. It would be crazy to use anything else at present.

PW: The argument you will get from the computer science department will be, “Our students are not going to be using this language to program when they get out in the real world.”

SW: I think that’s not really a valid argument. First of all, 20% of the users of Mathematica are computer scientists. If you look at the software development companies, I would say that all of the big ones have significant numbers of copies of Mathematica. All of them use it, particularly for prototyping algorithms. When they are building the final production version of something, they’ll often translate it into C and use low-level stuff. But in terms of algorithm prototyping and understanding the structure of algorithms and programs, Mathematica is not only a good tool to use, it is also the tool that actually is being used.

Another argument that people give is, “Gosh, the students should really understand at a low level what the computer is doing.” Well, I don’t disagree about that. You can also say that about all kinds of technical areas. You can say that about mathematics. People shouldn’t use calculators because they should understand at a low level how to do square roots and so on. Well, my contention is that the way you really understand how to do square roots is by using a calculator and seeing what happens, and that after you really understand what good they are and have experimented with them, then you are finally motivated to maybe ask, “How did the calculator manage to do this calculation?” In the case of square roots, I don’t know! I never learned how to take square roots by hand. A couple of times in my life I have studied the algorithm for doing this to implement on the computer, but it is not something I remember or think is significant.

The same thing is true in computing. Knowing how memory allocation is done is something that, once you’ve used computing a lot, you could then get interested to see what the foundations are. If every time we wrote a program, we had to think about what would be the physical addresses that our program will be loaded into memory at—or worse, what voltages would be going high and low in certain transistors in the microprocessor—we wouldn’t get anywhere. The thing that has made computing really take off is the fact that software can be built in layers, and you can assume that there is a lower layer which you really don’t have to worry about.

It would be extremely foolish for the students to decide that this layering effect of software is a bad thing. Quite to the contrary, in terms of understanding, I think the things which are most valuable to teach in a computer science course are the concepts of programming, and those are not well taught by going down to the level of transistors, or for that matter, to the level of C or Pascal.

I must say that I am not a great enthusiast of the academic computer science world. There certainly are some really good things which border on mathematics that are being done there, but I think it is a sign of the weakness of the field when its major concern is defending itself from the outside world. Those fields like physics for instance, that are really quite self-confident and at peace with themselves, don’t worry a hell of a lot about defending themselves from statements such as, “This isn’t physics, it’s engineering, or something else.”

PW: Well, I think you have to remember that computer science is a very young discipline.

SW: It is, but I think it’s made a lot of mistakes on the way. They worry, “How can we teach people about commercial software,” for example, even though commercial software is what it’s about in the world at large. To teach students a toy spreadsheet is stupid. You might as well teach them 1-2-3® or Excel®, because that’s what they’ll be using.

It is a strange and somewhat unhealthy feature of computer science that it is mostly trying to defend itself from the real world, rather than thinking about how it can contribute to the progress of the real world and to educate students to interact with that world.

PW: What are the changes that you envision for Mathematica and Wolfram Research in the next five years?

SW: Certainly our strategy with Mathematica over the years to come will go in several tracks; one of those tracks will be to push the programming language part of Mathematica as a separate entity, independent of the mathematical and calculational capabilities of the system. How exactly that will play out in the computer industry and what all of the arrangements and licensing issues will be, we don’t yet know.

One of the issues I mentioned earlier was this whole question about interactive documents—Mathematica Notebooks built on top of Mathematica—and how those things will develop. One of the big issues is, when people write these things, how are they going to be distributed? With MathSource we started trying to address that issue. I think one of the things that we’ll see is this notion of publishing with Mathematica. This will be increasingly important. In the educational arena, this migration from printed textbooks to on-line Mathematica Notebooks will be something that we’ll see in the next few years.

In terms of Mathematica itself, there will certainly be incremental improvements in the details of the algorithms. There are a number of things that will get developed, such as typesetting capabilities, and so on.

One of the things we’ve seen in education recently is a transition from use on individual machines to the existence of educational labs that could have Mathematica running on all the machines. Now, finally, we’re getting to the point where the generally used machines which exist in universities are powerful enough to run Mathematica. That hasn’t been true until recently. There are a number of things that one will see changing as we adapt, both technologically and from a business point of view, to an environment where that’s possible.