Stephen Wolfram on the Quest for Computable Knowledge

Stephen Wolfram recently received an award for his contributions to computer science. The following is a slightly edited transcript of the speech he gave on that occasion.

I want to talk about a big topic here today: the quest for computable knowledge. It’s a topic that spans a lot of history, and that I’ve personally spent a long time working on. I want to talk about the history. I want to talk about my own efforts in this direction. And I want to talk about what I think the future holds.

So what do I mean by “computable knowledge”? There’s pure knowledge—in a sense just facts we know. And then there’s computable knowledge. Things we can work out—compute—somehow. Somehow we have to organize—systematize—knowledge to the point that we can build on it—compute from it. And we have to know methods and models for the world that let us do that computation.

Well, I think in history the first really big step in this direction was taken a really long time ago—with the invention of counting and arithmetic. The big idea that we know pretty much existed by 20,000 BC was that you could just abstractly count objects, independent of what the objects were. And then that there were definite unchanging rules of arithmetic that could let one abstractly compute things.

But of course just counting things is a very coarse form of systematic knowledge. Human language lets us describe much more, but it isn’t systematic—it doesn’t allow us to go directly from our knowledge to computing new things. But it was still a crucial step in perhaps 4000 BC when written language first emerged—and it became possible to systematically record and transmit knowledge about things.

It didn’t take long before numbers and writing led to kings in Babylon making pretty broad censuses of people and commodities. From which at least it was possible to compute taxes. But when it came to working out more about what would happen in the world—well, probably most people just assumed it was all just fate, and that nothing much could be predicted.

Thousands of years went by. But then something happened. People had known that there were regularities to be seen if not on earth, at least in the heavens. And then it was realized that one could use arithmetic—the same arithmetic that worked for commerce and for land surveying—to predict things about the heavens. To work out the behavior of the planets, and even to say things about spectacular events like eclipses. It was the beginning of the tradition of exact science as we know it.

Of course, it wasn’t at all clear where the boundaries were. Things worked in predicting the heavens, so why not predict the weather, the rise and fall of kingdoms, and everything? Of course, that didn’t work so well. But still, with people like Pythagoras around 500 BC, it seemed that nature, and music, and much more—even if not human affairs—could perhaps be described, and computed, using numbers. There were other possibilities too, though, even at that time.

And indeed on the other side of the world, probably around 400 BC, Panini was coming up with rules—not numbers, just rules—that described the grammar of Sanskrit. Going from human language—and finding a formal way to describe its structure—and in effect to use that to compute the poetic forms that could be produced. That idea did reappear a few times in scientific history; for example, Lucretius talked about how atoms might make up the universe as letters make up words and sentences. But for practical purposes, the notion of creating formal systems from the structure of languages was lost for more than 2000 years.

And instead, what emerged in Greek times—probably around 350 BC—was the idea of logic. The notion—found in the works of Aristotle—that the structure of human arguments, of human reasoning, can be represented in a stylized form—using logic. The idea that just as numbers let one abstractly count things, so logic could let one abstractly see the structure of certain forms of deduction and reasoning—and rhetoric. So that one could find ways to derive conclusions about the world by structured, formal, reasoning.

But what really took off from that idea was not really general reasoning—but instead specific forms of reasoning about arithmetic and geometry—about mathematics. And quite quickly—with Euclid and so on—there began to be a whole tradition of finding the truths of mathematics by derivation and by proof.

At the same time, though, there was another tradition. The tradition of trying to organize knowledge—of making lists and categories of things—like in Aristotle. And of just outright collecting knowledge, like at the library of Alexandria, where, as it happens, Euclid worked.

Well, in a sense these were academic, philosophical pursuits. But by 200 BC people like Archimedes were putting mathematics—and computation—firmly into practice in creating technology. And it’s looking as if Archimedes probably did another important thing too: he really started mechanizing the doing of computation. You know, I’ve spent much of my life working in that kind of direction, and it’s fun to realize that I’m in a very old business. Because it’s looking as if Archimedes may well have started building gear-like devices that could do computation—say astronomical predictions—22 hundred years before Mathematica and so on.

Well, after all the successes of Greek philosophy and mathematics and so on, one gets the impression that lots of people thought that everything that could be figured out had been figured out. And nobody pushed hard for a long time to do more. Still, just like in Babylonian times, there were people trying to compute—to predict—more about the world. A lot of it was hocus pocus. A notable example was Ramon Lull, from around 1300, who invented a whole combinatorial scheme for generating possible ideas, and explaining what could happen in the world. That didn’t work so well. But still, there was a general feeling that the kind of systematic derivations that existed in mathematics should somehow be applicable to at least some of the goings-on in the natural world.

And by the end of the 1500s—with Galileo and so on—there was a notion that physical processes could be understood in the “language of mathematics”. The big breakthrough, though, was Isaac Newton in 1687 introducing what he called “mathematical principles of natural philosophy”. Really pointing out that things in “natural philosophy” could be worked out not by some kind of humanlike reasoning, but rather by representing the world in terms of mathematical constructs—and then using the abstract methods of mathematics (and calculus and so on) to work out their behavior.

Why it worked wasn’t clear. But the big fact was that it seemed to be possible to work out all sorts of unexpected things—and get the right answers—in mechanics, both celestial and terrestrial. And it was that surprising success that has propelled mathematics as the foundation for exact science for the past 300 years.

But back to the main story. A lot of relevant stuff was going on at the end of the 1600s. In the 1660s, people like John Graunt were in effect inventing statistics. Going beyond the Babylonian census to have more abstract mathematical representations of features of states—and things like life tables. And around 1700 there was something else: Gottfried Leibniz talking about his “characteristica universalis”.

You see, Leibniz had started taking seriously the idea that it might be possible to really make knowledge computable. He wanted to invent a universal symbolic language that could represent everything. And then effectively to apply methods of logic and mathematics to this symbolic representation—to resolve all human arguments. He started building clockwork computers. He tried to persuade the leaders of his time to start collecting knowledge in big libraries. But even though I think he had very much the right idea—one just couldn’t get there with the technology of 1700. It was a few hundred years too early.

But even though Leibniz’s big idea didn’t work out, the notion of systematizing knowledge was becoming more and more popular. The British Museum was founded in 1753, as a kind of universal collecting place—not of knowledge, but of actual things, natural and artificial. In 1750, Carl Linnaeus went beyond Aristotle, and came up with the modern scheme for systematically classifying living organisms. In the mid-1700s encyclopedias—like Encyclopedia Britannica—were getting founded and there was an effort to collect everything that was known into systematic books. Meanwhile, on the computation side, mathematics was doing pretty well, both in terms of abstract theorem development, and practical use in physical science and engineering. There were efforts to make it more systematic.

In the 1830s Charles Babbage got serious about automating the computation and printing of mathematical tables, and started imagining a kind of universal “analytical engine”, which, as Ada Lovelace described it, could “weave algebraical patterns just as the Jacquard loom weaves flowers and leaves”. Then in the 1850s George Boole pointed out that logic was really just like mathematics—and that one could use mathematical methods to work out questions in logic. And somehow it was beginning to seem that mathematics had a great deal of universality.

Indeed, between non-Euclidean geometry in the 1820s, abstract algebra in the mid-1800s, and transfinite numbers in the 1880s, it had begun to seem like mathematics was a kind of universal framework for abstraction. In 1879 Gottlob Frege came up with predicate logic, in effect trying to find a way to represent general truths, whether in mathematics or elsewhere. Mathematics was doing so well in physics and engineering. So many new theoretical areas were springing up from algebra, calculus, geometry and so on. It must have seemed in the late 1800s as if the whole world would soon be described and worked out in terms of mathematics.

There were efforts—by Peano and so on—to come up with a definitive axiomatization of mathematics. And there was an increasing conviction that the methods of Euclid—starting from axioms and then systematically deriving theorems—would unravel all of mathematics, and perhaps all of science. By 1910, there were efforts like the Whitehead-Russell Principia Mathematica, where the notion was to formalize mathematics in terms of logic, and then, in effect, just build up everything from a modest set of axioms. And David Hilbert had the idea that really mathematics should be almost mechanical: that one could in effect just churn out all truths automatically.

Well, along with all this theoretical and foundational activity, there were also practical things going on. Starting in the mid-1600s—possibly with precursors going back to Archimedes—mechanical calculators were increasingly developed, and by the end of the 1800s they were commonplace. The idea of the Jacquard loom—and of things like player pianos—had introduced a notion of programming: having punched cards that could specify what operations the mechanical device should perform. And very gradually that idea began to be generalized.

There had emerged from calculus and so on the notion of an abstract mathematical function: f(x), where somehow the function itself was a bit like the variable. But just what could a function be? Perhaps it was something built from logic-like constructs, as in the work of Moses Schönfinkel on combinators. Perhaps something built with rewrite rules, as in the work of Emil Post. Perhaps something built with the operation of “primitive recursion”—in effect a kind of arithmetic recurrence.

All of these seemed like possible representations, but none seemed fundamental. And with the discovery of the Ackermann function in the 1920s, for example, it became clear that at least primitive recursion wasn’t the complete story of reasonable functions. But from all this, there was at least emerging a notion of being able to treat functions like numbers and other data. Actually, Leibniz had already suggested numbering possible logic expressions way back in 1679.
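
As an illustration added to this transcript, here is a minimal Python sketch of the Ackermann function. It assumes the two-argument form commonly attributed to Péter and Robinson, which is the version most often shown today and is a simplification of Ackermann's original three-argument function. The function is defined for every pair of non-negative integers and can be computed mechanically, yet it grows too fast to be built from primitive recursion alone.

```python
def ackermann(m, n):
    """Two-argument Ackermann function (Peter-Robinson form, assumed here).

    Defined for non-negative integers m and n. The nested recursive call in
    the last line is what takes it beyond primitive recursion.
    """
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    return ackermann(m - 1, ackermann(m, n - 1))


if __name__ == "__main__":
    # Keep the arguments small: values and recursion depth grow very quickly.
    for m in range(4):
        print(m, [ackermann(m, n) for n in range(4)])
```

Only small arguments are practical with this direct recursive form; larger ones quickly exceed Python's default recursion limit.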

What did this have to do with the foundations of mathematics? Things like set theory had developed, and there had started to be all sorts of brewing paradoxes and things as set theory started to try to talk about itself. But still it seemed that with fancy enough footwork Hilbert’s idea of systematically finding all truths in mathematics could be saved. But that all changed in 1931 with the arrival of Gödel’s Theorem.

Gödel started by taking one of those paradox-like statements: “This statement is unprovable.” But then he did something very interesting. He showed that that statement could be expressed as a statement in arithmetic. He found a way to set up equations about integers and other constructs in arithmetic, and have them represent his at-first-not-mathematical-seeming statement. That had a big consequence: it showed that with mathematics itself, one could set up statements that one could then show couldn’t be proved or disproved within mathematics. Mathematical statements about which mathematics just didn’t have anything to say.

Well, that was pretty interesting for the philosophy of mathematics. But in some sense it was a technicality in Gödel’s proof that really changed the world. You see, to show what he showed, Gödel in effect ended up inventing programming. Because to show that his funny statement could be represented in terms of arithmetic, he actually showed that any of a huge class of statements could also be represented that way.

At first it wasn’t incredibly clear what the significance of this was. Whether it was just a technical detail of the particular mathematical systems—things like “general recursive functions”—that Gödel had considered. But what happened next was that in 1935 Alonzo Church came up with lambda calculus—and then in 1936 Alan Turing came up with Turing machines.

Both of them were in effect trying to come up with ways to describe everything that could reasonably be computed. Turing’s scheme was the clearest. He in effect had a way of describing a procedure for computing things—potentially by machine. One might have thought that to compute different things, one would always have to have a different machine. One square-roots machine. Another exponentials machine. And a quite different machine to do logic puzzles. But the crucial thing that Turing showed was that in fact there are universal machines—which can just be programmed to do any of these operations, or in fact to emulate any other Turing machine.
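
The idea that one machine can be programmed to behave like many others can be sketched in a few lines of Python. This is an illustration added to the transcript, not Turing's construction: the simulator below is an ordinary Python function rather than a universal Turing machine, and the rule format, the state names "start", "carry" and "halt", and the two example machines are all assumptions made for the example. What it does show is a machine's rules being handed over as plain data, so that the same simulator runs quite different machines.

```python
def run_turing_machine(rules, tape, state="start", blank="_", max_steps=10_000):
    """Simulate a one-tape Turing machine and return the final tape contents.

    rules maps (state, symbol) -> (symbol_to_write, head_move, next_state),
    where head_move is -1, 0 or +1. The machine stops in the state "halt".
    The head starts on the leftmost character of `tape`.
    """
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    pos = 0
    steps = 0
    while state != "halt":
        if steps >= max_steps:
            raise RuntimeError("step limit reached before halting")
        symbol = cells.get(pos, blank)
        new_symbol, move, state = rules[(state, symbol)]
        cells[pos] = new_symbol
        pos += move
        steps += 1
    if not cells:
        return ""
    lo, hi = min(cells), max(cells)
    return "".join(cells.get(i, blank) for i in range(lo, hi + 1)).strip(blank)


# Hypothetical machine 1: add one to a binary number.
INCREMENT = {
    ("start", "0"): ("0", +1, "start"),   # scan right to the end of the number
    ("start", "1"): ("1", +1, "start"),
    ("start", "_"): ("_", -1, "carry"),   # step back onto the last digit
    ("carry", "1"): ("0", -1, "carry"),   # 1 + 1 = 0, carry continues left
    ("carry", "0"): ("1", 0, "halt"),     # absorb the carry and stop
    ("carry", "_"): ("1", 0, "halt"),     # number was all 1s: new leading digit
}

# Hypothetical machine 2: flip every bit.
FLIP = {
    ("start", "0"): ("1", +1, "start"),
    ("start", "1"): ("0", +1, "start"),
    ("start", "_"): ("_", 0, "halt"),
}

if __name__ == "__main__":
    print(run_turing_machine(INCREMENT, "1011"))  # prints 1100
    print(run_turing_machine(FLIP, "1011"))       # prints 0100
```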

At first it wasn’t clear quite what the significance of this was. After all, perhaps there were different ways to construct machines that would have different properties. But meanwhile, people were constructing more and more machines that did practical computations. Beyond calculators, there had been Hermann Hollerith’s census-counting machine. There were starting to be machines for doing logic, and for combinatorially breaking codes. There were machines for doing more and more elaborate equation solving. And by the 1940s, electronics was becoming established as the underlying technology of choice.

There was also increasing knowledge of the physiology of the brain, and it was beginning to look as if somehow it might all just work with electronics. And by the 1940s, people like McCulloch and Pitts were using Turing’s universal machine idea as evidence that somehow neural circuits could successfully reproduce all the things brains do. Particularly through John von Neumann that idea then merged with practical work going on to try to make programmable electronic computers. And the result was that the theoretical idea of universal computation got turned into the idea of producing universally programmable electronic computers—and the idea of software.

In the early days of electronic computers, it was pretty much assumed that computers were somehow like electronic brains, and that eventually computers would be able to take over brain-like work, just like mechanical machines had been able to take over most mechanical work. Movies began to portray computers that could be asked questions, and compute answers. There didn’t seem to be any doubt that all knowledge would soon be computable—even if perhaps computers would show signs of the “simple logic” on which they were based—somehow being more “robotic” in their answers than humans would be.

There were some practical efforts to actually start working out how computable knowledge would be set up. Following up on crowdsourced projects like the assembly of the Oxford English Dictionary at the end of the 1800s, there were projects like the Mundaneum—which in particular tried to collect all the world’s knowledge on 12 million index cards, and be able to answer questions sent in by telegraph. By 1945, Vannevar Bush was talking about the “memex” that would provide computerized access to all the world’s knowledge. And by the mid-1950s, the idea of artificial intelligence was becoming all the rage. In a sense artificial intelligence was thought of a bit like mathematics: the idea was to make a general-purpose thinking machine, that could start just from nothing, and learn and figure out anything, just like humans—rather than to make something that started from a large corpus of existing knowledge or methods.

One particular direction that was pursued a lot was handling human language. In the mid-1950s, at almost exactly the same time, two important things had happened. Noam Chomsky had suggested the idea—in a sense finally a followup to Panini from 400 BC—that the grammars of human languages could be represented in algorithmic form. And at almost exactly the same time, the idea had been introduced of constructing languages for computers in algorithmic form.

But despite the idea of an algorithmic structure to human language, it proved a lot more difficult than expected to do actual computations with human language—whether for language transformation, information retrieval or whatever. Computer languages were a much better idea, though. They provided precise formal specifications for what computers should do, given in a form that was close enough to human language that people could learn them a bit like the way they learn human languages.

Meanwhile, through the 1960s, 1970s, and 1980s, computers were gradually becoming more and more powerful, and more and more common. There were ideas like relational databases. Then there were applications like graphics and word processing. And the concept that had begun as a kind of footnote to technical work on the foundations of mathematics had become a central part of our world.

But still, the early idea that computers would be like brains—and that all the world’s knowledge would be computable—just hadn’t happened. There were many places computers were extremely useful. But there was more to do. Just how far can the notion of computation go? How much can one compute? About the world? About human knowledge?

Well, I’ve thought about these things a lot over the years. In fact, the three large projects of my life have all in a sense been concerned with aspects of this very question.

In Mathematica, for example, my goal has been to create a framework for doing every possible form of formal computation. Mathematica is in a sense a generalization of the usual idea of a computer language. In a sense, what Mathematica tries to do is to imagine all possible computations that people might want to do. And then to try to identify repeated structures—repeated lumps of computational work—that exist across all those computations. And then the role of the Mathematica language is to give names to those structures—those lumps of computational work. And to implement them as the built-in functions of the system.

I wanted Mathematica to be a very general system. Not a system that could just handle things like numbers, or strings, or even formulas. But a system that could handle any structure that one might want to build. So to do that I in effect went back to thinking about the foundations of computation. And ended up defining what one can call unified symbolic programming. One starts by representing absolutely everything in a single unified way: as a symbolic expression. And then one introduces primitives that represent in a unified way what can be done with those expressions.
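
A rough feel for what "everything is a symbolic expression" can mean is given by the Python sketch below. It is an illustration added to this transcript and says nothing about how Mathematica is actually implemented. The tuple encoding, the head names ("Plus", "Times", "List", "Circle") and the helper functions are all assumptions made for the example. The point is only that one uniform tree structure, plus a generic operation that walks it, can handle arithmetic and non-arithmetic content alike.

```python
import math


def map_all(f, expr):
    """Apply f to every subexpression, innermost first.

    An expression is either an atom (number or string) or a tuple
    (head, arg1, arg2, ...). This encoding is hypothetical.
    """
    if isinstance(expr, tuple):
        expr = (expr[0],) + tuple(map_all(f, arg) for arg in expr[1:])
    return f(expr)


def fold_numbers(expr):
    """Example rewrite rule: combine the numeric arguments of Plus and Times."""
    if not isinstance(expr, tuple) or expr[0] not in ("Plus", "Times"):
        return expr  # leave every other kind of expression untouched
    combine, identity = (sum, 0) if expr[0] == "Plus" else (math.prod, 1)
    numbers = [a for a in expr[1:] if isinstance(a, (int, float))]
    others = [a for a in expr[1:] if not isinstance(a, (int, float))]
    value = combine(numbers)
    if not others:
        return value
    if value == identity:
        return others[0] if len(others) == 1 else (expr[0], *others)
    return (expr[0], value, *others)


if __name__ == "__main__":
    algebra = ("Plus", 1, ("Times", 2, 3), "x")
    drawing = ("List", ("Plus", 1, 2), ("Circle", ("Plus", "r", 0)))
    print(map_all(fold_numbers, algebra))  # prints ('Plus', 7, 'x')
    print(map_all(fold_numbers, drawing))  # prints ('List', 3, ('Circle', 'r'))
```

The same `map_all` traversal works on the algebraic expression and on the list containing a drawing primitive, because both are built from the one kind of object.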

In building Mathematica over the past 23 years one of the big ideas has been to include in it as much—in a sense formal—knowledge as possible. The methods, the algorithms, the structures that have emerged throughout the fields of mathematics and computation. It’s really been an interesting thing: we have this very unified environment, and as we add more and more to it, it’s a kind of recursive process. Because it’s unified, and because in a sense so much of what it does is automated, the new things we add get to build on everything that’s there before. In a sense we get to see the generality of the idea of computation; we get to use it to create ways to handle even more kinds of things.

You know, it’s funny how long it takes for paradigms to sink in. The basic ideas for the unified symbolic programming that exists in Mathematica I came up with nearly 30 years ago. But every few years I realize more that one can build with those ideas. Mathematica started with technical and mathematical computation. But it’s turned out that its foundations are general enough to go far beyond that. To make a lot more things computable in systematic ways.

Well, one of the reasons I wanted to build Mathematica in the first place was that I wanted to use it myself. To explore just what the broad implications are of the fundamental idea of computation. You see, while computation has been of great practical importance—even in science—there’s a lot more to explore about its implications for the foundations of science and other things. If we’re going to be able to do science—or in general to make knowledge systematic—we kind of have to imagine that there are ultimately theories for how things work. But the question is: what are the primitives, what’s the raw material, for those theories?

Well, in the exact sciences there’s been a lot of success over the past 300 years with theories based on mathematics. With taking those ideas of numbers, and algebra, and calculus, and so on—and building theories of the world out of them. That’s certainly the way the big successes of areas like physics have worked. But of course there’s a lot in the world that we don’t yet know how to explain. There’s a lot in nature, for example, that just seems somehow complex, and beyond our theories.

But the big thing that I started asking myself nearly 30 years ago now is whether the reason it seems that way is just that we’ve been thinking too narrowly about our theories.

Maybe it’s not enough to use the primitives that we happen to have developed in the course of mathematics. Maybe the world doesn’t happen to work that way. Well, what other primitives could we use? In the past, we would have had no idea. But now that we understand the notion of computation, we do have an idea. What about just using simple programs? What about basing our theories and models not just on the constructs of math, but on the general kinds of rules we find in programs?

Well, OK. But what kind of programs?

Normally when we use programs we do it in a very engineering kind of way. We set up particular programs—usually very complicated ones—that perform particular tasks we want. Well, my idea of nearly 30 years ago was to ask, what if we just look at the world of programs as we look at an area of natural science? If we just see what’s out there in the computational universe of possible programs? Let’s say we just start with the simplest possible programs, and see what they do.

I happened to study a lot some systems called cellular automata, that just consist of rows of black and white cells, with little rules for how the colors of the cells should be updated. Well, at first I’d assumed that if the rule for the cellular automaton was simple, its behavior would somehow have to be correspondingly simple. I mean, that’s the intuition we tend to have from everyday life, and from today’s engineering and so on. If you want to make something complicated, you have to go to a lot of effort. And you have to follow complicated rules and plans. But I decided that I should just do the experiment and see what was true. Just try running every possible simple cellular automaton program, and see what it did.

And the result was really, really surprising. And it kind of shattered my intuition about how things work. Because what I found was that even very simple programs—started off in the simplest way—could produce incredibly complex behavior. Patterns that if you saw them you’d say, “That must have been produced by something really complicated.”
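
The experiment is easy to repeat. The Python sketch below is an illustration added to this transcript. It assumes the simplest setting: one row of cells, each new color depending on the cell and its two immediate neighbors, with the 256 possible rules numbered 0 to 255 in the standard way. The wrap-around edges, the row width, and the choice of rule 30 for the demonstration are assumptions made for the example.

```python
def step(cells, rule):
    """Apply one update of an elementary cellular automaton.

    cells: list of 0/1 values (0 = white, 1 = black), with wrap-around edges.
    rule:  integer 0..255. Bit k of the rule is the new color for the
           neighborhood whose (left, center, right) cells spell k in binary.
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def run(rule, width=63, steps=30):
    """Start from a single black cell and return the list of successive rows."""
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(steps):
        row = step(row, rule)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Rule 30 is one commonly cited rule with complicated-looking behavior.
    for row in run(30):
        print("".join("#" if cell else " " for cell in row))
```

Looping `run` over all values from 0 to 255 reproduces the "try every possible simple program" experiment; many rules give uniform or repetitive patterns, and a few do not.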

Well, so what I found, by doing in effect empirical computational science, was that in the computational universe there’s incredible richness in a sense very near at hand. I think this is a pretty fundamental thing. And actually I think it explains a pretty fundamental observation about our world.

You see, even though when we build things it always seems to take a lot of effort to build something that is complicated, nature doesn’t seem to work that way. Instead, it seems as if nature has some kind of secret that effortlessly produces all sorts of complexity. And I think we now know what that secret is. It’s that nature is sampling all sorts of programs in the computational universe. But even though the programs are simple, they just don’t always happen to be ones whose behavior is simple. They don’t happen to be the programs that correspond to things we’ve built with our mathematics and our traditional mathematical science.

Ultimately it’s not that you can’t build complexity from mathematical primitives and so on. But what’s happened is that the exact sciences have tended to just define themselves to be about cases where that doesn’t happen. We haven’t studied the full computational universe of possibilities, only a thin set that we’ve historically found to be tractable.

Well, this has many implications. It gives us a “new kind of science”—as I pointed out in the title of the big book I wrote about all this. A kind of science that in a sense generalizes what we’ve had before. That uses a much broader set of primitives to describe the world.

Already that science has had lots and lots and lots of applications. All sorts of new models of natural and man-made phenomena. Where the foundation is not a mathematical equation, but a computational rule. The science has shown us new ways to think about all sorts of things. Not just traditional science. Also technology—creating things that go beyond what one can foresee from traditional engineering. In art—capturing the essence of the richness that seems to give nature its aesthetic. Even in philosophy, thinking about old questions like free will. Maybe the science will even get us to the ultimate point in the quest for knowing about the world: being able to give us an ultimate fundamental theory of physics.

You know, the way physics has gone in the last few hundred years, it seems kind of hopeless to even imagine that one might be able to find a truly fundamental theory of physics. It seems like at every stage, models in physics have just been getting more and more complicated. And one might assume, given the obvious complexity that we see in our universe, that there’s no possibility for there to be an ultimate simple theory of it all.

But here’s the critical thing one learns from studying the computational universe: if one just samples possible programs, even very simple ones can have great richness in their behavior. So then the question is: in the computational universe of possible programs, just where might the program for our universe lie? Is it in effect a giant program? Or something tiny? That we might be able to find just by searching the space of programs. Well, I don’t know for sure. Though a crucial fact about our universe, long noted by theologians, is that it certainly isn’t as complicated as it could be. There’s at least some order in it. And maybe that means we can find a program for it that’s really small.

So what’s actually involved in universe hunting? I think one has to concentrate on in a sense very abstract models, where there aren’t built-in notions of space, or time, or matter, or really anything that’s too familiar from our existing physics. But if one just starts enumerating very simple sets of rules, the remarkable thing that happens is that one starts finding candidate artificial universes that at least aren’t obviously wrong—obviously not our universe.

There’s one big problem with all this. A fundamental phenomenon I call computational irreducibility.

You see, once we start thinking in computational terms, we start to be able to ask some fundamental questions. Traditional theoretical science has been very big on the idea of predictability: of somehow working out what systems will do. Well, in the past it always seemed reasonable to assume that if the science was done properly, that kind of prediction would be possible. That somehow the scientist would be able to be smarter than systems in nature, and work out what they would do more efficiently than they do it themselves. But one of the big discoveries from what I’ve done is what I call the Principle of Computational Equivalence. Which says that as soon as a system isn’t just obviously simple in its behavior, it will be as sophisticated computationally as any other system.

So this means that the scientist will just be equivalent to the system being studied—not smarter than it. And that leads to what I call computational irreducibility: that a great many systems in effect perform computations that we can’t reduce, or predict. That we just have to simulate to see what they will do. Well, that’s obviously a problem for universe hunting. Because we can’t expect to simulate every step of the evolution of our universe.

And indeed what happens is that you get candidate universes that flap around in all sorts of complicated ways. They could actually be our universe. But computational irreducibility makes it really hard to tell. It can even be like Gödel’s theorem: it becomes undecidable whether a particular candidate universe has a particular property like our real universe. Still, I think it’s quite possible that we’ll be lucky—and be able to find our universe out in the computational universe. And it’ll be an exciting moment—being able to sort of hold in our hand a little program that is our universe. Of course, then we start wondering why it’s this program, and not another one. And getting concerned about Copernican kinds of issues.

Actually, I have a sneaking suspicion that the final story will be more bizarre than all of that. That there is some generalization of the Principle of Computational Equivalence that will somehow actually mean that with appropriate interpretation, sort of all conceivable universes are in precise detail, our actual universe, and its complete history.

But those are issues for another day. For now the main point is that from NKS—this “new kind of science”—one learns just how much of the world can really be thought of in computational terms. There are limits from computational irreducibility to how easy it is to work out consequences. But there’s much more than just traditional science that can be represented in computational terms.

Well, back to the main thread of computable knowledge. In the last decade or so a lot of practical things have happened with it. The web has arisen, putting huge amounts of material in computer-readable form. Wikipedia has organized a lot of material—encyclopedia style—and has codified a lot of “folk knowledge” about the world. And search engines have arisen, giving one efficient ways to find anything that’s been explicitly written down on the web—fact, fiction, or otherwise. Meanwhile, all sorts of things that at one time or another were considered tests for artificial intelligence—playing chess, doing integrals, doing autonomous control—have been cracked in algorithmic ways.

But the general problem of making the world’s knowledge computable—of making computers really act like the science-fiction computers of the 1950s—has always seemed too difficult. Or at least that’s the way it seemed to me. And I guess every decade or so I would wonder again: are we ready to try to make the world’s knowledge computable yet? And I would say, “No, not yet.” We’re closer than Leibniz was. But we’re not there yet.

Well, a few years ago I was thinking about this again. And I realized: no, it’s not so crazy any more. I mean, I have Mathematica, which gives a foundation—a language for representing knowledge and what can be computed about it. And from NKS I understood a lot more about what can be represented computationally, and just how simple the underlying rules to produce richness and complexity might be. And thinking that way got me started on the third big project of my life—which has turned into Wolfram|Alpha. And in fact, today [June 15, 2009] it’s exactly one month since Wolfram|Alpha was first launched out into the world.

The idea of Wolfram|Alpha was to see just how far we can get today with the goal of making the world’s knowledge computable. How much of the world’s data can we curate? How many of the methods and models from science and other areas can we encode? Can we let people access all this using their own free-form human language? And can we show them the results in a way that they can readily understand? Well, I wasn’t sure how difficult it would be. Or whether in the first decade of the 21st century it’d be possible at all. But I’m happy to say that it worked out much better than I’d expected.

Of course, it’s a very long-term project. But we’ve already managed to capture a decent amount of all the systematic information that you’d find in a standard reference library. We’ve managed to encode—in about 6 million lines of Mathematica code—a decent slice of the various methods and models that are known today.

And by using ideas from NKS—and a lot of hard work—we’ve been able to get seriously started on the problem of understanding the free-form language that we humans walk up to a computer and type in. It’s a different problem than the usual natural-language processing problem. Where one has to understand large chunks of complete text, say on the web. Here we have to take small utterances—sloppily written questions—and see whether one can map them onto the precise symbolic forms that represent the computable knowledge we know.

Before we released Wolfram|Alpha out into the wild, we had to try to learn that pidgin language people use—by looking at corpora of questions and answers and statements. But in the past month we have seen something wonderful: we have hundreds of millions of actual examples of humans communicating with the system. So—like a child learning language or something—we can now start to learn just how to understand what we’re given.

Well, so what will happen with Wolfram|Alpha, and this whole quest to make the world’s knowledge computable? Wolfram|Alpha will get better—hopefully quite quickly. So that one will really be able to ask it more and more of those questions one might have asked a science-fiction computer of the 1950s.

It’s interesting to see how it compares to the early ideas of artificial intelligence. In a sense, those concentrated on trying to make a computer operate like a person—or a brain—but bigger, faster, and stronger. To be able to work things out by reasoning. A bit like pre-Newtonian natural philosophy. But what we’re trying to do with Wolfram|Alpha is to leverage all the achievements of science and engineering and so on. We’re not trying to fly by emulating birds; we’re just trying to build an efficient airplane. So when you ask Wolfram|Alpha to work out something in physics or math or whatever, it’s not figuring out the answer by reasoning like a person—it’s just trying to blast through, using the best methods known to our civilization, to get the answer.

It’s not obvious that in 2009 computers would yet be powerful enough to get lots of answers in what amounts to a human reaction time. But they are. And that—combined with the existence of the web, and being able to deliver answers on it—is what makes Wolfram|Alpha possible.

What of the future? Well, Wolfram|Alpha is in a sense taking existing knowledge, and encoding it in computable form, and computing answers from it. And almost everything it’s asked is unique—it’s never been asked the same question before, and nobody’s ever written down the answer on the web before. It’s getting figured out in real time, when it’s asked. And it’s in effect coming to new conclusions that have never been seen before.

But in a sense Wolfram|Alpha is very traditional in its knowledge: it’s using models and methods and structures that already exist—that are already part of the existing canon of science, engineering, and so on. But what about inventing new models and methods on the fly? Well, in a sense NKS gives a clear direction for doing that. We can just look out there in the computational universe and see if we can find things that are useful to us.

At first, that might seem crazy. By sampling a space of simple programs, how could we ever get anything rich enough to be useful for practical purposes? Well, that’s a key lesson from NKS: out in that universe of simple programs are lots with all sorts of elaborate behavior. Maybe even rich enough to be our whole universe. But certainly rich enough to be useful for lots of purposes.
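
[Editorial illustration, not part of the speech: a minimal Python sketch of what "sampling a space of simple programs" could look like. It assumes that the simple programs are elementary cellular automata, each identified by a rule number from 0 to 255; that choice, and the names `step`, `run`, and `render`, are assumptions made for this example and are not taken from the talk.

```python
def step(cells, rule):
    """Apply one update of an elementary cellular automaton.

    Each new cell depends on its left neighbor, itself, and its right
    neighbor. The three bits form an index 0..7 into the binary digits
    of `rule`. Edges wrap around (cyclic boundary), an assumption made
    here to keep the sketch short.
    """
    n = len(cells)
    return [
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]


def run(rule, width=63, steps=30):
    """Evolve `rule` from a single black cell and return every row."""
    cells = [0] * width
    cells[width // 2] = 1
    rows = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        rows.append(cells)
    return rows


def render(rows):
    """Draw rows as text, '#' for 1 and '.' for 0."""
    return "\n".join("".join("#" if c else "." for c in row) for row in rows)


if __name__ == "__main__":
    # Sample a few of the 256 possible rules and look at what they do.
    for rule in (0, 204, 90, 30):
        print(f"rule {rule}")
        print(render(run(rule)))
        print()
```

Under these assumptions, rule 0 dies out at once and rule 204 simply copies its input, while rule 90 draws a nested triangular pattern and rule 30 produces a pattern that looks irregular. The point of the sketch is only that the programs are equally short in every case; the variety is in the behavior.]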

Traditionally in doing engineering—or science—we tend to want to construct things step by step, understanding how we will achieve our goals. But NKS tells us that there’s a much richer supply of things out there in the computational universe—just ready to be mined. It’s a bit like the situation with physical materials: we make technology in effect by going out in the material world and seeing materials with certain properties, and then realizing that we can harness them for our particular technological purposes. And so it is in the computational universe.

In fact, increasingly in Mathematica we are using algorithms that were not constructed step-by-step by humans, but are instead just found by searching the computational universe. And that’s also been a key methodology in developing Wolfram|Alpha. But going forward, it’s something I think we’ll be able to use on the fly.

We did an experiment a few years ago called WolframTones. That samples the computational universe of simple programs to find musical forms. And it’s remarkable how well it works. Each little program out there defines a sort of artificial world, with its own definite rules that we recognize as “making sense”, but which has enough richness in its behavior to make it seem interesting and aesthetically engaging to us.

Well, we can do the same thing with modeling, with recognizing patterns, with constructing engineering objects. Just going out into the computational universe and mining it for things that are useful to us. Mass customizing not only art, but also scientific theories and engineering inventions. Even mathematics.

You know, if you look at all the mathematics that we do today, its axioms fit comfortably on a page or two. And from that page or two emerge all the millions of theorems that constitute the mathematics of today. But why use the particular axioms that we have used so far? What happens if we just start looking at the whole space of possible axiom systems? In a sense at the space of possible mathematics? Well, it’s pretty interesting out there. All sorts of different kinds of things happen.

If one looks at the history of mathematics, one gets the impression that somehow the mathematics that’s been studied is all there could possibly be. That we’ve reached the edge of the generalization—the abstraction—that can be done. But from studying the space of all possible mathematics, it’s clear that isn’t true. One can find our areas of mathematics in there. Logic, for example, turns out to be about the 50,000th axiom system in the space of all possible axiom systems.

But there doesn’t seem to be anything special about the axiom systems we’ve actually used. And in fact, what I suspect is that really the choice of them is just a reflection of the history of mathematics: that everything we see today is what we get by generalizing specifically from the arithmetic and geometry that were studied in ancient Babylon. That history of mathematics has informed what we’ve been able to do in theoretical science. But there’s so much more out there. That we can see in the computational universe. And that maybe we’ll even be able to find and explore on the fly with a future Wolfram|Alpha.

So what will the future hold for all of this? We’ll be able to compute from the knowledge that our civilization has accumulated. We’ll be able to discover and invent more and more on the fly. And in time I expect that more and more of what we have in the world will end up being “mined” from the computational universe.

What will this mean? Well, right now the things we build in engineering are usually sort of limited to things where we can understand each part of what we do. When we write programs, we do it one piece at a time, always setting it up so that—at least apart from bugs—we can foresee what the program will do. But increasingly, the technology we use will come from mining the computational universe. We’ll know what the technology does—we’ll know its purpose—but we won’t necessarily understand how it does it.

Actually, it’ll be a little like looking at systems in nature. There’ll be the same kinds of complexity. Actually probably more. Because even in an area like biology, where there’s a lot happening even at a molecular scale, the complexity is in many ways limited by the effects of processes like natural selection. But we won’t have those kinds of constraints when we get our technology from the computational universe. We’ll potentially be able to get the most efficient, most optimal, version of everything. And it almost always won’t have any of the simplicity—the identifiable pieces and mechanisms—that exist in our technology today.

So then the big question is: what will the technology do? What purposes will we find for our technology? It’s a very interesting thing, to look at our technology today, and to ask what people at other times in history would have thought about it. Because what we realize is that not only would they not understand how it works. But they also would not understand why anyone would make it. The purposes evolve as the technology evolves.

And as we look at the future of computable knowledge, we may ask what purposes will be found for it. Some we can foresee. But many, I suspect, we cannot. Today our technology is rife with history: we see particular mechanisms over and over again because they were invented at some point, and propagated into the future. But as more of our technology is found—in a sense from scratch—in the computational universe, less history will be visible. But where the thread of history remains—and the arc of our civilization continues to be visible—is in the purposes that are found for things. There is much that is possible; but what we choose to do depends on the thread of our history.

Today is an exciting moment—indeed, I suspect, a defining moment in human history. When we are making a transition from a world in which computation is just one element to a world in which computation is our central concept. In which our thinking, our actions, and our knowledge all revolve around computation.

And what makes this possible is that we are finally now getting into a position where we can take all that knowledge that we as humans have accumulated in our civilization, and encapsulate it in an active computational form. And when we do this, we make it possible to dramatically extend and generalize all our achievements so far.

In a sense our whole history so far has been played out in a tiny corner of the computational universe. But we are now in a position to take all of our knowledge and achievements, and go out and colonize the whole computational universe, extending our human purposes and experience to an unrecognizably broad domain.

I feel extremely fortunate to live at a time in history when all of this is unfolding, and to be in a position myself to contribute to what can happen. I thank you all here for recognizing the journey I have taken so far. And I look forward to everything that will be possible in the years to come.

Note: a summary timeline of the quest for computable knowledge, based on Stephen Wolfram’s earlier notes, is available here.