But now the question is: can computers be set up to understand that notation?
That depends on how systematic it really is, and how much the meaning of a piece of math can really be deduced just from the way it's written down.
Well, as I hope I've shown you, the notation we have today has arisen through a pretty haphazard historical process. There have been a few people, like Leibniz and Peano, who've tried to think about it systematically. But mostly it's just developed through usage, pretty much like ordinary human languages do.
And one of the surprising things is that so far as I know, there's almost never been any kind of introspective study done on the structure of mathematical notation.
For ordinary human language, people have been making grammars for ages. Lots of Greek and Roman philosophers and orators certainly talked about them. And in fact, already from around 500 BC, there's a remarkably clean grammar for Sanskrit, written by a person called Panini. Panini's grammar is set up remarkably like the kind of BNF specifications of production rules that we use now for computer languages.
And not only have there been grammars for language; over the last few centuries there have been endless scholarly works on proper language usage and so on.
But despite all this activity about ordinary language, essentially absolutely nothing has been done for mathematical language and mathematical notation. It's really quite strange.
There have even been mathematicians who've worked on grammars for ordinary language. An early example was John Wallis--who made up Wallis' product formula for pi--who wrote a grammar for English in 1658. Wallis was also the character who started the whole fuss about when one should use "will" and when one should use "shall."
In the early 1900s mathematical logicians talked quite a bit about different layers in well-formed mathematical expressions: variables inside functions inside predicates inside functions inside connectives inside quantifiers. But not really about what this meant for the notation for the expressions.
Things got a little more definite in the 1950s, when Chomsky and Backus, essentially independently, invented the idea of context-free languages. The idea came out of work on production systems in mathematical logic, particularly by Emil Post in the 1920s. But, curiously, it took until the 1950s--and two independent inventors--for the idea to crystallize.
Backus applied it to computer languages: first Fortran, then ALGOL. And he certainly noticed that algebraic expressions could be represented by context-free grammars.
Chomsky applied the idea to ordinary human language. And he pointed out that to some approximation ordinary human languages can be represented by context-free grammars too.
Of course, linguists--including Chomsky--have spent years showing how much that isn't really true. But the thing that I always find remarkable, and scientifically the most important, is that to a first approximation it is true that ordinary human languages are context-free.
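The connection between context-free grammars and algebraic expressions is easy to make concrete. Here's a minimal sketch (not Backus's actual grammar for Fortran or ALGOL): a three-rule context-free grammar for simple algebraic expressions, parsed by recursive descent.

```python
# Grammar (one function per production):
#   expr   -> term (('+' | '-') term)*
#   term   -> factor (('*' | '/') factor)*
#   factor -> NUMBER | NAME | '(' expr ')'
import re

def tokenize(s):
    return re.findall(r"\d+|[a-zA-Z]\w*|[()+\-*/]", s)

def parse(tokens):
    pos = 0
    def peek():
        return tokens[pos] if pos < len(tokens) else None
    def take():
        nonlocal pos
        tok = tokens[pos]; pos += 1
        return tok
    def expr():
        node = term()
        while peek() in ('+', '-'):
            node = (take(), node, term())
        return node
    def term():
        node = factor()
        while peek() in ('*', '/'):
            node = (take(), node, factor())
        return node
    def factor():
        tok = take()
        if tok == '(':
            node = expr()
            take()  # consume the closing ')'
            return node
        return tok
    return expr()

print(parse(tokenize("a+b*c")))  # ('+', 'a', ('*', 'b', 'c'))
```

Because multiplication sits one level deeper in the grammar than addition, `a+b*c` groups as `a+(b*c)` with no extra machinery: the grammar itself encodes the precedence.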
So Chomsky studied ordinary language, and Backus studied things like ALGOL. But neither seems to have looked at more advanced kinds of math than simple algebraic language. And, so far as I can tell, nor has almost anyone else since then.
But if you want to see if you can interpret mathematical notation, you have to know what kind of grammar it uses.
Now I have to tell you that I had always assumed that mathematical notation was too haphazard to be used as any kind of thing that a computer could reasonably interpret in a rigorous way. But at the beginning of the 1990s we got interested in making Mathematica be able to interact with mathematical notation. And so we realized that we really had to figure out what was going on with mathematical notation.
Neil Soiffer had spent quite a number of years working on editing and interpreting mathematical notation, and when he joined our company in 1991, he started trying to convince me that one really could work with mathematical notation in a reasonable way, for both output and input.
The output side was pretty straightforward: after all, TROFF and TeX already did a moderately good job with that.
The issue was input.
Well, actually, one already learned something from output: that at least at some level, a lot of mathematical notation could be represented in some kind of context-free form. Because one knew that in TeX, for instance, one could set things up as a tree of nested boxes.
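The nested-box idea can be sketched concretely. RowBox, FractionBox, and SuperscriptBox are real names from Mathematica's box language, but this Python mock-up of them is purely illustrative: the point is just that a 2D layout like x²/(y+1) is a tree, exactly the kind of structure a context-free grammar can describe.

```python
# Toy stand-ins for Mathematica-style boxes (illustrative only).
def row(*items):    return ("RowBox", list(items))
def frac(num, den): return ("FractionBox", [num, den])
def sup(base, exp): return ("SuperscriptBox", [base, exp])

# x^2 over (y + 1), as a tree of nested boxes:
boxes = frac(sup("x", "2"), row("y", "+", "1"))

def render(b):
    """Flatten the box tree back to a linear string (losing the 2D layout)."""
    if isinstance(b, str):
        return b
    kind, items = b
    if kind == "FractionBox":
        return "(%s)/(%s)" % (render(items[0]), render(items[1]))
    if kind == "SuperscriptBox":
        return "%s^%s" % (render(items[0]), render(items[1]))
    return "".join(render(i) for i in items)

print(render(boxes))  # (x^2)/(y+1)
```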
But how about input? Well, one of the biggest things was something that always comes up in parsing: if you have a string of text, with operands and operators, how do you tell what groups with what?
So let's say you have a math expression like this.
What does it mean? Well, to know that you have to know the precedence of the operators--which ones bind tighter to their operands and so on.
Well, I kind of suspected that there wasn't much consistency across all the different pieces of math that people were writing. But I decided to actually take a look. So I went through all sorts of math books, and started asking all kinds of people how they would interpret random lumps of mathematical notation. And I found a very surprising thing: out of many tens of operators, there is amazing consistency in people's conception of precedence. So one can really say: here's a definite precedence table for mathematical operators.
We can say with pretty high confidence that this is the precedence table people imagine when they look at pieces of mathematical notation.
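Once you have a precedence table, parsing from it is mechanical. Here's a sketch of precedence-climbing parsing; the table below is a small illustrative one, not the actual table from the survey.

```python
# Hypothetical precedence table: higher number = binds tighter.
PREC = {'=': 1, '+': 2, '-': 2, '*': 3, '/': 3, '^': 4}
RIGHT_ASSOC = {'^'}  # exponentiation conventionally groups to the right

def parse_expr(tokens, min_prec=0):
    """tokens: a list of operands and operator strings, e.g. ['a','+','b'].
    Consumes the list and returns a nested (op, lhs, rhs) tree."""
    lhs = tokens.pop(0)
    while tokens and tokens[0] in PREC and PREC[tokens[0]] >= min_prec:
        op = tokens.pop(0)
        # For right-associative operators, allow equal precedence on the right.
        next_min = PREC[op] if op in RIGHT_ASSOC else PREC[op] + 1
        rhs = parse_expr(tokens, next_min)
        lhs = (op, lhs, rhs)
    return lhs

print(parse_expr(['a', '+', 'b', '*', 'c', '^', 'd']))
# ('+', 'a', ('*', 'b', ('^', 'c', 'd')))
```

The whole question of "what groups with what" reduces to the numbers in that one table, which is why the empirical consistency of people's precedence judgments matters so much.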
Having found this fact, I got a lot more optimistic about us really being able to interpret mathematical notation input. One way one could always do this is by having templates. Like one has a template for an integral sign, and one just fills stuff into the integrand, the variable, and so on. And when the template pastes into a document it looks right, but it still maintains its information about what template it is, so a program knows how to interpret it. And indeed various programs work like this.
But generally it's extremely frustrating. Because as soon as you try to type fast--or do editing--you just keep on finding that your computer is beeping at you, and refusing to let you do things that seem like you should obviously be able to do.
Letting people do free-form input is much harder. But that's what we wanted to do.
So what's involved in that?
Well, basically one needs a completely rigorous and unambiguous syntax for math. Obviously, one can have such a syntax if one just uses an ordinary computer-language-like, string-based syntax. But then you don't have familiar math notation.
Here's the key problem: traditional math notation isn't completely unambiguous. Or at least it isn't if you try to make it decently general. Let's take a simple example, "i." Well, is that Sqrt[-1] or is it a variable "i?"
In the ordinary textual InputForm of Mathematica all those kinds of ambiguities are resolved by a simple convention: everything that's built into Mathematica has a name that starts with a capital letter.
But capital "I" doesn't look like what one's used to seeing for Sqrt[-1] in math texts. So what can one do about it? Here we had a key idea: you make another character, that's also a lowercase "i" but it's not an ordinary lowercase "i" and you make that be the "i" that's the square root of -1.
You might have thought: Well, why don't we just have two "i" characters, that look the same, exactly like in a math text, but have one of them be somehow special? Well, of course that would be absurdly confusing. You might know which "i" it was when you typed it in, but if you ever moved it around or anything, you'd be quite lost.
So one has to have two "i"s. What should the special one look like?
Well, the idea we had--actually I think I was in the end responsible for it--was to use double-struck characters. We tried all sorts of other graphical forms. But the double-struck idea was the best. Partly because it sort of follows a convention in math of having notation for specific objects be double-struck.
So, for example, a capital R in mathematical text might be a variable. But double struck R represents a specific object: the set of all real numbers.
So then double-struck "i" is the specific object that we call ImaginaryI. And it works like this:
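Unicode in fact has a separate DOUBLE-STRUCK ITALIC SMALL I character (U+2148), so a tokenizer can treat it as an exact constant while a plain ASCII "i" stays an ordinary variable. The `interpret` function here is a hypothetical stand-in, not Mathematica's actual interpreter:

```python
IMAGINARY_I = "\u2148"  # the double-struck i

def interpret(token):
    if token == IMAGINARY_I:
        return 1j                 # the exact constant Sqrt[-1]
    return ("Symbol", token)      # an ordinary variable named "i", "x", ...

print(interpret("i"))             # ('Symbol', 'i')  -- just a variable
print(interpret("\u2148") ** 2)   # (-1+0j)          -- really Sqrt[-1]
```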
Well, this double-struck idea solves a lot of problems.
Here's a big one that it solves: integrals. Let's say you try to make syntax for integrals. Well, one of the key issues is what happens with the "d" in the integral? What happens if perhaps there's a "d" as a parameter in the integrand? Or a variable? Things get horribly confused.
Well, as soon as you introduce DifferentialD, or double-struck "d", everything becomes easy. And you end up with a completely well defined syntax.
We might integrate x to the power of d over the square root of x+1. It works like this:
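Here's a sketch of why the double-struck "d" (Unicode U+2146, DOUBLE-STRUCK ITALIC SMALL D) makes this unambiguous: the parser can split integrand from integration variable purely lexically, even when an ordinary "d" appears as a parameter inside the integrand. The `parse_integral` function is hypothetical, not Mathematica's actual parser:

```python
DD = "\u2146"  # the double-struck d (DifferentialD)

def parse_integral(tokens):
    """tokens for an integral '∫ integrand ⅆ var' -> (integrand_tokens, var)."""
    assert tokens[0] == "∫"
    split = tokens.index(DD)       # unambiguous: only the double-struck d counts
    return tokens[1:split], tokens[split + 1]

# Integrate x^d / sqrt(x+1) with respect to x: the plain "d" in the
# exponent causes no confusion, because only the double-struck d
# marks the integration variable.
body, var = parse_integral(["∫", "x", "^", "d", "/", "sqrt(x+1)", DD, "x"])
print(body, var)  # ['x', '^', 'd', '/', 'sqrt(x+1)'] x
```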
It turns out that there are actually very few tweaks that one has to make to the core of mathematical notation to make it unambiguous. It's surprising. But it's very nice. Because it means you can just enter free form stuff that's essentially mathematical notation, and have it rigorously understood. And that's what we implemented in Mathematica 3.
Of course, to make it really nice, there are lots of details that have to be right. One has to actually be able to type things in an efficient and easy-to-remember way. We thought very hard about that. And we came up with some rather nice general schemes for it.
One of them has to do with entering things like powers as superscripts. In ordinary textual input, when you enter a power you type ^. So what we did for math notation is to have it be that control-^ enters an explicit superscript. And with the same idea, control-/ enters a built-up fraction.
Well, having a clean set of principles like that is crucial to making this whole kind of thing work in practice. But it does. So here's what it might look like to enter a slightly complicated expression.
But we can take pieces of this output and manipulate them.
And the point is that this expression is completely understandable to Mathematica, so you can evaluate it. And the thing that comes out is the same kind of object as the input, and you can edit it, pick it apart, use its pieces as input, and so on.
Well, to make all this work we've had to generalize ordinary computer languages and parsing somewhat. First of all, we're allowing a whole zoo of special characters as operators. But probably more significant, we're allowing two dimensional structures. So instead of just having things like prefix operators, we also have things like overfix operators, and so on.
If you look at the expression here you may complain that it doesn't quite look like traditional math notation. It's really close. And it certainly has all the various compactifying and structuring features of ordinary math notation. And the important thing is that nobody who knows ordinary math notation would be at all confused about what the expression means.
But at a cosmetic level, there are things that are different from the way they'd look in a traditional math textbook. Like the way trig functions are written, and so on.
Well, I would argue rather strongly that the Mathematica StandardForm, as we call it, is a better and clearer version of this expression. And in the book I've been writing for many years about the science project I'm doing, I use only Mathematica StandardForm to represent things.
But if one wants to be fully compatible with traditional textbooks one needs something different. And here's another important idea that was in Mathematica 3: the idea of separating so-called StandardForm from so-called TraditionalForm.
Given any expression, I can always convert it to TraditionalForm.
And the actual TraditionalForm I get always contains enough internal information that it can unambiguously be turned back into StandardForm.
But the TraditionalForm looks just like traditional math notation. With all the slightly crazy things that are in traditional math notation, like writing sin squared x, instead of sin x squared, and so on.
So what about entering TraditionalForm?
You may notice those jaws on the right-hand side of the cell. Well, those mean there's something dangerous here. But let's try editing.
We can edit just fine. Let's see what happens if we try to evaluate this.
Well, we get a warning, or a disclaimer. But let's go ahead anyway.
Well, it figured out what we want.
Actually, we have a few hundred rules that are heuristics for understanding traditional-form expressions. And they work fairly well. Sufficiently well, in fact, that one can really go through large volumes of legacy math notation--say specified in TeX--and expect to convert it automatically to unambiguously meaningful Mathematica input.
It's kind of exciting that it's possible to do this. Because if one was thinking of legacy ordinary language text, there's just no way one can expect to convert it to something understandable. But with math there is.
Of course, there are some things with math, particularly on the output side, that are a lot trickier than text. Part of the issue is that with math one expects to generate things automatically. One can't automatically generate much text that actually means very much. But with math, you do a computation, and out comes a huge expression.
So then you have to do things like figure out how to break the expression into lines elegantly, which is something we did a lot of work on in Mathematica. There are a lot of interesting issues, like the fact that if you edit an expression, its optimal line breaking can change all the time you're editing it.
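A toy sketch shows the instability (this is nothing like Mathematica's actual line-breaking algorithm): greedily pack tokens into lines of a fixed width, and notice how editing one token in the middle can move an earlier break point.

```python
def break_lines(tokens, width):
    """Greedy line breaking: start a new line when the next token won't fit."""
    lines, line = [], ""
    for tok in tokens:
        if line and len(line) + len(tok) > width:
            lines.append(line)
            line = tok
        else:
            line += tok
    if line:
        lines.append(line)
    return lines

print(break_lines(["aaa", "+", "bbb", "*", "ccc"], 7))  # ['aaa+bbb', '*ccc']
# Shorten one operand and the break point jumps to a different operator:
print(break_lines(["aaa", "+", "bb",  "*", "ccc"], 7))  # ['aaa+bb*', 'ccc']
```

Changing `bbb` to `bb` moved the break from before the `*` to after it, so a single keystroke in the middle of an expression can reflow everything around the cursor.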
And that means there are nasty problems like that you can be typing more characters, but suddenly your cursor jumps backwards. Well, that particular problem I think we solved in a particularly neat way. Let's do an example.
Did you see that? A funny blob appeared just for a moment when the cursor had to move backwards. If you were typing, you probably wouldn't notice that your cursor had jumped backwards--but the blob makes your eyes automatically move to the right place, without your noticing it. Physiologically, I think it works by using nerve impulses that end up not in the ordinary visual cortex, but directly in the brain stem where eye motion is controlled--so it makes you subconsciously move your eyes to the right place.
So we've managed to find a way to interpret standard mathematical notation. Does that mean we should turn everything Mathematica can do into math-like notation? Should we have special characters for all the various operations in Mathematica? We could certainly make very compact notation that way. But would it be sensible? Would it be readable?
The answer is basically no.
And I think there's a fundamental principle here: one wants just so much notation, and no more.
One could have no special notation. Then one has Mathematica FullForm. But that gets pretty tiresome to read. And that's probably why a language like LISP seems so difficult--because its syntax is basically Mathematica FullForm.
The other possibility is that everything could have a special notation. Well, then one has something like APL--or parts of mathematical logic. Here's an example of that.
It's fairly hard to read.
Here's another example from Turing's original paper showing the notation he made up for his original universal Turing machine, another not very satisfactory notation.
It's pretty unreadable too.
The question is what's right between the extremes of LISP and APL. I think it's very much the same kind of issue that comes up with things like short command names.
Think about Unix. In early versions of Unix it seemed really nice that there were just a few quick-to-type commands. But then the system started getting bigger. And after a while there were zillions of few-letter commands. And most mere mortals couldn't remember them. And the whole thing started looking completely incomprehensible.
Well, it's the same kind of thing with mathematical notation, or any other kind of notation, for that matter. People can handle a modest number of special forms and special characters. Maybe a few tens of them. Kind of an alphabet's worth. But not more. And if you try to give them more, particularly all at once, they just become confused and put off.
Well, one has to qualify that a bit. There are, for example, lots of relational operators.
But most of these are made up conceptually from a few elements, so there isn't really a problem with them.
And, of course, it is in principle possible for people to learn lots and lots of different characters. After all, languages like Chinese and Japanese have thousands of ideograms. But it takes people many extra years of school to learn to read those languages, compared to ones that just use alphabets.
Talking of characters, by the way, I think it's considerably easier for people to handle extra ones that appear in variables than in operators. And it's kind of interesting to see what's happened historically with those.
One thing that's very curious is that, almost without exception, only Latin and Greek characters are ever used. Well, Cantor introduced a Hebrew aleph for his infinite cardinal numbers. And some people say that a partial derivative is a Russian d, though I think historically it really isn't. But there are no other characters that have really gotten imported from other languages.
By the way, you all know that in standard English, "e" is the most common letter, followed by "t," and so on. Well, I was curious what that distribution was like for letters in math. So I had a look at MathWorld--a large website of mathematical information with about 10,000 entries--and looked at the distribution of different letters there.
You can see that "e" is the most common. Actually, very strangely, "a" is the second most common; that's very unusual. The Greek letters show a similar skew: a handful of lowercase ones are by far the most common, and likewise a few of the uppercase ones.
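The tally itself is simple to sketch (the MathWorld data isn't reproduced here; run this over any corpus you like):

```python
from collections import Counter

def letter_frequencies(text):
    """Count alphabetic characters, case-folded, most common first."""
    letters = [c for c in text.lower() if c.isalpha()]
    return Counter(letters).most_common()

# A toy corpus, just to show the shape of the output:
print(letter_frequencies("let e be the base of the natural logarithm"))
```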
OK. I've talked a bit about notation that is somehow possible to use in math. But what notation is good to use?
Most people who actually use math notation have some feeling for that. But there isn't an analog of something like Fowler's modern English usage for math. There's a little book called Mathematics into Type put out by the AMS, but it's mostly about things like how putting scripts on top of each other requires cutting pieces of paper or film.
The result of this is that there aren't well-codified principles for math notation, analogous to things like the rules about split infinitives in English.
If you use Mathematica StandardForm, you don't need these much, because anything you type will be unambiguously understandable. But for TraditionalForm, it would be good to have some principles--like: don't write expressions whose grouping is ambiguous, because it's not clear what they mean.