Some might say that Mathematica and A New Kind of Science are ambitious projects.
But in recent years I’ve been hard at work on a still more ambitious project—called Wolfram|Alpha.
And I’m excited to say that in just two months it’s going live.
Mathematica has been a great success in very broadly handling all kinds of formal technical systems and knowledge.
But what about everything else? What about all other systematic knowledge? All the methods and models, and data, that exist?
Fifty years ago, when computers were young, people assumed that they’d quickly be able to handle all these kinds of things.
And that one would be able to ask a computer any factual question, and have it compute the answer.
But it didn’t work out that way. Computers have been able to do many remarkable and unexpected things. But not that.
I’d always thought, though, that eventually it should be possible. And a few years ago, I realized that I was finally in a position to try to do it.
I had two crucial ingredients: Mathematica and NKS. With Mathematica, I had a symbolic language to represent anything—as well as the algorithmic power to do any kind of computation. And with NKS, I had a paradigm for understanding how all sorts of complexity could arise from simple rules.
But what about all the actual knowledge that we as humans have accumulated?
A lot of it is now on the web—in billions of pages of text. And with search engines, we can very efficiently search for specific terms and phrases in that text.
But we can’t compute from that. And in effect, we can only answer questions that have been literally asked before. We can look things up, but we can’t figure anything new out.
So how can we deal with that? Well, some people have thought the way forward must be to somehow automatically understand the natural language that exists on the web. Perhaps getting the web semantically tagged to make that easier.
But armed with Mathematica and NKS I realized there’s another way: explicitly implement methods and models, as algorithms, and explicitly curate all data so that it is immediately computable.
It’s not easy to do this. Every different kind of method and model—and data—has its own special features and character. But with a mixture of Mathematica and NKS automation, and a lot of human experts, I’m happy to say that we’ve gotten a very long way.
But, OK. Let’s say we succeed in creating a system that knows a lot, and can figure a lot out. How can we interact with it?
The way humans normally communicate is through natural language. And when one’s dealing with the whole spectrum of knowledge, I think that’s the only realistic option for communicating with computers too.
Of course, getting computers to deal with natural language has turned out to be incredibly difficult. And for example we’re still very far away from having computers systematically understand large volumes of natural language text on the web.
But if one’s already made knowledge computable, one doesn’t need to do that kind of natural language understanding.
All one needs to be able to do is to take questions people ask in natural language, and represent them in a precise form that fits into the computations one can do.
Of course, even that has never been done in any generality. And it’s made more difficult by the fact that one doesn’t just want to handle a language like English: one also wants to be able to handle all the shorthand notations that people in every possible field use.
I wasn’t at all sure it was going to work. But I’m happy to say that with a mixture of many clever algorithms and heuristics, lots of linguistic discovery and linguistic curation, and what probably amount to some serious theoretical breakthroughs, we’re actually managing to make it work.
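To make the idea concrete, here is a toy sketch in Python of the two pieces described above: curated data with explicitly implemented methods, and a step that maps a loosely phrased question onto a precise form those methods can compute from. It is only an illustration under stated assumptions, not how Wolfram|Alpha itself works. The place names, their values, and the names `CURATED`, `METHODS`, `Query`, `parse`, and `answer` are all invented for the example.

```python
import re
from dataclasses import dataclass

# Hypothetical curated data: invented places with invented values.
CURATED = {
    "alphaville": {"population": 120_000, "area_km2": 60.0},
    "betatown": {"population": 45_000, "area_km2": 90.0},
}

# Explicitly implemented methods, keyed by the quantity they compute.
# "population density" is not stored anywhere; it is computed on demand.
METHODS = {
    "population": lambda d: d["population"],
    "area": lambda d: d["area_km2"],
    "population density": lambda d: d["population"] / d["area_km2"],
}


@dataclass(frozen=True)
class Query:
    """Precise form of a question: which quantity, for which entity."""
    quantity: str
    entity: str


def parse(question: str) -> Query:
    """Map a loosely phrased question onto a Query, or raise ValueError."""
    # Normalize: lowercase, drop punctuation, collapse whitespace.
    text = re.sub(r"[^a-z0-9 ]", " ", question.lower())
    text = " ".join(text.split())
    # Try longer quantity names first so "population density"
    # is preferred over plain "population".
    quantities = sorted(METHODS, key=len, reverse=True)
    quantity = next((q for q in quantities if q in text), None)
    entity = next((e for e in CURATED if e in text), None)
    if quantity is None or entity is None:
        raise ValueError(f"cannot interpret: {question!r}")
    return Query(quantity, entity)


def answer(question: str) -> float:
    """Interpret the question, then compute the result from curated data."""
    query = parse(question)
    return METHODS[query.quantity](CURATED[query.entity])
```

Both "What is the population density of Alphaville?" and the shorthand "alphaville population density" reduce to the same `Query`, and the answer is computed from the stored population and area rather than looked up. Everything hard about the real problem, from the range of phrasings to the breadth of data and methods, is of course left out of a sketch this small.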
Pulling all of this together to create a true computational knowledge engine is a very difficult task.
It’s certainly the most complex project I’ve ever undertaken. Involving far more kinds of expertise—and more moving parts—than I’ve ever had to assemble before.
And—like Mathematica, or NKS—the project will never be finished.
But I’m happy to say that we’ve almost reached the point where we feel we can expose the first part of it.
It’s going to be a website: www.wolframalpha.com. With one simple input field that gives access to a huge system, with trillions of pieces of curated data and millions of lines of algorithms.
We’re all working very hard right now to get Wolfram|Alpha ready to go live.
I think it’s going to be pretty exciting. A new paradigm for using computers and the web.
That almost gets us to what people thought computers would be able to do fifty years ago!