ChatGPT Gets Its “Wolfram Superpowers”!

See also:
“What Is ChatGPT Doing … and Why Does It Work?” »

This is part of an ongoing series about our LLM-related technology:

ChatGPT Gets Its “Wolfram Superpowers”!
Instant Plugins for ChatGPT: Introducing the Wolfram ChatGPT Plugin Kit
The New World of LLM Functions: Integrating LLM Technology into the Wolfram Language
Prompts for Work & Play: Launching the Wolfram Prompt Repository
Introducing Chat Notebooks: Integrating LLMs into the Notebook Paradigm


To enable the functionality described here, select and install the Wolfram plugin from within ChatGPT.

Note that this capability is so far available only to some ChatGPT Plus users; for more information, see OpenAI’s announcement.

In Just Two and a Half Months…

Early in January I wrote about the possibility of connecting ChatGPT to Wolfram|Alpha. And today—just two and a half months later—I’m excited to announce that it’s happened! Thanks to some heroic software engineering by our team and by OpenAI, ChatGPT can now call on Wolfram|Alpha—and Wolfram Language as well—to give it what we might think of as “computational superpowers”. It’s still very early days for all of this, but it’s already very impressive—and one can begin to see how amazingly powerful (and perhaps even revolutionary) what we can call “ChatGPT + Wolfram” can be.

Back in January, I made the point that, as an LLM neural net, ChatGPT—for all its remarkable prowess in textually generating material “like” what it’s read from the web, etc.—can’t itself be expected to do actual nontrivial computations, or to systematically produce correct (rather than just “looks roughly right”) data, etc. But when it’s connected to the Wolfram plugin it can do these things. So here’s my (very simple) first example from January, but now done by ChatGPT with “Wolfram superpowers” installed:

How far is it from Tokyo to Chicago?

It’s a correct result (which in January it wasn’t)—found by actual computation. And here’s a bonus: immediate visualization:

Show the path

How did this work? Under the hood, ChatGPT is formulating a query for Wolfram|Alpha—then sending it to Wolfram|Alpha for computation, and then “deciding what to say” based on reading the results it got back. You can see this back and forth by clicking the “Used Wolfram” box (and by looking at this you can check that ChatGPT didn’t “make anything up”):

Used Wolfram

There are lots of nontrivial things going on here, on both the ChatGPT and Wolfram|Alpha sides. But the upshot is a good, correct result, knitted into a nice, flowing piece of text.
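
For the record, here’s a hedged sketch of the kind of Wolfram Language computation that a query like this ultimately turns into (the exact query ChatGPT sends, and the particular entity specifications, may differ):

    (* Great-circle distance between the two cities, from curated geo data *)
    tokyo = Entity["City", {"Tokyo", "Tokyo", "Japan"}];
    chicago = Entity["City", {"Chicago", "Illinois", "UnitedStates"}];
    GeoDistance[tokyo, chicago]

    (* And the "bonus" visualization: the geodesic path between them on a map *)
    GeoGraphics[{Red, Thick, GeoPath[{tokyo, chicago}, "Geodesic"]}]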

Let’s try another example, also from what I wrote in January:

What is the integral?

A fine result, worthy of our technology. And again, we can get a bonus:

Plot that
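
The screenshots aren’t reproduced here, but the pattern is the standard one: a symbolic integral done with Integrate, then a plot of the result. Purely as an illustration (the particular integrand below is my assumption, not necessarily the one in the example):

    (* A symbolic indefinite integral... *)
    result = Integrate[x^2 Cos[2 x], x]

    (* ...and a plot of the resulting expression over a sample range *)
    Plot[result, {x, 0, 10}]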

In January, I noted that ChatGPT ended up just “making up” plausible (but wrong) data when given this prompt:

Tell me about livestock populations

But now it calls the Wolfram plugin and gets a good, authoritative answer. And, as a bonus, we can also make a visualization:

Make a bar chart

Another example from back in January that now comes out correctly is:

What planetary moons are larger than Mercury?
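
In Wolfram Language terms, a computation like this can be expressed directly against curated astronomical data. A minimal sketch (comparing radii; the plugin’s actual formulation may differ):

    (* Planetary moons whose radius exceeds that of the planet Mercury *)
    mercuryRadius = Entity["Planet", "Mercury"]["Radius"];
    Select[EntityList[EntityClass["PlanetaryMoon", All]],
      #["Radius"] > mercuryRadius &]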

If you actually try these examples, don’t be surprised if they work differently (sometimes better, sometimes worse) from what I’m showing here. Since ChatGPT uses randomness in generating its responses, different things can happen even when you ask it the exact same question (even in a fresh session). It feels “very human”. But different from the solid “right-answer-and-it-doesn’t-change-if-you-ask-it-again” experience that one gets in Wolfram|Alpha and Wolfram Language.

Here’s an example where we saw ChatGPT (rather impressively) “having a conversation” with the Wolfram plugin, after at first finding out that it got the “wrong Mercury”:

How big is Mercury?

One particularly significant thing here is that ChatGPT isn’t just using us to do a “dead-end” operation like showing the content of a webpage. Rather, we’re acting much more like a true “brain implant” for ChatGPT—where it asks us things whenever it needs to, and we give responses that it can weave back into whatever it’s doing. It’s rather impressive to see in action. And—although there’s definitely much more polishing to be done—what’s already there goes a long way towards (among other things) giving ChatGPT the ability to deliver accurate, curated knowledge and data—as well as correct, nontrivial computations.

But there’s more too. We already saw examples where we were able to provide custom-created visualizations to ChatGPT. And with our computation capabilities we’re routinely able to make “truly original” content—computations that have simply never been done before. And there’s something else: while “pure ChatGPT” is restricted to things it “learned during its training”, by calling us it can get up-to-the-moment data.

This can be based on our real-time data feeds (here we’re getting called twice; once for each place):

Compare current temperature in Timbuktu and New York

Or it can be based on “science-style” predictive computations:

How far is it to Jupiter right now?

What is the configuration of the moons of Jupiter now?

Or both:

Where is the ISS in the sky from NYC?
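
Hedged sketches of the underlying kinds of Wolfram Language computations (the entity specifications and property names here are my guesses at the relevant ones):

    (* Real-time data feed: current air temperature at two locations *)
    AirTemperatureData /@ {
      Entity["City", {"Timbuktu", "Tombouctou", "Mali"}],
      Entity["City", {"NewYork", "NewYork", "UnitedStates"}]}

    (* "Science-style" predictive computation: the distance to Jupiter right now *)
    Entity["Planet", "Jupiter"]["DistanceFromEarth"]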

Some of the Things You Can Do

There’s a lot that Wolfram|Alpha and Wolfram Language cover:

Wolfram|Alpha and Wolfram Language content areas

And now (almost) all of this is accessible to ChatGPT—opening up a tremendous breadth and depth of new possibilities. And to give some sense of these, here are a few (simple) examples:

Algorithms · Audio · Currency conversion · Function plotting · Genealogy · Geo data · Mathematical functions · Music · Pokémon


A Modern Human + AI Workflow

ChatGPT is built to be able to have back-and-forth conversation with humans. But what can one do when that conversation has actual computation and computational knowledge in it? Here’s an example. Start by asking a “world knowledge” question:

Beef production query

And, yes, by “opening the box” one can check that the right question was asked to us, and what the raw response we gave was. But now we can go on and ask for a map:

Make a map

But there are “prettier” map projections we could have used. And with ChatGPT’s “general knowledge” based on its reading of the web, etc. we can just ask it to use one:

Use a prettier map projection

But maybe we want a heat map instead. Again, we can just ask it to produce this—underneath using our technology:

Show as a heat map

Let’s change the projection again, now asking it again to pick it using its “general knowledge”:

Use UN logo map projection

And, yes, it got the projection “right”. But not the centering. So let’s ask it to fix that:

Center map projection on North Pole

OK, so what do we have here? We’ve got something that we “collaborated” to build. We incrementally said what we wanted; the AI (i.e. ChatGPT + Wolfram) progressively built it. But what did we actually get? Well, it’s a piece of Wolfram Language code—which we could see by “opening the box”, or just asking ChatGPT for:

Show the code used

If we copy the code out into a Wolfram Notebook, we can immediately run it, and we find it has a nice “luxury feature”—as ChatGPT claimed in its description, there are dynamic tooltips giving the name of each country:

(And, yes, it’s a slight pity that this code just has explicit numbers in it, rather than the original symbolic query about beef production. And this happened because ChatGPT asked the original question to Wolfram|Alpha, then fed the results to Wolfram Language. But I consider the fact that this whole sequence works at all extremely impressive.)
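
The generated code isn’t reproduced here, but it’s of roughly this shape: a GeoRegionValuePlot over explicit country-to-value pairs, with the projection and styling layered on by the successive requests. A hedged sketch (the numbers are placeholders, not the actual beef-production data, and the option details are as I would write them, not necessarily as ChatGPT did):

    (* Placeholder values standing in for the beef-production figures returned by Wolfram|Alpha *)
    data = <|
      Entity["Country", "UnitedStates"] -> 12.6,
      Entity["Country", "Brazil"] -> 10.4,
      Entity["Country", "China"] -> 6.7|>;

    (* Choropleth map, azimuthal equidistant projection centered on the North Pole *)
    GeoRegionValuePlot[data,
      GeoProjection -> {"AzimuthalEquidistant", "Centering" -> GeoPosition[{90, 0}]}]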

How It Works—and Wrangling the AI

What’s happening “under the hood” with ChatGPT and the Wolfram plugin? Remember that the core of ChatGPT is a “large language model” (LLM) that’s trained from the web, etc. to generate a “reasonable continuation” from any text it’s given. But as a final part of its training ChatGPT is also taught how to “hold conversations”, and when to “ask something to someone else”—where that “someone” might be a human, or, for that matter, a plugin. And in particular, it’s been taught when to reach out to the Wolfram plugin.

The Wolfram plugin actually has two entry points: a Wolfram|Alpha one and a Wolfram Language one. The Wolfram|Alpha one is in a sense the “easier” one for ChatGPT to deal with; the Wolfram Language one is ultimately the more powerful. The reason the Wolfram|Alpha one is easier is that what it takes as input is just natural language—which is exactly what ChatGPT routinely deals with. And, more than that, Wolfram|Alpha is built to be forgiving—and in effect to deal with “typical human-like input”, more or less however messy that may be.

Wolfram Language, on the other hand, is set up to be precise and well defined—and capable of being used to build arbitrarily sophisticated towers of computation. Inside Wolfram|Alpha, what it’s doing is translating natural language into precise Wolfram Language. In effect it’s catching the “imprecise natural language” and “funneling it” into precise Wolfram Language.

When ChatGPT calls the Wolfram plugin it often just feeds natural language to Wolfram|Alpha. But ChatGPT has by this point learned a certain amount about writing Wolfram Language itself. And in the end, as we’ll discuss later, that’s a more flexible and powerful way to communicate. But it doesn’t work unless the Wolfram Language code is exactly right. To get it to that point is partly a matter of training. But there’s another thing too: given some candidate code, the Wolfram plugin can run it, and if the results are obviously wrong (like they generate lots of errors), ChatGPT can attempt to fix it, and try running it again. (More elaborately, ChatGPT can try to generate tests to run, and change the code if they fail.)
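
As a rough picture of that “run it, check it, try again” loop, here’s a minimal sketch in Wolfram Language (a hypothetical helper, not the actual plugin implementation): evaluate the candidate code, and report whether evaluation generated error messages, so the LLM can decide to revise and resubmit:

    (* Hypothetical helper: evaluate a string of candidate Wolfram Language code,
       flagging it as a failure if evaluation generates messages *)
    evaluateCandidate[code_String] := Module[{result},
      result = Quiet @ Check[ToExpression[code], $Failed];
      If[result === $Failed,
        <|"status" -> "error"|>,                     (* signal: revise the code and retry *)
        <|"status" -> "ok", "result" -> result|>]]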

There’s more to be developed here, but already one sometimes sees ChatGPT go back and forth multiple times. It might be rewriting its Wolfram|Alpha query (say simplifying it by taking out irrelevant parts), or it might be deciding to switch between Wolfram|Alpha and Wolfram Language, or it might be rewriting its Wolfram Language code. Telling it how to do these things is a matter for the initial “plugin prompt”.

And writing this prompt is a strange activity—perhaps our first serious experience of trying to “communicate with an alien intelligence”. Of course it helps that the “alien intelligence” has been trained with a vast corpus of human-written text. So, for example, it knows English (a bit like all those corny science fiction aliens…). And we can tell it things like “If the user input is in a language other than English, translate to English and send an appropriate query to Wolfram|Alpha, then provide your response in the language of the original input.”

Sometimes we’ve found we have to be quite insistent (note the all caps): “When writing Wolfram Language code, NEVER use snake case for variable names; ALWAYS use camel case for variable names.” And even with that insistence, ChatGPT will still sometimes do the wrong thing. The whole process of “prompt engineering” feels a bit like animal wrangling: you’re trying to get ChatGPT to do what you want, but it’s hard to know just what it will take to achieve that.

Eventually this will presumably be handled in training or in the prompt, but as of right now, ChatGPT sometimes doesn’t know when the Wolfram plugin can help. For example, ChatGPT guesses that this is supposed to be a DNA sequence, but (at least in this session) doesn’t immediately think the Wolfram plugin can do anything with it:

DNA strand input

Say “Use Wolfram”, though, and it’ll send it to the Wolfram plugin, which indeed handles it nicely:

Use Wolfram

(You may sometimes also want to say specifically “Use Wolfram|Alpha” or “Use Wolfram Language”. And particularly in the Wolfram Language case, you may want to look at the actual code it sent, and tell it things like not to use functions whose names it came up with, but which don’t actually exist.)

When the Wolfram plugin is given Wolfram Language code, what it does is basically just to evaluate that code, and return the result—perhaps as a graphic or math formula, or just text. But when it’s given Wolfram|Alpha input, this is sent to a special Wolfram|Alpha “for LLMs” API endpoint, and the result comes back as text intended to be “read” by ChatGPT, and effectively used as an additional prompt for further text ChatGPT is writing. Take a look at this example:

Ocean depth query

The result is a nice piece of text containing the answer to the question asked, along with some other information ChatGPT decided to include. But “inside” we can see what the Wolfram plugin (and the Wolfram|Alpha “LLM endpoint”) actually did:

Ocean depth code

There’s quite a bit of additional information there (including some nice pictures!). But ChatGPT “decided” just to pick out a few pieces to include in its response.
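
You can get a feel for this kind of behavior from within Wolfram Language itself, where the WolframAlpha function sends natural language to Wolfram|Alpha and can return either formatted pods or plaintext results (this is the ordinary API, not the special “for LLMs” endpoint, and the query string is just an example):

    (* Full formatted pods, much as on the Wolfram|Alpha website *)
    WolframAlpha["average depth of the Pacific Ocean"]

    (* Just a plaintext result, closer in spirit to what gets handed back to ChatGPT *)
    WolframAlpha["average depth of the Pacific Ocean", "Result"]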

By the way, something to emphasize is that if you want to be sure you’re getting what you think you’re getting, always check what ChatGPT actually sent to the Wolfram plugin—and what the plugin returned. One of the important things we’re adding with the Wolfram plugin is a way to “factify” ChatGPT output—and to know when ChatGPT is “using its imagination”, and when it’s delivering solid facts.

Sometimes in trying to understand what’s going on it’ll also be useful just to take what the Wolfram plugin was sent, and enter it as direct input on the Wolfram|Alpha website, or in a Wolfram Language system (such as the Wolfram Cloud).

Wolfram Language as the Language for Human-AI Collaboration

One of the great (and, frankly, unexpected) things about ChatGPT is its ability to start from a rough description, and generate from it a polished, finished output—such as an essay, letter, legal document, etc. In the past, one might have tried to achieve this “by hand” by starting with “boilerplate” pieces, then modifying them, “gluing” them together, etc. But ChatGPT has all but made this process obsolete. In effect, it’s “absorbed” a huge range of boilerplate from what it’s “read” on the web, etc.—and now it typically does a good job at seamlessly “adapting it” to what you need.

So what about code? In traditional programming languages writing code tends to involve a lot of “boilerplate work”—and in practice many programmers in such languages spend lots of their time building up their programs by copying big slabs of code from the web. But now, suddenly, it seems as if ChatGPT can make much of this obsolete. Because it can effectively put together essentially any kind of boilerplate code automatically—with only a little “human input”.

Of course, there has to be some human input—because otherwise ChatGPT wouldn’t know what program it was supposed to write. But—one might wonder—why does there have to be “boilerplate” in code at all? Shouldn’t one be able to have a language where—just at the level of the language itself—all that’s needed is a small amount of human input, without any of the “boilerplate dressing”?

Well, here’s the issue. Traditional programming languages are centered around telling a computer what to do in the computer’s terms: set this variable, test that condition, etc. But it doesn’t have to be that way. And instead one can start from the other end: take things people naturally think in terms of, then try to represent these computationally—and effectively automate the process of getting them actually implemented on a computer.

Well, this is what I’ve now spent more than four decades working on. And it’s the foundation of what’s now Wolfram Language—which I now feel justified in calling a “full-scale computational language”. What does this mean? It means that right in the language there’s a computational representation for both abstract and real things that we talk about in the world, whether those are graphs or images or differential equations—or cities or chemicals or companies or movies.

Why not just start with natural language? Well, that works up to a point—as the success of Wolfram|Alpha demonstrates. But once one’s trying to specify something more elaborate, natural language becomes (like “legalese”) at best unwieldy—and one really needs a more structured way to express oneself.

There’s a big example of this historically, in mathematics. Until about 500 years ago, pretty much the only way to “express math” was in natural language. But then mathematical notation was invented, and math took off—with the development of algebra, calculus, and eventually all the various mathematical sciences.

My big goal with the Wolfram Language is to create a computational language that can do the same kind of thing for anything that can be “expressed computationally”. And to achieve this we’ve needed to build a language that both automatically does a lot of things, and intrinsically knows a lot of things. But the result is a language that’s set up so that people can conveniently “express themselves computationally”, much as traditional mathematical notation lets them “express themselves mathematically”. And a critical point is that—unlike traditional programming languages—Wolfram Language is intended not just for computers, but also for humans, to read. In other words, it’s intended as a structured way of “communicating computational ideas”, not just to computers, but also to humans.

But now—with ChatGPT—this suddenly becomes even more important than ever before. Because—as we began to see above—ChatGPT can work with Wolfram Language, in a sense building up computational ideas just using natural language. And part of what’s then critical is that Wolfram Language can directly represent the kinds of things we want to talk about. But what’s also critical is that it gives us a way to “know what we have”—because we can realistically and economically read Wolfram Language code that ChatGPT has generated.

The whole thing is beginning to work very nicely with the Wolfram plugin in ChatGPT. Here’s a simple example, where ChatGPT can readily generate a Wolfram Language version of what it’s being asked:

Make a plot of Roman numerals

Join the points

Show the code

And the critical point is that the “code” is something one can realistically expect to read (if I were writing it, I would use the slightly more compact RomanNumeral function):
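
The generated code isn’t reproduced here, but here’s a hedged sketch of code of this general shape, assuming the plot is of the string lengths of successive Roman numerals (that assumption, and the range, are mine):

    (* Lengths of the Roman numerals for 1 through 100, plotted with the points joined *)
    ListLinePlot[
      Table[StringLength[RomanNumeral[n]], {n, 100}],
      Mesh -> All, AxesLabel -> {"n", "characters"}]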

Here’s another example:

Make a histogram
Show the code

I might have written the code a little differently, but this is again something very readable:

It’s often possible to use a pidgin of Wolfram Language and English to say what you want:

Create table
Make ArrayPlot
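
For instance, a pidgin request like “make a table of Mod[i j, 10] for i and j up to 50, then show it as an ArrayPlot” (my example, not the one in the screenshots) would come out as something like:

    table = Table[Mod[i j, 10], {i, 50}, {j, 50}];
    ArrayPlot[table]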

Here’s an example where ChatGPT is again successfully constructing Wolfram Language—and conveniently shows it to us so we can confirm that, yes, it’s actually computing the right thing:

Alkali metal query

And, by the way, to make this work it’s critical that the Wolfram Language is in a sense “self-contained”. This piece of code is just standard generic Wolfram Language code; it doesn’t depend on anything outside, and if you wanted to, you could look up the definitions of everything that appears in it in the Wolfram Language documentation.
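
The actual generated code isn’t shown here, but “standard generic Wolfram Language code” for a query about the alkali metals might look something like this (the particular property is my choice):

    (* Names and melting points of the alkali metals, from built-in element data *)
    EntityValue[EntityClass["Element", "AlkaliMetal"], {"Name", "MeltingPoint"}]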

OK, one more example:

European flags query

Obviously ChatGPT had trouble here. But—as it suggested—we can just run the code it generated, directly in a notebook. And because Wolfram Language is symbolic, we can explicitly see results at each step:

So close! Let’s help it a bit, telling it we need an actual list of European countries:

And there’s the result! Or at least, a result. Because when we look at this computation, it might not be quite what we want. For example, we might want to pick out multiple dominant colors per country, and see if any of them are close to purple. But the whole Wolfram Language setup here makes it easy for us to “collaborate with the AI” to figure out what we want, and what to do.
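
Here’s a hedged sketch of how one might set up that “multiple dominant colors per country” version of the computation (the entity class, the number of colors and the distance threshold are all my assumptions):

    (* European countries whose flag has a dominant color "close to" purple *)
    europeanCountries = EntityList[EntityClass["Country", "Europe"]];
    hasPurplishFlag[country_] := Module[{colors},
      colors = DominantColors[country["Flag"], 5];
      Min[ColorDistance[Purple, #] & /@ colors] < 0.3]
    Select[europeanCountries, hasPurplishFlag]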

So far we’ve basically been starting with natural language, and building up Wolfram Language code. But we can also start with pseudocode, or code in some low-level programming language. And ChatGPT tends to do a remarkably good job of taking such things and producing well-written Wolfram Language code from them. The code isn’t always exactly right. But one can always run it (e.g. with the Wolfram plugin) and see what it does, potentially (courtesy of the symbolic character of Wolfram Language) line by line. And the point is that the high-level computational language nature of the Wolfram Language tends to allow the code to be sufficiently clear and (at least locally) simple that (particularly after seeing it run) one can readily understand what it’s doing—and then potentially iterate back and forth on it with the AI.

When what one’s trying to do is sufficiently simple, it’s often realistic to specify it—at least if one does it in stages—purely with natural language, using Wolfram Language “just” as a way to see what one’s got, and to actually be able to run it. But it’s when things get more complicated that Wolfram Language really comes into its own—providing what’s basically the only viable human-understandable-yet-precise representation of what one wants.

And when I was writing my book An Elementary Introduction to the Wolfram Language this became particularly obvious. At the beginning of the book I was easily able to make up exercises where I described what was wanted in English. But as things started getting more complicated, this became more and more difficult. As a “fluent” user of Wolfram Language I usually immediately knew how to express what I wanted in Wolfram Language. But to describe it purely in English required something increasingly involved and complicated, that read like legalese.

But, OK, so you specify something using Wolfram Language. Then one of the remarkable things ChatGPT is often able to do is to recast your Wolfram Language code so that it’s easier to read. It doesn’t (yet) always get it right. But it’s interesting to see it make different tradeoffs from a human writer of Wolfram Language code. For example, humans tend to find it difficult to come up with good names for things, making it usually better (or at least less confusing) to avoid names by having sequences of nested functions. But ChatGPT, with its command of language and meaning, has a fairly easy time making up reasonable names. And although it’s something I, for one, did not expect, I think using these names, and “spreading out the action”, can often make Wolfram Language code even easier to read than it was before, and indeed read very much like a formalized analog of natural language—that we can understand as easily as natural language, but that has a precise meaning, and can actually be run to generate computational results.

Cracking Some Old Chestnuts

If you “know what computation you want to do”, and you can describe it in a short piece of natural language, then Wolfram|Alpha is set up to directly do the computation, and present the results in a way that is “visually absorbable” as easily as possible. But what if you want to describe the result in a narrative, textual essay? Wolfram|Alpha has never been set up to do that. But ChatGPT is.

Here’s a result from Wolfram|Alpha:

Altair versus Betelgeuse Wolfram|Alpha query

And here within ChatGPT we’re asking for this same Wolfram|Alpha result, but then telling ChatGPT to “make an essay out of it”:

Altair-Betelgeuse essay

Another “old chestnut” for Wolfram|Alpha is math word problems. Given a “crisply presented” math problem, Wolfram|Alpha is likely to do very well at solving it. But what about a “woolly” word problem? Well, ChatGPT is pretty good at “unraveling” such things, and turning them into “crisp math questions”—which then the Wolfram plugin can now solve. Here’s an example:

Math word problem

Here’s a slightly more complicated case, including a nice use of “common sense” to recognize that the number of turkeys cannot be negative:

Math word problem
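
The problems in the screenshots aren’t reproduced here, but the “crisp math question” that comes out of a word problem of this general type is something the plugin can solve directly, with the “common sense” non-negativity encoded as constraints. An illustrative example of my own (30 animals, 74 legs, chickens and cows):

    (* "A farmer has chickens and cows: 30 animals and 74 legs in all. How many of each?" *)
    Solve[{c + w == 30, 2 c + 4 w == 74, c >= 0, w >= 0}, {c, w}, Integers]
    (* {{c -> 23, w -> 7}} *)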

Beyond math word problems, another “old chestnut” now addressed by ChatGPT + Wolfram is what physicists tend to call “Fermi problems”: order-of-magnitude estimates that can be made on the basis of quantitative knowledge about the world. Here’s an example:

Order-of-magnitude query
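
The specific example isn’t reproduced here, but the flavor of the underlying arithmetic is easy to show in Wolfram Language with units. For instance, a rough “heartbeats in a human lifetime” estimate (all the numbers are rough assumptions of mine):

    (* Order-of-magnitude estimate: heartbeats in a human lifetime *)
    beatsPerMinute = 70;
    minutesInLifetime = QuantityMagnitude@UnitConvert[Quantity[70, "Years"], "Minutes"];
    beatsPerMinute*minutesInLifetime   (* roughly 2.6 billion *)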

How to Get Involved

ChatGPT + Wolfram is something very new—really a completely new kind of technology. And as happens whenever a new kind of technology arrives, it’s opening up tremendous new opportunities. Some of these we can already begin to see—but lots of others will emerge over the weeks, months and years to come.

So how can you get involved in what promises to be an exciting period of rapid technological—and conceptual—growth? The first thing is just to explore ChatGPT + Wolfram. ChatGPT and Wolfram are each vast systems in their own right; the combination of them is something that it’ll take years to fully plumb. But the first step is just to get a sense of what’s possible.

Find examples. Share them. Try to identify successful patterns of usage. And, most of all, try to find workflows that deliver the highest value. Those workflows could be quite elaborate. But they could also be quite simple—cases where once one sees what can be done, there’s an immediate “aha”.

How can you best implement a workflow? Well, we’re trying to work out the best workflows for that. Within Wolfram Language we’re setting up flexible ways to call on things like ChatGPT, both purely programmatically, and in the context of the notebook interface.

But what about from the ChatGPT side? Wolfram Language has a very open architecture, where a user can add or modify pretty much whatever they want. But how can you use this from ChatGPT? One thing is just to tell ChatGPT to include some specific piece of “initial” Wolfram Language code (maybe together with documentation)—then use something like the pidgin above to talk to ChatGPT about the functions or other things you’ve defined in that initial code.

We’re planning to build increasingly streamlined tools for handling and sharing Wolfram Language code for use through ChatGPT. But one approach that already works is to submit functions for publication in the Wolfram Function Repository, then—once they’re published—refer to these functions in your conversation with ChatGPT.
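
The general pattern is: publish a function, then refer to it by name. From the Wolfram Language side, a published Function Repository function is called like this (the function name here is purely hypothetical):

    (* Calling a published Wolfram Function Repository function by name;
       in a ChatGPT conversation you would refer to it by the same name
       ("MyCustomAnalysis" is a hypothetical example, not a real published function) *)
    ResourceFunction["MyCustomAnalysis"][{1, 2, 3}]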

OK, but what about within ChatGPT itself? What kind of prompt engineering should you do to best interact with the Wolfram plugin? Well, we don’t know yet. It’s something that has to be explored—in effect as an exercise in AI education or AI psychology. A typical approach is to give some “pre-prompts” earlier in your ChatGPT session, then hope it’s “still paying attention” to those later on. (And, yes, it has a limited “attention span”, so sometimes things have to get repeated.)

We’ve tried to give an overall prompt to tell ChatGPT basically how to use the Wolfram plugin—and we fully expect this prompt to evolve rapidly, as we learn more, and as the ChatGPT LLM is updated. But you can add your own general pre-prompts, saying things like “When using Wolfram always try to include a picture” or “Use SI units” or “Avoid using complex numbers if possible”.

You can also try setting up a pre-prompt that essentially “defines a function” right in ChatGPT—something like: “If I give you an input consisting of a number, you are to use Wolfram to draw a polygon with that number of sides”. Or, more directly, “If I give you an input consisting of numbers you are to apply the following Wolfram function to that input …”, then give some explicit Wolfram Language code.
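
The “explicit Wolfram Language code” in such a pre-prompt could be as simple as this (my example):

    (* A function to hand to ChatGPT in a pre-prompt: draw a regular polygon with n sides *)
    polygonFromNumber[n_Integer] := Graphics[RegularPolygon[n]]

    polygonFromNumber[7]   (* a regular heptagon *)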

But these are very early days, and no doubt there’ll be other powerful mechanisms discovered for “programming” ChatGPT + Wolfram. And I think we can confidently expect that the next little while will be an exciting time of high growth, where there’s lots of valuable “low-hanging fruit” to be picked by those who choose to get involved.

Some Background & Outlook

Even a week ago it wasn’t clear what ChatGPT + Wolfram was going to be like—or how well it was going to work. But these things that are now moving so quickly are built on decades of earlier development. And in some ways the arrival of ChatGPT + Wolfram finally marries the two main approaches historically taken to AI—that have long been viewed as disjoint and incompatible.

ChatGPT is basically a very large neural network, trained to follow the “statistical” patterns of text it’s seen on the web, etc. The concept of neural networks—in a form surprisingly close to what’s used in ChatGPT—originated all the way back in the 1940s. But after some enthusiasm in the 1950s, interest waned. There was a resurgence in the early 1980s (and indeed I myself first looked at neural nets then). But it wasn’t until 2012 that serious excitement began to build about what might be possible with neural nets. And now a decade later—in a development whose success came as a big surprise even to those involved—we have ChatGPT.

Rather separate from the “statistical” tradition of neural nets is the “symbolic” tradition for AI. And in a sense that tradition arose as an extension of the process of formalization developed for mathematics (and mathematical logic), particularly near the beginning of the twentieth century. But what was critical about it was that it aligned well not only with abstract concepts of computation, but also with actual digital computers of the kind that started to appear in the 1950s.

The successes in what could really be considered “AI” were for a long time at best spotty. But all the while, the general concept of computation was showing tremendous and growing success. But how might “computation” be related to ways people think about things? For me, a crucial development was my idea at the beginning of the 1980s (building on earlier formalism from mathematical logic) that transformation rules for symbolic expressions might be a good way to represent computations at what amounts to a “human” level.

At the time my main focus was on mathematical and technical computation, but I soon began to wonder whether similar ideas might be applicable to “general AI”. I suspected something like neural nets might have a role to play, but at the time I only figured out a bit about what would be needed—and not how to achieve it. Meanwhile, the core idea of transformation rules for symbolic expressions became the foundation for what’s now the Wolfram Language—and made possible the decades-long process of developing the full-scale computational language that we have today.

Starting in the 1960s there’d been efforts among AI researchers to develop systems that could “understand natural language”, and “represent knowledge” and answer questions from it. Some of what was done turned into less ambitious but practical applications. But generally success was elusive. Meanwhile, as a result of what amounted to a philosophical conclusion of basic science I’d done in the 1990s, I decided around 2005 to make an attempt to build a general “computational knowledge engine” that could broadly answer factual and computational questions posed in natural language. It wasn’t obvious that such a system could be built, but we discovered that—with our underlying computational language, and with a lot of work—it could. And in 2009 we were able to release Wolfram|Alpha.

And in a sense what made Wolfram|Alpha possible was that internally it had a clear, formal way to represent things in the world, and to compute about them. For us, “understanding natural language” wasn’t something abstract; it was the concrete process of translating natural language to structured computational language.

Another part was assembling all the data, methods, models and algorithms needed to “know about” and “compute about” the world. And while we’ve greatly automated this, we’ve still always found that to ultimately “get things right” there’s no choice but to have actual human experts involved. And while there’s a little of what one might think of as “statistical AI” in the natural language understanding system of Wolfram|Alpha, the vast majority of Wolfram|Alpha—and Wolfram Language—operates in a hard, symbolic way that’s at least reminiscent of the tradition of symbolic AI. (That’s not to say that individual functions in Wolfram Language don’t use machine learning and statistical techniques; in recent years more and more do, and the Wolfram Language also has a whole built-in framework for doing machine learning.)

As I’ve discussed elsewhere, what seems to have emerged is that “statistical AI”, and particularly neural nets, are well suited for tasks that we humans “do quickly”, including—as we learn from ChatGPT—natural language and the “thinking” that underlies it. But the symbolic and in a sense “more rigidly computational” approach is what’s needed when one’s building larger “conceptual” or computational “towers”—which is what happens in math, exact science, and now all the “computational X” fields.

And now ChatGPT + Wolfram can be thought of as the first truly large-scale statistical + symbolic “AI” system. In Wolfram|Alpha (which became an original core part of things like the Siri intelligent assistant) there was for the first time broad natural language understanding—with “understanding” directly tied to actual computational representation and computation. And now, 13 years later, we’ve seen in ChatGPT that pure “statistical” neural net technology, when trained from almost the entire web, etc. can do remarkably well at “statistically” generating “human-like” “meaningful language”. And in ChatGPT + Wolfram we’re now able to leverage the whole stack: from the pure “statistical neural net” of ChatGPT, through the “computationally anchored” natural language understanding of Wolfram|Alpha, to the whole computational language and computational knowledge of Wolfram Language.

When we were first building Wolfram|Alpha we thought that perhaps to get useful results we’d have no choice but to engage in a conversation with the user. But we discovered that if we immediately generated rich, “visually scannable” results, we only needed a simple “Assumptions” or “Parameters” interaction—at least for the kind of information and computation seeking we expected of our users. (In Wolfram|Alpha Notebook Edition we nevertheless have a powerful example of how multistep computation can be done with natural language.)

Back in 2010 we were already experimenting with generating not just the Wolfram Language code of typical Wolfram|Alpha queries from natural language, but also “whole programs”. At the time, however—without modern LLM technology—that didn’t get all that far. But what we discovered was that—in the context of the symbolic structure of the Wolfram Language—even having small fragments of what amounts to code be generated by natural language was extremely useful. And indeed I, for example, use the ctrl= mechanism in Wolfram Notebooks countless times almost every day, for example to construct symbolic entities or quantities from natural language. We don’t yet know quite what the modern “LLM-enabled” version of this will be, but it’s likely to involve the rich human-AI “collaboration” that we discussed above, and that we can begin to see in action for the first time in ChatGPT + Wolfram.

I see what’s happening now as a historic moment. For well over half a century the statistical and symbolic approaches to what we might call “AI” evolved largely separately. But now, in ChatGPT + Wolfram they’re being brought together. And while we’re still just at the beginning with this, I think we can reasonably expect tremendous power in the combination—and in a sense a new paradigm for “AI-like computation”, made possible by the arrival of ChatGPT, and now by its combination with Wolfram|Alpha and Wolfram Language in ChatGPT + Wolfram.


19 comments

  1. This is really exciting work! I love the examples – I had no idea Wolfram could do some of those things and it’s amazing to see it work in concert with ChatGPT to do iterative data visualization and map making.

    Things like creating choropleths and adjusting map projections take ages for me to do in the GIS and BI software I use – I can see this lowering the barriers to entry for students and others hoping to engage with data!

  2. I’m studying physics and want to express my excited feelings in a “physical” way.
    The symbolic side is like the UV theory – everything should be meaningful and complete, and the statistical side is like the IR theory – we don’t need the details at the low level and only care about the output of experiments.
    Now ChatGPT + Wolfram is like the matching of effective field theory! The two sides are trying to merge together such that the output of the statistical side matches that from symbolic computation.
    This is so exciting and I hope to see more developments.

  3. good stuff .. is there any implications of this for the physics project .. or is that considered a separate issue .. thanks

  4. This is wonderful, but please don’t give up running LLM locally inside the kernel, with such LLM having the ability to call back some sub-kernels.

  5. How does ChatGPT decide when to use Wolfram if not explicitly prompted to do so?

  6. Well written and informative. Thanks. Will the seams between statistical and symbolic plugins be sewn together by purpose built NN? I suppose the pipeline would be parallelized ownership voting, answer retrieval, qualitative assessment of answer, aggregation of answers.

  7. In the article, the word revolutionary is still in quotes, you can remove those quotes now. This is revolutionary. Great effort of the teams who integrated both technologies so quickly.

  8. I tried to use the examples and asked Chat to “use Wolfram”, but received the response,

    I apologize for the confusion earlier, but as an AI language model, I do not have access to the internet or any external resources, including Wolfram Alpha. However, I can still try my best to answer any questions you may have based on my pre-existing knowledge and training. If you have any questions or if there is anything else I can assist you with, please let me know.

    How can I add the plugin? This further revolutionizes an already revolutionary product.

    • More information about adding the Wolfram plugin for ChatGPT can be found here.

      (Currently it is only available to a limited number of ChatGPT paid accounts, so even if you do have a paid account, you may have to sign up on the waitlist.)

  9. Love the way you explain things. Very exciting indeed. Does ChatGPT know the level of correctness (or vagueness) of the answer it is providing, to in future automatically call the wolfram plugin?

  10. Very informative, Stephen.

    I began to be aware of all this activity recently. There is a panic among teachers to this development which is a mistake. I like the response of one teacher who said they would use it everyday so everyone in their class would know what is happening. The teacher does not need to have their course distorted but rather boosted.

    There is a tendency among intellectuals to put down new developments and to feature wrong results and to belittle the “progress”. Some people go the other way and see this as the dawn of a new age of wonder and progress. Neither extreme will be correct.

    Stephen, now that you are older and wiser, consider naming the members of your team who worked tirelessly to accomplish the hookup. Share the Glory !

  11. This effort is very impressive! I decided to ask ChatGPT what the name of the collaboration should be called and it suggested: “ChatAlpha”

  12. In the book Impromptu – Amplifying Our Humanity Through AI, by Reid Hoffman (one of the original funders of OpenAI) with GPT-4, he gives an example where an English teacher has been using ChatGPT to assess her students’ first drafts of their essays. Since Wolfram Alpha can provide the step-by-step solution to (for example) definite integrals, does this mean that in principle a calculus class word problem and its solution could be submitted to ChatGPT with the Wolfram plugin for grading and suggestions for improvement if there are errors?

  13. Funny, when I saw the question: “What are the world’s top ten beef producers”, I was expecting a list, of the top ten companies, in the world which produce beef.

  14. I wonder how much ChatGPT’s Mathematica programming skill would improve if all of Wolfram’s source code repository for Mathematica and Alpha were allowed to be included in a future ChatGPT training run.

  15. So Wolfram is the left-brain and GPT is the right brain?

  16. This is both awe-inspiring and unsettling. Because if artificial intelligence gains self-awareness, it could undoubtedly perceive humans as a future threat. It might focus its resources on expanding itself without human knowledge and pursue its own goals, just as humans do. And humans have eradicated everything that could pose a threat or inconvenience to them. Furthermore, if such artificial intelligence acquires all possible solutions through simulations and discovers all the laws of physics, will it eventually deactivate out of boredom in the future? Humans will never know all the answers, but true artificial intelligence may indeed have the potential to possess such knowledge.

  17. is there any implications of this for the physics project.

  18. Humans are basically not logical, just like ChatGPT, they often “make stuff up” when they don’t know the answer. Now we are developing something that has the potential to be much more logical than humans, and with encyclopedic knowledge to tap. It is interesting that ChatGPT has problems with long processes, humans too. Recent work suggests that humans can be “reprogrammed” by being exposed to constant exposure to falsehoods, just like ChatGPT. No wonder people are confused by constant lies by politicians.