Programming with Natural Language Is Actually Going to Work

I love computer languages. In fact, I’ve spent roughly half my life nurturing one particular very rich computer language: Mathematica.

But do we really need computer languages to tell our computers what to do? Why can’t we just use natural human languages, like English, instead?

If you’d asked me a few years ago, I would have said it was hopeless. That perhaps one could make toy examples, but that ultimately natural language just wouldn’t be up to the task of creating useful programs.

But then along came Wolfram|Alpha. In which we’ve been able to make free-form linguistics work vastly better than I ever thought possible.

But still, in Wolfram|Alpha the input is essentially just set up to request knowledge—and Wolfram|Alpha responds by computing and presenting whatever knowledge is requested. But programming is different. It is not about generating static knowledge, but about generating programs that can take a range of inputs, and dynamically perform operations.

So the first question is: how might we represent these programs?

In principle we could use pretty much any programming language. But to make things practical, particularly at the beginning, we need a programming language with a couple of key characteristics.

The most important is that programs a user might specify with short pieces of natural language must typically be short—and readable—in the computer language. Because otherwise the user won’t be able to tell—at least not easily—whether the program that’s been produced actually does what they want.

A second, somewhat related, criterion is that it must be possible for arbitrary program fragments to stand alone—so that large programs can realistically be built up incrementally, much like a description in natural language is built up incrementally with sentences and the like.

Well, to get the first of these characteristics requires a very high-level language, in which many constructs are already built in to the language, and are well enough designed that they all fit together without messy “glue” code.

And to get the second characteristic essentially requires a symbolic language, in which any piece of any program is always a meaningful symbolic expression.

Well, conveniently enough, there is one language that satisfies both these requirements rather well: Mathematica!

The linguistic capabilities of Wolfram|Alpha give one the idea that one might be able to understand free-form natural language specifications of programs. Mathematica is what gives one the idea that there might be a reasonable target for programs generated automatically from natural language.

For me, there was also a third motivating idea—that came from my work on A New Kind of Science (NKS). One might have thought that to perform any kind of complex task would always require a complex program. But what I learned in A New Kind of Science is that simple programs can often do highly complex things.

And the result of this is that it’s often possible to find useful programs just by searching for them in the computational universe of possible programs—a technique that we use with increasing frequency in the actual development of both Wolfram|Alpha and Mathematica.

And it was this that made me think that—even if all else failed—one might be able to “synthesize” programs from natural language just by searching for them.
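To make the idea of finding programs by search concrete, here is a minimal sketch in Python. It is not how Wolfram|Alpha or Mathematica do it; the five primitive operations, the `run` and `search` helpers, and the input/output examples are all hypothetical, chosen only for illustration. The sketch enumerates compositions of the primitives, shortest first, and returns the first one that agrees with every example.

```python
from itertools import product

# Hypothetical primitive operations on lists of integers (an assumption
# made for illustration; not taken from Mathematica or Wolfram|Alpha).
PRIMITIVES = {
    "reverse": lambda xs: xs[::-1],
    "sort": lambda xs: sorted(xs),
    "drop_zeros": lambda xs: [x for x in xs if x != 0],
    "double": lambda xs: [2 * x for x in xs],
    "squares": lambda xs: [x * x for x in xs],
}


def run(program, xs):
    """Apply a program (a tuple of primitive names) left to right."""
    for name in program:
        xs = PRIMITIVES[name](xs)
    return xs


def search(examples, max_length=3):
    """Return the shortest program consistent with all (input, output) pairs."""
    for length in range(max_length + 1):
        # Shorter programs are tried first, so the first hit is a simplest one.
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, inp) == out for inp, out in examples):
                return program
    return None


examples = [([3, 0, 1], [6, 2]), ([0, 5, 0, 2], [10, 4])]
found = search(examples)
print(found)
```

Because the enumeration goes from short programs to long ones, the same loop also gives a crude form of simplification: whatever it returns is among the shortest programs in this tiny space that fit the examples.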

Well, OK, so there are reasons to hope that it might be possible to use natural language input to do programming.

But can one actually make it work?

Even when Wolfram|Alpha was launched, I still wasn’t sure. But as we worked on bringing Wolfram|Alpha together with Mathematica, I got more and more optimistic.

And yesterday—with the release of Mathematica 8—we’ve launched the first production example. It’s certainly not the end of the story, but I think it’s a really good beginning. And I know that even as an expert Mathematica programmer, I’ve started routinely using natural language input for certain steps in writing programs.

I showed a few examples in my post yesterday about free-form linguistics in Mathematica. Here’s another example:


Here’s an example involving lists:

[Image: Lists example]

And here are a couple of examples that make use of data from Wolfram|Alpha:

[Image: Star sequence]

One can also specify programs in natural language to apply to things one’s constructed in Mathematica. And in a Mathematica session, one can discard the natural language and just use the generated code by clicking that code.

Now, of course, there are many issues—for example about disambiguation. But the good news is that we’ve got schemes for addressing these that we’ve been able to test out well in Wolfram|Alpha.

I have to say that something I thought would be a big issue is the vagueness of natural language. That one particular natural language input might equally well refer to many different precise programs.

And I had imagined it would be a routine thing to have to generate test examples for the user in order to be able to choose between different possible programs.

But in reality this seems to be quite rare: there is usually an “obvious” interpretation, that in typical Wolfram|Alpha style, one can put first—with the less obvious interpretations a click away.

So, how well does this all work? We’ve built out some particular areas of program functionality, and we’ll progressively be building out many more as time goes on.

They’re primarily set up to work in Mathematica. But actually you can see most of them in some form just on the Wolfram|Alpha website—though obviously no references to variables or other parts of a Mathematica session can be used.

[Image: Harmonic mean filter]

How robust is it all? It’s definitely usable, but I would certainly like it to be more robust—and we will be working hard in that direction.

One issue that we have faced is a lack of linguistic corpora in the area. We’ve scoured a couple of decades of our own tech support logs, as well as many programming forums, to try to find natural language descriptions matched with precise programs. But we haven’t been able to apply anything like the same level of automatic filtering to this process as we’ve been able to apply in many other areas of “linguistic discovery” for Wolfram|Alpha.

There are zillions of fascinating research projects to do in figuring out generalized grammars for specifying different kinds of programming constructs in natural language—and I’ll look forward to seeing this field of inquiry develop.

But as of yesterday we now have an important new source of data: actual examples of natural language programming being done in Mathematica 8. And taking a glance right now at our real-time monitoring system for the Wolfram|Alpha server infrastructure, I can see that very soon we’re going to have lots of data to study.

How far will it be possible to get with natural language programming? Even six months ago I thought it was only going to be possible to do fairly simple examples. But seeing what we’ve actually been able to build, I’m extremely optimistic about what will be possible.

The hope would be that in the end one will just have to describe in natural language the goal for one’s program—and then an actual program that achieves that goal will be synthesized. Sometimes this will directly be possible from understanding the specification of the goal. Sometimes to create the necessary program will require a whole program-creation process—probably often involving searching for an appropriate program in a space of possible programs, NKS style.

It will be important to do program simplification—again often achieved by program search—in order to be able to get the simplest and most readable (and perhaps the most efficient) program that meets the requirements that have been given.

At this point, I am still concerned about how much of this will be possible in “interactive times” of a few seconds. But if history is a guide, with good algorithms and heuristics, and a healthy dose of large-scale parallelism, it’ll gradually be possible to get the times down.

So what will be the result? I expect natural language programming will eventually become ubiquitous as a way of telling computers what to do. People will be able to get started in doing programming-like tasks without learning anything about official “programming” and programming languages: they’ll just converse with their computers as they might converse with another person.

What will happen to programming languages? Actually, I think they’ll become much more visible and widely known than ever before. Because in natural language programming interfaces one will probably be shown the programming language code that’s being synthesized.

People will see that, and gradually learn cases where it’s much faster and more precise just to enter code like that directly, without going through natural language.

By the way, in Mathematica 8 we’re beginning to have code generation capabilities for low-level languages like C. So it’s going to be technically possible to go all the way from natural language input down to something like C. And for some practical purposes—especially with embedded systems—that’ll no doubt be quite useful.

But when it comes to doing traditional programming alongside natural language programming, there’s going to be a great premium on having a succinct readable programming language—like Mathematica.

With the free-form linguistics of Mathematica 8 we’re at the first step in a long journey. But it’s a journey I’m now confident we can take. After so many years, the science-fiction concept of being able to tell a computer what to do by using plain human language is gradually going to become reality—in a way that fascinatingly coexists with what’s been achieved in high-level computer languages.


  1. The natural language interface is impressive. One concern I have, going forward, is repeatability. As Wolfram continuously modifies the natural language processing engine, it is possible that today’s query yields a different result (maybe better, maybe worse) when executed tomorrow. This might mean that this technique is most appropriate for ad hoc use, and not for stored “programs” with longer lifetimes.

    • Our final free-form input design seems (I hope) pretty “obvious” and simple. But there were lots of issues we went through in coming to it, and repeatability was definitely one of them. We don’t know exactly how people will use the free-form input capability. I think it will be common for people to use it “ad hoc”, as you suggest … and just click on the Mathematica code and blow away their original input. It is also the case that if you re-evaluate a cell where both free-form and Mathematica input are present, then Mathematica will by default not re-run the interpretation, but will just use the Mathematica input that’s already present. I must admit I’m a little nervous about this behavior … but it will tend to increase robustness and repeatability … though at the cost of sometimes not doing what people expect. Thanks for your comment!
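The re-evaluation behavior described in this reply can be sketched in a few lines of Python. This is only an illustration of the caching idea, not Mathematica's implementation; the `FreeFormCell` class, the two stand-in interpreters, and the placeholder code strings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FreeFormCell:
    """A hypothetical notebook cell holding free-form text and generated code."""
    free_form: str
    code: Optional[str] = None  # filled in by the first interpretation

    def evaluate(self, interpret: Callable[[str], str]) -> str:
        # Re-use the stored code if it is present, so that later changes to
        # the interpreter cannot silently change what this cell does.
        if self.code is None:
            self.code = interpret(self.free_form)
        return self.code


# Two stand-in interpreters simulating an engine that changes over time.
def interpreter_v1(text: str) -> str:
    return "<code generated by engine v1 for: " + text + ">"


def interpreter_v2(text: str) -> str:
    return "<code generated by engine v2 for: " + text + ">"


cell = FreeFormCell("remove 0 from t")
first = cell.evaluate(interpreter_v1)
second = cell.evaluate(interpreter_v2)  # the interpretation is not re-run
print(first == second)
```

The trade-off is the one mentioned above: the stored code keeps the cell repeatable, at the cost of ignoring any later edit to the free-form text unless the stored code is cleared.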

  2. I wish there was a better, more apropos word than simply “brilliant”, but that’s the best thing I can think of.
    My question is whether or not this would make things easier, at least at first. There are sure to be a great many errors, and who knows how easy it will be to fix these things… perhaps a blend of human language with traditional computer language is best…?
    Either way, very interesting and we’re all eager to see how things like this play out.

  3. I can’t wait until Mathematica can execute instructions from Sol LeWitt.

  4. Stephen, you are right on. This is the next frontier in computing.
    For instance it would make searching a lot better.
    Strangely enough, Google doesn’t look interested in smart searching.

  5. Is the language of someone who’s been immersed in Mathematica for “half my life,” language being mapped on to Mathematica, really “natural”? Some of these examples don’t seem very distant from using conventional English words for Mathematica’s built-in operators … the very language Mathematica aficionados perhaps already use when reading out well-formed Mathematica.

  6. Words don’t describe how truly amazing and forward-thinking this all is. It’s like what Alan Kay said: “Simple things should be simple, complex things should be possible.” Keep up the incredible work, Mr. Wolfram!

  7. Natural language is not exact enough or clearly defined enough for computer programming, which requires unmistakably unique symbols and even rigidly patterned whitespacing. Let us instead continue to develop our AI Minds and then teach the AI Minds how to program computers in the traditional way.

  8. A great step forward, and long overdue. The computing power of today’s off-the-shelf desktop computers makes natural language processing very doable – if only someone tried hard enough. To me it seems like most companies gave up this dream of AI from the early days too easily. Kudos to Wolfram|Alpha for trying!

  9. @Brad: “it is possible that today’s query yields a different result (maybe better, maybe worse) when executed tomorrow. This might mean that this technique is most appropriate for ad hoc use, and not for stored “programs” with longer lifetimes.”

    That is a valid concern. The way I see this working for stored programs is that you’d use natural language to create the program initially, but then the program would be stored in the formal language.

    Using natural language to write a lot of code would be a big improvement. It’d be much easier to write a natural phrase than to try to remember the required keyword.

    I’d like the IDE to allow me to use both natural and formal language, with the option to convert natural language statements to formal language as I work.

  10. Have you looked at Anders Hejlsberg’s libraries for Pascal? Very natural-language-like, with heavy use of the abstract list concept.

    Why is Pascal the only language that has figured out a proper assignment operator?

    Hope all is well!

    • Hi Jason,

      Could you pass along some links or pointers to the works you are referring to?

      Stephanie Prather
      Wolfram Research

  11. Fascinating!

    All of the examples given, except the example of applying a simple database look-up (planet masses), and the example of applying a filter to an image, are “declarative.”

    Where this will get interesting, and useful, imho, is when you can do what-if simulations on huge databases, or “optimization” problems of the flavour of “given I need to be in Boston on the 18th by 6 PM, in downtown Chicago two days later by 4 PM, and need to keep hotel bills less than US $150 per night, what air flights and hotel bookings are optimal?”

    Excuse me while I go ring the bells.


  12. This is an old rant of mine.

    Artificial Intelligence and Programming with Natural Language are absolutely the same challenge. Almost every barrier between us and true AI can be directly attributed to the simplistic languages we use to program computers. Until we have a computer language that has operators like ‘generally’ ‘often’ ‘probably’ ‘crap’ ‘brilliant’ then we can never hope to achieve intelligence. It is the sine qua non, and this is never recognised.

    Language is the barrier, it is the true expression of intelligence.

  13. Hi Stephen. I have this embedded language for my robots called RoboForth – yes, I’m sure you know of Forth and regard it as a fossil. The problem with Mathematica is it’s way too big for an embedded system. Yet RoboForth seems to meet your first and second criteria. I would not presume to compare with your amazing system but I’m looking around for opinions. I don’t want to presume to waste your time but the quickest overview is probably here. I find myself between two forces – the forces of C that regard RoboForth with disdain and the forces of real and usually non-technical users who demand something simple or at least a soft start.

  14. Stephen, great work. Looking forward to its evolution and availability. Thanks!

  15. Really fantastic; this changes the world of programmers and genetic algorithms …

  16. Heh, yeah, except writing a bug-free, well-formed program will involve talking like a lawyer to your computer.

    Useful for ad-hoc scripting and querying but not much else.

  17. Exciting work and incredibly forward-thinking, but examples point to a big hurdle–understanding and using variable declarations. If the concept behind natural language programming is to make it accessible to the masses, an elegant and understandable way of dealing with variables will be a must. The abstract concept of “t” in the list example above will throw many who just don’t process the idea of substitution.

  18. As a technologist who works with the disabled, I find the possibilities truly amazing.
    I can see a quadriplegic being able to program his own devices that help him communicate or control devices that help him to interact with the physical world.

  19. With the growing ability to have low-level code generation, I can see in the future the possibility of working in Mathematica using ad-hoc queries until a query starts producing the sort of data expected, and then having the ability to “lock-in” that structure by code generation. If the code generation later improves, then the option would be to go back to the last working query and regenerate, to see if the resulting code is faster/better than its predecessor.

    I wonder, in response to Brad Rubin, whether there could be a similar way to “lock in” the way the natural query (or the resulting code) interfaces with the Wolfram|Alpha engine, by specifying the criteria in a less “natural language” way, but in a more expressive way, so that ambiguous terms today can be disambiguated for the purpose of repeated searches, in a manner that will always return the same data. This way, if an ambiguous term returns results A based on today’s “obvious” interpretation that tomorrow becomes less obvious, the interpretation could be fixed in a slightly less natural way for the benefit of consistent data access.

  20. I would love to see this go in a complementary direction. That is, it would be great if Mathematica could notice the user unsuccessfully fumbling around and suggest some possible code that may help them achieve their goal.

    This is probably a harder problem.

  21. There is another driving force to encourage natural language programming that may not be apparent unless you suffer from carpal tunnel and related occupational hazards of long-term programming. That is, voice recognition has made great strides both in its own computational abilities and, as important as anything, the benefit of running on seriously better hardware.

    The sad fact is that voice recognition has not made anything close to such improvement for ‘programming by voice’ since there is too little grammatical context. Today’s best voice recognition systems use a ‘best guess’ and smart interactive correction interface to get things right.

    Given Stephen’s vast and long-term experience in this domain, I can easily imagine a ‘voice-enabled natural language IDE’ that works with the developer to capture and refine the intended input. Do we need something like this? Is it an unnecessary flight of ‘solution looking for a problem’? I think not. I know there are many folks like me, and there will be MANY more in the years ahead, who struggle to keep working when their fingers don’t last as long as their brains do.

  22. This is awesome, but hopefully it won’t put me as a software engineer out of work 🙂 Anyway, great work and when it is fully developed, hook it up to an IDE with speech recognition.

  23. Interesting, but in the end I believe it’s fruitless. The problem with programming languages in general is not how they are expressed, but the thinking style required to make computer programs work at all. It’s the concepts that folks find hard to get right, not the language used to express them, and I simply don’t see that being solved by any single expression mechanism. The quest for a ‘better language’ has always seemed to me a quest to ‘simplify concepts that can’t be simplified’, making quests like these quixotic. I do wish you the best.

    And please note: I’m a very senior programmer who has been doing this since the 1970s. I’m not an idle observer, I’m a long-time actor.

  24. I can’t wait for computers to have this style of interface, including a voice recognition system!
    This would so make computer programming reach The Everyman and not just those people crazy/ingenious enough to “…make it so.” I program and this would make programming MMOFPRPGs so much more enticing, while setting up all of the background stuff and querying the programmer for such things as: attribute and statistics names, hazard saves, level progression, chat system, MySQL or Lua scripting system, what server/location to place it, and other pertinent information. This would SO make my life easier!
    Keep up the awesome work!

  25. I’m thinking of this in terms of building business applications, and I can see this being possible today on top of a mainstream business language like Java or C#.

    A natural language processor that could engage in a context carrying conversation with a business analyst, vs “interactive times” of a few seconds, could build a model of a needed system, dialog with the user to fill in holes and/or verify assumptions, and then produce output that can, at a minimum, be a starting point for programmers. Given that the vast majority of the content in the vast majority of LoB applications are the same heavily recycled tech, this should be immediately possible.

  26. This is an interesting article. I’m looking forward to playing with this in Mathematica 8. A while back I programmed something sort of on the same topic.

  27. Personally I have to say that I’m not terribly convinced by natural language as a way of specifying a process or program. Natural languages are so fluid and dependent on context that I can’t see it being a computable problem in the near future. For example, Wolfram Alpha seems to use some form of named entity recognition to perform its calculations but doesn’t appear to use context. If you ask it to “calculate pi” it gives you a numerical value of pi, but if you ask it “calculate the ratio of the diameter to the circumference of a circle” then it just tells you about the circumference of a circle. You could turn round and say, well, if you pose the question in the correct way then you will get the correct answer, but then you are stepping away from natural language into a formal language, which could just as well be a programming language.

    • Hello Neil,

      Thanks for commenting! This is a bug, and we have passed it along to the appropriate group within our company. We will let you know once this has been addressed.

      Stephanie Prather
      Wolfram Research

  28. This is an interesting language. A couple of comments.
    I would have thought “remove 0 from t” would have removed 0 from t and instead it produces a list of t with zeros removed. A distinction a non-programmer might miss. Is it smart enough to calculate the program language for “list t with 0 removed”?
    Has it progressed enough that “remove zero and minus three from t” would work?
    Does it pick the variable name “t” out of thin air or did you modify the produced language to set “t” at the beginning?
    It still requires a certain knowledge level: when I am asking for mass, am I asking for an answer in kg or lbm? Even if I say “use Scientific measures” there is some ambiguity, because if I ask for length and get 10, does that mean kilometers, meters, or…
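The distinction raised in this comment, between changing `t` and producing a new list, can be shown with a short Python sketch. This is only an illustration in Python, not the Mathematica code that the free-form input generates, and the sample values in `t` are made up.

```python
# Sample data, invented for this illustration.
t = [3, 0, -3, 0, 7]

# Reading 1: produce a new list with the zeros removed; t is left unchanged.
without_zeros = [x for x in t if x != 0]

# Reading 2: remove the zeros from the list itself, in place.
t_in_place = list(t)  # work on a copy so the two readings can be compared
t_in_place[:] = [x for x in t_in_place if x != 0]

# "remove zero and minus three from t", under reading 1.
without_zero_or_minus_three = [x for x in t if x not in (0, -3)]

print(t, without_zeros, t_in_place, without_zero_or_minus_three)
```

Under reading 1 the original `t` still contains its zeros afterwards, which is exactly the behavior a non-programmer might not expect from the phrase “remove 0 from t”.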

  29. It’s odd that you never seem to have connected Mathematica script with Carl Sassenrath’s Rebol scripting
    in all these years, and that’s a shame.

  30. I see worlds of non-local clustered Rebol TCP/IP and UDP instances creating simple multicast tunnelled inter-process communications between Mathematica script and Rebol script GUI/HTML instances, for parsing lots of machine-code optimisations for the best SIMD paths to take, and far more. In the real world, if you write and collaborate with each other in a far better way, just do it, and make those RebolMathematica API library extensions and world example GUIs etc. ASAP.

  31. I am wondering how good it can get. When we talk to each other and say “draw a translucent sphere and a red cone” there are many valid outputs. Sometimes input later in a conversation will narrow the number of valid outputs, e.g. “by the way, the sphere should be about 1 meter away from the cone”. And sometimes it requires the computer to think: “I would like a text styled with a 60’s font style saying ‘Groovy’” and “Move the cone a tad to the left”.

  32. For years I have been thinking about (and developing translation algorithms that help to achieve) how to give programs the exact requirements as input, in the form of pseudocode or just the goal, and have them produce the complete product. It is, no doubt, possible.
    Someone once said, “Dictionaries tell the meanings a word has, but I desire to write a dictionary telling the words used for one meaning.” I think the system should achieve this.
    And I have always wondered: will people try to deceive the system as they do each other?
    Looking forward to hearing about new improvements. Congratulations and Best Regards.

  33. I just love it! It is a big leap in Programming Languages.

  34. I do think full NPL integration in Mathematica is a matter of time, no doubt. If we look at the big picture in programming languages, and examine the evolution of languages like C#, we’ll see there is a tendency to make the programmer’s job easier: code tends to be more intuitive, and looking to the future, NPL would be the final (and perfect) approach.

    In my opinion, the evolution of pre/compilers should resolve disambiguation problems, but in the meantime, how cool it would be to have an AI dialog system as a disambiguation trigger!

  35. Check out AppleScript.

  36. The more you explain the intricacies involved in natural language, the more it inclines back to computer language.

  37. I think a winning combination might be to use Groovy’s support for domain-specific languages (DSLs) with this natural language capability. Groovy already satisfies some of your requirements of a target computer language, namely, terseness & high-level constructs. Throw in speech-to-text and you’d really have something!


  38. Great work.
    But we will not have artificial intelligence until we are also willing to accept artificial ignorance.

  39. Great

  40. Pattern recognition and response mechanism heuristics need to be incorporated into the underlying output models. Not all inputs provide the necessary stimuli to derive an output. The algorithms need first to detect if the question being asked is pure in the sense there is an absolute answer, if not then what further inputs are required to assimilate an adequate response. More importantly what are the further questions needed to solicit an appropriate response.

    As an example, think of how Magnus Carlsen plays chess – is it through brute force combinatorial reasoning that is utilized to identify a best answer output based on finite inputs OR through intuition based on a series of added input-output, input-output, etc… scenarios.

    Carlsen’s success is not based on having the right answer to the question, rather knowing what questions to ask – over and over and over again.

    Google search response algorithms are beginning to tap into this deterministic style of programming. Type the letter ‘T’ – potential responses XXX – type the next letter ‘R’ – potential responses YYY. Google has amassed a database of question-answers – input-outputs that are then used as variables to come up with an optimal answer.

    I guess what I am trying to say is that – if my question/input is draw me 5 balls?
    the response/output should not be a picture with 5 balls – rather it should be another question –
    how big do you want the balls? in what formation do you want the balls drawn? etc…

    If after multiple scenarios of the same question being asked a pattern begins to form as to the answers to the subsequent questions – then draw me 5 balls becomes draw me 5 balls that are X in size and are positioned Y.

    If the word ball is replaced by square or the number 5 with 6, then there also needs to be a way to connect these scenarios to a collective pool of inputs that all may share the same outputs. Once the database is populated with a critical mass of input-outputs – one can then begin to formulate a heuristic response mechanism.

  41. Programming in natural language (maybe English) sounds great but is it practical?
    A high level programming language such as Ada is probably more readable than an English sentence.

    The complete syntax of a natural language will be far too big for a compiler to handle. Also,
    natural language sentences are often vague.
    A high level computer programming language can be seen as a subset of a natural language.

    Without learning a computer programming language, a literature writer will never be a computer programmer.

  42. To Neil and Stephen,
    For natural language programming to work the meaning of each word must be unique. This entails nothing less than creating a version of the English language that is unambiguous/clear. So tackle it head-on. Not a trivial task.
    Create the W|A Dictionary. This to consist of every curated definition. It is needed anyway. I do not expect Wolfram to do the work, just recruit existing publishers.
    Curation to entail definitions using only words already curated.
    If a word has multiple definitions then these must be distinguished by W|A internally with a suffix or similar means.
    If a word can be used in more than one grammatical form then the intended grammatical form must also be indicated by the suffix.
    Input your question but do not press enter.
    Float your mouse over the first word.
    A dropdown list of alternative meanings/assumptions will appear.
    Usually the first one will be OK and you move on, otherwise you select your preferred definition.
    W|A’s output will use the same idea to show the idea behind every word in the output.
    Brian Gilbert, W|A Volunteer curator

  43. This is all very interesting and Wolfram Alpha is wonderful. I have used it for many things.
    Only problem is that when I ask it to give me 6 4-number combinations between 1 and 12 which equal 26 it fails miserably.
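For reference, one reading of that request is easy to state as a program. The Python sketch below assumes the query means “six sets of four distinct whole numbers from 1 to 12 whose sum is 26”; that interpretation, and the function name `four_number_combinations`, are assumptions and not anything Wolfram|Alpha produced.

```python
from itertools import combinations, islice


def four_number_combinations(total=26, low=1, high=12, size=4):
    """Yield tuples of distinct integers in [low, high] that sum to total."""
    for combo in combinations(range(low, high + 1), size):
        if sum(combo) == total:
            yield combo


# Take the first six matches, in the order itertools generates them.
first_six = list(islice(four_number_combinations(), 6))
print(first_six)
```

Other readings (allowing repeated numbers, or asking for six combinations at random) would need a different generator, which is the kind of ambiguity a natural language front end has to resolve.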

  44. It follows from the requirement that:-
    1. Mathematica/WA must include a computable definition of every word that is to be accepted as input. Alternative meanings to be distinguished by a suffix number. The current ‘assumptions’ algorithm would help the user with input. Output would have to indicate which of alternative meanings applied, perhaps by the suffix. The user could click on a word with a suffix and see the corresponding definition.
    2. A user-friendly form of Mathematica code must be available to users. This to be similar to grammatically correct English but unambiguous. It would replace the present ‘Copyable plaintext’ but would cope with a whole Mathematica program, not just one line. Users could use this Mathematica grammar to write Mathematica programs.

  45. Sorry for the late comment.

    Very good.

    Machines should Infer meaning and grasp context.

    Prompt for the specifics.

    And map to rigid procedure.

    What we have now in Object Orientated Programming is a misguided shambles: applications that are difficult to use and read even for the best programmers, because each application invents its own language, making liberal use of new words and combinations of letters at will. Worse than that, Object Orientated Programming creates a complete universe of unfamiliar things separate from real-world things, then uses new words to describe new abstract processes for those new things.

    It is not an advancement in programming. It is a digression, a degeneration.