To enable the functionality described here, select and install the Wolfram plugin from within ChatGPT.

Note that this capability is so far available only to some ChatGPT Plus users; for more information, see OpenAI’s announcement.

Early in January I wrote about the possibility of connecting ChatGPT to Wolfram|Alpha. And today—just two and a half months later—I’m excited to announce that it’s happened! Thanks to some heroic software engineering by our team and by OpenAI, ChatGPT can now call on Wolfram|Alpha—and Wolfram Language as well—to give it what we might think of as “computational superpowers”. It’s still very early days for all of this, but it’s already very impressive—and one can begin to see how amazingly powerful (and perhaps even revolutionary) what we can call “

Back in January, I made the point that, as an LLM neural net, ChatGPT—for all its remarkable prowess in textually generating material “like” what it’s read from the web, etc.—can’t itself be expected to do actual nontrivial computations, or to systematically produce correct (rather than just “looks roughly right”) data, etc. But when it’s connected to the Wolfram plugin it can do these things. So here’s my (very simple) first example from January, but now done by ChatGPT with “Wolfram superpowers” installed:

It’s a correct result (which in January it wasn’t)—found by actual computation. And here’s a bonus: immediate visualization:

How did this work? Under the hood, ChatGPT is formulating a query for Wolfram|Alpha—then sending it to Wolfram|Alpha for computation, and then “deciding what to say” based on reading the results it got back. You can see this back and forth by clicking the “Used Wolfram” box (and by looking at this you can check that ChatGPT didn’t “make anything up”):

There are lots of nontrivial things going on here, on both the ChatGPT and Wolfram|Alpha sides. But the upshot is a good, correct result, knitted into a nice, flowing piece of text.

Let’s try another example, also from what I wrote in January:

A fine result, worthy of our technology. And again, we can get a bonus:

In January, I noted that ChatGPT ended up just “making up” plausible (but wrong) data when given this prompt:

But now it calls the Wolfram plugin and gets a good, authoritative answer. And, as a bonus, we can also make a visualization:

Another example from back in January that now comes out correctly is:

If you actually try these examples, don’t be surprised if they work differently (sometimes better, sometimes worse) from what I’m showing here. Since ChatGPT uses randomness in generating its responses, different things can happen even when you ask it the exact same question (even in a fresh session). It feels “very human”. But different from the solid “right-answer-and-it-doesn’t-change-if-you-ask-it-again” experience that one gets in Wolfram|Alpha and Wolfram Language.

Here’s an example where we saw ChatGPT (rather impressively) “having a conversation” with the Wolfram plugin, after at first finding out that it got the “wrong Mercury”:

One particularly significant thing here is that ChatGPT isn’t just using us to do a “dead-end” operation like show the content of a webpage. Rather, we’re acting much more like a true “brain implant” for ChatGPT—where it asks us things whenever it needs to, and we give responses that it can weave back into whatever it’s doing. It’s rather impressive to see in action. And—although there’s definitely much more polishing to be done—what’s already there goes a long way towards (among other things) giving ChatGPT the ability to deliver accurate, curated knowledge and data—as well as correct, nontrivial computations.

But there’s more too. We already saw examples where we were able to provide custom-created visualizations to ChatGPT. And with our computation capabilities we’re routinely able to make “truly original” content—computations that have simply never been done before. And there’s something else: while “pure ChatGPT” is restricted to things it “learned during its training”, by calling us it can get up-to-the-moment data.

This can be based on our real-time data feeds (here we’re getting called twice; once for each place):

Or it can be based on “science-style” predictive computations:

Or both:

There’s a lot that Wolfram|Alpha and Wolfram Language cover:

And now (almost) all of this is accessible to ChatGPT—opening up a tremendous breadth and depth of new possibilities. And to give some sense of these, here are a few (simple) examples:

AlgorithmsAudioCurrency conversionFunction plottingGenealogyGeo dataMathematical functionsMusicPokémon

ChatGPT is built to be able to have back-and-forth conversation with humans. But what can one do when that conversation has actual computation and computational knowledge in it? Here’s an example. Start by asking a “world knowledge” question:

And, yes, by “opening the box” one can check that the right question was asked to us, and what the raw response we gave was. But now we can go on and ask for a map:

But there are “prettier” map projections we could have used. And with ChatGPT’s “general knowledge” based on its reading of the web, etc. we can just ask it to use one:

But maybe we want a heat map instead. Again, we can just ask it to produce this—underneath using our technology:

Let’s change the projection again, now asking it again to pick it using its “general knowledge”:

And, yes, it got the projection “right”. But not the centering. So let’s ask it to fix that:

OK, so what do we have here? We’ve got something that we “collaborated” to build. We incrementally said what we wanted; the AI (i.e.

If we copy the code out into a Wolfram Notebook, we can immediately run it, and we find it has a nice “luxury feature”—as ChatGPT claimed in its description, there are dynamic tooltips giving the name of each country:

(And, yes, it’s a slight pity that this code just has explicit numbers in it, rather than the original symbolic query about beef production. And this happened because ChatGPT asked the original question to Wolfram|Alpha, then fed the results to Wolfram Language. But I consider the fact that this whole sequence works at all extremely impressive.)

What’s happening “under the hood” with ChatGPT and the Wolfram plugin? Remember that the core of ChatGPT is a “large language model” (LLM) that’s trained from the web, etc. to generate a “reasonable continuation” from any text it’s given. But as a final part of its training ChatGPT is also taught how to “hold conversations”, and when to “ask something to someone else”—where that “someone” might be a human, or, for that matter, a plugin. And in particular, it’s been taught when to reach out to the Wolfram plugin.

The Wolfram plugin actually has two entry points: a Wolfram|Alpha one and a Wolfram Language one. The Wolfram|Alpha one is in a sense the “easier” for ChatGPT to deal with; the Wolfram Language one is ultimately the more powerful. The reason the Wolfram|Alpha one is easier is that what it takes as input is just natural language—which is exactly what ChatGPT routinely deals with. And, more than that, Wolfram|Alpha is built to be forgiving—and in effect to deal with “typical human-like input”, more or less however messy that may be.

Wolfram Language, on the other hand, is set up to be precise and well defined—and capable of being used to build arbitrarily sophisticated towers of computation. Inside Wolfram|Alpha, what it’s doing is to translate natural language to precise Wolfram Language. In effect it’s catching the “imprecise natural language” and “funneling it” into precise Wolfram Language.

When ChatGPT calls the Wolfram plugin it often just feeds natural language to Wolfram|Alpha. But ChatGPT has by this point learned a certain amount about writing Wolfram Language itself. And in the end, as we’ll discuss later, that’s a more flexible and powerful way to communicate. But it doesn’t work unless the Wolfram Language code is exactly right. To get it to that point is partly a matter of training. But there’s another thing too: given some candidate code, the Wolfram plugin can run it, and if the results are obviously wrong (like they generate lots of errors), ChatGPT can attempt to fix it, and try running it again. (More elaborately, ChatGPT can try to generate tests to run, and change the code if they fail.)

There’s more to be developed here, but already one sometimes sees ChatGPT go back and forth multiple times. It might be rewriting its Wolfram|Alpha query (say simplifying it by taking out irrelevant parts), or it might be deciding to switch between Wolfram|Alpha and Wolfram Language, or it might be rewriting its Wolfram Language code. Telling it how to do these things is a matter for the initial “plugin prompt”.

And writing this prompt is a strange activity—perhaps our first serious experience of trying to “communicate with an alien intelligence”. Of course it helps that the “alien intelligence” has been trained with a vast corpus of human-written text. So, for example, it knows English (a bit like all those corny science fiction aliens…). And we can tell it things like “If the user input is in a language other than English, translate to English and send an appropriate query to Wolfram|Alpha, then provide your response in the language of the original input.”

Sometimes we’ve found we have to be quite insistent (note the all caps): “When writing Wolfram Language code, NEVER use snake case for variable names; ALWAYS use camel case for variable names.” And even with that insistence, ChatGPT will still sometimes do the wrong thing. The whole process of “prompt engineering” feels a bit like animal wrangling: you’re trying to get ChatGPT to do what you want, but it’s hard to know just what it will take to achieve that.

Eventually this will presumably be handled in training or in the prompt, but as of right now, ChatGPT sometimes doesn’t know when the Wolfram plugin can help. For example, ChatGPT guesses that this is supposed to be a DNA sequence, but (at least in this session) doesn’t immediately think the Wolfram plugin can do anything with it:

Say “Use Wolfram”, though, and it’ll send it to the Wolfram plugin, which indeed handles it nicely:

(You may sometimes also want to say specifically “Use Wolfram|Alpha” or “Use Wolfram Language”. And particularly in the Wolfram Language case, you may want to look at the actual code it sent, and tell it things like not to use functions whose names it came up with, but which don’t actually exist.)

When the Wolfram plugin is given Wolfram Language code, what it does is basically just to evaluate that code, and return the result—perhaps as a graphic or math formula, or just text. But when it’s given Wolfram|Alpha input, this is sent to a special Wolfram|Alpha “for LLMs” API endpoint, and the result comes back as text intended to be “read” by ChatGPT, and effectively used as an additional prompt for further text ChatGPT is writing. Take a look at this example:

The result is a nice piece of text containing the answer to the question asked, along with some other information ChatGPT decided to include. But “inside” we can see what the Wolfram plugin (and the Wolfram|Alpha “LLM endpoint”) actually did:

There’s quite a bit of additional information there (including some nice pictures!). But ChatGPT “decided” just to pick out a few pieces to include in its response.

By the way, something to emphasize is that if you want to be sure you’re getting what you think you’re getting, always check what ChatGPT actually sent to the Wolfram plugin—and what the plugin returned. One of the important things we’re adding with the Wolfram plugin is a way to “factify” ChatGPT output—and to know when ChatGPT is “using its imagination”, and when it’s delivering solid facts.

Sometimes in trying to understand what’s going on it’ll also be useful just to take what the Wolfram plugin was sent, and enter it as direct input on the Wolfram|Alpha website, or in a Wolfram Language system (such as the Wolfram Cloud).

One of the great (and, frankly, unexpected) things about ChatGPT is its ability to start from a rough description, and generate from it a polished, finished output—such as an essay, letter, legal document, etc. In the past, one might have tried to achieve this “by hand” by starting with “boilerplate” pieces, then modifying them, “gluing” them together, etc. But ChatGPT has all but made this process obsolete. In effect, it’s “absorbed” a huge range of boilerplate from what it’s “read” on the web, etc.—and now it typically does a good job at seamlessly “adapting it” to what you need.

So what about code? In traditional programming languages writing code tends to involve a lot of “boilerplate work”—and in practice many programmers in such languages spend lots of their time building up their programs by copying big slabs of code from the web. But now, suddenly, it seems as if ChatGPT can make much of this obsolete. Because it can effectively put together essentially any kind of boilerplate code automatically—with only a little “human input”.

Of course, there has to be some human input—because otherwise ChatGPT wouldn’t know what program it was supposed to write. But—one might wonder—why does there have to be “boilerplate” in code at all? Shouldn’t one be able to have a language where—just at the level of the language itself—all that’s needed is a small amount of human input, without any of the “boilerplate dressing”?

Well, here’s the issue. Traditional programming languages are centered around telling a computer what to do in the computer’s terms: set this variable, test that condition, etc. But it doesn’t have to be that way. And instead one can start from the other end: take things people naturally think in terms of, then try to represent these computationally—and effectively automate the process of getting them actually implemented on a computer.

Well, this is what I’ve now spent more than four decades working on. And it’s the foundation of what’s now Wolfram Language—which I now feel justified in calling a “full-scale computational language”. What does this mean? It means that right in the language there’s a computational representation for both abstract and real things that we talk about in the world, whether those are graphs or images or differential equations—or cities or chemicals or companies or movies.

Why not just start with natural language? Well, that works up to a point—as the success of Wolfram|Alpha demonstrates. But once one’s trying to specify something more elaborate, natural language becomes (like “legalese”) at best unwieldy—and one really needs a more structured way to express oneself.

There’s a big example of this historically, in mathematics. Back before about 500 years ago, pretty much the only way to “express math” was in natural language. But then mathematical notation was invented, and math took off—with the development of algebra, calculus, and eventually all the various mathematical sciences.

My big goal with the Wolfram Language is to create a computational language that can do the same kind of thing for anything that can be “expressed computationally”. And to achieve this we’ve needed to build a language that both automatically does a lot of things, and intrinsically knows a lot of things. But the result is a language that’s set up so that people can conveniently “express themselves computationally”, much as traditional mathematical notation lets them “express themselves mathematically”. And a critical point is that—unlike traditional programming languages—Wolfram Language is intended not just for computers, but also for humans, to read. In other words, it’s intended as a structured way of “communicating computational ideas”, not just to computers, but also to humans.

But now—with ChatGPT—this suddenly becomes even more important than ever before. Because—as we began to see above—ChatGPT can work with Wolfram Language, in a sense building up computational ideas just using natural language. And part of what’s then critical is that Wolfram Language can directly represent the kinds of things we want to talk about. But what’s also critical is that it gives us a way to “know what we have”—because we can realistically and economically read Wolfram Language code that ChatGPT has generated.

The whole thing is beginning to work very nicely with the Wolfram plugin in ChatGPT. Here’s a simple example, where ChatGPT can readily generate a Wolfram Language version of what it’s being asked:

And the critical point is that the “code” is something one can realistically expect to read (if I were writing it, I would use the slightly more compact `RomanNumeral` function):

Here’s another example:

I might have written the code a little differently, but this is again something very readable:

It’s often possible to use a pidgin of Wolfram Language and English to say what you want:

Here’s an example where ChatGPT is again successfully constructing Wolfram Language—and conveniently shows it to us so we can confirm that, yes, it’s actually computing the right thing:

And, by the way, to make this work it’s critical that the Wolfram Language is in a sense “self-contained”. This piece of code is just standard generic Wolfram Language code; it doesn’t depend on anything outside, and if you wanted to, you could look up the definitions of everything that appears in it in the Wolfram Language documentation.

OK, one more example:

Obviously ChatGPT had trouble here. But—as it suggested—we can just run the code it generated, directly in a notebook. And because Wolfram Language is symbolic, we can explicitly see results at each step:

So close! Let’s help it a bit, telling it we need an actual list of European countries:

And there’s the result! Or at least, a result. Because when we look at this computation, it might not be quite what we want. For example, we might want to pick out multiple dominant colors per country, and see if any of them are close to purple. But the whole Wolfram Language setup here makes it easy for us to “collaborate with the AI” to figure out what we want, and what to do.

So far we’ve basically been starting with natural language, and building up Wolfram Language code. But we can also start with pseudocode, or code in some low-level programming language. And ChatGPT tends to do a remarkably good job of taking such things and producing well-written Wolfram Language code from them. The code isn’t always exactly right. But one can always run it (e.g. with the Wolfram plugin) and see what it does, potentially (courtesy of the symbolic character of Wolfram Language) line by line. And the point is that the high-level computational language nature of the Wolfram Language tends to allow the code to be sufficiently clear and (at least locally) simple that (particularly after seeing it run) one can readily understand what it’s doing—and then potentially iterate back and forth on it with the AI.

When what one’s trying to do is sufficiently simple, it’s often realistic to specify it—at least if one does it in stages—purely with natural language, using Wolfram Language “just” as a way to see what one’s got, and to actually be able to run it. But it’s when things get more complicated that Wolfram Language really comes into its own—providing what’s basically the only viable human-understandable-yet-precise representation of what one wants.

And when I was writing my book *An Elementary Introduction to the Wolfram Language* this became particularly obvious. At the beginning of the book I was easily able to make up exercises where I described what was wanted in English. But as things started getting more complicated, this became more and more difficult. As a “fluent” user of Wolfram Language I usually immediately knew how to express what I wanted in Wolfram Language. But to describe it purely in English required something increasingly involved and complicated, that read like legalese.

But, OK, so you specify something using Wolfram Language. Then one of the remarkable things ChatGPT is often able to do is to recast your Wolfram Language code so that it’s easier to read. It doesn’t (yet) always get it right. But it’s interesting to see it make different tradeoffs from a human writer of Wolfram Language code. For example, humans tend to find it difficult to come up with good names for things, making it usually better (or at least less confusing) to avoid names by having sequences of nested functions. But ChatGPT, with its command of language and meaning, has a fairly easy time making up reasonable names. And although it’s something I, for one, did not expect, I think using these names, and “spreading out the action”, can often make Wolfram Language code even easier to read than it was before, and indeed read very much like a formalized analog of natural language—that we can understand as easily as natural language, but that has a precise meaning, and can actually be run to generate computational results.

If you “know what computation you want to do”, and you can describe it in a short piece of natural language, then Wolfram|Alpha is set up to directly do the computation, and present the results in a way that is “visually absorbable” as easily as possible. But what if you want to describe the result in a narrative, textual essay? Wolfram|Alpha has never been set up to do that. But ChatGPT is.

Here’s a result from Wolfram|Alpha:

And here within ChatGPT we’re asking for this same Wolfram|Alpha result, but then telling ChatGPT to “make an essay out of it”:

Another “old chestnut” for Wolfram|Alpha is math word problems. Given a “crisply presented” math problem, Wolfram|Alpha is likely to do very well at solving it. But what about a “woolly” word problem? Well, ChatGPT is pretty good at “unraveling” such things, and turning them into “crisp math questions”—which then the Wolfram plugin can now solve. Here’s an example:

Here’s a slightly more complicated case, including a nice use of “common sense” to recognize that the number of turkeys cannot be negative:

Beyond math word problems, another “old chestnut” now addressed by

So how can you get involved in what promises to be an exciting period of rapid technological—and conceptual—growth? The first thing is just to explore

Find examples. Share them. Try to identify successful patterns of usage. And, most of all, try to find workflows that deliver the highest value. Those workflows could be quite elaborate. But they could also be quite simple—cases where once one sees what can be done, there’s an immediate “aha”.

How can you best implement a workflow? Well, we’re trying to work out the best workflows for that. Within Wolfram Language we’re setting up flexible ways to call on things like ChatGPT, both purely programmatically, and in the context of the notebook interface.

But what about from the ChatGPT side? Wolfram Language has a very open architecture, where a user can add or modify pretty much whatever they want. But how can you use this from ChatGPT? One thing is just to tell ChatGPT to include some specific piece of “initial” Wolfram Language code (maybe together with documentation)—then use something like the pidgin above to talk to ChatGPT about the functions or other things you’ve defined in that initial code.

We’re planning to build increasingly streamlined tools for handling and sharing Wolfram Language code for use through ChatGPT. But one approach that already works is to submit functions for publication in the Wolfram Function Repository, then—once they’re published—refer to these functions in your conversation with ChatGPT.

OK, but what about within ChatGPT itself? What kind of prompt engineering should you do to best interact with the Wolfram plugin? Well, we don’t know yet. It’s something that has to be explored—in effect as an exercise in AI education or AI psychology. A typical approach is to give some “pre-prompts” earlier in your ChatGPT session, then hope it’s “still paying attention” to those later on. (And, yes, it has a limited “attention span”, so sometimes things have to get repeated.)

We’ve tried to give an overall prompt to tell ChatGPT basically how to use the Wolfram plugin—and we fully expect this prompt to evolve rapidly, as we learn more, and as the ChatGPT LLM is updated. But you can add your own general pre-prompts, saying things like “When using Wolfram always try to include a picture” or “Use SI units” or “Avoid using complex numbers if possible”.

You can also try setting up a pre-prompt that essentially “defines a function” right in ChatGPT—something like: “If I give you an input consisting of a number, you are to use Wolfram to draw a polygon with that number of sides”. Or, more directly, “If I give you an input consisting of numbers you are to apply the following Wolfram function to that input …”, then give some explicit Wolfram Language code.

But these are very early days, and no doubt there’ll be other powerful mechanisms discovered for “programming”

Even a week ago it wasn’t clear what

ChatGPT is basically a very large neural network, trained to follow the “statistical” patterns of text it’s seen on the web, etc. The concept of neural networks—in a form surprisingly close to what’s used in ChatGPT—originated all the way back in the 1940s. But after some enthusiasm in the 1950s, interest waned. There was a resurgence in the early 1980s (and indeed I myself first looked at neural nets then). But it wasn’t until 2012 that serious excitement began to build about what might be possible with neural nets. And now a decade later—in a development whose success came as a big surprise even to those involved—we have ChatGPT.

Rather separate from the “statistical” tradition of neural nets is the “symbolic” tradition for AI. And in a sense that tradition arose as an extension of the process of formalization developed for mathematics (and mathematical logic), particularly near the beginning of the twentieth century. But what was critical about it was that it aligned well not only with abstract concepts of computation, but also with actual digital computers of the kind that started to appear in the 1950s.

The successes in what could really be considered “AI” were for a long time at best spotty. But all the while, the general concept of computation was showing tremendous and growing success. But how might “computation” be related to ways people think about things? For me, a crucial development was my idea at the beginning of the 1980s (building on earlier formalism from mathematical logic) that transformation rules for symbolic expressions might be a good way to represent computations at what amounts to a “human” level.

At the time my main focus was on mathematical and technical computation, but I soon began to wonder whether similar ideas might be applicable to “general AI”. I suspected something like neural nets might have a role to play, but at the time I only figured out a bit about what would be needed—and not how to achieve it. Meanwhile, the core idea of transformation rules for symbolic expressions became the foundation for what’s now the Wolfram Language—and made possible the decades-long process of developing the full-scale computational language that we have today.

Starting in the 1960s there’d been efforts among AI researchers to develop systems that could “understand natural language”, and “represent knowledge” and answer questions from it. Some of what was done turned into less ambitious but practical applications. But generally success was elusive. Meanwhile, as a result of what amounted to a philosophical conclusion of basic science I’d done in the 1990s, I decided around 2005 to make an attempt to build a general “computational knowledge engine” that could broadly answer factual and computational questions posed in natural language. It wasn’t obvious that such a system could be built, but we discovered that—with our underlying computational language, and with a lot of work—it could. And in 2009 we were able to release Wolfram|Alpha.

And in a sense what made Wolfram|Alpha possible was that internally it had a clear, formal way to represent things in the world, and to compute about them. For us, “understanding natural language” wasn’t something abstract; it was the concrete process of translating natural language to structured computational language.

Another part was assembling all the data, methods, models and algorithms needed to “know about” and “compute about” the world. And while we’ve greatly automated this, we’ve still always found that to ultimately “get things right” there’s no choice but to have actual human experts involved. And while there’s a little of what one might think of as “statistical AI” in the natural language understanding system of Wolfram|Alpha, the vast majority of Wolfram|Alpha—and Wolfram Language—operates in a hard, symbolic way that’s at least reminiscent of the tradition of symbolic AI. (That’s not to say that individual functions in Wolfram Language don’t use machine learning and statistical techniques; in recent years more and more do, and the Wolfram Language also has a whole built-in framework for doing machine learning.)

As I’ve discussed elsewhere, what seems to have emerged is that “statistical AI”, and particularly neural nets, are well suited for tasks that we humans “do quickly”, including—as we learn from ChatGPT—natural language and the “thinking” that underlies it. But the symbolic and in a sense “more rigidly computational” approach is what’s needed when one’s building larger “conceptual” or computational “towers”—which is what happens in math, exact science, and now all the “computational X” fields.

And now

When we were first building Wolfram|Alpha we thought that perhaps to get useful results we’d have no choice but to engage in a conversation with the user. But we discovered that if we immediately generated rich, “visually scannable” results, we only needed a simple “Assumptions” or “Parameters” interaction—at least for the kind of information and computation seeking we expected of our users. (In Wolfram|Alpha Notebook Edition we nevertheless have a powerful example of how multistep computation can be done with natural language.)

Back in 2010 we were already experimenting with generating not just the Wolfram Language code of typical Wolfram|Alpha queries from natural language, but also “whole programs”. At the time, however—without modern LLM technology—that didn’t get all that far. But what we discovered was that—in the context of the symbolic structure of the Wolfram Language—even having small fragments of what amounts to code be generated by natural language was extremely useful. And indeed I, for example, use the `ctrl``=` mechanism in Wolfram Notebooks countless times almost every day, for example to construct symbolic entities or quantities from natural language. We don’t yet know quite what the modern “LLM-enabled” version of this will be, but it’s likely to involve the rich human-AI “collaboration” that we discussed above, and that we can begin to see in action for the first time in

I see what’s happening now as a historic moment. For well over half a century the statistical and symbolic approaches to what we might call “AI” evolved largely separately. But now, in

Just a few months ago writing an original essay seemed like something only a human could do. But then ChatGPT burst onto the scene. And suddenly we realized that an AI could write a passable human-like essay. So now it’s natural to wonder: How far will this go? What will AIs be able to do? And how will we humans fit in?

My goal here is to explore some of the science, technology—and philosophy—of what we can expect from AIs. I should say at the outset that this is a subject fraught with both intellectual and practical difficulty. And all I’ll be able to do here is give a snapshot of my current thinking—which will inevitably be incomplete—not least because, as I’ll discuss, trying to predict how history in an area like this will unfold is something that runs straight into an issue of basic science: the phenomenon of computational irreducibility.

But let’s start off by talking about that particularly dramatic example of AI that’s just arrived on the scene: ChatGPT. So what is ChatGPT? Ultimately, it’s a computational system for generating text that’s been set up to follow the patterns defined by human-written text from billions of webpages, millions of books, etc. Give it a textual prompt and it’ll continue in a way that’s somehow typical of what it’s seen us humans write.

The results (which ultimately rely on all sorts of specific engineering) are remarkably “human like”. And what makes this work is that whenever ChatGPT has to “extrapolate” beyond anything it’s explicitly seen from us humans it does so in ways that seem similar to what we as humans might do.

Inside ChatGPT is something that’s actually computationally probably quite similar to a brain—with millions of simple elements (“neurons”) forming a “neural net” with billions of connections that have been “tweaked” through a progressive process of training until they successfully reproduce the patterns of human-written text seen on all those webpages, etc. Even without training the neural net would still produce some kind of text. But the key point is that it won’t be text that we humans consider meaningful. To get such text we need to build on all that “human context” defined by the webpages and other materials we humans have written. The “raw computational system” will just do “raw computation”; to get something aligned with us humans requires leveraging the detailed human history captured by all those pages on the web, etc.

But so what do we get in the end? Well, it’s text that basically reads like it was written by a human. In the past we might have thought that human language was somehow a uniquely human thing to produce. But now we’ve got an AI doing it. So what’s left for us humans? Well, somewhere things have got to get started: in the case of text, there’s got to be a prompt specified that tells the AI “what direction to go in”. And this is the kind of thing we’ll see over and over again. Given a defined “goal”, an AI can automatically work towards achieving it. But it ultimately takes something beyond the raw computational system of the AI to define what us humans would consider a meaningful goal. And that’s where we humans come in.

What does this mean at a practical, everyday level? Typically we use ChatGPT by telling it—using text—what we basically want. And then it’ll fill in a whole essay’s worth of text talking about it. We can think of this interaction as corresponding to a kind of “linguistic user interface” (that we might dub a “LUI”). In a graphical user interface (GUI) there’s core content that’s being rendered (and input) through some potentially elaborate graphical presentation. In the LUI provided by ChatGPT there’s instead core content that’s being rendered (and input) through a textual (“linguistic”) presentation.

You might jot down a few “bullet points”. And in their raw form someone else would probably have a hard time understanding them. But through the LUI provided by ChatGPT those bullet points can be turned into an “essay” that can be generally understood—because it’s based on the “shared context” defined by everything from the billions of webpages, etc. on which ChatGPT has been trained.

There’s something about this that might seem rather unnerving. In the past, if you saw a custom-written essay you’d reasonably be able to conclude that a certain irreducible human effort was spent in producing it. But with ChatGPT this is no longer true. Turning things into essays is now “free” and automated. “Essayification” is no longer evidence of human effort.

Of course, it’s hardly the first time there’s been a development like this. Back when I was a kid, for example, seeing that a document had been typeset was basically evidence that someone had gone to the considerable effort of printing it on printing press. But then came desktop publishing, and it became basically free to make any document be elaborately typeset.

And in a longer view, this kind of thing is basically a constant trend in history: what once took human effort eventually becomes automated and “free to do” through technology. There’s a direct analog of this in the realm of ideas: that with time higher and higher levels of abstraction are developed, that subsume what were formerly laborious details and specifics.

Will this end? Will we eventually have automated everything? Discovered everything? Invented everything? At some level, we now know that the answer is a resounding no. Because one of the consequences of the phenomenon of computational irreducibility is that there’ll always be more computations to do—that can’t in the end be reduced by any finite amount of automation, discovery or invention.

Ultimately, though, this will be a more subtle story. Because while there may always be more computations to do, it could still be that we as humans don’t care about them. And that somehow everything we care about can successfully be automated—say by AIs—leaving “nothing more for us to do”.

Untangling this issue will be at the heart of questions about how we fit into the AI future. And in what follows we’ll see over and over again that what might at first essentially seem like practical matters of technology quickly get enmeshed with deep questions of science and philosophy.

I’ve already mentioned computational irreducibility a couple of times. And it turns out that this is part of a circle of rather deep—and at first surprising—ideas that I believe are crucial to thinking about the AI future.

Most of our existing intuition about “machinery” and “automation” comes from a kind of “clockwork” view of engineering—in which we specifically build systems component by component to achieve objectives we want. And it’s the same with most software: we write it line by line to specifically do—step by step—whatever it is we want. And we expect that if we want our machinery—or software—to do complex things then the underlying structure of the machinery or software must somehow be correspondingly complex.

So when I started exploring the whole computational universe of possible programs in the early 1980s it was a big surprise to discover that things work quite differently there. And indeed even tiny programs—that effectively just apply very simple rules repeatedly—can generate great complexity. In our usual practice of engineering we haven’t seen this, because we’ve always specifically picked programs (or other structures) where we can readily foresee how they’ll behave, so that we can explicitly set them up to do what we want. But out in the computational universe it’s very common to see programs that just “intrinsically generate” great complexity, without us ever having to explicitly “put it in”.

And having discovered this, we realize that there’s actually a big example that’s been around forever: the natural world. And indeed it increasingly seems as if the “secret” that nature uses to make the complexity it so often shows is exactly to operate according to the rules of simple programs. (For about three centuries it seemed as if mathematical equations were the ultimate way to describe the natural world—but in the past few decades, and particularly poignantly with our recent Physics Project, it’s become clear that simple programs are in general a more powerful approach.)

How does all this relate to technology? Well, technology is about taking what’s out there in the world, and harnessing it for human purposes. And there’s a fundamental tradeoff here. There may be some system out in nature that does amazingly complex things. But the question is whether we can “slice off” certain particular things that we humans happen to find useful. A donkey has all sorts of complex things going on inside. But at some point it was discovered that we can use it “technologically” to do the rather simple thing of pulling a cart.

And when it comes to programs out in the computational universe it’s extremely common to see ones that do amazingly complex things. But the question is whether we can find some aspect of those things that’s useful to us. Maybe the program is good at making pseudorandomness. Or distributedly determining consensus. Or maybe it’s just doing its complex thing, and we don’t yet know any “human purpose” that this achieves.

One of the notable features of a system like ChatGPT is that it isn’t constructed in an “understand-every-step” traditional engineering way. Instead one basically just starts from a “raw computational system” (in the case of ChatGPT, a neural net), then progressively tweaks it until its behavior aligns with the “human-relevant” examples one has. And this alignment is what makes the system “technologically useful”—to us humans.

Underneath, though, it’s still a computational system, with all the potential “wildness” that implies. And free from the “technological objective” of “human-relevant alignment” the system might do all sorts of sophisticated things. But they might not be things that (at least at this time in history) we care about. Even though some putative alien (or our future selves) might.

OK, but let’s come back to the “raw computation” side of things. There’s something very different about computation from all other kinds of “mechanisms” we’ve seen before. We might have a cart that can move forward. And we might have a stapler that can put staples in things. But carts and staplers do very different things; there’s no equivalence between them. But for computational systems (at least ones that don’t just always behave in obviously simple ways) there’s my Principle of Computational Equivalence—which implies that all these systems are in a sense equivalent in the kinds of computations they can do.

This equivalence has many consequences. One of them is that one can expect to make something equally computationally sophisticated out of all sorts of different kinds of things—whether brain tissue or electronics, or some system in nature. And this is effectively where computational irreducibility comes from.

One might think that given, say, some computational system based on a simple program it would always be possible for us—with our sophisticated brains, mathematics, computers, etc.—to “jump ahead” and figure out what the system will do before it’s gone through all the steps to do it. But the Principle of Computational Equivalence implies that this won’t in general be possible—because the system itself can be as computationally sophisticated as our brains, mathematics, computers, etc. are. So this means that the system will be computationally irreducible: the only way to find out what it does is effectively just to go through the same whole computational process that it does.

There’s a prevailing impression that science will always eventually be able do better than this: that it’ll be able to make “predictions” that allow us to work out what will happen without having to trace through each step. And indeed over the past three centuries there’s been lots of success in doing this, mainly by using mathematical equations. But ultimately it turns out that this has only been possible because science has ended up concentrating on particular systems where these methods work (and then these systems have been used for engineering). But the reality is that many systems show computational irreducibility. And in the phenomenon of computational irreducibility science is in effect “deriving its own limitedness”.

Contrary to traditional intuition, try as we might, in many systems we’ll never be able find “formulas” (or other “shortcuts”) that describe what’s going to happen in the systems—because the systems are simply computationally irreducible. And, yes, this represents a limitation on science, and on knowledge in general. But while at first this might seem like a bad thing, there’s also something fundamentally satisfying about it. Because if everything were computationally reducible, we could always “jump ahead” and find out what will happen in the end, say in our lives. But computational irreducibility implies that in general we can’t do that—so that in some sense “something irreducible is being achieved” by the passage of time.

There are a great many consequences of computational irreducibility. Some—that I have particularly explored recently—are in the domain of basic science (for example, establishing core laws of physics as we perceive them from the interplay of computational irreducibility and our computational limitations as observers). But computational irreducibility is also central in thinking about the AI future—and in fact I increasingly feel that it adds the single most important intellectual element needed to make sense of many of the most important questions about the potential roles of AIs and humans in the future.

For example, from our traditional experience with engineering we’re used to the idea that to find out why something happened in a particular way we can just “look inside” a machine or program and “see what it did”. But when there’s computational irreducibility, that won’t work. Yes, we could “look inside” and see, say, a few steps. But computational irreducibility implies that to find out what happened, we’d have to trace through all the steps. We can’t expect to find a “simple human narrative” that “says why something happened”.

But having said this, one feature of computational irreducibility is that within any computationally irreducible systems there must always be (ultimately, infinitely many) “pockets of computational reducibility” to be found. So for example, even though we can’t say in general what will happen, we’ll always be able to identify specific features that we can predict. (“The leftmost cell will always be black”, etc.) And as we’ll discuss later we can potentially think of technological (as well as scientific) progress as being intimately tied to the discovery of these “pockets of reducibility”. And in effect the existence of infinitely many such pockets is the reason that “there’ll always be inventions and discoveries to be made”.

Another consequence of computational irreducibility has to do with trying to ensure things about the behavior of a system. Let’s say one wants to set up an AI so it’ll “never do anything bad”. One might imagine that one could just come up with particular rules that ensure this. But as soon as the behavior of the system (or its environment) is computationally irreducible one will never be able to guarantee what will happen in the system. Yes, there may be particular computationally reducible features one can be sure about. But in general computational irreducibility implies that there’ll always be a “possibility of surprise” or the potential for “unintended consequences”. And the only way to systematically avoid this is to make the system not computationally irreducible—which means it can’t make use of the full power of computation.

We humans like to feel special, and feel as if there’s something “fundamentally unique” about us. Five centuries ago we thought we lived at the center of the universe. Now we just tend to think that there’s something about our intellectual capabilities that’s fundamentally unique and beyond anything else. But the progress of AI—and things like ChatGPT—keep on giving us more and more evidence that that’s not the case. And indeed my Principle of Computational Equivalence says something even more extreme: that at a fundamental computational level there’s just nothing fundamentally special about us at all—and that in fact we’re computationally just equivalent to lots of systems in nature, and even to simple programs.

This broad equivalence is important in being able to make very general scientific statements (like the existence of computational irreducibility). But it also highlights how significant our specifics—our particular history, biology, etc.—are. It’s very much like with ChatGPT. We can have a generic (untrained) neural net with the same structure as ChatGPT, that can do certain “raw computation”. But what makes ChatGPT interesting—at least to us—is that it’s been trained with the “human specifics” described on billions of webpages, etc. In other words, for both us and ChatGPT there’s nothing computationally “generally special”. But there is something “specifically special”—and it’s the particular history we’ve had, particular knowledge our civilization has accumulated, etc.

There’s a curious analogy here to our physical place in the universe. There’s a certain uniformity to the universe, which means there’s nothing “generally special” about our physical location. But at least to us there’s still something “specifically special” about it, because it’s only here that we have our particular planet, etc. At a deeper level, ideas based on our Physics Project have led to the concept of the ruliad: the unique object that is the entangled limit of all possible computational processes. And we can then view our whole experience as “observers of the universe” as consisting of sampling the ruliad at a particular place.

It’s a bit abstract (and a long story, which I won’t go into in any detail here), but we can think of different possible observers as being both at different places in physical space, and at different places in rulial space—giving them different “points of view” about what happens in the universe. Human minds are in effect concentrated in a particular region of physical space (mostly on this planet) and a particular region of rulial space. And in rulial space different human minds—with their different experiences and thus different ways of thinking about the universe—are in slightly different places. Animal minds might be fairly close in rulial space. But other computational systems (like, say, the weather, which is sometimes said to “have a mind of its own”) are further away—as putative aliens might also be.

So what about AIs? It depends what we mean by “AIs”. If we’re talking about computational systems that are set up to do “human-like things” then that means they’ll be close to us in rulial space. But insofar as “an AI” is an arbitrary computational system it can be anywhere in rulial space, and it can do anything that’s computationally possible—which is far broader than what we humans can do, or even think about. (As we’ll talk about later, as our intellectual paradigms—and ways of observing things—expand, the region of rulial space in which we humans operate will correspondingly expand.)

But, OK, just how “general” are the computations that we humans (and the AIs that follow us) are doing? We don’t know enough about the brain to be sure. But if we look at artificial neural net systems—like ChatGPT—we can potentially get some sense. And in fact the computations really don’t seem to be that “general”. In most neural net systems data that’s given as input just “ripples once through the system” to produce output. It’s not like in a computational system like a Turing machine where there can be arbitrary “recirculation of data”. And indeed without such “arbitrary recirculation” the computation is necessarily quite “shallow” and can’t ultimately show computational irreducibility.

It’s a bit of a technical point, but one can ask whether ChatGPT, with its “re-feeding of text produced so far” can in fact achieve arbitrary (“universal”) computation. And I suspect that in some formal sense it can (or at least a sufficiently expanded analog of it can)—though by producing an extremely verbose piece of text that for example in effect lists successive (self-delimiting) states of a Turing machine tape, and in which finding “the answer” to a computation will take a bit of effort. But—as I’ve discussed elsewhere—in practice ChatGPT is presumably almost exclusively doing “quite shallow” computation.

It’s an interesting feature of the history of practical computing that what one might consider “deep pure computations” (say in mathematics or science) were done for decades before “shallow human-like computations” became feasible. And the basic reason for this is that for “human-like computations” (like recognizing images or generating text) one needs to capture lots of “human context”, which requires having lots of “human-generated data” and the computational resources to store and process it.

And, by the way, brains also seem to specialize in fundamentally shallow computations. And to do the kind of deeper computations that allow one to take advantage of more of what’s out there in the computational universe, one has to turn to computers. As we’ve discussed, there’s plenty out in the computational universe that we humans don’t (yet) care about: we just consider it “raw computation”, that doesn’t seem to be “achieving human purposes”. But as a practical matter it’s important to make a bridge between the things we humans do care about and think about, and what’s possible in the computational universe. And in a sense that’s at the core of the project I’ve put so much effort into in the Wolfram Language of creating a full-scale computational language that describes in computational terms the things we think about, and experience in the world.

OK, people have been saying for years: “It’s nice that computers can do A and B, but only humans can do X”. What X is supposed to be has changed—and narrowed—over the years. And ChatGPT provides us with a major unexpected new example of something more that computers can do.

So what’s left? People might say: “Computers can never show creativity or originality”. But—perhaps disappointingly—that’s surprisingly easy to get, and indeed just a bit of randomness “seeding” a computation can often do a pretty good job, as we saw years ago with our WolframTones music-generation system, and as we see today with ChatGPT’s writing. People might also say: “Computers can never show emotions”. But before we had a good way to generate human language we wouldn’t really have been able to tell. And now it already works pretty well to ask ChatGPT to write “happily”, “sadly”, etc. (In their raw form emotions in both humans and other animals are presumably associated with rather simple “global variables” like neurotransmitter concentrations.)

In the past people might have said: “Computers can never show judgement”. But by now there are endless examples of machine learning systems that do well at reproducing human judgement in lots of domains. People might also say: “Computers don’t show common sense”. And by this they typically mean that in a particular situation a computer might locally give an answer, but there’s a global reason why that answer doesn’t make sense, that the computer “doesn’t notice”, but a person would.

So how does ChatGPT do on this? Not too badly. In plenty of cases it correctly recognizes that “that’s not what I’ve typically read”. But, yes, it makes mistakes. Some of them have to do with it not being able to do—purely with its neural net—even slightly “deeper”computations. (And, yes, that’s something that can often be fixed by it calling Wolfram|Alpha as a tool.) But in other cases the problem seems to be that it can’t quite connect different domains well enough.

It’s perfectly capable of doing simple (“SAT-style”) analogies. But when it comes to larger-scale ones it doesn’t manage them. My guess, though, is that it won’t take much scaling up before it starts to be able to make what seem like very impressive analogies (that most of us humans would never even be able to make)—at which point it’ll probably successfully show broader “common sense”.

But so what’s left that humans can do, and AIs can’t? There’s—almost by definition—one fundamental thing: define what we would consider goals for what to do. We’ll talk more about this later. But for now we can note that any computational system, once “set in motion”, will just follow its rules and do what it does. But what “direction should it be pointed in”? That’s something that has to come from “outside the system”.

So how does it work for us humans? Well, our goals are in effect defined by the whole web of history—both from biological evolution and from our cultural development—in which we are embedded. But ultimately the only way to truly participate in that web of history is to be part of it.

Of course, we can imagine technologically emulating every “relevant” aspect of a brain—and indeed things like the success of ChatGPT may suggest that that’s easier to do than we might have thought. But that won’t be enough. To participate in the “human web of history” (as we’ll discuss later) we’ll have to emulate other aspects of “being human”—like moving around, being mortal, etc. And, yes, if we make an “artificial human” we can expect it (by definition) to show all the features of us humans.

But while we’re still talking about AIs as—for example—“running on computers” or “being purely digital” then, at least as far as we’re concerned, they’ll have to “get their goals from outside”. One day (as we’ll discuss) there will no doubt be some kind of “civilization of AIs”—which will form its own web of history. But at this point there’s no reason to think that we’ll still be able to describe what’s going on in terms of goals that we recognize. In effect the AIs will at that point have left our domain of rulial space. And—as we’ll discuss—they’ll be operating more like the kind of systems we see in nature, where we can tell there’s computation going on, but we can’t describe it, except rather anthropomorphically, in terms of human goals and purposes.

It’s been an issue that’s been raised—with varying degrees of urgency—for centuries: with the advance of automation (and now AI), will there eventually be nothing left for humans to do? Back in the early days of our species, there was lots of hard work of hunting and gathering to do, just to survive. But at least in the developed parts of the world, that kind of work is now at best a distant historical memory.

And yet at each stage in history—at least so far—there always seem to be other kinds of work that keep people busy. But there’s a pattern that increasingly seems to repeat. Technology in some way or another enables some new occupation. And eventually that occupation becomes widespread, and lots of people do it. But then there’s a technological advance, and the occupation gets automated—and people aren’t needed to do it anymore. But now there’s a new level of technology, that enables new occupations. And the cycle continues.

A century ago the increasingly widespread use of telephones meant that more and more people worked as switchboard operators. But then telephone switching was automated—and those switchboard operators weren’t needed anymore. But with automated switching there could be huge development of telecommunications infrastructure, opening up all sorts of new types of jobs, that in aggregate employ vastly more people than were ever switchboard operators.

Something somewhat similar happened with accounting clerks. Before there were computers, one needed to have people laboriously tallying up numbers. But with computers, that was all automated away. But with that automation came the ability to do more complex financial computations—which allowed for more complex financial transactions, more complex regulations, etc., which in turn led to all sorts of new types of jobs.

And across a whole range of industries, it’s been the same kind of story. Automation obsoletes some jobs, but enables others. There’s quite often a gap in time, and a change in the skills that are needed. But at least so far there always seems to have been a broad frontier of jobs that have been made possible—but haven’t yet been automated.

Will this at some point end? Will there come a time when everything we humans want (or at least need) is delivered automatically? Well, of course, that depends on what we want, and whether, for example, that evolves with what technology has made possible. But could we just decide that “enough is enough”; let’s stop here, and just let everything be automated?

I don’t think so. And the reason is ultimately because of computational irreducibility. We try to get the world to be “just so”, say set up so we’re “predictably comfortable”. Well, the problem is that there’s inevitably computational irreducibility in the way things develop—not just in nature, but in things like societal dynamics too. And that means that things won’t stay “just so”. There’ll always be something unpredictable that happens; something that the automation doesn’t cover.

At first we humans might just say “we don’t care about that”. But in time computational irreducibility will affect everything. So if there’s anything at all we care about (including, for example, not going extinct), we’ll eventually have to do something—and go beyond whatever automation was already set up.

It’s easy to find practical examples. We might think that when computers and people are all connected in a seamless automated network, there’d be nothing more to do. But what about the “unintended consequence” of computer security issues? What might have seemed like a case where “technology finished things” quickly creates a new kind of job for people to do. And at some level, computational irreducibility implies that things like this must always happen. There must always be a “frontier”. At least if there’s anything at all we want to preserve (like not going extinct).

But let’s come back to the situation here and now with AI. ChatGPT just automated all sorts of text-related tasks. It used to take lots of effort—and people—to write customized reports, letters, etc. But (at least so long as one’s dealing with situations where one doesn’t need 100% “correctness”) ChatGPT just automated a lot of that, so people aren’t needed for it anymore. But what will this mean? Well, it means that there’ll be a lot more customized reports, letters, etc. that can be produced. And that will lead to new kinds of jobs—managing, analyzing, validating etc. all that mass-customized text. Not to mention the need for prompt engineers (a job category that just didn’t exist until a few months ago), and what amount to AI wranglers, AI psychologists, etc.

But let’s talk about today’s “frontier” of jobs that haven’t been “automated away”. There’s one category that in many ways seems surprising to still be “with us”: jobs that involve lots of mechanical manipulation, like construction, fulfillment, food preparation, etc. But there’s a missing piece of technology here: there isn’t yet good general-purpose robotics (as there is general-purpose computing), and we humans still have the edge in dexterity, mechanical adaptability, etc. But I’m quite sure that in time—and perhaps quite suddenly—the necessary technology will be developed (and, yes, I have ideas about how to do it). And this will mean that most of today’s “mechanical manipulation” jobs will be “automated away”—and won’t need people to do them.

But then, just as in our other examples, this will mean that mechanical manipulation will become much easier and cheaper to do, and more of it will be done. Houses might routinely be built and dismantled. Products might routinely be picked up from wherever they’ve ended up, and redistributed. Vastly more ornate “food constructions” might become the norm. And each of these things—and many more—will open up new jobs.

But will every job that exists in the world today “on the frontier” eventually be automated? What about jobs where it seems like a large part of the value is just “having a human be there”? Jobs like flying a plane where one wants the “commitment” of the pilot being there in the plane. Caregiver jobs where one wants the “connection” of a human being there. Sales or education jobs where one wants “human persuasion” or “human encouragement”. Today one might think “only a human can make one feel that way”. But that’s typically based on the way the job is done now. And maybe there’ll be different ways found that allow the essence of the task to be automated, almost inevitably opening up new tasks to be done.

For example, something that in the past needed “human persuasion” might be “automated” by something like gamification—but then more of it can be done, with new needs for design, analytics, management, etc.

We’ve been talking about “jobs”. And that term immediately brings to mind wages, economics, etc. And, yes, plenty of what people do (at least in the world as it is today) is driven by issues of economics. But plenty is also not. There are things we “just want to do”—as a “social matter”, for “entertainment”, for “personal satisfaction”, etc.

Why do we want to do these things? Some of it seems intrinsic to our biological nature. Some of it seems determined by the “cultural environment” in which we find ourselves. Why might one walk on a treadmill? In today’s world one might explain that it’s good for health, lifespan, etc. But a few centuries ago, without modern scientific understanding, and with a different view of the significance of life and death, that explanation really wouldn’t work.

What drives such changes in our view of what we “want to do”, or “should do”? Some seems to be driven by the pure “dynamics of society”, presumably with its own computational irreducibility. But some has to do with our ways of interacting with the world—both the increasing automation delivered by the advance of technology, and the increasing abstraction delivered by the advance of knowledge.

And there seem to be similar “cycles” seen here as in the kinds of things we consider to be “occupations” or “jobs”. For a while something is hard to do, and serves as a good “pastime”. But then it gets “too easy” (“everybody now knows how to win at game X”, etc.), and something at a “higher level” takes its place.

About our “base” biologically driven motivations it doesn’t seem like anything has really changed in the course of human history. But there are certainly technological developments that could have an effect in the future. Effective human immortality, for example, would change many aspects of our motivation structure. As would things like the ability to implant memories or, for that matter, implant motivations.

For now, there’s a certain element of what we want to do that’s “anchored” by our biological nature. But at some point we’ll surely be able to emulate with a computer at least the essence of what our brains are doing (and indeed the success of things like ChatGPT makes it seems like the moment when that will happen is closer at hand than we might have thought). And at that point we’ll have the possibility of what amount to “disembodied human souls”.

To us today it’s very hard to imagine what the “motivations” of such a “disembodied soul” might be. Looked at “from the outside” we might “see the soul” doing things that “don’t make much sense” to us. But it’s like asking what someone from a thousand years ago would think about many of our activities today. These activities make sense to us today because we’re embedded in our whole “current framework”. But without that framework they don’t make sense. And so it will be for the “disembodied soul”. To us, what it does may not make sense. But to it, with its “current framework”, it will.

Could we “learn how to make sense of it”? There’s likely to be a certain barrier of computational irreducibility: in effect the only way to “understand the soul of the future” is to retrace its steps to get to where it is. So from our vantage point today, we’re separated by a certain “irreducible distance”, in effect in rulial space.

But could there be some science of the future that will at least tell us general things about how such “souls” behave? Even when there’s computational irreducibility we know that there will always be pockets of computational reducibility—and thus features of behavior that are predictable. But will those features be “interesting”, say from our vantage point today? Maybe some of them will be. Maybe they’ll show us some kind of metapsychology of souls. But inevitably they can only go so far. Because in order for those souls to even experience the passage of time there has to be computational irreducibility. If too much of what happens is too predictable, it’s as if “nothing is happening”—or at least nothing “meaningful”.

And, yes, this is all tied up with questions about “free will”. Even when there’s a disembodied soul that’s operating according to some completely deterministic underlying program, computational irreducibility means its behavior can still “seem free”—because nothing can “outrun it” and say what it’s going to be. And the “inner experience” of the disembodied soul can be significant: it’s “intrinsically defining its future”, not just “having its future defined for it”.

One might have assumed that once everything is just “visibly operating” as “mere computation” it would necessarily be “soulless” and “meaningless”. But computational irreducibility is what breaks out of this, and what allows there to be something irreducible and “meaningful” achieved. And it’s the same phenomenon whether one’s talking about our life now in the physical universe, or a future “disembodied” computational existence. Or in other words, even if absolutely everything—even our very existence—has been “automated by computation”, that doesn’t mean we can’t have a perfectly good “inner experience” of meaningful existence.

If we look at human history—or, for that matter, the history of life on Earth—there’s a certain pervasive sense that there’s some kind of “progress” happening. But what fundamentally is this “progress”? One can view it as the process of things being done at a progressively “higher level”, so that in effect “more of what’s important” can happen with a given effort. This idea of “going to a higher level” takes many forms—but they’re all fundamentally about eliding details below, and being able to operate purely in terms of the “things one cares about”.

In technology, this shows up as automation, in which what used to take lots of detailed steps gets packaged into something that can be done “with the push of a button”. In science—and the intellectual realm in general—it shows up as abstraction, where what used to involve lots of specific details gets packaged into something that can be talked about “purely collectively”. And in biology it shows up as some structure (ribosome, cell, wing, etc.) that can be treated as a “modular unit”.

That it’s possible to “do things at a higher level” is a reflection of being able to find “pockets of computational reducibility”. And—as we mentioned above—the fact that (given underlying computational irreducibility) there are necessarily an infinite number of such pockets means that “progress can always go on forever”.

When it comes to human affairs we tend to value such progress highly, because (at least for now) we live finite lives, and insofar as we “want more to happen”, “progress” makes that possible. It’s certainly not self-evident that having more happen is “good”; one might just “want a quiet life”. But there is one constraint that in a sense originates from the deep foundations of biology.

If something doesn’t exist, then nothing can ever “happen to it”. So in biology, if one’s going to have anything “happen” with organisms, they’d better not be extinct. But the physical environment in which biological organisms exist is finite, with many resources that are finite. And given organisms with finite lives, there’s an inevitability to the process of biological evolution, and to the “competition” for resources between organisms.

Will there eventually be an “ultimate winning organism”? Well, no, there can’t be—because of computational irreducibility. There’ll in a sense always be more to explore in the computational universe—more “raw computational material for possible organisms”. And given any “fitness criterion” (like—in a Turing machine analog—“living longer before halting”) there’ll always be a way to “do better” with it.

One might still wonder, however, whether perhaps biological evolution—with its underlying process of random genetic mutation—could “get stuck” and never be able to discover some “way to do better”. And indeed simple models of evolution might give one the intuition that this would happen. But actual evolution seems more like deep learning with a large neural net—where one’s effectively operating in an extremely high-dimensional space where there’s typically always a “way to get there from here”, at least given enough time.

But, OK, so from our history of biological evolution there’s a certain built-in sense of “competition for scarce resources”. And this sense of competition has (so far) also carried over to human affairs. And indeed it’s the basic driver for most of the processes of economics.

But what if resources aren’t “scarce” anymore? What if progress—in the form of automation, or AI—makes it easy to “get anything one wants”? We might imagine robots building everything, AIs figuring everything out, etc. But there are still things that are inevitably scarce. There’s only so much real estate. Only one thing can be “the first ___”. And, in the end, if we have finite lives, we only have so much time.

Still, the more efficient—or high level—the things we do (or have) are, the more we’ll be able to get done in the time we have. And it seems as if what we perceive as “economic value” is intimately connected with “making things higher level”. A finished phone is “worth more” than its raw materials. An organization is “worth more” than its separate parts. But what if we could have “infinite automation”? Then in a sense there’d be “infinite economic value everywhere”, and one might imagine there’d be “no competition left”.

But once again computational irreducibility stands in the way. Because it tells us there’ll never be “infinite automation”, just as there’ll never be an ultimate winning biological organism. There’ll always be “more to explore” in the computational universe, and different paths to follow.

What will this look like in practice? Presumably it’ll lead to all sorts of diversity. So that, for example, a chart of “what the components of an economy are” will become more and more fragmented; it won’t just be “the single winning economic activity is ___”.

There is one potential wrinkle in this picture of unending progress. What if nobody cares? What if the innovations and discoveries just don’t matter, say to us humans? And, yes, there is of course plenty in the world that at any given time in history we don’t care about. That piece of silicon we’ve been able to pick out? It’s just part of a rock. Well, until we start making microprocessors out of it.

But as we’ve discussed, as soon as we’re “operating at some level of abstraction” computational irreducibility makes it inevitable that we’ll eventually be exposed to things that “require going beyond that level”.

But then—critically—there will be choices. There will be different paths to explore (or “mine”) in the computational universe—in the end infinitely many of them. And whatever the computational resources of AIs etc. might be, they’ll never be able to explore all of them. So something—or someone—will have to make a choice of which ones to take.

Given a particular set of things one cares about at a particular point, one might successfully be able to automate all of them. But computational irreducibility implies there will always be a “frontier”, where choices have to be made. And there’s no “right answer”; no “theoretically derivable” conclusion. Instead, if we humans are involved, this is where we get to define what’s going to happen.

How will we do that? Well, ultimately it’ll be based on our history—biological, cultural, etc. We’ll get to use all that irreducible computation that went into getting us to where we are to define what to do next. In a sense it’ll be something that goes “through us”, and that uses what we are. It’s the place where—even when there’s automation all around—there’s still always something us humans can “meaningfully” do.

Let’s say we want an AI (or any computational system) to do a particular thing. We might think we could just set up its rules (or “program it”) to do that thing. And indeed for certain kinds of tasks that works just fine. But the deeper the use we make of computation, the more we’re going to run into computational irreducibility, and the less we’ll be able to know how to set up particular rules to achieve what we want.

And then, of course, there’s the question of defining what “we want” in the first place. Yes, we could have specific rules that say what particular pattern of bits should occur at a particular point in a computation. But that probably won’t have much to do with the kind of overall “human-level” objective that we typically care about. And indeed for any objective we can even reasonably define, we’d better be able to coherently “form a thought” about it. Or, in effect, we’d better have some “human-level narrative” to describe it.

But how can we represent such a narrative? Well, we have natural language—probably the single most important innovation in the history of our species. And what natural language fundamentally does is to allow us to talk about things at a “human level”. It’s made of words that we can think of as representing “human-level packets of meaning”. And so, for example, the word “chair” represents the human-level concept of a chair. It’s not referring to some particular arrangement of atoms. Instead, it’s referring to any arrangement of atoms that we can usefully conflate into the single human-level concept of a chair, and from which we can deduce things like the fact that we can expect to sit on it, etc.

So, OK, when we’re “talking to an AI” can we expect to just say what we want using natural language? We can definitely get a certain distance—and indeed ChatGPT helps us get further than ever before. But as we try to make things more precise we run into trouble, and the language we need rapidly becomes increasingly ornate, as in the “legalese” of complex legal documents. So what can we do? If we’re going to keep things at the level of “human thoughts” we can’t “reach down” into all the computational details. But yet we want a precise definition of how what we might say can be implemented in terms of those computational details.

Well, there’s a way to deal with this, and it’s one that I’ve personally devoted many decades to: it’s the idea of computational language. When we think about programming languages, they’re things that operate solely at the level of computational details, defining in more or less the native terms of a computer what the computer should do. But the point of a true computational language (and, yes, in the world today the Wolfram Language is the sole example) is to do something different: to define a precise way of talking in computational terms about things in the world (whether concretely countries or minerals, or abstractly computational or mathematical structures).

Out in the computational universe, there’s immense diversity in the “raw computation” that can happen. But there’s only a thin sliver of it that we humans (at least currently) care about and think about. And we can view computational language as defining a bridge between the things we think about and what’s computationally possible. The functions in our computational language (7000 or so of them in the Wolfram Language) are in effect like words in a human language—but now they have a precise grounding in the “bedrock” of explicit computation. And the point is to design the computational language so it’s convenient for us humans to think and express ourselves in (like a vastly expanded analog of mathematical notation), but so it can also be precisely implemented in practice on a computer.

Given a piece of natural language it’s often possible to give a precise, computational interpretation of it—in computational language. And indeed this is exactly what happens in Wolfram|Alpha. Give a piece of natural language and the Wolfram|Alpha NLU system will try to find an interpretation of it as computational language. And from this interpretation, it’s then up to the Wolfram Language to do the computation that’s specified, and give back the results—and potentially synthesize natural language to express them.

As a practical matter, this setup is useful not only for humans, but also for AIs—like ChatGPT. Given a system that produces natural language, the Wolfram|Alpha NLU system can “catch” natural language it is “thrown”, and interpret it as computational language that precisely specifies a potentially irreducible computation to do.

With both natural language and computational language one’s basically “directly saying what one wants”. But an alternative approach—more aligned with machine learning—is just to give examples, and (implicitly or explicitly) say “follow these”. Inevitably there has to be some underlying model for how to do that following—typically in practice just defined by “what a neural net with a certain architecture will do”. But will the result be “right”? Well, the result will be whatever the neural net gives. But typically we’ll tend to consider it “right” if it’s somehow consistent with what we humans would have concluded. And in practice this often seems to happen, presumably because the actual architecture of our brains is somehow similar enough to the architecture of the neural nets we’re using.

But what if we want to “know for sure” what’s going to happen—or, for example, that some particular “mistake” can never be made? Well then we’re presumably thrust back into computational irreducibility, with the result that there’s no way to know, for example, whether a particular set of training examples can lead to a system that’s capable of doing (or not doing) some particular thing.

OK, but let’s say we’re setting up some AI system, and we want to make sure it “doesn’t do anything bad”. There are several levels of issues here. The first is to decide what we mean by “anything bad”. And, as we’ll discuss below, that in itself is very hard. But even if we could abstractly figure this out, how should we actually express it? We could give examples—but then the AI will inevitably have to “extrapolate” from them, in ways we can’t predict. Or we could describe what we want in computational language. It might be difficult to cover “every case” (as it is in present-day human laws, or complex contracts). But at least we as humans can read what we’re specifying. Though even in this case, there’s an issue of computational irreducibility: that given the specification it won’t be possible to work out all its consequences.

What does all this mean? In essence it’s just a reflection of the fact that as soon as there’s “serious computation” (i.e. irreducible computation) involved, one isn’t going to be immediately able to say what will happen. (And in a sense that’s inevitable, because if one could say, it would mean the computation wasn’t in fact irreducible.) So, yes, we can try to “tell AIs what to do”. But it’ll be like many systems in nature (or, for that matter, people): you can set them on a path, but you can’t know for sure what will happen; you just have to wait and see.

In the world today, there are already plenty of things that are being done by AIs. And, as we’ve discussed, there’ll surely be more in the future. But who’s “in charge”? Are we telling the AIs what to do, or are they telling us? Today it’s at best a mixture: AIs suggest content for us (for example from the web), and in general make all sorts of recommendations about what we should do. And no doubt in the future those recommendations will be even more extensive and tightly coupled to us: we’ll be recording everything we do, processing it with AI, and continually annotating with recommendations—say through augmented reality—everything we see. And in some sense things might even go beyond “recommendations”. If we have direct neural interfaces, then we might be making our brains just “decide” they want to do things, so that in some sense we become pure “puppets of the AI”.

And beyond “personal recommendations” there’s also the question of AIs running the systems we use, or in fact running the whole infrastructure of our civilization. Today we ultimately expect people to make large-scale decisions for our world—often operating in systems of rules defined by laws, and perhaps aided by computation, and even what one might call AI. But there may well come a time when it seems as if AIs could just “do a better job than humans”, say at running a central bank or waging a war.

One might ask how one would ever know if the AI would “do a better job”. Well, one could try tests, and run examples. But once again one’s faced with computational irreducibility. Yes, the particular tests one tries might work fine. But one can’t ultimately predict everything that could happen. What will the AI do if there’s suddenly a never-before-seen seismic event? We basically won’t know until it happens.

But can we be sure the AI won’t do anything “crazy”? Could we—with some definition of “crazy”—effectively “prove a theorem” that the AI can never do that? For any realistically nontrivial definition of crazy we’ll again run into computational irreducibility—and this won’t be possible.

Of course, if we’ve put a person (or even a group of people) “in charge” there’s also no way to “prove” that they won’t do anything “crazy”—and history shows that people in charge quite often have done things that, at least in retrospect, we consider “crazy”. But even though at some level there’s no more certainty about what people will do than about what AIs might do, we still get a certain comfort when people are in charge if we think that “we’re in it together”, and that if something goes wrong those people will also “feel the effects”.

But still, it seems inevitable that lots of decisions and actions in the world will be taken directly by AIs. Perhaps it’ll be because this will be cheaper. Perhaps the results (based on tests) will be better. Or perhaps, for example, things will just have to be done too quickly and in numbers too large for us humans to be in the loop.

But, OK, if a lot of what happens in our world is happening through AIs, and the AIs are effectively doing irreducible computations, what will this be like? We’ll be in a situation where things are “just happening” and we don’t quite know why. But in a sense we’ve very much been in this situation before. Because it’s what happens all the time in our interaction with nature.

Processes in nature—like, for example, the weather—can be thought of as corresponding to computations. And much of the time there’ll be irreducibility in those computations. So we won’t be able to readily predict them. Yes, we can do natural science to figure out some aspects of what’s going to happen. But it’ll inevitably be limited.

And so we can expect it to be with the “AI infrastructure” of the world. Things are happening in it—as they are in the weather—that we can’t readily predict. We’ll be able to say some things—though perhaps in ways that are closer to psychology or social science than to traditional exact science. But there’ll be surprises—like maybe some strange AI analog of a hurricane or an ice age. And in the end all we’ll really be able to do is to try to build up our human civilization so that such things “don’t fundamentally matter” to it.

In a sense the picture we have is that in time there’ll be a whole “civilization of AIs” operating—like nature—in ways that we can’t readily understand. And like with nature, we’ll coexist with it.

But at least at first we might think there’s an important difference between nature and AIs. Because we imagine that we don’t “pick our natural laws”—yet insofar as we’re the ones building the AIs we imagine we can “pick their laws”. But both parts of this aren’t quite right. Because in fact one of the implications of our Physics Project is precisely that the laws of nature that we perceive are the way they are because we are observers who are the way we are. And on the AI side, computational irreducibility implies that we can’t expect to be able to determine the final behavior of the AIs just from knowing the underlying laws we gave them.

But what will the “emergent laws” of the AIs be? Well, just like in physics, it’ll depend on how we “sample” the behavior of the AIs. If we look down at the level of individual bits, it’ll be like looking at molecular dynamics (or the behavior of atoms of space). But typically we won’t do this. And just like in physics, we’ll operate as computationally bounded observers—measuring only certain aggregated features of an underlying computationally irreducible process. But what will the “overall laws of AIs” be like? Maybe they’ll show close analogies to physics. Or maybe they’ll seem more like psychological theories (superegos for AIs?). But we can expect them in many ways to be like large-scale laws of nature of the kind we know.

Still, there’s one more difference between at least our interaction with nature and with AIs. Because we have in effect been “co-evolving” with nature for billions of years—yet AIs are “new on the scene”. And through our co-evolution with nature we’ve developed all sorts of structural, sensory and cognitive features that allow us to “interact successfully” with nature. But with AIs we don’t have these. So what does this mean?

Well, our ways of interacting with nature can be thought of as leveraging pockets of computational reducibility that exist in natural processes—to make things seem at least somewhat predictable to us. But without having found such pockets for AIs, we’re likely to be faced with much more “raw computational irreducibility”—and thus much more unpredictability. It’s been a conceit of modern times that—particularly with the help of science—we’ve been able to make more and more of our world predictable to us, though in practice a large part of what’s led to this is the way we’ve built and controlled the environment in which we live, and the things we choose to do.

But for the new “AI world”, we’re effectively starting from scratch. And to make things predictable in that world may be partly a matter of some new science, but perhaps more importantly a matter of choosing how we set up our “way of life” around the AIs there. (And, yes, if there’s lots of unpredictability we may be back to more ancient points of view about the importance of fate—or we may view AIs as a bit like the Olympians of Greek mythology, duking it out among themselves and sometimes having an effect on mortals.)

Let’s say the world is effectively being run by AIs, but let’s assume that we humans have at least some control over what they do. Then what principles should we have them follow? And what, for example, should their “ethics” be?

Well, the first thing to say is that there’s no ultimate, theoretical “right answer” to this. There are many ethical and other principles that AIs could follow. And it’s basically just a choice which ones should be followed.

When we talk about “principles” and “ethics” we tend to think more in terms of constraints on behavior than in terms of rules for generating behavior. And that means we’re dealing with something more like mathematical axioms, where we ask things like what theorems are true according to those axioms, and what are not. And that means there can be issues like whether the axioms are consistent—and whether they’re complete, in the sense that they can “determine the ethics of anything”. But now, once again, we’re face to face with computational irreducibility, here in the form of Gödel’s theorem and its generalizations.

And what this means is that it’s in general undecidable whether any given set of principles is inconsistent, or incomplete. One might “ask an ethical question”, and find that there’s a “proof chain” of unbounded length to determine what the answer to that question is within one’s specified ethical system, or whether there is even a consistent answer.

One might imagine that somehow one could add axioms to “patch up” whatever issues there are. But Gödel’s theorem basically says that it’ll never work. It’s the same story as so often with computational irreducibility: there’ll always be “new situations” that can arise, that in this case can’t be captured by a finite set of axioms.

OK, but let’s imagine we’re picking a collection of principles for AIs. What criteria could we use to do it? One might be that these principles won’t inexorably lead to a simple state—like one where the AIs are extinct, or have to keep looping doing the same thing forever. And there may be cases where one can readily see that some set of principles will lead to such outcomes. But most of the time, computational irreducibility (here in the form of things like the halting problem) will once again get in the way, and one won’t be able to tell what will happen, or successfully pick “viable principles” this way.

So this means that there are going to be a wide range of principles that we could in theory pick. But presumably what we’ll want is to pick ones that make AIs give us humans some sort of “good time”, whatever that might mean.

And a minimal idea might be to get AIs just to observe what we humans do, and then somehow imitate this. But most people wouldn’t consider this the right thing. They’d point out all the “bad” things people do. And they’d perhaps say “let’s have the AIs follow not what we actually do, but what we aspire to do”.

But where should we get these aspirations from? Different people, and different cultures, can have very different aspirations—with very different resulting principles. So whose should we pick? And, yes, there are pitifully few—if any—principles that we truly find in common everywhere. (Though, for example, the major religions all tend to share things like respect for human life, the Golden Rule, etc.)

But do we in fact have to pick one set of principles? Maybe some AIs can have some principles, and some can have others. Maybe it should be like different countries, or different online communities: different principles for different groups or in different places.

Right now that doesn’t seem plausible, because technological and commercial forces have tended to make it seem as if powerful AIs always have to be centralized. But I expect that this is just a feature of the present time, and not something intrinsic to any “human-like” AI.

So could everyone (and maybe every organization) have “their own AI” with its own principles? For some purposes this might work OK. But there are many situations where AIs (or people) can’t really act independently, and where there have to be “collective decisions” made.

Why is this? In some cases it’s because everyone is in the same physical environment. In other cases it’s because if there’s to be social cohesion—of the kind needed to support even something like a language that’s useful for communication—then there has to be certain conceptual alignment.

It’s worth pointing out, though, that at some level having a “collective conclusion” is effectively just a way of introducing certain computational reducibility to make it “easier to see what to do”. And potentially it can be avoided if one has enough computation capability. For example, one might assume that there has to be a collective conclusion about which side of the road cars should drive on. But that wouldn’t be true if every car had the computation capability to just compute a trajectory that would for example optimally weave around other cars using both sides of the road.

But if we humans are going to be in the loop, we presumably need a certain amount of computational reducibility to make our world sufficiently comprehensible to us that we can operate in it. So that means there’ll be collective—“societal”—decisions to make. We might want to just tell the AIs to “make everything as good as it can be for us”. But inevitably there will be tradeoffs. Making a collective decision one way might be really good for 99% of people, but really bad for 1%; making it the other way might be pretty good for 60%, but pretty bad for 40%. So what should the AI do?

And, of course, this is a classic problem of political philosophy, and there’s no “right answer”. And in reality the setup won’t be as clean as this. It may be fairly easy to work out some immediate effects of different courses of action. But inevitably one will eventually run into computational irreducibility—and “unintended consequences”—and so one won’t be able to say with certainty what the ultimate effects (good or bad) will be.

But, OK, so how should one actually make collective decisions? There’s no perfect answer, but in the world today, democracy in one form or another is usually viewed as the best option. So how might AI affect democracy—and perhaps improve on it? Let’s assume first that “humans are still in charge”, so that it’s ultimately their preferences that matter. (And let’s also assume that humans are more or less in their “current form”: unique and unreplicable discrete entities that believe they have independent minds.)

The basic setup for current democracy is computationally quite simple: discrete votes (or perhaps rankings) are given (sometimes with weights of various kinds), and then numerical totals are used to determine the winner (or winners). And with past technology this was pretty much all that could be done. But now there are some new elements. Imagine not casting discrete votes, but instead using computational language to write a computational essay to describe one’s preferences. Or imagine having a conversation with a linguistically enabled AI that can draw out and debate one’s preferences, and eventually summarize them in some kind of feature vector. Then imagine feeding computational essays or feature vectors from all “voters” to some AI that “works out the best thing to do”.

Well, there are still the same political philosophy issues. It’s not like 60% of people voted for A and 40% for B, so one chose A. It’s much more nuanced. But one still won’t be able to make everyone happy all the time, and one has to have some base principles to know what to do about that.

And there’s a higher-order problem in having an AI “rebalance” collective decisions all the time based on everything it knows about people’s detailed preferences (and perhaps their actions too): for many purposes—like us being able to “keep track of what’s going on”—it’s important to maintain consistency over time. But, yes, one could deal with this by having the AI somehow also weigh consistency in figuring out what to do.

But while there are no doubt ways in which AI can “tune up” democracy, AI doesn’t seem—in and of itself—to deliver any fundamentally new solution for making collective decisions, and for governance in general.

And indeed, in the end things always seem to come down to needing some fundamental set of principles about how one wants things to be. Yes, AIs can be the ones to implement these principles. But there are many possibilities for what the principles could be. And—at least if we humans are “in charge”—we’re the ones who are going to have to come up with them.

Or, in other words, we need to come up with some kind of “AI constitution”. Presumably this constitution should basically be written in precise computational language (and, yes, we’re trying to make it possible for the Wolfram Language to be used), but inevitably (as yet another consequence of computational irreducibility) there’ll be “fuzzy” definitions and distinctions, that will rely on things like examples, “interpolated” by systems like neural nets. Maybe when such a constitution is created, there’ll be multiple “renderings” of it, which can all be applied whenever the constitution is used, with some mechanism for picking the “overall conclusion”. (And, yes, there’s potentially a certain “observer-dependent” multicomputational character to this.)

But whatever its detailed mechanisms, what should the AI constitution say? Different people and groups of people will definitely come to different conclusions about it. And presumably—just as there are different countries, etc. today with different systems of laws—there’ll be different groups that want to adopt different AI constitutions. (And, yes, the same issues about collective decision making apply again when those AI constitutions have to interact.)

But given an AI constitution, one has a base on which AIs can make decisions. And on top of this one imagines a huge network of computational contracts that are autonomously executed, essentially to “run the world”.

And this is perhaps one of those classic “what could possibly go wrong?” moments. An AI constitution has been agreed on, and now everything is being run efficiently and autonomously by AIs that are following it. Well, once again, computational irreducibility rears its head. Because however carefully the AI constitution is drafted, computational irreducibility implies that one won’t be able to foresee all its consequences: “unexpected” things will always happen—and some of them will undoubtedly be things “one doesn’t like”.

In human legal systems there’s always a mechanism for adding “patches”—filling in laws or precedents that cover new situations that have come up. But if everything is being autonomously run by AIs there’s no room for that. Yes, we as humans might characterize “bad things that happen” as “bugs” that could be fixed by adding a patch. But the AI is just supposed to be operating—essentially axiomatically—according to its constitution, so it has no way to “see that it’s a bug”.

Similar to what we discussed above, there’s an interesting analogy here with human law versus natural law. Human law is something we define and can modify. Natural law is something the universe just provides us (notwithstanding the issues about observers discussed above). And by “setting an AI constitution and letting it run” we’re basically forcing ourselves into a situation where the “civilization of the AIs” is some “independent stratum” in the world, that we essentially have to take as it is, and adapt to.

Of course, one might wonder if the AI constitution could “automatically evolve”, say based on what’s actually seen to happen in the world. But one quickly returns to the exact same issues of computational irreducibility, where one can’t predict whether the evolution will be “right”, etc.

So far, we’ve assumed that in some sense “humans are in charge”. But at some level that’s an issue for the AI constitution to define. It’ll have to define whether AIs have “independent rights”—just like humans (and, in many legal systems, some other entities too). Closely related to the question of independent rights for AIs is whether an AI can be considered autonomously “responsible for its actions”—or whether such responsibility must always ultimately rest with the (presumably human) creator or “programmer” of the AI.

Once again, computational irreducibility has something to say. Because it implies that the behavior of the AI can go “irreducibly beyond” what its programmer defined. And in the end (as we discussed above) this is the same basic mechanism that allows us humans to effectively have “free will” even when we’re ultimately operating according to deterministic underlying natural laws. So if we’re going to claim that we humans have free will, and can be “responsible for our actions” (as opposed to having our actions always “dictated by underlying laws”) then we’d better claim the same for AIs.

So just as a human builds up something irreducible and irreplaceable in the course of their life, so can an AI. As a practical matter, though, AIs can presumably be backed up, copied, etc.—which isn’t (yet) possible for humans. So somehow their individual instances don’t seem as valuable, even if the “last copy” might still be valuable. As humans, we might want to say “those AIs are something inferior; they shouldn’t have rights”. But things are going to get more entangled. Imagine a bot that no longer has an identifiable owner but that’s successfully befriending people (say on social media), and paying for its underlying operation from donations, ads, etc. Can we reasonably delete that bot? We might argue that “the bot can feel no pain”—but that’s not true of its human friends. But what if the bot starts doing “bad” things? Well, then we’ll need some form of “bot justice”—and pretty soon we’ll find ourselves building a whole human-like legal structure for the AIs.

OK, so AIs will learn what they can from us humans, then they’ll fundamentally just be running as autonomous computational systems—much like nature runs as an autonomous computational system—sometimes “interacting with us”. What will they “do to us”? Well, what does nature “do to us”? In a kind of animistic way, we might attribute intentions to nature, but ultimately it’s just “following its rules” and doing what it does. And so it will be with AIs. Yes, we might think we can set things up to determine what the AIs will do. But in the end—insofar as the AIs are really making use of what’s possible in the computational universe—there’ll inevitably be computational irreducibility, and we won’t be able to foresee what will happen, or what consequences it will have.

So will the dynamics of AIs in fact have “bad” effects—like, for example, wiping us out? Well, it’s perfectly possible nature could wipe us out too. But one has the feeling that—extraterrestrial “accidents” aside—the natural world around us is at some level enough in some kind of “equilibrium” that nothing too dramatic will happen. But AIs are something new. So maybe they’ll be different.

And one possibility might be that AIs could “improve themselves” to produce a single “apex intelligence” that would in a sense dominate everything else. But here we can see computational irreducibility as coming to the rescue. Because it implies that there can never be a “best at everything” computational system. It’s a core result of the emerging field of metabiology: that whatever “achievement” you specify, there’ll always be a computational system somewhere out there in the computational universe that will exceed it. (A simple example is that there’s always a Turing machine that can be found that will exceed any upper bound you specify on the time it takes to halt.)

So what this means is that there’ll inevitably be a whole “ecosystem” of AIs—with no single winner. Of course, while that might be an inevitable final outcome, it might not be what happens in the shorter term. And indeed the current tendency to centralize AI systems has a certain danger of AI behavior becoming “unstabilized” relative to what it would be with a whole ecosystem of “AIs in equilibrium”.

And in this situation there’s another potential concern as well. We humans are the product of a long struggle for life played out over the course of the history of biological evolution. And insofar as AIs inherit our attributes we might expect them to inherit a certain “drive to win”—perhaps also against us. And perhaps this is where the AI constitution becomes important: to define a “contract” that supersedes what AIs might “naturally” inherit from effectively observing our behavior. Eventually we can expect the AIs to “independently reach equilibrium”. But in the meantime, the AI constitution can help break their connection with our “competitive” history of biological evolution.

We’ve talked quite a bit about the ultimate future course of AIs, and their relation to us humans. But what about the short term? How today can we prepare for the growing capabilities and uses of AIs?

As has been true throughout history, people who use tools tend to do better than those who don’t. Yes, you can go on doing by direct human effort what has now been successfully automated, but except in rare cases you’ll increasingly be left behind. And what’s now emerging is an extremely powerful combination of tools: neural-net-style AI for “immediate human-like tasks”, along with computational language for deeper access to the computational universe and computational knowledge.

So what should people do with this? The highest leverage will come from figuring out new possibilities—things that weren’t possible before but have now “come into range” as a result of new capabilities. And as we discussed above, this is a place where we humans are inevitably central contributors—because we’re the ones who must define what we consider has value for us.

So what does this mean for education? What’s worth learning now that so much has been automated? I think the fundamental answer is how to think as broadly and deeply as possible—calling on as much knowledge and as many paradigms as possible, and particularly making use of the computational paradigm, and ways of thinking about things that directly connect with what computation can help with.

In the course of human history a lot of knowledge has been accumulated. But as ways of thinking have advanced, it’s become unnecessary to learn directly that knowledge in all its detail: instead one can learn things at a higher level, abstracting out many of the specific details. But in the past few decades something fundamentally new has come on the scene: computers and the things they enable.

For the first time in history, it’s become realistic to truly automate intellectual tasks. The leverage this provides is completely unprecedented. And we’re only just starting to come to terms with what it means for what and how we should learn. But with all this new power there’s a tendency to think something must be lost. Surely it must still be worth learning all those intricate details—that people in the past worked so hard to figure out—of how to do some mathematical calculation, even though Mathematica has been able to do it automatically for more than a third of a century?

And, yes, at the right time it can be interesting to learn those details. But in the effort to understand and best make use of the intellectual achievements of our civilization, it makes much more sense to leverage the automation we have, and treat those calculations just as “building blocks” that can be put together in “finished form” to do whatever it is we want to do.

One might think this kind of leveraging of automation would just be important for “practical purposes”, and for applying knowledge in the real world. But actually—as I have personally found repeatedly to great benefit over the decades—it’s also crucial at a conceptual level. Because it’s only through automation that one can get enough examples and experience that one’s able to develop the intuition needed to reach a higher level of understanding.

Confronted with the rapidly growing amount of knowledge in the world there’s been a tremendous tendency to assume that people must inevitably become more and more specialized. But with increasing success in the automation of intellectual tasks—and what we might broadly call AI—it becomes clear there’s an alternative: to make more and more use of this automation, so people can operate at a higher level, “integrating” rather than specializing.

And in a sense this is the way to make the best use of our human capabilities: to let us concentrate on setting the “strategy” of what we want to do—delegating the details of how to do it to automated systems that can do it better than us. But, by the way, the very fact that there’s an AI that knows how to do something will no doubt make it easier for humans to learn how to do it too. Because—although we don’t yet have the complete story—it seems inevitable that with modern techniques AIs will be able to successfully “learn how people learn”, and effectively present things an AI “knows” in just the right way for any given person to absorb.

So what should people actually learn? Learn how to use tools to do things. But also learn what things are out there to do—and learn facts to anchor how you think about those things. A lot of education today is about answering questions. But for the future—with AI in the picture—what’s likely to be more important is to learn how to ask questions, and how to figure out what questions are worth asking. Or, in effect, how to lay out an “intellectual strategy” for what to do.

And to be successful at this, what’s going to be important is breadth of knowledge—and clarity of thinking. And when it comes to clarity of thinking, there’s again something new in modern times: the concept of computational thinking. In the past we’ve had things like logic, and mathematics, as ways to structure thinking. But now we have something new: computation.

Does that mean everyone should “learn to program” in some traditional programming language? No. Traditional programming languages are about telling computers what to do in their terms. And, yes, lots of humans do this today. But it’s something that’s fundamentally ripe for direct automation (as examples with ChatGPT already show). And what’s important for the long term is something different. It’s to use the computational paradigm as a structured way to think not about the operation of computers, but about both things in the world and abstract things.

And crucial to this is having a computational language: a language for expressing things using the computational paradigm. It’s perfectly possible to express simple “everyday things” in plain, unstructured natural language. But to build any kind of serious “conceptual tower” one needs something more structured. And that’s what computational language is about.

One can see a rough historical analog in the development of mathematics and mathematical thinking. Up until about half a millennium ago, mathematics basically had to be expressed in natural language. But then came mathematical notation—and from it a more streamlined approach to mathematical thinking, that eventually made possible all the various mathematical sciences. And it’s now the same kind of thing with computational language and the computational paradigm. Except that it’s a much broader story, in which for basically every field or occupation “X” there’s a “computational X” that’s emerging.

In a sense the point of computational language (and all my efforts in the development of the Wolfram Language) is to be able to let people get “as automatically as possible” to computational X—and to let people express themselves using the full power of the computational paradigm.

Something like ChatGPT provides “human-like AI” in effect by piecing together existing human material (like billions of words of human-written text). But computational language lets one tap directly into computation—and gives the ability to do fundamentally new things, that immediately leverage our human capabilities for defining intellectual strategy.

And, yes, while traditional programming is likely to be largely obsoleted by AI, computational language is something that provides a permanent bridge between human thinking and the computational universe: a channel in which the automation is already done in the very design (and implementation) of the language—leaving in a sense an interface directly suitable for humans to learn, and to use as a basis to extend their thinking.

But, OK, what about the future of discovery? Will AIs take over from us humans in, for example, “doing science”? I, for one, have used computation (and many things one might think of as AI) as a tool for scientific discovery for nearly half a century. And, yes, many of my discoveries have in effect been “made by computer”. But science is ultimately about connecting things to human understanding. And so far it’s taken a human to knit what the computer finds into the whole web of human intellectual history.

One can certainly imagine, though, that an AI—even one rather like ChatGPT—could be quite successful in taking a “raw computational discovery” and “explaining” how it might relate to existing human knowledge. One could also imagine that the AI would be successful at identifying what aspects of some system in the world could be picked out to describe in some formal way. But—as is typical for the process of modeling in general—a key step is to decide “what one cares about”, and in effect in what direction to go in extending one’s science. And this—like so much else—is inevitably tied into the specifics of the goals we humans set ourselves.

In the emerging AI world there are plenty of specific skills that won’t make sense for (most) humans to learn—just as today the advance of automation has obsoleted many skills from the past. But—as we’ve discussed—we can expect there to “be a place” for humans. And what’s most important for us humans to learn is in effect how to pick “where next to go”—and where, out of all the infinite possibilities in the computational universe, we should take human civilization.

OK, so we’ve talked quite a bit about what might happen in the future. But what about actual data from the past? For example, what’s been the actual history of the evolution of jobs? Conveniently, in the US, the Census Bureau has records of people’s occupations going back to 1850. Of course, many job titles have changed since then. Switchmen (on railroads), chainmen (in surveying) and sextons (in churches) aren’t really things anymore. And telemarketers, aircraft pilots and web developers weren’t things in 1850. But with a bit of effort, it’s possible to more or less match things up—at least if one aggregates into large enough categories.

So here are pie charts of different job categories at 50-year intervals:

And, yes, in 1850 the US was firmly an agricultural economy, with just over half of all jobs being in agriculture. But as agriculture got more efficient—with the introduction of machinery, irrigation, better seeds, fertilizers, etc.—the fraction dropped dramatically, to just a few percent today.

After agriculture, the next biggest category back in 1850 was construction (along with other real-estate-related jobs, mainly maintenance). And this is a category that for a century and a half hasn’t changed much in size (at least so far), presumably because, even though there’s been greater automation, this has just allowed buildings to be more complex.

Looking at the pie charts above, we can see a clear trend towards greater diversification in jobs (and indeed the same thing is seen in the development of other economies around the world). It’s an old theory in economics that increasing specialization is related to economic growth, but from our point of view here, we might say that the very possibility of a more complex economy, with more niches and jobs, is a reflection of the inevitable presence of computational irreducibility, and the complex web of pockets of computational reducibility that it implies.

Beyond the overall distribution of job categories, we can also look at trends in individual categories over time—with each one in a sense providing a certain window onto history:

One can definitely see cases where the number of jobs decreases as a result of automation. And this happens not only in areas like agriculture and mining, but also for example in finance (fewer clerks and bank tellers), as well as in sales and retail (online shopping). Sometimes—as in the case of manufacturing—there’s a decrease of jobs partly because of automation, and partly because the jobs move out of the US (mainly to countries with lower labor costs).

There are cases—like military jobs—where there are clear “exogenous” effects. And then there are cases like transportation+logistics where there’s a steady increase for more than half a century as technology spreads and infrastructure gets built up—but then things “saturate”, presumably at least partly as a result of increased automation. It’s a somewhat similar story with what I’ve called “technical operations”—with more “tending to technology” needed as technology becomes more widespread.

Another clear trend is an increase in job categories associated with the world becoming an “organizationally more complicated place”. Thus we see increases in management, as well as administration, government, finance and sales (which all have recent decreases as a result of computerization). And there’s also a (somewhat recent) increase in legal.

Other areas with increases include healthcare, engineering, science and education—where “more is known and there’s more to do” (as well as there being increased organizational complexity). And then there’s entertainment, and food+hospitality, with increases that one might attribute to people leading (and wanting) “more complex lives”. And, of course, there’s information technology which takes off from nothing in the mid-1950s (and which had to be rather awkwardly grafted into the data we’re using here).

So what can we conclude? The data seems quite well aligned with what we discussed in more general terms above. Well-developed areas get automated and need to employ fewer people. But technology also opens up new areas, which employ additional people. And—as we might expect from computational irreducibility—things generally get progressively more complicated, with additional knowledge and organizational structure opening up more “frontiers” where people are needed. But even though there are sometimes “sudden inventions”, it still always seems to take decades (or effectively a generation) for there to be any dramatic change in the number of jobs. (The few sharp changes visible in the plots seem mostly to be associated with specific economic events, and—often related—changes in government policies.)

But in addition to the different jobs that get done, there’s also the question of how individual people spend their time each day. And—while it certainly doesn’t live up to my own (rather extreme) level of personal analytics—there’s a certain amount of data on this that’s been collected over the years (by getting time diaries from randomly sampled people) in the American Heritage Time Use Study. So here, for example, are plots based on this survey for how the amount of time spent on different broad activities has varied over the decades (the main line shows the mean—in hours—for each activity; the shaded areas indicate successive deciles):

And, yes, people are spending more time on “media & computing”, some mixture of watching TV, playing videogames, etc. Housework, at least for women, takes less time, presumably mostly as a result of automation (appliances, etc.). (“Leisure” is basically “hanging out” as well as hobbies and social, cultural, sporting events, etc.; “Civic” includes volunteer, religious, etc. activities.)

If one looks specifically at people who are doing paid work

one notices several things. First, the average number of hours worked hasn’t changed much in half a century, though the distribution has broadened somewhat. For people doing paid work, media & computing hasn’t increased significantly, at least since the 1980s. One category in which there is systematic increase (though the total time still isn’t very large) is exercise.

What about people who—for one reason or another—aren’t doing paid work? Here are corresponding results in this case:

Not so much increase in exercise (though the total times are larger to begin with), but now a significant increase in media & computing, with the average recently reaching nearly 6 hours per day for men—perhaps as a reflection of “more of life going online”.

But looking at all these results on time use, I think the main conclusion that over the past half century, the ways people (at least in the US) spend their time have remained rather stable—even as we’ve gone from a world with almost no computers to a world in which there are more computers than people.

]]>See also:

*“Wolfram|Alpha as the Way to Bring Computational Knowledge Superpowers to ChatGPT”* »*A discussion about the history of neural nets* »

That ChatGPT can automatically generate something that reads even superficially like human-written text is remarkable, and unexpected. But how does it do it? And why does it work? My purpose here is to give a rough outline of what’s going on inside ChatGPT—and then to explore why it is that it can do so well in producing what we might consider to be meaningful text. I should say at the outset that I’m going to focus on the big picture of what’s going on—and while I’ll mention some engineering details, I won’t get deeply into them. (And the essence of what I’ll say applies just as well to other current “large language models” [LLMs] as to ChatGPT.)

The first thing to explain is that what ChatGPT is always fundamentally trying to do is to produce a “reasonable continuation” of whatever text it’s got so far, where by “reasonable” we mean “what one might expect someone to write after seeing what people have written on billions of webpages, etc.”

So let’s say we’ve got the text “*The best thing about AI is its ability to*”. Imagine scanning billions of pages of human-written text (say on the web and in digitized books) and finding all instances of this text—then seeing what word comes next what fraction of the time. ChatGPT effectively does something like this, except that (as I’ll explain) it doesn’t look at literal text; it looks for things that in a certain sense “match in meaning”. But the end result is that it produces a ranked list of words that might follow, together with “probabilities”:

And the remarkable thing is that when ChatGPT does something like write an essay what it’s essentially doing is just asking over and over again “given the text so far, what should the next word be?”—and each time adding a word. (More precisely, as I’ll explain, it’s adding a “token”, which could be just a part of a word, which is why it can sometimes “make up new words”.)

But, OK, at each step it gets a list of words with probabilities. But which one should it actually pick to add to the essay (or whatever) that it’s writing? One might think it should be the “highest-ranked” word (i.e. the one to which the highest “probability” was assigned). But this is where a bit of voodoo begins to creep in. Because for some reason—that maybe one day we’ll have a scientific-style understanding of—if we always pick the highest-ranked word, we’ll typically get a very “flat” essay, that never seems to “show any creativity” (and even sometimes repeats word for word). But if sometimes (at random) we pick lower-ranked words, we get a “more interesting” essay.

The fact that there’s randomness here means that if we use the same prompt multiple times, we’re likely to get different essays each time. And, in keeping with the idea of voodoo, there’s a particular so-called “temperature” parameter that determines how often lower-ranked words will be used, and for essay generation, it turns out that a “temperature” of 0.8 seems best. (It’s worth emphasizing that there’s no “theory” being used here; it’s just a matter of what’s been found to work in practice. And for example the concept of “temperature” is there because exponential distributions familiar from statistical physics happen to be being used, but there’s no “physical” connection—at least so far as we know.)

Before we go on I should explain that for purposes of exposition I’m mostly not going to use the full system that’s in ChatGPT; instead I’ll usually work with a simpler GPT-2 system, which has the nice feature that it’s small enough to be able to run on a standard desktop computer. And so for essentially everything I show I’ll be able to include explicit Wolfram Language code that you can immediately run on your computer. (Click any picture here to copy the code behind it.)

For example, here’s how to get the table of probabilities above. First, we have to retrieve the underlying “language model” neural net:

Later on, we’ll look inside this neural net, and talk about how it works. But for now we can just apply this “net model” as a black box to our text so far, and ask for the top 5 words by probability that the model says should follow:

This takes that result and makes it into an explicit formatted “dataset”:

Here’s what happens if one repeatedly “applies the model”—at each step adding the word that has the top probability (specified in this code as the “decision” from the model):

What happens if one goes on longer? In this (“zero temperature”) case what comes out soon gets rather confused and repetitive:

But what if instead of always picking the “top” word one sometimes randomly picks “non-top” words (with the “randomness” corresponding to “temperature” 0.8)? Again one can build up text:

And every time one does this, different random choices will be made, and the text will be different—as in these 5 examples:

It’s worth pointing out that even at the first step there are a lot of possible “next words” to choose from (at temperature 0.8), though their probabilities fall off quite quickly (and, yes, the straight line on this log-log plot corresponds to an *n*^{–1} “power-law” decay that’s very characteristic of the general statistics of language):

So what happens if one goes on longer? Here’s a random example. It’s better than the top-word (zero temperature) case, but still at best a bit weird:

This was done with the simplest GPT-2 model (from 2019). With the newer and bigger GPT-3 models the results are better. Here’s the top-word (zero temperature) text produced with the same “prompt”, but with the biggest GPT-3 model:

And here’s a random example at “temperature 0.8”:

OK, so ChatGPT always picks its next word based on probabilities. But where do those probabilities come from? Let’s start with a simpler problem. Let’s consider generating English text one letter (rather than word) at a time. How can we work out what the probability for each letter should be?

A very minimal thing we could do is just take a sample of English text, and calculate how often different letters occur in it. So, for example, this counts letters in the Wikipedia article on “cats”:

And this does the same thing for “dogs”:

The results are similar, but not the same (“o” is no doubt more common in the “dogs” article because, after all, it occurs in the word “dog” itself). Still, if we take a large enough sample of English text we can expect to eventually get at least fairly consistent results:

Here’s a sample of what we get if we just generate a sequence of letters with these probabilities:

We can break this into “words” by adding in spaces as if they were letters with a certain probability:

We can do a slightly better job of making “words” by forcing the distribution of “word lengths” to agree with what it is in English:

We didn’t happen to get any “actual words” here, but the results are looking slightly better. To go further, though, we need to do more than just pick each letter separately at random. And, for example, we know that if we have a “q”, the next letter basically has to be “u”.

Here’s a plot of the probabilities for letters on their own:

And here’s a plot that shows the probabilities of pairs of letters (“2-grams”) in typical English text. The possible first letters are shown across the page, the second letters down the page:

And we see here, for example, that the “q” column is blank (zero probability) except on the “u” row. OK, so now instead of generating our “words” a single letter at a time, let’s generate them looking at two letters at a time, using these “2-gram” probabilities. Here’s a sample of the result—which happens to include a few “actual words”:

With sufficiently much English text we can get pretty good estimates not just for probabilities of single letters or pairs of letters (2-grams), but also for longer runs of letters. And if we generate “random words” with progressively longer *n*-gram probabilities, we see that they get progressively “more realistic”:

But let’s now assume—more or less as ChatGPT does—that we’re dealing with whole words, not letters. There are about 40,000 reasonably commonly used words in English. And by looking at a large corpus of English text (say a few million books, with altogether a few hundred billion words), we can get an estimate of how common each word is. And using this we can start generating “sentences”, in which each word is independently picked at random, with the same probability that it appears in the corpus. Here’s a sample of what we get:

Not surprisingly, this is nonsense. So how can we do better? Just like with letters, we can start taking into account not just probabilities for single words but probabilities for pairs or longer *n*-grams of words. Doing this for pairs, here are 5 examples of what we get, in all cases starting from the word “cat”:

It’s getting slightly more “sensible looking”. And we might imagine that if we were able to use sufficiently long *n*-grams we’d basically “get a ChatGPT”—in the sense that we’d get something that would generate essay-length sequences of words with the “correct overall essay probabilities”. But here’s the problem: there just isn’t even close to enough English text that’s ever been written to be able to deduce those probabilities.

In a crawl of the web there might be a few hundred billion words; in books that have been digitized there might be another hundred billion words. But with 40,000 common words, even the number of possible 2-grams is already 1.6 billion—and the number of possible 3-grams is 60 trillion. So there’s no way we can estimate the probabilities even for all of these from text that’s out there. And by the time we get to “essay fragments” of 20 words, the number of possibilities is larger than the number of particles in the universe, so in a sense they could never all be written down.

So what can we do? The big idea is to make a model that lets us estimate the probabilities with which sequences should occur—even though we’ve never explicitly seen those sequences in the corpus of text we’ve looked at. And at the core of ChatGPT is precisely a so-called “large language model” (LLM) that’s been built to do a good job of estimating those probabilities.

Say you want to know (as Galileo did back in the late 1500s) how long it’s going to take a cannon ball dropped from each floor of the Tower of Pisa to hit the ground. Well, you could just measure it in each case and make a table of the results. Or you could do what is the essence of theoretical science: make a model that gives some kind of procedure for computing the answer rather than just measuring and remembering each case.

Let’s imagine we have (somewhat idealized) data for how long the cannon ball takes to fall from various floors:

How do we figure out how long it’s going to take to fall from a floor we don’t explicitly have data about? In this particular case, we can use known laws of physics to work it out. But say all we’ve got is the data, and we don’t know what underlying laws govern it. Then we might make a mathematical guess, like that perhaps we should use a straight line as a model:

We could pick different straight lines. But this is the one that’s on average closest to the data we’re given. And from this straight line we can estimate the time to fall for any floor.

How did we know to try using a straight line here? At some level we didn’t. It’s just something that’s mathematically simple, and we’re used to the fact that lots of data we measure turns out to be well fit by mathematically simple things. We could try something mathematically more complicated—say *a* + *b* *x* + *c* *x*^{2}—and then in this case we do better:

Things can go quite wrong, though. Like here’s the best we can do with *a* + *b*/*x* + *c* sin(*x*):

It is worth understanding that there’s never a “model-less model”. Any model you use has some particular underlying structure—then a certain set of “knobs you can turn” (i.e. parameters you can set) to fit your data. And in the case of ChatGPT, lots of such “knobs” are used—actually, 175 billion of them.

But the remarkable thing is that the underlying structure of ChatGPT—with “just” that many parameters—is sufficient to make a model that computes next-word probabilities “well enough” to give us reasonable essay-length pieces of text.

The example we gave above involves making a model for numerical data that essentially comes from simple physics—where we’ve known for several centuries that “simple mathematics applies”. But for ChatGPT we have to make a model of human-language text of the kind produced by a human brain. And for something like that we don’t (at least yet) have anything like “simple mathematics”. So what might a model of it be like?

Before we talk about language, let’s talk about another human-like task: recognizing images. And as a simple example of this, let’s consider images of digits (and, yes, this is a classic machine learning example):

One thing we could do is get a bunch of sample images for each digit:

Then to find out if an image we’re given as input corresponds to a particular digit we could just do an explicit pixel-by-pixel comparison with the samples we have. But as humans we certainly seem to do something better—because we can still recognize digits, even when they’re for example handwritten, and have all sorts of modifications and distortions:

When we made a model for our numerical data above, we were able to take a numerical value *x* that we were given, and just compute *a + b x* for particular *a* and *b*. So if we treat the gray-level value of each pixel here as some variable *x _{i}* is there some function of all those variables that—when evaluated—tells us what digit the image is of? It turns out that it’s possible to construct such a function. Not surprisingly, it’s not particularly simple, though. And a typical example might involve perhaps half a million mathematical operations.

But the end result is that if we feed the collection of pixel values for an image into this function, out will come the number specifying which digit we have an image of. Later, we’ll talk about how such a function can be constructed, and the idea of neural nets. But for now let’s treat the function as black box, where we feed in images of, say, handwritten digits (as arrays of pixel values) and we get out the numbers these correspond to:

But what’s really going on here? Let’s say we progressively blur a digit. For a little while our function still “recognizes” it, here as a “2”. But soon it “loses it”, and starts giving the “wrong” result:

But why do we say it’s the “wrong” result? In this case, we know we got all the images by blurring a “2”. But if our goal is to produce a model of what humans can do in recognizing images, the real question to ask is what a human would have done if presented with one of those blurred images, without knowing where it came from.

And we have a “good model” if the results we get from our function typically agree with what a human would say. And the nontrivial scientific fact is that for an image-recognition task like this we now basically know how to construct functions that do this.

Can we “mathematically prove” that they work? Well, no. Because to do that we’d have to have a mathematical theory of what we humans are doing. Take the “2” image and change a few pixels. We might imagine that with only a few pixels “out of place” we should still consider the image a “2”. But how far should that go? It’s a question of human visual perception. And, yes, the answer would no doubt be different for bees or octopuses—and potentially utterly different for putative aliens.

OK, so how do our typical models for tasks like image recognition actually work? The most popular—and successful—current approach uses neural nets. Invented—in a form remarkably close to their use today—in the 1940s, neural nets can be thought of as simple idealizations of how brains seem to work.

In human brains there are about 100 billion neurons (nerve cells), each capable of producing an electrical pulse up to perhaps a thousand times a second. The neurons are connected in a complicated net, with each neuron having tree-like branches allowing it to pass electrical signals to perhaps thousands of other neurons. And in a rough approximation, whether any given neuron produces an electrical pulse at a given moment depends on what pulses it’s received from other neurons—with different connections contributing with different “weights”.

When we “see an image” what’s happening is that when photons of light from the image fall on (“photoreceptor”) cells at the back of our eyes they produce electrical signals in nerve cells. These nerve cells are connected to other nerve cells, and eventually the signals go through a whole sequence of layers of neurons. And it’s in this process that we “recognize” the image, eventually “forming the thought” that we’re “seeing a 2” (and maybe in the end doing something like saying the word “two” out loud).

The “black-box” function from the previous section is a “mathematicized” version of such a neural net. It happens to have 11 layers (though only 4 “core layers”):

There’s nothing particularly “theoretically derived” about this neural net; it’s just something that—back in 1998—was constructed as a piece of engineering, and found to work. (Of course, that’s not much different from how we might describe our brains as having been produced through the process of biological evolution.)

OK, but how does a neural net like this “recognize things”? The key is the notion of attractors. Imagine we’ve got handwritten images of 1’s and 2’s:

We somehow want all the 1’s to “be attracted to one place”, and all the 2’s to “be attracted to another place”. Or, put a different way, if an image is somehow “closer to being a 1” than to being a 2, we want it to end up in the “1 place” and vice versa.

As a straightforward analogy, let’s say we have certain positions in the plane, indicated by dots (in a real-life setting they might be positions of coffee shops). Then we might imagine that starting from any point on the plane we’d always want to end up at the closest dot (i.e. we’d always go to the closest coffee shop). We can represent this by dividing the plane into regions (“attractor basins”) separated by idealized “watersheds”:

We can think of this as implementing a kind of “recognition task” in which we’re not doing something like identifying what digit a given image “looks most like”—but rather we’re just, quite directly, seeing what dot a given point is closest to. (The “Voronoi diagram” setup we’re showing here separates points in 2D Euclidean space; the digit recognition task can be thought of as doing something very similar—but in a 784-dimensional space formed from the gray levels of all the pixels in each image.)

So how do we make a neural net “do a recognition task”? Let’s consider this very simple case:

Our goal is to take an “input” corresponding to a position {*x*,*y*}—and then to “recognize” it as whichever of the three points it’s closest to. Or, in other words, we want the neural net to compute a function of {*x*,*y*} like:

So how do we do this with a neural net? Ultimately a neural net is a connected collection of idealized “neurons”—usually arranged in layers—with a simple example being:

Each “neuron” is effectively set up to evaluate a simple numerical function. And to “use” the network, we simply feed numbers (like our coordinates *x* and *y*) in at the top, then have neurons on each layer “evaluate their functions” and feed the results forward through the network—eventually producing the final result at the bottom:

In the traditional (biologically inspired) setup each neuron effectively has a certain set of “incoming connections” from the neurons on the previous layer, with each connection being assigned a certain “weight” (which can be a positive or negative number). The value of a given neuron is determined by multiplying the values of “previous neurons” by their corresponding weights, then adding these up and adding a constant—and finally applying a “thresholding” (or “activation”) function. In mathematical terms, if a neuron has inputs *x* = {*x*_{1}, *x*_{2} …}*f*[*w* . *x* + *b*], where the weights *w* and constant *b* are generally chosen differently for each neuron in the network; the function *f *is usually the same.

Computing *w* . *x* + *b* is just a matter of matrix multiplication and addition. The “activation function” *f* introduces nonlinearity (and ultimately is what leads to nontrivial behavior). Various activation functions commonly get used; here we’ll just use `Ramp` (or ReLU):

For each task we want the neural net to perform (or, equivalently, for each overall function we want it to evaluate) we’ll have different choices of weights. (And—as we’ll discuss later—these weights are normally determined by “training” the neural net using machine learning from examples of the outputs we want.)

Ultimately, every neural net just corresponds to some overall mathematical function—though it may be messy to write out. For the example above, it would be:

The neural net of ChatGPT also just corresponds to a mathematical function like this—but effectively with billions of terms.

But let’s go back to individual neurons. Here are some examples of the functions a neuron with two inputs (representing coordinates *x* and *y*) can compute with various choices of weights and constants (and `Ramp` as activation function):

But what about the larger network from above? Well, here’s what it computes:

It’s not quite “right”, but it’s close to the “nearest point” function we showed above.

Let’s see what happens with some other neural nets. In each case, as we’ll explain later, we’re using machine learning to find the best choice of weights. Then we’re showing here what the neural net with those weights computes:

Bigger networks generally do better at approximating the function we’re aiming for. And in the “middle of each attractor basin” we typically get exactly the answer we want. But at the boundaries—where the neural net “has a hard time making up its mind”—things can be messier.

With this simple mathematical-style “recognition task” it’s clear what the “right answer” is. But in the problem of recognizing handwritten digits, it’s not so clear. What if someone wrote a “2” so badly it looked like a “7”, etc.? Still, we can ask how a neural net distinguishes digits—and this gives an indication:

Can we say “mathematically” how the network makes its distinctions? Not really. It’s just “doing what the neural net does”. But it turns out that that normally seems to agree fairly well with the distinctions we humans make.

Let’s take a more elaborate example. Let’s say we have images of cats and dogs. And we have a neural net that’s been trained to distinguish them. Here’s what it might do on some examples:

Now it’s even less clear what the “right answer” is. What about a dog dressed in a cat suit? Etc. Whatever input it’s given, the neural net is generating an answer. And, it turns out, to do it in a way that’s reasonably consistent with what humans might do. As I’ve said above, that’s not a fact we can “derive from first principles”. It’s just something that’s empirically been found to be true, at least in certain domains. But it’s a key reason why neural nets are useful: that they somehow capture a “human-like” way of doing things.

Show yourself a picture of a cat, and ask “Why is that a cat?”. Maybe you’d start saying “Well, I see its pointy ears, etc.” But it’s not very easy to explain how you recognized the image as a cat. It’s just that somehow your brain figured that out. But for a brain there’s no way (at least yet) to “go inside” and see how it figured it out. What about for an (artificial) neural net? Well, it’s straightforward to see what each “neuron” does when you show a picture of a cat. But even to get a basic visualization is usually very difficult.

In the final net that we used for the “nearest point” problem above there are 17 neurons. In the net for recognizing handwritten digits there are 2190. And in the net we’re using to recognize cats and dogs there are 60,650. Normally it would be pretty difficult to visualize what amounts to 60,650-dimensional space. But because this is a network set up to deal with images, many of its layers of neurons are organized into arrays, like the arrays of pixels it’s looking at.

And if we take a typical cat image

then we can represent the states of neurons at the first layer by a collection of derived images—many of which we can readily interpret as being things like “the cat without its background”, or “the outline of the cat”:

By the 10th layer it’s harder to interpret what’s going on:

But in general we might say that the neural net is “picking out certain features” (maybe pointy ears are among them), and using these to determine what the image is of. But are those features ones for which we have names—like “pointy ears”? Mostly not.

Are our brains using similar features? Mostly we don’t know. But it’s notable that the first few layers of a neural net like the one we’re showing here seem to pick out aspects of images (like edges of objects) that seem to be similar to ones we know are picked out by the first level of visual processing in brains.

But let’s say we want a “theory of cat recognition” in neural nets. We can say: “Look, this particular net does it”—and immediately that gives us some sense of “how hard a problem” it is (and, for example, how many neurons or layers might be needed). But at least as of now we don’t have a way to “give a narrative description” of what the network is doing. And maybe that’s because it truly is computationally irreducible, and there’s no general way to find what it does except by explicitly tracing each step. Or maybe it’s just that we haven’t “figured out the science”, and identified the “natural laws” that allow us to summarize what’s going on.

We’ll encounter the same kinds of issues when we talk about generating language with ChatGPT. And again it’s not clear whether there are ways to “summarize what it’s doing”. But the richness and detail of language (and our experience with it) may allow us to get further than with images.

We’ve been talking so far about neural nets that “already know” how to do particular tasks. But what makes neural nets so useful (presumably also in brains) is that not only can they in principle do all sorts of tasks, but they can be incrementally “trained from examples” to do those tasks.

When we make a neural net to distinguish cats from dogs we don’t effectively have to write a program that (say) explicitly finds whiskers; instead we just show lots of examples of what’s a cat and what’s a dog, and then have the network “machine learn” from these how to distinguish them.

And the point is that the trained network “generalizes” from the particular examples it’s shown. Just as we’ve seen above, it isn’t simply that the network recognizes the particular pixel pattern of an example cat image it was shown; rather it’s that the neural net somehow manages to distinguish images on the basis of what we consider to be some kind of “general catness”.

So how does neural net training actually work? Essentially what we’re always trying to do is to find weights that make the neural net successfully reproduce the examples we’ve given. And then we’re relying on the neural net to “interpolate” (or “generalize”) “between” these examples in a “reasonable” way.

Let’s look at a problem even simpler than the nearest-point one above. Let’s just try to get a neural net to learn the function:

For this task, we’ll need a network that has just one input and one output, like:

But what weights, etc. should we be using? With every possible set of weights the neural net will compute some function. And, for example, here’s what it does with a few randomly chosen sets of weights:

And, yes, we can plainly see that in none of these cases does it get even close to reproducing the function we want. So how do we find weights that will reproduce the function?

The basic idea is to supply lots of “input → output” examples to “learn from”—and then to try to find weights that will reproduce these examples. Here’s the result of doing that with progressively more examples:

At each stage in this “training” the weights in the network are progressively adjusted—and we see that eventually we get a network that successfully reproduces the function we want. So how do we adjust the weights? The basic idea is at each stage to see “how far away we are” from getting the function we want—and then to update the weights in such a way as to get closer.

To find out “how far away we are” we compute what’s usually called a “loss function” (or sometimes “cost function”). Here we’re using a simple (L2) loss function that’s just the sum of the squares of the differences between the values we get, and the true values. And what we see is that as our training process progresses, the loss function progressively decreases (following a certain “learning curve” that’s different for different tasks)—until we reach a point where the network (at least to a good approximation) successfully reproduces the function we want:

Alright, so the last essential piece to explain is how the weights are adjusted to reduce the loss function. As we’ve said, the loss function gives us a “distance” between the values we’ve got, and the true values. But the “values we’ve got” are determined at each stage by the current version of neural net—and by the weights in it. But now imagine that the weights are variables—say *w _{i}*. We want to find out how to adjust the values of these variables to minimize the loss that depends on them.

For example, imagine (in an incredible simplification of typical neural nets used in practice) that we have just two weights *w*_{1} and *w*_{2}. Then we might have a loss that as a function of *w*_{1} and *w*_{2} looks like this:

Numerical analysis provides a variety of techniques for finding the minimum in cases like this. But a typical approach is just to progressively follow the path of steepest descent from whatever previous *w*_{1}, *w*_{2} we had:

Like water flowing down a mountain, all that’s guaranteed is that this procedure will end up at some local minimum of the surface (“a mountain lake”); it might well not reach the ultimate global minimum.

It’s not obvious that it would be feasible to find the path of the steepest descent on the “weight landscape”. But calculus comes to the rescue. As we mentioned above, one can always think of a neural net as computing a mathematical function—that depends on its inputs, and its weights. But now consider differentiating with respect to these weights. It turns out that the chain rule of calculus in effect lets us “unravel” the operations done by successive layers in the neural net. And the result is that we can—at least in some local approximation—“invert” the operation of the neural net, and progressively find weights that minimize the loss associated with the output.

The picture above shows the kind of minimization we might need to do in the unrealistically simple case of just 2 weights. But it turns out that even with many more weights (ChatGPT uses 175 billion) it’s still possible to do the minimization, at least to some level of approximation. And in fact the big breakthrough in “deep learning” that occurred around 2011 was associated with the discovery that in some sense it can be easier to do (at least approximate) minimization when there are lots of weights involved than when there are fairly few.

In other words—somewhat counterintuitively—it can be easier to solve more complicated problems with neural nets than simpler ones. And the rough reason for this seems to be that when one has a lot of “weight variables” one has a high-dimensional space with “lots of different directions” that can lead one to the minimum—whereas with fewer variables it’s easier to end up getting stuck in a local minimum (“mountain lake”) from which there’s no “direction to get out”.

It’s worth pointing out that in typical cases there are many different collections of weights that will all give neural nets that have pretty much the same performance. And usually in practical neural net training there are lots of random choices made—that lead to “different-but-equivalent solutions”, like these:

But each such “different solution” will have at least slightly different behavior. And if we ask, say, for an “extrapolation” outside the region where we gave training examples, we can get dramatically different results:

But which of these is “right”? There’s really no way to say. They’re all “consistent with the observed data”. But they all correspond to different “innate” ways to “think about” what to do “outside the box”. And some may seem “more reasonable” to us humans than others.

Particularly over the past decade, there’ve been many advances in the art of training neural nets. And, yes, it is basically an art. Sometimes—especially in retrospect—one can see at least a glimmer of a “scientific explanation” for something that’s being done. But mostly things have been discovered by trial and error, adding ideas and tricks that have progressively built a significant lore about how to work with neural nets.

There are several key parts. First, there’s the matter of what architecture of neural net one should use for a particular task. Then there’s the critical issue of how one’s going to get the data on which to train the neural net. And increasingly one isn’t dealing with training a net from scratch: instead a new net can either directly incorporate another already-trained net, or at least can use that net to generate more training examples for itself.

One might have thought that for every particular kind of task one would need a different architecture of neural net. But what’s been found is that the same architecture often seems to work even for apparently quite different tasks. At some level this reminds one of the idea of universal computation (and my Principle of Computational Equivalence), but, as I’ll discuss later, I think it’s more a reflection of the fact that the tasks we’re typically trying to get neural nets to do are “human-like” ones—and neural nets can capture quite general “human-like processes”.

In earlier days of neural nets, there tended to be the idea that one should “make the neural net do as little as possible”. For example, in converting speech to text it was thought that one should first analyze the audio of the speech, break it into phonemes, etc. But what was found is that—at least for “human-like tasks”—it’s usually better just to try to train the neural net on the “end-to-end problem”, letting it “discover” the necessary intermediate features, encodings, etc. for itself.

There was also the idea that one should introduce complicated individual components into the neural net, to let it in effect “explicitly implement particular algorithmic ideas”. But once again, this has mostly turned out not to be worthwhile; instead, it’s better just to deal with very simple components and let them “organize themselves” (albeit usually in ways we can’t understand) to achieve (presumably) the equivalent of those algorithmic ideas.

That’s not to say that there are no “structuring ideas” that are relevant for neural nets. Thus, for example, having 2D arrays of neurons with local connections seems at least very useful in the early stages of processing images. And having patterns of connectivity that concentrate on “looking back in sequences” seems useful—as we’ll see later—in dealing with things like human language, for example in ChatGPT.

But an important feature of neural nets is that—like computers in general—they’re ultimately just dealing with data. And current neural nets—with current approaches to neural net training—specifically deal with arrays of numbers. But in the course of processing, those arrays can be completely rearranged and reshaped. And as an example, the network we used for identifying digits above starts with a 2D “image-like” array, quickly “thickening” to many channels, but then “concentrating down” into a 1D array that will ultimately contain elements representing the different possible output digits:

But, OK, how can one tell how big a neural net one will need for a particular task? It’s something of an art. At some level the key thing is to know “how hard the task is”. But for human-like tasks that’s typically very hard to estimate. Yes, there may be a systematic way to do the task very “mechanically” by computer. But it’s hard to know if there are what one might think of as tricks or shortcuts that allow one to do the task at least at a “human-like level” vastly more easily. It might take enumerating a giant game tree to “mechanically” play a certain game; but there might be a much easier (“heuristic”) way to achieve “human-level play”.

When one’s dealing with tiny neural nets and simple tasks one can sometimes explicitly see that one “can’t get there from here”. For example, here’s the best one seems to be able to do on the task from the previous section with a few small neural nets:

And what we see is that if the net is too small, it just can’t reproduce the function we want. But above some size, it has no problem—at least if one trains it for long enough, with enough examples. And, by the way, these pictures illustrate a piece of neural net lore: that one can often get away with a smaller network if there’s a “squeeze” in the middle that forces everything to go through a smaller intermediate number of neurons. (It’s also worth mentioning that “no-intermediate-layer”—or so-called “perceptron”—networks can only learn essentially linear functions—but as soon as there’s even one intermediate layer it’s always in principle possible to approximate any function arbitrarily well, at least if one has enough neurons, though to make it feasibly trainable one typically has some kind of regularization or normalization.)

OK, so let’s say one’s settled on a certain neural net architecture. Now there’s the issue of getting data to train the network with. And many of the practical challenges around neural nets—and machine learning in general—center on acquiring or preparing the necessary training data. In many cases (“supervised learning”) one wants to get explicit examples of inputs and the outputs one is expecting from them. Thus, for example, one might want images tagged by what’s in them, or some other attribute. And maybe one will have to explicitly go through—usually with great effort—and do the tagging. But very often it turns out to be possible to piggyback on something that’s already been done, or use it as some kind of proxy. And so, for example, one might use alt tags that have been provided for images on the web. Or, in a different domain, one might use closed captions that have been created for videos. Or—for language translation training—one might use parallel versions of webpages or other documents that exist in different languages.

How much data do you need to show a neural net to train it for a particular task? Again, it’s hard to estimate from first principles. Certainly the requirements can be dramatically reduced by using “transfer learning” to “transfer in” things like lists of important features that have already been learned in another network. But generally neural nets need to “see a lot of examples” to train well. And at least for some tasks it’s an important piece of neural net lore that the examples can be incredibly repetitive. And indeed it’s a standard strategy to just show a neural net all the examples one has, over and over again. In each of these “training rounds” (or “epochs”) the neural net will be in at least a slightly different state, and somehow “reminding it” of a particular example is useful in getting it to “remember that example”. (And, yes, perhaps this is analogous to the usefulness of repetition in human memorization.)

But often just repeating the same example over and over again isn’t enough. It’s also necessary to show the neural net variations of the example. And it’s a feature of neural net lore that those “data augmentation” variations don’t have to be sophisticated to be useful. Just slightly modifying images with basic image processing can make them essentially “as good as new” for neural net training. And, similarly, when one’s run out of actual video, etc. for training self-driving cars, one can go on and just get data from running simulations in a model videogame-like environment without all the detail of actual real-world scenes.

How about something like ChatGPT? Well, it has the nice feature that it can do “unsupervised learning”, making it much easier to get it examples to train from. Recall that the basic task for ChatGPT is to figure out how to continue a piece of text that it’s been given. So to get it “training examples” all one has to do is get a piece of text, and mask out the end of it, and then use this as the “input to train from”—with the “output” being the complete, unmasked piece of text. We’ll discuss this more later, but the main point is that—unlike, say, for learning what’s in images—there’s no “explicit tagging” needed; ChatGPT can in effect just learn directly from whatever examples of text it’s given.

OK, so what about the actual learning process in a neural net? In the end it’s all about determining what weights will best capture the training examples that have been given. And there are all sorts of detailed choices and “hyperparameter settings” (so called because the weights can be thought of as “parameters”) that can be used to tweak how this is done. There are different choices of loss function (sum of squares, sum of absolute values, etc.). There are different ways to do loss minimization (how far in weight space to move at each step, etc.). And then there are questions like how big a “batch” of examples to show to get each successive estimate of the loss one’s trying to minimize. And, yes, one can apply machine learning (as we do, for example, in Wolfram Language) to automate machine learning—and to automatically set things like hyperparameters.

But in the end the whole process of training can be characterized by seeing how the loss progressively decreases (as in this Wolfram Language progress monitor for a small training):

And what one typically sees is that the loss decreases for a while, but eventually flattens out at some constant value. If that value is sufficiently small, then the training can be considered successful; otherwise it’s probably a sign one should try changing the network architecture.

Can one tell how long it should take for the “learning curve” to flatten out? Like for so many other things, there seem to be approximate power-law scaling relationships that depend on the size of neural net and amount of data one’s using. But the general conclusion is that training a neural net is hard—and takes a lot of computational effort. And as a practical matter, the vast majority of that effort is spent doing operations on arrays of numbers, which is what GPUs are good at—which is why neural net training is typically limited by the availability of GPUs.

In the future, will there be fundamentally better ways to train neural nets—or generally do what neural nets do? Almost certainly, I think. The fundamental idea of neural nets is to create a flexible “computing fabric” out of a large number of simple (essentially identical) components—and to have this “fabric” be one that can be incrementally modified to learn from examples. In current neural nets, one’s essentially using the ideas of calculus—applied to real numbers—to do that incremental modification. But it’s increasingly clear that having high-precision numbers doesn’t matter; 8 bits or less might be enough even with current methods.

With computational systems like cellular automata that basically operate in parallel on many individual bits it’s never been clear how to do this kind of incremental modification, but there’s no reason to think it isn’t possible. And in fact, much like with the “deep-learning breakthrough of 2012” it may be that such incremental modification will effectively be easier in more complicated cases than in simple ones.

Neural nets—perhaps a bit like brains—are set up to have an essentially fixed network of neurons, with what’s modified being the strength (“weight”) of connections between them. (Perhaps in at least young brains significant numbers of wholly new connections can also grow.) But while this might be a convenient setup for biology, it’s not at all clear that it’s even close to the best way to achieve the functionality we need. And something that involves the equivalent of progressive network rewriting (perhaps reminiscent of our Physics Project) might well ultimately be better.

But even within the framework of existing neural nets there’s currently a crucial limitation: neural net training as it’s now done is fundamentally sequential, with the effects of each batch of examples being propagated back to update the weights. And indeed with current computer hardware—even taking into account GPUs—most of a neural net is “idle” most of the time during training, with just one part at a time being updated. And in a sense this is because our current computers tend to have memory that is separate from their CPUs (or GPUs). But in brains it’s presumably different—with every “memory element” (i.e. neuron) also being a potentially active computational element. And if we could set up our future computer hardware this way it might become possible to do training much more efficiently.

The capabilities of something like ChatGPT seem so impressive that one might imagine that if one could just “keep going” and train larger and larger neural networks, then they’d eventually be able to “do everything”. And if one’s concerned with things that are readily accessible to immediate human thinking, it’s quite possible that this is the case. But the lesson of the past several hundred years of science is that there are things that can be figured out by formal processes, but aren’t readily accessible to immediate human thinking.

Nontrivial mathematics is one big example. But the general case is really computation. And ultimately the issue is the phenomenon of computational irreducibility. There are some computations which one might think would take many steps to do, but which can in fact be “reduced” to something quite immediate. But the discovery of computational irreducibility implies that this doesn’t always work. And instead there are processes—probably like the one below—where to work out what happens inevitably requires essentially tracing each computational step:

The kinds of things that we normally do with our brains are presumably specifically chosen to avoid computational irreducibility. It takes special effort to do math in one’s brain. And it’s in practice largely impossible to “think through” the steps in the operation of any nontrivial program just in one’s brain.

But of course for that we have computers. And with computers we can readily do long, computationally irreducible things. And the key point is that there’s in general no shortcut for these.

Yes, we could memorize lots of specific examples of what happens in some particular computational system. And maybe we could even see some (“computationally reducible”) patterns that would allow us to do a little generalization. But the point is that computational irreducibility means that we can never guarantee that the unexpected won’t happen—and it’s only by explicitly doing the computation that you can tell what actually happens in any particular case.

And in the end there’s just a fundamental tension between learnability and computational irreducibility. Learning involves in effect compressing data by leveraging regularities. But computational irreducibility implies that ultimately there’s a limit to what regularities there may be.

As a practical matter, one can imagine building little computational devices—like cellular automata or Turing machines—into trainable systems like neural nets. And indeed such devices can serve as good “tools” for the neural net—like Wolfram|Alpha can be a good tool for ChatGPT. But computational irreducibility implies that one can’t expect to “get inside” those devices and have them learn.

Or put another way, there’s an ultimate tradeoff between capability and trainability: the more you want a system to make “true use” of its computational capabilities, the more it’s going to show computational irreducibility, and the less it’s going to be trainable. And the more it’s fundamentally trainable, the less it’s going to be able to do sophisticated computation.

(For ChatGPT as it currently is, the situation is actually much more extreme, because the neural net used to generate each token of output is a pure “feed-forward” network, without loops, and therefore has no ability to do any kind of computation with nontrivial “control flow”.)

Of course, one might wonder whether it’s actually important to be able to do irreducible computations. And indeed for much of human history it wasn’t particularly important. But our modern technological world has been built on engineering that makes use of at least mathematical computations—and increasingly also more general computations. And if we look at the natural world, it’s full of irreducible computation—that we’re slowly understanding how to emulate and use for our technological purposes.

Yes, a neural net can certainly notice the kinds of regularities in the natural world that we might also readily notice with “unaided human thinking”. But if we want to work out things that are in the purview of mathematical or computational science the neural net isn’t going to be able to do it—unless it effectively “uses as a tool” an “ordinary” computational system.

But there’s something potentially confusing about all of this. In the past there were plenty of tasks—including writing essays—that we’ve assumed were somehow “fundamentally too hard” for computers. And now that we see them done by the likes of ChatGPT we tend to suddenly think that computers must have become vastly more powerful—in particular surpassing things they were already basically able to do (like progressively computing the behavior of computational systems like cellular automata).

But this isn’t the right conclusion to draw. Computationally irreducible processes are still computationally irreducible, and are still fundamentally hard for computers—even if computers can readily compute their individual steps. And instead what we should conclude is that tasks—like writing essays—that we humans could do, but we didn’t think computers could do, are actually in some sense computationally easier than we thought.

In other words, the reason a neural net can be successful in writing an essay is because writing an essay turns out to be a “computationally shallower” problem than we thought. And in a sense this takes us closer to “having a theory” of how we humans manage to do things like writing essays, or in general deal with language.

If you had a big enough neural net then, yes, you might be able to do whatever humans can readily do. But you wouldn’t capture what the natural world in general can do—or that the tools that we’ve fashioned from the natural world can do. And it’s the use of those tools—both practical and conceptual—that have allowed us in recent centuries to transcend the boundaries of what’s accessible to “pure unaided human thought”, and capture for human purposes more of what’s out there in the physical and computational universe.

Neural nets—at least as they’re currently set up—are fundamentally based on numbers. So if we’re going to to use them to work on something like text we’ll need a way to represent our text with numbers. And certainly we could start (essentially as ChatGPT does) by just assigning a number to every word in the dictionary. But there’s an important idea—that’s for example central to ChatGPT—that goes beyond that. And it’s the idea of “embeddings”. One can think of an embedding as a way to try to represent the “essence” of something by an array of numbers—with the property that “nearby things” are represented by nearby numbers.

And so, for example, we can think of a word embedding as trying to lay out words in a kind of “meaning space” in which words that are somehow “nearby in meaning” appear nearby in the embedding. The actual embeddings that are used—say in ChatGPT—tend to involve large lists of numbers. But if we project down to 2D, we can show examples of how words are laid out by the embedding:

And, yes, what we see does remarkably well in capturing typical everyday impressions. But how can we construct such an embedding? Roughly the idea is to look at large amounts of text (here 5 billion words from the web) and then see “how similar” the “environments” are in which different words appear. So, for example, “alligator” and “crocodile” will often appear almost interchangeably in otherwise similar sentences, and that means they’ll be placed nearby in the embedding. But “turnip” and “eagle” won’t tend to appear in otherwise similar sentences, so they’ll be placed far apart in the embedding.

But how does one actually implement something like this using neural nets? Let’s start by talking about embeddings not for words, but for images. We want to find some way to characterize images by lists of numbers in such a way that “images we consider similar” are assigned similar lists of numbers.

How do we tell if we should “consider images similar”? Well, if our images are, say, of handwritten digits we might “consider two images similar” if they are of the same digit. Earlier we discussed a neural net that was trained to recognize handwritten digits. And we can think of this neural net as being set up so that in its final output it puts images into 10 different bins, one for each digit.

But what if we “intercept” what’s going on inside the neural net before the final “it’s a ‘4’” decision is made? We might expect that inside the neural net there are numbers that characterize images as being “mostly 4-like but a bit 2-like” or some such. And the idea is to pick up such numbers to use as elements in an embedding.

So here’s the concept. Rather than directly trying to characterize “what image is near what other image”, we instead consider a well-defined task (in this case digit recognition) for which we can get explicit training data—then use the fact that in doing this task the neural net implicitly has to make what amount to “nearness decisions”. So instead of us ever explicitly having to talk about “nearness of images” we’re just talking about the concrete question of what digit an image represents, and then we’re “leaving it to the neural net” to implicitly determine what that implies about “nearness of images”.

So how in more detail does this work for the digit recognition network? We can think of the network as consisting of 11 successive layers, that we might summarize iconically like this (with activation functions shown as separate layers):

At the beginning we’re feeding into the first layer actual images, represented by 2D arrays of pixel values. And at the end—from the last layer—we’re getting out an array of 10 values, which we can think of saying “how certain” the network is that the image corresponds to each of the digits 0 through 9.

Feed in the image and the values of the neurons in that last layer are:

In other words, the neural net is by this point “incredibly certain” that this image is a 4—and to actually get the output “4” we just have to pick out the position of the neuron with the largest value.

But what if we look one step earlier? The very last operation in the network is a so-called softmax which tries to “force certainty”. But before that’s been applied the values of the neurons are:

The neuron representing “4” still has the highest numerical value. But there’s also information in the values of the other neurons. And we can expect that this list of numbers can in a sense be used to characterize the “essence” of the image—and thus to provide something we can use as an embedding. And so, for example, each of the 4’s here has a slightly different “signature” (or “feature embedding”)—all very different from the 8’s:

Here we’re essentially using 10 numbers to characterize our images. But it’s often better to use much more than that. And for example in our digit recognition network we can get an array of 500 numbers by tapping into the preceding layer. And this is probably a reasonable array to use as an “image embedding”.

If we want to make an explicit visualization of “image space” for handwritten digits we need to “reduce the dimension”, effectively by projecting the 500-dimensional vector we’ve got into, say, 3D space:

We’ve just talked about creating a characterization (and thus embedding) for images based effectively on identifying the similarity of images by determining whether (according to our training set) they correspond to the same handwritten digit. And we can do the same thing much more generally for images if we have a training set that identifies, say, which of 5000 common types of object (cat, dog, chair, …) each image is of. And in this way we can make an image embedding that’s “anchored” by our identification of common objects, but then “generalizes around that” according to the behavior of the neural net. And the point is that insofar as that behavior aligns with how we humans perceive and interpret images, this will end up being an embedding that “seems right to us”, and is useful in practice in doing “human-judgement-like” tasks.

OK, so how do we follow the same kind of approach to find embeddings for words? The key is to start from a task about words for which we can readily do training. And the standard such task is “word prediction”. Imagine we’re given “the ___ cat”. Based on a large corpus of text (say, the text content of the web), what are the probabilities for different words that might “fill in the blank”? Or, alternatively, given “___ black ___” what are the probabilities for different “flanking words”?

How do we set this problem up for a neural net? Ultimately we have to formulate everything in terms of numbers. And one way to do this is just to assign a unique number to each of the 50,000 or so common words in English. So, for example, “the” might be 914, and “ cat” (with a space before it) might be 3542. (And these are the actual numbers used by GPT-2.) So for the “the ___ cat” problem, our input might be {914, 3542}. What should the output be like? Well, it should be a list of 50,000 or so numbers that effectively give the probabilities for each of the possible “fill-in” words. And once again, to find an embedding, we want to “intercept” the “insides” of the neural net just before it “reaches its conclusion”—and then pick up the list of numbers that occur there, and that we can think of as “characterizing each word”.

OK, so what do those characterizations look like? Over the past 10 years there’ve been a sequence of different systems developed (word2vec, GloVe, BERT, GPT, …), each based on a different neural net approach. But ultimately all of them take words and characterize them by lists of hundreds to thousands of numbers.

In their raw form, these “embedding vectors” are quite uninformative. For example, here’s what GPT-2 produces as the raw embedding vectors for three specific words:

If we do things like measure distances between these vectors, then we can find things like “nearnesses” of words. Later we’ll discuss in more detail what we might consider the “cognitive” significance of such embeddings. But for now the main point is that we have a way to usefully turn words into “neural-net-friendly” collections of numbers.

But actually we can go further than just characterizing words by collections of numbers; we can also do this for sequences of words, or indeed whole blocks of text. And inside ChatGPT that’s how it’s dealing with things. It takes the text it’s got so far, and generates an embedding vector to represent it. Then its goal is to find the probabilities for different words that might occur next. And it represents its answer for this as a list of numbers that essentially give the probabilities for each of the 50,000 or so possible words.

(Strictly, ChatGPT does not deal with words, but rather with “tokens”—convenient linguistic units that might be whole words, or might just be pieces like “pre” or “ing” or “ized”. Working with tokens makes it easier for ChatGPT to handle rare, compound and non-English words, and, sometimes, for better or worse, to invent new words.)

OK, so we’re finally ready to discuss what’s inside ChatGPT. And, yes, ultimately, it’s a giant neural net—currently a version of the so-called GPT-3 network with 175 billion weights. In many ways this is a neural net very much like the other ones we’ve discussed. But it’s a neural net that’s particularly set up for dealing with language. And its most notable feature is a piece of neural net architecture called a “transformer”.

In the first neural nets we discussed above, every neuron at any given layer was basically connected (at least with some weight) to every neuron on the layer before. But this kind of fully connected network is (presumably) overkill if one’s working with data that has particular, known structure. And thus, for example, in the early stages of dealing with images, it’s typical to use so-called convolutional neural nets (“convnets”) in which neurons are effectively laid out on a grid analogous to the pixels in the image—and connected only to neurons nearby on the grid.

The idea of transformers is to do something at least somewhat similar for sequences of tokens that make up a piece of text. But instead of just defining a fixed region in the sequence over which there can be connections, transformers instead introduce the notion of “attention”—and the idea of “paying attention” more to some parts of the sequence than others. Maybe one day it’ll make sense to just start a generic neural net and do all customization through training. But at least as of now it seems to be critical in practice to “modularize” things—as transformers do, and probably as our brains also do.

OK, so what does ChatGPT (or, rather, the GPT-3 network on which it’s based) actually do? Recall that its overall goal is to continue text in a “reasonable” way, based on what it’s seen from the training it’s had (which consists in looking at billions of pages of text from the web, etc.) So at any given point, it’s got a certain amount of text—and its goal is to come up with an appropriate choice for the next token to add.

It operates in three basic stages. First, it takes the sequence of tokens that corresponds to the text so far, and finds an embedding (i.e. an array of numbers) that represents these. Then it operates on this embedding—in a “standard neural net way”, with values “rippling through” successive layers in a network—to produce a new embedding (i.e. a new array of numbers). It then takes the last part of this array and generates from it an array of about 50,000 values that turn into probabilities for different possible next tokens. (And, yes, it so happens that there are about the same number of tokens used as there are common words in English, though only about 3000 of the tokens are whole words, and the rest are fragments.)

A critical point is that every part of this pipeline is implemented by a neural network, whose weights are determined by end-to-end training of the network. In other words, in effect nothing except the overall architecture is “explicitly engineered”; everything is just “learned” from training data.

There are, however, plenty of details in the way the architecture is set up—reflecting all sorts of experience and neural net lore. And—even though this is definitely going into the weeds—I think it’s useful to talk about some of those details, not least to get a sense of just what goes into building something like ChatGPT.

First comes the embedding module. Here’s a schematic Wolfram Language representation for it for GPT-2:

The input is a vector of *n* tokens (represented as in the previous section by integers from 1 to about 50,000). Each of these tokens is converted (by a single-layer neural net) into an embedding vector (of length 768 for GPT-2 and 12,288 for ChatGPT’s GPT-3). Meanwhile, there’s a “secondary pathway” that takes the sequence of (integer) positions for the tokens, and from these integers creates another embedding vector. And finally the embedding vectors from the token value and the token position are added together—to produce the final sequence of embedding vectors from the embedding module.

Why does one just add the token-value and token-position embedding vectors together? I don’t think there’s any particular science to this. It’s just that various different things have been tried, and this is one that seems to work. And it’s part of the lore of neural nets that—in some sense—so long as the setup one has is “roughly right” it’s usually possible to home in on details just by doing sufficient training, without ever really needing to “understand at an engineering level” quite how the neural net has ended up configuring itself.

Here’s what the embedding module does, operating on the string *hello hello hello hello hello hello hello hello hello hello bye bye bye bye bye bye bye bye bye bye*:

The elements of the embedding vector for each token are shown down the page, and across the page we see first a run of “*hello*” embeddings, followed by a run of “*bye*” ones. The second array above is the positional embedding—with its somewhat-random-looking structure being just what “happened to be learned” (in this case in GPT-2).

OK, so after the embedding module comes the “main event” of the transformer: a sequence of so-called “attention blocks” (12 for GPT-2, 96 for ChatGPT’s GPT-3). It’s all pretty complicated—and reminiscent of typical large hard-to-understand engineering systems, or, for that matter, biological systems. But anyway, here’s a schematic representation of a single “attention block” (for GPT-2):

Within each such attention block there are a collection of “attention heads” (12 for GPT-2, 96 for ChatGPT’s GPT-3)—each of which operates independently on different chunks of values in the embedding vector. (And, yes, we don’t know any particular reason why it’s a good idea to split up the embedding vector, or what the different parts of it “mean”; this is just one of those things that’s been “found to work”.)

OK, so what do the attention heads do? Basically they’re a way of “looking back” in the sequence of tokens (i.e. in the text produced so far), and “packaging up the past” in a form that’s useful for finding the next token. In the first section above we talked about using 2-gram probabilities to pick words based on their immediate predecessors. What the “attention” mechanism in transformers does is to allow “attention to” even much earlier words—thus potentially capturing the way, say, verbs can refer to nouns that appear many words before them in a sentence.

At a more detailed level, what an attention head does is to recombine chunks in the embedding vectors associated with different tokens, with certain weights. And so, for example, the 12 attention heads in the first attention block (in GPT-2) have the following (“look-back-all-the-way-to-the-beginning-of-the-sequence-of-tokens”) patterns of “recombination weights” for the “*hello*, *bye*” string above:

After being processed by the attention heads, the resulting “re-weighted embedding vector” (of length 768 for GPT-2 and length 12,288 for ChatGPT’s GPT-3) is passed through a standard “fully connected” neural net layer. It’s hard to get a handle on what this layer is doing. But here’s a plot of the 768×768 matrix of weights it’s using (here for GPT-2):

Taking 64×64 moving averages, some (random-walk-ish) structure begins to emerge:

What determines this structure? Ultimately it’s presumably some “neural net encoding” of features of human language. But as of now, what those features might be is quite unknown. In effect, we’re “opening up the brain of ChatGPT” (or at least GPT-2) and discovering, yes, it’s complicated in there, and we don’t understand it—even though in the end it’s producing recognizable human language.

OK, so after going through one attention block, we’ve got a new embedding vector—which is then successively passed through additional attention blocks (a total of 12 for GPT-2; 96 for GPT-3). Each attention block has its own particular pattern of “attention” and “fully connected” weights. Here for GPT-2 are the sequence of attention weights for the “hello, bye” input, for the first attention head:

And here are the (moving-averaged) “matrices” for the fully connected layers:

Curiously, even though these “matrices of weights” in different attention blocks look quite similar, the distributions of the sizes of weights can be somewhat different (and are not always Gaussian):

So after going through all these attention blocks what is the net effect of the transformer? Essentially it’s to transform the original collection of embeddings for the sequence of tokens to a final collection. And the particular way ChatGPT works is then to pick up the last embedding in this collection, and “decode” it to produce a list of probabilities for what token should come next.

So that’s in outline what’s inside ChatGPT. It may seem complicated (not least because of its many inevitably somewhat arbitrary “engineering choices”), but actually the ultimate elements involved are remarkably simple. Because in the end what we’re dealing with is just a neural net made of “artificial neurons”, each doing the simple operation of taking a collection of numerical inputs, and then combining them with certain weights.

The original input to ChatGPT is an array of numbers (the embedding vectors for the tokens so far), and what happens when ChatGPT “runs” to produce a new token is just that these numbers “ripple through” the layers of the neural net, with each neuron “doing its thing” and passing the result to neurons on the next layer. There’s no looping or “going back”. Everything just “feeds forward” through the network.

It’s a very different setup from a typical computational system—like a Turing machine—in which results are repeatedly “reprocessed” by the same computational elements. Here—at least in generating a given token of output—each computational element (i.e. neuron) is used only once.

But there is in a sense still an “outer loop” that reuses computational elements even in ChatGPT. Because when ChatGPT is going to generate a new token, it always “reads” (i.e. takes as input) the whole sequence of tokens that come before it, including tokens that ChatGPT itself has “written” previously. And we can think of this setup as meaning that ChatGPT does—at least at its outermost level—involve a “feedback loop”, albeit one in which every iteration is explicitly visible as a token that appears in the text that it generates.

But let’s come back to the core of ChatGPT: the neural net that’s being repeatedly used to generate each token. At some level it’s very simple: a whole collection of identical artificial neurons. And some parts of the network just consist of (“fully connected”) layers of neurons in which every neuron on a given layer is connected (with some weight) to every neuron on the layer before. But particularly with its transformer architecture, ChatGPT has parts with more structure, in which only specific neurons on different layers are connected. (Of course, one could still say that “all neurons are connected”—but some just have zero weight.)

In addition, there are aspects of the neural net in ChatGPT that aren’t most naturally thought of as just consisting of “homogeneous” layers. And for example—as the iconic summary above indicates—inside an attention block there are places where “multiple copies are made” of incoming data, each then going through a different “processing path”, potentially involving a different number of layers, and only later recombining. But while this may be a convenient representation of what’s going on, it’s always at least in principle possible to think of “densely filling in” layers, but just having some weights be zero.

If one looks at the longest path through ChatGPT, there are about 400 (core) layers involved—in some ways not a huge number. But there are millions of neurons—with a total of 175 billion connections and therefore 175 billion weights. And one thing to realize is that every time ChatGPT generates a new token, it has to do a calculation involving every single one of these weights. Implementationally these calculations can be somewhat organized “by layer” into highly parallel array operations that can conveniently be done on GPUs. But for each token that’s produced, there still have to be 175 billion calculations done (and in the end a bit more)—so that, yes, it’s not surprising that it can take a while to generate a long piece of text with ChatGPT.

But in the end, the remarkable thing is that all these operations—individually as simple as they are—can somehow together manage to do such a good “human-like” job of generating text. It has to be emphasized again that (at least so far as we know) there’s no “ultimate theoretical reason” why anything like this should work. And in fact, as we’ll discuss, I think we have to view this as a—potentially surprising—scientific discovery: that somehow in a neural net like ChatGPT’s it’s possible to capture the essence of what human brains manage to do in generating language.

OK, so we’ve now given an outline of how ChatGPT works once it’s set up. But how did it get set up? How were all those 175 billion weights in its neural net determined? Basically they’re the result of very large-scale training, based on a huge corpus of text—on the web, in books, etc.—written by humans. As we’ve said, even given all that training data, it’s certainly not obvious that a neural net would be able to successfully produce “human-like” text. And, once again, there seem to be detailed pieces of engineering needed to make that happen. But the big surprise—and discovery—of ChatGPT is that it’s possible at all. And that—in effect—a neural net with “just” 175 billion weights can make a “reasonable model” of text humans write.

In modern times, there’s lots of text written by humans that’s out there in digital form. The public web has at least several billion human-written pages, with altogether perhaps a trillion words of text. And if one includes non-public webpages, the numbers might be at least 100 times larger. So far, more than 5 million digitized books have been made available (out of 100 million or so that have ever been published), giving another 100 billion or so words of text. And that’s not even mentioning text derived from speech in videos, etc. (As a personal comparison, my total lifetime output of published material has been a bit under 3 million words, and over the past 30 years I’ve written about 15 million words of email, and altogether typed perhaps 50 million words—and in just the past couple of years I’ve spoken more than 10 million words on livestreams. And, yes, I’ll train a bot from all of that.)

But, OK, given all this data, how does one train a neural net from it? The basic process is very much as we discussed it in the simple examples above. You present a batch of examples, and then you adjust the weights in the network to minimize the error (“loss”) that the network makes on those examples. The main thing that’s expensive about “back propagating” from the error is that each time you do this, every weight in the network will typically change at least a tiny bit, and there are just a lot of weights to deal with. (The actual “back computation” is typically only a small constant factor harder than the forward one.)

With modern GPU hardware, it’s straightforward to compute the results from batches of thousands of examples in parallel. But when it comes to actually updating the weights in the neural net, current methods require one to do this basically batch by batch. (And, yes, this is probably where actual brains—with their combined computation and memory elements—have, for now, at least an architectural advantage.)

Even in the seemingly simple cases of learning numerical functions that we discussed earlier, we found we often had to use millions of examples to successfully train a network, at least from scratch. So how many examples does this mean we’ll need in order to train a “human-like language” model? There doesn’t seem to be any fundamental “theoretical” way to know. But in practice ChatGPT was successfully trained on a few hundred billion words of text.

Some of the text it was fed several times, some of it only once. But somehow it “got what it needed” from the text it saw. But given this volume of text to learn from, how large a network should it require to “learn it well”? Again, we don’t yet have a fundamental theoretical way to say. Ultimately—as we’ll discuss further below—there’s presumably a certain “total algorithmic content” to human language and what humans typically say with it. But the next question is how efficient a neural net will be at implementing a model based on that algorithmic content. And again we don’t know—although the success of ChatGPT suggests it’s reasonably efficient.

And in the end we can just note that ChatGPT does what it does using a couple hundred billion weights—comparable in number to the total number of words (or tokens) of training data it’s been given. In some ways it’s perhaps surprising (though empirically observed also in smaller analogs of ChatGPT) that the “size of the network” that seems to work well is so comparable to the “size of the training data”. After all, it’s certainly not that somehow “inside ChatGPT” all that text from the web and books and so on is “directly stored”. Because what’s actually inside ChatGPT are a bunch of numbers—with a bit less than 10 digits of precision—that are some kind of distributed encoding of the aggregate structure of all that text.

Put another way, we might ask what the “effective information content” is of human language and what’s typically said with it. There’s the raw corpus of examples of language. And then there’s the representation in the neural net of ChatGPT. That representation is very likely far from the “algorithmically minimal” representation (as we’ll discuss below). But it’s a representation that’s readily usable by the neural net. And in this representation it seems there’s in the end rather little “compression” of the training data; it seems on average to basically take only a bit less than one neural net weight to carry the “information content” of a word of training data.

When we run ChatGPT to generate text, we’re basically having to use each weight once. So if there are *n* weights, we’ve got of order *n* computational steps to do—though in practice many of them can typically be done in parallel in GPUs. But if we need about *n* words of training data to set up those weights, then from what we’ve said above we can conclude that we’ll need about *n*^{2} computational steps to do the training of the network—which is why, with current methods, one ends up needing to talk about billion-dollar training efforts.

The majority of the effort in training ChatGPT is spent “showing it” large amounts of existing text from the web, books, etc. But it turns out there’s another—apparently rather important—part too.

As soon as it’s finished its “raw training” from the original corpus of text it’s been shown, the neural net inside ChatGPT is ready to start generating its own text, continuing from prompts, etc. But while the results from this may often seem reasonable, they tend—particularly for longer pieces of text—to “wander off” in often rather non-human-like ways. It’s not something one can readily detect, say, by doing traditional statistics on the text. But it’s something that actual humans reading the text easily notice.

And a key idea in the construction of ChatGPT was to have another step after “passively reading” things like the web: to have actual humans actively interact with ChatGPT, see what it produces, and in effect give it feedback on “how to be a good chatbot”. But how can the neural net use that feedback? The first step is just to have humans rate results from the neural net. But then another neural net model is built that attempts to predict those ratings. But now this prediction model can be run—essentially like a loss function—on the original network, in effect allowing that network to be “tuned up” by the human feedback that’s been given. And the results in practice seem to have a big effect on the success of the system in producing “human-like” output.

In general, it’s interesting how little “poking” the “originally trained” network seems to need to get it to usefully go in particular directions. One might have thought that to have the network behave as if it’s “learned something new” one would have to go in and run a training algorithm, adjusting weights, and so on.

But that’s not the case. Instead, it seems to be sufficient to basically tell ChatGPT something one time—as part of the prompt you give—and then it can successfully make use of what you told it when it generates text. And once again, the fact that this works is, I think, an important clue in understanding what ChatGPT is “really doing” and how it relates to the structure of human language and thinking.

There’s certainly something rather human-like about it: that at least once it’s had all that pre-training you can tell it something just once and it can “remember it”—at least “long enough” to generate a piece of text using it. So what’s going on in a case like this? It could be that “everything you might tell it is already in there somewhere”—and you’re just leading it to the right spot. But that doesn’t seem plausible. Instead, what seems more likely is that, yes, the elements are already in there, but the specifics are defined by something like a “trajectory between those elements” and that’s what you’re introducing when you tell it something.

And indeed, much like for humans, if you tell it something bizarre and unexpected that completely doesn’t fit into the framework it knows, it doesn’t seem like it’ll successfully be able to “integrate” this. It can “integrate” it only if it’s basically riding in a fairly simple way on top of the framework it already has.

It’s also worth pointing out again that there are inevitably “algorithmic limits” to what the neural net can “pick up”. Tell it “shallow” rules of the form “this goes to that”, etc., and the neural net will most likely be able to represent and reproduce these just fine—and indeed what it “already knows” from language will give it an immediate pattern to follow. But try to give it rules for an actual “deep” computation that involves many potentially computationally irreducible steps and it just won’t work. (Remember that at each step it’s always just “feeding data forward” in its network, never looping except by virtue of generating new tokens.)

Of course, the network can learn the answer to specific “irreducible” computations. But as soon as there are combinatorial numbers of possibilities, no such “table-lookup-style” approach will work. And so, yes, just like humans, it’s time then for neural nets to “reach out” and use actual computational tools. (And, yes, Wolfram|Alpha and Wolfram Language are uniquely suitable, because they’ve been built to “talk about things in the world”, just like the language-model neural nets.)

Human language—and the processes of thinking involved in generating it—have always seemed to represent a kind of pinnacle of complexity. And indeed it’s seemed somewhat remarkable that human brains—with their network of a “mere” 100 billion or so neurons (and maybe 100 trillion connections) could be responsible for it. Perhaps, one might have imagined, there’s something more to brains than their networks of neurons—like some new layer of undiscovered physics. But now with ChatGPT we’ve got an important new piece of information: we know that a pure, artificial neural network with about as many connections as brains have neurons is capable of doing a surprisingly good job of generating human language.

And, yes, that’s still a big and complicated system—with about as many neural net weights as there are words of text currently available out there in the world. But at some level it still seems difficult to believe that all the richness of language and the things it can talk about can be encapsulated in such a finite system. Part of what’s going on is no doubt a reflection of the ubiquitous phenomenon (that first became evident in the example of rule 30) that computational processes can in effect greatly amplify the apparent complexity of systems even when their underlying rules are simple. But, actually, as we discussed above, neural nets of the kind used in ChatGPT tend to be specifically constructed to restrict the effect of this phenomenon—and the computational irreducibility associated with it—in the interest of making their training more accessible.

So how is it, then, that something like ChatGPT can get as far as it does with language? The basic answer, I think, is that language is at a fundamental level somehow simpler than it seems. And this means that ChatGPT—even with its ultimately straightforward neural net structure—is successfully able to “capture the essence” of human language and the thinking behind it. And moreover, in its training, ChatGPT has somehow “implicitly discovered” whatever regularities in language (and thinking) make this possible.

The success of ChatGPT is, I think, giving us evidence of a fundamental and important piece of science: it’s suggesting that we can expect there to be major new “laws of language”—and effectively “laws of thought”—out there to discover. In ChatGPT—built as it is as a neural net—those laws are at best implicit. But if we could somehow make the laws explicit, there’s the potential to do the kinds of things ChatGPT does in vastly more direct, efficient—and transparent—ways.

But, OK, so what might these laws be like? Ultimately they must give us some kind of prescription for how language—and the things we say with it—are put together. Later we’ll discuss how “looking inside ChatGPT” may be able to give us some hints about this, and how what we know from building computational language suggests a path forward. But first let’s discuss two long-known examples of what amount to “laws of language”—and how they relate to the operation of ChatGPT.

The first is the syntax of language. Language is not just a random jumble of words. Instead, there are (fairly) definite grammatical rules for how words of different kinds can be put together: in English, for example, nouns can be preceded by adjectives and followed by verbs, but typically two nouns can’t be right next to each other. Such grammatical structure can (at least approximately) be captured by a set of rules that define how what amount to “parse trees” can be put together:

ChatGPT doesn’t have any explicit “knowledge” of such rules. But somehow in its training it implicitly “discovers” them—and then seems to be good at following them. So how does this work? At a “big picture” level it’s not clear. But to get some insight it’s perhaps instructive to look at a much simpler example.

Consider a “language” formed from sequences of (’s and )’s, with a grammar that specifies that parentheses should always be balanced, as represented by a parse tree like:

Can we train a neural net to produce “grammatically correct” parenthesis sequences? There are various ways to handle sequences in neural nets, but let’s use transformer nets, as ChatGPT does. And given a simple transformer net, we can start feeding it grammatically correct parenthesis sequences as training examples. A subtlety (which actually also appears in ChatGPT’s generation of human language) is that in addition to our “content tokens” (here “(” and “)”) we have to include an “End” token, that’s generated to indicate that the output shouldn’t continue any further (i.e. for ChatGPT, that one’s reached the “end of the story”).

If we set up a transformer net with just one attention block with 8 heads and feature vectors of length 128 (ChatGPT also uses feature vectors of length 128, but has 96 attention blocks, each with 96 heads) then it doesn’t seem possible to get it to learn much about parenthesis language. But with 2 attention blocks, the learning process seems to converge—at least after 10 million or so examples have been given (and, as is common with transformer nets, showing yet more examples just seems to degrade its performance).

So with this network, we can do the analog of what ChatGPT does, and ask for probabilities for what the next token should be—in a parenthesis sequence:

And in the first case, the network is “pretty sure” that the sequence can’t end here—which is good, because if it did, the parentheses would be left unbalanced. In the second case, however, it “correctly recognizes” that the sequence can end here, though it also “points out” that it’s possible to “start again”, putting down a “(”, presumably with a “)” to follow. But, oops, even with its 400,000 or so laboriously trained weights, it says there’s a 15% probability to have “)” as the next token—which isn’t right, because that would necessarily lead to an unbalanced parenthesis.

Here’s what we get if we ask the network for the highest-probability completions for progressively longer sequences of (’s:

And, yes, up to a certain length the network does just fine. But then it starts failing. It’s a pretty typical kind of thing to see in a “precise” situation like this with a neural net (or with machine learning in general). Cases that a human “can solve in a glance” the neural net can solve too. But cases that require doing something “more algorithmic” (e.g. explicitly counting parentheses to see if they’re closed) the neural net tends to somehow be “too computationally shallow” to reliably do. (By the way, even the full current ChatGPT has a hard time correctly matching parentheses in long sequences.)

So what does this mean for things like ChatGPT and the syntax of a language like English? The parenthesis language is “austere”—and much more of an “algorithmic story”. But in English it’s much more realistic to be able to “guess” what’s grammatically going to fit on the basis of local choices of words and other hints. And, yes, the neural net is much better at this—even though perhaps it might miss some “formally correct” case that, well, humans might miss as well. But the main point is that the fact that there’s an overall syntactic structure to the language—with all the regularity that implies—in a sense limits “how much” the neural net has to learn. And a key “natural-science-like” observation is that the transformer architecture of neural nets like the one in ChatGPT seems to successfully be able to learn the kind of nested-tree-like syntactic structure that seems to exist (at least in some approximation) in all human languages.

Syntax provides one kind of constraint on language. But there are clearly more. A sentence like “Inquisitive electrons eat blue theories for fish” is grammatically correct but isn’t something one would normally expect to say, and wouldn’t be considered a success if ChatGPT generated it—because, well, with the normal meanings for the words in it, it’s basically meaningless.

But is there a general way to tell if a sentence is meaningful? There’s no traditional overall theory for that. But it’s something that one can think of ChatGPT as having implicitly “developed a theory for” after being trained with billions of (presumably meaningful) sentences from the web, etc.

What might this theory be like? Well, there’s one tiny corner that’s basically been known for two millennia, and that’s logic. And certainly in the syllogistic form in which Aristotle discovered it, logic is basically a way of saying that sentences that follow certain patterns are reasonable, while others are not. Thus, for example, it’s reasonable to say “All X are Y. This is not Y, so it’s not an X” (as in “All fishes are blue. This is not blue, so it’s not a fish.”). And just as one can somewhat whimsically imagine that Aristotle discovered syllogistic logic by going (“machine-learning-style”) through lots of examples of rhetoric, so too one can imagine that in the training of ChatGPT it will have been able to “discover syllogistic logic” by looking at lots of text on the web, etc. (And, yes, while one can therefore expect ChatGPT to produce text that contains “correct inferences” based on things like syllogistic logic, it’s a quite different story when it comes to more sophisticated formal logic—and I think one can expect it to fail here for the same kind of reasons it fails in parenthesis matching.)

But beyond the narrow example of logic, what can be said about how to systematically construct (or recognize) even plausibly meaningful text? Yes, there are things like Mad Libs that use very specific “phrasal templates”. But somehow ChatGPT implicitly has a much more general way to do it. And perhaps there’s nothing to be said about how it can be done beyond “somehow it happens when you have 175 billion neural net weights”. But I strongly suspect that there’s a much simpler and stronger story.

We discussed above that inside ChatGPT any piece of text is effectively represented by an array of numbers that we can think of as coordinates of a point in some kind of “linguistic feature space”. So when ChatGPT continues a piece of text this corresponds to tracing out a trajectory in linguistic feature space. But now we can ask what makes this trajectory correspond to text we consider meaningful. And might there perhaps be some kind of “semantic laws of motion” that define—or at least constrain—how points in linguistic feature space can move around while preserving “meaningfulness”?

So what is this linguistic feature space like? Here’s an example of how single words (here, common nouns) might get laid out if we project such a feature space down to 2D:

We saw another example above based on words representing plants and animals. But the point in both cases is that “semantically similar words” are placed nearby.

As another example, here’s how words corresponding to different parts of speech get laid out:

Of course, a given word doesn’t in general just have “one meaning” (or necessarily correspond to just one part of speech). And by looking at how sentences containing a word lay out in feature space, one can often “tease apart” different meanings—as in the example here for the word “crane” (bird or machine?):

OK, so it’s at least plausible that we can think of this feature space as placing “words nearby in meaning” close in this space. But what kind of additional structure can we identify in this space? Is there for example some kind of notion of “parallel transport” that would reflect “flatness” in the space? One way to get a handle on that is to look at analogies:

And, yes, even when we project down to 2D, there’s often at least a “hint of flatness”, though it’s certainly not universally seen.

So what about trajectories? We can look at the trajectory that a prompt for ChatGPT follows in feature space—and then we can see how ChatGPT continues that:

There’s certainly no “geometrically obvious” law of motion here. And that’s not at all surprising; we fully expect this to be a considerably more complicated story. And, for example, it’s far from obvious that even if there is a “semantic law of motion” to be found, what kind of embedding (or, in effect, what “variables”) it’ll most naturally be stated in.

In the picture above, we’re showing several steps in the “trajectory”—where at each step we’re picking the word that ChatGPT considers the most probable (the “zero temperature” case). But we can also ask what words can “come next” with what probabilities at a given point:

And what we see in this case is that there’s a “fan” of high-probability words that seems to go in a more or less definite direction in feature space. What happens if we go further? Here are the successive “fans” that appear as we “move along” the trajectory:

Here’s a 3D representation, going for a total of 40 steps:

And, yes, this seems like a mess—and doesn’t do anything to particularly encourage the idea that one can expect to identify “mathematical-physics-like” “semantic laws of motion” by empirically studying “what ChatGPT is doing inside”. But perhaps we’re just looking at the “wrong variables” (or wrong coordinate system) and if only we looked at the right one, we’d immediately see that ChatGPT is doing something “mathematical-physics-simple” like following geodesics. But as of now, we’re not ready to “empirically decode” from its “internal behavior” what ChatGPT has “discovered” about how human language is “put together”.

What does it take to produce “meaningful human language”? In the past, we might have assumed it could be nothing short of a human brain. But now we know it can be done quite respectably by the neural net of ChatGPT. Still, maybe that’s as far as we can go, and there’ll be nothing simpler—or more human understandable—that will work. But my strong suspicion is that the success of ChatGPT implicitly reveals an important “scientific” fact: that there’s actually a lot more structure and simplicity to meaningful human language than we ever knew—and that in the end there may be even fairly simple rules that describe how such language can be put together.

As we mentioned above, syntactic grammar gives rules for how words corresponding to things like different parts of speech can be put together in human language. But to deal with meaning, we need to go further. And one version of how to do this is to think about not just a syntactic grammar for language, but also a semantic one.

For purposes of syntax, we identify things like nouns and verbs. But for purposes of semantics, we need “finer gradations”. So, for example, we might identify the concept of “moving”, and the concept of an “object” that “maintains its identity independent of location”. There are endless specific examples of each of these “semantic concepts”. But for the purposes of our semantic grammar, we’ll just have some general kind of rule that basically says that “objects” can “move”. There’s a lot to say about how all this might work (some of which I’ve said before). But I’ll content myself here with just a few remarks that indicate some of the potential path forward.

It’s worth mentioning that even if a sentence is perfectly OK according to the semantic grammar, that doesn’t mean it’s been realized (or even could be realized) in practice. “The elephant traveled to the Moon” would doubtless “pass” our semantic grammar, but it certainly hasn’t been realized (at least yet) in our actual world—though it’s absolutely fair game for a fictional world.

When we start talking about “semantic grammar” we’re soon led to ask “What’s underneath it?” What “model of the world” is it assuming? A syntactic grammar is really just about the construction of language from words. But a semantic grammar necessarily engages with some kind of “model of the world”—something that serves as a “skeleton” on top of which language made from actual words can be layered.

Until recent times, we might have imagined that (human) language would be the only general way to describe our “model of the world”. Already a few centuries ago there started to be formalizations of specific kinds of things, based particularly on mathematics. But now there’s a much more general approach to formalization: computational language.

And, yes, that’s been my big project over the course of more than four decades (as now embodied in the Wolfram Language): to develop a precise symbolic representation that can talk as broadly as possible about things in the world, as well as abstract things that we care about. And so, for example, we have symbolic representations for cities and molecules and images and neural networks, and we have built-in knowledge about how to compute about those things.

And, after decades of work, we’ve covered a lot of areas in this way. But in the past, we haven’t particularly dealt with “everyday discourse”. In “I bought two pounds of apples” we can readily represent (and do nutrition and other computations on) the “two pounds of apples”. But we don’t (quite yet) have a symbolic representation for “I bought”.

It’s all connected to the idea of semantic grammar—and the goal of having a generic symbolic “construction kit” for concepts, that would give us rules for what could fit together with what, and thus for the “flow” of what we might turn into human language.

But let’s say we had this “symbolic discourse language”. What would we do with it? We could start off doing things like generating “locally meaningful text”. But ultimately we’re likely to want more “globally meaningful” results—which means “computing” more about what can actually exist or happen in the world (or perhaps in some consistent fictional world).

Right now in Wolfram Language we have a huge amount of built-in computational knowledge about lots of kinds of things. But for a complete symbolic discourse language we’d have to build in additional “calculi” about general things in the world: if an object moves from A to B and from B to C, then it’s moved from A to C, etc.

Given a symbolic discourse language we might use it to make “standalone statements”. But we can also use it to ask questions about the world, “Wolfram|Alpha style”. Or we can use it to state things that we “want to make so”, presumably with some external actuation mechanism. Or we can use it to make assertions—perhaps about the actual world, or perhaps about some specific world we’re considering, fictional or otherwise.

Human language is fundamentally imprecise, not least because it isn’t “tethered” to a specific computational implementation, and its meaning is basically defined just by a “social contract” between its users. But computational language, by its nature, has a certain fundamental precision—because in the end what it specifies can always be “unambiguously executed on a computer”. Human language can usually get away with a certain vagueness. (When we say “planet” does it include exoplanets or not, etc.?) But in computational language we have to be precise and clear about all the distinctions we’re making.

It’s often convenient to leverage ordinary human language in making up names in computational language. But the meanings they have in computational language are necessarily precise—and might or might not cover some particular connotation in typical human language usage.

How should one figure out the fundamental “ontology” suitable for a general symbolic discourse language? Well, it’s not easy. Which is perhaps why little has been done in these since the primitive beginnings Aristotle made more than two millennia ago. But it really helps that today we now know so much about how to think about the world computationally (and it doesn’t hurt to have a “fundamental metaphysics” from our Physics Project and the idea of the ruliad).

But what does all this mean in the context of ChatGPT? From its training ChatGPT has effectively “pieced together” a certain (rather impressive) quantity of what amounts to semantic grammar. But its very success gives us a reason to think that it’s going to be feasible to construct something more complete in computational language form. And, unlike what we’ve so far figured out about the innards of ChatGPT, we can expect to design the computational language so that it’s readily understandable to humans.

When we talk about semantic grammar, we can draw an analogy to syllogistic logic. At first, syllogistic logic was essentially a collection of rules about statements expressed in human language. But (yes, two millennia later) when formal logic was developed, the original basic constructs of syllogistic logic could now be used to build huge “formal towers” that include, for example, the operation of modern digital circuitry. And so, we can expect, it will be with more general semantic grammar. At first, it may just be able to deal with simple patterns, expressed, say, as text. But once its whole computational language framework is built, we can expect that it will be able to be used to erect tall towers of “generalized semantic logic”, that allow us to work in a precise and formal way with all sorts of things that have never been accessible to us before, except just at a “ground-floor level” through human language, with all its vagueness.

We can think of the construction of computational language—and semantic grammar—as representing a kind of ultimate compression in representing things. Because it allows us to talk about the essence of what’s possible, without, for example, dealing with all the “turns of phrase” that exist in ordinary human language. And we can view the great strength of ChatGPT as being something a bit similar: because it too has in a sense “drilled through” to the point where it can “put language together in a semantically meaningful way” without concern for different possible turns of phrase.

So what would happen if we applied ChatGPT to underlying computational language? The computational language can describe what’s possible. But what can still be added is a sense of “what’s popular”—based for example on reading all that content on the web. But then—underneath—operating with computational language means that something like ChatGPT has immediate and fundamental access to what amount to ultimate tools for making use of potentially irreducible computations. And that makes it a system that can not only “generate reasonable text”, but can expect to work out whatever can be worked out about whether that text actually makes “correct” statements about the world—or whatever it’s supposed to be talking about.

The basic concept of ChatGPT is at some level rather simple. Start from a huge sample of human-created text from the web, books, etc. Then train a neural net to generate text that’s “like this”. And in particular, make it able to start from a “prompt” and then continue with text that’s “like what it’s been trained with”.

As we’ve seen, the actual neural net in ChatGPT is made up of very simple elements—though billions of them. And the basic operation of the neural net is also very simple, consisting essentially of passing input derived from the text it’s generated so far “once through its elements” (without any loops, etc.) for every new word (or part of a word) that it generates.

But the remarkable—and unexpected—thing is that this process can produce text that’s successfully “like” what’s out there on the web, in books, etc. And not only is it coherent human language, it also “says things” that “follow its prompt” making use of content it’s “read”. It doesn’t always say things that “globally make sense” (or correspond to correct

The specific engineering of ChatGPT has made it quite compelling. But ultimately (at least until it can use outside tools) ChatGPT is “merely” pulling out some “coherent thread of text” from the “statistics of conventional wisdom” that it’s accumulated. But it’s amazing how human-like the results are. And as I’ve discussed, this suggests something that’s at least scientifically very important: that human language (and the patterns of thinking behind it) are somehow simpler and more “law like” in their structure than we thought. ChatGPT has implicitly discovered it. But we can potentially explicitly expose it, with semantic grammar, computational language, etc.

What ChatGPT does in generating text is very impressive—and the results are usually very much like what we humans would produce. So does this mean ChatGPT is working like a brain? Its underlying artificial-neural-net structure was ultimately modeled on an idealization of the brain. And it seems quite likely that when we humans generate language many aspects of what’s going on are quite similar.

When it comes to training (AKA learning) the different “hardware” of the brain and of current computers (as well as, perhaps, some undeveloped algorithmic ideas) forces ChatGPT to use a strategy that’s probably rather different (and in some ways much less efficient) than the brain. And there’s something else as well: unlike even in typical algorithmic computation, ChatGPT doesn’t internally “have loops” or “recompute on data”. And that inevitably limits its computational capability—even with respect to current computers, but definitely with respect to the brain.

It’s not clear how to “fix that” and still maintain the ability to train the system with reasonable efficiency. But to do so will presumably allow a future ChatGPT to do even more “brain-like things”. Of course, there are plenty of things that brains don’t do so well—particularly involving what amount to irreducible computations. And for these both brains and things like ChatGPT have to seek “outside tools”—like Wolfram Language.

But for now it’s exciting to see what ChatGPT has already been able to do. At some level it’s a great example of the fundamental scientific fact that large numbers of simple computational elements can do remarkable and unexpected things. But it also provides perhaps the best impetus we’ve had in two thousand years to understand better just what the fundamental character and principles might be of that central feature of the human condition that is human language and the processes of thinking behind it.

I’ve been following the development of neural nets now for about 43 years, and during that time I’ve interacted with many people about them. Among them—some from long ago, some from recently, and some across many years—have been: Giulio Alessandrini, Dario Amodei, Etienne Bernard, Taliesin Beynon, Sebastian Bodenstein, Greg Brockman, Jack Cowan, Pedro Domingos, Jesse Galef, Roger Germundsson, Robert Hecht-Nielsen, Geoff Hinton, John Hopfield, Yann LeCun, Jerry Lettvin, Jerome Louradour, Marvin Minsky, Eric Mjolsness, Cayden Pierce, Tomaso Poggio, Matteo Salvarezza, Terry Sejnowski, Oliver Selfridge, Gordon Shaw, Jonas Sjöberg, Ilya Sutskever, Gerry Tesauro and Timothee Verdier. For help with this piece, I’d particularly like to thank Giulio Alessandrini and Brad Klee.

Entropy increases. Mechanical work irreversibly turns into heat. The Second Law of thermodynamics is considered one of the great general principles of physical science. But 150 years after it was first introduced, there’s still something deeply mysterious about the Second Law. It almost seems like it’s going to be “provably true”. But one never quite gets there; it always seems to need something extra. Sometimes textbooks will gloss over everything; sometimes they’ll give some kind of “common-sense-but-outside-of-physics argument”. But the mystery of the Second Law has never gone away.

Why does the Second Law work? And does it even in fact always work, or is it actually sometimes violated? What does it really depend on? What would be needed to “prove it”?

For me personally the quest to understand the Second Law has been no less than a 50-year story. But back in the 1980s, as I began to explore the computational universe of simple programs, I discovered a fundamental phenomenon that was immediately reminiscent of the Second Law. And in the 1990s I started to map out just how this phenomenon might finally be able to demystify the Second Law. But it is only now—with ideas that have emerged from our Physics Project—that I think I can pull all the pieces together and finally be able to construct a proper framework to explain why—and to what extent—the Second Law is true.

In its usual conception, the Second Law is a law of thermodynamics, concerned with the dynamics of heat. But it turns out that there’s a vast generalization of it that’s possible. And in fact my key realization is that the Second Law is ultimately just a manifestation of the very same core computational phenomenon that is at the heart of our Physics Project and indeed the whole conception of science that is emerging from our study of the ruliad and the multicomputational paradigm.

It’s all a story of the interplay between underlying computational irreducibility and our nature as computationally bounded observers. Other observers—or even our own future technology—might see things differently. But at least for us now the ubiquity of computational irreducibility leads inexorably to the generation of behavior that we—with our computationally bounded nature—will read as “random”. We might start from something highly ordered (like gas molecules all in the corner of a box) but soon—at least as far as we’re concerned—it will typically seem to “randomize”, just as the Second Law implies.

In the twentieth century there emerged three great physical theories: general relativity, quantum mechanics and statistical mechanics, with the Second Law being the defining phenomenon of statistical mechanics. But while there was a sense that statistical mechanics (and in particular the Second Law) should somehow be “formally derivable”, general relativity and quantum mechanics seemed quite different. But our Physics Project has changed that picture. And the remarkable thing is that it now seems as if all three of general relativity, quantum mechanics and statistical mechanics are actually derivable, and from the same ultimate foundation: the interplay between computational irreducibility and the computational boundedness of observers like us.

The case of statistical mechanics and the Second Law is in some ways simpler than the other two because in statistical mechanics it’s realistic to separate the observer from the system they’re observing, while in general relativity and quantum mechanics it’s essential that the observer be an integral part of the system. It also helps that phenomena about things like molecules in statistical mechanics are much more familiar to us today than those about atoms of space or branches of multiway systems. And by studying the Second Law we’ll be able to develop intuition that we can use elsewhere, say in discussing “molecular” vs. “fluid” levels of description in my recent exploration of the physicalization of the foundations of metamathematics.

The earliest statements of the Second Law were things like: “Heat doesn’t flow from a colder body to a hotter one” or “You can’t systematically purely convert heat to mechanical work”. Later on there came the somewhat more abstract statement “Entropy tends to increase”. But in the end, all these statements boil down to the same idea: that somehow things always tend to get progressively “more random”. What may start in an orderly state will—according to the Second Law—inexorably “degrade” to a “randomized” state.

But how general is this phenomenon? Does it just apply to heat and temperature and molecules and things? Or is it something that applies across a whole range of kinds of systems?

The answer, I believe, is that underneath the Second Law there’s a very general phenomenon that’s extremely robust. And that has the potential to apply to pretty much any kind of system one can imagine.

Here’s a longtime favorite example of mine: the rule 30 cellular automaton:

Start from a simple “orderly” state, here containing just a single non-white cell. Then apply the rule over and over again. The pattern that emerges has some definite, visible structure. But many aspects of it “seem random”. Just as in the Second Law, even starting from something “orderly”, one ends up getting something “random”.

But is it “really random”? It’s completely determined by the initial condition and rule, and you can always recompute it. But the subtle yet critical point is that if you’re just given the output, it can still “seem random” in the sense that no known methods operating purely on this output can find regularities in it.

It’s reminiscent of the situation with something like the digits of π. There’s a fairly simple algorithm for generating these digits. Yet once generated, the digits on their own seem for practical purposes random.

In studying physical systems there’s a long history of assuming that whenever randomness is seen, it somehow comes from outside the system. Maybe it’s the effect of “thermal noise” or “perturbations” acting on the system. Maybe it’s chaos-theory-style “excavation” of higher-order digits supplied through real-number initial conditions. But the surprising discovery I made in the 1980s by looking at things like rule 30 is that actually no such “external source” is needed: instead, it’s perfectly possible for randomness to be generated intrinsically within a system just through the process of applying definite underlying rules.

How can one understand this? The key is to think in computational terms. And ultimately the source of the phenomenon is the interplay between the computational process associated with the actual evolution of the system and the computational processes that our perception of the output of that evolution brings to bear.

We might have thought if a system had a simple underlying rule—like rule 30—then it’d always be straightforward to predict what the system will do. Of course, we could in principle always just run the rule and see what happens. But the question is whether we can expect to “jump ahead” and “find the outcome”, with much less computational effort than the actual evolution of the system involves.

And an important conclusion of a lot of science I did in the 1980s and 1990s is that for many systems—presumably including rule 30—it’s simply not possible to “jump ahead”. And instead the evolution of the system is what I call computationally irreducible—so that it takes an irreducible amount of computational effort to find out what the system does.

Ultimately this is a consequence of what I call the Principle of Computational Equivalence, which states that above some low threshold, systems always end up being equivalent in the sophistication of the computations they perform. And this is why even our brains and our most sophisticated methods of scientific analysis can’t “computationally outrun” even something like rule 30, so that we must consider it computationally irreducible.

So how does this relate to the Second Law? It’s what makes it possible for a system like rule 30 to operate according to a simple underlying rule, yet to intrinsically generate what seems like random behavior. If we could do all the necessary computationally irreducible work then we could in principle “see through” to the simple rules underneath. But the key point (emphasized by our Physics Project) is that observers like us are computationally bounded in our capabilities. And this means that we’re not able to “see through the computational irreducibility”—with the result that the behavior we see “looks random to us”.

And in thermodynamics that “random-looking” behavior is what we associate with heat. And the Second Law assertion that energy associated with systematic mechanical work tends to “degrade into heat” then corresponds to the fact that when there’s computational irreducibility the behavior that’s generated is something we can’t readily “computationally see through”—so that it appears random to us.

Systems like rule 30 make the phenomenon of intrinsic randomness generation particularly clear. But how do such systems relate to the ones that thermodynamics usually studies? The original formulation of the Second Law involved gases, and the vast majority of its applications even today still concern things like gases.

At a basic level, a typical gas consists of a collection of discrete molecules that interact through collisions. And as an idealization of this, we can consider hard spheres that move according to the standard laws of mechanics and undergo perfectly elastic collisions with each other, and with the walls of a container. Here’s an example of a sequence of snapshots from a simulation of such a system, done in 2D:

We begin with an organized “flotilla” of “molecules”, systematically going in a particular direction (and not touching, to avoid a “Newton’s cradle” many-collisions-at-a-time effect). But after these molecules collide with a wall, they quickly start to move in what seem like much more random ways. The original systematic motion is like what happens when one is “doing mechanical work”, say moving a solid object. But what we see is that—just as the Second Law implies—this motion is quickly “degraded” into disordered and seemingly random “heat-like” microscopic motion.

Here’s a “spacetime” view of the behavior:

Looking from far away, with each molecule’s spacetime trajectory shown as a slightly transparent tube, we get:

There’s already some qualitative similarity with the rule 30 behavior we saw above. But there are many detailed differences. And one of the most obvious is that while rule 30 just has a discrete collection of cells, the spheres in the hard-sphere gas can be at any position. And, what’s more, the precise details of their positions can have an increasingly large effect. If two elastic spheres collide perfectly head-on, they’ll bounce back the way they came. But as soon as they’re even slightly off center they’ll bounce back at a different angle, and if they do this repeatedly even the tiniest initial off-centeredness will be arbitrarily amplified:

And, yes, this chaos-theory-like phenomenon makes it very difficult even to do an accurate simulation on a computer with limited numerical precision. But does it actually matter to the core phenomenon of randomization that’s central to the Second Law?

To begin testing this, let’s consider not hard spheres but instead hard squares (where we assume that the squares always stay in the same orientation, and ignore the mechanical torques that would lead to spinning). If we set up the same kind of “flotilla” as before, with the edges of the squares aligned with the walls of the box, then things are symmetrical enough that we don’t see any randomization—and in fact the only nontrivial thing that happens is a little

Viewed in “spacetime” we can see the “flotilla” is just bouncing unchanged off the walls:

But remove even a tiny bit of the symmetry—here by roughly doubling the “masses” of some of the squares and “riffling” their positions (which also avoids singular multi-square collisions)—and we get:

In “spacetime” this becomes

or “from the side”:

So despite the lack of chaos-theory-like amplification behavior (or any associated loss of numerical precision in our simulations), there’s still rapid “degradation” to a certain apparent randomness.

So how much further can we go? In the hard-square gas, the squares can still be at any location, and be moving at any speed in any direction. As a simpler system (that I happened to first investigate a version of nearly 50 years ago), let’s consider a discrete grid in which idealized molecules have discrete directions and are either present or not on each edge:

The system operates in discrete steps, with the molecules at each step moving or “scattering” according to the rules (up to rotations)

and interacting with the “walls” according to:

Running this starting with a “flotilla” we get on successive steps:

Or, sampling every 10 steps:

In “spacetime” this becomes (with the arrows tipped to trace out “worldlines”)

or “from the side”:

And again we see at least a certain level of “randomization”. With this model we’re getting quite close to the setup of something like rule 30. And reformulating this same model we can get even closer. Instead of having “particles” with explicit “velocity directions”, consider just having a grid in which an alternating pattern of 2×2 blocks are updated at each step according to

and the “wall” rules

as well as the “rotations” of all these rules. With this “block cellular automaton” setup, “isolated particles” move according to the rule like the pieces on a checkerboard:

A “flotilla” of particles—like equal-mass hard squares—has rather simple behavior in the “square enclosure”:

In “spacetime” this is just:

But if we add even a single fixed (“single-cell-of-wall”) “obstruction cell” (here at the very center of the box, so preserving reflection symmetry) the behavior is quite different:

In “spacetime” this becomes (with the “obstruction cell” shown in gray)

or “from the side” (with the “obstruction” sometimes getting obscured by cells in front):

As it turns out, the block cellular automaton model we’re using here is actually functionally identical to the “discrete velocity molecules” model we used above, as the correspondence of their rules indicates:

And seeing this correspondence one gets the idea of considering a “rotated container”—which no longer gives simple behavior even without any kind of “interior fixed obstruction cell”:

Here’s the corresponding “spacetime” view

and here’s what it looks like “from the side”:

Here’s a larger version of the same setup (though no longer with exact symmetry) sampled every 50 steps:

And, yes, it’s increasingly looking as if there’s intrinsic randomness generation going on, much like in rule 30. But if we go a little further the correspondence becomes even clearer.

The systems we’ve been looking at so far have all been in 2D. But what if—like in rule 30—we consider 1D? It turns out we can set up very much the same kind of “gas-like” block cellular automata. Though with blocks of size 2 and two possible values for each cell, there’s only one viable rule

where in effect the only nontrivial transformation is:

(In 1D we can also make things simpler by not using explicit “walls”, but instead just wrapping the array of cells around cyclically.) Here’s then what happens with this rule with a few possible initial states:

And what we see is that in all cases the “particles” effectively just “pass through each other” without really “interacting”. But we can make there be something closer to “real interactions” by introducing another color, and adding a transformation which effectively introduces a “time delay” to each “crossover” of particles (as an alternative, one can also stay with 2 colors, and use size-3 blocks):

And with this “delayed particle” rule (that, as it happens, I first studied in 1986) we get:

With sufficiently simple initial conditions this still gives simple behavior, such as:

But as soon as one reaches the 121st initial condition () one sees:

(As we’ll discuss below, in a finite-size region of the kind we’re using, it’s inevitable that the pattern eventually repeats, though in the particular case shown it takes 7022 steps.) Here’s a slightly larger example, in which there’s clearer “progressive degradation” of the initial condition to apparent randomness:

We’ve come quite far from our original hard-sphere “realistic gas molecules”. But there’s even further to go. With hard spheres there’s built-in conservation of energy, momentum and number of particles. And we don’t specifically have these things anymore. But the rule we’re using still does have conservation of the number of non-white cells. Dropping this requirement, we can have rules like

which gradually “fill in with particles”:

What happens if we just let this “expand into a vacuum”, without any “walls”? The behavior is complex. And as is typical when there’s computational irreducibility, it’s at first hard to know what will happen in the end. For this particular initial condition everything becomes essentially periodic (with period 70) after 979 steps:

But with a slightly different initial condition, it seems to have a good chance of growing forever:

With slightly different rules (that here happen not be left-right symmetric) we start seeing rapid “expansion into the vacuum”—basically just like rule 30:

The whole setup here is very close to what it is for rule 30. But there’s one more feature we’ve carried over here from our hard-sphere gas and other models. Just like in standard classical mechanics, every part of the underlying rule is reversible, in the sense that if the rule says that block *u* goes to block *v* it also says that block *v* goes to block *u*.

Rules like

remove this restriction but produce behavior that’s qualitatively no different from the reversible rules above.

But now we’ve got to systems that are basically set up just like rule 30. (They happen to be block cellular automata rather than ordinary ones, but that really doesn’t matter.) And, needless to say, being set up like rule 30 it shows the same kind of intrinsic randomness generation that we see in a system like rule 30.

We started here from a “physically realistic” hard-sphere gas model—which we’ve kept on simplifying and idealizing. And what we’ve found is that through all this simplification and idealization, the same core phenomenon has remained: that even starting from “simple” or “ordered” initial conditions, complex and “apparently random” behavior is somehow generated, just like it is in typical Second Law behavior.

At the outset we might have assumed that to get this kind of “Second Law behavior” would need at least quite a few features of physics. But what we’ve discovered is that this isn’t the case. And instead we’ve got evidence that the core phenomenon is much more robust and in a sense purely computational.

Indeed, it seems that as soon as there’s computational irreducibility in a system, it’s basically inevitable that we’ll see the phenomenon. And since from the Principle of Computational Equivalence we expect that computational irreducibility is ubiquitous, the core phenomenon of the Second Law will in the end be ubiquitous across a vast range of systems, from things like hard-sphere gases to things like rule 30.

Our typical everyday experience shows a certain fundamental irreversibility. An egg can readily be scrambled. But you can’t easily reverse that: it can’t readily be unscrambled. And indeed this kind of one-way transition from order to disorder—but not back—is what the Second Law is all about. But there’s immediately something mysterious about this. Yes, there’s irreversibility at the level of things like eggs. But if we drill down to the level of atoms, the physics we know says there’s basically perfect reversibility. So where is the irreversibility coming from? This is a core (and often confused) question about the Second Law, and in seeing how it resolves we will end up face to face with fundamental issues about the character of observers and their relationship to computational irreducibility.

A “particle cellular automaton” like the one from the previous section

has transformations that “go both ways”, making its rule perfectly reversible. Yet we saw above that if we start from a “simple initial condition” and then just run the rule, it will “produce increasing randomness”. But what if we reverse the rule, and run it backwards? Well, since the rule is reversible, the same thing must happen: we must get increasing randomness. But how can it be that “randomness increases” both going forward in time and going backward? Here’s a picture that shows what’s going on:

In the middle the system takes on a “simple state”. But going either forward or backward it “randomizes”. The second half of the evolution we can interpret as typical Second-Law-style “degradation to randomness”. But what about the first half? Something unexpected is happening here. From what seems like a “rather random” initial state, the system appears to be “spontaneously organizing itself” to produce—at least temporarily—a simple and “orderly” state. An initial “scrambled” state is spontaneously becoming “unscrambled”. In the setup of ordinary thermodynamics, this would be a kind of “anti-thermodynamic” behavior in which what seems like “random heat” is spontaneously generating “organized mechanical work”.

So why isn’t this what we see happening all the time? Microscopic reversibility guarantees that in principle it’s possible. But what leads to the observed Second Law is that in practice we just don’t normally end up setting up the kind of initial states that give “anti-thermodynamic” behavior. We’ll be talking at length below about why this is. But the basic point is that to do so requires more computational sophistication than we as computationally bounded observers can muster. If the evolution of the system is computationally irreducible, then in effect we have to invert all of that computationally irreducible work to find the initial state to use, and that’s not something that we—as computationally bounded observers—can do.

But before we talk more about this, let’s explore some of the consequences of the basic setup we have here. The most obvious aspect of the “simple state” in the middle of the picture above is that it involves a big blob of “adjacent particles”. So now here’s a plot of the “size of the biggest blob that’s present” as a function of time starting from the “simple state”:

The plot indicates that—as the picture above indicates—the “specialness” of the initial state quickly “decays” to a “typical state” in which there aren’t any large blobs present. And if we were watching the system at the beginning of this plot, we’d be able to “use the Second Law” to identify a definite “arrow of time”: later times are the ones where the states are “more disordered” in the sense that they only have smaller blobs.

There are many subtleties to all of this. We know that if we set up an “appropriately special” initial state we can get anti-thermodynamic behavior. And indeed for the whole picture above—with its “special initial state”—the plot of blob size vs. time looks like this, with a symmetrical peak “developing” in the middle:

We’ve “made this happen” by setting up “special initial conditions”. But can it happen “naturally”? To some extent, yes. Even away from the peak, we can see there are always little fluctuations: blobs being formed and destroyed as part of the evolution of the system. And if we wait long enough we may see a fairly large blob. Like here’s one that forms (and decays) after about 245,400 steps:

The actual structure this corresponds to is pretty unremarkable:

But, OK, away from the “special state”, what we see is a kind of “uniform randomness”, in which, for example, there’s no obvious distinction between forward and backward in time. In thermodynamic terms, we’d describe this as having “reached equilibrium”—a situation in which there’s no longer “obvious change”.

To be fair, even in “equilibrium”, there will always be “fluctuations”. But for example in the system we’re looking at here, “fluctuations” corresponding to progressively larger blobs tend to occur exponentially less frequently. So it’s reasonable to think of there being an “equilibrium state” with certain unchanging “typical properties”. And, what’s more, that state is the basic outcome from any initial condition. Whatever special characteristics might have been present in the initial state will tend to be degraded away, leaving only the generic “equilibrium state”.

One might think that the possibility of such an “equilibrium state” showing “typical behavior” would be a specific feature of microscopically reversible systems. But this isn’t the case. And much as the core phenomenon of the Second Law is actually something computational that’s deeper and more general than the specifics of particular physical systems, so also this is true of the core phenomenon of equilibrium. And indeed the presence of what we might call “computational equilibrium” turns out to be directly connected to the overall phenomenon of computational irreducibility.

Let’s look again at rule 30. We start it off with different initial states, but in each case it quickly evolves to look basically the same:

Yes, the details of the patterns that emerge depend on the initial conditions. But the point is that the overall form of what’s produced is always the same: the system has reached a kind of “computational equilibrium” whose overall features are independent of where it came from. Later, we’ll see that the rapid emergence of “computational equilibrium” is characteristic of what I long ago identified as “class 3 systems”—and it’s quite ubiquitous to systems with a wide range of underlying rules, microscopically reversible or not.

That’s not to say that microscopic reversibility is irrelevant to “Second-Law-like” behavior. In what I called class 1 and class 2 systems the force of irreversibility in the underlying rules is strong enough that it overcomes computational irreducibility, and the systems ultimately evolve not to a “computational equilibrium” that looks random but rather to a definite, predictable end state:

How common is microscopic reversibility? In some types of rules it’s basically always there, by construction. But in other cases microscopically reversible rules represent just a subset of possible rules of a given type. For example, for block cellular automata with *k* colors and blocks of size *b*, there are altogether (*k ^{b}*)

But a rule like this is still reversible

and there’s still a straightforward backward rule, but it’s not exactly the same as the forward rule:

Using the backward rule, we can again construct an initial state whose forward evolution seems “anti-thermodynamic”, but the detailed behavior of the whole system isn’t perfectly symmetric between forward and backward in time:

Basic mechanics—like for our hard-sphere gas—is reversible and “self-inverse”. But it’s known that in particle physics there are small deviations from time reversal invariance, so that the rules are not precisely self-inverse—though they are still reversible in the sense that there’s always both a unique successor and a unique predecessor to every state (and indeed in our Physics Project such reversibility is probably guaranteed to exist in the laws of physics assumed by any observer who “believes they are persistent in time”).

For block cellular automata it’s very easy to determine from the underlying rule whether the system is reversible (just look to see if the rule serves only to permute the blocks). But for something like an ordinary cellular automaton it’s more difficult to determine reversibility from the rule (and above one dimension the question of reversibility can actually be undecidable). Among the 256 2-color nearest-neighbor rules there are only 6 reversible examples, and they are all trivial. Among the 134,217,728 3-color nearest-neighbor rules, 1800 are reversible. Of the 82 of these rules that are self-inverse, all are trivial. But when the inverse rules are different, the behavior can be nontrivial:

Note that unlike with block cellular automata the inverse rule often involves a larger neighborhood than the forward rule. (So, for example, here 396 rules have *r* = 1 inverses, 612 have *r* = 2, 648 have *r* = 3 and 144 have *r* = 4.)

A notable variant on ordinary cellular automata are “second-order” ones, in which the value of a cell depends on its value two steps in the past:

With this approach, one can construct reversible second-order variants of all 256 “elementary cellular automata”:

Note that such second-order rules are equivalent to 4-color first-order nearest-neighbor rules:

Whenever there’s a system with deterministic rules and a finite total number of states, it’s inevitable that the evolution of the system will eventually repeat. Sometimes the repetition period—or “recurrence time”—will be fairly short

and sometimes it’s much longer:

In general we can make a state transition graph that shows how each possible state of the system transitions to another under the rules. For a reversible system this graph consists purely of cycles in which each state has a unique successor and a unique predecessor. For a size-4 version of the system we’re studying here, there are a total of 2 ✕ 3^{4} = 162 possible states (the factor 2 comes from the even/odd “phases” of the block cellular automaton)—and the state transition graph for this system is:

For a non-reversible system—like rule 30—the state transition graph (here shown for sizes 4 and 8) also includes “transient trees” of states that can be visited only once, on the way to a cycle:

In the past one of the key ideas for the origin of Second-Law-like behavior was ergodicity. And in the discrete-state systems we’re discussing here the definition of perfect ergodicity is quite straightforward: ergodicity just implies that the state transition graph must consist not of many cycles, but instead purely of one big cycle—so that whatever state one starts from, one’s always guaranteed to eventually visit every possible other state.

But why is this relevant to the Second Law? Well, we’ve said that the Second Law is about “degradation” from “special states” to “typical states”. And if one’s going to “do the ergodic thing” of visiting all possible states, then inevitably most of the states we’ll at least eventually pass through will be “typical”.

But on its own, this definitely isn’t enough to explain “Second-Law behavior” in practice. In an example like the following, one sees rapid “degradation” of a simple initial state to something “random” and “typical”:

But of the 2 ✕ 3^{80} ≈ 10^{38} possible states that this system would eventually visit if it were ergodic, there are still a huge number that we wouldn’t consider “typical” or “random”. For example, just knowing that the system is eventually ergodic doesn’t tell one that it wouldn’t start off by painstakingly “counting down” like this, “keeping the action” in a tightly organized region:

So somehow there’s more than ergodicity that’s needed to explain the “degradation to randomness” associated with “typical Second-Law behavior”. And, yes, in the end it’s going to be a computational story, connected to computational irreducibility and its relationship to observers like us. But before we get there, let’s talk some more about “global structure”, as captured by things like state transition diagrams.

Consider again the size-4 case above. The rules are such that they conserve the number of “particles” (i.e. non-white cells). And this means that the states of the system necessarily break into separate “sectors” for different particle numbers. But even with a fixed number of particles, there are typically quite a few distinct cycles:

The system we’re using here is too small for us to be able to convincingly identify “simple” versus “typical” or “random” states, though for example we can see that only a few of the cycles have the simplifying feature of left-right symmetry.

Going to size 6 one begins to get a sense that there are some “always simple” cycles, as well as others that involve “more typical” states:

At size 10 the state transition graph for “4-particle” states has the form

and the longer cycles are:

It’s notable that most of the longest (“closest-to-ergodicity”) cycles look rather “simple and deliberate” all the way through. The “more typical and random” behavior seems to be reserved here for shorter cycles.

But in studying “Second Law behavior” what we’re mostly interested in is what happens from initially orderly states. Here’s an example of the results for progressively larger “blobs” in a system of size 30:

To get some sense of how the “degradation to randomness” proceeds, we can plot how the maximum blob size evolves in each case:

For some of the initial conditions one sees “thermodynamic-like” behavior, though quite often it’s overwhelmed by “freezing”, fluctuations, recurrences, etc. In all cases the evolution must eventually repeat, but the “recurrence times” vary widely (the longest—for a width-13 initial blob—being 861,930):

Let’s look at what happens in these recurrences, using as an example a width-17 initial blob—whose evolution begins:

As the picture suggests, the initial “large blob” quickly gets at least somewhat degraded, though there continue to be definite fluctuations visible:

If one keeps going long enough, one reaches the recurrence time, which in this case is 155,150 steps. Looking at the maximum blob size through a “whole cycle” one sees many fluctuations:

Most are small—as illustrated here with ordinary and logarithmic histograms:

But some are large. And for example at half the full recurrence time there is a fluctuation

that involves an “emergent blob” as wide as in the initial condition—that altogether lasts around 280 steps:

There are also “runner-up” fluctuations with various forms—that reach “blob width 15” and occur more or less equally spaced throughout the cycle:

It’s notable that clear Second-Law-like behavior occurs even in a size-30 system. But if we go, say, to a size-80 system it becomes even more obvious

and one sees rapid and systematic evolution towards an “equilibrium state” with fairly small fluctuations:

It’s worth mentioning again that the idea of “reaching equilibrium” doesn’t depend on the particulars of the rule we’re using—and in fact it can happen more rapidly in other reversible block cellular automata where there are no “particle conservation laws” to slow things down:

In such rules there also tend to be fewer, longer cycles in the state transition graph, as this comparison for size 6 with the “delayed particle” rule suggests:

But it’s important to realize that the “approach to equilibrium” is its own—computational—phenomenon, not directly related to long cycles and concepts like ergodicity. And indeed, as we mentioned above, it also doesn’t depend on built-in reversibility in the rules, so one sees it even in something like rule 30:

At an everyday level, the core manifestation of the Second Law is the tendency of things to “degrade” to randomness. But just how random is the randomness? One might think that anything that is made by a simple-to-describe algorithm—like the pattern of rule 30 or the digits of π—shouldn’t really be considered “random”. But for the purpose of understanding our experience of the world what matters is not what is “happening underneath” but instead what our perception of it is. So the question becomes: when we see something produced, say by rule 30 or by π, can we recognize regularities in it or not?

And in practice what the Second Law asserts is that systems will tend to go from states where we can recognize regularities to ones where we cannot. And the point is that this phenomenon is something ubiquitous and fundamental, arising from core computational ideas, in particular computational irreducibility.

But what does it mean to “recognize regularities”? In essence it’s all about seeing if we can find succinct ways to summarize what we see—or at least the aspects of what we see that we care about. In other words, what we’re interested in is finding some kind of compressed representation of things. And what the Second Law is ultimately about is saying that even if compression works at first, it won’t tend to keep doing so.

As a very simple example, let’s consider doing compression by essentially “representing our data as a sequence of blobs”—or, more precisely, using run-length encoding to represent sequences of 0s and 1s in terms of lengths of successive runs of identical values. For example, given the data

we split into runs of identical values

then as a “compressed representation” just give the length of each run

which we can finally encode as a sequence of binary numbers with base-3 delimiters:

“Transforming” our “particle cellular automaton” in this way we get:

The “simple” initial conditions here are successfully compressed, but the later “random” states are not. Starting from a random initial condition, we don’t see any significant compression at all:

What about other methods of compression? A standard approach involves looking at blocks of successive values on a given step, and asking about the relative frequencies with which different possible blocks occur. But for the particular rule we are discussing here, there’s immediately an issue. The rule conserves the total number of non-white cells—so at least for size-1 blocks the frequency of such blocks will always be what it was for the initial conditions.

What about for larger blocks? This gives the evolution of relative frequencies of size-2 blocks starting from the simple initial condition above:

Arranging for exactly half the cells to be non-white, the frequencies of size-2 block converge towards equality:

In general, the presence of unequal frequencies for different blocks allows the possibility of compression: much like in Morse code, one just has to use shorter codewords for more frequent blocks. How much compression is ultimately possible in this way can be found by computing –Σ*p _{i}* log

In the end we know that the initial conditions were “simple” and “special”. But the issue is whether whatever method we use for compression or for recognizing regularities is able to pick up on this. Or whether somehow the evolution of the system has sufficiently “encoded” the information about the initial condition that it’s no longer detectable. Clearly if our “method of compression” involved explicitly running the evolution of the system backwards, then it’d be possible to pick out the special features of the initial conditions. But explicitly running the evolution of the system requires doing lots of computational work.

So in a sense the question is whether there’s a shortcut. And, yes, one can try all sorts of methods from statistics, machine learning, cryptography and so on. But so far as one can tell, none of them make any significant progress: the “encoding” associated with the evolution of the system seems to just be too strong to “break”. Ultimately it’s hard to know for sure that there’s no scheme that can work. But any scheme must correspond to running some program. So a way to get a bit more evidence is just to enumerate “possible compression programs” and see what they do.

In particular, we can for example enumerate simple cellular automata, and see whether when run they produce “obviously different” results. Here’s what happens for a collection of different cellular automata when they are applied to a “simple initial condition”, to states obtained after 20 and 200 steps of evolution according to the particle cellular automaton rule and to an independently random state:

And, yes, in many cases the simple initial condition leads to “obviously different behavior”. But there’s nothing obviously different about the behavior obtained in the last two cases. Or, in other words, at least programs based on these simple cellular automata don’t seem to be able to “decode” the different origins of the second and third cases shown here.

What does all this mean? The fundamental point is that there seems to be enough computational irreducibility in the evolution of the system that no computationally bounded observer can “see through it”. And so—at least as far as a computationally bounded observer is concerned—“specialness” in the initial conditions is quickly “degraded” to an “equilibrium” state that “seems random”. Or, in other words, the computational process of evolution inevitably seems to lead to the core phenomenon of the Second Law.

“Entropy increases” is a common statement of the Second Law. But what does this mean, especially in our computational context? The answer is somewhat subtle, and understanding it will put us right back into questions of the interplay between computational irreducibility and the computational boundedness of observers.

When it was first introduced in the 1860s, entropy was thought of very much like energy, and was computed from ratios of heat content to temperature. But soon—particularly through work on gases by Boltzmann—there arose a quite different way of computing (and thinking about) entropy: in terms of the log of the number of possible states of a system. Later we’ll discuss the correspondence between these different ideas of entropy. But for now let’s consider what I view as the more fundamental definition based on counting states.

In the early days of entropy, when one imagined that—like in the cases of the hard-sphere gas—the parameters of the system were continuous, it could be mathematically complex to tease out any kind of discrete “counting of states”. But from what we’ve discussed here, it’s clear that the core phenomenon of the Second Law doesn’t depend on the presence of continuous parameters, and in something like a cellular automaton it’s basically straightforward to count discrete states.

But now we have get more careful about our definition of entropy. Given any particular initial state, a deterministic system will always evolve through a series of individual states—so that there’s always only one possible state for the system, which means the entropy will always be exactly zero. (This is a lot muddier and more complicated when continuous parameters are considered, but in the end the conclusion is the same.)

So how do we get a more useful definition of entropy? The key idea is to think not about individual states of a system but instead about collections of states that we somehow consider “equivalent”. In a typical case we might imagine that we can’t measure all the detailed positions of molecules in a gas, so we look just at “coarse-grained” states in which we consider, say, only the number of molecules in particular overall bins or blocks.

The entropy can be thought of as counting the number of possible microscopic states of the system that are consistent with some overall constraint—like a certain number of particles in each bin. If the constraint talks specifically about the position of every particle, there’ll only be one microscopic state consistent with the constraints, and the entropy will be zero. But if the constraint is looser, there’ll often be many possible microscopic states consistent with it, and the entropy we define will be nonzero.

Let’s look at this in the context of our particle cellular automaton. Here’s a particular evolution, starting from a specific microscopic state, together with a sequence of “coarse grainings” of this evolution in which we keep track only of “overall particle density” in progressively larger blocks:

The very first “coarse graining” here is particularly trivial: all it’s doing is to say whether a “particle is present” or not in each cell—or, in other words, it’s showing every particle but ignoring whether it’s “light” or “dark”. But in making this and the other coarse-grained pictures we’re always starting from the single “underlying microscopic evolution” that’s shown and just “adding coarse graining after the fact”.

But what if we assume that all we ever know about the system is a coarse-grained version? Say we look at the “particle-or-not” case. At a coarse-grained level the initial condition just says there are 6 particles present. But it doesn’t say if each particle is light or dark, and actually there are 2^{6} = 64 possible microscopic configurations. And the point is that each of these microscopic configurations has its own evolution:

But now we can consider coarse graining things. All 64 initial conditions are—by construction—equivalent under particle-or-not coarse graining:

But after just one step of evolution, different initial “microstates” can lead to different coarse-grained evolutions:

In other words, a single coarse-grained initial condition “spreads out” after just one step to several coarse-grained states:

After another step, a larger number of coarse-grained states are possible:

And in general the number of distinct coarse-grained states that can be reached grows fairly rapidly at first, though soon saturates, showing just fluctuations thereafter:

But the coarse-grained entropy is basically just proportional to the log of this quantity, so it too will show rapid growth at first, eventually leveling off at an “equilibrium” value.

The framework of our Physics Project makes it natural to think of coarse-grained evolution as a multicomputational process—in which a given coarse-grained state has not just a single successor, but in general multiple possible successors. For the case we’re considering here, the multiway graph representing all possible evolution paths is then:

The branching here reflects a spreading out in coarse-grained state space, and an increase in coarse-grained entropy. If we continue longer—so that the system begins to “approach equilibrium”—we’ll start to see some merging as well

as a less “time-oriented” graph layout makes clear:

But the important point is that in its “approach to equilibrium” the system in effect rapidly “spreads out” in coarse-grained state space. Or, in other words, the number of possible states of the system consistent with a particular coarse-grained initial condition increases, corresponding to an increase in what one can consider to be the entropy of the system.

There are many possible ways to set up what we might view as “coarse graining”. An example of another possibility is to focus on the values of a particular block of cells, and then to ignore the values of all other cells. But it typically doesn’t take long for the effects of other cells to “seep into” the block we’re looking at:

So what is the bigger picture? The basic point is that insofar as the evolution of each individual microscopic state “leads to randomness”, it’ll tend to end up in a different “coarse-grained bin”. And the result is that even if one starts with a tightly defined coarse-grained description, it’ll inevitably tend to “spread out”, thereby encompassing more states and increasing the entropy.

In a sense, looking at entropy and coarse graining is just a less direct way to detect that a system tends to “produce effective randomness”. And while it may have seemed like a convenient formalism when one was, for example, trying to tease things out from systems with continuous variables, it now feels like a rather indirect way to get at the core phenomenon of the Second Law.

It’s useful to understand a few more connections, however. Let’s say one’s trying to work out the average value of something (say particle density) in a system. What do we mean by “average”? One possibility is that we take an “ensemble” of possible states of the system, then find the average across these. But another possibility is that we instead look at the average across successive states in the evolution of the system. The “ergodic hypothesis” is that the ensemble average will be the same as the time average.

One way this would—at least eventually—be guaranteed is if the evolution of the system is ergodic, in the sense that it eventually visits all possible states. But as we saw above, this isn’t something that’s particularly plausible for most systems. But it also isn’t necessary. Because so long as the evolution of the system is “effectively random” enough, it’ll quickly “sample typical states”, and give essentially the same averages as one would get from sampling all possible states, but without having to laboriously visit all these states.

How does one tie all this down with rigorous, mathematical-style proofs? Well, it’s difficult. And in a first approximation not much progress has been made on this for more than a century. But having seen that the core phenomenon of the Second Law can be reduced to an essentially purely computational statement, we’re now in a position to examine this in a different—and I think ultimately very clarifying—way.

At its core the Second Law is essentially the statement that “things tend to get more random”. And in a sense the ultimate driver of this is the surprising phenomenon of computational irreducibility I identified in the 1980s—and the remarkable fact that even from simple initial conditions simple computational rules can generate behavior of great complexity. But there are definitely additional nuances to the story.

For example, we’ve seen that—particularly in a reversible system—it’s always in principle possible to set up initial conditions that will evolve to “magically produce” whatever “simple” configuration we want. And when we say that we generate “apparently random” states, our “analyzer of randomness” can’t go in and invert the computational process that generated the states. Similarly, when we talk about coarse-grained entropy and its increase, we’re assuming that we’re not inventing some elaborate coarse-graining procedure that’s specially set up to pick out collections of states with “special” behavior.

But there’s really just one principle that governs all these things: that whatever method we have to prepare or analyze states of a system is somehow computationally bounded. This isn’t as such a statement of physics. Rather, it’s a general statement about observers, or, more specifically, observers like us.

We could imagine some very detailed model for an observer, or for the experimental apparatus they use. But the key point is that the details don’t matter. Really all that matters is that the observer is computationally bounded. And it’s then the basic computational mismatch between the observer and the computational irreducibility of the underlying system that leads us to “experience” the Second Law.

At a theoretical level we can imagine an “alien observer”—or even an observer with technology from our own future—that would not have the same computational limitations. But the point is that insofar as we are interested in explaining our own current experience, and our own current scientific observations, what matters is the way we as observers are now, with all our computational boundedness. And it’s then the interplay between this computational boundedness, and the phenomenon of computational irreducibility, that leads to our basic experience of the Second Law.

At some level the Second Law is a story of the emergence of complexity. But it’s also a story of the emergence of simplicity. For the very statement that things go to a “completely random equilibrium” implies great simplification. Yes, if an observer could look at all the details they would see great complexity. But the point is that a computationally bounded observer necessarily can’t look at those details, and instead the features they identify have a certain simplicity.

And so it is, for example, that even though in a gas there are complicated underlying molecular motions, it’s still true that at an overall level a computationally bounded observer can meaningfully discuss the gas—and make predictions about its behavior—purely in terms of things like pressure and temperature that don’t probe the underlying details of molecular motions.

In the past one might have thought that anything like the Second Law must somehow be specific to systems made from things like interacting particles. But in fact the core phenomenon of the Second Law is much more general, and in a sense purely computational, depending only on the basic computational phenomenon of computational irreducibility, together with the fundamental computational boundedness of observers like us.

And given this generality it’s perhaps not surprising that the core phenomenon appears far beyond where anything like the Second Law has normally been considered. In particular, in our Physics Project it now emerges as fundamental to the structure of space itself—as well as to the phenomenon of quantum mechanics. For in our Physics Project we imagine that at the lowest level everything in our universe can be represented by some essentially computational structure, conveniently described as a hypergraph whose nodes are abstract “atoms of space”. This structure evolves by following rules, whose operation will typically show all sorts of computational irreducibility. But now the question is how observers like us will perceive all this. And the point is that through our limitations we inevitably come to various “aggregate” conclusions about what’s going on. It’s very much like with the gas laws and their broad applicability to systems involving different kinds of molecules. Except that now the emergent laws are about spacetime and correspond to the equations of general relativity.

But the basic intellectual structure is the same. Except that in the case of spacetime, there’s an additional complication. In thermodynamics, we can imagine that there’s a system we’re studying, and the observer is outside it, “looking in”. But when we’re thinking about spacetime, the observer is necessarily embedded within it. And it turns out that there’s then one additional feature of observers like us that’s important. Beyond the statement that we’re computationally bounded, it’s also important that we assume that we’re persistent in time. Yes, we’re made of different atoms of space at different moments. But somehow we assume that we have a coherent thread of experience. And this is crucial in deriving our familiar laws of physics.

We’ll talk more about it later, but in our Physics Project the same underlying setup is also what leads to the laws of quantum mechanics. Of course, quantum mechanics is notable for the apparent randomness associated with observations made in it. And what we’ll see later is that in the end the same core phenomenon responsible for randomness in the Second Law also appears to be what’s responsible for randomness in quantum mechanics.

The interplay between computational irreducibility and computational limitations of observers turns out to be a central phenomenon throughout the multicomputational paradigm and its many emerging applications. It’s core to the fact that observers can experience computationally reducible laws in all sorts of samplings of the ruliad. And in a sense all of this strengthens the story of the origins of the Second Law. Because it shows that what might have seemed like arbitrary features of observers are actually deep and general, transcending a vast range of areas and applications.

But even given the robustness of features of observers, we can still ask about the origins of the whole computational phenomenon that leads to the Second Law. Ultimately it begins with the Principle of Computational Equivalence, which asserts that systems whose behavior is not obviously simple will tend to be equivalent in their computational sophistication. The Principle of Computational Equivalence has many implications. One of them is computational irreducibility, associated with the fact that “analyzers” or “predictors” of a system cannot be expected to have any greater computational sophistication than the system itself, and so are reduced to just tracing each step in the evolution of a system to find out what it does.

Another implication of the Principle of Computational Equivalence is the ubiquity of computation universality. And this is something we can expect to see “underneath” the Second Law. Because we can expect that systems like the particle cellular automaton—or, for that matter, the hard-sphere gas—will be provably capable of universal computation. Already it’s easy to see that simple logic gates can be constructed from configurations of particles, but a full demonstration of computation universality will be considerably more elaborate. And while it’d be nice to have such a demonstration, there’s still more that’s needed to establish full computational irreducibility of the kind the Principle of Computational Equivalence implies.

As we’ve seen, there are a variety of “indicators” of the operation of the Second Law. Some are based on looking for randomness or compression in individual states. Others are based on computing coarse grainings and entropy measures. But with the computational interpretation of the Second Law we can expect to translate such indicators into questions in areas like computational complexity theory.

At some level we can think of the Second Law as being a consequence of the dynamics of a system so “encrypting” the initial conditions of a system that no computations available to an “observer” can feasibly “decrypt” it. And indeed as soon as one looks at “inverting” coarse-grained results one is immediately faced with fairly classic NP problems from computational complexity theory. (Establishing NP completeness in a particular case remains challenging, just like establishing computation universality.)

In our discussion here, we’ve treated the Second Law of thermodynamics primarily as an abstract computational phenomenon. But when thermodynamics was historically first being developed, the computational paradigm was still far in the future, and the only way to identify something like the Second Law was through its manifestations in terms of physical concepts like heat and temperature.

The First Law of thermodynamics asserted that heat was a form of energy, and that overall energy was conserved. The Second Law then tried to characterize the nature of the energy associated with heat. And a core idea was that this energy was somehow incoherently spread among a large number of separate microscopic components. But ultimately thermodynamics was always a story of energy.

But is energy really a core feature of thermodynamics or is it merely “scaffolding” relevant for its historical development and early practical applications? In the hard-sphere gas example that we started from above, there’s a pretty clear notion of energy. But quite soon we largely abstracted energy away. Though in our particle cellular automaton we do still have something somewhat analogous to energy conservation: we have conservation of the number of non-white cells.

In a traditional physical system like a gas, temperature gives the average energy per degree of freedom. But in something like our particle cellular automaton, we’re effectively assuming that all particles always have the same energy—so there is for example no way to “change the temperature”. Or, put another way, what we might consider as the energy of the system is basically just given by the number of particles in the system.

Does this simplification affect the core phenomenon of the Second Law? No. That’s something much stronger, and quite independent of these details. But in the effort to make contact with recognizable “textbook thermodynamics”, it’s useful to consider how we’d add in ideas like heat and temperature.

In our discussion of the Second Law, we’ve identified entropy with the log of the number states consistent with a constraint. But more traditional thermodynamics involves formulas like *dS* = *dQ*/*T**Q* gives total heat content, or “total heat energy” (not worrying about what this is measured relative to, which is what makes it *dQ* rather than *Q*). *T* gives average energy per “degree of freedom” (or, roughly, particle). And this means that *Q*/*T* effectively measures something like the “number of particles”. But at least in a system like a particle cellular automaton, the number of possible complete configurations is exponential in the number of particles, making its logarithm, the entropy *S*, roughly proportional to the number of particles, and thus to *Q*/*T*. That anything like this argument works depends, though, on being able to discuss things “statistically”, which in turn depends on the core phenomenon of the Second Law: the tendency of things to evolve to uniform (“equilibrium”) randomness.

When the Second Law was first introduced, there were several formulations given, all initially referencing energy. One formulation stated that “heat does not spontaneously go from a colder body to a hotter”. And even in our particle cellular automaton we can see a fairly direct version of this. Our proxy for “temperature” is density of particles. And what we observe is that an initial region of higher density tends to “diffuse” out:

Another formulation of the Second Law talks about the impossibility of systematically “turning heat into mechanical work”. At a computational level, the analog of “mechanical work” is systematic, predictable behavior. So what this is saying is again that systems tend to generate randomness, and to “remove predictability”.

In a sense this is a direct reflection of computational irreducibility. To get something that one can “harness as mechanical work” one needs something that one can readily predict. But the whole point is that the presence of computational irreducibility makes prediction take an irreducible amount of computational work—that is beyond the capabilities of an “observer like us”.

Closely related is the statement that it’s not possible to make a perpetual motion machine (“of the second kind”, i.e. violating the Second Law), that continually “makes systematic motion” from “heat”. In our computational setting this would be like extracting a systematic, predictable sequence of bits from our particle cellular automaton, or from something like rule 30. And, yes, if we had a device that could for example systematically predict rule 30, then it would be straightforward, say, “just to pick out black cells”, and effectively to derive a predictable sequence. But computational irreducibility implies that we won’t be able to do this, without effectively just directly reproducing what rule 30 does, which an “observer like us” doesn’t have the computational capability to do.

Much of the textbook discussion of thermodynamics is centered around the assumption of “equilibrium”—or something infinitesimally close to it—in which one assumes that a system behaves “uniformly and randomly”. Indeed, the Zeroth Law of thermodynamics is essentially the statement that “statistically unique” equilibrium can be achieved, which in terms of energy becomes a statement that there is a unique notion of temperature.

Once one has the idea of “equilibrium”, one can then start to think of its properties as purely being functions of certain parameters—and this opens up all sorts of calculus-based mathematical opportunities. That anything like this makes sense depends, however, yet again on “perfect randomness as far as the observer is concerned”. Because if the observer could notice a difference between different configurations, it wouldn’t be possible to treat all of them as just being “in the equilibrium state”.

Needless to say, while the intuition of all this is made rather clear by our computational view, there are details to be filled in when it comes to any particular mathematical formulation of features of thermodynamics. As one example, let’s consider a core result of traditional thermodynamics: the Maxwell–Boltzmann exponential distribution of energies for individual particles or other degrees of freedom.

To set up a discussion of this, we need to have a system where there can be many possible microscopic amounts of energy, say, associated with some kind of idealized particles. Then we imagine that in “collisions” between such particles energy is exchanged, but the total is always conserved. And the question is how energy will eventually be distributed among the particles.

As a first example, let’s imagine that we have a collection of particles which evolve in a series of steps, and that at each step particles are paired up at random to “collide”. And, further, let’s assume that the effect of the collision is to randomly redistribute energy between the particles, say with a uniform distribution.

We can represent this process using a token-event graph, where the events (indicated here in yellow) are the collisions, and the tokens (indicated here in red) represent states of particles at each step. The energy of the particles is indicated here by the size of the “token dots”:

Continuing this a few more steps we get:

At the beginning we started with all particles having equal energies. But after a number of steps the particles have a distribution of energies—and the distribution turns out to be accurately exponential, just like the standard Maxwell–Boltzmann distribution:

If we look at the distribution on successive steps we see rapid evolution to the exponential form:

Why we end up with an exponential is not hard to see. In the limit of enough particles and enough collisions, one can imagine approximating everything purely in terms of probabilities (as one does in deriving Boltzmann transport equations, basic SIR models in epidemiology, etc.) Then if the probability for a particle to have energy *E* is ƒ(*E*), in every collision once the system has “reached equilibrium” one must have ƒ(*E*_{1})ƒ(*E*_{2}) = ƒ(*E*_{3})ƒ(*E*_{4}) where *E*_{1} + *E*_{2} = *E*_{3} + *E*_{4}—and the only solution to this is ƒ(*E*) ∼ *e ^{–β E}*.

In the example we’ve just given, there’s in effect “immediate mixing” between all particles. But what if we set things up more like in a cellular automaton—with particles only colliding with their local neighbors in space? As an example, let’s say we have our particles arranged on a line, with alternating pairs colliding at each step in analogy to a block cellular automaton (the long-range connections represent wraparound of our lattice):

In the picture above we’ve assumed that in each collision energy is randomly redistributed between the particles. And with this assumption it turns out that we again rapidly evolve to an exponential energy distribution:

But now that we have a spatial structure, we can display what’s going on in more of a cellular automaton style—where here we’re showing results for 3 different sequences of random energy exchanges:

And once again, if we run long enough, we eventually get an exponential energy distribution for the particles. But note that the setup here is very different from something like rule 30—because we’re continuously injecting randomness from the outside into the system. And as a minimal way to avoid this, consider a model where at each collision the particles get fixed fractions *α*)/2*α*)/2

Here’s what happens with energy concentrated into a few particles

and with random initial energies:

And in all cases the system eventually evolves to a “pure checkerboard” in which the only particle energies are (1 – *α*)/2 and (1 + *α*)/2. (For *α* = 0 the system corresponds to a discrete version of the diffusion equation.) But if we look at the structure of the system, we can think of it as a continuous block cellular automaton. And as with other cellular automata, there are lots of possible rules that don’t lead to such simple behavior.

In fact, all we need do is allow *α* to depend on the energies *E*_{1} and *E*_{2} of colliding pairs of particles (or, here, the values of cells in each block). As an example, let’s take *α*(*E*_{1}, *E*_{2}) = ±`FractionalPart`[*κ E*],*E* is the total energy of the pair, and the + is used when *E*_{1} > *E*_{2}:

And with this setup we once again often see “rule-30-like behavior” in which effectively quite random behavior is generated even without any explicit injection of randomness from outside (the lower panels start at step 1000):

The underlying construction of the rule ensures that total energy is conserved. But what we see is that the evolution of the system distributes it across many elements. And at least if we use random initial conditions

we eventually in all cases see an exponential distribution of energy values (with simple initial conditions it can be more complicated):

The evolution towards this is very much the same as in the systems above. In a sense it depends only on having a suitably randomized energy-conserving collision process, and it takes only a few steps to go from a uniform initial distribution energy to an accurately exponential one:

So how does this all work in a “physically realistic” hard-sphere gas? Once again we can create a token-event graph, where the events are collisions, and the tokens correspond to periods of free motion of particles. For a simple 1D “Newton’s cradle” configuration, there is an obvious correspondence between the evolution in “spacetime”, and the token-event graph:

But we can do exactly the same thing for a 2D configuration. Indicating the energies of particles by the sizes of tokens we get (excluding wall collisions, which don’t affect particle energy)

where the “filmstrip” at the side gives snapshots of the evolution of the system. (Note that in this system, unlike the ones above, there aren’t definite “steps” of evolution; the collisions just happen “asynchronously” at times determined by the dynamics.)

In the initial condition we’re using here, all particles have the same energy. But when we run the system we find that the energy distribution for the particles rapidly evolves to the standard exponential form (though note that here successive panels are “snapshots”, not “steps”):

And because we’re dealing with “actual particles”, we can look not only at their energies, but also at their speeds (related simply by *E* = 1/2 *m* *v*^{2}). When we look at the distribution of speeds generated by the evolution, we find that it has the classic Maxwellian form:

And it’s this kind of final or “equilibrium” result that’s what’s mainly discussed in typical textbooks of thermodynamics. Such books also tend to talk about things like tradeoffs between energy and entropy, and define things like the (Helmholtz) free energy *F* = *U* – *T* *S* (where *U* is internal energy, *T* is temperature and *S* is entropy) that are used in answering questions like whether particular chemical reactions will occur under certain conditions.

But given our discussion of energy here, and our earlier discussion of entropy, it’s at first quite unclear how these quantities might relate, and how they can trade off against each other, say in the formula for free energy. But in some sense what connects energy to the standard definition of entropy in terms of the logarithm of the number of states is the Maxwell–Boltzmann distribution, with its exponential form. In the usual physical setup, the Maxwell–Boltzmann distribution is basically *e*^{(–E/kT)}, where *T* is the temperature, and *kT* is the average energy.

But now imagine we’re trying to figure out whether some process—say a chemical reaction—will happen. If there’s an energy barrier, say associated with an energy difference Δ, then according to the Maxwell–Boltzmann distribution there’ll be a probability proportional to *e*^{(–Δ/kT)} for molecules to have a high enough energy to surmount that barrier. But the next question is how many configurations of molecules there are in which molecules will “try to surmount the barrier”. And that’s where the entropy comes in. Because if the number of possible configurations is Ω, the entropy *S* is given by *k* log Ω, so that in terms of *S*, Ω = *e*^{(S/k)}. But now the “average number of molecules which will surmount the barrier” is roughly given by *e*^{(S/k)} *e*^{(–Δ/kT)},*T S*, which has the form of the free energy *U* – *T* *S*.

This argument is quite rough, but it captures the essence of what’s going on. And at first it might seem like a remarkable coincidence that there’s a logarithm in the definition of entropy that just “conveniently fits together” like this with the exponential in the Maxwell–Boltzmann distribution. But it’s actually not a coincidence at all. The point is that what’s really fundamental is the concept of counting the number of possible states of a system. But typically this number is extremely large. And we need some way to “tame” it. We could in principle use some slow-growing function other than log to do this. But if we use log (as in the standard definition of entropy) we precisely get the tradeoff with energy in the Maxwell–Boltzmann distribution.

There is also another convenient feature of using log. If two systems are independent, one with Ω_{1} states, and the other with Ω_{2} states, then a system that combines these (without interaction) will have Ω_{1}, Ω_{2} states. And if *S* = *k* log Ω, then this means that the entropy of the combined state will just be the sum *S*_{1} + *S*_{2} of the entropies of the individual states. But is this fact actually “fundamentally independent” of the exponential character of the Maxwell–Boltzmann distribution? Well, no. Or at least it comes from the same mathematical idea. Because it’s the fact that in equilibrium the probability ƒ(*E*) is supposed to satisfy ƒ(*E*_{1})ƒ(*E*_{2}) = ƒ(*E*_{3})ƒ(*E*_{4}) when *E*_{1} + *E*_{2} = *E*_{3} + *E*_{4} that makes ƒ(*E*) have its exponential form. In other words, both stories are about exponentials being able to connect additive combination of one quantity with multiplicative combination of another.

Having said all this, though, it’s important to understand that you don’t need energy to talk about entropy. The concept of entropy, as we’ve discussed, is ultimately a computational concept, quite independent of physical notions like energy. In many textbook treatments of thermodynamics, energy and entropy are in some sense put on a similar footing. The First Law is about energy. The Second Law is about entropy. But what we’ve seen here is that energy is really a concept at a different level from entropy: it’s something one gets to “layer on” in discussing physical systems, but it’s not a necessary part of the “computational essence” of how things work.

(As an extra wrinkle, in the case of our Physics Project—as to some extent in traditional general relativity and quantum mechanics—there are some fundamental connections between energy and entropy. In particular—related to what we’ll discuss below—the number of possible discrete configurations of spacetime is inevitably related to the “density” of events, which defines energy.)

It would be nice to be able to say, for example, that “using computation theory, we can prove the Second Law”. But it isn’t as simple as that. Not least because, as we’ve seen, the validity of the Second Law depends on things like what “observers like us” are capable of. But we can, for example, formulate what the outline of a proof of the Second Law could be like, though to give a full formal proof we’d have to introduce a variety of “axioms” (essentially about observers) that don’t have immediate foundations in existing areas of mathematics, physics or computation theory.

The basic idea is that one imagines a state *S* of a system (which could just be a sequence of values for cells in something like a cellular automaton). One considers an “observer function” Θ which, when applied to the state *S*, gives a “summary” of *S*. (A very simple example would be the run-length encoding that we used above.) Now we imagine some “evolution function” Ξ that is applied to *S*. The basic claim of the Second Law is that the “sizes” normally satisfy the inequality Θ[Ξ[*S*]] ≥ Θ[*S*], or in other words, that “compression by the observer” is less effective after the evolution of system, in effect because the state of the system has “become more random”, as our informal statement of the Second Law suggests.

What are the possible forms of Θ and Ξ? It’s slightly easier to talk about Ξ, because we imagine that this is basically any not-obviously-trivial computation, run for an increasing number of steps. It could be repeated application of a cellular automaton rule, or a Turing machine, or any other computational system. We might represent an individual step by an operator *ξ*, and say that in effect Ξ = *ξ*^{t}. We can always construct *ξ*^{t} by explicitly applying *ξ* successively *t* times. But the question of computational irreducibility is whether there’s a shortcut way to get to the same result. And given any specific representation of *ξ*^{t} (say, rather prosaically, as a Boolean circuit), we can ask how the size of that representation grows with *t*.

With the current state of computation theory, it’s exceptionally difficult to get definitive general results about minimal sizes of *ξ*^{t}, though in sufficiently small cases it’s possible to determine this “experimentally”, essentially by exhaustive search. But there’s an increasing amount of at least circumstantial evidence that for many kinds of systems, one can’t do much better than explicitly constructing *ξ*^{t}, as the phenomenon of computational irreducibility suggests. (One can imagine “toy models”, in which *ξ* corresponds to some very simple computational process—like a finite automaton—but while this likely allows one to prove things, it’s not at all clear how useful or representative any of the results will be.)

OK, so what about the “observer function” Θ? For this we need some kind of “observer theory”, that characterizes what observers—or, at least “observers like us”—can do, in the same kind of way that standard computation theory characterizes what computational systems can do. There are clearly some features Θ must have. For example, it can’t involve unbounded amounts of computation. But realistically there’s more than that. Somehow the role of observers is to take all the details that might exist in the “outside world”, and reduce or compress these to some “smaller” representation that can “fit in the mind of the observer”, and allow the observer to “make decisions” that abstract from the details of the outside world whatever specifics the observer “cares about”. And—like a construction such as a Turing machine—one must in the end have some way of building up “possible observers” from something like basic primitives.

Needless to say, even given primitives—or an axiomatic foundation—for Ξ and Θ, things are not straightforward. For example, it’s basically inevitable that many specific questions one might ask will turn out to be formally undecidable. And we can’t expect (particularly as we’ll see later) that we’ll be able to show that the Second Law is “just true”. It’ll be a statement that necessarily involves qualifiers like “typically”. And if we ask to characterize “typically” in terms, say, of “probabilities”, we’ll be stuck in a kind of recursive situation of having to define probability measures in terms of the very same constructs we’re starting from.

But despite these difficulties in making what one might characterize as general abstract statements, what our computational formulation achieves is to provide a clear intuitive guide to the origin of the Second Law. And from this we can in particular construct an infinite range of specific computational experiments that illustrate the core phenomenon of the Second Law, and give us more and more understanding of how the Second Law works, and where it conceptually comes from.

Even in the very early years of the formulation of the Second Law, James Clerk Maxwell already brought up an objection to its general applicability, and to the idea that systems “always become more random”. He imagined that a box containing gas molecules had a barrier in the middle with a small door controlled by a “demon” who could decide on a molecule-by-molecule basis which molecules to let through in each direction. Maxwell suggested that such a demon should readily be able to “sort” molecules, thereby reversing any “randomness” that might be developing.

As a very simple example, imagine that at the center of our particle cellular automaton we insert a barrier that lets particles pass from left to right but not the reverse. (We also add “reflective walls” on the two ends, rather than having cyclic boundary conditions.)

Unsurprisingly, after a short while, all the particles have collected on one side of the barrier, rather than “coming to equilibrium” in a “uniform random distribution” across the system:

Over the past century and a half (and even very recently) a whole variety of mechanical ratchets, molecular switches, electrical diodes, noise-reducing signal processors and other devices have been suggested as at least conceptually practical implementations of Maxwell’s demon. Meanwhile, all kinds of objections to their successful operation have been raised. “The demon can’t be made small enough”; “The demon will heat up and stop working”; “The demon will need to reset its memory, so has to be fundamentally irreversible”; “The demon will inevitably randomize things when it tries to sense molecules”; etc.

So what’s true? It depends on what we assume about the demon—and in particular to what extent we suppose that the demon needs to be following the same underlying laws as the system it’s operating on. As a somewhat extreme example, let’s imagine trying to “make a demon out of gas molecules”. Here’s an attempt at a simple model of this in our particle cellular automaton:

For a while we successfully maintain a “barrier”. But eventually the barrier succumbs to the same “degradation” processes as everything else, and melts away. Can we do better?

Let’s imagine that “inside the barrier” (AKA “demon”) there’s “machinery” that whenever the barrier is “buffeted” in a given way “puts up the right kind of armor” to “protect it” from that kind of buffeting. Assuming our underlying system is for example computation universal, we should at some level be able to “implement any computation we want”. (What needs to be done is quite analogous to cellular automata that successfully erase up to finite levels of “noise”.)

But there’s a problem. In order to “protect the barrier” we have to be able to “predict” how it will be “attacked”. Or, in other words, our barrier (or demon) will have to be able to systematically determine what the outside system is going to do before it does it. But if the behavior of the outside system is computationally irreducible this won’t in general be possible. So in the end the criterion for a demon like this to be impossible is essentially the same as the criterion for Second Law behavior to occur in the first place: that the system we’re looking at is computationally irreducible.

There’s a bit more to say about this, though. We’ve been talking about a demon that’s trying to “achieve something fairly simple”, like maintaining a barrier or a “one-way membrane”. But what if we’re more flexible in what we consider the objective of the demon to be? And even if the demon can’t achieve our original “simple objective” might there at least be some kind of “useful sorting” that it can do?

Well, that depends on what we imagine constitutes “useful sorting”. The system is always following its rules to do something. But probably it’s not something we consider “useful sorting”. But what would count as “useful sorting”? Presumably it’s got to be something that an observer will “notice”, and more than that, it should be something that has “done some of the job of decision making” ahead of the observer. In principle a sufficiently powerful observer might be able to “look inside the gas” and see what the results of some elaborate sorting procedure would be. But the point is for the demon to just make the sorting happen, so the job of the observer becomes essentially trivial.

But all of this then comes back to the question of what kind of thing an observer might want to observe. In general one would like to be able to characterize this by having an “observer theory” that provides a metatheory of possible observers in something like the kind of way that computation theory and ideas like Turing machines provide a metatheory of possible computational systems.

So what really is an observer, or at least an observer like us? The most crucial feature seems to be that the observer is always ultimately some kind of “finite mind” that takes all the complexity of the world and extracts from it just certain “summary features” that are relevant to the “decisions” it has to make. (Another crucial feature seems to be that the observer can consistently view themselves as being “persistent”.) But we don’t have to go all the way to a sophisticated “mind” to see this picture in operation. Because it’s already what’s going on not only in something like perception but also in essentially anything we’d usually call “measurement”.

For example, imagine we have a gas containing lots of molecules. A standard measurement might be to find the pressure of the gas. And in doing such a measurement, what’s happening is that we’re reducing the information about all the detailed motions of individual molecules, and just summarizing it by a single aggregate number that is the pressure.

How do we achieve this? We might have a piston connected to the box of gas. And each time a molecule hits the piston it’ll push it a little. But the point is that in the end the piston moves only as a whole. And the effects of all the individual molecules are aggregated into that overall motion.

At a microscopic level, any actual physical piston is presumably also made out of molecules. But unlike the molecules in the gas, these molecules are tightly bound together to make the piston solid. Every time a gas molecule hits the surface of the piston, it’ll transfer some momentum to a molecule in the piston, and there’ll be some kind of tiny deformation wave that goes through the piston. To get a “definitive pressure measurement”—based on definitive motion of the piston as a whole—that deformation wave will somehow have to disappear. And in making a theory of the “piston as observer” we’ll typically ignore the physical details, and idealize things by saying that the piston moves only as a whole.

But ultimately if we were to just look at the system “dispassionately”, without knowing the “intent” of the piston, we’d just see a bunch of molecules in the gas, and a bunch of molecules in the piston. So how would we tell that the piston is “acting as an observer”? In some ways it’s a rather circular story. If we assume that there’s a particular kind of thing an observer wants to measure, then we can potentially identify parts of a system that “achieve that measurement”. But in the abstract we don’t know what an observer “wants to measure”. We’ll always see one part of a system affecting another. But is it “achieving measurement” or not?

To resolve this, we have to have some kind of metatheory of the observer: we have to be able to say what kinds of things we’re going to count as observers and what not. And ultimately that’s something that must inevitably devolve to a rather human question. Because in the end what we care about is what we humans sense about the world, which is what, for example, we try to construct science about.

We could talk very specifically about the sensory apparatus that we humans have—or that we’ve built with technology. But the essence of observer theory should presumably be some kind of generalization of that. Something that recognizes fundamental features—like computational boundedness—of us as entities, but that does not depend on the fact that we happen to use sight rather than smell as our most important sense.

The situation is a bit like the early development of computation theory. Something like a Turing machine was intended to define a mechanism that roughly mirrored the computational capabilities of the human mind, but that also provided a “reasonable generalization” that covered, for example, machines one could imagine building. Of course, in that particular case the definition that was developed proved extremely useful, being, it seems, of just the right generality to cover computations that can occur in our universe—but not beyond.

And one might hope that in the future observer theory would identify a similarly useful definition for what a “reasonable observer” can be. And given such a definition, we will, for example, be in position to further tighten up our characterization of what the Second Law might say.

It may be worth commenting that in thinking about an observer as being an “entity like us” one of the immediate attributes we might seek is that the observer should have some kind of “inner experience”. But if we’re just looking at the pattern of molecules in a system, how do we tell where there’s an “inner experience” happening? From the outside, we presumably ultimately can’t. And it’s really only possible when we’re “on the inside”. We might have scientific criteria that tell us whether something can reasonably support an inner experience. But to know if there actually is an inner experience “going on” we basically have to be experiencing it. We can’t make a “first-principles” objective theory; we just have to posit that such-and-such part of the system is representing our subjective experience.

Of course, that doesn’t mean that there can’t still be very general conclusions to be drawn. Because it can still be—as it is in our Physics Project and in thinking about the ruliad—that it takes knowing only rather basic features of “observers like us” to be able to make very general statements about things like the effective laws we will experience.

It didn’t take long after the Second Law was first proposed for people to start talking about its implications for the long-term evolution of the universe. If “randomness” (for example as characterized by entropy) always increases, doesn’t that mean that the universe must eventually evolve to a state of “equilibrium randomness”, in which all the rich structures we now see have decayed into “random heat”?

There are several issues here. But the most obvious has to do with what observer one imagines will be experiencing that future state of the universe. After all, if the underlying rules which govern the universe are reversible, then in principle it will always be possible to go back from that future “random heat” and reconstruct from it all the rich structures that have existed in the history of the universe.

But the point of the Second Law as we’ve discussed it is that at least for computationally bounded observers like us that won’t be possible. The past will always in principle be determinable from the future, but it will take irreducibly much computation to do so—and vastly more than observers like us can muster.

And along the same lines, if observers like us examine the future state of the universe we won’t be able to see that there’s anything special about it. Even though it came from the “special state” that is the current state of our universe, we won’t be able to tell it from a “typical” state, and we’ll just consider it “random”.

But what if the observers evolve with the evolution of the universe? Yes, to us today that future configuration of particles may just “look random”. But in actuality, it has rich computational content that there’s no reason to assume a future observer will not find in some way or another significant. Indeed, in a sense the longer the universe has been around, the larger the amount of irreducible computation it will have done. And, yes, observers like us today might not care about most of what comes out of that computation. But in principle there are features of it that could be mined to inform the “experience” of future observers.

At a practical level, our basic human senses pick out certain features on certain scales. But as technology progresses, it gives us ways to pick out much more, on much finer scales. A century ago we couldn’t realistically pick out individual atoms or individual photons; now we routinely can. And what seemed like “random noise” just a few decades ago is now often known to have specific, detailed structure.

There is, however, a complex tradeoff. A crucial feature of observers like us is that there is a certain coherence to our experience; we sample little enough about the world that we’re able to turn it into a coherent thread of experience. But the more an observer samples, the more difficult this will become. So, yes, a future observer with vastly more advanced technology might successfully be able to sample lots of details of the future universe. But to do that, the observer will have to lose some of their own coherence, and ultimately we won’t even be able to identify that future observer as “coherently existing” at all.

The usual “heat death of the universe” refers to the fate of matter and other particles in the universe. But what about things like gravity and the structure of spacetime? In traditional physics, that’s been a fairly separate question. But in our Physics Project everything is ultimately described in terms of a single abstract structure that represents both space and everything in it. And we can expect that the evolution of this whole structure then corresponds to a computationally irreducible process.

The basic setup is at its core just like what we’ve seen in our general discussion of the Second Law. But here we’re operating at the lowest level of the universe, so the irreducible progression of computation can be thought of as representing the fundamental inexorable passage of time. As time moves forward, therefore, we can generally expect “more randomness” in the lowest-level structure of the universe.

But what will observers perceive? There’s considerable trickiness here—particularly in connection with quantum mechanics—that we’ll discuss later. In essence, the point is that there are many paths of history for the universe, that branch and merge—and observers sample certain collections of paths. And for example on some paths the computations may simply halt, with no further rules applying—so that in effect “time stops”, at least for observers on those paths. It’s a phenomenon that can be identified with spacetime singularities, and with what happens inside (at least certain) black holes.

So does this mean that the universe might “just stop”, in effect ending with a collection of black holes? It’s more complicated than that. Because there are always other paths for observers to follow. Some correspond to different quantum possibilities. But ultimately what we imagine is that our perception of the universe is a sampling from the whole ruliad—the limiting entangled structure formed by running all abstractly possible computations forever. And it’s a feature of the construction of the ruliad that it’s infinite. Individual paths in it can halt, but the whole ruliad goes on forever.

So what does this mean about the ultimate fate of the universe? Much like the situation with heat death, specific observers may conclude that “nothing interesting is happening anymore”. But something always will be happening, and in fact that something will represent the accumulation of larger and larger amounts of irreducible computation. It won’t be possible for an observer to encompass all this while still themselves “remaining coherent”. But as we’ll discuss later there will inexorably be pockets of computational reducibility for which coherent observers can exist, although what those observers will perceive is likely to be utterly incoherent with anything that we as observers now perceive.

The universe does not fundamentally just “descend into randomness”. And indeed all the things that exist in our universe today will ultimately be encoded in some way forever in the detailed structure that develops. But what the core phenomenon of the Second Law suggests is that at least many aspects of that encoding will not be accessible to observers like us. The future of the universe will transcend what we so far “appreciate”, and will require a redefinition of what we consider meaningful. But it should not be “taken for dead” or dismissed as being just “random heat”. It’s just that to find what we consider interesting, we may in effect have to migrate across the ruliad.

The Second Law gives us the expectation that so long as we start from “reasonable” initial conditions, we should always evolve to some kind of “uniformly random” configuration that we can view as a “unique equilibrium state” that’s “lost any meaningful memory” of the initial conditions. But now that we’ve got ways to explore the Second Law in specific, simple computational systems, we can explicitly study the extent to which this expectation is upheld. And what we’ll find is that even though as a general matter it is, there can still be exceptions in which traces of initial conditions can be preserved at least long into the evolution.

Let’s look again at our “particle cellular automaton” system. We saw above that the evolution of an initial “blob” (here of size 17 in a system with 30 cells) leads to configurations that typically look quite random:

But what about other initial conditions? Here are some samples of what happens:

In some cases we again get what appears to be quite random behavior. But in other cases the behavior looks much more structured. Sometimes this is just because there’s a short recurrence time:

And indeed the overall distribution of recurrence times falls off in a first approximation exponentially (though with a definite tail):

But the distribution is quite broad—with a mean of more than 50,000 steps. (The 17-particle initial blob gives a recurrence time of 155,150 steps.) So what happens with “typical” initial conditions that don’t give short recurrences? Here’s an example:

What’s notable here is that unlike for the case of the “simple blob”, there seem to be identifiable traces of the initial conditions that persist for a long time. So what’s going on—and how does it relate to the Second Law?

Given the basic rules for the particle cellular automaton

we immediately know that at least a couple of aspects of the initial conditions will persist forever. In particular, the rules conserve the total number of “particles” (i.e. non-white cells) so that:

In addition, the number of light or dark cells can change only by increments of 2, and therefore their total number must remain either always even or always odd—and combined with overall particle conservation this then implies that:

What about other conservation laws? We can formulate the conservation of total particle number as saying that the number of instances of “length-1 blocks” with weights specified as follows is always constant:

Then we can go on and ask about conservation laws associated with longer blocks. For blocks of length 2, there are no new nontrivial conservation laws, though for example the weighted combination of blocks

is nominally “conserved”—but only because it is 0 for any possible configuration.

But in addition to such global conservation laws, there are also more local kinds of regularities. For example, a single “light particle” on its own just stays fixed, and a pair of light particles can always trap a single dark particle between them:

For any separation of light particles, it turns out to always be possible to trap any number of dark particles:

But not every initial configuration of dark particles gets trapped. With separation *s* and *d* dark particles, there are a total of `Binomial`[*s*,*d*] possible initial configurations. For *d* = 2, a fraction *s* – 3)/(*s* – 1)*d* = 3, the fraction becomes *s* – 3)(*s* – 4)/(*s*(*s* – 1))*d* = 4 it is (*s* – 4)(*s* – 5)/(*s*(*s* – 1)). (For larger *d*, the trapping fraction continues to be a rational function of *s*, but the polynomials involved rapidly become more complicated.) For sufficiently large separation *s* the trapping fraction always goes to 1—though does so more slowly as *d* increases:

What’s basically going on is that a single dark particle always just “bounces off” a light particle:

But a pair of dark particles can “go through” the light particle, shifting it slightly:

Different things happen with different configurations of dark particles:

And with more complicated “barriers” the behavior can depend in detail on precise phase and separation relationships:

But the basic point is that—although there are various ways they can be modified or destroyed—“light particle walls” can persist for a least a long time. And the result is that if such walls happen to occur in an initial condition they can at least significantly slow down “degradation to randomness”.

For example, this shows evolution over the course of 200,000 steps from a particular initial condition, sampled every 20,000 steps—and even over all these steps we see that there’s definite “wall structure” that survives:

Let’s look at a simpler case: a single light particle surrounded by a few dark particles:

If we plot the position of the light particle we see that for thousands of steps it just jiggles around

but if one runs it long enough it shows systematic motion at a rate of about 1 position every 1300 steps, wrapping around the cyclic boundary conditions, and eventually returning to its starting point—at the recurrence time of 46,836 steps:

What does all this mean? Essentially the point is that even though something like our particle cellular automaton exhibits computational irreducibility and often generates “featureless” apparent randomness, a system like this is also capable of exhibiting computational reducibility in which traces of the initial conditions can persist, and there isn’t just “generic randomness generation”.

Computational irreducibility is a powerful force. But, as we’ll discuss below, its very presence implies that there must inevitably also be “pockets” of computational reducibility. And once again (as we’ll discuss below) it’s a question of the observer how obvious or not these pockets may be in a particular case, and whether—say for observers like us—they affect what we perceive in terms of the operation of the Second Law.

It’s worth commenting that such issues are not just a feature of systems like our particle cellular automaton. And indeed they’ve appeared—stretching all the way back to the 1950s—pretty much whenever detailed simulations have been done of systems that one might expect would show “Second Law” behavior. The story is typically that, yes, there is apparent randomness generated (though it’s often barely studied as such), just as the Second Law would suggest. But then there’s a big surprise of some kind of unexpected regularity. In arrays of nonlinear springs, there were solitons. In hard-sphere gases, there were “long-time tails”—in which correlations in the motion of spheres were seen to decay not exponentially in time, but rather like power laws.

The phenomenon of long-time tails is actually visible in the cellular automaton “approximation” to hard-sphere gases that we studied above. And its interpretation is a good example of how computational reducibility manifests itself. At a small scale, the motion of our idealized molecules shows computational irreducibility and randomness. But on a larger scale, it’s more like “collective hydrodynamics”, with fluid mechanics effects like vortices. And it’s these much-simpler-to-describe computationally reducible effects that lead to the “unexpected regularities” associated with long-time tails.

At its core, the Second Law is about evolution from orderly “simple” initial conditions to apparent randomness. And, yes, this is a phenomenon we can certainly see happen in things like hard-sphere gases in which we’re in effect emulating the motion of physical gas molecules. But what about systems with other underlying rules? Because we’re explicitly doing everything computationally, we’re in a position to just enumerate possible rules (i.e. possible programs) and see what they do.

As an example, here are the distinct patterns produced by all 288 3-color reversible block cellular automata that don’t change the all-white state (but don’t necessarily conserve “particle number”):

As is typical to see in the computational universe of simple programs, there’s quite a diversity of behavior. Often we see it “doing the Second Law thing” and “decaying” to apparent randomness

although sometimes taking a while to do so:

But there are also cases where the behavior just stays simple forever

as well as other cases where it takes a fairly long time before it’s clear what’s going to happen:

In many ways, the most surprising thing here is that such simple rules can generate randomness. And as we’ve discussed, that’s in the end what leads to the Second Law. But what about rules that don’t generate randomness, and just produce simple behavior? Well, in these cases the Second Law doesn’t apply.

In standard physics, the Second Law is often applied to gases—and indeed this was its very first application area. But to a solid whose atoms have stayed in more or less fixed positions for a billion years, it really doesn’t usefully apply. And the same is true, say, for a line of masses connected by perfect springs, with perfect linear behavior.

There’s been a quite pervasive assumption that the Second Law is somehow always universally valid. But it’s simply not true. The validity of the Second Law is associated with the phenomenon of computational irreducibility. And, yes, this phenomenon is quite ubiquitous. But there are definitely systems and situations in which it does not occur. And those will not show “Second Law” behavior.

There are plenty of complicated “marginal” cases, however. For example, for a given rule (like the 3 shown here), some initial conditions may not lead to randomness and “Second Law behavior”, while others do:

And as is so often the case in the computational universe there are phenomena one never expects, like the strange “shock-front-like” behavior of the third rule, which produces randomness, but only on a scale determined by the region it’s in:

It’s worth mentioning that while restricting to a finite region often yields behavior that more obviously resembles a “box of gas molecules”, the general phenomenon of randomness generation also occurs in infinite regions. And indeed we already know this from the classic example of rule 30. But here it is in a reversible block cellular automaton:

In some simple cases the behavior just repeats, but in other cases it’s nested

albeit sometimes in rather complicated ways:

Having identified the computational nature of the core phenomenon of the Second Law we can start to understand in full generality just what the range of this phenomenon is. But what about the ordinary Second Law as it might be applied to familiar physical situations?

Does the ubiquity of computational irreducibility imply that ultimately absolutely everything must “degrade to randomness”? We saw in the previous section that there are underlying rules for which this clearly doesn’t happen. But what about with typical “real-world” systems involving molecules? We’ve seen lots of examples of idealized hard-sphere gases in which we observe randomization. But—as we’ve mentioned several times—even when there’s computational irreducibility, there are always pockets of computational reducibility to be found.

And for example the fact that simple overall gas laws like *PV* = constant apply to our hard-sphere gas can be viewed as an example of computational reducibility. And as another example, consider a hard-sphere gas in which vortex-like circulation has been set up. To get a sense of what happens we can just look at our simple discrete model. At a microscopic level there’s clearly lots of apparent randomness, and it’s hard to see what’s globally going on:

But if we coarse grain the system by 3×3 blocks of cells with “average velocities” we see that there’s a fairly persistent hydrodynamic-like vortex that can be identified:

Microscopically, there’s computational irreducibility and apparent randomness. But macroscopically the particular form of coarse-grained measurement we’re using picks out a pocket of reducibility—and we see overall behavior whose obvious features don’t show “Second-Law-style” randomness.

And in practice this is how much of the “order” we see in the universe seems to work. At a small scale there’s all sorts of computational irreducibility and randomness. But on a larger scale there are features that we as observers notice that tap into pockets of reducibility, and that show the kind of order that we can describe, for example, with simple mathematical laws.

There’s an extreme version of this in our Physics Project, where the underlying structure of space—like the underlying structure of something like a gas—is full of computational irreducibility, but where there are certain overall features that observers like us notice, and that show computational reducibility. One example involves the large-scale structure of spacetime, as described by general relativity. Another involves the identification of particles that can be considered to “move without change” through the system.

One might have thought—as people often have—that the Second Law would imply a degradation of every feature of a system to uniform randomness. But that’s just not how computational irreducibility works. Because whenever there’s computational irreducibility, there are also inevitably an infinite number of pockets of computational reducibility. (If there weren’t, that very fact could be used to “reduce the irreducibility”.)

And what that means is that when there’s irreducibility and Second-Law-like randomization, there’ll also always be orderly laws to be found. But which of those laws will be evident—or relevant—to a particular observer depends on the nature of that observer.

The Second Law is ultimately a story of the mismatch between the computational irreducibility of underlying systems, and the computational boundedness of observers like us. But the point is that if there’s a pocket of computational reducibility that happens to be “a fit” for us as observers, then despite our computational limitations, we’ll be perfectly able to recognize the orderliness that’s associated with it—and we won’t think that the system we’re looking at has just “degraded to randomness”.

So what this means is that there’s ultimately no conflict between the existence of order in the universe, and the operation of the Second Law. Yes, there’s an “ocean of randomness” generated by computational irreducibility. But there’s also inevitably order that lives in pockets of reducibility. And the question is just whether a particular observer “notices” a given pocket of reducibility, or whether they only “see” the “background” of computational irreducibility.

In the “hydrodynamics” example above, the “observer” picks out a “slice” of behavior by looking at aggregated local averages. But another way for an observer to pick out a “slice” of behavior is just to look only at a specific region in a system. And in that case one can observe simpler behavior because in effect “the complexity has radiated away”. For example, here are reversible cellular automata where a random initial block is “simplified” by “radiating its information out”:

If one picked up all those “pieces of radiation” one would be able—with appropriate computational effort—to reconstruct all the randomness in the initial condition. But if we as observers just “ignore the radiation to infinity” then we’ll again conclude that the system has evolved to a simpler state—against the “Second-Law trend” of increasing randomness.

When I first studied cellular automata back in the 1980s, I identified four basic classes of behavior that are seen when starting from generic initial conditions—as exemplified by:

Class 1 essentially always evolves to the same final “fixed-point” state, immediately destroying information about its initial state. Class 2, however, works a bit like solid matter, essentially just maintaining whatever configuration it was started in. Class 3 works more like a gas or a liquid, continually “mixing things up” in a way that looks quite random. But class 4 does something more complicated.

In class 3 there aren’t significant identifiable persistent structures, and everything always seems to quickly get randomized. But the distinguishing feature of class 4 is the presence of identifiable persistent structures, whose interactions effectively define the activity of the system.

So how do these types of behavior relate to the Second Law? Class 1 involves intrinsic irreversibility, and so doesn’t immediately connect to standard Second Law behavior. Class 2 is basically too static to follow the Second Law. But class 3 shows quintessential Second Law behavior, with rapid evolution to “typical random states”. And it’s class 3 that captures the kind of behavior that’s seen in typical Second Law systems, like gases.

But what about class 4? Well, it’s a more complicated story. The “level of activity” in class 4—while above class 2—is in a sense below class 3. But unlike in class 3, where there is typically “too much activity” to “see what’s going on”, class 4 often gives one the idea that it’s operating in a “more potentially understandable” way. There are many different detailed kinds of behavior that appear in class 4 systems. But here are a few examples in reversible block cellular automata:

Looking at the first rule, it’s easy to identify some simple persistent structures, some stationary, some moving:

But even with this rule, many other things can happen too

and in the end the whole behavior of the system is built up from combinations and interactions of structures like these.

The second rule above behaves in an immediately more elaborate way. Here it is starting from a random initial condition:

Starting just from one gets:

Sometimes the behavior seems simpler

though even in the last case here, there is elaborate “number-theoretical” behavior that seems to never quite become either periodic or nested:

We can think of any cellular automaton—or any system based on rules—as “doing a computation” when it evolves. Class 1 and 2 systems basically behave in computationally simple ways. But as soon as we reach class 3 we’re dealing with computational irreducibility, and with a “density of computation” that lets us decode almost nothing about what comes out, with the result that what we see we can basically describe only as “apparently random”. Class 4 no doubt has the same ultimate computational irreducibility—and the same ultimate computational capabilities—as class 3. But now the computation is “less dense”, and seemingly more accessible to human interpretation. In class 3 it’s difficult to imagine making any kind of “symbolic summary” of what’s going on. But in class 4, we see definite structures whose behavior we can imagine being able to describe in a symbolic way, building up what we can think of as a “human-accessible narrative” in which we talk about “structure X collides with structure Y to produce structure Z” and so on.

And indeed if we look at the picture above, it’s not too difficult to imagine that it might correspond to the execution trace of a computation we might do. And more than that, given the “identifiable components” that arise in class 4 systems, one can imagine assembling these to explicitly set up particular computations one wants to do. In a class 3 system “randomness” always just “spurts out”, and one has very little ability to “meaningfully control” what happens. But in a class 4 system, one can potentially do what amounts to traditional engineering or programming to set up an arrangement of identifiable component “primitives” that achieves some particular purpose one has chosen.

And indeed in a case like the rule 110 cellular automaton we know that it’s possible to perform any computation in this way, proving that the system is capable of universal computation, and providing a piece of evidence for the phenomenon of computational irreducibility. No doubt rule 30 is also computation universal. But the point is that with our current ways of analyzing things, class 3 systems like this don’t make this something we can readily recognize.

Like so many other things we’re discussing, this is basically again a story of observers and their capabilities. If observers like us—with our computational boundedness—are going to be able to “get things into our minds” we seem to need to break them down to the point where they can be described in terms of modest numbers of types of somewhat-independent parts. And that’s what the “decomposition into identifiable structures” that we observe in class 4 systems gives us the opportunity to do.

What about class 3? Notwithstanding things like our discussion of traces of initial conditions above, our current powers of perception just don’t seem to let us “understand what’s going on” to the point where we can say much more than there’s apparent randomness. And of course it’s this very point that we’re arguing is the basis for the Second Law. Could there be observers who could “decode class 3 systems”? In principle, absolutely yes. And even if the observers—like us—are computationally bounded, we can expect that there will be at least some pockets of computational reducibility that could be found that would allow progress to be made.

But as of now—with the methods of perception and analysis currently at our disposal—there’s something very different for us about class 3 and class 4. Class 3 shows quintessential “apparently random” behavior, like molecules in a gas. Class 4 shows behavior that looks more like the “insides of a machine” that could have been “intentionally engineered for a purpose”. Having a system that is like this “in bulk” is not something familiar, say from physics. There are solids, and liquids, and gases, whose components have different general organizational characteristics. But what we see in class 4 is something yet different—and quite unfamiliar.

Like solids, liquids and gases, it’s something that can exist “in bulk”, with any number of components. We can think of it as a “phase” of a system. But it’s a new type of phase, that we might call a “mechanoidal phase”.

How do we recognize this phase? Again, it’s a question of the observer. Something like a solid phase is easy for observers like us to recognize. But even the distinction between a liquid and a gas can be more difficult to recognize. And to recognize the mechanoidal phase we basically have to be asking something like “Is this a computation we recognize?”

How does all this relate to the Second Law? Class 3 systems—like gases—immediately show typical “Second Law” behavior, characterized by randomness, entropy increase, equilibrium, and so on. But class 4 systems work differently. They have new characteristics that don’t fit neatly into the rubric of the Second Law.

No doubt one day we will have theories of the mechanoidal phase just like today we have theories of gases, of liquids and of solids. Likely those theories will have to get more sophisticated in characterizing the observer, and in describing what kinds of coarse graining can reasonably be done. Presumably there will be some kind of analog of the Second Law that leverages the difference between the capabilities and features of the observer and the system they’re observing. But in the mechanoidal phase there is in a sense less distance between the mechanism of the system and the mechanism of the observer, so we probably can’t expect a statement as ultimately simple and clear-cut as the usual Second Law.

The Second Law has long had an uneasy relationship with biology. “Physical” systems like gases readily show the “decay” to randomness expected from the Second Law. But living systems instead somehow seem to maintain all sorts of elaborate organization that doesn’t immediately “decay to randomness”— and indeed actually seems able to grow just through “processes of biology”.

It’s easy to point to the continual absorption of energy and material by living systems—as well as their eventual death and decay—as reasons why such systems might still at least nominally follow the Second Law. But even if at some level this works, it’s not particularly useful in letting us talk about the actual significant “bulk” features of living systems—in the kind of way that the Second Law routinely lets us make “bulk” statements about things like gases.

So how might we begin to describe living systems “in bulk”? I suspect a key is to think of them as being in large part in what we’re here calling the mechanoidal phase. If one looks inside a living organism at a molecular scale, there are some parts that can reasonably be described as solid, liquid or gas. But what molecular biology has increasingly shown is that there’s often much more elaborate molecular-scale organization than exist in those phases—and moreover that at least at some level this organization seems “describable” and “machine-like”, with molecules and collections of molecules that we can say have “particular functions”, often being “carefully” and actively transported by things like the cytoskeleton.

In any given organism, there are for example specific proteins defined by the genomics of the organism, that behave in specific ways. But one suspects that there’s also a higher-level or “bulk” description that allows one to make at least some kinds of general statements. There are already some known general principles in biology—like the concept of natural selection, or the self-replicating digital character of genetic information—that let one come to various conclusions independent of microscopic details.

And, yes, in some situations the Second Law provides certain kinds of statements about biology. But I suspect that there are much more powerful and significant principles to be discovered, that in fact have the potential to unlock a whole new level of global understanding of biological systems and processes.

It’s perhaps worth mentioning an analogy in technology. In a microprocessor what we can think of as the “working fluid” is essentially a gas of electrons. At some level the Second Law has things to say about this gas of electrons, for example describing scattering processes that lead to electrical resistance. But the vast majority of what matters in the behavior of this particular gas of electrons is defined not by things like this, but by the elaborate pattern of wires and switches that exist in the microprocessor, and that guide the motion of the electrons.

In living systems one sometimes also cares about the transport of electrons—though more often it’s atoms and ions and molecules. And living systems often seem to provide what one can think of as a close analog of wires for transporting such things. But what is the arrangement of these “wires”? Ultimately it’ll be defined by the application of rules derived from things like the genome of the organism. Sometimes the results will for example be analogous to crystalline or amorphous solids. But in other cases one suspects that it’ll be better described by something like the mechanoidal phase.

Quite possibly this may also provide a good bulk description of technological systems like microprocessors or large software codebases. And potentially then one might be able to have high-level laws—analogous to the Second Law—that would make high-level statements about these technological systems.

It’s worth mentioning that a key feature of the mechanoidal phase is that detailed dynamics—and the causal relations it defines—matter. In something like a gas it’s perfectly fine for most purposes to assume “molecular chaos”, and to say that molecules are arbitrarily mixed. But the mechanoidal phase depends on the “detailed choreography” of elements. It’s still a “bulk phase” with arbitrarily many elements. But things like the detailed history of interactions of each individual element matter.

In thinking about typical chemistry—say in a liquid or gas phase—one’s usually just concerned with overall concentrations of different kinds of molecules. In effect one assumes that the “Second Law has acted”, and that everything is “mixed randomly” and the causal histories of molecules don’t matter. But it’s increasingly clear that this picture isn’t correct for molecular biology, with all its detailed molecular-scale structures and mechanisms. And instead it seems more promising to model what’s there as being in the mechanoidal phase.

So how does this relate to the Second Law? As we’ve discussed, the Second Law is ultimately a reflection of the interplay between underlying computational irreducibility and the limited computational capabilities of observers like us. But within computational irreducibility there are inevitably always “pockets” of computational reducibility—which the observer may or may not care about, or be able to leverage.

In the mechanoidal phase there is ultimately computational irreducibility. But a defining feature of this phase is the presence of “local computational reducibility” visible in the existence of identifiable localized structures. Or, in other words, even to observers like us, it’s clear that the mechanoidal phase isn’t “uniformly computationally irreducible”. But just what general statements can be made about it will depend—potentially in some detail—on the characteristics of the observer.

We’ve managed to get a long way in discussing the Second Law—and even more so in doing our Physics Project—by making only very basic assumptions about observers. But to be able to make general statements about the mechanoidal phase—and living systems—we’re likely to have to say more about observers. If one’s presented with a lump of biological tissue one might at first just describe it as some kind of gel. But we know there’s much more to it. And the question is what features we can perceive. Right now we can see with microscopes all kinds of elaborate spatial structures. Perhaps in the future there’ll be technology that also lets us systematically detect dynamic and causal structures. And it’ll be the interplay of what we perceive with what’s computationally going on underneath that’ll define what general laws we will be able to see emerge.

We already know we won’t just get the ordinary Second Law. But just what we will get isn’t clear. But somehow—perhaps in several variants associated with different kinds of observers—what we’ll get will be something like “general laws of biology”, much like in our Physics Project we get general laws of spacetime and of quantum mechanics, and in our analysis of metamathematics we get “general laws of mathematics”.

Traditional twentieth-century physics treats spacetime a bit like a continuous fluid, with its characteristics being defined by the continuum equations of general relativity. Attempts to align this with quantum field theory led to the idea of attributing an entropy to black holes, in essence to represent the number of quantum states “hidden” by the event horizon of the black hole. But in our Physics Project there is a much more direct way of thinking about spacetime in what amount to thermodynamic terms.

A key idea of our Physics Project is that there’s something “below” the “fluid” representation of spacetime—and in particular that space is ultimately made of discrete elements, whose relations (which can conveniently be represented by a hypergraph) ultimately define everything about the structure of space. This structure evolves according to rules that are somewhat analogous to those for block cellular automata, except that now one is doing replacements not for blocks of cell values, but instead for local pieces of the hypergraph.

So what happens in a system like this? Sometimes the behavior is simple. But very often—much like in many cellular automata—there is great complexity in the structure that develops even from simple initial conditions:

It’s again a story of computational irreducibility, and of the generation of apparent randomness. The notion of “randomness” is a bit less straightforward for hypergraphs than for arrays of cell values. But what ultimately matters is what “observers like us” perceive in the system. A typical approach is to look at geodesic balls that encompass all elements within a certain graph distance of a given element—and then to study the effective geometry that emerges in the large-scale limit. It’s then a bit like seeing fluid dynamics emerge from small-scale molecular dynamics, except that here (after navigating many technical issues) it’s the Einstein equations of general relativity that emerge.

But the fact that this can work relies on something analogous to the Second Law. It has to be the case that the evolution of the hypergraph leads at least locally to something that can be viewed as “uniformly random”, and on which statistical averages can be done. In effect, the microscopic structure of spacetime is reaching some kind of “equilibrium state”, whose detailed internal configuration “seems random”—but which has definite “bulk” properties that are perceived by observers like us, and give us the impression of continuous spacetime.

As we’ve discussed above, the phenomenon of computational irreducibility means that apparent randomness can arise completely deterministically just by following simple rules from simple initial conditions. And this is presumably what basically happens in the evolution and “formation” of spacetime. (There are some additional complications associated with multicomputation that we’ll discuss at least to some extent later.)

But just like for the systems like gases that we’ve discussed above, we can now start talking directly about things like entropy for spacetime. As “large-scale observers” of spacetime we’re always effectively doing coarse graining. So now we can ask how many microscopic configurations of spacetime (or space) are consistent with whatever result we get from that coarse graining.

As a toy example, consider just enumerating all possible graphs (say up to a given size), then asking which of them have a certain pattern of volumes for geodesic balls (i.e. a certain sequence of numbers of distinct nodes within a given graph distance of a particular node). The “coarse-grained entropy” is simply determined by the number of graphs in which the geodesic ball volumes start in the same way. Here are all trivalent graphs (with up to 24 nodes) that have various such geodesic ball “signatures” (most, but not all, turn out to be vertex transitive; these graphs were found by filtering a total of 125,816,453 possibilities):

We can think of the different numbers of graphs in each case as representing different entropies for a tiny fragment of space constrained to have a given “coarse-grained” structure. At the graph sizes we’re dealing with here, we’re very far from having a good approximation to continuum space. But assume we could look at much larger graphs. Then we might ask how the entropy varies with “limiting geodesic ball signature”—which in the continuum limit is determined by dimension, curvature, etc.

For a general “disembodied lump of spacetime” this is all somewhat hard to define, particularly because it depends greatly on issues of “gauge” or of how the spacetime is foliated into spacelike slices. But event horizons, being in a sense much more global, don’t have such issues, and so we can expect to have fairly invariant definitions of spacetime entropy in this case. And the expectation would then be that for example the entropy we would compute would agree with the “standard” entropy computed for example by analyzing quantum fields or strings near a black hole. But with the setup we have here we should also be able to ask more general questions about spacetime entropy—for example seeing how it varies with features of arbitrary gravitational fields.

In most situations the spacetime entropy associated with any spacetime configuration that we can successfully identify at our coarse-grained level will be very large. But if we could ever find a case where it is instead small, this would be somewhere we could expect to start seeing a breakdown of the continuum “equilibrium” structure of spacetime, and where evidence of discreteness should start to show up.

We’ve so far mostly been discussing hypergraphs that represent instantaneous states of space. But in talking about spacetime we really need to consider causal graphs that map out the causal relationships between updating events in the hypergraph, and that represent the structure of spacetime. And once again, such graphs can show apparent randomness associated with computational irreducibility.

One can make causal graphs for all sorts of systems. Here is one for a “Newton’s cradle” configuration of an (effectively 1D) hard-sphere gas, in which events are collisions between spheres, and two events are causally connected if a sphere goes from one to the other:

And here is an example for a 2D hard-sphere case, with the causal graph now reflecting the generation of apparently random behavior:

Similar to this, we can make a causal graph for our particle cellular automaton, in which we consider it an event whenever a block changes (but ignore “no-change updates”):

For spacetime, features of the causal graph have some definite interpretations. We define the reference frame we’re using by specifying a foliation of the causal graph. And one of the results of our Physics Project is then that the flux of causal edges through the spacelike hypersurfaces our foliation defines can be interpreted directly as the density of physical energy. (The flux through timelike hypersurfaces gives momentum.)

One can make a surprisingly close analogy to causal graphs for hard-sphere gases—except that in a hard-sphere gas the causal edges correspond to actual, nonrelativistic motion of idealized molecules, while in our model of spacetime the causal edges are abstract connections that are in effect always lightlike (i.e. they correspond to motion at the speed of light). In both cases, reducing the number of events is like reducing some version of temperature—and if one approaches no-event “absolute zero” both the gas and spacetime will lose their cohesion, and no longer allow propagation of effects from one part of the system to another.

If one increases density in the hard-sphere gas one will eventually form something like a solid, and in this case there will be a regular arrangement of both spheres and the causal edges. In spacetime something similar may happen in connection with event horizons—which may behave like an “ordered phase” with causal edges aligned.

What happens if one combines thinking about spacetime and thinking about matter? A long-unresolved issue concerns systems with many gravitationally attracting bodies—say a “gas” of stars or galaxies. While the molecules in an ordinary gas might evolve to an apparently random configuration in a standard “Second Law way”, gravitationally attracting bodies tend to clump together to make what seem like “progressively simpler” configurations.

It could be that this is a case where the standard Second Law just doesn’t apply, but there’s long been a suspicion that the Second Law can somehow be “saved” by appropriately associating an entropy with the structure of spacetime. In our Physics Project, as we’ve discussed, there’s always entropy associated with our coarse-grained perception of spacetime. And it’s conceivable that, at least in terms of overall counting of states, increased “organization” of matter could be more than balanced by enlargement in the number of available states for spacetime.

We’ve discussed at length above the idea that “Second Law behavior” is the result of us as observers (and preparers of initial states) being “computationally weak” relative to the computational irreducibility of the underlying dynamics of systems. And we can expect that very much the same thing will happen for spacetime. But what if we could make a Maxwell’s demon for spacetime? What would this mean?

One rather bizarre possibility is that it could allow faster-than-light “travel”. Here’s a rough analogy. Gas molecules—say in air in a room—move at roughly the speed of sound. But they’re always colliding with other molecules, and getting their directions randomized. But what if we had a Maxwell’s-demon-like device that could tell us at every collision which molecule to ride on? With an appropriate choice for the sequence of molecules we could then potentially “surf” across the room at roughly the speed of sound. Of course, to have the device work it’d have to overcome the computational irreducibility of the basic dynamics of the gas.

In spacetime, the causal graph gives us a map of what event can affect what other event. And insofar as we just treat spacetime as “being in uniform equilibrium” there’ll be a simple correspondence between “causal distance” and what we consider distance in physical space. But if we look down at the level of individual causal edges it’ll be more complicated. And in general we could imagine that an appropriate “demon” could predict the microscopic causal structure of spacetime, and carefully pick causal edges that could “line up” to “go further in space” than the “equilibrium expectation”.

Of course, even if this worked, there’s still the question of what could be “transported” through such a “tunnel”—and for example even a particle (like an electron) presumably involves a vast number of causal edges, that one wouldn’t be able to systematically organize to fit through the tunnel. But it’s interesting to realize that in our Physics Project the idea that “nothing can go faster than light” becomes something very much analogous to the Second Law: not a fundamental statement about underlying rules, but rather a statement about our interaction with them, and our capabilities as observers.

So if there’s something like the Second Law that leads to the structure of spacetime as we typically perceive it, what can be said about typical issues in thermodynamics in connection with spacetime? For example, what’s the story with perpetual motion machines in spacetime?

Even before talking about the Second Law, there are already issues with the First Law of thermodynamics—because in a cosmological setting there isn’t local conservation of energy as such, and for example the expansion of the universe can transfer energy to things. But what about the Second Law question of “getting mechanical work from heat”? Presumably the analog of “mechanical work” is a gravitational field that is “sufficiently organized” that observers like us can readily detect it, say by seeing it pull objects in definite directions. And presumably a perpetual motion machine based on violating the Second Law would then have to take the heat-like randomness in “ordinary spacetime” and somehow organize it into a systematic and measurable gravitational field. Or, in other words, “perpetual motion” would somehow have to involve a gravitational field “spontaneously being generated” from the microscopic structure of spacetime.

Just like in ordinary thermodynamics, the impossibility of doing this involves an interplay between the observer and the underlying system. And conceivably it might be possible that there could be an observer who can measure specific features of spacetime that correspond to some slice of computational reducibility in the underlying dynamics—say some weird configuration of “spontaneous motion” of objects. But absent this, a “Second-Law-violating” perpetual motion machine will be impossible.

Like statistical mechanics (and thermodynamics), quantum mechanics is usually thought of as a statistical theory. But whereas the statistical character of statistical mechanics one imagines to come from a definite, knowable “mechanism underneath”, the statistical character of quantum mechanics has usually just been treated as a formal, underivable “fact of physics”.

In our Physics Project, however, the story is different, and there’s a whole lower-level structure—ultimately rooted in the ruliad—from which quantum mechanics and its statistical character appears to be derived. And, as we’ll discuss, that derivation in the end has close connections both to what we’ve said about the standard Second Law, and to what we’ve said about the thermodynamics of spacetime.

In our Physics Project the starting point for quantum mechanics is the unavoidable fact that when one’s applying rules to transform hypergraphs, there’s typically more than one rewrite that can be done to any given hypergraph. And the result of this is that there are many different possible “paths of history” for the universe.

As a simple analog, consider rewriting not hypergraphs but strings. And doing this, we get for example:

This is a deterministic representation of all possible “paths of history”, but in a sense it’s very wasteful, among other things because it includes multiple copies of identical strings (like BBBB). If we merge such identical copies, we get what we call a multiway graph, that contains both branchings and mergings:

In the “innards” of quantum mechanics one can imagine that all these paths are being followed. So how is it that we as observers perceive definite things to happen in the world? Ultimately it’s a story of coarse graining, and of us conflating different paths in the multiway graph.

But there’s a wrinkle here. In statistical mechanics we imagine that we can observe from outside the system, implementing our coarse graining by sampling particular features of the system. But in quantum mechanics we imagine that the multiway system describes the whole universe, including us. So then we have the peculiar situation that just as the universe is branching and merging, so too are our brains. And ultimately what we observe is therefore the result of a branching brain perceiving a branching universe.

But given all those branches, can we just decide to conflate them into a single thread of experience? In a sense this is a typical question of coarse graining and of what we can consistently equivalence together. But there’s something a bit different here because without the “coarse graining” we can’t talk at all about “what happened”, only about what might be happening. Put another way, we’re now fundamentally dealing not with computation (like in a cellular automaton) but with multicomputation.

And in multicomputation, there are always two fundamental kinds of operations: the generation of new states from old, and the equivalencing of states, effectively by the observer. In ordinary computation, there can be computational irreducibility in the process of generating a thread of successive states. In multicomputation, there can be multicomputational irreducibility in which in a sense all computations in the multiway system have to be done in order even to determine a single equivalenced result. Or, put another way, you can’t shortcut following all the paths of history. If you try to equivalence at the beginning, the equivalence class you’ve built will inevitably be “shredded” by the evolution, forcing you to follow each path separately.

It’s worth commenting that just as in classical mechanics, the “underlying dynamics” in our description of quantum mechanics are reversible. In the original unmerged evolution tree above, we could just reverse each rule and from any point uniquely construct a “backwards tree”. But once we start merging and equivalencing, there isn’t the same kind of “direct reversibility”—though we can still count possible paths to determine that we preserve “total probability”.

In ordinary computational systems, computational irreducibility implies that even from simple initial conditions we can get behavior that “seems random” with respect to most computationally bounded observations. And something directly analogous happens in multicomputational systems. From simple initial conditions, we generate collections of paths of history that “seem random” with respect to computationally bounded equivalencing operations, or, in other words, to observers who do computationally bounded coarse graining of different paths of history.

When we look at the graphs we’ve drawn representing the evolution of a multiway system, we can think of there being a time direction that goes down the page, following the arrows that point from states to their successors. But across the page, in the transverse direction, we can think of there as being a space in which different paths of history are laid—what we call “branchial space”.

A typical way to start constructing branchial space is to take slices across the multiway graph, then to form a branchial graph in which two states are joined if they have a common ancestor on the step before (which means we can consider them “entangled”):

Although the details remain to be clarified, it seems as if in the standard formalism of quantum mechanics, distance in branchial space corresponds essentially to quantum phase, so that, for example, particles whose phases would make them show destructive interference will be at “opposite ends” of branchial space.

So how do observers relate to branchial space? Basically what an observer is doing is to coarse grain in branchial space, equivalencing certain paths of history. And just as we have a certain extent in physical space, which determines our coarse graining of gases, and—at a much smaller scale—of the structure of spacetime, so also we have an extent in branchial space that determines our coarse graining across branches of history.

But this is where multicomputational irreducibility and the analog of the Second Law are crucial. Because just as we imagine that gases—and spacetime—achieve a certain kind of “unique random equilibrium” that leads us to be able to make consistent measurements of them, so also we can imagine that in quantum mechanics there is in effect a “branchial space equilibrium” that is achieved.

Think of a box of gas in equilibrium. Put two pistons on different sides of the box. So long as they don’t perturb the gas too much, they’ll both record the same pressure. And in our Physics Project it’s the same story with observers and quantum mechanics. Most of the time there’ll be enough effective randomness generated by the multicomputationally irreducible evolution of the system (which is completely deterministic at the level of the multiway graph) that a computationally bounded observer will always see the same “equilibrium values”.

A central feature of quantum mechanics is that by making sufficiently careful measurements one can see what appear to be random results. But where does that randomness come from? In the usual formalism for quantum mechanics, the idea of purely probabilistic results is just burnt into the formal structure. But in our Physics Project, the apparent randomness one sees has a definite, “mechanistic” origin. And it’s basically the same as the origin of randomness for the standard Second Law, except that now we’re dealing with multicomputational rather than pure computational irreducibility.

By the way, the “Bell’s inequality” statement that quantum mechanics cannot be based on “mechanistic randomness” unless it comes from a nonlocal theory remains true in our Physics Project. But in the Physics Project we have an immediate ubiquitous source of “nonlocality”: the equivalencing or coarse graining “across” branchial space done by observers.

(We’re not discussing the role of physical space here. But suffice it to say that instead of having each node of the multiway graph represent a complete state of the universe, we can make an extended multiway graph in which different spatial elements—like different paths of history—are separated, with their “causal entanglements” then defining the actual structure of space, in a spatial analog of the branchial graph.)

As we’ve already noted, the complete multiway graph is entirely deterministic. And indeed if we have a complete branchial slice of the graph, this can be used to determine the whole future of the graph (the analog of “unitary evolution” in the standard formalism of quantum mechanics). But if we equivalence states—corresponding to “doing a measurement”—then we won’t have enough information to uniquely determine the future of the system, at least when it comes to what we consider to be quantum effects.

At the outset, we might have thought that statistical mechanics, spacetime mechanics and quantum mechanics were all very different theories. But what our Physics Project suggests is that in fact they are all based on a common, fundamentally computational phenomenon.

So what about other ideas associated with the standard Second Law? How do they work in the quantum case?

Entropy, for example, now just becomes a measure of the number of possible configurations of a branchial graph consistent with a certain coarse-grained measurement. Two independent systems will have disconnected branchial graphs. But as soon as the systems interact, their branchial graphs will connect, and the number of possible graph configurations will change, leading to an “entanglement entropy”.

One question about the quantum analog of the Second Law is what might correspond to “mechanical work”. There may very well be highly structured branchial graphs—conceivably associated with things like coherent states—but it isn’t yet clear how they work and whether existing kinds of measurements can readily detect them. But one can expect that multicomputational irreducibility will tend to produce branchial graphs that can’t be “decoded” by most computationally bounded measurements—so that, for example, “quantum perpetual motion”, in which “branchial organization” is spontaneously produced, can’t happen.

And in the end randomness in quantum measurements is happening for essentially the same basic reason we’d see randomness if we looked at small numbers of molecules in a gas: it’s not that there’s anything fundamentally not deterministic underneath, it’s just there’s a computational process that’s making things too complicated for us to “decode”, at least as observers with bounded computational capabilities. In the case of the gas, though, we’re sampling molecules at different places in physical space. But in quantum mechanics we’re doing the slightly more abstract thing of sampling states of the system at different places in branchial space. But the same fundamental randomization is happening, though now through multicomputational irreducibility operating in branchial space.

The original formulation of the Second Law a century and a half ago—before even the existence of molecules was established—was an impressive achievement. And one might assume that over the course of 150 years—with all the mathematics and physics that’s been done—a complete foundational understanding of the Second Law would long ago have been developed. But in fact it has not. And from what we’ve discussed here we can now see why. It’s because the Second Law is ultimately a computational phenomenon, and to understand it requires an understanding of the computational paradigm that’s only very recently emerged.

Once one starts doing actual computational experiments in the computational universe (as I already did in the early 1980s) the core phenomenon of the Second Law is surprisingly obvious—even if it violates one’s traditional intuition about how things should work. But in the end, as we have discussed here, the Second Law is a reflection of a very general, if deeply computational, idea: an interplay between computational irreducibility and the computational limitations of observers like us. The Principle of Computational Equivalence tells us that computational irreducibility is inevitable. But the limitation of observers is something different: it’s a kind of epiprinciple of science that’s in effect a formalization of our human experience and our way of doing science.

Can we tighten up the formulation of all this? Undoubtedly. We have various standard models of the computational process—like Turing machines and cellular automata. We still need to develop an “observer theory” that provides standard models for what observers like us can do. And the more we can develop such a theory, the more we can expect to make explicit proofs of specific statements about the Second Law. Ultimately these proofs will have solid foundations in the Principle of Computational Equivalence (although there remains much to formalize there too), but will rely on models for what “observers like us” can be like.

So how general do we expect the Second Law to be in the end? In the past couple of sections we’ve seen that the core of the Second Law extends to spacetime and to quantum mechanics. But even when it comes to the standard subject matter of statistical mechanics, we expect limitations and exceptions to the Second Law.

Computational irreducibility and the Principle of Computational Equivalence are very general, but not very specific. They talk about the overall computational sophistication of systems and processes. But they do not say that there are no simplifying features. And indeed we expect that in any system that shows computational irreducibility, there will always be arbitrarily many “slices of computational reducibility” that can be found.

The question then is whether those slices of reducibility will be what an observer can perceive, or will care about. If they are, then one won’t see Second Law behavior. If they’re not, one will just see “generic computational irreducibility” and Second Law behavior.

How can one find the slices of reducibility? Well, in general that’s irreducibly hard. Every slice of reducibility is in a sense a new scientific or mathematical principle. And the computational irreducibility involved in finding such reducible slices basically speaks to the ultimately unbounded character of the scientific and mathematical enterprise. But once again, even though there might be an infinite number of slices of reducibility, we still have to ask which ones matter to us as observers.

The answer could be one thing for studying gases, and another, for example, for studying molecular biology, or social dynamics. The question of whether we’ll see “Second Law behavior” then boils to whether whatever we’re studying turns out to be something that doesn’t simplify, and ends up showing computational irreducibility.

If we have a sufficiently small system—with few enough components—then the computational irreducibility may not be “strong enough” to stop us from “going beyond the Second Law”, and for example constructing a successful Maxwell’s demon. And indeed as computer and sensor technology improve, it’s becoming increasingly feasible to do measurement and set up control systems that effectively avoid the Second Law in particular, small systems.

But in general the future of the Second Law and its applicability is really all about how the capabilities of observers develop. What will future technology, and future paradigms, do to our ability to pick away at computational irreducibility?

In the context of the ruliad, we are currently localized in rulial space based on our existing capabilities. But as we develop further we are in effect “colonizing” rulial space. And a system that may look random—and may seem to follow the Second Law—from one place in rulial space may be “revealed as simple” from another.

There is an issue, though. Because the more we as observers spread out in rulial space, the less coherent our experience will become. In effect we’ll be following a larger bundle of threads in rulial space, which makes who “we” are less definite. And in the limit we’ll presumably be able to encompass all slices of computational reducibility, but at the cost of having our experience “incoherently spread” across all of them.

It’s in the end some kind of tradeoff. Either we can have a coherent thread of experience, in which case we’ll conclude that the world produces apparent randomness, as the Second Law suggests. Or we can develop to the point where we’ve “spread our experience” and no longer have coherence as observers, but can recognize enough regularities that the Second Law potentially seems irrelevant.

But as of now, the Second Law is still very much with us, even if we are beginning to see some of its limitations. And with our computational paradigm we are finally in a position to see its foundations, and understand how it ultimately works.

*Thanks & Notes*

Thanks to Brad Klee, Kegan Allen, Jonathan Gorard, Matt Kafker, Ed Pegg and Michael Trott for their help—as well as to the many people who have contributed to my understanding of the Second Law over the 50+ years I’ve been interested in it.

Wolfram Language to generate every image here is available by clicking the image in the online version. Raw research notebooks for this work are available here; video work logs are here.

I’ve been trying to understand the Second Law now for a bit more than 50 years.

It all started when I was 12 years old. Building on an earlier interest in space and spacecraft, I’d gotten very interested in physics, and was trying to read everything I could about it. There were several shelves of physics books at the local bookstore. But what I coveted most was the largest physics book collection there: a series of five plushly illustrated college textbooks. And as a kind of graduation gift when I finished (British) elementary school in June 1972 I arranged to get those books. And here they are, still on my bookshelf today, just a little faded, more than half a century later:

For a while the first book in the series was my favorite. Then the third. The second. The fourth. The fifth one at first seemed quite mysterious—and somehow more abstract in its goals than the others:

What story was the filmstrip on its cover telling? For a couple of months I didn’t look seriously at the book. And I spent much of the summer of 1972 writing my own (unseen by anyone else for 30+ years) *Concise Directory of Physics*

that included a rather stiff page about energy, mentioning entropy—along with the heat death of the universe.

But one afternoon late that summer I decided I should really find out what that mysterious fifth book was all about. Memory being what it is I remember that—very unusually for me—I took the book to read sitting on the grass under some trees. And, yes, my archives almost let me check my recollection: in the distance, there’s the spot, except in 1967 the trees are significantly smaller, and in 1985 they’re bigger:

Of course, by 1972 I was a little bigger than in 1967—and here I am a little later, complete with a book called *Planets and Life* on the ground, along with a tube of (British) Smarties, and, yes, a pocket protector (but, hey, those were actual ink pens):

But back to the mysterious green book. It wasn’t like anything I’d seen before. It was full of pictures like the one on the cover. And it seemed to be saying that—just by looking at those pictures and thinking—one could figure out fundamental things about physics. The other books I’d read had all basically said “physics works like this”. But here was a book saying “you can figure out how physics has to work”. Back then I definitely hadn’t internalized it, but I think what was so exciting that day was that I got a first taste of the idea that one didn’t have to be told how the world works; one could just figure it out:

I didn’t yet understand quite a bit of the math in the book. But it didn’t seem so relevant to the core phenomenon the book was apparently talking about: the tendency of things to become more random. I remember wondering how this related to stars being organized into galaxies. Why might that be different? The book didn’t seem to say, though I thought maybe somewhere it was buried in the math.

But soon the summer was over, and I was at a new school, mostly away from my books, and doing things like diligently learning more Latin and Greek. But whenever I could I was learning more about physics—and particularly about the hot area of the time: particle physics. The pions. The kaons. The lambda hyperon. They all became my personal friends. During the school vacations I would excitedly bicycle the few miles to the nearby university library to check out the latest journals and the latest news about particle physics.

The school I was at (Eton) had five centuries of history, and I think at first I assumed no particular bridge to the future. But it wasn’t long before I started hearing mentions that somewhere at the school there was a computer. I’d seen a computer in real life only once—when I was 10 years old, and from a distance. But now, tucked away at the edge of the school, above a bicycle repair shed, there was an island of modernity, a “computer room” with a glass partition separating off a loudly humming desk-sized piece of electronics that I could actually touch and use: an Elliott 903C computer with 8 kilowords of 18-bit ferrite core memory (acquired by the school in 1970 for £12,000, or about about $300k today):

At first it was such an unfamiliar novelty that I was satisfied writing little programs to do things like compute primes, print curious patterns on the teleprinter, and play tunes with the built-in diagnostic tone generator. But it wasn’t long before I set my sights on the goal of using the computer to reproduce that interesting picture on the book cover.

I programmed in assembler, with my programs on paper tape. The computer had just 16 machine instructions, which included arithmetic ones, but only for integers. So how was I going to simulate colliding “molecules” with that? Somewhat sheepishly, I decided to put everything on a grid, with everything represented by discrete elements. There was a convention for people to name their programs starting with their own first initial. So I called the program SPART, for “Stephen’s Particle Program”. (Thinking about it today, maybe that name reflected some aspiration of relating this to particle physics.)

It was the most complicated program I had ever written. And it was hard to test, because, after all, I didn’t really know what to expect it to do. Over the course of several months, it went through many versions. Rather often the program would just mysteriously crash before producing any output (and, yes, there weren’t real debugging tools yet). But eventually I got it to systematically produce output. But to my disappointment the output never looked much like the book cover.

I didn’t know why, but I assumed it was because I was simplifying things too much, putting everything on a grid, etc. A decade later I realized that in writing my program I’d actually ended up inventing a form of 2D cellular automaton. And I now rather suspect that this cellular automaton—like rule 30—was actually intrinsically generating randomness, and in some sense showing what I now understand to be the core phenomenon of the Second Law. But at the time I absolutely wasn’t ready for this, and instead I just assumed that what I was seeing was something wrong and irrelevant. (In past years, I had suspected that what went wrong had to do with details of particle behavior on square—as opposed to other—grids. But I now suspect it was instead that the system was in a sense generating too much randomness, making the intended “molecular dynamics” unrecognizable.)

I’d love to “bring SPART back to life”, but I don’t seem to have a copy anymore, and I’m pretty sure the printouts I got as output back in 1973 seemed so “wrong” I didn’t keep them. I do still have quite a few paper tapes from around that time, but as of now I’m not sure what’s on them—not least because I wrote my own “advanced” paper-tape loader, which used what I later learned were error-correcting codes to try to avoid problems with pieces of “confetti” getting stuck in the holes that had been punched in the tape:

I don’t know what would have happened if I’d thought my program was more successful in reproducing “Second Law” behavior back in 1973 when I was 13 years old. But as it was, in the summer of 1973 I was away from “my” computer, and spending all my time on particle physics. And between that summer and early 1974 I wrote a book-length summary of what I called “The Physics of Subatomic Particles”:

I don’t think I’d looked at this in any detail in 48 years. But reading it now I am a bit shocked to find history and explanations that I think are often better than I would immediately give today—even if they do bear definite signs of coming from a British early teenager writing “scientific prose”.

Did I talk about statistical mechanics and the Second Law? Not directly, though there’s a curious passage where I speculate about the possibility of antimatter galaxies, and their (rather un-Second-Law-like) segregation from ordinary, matter galaxies:

By the next summer I was writing the 230-page, much more technical “Introduction to the Weak Interaction”. Lots of quantum mechanics and quantum field theory. No statistical mechanics. The closest it gets is a chapter on CP violation (AKA time-reversal violation)—a longtime favorite topic of mine—but from a very particle-physics point of view. By the next year I was publishing papers about particle physics, with no statistical mechanics in sight—though in a picture of me (as a “lanky youth”) from that time, the *Statistical Physics* book is right there on my shelf, albeit surrounded by particle physics books:

But despite my focus on particle physics, I still kept thinking about statistical mechanics and the Second Law, and particularly its implications for the large-scale structure of the universe, and things like the possibility of matter-antimatter separation. And in early 1977, now 17 years old, and (briefly) a college student in Oxford, my archives record that I gave a talk to the newly formed (and short-lived) Oxford Natural Science Club entitled “Whither Physics” in which I talked about “large, small, many” as the main frontiers of physics, and presented the visual

with a dash of “unsolved purple” impinging on statistical mechanics, particularly in connection with non-equilibrium situations. Meanwhile, looking at my archives today, I find some “back of the envelope” equilibrium statistical mechanics from that time (though I have no idea now what this was about):

But then, in the fall of 1977 I ended up for the first time really needing to use statistical mechanics “in production”. I had gotten interested in what would later become a hot area: the intersection between particle physics and the early universe. One of my interests was neutrino background radiation (the neutrino analog of the cosmic microwave background); another was early-universe production of stable charged particles heavier than the proton. And it turned out that to study these I needed all three of cosmology, particle physics, and statistical mechanics:

In the couple of years that followed, I worked on all sorts of topics in particle physics and in cosmology. Quite often ideas from statistical mechanics would show up, like when I worked on the hadronization of quarks and gluons, or when I worked on phase transitions in the early universe. But it wasn’t until 1979 that the Second Law made its first explicit appearance by name in my published work.

I was studying how there could be a net excess of matter over antimatter throughout the universe (yes, I’d by then given up on the idea of matter-antimatter separation). It was a subtle story of quantum field theory, time reversal violation, General Relativity—and non-equilibrium statistical mechanics. And in the paper we wrote we included a detailed appendix about Boltzmann’s *H* theorem and the Second Law—and the generalization we needed for relativistic quantum time-reversal-violating systems in an expanding universe:

All this got me thinking again about the foundations of the Second Law. The physicists I was around mostly weren’t too interested in such topics—though Richard Feynman was something of an exception. And indeed when I did my PhD thesis defense in November 1979 it ended up devolving into a spirited multi-hour debate with Feynman about the Second Law. He maintained that the Second Law must ultimately cause everything to randomize, and that the order we see in the universe today must be some kind of temporary fluctuation. I took the point of view that there was something else going on, perhaps related to gravity. Today I would have more strongly made the rather Feynmanesque point that if you have a theory that says everything we observe today is an exception to your theory, then the theory you have isn’t terribly useful.

Back in 1973 I never really managed to do much science on the very first computer I used. But by 1976 I had access to much bigger and faster computers (as well as to the ARPANET—forerunner of the internet). And soon I was routinely using computers as powerful tools for physics, and particularly for symbolic manipulation. But by late 1979 I had basically outgrown the software systems that existed, and within weeks of getting my PhD I embarked on the project of building my own computational system.

It’s a story I’ve told elsewhere, but one of the important elements for our purposes here is that in designing the system I called SMP (for “Symbolic Manipulation Program”) I ended up digging deeply into the foundations of computation, and its connections to areas like mathematical logic. But even as I was developing the critical-to-Wolfram-Language-to-this-day paradigm of basing everything on transformations for symbolic expressions, as well as leading the software engineering to actually build SMP, I was also continuing to think about physics and its foundations.

There was often something of a statistical mechanics orientation to what I did. I worked on cosmology where even the collection of possible particle species had to be treated statistically. I worked on the quantum field theory of the vacuum—or effectively the “bulk properties of quantized fields”. I worked on what amounts to the statistical mechanics of cosmological strings. And I started working on the quantum-field-theory-meets-statistical-mechanics problem of “relativistic matter” (where my unfinished notes contain questions like “Does causality forbid relativistic solids?”):

But hovering around all of this was my old interest in the Second Law, and in the seemingly opposing phenomenon of the spontaneous emergence of complex structure.

SMP Version 1.0 was ready in mid-1981. And that fall, as a way to focus my efforts, I taught a “Topics in Theoretical Physics” course at Caltech (supposedly for graduate students but actually almost as many professors came too) on what, for want of a better name, I called “non-equilibrium statistical mechanics”. My notes for the first lecture dived right in:

Echoing what I’d seen on that book cover back in 1972 I talked about the example of the expansion of a gas, noting that even in this case “Many features [are] still far from understood”:

I talked about the Boltzmann transport equation and its elaboration in the BBGKY hierarchy, and explored what might be needed to extend it to things like self-gravitating systems. And then—in what must have been a very overstuffed first lecture—I launched into a discussion of “Possible origins of irreversibility”. I began by talking about things like ergodicity, but soon made it clear that this didn’t go the distance, and there was much more to understand—saying that “with a bit of luck” the material in my later lectures might help:

I continued by noting that some systems can “develop order and considerable organization”—which non-equilibrium statistical mechanics should be able to explain:

I then went quite “cosmological”:

The first candidate explanation I listed was the fluctuation argument Feynman had tried to use:

I discussed the possibility of fundamental microscopic irreversibility—say associated with time-reversal violation in gravity—but largely dismissed this. I talked about the possibility that the universe could have started in a special state in which “the matter is in thermal equilibrium, but the gravitational field is not.” And finally I gave what the 22-year-old me thought at the time was the most plausible explanation:

All of this was in a sense rooted in a traditional mathematical physics style of thinking. But the second lecture gave a hint of a quite different approach:

In my first lecture, I had summarized my plans for subsequent lectures:

But discovery intervened. People had discussed reaction-diffusion patterns as examples of structure being formed “away from equilibrium”. But I was interested in more dramatic examples, like galaxies, or snowflakes, or turbulent flow patterns, or forms of biological organisms. What kinds of models could realistically be made for these? I started from neural networks, self-gravitating gases and spin systems, and just kept on simplifying and simplifying. It was rather like language design, of the kind I’d done for SMP. What were the simplest primitives from which I could build up what I wanted?

Before long I came up with what I’d soon learn could be called one-dimensional cellular automata. And immediately I started running them on a computer to see what they did:

And, yes, they were “organizing themselves”—even from random initial conditions—to make all sorts of structures. By December I was beginning to frame how I would write about what was going on:

And by May 1982 I had written my first long paper about cellular automata (published in 1983):

The Second Law featured prominently, even in the first sentence:

I made quite a lot out of the fundamentally irreversible character of most cellular automaton rules, pretty much assuming that this was the fundamental origin of their ability to “generate complex structures”—as the opening transparencies of two talks I gave at the time suggested:

It wasn’t that I didn’t know there could be reversible cellular automata. And a footnote in my paper even records the fact these can generate nested patterns with a certain fractal dimension—as computed in a charmingly manual way on a couple of pages I now find in my archives:

But somehow I hadn’t quite freed myself from the assumption that microscopic irreversibility was what was “causing” structures to be formed. And this was related to another important—and ultimately incorrect—assumption: that all the structure I was seeing was somehow the result of the “filtering” random initial conditions. Right there in my paper is a picture of rule 30 starting from a single cell:

And, yes, the printout from which that was made is still in my archives, if now a little worse for wear:

Of course, it probably didn’t help that with my “display” consisting of an array of printed characters I couldn’t see too much of the pattern—though my archives do contain a long “imitation-high-resolution” printout of the conveniently narrow, and ultimately nested, pattern from rule 225:

But I think the more important point was that I just didn’t have the necessary conceptual framework to absorb what I was seeing in rule 30—and I wasn’t ready for the intuitional shock that it takes only simple rules with simple initial conditions to produce highly complex behavior.

My motivation for studying the behavior of cellular automata had come from statistical mechanics. But I soon realized that I could discuss cellular automata without any of the “baggage” of statistical mechanics, or the Second Law. And indeed even as I was finishing my long statistical-mechanics-themed paper on cellular automata, I was also writing a short paper that described cellular automata essentially as purely computational systems (even though I still used the term “mathematical models”) without talking about any kind of Second Law connections:

Through much of 1982 I was alternating between science, technology and the startup of my first company. I left Caltech in October 1982, and after stops at Los Alamos and Bell Labs, started working at the Institute for Advanced Study in Princeton in January 1983, equipped with a newly obtained Sun workstation computer whose (“one megapixel”) bitmap display let me begin to see in more detail how cellular automata behave:

It had very much the flavor of classic observational science—looking not at something like mollusc shells, but instead at images on a screen—and writing down what I saw in a “lab notebook”:

What did all those rules do? Could I somehow find a way to classify their behavior?

Mostly I was looking at random initial conditions. But in a near miss of the rule 30 phenomenon I wrote in my lab notebook: “In irregular cases, appears that patterns starting from small initial states are not self-similar (e.g. code 10)”. I even looked again at asymmetric “elementary” rules (of which rule 30 is an example)—but only from random initial conditions (though noting the presence of “class 4” rules, which would include rule 110):

My technology stack at the time consisted of printing screen dumps of cellular automaton behavior

then using repeated photocopying to shrink them—and finally cutting out the images and assembling arrays of them using Scotch tape:

And looking at these arrays I was indeed able to make an empirical classification, identifying initially five—but in the end four—basic classes of behavior. And although I sometimes made analogies with solids, liquids and gases—and used the mathematical concept of entropy—I was now mostly moving away from thinking in terms of statistical mechanics, and was instead using methods from areas like dynamical systems theory, and computation theory:

Even so, when I summarized the significance of investigating the computational characteristics of cellular automata, I reached back to statistical mechanics, suggesting that much as information theory provided a mathematical basis for equilibrium statistical mechanics, so similarly computation theory might provide a foundation for non-equilibrium statistical mechanics:

My experiments had shown that cellular automata could “spontaneously produce structure” even from randomness. And I had been able to characterize and measure various features of this structure, notably using ideas like entropy. But could I get a more complete picture of what cellular automata could make? I turned to formal language theory, and started to work out the “grammar of possible states”. And, yes, a quarter century before `Graph` in Wolfram Language, laying out complicated finite state machines wasn’t easy:

But by November 1983 I was writing about “self-organization as a computational process”:

The introduction to my paper again led with the Second Law, though now talked about the idea that computation theory might be what could characterize non-equilibrium and self-organizing phenomena:

The concept of equilibrium in statistical mechanics makes it natural to ask what will happen in a system after an infinite time. But computation theory tells one that the answer to that question can be non-computable or undecidable. I talked about this in my paper, but then ended by discussing the ultimately much richer finite case, and suggesting (with a reference to NP completeness) that it might be common for there to be no computational shortcut to cellular automaton evolution. And rather presciently, I made the statement that “One may speculate that [this phenomenon] is widespread in physical systems” so that “the consequences of their evolution could not be predicted, but could effectively be found only by direct simulation or observation.”:

These were the beginnings of powerful ideas, but I was still tying them to somewhat technical things like ensembles of all possible states. But in early 1984, that began to change. In January I’d been asked to write an article for the then-top popular science magazine *Scientific American* on the subject of “Computers in Science and Mathematics”. I wrote about the general idea of computer experiments and simulation. I wrote about SMP. I wrote about cellular automata. But then I wanted to bring it all together. And that was when I came up with the term “computational irreducibility”.

By May 26, the concept was pretty clearly laid out in my draft text:

But just a few days later something big happened. On June 1 I left Princeton for a trip to Europe. And in order to “have something interesting to look at on the plane” I decided to print out pictures of some cellular automata I hadn’t bothered to look at much before. The first one was rule 30:

And it was then that it all clicked. The complexity I’d been seeing in cellular automata wasn’t the result of some kind of “self-organization” or “filtering” of random initial conditions. Instead, here was an example where it was very obviously being “generated intrinsically” just by the process of evolution of the cellular automaton. This was computational irreducibility up close. No need to think about ensembles of states or statistical mechanics. No need to think about elaborate programming of a universal computer. From just a single black cell rule 30 could produce immense complexity, and showed what seemed very likely to be clear computational irreducibility.

Why hadn’t I figured out before that something like this could happen? After all, I’d even generated a small picture of rule 30 more than two years earlier. But at the time I didn’t have a conceptual framework that made me pay attention to it. And a small picture like that just didn’t have the same in-your-face “complexity from nothing” character as my larger picture of rule 30.

Of course, as is typical in the history of ideas, there’s more to the story. One of the key things that had originally let me start “scientifically investigating” cellular automata is that out of all the infinite number of possible constructible rules, I’d picked a modest number on which I could do exhaustive experiments. I’d started by considering only “elementary” cellular automata, in one dimension, with *k* = 2 colors, and with rules of range *r* = 1. There are 256 such “elementary rules”. But many of them had what seemed to me “distracting” features—like backgrounds alternating between black and white on successive steps, or patterns that systematically shifted to the left or right. And to get rid of these “distractions” I decided to focus on what I (somewhat foolishly in retrospect) called “legal rules”: the 32 rules that leave blank states blank, and are left-right symmetric.

When one uses random initial conditions, the legal rules do seem—at least in small pictures—to capture the most obvious behaviors one sees across all the elementary rules. But it turns out that’s not true when one looks at simple initial conditions. Among the “legal” rules, the most complicated behavior one sees with simple initial conditions is nesting.

But even though I concentrated on “legal” rules, I still included in my first major paper on cellular automata pictures of a few “illegal” rules starting from simple initial conditions—including rule 30. And what’s more, in a section entitled “Extensions”, I discussed cellular automata with more than 2 colors, and showed—though without comment—the pictures:

These were low-resolution pictures, and I think I imagined that if one ran them further, the behavior would somehow resolve into something simple. But by early 1983, I had some clues that this wouldn’t happen. Because by then I was generating fairly high-resolution pictures—including ones of the *k* = 2, *r* = 2 totalistic rule with code 10 starting from a simple initial condition:

In early drafts of my 1983 paper on “Universality and Complexity in Cellular Automata” I noted the generation of “irregularity”, and speculated that it might be associated with class 4 behavior. But later I just stated as an observation without “cause” that some rules—like code 10—generate “irregular patterns”. I elaborated a little, but in a very “statistical mechanics” kind of way, not getting the main point:

In September 1983 I did a little better:

But in the end it wasn’t until June 1, 1984, that I really grokked what was going on. And a little over a week later I was in a scenic area of northern Sweden

at a posh “Nobel Symposium” conference on “The Physics of Chaos and Related Problems”—talking for the first time about the phenomenon I’d seen in rule 30 and code 10. And from June 15 there’s a transcript of a discussion session where I bring up the never-before-mentioned-in-public concept of computational irreducibility—and, unsurprisingly, leave the other participants (who were basically all traditional mathematically oriented physicists) at best slightly bemused:

I think I was still a bit prejudiced against rule 30 and code 10 as specific rules: I didn’t like the asymmetry of rule 30, and I didn’t like the rapid growth of code 10. (Rule 73—while symmetric—I also didn’t like because of its alternating background.) But having now grokked the rule 30 phenomenon I knew it also happened in “more aesthetic” “legal” rules with more than 2 colors. And while even 3 colors led to a rather large total space of rules, it was easy to generate examples of the phenomenon there.

A few days later I was back in the US, working on finishing my article for *Scientific American*. A photographer came to help get pictures from the color display I now had:

And, yes, those pictures included multicolor rules that showed the rule 30 phenomenon:

The caption I wrote commented: “Even in this case the patterns generated can be complex, and they sometimes appear quite random. The complex patterns formed in such physical processes as the flow of a turbulent fluid may well arise from the same mechanism.”

The article went on to describe computational irreducibility and its implications in quite a lot of detail— illustrating it rather nicely with a diagram, and commenting that “It seems likely that many physical and mathematical systems for which no simple description is now known are in fact computationally irreducible”:

I also included an example—that would show up almost unchanged in *A New Kind of Science* nearly 20 years later—indicating how computational irreducibility could lead to undecidability (back in 1984 the picture was made by stitching together many screen photographs, yes, with strange artifacts from long-exposure photography of CRTs):

In a rather newspaper-production-like experience, I spent the evening of July 18 at the offices of *Scientific American* in New York City putting finishing touches to the article, which at the end of the night—with minutes to spare—was dispatched for final layout and printing.

But already by that time, I was talking about computational irreducibility and the rule 30 phenomenon all over the place. In July I finished “Twenty Problems in the Theory of Cellular Automata” for the proceedings of the Swedish conference, including what would become a rather standard kind of picture:

Problem 15 talks specifically about rule 30, and already asks exactly what would—35 years later—become Problem #2 in my 2019 Rule 30 Prizes

while Problem 18 asks the (still largely unresolved) question of what the ultimate frequency of computational irreducibility is:

Very late in putting together the *Scientific American* article I’d added to the caption of the picture showing rule-30-like behavior the statement “Complex patterns generated by cellular automata can also serve as a source of effectively random numbers, and they can be applied to encrypt messages by converting a text into an apparently random form.” I’d realized both that cellular automata could act as good random generators (we used rule 30 as the default in Wolfram Language for more than 25 years), and that their evolution could effectively encrypt things, much as I’d later describe the Second Law as being about “encrypting” initial conditions to produce effective irreversibility.

Back in 1984 it was a surprising claim that something as simple and “science-oriented” as a cellular automaton could be useful for encryption. Because at the time practical encryption was basically always done by what at least seemed like arbitrary and complicated engineering solutions, whose security relied on details or explanations that were often considered military or commercial secrets.

I’m not sure when I first became aware of cryptography. But back in 1973 when I first had access to a computer there were a couple of kids (as well as a teacher who’d been a friend of Alan Turing’s) who were programming Enigma-like encryption systems (perhaps fueled by what were then still officially just rumors of World War II goings-on at Bletchley Park). And by 1980 I knew enough about encryption that I made a point of encrypting the source code of SMP (using a modified version of the Unix crypt program). (As it happens, we lost the password, and it was only in 2015 that we got access to the source again.)

My archives record a curious interaction about encryption in May 1982—right around when I’d first run (though didn’t appreciate) rule 30. A rather colorful physicist I knew named Brosl Hasslacher (who we’ll encounter again later) was trying to start a curiously modern-sounding company named Quantum Encryption Devices (or QED for short)—that was actually trying to market a quite hacky and definitively classical (multiple-shift-register-based) encryption system, ultimately to some rather shady customers (and, yes, the “expected” funding did not materialize):

But it was 1984 before I made a connection between encryption and cellular automata. And the first thing I imagined was giving input as the initial condition of the cellular automaton, then running the cellular automaton rule to produce “encrypted output”. The most straightforward way to make encryption was then to have the cellular automaton rule be reversible, and to run the inverse rule to do the decryption. I’d already done a little bit of investigation of reversible rules, but this led to a big search for reversible rules—which would later come in handy for thinking about microscopically reversible processes and thermodynamics.

Just down the hall from me at the Institute for Advanced Study was a distinguished mathematician named John Milnor, who got very interested in what I was doing with cellular automata. My archives contain all sorts of notes from Jack, like:

There’s even a reversible (“one-to-one”) rule, with nice, minimal BASIC code, along with lots of “real math”:

But by the spring of 1984 Jack and I were talking a lot about encryption in cellular automata—and we even began to draft a paper about it

complete with outlines of how encryption schemes could work:

The core of our approach involved reversible rules, and so we did all sorts of searches to find these (and by 1984 Jack was—like me—writing C code):

I wondered how random the output from cellular automata was, and I asked people I knew at Bell Labs about randomness testing (and, yes, email headers haven’t changed much in four decades, though then I was swolf@ias.uucp; research!ken was Ken Thompson of Unix fame):

But then came my internalization of the rule 30 phenomenon, which led to a rather different way of thinking about encryption with cellular automata. Before, we’d basically been assuming that the cellular automaton rule was the encryption key. But rule 30 suggested one could instead have a fixed rule, and have the initial condition define the key. And this is what led me to more physics-oriented thinking about cryptography—and to what I said in *Scientific American*.

In July I was making “encryption-friendly” pictures of rule 30:

But what Jack and I were most interested in was doing something more “cryptographically sophisticated”, and in particular inventing a practical public-key cryptosystem based on cellular automata. Pretty much the only public-key cryptosystems known then (or even now) are based on number theory. But we thought maybe one could use something like products of rules instead of products of numbers. Or maybe one didn’t need exact invertibility. Or something. But by the late summer of 1984, things weren’t looking good:

And eventually we decided we just couldn’t figure it out. And it’s basically still not been figured out (and maybe it’s actually impossible). But even though we don’t know how to make a public-key cryptosystem with cellular automata, the whole idea of encrypting initial data and turning it into effective randomness is a crucial part of the whole story of the computational foundations of thermodynamics as I think I now understand them.

Right from when I first formulated it, I thought computational irreducibility was an important idea. And in the late summer of 1984 I decided I’d better write a paper specifically about it. The result was:

It was a pithy paper, arranged to fit in the 4-page limit of *Physical Review Letters*, with a rather clear description of computational irreducibility and its immediate implications (as well as the relation between physics and computation, which it footnoted as a “physical form of the Church–Turing thesis”). It illustrated computational reducibility and irreducibility in a single picture, here in its original Scotch-taped form:

The paper contains all sorts of interesting tidbits, like this run of footnotes:

In the paper itself I didn’t mention the Second Law, but in my archives I find some notes I made in preparing the paper, about candidate irreducible or undecidable problems (with many still unexplored)

which include “Will a hard sphere gas started from a particular state ever exhibit some specific anti-thermodynamic behaviour?”

In November 1984 the then-editor of *Physics Today* asked if I’d write something for them. I never did, but my archives include a summary of a possible article—which among other things promises to use computational ideas to explain “why the Second Law of thermodynamics holds so widely”:

So by November 1984 I was already aware of the connection between computational irreducibility and the Second Law (and also I didn’t believe that the Second Law would necessarily always hold). And my notes—perhaps from a little later—make it clear that actually I was thinking about the Second Law along pretty much the same lines as I do now, except that back then I didn’t yet understand the fundamental significance of the observer:

And spelunking now in my old filesystem (retrieved from a 9-track backup tape) I find from November 17, 1984 (at 2:42am), troff source for a putative paper (which, yes, we even now run through troff):

This is all that’s in my filesystem. So, yes, in effect, I’m finally (more or less) finishing this 38 years later.

But in 1984 one of the hot—if not new—ideas of the time was “chaos theory”, which talked about how “randomness” could “deterministically arise” from progressive “excavation” of higher and higher-order digits in the initial conditions for a system. But having seen rule 30 this whole phenomenon of what was often (misleadingly) called “deterministic chaos” seemed to me at best like a sideshow—and definitely not the main effect leading to most randomness seen in physical systems.

I began to draft a paper about this

including for the first time an anchor picture of rule 30 intrinsically generating randomness—to be contrasted with pictures of randomness being generated (still in cellular automata) from sensitive dependence on random initial conditions:

It was a bit of a challenge to find an appropriate publishing venue for what amounted to a rather “interdisciplinary” piece of physics-meets-math-meets-computation. But *Physical Review Letters* seemed like the best bet, so on November 19, 1984, I submitted a version of the paper there, shortened to fit in its 4-page limit.

A couple of months later the journal said it was having trouble finding appropriate reviewers. I revised the paper a bit (in retrospect I think not improving it), then on February 1, 1985, sent it in again, with the new title “Origins of Randomness in Physical Systems”:

On March 8 the journal responded, with two reports from reviewers. One of the reviewers completely missed the point (yes, a risk in writing shift-the-paradigm papers). The other sent a very constructive two-page report:

I didn’t know it then, but later I found out that Bob Kraichnan had spent much of his life working on fluid turbulence (as well as that he was a very independent and think-for-oneself physicist who’d been one of Einstein’s last assistants at the Institute for Advanced Study). Looking at his report now it’s a little charming to see his statement that “no one who has looked much at turbulent flows can easily doubt [that they intrinsically generate randomness]” (as opposed to getting randomness from noise, initial conditions, etc.). Even decades later, very few people seem to understand this.

There were several exchanges with the journal, leaving it controversial whether they would publish the paper. But then in May I visited Los Alamos, and Bob Kraichnan invited me to lunch. He’d also invited a then-young physicist from Los Alamos who I’d known fairly well a few years earlier—and who’d once paid me the unintended compliment that it wasn’t fair for me to work on science because I was “too efficient”. (He told me he’d “intended to work on cellular automata”, but before he’d gotten around to it, I’d basically figured everything out.) Now he was riding the chaos theory bandwagon hard, and insofar as my paper threatened that, he wanted to do anything he could to kill the paper.

I hadn’t seen this kind of “paradigm attack” before. Back when I’d been doing particle physics, it had been a hot and cutthroat area, and I’d had papers plagiarized, sometimes even egregiously. But there wasn’t really any “paradigm divergence”. And cellular automata—being quite far from the fray—were something I could just peacefully work on, without anyone really paying much attention to whatever paradigm I might be developing.

At lunch I was treated to a lecture about why what I was doing was nonsense, or even if it wasn’t, I shouldn’t talk about it, at least now. Eventually I got a chance to respond, I thought rather effectively—causing my “opponent” to leave in a huff, with the parting line “If you publish the paper, I’ll ruin your career”. It was a strange thing to say, given that in the pecking order of physics, he was quite junior to me. (A decade and half later there were nevertheless a couple of “incidents”.) Bob Kraichnan turned to me, cracked a wry smile and said “OK, I’ll go right now and tell [the journal] to publish your paper”:

Kraichnan was quite right that the paper was much too short for what it was trying to say, and in the end it took a long book—namely *A New Kind of Science*—to explain things more clearly. But the paper was where a high-resolution picture of rule 30 first appeared in print. And it was the place where I first tried to explain the distinction between “randomness that’s just transcribed from elsewhere” and the fundamental phenomenon one sees in rule 30 where randomness is intrinsically generated by computational processes within a system.

I wanted words to describe these two different cases. And reaching back to my years of learning ancient Greek in school I invented the terms “homoplectic” and “autoplectic”, with the noun “autoplectism” to describe what rule 30 does. In retrospect, I think these terms are perhaps “too Greek” (or too “medical sounding”), and I’ve tended to just talk about “intrinsic randomness generation” instead of autoplectism. (Originally, I’d wanted to avoid the term “intrinsic” to prevent confusion with randomness that’s baked into the rules of a system.)

The paper (as Bob Kraichnan pointed out) talks about many things. And at the end, having talked about fluid turbulence, there’s a final sentence—about the Second Law:

In my archives, I find other mentions of the Second Law too. Like an April 1985 proto-paper that was never completed

but included the statement:

My main reason for working on cellular automata was to use them as idealized models for systems in nature, and as a window into foundational issues. But being quite involved in the computer industry, I couldn’t help wondering whether they might be directly useful for practical computation. And I talked about the possibility of building a “metachip” in which—instead of having predefined “meaningful” opcodes like in an ordinary microprocessor—everything would be built up “purely in software” from an underlying universal cellular automaton rule. And various people and companies started sending me possible designs:

But in 1984 I got involved in being a consultant to an MIT-spinoff startup called Thinking Machines Corporation that was trying to build a massively parallel “Connection Machine” computer with 65536 processors. The company had aspirations around AI (hence the name, which I’d actually been involved in suggesting), but their machine could also be put to work simulating cellular automata, like rule 30. In June 1985, hot off my work on the origins of randomness, I went to spend some of the summer at Thinking Machines, and decided it was time to do whatever analysis—or, as I would call it now, ruliology—I could on rule 30.

My filesystem from 1985 records that it was fast work. On June 24 I printed a somewhat-higher-resolution image of rule 30 (my login was “swolf” back then, so that’s how my printer output was labeled):

By July 2 a prototype Connection Machine had generated 2000 steps of rule 30 evolution:

With a large-format printer normally used to print integrated circuit layouts I got an even larger “piece of rule 30”—that I laid out on the floor for analysis, for example trying to measure (with meter rules, etc.) the slope of the border between regularity and irregularity in the pattern.

Richard Feynman was also a consultant at Thinking Machines, and we often timed our visits to coincide:

Feynman and I had talked about randomness quite a bit over the years, most recently in connection with the challenges of making a “quantum randomness chip” as a minimal example of quantum computing. Feynman at first didn’t believe that rule 30 could really be “producing randomness”, and that there must be some way to “crack” it. He tried, both by hand and with a computer, particularly using statistical mechanics methods to try to compute the slope of the border between regularity and irregularity:

But in the end, he gave up, telling me “OK, Wolfram, I think you’re on to something”.

Meanwhile, I was throwing all the methods I knew at rule 30. Combinatorics. Dynamical systems theory. Logic minimization. Statistical analysis. Computational complexity theory. Number theory. And I was pulling in all sorts of hardware and software too. The Connection Machine. A Cray supercomputer. A now-long-extinct Celerity C1200 (which successfully computed a length-40,114,679,273 repetition period). A LISP machine for graph layout. A circuit-design logic minimization program. As well as my own SMP system. (The Wolfram Language was still a few years in the future.)

But by July 21, there it was: a 50-page “ruliological profile” of rule 30, in a sense showing what one could of the “anatomy” of its randomness:

A month later I attended in quick succession a conference in California about cryptography, and one in Japan about fluid turbulence—with these two fields now firmly connected through what I’d discovered.

Back from when I first saw it at the age of 14 it was always my favorite page in *The Feynman Lectures on Physics*. But how did the phenomenon of turbulence that it showed happen, and what really was it?

In late 1984, the first version of the Connection Machine was nearing completion, and there was a question of what could be done with it. I agreed to analyze its potential uses in scientific computation, and in my resulting (never ultimately completed) report

the very first section was about fluid turbulence (others sections were about quantum field theory, *n*-body problems, number theory, etc.):

The traditional computational approach to studying fluids was to start from known continuum fluid equations, then to try to construct approximations to these suitable for numerical computation. But that wasn’t going to work well for the Connection Machine. Because in optimizing for parallelism, its individual processors were quite simple, and weren’t set up to do fast (e.g. floating-point) numerical computation.

I’d been saying for years that cellular automata should be relevant to fluid turbulence. And my recent study of the origins of randomness made me all the more convinced that they would for example be able to capture the fundamental randomness associated with turbulence (which I explained as being a bit like encryption):

I sent a letter to Feynman expressing my enthusiasm:

I had been invited to a conference in Japan that summer on “High Reynolds Number Flow Computation” (i.e. computing turbulent fluid flow), and on May 4 I sent an abstract which explained a little more of my approach:

My basic idea was to start not from continuum equations, but instead from a cellular automaton idealization of molecular dynamics. It was the same kind of underlying model as I’d tried to set up in my SPART program in 1973. But now instead of using it to study thermodynamic phenomena and the microscopic motions associated with heat, my idea was to use it to study the kind of visible motion that occurs in fluid dynamics—and in particular to see whether it could explain the apparent randomness of fluid turbulence.

I knew from the beginning that I needed to rely on “Second Law behavior” in the underlying cellular automaton—because that’s what would lead to the randomness necessary to “wash out” the simple idealizations I was using in the cellular automaton, and allow standard continuum fluid behavior to emerge. And so it was that I embarked on the project of understanding not only thermodynamics, but also hydrodynamics and fluid turbulence, with cellular automata—on the Connection Machine.

I’ve had the experience many times in my life of entering a field and bringing in new tools and new ideas. Back in 1985 I’d already done that several times, and it had always been a pretty much uniformly positive experience. But, sadly, with fluid turbulence, it was to be, at best, a turbulent experience.

The idea that cellular automata might be useful in studying fluid turbulence definitely wasn’t obvious. The year before, for example, at the Nobel Symposium conference in Sweden, a French physicist named Uriel Frisch had been summarizing the state of turbulence research. Fittingly for the topic of turbulence, he and I first met after a rather bumpy helicopter ride to a conference event—where Frisch told me in no uncertain terms that cellular automata would never be relevant to turbulence, and talked about how turbulence was better thought of as being associated (a bit like in the mathematical theory of phase transitions) with “singularities getting close to the real line”. (Strangely, I just now looked at Frisch’s paper in the proceedings of the conference: “Ou en est la Turbulence Developpée?” [roughly: “Fully Developed Turbulence: Where Do We Stand?”], and was surprised to discover that its last paragraph actually mentions cellular automata, and its acknowledgements thank me for conversations—even though the paper says it was received June 11, 1984, a couple of days before I had met Frisch. And, yes, this is the kind of thing that makes accurately reconstructing history hard.)

Los Alamos had always been a hotbed of computational fluid dynamics (not least because of its importance in simulating nuclear explosions)—and in fact of computing in general—and, starting in the late fall of 1984, on my visits there I talked to many people about using cellular automata to do fluid dynamics on the Connection Machine. Meanwhile, Brosl Hasslacher (mentioned above in connection with his 1982 encryption startup) had—after a rather itinerant career as a physicist—landed at Los Alamos. And in fact I had been asked by the Los Alamos management for a letter about him in December 1984 (yes, even though he was 18 years older than me), and ended what I wrote with: “He has considerable ability in identifying promising areas of research. I think he would be a significant addition to the staff at Los Alamos.”

Well, in early 1985 Brosl identified cellular automaton fluid dynamics as a promising area, and started energetically talking to me about it. Meanwhile, the Connection Machine was just starting to work, and a young software engineer named Jim Salem was assigned to help me get cellular automaton fluid dynamics running on it. I didn’t know it at the time, but Brosl—ever the opportunist—had also made contact with Uriel Frisch, and now I find the curious document in French dated May 10, 1985, with the translated title “A New Concept for Supercomputers: Cellular Automata”, laying out a grand international multiyear plan, and referencing the (so far as I know, nonexistent) B. Hasslacher and U. Frisch (1985), “The Cellular Automaton Turbulence Machine”, Los Alamos:

I visited Los Alamos again in May, but for much of the summer I was at Thinking Machines, and on July 18 Uriel Frisch came to visit there, along with a French physicist named Yves Pomeau, who had done some nice work in the 1970s on applying methods of traditional statistical mechanics to “lattice gases”.

But what about realistic fluid dynamics, and turbulence? I wasn’t sure how easy it would be to “build up from the (idealized) molecules” to get to pictures of recognizable fluid flows. But we were starting to have some success in generating at least basic results. It wasn’t clear how seriously anyone else was taking this (especially given that at the time I hadn’t seen the material Frisch had already written), but insofar as anything was “going on”, it seemed to be a perfectly collegial interaction—where perhaps Los Alamos or the French government or both would buy a Connection Machine computer. But meanwhile, on the technical side, it had become clear that the most obvious square-lattice model (that Pomeau had used in the 1970s, and that was basically what my SPART program from 1973 was supposed to implement) was fine for diffusion processes, but couldn’t really represent proper fluid flow.

When I first started working on cellular automata in 1981 the minimal 1D case in which I was most interested had barely been studied, but there had been quite a bit of work done in previous decades on the 2D case. By the 1980s, however, it had mostly petered out—with the exception of a group at MIT led by Ed Fredkin, who had long had the belief that one might in effect be able to “construct all of physics” using cellular automata. Tom Toffoli and Norm Margolus, who were working with him, had built a hardware 2D cellular automaton simulator—that I happened to photograph in 1982 when visiting Fredkin’s island in the Caribbean:

But while “all of physics” was elusive (and our Physics Project suggests that a cellular automaton with a rigid lattice is not the right place to start), there’d been success in making for example an idealized gas, using essentially a block cellular automaton on a square grid. But mostly the cellular automaton machine was used in a maddeningly “Look at this cool thing!” mode, often accompanied by rapid physical rewiring.

In early 1984 I visited MIT to use the machine to try to do what amounted to natural science, systematically studying 2D cellular automata. The result was a paper (with Norman Packard) on 2D cellular automata. We restricted ourselves to square grids, though mentioned hexagonal ones, and my article in *Scientific American* in late 1984 opened with a full-page hexagonal cellular automaton simulation of a snowflake made by Packard (and later in 1984 turned into one of a set of cellular automaton cards for sale):

In any case, in the summer of 1985, with square lattices not doing what was needed, it was time to try hexagonal ones. I think Yves Pomeau already had a theoretical argument for this, but as far as I was concerned, it was (at least at first) just a “next thing to try”. Programming the Connection Machine was at that time a rather laborious process (which, almost unprecedentedly for me, I wasn’t doing myself), and mapping a hexagonal grid onto its basically square architecture was a little fiddly, as my notes record:

Meanwhile, at Los Alamos, I’d introduced a young and very computer-savvy protege of mine named Tsutomu Shimomura (who had a habit of getting himself into computer security scrapes, though would later become famous for taking down a well-known hacker) to Brosl Hasslacher, and now Tsutomu jumped into writing optimized code to implement hexagonal cellular automata on a Cray supercomputer.

In my archives I now find a draft paper from September 7 that starts with a nice (if not entirely correct) discussion of what amounts to computational irreducibility, and then continues by giving theoretical symmetry-based arguments that a hexagonal cellular automaton should be able to reproduce fluid mechanics:

Near the end, the draft says (misspelling Tsutomu Shimomura’s name):

Meanwhile, we (as well as everyone else) were starting to get results that looked at least suggestive:

By November 15 I had drafted a paper

that included some more detailed pictures

and that at the end (I thought, graciously) thanked Frisch, Hasslacher, Pomeau and Shimomura for “discussions and for sharing their unpublished results with us”, which by that point included a bunch of suggestive, if not obviously correct, pictures of fluid-flow-like behavior.

To me, what was important about our paper is that, after all these years, it filled in with more detail just how computational systems like cellular automata could lead to Second-Law-style thermodynamic behavior, and it “proved” the physicality of what was going on by showing easy-to-recognize fluid-dynamics-like behavior.

Just four days later, though, there was a big surprise. *The Washington Post* ran a front-page story—alongside the day’s characteristic-Cold-War-era geopolitical news—about the “Hasslacher–Frisch model”, and about how it might be judged so important that it “should be classified to keep it out of Soviet hands”:

At that point, things went crazy. There was talk of Nobel Prizes (I wasn’t buying it). There were official complaints from the French embassy about French scientists not being adequately recognized. There was upset at Thinking Machines for not even being mentioned. And, yes, as the originator of the idea, I was miffed that nobody seemed to have even suggested contacting me—even if I did view the rather breathless and “geopolitical” tenor of the article as being pretty far from immediate reality.

At the time, everyone involved denied having been responsible for the appearance of the article. But years later it emerged that the source was a certain John Gage, former political operative and longtime marketing operative at Sun Microsystems, who I’d known since 1982, and had at some point introduced to Brosl Hasslacher. Apparently he’d called around various government contacts to help encourage open (international) sharing of scientific code, quoting this as a test case.

But as it was, the article had pretty much exactly the opposite effect, with everyone now out for themselves. In Princeton, I’d interacted with Steve Orszag, whose funding for his new (traditional) computational fluid dynamics company, Nektonics, now seemed at risk, and who pulled me into an emergency effort to prove that cellular automaton fluid dynamics couldn’t be competitive. (The paper he wrote about this seemed interesting, but I demurred on being a coauthor.) Meanwhile, Thinking Machines wanted to file a patent as quickly as possible. Any possibility of the French government getting a Connection Machine evaporated and soon Brosl Hasslacher was claiming that “the French are faking their data”.

And then there was the matter of the various academic papers. I had been sent the Frisch–Hasslacher–Pomeau paper to review, and checking my 1985 calendar for my whereabouts I must have received it the very day I finished my paper. I told the journal they should publish the paper, suggesting some changes to avoid naivete about computing and computer technology, but not mentioning its very thin recognition of my work.

Our paper, on the other hand, triggered a rather indecorous competitive response, with two “anonymous reviewers” claiming that the paper said nothing more than its “reference 5” (the Frisch–Hasslacher–Pomeau paper). I patiently pointed out that that wasn’t the case, not least because our paper had actual simulations, but also that actually I happened to have “been there first” with the overall idea. The journal solicited other opinions, which were mostly supportive. But in the end a certain Leo Kadanoff swooped in to block it, only to publish his own a few months later.

It felt corrupt, and distasteful. I was at that point a successful and increasingly established academic. And some of the people involved were even longtime friends. So was this kind of thing what I had to look forward to in a life in academia? That didn’t seem attractive, or necessary. And it was what began the process that led me, a year and a half later, to finally choose to leave academia behind, never to return.

Still, despite the “turbulence”—and in the midst of other activities—I continued to work hard on cellular automaton fluids, and by January 1986 I had the first version of a long (and, I thought, rather good) paper on their basic theory (that was finished and published later that year):

As it turns out, the methods I used in that paper provide some important seeds for our Physics Project, and even in recent times I’ve often found myself referring to the paper, complete with its SMP open-code appendix:

But in addition to developing the theory, I was also getting simulations done on the Connection Machine, and getting actual experimental data (particularly on flow past cylinders) to compare them to. By February 1986, we had quite a few results:

But by this point there was a quite industrial effort, particularly in France, that was churning out papers on cellular automaton fluids at a high rate. I’d called my theory paper “Cellular Automaton Fluid 1: Basic Theory”. But was it really worth finishing part 2? There was a veritable army of perfectly good physicists “competing” with me. And, I thought, “I have other things to do. Just let them do this. This doesn’t need me”.

And so it was that in the middle of 1986 I stopped working on cellular automaton fluids. And, yes, that freed me up to work on lots of other interesting things. But even though methods derived from cellular automaton fluids have become widely used in practical fluid dynamics computations, the key basic science that I thought could be addressed with cellular automaton fluids—about things like the origin of randomness in turbulence—has still, even to this day, not really been further explored.

In June 1986 I was about to launch both a research center (the Center for Complex Systems Research at the University of Illinois) and a journal (*Complex Systems*)—and I was also organizing a conference called CA ’86 (which was held at MIT). The core of the conference was poster presentations, and a few days before the conference was to start I decided I should find a “nice little project” that I could quickly turn into a poster.

In studying cellular automaton fluids I had found that cellular automata with rules based on idealized physical molecular dynamics could on a large scale approximate the continuum behavior of fluids. But what if one just started from continuum behavior? Could one derive underlying rules that would reproduce it? Or perhaps even find the minimal such rules?

By mid-1985 I felt I’d made decent progress on the science of cellular automata. But what about their engineering? What about constructing cellular automata with particular behavior? In May 1985 I had given a conference talk about “Cellular Automaton Engineering”, which turned into a paper about “Approaches to Complexity Engineering”—that in effect tried to set up “trainable cellular automata” in what might still be a powerful simple-programs-meet-machine-learning scheme that deserves to be explored:

But so it was that a few days before the CA ’86 conference I decided to try to find a minimal “cellular automaton approximation” to a simple continuum process: diffusion in one dimension.

I explained

and described as my objective:

I used block cellular automata, and tried to find rules that were reversible and also conserved something that could serve as “microscopic density” or “particle number”. I quickly determined that there were no such rules with 2 colors and blocks of sizes 2 or 3 that achieved any kind of randomization.

To go to 3 colors, I used SMP to generate candidate rules

where for example the function `Apper` can be literally be translated into Wolfram Language as

or, more idiomatically, just

then did what I have done so many times and just printed out pictures of their behavior:

Some clearly did not show randomization, but a couple did. And soon I was studying what I called the “winning rule”, which—like rule 30—went from simple initial conditions to apparent randomness:

I analyzed what the rule was “microscopically doing”

and explored its longer-time behavior:

Then I did things like analyze its cycle structure in a finite-size region by running C programs I’d basically already developed back in 1982 (though now they were modified to automatically generate troff code for typesetting):

And, like rule 30, the “winning rule” that I found back in June 1986 has stayed with me, essentially as a minimal example of reversible, number-conserving randomness. It appeared in *A New Kind of Science*, and it appears now in my recent work on the Second Law—and, of course, the patterns it makes are always the same:

Back in 1986 I wanted to know just how efficiently a simple rule like this could reproduce continuum behavior. And in a portent of observer theory my notes from the time talk about “optimal coarse graining, where the 2nd law is ‘most true’”, then go on to compare the distributed character of the cellular automaton with traditional “collect information into numerical value” finite-difference approximations:

In a talk I gave I summarized my understanding:

The phenomenon of randomization is generic in computational systems (witness rule 30, the “winning rule”, etc.) This leads to the genericity of thermodynamics. And this in turn leads to the genericity of continuum behavior, with diffusion and fluid behavior being two examples.

It would take another 34 years, but these basic ideas would eventually be what underlies our Physics Project, and our understanding of the emergence of things like spacetime. As well as now being crucial to our whole understanding of the Second Law.

By the end of 1986 I had begun the development of Mathematica, and what would become the Wolfram Language, and for most of the next five years I was submerged in technology development. But in 1991 I started to use the technology I now had, and began the project that became *A New Kind of Science*.

Much of the first couple of years was spent exploring the computational universe of simple programs, and discovering that the phenomena I’d discovered in cellular automata were actually much more general. And it was seeing that generality that led me to the Principle of Computational Equivalence. In formulating the concept of computational irreducibility I’d in effect been thinking about trying to “reduce” the behavior of systems using an external as-powerful-as-possible universal computer. But now I’d realized I should just be thinking about all systems as somehow computationally equivalent. And in doing that I was pulling the conception of the “observer” and their computational ability closer to the systems they were observing.

But the further development of that idea would have to wait nearly three more decades, until the arrival of our Physics Project. In *A New Kind of Science*, Chapter 7 on “Mechanisms in Programs and Nature” describes the concept of intrinsic randomness generation, and how it’s distinguished from other sources of randomness. Chapter 8 on “Implications for Everyday Systems” then has a section on fluid flow, where I describe the idea that randomness in turbulence could be intrinsically generated, making it, for example, repeatable, rather than inevitably different every time an experiment is run.

And then there’s Chapter 9, entitled “Fundamental Physics”. The majority of the chapter—and its “most famous” part—is the presentation of the direct precursor to our Physics Project, including the concept of graph-rewriting-based computational models for the lowest-level structure of spacetime and the universe.

But there’s an earlier part of Chapter 9 as well, and it’s about the Second Law. There’s a precursor about “The Notion of Reversibility”, and then we’re on to a section about “Irreversibility and the Second Law of Thermodynamics”, followed by “Conserved Quantities and Continuum Phenomena”, which is where the “winning rule” I discovered in 1996 appears again:

My records show I wrote all of this—and generated all the pictures—between May 2 and July 11, 1995. I felt I already had a pretty good grasp of how the Second Law worked, and just needed to write it down. My emphasis was on explaining how a microscopically reversible rule—through its intrinsic ability to generate randomness—could lead to what appears to be irreversible behavior.

Mostly I used reversible 1D cellular automata as my examples, showing for example randomization both forwards and backwards in time:

I soon got to the nub of the issue with irreversibility and the Second Law:

I talked about how “typical textbook thermodynamics” involves a bunch of details about energy and motion, and to get closer to this I showed a simple example of an “ideal gas” 2D cellular automaton:

But despite my early exposure to hard-sphere gases, I never went as far as to use them as examples in *A New Kind of Science*. We did actually take some photographs of the mechanics of real-life billiards:

But cellular automata always seemed like a much clearer way to understand what was going on, free from issues like numerical precision, or their physical analogs. And by looking at cellular automata I felt as if I could really see down the foundations of the Second Law, and why it was true.

And mostly it was a story of computational irreducibility, and intrinsic randomness generation. But then there was rule 37R. I’ve often said that in studying the computational universe we have to remember that the “computational animals” are at least as smart as we are—and they’re always up to tricks we don’t expect.

And so it is with rule 37R. In 1986 I’d published a book of cellular automaton papers, and as an appendix I’d included lots of tables of properties of cellular automata. Almost all the tables were about the ordinary elementary cellular automata. But as a kind of “throwaway” at the very end I gave a table of the behavior of the 256 second-order reversible versions of the elementary rules, including 37R starting both from completely random initial conditions

and from single black cells:

So far, nothing remarkable. And years go by. But then—apparently in the middle of working on the 2D systems section of *A New Kind of Science*—at 4:38am on February 21, 1994 (according to my filesystem records), I generate pictures of all the reversible elementary rules again, but now from initial conditions that are slightly more complicated than a single black cell. Opening the notebook from that time (and, yes, Wolfram Language and our notebook format have been stable enough that 28 years later that still works) it shows up tiny on a modern screen, but there it is: rule 37R doing something “interesting”:

Clearly I noticed it. Because by 4:47am I’ve generated lots of pictures of rule 37R, like this one evolving from a block of 21 black cells, and showing only every other step

and by 4:54am I’ve got things like:

My guess is that I was looking for class 4 behavior in reversible cellular automata. And with rule 37R I’d found it. And at the time I moved on to other things. (On March 1, 1994, I slipped on some ice and broke my ankle, and was largely out of action for several weeks.)

And that takes us back to May 1995, when I was working on writing about the Second Law. My filesystem records that I did quite a few more experiments on rule 37R then, looking at different initial conditions, and running it as long as I could, to see if its strange neither-simple-nor-randomizing—and not very Second-Law-like—behavior would somehow “resolve”.

Up to that moment, for nearly a quarter of a century, I had always fundamentally believed in the Second Law. Yes, I thought there might be exceptions with things like self-gravitating systems. But I’d always assumed that—perhaps with some pathological exceptions—the Second Law was something quite universal, whose origins I could even now understand through computational irreducibility.

But seeing rule 37R this suddenly didn’t seem right. In *A New Kind of Science* I included a long run of rule 37R (here colorized to emphasize the structure)

then explained:

How could one describe what was happening in rule 37R? I discussed the idea that it was effectively forming “membranes” which could slowly move, but keep things “modular” and organized inside. I summarized at the time, tagging it as “something I wanted to explore in more detail one day”:

Rounding out the rest of *A New Kind of Science* takes another seven years of intense work. But finally in May 2002 it was published. The book talked about many things. And even within Chapter 9 my discussion of the Second Law was overshadowed by the outline I gave of an approach to finding a truly fundamental theory of physics—and of the ideas that evolved into our Physics Project.

After *A New Kind of Science* was finished I spent many years working mainly on technology—building Wolfram|Alpha, launching the Wolfram Language and so on. But “follow up on Chapter 9” was always on my longterm to-do list. The biggest—and most difficult—part of that had to do with fundamental physics. But I still had a great intellectual attachment to the Second Law, and I always wanted to use what I’d then understood about the computational paradigm to “tighten up” and “round out” the Second Law.

I’d mention it to people from time to time. Usually the response was the same: “Wasn’t the Second Law understood a century ago? What more is there to say?” Then I’d explain, and it’d be like “Oh, yes, that is interesting”. But somehow it always seemed like people felt the Second Law was “old news”, and that whatever I might do would just be “dotting an *i* or crossing a *t*”. And in the end my Second Law project never quite made it onto my active list, despite the fact that it was something I always wanted to do.

Occasionally I would write about my ideas for finding a fundamental theory of physics. And, implicitly I’d rely on the understanding I’d developed of the foundations and generalization of the Second Law. In 2015, for example, celebrating the centenary of General Relativity, I wrote about what spacetime might really be like “underneath”

and how a perceived spacetime continuum might emerge from discrete underlying structure like fluid behavior emerges from molecular dynamics—in effect through the operation of a generalized Second Law:

It was 17 years after the publication of *A New Kind of Science* that (as I’ve described elsewhere) circumstances finally aligned to embark on what became our Physics Project. And after all those years, the idea of computational irreducibility—and its immediate implications for the Second Law—had come to seem so obvious to me (and to the young physicists with whom I worked) that they could just be taken for granted as conceptual building blocks in constructing the tower of ideas we needed.

One of the surprising and dramatic implications of our Physics Project is that General Relativity and quantum mechanics are in a sense both manifestations of the same fundamental phenomenon—but played out respectively in physical space and in branchial space. But what really is this phenomenon?

What became clear is that ultimately it’s all about the interplay between underlying computational irreducibility and our nature as observers. It’s a concept that had its origins in my thinking about the Second Law. Because even in 1984 I’d understood that the Second Law is about our inability to “decode” underlying computationally irreducible behavior.

In *A New Kind of Science* I’d devoted Chapter 10 to “Processes of Perception and Analysis”, and I’d recognized that we should view such processes—like any processes in nature or elsewhere—as being fundamentally computational. But I still thought of processes of perception and analysis as being separated from—and in some sense “outside”—actual processes we might be studying. But in our Physics Project we’re studying the whole universe, so inevitably we as observers are “inside” and part of the system.

And what then became clear is the emergence of things like General Relativity and quantum mechanics depends on certain characteristics of us as observers. “Alien observers” might perceive quite different laws of physics (or no systematic laws at all). But for “observers like us”, who are computationally bounded and believe we are persistent in time, General Relativity and quantum mechanics are inevitable.

In a sense, therefore, General Relativity and quantum mechanics become “abstractly derivable” given our nature as observers. And the remarkable thing is that at some level the story is exactly the same with the Second Law. To me it’s a surprising and deeply beautiful scientific unification: that all three of the great foundational theories of physics—General Relativity, quantum mechanics and statistical mechanics—are in effect manifestations of the same core phenomenon: an interplay between computational irreducibility and our nature as observers.

Back in the 1970s I had no inkling of all this. And even when I chose to combine my discussions of the Second Law and of my approach to a fundamental theory of physics into a single chapter of *A New Kind of Science*, I didn’t know how deeply these would be connected. It’s been a long and winding path, that’s needed to pass through many different pieces of science and technology. But in the end the feeling I had when I first studied that book cover when I was 12 years old that “this was something fundamental” has played out on a scale almost incomprehensibly beyond what I had ever imagined.

Most of my journey with the Second Law has had to do with understanding origins of randomness, and their relation to “typical Second-Law behavior”. But there’s another piece—still incompletely worked out—which has to do with surprises like rule 37R, and, more generally, with large-scale versions of class 4 behavior, or what I’ve begun to call the “mechanoidal phase”.

I first identified class 4 behavior as part of my systematic exploration of 1D cellular automata at the beginning of 1983—with the “code 20” *k* = 2, *r* = 2 totalistic rule being my first clear example:

Very soon my searches had identified a whole variety of localized structures in this rule:

At the time, the most significant attribute of class 4 cellular automata as far as I was concerned was that they seemed likely to be computation universal—and potentially provably so. But from the beginning I was also interested in what their “thermodynamics” might be. If you start them off from random initial conditions, will their patterns die out, or will some arrangement of localized structures persist, and perhaps even grow?

In most cellular automata—and indeed most systems with local rules—one expects that at least their statistical properties will somehow stabilize when one goes to the limit of infinite size. But, I asked, does that infinite-size limit even “exist” for class 4 systems—or if you progressively increase the size, will the results you get keep on jumping around forever, perhaps as you succeed in sampling progressively more exotic structures?

A paper I wrote in September 1983 talks about the idea that in a sufficiently large class 4 cellular automaton one would eventually get self-reproducing structures, which would end up “taking over everything”:

The idea that one might be able to see “biology-like” self-reproduction in cellular automata has a long history. Indeed, one of the multiple ways that cellular automata were invented (and the one that led to their name) was through John von Neumann’s 1952 effort to construct a complicated cellular automaton in which there could be a complicated configuration capable of self-reproduction.

But could self-reproducing structures ever “occur naturally” in cellular automata? Without the benefit of intuition from things like rule 30, von Neumann assumed that something like self-reproduction would need an incredibly complicated setup, as it seems to have, for example, in biology. But having seen rule 30—and more so class 4 cellular automata—it didn’t seem so implausible to me that even with very simple underlying rules, there could be fairly simple configurations that would show phenomena like self-reproduction.

But for such a configuration to “occur naturally” in a random initial condition might require a system with exponentially many cells. And I wondered if in the oceans of the early Earth there might have been only “just enough” molecules for something like a self-reproducing lifeform to occur.

Back in 1983 I already had pretty efficient code for searching for structures in class 4 cellular automata. But even running for days at a time, I never found anything more complicated than purely periodic (if sometimes moving) structures. And in March 1985, following an article about my work in *Scientific American*, I appealed to the public to find “interesting structures”—like “glider guns” that would “shoot out” moving structures:

As it happened, right before I made my “public appeal”, a student at Princeton working with a professor I knew had sent me a glider gun he’d found the *k* = 2, *r* = 3 totalistic code 88 rule:

At the time, though, with computer displays only large enough to see behavior like

I wasn’t convinced this was an “ordinary class 4 rule”—even though now, with the benefit of higher display resolution, it seems more convincing:

The “public appeal” generated a lot of interesting feedback—but no glider guns or other exotic structures in the rules I considered “obviously class 4”. And it wasn’t until after I started working on *A New Kind of Science* that I got back to the question. But then, on the evening of December 31, 1991, using exactly the same code as in 1983, but now with faster computers, there it was: in an ordinary class 4 rule (*k* = 3, *r* = 1 code 1329), after finding several localized structures, there was one that grew without bound (albeit not in the most obvious “glider gun” way):

But that wasn’t all. Exemplifying the principle that in the computational universe there are always surprises, searching a little further revealed yet other unexpected structures:

Every few years something else would come up with class 4 rules. In 1994, lots of work on rule 110. In 1995, the surprise of rule 37R. In 1998 efforts to find analogs of particles that might carry over to my graph-based model of space.

After *A New Kind of Science* was published in 2002, we started our annual Wolfram Summer School (at first called the NKS Summer School)—and in 2010 our High School Summer Camp. Some years we asked students to pick their “favorite cellular automaton”. Often they were class 4:

And occasionally someone would do a project to explore the world of some particular class 4 rule. But beyond those specifics—and statements about computation universality—it’s never been clear quite what one could say about class 4.

Back in 1984 in the series of cellular automaton postcards I’d produced, there were a couple of class 4 examples:

And even then the typical response to these images was that they looked “organic”—like the kind of thing living organisms might produce. A decade later—for *A New Kind of Science*—I studied “organic forms” quite a bit, trying to understand how organisms get their overall shapes, and surface patterns. Mostly that didn’t end up being a story of class 4 behavior, though.

Since the early 1980s I’ve been interested in molecular computing, and in how computation might be done at the level of molecules. My discoveries in *A New Kind of Science* (and specifically the Principle of Computational Equivalence) convinced me that it should be possible to get even fairly simple collections of molecules to “do arbitrary computations” or even build more or less arbitrary structures (in a more general and streamlined way than happens with the whole protein synthesis structure in biology). And over the years, I sometimes thought about trying to do practical work in this area. But it didn’t feel as if the ambient technology was quite ready. So I never jumped in.

Meanwhile, I’d long understood the basic correspondence between multiway systems and patterns of possible pathways for chemical reactions. And after our Physics Project was announced in 2020 and we began to develop the general multicomputational paradigm, I immediately considered molecular computing a potential application. But just what might the “choreography” of molecules be like? What causal relationships might there be, for example, between different interactions of the same molecule? That’s not something ordinary chemistry—dealing for example with liquid-phase reactions—tends to consider important.

But what I increasingly started to wonder is whether in molecular biology it might actually be crucial. And even in the 20 years since *A New Kind of Science* was published, it’s become increasingly clear that in molecular biology things are extremely “orchestrated”. It’s not about molecules randomly moving around, like in a liquid. It’s about molecules being carefully channeled and actively transported from one “event” to another.

Class 3 cellular automata seem to be good “metamodels” for things like liquids, and readily give Second-Law-like behavior. But what about the kind of situation that seems to exist in molecular biology? It’s something I’ve been thinking about only recently, but I think this is a place where class 4 cellular automata can contribute. I’ve started calling the “bulk limit” of class 4 systems the “mechanoidal phase”. It’s a place where the ordinary Second Law doesn’t seem to apply.

Four decades ago when I was trying to understand how structure could arise “in violation of the Second Law” I didn’t yet even know about computational irreducibility. But now we’ve come a lot further, in particular with the development of the multicomputational paradigm, and the recognition of the importance of the characteristics of the observer in defining what perceived overall laws there will be. It’s an inevitable feature of computational irreducibility that there will always be an infinite sequence of new challenges for science, and new pieces of computational reducibility to be found. So, now, yes, a challenge is to understand the mechanoidal phase. And with all the tools and ideas we’ve developed, I’m hoping the process will happen more than it has for the ordinary Second Law.

I began my quest to understand the Second Law a bit more than 50 years ago. And—even though there’s certainly more to say and figure out—it’s very satisfying now to be able to bring a certain amount of closure to what has been the single longest-running piece of intellectual “unfinished business” in my life. It’s been an interesting journey—that’s very much relied on, and at times helped drive, the tower of science and technology that I’ve spent my life building. There are many things that might not have happened as they did. And in the end it’s been a story of longterm intellectual tenacity—stretching across much of my life so far.

For a long time I’ve kept (automatically when possible) quite extensive archives. And now these archives allow one to reconstruct in almost unprecedented detail my journey with the Second Law. One sees the gradual formation of intellectual frameworks over the course of years, then the occasional discovery or realization that allows one to take the next step in what is sometimes mere days. There’s a curious interweaving of computational and essentially philosophical methodologies—with an occasional dash of mathematics.

Sometimes there’s general intuition that’s significantly ahead of specific results. But more often there’s a surprise computational discovery that seeds the development of new intuition. And, yes, it’s a little embarrassing how often I managed to generate in a computer experiment something that I completely failed to interpret or even notice at first because I didn’t have the right intellectual framework or intuition.

And in the end, there’s an air of computational irreducibility to the whole process: there really wasn’t a way to shortcut the intellectual development; one just had to live it. Already in the 1990s I had taken things a fair distance, and I had even written a little about what I had figured out. But for years it hung out there as one of a small collection of unfinished projects: to finally round out the intellectual story of the Second Law, and to write down an exposition of it. But the arrival of our Physics Project just over two years ago brought both a cascade of new ideas, and for me personally a sense that even things that had been out there for a long time could in fact be brought to closure.

And so it is that I’ve returned to the quest I began when I was 12 years old—but now with five decades of new tools and new ideas. The wonder and magic of the Second Law is still there. But now I’m able to see it in a much broader context, and to realize that it’s not just a law about thermodynamics and heat, but instead a window into a very general computational phenomenon. None of this I could know when I was 12 years old. But somehow the quest I was drawn to all those years ago has turned out to be deeply aligned with the whole arc of intellectual development that I have followed in my life. And no doubt it’s no coincidence.

But for now I’m just grateful to have had the quest to understand Second Law as one of my guiding forces through so much of my life, and now to realize that my quest was part of something so broad and so deep.

What is the backstory of the book cover that launched my long journey with the Second Law? The book was published in 1965, and inside its front flap we find:

On page 7 we then find:

In 2001—as I was putting the finishing touches to the historical notes for *A New Kind of Science*—I tracked down Berni Alder (who died in 2020 at the age of 94) to ask him the origin of the pictures. It turned out to be a complex story, reaching back to the earliest serious uses of computers for basic science, and even beyond.

The book had been born out the sense of urgency around science education in the US that followed the launch of Sputnik by the Soviet Union—with a group of professors from Berkeley and Harvard believing that the teaching of freshman college physics was in need of modernization, and that they should write a series of textbooks to enable this. (It was also the time of the “new math”, and a host of other STEM-related educational initiatives.) Fred Reif (who died at the age of 92 in 2019) was asked to write the statistical physics volume. As he explained in the preface to the book

ending with:

Well, it’s taken me 50 years to get to the point where I think I really understand the Second Law that is at the center of the book. And in 2001 I was able to tell Fred Reif that, yes, his book had indeed been useful. He said he was pleased to learn that, adding “It is all too rare that one’s educational efforts seem to bear some fruit.”

He explained to me that when he was writing the book he thought that “the basic ideas of irreversibility and fluctuations might be very vividly illustrated by the behavior of a gas of particles spreading through a box”. He added: “It then occurred to me that Berni Alder might actually show this by a computer generated film since he had worked on molecular dynamics simulations and had also good computer facilities available to him. I was able to enlist Berni’s interest in this project, with the results shown in my book.”

The acknowledgements in the book report:

Berni Alder and and Fred Reif did indeed create a “film loop”, which “could be bought separately from the book and viewed in the physics lab”, as Alder told me, adding that “I understand the students liked it very much, but the venture was not a commercial success.” Still, he sent me a copy of a videotape version:

The film (which has no sound) begins:

Soon it’s showing an actual process of “coming to equilibrium”:

“However”, as Alder explained it to me, “if a large number of particles are put in the corner and the velocities of all the particles are reversed after a certain time, the audience laughs or is supposed to after all the particles return to their original positions.” (One suspects that particularly in the 1960s this might have been reminiscent of various cartoon-film gags.)

OK, so how were the pictures (and the film) made? It was done in 1964 at what’s now Lawrence Livermore Lab (that had been created in 1952 as a spinoff of the Berkeley Radiation Lab, which had initiated some key pieces for the Manhattan Project) on a computer called the LARC (“Livermore Advanced Research Computer”), first made in 1960, that was probably the most advanced scientific computer of the time. Alder explained to me, however: “We could not run the problem much longer than about 10 collision times with 64 bits [*sic*] arithmetic before the round-off error prevented the particles from returning.”

Why did they start the particles off in a somewhat random configuration? (The randomness, Alder told me, had been created by a middle-square random number generator.) Apparently if they’d been in a regular array—which would have made the whole process of randomization much easier to see—the roundoff errors would have been too obvious. (And it’s issues like this that made it so hard to recognize the rule 30 phenomenon in systems based on real numbers—and without the idea of just studying simple programs not tied to traditional equation-based formulations of physics.)

The actual code for the molecular dynamics simulation was written in assembler and run by Mary Ann Mansigh (Karlsen), who had a degree in math and chemistry and worked as a programmer at Livermore from 1955 until the 1980s, much of the time specifically with Alder. Here she is at the console of the LARC (yes, computers had built-in desks in those days):

The program that was used was called STEP, and the original version of it had actually been written (by a certain Norm Hardy, who ended up having a long Silicon Valley career) to run on a previous generation of computer. (A still-earlier program was called APE, for “Approach to Equilibrium”.) But it was only with the LARC—and STEP—that things were fast enough to run substantial simulations, at the rate of about 200,000 collisions per hour (the simulation for the book cover involved 40 particles and about 500 collisions). At the time of the book STEP used an *n*^{2} algorithm where all pairs of particles were tested for collisions; later a neighborhood-based linked list method was used.

The standard method of getting output from a computer back in 1964—and basically until the 1980s—was to print characters on paper. But the LARC could also drive an oscilloscope, and it was with this that the graphics for the book were created (capturing them from the oscilloscope screen with a Polaroid instant camera).

But why was Berni Alder studying molecular dynamics and “hard sphere gases” in the first place? Well, that’s another long story. But ultimately it was driven by the effort to develop a microscopic theory of liquids.

The notion that gases might consist of discrete molecules in motion had arisen in the 1700s (and even to some extent in antiquity), but it was only in the mid-1800s that serious development of the “kinetic theory” idea began. Pretty immediately it was clear how to derive the ideal gas law *P V* = *R T* for essentially non-interacting molecules. But what analog of this “equation of state” might apply to gases with significant interactions between molecules, or, for that matter, liquids? In 1873 Johannes Diderik van der Waals proposed, on essentially empirical grounds, the formula (*P* + *a*/*V*^{2})(*V*–*b*) = *RT*—where the parameter *b* represented “excluded volume” taken up by molecules, that were implicitly being viewed as hard spheres. But could such a formula be derived—like the ideal gas law—from a microscopic kinetic theory of molecules? At the time, nobody really knew how to start, and the problem languished for more than half a century.

(It’s worth pointing out, by the way, that the idea of modeling gases, as opposed to liquids, as collections of hard spheres was extensively pursued in the mid-1800s, notably by Maxwell and Boltzmann—though with their traditional mathematical analysis methods, they were limited to studying average properties of what amount to dilute gases.)

Meanwhile, there was increasing interest in the microscopic structure of liquids, particularly among chemists concerned for example with how chemical solutions might work. And at the end of the 1920s the technique of x-ray diffraction, which had originally been used to study the microscopic structure of crystals, was applied to liquids—allowing in particular the experimental determination of the radial distribution function (or pair correlation function) *g*(*r*), which gives the probability to find another molecule a distance *r* from a given one.

But how might this radial distribution function be computed? By the mid-1930s there were several proposals based on looking at the statistics of random assemblies of hard spheres:

Some tried to get results by mathematical methods; others did physical experiments with ball bearings and gelatin balls, getting at least rough agreement with actual experiments on liquids:

But then in 1939 a physical chemist named John Kirkwood gave an actual probabilistic derivation (using a variety of simplifying assumptions) that fairly closely reproduced the radial distribution function:

But what about just computing from first principles, on the basis of the mechanics of colliding molecules? Back in 1872 Ludwig Boltzmann had proposed a statistical equation (the “Boltzmann transport equation”) for the behavior of collections of molecules, that was based on the approximation of independent probabilities for individual molecules. By the 1940s the independence assumption had been overcome, but at the cost of introducing an infinite hierarchy of equations (the “BBGKY hierarchy”, where the “K” stood for Kirkwood). And although the full equations were intractable, approximations were suggested that—while themselves mathematically sophisticated—seemed as if they should, at least in principle, be applicable to liquids.

Meanwhile, in 1948, Berni Alder, fresh from a master’s degree in chemical engineering, and already interested in liquids, went to Caltech to work on a PhD with John Kirkwood—who suggested that he look at a couple of approximations to the BBGKY hierarchy for the case of hard spheres. This led to some nasty integro-differential equations which couldn’t be solved by analytical techniques. Caltech didn’t yet have a computer in the modern sense, but in 1949 they acquired an IBM 604 Electronic Calculating Punch, which could be wired to do calculations with input and output specified on punched cards—and it was on this machine that Alder got the calculations he needed done (the paper records that “[this] … was calculated … with the use of IBM equipment and the file of punched cards of *sin*(*ut*) employed in these laboratories for electron diffraction calculation”):

Our story now moves to Los Alamos, where in 1947 Stan Ulam had suggested the Monte Carlo method as a way to study neutron diffusion. In 1949 the method was implemented on the ENIAC computer. And in 1952 Los Alamos got its own MANIAC computer. Meanwhile, there was significant interest at Los Alamos in computing equations of state for matter, especially in extreme conditions such as those in a nuclear explosion. And by 1953 the idea had arisen of using the Monte Carlo method to do this.

The concept was to take a collection of hard spheres (or actually 2D disks), and move them randomly in a series of steps with the constraint that they could not overlap—then look at the statistics of the resulting “equilibrium” configurations. This was done on the MANIAC, with the resulting paper now giving “Monte Carlo results” for things like the radial distribution function:

Kirkwood and Alder had been continuing their BBGKY hierarchy work, now using more realistic Lennard-Jones forces between molecules. But by 1954 Alder was also using the Monte Carlo method, implementing it partly (rather painfully) on the IBM Electronic Calculating Punch, and partly on the Manchester Mark II computer in the UK (whose documentation had been written by Alan Turing):

In 1955 Alder started working full-time at Livermore, recruited by Edward Teller. Another Livermore recruit—fresh from a physics PhD—was Thomas Wainwright. And soon Alder and Wainwright came up with an alternative to the Monte Carlo method—that would eventually give the book cover pictures: just explicitly compute the dynamics of colliding hard spheres, with the expectation that after enough collisions the system would come to equilibrium and allow things like equations of state to be obtained.

In 1953 Livermore had obtained its first computer: a Remington Rand Univac I. And it was on this computer that Alder and Wainwright did a first proof of concept of their method, tracing 100 hard spheres with collisions computed at the rate of about 100 per hour. Then in 1955 Livermore got IBM 704 computers, which, with their hardware floating-point capabilities, were able to compute about 2000 collisions per hour.

Alder and Wainwright reported their first results at a statistical mechanics conference in Brussels in August 1956 (organized by Ilya Prigogine). The published version appeared in 1958:

It gives evidence—that they tagged as “provisional”—for the emergence of a Maxwell–Boltzmann velocity distribution “after the system reached equilibrium”

as well as things like the radial distribution function—and the equation of state:

It was notable that there seemed to be a discrepancy between the results for the equation of state computed by explicit molecular dynamics and by the Monte Carlo method. And what is more, there seemed to be evidence of some kind of discontinuous phase-transition-like behavior as the density of spheres changed (an effect which Kirkwood had predicted in 1949).

Given the small system sizes and short runtimes it was all a bit muddy. But by August 1957 Alder and Wainwright announced that they’d found a phase transition, presumably between a high-density phase where the spheres were packed together like in a crystalline solid, and a low-density phase, where they were able to more freely “wander around” like in a liquid or gas. Meanwhile, the group at Los Alamos had redone their Monte Carlo calculations, and they too now claimed a phase transition. Their papers were published back to back:

But at this point no actual pictures of molecular trajectories had yet been published, or, I believe, made. All there was were traditional plots of aggregated quantities. And in 1958, these plots made their first appearance in a textbook. Tucked into Appendix C of *Elementary Statistical Physics* by Berkeley physics professor Charles Kittel (who would later be chairman of the group developing the Berkeley Physics Course book series) were two rather confusing plots about the approach to the Maxwell–Boltzmann distribution taken from a pre-publication version of Alder and Wainwright’s paper:

Alder and Wainwright’s phase transition result had created enough of a stir that they were asked to write a *Scientific American* article about it. And in that article—entitled “Molecular Motions”, from October 1959—there were finally pictures of actual trajectories, with their caption explaining that the “paths of particles … appear as bright lines on the face of a cathode-ray tube hooked to a computer” (the paths are of the centers of the colliding disks):

A technical article published at the same time gave a diagram of the logic for the dynamical computation:

Then in 1960 Livermore (after various delays) took delivery of the LARC computer—arguably the first scientific supercomputer—which allowed molecular dynamics computations to be done perhaps 20 times times faster. A 1962 picture shows Berni Alder (left) and Thomas Wainwright (right) looking at outputs from the LARC with Mary Ann Mansigh (yes, in those days it was typical for male physicists to wear ties):

And in 1964, the pictures for the *Statistical Physics* book (and film loop) got made, with Mary Ann Mansigh painstakingly constructing images of disks on the oscilloscope display.

Work on molecular dynamics continued, though to do it required the most powerful computers, so for many years it was pretty much restricted to places like Livermore. And in 1967, Alder and Wainwright made another discovery about hard spheres. Even in their first paper about molecular dynamics they’d plotted the velocity autocorrelation function, and noted that it decayed roughly exponentially with time. But by 1967 they had much more precise data, and realized that there was a deviation from exponential decay: a definite “long-time tail”. And soon they had figured out that this power-law tail was basically the result of a continuum hydrodynamic effect (essentially a vortex) operating even on the scale of a few molecules. (And—though it didn’t occur to me at the time—this should have suggested that even with fairly small numbers of cells cellular automaton fluid simulations had a good chance of giving recognizable hydrodynamic results.)

It’s never been entirely easy to do molecular dynamics, even with hard spheres, not least because in standard computations one’s inevitably confronted with things like numerical roundoff errors. And no doubt this is why some of the obvious foundational questions about the Second Law weren’t really explored there, and why intrinsic randomness generation and the rule 30 phenomenon weren’t identified.

Incidentally, even before molecular dynamics emerged, there was already one computer study of what could potentially have been Second Law behavior. Visiting Los Alamos in the early 1950s Enrico Fermi had gotten interested in using computers for physics, and wondered what would happen if one simulated the motion of an array of masses with nonlinear springs between them. The results of running this on the MANIAC computer were reported in 1955 (after Fermi had died)

and it was noted that there wasn’t just exponential approach to equilibrium, but instead something more complicated (later connected to solitons). Strangely, though, instead of plotting actual particle trajectories, what were given were mode energies—but these still exhibited what, if it hadn’t been obscured by continuum issues, might have been recognized as something like the rule 30 phenomenon:

But I knew none of this history when I saw the *Statistical Physics* book cover in 1972. And indeed, for all I knew, it could have been a “standard statistical physics cover picture”. I didn’t know it was the first of its kind—and a leading-edge example of the use of computers for basic science, accessible only with the most powerful computers of the time. Of course, had I known those things, I probably wouldn’t have tried to reproduce the picture myself and I wouldn’t have had that early experience in trying to use a computer to do science. (Curiously enough, looking at the numbers now, I realize that the base speed of the LARC was only 20x the Elliott 903C, though with floating point, etc.—a factor that pales in comparison with the 500x speedup in computers in the 40 years since I started working on cellular automata.)

But now I know the history of that book cover, and where it came from. And what I only just discovered now is that actually there’s a bigger circle than I knew. Because the path from Berni Alder to that book cover to my work on cellular automaton fluids came full circle—when in 1988 Alder wrote a paper based on cellular automaton fluids (though through the vicissitudes of academic behavior I don’t think he knew these had been my idea—and now it’s too late to tell him his role in seeding them):

*Notes & Thanks*

There are many people who’ve contributed to the 50-year journey I’ve described here. Some I’ve already mentioned by name, but others not—including many who doubtless wouldn’t even be aware that they contributed. The longtime store clerk at Blackwell’s bookstore who in 1972 sold college physics books to a 12-year old without batting an eye. (I learned his name—Keith Clack—30 years later when he organized a book signing for *A New Kind of Science* at Blackwell’s.) John Helliwell and Lawrence Wickens who in 1977 invited me to give the first talk where I explicitly discussed the foundations of the Second Law. Douglas Abraham who in 1977 taught a course on mathematical statistical mechanics that I attended. Paul Davies who wrote a book on *The Physics of Time Asymmetry* that I read around that time. Rocky Kolb who in 1979 and 1980 worked with me on cosmology that used statistical mechanics. The students (including professors like Steve Frautschi and David Politzer) who attended my 1981 class at Caltech about “nonequilibrium statistical mechanics”. David Pines and Elliott Lieb who in 1983 were responsible for publishing my breakout paper on “Statistical Mechanics of Cellular Automata”. Charles Bennett (curiously, a student of Berni Alder’s) with whom in the early 1980s I discussed applying computation theory (notably the ideas of Greg Chaitin) to physics. Brian Hayes who commissioned my 1984 *Scientific American* article, and Peter Brown who edited it. Danny Hillis and Sheryl Handler who in 1984 got me involved with Thinking Machines. Jim Salem and Bruce Nemnich (Walker) who worked on fluid dynamics on the Connection Machine with me. Then—36 years later—Jonathan Gorard and Max Piskunov, who catalyzed the doing of our Physics Project.

In the last 50 years, there’ve been surprisingly few people with whom I’ve directly discussed the foundations of the Second Law. Perhaps one reason is that back when I was a “professional physicist” statistical mechanics as a whole wasn’t a prominent area. But, more important, as I’ve described elsewhere, for more than a century most physicists have effectively assumed that the foundations of the Second Law are a solved (or at least merely pedantic) problem.

Probably the single person with whom I had the most discussions about the foundations of the Second Law is Richard Feynman. But there are others with whom at one time or another I’ve discussed related issues, including: Bruce Boghosian, Richard Crandall, Roger Dashen, Mitchell Feigenbaum, Nigel Goldenfeld, Theodore Gray, Bill Hayes, Joel Lebowitz, David Levermore, Ed Lorenz, John Maddox, Roger Penrose, Ilya Prigogine, Rudy Rucker, David Ruelle, Rob Shaw, Yakov Sinai, Michael Trott, Léon van Hove and Larry Yaffe. (There are also many others with whom I’ve discussed general issues about origins of randomness.)

Finally, one technical note about the presentation here: in an effort to maintain a clearer timeline, I’ve typically shown the earliest drafts or preprint versions of papers that I have. Their final published versions (if indeed they were ever published) appeared anything from weeks to years later, sometimes with changes.

]]>As I’ve explained elsewhere, I think I now finally understand the Second Law of thermodynamics. But it’s a new understanding, and to get to it I’ve had to overcome a certain amount of conventional wisdom about the Second Law that I at least have long taken for granted. And to check myself I’ve been keen to know just where this conventional wisdom came from, how it’s been validated, and what might have made it go astray.

And from this I’ve been led into a rather detailed examination of the origins and history of thermodynamics. All in all, it’s a fascinating story, that both explains what’s been believed about thermodynamics, and provides some powerful examples of the complicated dynamics of the development and acceptance of ideas.

The basic concept of the Second Law was first formulated in the 1850s, and rather rapidly took on something close to its modern form. It began partly as an empirical law, and partly as something abstractly constructed on the basis of the idea of molecules, that nobody at the time knew for sure existed. But by the end of the 1800s, with the existence of molecules increasingly firmly established, the Second Law began to often be treated as an almost-mathematically-proven necessary law of physics. There were still mathematical loose ends, as well as issues such as its application to living systems and to systems involving gravity. But the almost-universal conventional wisdom became that the Second Law must always hold, and if it didn’t seem to in a particular case, then that must just be because there was something one didn’t yet understand about that case.

There was also a sense that regardless of its foundations, the Second Law was successfully used in practice. And indeed particularly in chemistry and engineering it’s often been in the background, justifying all the computations routinely done using entropy. But despite its ubiquitous appearance in textbooks, when it comes to foundational questions, there’s always been a certain air of mystery around the Second Law. Though after 150 years there’s typically an assumption that “somehow it must all have been worked out”. I myself have been interested in the Second Law now for a little more than 50 years, and over that time I’ve had a growing awareness that actually, no, it hasn’t all been worked out. Which is why, now, it’s wonderful to see the computational paradigm—and ideas from our Physics Project—after all these years be able to provide solid foundations for understanding the Second Law, as well as seeing its limitations.

And from the vantage point of the understanding we now have, we can go back and realize that there were precursors of it even from long ago. In some ways it’s all an inspiring tale—of how there were scientists with ideas ahead of their time, blocked only by the lack of a conceptual framework that would take another century to develop. But in other ways it’s also a cautionary tale, of how the forces of “conventional wisdom” can blind people to unanswered questions and—over a surprisingly long time—inhibit the development of new ideas.

But, first and foremost, the story of the Second Law is the story of a great intellectual achievement of the mid-19th century. It’s exciting now, of course, to be able to use the latest 21st-century ideas to take another step. But to appreciate how this fits in with what’s already known we have to go back and study the history of what originally led to the Second Law, and how what emerged as conventional wisdom about it took shape.

Once it became clear what heat is, it actually didn’t take long for the Second Law to be formulated. But for centuries—and indeed until the mid-1800s—there was all sorts of confusion about the nature of heat.

That there’s a distinction between hot and cold is a matter of basic human perception. And seeing fire one might imagine it as a disembodied form of heat. In ancient Greek times Heraclitus (~500 BC) talked about everything somehow being “made of fire”, and also somehow being intrinsically “in motion”. Democritus (~460–~370 BC) and the Epicureans had the important idea (that also arose independently in other cultures) that everything might be made of large numbers of a few types of tiny discrete atoms. They imagined these atoms moving around in the “void” of space. And when it came to heat, they seem to have correctly associated it with the motion of atoms—though they imagined it came from particular spherical “fire” atoms that could slide more quickly between other atoms, and they also thought that souls were the ultimate sources of motion and heat (at least in warm-blooded animals?), and were made of fire atoms.

And for two thousand years that’s pretty much where things stood. And indeed in 1623 Galileo (1564–1642) (in his book *The Assayer*, about weighing competing world theories) was still saying:

Those materials which produce heat in us and make us feel warmth, which are known by the general name of “fire,” would then be a multitude of minute particles having certain shapes and moving with certain velocities. Meeting with our bodies, they penetrate by means of their extreme subtlety, and their touch as felt by us when they pass through our substance is the sensation we call “heat.”

He goes on:

Since the presence of fire-corpuscles alone does not suffice to excite heat, but their motion is needed also, it seems to me that one may very reasonably say that motion is the cause of heat… But I hold it to be silly to accept that proposition in the ordinary way, as if a stone or piece of iron or a stick must heat up when moved. The rubbing together and friction of two hard bodies, either by resolving their parts into very subtle flying particles or by opening an exit for the tiny fire-corpuscles within, ultimately sets these in motion; and when they meet our bodies and penetrate them, our conscious mind feels those pleasant or unpleasant sensations which we have named heat…

And although he can tell there’s something different about it, he thinks of heat as effectively being associated with a substance or material:

The tenuous material which produces heat is even more subtle than that which causes odor, for the latter cannot leak through a glass container, whereas the material of heat makes its way through any substance.

In 1620, Francis Bacon (1561–1626) (in his “update on Aristotle”, *The New Organon*) says, a little more abstractly, if obscurely—and without any reference to atoms or substances:

[It is not] that heat generates motion or that motion generates heat (though both are true in certain cases), but that heat itself, its essence and quiddity, is motion and nothing else.

But real progress in understanding the nature of heat had to wait for more understanding about the nature of gases, with air being the prime example. (It was actually only in the 1640s that any kind of general notion of gas began to emerge—with the word “gas” being invented by the “anti-Galen” physician Jan Baptista van Helmont (1580–1644), as a Dutch rendering of the Greek word “chaos”, that meant essentially “void”, or primordial formlessness.) Ever since antiquity there’d been Aristotle-style explanations like “nature abhors a vacuum” about what nature “wants to do”. But by the mid-1600s the idea was emerging that there could be more explicit and mechanical explanations for phenomena in the natural world.

And in 1660 Robert Boyle (1627–1691)—now thoroughly committed to the experimental approach to science—published *New Experiments Physico-mechanicall, Touching the Spring of the Air and its Effects* in which he argued that air has an intrinsic pressure associated with it, which pushes it to fill spaces, and for which he effectively found Boyle’s Law *PV* = constant.

But what was air actually made of? Boyle had two basic hypotheses that he explained in rather flowery terms:

His first hypothesis was that air might be like a “fleece of wool” made of “aerial corpuscles” (gases were later often called “aeriform fluids”) with a “power or principle of self-dilatation” that resulted from there being “hairs” or “little springs” between these corpuscles. But he had a second hypothesis too—based, he said, on the ideas of “that most ingenious gentleman, Monsieur Descartes”: that instead air consists of “flexible particles” that are “so whirled around” that “each corpuscle endeavors to beat off all others”. In this second hypothesis, Boyle’s “spring of the air” was effectively the result of particles bouncing off each other.

And, as it happens, in 1668 there was quite an effort to understand the “laws of impact” (that would for example be applicable to balls in games like croquet and billiards, that had existed since at least the 1300s, and were becoming popular), with John Wallis (1616–1703), Christopher Wren (1632–1723) and Christiaan Huygens (1629–1695) all contributing, and Huygens producing diagrams like:

But while some understanding developed of what amount to impacts between pairs of hard spheres, there wasn’t the mathematical methodology—or probably the idea—to apply this to large collections of spheres.

Meanwhile, in his 1687 *Principia Mathematica*, Isaac Newton (1642–1727), wanting to analyze the properties of self-gravitating spheres of fluid, discussed the idea that fluids could in effect be made up of arrays of particles held apart by repulsive forces, as in Boyle’s first hypothesis. Newton had of course had great success with his 1/*r*^{2} universal attractive force for gravity. But now he noted (writing originally in Latin) that with a 1/*r* repulsive force between particles in a fluid, he could essentially reproduce Boyle’s law:

Newton discussed questions like whether one particle would “shield” others from the force, but then concluded:

But whether elastic fluids do really consist of particles so repelling each other, is a physical question. We have here demonstrated mathematically the property of fluids consisting of particles of this kind, that hence philosophers may take occasion to discuss that question.

Well, in fact, particularly given Newton’s authority, for well over a century people pretty much just assumed that this was how gases worked. There was one major exception, however, in 1738, when—as part of his eclectic mathematical career spanning probability theory, elasticity theory, biostatistics, economics and more—Daniel Bernoulli (1700–1782) published his book on hydrodynamics. Mostly he discusses incompressible fluids and their flow, but in one section he considers “elastic fluids”—and along with a whole variety of experimental results about atmospheric pressure in different places—draws the picture

and says

Let the space ECDF contain very small particles in rapid motion; as they strike against the piston EF and hold it up by their impact, they constitute an elastic fluid which expands as the weight P is removed or reduced; but if P is increased it becomes denser and presses on the horizontal case CD just as if it were endowed with no elastic property.

Then—in a direct and clear anticipation of the kinetic theory of heat—he goes on:

The pressure of the air is increased not only by reduction in volume but also by rise in temperature. As it is well known that heat is intensified as the internal motion of the particles increases, it follows that any increase in the pressure of air that has not changed its volume indicates more intense motion of its particles, which is in agreement with our hypothesis…

But at the time, and in fact for more than a century thereafter, this wasn’t followed up.

A large part of the reason seems to have been that people just assumed that heat ultimately had to have some kind of material existence; to think that it was merely a manifestation of microscopic motion was too abstract an idea. And then there was the observation of “radiant heat” (i.e. infrared radiation)—that seemed like it could only work by explicitly transferring some kind of “heat material” from one body to another.

But what was this “heat material”? It was thought of as a fluid—called caloric—that could suffuse matter, and for example flow from a hotter body to a colder. And in an echo of Democritus, it was often assumed that caloric consisted of particles that could slide between ordinary particles of matter. There was some thought that it might be related to the concept of phlogiston from the mid-1600s, that was effectively a chemical substance, for example participating in chemical reactions or being generated in combustion (through the “principle of fire”). But the more mainstream view was that there were caloric particles that would collect around ordinary particles of matter (often called “molecules”, after the use of that term by Descartes (1596–1650) in 1620), generating a repulsive force that would for example expand gases—and that in various circumstances these caloric particles would move around, corresponding to the transfer of heat.

To us today it might seem hacky and implausible (perhaps a little like dark matter, cosmological inflation, etc.), but the caloric theory lasted for more than two hundred years and managed to explain plenty of phenomena—and indeed was certainly going strong in 1825 when Laplace wrote his *A Treatise of Celestial Mechanics*, which included a successful computation of properties of gases like the speed of sound and the ratio of specific heats, on the basis of a somewhat elaborated and mathematicized version of caloric theory (that by then included the concept of “caloric rays” associated with radiant heat).

But even though it wasn’t understood what heat ultimately was, one could still measure its attributes. Already in antiquity there were devices that made use of heat to produce pressure or mechanical motion. And by the beginning of the 1600s—catalyzed by Galileo’s development of the thermoscope (in which heated liquid could be seen to expand up a tube)—the idea quickly caught on of making thermometers, and of quantitatively measuring temperature.

And given a measurement of temperature, one could correlate it with effects one saw. So, for example, in the late 1700s the French balloonist Jacques Charles (1746–1823) noted the linear increase of volume of a gas with temperature. Meanwhile, at the beginning of the 1800s Joseph Fourier (1768–1830) (science advisor to Napoleon) developed what became his 1822 *Analytical Theory of Heat*, and in it he begins by noting that:

Heat, like gravity, penetrates every substance of the universe, its rays occupy all parts of space. The object of our work is to set forth the mathematical laws which this element obeys. The theory of heat will hereafter form one of the most important branches of general physics.

Later he describes what he calls the “Principle of the Communication of Heat”. He refers to “molecules”—though basically just to indicate a small amount of substance—and says

When two molecules of the same solid are extremely near and at unequal temperatures, the most heated molecule communicates to that which is less heated a quantity of heat exactly expressed by the product of the duration of the instant, of the extremely small difference of the temperatures, and of certain function of the distance of the molecules.

then goes on to develop what’s now called the heat equation and all sorts of mathematics around it, all the while effectively adopting a caloric theory of heat. (And, yes, if you think of heat as a fluid it does lead you to describe its “motion” in terms of differential equations just like Fourier did. Though it’s then ironic that Bernoulli, even though he studied hydrodynamics, seemed to have a less “fluid-based” view of heat.)

At the beginning of the 1800s the Industrial Revolution was in full swing—driven in no small part by the availability of increasingly efficient steam engines. There had been precursors of steam engines even in antiquity, but it was only in 1712 that the first practical steam engine was developed. And after James Watt (1736–1819) produced a much more efficient version in 1776, the adoption of steam engines began to take off.

Over the years that followed there were all sorts of engineering innovations that increased the efficiency of steam engines. But it wasn’t clear how far it could go—and whether for example there was a limit to how much mechanical work could ever, even in principle, be derived from a given amount of heat. And it was the investigation of this question—in the hands of a young French engineer named Sadi Carnot (1796–1832)—that began the development of an abstract basic science of thermodynamics, and to the Second Law.

The story really begins with Sadi Carnot’s father, Lazare Carnot (1753–1823), who was trained as an engineer but ascended to the highest levels of French politics, and was involved with both the French Revolution and Napoleon. Particularly in years when he was out of political favor, Lazare Carnot worked on mathematics and mathematical engineering. His first significant work—in 1778—was entitled *Memoir on the Theory of Machines*. The mathematical and geometrical science of mechanics was by then fairly well developed; Lazare Carnot’s objective was to understand its consequences for actual engineering machines, and to somehow abstract general principles from the mechanical details of the operation of those machines. In 1803 (alongside works on the geometrical theory of fortifications) he published his *Fundamental Principles of [Mechanical] Equilibrium and Movement*, which argued for what was at one time called (in a strange foreshadowing of reversible thermodynamic processes) “Carnot’s Principle”: that useful work in a machine will be maximized if accelerations and shocks of moving parts are minimized—and that a machine with perpetual motion is impossible.

Sadi Carnot was born in 1796, and was largely educated by his father until he went to college in 1812. It’s notable that during the years when Sadi Carnot was a kid, one of his father’s activities was to give opinions on a whole range of inventions—including many steam engines and their generalizations. Lazare Carnot died in 1823. Sadi Carnot was by that point a well-educated but professionally undistinguished French military engineer. But in 1824, at the age of 28, he produced his one published work, *Reflections on the Motive Power of Fire, and on Machines to Develop That Power* (where by “fire” he meant what we would call heat):

The style and approach of the younger Carnot’s work is quite similar to his father’s. But the subject matter turned out to be more fruitful. The book begins:

Everyone knows that heat can produce motion. That it possesses vast motive-power none can doubt, in these days when the steam-engine is everywhere so well known… The study of these engines is of the greatest interest, their importance is enormous, their use is continually increasing, and they seem destined to produce a great revolution in the civilized world. Already the steam-engine works our mines, impels our ships, excavates our ports and our rivers, forges iron, fashions wood, grinds grain, spins and weaves our cloths, transports the heaviest burdens, etc. It appears that it must some day serve as a universal motor, and be substituted for animal power, water-falls, and air currents. …

Notwithstanding the work of all kinds done by steam-engines, notwithstanding the satisfactory condition to which they have been brought to-day, their theory is very little understood, and the attempts to improve them are still directed almost by chance. …

The question has often been raised whether the motive power of heat is unbounded, whether the possible improvements in steam-engines have an assignable limit, a limit which the nature of things will not allow to be passed by any means whatever; or whether, on the contrary, these improvements may be carried on indefinitely. We propose now to submit these questions to a deliberate examination.

Carnot operated very much within the framework of caloric theory, and indeed his ideas were crucially based on the concept that one could think about “heat itself” (which for him was caloric fluid), independent of the material substance (like steam) that was hot. But—like his father’s efforts with mechanical machines—his goal was to develop an abstract “metamodel” of something like a steam engine, crucially assuming that the generation of unbounded heat or mechanical work (i.e. perpetual motion) in the closed cycle of the operation of the machine was impossible, and noting (again with a reflection of his father’s work) that the system would necessarily maximize efficiency if it operated reversibly. And he then argued that:

The production of motive power is then due in steam-engines not to an actual consumption of caloric, but to its transportation from a warm body to a cold body, that is, to its re-establishment of equilibrium…

In other words, what was important about a steam engine was that it was a “heat engine”, that “moved heat around”. His book is mostly words, with just a few formulas related to the behavior of ideal gases, and some tables of actual parameters for particular materials. But even though his underlying conceptual framework—of caloric theory—was not correct, the abstract arguments that he made (that involved essentially logical consequences of reversibility and of operating in a closed cycle) were robust enough that it didn’t matter, and in particular he was able to successfully show that there was a theoretical maximum efficiency for a heat engine, that depended only on the temperatures of its hot and cold reservoirs of heat. But what’s important for our purposes here is that in the setup Carnot constructed he basically ended up introducing the Second Law.

At the time it appeared, however, Carnot’s book was basically ignored, and Carnot died in obscurity from cholera in 1832 (about 9 months after Évariste Galois (1811–1832)) at the age of 36. (The Sadi Carnot who would later become president of France was his nephew.) But in 1834, Émile Clapeyron (1799–1864)—a rather distinguished French engineering professor (and steam engine designer)—wrote a paper entitled “Memoir on the Motive Power of Heat”. He starts off by saying about Carnot’s book:

The idea which serves as a basis of his researches seems to me to be both fertile and beyond question; his demonstrations are founded on the absurdity of the possibility of creating motive power or heat out of nothing. …

This new method of demonstration seems to me worthy of the attention of theoreticians; it seems to me to be free of all objection …

I believe that it is of some interest to take up this theory again; S. Carnot, avoiding the use of mathematical analysis, arrives by a chain of difficult and elusive arguments at results which can be deduced easily from a more general law which I shall attempt to prove…

Clapeyron’s paper doesn’t live up to the claims of originality or rigor expressed here, but it served as a more accessible (both in terms of where it was published and how it was written) exposition of Carnot’s work, featuring, for example, for the first time a diagrammatic representation of a Carnot cycle

as well as notations like *Q*-for-heat that are still in use today:

One of the implications of Newton’s Laws of Motion is that momentum is conserved. But what else might also be conserved? In the 1680s Gottfried Leibniz (1646–1716) suggested the quantity *m v*^{2}, which he called, rather grandly, *vis viva—*or, in English, “life force”. And yes, in things like elastic collisions, this quantity did seem to be conserved. But in plenty of situations it wasn’t. By 1807 the term “energy” had been introduced, but the question remained of whether it could in any sense globally be thought of as conserved.

It had seemed for a long time that heat was something a bit like mechanical energy, but the relation wasn’t clear—and the caloric theory of heat implied that caloric (i.e. the fluid corresponding to heat) was conserved, and so certainly wasn’t something that for example could be interconverted with mechanical energy. But in 1798 Benjamin Thompson (Count Rumford) (1753–1814) measured the heat produced by the mechanical process of boring a cannon, and began to make the argument that, in contradiction to the caloric theory, there was actually some kind of correspondence between mechanical energy and amount of heat.

It wasn’t a very accurate experiment, and it took until the 1840s—with new experiments by the English brewer and “amateur” scientist James Joule (1818–1889) and the German physician Robert Mayer (1814–1878)—before the idea of some kind of equivalence between heat and mechanical work began to look more plausible. And in 1847 this was something William Thomson (1824–1907) (later Lord Kelvin)—a prolific young physicist recently graduated from the Mathematical Tripos in Cambridge and now installed as a professor of “natural philosophy” (i.e. physics) in Glasgow—began to be curious about.

But first we have to go back a bit in the story. In 1845 Kelvin (as we’ll call him) had spent some time in Paris (primarily at at a lab that was measuring properties of steam for the French government), and there he’d learned about Carnot’s work from Clapeyron’s paper (at first he couldn’t get a copy of Carnot’s actual book). Meanwhile, one of the issues of the time was a proliferation of different temperature scales based on using different kinds of thermometers based on different substances. And in 1848 Kelvin realized that Carnot’s concept of a “pure heat engine”—assumed at the time to be based on caloric—could be used to define an “absolute” scale of temperature in which, for example, at absolute zero all caloric would have been removed from all substances:

Having found Carnot’s ideas useful, Kelvin in 1849 wrote a 33-page summary of them (small world that it was then, the immediately preceding paper in the journal is “On the Theory of Rolling Curves”, written by the then-17-year-old James Clerk Maxwell (1831–1879), while the one that follows is “Theoretical Considerations on the Effect of Pressure in Lowering the Freezing Point of Water” by James Thomson (1786–1849), engineering-oriented older brother of William):

He characterizes Carnot’s work as being based not so much on physics and experiment, but on the “strictest principles of philosophy”:

He doesn’t immediately mention “caloric” (though it does slip in later), referring instead to a vaguer concept of “thermal agency”:

In keeping with the idea that this is more philosophy than experimental science, he refers to “Carnot’s fundamental principle”—that after a complete cycle an engine can be treated as back in the “same state”—while adding the footnote that “this is tacitly assumed as an axiom”:

In actuality, to say that an engine comes back to the same state is a nontrivial statement of the existence of some kind of unique equilibrium in the system, related to the Second Law. But in 1848 Kelvin brushes this off by saying that the “axiom” has “never, so far as I am aware, been questioned by practical engineers”.

His next page is notable for the first-ever use of the term “thermo-dynamic” (then hyphenated) to discuss systems where what matters is “the dynamics of heat”:

That same page has a curious footnote presaging what will come, and making the statement that “no energy can be destroyed”, and considering it “perplexing” that this seems incompatible with Carnot’s work and its caloric theory framework:

After going through Carnot’s basic arguments, the paper ends with an appendix in which Kelvin basically says that even though the theory seems to just be based on a formal axiom, it should be experimentally tested:

He proceeds to give some tests, which he claims agree with Carnot’s results—and finally ends with a very practical (but probably not correct) table of theoretical efficiencies for steam engines of his day:

But now what of Joule’s and Mayer’s experiments, and their apparent disagreement with the caloric theory of heat? By 1849 a new idea had emerged: that perhaps heat was itself a form of energy, and that, when heat was accounted for, the total energy of a system would always be conserved. And what this suggested was that heat was somehow a dynamical phenomenon, associated with microscopic motion—which in turn suggested that gases might indeed consist just of molecules in motion.

And so it was that in 1850 Kelvin (then still “William Thomson”) wrote a long exposition “On the Dynamical Theory of Heat”, attempting to reconcile Carnot’s ideas with the new concept that heat was dynamical in origin:

He begins by quoting—presumably for some kind of “British-based authority”—an “anti-caloric” experiment apparently done by Humphry Davy (1778–1829) as a teenager, involving melting pieces of ice by rubbing them together, and included anonymously in a 1799 list of pieces of knowledge “principally from the west of England”:

But soon Kelvin is getting to the main point:

And then we have it: a statement of the Second Law (albeit with some hedging to which we’ll come back later):

And there’s immediately a footnote that basically asserts the “absurdity” of a Second-Law-violating perpetual motion machine:

But by the next page we find out that Kelvin admits he’s in some sense been “scooped”—by a certain Rudolf Clausius (1822–1888), who we’ll be discussing soon. But what’s remarkable is that Clausius’s “axiom” turns out to be exactly equivalent to Kelvin’s statement:

And what this suggests is that the underlying concept—the Second Law—is something quite robust. And indeed, as Kelvin implies, it’s the main thing that ultimately underlies Carnot’s results. And so even though Carnot is operating on the now-outmoded idea of caloric theory, his main results are still correct, because in the end all they really depend on is a certain amount of “logical structure”, together with the Second Law (and a version of the First Law, but that’s a slightly trickier story).

Kelvin recognized, though, that Carnot had chosen to look at the particular (“equilibrium thermodynamics”) case of processes that occur reversibly, effectively at an infinitesimal rate. And at the end of the first installment of his exposition, he explains that things will be more complicated if finite rates are considered—and that in particular the results one gets in such cases will depend on things like having a correct model for the nature of heat.

Kelvin’s exposition on the “dynamical nature of heat” runs to four installments, and the next two dive into detailed derivations and attempted comparison with experiment:

But before Kelvin gets to publish part four of his exposition he publishes two other pieces. In the first, he’s talking about sources of energy for human use (now that he believes energy is conserved):

He emphasizes that the Sun is—directly or indirectly—the main source of energy on Earth (later he’ll argue that coal will run out, etc.):

But he wonders how animals actually manage to produce mechanical work, noting that “the animal body does not act as a thermo-dynamic engine; and [it is] very probable that the chemical forces produce the external mechanical effects through electrical means”:

And then, by April 1852, he’s back to thinking directly about the Second Law, and he’s cut through the technicalities, and is stating the Second Law in everyday (if slightly ponderous) terms:

It’s interesting to see his apparently rather deeply held Presbyterian beliefs manifest themselves here in his mention that “Creative Power” is what must set the total energy of the universe. He ends his piece with:

In (2) the hedging is interesting. He makes the definitive assertion that what amounts to a violation of the Second Law “is impossible in inanimate material processes”. And he’s pretty sure the same is true for “vegetable life” (recognizing that in his previous paper he discussed the harvesting of sunlight by plants). But what about “animal life”, like us humans? Here he says that “by our will” we can’t violate the Second Law—so we can’t, for example, build a machine to do it. But he leaves it open whether we as humans might have some innate (“God-given”?) ability to overcome the Second Law.

And then there’s his (3). It’s worth realizing that his whole paper is less than 3 pages long, and right before his conclusions we’re seeing triple integrals:

So what is (3) about? It’s presumably something like a Second-Law-implies-heat-death-of-the-universe statement (but what’s this stuff about the past?)—but with an added twist that there’s something (God?) beyond the “known operations going on at present in the material world” that might be able to swoop in to save the world for us humans.

It doesn’t take people long to pick up on the “cosmic significance” of all this. But in the fall of 1852, Kelvin’s colleague, the Glasgow engineering professor William Rankine (1820–1872) (who was deeply involved with the First Law of thermodynamics), is writing about a way the universe might save itself:

After touting the increasingly solid evidence for energy conservation and the First Law

he goes on to talk about dissipation of energy and what we now call the Second Law

and the fact that it implies an “end of all physical phenomena”, i.e. heat death of the universe. He continues:

But now he offers a “ray of hope”. He believes that there must exist a “medium capable of transmitting light and heat”, i.e. an aether, “[between] the heavenly bodies”. And if this aether can’t itself acquire heat, he concludes that all energy must be converted into a radiant form:

Now he supposes that the universe is effectively a giant drop of aether, with nothing outside, so that all this radiant energy will get totally internally reflected from its surface, allowing the universe to “[reconcentrate] its physical energies, and [renew] its activity and life”—and save it from heat death:

He ends with the speculation that perhaps “some of the luminous objects which we see in distant regions of space may be, not stars, but foci in the interstellar aether”.

But independent of cosmic speculations, Kelvin himself continues to study the “dynamical theory of gases”. It’s often a bit unclear what’s being assumed. There’s the First Law (energy conservation). And the Second Law. But there’s also reversibility. Equilibrium. And the ideal gas law (*P V* = *R T*). But it soon becomes clear that that’s not always correct for real gases—as the Joule–Thomson effect demonstrates:

Kelvin soon returned to more cosmic speculations, suggesting that perhaps gravitation—rather than direct “Creative Power”—might “in reality [be] the ultimate created antecedent of all motion…”:

Not long after these papers Kelvin got involved with the practical “electrical” problem of laying a transatlantic telegraph cable, and in 1858 was on the ship that first succeeded in doing this. (His commercial efforts soon allowed him to buy a 126-ton yacht.) But he continued to write physics papers, which ranged over many different areas, occasionally touching thermodynamics, though most often in the service of answering a “general science” question—like how old the Sun is (he estimated 32,000 years from thermodynamic arguments, though of course without knowledge of nuclear reactions).

Kelvin’s ideas about the inevitable dissipation of “useful energy” spread quickly—by 1854, for example, finding their way into an eloquent public lecture by Hermann von Helmholtz (1821–1894). Helmholtz had trained as a doctor, becoming in 1843 a surgeon to a German military regiment. But he was also doing experiments and developing theories about “animal heat” and how muscles manage to “do mechanical work”, for example publishing an 1845 paper entitled “On Metabolism during Muscular Activity”. And in 1847 he was one of the inventors of the law of conservation of energy—and the First Law of thermodynamics—as well as perhaps its clearest expositor at the time (the word “force” in the title is what we now call “energy”):

By 1854 Helmholtz was a physiology professor, beginning a distinguished career in physics, psychophysics and physiology—and talking about the Second Law and its implications. He began his lecture by saying that “A new conquest of very general interest has been recently made by natural philosophy”—and what he’s referring to here is the Second Law:

Having discussed the inability of “automata” (he uses that word) to reproduce living systems, he starts talking about perpetual motion machines:

First he disposes of the idea that perpetual motion can be achieved by generating energy from nothing (i.e. violating the First Law), charmingly including the anecdote:

And then he’s on to talking about the Second Law

and discussing how it implies the heat death of the universe:

He notes, correctly, that the Second Law hasn’t been “proved”. But he’s impressed at how Kelvin was able to go from a “mathematical formula” to a global fact about the fate of the universe:

He ends the whole lecture quite poetically:

We’ve talked quite a bit about Kelvin and how his ideas spread. But let’s turn now to Rudolf Clausius, who in 1850 at least to some extent “scooped” Kelvin on the Second Law. At that time Clausius was a freshly minted German physics PhD. His thesis had been on an ingenious but ultimately incorrect theory of why the sky is blue. But he’d also worked on elasticity theory, and there he’d been led to start thinking about molecules and their configurations in materials. By 1850 caloric theory had become fairly elaborate, complete with concepts like “latent heat” (bound to molecules) and “free heat” (able to be transferred). Clausius’s experience in elasticity theory made him skeptical, and knowing Mayer’s and Joule’s results he decided to break with the caloric theory—writing his career-launching paper (translated from German in 1851, with Carnot’s *puissance motrice* [“motive power”] being rendered as “moving force”):

The first installment of the English version of the paper gives a clear description of the ideal gas laws and the Carnot cycle, having started from a statement of the “caloric-busting” First Law:

The general discussion continues in the second installment, but now there’s a critical side comment that describes the “general deportment of heat, which every-where exhibits the tendency to annul differences of temperature, and therefore to pass from a warmer body to a colder one”:

Clausius “has” the Second Law, as Carnot basically did before him. But when Kelvin quotes Clausius he does so much more forcefully:

But there it is: by 1852 the Second Law is out in the open, in at least two different forms. The path to reach it has been circuitous and quite technical. But in the end, stripped of its technical origins, the law seems somehow unsurprising and even obvious. For it’s a matter of common experience that heat flows from hotter bodies to colder ones, and that motion is dissipated by friction into heat. But the point is that it wasn’t until basically 1850 that the overall scientific framework existed to make it useful—or even really possible—to enunciate such observations as a formal scientific law.

Of course the fact that a law “seems true” based on common experience doesn’t mean it’ll always be true, and that there won’t be some special circumstance or elaborate construction that will evade it. But somehow the very fact that the Second Law had in a sense been “technically hard won”—yet in the end seemed so “obvious”—appears to have given it a sense of inevitability and certainty. And it didn’t hurt that somehow it seemed to have emerged from Carnot’s work, which had a certain air of “logical necessity”. (Of course, in reality, the Second Law entered Carnot’s logical structure as an “axiom”.) But all this helped set the stage for some of the curious confusions about the Second Law that would develop over the century that followed.

In the first half of the 1850s the Second Law had in a sense been presented in two ways. First, as an almost “footnote-style” assumption needed to support the “pure thermodynamics” that had grown out of Carnot’s work. And second, as an explicitly-stated-for-the-first-time—if “obvious”—“everyday” feature of nature, that was now realized as having potentially cosmic significance. But an important feature of the decade that followed was a certain progressive at-least-phenomenological “mathematicization” of the Second Law—pursued most notably by Rudolf Clausius.

In 1854 Clausius was already beginning this process. Perhaps confusingly, he refers to the Second Law as the “second fundamental theorem [*Hauptsatz*]” in the “mechanical theory of heat”—suggesting it’s something that is proved, even though it’s really introduced just as an empirical law of nature, or perhaps a theoretical axiom:

He starts off by discussing the “first fundamental theorem”, i.e. the First Law. And he emphasizes that this implies that there’s a quantity *U* (which we now call “internal energy”) that is a pure “function of state”—so that its value depends only on the state of a system, and not the path by which that state was reached. And as an “application” of this, he then points out that the overall change in *U* in a cyclic process (like the one executed by Carnot’s heat engine) must be zero.

And now he’s ready to tackle the Second Law. He gives a statement that at first seems somewhat convoluted:

But soon he’s deriving this from a more “everyday” statement of the Second Law (which, notably, is clearly not a “theorem” in any normal sense):

After giving a Carnot-style argument he’s then got a new statement (that he calls “the theorem of the equivalence of transformations”) of the Second Law:

And there it is: basically what we now call entropy (even with the same notation of *Q* for heat and *T* for temperature)—together with the statement that this quantity is a function of state, so that its differences are “independent of the nature of the process by which the transformation is effected”.

Pretty soon there’s a familiar expression for entropy change:

And by the next page he’s giving what he describes as “the analytical expression” of the Second Law, for the particular case of reversible cyclic processes:

A bit later he backs out of the assumption of reversibility, concluding that:

(And, yes, with modern mathematical rigor, that should be “non-negative” rather than “positive”.)

He goes on to say that if something has changed after going around a cycle, he’ll call that an “uncompensated transformation”—or what we would now refer to as an irreversible change. He lists a few possible (now very familiar) examples:

Earlier in his paper he’s careful to say that *T* is “a function of temperature”; he doesn’t say it’s actually the quantity we measure as temperature. But now he wants to determine what it is:

He doesn’t talk about the ultimately critical assumption (effectively the Zeroth Law of thermodynamics) that the system is “in equilibrium”, with a uniform temperature. But he uses an ideal gas as a kind of “standard material”, and determines that, yes, in that case *T* can be simply the absolute temperature.

So there it is: in 1854 Clausius has effectively defined entropy and described its relation to the Second Law, though everything is being done in a very “heat-engine” style. And pretty soon he’s writing about “Theory of the Steam-Engine” and filling actual approximate steam tables into his theoretical formulas:

After a few years “off” (working, as we’ll discuss later, on the kinetic theory of gases) Clausius is back in 1862 talking about the Second Law again, in terms of his “theorem of the equivalence of transformations”:

He’s slightly tightened up his 1854 discussion, but, more importantly, he’s now stating a result not just for reversible cyclic processes, but for general ones:

But what does this result really mean? Clausius claims that this “theorem admits of strict mathematical proof if we start from the fundamental proposition above quoted”—though it’s not particularly clear just what that proposition is. But then he says he wants to find a “physical cause”:

A little earlier in the paper he said:

So what does he think the “physical cause” is? He says that even from his first investigations he’d assumed a general law:

What are these “resistances”? He’s basically saying they are the forces between molecules in a material (which from his work on the kinetic theory of gases he now imagines exist):

He introduces what he calls the “disgregation” to represent the microscopic effect of adding heat:

For ideal gases things are straightforward, including the proportionality of “resistance” to absolute temperature. But in other cases, it’s not so clear what’s going on. A decade later he identifies “disgregation” with average kinetic energy per molecule—which is indeed proportional to absolute temperature. But in 1862 it’s all still quite muddy, with somewhat curious statements like:

And then the main part of the paper ends with what seems to be an anticipation of the Third Law of thermodynamics:

There’s an appendix entitled “On Terminology” which admits that between Clausius’s own work, and other people’s, it’s become rather difficult to follow what’s going on. He agrees that the term “energy” that Kelvin is using makes sense. He suggests “energy of the body” for what he calls *U* and we now call “internal energy”. He suggests “heat of the body” or “thermal content of the body” for *Q*. But then he talks about the fact that these are measured in thermal units (say the amount of heat needed to increase the temperature of water by 1°), while mechanical work is measured in units related to kilograms and meters. He proposes therefore to introduce the concept of “ergon” for “work measured in thermal units”:

And pretty soon he’s talking about the “interior ergon” and “exterior ergon”, as well as concepts like “ergonized heat”. (In later work he also tries to introduce the concept of “ergal” to go along with his development of what he called—in a name that did stick—the “virial theorem”.)

But in 1865 he has his biggest success in introducing a term. He’s writing a paper, he says, basically to clarify the Second Law, (or, as he calls it, “the second fundamental theorem”—rather confidently asserting that he will “prove this theorem”):

Part of the issue he’s trying to address is how the calculus is done:

The partial derivative symbol ∂ had been introduced in the late 1700s. He doesn’t use it, but he does introduce the now-standard-in-thermodynamics subscript notation for variables that are kept constant:

A little later, as part of the “notational cleanup”, we see the variable *S*:

And then—there it is—Clausius introduces the term “entropy”, “Greekifying” his concept of “transformation”:

His paper ends with his famous crisp statements of the First and Second Laws of thermodynamics—manifesting the parallelism he’s been claiming between energy and entropy:

We began above by discussing the history of the question of “What is heat?” Was it like a fluid—the caloric theory? Or was it something more dynamical, and in a sense more abstract? But then we saw how Carnot—followed by Kelvin and Clausius—managed in effect to sidestep the question, and come up with all sorts of “thermodynamic conclusions”, by talking just about “what heat does” without ever really having to seriously address the question of “what heat is”. But to be able to discuss the foundations of the Second Law—and what it says about heat—we have to know more about what heat actually is. And the crucial development that began to clarify the nature of heat was the kinetic theory of gases.

Central to the kinetic theory of gases is the idea that gases are made up of discrete molecules. And it’s important to remember that it wasn’t until the beginning of the 1900s that anyone knew for sure that molecules existed. Yes, something like them had been discussed ever since antiquity, and in the 1800s there was increasing “circumstantial evidence” for them. But nobody had directly “seen a molecule”, or been able, for example, until about 1870, to even guess what the size of molecules might be. Still, by the mid-1800s it had become common for physicists to talk and reason in terms of ordinary matter at least effectively being made of up molecules.

But if a gas was made of molecules bouncing off each other like billiard balls according to the laws of mechanics, what would its overall properties be? Daniel Bernoulli had in 1738 already worked out the basic answer that pressure would vary inversely with volume, or in his notation, π = *P*/*s* (and he even also gave formulas for molecules of nonzero size—in a precursor of van der Waals):

Results like Bernouilli’s would be rediscovered several times, for example in 1820 by John Herapath (1790–1868), a math teacher in England, who developed a fairly elaborate theory that purported to describe gravity as well as heat (but for example implied a *P V* = *a T*^{2} gas law):

Then there was the case of John Waterston (1811–1883), a naval instructor for the East India company, who in 1843 published a book called *Thoughts on the Mental Functions*, which included results on what he called the “*vis viva* theory of heat”—that he developed in more detail in a paper he wrote in 1846. But when he submitted the paper to the Royal Society it was rejected as “nonsense”, and its manuscript was “lost” until 1891 when it was finally published (with an “explanation” of the “delay”):

The paper had included a perfectly sensible mathematical analysis that included a derivation of the kinetic theory relation between pressure and mean-square molecular velocity:

But with all these pieces of work unknown, it fell to a German high-school chemistry teacher (and sometime professor and philosophical/theological writer) named August Krönig (1822–1879) to publish in 1856 yet another “rediscovery”, that he entitled “Principles of a Theory of Gases”. He said it was going to analyze the “mechanical theory of heat”, and once again he wanted to compute the pressure associated with colliding molecules. But to simplify the math, he assumed that molecules went only along the coordinate directions, at a fixed speed—almost anticipating a cellular automaton fluid:

What ultimately launched the subsequent development of the kinetic theory of gases, however, was the 1857 publication by Rudolf Clausius (by then an increasingly established German physics professor) of a paper entitled rather poetically “On the Nature of the Motion Which We Call Heat” (“Über die Art der Bewegung die wir Wärme nennen”):

It’s a clean and clear paper, with none of the mathematical muddiness around Clausius’s work on the Second Law (which, by the way, isn’t even mentioned in this paper even though Clausius had recently worked on it). Clausius figures out lots of the “obvious” implications of his molecular theory, outlining for example what happens in different phases of matter:

It takes him only a couple of pages of very light mathematics to derive the standard kinetic theory formula for the pressure of an ideal gas:

He’s implicitly assuming a certain randomness to the motions of the molecules, but he barely mentions it (and this particular formula is robust enough that average values are actually all that matter):

But having derived the formula for pressure, he goes on to use the ideal gas law to derive the relation between average molecular kinetic energy (which he still calls “*vis viva*”) and absolute temperature:

From this he can do things like work out the actual average velocities of molecules in different gases—which he does without any mention of the question of just how real or not molecules might be. By knowing experimental results about specific heats of gases he also manages to determine that not all the energy (“heat”) in a gas is associated with “translatory motion”: he realizes that for molecules involving several atoms there can be energy associated with other (as we would now say) internal degrees of freedom:

Clausius’s paper was widely read. And it didn’t take long before the Dutch meteorologist (and effectively founder of the World Meteorological Organization) Christophorus Buys Ballot (1817–1890) asked why—if molecules were moving as quickly as Clausius suggested—gases didn’t mix much more quickly than they’re observed to do:

Within a few months, Clausius published the answer: the molecules didn’t just keep moving in straight lines; they were constantly being deflected, to follow what we would now call a random walk. He invented the concept of a mean free path to describe how far on average a molecule goes before it hits another molecule:

As a capable theoretical physicist, Clausius quickly brings in the concept of probability

and is soon computing the average number of molecules which will survive undeflected for a certain distance:

Then he works out the mean free path λ (and it’s often still called λ):

And he concludes that actually there’s no conflict between rapid microscopic motion and large-scale “diffusive” motion:

Of course, he could have actually drawn a sample random walk, but drawing diagrams wasn’t his style. And in fact it seems as if the first published drawing of a random walk was something added by John Venn (1834–1923) in the 1888 edition of his *Logic of Chance*—and, interestingly, in alignment with my computational irreducibility concept from a century later he used the digits of π to generate his “randomness”:

In 1859, Clausius’s paper came to the attention of the then-28-year-old James Clerk Maxwell, who had grown up in Scotland, done the Mathematical Tripos in Cambridge, and was now back in Scotland as professor of “natural philosophy” at Aberdeen. Maxwell had already worked on things like elasticity theory, color vision, the mechanics of tops, the dynamics of the rings of Saturn and electromagnetism—having published his first paper (on geometry) at age 14. And, by the way, Maxwell was quite a “diagrammist”—and his early papers include all sorts of pictures that he drew:

But in 1859 Maxwell applied his talents to what he called the “dynamical theory of gases”:

He models molecules as hard spheres, and sets about computing the “statistical” results of their collisions:

And pretty soon he’s trying to compute distribution of their velocities:

It’s a somewhat unconvincing (or, as Maxwell himself later put it, “precarious”) derivation (how does it work in 1D, for example?), but somehow it manages to produce what’s now known as the Maxwell distribution:

Maxwell observes that the distribution is the same as for “errors … in the ‘method of least squares’”:

Maxwell didn’t get back to the dynamical theory of gases until 1866, but in the meantime he was making a “dynamical theory” of something else: what he called the electromagnetic field:

Even though he’d worked extensively with the inverse square law of gravity he didn’t like the idea of “action at a distance”, and for example he wanted magnetic field lines to have some underlying “material” manifestation

imagining that they might be associated with arrays of “molecular vortices”:

We now know, of course, that there isn’t this kind of “underlying mechanics” for the electromagnetic field. But—with shades of the story of Carnot—even though the underlying framework isn’t right, Maxwell successfully derives correct equations for the electromagnetic field—that are now known as Maxwell’s equations:

His statement of how the electromagnetic field “works” is highly reminiscent of the dynamical theory of gases:

But he quickly and correctly adds:

And a few sections later he derives the idea of general electromagnetic waves

noting that there’s no evidence that the medium through which he assumes they’re propagating has elasticity:

By the way, when it comes to gravity he can’t figure out how to make his idea of a “mechanical medium” work:

But in any case, after using it as an inspiration for thinking about electromagnetism, Maxwell in 1866 returns to the actual dynamical theory of gases, still feeling that he needs to justify looking at a molecular theory:

And now he gives a recognizable (and correct, so far as it goes) derivation of the Maxwell distribution:

He goes on to try to understand experimental results on gases, about things like diffusion, viscosity and conductivity. For some reason, Maxwell doesn’t want to think of molecules, as he did before, as hard spheres. And instead he imagines that they have “action at a distance” forces, which basically work like hard squares if it’s *r*^{-5} force law:

In the years that followed, Maxwell visited the dynamical theory of gases several more times. In 1871, a few years before he died at age 48, he wrote a textbook entitled *Theory of Heat*, which begins, in erudite fashion, discussing what “thermodynamics” should even be called:

Most of the book is concerned with the macroscopic “theory of heat”—though, as we’ll discuss later, in the very last chapter Maxwell does talk about the “molecular theory”, if in somewhat tentative terms.

The Second Law was in effect originally introduced as a formalization of everyday observations about heat. But the development of kinetic theory seemed to open up the possibility that the Second Law could actually be proved from the underlying mechanics of molecules. And this was something that Ludwig Boltzmann (1844–1906) embarked on towards the end of his physics PhD at the University of Vienna. In 1865 he’d published his first paper (“On the Movement of Electricity on Curved Surfaces”), and in 1866 he published his second paper, “On the Mechanical Meaning of the Second Law of Thermodynamics”:

The introduction promises “a purely analytical, perfectly general proof of the Second Law”. And what he seemed to imagine was that the equations of mechanics would somehow inevitably lead to motion that would reproduce the Second Law. And in a sense what computational irreducibility, rule 30, etc. now show is that in the end that’s indeed basically how things work. But the methods and conceptual framework that Boltzmann had at his disposal were very far away from being able to see that. And instead what Boltzmann did was to use standard mathematical methods from mechanics to compute average properties of cyclic mechanical motions—and then made the somewhat unconvincing claim that combinations of these averages could be related (e.g. via temperature as average kinetic energy) to “Clausius’s entropy”:

It’s not clear how much this paper was read, but in 1871 Boltzmann (now a professor of mathematical physics in Graz) published another paper entitled simply “On the Priority of Finding the Relationship between the Second Law of Thermodynamics and the Principle of Least Action” that claimed (with some justification) that Clausius’s then-newly-announced virial theorem was already contained in Boltzmann’s 1866 paper.

But back in 1868—instead of trying to get all the way to Clausius’s entropy—Boltzmann instead uses mechanics to get a generalization of Maxwell’s law for the distribution of molecular velocities. His paper “Studies on the Equilibrium of [Kinetic Energy] between [Point Masses] in Motion” opens by saying that while analytical mechanics has in effect successfully studied the evolution of mechanical systems “from a given state to another”, it’s had little to say about what happens when such systems “have been left moving on their own for a long time”. He intends to remedy that, and spends 47 pages—complete with elaborate diagrams and formulas about collisions between hard spheres—in deriving an exponential distribution of energies if one assumes “equilibrium” (or, more specifically, balance between forward and backward processes):

It’s notable that one of the mathematical approaches Boltzmann uses is to discretize (i.e. effectively quantize) things, then look at the “combinatorial” limit. (Based on his later statements, he didn’t want to trust “purely continuous” mathematics—at least in the context of discrete molecular processes—and wanted to explicitly “watch the limits happening”.) But in the end it’s not clear that Boltzmann’s 1868 arguments do more than the few-line functional-equation approach that Maxwell had already used. (Maxwell would later complain about Boltzmann’s “overly long” arguments.)

Boltzmann’s 1868 paper had derived what the distribution of molecular energies should be “in equilibrium”. (In 1871 he was talking about “equipartition” not just of kinetic energy, but also of energies associated with “internal motion” of polyatomic molecules.) But what about the approach to equilibrium? How would an initial distribution of molecular energies evolve over time? And would it always end up at the exponential (“Maxwell–Boltzmann”) distribution? These are questions deeply related to a microscopic understanding of the Second Law. And they’re what Boltzmann addressed in 1872 in his 22nd published paper “Further Studies on the Thermal Equilibrium of Gas Molecules”:

Boltzmann explains that:

Maxwell already found the value

Av^{2}e^{–Bv2}[for the distribution of velocities] … so that the probability of different velocities is given by a formula similar to that for the probability of different errors of observation in the theory of the method of least squares. The first proof which Maxwell gave for this formula was recognized to be incorrect even by himself. He later gave a very elegant proof that, if the above distribution has once been established, it will not be changed by collisions. He also tries to prove that it is the only velocity distribution that has this property. But the latter proof appears to me to contain a false inference. It has still not yet been proved that, whatever the initial state of the gas may be, it must always approach the limit found by Maxwell. It is possible that there may be other possible limits. This proof is easily obtained, however, by the method which I am about to explain…

(He gives a long footnote explaining why Maxwell might be wrong, talking about how a sequence of collisions might lead to a “cycle of velocity states”—which Maxwell hasn’t proved will be traversed with equal probability in each direction. Ironically, this is actually already an analog of where things are going to go wrong with Boltzmann’s own argument.)

The main idea of Boltzmann’s paper is not to assume equilibrium, but instead to write down an equation (now called the Boltzmann Transport Equation) that explicitly describes how the velocity (or energy) distribution of molecules will change as a result of collisions. He begins by defining infinitesimal changes in time:

He then goes through a rather elaborate analysis of velocities before and after collisions, and how to integrate over them, and eventually winds up with a partial differential equation for the time variation of the energy distribution (yes, he confusingly uses *x* to denote energy)—and argues that Maxwell’s exponential distribution is a stationary solution to this equation:

A few paragraphs further on, something important happens: Boltzmann introduces a function that here he calls *E*, though later he’ll call it *H*:

Ten pages of computation follow

and finally Boltzmann gets his main result: if the velocity distribution evolves according to his equation, *H* can never increase with time, becoming zero for the Maxwell distribution. In other words, he is saying that he’s proved that a gas will always (“monotonically”) approach equilibrium—which seems awfully like some kind of microscopic proof of the Second Law.

But then Boltzmann makes a bolder claim:

It has thus been rigorously proved that, whatever the initial distribution of kinetic energy may be, in the course of a very long time it must always necessarily approach the one found by Maxwell. The procedure used so far is of course nothing more than a mathematical artifice employed in order to give a rigorous proof of a theorem whose exact proof has not previously been found. It gains meaning by its applicability to the theory of polyatomic gas molecules. There one can again prove that a certain quantity

Ecan only decrease as a consequence of molecular motion, or in a limiting case can remain constant. One can also prove that for the atomic motion of a system of arbitrarily many material points there always exists a certain quantity which, in consequence of any atomic motion, cannot increase, and this quantity agrees up to a constant factor with the value found for the well-known integral ∫dQ/Tin my [1871] paper on the “Analytical proof of the 2nd law, etc.”. We have therefore prepared the way for an analytical proof of the Second Law in a completely different way from those previously investigated. Up to now the object has been to show that ∫dQ/T= 0 for reversible cyclic processes, but it has not been proved analytically that this quantity is always negative for irreversible processes, which are the only ones that occur in nature. The reversible cyclic process is only an ideal, which one can more or less closely approach but never completely attain. Here, however, we have succeeded in showing that ∫dQ/Tis in general negative, and is equal to zero only for the limiting case, which is of course the reversible cyclic process (since if one can go through the process in either direction, ∫dQ/Tcannot be negative).

In other words, he’s saying that the quantity *H* that he’s defined microscopically in terms of velocity distributions can be identified (up to a sign) with the entropy that Clausius defined as *dQ*/*T*. He says that he’ll show this in the context of analyzing the mechanics of polyatomic molecules.

But first he’s going to take a break and show that his derivation doesn’t need to assume continuity. In a pre-quantum-mechanics pre-cellular-automaton-fluid kind of way he replaces all the integrals by limits of sums of discrete quantities (i.e. he’s quantizing kinetic energy, etc.):

He says that this discrete approach makes everything clearer, and quotes Lagrange’s derivation of vibrations of a string as an example of where this has happened before. But then he argues that everything works out fine with the discrete approach, and that *H* still decreases, with the Maxwell distribution as the only possible end point. As an aside, he makes a jab at Maxwell’s derivation, pointing out that with Maxwell’s functional equation:

… there are infinitely many other solutions, which are not useful however since ƒ(

x) comes out negative or imaginary for some values ofx. Hence, it follows very clearly that Maxwell’s attempt to provea priorithat his solution is the only one must fail, since it is not the only one but rather it is the only one that gives purely positive probabilities, and therefore the only useful one.

But finally—after another aside about computing thermal conductivities of gases—Boltzmann digs into polyatomic molecules, and his claim about *H* being related to entropy. There’s another 26 pages of calculations, and then we get to a section entitled “Solution of Equation (81) and Calculation of Entropy”. More pages of calculation about polyatomic molecules ensue. But finally we’re computing *H*, and, yes, it agrees with the Clausius result—but anticlimactically he’s only dealing with the case of equilibrium for monatomic molecules, where we already knew we got the Maxwell distribution:

And now he decides he’s not talking about polyatomic molecules anymore, and instead:

In order to find the relation of the quantity [

H] to the second law of thermodynamics in the form ∫dQ/T< 0, we shall interpret the system of mass points not, as previously, as a gas molecule, but rather as an entire body.

But then, in the last couple of pages of his paper, Boltzmann pulls out another idea. He’s discussed the concept that polyatomic molecules (or, now, whole systems) can be in many different configurations, or “phases”. But now he says: “We shall replace [our] single system by a large number of equivalent systems distributed over many different phases, but which do not interact with each other”. In other words, he’s introducing the idea of an ensemble of states of a system. And now he says that instead of looking at the distribution just for a single velocity, we should do it for all velocities, i.e. for the whole “phase” of the system.

[These distributions] may be discontinuous, so that they have large values when the variables are very close to certain values determined by one or more equations, and otherwise vanishingly small. We may choose these equations to be those that characterize visible external motion of the body and the kinetic energy contained in it. In this connection it should be noted that the kinetic energy of visible motion corresponds to such a large deviation from the final equilibrium distribution of kinetic energy

that it leads to an infinity inH, so that from the point of view of the Second Law of thermodynamics it acts like heat supplied from an infinite temperature.

There are a bunch of ideas swirling around here. Phase-space density (*cf*. Liouville’s equation). Coarse-grained variables. Microscopic representation of mechanical work. Etc. But the paper is ending. There’s a discussion about *H* for systems that interact, and how there’s an equilibrium value achieved. And finally there’s a formula for entropy

that Boltzmann said “agrees … with the expression I found in my previous [1871] paper”.

So what exactly did Boltzmann really do in his 1872 paper? He introduced the Boltzmann Transport Equation which allows one to compute at least certain non-equilibrium properties of gases. But is his ƒ log ƒ quantity really what we can call “entropy” in the sense Clausius meant? And is it true that he’s proved that entropy (even in his sense) increases? A century and a half later there’s still a remarkable level of confusion around both these issues.

But in any case, back in 1872 Boltzmann’s “minimum theorem” (now called his “*H* theorem”) created quite a stir. But after some time there was an objection raised, which we’ll discuss below. And partly in response to this, Boltzmann (after spending time working on microscopic models of electrical properties of materials—as well as doing some actual experiments) wrote another major paper on entropy and the Second Law in 1877:

The translated title of the paper is “On the Relation between the Second Law of Thermodynamics and Probability Theory with Respect to the Laws of Thermal Equilibrium”. And at the very beginning of the paper Boltzmann makes a statement that was pivotal for future discussions of the Second Law: he says it’s now clear to him that an “analytical proof” of the Second Law is “only possible on the basis of probability calculations”. Now that we know about computational irreducibility and its implications one could say that this was the point where Boltzmann and those who followed him went off track in understanding the true foundations of the Second Law. But Boltzmann’s idea of introducing probability theory was effectively what launched statistical mechanics, with all its rich and varied consequences.

Boltzmann makes his basic claim early in the paper

with the statement (quoting from a comment in a paper he’d written earlier the same year) that “it is clear” (always a dangerous thing to say!) that in thermal equilibrium all possible states of the system—say, spatially uniform and nonuniform alike—are equally probable

… comparable to the situation in the game of Lotto where every single quintet is as improbable as the quintet 12345. The higher probability that the state distribution becomes uniform with time arises only because there are far more uniform than nonuniform state distributions…

He goes on:

[Thus] it is possible to calculate the thermal equilibrium state by finding the probability of the different possible states of the system. The initial state will in most cases be highly improbable but from it the system will always rapidly approach a more probable state until it finally reaches the most probable state, i.e., that of thermal equilibrium. If we apply this to the Second Law we will be able to identify the quantity which is usually called entropy with the probability of the particular state…

He’s talked about thermal equilibrium, even in the title, but now he says:

… our main purpose here is not to limit ourselves to thermal equilibrium, but to explore the relationship of the probabilistic formulation to the [Second Law].

He says his goal is to calculate probability distribution for different states, and he’ll start with

as simple a case as possible, namely a gas of rigid absolutely elastic spherical molecules trapped in a container with absolutely elastic walls. (Which interact with central forces only within a certain small distance, but not otherwise; the latter assumption, which includes the former as a special case, does not change the calculations in the least).

In other words, yet again he’s going to look at hard sphere gases. But, he says:

Even in this case, the application of probability theory is not easy. The number of molecules is not infinite, in a mathematical sense, yet the number of velocities each molecule is capable of is effectively infinite. Given this last condition, the calculations are very difficult; to facilitate understanding, I will, as in earlier work, consider a limiting case.

And this is where he “goes discrete” again—allowing (“cellular-automaton-style”) only discrete possible velocities for each molecule:

He says that upon colliding, two molecules can exchange these discrete velocities, but nothing more. As he explains, though:

Even if, at first sight, this seems a very abstract way of treating the problem, it rapidly leads to the desired objective, and when you consider that in nature all infinities are but limiting cases, one assumes each molecule can behave in this fashion only in the limiting case where each molecule can assume more and more values of the velocity.

But now—much like in an earlier paper—he makes things even simpler, saying he’s going to ignore velocities for now, and just say that the possible energies of molecules are “in an arithmetic progression”:

He plans to look at collisions, but first he just wants to consider the combinatorial problem of distributing these energies among *n* molecules in all possible ways, subject to the constraint of having a certain fixed total energy. He sets up a specific example, with 7 molecules, total energy 7, and maximum energy per molecule 7—then gives an explicit table of all possible states (up to, as he puts it, “immaterial permutations of molecular labels”):

Tables like this had been common for nearly two centuries in combinatorial mathematics books like Jacob Bernoulli’s (1655–1705) *Ars Conjectandi*

but this might have been the first place such a table had appeared in a paper about fundamental physics.

And now Boltzmann goes into an analysis of the distribution of states—of the kind that’s now long been standard in textbooks of statistical physics, but will then have been quite unfamiliar to the pure-calculus-based physicists of the time:

He derives the average energy per molecule, as well as the fluctuations:

He says that “of course” the real interest is in the limit of an infinite number of molecules, but he still wants to show that for “moderate values” the formulas remain quite accurate. And then (even without Wolfram Language!) he’s off finding (using Newton’s method it seems) approximate roots of the necessary polynomials:

Just to show how it all works, he considers a slightly larger case as well:

Now he’s computing the probability that a given molecule has a particular energy

and determining that in the limit it’s an exponential

that is, as he says, “consistent with that known from gases in thermal equilibrium”.

He claims that in order to really get a “mechanical theory of heat” it’s necessary to take a continuum limit. And here he concludes that thermal equilibrium is achieved by maximizing the quantity Ω (where the “*l*” stands for log, so this is basically ƒ log ƒ):

He explains that Ω is basically the log of the number of possible permutations, and that it’s “of special importance”, and he’ll call it the “permutability measure”. He immediately notes that “the total permutability measure of two bodies is equal to the sum of the permutability measures of each body”. (Note that Boltzmann’s Ω isn’t the modern total-number-of-states Ω; confusingly, that’s essentially the exponential of Boltzmann’s Ω.)

He goes through some discussion of how to handle extra degrees of freedom in polyatomic molecules, but then he’s on to the main event: arguing that Ω is (essentially) the entropy. It doesn’t take long:

Basically he just says that in equilibrium the probability ƒ(…) for a molecule to have a particular velocity is given by the Maxwell distribution, then he substitutes this into the formula for Ω, and shows that indeed, up to a constant, Ω is exactly the “Clausius entropy” ∫*dQ*/*T*.

So, yes, in equilibrium Ω seems to be giving the entropy. But then Boltzmann makes a bit of a jump. He says that in processes that aren’t reversible both “Clausius entropy” and Ω will increase, and can still be identified—and enunciates the general principle, printed in his paper in special doubled-spaced form:

… [In] any system of bodies that undergoes state changes … even if the initial and final states are not in thermal equilibrium … the total permutability measure for the bodies will continually increase during the state changes, and can remain constant only so long as all the bodies during the state changes remain infinitely close to thermal equilibrium (reversible state changes).

In other words, he’s asserting that Ω behaves the same way entropy is said to behave according to the Second Law. He gives various thought experiments about gases in boxes with dividers, gases under gravity, etc. And finally concludes that, yes, the relationship of entropy to Ω “applies to the general case”.

There’s one final paragraph in the paper, though:

Up to this point, these propositions may be demonstrated exactly using the theory of gases. If one tries, however, to generalize to liquid drops and solid bodies, one must dispense with an exact treatment from the outset, since far too little is known about the nature of the latter states of matter, and the mathematical theory is barely developed. But I have already mentioned reasons in previous papers, in virtue of which it is likely that for these two aggregate states, the thermal equilibrium is achieved when Ω becomes a maximum, and that when thermal equilibrium exists, the entropy is given by the same expression. It can therefore be described as likely that the validity of the principle which I have developed is not just limited to gases, but that the same constitutes a general natural law applicable to solid bodies and liquid droplets, although the exact mathematical treatment of these cases still seems to encounter extraordinary difficulties.

Interestingly, Boltzmann is only saying that it’s “likely” that in thermal equilibrium his permutability measure agrees with Clausius’s entropy, and he’s implying that actually that’s really the only place where Clausius’s entropy is properly defined. But certainly his definition is more general (after all, it doesn’t refer to things like temperature that are only properly defined in equilibrium), and so—even though Boltzmann didn’t explicitly say it—one can imagine basically just using it as the definition of entropy for arbitrary cases. Needless to say, the story is actually more complicated, as we’ll see soon.

But this definition of entropy—crispened up by Max Planck (1858–1947) and with different notation—is what ended up years later “written in stone” at Boltzmann’s grave:

In his 1877 paper Boltzmann had made the claim that in equilibrium all possible microscopic states of a system would be equally probable. But why should this be true? One reason could be that in its pure “mechanical evolution” the system would just successively visit all these states. And this was an idea that Boltzmann seems to have had—with increasing clarity—from the time of his very first paper in 1866 that purported to “prove the Second Law” from mechanics.

In modern times—with our understanding of discrete systems and computational rules—it’s not difficult to describe the idea of “visiting all states”. But in Boltzmann’s time it was considerably more complicated. Did one expect to hit all the infinite possible infinitesimally separated configurations of a system? Or somehow just get close? The fact is that Boltzmann had certainly dipped his toe into thinking about things in terms of discrete quantities. But he didn’t make the jump to imagining discrete rules, even though he certainly did know about discrete iterative processes, like Newton’s method for finding roots.

Boltzmann knew about cases—like circular motion—where everything was purely periodic. But maybe when motion wasn’t periodic, it’d inevitably “visit all states”. Already in 1868 Boltzmann was writing a paper entitled “Solution to a Mechanical Problem” where he studies a single point mass moving in an α/*r *– β/*r*^{2} potential and bouncing elastically off a line—and manages to show that it visits every position with equal probability. In this paper he’s just got traditional formulas, but by 1871, in “Some General Theorems about Thermal Equilibrium”—computing motion in the same potential as before—he’s got a picture:

Boltzmann probably knew about Lissajous figures—cataloged in 1857

and the fact that in this case a rational ratio of *x* and *y* periods gives a periodic overall curve while an irrational one always gives a curve that visits every position might have led him to suspect that all systems would either be periodic, or would visit every possible configuration (or at least, as he identified in his paper, every configuration that had the same values of “constants of the motion”, like energy).

In early 1877 Boltzmann returned to the same question, including as one section in his “Remarks on Some Problems in the Mechanical Theory of Heat” more analysis of the same potential as before, but now showing a diversity of more complicated pictures that almost seem to justify his rule-30-before-its-time idea that there could be “pure mechanics” that would lead to “Second Law” behavior:

In modern times, of course, it’s easy to solve those equations of motion, and typical results obtained for an array of values of parameters are:

Boltzmann returned to these questions in 1884, responding to Helmholtz’s analysis of what he was calling “monocyclic systems”. Boltzmann used the same potential again, but now with a name for the “visit-all-states” property: isodic. Meanwhile, Boltzmann had introduced the name “ergoden” for the collection of all possible configurations of a system with a given energy (what would now be called the microcanonical ensemble). But somehow, quite a few years later, Boltzmann’s student Paul Ehrenfest (1880–1933) (along with Tatiana Ehrenfest-Afanassjewa (1876–1964)) would introduce the term “ergodic” for Boltzmann’s isodic. And “ergodic” is the term that caught on. And in the twentieth century there was all sorts of development of “ergodic theory”, as we’ll discuss a bit later.

But back in the 1800s people continued to discuss the possibility that what would become called ergodicity was somehow generic, and would explain why all states would somehow be equally probable, why the Maxwell distribution of velocities would be obtained, and ultimately why the Second Law was true. Maxwell worked out some examples. So did Kelvin. But it remained unclear how it would all work out, as Kelvin (now with many letters after his name) discussed in a talk he gave in 1900 celebrating the new century:

The dynamical theory of light didn’t work out. And about the dynamical theory of heat, he quotes Maxwell (following Boltzmann) in one of his very last papers, published in 1878, as saying, in reference to what amounts to a proof of the Second Law from underlying dynamics:

Kelvin talks about exploring test cases:

When, for example, is the motion of a single particle bouncing around in a fixed region ergodic? He considers first an ellipse, and proves that, no, there isn’t in general ergodicity there:

Then he goes on to the much more complicated case

and now he does an “experiment” (with a rather Monte Carlo flavor):

Kelvin considers a few other examples

but mostly concludes that he can’t tell in general about ergodicity—and that probably something else is needed, or as he puts it (somehow wrapping the theory of light into the story as well):

Had Boltzmann’s 1872 *H* theorem proved the Second Law? Was the Second Law—with its rather downbeat implication about the heat death of the universe—even true? One skeptic was Boltzmann’s friend and former teacher, the chemist Josef Loschmidt (1821–1895), who in 1866 had used kinetic theory to (rather accurately) estimate the size of air molecules. And in 1876 Loschmidt wrote a paper entitled “On the State of Thermal Equilibrium in a System of Bodies with Consideration of Gravity” in which he claimed to show that when gravity was taken into account, there wouldn’t be uniform thermal equilibrium, the Maxwell distribution, or the Second Law—and thus, as he poetically explained:

The terroristic nimbus of the Second Law is destroyed, a nimbus which makes that Second Law appear as the annihilating principle of all life in the universe—and at the same time we are confronted with the comforting perspective that, as far as the conversion of heat into work is concerned, mankind will not solely be dependent on the intervention of coal or of the Sun, but will have available an inexhaustible resource of convertible heat at all times.

His main argument revolves around a thought experiment involving molecules in a gravitational field:

Over the next couple of years, despite Loschmidt’s progressively more elaborate constructions

Boltzmann and Maxwell will debunk this particular argument—even though to this day the role of gravity in relation to the Second Law remains incompletely resolved.

But what’s more important for our narrative about Loschmidt’s original paper are a couple of paragraphs tucked away at the end of one section (that in fact Kelvin had basically anticipated in 1874):

[Consider what would happen if] after a time

tsufficiently long for the stationary state to obtain, we suddenly reversed the velocities of all atoms. Initially we would be in a state that would look like the stationary state. This would be true for some time, but in the long run the stationary state would deteriorate and after the timetwe would inevitably return to the initial state…It is clear that in general in any system one can revert the entire course of events by suddenly inverting the velocities of all the elements of the system. This doesn’t give a solution to the problem of undoing everything that happens [in the universe] but it does give a simple prescription: just suddenly revert the instantaneous velocities of all atoms of the universe.

How did this relate to the *H* theorem? The underlying molecular equations of motion that Boltzmann had assumed in his proof were reversible in time. Yet Boltzmann claimed that *H* was always going to a minimum. But why couldn’t one use Loschmidt’s argument to construct an equally possible “reverse evolution” in which *H* was instead going to a maximum?

It didn’t take Boltzmann long to answer, in print, tucked away in a section of his paper “Remarks on Some Problems in the Mechanical Theory of Heat”. He admits that Loschmidt’s argument “has great seductiveness”. But he claims it is merely “an interesting sophism”—and then says he will “locate the source of the fallacy”. He begins with a classic setup: a collection of hard spheres in a box.

Suppose that at time zero the distribution of spheres in the box is not uniform; for example, suppose that the density of spheres is greater on the right than on the left … The sophism now consists in saying that, without reference to the initial conditions, it cannot be proved that the spheres will become uniformly mixed in the course of time.

But then he rather boldly claims that with the actual initial conditions described, the spheres will “almost always [become] uniform” at a future time *t*. Now he imagines (following Loschmidt) reversing all the velocities in this state at time *t*. Then, he says:

… the spheres would sort themselves out as time progresses, and at [the analog of] time 0, they would have a completely nonuniform distribution, even though the [new] initial distribution [one had used] was almost uniform.

But now he says that, yes—given this counterexample—it won’t be possible to prove that the final distribution of spheres will always be uniform.

This is in fact a consequence of probability theory, for any nonuniform distribution, no matter how improbable it may be, is still not absolutely impossible. Indeed it is clear that any individual uniform distribution, which might arise after a certain time from some particular initial state, is just as improbable as an individual nonuniform distribution; just as in the game of Lotto, any individual set of five numbers is as improbable as the set 1, 2, 3, 4, 5. It is only because there are many more uniform distributions than nonuniform ones that the distribution of states will become uniform in the course of time. One therefore cannot prove that, whatever may be the positions and velocities of the spheres at the beginning, the distribution must become uniform after a long time; rather one can only prove that infinitely many more initial states will lead to a uniform one after a definite length of time than to a nonuniform one.

He adds:

One could even calculate, from the relative numbers of the different state distributions, their probabilities, which might lead to an interesting method for the calculation of thermal equilibrium.

And indeed within a few months Boltzmann has followed up on that “interesting method” to produce his classic paper on the probabilistic interpretation of entropy.

But in his earlier paper he goes on to argue:

Since there are infinitely many more uniform than nonuniform distributions of states, the latter case is extraordinarily improbable [to arise] and can be considered impossible for practical purposes; just as it may be considered impossible that if one starts with oxygen and nitrogen mixed in a container, after a month one will find chemically pure oxygen in the lower half and nitrogen in the upper half, although according to probability theory this is merely very improbable but not impossible.

He talks about how interesting it is that the Second Law is intimately connected with probability while the First Law is not. But at the end he does admit:

Perhaps this reduction of the Second Law to the realm of probability makes its application to the entire universe appear dubious, but the laws of probability theory are confirmed by all experiments carried out in the laboratory.

At this point it’s all rather unconvincing. The *H* theorem had purported to prove the Second Law. But now he’s just talking about probability theory. He seems to have given up on proving the Second Law. And he’s basically just saying that the Second Law is true because it’s observed to be true—like other laws of nature, but not like something that can be “proved”, say from underlying molecular dynamics.

For many years not much attention was paid to these issues, but by the late 1880s there were attempts to clarify things, particularly among the rather active British circle of kinetic theorists. A published 1894 letter from the Irish mathematician Edward Culverwell (1855–1931) (who also wrote about ice ages and Montessori education) summed up some of the confusions that were circulating:

At a lecture in England the next year, Boltzmann countered (conveniently, in English):

He goes on, but doesn’t get much more specific:

He then makes an argument that will be repeated many times in different forms, saying that there will be fluctuations, where *H* deviates temporarily from its minimum value, but these will be rare:

Later he’s talking about what he calls the “*H* curve” (a plot of *H* as a function of time), and he’s trying to describe its limiting form:

And he even refers to Weierstrass’s recent work on nondifferentiable functions:

But he doesn’t pursue this, and instead ends his “rebuttal” with a more philosophical—and in some sense anthropic—argument that he attributes to his former assistant Ignaz Schütz (1867–1927):

It’s an argument that we’ll see in various forms repeated over the century and a half that follows. In essence what it’s saying is that, yes, the Second Law implies that the universe will end up in thermal equilibrium. But there’ll always be fluctuations. And in a big enough universe there’ll be fluctuations somewhere that are large enough to correspond to the world as we experience it, where “visible motion and life exist”.

But regardless of such claims, there’s a purely formal question about the *H* theorem. How exactly is it that from the Boltzmann transport equation—which is supposed to describe reversible mechanical processes—the *H* theorem manages to prove that the *H* function irreversibly decreases? It wasn’t until 1895—fully 25 years after Boltzmann first claimed to prove the *H* theorem—that this issue was even addressed. And it first came up rather circuitously through Boltzmann’s response to comments in a textbook by Gustav Kirchhoff (1824–1887) that had been completed by Max Planck.

The key point is that Boltzmann’s equation makes an implicit assumption, that’s essentially the same as Maxwell made back in 1860: that before each collision between molecules, the molecules are statistically uncorrelated, so that the probability for the collision has the factored form ƒ(*v*_{1}) ƒ(*v*_{2}). But what about after the collision? Inevitably the collision itself will lead to correlations. So now there’s an asymmetry: there are no correlations before each collision, but there are correlations after. And that’s why the behavior of the system doesn’t have to be symmetrical—and the *H* theorem can prove that *H* irreversibly decreases.

In 1895 Boltzmann wrote a 3-page paper (after half in footnotes) entitled “More about Maxwell’s Distribution Law for Speeds” where he explained what he thought was going on:

[The reversibility of the laws of mechanics] has been recently applied in judging the assumptions necessary for a proof of [the

Htheorem]. This proof requires the hypothesis that the state of the gas is and remains molecularly disordered, namely, that the molecules of a given class do not always or predominantly collide in a specific manner and that, on the contrary, the number of collisions of a given kind can be found by the laws of probability.Now, if we assume that in general a state distribution never remains molecularly ordered for an unlimited time and also that for a stationary state-distribution every velocity is as probable as the reversed velocity, then it follows that by inversion of all the velocities after an infinitely long time every stationary state-distribution remains unchanged. After the reversal, however, there are exactly as many collisions occurring in the reversed way as there were collisions occurring in the direct way. Since the two state distributions are identical, the probability of direct and indirect collisions must be equal for each of them, whence follows Maxwell’s distribution of velocities.

Boltzmann is introducing what we’d now call the “molecular chaos” assumption (and what Ehrenfest would call the *Stosszahlansatz*)—giving a rather self-fulfilling argument for why the assumption should be true. In Boltzmann’s time there wasn’t really anything better to do. By the 1940s the BBGKY hierarchy at least let one organize the hierarchy of correlations between molecules—though it still didn’t give one a tractable way to assess what correlations should exist in practice, and what not.

Boltzmann knew these were all complicated issues. But he wrote about them at a technical level only a few more times in his life. The last time was in 1898 when, responding to a request from the mathematician Felix Klein (1849–1925), he wrote a paper about the *H* curve for mathematicians. He begins by saying that although this curve comes from the theory of gases, the essence of it can be reproduced by a process based on accumulating balls randomly picked from an urn. He then goes on to outline what amounts to a story of random walks and fractals. In another paper, he actually sketches the curve

saying that his drawing “should be taken with a large grain of salt”, noting—in a remarkably fractal-reminiscent way—that “a zincographer [i.e. an engraver of printing plates] would not have been able to produce a real figure since the *H*-curve has a very large number of maxima and minima on each finite segment, and hence defies representation as a line of continuously changing direction.”

Of course, in modern times it’s easy to produce an approximation to the *H* curve according to his prescription:

But at the end of his “mathematical” paper he comes back to talking about gases. And first he makes the claim that the effective reversibility seen in the *H* curve will never be seen in actual physical systems because, in essence, there are always perturbations from outside. But then he ends, in a statement of ultimate reversibility that casts our everyday observation of irreversibility as tautological:

There is no doubt that it is just as conceivable to have a world in which all natural processes take place in the wrong chronological order. But a person living in this upside-down world would have feelings no different than we do: they would just describe what we call the future as the past and vice versa.

Probably the single most prominent research topic in mathematical physics in the 1800s was the three-body problem—of solving for the motion under gravity of three bodies, such as the Earth, Moon and Sun. And in 1890 the French mathematician Henri Poincaré (1854–1912) (whose breakout work had been on the three-body problem) wrote a paper entitled “On the Three-Body Problem and the Equations of Dynamics” in which, as he said:

It is proved that there are infinitely many ways of choosing the initial conditions such that the system will return infinitely many times as close as one wishes to its initial position. There are also an infinite number of solutions that do not have this property, but it is shown that these unstable solutions can be regarded as “exceptional” and may be said to have zero probability.

This was a mathematical result. But three years later Poincaré wrote what amounted to a philosophy paper entitled “Mechanism and Experience” which expounded on its significance for the Second Law:

In the mechanistic hypothesis, all phenomena must be reversible; for example, the stars might traverse their orbits in the retrograde sense without violating Newton’s law; this would be true for any law of attraction whatever. This is therefore not a fact peculiar to astronomy; reversibility is a necessary consequence of all mechanistic hypotheses.

Experience provides on the contrary a number of irreversible phenomena. For example, if one puts together a warm and a cold body, the former will give up its heat to the latter; the opposite phenomenon never occurs. Not only will the cold body not return to the warm one the heat which it has taken away when it is in direct contact with it; no matter what artifice one may employ, using other intervening bodies, this restitution will be impossible, at least unless the gain thereby realized is compensated by an equivalent or large loss. In other words, if a system of bodies can pass from state A to state B by a certain path, it cannot return from B to A, either by the same path or by a different one. It is this circumstance that one describes by saying that not only is there not direct reversibility, but also there is not even indirect reversibility.

But then he continues:

A theorem, easy to prove, tells us that a bounded world, governed only by the laws of mechanics, will always pass through a state very close to its initial state. On the other hand, according to accepted experimental laws (if one attributes absolute validity to them, and if one is willing to press their consequences to the extreme), the universe tends toward a certain final state, from which it will never depart. In this final state, which will be a kind of death, all bodies will be at rest at the same temperature.

But in fact, he says, the recurrence theorem shows that:

This state will not be the final death of the universe, but a sort of slumber, from which it will awake after millions of millions of centuries. According to this theory, to see heat pass from a cold body to a warm one … it will suffice to have a little patience. [And we may] hope that some day the telescope will show us a world in the process of waking up, where the laws of thermodynamics are reversed.

By 1903, Poincaré was more strident in his critique of the formalism around the Second Law, writing (in English) in a paper entitled “On Entropy”:

But back in 1896, Boltzmann and the *H* theorem had another critic: Ernst Zermelo (1871–1953), a recent German math PhD who was then working with Max Planck on applied mathematics—though would soon turn to foundations of mathematics and become the “Z” in ZFC set theory. Zermelo’s attack on the *H* theorem began with a paper entitled “On a Theorem of Dynamics and the Mechanical Theory of Heat”. After explaining Poincaré’s recurrence theorem, Zermelo gives some “mathematician-style” conditions (the gas must be in a finite region, must have no infinite energies, etc.), then says that even though there must exist states that would be non-recurrent and could show irreversible behavior, there would necessarily be infinitely more states that “would periodically repeat themselves … with arbitrarily small variations”. And, he argues, such repetition would affect macroscopic quantities discernable by our senses. He continues:

In order to retain the general validity of the Second Law, we therefore would have to assume that just those initial states leading to irreversible processes are realized in nature, their small number notwithstanding, while the other ones, whose probability of existence is higher, mathematically speaking, do not actually occur.

And he concludes that the Poincaré recurrence phenomenon means that:

… it is certainly impossible to carry out a mechanical derivation of the Second Law on the basis of the existing theory without specializing the initial states.

Boltzmann responded promptly but quite impatiently:

I have pointed out particularly often, and as clearly as I possibly could … that the Second Law is but a principle of probability theory as far as the molecular-theoretic point of view is concerned. … While the theorem by Poincaré that Zermelo discusses in the beginning of his paper is of course correct, its application to heat theory is not.

Boltzmann talks about the *H* curve, and first makes rather a mathematician-style point about the order of limits:

If we first take the number of gas molecules to be infinite, as was clearly done in [my 1896 proof], and only then let the time grow very large, then, in the vast majority of cases, we obtain a curve asymptotically [always close to zero]. Moreover, as can easily be seen, Poincaré’s theorem is not applicable in this case. If, however, we take the time [span] to be infinitely great and, in contrast, the number of molecules to be very great but not absolutely infinite, then the

H-curve has a different character. It almost always runs very close to [zero], but in rare cases it rises above that, in what we shall call a “hump” … at which significant deviations from the Maxwell velocity distribution can occur …

Boltzmann then argues that even if you start “at a hump”, you won’t stay there, and “over an enormously long period of time” you’ll see something infinitely close to “equilibrium behavior”. But, he says:

… it is [always] possible to reach again a greater hump of the

H-curve by further extending the time … In fact, it is even the case that the original state must return, provided only that we continue to sufficiently extend the time…

He continues:

Mr. Zermelo is therefore right in claiming that, mathematically speaking, the motion is periodic. He has by no means succeeded, however, in refuting my theorems, which, in fact, are entirely consistent with this periodicity.

After giving arguments about the probabilistic character of his results, and (as we would now say it) the fact that a 1D random walk is certain to repeatedly return to the origin, Boltzmann says that:

… we must not conclude that the mechanical approach has to be modified in any way. This conclusion would be justified only if the approach had a consequence that runs contrary to experience. But this would be the case only if Mr. Zermelo were able to prove that the duration of the period within which the old state of the gas must recur in accordance with Poincaré’s theorem has an observable length…

He goes on to imagine “a trillion tiny spheres, each with a [certain initial velocity] … in the one corner of a box” (and by “trillion” he means million million million, or today’s quintillion) and then says that “after a short time the spheres will be distributed fairly evenly in the box”, but the period for a “Poincaré recurrence” in which they all will return to their original corner is “so great that nobody can live to see it happen”. And to make this point more forcefully, Boltzmann has an appendix in which he tries to get an actual approximation to the recurrence time, concluding that its numerical value “has many trillions of digits”.

He concludes:

If we consider heat as a motion of molecules that occurs in accordance with the general equations of mechanics and assume that the arrangement of bodies that we perceive is currently in a highly improbable state, then a theorem follows that is in agreement with the Second Law for all phenomena so far observed.

Of course, this theorem can no longer hold once we observe bodies of so small a scale that they only contain a few molecules. Since, however, we do not have at hand any experimental results on the behavior of bodies so small, this assumption does not run counter to previous experience. In fact, certain experiments conducted on very small bodies in gases seem rather to support the assumption, although we are still far from being able to assert its correctness on the basis of experimental proof.

But then he gives an important caveat—with a small philosophical flourish:

Of course, we cannot expect natural science to answer the question as to why the bodies surrounding us currently exist in a highly improbable state, just as we cannot expect it to answer the question as to why there are any phenomena at all and why they adhere to certain given principles.

Unsurprisingly—particularly in view of his future efforts in the foundations of mathematics—Zermelo is unconvinced by all of this. And six months later he replies again in print. He admits that a full Poincaré recurrence might take astronomically long, but notes that (where, by “physical state”, he means one that we perceive):

… we are after all always concerned only with the “physical state”, which can be realized by many different combinations, and hence can recur much sooner.

Zermelo zeroes in on many of the weaknesses in Boltzmann’s arguments, saying that the thing he particularly “contests … is the analogy that is supposed to exist between the properties of the *H* curve and the Second Law”. He claims that irreversibility cannot be explained from “mechanical suppositions” without “new physical assumptions”—and in particular criteria for choosing appropriate initial states. He ends by saying that:

From the great successes of the kinetic theory of gases in explaining the relationships among states we must not deduce its … applicability also to temporal processes. … [For in this case I am] convinced that it necessarily fails in the absence of entirely new assumptions.

Boltzmann replies again—starting off with the strangely weak argument:

The Second Law receives a mechanical explanation by virtue of the assumption, which is of course unprovable, that the universe, when considered as a mechanical system, or at least a very extensive part thereof surrounding us, started out in a highly improbable state and still is in such a state.

And, yes, there’s clearly something missing in the understanding of the Second Law. And even as Zermelo pushes for formal mathematician-style clarity, Boltzmann responds with physicist-style “reasonable arguments”. There’s lots of rhetoric:

The applicability of the calculus of probabilities to a particular case can of course never be proved with precision. If 100 out of 100,000 objects of a particular sort are consumed by fire per year, then we cannot infer with certainty that this will also be the case next year. On the contrary! If the same conditions continue to obtain for 10

^{10}years, then it will often be the case during this period that the 100,000 objects are all consumed by fire at once on a single day, and even that not a single object suffers damage over the course of an entire year. Nevertheless, every insurance company places its faith in the calculus of probabilities.

Or, in justification of the idea that we live in a highly improbable “low-entropy” part of the universe:

I refuse to grant the objection that a mental picture requiring so great a number of dead parts of the universe for the explanation of so small a number of animated parts is wasteful, and hence inexpedient. I still vividly remember someone who adamantly refused to believe that the Sun’s distance from the Earth is 20 million miles on the ground that it would simply be foolish to assume so vast a space only containing luminiferous aether alongside so small a space filled with life.

Curiously—given his apparent reliance on “commonsense” arguments—Boltzmann also says:

I myself have repeatedly cautioned against placing excessive trust in the extension of our mental pictures beyond experience and issued reminders that the pictures of contemporary mechanics, and in particular the conception of the smallest particles of bodies as material points, will turn out to be provisional.

In other words, we don’t know that we can think of atoms (even if they exist at all) as points, and we can’t really expect our everyday intuition to tell us about how they work. Which presumably means that we need some kind of solid, “formal” argument if we’re going to explain the Second Law.

Zermelo didn’t respond again, and moved on to other topics. But Boltzmann wrote one more paper in 1897 about “A Mechanical Theorem of Poincaré” ending with two more why-it-doesn’t-apply-in-practice arguments:

Poincaré’s theorem is of course never applicable to terrestrial bodies which we can hold in our hands as none of them is entirely closed. Nor it is applicable to an entirely closed gas of the sort considered by the kinetic theory if first the number of molecules and only then the quotients of the intervals between two neighboring collisions in the observation time is allowed to become infinite.

Boltzmann—and Maxwell before him—had introduced the idea of using probability theory to discuss the emergence of thermodynamics and potentially the Second Law. But it wasn’t until around 1900—with the work of J. Willard Gibbs (1839–1903)—that a principled mathematical framework for thinking about this developed. And while we can now see that this framework distracts in some ways from several of the key issues in understanding the foundations of the Second Law, it’s been important in framing the discussion of what the Second Law really says—as well as being central in defining the foundations for much of what’s been done over the past century or so under the banner of “statistical mechanics”.

Gibbs seems to have first gotten involved with thermodynamics around 1870. He’d finished his PhD at Yale on the geometry of gears in 1863—getting the first engineering PhD awarded in the US. After traveling in Europe and interacting with various leading mathematicians and physicists, he came back to Yale (where he stayed for the remaining 34 years of his life) and in 1871 became professor of mathematical physics there.

His first papers (published in 1873 when he was already 34 years old) were in a sense based on taking seriously the formalism of equilibrium thermodynamics defined by Clausius and Maxwell—treating entropy and internal energy, just like pressure, volume and temperature, as variables that defined properties of materials (and notably whether they were solids, liquids or gases). Gibbs’s main idea was to “geometrize” this setup, and make it essentially a story of multivariate calculus:

Unlike the European developers of thermodynamics, Gibbs didn’t interact deeply with other scientists—with the possible exception of Maxwell, who (a few years before his death in 1879) made a 3D version of Gibbs’s thermodynamic surface out of clay—and supplemented his 2D thermodynamic diagrams after the first edition of his textbook *Theory of Heat* with renderings of 3D versions:

Three years later, Gibbs began publishing what would be a 300-page work defining what has become the standard formalism for equilibrium chemical thermodynamics. He began with a quote from Clausius:

In the years that followed, Gibbs’s work—stimulated by Maxwell—mostly concentrated on electrodynamics, and later quaternions and vector analysis. But Gibbs published a few more small papers on thermodynamics—always in effect taking equilibrium (and the Second Law) for granted.

In 1882—a certain Henry Eddy (1844–1921) (who in 1879 had written a book on thermodynamics, and in 1890 would become president of the University of Cincinnati), claimed that “radiant heat” could be used to violate the Second Law:

Gibbs soon published a 2-page rebuttal (in the 6th-ever issue of *Science* magazine):

Then in 1889 Clausius died, and Gibbs wrote an obituary—praising Clausius but making it clear he didn’t think the kinetic theory of gases was a solved problem:

That same year Gibbs announced a short course that he would teach at Yale on “The *a priori* Deduction of Thermodynamic Principles from the Theory of Probabilities”. After a decade of work, this evolved into Gibbs’s last publication—an original and elegant book that’s largely defined how the Second Law has been thought about ever since:

The book begins by explaining that mechanics is about studying the time evolution of single systems:

But Gibbs says he is going to do something different: he is going to look at what he’ll call an ensemble of systems, and see how the distribution of their characteristics changes over time:

He explains that these “inquiries” originally arose in connection with deriving the laws of thermodynamics:

But he argues that this area—which he’s calling statistical mechanics—is worth investigating even independent of its connection to thermodynamics:

Still, he expects this effort will be relevant to the foundations of thermodynamics:

He immediately then goes on to what he’ll claim is the way to think about the relation of “observed thermodynamics” to his exact statistical mechanics:

Soon he makes the interesting—if, in the light of history, very overly optimistic—claim that “the laws of thermodynamics may be easily obtained from the principles of statistical mechanics”:

At first the text of the book reads very much like a typical mathematical work on mechanics:

But soon it’s “going statistical”, talking about the “density” of systems in “phase” (i.e. with respect to the variables defining the configuration of the system). And a few pages in, he’s proving the fundamental result that the density of “phase fluid” satisfies a continuity equation (which we’d now call the Liouville equation):

It’s all quite elegant, and all very rooted in the calculus-based mathematics of its time. He’s thinking about a collection of instances of a system. But while with our modern computational paradigm we’d readily be able to talk about a discrete list of instances, with his calculus-based approach he has to consider a continuous collection of instances—whose treatment inevitably seems more abstract and less explicit.

He soon makes contact with the “theory of errors”, discussing in effect how probability distributions over the space of possible states evolve. But what probability distributions should one consider? By chapter 4, he’s looking at what he calls (and is still called) the “canonical distribution”:

He gives a now-classic definition for the probability as a function of energy ϵ:

He observes that this distribution combines nicely when independent parts of a system are brought together, and soon he’s noting that:

But so far he’s careful to just talk about how things are “analogous”, without committing to a true connection:

More than halfway through the book he’s defined certain properties of his probability distributions that “may … correspond to the thermodynamic notions of entropy and temperature”:

Next he’s on to the concept of a “microcanonical ensemble” that includes only states of a given energy. For him—with his continuum-based setup—this is a slightly elaborate thing to define; in our modern computational framework it actually becomes more straightforward than his “canonical ensemble”. Or, as he already says:

But what about the Second Law? Now he’s getting a little closer:

When he says “index of probability” he’s talking about the log of a probability in his ensemble, so this result is about the fact that this quantity is extremized when all the elements of the ensemble have equal probability:

Soon he’s discussing whether he can use his index as a way—like Boltzmann tried to do with his version of entropy—to measure deviations from “statistical equilibrium”:

But now Gibbs has hit one of the classic gotchas of his approach: if you look in perfect detail at the evolution of an ensemble of systems, there’ll never be a change in the value of his index—essentially because of the overall conservation of probability. Gibbs brings in what amounts to a commonsense physics argument to handle this. He says to consider putting “coloring matter” in a liquid that one stirs. And then he says that even though the liquid (like his phase fluid) is microscopically conserved, the coloring matter will still end up being “uniformly mixed” in the liquid:

He talks about how the conclusion about whether mixing happens in effect depends on what order one takes limits in. And while he doesn’t put it quite this way, he’s essentially realized that there’s a competition between the system “mixing things up more and more finely” and the observer being able to track finer and finer details. He realizes, though, that not all systems will show this kind of mixing behavior, noting for example that there are mechanical systems that’ll just keep going in simple cycles forever.

He doesn’t really resolve the question of why “practical systems” should show mixing, more or less ending with a statement that even though his underlying mechanical systems are reversible, it’s somehow “in practice” difficult to go back:

Despite things like this, Gibbs appears to have been keen to keep the majority of his book “purely mathematical”, in effect proving theorems that necessarily followed from the setup he had given. But in the penultimate chapter of the book he makes what he seems to have viewed as a less-than-satisfactory attempt to connect what he’s done with “real thermodynamics”. He doesn’t really commit to the connection, though, characterizing it more as an “analogy”:

But he soon starts to be pretty clear that he actually wants to prove the Second Law:

He quickly backs off a little, in effect bringing in the observer to soften the requirements:

But then he fires his best shot. He says that the quantities he’s defined in connection with his canonical ensemble satisfy the same equations as Clausius originally set up for temperature and entropy:

He adds that fluctuations (or “anomalies”, as he calls them) become imperceptible in the limit of a large system:

But in physical reality, why should one have a whole collection of systems as in the canonical ensemble? Gibbs suggests it would be mor