The Shock of ChatGPT
Just a few months ago writing an original essay seemed like something only a human could do. But then ChatGPT burst onto the scene. And suddenly we realized that an AI could write a passable human-like essay. So now it’s natural to wonder: How far will this go? What will AIs be able to do? And how will we humans fit in?
My goal here is to explore some of the science, technology—and philosophy—of what we can expect from AIs. I should say at the outset that this is a subject fraught with both intellectual and practical difficulty. And all I’ll be able to do here is give a snapshot of my current thinking—which will inevitably be incomplete—not least because, as I’ll discuss, trying to predict how history in an area like this will unfold is something that runs straight into an issue of basic science: the phenomenon of computational irreducibility.
But let’s start off by talking about that particularly dramatic example of AI that’s just arrived on the scene: ChatGPT. So what is ChatGPT? Ultimately, it’s a computational system for generating text that’s been set up to follow the patterns defined by human-written text from billions of webpages, millions of books, etc. Give it a textual prompt and it’ll continue in a way that’s somehow typical of what it’s seen us humans write.
The results (which ultimately rely on all sorts of specific engineering) are remarkably “human-like”. And what makes this work is that whenever ChatGPT has to “extrapolate” beyond anything it’s explicitly seen from us humans it does so in ways that seem similar to what we as humans might do.
Inside ChatGPT is something that’s actually computationally probably quite similar to a brain—with millions of simple elements (“neurons”) forming a “neural net” with billions of connections that have been “tweaked” through a progressive process of training until they successfully reproduce the patterns of human-written text seen on all those webpages, etc. Even without training, the neural net would still produce some kind of text. But the key point is that it won’t be text that we humans consider meaningful. To get such text we need to build on all that “human context” defined by the webpages and other materials we humans have written. The “raw computational system” will just do “raw computation”; to get something aligned with us humans requires leveraging the detailed human history captured by all those pages on the web, etc.
But so what do we get in the end? Well, it’s text that basically reads like it was written by a human. In the past we might have thought that human language was somehow a uniquely human thing to produce. But now we’ve got an AI doing it. So what’s left for us humans? Well, somewhere things have got to get started: in the case of text, there’s got to be a prompt specified that tells the AI “what direction to go in”. And this is the kind of thing we’ll see over and over again. Given a defined “goal”, an AI can automatically work towards achieving it. But it ultimately takes something beyond the raw computational system of the AI to define what we humans would consider a meaningful goal. And that’s where we humans come in.
What does this mean at a practical, everyday level? Typically we use ChatGPT by telling it—using text—what we basically want. And then it’ll fill in a whole essay’s worth of text talking about it. We can think of this interaction as corresponding to a kind of “linguistic user interface” (that we might dub a “LUI”). In a graphical user interface (GUI) there’s core content that’s being rendered (and input) through some potentially elaborate graphical presentation. In the LUI provided by ChatGPT there’s instead core content that’s being rendered (and input) through a textual (“linguistic”) presentation.
You might jot down a few “bullet points”. And in their raw form someone else would probably have a hard time understanding them. But through the LUI provided by ChatGPT those bullet points can be turned into an “essay” that can be generally understood—because it’s based on the “shared context” defined by everything from the billions of webpages, etc. on which ChatGPT has been trained.
There’s something about this that might seem rather unnerving. In the past, if you saw a custom-written essay you’d reasonably be able to conclude that a certain irreducible human effort was spent in producing it. But with ChatGPT this is no longer true. Turning things into essays is now “free” and automated. “Essayification” is no longer evidence of human effort.
Of course, it’s hardly the first time there’s been a development like this. Back when I was a kid, for example, seeing that a document had been typeset was basically evidence that someone had gone to the considerable effort of printing it on a printing press. But then came desktop publishing, and it became basically free to make any document be elaborately typeset.
And in a longer view, this kind of thing is basically a constant trend in history: what once took human effort eventually becomes automated and “free to do” through technology. There’s a direct analog of this in the realm of ideas: with time, higher and higher levels of abstraction are developed that subsume what were formerly laborious details and specifics.
Will this end? Will we eventually have automated everything? Discovered everything? Invented everything? At some level, we now know that the answer is a resounding no. Because one of the consequences of the phenomenon of computational irreducibility is that there’ll always be more computations to do—that can’t in the end be reduced by any finite amount of automation, discovery or invention.
Ultimately, though, this will be a more subtle story. Because while there may always be more computations to do, it could still be that we as humans don’t care about them. And that somehow everything we care about can successfully be automated—say by AIs—leaving “nothing more for us to do”.
Untangling this issue will be at the heart of questions about how we fit into the AI future. And in what follows we’ll see over and over again that what might at first essentially seem like practical matters of technology quickly get enmeshed with deep questions of science and philosophy.
Intuition from the Computational Universe
I’ve already mentioned computational irreducibility a couple of times. And it turns out that this is part of a circle of rather deep—and at first surprising—ideas that I believe are crucial to thinking about the AI future.
Most of our existing intuition about “machinery” and “automation” comes from a kind of “clockwork” view of engineering—in which we specifically build systems component by component to achieve objectives we want. And it’s the same with most software: we write it line by line to specifically do—step by step—whatever it is we want. And we expect that if we want our machinery—or software—to do complex things then the underlying structure of the machinery or software must somehow be correspondingly complex.
So when I started exploring the whole computational universe of possible programs in the early 1980s it was a big surprise to discover that things work quite differently there. And indeed even tiny programs—that effectively just apply very simple rules repeatedly—can generate great complexity. In our usual practice of engineering we haven’t seen this, because we’ve always specifically picked programs (or other structures) where we can readily foresee how they’ll behave, so that we can explicitly set them up to do what we want. But out in the computational universe it’s very common to see programs that just “intrinsically generate” great complexity, without us ever having to explicitly “put it in”.
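A minimal sketch of this phenomenon, in illustrative Python of my own (not anything from the original discussion): the elementary cellular automaton known as “rule 30” updates each cell from just its own value and those of its two immediate neighbors, yet starting from a single black cell it produces a famously intricate pattern.

```python
# Complexity from a tiny program: the elementary cellular automaton
# "rule 30". Each new cell depends only on its old value and its two
# neighbors, via the compact formula: new = left XOR (center OR right).
def rule30_step(cells):
    n = len(cells)
    return [cells[i - 1] ^ (cells[i] | cells[(i + 1) % n])
            for i in range(n)]

def run_rule30(width=79, steps=30):
    cells = [0] * width
    cells[width // 2] = 1                  # start from a single black cell
    rows = [cells]
    for _ in range(steps):
        cells = rule30_step(cells)
        rows.append(cells)
    return rows

if __name__ == "__main__":
    for row in run_rule30():
        print("".join("#" if c else " " for c in row))
```

Printing the rows shows a pattern that is neither repetitive nor obviously structured, even though the entire rule fits in a single line of code.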
And having discovered this, we realize that there’s actually a big example that’s been around forever: the natural world. And indeed it increasingly seems as if the “secret” that nature uses to make the complexity it so often shows is exactly to operate according to the rules of simple programs. (For about three centuries it seemed as if mathematical equations were the ultimate way to describe the natural world—but in the past few decades, and particularly poignantly with our recent Physics Project, it’s become clear that simple programs are in general a more powerful approach.)
How does all this relate to technology? Well, technology is about taking what’s out there in the world, and harnessing it for human purposes. And there’s a fundamental tradeoff here. There may be some system out in nature that does amazingly complex things. But the question is whether we can “slice off” certain particular things that we humans happen to find useful. A donkey has all sorts of complex things going on inside. But at some point it was discovered that we can use it “technologically” to do the rather simple thing of pulling a cart.
And when it comes to programs out in the computational universe it’s extremely common to see ones that do amazingly complex things. But the question is whether we can find some aspect of those things that’s useful to us. Maybe the program is good at making pseudorandomness. Or distributedly determining consensus. Or maybe it’s just doing its complex thing, and we don’t yet know any “human purpose” that this achieves.
One of the notable features of a system like ChatGPT is that it isn’t constructed in an “understand-every-step” traditional engineering way. Instead one basically just starts from a “raw computational system” (in the case of ChatGPT, a neural net), then progressively tweaks it until its behavior aligns with the “human-relevant” examples one has. And this alignment is what makes the system “technologically useful”—to us humans.
Underneath, though, it’s still a computational system, with all the potential “wildness” that implies. And free from the “technological objective” of “human-relevant alignment” the system might do all sorts of sophisticated things. But they might not be things that (at least at this time in history) we care about. Even though some putative alien (or our future selves) might.
OK, but let’s come back to the “raw computation” side of things. There’s something very different about computation from all other kinds of “mechanisms” we’ve seen before. We might have a cart that can move forward. And we might have a stapler that can put staples in things. But carts and staplers do very different things; there’s no equivalence between them. But for computational systems (at least ones that don’t just always behave in obviously simple ways) there’s my Principle of Computational Equivalence—which implies that all these systems are in a sense equivalent in the kinds of computations they can do.
This equivalence has many consequences. One of them is that one can expect to make something equally computationally sophisticated out of all sorts of different kinds of things—whether brain tissue or electronics, or some system in nature. And this is effectively where computational irreducibility comes from.
One might think that given, say, some computational system based on a simple program it would always be possible for us—with our sophisticated brains, mathematics, computers, etc.—to “jump ahead” and figure out what the system will do before it’s gone through all the steps to do it. But the Principle of Computational Equivalence implies that this won’t in general be possible—because the system itself can be as computationally sophisticated as our brains, mathematics, computers, etc. are. So this means that the system will be computationally irreducible: the only way to find out what it does is effectively just to go through the same whole computational process that it does.
There’s a prevailing impression that science will always eventually be able to do better than this: that it’ll be able to make “predictions” that allow us to work out what will happen without having to trace through each step. And indeed over the past three centuries there’s been lots of success in doing this, mainly by using mathematical equations. But ultimately it turns out that this has only been possible because science has ended up concentrating on particular systems where these methods work (and then these systems have been used for engineering). But the reality is that many systems show computational irreducibility. And in the phenomenon of computational irreducibility science is in effect “deriving its own limitedness”.
Contrary to traditional intuition, try as we might, in many systems we’ll never be able to find “formulas” (or other “shortcuts”) that describe what’s going to happen in the systems—because the systems are simply computationally irreducible. And, yes, this represents a limitation on science, and on knowledge in general. But while at first this might seem like a bad thing, there’s also something fundamentally satisfying about it. Because if everything were computationally reducible, we could always “jump ahead” and find out what will happen in the end, say in our lives. But computational irreducibility implies that in general we can’t do that—so that in some sense “something irreducible is being achieved” by the passage of time.
There are a great many consequences of computational irreducibility. Some—that I have particularly explored recently—are in the domain of basic science (for example, establishing core laws of physics as we perceive them from the interplay of computational irreducibility and our computational limitations as observers). But computational irreducibility is also central in thinking about the AI future—and in fact I increasingly feel that it adds the single most important intellectual element needed to make sense of many of the most important questions about the potential roles of AIs and humans in the future.
For example, from our traditional experience with engineering we’re used to the idea that to find out why something happened in a particular way we can just “look inside” a machine or program and “see what it did”. But when there’s computational irreducibility, that won’t work. Yes, we could “look inside” and see, say, a few steps. But computational irreducibility implies that to find out what happened, we’d have to trace through all the steps. We can’t expect to find a “simple human narrative” that “says why something happened”.
But having said this, one feature of computational irreducibility is that within any computationally irreducible system there must always be (ultimately, infinitely many) “pockets of computational reducibility” to be found. So for example, even though we can’t say in general what will happen, we’ll always be able to identify specific features that we can predict. (“The leftmost cell will always be black”, etc.) And as we’ll discuss later we can potentially think of technological (as well as scientific) progress as being intimately tied to the discovery of these “pockets of reducibility”. And in effect the existence of infinitely many such pockets is the reason that “there’ll always be inventions and discoveries to be made”.
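As a concrete sketch of such a pocket (my own illustrative code), using the elementary cellular automaton rule 30 as the example system: the overall pattern appears computationally irreducible, but one simple feature is predictable in advance, namely that the leftmost cell of the growing pattern is black on every step.

```python
# Rule 30: the pattern as a whole looks irreducibly complex, but the
# left edge of the growing pattern is a simple predictable feature
# (a "pocket of reducibility"): it is black on every step.
def rule30_step(cells):
    n = len(cells)
    return [cells[i - 1] ^ (cells[i] | cells[(i + 1) % n])
            for i in range(n)]

def edge_and_center(steps=40):
    width = 2 * steps + 5                  # wide enough that edges never wrap
    mid = width // 2
    cells = [0] * width
    cells[mid] = 1
    edge_black, center = [], [cells[mid]]
    for t in range(1, steps + 1):
        cells = rule30_step(cells)
        edge_black.append(cells[mid - t] == 1)   # leftmost cell of the pattern
        center.append(cells[mid])                # center column: irregular
    return edge_black, center
```

Running this, every entry of `edge_black` is True (the reducible pocket: the cell just left of the pattern always sees the neighborhood 0,0,1, which rule 30 maps to black), while the center column shows no evident regularity (the irreducible-looking part).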
Another consequence of computational irreducibility has to do with trying to ensure things about the behavior of a system. Let’s say one wants to set up an AI so it’ll “never do anything bad”. One might imagine that one could just come up with particular rules that ensure this. But as soon as the behavior of the system (or its environment) is computationally irreducible one will never be able to guarantee what will happen in the system. Yes, there may be particular computationally reducible features one can be sure about. But in general computational irreducibility implies that there’ll always be a “possibility of surprise” or the potential for “unintended consequences”. And the only way to systematically avoid this is to make the system not computationally irreducible—which means it can’t make use of the full power of computation.
“AIs Will Never Be Able to Do That”
We humans like to feel special, and feel as if there’s something “fundamentally unique” about us. Five centuries ago we thought we lived at the center of the universe. Now we just tend to think that there’s something about our intellectual capabilities that’s fundamentally unique and beyond anything else. But the progress of AI—and things like ChatGPT—keep on giving us more and more evidence that that’s not the case. And indeed my Principle of Computational Equivalence says something even more extreme: that at a fundamental computational level there’s just nothing fundamentally special about us at all—and that in fact we’re computationally just equivalent to lots of systems in nature, and even to simple programs.
This broad equivalence is important in being able to make very general scientific statements (like the existence of computational irreducibility). But it also highlights how significant our specifics—our particular history, biology, etc.—are. It’s very much like with ChatGPT. We can have a generic (untrained) neural net with the same structure as ChatGPT, that can do certain “raw computation”. But what makes ChatGPT interesting—at least to us—is that it’s been trained with the “human specifics” described on billions of webpages, etc. In other words, for both us and ChatGPT there’s nothing computationally “generally special”. But there is something “specifically special”—and it’s the particular history we’ve had, particular knowledge our civilization has accumulated, etc.
There’s a curious analogy here to our physical place in the universe. There’s a certain uniformity to the universe, which means there’s nothing “generally special” about our physical location. But at least to us there’s still something “specifically special” about it, because it’s only here that we have our particular planet, etc. At a deeper level, ideas based on our Physics Project have led to the concept of the ruliad: the unique object that is the entangled limit of all possible computational processes. And we can then view our whole experience as “observers of the universe” as consisting of sampling the ruliad at a particular place.
It’s a bit abstract (and a long story, which I won’t go into in any detail here), but we can think of different possible observers as being both at different places in physical space, and at different places in rulial space—giving them different “points of view” about what happens in the universe. Human minds are in effect concentrated in a particular region of physical space (mostly on this planet) and a particular region of rulial space. And in rulial space different human minds—with their different experiences and thus different ways of thinking about the universe—are in slightly different places. Animal minds might be fairly close in rulial space. But other computational systems (like, say, the weather, which is sometimes said to “have a mind of its own”) are further away—as putative aliens might also be.
So what about AIs? It depends what we mean by “AIs”. If we’re talking about computational systems that are set up to do “human-like things” then that means they’ll be close to us in rulial space. But insofar as “an AI” is an arbitrary computational system it can be anywhere in rulial space, and it can do anything that’s computationally possible—which is far broader than what we humans can do, or even think about. (As we’ll talk about later, as our intellectual paradigms—and ways of observing things—expand, the region of rulial space in which we humans operate will correspondingly expand.)
But, OK, just how “general” are the computations that we humans (and the AIs that follow us) are doing? We don’t know enough about the brain to be sure. But if we look at artificial neural net systems—like ChatGPT—we can potentially get some sense. And in fact the computations really don’t seem to be that “general”. In most neural net systems data that’s given as input just “ripples once through the system” to produce output. It’s not like a Turing machine, where there can be arbitrary “recirculation of data”. And indeed without such “arbitrary recirculation” the computation is necessarily quite “shallow” and can’t ultimately show computational irreducibility.
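A schematic contrast, in toy code of my own (not a model of any actual neural net): a feedforward computation performs a number of operations fixed by its architecture, whereas a system that can “recirculate” data can run for an input-dependent, and in general unbounded, number of steps, which is what opens the door to computational irreducibility.

```python
# "Shallow" computation: input ripples once through a fixed stack of
# layers, so the amount of work is fixed by the architecture.
def feedforward(x, layers):
    for f in layers:                 # always exactly len(layers) steps
        x = f(x)
    return x

# Computation with "recirculation of data": the loop runs for an
# input-dependent number of steps that can't in general be bounded
# in advance (here, the classic Collatz iteration).
def collatz_steps(n):
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps
```

For example, `feedforward` always does exactly `len(layers)` operations, while `collatz_steps(6)` takes 8 steps and `collatz_steps(27)` takes 111: nothing about the sizes of the inputs predicts that difference.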
It’s a bit of a technical point, but one can ask whether ChatGPT, with its “re-feeding of text produced so far” can in fact achieve arbitrary (“universal”) computation. And I suspect that in some formal sense it can (or at least a sufficiently expanded analog of it can)—though by producing an extremely verbose piece of text that for example in effect lists successive (self-delimiting) states of a Turing machine tape, and in which finding “the answer” to a computation will take a bit of effort. But—as I’ve discussed elsewhere—in practice ChatGPT is presumably almost exclusively doing “quite shallow” computation.
It’s an interesting feature of the history of practical computing that what one might consider “deep pure computations” (say in mathematics or science) were done for decades before “shallow human-like computations” became feasible. And the basic reason for this is that for “human-like computations” (like recognizing images or generating text) one needs to capture lots of “human context”, which requires having lots of “human-generated data” and the computational resources to store and process it.
And, by the way, brains also seem to specialize in fundamentally shallow computations. And to do the kind of deeper computations that allow one to take advantage of more of what’s out there in the computational universe, one has to turn to computers. As we’ve discussed, there’s plenty out in the computational universe that we humans don’t (yet) care about: we just consider it “raw computation”, that doesn’t seem to be “achieving human purposes”. But as a practical matter it’s important to make a bridge between the things we humans do care about and think about, and what’s possible in the computational universe. And in a sense that’s at the core of the project I’ve put so much effort into with the Wolfram Language: creating a full-scale computational language that describes in computational terms the things we think about, and experience in the world.
OK, people have been saying for years: “It’s nice that computers can do A and B, but only humans can do X”. What X is supposed to be has changed—and narrowed—over the years. And ChatGPT provides us with a major unexpected new example of something more that computers can do.
So what’s left? People might say: “Computers can never show creativity or originality”. But—perhaps disappointingly—that’s surprisingly easy to get, and indeed just a bit of randomness “seeding” a computation can often do a pretty good job, as we saw years ago with our WolframTones music-generation system, and as we see today with ChatGPT’s writing. People might also say: “Computers can never show emotions”. But before we had a good way to generate human language we wouldn’t really have been able to tell. And now it already works pretty well to ask ChatGPT to write “happily”, “sadly”, etc. (In their raw form emotions in both humans and other animals are presumably associated with rather simple “global variables” like neurotransmitter concentrations.)
In the past people might have said: “Computers can never show judgement”. But by now there are endless examples of machine learning systems that do well at reproducing human judgement in lots of domains. People might also say: “Computers don’t show common sense”. And by this they typically mean that in a particular situation a computer might locally give an answer, but there’s a global reason why that answer doesn’t make sense, that the computer “doesn’t notice”, but a person would.
So how does ChatGPT do on this? Not too badly. In plenty of cases it correctly recognizes that “that’s not what I’ve typically read”. But, yes, it makes mistakes. Some of them have to do with it not being able to do—purely with its neural net—even slightly “deeper” computations. (And, yes, that’s something that can often be fixed by it calling Wolfram|Alpha as a tool.) But in other cases the problem seems to be that it can’t quite connect different domains well enough.
It’s perfectly capable of doing simple (“SAT-style”) analogies. But when it comes to larger-scale ones it doesn’t manage them. My guess, though, is that it won’t take much scaling up before it starts to be able to make what seem like very impressive analogies (that most of us humans would never even be able to make)—at which point it’ll probably successfully show broader “common sense”.
But so what’s left that humans can do, and AIs can’t? There’s—almost by definition—one fundamental thing: define what we would consider goals for what to do. We’ll talk more about this later. But for now we can note that any computational system, once “set in motion”, will just follow its rules and do what it does. But what “direction should it be pointed in”? That’s something that has to come from “outside the system”.
So how does it work for us humans? Well, our goals are in effect defined by the whole web of history—both from biological evolution and from our cultural development—in which we are embedded. But ultimately the only way to truly participate in that web of history is to be part of it.
Of course, we can imagine technologically emulating every “relevant” aspect of a brain—and indeed things like the success of ChatGPT may suggest that that’s easier to do than we might have thought. But that won’t be enough. To participate in the “human web of history” (as we’ll discuss later) we’ll have to emulate other aspects of “being human”—like moving around, being mortal, etc. And, yes, if we make an “artificial human” we can expect it (by definition) to show all the features of us humans.
But while we’re still talking about AIs as—for example—“running on computers” or “being purely digital” then, at least as far as we’re concerned, they’ll have to “get their goals from outside”. One day (as we’ll discuss) there will no doubt be some kind of “civilization of AIs”—which will form its own web of history. But at this point there’s no reason to think that we’ll still be able to describe what’s going on in terms of goals that we recognize. In effect the AIs will at that point have left our domain of rulial space. And—as we’ll discuss—they’ll be operating more like the kind of systems we see in nature, where we can tell there’s computation going on, but we can’t describe it, except rather anthropomorphically, in terms of human goals and purposes.
Will There Be Anything Left for the Humans to Do?
It’s been an issue that’s been raised—with varying degrees of urgency—for centuries: with the advance of automation (and now AI), will there eventually be nothing left for humans to do? Back in the early days of our species, there was lots of hard work of hunting and gathering to do, just to survive. But at least in the developed parts of the world, that kind of work is now at best a distant historical memory.
And yet at each stage in history—at least so far—there always seem to be other kinds of work that keep people busy. But there’s a pattern that increasingly seems to repeat. Technology in some way or another enables some new occupation. And eventually that occupation becomes widespread, and lots of people do it. But then there’s a technological advance, and the occupation gets automated—and people aren’t needed to do it anymore. But now there’s a new level of technology, that enables new occupations. And the cycle continues.
A century ago the increasingly widespread use of telephones meant that more and more people worked as switchboard operators. But then telephone switching was automated—and those switchboard operators weren’t needed anymore. But with automated switching there could be huge development of telecommunications infrastructure, opening up all sorts of new types of jobs, that in aggregate employ vastly more people than were ever switchboard operators.
Something somewhat similar happened with accounting clerks. Before there were computers, one needed to have people laboriously tallying up numbers. But with computers, that was all automated away. But with that automation came the ability to do more complex financial computations—which allowed for more complex financial transactions, more complex regulations, etc., which in turn led to all sorts of new types of jobs.
And across a whole range of industries, it’s been the same kind of story. Automation obsoletes some jobs, but enables others. There’s quite often a gap in time, and a change in the skills that are needed. But at least so far there always seems to have been a broad frontier of jobs that have been made possible—but haven’t yet been automated.
Will this at some point end? Will there come a time when everything we humans want (or at least need) is delivered automatically? Well, of course, that depends on what we want, and whether, for example, that evolves with what technology has made possible. But could we just decide that “enough is enough”; let’s stop here, and just let everything be automated?
I don’t think so. And the reason is ultimately because of computational irreducibility. We try to get the world to be “just so”, say set up so we’re “predictably comfortable”. Well, the problem is that there’s inevitably computational irreducibility in the way things develop—not just in nature, but in things like societal dynamics too. And that means that things won’t stay “just so”. There’ll always be something unpredictable that happens; something that the automation doesn’t cover.
At first we humans might just say “we don’t care about that”. But in time computational irreducibility will affect everything. So if there’s anything at all we care about (including, for example, not going extinct), we’ll eventually have to do something—and go beyond whatever automation was already set up.
It’s easy to find practical examples. We might think that when computers and people are all connected in a seamless automated network, there’d be nothing more to do. But what about the “unintended consequence” of computer security issues? What might have seemed like a case where “technology finished things” quickly creates a new kind of job for people to do. And at some level, computational irreducibility implies that things like this must always happen. There must always be a “frontier”. At least if there’s anything at all we want to preserve (like not going extinct).
But let’s come back to the situation here and now with AI. ChatGPT just automated all sorts of text-related tasks. It used to take lots of effort—and people—to write customized reports, letters, etc. But (at least so long as one’s dealing with situations where one doesn’t need 100% “correctness”) ChatGPT just automated a lot of that, so people aren’t needed for it anymore. But what will this mean? Well, it means that there’ll be a lot more customized reports, letters, etc. that can be produced. And that will lead to new kinds of jobs—managing, analyzing, validating, etc. all that mass-customized text. Not to mention the need for prompt engineers (a job category that just didn’t exist until a few months ago), and what amount to AI wranglers, AI psychologists, etc.
But let’s talk about today’s “frontier” of jobs that haven’t been “automated away”. There’s one category that in many ways seems surprising to still be “with us”: jobs that involve lots of mechanical manipulation, like construction, fulfillment, food preparation, etc. But there’s a missing piece of technology here: there isn’t yet good general-purpose robotics (as there is general-purpose computing), and we humans still have the edge in dexterity, mechanical adaptability, etc. But I’m quite sure that in time—and perhaps quite suddenly—the necessary technology will be developed (and, yes, I have ideas about how to do it). And this will mean that most of today’s “mechanical manipulation” jobs will be “automated away”—and won’t need people to do them.
But then, just as in our other examples, this will mean that mechanical manipulation will become much easier and cheaper to do, and more of it will be done. Houses might routinely be built and dismantled. Products might routinely be picked up from wherever they’ve ended up, and redistributed. Vastly more ornate “food constructions” might become the norm. And each of these things—and many more—will open up new jobs.
But will every job that exists in the world today “on the frontier” eventually be automated? What about jobs where it seems like a large part of the value is just “having a human be there”? Jobs like flying a plane where one wants the “commitment” of the pilot being there in the plane. Caregiver jobs where one wants the “connection” of a human being there. Sales or education jobs where one wants “human persuasion” or “human encouragement”. Today one might think “only a human can make one feel that way”. But that’s typically based on the way the job is done now. And maybe there’ll be different ways found that allow the essence of the task to be automated, almost inevitably opening up new tasks to be done.
For example, something that in the past needed “human persuasion” might be “automated” by something like gamification—but then more of it can be done, with new needs for design, analytics, management, etc.
We’ve been talking about “jobs”. And that term immediately brings to mind wages, economics, etc. And, yes, plenty of what people do (at least in the world as it is today) is driven by issues of economics. But plenty is also not. There are things we “just want to do”—as a “social matter”, for “entertainment”, for “personal satisfaction”, etc.
Why do we want to do these things? Some of it seems intrinsic to our biological nature. Some of it seems determined by the “cultural environment” in which we find ourselves. Why might one walk on a treadmill? In today’s world one might explain that it’s good for health, lifespan, etc. But a few centuries ago, without modern scientific understanding, and with a different view of the significance of life and death, that explanation really wouldn’t work.
What drives such changes in our view of what we “want to do”, or “should do”? Some seems to be driven by the pure “dynamics of society”, presumably with its own computational irreducibility. But some has to do with our ways of interacting with the world—both the increasing automation delivered by the advance of technology, and the increasing abstraction delivered by the advance of knowledge.
And there seem to be similar “cycles” seen here as in the kinds of things we consider to be “occupations” or “jobs”. For a while something is hard to do, and serves as a good “pastime”. But then it gets “too easy” (“everybody now knows how to win at game X”, etc.), and something at a “higher level” takes its place.
About our “base” biologically driven motivations it doesn’t seem like anything has really changed in the course of human history. But there are certainly technological developments that could have an effect in the future. Effective human immortality, for example, would change many aspects of our motivation structure. As would things like the ability to implant memories or, for that matter, implant motivations.
For now, there’s a certain element of what we want to do that’s “anchored” by our biological nature. But at some point we’ll surely be able to emulate with a computer at least the essence of what our brains are doing (and indeed the success of things like ChatGPT makes it seem like the moment when that will happen is closer at hand than we might have thought). And at that point we’ll have the possibility of what amount to “disembodied human souls”.
To us today it’s very hard to imagine what the “motivations” of such a “disembodied soul” might be. Looked at “from the outside” we might “see the soul” doing things that “don’t make much sense” to us. But it’s like asking what someone from a thousand years ago would think about many of our activities today. These activities make sense to us today because we’re embedded in our whole “current framework”. But without that framework they don’t make sense. And so it will be for the “disembodied soul”. To us, what it does may not make sense. But to it, with its “current framework”, it will.
Could we “learn how to make sense of it”? There’s likely to be a certain barrier of computational irreducibility: in effect the only way to “understand the soul of the future” is to retrace its steps to get to where it is. So from our vantage point today, we’re separated by a certain “irreducible distance”, in effect in rulial space.
But could there be some science of the future that will at least tell us general things about how such “souls” behave? Even when there’s computational irreducibility we know that there will always be pockets of computational reducibility—and thus features of behavior that are predictable. But will those features be “interesting”, say from our vantage point today? Maybe some of them will be. Maybe they’ll show us some kind of metapsychology of souls. But inevitably they can only go so far. Because in order for those souls to even experience the passage of time there has to be computational irreducibility. If too much of what happens is too predictable, it’s as if “nothing is happening”—or at least nothing “meaningful”.
And, yes, this is all tied up with questions about “free will”. Even when there’s a disembodied soul that’s operating according to some completely deterministic underlying program, computational irreducibility means its behavior can still “seem free”—because nothing can “outrun it” and say what it’s going to be. And the “inner experience” of the disembodied soul can be significant: it’s “intrinsically defining its future”, not just “having its future defined for it”.
One might have assumed that once everything is just “visibly operating” as “mere computation” it would necessarily be “soulless” and “meaningless”. But computational irreducibility is what breaks out of this, and what allows there to be something irreducible and “meaningful” achieved. And it’s the same phenomenon whether one’s talking about our life now in the physical universe, or a future “disembodied” computational existence. Or in other words, even if absolutely everything—even our very existence—has been “automated by computation”, that doesn’t mean we can’t have a perfectly good “inner experience” of meaningful existence.
Generalized Economics and the Concept of Progress
If we look at human history—or, for that matter, the history of life on Earth—there’s a certain pervasive sense that there’s some kind of “progress” happening. But what fundamentally is this “progress”? One can view it as the process of things being done at a progressively “higher level”, so that in effect “more of what’s important” can happen with a given effort. This idea of “going to a higher level” takes many forms—but they’re all fundamentally about eliding details below, and being able to operate purely in terms of the “things one cares about”.
In technology, this shows up as automation, in which what used to take lots of detailed steps gets packaged into something that can be done “with the push of a button”. In science—and the intellectual realm in general—it shows up as abstraction, where what used to involve lots of specific details gets packaged into something that can be talked about “purely collectively”. And in biology it shows up as some structure (ribosome, cell, wing, etc.) that can be treated as a “modular unit”.
That it’s possible to “do things at a higher level” is a reflection of being able to find “pockets of computational reducibility”. And—as we mentioned above—the fact that (given underlying computational irreducibility) there are necessarily an infinite number of such pockets means that “progress can always go on forever”.
When it comes to human affairs we tend to value such progress highly, because (at least for now) we live finite lives, and insofar as we “want more to happen”, “progress” makes that possible. It’s certainly not self-evident that having more happen is “good”; one might just “want a quiet life”. But there is one constraint that in a sense originates from the deep foundations of biology.
If something doesn’t exist, then nothing can ever “happen to it”. So in biology, if one’s going to have anything “happen” with organisms, they’d better not be extinct. But the physical environment in which biological organisms exist is finite, with many resources that are finite. And given organisms with finite lives, there’s an inevitability to the process of biological evolution, and to the “competition” for resources between organisms.
Will there eventually be an “ultimate winning organism”? Well, no, there can’t be—because of computational irreducibility. There’ll in a sense always be more to explore in the computational universe—more “raw computational material for possible organisms”. And given any “fitness criterion” (like—in a Turing machine analog—“living longer before halting”) there’ll always be a way to “do better” with it.
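As a concrete toy version of this Turing machine analog, one can enumerate every 2-state, 2-symbol Turing machine and score each by how many steps it runs before halting. The sketch below (pure Python; the encoding of machines is just one illustrative choice) finds the longest halting run, the classic "busy beaver" value for this size:

```python
from itertools import product

def run(machine, cap=20):
    """Simulate a 2-state, 2-symbol Turing machine on a blank tape.
    Returns the number of steps taken before halting (the transition
    into the halt state counts as a step), or None if the cap is hit."""
    tape, pos, state = {}, 0, 0          # state 0 = 'A', 1 = 'B', 2 = halt
    for step in range(1, cap + 1):
        write, move, nxt = machine[(state, tape.get(pos, 0))]
        tape[pos] = write
        pos += move
        if nxt == 2:                     # entered the halt state
            return step
        state = nxt
    return None                          # didn't halt within the cap

# Every possible transition: write 0/1, move left/right, go to A/B/halt
options = list(product([0, 1], [-1, 1], [0, 1, 2]))

best = 0
for t in product(options, repeat=4):     # one entry per (state, symbol) pair
    machine = {(0, 0): t[0], (0, 1): t[1], (1, 0): t[2], (1, 1): t[3]}
    steps = run(machine)
    if steps is not None and steps > best:
        best = steps

print(best)   # prints 6: the longest halting run among 2-state machines
```

For 2-state machines the maximum turns out to be 6 steps; the point of the analogy is that allowing more states always allows longer halting runs, so with this fitness criterion there is never an “ultimate winner”.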
One might still wonder, however, whether perhaps biological evolution—with its underlying process of random genetic mutation—could “get stuck” and never be able to discover some “way to do better”. And indeed simple models of evolution might give one the intuition that this would happen. But actual evolution seems more like deep learning with a large neural net—where one’s effectively operating in an extremely high-dimensional space where there’s typically always a “way to get there from here”, at least given enough time.
But, OK, so from our history of biological evolution there’s a certain built-in sense of “competition for scarce resources”. And this sense of competition has (so far) also carried over to human affairs. And indeed it’s the basic driver for most of the processes of economics.
But what if resources aren’t “scarce” anymore? What if progress—in the form of automation, or AI—makes it easy to “get anything one wants”? We might imagine robots building everything, AIs figuring everything out, etc. But there are still things that are inevitably scarce. There’s only so much real estate. Only one thing can be “the first ___”. And, in the end, if we have finite lives, we only have so much time.
Still, the more efficient—or high level—the things we do (or have) are, the more we’ll be able to get done in the time we have. And it seems as if what we perceive as “economic value” is intimately connected with “making things higher level”. A finished phone is “worth more” than its raw materials. An organization is “worth more” than its separate parts. But what if we could have “infinite automation”? Then in a sense there’d be “infinite economic value everywhere”, and one might imagine there’d be “no competition left”.
But once again computational irreducibility stands in the way. Because it tells us there’ll never be “infinite automation”, just as there’ll never be an ultimate winning biological organism. There’ll always be “more to explore” in the computational universe, and different paths to follow.
What will this look like in practice? Presumably it’ll lead to all sorts of diversity. So that, for example, a chart of “what the components of an economy are” will become more and more fragmented; it won’t just be “the single winning economic activity is ___”.
There is one potential wrinkle in this picture of unending progress. What if nobody cares? What if the innovations and discoveries just don’t matter, say to us humans? And, yes, there is of course plenty in the world that at any given time in history we don’t care about. That piece of silicon we’ve been able to pick out? It’s just part of a rock. Well, until we start making microprocessors out of it.
But as we’ve discussed, as soon as we’re “operating at some level of abstraction” computational irreducibility makes it inevitable that we’ll eventually be exposed to things that “require going beyond that level”.
But then—critically—there will be choices. There will be different paths to explore (or “mine”) in the computational universe—in the end infinitely many of them. And whatever the computational resources of AIs etc. might be, they’ll never be able to explore all of them. So something—or someone—will have to make a choice of which ones to take.
Given a particular set of things one cares about at a particular point, one might successfully be able to automate all of them. But computational irreducibility implies there will always be a “frontier”, where choices have to be made. And there’s no “right answer”; no “theoretically derivable” conclusion. Instead, if we humans are involved, this is where we get to define what’s going to happen.
How will we do that? Well, ultimately it’ll be based on our history—biological, cultural, etc. We’ll get to use all that irreducible computation that went into getting us to where we are to define what to do next. In a sense it’ll be something that goes “through us”, and that uses what we are. It’s the place where—even when there’s automation all around—there’s still always something us humans can “meaningfully” do.
How Can We Tell the AIs What to Do?
Let’s say we want an AI (or any computational system) to do a particular thing. We might think we could just set up its rules (or “program it”) to do that thing. And indeed for certain kinds of tasks that works just fine. But the deeper the use we make of computation, the more we’re going to run into computational irreducibility, and the less we’ll be able to know how to set up particular rules to achieve what we want.
And then, of course, there’s the question of defining what “we want” in the first place. Yes, we could have specific rules that say what particular pattern of bits should occur at a particular point in a computation. But that probably won’t have much to do with the kind of overall “human-level” objective that we typically care about. And indeed for any objective we can even reasonably define, we’d better be able to coherently “form a thought” about it. Or, in effect, we’d better have some “human-level narrative” to describe it.
But how can we represent such a narrative? Well, we have natural language—probably the single most important innovation in the history of our species. And what natural language fundamentally does is to allow us to talk about things at a “human level”. It’s made of words that we can think of as representing “human-level packets of meaning”. And so, for example, the word “chair” represents the human-level concept of a chair. It’s not referring to some particular arrangement of atoms. Instead, it’s referring to any arrangement of atoms that we can usefully conflate into the single human-level concept of a chair, and from which we can deduce things like the fact that we can expect to sit on it, etc.
So, OK, when we’re “talking to an AI” can we expect to just say what we want using natural language? We can definitely get a certain distance—and indeed ChatGPT helps us get further than ever before. But as we try to make things more precise we run into trouble, and the language we need rapidly becomes increasingly ornate, as in the “legalese” of complex legal documents. So what can we do? If we’re going to keep things at the level of “human thoughts” we can’t “reach down” into all the computational details. Yet we want a precise definition of how what we might say can be implemented in terms of those computational details.
Well, there’s a way to deal with this, and it’s one that I’ve personally devoted many decades to: it’s the idea of computational language. When we think about programming languages, they’re things that operate solely at the level of computational details, defining in more or less the native terms of a computer what the computer should do. But the point of a true computational language (and, yes, in the world today the Wolfram Language is the sole example) is to do something different: to define a precise way of talking in computational terms about things in the world (whether concretely countries or minerals, or abstractly computational or mathematical structures).
Out in the computational universe, there’s immense diversity in the “raw computation” that can happen. But there’s only a thin sliver of it that we humans (at least currently) care about and think about. And we can view computational language as defining a bridge between the things we think about and what’s computationally possible. The functions in our computational language (7000 or so of them in the Wolfram Language) are in effect like words in a human language—but now they have a precise grounding in the “bedrock” of explicit computation. And the point is to design the computational language so it’s convenient for us humans to think and express ourselves in (like a vastly expanded analog of mathematical notation), but so it can also be precisely implemented in practice on a computer.
Given a piece of natural language it’s often possible to give a precise, computational interpretation of it—in computational language. And indeed this is exactly what happens in Wolfram|Alpha. Give a piece of natural language and the Wolfram|Alpha NLU system will try to find an interpretation of it as computational language. And from this interpretation, it’s then up to the Wolfram Language to do the computation that’s specified, and give back the results—and potentially synthesize natural language to express them.
As a practical matter, this setup is useful not only for humans, but also for AIs—like ChatGPT. Given a system that produces natural language, the Wolfram|Alpha NLU system can “catch” natural language it is “thrown”, and interpret it as computational language that precisely specifies a potentially irreducible computation to do.
With both natural language and computational language one’s basically “directly saying what one wants”. But an alternative approach—more aligned with machine learning—is just to give examples, and (implicitly or explicitly) say “follow these”. Inevitably there has to be some underlying model for how to do that following—typically in practice just defined by “what a neural net with a certain architecture will do”. But will the result be “right”? Well, the result will be whatever the neural net gives. But typically we’ll tend to consider it “right” if it’s somehow consistent with what we humans would have concluded. And in practice this often seems to happen, presumably because the actual architecture of our brains is somehow similar enough to the architecture of the neural nets we’re using.
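As a toy illustration of why “following examples” depends on the underlying model, here is a minimal pure-Python sketch (with invented data) in which two models agree exactly on the training examples, yet extrapolate quite differently beyond them:

```python
# Training examples: pairs (x, y) the "model" is told to follow
examples = [(0, 0.0), (1, 2.0), (2, 4.0), (3, 6.0)]

def linear_fit(pts):
    """Least-squares line through the examples (closed form)."""
    n = len(pts)
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return lambda x: slope * x + intercept

def nearest_neighbor(pts):
    """Predict the y of the training point with the closest x."""
    return lambda x: min(pts, key=lambda p: abs(p[0] - x))[1]

f, g = linear_fit(examples), nearest_neighbor(examples)

# On the training examples the two models agree exactly...
for x, y in examples:
    assert abs(f(x) - y) < 1e-9 and g(x) == y

# ...but asked to extrapolate beyond them, they diverge
print(f(10), g(10))   # 20.0 vs 6.0
```

Neither extrapolation is “right” in any absolute sense; which one we consider right depends on how well the model’s built-in assumptions happen to match our own.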
But what if we want to “know for sure” what’s going to happen—or, for example, that some particular “mistake” can never be made? Well then we’re presumably thrust back into computational irreducibility, with the result that there’s no way to know, for example, whether a particular set of training examples can lead to a system that’s capable of doing (or not doing) some particular thing.
OK, but let’s say we’re setting up some AI system, and we want to make sure it “doesn’t do anything bad”. There are several levels of issues here. The first is to decide what we mean by “anything bad”. And, as we’ll discuss below, that in itself is very hard. But even if we could abstractly figure this out, how should we actually express it? We could give examples—but then the AI will inevitably have to “extrapolate” from them, in ways we can’t predict. Or we could describe what we want in computational language. It might be difficult to cover “every case” (as it is in present-day human laws, or complex contracts). But at least we as humans can read what we’re specifying. Though even in this case, there’s an issue of computational irreducibility: that given the specification it won’t be possible to work out all its consequences.
What does all this mean? In essence it’s just a reflection of the fact that as soon as there’s “serious computation” (i.e. irreducible computation) involved, one isn’t going to be immediately able to say what will happen. (And in a sense that’s inevitable, because if one could say, it would mean the computation wasn’t in fact irreducible.) So, yes, we can try to “tell AIs what to do”. But it’ll be like many systems in nature (or, for that matter, people): you can set them on a path, but you can’t know for sure what will happen; you just have to wait and see.
A World Run by AIs
In the world today, there are already plenty of things that are being done by AIs. And, as we’ve discussed, there’ll surely be more in the future. But who’s “in charge”? Are we telling the AIs what to do, or are they telling us? Today it’s at best a mixture: AIs suggest content for us (for example from the web), and in general make all sorts of recommendations about what we should do. And no doubt in the future those recommendations will be even more extensive and tightly coupled to us: we’ll be recording everything we do, processing it with AI, and continually annotating with recommendations—say through augmented reality—everything we see. And in some sense things might even go beyond “recommendations”. If we have direct neural interfaces, then the AIs might be making our brains just “decide” they want to do things, so that in some sense we become pure “puppets of the AI”.
And beyond “personal recommendations” there’s also the question of AIs running the systems we use, or in fact running the whole infrastructure of our civilization. Today we ultimately expect people to make large-scale decisions for our world—often operating in systems of rules defined by laws, and perhaps aided by computation, and even what one might call AI. But there may well come a time when it seems as if AIs could just “do a better job than humans”, say at running a central bank or waging a war.
One might ask how one would ever know if the AI would “do a better job”. Well, one could try tests, and run examples. But once again one’s faced with computational irreducibility. Yes, the particular tests one tries might work fine. But one can’t ultimately predict everything that could happen. What will the AI do if there’s suddenly a never-before-seen seismic event? We basically won’t know until it happens.
But can we be sure the AI won’t do anything “crazy”? Could we—with some definition of “crazy”—effectively “prove a theorem” that the AI can never do that? For any realistically nontrivial definition of crazy we’ll again run into computational irreducibility—and this won’t be possible.
Of course, if we’ve put a person (or even a group of people) “in charge” there’s also no way to “prove” that they won’t do anything “crazy”—and history shows that people in charge quite often have done things that, at least in retrospect, we consider “crazy”. But even though at some level there’s no more certainty about what people will do than about what AIs might do, we still get a certain comfort when people are in charge if we think that “we’re in it together”, and that if something goes wrong those people will also “feel the effects”.
But still, it seems inevitable that lots of decisions and actions in the world will be taken directly by AIs. Perhaps it’ll be because this will be cheaper. Perhaps the results (based on tests) will be better. Or perhaps, for example, things will just have to be done too quickly and in numbers too large for us humans to be in the loop.
But, OK, if a lot of what happens in our world is happening through AIs, and the AIs are effectively doing irreducible computations, what will this be like? We’ll be in a situation where things are “just happening” and we don’t quite know why. But in a sense we’ve very much been in this situation before. Because it’s what happens all the time in our interaction with nature.
Processes in nature—like, for example, the weather—can be thought of as corresponding to computations. And much of the time there’ll be irreducibility in those computations. So we won’t be able to readily predict them. Yes, we can do natural science to figure out some aspects of what’s going to happen. But it’ll inevitably be limited.
And so we can expect it to be with the “AI infrastructure” of the world. Things are happening in it—as they are in the weather—that we can’t readily predict. We’ll be able to say some things—though perhaps in ways that are closer to psychology or social science than to traditional exact science. But there’ll be surprises—like maybe some strange AI analog of a hurricane or an ice age. And in the end all we’ll really be able to do is to try to build up our human civilization so that such things “don’t fundamentally matter” to it.
In a sense the picture we have is that in time there’ll be a whole “civilization of AIs” operating—like nature—in ways that we can’t readily understand. And like with nature, we’ll coexist with it.
But at least at first we might think there’s an important difference between nature and AIs. Because we imagine that we don’t “pick our natural laws”—yet insofar as we’re the ones building the AIs we imagine we can “pick their laws”. But neither part of this is quite right. Because in fact one of the implications of our Physics Project is precisely that the laws of nature that we perceive are the way they are because we are observers who are the way we are. And on the AI side, computational irreducibility implies that we can’t expect to be able to determine the final behavior of the AIs just from knowing the underlying laws we gave them.
But what will the “emergent laws” of the AIs be? Well, just like in physics, it’ll depend on how we “sample” the behavior of the AIs. If we look down at the level of individual bits, it’ll be like looking at molecular dynamics (or the behavior of atoms of space). But typically we won’t do this. And just like in physics, we’ll operate as computationally bounded observers—measuring only certain aggregated features of an underlying computationally irreducible process. But what will the “overall laws of AIs” be like? Maybe they’ll show close analogies to physics. Or maybe they’ll seem more like psychological theories (superegos for AIs?). But we can expect them in many ways to be like large-scale laws of nature of the kind we know.
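A minimal way to see what a “computationally bounded observer” can and can’t measure is the rule 30 cellular automaton, a standard example of computational irreducibility. The sketch below (pure Python, on a cyclic row of cells so it stays finite) evolves it from a single black cell; the individual cells look effectively random, but an aggregated feature like the overall density of 1s is stable and measurable:

```python
def rule30_step(cells):
    """One step of the rule 30 cellular automaton: each new cell is
    left XOR (center OR right), with cyclic boundary conditions."""
    n = len(cells)
    return [cells[(i - 1) % n] ^ (cells[i] | cells[(i + 1) % n])
            for i in range(n)]

# Start from a single 1 on a ring of 201 cells
cells = [0] * 201
cells[100] = 1

for _ in range(100):
    cells = rule30_step(cells)

# The bit-level pattern is effectively unpredictable, but the
# aggregate density of 1s -- the kind of feature a computationally
# bounded observer might measure -- hovers around one half
density = sum(cells) / len(cells)
print(round(density, 2))
```

The specific density one finds is an empirical observation, not something derivable in closed form; that gap is exactly the gap between aggregate “laws” and the underlying irreducible computation.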
Still, there’s one more difference between at least our interaction with nature and with AIs. Because we have in effect been “co-evolving” with nature for billions of years—yet AIs are “new on the scene”. And through our co-evolution with nature we’ve developed all sorts of structural, sensory and cognitive features that allow us to “interact successfully” with nature. But with AIs we don’t have these. So what does this mean?
Well, our ways of interacting with nature can be thought of as leveraging pockets of computational reducibility that exist in natural processes—to make things seem at least somewhat predictable to us. But without having found such pockets for AIs, we’re likely to be faced with much more “raw computational irreducibility”—and thus much more unpredictability. It’s been a conceit of modern times that—particularly with the help of science—we’ve been able to make more and more of our world predictable to us, though in practice a large part of what’s led to this is the way we’ve built and controlled the environment in which we live, and the things we choose to do.
But for the new “AI world”, we’re effectively starting from scratch. And to make things predictable in that world may be partly a matter of some new science, but perhaps more importantly a matter of choosing how we set up our “way of life” around the AIs there. (And, yes, if there’s lots of unpredictability we may be back to more ancient points of view about the importance of fate—or we may view AIs as a bit like the Olympians of Greek mythology, duking it out among themselves and sometimes having an effect on mortals.)
Governance in an AI World
Let’s say the world is effectively being run by AIs, but let’s assume that we humans have at least some control over what they do. Then what principles should we have them follow? And what, for example, should their “ethics” be?
Well, the first thing to say is that there’s no ultimate, theoretical “right answer” to this. There are many ethical and other principles that AIs could follow. And it’s basically just a choice which ones should be followed.
When we talk about “principles” and “ethics” we tend to think more in terms of constraints on behavior than in terms of rules for generating behavior. And that means we’re dealing with something more like mathematical axioms, where we ask things like which theorems are true according to those axioms, and which are not. And that means there can be issues like whether the axioms are consistent—and whether they’re complete, in the sense that they can “determine the ethics of anything”. But now, once again, we’re face to face with computational irreducibility, here in the form of Gödel’s theorem and its generalizations.
And what this means is that it’s in general undecidable whether any given set of principles is inconsistent, or incomplete. One might “ask an ethical question”, and find that there’s a “proof chain” of unbounded length to determine what the answer to that question is within one’s specified ethical system, or whether there is even a consistent answer.
One might imagine that somehow one could add axioms to “patch up” whatever issues there are. But Gödel’s theorem basically says that it’ll never work. It’s the same story as so often with computational irreducibility: there’ll always be “new situations” that can arise, that in this case can’t be captured by a finite set of axioms.
OK, but let’s imagine we’re picking a collection of principles for AIs. What criteria could we use to do it? One might be that these principles won’t inexorably lead to a simple state—like one where the AIs are extinct, or have to keep looping doing the same thing forever. And there may be cases where one can readily see that some set of principles will lead to such outcomes. But most of the time, computational irreducibility (here in the form of things like the halting problem) will once again get in the way, and one won’t be able to tell what will happen, or successfully pick “viable principles” this way.
So this means that there are going to be a wide range of principles that we could in theory pick. But presumably what we’ll want is to pick ones that make AIs give us humans some sort of “good time”, whatever that might mean.
And a minimal idea might be to get AIs just to observe what we humans do, and then somehow imitate this. But most people wouldn’t consider this the right thing. They’d point out all the “bad” things people do. And they’d perhaps say “let’s have the AIs follow not what we actually do, but what we aspire to do”.
But where should we get these aspirations from? Different people, and different cultures, can have very different aspirations—with very different resulting principles. So whose should we pick? And, yes, there are pitifully few—if any—principles that we truly find in common everywhere. (Though, for example, the major religions all tend to share things like respect for human life, the Golden Rule, etc.)
But do we in fact have to pick one set of principles? Maybe some AIs can have some principles, and some can have others. Maybe it should be like different countries, or different online communities: different principles for different groups or in different places.
Right now that doesn’t seem plausible, because technological and commercial forces have tended to make it seem as if powerful AIs always have to be centralized. But I expect that this is just a feature of the present time, and not something intrinsic to any “human-like” AI.
So could everyone (and maybe every organization) have “their own AI” with its own principles? For some purposes this might work OK. But there are many situations where AIs (or people) can’t really act independently, and where there have to be “collective decisions” made.
Why is this? In some cases it’s because everyone is in the same physical environment. In other cases it’s because if there’s to be social cohesion—of the kind needed to support even something like a language that’s useful for communication—then there has to be a certain conceptual alignment.
It’s worth pointing out, though, that at some level having a “collective conclusion” is effectively just a way of introducing a certain computational reducibility to make it “easier to see what to do”. And potentially it can be avoided if one has enough computational capability. For example, one might assume that there has to be a collective conclusion about which side of the road cars should drive on. But that wouldn’t be true if every car had the computational capability to compute a trajectory that would, for example, optimally weave around other cars using both sides of the road.
But if we humans are going to be in the loop, we presumably need a certain amount of computational reducibility to make our world sufficiently comprehensible to us that we can operate in it. So that means there’ll be collective—“societal”—decisions to make. We might want to just tell the AIs to “make everything as good as it can be for us”. But inevitably there will be tradeoffs. Making a collective decision one way might be really good for 99% of people, but really bad for 1%; making it the other way might be pretty good for 60%, but pretty bad for 40%. So what should the AI do?
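To make the tradeoff concrete, here’s a toy calculation in Python. The utility numbers are invented purely for illustration; the point is just that which option counts as “best” flips depending on the aggregation principle one adopts:

```python
# A toy version of the tradeoff described in the text: one option is
# really good for 99% and really bad for 1%; the other is pretty good
# for 60% and pretty bad for 40%. The utility values are invented.

def outcomes(option):
    if option == "A":   # really good for 99 people, really bad for 1
        return [10] * 99 + [-100] * 1
    else:               # pretty good for 60, pretty bad for 40
        return [5] * 60 + [-5] * 40

def pick(principle):
    # Two classic aggregation principles: maximize the total utility,
    # or maximize the utility of the worst-off person.
    score = {"utilitarian": sum, "rawlsian": min}[principle]
    return max(["A", "B"], key=lambda o: score(outcomes(o)))

print(pick("utilitarian"))  # A (total: 890 vs. 100)
print(pick("rawlsian"))     # B (worst-off: -5 vs. -100)
```

A utilitarian total picks one option; a worst-off (“Rawlsian”) criterion picks the other. And nothing in the computation itself can say which principle is “right”.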
And, of course, this is a classic problem of political philosophy, and there’s no “right answer”. And in reality the setup won’t be as clean as this. It may be fairly easy to work out some immediate effects of different courses of action. But inevitably one will eventually run into computational irreducibility—and “unintended consequences”—and so one won’t be able to say with certainty what the ultimate effects (good or bad) will be.
But, OK, so how should one actually make collective decisions? There’s no perfect answer, but in the world today, democracy in one form or another is usually viewed as the best option. So how might AI affect democracy—and perhaps improve on it? Let’s assume first that “humans are still in charge”, so that it’s ultimately their preferences that matter. (And let’s also assume that humans are more or less in their “current form”: unique and unreplicable discrete entities that believe they have independent minds.)
The basic setup for current democracy is computationally quite simple: discrete votes (or perhaps rankings) are given (sometimes with weights of various kinds), and then numerical totals are used to determine the winner (or winners). And with past technology this was pretty much all that could be done. But now there are some new elements. Imagine not casting discrete votes, but instead using computational language to write a computational essay to describe one’s preferences. Or imagine having a conversation with a linguistically enabled AI that can draw out and debate one’s preferences, and eventually summarize them in some kind of feature vector. Then imagine feeding computational essays or feature vectors from all “voters” to some AI that “works out the best thing to do”.
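As a deliberately crude sketch of that last step, one might imagine something like this in Python, with invented numbers, and a single very contestable aggregation rule (taking the coordinate-wise mean of the “preference feature vectors” and picking the nearest option):

```python
# A crude sketch of "feature vector" voting. Each voter's preferences
# are summarized as a vector over three made-up policy dimensions;
# the aggregator picks the option closest to the population's mean
# vector. All numbers, and the aggregation rule itself, are invented.

voters = [
    [0.9, 0.1, 0.5],   # voter 1's preference feature vector
    [0.8, 0.3, 0.4],
    [0.1, 0.9, 0.6],
]

options = {"A": [0.8, 0.2, 0.5], "B": [0.2, 0.8, 0.5]}

# One (very contestable) aggregation rule: the coordinate-wise mean.
aggregate = [sum(col) / len(voters) for col in zip(*voters)]

def distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

winner = min(options, key=lambda o: distance(options[o], aggregate))
print(winner)  # A
```

Of course, the choice of features, of the distance measure, and of “mean” rather than some other combination rule are each already political-philosophy decisions in disguise.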
Well, there are still the same political philosophy issues. It’s no longer as simple as 60% of people voting for A and 40% for B, and so choosing A; things are much more nuanced. But one still won’t be able to make everyone happy all the time, and one has to have some base principles to know what to do about that.
And there’s a higher-order problem in having an AI “rebalance” collective decisions all the time based on everything it knows about people’s detailed preferences (and perhaps their actions too): for many purposes—like us being able to “keep track of what’s going on”—it’s important to maintain consistency over time. But, yes, one could deal with this by having the AI somehow also weigh consistency in figuring out what to do.
But while there are no doubt ways in which AI can “tune up” democracy, AI doesn’t seem—in and of itself—to deliver any fundamentally new solution for making collective decisions, and for governance in general.
And indeed, in the end things always seem to come down to needing some fundamental set of principles about how one wants things to be. Yes, AIs can be the ones to implement these principles. But there are many possibilities for what the principles could be. And—at least if we humans are “in charge”—we’re the ones who are going to have to come up with them.
Or, in other words, we need to come up with some kind of “AI constitution”. Presumably this constitution should basically be written in precise computational language (and, yes, we’re trying to make it possible for the Wolfram Language to be used), but inevitably (as yet another consequence of computational irreducibility) there’ll be “fuzzy” definitions and distinctions that will rely on things like examples, “interpolated” by systems like neural nets. Maybe when such a constitution is created, there’ll be multiple “renderings” of it, which can all be applied whenever the constitution is used, with some mechanism for picking the “overall conclusion”. (And, yes, there’s potentially a certain “observer-dependent” multicomputational character to this.)
But whatever its detailed mechanisms, what should the AI constitution say? Different people and groups of people will definitely come to different conclusions about it. And presumably—just as there are different countries, etc. today with different systems of laws—there’ll be different groups that want to adopt different AI constitutions. (And, yes, the same issues about collective decision making apply again when those AI constitutions have to interact.)
But given an AI constitution, one has a base on which AIs can make decisions. And on top of this one imagines a huge network of computational contracts that are autonomously executed, essentially to “run the world”.
And this is perhaps one of those classic “what could possibly go wrong?” moments. An AI constitution has been agreed on, and now everything is being run efficiently and autonomously by AIs that are following it. Well, once again, computational irreducibility rears its head. Because however carefully the AI constitution is drafted, computational irreducibility implies that one won’t be able to foresee all its consequences: “unexpected” things will always happen—and some of them will undoubtedly be things “one doesn’t like”.
In human legal systems there’s always a mechanism for adding “patches”—filling in laws or precedents that cover new situations that have come up. But if everything is being autonomously run by AIs there’s no room for that. Yes, we as humans might characterize “bad things that happen” as “bugs” that could be fixed by adding a patch. But the AI is just supposed to be operating—essentially axiomatically—according to its constitution, so it has no way to “see that it’s a bug”.
Similar to what we discussed above, there’s an interesting analogy here with human law versus natural law. Human law is something we define and can modify. Natural law is something the universe just provides us (notwithstanding the issues about observers discussed above). And by “setting an AI constitution and letting it run” we’re basically forcing ourselves into a situation where the “civilization of the AIs” is some “independent stratum” in the world, that we essentially have to take as it is, and adapt to.
Of course, one might wonder if the AI constitution could “automatically evolve”, say based on what’s actually seen to happen in the world. But one quickly returns to the exact same issues of computational irreducibility, where one can’t predict whether the evolution will be “right”, etc.
So far, we’ve assumed that in some sense “humans are in charge”. But at some level that’s an issue for the AI constitution to define. It’ll have to define whether AIs have “independent rights”—just like humans (and, in many legal systems, some other entities too). Closely related to the question of independent rights for AIs is whether an AI can be considered autonomously “responsible for its actions”—or whether such responsibility must always ultimately rest with the (presumably human) creator or “programmer” of the AI.
Once again, computational irreducibility has something to say. Because it implies that the behavior of the AI can go “irreducibly beyond” what its programmer defined. And in the end (as we discussed above) this is the same basic mechanism that allows us humans to effectively have “free will” even when we’re ultimately operating according to deterministic underlying natural laws. So if we’re going to claim that we humans have free will, and can be “responsible for our actions” (as opposed to having our actions always “dictated by underlying laws”) then we’d better claim the same for AIs.
So just as a human builds up something irreducible and irreplaceable in the course of their life, so can an AI. As a practical matter, though, AIs can presumably be backed up, copied, etc.—which isn’t (yet) possible for humans. So somehow their individual instances don’t seem as valuable, even if the “last copy” might still be valuable. As humans, we might want to say “those AIs are something inferior; they shouldn’t have rights”. But things are going to get more entangled. Imagine a bot that no longer has an identifiable owner but that’s successfully befriending people (say on social media), and paying for its underlying operation from donations, ads, etc. Can we reasonably delete that bot? We might argue that “the bot can feel no pain”—but that’s not true of its human friends. But what if the bot starts doing “bad” things? Well, then we’ll need some form of “bot justice”—and pretty soon we’ll find ourselves building a whole human-like legal structure for the AIs.
So Will It End Badly?
OK, so AIs will learn what they can from us humans, then they’ll fundamentally just be running as autonomous computational systems—much like nature runs as an autonomous computational system—sometimes “interacting with us”. What will they “do to us”? Well, what does nature “do to us”? In a kind of animistic way, we might attribute intentions to nature, but ultimately it’s just “following its rules” and doing what it does. And so it will be with AIs. Yes, we might think we can set things up to determine what the AIs will do. But in the end—insofar as the AIs are really making use of what’s possible in the computational universe—there’ll inevitably be computational irreducibility, and we won’t be able to foresee what will happen, or what consequences it will have.
So will the dynamics of AIs in fact have “bad” effects—like, for example, wiping us out? Well, it’s perfectly possible nature could wipe us out too. But one has the feeling that—extraterrestrial “accidents” aside—the natural world around us is at some level enough in some kind of “equilibrium” that nothing too dramatic will happen. But AIs are something new. So maybe they’ll be different.
And one possibility might be that AIs could “improve themselves” to produce a single “apex intelligence” that would in a sense dominate everything else. But here we can see computational irreducibility coming to the rescue. Because it implies that there can never be a “best at everything” computational system. It’s a core result of the emerging field of metabiology: whatever “achievement” you specify, there’ll always be a computational system somewhere out there in the computational universe that will exceed it. (A simple example: whatever upper bound you specify on the time a Turing machine can take to halt, there’s always a Turing machine whose halting time exceeds it.)
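One can make the Turing machine example concrete with a few lines of Python. The machine below is the standard 2-state, 2-symbol “busy beaver” champion: among all such machines that halt, it runs longest (6 steps) before halting. As more states are allowed, the maximum halting time grows faster than any computable function, which is the sense in which no bound one specifies can cover all machines:

```python
# A tiny Turing-machine simulator. The machine encoded below is the
# standard 2-state, 2-symbol busy beaver champion: the longest-running
# halting machine of its size, at 6 steps. For larger machines the
# maximum halting time outgrows every computable bound.

def run(machine, max_steps=1000):
    """Run from a blank tape; return steps taken, or None if over budget."""
    tape, head, state = {}, 0, "A"
    for step in range(max_steps):
        if state == "H":                       # halt state reached
            return step
        symbol = tape.get(head, 0)             # blank cells read as 0
        write, move, state = machine[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return None

bb2 = {  # (state, read) -> (write, move, next_state)
    ("A", 0): (1, "R", "B"), ("A", 1): (1, "L", "B"),
    ("B", 0): (1, "L", "A"), ("B", 1): (1, "R", "H"),
}
print(run(bb2))  # 6
```

The `max_steps` budget in the simulator is itself a little lesson: since one can’t in general decide halting, any practical harness has to give up at some point, and for some machines “gave up” is all one will ever learn.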
So what this means is that there’ll inevitably be a whole “ecosystem” of AIs—with no single winner. Of course, while that might be an inevitable final outcome, it might not be what happens in the shorter term. And indeed the current tendency to centralize AI systems has a certain danger of AI behavior becoming “unstabilized” relative to what it would be with a whole ecosystem of “AIs in equilibrium”.
And in this situation there’s another potential concern as well. We humans are the product of a long struggle for life played out over the course of the history of biological evolution. And insofar as AIs inherit our attributes we might expect them to inherit a certain “drive to win”—perhaps also against us. And perhaps this is where the AI constitution becomes important: to define a “contract” that supersedes what AIs might “naturally” inherit from effectively observing our behavior. Eventually we can expect the AIs to “independently reach equilibrium”. But in the meantime, the AI constitution can help break their connection with our “competitive” history of biological evolution.
Preparing for an AI World
We’ve talked quite a bit about the ultimate future course of AIs, and their relation to us humans. But what about the short term? How today can we prepare for the growing capabilities and uses of AIs?
As has been true throughout history, people who use tools tend to do better than those who don’t. Yes, you can go on doing by direct human effort what has now been successfully automated, but except in rare cases you’ll increasingly be left behind. And what’s now emerging is an extremely powerful combination of tools: neural-net-style AI for “immediate human-like tasks”, along with computational language for deeper access to the computational universe and computational knowledge.
So what should people do with this? The highest leverage will come from figuring out new possibilities—things that weren’t possible before but have now “come into range” as a result of new capabilities. And as we discussed above, this is a place where we humans are inevitably central contributors—because we’re the ones who must define what we consider has value for us.
So what does this mean for education? What’s worth learning now that so much has been automated? I think the fundamental answer is how to think as broadly and deeply as possible—calling on as much knowledge and as many paradigms as possible, and particularly making use of the computational paradigm, and ways of thinking about things that directly connect with what computation can help with.
In the course of human history a lot of knowledge has been accumulated. But as ways of thinking have advanced, it’s become unnecessary to learn directly that knowledge in all its detail: instead one can learn things at a higher level, abstracting out many of the specific details. But in the past few decades something fundamentally new has come on the scene: computers and the things they enable.
For the first time in history, it’s become realistic to truly automate intellectual tasks. The leverage this provides is completely unprecedented. And we’re only just starting to come to terms with what it means for what and how we should learn. But with all this new power there’s a tendency to think something must be lost. Surely it must still be worth learning all those intricate details—that people in the past worked so hard to figure out—of how to do some mathematical calculation, even though Mathematica has been able to do it automatically for more than a third of a century?
And, yes, at the right time it can be interesting to learn those details. But in the effort to understand and best make use of the intellectual achievements of our civilization, it makes much more sense to leverage the automation we have, and treat those calculations just as “building blocks” that can be put together in “finished form” to do whatever it is we want to do.
One might think this kind of leveraging of automation would just be important for “practical purposes”, and for applying knowledge in the real world. But actually—as I have personally found repeatedly to great benefit over the decades—it’s also crucial at a conceptual level. Because it’s only through automation that one can get enough examples and experience that one’s able to develop the intuition needed to reach a higher level of understanding.
Confronted with the rapidly growing amount of knowledge in the world there’s been a tremendous tendency to assume that people must inevitably become more and more specialized. But with increasing success in the automation of intellectual tasks—and what we might broadly call AI—it becomes clear there’s an alternative: to make more and more use of this automation, so people can operate at a higher level, “integrating” rather than specializing.
And in a sense this is the way to make the best use of our human capabilities: to let us concentrate on setting the “strategy” of what we want to do—delegating the details of how to do it to automated systems that can do it better than us. But, by the way, the very fact that there’s an AI that knows how to do something will no doubt make it easier for humans to learn how to do it too. Because—although we don’t yet have the complete story—it seems inevitable that with modern techniques AIs will be able to successfully “learn how people learn”, and effectively present things an AI “knows” in just the right way for any given person to absorb.
So what should people actually learn? Learn how to use tools to do things. But also learn what things are out there to do—and learn facts to anchor how you think about those things. A lot of education today is about answering questions. But for the future—with AI in the picture—what’s likely to be more important is to learn how to ask questions, and how to figure out what questions are worth asking. Or, in effect, how to lay out an “intellectual strategy” for what to do.
And to be successful at this, what’s going to be important is breadth of knowledge—and clarity of thinking. And when it comes to clarity of thinking, there’s again something new in modern times: the concept of computational thinking. In the past we’ve had things like logic, and mathematics, as ways to structure thinking. But now we have something new: computation.
Does that mean everyone should “learn to program” in some traditional programming language? No. Traditional programming languages are about telling computers what to do in their terms. And, yes, lots of humans do this today. But it’s something that’s fundamentally ripe for direct automation (as examples with ChatGPT already show). And what’s important for the long term is something different. It’s to use the computational paradigm as a structured way to think not about the operation of computers, but about both things in the world and abstract things.
And crucial to this is having a computational language: a language for expressing things using the computational paradigm. It’s perfectly possible to express simple “everyday things” in plain, unstructured natural language. But to build any kind of serious “conceptual tower” one needs something more structured. And that’s what computational language is about.
One can see a rough historical analog in the development of mathematics and mathematical thinking. Up until about half a millennium ago, mathematics basically had to be expressed in natural language. But then came mathematical notation—and from it a more streamlined approach to mathematical thinking, that eventually made possible all the various mathematical sciences. And it’s now the same kind of thing with computational language and the computational paradigm. Except that it’s a much broader story, in which for basically every field or occupation “X” there’s a “computational X” that’s emerging.
In a sense the point of computational language (and all my efforts in the development of the Wolfram Language) is to be able to let people get “as automatically as possible” to computational X—and to let people express themselves using the full power of the computational paradigm.
Something like ChatGPT provides “human-like AI” in effect by piecing together existing human material (like billions of words of human-written text). But computational language lets one tap directly into computation—and gives the ability to do fundamentally new things, that immediately leverage our human capabilities for defining intellectual strategy.
And, yes, while traditional programming is likely to be largely obsoleted by AI, computational language is something that provides a permanent bridge between human thinking and the computational universe: a channel in which the automation is already done in the very design (and implementation) of the language—leaving in a sense an interface directly suitable for humans to learn, and to use as a basis to extend their thinking.
But, OK, what about the future of discovery? Will AIs take over from us humans in, for example, “doing science”? I, for one, have used computation (and many things one might think of as AI) as a tool for scientific discovery for nearly half a century. And, yes, many of my discoveries have in effect been “made by computer”. But science is ultimately about connecting things to human understanding. And so far it’s taken a human to knit what the computer finds into the whole web of human intellectual history.
One can certainly imagine, though, that an AI—even one rather like ChatGPT—could be quite successful in taking a “raw computational discovery” and “explaining” how it might relate to existing human knowledge. One could also imagine that the AI would be successful at identifying what aspects of some system in the world could be picked out to describe in some formal way. But—as is typical for the process of modeling in general—a key step is to decide “what one cares about”, and in effect in what direction to go in extending one’s science. And this—like so much else—is inevitably tied into the specifics of the goals we humans set ourselves.
In the emerging AI world there are plenty of specific skills that won’t make sense for (most) humans to learn—just as today the advance of automation has obsoleted many skills from the past. But—as we’ve discussed—we can expect there to “be a place” for humans. And what’s most important for us humans to learn is in effect how to pick “where next to go”—and where, out of all the infinite possibilities in the computational universe, we should take human civilization.
Afterword: Looking at Some Actual Data
OK, so we’ve talked quite a bit about what might happen in the future. But what about actual data from the past? For example, what’s been the actual history of the evolution of jobs? Conveniently, in the US, the Census Bureau has records of people’s occupations going back to 1850. Of course, many job titles have changed since then. Switchmen (on railroads), chainmen (in surveying) and sextons (in churches) aren’t really things anymore. And telemarketers, aircraft pilots and web developers weren’t things in 1850. But with a bit of effort, it’s possible to more or less match things up—at least if one aggregates into large enough categories.
So here are pie charts of different job categories at 50-year intervals:
And, yes, in 1850 the US was firmly an agricultural economy, with just over half of all jobs being in agriculture. But as agriculture got more efficient—with the introduction of machinery, irrigation, better seeds, fertilizers, etc.—the fraction dropped dramatically, to just a few percent today.
After agriculture, the next biggest category back in 1850 was construction (along with other real-estate-related jobs, mainly maintenance). And this is a category that for a century and a half hasn’t changed much in size (at least so far), presumably because, even though there’s been greater automation, this has just allowed buildings to be more complex.
Looking at the pie charts above, we can see a clear trend towards greater diversification in jobs (and indeed the same thing is seen in the development of other economies around the world). It’s an old theory in economics that increasing specialization is related to economic growth, but from our point of view here, we might say that the very possibility of a more complex economy, with more niches and jobs, is a reflection of the inevitable presence of computational irreducibility, and the complex web of pockets of computational reducibility that it implies.
Beyond the overall distribution of job categories, we can also look at trends in individual categories over time—with each one in a sense providing a certain window onto history:
One can definitely see cases where the number of jobs decreases as a result of automation. And this happens not only in areas like agriculture and mining, but also for example in finance (fewer clerks and bank tellers), as well as in sales and retail (online shopping). Sometimes—as in the case of manufacturing—there’s a decrease of jobs partly because of automation, and partly because the jobs move out of the US (mainly to countries with lower labor costs).
There are cases—like military jobs—where there are clear “exogenous” effects. And then there are cases like transportation+logistics where there’s a steady increase for more than half a century as technology spreads and infrastructure gets built up—but then things “saturate”, presumably at least partly as a result of increased automation. It’s a somewhat similar story with what I’ve called “technical operations”—with more “tending to technology” needed as technology becomes more widespread.
Another clear trend is an increase in job categories associated with the world becoming an “organizationally more complicated place”. Thus we see increases in management, as well as administration, government, finance and sales (which all have recent decreases as a result of computerization). And there’s also a (somewhat recent) increase in legal.
Other areas with increases include healthcare, engineering, science and education—where “more is known and there’s more to do” (as well as there being increased organizational complexity). And then there’s entertainment, and food+hospitality, with increases that one might attribute to people leading (and wanting) “more complex lives”. And, of course, there’s information technology which takes off from nothing in the mid-1950s (and which had to be rather awkwardly grafted into the data we’re using here).
So what can we conclude? The data seems quite well aligned with what we discussed in more general terms above. Well-developed areas get automated and need to employ fewer people. But technology also opens up new areas, which employ additional people. And—as we might expect from computational irreducibility—things generally get progressively more complicated, with additional knowledge and organizational structure opening up more “frontiers” where people are needed. But even though there are sometimes “sudden inventions”, it still always seems to take decades (or effectively a generation) for there to be any dramatic change in the number of jobs. (The few sharp changes visible in the plots seem mostly to be associated with specific economic events, and—often related—changes in government policies.)
But in addition to the different jobs that get done, there’s also the question of how individual people spend their time each day. And—while it certainly doesn’t live up to my own (rather extreme) level of personal analytics—there’s a certain amount of data on this that’s been collected over the years (by getting time diaries from randomly sampled people) in the American Heritage Time Use Study. So here, for example, are plots based on this survey for how the amount of time spent on different broad activities has varied over the decades (the main line shows the mean—in hours—for each activity; the shaded areas indicate successive deciles):
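For what it’s worth, the summary behind plots like these is computationally simple. Here’s a sketch in Python, using randomly generated stand-in “diary” data rather than the actual survey data, of computing the mean and decile boundaries for one activity:

```python
import random
import statistics

# Stand-in for time-diary data: hours per day reported for one
# activity by 1000 sampled people. The numbers are randomly
# generated for illustration, clipped to the physical 0-24 range.
random.seed(0)
hours = [min(24.0, max(0.0, random.gauss(3.0, 2.0))) for _ in range(1000)]

mean = statistics.mean(hours)                 # the "main line" in a plot
deciles = statistics.quantiles(hours, n=10)   # 9 cut points -> 10 bands

print(round(mean, 2))
print([round(d, 1) for d in deciles])
```

The shaded bands in such plots are then just the regions between successive decile cut points, drawn for each year the survey was taken.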
And, yes, people are spending more time on “media & computing”, some mixture of watching TV, playing videogames, etc. Housework, at least for women, takes less time, presumably mostly as a result of automation (appliances, etc.). (“Leisure” is basically “hanging out” as well as hobbies and social, cultural, sporting events, etc.; “Civic” includes volunteer, religious, etc. activities.)
If one looks specifically at people who are doing paid work
one notices several things. First, the average number of hours worked hasn’t changed much in half a century, though the distribution has broadened somewhat. For people doing paid work, media & computing hasn’t increased significantly, at least since the 1980s. One category in which there is systematic increase (though the total time still isn’t very large) is exercise.
What about people who—for one reason or another—aren’t doing paid work? Here are corresponding results in this case:
Not so much increase in exercise (though the total times are larger to begin with), but now a significant increase in media & computing, with the average recently reaching nearly 6 hours per day for men—perhaps as a reflection of “more of life going online”.
But looking at all these results on time use, I think the main conclusion is that over the past half century, the ways people (at least in the US) spend their time have remained rather stable—even as we’ve gone from a world with almost no computers to a world in which there are more computers than people.