Posts from 2024

Five Most Productive Years: What Happened and What’s Next

So… What Happened?

Today is my birthday—for the 65th time. Five years ago, on my 60th birthday, I did a livestream where I talked about some of my plans. So… what happened? Well, what happened was great. And in fact I’ve just had the most productive five years of my life. Nine books. 3,939 pages of writings (1,283,267 words). 499 hours of podcasts and 1,369 hours of livestreams. 14 software product releases (with our great team). Oh, and a bunch of big—and beautiful—ideas and results.

It’s been wonderful. And unexpected. I’ve spent my life alternating between technology and basic science, progressively building a taller and taller tower of practical capabilities and intellectual concepts (and sharing what I’ve done with the world). Five years ago everything was going well, and making steady progress. But then there were the questions I never got to. Over the years I’d come up with a certain number of big questions. And some of them, within a few years, I’d answered. But others I never managed to get around to.

And five years ago, as I explained in my birthday livestream, I began to think “it’s now or never”. I had no idea how hard the questions were. Yes, I’d spent a lifetime building up tools and knowledge. But would they be enough? Or were the questions just not for our time, but only perhaps for some future century?

Continue reading

What’s Really Going On in Machine Learning? Some Minimal Models

The Mystery of Machine Learning

It’s surprising how little is known about the foundations of machine learning. Yes, from an engineering point of view, an immense amount has been figured out about how to build neural nets that do all kinds of impressive and sometimes almost magical things. But at a fundamental level we still don’t really know why neural nets “work”—and we don’t have any kind of “scientific big picture” of what’s going on inside them.

The basic structure of neural networks can be pretty simple. But by the time they’re trained up with all their weights, etc. it’s been hard to tell what’s going on—or even to get any good visualization of it. And indeed it’s far from clear even what aspects of the whole setup are actually essential, and what are just “details” that have perhaps been “grandfathered” all the way from when computational neural nets were first invented in the 1940s.
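To make concrete just how simple that basic structure can be, here is a minimal sketch of a tiny fully connected network evaluated step by step. It is written in Python with NumPy purely for illustration; the layer sizes, random weights, and choice of ReLU activation are assumptions of this sketch, not details taken from the post.

    # A minimal sketch of the "basic structure" of a feedforward neural net,
    # written from scratch so every piece is visible. Sizes, weights, and the
    # activation function are illustrative assumptions only.
    import numpy as np

    rng = np.random.default_rng(0)

    def layer(x, w, b):
        # One fully connected layer: weighted sum, bias, then a simple
        # nonlinearity (ReLU here, chosen only for concreteness).
        return np.maximum(0, w @ x + b)

    # Two tiny layers: 3 inputs -> 4 hidden units -> 1 output.
    w1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
    w2, b2 = rng.normal(size=(1, 4)), np.zeros(1)

    x = np.array([0.2, -1.0, 0.5])
    hidden = layer(x, w1, b1)
    output = w2 @ hidden + b2   # linear output layer
    print(hidden, output)

The structure itself fits in a few lines; the difficulty the post points to is that once the weights have been set by training, inspecting arrays like w1 and w2 tells you very little about what the network is actually doing.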

Well, what I’m going to try to do here is to get “underneath” this—and to “strip things down” as much as possible. I’m going to explore some very minimal models—that, among other things, are more directly amenable to visualization. At the outset, I wasn’t at all sure that these minimal models would be able to reproduce any of the kinds of things we see in machine learning. But, rather surprisingly, it seems they can.

Continue reading