25 Comments
MD's avatar

I have a single data point to share here: in Humanity's Last Exam, the physics question (example 8/8) is definitely not "extremely difficult even for specialized human experts". It's the setup for an elliptical pendulum. This exact system was one of the homework problems in my theoretical mechanics course, second year of undergrad. It takes some effort with the calculus if you don't know the solution ahead of time, but the solution is duplicated in probably hundreds of textbooks all over the training data for these models. Given such a textbook with the general solution, the specific question asked here could be solved by a high schooler capable of manipulating expressions.

I cannot tell whether this is the case for the problems in other disciplines, but I wouldn't be surprised if it were.

I think these benchmarks are, if anything, exposing how structured school exercises are, compared to what "adult" scientists do in their actual jobs. They're also exposing how huge the internet is and how many things have already been done, since they provide a way to search and interpolate this database.

But all of this also shows how dangerous it is to talk of the accomplishments of Einstein when you don't know what he actually *did*.

Steve Newman's avatar

Thanks for the perspective on the HLE physics question. This jibes with a recent (and recommended!) essay on the FrontierMath benchmark: https://lemmata.substack.com/p/what-i-wish-i-knew-about-frontiermath.

Can you say more about Einstein? I can guess at what you might be saying but I'm not certain.

MD's avatar

I'll admit I was a bit smug writing that line, and I didn't have anything specific in mind. Where I was going generally is that often in discussions around AI (especially on Substack), one sees people talking about intelligence in the abstract, without any grounding in actual subject matter. (For example, these days with o3, one often sees the System 1 / System 2 distinction thrown around, like the discussion up to this comment (https://news.ycombinator.com/item?id=42485938#42492865).) Aside from coding, most of the benchmarks are on topics the average debater is not involved in, and it's hard to judge what is or isn't impressive. Take something like the piano question you linked to elsewhere (https://x.com/deanwball/status/1871424965230379465): without being a historian of music, it's hard to make anything of that, but people readily try to.

But let me do my penance for mentioning Einstein, because there is something to be said here. Using Einstein as a benchmark for intelligence is immediately evocative in the same way E = mc^2 is a symbol for a genius idea, but there are lots of different things that Einstein did that are impressive in different ways. An "Einstein level of talent" could mean doing any of those individual things, or it could mean being the kind of person who finds and does all of them.

- Twice, with special relativity and the concept of the photon, his contributions were brilliant because they took a hypothesis that had already been more-or-less worked out, took seriously the conceptual shift it suggested, and used it to go further.

-- Planck developed the hypothesis of the light quantum in 1900 to solve the ultraviolet catastrophe problem (https://en.wikipedia.org/wiki/Ultraviolet_catastrophe), but considered it a purely formal trick and unsuccessfully tried to get rid of the quantization. Einstein considered the photon a physical reality and used it to explain the photoelectric effect (which he got the Nobel prize for).

-- By 1905, when Einstein published the paper that founded special relativity, Lorentz and Poincaré had already worked out the formulae for time dilation and length contraction, were speculating about the relationship between "absolute" and "apparent" time, and had even found that the speed of light cannot be exceeded (see https://en.wikipedia.org/wiki/History_of_special_relativity#Lorentz's_1904_model for the state in 1904). But it was only Einstein who recognised that the concept of the absolute coordinate system can be dropped, that theories should be formulated independently of observers, and that everything follows from there. I think that it would not have been possible to get to general relativity without this framework.

- It seems (but as with all of alternate history it's hard to tell) that general relativity followed from there practically inevitably, and if it weren't for Einstein it would have been discovered by Hilbert weeks later. (see https://en.wikipedia.org/wiki/General_relativity_priority_dispute)

- Besides the findings he's famous for, Einstein also contributed to the explanation of Brownian motion and to the later development of quantum mechanics (which he never accepted, but his attempts at demolishing the theory were helpful in constructing it -- see e.g. the EPR paradox (https://en.wikipedia.org/wiki/Einstein%E2%80%93Podolsky%E2%80%93Rosen_paradox)). He even invented a refrigerator with no moving parts (https://en.wikipedia.org/wiki/Einstein_refrigerator). There's a fascinating tendency in history up until about WWII for famous names to reappear in distant fields -- for example, one solution of Einstein's equations, a spacetime containing closed timelike curves ("a time machine"), was discovered by Kurt Gödel, otherwise famous for the incompleteness theorems in logic (see https://en.wikipedia.org/wiki/G%C3%B6del_metric). Talent was indeed apparently concentrated in a few people, though whether this is because of innate abilities, the right group (scene?) getting together, or something else entirely I don't know.

- And besides all of that, Einstein did a lot of this work in Germany in the early 20th century, and he was one of the few physicists who escaped the nationalist zeal, observing WWI "as personnel in a madhouse" (my GR textbook has a touching digression on this: https://utf.mff.cuni.cz/~semerak/GTR.pdf#subsection.8.1.5). This is not obvious: contrast, for example, that the Deutsche Physik program was led by Lenard and Stark, two Nobel winners caught up in the Nazi madness.

All of this is quite far away from AI, and today's problems are in any case of a whole different nature again, but still I think it's nice context to have for what genius looks like. I would love to see more AI research focused on how people work (not just with genius ideas, but also mapping the kind of routine tasks that are accessible to AI here and now), rather than isolated benchmarks that seem to build on how school suggests people *should* work.

Thanks for that essay, it goes in the same direction and poses some helpful questions!

Steve Newman's avatar

Thanks, this is a fascinating perspective (for me at least) on Einstein's famous accomplishments. I knew he had been building on previous work but didn't know a lot of these details, and the "took a hypothesis that had already been more-or-less worked out, took seriously the conceptual shift it suggested, and used it to go further" framing is very interesting. I may quote what you've written here in a future post, if I try to explore the nature of genius.

MD's avatar

Thanks! I would be careful with the "he had been building on previous work" part, since from my quick research for this comment (https://en.wikipedia.org/wiki/History_of_special_relativity#Electrodynamics_of_moving_bodies and https://en.wikipedia.org/wiki/Relativity_priority_dispute) it seems to be somewhat unclear what exactly Einstein was familiar with in 1905. It doesn't help that he didn't provide citations in the paper...

Edit: Oh, and this kind of conceptual shift is also very much a special case in physics. YMMV, but since the development of quantum mechanics around 1925, there hasn't really been any shift on such a massive scale, so it might not be all that helpful for today.
