Images from https://chatgpt.com/c/6705d33a-9a84-800b-8ddd-f52c7dfd7db2 + Wikipedia + DeepMind’s AlphaProof writeup

Thesis: in humans, creativity is an exploratory process requiring a diverse set of tools. Those tools are not simply skills to be acquired; they require developing muscle memory for a specific problem domain. AlphaProof and o1 shed some light on how AIs will get there: through the generation of large amounts of relevant training data. Progress will be gradual, somewhat domain-specific, and idiosyncratic. However, computer strengths (e.g. speed / brute force, breadth of knowledge) can compensate – contributing to the idiosyncrasy.

Creativity requires a large toolbox

Mention test-time compute.

“Are you telling me I will be able to dodge bullets?” “I’m telling you, when you’re ready, you won’t have to.” AIs may still have to dodge, but they may simply be fast enough to do it: force can make up for lack of technique.

(For another interesting bit of progress toward creative problem solving, see this writeup of PlanSearch.) See discussion of PlanSearch in https://thegradientpub.substack.com/p/update-83-ai-music-fraud-and-plansearch

See “How Deep does Intelligence Go” in TextEdit

Also: Siméon (@Simeon_Cps), Dec 15, 2024: “Sounds about right. Maths has all the ingredients: data generation (theorem provers + Lean), verifiability, and a closed system that doesn’t require any real-world onboarding.” (https://x.com/Simeon_Cps/status/1868411560768811147?t=bZkhvW38tzhEnpgL8CFjHQ&s=03)

Matt asks a great question: why did OpenAI (with o1) target math before, say, a sales rep? From my conversation with Matt: when can agents do 10 steps in a messy RAG-ish context?
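The “more tokens at inference time” idea can be sketched as best-of-N sampling against a verifier – the simplest form of test-time compute. This is a toy illustration, not any lab’s actual method; `generate` and `score` here are hypothetical stand-ins (a noisy guesser and a verifier that rewards closeness to the true answer):

```python
import random

def best_of_n(generate, score, n):
    """Sample n candidate answers; keep the one the verifier scores highest.

    Spending more inference compute (larger n) can only improve the best
    score found, which is the core of the test-time-compute trade-off.
    """
    best = generate()
    for _ in range(n - 1):
        candidate = generate()
        if score(candidate) > score(best):
            best = candidate
    return best

# Toy stand-ins: the "model" draws noisy guesses at the true answer 1.0,
# and the "verifier" rewards being close to it.
generate = lambda: random.gauss(1.0, 1.0)
score = lambda x: -abs(x - 1.0)

random.seed(0)
one_shot = best_of_n(generate, score, 1)   # a single sample
random.seed(0)
best_64 = best_of_n(generate, score, 64)   # 64x the inference compute
```

Because both runs start from the same seed, the 64-sample answer is guaranteed to score at least as well as the single-sample one; in expectation it scores much better. The same shape (sample many, pick via a verifier) is why domains with cheap verifiability, like math with Lean, are such a good fit.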
Review https://twitter.com/MFarajtabar/status/1844456893605191804

Boaz Barak (@boazbaraktcs), Oct 11, 2024: “This is very interesting paper, but disagree with hypothesis that it shows that ‘current LLMs are not capable of genuine logical reasoning.’ There is a confounder here: many top LLMs are *chat models*. Chat is very different from math exams. Chats are messy, and to do a good…” (https://x.com/boazbaraktcs/status/1844763538260209818?t=4i54UKxp2_R6CYFm1mmz2w&s=03)

Nathan Lambert (@natolambert), Oct 15, 2024: “You spend more tokens on inference, your scores go up, it’s a simple business. This is clever work to make it simple to implement.” (https://x.com/natolambert/status/1846195544458514561?t=-8t7_j62mtGGhRAlG71m6w&s=03)

Software needs to function correctly; a marketing campaign should be aesthetically consistent and interesting while conveying the right message to the target audience.

We know the pace of improvement for models built on human training data, like GPT-4. We don’t yet know the trend for models built on synthetic data, as used in AlphaProof and o1.

Here’s another example. Last year, I asked GPT-4 to solve a real-world problem we faced while building Google Docs, and it failed miserably. o1-mini nailed it, although I gave it an important hint (GPT-4 was hopeless even with multiple hints). Without that hint, o1-mini did not manage to figure it out.

Creativity Doesn’t Require Novelty

There was a certain amount of creativity in the IMO proof I presented in my last post. But it contained nothing truly new; I was just gluing together known techniques and ideas. The creativity was in the combination – specifically, in finding a combination that worked. Can I point to an example of “creativity” that used a completely existing thing, where the only cleverness was in realizing that thing was applicable? Most (all?) human creativity can equally well be characterized as mixing and matching things that have come before.
Lucas acknowledges that Star Wars was influenced by Kurosawa movies, etc. https://claude.ai/chat/4d6ea2d6-598c-4a59-8e0f-fe0593009af4 shows ideas that are at least as “creative” as most movies.

A Lot of “Creative” AI Work is Crap

The flip side of arguing that AI can’t be creative because it only interpolates is getting overly excited about some piece of AI output merely because it is novel. [Zvi] Colin Fraser offers a skeptical review of the recent paper about LLMs generating novel research ideas. (A lot of non-AI creative work is also, of course, crap.)

Don’t tell me AI (LLMs?) can’t be creative unless you can give an operational (?) definition of “creative” and a coherent argument for why AIs (LLMs?) can’t meet that definition – an argument that doesn’t assume its conclusion (“they can only reproduce what’s in their training data”) or prove too much (humans, made out of neurons, couldn’t be creative either).

Conversely, be wary of getting excited about an AI doing a “creative” thing (writing a scientific paper) until it does it genuinely well – and watch out for false positives: benchmarks leaking into the training data, cherry-picking, questionable scoring methods used for the scientific papers, etc.