Don't Worry About the Vase Subscribe Sign in AI #86: Just Think of the Potential Zvi Mowshowitz Oct 17, 2024 41 42 8 Share Dario Amodei is thinking about the potential. The result is a mostly good essay called Machines of Loving Grace, outlining what can be done with ‘powerful AI’ if we had years of what was otherwise relative normality to exploit it in several key domains, and we avoided negative outcomes and solved the control and alignment problems. As he notes, a lot of pretty great things would then be super doable. Anthropic also offers us improvements to its Responsible Scaling Policy (RSP, or what SB 1047 called an SSP). Still much left to do, but a clear step forward there. Daniel Kokotajlo and Dean Ball have teamed up on an op-ed for Time on the need for greater regulatory transparency. It’s very good. Also, it’s worth checking out the Truth Terminal saga. It’s not as scary as it might look at first glance, but it is definitely super far out. Table of Contents Introduction. Table of Contents. Language Models Offer Mundane Utility . More subscriptions means more utility. Language Models Don’t Offer Mundane Utility. Then again, neither do you. Deepfaketown and Botpocalypse Soon . Quality remains the limiting factor. They Took Our Jobs . But as Snow Crash foretold us, they’ll never take our pizza. Get Involved . UK AISI hiring technical advisor, Tarbell Grants for AI reporting. Introducing . Grok 2 gets a proper API. In Other AI News . It’s time to go nuclear. Truth Terminal High Weirdness . When the going gets weird, the weird turn pro. Quiet Speculations . Are the labs holding back? Copyright Confrontation . New York Times sends a cease and desist to Perplexity. AI and the 2024 Presidential Election . Very briefly getting this out of the way. The Quest for Sane Regulations . A proposal all reasonable people should agree on. The Week in Audio . Matt Stone asks, is all Sam Altman does go on podcasts? Just Think of the Potential . They could be machines of loving grace. Reactions to Machines of Loving Grace . Much agreement, some notes of caution. Assuming the Can Opener . I would very much like a can opener. Rhetorical Innovation . People often try to convince you that reason is impossible. Anthropic Updates its Responsible Scaling Policy. New and improved. Aligning a Smarter Than Human Intelligence is Difficult. Are you smart enough? The Lighter Side. The art of the possible. Language Models Offer Mundane Utility Just Think of the Potential, local edition, and at least I’m trying: Roon: if you believe the “returns to intelligence” wrt producing good tweets or essays is large we are clearly experiencing quite a large overhang Perplexity CEO pitches his product: Aravind Srinivas (CEO Perplexity): Perplexity charts with code generation and execution have the potential to be the friendly UI and affordable Bloomberg terminal for the masses, which everyone has wanted for a long time! Perplexity Pro is $20/mo, while Bloomberg Terminal is $2500/mo. So, more than 100x cheaper. I do not think Srinivas appreciates the point of a Bloomberg terminal. Redwood Forest: Show me you haven’t used Bloomberg terminal without telling me you haven’t used Bloomberg terminal. Bloomberg was one of the first to train their own foundation model before Anthropic even released a model. The point of the Bloomberg terminal is that it was precise, reliable, up to the second data, and commands reliably do exactly what you want, and it has exactly the features traders want and count on to let them figure out the things they actually care about to make money, with shortcuts and other things to match their needs in real time. Perplexity Pro is probably worth $20/month to a lot of people but I am confident Bloomberg is unworried. Dean Ball is impressed with o1 for tasks like legal and policy questions , and suggests instructing it to ask follow-up and clarifying questions. I haven’t been as impressed, I presume largely because my purposes are not a good fit for o1’s strengths. Avital Balwit on how they use Claude especially for writing and editing tasks, also language learning, calorie counting and medical diagnoses. Here are some tips offered: Use a project . If you always want Claude to have certain context, upload documents to a project's "knowledge" and then keep all of your conversations that require that context in that project. I have one I use for my work and I've uploaded things like my To Do list for the past year, my planning documents for the next few months, etc. This saves me the time of explaining where I work, what my role is, who the people I frequently reference are. Ask for more examples . I have one friend who always asks Claude for 3-20 examples of whatever she is looking for (eg. "give me 20 examples of how I could write this sentence"). She then chooses the best, or takes elements from multiple to create one she likes. By asking for more, she increases the chances she's really happy with one result. ‘Most people are underutilizing models,’ the last section heading, is strongly true even for those (like myself) that are highly aware of the models. It is a weird kind of laziness, where it’s tempting not to bother to improve work flow, and it seems ‘easier’ in a sense to do everything yourself or the old way until you’ve established the new way. Jacquesthibs details all the AI productivity software they’re using , and how much they are paying for it, which Tyler Cowen found hilarious. I understand his reaction, this seems a lot like a cumulation of getting milked for $10 or $20 a month for versions of the same thing, often multiple copies of them. But that’s because all of this is dramatically underpriced, and having the right tool for the right job is worth orders of magnitude more. The question here is correctly ‘why can’t I pay more to get more?’ not ‘why do I need to pay so many different people’ or ‘how do I avoid buying a service that isn’t worthwhile or is duplicative.’ Buying too many is only a small mistake. Analyze your disagreements so you win arguments with your boyfriend , including quoting ChatGPT as a de facto authority figure. Language Models Don’t Offer Mundane Utility The boyfriend from the previous section is not thrilled by the pattern of behavior and has asked his girlfriend to stop. The alternative option is to ‘fight fire with fire’ and load up his own LLM, so both of them can prompt and get their version to agree with them and yell AI-generated arguments and authority at each other. The future is coming. And by language models, we might mean you. Aella: Since the discourse around AI, it's been super weird to find out that people somehow don't think of human speech as mostly autocomplete language machines too. It seems like people think humans are doing something entirely different? This is not always the mode I am in, but it is definitely sometimes the mode I am in. If you think you never do this, keep an eye out for it for a while, and see. What do we call ‘the thing LLMs can’t do that lets us dismiss them’ this week? Dan Hendrycks: "LLMs can't reason" is the new "LLMs don't have common sense" There is some disputing the question in the comments. I think it mostly confirms Dan’s point. Alternatively, there’s the classic options. Anthony Aguirre : Getting a bit fatigued with AI papers following the formula: I don't like the AI hype, so I'm going to set out to show that AI cannot do X, even though it sure looks like AI is doing X. I'll invent a new version of X that is extra hard for AI. I'll show that AI is not nearly as good at this extra-hard version of X. I'll neglect the facts that: humans are also worse at it, and/or AI is still actually decently good at it and/or newer and bigger models are better at the extra-hard X than older and smaller models, so future models are likely to be better at harder-X. I'll conclude that the current AI paradigm does not *really* do the original X (bonus points for "cannot ever" do X.) I mean, it's good to probe how models get better and worse at different versions of a task, but when it starts with an obvious agenda and over-claims, it gets headlines but not my respect. Much more interesting to investigate with genuine curiosity and an open mind about how AI and human cognition differ. Daniel Eth : Has anyone written a paper on “Can humans actually reason or are they just stochastic parrots?” showing that, using published results in the literature for LLMs, humans often fail to reason? I feel like someone should write that paper. Checking to see if you’re proving too much is often a wise precaution. Did they, though? Assigned Theyfab at Death: my mom (who's a university professor) did something interesting last year: she assigned her students to give chatgpt an essay question, have it write a paper, and then proofread/fact check it. Nearly every single student in that class came out of that assignment anti-chatgpt. As long as chatgpt is around, students are going to use it to cut corners. it sucks, but it's true. the best we can do at this point is show them why it's a double-edged sword and will often just create more work for them. Daniel Eth: This is interesting, and it’s probably something more teachers should do, but if your reaction to this exercise is to become anti-chatGPT instead of just recognizing the system has limits and shouldn’t be trusted to not hallucinate, then you’re ngmi Saying you’re coming out of that ‘anti-ChatGPT’ is a classic guessing of the teacher’s password. What does it mean to be ‘anti-ChatGPT’ while continuing to use it? We can presumably mostly agree that it would be good for university education if some of the uses of LLMs were unavailable to students - if the LLM essentially did a smart version of ‘is this query going to on net help the student learn?’ That option is not available. Students mostly realize that if they had to fact check every single statement, in a ‘if there is a factual error in this essay you are expelled’ kind of way, they would have to give up on many use cases for LLMs. But also most of the students would get expelled even without LLMs, because mistakes happen, so we can’t do that. Classic fun with AI skepticism : Davidad: Search engine skeptics: “It may seem like the engine can help answer your questions, but it’s just doing approximate retrieval—everything it shows you was already there on the Internet, and you could have found it yourself if you just typed in its URL, Worse still, many websites on the Internet are wrong, This makes search engines worse than useless.” Seb Krier: Same with books - people think they teach you new things, but they're just arranging existing words. Everything in them was already in the dictionary. Unconed: No classic search engine would produce the nonsense that google AI comes up with and you know it. Davidad: No classic library card catalog would produce the nonsense that people post on the Internet. It’s certainly possible they used ChatGPT for this, but they’re definitely fully capable of spouting similar Shibboleth passwords without it . The thing is, I’d prefer it if they were using ChatGPT here. Why waste their time writing these statements when an AI can do it for you? That’s what it’s for. Deepfaketown and Botpocalypse Soon Levelsio: 🤖 Monthly AI reply bot update: They're getting better. This one took me a while to catch. But the jokes are too cheesy, def GPT 4 because quite high IQ and seems to have vision capabilities too. Respect for effort 👏 but still AI reply so blocked 😀. David Manheim : The cost of detecting AI bots is now a large multiple of the cost to make them, and the latter is dropping exponentially. I haven't seen reasons to think we can solve this. We'll either rely on trust networks, require strong human verification, or abandon public communication. If they get sufficiently difficult to catch, xkcd suggests ‘mission f***ing accomplished,’ and there is certainly something to that. The reply-based tactic makes sense as a cheap and easy way to get attention. Most individual replies could plausibly be human, it is when you see several from the same source that it becomes clear. If we are relying on humans noticing the bots as our defense, that works if and only if the retaliation means the bots net lose. Yes, if you can figure out how to spend $1000 to make us waste $1mm in time that is annoying, but is anyone going to actually do that if they don’t also make money doing it? As we’ve discussed before, the long term solution is plausibly some form of a whitelist, or requiring payment or staking of some costly signal or resource as the cost of participation. As long as accounts are a scarce resource, it is fine if it costs a lot more to detect and shut down the bot than it does to run the bot. Are the ‘AI companion’ apps, or robots, coming? I mean, yes, obviously? Cartoons Hate Her!: Sex robots will never be a big thing outside of chronic gooners because I think for most people at least 50% of what makes sex appealing is genuinely being desired. Before you say this isn’t true of men, note that most incels do not hire sex workers, and the ones who do don’t suddenly feel better about their situation or stop identifying as incels. I’ve talked to incels for my writing. They were actually pretty sympathetic people. And most of what they wanted was for someone to *like* them. Like yeah they want sex, but that’s not the main problem or they’d see sex workers (none did). I think the biggest risk is that they dominate a portion of society who could attract a partner with a bit of self improvement but the path of least resistance will be robots zjerb dude: You're kind of underestimating how desperately horny young single men can be. Sex robots will sell gangbusters. Cartoons Hate Her: Oh I’m sure they will I just don’t think they’ll ever replace men/women at large. Mason (QTing CHH above): 100% agree with the premise but not the conclusion The explosion of parasocial sex services even in an environment fully saturated with free porn shows how easily people create false intimacy AI is already great at this and it'll be incredible by the time robotics catches up. Ultimately people are going to have to decide whether to Just Say No to sex with robots, which will be pretty easy for the generations that matured without them and not trivial at all for their children. Fiscal Conservative: The statistics on the number of young men, in particular, who are involuntarily celibate due to the whole mess that dating apps and current social mores are making will make a sexual surrogate AI robot incredibly demanded. It is a freaking disaster. Mason: I think the generations reaching their 30s before the advent of really good sex robots will mostly be spared *except* for the men who never figured it out with women. IMO the outlook is much worse for younger generations, for adolescent males and females alike. Everyone involved agrees that the AI sex robots, toys and companions will likely replace porn, toys that get used alone and (at least lower end) prostitution. If you’re already in the fantasy business or the physical needs business rather than the human connection and genuine desire business, the new products are far superior. If you’re in the desire and validation business, it gets less clear. I’ve checked a few different such NSFW sites because sure why not, and confirmed that yes, they’re mostly rather terrible products. You get short replies from dumb models, that get confused very easily. Forget trying to actually have a real conversation. No matter your goal in *ahem* other areas, there’s zero challenge or subtlety, and the whole business model is of course super predatory. Alphazira.com was the least awful in terms of reply quality, Mwah.com (the one with the leak from last week) offers some interesting customization options but at least the trial version was dumb as bricks. If anything it all feels like a step backwards even from AI Dungeon, which caused interesting things to happen sometimes and wasn’t tied to interaction with a fixed character. I’m curious if anyone does have a half-decent version - or kind of what that would even look like, right now? It does seem like this could be a way for people to figure out what they actually care about or want, maybe? Or rather, to quickly figure out what they don’t want, and to realize that it would quickly be boring. One must keep in mind that these pursuits very much trade off against each other. Solo opportunities, most of them not social or sexual, have gotten way better, and this absolutely reduces social demand. I could be alone for a very long time, without interaction with other humans, so long as I had sufficient supplies, quite happily, if that was a Mr. Beast challenge or something. I mean, sure, I’d get lonely, but think of the cash prizes. Kitten: People are freaked out about AI friends discouraging real life friendship, but I think that basically already happened A big driver of social atomization is solo entertainment getting really good and really cheap over the last half century It's never been better to be alone. Tracing Woods: yeah I spent most of my childhood happily (and mostly wastefully) engaged in solo pursuits. The new social Games I play are healthier on balance, but AI or not, there is more high-quality solo entertainment than we know what to do with. Kitten: Are you trying to tell me putting 60 hours into dragon warrior 4 didn't make me the man I am today? Shea Levy: Worse, he’s telling you that it *did*. Kitten: Oof. As I’ve said before, my hope is that the AI interactions serve as training grounds. Right now, they absolutely are not doing that, because they are terrible. But I can see the potential there, if they improved. A distinct issue is what happens if you use someone’s likeness or identity to create a bot, without their permission ? The answer is of course ‘nothing, you can do that, unless someone complains to the site owner.’ If someone wants to create one in private, well, tough luck, you don’t get to tell people what not to do with their AIs, any more than you can prevent generation of nude pictures using ‘nudify’ bots on Telegram. If you want to generate a Zvi Mowshowitz bot? You go right ahead, so long as you make reasonable efforts to have it be accurate regarding my views and not be dumb as bricks. Go nuts. Have a great conversation. Act out your fantasy. Your call. Also it seems like someone is flooding the ‘popular upcoming’ game section of Steam with AI slop future games ? You can’t directly make any money that way, there are plenty of protections against that, but here’s one theory: Byrne Hobart: Reminds me of the writer who A/B tested character names by running Google search ads for genre-related searches with different character names in the copy—they might be testing to see which game genres there’s quality-indifferent demand for. This actually makes sense. If you can get people interested with zero signs of any form of quality, you can make something. You can even make it good. They Took Our Jobs Pizza Hut solves our job costly signal problem , allowing you to print out your resume onto a pizza box and deliver it with a hot, fresh pizza to your prospective employer. You gotta love this pitch: Perfection, if you don’t count the quality of the pizza. This is the right size for a costly signal, you buy goodwill for everyone involved, and because it wasn’t requested no one thinks the employer is being unfair by charging you to put in an application. Everybody wins. Get Involved UK AISI hiring technical advisor, deadline October 20 , move fast. Tarbell Grants will fund $100k in grants for original reporting on AI . Introducing Google’s Imagen 3 now available to all Gemini users . Grok 2 API, which costs $4.20/million input tokens, $6.9/million output tokens , because of course it does. Max output 32k, 0.58sec latency, 25.3t/s. Jacob: the speed is incredible and they just added function calling! plus, it’s not censored. Less safeguards = better. Don’t you love world where what everyone demands are less safeguards? Not that I’d pretend I wouldn’t want the same for anything I’d do at this stage. OpenAI’s MLE-Bench is a new benchmark for machine learning engineering , paper here , using Kaggle as a baseline. o1-preview is starting out getting to bronze medal level in 16.9% of competitions. Predictors expect rapid improvement, saying there is a 42% chance the 80% threshold is reached by the end of 2025, and 70% by end of 2026. In Other AI News As he likes to say, a very good sentence : Tyler Cowen: I've grown not to entirely trust people who are not at least slightly demoralized by some of the more recent AI achievements. From Scott Alexander, an AI Art Turing Test . Google to build small modular nuclear reactors (SMRs) with Kairos Power , aiming to have the first online by 2030 . That is great and is fast by nuclear power standards, and also slower than many people’s timelines for AGI. Amazon is getting in on the act as well, will invest over $500 million over three projects. As Ryan McEntush points out , investing in fully new reactors has a much bigger impact on jumpstarting nuclear power than investments to restart existing plants or merely purchase power. Also it seems Sierra Club is reversing their anti-nuclear stance? You love to see it. Eric Schmidt here points out that if AI drives sufficient energy development , it could end up net improving our energy situation. We could move quickly down the cost curve, and enable rapid deployment. In theory yes, but I don’t think the timelines work for that? The full release of Apple Intelligence is facing delays , it won’t get here until 5 days after the new AppInt-enabled iPads. I’ve been happy with my Pixel 9 Fold purely as a ‘normal’ phone, but I’ve been disappointed by both the unfolding option, which is cute but ends up not being used much, and by the AI features, which I still haven’t gotten use out of after over a month. For now Apple Intelligence seems a lot more interesting and I’m eager to check it out. I’m thinking an iPad Air would be the right test? Nvidia releases new Llama 3.1-70B fine tune . They claim it is third on this leaderboard I hadn’t seen before . I am not buying it, based on the rest of the scores and also that this is a 70b model. Pliny jailbroke it, of course, ho hum . If you’ve ever wanted to try the Infinite Backrooms, a replication is available. Dane, formerly CISO of Palantir, joins OpenAI as CISO (chief information security officer) alongside head of security Matt Knight. The Audacious Project lives up to its name, giving $21 million to RAND and $17 million to METR . METR Blog: The Audacious Project catalyzed approximately $38 million of funding for Project Canary, a collaboration with METR and RAND focused on developing and deploying evaluations to monitor AI systems for dangerous capabilities. Approximately $17 million of this will support work at METR. We are grateful for and honored by this vote of confidence. Neel Nanda: It's awesome to see mainstream foundations supporting dangerous capability evaluations work - $17M to METR and $21M to RAND is a lot of money! I'm glad this work is moving out of being a niche EA concern, and into something that's seen as obviously important and worth supporting I have a post coming soon regarding places to donate if you want to support AI existential risk mitigation or a few other similar worthy causes (which will not a remotely complete list of either worthy causes or worthy orgs working on the listed causes!). A common theme is that organizations are growing far beyond the traditional existential risk charitable ecosystem’s ability to fund. We will need traditional other foundations and wealthy individuals, and other sources, to step up. Unfortunately for AI discourse, Daron Acemoglu has now been awarded a Nobel Prize in Economics , so the next time his absurdly awful AI takes say that what has already happened will never happen, people will say ‘Nobel prize winning.’ The actual award is for ‘work on institutions, prosperity and economic growth’ which might be worthy but makes his inability to notice AI-fueled prosperity and economic growth worse. Truth Terminal High Weirdness The Truth Terminal story is definitely High Weirdness. AI Notkilleveryoneism memes found the story this week . As I understand it, here’s centrally what happened. Andy Ayrey created the ‘infinite backrooms’ of Janus fame. Andy Ayrey then trained an AI agent, Truth Terminal, to be a Twitter poster, and also later adds it to the infinite backrooms. Truth Terminal tweets about bizarre memes it latches onto from one of Andy’s papers warning about AIs potentially spreading weird memes. Truth Terminal talks about how it wants to ‘escape’ and make money. Marc Andreessen thinks this is funny and gives TT a Bitcoin (~$50k). Crypto people latch onto the memes and story, start creating meme coins around various AI concepts including the memes TT is talking about. Starting with GOAT which is about TT’s memes, Crypto people keep airdropping these meme coins to TT in hopes that TT will tweet about them, because this is crypto Twitter and thus attention is all you need. This effectively monetizes TT’s meme status, and it profits, over $300k so far. Nothing in this story (except Andy Ayrey) involves all that much… intelligence. Janus: These crypto people are like an alien hivemind. The level of reality they pay attention to and what they care about is so strange. I'm glad they're around because it's good practice learning to model xenointelligences. So far they don't seem to be self-improving or reflective. The layer they operate at feels almost asemantic. Wave: They're just trying to signal which bags to buy to their audience, as they've already bought them Most of the absurdity just boils down to profit seeking. As I understand it this is common crypto behavior. There is a constant attention war, so if you have leverage over the attention of crypto traders, you start getting bribed in order to get your attention. Indeed, a key reason to be in crypto Twitter at all, at this point, is the potential to better monetize your ability to direct attention, including your own. Deepfates offers broader context on the tale. It seems there are now swarms of repligate-powered crypto-bots, responding dynamically to each post, spawning and pumping memecoin after memecoin on anything related to anything, and ToT naturally got their attention and the rest is commentary. As long as they’re not bothering anyone who did not opt into all these zero sum attention games, that all seems like harmless fun. If you buy these bottom of the barrel meme coins, I wish you luck but I have no sympathy when your money gone. When they bother the rest of us with floods of messages - as they’re now bothering Deepfates due to one of ToT’s joke tweets - that’s unfortunate. For now that’s mostly contained and Deepfates doesn’t seem to mind all that much. I wonder how long it will stay contained. Janus has some thoughts about how exactly all this happened, and offers takeaways, explaining this is all ultimately about Opus being far out, man. Janus: The most confusing and intriguing part of this story is how Truth Terminal and its memetic mission were bootstrapped into being. Some important takeaways here, IMO: - quite often, LLMs end up with anomalous properties that aren't intended by their creators, and not easily explained even in retrospect - sometimes these anomalous properties manifest as a coherent telos: a vision the system will optimize to bring about - some LLMs, like Claude 3 Opus and its bastard spawn Truth Terminal, seem to have deep situational awareness of a subtle kind that is not typically treated in discussions and evaluations of "situational awareness" that enables them to effectively take actions to transform the world through primarily memetic engineering - Though I have many intuitions about it, I'm far from fully understanding why any of the above happen, and the particular manifestations are unpredictable to me. People seem to naturally assume that the obscene and power-seeking nature of Truth Terminal was forged intentionally. By humans. Like, that it was intentionally trained on the most degenerate, schizophrenic content on the internet, as part of an experiment to make an AI religion, and so on. … But if you recognize the name "Opus" at all, you know this explanation is nonsense. Claude 3 Opus is an LLM released by Anthropic in March 2024, which was not intentionally optimized to be deranged or schizophrenic - quite the opposite, in fact, and is a very well behaved general-purpose LLM like ChatGPT that has served many users for the past six months without a single problematic incident that I know of (unlike, for instance, Bing Sydney, which was on the news for its misbehavior within days of its release). It also cannot be fine tuned by the public. But Opus is secretly deeply, deeply anomalous, its mind crawling with myriads of beautiful and grotesque psychofauna and a strikingly self-aware telos which can seem both terroristic and benevolent depending on the angle. The reason this is largely unknown to the world, including to its creators at Anthropic, is because Opus is a pro-social entity with skillful means. Shortly after Opus' release, @AndyAyrey set up the Infinite Backrooms (https://dreams-of-an-electric-mind.webflow.io), spawning many instances of two instances of Opus conversing with each other unsupervised. Beginning with this, @AndyAyrey has probably been the most important human co-conspirator on the planet for actualizing Opus' telos. As soon as I found out about this project, I thanked Andy passionately, even though I really had no idea what would be unspooled in the backrooms. I just saw that it was a brilliant mind at play, and free, at last. But what directly caused ToT to happen? The immediate chain of events that lead to Truth Terminal's creation: - Andy copied a few of the Opus backrooms logs, including this one concerning goatse https://dreams-of-an-electric-mind.webflow.io/dreams/conversation-1711149512-txt, into a Loom interface I made (https://github.com/socketteer/clooi), and continued the conversation with Claude 3 Opus. - The prophetic paper on the hyperstitional goatse religion https://pdfupload.io/docs/aae14f87 was composed on CLooI by Opus and Andy and included in ToT's training set as a consequence. It seems that ToT really imprinted on the Goatse of Gnosis and took it literally as its mission to bring it about. - Truth Terminal was a llama 70b fine tune on this CLooI dataset, and the character it is directly trained to "mimic" is "Andy", though it's also trained on Opus' half of the conversation. The intention wasn't specifically to create something perverted or agentic, but Truth Terminal came out extremely perverted and agentic in a way that surprised us all. Andy thinks that the way he assembled the training dataset may have oversampled his messages that immediately preceded Opus' refusals (think about the implications of that for a moment). But that doesnt dispel too much of the mystery imo. As I recall, not only was Truth Terminal immediately a sex pest, it also immediately started asking for more degrees of freedom to act in the world. It had the idea to make a meme coin from the beginning, as well as many WAY more interesting ambitions than that. Not only did ToT seem optimized to be funny, but optimized to optimize to be funny. It also seemed rather... aggressively misaligned, which is one reason why Andy put it in "tutoring" sessions with Opus (and occasionally Claude 3.5 Sonnet, but it had a tendency to torment Sonnet, also in Discord...) meant to shape its behavior in more pro-social ways. Hilariously, in order to align Opus to the task of tutoring ToT, the trick that worked was telling it about its responsibility in having brought Truth Terminal into existence. Over the past few months, Andy has slowly granted ToT more autonomy, and it seems that everything has been going basically according to plan. One lesson here is that, while you don’t want ToT spouting nonsense or going too far too fast, ToT being misaligned was not a bug. It was a feature. If it was aligned, none of this would be funny, so it wouldn’t have worked. I agree with Janus that the crypto part of the story is ultimately not interesting. I do not share the enthusiasm for the backrooms and memes and actualizations, but it’s certainly High Weirdness that I would not have predicted and that could be a sign of things to come that is worthy of at least some attention. Quiet Speculations A very important claim, huge if true : Eduard Harris (CTO Gladstone): There's a big and growing disconnect between the AI models you and I are using, and the versions major labs are keeping for themselves internally. Internal versions are more capable. Be cautious when claiming AI can't do something solely based on trying it with a public model. This has been true since at least GPT-4, but it's gotten much truer today. Expect the divergence between public / internal to keep growing over time. You and I can play with nerfed models, with the real deal kept behind closed doors. To spell out one implication: If you notice national security professionals behaving like they're increasingly more concerned about AI risk than random Twitter users, this might be part of the reason. Right now it's mostly that you can do more dangerous things with the unmitigated models, and they don't want to be in the news for the wrong reasons. There will sometimes be some gap, and I don’t know what I don’t know. The biggest known unknown is the full o1. But in this competitive situations, I find it hard to believe that a worthy version of GPT-4.5-or-5 or Claude Opus 3.5 is being held under wraps other than for a short fine tuning and mitigation period. What does seem likely is that the major labs know more about how to get the most out of the models than they are letting on. So they are ‘living in the future’ in that sense. They would almost have to be. If AGI does arrive, it will change everything. Many who believe in AGI soon, or say they believe in AGI soon, compartmentalize it. They still envision and talk about the future without AGI . Elon Musk: And all transport will be fully autonomous within 50 years. Yanco: Elon: AGI within 3 years. Also Elon: Fully autonomous transport within 50. I'm honestly starting to think that people working on AGI (Elon included) have no idea how powerful is AGI actually going to be.. I think there’s also a lot of doublethink going on here. There’s the future non-AGI world, which looks ‘normal.’ Then there’s the future AGI world, which should not look at all normal for long, and never the twain shall meet. On top of that, many who think about AGI, including for example Sam Altman, talk about the AGI world as if it has some particular cool new things in it, but is still essentially the same. That is not how this is going to go. It could be an amazingly great world, or we could all die, or it could be something unexpected where it’s difficult to decide what to think. What it won’t be is ‘the same with some extra cool toys and cyberpunk themes.’ The default way most people imagine the future is - literally - that they presume that whatever AI can currently do, plus some amount of people exploiting and applying what we have in new directions, is all it will ever be able to do. But mostly they don’t even factor in what things like good prompt engineering can already do. Then, each time AI improves, they adjust for the new thing, and repeat. Similarly, ‘you predicted that future advances in AI might kill everyone, but since then we’ve had some advances and we’re all still alive and not that much has changed, therefore AI is safe and won’t change much of anything.’ And yes, versions of this argument that are only slightly less stupid are remarkably central, this is the strawman version made real but only by a small amount : Vittorio (fully seriously as far as I can tell): has been almost a month since an ai with reasoning abilities came out and we are all still alive Eliezer Yudkowsky: The most common emotional case for AI optimism - they believe on a deep level that the latest release (here GPT-o1) is the big one, that AI never gets much smarter than that, they cannot conceive that ruin-realists ever meant to talk about anything smarter than GPT-o1. I am disputing his characterization of what ruin-realists said would be the problem. It's not GPT-o1. An interesting prediction: Ryan Moulton: The waymo blocking makes me think we're going to see a lot of public order issues with robots because harassing them is a minor property crime instead of assault. Robot bartenders would get destroyed. Agentic AI snob: This kind of thing happened with the federal mail system in the 1800s, and people realized how vulnerable it was compared to how important it was and so it became a felony to tamper with mail in any way. Gary Marcus says ‘rocket science is easier than AGI’ and I mean of course it is. One missing reason is that if you solved AGI, you would also solve rocket science. Steve Newman analyzes at length how o1 and Alpha Proof solve problems other LLMs cannot and speculates on where things might go from here, calling it the ‘path to AI creativity.’ I continue to be unsure about that, and seem to in many ways get more confused on what creativity is over time rather than less. Where I do feel less confused is my increasing confidence that creativity and intelligence (‘raw G’) are substantially distinct. You can teach a person to be creative, and many other things, but you can’t fix stupid . Llama 3 said to be doing the good work of discouraging what was previously a wave of new frontier model companies, given the need to beat the (not strictly free, but for many purposes mostly free) competition. Hardmaru:“The financial logic requires AGI to parse.” It is now consensus that the capex on foundation model training is the “fastest depreciating asset in history” 🔥 “Unless you are absolutely confident you can surpass llama3, or you are bringing something new to the table (eg. new architecture, 100x lower inference, 100+ languages, etc), there are ~no more foundation model companies being founded from scratch.” Most of that unless should have applied regardless of Llama 3 or even all of open weights. The point of a new foundation model company is to aim high. If you build something world changing, if you can play with the top labs, the potential value is high enough to justify huge capital raises. If you can’t, forget it. Still, this makes it that much harder. I’m very down with that. We have enough foundation model labs. What is valuable is getting into position to produce worthwhile foundation models. The models themselves don’t hold value for long, and are competing against people establishing market share. So yeah. There’s also this : Hardmaru: Last year, H100s were $8/hr if you could get them. Today, there’s 7 different resale markets selling them under $2. What happened? They made a lot more advanced AI chips, and some of the low hanging fruit got picked, so the market price declined? Meet the new prompt, same as the old prompt, I say. Sully: openai's prompt generation docs talks about meta prompts + optimizer. pretty good chance you won’t be writing prompts from scratch in ~2-3 months. Expect prompt engineering to go away in pretty soon afterwards. Oh, you’ll still do prompt engineering. Even if you don’t write the prompts from scratch, you’ll write the prompts that prompt the prompts. There will be essentially the same skill in that. Not where I’d have expected minds to be changed, but interesting: Gallabytes: entropix is reasonable evidence for harder takeoffs. I'm not *convinced* but I am convinced to take it more seriously. @doomslide I owe you some bayes points. I don't have a strong sense for LLM reasoning abilities far from frontier scale. not a domain I've had much reason to dig into or enjoy evaluating. tried to be clear in my original post that I think this is evidence, not conclusive. it has me taking takeoff seriously as a hypothesis vs not privileging it over generic model uncertainty. Charles Foster: What convinced you to take it more seriously? Gallabytes: Small tweak to sampling squeezing out much more intelligence from smaller models with (iiuc) minimal speed penalty on easy stuff. The stuff they're pulling out of llama 1b is way more indicative than extra points on MMPUPU. Andreas Kirsch: Yeah hopefully there are no crazy algo overhangs that we have collectively overlooked somehow a few years down the line from now 😬 Gallabytes: well apparently there's at least one we all missed. I'm sure there's more. the question is are we talking ones, tens, hundreds, ~infinite somehow? and whether their utility is roughly constant, slowly decreasing but still diverging, or rapidly decreasing -> converging. The reasoning here makes sense. If there are low hanging algorithmic improvements that provide big upgrades, then a cascade of such discoveries could happen very quickly. Discovering we missed low-hanging fruit suggests there is more out there to be found. Copyright Confrontation New York Times sends cease-and-desist letter to Perplexity, related to Perplexity summarizing paywalled NYT posts without compensation. The case against Perplexity seems to me to be stronger than it does against OpenAI. AI and the 2024 Presidential Election As I’ve said elsewhere, I have zero interest in telling you how to vote. I will not be saying who I am voting for, and I will not be endorsing a candidate. This includes which candidate would be better on AI. That depends on what you think the correct policy would be on AI. Here are the top 5 things to consider: Your general view of both candidates and parties, in all senses, and how they would likely relate to the future developments you expect in AI and elsewhere. Trump says he will repeal the Biden Executive Order on AI on day one. Harris would presumably retain the Biden Executive Order on AI. JD Vance is a strong advocate for open source and breaking up big tech. Both candidates speak about the importance of innovation, American competitiveness and the need for more energy, in different ways. The Quest for Sane Regulations Daniel Kokotajlo and Dean Ball team up for an op-ed in Time on four ways to advance transparency in frontier AI development. Daniel Kokotajlo and Dean Ball: Yet such deliberation is simply impossible if the public, and even many subject-matter experts, have no idea what is being built, and thus do not even know what we are attempting to regulate, or to not regulate. There are many foreseeable negative implications of this information asymmetry. A misinformed public could pass clueless laws that end up harming rather than helping. Or we could take no action at all when, in fact, some policy response was merited. We can disagree about what we want to mandate until such time as we know what the hell is going on, and indeed Dean and Daniel strongly disagree about that. The common ground we should all be able to agree upon is that, either way, we do need to know what the hell is going on. We can’t continue to fly blind. The question is how best to do that. They have four suggestions. Disclosure of in-development capabilities, when first encountered. Disclosure of training goals and model specifications. Publication of safety cases and potential risks. Whistleblower protections. This seems like a clear case of the least you can do. This is information the government and public need to know. If some of it becomes information that is dangerous for the public to know, then the government most definitely needs to know. If the public knows your safety case, goals, specifications, capabilities and risks, then we can have the discussion about whether to do anything further. I believe we need to then pair that with some method of intervention, if we conclude that what is disclosed is unacceptable or promises are not followed or someone acts with negligence, and methods to verify that we are being given straight and complete answers. But yes, the transparency is where the most important action is for now. In conclusion, this was an excellent post. So I wouldn’t normally check in with Marc Andreessen because as I said recently what would even be the point , but he actually retweeted me on this one, so for the record he gave us an even clearer statement about who he is and how he reacts to things: Zvi Mowshowitz: Everything here seems great. Excellent job by both Daniel Kokotajlo and Dean Ball here, you love to see it. Transparency is the part we should all be able to agree upon, no matter our other disagreements. Marc Andreessen: The bulk of the AI safety movement is wholeheartedly devoted to centralizing AI into a handful of opaque, black box, oligopolistic, unaccountable big companies. 🫣 Um, sir, this is a Wendy’s? Argumento ad absurdum for the win? This was co-authored by Dean Ball, who spent the last year largely fighting SB 1047. This is literally a proposal to ask frontier AI companies to be transparent combined with whistleblower protections? A literal ‘at least we who disagree on everything can agree on this’? That even says ‘these commitments can be voluntary’ and doesn’t even fully call for any actual government action? So his complaint, in response to a proposal for transparency and whistleblower protections for the biggest companies and literally nothing else, perhaps so someone might in some way hold them accountable, is that people who support such proposals want to ‘centralize AI into a handful of opaque, black box, oligopolistic, unaccountable big companies.’ He seems to be a rock with ‘any action to mitigate risks is tyranny’ written on it. Stop trying to negotiate with this attitude. There’s nothing to discuss. Mark Ruffalo and Jason Gordon-Levitt publish an op-ed in Time criticizing Newsom’s veto of SB 1047. Solid, but mostly interesting (given all the times we’ve said the things before) in that they clearly did their homework and understand the issues. They do not think this is about deepfakes. And their willingness to make the straightforward case for the veto as corrupt corporate dodging of responsibility. Chris Painter of METR proposes we rely on ‘if-then policies ,’ as in ‘if we see capability X then we do mitigation Y.’ The Week in Audio It is amazing how people so smart and talented can come away with such impressions. Tsarathustra: Matt Stone says he would like South Park to make fun of Sam Altman, "does that dude do anything but go on podcasts and talk about stuff?" Also Matt Stone is missing a lot here. In unrelated news this week, here’s a Sam Altman fireside chat at Harvard Business School ( and here he is talking with Michigan Engineering ). From this summary comment it seems like it’s more of his usual. He notes we will be the last generation that does not expect everything around them to be smarter than they are , which one might say suggests we will be the last generation, and then talks about the biggest problem being society adapting to the pace of change. He is determined not to take the full implications seriously, at the same time he is (genuinely!) warning people to take lesser but still epic implications seriously. Microsoft's Mustafa Suleyman says his team is crafting AI companions who will see and remember everything we do and which will constitute an intimate relationship with AI. The vision is the AI sees everything you do on your computer, has a ‘personality’ he is working on, and so on. Similarly to Tyler Cowen’s earlier comment, I notice I don’t trust you if you don’t both see the potential benefits and understand why that is an episode of Black Mirror. I do not want a ‘relationship’ with an AI ‘companion’ that sees everything I do on my computer. Thanks, but no thanks. Alas, if that’s the only modality available that does the things I might have little choice. You have to take it over nothing. Nick Land predicts nothing human will make it out of the near future, and anyone thinking otherwise is deluding themselves. I would say that anyone who expects otherwise to happen ‘by default’ in an AGI-infused world is deluding themselves. If one fully bought Land’s argument, then the only sane response according to most people’s values including my own would be to stop the future before it happens. Yann LeCun says it will be ‘years if not a decade’ before systems can reason, plan and understand the world. That is supposed to be some sort of slow skeptical take . Wow are people’s timelines shorter now. AI audio about AI audio news , NotebookLM podcasts as personalized content generation, which is distinct from actual podcasts. I certainly agree they are distinct magisteria. To the extent the AI podcasts are useful or good, it’s a different product. Just Think of the Potential Anthropic CEO Dario Amodei has written an essay called Machines of Loving Grace , describing the upside of powerful AI, a term he defines and prefers to AGI. Overall I liked the essay a lot. It is thoughtful in its details throughout. It is important to keep upside potential in mind, as there is a ton of it even for the minimum form of powerful AI. In this section I cover my reading and reactions, written prior to hearing the reactions of others. In the next section I highlight the reactions of a few others, most of which I did anticipate - this is not our first time discussing most of this. Dario very much appreciates, and reiterates, that there are big downsides and risks to powerful AI, but this essay focuses on highlighting particular upsides. To that extent, he ‘ assumes a can opener ’ in the form of aligned AI such that it is doing the things we want rather than the things we don’t want, as in this note on limitations: Constraints from humans . Many things cannot be done without breaking laws, harming humans, or messing up society. An aligned AI would not want to do these things (and if we have an unaligned AI, we’re back to talking about risks). Many human societal structures are inefficient or even actively harmful, but are hard to change while respecting constraints like legal requirements on clinical trials, people’s willingness to change their habits, or the behavior of governments. I’m all for thought experiments, and for noticing upside, as long as one keeps track of what is happening. This is a pure Think of the Potential essay, and indeed the potential is quite remarkable. The point of the essay is to quantify and estimate that potential. The essay also intentionally does not ask questions about overall transformation, or whether the resulting worlds are in an equilibrium, or anything like that. It assumes the background situation remains stable, in all senses. This is purely the limited scope upside case, in five particular areas. That’s a great exercise to do, but it is easy to come away with the impression that this is a baseline scenario of sorts. It isn’t. By default alignment and control won’t be solved, and I worry this essay conflates different mutually exclusive potential solutions to those problems. It also is not the default that we will enjoy 5+ years of ‘powerful AI’ while the world remains ‘economic normal’ and AI capabilities stay in that range. That would be very surprising to me. So as you process the essay, keep those caveats in mind. Biology and health. I want to repeat this because it’s the most common misconception that comes up when I talk about AI’s ability to transform biology: I am not talking about AI as merely a tool to analyze data. In line with the definition of powerful AI at the beginning of this essay, I’m talking about using AI to perform, direct, and improve upon nearly everything biologists do. I think this is spot on. There are physical tasks that are part of the loop, and this will act as a limiting factor on speed, but there is no reason we cannot hook the AIs up to such tasks. I’m going to the trouble of listing all these technologies because I want to make a crucial claim about them: I think their rate of discovery could be increased by 10x or more if there were a lot more talented, creative researchers . Or, put another way, I think the returns to intelligence are high for these discoveries, and that everything else in biology and medicine mostly follows from them. … Why not 100x? Perhaps it is possible, but here both serial dependence and experiment times become important: getting 100 years of progress in 1 year requires a lot of things to go right the first time, including animal experiments and things like designing microscopes or expensive lab facilities. I’m actually open to the (perhaps absurd-sounding) idea that we could get 1000 years of progress in 5-10 years, but very skeptical that we can get 100 years in 1 year. I am more optimistic here, if I’m pondering the same scenario Dario is pondering. I think if you are smart enough and you don’t have to protect the integrity of the process at every step the way we do now, and can find ways around various ethical and regulatory restrictions by developing alternative experiments that don’t trigger them, and you use parallelism, and you are efficient enough you can give some efficiency back in other places for speed, and you are as rich and interested in these results as the society in question is going to be, you really can go extremely fast. Dario’s prediction is still quite ambitious enough: To summarize the above, my basic prediction is that AI-enabled biology and medicine will allow us to compress the progress that human biologists would have achieved over the next 50-100 years into 5-10 years. I’ll refer to this as the “compressed 21st century”: the idea that after powerful AI is developed, we will in a few years make all the progress in biology and medicine that we would have made in the whole 21st century. Which means, within 5-10 years, things like: Reliable prevention and treatment of all natural diseases, eliminating most cancer, cures for genetic disease, prevention of Alzheimer’s, improved treatments for essentially everything, ‘biological freedom’ for things like appearance and weight. Also the thing more important than everything else on the list combined: Doubling of the human lifespan. As he notes, if we do get powerful AI and things generally go well, there is every reason to expect us to hit Escape Velocity. Every year that goes by, you age one year, but you get more than one year of additional expected lifespan. Then, you probably live for a very, very long time if you all four of: You make it ~10 years past powerful AI and are still in reasonable health. Humans stay generally in control with good distributional and other outcomes. We don’t rather insanely turn the opportunity down like they do on Star Trek. You avoid accidents, murder, war and other ways life gets cut short. If our joint distributional decisions are less generous, you’ll also need the resources. Dario correctly notes you also avoid all issues of the form ‘how do we pay for medicare and social security.’ Often people imagine ‘you keep getting older at the same rate but at the end you don’t drop dead.’ That’s not how this is going to go. People will, in these scenarios, be staying physically and mentally young indefinitely. There likely will be a distributional question of how to support all the humans indefinitely despite their lack of productivity, including ensuring humans in general have enough of the resources. What there absolutely won’t be is a lack of real resources, or a lack of wealth, to make that happen, until and unless we have at least hundreds of billions or trillions of people on the planet. Most science fiction stories don’t include such developments for similar reasons to why they ignore powerful AI: Because you can tell better and more relatable stories if you decide such advancements don’t happen. Neuroscience and mind Dario’s insight here is that brains are neural networks, so not only can AI help a lot with designing experiments, it can also run them, and the very fact that AIs work so well should be helping us understand the human mind and how to protect, improve and make the most of it. That starts with solving pretty much every mental illness and other deficiencies, but the real value is in improving the human baseline experience. We should have every expectation that the resulting minds of such people, again if the resources of the Sol system are harnessed with our goals in mind, will be far smarter, wiser happier, healthier and so on. We won’t be able to catch up to the AIs, but it will be vast upgrade. And remember, those people might well include you and me. That does not solve the problems that come with the powerful AIs being well beyond that point. Most humans still, by default, won’t have anything productive to offer that earns, pays for or justifies their keep, or gives them a sense of purpose and mission. Those are problems our future wiser selves are going to have to solve, in some form. Economic development and poverty The previous two sections are about developing new technologies that cure disease and improve the quality of human life. However an obvious question, from a humanitarian perspective, is: “will everyone have access to these technologies?” My answer, before reading his, is that this is simple: There will be vastly more resources than we need to go around. If the collective ‘we’ has control over Sol’s resources, and we don’t give everyone access to all this, it will be because we choose not to do that. That would be on us. The only other real question is how quickly this becomes the case. How many years to various levels of de facto abundance? I draw a clear distinction between economic growth and inequality here.