Summarizer

LLM Input

llm/9db4e77f-8dd5-46da-972e-40d33f3399ef/topic-4-b8825ad7-1d65-4327-b8c5-6d5545b35001-input.json

prompt

The following is content for you to summarize. Do not respond to the comments—summarize them.

<topic>
Cost and Access Disparities # Analysis of the financial feasibility of running Opus 4.5 agents in parallel, noting that while Anthropic employees may have unlimited access, the cost for average users would be prohibitive due to token limits and API pricing.
</topic>

<comments_about_topic>
1. Well, in this case, using the methodology given, it's a hefty chunk of change in API credits; most people would require outside investment to spend that much.

2. Yes, Anthropic's product design is truly bad, as is their product strategy (hey, I know you just subscribed to Claude, but that isn't Claude Code, which you need to separately subscribe to, but you get access to Claude Code if you subscribe to a certain tier of Claude, but not the other way around. Also, you need to subscribe to Claude Code with an API key and not usage-based pricing or else you can't use Claude Code in certain ways. And I know you have Claude and Claude Code access, but actually you can't use Claude Code in Claude, sorry)

3. I rarely run 10 top-level sessions, but I often run multiple.

Here is one case, though:

I have a prototype Ruby compiler that long languished because I didn't have time. I recently picked up work on it again with Claude Code.

There are literally thousands of unimplemented methods in the standard library. While that has not been my focus so far, my next step for it is to make Claude work on implementing missing methods in 10+ sessions in parallel, because why not? While there are some inter-dependencies (e.g. code that would at least be better with more of the methods of the lowest level core classes already in place), a huge proportion are mostly independent.

In this case the rubyspec test suite is there to verify compliance. On top of that I have my own tests (does the compiler still compile itself, and do the self-tests still run when compiled with the self-compiled compiler?), so having 10+ sessions "pick off" missing pieces, make an attempt, see if it can make it pass, and move on works well.
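A minimal sketch of that fan-out pattern, assuming a shared task queue; `attempt_implementation` here is a hypothetical stand-in for a Claude Code session plus a rubyspec run, not anything from the actual setup:

```python
import concurrent.futures
import queue

def attempt_implementation(method_name: str) -> bool:
    # Hypothetical stand-in: pretend the spec passes for method names of
    # even length. Really this would launch an agent session and rubyspec.
    return len(method_name) % 2 == 0

def worker(tasks: "queue.Queue[str]") -> list:
    """Pull missing-method tasks until the queue is empty, record whether
    each attempt 'passed', and move on either way."""
    results = []
    while True:
        try:
            method = tasks.get_nowait()
        except queue.Empty:
            return results
        results.append((method, attempt_implementation(method)))

missing = ["String#center", "Array#rotate", "Hash#dig", "Enumerable#tally"]
tasks: "queue.Queue[str]" = queue.Queue()
for m in missing:
    tasks.put(m)

# Three parallel "sessions" draining the shared queue; queue.Queue is
# thread-safe, so each missing method is attempted exactly once.
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as pool:
    futures = [pool.submit(worker, tasks) for _ in range(3)]
    all_results = [r for f in futures for r in f.result()]
```

Because most of the methods are independent, the workers never need to coordinate beyond the queue itself, which is what makes the "10+ sessions in parallel" approach cheap to orchestrate.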

My main limitation is that I have already once run into the weekly limits of my (most expensive) Claude Max subscription, and I need it for client work too. I'm not willing to pay per token for API use on that project, since it's not immediately giving me a return.

(And yes, they're "slow" - but faster than me; if they were fast enough, then sure, it'd be nicer to have them run serially, the same way if you had time it's easier to get cohesive work if a single developer does all the work on a project instead of having a team try to coordinate)

4. Potentially, a lot of that isn't just code generation, it *is* requirements gathering, design iteration, analysis, debugging, etc.

I've been using CC for non-programming tasks and it's been pretty successful so far, at least for personal projects (bordering on the edge of non-trivial). For instance, I'll get a 'designer' agent coming up with a spec, and a 'design critic' to challenge the design and make the original agent defend their choices. They can ask open questions after each round and I'll provide human feedback. After a few rounds of this, we whittle it down to a decent spec and try it out after handing it off to a coding agent.

Another example from work: I fired off some code analysis to an agent with the goal of creating integration tests, and then ran a set of spec reviewers in parallel to check its work before creating the actual tickets.

My point is that there are a lot of steps involved in the whole product development process, and it isn't just "ship production code". And we can reduce the ambiguity/hallucinations/sycophancy by creating validation checkpoints (either tests, 'critic' agents to challenge designs/specs, or human QA/validation when appropriate)

The end game of this approach is you have dozens or hundreds of agents running via some kind of orchestrator churning through a backlog that is combination human + AI generated, and the system posts questions to the human user(s) to gather feedback. The human spends most of the time doing high-level design/validation and answering open questions.

You definitely incur some cognitive debt and risk it doing something you don't want, but that's part of the fun for me (assuming it doesn't kill my AI bill).

5. I would do the same thing if I could justify paying $200 per month for my hobby. But even with that, you will run into throttling and API/resource limits.

But AI agents need time. They need a little while for reading the source code, proposing the change, making the change, running the verification loop, creating the git commit, etc. That can be a minute, can be 10, and potentially a lot longer too.

So if your code base is big enough that you can work on different topics, you just do that:

- Fix this small bug in the UI when xy happens
- Add a new field to this form
- Cleanup the README with content x
- . . .

I'm an architect at work and have done product management on the side, as it's a very technical project. I have very little problem coming up with things to fix, enhance, clean up, etc. I have hard limits on my headcount.

I could easily do a handful of things in parallel and keep that in my head. Working memory might be limited, but working memory means something different than following 10 topics, especially if there are a few topics in between which just take time with the whole feedback loop.

But regarding your example of house cleaning: I have ADHD, and I sometimes work like this. Working on something, waiting for a build, and cleaning something in parallel.

What you are missing is practical experience with agents, and taking the time and energy to set something up for yourself; perhaps accessibility too?

We only got access to Claude Code at work at the end of last year.

6. Must be nice to have unquota’ed tokens to use with frontier AI (is this the case for Anthropic employees?). One thing I think is fascinating as we enter the Intellicene is the disproportionate access to AI. The ability to petition them to do what you want is currently based on monthly subscriptions, but will it change in the future? Who knows?

7. > (is this the case for Anthropic employees?)

It would be funny if the company paying software engineers $500K or more along with gold-plated stock options was limiting how much they could use the software their company was developing.

8. It is a long-standing policy at Netflix for employees to pay for their own subscriptions. It ensures that employees "live the member experience".

9. That’s for personal use at home.

I guarantee the engineers at Netflix who develop and test video streaming aren’t doing so on their family’s home Netflix plan.

10. Why is that funny? What company gives you unlimited resources? That doesn't scale. Google employees can't just demand a $10,000 workstation. It's reasonable to assume they have some guardrails, for both financial and stability reasons. Who knows… if it's unlimited now, will it stay that way forever? Probably unlimited in the same sense as unlimited PTO.

11. > Why is that funny? What company gives you unlimited resources?

Anthropic has raised tens of billions of dollars of funding.

Their number of employees is in the thousands. This isn't like Google.

Claude Code is what they're developing. The company is obviously going to encourage them to use it as much as possible.

Limiting how much the Claude Code lead can use Claude Code would be funny because their lead dev would have to stop mid-day and wait for his token quota window to reset before he can continue developing their flagship coding product. Not going to happen.

I'm strangely fascinated by the reaction in the comments, though. A lot of people here must have worked in oddly restrictive corporate environments to even think that a company like this would limit how much their own employees can use the company's own product to develop their own product.

12. I can't get a $10k workstation but if I used $10k/month on cloud compute it'd take a few months for anyone to talk to me about it and as long as I was actually using it for work purposes I wouldn't run into any consequences more severe than being told to knock it off if I couldn't convince people it was worth the cost.

13. If an employee has a business need for a $10k workstation, I'm fairly certain they'll get a $10k workstation.

Yes, accounting still happens. Guardrails exist. But quibbling over 2% of an SWE's salary when it's clear that the productivity increase will be significantly more than 2% would be... not a wise use of anybody's time.

14. Well, not only their software but also the hardware resources they're renting; but I agree they don't.

15. Tokens aren’t free.

16. When you work for the company supplying those tokens and you're working on the product that sells those tokens at scale, the company will let you use as many tokens as you want.

17. Owning the means of cognition is going to be more and more important, as it allows one to scale more than linearly.

Outsiders will be tied to limited or pay-per-use access, because owning the means of cognition will be a massive extractive economy.

18. > is this the case for Anthropic employees?

Not sure about all Anthropic employees, but that must definitely be the case for Boris Cherny.

19. Pretty sure I have seen them imply in one of the panel discussions on their YouTube channel (can't remember which) that they get unlimited use of the best models. I remember them talking about records for the most spent in a day or something.

20. It's not unlimited; compute allocation was one of the reasons for the coup at OpenAI.

21. Pretty sure that was scientists competing for 6 month training runs of new 100B+ parameter models, not coders burning through a couple of million tokens.

22. It is the case that Anthropic employees have no usage limits.
Some people do experiments where they spawn hundreds of Claude instances just to see if any of them succeed.

23. I'm afraid to ask, but since I've been very happy with Codex 5.2 CLI and can't imagine Claude Code doing better: why is Claude so loved around here?

Sure, I can spend $20 and figure it out, but I already pay $40/mo for two ChatGPT subs and that's enough to get me through a month.

Should I spend $20 to see for myself?

24. Thanks. The reason for my hesitancy is that I've heard that the $20 sub isn't enough for anything meaningful.

25. If you spend only $20 on Claude Code you will not get far; it will lock you out after about an hour of work due to session usage limits.

26. How much would you consider a good amount? I can't really afford more than $20 myself, but perhaps there's a better, more monetarily optimal workflow.

27. The Claude models are among the most expensive. It's easy to spend 30 EUR+ a day when providing it with a lot of context and documentation. Of course it can be argued that this money is worth it relative to salaries, but recently I've switched to kilocode myself after looking at different model pricings on openrouter: https://openrouter.ai/models?order=pricing-high-to-low There's just no reason to throw money away.

There are plenty of free (and also cheap ones) models you can use with just openrouter or kilocode (inexpensive less-shitty Cursor basically, https://kilocode.ai ).

With most things these free models are able to achieve great results and similarly to the expensive ones they need oversight and thorough code reviews. These days I'm barely paying anything for tokens monthly.

28. Why are you asking this? Just try it. It takes maybe fifteen minutes of your time. It's $20. There is no possible argument against $20 or fifteen minutes if the tool has a chance of being even just 10% better. You've spent more time typing the comment, and I responding, than it would take to... just try it.

29. How different are Codex and Claude Code from each other?
I have been using Codex for a few weeks, doing experiments related to data analysis and training models with some architecture modifications. I wouldn't say I have used it extensively, but so far my experience has been good. The only annoying part has been not being able to use the GPU in Codex without the `--sandbox danger-full-access` flag. Today I started using Claude Code and ran similar experiments. I find the interface quite similar to Codex; however, I hit the limit quite quickly in Claude Code. I will be exploring its features further. I would appreciate it if anyone could share their experience of using both tools.

30. Frankly, Claude Code is painfully slow, to the point that I get frustrated.

On large codebases I often find it taking 20+ minutes to do basic things like writing tests.

Way too often people say it takes 2 minutes for it to do a full PR. Yeah, but how big is the code base, actually?

I also have a coworker who uses about 10x more than everyone else, burning through credits, yet he is one of the lowest performers (closing in on around $1k worth of credits a day now).

31. $1,000.00 of credits per day?? $200,000 per year? Those are bonkers numbers for someone not performing at a high level (on top of their salary). Do you know what they are doing?

32. They should just be on the $200 a month Max plan

33. I don't understand how these setups scale long-term, even more so for the average user. The latter is relevant because, as he points out, his setup isn't that far out of reach of the average person; it's still fairly close to out-of-the-box Claude Code and Opus.

But between model qualities varying, the pricing, the timing, and the tools constantly changing, I think it's really difficult to build the institutional knowledge and setup that can be used beyond a few weeks.

In the era of AI, I don't think it's good enough to "have" a working product. It's also important to have all the other things that make a project way more productive, like stellar documentation, better abstractions, and clearer architecture. In terms of AI, there's got to be something better than just a markdown file with random notes. Like, what happens when an agent does something because it's picking something up from some random Slack convo, or some minor note in a 10k-line claude.md file? It just seems like the wild west, where basic ideas like additional surface area being a liability are ignored because we're too early in the cycle.

tl;dr If it's just pushing around typical mid-level code, then... I just think that's falling behind.

34. I'm a bit jealous. I would like to experiment with a similar setup, but 10x Opus 4.5 running practically non-stop must amount to a very high inference bill. Is it really worth the output?

From experimentation, I need to coach the models quite closely in order to get enough value. Letting them loose only works when I've given very specific instructions. But I'm using Codex and Clai; perhaps Claude Code is better.

35. I have a coworker who is basically doing this right now. He leads our team and is second place overall. He regularly runs Opus in parallel; he alone is burning through $1k worth of credits a day.

He is also one of our worst performers.

36. Yeah, doesn't this guy work for Anthropic? He'd get to use 10x Opus 4.5 for free.

37. That's just unrealistic. If I were to use it like this as an actual end user, I would get stopped by rate limits and those weekly/session limits instantly.

38. 10-15 parallel Opus 4.5 instances running almost full-time? Even if you could get it, what would be the monthly bill for that?
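For a rough sense of scale, here is a back-of-envelope estimate. Every number below is an assumption for illustration (Opus-class API list prices and per-agent token throughput), not something quoted in the thread:

```python
# Hypothetical Opus-class API prices, in $ per million tokens (assumed).
INPUT_PER_MTOK = 5.0
OUTPUT_PER_MTOK = 25.0

def monthly_bill(agents: int, hours_per_day: float, days: int,
                 in_mtok_per_hour: float, out_mtok_per_hour: float) -> float:
    """Estimated monthly API cost for `agents` parallel instances, given an
    assumed token throughput per agent-hour."""
    hourly = in_mtok_per_hour * INPUT_PER_MTOK + out_mtok_per_hour * OUTPUT_PER_MTOK
    return agents * hours_per_day * days * hourly

# e.g. 10 agents, 8 h/day, 22 workdays, 2M input + 0.1M output tokens per
# agent-hour (input dominates because the context is resent every turn).
estimate = monthly_bill(10, 8, 22, 2.0, 0.1)  # → 22000.0
```

Under these assumptions the bill lands in the tens of thousands of dollars per month, which is why the thread keeps circling back to flat-rate Max subscriptions rather than pay-per-token API use.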

39. One of my side projects has been to recover a K&R C computer algebra system from the 1980's, port to modern 64-bit C. I'd have eight tabs at a time assigned files from a task server, to make passes at 60 or so files. This nearly worked; I'm paused till I can have an agent with a context window that can look at all the code at once. Or I'll attempt a fresh translation based on what I learned.

With a $200 monthly Max subscription, I would regularly stall after completing significant work, but this workflow was feasible. I tried my API key for an hour once; it taught me to laugh at the $200 as quite a deal.

I agree that Opus 4.5 is the only reasonable use of my time. We wouldn't hire some guy off the fryer line to be our CTO; coding needs best effort.

Nevertheless, I thought my setup was involved, but if Boris considers his to be vanilla ice cream then I'm drinking skim milk.

40. What CC plan do you use? The $20 one?

41. Max 5x ($100)

42. > I assume "what sort of problems you must have" was directed at me.

I don't really have any sort of personal problem with Boris' post, if that's what your inflammatory statement was implying.

I also think it was a fairly good description of his workflow, technically speaking, but it glosses over the actual monetary costs of what he is doing and, as noted above, doesn't really describe the actual outcomes other than a lot of PRs.

43. The monetary costs are minimal. The $20 and $100 plans actually get you very far these days

44. So this guy is personally responsible for the RAM shortage, it seems. Jokes aside, I have a similar setup, but with a mix of Claude and a local model. Claude can access the local model for simple and repetitive tasks, and it actually does a good job of testing UI. Great way to save tokens.

45. It'd be nice if he explained the cost of running 10 agents all day.

46. Yeah... I had a fairly in-depth conversation with Claude a couple of days ago about Claude Code and the way it works, its usage limits, and comparisons to how other AI coding tools work, and the extremely blunt advice from Claude was that Claude Code was not suitable for serious software development due to usage limits! (Props to Anthropic for not sugar-coating it!)

Maybe on the Max 20x plan it becomes viable, and no doubt on the Boris Cherny unlimited usage plan it does, but it seems that without very aggressive non-stop context pruning you will rapidly hit limits and the 5-hour timeout even working with a single session, let alone 5 Claude Code sessions and another 5-10 web ones!

The key to this is the way that Claude Code (the local part) works and interacts with Claude AI (the actual model, running in the cloud). Basically, Claude Code maintains the context, comprising mostly the session history, the contents of source files it has accessed, and the read/write/edit tools (based on Node.js) it provides for Claude AI. This entire context, including all files that have been read and the tool definitions, is sent to Claude AI (eating into your token usage limit) with EVERY request, so once Claude Code has accessed a few source files, the content of those files will "silently" be sent as part of every subsequent request, regardless of what it is. Claude gave me an example where, with 3 smallish files open (a few thousand lines of code), within 5 requests the token usage might be 80,000 or so, vs. the 40,000 limit of the Pro plan or the 200,000 limit of the Max 5x plan. Once you hit the limit you have to wait 5 hours for a usage reset, so without Cherny's infinite usage limit this becomes a game of hurry up and wait (make 5 requests, then wait 5 hours and make 5 more).
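The resend-everything behaviour described above can be sketched numerically. All the constants here are hypothetical round numbers chosen to be in the same ballpark as the comment's example, not Anthropic's actual accounting:

```python
FILE_TOKENS = 20_000   # tokens per open source file (hypothetical)
OPEN_FILES = 3         # files Claude Code has read so far
TOOL_DEFS = 2_000      # tool-schema overhead resent every request (hypothetical)
TURN_TOKENS = 1_500    # history added per exchange (hypothetical)

def per_request_context(turn: int) -> int:
    """Input tokens sent on the given turn, if file contents, tool
    definitions, and the full session history are all resent each time."""
    return OPEN_FILES * FILE_TOKENS + TOOL_DEFS + turn * TURN_TOKENS

def tokens_billed(turns: int) -> int:
    """Cumulative input tokens across all turns so far: each turn re-bills
    the entire (growing) context, so cost grows quadratically in turns."""
    return sum(per_request_context(t) for t in range(1, turns + 1))
```

Even with these modest assumptions, five turns with three files open already puts each request near 70k input tokens, which is why a per-window quota empties so quickly on a larger project.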

You can restrict which source files Claude Code has access to, to try to manage context size (e.g. in a C++ project, let it access all the .h module definition files but block all the .cpp ones), as well as manually inspecting the context to see what is being sent that can be removed. I believe there is some automatic context compaction happening periodically too, but apparently not enough to prevent many/most people from hitting usage timeouts when working on larger projects.

Not relevant here, but Claude also explained how Cursor manages to provide fast/cheap autocomplete using its own models, by building a vector index of the code base to pull only relevant chunks of code into the context.

47. He's probably on the Max plan ;)

48. It’s his marketing budget.

49. The amount of people holding strong opinions on LLMs who openly admit they have not tried the state of the art tools is so high on Hacker news right now, that it's refreshing to get actual updates from the tool's creators.

I read a comment yesterday that said something like "many people tried LLMs early on, it was kind of janky and so they gave up, thinking LLMs are bad". They were probably right at the time, but the tech _has_ improved since then, while those opinions have not changed much.

So yes, Claude Code with Sonnet/Opus 4.5 is another step change that you should try out. For $20/month you can run Claude Code in the terminal and regular Claude on the web app.
</comments_about_topic>

Write a concise, engaging paragraph (3-5 sentences) summarizing the key points and perspectives in these comments about the topic. Focus on the most interesting viewpoints. Do not use bullet points—write flowing prose.

topic

Cost and Access Disparities # Analysis of the financial feasibility of running Opus 4.5 agents in parallel, noting that while Anthropic employees may have unlimited access, the cost for average users would be prohibitive due to token limits and API pricing.

commentCount

49
