Summarizer

Feasibility of Parallel Agent Workflows

Skepticism regarding the human capacity to supervise multiple AI agents simultaneously, utilizing analogies like washing dishes vs. laundry, and debating the cognitive load required for context switching between 10 active coding streams.

← Back to The creator of Claude Code's Claude setup

While some developers dismiss the idea of managing multiple parallel AI agents as "showboating" that exceeds the limits of human working memory, others argue it is a viable strategy for offloading "yak-shaving" tasks and independent features while the human focus remains on high-level design. Skeptics maintain that the primary bottleneck isn't code generation but the grueling labor of requirement gathering and code review, likening the supervising role to watching "babies in a glassware shop" or producing massive amounts of technical debt. However, proponents find success by treating these agents as a "small team" that utilizes automated guardrails and sub-agent verification to handle repetitive chores like log analysis, unit testing, and UI refinements. Ultimately, the discussion highlights a polarizing shift in software engineering, where the promise of exponential output is balanced against "brain-frying" cognitive loads, soaring inference costs, and the potential loss of the creative satisfaction found in traditional coding.

64 comments tagged with this topic

View on HN · Topics
This is interesting to hear, but I don't understand how this workflow actually works. I don't need 10 parallel agents making 50-100 PRs a week, I need 1 agent that successfully solves the most important problem. I don't understand how you can generate requirements quickly enough to have 10 parallel agents chewing away at meaningful work. I don't understand how you can have any meaningful supervising role over 10 things at once given the limits of human working memory. It's like someone is claiming they unlocked ultimate productivity by washing dishes, in parallel with doing laundry, and cleaning their house. Likely I am missing something. This is just my gut reaction as someone who has definitely not mastered using agents. Would love to hear from anyone that has a similar workflow where there is high parallelism.
View on HN · Topics
This is what gets me... Even at companies with relatively small engineering teams compared to company size, actually getting coherent requirements and buy-in from every stakeholder on a single direction was enough work that we didn't really struggle with getting things done. Sure, there was some lead time, but not nearly enough to 2x the team's productivity, let alone 10x. Even when presented with something, there was still lead time turning it into something actually actionable as edge cases were sussed out.
View on HN · Topics
So true, as a mere software developer on a payroll: I might spend 10 minutes doing a task with AI rather than an hour (w/o AI), but trust me - I am going to keep 50 minutes to myself, not deliver 5 more tasks )))) And when I work on my hobby project - having one AI agent crawling around my codebase is like watching a baby in a glassware shop. 10 babies? no thanks!
View on HN · Topics
Same. I am doing this as Claude knocked out two annoying yak shaving tasks I did not really want to do. Required careful review and tweaking. Claiming that you now have 10 AI minions just wrecking your codebase sounds like showboating. I do not pity the people who will inherit those codebases later.
View on HN · Topics
Do you have any idea of the man-hours it took to build those large projects you are speaking of? Let's take Linux for example. Suppose for the sake of argument that Claude Code with Opus 4.5 is as smart as an average person (AGI), but with the added benefit that it can work 24/7. Suppose now I have millions of dollars to burn and am running 1000 such instances on Max plans. Say I started running those agents the day Claude Opus 4.5 was released and prompted them to create a commercial-grade multi-platform OS of the caliber of Linux. One estimate puts the Linux kernel at 100 million man-hours of work. Divide by 1000, and by these calculations we'd expect a functioning OS like Linux by 2058. How long has Claude been out? Two months.
View on HN · Topics
Just like nine women making a baby in one month, isn't it )
View on HN · Topics
My favorite movie quote as it pertains to software engineering has for a long time been Jurassic Park's: “Your scientists were so preoccupied with whether or not they could, they didn’t stop to think if they should.” That’s how I feel about a lot of AI-powered development. Just because you can have 10 parallel agents cranking out features 24/7 and have AI write 100% of the code, that doesn’t mean you’re actually building a product that users want and/or that is a viable business. I’m currently in this situation, working on a greenfield project as founder/solo dev. Yes, AI has been tremendously useful in speeding things up, especially in patching over smaller knowledge gaps of mine. But in the end, as in all the projects before in my career, building the MVP has rarely been the hard part of starting a company.
View on HN · Topics
Yes, thank you! I find I get more than enough done (and more than enough code to review) by prompting the agent step by step. I want to see what kind of projects are getting done with multiple async autonomous agents. I was hoping to find YouTube videos of someone setting up a project for multiple agents, so I could see the cadence of the human stepping in and giving direction.
View on HN · Topics
I run 3-5 on distinct projects often (20x plan). I quite enjoy the context switching and always have. I have a vanilla setup too: I don't use plugins/skills/commands, sometimes I enable an MCP server for different things, and I definitely list out CLI tools in my claude.md files. I keep a Google Doc open where I list all the projects I'm working on and write notes as I'm jumping through the Claude tabs; I also start drafting more complex prompts in the Google Doc. I've been using Turborepo a lot so I don't have to context switch the architecture in my head (but projects still use multiple kinds of DevOps setups). Often these days I vibe code a feedback loop for each project, a way for it to validate itself as OP said. This adds time to how long Claude takes to complete, giving me time to switch context to another active project. I also use light mode, which might help others... jks
View on HN · Topics
Multiple instances of agents are an equivalent to tabs in other applications - primarily holders of state, rather than means for extreme parallelism.
View on HN · Topics
I have not used Claude. But my experience with Gemini and aider is that multiple instances of agents will absolutely stomp over each other. Even in a single session, the agent will often clobber my changes after I've told it that I made modifications.
View on HN · Topics
I agree. I'm imagining a large software team with hundreds of tickets "ready to be worked on" might support this workflow - but even then, surely you're going to start running into unnecessary conflicts. The max Claude instances I've run is 2 because beyond that, I'm - as you say - unable to actually determine the next best course during the processing time. I could spend the entire day planning / designing prompts - and perhaps that will be the most efficient software development practice in the future. And/or perhaps it is a sign I'm doing insufficient design up front.
View on HN · Topics
I suppose he may have a list of feature requests and bug reports to work on, but it does seem a bit odd from a human perspective to want to work on 5 or more things literally in parallel, unless they are all so simple that there is no cognitive load and context switching required to mentally juggle them. Washing dishes in parallel with laundry and cleaning is of course easily possible, but precisely because there is no cognitive load involved. When the washing machine stops you can interrupt what you are doing to load clothes into the drier, then go back to cleaning/whatever. Software development for anything non-trivial obviously has a much higher task-switching overhead. Optimal flow for a purely human developer is to "load context" at the beginning of the day, then remain in flow state without interruptions. The cynical part of me also can't help but wonder if Cherny/Anthropic aren't just advocating token-maxxing!
View on HN · Topics
Yeah, I don't understand these recent posts about people running 10 at once. Can someone give an example of what each of them would be doing? Are they just really slow - is that the problem?
View on HN · Topics
For me it's their speed, yes. I only run 0-3 at a time, and often the problem at hand is very much not complex. For example "Take this component out of the file into its own file, including its styles." The agent may take 5 minutes for that and what do I do in the meantime? I can start another agent for the next task at hand. Could also be a bug hunt "Sometimes we get an error message about XYZ, please investigate how that might happen." or "Please move setting XY from localstorage to cookies".
View on HN · Topics
I rarely run 10 top-level sessions, but I often run multiple. Here is one case, though: I have a prototype Ruby compiler that long languished because I didn't have time. I recently picked up work on it again with Claude Code. There are literally thousands of unimplemented methods in the standard library. While that has not been my focus so far, my next step is to make Claude work on implementing missing methods in 10+ sessions in parallel, because why not? While there are some inter-dependencies (e.g. code that would at least be better with more of the methods of the lowest-level core classes already in place), a huge proportion are mostly independent. In this case the rubyspec test suite is there to verify compliance. On top of that I have my own tests (does the compiler still compile itself, and do the self-tests still run when compiled with the self-compiled compiler?), so having 10+ sessions "pick off" missing pieces, make an attempt, see if they can make it pass, and move on works well. My main limitation is that I have already once run into the weekly limits of my (most expensive) Claude Max subscription, and I need it for other things too, like client work, and I'm not willing to pay per token for API use on that project since it's not immediately giving me a return. (And yes, they're "slow" - but faster than me; if they were fast enough, then sure, it'd be nicer to have them run serially, the same way that if you had time it'd be easier to get cohesive work from a single developer doing all the work on a project instead of having a team try to coordinate.)
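The "many sessions pick off independent pieces" pattern can be sketched outside any agent tooling. Below is a toy Python sketch (my construction, not the commenter's actual setup): `attempt` stands in for one agent session plus its rubyspec-style pass/fail check, and the thread pool stands in for the 10+ parallel sessions; anything that fails goes on a retry list rather than blocking the rest.

```python
from concurrent.futures import ThreadPoolExecutor

def attempt(method_name):
    """Stand-in for one agent session: try to implement the method,
    then report whether its spec passed. The pass/fail rule here is
    fake (even-length names 'pass') purely to make the sketch runnable."""
    return method_name, len(method_name) % 2 == 0

def pick_off(missing, workers=10):
    """Farm independent tasks out in parallel; collect what passed and
    what needs another attempt (or human attention)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = dict(pool.map(attempt, missing))
    done = sorted(name for name, ok in results.items() if ok)
    redo = sorted(name for name, ok in results.items() if not ok)
    return done, redo

done, redo = pick_off(["ljust", "rjust", "chr", "center", "ord"])
```

The key property making this viable is the one the comment names: a trustworthy external check (the spec suite) per task, so failed attempts can simply be re-queued.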
View on HN · Topics
It just happens automatically. Once you set it running and it's chugging away there's nothing for you to do for a while. So of course you start working on something else. Then that is running ... before you know it, 5 of them are going and you have forgotten which is what and this is your new problem.
View on HN · Topics
Yep. For one of the things I am doing, I am the solo developer on a web application. At any given point, there are 4-5 large features I want, and I instruct Claude to heavily test those features, so it is not unusual for each to run for 30-45 minutes and for overall conversations to span several hours. People are correct that it often makes mistakes, so that testing phase usually uncovers a bunch of issues it has to fix. I usually have 1-2 mop-up terminal windows open for small things I notice as I go along that I want to fix. Claude can be bad about things like putting white text on a white button, and I want a free terminal to just drop every little nitpick into. They exist for me to throw small tasks at. Yes, you really should start a new convo for each need, but these are small things and I do not want to disrupt my flow. There are another 2-3 for smaller features that I am regularly reviewing and resetting. Then another one dedicated to just running the tests already built, over and over, and solving any failures or investigating things. Another one is for research, to tell me things about the codebase.
View on HN · Topics
Depends on the project you are working on. Solo on a web app? You probably have 100s of small things to fix. Some more padding there, add a small new feature here, etc.
View on HN · Topics
Agree. People are stuck applying the "agent" = "employee" analogy and think they are more productive by having a team/company of agents. Unless you've perfectly spec'ed and detailed multiple projects up front, the speed of a single agent shouldn't be the bottleneck.
View on HN · Topics
Potentially, a lot of that isn't just code generation; it *is* requirements gathering, design iteration, analysis, debugging, etc. I've been using CC for non-programming tasks and it's been pretty successful so far, at least for personal projects (bordering on the edge of non-trivial). For instance, I'll get a 'designer' agent coming up with a spec, and a 'design-critic' to challenge the design and make the original agent defend its choices. They can ask open questions after each round and I'll provide human feedback. After a few rounds of this, we whittle it down to a decent spec and try it out after handing it off to a coding agent. Another example from work: I fired off some code analysis to an agent with the goal of creating integration tests, and then ran a set of spec reviewers in parallel to check its work before creating the actual tickets. My point is that there are a lot of steps involved in the whole product development process; it isn't just "ship production code". And we can reduce the ambiguity/hallucinations/sycophancy by creating validation/checkpoints (either tests, 'critic' agents to challenge designs/specs, or human QA/validation when appropriate). The end game of this approach is dozens or hundreds of agents running via some kind of orchestrator, churning through a backlog that is a combination of human + AI generated, with the system posting questions to the human user(s) to gather feedback. The human spends most of the time doing high-level design/validation and answering open questions. You definitely incur some cognitive debt and risk it doing something you don't want, but that's part of the fun for me (assuming it doesn't kill my AI bill).
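The designer/critic loop described here reduces to simple alternation until no open issues remain. A minimal sketch of that control flow, with trivial stand-ins (`critique` flags any spec line still ending in "?"; in the commenter's setup both functions would be model calls, and the stopping rule would include human feedback):

```python
def critique(spec):
    """Stand-in for the 'design-critic' agent: return open issues.
    Toy rule: any line still ending in '?' is an open question."""
    return [line for line in spec if line.endswith("?")]

def defend(spec, issues):
    """Stand-in for the 'designer' agent revising or defending choices."""
    return [line.rstrip("?") + " (decided)" if line in issues else line
            for line in spec]

def whittle(spec, max_rounds=5):
    """Alternate critic and designer until no open issues remain, then
    hand the spec off (to a coding agent, per the comment above)."""
    for _ in range(max_rounds):
        issues = critique(spec)
        if not issues:
            break
        spec = defend(spec, issues)
    return spec

final = whittle(["store sessions server-side", "rotate keys monthly?"])
```

The `max_rounds` cap matters in practice: two models can argue indefinitely, so the loop needs a bounded budget before escalating to the human.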
View on HN · Topics
My impression is that people who are exploring coordinated multi-agent-coding systems are working towards replacing full teams, not augmenting individuals. "Meaningful supervising role" becomes "automated quality and process control"; "generate requirements quickly" -> we already do this for large human software teams. If that's the goal, then we shouldn't interpret the current experiment as the destination.
View on HN · Topics
I use Beads, which makes this easier to grasp since it's "tickets" for the agent. I tell it what I want, it creates a bead (or "ticket"), and then I ask it to do research, brain dump on it, and even ask me clarifying questions, and it updates the tasks. By the end, once I have a few tasks with essentially well-defined prompts, I tell Claude to run x tasks in parallel. Sometimes I dump a bunch of different tasks and ask it to research them all in parallel, and it fills them in, and I review. When it's all over, I test the code, look at the code, and mention any follow-ups. I guess it comes down to: how much do you trust the agent? If you don't trust it fully, you want to inspect everything - which you still can, but you can choose to do it after it runs wild instead of every second it works.
View on HN · Topics
I would do the same thing if I could justify paying $200 per month for my hobby. But even with that, you will run into throttling / API / resource limits. AI agents need time. They need a bit of reading the source code, proposing the change, making the change, running the verification loop, creating the git commit, etc. That can be a minute, can be 10, and potentially a lot longer. So if your code base is big enough that you can work on different topics, you just do that:
- Fix this small bug in the UI when xy happens
- Add a new field to this form
- Clean up the README with content x
- ...
I'm an architect at work and have done product management on the side, as it's a very technical project. I have very little problem coming up with things to fix, enhance, clean up, etc. I have hard limits on my headcount. I could easily do a handful of things in parallel while keeping it all in my head. Working memory might be limited, but working memory is something different from following 10 topics, especially when a few of those topics just take time in their feedback loops. But regarding your example of house cleaning: I have ADHD, and I sometimes work like this - working on something, waiting for a build, and cleaning something in parallel. What you are missing is practical experience with agents: taking the time and energy to set something up for yourself, and perhaps accessibility too? We only got access to Claude Code at work at the end of last year.
View on HN · Topics
> It's like someone is claiming they unlocked ultimate productivity by washing dishes, in parallel with doing laundry, and cleaning their house. In this case you have to take a leap of faith and assume that Claude or Codex will get each task done correctly enough that your house won't burn down.
View on HN · Topics
This is it! “I don't need 10 parallel agents making 50-100 PRs a week, I need 1 agent that successfully solves the most important problem.”
View on HN · Topics
Do you generally only have one problem? For me the use case is that I have numerous needs and Claude frees up time to work on some of the more complicated ones.
View on HN · Topics
I usually have 4-5, but it's because they are working on different parts of the codebase, or some I will use as read only to brainstorm
View on HN · Topics
The problem isn't generating requirements, it's validating work. Spec-driven development and voice chat with ticket/chat context is pretty fast, but the validation loop is still mostly manual. When I'm building, I can orchestrate multiple swarms no problem; however, any time I have to drop in to validate stuff, my throughput drops and I can only drive 1-2 agents at a time.
View on HN · Topics
It depends on the specifics of the tasks; I routinely work on 3-5 projects at once (sometimes completely different stuff), and a tool like Claude Code fits great in my workflow. Also, the feedback doesn't have to be immediate: sometimes I have sessions that run over a week because of casual iterations. In my case it's quite common to do this for testing concepts, micro-benchmarking, and library design.
View on HN · Topics
If you're trying to solve one very hard problem, parallelism is not the answer. Recursion is. Recursion can give you an exponential reduction in error as you descend into the call stack. It's not guaranteed in the context of an LLM but there are ways to strongly encourage some contraction in error at each step. As long as you are, on average, working with a slightly smaller version of the problem each time you recurse, you still get exponential scaling.
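The scaling claim here is just geometric decay: if each level of the recursion contracts the remaining error by some factor r < 1, depth compounds that contraction exponentially. A minimal sketch of the arithmetic (the contraction factor is an assumption, not a guarantee, exactly as the comment notes for LLMs):

```python
def residual_error(e0, r, depth):
    """Error left after `depth` recursion levels, starting from error e0,
    with each level contracting the previous level's error by factor r
    (0 < r < 1)."""
    return e0 * r ** depth

# Even a modest 20% contraction per level compounds quickly:
# after 10 levels ~11% of the original error remains, after 20 ~1.2%.
```

The converse is the failure mode: if r drifts above 1 at some depth (each sub-call slightly *worse* than its parent), the same compounding amplifies error instead of shrinking it, which is why some enforced contraction at each step matters.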
View on HN · Topics
> It's like someone is claiming they unlocked ultimate productivity by washing dishes, in parallel with doing laundry, and cleaning their house. But we do this routinely with machines. Not saying I don't get your point re 100 PRs a week, just that it's a strange metaphor given the similarities.
View on HN · Topics
> I need 1 agent that successfully solves the most important problem. If you only have that one problem, that is a reasonable criticism, but you may have 10 different problems and want to focus on the important one while the smaller stuff is AIed away. > I don't understand how you can generate requirements quickly enough to have 10 parallel agents chewing away at meaningful work. I am generally happy with the assumptions it makes when given few requirements. In a lot of cases I just need a feature, and the specifics are fairly open or very obvious given the context. For example, I am adding MFA options to one project. As I already have MFA for another portal on it, I just told Claude to add MFA options for all users. Single sentence with no details. The result seems perfectly serviceable, if in need of some CSS changes.
View on HN · Topics
Exactly. And if that problem is complex, your first step should be to plan how to sub-divide it anyway. So just ask Claude to map out interdependencies for tasks to look for opportunities to parallelise.
View on HN · Topics
Claude is absolutely plastering Facebook with this bullshit. Every PR Claude makes needs to be reviewed. Every single one. So great! You have 10 instances of Claude doing things. Great! You're still going to need to do 10 reviews.
View on HN · Topics
At the beginning of a project, the runs are fast, but as the project gets bigger, the runs get slower:
- there are bigger contexts
- the test suite is much longer and slower
- you need to split worktrees and resources (like DB, ports) and sometimes containers to work in isolation
So having 10 workers will run for a long time, which gives plenty of time to write good specs. You need good specs so the LLM produces good tests, so it can write good code to match those tests. Having a very strong spec + test suite + quality gates (linter, type checkers, etc.) is the only way to get good results from an LLM as the project becomes more complex. Unlike a human, it's not very good at isolating complexity by itself, nor at stopping and asking questions in the face of ambiguity. So the guardrails are the only thing that keeps it on track. And running a lot of guardrails takes time. E.g.: yesterday I had a big migration to do from HTMX to viewjs. I asked the LLM to produce screenshots of each state, and then do the migration in steps in a way that kept the screenshots 90% identical. This way I knew it would not break the design. But it's very slow to run e2e tests + screenshot comparison every time you make a modification. Still faster than a human, but it gives plenty of time to talk to another LLM. Plus you can assign them very different tasks:
- One works on adding a new feature
- One improves the design
- One refactors part of the code (something you should do regularly; LLMs produce tech debt quickly)
- One adds more tests to your test suite
- One is deploying on a new server
- One is analyzing the logs of your dev/test/prod servers and telling you what's up
- One is cooking up a new logo for you and generating x versions at different resolutions
Etc. It's basically a small team at your disposal.
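The guardrail discipline described here - spec, tests, linter, type checker as gates the agent must clear before a change counts - reduces to a reject-and-fix cycle with a bounded budget. A hedged sketch with stand-in gates (the gate names and checks are illustrative, not any real tool's API; `fix` stands in for "agent, fix these failures"):

```python
def run_gates(change, gates):
    """Run every guardrail over a candidate change; return the names of
    the gates that failed. Each check stands in for a linter, type
    checker, or test suite."""
    return [name for name, check in gates if not check(change)]

def iterate_until_green(change, fix, gates, max_rounds=5):
    """Reject-and-fix until every gate passes, or give up and flag the
    change for human review. The bool reports whether we went green."""
    for _ in range(max_rounds):
        failed = run_gates(change, gates)
        if not failed:
            return change, True
        change = fix(change, failed)  # stand-in for an agent fix pass
    return change, False

# Toy gates: 'lint' rejects TODO markers, 'tests' wants a trailing ok.
gates = [("lint", lambda c: "TODO" not in c),
         ("tests", lambda c: c.endswith("ok"))]
result, green = iterate_until_green("TODO draft",
                                    lambda c, failed: "patched ok",
                                    gates)
```

The `max_rounds` cap is what keeps a stuck agent from burning tokens forever; a real setup would surface the failing gate names to the human at that point.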
View on HN · Topics
> I don't understand how you can generate requirements quickly enough to have 10 parallel agents chewing away at meaningful work

You use agents to expand the requirements as well, either in plan mode (as OP does) or with a custom scaffold (rules in CLAUDE.md about how to handle requirements; personally I prefer giving Claude the latitude to start when Claude is ready rather than wait for my go-ahead).

> I don't understand how you can have any meaningful supervising role over 10 things at once given the limits of human working memory

[This got long. TL;DR: this is what works for me: stop worrying about individual steps; use sub-agents and slash-commands to encapsulate units of work so Claude runs longer; use permissions to allow as much as you dare (and/or run in a VM) to let Claude run longer; give Claude tools to verify its work (linters, test suites, sub-agents double-checking the work against the spec) and make it use them; don't sit and read individual parts of the conversation - it will only infuriate you to see Claude make stupid mistakes, but if well scaffolded it will fix them before it returns the code to you, so stop reading, breathe, and let it work; only verify once Claude has worked for a long time and checked its own work - that way you review far less code and far more complete and coherent changes.]

You don't. You wait until each agent is done, and you review the PRs. To make this kind of thing work well you need agents and slash-commands, like OP does - sub-agents in particular help prevent the top-level agents from "context anxiety": Claude Code appears to have knowledge of context use, and will be prone to stopping before context runs out; sub-agents use their own context, and the top-level agent only uses context to manage the input to and output from them, so the more that is farmed out to sub-agents, the longer Claude Code is willing to run.
When I got up this morning, Claude Code had run all night and produced about 110k words of output. This also requires extensive permissions to use safe tools without asking (what OP does), or --dangerously-skip-permissions (I usually do this; you might want to put this in a container/VM, as it will happily do things like "killall -9 python" or similar without "thinking through" the consequences - I've had it kill the terminal it itself ran in before), or it'll stop far too quickly. You'll also want to explicitly tell it to do things in parallel when possible. E.g., if you want to use it as a "smarter linter" (DO NOT rely on it as the only linter - use a regular one too - but using Claude to apply more complex rules that require some reasoning works great), you can ask it to "run the linter agent in parallel on all TypeScript files", for example, and it will tend to spawn multiple sub-agents running in parallel and metaphorically twiddle its thumbs waiting for them to finish (it's fun seeing it get "bored" and decide to do other things in the meantime, or get impatient and check on progress obsessively). You'll also want to make Claude use sub-agents to review, verify, and test its work, with instructions to repeat until all the verification sub-agents give its changes a PASS (see 12/ and 13/ in the thread) - there is no reason for you to waste your time reviewing code that Claude itself can tell isn't ready. [A concrete example: "vanilla" Claude "loves" using instance_variable_get() in Ruby when facing a class that is missing an accessor for an instance variable. Whether you know Ruby or not, that should stand out like a sore thumb - it's a horrifically gross code smell, as it basically bypasses encapsulation entirely. But you shouldn't have to worry about that - if you write Ruby with Claude, you'd want a rule in CLAUDE.md telling it how to address missing accessors, a sub-agent, and possibly a hook, making sure Claude is told to fix it immediately if it ever uses it.]
Farming work off to sub-agents both makes it willing to work longer, especially on "boring" tasks, and avoids the problem of it looking at past work, deciding it already "knows" this code is ready, and skipping steps. The key thing is to stop obsessing over every step Claude takes, and treat it as a developer experimenting with something they're not yet clear on how to do. If you let it work, and its instructions are good, and it has ways of checking its work, it will figure out that its first attempts are broken, fix them, and leave you with output that takes far less of your time to review. When Claude tells you it's done with a change, if you spot egregious problems, fix your CLAUDE.md, fix your planning steps, fix your agents. None of the above will absolve you of reviewing code, and you will need to kick things back and have it fix them, and sometimes that will be tedious. But Claude is good enough that the problems you have it fix should be complex, not simple code smells or logic errors, and 9 times out of 10 they should signal that your scaffold is lacking important detail about your project, or that your spec is incomplete at a functional/acceptance-criteria level (not low-level detail).
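The instance_variable_get() example is exactly the kind of rule a hook or verification sub-agent can enforce mechanically before code ever reaches the human. A toy sketch of such a check (my construction; the second rule is hypothetical, added only to show the table shape, and a real hook would scan the actual diff):

```python
import re

# Each entry: rule name -> pattern that marks a code smell.
SMELLS = {
    # From the example above: bypassing encapsulation in Ruby.
    "instance_variable_get": r"\binstance_variable_get\b",
    # Hypothetical extra rule, purely illustrative.
    "send_to_symbol": r"\.send\(\s*:",
}

def review(diff_text):
    """Return ('FAIL', matched_rule_names) or ('PASS', []). A hook could
    run this on every edit and feed failures straight back to the agent
    as 'fix this immediately' instructions."""
    hits = [name for name, pattern in SMELLS.items()
            if re.search(pattern, diff_text)]
    return ("FAIL", hits) if hits else ("PASS", [])
```

Because the verdict is mechanical, the top-level agent can loop on FAIL without consuming any human attention, which is the whole point of the scaffold the comment describes.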
View on HN · Topics
I implemented some of his setup and have been loving it so far. My current workflow is typically 3-5 Claude Codes in parallel:
- Shallow clone, plan mode back and forth until I get the spec down, hand off to a subagent to write a plan.md
- Ralph Wiggum Claude using plan.md and skills until the PR passes tests and CI/CD, auto-responds to greptile reviews, and prepares the PR for me to review
- Back and forth with Claude for any incremental changes or fixes
- Playwright MCP for Claude to view the browser for frontend
I still always comb through the PRs and double check everything, including local testing, which is definitely the bottleneck in my dev cycles, but I'll typically have 2-4 PRs lined up ready for me at any moment.
View on HN · Topics
With 3-5 parallel Claude Codes, do they work in the same repo? Do they work on the same features/goals?
View on HN · Topics
We have a giant monorepo, hence the shallow clones. Each Claude works on its own feature / bug / ticket though, sometimes in the same part of the codebase but usually in different parts (my ralph loop has them resolve any merge conflicts automatically). I also have one Claude running just for spelunking through K8s, doing research, or asking questions about the codebase I'm unfamiliar with.
View on HN · Topics
I feel like it's time for me to hang up this career. Prompting is boring, and doing it 5 times at once is just annoying multitasking. I know I'm mostly in it for the money, but at least there used to be a feeling of accomplishment sometimes. Now it's like, whose accomplishment is it?
View on HN · Topics
It is the case that Anthropic employees have no usage limits. Some people do experiments where they spawn up hundreds of Claude instances just to see if any of them succeed.
View on HN · Topics
The main difference is that slash commands are invoked by humans, whereas skills can only be invoked by the agent itself. They work kind of like conditional instructions. As an example, I have skills that aid in adding more detail to plans/specs, debugging, and spinning up / partitioning subagents to execute tasks. I don't need to invoke a slash command each time, and the agent can contextually know from the instructions I give it which skills to use.
View on HN · Topics
Why stop at 5-10? Make it 5 billion - 10 billion parallel agents. PR number go up
View on HN · Topics
Yup. The way he works: every task he is issued in a sprint, he just fires through Opus in parallel, hoping to get a hit with Claude magically solving the ticket, and has the agents constantly iterate on them. He doesn't even try to have proper plans created. Often tickets get fleshed out or requirements change; he just throws everything out and re-shoves it into Claude. I weep for the planet.
View on HN · Topics
I don't understand how these setups scale long-term, even more so for the average user. The latter is relevant because, as he points out, his setup isn't that far out of reach for the average person - it's still fairly close to out-of-the-box Claude Code and Opus. But between varying model qualities, the pricing, the timing, and the tools constantly changing, I think it's really difficult to build institutional knowledge and a setup that can be used beyond a few weeks. In the era of AI, I don't think it's good enough to "have" a working product. It's also important to have all the other things that make a project way more productive, like stellar documentation, better abstractions, and clearer architecture. In terms of AI, there's gotta be something better than just a markdown file with random notes. Like, what happens when an agent does something because it's picking something up from some random Slack convo, or some minor note in a 10k-line claude.md file? It just seems like the wild west, where basic ideas like additional surface area being a liability are ignored because we're too early in the cycle. tl;dr: if it's just pushing around typical mid-level code, then... I just think that's falling behind.
View on HN · Topics
I'm a bit jealous. I would like to experiment with a similar setup, but 10x Opus 4.5 running practically non-stop must amount to a very high inference bill. Is the output really worth it? From experimentation, I need to coach the models quite closely in order to get enough value; letting one loose only works when I've given very specific instructions. But I'm using Codex and Clai, so perhaps Claude Code is better.
I've tried running a number of Claudes in parallel on a CRUD full-stack JS app. Yes, it got features made faster; yes, it definitely did not leave me enough time to actually look at what they did; yes, it definitely produced sub-par code. At the moment, with one Claude plus manually fixing the crap it produces, I am faster at solving "easier" features (think: add an API endpoint, re-build the API client, implement frontend logic for the endpoint + UI) than if I write them myself. For things that are more logic-dense, it tends to produce so many errors that it's faster to solve them myself.
I had this Show HN yesterday, which didn't grab much attention, so I'm using this opportunity since it seems relevant (it's a solution for running CC in parallel). If you folks like running parallel claude-code sessions and like a native terminal like Ghostty, I have a solution for using Git worktrees natively with Ghostty. It's called agentastic.dev, and it has a built-in worktree/IDE/diff/code-review workflow around Ghostty (macOS only for now). Would be happy to answer any questions. Show HN post: https://news.ycombinator.com/item?id=46501758
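For anyone unfamiliar with the underlying mechanism: git worktrees give each parallel session its own checkout of the same repository, so agents don't stomp on each other's working directories. A minimal sketch, using a throwaway repo and invented branch names (the Ghostty/agentastic part presumably wraps this plumbing in a UI):

```shell
set -e
# throwaway repo just for demonstration
tmp=$(mktemp -d) && cd "$tmp"
git init -q demo && cd demo
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"
git branch feature-a && git branch feature-b

# one worktree per parallel agent session, each on its own branch
git worktree add -q ../session-a feature-a
git worktree add -q ../session-b feature-b
git worktree list   # main checkout plus the two session checkouts
```

Worktrees themselves are plain git (2.5+); nothing agent-specific is required.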
That's just unrealistic. If I were to use it like this as an actual end user, I would get stopped by the rate limits and the weekly/session limits instantly.
10-15 parallel Opus 4.5 instances running almost full-time? Even if you could get it, what would be the monthly bill for that?
One of my side projects has been to recover a K&R C computer algebra system from the 1980s and port it to modern 64-bit C. I'd have eight tabs at a time assigned files from a task server, making passes at 60 or so files. This nearly worked; I'm paused till I can have an agent with a context window that can look at all the code at once. Or I'll attempt a fresh translation based on what I learned. With a $200 monthly Max subscription I would regularly stall after completing significant work, but this workflow was feasible. I tried my API key for an hour once; it taught me to laugh at the $200 as quite a deal. I agree that Opus 4.5 is the only reasonable use of my time: we wouldn't hire some guy off the fryer line to be our CTO, and coding needs best effort. Nevertheless, I thought my setup was involved, but if Boris considers his to be vanilla ice cream, then I'm drinking skim milk.
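Not the commenter, but a "task server assigning files to tabs" can be as simple as a file-based queue where each tab atomically claims the next pending file with `mv` (rename is atomic, so two tabs can never grab the same file). A hypothetical sketch with invented file names:

```shell
set -e
cd "$(mktemp -d)"
mkdir -p queue claimed done
touch queue/eval.c queue/parse.c queue/gc.c   # stand-ins for the ~60 files

# each agent tab runs a loop like this; the mv either wins the claim
# or fails silently because another tab got there first
for f in queue/*; do
  name=$(basename "$f")
  if mv "$f" "claimed/$name" 2>/dev/null; then
    # ... the agent would port claimed/$name here ...
    mv "claimed/$name" "done/$name"
  fi
done
ls done
```

A real setup would add retry/requeue on failure, but the claim mechanism itself needs nothing beyond the filesystem.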
I spent a whole day running 3x local CC sessions and about 7 Claude code web sessions over the day. This was the most heavy usage day ever for me, about 30 pull requests created and merged over 3 projects. I got a lot done, but my brain was fried after that. Like wired but totally exhausted. Has anyone else experienced this and did you find strategies to help (or find that it gets easier)?
Having 5 instances going at once sounds like Google Antigravity. I haven't used Claude Code too much. One issue I found is its tendency, when running into snags, to fix them incorrectly by rolling back to older versions of things. I think it would benefit from an MCP server for, say, Maven Central. Likewise, it should prefer to generate code using things like project scaffolding tooling whenever possible.
This feels like a desperate "look at me!" post, the exact opposite of Andrej Karpathy's recent tweet [0] about feeling left behind as a programmer, as covered on Hacker News [1]. I guess I would want to see how sustainable this 5-parallel-AI effort is, and whether there are demonstrably positive outcomes. There are plenty of "I one-shotted this" examples of something that already (mostly) existed, which are very impressive in their own right, but I haven't seen a lot of truly novel creations. [0] https://x.com/karpathy/status/2004607146781278521 [1] https://news.ycombinator.com/item?id=46395714
I have to say it sounds insane. 5 tabs of Claude, back and forth from terminal to browser, and no actual workflow detailed. Are we to believe that Claude is making changes in parallel to one codebase, and if so, why?
I actually use dozens of claude codes "in parallel" myself (most are sitting idle for a lot of the time though). I set up a web interface and then made it usable by others at clodhost.com if anybody wants to try it (free)!
It'd be nice if he explained the cost of running 10 agents all day.
so... what's he actually doing with 10 terminals of claude code?
Working on Claude Code?
Absolute madness, and no thank you. Have others not noticed the extremely obvious astroturfing campaign specifically promoting Claude Code that has mostly been happening on X in recent days/weeks?
Why does he open multiple tabs? Does it mean he gives multiple tasks to them?
It means he develops 5 to 10 features for the same repo.
I'm sure this works for people and I'm happy for them but this sounds like absolute hell to me. Just take out everything that I like about SWE and then just leave me to do the stuff I hate.