Summarizer

LLM Input

llm/065c6e83-d0d5-4aca-be3d-92768a8a3506/topic-5-910c4834-e731-4874-b176-453dad174c0d-input.json

prompt

The following is content for you to summarize. Do not respond to the comments—summarize them.

<topic>
Spec-Driven Development Tools # References to existing frameworks: OpenSpec, SpecKit, BMAD-METHOD, Kiro, Antigravity. Discussion of how these tools formalize the research-plan-implement workflow described in the article.
</topic>

<comments_about_topic>
1. Here's mine! https://github.com/pjlsergeant/moarcode

2. I use AWS Kiro, and its spec-driven development is exactly this. I find it really works well because it makes me slow down and think about what I want it to do.

Requirements, design, task list, coding.

3. Most LLM apps have a 'plan' or 'ask' mode for that.

4. > What previously would've taken a multi-man team weeks of planning and executing, stand ups, jour fixes, architecture diagrams, etc. can now be done within a single week by myself.

This has been my experience. We use Miro at work for diagramming. Lots of visual people on the team, myself included. Using Miro's MCP, I draft a solution to a problem and have Miro diagram it. Once we talk it through as a team, I have Claude or Codex implement it from the diagram.

It works surprisingly well.

> They are amazed however that what has taken months previously, can be done in hours nowadays.

Of course they're amazed. They don't have to pay you for time saved ;)

> reap benefits of LLMs for as long as they don't replace me
> What previously would've taken a multi-man team

I think this is the part that people are worried about. Every engineer who uses LLMs says this. By definition it means that people are being replaced.

I think I justify it in that no one on my team has been replaced. But management has explicitly said "we don't want to hire more because we can already 20x ourselves with our current team +LLM." But I do acknowledge that many people ARE being replaced; not necessarily by LLMs, but certainly by other engineers using LLMs.

5. I take this concept and I meta-prompt it even more.

I have a road map (AI generated, of course) for a side project I'm toying around with to experiment with LLM-driven development. I read the road map and I understand and approve it. Then, using some skills I found on skills.sh and slightly modified, my workflow is as follows:

1. Brainstorm the next slice

It suggests a few items from the road map that should be worked on, with some high-level methodology to implement. It asks me what the scope ought to be and what invariants ought to be considered. I ask it what the tradeoffs could be, why, and what it recommends, given the product constraints. I approve a given slice of work.

NB: this is the part I learn the most from. I ask it why X process would be better than Y process given the constraints and it either corrects itself or it explains why. "Why use an outbox pattern? What other patterns could we use and why aren't they the right fit?"

2. Generate slice

After I approve what to work on next, it generates a high-level overview of the slice, including files touched, saved in a Markdown file that is persisted. I read through the slice, ensure that it is indeed working on what I expect it to be working on, and that it's not creeping or cutting scope, and I approve it. It then makes a plan based on this.

3. Generate plan

It writes a rather lengthy plan, with discrete task bullets at the top. Beneath, each step has to-dos for the LLM to follow, such as generating tests, running migrations, etc., with commit messages for each step. I glance through this for any potential red flags.

4. Execute

This part is self explanatory. It reads the plan and does its thing.

I've been extremely happy with this workflow. I'll probably write a blog post about it at some point.

6. I use Claude Code for lecture prep.

I craft a detailed and ordered set of lecture notes in a Quarto file and then have a dedicated claude code skill for translating those notes into Slidev slides, in the style that I like.

Once that's done, much like the author, I go through the slides and make commented annotations like "this should be broken into two slides" or "this should be a side-by-side" or "use your generate clipart skill to throw an image here alongside these bullets" and "pull in the code example from ../examples/foo." It works brilliantly.

And then I do one final pass of tweaking after that's done.

But yeah, annotations are super powerful. Token distance in-context and all that jazz.

7. Is your skill open source?

8. Not yet... but also I'm not sure it makes a lot of sense to be open source. It's super specific to how I like to build slide decks and to my personal lecture style.

But it's not hard to build one. The key for me was describing, in great detail:

1. How I want it to read the source material (e.g., H1 means new section, H2 means at least one slide, a link to an example means I want code in the slide)

2. How to connect material to layouts (e.g., "comparison between two ideas should be a two-cols-title," "walkthrough of code should be two-cols with code on right," "learning objectives should be side-title align:left," "recall should be side-title align:right")

Then the workflow is:

1. Give all those details and have it do a first pass.

2. Give tons of feedback.

3. At the end of the session, ask it to "make a skill."

4. Manually edit the skill so that you're happy with the examples.

9. Well, that's already done by Amazon's Kiro [0], Google's Antigravity [1], GitHub's Spec Kit [2], and OpenSpec [3]!

[0]: https://kiro.dev/

[1]: https://antigravity.google/

[2]: https://github.github.com/spec-kit/

[3]: https://openspec.dev/

10. > Read deeply, write a plan, annotate the plan until it’s right, then let Claude execute the whole thing without stopping, checking types along the way.

As others have already noted, this workflow is exactly what the Google Antigravity agent (based on Visual Studio Code) was created for. Antigravity even includes specialized UI for a user to annotate selected portions of an LLM-generated plan before iterating on it.

One significant downside to Antigravity I have found so far is the fact that even though it will properly infer a certain technical requirement and clearly note it in the plan it generates (for example, "this business reporting column needs to use a weighted average"), it will sometimes quietly downgrade such a specialized requirement (for example, to a non-weighted average), without even creating an appropriate "WARNING:" comment in the generated code. Especially so when the relevant codebase already includes a similar, but not exactly appropriate API. My repetitive prompts to ALWAYS ask about ANY implementation ambiguities WHATSOEVER go unanswered.

From what I gather Claude Code seems to be better than other agents at always remembering to query the user about implementation ambiguities, so maybe I will give Claude Code a shot over Antigravity.

11. This is pretty much my approach. I started with some spec files for a project I'm working on right now, based on some academic papers I've written. I ended up going back and forth with Claude, building plans, pushing info back into the specs, expanding that out and I ended up with multiple spec/architecture/module documents. I got to the point where I ended up building my own system (using claude) to capture and generate artifacts, in more of a systems engineering style (e.g. following IEEE standards for conops, requirement documents, software definitions, test plans...). I don't use that for session-level planning; Claude's tools work fine for that. (I like superpowers, so far. It hasn't seemed too much)

I have found it to work very well with Claude by giving it context and guardrails. Basically I just tell it "follow the guidance docs" and it does. Couple that with intense testing and self-feedback mechanisms and you can easily keep Claude on track.

I have had the same experience with Codex and Claude as you in terms of token usage. But I haven't been happy with my Codex usage; Claude just feels like it's doing more of what I want in the way I want.

12. Yes this is what agent "skills" are. Just guides on any topic. The key is that you have the agent write and maintain them.

13. That sounds like the recommended approach. However, there's one more thing I often do: whenever Claude Code and I complete a task that didn't go well at first, I ask CC what it learned, and then I tell it to write down what it learned for the future. It's hard to believe how much better CC has become since I started doing that. I ask it to write dozens of unit tests and it just does. Nearly perfectly. It's insane.

14. I'm interested in this as well.

Skills almost seem like a solution, but they still need an out-of-band process to keep them updated as the codebase evolves. For now, a structured workflow that includes aggressive updates at the end of the loop is what I use.

15. There are frameworks like https://github.com/bmad-code-org/BMAD-METHOD and https://github.github.com/spec-kit/ that are working on encoding a similar kind of approach and process.

16. This is what I do with the obra/superpowers[0] set of skills.

1. Use brainstorming to come up with the plan using the Socratic method

2. Write a high level design plan to file

3. I review the design plan

4. Write an implementation plan to file. We've already discussed this in detail, so usually it just needs skimming.

5. Use the worktree skill with subagent driven development skill

6. Agent does the work using subagents that for each task:

a. Implements the task

b. Spec reviews the completed task

c. Code reviews the completed task

7. When all tasks complete: create a PR for me to review

8. Go back to the agent with any comments

9. If finished, delete the plan files and merge the PR

[0]: https://github.com/obra/superpowers

17. If you’ve ever wanted the ability to annotate the plan more visually, try fitting Plannotator into this workflow. There is a slash command for when you use custom workflows outside of normal plan mode.

https://github.com/backnotprop/plannotator

18. I'm using the in-built features as well, but I like the flow that I have with superpowers. You've made a lot of assumptions with your comment that are just not true (at least for me).

I find that brainstorming + (executing plans OR subagent driven development) is way more reliable than the built-in tooling.

19. Regarding inline notes, I use a specific format in the `/plan` command, using the `ME:` prefix.

https://github.com/srid/AI/blob/master/commands/plan.md#2-pl...

It works very similarly to Antigravity's plan-document comment-refine cycle.

https://antigravity.google/docs/implementation-plan
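The `ME:` annotation style described above might look something like this in practice (a hypothetical plan excerpt for illustration only; the actual format is defined in the linked plan.md):

```markdown
## Plan: add request caching

1. Introduce a cache interface in front of the fetch layer.
   ME: in-memory only for now, no Redis.
2. Wrap the user-lookup call with the cache.
   ME: make the TTL configurable, default 60s.
3. Add tests for cache hits, misses, and expiry.
```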

20. Shameless plug: https://beadhub.ai allows you to do exactly that, but with several agents in parallel. One of them is in the role of planner, which takes care of the source-of-truth document and the long term view. They all stay in sync with real-time chat and mail.

It's OSS.

Real-time work is happening at https://app.beadhub.ai/juanre/beadhub (beadhub is a public project at https://beadhub.ai so it is visible).

Particularly interesting (I think) is how the agents chat with each other, which you can see at https://app.beadhub.ai/juanre/beadhub/chat

21. The annotation cycle is the key insight for me. Treating the plan as a living doc you iterate on before touching any code makes a huge difference in output quality.

Experimentally, I've been using mfbt.ai [ https://mfbt.ai ] for roughly the same thing in a team context. It lets you collaboratively nail down the spec with AI before handing off to a coding agent via MCP.

Avoids the "everyone has a slightly different plan.md on their machine" problem. Still early days but it's been a nice fit for this kind of workflow.

22. I agree, and this is why I tend to use gptel in emacs for planning - the document is the conversation context, and can be edited and annotated as you like.

23. https://github.blog/ai-and-ml/generative-ai/spec-driven-deve...

24. The author seems to think they've invented a special workflow...

We all tend to regress to average (same thoughts/workflows)...

Have had many users already doing the exact same workflow with:
https://github.com/backnotprop/plannotator

25. Same, I formalized a similar workflow for my team (oriented around feature requirement docs). I am thinking about fully productizing it and am looking for feedback - https://acai.sh

Even if the product doesn’t resonate I think I’ve stumbled on some ideas you might find useful^

I do think spec-driven development is where this all goes. Still making up my mind though.

26. Spec-driven looks very much like what the author describes. He may have some tweaks of his own but they could just as well be coded into the artifacts that something like OpenSpec produces.

27. I recently discovered GitHub Spec Kit, which separates planning/execution into stages: specify, plan, tasks, implement. I'm finding it aligns with the OP in the level of “focus” and “attention” it gets out of Claude Code.

Spec Kit is worth trying as it automates what is being described here, and with Opus 4.6 it's been a kind of BC/AD moment for me.

28. Try OpenSpec and it'll do all this for you. SpecKit works too. I don't think there's a need to reinvent the wheel on this one, as this is spec-driven development.

29. I just use Jesse’s “superpowers” plugin. It does all of this but also steps you through the design and gives you bite sized chunks and you make architecture decisions along the way. Far better than making big changes to an already established plan.

30. Link for those interested: https://claude.com/plugins/superpowers

31. Have you tried https://github.com/pcvelz/superpowers ?

32. I suggest reading the tests that Superpowers author has come up with for testing the skills. See the GitHub repo.

33. https://github.com/obra/superpowers

34. I’ve been using Claude through opencode, and I figured this was just how it does it. I figured everyone else did it this way as well. I guess not!

35. The baffling part of the article is all the assertions about how this is unique, novel, not the typical way people are doing this etc.

There are whole products wrapped around this common workflow already (like Augment Intent).

36. Google Anti-Gravity has this process built in. This is essentially a cycle a developer would follow: plan/analyse - document/discuss - break down tasks/implement. We’ve been using requirements and design documents as best practice since leaving our teenage bedroom lab for the professional world. I suppose this could be seen as our coding agents coming of age.

37. This is a similar workflow to speckit, kiro, gsd, etc.

38. All sounds like a bespoke way of remaking https://github.com/Fission-AI/OpenSpec

39. I use Amazon Kiro.

The AI first works with you to write requirements, then it produces a design, then a task list.

This helps the AI work in smaller chunks; it will work on one task at a time.

I can let it run for an hour or more in this mode. Then there is lots of stuff to fix, but it is mostly correct.

Kiro also supports steering files: files that try to lock the AI into common design decisions.

The price is that a lot of the context is used up by these files, and Kiro constantly pauses to reset the context.

40. Sounds a bit like what Claude Plan Mode or Amazon's Kiro were built for. I agree it's a useful flow, but you can also overdo it.

41. There are a few prompt frameworks that essentially codify these types of workflows by adding skills and prompts:

https://github.com/obra/superpowers
https://github.com/jlevy/tbd

42. The “inline comments on a plan” is one of the best features of Antigravity, and I’m surprised others haven’t started copycatting.

43. My rlm-workflow skill has this encoded as a repeatable workflow.

Give it a try: https://skills.sh/doubleuuser/rlm-workflow/rlm-workflow

44. You described how AntiGravity works natively.

45. Use OpenSpec and simplify everything.

46. Kiro's spec-based development looks identical.

https://kiro.dev/docs/specs/

It looks verbose but it defines the requirements based on your input, and when you approve it then it defines a design, and (again) when you approve it then it defines an implementation plan (a series of tasks.)

47. One big thing for me has been the ability to iterate over plans - with a better visual view of them, as well as the ability to annotate feedback about the plan.

https://github.com/backnotprop/plannotator - Plannotator does this really effectively and natively, through hooks.
</comments_about_topic>

Write a concise, engaging paragraph (3-5 sentences) summarizing the key points and perspectives in these comments about the topic. Focus on the most interesting viewpoints. Do not use bullet points—write flowing prose.

topic

Spec-Driven Development Tools # References to existing frameworks: OpenSpec, SpecKit, BMAD-METHOD, Kiro, Antigravity. Discussion of how these tools formalize the research-plan-implement workflow described in the article.

commentCount

47
