Summarizer

LLM Input

llm/065c6e83-d0d5-4aca-be3d-92768a8a3506/batch-8-09e06099-8529-4e6e-93a4-39404e78b185-input.json

prompt

The following is content for you to classify. Do not respond to the comments—classify them.

<topics>
1. Not Novel or Revolutionary
   Related: Many commenters argue this workflow is standard practice, not radically different. References to existing tools like Kiro, OpenSpec, SpecKit, and Antigravity that already implement spec-driven development. Claims the approach was documented 2+ years ago in Cursor forums.
2. LLMs as Junior Developers
   Related: Analogy comparing LLMs to unreliable interns with boundless energy. Discussion of treating AI like junior developers requiring supervision, documentation, and oversight. The shift from coder to software manager role.
3. AI-Generated Article Concerns
   Related: Multiple commenters suspect the article itself was written by AI, noting characteristic style and patterns. Debate about whether AI-written content should be evaluated differently or dismissed outright.
4. Magic Words and Prompt Engineering
   Related: Skepticism about whether words like 'deeply' and 'in great details' actually affect LLM behavior. Discussion of attention mechanisms, emotional prompting research, and whether prompt techniques are superstition or cargo cult.
5. Planning vs Just Coding
   Related: Debate about whether extensive planning overhead eliminates time savings. Some argue writing specs takes longer than writing code. Others counter that planning prevents compounding errors and technical debt.
6. Spec-Driven Development Tools
   Related: References to existing frameworks: OpenSpec, SpecKit, BMAD-METHOD, Kiro, Antigravity. Discussion of how these tools formalize the research-plan-implement workflow described in the article.
7. Context Window Management
   Related: Strategies for handling large codebases and context limits. Maintaining markdown files for subsystems, using skills, aggressive compaction. Concerns about context rot and performance degradation.
8. Waterfall Methodology Comparison
   Related: Commenters note the approach resembles waterfall development with detailed upfront planning. Discussion of whether this contradicts agile principles or represents rediscovering proven methods.
9. Test-Driven Development Integration
   Related: Suggestions to add comprehensive tests to the workflow. Writing tests before implementation, using tests as verification. Arguments that test coverage enables safer refactoring with AI.
10. Single Session vs Multiple Sessions
   Related: Author's claim of running entire workflows in single long sessions without performance degradation. Others recommend clearing context between phases for better results.
11. Determinism and Reproducibility
   Related: Concerns about non-deterministic LLM outputs. Discussion of whether software engineering can accommodate probabilistic tools. Comparisons to gambling and slot machines.
12. Token Cost Considerations
   Related: Discussion of workflow being token-heavy and expensive. Comparisons between Claude subscription tiers. Arguments that simpler approaches save money while achieving similar results.
13. Annotation Workflow Details
   Related: Questions about how to format inline annotations for Claude to recognize. Techniques like TODO prefixes, HTML comments, and clear separation between human and AI-written content.
14. Subagent Architecture
   Related: Using multiple agents for different phases: planning, implementation, review. Red team/blue team approaches. Dispatching parallel agents for independent tasks.
15. Reference Implementation Technique
   Related: Using existing code from open source projects as examples for Claude. Questions about licensing implications. Claims this dramatically improves output quality.
16. Claude vs Other Models
   Related: Comparisons between Claude, Codex, Gemini, and other models. Discussion of model-specific behaviors and optimal prompting strategies. Using multiple models in complementary roles.
17. Greenfield vs Existing Codebases
   Related: Observation that most AI coding articles focus on greenfield development. Different challenges when working with legacy code and established patterns.
18. Human Review Requirements
   Related: Debate about whether all AI-generated code must be reviewed line-by-line. Questions about trust, liability, and whether AI can eventually be trusted without oversight.
19. Productivity Claims Skepticism
   Related: Questions about actual time savings versus perceived productivity. References to studies showing AI sometimes makes developers less productive. Concerns about false progress.
20. Documentation as Side Benefit
   Related: Plans and research documents serve as valuable documentation for future maintainers. Version controlling plan files in git. Using plans to understand architectural decisions later.
0. Does not fit well in any category
</topics>

<comments_to_classify>
[
  
{
  "id": "47110032",
  "text": "Another approach is to spec functionality using comments and interfaces, then tell the LLM to first implement tests and finally make the tests pass. This way you also get regression safety and can inspect that it works as it should via the tests."
}
,
  
{
  "id": "47107646",
  "text": "Claude appeared to just crash in my session: https://news.ycombinator.com/item?id=47107630"
}
,
  
{
  "id": "47109089",
  "text": "That is just spec driven development without a spec, starting with the plan step instead."
}
,
  
{
  "id": "47108371",
  "text": "This is just Waterfall for LLMs. What happens when you explore the problem space and need to change up the plan?"
}
,
  
{
  "id": "47107690",
  "text": "AI only improves and changes. Embrace the scientific method and make sure your “here’s how to” are based in data."
}
,
  
{
  "id": "47107822",
  "text": "my rlm-workflow skill has this encoded as a repeatable workflow.\n\ngive it a try: https://skills.sh/doubleuuser/rlm-workflow/rlm-workflow"
}
,
  
{
  "id": "47109931",
  "text": "Sorry but I didn't get the hype with this post, isnt it what most of the people doing? I want to see more posts on how you use the claude \"smart\" without feeding the whole codebase polluting the context window and also more best practices on cost efficient ways to use it, this workflow is clearly burning million tokens per session, for me is a No"
}
,
  
{
  "id": "47107873",
  "text": "That's great, actually, doesn't the logic apply to other services as well?"
}
,
  
{
  "id": "47109953",
  "text": "I feel like if I have to do all this, I might as well write the code myself."
}
,
  
{
  "id": "47109586",
  "text": "This is great. My workflow is also heading in that direction, so this is a great roadmap. I've already learned that just naively telling Claude what to do and letting it work, is a recipe for disaster and wasted time.\n\nI'm not this structured yet, but I often start with having it analyse and explain a piece of code, so I can correct it before we move on. I also often switch to an LLM that's separate from my IDE because it tends to get confused by sprawling context."
}
,
  
{
  "id": "47108107",
  "text": "My workflow is a bit different.\n\n* I ask the LLM for it's understanding of a topic or an existing feature in code. It's not really planning, it's more like understanding the model first\n\n* Then based on its understanding, I can decide how great or small to scope something for the LLM\n\n* An LLM showing good understand can deal with a big task fairly well.\n\n* An LLM showing bad understanding still needs to be prompted to get it right\n\n* What helps a lot is reference implementations. Either I have existing code that serves as the reference or I ask for a reference and I review.\n\nA few folks do it at my work do it OPs way, but my arguments for not doing it this way\n\n* Nobody is measuring the amount of slop within the plan. We only judge the implementation at the end\n\n* it's still non deterministic - folks will have different experiences using OPs methods. If claude updates its model, it outdates OPs suggestions by either making it better or worse. We don't evaluate when things get better, we only focus on things not gone well.\n\n* it's very token heavy - LLM providers insist that you use many tokens to get the task done. It's in their best interest to get you to do this. For me, LLMs should be powerful enough to understand context with minimal tokens because of the investment into model training.\n\nBoth ways gets the task done and it just comes down to my preference for now.\n\nFor me, I treat the LLM as model training + post processing + input tokens = output tokens. I don't think this is the best way to do non deterministic based software development. For me, we're still trying to shoehorn \"old\" deterministic programming into a non deterministic LLM."
}
,
  
{
  "id": "47107700",
  "text": "Is this not just Ralph with extra steps and the risk of context rot?"
}
,
  
{
  "id": "47109810",
  "text": "That's exactly what Cursor's \"plan\" mode does? It even creates md files, which seems to be the main \"thing\" the author discovered. Along with some cargo cult science?\n\nHow is this noteworthy other than to spark a discussion on hn? I mean I get it, but a little more substance would be nice."
}
,
  
{
  "id": "47107537",
  "text": "How much time are you actually saving at this point?"
}
,
  
{
  "id": "47110210",
  "text": "Has Claude Code become slow, laggy, imprecise, giving wrong answers for other people here?"
}
,
  
{
  "id": "47110195",
  "text": "This is exactly how I use it."
}
,
  
{
  "id": "47107434",
  "text": "Tip:\nLLMs are very good at following conventions (this is actually what is happening when it writes code).\nIf you create a .md file with a list of entries of the following structure:\n# <identifier>\n<description block>\n<blank space>\n# <identifier>\n...\nwhere an <identifier> is a stable and concise sequence of tokens that identifies some \"thing\" and seed it with 5 entries describing abstract stuff, the LLM will latch on and reference this. I call this a PCL (Project Concept List). I just tell it:\n> consume tmp/pcl-init.md pcl.md\nThe pcl-init.md describes what PCL is and pcl.md is the actual list.\nI have pcl.md file for each independent component in the code (logging, http, auth, etc).\nThis works very very well.\nThe LLM seems to \"know\" what you're talking about.\nYou can ask questions and give instructions like \"add a PCL entry about this\".\nIt will ask if should add a PCL entry about xyz.\nIf the description block tends to be high information-to-token ratio, it will follow that convention (which is a very good convention BTW).\n\nHowever, there is a caveat. LLMs resist ambiguity about authority. So the \"PCL\" or whatever you want to call it, needs to be the ONE authoritative place for everything. If you have the same stuff in 3 different files, it won't work nearly as well.\n\nBonus Tip:\nI find long prompt input with example code fragments and thoughtful descriptions work best at getting an LLM to produce good output. But there will always be holes (resource leaks, vulnerabilities, concurrency flaws, etc). So then I update my original prompt input (keep it in a separate file PROMPT.txt as a scratch pad) to add context about those things maybe asking questions along the way to figure out how to fix the holes. Then I /rewind back to the prompt and re-enter the updated prompt. This feedback loop advances the conversation without expending tokens."
}
,
  
{
  "id": "47110187",
  "text": "What works extremely well for me is this: Let Claude Code create the plan, then turn over the plan to Codex for review, and give the response back to Claude Code. Codex is exceptionally good at doing high level reviews and keeping an eye on the details. It will find very suble errors and omissins. And CC is very good at quickly converting the plan into code.\n\nThis back and forth between the two agents with me steering the conversation elevates Claude Code into next level."
}
,
  
{
  "id": "47107968",
  "text": "You described how AntiGravity works natively."
}
,
  
{
  "id": "47108987",
  "text": "falling asleep here. when will the babysitting end"
}
,
  
{
  "id": "47107422",
  "text": "Use OpenSpec and simplify everything."
}
,
  
{
  "id": "47108768",
  "text": "Honestly, I found that the best way to use these CLIs is exactly how the CLI creators have intended."
}
,
  
{
  "id": "47107049",
  "text": "The plan document and todo are an artifact of context size limits. I use them too because it allows using /reset and then continuing."
}
,
  
{
  "id": "47109127",
  "text": "I don't know. I tried various methods. And this one kind of doesn't work quite a bit of times. The problem is plan naturally always skips some important details, or assumes some library function, but is taken as instruction in the next section. And claude can't handle ambiguity if the instruction is very detailed(e.g. if plan asks to use a certain library even if it is a bad fit claude won't know that decision is flexible). If the instruction is less detailed, I saw claude is willing to try multiple things and if it keeps failing doesn't fear in reverting almost everything.\n\nIn my experience, the best scenario is that instruction and plan should be human written, and be detailed."
}
,
  
{
  "id": "47108996",
  "text": "We're just slowly reinventing agile for telling Ai agents what to do lol\n\nJust skip to the Ai stand-ups"
}
,
  
{
  "id": "47109364",
  "text": "Another pattern is:\n\n1. First vibecode software to figure out what you want\n\n2. Then throw it out and engineer it"
}
,
  
{
  "id": "47107736",
  "text": "Wow, I never bother with using phrases like “deeply study this codebase deeply.” I consistently get pretty fantastic results."
}
,
  
{
  "id": "47107160",
  "text": "I have a different approach where I have claude write coding prompts for stages then I give the prompt to another agent. I wonder if I should write it up as a blog post"
}
,
  
{
  "id": "47108603",
  "text": "add another agent review, I ask Claude to send plan for review to Codex and fix critical and high issues, with complexity gating (no overcomplicated logic), run in a loop, then send to Gemini reviewer, then maybe final pass with Claude, once all C+H pass the sequence is done"
}
,
  
{
  "id": "47107067",
  "text": "Kiro's spec-based development looks identical.\n\nhttps://kiro.dev/docs/specs/\n\nIt looks verbose but it defines the requirements based on your input, and when you approve it then it defines a design, and (again) when you approve it then it defines an implementation plan (a series of tasks.)"
}
,
  
{
  "id": "47107194",
  "text": "This separation of planning and execution resonates deeply with how I approach task management in general, not just coding.\n\nThe key insight here - that planning and execution should be distinct phases - applies to productivity tools too. I've been using www.dozy.site which takes a similar philosophy: it has smart calendar scheduling that automatically fills your empty time slots with planned tasks. The planning happens first (you define your tasks and projects), then the execution is automated (tasks get scheduled into your calendar gaps).\n\nThe parallel is interesting: just like you don't want Claude writing code before the plan is solid, you don't want to manually schedule tasks before you've properly planned what needs to be done. The separation prevents wasted effort and context switching.\n\nThe annotation cycle you describe (plan -> review -> annotate -> refine) is exactly how I work with my task lists too. Define the work, review it, adjust priorities and dependencies, then let the system handle the scheduling."
}
,
  
{
  "id": "47107211",
  "text": "Pretty sure this entire comment is AI generated."
}
,
  
{
  "id": "47107403",
  "text": "Almost think we're at the point on HN where we need a special [flag bot] link for those that meet a certain threshold and it alerts @dang or something to investigate them in more detail. The amount of bots on here has been increasing at an alarming rate."
}
,
  
{
  "id": "47108719",
  "text": "There has been this really weird flood of new accounts lately that are making these kinds of bot comments with no clear purpose to making them. Maybe it comes from people experimenting with OpenClaw?"
}
,
  
{
  "id": "47107891",
  "text": "I don't see how this is 'radically different' given that Claude Code literally has a planning mode.\n\nThis is my workflow as well, with the big caveat that 80% of 'work' doesn't require substantive planning, we're making relatively straight forward changes.\n\nEdit: there is nothing fundamentally different about 'annotating offline' in an MD vs in the CLI and iterating until the plan is clear. It's a UI choice.\n\nSpec Driven Coding with AI is very well established, so working from a plan, or spec (they can be somewhat different) is not novel.\n\nThis is conventional CC use."
}
,
  
{
  "id": "47107904",
  "text": "last i checked, you can't annotate inline with planning mode. you have to type a lot to explain precisely what needs to change, and then it re-presents you with a plan (which may or may not have changed something else).\n\ni like the idea of having an actual document because you could actually compare the before and after versions if you wanted to confirm things changed as intended when you gave feedback"
}
,
  
{
  "id": "47107963",
  "text": "'Giving precise feedback on a plan' is literally annotating the plan.\n\nIt comes back to you with an update for verification.\n\nYou ask it to 'write the plan' as matter of good practice.\n\nWhat the author is describing is conventional usage of claude code."
}
,
  
{
  "id": "47108358",
  "text": "A plan is just a file you can edit and then tell CC to check your annotations"
}
,
  
{
  "id": "47107082",
  "text": "One thing for me has been the ability to iterate over plans - with a better visual of them as well as ability to annotate feedback about the plan.\n\nhttps://github.com/backnotprop/plannotator Plannotator does this really effectively and natively through hooks"
}
,
  
{
  "id": "47107128",
  "text": "Wow, I've been needing this! The one issue I’ve had with terminals is reviewing plans, and desiring the ability to provide feedback on specific plan sections in a more organized way.\n\nReally nice ui based on the demo."
}

]
</comments_to_classify>

Based on the comments above, assign each comment to up to 3 relevant topics.

Return ONLY a JSON array with this exact structure (no other text):
[
  {"id": "comment_id_1", "topics": [1, 3, 5]},
  {"id": "comment_id_2", "topics": [2]},
  {"id": "comment_id_3", "topics": [0]},
  ...
]

Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices for matches
- Use index 0 if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment

Remember: Output ONLY the JSON array, no other text.
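The output rules above can be checked mechanically before the batch result is accepted. The sketch below is a hypothetical validator (it is not part of this batch pipeline; the function name and error strings are illustrative) that enforces the stated constraints: the response must be a bare JSON array, every object must carry a known comment id, and each `topics` list must hold at most 3 integer indices in the range 0 to 20.

```python
import json

# Topics are numbered 1..20 in the prompt, with 0 as the catch-all bucket.
NUM_TOPICS = 20


def validate_response(raw: str, expected_ids: set) -> list:
    """Return a list of rule violations; an empty list means the response is valid.

    Hypothetical helper, not part of the actual batch pipeline.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return ["not valid JSON: %s" % e]
    if not isinstance(data, list):
        return ["top-level value is not a JSON array"]

    errors = []
    seen = set()
    for item in data:
        cid = item.get("id") if isinstance(item, dict) else None
        if cid not in expected_ids:
            errors.append("unknown or missing id: %r" % (cid,))
            continue
        seen.add(cid)
        topics = item.get("topics", [])
        # Rule: each comment can have 0 to 3 topics.
        if not isinstance(topics, list) or len(topics) > 3:
            errors.append("%s: topics must be a list of at most 3 entries" % cid)
        # Rule: indices are 1-based topic numbers, or 0 for "does not fit".
        elif not all(isinstance(t, int) and 0 <= t <= NUM_TOPICS for t in topics):
            errors.append("%s: topic indices must be integers in 0..%d" % (cid, NUM_TOPICS))
    missing = expected_ids - seen
    if missing:
        errors.append("missing classifications for: %s" % sorted(missing))
    return errors
```

For example, a response of `[{"id": "47110032", "topics": [9, 28]}]` against the full id set would be flagged twice: once for the out-of-range index 28, and once for every comment id left unclassified.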

commentCount: 40
