Summarizer

LLM Input

llm/065c6e83-d0d5-4aca-be3d-92768a8a3506/batch-0-748a3235-c92f-467d-b989-01ab7d10f4ca-input.json

prompt

The following is content for you to classify. Do not respond to the comments—classify them.

<topics>
1. Not Novel or Revolutionary
   Related: Many commenters argue this workflow is standard practice, not radically different. References to existing tools like Kiro, OpenSpec, SpecKit, and Antigravity that already implement spec-driven development. Claims the approach was documented 2+ years ago in Cursor forums.
2. LLMs as Junior Developers
   Related: Analogy comparing LLMs to unreliable interns with boundless energy. Discussion of treating AI like junior developers requiring supervision, documentation, and oversight. The shift from coder to software manager role.
3. AI-Generated Article Concerns
   Related: Multiple commenters suspect the article itself was written by AI, noting characteristic style and patterns. Debate about whether AI-written content should be evaluated differently or dismissed outright.
4. Magic Words and Prompt Engineering
   Related: Skepticism about whether words like 'deeply' and 'in great details' actually affect LLM behavior. Discussion of attention mechanisms, emotional prompting research, and whether prompt techniques are superstition or cargo cult.
5. Planning vs Just Coding
   Related: Debate about whether extensive planning overhead eliminates time savings. Some argue writing specs takes longer than writing code. Others counter that planning prevents compounding errors and technical debt.
6. Spec-Driven Development Tools
   Related: References to existing frameworks: OpenSpec, SpecKit, BMAD-METHOD, Kiro, Antigravity. Discussion of how these tools formalize the research-plan-implement workflow described in the article.
7. Context Window Management
   Related: Strategies for handling large codebases and context limits. Maintaining markdown files for subsystems, using skills, aggressive compaction. Concerns about context rot and performance degradation.
8. Waterfall Methodology Comparison
   Related: Commenters note the approach resembles waterfall development with detailed upfront planning. Discussion of whether this contradicts agile principles or represents rediscovering proven methods.
9. Test-Driven Development Integration
   Related: Suggestions to add comprehensive tests to the workflow. Writing tests before implementation, using tests as verification. Arguments that test coverage enables safer refactoring with AI.
10. Single Session vs Multiple Sessions
   Related: Author's claim of running entire workflows in single long sessions without performance degradation. Others recommend clearing context between phases for better results.
11. Determinism and Reproducibility
   Related: Concerns about non-deterministic LLM outputs. Discussion of whether software engineering can accommodate probabilistic tools. Comparisons to gambling and slot machines.
12. Token Cost Considerations
   Related: Discussion of workflow being token-heavy and expensive. Comparisons between Claude subscription tiers. Arguments that simpler approaches save money while achieving similar results.
13. Annotation Workflow Details
   Related: Questions about how to format inline annotations for Claude to recognize. Techniques like TODO prefixes, HTML comments, and clear separation between human and AI-written content.
14. Subagent Architecture
   Related: Using multiple agents for different phases: planning, implementation, review. Red team/blue team approaches. Dispatching parallel agents for independent tasks.
15. Reference Implementation Technique
   Related: Using existing code from open source projects as examples for Claude. Questions about licensing implications. Claims this dramatically improves output quality.
16. Claude vs Other Models
   Related: Comparisons between Claude, Codex, Gemini, and other models. Discussion of model-specific behaviors and optimal prompting strategies. Using multiple models in complementary roles.
17. Greenfield vs Existing Codebases
   Related: Observation that most AI coding articles focus on greenfield development. Different challenges when working with legacy code and established patterns.
18. Human Review Requirements
   Related: Debate about whether all AI-generated code must be reviewed line-by-line. Questions about trust, liability, and whether AI can eventually be trusted without oversight.
19. Productivity Claims Skepticism
   Related: Questions about actual time savings versus perceived productivity. References to studies showing AI sometimes makes developers less productive. Concerns about false progress.
20. Documentation as Side Benefit
   Related: Plans and research documents serve as valuable documentation for future maintainers. Version controlling plan files in git. Using plans to understand architectural decisions later.
0. Does not fit well in any category
</topics>

<comments_to_classify>
[
  
{
  "id": "47109042",
  "text": "The author seems to think they've hit upon something revolutionary...\n\nThey've actually hit upon something that several of us have evolved to naturally.\n\nLLM's are like unreliable interns with boundless energy. They make silly mistakes, wander into annoying structural traps, and have to be unwound if left to their own devices. It's like the genie that almost pathologically misinterprets your wishes.\n\nSo, how do you solve that? Exactly how an experienced lead or software manager does: you have systems write it down before executing, explain things back to you, and ground all of their thinking in the code and documentation, avoiding making assumptions about code after superficial review.\n\nWhen it was early ChatGPT, this meant function-level thinking and clearly described jobs. When it was Cline it meant cline rules files that forced writing architecture.md files and vibe-code.log histories, demanding grounding in research and code reading.\n\nMaybe nine months ago, another engineer said two things to me, less than a day apart:\n\n- \"I don't understand why your clinerules file is so large. You have the LLM jumping through so many hoops and doing so much extra work. It's crazy.\"\n\n- The next morning: \"It's basically like a lottery. I can't get the LLM to generate what I want reliably. I just have to settle for whatever it comes up with and then try again.\"\n\nThese systems have to deal with minimal context, ambiguous guidance, and extreme isolation. Operate with a little empathy for the energetic interns, and they'll uncork levels of output worth fighting for. We're Software Managers now. For some of us, that's working out great."
}
,
  
{
  "id": "47109309",
  "text": "Revolutionary or not it was very nice of the author to make time and effort to share their workflow.\n\nFor those starting out using Claude Code it gives a structured way to get things done bypassing the time/energy needed to “hit upon something that several of us have evolved to naturally”."
}
,
  
{
  "id": "47109757",
  "text": "It's this line that I'm bristling at: \"...the workflow I’ve settled into is radically different from what most people do with AI coding tools...\"\n\nAnyone who spends some time with these tools (and doesn't black out from smashing their head against their desk) is going to find substantial benefit in planning with clarity.\n\nIt was #6 in Boris's run-down:\nhttps://news.ycombinator.com/item?id=46470017\n\nSo, yes, I'm glad that people write things out and share. But I'd prefer that they not lead with \"hey folks, I have news: we should *slice* our bread!\""
}
,
  
{
  "id": "47110201",
  "text": "But the author's workflow is actually very different from Boris'.\n\n#6 is about using plan mode whereas the author says \"The built-in plan mode sucks\".\n\nThe author's post is much more than just \"planning with clarity\"."
}
,
  
{
  "id": "47111222",
  "text": "I would say he’s saying “hey folks, I have news. We should slice our bread with a knife rather than the spoon that came with the bread.”"
}
,
  
{
  "id": "47111050",
  "text": "This kind of flows have been documented in the wild for some time now. They started to pop up in the Cursor forums 2+ years ago... eg: https://github.com/johnpeterman72/CursorRIPER\n\nPersonally I have been using a similar flow for almost 3 years now, tailored for my needs. Everybody who uses AI for coding eventually gravitates towards a similar pattern because it works quite well (for all IDEs, CLIs, TUIs)"
}
,
  
{
  "id": "47109400",
  "text": "Its ai written though, the tells are in pretty much every paragraph."
}
,
  
{
  "id": "47109450",
  "text": "I don’t think it’s that big a red flag anymore. Most people use ai to rewrite or clean up content, so I’d think we should actually evaluate content for what it is rather than stop at “nah it’s ai written.”"
}
,
  
{
  "id": "47109881",
  "text": ">Most people use ai to rewrite or clean up content\n\nI think your sentence should have been \"people who use ai do so to mostly rewrite or clean up content\", but even then I'd question the statistical truth behind that claim.\n\nPersonally, seeing something written by AI means that the person who wrote it did so just for looks and not for substance. Claiming to be a great author requires both penmanship and communication skills, and delegating one or either of them to a large language model inherently makes you less than that.\n\nHowever, when the point is just the contents of the paragraph(s) and nothing more then I don't care who or what wrote it. An example is the result of a research, because I'd certainly won't care about the prose or effort given to write the thesis but more on the results (is this about curing cancer now and forever? If yes, no one cares if it's written with AI).\n\nWith that being said, there's still that I get anywhere close to understanding the author behind the thoughts and opinions. I believe the way someone writes hints to the way they think and act. In that sense, using LLM's to rewrite something to make it sound more professional than what you would actually talk in appropriate contexts makes it hard for me to judge someone's character, professionalism, and mannerisms. Almost feels like they're trying to mask part of themselves. Perhaps they lack confidence in their ability to sound professional and convincing?"
}
,
  
{
  "id": "47110496",
  "text": "> I don’t think it’s that big a red flag anymore. Most people use ai to rewrite or clean up content, so I’d think we should actually evaluate content for what it is rather than stop at “nah it’s ai written.”\n\nUnfortunately, there's a lot of people trying to content-farm with LLMs; this means that whatever style they default to, is automatically suspect of being a slice of \"dead internet\" rather than some new human discovery.\n\nI won't rule out the possibility that even LLMs, let alone other AI, can help with new discoveries, but they are definitely better at writing persuasively than they are at being inventive, which means I am forced to use \"looks like LLM\" as proxy for both \"content farm\" and \"propaganda which may work on me\", even though some percentage of this output won't even be LLM and some percentage of what is may even be both useful and novel."
}
,
  
{
  "id": "47109628",
  "text": "I don't judge content for being AI written, I judge it for the content itself (just like with code).\n\nHowever I do find the standard out-of-the-box style very grating. Call it faux-chummy linkedin corporate workslop style.\n\nWhy don't people give the llm a steer on style? Either based on your personal style or at least on a writer whose style you admire. That should be easier."
}
,
  
{
  "id": "47109695",
  "text": "Because they think this is good writing. You can’t correct what you don’t have taste for. Most software engineers think that reading books means reading NYT non-fiction bestsellers."
}
,
  
{
  "id": "47110563",
  "text": "While I agree with:\n\n> Because they think this is good writing. You can’t correct what you don’t have taste for.\n\nI have to disagree about:\n\n> Most software engineers think that reading books means reading NYT non-fiction bestsellers.\n\nThere's a lot of scifi and fantasy in nerd circles, too. Douglas Adams, Terry Pratchett, Vernor Vinge, Charlie Stross, Iain M Banks, Arthur C Clarke, and so on.\n\nBut simply enjoying good writing is not enough to fully get what makes writing good. Even writing is not itself enough to get such a taste: thinking of Arthur C Clarke, I've just finished 3001, and at the end Clarke gives thanks to his editors, noting his own experience as an editor meant he held a higher regard for editors than many writers seemed to. Stross has, likewise, blogged about how writing a manuscript is only the first half of writing a book, because then you need to edit the thing."
}
,
  
{
  "id": "47111315",
  "text": "Very high chance someone that’s using Claude to write code is also using Claude to write a post from some notes. That goes beyond rewriting and cleaning up."
}
,
  
{
  "id": "47110236",
  "text": "Even though I use LLMs for code, I just can't read LLM written text, I kind of hate the style, it reminds me too much of LinkedIn."
}
,
  
{
  "id": "47111272",
  "text": "ai;dr\n\nIf your \"content\" smells like AI, I'm going to use _my_ AI to condense the content for me. I'm not wasting my time on overly verbose AI \"cleaned\" content.\n\nWrite like a human, have a blog with an RSS feed and I'll most likely subscribe to it."
}
,
  
{
  "id": "47109532",
  "text": "Well, real humans may read it though. Personally I much prefer real humans write real articles than all this AI generated spam-slop. On youtube this is especially annoying - they mix in real videos with fake ones. I see this when I watch animal videos - some animal behaviour is taken from older videos, then AI fake is added. My own policy is that I do not watch anything ever again from people who lie to the audience that way so I had to begin to censor away such lying channels. I'd apply the same rationale to blog authors (but I am not 100% certain it is actually AI generated; I just mention this as a safety guard)."
}
,
  
{
  "id": "47109771",
  "text": "The main issue with evaluating content for what it is is how extremely asymmetric that process has become.\n\nSlop looks reasonable on the surface, and requires orders of magnitude more effort to evaluate than to produce. It’s produced once, but the process has to be repeated for every single reader.\n\nDisregarding content that smells like AI becomes an extremely tempting early filtering mechanism to separate signal from noise - the reader’s time is valuable."
}
,
  
{
  "id": "47110230",
  "text": "If you want to write something with AI, send me your prompt. I'd rather read what you intend for it to produce rather than what it produces. If I start to believe you regularly send me AI written text, I will stop reading it. Even at work. You'll have to call me to explain what you intended to write."
}
,
  
{
  "id": "47110358",
  "text": "And if my prompt is a 10 page wall of text that I would otherwise take the time to have the AI organize, deduplicate, summarize, and sharpen with an index, executive summary, descriptive headers, and logical sections, are you going to actually read all of that, or just whine \"TL;DR\"?\n\nIt's much more efficient and intentional for the writer to put the time into doing the condensing and organizing once, and review and proofread it to make sure it's what they mean, than to just lazily spam every human they want to read it with the raw prompt, so every recipient has to pay for their own AI to perform that task like a slot machine, producing random results not reviewed and approved by the author as their intended message.\n\nIs that really how you want Hacker News discussions and your work email to be, walls of unorganized unfiltered text prompts nobody including yourself wants to take the time to read? Then step aside, hold my beer!\n\nOr do you prefer I should call you on the phone and ramble on for hours in an unedited meandering stream of thought about what I intended to write?"
}
,
  
{
  "id": "47110620",
  "text": "Yeah but it's not. This a complete contrivance and you're just making shit up. The prompt is much shorter than the output and you are concealing that fact. Why?\n\nGithub repo or it didn't happen. Let's go."
}
,
  
{
  "id": "47109473",
  "text": "I think as humans it's very hard to abstract content from its form. So when the form is always the same boring, generic AI slop, it's really not helping the content."
}
,
  
{
  "id": "47109525",
  "text": "And maybe writing an article or a keynote slides is one of the few places we can still exerce some human creativity, especially when the core skills (programming) is almost completely in the hands of LLMs already"
}
,
  
{
  "id": "47109607",
  "text": "> I don’t think it’s that big a red flag anymore.\n\nIt is to me, because it indicates the author didn't care about the topic. The only thing they cared about is to write an \"insightful\" article about using llms. Hence this whole thing is basically linked-in resume improvement slop.\n\nNot worth interacting with, imo\n\nAlso, it's not insightful whatsoever. It's basically a retelling of other articles around the time Claude code was released to the public (March-August 2025)"
}
,
  
{
  "id": "47110088",
  "text": ">the tells are in pretty much every paragraph.\n\nIt's not just misleading — it's lazy.\nAnd honestly? That doesn't vibe with me.\n\n[/s obviously]"
}
,
  
{
  "id": "47109725",
  "text": "So is GP.\n\nThis is clearly a standard AI exposition:\n\nLLM's are like unreliable interns with boundless energy. They make silly mistakes, wander into annoying structural traps, and have to be unwound if left to their own devices. It's like the genie that almost pathologically misinterprets your wishes."
}
,
  
{
  "id": "47110297",
  "text": "Then ask your own ai to rewrite it so it doesn't trigger you into posting uninteresting thought stopping comments proclaiming why you didn't read the article, that don't contribute to the discussion."
}
,
  
{
  "id": "47109644",
  "text": "Here's mine! https://github.com/pjlsergeant/moarcode"
}
,
  
{
  "id": "47109900",
  "text": "Agreed. The process described is much more elaborate than what I do but quite similar. I start to discuss in great details what I want to do, sometimes asking the same question to different LLMs. Then a todo list, then manual review of the code, esp. each function signature, checking if the instructions have been followed and if there are no obvious refactoring opportunities (there almost always are).\n\nThe LLM does most of the coding, yet I wouldn't call it \"vibe coding\" at all.\n\n\"Tele coding\" would be more appropriate."
}
,
  
{
  "id": "47111077",
  "text": "I use AWS Kiro, and its spec driven developement is exactly this, I find it really works well as it makes me slow down and think about what I want it to do.\n\nRequirements, design, task list, coding."
}
,
  
{
  "id": "47111062",
  "text": "I've been doing the exact same thing for 2 months now. I wish I had gotten off my ass and written a blog post about it. I can't blame the author for gathering all the well deserved clout they are getting for it now."
}
,
  
{
  "id": "47111601",
  "text": "I went through the blog. I started using Claude Code about 2 weeks ago and my approach is practically the same. It just felt logical. I think there are a bunch of us who have landed on this approach and most are just quietly seeing the benefits."
}
,
  
{
  "id": "47111170",
  "text": "Don’t worry. This advice has been going around for much more than 2 months, including links posted here as well as official advice from the major companies (OpenAI and Anthropic) themselves. The tools literally have had plan mode as a first class feature.\n\nSo you probably wouldn’t have any clout anyways, like all of the other blog posts."
}
,
  
{
  "id": "47109257",
  "text": "I’ve also found that a bigger focus on expanding my agents.md as the project rolls on has led to less headaches overall and more consistency (non-surprisingly). It’s the same as asking juniors to reflect on the work they’ve completed and to document important things that can help them in the future. Software Manger is a good way to put this."
}
,
  
{
  "id": "47109580",
  "text": "AGENTS.md should mostly point to real documentation and design files that humans will also read and keep up to date. It's rare that something about a project is only of interest to AI agents."
}
,
  
{
  "id": "47110290",
  "text": "It feels like retracing the history of software project management. The post is quite waterfall-like. Writing a lot of docs and specs upfront then implementing. Another approach is to just YOLO (on a new branch) make it write up the lessons afterwards, then start a new more informed try and throw away the first. Or any other combo.\n\nFor me what works well is to ask it to write some code upfront to verify its assumptions against actual reality, not just be telling it to review the sources \"in detail\". It gains much more from real output from the code and clears up wrong assumptions. Do some smaller jobs, write up md files, then plan the big thing, then execute."
}
,
  
{
  "id": "47110503",
  "text": "'The post is quite waterfall-like. Writing a lot of docs and specs upfront then implementing' - It's only waterfall if the specs cover the entire system or app. If it's broken up into sub-systems or vertical slices, then it's much more Agile or Lean."
}
,
  
{
  "id": "47110325",
  "text": "This is exactly what I do. I assume most people avoid this approach due to cost."
}
,
  
{
  "id": "47110329",
  "text": "It makes an endless stream of assumptions. Some of them brilliant and even instructive to a degree, but most of them are unfounded and inappropriate in my experience."
}
,
  
{
  "id": "47109312",
  "text": "I really like your analogy of LLMs as 'unreliable interns'. The shift from being a 'coder' to a 'software manager' who enforces documentation and grounding is the only way to scale these tools. Without an architecture.md or similar grounding, the context drift eventually makes the AI-generated code a liability rather than an asset. It's about moving the complexity from the syntax to the specification."
}
,
  
{
  "id": "47111311",
  "text": "> LLM's are like unreliable interns with boundless energy\n\nThis isn’t directed specifically at you but the general community of SWEs: we need to stop anthropomorphizing a tool. Code agents are not human capable and scaling pattern matching will never hit that goal. That’s all hype and this is coming from someone who runs the range of daily CC usage. I’m using CC to its fullest capability while also being a good shepherd for my prod codebases.\n\nPretending code agents are human capable is fueling this koolaide drinking hype craze."
}
,
  
{
  "id": "47109273",
  "text": "Oh no, maybe the V-Model was right all the time? And right sizing increments with control stops after them. No wonder these matrix multiplications start to behave like humans, that is what we wanted them to do."
}
,
  
{
  "id": "47109380",
  "text": "So basically you’re saying LLMs are helping us be better humans?"
}
,
  
{
  "id": "47109534",
  "text": "Better humans? How and where?"
}
,
  
{
  "id": "47109709",
  "text": "It's nice to have it written down in a concise form. I shared it with my team as some engineers have been struggling with AI, and I think this (just trying to one-shot without planning) could be why."
}
,
  
{
  "id": "47111464",
  "text": "if only there was another simpler way to use your knowledge to write code..."
}
,
  
{
  "id": "47110596",
  "text": "If you have a big rules file you’re in the right direction but still not there. Just as with humans, the key is that your architecture should make it very difficult to break the rules by accident and still be able to compile/run with correct exit status.\n\nMy architecture is so beautifully strong that even LLMs and human juniors can’t box their way out of it."
}
,
  
{
  "id": "47109385",
  "text": "It's alchemy all over again."
}
,
  
{
  "id": "47109536",
  "text": "Alchemy involved a lot of do-it-yourself though. With AI it is like someone else does all the work (well, almost all the work)."
}
,
  
{
  "id": "47109583",
  "text": "It was mainly a jab at the protoscientific nature of it."
}

]
</comments_to_classify>

Based on the comments above, assign each comment up to 3 relevant topics.

Return ONLY a JSON array with this exact structure (no other text):

[
  {
    "id": "comment_id_1",
    "topics": [1, 3, 5]
  },
  {
    "id": "comment_id_2",
    "topics": [2]
  },
  {
    "id": "comment_id_3",
    "topics": [0]
  },
  ...
]

Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices for matches
- Use index 0 if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment

Remember: Output ONLY the JSON array, no other text.
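As a sketch of how a downstream consumer of this prompt might enforce the rules above, here is a minimal validator. It assumes the model's reply is parsed from a raw string and that the topic list runs from 1 to 20 (with 0 as the catch-all); the function name and signature are illustrative, not part of the original pipeline.

```python
import json

def validate_classifications(raw: str, num_topics: int = 20) -> list[dict]:
    """Parse a model reply and enforce the prompt's rules:
    a JSON array of {"id": str, "topics": [int, ...]} objects,
    each with 0 to 3 topic indices in the range 0..num_topics."""
    data = json.loads(raw)
    if not isinstance(data, list):
        raise ValueError("reply must be a JSON array")
    for entry in data:
        # Each entry must have exactly the two expected keys.
        if set(entry) != {"id", "topics"}:
            raise ValueError(f"unexpected keys: {sorted(entry)}")
        if not isinstance(entry["id"], str):
            raise ValueError("id must be a string")
        topics = entry["topics"]
        # Rule: 0 to 3 topics, each a 1-based index (or 0 for "no fit").
        if not (0 <= len(topics) <= 3):
            raise ValueError(f"{entry['id']}: expected 0-3 topics")
        if not all(isinstance(t, int) and 0 <= t <= num_topics for t in topics):
            raise ValueError(f"{entry['id']}: topic index out of range")
    return data

reply = '[{"id": "47109042", "topics": [1, 2]}]'
print(validate_classifications(reply))
```

A reply that violates any rule (extra keys, more than three topics, an out-of-range index) raises a `ValueError` rather than silently passing malformed labels downstream.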

commentCount

50
