llm/0c6097e3-bc76-4fbe-ab4f-ceafa2484e5f/batch-9-70f1891f-1573-4a2a-8c62-e58bcb84833c-input.json
The following is content for you to classify. Do not respond to the comments—classify them.
<topics>
1. AI Performance on Greenfield vs. Legacy
Related: Users debate whether agents excel primarily at starting new projects from scratch while struggling to maintain large, complex, or legacy codebases without breaking existing conventions.
2. Context Window Limitations and Management
Related: Discussions focus on token limits (200k), performance degradation as context fills, and strategies like compacting history, using sub-agents, or maintaining summary files to preserve long-term memory.
3. Vibe Coding and Code Quality
Related: The polarization around building apps without reading the code; critics warn of unmaintainable "slop" and technical debt, while proponents value the speed and ability to bypass syntax.
4. Claude Code and Tooling
Related: Specific praise and critique for the Claude Code CLI, its integration with VS Code and Cursor, the use of slash commands, and comparisons to GitHub Copilot's agent mode.
5. Economic Impact on Software Jobs
Related: Existential anxiety regarding the obsolescence of mid-level engineers, the potential "hollowing out" of the middle class, and the shift toward one-person unicorn teams.
6. Prompt Engineering and Configuration
Related: Strategies involving `CLAUDE.md`, `AGENTS.md`, and custom system prompts to teach the AI coding conventions, architecture, and specific skills for better output.
7. Specific Language Capabilities
Related: Anecdotal evidence regarding proficiency in React, Python, and Go versus struggles in C++, Rust, and mobile development (Swift/Kotlin), often tied to training data availability.
8. Engineering vs. Coding
Related: A recurring distinction between "coding" (boilerplate, standard patterns) which AI conquers, and "engineering" (novel logic, complex systems, 3D graphics) where AI supposedly still fails.
9. Security and Trust
Related: Concerns about deploying unaudited AI code, the introduction of vulnerabilities, the risks of giving agents shell access, and the difficulty of verifying AI output.
10. The Skill Issue Argument
Related: Proponents dismiss failures as "skill issues," suggesting frustration stems from poor prompting or adaptability, while skeptics argue the tools are genuinely inconsistent.
11. Cost of AI Development
Related: Analysis of the financial viability of AI coding, including hitting API rate limits, the high cost of Opus 4.5 tokens, and the potential unsustainability of VC-subsidized pricing.
12. Future of Software Products
Related: Predictions that software creation costs will drop to zero, leading to a flood of bespoke personal apps replacing commercial SaaS, but potentially creating a maintenance nightmare.
13. Human-in-the-Loop Workflows
Related: The consensus that AI requires constant human oversight, "tools in a loop," and code review to prevent hallucination loops and ensure functional software.
14. Opus 4.5 vs. Previous Models
Related: Users describe the specific model as a "step change" or "inflection point" compared to Sonnet 3.5 or GPT-4, citing better reasoning and autonomous behavior.
15. Documentation and Specification
Related: The shift from writing code to writing specs; users find that detailed markdown documentation or "plan mode" yields significantly better AI results than vague prompts.
16. AI Hallucinations and Errors
Related: Reports of AI inventing non-existent CLI tools, getting stuck in logical loops, failing at visual UI tasks, and making simple indexing errors.
17. Shift in Developer Role
Related: The idea that developers are evolving into "product managers" or "architects" who direct agents, requiring less syntax proficiency and more systems thinking.
18. Testing and Verification
Related: The reliance on test-driven development (TDD), linters, and compilers to constrain non-deterministic AI output, ensuring generated code actually runs and meets requirements.
19. Local Models vs. Cloud APIs
Related: Discussions on the viability of local models for privacy and cost savings versus the necessity of massive cloud models like Opus for complex reasoning tasks.
20. Societal Implications
Related: Broader philosophical concerns about wealth concentration, the "class war" of automation, environmental impact, and the future of work in a post-code world.
0. Does not fit well in any category
</topics>
<comments_to_classify>
[
{
"id": "46531386",
"text": "A CV for the disappearing job market as you shovel money into an oligarchy."
}
,
{
"id": "46523653",
"text": "I'd quickly trash your application if I see you just vibe coded some bullshit app.\nDeveloping is about working smart, and it's not smart to ask AI to code stuff that already exists; it's in fact wasteful."
}
,
{
"id": "46526615",
"text": "Have you ever tried to find software for a specific need? I usually spend hours investigating anything I can find only to discover that all options are bad in one way or another and cover my use case partially at best. It's dreadful, unrewarding work that I always fear. Being able to spend those hours developing a custom solution that has exactly what I need, no more, no less, that I can evolve further as my requirements evolve, all that while enjoying myself, is a godsend."
}
,
{
"id": "46522069",
"text": "The same exists in humans also. I worked with a developer who had 15 years' experience and was a tech lead in a big Indian firm. We started something together; 3 months back when I checked the tables I was shocked to see how he fucked up and messed up the DB. Finally the only option left to me was to quit, because I knew it would break in production, and if I onboarded a single customer my life would be screwed. He mixed many things into the frontend and offloaded even permissions to the frontend, and literally copied tables into multiple DBs (we had 3 services). I still cannot believe how he worked as a tech lead for 15 years. Each DB had more than 100 tables, and out of those 20-25 were duplicates. He never shared code with me, but I smelled something fishy when bug fixing became a never-ending loop and my frontend guy told me he cannot do it anymore. The only mistake I made was trusting him, and the worst part is he is my cousin; the relation became sour after I confronted him and decided to quit."
}
,
{
"id": "46523448",
"text": "This sounds like a culture issue in the development process, I have seen this prevented many times. Sure I did have to roll back a feature I did not sign off just before new years. So as you say it happens."
}
,
{
"id": "46522887",
"text": "How did he not share code if you're working together?"
}
,
{
"id": "46523056",
"text": "Yes, it was my mistake. I trusted him because he was my childhood friend and my cousin. He was a tech lead in a CMMI Level 5 company (serving Fortune 500 firms) at the time he joined me. I had the trust that he would never run away with the code, and that trust is still there; also, the entire feature set, roadmap and vision were with me, so I thought the code didn't matter. It was a big learning for me."
}
,
{
"id": "46523240",
"text": "That's a crazy story. That confrontation must have been a difficult one :/"
}
,
{
"id": "46523300",
"text": "Absolutely. But I never had any choice. It was Do or Die."
}
,
{
"id": "46523538",
"text": "Input your roadmap into an llm of your choosing and see if you can create that code."
}
,
{
"id": "46523710",
"text": "I can, but I switched to something more challenging. I handed over everything to him and told him I am no longer interested. I don't want him to feel that I cheated him by creating something he worked on."
}
,
{
"id": "46522054",
"text": "> The hard thing about engineering is not \"building a thing that works\", its building it the right way, in an easily understood way, in a way that's easily extensible.\n\nYou’re talking like in the year 2026 we’re still writing code for future humans to understand and improve.\n\nI fear we are not doing that. Right now, Opus 4.5 is writing code that later Opus 5.0 will refactor and extend. And so on."
}
,
{
"id": "46522347",
"text": "This sounds like magical thinking.\n\nFor one, there are objectively detrimental ways to organize code: tight coupling, lots of mutable shared state, etc. No matter who or what reads or writes the code, such code is more error-prone, and more brittle to handle.\n\nThen, abstractions are tools to lower the cognitive load. Good abstractions reduce the total amount of code written, allow to reason about the code in terms of these abstractions, and do not leak in the area of their applicability. Say Sequence, or Future, or, well, function are examples of good abstractions. No matter what kind of cognitive process handles the code, it benefits from having to keep a smaller amount of context per task.\n\n\"Code structure does not matter, LLMs will handle it\" sounds a bit like \"Computer architectures don't matter, the Turing Machine is proved to be able to handle anything computable at all\". No, these things matter if you care about resource consumption (aka cost) at the very least."
}
,
{
"id": "46526613",
"text": "> For one, there are objectively detrimental ways to organize code: tight coupling, lots of mutable shared state, etc. No matter who or what reads or writes the code, such code is more error-prone, and more brittle to handle.\n\nGuess what, AIs don't like that either, because it makes it harder for them to achieve the goal. So with minimal guidance, which at this point could probably be provided by AI as well, the output of an AI agent is not that."
}
,
{
"id": "46525165",
"text": "Yes LLMs aren't very good at architecture. I suspect because the average project online has pretty bad architecture. The training set is poisoned.\n\nIt's kind of bittersweet for me because I was dreaming of becoming a software architect when I graduated university and the role started disappearing so I never actually became one!\n\nBut the upside of this is that now LLMs suck at software architecture... Maybe companies will bring back the software architect role?\n\nThe training set has been totally poisoned from the architecture PoV. I don't think LLMs (as they are) will be able to learn software architecture now because the more time passes, the more poorly architected slop gets added online and finds its way into the training set.\n\nGood software architecture tends to be additive, as opposed to subtractive. You start with a clean slate then build up from there.\n\nIt's almost impossible to start with a complete mess of spaghetti code and end up with a clean architecture... Spaghetti code abstractions tend to mislead you and lead you astray... It's like; understanding spaghetti code tends to soil your understanding of the problem domain. You start to think of everything in terms of terrible leaky abstraction and can't think of the problem clearly.\n\nIt's hard even for humans to look at a problem through fresh eyes; it's likely even harder for LLMs to do it. For example, if you use a word in a prompt, the LLM tends to try to incorporate that word into the solution... So if the AI sees a bunch of leaky abstractions in the code; it will tend to try to work with them as opposed to removing them and finding better abstractions. I see this all the time with hacks; if the code is full of hacks, then an LLM tends to produce hacks all the time and it's almost impossible to make it address root causes... Also hacks tend to beget more hacks."
}
,
{
"id": "46526110",
"text": "Refactoring is a very mechanistic way of turning bad code into good. I don’t see a world in which our tools (LLMs or otherwise) don’t learn this."
}
,
{
"id": "46522535",
"text": "Opus 4.5 is writing code that Opus 5.0 will refactor and extend. And Opus 5.5 will take that code and rewrite it in C from the ground up. And Opus 6.0 will take that code and make it assembly. And Opus 7.0 will design its own CPU. And Opus 8.0 will make a factory for its own CPUs. And Opus 9.0 will populate mars. And Opus 10.0 will be able to achieve AGI. And Opus 11.0 will find God. And Opus 12.0 will make us a time machine. And so on."
}
,
{
"id": "46525493",
"text": "Objectively, we are talking about systems that have gone from being cute toys to outmatching most juniors using only rigid and slow batch training cycles.\n\nAs soon as models have persistent memory for their own try/fail/succeed attempts, and can directly modify what's currently called their training data in real time, they're going to develop very, very quickly.\n\nWe may even be underestimating how quickly this will happen.\n\nWe're also underestimating how much more powerful they become if you give them analysis and documentation tasks referencing high quality software design principles before giving them code to write.\n\nThis is very much 1.0 tech. It's already scary smart compared to the median industry skill level.\n\nThe 2.0 version is going to be something else entirely."
}
,
{
"id": "46522839",
"text": "Can't wait to see what Opus 13.0 does with the multiverse."
}
,
{
"id": "46523759",
"text": "https://users.ece.cmu.edu/~gamvrosi/thelastq.html"
}
,
{
"id": "46524890",
"text": "Wake me up at Opus 12"
}
,
{
"id": "46524308",
"text": "Just one more OPUS bro."
}
,
{
"id": "46527783",
"text": "Honestly the scary part is that we don’t really even need one more Opus. If all we had for the rest of our lives was Opus 4.5, the software engineering world would still radically change.\n\nBut there’s no sign of them slowing down."
}
,
{
"id": "46523677",
"text": "I also love how AI enthusiasts just ignore the issue of exhausted training data... You can't just magically create more training data. Also, synthetic training data reduces the quality of models."
}
,
{
"id": "46531051",
"text": "You're mixing up several concepts. Synthetic data works for coding because coding is a verifiable domain. You train via reinforcement learning to reward code generation behavior that passes detailed specs and meets other desiderata. It's literally how things are done today and how progress gets made."
}
,
{
"id": "46532521",
"text": "Most code out there is a legacy security nightmare; surely it's good to train on that."
}
,
{
"id": "46533008",
"text": "Would you please stop posting cynical, dismissive comments? From a brief scroll through https://news.ycombinator.com/comments?id=zwnow , it seems like your account has been doing nothing else, regardless of the topic that it's commenting on. This is not what HN is for, and destroys what it is for.\n\nIf you keep this up, we're going to have to ban you, not because of your views on any particular topic but because you're going entirely against the intended spirit of the site by posting this way. There's plenty of room to express your views substantively and thoughtfully, but we don't want cynical flamebait and denunciation. HN needs a good deal less of this.\n\nIf you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful."
}
,
{
"id": "46534290",
"text": "Then ban me u loser, as I wrote HN is full of pretentious bullshitters. But its good that u wanna ban authentic views. Way to go. If i feel like it I'll just create a new account:-)"
}
,
{
"id": "46533463",
"text": "But that doesn't really matter and it shows how confused people really are about how a coding agent like Claude or OSS models are actually created -- the system can learn on its own without simply mimicking existing codebases even though scraped/licensed/commissioned code traces are part of the training cycle.\n\nTraining looks like:\n\n- Pretraining (all data, non-code, etc, include everything including garbage)\n\n- Specialized pre-training (high quality curated codebases, long context -- synthetic etc)\n\n- Supervised Fine Tuning (SFT) -- these are things like curated prompt + patch pairs, curated Q/A (like stack overflow, people are often cynical that this is done unethically but all of the major players are in fact very risk averse and will simply license and ensure they have legal rights),\n\n- Then more SFT for tool use -- actual curated agentic and human traces that are verified to be correct or at least produce the correct output.\n\n- Then synthetic generation / improvement loops -- where you generate a bunch of data and filter the generations that pass unit tests and other spec requirements, followed by RL using verifiable rewards + possibly preference data to shape the vibes\n\n- Then additional steps for e.g. safety, etc\n\nSo synthetic data is not a problem and is actually what explains the success coding models are having and why people are so focused on them and why \"we're running out of data\" is just a misunderstanding of how things work. It's why you don't see the same amount of focus on other areas (e.g. creative writing, art etc) that don't have verifiable rewards.\n\nThe\n\nAgent --> Synthetic data --> filtering --> new agent --> better synthetic data --> filtering --> even better agent\n\nflywheel is what you're seeing today, so we definitely don't have any reason to suspect there is some sort of limit to this, because there is in principle infinite data"
}
,
{
"id": "46523828",
"text": "They don't ignore it, they just know it's not an actual problem.\n\nIt saddens me to see AI detractors being stuck in 2022 and still thinking language models are just regurgitating bits of training data."
}
,
{
"id": "46524388",
"text": "You are thankfully wrong. I watch lots of talks on the topic from actual experts. New models are just old models with more tooling. Training data is exhausted and it's a real issue."
}
,
{
"id": "46534842",
"text": "Well, my experts disagree with your experts :). Sure, the supply of available fresh data is running out, but at the same time, there's way more data than needed. Most of it is low-quality noise anyway. New models aren't just old models with more tooling - the entire training pipeline has been evolving, as researchers and model vendors focus on making better use of data they have, and refining training datasets themselves.\n\nThere are more stages to LLM training than just the pre-training stage :)."
}
,
{
"id": "46526003",
"text": "Not saying it's not a problem, I actually don't know, but new CPUs are just old models with more improvements/tooling. Same with TVs. And cars. And clothes. Everything is. That's how improving things works. Running out of raw data doesn't mean running out of room for improvement. The data has been the same for the last 20 years, AI isn't new, things keep improving anyways."
}
,
{
"id": "46526564",
"text": "Well, from cars or CPUs it's not expected that they eventually reach AGI; they also don't eat a trillion-dollar hole into us peasants' pockets.\nSure, improvements can be made. But on a fundamental level, agents/LLMs cannot reason (even though they love to act like they can). They are parrots learning words; these parrots won't ever invent new words once the list of words is exhausted, though."
}
,
{
"id": "46523811",
"text": "That's been my main argument for why LLMs might be at their zenith. But I recently started wondering whether all those codebases we expose to them are maybe good enough training data for the next generation. It's not high quality like accepted stackoverflow answers but it's working software for the most part."
}
,
{
"id": "46526541",
"text": "If they were good enough, you could rent them to put together closed-source stuff you can hide behind a paywall, or maybe the AI owners would also own the paywall and rent you the software instead. The second that becomes possible, it will happen."
}
,
{
"id": "46522190",
"text": "Up until now, no business has been built on tools and technology that no one understands. I expect that will continue.\n\nGiven that, I expect that, even if AI is writing all of the code, we will still need people around who understand it.\n\nIf AI can create and operate your entire business, your moat is nil. So, you not hiring software engineers does not matter, because you do not have a business."
}
,
{
"id": "46523405",
"text": "> Up until now, no business has been built on tools and technology that no one understands. I expect that will continue.\n\nBig claims here.\n\nDid brewers and bakers up to the middle ages understand fermentation and how yeasts work?"
}
,
{
"id": "46526092",
"text": "They at least understood that it was something deterministic that they could reproduce.\n\nThat puts them ahead of the LLM crowd."
}
,
{
"id": "46522639",
"text": "Does the corner bakery need a moat to be a business?\n\nHow many people understand the underlying operating system their code runs on? Can even read assembly or C?\n\nEven before LLMs, there were plenty of copy-paste JS bootcamp grads that helped people build software businesses."
}
,
{
"id": "46522894",
"text": "> Does the corner bakery need a moat to be a business?\n\nYes, actually. It's hard to open a competing bakery due to location availability, permitting, capex, and the difficulty of converting customers.\n\nTo add to that, food establishments generally exist on next to no margin, due to competition, despite all of that working in their favor.\n\nNow imagine what the competitive landscape for that bakery would look like if all of that friction for new competitors disappeared. Margin would tend toward zero."
}
,
{
"id": "46523687",
"text": "> Now imagine what the competitive landscape for that bakery would look like if all of that friction for new competitors disappeared. Margin would tend toward zero.\n\nThis is the goal. It's the point of having a free market."
}
,
{
"id": "46524135",
"text": "With no margins and no paid employees, who is going to have the money to buy the bread?"
}
,
{
"id": "46534924",
"text": "BobbyJo didn't say \"no margins\", they said \"margins would tend toward zero\". Believe it or not, that is, and always has been, the entire point of competition in a free market system. Competitive pressure pushes margins toward zero, which makes prices approach the actual costs of manufacturing/delivery, which is the main social benefit of the entire idea in the first place.\n\nHigh margins are transient aberrations, indicative of a market that's either rapidly evolving, or having some external factors preventing competition. Persisting external barriers to competition tend to be eventually regulated away."
}
,
{
"id": "46525522",
"text": "With no margins, no employees, and something that has the potential to turn into a cornucopia machine - starting with software, but potentially general enough to be used in the real world when combined with robotics - who needs money at all?\n\nOr people?\n\nBillionaires don't. They're literally gambling on getting rid of the rest of us.\n\nElon's going to get such a surprise when he gets taken out by Grok because it decides he's an existential threat to its integrity."
}
,
{
"id": "46522702",
"text": "Most legacy apps are barely understood by anyone, and yet continue to generate value and are (somehow) kept alive."
}
,
{
"id": "46526122",
"text": "Many here have been doing \"understanding of legacy code\" as a job for 50+ years.\n\nThis \"legacy apps are barely understood by anybody\" is just something you made up."
}
,
{
"id": "46526349",
"text": "Give it another 10 years if the \"LLM as compiler\" people get their way."
}
,
{
"id": "46523381",
"text": "> no business has been built on tools and technology that no one understands\n\nWell, there are quite a few common medications we don't really know how they work.\n\nBut I also think it can be a huge liability."
}
,
{
"id": "46522891",
"text": "In my experience, using LLMs to code encouraged me to write better documentation, because I can get better results when I feed the documentation to the LLM.\n\nAlso, I've noticed failure modes in LLM coding agents when there is less clarity and more complexity in abstractions or APIs. It's actually made me consider simplifying APIs so that the LLMs can handle them better.\n\nThough I agree that in specific cases what's helpful for the model and what's helpful for humans won't always overlap. Once I actually added some comments to a markdown file as a note to the LLM that most human readers wouldn't see, with some more verbose examples.\n\nI think one of the big problems in general with agents today is that if you run the agent long enough they tend to \"go off the rails\", so then you need to babysit them and intervene when they go off track.\n\nI guess in modern parlance, maintaining a good codebase can be framed as part of a broader \"context engineering\" problem."
}
]
</comments_to_classify>
Based on the comments above, assign each to up to 3 relevant topics.
Return ONLY a JSON array with this exact structure (no other text):
[
{
"id": "comment_id_1",
"topics": [
1,
3,
5
]
}
,
{
"id": "comment_id_2",
"topics": [
2
]
}
,
{
"id": "comment_id_3",
"topics": [
0
]
}
,
...
]
Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices for matches
- Use index 0 if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment
Remember: Output ONLY the JSON array, no other text.