Summarizer

LLM Input

llm/0c6097e3-bc76-4fbe-ab4f-ceafa2484e5f/batch-12-1c4a4fd0-ca01-479d-a089-aa018be90e80-input.json

prompt

The following is content for you to classify. Do not respond to the comments—classify them.

<topics>
1. AI Performance on Greenfield vs. Legacy
   Related: Users debate whether agents excel primarily at starting new projects from scratch while struggling to maintain large, complex, or legacy codebases without breaking existing conventions.
2. Context Window Limitations and Management
   Related: Discussions focus on token limits (200k), performance degradation as context fills, and strategies like compacting history, using sub-agents, or maintaining summary files to preserve long-term memory.
3. Vibe Coding and Code Quality
   Related: The polarization around building apps without reading the code; critics warn of unmaintainable "slop" and technical debt, while proponents value the speed and ability to bypass syntax.
4. Claude Code and Tooling
   Related: Specific praise and critique for the Claude Code CLI, its integration with VS Code and Cursor, the use of slash commands, and comparisons to GitHub Copilot's agent mode.
5. Economic Impact on Software Jobs
   Related: Existential anxiety regarding the obsolescence of mid-level engineers, the potential "hollowing out" of the middle class, and the shift toward one-person unicorn teams.
6. Prompt Engineering and Configuration
   Related: Strategies involving `CLAUDE.md`, `AGENTS.md`, and custom system prompts to teach the AI coding conventions, architecture, and specific skills for better output.
7. Specific Language Capabilities
   Related: Anecdotal evidence regarding proficiency in React, Python, and Go versus struggles in C++, Rust, and mobile development (Swift/Kotlin), often tied to training data availability.
8. Engineering vs. Coding
   Related: A recurring distinction between "coding" (boilerplate, standard patterns) which AI conquers, and "engineering" (novel logic, complex systems, 3D graphics) where AI supposedly still fails.
9. Security and Trust
   Related: Concerns about deploying unaudited AI code, the introduction of vulnerabilities, the risks of giving agents shell access, and the difficulty of verifying AI output.
10. The Skill Issue Argument
   Related: Proponents dismiss failures as "skill issues," suggesting frustration stems from poor prompting or adaptability, while skeptics argue the tools are genuinely inconsistent.
11. Cost of AI Development
   Related: Analysis of the financial viability of AI coding, including hitting API rate limits, the high cost of Opus 4.5 tokens, and the potential unsustainability of VC-subsidized pricing.
12. Future of Software Products
   Related: Predictions that software creation costs will drop to zero, leading to a flood of bespoke personal apps replacing commercial SaaS, but potentially creating a maintenance nightmare.
13. Human-in-the-Loop Workflows
   Related: The consensus that AI requires constant human oversight, "tools in a loop," and code review to prevent hallucination loops and ensure functional software.
14. Opus 4.5 vs. Previous Models
   Related: Users describe the specific model as a "step change" or "inflection point" compared to Sonnet 3.5 or GPT-4, citing better reasoning and autonomous behavior.
15. Documentation and Specification
   Related: The shift from writing code to writing specs; users find that detailed markdown documentation or "plan mode" yields significantly better AI results than vague prompts.
16. AI Hallucinations and Errors
   Related: Reports of AI inventing non-existent CLI tools, getting stuck in logical loops, failing at visual UI tasks, and making simple indexing errors.
17. Shift in Developer Role
   Related: The idea that developers are evolving into "product managers" or "architects" who direct agents, requiring less syntax proficiency and more systems thinking.
18. Testing and Verification
   Related: The reliance on test-driven development (TDD), linters, and compilers to constrain non-deterministic AI output, ensuring generated code actually runs and meets requirements.
19. Local Models vs. Cloud APIs
   Related: Discussions on the viability of local models for privacy and cost savings versus the necessity of massive cloud models like Opus for complex reasoning tasks.
20. Societal Implications
   Related: Broader philosophical concerns about wealth concentration, the "class war" of automation, environmental impact, and the future of work in a post-code world.
0. Does not fit well in any category
</topics>

<comments_to_classify>
[
  
{
  "id": "46523278",
  "text": "In a sense humans are fancy autocomplete, too."
}
,
  
{
  "id": "46523808",
  "text": "I actually don't disagree with this sentiment. The difference is we've optimised for autocompleting our way out of situations we currently don't have enough information to solve, and LLMs have gone the opposite direction of over-indexing on too much \"autocomplete the thing based on current knowledge\".\n\nAt this point I don't doubt that whatever human intelligence is, it's a computable function."
}
,
  
{
  "id": "46523864",
  "text": "You know that language had to emerge at some point? LLMs can only do anything because they have been fed on human data. Humans actually had to collectively come up with languages /without/ anything to copy since there was a time before language."
}
,
  
{
  "id": "46526546",
  "text": "On the contrary, Opus 4.5 is the best agent I’ve ever used for making cohesive changes across many files in a large, existing codebase. It maintains our patterns and looks like all the other code. Sometimes it hiccups for sure."
}
,
  
{
  "id": "46523317",
  "text": "But... you can ask! Ask claude to use encapsulation, or to write the equivalent of interfaces in the language you using, and to map out dependencies and duplicate features, or to maintain a dictionary of component responsibilities.\n\nAI coding is a multiplier of writing speed but doesn't excuse planning out and mapping out features.\n\nYou can have reasonably engineered code if you get models to stick to well designed modules but you need to tell them."
}
,
  
{
  "id": "46523352",
  "text": "But time I spend asking is time I could have been writing exactly what I wanted in the first place, if I already did the planning to understand what I wanted. Once I know what I want, it doesn't take that long, usually.\n\nWhich is why it's so great for prototyping, because it can create something during the planning, when you haven't planned out quite what you want yet."
}
,
  
{
  "id": "46522982",
  "text": "I totally agree. And welcome to disposable software age."
}
,
  
{
  "id": "46521816",
  "text": "> greenfield\n\nLLMs are pretty good at picking up existing codebases. Even with cleared context they can do „look at this codebase and this spec doc that created it. I want to add feature x“"
}
,
  
{
  "id": "46521848",
  "text": "What size of code base are you talking about? And this is your personal experience?"
}
,
  
{
  "id": "46522059",
  "text": "Overall Codebase size vs context matter less when you set it up as microservices style architecture from the starts.\n\nI just split it into boundaries that make sense to me. Get the LLM to make a quick cheat sheet about the api and then feed that into adjacent modules. It doesn’t need to know everything about all of it to make changes if you’ve got a grip on big picture and the boundaries are somewhat sane"
}
,
  
{
  "id": "46523099",
  "text": "Overall Codebase size vs context matter less when you set it up as microservices style architecture from the starts.\n\nIt'll be fun if the primary benefit of microservices turns out to be that LLMs can understand the codebase."
}
,
  
{
  "id": "46523281",
  "text": "That was the whole point for humans, too."
}
,
  
{
  "id": "46524223",
  "text": "Except it doesn't work the same way it won't work for LLMs.\n\nIf you use too many microserviced, you will get global state, race conditions, much more complex failure models again and no human/LLM can effectively reason about those. We somewhat have tools to do that in case of monoliths, but if one gets to this point with microservices, it's game over."
}
,
  
{
  "id": "46522418",
  "text": "So \"pretty good at picking up existing codebases\" so long as the existing codebase is all microservices."
}
,
  
{
  "id": "46522604",
  "text": "Or a Rails app."
}
,
  
{
  "id": "46523515",
  "text": "I work with multiple monoliths that span anywhere from 100k to 500k lines of code, in a non-mainstream language (Elixir). Opus 4.5 crushes everything I throw at it: complex bugs, extending existing features, adding new features in a way that matches conventions, refactors, migrations... The only time it struggles is if my instructions are unclear or incomplete. For example if I ask it to fix a bug but don't specify that such-and-such should continue to work the way it does due to an undocumented business requirement, Opus might mess that up. But I consider that normal because a human developer would also do fail at it."
}
,
  
{
  "id": "46524295",
  "text": "With all due respect those are very small codebases compared to the kinds of things a lot of software engineers work on."
}
,
  
{
  "id": "46523314",
  "text": "It doesn't have to be micro services, just code that is decoupled properly, so it can search and build its context easily."
}
,
  
{
  "id": "46525242",
  "text": "Yeah, all of those applications he shows do not really expose any complex business logic.\n\nWith all the due respect: a file converter for windows is glueing few windows APIs with the relevant codec.\n\nNow, good luck working on a complex warehouse management application where you need extremely complex logic to sort the order of picking, assembling, packing on an infinite number of variables: weight, amazon prime priority, distribution centers, number and type of carts available, number and type of assembly stations available, different delivery systems and requirements for different delivery operators (such as GLE, DHL, etc) that has to work with N customers all requiring slightly different capabilities and flows, all having different printers and operations, etc, etc. And I ain't even scratching the surface of the business logic complexity (not even mentioning functional requirements) to avoid boring the reader.\n\nMind you, AI is still tremendously useful in the analysis phase, and can sort of help in some steps of the implementation one, but the number of times you can avoid looking thoroughly at the code for any minor issue or discrepancy is absolutely close to 0."
}
,
  
{
  "id": "46522906",
  "text": "It just one shots bug fixes in complex codebases.\n\nCopy-paste the bug report and watch it go."
}
,
  
{
  "id": "46522577",
  "text": "It might scale.\n\nSo far, Im not convinced, but lets take a look at fundmentally whats happening and why humans > agents > LLMs.\n\nAt its heart, programming is a constraint satisfaction problem.\n\nThe more constraints (requirements, syntax, standards, etc) you have, the harder it is to solve them all simultaneously.\n\nNew projects with few contributors have fewer constraints.\n\nThe process of “any change” is therefore simpler.\n\nNow, undeniably\n\n1) agents have improved the ability to solve constraints by iterating ; eg. Generate, test, modify, etc. over raw LLm output.\n\n2) There is an upper bound (context size, model capability) to solve simultaneous constraints.\n\n3) Most people have a better ability to do this than agents (including claude code using opus 4.5).\n\nSo, if youre seeing good results from agents, you probably have a smaller set of constraints than other people.\n\nSimilarly, if youre getting bad results, you can probably improve them by relaxing some of the constraints (consistent ui, number of contributors, requirements, standards, security requirements, split code into well defined packages).\n\nThis will make both agents and humans more productive.\n\nThe open question is: will models continue to improve enough to approach or exceed human level ability in this?\n\nAre humans willing to relax the constraints enough for it to be plausible?\n\nI would say currently people clambering about the end of human developers are cluelessly deceived by the “appearance of complexity” which does not match the “reality of constraints” in larger applications.\n\nOpus 4.5 cannot do the work of a human on code bases Ive worked on. 
Hell, talented humans struggle to work on some of them.\n\n…but that doesnt mean it doesnt work.\n\nJust that, right now, the constraint set it can solve is not large enough to be useful in those situations .\n\n…and increasingly we see low quality software where people care only about speed of delivery; again, lowering the bar in terms of requirements.\n\nSo… you know. Watch this space. Im not counting on having a dev job in 10 years. If I do, it might be making a pile of barely working garbage.\n\n…but I have one now, and anyone who thinks that this year people will be largely replaced by AI is probably poorly informed and has misunderstood the capabilities on these models.\n\nTheres only so low you can go in terms of quality."
}
,
  
{
  "id": "46521806",
  "text": "Based on my experience using these LLMs regularly I strongly doubt it could even build any application with realistic complexity without screwing things up in major ways everywhere, and even on top of that still not meeting all the requirements."
}
,
  
{
  "id": "46524670",
  "text": "If you have microservices architecture in your project you are set for AI. You can swap out any lacking, legacy microservice in your system with \"greenfield\" vibecoded one."
}
,
  
{
  "id": "46522746",
  "text": "Man, I've been biting my tongue all day with regards to this thread and overall discussion.\n\nI've been building a somewhat-novel, complex, greenfield desktop app for 6 months now, conceived and architected by a human (me), visually designed by a human (me), implementation heavily leaning on mostly Claude Code but with Codex and Gemini thrown in the mix for the grunt work. I have decades of experience, could have built it bespoke in like 1-2 years probably, but I wanted a real project to kick the tires on \"the future of our profession\".\n\nTL;DR I started with 100% vibe code simply to test the limits of what was being promised. It was a functional toy that had a lot of problems. I started over and tried a CLI version. It needed a therapist. I started over and went back to visual UI. It worked but was too constrained. I started over again. After about 10 complete start-overs in blank folders, I had a better vision of what I wanted to make, and how to achieve it. Since then, I've been working day after day, screen after screen, building, refactoring, going feature by feature, bug after bug, exactly how I would if I was coding manually. Many times I've reached a point where it feels \"feature complete\", until I throw a bigger dataset at it, which brings it to its knees. Time to re-architect, re-think memory and storage and algorithms and libraries used. Code bloated, and I put it on a diet until it was trim and svelte. I've tried many different approaches to hard problems, some of which LLMs would suggest that truly surprised me in their efficacy, but only after I presented the issues with the previous implementation. There's a lot of conversation and back and forth with the machine, but we always end up getting there in the end. Opus 4.5 has been significantly better than previous Anthropic models. 
As I hit milestones, I manually audit code, rewrite things, reformat things, generally polish the turd.\n\nI tell this story only because I'm 95% there to a real, legitimate product, with 90% of the way to go still. It's been half a year.\n\nVibe coding a simple app that you just want to use personally is cool; let the machine do it all, don't worry about under the hood, and I think a lot of people will be doing that kind of stuff more and more because it's so empowering and immediate.\n\nUsing these tools is also neat and amazing because they're a force multiplier for a single person or small group who really understand what needs done and what decisions need made.\n\nThese tools can build very complex, maintainable software if you can walk with them step by step and articulate the guidelines and guardrails, testing every feature, pushing back when it gets it wrong, growing with the codebase, getting in there manually whenever and wherever needed.\n\nThese tools CANNOT one-shot truly new stuff, but they can be slowly cajoled and massaged into eventually getting you to where you want to go; like, hard things are hard, and things that take time don't get done for a while. I have no moral compunctions or philosophical musings about utilizing these tools, but IMO there's still significant effort and coordination needed to make something really great using them (and literally minimal effort and no coordination needed to make something passable)\n\nIf you're solo, know what you want, and know what you're doing, I believe you might see 2x, 4x gains in time and efficiency using Claude Code and all of his magical agents, but if your project is more than a toy, I would bet that 2x or 4x is applied to a temporal period of years, not days or months!"
}
,
  
{
  "id": "46522490",
  "text": ">day to day, when I ask it \"build me this feature\" it uses strange abstractions, and often requires several attempts on my part to do it in the way I consider \"right\"\n\nThen don't ask it to \"build me this feature\" instead lay out a software development process with designated human in the loop where you want it and guard rails to keep it on track. Create a code review agent to look for and reject strange abstractions. Tell it what you don't like and it's really good at finding it.\n\nI find Opus 4.5, properly prompted, to be significantly better at reviewing code than writing it, but you can just put it in a loop until the code it writes matches the review."
}
,
  
{
  "id": "46521713",
  "text": "> The hard thing about engineering is not \"building a thing that works\", its building it the right way, in an easily understood way, in a way that's easily extensible.\n\nThe number of production applications that achieve this rounds to zero\n\nI’ve probably managed 300 brownfield web, mobile, edge, datacenter, data processing and ML applications/products across DoD, B2B, consumer and literally zero of them were built in this way"
}
,
  
{
  "id": "46522170",
  "text": "I think there is a subjective difference. When a human builds dogshit at least you know they put some effort and the hours in.\n\nWhen I'm reading piles of LLM slop, I know that just reading it is already more effort than it took to write. It feels like I'm being played.\n\nThis is entirely subjective and emotional. But when someone writes something with an LLM in 5 seconds and asks me to spend hours reviewing...fuck off."
}
,
  
{
  "id": "46522446",
  "text": "If you are heavily using LLMs, you need to change the way you think about reviews\n\nI think most people now approach it as:\nDev0 uses an LLM to build a feature super fast, Dev1 spends time doing a in depth review.\n\nDev0 built it, Dev1 reviewed it. And Dev0 is happy because they used the tool to save time!\n\nBut what should happen is that Dev0 should take all that time they saved coding and reallocate it to the in depth review.\n\nThe LLM wrote it, Dev0 reviewed it, Dev1 double-reviewed it. Time savings are much less, but there’s less context switching between being a coder and a reviewer. We are all reviewers now all the time"
}
,
  
{
  "id": "46526277",
  "text": "Can't do that, else KPIs won't show that AI tools reduced amount of coding work by xx%"
}
,
  
{
  "id": "46527337",
  "text": "Your comment doesn’t address what I said and instead finds a new reason that it’s invalid because “reviewing code from a machine system is beneath me”\n\nGet over yourself"
}
,
  
{
  "id": "46526454",
  "text": "This is the exact copium I came here to enjoy."
}
,
  
{
  "id": "46521209",
  "text": "you can definitely just tell it what abstractions you want when adding a feature and do incremental work on existing codebase. but i generally prefer gpt-5.2"
}
,
  
{
  "id": "46522386",
  "text": "I've been using 5.2 a lot lately but hit my quota for the first time (and will probably continue to hit it most weeks) so I shelled out for claude code. What differences do you notice? Any 'metagame' that would be helpful?"
}
,
  
{
  "id": "46523008",
  "text": "I just use Cursor because I can pick any mode. The difference is hard to say exactly, Opus seems good but 5.2 seems smarter on the tasks I tried. Or possibly I just \"trust\" it more. I tend to use high or extra high reasoning."
}
,
  
{
  "id": "46521539",
  "text": "\"its building it the right way, in an easily understood way, in a way that's easily extensible\"\n\nI am in a unique situation where I work with a variety of codebases over the week. I have had no problem at all utilizing Claude Code w/ Opus 4.5 and Gemini CLI w/ Gemini 3.0 Pro to make excellent code that is indisputably \"the right way\", in an extremely clear and understandable way, and that is maximally extensible. None of them are greenfield projects.\n\nI feel like this is a bit of je ne sais quoi where people appeal to some indemonstrable essence that these tools just can't accomplish, and only the \"non-technical\" people are foolish enough to not realize it. I'm a pretty technical person (about 30 years of software development, up to staff engineer and then VP). I think they have reached a pretty high level of competence. I still audit the code and monitor their creations, but I don't think they're the oft claimed \"junior developer\" replacement, but instead do the work I would have gotten from a very experienced, expert-level developer, but instead of being an expert at a niche, they're experts at almost every niche.\n\nAre they perfect? Far from it. It still requires a practitioner who knows what they're doing. But frequently on here I see people giving takes that sound like they last used some early variant of Copilot or something and think that remains state of the art. The rest of us are just accelerating our lives with these tools, knowing that pretending they suck online won't slow their ascent an iota."
}
,
  
{
  "id": "46521966",
  "text": ">llm_nerd\n>created two years ago\n\nYou AI hype thots/bots are all the same. All these claims but never backed up with anything to look at. And also alway claiming “you’re holding it wrong”."
}
,
  
{
  "id": "46522057",
  "text": "I don't see how \"two years ago\" is incongruous with having been using LLMs for coding, it's exactly the timeline I would expect. Yes, some people do just post \"git gud\" but there are many people ITT and most of the others on LLM coding articles who are trying to explain their process to anyone who will listen. I'm not sure if it is fully explainable in a single comment though, I'd have to write a multi-part tutorial to cover everything but it's almost entirely just applying the same project management principles that you would in a larger team of developers but customized to the current limitations of LLMs. If you want full tutorials with examples I'm sure they're out there but I'd also just recommend reviewing some project management material and then seeing how you can apply it to a coding agent. You'll only really learn by doing."
}
,
  
{
  "id": "46525472",
  "text": ">You AI hype thots/bots are all the same\n\nThis isn't twitter, so save the garbage rhetoric. And if you must question my account, I create a new account whenever I setup a new main PC, and randomly pick a username that is top of mind at the moment. This isn't professionally or personally affiliated in any way so I'm not trying to build a thing. I mean, if I had a 10 year old account that only managed a few hundred upvotes despite prolific commenting, I'd probably delete it out of embarrassment though.\n\n>All these claims but never backed up with anything to look at\n\nUh...install the tools? Use them? What does \"to look at\" even mean? Loads of people are using these tools to great effect, while some tiny minority tell us online that no way they don't work, etc. And at some point they'll pull their head out of the sand and write the followup \"Wait, they actually do\"."
}
,
  
{
  "id": "46522810",
  "text": "I also have >30 years and I've had the same experience. I noticed an immediate improvement with 4.5 and I've been getting great results in general.\n\nAnd yes I do make sure it's not generating crazy architecture. It might do that.. if you let it. So don't let it."
}
,
  
{
  "id": "46526179",
  "text": "HN has a subset of users -- they're a minority, but they hit threads like this super hard -- who really, truly think that if they say that AI tools suck and are only for nubs loud enough and frequently enough, downvoting anyone who finds them useful, all AI advancements will unwind and it'll be the \"good old days\" again. It's rather bizarre stuff, but that's what happens when people in denial feel threatened."
}
,
  
{
  "id": "46515959",
  "text": "Opus 4.5 has become really capable.\n\nNot in terms of knowledge. That was already phenomenal. But in its ability to act independently: to make decisions, collaborate with me to solve problems, ask follow-up questions, write plans and actually execute them.\n\nYou have to experience it yourself on your own real problems and over the course of days or weeks.\n\nEvery coding problem I was able to define clearly enough within the limits of the context window, the chatbot could solve and these weren’t easy. It wasn’t just about writing and testing code. It also involved reverse engineering and cracking encoding-related problems. The most impressive part was how actively it worked on problems in a tight feedback loop.\n\nIn the traditional sense, I haven’t really coded privately at all in recent weeks. Instead, I’ve been guiding and directing, having it write specifications, and then refining and improving them.\n\nCurious how this will perform in complex, large production environments."
}
,
  
{
  "id": "46516123",
  "text": "> You have to experience it yourself on your own real problems and over the course of days or weeks.\n\nHow do you stop it from over-engineering everything?"
}
,
  
{
  "id": "46516147",
  "text": "This has always been my problem whether it's Gemini, openai or Claude. Unless you hand-hold it to an extreme degree, it is going to build a mountain next to a molehill.\n\nIt may end up working, but the thing is going to convolute apis and abstractions and mix patterns basically everywhere"
}
,
  
{
  "id": "46516173",
  "text": "Not in my experience - you need to build the fact that you don’t want it to do that into your design and specification."
}
,
  
{
  "id": "46516234",
  "text": "Sure, I can tell it not to do that, but it doesn't know what that is. It's a je ne sais quoi .\n\nI can't teach it taste ."
}
,
  
{
  "id": "46519510",
  "text": "Recent Claude will just look at your code and copy what you've been doing, mostly, in an existing codebase - without being asked. In a new codebase, you can just ask it to \"be conscice, keep it simple\" or something."
}
,
  
{
  "id": "46516348",
  "text": "It's very good at following instructions. You can build dedicated agents for different tasks (backend, API design, database design) and make it follow design and coding patterns.\n\nIt's verbose by default but a few hours of custom instructions and you can make it code just like anyone"
}
,
  
{
  "id": "46519311",
  "text": "> just like anyone\n\nArthur Whitney?\n\nhttps://en.wikipedia.org/wiki/Arthur_Whitney_(computer_scien..."
}
,
  
{
  "id": "46516340",
  "text": "Difficult and it really depends on the complexity. I definitely work in a spec-driven way, with a step-by-step implementation phase. If it goes the wrong way I prefer to rewrite the spec and throw away the code."
}
,
  
{
  "id": "46520582",
  "text": "I have it propose several approaches, pick and choose from each, and remove what I don't want done. \"Use the general structure of A, but use the validation structure of D. Using a view translation layer is too much, just rely on FastAPI/SQLModel's implicit view conversion.\""
}

]
</comments_to_classify>

Based on the comments above, assign each to up to 3 relevant topics.

Return ONLY a JSON array with this exact structure (no other text):
[
  
{
  "id": "comment_id_1",
  "topics": [
    1,
    3,
    5
  ]
}
,
  
{
  "id": "comment_id_2",
  "topics": [
    2
  ]
}
,
  
{
  "id": "comment_id_3",
  "topics": [
    0
  ]
}
,
  ...
]

Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices for matches
- Use index 0 if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment

Remember: Output ONLY the JSON array, no other text.

commentCount

50
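The prompt's rules (0 to 3 topics per comment, 1-based indices for matches, 0 for "does not fit") define a checkable contract on the model's output. The following is a minimal sketch of how a pipeline could validate the returned array before accepting a batch; the `validate` function, its strictness choices, and its error messages are my own illustration, not part of this job's actual code.

```python
import json

# 0 = "does not fit in any category"; 1-20 come from the <topics> block.
VALID_TOPICS = set(range(0, 21))
MAX_TOPICS_PER_COMMENT = 3

def validate(raw: str, expected_ids: set[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the output passes."""
    problems: list[str] = []
    try:
        results = json.loads(raw)
    except json.JSONDecodeError as e:
        return [f"not valid JSON: {e}"]
    if not isinstance(results, list):
        return ["top-level value is not an array"]
    seen: set[str] = set()
    for item in results:
        if not isinstance(item, dict):
            problems.append(f"array element is not an object: {item!r}")
            continue
        cid = item.get("id")
        topics = item.get("topics")
        if cid not in expected_ids:
            problems.append(f"unknown id {cid!r}")
        if cid in seen:
            problems.append(f"duplicate id {cid!r}")
        if isinstance(cid, str):
            seen.add(cid)
        if not isinstance(topics, list):
            problems.append(f"{cid}: 'topics' is not an array")
            continue
        if len(topics) > MAX_TOPICS_PER_COMMENT:
            problems.append(f"{cid}: {len(topics)} topics (max {MAX_TOPICS_PER_COMMENT})")
        for t in topics:
            if t not in VALID_TOPICS:
                problems.append(f"{cid}: topic index {t!r} out of range")
    for missing in expected_ids - seen:
        problems.append(f"missing id {missing!r}")
    return problems
```

A batch runner could call `validate(response_text, {c["id"] for c in comments})` and retry the LLM call when the list is non-empty; whether to also reject mixes like `[0, 3]` (index 0 alongside a real topic) is a policy choice the prompt leaves open.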

← Back to job