The following is content for you to classify. Do not respond to the comments; classify them.
<topics>
1. AI Performance on Greenfield vs. Legacy
Related: Users debate whether agents excel primarily at starting new projects from scratch while struggling to maintain large, complex, or legacy codebases without breaking existing conventions.
2. Context Window Limitations and Management
Related: Discussions focus on token limits (200k), performance degradation as context fills, and strategies like compacting history, using sub-agents, or maintaining summary files to preserve long-term memory.
3. Vibe Coding and Code Quality
Related: The polarization around building apps without reading the code; critics warn of unmaintainable "slop" and technical debt, while proponents value the speed and ability to bypass syntax.
4. Claude Code and Tooling
Related: Specific praise and critique for the Claude Code CLI, its integration with VS Code and Cursor, the use of slash commands, and comparisons to GitHub Copilot's agent mode.
5. Economic Impact on Software Jobs
Related: Existential anxiety regarding the obsolescence of mid-level engineers, the potential "hollowing out" of the middle class, and the shift toward one-person unicorn teams.
6. Prompt Engineering and Configuration
Related: Strategies involving `CLAUDE.md`, `AGENTS.md`, and custom system prompts to teach the AI coding conventions, architecture, and specific skills for better output.
7. Specific Language Capabilities
Related: Anecdotal evidence regarding proficiency in React, Python, and Go versus struggles in C++, Rust, and mobile development (Swift/Kotlin), often tied to training data availability.
8. Engineering vs. Coding
Related: A recurring distinction between "coding" (boilerplate, standard patterns) which AI conquers, and "engineering" (novel logic, complex systems, 3D graphics) where AI supposedly still fails.
9. Security and Trust
Related: Concerns about deploying unaudited AI code, the introduction of vulnerabilities, the risks of giving agents shell access, and the difficulty of verifying AI output.
10. The Skill Issue Argument
Related: Proponents dismiss failures as "skill issues," suggesting frustration stems from poor prompting or adaptability, while skeptics argue the tools are genuinely inconsistent.
11. Cost of AI Development
Related: Analysis of the financial viability of AI coding, including hitting API rate limits, the high cost of Opus 4.5 tokens, and the potential unsustainability of VC-subsidized pricing.
12. Future of Software Products
Related: Predictions that software creation costs will drop to zero, leading to a flood of bespoke personal apps replacing commercial SaaS, but potentially creating a maintenance nightmare.
13. Human-in-the-Loop Workflows
Related: The consensus that AI requires constant human oversight, "tools in a loop," and code review to prevent hallucination loops and ensure functional software.
14. Opus 4.5 vs. Previous Models
Related: Users describe the specific model as a "step change" or "inflection point" compared to Sonnet 3.5 or GPT-4, citing better reasoning and autonomous behavior.
15. Documentation and Specification
Related: The shift from writing code to writing specs; users find that detailed markdown documentation or "plan mode" yields significantly better AI results than vague prompts.
16. AI Hallucinations and Errors
Related: Reports of AI inventing non-existent CLI tools, getting stuck in logical loops, failing at visual UI tasks, and making simple indexing errors.
17. Shift in Developer Role
Related: The idea that developers are evolving into "product managers" or "architects" who direct agents, requiring less syntax proficiency and more systems thinking.
18. Testing and Verification
Related: The reliance on test-driven development (TDD), linters, and compilers to constrain non-deterministic AI output, ensuring generated code actually runs and meets requirements.
19. Local Models vs. Cloud APIs
Related: Discussions on the viability of local models for privacy and cost savings versus the necessity of massive cloud models like Opus for complex reasoning tasks.
20. Societal Implications
Related: Broader philosophical concerns about wealth concentration, the "class war" of automation, environmental impact, and the future of work in a post-code world.
0. Does not fit well in any category
</topics>
<comments_to_classify>
[
{
"id": "46523420",
"text": "If the trend line holds you’ll be very, very surprised."
}
,
{
"id": "46522635",
"text": "You enter some text and a computer spits out complex answers generated on the spot\n\nRight or wrong - doesn’t matter. You typed in a line of text and now your computer is making 3000 word stories, images, even videos based on it\n\nHow are you NOT astounded by that? We used to have NONE of this even 4 years ago!"
}
,
{
"id": "46523953",
"text": "Of course I'm astounded. But being spectacular and being useful are entirely different things."
}
,
{
"id": "46525029",
"text": "If you've found nothing useful about AI so far then the problem is likely you"
}
,
{
"id": "46533090",
"text": "I don't think it's necessarily a problem. And even if you accept that the problem is you, it doesn't exactly provide a \"solution\"."
}
,
{
"id": "46522947",
"text": "Because I want correct answers."
}
,
{
"id": "46526284",
"text": "> On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question.\n\n-- Charles Babbage"
}
,
{
"id": "46519725",
"text": "> Opus 4.5 really is at a new tier however. It just...works.\n\nLiterally tried it yesterday. I didn't see a single difference with whatever model Claude Code was using two months ago. Same crippled context window. Same \"I'll read 10 irrelevant lines from a file\", same random changes etc."
}
,
{
"id": "46520763",
"text": "The context window isn't \"crippled\".\n\nCreate a markdown document of your task (or use CLAUDE.md), put it in \"plan mode\" which allows Claude to use tool calls to ask questions before it generates the plan.\n\nWhen it finishes one part of the plan, have it create a another markdown document - \"progress.md\" or whatever with the whole plan and what is completed at that point.\n\nType /clear (no more context window), tell Claude to read the two documents.\n\nRepeat until even a massive project is complete - with those 2 markdown documents and no context window issues."
}
,
{
"id": "46523995",
"text": "> The context window isn't \"crippled\".\n\n... Proceeds to explain how it's crippled and all the workarounds you have to do to make it less crippled."
}
,
{
"id": "46526226",
"text": "> ... Proceeds to explain how it's crippled and all the workarounds you have to do to make it less crippled.\n\nNo - that's not what I did.\n\nYou don't need an extra-long context full of irrelevant tokens. Claude doesn't need to see the code it implemented 40 steps ago in a working method from Phase 1 if it is on Phase 3 and not using that method. It doesn't need reasoning traces for things it already \"thought\" through.\n\nThis other information is cluttering , not helpful. It is making signal to noise ratio worse.\n\nIf Claude needs to know something it did in Phase 1 for Phase 4 it will put a note on it in the living markdown document to simply find it again when it needs it."
}
,
{
"id": "46526532",
"text": "Again, you're basically explaining how Claude has a very short limited context and you have to implement multiple workarounds to \"prevent cluttering\". Aka: try to keep context as small as possible, restart context often, try and feed it only small relevant information.\n\nWhat I very succinctly called \"crippled context\" despite claims that Opus 4.5 is somehow \"next tier\". It's all the same techniques we've been using for over a year now."
}
,
{
"id": "46526864",
"text": "Context is a short term memory. Yours is even more limited and yet somehow you get by."
}
,
{
"id": "46529037",
"text": "I get by because I also have long-term memory, and experience, and I can learn. LLMs have none of that, and every new session is rebuilding the world anew.\n\nAnd even my short-term memory is significantly larger than the at most 50% of the 200k-token context window that Claude has. It runs out of context before my short-term memory is probably not even 1% full, for the same task ( and I'm capable of more context-switching in the meantime).\n\nAnd so even the \"Opus 4.5 really is at a new tier\" runs into the very same limitations all models have been running into since the beginning."
}
,
{
"id": "46529291",
"text": "> LLMs have none of that, and every new session is rebuilding the world anew.\n\nFor LLMs long term memory is achieved by tooling. Which you discounted in your previous comments.\n\nYou also overstimate capacity of your short-term memory by few orders of magnitude:\n\nhttps://my.clevelandclinic.org/health/articles/short-term-me..."
}
,
{
"id": "46529480",
"text": "> For LLMs long term memory is achieved by tooling. Which you discounted in your previous comments.\n\nMy specific complaint, which is an observable fact about \"Opus 4.5 is next tier\": it has the same crippled context that degrades the quality of the model as soon as it fills 50%.\n\nEMM_386: no-no-no, it's not crippled. All you have to do is keep track across multiple files, clear out context often, feed very specific information not to overflow context.\n\nMe: so... it's crippled, and you need multiple workarounds\n\nscotty79: After all it's the same as your own short-term memory, and <some unspecified tooling (I guess those same files)> provide long-term memory for LLMs.\n\nMe: Your comparison is invalid because I can go have lunch, and come back to the problem at hand and continue where I left off. \"Next tier Opus 4.5\" will have to be fed the entire world from scratch after a context clear/compact/in a new session.\n\nUnless, of course, you meant to say that \"next tier Opus model\" only has 15-30 second short term memory, and needs to keep multiple notes around like the guy from Memento. Which... makes it crippled."
}
,
{
"id": "46532984",
"text": "If you refuse to use what you call workarounds and I call long term memory then you end up with a guy from Memento and regardless of how smart the model is it can end up making same mistakes. And that's why you can't tell the difference between smarter and dumber one while others can."
}
,
{
"id": "46533131",
"text": "I think the premise is that if it was the \"next tier\" than you wouldn't need to use these workarounds."
}
,
{
"id": "46533614",
"text": "> If you refuse to use what you call workarounds\n\nWho said I refuse them?\n\nI evaluated the claim that Opus is somehow next tier/something different/amazeballs future at its face value . It still has all the same issues and needs all the same workarounds as whatever I was using two months ago (I had a bit of a coding hiatus between beginning of December and now).\n\n> then you end up with a guy from Memento and regardless of how smart the model is\n\nThose models are, and keep being the guy from memento. Your \"long memory\" is nothing but notes scribbled everywhere that you have to re-assemble every time.\n\n> And that's why you can't tell the difference between smarter and dumber one while others can.\n\nIf it was \"next tier smarter\" it wouldn't need the exact same workarounds as the \"dumber\" models. You wouldn't compare the context to the 15-30 second short-term memory and need unspecified tools [1] to have \"long-term memory\". You wouldn't have the model behave in an indistinguishable way from a \"dumber\" model after half of its context windows has been filled. You wouldn't even think about context windows. And yet here we are\n\n[1] For each person these tools will be a different collection of magic incantations. From scattered .md files to slop like Beads to MCP servers providing access to various external storage solutions to custom shell scripts to ...\n\nBTW, I still find \"superpowers\" from https://github.com/obra/superpowers to be the single best improvement to Claude (and other providers) even if it's just another in a long serious of magic chants I've evaluated."
}
,
{
"id": "46530840",
"text": "I'm not familiar with any form of intelligence that does not suffer from a bloated context. If you want to try and improve your workflow, a good place to start is using sub-agents so individual task implementations do not fill up your top level agents context. I used to regularly have to compact and clear, but since using sub-agents for most direct tasks, I hardly do anymore."
}
,
{
"id": "46532512",
"text": "1. It's a workaround for context limitations\n\n2. It's the same workarounds we've been doing forever\n\n3. It's indistinguishable from \"clear context and re-feed the entire world of relevant info from scratch\" we've had forever, just slightly more automated\n\nThat's why I don't understand all the \"it's new tier\" etc. It's all the same issues with all the same workarounds."
}
,
{
"id": "46523362",
"text": "That's because Opus has been out for almost 5 months now lol. Its the same model, so I think people have been vibe coding with a heavy dose of wine this holiday and are now convinced its the future."
}
,
{
"id": "46525037",
"text": "Looks like you hallucinated the Opus release date\n\nAre you sure you're not an LLM?"
}
,
{
"id": "46532519",
"text": "Opus 4.1 was released in August or smth."
}
,
{
"id": "46524708",
"text": "Opus 4.5 was released 24th November."
}
,
{
"id": "46519883",
"text": "200k+ tokens is a pretty big context window if you are feeding it the right context. Editors like Cursor are really good at indexing and curating context for you; perhaps it'd be worth trying something that does that better than Claude CLI does?"
}
,
{
"id": "46519960",
"text": "> a pretty big context window if you are feeding it the right context.\n\nYup. There's some magical \"right context\" that will fix all the problems. What is that right context? No idea, I guess I need to read a yet-another 20 000-word post describing magical incantations that you should or shouldn't do in the context.\n\nThe \"Opus 4.5 is something else/nex tier/just works\" claims in my mind means that I wouldn't need to babysit its every decision, or that it would actually read relevant lines from relevant files etc. Nope. Exact same behaviors as whatever the previous model was.\n\nOh, and that \"200k tokens context window\"? It's a lie. The quality quickly degrades as soon as Claude reaches somewhere around 50% of the context window. At 80+% it's nearly indistinguishable from a model from two years ago. (BTW, same for Codex/GPT with it's \"1 million token window\")"
}
,
{
"id": "46521230",
"text": "It's like working with humans:\n\n1) define problem\n2) split problem into small independently verifiable tasks\n3) implement tasks one by one, verify with tools\n\nWith humans 1) is the spec, 2) is the Jira or whatever tasks\n\nWith an LLM usually 1) is just a markdown file, 2) is a markdown checklist, Github issues (which Claude can use with the `gh` cli) and every loop of 3 gets a fresh context, maybe the spec from step 1 and the relevant task information from 2\n\nI haven't ran into context issues in a LONG time, and if I have it's usually been either intentional (it's a problem where compacting wont' hurt) or an error on my part."
}
,
{
"id": "46524159",
"text": "> every loop of 3 gets a fresh context, maybe the spec from step 1 and the relevant task information from 2\n\n> I haven't ran into context issues in a LONG time\n\nBecause you've become the reverse centaur :) \"a person who is serving as a squishy meat appendage for an uncaring machine.\" [1]\n\nYou are very aware of the exact issues I'm talking about, and have trained yourself to do all the mechanical dance moves to avoid them.\n\nI do the same dances, that's why I'm pointing out that they are still necessary despite the claims of how model X/Y/Z are \"next tier\".\n\n[1] https://doctorow.medium.com/https-pluralistic-net-2025-12-05..."
}
,
{
"id": "46524433",
"text": "Yes and no. I've worked quite a bit with juniors, offshore consultants and just in companies where processes are a bit shit.\n\nThe exact same method that worked for those happened to also work for LLMs, I didn't have to learn anything new or change much in my workflow.\n\n\"Fix bug in FoobarComponent\" is enough of a bug ticket for the 100x developer in your team with experience with that specific product, but bad for AI, juniors and offshored teams.\n\nThus, giving enough context in each ticket to tell whoever is working on it where to look and a few ideas what might be the root cause and how to fix it is kinda second nature to me.\n\nAlso my own brain is mostly neurospicy mush, so _I_ need to write the context to the tickets even if I'm the one on it a few weeks from now. Because now-me remembers things, two-weeks-from-now me most likely doesn't."
}
,
{
"id": "46524794",
"text": "The problem with LLMs (similar to people :) ) is that you never really know what works. I've had Claude one-shot \"implement <some complex requirement>\" with little additional input, and then completely botch even the smallest bug fix with explicit instructions and context. And vice versa :)"
}
,
{
"id": "46520305",
"text": "I realize your experience has been frustrating. I hope you see that every generation of model and harness is converting more hold-outs. We're still a few years from hard diminishing returns assuming capital keeps flowing (and that's without any major new architectures which are likely) so you should be able to see how this is going to play out.\n\nIt's in your interest to deal with your frustration and figure out how you can leverage the new tools to stay relevant (to the degree that you want to).\n\nRegarding the context window, Claude needs thinking turned up for long context accuracy, it's quite forgetful without thinking."
}
,
{
"id": "46521516",
"text": "I think it's important for people who want to write a comment like this to understand how much this sounds like you're in a cult."
}
,
{
"id": "46521811",
"text": "Personally I'm sympathetic to people who don't want to have to use AI, but I dislike it when they attack my use of AI as a skill issue. I'm quite certain the workplace is going to punish people who don't leverage AI though, and I'm trying to be helpful."
}
,
{
"id": "46524029",
"text": "> but I dislike it when they attack my use of AI as a skill issue.\n\nNo one attacked your use of AI. I explained my own experience with the \"Claude Opus 4.5 is next tier\". You barged in, ignored anything I said, and attacked my skills.\n\n> the workplace is going to punish people who don't leverage AI though, and I'm trying to be helpful.\n\nSo what exactly is helpful in your comments?"
}
,
{
"id": "46525680",
"text": "The only thing I disagreed with in your post is your objectively incorrect statement regarding Claude's context behavior. Other than that I'm just trying to encourage you to make preparations for something that I don't think you're taking seriously enough yet. No need to get all worked up, it'll only reflect on you."
}
,
{
"id": "46524014",
"text": "Note how nothing in your comment addresses anything I said. Except the last sentence that basically confirms what I said. This perfectly illustrates the discourse around AI.\n\nAs for the snide and patronizing \"it's in your interest to stay relevant\":\n\n1. I use these tools daily. That's why I don't subscribe to willful wide-eyed gullibility. I know exactly what these tools can and cannot do.\n\nThe vast majority of \"AI skeptics\" are the same.\n\n2. In a few years when the world is awash in barely working incomprehensible AI slop my skills will be in great demand. Not because I'm an amazing developer (I'm not), but because I have experience separating wheat from the chaff"
}
,
{
"id": "46525740",
"text": "The snide and patronizing is your projection. It kinda makes me sad when the discourse is so poisoned that I can't even encourage someone to protect their own future from something that's obviously coming (technical merits aside, purely based on social dynamics).\n\nIt seems the subject of AI is emotionally charged for you, so I expect friendly/rational discourse is going to be a challenge. I'd say something nice but since you're primed to see me being patronizing... Fuck you? That what you were expecting?"
}
,
{
"id": "46526008",
"text": "> The snide and patronizing is your projection.\n\nIt's not me who decided to barge in, assume their opponent doesn't use something or doesn't want to use something, and offer unsolicited advice.\n\n> It kinda makes me sad when the discourse is so poisoned that I can't even encourage someone to protect their own future from something that's obviously coming\n\nSee. Again. You're so in love with your \"wisdom\" that you can't even see what you sound like: snide, patronising, condenscending. And completely missing the whole point of what was written. You are literally the person who poisons the discourse.\n\nMe: \"here are the issues I still experience with what people claim are 'next tier frontier model'\"\n\nYou: \"it's in your interests to figure out how to leverage new tools to stay relevant in the future\"\n\nMe: ... what the hell are you talking about? I'm using these tools daily. Do you have anything constructive to add to the discourse?\n\n> so I expect friendly/rational discourse is going to be a challenge.\n\nIt's only challenge to you because you keep being in love with your voice and your voice only. Do you have anything to contribute to the actual rational discourse, are you going to attack my character?\n\n> 'd say something nice but since you're primed to see me being patronizing... Fuck you? T\n\nAh. The famous friendly/rational discourse of \"they attack my use of AI\" (no one attacked you), \"why don't you invest in learning tools to stay relevant in the future\" (I literally use these tools daily, do you have anything useful to say?) and \"fuck you\" (well, same to you).\n\n> That what you were expecting?\n\nWhat I was expecting is responses to what I wrote, not you riding in on a high horse."
}
,
{
"id": "46526631",
"text": "You were the one complaining about how the tools aren't giving you the results you expected. If you're using these tools daily and having a hard time, either you're working on something very different from the bulk of people using the tools and your problems or legitimate, or you aren't and it's a skill issue.\n\nIf you want to take politeness as being patronizing, I'm happy to stop bothering. My guess is you're not a special snowflake, and you need to \"get good\" or you're going to end up on unemployment complaining about how unfair life is. I'd have sympathy but you don't seem like a pleasant human being to interact with, so have fun!"
}
,
{
"id": "46529036",
"text": "> ou were the one complaining about how the tools aren't giving you the results you expected.\n\nThey are not giving me the results people claim they give. It is distinctly different from not giving the results I want.\n\n> If you're using these tools daily and having a hard time, either you're working on something very different from the bulk of people using the tools and your problems or legitimate, or you aren't and it's a skill issue.\n\nIndeed. And your rational/friendly discourse that you claim you're having would start with trying to figure that out. Did you? No, you didn't. You immediately assumed your opponent is a clueless idiot who is somehow against AI and is incapable or learning or something.\n\n> If you want to take politeness as being patronizing, I'm happy to stop bothering.\n\nNo. It's not politeness. It's smugness. You literally started your interaction in this thread with a \"git gud or else\" and even managed to complain later that \"you dislike it when they attack your use of AI as a skill issue\". While continuously attacking others.\n\n> you don't seem like a pleasant human being to interact with\n\nSays the person who has contributed nothing to the conversation except his arrogance, smugness, holier-than-thou attitude, engaged in nothing but personal attacks, complained about non-existent grievances and when called out on this behavior completed his \"friendly and rational discourse\" with a \"fuck you\".\n\nWell, fuck you, too.\n\nAdieu."
}
,
{
"id": "46519897",
"text": "I use Sonnet and Opus all the time and the differences are almost negligible"
}
,
{
"id": "46519893",
"text": "Opus 4.5 is fucking up just like Sonnet really. I don't know how your use is that much different than mine."
}
,
{
"id": "46520859",
"text": "I know someone who is using a vibe coded or at least heavily assisted text editor, praising it daily, while also saying llms will never be productive. There is a lot of dissonance right now."
}
,
{
"id": "46516674",
"text": "I teach at a university, and spend plenty of time programming for research and for fun. Like many others, I spent some time on the holidays trying to push the current generation of Cursor, Claude Code, and Codex as far as I could. (They're all very good.)\n\nI had an idea for something that I wanted, and in five scattered hours, I got it good enough to use. I'm thinking about it in a few different ways:\n\n1. I estimate I could have done it without AI with 2 weeks full-time effort. (Full-time defined as >> 40 hours / week.)\n\n2. I have too many other things to do that are purportedly more important that programming. I really can't dedicate to two weeks full-time to a \"nice to have\" project. So, without AI, I wouldn't have done it at all.\n\n3. I could hire someone to do it for me. At the university, those are students. From experience with lots of advising, a top-tier undergraduate student could have achieved the same thing, had they worked full tilt for a semester (before LLMs). This of course assumes that I'm meeting them every week."
}
,
{
"id": "46516850",
"text": "How do you compare Claude Code to Cursor? I'm a Cursor user quietly watching the CC parade with curiosity. Personally, I haven't been able to give up the IDE experience."
}
,
{
"id": "46530085",
"text": "Im so sold on the cli tools that I think IDEs are basically dead to me. I only have an IDE open so I can read the code, but most often I'm just changing configs (like switching a bool, or bumping up a limit or something like that).\n\nSeriously, I have 3+ claude code windows open at a time. Most days I don't even look at the IDE. It's still there running in the background, but I don't need to touch it."
}
,
{
"id": "46521277",
"text": "When I'm using Claude Code, I usually have a text editor open as well. The CC plugin works well enough to achieve most of what Cursor was doing for me in showing real-time diffs, but in my experience, the output is better and faster. YMMV"
}
,
{
"id": "46523114",
"text": "I was here a few weeks ago, but I'm now on the CC train. The challenge is that the terminal is quite counterintuitive. But if you put on the Linux terminal lens from a few years ago, and you start using it. It starts to make sense. The form factor of the terminal isn't intuitive for programming, but it's the ultimate.\n\nFYI, I still use cursor for small edits and reviews."
}
,
{
"id": "46516907",
"text": "I don't think I can scientifically compare the agents. As it is, you can use Opus / Codex in Cursor. The speed of Cursor composer-1 is phenomenal -- you can use it interactively for many tasks. There are also tasks that are not easier to describe in English, but you can tab through them."
}
]
</comments_to_classify>
Based on the comments above, assign each comment to up to 3 relevant topics.
Return ONLY a JSON array with this exact structure (no other text):
[
{
"id": "comment_id_1",
"topics": [
1,
3,
5
]
}
,
{
"id": "comment_id_2",
"topics": [
2
]
}
,
{
"id": "comment_id_3",
"topics": [
0
]
}
,
...
]
Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices for matches
- Use index 0 if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment
Remember: Output ONLY the JSON array, no other text.