The following is content for you to classify. Do not respond to the comments—classify them.
<topics>
1. AI Performance on Greenfield vs. Legacy
Related: Users debate whether agents excel primarily at starting new projects from scratch while struggling to maintain large, complex, or legacy codebases without breaking existing conventions.
2. Context Window Limitations and Management
Related: Discussions focus on token limits (200k), performance degradation as context fills, and strategies like compacting history, using sub-agents, or maintaining summary files to preserve long-term memory.
3. Vibe Coding and Code Quality
Related: The polarization around building apps without reading the code; critics warn of unmaintainable "slop" and technical debt, while proponents value the speed and ability to bypass syntax.
4. Claude Code and Tooling
Related: Specific praise and critique for the Claude Code CLI, its integration with VS Code and Cursor, the use of slash commands, and comparisons to GitHub Copilot's agent mode.
5. Economic Impact on Software Jobs
Related: Existential anxiety regarding the obsolescence of mid-level engineers, the potential "hollowing out" of the middle class, and the shift toward one-person unicorn teams.
6. Prompt Engineering and Configuration
Related: Strategies involving `CLAUDE.md`, `AGENTS.md`, and custom system prompts to teach the AI coding conventions, architecture, and specific skills for better output.
7. Specific Language Capabilities
Related: Anecdotal evidence regarding proficiency in React, Python, and Go versus struggles in C++, Rust, and mobile development (Swift/Kotlin), often tied to training data availability.
8. Engineering vs. Coding
Related: A recurring distinction between "coding" (boilerplate, standard patterns) which AI conquers, and "engineering" (novel logic, complex systems, 3D graphics) where AI supposedly still fails.
9. Security and Trust
Related: Concerns about deploying unaudited AI code, the introduction of vulnerabilities, the risks of giving agents shell access, and the difficulty of verifying AI output.
10. The Skill Issue Argument
Related: Proponents dismiss failures as "skill issues," suggesting frustration stems from poor prompting or adaptability, while skeptics argue the tools are genuinely inconsistent.
11. Cost of AI Development
Related: Analysis of the financial viability of AI coding, including hitting API rate limits, the high cost of Opus 4.5 tokens, and the potential unsustainability of VC-subsidized pricing.
12. Future of Software Products
Related: Predictions that software creation costs will drop to zero, leading to a flood of bespoke personal apps replacing commercial SaaS, but potentially creating a maintenance nightmare.
13. Human-in-the-Loop Workflows
Related: The consensus that AI requires constant human oversight, "tools in a loop," and code review to prevent hallucination loops and ensure functional software.
14. Opus 4.5 vs. Previous Models
Related: Users describe the specific model as a "step change" or "inflection point" compared to Sonnet 3.5 or GPT-4, citing better reasoning and autonomous behavior.
15. Documentation and Specification
Related: The shift from writing code to writing specs; users find that detailed markdown documentation or "plan mode" yields significantly better AI results than vague prompts.
16. AI Hallucinations and Errors
Related: Reports of AI inventing non-existent CLI tools, getting stuck in logical loops, failing at visual UI tasks, and making simple indexing errors.
17. Shift in Developer Role
Related: The idea that developers are evolving into "product managers" or "architects" who direct agents, requiring less syntax proficiency and more systems thinking.
18. Testing and Verification
Related: The reliance on test-driven development (TDD), linters, and compilers to constrain non-deterministic AI output, ensuring generated code actually runs and meets requirements.
19. Local Models vs. Cloud APIs
Related: Discussions on the viability of local models for privacy and cost savings versus the necessity of massive cloud models like Opus for complex reasoning tasks.
20. Societal Implications
Related: Broader philosophical concerns about wealth concentration, the "class war" of automation, environmental impact, and the future of work in a post-code world.
0. Does not fit well in any category
</topics>
<comments_to_classify>
[
{
"id": "46516157",
    "text": "Most software engineers are seriously sleeping on how good LLM agents are right now, especially something like Claude Code.\n\nOnce you’ve got Claude Code set up, you can point it at your codebase, have it learn your conventions, pull in best practices, and refine everything until it’s basically operating like a super-powered teammate. The real unlock is building a solid set of reusable “skills” plus a few agents for the stuff you do all the time.\n\nFor example, we have a custom UI library, and Claude Code has a skill that explains exactly how to use it. Same for how we write Storybooks, how we structure APIs, and basically how we want everything done in our repo. So when it generates code, it already matches our patterns and standards out of the box.\n\nWe also had Claude Code create a bunch of ESLint automation, including custom ESLint rules and lint checks that catch and auto-handle a lot of stuff before it even hits review.\n\nThen we take it further: we have a deep code review agent Claude Code runs after changes are made. And when a PR goes up, we have another Claude Code agent that does a full PR review, following a detailed markdown checklist we’ve written for it.\n\nOn top of that, we’ve got like five other Claude Code GitHub workflow agents that run on a schedule. One of them reads all commits from the last month and makes sure docs are still aligned. Another checks for gaps in end-to-end coverage. Stuff like that. A ton of maintenance and quality work is just… automated. It runs ridiculously smoothly.\n\nWe even use Claude Code for ticket triage. It reads the ticket, digs into the codebase, and leaves a comment with what it thinks should be done. So when an engineer picks it up, they’re basically starting halfway through already.\n\nThere is so much low-hanging fruit here that it honestly blows my mind people aren’t all over it. 2026 is going to be a wake-up call.\n\n(used voice to text then had claude reword, I am lazy and not gonna hand write it all for yall sorry!)\n\nEdit: made an example repo for ya\n\nhttps://github.com/ChrisWiles/claude-code-showcase"
}
,
{
"id": "46520993",
"text": "I made a similar comment on a different thread, but I think it also fits here: I think the disconnect between engineers is due to their own context. If you work with frontend applications, specially React/React Native/HTML/Mobile, your experience with LLMs is completely different than the experience of someone working with OpenGL, io_uring, libev and other lower level stuff. Sure, Opus 4.5 can one shot Windows utilities and full stack apps, but can't implement a simple shadowing algorithm from a 2003 paper in C++, GLFW, GLAD: https://www.cse.chalmers.se/~uffe/soft_gfxhw2003.pdf\n\nCodex/Claude Code are terrible with C++. It also can't do Rust really well, once you get to the meat of it. Not sure why that is, but they just spit out nonsense that creates more work than it helps me. It also can't one shot anything complete, even though I might feed him the entire paper that explains what the algorithm is supposed to do.\n\nTry to do some OpenGL or Vulkan with it, without using WebGPU or three.js. Try it with real code, that all of us have to deal with every day. SDL, Vulkan RHI, NVRHI. Very frustrating.\n\nTry it with boost, or cmake, or taskflow. It loses itself constantly, hallucinates which version it is working on and ignores you when you provide actual pointers to documentation on the repo.\n\nI've also recently tried to get Opus 4.5 to move the Job system from Doom 3 BFG to the original codebase. Clean clone of dhewm3, pointed Opus to the BFG Job system codebase, and explained how it works. I have also fed it the Fabien Sanglard code review of the job system: https://fabiensanglard.net/doom3_bfg/threading.php\n\nWe are not sleeping on it, we are actually waiting for it to get actually useful. Sure, it can generate a full stack admin control panel in JS for my PostgreSQL tables, but is that really \"not normal\"? That's basic."
}
,
{
"id": "46527849",
"text": "We have an in-house, Rust-based proxy server. Claude is unable to contribute to it meaningfully outside of grunt work like minor refactors across many files. It doesn't seem to understand proxying and how it works on both a protocol level and business logic level.\n\nWith some entirely novel work we're doing, it's actually a hindrance as it consistently tells us the approach isn't valid/won't work (it will) and then enters \"absolutely right\" loops when corrected.\n\nI still believe those who rave about it are not writing anything I would consider \"engineering\". Or perhaps it's a skill issue and I'm using it wrong, but I haven't yet met someone I respect who tells me it's the future in the way those running AI-based companies tell me."
}
,
{
"id": "46530482",
"text": "> We have an in-house, Rust-based proxy server. Claude is unable to contribute to it meaningfully outside\n\nI have a great time using Claude Code in Rust projects, so I know it's not about the language exactly.\n\nMy working model is is that since LLM are basically inference/correlation based, the more you deviate from the mainstream corpus of training data, the more confused LLM gets. Because LLM doesn't \"understand\" anything. But if it was trained on a lot of things kind of like the problem, it can match the patterns just fine, and it can generalize over a lot layers, including programming languages.\n\nAlso I've noticed that it can get confused about stupid stuff. E.g. I had two different things named kind of the same in two parts of the codebase, and it would constantly stumble on conflating them. Changing the name in the codebase immediately improved it.\n\nSo yeah, we've got another potentially powerful tool that requires understanding how it works under the hood to be useful. Kind of like git."
}
,
{
"id": "46532300",
"text": "Recently the v8 rust library changed it from mutable handle scopes to pinned scopes. A fairly simple change that I even put in my CLAUDE.md file. But it still generates methods with HandleScope's and then says... oh I have a different scope and goes on a random walk refactoring completely unrelated parts of the code. All the while Opus 4.5 burns through tokens. Things work great as long as you are testing on the training set. But that said, it is absolutely brilliant with React and Typescript."
}
,
{
"id": "46530000",
"text": "This isn't meant as a criticism, or to doubt your experience, but I've talked to a few people who had experiences like this. But, I helped them get Claude code setup, analyze the codebase and document the architecture into markdown (edit as needed after), create an agent for the architecture, and prompt it in an incremental way. Maybe 15-30 minutes of prep. Everyone I helped with this responded with things like \"This is amazing\", \"Wow!\", etc.\n\nFor some things you can fire up Claude and have it generate great code from scratch. But for bigger code bases and more complex architecture, you need to break it down ahead of time so it can just read about the architecture rather than analyze it every time."
}
,
{
"id": "46530178",
"text": "Is there any good documentation out there about how to perform this wizardry? I always assumed if you did /init in a new code base, that Claude would set itself up to maximize its own understanding of the code. If there are extra steps that need to be done, why don't Claude's developers just add those extra steps to /init?"
}
,
{
"id": "46530399",
"text": "Not that I have seen, which is probably a big part of the disconnect. Mostly it's tribal knowledge. I learned through experimentation, but I've seen tips here and there. Here's my workflow (roughly)\n\n> Create a CLAUDE.md for a c++ application that uses libraries x/y/z\n\n[Then I edit it, adding general information about the architecture]\n\n> Analyze the library in the xxx directory, and produce a xxx_architecture.md describing the major components and design\n\n> /agent [let claude make the agent, but when it asks what you want it to do, explain that you want it to specialize in subsystem xxx, and refer to xxx_architecture.md\n\nThen repeat until you have the major components covered. Then:\n\n> Using the files named with architecture.md analyze the entire system and update CLAUDE.md to use refer to them and use the specialized agents.\n\nNow, when you need to do something, put it in planning mode and say something like:\n\n> There's a bug in the xxx part of the application, where when I do yyy, it does zzz, but it should do aaa. Analyze the problem and come up with a plan to fix it, and automated tests you can perform if possible.\n\nThen, iterate on the plan with it if you need to, or just approve it.\n\nOne of the most important things you can do when dealing with something complex is let it come up with a test case so it can fix or implement something and then iterate until it's done. I had an image processing problem and I gave it some sample data, then it iterated (looking at the output image) until it fixed it. It spent at least an hour, but I didn't have to touch it while it worked."
}
,
{
"id": "46532173",
"text": "This is some great advice. What I would add is to avoid the internal plan mode and just build your own. Built in one creates md files outside the project, gives the files random names and its hard to reference in the future.\n\nIt's also hard to steer the plan mode or have it remember some behavior that you want to enforce. It's much better to create a custom command with custom instructions that acts as the plan mode.\n\nMy system works like this:\n\n/implement command acts as an orchestrator & plan mode, and it is instructed to launch predefined set of agents based on the problem and have them utilize specific skills. Every time /implement command is initiated, it has to create markdown file inside my own project, and then each subagent is also instructed to update the file when it finished working.\n\nThis way, orchestrator can spot that agent misbehaved, and reviewer agent can see what developer agent tried to do and why it was wrong."
}
,
{
"id": "46530452",
"text": "To be perfectly honest, I've never used a single /command besides /init. That probably means I'm using 1% of the software's capabilities. In frankness, the whole menu of /-commands is intimidating and I don't know where to start."
}
,
{
"id": "46531548",
"text": "/commands are like macros or mayyybe aliases. You just put in the commands you see yourself repeating often, like \"commit the unstaged files in distinct commits, use xxx style for the commit messages...\" - then you can iterate on it if you see any gaps or confusion, even give example commands to use in the different steps.\n\nSkills on the other hand are commands ON STEROIDS. They can be packaged with actual scripts and executables, the PEP723 Python style + uv is super useful.\n\nI have one skill for example that uses Python+Treesitter to check the unit thest quality of a Go project. It does some AST magic to check the code for repetition, stupid things like sleeps and relative timestamps etc. A /command _can_ do it, but it's not as efficient, the scripts for the skill are specifically designed for LLM use and output the result in a hyper-compact form a human could never be arsed to read."
}
,
{
"id": "46532224",
"text": "> In frankness, the whole menu of /-commands is intimidating and I don't know where to start.\n\nclaude-code has a built in plugin that it can use to fetch its own docs! You don't have to ever touch anything yourself, it can add the features to itself, by itself."
}
,
{
"id": "46530689",
"text": "You don't need to do much, the /agent command is the most useful, and it walks you through it. The main thing though is to give the agent something to work with before you create it. That's why I go through the steps of letting Claude analyze different components and document the design/architecture.\n\nThe major benefit of agents is that it keeps context clean for the main job. So the agent might have a huge context working through some specific code, but the main process can do something to the effect of \"Hey UI library agent, where do I need to put code to change the color of widget xyz\", then the agent does all the thinking and can reply with \"that's in file 123.js, line 200\". The cleaner you keep the main context, the better it works."
}
,
{
"id": "46531559",
"text": "Never thought of Agents in that way to be honest. I think I need to try that style =)"
}
,
{
"id": "46530722",
"text": "> if you did /init in a new code base, that Claude would set itself up to maximize its own understanding of the code.\n\nThis is definitely not the case, and the reason anthropic doesnt make claude do this is because its quality degrades massively as you use up its context. So the solution is to let users manage the context themselves in order to minimize the amount that is \"wasted\" on prep work. Context windows have been increasing quite a bit so I suspect that by 2030 this will no longer be an issue for any but the largest codebases, but for now you need to be strategic."
}
,
{
"id": "46532199",
"text": "Are you still talking about Opus 4.5 I’ve been working on a Rust, kotlin and c++ and it’s been doing well. Incredible at C++, like the number of mistakes it doesn’t make"
}
,
{
"id": "46530116",
"text": "> I still believe those who rave about it are not writing anything I would consider \"engineering\".\n\nCorrect. In fact, this is the entire reason for the disconnect, where it seems like half the people here think LLMs are the best thing ever and the other half are confused about where the value is in these slop generators.\n\nThe key difference is (despite everyone calling themselves an SWE nowadays) there's a difference between a \"programmer\" and an \"engineer\". Looking at OP, exactly zero of his screenshotted apps are what I would consider \"engineering\". Literally everything in there has been done over and over to the death. Engineering is.. novel, for lack of a better word.\n\nSee also: https://www.seangoedecke.com/pure-and-impure-engineering/"
}
,
{
"id": "46530937",
"text": "> Engineering is.. novel, for lack of a better word.\n\nTell that to the guys drawing up the world's 10 millionth cable suspension bridge"
}
,
{
"id": "46531636",
"text": "Actually, 10000th\n\nhttps://www.bridgemeister.com/fulllist.htm"
}
,
{
"id": "46530243",
"text": "I don't think it's that helpful to try to gatekeep the \"engineering\" term or try to separate it into \"pure\" and \"impure\" buckets, implying that one is lesser than the other. It should be enough to just say that AI assisted development is much better at non-novel tasks than it is at novel tasks. Which makes sense: LLMs are trained on existing work, and can't do anything novel because if it was trained on a task, that task is by definition not novel."
}
,
{
"id": "46530489",
"text": "Respectfully, it's absolutely important to \"gatekeep\" a title that has an established definition and certain expectations attached to the title.\n\nOP says, \"BUT YOU DON’T KNOW HOW THE CODE WORKS.. No I don’t. I have a vague idea, but you are right - I do not know how the applications are actually assembled.\" This is not what I would call an engineer. Or a programmer. \"Prompter\", at best.\n\nAnd yes, this is absolutely \"lesser than\", just like a middleman who subcontracts his work to Fiverr (and has no understanding of the actual work) is \"lesser than\" an actual developer."
}
,
{
"id": "46530702",
"text": "That's not the point being made to you. The point is that most people in the \"software engineering\" space are applying known tools and techniques to problems that are not groundbreaking. Very few are doing theoretical computer science, algorithm design, or whatever you think it is that should be called \"engineering.\""
}
,
{
"id": "46535339",
"text": "Coding agents as of Jan 2026 are great at what 95% of software engineers do. For remaining 5% that do really novel stuff -- the agents will get there in a few years."
}
,
{
"id": "46532209",
"text": "It's how you use the tool that matters. Some people get bitter and try to compare it to top engineers' work on novel things as a strawman so they can go \"Hah! Look how it failed!\" as they swing a hammer to demonstrate it cannot chop down a tree. Because the tool is so novel and it's use us a lot more abstract than that of an axe, it is taking awhile for some to see its potential, especially if they are remembering models from even six months ago.\n\nEngineering is just problem solving, nobody judges structural engineers for designing structures with another Simpson Strong Tie/No.2 Pine 2x4 combo because that is just another easy (and therefore cheap) way to rapidly get to the desired state. If your client/company want to pay for art, that's great! Most just want the thing done fast and robustly."
}
,
{
"id": "46521052",
"text": "I've had Opus 4.5 hand rolling CUDA kernels and writing a custom event loop on io_uring lately and both were done really well. Need to set up the right feedback loops so it can test its work thoroughly but then it flies."
}
,
{
"id": "46521160",
"text": "Yeah I've handed it a naive scalar implementation and said \"Make this use SIMD for Mac Silicon / NEON\" and it just spits out a working implementation that's 3-6x faster and passes the tests, which are binary exact specifications."
}
,
{
"id": "46521478",
"text": "It can do this at the level of a function, and that's -useful-, but like the parent reply to top-level comment, and despite investing the time, using skills & subagents, etc., I haven't gotten it to do well with C++ or Rust projects of sufficient complexity. I'm not going to say they won't some day, but, it's not today."
}
,
{
"id": "46522437",
"text": "Anecdotally, we use Opus 4.5 constantly on Zed's code base, which is almost a million lines of Rust code and has over 150K active users, and we use it for basically every task you can think of - new features, bug fixes, refactors, prototypes, you name it. The code base is a complex native GUI with no Web tech anywhere in it.\n\nI'm not talking about \"write this function\" but rather like implementing the whole feature by writing only English to the agent, over the course of numerous back-and-forth interactions and exhausting multiple 200K-token context windows.\n\nFor me personally, definitely at least 99% all of the Rust code I've committed at work since Opus 4.5 came out has been from an agent running that model. I'm reading lots of Rust code (that Opus generated) but I'm essentially no longer writing any of it. If dot-autocomplete (and LLM autocomplete) disappeared from IDE existence, I would not notice."
}
,
{
"id": "46534428",
"text": "I just uninstalled Zed today when I realized the reason I couldn't delete a file on Windows because it was open in Zed. So I wouldn't speak too highly of the LLM's ability to write code. I have never seen another editor on Windows make the mistake of opening files without enabling all 3 share modes."
}
,
{
"id": "46530815",
"text": "Woah that's a very interesting claim you made\nI was shying away from writing Rust as I am not a Rust developer but hearing from your experience looks like claude has gotten very good at writing Rust"
}
,
{
"id": "46534263",
"text": "Honestly I think the more you can give Claude a type system and effective tests, the more effective it can be. Rust is quite high up on the test strictness front (though I think more could be done...), so it's a great candidate. I also like it's performance on Haskell and Go, both get you pretty great code out of the box."
}
,
{
"id": "46528407",
"text": "Have you ever worried that by programming in this way, you are methodically giving Anthropic all the information it needs to copy your product? If there is any real value in what you are doing, what is to stop Anthropic or OpenAI or whomever from essentially one-shotting Zed? What happens when the model providers 10x their costs and also use the information you've so enthusiastically given them to clone your product and use the money that you paid them to squash you?"
}
,
{
"id": "46528424",
"text": "Zed's entire code base is already open source, so Anthropic has a much more straightforward way to see our code:\n\nhttps://github.com/zed-industries/zed"
}
,
{
"id": "46529662",
"text": "That's what things like AWS bedrock are for.\n\nAre you worried about microsoft stealing your codebase from github?"
}
,
{
"id": "46532980",
"text": "Isn’t it widely assumed Microsoft used private repos for LLM training?\n\nAnd even with a narrower definition of stealing, Microsoft’s ability to share your code with US government agencies is a common and very legitimate worry in plenty of threat model scenarios."
}
,
{
"id": "46527838",
"text": "The article is arguing that it will basically replace devs. Do you think it can replace you basically one-shotting features/bugs in Zed?\n\nAnd also - doesn’t that make Zed (and other editors) pointless?"
}
,
{
"id": "46528548",
"text": "> Do you think it can replace you basically one-shotting features/bugs in Zed?\n\nNobody is one-shotting anything nontrivial in Zed's code base, with Opus 4.5 or any other model.\n\nWhat about a future model? Literally nobody knows. Forecasts about AI capabilities have had horrendously low accuracy in both directions - e.g. most people underestimated what LLMs would be capable of today, and almost everyone who thought AI would at least be where it is today...instead overestimated and predicted we'd have AGI or even superintelligence by now. I see zero signs of that forecasting accuracy improving. In aggregate, we are atrocious at it.\n\nThe only safe bet is that hardware will be faster and cheaper (because the most reliable trend in the history of computing has been that hardware gets faster and cheaper), which will naturally affect the software running on it.\n\n> And also - doesn’t that make Zed (and other editors) pointless?\n\nIt means there's now demand for supporting use cases that didn't exist until recently, which comes with the territory of building a product for technologists! :)"
}
,
{
"id": "46528733",
"text": "Thanx. More of a \"faster keyboard\" so far then?\n\nAnd yeah - if I had a crystal ball, I would be on my private island instead of hanging on HN :)"
}
,
{
"id": "46528842",
"text": "Definitely more than a faster keyboard (e.g. I also ask the model to track down the source of a bug, or questions about the state of the code base after others have changed it, bounce architectural ideas off the model, research, etc.) but also definitely not a replacement for thinking or programming expertise."
}
,
{
"id": "46530102",
"text": "Trying to one-shot large codebases is a exercise in futility. You need to let Claude figure out and document the architecture first, then setup agents for each major part of the project. Doing this keeps the context clean for the main agent, since it doesn't have to go read the code each time. So one agent can fill it's entire context understanding part of the code and then the main agent asks it how to do something and gets a shorter response.\n\nIt takes more work than one-shot, but not a lot, and it pays dividends."
}
,
{
"id": "46532516",
"text": "Is there a guide for doing that successfully somewhere? I would love to play with this on a large codebase. I would also love to not reinvent the wheel on getting Claude working effectively on a large code base. I don’t even know where to start with, e.g., setting up agents for each part."
}
,
{
"id": "46521546",
"text": "I don't know if you've tried Chatgpt-5.2 but I find codex much better for Rust mostly due to the underlying model. You have to do planning and provide context, but 80%+ of the time it's a oneshot for small-to-medium size features in an existing codebase that's fairly complex. I honestly have to say that it's a better programmer than I am, it's just not anywhere near as good a software developer for all of the higher and lower level concerns that are the other 50% of the job.\n\nIf you have any opensource examples of your codebase, prompt, and/or output, I would happily learn from it / give advice. I think we're all still figuring it out.\n\nAlso this SIMD translation wasn't just a single function - it was multiple functions across a whole region of the codebase dealing with video and frame capture, so pretty substantial."
}
,
{
"id": "46526966",
"text": "\"I honestly have to say that it's a better programmer than I am, it's just not anywhere near as good a software developer for all of the higher and lower level concerns that are the other 50% of the job.\"\n\nThat's a good way to say it, I totally identify."
}
,
{
"id": "46529651",
"text": "Is that a context issue? I wonder if LSP would help there. Though Claude Code should grep the codebase for all necessary context and LSP should in theory only save time, I think there would be a real improvement to outcomes as well.\n\nThe bigger a project gets the more context you generally need to understand any particular part. And by default Claude Code doesn't inject context, you need to use 3rd party integrations for that."
}
,
{
"id": "46526925",
"text": "I'll second this. I'm making a fairly basic iOS/Swift app with an accompanying React-based site. I was able to vibe-code the React site (it isn't pretty, but it works and the code is fairly decent). But I've struggled to get the Swift code to be reliable.\n\nWhich makes sense. I'm sure there's lots of training data for React/HTML/CSS/etc. but much less with Swift, especially the newer versions."
}
,
{
"id": "46530647",
"text": "I had surprising success vibe coding a swift iOS app a while back. Just for fun, since I have a bluetooth OBD2 dongle and an electric truck, I told Claude to make me an app that could connect to the truck using the dongle, read me the VIN, odometer, and state of charge. This was middle of 2025, so before Opus 4.5. It took Claude a few attempts and some feedback on what was failing, but it did eventually make a working app after a couple hours.\n\nNow, was the code quality any good? Beats me, I am not a swift developer. I did it partly as an experiment to see what Claude was currently capable of and partly because I wanted to test the feasibility of setting up a simple passive data logger for my truck.\n\nI'm tempted to take another swing with Opus 4.5 for the science."
}
,
{
"id": "46531674",
"text": "I hate \"vibe code\" as a verb. May I suggest \"prompt\" instead? \"I was able to prompt the React site….\""
}
,
{
"id": "46524412",
"text": "I built an open to \"game engine\" entirely in Lua a many years ago, but relying on many third party libraries that I would bind to with FFI.\n\nI thought I'd revive it, but this time with Vulkan and no third-party dependencies (except for Vulkan)\n\n4.5 Sonet, Opus and Gemini 3.5 flash has helped me write image decoders for dds, png jpg, exr, a wayland window implementation, macOS window implementation, etc.\n\nI find that Gemini 3.5 flash is really good at understanding 3d in general while sonnet might be lacking a little.\n\nAll these sota models seem to understand my bespoke Lua framework and the right level of abstraction. For example at the low level you have the generated Vulkan bindings, then after that you have objects around Vulkan types, then finally a high level pipeline builder and whatnot which does not mention Vulkan anywhere.\n\nHowever with a larger C# codebase at work, they really struggle. My theory is that there are too many files and abstractions so that they cannot understand where to begin looking."
}
,
{
"id": "46522562",
"text": "I'm a quite senior frontend using React and even I see Sonnet 4.5 struggle with basic things. Today it wrote my Zod validation incorrectly, mixing up versions, then just decided it wasn't working and attempted to replace the entire thing with a different library."
}
,
{
"id": "46523336",
"text": "There’s little reason to use sonnet anymore. Haiku for summaries, opus for anything else. Sonnet isn’t a good model by today’s standards."
}
]
</comments_to_classify>
Based on the comments above, assign each to up to 3 relevant topics.
Return ONLY a JSON array with this exact structure (no other text):
[
{
"id": "comment_id_1",
"topics": [
1,
3,
5
]
}
,
{
"id": "comment_id_2",
"topics": [
2
]
}
,
{
"id": "comment_id_3",
"topics": [
0
]
}
,
...
]
Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices for matches
- Use index 0 if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment
Remember: Output ONLY the JSON array, no other text.