Summarizer

LLM Input

llm/0c6097e3-bc76-4fbe-ab4f-ceafa2484e5f/batch-4-f4005209-1227-4487-86d5-2c6b1b588ac2-input.json

prompt

The following is content for you to classify. Do not respond to the comments—classify them.

<topics>
1. AI Performance on Greenfield vs. Legacy
   Related: Users debate whether agents excel primarily at starting new projects from scratch while struggling to maintain large, complex, or legacy codebases without breaking existing conventions.
2. Context Window Limitations and Management
   Related: Discussions focus on token limits (200k), performance degradation as context fills, and strategies like compacting history, using sub-agents, or maintaining summary files to preserve long-term memory.
3. Vibe Coding and Code Quality
   Related: The polarization around building apps without reading the code; critics warn of unmaintainable "slop" and technical debt, while proponents value the speed and ability to bypass syntax.
4. Claude Code and Tooling
   Related: Specific praise and critique for the Claude Code CLI, its integration with VS Code and Cursor, the use of slash commands, and comparisons to GitHub Copilot's agent mode.
5. Economic Impact on Software Jobs
   Related: Existential anxiety regarding the obsolescence of mid-level engineers, the potential "hollowing out" of the middle class, and the shift toward one-person unicorn teams.
6. Prompt Engineering and Configuration
   Related: Strategies involving `CLAUDE.md`, `AGENTS.md`, and custom system prompts to teach the AI coding conventions, architecture, and specific skills for better output.
7. Specific Language Capabilities
   Related: Anecdotal evidence regarding proficiency in React, Python, and Go versus struggles in C++, Rust, and mobile development (Swift/Kotlin), often tied to training data availability.
8. Engineering vs. Coding
   Related: A recurring distinction between "coding" (boilerplate, standard patterns) which AI conquers, and "engineering" (novel logic, complex systems, 3D graphics) where AI supposedly still fails.
9. Security and Trust
   Related: Concerns about deploying unaudited AI code, the introduction of vulnerabilities, the risks of giving agents shell access, and the difficulty of verifying AI output.
10. The Skill Issue Argument
   Related: Proponents dismiss failures as "skill issues," suggesting frustration stems from poor prompting or adaptability, while skeptics argue the tools are genuinely inconsistent.
11. Cost of AI Development
   Related: Analysis of the financial viability of AI coding, including hitting API rate limits, the high cost of Opus 4.5 tokens, and the potential unsustainability of VC-subsidized pricing.
12. Future of Software Products
   Related: Predictions that software creation costs will drop to zero, leading to a flood of bespoke personal apps replacing commercial SaaS, but potentially creating a maintenance nightmare.
13. Human-in-the-Loop Workflows
   Related: The consensus that AI requires constant human oversight, "tools in a loop," and code review to prevent hallucination loops and ensure functional software.
14. Opus 4.5 vs. Previous Models
   Related: Users describe the specific model as a "step change" or "inflection point" compared to Sonnet 3.5 or GPT-4, citing better reasoning and autonomous behavior.
15. Documentation and Specification
   Related: The shift from writing code to writing specs; users find that detailed markdown documentation or "plan mode" yields significantly better AI results than vague prompts.
16. AI Hallucinations and Errors
   Related: Reports of AI inventing non-existent CLI tools, getting stuck in logical loops, failing at visual UI tasks, and making simple indexing errors.
17. Shift in Developer Role
   Related: The idea that developers are evolving into "product managers" or "architects" who direct agents, requiring less syntax proficiency and more systems thinking.
18. Testing and Verification
   Related: The reliance on test-driven development (TDD), linters, and compilers to constrain non-deterministic AI output, ensuring generated code actually runs and meets requirements.
19. Local Models vs. Cloud APIs
   Related: Discussions on the viability of local models for privacy and cost savings versus the necessity of massive cloud models like Opus for complex reasoning tasks.
20. Societal Implications
   Related: Broader philosophical concerns about wealth concentration, the "class war" of automation, environmental impact, and the future of work in a post-code world.
0. Does not fit well in any category
</topics>

<comments_to_classify>
[
  
{
  "id": "46522271",
  "text": "Just FYI, these days cc has 'ide integration' too, it's not just a cli. Grab the vscode extension."
}
,
  
{
  "id": "46528641",
  "text": "I use CC for so much more than just writing code that I cannot imagine being constrained within an IDE. Why would I want to launch an IDE to have CC update the *arr stack on my NAS to the latest versions for example? Last week I pointed CC at some media files that weren't playing correctly on my Apple TV. It detected what the problem formats were and updated my *arr download rules to prefer other releases and then configured tdarr to re-encode problem files in my existing library."
}
,
  
{
  "id": "46519882",
  "text": "This is where the LLM coding shines in my opinion, there's a list of things they are doing very well:\n\n- single scripts. Anything which can be reduced to a single script.\n\n- starting greenfield projects from scratch\n\n- code maintenance (package upgrades, old code...)\n\n- tasks which have a very clear and single definition. This isn't linked to complexity, some tasks can be both very complex but with a single definition.\n\nIf your work falls into this list they will do some amazing work (and yours clearly fits that), if it doesn't though, prepare yourself because it will be painful."
}
,
  
{
  "id": "46520035",
  "text": "I'm trying to determine what programming tasks are not in this list. :) I think it is trying to exclude adding new features and fixing bugs in existing code. I've done enough of that with LLMs, though not in large codebases.\n\nI should say I'm hardly ever vibe-coding, unlike the original article. If I think I want code that will last, I'll steer the models in ways that lean on years of non-LLM experience. E.g., I'll reject results that might work if they violate my taste in code.\n\nIt also helps that I can read code very fast. I estimate I can read code 100x faster than most students. I'm not sure there is any way to teach that other than the old-fashioned way, which involves reading (and writing) a lot of code."
}
,
  
{
  "id": "46521876",
  "text": "> I'm trying to determine what programming tasks are not in this list. :) I think it is trying to exclude adding new features and fixing bugs in existing code\n\nYes indeed, these are the things on the other hand which aren't working well in my opinion:\n\n- large codebase\n\n- complex domain knowledge\n\n- creating any feature where you need product insights\n\n- tasks requiring choices (again, complexity doesn't matter here, the task may be simple but require some choices)\n\n- anything unclear where you don't know where you are going first\n\nWhile you don't experience any of these when teaching or side projects, these are very common in any enterprise context."
}
,
  
{
  "id": "46520047",
  "text": "What did you build? I think people talk past each other when they don't share what exactly they were trying to do and achieving success/failure."
}
,
  
{
  "id": "46520086",
  "text": "Referring to this: https://github.com/arjunguha/slopcoder\n\nI then proceeded to use it to hack on its own codebase, and close a bunch of issues in a repository that I maintain ( https://github.com/nuprl/MultiPL-E/commits/main/ )."
}
,
  
{
  "id": "46523183",
  "text": "The crazy part is, once you have it setup and adapted your workflow, you start to notice all sorts of other \"small\" things:\n\nclaude can call ssh and do system admin tasks. It works amazingly well. I have 3 VM's, which depends on each other (proxmox with openwrt, adguard, unbound), and claude can prove to me that my dns chains works perfectly, my firewalls are perfect etc as claude can ssh into each. Setting up services, diagnosing issues, auditing configs... you name it. Just awesome.\n\nclaude can call other sh scripts on the machine, so over time, you can create a bunch of scripts that lets claude one shot certain tasks that would normally eat tokens. It works great. One script per intention - don't have a script do more than one thing.\n\nclaude can call the compiler, run the debug executable and read the debug logs.. in real time. So claude can read my android apps debug stream via adb.. or my C# debug console because claude calls the compiler, not me. Just ask it to do it and it will diagnose stuff really quickly.\n\nIt can also analyze your db tables (give it readonly sql access), look at the application code and queries, and diagnose performance issues.\n\nThe opportunities are endless here. People need to wake up to this."
}
,
  
{
  "id": "46532904",
  "text": "I have a /fix-ci-build slash command that instructs Claude how to use `gh` to get the latest build from that specific project's Github Actions and get the logs for the build\n\nIn addition there are instructions on how and where to push the possible fixes and how to check the results.\n\nI've yet to encounter a build failure it couldn't fix automatically."
}
,
  
{
  "id": "46532234",
  "text": "> Once you’ve got Claude Code set up, you can point it at your codebase, have it learn your conventions, pull in best practices, and refine everything until it’s basically operating like a super-powered teammate. The real unlock is building a solid set of reusable “skills” plus a few agents for the stuff you do all the time.\n\nI agree with this, but I haven't needed to use any advanced features to get good results. I think the simple approach gets you most of the benefits. Broadly, I just have markdown files in the repo written for a human dev audience that the agent can also use.\n\nBasically:\n\n- README.md with a quick start section for devs, descriptions of all build targets and tests, etc. Normal stuff.\n\n- AGENTS.md (only file that's not written for people specifically) that just describes the overall directory structure and has a short step of instructions for the agent: (1) Always read the readme before you start. (2) Always read the relevant design docs before you start. (3) Always run the linter, a build, and tests whenever you make code changes.\n\n- docs/*.md that contain design docs, architecture docs, and user stories, just text. It's important to have these resources anyway, agent or no.\n\nAs with human devs, the better the docs/requirements the better the results."
}
,
  
{
  "id": "46517233",
  "text": "Why do all these AI generated readmes have a directory structure sections it's so redundant because you know I could just run tree"
}
,
  
{
  "id": "46519657",
  "text": "It makes me so exhausted trying to read them... my brain can tell immediately when there's so much redundant information that it just starts shutting itself off."
}
,
  
{
  "id": "46517489",
  "text": "comments? also reading into an agent so the agent doesnt have to tool-call/bash out"
}
,
  
{
  "id": "46516388",
  "text": "I think we're entering a world where programmers as such won't really exist (except perhaps in certain niches). Being able to program (and read code, in particular) will probably remain useful, though diminished in value. What will matter more is your ability to actually create things, using whatever tools are necessary and available, and have them actually be useful. Which, in a way, is the same as it ever was. There's just less indirection involved now."
}
,
  
{
  "id": "46520562",
  "text": "We've been living in that world since the invention of the compiler (\"automatic programming\"). Few people write machine code any more. If you think of LLMs as a new variety of compiler, a lot of their shortcomings are easier to describe."
}
,
  
{
  "id": "46521914",
  "text": "My compiler runs on my computer and produces the same machine code given the same input. Neither of these are true with AI."
}
,
  
{
  "id": "46534541",
  "text": "You can run an LLM locally (and distributed compile systems, where the compiler runs in the cloud, are a thing, too) so that doesn't really produce a distinction between the two.\n\nLikewise, many optimization techniques involve some randomness, whether it's approximating an NP-thorny subproblem, or using PGO guided by statistical sampling. People might disable those in pursuit of reproducible builds, but no one would claim that enabling those features makes GCC or LLVM no longer a compiler. So nondeterminism isn't really the distinguishing factor either."
}
,
  
{
  "id": "46520592",
  "text": "last thing I want is non-deterministic compiler, do not vibe this analogy at all…"
}
,
  
{
  "id": "46524631",
  "text": "Finally we've invented a compiler that we can yell at when it gives bullshit errors. I really missed that with gcc."
}
,
  
{
  "id": "46516564",
  "text": "Isn't there more indirection as long as LLMs use \"human\" programming languages?"
}
,
  
{
  "id": "46522900",
  "text": "If you think of the training data, e.g. SO, github etc, then you have a human asking or describing a problem, then the code as the solution. So I suspect current-gen LLMs are still following this model, which means for the forseeable future a human like language prompt will still be the best.\n\nUntil such time, of course, when LLMs are eating their own dogfood, in which case they - as has already happened - create their own language, evolve dramatically, and cue skynet."
}
,
  
{
  "id": "46516878",
  "text": "More indirection in the sense that there's a layer between you and the code, sure. Less in that the code doesn't really matter as such and you're not having to think hard about the minutiae of programming in order to make something you want. It's very possible that \"AI-oriented\" programming languages will become the standard eventually (at least for new projects)."
}
,
  
{
  "id": "46533279",
  "text": "One benefit of conventional code is that it expresses logic in an unambiguous way. Much of \"the minutiae\" is deciding what happens in edge cases. It's even harder to express that in a human language than in computer languages. For some domains it probably doesn't matter."
}
,
  
{
  "id": "46520370",
  "text": "It’s not clear how affordances of programming languages really differ between humans and LLMs."
}
,
  
{
  "id": "46516523",
  "text": "You intrigue me.\n\n> have it learn your conventions, pull in best practices\n\nWhat do you mean by \"have it learn your conventions\"? Is there a way to somehow automatically extract your conventions and store it within CLAUDE.md?\n\n> For example, we have a custom UI library, and Claude Code has a skill that explains exactly how to use it. Same for how we write Storybooks, how we structure APIs, and basically how we want everything done in our repo. So when it generates code, it already matches our patterns and standards out of the box.\n\nDid you have to develop these skills yourself? How much work was that? Do you have public examples somewhere?"
}
,
  
{
  "id": "46517304",
  "text": "> What do you mean by \"have it learn your conventions\"?\n\nI'll give you an example: I use ruff to format my python code, which has an opinionated way of formatting certain things. After an initial formatting, Opus 4.5, without prompting, will write code in this same style so that the ruff formatter almost never has anything to do on new commits. Sonnet 4.5 is actually pretty good at this too."
}
,
  
{
  "id": "46519611",
  "text": "Isn't this a meaningless example? Formatters already exist. Generating code that doesn't need to be formatted is exactly the same as generating code and then formatting it.\n\nI care about the norms in my codebase that can't be automatically enforced by machine. How is state managed? How are end-to-end tests written to minimize change detectors? When is it appropriate to log something?"
}
,
  
{
  "id": "46532633",
  "text": "The second part is what I'd also like to have.\n\nBut I think it should be doable. You can tell it how YOU want the state to be managed and then have it write a custom \"linter\" that makes the check deterministic. I haven't tried this myself, but claude did create some custom clippy scripts in rust when I wanted to enforce something that isn't automatically enforced by anything out there."
}
,
  
{
  "id": "46533399",
  "text": "Lints are typically well suited for syntactic properties or some local semantic properties. Almost all interesting challenges in software design and evolution involve nonlocal semantic properties."
}
,
  
{
  "id": "46520398",
  "text": "Here's an example:\n\nWe have some tests in \"GIVEN WHEN THEN\" style, and others in other styles. Opus will try to match each style of testing by the project it is in by reading adjacent tests."
}
,
  
{
  "id": "46529265",
  "text": "Memes write themselves.\n\n\"AI has X\"\n\n\"We have X at home\"\n\n\"X at home: x\""
}
,
  
{
  "id": "46516612",
  "text": "Since starting to use Opus 4.5 I've reduced instructions in claude.md and just ask claude to look in the codebase to understand the patterns already in use. Going from prompts/docs to instead having code be the \"truth\". Show don't tell. I've found this pattern has made a huge leap with Opus 4.5."
}
,
  
{
  "id": "46530155",
  "text": "I feel like I've been doing this since Sonnet 3.5 or Sonnet 4. I'll clone projects/modules/whatever into the working directory and tell claude to check it out. Voila, now it knows your standards and conventions."
}
,
  
{
  "id": "46520691",
  "text": "The Ash framework takes the approach you describe.\n\nFrom the docs ( https://hexdocs.pm/ash/what-is-ash.html ):\n\n\"Model your application's behavior first, as data, and derive everything else automatically. Ash resources center around actions that represent domain logic.\""
}
,
  
{
  "id": "46516976",
  "text": "When I ask Claude to do something, it independently, without me even asking or instructing it to, searches the codebase to understand what the convention is.\n\nI’ve even found it searching node_modules to find the API of non-public libraries."
}
,
  
{
  "id": "46519568",
  "text": "This sounds like it would take a huge amount of tokens. I've never used agents so could you disclose how much you pay for it?"
}
,
  
{
  "id": "46520287",
  "text": "If they're using Opus then it'll be the $100/month Claude Max 5x plan (could be the more expensive 20x plan depending on how intensive their use is). It does consume a lot of tokens, but I've been using the $100/mo plan and get a lot done without hitting limits. It helps to be mindful of context (regularly amending/pruning your CLAUDE.md instructions, clearing context between tasks, sizing your tasks to stay within the Opus context window). Claude Code plans have token limits that work in 5-hour blocks (that start when you send your first token, so it's often useful to prime it as early in the morning as possible).\n\nClaude Code will spawn sub-agents (that often use their cheap Haiku model) for exploration and planning tasks, with only the results imported into the main context.\n\nI've found the best results from a more interactive collaboration with Claude Code. As long as you describe the problem clearly, it does a good job on small/moderate tasks. I generally set two instances of Claude Code separate tasks and run them concurrently (the interaction with Claude Code distracts me too much to do my own independent coding simultaneously like with setting a task for a colleague, but I do work on architecture / planning tasks)\n\nThe one matter of taste that I have had to compromise on is the sheer amount of code - it likes to write a lot of code. I have a better experience if I sweat the low-level code less, and just periodically have it clean up areas where I think it's written too much / too repetitive code.\n\nAs you give it more freedom it's more prone to failure (and can often get itself stuck in a fruitless spiral) - however as you use it more you get a sense of what it can do independently and what it's likely to choke on.\n\nA codebase with good human-designed unit & playwright tests is very good.\n\nCrucially, you get the best results where your tasks are complex but on the menial side of the spectrum - it can pay attention to a lot of details, but on the whole don't expect it to do great on senior-level tasks.\n\nTo give you an idea, in a little over a month \"npx ccusage\" shows that via my Claude Code 5x sub I've used 5M input tokens, 1.5M output, 121M Cache Create, 1.7B Cache Read. Estimated pay-as-you-go API cost equivalent is $1500 (N.B. for the tail end of December they doubled everybody's API limits, so I was using a lot more tokens on more experimental on-the-fly tool construction work)"
}
,
  
{
  "id": "46520490",
  "text": "FYI Opus is available and pretty usable in claude-code on the $20/Mo plan if you are at all judicious.\n\nI exclusively use opus for architecture / speccing, and then mostly Sonnet and occasionally Haiku to write the code. If my usage has been light and the code isn't too straightforward, I'll have Opus write code as well."
}
,
  
{
  "id": "46520529",
  "text": "That's helpful to know, thanks! I gave Max 5x a go and didn't look back. My suspicion is that Opus 4.5 is subsidised, so good to know there's flexibility if prices go up."
}
,
  
{
  "id": "46523453",
  "text": "The $20 plan for CC is good enough for 10-20 minutes of opus every 5h and you’ll be out of your weekly limit after 4-5 days if you sleep during the night. I wouldn’t be surprised if Anthropic actually makes a profit here. (Yeah probably not, but they aren’t burning cash.)"
}
,
  
{
  "id": "46530126",
  "text": "\"Claude, clone this repo https://github.com/repo , review the coding conventions, check out any markdown or readme files. This is an example of coding conventions we want to use on this project\""
}
,
  
{
  "id": "46521636",
  "text": "All of these things work very well IMO in a professional context.\n\nEspecially if you're in a place where a lot of time was spent previously revising PRs for best practices, etc, even for human-submitted code, then having the LLM do that for you that saves a bunch of time. Most humans are bad at following those super-well.\n\nThere's a lot of stuff where I'm pretty sure I'm up to at least 2x speed now. And for things like making CLI tools or bash scripts, 10x-20x. But in terms of \"the overall output of my day job in total\", probably more like 1.5x.\n\nBut I think we will need a couple major leaps in tooling - probably deterministic tooling, not LLM tooling - before anyone could responsibly ship code nobody has ever read in situations with millions of dollars on the line (which is different from vibe-coding something that ends up making millions - that's a low-risk-high-reward situation, where big bets on doing things fast make sense. if you're already making millions, dramatic changes like that can become high-risk-low-reward very quickly. In those companies, \"I know that only touching these files is 99.99% likely to be completely safe for security-critical functionality\" and similar \"obvious\" intuition makes up for the lack of ability to exhaustively test software in a practical way (even with fuzzers and things), and \"i didn't even look at the code\" is conceding responsibility to a dangerous degree there.)"
}
,
  
{
  "id": "46516170",
  "text": "Oh! An ad!"
}
,
  
{
  "id": "46516707",
  "text": "The most effective kind of marketing is viral word of mouth from users who love your product. And Claude Code is benefiting from that dynamic."
}
,
  
{
  "id": "46516294",
  "text": "lol does sound like an ad, but it's true. Also, I forgot about hooks; use hooks too! I just use voice to text then had claude reword it. Still my real world ideas"
}
,
  
{
  "id": "46530746",
  "text": "Exactly what an ad would say."
}
,
  
{
  "id": "46523684",
  "text": "I'm curious: With that much Claude Code usage, does that put your monthly Anthropic bill above $1000/mo?"
}
,
  
{
  "id": "46530550",
  "text": "Thanks for the example! There's a lot (of boilerplate?) here that I don't understand. Does anyone have good references for catching up to speed what's the purpose of all of these files in the demo?"
}
,
  
{
  "id": "46516204",
  "text": "Mind sharing the bill for all that?"
}
,
  
{
  "id": "46516263",
  "text": "My company pays for the team Claude code plan which is like $200 a month for each dev. The workflows cost like 10 - 50 cents a PR"
}

]
</comments_to_classify>

Based on the comments above, assign each to up to 3 relevant topics.

Return ONLY a JSON array with this exact structure (no other text):
[
  
{
  "id": "comment_id_1",
  "topics": [
    1,
    3,
    5
  ]
}
,
  
{
  "id": "comment_id_2",
  "topics": [
    2
  ]
}
,
  
{
  "id": "comment_id_3",
  "topics": [
    0
  ]
}
,
  ...
]

Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices for matches
- Use index 0 if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment

Remember: Output ONLY the JSON array, no other text.

commentCount

50
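The output contract above (a bare JSON array of `{"id", "topics"}` objects, 0 to 3 one-based topic indices per comment, index 0 for "no fit") can be checked mechanically before a batch reply is accepted. A minimal Python sketch of such a check; the `validate_reply` helper is hypothetical (not part of this job), and treating index 0 as exclusive is an assumption, since the rules imply but do not state it:

```python
import json


def validate_reply(reply: str, valid_ids: set, n_topics: int = 20) -> list:
    """Return a list of rule violations in a model reply (empty = valid).

    Expected shape: a bare JSON array of {"id": ..., "topics": [...]}
    objects, each with 0-3 integer topic indices in 0..n_topics.
    """
    errors = []
    try:
        items = json.loads(reply)
    except json.JSONDecodeError as exc:
        return ["not valid JSON: %s" % exc]
    if not isinstance(items, list):
        return ["top-level value is not a JSON array"]
    for item in items:
        cid = item.get("id")
        topics = item.get("topics", [])
        if cid not in valid_ids:
            errors.append("unknown comment id: %r" % (cid,))
        if len(topics) > 3:
            errors.append("%s: %d topics, max is 3" % (cid, len(topics)))
        for t in topics:
            if not (isinstance(t, int) and 0 <= t <= n_topics):
                errors.append("%s: topic index %r out of range" % (cid, t))
        # Assumption: index 0 ("does not fit") should not be mixed
        # with real topic indices.
        if 0 in topics and len(topics) > 1:
            errors.append("%s: index 0 combined with other topics" % cid)
    return errors
```

Run against the example IDs in this batch, `validate_reply('[{"id": "46522271", "topics": [4]}]', {"46522271"})` returns an empty list, while a reply with four topic indices or an unknown id produces a corresponding error message.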
