The following is content for you to classify. Do not respond to the comments—classify them.
<topics>
1. Determinism vs. Probabilistic Output
Related: Comparisons between compilers (deterministic, reliable) and LLMs (probabilistic, 'fuzzy'). Users debate whether 100% correctness is required for tools, with some arguing that LLMs are fundamentally different from traditional automation because they lack a 'ground truth' logic, while others argue that error rates are acceptable if the utility is high enough.
2. The Code Review Bottleneck
Related: Concerns that generating code faster merely shifts the bottleneck to reviewing code, which is often harder and more time-consuming than writing it. Users discuss the cognitive load of verifying 'vibe code' and the risks of blindly trusting output that looks correct but contains subtle bugs or security flaws.
3. Erosion of Programming Skills
Related: Fears that relying on AI causes developers to lose fundamental skills ('use it or lose it'), such as forgetting syntax for frameworks like RSpec. Users discuss the value of the 'Stare'—deep mental simulation of problems—and whether outsourcing thinking to machines degrades human expertise and the ability to solve novel problems without assistance.
4. Financial Barriers and Costs
Related: Discussions about the high cost of running continuous agents (potentially hundreds of dollars a month), with some noting that the author's wealth (as a billionaire/founder) biases his perspective on affordability. Users question whether the productivity gains justify the expense for average developers or if this creates a divide based on access to compute.
5. Agentic Workflows and Harnessing
Related: Technical strategies for controlling AI behavior, such as 'harness engineering,' using AGENTS.md files to document rules and prevent regressions, and setting up feedback loops where agents run tests to verify their own work. This includes moving beyond simple chatbots to autonomous background processes that triage issues or perform research.
6. Safety and Sandboxing
Related: Practical concerns about giving AI agents shell access or file system permissions. Users discuss the risks of agents accidentally 'nuking' systems, installing unwanted dependencies, or running dangerous commands, and recommend solutions like running agents in containers, VMs, or using specific sandboxing tools like Leash to limit blast radius.
7. Environmental Impact
Related: Reactions to the author's suggestion to 'always have an agent running,' with users expressing alarm at the potential energy consumption and environmental cost of millions of developers running constant background inference tasks for marginal productivity gains, described by some as 'cooking the planet.'
8. Architects vs. Builders Analogy
Related: Extensive debate using construction analogies to describe the shift in the developer's role. Comparisons are made between architects (who design and delegate) and builders, with arguments about whether AI users are 'vibe architects' who don't understand the materials, or professional engineers utilizing modern equivalents of CAD software and heavy machinery.
9. AI as Junior Developers
Related: The characterization of AI agents as an infinite supply of 'slightly drunken new college grads' or interns who are fast and cheap but require constant supervision. Users discuss the ratio of senior engineer time needed to review AI output and the lack of a path for these 'AI juniors' to ever become seniors.
10. Trust and Hallucination Risks
Related: Skepticism regarding the reliability of AI, highlighted by examples like 'wind-powered cars' or bad recipes. Users argue that because LLMs predict tokens rather than understanding physics or logic, they are 'confidently stupid' and require expert humans to filter out hallucinations, making them dangerous for those lacking deep domain knowledge.
11. Productivity vs. Inefficiency
Related: Debates over whether AI actually saves time or just feels productive. Some cite studies suggesting productivity drops (e.g., 19%), while others argue that the efficiency comes from parallelizing tasks or handling boilerplate. Users critique the lack of hard metrics in the article and the reliance on 'feeling' more efficient.
12. Corporate Process vs. Individual Flow
Related: The distinction between individual productivity gains (solopreneurs, solo projects) and organizational reality. Users note that while AI speeds up coding, it doesn't solve organizational bottlenecks like meetings, cross-team coordination, or gathering requirements, limiting its revolutionary impact on large enterprises compared to solo work.
13. Spec Writing as the New Coding
Related: The idea that working with agents shifts the primary task from writing syntax to writing detailed specifications and prompts. Users note that AI forces developers to be more explicit about requirements, effectively turning English specs into the source code, though some argue this is just a verbose and nondeterministic programming language.
14. Hype Cycles and Model Churn
Related: Frustration with the rapid pace of change in the AI landscape ('honeymoon phase'). Users complain about building workflows around a specific model only for it to change or degrade ('drift') in the next update, leading to a constant need to relearn prompt engineering and tooling idiosyncrasies.
15. Local Models vs. Cloud Privacy
Related: Concerns about uploading proprietary source code to cloud providers like Anthropic or OpenAI. Users discuss the trade-offs between using superior cloud models (Claude Code) versus privacy-preserving local models (OpenCode) or self-hosted solutions, and the difficulty of trusting AI companies with sensitive intellectual property.
0. Does not fit well in any category
</topics>
<comments_to_classify>
[
{
"id": "46910319",
"text": "I don’t know the author, and am suspicious of the amount of astroturfing that has gone on with AI. This article seems reasonable so I looked for a disclaimed and found it oddly worded, hence the request for clarification."
}
,
{
"id": "46906488",
"text": "Refreshing to read a balanced opinion, from a person who has significant experience and grounding in the real world."
}
,
{
"id": "46906014",
"text": "> I'm not [yet?] running multiple agents, and currently don't really want to\n\nThis is the main reason to use AI agents, though: multitasking. If I'm working on some Terraform changes and I fire off an agent loop, I know it's going to take a while for it to produce something working. In the meantime I'm waiting for it to come back and pretend it's finished (really I'll have to fix it), so I start another agent on something else. I flip back and forth between the finished runs as they notify me. At the end of the day I have 5 things finished rather than two.\n\nThe \"agent\" doesn't have to be anything special either. Anything you can run in a VM or container (vscode w/copilot chat, any cli tool, etc) so you can enable YOLO mode."
}
,
{
"id": "46905945",
"text": "I find it interesting that this thread is full of pragmatic posts that seem to honestly reflect the real limits of current Gen-Ai.\n\nVersus other threads (here on HN, and especially on places like LinkedIn) where it's \"I set up a pipeline and some agents and now I type two sentences and amazing technology comes out in 5 minutes that would have taken 3 devs 6 months to do\"."
}
,
{
"id": "46913315",
"text": "I never see those type of posts. Maybe I'm immune and ignoring them."
}
,
{
"id": "46905853",
"text": "There are so many stories about how people use agentic AI but they rarely post how much they spend. Before I can even consider it, I need to know how it will cost me per month. I'm currently using one pro subscription and it's already quite expensive for me. What are people doing, burning hundreds of dollars per month? Do they also evaluate how much value they get out of it?"
}
,
{
"id": "46905892",
"text": "Low hundreds ($190 for me) but yes."
}
,
{
"id": "46906213",
"text": "I quickly run out of the JetBrains AI 35 monthly credits for $300/yr and spending an additional $5-10/day on top of that, mostly for Claude.\n\nI just recently added in Codex, since it comes with my $20/mo subscription to GPT and that's lowering my Claude credit usage significantly... until I hit those limits at some point.\n\n20 12 + 300 + 5 ~200... so about $1500-$1600/year.\n\nIt is 100% worth it for what I'm building right now, but my fear is that I'll take a break from coding and then I'm paying for something I'm not using with the subscriptions.\n\nI'd prefer to move to a model where I'm paying for compute time as I use it, instead of worrying about tokens/credits."
}
,
{
"id": "46909714",
"text": "Not using Hot Aisle for inference?"
}
,
{
"id": "46909747",
"text": "We're literally full. Just a few 1x GPUs available right now.\n\nSo far, I haven't been happy with any of the smaller coding models, they just don't compare to claude/codex."
}
,
{
"id": "46911526",
"text": "> If an agent isn't running, I ask myself \"is there something an agent could be doing for me right now?\"\n\nSolution-looking-for-a-problem mentality is a curse."
}
,
{
"id": "46913743",
"text": "The Death of the \"Stare\": Why AI’s \"Confident Stupidity\" is a Threat to Human Genius\n\nOPINION | THE REALITY CHECK\nIn the gleaming offices of Silicon Valley and the boardrooms of the Fortune 500, a new religion has taken hold. Its deity is the Large Language Model, and its disciples—the AI Evangelists—speak in a dialect of \"disruption,\" \"optimization,\" and \"seamless integration.\" But outside the vacuum of the digital world, a dangerous friction is building between AI’s statistical hallucinations and the unyielding laws of physics.\n\nThe danger of Artificial Intelligence isn't that it will become our overlord; the danger is that it is fundamentally, confidently, and authoritatively stupid.\n\nThe Paradox of the Wind-Powered Car\nThe divide between AI hype and reality is best illustrated by a recent technical \"solution\" suggested by a popular AI model: an electric vehicle equipped with wind generators on the front to recharge the battery while driving. To the AI, this was a brilliant synergy. It even claimed the added weight and wind resistance amounted to \"zero.\"\n\nTo any human who has ever held a wrench or understood the First Law of Thermodynamics, this is a joke—a perpetual motion fallacy that ignores the reality of drag and energy loss. But to the AI, it was just a series of words that sounded \"correct\" based on patterns. The machine doesn't know what wind is; it only knows how to predict the next syllable.\n\nThe Erosion of the \"Human Spark\"\nThe true threat lies in what we are sacrificing to adopt this \"shortcut\" culture. There is a specific human process—call it The Stare. It is that thirty-minute window where a person looks at a broken machine, a flawed blueprint, or a complex problem and simply observes.\n\nIn that half-hour, the human brain runs millions of mental simulations. It feels the tension of the metal, the heat of the circuit, and the logic of the physical universe. It is a \"Black Box\" of consciousness that develops solutions from absolutely nothing—no forums, no books, and no Google.\n\nHowever, the new generation of AI-dependent thinkers views this \"Stare\" as an inefficiency. By outsourcing our thinking to models that cannot feel the consequences of being wrong, we are witnessing a form of evolutionary regression. We are trading hard-earned competence for a \"Yes-Man\" in a box.\n\nThe Gaslighting of the Realist\nPerhaps most chilling is the social cost. Those who still rely on their intuition and physical experience are increasingly being marginalized. In a world where the screen is king, the person pointing out that \"the Emperor has no clothes\" is labeled as erratic, uneducated, or naive.\n\nWhen a master craftsman or a practical thinker challenges an AI’s \"hallucination,\" they aren't met with logic; they are met with a robotic refusal to acknowledge reality. The \"AI Evangelists\" have begun to walk, talk, and act like the models they worship—confidently wrong, devoid of nuance, and completely detached from the ground beneath their feet.\n\nThe High Cost of Being \"Authoritatively Wrong\"\nWe are building a world on a foundation of digital sand. If we continue to trust AI to design our structures and manage our logic, we will eventually hit a wall that no \"prompt\" can fix.\n\nThe human brain runs on 20 watts and can solve a problem by looking at it. The AI runs on megawatts and can’t understand why a wind-powered car won't run forever. 
If we lose the ability to tell the difference, we aren't just losing our jobs—we're losing our grip on reality itself."
}
,
{
"id": "46905846",
"text": "> babysitting my kind of stupid and yet mysteriously productive robot friend\n\nLOL, been there, done that. It is much less frustrating and demoralizing than babysitting your kind of stupid colleague though. (Thankfully, I don't have any of those anymore. But at previous big companies? Oh man, if only their commits were ONLY as bad as a bad AI commit.)"
}
,
{
"id": "46904796",
"text": "For the AI skeptics reading this, there is an overwhelming probability that Mitchell is a better developer than you. If he gets value out of these tools you should think about why you can't."
}
,
{
"id": "46905720",
"text": "The AI skeptics instead stick to hard data, which so far shows a 19% reduction in productivity when using AI."
}
,
{
"id": "46906108",
"text": "https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...\n\n> 1) We do NOT provide evidence that AI systems do not currently speed up many or most software developers. Clarification: We do not claim that our developers or repositories represent a majority or plurality of software development work.\n\n> 2) We do NOT provide evidence that AI systems do not speed up individuals or groups in domains other than software development. Clarification: We only study software development.\n\n> 3) We do NOT provide evidence that AI systems in the near future will not speed up developers in our exact setting. Clarification: Progress is difficult to predict, and there has been substantial AI progress over the past five years [3].\n\n> 4) We do NOT provide evidence that there are not ways of using existing AI systems more effectively to achieve positive speedup in our exact setting. Clarification: Cursor does not sample many tokens from LLMs, it may not use optimal prompting/scaffolding, and domain/repository-specific training/finetuning/few-shot learning could yield positive speedup."
}
,
{
"id": "46910270",
"text": "Points 2 and 3 are irrelevant.\n\nPoint 1 is saying results may not generalise, which is not a counter claim. It’s just saying “we cannot speak for everyone”.\n\nPoint 4 is saying there may be other techniques that work better, which again is not a counter claim. It’s just saying “you may find bette methods.”\n\nThose are standard scientific statements giving scope to the research. They are in no way contradicting their findings. To contradict their findings, you would need similarly rigorous work that perhaps fell into those scenarios.\n\nNot pushing an opinion here, but if we’re talking about research then we should be rigorous and rationale by posting counter evidence. Anyone who has done serious research in software engineering knows the difficulties involved and that this study represents one set of data. But it is at least a rigorous set and not anecdata or marketing.\n\nI for one would love a rigorous study that showed a reliable methodology for gaining generalised productivity gains with the same or better code quality."
}
,
{
"id": "46906680",
"text": "There is no such hard data. It's just research done on 16 developers using Cursor and Sonnet 3.5 ."
}
,
{
"id": "46905951",
"text": "Perhaps that's the reason. Maybe I'm just not a good enough developer. But that's still not actionable. It's not like I never considered being a better developer."
}
,
{
"id": "46905341",
"text": "I'm not as good as Fabrice Bellard either but I don't let that bother me as I go about my day."
}
,
{
"id": "46909755",
"text": "The value Mitchell describes aligns well with the lack of value I'm getting. He feels that guiding an agent through a task is neither faster nor slower than doing it himself, and there's some tasks he doesn't even try to do with an agent because he knows it won't work, but it's easier to parallelize reviewing agentic work than it is to parallelize direct coding work. That's just not a usage pattern that's valuable to me personally - I rarely find myself in a situation where I have large number of well-scoped programming tasks I need to complete, and it's a fun treat to do myself when I do."
}
,
{
"id": "46904875",
"text": "Don't get it. What's the relation between Mitchell being a \"better\" developer than most of us (and better is always relative, but that's another story) and getting value out of AI? That's like saying Bezos is a way better businessman than you, so you should really hear his tips about becoming a billionaire. No sense (because what works for him probably doesn't work for you)\n\nTons of respect for Mitchell. I think you are doing him a disservice with these kinds of comments."
}
,
{
"id": "46905127",
"text": "Maybe you disagree with it, but it seems like a pretty straightforward argument: A lot of us dismiss AI because \"it can't be trusted to do as good a job as me\". The OP is arguing that someone, who can do better than most of us, disagrees with this line of thinking. And if we have respect for his abilities, and recognize them as better than our own, we should perhaps re-assess our own rationale in dismissing the utility of AI assistance. If he can get value out of it, surely we can too if we don't argue ourselves out of giving it a fair shake. The flip side of that argument might be that you have to be a much better programmer than most of us are, to properly extract value out of the AI... maybe it's only useful in the hands of a real expert."
}
,
{
"id": "46909541",
"text": "No, it doesn't work that way. I don't know if Mitchell is a better programmer than me, but let's say he is for the sake of argument. That doesn't make him a god to whom I must listen. He's just a guy, and he can be wrong about things. I'm glad he's apparently finding value here, but the cold hard reality is that I have tried the tools and they don't provide value to me. And between another practicioner's opinion and my own, I value my own more."
}
,
{
"id": "46905207",
"text": ">A lot of us dismiss AI because \"it can't be trusted to do as good a job as me\"\n\nSome of us enjoy learning how systems work, and derive satisfaction from the feeling of doing something hard, and feel that AI removes that satisfaction. If I wanted to have something else write the code, I would focus on becoming a product manager, or a technical lead. But as is, this is a craft, and I very much enjoy the autonomy that comes with being able to use this skill and grow it."
}
,
{
"id": "46905304",
"text": "There is no dichotomy of craft and AI.\n\nI consider myself a craftsman as well. AI gives me the ability to focus on the parts I both enjoy working on and that demand the most craftsmanship. A lot of what I use AI for and show in the blog isn’t coding at all, but a way to allow me to spend more time coding.\n\nThis reads like you maybe didn’t read the blog post, so I’ll mention there many examples there."
}
,
{
"id": "46905312",
"text": "I enjoy Japanese joinery, but for some reason the housing market doesn't."
}
,
{
"id": "46905300",
"text": "Nobody is trying to talk anyone out of their hobby or artisanal creativeness. A lot of people enjoy walking, even after the invention of the automobile. There's nothing wrong with that, there are even times when it's the much more efficient choice. But in the context of say transporting packages across the country... it's not really relevant how much you enjoy one or the other; only one of them can get the job done in a reasonable amount of time. And we can assume that's the context and spirit of the OP's argument."
}
,
{
"id": "46905542",
"text": ">Nobody is trying to talk anyone out of their hobby or artisanal creativeness.\n\nWell, yes, they are, some folks don't think \"here's how I use AI\" and \"I'm a craftsman!\" are consistent. Seems like maybe OP should consider whether \"AI is a tool, why can't you use it right\" isn't begging the question.\n\nIs this going to be the new rhetorical trick, to say \"oh hey surely we can all agree I have reasonable goals! And to the extent they're reasonable you are unreasonable for not adopting them\"?"
}
,
{
"id": "46905497",
"text": ">But in the context of say transporting packages across the country... it's not really relevant how much you enjoy one or the other; only one of them can get the job done in a reasonable amount of time.\n\nI think one of the more frustrating aspects of this whole debate is this idea that software development pre-AI was too \"slow\", despite the fact that no other kind of engineering has nearly the same turn around time as software engineering does (nor does they have the same return on investment!).\n\nI just end up rolling my eyes when people use this argument. To me it feels like favoring productivity over everything else."
}
,
{
"id": "46905437",
"text": "\"Why can't you be more like your brother Mitchell?\""
}
,
{
"id": "46910157",
"text": "I mean, not to say he's not, but by what metric?\n\nIf by company success, then Zuckerberg and Musk are better than all of us.\n\nIf by millions made, as he likes to joke/brag about... Fabrice Bellard is an utter failure.\n\nIf by install base, the geniuses that made MS Teams are among the best.\n\nNone of this is to take away from the successes of the man, but this kind of statement is rather silly."
}
,
{
"id": "46905979",
"text": "> a period of inefficiency\n\nI think this is something people ignore, and is significant. The only way to get good at coding with LLMs is actually trying to do it. Even if it's inefficient or slower at first. It's just another skill to develop [0].\n\nAnd it's not really about using all the plugins and features available. In fact, many plugins and features are counter-productive. Just learn how to prompt and steer the LLM better.\n\n[0]: https://ricardoanderegg.com/posts/getting-better-coding-llms..."
}
]
</comments_to_classify>
Based on the comments above, assign each to up to 3 relevant topics.
Return ONLY a JSON array with this exact structure (no other text):
[
{
"id": "comment_id_1",
"topics": [
1,
3,
5
]
}
,
{
"id": "comment_id_2",
"topics": [
2
]
}
,
{
"id": "comment_id_3",
"topics": [
0
]
}
,
...
]
Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices for matches
- Use index 0 if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment
Remember: Output ONLY the JSON array, no other text.
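
For a downstream consumer of this batch file, the reply shape described above can be checked mechanically. The following is a minimal validation sketch, assuming the model returns a bare JSON array as instructed; the names used (validate_reply, NUM_TOPICS) are illustrative, not part of any real API.

import json

NUM_TOPICS = 15  # topics are indexed 1-15; 0 means "does not fit"

def validate_reply(reply_text: str) -> list:
    """Parse a model reply and enforce the rules stated in the prompt."""
    entries = json.loads(reply_text)  # must parse as a bare JSON array
    if not isinstance(entries, list):
        raise ValueError("reply must be a JSON array")
    for entry in entries:
        topics = entry["topics"]
        if len(topics) > 3:  # rule: each comment gets 0 to 3 topics
            raise ValueError(f"too many topics for id {entry['id']}")
        if any(not 0 <= t <= NUM_TOPICS for t in topics):  # rule: indices 0-15
            raise ValueError(f"topic index out of range for id {entry['id']}")
    return entries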