Summarizer

LLM Input

llm/8632d754-c7a3-4ec2-977a-2733719992fa/batch-2-2e70fc90-b9b4-4016-a4d7-4554a2c5524e-input.json

prompt

The following is content for you to classify. Do not respond to the comments—classify them.

<topics>
1. Determinism vs. Probabilistic Output
   Related: Comparisons between compilers (deterministic, reliable) and LLMs (probabilistic, 'fuzzy'). Users debate whether 100% correctness is required for tools, with some arguing that LLMs are fundamentally different from traditional automation because they lack a 'ground truth' logic, while others argue that error rates are acceptable if the utility is high enough.
2. The Code Review Bottleneck
   Related: Concerns that generating code faster merely shifts the bottleneck to reviewing code, which is often harder and more time-consuming than writing it. Users discuss the cognitive load of verifying 'vibe code' and the risks of blindly trusting output that looks correct but contains subtle bugs or security flaws.
3. Erosion of Programming Skills
   Related: Fears that relying on AI causes developers to lose fundamental skills ('use it or lose it'), such as forgetting syntax for frameworks like RSpec. Users discuss the value of the 'Stare'—deep mental simulation of problems—and whether outsourcing thinking to machines degrades human expertise and the ability to solve novel problems without assistance.
4. Financial Barriers and Costs
   Related: Discussions about the high cost of running continuous agents (potentially hundreds of dollars a month), with some noting that the author's wealth (as a billionaire/founder) biases his perspective on affordability. Users question whether the productivity gains justify the expense for average developers or if this creates a divide based on access to compute.
5. Agentic Workflows and Harnessing
   Related: Technical strategies for controlling AI behavior, such as 'harness engineering,' using AGENTS.md files to document rules and prevent regressions, and setting up feedback loops where agents run tests to verify their own work. This includes moving beyond simple chatbots to autonomous background processes that triage issues or perform research.
6. Safety and Sandboxing
   Related: Practical concerns about giving AI agents shell access or file system permissions. Users discuss the risks of agents accidentally 'nuking' systems, installing unwanted dependencies, or running dangerous commands, and recommend solutions like running agents in containers, VMs, or using specific sandboxing tools like Leash to limit blast radius.
7. Environmental Impact
   Related: Reactions to the author's suggestion to 'always have an agent running,' with users expressing alarm at the potential energy consumption and environmental cost of millions of developers running constant background inference tasks for marginal productivity gains, described by some as 'cooking the planet.'
8. Architects vs. Builders Analogy
   Related: Extensive debate using construction analogies to describe the shift in the developer's role. Comparisons are made between architects (who design and delegate) and builders, with arguments about whether AI users are 'vibe architects' who don't understand the materials, or professional engineers utilizing modern equivalents of CAD software and heavy machinery.
9. AI as Junior Developers
   Related: The characterization of AI agents as an infinite supply of 'slightly drunken new college grads' or interns who are fast and cheap but require constant supervision. Users discuss the ratio of senior engineer time needed to review AI output and the lack of a path for these 'AI juniors' to ever become seniors.
10. Trust and Hallucination Risks
   Related: Skepticism regarding the reliability of AI, highlighted by examples like 'wind-powered cars' or bad recipes. Users argue that because LLMs predict tokens rather than understanding physics or logic, they are 'confidently stupid' and require expert humans to filter out hallucinations, making them dangerous for those lacking deep domain knowledge.
11. Productivity vs. Inefficiency
   Related: Debates over whether AI actually saves time or just feels productive. Some cite studies suggesting productivity drops (e.g., 19%), while others argue that the efficiency comes from parallelizing tasks or handling boilerplate. Users critique the lack of hard metrics in the article and the reliance on 'feeling' more efficient.
12. Corporate Process vs. Individual Flow
   Related: The distinction between individual productivity gains (solopreneurs, solo projects) and organizational reality. Users note that while AI speeds up coding, it doesn't solve organizational bottlenecks like meetings, cross-team coordination, or gathering requirements, limiting its revolutionary impact on large enterprises compared to solo work.
13. Spec Writing as the New Coding
   Related: The idea that working with agents shifts the primary task from writing syntax to writing detailed specifications and prompts. Users note that AI forces developers to be more explicit about requirements, effectively turning English specs into the source code, though some argue this is just a verbose and nondeterministic programming language.
14. Hype Cycles and Model Churn
   Related: Frustration with the rapid pace of change in the AI landscape ('honeymoon phase'). Users complain about building workflows around a specific model only for it to change or degrade ('drift') in the next update, leading to a constant need to relearn prompt engineering and tooling idiosyncrasies.
15. Local Models vs. Cloud Privacy
   Related: Concerns about uploading proprietary source code to cloud providers like Anthropic or OpenAI. Users discuss the trade-offs between using superior cloud models (Claude Code) versus privacy-preserving local models (OpenCode) or self-hosted solutions, and the difficulty of trusting AI companies with sensitive intellectual property.
0. Does not fit well in any category
</topics>

<comments_to_classify>
[
  
{
  "id": "46908384",
  "text": "I think this is the crux of why, when used as an enhancement to solo productivity, you'll have a pretty strict upper bound on productivity gains given that it takes experienced engineers to review code that goes out at scale.\n\nThat being said, software quality seems to be decreasing, or maybe it's just cause I use a lot of software in a somewhat locked down state with adblockers and the rest.\n\nAlthough, that wouldn't explain just how badly they've murdered the once lovely iTunes (now Apple Music) user interface. (And why does CMD-C not pick up anything 15% of the time I use it lately...)\n\nAnyways, digressions aside... the complexity in software development is generally in the organizational side. You have actual users, and then you have people who talk to those users and try to see what they like and don't like in order to distill that into product requirements which then have to be architected, and coordinated (both huge time sinks) across several teams.\n\nEven if you cut out 100% of the development time, you'd still be left with 80% of the timeline.\n\nOver time though... you'll probably see people doing what I do all day (which is move around among many repositories (although I've yet to use the AI much, got my Cursor license recently and am gonna spin up some POCs that I want to see soon)), enabled by their use of AI to quickly grasp what's happening in the repo, and the appropriate places to make changes.\n\nEnabling developers to complete features from tip to tail across deep, many pronged service architectures could bring project time down drastically and bring project management, and cross team coordination costs down tremendously.\n\nSimilarly, in big companies, the hand is often barely aware at best of the foot. And space exploration is a serious challenge. Often folk know exactly one step away, and rely on well established async communication channels which also only know one step further. Principal engineers seem to know large amounts about finite spaces and are often in the dark small hops away to things like the internal tooling for the systems they're maintaining (and often not particularly great at coming in to new spaces and thinking with the same perspective... no we don't need individual micro services for every 12 request a month admin api group we want to set up).\n\nOnce systems can take a feature proposal and lay out concrete plans which each little kingdom can give a thumbs up or thumbs down to for further modifications, you can again reduce exploration, coordination, and architecture time down.\n\nSadly, seems like User Experience design is an often terribly neglected part of our profession. I love the memes about an engineer building the perfect interface like a water pitcher only for the person to position it weirdly in order to get a pour out of the fill hole or something. Lemme guess how many users you actually talked to (often zero), and how many layers of distillation occurred before you received a micro picture feature request that ends up being built and taking input from engineers with no macro understanding of a user's actual needs, or day to day.\n\nAnd who often are much more interested in perfecting some little algorithm than thinking about enabling others.\n\nSo my money is on money flowing to...\n- People who can actually verify system integrity, and can fight fires and bugs (but a lot of bug fixing will eventually become prompting?)\n- Multi-talented individuals who can say... interact with users well enough to understand their needs as well as do a decent job verifying system architecture and security\n\nIt's outside of coding where I haven't seen much... I guess people use it to more quickly scaffold up expense reports, or generate mocks. So, lots of white collar stuff. But... it's not like the experience of shopping at the supermarket has changed, or going to the movies, or much of anything else."
}
,
  
{
  "id": "46911071",
  "text": "I will give Claude Code a trial run if I can run it locally without an internet connection. AI companies have procured so much training data through illegal means you have to be insane to trust them in even the smallest amount."
}
,
  
{
  "id": "46911845",
  "text": "You can run OpenCode in a container restricted to local network only and communicating with local/self-hosted models.\n\nClaude Code is linked to Anthropic's hosted models so you can't achieve this."
}
,
  
{
  "id": "46909134",
  "text": "Should AI tools use memory safe tabs or spaces for indentation? :)\n\nIt is a shame it's become such a polarized topic. Things which actually work fine get immediately bashed by large crowds at the same time things that are really not there get voted to the moon by extremely eager folks. A few years from now I expect I'll be thinking \"man, there was some really good stuff I missed out on because the discussions about it were so polarized at the time. I'm glad that has cleared up significantly!\""
}
,
  
{
  "id": "46906573",
  "text": "Your sentiment resonates with me a lot. I wonder what we’ll consider the inflection point 10 years from now. It seemed like the zeitgeist was screaming about scaling limits and running out of training data, then we got Claude Code, Sonnet 4.5, then Opus 4.5 and no one's looked back since."
}
,
  
{
  "id": "46907277",
  "text": "I wonder too. It might be that progress on the underlying models is going to plateau, or it might be that we haven't yet reached what in retrospect will be the biggest inflection point. Technological developments can seem to make sense in hindsight as a story of continuous progress when the dust has settled and we can write and tell the history, but when you go back and look at the full range of voices in the historical sources you realize just how deeply nothing was clear to anyone at all at the time it was happening because everyone was hurtling into the unknown future with a fog of war in front of them. In 1910 I'd say it would have been perfectly reasonable to predict airplanes would remain a terrifying curiosity reserved for daredevils only (and people did); or conversely, in the 1960s a lot of commentators thought that the future of passenger air travel in the 70s and 80s would be supersonic jets. I keep this in mind and don't really pay too much attention to over-confident predictions about the technological future."
}
,
  
{
  "id": "46907801",
  "text": "GPT-4 showed the potential but the automated workflows (context management, loops, test-running) and pure execution speed to handle all that \"reasoning\"/workflows (remember watching characters pop in slowly in GPT-4 streaming API response calls) are gamechangers.\n\nThe workflow automation and better (and model-directed) context management are all obvious in retrospect but a lot of people (like myself) were instead focused on IDE integration and such vs `grep` and the like. Maybe multi-agent with task boards is the next thing, but it feels like that might also start to outrun the ability to sensibly design and test new features for non-greenfield/non-port projects. Who knows yet.\n\nI think it's still very valuable for someone to dig in to the underlying models periodically (insomuch as the APIs even expose the same level of raw stuff anymore) to get a feeling for what's reliable to one-shot vs what's easily correctable by a \"ran the tests, saw it was wrong, fixed it\" loop. If you don't have a good sense of that, it's easy to get overambitious and end up with something you don't like if you're the sort of person who cares at all about what the code looks like."
}
,
  
{
  "id": "46910803",
  "text": "let me ask a stupid/still-ignorant question - about repeatability.\n\nIf one asks this generator/assistant same request/thing, within same initial contexts, 10 times, would it generate same result ? in different sessions and all that.\n\nbecause.. if not, then it's for once-off things only.."
}
,
  
{
  "id": "46910820",
  "text": "If I asked you for the same thing 10 times, wiping your memory each time, would you generate the same result?\n\nAnd why does it matter anyway? If the code passes the tests and you like the look of it, it's good. It doesn't need to be existentially complicated."
}
,
  
{
  "id": "46911307",
  "text": "A pretty bad comparison. If I gave you the correct answer once, it's unlikely that I'll give you a wrong answer the next time. Also, aren't computers supposed to be more reliable than us? If I'm going to use a tool that behaves just like humans, why not just use my brain instead?"
}
,
  
{
  "id": "46909469",
  "text": "Isn’t there something off about calling predictions about the future, that aren’t possible with current tech, hype? Like people predicted AI agents would be this huge change, they were called hype since earlier models were so unreliable, and now they are mostly right as ai agents work like a mid level engineer. And clearly super human in some areas."
}
,
  
{
  "id": "46909620",
  "text": "> ai agents work like a mid level engineer\n\nThey do not.\n\n> And clearly super human in some areas.\n\nSure, if you think calculators or bicycles are \"superhuman technology\".\n\nLay off the hype pills."
}
,
  
{
  "id": "46912247",
  "text": ">Sure, if you think calculators or bicycles are \"superhuman technology\".\n\nUh, yes they are? That's why they were revolutionary technologies!\n\nIt's hard to see why a bike that isn't superhuman would even make sense? Being superhuman in at least some aspect really seems like the bare minimum for a technology to be worth adopting."
}
,
  
{
  "id": "46907094",
  "text": "Is there any reason to use Claude Code specifically over Codex or Gemini? I’ve found both Codex and Gemini similar in results, but I never tried Claude because I keep hearing usage runs out so fast on pro plans and there’s no free trial for the CLI."
}
,
  
{
  "id": "46907149",
  "text": "I mostly mentioned Claude Code because it's what Mitchell first tried according to his post, and it's what I personally use. From what I hear Codex is pretty comparable; it has a lot of fans. There are definitely some differences and strengths and weaknesses of both the CLIs and the underlying LLMs that others who use more than one tool might want to weigh in on, but they're all fairly comparable. (Although, we'll see how the new models released from Anthropic and OpenAI today stack up.) Codex and Gemini CLI are basically Claude Code clones with different LLMs behind them, after all."
}
,
  
{
  "id": "46907819",
  "text": "IME Gemini is pretty slow in comparison to Claude - but hey, it's super cheap at least.\n\nBut that speed makes a pretty significant difference in experience.\n\nIf you wait a couple minutes and then give the model a bunch of feedback about what you want done differently, and then have to wait again, it gets annoying fast.\n\nIf the feedback loop is much tighter things feel much more engaging. Cursor is also good at this (investigate and plan using slower/pricier models, implement using fast+cheap ones)."
}
,
  
{
  "id": "46911762",
  "text": "> It's a shame that AI coding tools have become such a polarizing issue among developers.\n\nFrankly I'm so tired of the usual \"I don't find myself more productive\", \"It writes soup\". Especially when some of the best software developers (and engineers) find much utility in those tools, there should be some doubt growing in that crowd.\n\nI have come to the conclusion that software developers, those only focusing on the craft of writing code, are the naysayers.\n\nSoftware engineers immediately recognize the many automation/exploration/etc boosts, recognize the tools' limits and work on improving them.\n\nHell, AI is an insane boost to productivity, even if you don't have it write a single line of code ever.\n\nBut people that focus on the craft (the kind of crowd that doesn't even process the concept of throwaway code or budgets or money) will keep lying in their \"I don't see the benefits because X\" forever, nonsensically confusing any tool use with vibe coding.\n\nI'm also convinced that since this crowd never had any notion of what engineering is (there is very little of it in our industry sadly, technology and code is the focus and rarely the business, budget and problems to solve) and confused it with architectural, technological or best practices, they are genuinely insecure about their jobs because once their very valued craft and skills are diminished they pay the price of never having invested in understanding the business, the domain, processes or soft skills."
}
,
  
{
  "id": "46913986",
  "text": "I've spent 2+ decades producing software across a number of domains and orgs and can fully agree that _disciplined use_ of LLM systems can significantly boost productivity, but the rules and guidance around their use within our industry writ large are still in flux and causing as many problems as they're solving today.\n\nAs the most senior IC within my org, since the advent of (enforced) LLM adoption my code contribution/output has stalled as my focus has shifted to the reactionary work of sifting through the AI generated chaff following post mortems of projects that should never have shipped in the first place. On a good day I end up rejecting several PRs that most certainly would have taken down our critical systems in production due to poor vetting and architectural flaws, and on the worst I'm in full on fire fighting mode to \"fix\" the same issues already taking down production (already too late.)\n\nThese are not inherent technical problems in LLMs, these are organizational/process problems induced by AI pushers promising 10x output without the necessary 10x requirements gathering and validation efforts that come with that. \"Everyone with GenAI access is now a 10x SDE\" is the expectation, when the reality is much more nuanced.\n\nThe result I see today is massive incoming changesets that no one can properly vet given the new shortened delivery timelines and reduced human resourcing given to projects. We get test suite coverage inflation where \"all tests pass\" but undermine core business requirements, and no one is being given the time or resources to properly confirm the business requirements are actually being met. Shit hits the fan, repeat ad nauseam.\n\nThe focus within our industry needs to shift to education on the proper application and use of these tools, or we'll inevitably crash into the next AI winter; an increasingly likely future that would have been totally avoidable if everyone drinking the Koolaid stopped to observe what is actually happening.\n\nAs you implied, code is cheap and most code is \"throwaway\" given even modest time horizons, but all new code comes with hidden costs not readily apparent to all the stakeholders attempting to create a new normal with GenAI. As you correctly point out, the biggest problems within our industry aren't strictly technical ones, they're interpersonal, communication and domain expertise problems, and AI use is simply exacerbating those issues. Maybe all the orgs \"doing it wrong\" (of which there are MANY) simply fail and the ones with actual engineering discipline \"make it,\" but it'll be a reckoning we should not wish for.\n\nI have heard from a number of different industry players and they see the same patterns. Just look at the average LinkedIn post about AI adoption to confirm. Maybe you observe different patterns and the issues aren't as systemic as I fear. I honestly hope so.\n\nYour implication that seniors like myself are \"insecure about our jobs\" is somewhat ironically correct, but not for the reasons you think."
}
,
  
{
  "id": "46906539",
  "text": "but annoying hype is exactly the issue with AI in my eyes. I get it's a useful tool in moderation and all, but I also experience that management values speed and quantity of delivery above all else, and hype-driven as they are I fear they will run this industry to the ground and we as users and customers will have to deal with the world where software is permanently broken as a giant pile of unmaintainable vibe code and no experienced junior developers to boot."
}
,
  
{
  "id": "46908625",
  "text": ">management values speed and quantity of delivery above all else\n\nI don't know about you but this has been the case for my entire career.\nMgmt never gave a shit about beautiful code or tech debt or maintainability or how enlightened I felt writing code."
}
,
  
{
  "id": "46908141",
  "text": "I think for a lot of people the turn off is the constant churn and the hype cycle. For a lot of people, they just want to get things done and not have to constantly keep on top of what's new or SOTA. Are we still using MCPs or are we using Skills now? Not long ago you had to know MCP or you'd be left behind and you definitely need to know MCP UI or you'll be left behind. I think. It just becomes really tiring, especially with all the FUD.\n\nI'm embracing LLMs but I think I've had to just pick a happy medium and stick with Claude Code with MCPs until somebody figures out a legitimate way to use the Claude subscription with open source tools like OpenCode, then I'll move over to that. Or if a company provides a model that's as good value that can be used with OpenCode."
}
,
  
{
  "id": "46909604",
  "text": "It reminds me a lot of 3D Printing tbh. Watching all these cool DIY 3d printing kits evolve over years, I remember a few times I'd checked on costs to build a DIY one. They kept coming down, and down, and then around the same time as \"Build a 3d printer for $200 (some assembly required)!\" The Bambu X1C was announced/released, for a bit over a grand iirc? And its whole selling point was that it was fast and worked, out of the box. And so I bought one and made a bunch of random one-off-things that solved _my_ specific problem, the way I wanted it solved. Mostly in the form of very specific adapter plates that I could quickly iterate on and random house 'wouldn't it be nice if' things.\n\nThat's kind of where AI-agent-coding is now too, though... software is more flexible."
}
,
  
{
  "id": "46911855",
  "text": "> Or if a company provides a model that's as good value that can be used with OpenCode.\n\nOpenAI's Codex?"
}
,
  
{
  "id": "46909024",
  "text": "> For a lot of people, they just want to get things done and not have to constantly keep on top of what's new or SOTA\n\nThat hasn’t been tech for a long time.\n\nFrontend has been changing forever. React and friends have new releases all the time. Node has new package managers and even Deno and Bun. AWS keeps changing things."
}
,
  
{
  "id": "46909511",
  "text": "You really shouldn't use the absolute hellscape of churn that is web dev as an example of broader industry trends. No other sub-field of tech is foolish enough to chase hype and new tools the way web dev is."
}
,
  
{
  "id": "46910285",
  "text": "I think the web/system dichotomy is also a major conflating factor for LLM discussions.\n\nA “few hundred lines of code” in Rust or Haskell can be bumping into multiple issues LLM assisted coding struggles with. Moving a few buttons on a website with animations and stuff through multiple front end frameworks may reasonably generate 5-10x that much “code”, but of an entirely different calibre.\n\n3,000 lines a day of well-formatted HTML template edits, paired with a reloadable website for rapid validation, is super digestible, while 300 lines of code per day into curl could be seen as reckless."
}
,
  
{
  "id": "46909112",
  "text": "There's a point at which these things become Good Enough though, and don't bottleneck your capacity to get things done.\n\nTo your point, React, while it has new updates, hasn't changed the fundamentals since 16.8.0 (introduction of hooks) and that was 7 years ago. Yes there are new hooks, but they typically build on older concepts. AWS hasn't deprecated any of our existing services at work (besides maybe a MySQL version becoming EOL) in the last 4 years that I've worked at my current company.\n\nWhile I prefer pnpm (to not take up my MacBook's inadequate SSD space), you can still use npm and get things done.\n\nI don't need to keep obsessing over whether Codex or Claude have a 1 point lead in a gamed benchmark test so long as I'm still able to ship features without a lot of churn."
}
,
  
{
  "id": "46913759",
  "text": "The Death of the \"Stare\": Why AI’s \"Confident Stupidity\" is a Threat to Human Genius\n\nOPINION | THE REALITY CHECK\nIn the gleaming offices of Silicon Valley and the boardrooms of the Fortune 500, a new religion has taken hold. Its deity is the Large Language Model, and its disciples—the AI Evangelists—speak in a dialect of \"disruption,\" \"optimization,\" and \"seamless integration.\" But outside the vacuum of the digital world, a dangerous friction is building between AI’s statistical hallucinations and the unyielding laws of physics.\n\nThe danger of Artificial Intelligence isn't that it will become our overlord; the danger is that it is fundamentally, confidently, and authoritatively stupid.\n\nThe Paradox of the Wind-Powered Car\nThe divide between AI hype and reality is best illustrated by a recent technical \"solution\" suggested by a popular AI model: an electric vehicle equipped with wind generators on the front to recharge the battery while driving. To the AI, this was a brilliant synergy. It even claimed the added weight and wind resistance amounted to \"zero.\"\n\nTo any human who has ever held a wrench or understood the First Law of Thermodynamics, this is a joke—a perpetual motion fallacy that ignores the reality of drag and energy loss. But to the AI, it was just a series of words that sounded \"correct\" based on patterns. The machine doesn't know what wind is; it only knows how to predict the next syllable.\n\nThe Erosion of the \"Human Spark\"\nThe true threat lies in what we are sacrificing to adopt this \"shortcut\" culture. There is a specific human process—call it The Stare. It is that thirty-minute window where a person looks at a broken machine, a flawed blueprint, or a complex problem and simply observes.\n\nIn that half-hour, the human brain runs millions of mental simulations. It feels the tension of the metal, the heat of the circuit, and the logic of the physical universe. It is a \"Black Box\" of consciousness that develops solutions from absolutely nothing—no forums, no books, and no Google.\n\nHowever, the new generation of AI-dependent thinkers views this \"Stare\" as an inefficiency. By outsourcing our thinking to models that cannot feel the consequences of being wrong, we are witnessing a form of evolutionary regression. We are trading hard-earned competence for a \"Yes-Man\" in a box.\n\nThe Gaslighting of the Realist\nPerhaps most chilling is the social cost. Those who still rely on their intuition and physical experience are increasingly being marginalized. In a world where the screen is king, the person pointing out that \"the Emperor has no clothes\" is labeled as erratic, uneducated, or naive.\n\nWhen a master craftsman or a practical thinker challenges an AI’s \"hallucination,\" they aren't met with logic; they are met with a robotic refusal to acknowledge reality. The \"AI Evangelists\" have begun to walk, talk, and act like the models they worship—confidently wrong, devoid of nuance, and completely detached from the ground beneath their feet.\n\nThe High Cost of Being \"Authoritatively Wrong\"\nWe are building a world on a foundation of digital sand. If we continue to trust AI to design our structures and manage our logic, we will eventually hit a wall that no \"prompt\" can fix.\n\nThe human brain runs on 20 watts and can solve a problem by looking at it. The AI runs on megawatts and can’t understand why a wind-powered car won't run forever. If we lose the ability to tell the difference, we aren't just losing our jobs—we're losing our grip on reality itself."
}
,
  
{
  "id": "46904972",
  "text": "> Break down sessions into separate clear, actionable tasks. Don't try to \"draw the owl\" in one mega session.\n\nThis is the key one I think. At one extreme you can tell an agent \"write a for loop that iterates over the variable `numbers` and computes the sum\" and they'll do this successfully, but the scope is so small there's not much point in using an LLM. On the other extreme you can tell an agent \"make me an app that's Facebook for dogs\" and it'll make so many assumptions about the architecture, code and product that there's no chance it produces anything useful beyond a cool prototype to show mom and dad.\n\nA lot of successful LLM adoption for code is finding this sweet spot. Overly specific instructions don't make you feel productive, and with overly broad instructions you end up redoing too much of the work."
}
,
  
{
  "id": "46905076",
  "text": "This is actually an aspect of using AI tools I really enjoy: Forming an educated intuition about what the tool is good at, and tastefully framing and scoping the tasks I give it to get better results.\n\nIt cognitively feels very similar to other classic programming activities, like modularization at any level from architecture to code units/functions, thoughtfully choosing how to lay out and chunk things. It's always been one of the things that make programming pleasurable for me, and some of that feeling returns when slicing up tasks for agents."
}
,
  
{
  "id": "46908579",
  "text": "\"Become better at intuiting the behavior of this non-deterministic black box oracle maintained by a third party\" just isn't a strong professional development sell for me, personally. If the future of writing software is chasing what a model trainer has done with no ability to actually change that myself I don't think that's going to be interesting to nearly as many people."
}
,
  
{
  "id": "46909924",
  "text": "It sounds like you're talking more about \"vibe coding\" i.e. just using LLMs without inspecting the output. That's neither what the article nor the people to whom you're replying are saying. You can (and should) heavily review and edit LLM generated code. You have the full ability to change it yourself, because the code is just there and can be edited!"
}
,
  
{
  "id": "46910053",
  "text": "And yet the comments are chock full of cargo-culting about different moods of the oracle and ways to get better output."
}
,
  
{
  "id": "46912736",
  "text": "I think this is underrating the role of intuition in working effectively with deterministic but very complex software systems like operating systems and compilers. Determinism is a red herring."
}
,
  
{
  "id": "46909330",
  "text": "Whether it's interesting or not is irrelevant to whether it produces usable output that could be economically valuable."
}
,
  
{
  "id": "46909343",
  "text": "Yeah, still waiting for something to ship before I form a judgement on that"
}
,
  
{
  "id": "46910011",
  "text": "Claude Code is made with Anthropic's models and is very commercially successful."
}
,
  
{
  "id": "46910050",
  "text": "Something besides AI tooling. This isn't Amway."
}
,
  
{
  "id": "46912330",
  "text": "Since they started doing that it's gained a lot of bugs."
}
,
  
{
  "id": "46905785",
  "text": "I agree that framing and scoping tasks is becoming a real joy. The great thing about this strategy is there's a point at which you can scope something small enough that it's hard for the AI to get it wrong and it's easy enough for you as a human to comprehend what it's done and verify that it's correct.\n\nI'm starting to think of projects now as a tree structure where the overall architecture of the system is the main trunk and from there you have the sub-modules, and eventually you get to implementations of functions and classes. The goal of the human in working with the coding agent is to have full editorial control of the main trunk and main sub-modules and delegate as much of the smaller branches as possible.\n\nSometimes you're still working out the higher-level architecture, too, and you can use the agent to prototype the smaller bits and pieces which will inform the decisions you make about how the higher-level stuff should operate."
}
,
  
{
  "id": "46906658",
  "text": "[Edit: I may have been replying to another comment in my head as now I re-read it and I'm not sure I've said the same thing as you have. Oh well.]\n\nI agree. This is how I see it too. It's more like a shortcut to an end result that's very similar (or much better) than I would've reached through typing it myself.\n\nThe other day I did realise that I'm using my experience to steer it away from bad decisions a lot more than I noticed. It feels like it does all the real work, but I have to remember it's my/our (decades of) experience writing code playing a part also.\n\nI'm genuinely confused when people come in at this point and say that it's impossible to do this and produce good output and end results."
}
,
  
{
  "id": "46906840",
  "text": "I feel the same, but, also, within like three years this might look very different. Maybe you'll give the full end-to-end goal upfront and it just polls you when it needs clarification or wants to suggest alternatives, and it self-manages cleanly self-delegating.\n\nOr maybe something quite different but where these early era agentic tooling strategies still become either unneeded or even actively detrimental."
}
,
  
{
  "id": "46907074",
  "text": "> it just polls you when it needs clarification\n\nI think anyone who has worked on a serious software project would say, this means it would be polling you constantly.\n\nEven if we posit that an LLM is equivalent to a human, humans constantly clarify requirements/architecture. IMO on both of those fronts the correct path often reveals itself over time, rather than being knowable from the start.\n\nSo in this scenario it seems like you'd be dealing with constant pings and need to really make sure you're understanding of the project is growing with the LLM's development efforts as well.\n\nTo me this seems like the best-case of the current technology, the models have been getting better and better at doing what you tell it in small chunks but you still need to be deciding what it should be doing. These chunks don't feel as though they're getting bigger unless you're willing to accept slop."
}
,
  
{
  "id": "46908003",
  "text": "> Break down sessions into separate clear, actionable tasks.\n\nWhat this misses, of course, is that you can just have the agent do this too. Agent's are great at making project plans, especially if you give them a template to follow."
}
,
  
{
  "id": "46913260",
  "text": "It sounds to me like the goal there is to spell out everything you don't want the agent to make assumptions about. If you let the agent make the plan, it'll still make those assumptions for you."
}
,
  
{
  "id": "46910943",
  "text": "If you've got a plan for the plan, what else could you possibly need!"
}
,
  
{
  "id": "46911583",
  "text": "You joke, but the more I iterate on a plan before any code, the more successful the first pass is.\n\n1) Tell claude my idea with as much as I know, ask it to ask me questions. This could go on for a few rounds. (Opus)\n\n2) Run a validate skill on the plan, reviewer with a different prompt (Opus)\n\n3) codex reviews the plan, always finds a few small items after the above 2.\n\n4) claude opus implements in 1 shot, usually 99% accurate, then I manually test.\n\nIf I stay on target with those steps I always have good outcomes, but it is time consuming."
}
,
  
{
  "id": "46913876",
  "text": "I do something very similar. I have an \"outside expert\" script I tell my agent to use as the reviewer. It only bothers me when neither it OR the expert can figure out what the heck it is I actually wanted.\n\nIn my case I have Gemini CLI, so I tell Gemini to use the little python script called gatekeeper.py to validate it's plan before each phase with Qwen, Kimi, or (if nothing else is getting good results) ChatGPT 5.2 Thinking. Qwen & Kimi are via fireworks.ai so it's much cheaper than ChatGPT. The agent is not allowed to start work until one of the \"experts\" approves it via gatekeeper. Similarly it can't mark a phase as complete until the gatekeeper approves the code as bug free and up to standards and passes all unit tests & linting.\n\nLately Kimi is good enough, but when it's really stuck it will sometimes bother ChatGPT. Seldom does it get all the way to the bottom of the pile and need my input. Usually it's when my instructions turned out to be vague.\n\nI also have it use those larger thinking models for \"expert consultation\" when it's spent more than 100 turns on any problem and hasn't made progress by it's own estimation."
}
,
  
{
  "id": "46905905",
  "text": "> On the other extreme you can tell an agent \"make me an app that's Facebook for dogs\" and it'll make so many assumptions about the architecture, code and product that there's no chance it produces anything useful beyond a cool prototype to show mom and dad.\n\nAmusingly, this was my experience in giving Lovable a shot. The onboarding process was literally just setting me up for failure by asking me to describe the detailed app I was attempting to build.\n\nTaking it piece by piece in Claude Code has been significantly more successful."
}
,
  
{
  "id": "46905129",
  "text": "so many times I catch myself asking a coding agent e.g “please print the output” and it will update the file with “print (output)”.\n\nMaybe there’s something about not having to context switch between natural language and code just makes it _feel_ easier sometimes"
}

]
</comments_to_classify>

Based on the comments above, assign each to up to 3 relevant topics.

Return ONLY a JSON array with this exact structure (no other text):
[
  
{
  "id": "comment_id_1",
  "topics": [
    1,
    3,
    5
  ]
}
,
  
{
  "id": "comment_id_2",
  "topics": [
    2
  ]
}
,
  
{
  "id": "comment_id_3",
  "topics": [
    0
  ]
}
,
  ...
]

Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices for matches
- Use index 0 if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment

Remember: Output ONLY the JSON array, no other text.
