The following is content for you to classify. Do not respond to the comments—classify them.
<topics>
1. AI Performance on Greenfield vs. Legacy
Related: Users debate whether agents excel primarily at starting new projects from scratch while struggling to maintain large, complex, or legacy codebases without breaking existing conventions.
2. Context Window Limitations and Management
Related: Discussions focus on token limits (200k), performance degradation as context fills, and strategies like compacting history, using sub-agents, or maintaining summary files to preserve long-term memory.
3. Vibe Coding and Code Quality
Related: The polarization around building apps without reading the code; critics warn of unmaintainable "slop" and technical debt, while proponents value the speed and ability to bypass syntax.
4. Claude Code and Tooling
Related: Specific praise and critique for the Claude Code CLI, its integration with VS Code and Cursor, the use of slash commands, and comparisons to GitHub Copilot's agent mode.
5. Economic Impact on Software Jobs
Related: Existential anxiety regarding the obsolescence of mid-level engineers, the potential "hollowing out" of the middle class, and the shift toward one-person unicorn teams.
6. Prompt Engineering and Configuration
Related: Strategies involving `CLAUDE.md`, `AGENTS.md`, and custom system prompts to teach the AI coding conventions, architecture, and specific skills for better output.
7. Specific Language Capabilities
Related: Anecdotal evidence regarding proficiency in React, Python, and Go versus struggles in C++, Rust, and mobile development (Swift/Kotlin), often tied to training data availability.
8. Engineering vs. Coding
Related: A recurring distinction between "coding" (boilerplate, standard patterns) which AI conquers, and "engineering" (novel logic, complex systems, 3D graphics) where AI supposedly still fails.
9. Security and Trust
Related: Concerns about deploying unaudited AI code, the introduction of vulnerabilities, the risks of giving agents shell access, and the difficulty of verifying AI output.
10. The Skill Issue Argument
Related: Proponents dismiss failures as "skill issues," suggesting frustration stems from poor prompting or adaptability, while skeptics argue the tools are genuinely inconsistent.
11. Cost of AI Development
Related: Analysis of the financial viability of AI coding, including hitting API rate limits, the high cost of Opus 4.5 tokens, and the potential unsustainability of VC-subsidized pricing.
12. Future of Software Products
Related: Predictions that software creation costs will drop to zero, leading to a flood of bespoke personal apps replacing commercial SaaS, but potentially creating a maintenance nightmare.
13. Human-in-the-Loop Workflows
Related: The consensus that AI requires constant human oversight, "tools in a loop," and code review to prevent hallucination loops and ensure functional software.
14. Opus 4.5 vs. Previous Models
Related: Users describe the specific model as a "step change" or "inflection point" compared to Sonnet 3.5 or GPT-4, citing better reasoning and autonomous behavior.
15. Documentation and Specification
Related: The shift from writing code to writing specs; users find that detailed markdown documentation or "plan mode" yields significantly better AI results than vague prompts.
16. AI Hallucinations and Errors
Related: Reports of AI inventing non-existent CLI tools, getting stuck in logical loops, failing at visual UI tasks, and making simple indexing errors.
17. Shift in Developer Role
Related: The idea that developers are evolving into "product managers" or "architects" who direct agents, requiring less syntax proficiency and more systems thinking.
18. Testing and Verification
Related: The reliance on test-driven development (TDD), linters, and compilers to constrain non-deterministic AI output, ensuring generated code actually runs and meets requirements.
19. Local Models vs. Cloud APIs
Related: Discussions on the viability of local models for privacy and cost savings versus the necessity of massive cloud models like Opus for complex reasoning tasks.
20. Societal Implications
Related: Broader philosophical concerns about wealth concentration, the "class war" of automation, environmental impact, and the future of work in a post-code world.
0. Does not fit well in any category
</topics>
<comments_to_classify>
[
{
"id": "46519744",
"text": "Claude Code seems to package a relatively smart prompt as well, as it seems to work better even with one-line prompts than alternatives that just invoke the API.\n\nKey word: seems . It's impossible to do a proper qualitative analysis."
}
,
{
"id": "46526844",
"text": "Why dont I see any streams building apps as quickly as they say? Just HYpe"
}
,
{
"id": "46519572",
"text": "> (used voice to text then had claude reword, I am lazy and not gonna hand write it all for yall sorry!)\n\nReword? But why not just voice to text alone...\n\nOh but we all read the partially synthetic ad by this point. Psyche."
}
,
{
"id": "46525313",
"text": "I was expecting a showcase to showcase what you've done with it, not just another person's attempt at instructing an AI to follow instructions."
}
,
{
"id": "46519465",
"text": "They are sleeping on it because there is absolutely no incentive to use it.\n\nWhen needed it can be picked up in a day. Otherwise they are not paid based in tickets solved etc.\nIf the incentives were properly aligned everyone would already use it"
}
,
{
"id": "46528193",
"text": "> (used voice to text then had claude reword, I am lazy and not gonna hand write it all for yall sorry!)\n\ntake my downvote as hard as you can. this sort of thing is awfully off-putting."
}
,
{
"id": "46529569",
"text": "I'm at the point where I say fuck it, let them sleep.\n\nThe tech industry just went through an insane hiring craze and is now thinning out. This will help to separate the chaff from the wheat.\n\nI don't know why any company would want to hire \"tech\" people who are terrified of tech and completely obstinate when it comes to utilizing it. All the people I see downplaying it take a half-assed approach at using it then disparage it when it's not completely perfect.\n\nI started tinkering with LLMs in 2022. First use case, speak in natural english to the llm, give it a json structure, have it decipher the natural language and fill in that json structure (vacation planning app, so you talk to it about where/how you want to vacation and it creates the structured data in the app). Sometimes I'd use it for minor coding fixes (copy and paste a block into chatgpt, fix errors or maybe just ideation). This was all personal project stuff.\n\nAt my job we got LLM access in mid/late 2023. Not crazy useful, but still was helpful. We got claude code in 2024. These days I only have an IDE open so I can make quick changes (like bumping up a config parameter, changing a config bool, etc.). I almost write ZERO code now. I usually have 3+ claude code sessions open.\n\nOn my personal projects I'm using Gemini + codex primarily (since I have a google account and chatgpt $20/month account). When I get throttled on those I go to claude and pay per token. I'll often rip through new features, projects, ideas with one agent, then I have another agent come through and clean things up, look for code smells, etc. I don't allow the agents to have full unfettered control, but I'd say 70%+ of the time I just blindly accept their changes. If there are problems I can catch them on the MR/PR.\n\nI agree about the low hanging fruit and I'm constantly shocked at the sheer amount of FUD around LLMs. I want to generalize, like I feel like it's just the mid/jr level devs that speak poorly about it, but there's definitely senior/staff level people I see (rarely, mind you) that also don't like LLMs.\n\nI do feel like the online sentiment is slowly starting to change though. One thing I've noticed a lot of is that when it's an anonymous post it's more likely to downplay LLMs. But if I go on linkedin and look at actual good engineers I see them praising LLMs. Someone speaking about how powerful the LLMs are - working on sophisticated projects at startups or FAANG. Someone with FUD when it comes to LLM - web dev out of Alabama.\n\nI could go on and on but I'm just ranting/venting a little. I guess I can end this by saying that in my professional/personal life 9/10 of the top level best engineers I know are jumping on LLMs any chance they get. Only 1/10 talks about AI slop or bullshit like that."
}
,
{
"id": "46531882",
"text": "Not entirely disagreeing with your point but I think they've mostly been forced to pivot recently for their own sakes; they will never say it though. As much as they may seem eager the most public people tend to also be better at outside communication and knowing what they should say in public to enjoy more opportunities, remain employed or for the top engineers to still seem relevant in the face of the communities they are a part of. Its less about money and more about respect there I think.\n\nThe \"sudden switch\" since Opus 4.5 when many were saying just a few months ago \"I enjoy actual coding\" but now are praising LLM's isn't a one off occurrence. I do think underneath it is somewhat motivated by fear; not for the job however but for relevance. i.e. its in being relevant to discussions, tech talks, new opportunities, etc."
}
,
{
"id": "46526819",
"text": "OK, I am gonna be the guy and put my skin in the game here. I kind of get the hype, but the experience with e.g. Claude Code (or Github Copilot previously and others as weel) has so far been pretty unreliable.\n\nI have Django project with 50 kLOC and it is pretty capable of understanding the architecture, style of coding, naming of variables, functions etc. Sometimes it excels on tasks like \"replicate this non-trivial functionality for this other model and update the UI appropriately\" and leaves me stunned. Sometimes it solves for me tedious and labourous \"replace this markdown editor with something modern, allowing fullscreen edits of content\" and does annoying mistake that only visual control shows and is not capable to fix it after 5 prompts. I feel as I am becoming tester more than a developer and I do not like the shift. Especially when I do not like to tell someone he did an obvious mistake and should fix it - it seems I do not care if it is human or AI, I just do not like incompetence I guess.\n\nYesterday I had to add some parameters to very simple Falcon project and found out it has not been updated for several months and won't build due to some pip issues with pymssql. OK, this is really marginal sub-project so I said - let's migrate it to uv and let's not get hands dirty and let the Claude do it. He did splendidly but in the Dockerfile he missed the \"COPY server.py /data/\" while I asked him to change the path... Build failed, I updated the path myself and moved on.\n\nAnd then you listen to very smart guys like Karpathy who rave about Tab, Tab, Tab, while not understanding the language or anything about the code they write. Am I getting this wrong?\n\nI am really far far away from letting agents touch my infrastructure via SSH, access managed databases with full access privileges etc. and dread the day one of my silly customers asks me to give their agent permission to managed services. One might say the liability should then be shifted, but at the end of the day, humans will have to deal with the damage done.\n\nMy customer who uses all the codebase I am mentioning here asked me, if there is a way to provide \"some AI\" with item GTINs and let it generate photos, descriptions, etc. including metadata they handcrafted and extracted for years from various sources. While it looks like nice idea and for them the possibility of decreasing the staff count, I caught the feeling they do not care about the data quality anymore or do not understand the problems the are brining upon them due to errors nobody will catch until it is too late.\n\nTL;DR: I am using Opus 4.5, it helps a lot, I have to keep being (very) cautious. Wake up call 2026? Rather like waking up from hallucination."
}
,
{
"id": "46516279",
"text": "Everybody says how good Claude is and I go to my code base and I can't get it to correctly update one xaml file for me. It is quicker to make changes myself than to explain exactly what I need or learn how to do \"prompt engineering\".\n\nDisclaimer: I don't have access to Claude Code. My employer has only granted me Claude Teams. Supposedly, they don't use my poopy code to train their models if I use my work email Claude so I am supposed to use that. If I'm not pasting code (asking general questions) into Claude, I believe I'm allowed to use whatever."
}
,
{
"id": "46516312",
"text": "What's even the point of this comment if you self-admittedly don't have access to the flagship tool that everyone has been using to make these big bold coding claims?"
}
,
{
"id": "46516520",
"text": "isn't Claude Teams powerful? does it not have access to Opus?\n\npardon my ignorance.\n\nI use GitHub Copilot which has access to llms like Gemini 3, Sonnet/Opus 4.5 ang GPT 5.2"
}
,
{
"id": "46516351",
"text": "Because the same claims of \"AI tool does everything\" are made over and over again."
}
,
{
"id": "46516466",
"text": "The claims are being made for Claude Code, which you don't have access to."
}
,
{
"id": "46518559",
"text": "I believe part of why Claude Code is so great because it has the chance to catch its own mistakes. It can run compilers, linters, browsers and check its own output. If it makes a mistake, it takes one or two extra iterations until it gets it right."
}
,
{
"id": "46516509",
"text": "It's not \"AI tool does everything\", it's specifically Claude Code with Opus 4.5 is great at \"it\", for whatever \"it\" a given commenter is claiming."
}
,
{
"id": "46521448",
"text": "Didn't feel like reading all this so I shortened it! sorry!\n\nI shortened it for anyone else that might need it\n\n----\n\nSoftware engineers are sleeping on Claude Code agents. By teaching it your conventions, you can automate your entire workflow:\n\nCustom Skills: Generates code matching your UI library and API patterns.\n\nQuality Ops: Automates ESLint, doc syncing, and E2E coverage audits.\n\nAgentic Reviews: Performs deep PR checks against custom checklists.\n\nSmart Triage: Pre-analyzes tickets to give devs a head start.\n\nCheck out the showcase repo to see these patterns in action."
}
,
{
"id": "46534660",
"text": "you are part of the problem"
}
,
{
"id": "46518038",
"text": "Opus 4.5 ate through my Copilot quota last month, and it's already halfway through it for this month. I've used it a lot, for really complex code.\n\nAnd my conclusion is: it's still not as smart as a good human programmer. It frequently got stuck, went down wrong paths, ignored what I told it to do to do something wrong, or even repeat a previous mistake I had to correct.\n\nYet in other ways, it's unbelievably good. I can give it a directory full of code to analyze, and it can tell me it's an implementation of Kozo Sugiyama's dagre graph layout algorithm, and immediately identify the file with the error. That's unbelievably impressive. Unfortunately it can't fix the error. The error was one of the many errors it made during previous sessions.\n\nSo my verdict is that it's great for code analysis, and it's fantastic for injecting some book knowledge on complex topics into your programming, but it can't tackle those complex problems by itself.\n\nYesterday and today I was upgrading a bunch of unit tests because of a dependency upgrade, and while it was occasionally very helpful, it also regularly got stuck. I got a lot more done than usual in the same time, but I do wonder if it wasn't too much. Wasn't there an easier way to do this? I didn't look for it, because every step of the way, Opus's solution seemed obvious and easy, and I had no idea how deep a pit it was getting me into. I should have been more critical of the direction it was pointing to."
}
,
{
"id": "46520179",
"text": "Copilot and many coding agents truncates the context window and uses dynamic summarization to keep costs low for them. That's how they are able to provide flat fee plans.\n\nYou can see some of the context limits here:\n\nhttps://models.dev/\n\nIf you want the full capability, use the API and use something like opencode. You will find that a single PR can easily rack up 3 digits of consumption costs."
}
,
{
"id": "46521531",
"text": "Gerring off of their plans and prompts is so worth it, I know from experience, I'm paying less and getting more so far, paying by token, heavy gemini-3-flash user, it's a really good model, this is the future (distillations into fast, good enough for 90% of tasks), not mega models like Claude. Those will still be created for distillations and the harder problems"
}
,
{
"id": "46520903",
"text": "Maybe not, then. I'm afraid I have no idea what those numbers mean, but it looks like Gemini and ChatGPT 4 can handle a much larger context than Opus, and Opus 4.5 is cheaper than older versions. Is that correct? Because I could be misinterpreting that table."
}
,
{
"id": "46521144",
"text": "I don't know about GPT4 but the latest one (GPT 5.2) has 200k context window while Gemini has 1m, five times higher. You'll be wanting to stay within the first 100k on all of them to avoid hitting quotas very quickly though (either start a new task or compact when you reach that) so in practice there's no difference.\n\nI've been cycling between a couple of $20 accounts to avoid running out of quota and the latest of all of them are great. I'd give GPT 5.2 codex the slight edge but not by a lot.\n\nThe latest Claude is about the same too but the limits on the $20 plan are too low for me to bother with.\n\nThe last week has made me realize how close these are to being commodities already. Even the CLI the agents are nearly the same bar some minor quirks (although I've hit more bugs in Gemini CLI but each time I can just save a checkpoint and restart).\n\nThe real differentiating factor right now is quota and cost."
}
,
{
"id": "46521299",
"text": "You need to find where context breaks down, Claude was better at it even when Gemini had 5X more on paper, but both have improved with last releases."
}
,
{
"id": "46523634",
"text": "People are completely missing the points about agentic development. The model is obviously a huge factor in the quality of the output, but the real magic lies in how the tools are managing and injecting context in to them, as well as the tooling. I switched from Copilot to Cursor at the end of 2025, and it was absolute night and day in terms of how the agents behaved."
}
,
{
"id": "46524369",
"text": "Interesting you have this opinion yet you're using Cursor instead of Claude Code. By the same logic, you should get even better results directly using Anthropic's wrapper for their own model."
}
,
{
"id": "46525232",
"text": "My employer doesn't allow for Claude Code yet. I'm fully aware from speaking to other peers, that they are getting even better performance out of Claude Code."
}
,
{
"id": "46530477",
"text": "In my experience GPT-5 is also much more effective in the Cursor context than the Codex context. Cursor deserves props for doing something right under the hood."
}
,
{
"id": "46519389",
"text": "yes just using AI for code analysis is way under appreciated I think. Even the most sceptical people on using it for coding should try it out as a tool for Q&A style code interrogation as well as generating documentation. I would say it zero-shots documentation generation better than most human efforts would to the point it begs the question of whether it's worth having the documentation in the first place. Obviously it can make mistakes but I would say they are below the threshold of human mistakes from what I've seen."
}
,
{
"id": "46521229",
"text": "(I haven't used AI much, so feel free to ignore me.)\n\nThis is one thing I've tried using it for, and I've found this to be very, very tricky. At first glance, it seems unbelievably good. The comments read well, they seem correct, and they even include some very non-obvious information.\n\nBut almost every time I sit down and really think about a comment that includes any of that more complex analysis, I end up discarding it. Often, it's right but it's missing the point, in a way that will lead a reader astray. It's subtle and I really ought to dig up an example, but I'm unable to find the session I'm thinking about.\n\nThis was with ChatGPT 5, fwiw. It's totally possible that other models do better. (Or even newer ChatGPT; this was very early on in 5.)\n\nCode review is similar. It comes up with clever chains of reasoning for why something is problematic, and initially convinces me. But when I dig into it, the review comment ends up not applying.\n\nIt could also be the specific codebase I'm using this on? (It's the SpiderMonkey source.)"
}
,
{
"id": "46523066",
"text": "My main experience is with anthropic models.\n\nI've had some encounters with inaccuracies but my general experience has been amazing. I've cloned completely foreign git repos, cranked up the tool and just said \"I'm having this bug, give me an overview of how X and Y work\" and it will create great high level conceptual outlines that mean I can drive straight in where without it I would spend a long time just flailing around.\n\nI do think an essential skill is developing just the right level of scepticism. It's not really different to working with a human though. If a human tells me X or Y works in a certain way i always allow a small margin of possibility they are wrong."
}
,
{
"id": "46523528",
"text": "But have you actually thoroughly checked the documentation it generated? My experience suggests it can often be subtly wrong."
}
,
{
"id": "46523773",
"text": "If it can consistently verify that the error persists after fix--you can run (ok maybe you can't budget wise but theoretically) 10000 parallel instances of fixer agents then verify afterwards (this is in line with how the imo/ioi models work according to rumors)"
}
,
{
"id": "46521161",
"text": "> Opus 4.5 ate theough my Copilot quota last month\n\nSure, Copilot charges 3x tokens for using Opus 4.5, but, how were you still able to use up half the allocated tokens not even one week into January?\n\nI thought using up 50% was mad for me (inline completions + opencode), that's even worse"
}
,
{
"id": "46519784",
"text": "It acts differently when using it through a third party tool\n\nTry it again using Claude Code and a subscription to Claude. It can run as a chat window in VS Code and Cursor too."
}
,
{
"id": "46519906",
"text": "My employer gets me a Copilot subscription with access to Claude, not a subscription to Claude Code, unfortunately."
}
,
{
"id": "46520925",
"text": "at this point I would suggest getting a $20 subscription to start, seeing if you can expense it\n\nthe tooling is almost as important as the model"
}
,
{
"id": "46521961",
"text": ">So my verdict is that it's great for code analysis, and it's fantastic for injecting some book knowledge on complex topics into your programming, but it can't tackle those complex problems by itself.\n\nI don't think you've seen the full potential. I'm currently #1 on 5 different very complex computer engineering problems, and I can't even write a \"hello world\" in rust or cpp. You no longer need to know how to write code, you just need to understand the task at a high level and nudge the agents in the right direction. The game has changed.\n\n- https://highload.fun/tasks/3/leaderboard\n\n- https://highload.fun/tasks/12/leaderboard\n\n- https://highload.fun/tasks/15/leaderboard\n\n- https://highload.fun/tasks/18/leaderboard\n\n- https://highload.fun/tasks/24/leaderboard"
}
,
{
"id": "46532009",
"text": "If that is true; then all the commentary around software people having jobs still due to \"taste\" and other nice words is just that. Commentary. In the end the higher level stuff still needs someone to learn it (e.g. learning ASX2 architecture, knowing what tech to work with); but it requires IMO significantly less practice then coding which in itself was a gate. The skill morphs more into a tech expert rather than a coding expert.\n\nI'm not sure what this means for the future of SWE's though yet. I don't see higher levels of staff in big large businesses bothering to do this, and at some scale I don't see founders still wanting to manage all of these agents, and processes (got better things to do at higher levels). But I do see the barrier of learning to code gone; meaning it probably becomes just like any other job."
}
,
{
"id": "46522965",
"text": "How are you qualified to judge its performance on real code if you don't know how to write a hello world?\n\nYes, LLMs are very good at writing code, they are so good at writing code that they often generate reams of unmaintainable spaghetti.\n\nWhen you submit to an informatics contest you don't have paying customers who depend on your code working every day. You can just throw away yesterday's code and start afresh.\n\nClaude is very useful but it's not yet anywhere near as good as a human software developer. Like an excitable puppy it needs to be kept on a short leash."
}
,
{
"id": "46527486",
"text": "I know what's like running a business, and building complex systems. That's not the point.\n\nI used highload as an example because it seems like an objective rebuttal to the claim that \"but it can't tackle those complex problems by itself.\"\n\nAnd regarding this:\n\n\"Claude is very useful but it's not yet anywhere near as good as a human software developer. Like an excitable puppy it needs to be kept on a short leash\"\n\nAgain, a combination of LLM/agents with some guidance (from someone with no prior experience in this type of high performing architecture) was able to beat all human software developers that have taken these challenges."
}
,
{
"id": "46526442",
"text": "> Claude is very useful but it's not yet anywhere near as good as a human software developer. Like an excitable puppy it needs to be kept on a short leash.\n\nThe skill of \"a human software developer\" is in fact a very wide distribution, and your statement is true for a ever shrinking tail end of that"
}
,
{
"id": "46523160",
"text": "> How are you qualified to judge its performance on real code if you don't know how to write a hello world?\n\nThe ultimate test of all software is \"run it and see if it's useful for you.\" You do not need to be a programmer at all to be qualified to test this."
}
,
{
"id": "46523685",
"text": "What I think people get wrong (especially non-coders) is that they believe the limitation of LLMs is to build a complex algorithm.\nThat issue in reality was fixed a long time ago. The real issue is to build a product. Think about microservices in different projects, using APIs that are not perfectly documented or whose documentation is massive, etc.\n\nHonestly I don't know what commenters on hackernews are building, but a few months back I was hoping to use AI to build the interaction layer with Stripe to handle multiple products and delayed cancellations via subscription schedules. Everything is documented, the documentation is a bit scattered across pages, but the information is out there.\nAt the time there was Opus 4.1, so I used that. It wrote 1000 lines of non-functional code with 0 reusability after several prompts. I then asked something to Chat gpt to see if it was possible without using schedules, it told me yes (even if there is not) and when I told Claude to recode it, it started coding random stuff that doesn't exist.\nI built everything to be functional and reusable myself, in approximately 300 lines of code.\n\nThe above is a software engineering problem. Reimplementing a JSON parser using Opus is not fun nor useful, so that should not be used as a metric"
}
,
{
"id": "46531096",
"text": "> The above is a software engineering problem. Reimplementing a JSON parser using Opus is not fun nor useful, so that should not be used as a metric.\n\nI've also built a bitorrent implementation from the specs in rust where I'm keeping the binary under 1MB. It supports all active and accepted BEPs: https://www.bittorrent.org/beps/bep_0000.html\n\nAgain, I literally don't know how to write a hello world in rust.\n\nI also vibe coded a trading system that is connected to 6 trading venues. This was a fun weekend project but it ended up making +20k of pure arbitrage with just 10k of working capital. I'm not sure this proves my point, because while I don't consider myself a programmer, I did use Python, a language that I'm somewhat familiar with.\n\nSo yeah, I get what you are saying, but I don't agree. I used highload as an example, because it is an objective way of showing that a combination of LLM/agents with some guidance (from someone with no prior experience in this type of high performing architecture) was able to beat all human software developers that have taken these challenges."
}
,
{
"id": "46530650",
"text": "This hits the nail on the head. There's a marked difference between a JSON parser and a real world feature in a product. Real world features are complex because they have opaque dependencies, or ones that are unknown altogether. Creating a good solution requires building a mental model of the actual complex system you're working with, which an LLM can't do. A JSON parser is effectively a book problem with no dependencies."
}
,
{
"id": "46530965",
"text": "You are looking at this wrong. Creating a json parser is trivial. The thing is that my one-shot attempt was 10x slower than my final solution.\n\nCreating a parser for this challenge that is 10x more efficient than a simple approach does require deep understanding of what you are doing. It requires optimizing the hot loop (among other things) that 90-95% of software developers wouldn't know how to do. It requires deep understanding of the AVX2 architecture.\n\nHere you can read more about these challenges: https://blog.mattstuchlik.com/2024/07/12/summing-integers-fa..."
}
,
{
"id": "46524420",
"text": ">I'm currently #1 on 5 different very complex computer engineering problems\n\nAh yes, well known very complex computer engineering problems such as:\n\n* Parsing JSON objects, summing a single field\n\n* Matrix multiplication\n\n* Parsing and evaluating integer basic arithmetic expressions\n\nAnd you're telling me all you needed to do to get the best solution in the world to these problems was talk to an LLM?"
}
,
{
"id": "46527093",
"text": "Lol, the problem is not finding a solution, the problem is solving it in the most efficient way.\n\nIf you think you can beat an LLM, the leaderboard is right there."
}
,
{
"id": "46520951",
"text": "What bothers me about posts like this is: mid-level engineers are not tasked with atomic, greenfield projects. If all an engineer did all day was build apps from scratch, with no expectation that others may come along and extend, build on top of, or depend on, then sure, Opus 4.5 could replace them. The hard thing about engineering is not \"building a thing that works\", its building it the right way, in an easily understood way, in a way that's easily extensible.\n\nNo doubt I could give Opus 4.5 \"build be a XYZ app\" and it will do well. But day to day, when I ask it \"build me this feature\" it uses strange abstractions, and often requires several attempts on my part to do it in the way I consider \"right\". Any non-technical person might read that and go \"if it works it works\" but any reasonable engineer will know that thats not enough."
}
]
</comments_to_classify>
Based on the comments above, assign each to up to 3 relevant topics.
Return ONLY a JSON array with this exact structure (no other text):
[
{
"id": "comment_id_1",
"topics": [
1,
3,
5
]
}
,
{
"id": "comment_id_2",
"topics": [
2
]
}
,
{
"id": "comment_id_3",
"topics": [
0
]
}
,
...
]
Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices for matches
- Use index 0 if the comment does not fit well in any category
- Only assign topics that are genuinely relevant to the comment
Remember: Output ONLY the JSON array, no other text.