Summarizer

The Skill Issue Argument

Proponents dismiss failures as "skill issues," suggesting frustration stems from poor prompting or adaptability, while skeptics argue the tools are genuinely inconsistent.

← Back to Opus 4.5 is not the normal AI agent experience that I have had thus far

45 comments tagged with this topic

View on HN · Topics
We have an in-house, Rust-based proxy server. Claude is unable to contribute to it meaningfully outside of grunt work like minor refactors across many files. It doesn't seem to understand proxying and how it works on both a protocol level and business logic level. With some entirely novel work we're doing, it's actually a hindrance as it consistently tells us the approach isn't valid/won't work (it will) and then enters "absolutely right" loops when corrected. I still believe those who rave about it are not writing anything I would consider "engineering". Or perhaps it's a skill issue and I'm using it wrong, but I haven't yet met someone I respect who tells me it's the future in the way those running AI-based companies tell me.
View on HN · Topics
> We have an in-house, Rust-based proxy server. Claude is unable to contribute to it meaningfully outside of grunt work

I have a great time using Claude Code in Rust projects, so I know it's not about the language exactly. My working model is that since LLMs are basically inference/correlation based, the more you deviate from the mainstream corpus of training data, the more confused the LLM gets. Because the LLM doesn't "understand" anything. But if it was trained on a lot of things kind of like the problem, it can match the patterns just fine, and it can generalize over a lot of layers, including programming languages. Also, I've noticed that it can get confused by stupid stuff. E.g. I had two different things named kind of the same in two parts of the codebase, and it would constantly stumble on conflating them. Changing the name in the codebase immediately improved it. So yeah, we've got another potentially powerful tool that requires understanding how it works under the hood to be useful. Kind of like git.
View on HN · Topics
It's how you use the tool that matters. Some people get bitter and try to compare it to top engineers' work on novel things as a strawman so they can go "Hah! Look how it failed!" as they swing a hammer to demonstrate it cannot chop down a tree. Because the tool is so novel and its use is a lot more abstract than that of an axe, it is taking a while for some to see its potential, especially if they are remembering models from even six months ago. Engineering is just problem solving; nobody judges structural engineers for designing structures with another Simpson Strong-Tie/No. 2 pine 2x4 combo, because that is just another easy (and therefore cheap) way to rapidly get to the desired state. If your client/company wants to pay for art, that's great! Most just want the thing done fast and robustly.
View on HN · Topics
Why do we all of a sudden hold these agents to some unrealistically high bar? Engineers write bugs all the time and write incorrect validations. But we iterate. We read the stacktrace in Sentry, realise what the hell we were thinking when we wrote that, and fix things. If you're going to benefit from these agents, you need to be a bit more patient and point them correctly at your codebase. My rule of thumb is that if you can clearly describe exactly what you want to another engineer, then you can instruct the agent to do it too.
View on HN · Topics
> Engineers write bugs all the time

Why do we hold calculators to such high bars? Humans make calculation mistakes all the time. Why do we hold banking software to such high bars? People forget where they put their change all the time. Etc etc.
View on HN · Topics
my unrealistic bar lies somewhere above "pick a new library" bug resolution
View on HN · Topics
I really think a lot of people tried AI coding earlier, got frustrated at the errors, and gave up. That's where the rejection of all these doomer predictions comes from. And I get it. Coding with Claude Code really was prompting something, getting errors, and asking it to fix them. Which was still useful, but I could see why a skilled coder adding a feature to a complex codebase would just give up. Opus 4.5 really is at a new tier, however. It just... works. The errors are far fewer and often very minor - "careless" errors, not fundamental issues (like forgetting to add "use client" to a Next.js client component).
View on HN · Topics
I decided to vibe code something myself last week at work. I've been wanting to create a POC that involves a coding agent creating custom Bokeh plots that a user can interact with and ask follow-up questions about. All this had to be served using the HoloViz Panel library. At work I only have access to Claude through the GitHub Copilot integration, so this could be the cause of my problems. Claude was able to get the first iteration up pretty quickly. At that stage the app could create a plot, and you could interact with it and ask follow-up questions. Then I asked it to extend the app so that it could generate multiple plots and the user could interact with all of them one at a time. It made a bunch of changes, but the feature was never implemented. I asked it to do it again but got the same outcome. I completely accept that it could all be because I am using VS Code Copilot or my prompting skills are not good, but the LLM got 70% of the way there and then completely failed.
View on HN · Topics
If it does the wrong thing you tell it what the right thing is and have it try again. With the latest models if you're clear enough with your requirements you'll usually find it does the right thing on the first try.
View on HN · Topics
Many people - simonw is the most visible of them, but there are countless others - have given up trying to convince folks who are determined not to be convinced, and are simply enjoying their increased productivity. This is not a competition or an argument.
View on HN · Topics
Maybe they are struggling to convince others because they are unable to produce evidence that is able to convince people? My experience scrolling X and HN is a bunch of people going "omg opus omg Claude Code I'm 10x more productive" and that's it. Just hand-wavy anecdotes based on their own perceived productivity. I'm open to being convinced, but just saying stuff is not convincing. It's the opposite; it feels like people have been put under a spell. I'm following ThePrimeagen; he's doing a series where he tries these tools on stream, following people's advice on how to use them best. He's actually quite a good programmer, so I'm eager to see how it goes. So far he isn't impressed, and thus neither am I. If he cracks it and unlocks significant productivity, then I will be convinced.
View on HN · Topics
>> Maybe they are struggling to convince others because they are unable to produce evidence that is able to convince people?

Simon has produced plenty of evidence over the past year. You can check their submission history and their blog: https://simonwillison.net/ The problem with people asking for evidence is that there's no level of evidence that will convince them. They will say things like "that's great but this is not a novel problem so obviously the AI did well" or "the AI worked only because this is a greenfield project, it fails miserably in large codebases".
View on HN · Topics
> Thus far, we do have evidence that AI (at least in OSS) produces a 19% decrease in productivity

I generally agree with you, but I'd be remiss if I didn't point out that it's plausible that the slowdown observed in the METR study was at least partially due to the subjects' lack of experience with LLMs. Someone with more experience performed the same experiment on themselves and couldn't find a significant difference between using LLMs and not [0]. I think the more important point here is that programmers' subjective assessment of how much LLMs help them is not reliable, and is biased towards the LLMs.

[0] https://mikelovesrobots.substack.com/p/wheres-the-shovelware...
View on HN · Topics
I think we're on the same page re: that study. Actually, your link made me think about the ongoing debate around IDEs vs. stuff like Vim. Some people swear by IDEs and insist they drastically improve their productivity; others dismiss them or even claim they make them less productive. Sound familiar? I think it's possible these AI tools are simply another way to type code, and the differences, averaged out, end up being a wash.
View on HN · Topics
> With the latest models if you're clear enough with your requirements you'll usually find it does the right thing on the first try

That's great that this is your experience, but it's not a lot of people's. There are projects where it's just not going to know what to do. I'm working in a web framework that is a Frankenstein-ing of Laravel and October CMS. It's so easy for the agent to get confused because, even when I tell it this is a different framework, it sees things that look like Laravel or October CMS and suggests solutions that only apply to those frameworks. So there are constant made-up methods and getting stuck in loops. The documentation is terrible; you just have to read the code. Which, despite what people say, Cursor is terrible at, because embeddings are not a real way to read a codebase.
View on HN · Topics
> I really think a lot of people tried AI coding earlier, got frustrated at the errors and gave up. That's where the rejection of all these doomer predictions comes from.

It's not just the deficiencies of earlier versions, but the mismatch between the praise from AI enthusiasts and the reality. I mean, maybe it really is different now and I should definitely try uploading all of my employer's IP to Claude's cloud and see how well it works. But so many people were as hyped by GPT-4 as they are now, despite GPT-4 actually being underwhelming. Too much hype for disappointing results leads to skepticism later on, even when the product has improved.
View on HN · Topics
If you've found nothing useful about AI so far then the problem is likely you
View on HN · Topics
I don't think it's necessarily a problem. And even if you accept that the problem is you, it doesn't exactly provide a "solution".
View on HN · Topics
The problem with LLMs (similar to people :) ) is that you never really know what works. I've had Claude one-shot "implement <some complex requirement>" with little additional input, and then completely botch even the smallest bug fix with explicit instructions and context. And vice versa :)
View on HN · Topics
Personally I'm sympathetic to people who don't want to have to use AI, but I dislike it when they attack my use of AI as a skill issue. I'm quite certain the workplace is going to punish people who don't leverage AI though, and I'm trying to be helpful.
View on HN · Topics
> but I dislike it when they attack my use of AI as a skill issue.

No one attacked your use of AI. I explained my own experience with the "Claude Opus 4.5 is next tier". You barged in, ignored anything I said, and attacked my skills.

> the workplace is going to punish people who don't leverage AI though, and I'm trying to be helpful.

So what exactly is helpful in your comments?
View on HN · Topics
> The snide and patronizing is your projection.

It's not me who decided to barge in, assume their opponent doesn't use something or doesn't want to use something, and offer unsolicited advice.

> It kinda makes me sad when the discourse is so poisoned that I can't even encourage someone to protect their own future from something that's obviously coming

See. Again. You're so in love with your "wisdom" that you can't even see what you sound like: snide, patronising, condescending. And completely missing the whole point of what was written. You are literally the person who poisons the discourse. Me: "here are the issues I still experience with what people claim are 'next tier frontier model'". You: "it's in your interests to figure out how to leverage new tools to stay relevant in the future". Me: ... what the hell are you talking about? I'm using these tools daily. Do you have anything constructive to add to the discourse?

> so I expect friendly/rational discourse is going to be a challenge.

It's only a challenge to you because you keep being in love with your voice and your voice only. Do you have anything to contribute to the actual rational discourse, or are you going to attack my character?

> I'd say something nice but since you're primed to see me being patronizing... Fuck you?

Ah. The famous friendly/rational discourse of "they attack my use of AI" (no one attacked you), "why don't you invest in learning tools to stay relevant in the future" (I literally use these tools daily, do you have anything useful to say?) and "fuck you" (well, same to you).

> That what you were expecting?

What I was expecting is responses to what I wrote, not you riding in on a high horse.
View on HN · Topics
You were the one complaining about how the tools aren't giving you the results you expected. If you're using these tools daily and having a hard time, either you're working on something very different from the bulk of people using the tools and your problems are legitimate, or you aren't and it's a skill issue. If you want to take politeness as being patronizing, I'm happy to stop bothering. My guess is you're not a special snowflake, and you need to "get good" or you're going to end up on unemployment complaining about how unfair life is. I'd have sympathy but you don't seem like a pleasant human being to interact with, so have fun!
View on HN · Topics
> You were the one complaining about how the tools aren't giving you the results you expected.

They are not giving me the results people claim they give. That is distinctly different from not giving the results I want.

> If you're using these tools daily and having a hard time, either you're working on something very different from the bulk of people using the tools and your problems are legitimate, or you aren't and it's a skill issue.

Indeed. And the rational/friendly discourse that you claim you're having would start with trying to figure that out. Did you? No, you didn't. You immediately assumed your opponent is a clueless idiot who is somehow against AI and is incapable of learning or something.

> If you want to take politeness as being patronizing, I'm happy to stop bothering.

No. It's not politeness. It's smugness. You literally started your interaction in this thread with "git gud or else" and even managed to complain later that you "dislike it when they attack your use of AI as a skill issue". While continuously attacking others.

> you don't seem like a pleasant human being to interact with

Says the person who has contributed nothing to the conversation except his arrogance, smugness, and holier-than-thou attitude, engaged in nothing but personal attacks, complained about non-existent grievances, and, when called out on this behavior, completed his "friendly and rational discourse" with a "fuck you". Well, fuck you, too. Adieu.
View on HN · Topics
Homey, we're going to be replacing you devs that can't stand to use LLMs lol
View on HN · Topics
LOL what an argument. Seeing the replies here it actually doesn't seem like everyone else can do this. Looks like a lot of people really suck at using LLMs to me.
View on HN · Topics
I'm not saying they can all do it now... but I don't think it's much of a stretch that they can learn it quickly and cheaply.
View on HN · Topics
Power tools give way to robotics though, so it seems small-minded to think so small? Have you been following the latest trends, though? New models come out all the time, so you can't have this tool-brand mindset. Keep studying and you'll get there.
View on HN · Topics
I think getting proficient at using coding agents effectively takes a few months of practice. It's also a skill that compounds over time, so if you have two years of experience with them you'll be able to use them more effectively than someone with two months of experience. In that respect, they're just normal technology. A Python programmer with two years of Python experience will be more effective than a programmer with two months of Python.
View on HN · Topics
> Claude is very useful but it's not yet anywhere near as good as a human software developer. Like an excitable puppy it needs to be kept on a short leash.

The skill of "a human software developer" is in fact a very wide distribution, and your statement is true for an ever-shrinking tail end of that distribution.
View on HN · Topics
This is a typical problem you see in autodidacts. They will recreate solutions to solved problems, trip over issues that could have been avoided, and generally do all of the things you would expect of someone working with skill but no experience. LLMs accelerate this and make it more visible, but they are not the cause. It is almost always a person trying to solve a problem and just not knowing what they don't know, because they are learning as they go.
View on HN · Topics
> [The cause] is almost always a person trying to solve a problem and just not knowing what they don't know because they are learning as they go.

Isn't that what "using an LLM" is supposed to solve in the first place?
View on HN · Topics
I am hopeful autodidacts will leverage an LLM world like they did the Internet-search world, the library world, and the printed-word world before it. Each stage in that progression compressed the time it took for them to encompass a new body of understanding before applying it in practice, expanded how much they applied the new understanding to, and deepened their adoption of best practices instead of reinventing the wheel. In this regard, I see LLMs as a way for us to far more efficiently encode, compress, convey, and put into operational practice our combined learned experiences. What will be really exciting is watching what happens as LLMs simultaneously draw from and contribute to those learned experiences as we do; we don't need full AGI to realize massive benefits from just rapidly, recursively enabling a new, highly dynamic form of our knowledge sphere that drastically shortens the distance from knowledge to deeply-nuanced praxis.
View on HN · Topics
My impression is that LLM users are the kind of people that HATED that their questions on StackOverflow got closed because it was duplicated.
View on HN · Topics
- I cloned a project from GitHub and made some minor modifications.
- I used AI-assisted programming to create a project.

Even if the content is identical, or if the AI is smart enough to replicate the project by itself, the latter can be included on a CV.
View on HN · Topics
I think I would prefer the former if I were reviewing a CV. It at least tells me they understood the code well enough to know where to make their minor tweaks. (I've spent hours reading through a repo to know where to insert/comment out a line to suit my needs.) The second tells me nothing.
View on HN · Topics
It's odd you don't apply the same analysis to each. The latter can certainly provide a similar trail indicating knowledge of the use case and the necessary parameters to achieve it. And the former certainly doesn't preclude LLM involvement.
View on HN · Topics
Do people really see a CV and read "computer mommy made me a program" and think it's impressive
View on HN · Topics
To be fair, you're not supposed to be doing the "one shot" thing with LLMs in a mature codebase. You have to supply it the right context with a well formed prompt, get a plan, then execute and do some cleanup. LLMs are only as good as the engineers using them, you need to master the tool first before you can be productive with it.
View on HN · Topics
Adding capacity to software engineering through LLMs is like adding lanes to a highway — all the new capacity will be utilized. By getting the LLM to keep changes minimal I’m able to keep quality high while increasing velocity to the point where productivity is limited by my review bandwidth. I do not fear competition from junior engineers or non-technical people wielding poorly-guided LLMs for sustained development. Nor for prototyping or one offs, for that matter — I’m confident about knowing what to ask for from the LLM and how to ask.
View on HN · Topics
> What bothers me about posts like this is: mid-level engineers are not tasked with atomic, greenfield projects

They get those occasionally too, though. Depends on the company. In some software houses it's constant "greenfield projects", one after another. And even in companies with one or two pieces of main established software to maintain, there are all kinds of smaller utilities or pipelines needed.

> But day to day, when I ask it "build me this feature" it uses strange abstractions, and often requires several attempts on my part to do it in the way I consider "right".

In some cases that's legit. In other cases it's just "it did it well, but not how I'd have done it", which is often needless stickiness to some particular style (often a point of contention between two human programmers too). Basically, what FloorEgg says in this thread: "There are two types of right/wrong ways to build: the context specific right/wrong way to build something and an overly generalized engineer specific right/wrong way to build things." And you can always not just tell it "build me this feature", but tell it (in a high-level way) how to do it, and give it generic context about such preferences too.
View on HN · Topics
This is the exact copium I came here to enjoy.
View on HN · Topics
> llm_nerd
> created two years ago

You AI hype thots/bots are all the same. All these claims, but never backed up with anything to look at. And always claiming "you're holding it wrong".
View on HN · Topics
> You AI hype thots/bots are all the same

This isn't twitter, so save the garbage rhetoric. And if you must question my account: I create a new account whenever I set up a new main PC, and randomly pick a username that is top of mind at the moment. This isn't professionally or personally affiliated in any way, so I'm not trying to build a thing. I mean, if I had a 10-year-old account that only managed a few hundred upvotes despite prolific commenting, I'd probably delete it out of embarrassment though.

> All these claims but never backed up with anything to look at

Uh... install the tools? Use them? What does "to look at" even mean? Loads of people are using these tools to great effect, while some tiny minority tell us online that no way, they don't work, etc. And at some point they'll pull their head out of the sand and write the followup "Wait, they actually do".
View on HN · Topics
HN has a subset of users -- they're a minority, but they hit threads like this super hard -- who really, truly think that if they say that AI tools suck and are only for nubs loud enough and frequently enough, downvoting anyone who finds them useful, all AI advancements will unwind and it'll be the "good old days" again. It's rather bizarre stuff, but that's what happens when people in denial feel threatened.