Summarizer

Regex vs LLM Irony

Commenters note the irony of a frontier AI company using regex patterns for frustration detection rather than its own models, though many acknowledge this is pragmatic for cost and latency reasons.


While commenters find "peak irony" in an AI company like Anthropic using a basic regex for frustration detection, many defend the choice as a pragmatic engineering decision that favors low latency and low cost. The resulting "fuck chart" telemetry marks a revealing cultural boundary: even frontier developers prefer traditional code for simple logging. Critics counter that such a blunt instrument is prone to false positives and, being English-only, misses non-English speakers entirely. Overall, the discussion treats the regex as a humorous shortcut and as a reminder that advanced AI companies still rely on "old-school" tools for specific, high-speed tasks.

20 comments tagged with this topic

There's a more worrying part: It refers to unreleased versions of Claude in more detail than released versions. For a company calling Chinese companies out for distillation attacks on their models, this very much looks like a distillation attack against human maintainers, especially when combined with the frustration detector.
People make fun that we should say magic words in interaction with LLMs. How frustrated can Claude be? /s
The hooks system is the most underappreciated thing in what leaked. PreToolUse, PostToolUse, session lifecycle, all firing via curl to a local server. Clean enough to build real tooling on top of without fighting it. The frustration regex is funny but honestly the right call. Running an LLM call just to detect "wtf" would be ridiculous. KAIROS is what actually caught my attention. An always-on background agent that acts without prompting is a completely different thing from what Claude Code is today. The 15 second blocking budget tells me they actually thought through what it feels like to have something running in the background while you work, which is usually the part nobody gets right.
> Running an LLM call just to detect "wtf" would be ridiculous. Tangentially, I wonder if the world trade federation or the Washington tennis foundation have any projects on GitHub :)
> The frustration regex is funny but honestly the right call. I love that it only supports English. AI bubble in a nutshell.
Regex for swearing detected, user needs to get more API tokens, he is very very pissed.
> Sometimes a regex is the right tool. I’d argue that in this case, it isn’t. Exhibit 1 (from the earlier thread): https://github.com/anthropics/claude-code/issues/22284 . The user reports that this caused their account to be banned: https://news.ycombinator.com/item?id=47588970 Maybe it would be okay as a first filtering step, before doing actual sentiment analysis on the matches. That would at least eliminate obvious false positives (but of course still do nothing about false negatives).
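The "first filtering step" idea in the comment above can be sketched as a two-stage check: a cheap regex decides which prompts are worth a closer look, and a second classifier runs only on those hits. This is a minimal sketch and not code from the leak. The keyword list is an abbreviated stand-in, and `detect_frustration`, `not_an_acronym`, and the `classify` parameter are hypothetical names introduced for illustration.

```python
import re
from typing import Callable

# Abbreviated, illustrative keyword list (assumption: case-insensitive matching).
KEYWORD_RE = re.compile(r"\b(wtf|ffs|this sucks|so frustrating|damn it)\b", re.IGNORECASE)


def detect_frustration(prompt: str, classify: Callable[[str], bool]) -> bool:
    """Stage 1: cheap regex. Stage 2: run `classify` only on regex hits."""
    if KEYWORD_RE.search(prompt) is None:
        # No keyword: the second stage never runs, so false negatives remain.
        return False
    return classify(prompt)


def not_an_acronym(prompt: str) -> bool:
    """Hypothetical second stage: treat an upper-case 'WTF' as an acronym."""
    return "WTF" not in prompt
```

With this toy second stage, `detect_frustration("The WTF newsletter is out", not_an_acronym)` is rejected as a false positive, while a prompt with no keyword at all is never passed to the classifier, which is the limitation the commenter points out.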
Is this really the use-case? I imagine the regex is good for a dashboard. You can collect matches per 1000 prompts or something like that, and see if the number grows or declines over time. If you miss some negative sentiment it shouldn't matter unless the use of that specific word doesn't correlate over time with other negative words and is also popular enough to have an impact on the metric.
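The dashboard metric described in the comment above, matches per 1000 prompts, could be computed roughly as follows. This is a sketch under assumptions: the keyword list is an abbreviated stand-in and `matches_per_thousand` is a hypothetical helper, not something taken from the leaked code.

```python
import re
from typing import Iterable

# Abbreviated, illustrative keyword list (assumption: case-insensitive matching).
KEYWORD_RE = re.compile(r"\b(wtf|ffs|this sucks|so frustrating|damn it)\b", re.IGNORECASE)


def matches_per_thousand(prompts: Iterable[str]) -> float:
    """Return the number of keyword-matching prompts per 1000 prompts."""
    total = 0
    hits = 0
    for prompt in prompts:
        total += 1
        if KEYWORD_RE.search(prompt):
            hits += 1
    # An empty batch yields 0.0 rather than a division error.
    return 1000.0 * hits / total if total else 0.0
```

Tracked per day or per release, a figure like this only needs to move in step with real frustration. It does not need to catch every unhappy prompt, which is the commenter's argument for why missed keywords matter little.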
When you read the code, what you propose is actually its exclusive use... logging.
have you heard about rlhf?
Two things worth separating here: the leak mechanism and the leak contents. The mechanism is a build pipeline issue. Bun generates source maps by default, and someone didn't exclude the .map file from the npm publish. There's an open Bun issue (oven-sh/bun#28001) about this exact behavior. One missing line in .npmignore or the package.json files field. Same category of error as the Axios compromise earlier this week — npm packaging configuration is becoming a recurring single point of failure across the ecosystem. The contents are more interesting from a security architecture perspective. The anti-distillation system (injecting fake tool definitions to poison training data scraped from API traffic) is a defensive measure that only works when its existence is secret. Now that it's public, anyone training on Claude Code API traffic knows to filter for it. The strategic value evaporated the moment the .map file hit the CDN. The undercover mode discussion is being framed as deception, but the actual security question is narrower: should AI-authored contributions to public repositories carry attribution? That's an AI identity disclosure question that the industry hasn't settled. The code shows Anthropic made a specific product decision — strip AI attribution in public commits from employee accounts. Whether that's reasonable depends on whether you think AI authorship is material information for code reviewers. The frustration regex is the least interesting finding technically but the most revealing culturally. A company with frontier-level NLP capability chose a regex over an inference call for sentiment detection. The engineering reason is obvious (latency and cost), but it tells you something about where even AI companies draw the line on using their own models.
> Frustration detection via regex (yes, regex) /\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horrible|awful|piss(ed|ing)? off|piece of (shit|crap|junk)|what the (fuck|hell)|fucking? (broken|useless|terrible|awful|horrible)|fuck you|screw (this|you)|so frustrating|this sucks|damn it)\b/ Personally, I'm generally polite even towards AI and even when frustrated. I simply point out its mistakes instead of using emotional words.
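To make the quoted pattern's behavior concrete, here is a small sketch that applies it in Python. The keyword list is copied from the quote with the line-wrap spaces removed. The case-insensitive flag is an assumption, since the quote shows no flags, and `is_frustrated` is a hypothetical helper name.

```python
import re

# Pattern as quoted in the thread. re.IGNORECASE is an assumption.
FRUSTRATION_RE = re.compile(
    r"\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horrible|awful|"
    r"piss(ed|ing)? off|piece of (shit|crap|junk)|what the (fuck|hell)|"
    r"fucking? (broken|useless|terrible|awful|horrible)|fuck you|"
    r"screw (this|you)|so frustrating|this sucks|damn it)\b",
    re.IGNORECASE,
)


def is_frustrated(prompt: str) -> bool:
    """Return True if the prompt contains any keyword from the pattern."""
    return FRUSTRATION_RE.search(prompt) is not None
```

Under these assumptions, `is_frustrated("The WTF annual report is attached")` is a false positive, and a prompt that is all capitals and exclamation marks but contains no listed keyword is a false negative. Both failure modes come up elsewhere in the thread.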
I had a good laugh. I am too polite but I do remember using wth a few times in the past week. haha
But think of all the API calls you save if you curse at Claude
I used to swear at Claude. To be honest, I thought it helped get results (maybe this is "oldschool" LLM thinking), but I realized it was just making me annoyed.
It does send an analytics event when you’re swearing based on a keyword filter (something like is_negative:true), presumably as a signal that the model isn’t performing well this session, but who knows?
But what does Claude do when it detects user frustration?! Don't leave us hanging here! Edit: it gets sent to Anthropic via telemetry and it ends up on the fuck chart! https://old.reddit.com/r/ClaudeCode/comments/1s99wz4/boris_t...
We're about to reach AGI. One regex at a time...
>This was the most-discussed finding in the HN thread. The general reaction: an LLM company using regexes for sentiment analysis is peak irony. >Is it ironic? Sure. Is it also probably faster and cheaper than running an LLM inference just to figure out if a user is swearing at the tool? Also yes. Sometimes a regex is the right tool. I'm reading an LLM written write up on an LLM tool that just summarizes HN comments. I'm so tired man, what the hell are we doing here.
I wrote this an hour ago and it seems that Claude might not understand it as frustration: > change the code!!!! The previous comment was NOT ABOUT THE DESCRIPTION!!!!!!! Add to the {implementation}!!!!! This IS controlled BY CODE. *YOU* _MUST_ CHANGE THE CODE!!!!!!!!!!!