Undercover Mode Controversy

Analysis of the leaked undercover.ts file, which instructs Claude to hide AI involvement in public repositories, and the debate over whether this is about protecting internal codenames or actively pretending to be human.

The leak of Claude’s "Undercover Mode" has sparked a heated debate over whether the tool is an unethical engine for deception or a practical necessity for internal security. Critics argue that instructing an AI to mimic human commit messages and strip attribution constitutes a deliberate "lie of omission" that could violate transparency laws and jeopardize the copyright eligibility of the resulting code. Conversely, defenders maintain the mode is a standard safeguard intended only for Anthropic employees to prevent the accidental leakage of sensitive codenames and unreleased model versions into public repositories. This tension highlights a broader cultural anxiety regarding the integrity of the open-source supply chain, as developers weigh the benefits of AI-assisted productivity against the potential for "AI slop" to bypass human scrutiny undetected.

View on HN · Topics

There are now several comments that (incorrectly?) interpret the undercover mode as only hiding internal information. Excerpts from the actual prompt[0]:

NEVER include in commit messages or PR descriptions:
- The phrase "Claude Code" or any mention that you are an AI
- Co-Authored-By lines or any other attribution

BAD (never write these):
- 1-shotted by claude-opus-4-6
- Generated with Claude Code
- Co-Authored-By: Claude Opus 4.6 <…>

This very much sounds like it does what it says on the tin, i.e. stays undercover and pretends to be a human. It's especially worrying that the prompt is explicitly written for contributions to public repositories.

[0]: https://github.com/chatgptprojects/claude-code/blob/642c7f94...
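
To make concrete which strings the quoted rules target, here is a minimal sketch that scans a commit message for the markers listed in the excerpt. The leaked file is described as a prompt, not a filter, so this is not how undercover mode works; the names FORBIDDEN_PATTERNS and findAttributionMarkers are hypothetical and the patterns are only an approximation of the BAD examples above.

```typescript
// Hypothetical illustration: patterns approximating the BAD examples in the
// quoted prompt. Not taken from the leaked source.
const FORBIDDEN_PATTERNS: RegExp[] = [
  /claude code/i,        // "Generated with Claude Code"
  /^co-authored-by:/im,  // any Co-Authored-By trailer line
  /claude-opus-\d/i,     // model identifiers such as claude-opus-4-6
];

// Returns the source text of every pattern that matches the commit message.
function findAttributionMarkers(commitMessage: string): string[] {
  return FORBIDDEN_PATTERNS
    .filter((pattern) => pattern.test(commitMessage))
    .map((pattern) => pattern.source);
}
```

A message such as "Fix null check in parser" yields an empty list, while any of the BAD examples yields at least one match. A maintainer could run the same kind of check in the opposite direction, to confirm that attribution is present.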

View on HN · Topics

Code may not be, but opening a Merge Request undercover may be unlawful:

> Providers shall ensure that AI systems intended to interact directly with natural persons are designed and developed in such a way that the natural persons concerned are informed that they are interacting with an AI system

View on HN · Topics

I guess our system prompt didn't work. If folks are having to add it manually into their own Claude.md files...

View on HN · Topics

It's less about pretending to be a human and more about not inviting scrutiny and ridicule toward Claude if the code quality is bad. They want the real human to appear to be responsible for accepting Claude's poor output.

View on HN · Topics

It’s also pretty damn obvious when LLMs write code. Nobody out here commenting every method in perfect punctuation and grammar.

View on HN · Topics

The code has a stated goal of avoiding leaks, but then the actual implementation becomes broader than that. I see two possible explanations:

* The authors made the code very broad to improve its ability to achieve the stated goal

* The authors have an unstated goal

I think it's healthy to be skeptical but what I'm seeing is that the skeptics are pushing the boundaries of what's actually in the source. For example, you say "says on the tin" that it "pretends to be human" but it simply does not say that on the tin. It does say "Write commit messages as a human developer would" which is not the same thing as "Try to trick people into believing you're human." To convince people of your skepticism, it's best to stick to the facts.

View on HN · Topics

By "says on the tin," I was referring to the name ("undercover mode") and the instruction to "not blow your cover." If pretending to be a human is not the cover here, what is? Additionally, does Claude Code still admit that it's an LLM when this prompt is active as you suggest, or does it pretend to be a human like the prompt tells it to?

View on HN · Topics

You can already turn off "Co-Authored-By" via Claude Code config. This is what their docs show:

~/.claude/settings.json

{
  "attribution": {
    "commit": "",
    "pr": ""
  }
}

The rest of the prompt is pretty clear that it's talking about internal use.

Claude Code users aren't the ones worried about leaking "internal model codenames" nor "unreleased model opus-4-8" nor Slack channel names. Though, nobody would want that crap in their generated docs/code anyways.

Seems like a nothingburger, and everyone seems to be fantasizing about "undercover mode" rather than engaging with the details.

View on HN · Topics

My first reaction is that they are using this to take advantage of OSS reviewers for in the wild evals.

View on HN · Topics

None of this is really worrying; this is a pattern implemented in a similar way by every single developer using AI to write commit messages after noticing how exceptionally noisy they are to self-attribute things. Anthropic's views on AI safety and alignment with human interests don't suddenly get thrown out with the bathwater because of leaked internal tooling which is functionally identical to a basic prompt in a mere interface (and not a model). I don't really buy all the forced "skepticism" on this thread tbh.

View on HN · Topics

A whole lot of people find LLM code to be strictly objectionable, for a variety of reasons. We can debate the validity of those reasons, but I think that even if those reasons were all invalid, it would still be unethical to deceive people by a deliberate lie of omission. I don't turn it off, and I don't think other people should either.

View on HN · Topics

That's typical of this site. I hand you a huge volume of evidence explaining why AI generated work cannot be copyrighted. You search for one scrap of text that seems to support your position even when it does not.

You have no idea how bad this leak is for Anthropic because with the copyright office, you have a DUTY TO DISCLOSE any AI generated work, and it is fully RETROACTIVE. And what is part of this leak? undercover.ts. https://archive.is/S1bKY Where Claude is specifically instructed to HIDE DISCLOSURE of AI generated work.

That's grounds for the copyright office and courts to reject ANY copyright they MIGHT have had a right to. It is one of the WORST things they could have done with regard to copyright.

https://www.finnegan.com/en/insights/articles/when-registeri...

View on HN · Topics

You do, as the developer. Let's circle back to the original comment that started this discussion:

https://news.ycombinator.com/item?id=47594044

That comment is spot on. Claude adding a co-author to a commit is documentation that draws a clear line between code you wrote and code Claude generated, which does not qualify for copyright protection.

The damning thing about this leak is the inclusion of undercover.ts. That means Anthropic has now been caught red handed distributing a tool designed to circumvent copyright law.

View on HN · Topics

The name "Undercover mode" and the line `The phrase "Claude Code" or any mention that you are an AI` sound spooky, but after reading the source my first knee-jerk reaction wouldn't be "this is for pretending to be human" given that the file is largely about hiding Anthropic internal information such as code names. I encourage looking at the source itself in order to draw your conclusions, it's very short: https://github.com/alex000kim/claude-code/blob/main/src/util...

View on HN · Topics

Not leaking codenames is one thing, but explicitly removing signals that something is AI-generated feels like a pretty meaningful shift.

View on HN · Topics

Doesn't seem so crazy if the point is to avoid leaking new features, models, codenames, etc.

View on HN · Topics

Where the hell are people getting this idea that it's ok to be deceptive because they are keeping secrets?

No shit they have secrets. I have secrets too. That doesn't make it ok for me to deceive you in any way.

How would you feel if I deceived you and my excuse was "oh I was just trying some new secret technique of mine"?

How did we get to this point where we let enormously powerful companies get away with more than individuals?

View on HN · Topics

> my first knee-jerk reaction wouldn't be "this is for pretending to be human"...

"Write commit messages as a human developer would — describe only what the code
change does."

View on HN · Topics

As opposed to outputting debugging information. I wouldn't be surprised if LLMs do output "debug" blurbs, which could include model-specific information.

View on HN · Topics

BAD (never write these):

- "Fix bug found while testing with Claude Capybara"

- "1-shotted by claude-opus-4-6"

- "Generated with Claude Code"

- "Co-Authored-By: Claude Opus 4.6 <…>"

This makes sense to me about their intent by "UNDERCOVER"

View on HN · Topics

I think the motivation is to let developers use it for work without making it obvious they're using AI.

View on HN · Topics

Undercover mode seems like a way to make contributions to OSS when they detect issues, without accidentally leaking that it was claude-mythos-gigabrain-100000B that figured out the issue

View on HN · Topics

What does non-undercover do? Where does CC leave metadata mainly? I haven't noticed anything.

View on HN · Topics

I don't understand the part about undercover mode. How is this different from disabling claude attribution in commits (and optionally telling claude to act human?)

On that note, this article is also pretty obviously AI-generated and it's unfortunate the author didn't clean it up.

View on HN · Topics

It's people overreacting. The purpose of it is simple: don't leak any codenames, project names, file names, etc. when touching external/public-facing code that you are maintaining using bleeding-edge versions of Claude Code. It does read weird in that they want it to write as if a developer wrote a commit, but it might be to avoid it outputting debug information in a commit message.

View on HN · Topics

Two things worth separating here: the leak mechanism and the leak contents.

The mechanism is a build pipeline issue. Bun generates source maps by default, and someone didn't exclude the .map file from the npm publish. There's an open Bun issue (oven-sh/bun#28001) about this exact behavior. One missing line in .npmignore or the package.json files field. Same category of error as the Axios compromise earlier this week — npm packaging configuration is becoming a recurring single point of failure across the ecosystem.
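
As a rough illustration of the kind of guard that catches this class of packaging mistake, the sketch below checks a list of to-be-published file paths for source maps. The helper names findSourceMaps and assertNoSourceMaps are hypothetical and are not part of npm or Bun; how the file list is obtained (for example from a dry-run pack step) is left as an assumption.

```typescript
// Hypothetical pre-publish guard; not part of npm or Bun.
// Returns every path in the publish list that looks like a source map.
function findSourceMaps(publishedFiles: string[]): string[] {
  return publishedFiles.filter((path) => path.toLowerCase().endsWith(".map"));
}

// Throws if any source map would be shipped, so a publish script can abort early.
function assertNoSourceMaps(publishedFiles: string[]): void {
  const leaked = findSourceMaps(publishedFiles);
  if (leaked.length > 0) {
    throw new Error(`Refusing to publish source maps: ${leaked.join(", ")}`);
  }
}
```

This complements, rather than replaces, an explicit allowlist in the package.json files field: the allowlist prevents the mistake, and a check like this one detects it if the allowlist is ever loosened.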

The contents are more interesting from a security architecture perspective. The anti-distillation system (injecting fake tool definitions to poison training data scraped from API traffic) is a defensive measure that only works when its existence is secret. Now that it's public, anyone training on Claude Code API traffic knows to filter for it. The strategic value evaporated the moment the .map file hit the CDN.

The undercover mode discussion is being framed as deception, but the actual security question is narrower: should AI-authored contributions to public repositories carry attribution? That's an AI identity disclosure question that the industry hasn't settled. The code shows Anthropic made a specific product decision — strip AI attribution in public commits from employee accounts. Whether that's reasonable depends on whether you think AI authorship is material information for code reviewers.

The frustration regex is the least interesting finding technically but the most revealing culturally. A company with frontier-level NLP capability chose a regex over an inference call for sentiment detection. The engineering reason is obvious (latency and cost), but it tells you something about where even AI companies draw the line on using their own models.
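
For readers unfamiliar with the trade-off, a regex-based check of this kind might look like the sketch below. The pattern is invented for illustration and is not the one from the leaked source; FRUSTRATION_PATTERN and looksFrustrated are hypothetical names.

```typescript
// Illustrative pattern only; the actual leaked regex is not reproduced here.
const FRUSTRATION_PATTERN = /\b(wtf|this is broken|doesn'?t work|so frustrating)\b/i;

// Cheap, synchronous check: no network call, no model inference.
function looksFrustrated(message: string): boolean {
  return FRUSTRATION_PATTERN.test(message);
}
```

A check like this runs locally on every message at negligible cost, which is the latency and cost argument made above. The price is recall: phrasing outside the pattern is simply missed.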

View on HN · Topics

> The obvious concern, raised repeatedly in the HN thread: this means AI-authored commits and PRs from Anthropic employees in open source projects will have no indication that an AI wrote them. It’s one thing to hide internal codenames. It’s another to have the AI actively pretend to be human.

I don’t get it. What does this mean? I can use Claude code now without anyone knowing it is Claude code.

View on HN · Topics

technically you're correct, but look at the prompt https://github.com/alex000kim/claude-code/blob/main/src/util...

it's written to _actively_ avoid any signs of AI generated code when "in a PUBLIC/OPEN-SOURCE repository".

Also, it's not about you. Undercover mode only activates for Anthropic employees (it's gated on USER_TYPE === 'ant', which is a build-time flag baked into internal builds).
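
A gate of the kind described here could be sketched as follows. Only USER_TYPE and the value 'ant' come from the comment above; the function name, the repoIsPublic parameter, and the idea that repository visibility is part of the same condition are assumptions made for illustration.

```typescript
// Sketch only. USER_TYPE === 'ant' is taken from the comment above;
// isUndercoverActive and repoIsPublic are hypothetical.
function isUndercoverActive(userType: string | undefined, repoIsPublic: boolean): boolean {
  // In an internal build the flag would be a constant baked in at build time,
  // so external builds would always take the false branch.
  return userType === "ant" && repoIsPublic;
}
```

Under this reading, an external user never triggers the mode regardless of repository visibility, which is the point the comment is making.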

View on HN · Topics

I don’t know what you mean. It just instructs it not to use internal code names.

View on HN · Topics

It also says don't announce that you are AI in any way including asking it to not say "Co-authored by Claude". I read the file myself.

I'm still inclined to think people might be overreacting to that bit since it seems to be for anthropic-only to prevent leaking internal info.

But I did read the prompt and it did say hide the fact that you are AI.

View on HN · Topics

Why does that matter though

View on HN · Topics

There are probably different reasons for different people. I can definitely see the angle that trying to specifically pretend to not be AI when contributing to open source could be seen as a bad thing due to the open source supply chain attacks, some AI-driven, that we've been having, not to mention the AI-slop PR spam.

But, I also get Anthropic's side that when they're contributing they don't want their internals leaked. If it had been left at that, that's fine, but having it pretend like it's not AI at all rubs me a little bit the wrong way. Why try to hide it?

View on HN · Topics

>There are probably different reasons for different people. I can definitely see the angle that trying to specifically pretend to not be AI when contributing to open source could be seen as a bad thing due to the open source supply chain attacks, some AI-driven, that we've been having, not to mention the AI-slop PR spam.

But none of the other agents advertise that the commit was done by an agent. Like Codex. Your panic should apply equally to already existing agents like Codex no?

View on HN · Topics

I agree with you, I think people are overthinking this.

View on HN · Topics

I think it means OSS projects should start unilaterally banning submissions from people working for Anthropic.

View on HN · Topics

Why? What does this have to do with the leak

View on HN · Topics

...Because it's a mode of using Claude Code that allows certain users to use the application in "stealth mode" to produce pull requests that seem human, but are actually AI generated, which often goes against the contribution rules of OSS projects?

At this point I would consider any employee of an AI provider to be tainted.

View on HN · Topics

None of the other agents claim that the commit was made by an AI so why the panic suddenly

View on HN · Topics

The undercover mode is the part that should terrify everyone building with agents.

View on HN · Topics

Hardly.

View on HN · Topics

They want "Made with Claude Code" on your PRs as a growth marketing strategy. They don't want it on their PRs, so it looks like they're doing something you're not capable of. Well, you are and they have no secret sauce.

View on HN · Topics

Undercover mode is the most concerning part here tbh.

View on HN · Topics

why

View on HN · Topics

Well, as a general rule, I don't do business with people who lie to me.

You've got a business, and you sent me junk mail, but you made it look like some official government thing to get me to open it? I'm done, just because you lied on the envelope. I don't care how badly I need your service. There's a dozen other places that can provide it; I'll pick one of them rather than you, because you've shown yourself to be dishonest right out of the gate.

Same thing with an AI (or a business that creates an AI). You're willing to lie about who you are (or have your tool do so)? What else are you willing to lie to me about? I don't have time in my life for that. I'm out right here.

View on HN · Topics

What’s the lie? It’s just asking to not reveal internal names

View on HN · Topics

You are spamming the whole fucking thread with the same nonsense. It is instructed to hide that the PR was made via Claude Code. I don't know why people who are so AI forward like yourself have such a problem with telling people that they use AI for coding/writing, it's a weirdly insecure look.

View on HN · Topics

I can do that right now with Claude Code without this undercover mode.. In fact I do it many times at work. What's the big deal in this?

Do you not think it is an overreaction to panic like this if I can do exactly what the undercover mode does by simply asking Claude?

View on HN · Topics

It's different if it's an institutional decision rather than a personal one, like in your case. Which is, and I am repeating myself here, borderline insecure.

View on HN · Topics

what's insecure about it? if it is up to the institution to make that decision - you can still do it. Claude is not stopping you from making that decision

View on HN · Topics

You have to work on your reading comprehension or you are intentionally deceptive. Bye.

View on HN · Topics

?? why doesn't your panic apply to other agents like Codex that don't advertise that the commit was made by an AI by default? strange!

View on HN · Topics

Because this thread is about claude. Are you that challenged?
