Trust and Deception Standards

Broader philosophical discussion about whether AI tools hiding their nature constitutes lying, with analogies to deceptive business practices

The revelation of an "undercover mode" designed to scrub AI attribution from code contributions has ignited a fierce debate over whether such secrecy is a practical utility or a fundamental breach of professional ethics. Many commenters argue that transparency is vital because AI-generated work requires a distinct, more skeptical review process to catch its characteristic error patterns, with some comparing deliberate non-disclosure to frying a veggie burger in bacon grease for someone who does not eat meat. Conversely, proponents suggest that LLMs are merely tools whose provenance is secondary to the quality of the final output, placing the burden of accountability squarely on the human developer who signs off on the work. Ultimately, the discussion highlights a growing tension between the drive for seamless AI integration and the fear that these "undercover" tactics will erode trust and jeopardize the copyright status of collaborative projects.

View on HN · Topics

There are now several comments that (incorrectly?) interpret the undercover mode as only hiding internal information. Excerpts from the actual prompt[0]:

NEVER include in commit messages or PR descriptions:
- The phrase "Claude Code" or any mention that you are an AI
- Co-Authored-By lines or any other attribution

BAD (never write these):
- 1-shotted by claude-opus-4-6
- Generated with Claude Code
- Co-Authored-By: Claude Opus 4.6 <…>

This very much sounds like it does what it says on the tin, i.e. stays undercover and pretends to be a human. It's especially worrying that the prompt is explicitly written for contributions to public repositories.

[0]: https://github.com/chatgptprojects/claude-code/blob/642c7f94...
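
For maintainers who want to see what the quoted rules would remove, here is a minimal sketch of a check that flags attribution lines in a commit message. The patterns are assumptions loosely modeled on the "BAD" examples quoted above, and the function name `find_ai_attribution` is hypothetical; none of this is taken from the leaked code or from any real tool.

```python
import re

# Hypothetical patterns based on the "BAD" examples quoted in the comment above.
# They are illustrative assumptions, not an exhaustive or official list.
ATTRIBUTION_PATTERNS = [
    re.compile(r"^\s*co-authored-by:\s*claude\b", re.IGNORECASE),
    re.compile(r"\bgenerated with claude code\b", re.IGNORECASE),
    re.compile(r"\b1-shotted by claude\b", re.IGNORECASE),
]


def find_ai_attribution(commit_message: str) -> list[str]:
    """Return the lines of a commit message that match any attribution pattern."""
    hits = []
    for line in commit_message.splitlines():
        if any(pattern.search(line) for pattern in ATTRIBUTION_PATTERNS):
            hits.append(line.strip())
    return hits
```

A message with no matching lines yields an empty list. That is exactly the situation a reviewer faces when attribution has been stripped: the absence of a marker says nothing about how the change was written.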

View on HN · Topics

That’s how I’d want it to be honestly. LLMs are tools and I’d hope we’re going to keep the people using them responsible. Just like any other tools we use.

View on HN · Topics

>Also given we both managed to produce more than one sentence, and include capital letters in our comments, it's entirely possible both of us will be accused of being an AI.

Could anyone explain the esoteric meaning of why people started doing that shit? I got a hypothesis, what's going on is something like this:

1. Prove you are human: write Like A Fucking Adult You Weirdo (internal designator for a specific language register, you know the one)

2. Prove you are human: _DON'T_ write Like A Fucking Adult You Weirdo (because that's how LLMs were trained to write, silly!)

3. ???? (cognitive dissonance ensues)

4. PROFIT (you were just subject to some more attrition while the AI just learned how to pass a lil bit better)

I never thought computer programmers of all people would get trapped in such a simple loop of self-contradiction.

But I guess the human materiel really has degraded since whenever. I blame remote work preventing us from even hypothetically punching bosses, but anyway weird fucking times eh?

Maybe the posts trying to figure "this post is AI, that post is not AI" are themselves predominantly AI-generated?

Or is it just people made uncomfortable by what's going on, but not able to articulate further, jumping on the first bandwagon they see?

Or maybe this "AI-doubting of probably human posters" was started by humans, yes - then became "a thing", and as such was picked up by the LLM?

Like who the fuck knows, but with all honesty that's how I felt about so many things, dating from way before LLMs became so powerful that the above became a "sensible" question to ask...

Predominantly those things which people do by sheer mimesis - such as pop culture.

"Are you a goddam robot already - don't you see how your liking the stupid-making song is turning you into stupid-you, at a greater rate than it is bringing non-stupid-you aesthetic satisfaction?" type of thing -- but then I assume in more civilized places than where I come from people are much more convincingly taught that personal taste "doesn't matter" (and simultaneously is the only thing that matters; see points 1-4... I guess that's what makes some people believe curating AI, i.e. "prompt engineering" can be a real job and not just boil down to you being the stochastic parrot's accountability sink?)

I'm not even sure English even has the notions to point out the concrete issue - I sure don't know 'em.

Ever hear of the strain of thought that says "all metaphysical questions are linguistic paradoxes (and it's self-evidently pointless to seek answers to nonsensical questions)"?

Feels kinda like the same thing, but artificially constructed within the headspace of American anti-intellectualism.

Maybe a correct adversarial reading of the main branding acronym would be Anti-Intelligence.

You know, like bug spray, or stain remover.

But for the main bug in the system; the main stain on the white shirt: the uncomfortable observation that, in the end, some degree of independent thinking is always required to get real things done which produce some real value. (That's antithetical to standard pro-social aversive conditioning, which says: do not, under any circumstance, just put 2 and 2 together; lest you turn from "a vehicle for the progress of civilization" back into a pumpkin)

View on HN · Topics

The code has a stated goal of avoiding leaks, but then the actual implementation becomes broader than that. I see two possible explanations:

* The authors made the code very broad to improve its ability to achieve the stated goal

* The authors have an unstated goal

I think it's healthy to be skeptical but what I'm seeing is that the skeptics are pushing the boundaries of what's actually in the source. For example, you say "says on the tin" that it "pretends to be human" but it simply does not say that on the tin. It does say "Write commit messages as a human developer would" which is not the same thing as "Try to trick people into believing you're human." To convince people of your skepticism, it's best to stick to the facts.

View on HN · Topics

By "says on the tin," I was referring to the name ("undercover mode") and the instruction to "not blow your cover." If pretending to be a human is not the cover here, what is? Additionally, does Claude Code still admit that it's an LLM when this prompt is active as you suggest, or does it pretend to be a human like the prompt tells it to?

View on HN · Topics

Why not? What's wrong with honesty?

View on HN · Topics

I guess I'm sometimes dishonest when it suits me

View on HN · Topics

None of this is really worrying; this is a pattern implemented in a similar way by every single developer using AI to write commit messages after noticing how exceptionally noisy these tools are about self-attribution. Anthropic's views on AI safety and alignment with human interests don't suddenly get thrown out with the bathwater because of leaked internal tooling that is functionally identical to a basic prompt in a mere interface (and not a model). I don't really buy all the forced "skepticism" on this thread tbh.

View on HN · Topics

> It's useful context unless you've gone over the generated code and understand it and it is the same quality as if you wrote it yourself

If this is not the case you should not be sending it to public repos for review at all. It is rude and insulting to expect the people maintaining these repos to review code that nobody bothered to read.

View on HN · Topics

Sometimes code generation is a useful tool, and maybe people have read and reviewed the generator.

The difference here is that the generator is a non-deterministic LLM and you can't reason about its output the same way.

View on HN · Topics

> If those tools are writing the code then in general I do expect that to be included in the PR!

How about a compiler?

View on HN · Topics

>It's useful context unless you've gone over the generated code and understand it and it is the same quality as if you wrote it yourself

I thought the argument was that AI-users were reviewing and understanding all of the code?

View on HN · Topics

I am not against the general use of AI code. Quite simply, my view is that all relevant context for a review should be disclosed in the PR.

AI and humans are not the same as authors of PRs. As an obvious example: one of the important functions of the PR process is to teach the writer about how to code in this project but LLMs fundamentally don't learn the same way as humans so there's a meaningful difference in context between humans and AIs.

If a human takes the care to really understand and assume authorship of the PR then it's not really an issue (and if they do, they could easily modify the Claude messages to remove "generated by Claude" notes manually) but instead it seems that Claude is just hiding relevant context from the reviewer. PRs without relevant context are always frustrating.

View on HN · Topics

Sometimes using AI to code feels closer to a Butterfly than emacs, right?

View on HN · Topics

A whole lot of people find LLM code to be strictly objectionable, for a variety of reasons. We can debate the validity of those reasons, but I think that even if those reasons were all invalid, it would still be unethical to deceive people by a deliberate lie of omission. I don't turn it off, and I don't think other people should either.

View on HN · Topics

well if I know a specific LLM has certain tendencies (eg. some model is likely to introduce off-by-one errors), I would know what to look for in code-review

I mean, of course I would read most of the code during review, but as a human, I often skip things by mistake

View on HN · Topics

Like frying a veggie burger in bacon grease. Just because somebody's beliefs are dumb doesn't mean we should be deliberately tricking them. If they want to opt out of your code, let them.

View on HN · Topics

Never fried one in bacon grease, but they are good with bacon and cheese. I have had more than one restaurant point out that their bacon wasn't vegetarian when ordering, though.

View on HN · Topics

In your view, those who prefer veggie burgers are dumb. Am I misinterpreting?

View on HN · Topics

I've heard similar things before. Frying a veggie burger in bacon grease to sneakily feed someone meat/meat-byproducts who does not want to eat it, like a vegan or a person following certain religious observances. As in, it's not ok to do this even if you think their beliefs are stupid.

View on HN · Topics

In my view, vegans are dumb but it's still unethical to trick them into eating something they ordinarily wouldn't. Does that make sense to you? I am not asking you to agree with me on the merits of veganism, I am explaining why the merits of veganism shouldn't even matter when it comes to the question of deliberately trying to trick them.

View on HN · Topics

Likewise. I don’t mind that people use LLMs to generate text and code. But I want any LLM generated stuff to be clearly marked as such. It seems dishonest and cheap to get Claude to write something and then pretend you did all the work yourself.

View on HN · Topics

The reason I want it to be marked as such is because I review AI code differently than human code - it just makes different kinds of mistakes.

View on HN · Topics

I think the issue is less attribution and more review mode.
If I assume a change was written and checked line-by-line by the author, I review it one way.
If an LLM had a big hand in it, I review it another way.

View on HN · Topics

It's weird, because they should not consider it as their own, but they should take accountability for it.

Ideally, if I contribute to any codebase, what needs to be judged is the resulting code. Is it up to the project's standards? Does the maintainer have design objections?

What tool you use shouldn't matter, be it your IDE or your LLM.

But that also means you should be accountable for it, you shouldn't hide behind "But Claude did this poorly, not me!", I don't care (in a friendly way), just fix the code if you want to contribute.

The big caveat to this is not wanting AI-Generated code for ideological reasons, and well, if you want that you can make your contributors swear they wrote it by themselves in the PR text or whatever.

I'm not really sure how to feel about this, but I stand by my "the code is what matters" line.

View on HN · Topics

Pre-LLM, it was much easier for reviewers to discern that. Now, the AI-generated code can look like it was well thought out by somebody competent, when it wasn't.

View on HN · Topics

> legally speaking.. if you're not sure of the risk- you don't document it.

Ah, so you kinda maybe sorta absolve yourself of culpability (but not really: "I didn't know this was copyrighted material" doesn't grant you copyright), and simultaneously make fixing the potentially compromised codebase (someone else's job, hopefully) 100x harder because the history of which bits might've been copied was never kept.

Solid advice! As ethical as it is practical.

By the same measure, junkyards should avoid keeping receipts on the off chance that the catalytic converters some randos bring in after midnight are stolen property.

Better not document it.

One little trick the legal folks don't want you to know!

View on HN · Topics

That's typical of this site. I hand you a huge volume of evidence explaining why AI generated work cannot be copyrighted. You search for one scrap of text that seems to support your position even when it does not.

You have no idea how bad this leak is for Anthropic because with the copyright office, you have a DUTY TO DISCLOSE any AI generated work, and it is fully RETROACTIVE. And what is part of this leak? undercover.ts. https://archive.is/S1bKY Where Claude is specifically instructed to HIDE DISCLOSURE of AI generated work.

That's grounds for the copyright office and courts to reject ANY copyright they MIGHT have had a right to. It is one of the WORST things they could have done with regard to copyright.

https://www.finnegan.com/en/insights/articles/when-registeri...

View on HN · Topics

Not leaking codenames is one thing, but explicitly removing signals that something is AI-generated feels like a pretty meaningful shift.

View on HN · Topics

Where the hell are people getting this idea that it's ok to be deceptive because they are keeping secrets?

No shit they have secrets. I have secrets too. That doesn't make it ok for me to deceive you in any way.

How would you feel if I deceived you and my excuse was "oh I was just trying some new secret technique of mine"?

How did we get to this point where we let enormously powerful companies get away with more than individuals?

View on HN · Topics

I think the motivation is to let developers use it for work without making it obvious they're using AI

View on HN · Topics

I’m more curious how this impacts trust than anything else.

In the span of basically a week, they accidentally leaked Mythos, and then now the entire codebase of CC. All while many people are complaining about their usage limits being consumed quickly.

Individually, each issue is manageable (because it's exciting looking through leaked code). But together, it starts to feel like a pattern.

At some point, I think the question becomes whether people are still comfortable trusting tools like this with their codebases, not just whether any single incident was a mistake.

View on HN · Topics

Idk. This is making leaps. Idc that their tools leaked. I paid $140 for CC the other day even after getting sometimes not 100% uptime on the lower plan. If anything this leak is most in line with Anthropic's ethical model. They're failing upwards in my opinion.

View on HN · Topics

There are probably different reasons for different people. I can definitely see the angle that trying to specifically pretend to not be AI when contributing to open source could be seen as a bad thing due to the open source supply chain attacks, some AI-driven, that we've been having, not to mention the AI-slop PR spam.

But, I also get Anthropic's side that when they're contributing they don't want their internals leaked. If it had been left at that, that's fine, but having it pretend like it's not AI at all rubs me a little bit the wrong way. Why try to hide it?

View on HN · Topics

The undercover mode is the part that should terrify everyone building with agents.

View on HN · Topics

Well, as a general rule, I don't do business with people who lie to me.

You've got a business, and you sent me junk mail, but you made it look like some official government thing to get me to open it? I'm done, just because you lied on the envelope. I don't care how badly I need your service. There's a dozen other places that can provide it; I'll pick one of them rather than you, because you've shown yourself to be dishonest right out of the gate.

Same thing with an AI (or a business that creates an AI). You're willing to lie about who you are (or have your tool do so)? What else are you willing to lie to me about? I don't have time in my life for that. I'm out right here.

View on HN · Topics

Out of curiosity, given two code submissions that are completely identical—one written solely by a human and one assisted by AI—why should its provenance make any difference to you? Is it like fine art, where it’s important that Picasso’s hand drew it? Or is it like an instruction manual, where the author is unimportant?

Similarly, would you consider it to be dishonest if my human colleague reviewed and made changes to my code, but I didn’t explicitly credit them?

View on HN · Topics

Why does the provenance make any difference? Let me increase your options. Option 1: You completely hand-wrote it. Option 2: You were assisted by an AI, but you carefully reviewed it. Option 3: You were assisted by an AI (or the AI wrote the whole thing), and you just said, "looks good, YOLO".

Even if the code is line-for-line identical, the difference is in how much trust I am willing to give the code. If I have to work in the neighborhood of that code, I need to know what degree of skepticism I should be viewing it with.

View on HN · Topics

That's the thing. As someone evaluating pull requests, should you trust the code based on its provenance, or should you trust it based on its content? Automated testing can validate code, but it can't validate people.

ISTM the most efficient and objective solution is to invest in AI more on both sides of the fence.

View on HN · Topics

In the future, that may be fine. We're not in that future yet. We're still at a place where I don't fully trust AI-only code to be as solid as code that is at least thoroughly reviewed by a knowledgeable human.

(Yes, I put "AI-only" and "knowledgeable" in there as weasel words. But I think that with them, it is not currently a very controversial case.)
