Summarizer

Switching to Codex

Multiple users announcing they've switched or are switching to OpenAI's Codex, citing better reliability and consistency despite potentially lower capability. Some praise GPT-5.4 performance.

From the HN discussion of "An update on recent Claude Code quality reports"

Frustrated by significant performance regressions and perceived "gaslighting" regarding the quality of Opus 4.7, many developers are abandoning Anthropic in favor of OpenAI’s Codex and GPT-5.4. While some users still appreciate Claude’s flair for UI design, the prevailing sentiment favors Codex for its superior reliability, logical consistency, and more transparent reasoning traces. This migration is further accelerated by OpenAI’s aggressive enterprise incentives and a general exhaustion with Claude’s high token waste and restrictive usage limits, signaling a shift toward prioritizing predictability over raw, erratic capability.

41 comments tagged with this topic

I’m not familiar with the Claude API but OpenAI has an encrypted thinking messages option. You get something that you can send back but it is encrypted. Not available on Anthropic?
They lost me at Opus 4.7. Anecdotally, OpenAI is trying to get into our enterprise tooth and nail, and has offered unlimited tokens until summer. Gave GPT5.4 a try because of this and honestly I don’t know if we are getting some extra treatment, but running it at extra high effort the last 30 days I’ve barely seen it make any mistakes. At some points even the reasoning traces brought a smile to my face as it preemptively followed things that I had forgotten to instruct it about but were critical to get a specific part of our data integrity 100% correct.
Same here. I was a fervent Claude code user at $200/mo until Opus4.7. Freezing your IDE version is now a thing of the past, the new reality is that we can't expect agentic dev workflows to be consistent and I see too many people (including myself) getting burned by going the single-provider route. On one hand I’m glad to finally see anthropic communicate on this but at this point all I have to say is… time to diversify?
They lost me a little before then - Claude Code's regressions were so very obvious and there's no sign they've learned their lesson in this article or in the comments of those who work on Claude Code on HN. They'll continue to tweak and generally mess around with a product people are using, altering the behaviour without notice in ways that can severely impact use, for months! GPT5.4 has been remarkably consistent and capable, as a replacement. I've cancelled my max plan.
GPT-5.4 was already better than Opus 4.6 on a lot of areas, especially correctness and tricky logic. I’m eager to see if 5.5 is even better.
Extra high burns tokens, I find. I run 5.4 on medium for 90% of the tasks, and on high if I see medium struggling, and it's very focused and makes minimal changes.
Yeah but it also then strikes the perfect balance between being meticulous and pragmatic. Also it pushes back much more often than other models in that mode.
What's your workflow like? I'd be curious to test OpenAI out again but Claude Code is how I use the models. Does it require relearning another workflow?
Isn’t it basically the same thing? You type what you want into the input box and it does what you ask for.
I guess I'm asking if their CLI tool is the same or if it functions different. I've never used anything besides CC so I wouldn't know if it's basically the same thing
I have found Claude to be especially unpredictable. I've mostly switched to GPT-5.4 now - although it's slightly less capable, it's massively more reliable.
Actually, I think their deeper problems are twofold:
- Claude Code is _vastly_ more wasteful of tokens than anything else I've used. The harness is just plain bad. I use pi.dev and created https://github.com/rcarmo/piclaw , and the gaps are huge -- even the models through Copilot are incredibly context-greedy when compared to GPT/Codex.
- 4.7 can be stupidly bad. I went back to 4.6 (which has always been risky to use for anything reliable, but does decent specs and creative code exploration) and Codex/GPT for almost everything.
So there is really no reason these days to pay either their subscription or their insanely high per-token price _and_ get bloat across the board.
so who do you trust and go to? (NotClearlySo)OpenAI?
I went with MiniMax. The token plans are over what I currently need, 4500 messages per 5h, 45000 messages per week for 40$. I can run multiple agents and they don't think for 5-10 minutes like Sonnet did. Also I can finally see the thinking process while Anthropic chose to hide it all from me. I'm using Zed and Claude Code as my harnesses.
I "subconsciously" moved to codex back in mid Feb from CC and it's been so freaking awesome. I don't think it's as good at UI, but man is it thorough and able to gather the right context to find solutions. I use "subconsciously" in quotes because I don't remember exactly why I did it, but it aligns with the degradation of their service so it feels like that probably has something to do with it even though I didn't realize it at the time.
Anthropic definitely takes the cake when it comes to UI related activities (pulling in and properly applying Figma elements, understanding UI related prompts and properly executing on it, etc), and I say this as a designer with a personal Codex subscription.
Codex does better if you ask it to take screenshots and critique its own UI work and iterate. It rarely one-shots something I like but it can get there in steps.
it's been frustrating how bad it is at UI. I'm starting to test out using their image2 for UI and then handing it to codex to build out the images into code and I'm impressed and relieved so far
Codex isn't great at UI, but you might find Gemini is competent enough as an adjunct. I've had some luck with that.
Is Gemini cli not an agentic model? Or are you just saying it's built poorly? Gemini 2.5 didn't really work for me but Gemini 3 seems fairly solid
Gemini fares poorly at tool use, even in its own CLI and even in Antigravity. It gets into a mess just editing source files. It's tragic because it's actually not a bad model otherwise.
It frequently fails to apply its diffs at first but it always succeeds eventually for me. I'm happy with it. I understand it is slower than other models but it also costs barely anything per month.
Self-hosted models are the one true path.
Anecdotally, I know many people who have supplemented Claude with Codex, and are experimenting with models such as GLM 5.1, Kimi, Qwen, etc.
I like chutes because they always use the full weights, and prompts are encrypted with TEE.
> Today we are resetting usage limits for all subscribers.
I asked for this via support, got a horrible corporate reply thread, and eventually downgraded my account. I'm using Codex now as we speak. I could not use Claude any more, I couldn't get anything done. Will they restore my account usage limits? Since I no longer have Max? Is that one week usage restored, or the entire buggy timespan?
Yeah you don't have to convince me. I switched to Codex mid-January in part because of the dubious quality of the tui itself and the unreliability of the model. Briefly switched back through March, and yep, still a mistake. Once OpenAI added the $100 plan, it was kind of a no-brainer.
That’s what we see. It may be (but I wouldn’t know) that some of the other changes not covered here reduced costs on their side without impacting users, improving the viability of their subscription model. Or maybe even improved things for users. I’d really appreciate more transparency on this, and not just when things fail. But I’ve learned my lesson. I’ve been weaning off Claude for a few weeks, cancelled my subscription three weeks ago, let it expire yesterday, and moved to both another provider and a third-party open source harness.
Damn it was real the whole time. I found Opus 4.7 to holistically underperform 4.6, and especially in how much wordiness there is. It's harder to work with so I just switched back to 4.6 + Kimi K2.6. Now GPT 5.5 is here and it's been excellent so far.
Who’s going to pay for the exorbitant number of tokens Claude used without delivering any meaningful outcome? I spent many sessions getting zero results, and when I posted about it on their subreddit, all I got were personal attacks from bots and fanboys. I instantly cancelled my subscription and moved to Codex. Also, it may be a coincidence that the article was published just before the GPT 5.5 launch, and that they then restored the original model while releasing a PR statement claiming it was due to bugs.
Doesn't change anything about opus 4.7 being an absolute buffoon. Even going back to opus 4.6 doesn't feel like the magical period maybe 3-4 weeks ago. Gonna go back to OpenAI.
It's still night and day the difference in quality between chatgpt5.4 and opus 4.7. Heck even on Perplexity where 5.4 is included in Pro vs 4.7 which is behind the max plan or whatever, I will pick sonnet 4.6 over the 5.4 offering and it's consistently better. I don't love Anthropic, I don't have illusions about them as a business. But if a tool is better, it's better.
You aren’t getting the 5.4 experience for code if you’re not using it in the Codex harness
What's the alternative? Are you suggesting other LLM providers don't charge high price? Or that they don't make mistakes? Or that they provide better quality? We're talking about dynamically developed products, something that most people would have considered impossible just 5 years ago. A non-deterministic product that's very hard to test. Yes, Anthropic makes mistakes, models can get worse over time, their ToS change often. But again, is Gemini/GPT/Grok a better alternative?
The consumer surplus is quite high. Even with the regressions in this postmortem, performance was above the models last fall, when I was gladly paying for my subscription and thought it was net saving me time. That said, there is now much better competition with Codex, so there's only so much rope they have now.
Confused as well. I rather supposed Anthropic had some standing for saying no to Trump and being declared a national security threat, but the anger they got, and people leaving again for OpenAI, who gladly said yes to autonomous killing AI, did astonish me a bit. And I also had weird things happening with my usage limits and was not happy about it. But it is still very useful to me - and I only pay for the pro plan.
Cool but I switched to Codex for the time being.
They feel they're in a position to make important trade-off decisions on behalf of the user. "It's just slightly worse, I'll sneak this change in" is not something to be tolerated, whether it actually turns out to be much worse or not. Their adaptive thinking mess has caused a ton of work for me. I know a lot of people are saying Codex is actually better now. I don't agree but I'm switching to it because it's much more reliable.
yesterday CC created a fastapi /healthz endpoint and told me it's the gold standard (with the ending z). today I stopped my max sub and will be trying codex
Too late bro, switched to Codex I’m done with your bullshit.
Honestly, it’s kind of sad that Anthropic is winning this AI race. They are the most anti–open source company, and we should try to avoid them as much as possible. They are all doing it because OpenAI is snatching their customers. And their employees have been gaslighting people [1] for ages. I hope open-source models will provide fierce competition so we do not have to rely on an Anthropic monopoly. [1] https://www.reddit.com/r/claude/comments/1satc4f/the_biggest...