Summarizer

LLM Input

llm/9db4e77f-8dd5-46da-972e-40d33f3399ef/topic-3-8aba26b8-75d1-451d-bb3c-985b9b1fbea8-input.json


prompt

The following is content for you to summarize. Do not respond to the comments—summarize them.

<topic>
Claude Code Product Feedback # User feedback on the Claude Code CLI tool, mentioning specific bugs like terminal flickering and context loss, comparisons to tools like Codex and Cursor, and complaints about reliability and lack of basic features.
</topic>

<comments_about_topic>
1. I think the other issue is that the leading toolchain to get real work done (claude code) is also lacking multi modality generation, specifically imagegen. This makes design work more nuanced/technical. And in general, there's a lot of end-product UI/UX issues that generally require the operator to know their way around products. So while we are truly in a boom of really useful personalized software toolchains (and a new TUI product comes out every day), it will take a while for truly polished B2C products to ramp up. I guarantee 2026 sees a surge.

2. If all of this really worked, Claude Code would not be a buggy, slow, frustratingly limited, and overall poorly written application. It can't even reload a "plugin" at runtime. Something that native code plugin hosts have been doing since plugins existed, where it's actually hard to do.

Claude Plugins are a couple `.md` file references, some `/command` handler registrations, and a few other pieces of trivial state. There's not a lot there, but you have to restart the whole damn app to install or update one.

Plus, there's the **ing terminal refresh bug they haven't managed to fix over the past year. Maybe put a team of 30 code agents on that. If I sound bitter, it's because the model itself is genuinely very good. I've just been stuck for a very long time working with it through Claude Code.

3. Yes, Anthropic's product design is truly bad, as is their product strategy (hey, I know you just subscribed to Claude, but that isn't Claude Code which you need to separately subscribe to, but you get access to Claude Code if you subscribe to a certain tier of Claude, but not the other way around. Also, you need to subscribe to Claude Code with api key and not usage based pricing or else you can't use Claude Code in certain ways. And I know you have Claude and Claude Code access, but actually you can't use Claude Code in Claude, sorry)

4. 50-100 PRs a week but they still can't fix the 'flickering' bug

5. It's all smokes really. Claude Code is an unreliable piece of software and yet one of the better ones in LLM-Coding. ( https://github.com/anthropics/claude-code/issues ). That and I highly suspect it's mostly engineers who are working on it instead of LLMs. Google itself with all its resources and engineers can't come up with a half-decent CLI for coding.

Reminder: The guy works for Claude. Claude is over-hyping LLMs. That's like a Jeweler dealer assistant telling you how Gold chains helped his romantic life.

6. Gemini CLI is decent.

7. Is it?

Yesterday, gemini told me to run this:

echo 'export ANDROID_HOME=/opt/my-user/android-sdk' > ~/.bashrc

Which would have effectively overwritten my whole bashrc config if I had blindly copy-pasted it.

A few minutes later, asking it to create a .gitignore file for the current project - right after generating a private key, it failed to include the private key file to the .gitignore.

I don't see yet how these tools can be labeled as 'major productivity boosters' if you lose basic security and privacy with them...

8. We were discussing the CLI, the output that's on the model.

9. It's interesting to see this sentiment, given there are literal dozens of people I know in person who have no affiliations with Anthropic, living in Tokyo, and rave about Claude Code. It is good. Not perfect, but it does a lot of good stuff that we couldn't do before because of time restrictions.

10. This was extremely useful to read for many reasons, but my favorite thing I learned is that you can “teleport” a task FROM the local Claude Code to Claude Code on the web by prepending your request with “&”. That makes it a “background” task, which I initially erroneously thought was a local background task. Turns out it sends the task and conversation history up to the web version. This allows you to do work in other branches on Claude Code web, (and then teleport those sessions back down to local later if you wish)

11. OpenCode is actually client server architecture. Typically one either runs the TUI or the web interface. I wonder if it would cope ok with running multiple interfaces at once?

Neovim has a decade old feature request for multiple clients to be able to connect to it. No traction alas. Always a great superpower to have, if you can hack it. https://github.com/neovim/neovim/issues/2161

Chrome DevTools Protocol added multiple client support maybe 5 years ago? It's super handy there because automation tools also use the same port. So you couldn't automate and debug at the same time!

That is a really cool ability, to move work between different executors. OpenCode is also super good at letting you open an old session & carry on, so you can switch between. I appreciate the mention; I love the mobile ambient aspect of how Claude Code can teleport this all!!

12. > Neovim has a decade old feature request for multiple clients to be able to connect to it. No traction alas.

Why cram all features into one giant software instead of using multiple smaller pieces of software in conjunction? For the feature you mentioned I just use tmux which is built for this stuff.

Also, OpenCode has been extremely unreliable. I opened a PR about one of the simplest tools ever: `ls`, and they haven't fixed it yet. In a folder, their ls doesn't actually do what you'd expect: it iterates over all files of all folders (200 limit) and shows them to the model...

13. Couple things stand out to me:

1) everyone on the team uses Claude code differently.

2) Claude Code has been around for almost a year and is being built by an entire team, yet doesn't seem to have benefited from this approach. The program is becoming buggier and less reliable over time, and development speed seems indistinguishable from anything else.

3) Everything this person says should be taken with a massive grain of salt considering their various conflicts of interest.

14. (2) isn't my experience at all. It's not 100% bug free but it definitely seems more stable (and faster) than when I first used it last year.

15. The UI flickers rapidly in some cases when I use it in the VSCode terminal. When I first saw this when using Claude Code I imagined it was some vibe code bug that would be worked out quickly. But it's been like 9 months and still every day it has this behavior - to the point that it crashes VSCode! I can only imagine that no one at Anthropic uses VSCode because it really seems insane it's gone this long unfixed.

16. Perhaps it’s a bug in the VS code terminal? I don’t see anything like this in Kitty.

17. > The UI flickers rapidly in some cases

It's the worst experience in tmux! They lectured us about how the roots of the problem go deep, but I don't have this issue with any other CLI agent tool like Codex.

18. The VSCode terminal seems buggy with complex TUI applications in my experience; I had to use the Gemini CLI in a separate terminal because it was brutally slow in the VSC terminal.

That being said, this isn't a huge issue for CC - you can just use the extension, which offers a similar experience.

19. They have a thread on that.

https://x.com/trq212/status/2001439019713073626

I don't have that problem using it on iTerm2 however. I also don't use Tmux with it.

20. I see it in iTerm.

21. Same thing happens to me in long enough sessions in xterm. Anecdotally it's pretty much guaranteed if I continue a session close to the point of context compacting, or if the context suddenly expands with some tool call.

Edit: for a while I thought this was by design since it was a very visceral / graphical way to feel that you're hitting the edge of context and should probably end the session.

If I get to the flicker point I generally start a new session. The flicker point always happens though from what I have observed.

22. That one's definitely annoying, but I suspect that's due to some bad initial design choices (React for a terminal app!) and I think it's definitely better than it used to be.

23. Are you sure it's _not_ VS Code at issue here? I haven't seen this in Ghostty.

24. This is a common problem and you can find reports of it all over X, including from some influencers. Even outside of VSCode.

25. Never seen it on iTerm2.

This "outside of VSCode", was it still with a webview-based terminal?

26. I use the VSCode terminal all day every day. No other app I use in it has this issue, including Codex.

27. OK, so you have the unbearable pain of using a separate terminal app to use the magic thingie that does your programming for you on prompt, and which didn't exist merely 2 years ago.

https://www.youtube.com/watch?v=kBLkX2VaQs4

28. I am a fan of Claude code, I love it, I use it every day. Are you suggesting we’re not allowed to make any critique of anything which has good qualities?

29. No, I'm suggesting that given the context, it's a tiny concession to make...

30. It’s not. I see this constantly. I use Ghostty and Alacritty and usually am in a tmux session

31. Claude Code is fairly simple. But Claude Desktop is a freaking mess, it loses chats when I switch tabs, it has no easy way to auto-extend the context, and it's just slow.

32. Claude’s iOS app ‘unknown errors’ constantly. I have to copy messages for fear of losing them.

33. Yeah same. Also it completely freezes on my iPhone with sufficient code highlighting. It becomes completely unusable until I restart the App, and then breaks once a new message is sent.

34. I also find it odd that despite a whole team of people working on Claude Code with Claude Code, which should make them immensely productive, there are still glaring gaps. Like, why doesn’t Claude Code on Web have the plan mode? The model already knows how to use it, it’s just a UI change.

Normally I would cut them some slack but it doesn’t really make sense, couldn’t someone kick off a PR today and get it done?

35. > 2) Claude Code has been around for almost a year and is being built by an entire team, yet doesn't seem to have benefited from this approach. The program is becoming buggier and less reliable over time, and development speed seems indistinguishable from anything else.

Not my experience at all (macOS Tahoe/iTerm2, no tmux).

Speaking of either Claude Code as a tool, or Claude 4.5 as an LLM used with coding.

36. I tried Claude Code a while back when I decided to give "vibe-coding" a go. That was actually quite successful, producing a little utility that I use to this day, completely without looking at the code. (Well, I did briefly glance at it after completion and it made my eyeballs melt.) I concluded the value of this to me personally was nowhere near the price I was charged so I didn't continue using it, but I was impressed nonetheless.

This brief use of Claude Code was done mostly on a train using my mobile phone's wi-fi hotspot. Since the connection would be lost whenever the train went through a tunnel, I encountered a bug in Claude Code [1]. The result of it was that whenever the connection dropped and came up again I had to edit an internal json file it used to track the state of its tool use, which had become corrupt.

The issue had been open for months then, and still is. The discussion under it is truly remarkable, and includes this comment from the devs:

> While we are always monitoring instances of this error and and looking to fix them, it's unlikely we will ever completely eliminate it due to how tricky concurrency problems are in general.

Claude Code is, in principle, a simple command-line utility. I am confident that (given the backend and model, ofc) I could implement the functionality of it that I used in (generously!) at most a few thousand lines of python or javascript, I am very confident that I could do so without introducing concurrency bugs and I am extremely confident that I could do it without messing up the design so badly that concurrency issues crop up continually and I have to admit to being powerless to fix them all.

Programming is hard, concurrency problems are tricky and I don't like to cast aspersions on other developers, but we're being told this is the future of programming and we'd better get on board or be left behind and it looks like we're being told this by people who, with presumably unlimited access to all this wonderful tooling, don't appear to be able to write decent software.

[1] https://github.com/anthropics/claude-code/issues/6836

37. It would be very interesting to see the outputs of his operations. How productive is one of his agents? How long does it take to complete a task, and how often does it require steering?

I'm a bit of a skeptic. Claude Code is good, but I've had varied results during my usage. Even just 5 minutes ago, I asked CC to view the most recent commit diff using git show. Even when I provided the command, it was doing dumb shit like git show --stat and then running wc for some reason...

I've been working on something called postkit[1], which has required me to build incrementally on a codebase that started from nothing and has now grown quite a lot. As it's grown, Claude Code's performance has definitely dipped.

[1] https://github.com/varunchopra/postkit

38. I'm afraid to ask, but because I've been very happy with Codex 5.2 CLI and I can't imagine Claude Code doing better, why is Claude so loved around here?

Sure, I can spend $20 and figure it out, but I already pay $40/mo for two ChatGPT subs and that's enough to get me through a month.

Should I spend $20 to see for myself?

39. I'm a latecomer to AI but I started using Gemini in June 2025.

Then in December I heard from my co-workers that they were liking Claude better than any other model, and from others online, so I bought myself some Claude for xmas. And I could clearly see that it was better, right away.

That's all I know, only one model to compare with, but the difference was definitely tangible.

40. It codes faster and with more abandon. For good results, mix Claude Code with Codex (preferably high or xhigh reasoning) for reviews.

41. How has Claude Code (as a CLI tool, not the backing models) evolved over the last year?

For me it's practically the same, except for features that I don't need, don't work that well and are context-hungry.

Meanwhile, Claude Code still doesn't know how to jump to a dependency (library's) source to obtain factual information about it. Which is actually quite easy by hand (normally it's cd'ing into a directory or unzipping some file).

So, this wasteful workflow only resulted in vibecoded, non-core features while at the domain level, Claude Code remains overly agnostic if not stupid.

42. How different are Codex and Claude Code from each other?
I have been using Codex for a few weeks doing experiments related to data analysis and training models with some architecture modifications. I wouldn't say I have used it extensively, but so far my experience has been good. The only annoying part has been not being able to use the GPU in Codex without the `--sandbox danger-full-access` flag. Today, I started using Claude Code, and ran similar experiments as in Codex. I find the interface is quite similar to Codex. However, I hit the limit quite quickly in Claude Code. I will be exploring its features further. I would appreciate it if anyone can share their experience of using both tools.

43. Codex is heavily inspired by Claude Code. They aren't that different. They might diverge more in future.

44. They are quite different, Claude Code with Opus 4.5 is miles better than Codex

45. What is it you find makes it much better?

46. Frankly Claude code is painfully slow. To the point I get frustrated.

On large codebases I often find it taking 20+ minutes to do basic things like writing tests.

Way too often people are like it takes 2 minutes for it to do a full pr. Yeah how big is the code base actually.

I also have a coworker who is about 10x more than everyone else. Burning through credits yet he is one of the lowest performers (closing in on around 1k worth of credits a day now).

47. How come that Claude Code isn't open source?

I can't imagine that it has some kind of special access to Anthropic's servers and that it does things an API-user can't do, maybe except for the option to use the Claude.ai credits/quota.

Even their Agent SDK just wraps the `claude`-executable, IIRC.

48. >Most sessions start in Plan mode

I don't get why they added a Plan mode. Even without it, you can just ask claude to "make a plan" from where you can iterate on it.

Also is it really that much faster to type "/commit-push-pr" than it is to type "commit, push and make a pr" ?

49. Having the 5 instances going at once sounds like Google Antigravity.

I haven't used Claude Code too much. One snag I found is the tendency when running into snags to fix them incorrectly by rolling back to older versions of things. I think it would benefit from an MCP server for say Maven Central. Likewise it should prefer to generate code using things like project scaffolding tooling whenever possible.

50. I'm a heavy claude code user but this is starting to smell like BS. There is nothing special in claude code, opus is a good model and with lots of requests it can give good results. There is nothing unique to it.

51. I doubt he’d use Claude code as it is. I’m sure he’d upgrade to think harder and do more iterations and go deeper. Codex for example already does that but could go deeper a bit longer to figure out more.
</comments_about_topic>

Write a concise, engaging paragraph (3-5 sentences) summarizing the key points and perspectives in these comments about the topic. Focus on the most interesting viewpoints. Do not use bullet points—write flowing prose.


