Summarizer

LLM Input

llm/fa6df919-50f4-440a-804d-6a9d3e9721d8/topic-4-25803a43-bc75-4911-a20b-3728e57cadd5-input.json

prompt

The following is content for you to summarize. Do not respond to the comments—summarize them.

<topic>
Code Review Burden # Concerns that AI shifts work from enjoyable coding to tedious reviewing of AI output, with questions about maintainability and technical debt accumulation
</topic>

<comments_about_topic>
1. > Instead of manual coding training your time is better invested in learning to channel coding agents

All channelling is broken when the model is updated. Being knowledgeable about the foibles of a particular model release is a waste of time.

> how to test code to our satisfaction

Sure testing has value.

> how to know if what AI did was any good

This is what code review is for.

> Testing without manual review, because manual review is just vibes

Calling manual review vibes is utterly ridiculous. It's not vibes to point out an O(n!) structure. It's not vibes to point out missing cases.

If your code reviews are 'vibes', you're bad at code review

> If we treat AI-generated code like human code that requires a line-by-line peer review, we are just walking the motorcycle.

To fix the analogy you're not reviewing the motorcycle, you're reviewing the motorcycle's behaviour during the lap.

2. > This is what code review is for.

My point is that visual inspection of code is just "vibe testing", and you can't reproduce it. Even you yourself, 6 months later, can't fully repeat the vibe check "LGTM" signal. That is why the proper form is a code test.

3. > But if I didn't already have experience of code review, I'd be limited to vibe-coding (by the original definition, not even checking).

Code review done visually is "just vibe testing" in my book. It is not something you can reproduce, it depends on the context in your head this moment. So we need actual code tests. Relying on "Looks Good To Me" is hand waving, code smell level testing.

We are discussing vibe coding but the problem is actually vibe testing. You don't even need to be in the AI age to vibe test, it's how we always did it when manually reviewing code. And in this age it means "walking your motorcycle" speed, we need to automate this by more extensive code tests.

4. $20 is fine. I used a free trial before Christmas, and my experience was essentially that my code review speed would've prevented me doing more than twice that anyway… and that's without a full time job, so if I was working full time, I'd only have enough free time to review $20/month of Claude's output.

You can vibe code, i.e. no code review, but this builds up technical debt. Think of it as a junior who is doing one sprint's worth of work every 24 hours of wall-clock time when considering how much debt and how fast it will build up.

5. Yep, that’s not a bad approach, either.

I did that a lot initially, it’s really only with the advent of Claude Code integrated with VS Code that I’m learning more like I would learn from a code review.

It also depends on the project. Work code gets a lot more scrutiny than side projects, for example.

6. ‘Why were they long term?’ is what you need to ask. Code has become essentially free in relative terms, both in time and money domains. What stands out now is validation - LLMs aren’t oracles for better or worse, complex code still needs to be tested and this takes time and money, too. In projects where validation was a significant percentage of effort (which is every project developed by more than two teams) the speed up from LLM usage will be much less pronounced… until they figure out validation, too; and they just might with formal methods.

7. As a customer, I don't want to pay for vibe-coded products, because authors also don't have a time (and/or skills) to properly review, debug and fix products.

8. Sure, as long as you don’t expect me to digest it, live with it, and crap it out for you, I see no problem with it.

9. Have you evaluated the maintainability of the generated code? Because that could, of course, start to count in the negative direction over time.

Some of the AI-generated code I've seen has been decent quality, but almost all of it is much more verbose, or simply greater in quantity, than hand-written code is or would be. And that's almost always what you don't want for maintenance...

10. There is no x because LLM performance is non-deterministic. You get slop out at varying degrees of quality, and so your job shifts from writing to debugging.

11. I feel like I can manage the entire stack again - with confidence.

I have less confidence after a session, now I second guess everything and it slows me down because I know the foot-gun is in there somewhere.

For example, yesterday Gemini started adding garbage Unicode, then diagnosed file corruption, which it failed to fix.

And before you reply, yes it's my fault for not adding "URGENT CRITICAL REQUIREMENT: don't add rubbish Unicode" to my GEMINI.md.

12. My problem is that code review has always been the least enjoyable part of the job. It’s pure drudgery, and is mentally taxing. Unless you’re vibe coding, you’re now doing a lot of code review. It’s almost all you’re doing outside of the high-level planning and guidance (which is enjoyable).

I’ve settled on reviewing the security boundaries and areas that could affect data leaks / invalid access. And pretty much scanning everything else.

From time to time, I find it doing dumb things: N+1 queries, mutation, global mutable variables, etc. But for the most part, it does well enough that I don’t need to be too thorough.

However, I wouldn’t want to inherit these codebases without an AI agent to do the work. There are too many broken windows for human maintenance to be considered.

13. Worse, you’re doing code review of poorly written code with random failure modes no human would create, and an increasingly big ball of mud that is unmaintainable over time. It’s just the worst kind of reviewing imaginable. The AI makes an indecipherable mess, and you have to work out what the hell is going on.

14. > My problem is that code review has always been the least enjoyable part of the job.

The article is about personal projects. The need to review the code is usually 10x less :-)

15. For most of my AI uses, I already have an implementation in mind. The prompt is small enough that most of the time, the agent would get it 90% there. In a way, it's basically an advanced autocomplete.

I think this is quite nice cause it doesn't feel like code review. It's more of a: did it do it? Yes? Great. Somewhat? Good enough, i can work from there. And when it doesn't work, I just scrap that and re-prompt or implement it manually.

But I do agree with what you say. When someone uses AI without making the code their own, it's a nightmare. I've had to review some PRs where I feel like I'm prompting AI rather than an engineer. I did wonder if they simply put my reviews directly to some agent...

16. Agreed. I've settled on writing the code myself and having AI do the first pass review.

17. AI has increased my productivity in dealing with side tasks in languages/frameworks I'm not familiar with. But it has not made development fun. To the contrary, I enjoy writing code, not reviewing code.

18. Web development may be fun again, but you aren’t developing. You place an order and become a customer.

Maybe you can distinguish good code from bad code, but how long will you keep checking it? Auditing was never the fun part.

And I bet at some point you will recognize a missing feeling of accomplishment because you didn’t figure out the how, you just ordered the what.

We wouldn’t call someone a painter who lets AI do the painting.

19. Except to me it feels more like AI is painting while I have to do the chores
</comments_about_topic>

Write a concise, engaging paragraph (3-5 sentences) summarizing the key points and perspectives in these comments about the topic. Focus on the most interesting viewpoints. Do not use bullet points—write flowing prose.

topic

Code Review Burden # Concerns that AI shifts work from enjoyable coding to tedious reviewing of AI output, with questions about maintainability and technical debt accumulation

commentCount

19
