The following is content for you to summarize. Do not respond to the comments—summarize them.

<topic>
Productivity Claims Skepticism # Questions about actual time savings versus perceived productivity. References to studies showing AI sometimes makes developers less productive. Concerns about false progress.
</topic>

<comments_about_topic>
1. Developers should work by wasting lots of time making the wrong thing? I bet if they did a work and motion study on this approach they'd find the classic: "Thinks they're more productive, AI has actually made them less productive." But lots of lovely dopamine from this false progress that gets thrown away!

2. Classic https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...

3. How can you know that the 100k-line plan is not just slop? Just because a plan is elaborate doesn't mean it makes sense.

4. They didn't write the 100k plan lines. The LLM did (99.9% of it at least, or more). Writing 30k by hand would take weeks if not months. LLMs do it in an afternoon.

5. And my weeks or months of work beats an LLM's 10/10 times. There are no shortcuts in life.

6. I have no doubt that it does for many people. But the time/cost tradeoff is still unquestionable. I know I could create what LLMs do for me in the frontend/backend in most cases as well or better - I know that, because I've done it at work for years. But to create a somewhat complex app with lots of pages/features/APIs etc. would take me months if not a year++, since I'd be working on it only on the weekends for a few hours. Claude Code helps me out by getting me to my goal in a fraction of the time. Its superpower lies not only in doing what I know but faster, but also in doing what I don't know. I yield similar benefits at work. I can wow management with LLM-assisted/vibe-coded apps. What previously would've taken a multi-man team weeks of planning and executing, stand-ups, jour fixes, architecture diagrams, etc. can now be done within a single week by myself. For the type of work I do, managers do not care whether I could do it better if I coded it myself. They are amazed, however, that what previously took months can be done in hours nowadays. And I for sure will try to reap the benefits of LLMs for as long as they don't replace me, rather than being idealistic and fighting against them.

7. > What previously would've taken a multi-man team weeks of planning and executing, stand ups, jour fixes, architecture diagrams, etc. can now be done within a single week by myself.

This has been my experience. We use Miro at work for diagramming. Lots of visual people on the team, myself included. Using Miro's MCP, I draft a solution to a problem and have Miro diagram it. Once we talk it through as a team, I have Claude or Codex implement it from the diagram. It works surprisingly well.

> They are amazed however that what has taken months previously, can be done in hours nowadays.

Of course they're amazed. They don't have to pay you for time saved ;)

> reap benefits of LLMs for as long as they don't replace me
> What previously would've taken a multi-man team

I think this is the part that people are worried about. Every engineer who uses LLMs says this. By definition it means that people are being replaced. I think I justify it in that no one on my team has been replaced. But management has explicitly said "we don't want to hire more because we can already 20x ourselves with our current team + LLM." But I do acknowledge that many people ARE being replaced; not necessarily by LLMs, but certainly by other engineers using LLMs.

8. Might be true for you. But there are plenty of top-tier engineers who love LLMs. So it works for some, not for others. And of course there are shortcuts in life. Any form of progress, whether it's cars, medicine, computers, or the internet, is a shortcut in life. It makes life easier for a lot of people.

9. Dunno. My 80k+ LOC personal life planner, with a native Android app and an e-ink display view, still one-shots most features/bugs I encounter. I just open a new instance, let it know what I want, and 5 min later it's done.

10. Both can be true. I have personally experienced both. On some problems AI surprised me immensely with fast, elegant, efficient solutions and problem solving. I've also experienced AI doing totally absurd things that ended up taking multiple times longer than if I had done them manually. Sometimes in the same project.

11. Todos, habits, goals, calendar, meals, notes, bookmarks, shopping lists, finances. More or less that, with Google Calendar integration, Garmin integration (auto-updates workout habits, weight goals), family sharing/gamification, daily/weekly reviews, AI summaries, and more. All built by just prompting Claude for feature after feature, with me writing 0 lines.

12. > That's because it's superstition.

This field is full of it. Practices are promoted by those who tie their personal or commercial brand to them for increased exposure, and adopted by those who are easily influenced and don't bother verifying whether they actually work. This is why we see a new Markdown format every week, "skills", "benchmarks", and other useless ideas, practices, and measurements. Consider just how many "how I use AI" articles are created and promoted. Most of the field runs on anecdata. It's not until someone actually takes the time to evaluate some of these memes that they find little to no practical value in them.[1]

[1]: https://news.ycombinator.com/item?id=47034087

13. We have tests and benchmarks to measure it, though.

14. > But we can predict the outcomes [...] Maybe not 100% of the time

So 60% of the time, it works every time. ... This fucking industry.

15. If it's so smart, why do I need to learn to use it?

16. I don't use plan.md docs either, but I recognise the underlying idea: you need a way to keep agent output constrained by reality. My workflow is more like scaffold -> thin vertical slices -> machine-checkable semantics -> repeat.

Concrete example: I built and shipped a live ticketing system for my club (Kolibri Tickets). It's not a toy: real payments (Stripe), email delivery, ticket verification at the door, frontend + backend, migrations, idempotency edges, etc. It's running and taking money.

The reason this works with AI isn't that the model "codes fast". It's that the workflow moves the bottleneck from "typing" to "verification", and then engineers the verification loop:

- keep the spine runnable early (end-to-end scaffold)
- add one thin slice at a time (don't let it touch 15 files speculatively)
- force checkable artifacts (tests/fixtures/types/state-machine semantics where it matters)
- treat refactors as normal, because the harness makes them safe

If you run it open-loop (prompt -> giant diff -> read/debug), you get the "illusion of velocity" people complain about. If you run it closed-loop (scaffold + constraints + verifiers), you can actually ship faster because you're not paying the integration cost repeatedly. Plan docs are one way to create shared state and prevent drift. A runnable scaffold + verification harness is another.

17. This all looks fine for someone who can't code, but for anyone with even a moderate amount of experience as a developer, all this planning and checking and prompting and orchestrating is far more work than just writing the code yourself. There's no winner for "least amount of code written regardless of productivity outcomes", except for maybe Anthropic's bank account.

18. I really don't understand why there are so many comments like this. Yesterday I had Claude write an audit-logging feature to track all changes made to entities in my app. Yes, you get this for free with many frameworks, but my company's custom setup doesn't have it. It took maybe 5-10 minutes of wall time to come up with a good plan, and then ~20-30 min for Claude to implement, test, etc. That would've taken me at least a day, maybe two. I had 4-5 other tasks going on in other tabs while I waited the 20-30 min for Claude to generate the feature. After Claude generated it, I needed to manually test that it worked, and it did. I then needed to review the code before making a PR. In all, maybe 30-45 minutes of my actual time to add a small feature. All I can really say is... are you sure you're using it right? Have you _really_ invested time into learning how to use AI tools?

19. Trust me, I'm very impressed at the progress AI has made, and maybe we'll get to the point where everything is 100% correct all the time and better than any human could write. I'm skeptical we can get there with the LLM approach, though. The problem is LLMs are great at simple implementation, even large amounts of simple implementation, but I've never seen one develop anything more than trivial correctly. The larger problem is that it's very often subtly but hugely wrong. It makes bad architecture decisions; it breaks things in pursuit of fixing or implementing other things. You can tell it has no concept of the "right" way to implement something. It very obviously lacks the "senior developer insight". Maybe you can resolve some of these with large amounts of planning or specs, but that's the point of my original comment: at what point is it easier/faster/better to just write the code yourself? You don't get a prize for writing the least amount of code when you're just writing specs instead.

20. My experience has so far been similar to the root commenter's: at the stage where you need a long cycle with planning, it's just slower than doing the writing + theory building on my own. It's an okay mental-energy saver for simpler things, but for me the self-review in an actual production-code context is much more draining than writing is. I guess we're seeing the split between people for whom reviewing is easy and writing is difficult, and vice versa.

21. The key part of my comment is "correctly". Does it write maintainable code? Does it write extensible code? Does it write secure code? Does it write performant code? My experience has been it failing most of these. The code might "work", but it's not good for anything more than trivial, well-defined functions (that probably appeared in its training data, written by humans). LLMs have a fundamental lack of understanding of what they're doing, and it's obvious when you look at the finer points of the outcomes. That said, I'm sure you could write detailed enough specs and provide enough examples to resolve these issues, but that's the point of my original comment: if you're just writing specs instead of code, you're not gaining anything.

22. > In all, maybe 30-45 minutes of my actual time to add a small feature

Why would this take you multiple days to do if it only took you 30 min to review the code? Depends on the problem, but if I'm able to review something, the time it'd take me to write it is usually at most 2x more in the worst-case scenario; often it's about equal. I say this because, after having used these tools, most of the speed-ups you're describing come at the cost of me not actually understanding or thoroughly reviewing the code. And this is corroborated by any high-output LLM users: you have to trust the agent if you want to go fast. Which is fine in some cases! But for those of us who have jobs where we are personally responsible for the code, we can't take these shortcuts.

23. I want to be clear: I'm not against any use of AI. It's hugely useful to save a couple of minutes of "write this specific function to do this specific thing that I could write and know exactly what it would look like". That's a great use, and I use it all the time! It's better autocomplete. Anything beyond that is pushing it, at the moment! We'll see, but spending all day writing specs and double-checking AI output is not more productive than just writing correct code yourself the first time, even if you're AI-autocompleting some of it.

24. It strikes me that if this technology were as useful and all-encompassing as it's marketed to be, we wouldn't need four articles like this every week.

25. People are figuring it out. Cars are broadly useful, but there's nuance to how to maintain them, use them well in different terrains and weather, etc.

26. How many millions of articles are there about people figuring out how to write better software? Does something have to be trivial to use to be useful?

27. What I've read is that even with all the meticulous planning, the author still needed to intervene. Not at the end but in the middle; otherwise it would continue building out something wrong, and it's even harder to fix once it's done. It'll cost even more tokens. It's a net negative. You might say a junior might do the same thing, but I'm not worried about that: at least the junior learned something while doing it. They could do it better next time. They know the code and can change it from the middle, where it broke. It's a net positive.

28. This sounds... really slow. For large changes, for sure I'm investing time into planning. But such a rigid system can't possibly be as good as a flexible approach with variable amounts of planning based on complexity.

29. How much time are you actually saving at this point?
</comments_about_topic>

Write a concise, engaging paragraph (3-5 sentences) summarizing the key points and perspectives in these comments about the topic. Focus on the most interesting viewpoints. Do not use bullet points—write flowing prose.