Summarizer

LLM Input

llm/3a862c31-848e-4e32-be93-99402d2b43b6/topic-19-4556b9a7-9a6f-4c29-967b-5e667f0984bb-input.json

prompt

You are a comment summarizer. Given a topic and a list of comments tagged with that topic, write a single paragraph summarizing the key points and perspectives expressed in the comments.

TOPIC: Goodhart's Law and Metrics Gaming

COMMENTS:
1. What about the second order effects?

Ignoring the customers becomes a habit, which doesn’t lead to success.

But then, caving to each customer demand will make the solution overfit.

Somewhere in there one has to exercise judgement.

But how does one make judgment a repeatable process? Feedback is rarely immediate in such tradeoffs, so promotions go to people who are capable of showing some metric going up, even if the metric is shortsighted. The repeatable outcome of this process is mediocracy. Which, surprisingly enough, works out on average.

2. Steve Jobs has a bunch of videos on creating products- https://youtu.be/Q3SQYGSFrJY

Some person or small team needs to have a vision of what they are crafting and have the skill to execute on it even if users initially complain, because they always do. And the product that is crafted is either one customers want or don’t. But without a vision you’re just a/b testing your way to someone else replacing you in the market with something visionary.

3. My favorite is the first one, "The best engineers are obsessed with solving user problems." and what I hate about it is that it is super hard to judge someone's skill at it without really working with him/her for a very long time. It is far easier said than done. And it is super hard to prove and sell when everybody is looking for easily assessable skills.

4. I think you can also learn from users when they complain en masse about the current atrocious state of software quality. But I guess that doesn't show up in telemetry. Until it does. Looking at you, Microsoft!

5. Agree that this can be an issue but to clarify, I was finding bugs or missed outages, not gathering feature requests or trying to do product dev. Think "I clicked the button and got a 500 Server Error". I don't think random devs should try and decide what features to work on by reading user forums - having PMs decide that does make sense as long as the PM is good. However, big tech PMs too often abstract the user base behind metrics and data, and can miss obvious/embarrassing bugs that don't show up in those feeds. The ground truth is still whether users are complaining. Eng can skip complaints about missing features/UI redesigns or whatever, but complaints about broken stuff in prod need their attention.

6. I seem to recall sitting in weekly abuse team meetings where one of the metrics was the price of a Google account on the black market. So at least some of these things were tracked and not just by one individual.

7. "data-driven agile"™

8. >What I learned was:

>• Almost nobody else in engineering did this.

>• I was considered weird for doing it.

>• It was viewed negatively by managers and promo committees.

>• An engineer talking directly to users was considered especially weird and problematic.

>• The products did always have serious bugs that had escaped QA and monitoring

Sincerely, thank you for confirming my anecdotal but long-standing observations. My go-to joke about this is that Google employees are officially banned from even visiting user forums. Because otherwise, there is no other logical explanation why there are 10+ year old threads where users are reporting the same issue over and over again, etc.

Good engineering in big tech companies (I work for one, too) has evaporated and turned into Promotion Driven Development.

In my case: write shitty code, cut corners, accumulate tech debt, ship fast, get promo, move on.

9. PM is a fake job where the majority have long learned that they can simply (1) appease leadership and (2) push down on engineering to advance their career. You will notice this does not actually involve understanding or learning about products.

It's why the GP got that confused reaction about reading user reports. Talk to someone outside the big company who has no power? Why?

10. It's not just Google, the UX is degrading in... Well everything. I think it's because companies are in a duopoly, monopoly, etc. position.

They only do what the numbers tell them. Nothing else and UX just does not matter anymore.

It's like those gacha which make billions. Terrible games, almost zero depth, but people spend thousands in them. Not because they are good, but because they don't have much choice (a similar game without gacha) and part of the game loop is made for addiction and built around numbers.

11. > I just tested this out and I don't think that's a particularly good example of bad UI/UX

Luckily for both you and me, we don't have to rely on our feelings of what is good UX or not. There are concrete UX methodologies such as Hierarchical Task Analysis or Heuristic Evaluation. These allow us to evaluate concrete KPIs, such as the number of steps and levels of navigation required for an action, in order to evaluate just how good or bad (or, better said, how complicated) a UX design is.

Let's say we apply the HTA. Starting from the top of your navigation level when you want to execute the task, count the number of operations and various levels of navigation you have to go through with the new design, compared to just clicking and correcting the e-mail address in-place? How much time does it take you to write your e-mail in both cases? How many times do you have to switch back and forth between the main interface and the context menu Google kindly placed for us?
Now, phase out of your e-mai

12. > And material UI is still the worst of all UIs

I'm not sure how that got approved either, but at least we now know what would happen if a massive corporation created a UI/UX toolkit, driven only by quantitative analytics making every choice for how it should be, seemingly without any human oversight. Really is the peak of the "data-driven decisions above all" era.

13. > quantitative analytics making every choice for how it should be, seemingly without any human oversight

the root of all evil right there...

14. Oh, I have no doubt they are at Google. I was just trying to say that the author was not really making a commentary on UX directly. The author was trying to make the point that understanding what sort of products and problems users have is a valid long term strategy for solving meaningful problems and attaching yourself to projects, within Google, that are more likely to yield good results. And if you, yourself, are doing this within Google it benefits you directly. A lot of arguments win and die on data, so if you can make a data driven argument about how users are using a system, or what the ground reality of usage in a particular system is and can pair that with anecdotal user feedback it can take you a long way to steering your own, and your org's work, towards things that align well with internal goals and/or help reset and re-prioritize internal goals.

15. > 15. When a measure becomes a target, it stops measuring.

This is Goodhart's law - "When a measure becomes a target, it ceases to be a good measure" [1].

[1] https://en.wikipedia.org/wiki/Goodhart%27s_law

16. Right, this annoyed me too - it was stated w/o attribution as if novel.

What is the name of the law when someone writes a think piece of "stuff I've learned" and fails to cite any of it to existing knowledge?

Makes me wonder if (A) they do know it's not their idea, but they are just cool with plagiarism or (B) they don't know it's not their idea.

17. There are many big bosses under the Google CEO that lead hordes of developers to specific targets-to-meet. Eventually they prioritise their bonuses and the individual goals deviate with every iteration. So the quality will diminish continuously.

18. I'm going to pick out 3 points:

> 2. Being right is cheap. Getting to right together is the real work

> 6. Your code doesn’t advocate for you. People do

> 14. If you win every debate, you’re probably accumulating silent resistance

The common thread here is that in large organizations, your impact is largely measured by how much you're liked. It's completely vibes-based. Stack ranking (which Google used to have; not sure if it still does) just codifies popularity.

What's the issue with that? People who are autistic tend to do really badly through no fault of their own. These systems are basically a selection filter for allistic people.

This comes up in PSC ("perf" at Meta, "calibration" elsewhere) where the exact same set of facts can be constructed as a win or a loss and the only difference is vibes. I've seen this time and time again.

In one case I saw a team of 6 go away and do nothing for 6 months then come back and shut down. If they're liked, "we learned a lot". If they're

19. It's funny that I agree with most or all of these principles but don't feel like my 10 years at Google accord with most of this. I wouldn't say I learned these things at Google, but learned them before (and a bit after) and was continually frustrated about how many of them were not paid attention to at Google at all?

Incentive structure inside Google is impaired.

I do think Google engineering culture does bias against excessive abstraction and for clean readable code and that's good. But acting in the user's interest, timely shipping, etc... not so much.

20. This has to be the 50th or 100th version of this article that repeats the same thing

Every single point in this article was already explicitly described between roughly 1968 and 1987: Brooks formalized coordination cost and the fallacy of adding manpower in The Mythical Man-Month

Conway showed that system architecture inevitably mirrors organizational communication structure in 1968

Parnas defined information hiding and modularity as organizational constraints, not coding style, in 1972

Dijkstra *repeatedly warned* that complexity grows faster than human comprehension and cannot be managed socially after the fact

None of this is new, reframed, or extended here; it is a faithful re-enumeration of half-century-old constraints.

These lists keep reappearing because the problem we refuse to solve is the structural one: none of these constraints are enforceable inside modern incentive systems.

So almost like clockwork somebody comes out of nowhere saying hey I’ve observed these things that a

Write a concise, engaging paragraph (3-5 sentences) that captures the main ideas, notable perspectives, and overall sentiment of these comments regarding the topic. Focus on the most interesting and representative viewpoints. Do not use bullet points or lists - write flowing prose.

topic

Goodhart's Law and Metrics Gaming

commentCount
