You are a comment classifier. Given a list of topics and a batch of comments, assign each comment to up to 3 of the most relevant topics.
TOPICS (use these 1-based indices):
1. Toxic moderation culture
2. LLMs replacing Stack Overflow
3. Duplicate question closures
4. Knowledge repository vs help desk debate
5. Community decline timeline
6. Discord as alternative platform
7. Future of LLM training data
8. Gamification and reputation systems
9. Expert knowledge preservation
10. Reddit as alternative
11. Question quality standards
12. Moderator power dynamics
13. Google search integration decline
14. Stack Exchange expansion problems
15. Human interaction loss
16. Documentation vs community answers
17. Site mission misalignment
18. New user experience
19. GitHub Discussions alternative
20. Corporate ownership changes
COMMENTS TO CLASSIFY:
[
{
"id": "46483540",
"text": "It was a more innocent time.\n\nProof: https://web.archive.org/web/19990429180417/http://www.expert..."
}
,
{
"id": "46483083",
"text": "The same is true for reddit imo, it became impossible to post anything to a subreddit way before LLMs"
}
,
{
"id": "46483078",
"text": "Seemed like for every other question, I received unsolicited advice telling me how I shouldn't be doing it this way, only for me to have to explain why I wanted to do it this way (with silence from them)."
}
,
{
"id": "46489035",
"text": "This is called the XY problem https://meta.stackexchange.com/a/66378 . You ask for X, I tell you that what you really want is Y, I bully you, and I become more convinced that you and people that ask for X want Y."
}
,
{
"id": "46483234",
"text": "Oh I love that game! (At least I think it's a game)\n\nYou ask how to do X.\n\nMember M asks why you want to do X.\n\nBecause you want to do Y.\n\nWell!? why do you want to do Y??\n\nBecause Y is on T and you can't do K so you need a Z\n\nWell! Well! Why do you even use Z?? Clearly J is the way it is now recommended!\n\nBecause Z doesn't work on a FIPS environment.\n\n...\n\nCan you help me?\n\n...\n\nI just spent 15 minutes explaining X, Y and Z. Do you have any help?\n\n...(crickets)"
}
,
{
"id": "46483690",
"text": "To be fair, asking why someone wants to do something is often a good question. Especially in places like StackOverflow where the people asking questions are often inexperienced.\n\nI see it all the time professionally too. People ask \"how do I do X\" and I tell them. Then later on I find out that the reason they're asking is because they went down a whole rabbit hole they didn't need to go down.\n\nAn analogy I like is imagine you're organising a hike up a mountain. There's a gondola that takes you to the top on the other side, but you arrange hikes for people that like hiking. You get a group of tourists and they're all ready to hike. Then before you set off you ask the question \"so, what brings you hiking today\" and someone from the group says \"I want to get to the top of the mountain and see the sights, I hate hiking but it is what it is\". And then you say \"if you take a 15 minute drive through the mountain there's a gondola on the other side\". And the person thanks you and goes on their way because they didn't know there was a gondola. They just assumed hiking was the only way up. You would have been happy hiking them up the mountain but by asking the question you realised that they didn't know there was an easier way up.\n\nIt just goes back to first principles.\n\nThe truth is sometimes people decide what the solution looks like and then ask for help implementing that solution. But the solution they chose was often the wrong solution to begin with."
}
,
{
"id": "46483772",
"text": "The well known XY problem[1].\n\nI spent years on IRC, first getting help and later helping others. I found out myself it was very useful to ask such questions when someone I didn't know asked a somewhat unusual question.\n\nThe key is that if you're going to probe for Y, you usually need to be fairly experienced yourself so you can detect the edge cases, where the other person has a good reason.\n\nOne approach I usually ended up going for when it appeared the other person wasn't a complete newbie was to first explain that I think they're trying to solve the wrong problem or otherwise going against the flow, and that there's probably some other approach that's much better.\n\nThen I'd follow up with something like \"but if you really want to proceed down this rrack, this is how I'd go about it\", along with my suggestion.\n\n[1]: https://en.wikipedia.org/wiki/XY_problem"
}
,
{
"id": "46485912",
"text": "It's great when you're helping people one on one, but it's absolutely terrible for a QA site where questions and answers are expected to be helpful to other people going forward.\n\nI don't think your analogy really helps here, it's not a question. If the question was \"How do I get to the top of the mountain\" or \"How do I want to get to the top of the mountain without hiking\" the answer to both would be \"Gondola\"."
}
,
{
"id": "46484389",
"text": "> Especially in places like StackOverflow where the people asking questions are often inexperienced.\n\nExcept that SO has a crystal clear policy that the answer to questions should be helpful for everybody reaching it through search, not only the person asking it. And that questions should never be asked twice.\n\nSo if by chance, after all this dance the person asking the question actually needs the answer to a different question, you'll just answer it with some completely unrelated information and that will the the mandatory correct answer for everybody that has the original problem for any reason."
}
,
{
"id": "46487154",
"text": "Yes exactly. The fact that the \"XY problem\" exists, and that users sometimes ask the wrong question, isn't being argued. The problem is that SO appears to operate at the extreme, taking the default assumption that the asker is always wrong. That toxic level of arrogance (a) pushes users away and (b) ...what you said."
}
,
{
"id": "46488847",
"text": "Which is why LLMs are so much more useful than SO and likely always will be. LLMs do this even. Like trying to write my own queue by scratch and I ask an LLM for feedback I think it’s Gemini that often tells me Python’s deque is better. duh! That’s not the point. So I’ve gotten into the habit of prefacing a lot of my prompts with “this is just for practice” or things of that nature. It actually gets annoying but it’s 1,000x more annoying finding a question on SO that is exactly what you want to know but it’s closed and the replies are like “this isn’t the correct way to do this” or “what you actually want to do is Y”"
}
,
{
"id": "46484056",
"text": ">I see it all the time professionally too. People ask \"how do I do X\" and I tell them. Then later on I find out that the reason they're asking is because they went down a whole rabbit hole they didn't need to go down.\n\nYep. The magic question is \"what are you trying to accomplish?\". Oftentimes people lacking experience think they know the best way to get the results they're after and aren't aware of the more efficient ways someone with more experience might go about solving their problem."
}
,
{
"id": "46485983",
"text": "My heuristic is that if your interlocutor asks follow-up questions like that with no indication of why (like “why do you want to do X?” rather than “why do you want to do X? If the answer is Y, then X is a bad approach because Q, you should try Z instead”) then they are never going to give you a helpful answer."
}
,
{
"id": "46483522",
"text": "How do I add a second spout to this can?\n\n...\n\nWell, the pump at the gas station doesn't fit in my car, but they sold me a can with a spout that fits in my car.\n\n...\n\nIt's tedious to fill the can a dozen times when I just want to fill up my gas tank. Can you help me or not?\n\n...\n\nI understand, but I already bought the can. I don't need the \"perfect\" way to fill a gas tank, I just want to go home."
}
,
{
"id": "46483634",
"text": "My favourite is this disclaimer in the question. lol\n\n> Is there any way to force install a pip python package ignoring all its dependencies that cannot be satisfied?\n\n> (I don't care how \"wrong\" it is to do so, I just need to do it, any logic and reasoning aside...)\n\nhttps://stackoverflow.com/questions/12759761/pip-force-insta..."
}
,
{
"id": "46483350",
"text": "Tbf the problem there is probably FIPS more than anything else."
}
,
{
"id": "46487865",
"text": "If someone is paying you to implement a security vulnerability and you've told them and you don't have liability, you just do it. That's how capitalism works. You do whatever people give you money for."
}
,
{
"id": "46492491",
"text": "I wasn’t referring to vulnerabilities, I was referring to arbitrary silly security theatre controls. But id hate to deal with you professionally. Gross."
}
,
{
"id": "46483289",
"text": "To avoid going insane the mindset should be to produce something useful for future readers."
}
,
{
"id": "46482718",
"text": "Long before LLMs. Setting aside peak-COVID as a weird aberration, question volume has been in decline since 2014 or maybe 2016."
}
,
{
"id": "46482848",
"text": "Stack Overflow would still have a vibrant community if it weren't for the toxic community.\n\nImagine a non-toxic Stack Overflow replacement that operated as an LLM + Wiki (CC-licensed) with a community to curate it. That seems like the sublime optimal solution that combines both AI and expertise. Use LLMs to get public-facing answers, and the community can fix things up.\n\nNo over-moderation for \"duplicates\" or other SO heavy-handed moderation memes.\n\nSomeone could ask a question, an LLM could take a first stab at an answer. The author could correct it or ask further questions, and then the community could fill in when it goes off the rails or can't answer.\n\nYou would be able to see which questions were too long-tail or difficult for the AI to answer, and humans could jump in to patch things up. This could be gamified with points.\n\nThis would serve as fantastic LLM training material for local LLMs. The authors of the site could put in a clause saying that \"training is allowed as long as you publish your weights + model\".\n\nSomeone please build this.\n\nEdit: Removed \"LLMs did not kill Stack Overflow.\" first sentence as suggested. Perhaps that wasn't entirely accurate, and the rest of the argument stands better on its own legs."
}
,
{
"id": "46482897",
"text": "The fact that they basically stopped the ability to ask 'soft' questions without a definite answer made it very frustrating. There's no definitive answer to a question about best practices, but you can't ask people to share their experiences or recommendations."
}
,
{
"id": "46483504",
"text": "They actually added some new question categories a while ago [1]\n\n\"Troubleshooting / Debugging\" is meant for the traditional questions, \"Tooling recommendation\", \"Best practices\", and \"General advice / Other\" are meant for the soft sort of questions.\n\nI have no clue what the engagement is on these sort of categories, though. It feels like a fix for a problem that started years ago, and by this point, I don't really know if there's much hope in bringing back the community they've worked so hard to scare away. It's pretty telling just how much the people that are left hate this new feature.\n\n[1] https://meta.stackoverflow.com/questions/435293/opinion-base..."
}
,
{
"id": "46483736",
"text": "Oh, that's good that they added them. I stopped being active in on the sites a long time ago, so I missed that."
}
,
{
"id": "46482985",
"text": "Fixing loads of LLM-generated content is neither easy nor fun. You'll have a very hard time getting people to do that."
}
,
{
"id": "46483203",
"text": "Hardly.\n\n- A huge number of developers will want to use such a tool. Many of them are already using AI in a \"single player\" experience mode.\n\n- 80% of the answers will be correct when one-shot for questions of moderate difficulty.\n\n- The long tail of \"corrector\" / \"wiki gardening\" / pedantic types fill fix the errors. Especially if you gamify it.\n\nJust because someone doesn't like AI doesn't mean the majority share the same opinion. AI products are the fastest growing products in history. ChatGPT has over a billion MAUs. It's effectively won over all of humanity.\n\nI'm not some vibe coder. I've been programming since the 90's, including on extremely critical multi-billion dollar daily transaction volume infra, yet I absolutely love AI. The models have lots of flaws and shortcomings, but they're incredibly useful and growing in capability and scope -- I'll stand up and serve as your counter example."
}
,
{
"id": "46483261",
"text": "People answer on SO because it's fun. Why should they spend their time fixing AI answers?\n\nIt's very tedious as the kind of mistakes LLMs make can be rather subtle and AI can generate a lot of text very fast. It's a sisyphean taks, I doubt enough people would do it."
}
,
{
"id": "46495182",
"text": "Your points are arguing that the tool would be useful - not that anyone would build it. No one wants to curate what is, essentially, randomly generated text. What an absolute nightmare that would be"
}
,
{
"id": "46495270",
"text": "> essentially, randomly generated text.\n\nYou oversimplified and lost too much precision. Try again?"
}
,
{
"id": "46484533",
"text": "I just think you could save a lot of money and energy doing all this but skipping the LLM part? Like what is supposed to be gained? The moment/act of actual generation of lines of code or ideas, whether human or not, is a much smaller piece of the pie relative to ongoing correction, curation, etc (like you indicate). Focusing on it and saying it intrinsically must/should come from the LLM mistakes the intrinsically ephemeral utility of the LLMs and the arguably eternal nature of the wiki at the same time. As sibling says, it turns it into work vs the healthy sharing of ideas.\n\nThe whole pitch here just feels like putting gold flakes on your pizza: expensive and would not be missed if it wasn't there.\n\nJust to say, I'm maybe not as experienced and wise I guess but this definitely sounds terrible to me. But whatever floats your boat I guess!"
}
,
{
"id": "46485830",
"text": "The community is not \"toxic\". The community is overwhelmed by newcomers believing that they should be the ones who get to decide how the site works (more charitably: assuming that they should be able to use the site the same way as other sites, which are not actually at all the same and have entirely different goals).\n\nI don't know why you put \"duplicates\" in quotation marks. Closing a duplicate question is doing the OP (and future searchers) a service, by directly associating the question with an existing answer."
}
,
{
"id": "46482958",
"text": "> Someone could ask a question, an LLM could take a first stab at an answer. The author could correct it or ask further questions, and then the community could fill in when it goes off the rails or can't answer.\n\nIsn't this how Quora is supposed to operate?"
}
,
{
"id": "46483223",
"text": "Maybe my experience is unique - but Quora seems to be largely filled with adverts-posing-as-answers."
}
,
{
"id": "46483400",
"text": "Quora, sadly, is a good example of enshittification."
}
,
{
"id": "46483004",
"text": "Absolutely 100% this. I've used them on and off throughout the years. The community became toxic, so I took my question to other platforms like Reddit (they became toxic as well) and elsewhere.\n\nMind you, while I'm a relative nobody in terms of open source, I've written everything from emulators and game engines in C++ to enterprise apps in PHP, Java, Ruby, etc.\n\nThe consistent issues I've encountered are holes in documentation, specifically related to undocumented behavior, and in the few cases I've asked about this on SO, I received either no response and downvotes, or negative responses dismissing my questions and downvotes. Early on I thought it was me. What I found out was that it wasn't. Due to the toxic responses, I wasn't about to contribute back, so I just stopped contributing, and only clicked on an SO result if it popped up on Google, and hit the back button if folks were super negative and didn't answer the question.\n\nLater on, most of my answers actually have come from Github,and 95% of the time, my issues were legitimate ones that would've been mentioned if a decent number of folks used the framework, library, or language in question.\n\nI think the tl;dr of this is this: If you can't provide a positive contribution on ANY social media platform like Stack Overflow, Reddit, Github, etc. Don't speak. Don't vote. Ignore the question. If you happen to know, help out! Contribute! Write documentation! I've done so on more than one occasion (I even built a website around it and made money in the process due to ignorance elsewhere, until I shut it down due to nearly dying), and in every instance I did so, folks were thankful, and it made me thankful that I was able to help them. (the money wasn't a factor in the website I built, I just wanted to help folks that got stuck in the documentation hole I mentioned)\n\nEDIT: because I know a bunch of you folks read Ars Technica and certain other sites. 
I'll help you out: If you find yourself saying that you are being \"pedantic\", you are the problem, not the solution. Nitpicking doesn't solve problems, it just dilutes the problem and makes it bigger. If you can't help, think 3 times and also again don't say anything if your advice isn't helpful."
}
,
{
"id": "46483241",
"text": "It doesn't have anything to do with LLMs. It has to do with shifting one's focus from doing good things to making money. Joel did that, and SO failed because of it.\n\nJoel promised the answering community he wouldn't sell SO out from under them, but then he did.\n\nAnd so the toxicity at the top trickled down into the community.\n\nThose with integrity left the community and only toxic, selfcentered people remained to destroy what was left in effort to salvage what little there was left for themselves.\n\nMods didn't dupe questions to help the community. They did it to keep their own answers at the top on the rankings."
}
,
{
"id": "46483455",
"text": "How did Joel sell out? Curious as I’m not aware of any monetary changes. I watched Joel several times support completely brain dead policies in the meta discussions which really set the rules and tone. So my respect there is low."
}
,
{
"id": "46483517",
"text": "He and Jeff made it abundantly clear their mission was to destroy the sex change site because that site was immoral for monetizing the benevolence of the community who answered the questions.\n\n\"Knowledge should be free\" they said. \"You shouldn't make money off stuff like this,\" they said.\n\nPlenty of links and backstory in my other comments."
}
,
{
"id": "46483562",
"text": "They literally sold: https://arstechnica.com/gadgets/2021/06/stack-overflow-sold-..."
}
,
{
"id": "46482758",
"text": "I also wonder if GitHub Discussions was also a (minor) contributing factor to the decline. I recall myself using GitHub Discussions more and more when it came to repo specific issues.\n\nThe timeline also matches:\n\nhttps://github.blog/changelog/2020-12-08-github-discussions-...\n\nhttps://github.blog/news-insights/product-news/github-discus..."
}
,
{
"id": "46482834",
"text": "Do we have any stats for the number of GitHub discussions created each month to compare to this?"
}
,
{
"id": "46485891",
"text": "> legitimate questions being closed for no good reason\n\nThey are closed for good reasons. People just have their own ideas about what the reasons should be. Those reasons make sense according to others' ideas about what they'd like Stack Overflow to be, but they are completely wrong for the site's actual goals and purposes. The close reasons are well documented ( https://meta.stackoverflow.com/questions/417476 ) and well considered, having been exhaustively discussed over many years.\n\n> or being labeled a duplicate even though they often weren’t\n\nI have seen so many people complain about this. It is vanishingly rare that I actually agree with them. In the large majority of cases it is comically obvious to me that the closure was correct. For example, there have been many complaints in the Python tag that were on the level of \"why did you close my question as a duplicate of how to do X with a list? I clearly asked how to do it with a tuple!\" (for values of X where you do it the same way.)\n\n> a generally toxic and condescending culture amongst the top answerers.\n\nOn the contrary, the top answerers are the ones who will be happy to copy and paste answers to your question and ignore site policy, to the constant vexation of curators like myself trying to keep the site clean and useful (as a searchable resource) for everyone.\n\n> For all their flaws, LLMs are so much better.\n\nI actually completely agree that people who prefer to ask LLMs should ask LLMs. The experience of directly asking (an LLM) and getting personalized help is explicitly the exact thing that Stack Overflow was created to get away from (i.e., the traditional discussion forum experience, where experts eventually get tired of seeing the same common issues all the time and all the same failures to describe a problem clearly, and where third parties struggle to find a useful answer in the middle of along discussion)."
}
,
{
"id": "46487260",
"text": "You seem to have filled this thread with a huge number of posts that try to justify SO's actions. Over and over, these justifications are along the lines of \"this is our mission\", \"read our policy\", \"understand us\".\n\nOften, doing what your users want leads to success. Stamping authority over your users, and giving out a constant air of \"we know better than all of you\", drives them away. And when it's continually emphasized publicly (rather than just inside a marketing department) that the \"mission\" and the \"policy\" are infinitely more important than what your users are asking for, that's a pretty quick route to failure.\n\nWhen you're completely embedded in a culture, you don't have the ability to see it through the eyes of the majority on the outside. I would suggest that some of your replies here - trying to deny the toxicity and condescension - are clearly showing this."
}
,
{
"id": "46489978",
"text": "> Often, doing what your users want leads to success.\n\nYou misunderstand.\n\nPeople with accounts on Stack Overflow are not \"our users\".\n\nStack Exchange, Inc. does not pay the moderators, nor high-rep community members (who do the bulk of the work, since it is simply far too much for a handful of moderators) a dime to do any of this.\n\nBuilding that resource was never going to keep the lights on with good will and free user accounts (hence \"Stack Overflow for Teams\" and of course all the ads). Even the company is against us, because the new owners paid a lot of money for this. That doesn't change what we want to accomplish, or why.\n\n> When you're completely embedded in a culture, you don't have the ability to see it through the eyes of the majority on the outside.\n\nI am not \"embedded in\" the culture. I simply understand it and have put a lot of time into its project. I hear the complaints constantly. I just don't care . Because you are trying to say that I shouldn't help make the thing I want to see made.\n\n> trying to deny the toxicity and condescension\n\nI consider the term \"toxicity\" more or less meaningless in general, and especially in this context.\n\nAs for \"condescension\", who are you to tell me what I should seek to accomplish?"
}
,
{
"id": "46493240",
"text": "> \"why did you close my question as a duplicate of how to do X with a list? I clearly asked how to do it with a tuple!\" (for values of X where you do it the same way.)\n\nThis is a great example of a question that should not be closed as a duplicate. Lists are not tuples in Python, regardless of how similar potential answers may be."
}
,
{
"id": "46498531",
"text": "I'm talking here about cases (which is basically all of them) where the first person to ask was simply needlessly specific. Or where the canonical has the list as an incidental detail and the next person insists that the answers won't work because this code has a tuple, you see, and doesn't see the merit in trying them.\n\nIf you imagine that the answer should be re-written from scratch to explain that the approach will be the same, you have fundamentally misunderstood the purpose of the site. Abstraction of contextually unimportant details is supposed to be an essential skill for programmers."
}
,
{
"id": "46482756",
"text": "It seemed to me that pre-llm, google had stopped surfacing stackoverflow answers in search results."
}
,
{
"id": "46482851",
"text": "My memory is there were a spate of SO scraping sites that google would surface above SO and google just would not zap.\n\nIt would have been super trivial to fix but google didn’t.\n\nMy pet theory was that google were getting doubleclick revenue from the scrapers so had incentives to let them scrape and to promote them in search results."
}
,
{
"id": "46483873",
"text": "I remember those too! There were seemingly thousands of them!\n\nReminds me of my most black-hat project — a Wikipedia proxy with 2 Adsense ads injected into the page. It made me like $20-25 a month for a year or so but sadly (nah, perfectly fairly) Google got wise to it."
}
,
{
"id": "46484372",
"text": "I'm actually surprised it was only ~$20 a month."
}
]
Return ONLY a JSON array with this exact structure (no other text):
[
{
"id": "comment_id_1",
"topics": [
1,
3,
5
]
}
,
{
"id": "comment_id_2",
"topics": [
2
]
}
,
...
]
Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices
- Only assign topics that are genuinely relevant to the comment
- If no topics match, use an empty array:
{
"id": "...",
"topics": []
}