You are a comment classifier. Given a list of topics and a batch of comments, assign each comment to up to 3 of the most relevant topics.
TOPICS (use these 1-based indices):
1. Toxic moderation culture
2. LLMs replacing Stack Overflow
3. Duplicate question closures
4. Knowledge repository vs help desk debate
5. Community decline timeline
6. Discord as alternative platform
7. Future of LLM training data
8. Gamification and reputation systems
9. Expert knowledge preservation
10. Reddit as alternative
11. Question quality standards
12. Moderator power dynamics
13. Google search integration decline
14. Stack Exchange expansion problems
15. Human interaction loss
16. Documentation vs community answers
17. Site mission misalignment
18. New user experience
19. GitHub Discussions alternative
20. Corporate ownership changes
COMMENTS TO CLASSIFY:
[
{
"id": "46483391",
"text": "The SO mission is complete. It's now an LLM training set.\n\nThings would be different if we didn't."
}
,
{
"id": "46487655",
"text": "This is the elephant in the room most commenters chose to turn their blind eye to."
}
,
{
"id": "46487053",
"text": "Someone needs to archive the entirety of StackOverflow and make it available over torrent so that it can be preserved when the site shuts down. Urgently."
}
,
{
"id": "46487135",
"text": "https://archive.org/details/stackexchange\nFound it"
}
,
{
"id": "46488397",
"text": "https://archive.org/details/stackexchange_20250930\n\n> As of (and including) the 2025-06-30 data dump, Stack Exchange has started including watermarking/data poisoning in the data. At the time of writing, this does not appear to apply to the 2025-09-30 data dump. The format(s), the dates for affected data dumps, and by extension how the garbage data can be filtered out, are described in this community-compiled list: https://github.com/LunarWatcher/se-data-dump-transformer/blo... . If the 2025-09-30 data dump turns out to be poisoned as well, that's where an update will be added. For obvious reasons, the torrent cannot be updated once created."
}
,
{
"id": "46486709",
"text": "It's amazing to think that in the next few years, we may have software engineers entering the workforce who don't know what StackOverflow is..."
}
,
{
"id": "46485710",
"text": "Stagnation started around 03/2014 and downward trend started around 03/2017. Looking at dates, it doesn't seem like AI caused those trend changes."
}
,
{
"id": "46482883",
"text": "This is a great example of how free content was exploited by LLMs and used against oneself to an ultimate destruction.\n\nEvery content creator should be terrified of leaving their content out for free and I think it will bring on a new age of permanent paywalls and licensing agreements to Google and others, with particular ways of forcing page clicks to the original content creators."
}
,
{
"id": "46483025",
"text": "What if we filter out all the questions closed as dupes, off topic, etc?"
}
,
{
"id": "46485202",
"text": "Acquired in June 2021 for $1.8 billion usd. Hurts but acquirer Naspers is a prolific tech investor, its stake in TenCent is worth > $150B usd today."
}
,
{
"id": "46487014",
"text": "Looks like they sold right before the end. Wonder whether the AI deals they've struck make up for the difference"
}
,
{
"id": "46483032",
"text": "I suspect a lot of the traffic shift is from Google replacing the top search result, which used to be Stack Overflow for programming questions, with a Gemini answer."
}
,
{
"id": "46483731",
"text": "Everyone here talks about LLMs, but for me, the reason why StackOverflow became totally irrelevant is because of dedicated Discord servers and forums."
}
,
{
"id": "46483188",
"text": "I misread the title at first and thought it was hacker news questions [comments] that were being graphed. That’s what I would be interested in seeing"
}
,
{
"id": "46483173",
"text": "Signs of over-moderation and increasing toxicity on Stack Overflow became particularly evident around 2016, as reflected by the visible plateau in activity.\n\nMany legitimate questions were closed as duplicates or marked off-topic despite being neither. Numerous high-quality answers were heavily edited to sound more \"neutral\", often diluting their practical value and original intent.\n\nSome high-profile users (with reputation scores > 10,000) were reportedly incentivized by commercial employers to systematically target and downvote or flag answers that favored competing products. As a result, answers from genuine users that recommended commercial solutions based on personal experience were frequently removed altogether.\n\nAdditionally, the platform suffers from a lack of centralized authentication: each Stack Exchange subdomain still operates with its own isolated login system, which creates unnecessary friction and discourages broader user participation."
}
,
{
"id": "46482618",
"text": "The result is not surprising! Many people are now turning to LLMs with their questions instead. This explains the decline in the number of questions asked."
}
,
{
"id": "46484391",
"text": "Has AI summarization led to people either getting their answer from a search engine directly, and failing that, just giving up?"
}
,
{
"id": "46485458",
"text": "RTFM"
}
,
{
"id": "46487625",
"text": "TFMs are not a thing anymore. Most of them are merely collections of sparse random dots one might join by sheer luck only, granted no other knowledge of the system being attempted to document."
}
,
{
"id": "46490149",
"text": "I don't know what you are building, but if a thing doesn't have comprehensive docs it doesn't make it into my stack."
}
,
{
"id": "46490306",
"text": "glad I don’t work at any place that would make a professional write this comment"
}
,
{
"id": "46483796",
"text": "StackOverflow cemented my fears of asking questions. Even though there were no results for what I needed, I was too afraid to ask.\n\nGood riddance, now I’m never afraid to ask dumb questions to LLM and I’ve learned a lot more with no stress of judgement."
}
,
{
"id": "46485263",
"text": "While the decline started a decade ago in 2014 and accelerated in 2020, the huge drop since 2023 is remarkable"
}
,
{
"id": "46486196",
"text": "It's funny to see people's new year's resolution to learn how to code in the graph"
}
,
{
"id": "46482852",
"text": "If nobody is on StackOverflow, What will LLM's train on for new problems?"
}
,
{
"id": "46491438",
"text": "GitHub Issues and Discussions + searching the code base, fetching the docs and some reasoning on top. Maybe even firing up a sandbox VM and testing some solutions."
}
,
{
"id": "46495313",
"text": "\"firing up a sandbox VM and testing some solutions\"\n\nIf the LLM can start up a VM and test a solution, to identify a new unique problem, and find its own solution. That would be pretty impressive. I'm not sure they are really to that point. But some AIs are winning the Math Olympiad, so maybe it is happening. I'm sure this is the overall goal."
}
,
{
"id": "46483980",
"text": "Couldn’t have happened to a meaner community"
}
,
{
"id": "46486037",
"text": "You're clearly excluding gaming communities such as DotA2."
}
,
{
"id": "46482607",
"text": "Everything we have done and said on the internet since its birth has just been to train the future AI."
}
,
{
"id": "46482869",
"text": "Maybe the average question will be more \"high level\" now that all simple questions are answered by LLMs ?"
}
,
{
"id": "46483557",
"text": "they pretend like everything is fine at HN too wouldn't surprise me looking similar in the future."
}
,
{
"id": "46491573",
"text": "I fairly recently tried to ask a question on SO because the LLMs did not work for that domain. I’m no beginner to SO, having some 13k points from many questions and answers. I made, in my opinion, a good question, referenced my previous attempts, clearly stating my problem and what I tried to do. Almost immediately after posting I got downvoted, no comments, close suggestions etc. A similar thing happened the last two times I tried this too. I’m not sure what is going on over there now, but whatever that site was many years ago, it isn’t any more. It’s a shame, because it was such a great thing, but now I am disincentivized to use it because I lose points each time I dip my toes back in."
}
,
{
"id": "46485070",
"text": "It was a good idea ruined by the compulsively obtuse and pedantic, not unlike Reddit."
}
,
{
"id": "46483633",
"text": "It was a good 16 year-ish run."
}
,
{
"id": "46484896",
"text": "And still last month one of my questions on SO got closed because it was - \"too broad\".\nI mean it was 2025 and how many very precise software engineering questions are there that any flagship models couldn't answer in seconds?\n\nAlthough I had moderate popularity on SO I'm not gonna miss it; that community had always been too harsh for newcomers. They had the tiniest power, and couldn't handle that well."
}
,
{
"id": "46482955",
"text": "I've never once asked a question on there\nMostly because you can't unless your account has X something-points. Which you get by answering questions.\n\nThis threw me off so much when I got started with programming. Like why are the people who have the most questions, not allowed to ask any...?"
}
,
{
"id": "46483022",
"text": "Are you sure? You can post questions even with a completely new blank account. It's comments that require some reputation, maybe you were thinking about those?"
}
,
{
"id": "46483007",
"text": "You don't need any reputation to ask questions, you only need to create an account."
}
,
{
"id": "46482751",
"text": "They're desperately trying to save it e.g. by introducing \"discussions\" which are just questions that would normally have been closed. The first one I saw, the first reply was \"this should have been a question instead of a discussion\".\n\nLet's never forget that Stackoverflow was killed by its mods. Sure, it needed AI as an alternative so people could actually leave, but the thing that actually pushed them away was the mods."
}
,
{
"id": "46482602",
"text": "Probably similar for google. My first line of search is always chatgpt"
}
,
{
"id": "46482850",
"text": "Good riddance to bad rubbish (TLDR: Questions are now almost never being asked on Stack overflow).\n\nThe most annoying example I can think of (but can’t link to, alas) is when I Googled for an answer to a technical question, and got an annoying Stack Overflow answer which didn’t answer the question, telling the person to just Google the answer."
}
,
{
"id": "46484498",
"text": "Not surprising. It's very often a toxic, unhelpful, stubborn community. I think maybe once or twice in years of use did I ever find it genuinely welcoming and helpful. Frequently instead I thought \"Why should I even bother to post this? It'll just get either downvoted, deleted, or ignored.\""
}
,
{
"id": "46485045",
"text": "End of an era. :-("
}
,
{
"id": "46486096",
"text": "Someone turn off the lights on the way out"
}
,
{
"id": "46484281",
"text": "A death graph.\n\nKind of sad that they ran out of ideas how to fix SO."
}
,
{
"id": "46483337",
"text": "Gee...I wonder why it's almost dead (again)?"
}
,
{
"id": "46488335",
"text": "Clicked on the link and got stopped by cloudflare. Guess I won't be giving them any more traffic either."
}
,
{
"id": "46487419",
"text": "StackOverflow was immediately dead for me the day they declared that AI sellout of theirs.\n\nPathetic thieves, they won't even allow deleting my own answers after that. Not that it would make the models unlearn my data, of course, but I wanted to do so out of principle.\n\nhttps://meta.stackexchange.com/questions/399619/our-partners..."
}
,
{
"id": "46482860",
"text": "I wonder if google search saw a similar hit"
}
]
Return ONLY a JSON array with this exact structure (no other text):
[
{
"id": "comment_id_1",
"topics": [
1,
3,
5
]
}
,
{
"id": "comment_id_2",
"topics": [
2
]
}
,
...
]
Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices
- Only assign topics that are genuinely relevant to the comment
- If no topics match, use an empty array:
{
"id": "...",
"topics": []
}
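The output rules above can be checked mechanically. What follows is a minimal sketch in Python, assuming the classifier's reply arrives as a JSON string. The helper name `validate_classification`, the constants `NUM_TOPICS` and `MAX_TOPICS_PER_COMMENT`, and the checks for unknown or duplicate ids are illustrative assumptions, not part of the original prompt.

```python
import json

NUM_TOPICS = 20             # assumption: matches the 20 topics listed above
MAX_TOPICS_PER_COMMENT = 3  # from the rule "0 to 3 topics"


def validate_classification(reply: str, expected_ids: set[str]) -> list[dict]:
    """Parse a classifier reply and enforce the stated rules (hypothetical helper)."""
    items = json.loads(reply)
    if not isinstance(items, list):
        raise ValueError("reply must be a JSON array")

    seen: set[str] = set()
    for item in items:
        # Each entry must have exactly the two keys shown in the example structure.
        if not isinstance(item, dict) or set(item) != {"id", "topics"}:
            raise ValueError(f"malformed entry: {item!r}")
        comment_id, topics = item["id"], item["topics"]

        # Assumption: ids outside the batch and repeated ids are treated as errors.
        if comment_id not in expected_ids:
            raise ValueError(f"unknown comment id: {comment_id!r}")
        if comment_id in seen:
            raise ValueError(f"duplicate comment id: {comment_id!r}")
        seen.add(comment_id)

        if not isinstance(topics, list) or len(topics) > MAX_TOPICS_PER_COMMENT:
            raise ValueError(f"{comment_id}: expected 0 to 3 topics")
        for topic in topics:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(topic, bool) or not isinstance(topic, int):
                raise ValueError(f"{comment_id}: topic must be an integer")
            if not 1 <= topic <= NUM_TOPICS:
                raise ValueError(f"{comment_id}: topic {topic} is not a 1-based index")
        if len(set(topics)) != len(topics):
            raise ValueError(f"{comment_id}: repeated topic index")
    return items
```

A caller would pass the raw reply together with the set of comment ids from the batch and treat a `ValueError` as a signal to retry or discard the reply.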