llm/5daab79e-f20f-476c-ab87-82c7ff678250/batch-15-1352d18e-840c-4172-a049-8f30426c2433-input.json
You are a comment classifier. Given a list of topics and a batch of comments, assign each comment to up to 3 of the most relevant topics.
TOPICS (use these 1-based indices):
1. Toxic moderation culture
2. LLMs replacing Stack Overflow
3. Duplicate question closures
4. Community hostility toward newcomers
5. Question quality standards
6. Knowledge base vs help forum debate
7. Future of LLM training data
8. Reddit and Discord as alternatives
9. Gamification and reputation systems
10. Outdated answers problem
11. SO sale to private equity
12. Google search integration decline
13. Expert knowledge preservation
14. GitHub Discussions adoption
15. Elitist gatekeeping behavior
16. Human interaction loss
17. Question saturation theory
18. Moderator power dynamics
19. AI-generated content concerns
20. Community decline timeline
COMMENTS TO CLASSIFY:
[
{
"id": "46485165",
"text": "Ideally, you'd train them on the core documentation of the language or tool itself.\n\nHopefully, LLMs lead to more thorough documentation at the start of a new language, framework, or tool. Perhaps to the point of the documentation being specifically tailored to read well for the LLM that will parse and internalize it.\n\nMost of what StackOverflow was was just a regurgitation of knowledge that people could acquire from documentation or research papers. It obviously became easier to ask on SO than dig through documentation. LLMs (in theory) should be able to do that digging for you at lightning speed.\n\nWhat ended up happening was people would turn to the internet and Stack Overflow to get a quick answer and string those answers together to develop a solution, never reading or internalizing documentation. I was definitely guilty of this many times. I think in the long run it's probably good that Stack Overflow dies."
}
,
{
"id": "46483233",
"text": "I still would like to get other humans' experiences and perspectives when it comes to solving some problems, I hope SO doesn't go away entirely.\n\nWith LLMs, at least in my experience, they'll answer your question best they can, just as you asked it. But they won't go the extra step to make assumptions based on what they think you're trying to do and make recommendations. Humans do that, and sometimes it isn't constructive at all like \"just use a different OS\", but other times it could be \"I don't know how to solve that, but I've had better lack with this other library/tool\"."
}
,
{
"id": "46482792",
"text": "IMHO Good Riddance to such a toxic community."
}
,
{
"id": "46483990",
"text": "Before writing the comment I had in my head I did a CTRL+F search for \"toxic\" in the comment section here. 42 occurences. It says everything about what's happening to SO."
}
,
{
"id": "46487598",
"text": "When you see AI giving you back various coding snippets almost verbatim from SO, it really makes you wonder what will happen in the future with AI when it can't depend on actual humans doing the work first."
}
,
{
"id": "46483063",
"text": "I'm glad I learned how to program when you could coax useful answers from Google searches.\n\nWhenever a Stack Overflow result comes up now the answer is years old and wrong, you might as well search archive.org."
}
,
{
"id": "46487677",
"text": "I certainly use it less now that I get a CloudFlare check every time I go and sometimes it fails or loads forever.\nI usually just go back to search results and look elsewhere after a second or two."
}
,
{
"id": "46486583",
"text": "I recently wrote a blog post similar to this situation: https://ertu.dev/posts/ai-is-killing-our-online-interaction/"
}
,
{
"id": "46485027",
"text": "SO has been a curse on technology. I've met teams of people who decide whether to adopt some technology based solely on if they can find SO answers for it. They refuse to read documentation or learn how the technology works; they'll only google for SO answers, and if the answer's not there, they give up. There's an entire generation like this now."
}
,
{
"id": "46483067",
"text": "Between 2017 and 2022 (pre-LLM), it appears to show a clear downward trend, ignoring the covid surge. Any ideas why this might be?\n\nThe query also filters to PostTypeId = 1, what does this refer to?"
}
,
{
"id": "46483221",
"text": "Incompetent moderation and the air of hostility towards contributing users."
}
,
{
"id": "46483668",
"text": "PostTypeId = 1 means \"only select questions.\"\n\n2 would be answers.\n\nThere is a bunch more of further post types: https://meta.stackexchange.com/a/2678"
}
,
{
"id": "46484160",
"text": "Good times. Although, I have to say, I was getting sick of SO before the LLM age. Modding felt a bit tyrannical, with a fourth of all my questions getting closed as off topic, and a lot of aggressive comments all around the site (do your homework, show proof, etc.)\n\nBack when I was an active member (10k reputation), we had to rush to give answers to people, instead of angrily down voting questions and making snark comments."
}
,
{
"id": "46483656",
"text": "People are mentioning the politicization of moderation. But also don’t forget when Joel broke the rules to use the site to push his personal political agenda."
}
,
{
"id": "46483162",
"text": "Interesting timing. I just analyzed TabNews (Brazilian dev community) and ~50% of 2025 posts mention AI/LLMs. The shift is real.\nThe 2014 peak is telling. That's before LLMs, before the worst toxicity complaints. Feels like natural saturation, most common questions were already answered. My bet, LLMs accelerated the decline but didn't cause it. They just made finding those existing answers frictionless."
}
,
{
"id": "46482789",
"text": "It's unfortunate that SO hasn't found a way to leverage LLMs. Lots of questions benefit from some initial search, which is hard enough that moderators likely felt frustrated with actual duplicates, or close enough duplicates, and LLMs seem able to assist. However I hope we don't lose the rare gem answers that SO also had, those expert responses that share not just a programming solution but deeper insight."
}
,
{
"id": "46483744",
"text": "I think that SO leveraging LLMs implicitly. Like I'll always ask LLM first, that's the easiest option. And I'll only come to SO if LLM fails to answer."
}
,
{
"id": "46486445",
"text": "There will be a generation of coders that will never have heard of stack overflow."
}
,
{
"id": "46483005",
"text": "How is Experts Exchange doing?"
}
,
{
"id": "46484567",
"text": "I am surprised at the amount of hate for Stack Overflow here. As a developer I can't think of a single website that has helped me as much over the last ten years.\n\nIt has had a huge benefit for the development community, and I for one will mourn its loss.\n\nI do wonder where answers will come from in the future. As others have noted in this thread, documentation is often missing, or incorrect. SO collected the experiences of actual users solving real problems. Will AI share experiences in a similar way? In principle it could, and in practice I think it will need to. The shared knowledge of SO made all developers more productive. In an AI coded future there will need to be a way for new knowledge to be shared."
}
,
{
"id": "46484043",
"text": "While SO is mostly dead, narrower stackexchange communities may be very much alive. E.g. the Emacs community is responsive."
}
,
{
"id": "46487094",
"text": "So it seems all the questions have now been answered– Great!"
}
,
{
"id": "46483416",
"text": "There's no doubt that generally LLMs are better. In addition SO had its issues. That being said I can't help but worry about losing humans asking questions and humans answering questions. The sentimentality aside, if humans aren't posing questions and if humans aren't recommending answers, what are the models going to use?"
}
,
{
"id": "46484937",
"text": "Wonder if this is a good proxy for '# of Google Searches'. Or perhaps a forward indicator (sign of things to come), since LLMs are adopted by the tech-savvy first, then the general public a little later, so Stack Overflow was among the first casualties."
}
,
{
"id": "46487053",
"text": "Someone needs to archive the entirety of StackOverflow and make it available over torrent so that it can be preserved when the site shuts down. Urgently."
}
,
{
"id": "46487135",
"text": "https://archive.org/details/stackexchange\nFound it"
}
,
{
"id": "46488397",
"text": "https://archive.org/details/stackexchange_20250930\n\n> As of (and including) the 2025-06-30 data dump, Stack Exchange has started including watermarking/data poisoning in the data. At the time of writing, this does not appear to apply to the 2025-09-30 data dump. The format(s), the dates for affected data dumps, and by extension how the garbage data can be filtered out, are described in this community-compiled list: https://github.com/LunarWatcher/se-data-dump-transformer/blo... . If the 2025-09-30 data dump turns out to be poisoned as well, that's where an update will be added. For obvious reasons, the torrent cannot be updated once created."
}
,
{
"id": "46483207",
"text": "While I generally agree with the narrative of the negative arc that stack overflow took, I found (and have as recently as a few months ago) that I could have enjoyable interactions on the math, Ux, written language, and aviation exchanges. The OS ones in the middle (always found the difference between Linux and superuser confusing)."
}
,
{
"id": "46482805",
"text": "For me, my usage of SO started declining as LLMs rose. Occasionally I still end up there, usually because a chat response referenced a SO thread. I was willing to put up with the toxicity as long as the site still had technical value for me.\n\nBut still, machines leave me wanting. Where do people go to ask real humans novel technical questions these days?"
}
,
{
"id": "46483775",
"text": "> Where do people go to ask real humans novel technical questions these days?\n\nI don't think such generic place exists. I just do my own research or abandon the topic. I think that in big companies you probably could use some internal chats or just ask some smart guy directly? I don't have that kind of connections and all online communities are full of people whose skill is below mine, so it makes little sense to ask something. I still do sometimes, but rarely receive competent answer.\n\nIf you have some focused topic like a question about small program, of course you can just use github issues or email author directly. But if you have some open question, probably SO is the only generic platform out there.\n\nTo put it differently, find some experts and ask which online place to the visit to help strangers. Most likely they just don't do it.\n\nSo for me, personally, LLMs are the saviour. With enough forth and back I can research any topic that doesn't require very deep expertise. Sure, acc"
}
,
{
"id": "46483625",
"text": "Find the relevant discord and search."
}
,
{
"id": "46486709",
"text": "It's amazing to think that in the next few years, we may have software engineers entering the workforce who don't know what StackOverflow is..."
}
,
{
"id": "46484623",
"text": "On what will the LLMs train, now?"
}
,
{
"id": "46486040",
"text": "On the same 14 year old Java questions like the rest of us."
}
,
{
"id": "46488379",
"text": "user chat logs clearly. They are not much diffent than the SO Q&A format."
}
,
{
"id": "46483130",
"text": "I think the bigger point we should realize is LLMs offer the EXACT same thing in a better way. Many people are still sharing answers to problems but they do it through an AI which then fine tunes on it and now that problem solution is shared with EVERYONE.\n\nFar better method of automated sharing of content"
}
,
{
"id": "46482670",
"text": "There are still airgapped places in the world where transferring information to offsite LLMs is expressly forbidden, but the offline LLMs available perform so terribly that they’re not worth using. An SO type application can be immensely helpful for engineering teams working in these environments."
}
,
{
"id": "46482697",
"text": "But surely transmitting information to actual SO is just as forbidden?\n\nAnd if you're making an internal-only site, it doesn't really need to be name-brand SO."
}
,
{
"id": "46482947",
"text": "Stack overflow was useful with a fairly sanitized search like “mysql error 1095”. Agentic LLMs do there best work when able to access your entire repository or network environment for context, which is impossible to sanitize. For a season, private environments will continue to be able to use SO. But as LLMs capture all the good questions and keep them private, public SO will become less and less relevant. It’s sad to see a resource of this class go."
}
,
{
"id": "46483750",
"text": "I just typed the literal phrase \"mysql error 1095\" into ChatGPT with no context, and it gave an answer that was no worse than SO for the same search.\n\nNo need to give it anything about my repository, network environment, or even a complete sentence."
}
,
{
"id": "46482886",
"text": "https://stackoverflow.co/internal/"
}
,
{
"id": "46484337",
"text": "Surprising to see it bottom out so hard.\n\nI imagine at least some of the leveling off could be due to question saturation. If duplicates are culled (earnestly or overzealously) then there will be a point where most of the low hanging fruit is picked."
}
,
{
"id": "46483087",
"text": "I wonder what the April 2020 spike is about... maybe lockdowns meant people started learning new stuff?"
}
,
{
"id": "46483341",
"text": "I'd still use SO at times if it weren't for how terribly it was managed and moderated. It offers features that LLMs can't, and I actually enjoyed answering questions enough to do it quite often at one time. These days I don't even think about it."
}
,
{
"id": "46487014",
"text": "Looks like they sold right before the end. Wonder whether the AI deals they've struck make up for the difference"
}
,
{
"id": "46485710",
"text": "Stagnation started around 03/2014 and downward trend started around 03/2017. Looking at dates, it doesn't seem like AI caused those trend changes."
}
,
{
"id": "46483391",
"text": "The SO mission is complete. It's now an LLM training set.\n\nThings would be different if we didn't."
}
,
{
"id": "46487655",
"text": "This is the elephant in the room most commenters chose to turn their blind eye to."
}
,
{
"id": "46485195",
"text": "It's a very toxic place, you ask a doubt, and someone will abuse you, down vote you, make you feel you are not for to be a human. Better it's dead."
}
,
{
"id": "46485202",
"text": "Acquired in June 2021 for $1.8 billion usd. Hurts but acquirer Naspers is a prolific tech investor, its stake in TenCent is worth > $150B usd today."
}
]
Return ONLY a JSON array with this exact structure (no other text):
[
{
"id": "comment_id_1",
"topics": [
1,
3,
5
]
}
,
{
"id": "comment_id_2",
"topics": [
2
]
}
,
...
]
Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices
- Only assign topics that are genuinely relevant to the comment
- If no topics match, use an empty array:
{
"id": "...",
"topics": []
}
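As a rough illustration of the output rules above, a reply could be checked along the following lines. This is only a sketch: the function name `validate_response` and the constants are hypothetical, and it assumes that ids outside the batch and repeated ids should be rejected, which the rules above do not state explicitly.

```python
import json

NUM_TOPICS = 20              # the prompt lists topics 1..20
MAX_TOPICS_PER_COMMENT = 3   # "Each comment can have 0 to 3 topics"


def validate_response(raw: str, expected_ids: set[str]) -> list[dict]:
    """Parse a classifier reply and check it against the stated rules.

    Raises ValueError on the first violation; returns the parsed list otherwise.
    """
    try:
        items = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(items, list):
        raise ValueError("reply must be a JSON array")

    seen: set[str] = set()
    for item in items:
        if not isinstance(item, dict) or set(item) != {"id", "topics"}:
            raise ValueError(f"each entry needs exactly 'id' and 'topics': {item!r}")
        comment_id, topics = item["id"], item["topics"]
        # Assumption: ids must come from the batch and appear at most once.
        if comment_id not in expected_ids or comment_id in seen:
            raise ValueError(f"unknown or repeated id: {comment_id!r}")
        seen.add(comment_id)
        if not isinstance(topics, list) or len(topics) > MAX_TOPICS_PER_COMMENT:
            raise ValueError(f"'topics' must be a list of 0 to 3 items: {topics!r}")
        for topic in topics:
            # type(...) is int excludes booleans, which Python treats as ints.
            if type(topic) is not int or not 1 <= topic <= NUM_TOPICS:
                raise ValueError(f"topic index out of range 1..{NUM_TOPICS}: {topic!r}")
    return items
```

A caller would pass the raw reply text together with the set of ids from the batch, and treat a `ValueError` as a signal to retry or discard the reply.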
50