llm/7c7e49f1-870c-4915-9398-3b2e1f116c0c/batch-11-9ce7c0d4-3632-4ca4-9f1f-626348d0aaa9-input.json
You are a comment classifier. Given a list of topics and a batch of comments, assign each comment to up to 3 of the most relevant topics.
TOPICS (use these 1-based indices):
1. Toxic moderation culture
2. LLMs replacing Stack Overflow
3. Duplicate question closures
4. Knowledge repository vs help desk debate
5. Community decline timeline
6. Discord as alternative platform
7. Future of LLM training data
8. Gamification and reputation systems
9. Expert knowledge preservation
10. Reddit as alternative
11. Question quality standards
12. Moderator power dynamics
13. Google search integration decline
14. Stack Exchange expansion problems
15. Human interaction loss
16. Documentation vs community answers
17. Site mission misalignment
18. New user experience
19. GitHub Discussions alternative
20. Corporate ownership changes
COMMENTS TO CLASSIFY:
[
{
"id": "46484886",
"text": "Yuck. I don't know if it's just me, but something feels completely off about the GH issue tracker. I don't know if it's the spacing, the formatting, or what, but each time it feels like it's actively trying to shoo me away.\n\nIt's whatever the visual language equivalent of \"low signal\" is."
}
,
{
"id": "46485430",
"text": "Still gh issues are better than some random discord server. The fact that forums got replaced by discord for \"support\" is a net loss for humanity, as discord is not searchable (to my knowledge). So instead of a forum where someone asks a question and you get n answers, you have to visit the discord, and talk to the discord people, and join a wave channel first, hope the people are there, hope the person that knows is online, and so on."
}
,
{
"id": "46485608",
"text": "Yeah, I suspect that a lot of the decline represented in the OP's graph (starting around early 2020) is actually discord and that LLMs weren't much of a factor until ChatGPT 3.5 which launched in 2022.\n\nLLMs have definitely accelerated Stackoverflow's demise though. No question about that. Also makes me wonder if discord has a licensing deal with any of the large LLM players. If they don't then I can't imagine that will last for long. It will eventually just become too lucrative for them to say no if it hasn't already."
}
,
{
"id": "46487690",
"text": "Discord isn’t just used for tech support forums and discussions. There are loads of completely private communities on there. Discord opening up API access for LLM vendors to train on people’s private conversations is a gross violation of privacy. That would not go down well."
}
,
{
"id": "46485960",
"text": "I think most relevant data that provides best answers lives in GitHub. Sometimes in code, sometimes in issues or discussions. Many libs have their docs there as well. But the information is scattered and not easy to find, and often you need multiple sources to come up with a solution to some problem."
}
,
{
"id": "46487612",
"text": "A lot of valuable information lived/lives in email threads that might or might not be publicly archived."
}
,
{
"id": "46484441",
"text": "The second answer cites Lippert's pre-existing blog post on the subject: https://ericlippert.com/2009/11/12/closing-over-the-loop-var...\n\nI agree that there will be some degradation here, but I also think that the developers inclined to do this kind of outreach will still find ways to do it."
}
,
{
"id": "46490927",
"text": "I believe the community has seen the benefit of forums like SO and we won’t let the idea go stale. I also believe the current state of SO is not sustainable with the old guard flagging any question and response you post there. The idea can/should/might be re-invented in an LLM context and we’re one good interface away from getting there. That’s at least my hope."
}
,
{
"id": "46487157",
"text": "I used to look at all TensorFlow questions when I was on the TensorFlow team ( https://stackoverflow.com/tags/tensorflow/info ). Unclear where people go to interact with their users now....Reddit? But the tone on Reddit is kind of negative/complainy"
}
,
{
"id": "46484845",
"text": "I had a similar beautiful experience where an experienced programmer answered one of my elementary JavaScript typing questions when I was just starting to learn programming.\n\nHe didn't need to, but he gave the most comprehensive answer possible attacking the question from various angles.\n\nHe taught me the value of deeply understanding theoretical and historical aspects of computing to understand why some parts of programming exist the way they are. I'm still thankful.\n\nIf this was repeated today, an LLM would have given a surface level answer, or worse yet would've done the thinking for me obliviating the question in the first place.\n\nI wrote a blog post about my experience at https://nmn.gl/blog/ai-and-learning"
}
,
{
"id": "46485759",
"text": "Had a similar experience. Asked a question about a new language feature in java 8 (parallell streams), and one of the language designers (Goetz) answered my question about the intention of how to use it.\n\nAn LLM couldn't have done the same. Someone would have to ask the question and someone answer it for indexing by the LLM. If we all just ask questions in closed chats, lots of new questions will go unanswered as those with the knowledge have simply not been asked to write the answers down anywhere."
}
,
{
"id": "46486570",
"text": "Would you share the link to the answer?"
}
,
{
"id": "46488979",
"text": "https://stackoverflow.com/questions/20375176/should-i-always..."
}
,
{
"id": "46485260",
"text": "You can prompt the LLM to not just give you the answer. Possibly even ask it to consider the problem from different angles but that may not be helpful when you don't know what you don't know."
}
,
{
"id": "46487116",
"text": "For every example of that, there were 999 instances of people having their question closed, criticised, or ignored."
}
,
{
"id": "46483614",
"text": "You can write a paper, submit the arxiv, and you can also make a blog post.\nAt any rate, I agree - SO was (is?) a wonderful place for this kind of thing.\n\nI once had a professor mention that they knew me from SO because I posted a few underhanded tricks to prevent an EKF from \"going singular\" in production. That kind of community is going to be hard to replace, but SO isnt going anywhere, you can still ask a question and answer your own question for permanent, searchable archive."
}
,
{
"id": "46484051",
"text": "I would imagine the endorsement requirement reduces submissions by a few orders of magnitude."
}
,
{
"id": "46484338",
"text": "At this point SO seems harder to publish into than arxiv."
}
,
{
"id": "46484747",
"text": "If you had used the search feature you’d realize that many similar comments have already been posted on HN. Vote to close."
}
,
{
"id": "46485927",
"text": "If only those who voted to close would bother to check whether the dup/close issue was ACTUALLY a duplicate. If only there were (substantial) penalties for incorrectly dup/closing. The vast majority of dup/closes seem to not actually be dup/closes. I really wish they would get rid of that feature. Would also prevent code rot (references to ancient versions of the software or compiler you're interested in that are no longer relevant, or solutions that have much easier fixes in modern versions of the software). Not missing StackOverflow in the least. It did not age well. (And the whole copyright thing was just toxically stupid)."
}
,
{
"id": "46487331",
"text": "I think they should have had some mechanism that encouraged people to help everybody, including POSITIVELY posting links to previously answered questions, and then only making meaningfully unique ones publicly discoverable (even in the site search by default), afterwards. Instead, they provided an incentive structure and collection of rationales that cultivated a culture of hall monitors with martyr complexes far more interested in punitively enforcing the rules than being a positive educational resource."
}
,
{
"id": "46486569",
"text": "Has anyone tried building a modern Stack Overflow that's actually designed for AI-first developers?\nThe core idea: question gets asked → immediately shows answers from 3 different AI models. Users get instant value. Then humans show up to verify, break it down, or add production context.\nBut flip the reputation system: instead of reputation for answers, you get it for catching what's wrong or verifying what works. \"This breaks with X\" or \"verified in production\" becomes the valuable contribution.\nKeep federation in mind from day one (did:web, did:plc) so it's not another closed platform.\nStack Overflow's magic was making experts feel needed. They still do—just differently now."
}
,
{
"id": "46486818",
"text": "Oh, so it wasn't bad enough to spot bad human answers as an expert on Stack Overflow... now humans should spend their time spotting bad AI answers? How about a model where you ask a human and no AI input is allowed, to make sure that everyone has everyone else's full attention?"
}
,
{
"id": "46486869",
"text": "Why disallow AI input? Is it that poor? Surely it isn't."
}
,
{
"id": "46486966",
"text": "The entire purpose of answering questions as an \"expert\" on S.O. is/was to help educate people who were trying to learn how to solve problems mostly on their own. The goal isn't to solve the immediate problem, it's to teach people how to think about the problem so that they can solve it themselves the next time. The use of AI to solve problems for you completely undermines that ethos of doing it yourself with the minimum amount of targeted, careful questions possible ."
}
,
{
"id": "46487398",
"text": "What's the point of AI on a site like that? Wouldn't you just ask an LLM directly if you were fine with AI answers?"
}
,
{
"id": "46488062",
"text": "You're absolutely correct, but the scary thing is this: What happens when a whole generation grows up not knowing how to answer another person's question without consulting AI?\n\n[edit]\nIt seems to me that this is a lot like the problem which bar trivia nights faced around the inception of the smartphone. Bar trivia nights did, sporadically and unevenly, learn how to evolve questions themselves which couldn't be quickly searched online. But it's still not a well-solved problem.\n\nWhen people ask \"why do I need to remember history lessons - there is an encyclopedia\", or \"why do I need to learn long division - I have a calculator\", I guess my response is: Why do we need you to suck oxygen? Why should I pay for your ignorance? I'm perfectly happy to be lazy in my own right, but at least I serve a purpose. My cat serves a purpose. If you vibe code and you talk to LLMs to answer your questions...I'm sorry, what purpose do you serve?"
}
,
{
"id": "46488527",
"text": "I and many others already go the extra mile to ask multiple LLM's for hard questions or for getting a diversity of AI opinions to then internalize and cross check myself.\n\nThere are apps that build up a nice sized user base on this small convenience aded of getting 2 answers at once REF https://lmarena.ai/ https://techcrunch.com/2025/05/21/lm-arena-the-organization-...\n\nAll the major AI companies of course do not want to give you the answers from other AI's so this service needs to be a third party.\n\nBut then beyond that there are hard/niche questions where the AI's are wrong often and humans also have a hard time getting it right, but with a larger discussion and multiple minds chewing the problem one can get to a more correct answer often by process of elimination.\n\nI encountered this recently in a niche non-US insurance project and I basically coded together the above as an internal tool. AI suggestions + human collaboration to find the best answer. Of course in this case everyone is getting paid to spend time with this thing so more like AI first Stack Overflow Internal. I have no evidence that an public version would do well when ppl don't get paid to commend and rate."
}
,
{
"id": "46488941",
"text": "I was making a point elsewhere in this thread that the best way to learn is to teach; and that's why Stack Overflow was valuable for contributors, as a way of honing their skills. Not necessarily for points.\n\nWhat you need to do, in your organization, is to identify the people who actually care about teaching and learning for their own sake , as opposed to the people who do things for money, and to find a way to promote the people with the inclination to learn and teach into higher positions. Because it shows they aren't greedy, they aren't cheating, and they probably will have your organization's best interests at heart (even if that is completely naïve and they would be better off taking a long vacation - even if they are explicitly the people who claim to dislike your organization the most). I am not talking about people who simply complain. I mean people who show up and do amazing work on a very low level, and teach other people to do it - because they are committed to their jobs. Even if they are completely uneducated.\n\nFor me, the only people I trust are people who exhibit this behavior: They do something above and beyond which they manifestly did not need to do, without credit, in favor of the project I'm spending my time on.\n\n>> But then beyond that there are hard/niche questions where the AI's are wrong often and humans also have a hard time getting it right, but with a larger discussion and multiple minds chewing the problem one can get to a more correct answer often by process of elimination.\n\nHumans aren't even good at this, most of the time, but one has to consider AI output to be almost meaningless babble.\n\nMay I say that the process of elimination is actually not the most important aspect of that type of meeting. It is the surfacing of things you wouldn't have considered - even if they are eliminated later in debate - which makes the process valuable."
}
,
{
"id": "46486745",
"text": "Am I reading an AI trying to trick me into becoming its subordinate?"
}
,
{
"id": "46487108",
"text": "In 2014, one benefit of Stack Overflow / Exchange is a user searching for work can include that they are a top 10% contributor. It actually had real world value. The equivalent today is users with extensive examples of completed projects on Github that can be cloned and run. OP's solution if contained in Github repositories will eventually get included in a training model. Moreover, the solution will definitely be used for training because it now exists on Hacker News."
}
,
{
"id": "46488620",
"text": "I had a conversation with a couple accountants / tax-advisor types about them participating in something like this for their specialty. And the response was actually 100% positive because they know that there is a part of their job that the AI can never take 1) filings requires you to have a human with a government approved license 2) There is a hidden information about what tax optimization is higher or lower risk based on their information from their other clients 3) Humans want another human to make them feel good that their tax situation is taken care of well.\n\nBut also many said that it would be better if one wraps this in an agency so the leads that are generated from the AI accounting questions only go to a few people instead of making it fully public stackexchange like.\n\nSo +1 point -1 point for the idea of a public version."
}
,
{
"id": "46487425",
"text": "LOL. As a top 10% contributor on Stack Overflow, and on FlashKit before that, I can assure you that any real world value attached to that status was always imaginary, or at least highly overrated.\n\nMainly, it was good at making you feel useful and at honing your own craft - because providing answers forced you to think about other people's questions and problems as if they were little puzzles you could solve in a few minutes. Kept you sharp. It was like a game to play in your spare time. That was the reason to contribute, not the points."
}
,
{
"id": "46486876",
"text": "Yeah, they didn't even bother to suggest paying you with tokens for the job well done! The audacity!"
}
,
{
"id": "46488445",
"text": "hehe yea this existing of course. like these guys https://yupp.ai/ they have not announced the tokens but there are points and they got all their VC money from web3 VC. I'm sure there are others trying"
}
,
{
"id": "46488647",
"text": "hehe, damn I did let an AI fix my grammer and they promptly put the classic tell of — U+2014 in there"
}
,
{
"id": "46486972",
"text": "That seems like a horrible core idea. How is that different from data labeling or model evaluation?\n\nHuman beings want to help out other human beings, spread knowledge and might want to get recognition for it. Manually correcting (3 different) automation efforts seems like incredible monotone, unrewarding labour for a race to the bottom. Nobody should spend their time correcting AI models without compensation."
}
,
{
"id": "46488553",
"text": "Great point, thanks for the reality check.\n\nSpeaking of evals the other day I found out that most of the people who contributed to Humanities Last Exam https://agi.safe.ai/ got paid >$2k each. So just adding to your point."
}
,
{
"id": "46486747",
"text": "I think this could be really cool, but the tricky thing would be knowing when to use it instead of just asking the question directly to whichever AI. It’s hard to know that you’ll benefit from the extra context and some human input unless you already have a pretty good idea about the topic."
}
,
{
"id": "46486886",
"text": "Presumably over time said AI could figure out if your question had already been answered and in that case would just redirect you too the old thread instead."
}
,
{
"id": "46486910",
"text": "AI is generally setup to return the \"best\" answer as defined as the most common answer, not the rightest, or most efficient or effective answer, unless the underlying data leans that way.\n\nIt's why AI based web search isn't behaving like google based search. People clicking on the best results really was a signal for google on what solution was being sought. Generally, I don't know that LLMs are covering this type of feedback loop."
}
,
{
"id": "46484925",
"text": "thanks for sharing that, it was simple, neat, elegant.\n\nthis sent me down a rabbit hole -- I asked a few models to solve that same problem, then followed up with a request to optimize it so it runs more efficiently.\n\nchatgpt & gemini's solutions were buggy, but claude solved it, and actually found a solution that is even more efficient. It only needs to compute sqrt once per iteration. It's more complex however.\n\nyours claude\n------------------------------\nTime (ns/call) 40.5 38.3\nsqrt per iter 3 1\nAccuracy 4.8e-7 4.8e-7\n\nClaude's trick: instead of calling sin/cos each iteration, it rotates the existing (cos,sin) pair by the small Newton step and renormalizes:\n\n// Rotate (c,s) by angle dt, then renormalize to unit circle\nfloat nc = c + dt*s, ns = s - dt*c;\nfloat len = sqrt(nc*nc + ns*ns);\nc = nc/len; s = ns/len;\n\nSee: https://gist.github.com/achille/d1eadf82aa54056b9ded7706e8f5...\n\np.s: it seems like Gemini has disabled the ability to share chats can anyone else confirm this?"
}
,
{
"id": "46485114",
"text": "Thanks for pushing this, I've never gone beyond \"zero\" shotting the prompt (is it still called zero shot with search?)\n\nAs a curiosity, it looks like r and q are only ever used as r/q, and therefore a sqrt could be saved by computing rq = sqrt((rx rx + ry ry) / (qx qx + qy qy)). The if q < 1e-10 is also perhaps not necessary, since this would imply that the ellipse is degenerate. My method won't work in that case anyway.\n\nFor the other sqrt, maybe try std::hypot\n\nFinally, for your test set, could you had some highly eccentric cases such as a=1 and b=100\n\nThanks for the investigation:)\n\nEdit: BTW, the sin/cos renormalize trick is the same as what tx,ty are doing. It was pointed out to me by another SO member. My original implementation used trig functions"
}
,
{
"id": "46485351",
"text": "Nice, that worked. It's even faster.\n\nyours yours+opt claude\n---------------------------------------\nTime (ns) 40.9 36.4 38.7\nsqrt/iter 3 2 1\nInstructions 207 187 241\n\nEdit: it looks like the claude algorithm fails at high eccentricities. Gave chatgpt pro more context and it worked for 30min and only made marginal improvement on yours, by doing 2 steps then taking a third local step.\n\nhttps://gist.github.com/achille/23680e9100db87565a8e67038797..."
}
,
{
"id": "46485435",
"text": "Haha nice, hanging in there by a thread"
}
,
{
"id": "46487997",
"text": "Consider updating your answer on SO - I know I'll keep visiting SO for answers like these for quite some time. And enjoy the deserved upvotes :)"
}
,
{
"id": "46488352",
"text": "Do you think you can extend it to distance from a point to an ellipsoid?"
}
,
{
"id": "46488760",
"text": "Yes, people have done this"
}
,
{
"id": "46485627",
"text": "I can relate. I used to have a decent SO profile (10k+ reputation, I know this isnt crazy but it was mostly on non low hanging fruit answers...it was a grind getting there). I used to be proud of my profile and even put it in my resume like people put their Github. Now - who cares? It would make look like a dinosaur sharing that profile, and I never go to SO anymore."
}
,
{
"id": "46486778",
"text": "I don't disagree completely by any means, it's an interesting point, but in your SO answer you already point to your blog post explaining it in more detail, so isn't that the answer, you'd just blog about it and not bother with SO?\n\nThen AI finding it (as opposed to already trained well enough on it, I suppose) will still point to it as did your SO answer."
}
]
Return ONLY a JSON array with this exact structure (no other text):
[
{
"id": "comment_id_1",
"topics": [
1,
3,
5
]
}
,
{
"id": "comment_id_2",
"topics": [
2
]
}
,
...
]
Rules:
- Each comment can have 0 to 3 topics
- Use 1-based topic indices
- Only assign topics that are genuinely relevant to the comment
- If no topics match, use an empty array:
{
"id": "...",
"topics": []
}