Observations that LLM evaluation is highly subjective and vibes-based, comparisons to gambling psychology, and criticism of cargo-culting prompt engineering tricks from influencers
The current AI landscape is increasingly dominated by a "vibes-based" culture in which objective benchmarks are often dismissed in favor of subjective anecdotes and "coding voodoo," and unverified claims that models are being secretly "nerfed" circulate frequently. Commenters often compare this to gambling psychology: users behave like bettors on a lucky streak, swapping magical prompt incantations and treating non-deterministic outputs like rigged slot machines. The same atmosphere fuels a "cargo cult" of influencer-driven hype and joke projects that gain unearned prestige through bot-inflated metrics and venture capital interest, potentially masking a plateau in actual model capabilities. The discourse reflects a deep tension between those who see "mass psychosis" in user complaints and those who insist that personal intuition is the only way to measure the "sloppiness" of a model’s reasoning.
72 comments tagged with this topic