Summarizer

Model Release Acceleration

Observation that AI model releases are accelerating dramatically, multiple frontier models released within days, connection to Chinese New Year timing, and competition between US and Chinese labs


The current explosion of frontier AI releases is being framed as a strategic "game of leapfrog," driven largely by a geopolitical race to dominate the narrative surrounding the Chinese New Year. While some attribute this acceleration to technical breakthroughs in post-training and AI-assisted development, others believe the rapid-fire launches of models like Gemini 3 and GPT-5.3 signify a "fast takeoff" toward the singularity. This unprecedented pace has fueled both skepticism regarding incremental marketing hype and genuine concern that human productivity is being eclipsed as difficult benchmarks are shattered within weeks. Ultimately, these discussions suggest that whether the cause is seasonal competition or a fundamental shift in progress, the barrier between current capabilities and general intelligence is narrowing at an almost absurd rate.

42 comments tagged with this topic

View on HN · Topics
DeepSeek hasn't been SotA in at least 12 calendar months, which might as well be a decade in LLM years
View on HN · Topics
What about Kimi and GLM?
View on HN · Topics
These are well behind the general state of the art (1yr or so), though they're arguably the best openly-available models.
View on HN · Topics
Idk man, GLM 5 in my tests matches opus 4.5 which is what, two months old?
View on HN · Topics
4.5 was never sota
View on HN · Topics
According to the Artificial Analysis ranking, GLM-5 is at #4 after Claude Opus 4.5, GPT-5.2-xhigh and Claude Opus 4.6.
View on HN · Topics
But... there's Deepseek v3.2 in your link (rank 7)
View on HN · Topics
Could it also be that the models are just a lot better than a year ago?
View on HN · Topics
What's the point of denying or downplaying that we are seeing amazing and accelerating advancements in areas that many of us thought were impossible?
View on HN · Topics
Is it me or is the rate of model releases accelerating to an absurd degree? Today we have Gemini 3 Deep Think and GPT 5.3 Codex Spark. Yesterday we had GLM5 and MiniMax M2.5. Five days before that we had Opus 4.6 and GPT 5.3. Then maybe two weeks before that, I think, we had Kimi K2.5.
View on HN · Topics
I think it is because of the Chinese New Year. The Chinese labs like to publish their models around the Chinese New Year, and the US labs do not want to let a DeepSeek R1 (20 January 2025) impact event happen again, so I guess they publish models that are more capable than what they imagine Chinese labs are yet capable of producing.
View on HN · Topics
Singularity or just Chinese New Year?
View on HN · Topics
I guess. Deepseek v3 was released on boxing day a month prior https://api-docs.deepseek.com/news/news1226
View on HN · Topics
And made almost zero impact. It was just a bigger version of DeepSeek V2 and went mostly unnoticed because its performance wasn't particularly notable, especially for its size. It was R1, with its RL training, that made the news and crashed the stock market.
View on HN · Topics
Aren't we saying "lunar new year" now?
View on HN · Topics
I don't think so; there are different lunar calendars.
View on HN · Topics
There are hints this is a preview to Gemini 3.1.
View on HN · Topics
More focus has been put on post-training recently. Where a full model training run can take a month and often requires multiple tries because it can collapse and fail, post-training is done on the order of 5 or 6 days. My assumption is that they're all either pretty happy with their base models or unwilling to do those larger runs, and post-training is turning out good results that they release quickly.
View on HN · Topics
So, yes, for the past couple weeks it has felt that way to me. But it seems to come in fits and starts. Maybe that will stop being the case, but that's how it's felt to me for a while.
View on HN · Topics
It's because of a chain of events: next week is Chinese New Year -> Chinese labs release all their models at once before it starts -> US labs respond with what they have already prepared. Also note that even in US labs a large proportion of researchers and engineers are Chinese, and many celebrate Chinese New Year too. TLDR: Chinese New Year. Happy Year of the Horse, everybody!
View on HN · Topics
Fast takeoff.
View on HN · Topics
They are spending literal trillions. It may even accelerate.
View on HN · Topics
There's more compute now than before.
View on HN · Topics
They are using the current models to help develop even smarter models. Each generation of model can help even more with the next. I don't think it's hyperbolic to say that we may be only a single-digit number of years away from the singularity.
View on HN · Topics
Of course, n-1 wasn't good enough but n+1 will be the singularity, just two more weeks my dudes, two more weeks... rinse and repeat ad infinitum.
View on HN · Topics
Interestingly, the title of that PDF calls it "Gemini 3.1 Pro". Guess that's dropping soon.
View on HN · Topics
I looked at the file name but not the document title (specifically because I was wondering if this is 3.1). Good spot. edit: they just removed the reference to "3.1" from the pdf
View on HN · Topics
I think this is 3.1 (3.0 Pro with the RL improvements of 3.0 Flash). But they probably decided to market it as Deep Think because why not charge more for it.
View on HN · Topics
The Deep Think moniker is for parallel-compute models though, not long-CoT models like the Pro line. It's possible, though, that Deep Think 3 is running 3.1 models under the hood.
View on HN · Topics
That's odd considering 3.0 is still labeled a "preview" release.
View on HN · Topics
I think it'll be 3.1 by the time it's labelled GA. They said after the 3.0 launch that they figured out new RL methods for Flash that the Pro model hasn't benefited from.
View on HN · Topics
The rumor was that 3.1 was today's drop.
View on HN · Topics
Where are these rumors floating around?
View on HN · Topics
One of many https://x.com/synthwavedd/status/2021983382314660075
View on HN · Topics
Huh, so if a China-based lab takes ARC-AGI-2 on the new year, then they can say they had just shy of a solution anyway.
View on HN · Topics
The general-purpose ChatGPT 5.3 hasn't been released yet, just 5.3-codex.
View on HN · Topics
It's a giant game of leapfrog; shift or stretch time out a bit and they all look equivalent.
View on HN · Topics
It’s incredible how fast these models are getting better. I thought for sure a wall would be hit, but these numbers smash previous benchmarks. Anyone have any idea what the big unlock is that people are finding now?
View on HN · Topics
I unironically believe that ARC-AGI-3 will have an introduction-to-solved time of 1 month.
View on HN · Topics
We will see at the end of April, right? It's more of a guess than a strongly held conviction, but I see models improving rapidly at long-horizon tasks, so I think it's possible. A benchmark that can survive a few months (maybe) would be one that genuinely tested long-timeframe continual learning/test-time learning/test-time post-training (idk honestly the differences between these). But I'm not sure how to build such benchmarks. I'm thinking of tasks like learning a language, becoming a master at chess from scratch, or becoming a skilled artist, but where the task is novel enough for the actor to be nowhere close to proficient at the beginning. An example that could be of interest: here is a robot you control, you can make actions and see results... become proficient at table tennis. Maybe another would be: here is a new video game, obtain the best possible 0% speedrun.
View on HN · Topics
Everyone is already at 80% for that one. Crazy that we were just at 50% with GPT-4o not that long ago.
View on HN · Topics
I think I'm finally realizing that my job probably won't exist in 3-5 years. Things are moving so fast now that the LLMs are basically writing themselves. I think the earlier iterations moved slower because they were limited by human ability and productivity.