GPU Memory Bandwidth History

Technical discussion of memory bus widths, Radeon VII's 1TB/s throughput, comparison to modern RTX 5080/5090, regression in performance per dollar over years

Technical enthusiasts highlight a perceived regression in GPU memory bandwidth value, noting that the 2019 Radeon VII's 1TB/s throughput still exceeds that of modern high-end offerings like the RTX 5080. While flagship cards like the RTX 5090 eventually surpassed this milestone, commenters argue that manufacturers are now using faster GDDR7 memory to compensate for narrower bus widths, cutting production costs while raising consumer prices. This architectural shift prioritizes gaming-focused compute power over the raw bandwidth that AI inference depends on and the FP64 performance that scientific computing depends on, leading some users to cling to "ancient" AMD hardware or budget Intel alternatives for specialized workloads. Ultimately, the discussion suggests that while modern cards are faster for general users, the "golden era" of affordable, high-bandwidth HBM cards has ended, with the technology now confined to the expensive professional market.

View on HN · Topics

The Radeon VII came out in 2019 as a $700 consumer GPU with a 1TB/s HBM2 memory subsystem which is more than any consumer GPU you can get today, including the high-end ones afaik. At that point in time, there was a whole lineup of AMD GPUs with HBM going down into the midrange.

If they could make this stuff and sell it to regular people a decade ago for very palatable prices, why do they come up with the idea that this is the technology of the gods, unaffordable by mere mortals?

View on HN · Topics

Yes, it's interesting that HBM was invented by a collaboration between AMD and SK Hynix. It seems HBM is the way to go for GPUs anyway.

The GB202 die that's in the GDDR7-based RTX 5090 and RTX 6000 Pro literally needed to be this big to support the 512-bit memory bus. It's probably only getting worse with smaller node sizes (see https://www.youtube.com/watch?v=rCwgAGG2sZQ&t=65s).

BTW: The 1TB/s is matched by the RTX 4090 and surpassed by the RTX 5090 (1.79 TB/s).

View on HN · Topics

I have been wondering this recently. The convention was that if you wanted to keep costs down, you kept the memory bus as narrow as possible. I still remember the awful Radeon 9200 SE, whose 64-bit data bus strangled an already slow GPU.

Heck, I have a phone with a 16-bit memory bus, for instance. The high(ish) clock rate only makes up the difference slightly.

But with general prices on all components going up, it might not be such a big factor any more.

HBM might make sense for higher-end products, which can free up space for the lower end that will never use the tech.

View on HN · Topics

Eh I feel like the memory bus width thing was more a case of binning memory controllers and the like.

Designing a part with a wide bus and putting the traces down on the board is what I would expect to be the easy part these days (surely).

But yield, yield comes for us all.

View on HN · Topics

Vega was a card with decent perf/$ for the consumer, but from a pure technical point of view (perf/mm2, perf/BW, perf/W) it was a major failure. Both Vega (and Fiji before it) showed that excess memory BW alone is not sufficient to win.

View on HN · Topics

> Both Vega (and Fiji before it) showed that excess memory BW alone is not sufficient to win.

That's correct if you're targeting gamers, but local AI inference changes this picture substantially.

View on HN · Topics

> 1TB/s HBM2 memory subsystem which is more than any consumer GPU you can get today

5090 has 1.8 TB/s?

View on HN · Topics

5090 is an overpriced outlier. A typical consumer GPU, like the RTX 5070, has roughly a third of its memory throughput.

Even a RTX 5080 has a lower memory throughput than a Radeon VII from 2019, 7 years ago, while being much more expensive.

The memory throughput of GPUs per dollar has regressed greatly during the last 5 years, despite the fact that the widths of the GPU memory interfaces have been reduced, in order to decrease the production costs.

RTX 5080 has a 256-bit memory interface, while the much cheaper Radeon VII had a 4096-bit memory interface (four 1024-bit HBM2 stacks). RTX 5080's GDDR7 is roughly 15 times faster per pin than Radeon VII's HBM2, but it has not used this to increase the memory throughput, only to reduce the production costs, while simultaneously increasing the product price.
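The trade-off between bus width and per-pin speed discussed here comes down to one line of arithmetic: peak bandwidth is the bus width times the per-pin data rate, divided by 8 bits per byte. The sketch below assumes commonly published spec-sheet figures that are not stated in the thread itself (a 4096-bit HBM2 interface at 2.0 Gbit/s per pin for the Radeon VII, 21 Gbit/s GDDR6X for the RTX 4090, and 30 and 28 Gbit/s GDDR7 for the RTX 5080 and RTX 5090); the function name is made up for illustration.

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Theoretical peak bandwidth in GB/s: pins * Gbit/s per pin / 8 bits per byte."""
    return bus_width_bits * data_rate_gbps / 8


# (bus width in bits, per-pin data rate in Gbit/s).
# Assumed from public spec sheets, not taken from the thread.
cards = {
    "Radeon VII (HBM2)": (4096, 2.0),
    "RTX 4090 (GDDR6X)": (384, 21.0),
    "RTX 5080 (GDDR7)": (256, 30.0),
    "RTX 5090 (GDDR7)": (512, 28.0),
}

for name, (width, rate) in cards.items():
    bandwidth = peak_bandwidth_gbs(width, rate)
    print(f"{name:18s} {width:4d}-bit x {rate:4.1f} Gbit/s = {bandwidth:6.1f} GB/s")
```

Under these assumptions the Radeon VII lands at 1024 GB/s and the RTX 5080 at 960 GB/s, which matches the thread's claim that a much faster memory type on a much narrower bus ends up slightly below the 2019 card.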

View on HN · Topics

> Even a RTX 5080 has a lower memory throughput than a Radeon VII from 2019, 7 years ago, while being much more expensive.

And it's faster for gaming, I guess? Which is what matters for the typical user.

Anyway you can buy much faster GPUs now than in 2019. They are also much more expensive, yes.

View on HN · Topics

Modern GPUs like RTX 5080 are much faster for the applications that are limited by computational capabilities, mainly because they have more execution units, whose clock frequencies have also increased.

I suppose that most games are limited by computation, so they are indeed much faster on modern GPUs.

However, there are applications that are limited by memory throughput, not by computation, including AI inference and many scientific/technical computing applications.

For such applications, old GPUs with higher memory throughput are still faster.

This is why I am still using an old Radeon VII and a couple of other ancient AMD GPUs with high memory throughput.

Last year I bought an Intel GPU, which is still slower than my old GPUs. Because it was very cheap, it at least had very good performance per dollar, competitive with that of the old GPUs, while the current AMD and especially NVIDIA GPUs have poor performance per dollar.
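The distinction drawn above between compute-limited and memory-limited applications is what the roofline model captures: attainable throughput is the smaller of peak compute and bandwidth times arithmetic intensity (FLOPs performed per byte moved). A minimal sketch follows; both "cards" and all of their numbers are hypothetical, chosen only to show how the ranking can flip, and are not figures for any real product.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gbs: float, flops_per_byte: float) -> float:
    """Roofline bound: limited by peak compute or by bandwidth * arithmetic intensity."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


# Hypothetical cards for illustration only (not measured or vendor figures).
older_wide_bus = {"peak_gflops": 13_000.0, "bandwidth_gbs": 1_000.0}
newer_narrow_bus = {"peak_gflops": 50_000.0, "bandwidth_gbs": 900.0}

# Low intensity stands in for a memory-bound kernel, high intensity for a compute-bound one.
for flops_per_byte in (2.0, 100.0):
    old = attainable_gflops(flops_per_byte=flops_per_byte, **older_wide_bus)
    new = attainable_gflops(flops_per_byte=flops_per_byte, **newer_narrow_bus)
    winner = "older card" if old > new else "newer card"
    print(f"{flops_per_byte:5.1f} FLOP/byte: older={old:8.1f} newer={new:8.1f} GFLOP/s -> {winner}")
```

With these made-up numbers the older, higher-bandwidth card wins at 2 FLOP/byte and the newer card wins at 100 FLOP/byte, which is the shape of the argument made in the comment.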

View on HN · Topics

That card only had 16GB of memory; its memory bandwidth was 1TB/s.

View on HN · Topics

The Pro variant had 32GB; I had one in a 2019 Mac Pro.

View on HN · Topics

You're saying this in a world where AMD's highest end consumer GPU in 2026 is also limited to 16 GB.

View on HN · Topics

RX7900 XTX has 24GB

View on HN · Topics

7900XT has 20GB and you can still get some unused ones.

R9700 has 32GB and is cheaper than most NVidia consumer GPUs, even though it's a "pro".

View on HN · Topics

It also does 64-bit floating point, I think?

View on HN · Topics

After NVIDIA essentially removed FP64 from consumer GPUs (their 1:64 performance ratio is worse than what you can obtain by software emulation, so it is useless, except for testing programs intended to run on datacenter GPUs), AMD persisted for a few years, but then they also followed NVIDIA.

AMD Hawaii GPUs still had 1:2 FP64:FP32, while the consumer variant of Radeon VII dropped to 1:4. The following AMD consumer GPUs dropped the FP64 performance to levels that are not competitive with CPUs.

Nowadays the only consumer GPUs with decent FP64 performance are the Intel Battlemage GPUs, whose 1:8 performance ratio provides very good performance per dollar.
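To put the FP64:FP32 ratios mentioned in this comment side by side, the sketch below divides a single FP32 figure by each ratio. The 16 TFLOPS FP32 value is a hypothetical round number used for every row, not the spec of any card; only the ratios (1:2, 1:4, 1:8, 1:64) come from the comment.

```python
def fp64_tflops(fp32_tflops: float, ratio_denominator: int) -> float:
    """FP64 throughput implied by an FP64:FP32 ratio of 1:ratio_denominator."""
    return fp32_tflops / ratio_denominator


HYPOTHETICAL_FP32_TFLOPS = 16.0  # assumed round number, same for every row

for denominator in (2, 4, 8, 64):
    fp64 = fp64_tflops(HYPOTHETICAL_FP32_TFLOPS, denominator)
    print(f"1:{denominator:<2d} -> {fp64:5.2f} TFLOPS FP64")
```

At equal FP32 throughput, a 1:64 part delivers one eighth of the FP64 rate of a 1:8 part, which is the gap the commenter is pointing at.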

View on HN · Topics

AMD has built some consumer GPUs in the recent past with HBM - RX Vega and Radeon VII (although I assume not all "HBM" is created equal).

View on HN · Topics

My Vega 56 has 400 GB/s of memory bandwidth, which is still insane for how old the card is.

View on HN · Topics

AMD's Hawaii architecture had 320 GB/s on a 512-bit GDDR5 bus in 2013.

The Fiji XT architecture after it had 512 GB/s on a 4096-bit HBM bus in 2015.

The Vega architecture did have 400GB/s or so in 2017, which was a bit of a downgrade.
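The Hawaii and Fiji figures quoted in this comment can be inverted to show what each bus was doing per pin: bandwidth times 8, divided by the bus width. The sketch uses only the numbers given in the comment; the helper name is made up for illustration, and Vega is left out because the comment does not give its bus width.

```python
def per_pin_rate_gbps(bandwidth_gbs: float, bus_width_bits: int) -> float:
    """Per-pin data rate in Gbit/s implied by a peak bandwidth and a bus width."""
    return bandwidth_gbs * 8 / bus_width_bits


# (bandwidth in GB/s, bus width in bits), as quoted in the comment.
quoted = {
    "Hawaii (GDDR5, 2013)": (320.0, 512),
    "Fiji XT (HBM, 2015)": (512.0, 4096),
}

for name, (bandwidth, width) in quoted.items():
    rate = per_pin_rate_gbps(bandwidth, width)
    print(f"{name:22s} {bandwidth:5.0f} GB/s over {width:4d} bits = {rate:.1f} Gbit/s per pin")
```

Fiji's pins ran at a fifth of Hawaii's per-pin rate, yet the eight-times-wider bus still delivered 60% more bandwidth, which is the basic appeal of HBM in this discussion.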
