Each TPC is expected to contain 2 SMs, each with 128 FP32 cores. A fully enabled GB202 die with 192 SMs (96 TPCs) would therefore have a theoretical count of 24,576 cores (192 × 128). However, manufacturing yields may necessitate disabling some SMs, reducing the expected core count to around 24,000 or fewer. The RTX 5090's memory bandwidth is projected to be 1,536 GB/s, a considerable increase over the 1,008 GB/s of the RTX 4090. That is roughly a 52% increase in memory throughput, alongside a projected increase of approximately 60% in shader density. The overall performance uplift is anticipated to be up to 2x compared to its predecessor.
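The core-count and bandwidth arithmetic above can be checked with a short sketch. It uses only the figures quoted in the text (128 FP32 cores per SM, 2 SMs per TPC, 192 SMs, and the two bandwidth numbers). The function names and the list of disabled-SM counts are illustrative assumptions, not rumored configurations.

```python
# Figures quoted in the text; everything else here is illustrative.
FP32_CORES_PER_SM = 128
SMS_PER_TPC = 2
FULL_DIE_SMS = 192


def core_count(enabled_sms, cores_per_sm=FP32_CORES_PER_SM):
    """FP32 core count for a given number of enabled SMs."""
    return enabled_sms * cores_per_sm


def percent_increase(new, old):
    """Relative increase of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


full_die_cores = core_count(FULL_DIE_SMS)   # 192 * 128
full_die_tpcs = FULL_DIE_SMS // SMS_PER_TPC

# Projected RTX 5090 bandwidth vs. RTX 4090 bandwidth, both in GB/s.
bandwidth_uplift = percent_increase(1536, 1008)

print(f"Full GB202 die: {full_die_tpcs} TPCs, {full_die_cores} FP32 cores")
print(f"Bandwidth uplift: {bandwidth_uplift:.1f}%")

# Hypothetical cut-down dies: each disabled SM removes 128 cores.
for disabled_sms in (2, 4, 8):
    cores = core_count(FULL_DIE_SMS - disabled_sms)
    print(f"{disabled_sms} SMs disabled -> {cores} cores")
```

Because SMs are disabled as whole units, any cut-down configuration under these assumptions has a core count that is a multiple of 128.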
For comparison, here are the memory specifications of various NVIDIA graphics cards:
- GeForce RTX 5090: Specifications as discussed above
- GeForce RTX 4080: 16 GB GDDR6X, 256-bit bus, 717 GB/s bandwidth
- GeForce RTX 4070 Ti: 12 GB GDDR6X, 192-bit bus, 504 GB/s bandwidth
- GeForce RTX 4070: 12 GB GDDR6X, 192-bit bus, 504 GB/s bandwidth
- GeForce RTX 4060 Ti: 8 GB or 16 GB GDDR6, 128-bit bus, 288 GB/s bandwidth
- GeForce RTX 4060: 8 GB GDDR6, 128-bit bus, 272 GB/s bandwidth
Additionally, the speculated RTX 5080, with a 192-bit bus and 32 Gbps GDDR7 memory, is expected to offer 768 GB/s of bandwidth, a modest improvement of about 7% over the RTX 4080's 717 GB/s. The larger L2 caches in NVIDIA's Ada architecture improve on-die hit rates, which reduces trips to external memory and helps overall GPU performance.
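The 768 GB/s figure for the speculated RTX 5080 follows from the standard peak-bandwidth relation: bus width in bits times per-pin data rate in Gbps, divided by 8 bits per byte. The sketch below applies that relation and also inverts it to recover the per-pin data rates implied by some of the cards listed above. The function names are illustrative, and the derived data rates are computed from the listed figures rather than quoted from a spec sheet.

```python
def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_width_bits * data_rate_gbps / 8


def per_pin_rate_gbps(bandwidth_gbs, bus_width_bits):
    """Per-pin data rate in Gbps implied by a bandwidth and bus width."""
    return bandwidth_gbs * 8 / bus_width_bits


# Speculated RTX 5080: 192-bit bus with 32 Gbps GDDR7.
rtx_5080_bandwidth = memory_bandwidth_gbs(192, 32)

# Uplift over the RTX 4080's 717 GB/s, in percent.
uplift_vs_4080 = (rtx_5080_bandwidth - 717) / 717 * 100

print(f"RTX 5080 (speculated): {rtx_5080_bandwidth:.0f} GB/s")
print(f"Uplift over RTX 4080: {uplift_vs_4080:.1f}%")

# Data rates implied by (bandwidth GB/s, bus width bits) pairs from the list.
listed_cards = {
    "RTX 4070 Ti": (504, 192),
    "RTX 4060 Ti": (288, 128),
    "RTX 4060": (272, 128),
}
for name, (bandwidth, bus_width) in listed_cards.items():
    print(f"{name}: {per_pin_rate_gbps(bandwidth, bus_width):.0f} Gbps per pin")
```

The same relation shows why a faster memory type can offset a narrower bus: bandwidth scales linearly with both factors.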